A2PMethod

class a2pm.A2PMethod(pattern, preassigned_patterns=None, class_discriminator=<function A2PMethod.<lambda>>, seed=None)

Bases: sklearn.base.BaseEstimator

Adaptative Perturbation Pattern Method.

A2PM generates realistic adversarial examples by assigning an independent sequence of adaptative patterns to each class; these patterns analyze specific feature subsets to create valid and coherent data perturbations.

Note: Class-specific data perturbations can only be created if the class of each sample is identified, either as a label or as a numeric representation. To provide the Class IDs used internally by this method, there are two alternatives:

  • Specify a class_discriminator function;

  • Provide the y parameter to the fit, partial_fit, transform and generate methods.

Parameters
  • pattern (pattern, config or tuple of patterns/configs) – Default pattern (or pattern tuple) to be adapted for each newly found class. Supports configurations to create patterns, as well as pre-fitted pattern instances.

  • preassigned_patterns (dict of 'Class ID - pattern' pairs (default None)) – Pre-assigned mapping of specific classes to their respective patterns (or pattern tuples). Also supports configurations to create patterns, as well as pre-fitted pattern instances.

    { Class ID : pattern, Class ID : (pattern, pattern), Class ID : None }

    Preassign None to a Class ID to disable perturbations of that class.

    Set to None to disable pre-assignments, treating all classes as new.

  • class_discriminator (callable or None (default lambda)) – Function to be used to identify the Class ID of each sample of input data X, in order to use class-specific patterns.

    class_discriminator(X) -> y

    If no discriminator is specified and the y parameter is not provided to a method, all samples will be assigned to the same general class. To prevent overlap with regular Class IDs, that general class has the ID -2. Therefore, the default function is:

    lambda X: numpy.full(X.shape[0], -2)

    Set to None to disable the default function, requiring the y parameter to be provided to all methods.

  • seed (int, None or a generator (default None)) – Seed for reproducible random number generation. If provided:

    • For pattern configurations, it will override any configured seed;

    • For already created patterns, it will not have any effect.

Variables
  • classes (list of Class IDs) – The currently known classes. Only available after a call to fit or partial_fit.

  • class_mapping (dict of 'Class ID - pattern' pairs) – The current mapping of known classes to their respective pattern tuples. Only available after a call to fit or partial_fit.
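
As a usage sketch, the method could be instantiated as follows. The pattern configuration keys shown here ("type", "features", "ratio") and the my_discriminator helper are illustrative assumptions, not a definitive reference; check the pattern classes of this package for the exact parameters they accept.

    import numpy as np
    from a2pm import A2PMethod

    # Hypothetical default pattern tuple: an interval-based pattern for the
    # first 10 features and a combination-based pattern for the remaining 10.
    # The configuration keys are assumptions, not a definitive reference.
    pattern = (
        {"type": "interval", "features": list(range(0, 10)), "ratio": 0.1},
        {"type": "combination", "features": list(range(10, 20))},
    )

    # Hypothetical discriminator: Class ID 1 for samples whose first feature
    # is positive, Class ID 0 for the remaining samples.
    def my_discriminator(X):
        return (np.asarray(X)[:, 0] > 0).astype(int)

    method = A2PMethod(
        pattern,
        preassigned_patterns={0: None},  # never perturb samples of class 0
        class_discriminator=my_discriminator,
        seed=42,
    )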

fit(X, y=None)

Fully adapts the method to new data.

First, the method is reset to the preassigned_patterns, whether they are configurations or pre-fitted pattern instances. Then, for newly found classes, the default pattern is assigned and updated. For classes with pre-assigned patterns, those patterns are updated.

Parameters
  • X (array-like in the (n_samples, n_features) shape) – Input data.

  • y (array-like in the (n_samples, ) shape or None (default None)) – Class IDs of input data, to use class-specific patterns.

    Set to None to use the class_discriminator function.

Returns

This A2PMethod instance.

Return type

self

partial_fit(X, y=None)

Partially adapts the method to new data.

For newly found classes, the default pattern is assigned and updated. For known classes, whether pre-assigned or previously found, their patterns are updated.

Parameters
  • X (array-like in the (n_samples, n_features) shape) – Input data.

  • y (array-like in the (n_samples, ) shape or None (default None)) – Class IDs of input data, to use class-specific patterns.

    Set to None to use the class_discriminator function.

Returns

This A2PMethod instance.

Return type

self
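
As a usage sketch, assuming method was created as in the earlier example and X_batch1 and X_batch2 are arrays of shape (n_samples, n_features), with y_batch1 and y_batch2 holding their Class IDs:

    # Full adaptation: resets to the pre-assigned patterns before adapting.
    method.fit(X_batch1, y_batch1)

    # Incremental adaptation: keeps the previously adapted patterns, updates
    # them with the new batch, and assigns the default pattern to any class
    # that appears for the first time.
    method.partial_fit(X_batch2, y_batch2)

    print(method.classes)        # currently known Class IDs
    print(method.class_mapping)  # Class ID -> pattern tuple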

transform(X, y=None, quantity=1, keep_original=False) → numpy.ndarray

Applies the method to create adversarial examples.

Parameters
  • X (array-like in the (n_samples, n_features) shape) – Input data.

  • y (array-like in the (n_samples, ) shape or None (default None)) – Class IDs of input data, to use class-specific patterns.

    Set to None to use the class_discriminator function.

  • quantity (int, > 0 (default 1)) – Number of examples to create for each sample.

  • keep_original (bool (default False)) – Whether to keep the original input data in the returned array, in addition to the created examples.

Returns

X_adversarial – Adversarial data, in the same order as input data.

If quantity > 1, the resulting array will be tiled:

example1_of_sample1

example1_of_sample2

example1_of_sample3

example2_of_sample1

example2_of_sample2

example2_of_sample3

If keep_original is set, the resulting array will additionally contain the original input data, in the same tiled order:

sample1

sample2

sample3

example1_of_sample1

example1_of_sample2

example1_of_sample3

Return type

numpy array of shape (n_samples * quantity, n_features), with n_samples additional rows when keep_original is set
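
For example, assuming method has already been fitted and X holds 3 samples with 20 features each, the tiled output could be inspected as follows (a sketch; the shapes follow from the description above):

    X_adv = method.transform(X, y, quantity=2)
    print(X_adv.shape)  # (6, 20): examples 1 of samples 1-3, then examples 2 of samples 1-3

    X_all = method.transform(X, y, quantity=1, keep_original=True)
    # original samples 1-3 first, followed by the created examples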

fit_transform(X, y=None, quantity=1, keep_original=False) → numpy.ndarray

Fully adapts the method to new data, and then applies it to create adversarial examples.

First, the method is reset to the preassigned_patterns, whether they are configurations or pre-fitted pattern instances. Then, for newly found classes, the default pattern is assigned and updated. For classes with pre-assigned patterns, those patterns are updated.

Parameters
  • X (array-like in the (n_samples, n_features) shape) – Input data.

  • y (array-like in the (n_samples, ) shape or None (default None)) – Class IDs of input data, to use class-specific patterns.

    Set to None to use the class_discriminator function.

  • quantity (int, > 0 (default 1)) – Number of examples to create for each sample.

  • keep_original (bool (default False)) – Whether to keep the original input data in the returned array, in addition to the created examples.

Returns

X_adversarial – Adversarial data, in the same order as input data.

If quantity > 1, the resulting array will be tiled.

If keep_original is set, the resulting array will additionally contain the original input data, in the same tiled order.

Return type

numpy array of shape (n_samples * quantity, n_features), with n_samples additional rows when keep_original is set
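
Conceptually, this is a shortcut for adapting to and creating examples from the same data, roughly as sketched below:

    X_adv = method.fit_transform(X, y, quantity=2)
    # roughly equivalent to:
    # method.fit(X, y)
    # X_adv = method.transform(X, y, quantity=2)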

partial_fit_transform(X, y=None, quantity=1, keep_original=False) → numpy.ndarray

Partially adapts the method to new data, and then applies it to create adversarial examples.

For newly found classes, the default pattern is assigned and updated. For known classes, whether pre-assigned or previously found, their patterns are updated.

Parameters
  • X (array-like in the (n_samples, n_features) shape) – Input data.

  • y (array-like in the (n_samples, ) shape or None (default None)) – Class IDs of input data, to use class-specific patterns.

    Set to None to use the class_discriminator function.

  • quantity (int, > 0 (default 1)) – Number of examples to create for each sample.

  • keep_original (bool (default False)) – Whether to keep the original input data in the returned array, in addition to the created examples.

Returns

X_adversarial – Adversarial data, in the same order as input data.

If quantity > 1, the resulting array will be tiled.

If keep_original is set, the resulting array will additionally contain the original input data, in the same tiled order.

Return type

numpy array of shape (n_samples * quantity, n_features), with n_samples additional rows when keep_original is set
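
For instance, in a streaming scenario where data arrives in batches (the batch iterable below is hypothetical):

    for X_batch, y_batch in stream_of_batches:  # hypothetical iterable of (X, y) batches
        # adapt to the new batch, then create one example per sample
        X_adv_batch = method.partial_fit_transform(X_batch, y_batch)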

generate(classifier, X, y=None, y_target=None, iterations=10, patience=2, callback=None) → numpy.ndarray

Applies the method to perform adversarial attacks against a classifier.

An attack can be untargeted, causing any misclassification, or targeted, seeking to reach a specific class. To perform a targeted attack, the class that should be reached for each sample must be provided in the y_target parameter.

Note: The misclassifications concern the class predictions of the classifier. These predictions are independent of the Class IDs provided in y or by the class_discriminator function, which remain for internal use only.

Parameters
  • classifier (object with a predict method) – Fitted classifier to be attacked.

  • X (array-like in the (n_samples, n_features) shape) – Input data.

  • y (array-like in the (n_samples, ) shape or None (default None)) – Class IDs of input data, to use class-specific patterns.

    Set to None to use the class_discriminator function.

  • y_target (array-like in the (n_samples, ) shape or None (default None)) – Class predictions that should be reached in a targeted attack.

    Set to None to perform an untargeted attack.

  • iterations (int, > 0 (default 10)) – Maximum number of iterations that can be performed before ending the attack.

  • patience (int, >= 0 (default 2)) – Patience for early stopping. Corresponds to the number of iterations without further misclassifications that can be performed before ending the attack.

    Set to 0 to disable early stopping.

  • callback (callable, list of callables or None (default None)) – Function or list of functions to be called before the attack starts (iteration 0), and after each attack iteration (iteration 1, 2, …).

    callback(**kwargs)

    callback(X, iteration, samples_left, samples_misclassified, nanoseconds)

    It can receive five parameters:

    • the current data (input data at iteration 0, and then adversarial data);

    • the current attack iteration;

    • the number of samples left to be misclassified;

    • the number of samples misclassified in the current iteration;

    • the number of nanoseconds consumed in the current iteration.

    For example, a simple function to print each iteration can be:

    def callback(**kwargs): print(kwargs["iteration"])

Returns

X_adversarial – Adversarial data, in the same order as input data.

Return type

numpy array of shape (n_samples, n_features)
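
As a usage sketch, an untargeted attack against a scikit-learn classifier could look as follows. The RandomForestClassifier, the X_train/y_train and X_test/y_test arrays, and the logging callback are illustrative choices, not requirements; pattern is reused from the earlier example.

    from sklearn.ensemble import RandomForestClassifier

    classifier = RandomForestClassifier().fit(X_train, y_train)

    def log_iteration(**kwargs):
        # number of samples still to be misclassified after each iteration
        print(kwargs["iteration"], kwargs["samples_left"])

    method = A2PMethod(pattern).fit(X_test, y_test)
    X_adv = method.generate(
        classifier,
        X_test,
        y_test,
        iterations=10,
        patience=2,
        callback=log_iteration,
    )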

fit_generate(classifier, X, y=None, y_target=None, iterations=10, patience=2, callback=None) → numpy.ndarray

Fully adapts the method to new data, and then applies it to perform adversarial attacks against a classifier.

First, the method is reset to the preassigned_patterns, whether they are configurations or pre-fitted pattern instances. Then, for newly found classes, the default pattern is assigned and updated. For classes with pre-assigned patterns, those patterns are updated.

An attack can be untargeted, causing any misclassification, or targeted, seeking to reach a specific class. To perform a targeted attack, the class that should be reached for each sample must be provided in the y_target parameter.

Note: The misclassifications concern the class predictions of the classifier. These predictions are independent of the Class IDs provided in y or by the class_discriminator function, which remain for internal use only.

Parameters
  • classifier (object with a predict method) – Fitted classifier to be attacked.

  • X (array-like in the (n_samples, n_features) shape) – Input data.

  • y (array-like in the (n_samples, ) shape or None (default None)) – Class IDs of input data, to use class-specific patterns.

    Set to None to use the class_discriminator function.

  • y_target (array-like in the (n_samples, ) shape or None (default None)) – Class predictions that should be reached in a targeted attack.

    Set to None to perform an untargeted attack.

  • iterations (int, > 0 (default 10)) – Maximum number of iterations that can be performed before ending the attack.

  • patience (int, >= 0 (default 2)) – Patience for early stopping. Corresponds to the number of iterations without further misclassifications that can be performed before ending the attack.

    Set to 0 to disable early stopping.

  • callback (callable, list of callables or None (default None)) – Function or list of functions to be called before the attack starts (iteration 0), and after each attack iteration (iteration 1, 2, …).

    callback(**kwargs)

    callback(X, iteration, samples_left, samples_misclassified, nanoseconds)

Returns

X_adversarial – Adversarial data, in the same order as input data.

Return type

numpy array of shape (n_samples, n_features)
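
For example, a targeted attack that tries to push every sample toward class 0 could be sketched as follows, continuing the previous sketch (building y_target with numpy.full is an illustrative choice):

    import numpy as np

    y_target = np.full(X_test.shape[0], 0)  # desired prediction for every sample

    X_adv = A2PMethod(pattern).fit_generate(
        classifier,
        X_test,
        y_test,
        y_target=y_target,
        iterations=15,
        patience=3,
    )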

partial_fit_generate(classifier, X, y=None, y_target=None, iterations=10, patience=2, callback=None) → numpy.ndarray

Partially adapts the method to new data, and then applies it to perform adversarial attacks against a classifier.

For newly found classes, the default pattern is assigned and updated. For known classes, whether pre-assigned or previously found, their patterns are updated.

An attack can be untargeted, causing any misclassification, or targeted, seeking to reach a specific class. To perform a targeted attack, the class that should be reached for each sample must be provided in the y_target parameter.

Note: The misclassifications concern the class predictions of the classifier. These predictions are independent of the Class IDs provided in y or by the class_discriminator function, which remain for internal use only.

Parameters
  • classifier (object with a predict method) – Fitted classifier to be attacked.

  • X (array-like in the (n_samples, n_features) shape) – Input data.

  • y (array-like in the (n_samples, ) shape or None (default None)) – Class IDs of input data, to use class-specific patterns.

    Set to None to use the class_discriminator function.

  • y_target (array-like in the (n_samples, ) shape or None (default None)) – Class predictions that should be reached in a targeted attack.

    Set to None to perform an untargeted attack.

  • iterations (int, > 0 (default 10)) – Maximum number of iterations that can be performed before ending the attack.

  • patience (int, >= 0 (default 2)) – Patience for early stopping. Corresponds to the number of iterations without further misclassifications that can be performed before ending the attack.

    Set to 0 to disable early stopping.

  • callback (callable, list of callables or None (default None)) – Function or list of functions to be called before the attack starts (iteration 0), and after each attack iteration (iteration 1, 2, …).

    callback(**kwargs)

    callback(X, iteration, samples_left, samples_misclassified, nanoseconds)

Returns

X_adversarial – Adversarial data, in the same order as input data.

Return type

numpy array of shape (n_samples, n_features)
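
As with partial_fit_transform, this supports incremental scenarios; a sketch with a hypothetical batch iterable and the previously fitted classifier:

    for X_batch, y_batch in stream_of_batches:  # hypothetical iterable of (X, y) batches
        # adapt to the new batch, then attack the classifier on it
        X_adv_batch = method.partial_fit_generate(classifier, X_batch, y_batch)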