Adversarial Machine Learning (AML)

Machine-learning models are typically trained on data that requires human labeling, and they can be vulnerable to adversarial inputs crafted to fool them. An adversarial example is an input to which a small, carefully chosen perturbation has been added, for instance to an image, so that the model misclassifies it. Such attacks include adding noise that is imperceptible to humans or shifting the hue slightly so that an image classifier labels a face as an airplane. In addition, machine-learning models can inadvertently learn biases present in their training data and reproduce them when making predictions.
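As a concrete illustration, here is a minimal PyTorch sketch of the classic fast gradient sign method (FGSM) for crafting such a perturbation. It assumes `model` is a trained image classifier, `x` is a batch of images scaled to [0, 1], and `label` holds the true class indices; these names are placeholders, not part of any specific library.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, label, eps=0.03):
    """Add a small perturbation in the direction that increases the loss,
    which often causes the classifier to change its prediction."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    # Step along the sign of the gradient, then clip back to valid pixel range.
    x_adv = (x + eps * x.grad.sign()).clamp(0.0, 1.0)
    return x_adv.detach()
```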

Types of Attacks in AML

Black-Box Attack

A black-box attack is an attack on a machine-learning model in which the attacker has no access to the model's internals, such as its architecture, parameters, or gradients, and can only query the model and observe its outputs. For example, an attacker can probe the model with inputs slightly different from those seen in training and use the returned predictions or confidence scores to craft adversarial examples.
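A toy sketch of this query-only setting is shown below. The `predict` function stands in for whatever query interface the attacker has (it returns only the top-1 label); the random-search loop is deliberately naive and is meant only to show that no gradients or internals are used.

```python
import numpy as np

def random_query_attack(predict, x, true_label, eps=0.05, queries=1000, rng=None):
    """Toy black-box attack: repeatedly query the model's predicted label
    on small random perturbations until one is misclassified."""
    rng = rng or np.random.default_rng(0)
    for _ in range(queries):
        noise = rng.uniform(-eps, eps, size=x.shape)
        x_adv = np.clip(x + noise, 0.0, 1.0)
        if predict(x_adv) != true_label:  # only the label is observed
            return x_adv
    return None  # no adversarial example found within the query budget
```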

Square Attack

The Square Attack is a score-based black-box adversarial attack in which the attacker forces the classifier to make errors by repeatedly perturbing randomly chosen square-shaped regions of the input, keeping only those changes that improve the attack objective. It uses the model's output scores but no gradient information. The attack was introduced in 2020 and has been demonstrated against several different types of classifiers.
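The sketch below shows a single, heavily simplified step of this idea rather than the full published algorithm. It assumes an image in HWC layout with values in [0, 1], and a `score` callable (a placeholder for the attacker's objective, higher meaning better for the attacker, computed from the model's output scores).

```python
import numpy as np

def square_attack_step(score, x_adv, x_orig, eps, side, rng):
    """One simplified Square Attack step: perturb a random square patch
    with +/-eps per channel and keep the change only if the attacker's
    score improves. No gradients are used, only output scores."""
    h, w, c = x_adv.shape
    r = rng.integers(0, h - side + 1)
    s = rng.integers(0, w - side + 1)
    candidate = x_adv.copy()
    patch = rng.choice([-eps, eps], size=(1, 1, c))
    candidate[r:r + side, s:s + side, :] = np.clip(
        x_orig[r:r + side, s:s + side, :] + patch, 0.0, 1.0)
    return candidate if score(candidate) > score(x_adv) else x_adv
```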

HopSkipJump Attack

The HopSkipJump attack, unlike the Square Attack, does not require the ability to compute gradients or access to confidence scores; it is a decision-based attack that needs only the model's predicted class for any given input.
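The snippet below sketches only one building block of decision-based attacks of this kind: a binary search along the line between the original input and a known misclassified point, using nothing but predicted labels. The full HopSkipJump algorithm additionally estimates a gradient direction at the boundary from label queries; `predict` is again an assumed query interface returning the top-1 label.

```python
import numpy as np

def binary_search_to_boundary(predict, x_orig, x_adv, true_label, steps=25):
    """Using only predicted labels, binary-search between the original input
    and a known adversarial one for the closest point that is still
    misclassified (i.e. a point near the decision boundary)."""
    lo, hi = 0.0, 1.0  # interpolation weight toward x_adv
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        x_mid = (1.0 - mid) * x_orig + mid * x_adv
        if predict(x_mid) != true_label:
            hi = mid  # still adversarial, move closer to the original
        else:
            lo = mid  # crossed back over the boundary, back off
    return (1.0 - hi) * x_orig + hi * x_adv
```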

Adversarial machine learning is as important to the future of artificial intelligence as the capabilities it targets, and in some ways more so: it is one thing to try to infuse a computer system with human-like qualities such as intuition and judgment, and another to ensure that such a system cannot be easily deceived.
