Artificial Intelligence: evasion techniques and cyber defenses

(By Horace Danilo Russo)
19/07/21

It runs counter to the idea of rationality, speed, effectiveness and efficiency that we have formed of computers, but the reality is that artificial intelligence (AI) systems exhibit a trait very similar to the human concept of naivety. They are, therefore, vulnerable to deception and manipulation.

Much as happens among humans, where scams are often perpetrated by subtly exploiting the victim's ignorance or innocence, the same happens to AI during the automatic learning process, better known as Machine Learning (ML): the ability to learn to perform tasks typical of human intelligence, such as image classification or speech recognition.

To address this problem there is the so-called Adversarial Machine Learning (AML), the field that studies how to make the learning phase safer and the resulting system more robust against attempts at deception.

For the layman, machine learning comprises a set of techniques, based on statistical approaches or mathematical optimization, that make it possible to recognize patterns and similarities in data. In Supervised Learning, the computer's learning is overseen by an expert who teaches the machine which decisions to make or which actions to take in the presence of a certain event; in Unsupervised Learning, by contrast, the machine is shown how to recognize elements of commonality or diversity among the information and is then left to work on the data alone; finally, in Reinforcement Learning, the machine learns to recognize the goodness of its decisions through positive feedback, which reinforces the learning.
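By way of illustration, here is a minimal sketch, under the assumption that the Python library scikit-learn is available and with purely synthetic data: a supervised classifier is trained on examples labelled by the "expert", while an unsupervised clustering algorithm is left to find structure in the same data on its own.

    # Minimal sketch contrasting supervised and unsupervised learning.
    # Assumes scikit-learn is installed; the data and parameters are illustrative only.
    from sklearn.datasets import make_blobs
    from sklearn.linear_model import LogisticRegression
    from sklearn.cluster import KMeans

    # Toy dataset: 200 points in two well-separated groups.
    X, y = make_blobs(n_samples=200, centers=2, random_state=0)

    # Supervised learning: the "expert" provides the labels y.
    clf = LogisticRegression().fit(X, y)
    print("supervised predictions:", clf.predict(X[:5]))

    # Unsupervised learning: no labels, the machine looks for structure alone.
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    print("unsupervised clusters: ", km.labels_[:5])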

The grounds of choice for a cyber attack on artificial intelligences are basically three. First of all, the physical domain of sensors and actuators through which the system dialogues with the environment, the equivalent of our five senses, which can be damaged to cause malfunctions: think, for example, of how sabotaging a microphone prevents an intelligent system from hearing a voice command, or how sabotaging a relay prevents an industrial control intelligence from shutting down a foundry furnace when a critical temperature is reached. Then there are attacks that exploit weaknesses in the mechanisms of digital data representation, for example by replacing correct information with polluted data. Finally, there are assaults on the learning algorithms themselves, aimed either at injecting into the computer a learning method manipulated for hidden purposes or at understanding how it learns: after all, it is precisely from the knowledge of “how” the machine instructs itself that one can sabotage its learning or predict its behavior.

The attack can take place using different techniques, ranging from malicious training methods to evasive interaction operations and covert exploration procedures.

The first category includes all those poisoning tactics with which, directly or indirectly, the acquired knowledge or the learning logic is polluted. In these cases the attackers must necessarily gain access to the artificial intelligence, in order to falsify the data stored in memory or to alter the learning algorithm. The consequences of these attacks can be very serious and have a tangible impact on the physical world, as in the cases of malicious training recently described by academics at the University of Cagliari in a study on self-driving cars in smart cities: a driverless car might fail to stop at an intersection if, following a Label Manipulation attack on the data used to recognize the “stop” sign, the intelligence were induced to learn the opposite of stopping the vehicle.
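A minimal sketch of the idea, assuming scikit-learn and a purely synthetic dataset (the driving scenario is only evoked in the comments): flipping a fraction of the training labels is enough to visibly degrade the decisions the model learns.

    # Label Manipulation poisoning sketch (illustrative, synthetic data).
    # Assumes scikit-learn; the "stop"/"go" framing is hypothetical.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Clean training.
    clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

    # Poisoned training: the attacker flips 30% of the labels
    # (e.g. "stop" examples relabelled as "go").
    rng = np.random.default_rng(0)
    flip = rng.random(len(y_tr)) < 0.3
    y_poisoned = np.where(flip, 1 - y_tr, y_tr)
    poisoned = LogisticRegression(max_iter=1000).fit(X_tr, y_poisoned)

    print("accuracy with clean labels:  ", clean.score(X_te, y_te))
    print("accuracy with flipped labels:", poisoned.score(X_te, y_te))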

In covert exploration techniques, on the other hand, interactions with the artificial intelligence are carried out with the aim of understanding its logic of cognitive assimilation. A typical example is the Oracle attack, in which a reasoned series of queries is sent to the learning software and, by examining the pattern of the corresponding answers, a model is built to predict its future behavior. Gradient-based tactics, instead, are clear examples of the evasive interaction technique, in which the intelligence is presented - for example - with visual signals carrying perturbations undetectable by human perception, yet sufficient to cause paradoxical outcomes in the learning algorithm that prevent or disturb - precisely, evade - its ability to categorize images. In other words, these techniques aim to identify the smallest number of changes needed to build an image that confuses the decision-making capabilities of the system.
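The following is a minimal numpy sketch of a gradient-based evasion in the spirit of the Fast Gradient Sign Method; the linear model, the input and the perturbation budget epsilon are all illustrative assumptions, not values taken from any real system. The input is nudged in the direction of the sign of the loss gradient, and that small perturbation is enough to change the classifier's answer.

    # Gradient-based evasion sketch (FGSM-style) on a toy logistic model.
    # Weights, input and epsilon are illustrative assumptions only.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # A hypothetical trained linear classifier: score = w.x + b.
    w = np.array([1.5, -2.0, 0.5])
    b = 0.1

    x = np.array([0.2, -0.1, 0.4])   # input correctly classified as class 1
    y = 1                            # true label

    p = sigmoid(w @ x + b)
    # Gradient of the cross-entropy loss with respect to the input x.
    grad_x = (p - y) * w

    # Small perturbation in the direction that increases the loss.
    eps = 0.3
    x_adv = x + eps * np.sign(grad_x)

    print("clean prediction:      ", sigmoid(w @ x + b))       # above 0.5: class 1
    print("adversarial prediction:", sigmoid(w @ x_adv + b))   # below 0.5: class flipped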

Research has already devised suitable defense strategies. For example, to counter hidden and malicious training, encryption algorithms have been developed for the memory partitions that contain the learned notions or the learning logic; to defend against evasive interactions, countermeasures have been designed that reduce sensitivity to disturbances - a sort of digital anesthetic that lowers susceptibility to deceptive artifacts, better known in cybersecurity circles as Gradient Masking - or perturbed examples are injected into the training data so that the system learns to recognize and resist them (the so-called Adversarial Training technique); finally, to protect the artificial intelligence from covert exploration tactics, it is taught to detect the monitoring, probing and control actions of adversaries on the network.
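As a rough illustration of Adversarial Training, here is a sketch under the same assumptions as above (scikit-learn, synthetic data, an FGSM-style perturbation with an illustrative epsilon): perturbed copies of the training points, with their correct labels, are added to the training set before retraining.

    # Adversarial Training sketch: augment the training set with perturbed
    # copies of the data so the model learns to withstand them.
    # Assumes scikit-learn; data, labels and epsilon are illustrative.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=1000, n_features=10, random_state=1)
    clf = LogisticRegression(max_iter=1000).fit(X, y)

    # Craft FGSM-style perturbed versions of the training points.
    eps = 0.2
    p = clf.predict_proba(X)[:, 1]
    grad_X = (p - y)[:, None] * clf.coef_[0]   # gradient of the loss w.r.t. X
    X_adv = X + eps * np.sign(grad_X)

    # Retrain on clean + adversarial examples, keeping the correct labels.
    X_aug = np.vstack([X, X_adv])
    y_aug = np.concatenate([y, y])
    robust = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)

    print("standard model on perturbed data:", clf.score(X_adv, y))
    print("hardened model on perturbed data:", robust.score(X_adv, y))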

In short, research is making enormous strides to make intelligent systems safer and more resilient, while maintaining dependence on human control. The latter is an essential issue, especially for artificial intelligences with a critical impact, such as those embedded in armament materials and in the dual-use items used to develop Lethal Autonomous Weapons Systems (LAWS), intelligent weapon systems, so to speak, whose use and effects must always and in any case remain attributable to clear and determinable human responsibilities, both state and individual.

To learn more:

https://smartcities.ieee.org/newsletter/june-2021/explainable-machine-le...

https://gradientscience.org/intro_adversarial/

https://nvlpubs.nist.gov/nistpubs/ir/2019/NIST.IR.8269-draft.pdf