Elements in Securing AI - Part 2: Attacks on and Defences of AI Systems


This second part gives an overview of how security vulnerabilities and attacks are discovered in AI systems, or systems with AI components, and how effective defensive techniques can be developed to address these types of attack.

Attacks on AI Systems

Backdoor Attacks: Machine learning models are often trained on data from potentially untrustworthy sources, including crowd-sourced information, social media data, and user-generated data such as customer satisfaction ratings, purchasing history, or web traffic. Recent work has shown that adversaries can introduce backdoors or “trojans” in machine learning models by poisoning training sets with malicious samples. The resulting models perform as expected on normal training and testing data, but behave badly on specific attacker-chosen inputs.

For example, an attacker could introduce a backdoor in a deep neural network (DNN) trained to recognise traffic signs so that it achieves high accuracy on standard inputs but misclassifies a stop sign as a speed limit sign if a yellow sticky note is attached to it. Unlike adversarial samples that require specific, complex noise to be added to an image, backdoor triggers can be quite simple and easily applicable to images or even objects in the real world. This poses a real threat to the deployment of machine learning models in security-critical applications.
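To make the poisoning step concrete, here is a minimal sketch of how an attacker might stamp a simple trigger (analogous to the yellow sticky note) onto a fraction of the training images and relabel them to a target class. The function names, trigger shape, and poisoning rate are illustrative assumptions, not a specific published attack.

```python
import numpy as np

def add_trigger(image, patch_size=3):
    """Stamp a small bright square (the 'sticky note' trigger) into one corner."""
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:] = 1.0
    return poisoned

def poison_dataset(images, labels, target_label, rate=0.05, seed=0):
    """Add the trigger to a fraction of the training set and relabel those
    samples to the attacker's chosen target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    for i in idx:
        images[i] = add_trigger(images[i])
        labels[i] = target_label
    return images, labels, idx
```

A model trained on the poisoned set still sees mostly clean data, so its accuracy on normal test inputs barely moves, yet it learns to associate the trigger patch with the target label.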

Defences of AI Systems

Risk Management: Risk management is used to strike the right balance between openness and security, to improve the technical measures for formally verifying the robustness of systems, and to adapt policy frameworks that were developed for a less AI-infused world.
  1. Intelligent machines often have hidden biases, not necessarily derived from any intent on the part of the designer but from the data provided to train the system. For instance, if a system learns which job applicants to accept for an interview by using a data set of decisions made by human recruiters in the past, it may inadvertently learn to perpetuate racial, gender, ethnic or other biases. Moreover, these biases may not appear as an explicit rule but, rather, be embedded in subtle interactions among the thousands of factors considered.
  2. A second risk is that, unlike traditional systems built on explicit rules of logic, neural networks deal with statistical truths rather than literal truths. That can make it difficult, if not impossible, to prove with complete certainty that a system will work in all cases, particularly in situations that were not represented in training data. Lack of verifiability can be a concern in mission-critical applications (such as controlling a nuclear power plant) or when life-or-death decisions are involved.
  3. A third risk is that when machine learning systems make errors, diagnosing and correcting the precise nature of the problem can be difficult. What led to the solution may be unimaginably complex, and the solution may be far from optimal if the conditions under which the system was trained happen to change. Given all this, the appropriate benchmark is not the pursuit of perfection, but rather, the best available alternative.
Being aware of these risks allows steps to be taken to minimise their impact when such systems interact with the world.

Defending Against Backdoor Attacks: One defence method for detecting these attacks examines and clusters the neural activations produced by the training samples, identifying which samples are legitimate and which have been manipulated by an adversary. This method has shown good results against known backdoor attacks.
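The idea behind the clustering defence can be sketched as follows: for a given class, collect the hidden-layer activations of its training samples and split them into two clusters; if one cluster is suspiciously small, its members are likely poisoned. This is a minimal illustration assuming the activations have already been extracted from the model; the function names, the bounding-box initialisation, and the size threshold are all illustrative assumptions.

```python
import numpy as np

def kmeans_2(acts, iters=20):
    """Plain Lloyd's algorithm with k=2; centres start at the data's
    bounding-box corners so the sketch is deterministic."""
    centres = np.stack([acts.min(axis=0), acts.max(axis=0)])
    for _ in range(iters):
        # distance of every activation vector to each of the two centres
        d = np.linalg.norm(acts[:, None, :] - centres[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for k in (0, 1):
            if (assign == k).any():
                centres[k] = acts[assign == k].mean(axis=0)
    return assign

def flag_suspicious(acts, max_poison_fraction=0.35):
    """Cluster one class's activations; if the smaller cluster is unusually
    small, flag its members as likely poisoned."""
    assign = kmeans_2(acts)
    sizes = np.bincount(assign, minlength=2)
    small = sizes.argmin()
    if sizes[small] / len(acts) <= max_poison_fraction:
        return np.where(assign == small)[0]
    return np.array([], dtype=int)
```

The intuition is that triggered samples activate the network differently from legitimate samples of the same class, so they form their own small, separable cluster in activation space.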


Hopefully, this has provided some useful introductory information about attacks that can be targeted at AI systems and what is needed to defend them.



