Adversarial Machine Learning (AML)
Somesh Jha, University of Wisconsin, Madison
Thanks to Nicolas Papernot, Ian Goodfellow, and Jerry Zhu for some slides.
Machine learning brings social disruption at scale. [Figure: healthcare (source: Peng and Gulshan, 2017), energy (source: DeepMind), transportation (source: Google), education (source: Gradescope)]
Machine learning is not magic (training time). [Figure: training data]
Machine learning is not magic (inference time). [Figure: classifying a test input]
Machine learning is deployed in adversarial settings: content evades detection at inference (YouTube filtering); training data poisoning (Microsoft's Tay chatbot).
Machine learning does not always generalize well. [Figure: training data vs. test data]
ML reached "human-level performance" on many IID tasks circa 2013: recognizing objects and faces (Szegedy et al, 2014; Taigman et al, 2013), solving CAPTCHAs and reading addresses (Goodfellow et al, 2013).
Caveats to "human-level" benchmarks:
• The test data is not very diverse.
• ML models are fooled by natural but unusual data.
• Humans are not very good at some parts of the benchmark.
(Goodfellow 2018)
ML (Basics)
• Supervised learning
• Entities
  • (Sample space) Z = X × Y
  • (data, label) (x, y)
  • (Distribution over Z) D
  • (Hypothesis space) H
  • (Loss function) ℓ : H × Z → ℝ
ML (Basics)
• Learner's problem
  • Find w ∈ H that minimizes
  • E_{Z∼D}[ℓ(w, Z)] + λ R(w), where R is a regularizer
  • In practice, the empirical risk (1/n) Σ_{i=1}^{n} ℓ(w, (x_i, y_i)) + λ R(w)
  • over a sample set S = {(x_1, y_1), …, (x_n, y_n)}
• SGD (see the sketch below)
  • (iteration) w_{t+1} = w_t − η_t ℓ′(w_t, (x_{i_t}, y_{i_t}))
  • (learning rate) η_t
  • …
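A minimal NumPy sketch of the SGD loop above. The function names and the 1/√t learning-rate schedule are illustrative choices, not the lecture's; `grad_loss(w, x, y)` is assumed to return the gradient of the per-example loss with respect to w.

```python
import numpy as np

def sgd(grad_loss, w0, data, labels, lr=0.1, epochs=10, seed=0):
    """Minimal SGD sketch for the update w_{t+1} = w_t - eta_t * l'(w_t, (x_i, y_i)).

    grad_loss(w, x, y) is an assumed callable returning the per-example gradient.
    """
    rng = np.random.default_rng(seed)
    w = w0.copy()
    n = len(labels)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):          # shuffle each pass ("Random-SGD")
            eta = lr / np.sqrt(t + 1)         # one common learning-rate schedule
            w -= eta * grad_loss(w, data[i], labels[i])
            t += 1
    return w
```

Shuffling the indices on every pass corresponds to the "Random-SGD" ordering discussed on the next slide.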
ML (Basics)
• SGD design choices
  • How do the learning rates change?
  • In what order do you process the data?
    • Sample-SGD
    • Random-SGD
  • Do you process in mini-batches?
  • When do you stop?
ML (Basics)
• After training
  • F_w : X → Y
  • F_w(x) = argmax_{y∈Y} s(F_w)(x)_y (see the sketch below)
  • (softmax layer) s(F_w)
  • Sometimes we will write F_w simply as F (w will be implicit)
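A small sketch of this prediction rule, assuming we already have the model's logits for an input x; the helper names are made up for illustration.

```python
import numpy as np

def softmax(logits):
    """s(F_w)(x): turn the network's logits into class probabilities."""
    z = logits - np.max(logits)     # shift for numerical stability
    e = np.exp(z)
    return e / np.sum(e)

def classify(logits):
    """F_w(x) = argmax_y s(F_w)(x)_y."""
    return int(np.argmax(softmax(logits)))
```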
ML (Basics)
• Logistic regression (see the sketch below)
  • X = ℝ^n, Y = {+1, −1}
  • H = ℝ^n
  • Loss function ℓ(w, (x, y)) = log(1 + exp(−y wᵀx))
  • R(w) = ‖w‖²
  • Two probabilities s(F) = (p_{−1}, p_{+1}) = ( 1/(1 + exp(wᵀx)), 1/(1 + exp(−wᵀx)) )
• Classification
  • Predict −1 if p_{−1} > 0.5
  • Otherwise predict +1
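A hedged NumPy sketch of the pieces above (loss, its gradient, and the prediction rule); all function names are illustrative, and it plugs into the earlier SGD sketch.

```python
import numpy as np

def logistic_loss(w, x, y, lam=0.0):
    """l(w, (x, y)) = log(1 + exp(-y w^T x)), plus an optional lam * ||w||^2 term."""
    return np.log1p(np.exp(-y * (w @ x))) + lam * (w @ w)

def grad_logistic_loss(w, x, y, lam=0.0):
    """Gradient of the loss above with respect to w."""
    s = 1.0 / (1.0 + np.exp(y * (w @ x)))   # = 1 - P(y | x)
    return -y * s * x + 2 * lam * w

def predict(w, x):
    """Predict -1 if p_{-1} > 0.5, otherwise +1 (equivalently, the sign of w^T x)."""
    p_minus = 1.0 / (1.0 + np.exp(w @ x))
    return -1 if p_minus > 0.5 else +1
```

With the earlier sketch, training is roughly `w = sgd(grad_logistic_loss, np.zeros(d), X, y)` for a data matrix X and labels y in {+1, −1}.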
Adversarial learning is not new!!
• Lowd: "I spent the summer of 2004 at Microsoft Research working with Chris Meek on the problem of spam. We looked at a common technique spammers use to defeat filters: adding 'good words' to their emails. We developed techniques for evaluating the robustness of spam filters, as well as a theoretical framework for the general problem of learning to defeat a classifier." (Lowd and Meek, 2005)
• But…
  • New resurgence in ML and hence new problems
  • Lots of new theoretical techniques are being developed
  • High-dimensional robust statistics, robust optimization, …
Attacks on the machine learning pipeline. [Figure: training data → learning algorithm → learned parameters → test output; attacks: training set poisoning (training data), adversarial examples (test input), model theft (learned parameters)]
I.I.D. Machine Learning
• I: Independent
• I: Identically
• D: Distributed
• All train and test examples are drawn independently from the same distribution
Security Requires Moving Beyond I.I.D. • Not identical: attackers can use unusual inputs (Eykholt et al, 2017) • Not independent: attacker can repeatedly send a single mistake (“test set attack”)
Training Time Attack
Attacks on the machine learning pipeline (recap). [Figure: training set poisoning, adversarial examples, model theft]
Training time
• Setting: attacker perturbs the training set to fool a model on a test set
• Training data from users is fundamentally a huge security hole
• More subtle and potentially more pernicious than test-time attacks, due to coordination of multiple points
Lake Mendota Ice Days
Poisoning Attacks
Formalization (see the game written out below)
• Alice picks a data set S of size n
• Alice gives the data set to Bob
• Bob picks
  • εn points S_B
  • Gives the data set S ∪ S_B back to Alice
  • Or could replace some points in S
• Goal of Bob
  • Maximize the error for Alice
• Goal of Alice
  • Get close to learning from clean data
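One compact way to write this game, assuming Alice runs the regularized empirical risk minimization from the ML basics slides; this display is a sketch of the setup, not taken verbatim from the cited papers.

```latex
% Alice trains on the poisoned set S \cup S_B:
\hat{w} \;=\; \arg\min_{w \in H}\; \frac{1}{|S \cup S_B|} \sum_{(x,y) \in S \cup S_B} \ell\bigl(w,(x,y)\bigr) \;+\; \lambda R(w)

% Bob picks at most \epsilon n points to maximize Alice's risk on clean data:
\max_{S_B \,:\, |S_B| \le \epsilon n} \; \mathbb{E}_{Z \sim D}\, \ell(\hat{w}, Z)
```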
Representative Papers
• Being Robust (in High Dimensions) Can Be Practical. I. Diakonikolas, G. Kamath, D. Kane, J. Li, A. Moitra, A. Stewart. ICML 2017.
• Certified Defenses for Data Poisoning Attacks. Jacob Steinhardt, Pang Wei Koh, Percy Liang. NIPS 2017.
• …
Attacks on the machine learning pipeline (recap). [Figure: training set poisoning, adversarial examples, model theft]
Model Extraction/Theft Attack
Model Theft
• Model theft: extract model parameters by queries (intellectual property theft)
• Given a classifier F
  • Query F on q_1, …, q_n and learn a classifier G
  • F ≈ G (see the sketch below)
• Goals: leverage the active-learning literature to develop new attacks and preventive techniques
• Paper: Stealing Machine Learning Models using Prediction APIs. Tramer et al., USENIX Security 2016.
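A minimal sketch in the spirit of such an extraction attack (not the exact algorithm of Tramer et al.): query the victim on random points and fit a surrogate. The `oracle` callable and the Gaussian query distribution are stand-ins for a real prediction API.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def extract_model(oracle, n_queries=1000, dim=20, seed=0):
    """Query the victim F on q_1, ..., q_n and fit a surrogate G with G ~ F.

    `oracle(Q)` is assumed to return the victim's labels for a batch of inputs.
    """
    rng = np.random.default_rng(seed)
    Q = rng.normal(size=(n_queries, dim))   # query points q_1, ..., q_n
    labels = oracle(Q)                      # F(q_1), ..., F(q_n)
    surrogate = LogisticRegression(max_iter=1000).fit(Q, labels)
    return surrogate
```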
Fake News Attacks
• Abusive use of machine learning: using GANs to generate fake content (a.k.a. deep fakes)
• Strong societal implications: elections, automated trolling, court evidence, …
• Generative media:
  • Video of Obama saying things he never said, …
  • Automated reviews, tweets, comments, indistinguishable from human-generated content
Attacks on the machine learning pipeline (recap). [Figure: training set poisoning, adversarial examples, model theft]
Definition “Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake” (Goodfellow et al 2017)
What if the adversary systematically found these inputs? (Biggio et al., Szegedy et al., Goodfellow et al., Papernot et al.)
Good models make surprising mistakes in the non-IID setting: "adversarial examples". [Figure: school bus + perturbation (rescaled for visualization) = ostrich] (Szegedy et al, 2013)
Adversarial examples exist beyond deep learning and beyond computer vision. [Figure: a malware classifier flips from P[X = Malware] = 0.90, P[X = Benign] = 0.10 to P[X* = Malware] = 0.10, P[X* = Benign] = 0.90; affected models include logistic regression, nearest neighbors, support vector machines, decision trees]
Threat Model
• White box
  • Complete access to the classifier F
• Black box
  • Oracle access to the classifier F
  • For an input x, receive F(x)
• Grey box
  • Black box + "some other information"
  • Example: structure of the defense
Metric μ for a vector ⟨x_1, …, x_n⟩ (see the sketch below)
• L_∞
  • max_{i=1..n} |x_i|
• L_1
  • |x_1| + … + |x_n|
• L_p (p ≥ 2)
  • (|x_1|^p + … + |x_n|^p)^{1/p}
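A small NumPy helper that computes these metrics directly from the formulas above; the function name is illustrative.

```python
import numpy as np

def perturbation_size(delta, p):
    """Size of a perturbation delta under the L_p metrics on the slide."""
    delta = np.ravel(delta)
    if p == np.inf:
        return np.max(np.abs(delta))                  # L_inf: max_i |x_i|
    return np.sum(np.abs(delta) ** p) ** (1.0 / p)    # L_1, L_2, ..., L_p
```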
White Box
• Adversary's problem
  • Given: x ∈ X
  • Find δ
    • min_δ μ(δ)
    • Such that: F(x + δ) ∈ T
    • Where: T ⊆ Y
  • Misclassification: T = Y − {F(x)}
  • Targeted: T = {t}
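The same adversary's problem written as one display (a restatement of the bullets above, not an addition):

```latex
\min_{\delta} \ \mu(\delta)
\quad \text{subject to} \quad F(x + \delta) \in T,
\qquad
T =
\begin{cases}
Y \setminus \{F(x)\} & \text{(misclassification)} \\
\{t\} & \text{(targeted)}
\end{cases}
```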
FGSM (misclassification)
• Take a step in the direction of the gradient of the loss function
  • δ = ε · sign(∇_x ℓ(w, x, F(x)))
• Essentially the opposite of an SGD step: SGD moves the weights against the gradient to decrease the loss, FGSM moves the input along the (sign of the) gradient to increase it
• Paper
  • Goodfellow, Shlens, Szegedy. Explaining and Harnessing Adversarial Examples. ICLR 2015.
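A minimal PyTorch sketch of FGSM, assuming a generic differentiable `model` that returns logits and inputs scaled to [0, 1]; here the label tensor y stands in for F(x) on the slide.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """FGSM sketch: x_adv = x + eps * sign(grad_x loss(w, x, y))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()   # keep the image in a valid range
```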
PGD Attack (misclassification)
• B_p(x, ε)
  • p = ∞, 1, 2, …
  • An ε-ball around x
• Initial
  • x^0 = x
• Iterate k ≥ 1 (see the sketch below)
  • x^k = Proj_{B_p(x, ε)}[ x^{k−1} + ε · sign(∇_x ℓ(w, x^{k−1}, F(x))) ]
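A hedged PyTorch sketch of the PGD iteration for the L_∞ ball; `model`, the step size, and the iteration count are illustrative choices (the slide uses ε itself as the step).

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps, step, iters=40):
    """PGD sketch: signed gradient step, then projection back onto B(x, eps)."""
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # Proj onto B(x, eps)
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```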
JSMA (Targeted)
• The Limitations of Deep Learning in Adversarial Settings. Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, and Ananthram Swami. IEEE EuroS&P 2016.
Carlini-Wagner (CW) (targeted)
• Formulation
  • min_δ ‖δ‖_2
  • Such that F(x + δ) = t
• Define
  • f(x) = max( max_{i ≠ t} Z(x)_i − Z(x)_t, −κ ), where Z(x) are the logits of F at x
  • Replace the constraint with f(x + δ) ≤ 0
• Paper
  • Nicholas Carlini and David Wagner. Towards Evaluating the Robustness of Neural Networks. Oakland 2017.
CW (Contd)
• The optimization problem
  • min_δ ‖δ‖_2
  • Such that f(x + δ) ≤ 0
• Lagrangian trick
  • min_δ ‖δ‖_2 + c · f(x + δ)
• Use existing solvers for unconstrained optimization (see the sketch below)
  • Adam
  • Find c using grid search
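A sketch of the CW L2 attack as described above, running Adam on the Lagrangian objective. It assumes a batch of inputs in [0, 1] and an integer `target` class; a faithful implementation also searches over c (grid search) and uses a change of variables to keep x + δ in range, both omitted here for brevity.

```python
import torch
import torch.nn.functional as F

def cw_l2(model, x, target, c=1.0, kappa=0.0, steps=200, lr=0.01):
    """Minimize ||delta||_2 + c * f(x + delta) with Adam, where
    f(x') = max(max_{i != t} Z(x')_i - Z(x')_t, -kappa) and Z are the logits."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        logits = model((x + delta).clamp(0, 1))
        onehot = F.one_hot(torch.tensor([target]), logits.size(1)).bool()
        z_t = logits[:, target]                                        # logit of the target class
        z_other = logits.masked_fill(onehot, float('-inf')).max(dim=1).values
        f = torch.clamp(z_other - z_t, min=-kappa)                     # f(x + delta)
        loss = delta.flatten(1).norm(p=2, dim=1).sum() + c * f.sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (x + delta).clamp(0, 1).detach()
```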