Gradient Masking in Machine Learning
Nicolas Papernot, Pennsylvania State University
ARO Workshop on Adversarial Machine Learning, Stanford, September 2017
Thank you to our collaborators: Sandy Huang (Berkeley), Pieter Abbeel (Berkeley), Somesh Jha (U of Wisconsin), Michael Backes (CISPA), Alexey Kurakin (Google), Dan Boneh (Stanford), Praveen Manoharan (CISPA), Z. Berkay Celik (Penn State), Patrick McDaniel (Penn State), Yan Duan (OpenAI), Arunesh Sinha (U of Michigan), Ian Goodfellow (Google), Ananthram Swami (US ARL), Matt Fredrikson (CMU), Florian Tramèr (Stanford), Kathrin Grosse (CISPA), Michael Wellman (U of Michigan).
Gradient Masking
Training: minimize a loss that is small when the prediction is correct on a legitimate input.
Adversarial training: minimize a loss with two terms, one that is small when the prediction is correct on a legitimate input, and one that is small when the prediction is correct on an adversarial input.
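To make the two terms concrete, here is a minimal runnable sketch of adversarial training on a toy logistic-regression model, where the adversarial input is crafted with a single FGSM step. The toy data, the model, and the equal weighting of the two loss terms are illustrative assumptions, not the setup from the slides.

    import numpy as np

    # Toy data and a logistic-regression model (illustrative assumptions).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(128, 10))
    y = (X[:, 0] > 0).astype(float)
    w, b = np.zeros(10), 0.0
    eps, lr = 0.1, 0.5

    def predict(X):
        return 1.0 / (1.0 + np.exp(-(X @ w + b)))

    def cross_entropy(p, y):
        return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

    for step in range(200):
        p = predict(X)
        # FGSM: perturb each input along the sign of the input gradient of the loss.
        grad_x = (p - y)[:, None] * w            # analytic d(loss)/dx for this model
        X_adv = X + eps * np.sign(grad_x)
        p_adv = predict(X_adv)
        # Adversarial training objective: both terms should become small.
        loss = 0.5 * (cross_entropy(p, y) + cross_entropy(p_adv, y))
        # Analytic parameter gradients of the combined objective.
        grad_w = 0.5 * (X.T @ (p - y) + X_adv.T @ (p_adv - y)) / len(y)
        grad_b = 0.5 * (np.sum(p - y) + np.sum(p_adv - y)) / len(y)
        w, b = w - lr * grad_w, b - lr * grad_b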
Gradient masking in adversarially trained models
[Illustration, adapted from slides by Florian Tramèr and built up over three slides: around a data point, the direction of the adversarially trained model's gradient is contrasted with the direction of another model's gradient. The point reached along the adversarially trained model's gradient is a non-adversarial example, while an adversarial example lies along the other model's gradient direction.]
Tramèr et al., Ensemble Adversarial Training: Attacks and Defenses.
Evading gradient masking (1)
Threat model: white-box adversary.
Attack: (1) a random step of norm alpha, then (2) an FGSM step of norm eps - alpha (sketched below).
Tramèr et al., Ensemble Adversarial Training: Attacks and Defenses. Illustration adapted from slides by Florian Tramèr.
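A minimal sketch of this two-step attack is below; it assumes a grad_loss_x callable returning the gradient of the model's loss with respect to its input, inputs in [0, 1], and a sign-of-Gaussian random first step, all of which are assumptions for illustration rather than the exact formulation in the cited work.

    import numpy as np

    def rand_fgsm(x, y_true, grad_loss_x, eps=0.3, alpha=0.15, rng=None):
        rng = rng or np.random.default_rng()
        # (1) Random step of L-infinity norm alpha, to move off the sharply
        #     curved (masked) region of the loss surface around the data point.
        x_prime = x + alpha * np.sign(rng.normal(size=x.shape))
        # (2) FGSM step of norm eps - alpha from the displaced point.
        x_adv = x_prime + (eps - alpha) * np.sign(grad_loss_x(x_prime, y_true))
        # Keep the total perturbation inside the threat model's eps ball and
        # the valid input range (assumed to be [0, 1]).
        x_adv = np.clip(x_adv, x - eps, x + eps)
        return np.clip(x_adv, 0.0, 1.0)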
Evading gradient masking (2)
Threat model: black-box adversary.
Attack: (1) learn a substitute for the defended model, then (2) find an adversarial direction using the substitute.
Papernot et al., Practical Black-Box Attacks against Machine Learning. Papernot et al., Towards the Science of Security and Privacy in Machine Learning.
Attacking black-box models
[Illustration: a local substitute model alongside the black-box ML system; example labels shown include "no truck sign" and "STOP sign".]
(1) The adversary queries the remote ML system with synthetic inputs to learn a local substitute.
Papernot et al., Practical Black-Box Attacks against Machine Learning.
Attacking black-box models (continued)
[Illustration: an adversarial example crafted on the local substitute is labeled "yield sign".]
(2) The adversary uses the local substitute to craft adversarial examples.
Papernot et al., Practical Black-Box Attacks against Machine Learning.
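The two steps can be condensed into the heavily simplified sketch below: label synthetic inputs via the remote model, fit a local substitute, and craft FGSM examples on the substitute. The remote_predict callable, the Gaussian synthetic inputs, and the scikit-learn logistic-regression substitute are illustrative assumptions; the cited attack trains a neural-network substitute with Jacobian-based dataset augmentation.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def black_box_attack(remote_predict, X_seed, eps=0.3, rng=None):
        """remote_predict: returns the black-box system's labels (0/1).
        X_seed: seed inputs with values in [0, 1], shape (n, d)."""
        rng = rng or np.random.default_rng(0)
        # (1) Query the remote system on synthetic inputs, fit a substitute.
        X_syn = np.clip(X_seed + rng.normal(scale=0.1, size=X_seed.shape), 0.0, 1.0)
        substitute = LogisticRegression(max_iter=1000).fit(X_syn, remote_predict(X_syn))
        # (2) Craft FGSM examples on the substitute. For binary logistic
        # regression, the input gradient of the cross-entropy loss is (p - y) * w.
        w = substitute.coef_[0]
        p = substitute.predict_proba(X_seed)[:, 1]
        y = remote_predict(X_seed).astype(float)
        grad_x = (p - y)[:, None] * w
        return np.clip(X_seed + eps * np.sign(grad_x), 0.0, 1.0)

The returned inputs are then submitted to the remote system; the transferability discussed on the next slides is what makes them likely to be misclassified there as well.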
Adversarial example transferability
Papernot et al., Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples.
Large adversarial subspaces enable transferability: on average, about 44 orthogonal adversarial directions are found on the source model, and about 25 of them transfer to the target model.
Tramèr et al., The Space of Transferable Adversarial Examples.
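As a sketch of the measurement behind this statistic, the snippet below counts, for a set of orthogonal perturbation directions found on a source model, how many are adversarial on the source and how many of those also fool a target model. The construction of the orthogonal adversarial directions themselves (the method of the cited paper) is not reproduced here; the predict callables and the directions array are assumptions.

    import numpy as np

    def count_transferable_directions(x, y_true, directions, eps,
                                      source_predict, target_predict):
        """directions: array of shape (k, d) of pairwise-orthogonal unit vectors."""
        adversarial_on_source, transfer_to_target = 0, 0
        for r in directions:
            x_adv = x + eps * r                      # step of size eps along r
            if source_predict(x_adv) != y_true:
                adversarial_on_source += 1
                if target_predict(x_adv) != y_true:
                    transfer_to_target += 1
        return adversarial_on_source, transfer_to_target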
Adversarial training (revisited): the loss is small when the prediction is correct on a legitimate input and small when the prediction is correct on an adversarial input, but because of gradient masking, the adversarial input crafted from the model's own gradient is not truly adversarial.
Ensemble Adversarial Training
Ensemble adversarial training
Intuition: present adversarial gradients from multiple models during training.
[Illustration, built up over several slides: with plain adversarial training, the adversarial gradient comes from Model A alone; with ensemble adversarial training, adversarial gradients are drawn from Model A as well as pre-trained models B, C, and D during training, and the trained model is then used at inference.]
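A minimal sketch of this training loop follows: each step's FGSM examples are crafted using the input gradient of a model drawn from a pool containing the trained model and several static pre-trained models. The grad_loss_x_pool and train_step callables, the uniform sampling over the pool, and the 50/50 clean/adversarial mix are assumptions standing in for a real training setup.

    import numpy as np

    def ensemble_adversarial_training(X, y, grad_loss_x_pool, train_step,
                                      epochs=10, eps=0.3, rng=None):
        """grad_loss_x_pool: callables (X, y) -> dLoss/dX, one per model; index 0
        is the model being trained, the others are static pre-trained models.
        train_step: callable that updates the trained model on a labelled batch."""
        rng = rng or np.random.default_rng(0)
        for _ in range(epochs):
            # Pick this step's source of adversarial gradients from the pool.
            grad_loss_x = grad_loss_x_pool[rng.integers(len(grad_loss_x_pool))]
            # FGSM examples crafted on the selected model's gradient.
            X_adv = np.clip(X + eps * np.sign(grad_loss_x(X, y)), 0.0, 1.0)
            # Update the trained model on a mix of clean and adversarial data.
            train_step(np.concatenate([X, X_adv]), np.concatenate([y, y]))

In the cited paper, only the augmented model is deployed at inference; the pre-trained models serve purely as sources of diverse adversarial gradients during training.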
Experimental results on MNIST (attacks from held-out models).
Experimental results on ImageNet (attacks from held-out models).
Reproducible Adversarial ML research with CleverHans
CleverHans library guiding principles:
1. Benchmark reproducibility
2. Can be used with any TensorFlow model (see the sketch below)
3. Always include state-of-the-art attacks and defenses
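As an illustration of principle 2, the sketch below expresses FGSM directly against any TensorFlow 1.x model that exposes its logits. This is plain TensorFlow rather than the library's own API, and the model_logits, x, and y names are assumptions.

    import tensorflow as tf

    def fgsm(x, y, model_logits, eps=0.3):
        """x: input tensor, y: one-hot label tensor,
        model_logits: callable mapping the input tensor to logits."""
        loss = tf.nn.softmax_cross_entropy_with_logits(
            labels=y, logits=model_logits(x))
        grad, = tf.gradients(loss, x)
        x_adv = x + eps * tf.sign(grad)
        # Clip to the valid input range and cut the gradient so a surrounding
        # training graph (e.g. adversarial training) treats the attack as constant.
        return tf.stop_gradient(tf.clip_by_value(x_adv, 0.0, 1.0))

Because the attack only needs the input tensor and a logits function, the same definition can be reused across models, which is what makes reproducible benchmark comparisons possible.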
Growing community: 1.1K+ stars, 290+ forks, 35 contributors.
Adversarial examples represent worst-case distribution drifts.
[DDS04] Dalvi et al., Adversarial Classification (KDD 2004).
Adversarial examples are a tangible instance of hypothetical AI safety problems.
Image source: http://www.nerdist.com/wp-content/uploads/2013/07/Space-Odyssey-4.jpg
Thank you for listening!
nicolas@papernot.fr | www.cleverhans.io | @NicolasPapernot
Get involved at: github.com/tensorflow/cleverhans
This research was funded by: [sponsor logos]