AI and Security: Lessons, Challenges & Future Directions Dawn Song UC Berkeley
AI and Security
• AI enables security applications (AI as an enabler for security)
• Security enables better AI (security as an enabler for AI):
  • Integrity: produces intended/correct results (adversarial machine learning)
  • Confidentiality/Privacy: does not leak users’ sensitive data (secure, privacy-preserving machine learning)
• Preventing misuse of AI
AI and Security: AI in the presence of an attacker
AI and Security: AI in the presence of an attacker
• Important to consider the presence of an attacker
• History has shown that attackers always follow the footsteps of new technology development (or sometimes even lead it)
• The stakes are even higher with AI:
  • As AI controls more and more systems, attackers will have higher and higher incentives
  • As AI becomes more and more capable, the consequences of misuse by attackers will become more and more severe
AI and Security: AI in the presence of an attacker
• Attack AI
  • Cause the learning system to not produce intended/correct results
  • Cause the learning system to produce a targeted outcome designed by the attacker
  • Learn sensitive information about individuals
  • Need security in learning systems
• Misuse AI
  • Misuse AI to attack other systems
    • Find vulnerabilities in other systems
    • Target attacks
    • Devise attacks
  • Need security in other systems
Deep Learning Systems Are Easily Fooled
[Image: adversarially perturbed photos misclassified as “ostrich”]
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. Intriguing properties of neural networks. ICLR 2014.
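For concreteness, here is a minimal sketch of how such an adversarial example can be crafted with a one-step gradient-sign (FGSM-style) perturbation. This is an illustration only, not the method of the cited paper (which uses an L-BFGS-based optimization); `model`, `x` (a batched image tensor in [0, 1]), and `y` (its label) are assumed to be given.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.01):
    """One-step gradient-sign perturbation that increases the classification loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)    # loss of the correct label
    loss.backward()
    x_adv = x + eps * x.grad.sign()        # nudge each pixel against the model
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixels in the valid range
```

A small eps keeps the perturbation imperceptible to humans while often being enough to flip the model’s prediction.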
STOP Signs in Berkeley
Adversarial Examples in Physical World
Can we generate adversarial examples in the physical world that remain effective under different viewing conditions and viewpoints, including viewing distances and angles?
Adversarial Examples in Physical World: Subtle Perturbations
Evtimov, Ivan, Kevin Eykholt, Earlence Fernandes, Tadayoshi Kohno, Bo Li, Atul Prakash, Amir Rahmati, and Dawn Song. “Robust Physical-World Attacks on Machine Learning Models.” arXiv preprint arXiv:1707.08945 (2017).
Adversarial Examples in Physical World: Camouflage Perturbations
Adversarial Examples in Physical World
Adversarial perturbations are possible in the physical world under different viewing conditions and viewpoints, including viewing distances and angles.
Loss function: (see the sketch below)
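One plausible form of this loss, following the robust physical perturbation formulation in the paper cited above (the notation here is an assumption, not text recovered from the slide), penalizes the size of the perturbation while pushing the classifier toward the attacker’s target label in expectation over images sampled under different physical conditions:

```latex
\delta^{*} \;=\; \arg\min_{\delta}\;
  \lambda \,\lVert \delta \rVert_{p}
  \;+\; \mathbb{E}_{x_i \sim X^{V}}\;
        J\!\left(f_{\theta}(x_i + \delta),\; y^{*}\right)
```

Here X^V is a set of images of the object taken at varying distances and angles, f_θ is the classifier, J its loss, and y* the attacker’s target label.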
Adversarial Examples Prevalent in Deep Learning Systems
• Most existing work on adversarial examples:
  • Image classification task
  • Target model is known
• Our investigation on adversarial examples:
  • Other tasks and model classes: generative models, deep reinforcement learning, visual QA / image-to-code
  • Weaker threat models: black-box attacks (target model is unknown)
  • New attack methods: provide more diversity of attacks
Generative Models
● VAE-like models (VAE, VAE-GAN) use an intermediate latent representation
● An encoder maps a high-dimensional input into a lower-dimensional latent representation z
● A decoder maps the latent representation z back to a high-dimensional reconstruction
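A minimal sketch of this encoder/decoder structure in PyTorch, sized for 28×28 MNIST-style inputs. The architecture details here are illustrative assumptions, not the exact models evaluated in the papers below.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, latent_dim=32):
        super().__init__()
        # Encoder: high-dimensional input -> parameters of the latent code z
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(784, 400), nn.ReLU())
        self.mu = nn.Linear(400, latent_dim)
        self.logvar = nn.Linear(400, latent_dim)
        # Decoder: latent code z -> high-dimensional reconstruction
        self.dec = nn.Sequential(nn.Linear(latent_dim, 400), nn.ReLU(),
                                 nn.Linear(400, 784), nn.Sigmoid())

    def encode(self, x):
        h = self.enc(x)
        return self.mu(h), self.logvar(h)

    def decode(self, z):
        return self.dec(z).view(-1, 1, 28, 28)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return self.decode(z), mu, logvar
```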
Adversarial Examples in Generative Models
● An example attack scenario: the generative model is used as a compression scheme (the encoder compresses, the decoder decompresses)
● Attacker’s goal: make the decompressor reconstruct a different image from the one the compressor sees
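One way to realize this goal is a latent-space attack: find a small perturbation of the source image whose latent code matches that of an attacker-chosen target image, so the decoder reconstructs (roughly) the target instead of the source. The sketch below assumes the VAE interface from the previous block; the paper studies several attack variants, and this is only one of them.

```python
import torch

def latent_attack(vae, x_src, x_tgt, eps=0.1, steps=200, lr=1e-2):
    """Perturb x_src (within an L-inf ball of radius eps) so its latent code matches x_tgt's."""
    with torch.no_grad():
        z_tgt, _ = vae.encode(x_tgt)                 # latent code of the target image
    delta = torch.zeros_like(x_src, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        z_adv, _ = vae.encode((x_src + delta).clamp(0, 1))
        loss = (z_adv - z_tgt).pow(2).sum()          # drive the two latent codes together
        opt.zero_grad()
        loss.backward()
        opt.step()
        delta.data.clamp_(-eps, eps)                 # keep the perturbation small
    return (x_src + delta).clamp(0, 1).detach()
```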
Adversarial Examples for VAE-GAN in MNIST
[Images: target image; original images; reconstructions of original images; adversarial examples; reconstructions of adversarial examples]
Jernej Kos, Ian Fischer, Dawn Song: Adversarial Examples for Generative Models
Adversarial Examples for VAE-GAN in SVHN
[Images: target image; original images; reconstructions of original images; adversarial examples; reconstructions of adversarial examples]
Jernej Kos, Ian Fischer, Dawn Song: Adversarial Examples for Generative Models
Deep Reinforcement Learning Agent (A3C) Playing Pong
[Video: original frames]
Jernej Kos and Dawn Song: Delving into adversarial attacks on deep policies [ICLR Workshop 2017].
Adversarial Examples on A3C Agent on Pong
[Plot: score vs. number of steps]
Jernej Kos and Dawn Song: Delving into adversarial attacks on deep policies [ICLR Workshop 2017].
Attacks Guided by Value Function
[Plots: score vs. number of steps for (a) blindly injecting adversarial perturbations every 10 frames and (b) injecting adversarial perturbations guided by the value function]
Agent in Action
[Videos: original frames; with FGSM perturbations (ε = 0.005) injected in every frame; with FGSM perturbations (ε = 0.005) injected based on the value function]
Jernej Kos and Dawn Song: Delving into adversarial attacks on deep policies [ICLR Workshop 2017].
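A minimal sketch of the value-function-guided strategy compared above: inject an FGSM perturbation only on frames where the agent’s value estimate says the state matters, rather than on every frame. The interfaces are assumptions: `policy` returns action logits and `value_fn` returns a scalar state-value estimate for a (batched) preprocessed frame; the threshold is hypothetical.

```python
import torch
import torch.nn.functional as F

def maybe_perturb(policy, value_fn, frame, eps=0.005, threshold=0.5):
    """Perturb the frame only if the estimated state value exceeds a threshold."""
    if value_fn(frame).item() < threshold:
        return frame                            # low-value state: leave the frame alone
    frame = frame.clone().detach().requires_grad_(True)
    logits = policy(frame)
    action = logits.argmax(dim=-1)              # action the unperturbed agent would take
    loss = F.cross_entropy(logits, action)      # increasing this loss demotes that action
    loss.backward()
    return (frame + eps * frame.grad.sign()).detach()
```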
Visual Q&A Given a question and an image, predict the answer.
Studied VQA Models
Model 1: MCB (https://arxiv.org/abs/1606.01847)
• Uses Multimodal Compact Bilinear pooling to combine the image feature and question embedding.
Studied VQA Models
Model 2: NMN (https://arxiv.org/abs/1704.05526)
• A representative of neural module networks
• First predicts a network layout according to the question, then predicts the answer using the obtained network.
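A sketch of how a targeted attack on such models can be mounted, assuming a differentiable VQA model `vqa_model(image, question)` that returns answer logits over a fixed answer vocabulary. The interface, step count, and perturbation bound are illustrative assumptions; MCB and NMN each have their own APIs and preprocessing.

```python
import torch
import torch.nn.functional as F

def vqa_targeted_attack(vqa_model, image, question, target_answer_id,
                        eps=0.03, steps=100, lr=1e-2):
    """Perturb the image so the model's answer to the question becomes the target answer."""
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    target = torch.tensor([target_answer_id])
    for _ in range(steps):
        logits = vqa_model(image + delta, question)   # answer logits for the perturbed image
        loss = F.cross_entropy(logits, target)        # pull the prediction toward the target
        opt.zero_grad()
        loss.backward()
        opt.step()
        delta.data.clamp_(-eps, eps)                  # keep the perturbation visually subtle
    return (image + delta).detach()
```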
Question: What color is the sky? Original answer: MCB - blue, NMN - blue. Target: gray. Answer after attack: MCB - gray, NMN - gray.
[Images: benign image; adversarial image for MCB; adversarial image for NMN]
Xiaojun Xu, Xinyun Chen, Chang Liu, Anna Rohrbach, Trevor Darrell, Dawn Song: Can you fool AI with adversarial examples on a visual Turing test?
Question: Is it raining? Original answer: MCB - no, NMN - no. Target: yes. Answer after attack: MCB - yes, NMN - yes.
[Images: benign image; adversarial image for MCB; adversarial image for NMN]
Question: What is on the ground? Original answer: MCB - sand, NMN - sand. Target: snow. Answer after attack: MCB - snow, NMN - snow.
[Images: benign image; adversarial image for MCB; adversarial image for NMN]
Question: Where is the plane? Original answer: MCB - runway, NMN - runway. Target: sky. Answer after attack: MCB - sky, NMN - sky.
[Images: benign image; adversarial image for MCB; adversarial image for NMN]
Question: What color is the traffic light? Original answer: MCB - green, NMN - green. Target: red. Answer after attack: MCB - red, NMN - red.
[Images: benign image; adversarial image for MCB; adversarial image for NMN]
Question: What does the sign say? Original answer: MCB - stop, NMN - stop. Target: one way. Answer after attack: MCB - one way, NMN - one way.
[Images: benign image; adversarial image for MCB; adversarial image for NMN]
Question: How many cats are there? Original answer: MCB - 1, NMN - 1. Target: 2. Answer after attack: MCB - 2, NMN - 2.
[Images: benign image; adversarial image for MCB; adversarial image for NMN]
Adversarial Examples Prevalent in Deep Learning Systems
• Most existing work on adversarial examples:
  • Image classification task
  • Target model is known
• Our investigation on adversarial examples:
  • Other tasks and model classes: generative models, deep reinforcement learning, visual QA / image-to-code
  • Weaker threat models: black-box attacks (target model is unknown)
  • New attack methods: provide more diversity of attacks
A General Framework for Black-box Attacks
• Zero-query attacks (previous methods)
  • Random perturbation
  • Difference of means
  • Transferability-based attack: Practical Black-Box Attacks against Machine Learning [Papernot et al. 2016]
  • Ensemble transferability-based attack (see the sketch below): [Yanpei Liu, Xinyun Chen, Chang Liu, Dawn Song: Delving into Transferable Adversarial Examples and Black-box Attacks, ICLR 2017]
• Query-based attacks (new method)
  • Finite difference gradient estimation
  • Query-reduced gradient estimation
  • A general active query game model
The zero-query attack can be viewed as a special case of the query-based attack, where the number of queries made is zero.
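A sketch of the ensemble transferability-based (zero-query) attack referenced above: craft the adversarial example against several local surrogate models, then hand it to the unknown black-box target without ever querying it. The surrogate list, loss, and bounds are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def ensemble_transfer_attack(surrogates, x, target_label, eps=0.03, steps=50, lr=1e-2):
    """Craft a targeted adversarial example against an ensemble of white-box surrogates."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    target = torch.tensor([target_label])
    for _ in range(steps):
        # Average the targeted loss over the surrogate ensemble; an example that fools
        # all surrogates is more likely to transfer to the unseen target model.
        loss = sum(F.cross_entropy(m(x + delta), target) for m in surrogates) / len(surrogates)
        opt.zero_grad()
        loss.backward()
        opt.step()
        delta.data.clamp_(-eps, eps)
    return (x + delta).detach()
```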
Query-Based Attacks
• Finite difference gradient estimation
  • Given a d-dimensional vector x, we can make 2d queries to estimate the gradient: for a black-box function g, the i-th component is FD_x(g, δ)_i = (g(x + δ·e_i) − g(x − δ·e_i)) / (2δ).
  • An example: approximate FGS with finite differences, x_adv = x + ε · sign(FD_x(ℓ_f(x, y), δ)). Similarly, we can approximate the logit-based loss by making 2d queries.
• Query-reduced gradient estimation (sketched below)
  • Random grouping
  • PCA
[Bhagoji, Li, He, Song, 2017]
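A sketch of finite-difference gradient estimation and the FGS step above, assuming only black-box query access to a scalar loss `loss_fn(x)` on a float NumPy array x. The function names and the random-grouping query reduction are illustrative.

```python
import numpy as np

def fd_gradient(loss_fn, x, delta=1e-3):
    """Estimate the gradient of loss_fn at x with 2d queries (d = x.size)."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = delta
        grad.flat[i] = (loss_fn(x + e) - loss_fn(x - e)) / (2 * delta)
    return grad

def fd_fgs(loss_fn, x, eps=0.01, delta=1e-3):
    """x_adv = x + eps * sign(FD_x(loss, delta)): FGS using the estimated gradient."""
    return x + eps * np.sign(fd_gradient(loss_fn, x, delta))

def grouped_fd_gradient(loss_fn, x, num_groups=50, delta=1e-3, seed=0):
    """Query reduction via random grouping: one shared estimate per group of coordinates."""
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(x)
    for idx in np.array_split(rng.permutation(x.size), num_groups):
        e = np.zeros_like(x)
        e.flat[idx] = delta
        grad.flat[idx] = (loss_fn(x + e) - loss_fn(x - e)) / (2 * delta)
    return grad
```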
Query-Based Attacks
• The finite-difference method outperforms other black-box attacks and achieves an attack success rate similar to the white-box attack.
• Gradient estimation with query reduction performs approximately as well as without query reduction.