synthesizing robust adversarial examples

  1. Synthesizing Robust Adversarial Examples Anish Athalye*, Logan Engstrom*, Andrew Ilyas*, Kevin Kwok

  2. Adversarial examples

  3. Adversarial examples • Imperceptible perturbations to an input can change a neural network's prediction [Figure: an image classified as "tabby cat" (88%), plus an adversarial perturbation, is classified as "guacamole" (99%)]

  4. Adversarial examples Given: input image $x$, target label $y$. Optimize: $\arg\max_{x'} P(y \mid x')$ subject to $d(x, x') < \epsilon$

  5. Do adversarial examples work in the physical world?

  6. Adversarial examples in the physical world (Kurakin et al. 2016)

  7. ... or not? "Foveation-based Mechanisms Alleviate Adversarial Examples" (Luo et al. 2015); "NO Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles" (Lu et al. 2017)

  8. Standard examples are fragile

  9. Are adversarial examples fundamentally fragile?

  10. Image processing pipeline: IMAGE → MODEL → PREDICTIONS. Optimize $P(y \mid x')$ using gradient descent

  11. Physical world processing pipeline: IMAGE → TRANSFORMATION → MODEL → PREDICTIONS. The transformation PARAMETERS are randomized. Challenge: no direct control over model input

  12. Attack: Expectation Over Transformation. Pipeline: IMAGE → TRANSFORMATION → MODEL → PREDICTIONS. The transformation is differentiable; its PARAMETERS are randomized, but the distribution $T$ is known. Optimize $\mathbb{E}_{t \sim T}[P(y \mid t(x'))]$ using gradient descent (sampling, chain rule, differentiating through $t$)

  13. EOT produces robust examples T = {rescale from 1x to 5x}

  14. EOT produces robust physical-world examples T = {rescale + rotate + translate + skew}

  15. Can we make this work with 3D objects?

  16. Physical world 3D processing pipeline: 3D MODEL + TEXTURE → RENDERING → MODEL → PREDICTIONS. Rendering PARAMETERS: zoom: 1.3x, rotation: [60°, 30°, 15°], translation: [1, 5, 0], ... Is this differentiable?

  17. Differentiable rendering • For any pose, 3D rendering is differentiable with respect to texture • Simplest renderer: linear transformation of texture

  18. EOT produces 3D adversarial objects

  19. EOT reliably produces 3D adversarial objects

      Inputs            Classification accuracy   Attacker success rate   Distortion (l2)
      2D  Original      70%                       N/A                     0
      2D  Adversarial   0.9%                      96.4%                   5.6 × 10^-5
      3D  Original      84%                       N/A                     0
      3D  Adversarial   1.7%                      84.0%                   6.5 × 10^-5

  20. Implications • Defenses based on randomized input transformations are insecure • Adversarial examples / objects are a physical-world concern Poster (and live demo): 6:15 – 9:00pm @ Hall B #73
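
The Expectation Over Transformation objective on slide 12 (and the constrained problem on slide 4) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: a tiny logistic model (`W`, `B`) stands in for the neural network, the distribution T is a random gain plus a small offset, the distance d is taken to be the L-infinity norm, and a signed gradient step is used as the optimizer.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy "model": logistic regression giving P(y | input).
W = [0.8, -0.5, 0.3, -0.9]
B = -1.0

def prob_target(x):
    return sigmoid(sum(w * xi for w, xi in zip(W, x)) + B)

def sample_transform(rng):
    # Assumed distribution T: a random gain and a small additive offset.
    gain = rng.uniform(0.5, 1.5)
    offset = [rng.gauss(0.0, 0.05) for _ in W]
    return gain, offset

def apply_transform(t, x):
    gain, offset = t
    return [gain * xi + oi for xi, oi in zip(x, offset)]

def grad_prob(t, x):
    # Chain rule through t: dP/dx = P * (1 - P) * W * gain.
    gain, _ = t
    p = prob_target(apply_transform(t, x))
    return [p * (1.0 - p) * w * gain for w in W]

def expected_prob(x, rng, n=2000):
    # Monte Carlo estimate of E_{t ~ T}[P(y | t(x))].
    total = 0.0
    for _ in range(n):
        total += prob_target(apply_transform(sample_transform(rng), x))
    return total / n

def eot_attack(x, eps=0.3, step=0.02, iters=100, samples=32, seed=0):
    rng = random.Random(seed)
    x_adv = list(x)
    for _ in range(iters):
        # Average the gradient over sampled transformations.
        grad = [0.0] * len(x)
        for _ in range(samples):
            g = grad_prob(sample_transform(rng), x_adv)
            grad = [a + b / samples for a, b in zip(grad, g)]
        # Signed ascent step, then project back so |x' - x| is at most eps.
        x_adv = [xa + step * math.copysign(1.0, g) for xa, g in zip(x_adv, grad)]
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

x = [0.2, 0.4, 0.1, 0.5]
x_adv = eot_attack(x)
```

Comparing `expected_prob(x_adv, random.Random(1))` with `expected_prob(x, random.Random(1))` shows whether the target probability rose on average over T rather than for a single fixed input. That average is the quantity EOT optimizes.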

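Slide 17 describes the simplest renderer as a linear transformation of the texture. As a rough sketch of why that makes rendering differentiable with respect to texture: if the rendered pixels are `M @ texture + background` for a pose-dependent matrix `M`, then the gradient of any loss with respect to the texture is `M` transposed times the gradient with respect to the pixels. The matrix, background, and loss below are hypothetical values chosen only to keep the example small.

```python
def render(M, background, texture):
    # Linear renderer: each pixel is a weighted sum of texels plus a background term.
    return [sum(m * t for m, t in zip(row, texture)) + bg
            for row, bg in zip(M, background)]

def texture_gradient(M, grad_pixels):
    # Backpropagate through the linear map: d loss / d texture = M^T grad_pixels.
    n_texels = len(M[0])
    return [sum(M[i][j] * grad_pixels[i] for i in range(len(M)))
            for j in range(n_texels)]

def loss(pixels):
    # Hypothetical stand-in for the classifier objective: sum of squared pixels.
    return sum(p * p for p in pixels)

# Hypothetical pose: 3 rendered pixels drawn from 2 texels.
M = [[1.0, 0.0], [0.5, 0.5], [0.0, 0.25]]
background = [0.0, 0.1, 0.2]
texture = [0.6, 0.2]

pixels = render(M, background, texture)
grad_pixels = [2.0 * p for p in pixels]        # derivative of the toy loss
grad_texture = texture_gradient(M, grad_pixels)

# Central finite differences as an independent check on the analytic gradient.
h = 1e-5
numeric = []
for j in range(len(texture)):
    up = list(texture)
    down = list(texture)
    up[j] += h
    down[j] -= h
    numeric.append((loss(render(M, background, up)) -
                    loss(render(M, background, down))) / (2 * h))
```

In an EOT setting, a different `M` would be sampled for each pose and the resulting texture gradients averaged, in the same way the transformation gradients are averaged in the 2D case.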