An Abstract Domain for Certifying Neural Networks


1. An Abstract Domain for Certifying Neural Networks
Gagandeep Singh, Timon Gehr, Markus Püschel, Martin Vechev
Department of Computer Science, ETH Zurich

2. Adversarial input perturbations
[Figure: the neural network f classifies the original image x₀ as 8, a perturbed image x ∈ L∞(x₀, ε) as 7, and an image x ∈ Rotate(x₀, ε, α, β) as 9.]

3. Neural network robustness
Given: a neural network f: ℝ^m ⟶ ℝ^n and a perturbation region ℛ(x₀, φ):
• L∞(x₀, ε): all images x whose pixel values differ from those of x₀ by at most ε
• Rotate(x₀, ε, α, β): all images x in L∞(x₀, ε) rotated by an angle θ ∈ [α, β]
To prove: ∀x ∈ ℛ(x₀, φ). f_c(x) > f_j(x), where c is the correct output and j is any other output.
Challenge: the size of ℛ(x₀, φ) grows exponentially in the number of pixels, so one cannot compute f(x) for every x separately.
Prior work:
• precise but does not scale: SMT solving [CAV'17], input refinement [USENIX'18], semidefinite relaxations [ICLR'18]
• scales but imprecise: linear relaxations [ICML'18], abstract interpretation [S&P'18, NIPS'18]
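Stated as code for a single input, the region membership test and the property to prove look as follows. This is only an illustrative sketch (the function names are hypothetical); the whole difficulty is that the property must hold for every x in ℛ(x₀, φ), so it cannot be checked by enumerating inputs.

```python
import numpy as np

def in_linf_ball(x, x0, eps):
    """Membership in L_inf(x0, eps): every pixel of x is within eps of x0."""
    return np.all(np.abs(x - x0) <= eps)

def is_robust_at(f, x, c):
    """The property to certify, for one input x: the score of the correct
    class c exceeds the score of every other class j."""
    scores = f(x)
    return all(scores[c] > scores[j] for j in range(len(scores)) if j != c)
```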

4. This work: contributions
A new abstract domain combining floating-point Polyhedra with Intervals:
• custom transformers for common functions in neural networks such as affine transforms, ReLU, sigmoid, tanh, and maxpool activations
• scalable and precise analysis
First approach to certify robustness under rotation combined with linear interpolation:
• based on refinement of the abstract input
• ε = 0.001, α = −45°, β = 65°
DeepPoly:
• complete and parallelized end-to-end implementation based on ELINA
• https://github.com/eth-sri/eran

Network | ε | NIPS'18 | DeepPoly
6 layers, 3,010 units | 0.035 | proves 21%, 15.8 sec | proves 64%, 4.8 sec
6 layers, 34,688 units | 0.3 | proves 37%, 17 sec | proves 43%, 88 sec

5. Our Abstract Domain
Shape: associate a lower polyhedral constraint a_i^≤ and an upper polyhedral constraint a_i^≥ with each neuron x_i.
Domain invariant: store auxiliary concrete lower and upper bounds l_i, u_i for each x_i.
Concretization of an abstract element a: all points that satisfy every lower and upper constraint.
• less precise than Polyhedra; the restriction (one lower and one upper constraint per variable) is needed to ensure scalability
• captures affine transformations precisely, unlike Octagon or TVPI
• custom transformers for ReLU, sigmoid, tanh, and maxpool activations

Complexity (n: #neurons, m: #constraints, w_max: max #neurons in a layer, L: #layers):
Transformer | Polyhedra | Our domain
Affine | O(n·m²) | O(w_max²·L)
ReLU | O(exp(n, m)) | O(1)
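A minimal sketch of how such an abstract element could be represented, assuming a layer-wise layout in which each constraint is linear in the previous layer's neurons; the class and field names are illustrative, not the ERAN/ELINA implementation.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class AbstractLayer:
    # Each neuron i of this layer carries a lower and an upper polyhedral constraint,
    # expressed as a linear function of the previous layer's neurons:
    #   sum_j coeffs[i, j] * x_prev_j + const[i]
    lower_coeffs: np.ndarray   # shape (n_this, n_prev)
    lower_const:  np.ndarray   # shape (n_this,)
    upper_coeffs: np.ndarray   # shape (n_this, n_prev)
    upper_const:  np.ndarray   # shape (n_this,)
    # Domain invariant: auxiliary concrete bounds l_i <= x_i <= u_i
    l: np.ndarray              # shape (n_this,)
    u: np.ndarray              # shape (n_this,)
```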

6. Example: Analysis of a Toy Neural Network
[Figure: a small fully connected network with inputs x₁, x₂ ∈ [−1, 1], hidden layers x₃ … x₁₀, and outputs x₁₁, x₁₂. Reading the edge weights and biases off the diagram: x₃ = x₁ + x₂, x₄ = x₁ − x₂; x₅ = max(0, x₃), x₆ = max(0, x₄); x₇ = x₅ + x₆, x₈ = x₅ − x₆; x₉ = max(0, x₇), x₁₀ = max(0, x₈); x₁₁ = x₉ + x₁₀ + 1, x₁₂ = x₁₀.]
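To make the running example concrete, here is a plain NumPy version of the toy network with the weights and biases read off the diagram as reconstructed above (treat them as an assumption); the grid check at the end is only a sanity test, not a proof.

```python
import numpy as np

# Toy network from the running example (weights/biases as reconstructed above).
W1, b1 = np.array([[1., 1.], [1., -1.]]), np.array([0., 0.])   # x3, x4
W2, b2 = np.array([[1., 1.], [1., -1.]]), np.array([0., 0.])   # x7, x8
W3, b3 = np.array([[1., 1.], [0., 1.]]), np.array([1., 0.])    # x11, x12

def f(x):
    h = np.maximum(0, W1 @ x + b1)   # x5, x6
    h = np.maximum(0, W2 @ h + b2)   # x9, x10
    return W3 @ h + b3               # x11, x12

# Property to certify: x11 - x12 > 0 for all inputs in [-1, 1] x [-1, 1].
# Sampling a grid is only a sanity check, not a certificate:
xs = np.linspace(-1, 1, 21)
assert all(f(np.array([a, b]))[0] > f(np.array([a, b]))[1] for a in xs for b in xs)
```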

7. [Figure: the toy network from slide 6 again, annotated step by step with the constraints and interval bounds computed by the analysis.]

8. ReLU activation
Pointwise transformer for x_j := max(0, x_i) that uses l_i, u_i:
(a) if u_i ≤ 0: a_j^≤ = a_j^≥ = 0 and l_j = u_j = 0
(b) if l_i ≥ 0: a_j^≤ = a_j^≥ = x_i, l_j = l_i, u_j = u_i
(c) if l_i < 0 and u_i > 0: the upper constraint is the line through (l_i, 0) and (u_i, u_i), i.e. a_j^≥ = u_i·(x_i − l_i)/(u_i − l_i); the lower constraint is either 0 or x_i, chosen depending on which yields the smaller area
[Figure: the relaxation applied to x₅ = max(0, x₃) and x₆ = max(0, x₄) in the example.]
Constant runtime.
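The case analysis above can be written down compactly. The following sketch returns the constraints as (coefficient, constant) pairs over x_i; the function name and return shape are illustrative, not the tool's API.

```python
def relu_transformer(l_i: float, u_i: float):
    """Sketch of the case analysis for x_j := max(0, x_i).
    Returns (lower constraint, upper constraint, l_j, u_j), where each
    constraint is a (coeff, const) pair meaning coeff * x_i + const."""
    if u_i <= 0:                      # case (a): ReLU always inactive
        return (0.0, 0.0), (0.0, 0.0), 0.0, 0.0
    if l_i >= 0:                      # case (b): ReLU always active, exact
        return (1.0, 0.0), (1.0, 0.0), l_i, u_i
    # crossing case: upper constraint is the line through (l_i, 0) and (u_i, u_i)
    lam = u_i / (u_i - l_i)
    upper = (lam, -lam * l_i)         # x_j <= lam * (x_i - l_i)
    # lower constraint: 0 or x_i, whichever gives the smaller relaxation area
    lower = (0.0, 0.0) if u_i <= -l_i else (1.0, 0.0)
    return lower, upper, 0.0, u_i     # concrete bounds: 0 <= x_j <= u_i
```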

9. Affine transformation after ReLU
[Figure: the affine node x₇ = x₅ + x₆ (weights 1, 1, bias 0) following the two ReLUs.]
An imprecise upper bound u₇ is obtained by substituting u₅ and u₆ for x₅ and x₆ in a₇^≥.
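This imprecise step is plain interval arithmetic; a sketch, assuming the affine layer is given as a weight matrix W and bias b:

```python
import numpy as np

def interval_affine_bounds(W, b, l_prev, u_prev):
    """Concrete bounds for x = W @ x_prev + b by substituting the previous
    layer's interval bounds into the affine expression: positive coefficients
    take the upper bound, negative ones the lower bound."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    lower = W_pos @ l_prev + W_neg @ u_prev + b
    upper = W_pos @ u_prev + W_neg @ l_prev + b
    return lower, upper
```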

10. Backsubstitution
[Figure: the same node x₇ = x₅ + x₆; instead of stopping at the concrete bounds of x₅ and x₆, their polyhedral constraints are substituted into a₇^≥ and the expression is pushed further back through the earlier layers.]

11. [Figure: backsubstitution of x₇'s constraints through x₅ = max(0, x₃), x₆ = max(0, x₄) and the first affine layer, down to the inputs x₁ and x₂.]
Affine transformation with backsubstitution is pointwise; complexity: O(w_max² · L).
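A sketch of the backsubstitution itself, reusing the AbstractLayer layout from the sketch after slide 5: the loop walks the layers backwards, substituting upper constraints for positive coefficients and lower constraints for negative ones, and only concretizes once the expression ranges over the inputs.

```python
import numpy as np

def upper_bound_by_backsubstitution(coeff, const, layers, l_in, u_in):
    """Upper bound of the linear expression coeff . x^(k) + const over layer k.
    `layers` lists AbstractLayer-like objects from the first hidden layer up to
    layer k (names as in the earlier sketch); l_in, u_in is the input box."""
    for layer in reversed(layers):
        pos, neg = np.maximum(coeff, 0), np.minimum(coeff, 0)
        const = const + pos @ layer.upper_const + neg @ layer.lower_const
        coeff = pos @ layer.upper_coeffs + neg @ layer.lower_coeffs
    pos, neg = np.maximum(coeff, 0), np.minimum(coeff, 0)
    return const + pos @ u_in + neg @ l_in
```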

12. [Figure: the toy network from slide 6 again, with the analysis carried through to the outputs x₁₁ and x₁₂.]

13. Checking for robustness
Prove x₁₁ − x₁₂ > 0 for all inputs in [−1, 1] × [−1, 1].
Computing a lower bound for x₁₁ − x₁₂ from the concrete bounds l₁₁ and u₁₂ gives −1, which is imprecise.
With backsubstitution, one gets 1 as the lower bound for x₁₁ − x₁₂, proving robustness.
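Using the network as reconstructed on slide 6, the backsubstituted bound can be followed by hand: x₁₁ − x₁₂ = (x₉ + x₁₀ + 1) − x₁₀ = x₉ + 1, and since x₉ = max(0, x₇) ≥ 0, the expression is at least 1 on the whole input region. Substituting the concrete bounds l₁₁ and u₁₂ instead discards the correlation between the two outputs (they share the term x₁₀) and therefore only yields −1.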

14. More complex perturbations: rotations
Challenge: Rotate(x₀, ε, α, β) is non-linear and, unlike L∞(x₀, ε), cannot be captured directly in our domain.
Solution: over-approximate Rotate(x₀, ε, α, β) with boxes and use input refinement for precision.
Result: prove robustness for networks under Rotate(x₀, 0.001, −45°, 65°).
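A schematic sketch of this refinement strategy; `rotation_box` (a sound per-pixel box over-approximation of the rotated images for one angle sub-interval) and `verify_region` (the analysis of one input box) are hypothetical helpers standing in for the actual tool.

```python
def certify_rotation(x0, eps, alpha, beta, splits, rotation_box, verify_region):
    """Box over-approximation + input refinement for rotations (sketch).
    The angle range [alpha, beta] is split into sub-intervals; each one is
    over-approximated by a per-pixel box, widened by the L-infinity radius eps,
    and verified separately. Robustness holds only if every sub-interval verifies."""
    step = (beta - alpha) / splits
    for k in range(splits):
        lo_angle, hi_angle = alpha + k * step, alpha + (k + 1) * step
        lower, upper = rotation_box(x0, lo_angle, hi_angle)   # sound per-pixel box
        if not verify_region(lower - eps, upper + eps):
            return False   # a refinement could split this sub-interval further
    return True
```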

15. More in the paper
• Sigmoid transformer
• Tanh transformer
• Maxpool transformer
• Floating point soundness

16. Experimental evaluation
• Neural network architectures: fully connected feedforward (FFNN), convolutional (CNN)
• Training: trained to be robust with DiffAI [ICML'18] and PGD [CVPR'18], and without adversarial training
• Datasets: MNIST, CIFAR10
• DeepPoly vs. state-of-the-art DeepZ [NIPS'18] and Fast-Lin [ICML'18]

17. Results

18. MNIST FFNN (3,010 hidden units)
[Figure: results plot.]

19. CIFAR10 CNNs (4,852 hidden units)
[Figure: results plot.]

20. Large Defended CNNs trained via DiffAI [ICML'18]

Dataset | Model | #hidden units | ε | %verified robustness (DeepZ / DeepPoly) | Average runtime in s (DeepZ / DeepPoly)
MNIST | ConvBig | 34,688 | 0.1 | 97 / 97 | 5 / 50
MNIST | ConvBig | 34,688 | 0.2 | 79 / 78 | 7 / 61
MNIST | ConvBig | 34,688 | 0.3 | 37 / 43 | 17 / 88
MNIST | ConvSuper | 88,500 | 0.1 | 97 / 97 | 133 / 400
CIFAR10 | ConvBig | 62,464 | 0.006 | 50 / 52 | 39 / 322
CIFAR10 | ConvBig | 62,464 | 0.008 | 33 / 40 | 46 / 331

21. Conclusion
A new abstract domain combining floating-point Polyhedra with Intervals (n: #neurons, m: #constraints, w_max: max #neurons in a layer, L: #layers):

Transformer | Polyhedra | Our domain
Affine | O(n·m²) | O(w_max²·L)
ReLU | O(exp(n, m)) | O(1)

DeepPoly:
• complete and parallelized end-to-end implementation based on ELINA
• https://github.com/eth-sri/eran
