
Testing Deep Neural Networks Xiaowei Huang, University of Liverpool - PowerPoint PPT Presentation



  1. Testing Deep Neural Networks Xiaowei Huang, University of Liverpool

  2. Outline Safety Problem of AI Verification (brief) Testing Conclusions and Future Works

  3. Human-Level Intelligence

  4. Robotics and Autonomous Systems

  5. Deep neural networks all implemented with

  6. Figure: safety in image classification networks

  7. Figure: safety in natural language processing networks

  8. Figure: safety in voice recognition networks

  9. Figure: safety in security systems

  10. Safety Definition: Human Driving vs. Autonomous Driving Traffic image from “The German Traffic Sign Recognition Benchmark”

  11. Safety Definition: Human Driving vs. Autonomous Driving Image generated from our tool

  12. Safety Problem: Incidents

  13. Safety Definition: Illustration

  14. Safety Requirements ◮ Pointwise Robustness (this talk): the decision of a pair (input, network) is invariant with respect to perturbations of the input. ◮ Network Robustness, or more fundamentally, Lipschitz continuity, mutual information, etc. ◮ Model interpretability
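Pointwise robustness can be probed empirically by sampling perturbations within a radius and checking that the decision does not change. The sketch below is illustrative only: the names `classify` and `pointwise_robust` and the toy linear classifier are assumptions, not part of the talk, and sampling can only falsify robustness, never prove it.

```python
import random

def classify(x, w, b):
    """Toy linear classifier: returns 1 if w . x + b > 0, else 0."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s > 0 else 0

def pointwise_robust(x, w, b, radius, n_samples=1000, seed=0):
    """Empirically test whether the decision on x is invariant under
    L-infinity perturbations of the given radius (sampling, not a proof)."""
    rng = random.Random(seed)
    label = classify(x, w, b)
    for _ in range(n_samples):
        xp = [xi + rng.uniform(-radius, radius) for xi in x]
        if classify(xp, w, b) != label:
            return False  # found an adversarial example
    return True  # no counterexample found (not a guarantee)
```

A point far from the decision boundary passes the check for a small radius and fails it for a large one.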

  15. Certification of DNN https://github.com/TrustAI

  16. Outline Safety Problem of AI Verification (brief) Testing Conclusions and Future Works

  17. Safety Definition: Traffic Sign Example

  18. Maximum Safe Radius Definition The maximum safe radius problem is to compute the minimum distance from the original input α to an adversarial example, i.e., MSR(α) = min { ‖α − α′‖_k | α′ ∈ D, α′ is an adversarial example } (1)
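The minimisation in (1) is hard to solve exactly, but any sampled adversarial example yields an upper bound on MSR(α): the distance to the nearest sampled input that flips the label. A minimal sketch, assuming a toy one-dimensional "network" and an illustrative function name `msr_upper_bound`:

```python
import math
import random

def classify(x):
    """Toy 1-D 'network': the label depends on the sign of x - 2."""
    return 1 if x - 2 > 0 else 0

def msr_upper_bound(alpha, n_samples=10000, spread=5.0, seed=0):
    """Sample candidate inputs around alpha and return the distance to the
    nearest one with a different label: an upper bound on MSR(alpha)."""
    rng = random.Random(seed)
    label = classify(alpha)
    best = math.inf
    for _ in range(n_samples):
        ap = alpha + rng.uniform(-spread, spread)
        if classify(ap) != label:
            best = min(best, abs(ap - alpha))
    return best
```

For this toy classifier the true MSR at α = 0 is 2, and the sampled bound approaches it from above.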

  19. Existing Approaches ◮ layer-by-layer exhaustive search, see e.g., [2] (Huang, Kwiatkowska, Wang, Wu, CAV 2017) ◮ SMT, MILP, SAT based constraint solving, see e.g., [3] (Katz, Barrett, Dill, Julian, Kochenderfer, CAV 2017) ◮ global optimisation, see e.g., [6] (Ruan, Huang, Kwiatkowska, IJCAI 2018) ◮ abstract interpretation, see e.g., [1] (Gehr, Mirman, Drachsler-Cohen, Tsankov, Chaudhuri, Vechev, S&P 2018)

  20. Outline Safety Problem of AI Verification (brief) Testing Test Coverage Criteria Test Case Generation Conclusions and Future Works

  21. Deep Neural Networks (DNNs) [Figure: a four-layer network with input layer v_{1,1}, v_{1,2}, hidden layers n_{2,1}–n_{2,3} and n_{3,1}–n_{3,3}, and output layer u_{4,1}, u_{4,2}] label = argmax_{1 ≤ l ≤ s_K} u_{K,l}

  22. Deep Neural Networks (DNNs) [Figure: the same four-layer network] label = argmax_{1 ≤ l ≤ s_K} u_{K,l} 1) neuron activation value: u_{k,i} = b_{k,i} + Σ_{1 ≤ h ≤ s_{k−1}} w_{k−1,h,i} · v_{k−1,h} (a weighted sum plus a bias; the parameters w, b are learned) 2) rectified linear unit (ReLU): v_{k,i} = max{u_{k,i}, 0}

  23. DNN as a program
  ...
  // 1) neuron activation value
  u[k][i] = b[k][i]
  for (unsigned h = 0; h < s[k-1]; h += 1) {
    u[k][i] += w[k-1][h][i] * v[k-1][h]
  }
  v[k][i] = 0
  // 2) ReLU
  if (u[k][i] > 0) { v[k][i] = u[k][i] }
  ...
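The slide's loop can be written as a runnable sketch. The function names `relu_layer` and `forward` and the layer representation (each layer as a pair of weight matrix and bias list) are illustrative assumptions:

```python
def relu_layer(v_prev, W, b):
    """One DNN layer as on the slide: u[k][i] is a weighted sum plus a bias,
    and v[k][i] = max(u[k][i], 0) is the ReLU activation."""
    out = []
    for i in range(len(b)):
        u = b[i]
        for h in range(len(v_prev)):
            u += W[h][i] * v_prev[h]
        out.append(max(u, 0.0))
    return out

def forward(x, layers):
    """Run the input through all (W, b) layers; the predicted label is the
    argmax over the final layer's values (last layer kept linear here)."""
    v = x
    for (W, b) in layers[:-1]:
        v = relu_layer(v, W, b)
    W, b = layers[-1]
    u = [b[i] + sum(W[h][i] * v[h] for h in range(len(v)))
         for i in range(len(b))]
    return max(range(len(u)), key=lambda l: u[l])
```

With a tiny hand-built network (one input, two hidden neurons, two outputs), positive inputs map to label 0 and negative inputs to label 1.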

  24. Testing Framework ◮ Test Coverage Criteria ◮ Test Case Generation

  25. Examples of Test Coverage Criteria ◮ Neuron coverage [5] 5 ◮ Neuron boundary coverage [4] 6 ◮ MC/DC for DNNs [8] 7 ◮ Lipschitz continuity 5 Pei, Cao, Yang, Jana, SOSP2017. 6 Ma, Xu, Zhang, Sun, Xue, Li, Chen, Su, Li, Liu, Zhao, Wang, ASE2018 7 Sun, Huang , Kroening, ASE2018

  26. Neuron coverage For any hidden neuron n_{k,i}, there exists a test case t ∈ T such that the neuron n_{k,i} is activated: u_{k,i} > 0. Test coverage conditions: { ∃x. u_{k,i}[x] > 0 | 2 ≤ k ≤ K−1, 1 ≤ i ≤ s_k }

  27. Neuron coverage ◮ ≈ statement (line) coverage For any hidden neuron n_{k,i}, there exists a test case t ∈ T such that the neuron n_{k,i} is activated: u_{k,i} > 0. Test coverage conditions: { ∃x. u_{k,i}[x] > 0 | 2 ≤ k ≤ K−1, 1 ≤ i ≤ s_k } In the program view, covering the neuron means executing the line v[k][i] = u[k][i] inside the ReLU branch.

  28. Neuron Coverage Problem of neuron coverage: ◮ too easy to reach 100% coverage
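Neuron coverage of a test suite can be measured by recording which hidden neurons are ever activated; even a handful of inputs tends to push the metric to 100%, which is the weakness the slide notes. A minimal sketch, with the names `hidden_activations` and `neuron_coverage` and the layer representation assumed for illustration:

```python
def hidden_activations(x, layers):
    """Return the pre-ReLU values u[k][i] for every layer of a small
    fully-connected network given as (weight matrix, bias list) pairs."""
    us = []
    v = x
    for (W, b) in layers:
        u = [b[i] + sum(W[h][i] * v[h] for h in range(len(v)))
             for i in range(len(b))]
        us.append(u)
        v = [max(ui, 0.0) for ui in u]
    return us

def neuron_coverage(test_suite, layers):
    """Fraction of neurons n[k][i] activated (u[k][i] > 0)
    by at least one test case in the suite."""
    covered = set()
    total = sum(len(b) for (_, b) in layers)
    for x in test_suite:
        for k, u in enumerate(hidden_activations(x, layers)):
            for i, ui in enumerate(u):
                if ui > 0:
                    covered.add((k, i))
    return len(covered) / total
```

For a two-neuron layer computing (x, −x), a single test covers half the neurons and just two tests of opposite sign already reach 100%.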

  29. MC/DC in Software Testing Developed by NASA, MC/DC has been widely adopted in, e.g., avionics software development guidance to ensure adequate testing of applications with the highest criticality. Idea: if a choice can be made, all the possible factors (conditions) that contribute to that choice (decision) must be tested. For traditional software, both the conditions and the decision are usually Boolean variables or Boolean expressions.

  30. MC/DC Example Example: the decision d ⇔ ((a > 3) ∨ (b = 0)) ∧ (c ≠ 4) (2) contains the three conditions (a > 3), (b = 0) and (c ≠ 4). The following two test cases provide 100% condition coverage (i.e., all possibilities of the conditions are exploited): 1. (a > 3)=True, (b = 0)=True, (c ≠ 4)=True, d = True 2. (a > 3)=False, (b = 0)=False, (c ≠ 4)=False, d = False

  31. MC/DC Example Example: the decision d ⇔ ((a > 3) ∨ (b = 0)) ∧ (c ≠ 4) (3) contains the three conditions (a > 3), (b = 0) and (c ≠ 4). The following six test cases provide 100% MC/DC coverage: 1. (a > 3)=True, (b = 0)=True, (c ≠ 4)=True, d = True 2. (a > 3)=False, (b = 0)=False, (c ≠ 4)=False, d = False 3. (a > 3)=False, (b = 0)=False, (c ≠ 4)=True, d = False 4. (a > 3)=False, (b = 0)=True, (c ≠ 4)=True, d = True 5. (a > 3)=False, (b = 0)=True, (c ≠ 4)=False, d = False 6. (a > 3)=True, (b = 0)=False, (c ≠ 4)=True, d = True
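The MC/DC requirement (each condition independently affecting the decision) can be checked mechanically: for every condition there must be a pair of test cases that differ only in that condition and flip the decision. A sketch over the slide's decision, with the names `decision` and `mcdc_covered` assumed for illustration:

```python
from itertools import combinations

def decision(conds):
    """The slide's decision over the truth values of its three
    conditions (a > 3, b = 0, c != 4)."""
    c1, c2, c3 = conds
    return (c1 or c2) and c3

# The six MC/DC test cases from the slide, as condition truth values.
cases = [
    (True,  True,  True),
    (False, False, False),
    (False, False, True),
    (False, True,  True),
    (False, True,  False),
    (True,  False, True),
]

def mcdc_covered(cases):
    """Check that every condition has an 'independence pair': two test
    cases differing only in that condition with different decisions."""
    for j in range(3):
        ok = any(
            all(x[i] == y[i] for i in range(3) if i != j)
            and x[j] != y[j]
            and decision(x) != decision(y)
            for x, y in combinations(cases, 2)
        )
        if not ok:
            return False
    return True
```

The six cases pass the check, while the two condition-coverage cases of the previous slide do not, which is exactly the gap MC/DC closes.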

  32. MC/DC for DNNs – General Idea The core idea of our criteria is to ensure that not only the presence of a feature is tested, but also the effect of less complex features on a more complex feature. [Figure: the four-layer network from slide 21] For example, check the impact of n_{2,1}, n_{2,2}, n_{2,3} on n_{3,1}.

  33. MC/DC for DNNs – Neuron Pair and Sign Change A neuron pair (n_{k,i}, n_{k+1,j}) consists of two neurons in adjacent layers k and k+1 such that 1 ≤ k ≤ K−1, 1 ≤ i ≤ s_k, and 1 ≤ j ≤ s_{k+1}. (Sign Change of a neuron) Given a neuron n_{k,l} and two test cases x1 and x2, we say that the sign change of n_{k,l} is exploited by x1 and x2, denoted sc(n_{k,l}, x1, x2), if sign(v_{k,l}[x1]) ≠ sign(v_{k,l}[x2]).

  34. MC/DC for DNNs – Value Change and Distance Change (Value Change of a neuron) Given a neuron n_{k,l} and two test cases x1 and x2, we say that the value change of n_{k,l} is exploited with respect to a value function g by x1 and x2, denoted vc(g, n_{k,l}, x1, x2), if g(u_{k,l}[x1], u_{k,l}[x2]) = True.

  35. MC/DC for DNNs – Sign-Sign Cover, or SS Cover A neuron pair α = (n_{k,i}, n_{k+1,j}) is SS-covered by two test cases x1, x2, denoted cov_SS(α, x1, x2), if the following conditions are satisfied by the network instances N[x1] and N[x2]: ◮ sc(n_{k,i}, x1, x2); ◮ ¬sc(n_{k,l}, x1, x2) for all n_{k,l} ∈ P_k \ {n_{k,i}}; ◮ sc(n_{k+1,j}, x1, x2).
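The three SS-cover conditions translate directly into code: the condition neuron flips sign, every other neuron in its layer keeps its sign, and the decision neuron in the next layer flips. In this sketch, `signs` and `ss_covered` are illustrative names, and the two test cases are represented (as an assumption) by per-layer lists of neuron values:

```python
def signs(u_layer):
    """Sign (activated or not) of each neuron value in one layer."""
    return [1 if u > 0 else 0 for u in u_layer]

def ss_covered(i, j, acts1, acts2, k=0):
    """Sign-Sign cover of the pair (n[k][i], n[k+1][j]) by two test cases,
    given their per-layer neuron values acts1 and acts2:
    1) n[k][i] changes sign between the two cases,
    2) every other neuron in layer k keeps its sign,
    3) n[k+1][j] changes sign."""
    s1k, s2k = signs(acts1[k]), signs(acts2[k])
    s1n, s2n = signs(acts1[k + 1]), signs(acts2[k + 1])
    if s1k[i] == s2k[i]:
        return False  # condition neuron did not change sign
    if any(s1k[l] != s2k[l] for l in range(len(s1k)) if l != i):
        return False  # some other neuron in layer k also changed
    return s1n[j] != s2n[j]
```

For example, if only the first neuron of layer k flips and the decision neuron flips with it, the pair is SS-covered; a pair whose condition neuron keeps its sign is not.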

  36. MC/DC for DNNs – Other Covering Methods Value-Sign Cover, or VS Cover Sign-Value Cover, or SV Cover Value-Value Cover, or VV Cover

  37. Relation M_N denotes the neuron coverage metric; arrows represent the "weaker than" relation between metrics

  38. Activation Pattern ◮ Given a concrete input x, N[x] corresponds to a linear model C ◮ C represents the set of inputs following the same activation pattern ◮ One DNN activation pattern corresponds to a program execution path ◮ traversal of all activation patterns ⇒ formal verification ◮ too many patterns: e.g., up to 2^{10,000} ... (Sun, Huang, Kroening. "Testing Deep Neural Networks." 2018)
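An activation pattern is just the on/off state of every ReLU for a given input, so extracting it is a single forward pass. A minimal sketch, with the name `activation_pattern` and the (weight matrix, bias list) layer representation assumed for illustration:

```python
def activation_pattern(x, layers):
    """The activation pattern of input x: for every neuron, whether its
    ReLU is active (1) or not (0). Two inputs with the same pattern lie
    in the same linear region of the network."""
    pattern = []
    v = x
    for (W, b) in layers:
        u = [b[i] + sum(W[h][i] * v[h] for h in range(len(v)))
             for i in range(len(b))]
        pattern.append(tuple(1 if ui > 0 else 0 for ui in u))
        v = [max(ui, 0.0) for ui in u]
    return tuple(pattern)
```

Two inputs on the same side of every ReLU share a pattern (the network behaves linearly between them); with n ReLUs there are up to 2^n distinct patterns, which is why enumerating them all amounts to verification but does not scale.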

  39. Safety Coverage [10] Definition Let each hyper-rectangle rec contain those inputs with the same ReLU pattern, i.e., for all x1, x2 ∈ rec we have sign(n_{k,l}, x1) = sign(n_{k,l}, x2) for all n_{k,l} ∈ H(N). A hyper-rectangle rec is safe-covered by a test case x, denoted cov_S(rec, x), if x ∈ rec. (Wicker, Huang, Kwiatkowska, TACAS 2018)

  40. Relation M S denotes the safety coverage metric

  41. Safety Coverage Problem of safety coverage: ◮ exponential number of hyper-rectangles to be covered Therefore, our MC/DC based criteria strike a balance between intensive testing and computational feasibility (justified by the experimental results).

  42. Relation with a few other criteria from [4] ◮ M MN : multi-section neuron coverage ◮ M NB : neuron boundary coverage ◮ M TN : top-k neuron coverage

  43. What can we do? ◮ bug finding ◮ DNN safety statistics ◮ testing efficiency ◮ DNN internal structure analysis

  44. Test Case Generation ◮ optimisation based (symbolic) approach ◮ concolic testing ◮ Monte Carlo tree search based input mutation testing
