  1. Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation. Sahil Singla, joint work with Eric Wallace, Shi Feng, and Soheil Feizi (University of Maryland). Pacific Ballroom #69, 6:30-9:00 PM, June 13th 2019. https://github.com/singlasahil14/CASO

  2. Why Deep Learning Interpretation? A deep neural network classifies a brain scan as y = 0 (low-grade glioma). A saliency map highlights the salient input features behind that decision: we need to explain AI decisions to humans.

  3. Assumptions of Current Methods. Given the loss function, current methods make two assumptions: 1. Linear approximation of the loss (sketched below). 2. Isolated features: perturb each feature i while keeping all other features fixed.
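
A sketch of assumption 1 in formulas, with notation assumed (x the input, Δ a perturbation, ℓ the loss); gradient-based saliency methods implicitly rely on this linearization:

```latex
% Linear (first-order) approximation assumed by current methods:
\ell(x + \Delta) \approx \ell(x) + \nabla_x \ell(x)^\top \Delta
```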

  4. Desiderata of a New Interpretation Framework. Given the loss function: 1. Quadratic approximation of the loss. 2. Group features: find the group of k pixels that jointly maximizes the loss. (A sketch of the resulting objective follows below.)
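
A sketch of an objective meeting both desiderata, with notation assumed (H the Hessian of the loss at x; the λ1 term selects a sparse group of pixels, the λ2 term keeps the perturbation small and the problem well behaved):

```latex
% Quadratic (second-order) approximation with group-feature selection:
\hat{\Delta} = \arg\max_{\Delta}\;
  \nabla_x \ell(x)^\top \Delta
  + \tfrac{1}{2}\, \Delta^\top H\, \Delta
  - \lambda_1 \|\Delta\|_1
  - \lambda_2 \|\Delta\|_2^2
```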

  5. Confronting the Second-Order Term. The resulting maximization can be non-concave, and the Hessian can be very large: roughly 150k x 150k for a 224 x 224 x 3 input. The objective becomes concave for λ2 > L/2, where L is the largest eigenvalue of the Hessian, and Hessian-vector products can be computed efficiently without ever materializing the Hessian (see the sketch below).
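
A minimal PyTorch sketch of the Hessian-vector product via double backpropagation, plus power iteration to estimate the eigenvalue L needed for the concavity condition. The model, criterion, and input names are placeholders, not the repository's exact code:

```python
import torch

def hessian_vector_product(loss, x, v):
    """Compute (d^2 loss / dx^2) @ v without materializing the Hessian."""
    # First backward pass; create_graph=True keeps the graph so the
    # gradient itself can be differentiated again.
    (grad,) = torch.autograd.grad(loss, x, create_graph=True)
    # Differentiating <grad, v> w.r.t. x yields H @ v (double backprop).
    (hv,) = torch.autograd.grad((grad * v).sum(), x, retain_graph=True)
    return hv

def largest_hessian_eigenvalue(loss, x, iters=20):
    """Estimate the top (largest-magnitude) Hessian eigenvalue L by
    power iteration; the concavity condition is lambda_2 > L / 2."""
    v = torch.randn_like(x)
    v = v / v.norm()
    for _ in range(iters):
        hv = hessian_vector_product(loss, x, v)
        v = hv / (hv.norm() + 1e-12)
    # Rayleigh quotient at the converged direction.
    return (v * hessian_vector_product(loss, x, v)).sum().item()

# Hypothetical usage (model, criterion, image, target are placeholders):
#   x = image.clone().requires_grad_(True)
#   loss = criterion(model(x), target)
#   L = largest_hessian_eigenvalue(loss, x)
```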

  6. When Does the Second Order Matter? For a deep ReLU network, the logits are piecewise linear in the input, so the Hessian of the loss is determined entirely by the loss layer (Theorem). If, in addition, the probability of the predicted class is close to one and the number of classes is large, the second-order term vanishes and the second-order interpretation reduces to the first-order one (Theorem). A derivation sketch follows below.
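
A derivation sketch behind both statements, with notation assumed (f the logits, J = ∂f/∂x their input Jacobian, p the softmax probabilities): because f is piecewise linear in x for a ReLU network, its own second derivative vanishes almost everywhere, leaving only the curvature of the softmax cross-entropy layer:

```latex
% Input Hessian of the loss for piecewise-linear logits:
H = J^\top \left( \mathrm{diag}(p) - p\, p^\top \right) J
% As the predicted-class probability p_y \to 1, p tends to a one-hot
% vector and diag(p) - p p^\top \to 0, so the Hessian term vanishes.
```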

  7. Empirical Results on the Impact of the Hessian. [Plots of the Hessian's impact vs. confidence of the predicted class, for RESNET-50 (uses only ReLU) and SE-RESNET-50 (uses Sigmoid).]

  8. Second-Order vs. First-Order (Qualitative). [Qualitative saliency-map comparisons; images omitted.]

  9. Confronting the L1 Term. The L1 term is non-smooth: y = |x| is not differentiable at 0. How to select λ1? Use proximal gradient descent to optimize the objective (see the sketch below), and select the value of λ1 that induces sparsity within the range (0.75, 1).
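
A minimal sketch of one proximal gradient step on the objective above; the proximal operator of the λ1‖Δ‖1 penalty is soft-thresholding. The function names and the `grad_smooth` argument (gradient of the smooth part: the linear, Hessian, and λ2 terms) are placeholders, not the repository's API:

```python
import torch

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return torch.sign(z) * torch.clamp(z.abs() - tau, min=0.0)

def proximal_gradient_step(delta, grad_smooth, lam1, step):
    """One proximal gradient ascent step on the interpretation objective.

    delta:       current perturbation
    grad_smooth: gradient of the smooth part of the objective at delta
    lam1:        L1 regularization weight
    step:        step size
    """
    # Gradient ascent on the smooth part, then the L1 prox handles the
    # non-smooth penalty exactly.
    return soft_threshold(delta + step * grad_smooth, step * lam1)
```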

  10. Impact of Group Features. [Side-by-side comparison of first-order and second-order saliency maps with group features; images omitted.]

  11. Conclusions. A new formulation for interpretation: second-order information and group features, computed efficiently. Pacific Ballroom #69, 6:30-9:00 PM. https://github.com/singlasahil14/CASO
