Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation
Sahil Singla
Joint work with Eric Wallace, Shi Feng, Soheil Feizi
University of Maryland
Pacific Ballroom #69, 6:30-9:00 PM, June 13th 2019
https://github.com/singlasahil14/CASO
Why Deep Learning Interpretation?
A deep neural network classifies the scan as y = 0 (low-grade glioma).
A saliency map highlights the salient features.
We need to explain AI decisions to humans.
Assumptions of Current Methods
Loss function
1. Linear approximation of the loss
2. Isolated features: perturb feature i while keeping all other features fixed
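For reference, under these assumptions the saliency map is simply the input gradient (the linear term of the loss). A minimal PyTorch sketch, where `model` and `loss_fn` are generic placeholders assumed for illustration:

```python
import torch

def first_order_saliency(model, loss_fn, x, y):
    """Vanilla gradient saliency: score each input feature by the
    gradient of the loss at the unperturbed input (linear approximation)."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return grad.abs()  # per-feature magnitude of the linear term
```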
Desiderata of a New Interpretation Framework
Loss function
1. Quadratic approximation of the loss
2. Group features: find the group of k pixels that maximizes the loss
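Putting the two desiderata together suggests a regularized second-order objective over an input perturbation. The slide's equations are not in this transcript, so the notation below (perturbation Δ, loss ℓ, coefficients λ₁, λ₂) is assumed for illustration:

```latex
\Delta^{*} \;=\; \arg\max_{\Delta}\;
  \nabla_{x}\ell(x)^{\top}\Delta
  \;+\; \tfrac{1}{2}\,\Delta^{\top}\,\nabla_{x}^{2}\ell(x)\,\Delta
  \;-\; \lambda_{1}\lVert\Delta\rVert_{1}
  \;-\; \lambda_{2}\lVert\Delta\rVert_{2}^{2}
```

The quadratic term supplies the second-order information, while the λ₁ penalty pushes the perturbation onto a small group of pixels (group features); the λ₂ term controls concavity, which the next slides address.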
Confronting the Second-Order Term
● The optimization can be a non-concave maximization.
● The Hessian can be VERY LARGE: ~150k x 150k for a 224 x 224 x 3 input.
● The objective becomes concave when the ℓ₂ regularization coefficient exceeds L/2, where L is the largest eigenvalue of the Hessian.
● Hessian-vector products can be computed efficiently without forming the Hessian.
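The last point is the key to tractability: the quadratic term only ever needs Hessian-vector products, never the ~150k x 150k matrix itself. A minimal PyTorch sketch of the standard double-backward trick (a sketch, not necessarily how the CASO repository implements it):

```python
import torch

def hessian_vector_product(loss, x, v):
    """Compute (d2 loss / dx2) @ v with two backward passes,
    without ever materializing the Hessian."""
    grad = torch.autograd.grad(loss, x, create_graph=True)[0]
    # Differentiating the scalar <grad, v> w.r.t. x yields H @ v.
    return torch.autograd.grad((grad * v).sum(), x)[0]
```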
When Does Second-Order Matter?
For a deep ReLU network:
● Theorem: (formal statement on the slide)
● Theorem: If the probability of the predicted class is close to one and the number of classes is large, the Hessian term has little impact, so the second-order interpretation stays close to the first-order one.
Empirical results on the impact of the Hessian
[Plots vs. confidence of the predicted class: ResNet-50 (uses only ReLU) and SE-ResNet-50 (uses sigmoid)]
Second-Order vs. First-Order (qualitative)
Confronting the L1 Term
● The L1 term is non-smooth (y = |x| is not differentiable at 0). Use proximal gradient descent to optimize the objective.
● How to select the L1 coefficient? Select the value that induces sparsity within the range (0.75, 1).
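A minimal sketch of one proximal gradient (ascent) step, where `grad_smooth` is the gradient of the smooth first- plus second-order part of the objective and `lam1` is the L1 coefficient; the names are assumed for illustration, not taken from the repository:

```python
import torch

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: shrink every entry toward zero by t."""
    return torch.sign(z) * torch.clamp(z.abs() - t, min=0.0)

def proximal_step(delta, grad_smooth, lr, lam1):
    """One proximal gradient ascent step on (smooth objective) - lam1 * ||delta||_1."""
    return soft_threshold(delta + lr * grad_smooth, lr * lam1)
```

The shrinkage threshold lr * lam1 is what produces exact zeros, so a target sparsity level in (0.75, 1) can be reached by tuning lam1.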
Impact of Group Features (qualitative): First-Order vs. Second-Order
Conclusions
● A new formulation for interpretation:
  ➢ Second-order information
  ➢ Group features
● Efficient computation
Pacific Ballroom #69, 6:30-9:00 PM
https://github.com/singlasahil14/CASO