Image Classification - BVM 2018 Tutorial: Advanced Deep Learning Methods

  1. Image Classification. BVM 2018 Tutorial: Advanced Deep Learning Methods. Jakob Wasserthal, Division of Medical Image Computing

  2. Classification of skin cancer. [Images: two example skin lesions shown side by side] Esteva et al., Dermatologist-level classification of skin cancer with deep neural networks, Nature, 2017

  3. Classification of skin cancer: benign vs. malignant. Esteva et al., Dermatologist-level classification of skin cancer with deep neural networks, Nature, 2017

  4. Classification: the network maps an input image to class probabilities, e.g. p(benign) = 0.98, p(malignant) = 0.02.
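     In practice such class probabilities typically come from a softmax over the network's final-layer scores. A minimal sketch in Python (the scores below are made up to reproduce the numbers on the slide, not taken from it):

     import numpy as np

     def softmax(scores):
         # Subtract the maximum for numerical stability, then normalise the exponentials.
         e = np.exp(scores - np.max(scores))
         return e / e.sum()

     logits = np.array([3.9, 0.0])   # hypothetical final-layer scores for (benign, malignant)
     p_benign, p_malignant = softmax(logits)
     print(p_benign, p_malignant)    # roughly 0.98 and 0.02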

  5. ILSVRC challenge / ImageNet. [Chart: top-5 error over the years - second best 26.2%, AlexNet 15.3% (2012), ZFNet 11.2% (2013), VGG 7.3% and GoogLeNet 6.67% (2014), human 5.1%, ResNet 3.57% (2015), Inception v3 3.5%, DenseNet ~3.5% (2017)]

  6. ILSVRC challenge / ImageNet. [Chart from slide 5 repeated, leading into VGG]

  7. VGG: simple structure, 160M parameters. Simonyan et al., Very deep convolutional networks for large-scale image recognition, arXiv, 2014; He et al., Deep Residual Learning for Image Recognition, arXiv, 2015

  8. ILSVRC challenge / ImageNet. [Chart from slide 5 repeated, leading into GoogLeNet]

  9. GoogLeNet: the Inception module. Szegedy et al., Going Deeper with Convolutions, arXiv, 2014

  10. GoogLeNet: every branch of the Inception module uses stride = 1. Szegedy et al., Going Deeper with Convolutions, arXiv, 2014

  11. GoogLeNet: naive Inception module (dimensions given as [Width x Height x Number of filters]). With stride = 1, each of the four branches turns the W x H x 256 input into a W x H x 256 output, so the concatenated output is W x H x (256+256+256+256) = W x H x 1024. Szegedy et al., Going Deeper with Convolutions, arXiv, 2014

  12. GoogLeNet: Inception module with 1x1 convolutions (dimensions given as [Width x Height x Number of filters]). The 1x1 convolutions reduce the W x H x 256 input to W x H x 128 before the 3x3 convolution and to W x H x 32 before the 5x5 convolution, the branches output W x H x 128, W x H x 192, W x H x 96 and W x H x 64, and the concatenated output shrinks to W x H x (128+192+96+64) = W x H x 480 instead of W x H x 1024. Szegedy et al., Going Deeper with Convolutions, arXiv, 2014
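     A minimal PyTorch sketch of such an Inception module, using the channel counts from the slide (branch outputs 128, 192, 96 and 64; 1x1 reductions to 128 and 32 channels); activation functions and batch norm are omitted and the class name is my own:

     import torch
     import torch.nn as nn

     class InceptionModule(nn.Module):
         """Four parallel branches whose outputs are concatenated along the channel axis."""
         def __init__(self, in_ch=256):
             super().__init__()
             self.branch1x1 = nn.Conv2d(in_ch, 128, kernel_size=1)
             self.branch3x3 = nn.Sequential(              # 1x1 reduction to 128, then 3x3
                 nn.Conv2d(in_ch, 128, kernel_size=1),
                 nn.Conv2d(128, 192, kernel_size=3, padding=1),
             )
             self.branch5x5 = nn.Sequential(              # 1x1 reduction to 32, then 5x5
                 nn.Conv2d(in_ch, 32, kernel_size=1),
                 nn.Conv2d(32, 96, kernel_size=5, padding=2),
             )
             self.branch_pool = nn.Sequential(            # 3x3 max pooling, then 1x1
                 nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
                 nn.Conv2d(in_ch, 64, kernel_size=1),
             )

         def forward(self, x):
             # All branches use stride 1, so W and H are preserved and the outputs can be concatenated.
             outs = [self.branch1x1(x), self.branch3x3(x), self.branch5x5(x), self.branch_pool(x)]
             return torch.cat(outs, dim=1)                # W x H x (128+192+96+64) = W x H x 480

     x = torch.randn(1, 256, 28, 28)
     print(InceptionModule()(x).shape)                    # torch.Size([1, 480, 28, 28])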

  13. GoogLeNet vs. VGG: parameters of the first fully connected layer. VGG: the last feature map (14x14x512) is pooled to 7x7x512 = 25,088 values, which feed a fully connected layer with 4096 units, giving 25,088 * 4096, roughly 102M parameters. GoogLeNet: the last feature map (7x7x1024) is reduced by global average pooling to 1x1x1024, which feeds the 1000-class output layer, giving 1000 * 1024, roughly 1M parameters. Szegedy et al., Going Deeper with Convolutions, arXiv, 2014
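     The same arithmetic, spelled out in a few lines of Python (4096 is the width of VGG's fully connected layers):

     # Parameters of the first fully connected layer after the last feature map:
     vgg_inputs = 7 * 7 * 512                 # 25,088 values after VGG's last pooling layer
     vgg_fc = vgg_inputs * 4096               # roughly 102 million weights

     googlenet_inputs = 1 * 1 * 1024          # global average pooling collapses 7x7x1024 to 1x1x1024
     googlenet_fc = googlenet_inputs * 1000   # roughly 1 million weights for the 1000-class output

     print(f"VGG FC layer:       {vgg_fc:,} parameters")        # 102,760,448
     print(f"GoogLeNet FC layer: {googlenet_fc:,} parameters")  # 1,024,000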

  14. GoogLeNet with Inception modules: 4M parameters (VGG: 160M), 22 trained layers. Szegedy et al., Going Deeper with Convolutions, arXiv, 2014

  15. Inception v3 - Improvement 1: factorise each 5x5 convolution into two stacked 3x3 convolutions. Parameters per filter position: 5x5 convolution: 5*5 = 25; two 3x3 convolutions: 2*(3*3) = 18, i.e. roughly 30% fewer parameters and computations. Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015
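     A sketch of this factorisation in PyTorch, comparing the weight counts of a single 5x5 convolution and of two stacked 3x3 convolutions with the same receptive field (the channel count of 64 is an arbitrary example):

     import torch.nn as nn

     channels = 64   # arbitrary example channel count

     conv5x5 = nn.Conv2d(channels, channels, kernel_size=5, padding=2, bias=False)
     factorised = nn.Sequential(    # same 5x5 receptive field, fewer weights
         nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
         nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
     )

     def count(m):
         return sum(p.numel() for p in m.parameters())

     print(count(conv5x5), count(factorised))   # ratio 18/25, i.e. roughly 30% fewer weights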

  16. Inception v3 - Improvement 1. Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015

  17. Inception v3 - Improvement 2: factorise each 3x3 convolution into a 1x3 followed by a 3x1 convolution. Parameters per filter position: 3x3 convolution: 3*3 = 9; 1x3 plus 3x1 convolution: 2*(1*3) = 6, i.e. roughly 33% fewer parameters and computations. Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015
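     The second factorisation in the same style: a 3x3 convolution versus a 1x3 followed by a 3x1 convolution (again with an arbitrary channel count):

     import torch.nn as nn

     channels = 64

     conv3x3 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
     asymmetric = nn.Sequential(    # 1x3 then 3x1: same receptive field, 6 instead of 9 weights
         nn.Conv2d(channels, channels, kernel_size=(1, 3), padding=(0, 1), bias=False),
         nn.Conv2d(channels, channels, kernel_size=(3, 1), padding=(1, 0), bias=False),
     )

     def count(m):
         return sum(p.numel() for p in m.parameters())

     print(count(conv3x3), count(asymmetric))   # ratio 6/9, i.e. roughly 33% fewer weights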

  18. Inception v3 - Improvement 2. Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015

  19. Inception v3 - Improvement 3: grid-size reduction. Pooling before the convolution creates a representational bottleneck; convolving first and pooling afterwards avoids the bottleneck but costs about 3x more computation. Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015

  20. Inception v3 - Improvement 3 (continued): run a stride-2 convolution branch and a pooling branch in parallel and concatenate their outputs: no representational bottleneck, at roughly 1x the computation. Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015
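     A minimal sketch of such a grid-size reduction block, assuming (as in the paper's figures) a stride-2 convolution branch and a stride-2 pooling branch running in parallel and concatenated; the channel counts and input size are illustrative only:

     import torch
     import torch.nn as nn

     class ReductionBlock(nn.Module):
         """Halve W and H without a representational bottleneck: conv and pool branches in parallel."""
         def __init__(self, in_ch=320, conv_ch=320):
             super().__init__()
             self.conv_branch = nn.Conv2d(in_ch, conv_ch, kernel_size=3, stride=2, padding=1)
             self.pool_branch = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

         def forward(self, x):
             # Both branches halve the spatial size; concatenation widens the channels instead.
             return torch.cat([self.conv_branch(x), self.pool_branch(x)], dim=1)

     x = torch.randn(1, 320, 35, 35)
     print(ReductionBlock()(x).shape)   # torch.Size([1, 640, 18, 18])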

  21. Inception v3 - Improvement 3: the optimised Inception module. Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015

  22. Inception v3: 3.5% top-5 error, 42 layers, about 2.5x the number of parameters of GoogLeNet. Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015

  23. ILSVRC challenge / ImageNet. [Chart from slide 5 repeated, leading into the Inception v3 applications]

  24. Classification of skin cancer: an Inception v3 pretrained on ImageNet reaches dermatologist-level accuracy. Esteva et al., Dermatologist-level classification of skin cancer with deep neural networks, Nature, 2017

  25. Classification of diabetic retinopathy: an Inception v3 pretrained on ImageNet reaches expert-level accuracy. Gulshan et al., Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs, JAMA, 2016

  26. ILSVRC challenge / ImageNet. [Chart from slide 5 repeated, leading into ResNet]

  27. ResNet. He et al., Deep Residual Learning for Image Recognition, arXiv, 2015

  28. ResNet. He et al., Deep Residual Learning for Image Recognition, arXiv, 2015

  29. ResNet: 152 layers.

  30. ResNet. He et al., Deep Residual Learning for Image Recognition, arXiv, 2015
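     The central idea behind these slides is the residual (skip) connection: each block only learns a residual F(x) that is added to its input, which lets gradients flow through very deep networks. A minimal sketch of a basic residual block, with batch normalisation and the downsampling variants omitted:

     import torch
     import torch.nn as nn
     import torch.nn.functional as F

     class ResidualBlock(nn.Module):
         """y = F(x) + x, where F is two 3x3 convolutions (the 'residual')."""
         def __init__(self, channels):
             super().__init__()
             self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
             self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

         def forward(self, x):
             residual = self.conv2(F.relu(self.conv1(x)))
             return F.relu(residual + x)   # identity shortcut: gradients flow past the convolutions

     x = torch.randn(1, 64, 56, 56)
     print(ResidualBlock(64)(x).shape)     # torch.Size([1, 64, 56, 56])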

  31. ILSVRC challenge / ImageNet. [Chart from slide 5 repeated, leading into DenseNet]

  32. DenseNet. Huang et al., Densely Connected Convolutional Networks, CVPR, 2017

  33. DenseNet. Huang et al., Densely Connected Convolutional Networks, CVPR, 2017
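     DenseNet's key idea is that each layer inside a block receives the concatenated feature maps of all preceding layers. A minimal sketch of a dense block, with an illustrative growth rate and without the batch norm and 1x1 bottleneck layers of the full architecture:

     import torch
     import torch.nn as nn
     import torch.nn.functional as F

     class DenseBlock(nn.Module):
         """Each layer sees the concatenation of the block input and all earlier layer outputs."""
         def __init__(self, in_ch=64, growth_rate=32, n_layers=4):
             super().__init__()
             self.layers = nn.ModuleList(
                 nn.Conv2d(in_ch + i * growth_rate, growth_rate, kernel_size=3, padding=1)
                 for i in range(n_layers)
             )

         def forward(self, x):
             features = [x]
             for layer in self.layers:
                 out = layer(F.relu(torch.cat(features, dim=1)))   # dense connectivity via concatenation
                 features.append(out)
             return torch.cat(features, dim=1)

     x = torch.randn(1, 64, 28, 28)
     print(DenseBlock()(x).shape)   # torch.Size([1, 192, 28, 28]), i.e. 64 + 4*32 channels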

  34. Challenges in medical image classification: little training data, no RGB images, small lesions, large images, interpretability. Source: The Radiology Assistant: Bi-RADS for Mammography and Ultrasound 2013

  35. Interpretability of predictions: a deep neural network is often considered a "black box". It outputs, for example, p(diabetic) = 0.98 and p(normal) = 0.02, but gives no indication of why.

  36. Interpretability of predictions: "What parts of the input image affect the decision?" Gulshan et al., Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs, JAMA, 2016

  37. Recap: training via backpropagation. For an input x, the network with weights w_ij outputs p_dog(x) = 0.98 and p_cat(x) = 0.02; the cost is c = -log(p_dog(x)), and training backpropagates dc/dw_ij into the weights. Slides by courtesy of Paul Jäger
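     In framework terms this recap amounts to computing c = -log(p_dog(x)) and letting automatic differentiation fill in dc/dw_ij for every weight. A minimal PyTorch sketch with a hypothetical stand-in model (not one of the networks from the slides):

     import torch
     import torch.nn as nn

     model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 2))   # hypothetical 2-class model
     x = torch.randn(1, 3, 224, 224)

     probs = torch.softmax(model(x), dim=1)   # (p_cat(x), p_dog(x)), order chosen arbitrarily here
     c = -torch.log(probs[0, 1])              # c = -log(p_dog(x))
     c.backward()                             # fills dc/dw_ij for every weight w_ij

     print(model[1].weight.grad.shape)        # the gradient has the same shape as the weight matrix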

  38. Saliency maps: "What parts of the input image affect the decision?" Instead of into the weights, backpropagate into the image: compute dp_dog(x)/dx_ij for every input pixel x_ij. Slides by courtesy of Paul Jäger

  39. Saliency maps. [Figure: input image with pixels x_ij and the resulting saliency map dp_dog(x)/dx_ij] Slides by courtesy of Paul Jäger
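     A minimal sketch of such a saliency map, again with a hypothetical stand-in model: mark the input as requiring gradients and backpropagate the class probability into the image instead of into the weights:

     import torch
     import torch.nn as nn

     model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 2))   # stand-in for a trained network
     model.eval()

     x = torch.randn(1, 3, 224, 224, requires_grad=True)   # "backprop into the image"
     p = torch.softmax(model(x), dim=1)[0, 1]               # p_dog(x)
     p.backward()

     saliency = x.grad.abs().max(dim=1)[0]   # |dp_dog(x)/dx_ij|, reduced over the colour channels
     print(saliency.shape)                   # torch.Size([1, 224, 224])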

  40. Interpretability of predictions. Jamaludin et al., SpineNet: Automated classification and evidence visualization in spinal MRIs, Medical Image Analysis, 2017

  41. Questions

  42. Backup

  43. Advanced: saliency via perturbation. Fong et al., Interpretable Explanations of Black Boxes by Meaningful Perturbation, arXiv, 2018. Trick: backpropagate into a mask m that is multiplied with the image, d[w*(x*m)]/dm = w*x, so that the optimised mask marks the "minimal destroying region".
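     A very rough sketch of the idea, not the authors' full method (which adds area and smoothness regularisers and blurs rather than zeroes the masked region): treat the mask m as a learnable parameter, minimise the class probability of the masked image while deleting as little of it as possible, and backpropagate into the mask. The low values of the optimised mask then mark the "minimal destroying region".

     import torch
     import torch.nn as nn

     model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 2))   # stand-in for a trained classifier
     model.eval()

     x = torch.randn(1, 3, 224, 224)                          # the image to explain
     mask = torch.ones(1, 1, 224, 224, requires_grad=True)    # m, multiplied with the image
     optimizer = torch.optim.SGD([mask], lr=0.1)

     for _ in range(50):
         optimizer.zero_grad()
         masked = x * mask.clamp(0, 1)                          # perturbed input x * m
         p_target = torch.softmax(model(masked), dim=1)[0, 1]   # class probability to suppress
         loss = p_target + 0.01 * (1 - mask.clamp(0, 1)).abs().mean()   # delete as little as possible
         loss.backward()                                        # "backprop into the mask"
         optimizer.step()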
