Image Classification
BVM 2018 Tutorial: Advanced Deep Learning Methods
Jakob Wasserthal, Division of Medical Image Computing
Classification of skin cancer
[Example images of skin lesions: benign vs. malignant]
Esteva et al., Dermatologist-level classification of skin cancer with deep neural networks, Nature, 2017
Classification
The classifier maps an input image to class probabilities, e.g. p(benign) = 0.98, p(malignant) = 0.02.
ILSVRC challenge / ImageNet (top-5 error of the winning entries)
2012: AlexNet 15.3% (second best: 26.2%)
2013: ZFNet 11.2%
2014: VGG 7.3%, GoogLeNet 6.67%
(Human: ~5.1%)
2015: ResNet 3.57%, Inception v3 ~3.5%
2017: DenseNet ~3.5%
VGG
- simple structure
- 160M parameters
Simonyan et al., Very deep convolutional networks for large-scale image recognition, arXiv, 2014
He et al., Deep Residual Learning for Image Recognition, arXiv, 2015
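To illustrate the "simple structure", here is a minimal PyTorch sketch of VGG-style feature extraction: repeated stacks of 3x3 convolutions followed by 2x2 max pooling (channel counts follow the common VGG-16 configuration; the fully connected classifier is omitted):

```python
import torch.nn as nn

def vgg_block(in_channels, out_channels, n_convs):
    # a stack of 3x3 convolutions followed by one 2x2 max pooling
    layers = []
    for _ in range(n_convs):
        layers += [nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
        in_channels = out_channels
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

# VGG-16 feature extractor: the filter count doubles after each pooling stage
features = nn.Sequential(
    vgg_block(3, 64, 2),
    vgg_block(64, 128, 2),
    vgg_block(128, 256, 3),
    vgg_block(256, 512, 3),
    vgg_block(512, 512, 3),
)
```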
ILSVRC challenge / ImageNet [error timeline repeated, see above]
GoogLeNet
The key building block is the Inception module.
Szegedy et al., Going Deeper with Convolutions, arXiv, 2014
GoogLeNet
All convolution and pooling branches inside the Inception module use stride = 1, so the spatial dimensions are preserved.
Szegedy et al., Going Deeper with Convolutions, arXiv, 2014
GoogLeNet
Notation: [Width x Height x Number of Filters]
In the naive Inception module each branch outputs WxHx256 (stride = 1); concatenating the four branches along the filter dimension gives WxHx(256+256+256+256) = WxHx1024.
Szegedy et al., Going Deeper with Convolutions, arXiv, 2014
GoogLeNet
With 1x1 convolutions reducing the filter dimension before the expensive 3x3 and 5x5 branches (intermediate outputs WxHx128 and WxHx32), the branch outputs shrink to WxHx128, WxHx192, WxHx96 and WxHx64, i.e. a concatenated output of WxHx(128+192+96+64) = WxHx480 instead of WxHx1024.
Szegedy et al., Going Deeper with Convolutions, arXiv, 2014
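A minimal PyTorch sketch of such an Inception module, using the branch sizes from the example above (activation functions omitted for brevity):

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Sketch of a GoogLeNet-style Inception module with 1x1 reductions."""

    def __init__(self, in_channels=256):
        super().__init__()
        # 1x1 branch
        self.branch1 = nn.Conv2d(in_channels, 128, kernel_size=1)
        # 1x1 reduction followed by 3x3 convolution
        self.branch2 = nn.Sequential(
            nn.Conv2d(in_channels, 128, kernel_size=1),
            nn.Conv2d(128, 192, kernel_size=3, padding=1),
        )
        # 1x1 reduction followed by 5x5 convolution
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=1),
            nn.Conv2d(32, 96, kernel_size=5, padding=2),
        )
        # 3x3 max pooling followed by 1x1 projection
        self.branch4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_channels, 64, kernel_size=1),
        )

    def forward(self, x):
        # stride = 1 and matching padding keep W x H identical in all branches,
        # so the outputs concatenate along the filter dimension:
        # 128 + 192 + 96 + 64 = 480 output channels
        return torch.cat([self.branch1(x), self.branch2(x),
                          self.branch3(x), self.branch4(x)], dim=1)
```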
GoogLeNet
Last layers of VGG vs. GoogLeNet:
VGG: data dimensions 14x14x512 -> 7x7x512 = 25,088 values -> fully connected layer with 4096 units; #parameters: 25,088 * 4096 ≈ 102M
GoogLeNet: data dimensions 7x7x1024 -> global average pooling -> 1x1x1024 -> fully connected layer with 1000 units; #parameters: 1024 * 1000 ≈ 1M
Szegedy et al., Going Deeper with Convolutions, arXiv, 2014
GoogLeNet
- built from stacked Inception modules
- 4M parameters (VGG: 160M)
- 22 trained layers
Szegedy et al., Going Deeper with Convolutions, arXiv, 2014
Inception v3 - Improvement 1
Factorise each 5x5 convolution into two stacked 3x3 convolutions (same receptive field).
Parameters: 5x5 convolution: 5*5 = 25; two 3x3 convolutions: 2*(3*3) = 18
=> ~30% fewer parameters and computations
Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015
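A small sketch comparing the two variants in PyTorch; the channel count of 64 is only illustrative:

```python
import torch.nn as nn

# A 5x5 convolution ...
conv5 = nn.Conv2d(64, 64, kernel_size=5, padding=2)

# ... factorised into two stacked 3x3 convolutions with the same receptive field
conv3x2 = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
)

def n_params(module):
    return sum(p.numel() for p in module.parameters())

print(n_params(conv5), n_params(conv3x2))  # ~102k vs ~74k weights, i.e. ~28% fewer
```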
Inception v3 - Improvement 2
Factorise each 3x3 convolution into a 1x3 followed by a 3x1 convolution.
Parameters: 3x3 convolution: 3*3 = 9; 1x3 plus 3x1 convolution: 2*(1*3) = 6
=> ~33% fewer parameters and computations
Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015
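The analogous sketch for the asymmetric factorisation (again with an illustrative channel count):

```python
import torch.nn as nn

# A 3x3 convolution ...
conv3 = nn.Conv2d(64, 64, kernel_size=3, padding=1)

# ... factorised into a 1x3 followed by a 3x1 convolution
conv_asym = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(1, 3), padding=(0, 1)),
    nn.Conv2d(64, 64, kernel_size=(3, 1), padding=(1, 0)),
)

def n_params(module):
    return sum(p.numel() for p in module.parameters())

print(n_params(conv3), n_params(conv_asym))  # ~37k vs ~25k weights, i.e. ~33% fewer
```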
Inception v3 - Improvement 3
When reducing the grid size, the naive options are problematic: pooling before the Inception module creates a representational bottleneck, while applying the Inception module first and pooling afterwards avoids the bottleneck but needs about 3x more computations.
Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015
Inception v3 - Improvement 3
Optimised Inception module for grid-size reduction: no representational bottleneck at 1x the computations (see the sketch below).
Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015
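A minimal sketch, assuming the optimised grid-size reduction is implemented as a strided convolution branch and a pooling branch run in parallel and concatenated, as described in the Inception v3 paper; the channel counts are illustrative:

```python
import torch
import torch.nn as nn

class GridReduction(nn.Module):
    """Sketch of a parallel grid-size reduction: strided convolution + pooling."""

    def __init__(self, in_channels=288, conv_channels=384):
        super().__init__()
        self.conv_branch = nn.Conv2d(in_channels, conv_channels,
                                     kernel_size=3, stride=2)
        self.pool_branch = nn.MaxPool2d(kernel_size=3, stride=2)

    def forward(self, x):
        # Both branches halve W and H in one step; concatenating them avoids
        # squeezing the representation through a pooled bottleneck first.
        return torch.cat([self.conv_branch(x), self.pool_branch(x)], dim=1)
```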
Inception v3
- ~3.5% top-5 error
- 42 layers
- ~2.5x the number of parameters of GoogLeNet
Szegedy et al., Rethinking the Inception Architecture for Computer Vision, arXiv, 2015
ILSVRC challenge / ImageNet [error timeline repeated, see above]
Classification of skin cancer
- Inception v3 pretrained on ImageNet
- Dermatologist-level accuracy
Esteva et al., Dermatologist-level classification of skin cancer with deep neural networks, Nature, 2017

Classification of diabetic retinopathy
- Inception v3 pretrained on ImageNet
- Expert-level accuracy
Gulshan et al., Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs, JAMA, 2016
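Both applications follow the same transfer-learning recipe. A minimal PyTorch/torchvision sketch of that recipe, not the authors' original pipelines (depending on the torchvision version, `weights=` may be required instead of `pretrained=`):

```python
import torch
import torch.nn as nn
from torchvision import models

# Load Inception v3 with ImageNet weights (expects 299x299 input images).
model = models.inception_v3(pretrained=True, aux_logits=True)

# Replace the 1000-class ImageNet heads with task-specific heads,
# e.g. 2 classes: benign vs. malignant.
model.fc = nn.Linear(model.fc.in_features, 2)
model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, 2)

# Fine-tune all layers with a small learning rate on the medical data set.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
```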
ILSVRC challenge / ImageNet [error timeline repeated, see above]
ResNet
Residual learning: each block only learns a residual F(x) that is added back to its input via a skip (identity) connection, y = F(x) + x. This makes very deep networks trainable.
- 152 layers
He et al., Deep Residual Learning for Image Recognition, arXiv, 2015
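A minimal PyTorch sketch of a basic residual block:

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: the convolutions learn only the residual F(x),
    and the input is added back through the skip connection."""

    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # y = F(x) + x
```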
ILSVRC challenge / ImageNet [error timeline repeated, see above]
DenseNet
Dense connectivity: within a dense block, each layer receives the concatenated feature maps of all preceding layers as input.
Huang et al., Densely Connected Convolutional Networks, CVPR, 2017
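A minimal PyTorch sketch of a dense block (growth rate and layer count are illustrative):

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Dense block: every layer receives the concatenation of all previous
    feature maps and adds growth_rate new feature maps."""

    def __init__(self, in_channels=64, growth_rate=32, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(in_channels + i * growth_rate),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels + i * growth_rate, growth_rate,
                          kernel_size=3, padding=1),
            ))

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # each layer sees all previous outputs, concatenated channel-wise
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)
```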
Challenges in medical image classification
- little training data
- no RGB images
- small lesions
- large images
- interpretability
Source: The Radiology Assistant: BI-RADS for Mammography and Ultrasound, 2013
Interpretability of predictions
The network outputs p(diabetic) = 0.98 and p(normal) = 0.02, but why? A deep neural network is often considered a "black box".
Interpretability of predictions
"What parts of the input image affect the decision?"
Gulshan et al., Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs, JAMA, 2016
Recap: Training via backpropagation
The network predicts p_dog(x) = 0.98 and p_cat(x) = 0.02. With the loss $c = -\log(p_{\mathrm{dog}}(x))$, backpropagation computes the gradient $\frac{\partial c}{\partial w_{ij}}$ for every weight $w_{ij}$.
Slides courtesy of Paul Jäger
Saliency maps
"What parts of the input image affect the decision?"
Instead of backpropagating into the weights, "backprop into the image": compute $\frac{\partial p_{\mathrm{dog}}(x)}{\partial x_{ij}}$ for every pixel $x_{ij}$ of the input.
Slides courtesy of Paul Jäger
Saliency maps
[Figure: input image and the corresponding saliency map $\frac{\partial p_{\mathrm{dog}}(x)}{\partial x_{ij}}$]
Slides courtesy of Paul Jäger
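A minimal PyTorch sketch of this gradient-based saliency map; the model and the input tensor are stand-ins:

```python
import torch
from torchvision import models

model = models.resnet18(pretrained=True).eval()
image = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in input

# Forward pass and class probabilities
probs = model(image).softmax(dim=1)          # p(class | x)
target = probs.argmax(dim=1)                 # explain the predicted class

# "Backprop into the image": gradient of the class probability w.r.t. each pixel
probs[0, target].backward()
saliency = image.grad.abs().max(dim=1)[0]    # one value per pixel (max over RGB)
```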
Interpretability of predictions
Jamaludin et al., SpineNet: Automated classification and evidence visualization in spinal MRIs, Medical Image Analysis, 2017
Questions
Backup
Advanced: Saliency via perturbation
"Interpretable Explanations of Black Boxes by Meaningful Perturbation", Fong and Vedaldi, ICCV, 2017
Trick: instead of the image, backpropagate into a mask $m$ that is multiplied element-wise with the image, and optimise $m$ to be the "minimal destroying region":
$\frac{d\,[w \cdot (x \cdot m)]}{dm} = w \cdot x$
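A heavily simplified sketch of the idea, not the authors' exact method: the perturbation here is a plain multiplicative mask without blurring or total-variation regularisation, and the model, input and hyperparameters are illustrative:

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(pretrained=True).eval()
for p in model.parameters():
    p.requires_grad_(False)

image = torch.rand(1, 3, 224, 224)            # stand-in for a preprocessed input
target_class = 0                               # class whose evidence we want to find

# Optimise a coarse mask instead of the full-resolution image.
mask = torch.full((1, 1, 28, 28), 0.5, requires_grad=True)
optimizer = torch.optim.Adam([mask], lr=0.1)

for _ in range(100):
    m = F.interpolate(mask, size=224, mode="bilinear",
                      align_corners=False).clamp(0, 1)
    perturbed = image * m                      # deleted regions are blacked out
    prob = model(perturbed).softmax(dim=1)[0, target_class]
    # Push the target probability down while keeping the deleted area small.
    loss = prob + 0.05 * (1 - m).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

saliency = 1 - m.detach()[0, 0]                # high values = "minimal destroying region"
```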