Making Convolutional Networks Shift-Invariant Again Richard Zhang Adobe Research
Deep Networks are not Shift-Invariant
[Figure: example classifications, each annotated with P(correct class)]
Azulay and Weiss. Why do deep convolutional networks generalize so poorly to small image transformations? In ArXiv, 2018.
Engstrom, Tsipras, Schmidt, Madry. Exploring the Landscape of Spatial Robustness. In ICML, 2019.
Why is shift-invariance lost?
• "Convolutions are shift-equivariant"
• "Pooling builds up shift-invariance"
• ...but striding ignores the Nyquist sampling theorem and aliases
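For reference, the two properties quoted on this slide can be stated compactly. The notation is an assumption made for this note and does not appear on the slides: F is a layer or network, X its input, and Shift a translation by (Δh, Δw).

```latex
% Shift-equivariance: shifting the input shifts the output by the same amount
\mathrm{Shift}_{\Delta h,\Delta w}\bigl(F(X)\bigr)
  = F\bigl(\mathrm{Shift}_{\Delta h,\Delta w}(X)\bigr)
  \quad \forall\,(\Delta h,\Delta w)

% Shift-invariance: shifting the input leaves the output unchanged
F(X) = F\bigl(\mathrm{Shift}_{\Delta h,\Delta w}(X)\bigr)
  \quad \forall\,(\Delta h,\Delta w)
```

A classifier needs the second property at its output; the first is what each intermediate layer would have to preserve for the second to be reachable by pooling at the end.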
Re-examining Max-Pooling
[Animation: max applied to the signal window by window]
• Max-pooling breaks shift-equivariance
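A small 1-D example makes the claim concrete. This is a minimal sketch, not code from the talk: the toy signal, the circular boundary handling, and the helper name `max_pool_1d` are all assumptions chosen for illustration.

```python
import numpy as np

def max_pool_1d(x, k=2, stride=2):
    """Max over windows of size k (circular boundary), kept every `stride` samples."""
    n = len(x)
    # Dense evaluation: one max per input position.
    dense = np.array([max(x[(i + j) % n] for j in range(k)) for i in range(n)])
    # Subsampling: keep only every `stride`-th result.
    return dense[::stride]

x = np.array([0, 0, 1, 1, 0, 0, 1, 1])
x_shift = np.roll(x, 1)  # the same signal, circularly shifted by one sample

y = max_pool_1d(x)
y_shift = max_pool_1d(x_shift)

print(y)        # [0 1 0 1]
print(y_shift)  # [1 1 1 1]
```

A one-sample shift of the input turns an alternating output into a constant one, and no shift of the first output reproduces the second. That mismatch is the failure of shift-equivariance the slide refers to.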
Shift-equivariance in VGG
• CIFAR
• VGG network
• 5 max-pools
• Test shift-equivariance condition
[Diagram: pixels (32x32) → conv1 → pool1 → conv2 → pool2 → conv3 → pool3 → conv4 → pool4 → conv5 → pool5 (1x1) → softmax classifier]
Shift-equivariance, per layer
[Figure: deviation from shift-equivariance at each layer, from pixels through conv1, pool1, ..., conv5, pool5 to the softmax classifier; scale runs from "Perfect shift-eq." to "Large deviation from shift-eq."]
• Convolution is shift-equivariant
• Pooling breaks shift-equivariance
• Nyquist theorem ignored when pooling; aliasing breaks shift-equivariance
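The per-layer test can be imitated in 1-D. The sketch below is an illustration under stated assumptions, not the measurement used in the talk: the `deviation` metric (mean absolute difference), the toy signal, the kernel `w`, and circular boundaries are all hypothetical choices.

```python
import numpy as np

def circ_conv(x, w):
    """Circular 1-D correlation with kernel w, stride 1."""
    return sum(w[j] * np.roll(x, -j) for j in range(len(w)))

def max_pool(x, k=2, stride=2):
    """Max over windows of size k (circular), kept every `stride` samples."""
    dense = np.max([np.roll(x, -j) for j in range(k)], axis=0)
    return dense[::stride]

def deviation(layer, x, in_shift, out_shift):
    """Mean absolute gap between layer(Shift(x)) and Shift(layer(x))."""
    shifted_then_layer = layer(np.roll(x, in_shift))
    layer_then_shifted = np.roll(layer(x), out_shift)
    return float(np.mean(np.abs(shifted_then_layer - layer_then_shifted)))

x = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3], dtype=float)
w = np.array([0.5, -1.0, 0.25])  # arbitrary illustrative kernel

def conv(s):
    return circ_conv(s, w)

# Stride-1 convolution: a one-sample input shift is a one-sample output shift.
conv_dev = deviation(conv, x, in_shift=1, out_shift=1)

# Stride-2 pooling: an input shift of 2 maps cleanly to an output shift of 1 ...
pool_dev_even = deviation(max_pool, x, in_shift=2, out_shift=1)

# ... but for an input shift of 1, neither candidate output shift matches.
pool_dev_odd = min(deviation(max_pool, x, 1, 0), deviation(max_pool, x, 1, 1))

print(conv_dev, pool_dev_even, pool_dev_odd)
```

In this toy setting the convolution and the even-shift pooling case show zero deviation, while the odd-shift pooling case does not: equivariance survives only for shifts that are multiples of the stride.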
Alternative downsampling methods
• Blur+subsample
  • Antialiasing in signal processing; image processing; graphics
• Max-pooling
  • Performs better in deep learning applications [Scherer 2010]
Reconcile antialiasing with max-pooling
Baseline (MaxPool)
• (1) Max (dense evaluation): no aliasing
• (2) Subsampling: heavy aliasing
Anti-aliased (MaxBlurPool)
• (1) Max (dense evaluation): no aliasing
• (2) Anti-aliasing filter (Blur): no aliasing
• (3) Subsampling: reduced aliasing
• Steps (2) and (3) are evaluated together as "BlurPool"
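The decomposition can be sketched in 1-D as follows. This is a minimal illustration, not the released implementation: the blur kernel [1, 2, 1]/4, the circular boundary, the toy signal, and the function names are assumptions.

```python
import numpy as np

def dense_max(x, k=2):
    """Step (1): max over windows of size k at every position (circular)."""
    return np.max([np.roll(x, -j) for j in range(k)], axis=0)

def blur(x, kernel=(1.0, 2.0, 1.0)):
    """Anti-aliasing filter: normalized, centered low-pass kernel (circular)."""
    w = np.asarray(kernel, dtype=float)
    w = w / w.sum()
    r = len(w) // 2
    # np.roll(x, r - j)[i] == x[i - r + j], so the kernel is centered on i.
    return sum(w[j] * np.roll(x, r - j) for j in range(len(w)))

def max_pool(x, stride=2):
    """Baseline: dense max, then subsample directly."""
    return dense_max(x)[::stride]

def max_blur_pool(x, stride=2):
    """Anti-aliased: dense max, then blur, then subsample."""
    return blur(dense_max(x))[::stride]

x = np.array([0, 0, 1, 1, 0, 0, 1, 1])
x_shift = np.roll(x, 1)

print(max_pool(x), max_pool(x_shift))            # [0 1 0 1] [1 1 1 1]
print(max_blur_pool(x), max_blur_pool(x_shift))  # [0.5 1.  0.5 1. ] [0.75 0.75 0.75 0.75]
```

On this signal the two baseline outputs differ by as much as 1, while the two anti-aliased outputs differ by at most 0.25. The outputs still are not identical under the shift, which matches the slide's wording of "reduced" rather than eliminated aliasing.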
Antialiasing any downsampling layer
• Max Pool
  • VGG, AlexNet
• Strided Convolution
  • ResNet, MobileNetV2
• Average Pool
  • DenseNet
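One way to read this slide is that the same pattern (evaluate densely, blur, then subsample) applies to the other two layer types. The 1-D sketch below follows that reading; the exact mapping, the kernel [1, 2, 1]/4, and every function name are assumptions, since the slide only lists the layer types.

```python
import numpy as np

def circ_conv(x, w):
    """Circular 1-D correlation with kernel w, stride 1."""
    return sum(w[j] * np.roll(x, -j) for j in range(len(w)))

def relu(x):
    return np.maximum(x, 0.0)

def blur_pool(x, stride=2, kernel=(1.0, 2.0, 1.0)):
    """Blur with a normalized, centered kernel (circular), then subsample."""
    w = np.asarray(kernel, dtype=float)
    w = w / w.sum()
    r = len(w) // 2
    blurred = sum(w[j] * np.roll(x, r - j) for j in range(len(w)))
    return blurred[::stride]

def strided_conv(x, w, stride=2):
    """Baseline: convolution kept only every `stride` samples, then ReLU."""
    return relu(circ_conv(x, w)[::stride])

def antialiased_strided_conv(x, w, stride=2):
    """Assumed replacement: stride-1 convolution, ReLU, then blur and subsample."""
    return blur_pool(relu(circ_conv(x, w)), stride)

def avg_pool(x, k=2, stride=2):
    """Baseline: box-filter average over windows of size k, then subsample."""
    dense = sum(np.roll(x, -j) for j in range(k)) / k
    return dense[::stride]

def antialiased_avg_pool(x, stride=2):
    """Assumed replacement: the blur-and-subsample step on its own."""
    return blur_pool(x, stride)

x = np.array([3, 1, 4, 1, 5, 9, 2, 6], dtype=float)
w = np.array([1.0, -1.0])  # arbitrary illustrative kernel

print(strided_conv(x, w).shape, antialiased_strided_conv(x, w).shape)
print(avg_pool(x).shape, antialiased_avg_pool(x).shape)
```

Each anti-aliased variant returns the same output length as its baseline, so in this sketch it can be swapped in without changing the shapes seen by later layers.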
ImageNet
[Plot: axes Shift-invariance and Accuracy; series Baseline and Antialiased]
• Antialiasing also improves accuracy
Discussion
• Striding aliases (stride=2)
• Add antialiasing filter
  + Improved shift-equivariance
  + Improved accuracy
• Additionally
  + Improved stability to other perturbations
  + Improved robustness
Antialiasing code, pretrained models: https://richzhang.github.io/antialiased-cnns/