Deep Tracking & Flow Instructor - Simon Lucey 16-623 - Designing Computer Vision Apps
Today • Deep Features • Deep Tracking • Deep Flow
Primary Visual Cortex
Spatial Sensitivity • Which image has the greatest distortion with respect to the template? Kingdom, Field, Olmos, 2007
Spatial Sensitivity Kingdom, Field, Olmos, 2007
Handling Geometric Distortion • As pointed out in seminal work by Berg and Malik (CVPR’01), the effectiveness of SSD will degrade with significant viewpoint and/or illumination change. • Two options to match local image patches: 1. Simultaneously estimate the distortion and position of the matching patch (“align”, then “match”). 2. “Blur” the template window, performing matching coarse-to-fine (“blur”, then “match”). • Option 2 is attractive: low computational cost! [Figure: a “1D Patch” and a “Distorted 1D Patch”. Source: A. C. Berg]
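A minimal sketch of option 2, under stated assumptions: the patches are hypothetical 1D signals with two positive spikes, the "distorted" patch moves each spike by one sample, and a box blur stands in for whatever blur kernel is actually used. Before blurring, SSD cannot tell the distorted patch from an unrelated one; after blurring, the distorted patch scores closer.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length signals."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def box_blur(signal, radius):
    """Average each sample with its neighbours, treating out-of-range samples as zero."""
    n = len(signal)
    width = 2 * radius + 1
    blurred = []
    for i in range(n):
        window = [signal[j] for j in range(i - radius, i + radius + 1) if 0 <= j < n]
        blurred.append(sum(window) / width)
    return blurred


# Hypothetical sparse, positive 1D patches (illustrative values only).
patch = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
distorted = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]  # each spike displaced by one sample
unrelated = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0]  # spikes in different places

raw_distorted = ssd(patch, distorted)
raw_unrelated = ssd(patch, unrelated)

blurred_distorted = ssd(box_blur(patch, 1), box_blur(distorted, 1))
blurred_unrelated = ssd(box_blur(patch, 1), box_blur(unrelated, 1))

print("raw SSD:    ", raw_distorted, raw_unrelated)
print("blurred SSD:", blurred_distorted, blurred_unrelated)
```

The blur radius trades localisation for tolerance to distortion, which is why the slide pairs blurring with coarse-to-fine matching.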
Sparseness and Positiveness • Blurring only works if the signals being matched are sparse and positive. • Unfortunately, natural images are neither. • A combination of oriented filter banks and rectification can remedy this problem with little loss in performance. [Figure: signal x is convolved (∗) with a filter bank (e.g., oriented gradients, Gabor filters), then passed through “Rectification” (e.g., sigmoid, squared, relu, etc.) to give y]
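A minimal sketch of the filter bank plus rectification idea, assuming a 1D signal and a two-filter "bank" made of a finite-difference gradient and its negation (a 1D stand-in for oriented gradients), with relu as the rectifier. The signal values and the helper names are illustrative, not taken from the slides.

```python
def gradient(x):
    """Finite-difference filter response: x[i + 1] - x[i]."""
    return [b - a for a, b in zip(x, x[1:])]


def rectify(responses):
    """Half-wave rectification (relu): keep positive responses, zero the rest."""
    return [max(0, r) for r in responses]


def filter_bank(x):
    """Return two non-negative channels: rising edges and falling edges."""
    g = gradient(x)
    rising = rectify(g)
    falling = rectify([-r for r in g])
    return rising, falling


# A dense signal with both positive and negative values.
signal = [-1, -1, -1, 3, 3, 3, -2, -2, -2, -2]
rising, falling = filter_bank(signal)

print("rising: ", rising)
print("falling:", falling)
```

Each channel is non-negative and zero almost everywhere, which is the condition the slide asks for before blurring.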
Reminder: Convolution • “signal” x = [8 4 6 2 7]', “filter” h = [1 2]', ∗ is the “convolution operator” • >> conv(x,h,'valid') ans = 20 14 14 11
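The MATLAB call can be mirrored in plain Python. This is a minimal sketch that reads the slide's columns as the signal x = [8, 4, 6, 2, 7] and the filter h = [1, 2] (an interpretation of the figure, though it does reproduce the printed answer). `conv_valid` is an illustrative helper, not a library function.

```python
def conv_valid(x, h):
    """Discrete convolution, keeping only outputs where h fully overlaps x."""
    n, m = len(x), len(h)
    # Convolution flips the filter: h[k] multiplies x[i + m - 1 - k].
    return [
        sum(h[k] * x[i + m - 1 - k] for k in range(m))
        for i in range(n - m + 1)
    ]


x = [8, 4, 6, 2, 7]  # "signal"
h = [1, 2]           # "filter"
result = conv_valid(x, h)
print(result)
```

The output has length len(x) - len(h) + 1, matching the four values MATLAB prints for the 'valid' shape.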
Multi-Channel Convolution • A K-channel “multi-channel signal” x is convolved (∗) with a K-channel “multi-channel filter” h to give a “single-channel response” y: $y = \sum_{k=1}^{K} x^{(k)} \ast h^{(k)}$
Multi-Channel Convolution • A K-channel “multi-channel signal” x is convolved (∗) with a “multi-channel filter” h to give an L-channel “multi-channel response” y: $y^{(l)} = \sum_{k=1}^{K} x^{(k)} \ast h^{(k,l)}$ for $l = 1 : L$
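A minimal sketch of the multi-channel sum, assuming 1D channels, 'valid' convolution, and filters stored as a nested list `h[k][l]`. The data layout, helper names and example values are all illustrative choices; the single-channel response is the special case L = 1.

```python
def conv_valid(x, h):
    """1D convolution, keeping only outputs where h fully overlaps x."""
    n, m = len(x), len(h)
    return [
        sum(h[k] * x[i + m - 1 - k] for k in range(m))
        for i in range(n - m + 1)
    ]


def multi_channel_conv(x, h):
    """x[k] is input channel k; h[k][l] maps input channel k to output channel l.

    Returns y where y[l] is the sum over k of conv_valid(x[k], h[k][l]).
    """
    num_in = len(x)      # K
    num_out = len(h[0])  # L
    y = []
    for l in range(num_out):
        per_channel = [conv_valid(x[k], h[k][l]) for k in range(num_in)]
        # Add the K per-channel responses sample by sample.
        y.append([sum(samples) for samples in zip(*per_channel)])
    return y


# K = 2 input channels, L = 2 output channels, filters of length 2.
x = [[1, 2, 3, 4],
     [0, 1, 0, 1]]
h = [[[1, 0], [1, 1]],    # h[0][0], h[0][1]
     [[0, 1], [1, -1]]]   # h[1][0], h[1][1]

y = multi_channel_conv(x, h)
print(y)
```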
CNNs for Object Detection • image patch 3@(224x224) → conv → L@(NxM): L-channels, each N-pixels by M-pixels • Each output channel l: $D \cdot \eta\left\{ \sum_{k=1}^{K} x^{(k)} \ast h^{(k,l)} \right\}$ • η{} → non-linear function (relu, max pooling)
ReLU - Sparse and Positive • Rectified Linear Unit: $\mathrm{relu}\{x\} = \max(0, x)$ • Connection to LASSO and sparsity?? $\|y - Ax\|_2^2 + \lambda \|x\|_1$
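One way to see a link between relu and the LASSO, offered as a sketch rather than as the answer the slide has in mind. Assume A is the identity, x is constrained to be non-negative, and the data term carries a factor of 1/2 (none of these assumptions is on the slide). Each coefficient then has the closed-form minimiser relu(y - λ). The grid search below is only a sanity check of that claim.

```python
def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def nonneg_lasso_scalar(y, lam):
    """Minimiser of 0.5 * (y - x)**2 + lam * x over x >= 0 (assumed setup: A = identity)."""
    return relu(y - lam)


def brute_force_minimiser(y, lam, upper=5.0, steps=5000):
    """Grid-search the same objective on [0, upper] as a numerical check."""
    best_x, best_val = 0.0, 0.5 * y * y
    for i in range(1, steps + 1):
        x = upper * i / steps
        val = 0.5 * (y - x) ** 2 + lam * x
        if val < best_val:
            best_x, best_val = x, val
    return best_x


lam = 0.5
for y in (-1.0, 0.3, 2.0):
    print(y, nonneg_lasso_scalar(y, lam), brute_force_minimiser(y, lam))
```

Inputs below the threshold λ map exactly to zero, which is where the sparsity comes from.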
Max Pooling - Down Sampling • max( ) followed by sub-sampling [Figure: Input image → Convolutional layer → Max pool layer] LeCun 1980
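A minimal sketch of max pooling as "max, then sub-sample", assuming a 2D feature map stored as a list of rows and non-overlapping 2x2 windows. The window size, stride and example values are illustrative choices, not taken from the slide.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Take the max over each size-by-size window, stepping by stride."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [
                feature_map[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool_2d(feature_map)
print(pooled)
```

A 4x4 map becomes 2x2: each output keeps the strongest response in its window and discards where in the window it occurred.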
Hierarchical Learning [Figure: Ventral Visual Stream (V1, V2/V4, IT); simple cells, complex cells, view-tuned cells. Credit: Bob Crimi]
Current State of the Art • image patch 3@(227x227) → conv1 96@(55x55) → conv2 256@(27x27) → conv3 384@(13x13) → conv4 384@(13x13) → conv5 256@(13x13) → fc-6 (4096) → fc-7 (K) → “bird”, “car”, ..., “cat” • The output is a K × 1 vector, e.g. [0 1 ... 0]' • A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.
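The spatial sizes on the slide can be reproduced with the usual output-size formula. This is a sketch under assumptions the slide does not state: the kernel sizes and strides are those reported by Krizhevsky et al., the padding values are the ones commonly used in reference implementations, and 3x3 max pooling with stride 2 follows conv1 and conv2. `output_size` is an illustrative helper.

```python
def output_size(input_size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window (floor division)."""
    return (input_size + 2 * padding - kernel) // stride + 1


# Assumed hyperparameters (not on the slide).
conv1 = output_size(227, kernel=11, stride=4)        # first convolution
pool1 = output_size(conv1, kernel=3, stride=2)       # max pool after conv1
conv2 = output_size(pool1, kernel=5, padding=2)      # second convolution
pool2 = output_size(conv2, kernel=3, stride=2)       # max pool after conv2
conv3 = output_size(pool2, kernel=3, padding=1)      # conv3, conv4, conv5 keep this size

print(conv1, conv2, conv3)

# The same first layer applied to a 224-pixel input.
conv1_224 = output_size(224, kernel=11, stride=4)
print(conv1_224)
```

Under these assumptions a 227-pixel input gives the 55, 27 and 13 pixel maps listed for conv1, conv2 and conv3 to conv5, and a 224-pixel input gives 54.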
Current State of the Art - Pose Selection • image patch 3@(224x224) → conv1 64@(54x54) → conv2 256@(27x27) → conv3 384@(13x13) → conv4 384@(13x13) → conv5 256@(13x13) → fc-6 (4096) → fc-7 (4096) → fc-8 → “bird”, “car”, ..., “cat” • A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012. • K. Chatfield, K. Simonyan, A. Vedaldi and A. Zisserman. “Return of the Devil in the Details: Delving Deep into Convolutional Networks.” In BMVC, 2014.
Impact on Object Recognition • BC (before ConvNets) vs. AD (after deep learning) [Figure: ImageNet Challenge results by year; 6.8% annotated]
Visualizing CNNs
CNNs as Feature Extraction • image patch 3@(224x224) → conv1 96@(54x54) → conv2 256@(27x27) → conv3 384@(13x13) → conv4 384@(13x13) → conv5 256@(13x13) → fc-6 (4096) → fc-7 (4096) → fc-8 (?) • Legend: parameters to learn; pre-learned parameters (VGG) • A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012. • K. Chatfield, K. Simonyan, A. Vedaldi and A. Zisserman. “Return of the Devil in the Details: Delving Deep into Convolutional Networks.” In BMVC, 2014.
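A toy sketch of the split between pre-learned parameters and parameters to learn. Here `frozen_features` is a hypothetical stand-in for the pre-learned layers (fixed gradient filters, relu, global max pool), not the VGG network, and only the final linear layer is trained, using simple perceptron updates. All names and data are illustrative.

```python
def frozen_features(signal):
    """Fixed (never updated) feature extractor: gradient filters, relu, global max pool."""
    grad = [b - a for a, b in zip(signal, signal[1:])]
    rising = max(max(0.0, g) for g in grad)
    falling = max(max(0.0, -g) for g in grad)
    return [rising, falling]


def train_last_layer(features, labels, epochs=20, lr=0.1):
    """Learn only the final linear layer with perceptron updates; labels are +1 or -1."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, t in zip(features, labels):
            score = sum(wi * fi for wi, fi in zip(w, f)) + b
            if t * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + lr * t * fi for wi, fi in zip(w, f)]
                b += lr * t
    return w, b


def predict(w, b, signal):
    """Classify a raw signal: frozen features first, learned linear layer second."""
    f = frozen_features(signal)
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else -1


# Toy task: +1 for signals that step up, -1 for signals that step down.
train_signals = [[0, 0, 1, 1], [0, 2, 2, 2], [1, 1, 0, 0], [3, 3, 3, 0]]
train_labels = [1, 1, -1, -1]

w, b = train_last_layer([frozen_features(s) for s in train_signals], train_labels)
print(predict(w, b, [0, 0, 0, 5]), predict(w, b, [4, 0, 0, 0]))
```

Because the feature extractor is fixed, training touches only two weights and a bias, which is the appeal of reusing a pre-learned network on a new task.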
Today • Deep Features • Deep Tracking • Deep Flow