Deep Tracking & Flow
Instructor: Simon Lucey, 16-623 - Designing Computer Vision Apps


  1. Deep Tracking & Flow • Instructor: Simon Lucey • 16-623 - Designing Computer Vision Apps

  2. Today • Deep Features • Deep Tracking • Deep Flow

  3. Primary Visual Cortex

  4. Spatial Sensitivity • Which image has the greatest distortion with respect to the template? Kingdom, Field, Olmos, 2007

  5. Spatial Sensitivity Kingdom, Field, Olmos, 2007
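
The distortion question on these slides is usually made concrete with a sum-of-squared-differences (SSD) score. A minimal MATLAB sketch, where the template and candidate image file names are assumed placeholders:

    % SSD between a template and two equally sized candidate patches.
    % File names are assumed for illustration only.
    T  = im2double(imread('template.png'));
    C1 = im2double(imread('candidate1.png'));
    C2 = im2double(imread('candidate2.png'));

    ssd1 = sum((C1(:) - T(:)).^2);
    ssd2 = sum((C2(:) - T(:)).^2);
    % The candidate with the larger SSD is the more "distorted" one
    % with respect to the template.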

  6. Handling Geometric Distortion • As pointed out in seminal work by Berg and Malik (CVPR’01), the effectiveness of SSD will degrade with significant viewpoint change. • Two options to match local image patches:- [figure: a "1D Patch" and a "Distorted 1D Patch" (source: A. C. Berg)]

  7. Handling Geometric Distortion • As pointed out in seminal work by Berg and Malik (CVPR’01), the effectiveness of SSD will degrade with significant viewpoint change. • Two options to match local image patches:- [figure: an SSD "match" between the "1D Patch" and the "Distorted 1D Patch" (source: A. C. Berg)]

  8. Handling Geometric Distortion • As pointed out in seminal work by Berg and Malik (CVPR’01), the effectiveness of SSD will degrade with significant viewpoint change. • Two options to match local image patches:- 1. simultaneously estimate the distortion and position of matching patch [figure: the "1D Patch" and the "Distorted 1D Patch"]

  9. Handling Geometric Distortion • As pointed out in seminal work by Berg and Malik (CVPR’01), the effectiveness of SSD will degrade with significant viewpoint change. • Two options to match local image patches:- 1. simultaneously estimate the distortion and position of matching patch [figure: "align" the "1D Patch" to the "Distorted 1D Patch"]

  10. Handling Geometric Distortion • As pointed out in seminal work by Berg and Malik (CVPR’01), the effectiveness of SSD will degrade with significant viewpoint change. • Two options to match local image patches:- 1. simultaneously estimate the distortion and position of matching patch [figure: "align" the "1D Patch" to the "Distorted 1D Patch"]

  11. Handling Geometric Distortion • As pointed out in seminal work by Berg and Malik (CVPR’01), the effectiveness of SSD will degrade with significant viewpoint change. • Two options to match local image patches:- 1. simultaneously estimate the distortion and position of matching patch [figure: "match" and "align" between the "1D Patch" and the "Distorted 1D Patch"]

  12. Handling Geometric Distortion • As pointed out in seminal work by Berg and Malik (CVPR’01), the effectiveness of SSD will degrade with significant viewpoint and/or illumination change. • Two options to match patches:- 1. simultaneously estimate the distortion and position of the matching patch. 2. "blur" the template window and perform matching coarse-to-fine. [figure: the "1D Patch" and the "Distorted 1D Patch"]

  13. Handling Geometric Distortion • As pointed out in seminal work by Berg and Malik (CVPR’01), the effectiveness of SSD will degrade with significant viewpoint and/or illumination change. • Two options to match patches:- 1. simultaneously estimate the distortion and position of the matching patch. 2. "blur" the template window and perform matching coarse-to-fine. [figure: "blur" applied to the "1D Patch" and the "Distorted 1D Patch"]

  14. Handling Geometric Distortion • As pointed out in seminal work by Berg and Malik (CVPR’01), the effectiveness of SSD will degrade with significant viewpoint and/or illumination change. • Two options to match patches:- 1. simultaneously estimate the distortion and position of the matching patch. 2. "blur" the template window and perform matching coarse-to-fine. [figure: "blur" then "match" the "1D Patch" and the "Distorted 1D Patch"]

  15. Handling Geometric Distortion • As pointed out in seminal work by Berg and Malik (CVPR’01), the effectiveness of SSD will degrade with significant viewpoint and/or illumination change. • Two options to match patches:- 1. simultaneously estimate the distortion and position of the matching patch. 2. "blur" the template window and perform matching coarse-to-fine. [figure: "blur" then "match" the "1D Patch" and the "Distorted 1D Patch"] Option 2 is attractive: low computational cost!
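
A minimal MATLAB sketch of option 2, blur then match with SSD, shown at a single coarse level; a full coarse-to-fine implementation would refine this estimate over progressively finer pyramid levels. The file names, blur sigma, and RGB-input assumption are for illustration only.

    % Blur the image and the template, then exhaustively match with SSD.
    I = imgaussfilt(im2double(rgb2gray(imread('frame.png'))),    2);
    T = imgaussfilt(im2double(rgb2gray(imread('template.png'))), 2);

    % SSD(u,v) = sum(I_patch.^2) - 2*sum(I_patch.*T) + sum(T.^2);
    % the last term is constant over locations and can be dropped.
    ssd = conv2(I.^2, ones(size(T)), 'valid') - 2*conv2(I, rot90(T,2), 'valid');
    [~, idx]   = min(ssd(:));
    [row, col] = ind2sub(size(ssd), idx);   % top-left corner of the best match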

  16. Sparseness and Positiveness • Blurring only works if the signals being matched are sparse and positive. • Unfortunately natural images are neither. • Combination of oriented filter banks and rectification can remedy this problem with little loss in performance. [figure: the signal x is passed through a filter bank (e.g., oriented gradients, Gabor filters) and then a "Rectification" non-linearity (e.g., sigmoid, squared, relu, etc.) to produce y]

  17. Sparseness and Positiveness • Blurring only works if the signals being matched are sparse and positive. • Unfortunately natural images are neither. • Combination of oriented filter banks and rectification can remedy this problem with little loss in performance.
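
A minimal MATLAB sketch of the pipeline in the figure: filter with a small oriented filter bank, rectify so the channels are sparse and positive, and only then blur. The simple gradient filters and the sigma are assumptions; Gabor or other oriented filters could be substituted.

    % Oriented filtering + rectification -> sparse, positive channels.
    I  = im2double(rgb2gray(imread('frame.png')));   % assumed RGB input image
    gx = conv2(I, [-1 0 1],  'same');                % horizontal gradient
    gy = conv2(I, [-1 0 1]', 'same');                % vertical gradient

    % Half-wave rectification: each signed response gives two positive channels.
    chan = cat(3, max(gx,0), max(-gx,0), max(gy,0), max(-gy,0));

    % These channels can now be blurred safely before coarse-to-fine matching.
    chanBlur = imgaussfilt(chan, 2);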

  18. Reminder: Convolution [figure: a 1-D "signal" x convolved with a 1-D "filter" h via the "convolution operator" ∗; both are shown as short columns of integers]

  19. Reminder: Convolution • In MATLAB: >> conv(x,h,’valid’) returns only the samples where the filter fully overlaps the signal. [figure: the example filter, signal, and the resulting ans column]
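
For concreteness, a small worked example of the ’valid’ option; the signal and filter values here are assumptions for illustration, not necessarily the numbers in the slide's figure.

    % 1-D convolution; 'valid' keeps only the full-overlap outputs, so
    % length(y) = length(x) - length(h) + 1.
    x = [8 4 1 6 2 2 7];        % signal (assumed values)
    h = [1 2 1];                % filter (assumed values)
    y = conv(x, h, 'valid')     % y = [17 12 15 12 13], i.e. 7 - 3 + 1 = 5 samples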

  20. Multi-Channel Convolution [figure: a "multi-channel filter" h convolved (∗) with a "multi-channel signal" x gives a "single-channel response" y]

  21. Multi-Channel Convolution [figure: the same diagram, with the filter h and the signal x each having K-channels]

  22. Multi-Channel Convolution • y = Σ_{k=1}^{K} x^{(k)} ∗ h^{(k)}

  23. Multi-Channel Convolution [figure: a "multi-channel filter" h convolved (∗) with a "multi-channel signal" x gives a "multi-channel response" y]

  24. Multi-Channel Convolution [figure: the same diagram, with the response y having L-channels]

  25. Multi-Channel Convolution • y^{(l)} = Σ_{k=1}^{K} x^{(k)} ∗ h^{(k,l)}, for l = 1 : L
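
A minimal MATLAB sketch of the formula above, looping over the K input channels and the L output channels; the array sizes are assumptions for illustration.

    % Multi-channel convolution: K-channel signal -> L-channel response.
    K = 3;  L = 4;
    x = randn(64, 64, K);           % multi-channel signal (assumed size)
    h = randn(5, 5, K, L);          % multi-channel filter bank (assumed size)

    y = zeros(64-5+1, 64-5+1, L);   % 'valid' response
    for l = 1:L
        for k = 1:K
            % y^(l) = sum_k x^(k) * h^(k,l)
            y(:,:,l) = y(:,:,l) + conv2(x(:,:,k), h(:,:,k,l), 'valid');
        end
    end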

  26. CNNs for Object Detection

  27. CNNs for Object Detection [figure: an input image patch, 3@(224x224)]

  28. CNNs for Object Detection [figure: the 3@(224x224) image patch passes through a conv layer to give an L-channel response of N-pixels x M-pixels, i.e. L@(NxM)]

  29. CNNs for Object Detection [figure: conv layer, 3@(224x224) → L@(NxM)] • Each output channel is η{ Σ_{k=1}^{K} x^{(k)} ∗ h^{(k,l)} } • η{·} → non-linear function (relu, max pooling)

  30. ReLU - Sparse and Positive • Rectified Linear Unit: relu{x} = max(0, x) • Connection to LASSO and sparsity?? min_x ||y − Ax||_2^2 + λ||x||_1
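
A one-line MATLAB illustration of why ReLU yields the sparse, positive signals needed for the blur-and-match argument earlier; the response values are assumed.

    % ReLU zeroes negative responses, leaving a non-negative, typically sparse output.
    relu = @(x) max(0, x);
    r = [-3.2 0.5 -0.1 2.7 -1.4 0 4.1];   % assumed filter responses
    relu(r)                               % ans = [0 0.5 0 2.7 0 0 4.1]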

  31. Max Pooling - Down Sampling [figure: input image → convolutional layer → max pool layer; pooling takes max(·) over a local window and sub-samples (LeCun 1980)]
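
A minimal MATLAB sketch of 2x2 max pooling with stride 2; the input size is an assumption.

    % Max pooling: keep the largest response in each 2x2 block, halving resolution.
    x = randn(8, 8);                % assumed single-channel response map
    y = zeros(4, 4);
    for i = 1:4
        for j = 1:4
            blk    = x(2*i-1:2*i, 2*j-1:2*j);
            y(i,j) = max(blk(:));
        end
    end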

  32. Hierarchical Learning [figure: simple cells → complex cells → view-tuned cells (illustration: Bob Crimi)]

  33. Hierarchical Learning [figure: the ventral visual stream, V1 → V2/V4 → IT, from simple cells to complex cells to view-tuned cells (illustration: Bob Crimi)]

  34. Current State of the Art [figure: image patch 3@(227x227) → conv1 96@(55x55) → conv2 256@(27x27) → conv3 384@(13x13) → conv4 384@(13x13) → conv5 256@(13x13) → fc-6 (4096) → fc-7 (K) → class scores "bird", "car", ..., "cat"] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.

  35. Current State of the Art [figure: the same AlexNet-style architecture as on the previous slide] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.

  36. Current State of the Art [figure: the same architecture, with the output drawn as a K × 1 one-hot vector (0, 1, ..., 0)^T] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.

  37. Current State of the Art - Pose Selection [figure: image patch 3@(224x224) → conv1 64@(54x54) → conv2 256@(27x27) → conv3 384@(13x13) → conv4 384@(13x13) → conv5 256@(13x13) → fc-6 (4096) → fc-7 (4096) → fc-8 → class scores "bird", "car", ..., "cat"] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012. K. Chatfield, V. Lempitsky, A. Vedaldi and A. Zisserman. "Return of the Devil in the Details: Delving Deep into Convolutional Networks." In BMVC, 2014.
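
A hedged MATLAB sketch of a layer stack in the spirit of the AlexNet-style architectures on these slides, using Deep Learning Toolbox layer objects; the filter sizes, strides, padding, and class count K are assumptions, not the slides' exact specification.

    % AlexNet-like layer stack (sketch; sizes and strides assumed for illustration).
    K = 1000;                                       % number of classes (assumed)
    layers = [
        imageInputLayer([227 227 3])                % image patch 3@(227x227)
        convolution2dLayer(11, 96, 'Stride', 4)     % conv1 -> 96@(55x55)
        reluLayer
        maxPooling2dLayer(3, 'Stride', 2)
        convolution2dLayer(5, 256, 'Padding', 2)    % conv2 -> 256 channels
        reluLayer
        maxPooling2dLayer(3, 'Stride', 2)
        convolution2dLayer(3, 384, 'Padding', 1)    % conv3
        reluLayer
        convolution2dLayer(3, 384, 'Padding', 1)    % conv4
        reluLayer
        convolution2dLayer(3, 256, 'Padding', 1)    % conv5
        reluLayer
        maxPooling2dLayer(3, 'Stride', 2)
        fullyConnectedLayer(4096)                   % fc-6
        reluLayer
        fullyConnectedLayer(4096)                   % fc-7
        reluLayer
        fullyConnectedLayer(K)                      % class scores
        softmaxLayer
        classificationLayer];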

  38. Impact on Object Recognition [figure: ImageNet Challenge error by year, split into "BC" (before ConvNets) and "AD" (after deep learning), with error falling to 6.8%]

  39. Visualizing CNNs

  40. CNNs as Feature Extraction [figure: image patch 3@(224x224) → conv1 96@(54x54) → conv2 256@(27x27) → conv3 384@(13x13) → conv4 384@(13x13) → conv5 256@(13x13) → fc-6 (4096) → fc-7 (4096) → fc-8; the convolutional and fully connected layers use pre-learned parameters (VGG), and only the final layer's parameters ("?") are learned for the new task] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012. K. Chatfield, V. Lempitsky, A. Vedaldi and A. Zisserman. "Return of the Devil in the Details: Delving Deep into Convolutional Networks." In BMVC, 2014.
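
A short MATLAB sketch of using a pre-trained network as a fixed feature extractor, as in the figure; it assumes the Deep Learning Toolbox plus the VGG-16 support package are installed, and that 'fc7' is the name of the layer whose output is wanted. Only a light classifier on top of these fixed features would then be learned for the new task (the "?" block in the figure).

    % Pre-learned VGG parameters as a fixed deep feature extractor.
    net  = vgg16;                                       % pre-trained network
    img  = imresize(imread('peppers.png'), [224 224]);  % built-in demo image
    feat = squeeze(activations(net, img, 'fc7'));       % 4096-D deep feature vector
    % A simple classifier (e.g. an SVM via fitcecoc) would be trained on such
    % features for the new task, leaving the VGG parameters fixed.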

  41. Today • Deep Features • Deep Tracking • Deep Flow
