
EE 6882 Visual Search Engine. Prof. Shih-Fu Chang, Feb. 13, 2012.



  1. Lecture #4: Local Feature Matching; Bag-of-Words image representation: coding and pooling.
(Many slides from A. Efros, W. Freeman, C. Kambhamettu, L. Xie, and likely others. Slide preparation assisted by Rong-Rong Ji.)

Corner Detection
- Types of local image windows:
  - Flat: little or no brightness change
  - Edge: strong brightness change in a single direction
  - Flow: parallel stripes
  - Corner/spot: strong brightness changes in orthogonal directions
- Basic idea: find points where two edges meet; look at the gradient behavior over a small window. (Slide of A. Efros)

  2. Harris Detector: Mathematics

Change of intensity for the shift $[u, v]$:

$$E(u, v) = \sum_{x, y} w(x, y)\,\big[I(x + u, y + v) - I(x, y)\big]^2$$

where $w(x, y)$ is the window function (1 inside the window and 0 outside, or a Gaussian), $I(x + u, y + v)$ is the shifted intensity, and $I(x, y)$ is the original intensity.

Taylor expansion: for small shifts $[u, v]$ we have the bilinear approximation

$$E(u, v) \approx [u \;\; v]\, M \begin{bmatrix} u \\ v \end{bmatrix}$$

where $M$ is a $2 \times 2$ matrix computed from image derivatives:

$$M = \sum_{x, y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

  3. Harris Detector: Mathematics (cont.)

Intensity change in the shifting window: eigenvalue analysis. Let $\lambda_1 \ge \lambda_2$ be the eigenvalues of $M$. If we try every possible shift, the direction of fastest change corresponds to $\lambda_1$; the ellipse $E(u, v) = \text{const}$ has axes of length $\lambda_1^{-1/2}$ and $\lambda_2^{-1/2}$. (Slide of A. Efros)

Measure of corner response:

$$R = \det M - k\,(\operatorname{trace} M)^2$$

or, since $\det M = \lambda_1 \lambda_2$ and $\operatorname{trace} M = \lambda_1 + \lambda_2$,

$$R = \lambda_1 \lambda_2 - k\,(\lambda_1 + \lambda_2)^2$$

($k$ is an empirical constant, $k = 0.04$ to $0.06$.)
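To make the construction concrete, here is a minimal NumPy/SciPy sketch of the Harris response map; the Sobel gradients and the Gaussian window width sigma are illustrative choices, not prescribed by the lecture.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(image, sigma=1.0, k=0.05):
    """Harris corner response R = det(M) - k * trace(M)^2 at every pixel."""
    Ix = sobel(image.astype(float), axis=1)   # horizontal gradient I_x
    Iy = sobel(image.astype(float), axis=0)   # vertical gradient I_y
    # Entries of M, accumulated under a Gaussian window w(x, y)
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det_M = Sxx * Syy - Sxy ** 2
    trace_M = Sxx + Syy
    return det_M - k * trace_M ** 2
```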

  4. Harris Detector: The Algorithm
- Find points with a large corner response function $R$ ($R >$ threshold).
- Take the points of local maxima of $R$. (A sketch of both steps follows below.)

Models of Image Change
- Geometry:
  - Rotation
  - Similarity (rotation + uniform scale)
  - Affine (scale dependent on direction); valid for an orthographic camera and a locally planar object
- Photometry:
  - Affine intensity change ($I \to aI + b$)
(Slide of C. Kambhamettu)
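A minimal sketch of those two steps, assuming the harris_response function from the previous sketch; the relative threshold and neighborhood size are illustrative.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def harris_corners(image, rel_threshold=0.01, window=5):
    """Keep pixels that exceed the threshold and are local maxima of R."""
    R = harris_response(image)
    strong = R > rel_threshold * R.max()             # step 1: large response
    local_max = R == maximum_filter(R, size=window)  # step 2: local maxima
    ys, xs = np.nonzero(strong & local_max)
    return list(zip(xs, ys))
```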

  5. Harris Detector: Some Properties
- But: non-invariant to image scale! A pattern detected as a corner at one scale may have all of its points classified as edges at a finer scale. (Slide of C. Kambhamettu)

Scale Invariant Detection
- Consider regions (e.g., circles) of different sizes around a point.
- Regions of corresponding sizes (at different scales, fine/low vs. coarse/high) will look the same in both images. (Slide of C. Kambhamettu)

  6. Scale Invariant Detection
- The problem: how do we choose corresponding circles independently in each image? (Slide of C. Kambhamettu)

Scale-Space Pyramid

  7. Scale Space: Difference of Gaussians

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$$

Scale Invariant Detection
- Functions for determining scale have the form $f = \text{Kernel} * \text{Image}$.
- Kernels:
  - $\text{DoG} = G(x, y, k\sigma) - G(x, y, \sigma)$ (Difference of Gaussians)
  - $L = \sigma^2\,(G_{xx}(x, y, \sigma) + G_{yy}(x, y, \sigma))$ (Laplacian)
  where $G$ is the Gaussian above.
- Note: both kernels are invariant to scale and rotation. (Slide of C. Kambhamettu)
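A small sketch of the two Gaussians and their difference; the 3-sigma support radius and k = sqrt(2) are common but illustrative choices.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """2-D Gaussian G(x, y, sigma) = exp(-(x^2 + y^2) / (2 sigma^2)) / (2 pi sigma^2)."""
    ax = np.arange(-radius, radius + 1)
    x, y = np.meshgrid(ax, ax)
    return np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)

def dog_kernel(sigma, k=np.sqrt(2)):
    """Difference of Gaussians: G(x, y, k*sigma) - G(x, y, sigma)."""
    radius = int(np.ceil(3 * k * sigma))   # shared support so the two grids align
    return gaussian_kernel(k * sigma, radius) - gaussian_kernel(sigma, radius)
```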

  8. Gaussian Kernel and DoG
[Figures: Gaussian kernels at sigma = 2 and sigma = 4, and their difference (sigma-2 kernel minus sigma-4 kernel): the Difference of Gaussians (DoG).]

  9. Key Point Localization
- Detect maxima and minima of the difference-of-Gaussian in scale space.
[Figure: the scale-space pyramid is built by repeated blur, subtract, and resample steps.]

Scale Invariant Interest Point Detectors
- Harris-Laplacian [1]: find the local maximum of the Harris corner detector in space (image coordinates x, y) and of the Laplacian in scale.
- SIFT (Lowe) [2]: find the local maximum of the Difference of Gaussians in both space and scale.
(Slide of C. Kambhamettu)

[1] K. Mikolajczyk, C. Schmid, "Indexing Based on Scale Invariant Interest Points," ICCV 2001.
[2] D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," IJCV 2004.
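A minimal sketch of the extrema search, assuming dog is a (scales, height, width) array of DoG responses such as one built from the kernels above; the contrast threshold is an illustrative extra.

```python
import numpy as np

def dog_extrema(dog, contrast_threshold=0.03):
    """Return (scale, y, x) points that are extrema of their 3x3x3 neighborhood."""
    keypoints = []
    S, H, W = dog.shape
    for s in range(1, S - 1):
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                v = dog[s, y, x]
                if abs(v) < contrast_threshold:
                    continue                     # reject low-contrast points early
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                if v == cube.max() or v == cube.min():
                    keypoints.append((s, y, x))  # maximum or minimum in scale space
    return keypoints
```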

  10. Scale Invariant Detectors
- Experimental evaluation of detectors w.r.t. scale change.
- Repeatability rate = (# correct correspondences) / (average # detected points).
K. Mikolajczyk, C. Schmid, "Indexing Based on Scale Invariant Interest Points," ICCV 2001.

SIFT keypoints [figure]

  11. [Figures: SIFT keypoints after extrema detection, and after the curvature test removes edge responses.]

  12. Keypoint Orientation and Scale

SIFT Invariant Descriptors
- Extract image patches relative to the local orientation (the dominant direction of the gradient).

  13. Local Appearance Descriptor (SIFT)
- Compute the gradient in a local patch.
- Histogram of oriented gradients over local grids, e.g., 4x4 grids with 8 directions -> 4x4x8 = 128 dimensions.
- Scale invariant. [Lowe, ICCV 1999]

Point Descriptors
- We know how to detect points. Next question: how do we match them?
- A point descriptor should be invariant and distinctive.
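A simplified sketch of the 4x4x8 histogram layout on a square patch (e.g., 16x16); real SIFT also applies Gaussian weighting, trilinear interpolation, and rotation to the dominant orientation, all omitted here.

```python
import numpy as np

def sift_like_descriptor(patch, grids=4, bins=8):
    """128-D descriptor: gradient-orientation histograms over a grids x grids layout."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)        # orientation in [0, 2*pi)
    cell = patch.shape[0] // grids
    desc = []
    for i in range(grids):
        for j in range(grids):
            rows = slice(i * cell, (i + 1) * cell)
            cols = slice(j * cell, (j + 1) * cell)
            hist, _ = np.histogram(ang[rows, cols], bins=bins,
                                   range=(0, 2 * np.pi), weights=mag[rows, cols])
            desc.extend(hist)
    desc = np.asarray(desc)
    return desc / (np.linalg.norm(desc) + 1e-12)  # normalize for robustness
```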

  14. Feature Matching
How do we match descriptors across images? (Slide of A. Efros)

Feature-space outlier rejection [Lowe, 1999]:
- 1-NN: SSD of the closest match.
- 2-NN: SSD of the second-closest match.
- Look at how much better the best match (1-NN) is than the second-best match (2-NN), e.g., via the ratio 1-NN/2-NN. (Slide of A. Efros)
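A minimal sketch of that ratio test; the 0.8 cutoff is an illustrative value.

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    """Keep matches whose 1-NN SSD is clearly below the 2-NN SSD."""
    matches = []
    for i, d in enumerate(desc1):
        ssd = np.sum((desc2 - d) ** 2, axis=1)  # SSD to every candidate
        nn1, nn2 = np.argsort(ssd)[:2]          # closest and second-closest
        if ssd[nn1] < ratio * ssd[nn2]:
            matches.append((i, nn1))            # ambiguous matches are rejected
    return matches
```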

  15. Feature-Space Outlier Rejection
- Can we now compute the homography H from the remaining (blue) points?
- No! Still too many outliers. What can we do? (Slide of A. Efros)

RANSAC for Estimating a Homography
RANSAC loop:
1. Select four feature pairs (at random).
2. Compute the homography H (exact).
3. Compute the inliers: pairs where SSD(p_i', H p_i) < ε.
4. Keep the largest set of inliers.
5. Re-compute the least-squares estimate of H on all of the inliers.
(Slide of A. Efros)
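A sketch of that loop. fit_homography (a DLT-style exact/least-squares fit) and project (apply H in homogeneous coordinates) are assumed helper functions, not shown here.

```python
import numpy as np

def ransac_homography(pts1, pts2, n_iters=1000, eps=5.0):
    """RANSAC loop for a homography between matched point arrays pts1, pts2."""
    best_inliers = np.zeros(len(pts1), dtype=bool)
    for _ in range(n_iters):
        sample = np.random.choice(len(pts1), 4, replace=False)   # 1. four pairs
        H = fit_homography(pts1[sample], pts2[sample])           # 2. exact fit (assumed helper)
        err = np.sum((project(H, pts1) - pts2) ** 2, axis=1)     # SSD(p', Hp) (assumed helper)
        inliers = err < eps ** 2                                 # 3. inlier test
        if inliers.sum() > best_inliers.sum():                   # 4. keep largest set
            best_inliers = inliers
    return fit_homography(pts1[best_inliers], pts2[best_inliers])  # 5. least-squares refit
```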

  16. Least Squares Fit
- Find the "average" translation vector. (Slide of A. Efros)

RANSAC
[Figure: RANSAC fit, robust to the outliers that pull the least-squares fit off.] (Slide of A. Efros)

  17. From Local Features to Visual Words
128-D feature space -> clustering -> visual word vocabulary

K-Means Clustering
- Training data: samples $x_1, x_2, \ldots, x_N$ with no labels (unsupervised learning); clusters $C_1, C_2, \ldots, C_K$.
- K-means clustering:
  - Fix the value of K.
  - Initialize the representative (center) of each cluster.
  - Map each sample to its closest cluster: for $i = 1, 2, \ldots, N$, assign $x_i \to C_k$ if $\mathrm{Dist}(x_i, C_k) \le \mathrm{Dist}(x_i, C_{k'})$ for all $k'$.
  - Re-compute the centers and repeat. (A sketch follows below.)
- Can be used to initialize other clustering methods.
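A plain NumPy sketch of those alternating steps; random-sample initialization and the convergence check are illustrative choices.

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Alternate nearest-center assignment and center re-computation."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), K, replace=False)]  # initialize from samples
    for _ in range(n_iters):
        # Map each sample to its closest cluster center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Re-compute the centers (keep the old center if a cluster empties)
        new_centers = np.array([X[labels == k].mean(axis=0)
                                if np.any(labels == k) else centers[k]
                                for k in range(K)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```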

  18. Visual Words: Image Patch Patterns
- Corners, blobs, eyes, letters, ...
Sivic and Zisserman, "Video Google," 2006.

Represent an Image as a Bag of Words
keypoint features -> visual words (via clustering) -> BoW histogram
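A minimal sketch of the last step: hard-assigning each descriptor to its nearest visual word and counting, with the vocabulary taken to be, e.g., the K-means centers from the previous sketch.

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Count how often each visual word is the nearest one to a descriptor."""
    dists = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :], axis=2)
    words = dists.argmin(axis=1)      # nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()          # normalize across images with different # keypoints
```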

  19. Pooling Binary Features
Y-Lan Boureau, Jean Ponce, Yann LeCun, "A Theoretical Analysis of Feature Pooling in Visual Recognition," ICML 2010.
- Consider a P x K matrix of coded features $v_{ik}$, where P = # of features and K = # of codewords.
- To begin with a simple model, assume the $v_i$ are i.i.d.

Distribution Separability
Better separability is achieved by:
1. increasing the distance between the means of the two class-conditional distributions;
2. reducing their standard deviations.

  20. Distribution Separability
- Average pooling: $h_k = \frac{1}{P}\sum_{i=1}^{P} v_{ik}$.
- Max pooling: $h_k = \max_i v_{ik}$.
- [Figure: class separability of the pooled-feature distributions under the two schemes.]
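A toy sketch of the two pooling operators on the P x K matrix from the slide; the sparse binary codes here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
# V: P x K matrix of binary codes, P = 100 features, K = 10 codewords
V = (rng.random((100, 10)) < 0.05).astype(float)

avg_pooled = V.mean(axis=0)   # average pooling: activation frequency per codeword
max_pooled = V.max(axis=0)    # max pooling: did the codeword fire at all?
print(avg_pooled, max_pooled)
```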

  21. Pooling Binary vs. Continuous Features
- For binary features: the analysis above applies.
- For continuous features: the modeling becomes more complex and the conclusions are slightly different.

Soft Coding
- Assign a feature to multiple visual words.
- Weights are determined by feature-to-word similarity.
Details in: Jiang, Ngo, and Yang, ACM CIVR 2007.
(Image source: http://www.cs.joensuu.fi/pages/franti/vq/lkm15.gif)
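A sketch of one common soft-coding form, where the weights decay with feature-to-word distance under a Gaussian kernel; the kernel and its width sigma are illustrative assumptions, not necessarily the exact scheme of Jiang et al.

```python
import numpy as np

def soft_assign(descriptor, vocabulary, sigma=1.0):
    """Weight over all visual words, decaying with feature-to-word distance."""
    dists = np.linalg.norm(vocabulary - descriptor, axis=1)
    weights = np.exp(-dists ** 2 / (2 * sigma ** 2))  # similarity kernel (assumed form)
    return weights / weights.sum()                    # weights sum to 1

def soft_bow_histogram(descriptors, vocabulary, sigma=1.0):
    """Accumulate soft assignments instead of hard nearest-word counts."""
    hist = np.zeros(len(vocabulary))
    for d in descriptors:
        hist += soft_assign(d, vocabulary, sigma)
    return hist / hist.sum()
```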

  22. Multi-BoW: Spatial Pyramid Kernel
- Partition the image into increasingly fine spatial grids, compute a BoW histogram per cell, and concatenate (see the sketch below). S. Lazebnik, et al., CVPR 2006.

Classifiers
- K-Nearest Neighbors + voting
- Linear discriminative model (SVM)
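A minimal sketch of a spatial pyramid feature, assuming bow_histogram from the earlier sketch, keypoint coordinates points (an N x 2 array of x, y), and their descriptors descs; the per-level weighting of the pyramid kernel is omitted.

```python
import numpy as np

def spatial_pyramid(points, descs, vocab, width, height, levels=2):
    """Concatenate per-cell BoW histograms over 1x1, 2x2, 4x4, ... grids."""
    feats = []
    for level in range(levels + 1):
        n = 2 ** level                                   # n x n grid at this level
        col = (points[:, 0] * n // width).astype(int)    # cell column per keypoint
        row = (points[:, 1] * n // height).astype(int)   # cell row per keypoint
        for i in range(n):
            for j in range(n):
                in_cell = (row == i) & (col == j)
                if in_cell.any():
                    feats.append(bow_histogram(descs[in_cell], vocab))
                else:
                    feats.append(np.zeros(len(vocab)))   # empty cell
    return np.concatenate(feats)
```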

  23. Machine Learning: Build a Classifier
Find a separating hyperplane: choose w to maximize the margin.
- Decision boundary: $w^T x + b = 0$.
- Decision function: $f(x) = \mathrm{sign}(w^T x + b)$.
- $w^T x_i + b > 0$ if label $y_i = +1$; $w^T x_i + b < 0$ if label $y_i = -1$.

Support Vector Machine (tutorial by Burges '98)
- Look for the separating plane with the highest margin.
- Decision boundary $H_0$: $w^T x + b = 0$.
- Linearly separable case: $w^T x_i + b \ge +1$ if $y_i = +1$ and $w^T x_i + b \le -1$ if $y_i = -1$, i.e., $y_i (w^T x_i + b) \ge 1$ for all $x_i$.
- Two parallel hyperplanes define the margin: $H_+ : w^T x + b = +1$ and $H_- : w^T x + b = -1$.
- Margin: the sum of the distances of the closest points to the separating plane, margin $= 2 / \lVert w \rVert$.
- The best plane is defined by $w$ and $b$.
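A small illustration of the decision function and the margin formula. Training w and b is the optimization the slide describes; here it is delegated to scikit-learn's LinearSVC on synthetic data, an assumption made for illustration rather than part of the lecture.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Synthetic two-class data standing in for BoW histograms
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 128)),
               rng.normal(2.0, 1.0, (20, 128))])
y = np.array([-1] * 20 + [+1] * 20)

clf = LinearSVC(C=1.0).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

margin = 2.0 / np.linalg.norm(w)      # margin = 2 / ||w||
f = np.sign(X @ w + b)                # decision function f(x) = sign(w^T x + b)
print(margin, (f == y).mean())        # margin and training accuracy
```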
