6.869 Advances in Computer Vision
Prof. Bill Freeman
March 1, 2005

Local Features

Today
Interesting points, correspondence.
Scale and rotation invariant descriptors [Lowe]

Matching points across images important for:
• object identification (instance recognition)
• object (class) recognition
• pose estimation
• stereo (3-d shape)
• motion estimate
• stitching together photographs into a mosaic
• etc.

Correspondence using window matching
Points are highly individually ambiguous… More unique matches are possible with small regions of image.
[Figure: left and right scanlines; criterion function (error) plotted against disparity]
Image Normalization
• Even when the cameras are identical models, there can be differences in gain and sensitivity.
• The cameras do not see exactly the same surfaces, so their overall light levels can differ.
• For these reasons and more, it is a good idea to normalize the pixels in each window:

Average pixel: $\bar{I} = \frac{1}{|W_m(x,y)|} \sum_{(u,v) \in W_m(x,y)} I(u,v)$
Window magnitude: $\|I\|_{W_m(x,y)} = \sqrt{\sum_{(u,v) \in W_m(x,y)} [I(u,v)]^2}$
Normalized pixel: $\hat{I}(x,y) = \frac{I(x,y) - \bar{I}}{\|I - \bar{I}\|_{W_m(x,y)}}$

Sum of Squared (Pixel) Differences
[Figure: left and right images, with window $w_L$ at $(x_L, y_L)$ in $I_L$ and window $w_R$ at $(x_L - d, y_L)$ in $I_R$]
$w_L$ and $w_R$ are corresponding $m$ by $m$ windows of pixels.
We define the window function:
$W_m(x,y) = \{ u, v \mid x - \frac{m}{2} \le u \le x + \frac{m}{2},\ y - \frac{m}{2} \le v \le y + \frac{m}{2} \}$
The SSD cost measures the intensity difference as a function of disparity:
$C_r(x,y,d) = \sum_{(u,v) \in W_m(x,y)} [I_L(u,v) - I_R(u-d,v)]^2$

Images as Vectors
[Figure: left and right images with windows $w_L$ and $w_R$]
Each window is a vector in an $m^2$ dimensional vector space. Normalization makes them unit length.

Image windows as vectors
“Unwrap” image to form vector, using raster scan order.
[Figure: rows 1, 2, 3 of the window, each of $m$ pixels, concatenated into one vector]

Image Metrics
(Normalized) Sum of Squared Differences
$C_{SSD}(d) = \sum_{(u,v) \in W_m(x,y)} [\hat{I}_L(u,v) - \hat{I}_R(u-d,v)]^2 = \|w_L - w_R(d)\|^2$
Normalized Correlation
$C_{NC}(d) = \sum_{(u,v) \in W_m(x,y)} \hat{I}_L(u,v)\, \hat{I}_R(u-d,v) = w_L \cdot w_R(d) = \cos\theta$
$d^* = \arg\min_d \|w_L - w_R(d)\|^2 = \arg\max_d\, w_L \cdot w_R(d)$

Possible metrics
Distance? Angle?
[Figure: vectors $w_L$ and $w_R(d)$, with the distance and the angle between them]
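As a minimal sketch of the window matching described above (not the lecture's own code), the following pure-Python example normalizes each window and searches over integer disparities with both metrics. The names `window`, `normalize`, and `best_disparity`, the odd window size `2*half + 1`, and the nested-list image layout `img[row][column]` are assumptions made for illustration.

```python
import math

def window(img, x, y, half):
    """Pixels of the square window centred at column x, row y, in raster-scan order."""
    return [img[v][u]
            for v in range(y - half, y + half + 1)
            for u in range(x - half, x + half + 1)]

def normalize(w):
    """Subtract the window mean, then scale the window vector to unit length."""
    mean = sum(w) / len(w)
    centred = [p - mean for p in w]
    norm = math.sqrt(sum(c * c for c in centred))
    if norm == 0.0:
        return centred  # flat window: nothing to scale
    return [c / norm for c in centred]

def best_disparity(left, right, x, y, half, max_d):
    """Return (d_ssd, d_nc): the disparity minimising normalized SSD and the one
    maximising normalized correlation, for the left window centred at (x, y)."""
    w_l = normalize(window(left, x, y, half))
    best_ssd = None  # (cost, disparity)
    best_nc = None   # (score, disparity)
    for d in range(max_d + 1):
        if x - d - half < 0:
            break  # right window would leave the image
        w_r = normalize(window(right, x - d, y, half))
        ssd = sum((a - b) ** 2 for a, b in zip(w_l, w_r))
        nc = sum(a * b for a, b in zip(w_l, w_r))
        if best_ssd is None or ssd < best_ssd[0]:
            best_ssd = (ssd, d)
        if best_nc is None or nc > best_nc[0]:
            best_nc = (nc, d)
    return best_ssd[1], best_nc[1]
```

Because both windows are unit-length vectors after normalization, the SSD cost equals 2 minus twice the correlation score, so the two criteria pick the same disparity up to floating-point ties.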
Local Features
Not all points are equally good for matching…

Aperture Problem and Normal Flow
[Figures only]
Aperture Problem and Normal Flow
[Figures only]

(Review) Differential approach: Optical flow constraint equation
Brightness should stay constant as you track motion:
$I(x + u\,\delta t,\ y + v\,\delta t,\ t + \delta t) = I(x, y, t)$
1st order Taylor series, valid for small $\delta t$:
$I(x, y, t) + u\,\delta t\, I_x + v\,\delta t\, I_y + \delta t\, I_t = I(x, y, t)$
Constraint equation:
$u I_x + v I_y + I_t = 0$
“BCCE” - Brightness Change Constraint Equation

Aperture Problem and Normal Flow
The gradient constraint:
$I_x u + I_y v + I_t = 0$
$\nabla I \cdot \vec{U} = 0$ (which matches the line above if $\nabla I = (I_x, I_y, I_t)$ and $\vec{U} = (u, v, 1)$)
Defines a line in the (u,v) space.
[Figure: the constraint line in the (u, v) plane]
Normal Flow:
$u_\perp = -\frac{I_t}{|\nabla I|} \frac{\nabla I}{|\nabla I|}$ (here $\nabla I = (I_x, I_y)$ is the spatial gradient)

Combining Local Constraints
$\nabla I^1 \cdot U = -I_t^1$
$\nabla I^2 \cdot U = -I_t^2$
$\nabla I^3 \cdot U = -I_t^3$
etc.
[Figure: several constraint lines in the (u, v) plane]
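The normal-flow formula above can be sketched in a few lines; this is an illustrative helper, not code from the lecture. The function name `normal_flow` and the choice to return `None` where the gradient vanishes are assumptions.

```python
def normal_flow(ix, iy, it):
    """Flow component along the spatial gradient (ix, iy) implied by
    ix*u + iy*v + it = 0. Returns None if the gradient is zero."""
    mag2 = ix * ix + iy * iy
    if mag2 == 0.0:
        return None  # no gradient: the constraint says nothing about (u, v)
    scale = -it / mag2  # equals (-it / |grad I|) * (1 / |grad I|)
    return (scale * ix, scale * iy)
```

The returned vector is the point on the constraint line closest to the origin of the (u, v) plane; the component of motion along the line itself stays undetermined, which is the aperture problem.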
Lucas-Kanade: Integrate gradients over a Patch
Assume a single velocity for all pixels within an image patch:
$E(u, v) = \sum_{x,y \in \Omega} \left( I_x(x,y)\,u + I_y(x,y)\,v + I_t \right)^2$
Solve with:
$\begin{bmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{bmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = -\begin{pmatrix} \sum I_x I_t \\ \sum I_y I_t \end{pmatrix}$
On the LHS: sum of the 2x2 outer product tensor of the gradient vector
$\left( \sum \nabla I\, \nabla I^T \right) \vec{U} = -\sum \nabla I\, I_t$

Local Patch Analysis
[Figure only]

Good Features to Track
$\begin{bmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{bmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = -\begin{pmatrix} \sum I_x I_t \\ \sum I_y I_t \end{pmatrix}$
$A u = b$
When is This Solvable?
• A should be invertible
• A should not be too small due to noise
– eigenvalues $\lambda_1$ and $\lambda_2$ of A should not be too small
• A should be well-conditioned
– $\lambda_1 / \lambda_2$ should not be too large ($\lambda_1$ = larger eigenvalue)
– Both conditions satisfied when $\min(\lambda_1, \lambda_2) > c$

Selecting Good Features
• What’s a “good feature”?
– Satisfies brightness constancy
– Has sufficient texture variation
– Does not have too much texture variation
– Corresponds to a “real” surface patch
– Does not deform too much over time

Harris detector
Auto-correlation matrix
$\begin{bmatrix} \sum_{(x_k,y_k) \in W} (I_x(x_k,y_k))^2 & \sum_{(x_k,y_k) \in W} I_x(x_k,y_k)\, I_y(x_k,y_k) \\ \sum_{(x_k,y_k) \in W} I_x(x_k,y_k)\, I_y(x_k,y_k) & \sum_{(x_k,y_k) \in W} (I_y(x_k,y_k))^2 \end{bmatrix}$
• Auto-correlation matrix
– captures the structure of the local neighborhood
– measure based on eigenvalues of this matrix
• 2 strong eigenvalues => interest point
• 1 strong eigenvalue => contour
• 0 strong eigenvalues => uniform region
• Interest point detection
– threshold on the eigenvalues
– local maximum for localization

Selecting Good Features
$\lambda_1$ and $\lambda_2$ are large
[Figure only]
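The patch solve and the eigenvalue test above can be combined in one small routine. This is a sketch under stated assumptions rather than the lecture's implementation: the gradients are assumed to be precomputed and passed as `(Ix, Iy, It)` triples, and the names `min_eigenvalue` and `lucas_kanade` and the default threshold `c` are hypothetical.

```python
import math

def min_eigenvalue(sxx, sxy, syy):
    """Smaller eigenvalue of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]]."""
    half_trace = (sxx + syy) / 2.0
    det = sxx * syy - sxy * sxy
    disc = math.sqrt(max(half_trace * half_trace - det, 0.0))
    return half_trace - disc

def lucas_kanade(grads, c=1e-6):
    """Solve A u = b for one patch. grads is a list of (Ix, Iy, It) per pixel.
    Returns (u, v), or None when min(lambda1, lambda2) <= c."""
    sxx = sum(ix * ix for ix, iy, it in grads)
    sxy = sum(ix * iy for ix, iy, it in grads)
    syy = sum(iy * iy for ix, iy, it in grads)
    sxt = sum(ix * it for ix, iy, it in grads)
    syt = sum(iy * it for ix, iy, it in grads)
    if min_eigenvalue(sxx, sxy, syy) <= c:
        return None  # flat patch or single edge direction: not a good feature
    det = sxx * syy - sxy * sxy
    # Cramer's rule with right-hand side b = (-sxt, -syt)
    u = (-sxt * syy + sxy * syt) / det
    v = (-sxx * syt + sxy * sxt) / det
    return (u, v)
```

A patch whose gradients all point the same way gives a zero smaller eigenvalue, so the function returns `None` instead of dividing by a vanishing determinant; that is the aperture problem showing up in the algebra.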
Selecting Good Features
large $\lambda_1$, small $\lambda_2$
[Figure only]

Selecting Good Features
small $\lambda_1$, small $\lambda_2$
[Figure only]

Today
Interesting points, correspondence.
Scale and rotation invariant descriptors [Lowe]

CVPR 2003 Tutorial
Recognition and Matching Based on Local Invariant Features
David Lowe
Computer Science Department
University of British Columbia

Invariant Local Features
• Image content is transformed into local feature coordinates that are invariant to translation, rotation, scale, and other imaging parameters
[Figure: SIFT Features]

Advantages of invariant local features
• Locality: features are local, so robust to occlusion and clutter (no prior segmentation)
• Distinctiveness: individual features can be matched to a large database of objects
• Quantity: many features can be generated for even small objects
• Efficiency: close to real-time performance
• Extensibility: can easily be extended to wide range of differing feature types, with each adding robustness
Scale invariance
Requires a method to repeatably select points in location and scale:
• The only reasonable scale-space kernel is a Gaussian (Koenderink, 1984; Lindeberg, 1994)
• An efficient choice is to detect peaks in the difference of Gaussian pyramid (Burt & Adelson, 1983; Crowley & Parker, 1984, but examining more scales)
• Difference-of-Gaussian with constant ratio of scales is a close approximation to Lindeberg’s scale-normalized Laplacian (can be shown from the heat diffusion equation)

Scale space processed one octave at a time
[Figure: repeated Blur and Subtract steps within each octave]

Key point localization
• Detect maxima and minima of difference-of-Gaussian in scale space
• Fit a quadratic to surrounding values for sub-pixel and sub-scale interpolation (Brown & Lowe, 2002)
• Taylor expansion around point:
$D(\mathbf{x}) = D + \frac{\partial D}{\partial \mathbf{x}}^T \mathbf{x} + \frac{1}{2} \mathbf{x}^T \frac{\partial^2 D}{\partial \mathbf{x}^2} \mathbf{x}$
• Offset of extremum (use finite differences for derivatives):
$\hat{\mathbf{x}} = -\left( \frac{\partial^2 D}{\partial \mathbf{x}^2} \right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$
[Figure: Blur and Subtract steps]

Select canonical orientation
• Create histogram of local gradient directions computed at selected scale
• Assign canonical orientation at peak of smoothed histogram
• Each key specifies stable 2D coordinates (x, y, scale, orientation)
[Figure: orientation histogram over 0 to $2\pi$]

SIFT vector formation
• Thresholded image gradients are sampled over 16x16 array of locations in scale space
• Create array of orientation histograms
• 8 orientations x 4x4 histogram array = 128 dimensions

Example of keypoint detection
Threshold on value at DOG peak and on ratio of principal curvatures (Harris approach)
(a) 233x189 image
(b) 832 DOG extrema
(c) 729 left after peak value threshold
(d) 536 left after testing ratio of principal curvatures
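To illustrate the quadratic fit used for sub-pixel and sub-scale interpolation, here is a deliberately simplified one-dimensional sketch. The slide's fit is over location and scale together; this version handles a single axis, where the offset reduces to minus the first derivative over the second derivative, both estimated by finite differences. The function name `refine_extremum_1d` and the three-sample interface are assumptions for illustration.

```python
def refine_extremum_1d(d_minus, d_zero, d_plus):
    """Fit a parabola through three equally spaced samples (at -1, 0, +1) and
    return (offset, value): the sub-sample position of the extremum relative
    to the centre sample and the interpolated value there.
    Returns None if the samples are collinear (no curvature)."""
    first = (d_plus - d_minus) / 2.0          # central-difference first derivative
    second = d_plus - 2.0 * d_zero + d_minus  # finite-difference second derivative
    if second == 0.0:
        return None
    offset = -first / second
    value = d_zero + 0.5 * first * offset
    return (offset, value)
```

In practice an offset larger than about half a sample in magnitude suggests the extremum lies closer to a neighbouring sample, so the fit would normally be repeated there; that policy is left out of the sketch.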
Feature stability to noise
• Match features after random change in image scale & orientation, with differing levels of image noise
• Find nearest neighbor in database of 30,000 features

Feature stability to affine change
• Match features after random change in image scale & orientation, with 2% image noise, and affine distortion
• Find nearest neighbor in database of 30,000 features

Distinctiveness of features
• Vary size of database of features, with 30 degree affine change, 2% image noise
• Measure % correct for single nearest neighbor match

A good SIFT features tutorial
http://www.cs.toronto.edu/~jepson/csc2503/tutSIFT04.pdf
By Estrada, Jepson, and Fleet.
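The nearest-neighbor test used in the stability and distinctiveness experiments above amounts to finding the database descriptor closest to a query. A brute-force sketch follows, assuming Euclidean distance and descriptors stored as plain lists of floats; the name `nearest_neighbor` is hypothetical, and a database of 30,000 features would normally call for a faster search structure than this linear scan.

```python
import math

def nearest_neighbor(query, database):
    """Return (index, distance) of the database descriptor closest to query
    in Euclidean distance, or (-1, inf) if the database is empty."""
    best_index, best_dist = -1, math.inf
    for i, desc in enumerate(database):
        dist = math.sqrt(sum((q - d) ** 2 for q, d in zip(query, desc)))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index, best_dist
```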