Feature Descriptors Computer Vision Fall 2018 Columbia University
Talk announcement: Tali Dekel, Tuesday, October 2, 11am, CEPSR 620. http://people.csail.mit.edu/talidekel
Seam Carving
Seam carving: main idea [Avidan & Shamir, SIGGRAPH 2007]
Content-aware resizing vs. traditional resizing.
Seam carving: algorithm
Let a vertical seam $s = (s_1, s_2, \dots, s_h)$ consist of $h$ positions that form an 8-connected path, and let $\mathrm{Energy}(f)$ be an energy function on the image (e.g., gradient magnitude). Let the cost of a seam be
$$\mathrm{Cost}(s) = \sum_{i=1}^{h} \mathrm{Energy}(f(s_i))$$
The optimal seam minimizes this cost:
$$s^* = \arg\min_s \mathrm{Cost}(s)$$
Compute it efficiently with dynamic programming. Slide credit: Kristen Grauman
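To make the cost definition concrete, a minimal NumPy sketch (function and variable names are illustrative, not from the lecture):

```python
import numpy as np

def seam_cost(energy, seam):
    """energy: H x W array; seam: length-H array with seam[i] = the column
    the seam occupies in row i. 8-connectivity means adjacent rows differ
    by at most one column."""
    assert np.all(np.abs(np.diff(seam)) <= 1), "seam must be 8-connected"
    rows = np.arange(len(seam))
    return energy[rows, seam].sum()
```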
How to identify the minimum cost seam?
• First, consider a greedy approach on this energy matrix (gradient magnitude):
1 3 0
2 8 9
5 2 6
Greedily starting from the cheapest entry of the top row (0) and always extending to the cheapest reachable neighbor below gives 0 + 8 + 2 = 10, yet the best connected seam costs only 5, so greedy is suboptimal.
Slide credit: Kristen Grauman
Seam carving: algorithm
• Compute the cumulative minimum energy for all possible connected seams at each entry (i, j):
$$M(i,j) = \mathrm{Energy}(i,j) + \min\big(M(i-1,\,j-1),\; M(i-1,\,j),\; M(i-1,\,j+1)\big)$$
(Energy matrix: gradient magnitude. M matrix: cumulative minimum energy for vertical seams.)
• Then, the minimum value in the last row of M indicates the end of the minimal connected vertical seam.
• Backtrack up from there, selecting the minimum of the 3 entries above in M.
Slide credit: Kristen Grauman
Example
$$M(i,j) = \mathrm{Energy}(i,j) + \min\big(M(i-1,\,j-1),\; M(i-1,\,j),\; M(i-1,\,j+1)\big)$$
Energy matrix (gradient magnitude):
1 3 0
2 8 9
5 2 6
M matrix (for vertical seams):
1 3 0
3 8 9
8 5 14
Slide credit: Kristen Grauman
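The recurrence and the backtracking step translate directly into a short dynamic program. A minimal NumPy sketch (names illustrative):

```python
import numpy as np

def find_vertical_seam(energy):
    """Return the minimum-cost 8-connected vertical seam (one column per row)."""
    H, W = energy.shape
    M = energy.astype(float).copy()
    # Forward pass: M[i, j] = energy[i, j] + min of the three parents above,
    # with inf padding so border columns only see valid parents.
    for i in range(1, H):
        left  = np.r_[np.inf, M[i - 1, :-1]]   # M[i-1, j-1]
        up    = M[i - 1, :]                    # M[i-1, j]
        right = np.r_[M[i - 1, 1:], np.inf]    # M[i-1, j+1]
        M[i] += np.minimum(np.minimum(left, up), right)
    # Backtrack from the minimum entry in the last row.
    seam = np.empty(H, dtype=int)
    seam[-1] = int(np.argmin(M[-1]))
    for i in range(H - 2, -1, -1):
        j = seam[i + 1]
        lo, hi = max(j - 1, 0), min(j + 1, W - 1)
        seam[i] = lo + int(np.argmin(M[i, lo:hi + 1]))
    return seam

# On the 3x3 example above this returns columns [0, 0, 1], with cost 1+2+2 = 5.
```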
Real image example: the original image and its energy map (blue = low energy, red = high energy). Slide credit: Kristen Grauman
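The energy map here is the gradient magnitude. A minimal sketch, assuming a grayscale image and SciPy:

```python
import numpy as np
from scipy import ndimage

def energy_map(img):
    """Gradient-magnitude energy: high near edges, low in smooth regions."""
    gray = img.astype(float)
    dx = ndimage.sobel(gray, axis=1)  # horizontal derivative
    dy = ndimage.sobel(gray, axis=0)  # vertical derivative
    return np.hypot(dx, dy)
```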
Why did it fail? Original vs. resized.
Feature Descriptors
Core visual understanding task: finding correspondences between images Source: Deva Ramanan
Example: image matching of landmarks Correspondence + geometry estimation Source: Deva Ramanan
Object recognition by matching: sparse correspondence vs. dense correspondence. Source: Deva Ramanan
Example: license plate recognition Source: Deva Ramanan
Example: product recognition Source: Deva Ramanan
Motivation Which of these patches are easier to match? Why? How can we mathematically operationalize this? Source: Deva Ramanan
Corner Detector: Basic Idea
• "flat" region: no change in any direction
• "edge": no change along the edge direction
• "corner": significant change in all directions
Defn: points are "matchable" if small shifts always produce a large SSD error. Source: Deva Ramanan
The math
Defn: points are "matchable" if small shifts always produce a large SSD error.
$$\mathrm{cornerness}(x_0, y_0) = \min_{u,v}\, E_{x_0,y_0}(u,v)$$
where
$$E_{x_0,y_0}(u,v) = \sum_{(x,y) \in W(x_0,y_0)} \big[I(x+u,\, y+v) - I(x,y)\big]^2$$
Why can't this be right? Because without any constraint on the shift, the minimum is always attained at $(u,v) = (0,0)$, where the error is zero for every point.
Source: Deva Ramanan
The math
Defn: points are "matchable" if small shifts always produce a large SSD error. Constrain the shift to the unit circle:
$$\mathrm{cornerness}(x_0, y_0) = \min_{u^2+v^2=1}\, E_{x_0,y_0}(u,v)$$
where
$$E_{x_0,y_0}(u,v) = \sum_{(x,y) \in W(x_0,y_0)} \big[I(x+u,\, y+v) - I(x,y)\big]^2$$
Source: Deva Ramanan
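A brute-force reading of this definition, as a sketch: approximate the unit-circle constraint with the eight one-pixel shifts and keep the worst case. NumPy assumed; names are illustrative, and the window is assumed to lie in the image interior:

```python
import numpy as np

def ssd_error(img, x0, y0, du, dv, half=3):
    """SSD between the window centered at (x0, y0) and the same
    window shifted by the integer offset (du, dv)."""
    h, w = img.shape
    ys, xs = np.mgrid[y0 - half:y0 + half + 1, x0 - half:x0 + half + 1]
    ys2 = np.clip(ys + dv, 0, h - 1)
    xs2 = np.clip(xs + du, 0, w - 1)
    diff = img[ys2, xs2].astype(float) - img[ys, xs].astype(float)
    return float((diff ** 2).sum())

def cornerness(img, x0, y0):
    # Crude stand-in for u^2 + v^2 = 1: the eight one-pixel shifts.
    # Keep the *minimum* error, since a good corner must change
    # significantly under every small shift.
    shifts = [(1, 0), (-1, 0), (0, 1), (0, -1),
              (1, 1), (1, -1), (-1, 1), (-1, -1)]
    return min(ssd_error(img, x0, y0, du, dv) for du, dv in shifts)
```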
Background: Taylor series expansion
$$f(x+u) = f(x) + \frac{\partial f(x)}{\partial x}\, u + \frac{1}{2} \frac{\partial^2 f(x)}{\partial x^2}\, u^2 + \text{higher-order terms}$$
(Figure: low-order approximations of f(x) = log(x+1) around x = 0.)
Why are low-order expansions reasonable? Underlying smoothness of real-world signals.
Source: Deva Ramanan
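A quick numeric check of the expansion for f(x) = log(x+1) around x = 0, where f'(0) = 1 and f''(0) = -1:

```python
import numpy as np

u = 0.1
exact = np.log(1 + u)
first_order = u                   # f(0) + f'(0) * u
second_order = u - 0.5 * u ** 2   # ... + (1/2) f''(0) * u^2
print(exact, first_order, second_order)
# 0.0953... vs 0.1 vs 0.095: the approximation tightens with order,
# and the smaller the shift u, the better even the first-order term is.
```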
Multivariate Taylor series
$$I(x+u,\, y+v) = I(x,y) + \begin{bmatrix} \frac{\partial I}{\partial x} & \frac{\partial I}{\partial y} \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} + \frac{1}{2} \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} \frac{\partial^2 I}{\partial x^2} & \frac{\partial^2 I}{\partial x \partial y} \\ \frac{\partial^2 I}{\partial x \partial y} & \frac{\partial^2 I}{\partial y^2} \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} + \text{higher-order terms}$$
The row vector is the gradient; the 2x2 matrix is the Hessian. Keeping only the first-order terms:
$$I(x+u,\, y+v) \approx I + I_x u + I_y v, \qquad \text{where } I_x = \frac{\partial I(x,y)}{\partial x}$$
Source: Deva Ramanan
Feature detection: the math
Consider shifting the window W by (u, v):
• how do the pixels in W change?
• compare each pixel before and after by summing up the squared differences
• this defines an "error" E(u, v):
$$E(u,v) = \sum_{(x,y) \in W} \big[I(x+u,\, y+v) - I(x,y)\big]^2 \approx \sum_{(x,y) \in W} \big[I + I_x u + I_y v - I\big]^2$$
$$= \sum_{(x,y) \in W} \big[I_x^2 u^2 + I_y^2 v^2 + 2 I_x I_y uv\big] = \begin{bmatrix} u & v \end{bmatrix} A \begin{bmatrix} u \\ v \end{bmatrix}, \qquad A = \sum_{(x,y) \in W} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$
Source: Deva Ramanan
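A minimal sketch of accumulating A over a window, assuming NumPy and central-difference gradients as a stand-in for whatever derivative filter is used:

```python
import numpy as np

def second_moment_matrix(img, x0, y0, half=3):
    """A = sum over the window of [[Ix^2, Ix*Iy], [Ix*Iy, Iy^2]]."""
    Iy, Ix = np.gradient(img.astype(float))  # np.gradient: d/drow, then d/dcol
    win = np.s_[y0 - half:y0 + half + 1, x0 - half:x0 + half + 1]
    ix, iy = Ix[win], Iy[win]
    return np.array([[(ix * ix).sum(), (ix * iy).sum()],
                     [(ix * iy).sum(), (iy * iy).sum()]])

# E(u, v) is then approximated by the quadratic form:
# A = second_moment_matrix(img, x0, y0)
# E = np.array([u, v]) @ A @ np.array([u, v])
```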
Interpreting the second moment matrix
The surface E(u, v) is locally approximated by a quadratic form. Let's try to understand its shape:
$$E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}, \qquad M = \sum_{(x,y) \in W} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$
James Hays
Interpreting the second moment matrix
Consider a horizontal "slice" of E(u, v):
$$\begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix} = \text{const}$$
This is the equation of an ellipse.
James Hays
Interpreting the second moment matrix
Diagonalization of M:
$$M = R^{-1} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} R$$
The axis lengths of the ellipse are determined by the eigenvalues, and its orientation by the rotation matrix R: the axis along the direction of fastest change has length $(\lambda_{\max})^{-1/2}$, and the axis along the direction of slowest change has length $(\lambda_{\min})^{-1/2}$.
James Hays
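A quick NumPy check of this diagonalization on a hypothetical M:

```python
import numpy as np

M = np.array([[9.0, 2.0],
              [2.0, 3.0]])   # illustrative second moment matrix

# eigh returns eigenvalues of a symmetric matrix in ascending order;
# the columns of R are the corresponding eigenvectors.
lams, R = np.linalg.eigh(M)
lam_min, lam_max = lams[0], lams[1]

# Ellipse {(u, v) : [u v] M [u v]^T = 1} axis half-lengths:
print("slow-change (long) axis:", lam_min ** -0.5)
print("fast-change (short) axis:", lam_max ** -0.5)

# Sanity check: R diag(lams) R^T reconstructs M.
print(np.allclose(R @ np.diag(lams) @ R.T, M))
```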
Classification of image points using eigenvalues of M
• "Corner": $\lambda_1$ and $\lambda_2$ are large, $\lambda_1 \sim \lambda_2$; E increases in all directions.
• "Edge": $\lambda_1 \gg \lambda_2$ (or $\lambda_2 \gg \lambda_1$).
• "Flat" region: $\lambda_1$ and $\lambda_2$ are small; E is almost constant in all directions.
Source: Deva Ramanan
Back to corner(ness)
Defn: points are "matchable" if small shifts always produce a large SSD error.
$$\mathrm{Corner}(x_0, y_0) = \min_{u^2+v^2=1} E(u,v), \qquad E(u,v) = \begin{bmatrix} u & v \end{bmatrix} A \begin{bmatrix} u \\ v \end{bmatrix}, \qquad A = \sum_{(x,y) \in W(x_0,y_0)} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$
The solution is given by the minimum eigenvalue of A. This implies $(x_0, y_0)$ is a good corner if the minimum eigenvalue is large (or alternatively, if both eigenvalues of A are large).
Source: Deva Ramanan
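In code, this cornerness is just the smallest eigenvalue of A (the same score later used by the Shi-Tomasi detector); a one-line sketch:

```python
import numpy as np

def min_eig_cornerness(A):
    """Cornerness = smallest eigenvalue of the 2x2 second moment matrix A.
    eigvalsh returns the eigenvalues of a symmetric matrix in ascending order."""
    return np.linalg.eigvalsh(A)[0]
```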
Efficient computation
Computing eigenvalues (and eigenvectors) is expensive. It turns out it is easy to compute their sum (trace) and product (determinant):
• Det(A) = $\lambda_{\min} \lambda_{\max}$
• Trace(A) = $\lambda_{\min} + \lambda_{\max}$ (trace = sum of diagonal entries)
$$R = \frac{4\,\mathrm{Det}(A)}{\mathrm{Trace}(A)^2} \quad \text{(proportional to the ratio of eigenvalues; equals 1 if they are equal)}$$
$$R = \mathrm{Det}(A) - \alpha\, \mathrm{Trace}(A)^2 \quad \text{(also favors large eigenvalues)}$$
Source: Deva Ramanan
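Both responses come straight from the 2x2 entries, with no eigendecomposition. A sketch (the small epsilon guards against flat patches with zero trace; alpha around 0.04-0.06 is the usual choice):

```python
import numpy as np

def corner_responses(A, alpha=0.05):
    """Two cornerness scores from a 2x2 second moment matrix A."""
    det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
    trace = A[0, 0] + A[1, 1]
    r_ratio = 4.0 * det / (trace ** 2 + 1e-12)  # 1 when eigenvalues are equal
    r_harris = det - alpha * trace ** 2         # Harris response
    return r_ratio, r_harris
```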
Harris Corner Detector [Harris88]
0. Input image I. We want to compute M at each pixel.
1. Compute image derivatives $I_x$, $I_y$ (optionally, blur first).
2. Compute the components of M as squares of derivatives: $I_x^2$, $I_y^2$, $I_x I_y$.
3. Smooth each component with a Gaussian filter g() of width $\sigma$: $g(I_x^2)$, $g(I_y^2)$, $g(I_x I_y)$.
4. Compute cornerness:
$$C = \det M - \alpha\,(\mathrm{trace}\, M)^2 = g(I_x^2)\, g(I_y^2) - \big[g(I_x I_y)\big]^2 - \alpha \big[g(I_x^2) + g(I_y^2)\big]^2$$
5. Threshold on C to pick points with high cornerness.
6. Non-maxima suppression to pick peaks.
James Hays
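A compact NumPy/SciPy sketch of these six steps; parameter defaults are illustrative, not the lecture's:

```python
import numpy as np
from scipy import ndimage

def harris(img, sigma=1.5, alpha=0.05, thresh_rel=0.01, nms_size=7):
    """Minimal Harris detector. Returns (row, col) coordinates of corners."""
    img = img.astype(float)

    # 1. Image derivatives; Gaussian-derivative filters blur and differentiate.
    Ix = ndimage.gaussian_filter(img, 1.0, order=(0, 1))
    Iy = ndimage.gaussian_filter(img, 1.0, order=(1, 0))

    # 2.-3. Products of derivatives, smoothed by a Gaussian of width sigma.
    Sxx = ndimage.gaussian_filter(Ix * Ix, sigma)
    Syy = ndimage.gaussian_filter(Iy * Iy, sigma)
    Sxy = ndimage.gaussian_filter(Ix * Iy, sigma)

    # 4. Cornerness C = det(M) - alpha * trace(M)^2 at every pixel.
    C = Sxx * Syy - Sxy ** 2 - alpha * (Sxx + Syy) ** 2

    # 5. Threshold, relative to the strongest response.
    mask = C > thresh_rel * C.max()

    # 6. Non-maxima suppression: keep only local maxima of C.
    local_max = (C == ndimage.maximum_filter(C, size=nms_size))
    return np.argwhere(mask & local_max)
```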
Harris Detector: Steps Source: Deva Ramanan
Harris Detector: Steps. Compute corner response C. Source: Deva Ramanan
Harris Detector: Steps. Find points with large corner response: C > threshold. Source: Deva Ramanan
Harris Detector: Steps. Take only the points of local maxima of C. Source: Deva Ramanan
Harris Detector: Steps Source: Deva Ramanan
Scale and rotation invariance: will the interest point detector still fire on rotated and scaled images? Source: Deva Ramanan
Rotation invariance (?) Are eigenvectors stable under rotations? No. Are eigenvalues stable under rotations? Yes. Source: Deva Ramanan
Image rotation Second moment ellipse rotates but its shape (i.e., eigenvalues) remains the same. Corner location is covariant w.r.t. rotation James Hays
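A quick numeric check of both claims on a hypothetical A, idealized so the window content rotates with the image: rotating the image maps A to R A R^T, which preserves eigenvalues but rotates eigenvectors:

```python
import numpy as np

A = np.array([[9.0, 2.0],
              [2.0, 3.0]])   # illustrative second moment matrix
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

A_rot = R @ A @ R.T                   # A after rotating the image by theta
print(np.linalg.eigvalsh(A))          # eigenvalues unchanged ...
print(np.linalg.eigvalsh(A_rot))      # ... so the ellipse's shape is preserved
# The eigenvectors (the ellipse's orientation) do rotate with the image.
```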
Scale invariance? Are eigenvectors stable under scalings? Yes. Are eigenvalues stable under scalings? No. Source: Deva Ramanan