CS201: Computer Vision Lect 09: SIFT Descriptors John Magee 26 September 2014 Slides Courtesy of Diane H. Theriault

Questions of the Day: • How can we find matching points in images? • How can we use matching points to recognize objects?

SIFT • Find repeatable, scale-invariant points in images (Tuesday) • Compute something about them (Today) • Use the thing you computed to perform matching (Today) • A lot of engineering decisions • “Distinctive Image Features from Scale-Invariant Keypoints” by David Lowe • Patented!

How to find the same cat? • Imagine that we had a library of cats • How could we find another picture of the same cat in the library? • Look for the markings?

Scale Space • Image convolved with Gaussians of different widths

Keypoints with Image Filtering • Perform image filtering by convolving an image with a “filter” / “mask” / “kernel” to obtain a “result” / “response” • The value of the result will be positive in regions of the image that “look like” the filter • What would a “dot” filter look like?
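A minimal sketch of the filtering step, in plain Python with the image and filter stored as lists of rows. The function name `filter_image` and the particular `DOT_KERNEL` weights are illustrative assumptions, not taken from the slides. The loop slides the kernel without flipping it, which matches convolution only because this kernel is symmetric.

```python
def filter_image(image, kernel):
    """Slide `kernel` over `image` and return the response at every
    position where the kernel fits entirely inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    response = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        response.append(row)
    return response


# Hypothetical "dot" filter: bright centre, dark surround, weights sum to zero.
DOT_KERNEL = [[-1, -1, -1],
              [-1,  8, -1],
              [-1, -1, -1]]

# A 5x5 image containing a single bright pixel in the middle.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 1
response = filter_image(image, DOT_KERNEL)
```

The response is largest where the kernel is centred on the bright pixel, and a perfectly flat image gives a response of zero everywhere because the weights sum to zero.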

Laplacian of a Gaussian • Sum of spatial second derivatives
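Written out, assuming the usual isotropic 2D Gaussian with standard deviation σ (the symbols G, x, y and σ are notation chosen here, not taken from the slides):

```latex
G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}}
                  \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)

\nabla^{2} G = \frac{\partial^{2} G}{\partial x^{2}}
             + \frac{\partial^{2} G}{\partial y^{2}}
             = \frac{x^{2} + y^{2} - 2\sigma^{2}}{\sigma^{4}} \, G(x, y, \sigma)
```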

Difference of Gaussians • Approximation of the Laplacian of a Gaussian
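A rough 1D sketch of the idea: blur the same signal with two Gaussians of different widths and subtract. The function names, the 3-sigma kernel radius, the border clamping, and the width ratio `k = sqrt(2)` are all assumptions made for illustration.

```python
import math


def gaussian_kernel(sigma):
    """Sampled 1D Gaussian, truncated at about 3 sigma and normalised to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Smooth a 1D signal with a Gaussian, repeating the edge values at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out


def difference_of_gaussians(signal, sigma, k=math.sqrt(2)):
    """Wider blur minus narrower blur of the same signal."""
    wide = blur(signal, k * sigma)
    narrow = blur(signal, sigma)
    return [a - b for a, b in zip(wide, narrow)]
```

Applied to a single spike, the result is negative at the spike and positive in a ring around it, the same centre-surround shape as the Laplacian of a Gaussian.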

Scale-space Extrema • “Extremum” = local minimum or maximum • Check 8 neighbors at a particular scale • Check the 9 neighbors at each of the scales above and below (26 neighbors in total)

Scale-space Extrema • Find locations and scales where the response to the LoG filter is a local extremum
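The neighbor check can be sketched as follows, assuming the filter responses are stored as a nested list `stack[scale][row][col]` (a hypothetical layout) and that the caller only asks about points that have neighbors on every side.

```python
def is_scale_space_extremum(stack, s, r, c):
    """True if stack[s][r][c] is strictly larger or strictly smaller than
    all 26 neighbours: 8 at the same scale, 9 at the scale above and
    9 at the scale below."""
    value = stack[s][r][c]
    neighbours = [
        stack[s + ds][r + dr][c + dc]
        for ds in (-1, 0, 1)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (ds, dr, dc) != (0, 0, 0)
    ]
    return value > max(neighbours) or value < min(neighbours)
```

The comparison is strict, so a point in a perfectly flat region is not reported as an extremum.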

Removing Low Contrast Points • Threshold on the magnitude of the response to the LoG filter • Threshold empirically determined

Removing Points Along Edges • In 1D: first derivative shows how the function is changing (velocity) • In 1D: second derivative shows how the change is changing (acceleration) • In 2D: first derivative leads to a gradient vector, which has a magnitude and direction • In 2D: second derivatives lead to a matrix, which gives information about the rate and orientation of the change in the gradient

Removing Points Along Edges • Hessian is a matrix of 2nd derivatives • Eigenvectors tell you the orientation of the curvature • Eigenvalues tell you the magnitude • Ratio of eigenvalues tells you extent to which one orientation is dominant (Figures: Gradient of a Gaussian; Hessian of a Gaussian)
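One way to apply the eigenvalue-ratio idea without computing eigenvalues is to compare the trace and determinant of the 2x2 Hessian, as Lowe's paper does. The sketch below assumes finite-difference derivatives on a response image stored as a list of rows. The function names are illustrative, and the default ratio limit `r = 10` is the value suggested in Lowe's paper rather than something stated on the slides.

```python
def hessian_at(img, row, col):
    """Second derivatives at (row, col) from finite differences."""
    dxx = img[row][col + 1] - 2 * img[row][col] + img[row][col - 1]
    dyy = img[row + 1][col] - 2 * img[row][col] + img[row - 1][col]
    dxy = (img[row + 1][col + 1] - img[row + 1][col - 1]
           - img[row - 1][col + 1] + img[row - 1][col - 1]) / 4.0
    return dxx, dyy, dxy


def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a point only if its two principal curvatures are comparable.

    trace^2 / det grows with the ratio of the eigenvalues, so the test
    trace^2 / det < (r + 1)^2 / r rejects points whose eigenvalue ratio
    exceeds r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        return False  # curvatures have opposite signs, or one is zero
    return trace * trace / det < (r + 1) ** 2 / r
```

A blob curves about equally in both directions and passes, while a straight ridge has no curvature along its length and is rejected.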

Attributes of a Keypoint • Position (x,y) – location in the image • Scale – scale where this point is a LoG extremum • Orientation?

Gradient Orientation Histogram • Make a histogram over gradient orientation • Weighted by gradient magnitude • Weighted by distance to key point • Contribution to bins with linear interpolation

Gradient Orientation Histogram (Figure: gradient orientation histogram)

Gradient Orientation Histogram • Plain Histogram of Gradient Orientation

Gradient Orientation Histogram • Weighted by gradient magnitude • (Could also weight by distance to center of window)

Gradient Orientation Histogram • Interpolated to avoid edge effects of bin quantization
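The magnitude weighting and interpolation described above can be sketched as follows. The function name, the default of 8 bins, and the choice to place bin centres at the middle of each angular range are illustrative assumptions. Distance weighting is left out to keep the example short.

```python
import math


def orientation_histogram(gradients, num_bins=8):
    """Histogram of gradient orientation from (gx, gy) pairs.

    Each gradient votes with its magnitude, and the vote is split
    linearly between the two bins whose centres are nearest to its angle."""
    hist = [0.0] * num_bins
    bin_width = 360.0 / num_bins
    for gx, gy in gradients:
        magnitude = math.hypot(gx, gy)
        angle = math.degrees(math.atan2(gy, gx)) % 360.0
        # Position measured in bins, relative to the centre of bin 0.
        pos = angle / bin_width - 0.5
        lower = math.floor(pos)
        frac = pos - lower
        # The modulo wraps votes near 0/360 degrees around the circle.
        hist[lower % num_bins] += magnitude * (1.0 - frac)
        hist[(lower + 1) % num_bins] += magnitude * frac
    return hist
```

A gradient at exactly 45 degrees sits on the boundary between the first two bins and contributes half of its magnitude to each, so a small change in angle moves the vote gradually instead of flipping it from one bin to the other.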

Assigning Orientation to Keypoint • Support: from image at assigned scale, all points in a window surrounding keypoint • 36 bins over 360 degrees • Contributions weighted by gradient magnitude and by distance to center of keypoint, using a Gaussian with sigma 1.5 x assigned scale • Dominant orientation: the highest peak of the histogram
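A simplified sketch of orientation assignment under these settings. The input format (a list of `(dx, dy, gx, gy)` tuples, where `(dx, dy)` is the offset from the keypoint) and the function name are assumptions. The sketch omits bin interpolation and returns only the single highest peak.

```python
import math


def dominant_orientation(samples, scale, num_bins=36):
    """Return (angle_in_degrees, histogram) for one keypoint.

    Each sample votes with its gradient magnitude, down-weighted by a
    Gaussian of sigma 1.5 * scale on its distance from the keypoint."""
    sigma = 1.5 * scale
    bin_width = 360.0 / num_bins
    hist = [0.0] * num_bins
    for dx, dy, gx, gy in samples:
        weight = math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
        angle = math.degrees(math.atan2(gy, gx)) % 360.0
        index = int(angle // bin_width) % num_bins
        hist[index] += weight * math.hypot(gx, gy)
    peak = max(range(num_bins), key=lambda i: hist[i])
    # Report the centre of the winning bin.
    return (peak + 0.5) * bin_width, hist
```

A strong gradient far from the keypoint is suppressed by the Gaussian weight, so a weaker gradient near the centre can still decide the orientation.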

Computing SIFT Descriptor • Divide the 16 x 16 region surrounding the keypoint into a 4 x 4 grid of windows (each window is 4 x 4 pixels) • For each window, compute a gradient orientation histogram with 8 bins • 16 windows x 8 bins = 128 total elements • Interpolation to improve stability (over orientation and over distance to boundary of window)
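The layout of the 128 elements can be illustrated with a stripped-down sketch. It assumes the gradients of the 16 x 16 region are already available as two lists of rows, `patch_gx` and `patch_gy` (hypothetical names). It leaves out the interpolation, the Gaussian weighting, and the rotation to the keypoint orientation, so it is not a full SIFT descriptor.

```python
import math


def sift_like_descriptor(patch_gx, patch_gy, cells=4, bins=8):
    """Concatenate one magnitude-weighted orientation histogram per window."""
    size = len(patch_gx)       # expected to be 16
    cell_size = size // cells  # 4 pixels per window side
    bin_width = 360.0 / bins
    descriptor = [0.0] * (cells * cells * bins)
    for r in range(size):
        for c in range(size):
            gx, gy = patch_gx[r][c], patch_gy[r][c]
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            b = int(angle // bin_width) % bins
            # Windows are numbered row by row across the grid.
            cell = (r // cell_size) * cells + (c // cell_size)
            descriptor[cell * bins + b] += math.hypot(gx, gy)
    return descriptor
```

With the defaults the result has 4 x 4 x 8 = 128 entries, one block of 8 orientation bins per window.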

Normalizing the descriptor • To get (some) invariance to brightness and contrast – Clamp weight due to gradient magnitude (In case some edges are very strong due to weird lighting) – Normalize entire vector to unit length (So the absolute value of the gradient magnitude isn’t as important as the distribution of the gradient magnitude)
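A sketch of this normalization, following the order used in Lowe's paper: normalize to unit length, clamp large entries, then normalize again. The clamp value of 0.2 comes from that paper, not from the slides, and the function name is illustrative.

```python
import math


def normalize_descriptor(descriptor, clamp=0.2):
    """Unit-normalise, limit the influence of very large entries, renormalise."""

    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm > 0 else list(v)

    v = unit(descriptor)               # removes the effect of overall contrast
    v = [min(x, clamp) for x in v]     # damps a few very strong gradients
    return unit(v)
```

Multiplying every entry of the input by the same constant, as a contrast change would, leaves the output unchanged.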

Using the keypoints • Assemble a database: – Pick some “training” images of different objects – Find keypoints and compute descriptors – Store the descriptors and associated source image, position, scale, and orientation

Using the keypoints • New Image – Find keypoints and compute descriptors – Search database for matching descriptors – (Throw out descriptors that are not distinctive) – Look for clusters of matching descriptors • (e.g. In your new image, you found 10 keypoints and associated descriptors, and in the database, there is an image where 6 of the descriptors match, but only 1 or 2 on other database images)
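One common way to throw out descriptors that are not distinctive is the ratio test from Lowe's paper: accept a match only if the nearest database descriptor is clearly closer than the second nearest. The sketch below uses a brute-force search, and its function name and the 0.8 threshold are assumptions made for illustration. It needs Python 3.8 or later for `math.dist`.

```python
import math


def match_descriptors(query, database, max_ratio=0.8):
    """Return (query_index, database_index) pairs that pass the ratio test."""
    matches = []
    for qi, q in enumerate(query):
        # Euclidean distance from this query descriptor to every database entry.
        dists = sorted((math.dist(q, d), di) for di, d in enumerate(database))
        if len(dists) < 2:
            continue  # the ratio test needs a second-nearest neighbour
        (best, best_i), (second, _) = dists[0], dists[1]
        if best < max_ratio * second:
            matches.append((qi, best_i))
    return matches
```

A query descriptor that is equally close to two database entries is ambiguous and is discarded, even if both distances are small.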

Using the keypoints – http://chrisjmccormick.wordpress.com/2013/01/24/opencv-sift-tutorial/

Voting for Pose • Matching keypoints from database image and new image will imply some relationship in pose (position, scale, and orientation) – Example: This keypoint was found 20 pixels down and 50 pixels to the right of the matching descriptor from the database image – Example: This keypoint was computed at 2x the scale of the matching descriptor from the database image – Look for clusters of matches with similar offsets – (“Generalized Hough Transform”)
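A much simplified sketch of the voting idea: each match casts one vote for a coarse bin of (offset in x, offset in y, scale ratio, rotation), and the fullest bin wins. The keypoint tuple layout `(x, y, scale, orientation_degrees)`, the function name, and the bin sizes are illustrative assumptions. Lowe's paper chooses the bins differently and lets each match vote for neighboring bins as well.

```python
import math
from collections import Counter


def vote_for_pose(matches, pos_bin=20.0, angle_bin=30.0):
    """matches: list of (database_keypoint, new_keypoint) pairs.

    Returns (bin_key, vote_count) for the most popular pose bin,
    or None if there are no matches."""
    votes = Counter()
    for (x0, y0, s0, o0), (x1, y1, s1, o1) in matches:
        key = (
            math.floor((x1 - x0) / pos_bin),              # horizontal offset
            math.floor((y1 - y0) / pos_bin),              # vertical offset
            round(math.log2(s1 / s0)),                    # scale change in octaves
            math.floor(((o1 - o0) % 360.0) / angle_bin),  # rotation
        )
        votes[key] += 1
    return votes.most_common(1)[0] if votes else None
```

Matches that agree on roughly the same offset, scale change, and rotation pile up in one bin, while a stray false match lands in a bin of its own.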

Discussion Questions • What types of invariance do we want to have when we think about doing object recognition? • What does it mean to be invariant to different image attributes? (brightness, contrast, position, scale, orientation) • What does it mean for an image feature to be stable? • Why might it make sense to use a weighted histogram? What kinds of weights? • What is a problem with the quantization associated with creating a histogram and what can we do about it?
