Overview Introduction to local features Harris interest points + - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Overview

  • Introduction to local features
  • Harris interest points + SSD, ZNCC, SIFT
  • Scale & affine invariant interest point detectors
  • Evaluation and comparison of different detectors
  • Region descriptors and their performance
slide-2
SLIDE 2

Scale invariance - motivation

  • Description regions have to be adapted to scale changes
  • Interest points have to be repeatable for scale changes
slide-3
SLIDE 3

Harris detector + scale changes

Repeatability rate

$$R(\varepsilon) = \frac{|\{(a_i, b_i) \mid \mathrm{dist}(H(a_i), b_i) < \varepsilon\}|}{\max(|a_i|, |b_i|)}$$
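A minimal sketch of how the repeatability rate could be computed from two point sets and a known homography. The function names, the greedy one-to-one assignment and the toy data are assumptions made for illustration, not part of the slides.

```python
import math

def apply_homography(H, p):
    """Map a 2D point p = (x, y) through a 3x3 homography H (nested lists)."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def repeatability(points_a, points_b, H, eps):
    """R(eps): pairs with dist(H(a_i), b_i) < eps, divided by max(|a|, |b|)."""
    matched_b = set()
    repeated = 0
    for a in points_a:
        ha = apply_homography(H, a)
        # Greedy one-to-one assignment (an assumption of this sketch).
        best, best_d = None, eps
        for j, b in enumerate(points_b):
            d = math.dist(ha, b)
            if j not in matched_b and d < best_d:
                best, best_d = j, d
        if best is not None:
            matched_b.add(best)
            repeated += 1
    return repeated / max(len(points_a), len(points_b))

# Image 2 is image 1 scaled by 2; some detections have no counterpart.
H = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
a = [(10.0, 10.0), (20.0, 5.0), (7.0, 30.0)]
b = [(20.3, 19.8), (40.0, 10.5), (90.0, 90.0), (1.0, 1.0)]
rate = repeatability(a, b, H, eps=1.5)
```

Here two of the three points in the first image land within ε of a point in the second image, and the denominator is the larger of the two detection counts.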

slide-4
SLIDE 4

Scale adaptation

        =         =        

1 1 2 2 2 2 1 1 1

sy sx I y x I y x I Scale change between two images       Scale adapted derivative calculation

slide-5
SLIDE 5

Scale adaptation

        =         =        

1 1 2 2 2 2 1 1 1

sy sx I y x I y x I Scale change between two images      

) ( ) (

1 1

2 2 2 1 1 1

σ σ s G y x I s G y x I

n n

i i n i i

        = ⊗        

Scale adapted derivative calculation

σ s

n

s

slide-6
SLIDE 6

Scale adaptation

$$G(\tilde{\sigma}) \otimes \begin{bmatrix} L_x^2(\sigma) & L_x L_y(\sigma) \\ L_x L_y(\sigma) & L_y^2(\sigma) \end{bmatrix}$$

where $L_i(\sigma)$ are the derivatives with Gaussian convolution

slide-7
SLIDE 7

Scale adaptation

$$G(\tilde{\sigma}) \otimes \begin{bmatrix} L_x^2(\sigma) & L_x L_y(\sigma) \\ L_x L_y(\sigma) & L_y^2(\sigma) \end{bmatrix}$$

where $L_i(\sigma)$ are the derivatives with Gaussian convolution

Scale adapted auto-correlation matrix

$$s^2\, G(s\tilde{\sigma}) \otimes \begin{bmatrix} L_x^2(s\sigma) & L_x L_y(s\sigma) \\ L_x L_y(s\sigma) & L_y^2(s\sigma) \end{bmatrix}$$

slide-8
SLIDE 8

Harris detector – adaptation to scale

$$R(\varepsilon) = \frac{|\{(a_i, b_i) \mid \mathrm{dist}(H(a_i), b_i) < \varepsilon\}|}{\max(|a_i|, |b_i|)}$$

slide-9
SLIDE 9

Multi-scale matching algorithm

s = 1, s = 3, s = 5

slide-10
SLIDE 10

Multi-scale matching algorithm

s = 1: 8 matches

slide-11
SLIDE 11

Multi-scale matching algorithm

s = 1: 3 matches

Robust estimation of a global affine transformation

slide-12
SLIDE 12

Multi-scale matching algorithm

s = 1: 3 matches

s = 3: 4 matches

slide-13
SLIDE 13

Multi-scale matching algorithm

s = 1: 3 matches

s = 3: 4 matches

s = 5: 16 matches

correct scale: highest number of matches
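The selection rule at the end of the multi-scale matching algorithm can be sketched as follows. The function name `select_scale` is hypothetical; the match counts are the ones shown on the slides.

```python
def select_scale(match_counts):
    """Return the scale factor whose matching produced the most matches."""
    return max(match_counts, key=match_counts.get)

# Verified match counts read off the slides for s = 1, 3, 5.
counts = {1: 3, 3: 4, 5: 16}
best = select_scale(counts)
```

In this example the scale factor 5 wins, which is close to the true scale change of 5.7 reported on the following slide.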

slide-14
SLIDE 14

Matching results

Scale change of 5.7

slide-15
SLIDE 15

Matching results

100% correct matches (13 matches)

slide-16
SLIDE 16

Scale selection

  • We want to find the characteristic scale of the blob by convolving it with Laplacians at several scales and looking for the maximum response
  • However, the Laplacian response decays as scale increases:

[Figure: original signal (radius=8) and its Laplacian responses for increasing σ]

Why does this happen?

slide-17
SLIDE 17

Scale normalization

  • The response of a derivative of Gaussian filter to a perfect step edge decreases as σ increases

[Figure: response to a step edge, with peak value $\frac{1}{\sigma\sqrt{2\pi}}$]

slide-18
SLIDE 18

Scale normalization

  • The response of a derivative of Gaussian filter to a perfect step edge decreases as σ increases
  • To keep the response the same (scale-invariant), we must multiply the Gaussian derivative by σ
  • The Laplacian is the second Gaussian derivative, so it must be multiplied by σ²
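A small numerical check of the normalization argument, as a sketch under stated assumptions: a unit step edge, a first-derivative-of-Gaussian filter, and a plain Riemann sum for the convolution at the edge location. The function name and step size are illustrative choices.

```python
import math

def dog_edge_response(sigma, dx=0.001):
    """Peak response of a derivative-of-Gaussian filter to a unit step edge.

    Riemann sum of the integral of g'_sigma(t) over t <= 0, which is the
    convolution of the step with g'_sigma evaluated at the edge.
    """
    total = 0.0
    n = int(10 * sigma / dx)
    for k in range(1, n + 1):
        t = -k * dx
        g = math.exp(-t * t / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))
        total += (-t / (sigma * sigma)) * g * dx  # g'(t) = -t / sigma^2 * g(t)
    return total

sigmas = [1.0, 2.0, 4.0]
raw = [dog_edge_response(s) for s in sigmas]
normalized = [s * r for s, r in zip(sigmas, raw)]
```

The raw responses fall off as 1/(σ√(2π)), while the values multiplied by σ stay constant across scales.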

slide-19
SLIDE 19

Effect of scale normalization

[Figure: original signal, unnormalized Laplacian response, and scale-normalized Laplacian response with its maximum marked]

slide-20
SLIDE 20

Blob detection in 2D

  • Laplacian of Gaussian: circularly symmetric operator for blob detection in 2D

$$\nabla^2 g = \frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2}$$

slide-21
SLIDE 21

Blob detection in 2D

  • Laplacian of Gaussian: circularly symmetric operator for blob detection in 2D

Scale-normalized:

$$\nabla^2_{\mathrm{norm}}\, g = \sigma^2 \left( \frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2} \right)$$

slide-22
SLIDE 22

Scale selection

  • The 2D Laplacian is given by

$$(x^2 + y^2 - 2\sigma^2)\, e^{-(x^2 + y^2)/2\sigma^2}$$

(up to scale)

  • For a binary circle of radius r, the Laplacian achieves a maximum at

$$\sigma = r/\sqrt{2}$$

[Figure: image of a circle of radius r; Laplacian response versus scale (σ), peaking at $r/\sqrt{2}$]
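The location of the maximum can be checked numerically. The sketch below assumes the response is measured at the centre of a binary disc and uses the scale-normalized Laplacian of a unit-mass 2D Gaussian; the function name, the integration rule and the scale grid are illustrative choices.

```python
import math

def normalized_log_response(r, sigma, steps=400):
    """Scale-normalized LoG response at the centre of a binary disc of radius r.

    Integrates sigma^2 * Laplacian(G_sigma) over the disc in polar
    coordinates with the midpoint rule.
    """
    d_rho = r / steps
    total = 0.0
    for k in range(steps):
        rho = (k + 0.5) * d_rho
        g = math.exp(-rho * rho / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
        laplacian = (rho * rho - 2 * sigma ** 2) / sigma ** 4 * g
        total += sigma ** 2 * laplacian * 2 * math.pi * rho * d_rho
    return total

r = 8.0
sigmas = [2.0 + 0.02 * i for i in range(500)]  # scales from 2.0 to 11.98
best_sigma = max(sigmas, key=lambda s: abs(normalized_log_response(r, s)))
```

For r = 8 the magnitude of the response peaks close to 8/√2 ≈ 5.66, in agreement with the slide.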

slide-23
SLIDE 23

Characteristic scale

  • We define the characteristic scale as the scale that produces the peak of the Laplacian response

[Figure: characteristic scale]

T. Lindeberg (1998). Feature detection with automatic scale selection. International Journal of Computer Vision 30 (2): pp 77--116.

slide-24
SLIDE 24

Scale selection

  • For a point compute a value (gradient, Laplacian etc.) at several scales
  • Normalization of the values with the scale factor, e.g. Laplacian $|s^2 (L_{xx} + L_{yy})|$
  • Select scale at the maximum → characteristic scale
  • Experimental results show that the Laplacian gives the best results

[Figure: $|s^2 (L_{xx} + L_{yy})|$ plotted against scale $s$]

slide-25
SLIDE 25

Scale selection

  • Scale invariance of the characteristic scale

[Figure: normalized Laplacian plotted against scale $s$]

slide-26
SLIDE 26

Scale selection

  • Scale invariance of the characteristic scale

  • Relation between characteristic scales

$$s \cdot s_1^* = s_2^*$$

[Figure: normalized Laplacian plotted against scale for the two images]

slide-27
SLIDE 27

Scale-invariant detectors

  • Harris-Laplace (Mikolajczyk & Schmid’01)
  • Laplacian detector (Lindeberg’98)
  • Difference of Gaussian (Lowe’99)

Harris-Laplace Laplacian

slide-28
SLIDE 28

Harris-Laplace

multi-scale Harris points → selection of points at maximum of Laplacian → invariant points + associated regions [Mikolajczyk & Schmid’01]

slide-29
SLIDE 29

Matching results

213 / 190 detected interest points

slide-30
SLIDE 30

Matching results

58 points are initially matched

slide-31
SLIDE 31

Matching results

32 points are matched after verification – all correct

slide-32
SLIDE 32

LOG detector

Convolve image with scale-normalized Laplacian at several scales

$$LOG = s^2 \left( G_{xx}(\sigma) + G_{yy}(\sigma) \right)$$

Detection of maxima and minima of Laplacian in scale space
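A minimal sketch of the extrema search. It assumes the scale-normalized responses have already been computed and stored as `stack[s][y][x]`, and it compares each sample with its 26 neighbours in position and scale; the slide does not specify the neighbourhood, so that choice is an assumption.

```python
def scale_space_extrema(stack):
    """Return (s, y, x) positions that are strict maxima or minima among
    their 26 neighbours in a scale-space stack indexed as stack[s][y][x]."""
    found = []
    n_scales, height, width = len(stack), len(stack[0]), len(stack[0][0])
    for s in range(1, n_scales - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                v = stack[s][y][x]
                nbrs = [stack[s + ds][y + dy][x + dx]
                        for ds in (-1, 0, 1)
                        for dy in (-1, 0, 1)
                        for dx in (-1, 0, 1)
                        if (ds, dy, dx) != (0, 0, 0)]
                if v > max(nbrs) or v < min(nbrs):
                    found.append((s, y, x))
    return found

# Toy 3 x 3 x 3 stack of responses with one peak in the middle.
stack = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
stack[1][1][1] = 5.0
extrema = scale_space_extrema(stack)
```

Each returned triple gives both the position of a blob and the scale index at which it was detected.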
slide-33
SLIDE 33

Hessian detector

      =

yy xy xy xx

L L L L x H ) (

Hessian matrix

2 xy yy xx

L L L DET − =

Determinant of Hessian matrix Penalizes/eliminates long structures

with small derivative in a single direction
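The effect on long structures can be seen on two toy images. This is a sketch that takes second derivatives by central finite differences on the raw pixels; in the detector itself the derivatives are Gaussian-smoothed, so the unsmoothed version here is a simplifying assumption.

```python
def hessian_det(img, y, x):
    """DET = Lxx * Lyy - Lxy^2 from central finite differences at (y, x)."""
    lxx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
    lyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
    lxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    return lxx * lyy - lxy * lxy

# A vertical ridge: intensity varies along x only (a "long structure").
ridge = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
# A blob: intensity falls off in every direction.
blob = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
det_ridge = hessian_det(ridge, 1, 1)
det_blob = hessian_det(blob, 1, 1)
```

The ridge has a zero second derivative along its length, so the determinant vanishes there, while the blob gives a clearly positive value.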

slide-34
SLIDE 34

Efficient implementation

  • Difference of Gaussian (DOG) approximates the Laplacian

$$DOG = G(k\sigma) - G(\sigma)$$

  • Error due to the approximation
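A quick numerical look at the approximation error. The sketch relies on the standard relation that the derivative of G with respect to σ equals σ times the Laplacian of G, so DOG ≈ (k − 1) σ² ∇²G for k close to 1; the values of σ and k and the sample radii are arbitrary choices for illustration.

```python
import math

def gauss2d(rho, sigma):
    """Isotropic 2D Gaussian evaluated at radius rho."""
    return math.exp(-rho * rho / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)

def dog(rho, sigma, k):
    """Difference of Gaussian: G(k * sigma) - G(sigma)."""
    return gauss2d(rho, k * sigma) - gauss2d(rho, sigma)

def scaled_laplacian(rho, sigma, k):
    """(k - 1) * sigma^2 * Laplacian of G(sigma), the quantity DOG approximates."""
    lap = (rho * rho - 2 * sigma ** 2) / sigma ** 4 * gauss2d(rho, sigma)
    return (k - 1) * sigma ** 2 * lap

sigma, k = 2.0, 1.02
radii = (0.0, 1.0, 2.0, 4.0)
errors = [abs(dog(r, sigma, k) - scaled_laplacian(r, sigma, k)) for r in radii]
peak = abs(scaled_laplacian(0.0, sigma, k))
```

With these values the largest error is a few percent of the peak response, and it grows as k moves away from 1.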
slide-35
SLIDE 35

DOG detector

  • Fast computation, scale space processed one octave at a time

David G. Lowe. "Distinctive image features from scale-invariant keypoints." IJCV 60 (2).

slide-36
SLIDE 36

Local features - overview

  • Scale invariant interest points
  • Affine invariant interest points
  • Evaluation of interest points
  • Descriptors and their evaluation
slide-37
SLIDE 37

Affine invariant regions - Motivation

  • Scale invariance is not sufficient for large baseline changes

[Figure: detected scale invariant region and its projection under A]

Projected regions: viewpoint changes can locally be approximated by an affine transformation A

slide-38
SLIDE 38

Affine invariant regions - Motivation

slide-39
SLIDE 39

Affine invariant regions - Example

slide-40
SLIDE 40

Harris/Hessian/Laplacian-Affine

  • Initialize with scale-invariant Harris/Hessian/Laplacian points
  • Estimation of the affine neighbourhood with the second moment matrix [Lindeberg’94]
  • Apply affine neighbourhood estimation to the scale-invariant interest points [Mikolajczyk & Schmid’02, Schaffalitzky & Zisserman’02]
  • Excellent results in a comparison [Mikolajczyk et al.’05]
slide-41
SLIDE 41

Affine invariant regions

  • Based on the second moment matrix (Lindeberg’94)

$$M = \mu(\mathbf{x}, \sigma_I, \sigma_D) = \sigma_D^2\, G(\sigma_I) \otimes \begin{bmatrix} L_x^2(\mathbf{x}, \sigma_D) & L_x L_y(\mathbf{x}, \sigma_D) \\ L_x L_y(\mathbf{x}, \sigma_D) & L_y^2(\mathbf{x}, \sigma_D) \end{bmatrix}$$

  • Normalization with eigenvalues/eigenvectors

$$\mathbf{x}' = M^{1/2}\, \mathbf{x}$$
slide-42
SLIDE 42

Affine invariant regions

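$$\mathbf{x}_R = A\, \mathbf{x}_L$$

$$\mathbf{x}'_L = M_L^{1/2}\, \mathbf{x}_L \qquad \mathbf{x}'_R = M_R^{1/2}\, \mathbf{x}_R$$

$$\mathbf{x}'_R = R\, \mathbf{x}'_L$$

Isotropic neighborhoods related by image rotation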
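The claim that the normalized neighbourhoods differ only by a rotation can be checked on numbers. The sketch below assumes the second moment matrices of the two views are related by M_L = Aᵀ M_R A and uses a closed-form square root for 2x2 symmetric positive definite matrices; the matrices A and M_R are made-up values.

```python
import math

def matmul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(P):
    return [[P[0][0], P[1][0]], [P[0][1], P[1][1]]]

def inverse(P):
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det], [-P[1][0] / det, P[0][0] / det]]

def sqrtm_spd(M):
    """Square root of a symmetric positive definite 2x2 matrix (closed form)."""
    s = math.sqrt(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    t = math.sqrt(M[0][0] + M[1][1] + 2 * s)
    return [[(M[0][0] + s) / t, M[0][1] / t], [M[1][0] / t, (M[1][1] + s) / t]]

# Hypothetical affine map between the two views, x_R = A x_L, and a
# hypothetical second moment matrix measured in the right image.
A = [[1.5, 0.4], [-0.2, 0.8]]
M_R = [[2.0, 0.3], [0.3, 1.0]]
# Assumed transformation rule for second moment matrices: M_L = A^T M_R A.
M_L = matmul(transpose(A), matmul(M_R, A))

# x'_R = M_R^(1/2) A M_L^(-1/2) x'_L, so this product should be a rotation.
R = matmul(sqrtm_spd(M_R), matmul(A, inverse(sqrtm_spd(M_L))))
RtR = matmul(transpose(R), R)
det_R = R[0][0] * R[1][1] - R[0][1] * R[1][0]
```

RᵀR comes out as the identity and det R as 1, so the remaining ambiguity between the two normalized patches is a pure rotation.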

slide-43
SLIDE 43
Affine invariant regions - Estimation

  • Iterative estimation – initial points

slide-44
SLIDE 44
Affine invariant regions - Estimation

  • Iterative estimation – iteration #1

slide-45
SLIDE 45
Affine invariant regions - Estimation

  • Iterative estimation – iteration #2

slide-46
SLIDE 46
Affine invariant regions - Estimation

  • Iterative estimation – iteration #3, #4

slide-47
SLIDE 47

Harris-Affine versus Harris-Laplace

Harris-Laplace Harris-Affine

slide-48
SLIDE 48

Harris/Hessian-Affine

[Figure panels: Harris-Affine, Hessian-Affine]

slide-49
SLIDE 49

Harris-Affine

slide-50
SLIDE 50

Hessian-Affine

slide-51
SLIDE 51

Matches

22 correct matches

slide-52
SLIDE 52

Matches

33 correct matches

slide-53
SLIDE 53

Maximally stable extremal regions (MSER) [Matas’02]

  • Extremal regions: connected components in a thresholded image (all pixels above/below a threshold)
  • Maximally stable: minimal change of the component (area) for a change of the threshold, i.e. region remains stable for a change of threshold
  • Excellent results in a recent comparison
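The two definitions above can be sketched on a toy image. This is a simplified illustration, not the MSER algorithm of [Matas’02]: it tracks only the component containing one seed pixel, uses 4-connectivity and "above threshold" regions, and measures stability as the relative area change over neighbouring thresholds. All names and the image are hypothetical, and the seed is assumed to stay above every tested threshold.

```python
from collections import deque

def component_area(img, seed, threshold):
    """Area of the 4-connected component of pixels >= threshold containing seed."""
    sy, sx = seed
    if img[sy][sx] < threshold:
        return 0
    seen = {seed}
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if (0 <= ny < len(img) and 0 <= nx < len(img[0])
                    and (ny, nx) not in seen and img[ny][nx] >= threshold):
                seen.add((ny, nx))
                queue.append((ny, nx))
    return len(seen)

def most_stable_threshold(img, seed, thresholds, delta=1):
    """Threshold at which the relative area change is smallest."""
    areas = [component_area(img, seed, t) for t in thresholds]
    best_i = min(range(delta, len(thresholds) - delta),
                 key=lambda i: abs(areas[i - delta] - areas[i + delta]) / areas[i])
    return thresholds[best_i], areas

# Bright 2x2 square (value 9) on a darker, uneven background.
img = [
    [1, 2, 3, 4, 1],
    [2, 9, 9, 4, 1],
    [3, 9, 9, 3, 1],
    [4, 4, 3, 2, 1],
    [1, 1, 1, 1, 1],
]
thresholds = [1, 2, 3, 4, 5, 6, 7, 8, 9]
t_star, areas = most_stable_threshold(img, (1, 1), thresholds)
```

The component shrinks while the threshold passes through the background values and then stays at the 4 pixels of the bright square, which is where the area change is minimal.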
slide-54
SLIDE 54

Maximally stable extremal regions (MSER)

Examples of thresholded images

[Figure panels: high threshold, low threshold]

slide-55
SLIDE 55

MSER

slide-56
SLIDE 56

Overview

  • Introduction to local features
  • Harris interest points + SSD, ZNCC, SIFT
  • Scale & affine invariant interest point detectors
  • Evaluation and comparison of different detectors
  • Region descriptors and their performance
slide-57
SLIDE 57

Evaluation of interest points

  • Quantitative evaluation of interest point/region detectors
      – points / regions at the same relative location and area
  • Repeatability rate: percentage of corresponding points
  • Two points/regions are corresponding if
      – location error small
      – area intersection large
  • [K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir & L. Van Gool ’05]
slide-58
SLIDE 58

Evaluation criterion

H

$$\text{repeatability} = \frac{\#\,\text{corresponding regions}}{\#\,\text{detected regions}} \cdot 100\%$$

slide-59
SLIDE 59

Evaluation criterion

H

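$$\text{repeatability} = \frac{\#\,\text{corresponding regions}}{\#\,\text{detected regions}} \cdot 100\%$$

$$\text{overlap error} = \left(1 - \frac{\text{intersection}}{\text{union}}\right) \cdot 100\%$$

[Figure: example region pairs with overlap errors of 2%, 10%, 20%, 30%, 40%, 50% and 60%]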
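Both criteria can be sketched with regions stored as sets of pixels. The sketch assumes the regions of one image have already been projected into the other with the homography H; the disc-shaped regions, the function names and the 40% acceptance threshold are assumptions for illustration.

```python
def disc(cx, cy, radius):
    """Set of integer pixels inside a disc (a stand-in for a detected region)."""
    r = int(radius) + 1
    return {(x, y)
            for x in range(cx - r, cx + r + 1)
            for y in range(cy - r, cy + r + 1)
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2}

def overlap_error(region_a, region_b):
    """(1 - intersection / union) * 100, in percent."""
    union = len(region_a | region_b)
    return (1.0 - len(region_a & region_b) / union) * 100.0

def repeatability(region_pairs, n_detected, max_error=40.0):
    """#corresponding regions / #detected regions * 100, in percent."""
    corresponding = sum(1 for a, b in region_pairs
                        if overlap_error(a, b) < max_error)
    return corresponding / n_detected * 100.0

same = overlap_error(disc(0, 0, 10), disc(0, 0, 10))
disjoint = overlap_error(disc(0, 0, 5), disc(100, 100, 5))
pairs = [(disc(0, 0, 10), disc(1, 0, 10)), (disc(0, 0, 10), disc(30, 0, 10))]
rate = repeatability(pairs, n_detected=2)
```

Identical regions give 0% overlap error and disjoint regions give 100%; in the toy pair list only the slightly shifted disc counts as corresponding.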

slide-60
SLIDE 60

Dataset

  • Different types of transformation
      – Viewpoint change
      – Scale change
      – Image blur
      – JPEG compression
      – Light change
  • Two scene types
      – Structured
      – Textured
  • Transformations within the sequence (homographies)
      – Independent estimation

slide-61
SLIDE 61

Viewpoint change (0-60 degrees )

[Figure panels: structured scene, textured scene]

slide-62
SLIDE 62

Zoom + rotation (zoom of 1-4)

[Figure panels: structured scene, textured scene]

slide-63
SLIDE 63

Blur, compression, illumination

[Figure panels: blur - structured scene; blur - textured scene; light change - structured scene; jpeg compression - structured scene]

slide-64
SLIDE 64

Comparison of affine invariant detectors

Viewpoint change - structured scene

[Figure: repeatability % and number of correspondences versus viewpoint angle for Harris-Affine, Hessian-Affine, MSER, IBR, EBR and Salient; reference image and views at 20, 40 and 60 degrees]

slide-65
SLIDE 65

Comparison of affine invariant detectors

Scale change

[Figure: repeatability % for two sequences; reference images with scale changes of 4 and 2.8]

slide-66
SLIDE 66
Conclusion - detectors

  • Good performance for large viewpoint and scale changes
  • Results depend on transformation and scene type, no one best detector
  • Detectors are complementary
      – MSER adapted to structured scenes
      – Harris and Hessian adapted to textured scenes
  • Performance of the different scale invariant detectors is very similar (Harris-Laplace, Hessian, LoG and DOG)
  • Scale-invariant detector sufficient up to 40 degrees of viewpoint change

slide-67
SLIDE 67

Overview

  • Introduction to local features
  • Harris interest points + SSD, ZNCC, SIFT
  • Scale & affine invariant interest point detectors
  • Evaluation and comparison of different detectors
  • Region descriptors and their performance
slide-68
SLIDE 68

Region descriptors

  • Normalized regions are
      – invariant to geometric transformations except rotation
      – not invariant to photometric transformations

slide-69
SLIDE 69

Descriptors

  • Regions invariant to geometric transformations except rotation
      – rotation invariant descriptors
      – normalization with dominant gradient direction
  • Regions not invariant to photometric transformations
      – invariance to affine photometric transformations
      – normalization with mean and standard deviation of the image patch
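A minimal sketch of the photometric normalization, assuming the patch is given as a flat list of intensities and is not constant (otherwise the standard deviation is zero). The gain and offset in the example are arbitrary.

```python
import math

def normalize_patch(patch):
    """Subtract the mean and divide by the standard deviation of the patch."""
    n = len(patch)
    mean = sum(patch) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in patch) / n)
    return [(v - mean) / std for v in patch]

patch = [10.0, 20.0, 15.0, 40.0, 25.0, 30.0]
# Affine photometric change I -> a * I + b with a > 0 (values chosen arbitrarily).
changed = [1.8 * v + 12.0 for v in patch]

n1 = normalize_patch(patch)
n2 = normalize_patch(changed)
max_diff = max(abs(u - v) for u, v in zip(n1, n2))
```

The two normalized patches agree up to rounding, which is the invariance to affine photometric transformations named in the list.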

slide-70
SLIDE 70

Descriptors

Extract affine regions → Normalize regions → Eliminate rotational + illumination → Compute appearance descriptors: SIFT (Lowe ’04)

slide-71
SLIDE 71

Descriptors

  • Gaussian derivative-based descriptors
      – Differential invariants (Koenderink and van Doorn’87)
      – Steerable filters (Freeman and Adelson’91)
  • SIFT (Lowe’99)
  • Moment invariants [Van Gool et al.’96]
  • Shape context [Belongie et al.’02]
  • SIFT with PCA dimensionality reduction
  • Gradient PCA [Ke and Sukthankar’04]
  • SURF descriptor [Bay et al.’08]
  • DAISY descriptor [Tola et al.’08, Winder et al.’09]
slide-72
SLIDE 72

Comparison criterion

  • Descriptors should be
      – Distinctive
      – Robust to changes in viewing conditions as well as to errors of the detector
  • Detection rate (recall)
      – #correct matches / #correspondences
  • False positive rate
      – #false matches / #all matches
  • Variation of the distance threshold
      – distance (d1, d2) < threshold

[K. Mikolajczyk & C. Schmid, PAMI’05]
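The two rates and the threshold sweep can be sketched as follows. The candidate matches, their distances and the number of ground-truth correspondences are made-up values; the function name is hypothetical.

```python
def recall_and_false_rate(matches, n_correspondences, threshold):
    """matches: list of (descriptor_distance, is_correct) candidate pairs."""
    accepted = [ok for dist, ok in matches if dist < threshold]
    n_correct = sum(accepted)
    n_false = len(accepted) - n_correct
    recall = n_correct / n_correspondences
    false_rate = n_false / len(accepted) if accepted else 0.0
    return recall, false_rate

# Hypothetical candidate matches: (distance between descriptors, ground truth).
matches = [(0.10, True), (0.15, True), (0.22, False), (0.30, True),
           (0.41, False), (0.55, True), (0.60, False)]
curve = [recall_and_false_rate(matches, n_correspondences=5, threshold=t)
         for t in (0.2, 0.35, 0.7)]
```

Raising the distance threshold accepts more matches, so recall increases together with the false positive rate; plotting one against the other gives curves of the kind shown on the next slides.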

slide-73
SLIDE 73

Viewpoint change (60 degrees)

[Figure: #correct / 2101 versus 1−precision; curves for shape context, gradient pca, cross correlation, complex filters, har-aff esift, steerable filters, gradient moments and sift]

slide-74
SLIDE 74

Scale change (factor 2.8)

[Figure: #correct / 2086 versus 1−precision; curves for shape context, gradient pca, cross correlation, complex filters, har-aff esift, steerable filters, gradient moments and sift]

slide-75
SLIDE 75

Conclusion - descriptors

  • SIFT based descriptors perform best
  • Significant difference between SIFT and low dimension descriptors as well as cross-correlation
  • Robust region descriptors better than point-wise descriptors
  • Performance of the descriptor is relatively independent of the detector

slide-76
SLIDE 76

Available on the internet

  • Binaries for detectors and descriptors
      – Building blocks for recognition systems

http://lear.inrialpes.fr/software

  • Carefully designed test setup
      – Dataset with transformations
      – Evaluation code in matlab
      – Benchmark for new detectors and descriptors