Overview
• Local invariant features (C. Schmid)
• Matching and recognition with local features (J. Sivic)
• Efficient visual search (J. Sivic)
• Very large scale search (C. Schmid)
• Practical session
Image search system for large datasets
Large image dataset (one million images or more): query → image search system → ranked image list
• Issues for very large databases
  • to reduce the query time
  • to reduce the storage requirements
  • with minimal loss in retrieval accuracy
Large scale object/scene recognition
Image dataset: > 1 million images; query → image search system → ranked image list
• Each image described by approximately 2000 descriptors
  – 2 × 10^9 descriptors to index for one million images!
• Database representation in RAM:
  – size of descriptors: 1 TB, search + memory intractable
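The 1 TB figure can be checked with simple arithmetic (an illustrative sketch; the 128-dimensional, 4-bytes-per-component SIFT layout is an assumption consistent with the slides):

```python
# Back-of-the-envelope storage estimate for raw SIFT descriptors.
n_images = 1_000_000         # one million images
descs_per_image = 2000       # ~2000 local descriptors per image
dim = 128                    # SIFT descriptor dimensionality (assumed)
bytes_per_value = 4          # 32-bit float per component (assumed)

n_descriptors = n_images * descs_per_image           # 2 * 10^9 descriptors
total_bytes = n_descriptors * dim * bytes_per_value  # raw descriptor storage

print(n_descriptors)             # 2000000000
print(total_bytes / 1e12, "TB")  # ~1.02 TB -> intractable in RAM
```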
Bag-of-words [Sivic & Zisserman’03, Nister et al.’04, Chum et al.’07]
Query image → Bag-of-features processing (Hessian-Affine regions + SIFT descriptors [Mikolajczyk & Schmid’04, Lowe’04]) → set of SIFT descriptors → quantization on centroids (visual words) → sparse frequency vector + tf-idf weighting
• Visual words: 1 word (index) per local descriptor
• Inverted file: only image ids are stored; 8 GB for a million images, fits in RAM
• Querying the inverted file yields a ranked image list; geometric verification re-ranks a short-list [Lowe’04, Chum et al.’07]
• Problem: matching approximation
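A minimal sketch of the inverted-file idea with tf-idf voting (illustrative only; the toy vocabulary and variable names are assumptions, not the authors' implementation):

```python
from collections import defaultdict
import math

# Toy database: each image is a bag of visual-word ids (after quantization).
database = {
    0: [3, 3, 7, 9],
    1: [3, 5, 5, 9],
    2: [1, 2, 8, 8],
}

# Inverted file: visual word -> set of image ids containing it.
inverted_file = defaultdict(set)
for img_id, words in database.items():
    for w in words:
        inverted_file[w].add(img_id)

def score(query_words):
    """tf-idf voting: each query word votes for the images that contain it."""
    n_images = len(database)
    scores = defaultdict(float)
    for w in set(query_words):
        postings = inverted_file.get(w, set())
        if not postings:
            continue
        idf = math.log(n_images / len(postings))  # rare words weigh more
        for img_id in postings:
            tf = database[img_id].count(w) / len(database[img_id])
            scores[img_id] += tf * idf
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(score([3, 9, 5]))  # image 1 ranks first: it alone contains word 5
```

Only images sharing at least one visual word with the query are touched, which is why querying stays fast even for millions of images.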
Approximate nearest neighbour (ANN) evaluation of bag-of-features
• ANN algorithms return a list of potential neighbors
• Accuracy: NN recall = probability that the NN is in this list
• Ambiguity removal: rate of points retrieved = proportion of vectors in the short-list
• In BOF, this trade-off is managed by the number of clusters k
[Plot: NN recall vs. rate of points retrieved, for k = 100 to 50,000 (BOW)]
20K visual word vocabulary: false matches
200K visual word vocabulary: good matches missed
Problem with bag-of-features
• The intrinsic matching scheme performed by BOF is weak
  • for a “small” visual dictionary: too many false matches
  • for a “large” visual dictionary: many true matches are missed
• No good trade-off between “small” and “large”!
  • either the Voronoi cells are too big
  • or these cells can’t absorb the descriptor noise
  → the intrinsic approximate nearest neighbor search of BOF is not sufficient
• Possible solutions
  • soft assignment [Philbin et al. CVPR’08]
  • additional short codes [Jegou et al. ECCV’08]
Hamming Embedding
• Representation of a descriptor x
  • vector-quantized to q(x) as in standard BOF
  • + short binary vector b(x) for an additional localization in the Voronoi cell
• Two descriptors x and y match iff
  q(x) = q(y) and h(b(x), b(y)) ≤ h_t
  where h(a, b) is the Hamming distance and h_t a fixed threshold
• Nearest neighbors for the Hamming distance ≈ the ones for the Euclidean distance
• Efficiency
  • Hamming distance = very few operations
  • fewer random memory accesses: 3× faster than BOF with the same dictionary size!
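The "very few operations" claim follows from the fact that the Hamming distance between two packed binary signatures is one XOR plus a population count (a small sketch; the signature values and the threshold are assumed for illustration):

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary signatures packed into ints."""
    return bin(a ^ b).count("1")  # XOR, then count the set bits (popcount)

# Two hypothetical 64-bit signatures from the same Voronoi cell.
sig_x = 0xF0F0F0F0F0F0F0F0
sig_y = 0xF0F0F0F0F0F0F0F1  # differs from sig_x in a single bit
h_t = 24                    # assumed threshold; the slides mention h_t = 16

print(hamming(sig_x, sig_y))         # 1
print(hamming(sig_x, sig_y) <= h_t)  # True -> the descriptors match
```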
Hamming Embedding
• Off-line (given a quantizer)
  • draw an orthogonal projection matrix P of size d_b × d; this defines d_b random projection directions
  • for each Voronoi cell and projection direction, compute the median value from a learning set
• On-line: compute the binary signature b(x) of a given descriptor
  • project x onto the projection directions as z(x) = (z_1, …, z_{d_b})
  • b_i(x) = 1 if z_i(x) is above the learned median value, otherwise 0
[H. Jegou et al., Improving bag of features for large scale image search, IJCV’10]
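The off-line/on-line steps above can be sketched as follows (a toy version with random data; the vocabulary size, learning-set size, and nearest-centroid quantizer are assumptions made to keep the example self-contained):

```python
import numpy as np

rng = np.random.default_rng(0)
d, d_b = 128, 64  # descriptor dimension, signature length (64 bits as in the slides)
k = 16            # toy vocabulary size (assumed)

# --- Off-line: orthogonal projection directions + per-cell medians ---
# Orthogonalize a random Gaussian matrix and keep the first d_b rows as P.
P = np.linalg.qr(rng.standard_normal((d, d)))[0][:d_b]

learning_set = rng.standard_normal((5000, d))
centroids = rng.standard_normal((k, d))
assign = np.argmin(((learning_set[:, None] - centroids[None]) ** 2).sum(-1), axis=1)

medians = np.zeros((k, d_b))
for c in range(k):
    members = learning_set[assign == c]
    if len(members):
        medians[c] = np.median(members @ P.T, axis=0)  # per-direction medians

# --- On-line: binary signature b(x) of a new descriptor ---
def signature(x, cell):
    z = P @ x                                 # z(x) = (z_1, ..., z_{d_b})
    return (z > medians[cell]).astype(np.uint8)  # b_i = 1 iff above the median

x = rng.standard_normal(d)
cell = int(np.argmin(((centroids - x) ** 2).sum(-1)))
b = signature(x, cell)
print(b.shape)  # (64,)
```

In a real index the 64 bits would be packed into a single integer and stored next to the image id in the inverted file.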
Hamming and Euclidean neighborhood
• Trade-off between memory usage and accuracy
  • more bits yield higher accuracy
• In practice: 64 bits (8 bytes)
[Plot: rate of 5-NN retrieved vs. rate of cell points retrieved, for 8-, 16-, 32-, 64- and 128-bit signatures]
ANN evaluation of Hamming Embedding
• Compared to BOW: at least 10 times fewer points in the short-list for the same level of accuracy
• Hamming Embedding with h_t = 16 provides a much better trade-off between recall and ambiguity removal
[Plot: NN recall vs. rate of points retrieved, HE+BOW (h_t = 16 to 32) vs. BOW (k = 100 to 50,000)]
Matching points - 20k word vocabulary 240 matches 201 matches Many matches with the non-corresponding image!
Matching points - 200k word vocabulary 69 matches 35 matches Still many matches with the non-corresponding one
Matching points - 20k word vocabulary + HE 83 matches 8 matches 10x more matches with the corresponding image!
Bag-of-features [Sivic & Zisserman’03]
Query image → Bag-of-features processing (Harris-Hessian-Laplace regions + SIFT descriptors) → set of SIFT descriptors → quantization on centroids (visual words) → sparse frequency vector + tf-idf weighting
• Querying the inverted file yields a ranked image list; geometric verification re-ranks a short-list [Chum et al. 2007]
Geometric verification
Use the position and shape of the underlying features to improve retrieval quality.
Both images have many matches: which is correct?
Geometric verification
We can measure spatial consistency between the query and each result to improve retrieval quality.
• Many spatially consistent matches: correct result
• Few spatially consistent matches: incorrect result
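Spatial consistency can be verified with a RANSAC-style loop (a deliberately minimal sketch: a real system estimates an affine transformation or homography, while this toy version only checks a 2-D translation; all data is made up):

```python
import random

random.seed(0)  # deterministic for the toy example

def spatial_consistency(matches, n_iter=100, tol=5.0):
    """Count matches consistent with a single 2-D translation (toy RANSAC).

    matches: list of ((xq, yq), (xr, yr)) keypoint pairs between the query
    and a result image.
    """
    best = 0
    for _ in range(n_iter):
        (xq, yq), (xr, yr) = random.choice(matches)
        dx, dy = xr - xq, yr - yq  # hypothesized translation
        inliers = sum(
            1 for (ax, ay), (bx, by) in matches
            if abs((bx - ax) - dx) <= tol and abs((by - ay) - dy) <= tol
        )
        best = max(best, inliers)
    return best

# Five matches share the translation (10, 0); two are random outliers.
good = [((x, y), (x + 10, y)) for x, y in [(0, 0), (5, 2), (9, 7), (3, 3), (8, 1)]]
bad = [((0, 0), (50, 40)), ((2, 2), (-30, 7))]
print(spatial_consistency(good + bad))  # 5
```

The inlier count replaces the raw match count as the ranking score, which is what separates the correct from the incorrect result in the slide's example.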
Geometric verification Gives localization of the object
Re-ranking based on geometric verification
• works very well
• but performed on a short-list only (100–1000 images)
• for very large datasets, the number of distracting images is so high that relevant images are not even short-listed!
[Plot: rate of relevant images short-listed vs. dataset size (1,000 to 1,000,000), for short-list sizes of 20, 100 and 1000 images]
Weak geometry consistency
• Weak geometric information used for all images (not only the short-list)
• Each invariant interest region detection has an associated scale and rotation angle, here the characteristic scale and the dominant gradient orientation
  • example: scale change of 2, rotation angle of ca. 20 degrees
• Each matching pair results in a scale and angle difference
• For the global image, scale and rotation changes are roughly consistent
WGC: orientation consistency
Max of the histogram = rotation angle between images
WGC: scale consistency
Weak geometry consistency
• Integration of the geometric verification into the BOF
  – votes for an image in two quantized subspaces, i.e. for angle & scale
  – these subspaces are shown to be roughly independent
  – final score: filtering for each parameter (angle and scale)
• Only matches that agree with the main difference of orientation and scale are taken into account in the final score
• Re-ranking using the full geometric transformation still adds information in a final stage
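The voting scheme above can be sketched as two quantized histograms per database image (an illustrative toy version; the bin counts and the quantization of the log-scale axis are assumptions, not the paper's exact parameters):

```python
import numpy as np

N_ANGLE, N_SCALE = 8, 8  # number of quantization bins (assumed toy values)

def wgc_score(matches):
    """Weak geometry consistency score for one database image (sketch).

    matches: list of (d_angle, d_log_scale) differences, one per matching
    descriptor pair between the query and the database image. Votes go
    into two quantized 1-D histograms; the score keeps only the dominant
    angle/scale hypothesis, filtering each parameter independently.
    """
    angle_hist = np.zeros(N_ANGLE)
    scale_hist = np.zeros(N_SCALE)
    for d_angle, d_log_scale in matches:
        a_bin = int((d_angle % (2 * np.pi)) / (2 * np.pi) * N_ANGLE)
        s_bin = int(np.clip(d_log_scale + N_SCALE / 2, 0, N_SCALE - 1))
        angle_hist[a_bin] += 1
        scale_hist[s_bin] += 1
    return min(angle_hist.max(), scale_hist.max())

# Six matches agree on a ~20 degree rotation and a scale change of 2;
# two outliers vote elsewhere and do not contribute to the final score.
consistent = [(np.deg2rad(20), np.log2(2))] * 6
outliers = [(np.deg2rad(170), np.log2(0.25)), (np.deg2rad(300), np.log2(8))]
print(wgc_score(consistent + outliers))  # 6.0
```

Because only histogram updates are needed per match, this check runs inside the inverted-file scan for all images, not just a short-list.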
Experimental results
• Evaluation on the INRIA Holidays dataset, 1491 images
  • 500 query images + 991 annotated true positives
  • most images are holiday photos of friends and family
• 1 million & 10 million distractor images from Flickr
• Vocabulary construction on a different Flickr set
• Almost real-time search speed
• Evaluation metric: mean average precision (in [0,1], bigger = better)
  • average over the precision/recall curve
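The evaluation metric can be made concrete with a short sketch (the toy ranking below is invented; averaging the per-query AP over all 500 queries gives the mAP reported in the slides):

```python
def average_precision(ranked_relevance):
    """Average precision for one query: mean of precision at each relevant hit.

    ranked_relevance: list of booleans, True where the ranked result is a
    true positive, ordered from best to worst match.
    """
    hits, precisions = 0, []
    for i, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / i)  # precision at this recall point
    return sum(precisions) / len(precisions) if precisions else 0.0

# Toy ranking with 3 relevant results at ranks 1, 3 and 4.
ap = average_precision([True, False, True, True, False])
print(round(ap, 3))  # (1/1 + 2/3 + 3/4) / 3 = 0.806
```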
Holiday dataset – example queries
Dataset : Venice Channel Query Base 1 Base 2 Base 3 Base 4
Dataset : San Marco square Base 2 Base 2 Base 3 Base 3 Query Query Base 1 Base 1 Base 4 Base 5 Base 6 Base 7 Base 8 Base 9
Example distractors - Flickr