



  1. Scalable Recognition with a Vocabulary Tree
     by: David Nistér, Henrik Stewénius
     presented by: William Malpica
     CS 395T
     Some slides from Nistér and Stewénius's CVPR 2006 presentation

     Outline
     • Abstract
     • Strengths
     • System Overview
     • Animated explanation of the vocabulary tree
     • Explanation of the scoring scheme
     • Testing Results
     • Conclusion

     Scalable Recognition with a Vocabulary Tree
     • The paper describes a system which can recognize objects from a very large database with great speed and recognition quality.
     • The system uses local region descriptors which are hierarchically quantized in a vocabulary tree.

     Strengths!
     • The vocabulary tree directly defines the quantization.
     • Each high-dimension feature vector is quantized into an integer which corresponds to a path in the vocabulary tree.
     • Results in speed
       • Feature extraction on a 640x480 video frame in 0.2 sec. and database query in 25 ms on a 50000 image database.
     • Results in compactness

     Strengths!
     • Potential for on-the-fly insertion
     • An offline unsupervised training stage is necessary to create the vocabulary, but new images can be added to the database on-the-fly.
     • Images can be added at the same rate as feature extraction.
     • Excellent benefit for large scalable image databases.

     Adding, Querying and Removing Images at full speed
     [Figure: Query, Add, Remove]
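The statement that each feature vector is quantized into an integer corresponding to a path in the tree can be illustrated with a small sketch. Everything named here is hypothetical and not taken from the slides: the `Node` class, the `quantize` function, the toy two-level tree, and the choice to encode the path as a base-k integer (which assumes a complete tree with branch factor k and uniform depth).

```python
import math


class Node:
    """Hypothetical tree node: one cluster centre per child; a leaf has none."""

    def __init__(self, centers=None, children=None):
        self.centers = centers or []
        self.children = children or []


def nearest(centers, v):
    """Index of the centre closest to v in Euclidean distance."""
    return min(range(len(centers)), key=lambda i: math.dist(centers[i], v))


def quantize(root, v, k):
    """Descend the tree, choosing the nearest child centre at each level.

    Returns (word, path): path is the list of child indices taken, and
    word packs that path into a single base-k integer.
    """
    word, path, node = 0, [], root
    while node.centers:
        i = nearest(node.centers, v)
        path.append(i)
        word = word * k + i
        node = node.children[i]
    return word, path


# Toy tree with branch factor k = 2 and two levels (made-up centres).
toy_root = Node(
    centers=[(0.0, 0.0), (10.0, 10.0)],
    children=[
        Node(centers=[(0.0, 0.0), (0.0, 2.0)], children=[Node(), Node()]),
        Node(centers=[(10.0, 10.0), (12.0, 10.0)], children=[Node(), Node()]),
    ],
)

word, path = quantize(toy_root, (11.5, 10.0), k=2)
print(word, path)
```

Each level costs only k distance computations, so a lookup touches k times the tree depth centres rather than every leaf, which is one way to read the "results in speed" claim.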

  2. Training and Addition are Separate
     [Figure: Common Approach vs. Our approach]

     System Overview
     • Maximally Stable Extremal Regions (MSERs) feature extractor.
     • SIFT feature descriptor
     • Feature space is quantized through k-means clustering and built into a vocabulary tree.
     • To retrieve images, a hierarchical scoring scheme is used based on Term Frequency Inverse Document Frequency (TF-IDF).
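As a rough illustration of how k-means clustering can be turned into a tree, the sketch below applies plain Lloyd's k-means recursively: cluster the descriptors into k groups, then cluster each group again until a chosen depth is reached. The function names, the dictionary layout, the fixed iteration count, and the random seeding are assumptions made for this example, not details from the slides.

```python
import math
import random


def assign(points, centers):
    """Group points by their nearest centre."""
    groups = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda c: math.dist(centers[c], p))
        groups[j].append(p)
    return groups


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on tuples; returns (centers, groups)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: seed with k data points
    for _ in range(iters):
        groups = assign(points, centers)
        # Move each centre to its group mean; keep it if the group is empty.
        centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers, assign(points, centers)


def build_tree(points, k, depth):
    """Recursively cluster; a node with no centres is a leaf."""
    if depth == 0 or len(points) < k:
        return {"centers": [], "children": []}
    centers, groups = kmeans(points, k)
    return {
        "centers": centers,
        "children": [build_tree(g, k, depth - 1) for g in groups],
    }


# Toy 2-D "descriptors" forming two well-separated groups.
pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
       (100.0, 100.0), (100.0, 101.0), (101.0, 100.0)]
tree = build_tree(pts, k=2, depth=1)
print(tree["centers"])
```

This training step runs offline and never looks at the database images themselves, which is what lets images be added later without retraining.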


  6. Definition of Scoring
     • Weights are assigned to each node (with certain exceptions).
     • Query and database vectors are defined according to their assigned weights.
     • Each database image is given a relevance score based on the normalized difference between the query and database vectors.

     $$w_i = \ln \frac{N}{N_i} \tag{1}$$
     $$q_i = n_i w_i \tag{2}$$
     $$d_i = m_i w_i \tag{3}$$
     $$s(q, d) = \left\| \frac{q}{\|q\|} - \frac{d}{\|d\|} \right\| \tag{4}$$

     Here N is the number of images in the database, N_i is the number of database images with at least one descriptor path through node i, and n_i and m_i are the numbers of descriptors of the query and database image, respectively, with a path through node i.

     Implementation of Scoring
     • Every node is associated with an inverted file, although only leaf nodes are explicitly represented. Inner nodes are a concatenation of the leaf nodes.
     • Inverted files store the id-numbers of the images in which a particular node occurs, and the term frequency for that image.
     • The vectors representing the database images as well as the query images are normalized to unit magnitude.

     Normalization
     • To compute the normalized difference in Lp-norm:

     $$\|q - d\|_p^p = \sum_i |q_i - d_i|^p \tag{5}$$
     $$\|q - d\|_p^p = 2 + \sum_{i \,\mid\, q_i \neq 0,\ d_i \neq 0} \left( |q_i - d_i|^p - |q_i|^p - |d_i|^p \right) \tag{6}$$

     • For the case of the L2-norm:

     $$\|q - d\|_2^2 = 2 - 2 \sum_{i \,\mid\, q_i \neq 0,\ d_i \neq 0} q_i d_i \tag{7}$$

     Testing
     • Ground truth database consisted of 6376 images in groups of four.
     • The database was queried with every image and was evaluated on how frequently the other three images are found perfectly.

     Results for only 1400 images
     [Figure]
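The scoring and inverted-file ideas on these slides can be sketched together. This is a simplified illustration under stated assumptions, not the authors' implementation: it scores leaf-level visual words only (the slides describe weights on tree nodes in general), it recomputes the weights of equation (1) from the current database contents on every call, it recomputes vector norms at query time instead of caching them, and the `InvertedIndex` class with its method names is hypothetical. It uses the L2 form of equation (7), so only words shared by the query and a database image contribute to the sum.

```python
import math
from collections import Counter, defaultdict


class InvertedIndex:
    """Hypothetical leaf-only inverted file with TF-IDF scoring."""

    def __init__(self):
        self.files = defaultdict(dict)  # word -> {image_id: term frequency m_i}
        self.images = {}                # image_id -> Counter of its words

    def add(self, image_id, words):
        counts = Counter(words)
        self.images[image_id] = counts
        for w, m in counts.items():
            self.files[w][image_id] = m

    def remove(self, image_id):
        for w in self.images.pop(image_id):
            del self.files[w][image_id]

    def weight(self, w):
        """Equation (1): w_i = ln(N / N_i); 0 if no image contains the word."""
        n_i = len(self.files.get(w, ()))
        return math.log(len(self.images) / n_i) if n_i else 0.0

    def _norm(self, counts):
        return math.sqrt(sum((c * self.weight(w)) ** 2 for w, c in counts.items()))

    def query(self, words):
        """Return (image_id, score) pairs, best (lowest) score first."""
        q = Counter(words)
        qn = self._norm(q)
        dots = defaultdict(float)
        for w, n in q.items():  # walk only the inverted files the query touches
            wt = self.weight(w)
            for image_id, m in self.files.get(w, {}).items():
                dots[image_id] += (n * wt) * (m * wt)
        scores = {}
        for image_id, counts in self.images.items():
            dn = self._norm(counts)
            cos = dots[image_id] / (qn * dn) if qn and dn else 0.0
            scores[image_id] = 2.0 - 2.0 * cos  # equation (7)
        return sorted(scores.items(), key=lambda kv: kv[1])


index = InvertedIndex()
index.add("a", [1, 1, 2])
index.add("b", [3, 4])
index.add("c", [2, 5])
ranked = index.query([1, 1, 2])
print(ranked)
```

Adding or removing an image only touches the inverted files of the words that image contains, which matches the claim that the database can be updated on the fly once the vocabulary is fixed.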

  7. Results for only 1400 images
     [Figures: two slides]

     Results with full 6376 image database
     [Figure]

     Other Tests – 40000 CD covers
     • Method was tested on a database of 40000 CD covers running in real time.

     Other Tests – 1 million images
     • Method was also tested on a database of 1 million images. The ground truth images were embedded into a database containing all the frames from several movies: The Bourne Identity, The Matrix, Braveheart, Collateral, Resident Evil, Almost Famous and Monsters Inc.
     • Queries on an 8GB machine would take about 1 second. Database creation took 2.5 days.

     Other Tests – 1 million images
     [Figure]

  8. Other Tests – Non movie images queried on 300K frames
     [Figure]

     Conclusion
     • This methodology provides the ability to make fast searches on extremely large databases.
     • Paves the way to someday create an internet-scale content based image search engine.

     Questions
