Large-Scale Video Retrieval Using Image Queries



  1. Large-Scale Video Retrieval Using Image Queries André Filgueiras de Araujo Department of Electrical Engineering Stanford University Andre Araujo – Large-Scale Video Retrieval Using Image Queries 1

  2. The “Dark Matter” of the Digital Age
     • 400+ hours of video uploaded per minute
     • 8+ billion video views per day
     • 85% of data in the form of multimedia
     • 100+ hours of video uploaded per minute
     Key problem: How can we make sense of these data?

  3. Automatic Visual Recognition
     Image classification
     • Is this an urban landscape?
     Object detection
     • Does this image contain a bus? Where?
     Instance recognition (a.k.a. “visual search”)
     • Does this image contain the “Wicked” billboard?

  4. Visual Search
     [Diagram: an image query and a database of images feed a retrieval system]
     Examples: Product recognition, Location recognition, Commercial applications [Tsai et al., MM’08, MM’10] [Chen et al., CVPR’11]

  5. Video Retrieval Using Image Queries
     [Diagram: an image query and a database of video clips feed a retrieval system]
     Applications:
     • Brand monitoring: search YouTube using product images
     • News videos: search event footage using photos
     • Online education: search lectures using slides

  6. Online Prototype http://videosearch.stanford.edu

  7. Simple Architecture
     [Diagram: Query image → Query descriptor → Query-to-frames comparison against the Frame index → Frame short-list → Feature matching against the Feature index → Geometric verification → Final result]
     Too many frames → does not scale

  8. Large-Scale Architecture
     [Diagram: Query image → Query descriptor → Query-to-clips comparison against the Clip index → Clip short-list → Query-to-frames comparison against the Frame index → Frame short-list → Feature matching against the Feature index → Geometric verification → Final result]
     Focus of this work: the clip short-list stage (Query-to-clips, Clip index)
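
The two short-listing stages of the large-scale architecture can be sketched as follows. This is a minimal illustration only: it assumes that query, clips, and frames are each described by one fixed-length vector and scored with a dot product, and it leaves out feature matching and geometric verification. The names `top_k`, `two_stage_search`, and `frame_to_clip` are hypothetical and do not come from the slides.

```python
import numpy as np

def top_k(query, index, k):
    """Return the indices of the k rows of `index` with the highest dot product."""
    scores = index @ query
    return np.argsort(-scores)[:k]

def two_stage_search(query, clip_index, frame_index, frame_to_clip, n_clips, n_frames):
    """Short-list clips first, then rank only the frames of those clips."""
    clip_shortlist = top_k(query, clip_index, n_clips)
    # Keep only the frames that belong to a short-listed clip.
    candidates = np.flatnonzero(np.isin(frame_to_clip, clip_shortlist))
    scores = frame_index[candidates] @ query
    frame_shortlist = candidates[np.argsort(-scores)[:n_frames]]
    return clip_shortlist, frame_shortlist

# Toy data (hypothetical): 3 clips with 2 frames each, 2-D descriptors.
query = np.array([1.0, 0.0])
clip_index = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
frame_index = np.array([[0.9, 0.1], [0.2, 0.8], [1.0, 0.0],
                        [0.0, 1.0], [0.8, 0.2], [0.1, 0.9]])
frame_to_clip = np.array([0, 0, 1, 1, 2, 2])

clip_shortlist, frame_shortlist = two_stage_search(
    query, clip_index, frame_index, frame_to_clip, n_clips=2, n_frames=2)
print(clip_shortlist, frame_shortlist)
```

In this toy data, frame 2 has the highest frame score but belongs to a clip that is not short-listed, so it is never examined. That is the trade-off of the clip stage: fewer comparisons, at the cost of depending on the quality of the clip descriptor.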

  9. Video Retrieval Using Image Queries
     [Diagram: Query descriptor → Query-to-clips comparison against the Clip index → Clip short-list]
     Main challenges:
     • Asymmetry: how can we compare images to videos?
     • Temporal aggregation: how can we describe a video clip for query-by-image retrieval?

  10. Contributions
     Fisher Vector Comparisons
     • Asymmetric comparisons for Fisher vectors
     • Cluttered query or database images
     Fisher Vector Aggregation
     • Fisher vector descriptors for video segments
     • Compact database for large-scale retrieval
     Bloom Filter Aggregation
     • Bloom filter descriptors for video segments
     • Fast and accurate large-scale retrieval

  11. Related Work: Visual Search (organized by query type and database type)
     • Video query, image database (Augmented Reality): TCD [Makar et al., ’12]; Hybrid Vis. Search [Chen et al., ’14]
     • Video query, video database (Content Tracking): Frame Mat. + ST [Douze et al., ’10]; TRECVID-CCD [Over et al., ’12]
     • Image query, image database (Traditional Visual Search): FV [Perronnin et al., ’07]; BoW [Sivic et al., ’03]; SIFT [Lowe, ’04]
     • Image query, video database (Video Retrieval by Image): discussed on next slide

  12. Related Work: Video Retrieval Using Images
     • Early work
       – BoW retrieval of movie frames [Sivic and Zisserman, ICCV’03]
       – Object-level retrieval of movie shots [Sivic et al., ECCV’04]
     • TRECVID Instance Search Challenge [Over et al., TRECVID’10-15]
       – Frame-based BoW with Color SIFT [Le et al., ’10-11]
       – Shot-based aggregation using BoW [Zhu et al., ’13] [Ballas et al., ’14]
       – BoW query-adaptive asymmetrical dissimilarities [Zhu et al., ’13]
     • Object localization in videos
       – SURF-based matching per shot [Apostolidis et al., ICME’13]
       – Optimal path using dynamic programming [Meng et al., ICIP’15]

  13. Background: Pairwise Image Matching
     [Diagram: the query image and the database image are each processed by Interest Point Detection → Local Descriptor Extraction, giving image features (Descriptor 1, Descriptor 2, …, Descriptor n); the two descriptor sets then go through Descriptor Matching]
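
A rough sketch of the descriptor matching step, assuming the local descriptors have already been extracted as rows of a NumPy array. The nearest-neighbor search with a ratio test is one common choice and is an assumption here, not something the slide specifies. The function name `match_descriptors` and the `ratio` threshold are illustrative.

```python
import numpy as np

def match_descriptors(query_desc, db_desc, ratio=0.8):
    """Match each query descriptor to its nearest database descriptor.

    A match is kept only if the nearest neighbor is clearly closer than the
    second nearest (ratio test). Requires at least two database descriptors.
    """
    q = np.asarray(query_desc, dtype=float)
    d = np.asarray(db_desc, dtype=float)
    # Pairwise Euclidean distances, shape (num_query, num_db).
    dists = np.linalg.norm(q[:, None, :] - d[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)
    matches = []
    for i in range(q.shape[0]):
        best, second = order[i, 0], order[i, 1]
        if dists[i, best] < ratio * dists[i, second]:
            matches.append((i, int(best)))
    return matches

# Toy 2-D descriptors (hypothetical values).
query_desc = [[0.0, 0.0], [10.0, 10.0]]
db_desc = [[0.1, 0.0], [5.0, 5.0], [10.0, 10.1]]
matches = match_descriptors(query_desc, db_desc)
print(matches)
```

Comparing every query descriptor against every database descriptor costs time proportional to the product of the two set sizes, which is one reason for wanting a compact fixed-length image representation.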

  14. Background: Fisher Vector (FV) [Perronnin and Dance, CVPR’07]
     • State-of-the-art technique for large-scale retrieval
     • Key property: represent a set of local descriptors by a compact fixed-length vector → two images can be compared by comparing their Fisher vectors
     • Construction: describe an image with aggregated Fisher scores of its local descriptors
       – Local descriptor distribution: Gaussian Mixture Model (GMM)
       – Usually only Gaussian means are taken into account
     • Extension of Bag-of-Words technique [Sivic and Zisserman, ICCV’03]
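
The construction can be sketched as below, under stated assumptions: a diagonal-covariance GMM whose parameters are already trained, gradients taken with respect to the Gaussian means only, and a final L2 normalization. Refinements such as power normalization are left out. The function name `fisher_vector_means` and the toy GMM parameters are hypothetical.

```python
import numpy as np

def fisher_vector_means(descriptors, weights, means, sigmas):
    """Mean-only Fisher vector for a diagonal-covariance GMM.

    descriptors: (N, D) local descriptors of one image
    weights:     (K,)   mixture weights, summing to 1
    means:       (K, D) Gaussian means
    sigmas:      (K, D) per-dimension standard deviations
    returns:     (K * D,) L2-normalized vector
    """
    x = np.asarray(descriptors, dtype=float)
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(means, dtype=float)
    sd = np.asarray(sigmas, dtype=float)
    n = x.shape[0]
    # Offsets of each descriptor from each mean, scaled by sigma: (N, K, D).
    diff = (x[:, None, :] - mu[None, :, :]) / sd[None, :, :]
    # Log of (weight * Gaussian density) up to a constant shared by all Gaussians.
    log_prob = (np.log(w)[None, :]
                - 0.5 * np.sum(diff ** 2, axis=2)
                - np.sum(np.log(sd), axis=1)[None, :])
    # Soft-assignment posteriors: a numerically stable softmax over Gaussians.
    log_prob -= log_prob.max(axis=1, keepdims=True)
    post = np.exp(log_prob)
    post /= post.sum(axis=1, keepdims=True)
    # Posterior-weighted offsets aggregated per Gaussian: (K, D).
    grad = np.einsum("nk,nkd->kd", post, diff) / (n * np.sqrt(w)[:, None])
    fv = grad.ravel()
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv

# Toy GMM with K = 2 Gaussians in D = 2 dimensions (hypothetical parameters).
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [3.0, 3.0]])
sigmas = np.ones((2, 2))
fv_small = fisher_vector_means(rng.normal(size=(5, 2)), weights, means, sigmas)
fv_large = fisher_vector_means(rng.normal(size=(50, 2)), weights, means, sigmas)
print(fv_small.shape, fv_large.shape)
```

Both images yield a vector of length K × D regardless of how many local descriptors they contain, which is the fixed-length property the slide highlights.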

  15. Background: Fisher Vector (FV) [Perronnin and Dance, CVPR’07]
     [Figure: descriptor space with the descriptors of the query image, database image 1, and database image 2]
     Query FV:     -0.2   0.2  -0.3  -0.3  -0.3   0.8
     DB Im. 1 FV:  -0.3   0.3   0.3  -0.6  -0.3   0.3
     DB Im. 2 FV:   0.5  -0.2  -0.7   0.1  -0.6   0
     …
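
As a small worked example, the three vectors above can be ranked by cosine similarity, the FV comparison metric named later in the deck. The helper name `cosine_similarity` is illustrative.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Values copied from the slide.
query_fv = [-0.2, 0.2, -0.3, -0.3, -0.3, 0.8]
db_fvs = {
    "DB Im. 1": [-0.3, 0.3, 0.3, -0.6, -0.3, 0.3],
    "DB Im. 2": [0.5, -0.2, -0.7, 0.1, -0.6, 0.0],
}

scores = {name: cosine_similarity(query_fv, fv) for name, fv in db_fvs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

With these values, database image 1 scores about 0.60 and database image 2 about 0.21, so image 1 is ranked first.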

  16. Background: Binarized Fisher Vector (FV*) [Perronnin et al., CVPR’10]
     [Figure: descriptor space with the descriptors of the query image, database image 1, and database image 2]
     Query FV*:     0  0  0  0  0  1
     DB Im. 1 FV*:  0  1  1  0  0  1
     DB Im. 2 FV*:  1  0  0  1  0  0
     …
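
A minimal sketch of how binary descriptors like these might be produced and compared. Two assumptions are made that the slide does not state: binarization keeps one bit per dimension (1 for a positive entry, 0 otherwise), and binary vectors are compared with a plain Hamming distance. The helper names are illustrative.

```python
import numpy as np

def binarize(fv):
    """One bit per dimension: 1 where the entry is positive, else 0 (assumed rule)."""
    return (np.asarray(fv, dtype=float) > 0).astype(np.uint8)

def hamming_distance(a, b):
    """Number of positions where two bit vectors differ."""
    return int(np.count_nonzero(np.asarray(a) != np.asarray(b)))

# Bit vectors copied from the slide.
query_bits = [0, 0, 0, 0, 0, 1]
db1_bits = [0, 1, 1, 0, 0, 1]
db2_bits = [1, 0, 0, 1, 0, 0]

d1 = hamming_distance(query_bits, db1_bits)
d2 = hamming_distance(query_bits, db2_bits)

# Binarizing the real-valued FV of database image 1 from the previous slide.
db1_from_fv = binarize([-0.3, 0.3, 0.3, -0.6, -0.3, 0.3])
print(d1, d2, db1_from_fv)
```

Under these assumptions database image 1 is again the closer one (distance 2 versus 3), and binarizing its real-valued vector reproduces the bits shown for it on the slide.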

  17. Contribution 1
     Fisher Vector Comparisons
     • Asymmetric comparisons for Fisher vectors
     • Cluttered query or database images
     Fisher Vector Aggregation
     • Fisher vector descriptors for video segments
     • Compact database for large-scale retrieval
     Bloom Filter Aggregation
     • Bloom filter descriptors for video segments
     • Fast and accurate large-scale retrieval

  18. Asymmetric Image Comparison
     [Figure: query image and database image pairs for an object retrieval application and a video bookmarking application]
     How can we incorporate asymmetry in FV comparisons?

  19. Asymmetric Comparison for FV
     Fisher vector = [v_1, v_2, …, v_K]
     …
     Regions and have different statistics → features from are usually not present in

  20. Asymmetric Comparison for FV
     • FV comparison metric: cosine similarity
     • We want: θ1 < θ2
     • Common failure case: θ1 > θ2 but θ1’ < θ2
     • Insight: compare query and database based on their projections to the x-y plane (i.e., using only Gaussians visited by query)
     [Figure: vectors q, m, n, and m’ drawn on x, y, z axes]
     Legend: q = query; m = correct match in database; n = incorrect match in database; θ1 = angle(q, m); θ2 = angle(q, n); θ1’ = angle(q, m’)

  21. Asymmetric Comparison for FV
     [Figure: descriptor space with one image's descriptors; one Gaussian is not visited by this image]
     Original FV:   0.7   0.2  -0.5   0.2  -0.2   0.2
     Modified FV:   0.8   0.3  -0.5   0.3   0     0
     The entries of the unvisited Gaussian are zeroed and the remaining entries are re-normalized.
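
The zero-and-renormalize step, and the asymmetric comparison built on it, can be sketched as follows. Assumptions: the FV is laid out as K consecutive blocks of equal size (one per Gaussian), the set of Gaussians visited by an image is given as a boolean mask, and vectors are L2-normalized after masking. The function names are hypothetical. The slide's numbers look rounded for illustration, so the output here need not match them digit for digit.

```python
import numpy as np

def project_to_visited(fv, visited, dims_per_gaussian):
    """Zero the blocks of unvisited Gaussians, then L2-renormalize."""
    fv = np.asarray(fv, dtype=float)
    # Expand the per-Gaussian mask to one flag per FV dimension.
    mask = np.repeat(np.asarray(visited, dtype=bool), dims_per_gaussian)
    projected = np.where(mask, fv, 0.0)
    norm = np.linalg.norm(projected)
    return projected / norm if norm > 0 else projected

def asymmetric_similarity(query_fv, db_fv, query_visited, dims_per_gaussian):
    """Cosine similarity restricted to the Gaussians visited by the query."""
    q = project_to_visited(query_fv, query_visited, dims_per_gaussian)
    d = project_to_visited(db_fv, query_visited, dims_per_gaussian)
    return float(q @ d)

# The slide's example: 3 Gaussians with 2 dimensions each, third one unvisited.
original_fv = [0.7, 0.2, -0.5, 0.2, -0.2, 0.2]
modified_fv = project_to_visited(original_fv, visited=[True, True, False],
                                 dims_per_gaussian=2)

# Toy case (hypothetical values): the database image contains the query content
# plus clutter that falls in a Gaussian the query never visits.
query_fv = np.array([0.6, 0.8, 0.0, 0.0])
db_fv = np.array([0.3, 0.4, 0.5, 0.7])
plain = float(query_fv @ db_fv / (np.linalg.norm(query_fv) * np.linalg.norm(db_fv)))
asym = asymmetric_similarity(query_fv, db_fv, query_visited=[True, False],
                             dims_per_gaussian=2)
print(modified_fv, plain, asym)
```

In the toy case the plain cosine similarity is pulled down by the clutter dimensions, while the asymmetric score ignores them because the query defines the subspace in which the comparison happens.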

  22. Asymmetric Comparison for FV
     • Two retrieval problems
       – Query contained in database: the query image defines the projection. All database images are compared to the query based on the same subspace.
       – Database contained in query: the database image defines the projection.
         Problem: each database image is compared to the query based on different subspaces.
         Solution: introduce a weight to favor database images with more visited Gaussians.

  23. Dataset: Query Contained in Database
     [Figure: example query images and database images]
     Database: 200 reference entries and 9,800 distractor entries, each combined with clutter (from 0 to 40 clutter images)

  24. Dataset: Database Contained in Query
     [Figure: example query images and database images]
     Query: combined with clutter (from 0 to 40 clutter images)
     Database: 200 reference images and 9,800 distractor images

  25. Experiments: Asymmetric FV Comparisons
     [Plots: two panels, “Query contained in database” and “Database contained in query”, both with 2048 Gaussians. Each shows mAP (%) versus number of clutter images (log scale) for four curves: FV Asym., FV⋆ Asym., FV Baseline, FV⋆ Baseline. Each panel carries a “25%” annotation.]

  26. Contribution 2
     Fisher Vector Comparisons
     • Asymmetric comparisons for Fisher vectors
     • Cluttered query or database images
     Fisher Vector Aggregation
     • Fisher vector descriptors for video segments
     • Compact database for large-scale retrieval
     Bloom Filter Aggregation
     • Bloom filter descriptors for video segments
     • Fast and accurate large-scale retrieval
