
Interactive Image Mining. Annie Morin, Nguyen-Khang Pham. Presentation slides.



  1. Interactive Image Mining
     Annie Morin (1), Nguyen-Khang Pham (1,2)
     (1) TEXMEX/IRISA; (2) Cantho University, Vietnam

  2. Outline
     - Image retrieval and motivation
     - Examples
     - Image topic discovery
     - FCA and adaptation on images
     - Image retrieval
     - Image indexing using indicators of FCA
     - Numerical results
     - Conclusion and future work
     CARME 2011

  3. Image retrieval
     [Diagram: an image query is submitted to the image retrieval system, which searches the image database and returns a list of images sorted by decreasing similarity.]

  4. Starting point: Video Google [Sivic & Zisserman, 2003]
     - Interest point detection: Hessian-Affine detector [Mikolajczyk & Schmid, 2004]
     - SIFT computation [Lowe, 2004]: SIFT ∈ R^128
     - Quantization: k-means clustering of the SIFT descriptors into visual words
     - Result: each image is described by a bag of "visual words"

  5. SIFT descriptor ∈ R^128
     - SIFT = Scale-Invariant Feature Transform
     - 4 x 4 x 8 (directions) = 128 (dimensions)

  6. [Figure: SIFT descriptors in R^128 are grouped by clustering (for instance, k-means) into visual words.]
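
The clustering step can be sketched as follows. This is only a minimal illustration, assuming NumPy, random toy vectors standing in for real SIFT descriptors, and a plain Lloyd's k-means; the names `kmeans` and `quantize` are hypothetical and do not come from the slides.

```python
import numpy as np

def kmeans(descriptors, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids (the visual vocabulary)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct descriptors drawn at random.
    centroids = descriptors[rng.choice(len(descriptors), size=k, replace=False)]
    for _ in range(n_iter):
        labels = quantize(descriptors, centroids)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def quantize(descriptors, centroids):
    """Map each descriptor to the index of its nearest centroid (its visual word)."""
    # Squared Euclidean distances, shape (n_descriptors, k).
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Toy stand-in for SIFT descriptors: 200 random vectors in R^128.
rng = np.random.default_rng(42)
descriptors = rng.random((200, 128))
vocabulary = kmeans(descriptors, k=5)
words = quantize(descriptors, vocabulary)
```

Here `words[i]` is the visual-word index assigned to descriptor `i`; a real vocabulary would use thousands of centroids rather than 5.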

  7. Contingency table
     [Table layout: images (image 1, image 2, …, image M) crossed with visual words (word 1, word 2, word 3, …, word N).]
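
A contingency table of this kind could be built as in the sketch below. The orientation (rows = images, columns = visual words), the function name `contingency_table` and the toy input are assumptions made for illustration.

```python
def contingency_table(images_words, n_words):
    """Count how often each visual word occurs in each image."""
    table = [[0] * n_words for _ in images_words]
    for i, word_ids in enumerate(images_words):
        for w in word_ids:
            table[i][w] += 1
    return table

# Hypothetical toy input: visual-word indices found in 3 images,
# with a vocabulary of 4 visual words.
images_words = [
    [0, 0, 2],
    [1, 3, 3, 3],
    [2, 2, 0, 1],
]
table = contingency_table(images_words, n_words=4)
for row in table:
    print(row)
# [2, 0, 1, 0]
# [0, 1, 0, 3]
# [1, 1, 2, 0]
```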

  8. Data analysis techniques used
     - Video Google: tf*idf weighting
     - Probabilistic Latent Semantic Analysis (Sivic et al., 2005)
     - Our goal: replace these techniques with Factorial Correspondence Analysis (FCA)
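
For reference, tf*idf weighting of an image/visual-word count table might look like the sketch below. It uses the common variant based on document frequency; the slides do not say which variant was used, so the formula, the function name `tfidf` and the toy table are assumptions.

```python
import math

def tfidf(table):
    """Weight a count table (rows = images, columns = visual words) by tf*idf."""
    n_images = len(table)
    n_words = len(table[0])
    # Document frequency: number of images in which each word appears.
    df = [sum(1 for row in table if row[j] > 0) for j in range(n_words)]
    weighted = []
    for row in table:
        total = sum(row)  # number of visual words in this image
        weighted.append([
            (row[j] / total) * math.log(n_images / df[j]) if row[j] > 0 else 0.0
            for j in range(n_words)
        ])
    return weighted

# Hypothetical toy counts: 3 images, 4 visual words.
table = [
    [2, 0, 1, 0],
    [0, 1, 0, 3],
    [1, 1, 2, 0],
]
weights = tfidf(table)
```

A word that occurs in few images (here word 3, present only in the second image) receives a larger weight than a word spread over many images.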

  9. Motivation
     - Factorial Correspondence Analysis has been applied successfully to textual data

  10. Some examples: the Caltech4 database
      - 4090 images divided into 5 categories
        - faces: 435
        - motorbikes: 800
        - airplanes: 800
        - cars (rear): 1155
        - backgrounds: 900
      - Vocabulary: 2224 visual words (Sivic et al., 2005)

  11. Caltech4 dataset

  12. Factorial Correspondence Analysis

  13. Factorial Correspondence Analysis
      [Figure: images and visual words projected on a factorial plane. Annotations: an image with all its visual words; the characteristic visual words for this image in this plane.]
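
The joint projection of images and visual words can be illustrated with a textbook correspondence analysis computed by singular value decomposition. This is a generic sketch assuming NumPy and a toy count table without empty rows or columns; it is not the authors' implementation, and `correspondence_analysis` is a hypothetical name.

```python
import numpy as np

def correspondence_analysis(table, n_axes):
    """Return image coordinates, word coordinates and eigenvalues of a simple CA."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                  # correspondence matrix
    r = P.sum(axis=1)                # row (image) masses
    c = P.sum(axis=0)                # column (visual word) masses
    # Standardised residuals with respect to the independence model r c^T.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    U, sigma, Vt = U[:, :n_axes], sigma[:n_axes], Vt[:n_axes, :]
    F = (U * sigma) / np.sqrt(r)[:, None]      # principal coordinates of images
    G = (Vt.T * sigma) / np.sqrt(c)[:, None]   # principal coordinates of words
    return F, G, sigma ** 2

# Hypothetical toy counts: 4 images (rows) by 4 visual words (columns).
table = [
    [4, 0, 1, 0],
    [3, 1, 0, 0],
    [0, 1, 0, 5],
    [0, 0, 2, 4],
]
F, G, eigenvalues = correspondence_analysis(table, n_axes=2)
```

Plotting the first two columns of `F` and `G` together gives a factorial plane in which images and visual words share the same axes.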

  14. Cars

  15. Motorbikes

  16. Faces

  17. Airplanes

  18. Backgrounds

  19. Image information extraction
      [Figure legend: blue = positive, red = negative]

  20. Example: categorization of images
      - Alogic database (961 images, 1000 visual words)

  21. Categorization of images

  22. Numerical results: databases
      - Caltech4 database
        - 4090 images divided into 5 categories
        - Vocabulary: 2224 visual words
      - Nistér database
        - 2550 groups of 4 images
        - 10200 images in total
        - 5000 visual words

  23. Nistér dataset

  24. Experimental results: methods comparison

  25. Experimental results: exhaustive search, Nistér-Stewénius dataset

  26. Speeding up the retrieval by using image categorization
      - One factorial axis → two different topics
      - One image may belong to several topics
      - Topics determination:
        - Contribution to the inertia of an axis
        - Quality of representation on an axis

  27. Image indexing using FCA
      - Hypothesis:
        - Two similar images share some common properties
        - Indicators of FCA have relevant properties
      - Representation quality of images on axes
        - An image is well represented on some axes
      - Contribution to the inertia of axes
        - An image highly contributes to the inertia of some axes
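
The two indicators can be computed from image masses and factorial coordinates. The sketch below uses the usual correspondence-analysis definitions (contribution = mass × squared coordinate / eigenvalue of the axis; quality = squared cosine with the axis), which the slides do not spell out. The function name and the toy values are hypothetical, and the quality is computed over the kept axes only, which is an approximation when axes have been dropped.

```python
import numpy as np

def ca_indicators(masses, coords):
    """Contribution and quality of representation of each image on each axis.

    masses: shape (n_images,), image masses summing to 1.
    coords: shape (n_images, n_axes), principal coordinates on the kept axes.
    """
    inertia = masses[:, None] * coords ** 2   # inertia of image i along axis a
    eigenvalues = inertia.sum(axis=0)         # total inertia carried by each axis
    contribution = inertia / eigenvalues      # each column sums to 1
    quality = coords ** 2 / (coords ** 2).sum(axis=1, keepdims=True)  # rows sum to 1
    return contribution, quality

# Hypothetical toy values: 4 images with equal masses, 2 axes.
masses = np.array([0.25, 0.25, 0.25, 0.25])
coords = np.array([
    [ 0.9,  0.1],
    [-0.8,  0.2],
    [ 0.1, -0.7],
    [-0.2,  0.4],
])
contribution, quality = ca_indicators(masses, coords)
```

In this toy example the first image is almost entirely described by the first axis (`quality[0, 0]` is close to 1), while the third image is described mostly by the second axis.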

  28. Image indexing using FCA
      - Inverted file for a property: contains the images which possess this property
      - For each axis we construct two inverted files: negative part / positive part
      - Each file contains the images well represented on this axis (or the images which highly contribute to the inertia of this axis)
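
Building the two inverted files per axis could look like the following sketch. The quality threshold, the `(axis, sign)` key layout and the toy values are assumptions made for illustration; the slides do not state how "well represented" is decided.

```python
def build_inverted_files(coords, quality, threshold):
    """Map (axis, sign) -> list of image ids well represented on that side."""
    n_axes = len(coords[0])
    inverted = {(a, s): [] for a in range(n_axes) for s in ("+", "-")}
    for image_id, (row, q_row) in enumerate(zip(coords, quality)):
        for a in range(n_axes):
            if q_row[a] >= threshold:           # image is well represented on axis a
                sign = "+" if row[a] >= 0 else "-"
                inverted[(a, sign)].append(image_id)
    return inverted

# Hypothetical toy data: 5 images, 2 axes (coordinates and rounded qualities).
coords = [[0.9, 0.1], [-0.8, 0.2], [0.1, -0.7], [-0.2, 0.4], [0.6, 0.5]]
quality = [[0.99, 0.01], [0.94, 0.06], [0.02, 0.98], [0.20, 0.80], [0.59, 0.41]]
inverted = build_inverted_files(coords, quality, threshold=0.4)
print(inverted)
# {(0, '+'): [0, 4], (0, '-'): [1], (1, '+'): [3, 4], (1, '-'): [2]}
```

The last image appears in two files, which matches the idea that one image may belong to several topics.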

  29. Image indexing using FCA
      - Retrieval scheme, for a given image query:
        - Sort the axes by representation quality (or by contribution to the inertia)
        - Determine the relevant properties of the query (i.e. the first few axes) and take the corresponding inverted files
        - Merge the inverted files by majority vote → candidate list
        - Search in the candidate list
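
The retrieval steps above can be sketched as follows. The parameters `n_axes_used` and `vote_threshold`, the Euclidean distance used for the final search, and the toy data are all assumptions; the slides only describe the scheme in words.

```python
from collections import Counter
import math

def retrieve(query_coords, query_quality, inverted, coords,
             n_axes_used, vote_threshold, top_k):
    # 1. Sort the axes by the query's quality of representation, best first.
    axes = sorted(range(len(query_quality)),
                  key=lambda a: query_quality[a], reverse=True)
    # 2. Keep the first axes and take the inverted file on the query's side.
    votes = Counter()
    for a in axes[:n_axes_used]:
        sign = "+" if query_coords[a] >= 0 else "-"
        votes.update(inverted.get((a, sign), []))
    # 3. Vote: keep the images found in at least vote_threshold selected files.
    candidates = [img for img, v in votes.items() if v >= vote_threshold]
    # 4. Exhaustive search restricted to the candidate list.
    candidates.sort(key=lambda img: math.dist(query_coords, coords[img]))
    return candidates[:top_k]

# Hypothetical toy index: 5 images, 2 axes, one inverted file per (axis, sign).
coords = [[0.9, 0.1], [-0.8, 0.2], [0.1, -0.7], [-0.2, 0.4], [0.6, 0.5]]
inverted = {(0, "+"): [0, 4], (0, "-"): [1], (1, "+"): [3, 4], (1, "-"): [2]}
query_coords = [0.7, 0.4]
query_quality = [0.75, 0.25]

result = retrieve(query_coords, query_quality, inverted, coords,
                  n_axes_used=2, vote_threshold=1, top_k=2)
print(result)  # [4, 0]
```

Raising `vote_threshold` shrinks the candidate list (here to image 4 alone when set to 2), trading recall for speed.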

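  30. Topics determination and inverted files
      [Figure: the coordinates z_iα of the images in the factorial space (image i, axis α) are turned into topic memberships, from which the inverted files are built.]

      | Image   | F1+ | F1- | F2+ | F2- |
      |---------|-----|-----|-----|-----|
      | image 1 | 1   | 0   | 0   | 1   |
      | image 2 | 0   | 1   | 1   | 0   |
      | image 3 | 0   | 0   | 0   | 1   |
      | image 4 | 1   | 0   | 1   | 0   |
      | image 5 | 1   | 0   | 0   | 1   |

      Inverted files:
      - F1+: images 1, 4, 5
      - F1-: image 2
      - F2+: images 2, 4
      - F2-: images 1, 3, 5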

  31. Experimental results
      - Approximate search with inverted files
        - Method 1: inverted files based on the contribution
        - Method 2: inverted files based on the quality of representation
      - Exhaustive search

  32. FCA with indexing technique: precision and time results (Caltech4 dataset)

      | Method                                | #images | 5 img | 10 img | 50 img | 100 img | Time (ms) |
      |---------------------------------------|---------|-------|--------|--------|---------|-----------|
      | FCA, exhaustive search                | 4090    | 95.92 | 94.61  | 91.35  | 89.64   | 0.50      |
      | Representation quality-based indexing | 984     | 95.97 | 94.63  | 91.23  | 89.43   | 0.15      |
      | Contribution-based indexing           | 791     | 95.85 | 94.47  | 91.01  | 88.93   | 0.13      |
      | tf*idf, exhaustive search             | 4090    | 88.24 | 84.81  | 77.52  | 73.72   | 2.94      |

      - #images: size of the candidate list
      - Number of axes kept: 15
      - 3.5 times faster than FCA without indexing
      - 20 times faster than tf*idf

  33. FCA with indexing technique (Nistér dataset)

      | Method                                       | #images | Precision | Time (ms) |
      |----------------------------------------------|---------|-----------|-----------|
      | FCA, exhaustive search                       | 10200   | 79.82     | 12.01     |
      | FCA with indexing (nf = 21, nf_thres = 11)   | 650     | 79.75     | 1.04      |
      | FCA with indexing (nf: auto, nf_thres: auto) | 397     | 79.96     | 1.33      |
      | tf*idf, exhaustive search                    | 10200   | 73.04     | 36.45     |

      - #images: size of the candidate list
      - Precision: precision at the first 4 returned images
      - 10 times faster than FCA without indexing
      - 30 times faster than tf*idf
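
The precision measure used here, precision over the first 4 returned images, can be illustrated with a short sketch. The image identifiers and the helper name `precision_at_k` are hypothetical.

```python
def precision_at_k(returned, relevant, k):
    """Fraction of the first k returned images that are relevant."""
    top = returned[:k]
    return sum(1 for img in top if img in relevant) / k

# Hypothetical query on a dataset organised in groups of 4 images of one object.
relevant = {"obj7_a", "obj7_b", "obj7_c", "obj7_d"}
returned = ["obj7_a", "obj7_c", "obj2_b", "obj7_d", "obj9_a"]
p4 = precision_at_k(returned, relevant, k=4)
print(p4)  # 0.75
```

Averaging this value over all queries and expressing it as a percentage would give figures on the scale of the table above.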

  34. Results: acceleration gain

      | Method                                                | min | max  | mean |
      |-------------------------------------------------------|-----|------|------|
      | Inverted files based on the contribution              | 2.7 | 17.3 | 6.9  |
      | Inverted files based on the quality of representation | 3.2 | 13.8 | 6.7  |

  35. Parallelization of the filtering step
      - Filtering of non-relevant images
        - Use inverted files to filter out the images sharing few topics with the query
      - Refinement step
        - Sequential search among the candidate images
      [Figure: computation time distribution for the filtering step and the refinement step for 1 million images; most of the time is spent on filtering.]
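
One way to see why the filtering step parallelizes well is to write it as a matrix-vector product: each image's number of shared topics is computed independently of the others. The sketch below assumes NumPy; the membership table is a toy one modelled on the binary topic table of slide 30, while the coordinates, the query and the function name are hypothetical. It runs on the CPU and only illustrates the data-parallel structure, not the authors' GPU code.

```python
import numpy as np

def filter_then_refine(membership, query_files, coords, query_coords,
                       min_shared, top_k):
    """Two-step search: data-parallel filtering, then sequential refinement."""
    # Filtering: number of inverted files (topics) each image shares with the
    # query. Every image is scored independently of the others.
    shared = membership.astype(int) @ query_files.astype(int)
    candidates = np.flatnonzero(shared >= min_shared)
    # Refinement: distance computation restricted to the surviving candidates.
    distances = np.linalg.norm(coords[candidates] - query_coords, axis=1)
    return candidates[np.argsort(distances)][:top_k]

# Toy membership table: 5 images by 4 inverted files (F1+, F1-, F2+, F2-).
membership = np.array([
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
], dtype=bool)
# Hypothetical coordinates on two axes, consistent with the signs above.
coords = np.array([[0.8, -0.5], [-0.7, 0.6], [0.1, -0.9], [0.5, 0.6], [0.5, -0.8]])
query_files = np.array([1, 0, 0, 1], dtype=bool)   # query belongs to F1+ and F2-
query_coords = np.array([0.7, -0.6])

result = filter_then_refine(membership, query_files, coords, query_coords,
                            min_shared=2, top_k=2)
print(result.tolist())  # [0, 4], i.e. image 1 and image 5
```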

  36. Results with the Nistér-Stewénius database + 1 million images from Flickr

      | Method           | P@3   | Response time (ms) | Acceleration gain |
      |------------------|-------|--------------------|-------------------|
      | Exhaustive       | 0.623 | 860.88             | -                 |
      | Filtering on CPU | 0.625 | 79.99              | 10.76             |
      | Filtering on GPU | 0.625 | 7.53               | 114.33            |

      - Method with parallel filtering on GPU:
        - 10 times faster than the method without parallelization of the filtering step
        - 100 times faster than the exhaustive search
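
The acceleration gains in this table are consistent with dividing the exhaustive response time by each method's response time, as the short check below shows. Reading the gain this way is an assumption, but it reproduces the reported values.

```python
# Response times in milliseconds, taken from the table above.
times_ms = {
    "Exhaustive": 860.88,
    "Filtering on CPU": 79.99,
    "Filtering on GPU": 7.53,
}
baseline = times_ms["Exhaustive"]
gains = {method: round(baseline / t, 2) for method, t in times_ms.items()}
print(gains["Filtering on CPU"])  # 10.76
print(gains["Filtering on GPU"])  # 114.33
```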

  37. Conclusion
      - Adaptation of FCA on images for indexing and retrieval
      - Hierarchical CA to focus on some areas of a factorial plane

  38. Work of J.B. Sivic and A. Zisserman using probabilistic latent semantic analysis and visual words in images
