Learning Object Categories from Google's Image Search, R. Fergus et al. - PowerPoint PPT Presentation


  1. Learning Object Categories from Google’s Image Search. R. Fergus et al. Presented by Jie Xiao, Dept. of Computer Science, Univ. of Texas at San Antonio

  2. Outline: Motivation; “Bag of words” Model; Approaches (pLSA, ABS-pLSA, TSI-pLSA); Dataset; Experiment; Conclusion. jxiao@cs.utsa.edu

  3. Motivation: Current approaches to object categorization require a manually labeled dataset as the training set. Collecting the data is time-consuming and involves a great deal of human work. Finding good examples is another concern.

  4. Bag of Words Model. Two example documents, each shown with its bag of words:
     Document 1: Of all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. For a long time it was thought that the retinal image was transmitted point by point to visual centers in the brain; the cerebral cortex was a movie screen, so to speak, upon which the image in the eye was projected. Through the discoveries of Hubel and Wiesel we now know that behind the origin of the visual perception in the brain there is a considerably more complicated course of events. By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a step-wise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image.
     Bag of words 1: sensory, brain, visual, perception, retinal, cerebral cortex, eye, cell, optical nerve, image, Hubel, Wiesel
     Document 2: China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures are likely to further annoy the US, which has long argued that China's exports are unfairly helped by a deliberately undervalued yuan. Beijing agrees the surplus is too high, but says the yuan is only one factor. Bank of China governor Zhou Xiaochuan said the country also needed to do more to boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value.
     Bag of words 2: China, trade, surplus, commerce, exports, imports, US, yuan, bank, domestic, foreign, increase, trade, value
     Slide credit: Rob Fergus

  5. Bag of Words Model. LSA: a singular value decomposition (SVD) process; U and V are orthonormal matrices. pLSA: probabilistic LSA.
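The slide's equation did not survive extraction. As a reminder of the standard form (not copied from the slide), LSA approximates the word-document co-occurrence matrix by a truncated SVD. The symbols \Sigma_K and the rank K are conventional notation assumed here.

```latex
N \approx U_K \, \Sigma_K \, V_K^{\top},
\qquad U_K^{\top} U_K = V_K^{\top} V_K = I_K
```

Here \Sigma_K is diagonal and holds the K largest singular values, so each document is described by K latent coordinates rather than a full word histogram.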

  6. Bag of Words Model -- pLSA. D: set of documents. W: visual words. Z: topics. The latent variable z is associated with each pair (w, d). Matrix N (M × N): co-occurrence table of words and documents. N(w, d): the number of times word w appears in document d.
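A minimal sketch of how the co-occurrence table N(w, d) could be built, assuming each document (image) is already given as a list of integer visual-word ids. The function name `cooccurrence` and the toy data are illustrative, not from the slides.

```python
from collections import Counter

def cooccurrence(docs, vocab_size):
    """Return a table where table[w][d] is the number of times word w occurs in document d."""
    table = [[0] * len(docs) for _ in range(vocab_size)]
    for d, words in enumerate(docs):
        for w, count in Counter(words).items():
            table[w][d] = count
    return table

# Two toy documents over a vocabulary of four visual words.
docs = [[0, 2, 2, 1], [2, 2, 3]]
n = cooccurrence(docs, vocab_size=4)
```

Rows index words and columns index documents, matching the N(w, d) notation on the slide.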

  7. Bag of Words Model – pLSA (Cont.) Co-occurrence of words within a topic; density of topics in a given document.

  8. Bag of Words Model – pLSA (Cont.) Topic-specific word distribution; document-specific mixing proportions.
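The equations on these slides were images and are missing from the transcript. In the standard pLSA formulation (assumed here, not recovered from the slide), the topic-specific word distribution is P(w | z), the document-specific mixing proportion is P(z | d), and the two combine as follows:

```latex
P(w \mid d) = \sum_{z \in Z} P(w \mid z)\, P(z \mid d),
\qquad
L = \prod_{d \in D} \prod_{w \in W} P(w \mid d)^{\,N(w,d)}
```

Fitting the model means choosing P(w | z) and P(z | d) to maximize the likelihood L of the observed counts N(w, d).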

  9. Bag of Words Model – pLSA (Cont.)

  10. Bag of Words Model – pLSA (Cont.) The parameters are calculated by EM, alternating an E step and an M step.
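The E-step and M-step formulas are missing from the transcript. The sketch below follows the textbook pLSA updates rather than the slide itself. The E step computes the posterior P(z | w, d), proportional to P(w | z) P(z | d). The M step re-estimates both distributions from the expected counts. The names `plsa`, `p_w_z`, and `p_z_d` are illustrative, and plain Python lists are used to keep the example dependency-free.

```python
import random

def plsa(n, num_topics, iters=50, seed=0):
    """Fit pLSA by EM on a count table n[w][d]; returns (p_w_z, p_z_d)."""
    rng = random.Random(seed)
    W, D, Z = len(n), len(n[0]), num_topics

    def random_distribution(k):
        v = [rng.random() + 1e-3 for _ in range(k)]
        s = sum(v)
        return [x / s for x in v]

    # p_w_z[z][w] = P(w|z) and p_z_d[d][z] = P(z|d), randomly initialized.
    p_w_z = [random_distribution(W) for _ in range(Z)]
    p_z_d = [random_distribution(Z) for _ in range(D)]

    for _ in range(iters):
        acc_wz = [[0.0] * W for _ in range(Z)]
        acc_zd = [[0.0] * Z for _ in range(D)]
        for d in range(D):
            for w in range(W):
                if n[w][d] == 0:
                    continue
                # E step: unnormalized posterior over topics for this (w, d) pair.
                post = [p_w_z[z][w] * p_z_d[d][z] for z in range(Z)]
                total = sum(post)
                if total == 0.0:
                    continue
                for z in range(Z):
                    r = n[w][d] * post[z] / total  # expected count for topic z
                    acc_wz[z][w] += r
                    acc_zd[d][z] += r
        # M step: normalize the expected counts into distributions.
        p_w_z = [[x / (sum(row) or 1.0) for x in row] for row in acc_wz]
        p_z_d = [[x / (sum(row) or 1.0) for x in row] for row in acc_zd]
    return p_w_z, p_z_d

# Toy table: words 0-1 occur only in document 0, words 2-3 only in document 1.
counts = [[4, 0], [3, 0], [0, 5], [0, 2]]
p_w_z, p_z_d = plsa(counts, num_topics=2, iters=30)
```

EM only reaches a local optimum, so in practice the result depends on the random initialization.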

  11. Bag of Words Model (Cont.) Figure: an object and its bag of words. Slide credit: Rob Fergus

  12. Bag of Words Model (Cont.) Representation: 1. feature detection & representation; 2. codewords dictionary; 3. image representation. Slide credit: Rob Fergus
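Steps 2 and 3 can be sketched as follows, assuming the codewords dictionary already exists (for example from a clustering step such as k-means, which is an assumption here) and that each detected feature is a numeric descriptor vector. The function names and the two-dimensional toy descriptors are illustrative only.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sqdist(descriptor, codebook[k]))

def bag_of_words(descriptors, codebook):
    """Histogram of codeword occurrences: the image representation."""
    hist = [0] * len(codebook)
    for desc in descriptors:
        hist[nearest_codeword(desc, codebook)] += 1
    return hist

codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.2], [0.2, 0.9], [1.1, 0.8]]
hist = bag_of_words(descriptors, codebook)
```

The histogram discards where each feature was found, which is the limitation the spatial variants of pLSA address.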

  13. Approach: ABS-pLSA. Quantize the location within the image into one of X bins. Use the joint word-location distribution P(w, x | z) instead of P(w | z).
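One way to picture the quantization is shown below. A regular grid over the image is assumed (the slide does not say how the X bins are laid out), and the word id and location bin are folded into a single index so that an ordinary pLSA routine can treat each (word, bin) pair as one token. All names are hypothetical.

```python
def location_bin(x, y, width, height, grid=3):
    """Map an absolute pixel position to one of grid*grid bins (row-major order)."""
    col = min(int(x * grid / width), grid - 1)
    row = min(int(y * grid / height), grid - 1)
    return row * grid + col

def joint_index(word, loc_bin, num_bins):
    """Combine a visual word id and a location bin into a single (w, x) index."""
    return word * num_bins + loc_bin

b = location_bin(x=250, y=40, width=300, height=200, grid=3)
idx = joint_index(word=7, loc_bin=b, num_bins=9)
```

Because the bins are defined in absolute image coordinates, the same object shifted within the frame produces different tokens.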

  14. Approach (Cont.) TSI-pLSA: Introduces a latent variable, c, that represents the centroid of the object. Locations are quantized into foreground bins and a single background bin.
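A rough sketch of binning relative to a hypothesized object position follows. It assumes the hypothesis c is a box given by a centre and a size, with a regular grid of foreground bins inside the box and one background bin outside it. The box parameterization, the grid layout, and the function name are assumptions made for illustration.

```python
def relative_bin(x, y, cx, cy, box_w, box_h, grid=3):
    """Bin of (x, y) relative to a box centred at (cx, cy).

    Returns 0 for the background bin and 1..grid*grid for foreground bins.
    """
    u = (x - (cx - box_w / 2)) / box_w  # horizontal position across the box, 0..1 inside
    v = (y - (cy - box_h / 2)) / box_h  # vertical position across the box, 0..1 inside
    if not (0 <= u < 1 and 0 <= v < 1):
        return 0
    return 1 + int(v * grid) * grid + int(u * grid)

inside = relative_bin(120, 95, cx=100, cy=100, box_w=60, box_h=60)
shifted = relative_bin(220, 195, cx=200, cy=200, box_w=60, box_h=60)
outside = relative_bin(10, 10, cx=100, cy=100, box_w=60, box_h=60)
```

Moving the feature and the box centre by the same offset leaves the bin unchanged, which is the translation invariance that the latent variable c is meant to provide.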

  15. Approach (Cont.)

  16. Datasets. PT: prepared training set, manually gathered. P: prepared test set. G: raw data downloaded from Google Image Search. Good images: good examples, related to the keyword category. Intermediate images: related to the keyword category, but of lower quality than good images. Junk images: totally unrelated to the keyword category.

  17. Datasets (Cont.) V: Google validation set. The images from the first pages of results are assumed to be positive examples. Cross-language collections.

  18. Datasets (Cont.)

  19. Datasets (Cont.) Statistics

  20. Experiments. Processing steps: convert to grayscale; resize to a moderate size; detect regions; represent each region by a SIFT descriptor; quantize the descriptor vectors.
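The first two steps can be sketched without any imaging library, assuming an image stored as rows of (r, g, b) tuples. The luminance weights are the common ITU-R BT.601 values and nearest-neighbour resampling is used for simplicity. Both are assumptions, since the slides do not state which conversion or interpolation was used. Region detection and SIFT description are not shown.

```python
def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to luminance values (BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]

def resize_nearest(img, new_w, new_h):
    """Nearest-neighbour resize of a 2-D list to new_w columns and new_h rows."""
    h, w = len(img), len(img[0])
    return [[img[j * h // new_h][i * w // new_w] for i in range(new_w)]
            for j in range(new_h)]

rgb = [[(255, 255, 255), (0, 0, 0)],
       [(255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(rgb)
big = resize_nearest(gray, new_w=4, new_h=4)
```

Resizing every image to a similar size keeps the number of detected regions comparable across images.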

  21. Experiments – region detector. Region detectors: Kadir & Brady saliency operator; multi-scale Harris detector; difference of Gaussians; edge-based operator.

  22. Experiments (Cont.)

  23. Experiments (Cont.)

  24. Experiments (Cont.)

  25. Experiments (Cont.) Plot legend. Red: pLSA. Green: ABS-pLSA. Blue: TSI-pLSA. Solid line: performance of the automatically chosen topic within the model. Dashed line: performance of the best topic within the model.

  26. Discussion: limited categories; prior knowledge about the number of categories; image background; similar visual words.

  27. Conclusion: Introduces spatial information into pLSA. Learns an object category from the category name.

  28. Thank you!
