Harbin Institute of Technology Microsoft Research Asia Microsoft - PowerPoint PPT Presentation

ACM MM 2010. Dong Liu, Xian-Sheng Hua, Meng Wang and Hong-Jiang Zhang. Harbin Institute of Technology; Microsoft Research Asia; Microsoft Advanced Technology Center.


  1. ACM MM 2010. Dong Liu, Xian-Sheng Hua, Meng Wang and Hong-Jiang Zhang. Harbin Institute of Technology; Microsoft Research Asia; Microsoft Advanced Technology Center.

  2. Example social images and their tags: "medici chapel, Firenze, Italy..."; "Loggia dei lanzi, sword, honeymoon, ..."; "Status, building, sky, Italy, ..."; "Cathedral, tower, Italy...".

  3. Social tags are good, but they are noisy, ambiguous and incomplete, and they carry no relevance information. Two directions to improve tag quality: tag ranking (Liu, Hua, Zhang. Tag Ranking. WWW 09) and retagging (Liu, Hua, Wang, Zhang. Image Retagging. MM 10).

  4. Tags associated with social images are imprecise, subjective and incomplete. Example: an image tagged "kitty, top101, boy, young, lovely" illustrates imprecise tags and subjective tags, while its missing tags are "grass, flower, cat, animal".

  5. What we are going to do: improve the quality of the tags so that they better describe the image content. Applications to improve: tag-based image search and image annotation (automatic tagging). Example tags shown: top 101, dog, tour, house, tiger, tree, sweet, sky, big, ground, cloud, cloud.

  6. But how can we do it? Automatically.

  7. Two assumptions. Visual and semantic consistency: similar images have similar tags (example tags: river, bear, bear, water, animal, wildlife, bath, nature). Prior knowledge: user-provided tags correlate with the image content with high probability (example tags: Nikon, power, cat, boy, animal, zoo, garden, tiger, rabit, father).

  8. Tag refinement: the consistency between visual similarity and semantic similarity should be maximized, and the deviation from the initially user-provided tags should be minimized.

  9. Notations

  10. Modeling the basic assumptions: (1) visual and semantic consistency; (2) user-provided tags are relevant with high probability. Both are combined in the overall formulation.

  11. Optimizing with iterative updating: bound the objective function, derive the solution, and update iteratively until convergence.
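
The formulas for slides 8 to 11 did not survive in this transcript, so the following is only a minimal sketch of the idea, not the authors' formulation or their bound-based update. It assumes a squared-error consistency term between a visual similarity matrix `W` and the tag-induced similarity `Y S Yᵀ`, a squared deviation from the initial tags `Y0` weighted by `C`, and plain projected gradient descent with step halving as a stand-in optimizer. All names (`W`, `S`, `Y0`, `C`, `refine_tags`) and the toy data are hypothetical.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def objective(Y, W, S, Y0, C):
    """Consistency term plus C times the deviation term (assumed form)."""
    P = matmul(matmul(Y, S), transpose(Y))  # image similarity induced by the tags
    consistency = sum((p - w) ** 2 for rp, rw in zip(P, W) for p, w in zip(rp, rw))
    deviation = sum((y - y0) ** 2 for ry, r0 in zip(Y, Y0) for y, y0 in zip(ry, r0))
    return consistency + C * deviation

def refine_tags(W, S, Y0, C=1.0, step=0.1, iters=100):
    """Projected gradient descent; a step is kept only if the objective drops."""
    Y = [[float(v) for v in row] for row in Y0]
    best = objective(Y, W, S, Y0, C)
    for _ in range(iters):
        P = matmul(matmul(Y, S), transpose(Y))
        R = [[p - w for p, w in zip(rp, rw)] for rp, rw in zip(P, W)]  # residual
        A = matmul(matmul(R, Y), transpose(S))
        B = matmul(matmul(transpose(R), Y), S)
        cand = []
        for i in range(len(Y)):
            row = []
            for k in range(len(Y[0])):
                grad = 2 * (A[i][k] + B[i][k]) + 2 * C * (Y[i][k] - Y0[i][k])
                row.append(min(1.0, max(0.0, Y[i][k] - step * grad)))  # keep in [0, 1]
            cand.append(row)
        value = objective(cand, W, S, Y0, C)
        if value < best:
            Y, best = cand, value
        else:
            step *= 0.5  # overshoot: shrink the step and retry
    return Y

# Toy data: images 0 and 1 look alike, image 2 differs; tags are ("cat", "car").
W = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.1], [0.1, 0.1, 1.0]]
S = [[1.0, 0.0], [0.0, 1.0]]
Y0 = [[1, 0], [0, 0], [0, 1]]  # image 1 was uploaded without any tag
Y = refine_tags(W, S, Y0)
```

In this toy run image 1 has no initial tags, yet its "cat" score rises because it is visually close to image 0, while the deviation term keeps the other scores near their initial values.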

  12. Is it reliable?

  13. Content-related tags describe the real visual content of the images and are informative for all general users. Content-unrelated tags describe contextual information about the images and are only informative to the image owners. Example tags shown across the two columns: baby, night, photo, macro, beach, fun, cat, grass, ocean, science, my best, Nikon, flower, old, animal, autumn, raw, bike, deleteme, dog, top101, sunset, bird, live.

  14. Recall: similar images have similar tags. This assumption is only applicable to "content-related" tags. Involving the content-unrelated tags will introduce a lot of noise and degrade the algorithmic performance, so these tags should be removed from the automatic learning procedure.

  15. Filter out all content-unrelated tags. Construct a content-related tag dictionary by using lexical and domain knowledge: traverse along the path until one pre-defined category is matched. All words are split into non-nouns (content-unrelated) and nouns; a noun is content-related if its path reaches one of the categories Organism, Artifact, Natural Pho., Thing or Color. Example paths: kitty -> feline -> mammal -> animal -> organism; building -> structure -> artifact.
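
The traversal described on this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the `HYPERNYM` map is hand-written toy data standing in for a lexical database, the category set is only the subset of slide categories whose names are fully legible here, and a word missing from the toy lexicon is treated as a non-noun.

```python
# Toy child -> parent hypernym links (hypothetical data, not real lexicon output).
HYPERNYM = {
    "kitty": "feline", "feline": "mammal", "mammal": "animal", "animal": "organism",
    "building": "structure", "structure": "artifact",
    "honeymoon": "vacation", "vacation": "leisure",
}

# Subset of the pre-defined categories named on the slide (assumed spelling).
CONTENT_CATEGORIES = {"organism", "artifact", "thing", "color"}

def is_content_related(tag):
    """Walk up the hypernym path until a pre-defined category is matched."""
    node = tag.lower()
    if node not in HYPERNYM and node not in CONTENT_CATEGORIES:
        return False  # unknown to the toy lexicon: treated as a non-noun
    seen = set()
    while node not in seen:  # guard against accidental cycles in the data
        if node in CONTENT_CATEGORIES:
            return True
        seen.add(node)
        if node not in HYPERNYM:
            return False  # reached a root without matching any category
        node = HYPERNYM[node]
    return False

def filter_tags(tags):
    """Keep only the tags judged content-related."""
    return [t for t in tags if is_content_related(t)]

kept = filter_tags(["kitty", "building", "lovely", "top101", "honeymoon"])
```

Here "honeymoon" is a noun in the toy lexicon but its path never reaches a category, so it is dropped together with "lovely" and "top101".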

  16. Is it enough?

  17. Example: images tagged only "kitty" lack its synonyms (kitten, cat, pussy) and hypernyms (feline, animal). Missing such tags will degrade the performance of tag-based applications.

  18. Make use of the WordNet lexicon. Hypernym path of kitty: domestic cat, feline, vertebrate, chordate, animal, organism. Synonym set: cat, kitty, kitten, kitty-cat, pussy, pussycat. Use each candidate tag to perform tag-based image search on Flickr; the tags with more than 10,000 returned images are retained.
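
A minimal sketch of this enrichment step, assuming the synonym and hypernym candidates have already been looked up. The candidate lists below copy part of the slide's "kitty" example, while every number in `HIT_COUNTS` is invented for illustration and stands in for the count a Flickr tag search would return. Keeping the original tags unconditionally is also an assumption.

```python
SYNONYMS = {"kitty": ["cat", "kitten", "kitty-cat", "pussycat"]}
HYPERNYMS = {"kitty": ["domestic cat", "feline", "vertebrate", "chordate",
                       "animal", "organism"]}

# Hypothetical search-result counts; a real system would query the photo site.
HIT_COUNTS = {
    "cat": 2_000_000, "kitten": 500_000, "kitty-cat": 800, "pussycat": 9_000,
    "domestic cat": 4_000, "feline": 150_000, "vertebrate": 1_200,
    "chordate": 90, "animal": 3_000_000, "organism": 7_500,
}
MIN_HITS = 10_000  # threshold stated on the slide

def enrich(tags, hit_count=HIT_COUNTS.get):
    """Append synonyms and hypernyms that return more than MIN_HITS images."""
    enriched = list(dict.fromkeys(tags))  # de-duplicate, keep original order
    for tag in tags:
        for cand in SYNONYMS.get(tag, []) + HYPERNYMS.get(tag, []):
            if cand not in enriched and (hit_count(cand) or 0) > MIN_HITS:
                enriched.append(cand)
    return enriched

result = enrich(["kitty"])
```

The popularity threshold is what keeps rarely used terms such as "chordate" out of the final tag set even though they are valid hypernyms.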

  19. Three-step strategy

  20. Evaluation in terms of average precision, recall and F1-measure. Data: 50,000 Flickr images with 4,556 content-related tags; 2,500 test images. [Bar chart: precision, recall and F1-measure for Original, CBAR and Our Method; y-axis from 0 to 0.8.]

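  21. Results before and after tag enrichment:

      | Method            | Precision | Recall | F1-measure | Relevant tag num    |
      |-------------------|-----------|--------|------------|---------------------|
      | Before enrichment | 0.71      | 0.34   | 0.46       | 3.09 (4.80 in all)  |
      | After enrichment  | 0.90      | 0.66   | 0.76       | 9.34 (10.38 in all) |

      Tagging quality is further improved after the tag enrichment procedure.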
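
For reference, the three measures can be computed per image as sketched below. The function names and the example tag sets are hypothetical; how the authors averaged over images is not stated in this transcript. The last two lines check that the reported F1 values agree with the harmonic mean of the reported precision and recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def precision_recall_f1(predicted, relevant):
    """Scores for one image: predicted tags vs. ground-truth relevant tags."""
    predicted, relevant = set(predicted), set(relevant)
    hits = len(predicted & relevant)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall, f1_score(precision, recall)

# Hypothetical image: 2 of 3 predicted tags are relevant, 2 of 4 relevant tags found.
p, r, f = precision_recall_f1(["cat", "grass", "sky"],
                              ["cat", "grass", "animal", "flower"])

before = round(f1_score(0.71, 0.34), 2)  # row "Before enrichment"
after = round(f1_score(0.90, 0.66), 2)   # row "After enrichment"
```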

  22. Use the learnt confidence scores as the relevance measure. Ranking results for the query "cat".
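
Ranking by confidence score amounts to a sort, as in this small sketch. The data layout (a mapping from image id to per-tag confidence) and the scores are hypothetical.

```python
def rank_images(scores, query):
    """Return image ids that carry `query`, highest confidence first."""
    hits = [(img, tags[query]) for img, tags in scores.items()
            if tags.get(query, 0.0) > 0.0]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return [img for img, _ in hits]

# Hypothetical confidence scores learnt by retagging.
scores = {
    "img_a": {"cat": 0.35, "grass": 0.80},
    "img_b": {"cat": 0.92, "animal": 0.70},
    "img_c": {"dog": 0.88},
    "img_d": {"cat": 0.61},
}
ranking = rank_images(scores, "cat")
```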

  23. Our confidence-score-based ranking strategy outperforms the other image ranking strategies on Flickr.

  24. Use the top tags of the images after retagging to predict the tags of unlabeled images.
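
The slide does not say which predictor is used, so the sketch below swaps in a simple nearest-neighbor vote as one plausible reading: the top tags of the visually closest retagged images are counted, and the most frequent ones are assigned to the unlabeled image. The feature vectors, `k`, and `top_n` are hypothetical.

```python
import math
from collections import Counter

def predict_tags(query_feature, labeled, k=2, top_n=1):
    """Vote over the top tags of the k nearest retagged images."""
    nearest = sorted(labeled,
                     key=lambda item: math.dist(query_feature, item[0]))[:k]
    votes = Counter(tag for _, tags in nearest for tag in tags)
    return [tag for tag, _ in votes.most_common(top_n)]

# (feature vector, top tags after retagging); toy 2-D features.
labeled = [
    ([0.0, 0.0], ["cat", "animal"]),
    ([0.1, 0.0], ["cat", "grass"]),
    ([5.0, 5.0], ["car", "road"]),
]
predicted = predict_tags([0.05, 0.1], labeled)
```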

  25. Using the top tags after image retagging yields better results than using the original images directly.

  26. User-provided tags are imprecise and incomplete, which limits the performance of tag-based applications. We propose an image retagging strategy to solve this problem: tag filtering to remove the content-unrelated tags; tag refinement to automatically refine the tags; tag enrichment to expand the tags with synonyms and hypernyms. Image retagging benefits a series of tag-based applications.

  27. Extend it to online videos. Use richer information cues such as image regions and surrounding text.
