acm mm 2010

ACM MM 2010 Dong Liu , Shuicheng Yan, Yong Rui and Hong-Jiang Zhang - PowerPoint PPT Presentation


  1. ACM MM 2010. Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang. Harbin Institute of Technology; National University of Singapore; Microsoft Corporation.

  2. Proliferation of images and videos on the Internet. Images: 5 billion as of Sep. 2010, with 2000 images uploaded per minute. Videos: 120 million as of Sep. 2010, with 20 hours uploaded per minute.

  3. Internet Image Search: three paradigms on a timeline (year axis: 1990, 2000, 2001, 2002).
     - 1st paradigm: query by example (pure content based).
     - 2nd paradigm: query by surrounding text (direct text based).
     - 3rd paradigm: query by tag (semantic based).

  4. Example tags attached to four images: "medici chapel, Firenze, Italy, ..."; "Loggia dei lanzi, sword, honeymoon, ..."; "Statue, building, sky, Italy, ..."; "Cathedral, tower, Italy, ...".

  5. Tag analysis tasks: Tag Refinement, Tag-to-Region, Auto-Tagging, Tag Ranking. The goal is to discover the relationship between the tags and the underlying semantic regions in the images. [Figure: example images with tags such as top, 101, alex, tour, speed, tiger, leave, sweet, dog, big, cloud, tree, house, sky, ground.] Reference: D. Liu, X.-S. Hua and H.-J. Zhang. Content-Based Tag Processing for Internet Social Images: A Survey. Multimedia Tools and Applications.

  6. How to solve various content-based tag analysis tasks (tag-to-region, tag refinement, auto-tagging) in a unified framework? Our strategy:
     - Perform tag analysis at the granularity of image regions.
     - Propose a new concept of multi-edge graph to model the parallel semantic relationships between the images and to propagate the tags from images to regions.

  7. Multi-Edge Graph. [Figure: vertices 1, 2, 3, ..., t, ..., k, ..., n with labels y_1, y_2, y_3, ..., y_t, ..., y_k, ..., y_n; each edge carries a pair (v, f).] A core equation [equation not preserved in the transcript], annotated with: one edge; all edges between vertex i and vertex j; the labeling information of vertex i with respect to tag c; the probability that edge e_t is labeled as positive with tag c.

  8. Step 1: Bag-of-Regions Representation. [Figure: an input image is segmented in two ways (Segmentation 1, Segmentation 2) and the resulting regions form its bag-of-regions.]

  9. Step 2: Multi-Edge Graph Construction. Given two images with bag-of-regions representation:
     - Edge construction: mutual k-nearest neighbor, for reliable edge connection.
     - Edge affinity calculation: a reliable similarity measure.
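The mutual k-nearest-neighbor rule from slide 9 can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: region features are assumed to be plain numeric vectors, Euclidean distance is an assumed metric, the Gaussian affinity with bandwidth `sigma` is an assumed similarity measure, and `mutual_knn_edges` is a hypothetical name.

```python
import math

def mutual_knn_edges(regions_a, regions_b, k=2, sigma=1.0):
    """Connect region i of image A to region j of image B only if each
    is among the other's k nearest neighbors (mutual k-NN).

    Returns a list of (i, j, affinity) tuples.
    """
    n_a, n_b = len(regions_a), len(regions_b)
    # Pairwise Euclidean distances between regions (assumed metric).
    dist = [[math.dist(ra, rb) for rb in regions_b] for ra in regions_a]
    # k nearest regions of B for every region of A, and the reverse.
    nn_ab = [set(sorted(range(n_b), key=lambda j: dist[i][j])[:k])
             for i in range(n_a)]
    nn_ba = [set(sorted(range(n_a), key=lambda i: dist[i][j])[:k])
             for j in range(n_b)]
    edges = []
    for i in range(n_a):
        for j in sorted(nn_ab[i]):
            if i in nn_ba[j]:  # keep the edge only if the relation is mutual
                # Gaussian affinity (assumed similarity measure).
                affinity = math.exp(-dist[i][j] ** 2 / (2 * sigma ** 2))
                edges.append((i, j, affinity))
    return edges

# Toy 2-D region features; real region descriptors would be richer.
image_a = [(0.0, 0.0), (10.0, 10.0)]
image_b = [(0.0, 1.0), (10.0, 11.0), (100.0, 100.0)]
edges = mutual_knn_edges(image_a, image_b, k=1)
```

In this toy input, the third region of image B has a nearest neighbor in image A, but that neighbor prefers another region, so no edge is created for it. That asymmetry check is what makes the mutual variant stricter than plain k-NN.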

  10. Two images with the same tag (for example, one tagged "dog, flower" and the other "dog, bird"): at least one edge connects the two regions corresponding to the shared tag.

  11. Notations

  12. Objective function: models the cross-level tag propagation and consists of a loss function term and a regularization term. Solving for F directly is computationally challenging, so we turn to an alternating optimization strategy.

  13. Optimize the sub-problems with the cutting plane method. At each iteration, solve for only a subset of the tag confidence vectors, namely those between vertex i and vertex j [formula not preserved in the transcript]. This yields a sub-optimization problem [formula not preserved in the transcript]. Since the max function is non-smooth, we solve it with the cutting plane method.

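  14. Example with three tag confidence vectors f1, f2, f3:

      | Tag    | f1  | f2  | f3  |
      |--------|-----|-----|-----|
      | dog    | 0.1 | 0.2 | 0   |
      | cat    | 0.1 | 0.1 | 0.2 |
      | apple  | 0.4 | 0.2 | 0.4 |
      | flower | 0.3 | 0.3 | 0.2 |
      | tree   | 0.1 | 0.2 | 0.1 |

      Top tag per vector: apple (f1), flower (f2), apple (f3). Majority voting: apple (2 times) > flower (1 time), so the result is apple. By doing so, a series of tag analysis tasks can be performed in a coherent way.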
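The majority vote in the slide-14 example can be reproduced with a short sketch. The numbers are copied from the slide; the names `top_tag` and `majority_vote` are hypothetical, and the tie-breaking behavior (first maximum wins) is an assumption, since the slide does not say how ties are resolved.

```python
from collections import Counter

TAGS = ["dog", "cat", "apple", "flower", "tree"]

# Tag confidence vectors f1, f2, f3 from the slide, in TAGS order.
CONFIDENCES = {
    "f1": [0.1, 0.1, 0.4, 0.3, 0.1],
    "f2": [0.2, 0.1, 0.2, 0.3, 0.2],
    "f3": [0.0, 0.2, 0.4, 0.2, 0.1],
}

def top_tag(vector, tags=TAGS):
    """Return the tag with the highest confidence (first one on ties)."""
    best = max(range(len(vector)), key=lambda idx: vector[idx])
    return tags[best]

def majority_vote(confidences, tags=TAGS):
    """Each confidence vector votes for its top tag; the most voted tag wins."""
    votes = Counter(top_tag(vec, tags) for vec in confidences.values())
    winner = votes.most_common(1)[0][0]
    return winner, votes

winner, votes = majority_vote(CONFIDENCES)
print(winner, dict(votes))
```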

  15. The cutting plane iteration will terminate in a constant number of steps. The optimization objective is convex, resulting in a globally optimal solution.

  16. In terms of pixel-level accuracy, on the MSRC-350 and COREL-100 datasets (benchmarks for the tag-to-region assignment task). Comparison with kNN-1 (k=49), kNN-2 (k=99) and bi-layer sparse coding [1].

      | Dataset   | kNN-1 | kNN-2 | Bi-layer [1] | M-E Graph |
      |-----------|-------|-------|--------------|-----------|
      | MSRC-350  | 0.45  | 0.37  | 0.63         | 0.73      |
      | COREL-100 | 0.52  | 0.44  | 0.61         | 0.67      |

      [1] Liu, Cheng, Yan and Chua. Label to region by bi-layer sparsity priors. MM 2009.
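The slide does not define pixel-level accuracy. A common reading, assumed here, is the fraction of pixels whose predicted label matches the ground-truth label. The sketch below follows that reading; the function name `pixel_accuracy` and the optional `void_label` handling are hypothetical and not taken from the slides.

```python
def pixel_accuracy(predicted, ground_truth, void_label=None):
    """Fraction of pixels whose predicted label equals the ground truth.

    predicted, ground_truth: 2-D lists of labels with the same shape.
    void_label: ground-truth label to skip (assumption: unlabeled pixels
    are excluded from the count when this is set).
    """
    correct = 0
    total = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        for pred, true in zip(pred_row, true_row):
            if void_label is not None and true == void_label:
                continue  # skip pixels without a usable ground-truth label
            total += 1
            if pred == true:
                correct += 1
    return correct / total if total else 0.0

# Toy 2x2 label maps.
predicted_map = [["sky", "sky"], ["dog", "grass"]]
truth_map = [["sky", "tree"], ["dog", "void"]]
acc_all = pixel_accuracy(predicted_map, truth_map)
acc_no_void = pixel_accuracy(predicted_map, truth_map, void_label="void")
```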

  18. In terms of average F-score, on the NUS-WIDE-SUB dataset with 18,325 Flickr images. Comparison with Baseline (initial user-provided tags), CBAR [1] and TRVSC [2].

      | Method    | Baseline | CBAR [1] | TRVSC [2] | M-E Graph |
      |-----------|----------|----------|-----------|-----------|
      | Precision | 0.47     | 0.50     | 0.52      | 0.54      |
      | Recall    | 0.49     | 0.52     | 0.53      | 0.57      |
      | F-Score   | 0.44     | 0.47     | 0.49      | 0.53      |

      [1] C. Wang, L. Zhang and H.-J. Zhang. Content-based Image Annotation Refinement. CVPR 2007.
      [2] D. Liu, X.-S. Hua and H.-J. Zhang. Retagging Social Images based on Visual and Semantic Consistency. WWW 2010.
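The slide reports averaged precision, recall and F-score without giving the formulas. One plausible reading, assumed here, is that each score is computed per tag and then averaged over the tag vocabulary. Under that reading the average F-score need not equal the harmonic mean of the average precision and recall. The sketch below uses a hypothetical function name and toy data, not the NUS-WIDE-SUB results.

```python
def average_tag_scores(predicted, relevant, vocabulary):
    """Per-tag precision, recall and F-score, averaged over the vocabulary.

    predicted: dict image_id -> set of tags assigned by a method.
    relevant:  dict image_id -> set of ground-truth tags.
    """
    precisions, recalls, f_scores = [], [], []
    for tag in vocabulary:
        pred_imgs = {img for img, tags in predicted.items() if tag in tags}
        true_imgs = {img for img, tags in relevant.items() if tag in tags}
        hits = len(pred_imgs & true_imgs)
        p = hits / len(pred_imgs) if pred_imgs else 0.0
        r = hits / len(true_imgs) if true_imgs else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f_scores.append(f)
    n = len(vocabulary)
    return sum(precisions) / n, sum(recalls) / n, sum(f_scores) / n

# Toy data: three images, two tags.
predicted_tags = {"a": {"dog", "sky"}, "b": {"dog"}, "c": {"sky"}}
relevant_tags = {"a": {"dog"}, "b": {"dog", "sky"}, "c": {"sky"}}
avg_p, avg_r, avg_f = average_tag_scores(
    predicted_tags, relevant_tags, ["dog", "sky"]
)
```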

  19. In terms of average per-tag precision and recall, on the MSRC, COREL and NUS-WIDE-SUB datasets. Comparison with the state-of-the-art multi-label auto-tagging methods.

  20. Unified Tag Analysis with Multi-Edge Graph:
      - Perform tag analysis at the granularity of image regions.
      - Model the parallel semantic relationships between the images.
      - Realize cross-level tag propagation.

  21. Future directions:
      - Scalability: large-scale testing.
      - Correlative cross-level tag propagation: semantic correlation among the tags.
      - More applications: user behavior analysis in social networks; knowledge mining from the rich information cues of multimedia documents.
