modeling annotated data

MODELING ANNOTATED DATA Reviewer: Saurabh Singh (ss1@uiuc.edu) - PowerPoint PPT Presentation



  1. MODELING ANNOTATED DATA Reviewer: Saurabh Singh (ss1@uiuc.edu)

  2. Problem • Modeling of associated document items • Images & Annotations • Papers & Bibliographies • Genes & Functions • Documents are considered as pairs of data streams. • One type provides annotation for the other type.

  3. Uses • Retrieval, Clustering, Classification • Automatic annotation • Retrieval of un-annotated data.

  4. This paper • Models images (r) and annotations (w) • Three primary tasks: • Joint distribution of an image and its caption (clustering, organization) • Conditional distribution of words given an image (automatic annotation, text-based retrieval) • Conditional distribution of words given a region of an image (automatic labeling of regions)

  5. Modeling • K factors or topics: each a distribution over words and a distribution over image regions • Latent variables: topic assignments; distribution parameters (for components) • Features: a document is a pair (r, w) with N regions and M words • Distributions: p(r, w), p(w | r), p(w | r, r_n)

  6. Text annotations Vocabulary: 168 Terms (V) Captions: 2-4 Words per Image Multinomials on V conditioned on topics

  7. Images • Composed of 6-10 regions via N-cuts (normalized cuts) • Each region summarized as a feature vector of ~40 dimensions: • Size: percentage of image • Position: center of mass, in [0, 1] • Color: µ, σ of R, G, B, L, a, b, etc. • Texture: µ, σ of filter responses • Shape: area/perimeter², moment of inertia, etc. • Multivariate Gaussian over features: µ, Σ

  8. Models • Three hierarchical probabilistic models: 1. Gaussian Multinomial mixture 2. Gaussian Multinomial LDA 3. Correspondence LDA

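  9. Gaussian Multinomial Mixture • [Figure: plate diagram with nodes z, r, w; parameters λ, µ, σ, β; plates N, M, D. A second plate diagram is labeled θ_d, α, Z_{d,n}, W_{d,n}, β_k, η with plates N, D, K.]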

  10. Distributions • Joint: $p(z, r, w) = p(z \mid \lambda) \prod_{n=1}^{N} p(r_n \mid z, \mu, \sigma) \cdot \prod_{m=1}^{M} p(w_m \mid z, \beta)$ • Available: $p(r, w)$ and $p(w \mid r) = \sum_{z} p(z \mid r)\, p(w \mid z)$ • But no $p(w \mid r, r_n)$
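
The annotation rule on this slide, p(w | r) = Σ_z p(z | r) p(w | z), can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name, the argument names, and the use of diagonal Gaussians are assumptions made here.

```python
import numpy as np

def gm_mixture_word_probs(regions, log_prior, means, variances, beta):
    """Return p(w | r) = sum_z p(z | r) p(w | z) for one image.

    regions:   (N, F) region feature vectors
    log_prior: (K,)   log p(z | lambda)
    means:     (K, F) Gaussian means, one row per mixture component
    variances: (K, F) diagonal variances (an assumption of this sketch)
    beta:      (K, V) per-component multinomial over the vocabulary
    """
    diff = regions[None, :, :] - means[:, None, :]          # (K, N, F)
    # Log-likelihood of all N regions under each component z.
    log_lik = -0.5 * (
        np.log(2.0 * np.pi * variances)[:, None, :]
        + diff ** 2 / variances[:, None, :]
    ).sum(axis=(1, 2))                                      # (K,)
    # Posterior p(z | r), normalized in log space for stability.
    log_post = log_prior + log_lik
    log_post -= log_post.max()
    post = np.exp(log_post)
    post /= post.sum()
    # Mix the per-component word distributions.
    return post @ beta                                      # (V,)
```

Because a single z is shared by every region and word of an image, the posterior cannot be tied to one region, which is why this model gives no p(w | r, r_n).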

  11. Gaussian Multinomial LDA • [Figure: plate diagram with nodes θ, z, r, v, w; parameters α, µ, σ, β; plates N, M, D.]

  12. Distributions • Joint: $p(r, w, \theta, z, v) = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(r_n \mid z_n, \mu, \sigma) \cdot \prod_{m=1}^{M} p(v_m \mid \theta)\, p(w_m \mid v_m, \beta)$ • All three available: $p(r, w)$, $p(w \mid r)$, $p(w \mid r, r_n)$
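
Read as a generative process, the GM-LDA factorization draws region topics and word topics independently from the same θ. The sketch below samples one document that way; the function name, the argument names, and the diagonal Gaussians are illustrative assumptions, not details from the slides.

```python
import numpy as np

def sample_gm_lda(alpha, means, stds, beta, n_regions, n_words, rng):
    """Sample one (regions, words) document from a GM-LDA-style model.

    alpha: (K,) Dirichlet parameter; means, stds: (K, F) diagonal Gaussians
    (an assumption of this sketch); beta: (K, V) topic-word multinomials.
    """
    n_topics, vocab_size = beta.shape
    theta = rng.dirichlet(alpha)                        # p(theta | alpha)
    z = rng.choice(n_topics, size=n_regions, p=theta)   # p(z_n | theta)
    r = rng.normal(means[z], stds[z])                   # p(r_n | z_n, mu, sigma)
    v = rng.choice(n_topics, size=n_words, p=theta)     # p(v_m | theta)
    # Each word is drawn from the multinomial of its own topic v_m.
    w = np.array([rng.choice(vocab_size, p=beta[k]) for k in v])
    return theta, z, r, v, w
```

Nothing forces a word topic v_m to match any region topic z_n, so the words of a caption may come from topics that generated none of the image's regions.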

  13. Correspondence LDA • [Figure: plate diagram with nodes θ, z, r, y, w; parameters α, µ, σ, β; plates N, M, D.]

  14. Distributions • Joint: $p(r, w, \theta, z, y) = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(r_n \mid z_n, \mu, \sigma) \cdot \prod_{m=1}^{M} p(y_m \mid N)\, p(w_m \mid y_m, z, \beta)$ • All three available: $p(r, w)$, $p(w \mid r)$, $p(w \mid r, r_n)$
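
In Correspondence LDA each word first picks a region index y_m uniformly from 1..N and is then drawn from the topic of that region. A minimal sampling sketch follows, with hypothetical names, 0-based indices, and diagonal Gaussians assumed for brevity.

```python
import numpy as np

def sample_corr_lda(alpha, means, stds, beta, n_regions, n_words, rng):
    """Sample one (regions, words) document from a Corr-LDA-style model.

    alpha: (K,) Dirichlet parameter; means, stds: (K, F) diagonal Gaussians
    (an assumption of this sketch); beta: (K, V) topic-word multinomials.
    """
    n_topics, vocab_size = beta.shape
    theta = rng.dirichlet(alpha)                        # p(theta | alpha)
    z = rng.choice(n_topics, size=n_regions, p=theta)   # p(z_n | theta)
    r = rng.normal(means[z], stds[z])                   # p(r_n | z_n, mu, sigma)
    y = rng.integers(0, n_regions, size=n_words)        # p(y_m | N): uniform region index
    # Each word reuses the topic of the region it points at.
    w = np.array([rng.choice(vocab_size, p=beta[z[i]]) for i in y])
    return theta, z, r, y, w
```

Every word topic is therefore one of the topics that actually generated a region, which is the correspondence that GM-LDA lacks.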

  15. Inference & Estimation • Variational inference • Exact inference is intractable • Approximate the posterior with a factorized distribution • Minimize the KL divergence to the true posterior via iterative updates of the variational parameters • Parameter estimation • EM algorithm • E: compute the variational posterior • M: maximum likelihood estimate of the model parameters
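
To spell out what minimizing the KL divergence means here, take Correspondence LDA and a fully factorized variational distribution q. The variational parameter names γ, φ_n, ρ_m are chosen for illustration and do not come from the slides. The standard decomposition is:

```latex
\begin{aligned}
q(\theta, z, y) &= q(\theta \mid \gamma) \prod_{n=1}^{N} q(z_n \mid \phi_n) \prod_{m=1}^{M} q(y_m \mid \rho_m), \\
\log p(r, w) &= \underbrace{\mathbb{E}_q[\log p(r, w, \theta, z, y)] - \mathbb{E}_q[\log q(\theta, z, y)]}_{\mathcal{L}(q)}
  + \mathrm{KL}\big(q(\theta, z, y) \,\|\, p(\theta, z, y \mid r, w)\big).
\end{aligned}
```

Since log p(r, w) does not depend on q, raising the bound L(q) in the E-step lowers the KL term by the same amount. The M-step then maximizes L(q) over the model parameters with q held fixed.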

  16. Evaluation • 7000 Images and their captions • 75% Training & 25% Testing • Test set likelihood • Automatic annotation • Text based retrieval

  17. Eval: Test set likelihood • [Plot: average negative log probability (y-axis, 350-650) vs. number of factors (x-axis, 0-200) for Corr-LDA, GM-Mixture, GM-LDA, and ML.]

  18. Eval: Automatic Annotation • $\text{perplexity} = \exp\left\{ -\sum_{d=1}^{D} \sum_{m=1}^{M_d} \log p(w_m \mid r_d) \Big/ \sum_{d=1}^{D} M_d \right\}$ • [Plots: caption perplexity (y-axis, 30-100) vs. number of factors (x-axis, 0-200) in two panels, "Maximum likelihood" and "Empirical Bayes smoothed", each for Corr-LDA, GM-Mixture, GM-LDA, and ML.]
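
The caption perplexity on this slide is the exponentiated negative average log probability per caption word over the test set. A short sketch of that computation, assuming each test image already has a vector of conditional word probabilities p(w | r_d); the function and argument names are hypothetical:

```python
import numpy as np

def caption_perplexity(word_probs, captions):
    """Perplexity of held-out captions under conditional word distributions.

    word_probs: list of (V,) arrays, word_probs[d][w] = p(w | r_d)
    captions:   list of lists of word ids, one list per test image
    """
    total_log_prob = 0.0
    total_words = 0
    for probs, caption in zip(word_probs, captions):
        total_log_prob += np.log(probs[caption]).sum()  # sum_m log p(w_m | r_d)
        total_words += len(caption)                     # M_d
    return float(np.exp(-total_log_prob / total_words))
```

A model that spreads probability uniformly over the 168-term vocabulary would score a perplexity of 168, so lower values mean the model concentrates mass on the true caption words.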

  19. Eval: Automatic Annotation (Qual.) • Image 1: True caption: clouds, jet, plane; Corr-LDA: sky, plane, jet, mountain, clouds; GM-LDA: sky, water, people, tree, clouds; GM-Mixture: sky, plane, jet, clouds, pattern • Image 2: True caption: fish, reefs, water; Corr-LDA: fish, water, ocean, tree, coral; GM-LDA: water, sky, vegetables, tree, people; GM-Mixture: fungus, mushrooms, tree, flowers, leaves • Image 3: True caption: scotland, water; Corr-LDA: scotland, water, flowers, hills, tree; GM-LDA: tree, water, people, mountain, sky; GM-Mixture: water, sky, clouds, sunset, scotland

  20. Eval: Automatic Annotation (Qual.) • [Figure: image segmented into regions numbered 1-6, with the labels each model assigns per region.] • Corr-LDA: 1. PEOPLE, TREE 2. SKY, JET 3. SKY, CLOUDS 4. SKY, MOUNTAIN 5. PLANE, JET 6. PLANE, JET • GM-LDA: 1. HOTEL, WATER 2. PLANE, JET 3. TUNDRA, PENGUIN 4. PLANE, JET 5. WATER, SKY 6. BOATS, WATER

  21. Text Based Retrieval • [Plots: precision (y-axis, 0.0-1.0) vs. recall (x-axis, 0.0-1.0) for three queries, "people & fish", "sunset", and "candy", each comparing Corr-LDA, GM-Mixture, and GM-LDA.]
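
Precision-recall curves like these can be computed from a ranked list of test images. The sketch below assumes each image already has a retrieval score for the query (for instance, the conditional probability of the query words given the image) and a binary relevance label. Both the scoring choice and the function name are assumptions made here, not details from the slides.

```python
import numpy as np

def precision_recall(scores, relevant):
    """Precision and recall at every cutoff of a score-ranked list.

    scores:   (D,) retrieval score per image, higher means a better match
    relevant: (D,) 1 if the image is truly relevant to the query, else 0
    """
    order = np.argsort(-scores)                 # best-scoring images first
    hits = np.cumsum(relevant[order])           # relevant images retrieved so far
    precision = hits / np.arange(1, len(scores) + 1)
    recall = hits / relevant.sum()
    return precision, recall
```

Plotting precision against recall for each model's scores gives one curve per model and query.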

  22. Text Based Retrieval (Qual.) • Queries: Candy, Sunset, People & Fish

  23. Conclusion If conditionals are needed, then model them explicitly
