Shifting from Naming to Describing: Semantic Attribute Models Rogerio Feris, June 2014
Recap Large-Scale Semantic Modeling Feature Coding and Pooling Low-Level Feature Extraction Training Data Slide credit: Rogerio Feris Learning Visual Semantics: Models, Massive Computation, and Innovative Applications CVPR 2014
What if no training samples are available for the target class? Is this a practical setting? Slide credit: Rogerio Feris
Motivation ImageNet has 30 mushroom synsets, each with ≈1000 images. Slide credit: Christoph Lampert
Motivation In nature, there are ≈14,000 mushroom species. Zero-data: Many fine-grained visual categorization tasks may have classes with few or no training examples at all. Image: http://www.evogeneao.com/ Slide adapted from Christoph Lampert
Motivation Suspect Search in Surveillance Videos [Feris et al, IBM] Zero-data: often no example images from suspects are available, only textual descriptions. Slide credit: Rogerio Feris
Motivation Prediction of concrete nouns from neural imaging data (mind reading) [Mark Palatucci et al, NIPS 2009] Zero-data: many nouns have no corresponding neural image examples (label acquisition is costly). Slide credit: Rogerio Feris
Motivation Similar problems in other fields. Large Vocabulary Speech Recognition: Zero-data: it is infeasible to acquire training samples for each word (sub-word modeling, e.g., with phonemes, is needed). Recommendation Systems: Zero-data: newly released apps have no user ratings (also known as the “cold-start problem”) [Schein et al, SIGIR 2002] Slide credit: Rogerio Feris
Semantic Attribute Models: Zero-Shot Learning for Visual Recognition [Lampert et al, CVPR 2009] [Farhadi et al, CVPR 2009] [Palatucci et al, NIPS 2009]
Attribute-based Classification Attributes: semantic/nameable properties that are shared across classes. They provide an intuitive mid-level feature representation. Slide adapted from Christoph Lampert
Attribute-based Classification [Lampert et al, CVPR 2009] Standard multi-class classification cannot handle unseen categories; attribute-based classification reaches unseen categories through semantic attribute classifiers. Similar to Error-Correcting Output Codes (ECOC [Dietterich & Bakiri, 1995]), but semantic codes are used instead: Semantic Output Code Classifier (SOCC) [Palatucci et al, NIPS 2009]. Slide credit: Rogerio Feris
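A minimal sketch of the attribute-based classification idea, not the exact formulation of any of the cited papers. It assumes per-attribute probabilities are already available from attribute classifiers, treats attributes as independent, and ignores attribute priors. The attribute names, class signatures, and example probabilities are hypothetical.

```python
import math

# Hypothetical attribute vocabulary and binary class-attribute signatures
# for classes that have no training images (1 = class has the attribute).
ATTRIBUTES = ["striped", "black", "white", "four_legged", "lives_in_water"]
UNSEEN_CLASS_SIGNATURES = {
    "zebra":        [1, 1, 1, 1, 0],
    "polar_bear":   [0, 0, 1, 1, 0],
    "killer_whale": [0, 1, 1, 0, 1],
}


def zero_shot_scores(attr_probs, signatures):
    """Log-score per class: sum over attributes of log p(attribute matches signature | image)."""
    scores = {}
    for cls, signature in signatures.items():
        log_score = 0.0
        for p, a in zip(attr_probs, signature):
            p = min(max(p, 1e-6), 1.0 - 1e-6)  # avoid log(0)
            log_score += math.log(p if a == 1 else 1.0 - p)
        scores[cls] = log_score
    return scores


def predict_unseen_class(attr_probs, signatures=UNSEEN_CLASS_SIGNATURES):
    """Pick the unseen class whose semantic code best matches the predicted attributes."""
    scores = zero_shot_scores(attr_probs, signatures)
    return max(scores, key=scores.get)


# Hypothetical attribute-classifier outputs for one test image, in ATTRIBUTES order.
example_probs = [0.9, 0.8, 0.85, 0.95, 0.05]
prediction = predict_unseen_class(example_probs)
print(prediction)
```

The class signatures play the role of the semantic output code: no image of the target classes is needed, only their attribute descriptions.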
Image-Attributes Prediction For each attribute, collect a set of positive and negative samples and train a binary attribute model (e.g., an SVM or a neural network). Example: the “stripe” attribute, with positive (stripe) and negative (non-stripe) samples. Attributes transcend class boundaries: the “stripe” attribute can be learned from images of zebras, clothing, … Slide credit: Rogerio Feris
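A minimal sketch of training one binary attribute model, assuming each image has already been reduced to a small feature vector. It uses a linear SVM trained by plain subgradient descent on the hinge loss; the 2-D features, labels, and hyperparameters are hypothetical toy values, and a real system would use a library solver and far richer features.

```python
def train_linear_svm(samples, labels, epochs=50, lr=0.1, reg=0.01):
    """Subgradient descent on the L2-regularized hinge loss; labels are +1 / -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Shrink the weights (gradient of the L2 regularizer).
            w = [wi * (1.0 - lr * reg) for wi in w]
            if margin < 1.0:
                # Sample violates the margin: step along the hinge-loss subgradient.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict_attribute(w, b, x):
    """Return +1 if the attribute is predicted present, -1 otherwise."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical features: positives (stripe) may come from zebras or striped shirts.
stripe_samples = [[2.0, 1.0], [1.5, 2.0], [2.5, 1.5]]
non_stripe_samples = [[-1.0, -1.5], [-2.0, -0.5], [-1.5, -2.0]]
samples = stripe_samples + non_stripe_samples
labels = [1] * len(stripe_samples) + [-1] * len(non_stripe_samples)

w, b = train_linear_svm(samples, labels)
print(predict_attribute(w, b, [3.0, 3.0]), predict_attribute(w, b, [-3.0, -3.0]))
```

One such model is trained per attribute, and the positives for a given attribute can be pooled from many object classes.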
Image-Attributes Prediction Issue with Binary Attribute Models [Parikh and Grauman, ICCV 2011] Some images are hard to label as either “smiling” or “not smiling”, or as either “natural” or “not natural”.
Image-Attributes Prediction Relative Attributes [Parikh and Grauman, ICCV 2011] Replace the binary model with a ranking function, learned with the max-margin learning-to-rank formulation of Joachims 2002.
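For reference, a max-margin ranking objective of the kind used for relative attributes can be written as below. This is a sketch recalled from the relative attributes literature rather than copied from the slide, so the exact form should be checked against the paper. The symbols are assumptions: $w_m$ is the ranking direction for attribute $m$, $x_i$ an image feature vector, $O_m$ the set of ordered pairs (image $i$ shows more of the attribute than image $j$), $S_m$ the set of pairs judged similar, and $C$ a trade-off constant.

```latex
\begin{aligned}
\min_{w_m,\,\xi,\,\gamma}\quad
  & \tfrac{1}{2}\lVert w_m \rVert_2^2
    + C\Big(\sum_{(i,j)\in O_m} \xi_{ij}^2 + \sum_{(i,j)\in S_m} \gamma_{ij}^2\Big) \\
\text{s.t.}\quad
  & w_m^{\top}(x_i - x_j) \ge 1 - \xi_{ij} && \forall (i,j)\in O_m \\
  & \lvert w_m^{\top}(x_i - x_j) \rvert \le \gamma_{ij} && \forall (i,j)\in S_m \\
  & \xi_{ij} \ge 0,\quad \gamma_{ij} \ge 0
\end{aligned}
```

The learned function $r_m(x) = w_m^{\top} x$ then orders images by attribute strength instead of assigning a hard present/absent label.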
Attribute-Class Associations Manual Specification of Class-Attribute Associations
Attribute-Class Associations Associations may be extracted automatically from other sources [Rohrbach et al, CVPR 2010]
Attributes as “classes” Example: “giant pandas are similar to grizzly and polar bears” [Rohrbach et al, CVPR 2010] [Felix Yu et al, CVPR 2013] [Mensink et al, CVPR 2014] Figure labels: attribute-based vs. direct similarity.
Generalization: Label Embedding [Akata et al, CVPR 2013] Check the talk by Florent Perronnin on “Output embedding for large-scale visual recognition” (LSVR CVPR 2014 tutorial)
Generalization: Label Embedding Label Embedding Framework: automatic discovery of word associations. Frome et al., "DeViSE: A Deep Visual-Semantic Embedding Model", NIPS 2013. Label: real-valued word vector. Image: deep representation learning. Skip-gram model: semantically related words are mapped to similar vector representations.
Generalization: Label Embedding Label Embedding Framework Automatic Discovery of word associations [Frome et al, NIPS 2013] Zero-Shot Learning / Semantically close mistakes Language Model Source Code: https://code.google.com/p/word2vec/
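A minimal sketch of prediction in a label-embedding space, assuming the image has already been mapped into the same space as the word vectors. It ranks labels by cosine similarity, which is an illustrative choice and not necessarily the scoring used in the cited work. The 3-D vectors are hypothetical; real skip-gram vectors have hundreds of dimensions.

```python
import math

# Hypothetical word vectors: semantically related labels get similar vectors.
LABEL_VECTORS = {
    "tiger": [0.9, 0.1, 0.0],
    "lion":  [0.8, 0.2, 0.1],
    "truck": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_labels(image_embedding, label_vectors=LABEL_VECTORS):
    """Return labels sorted from most to least similar to the image embedding."""
    return sorted(
        label_vectors,
        key=lambda label: cosine(image_embedding, label_vectors[label]),
        reverse=True,
    )


# Hypothetical output of the visual model for one image.
ranking = rank_labels([0.9, 0.1, 0.02])
print(ranking)
```

Because any label with a word vector can be ranked, labels without training images can still be predicted, and the runner-up here is a semantically close class rather than an unrelated one.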
In addition to zero-shot classification, semantic attribute models have been shown to be useful for many other tasks.
Other Uses of Semantic Attributes Check the CVPR 2013 tutorial on Attributes: https://filebox.ece.vt.edu/~parikh/attributes/ Slide credit: Rogerio Feris
Attribute-based Search Application: Smart Surveillance [Feris et al, IBM - WACV 2009, CVPR 2011, ICMR 2014]
Attribute-based People Search http://www.today.com/video/today/51630165/#51630165 Slide credit: Rogerio Feris
Attribute-based People Search People Search in Surveillance Videos. Traditional Approaches: Face Recognition (“Naming”). Face recognition is very challenging under lighting changes, pose variation, and low-resolution imagery (typical conditions in surveillance scenarios). Attribute-based People Search (“Describing”): rather than relying on face recognition only, we provide a complementary people search framework based on fine-grained semantic attributes. Query Example: “Show me all people with a beard and sunglasses, wearing a white hat and a patterned blue shirt, from all metro cameras in the downtown area, from 2pm to 4pm last Saturday”. Slide credit: Rogerio Feris
Attribute-based People Search Suspect Description Form Slide credit: Rogerio Feris
Attribute-based People Search System Architecture Slide credit: Rogerio Feris
Attribute-based People Search Facial Attributes: bald, hair, color of hair, hat, color of hat, sunglasses, eyeglasses, absence of glasses, beard, mustache, absence of facial hair, skin tone (dark, medium, light), gender, … Torso Attributes: clothing color, patterned, solid, … Also stored: Timestamp, Camera ID. [Siddiquie et al, CVPR 2011] Slide credit: Rogerio Feris
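A minimal sketch of how a describing-style query could run over such records, assuming each detected person is stored with per-attribute classifier confidences, a timestamp, and a camera ID. The `Detection` record, the `search` function, and the mean-confidence scoring are hypothetical illustrations; this is a naive baseline, not the system architecture or the learned ranking of the cited work.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Detection:
    track_id: int
    camera_id: str
    timestamp: datetime
    attribute_scores: dict  # attribute name -> classifier confidence in [0, 1]


def search(detections, wanted_attributes, cameras, start, end):
    """Filter by camera and time window, then rank by mean confidence of the queried attributes."""
    candidates = [
        d for d in detections
        if d.camera_id in cameras and start <= d.timestamp <= end
    ]

    def score(d):
        total = sum(d.attribute_scores.get(a, 0.0) for a in wanted_attributes)
        return total / len(wanted_attributes)

    return sorted(candidates, key=score, reverse=True)


# Hypothetical records from two cameras.
detections = [
    Detection(1, "metro_01", datetime(2014, 6, 21, 14, 30),
              {"beard": 0.9, "sunglasses": 0.8, "hat": 0.7}),
    Detection(2, "metro_01", datetime(2014, 6, 21, 15, 10),
              {"beard": 0.2, "sunglasses": 0.9, "hat": 0.1}),
    Detection(3, "metro_02", datetime(2014, 6, 21, 18, 0),
              {"beard": 0.95, "sunglasses": 0.95, "hat": 0.9}),
]

results = search(
    detections,
    wanted_attributes=["beard", "sunglasses", "hat"],
    cameras={"metro_01", "metro_02"},
    start=datetime(2014, 6, 21, 14, 0),
    end=datetime(2014, 6, 21, 16, 0),
)
ranked_ids = [d.track_id for d in results]
print(ranked_ids)
```

The third record matches the description best but falls outside the time window, so it is filtered out before ranking.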
Attribute-based People Search Attribute Ranking [Siddiquie, Feris and Davis, CVPR 2011] “Learning to rank”: confidences of individual attributes are used as features, together with pairwise attribute modeling. Slide credit: Rogerio Feris
Structured Learning Formulation Improved performance over other ranking methods (RankSVM, RankBoost, DORM, TagProp) on three standard datasets (LFW, FaceTracer, PASCAL). See [Siddiquie, Feris and Davis, CVPR 2011] Slide credit: Rogerio Feris
Attribute-based People Search Top-1 Ranking Results [Feris et al, ICMR 2014] Slide credit: Rogerio Feris