Inferring the Why in Images [Pirsiavash et al]
CSC2523 Winter 2015: Paper Presentation
Micha Livne

Goals
(a) Sitting because he wants to watch television
(b) Sitting because she intends to see the doctor

Related Work
Visual Persuasion: Inferring Communicative Intents of Images [Joo et al 2014]
[Figure: (d) Example images and predicted intents. Panels show the most and least FAVORABLE, COMFORTING, COMPETENT, and DOMINANT images, each with predicted intents scored on: Favorable, Angry, Happy, Fearful, Energetic, Competent, Dominant, Comforting, Trustworthy.]

Proposed Solution: Vision Only
Visual classifier over image features $\phi(x)$ [Krizhevsky et al 2012]:
$$\operatorname*{argmax}_{y \in \{1,\dots,M\}} w_y^T \phi(x)$$
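The vision-only rule on this slide scores each candidate label with a linear classifier on the image features and keeps the best one. A minimal sketch in plain Python, assuming the features are already extracted; the weights and feature values are made up for illustration, and labels are indexed from 0 here rather than from 1 as on the slide:

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def predict_vision_only(weights, features):
    """Return the label index y that maximizes w_y^T phi(x).

    weights  -- one weight vector w_y per label (hypothetical values below)
    features -- the feature vector phi(x) for a single image
    """
    scores = [dot(w_y, features) for w_y in weights]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data, not taken from the paper: 3 labels, 3-dimensional features.
weights = [
    [1.0, 0.0, -1.0],
    [0.0, 2.0, 0.0],
    [0.5, 0.5, 0.5],
]
phi_x = [0.2, 0.9, 0.1]

predicted = predict_vision_only(weights, phi_x)
print(predicted)
```

In practice $\phi(x)$ would come from a pretrained network and the weights from training; both are placeholders in this sketch.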

Proposed Solution: Full Solution
Language Potentials

Relationship: action + object + motivation
Query to Language Model:
• action the object in order to motivation
• action the object to motivation
• action the object because pronoun wants to motivation

Relationship: action + object + scene
Query to Language Model:
• action the object in a scene
• in a scene, action the object

Relationship: action + scene + motivation
Query to Language Model:
• action in a scene in order to motivation
• action in order to motivation in a scene
• action because pronoun wants to motivation in a scene

Slide annotations: log-probability $L_{ij}(y_i, y_j)$; sentences about those
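The query templates on this slide can be filled in mechanically for any combination of labels. The sketch below only builds the query strings; how each string is scored by the language model is not shown on the slide and is left out. The names `TEMPLATES` and `build_queries` are hypothetical, as are the example labels:

```python
# Templates copied from the slide, keyed by the relationship they describe.
TEMPLATES = {
    ("action", "object", "motivation"): [
        "{action} the {object} in order to {motivation}",
        "{action} the {object} to {motivation}",
        "{action} the {object} because {pronoun} wants to {motivation}",
    ],
    ("action", "object", "scene"): [
        "{action} the {object} in a {scene}",
        "in a {scene}, {action} the {object}",
    ],
    ("action", "scene", "motivation"): [
        "{action} in a {scene} in order to {motivation}",
        "{action} in order to {motivation} in a {scene}",
        "{action} because {pronoun} wants to {motivation} in a {scene}",
    ],
}


def build_queries(relationship, pronoun="he", **labels):
    """Fill every template of one relationship with the given labels."""
    return [t.format(pronoun=pronoun, **labels) for t in TEMPLATES[relationship]]


queries = build_queries(
    ("action", "object", "motivation"),
    action="hold", object="guitar", motivation="play",
)
for q in queries:
    print(q)
```

Each resulting string would then be sent to a language model, and the returned log-probability would serve as the potential for that label combination.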

Dataset
• Based on PASCAL VOC 2012.
• Only images with a person.
• Annotation of: action, object, scene, and motivation (79).

[Figure: Statistics of Motivations. Histogram of count (0 to 120) per motivation.]
Motivations shown: travel, wave, take, eat, look, ride, pose, drink, play, drive, read, walk, pet, talk, wait, listen, sail, win, go, perform, race, sing, sleep, have, rest, catch, relax, show, cross, dance, give, hold, jump, kiss, prepare, serve, cut, enjoy, fix, fly, pour, protest, write, admire, blow, board, build, celebrate, clean, climb, compete, cook, count, crawl, enter, float, hang, help, hit, inspect, laugh, lead, order, paddle, practice, remove, rock, row, sell, skate, smash, smell, smile, throw, toast, visit, work, marry, transport.

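Proposed Solution: Full Solution
Scoring Function
$$\Omega(y; w, u, x, L) = \sum_{i}^{N} w_{y_i}^T \phi_i(x) + \sum_{i}^{N} u_i L_i(y_i) + \sum_{i<j}^{N} u_{ij} L_{ij}(y_i, y_j) + \sum_{i<j<k}^{N} u_{ijk} L_{ijk}(y_i, y_j, y_k)$$
[Figure: graph over the nodes o, a, s, m (presumably object, action, scene, motivation)]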
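The scoring function sums a visual term per variable plus weighted language potentials over single variables, pairs, and triples. A minimal sketch of that sum, assuming the visual scores and language potentials are supplied as callables; all function names, weights, and values below are placeholders rather than anything from the paper:

```python
from itertools import combinations


def omega(y, visual, u1, u2, u3, L1, L2, L3):
    """Score one joint assignment y = (y_0, ..., y_{N-1}).

    visual(i, y_i)                -- stands in for w_{y_i}^T phi_i(x)
    u1[i], u2[i, j], u3[i, j, k]  -- weights on the language potentials
    L1, L2, L3                    -- unary, pairwise, and triple potentials
    """
    idx = range(len(y))
    total = sum(visual(i, y[i]) for i in idx)
    total += sum(u1[i] * L1(i, y[i]) for i in idx)
    total += sum(u2[i, j] * L2(i, j, y[i], y[j])
                 for i, j in combinations(idx, 2))
    total += sum(u3[i, j, k] * L3(i, j, k, y[i], y[j], y[k])
                 for i, j, k in combinations(idx, 3))
    return total


# Toy setup with four variables and constant, made-up potentials.
n = 4
u1 = {i: 0.5 for i in range(n)}
u2 = {pair: 1.0 for pair in combinations(range(n), 2)}
u3 = {triple: 0.25 for triple in combinations(range(n), 3)}

value = omega(
    y=("bench", "sit", "park", "wait"),
    visual=lambda i, yi: 1.0,
    u1=u1, u2=u2, u3=u3,
    L1=lambda i, yi: -2.0,
    L2=lambda i, j, yi, yj: -1.0,
    L3=lambda i, j, k, yi, yj, yk: -4.0,
)
print(value)
```

With four variables there are 4 unary, 6 pairwise, and 4 triple terms, which is why the toy value works out to 4 - 4 - 6 - 4.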

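Proposed Solution: Full Solution
Learning
$$\operatorname*{argmin}_{\theta,\ \xi_n \ge 0} \ \frac{1}{2} \|\theta\|^2 + C \sum_n \xi_n$$
$$\text{s.t.} \quad \theta^T \psi(y_n, x_n) - \theta^T \psi(h, x_n) \ge \Delta(y_n, h) - \xi_n \quad \forall n, \ \forall h$$
Inference
$$y^* = \operatorname*{argmax}_{y} \ \Omega(y; w, u, x, L)$$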
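For small label sets, both the inference step and the slack in the learning constraints can be computed by brute force. The sketch below assumes exhaustive enumeration and a Hamming loss for $\Delta$; neither choice is stated on the slide, and the score table is made up:

```python
from itertools import product


def infer(score_fn, label_sets):
    """Return the joint assignment with the highest score (brute force)."""
    return max(product(*label_sets), key=score_fn)


def hamming(y, h):
    """Number of positions where two assignments differ (assumed loss)."""
    return sum(a != b for a, b in zip(y, h))


def slack(score_fn, loss_fn, y_true, label_sets):
    """Smallest xi >= 0 satisfying every margin constraint for one example.

    Each constraint reads score(y_true) - score(h) >= loss(y_true, h) - xi.
    """
    worst = max(loss_fn(y_true, h) - (score_fn(y_true) - score_fn(h))
                for h in product(*label_sets))
    return max(0.0, worst)


# Toy problem: two variables (action, motivation), two labels each.
label_sets = [("sit", "hold"), ("eat", "wait")]
scores = {
    ("sit", "eat"): 2.0,
    ("sit", "wait"): 1.5,
    ("hold", "eat"): 0.5,
    ("hold", "wait"): 0.0,
}

best = infer(scores.__getitem__, label_sets)
xi = slack(scores.__getitem__, hamming, ("sit", "wait"), label_sets)
print(best, xi)
```

Here the ground truth ("sit", "wait") scores below ("sit", "eat"), so the constraint for that competitor is violated and the slack is positive. Enumeration grows exponentially with the number of variables, so it is only practical for a handful of variables like the four used in this model.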

Results
Success

Human Label: sitting on bench in a train station because he is waiting
Top Predictions:
1. sitting on bench in a park because he is waiting
2. holding a tv in a park because he wants to take
3. holding a seal in a park because he wants to protest
4. holding a guitar in a park because he wants to play

Human Label: sitting on chair in a dining room because she wants to eat
Top Predictions:
1. sitting near table in dining room because she wants to eat
2. sitting on a sofa in a dining room because she wants to eat
3. holding a cup in a dining room because she wants to eat
4. sitting on a cup in a dining room because she wants to eat

Results
Failure

Human Label: holding a person in a living room because she wants to show
Top Predictions:
1. sitting on sofa in living room because she wants to pet
2. sitting on sofa in living room because she wants to look
3. sitting on sofa in living room because she wants to read
4. sitting on chair in living room because she wants to pet

Human Label: standing next to table because she wants to prepare
Top Predictions:
1. talking to person in dining because she wants to eat
2. standing next to table in dining room because she wants to eat
3. sitting next to table in dining because she wants to eat
4. talking to person in kitchen because she wants to eat

Results
Failure: Vision Only

Human Label: sitting on a bus in a parking lot because he wants to drive
Top Predictions:
1. because he wants to look
2. because he wants to ride
3. because he wants to drive
4. because he wants to eat

Human Label: sitting on chair in living room because she wants to read
Top Predictions:
1. because she wants to eat
2. because she wants to look
3. because she wants to drink
4. because she wants to ride

Results

                              Baseline         Our Method
                              (Vision Only)    (With Language)
Given Ideal Detectors for:
  Action+Object+Scene         13               10
  Action+Object               12               11
  Object+Scene                15               12
  Action+Scene                19               13
  Object                      19               13
  Action                      18               15

[Remaining extracted values, row assignment unclear: Scene 1 37 18 23 2 Fully Automatic 15]

Chance has rank of 39

Results
[Figure: Accuracy (0 to 1) versus Number of Top Retrievals (0 to 80). Curves: Our Model (automatic), Our Model (given ideal detectors), Baseline (automatic), Baseline (given ideal detectors), Chance.]
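A curve of accuracy against the number of top retrievals can be computed from the rank at which each image's ground-truth motivation appears. The sketch below assumes that "accuracy" means the fraction of images whose ground truth falls within the top k predictions, which the slide does not spell out; the ranks are made up, and 79 is the number of motivations given on the dataset slide:

```python
def top_k_accuracy(ranks, k):
    """Fraction of images whose ground truth is ranked within the top k.

    ranks -- 1-indexed rank of the ground-truth motivation for each image
    """
    return sum(r <= k for r in ranks) / len(ranks)


def chance_accuracy(k, num_classes=79):
    """Top-k accuracy of a uniformly random ranking over num_classes labels."""
    return min(k, num_classes) / num_classes


# Hypothetical ranks for five test images.
ranks = [1, 3, 10, 15, 40]

curve = [top_k_accuracy(ranks, k) for k in (1, 10, 20, 79)]
print(curve)
print(chance_accuracy(10))
```

Under this reading, the chance curve is a straight line from 0 to 1 as k goes from 0 to 79, and a better model is one whose curve rises above it sooner.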

Points of Strength
• Novel and important problem
• Simple model that is easy to understand
• Augmenting images with text through data mining proved to be effective

Points of Weakness
• Results are only OK (qualitatively, failure of vision-only model does not make much more sense)
• Model is linear, which is too simple
• Language queries are simple as well

Contributions
• Introducing the problem of inferring the motivation behind people's actions to the computer vision community.
• Proposing to use common knowledge mined from the web to improve computer vision systems.

Conclusion
• Interesting problem
• The proposed method is more of a baseline
• Future research can extend the prediction model and the language model

Thanks! Questions?

