Deep Affordance-Grounded Sensorimotor Object Recognition

Authors: Spyridon Thermos, Georgios Th. Papadopoulos, Petros Daras, Gerasimos Potamianos
Presented By: Thomas Crosley
UT CS 381V, Autumn 2017

Problem
● Integrate visual appearance and visual affordance information
● Object + Affordance Classification
Example: Hit Using Hammer

Affordances: “the types of actions that humans typically perform when interacting with an object.”
Examples: Sit, Throw, Workout
● https://www.youtube.com/watch?v=V4XW74W9t4o
● https://www.youtube.com/watch?v=7Qxu5cvW-ds
● https://www.youtube.com/watch?v=1xS864zYIo8

Related Work
Simpler Methods
● Factorial Conditional Random Fields and Binary SVMs [1]
● Gaussian Processes [2]
● SVMs + Clustering [3]
Smaller Data
● Few objects [1, 2, 3]
● Small number of affordances [1, 2, 3]
● Ex: 6 objects and 3 affordances [1]

RGB-D Sensorimotor Dataset

RGB-D Sensorimotor Dataset
Video: http://sor3d.vcl.iti.gr/wp-content/uploads/2017/03/sor3d.mp4?_=1

RGB-D Sensorimotor Dataset: Original Input

RGB-D Sensorimotor Dataset: Input Processing

RGB-D Sensorimotor Dataset: Data Extraction

RGB-D Sensorimotor Dataset
● 14 object types
● 13 affordances
● 54 interactions
● 105 subjects
● 4 to 8 seconds
● 20,830 instances

Architectures
● Generalized Template-Matching (GTM)
● Model spatial correlations
● Appearance CNN for object detection

Architectures
● Generalized Spatio-Temporal (GST)
● Encode time-evolving procedures
● CNN+LSTM for affordance modeling
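
As a rough illustration of the CNN+LSTM idea above, the sketch below encodes each frame independently and then folds the per-frame features through a recurrent update. Everything in it is an assumption made for illustration: `encode_frame` is a toy stand-in for a CNN, `recurrent_step` is a plain tanh recurrence standing in for an LSTM, and the 0.5 weights are arbitrary. It is not the authors' network.

```python
import math

def encode_frame(frame):
    """Toy stand-in for a CNN: reduce an HxW frame to a 2-D feature (mean, max)."""
    pixels = [p for row in frame for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]

def recurrent_step(h, x, w_h=0.5, w_x=0.5):
    """Toy stand-in for an LSTM step: an element-wise tanh recurrence."""
    return [math.tanh(w_h * hk + w_x * xk) for hk, xk in zip(h, x)]

def encode_sequence(frames):
    """Encode each frame on its own, then carry a hidden state through time."""
    h = [0.0, 0.0]  # initial hidden state
    for frame in frames:
        h = recurrent_step(h, encode_frame(frame))
    return h  # final state summarizes the whole clip

# Two tiny 1x1 "frames" shown in both orders.
dark, bright = [[0.0]], [[1.0]]
state_a = encode_sequence([dark, bright])
state_b = encode_sequence([bright, dark])
```

Because the state is carried from frame to frame, `state_a` and `state_b` differ even though both clips contain the same frames. That order sensitivity is what lets a recurrent model encode a time-evolving procedure.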

Long Short-Term Memory Networks (LSTMs)
LSTMs: a recurrent architecture capable of learning long-term dependencies
Image Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

LSTMs
Core Idea: the cell state is updated and then passed on at each time step
Image Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

LSTMs
“Forget Gate” and “Remember Gate” (the latter is commonly called the input gate)
Image Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

LSTMs
Image Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
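
The gate structure on the LSTM slides can be written out as a single time step. The following is a minimal sketch of the standard LSTM equations in plain Python, with the "forget" gate `f` and the "remember" (input) gate `i`. The parameter names (`Wf`, `Wi`, `Wg`, `Wo` and their biases) and the helper `zero_params` are hypothetical, chosen only for this illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bk for row, bk in zip(W, b)]

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step: returns the new hidden state and cell state."""
    z = h_prev + x  # concatenate previous hidden state and current input
    f = [sigmoid(a) for a in affine(p["Wf"], p["bf"], z)]    # forget gate
    i = [sigmoid(a) for a in affine(p["Wi"], p["bi"], z)]    # input ("remember") gate
    g = [math.tanh(a) for a in affine(p["Wg"], p["bg"], z)]  # candidate cell values
    o = [sigmoid(a) for a in affine(p["Wo"], p["bo"], z)]    # output gate
    # Cell state: keep part of the old state, add part of the candidate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

def zero_params(hidden, inputs):
    """Hypothetical helper: all-zero weights, so every gate outputs 0.5."""
    p = {}
    for name in ("f", "i", "g", "o"):
        p["W" + name] = [[0.0] * (hidden + inputs) for _ in range(hidden)]
        p["b" + name] = [0.0] * hidden
    return p

params = zero_params(hidden=1, inputs=2)
h, c = lstm_step([0.3, -0.7], [0.0], [2.0], params)
```

With all-zero parameters the forget gate is 0.5 and the candidate is 0, so the cell state is simply halved. Pushing the forget-gate bias strongly negative erases the old cell state instead, which is the "forgetting" behaviour the slide refers to.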

Fusion
● Given multiple sources of information
● At what point do we combine their features?
Image Source: http://cs.stanford.edu/people/karpathy/deepvideo/

Fusion
● GST Architecture
● Combines
  ○ Appearance
  ○ Affordance
● (a) Late Fusion
● (b) Slow Fusion

Architecture (figure panels)
● Slow Fusion
● Multi-Level Fusion
● Late Fusion at FC
● Late Fusion at conv
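
To make the difference between fusing at the FC level and fusing at the conv level concrete, here is a small sketch in plain Python. It assumes concatenation as the fusion operator and uses made-up weights. The function names (`late_fusion_fc`, `fuse_conv_maps`, `conv1x1`) are hypothetical and not taken from the paper.

```python
import math

def linear(W, b, v):
    """Return W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bk for row, bk in zip(W, b)]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(a - m) for a in z]
    s = sum(e)
    return [a / s for a in e]

def late_fusion_fc(appearance_feat, affordance_feat, W, b):
    """FC-level fusion: concatenate per-stream feature vectors, then classify."""
    fused = appearance_feat + affordance_feat
    return softmax(linear(W, b, fused))

def fuse_conv_maps(appearance_maps, affordance_maps):
    """Conv-level fusion: stack the two streams' feature maps along channels."""
    return appearance_maps + affordance_maps

def conv1x1(maps, weights):
    """Mix C channels into one output map with a 1x1 convolution."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(w * m[r][c] for w, m in zip(weights, maps))
             for c in range(cols)] for r in range(rows)]

# FC-level: each stream is already reduced to a feature vector.
probs = late_fusion_fc([1.0], [2.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])

# Conv-level: spatial layout is still available when the streams are mixed.
stacked = fuse_conv_maps([[[1.0, 2.0]]], [[[3.0, 4.0]]])
mixed = conv1x1(stacked, [1.0, 1.0])
```

The design trade-off is where spatial structure is discarded: FC-level fusion combines the streams only after each has been flattened, while conv-level fusion lets later layers see both streams at every spatial location.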

Results
● Single Stream
● (Best) Template Matching
● (Best) Spatio-Temporal

Open Problems
● Authors’ Thoughts
  ○ NN-Autoencoders for human-object interactions
  ○ “In-the-wild” object-affordance detection
● Others
  ○ Affordance identification for control tasks
  ○ Better temporal sampling schemes
