
Video: The Jenga-playing robot (MIT) CMP722 ADVANCED COMPUTER VISION Lecture #10 – Modeling the Physical World Aykut Erdem // Hacettepe University // Spring 2019

Illustration: Kevin Hong // Quanta Magazine Previously on CMP722 • graph structured data • graph neural nets (GNNs) • GNNs for “classical network problems”

Lecture overview • physical scene understanding • intuitive physics • interaction networks • relation networks • visual interaction networks • learning physics engines via graph networks • Disclaimer: Much of the material and slides for this lecture were borrowed from Peter Battaglia’s slides on “Structure in physical intelligence” 3

How do you understand a scene? 4

How do you understand a scene? 1. Parse it into physical objects and relations ("Precarious") 2. Reason about the objects and their interactions: Fall? Attached? Support? 5

“Infinite use of finite means” - von Humboldt, on the productivity of language "Precarious" 6

Kenneth Craik, “The Nature of Explanation”, 1943: "If the organism carries a 'small-scale model' of external reality and of its own possible actions within its head, it is able to try out various alternatives, conclude which is the best of them, react to future situations before they arise, utilize the knowledge of past events in dealing with the present and future, and in every way to react in a much fuller, safer, and more competent manner to the emergencies which face it." (pg 61) "This concept of 'thinghood' is of fundamental importance for any theory of thought." (pg 77) 7

Claim: Human intelligence is structured Founded on objects, relations, reasoning • Objects and relations reflect decisions made by evolution, experience, and task demands about how to represent the world in an efficient and useful way • Structure in our core cognitive knowledge evident very early in infancy (Spelke) • Model-building over recognizing patterns (Tenenbaum) • Combinatorial generalization via compositionality (“infinite use of finite means”) 8

What is the mechanism of human intuitive physics? Intuitive Physics Engine: the "physics engine in the head" Battaglia, Hamrick, Tenenbaum, 2013, PNAS 9

Experiments: What will happen? Why? Will it fall? In which direction? Different masses Infer the mass Complex scenes Predict fluids Battaglia et al., 2013 Hamrick et al., 2016 Bates et al., 2015, 2018 10

Message from cognition Humans use richly structured representations of objects and relations to reason about, and interact with, their everyday environment. What insights does humans’ structured intelligence offer AI? 11

We need better object- and relation-centric models in AI A graph is a natural way to represent entities and their relations: • “Nodes” correspond to entities, objects, events, etc. • “Edges” correspond to their relations, interactions, transitions, etc. • Inferences about entities and relations respect the graphical structure. Graphs can capture data from many complex systems: • Physical systems • Search trees • Scene graphs • Communication networks • Social networks • Transportation networks • Linguistic structure • Chemical structure • Programs • Phylogenetic trees 12
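The node and edge view can be made concrete with a tiny scene graph. The following is a minimal sketch, assuming a hypothetical stack of three blocks on a table; the names (`nodes`, `edges`, `carried_by`) and the attributes are illustrative and do not come from the lecture.

```python
# Hypothetical scene: three blocks stacked on a table.
# Nodes are entities with attributes.
nodes = {
    "table": {"kind": "surface"},
    "block_a": {"kind": "block", "mass": 2.0},
    "block_b": {"kind": "block", "mass": 1.0},
    "block_c": {"kind": "block", "mass": 0.5},
}

# Edges are directed relations: (sender, receiver, relation).
edges = [
    ("table", "block_a", "supports"),
    ("block_a", "block_b", "supports"),
    ("block_b", "block_c", "supports"),
]


def carried_by(name):
    """Return every object that directly or indirectly rests on `name`."""
    above = []
    for sender, receiver, relation in edges:
        if sender == name and relation == "supports":
            above.append(receiver)
            above.extend(carried_by(receiver))
    return above


# Inference follows the graph structure: removing block_a affects
# exactly the objects reachable through "supports" edges.
print(carried_by("block_a"))
```

Asking which objects depend on `block_a` is answered by following edges only, which is the sense in which inference respects the graphical structure.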

Intuitive physics as reasoning about graphs 13

Intuitive physics as reasoning about graphs 14

Interaction Network Strong relational inductive bias: Deep learning architecture which operates on graphs Related to the broad family of "Graph Neural Networks" (Scarselli et al, 2009; Li et al, 2015) and "Message-Passing Neural Networks" (Gilmer et al., 2017). Chang et al. (2016) also proposed a similar version in parallel. Battaglia et al., 2016, NeurIPS 15

Interaction Network Battaglia et al., 2016, NeurIPS 16
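One step of an interaction-network-style computation can be sketched as follows: a relation model computes an effect for each directed edge, effects are summed per receiving object, and an object model updates each object from its aggregated effect. This is only a sketch under stated assumptions: the two models here are hand-written physics (a 1D zero-rest-length spring and an Euler update) standing in for the learned networks, and all function and field names are hypothetical.

```python
def relation_model(sender, receiver, rel):
    # Stand-in for the learned relation network (assumption):
    # force on the receiver from a zero-rest-length spring to the sender.
    return rel["k"] * (sender["x"] - receiver["x"])


def object_model(obj, effect, dt):
    # Stand-in for the learned object network (assumption):
    # semi-implicit Euler update driven by the aggregated effect.
    v = obj["v"] + dt * effect / obj["mass"]
    x = obj["x"] + dt * v
    return {"x": x, "v": v, "mass": obj["mass"]}


def interaction_step(objects, relations, dt):
    # 1. Relation model: one effect per directed edge (sender, receiver, attrs).
    # 2. Aggregation: sum the effects arriving at each receiver.
    effects = [0.0] * len(objects)
    for s, r, rel in relations:
        effects[r] += relation_model(objects[s], objects[r], rel)
    # 3. Object model: update every object from its aggregated effect.
    return [object_model(o, e, dt) for o, e in zip(objects, effects)]


objects = [
    {"x": 0.0, "v": 0.0, "mass": 1.0},
    {"x": 1.0, "v": 0.0, "mass": 1.0},
]
# One spring, represented as two directed edges.
relations = [(0, 1, {"k": 2.0}), (1, 0, {"k": 2.0})]

next_objects = interaction_step(objects, relations, dt=0.1)
print(next_objects)
```

Because the same relation and object functions are applied to every edge and every object, the step works unchanged for any number of objects and relations.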

Interaction Network Can learn a general-purpose physics engine, simulating future states from initial ones • Gravitational forces • Rigid collisions between walls and balls • Springs and rigid collisions Battaglia et al., 2016, NeurIPS 17

1000-step rollouts from 1-step supervised training (panels: n-body, balls, strings; ground truth vs. model) Battaglia et al., 2016, NeurIPS 18

Zero-shot generalization to larger systems (panels: n-body, balls, strings; ground truth vs. model) Battaglia et al., 2016, NeurIPS 19

Interaction Network for system-level predictions A "global model" can be added, which aggregates the per-object outputs to make predictions. Can be trained to predict potential energy of a system, outperforming MLP baselines Battaglia et al., 2016, NeurIPS 20

Relation Network Remove the “object model” and predict global outputs using only the output of the “relation model” Raposo et al., 2017, ICLR workshop; Santoro et al., 2017, NeurIPS 21
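The relation network computation can be sketched as a function g applied to every ordered pair of objects, a sum over the pairwise outputs, and a readout f applied to the pooled result. In this minimal sketch g and f are simple hand-written functions standing in for the learned networks, and the name `relation_network` is an assumption.

```python
from itertools import permutations


def relation_network(objects, g, f):
    # Apply g to every ordered pair (i != j), sum the results,
    # then apply the readout f to the pooled value.
    pooled = sum(g(a, b) for a, b in permutations(objects, 2))
    return f(pooled)


# Stand-ins for learned functions (assumptions for illustration):
# g measures the distance between two 1D points,
# f halves the pooled value because each pair is counted twice.
def g(a, b):
    return abs(a - b)


def f(pooled):
    return pooled / 2


points = [0.0, 1.0, 3.0]
total_distance = relation_network(points, g, f)
shuffled_result = relation_network([3.0, 0.0, 1.0], g, f)
print(total_distance, shuffled_result)
```

Summing over pairs makes the output independent of the order in which objects are listed, which is why both calls print the same value.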

Relation Networks can infer relations in dot motion • Trained on mass-spring systems (panels: input, model, ground truth) • Generalizes to point-light walkers (panels: input, model, ground truth) Santoro et al., 2017, NeurIPS 22

"Visual interaction network" An interaction network augmented with a learnable perception system 23

"Visual interaction network" Multi-frame encoder (conv net-based) Interaction network Watters et al., 2017, NeurIPS 24

"Visual interaction network" Spring Gravity Magnetic Billiards Billiards Drift Can even predict invisible objects, inferred from how they affect visible ones Watters et al., 2017, NeurIPS 25

Learning to simulate more complex robotic systems Alvaro Sanchez-Gonzalez, Nicolas Heess, Tobi Springenberg, Josh Merel, Martin Riedmiller, Raia Hadsell, Peter Battaglia ICML, 2018 26

Systems: "DeepMind Control Suite" (Mujoco) & real JACO JACO Arm DeepMind Control Suite (Tassa et al., 2018) 27

Systems: "DeepMind Control Suite" (Mujoco) & real JACO JACO Arm 28

Kinematic tree of the actuated system as a graph Representing physical system as a graph: • Bodies → Nodes • Joints → Edges • Global properties Similar representation to: • Interaction Networks (Battaglia et al. 2016) • NerveNet (Wang et al. 2018) (graph-structured policy, rather than model) 29
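The mapping from bodies to nodes and joints to edges can be sketched for a chain such as the Swimmer. This is a minimal sketch: the function name `swimmer_graph`, the attribute names, and the choice of one directed edge per joint are assumptions for illustration, not the representation used in the paper.

```python
def swimmer_graph(n_links):
    """Build a chain-shaped graph: one node per body, one edge per joint."""
    # Nodes: per-body state (hypothetical attributes, zero-initialized).
    nodes = [{"position": 0.0, "velocity": 0.0} for _ in range(n_links)]
    # Edges: joint k connects body k (parent) to body k + 1 (child).
    edges = [
        {"sender": k, "receiver": k + 1, "action": 0.0}
        for k in range(n_links - 1)
    ]
    # Globals: properties shared by the whole system (hypothetical).
    global_attrs = {"gravity": 9.81, "timestep": 0.01}
    return {"nodes": nodes, "edges": edges, "globals": global_attrs}


small = swimmer_graph(3)
large = swimmer_graph(6)
print(len(small["nodes"]), len(small["edges"]))
print(len(large["nodes"]), len(large["edges"]))
```

Adding links changes only the number of nodes and edges, not their format, so a model that operates per node and per edge can be applied to systems of different sizes.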

Graph Network (GN) Battaglia et al., 2018 Graph-to-graph, modular block design: Edge update, Node update, Global update 30
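The three updates of a GN block can be sketched in order: edges are updated from their endpoints and the global attribute, nodes are updated from their aggregated incoming edges, and the global attribute is updated from the aggregated edges and nodes. The sketch below assumes scalar attributes and sum aggregation; the update functions `phi_e`, `phi_v`, and `phi_u` are simple hand-written stand-ins for learned networks, and their argument order is a choice made here for illustration.

```python
def gn_block(nodes, edges, u, phi_e, phi_v, phi_u):
    # Edge update: each edge sees its own attribute, both endpoints, and u.
    new_edges = [
        (s, r, phi_e(e, nodes[s], nodes[r], u)) for s, r, e in edges
    ]
    # Aggregate updated edges per receiving node (sum aggregation).
    incoming = [0.0] * len(nodes)
    for s, r, e in new_edges:
        incoming[r] += e
    # Node update: each node sees its aggregated incoming edges and u.
    new_nodes = [phi_v(incoming[i], v, u) for i, v in enumerate(nodes)]
    # Global update: u sees all updated edges and nodes, aggregated.
    edge_sum = sum(e for _, _, e in new_edges)
    node_sum = sum(new_nodes)
    new_u = phi_u(edge_sum, node_sum, u)
    return new_nodes, new_edges, new_u


# Hand-written stand-ins for the learned update functions (assumptions).
def phi_e(e, v_sender, v_receiver, u):
    return e + v_sender - v_receiver


def phi_v(agg_edges, v, u):
    return v + agg_edges


def phi_u(edge_sum, node_sum, u):
    return u + edge_sum + node_sum


nodes = [1.0, 2.0, 4.0]
edges = [(0, 1, 0.0), (1, 2, 0.0)]  # (sender, receiver, attribute)
out_nodes, out_edges, out_u = gn_block(nodes, edges, 0.0, phi_e, phi_v, phi_u)
print(out_nodes, out_edges, out_u)
```

The block takes a graph and returns a graph with the same structure, so blocks of this kind can be stacked or applied recurrently.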

Forward model: supervised, 1-step training with random control inputs Input graph (t) → Next graph (t+1) Chained 100-step predictions Sanchez-Gonzalez et al., 2018, ICML 31
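Chained predictions come from feeding a 1-step model its own output. A minimal sketch, assuming a hypothetical hand-written `one_step_model` in place of the learned GN forward model and a state made of a position and a velocity:

```python
def one_step_model(state, control):
    # Stand-in for a learned 1-step forward model (assumption):
    # the control acts as an acceleration over a unit time step.
    x, v = state
    v = v + control
    x = x + v
    return (x, v)


def rollout(model, initial_state, controls):
    """Chain 1-step predictions: each output becomes the next input."""
    states = [initial_state]
    for control in controls:
        states.append(model(states[-1], control))
    return states


trajectory = rollout(one_step_model, (0, 0), [1, 1, 1])
print(trajectory)
```

Since each prediction is built on the previous one, errors of a learned model can compound over the rollout, which is why long chained predictions are a demanding test of a model trained only on single steps.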

Results: Graph Net (GN) vs MLP forward models More repeated structure: Better test generalization, Better performance over MLP within and outside of the training distribution Sanchez-Gonzalez et al., 2018, ICML 32

GN forward model: Multiple systems & zero-shot generalization Single model trained: • Pendulum, Cartpole, Acrobot, Swimmer6 & Cheetah Zero-shot generalization: Swimmer • # training links: {3, 4, 5, 6, -, 8, 9, -, -, ...} • # testing links: {-, -, -, -, 7, -, -, 10-14} Sanchez-Gonzalez et al., 2018, ICML 33

GN forward model: Real JACO data Recurrent graph network (Real JACO trajectories, rendered using Mujoco) Sanchez-Gonzalez et al., 2018, ICML 34

System identification: GN-based inference, under diagnostic control inputs Unobserved system parameters (e.g. mass, length) are implicitly inferred Sanchez-Gonzalez et al., 2018, ICML 35

Using learned models for control 36

Control: Model-based planning Trajectory optimization: the GN-based forward model is differentiable, so we can backpropagate through it, and find a sequence of actions that maximize reward Sanchez-Gonzalez et al., 2018, ICML 37
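Gradient-based trajectory optimization can be sketched on a toy problem. The sketch below assumes a hand-written differentiable model (a 1D point whose position changes by the action at each step) and a reward that penalizes the squared distance of the final position from a goal. The gradients are derived by hand with the chain rule rather than by automatic differentiation, and every name is hypothetical.

```python
def forward(x0, actions):
    # Differentiable stand-in for the forward model: x_{t+1} = x_t + a_t.
    xs = [x0]
    for a in actions:
        xs.append(xs[-1] + a)
    return xs


def reward(x_final, goal):
    return -(x_final - goal) ** 2


def action_gradients(x0, actions, goal):
    # Backward pass through the unrolled model (chain rule by hand).
    xs = forward(x0, actions)
    grad_x = -2.0 * (xs[-1] - goal)  # d reward / d x_T
    grads = [0.0] * len(actions)
    for t in reversed(range(len(actions))):
        grads[t] = grad_x * 1.0  # d x_{t+1} / d a_t = 1
        grad_x = grad_x * 1.0    # d x_{t+1} / d x_t = 1
    return grads


def optimize(x0, goal, horizon=5, lr=0.05, iters=50):
    # Gradient ascent on the action sequence to maximize reward.
    actions = [0.0] * horizon
    for _ in range(iters):
        grads = action_gradients(x0, actions, goal)
        actions = [a + lr * g for a, g in zip(actions, grads)]
    return actions


plan = optimize(x0=0.0, goal=1.0)
final_x = forward(0.0, plan)[-1]
print(final_x, reward(final_x, 1.0))
```

With a learned model, the hand-written backward pass would be replaced by automatic differentiation through the model, but the loop structure (roll out, compute reward, backpropagate to the actions, update) stays the same.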

Control: Multiple systems via a single model Sanchez-Gonzalez et al., 2018, ICML 38

Control: Zero-shot control Sanchez-Gonzalez et al., 2018, ICML 39

Control: Multiple reward functions Sanchez-Gonzalez et al., 2018, ICML 40

Learning to use mental simulation 41

Learning to use mental simulation "Imagination-based metacontroller" "Spaceship task": • Navigate to your home planet by choosing a force vector • Challenging because the planets exert gravity The agent learns 3 components: 1. Action policy (via stochastic value gradients (Heess et al. 2015)) 2. GN-based forward model (via supervised 1-step training) 3. Internal strategy for using imagination to test potential actions before selecting one to execute (via REINFORCE) Hamrick et al., 2017, ICLR 42

Learning to use mental simulation "Imagination-based planner" • Red: real actions • Blue: 1 step of imagination • Green: 2+ steps of imagination Pascanu et al., 2017, arXiv 43

Graph-structured model-free policies 44
