Synthesize for Learning: Joint Analysis of 2D Images and 3D Shapes. Hao Su. [Title figure: the image world and the shape world]
How do humans represent 3D in the mind?
Mental rotation, by Roger N. Shepard (National Medal of Science laureate, Stanford) and Lynn Cooper (Professor, Columbia University)
Shape constancy
3D Perception is important for robots Cosimo Alfredo Pina, “The domestic robots are getting closer”
2D-3D lifting by machine learning: cues include contrast, color, texture, motion, symmetry, parts, and category-specific 3D knowledge, …
Synthesize for learning: from virtual world to real world • First, build & learn in a 3D virtual environment: Shape Database (a shape repository with rich annotations) -> Simulator -> synthetic sensory data labeled with class, viewpoint, and object attributes (material, symmetry, …) -> Training
Synthesize for learning: from virtual world to real world • Then, adapt to the 2D real world: test on real data, predicting object attributes
Machine learning is data hungry. Review: image classification datasets. [Plot: # images (log scale) vs. year, 2000-2010, showing Caltech 101, LabelMe, Caltech 256, CIFAR, and ImageNet; dataset sizes have grown by orders of magnitude]
Status review of 3D datasets: <= 10,000 models in total, <= 100 object classes, <= 60 models per class (on average)
Status review of 3D datasets. [Plot: the same dataset-size-vs-year chart, with state-of-the-art 3D shape datasets far below ImageNet] Existing 3D datasets are limited in scale, object classes, and diversity.
My work: build large-scale 3D datasets of objects: ~3 million models in total, ~2,000 classes, rich annotations (in progress)
An object-centric 3D knowledge base: part decomposition, symmetry, affordance, physical properties, material, images, semantics
ShapeNet: a large-scale 3D dataset of objects. [Scatter plot: # models vs. # models per class, log-log; ShapeNet is orders of magnitude larger than prior datasets such as PSB, ESB, MSB, TSB, WMB, BAB, CCCC, SHREC12, and SHREC14]
My work: develop data-driven 3D learning algorithms. ShapeNet (a shape repository with rich annotations) -> Simulator -> synthetic sensory data labeled with class, viewpoint, and object attributes (material, symmetry, …) -> Training
Application 1: 3D viewpoint estimation. ICCV 2015 oral: Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views. Viewpoint is parameterized by azimuth, elevation, and in-plane rotation (e.g., for a car); a sketch of this parameterization follows.
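To make the three angles concrete, here is a minimal sketch of composing azimuth, elevation, and in-plane rotation into a camera rotation matrix. The axis conventions and composition order below are one common choice, not necessarily the exact convention used in the paper.

```python
import numpy as np

def viewpoint_to_rotation(azimuth, elevation, tilt):
    """Compose azimuth/elevation/in-plane rotation (radians) into a 3x3
    camera rotation matrix. Axis conventions vary between papers, so
    treat this particular composition as illustrative."""
    ca, sa = np.cos(azimuth), np.sin(azimuth)
    ce, se = np.cos(elevation), np.sin(elevation)
    ct, st = np.cos(tilt), np.sin(tilt)
    Rz_a = np.array([[ca, -sa, 0], [sa, ca, 0], [0, 0, 1]])  # azimuth about the up axis
    Rx_e = np.array([[1, 0, 0], [0, ce, -se], [0, se, ce]])  # elevation about the side axis
    Rz_t = np.array([[ct, -st, 0], [st, ct, 0], [0, 0, 1]])  # in-plane rotation about the view axis
    return Rz_t @ Rx_e @ Rz_a
```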
Accurate viewpoint label acquisition is expensive: in the PASCAL3D+ dataset [Xiang et al.], annotation takes ~1 min per object
High-capacity model, high-cost label acquisition: AlexNet [Krizhevsky et al.] has 60M parameters, but the PASCAL3D+ dataset [Xiang et al.] offers only 30K images with viewpoint labels. How can we get MORE images with ACCURATE viewpoint labels?
Manual alignment by annotators vs. automatic alignment through rendering
A “Data Engineering” journey: 95% on the synthetic validation set, but only 47% on the real test set :( ConvNet: “Aha, I know! Viewpoint is just the brightness pattern!”
A “Data Engineering” journey. Randomize lighting: 47% -> 74%. ConvNet: “Hmm, viewpoint is not the brightness pattern. Maybe it’s the contour?”
A “Data Engineering” journey. Add backgrounds: 74% -> 86%. ConvNet: “It becomes really hard! Let me look more closely into the picture.”
A “Data Engineering” journey. Bbox crop & texture: 86% -> 93%. ConvNet: “The mapping becomes hard. I have to learn harder to get it right!” Key lesson: don’t give the CNN a chance to “cheat”; it’s very good at it. When there is no way to cheat, true learning starts.
Render for CNN image synthesis pipeline: 3D model -> rendering -> add background -> crop, with hyper-parameters estimated from real images
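A minimal sketch of what one pass of such a pipeline might look like. `render_fn`, `model_path`, and `backgrounds` are hypothetical stand-ins: the actual pipeline renders with a full 3D renderer and samples lighting, viewpoint, and crop parameters from distributions estimated on real images.

```python
import random
from PIL import Image

def synthesize_example(render_fn, model_path, backgrounds):
    """One render-and-composite pass (illustrative). `render_fn` is a
    stand-in for a real renderer that returns an RGBA image of the
    model under the sampled viewpoint and lighting."""
    # 1. Sample viewpoint and lighting; in the real pipeline both
    #    distributions are estimated from real images.
    view = {"azimuth": random.uniform(0, 360),
            "elevation": random.uniform(-20, 60),
            "tilt": random.uniform(-10, 10)}
    lighting = {"num_lights": random.randint(2, 4),
                "energy": random.uniform(0.5, 2.0)}
    fg = render_fn(model_path, view, lighting)  # RGBA; alpha = object mask

    # 2. Paste the rendering over a random real background crop.
    bg = Image.open(random.choice(backgrounds)).convert("RGB")
    bg = bg.resize(fg.size)
    bg.paste(fg, (0, 0), mask=fg.split()[3])

    # 3. Jitter the crop box so the network cannot rely on exact framing.
    w, h = bg.size
    dx, dy = random.randint(0, w // 10), random.randint(0, h // 10)
    img = bg.crop((dx, dy, w - dx, h - dy))
    return img, view  # the image and its exact viewpoint label
```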
2.4M synthesized images for 12 categories • High scalability • High quality • Overfit-resistant • Accurate labels
Metric: viewpoint accuracy and median angle error (lower is better). Our model trained on rendered images outperforms the state-of-the-art model trained on real images; real test images are from the PASCAL3D+ dataset. [Bar chart: viewpoint median error; Render for CNN (ours) is lower than Vps&Kps (CVPR15)]
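For reference, viewpoint error is typically measured as the geodesic distance between the predicted and ground-truth rotation matrices, and the accuracy metric counts predictions within pi/6 (30 degrees). A short sketch of both; the `predictions` list here is a hypothetical placeholder for (predicted, ground-truth) rotation pairs.

```python
import numpy as np

def angle_error_deg(R_pred, R_gt):
    """Geodesic distance between two 3x3 rotation matrices, in degrees."""
    R = R_pred.T @ R_gt
    cos = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos))

# Median error over a test set; accuracy = fraction within 30 degrees,
# the pi/6 threshold commonly used with PASCAL3D+.
errors = np.array([angle_error_deg(Rp, Rg) for Rp, Rg in predictions])
median_err = np.median(errors)
acc_pi_6 = np.mean(errors <= 30.0)
```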
Results
Application 2: 3D human pose estimation 3DV 2015 oral: Synthesizing Training Images for Boosting Human 3D Pose Estimation
Challenge: clothing variation 3DV 2015 oral: Synthesizing Training Images for Boosting Human 3D Pose Estimation
Automatic texture transfer from images to shapes 3DV 2015 oral: Synthesizing Training Images for Boosting Human 3D Pose Estimation
Effectiveness of texture augmentation
Texture transfer for rigid objects. SIGGRAPH Asia 2016: Unsupervised Texture Transfer from Images to Model Collections. Product photos -> automatically textured shapes
Domain adaptation between virtual and real. 3DV 2015 oral: Synthesizing Training Images for Boosting Human 3D Pose Estimation. Map features from real and synthetic images into a shared domain
Adversarial-learning-based domain adaptation. 3DV 2015 oral: Synthesizing Training Images for Boosting Human 3D Pose Estimation
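One common way to implement adversarial feature alignment is a domain discriminator trained through a gradient-reversal layer, in the style of DANN [Ganin & Lempitsky]. The sketch below follows that formulation and is not necessarily the exact architecture or loss used in the paper; the layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on the
    backward pass, so the feature extractor learns to FOOL the
    domain discriminator."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

class DomainAdversarialHead(nn.Module):
    """Predicts synthetic-vs-real from features passed through the
    gradient-reversal layer."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.clf = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(),
            nn.Linear(128, 2))  # two domains: synthetic / real

    def forward(self, feats, lam=1.0):
        return self.clf(grad_reverse(feats, lam))
```

Minimizing the discriminator's cross-entropy loss then simultaneously trains the discriminator to separate the domains and (through the reversed gradient) trains the feature extractor to make the two domains indistinguishable.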
Results: 3D human pose estimation
Application 3: Attention-based object identification SIGGRAPH Asia 2016: 3D Attention-Driven Depth Acquisition for Object Identification
Background: 1. How is the scene composed? 2. What are these objects?
Background: ShapeNet -> object identification
Autonomous object identification
The main challenge: the next-best-view problem • Observation is partial and progressive -> view planning • Must assess views whose observations are still unknown (one observed view vs. many unobserved views). How can you know which view is better without knowing its observation?
Simulate for reinforcement learning • Train on virtually scanned ShapeNet models using reinforcement learning • Test in a real environment
The general framework
The general framework: a loop of Goal -> Action -> Observe -> Belief. View planning: evaluate a view based on history. Recognition: incremental classification based on history.
Attention mechanism • Goal-oriented and stimulus-driven (Control of goal-directed and stimulus-driven attention in the brain, Nature Reviews Neuroscience, 2002). Glimpse -> internal representation (stores the history) -> perform task, trained with supervision or reward.
3D Recurrent Attention Model. [Architecture diagram: starting from an initial view chosen by a discriminative view-selection module, each time step extracts features from the current depth image, a recurrent view-aggregation layer updates the hidden state, a classification branch emits an incremental label, and an NBV-emission branch outputs the angles of the next view.]
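A minimal sketch of such a model's per-step computation. The layer sizes and the two-angle view parameterization are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class RecurrentAttention3D(nn.Module):
    """Per-step computation of a 3D recurrent attention model: a CNN
    encodes the current depth view, a GRU aggregates views over time,
    and two heads emit (a) incremental class scores and (b) the
    next-best-view angles."""
    def __init__(self, n_classes, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(                 # depth image -> feature vector
            nn.Conv2d(1, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, hidden))
        self.rnn = nn.GRUCell(hidden, hidden)         # view aggregation over history
        self.classify = nn.Linear(hidden, n_classes)  # incremental recognition
        self.nbv = nn.Linear(hidden, 2)               # next view: (azimuth, elevation)

    def step(self, depth_img, h):
        f = self.encoder(depth_img)
        h = self.rnn(f, h)
        # class logits, next view normalized to [-1, 1]^2, updated state
        return self.classify(h), torch.tanh(self.nbv(h)), h
```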
Reinforcement learning needs LOTS of data to train! • Simulate many scan sequences in the virtual environment
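Because the reward (a correct identification) arrives only after a full scan sequence, the view-selection policy is typically trained with a policy-gradient method over those simulated sequences. Below is a minimal REINFORCE-style loss for one sequence; the discount factor, normalization baseline, and reward shaping are illustrative assumptions, not the paper's exact choices.

```python
import torch

def reinforce_loss(view_log_probs, rewards, gamma=0.99):
    """REINFORCE objective for the view-selection policy (illustrative).
    `view_log_probs`: log pi(v_t | h_t), one scalar tensor per step of
    a simulated scan sequence. `rewards`: per-step rewards, e.g. +1 at
    the end if the final identification is correct."""
    returns, g = [], 0.0
    for r in reversed(rewards):          # discounted return-to-go
        g = r + gamma * g
        returns.append(g)
    returns = torch.tensor(list(reversed(returns)))
    # Normalize as a crude baseline to reduce gradient variance.
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    return -(torch.stack(view_log_probs) * returns).sum()
```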
Results
Quantitative results
Reconstructed 3D scene SIGGRAPH Asia 2016: 3D Attention-Driven Depth Acquisition for Object Identification
Summary • Key theme: learn in a virtual environment of 3D shapes, test on real scenes of 2D RGB(-D) images • Data: build a large-scale 3D database (ShapeNet) with rich annotations • Synthesize training data for deep learning, applicable to many tasks (at the intersection of ML, CG, and CV)
Thank you!