3D Deep Learning, Hao Su @Stanford CS231n Guest Lecture

Broad Applications of 3D data: Robotics, Augmented Reality, Autonomous driving, Medical Image Processing

Traditional 3D Vision Multi-view Geometry: Physics based

3D Learning: Knowledge Based

Acquire Knowledge of 3D World by Learning

The Representation Challenge of 3D Deep Learning: Rasterized form (regular grids) vs. Geometric form (irregular)

The Representation Challenge of 3D Deep Learning: Multi-view, Volumetric, Point Cloud, Mesh (Graph CNN), Part Assembly, Implicit Shape $F(x) = 0$

The Richness of 3D Learning Tasks. 3D Analysis: Detection, Segmentation, Classification, Correspondence (object/scene)

The Richness of 3D Learning Tasks. 3D Synthesis: Monocular 3D reconstruction, Shape completion, Shape modeling

Agenda • 3D Classification • 3D Reconstruction • Others

Volumetric CNN

Can we use CNNs but avoid projecting the 3D data to views first? Straightforward idea: Extend 2D grids → 3D grids

Voxelization Represent the occupancy of regular 3D grids
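
A minimal sketch of voxelization, assuming the input is a point set sampled from the shape surface and a uniform scale into a cubic grid. The function name `voxelize` and the default resolution of 30 are illustrative choices, not taken from the slides.

```python
import numpy as np

def voxelize(points, resolution=30):
    """Map an (N, 3) point array to a boolean occupancy grid of shape (R, R, R)."""
    points = np.asarray(points, dtype=np.float64)
    mins = points.min(axis=0)
    extent = (points.max(axis=0) - mins).max()
    extent = extent if extent > 0 else 1.0
    # Uniformly scale the bounding box into [0, R), then floor to voxel indices.
    idx = np.floor((points - mins) / extent * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

# Example: points sampled on a unit sphere (synthetic data for illustration).
rng = np.random.default_rng(0)
pts = rng.normal(size=(2000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
grid = voxelize(pts, resolution=30)
print(grid.shape, int(grid.sum()), "occupied voxels out of", grid.size)
```

Only a thin shell of the grid ends up occupied, which is the property that the sparsity-based methods later in the deck exploit.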

3D CNN on Volumetric Data 3D convolution uses 4D kernels
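
To make the "4D kernel" statement concrete, here is a naive sketch of one output channel of a 3D convolution: the kernel has shape (channels, k, k, k), one more dimension than a 2D convolution kernel. The function name `conv3d_single` is hypothetical, and the loop form (valid padding, stride 1, cubic kernel) is chosen for readability rather than speed.

```python
import numpy as np

def conv3d_single(volume, kernel):
    """Valid cross-correlation of a (C, D, H, W) volume with one (C, k, k, k) kernel."""
    c, d, h, w = volume.shape
    kc, k, _, _ = kernel.shape
    assert c == kc, "kernel and volume must have the same channel count"
    out = np.zeros((d - k + 1, h - k + 1, w - k + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # Each output value sums over channels and a k x k x k window.
                out[z, y, x] = np.sum(volume[:, z:z + k, y:y + k, x:x + k] * kernel)
    return out

volume = np.ones((2, 4, 4, 4))   # 2 channels, 4x4x4 grid
kernel = np.ones((2, 3, 3, 3))   # 4D kernel for one output channel
out = conv3d_single(volume, kernel)
print(out.shape)  # (2, 2, 2)
```

A full layer stacks one such 4D kernel per output channel, so both parameters and computation grow cubically with kernel size.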

Complexity Issue. AlexNet, 2012: input resolution 224x224 (224x224 = 50176). 3DShapeNets, 2015: input resolution 30x30x30 (30x30x30 = 27000).

Complexity Issue: Polygon Mesh → Occupancy Grid (30x30x30). Information loss in voxelization

Idea 1: Learn to Project. Idea: “X-ray” rendering + Image (2D) CNNs; very low #param, very low computation. Su et al., “Volumetric and Multi-View CNNs for Object Classification on 3D Data”, CVPR 2016. Many other works in autonomous driving use bird’s eye view for object detection.

More Principled: Sparsity of 3D Shapes [Figure: occupancy at resolutions 32, 64, 128]

Store only the Occupied Grids • Store the sparse surface signals • Constrain the computation near the surface
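
The sparsity argument can be checked with a small sketch that stores only occupied voxel coordinates in a set. The shape (a unit sphere surface sampled with random points) and the function name `sphere_surface_voxels` are assumptions for illustration; the printed ratios are whatever this synthetic example yields, not figures from the slides.

```python
import numpy as np

def sphere_surface_voxels(resolution, n_points=100_000, seed=0):
    """Return the set of occupied voxel coordinates for a sampled unit sphere surface."""
    rng = np.random.default_rng(seed)
    pts = rng.normal(size=(n_points, 3))
    pts /= np.linalg.norm(pts, axis=1, keepdims=True)
    # Map [-1, 1] to voxel indices in [0, resolution).
    idx = np.floor((pts + 1.0) / 2.0 * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    return {tuple(v) for v in idx.tolist()}

for r in (32, 64, 128):
    occupied = sphere_surface_voxels(r)
    print(f"resolution {r}: {len(occupied)} of {r ** 3} voxels "
          f"({len(occupied) / r ** 3:.2%} occupied)")
```

The occupied count grows roughly with the square of the resolution while the dense grid grows with its cube, so the occupied fraction shrinks as resolution increases.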

Octree: Recursively Partition the Space • Each internal node has exactly eight children • Neighborhood searching: Hash table
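
A minimal sketch of hash-table neighborhood search over sparse voxels, assuming integer voxel coordinates as keys and a 26-connected neighborhood. The helper names `build_hash` and `neighbors` are hypothetical; real octree implementations typically hash a packed key per tree level instead of a coordinate tuple.

```python
from itertools import product

def build_hash(occupied):
    """Map each occupied voxel coordinate to a row index in a feature array."""
    return {coord: i for i, coord in enumerate(sorted(occupied))}

def neighbors(table, coord):
    """Return feature indices of occupied voxels among the 26 neighbors of coord."""
    x, y, z = coord
    found = []
    for dx, dy, dz in product((-1, 0, 1), repeat=3):
        if (dx, dy, dz) == (0, 0, 0):
            continue
        key = (x + dx, y + dy, z + dz)
        if key in table:  # average O(1) lookup, no dense grid needed
            found.append(table[key])
    return found

occupied = {(0, 0, 0), (1, 0, 0), (1, 1, 1), (5, 5, 5)}
table = build_hash(occupied)
print(neighbors(table, (0, 0, 0)))  # indices of (1, 0, 0) and (1, 1, 1)
print(neighbors(table, (5, 5, 5)))  # isolated voxel, no neighbors
```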

Memory Efficiency [Figure: GPU memory (GB) versus resolution (16^3, 32^3, 64^3, 128^3, 256^3) for Voxel CNN and O-CNN]

Implementation • SparseConvNet • https://github.com/facebookresearch/SparseConvNet • Uses ResNet architecture • State-of-the-art for 3D analysis • Takes time to train. Graham et al., “Submanifold Sparse Convolutional Networks”, arXiv

Point Networks

Point cloud (The most common 3D sensor data)

Directly Process Point Cloud Data. End-to-end learning for unstructured, unordered point data. PointNet: Object Classification. Qi, Charles R., et al. “PointNet: Deep learning on point sets for 3D classification and segmentation”, CVPR 2017. Zaheer, Manzil, et al. “Deep sets”, NeurIPS 2017

Permutation invariance. Point cloud: N orderless points, each represented by a D-dim coordinate. 2D array representation (N x D): an array with its N rows reordered represents the same set.

Construct a Symmetric Function. Observe: $f(x_1, x_2, \ldots, x_n) = \gamma \circ g(h(x_1), \ldots, h(x_n))$ is symmetric if $g$ is symmetric. Example points: (1,2,3), (1,1,1), (2,3,2), (2,3,4). Each point passes through $h$, the results are combined by a simple symmetric function $g$, and $\gamma$ maps the result to the output. PointNet (vanilla)
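
The construction can be sketched in a few lines, assuming h is a one-layer per-point network with ReLU, g is an element-wise max over points, and γ is a linear map. The weights are random and the layer sizes (16 hidden, 4 outputs) are arbitrary choices for illustration; this is not the trained PointNet architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
W_h = rng.normal(size=(3, 16))      # weights of h, applied to every point
b_h = rng.normal(size=16)
W_gamma = rng.normal(size=(16, 4))  # weights of gamma

def h(points):
    """Per-point feature: the same function for each row."""
    return np.maximum(points @ W_h + b_h, 0.0)

def g(features):
    """Symmetric aggregation: element-wise max over the point dimension."""
    return features.max(axis=0)

def gamma(v):
    return v @ W_gamma

def f(points):
    return gamma(g(h(points)))

# The four example points from the slide.
points = np.array([[1, 2, 3], [1, 1, 1], [2, 3, 2], [2, 3, 4]], dtype=float)
shuffled = points[rng.permutation(len(points))]
print(np.allclose(f(points), f(shuffled)))  # True: row order does not matter
```

Because max ignores the order of its inputs, any reordering of the rows gives the same global feature.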

Limitations of PointNet. Hierarchical feature learning (multiple levels of abstraction): 3D CNN (Wu et al.). Global feature learning (either one point or all points): PointNet (vanilla) (Qi et al.). • No local context for each point! • Global feature depends on absolute coordinate. Hard to generalize to unseen scene configurations!

Points in Metric Space • Learn “kernels” in 3D space and conduct convolution • Kernels have compact spatial support • For convolution, we need to find neighboring points • Possible strategies for range query • Ball query (results in more stable features) • k-NN query (faster)
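
The two range-query strategies can be sketched with brute-force distance computation. The function names `ball_query` and `knn_query` are illustrative; practical implementations usually rely on a spatial index or GPU kernels.

```python
import numpy as np

def ball_query(points, center, radius):
    """Indices of all points within `radius` of `center` (count varies per query)."""
    d = np.linalg.norm(points - center, axis=1)
    return np.flatnonzero(d <= radius)

def knn_query(points, center, k):
    """Indices of the k nearest points to `center` (spatial extent varies per query)."""
    d = np.linalg.norm(points - center, axis=1)
    return np.argsort(d)[:k]

points = np.array([[0, 0, 0], [1, 0, 0], [3, 0, 0], [10, 0, 0]], dtype=float)
center = np.array([0.0, 0.0, 0.0])
print(ball_query(points, center, radius=1.5))  # [0 1]
print(knn_query(points, center, k=3))          # [0 1 2]
```

The ball query fixes the spatial scale of the neighborhood, whereas k-NN fixes the neighbor count, so in sparse regions k-NN reaches much farther than in dense ones.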

PointNet v2.0: Multi-Scale PointNet. N points in (x,y) → N_1 points in (x,y,f) → N_2 points in (x,y,f’). Repeat: • Sample anchor points • Find neighborhood of anchor points • Apply PointNet in each neighborhood to mimic convolution
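
One level of the repeat loop above can be sketched as follows. The slide does not say how anchors are sampled; farthest point sampling is assumed here, and the per-neighborhood PointNet is reduced to a max-pool over relative coordinates and features (no learned layers), so this only illustrates the data flow. The names `farthest_point_sampling` and `set_abstraction` are hypothetical.

```python
import numpy as np

def farthest_point_sampling(points, m):
    """Greedily pick m indices so that each new point is far from those already chosen."""
    chosen = [0]
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(m - 1):
        nxt = int(dist.argmax())
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(chosen)

def set_abstraction(points, feats, m, radius):
    """Sample anchors, group neighbors by ball query, and pool each neighborhood."""
    anchors = farthest_point_sampling(points, m)
    pooled = []
    for a in anchors:
        mask = np.linalg.norm(points - points[a], axis=1) <= radius
        local_xyz = points[mask] - points[a]  # coordinates relative to the anchor
        local = np.concatenate([local_xyz, feats[mask]], axis=1)
        pooled.append(local.max(axis=0))      # stand-in for a small PointNet
    return points[anchors], np.stack(pooled)

rng = np.random.default_rng(0)
pts = rng.uniform(size=(100, 3))
feats = rng.normal(size=(100, 5))
new_pts, new_feats = set_abstraction(pts, feats, m=10, radius=0.3)
print(new_pts.shape, new_feats.shape)  # (10, 3) (10, 8)
```

Applying `set_abstraction` again to its own output gives the next, coarser level of the hierarchy.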

Point Convolution As Graph Convolution • Points -> Nodes • Neighborhood -> Edges • Graph CNN for point cloud processing. Wang et al., “Dynamic Graph CNN for Learning on Point Clouds”, Transactions on Graphics, 2019. Liu et al., “Relation-Shape Convolutional Neural Network for Point Cloud Analysis”, CVPR 2019

Multi-View Stereo (MVS) Reconstruct the dense 3D shape from a set of images and camera parameters 1. Goldlucke et al. “A Super-resolution Framework for High-Accuracy Multiview Reconstruction”

Requirements of MVS. Table columns: Applications, Range, Accuracy, Time Efficiency, Computation Efficiency. Applications (rows): Remote Sensing, Autonomous Driving, AR/VR, Robot Manipulation, Inverse Engineering

Reconstruction from Photo-Consistency. Measures: NCC (Normalized Cross Correlation), SSD (Sum of Squared Differences) • Requires texture • Sensitive to non-Lambertian areas. Image source: UW CSE455
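
Both photo-consistency measures can be sketched for a pair of image patches. The patch contents are synthetic, and the `eps` term in `ncc` is an assumption added to avoid division by zero on flat patches.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: 0 for identical patches, grows with mismatch."""
    return float(np.sum((a - b) ** 2))

def ncc(a, b, eps=1e-8):
    """Normalized cross correlation in [-1, 1]; invariant to gain and offset."""
    a = a - a.mean()
    b = b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

rng = np.random.default_rng(0)
patch = rng.uniform(size=(5, 5))
brighter = 2.0 * patch + 10.0   # same texture under a different exposure
flat = np.full((5, 5), 0.5)     # textureless patch

print(ssd(patch, brighter) > 0.0)              # True: SSD is sensitive to brightness
print(abs(ncc(patch, brighter) - 1.0) < 1e-6)  # True: NCC still matches
print(ncc(flat, flat))                         # 0.0: no texture, no signal
```

The flat patch illustrates the "requires texture" bullet: after mean subtraction nothing is left to correlate.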

Cost-Volume-based MVS Multi-view images and camera parameters

Cost-Volume-based MVS Build 3D cost volume in reference view frustum

Topdown View of Cost Volume

Cost-Volume-based MVS: Fetch image features for each voxel • A voxel on the ground-truth surface shows feature consistency across views
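
A toy sketch of the feature-consistency idea along a single reference ray. The slide does not specify the consistency measure; variance across views is assumed here, and the features are synthetic (random everywhere except at one depth hypothesis, where all views agree). The function name `variance_cost` is hypothetical.

```python
import numpy as np

def variance_cost(view_features):
    """view_features: (V, D, C) features from V views at D depth hypotheses.

    Returns a (D,) cost: low where the views agree on the feature.
    """
    return view_features.var(axis=0).mean(axis=1)

rng = np.random.default_rng(0)
num_views, num_depths, channels = 3, 4, 8
feats = rng.normal(size=(num_views, num_depths, channels))
# Assume depth index 2 lies on the surface: every view sees the same feature there.
feats[:, 2, :] = feats[0, 2, :].copy()

cost = variance_cost(feats)
print(int(cost.argmin()))  # 2
```

In a full pipeline this cost would be computed for every voxel of the frustum and then regularized, for example by the dense 3D CNNs named on the next slide.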

Cost-Volume-based MVS Dense 3D CNNs

Improve Output Resolution • Differentiable soft-argmin to achieve sub-pixel accuracy [Figure: depth hypotheses d=1, d=2, d=3]. Kendall et al., “End-to-End Learning of Geometry and Context for Deep Stereo Regression”, ICCV 2017
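
A minimal sketch of soft-argmin over three depth hypotheses: the costs are negated, passed through a softmax, and used as weights for the expected depth. The cost values are made up for illustration.

```python
import numpy as np

def soft_argmin(costs, depths):
    """Expected depth under softmax(-cost); differentiable, unlike a hard argmin."""
    logits = -np.asarray(costs, dtype=float)
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(np.sum(probs * np.asarray(depths, dtype=float)))

depths = [1.0, 2.0, 3.0]  # hypotheses d=1, d=2, d=3
costs = [4.0, 0.5, 0.6]   # d=2 and d=3 are nearly tied
estimate = soft_argmin(costs, depths)
print(estimate)  # falls between 2 and 3, where a hard argmin would return exactly 2
```

Because the output is a weighted average, it can land between discrete hypotheses, which is the source of the sub-pixel accuracy.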

Reconstruction is More Complete: More Details from Point MVSNet [Figure: comparison of Camp [2] and Ours]

From Single Image to Point Cloud • It is possible to generate a set (permutation invariant). Image → Deep Neural Network → Predicted set $\{(x_1, y_1, z_1), (x_2, y_2, z_2), \ldots, (x_n, y_n, z_n)\}$. A Point Set Distance compares the predicted set with the Groundtruth point cloud $\{(x'_1, y'_1, z'_1), (x'_2, y'_2, z'_2), \ldots, (x'_n, y'_n, z'_n)\}$. Fan et al., “A Point Set Generation Network for 3D Object Reconstruction from a Single Image”, CVPR 2017
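
The slide does not name the point set distance. One common choice is the Chamfer distance, sketched below in a mean-of-squared-distances variant; this is an assumed variant for illustration, and other normalizations exist.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    # Pairwise squared distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=2)
    # For each point, the squared distance to its nearest neighbor in the other set.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

rng = np.random.default_rng(0)
predicted = rng.uniform(size=(50, 3))
groundtruth = rng.uniform(size=(60, 3))

d = chamfer_distance(predicted, groundtruth)
d_shuffled = chamfer_distance(predicted[rng.permutation(50)], groundtruth)
print(np.isclose(d, d_shuffled))  # True: the loss ignores point order
```

The nearest-neighbor matching makes the distance independent of the order of the points in either set, so it can serve as a loss for set-valued outputs, and the two sets do not need the same number of points.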

From Image to Surface • Learn to warp a plane to surface. Groueix et al., “AtlasNet: A Papier-Mâché Approach to Learning 3D Surface Generation”, CVPR 2018. Yang, Yaoqing, et al. “FoldingNet: Point cloud auto-encoder via deep grid deformation”, CVPR 2018

Structured Prediction: Part-based. Recursive Network for Hierarchical Graph AE. Li, Jun et al., “GRASS: Generative Recursive Autoencoders for Shape Structures”, Siggraph 2017. Mo, Kaichun et al., “StructureNet, a hierarchical graph network for learning PartNet shape generation”, Siggraph Asia 2019

Structured Prediction: Part-based. Mo et al., “StructureNet, a hierarchical graph network for learning PartNet shape generation”, Siggraph Asia 2019

Many More to Explore… Movable Part Segmentation, Motion Parameter Estimation, Long-horizon Part Manipulation Planning
