3D Pose Regression using Convolutional Neural Networks

Siddharth Mahendran, Haider Ali, and René Vidal
Center for Imaging Science, Johns Hopkins University

Problem Statement
• 6D task: given a single 2D image, estimate the 6D pose of an object.
• 2D detection has progressed significantly over the past few years, so assume a 2D bounding box returned by an oracle or an object detector.
• 3D task: given a 2D image and a 2D bounding box around an object in the image, predict the 3D orientation of the object.

Problem Formulation
• Recovering the pose 𝑆 from a single image is ill-posed.
• Learn from training examples: pose annotations with aligned models.

Problem Formulation
• A CNN maps the image to the pose 𝑆.
• What data to use? Any data augmentation?
• What is the network architecture?
• What representation and loss function to use?

Paper Contributions
                      Prior work                               This work
Problem formulation   Pose classification                      Pose regression
Representation        Discretized angle bins                   Axis-angle / Quaternion
Loss function         Cross-entropy loss                       Geodesic loss
Data augmentation     2D jittering [1], rendered images [2]    3D pose jittering + rendered images
[1] S. Tulsiani and J. Malik, Viewpoints and Keypoints, CVPR 2015
[2] H. Su, C. Qi, Y. Li, and L. Guibas, Render for CNN: Viewpoint estimation in images using CNNs trained with rendered 3D model views, ICCV 2015

Network Architecture for 3D Pose Task
• Pipeline: Image → Feature Network → Pose Networks → Pose, with the object category label as an additional input to the pose networks
• Feature Network: VGG-M [1] up to FC6
• Pose Network (per object category): 3 fully connected layers with Batch Normalization and ReLU activations
[1] K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman, Return of the devil in the details: Delving deep into convolutional nets, BMVC 2014
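The routing described on this slide (shared features, one pose network per object category) can be sketched in plain NumPy. This is only an illustration under stated assumptions: the feature width of 4096, the hidden widths, the placement of Batch Normalization and ReLU after the first two layers only, and all names (`make_pose_head`, `predict_pose`, `pose_heads`) are hypothetical and not taken from the slides. Weights are random, so the outputs are meaningless; only the data flow is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

FEATURE_DIM = 4096    # assumed width of the FC6 feature vector
HIDDEN_DIMS = (512, 128)  # hypothetical hidden widths, not from the slides
POSE_DIM = 3          # 3 for axis-angle output; a quaternion head would use 4


def make_pose_head():
    """Create random weights for one 3-layer fully connected pose network."""
    dims = (FEATURE_DIM, *HIDDEN_DIMS, POSE_DIM)
    return [
        {"W": rng.normal(0.0, 0.01, (d_in, d_out)), "b": np.zeros(d_out)}
        for d_in, d_out in zip(dims[:-1], dims[1:])
    ]


def batch_norm(x, eps=1e-5):
    """Normalize each unit over the batch (no learned scale or shift here)."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)


def pose_head_forward(head, features):
    """FC -> BN -> ReLU twice, then a final linear layer (assumed layout)."""
    h = features
    for layer in head[:-1]:
        h = np.maximum(batch_norm(h @ layer["W"] + layer["b"]), 0.0)
    last = head[-1]
    return h @ last["W"] + last["b"]


# One pose network per object category; the category label picks the head.
pose_heads = {name: make_pose_head() for name in ("aero", "car", "chair")}


def predict_pose(features, category):
    """Route shared features to the pose network of the given category."""
    return pose_head_forward(pose_heads[category], features)


features = rng.normal(size=(8, FEATURE_DIM))  # stand-in for feature-network output
pose = predict_pose(features, "car")          # shape (8, 3)
```

Keeping the feature network shared and only the small heads category-specific means each category adds just three fully connected layers of parameters.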

Representations and Loss Functions for 3D Pose Task
• Exploit the underlying structure of rotation matrices: every rotation is a rotation by an angle about an axis.
• Representations: axis-angle and quaternion
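As a rough illustration of the two representations named on this slide, the sketch below converts an axis-angle vector (rotation angle times unit axis) into a rotation matrix with Rodrigues' formula, and into a unit quaternion in (w, x, y, z) order. The function names and the quaternion ordering are choices made for this example and may differ from the authors' code.

```python
import numpy as np


def skew(v):
    """Return the 3x3 skew-symmetric (cross-product) matrix of a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])


def axis_angle_to_matrix(y):
    """Rodrigues' formula: y = theta * axis, with theta the rotation angle."""
    y = np.asarray(y, dtype=float)
    theta = np.linalg.norm(y)
    if theta < 1e-12:
        return np.eye(3)
    K = skew(y / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def axis_angle_to_quaternion(y):
    """Unit quaternion (w, x, y, z) = (cos(theta/2), sin(theta/2) * axis)."""
    y = np.asarray(y, dtype=float)
    theta = np.linalg.norm(y)
    if theta < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(theta / 2.0)], np.sin(theta / 2.0) * y / theta))


def quaternion_to_matrix(q):
    """Rotation matrix of a quaternion (w, x, y, z); q is normalized first."""
    q = np.asarray(q, dtype=float)
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


# A quarter turn about the Z-axis, expressed in both representations.
y_quarter_turn = np.array([0.0, 0.0, np.pi / 2])
R_from_axis_angle = axis_angle_to_matrix(y_quarter_turn)
R_from_quaternion = quaternion_to_matrix(axis_angle_to_quaternion(y_quarter_turn))
```

Both paths produce the same matrix, which is why a network can regress either a 3-vector or a 4-vector and still be compared on rotation matrices.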

Data Augmentation for 3D Pose Task
• 2D jittering: unknown perturbations in 3D pose!
• 3D pose jittering: perturbation around the Z-axis and perturbation around the X-axis
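One common way to realize a rotation-only perturbation of the camera is the homography H = K R K⁻¹, where K is the camera intrinsics matrix and R the perturbing rotation. The sketch below follows that construction. The intrinsics (focal length, principal point at the image center), the function names, and the convention that the perturbed pose is `R_perturb @ R_object` are all assumptions for illustration; the slides do not specify them.

```python
import numpy as np


def rot_x(deg):
    """Rotation matrix for a rotation of `deg` degrees about the X-axis."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the Z-axis."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def jitter_homography(R_perturb, focal, width, height):
    """Homography H = K R K^-1 induced by rotating the camera by R_perturb.

    The intrinsics K are assumed: square pixels, principal point at the center.
    """
    K = np.array([[focal, 0.0, width / 2.0],
                  [0.0, focal, height / 2.0],
                  [0.0, 0.0, 1.0]])
    return K @ R_perturb @ np.linalg.inv(K)


def jitter_pose(R_perturb, R_object):
    """Pose label after jittering (assumed composition order)."""
    return R_perturb @ R_object


# Example: perturb by 2 units about X and 4 about Z (treated as degrees here).
R_perturb = rot_z(4.0) @ rot_x(2.0)
H = jitter_homography(R_perturb, focal=500.0, width=224, height=224)
R_new = jitter_pose(R_perturb, np.eye(3))
```

The matrix `H` can then be passed to an image-warping routine, and `R_new` becomes the label of the warped image, so the perturbation in 3D pose is known exactly.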

Experimental Setup
• Dataset: Pascal3D+ (release 1.1)
  – ImageNet and Pascal VOC2012 images for 12 object categories
• Training set: ImageNet-trainval images
• Validation set: Pascal-train images
• Testing set: Pascal-val images
• Data augmentation:
  – 3D pose jittering: 162 samples per image
    ▪ Perturbations around X-axis (x9): -2:0.5:2
    ▪ Perturbations around Z-axis (x9): -4:1:4
    ▪ Flips (x2)
  – Rendered images [1]
• Evaluation metric: median angle error between predicted and ground-truth rotation matrices
• Training:
  – Adam optimizer with learning rate schedule
  – Implemented in Keras with TensorFlow backend
[1] H. Su, C. Qi, Y. Li, and L. Guibas, Render for CNN: Viewpoint estimation in images using CNNs trained with rendered 3D model views, ICCV 2015
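The count of 162 jittered samples per image follows from the grid on this slide: 9 X-axis perturbations times 9 Z-axis perturbations times 2 flips. A short sketch that enumerates the grid (variable names are illustrative; the slide does not state the units of the perturbation values):

```python
import itertools

import numpy as np

# Ranges written on the slide in start:step:stop notation.
x_perturbations = np.linspace(-2.0, 2.0, 9)  # -2:0.5:2, 9 values
z_perturbations = np.linspace(-4.0, 4.0, 9)  # -4:1:4, 9 values
flips = (False, True)                        # original and flipped image

# Every combination of X perturbation, Z perturbation, and flip.
jitter_grid = list(itertools.product(x_perturbations, z_perturbations, flips))
samples_per_image = len(jitter_grid)  # 9 * 9 * 2 = 162
```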

Results
Median angle error (degrees) between predicted and ground-truth rotation matrices
Performance on ground-truth bounding boxes for un-occluded and un-truncated objects:
Method                       aero   bike   boat bottle    bus    car  chair dtable  mbike   sofa  train     tv   mean
V&K [1]                     13.80  17.70  21.30  12.90   5.80   9.10  14.80  15.20  14.70  13.70   8.70  15.40  13.59
Render-for-CNN [2]          15.40  14.80  25.60   9.30   3.60   6.00   9.70  10.80  16.70   9.50   6.10  12.60  11.67
Ours: axis-angle            13.97  21.07  35.52   8.99   4.08   7.56  21.18  17.74  17.87  12.70   8.22  15.68  15.38
Ours: quaternion            14.53  22.55  35.78   9.29   4.28   8.06  19.11  30.62  18.80  13.22   7.32  16.01  16.63
Performance on bounding boxes returned by Faster R-CNN [3]:
Method                       aero   bike   boat bottle    bus    car  chair dtable  mbike   sofa  train     tv   mean
Ours: axis-angle, detected  14.71  21.31  45.07   9.47   4.20   8.93  26.36  20.70  19.16  18.80   8.72  15.65  17.76
[1] S. Tulsiani and J. Malik, Viewpoints and Keypoints, CVPR 2015
[2] H. Su, C. Qi, Y. Li, and L. Guibas, Render for CNN: Viewpoint estimation in images using CNNs trained with rendered 3D model views, ICCV 2015
[3] S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, arXiv 2015
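The angle between two rotation matrices is commonly computed as the geodesic distance on the rotation group, arccos((trace(R1ᵀ R2) − 1) / 2). The sketch below uses that standard formula to compute a median angle error; it is assumed, not confirmed by the slides, that this matches the exact metric and loss used for the numbers above. Function names are illustrative.

```python
import numpy as np


def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the Z-axis."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def geodesic_angle(R1, R2):
    """Angle (radians) of the relative rotation R1^T R2."""
    cos_angle = (np.trace(R1.T @ R2) - 1.0) / 2.0
    # Clip to guard against round-off pushing the value outside [-1, 1].
    return np.arccos(np.clip(cos_angle, -1.0, 1.0))


def median_angle_error_deg(R_pred, R_true):
    """Median geodesic angle, in degrees, over paired rotation matrices."""
    errors = [np.degrees(geodesic_angle(p, t)) for p, t in zip(R_pred, R_true)]
    return float(np.median(errors))


# Toy example: predictions off by 10, 20, and 30 degrees about the Z-axis.
predictions = [rot_z(10.0), rot_z(20.0), rot_z(30.0)]
ground_truth = [np.eye(3)] * 3
median_error = median_angle_error_deg(predictions, ground_truth)
```

Because the median ignores the tails, a few badly wrong predictions (for example front/back confusions) do not dominate the per-category number.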

Conclusion
We designed a Convolutional Neural Network framework for the task of 3D pose regression with:
• A suitable representation of the space of 3D rotation matrices: axis-angle and quaternion
• An appropriate geodesic loss on the space of rotation matrices
• A relevant data augmentation strategy: 3D pose jittering based on applying homographies to the images

Acknowledgements
• Collaborators
  – Siddharth Mahendran
  – Haider Ali
• Funding
  – NSF 1527340
Vision Lab @ Johns Hopkins University: http://www.vision.jhu.edu
Center for Imaging Science @ Johns Hopkins University: http://www.cis.jhu.edu
Thank You!
