Generative Sparse Detection Networks for 3D Single-shot Object Detection
JunYoung Gwak, Christopher Choy, Silvio Savarese

Key Challenge of 3D Object Detection
Disjoint input and output space:
● Input 3D scan: surface of the object
● Output anchor space: center of the bounding box
Sparse convolution / PointNet: learn only on the surface of the object
⇒ Output space is unreachable!

Key Challenge of 3D Object Detection
Possible solutions? (previous works)
● Ignore this problem and make predictions at the surface of the object
  ○ Nontrivial to decide which part of the surface is responsible for the prediction
● Convert sparse tensor to dense tensor
  ○ Give up efficiency in sparsity
● For every point, predict relative center of the instance
  ○ Requires center aggregation (clustering), inefficient

Key Challenge of 3D Object Detection
Key observation: Object centers are close to the object surface
Can we generate object centers efficiently?
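The disjoint-space problem and the key observation can be illustrated with a small sketch. It assumes a toy object, a hollow axis-aligned cube voxelized at unit resolution. The helper names (`cube_surface`, `chebyshev_to_set`) are made up for this illustration and are not part of the authors' code.

```python
from itertools import product

def cube_surface(n):
    """Voxel coordinates on the surface of an n x n x n axis-aligned cube."""
    return {
        c for c in product(range(n), repeat=3)
        if any(v in (0, n - 1) for v in c)
    }

def chebyshev_to_set(point, coords):
    """Smallest Chebyshev (max-axis) distance from point to any coordinate."""
    return min(max(abs(p - q) for p, q in zip(point, c)) for c in coords)

n = 9
surface = cube_surface(n)                 # what a 3D scan would observe
center = (n // 2, n // 2, n // 2)         # where the anchor should live

# The center is not among the observed coordinates, so a network that only
# computes on the surface has no location at which to predict it.
center_is_observed = center in surface

# The center is only a few voxels from the surface, which is what makes
# generating it from nearby surface coordinates plausible.
gap = chebyshev_to_set(center, surface)

print(center_is_observed, gap)
```

For this toy cube the center is absent from the input yet only `n // 2` voxels from the nearest surface voxel.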

Method Overview

Hierarchical Sparse Tensor Encoder
● Generates hierarchical sparse tensor features with a sparse 3D ResNet
● Analogous to the ResNet encoders commonly used in 2D detectors

Generative Sparse Tensor Decoder

Transposed Convolution + Sparsity Pruning
● Sparse Transposed Convolution
  ○ Outer-product of the convolution kernel shape on the input coordinates
  ○ Generates surrounding coordinates of the input coordinates (expands support)
● Sparsity Pruning
  ○ For each generated point, predict whether to prune the coordinate
  ○ Prune coordinates that are not bounding box centers
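The two steps on this slide can be sketched on bare coordinate sets, without features or learned weights. This is a minimal illustration, not the authors' implementation: it assumes a stride-1 kernel at a single resolution, and the `keep_score` oracle is a hypothetical stand-in for the learned sparsity classifier.

```python
from itertools import product

def transposed_conv_coords(coords, kernel_size=3):
    """Coordinates generated by placing the kernel shape on every input
    coordinate (the outer product of inputs and kernel offsets)."""
    half = kernel_size // 2
    offsets = list(product(range(-half, half + 1), repeat=3))
    return {
        (x + dx, y + dy, z + dz)
        for (x, y, z) in coords
        for (dx, dy, dz) in offsets
    }

def prune(coords, keep_score, threshold=0.5):
    """Keep only coordinates whose predicted score reaches the threshold."""
    return {c for c in coords if keep_score(c) >= threshold}

# Toy input: surface voxels of a hollow 5 x 5 x 5 cube.
n = 5
surface = {
    c for c in product(range(n), repeat=3)
    if any(v in (0, n - 1) for v in c)
}
center = (2, 2, 2)

step1 = transposed_conv_coords(surface)   # support grows by one voxel
step2 = transposed_conv_coords(step1)     # grows again and now covers the center

# Hypothetical oracle standing in for the learned pruning prediction.
keep_score = lambda c: 1.0 if c == center else 0.0
anchors = prune(step2, keep_score)

print(len(surface), len(step1), len(step2), anchors)
```

Each expansion enlarges the coordinate set, which is why the expansion is paired with pruning: only coordinates that can still be bounding box centers are carried forward.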

Bounding box prediction
● For every point that is not pruned, predict
  ○ Anchor classification
  ○ Bounding box regression
  ○ Semantic classification
● Hierarchical multi-scale prediction on pyramid network

Advantages of Our Method
Full 3D search space
● Search for object center up to ±1.6m of any observable surface
Fully sparse: minimal runtime and memory footprint
● Sparse Convolution Encoder
● Conv Transpose and Pruning to only generate anchor centers
Fully-convolutional
● Simple architecture
● No clustering, no crop and merge, just convolutions

Losses
● Sparsity Prediction: Balanced Cross Entropy
● Anchor Prediction: Balanced Cross Entropy
● Semantic Prediction: Cross Entropy
● Bounding Box Regression: Huber Loss (on the bounding box parameters)
Balanced Cross Entropy
Overcome heavy label bias by equally penalizing positive and negative samples
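A minimal sketch of the two less common losses named above. The slides do not give the exact formulas, so this assumes one common reading of "balanced": average the cross entropy over positives and over negatives separately, then weight the two averages equally. The Huber loss uses its standard definition with an assumed threshold `delta=1.0`.

```python
import math

def balanced_cross_entropy(probs, labels, eps=1e-7):
    """Binary cross entropy where the positive and negative groups each
    contribute half of the loss, regardless of how many samples they hold."""
    pos = [-math.log(max(p, eps)) for p, y in zip(probs, labels) if y == 1]
    neg = [-math.log(max(1.0 - p, eps)) for p, y in zip(probs, labels) if y == 0]
    pos_term = sum(pos) / len(pos) if pos else 0.0
    neg_term = sum(neg) / len(neg) if neg else 0.0
    return 0.5 * (pos_term + neg_term)

def plain_cross_entropy(probs, labels, eps=1e-7):
    """Ordinary mean binary cross entropy, for comparison."""
    terms = [
        -math.log(max(p if y == 1 else 1.0 - p, eps))
        for p, y in zip(probs, labels)
    ]
    return sum(terms) / len(terms)

def huber(residual, delta=1.0):
    """Quadratic near zero, linear for large residuals."""
    a = abs(residual)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)

# Heavy label bias: 1 positive among 100 samples, and a predictor that
# assigns a low probability everywhere (it effectively says "always negative").
labels = [1] + [0] * 99
probs = [0.01] * 100

plain = plain_cross_entropy(probs, labels)
balanced = balanced_cross_entropy(probs, labels)
print(plain, balanced)
```

With these toy numbers the plain loss stays small because the negatives dominate the mean, while the balanced loss remains large until the single positive is predicted correctly.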

Comparison with previous SOTA - ScanNet
● Outperforms previous state-of-the-art by 4.2 mAP@0.25
  ○ While being a single-shot detector
● While being 3.7x faster
  ○ Runtime linear in the number of points
  ○ Runtime sublinear in floor area
  ○ ⇒ Free from the curse of dimensionality!
● Minimal memory footprint
  ○ 6x more efficient than the dense counterpart
● Maintains constant input density
  ○ Consistent information for scalability

Comparison with previous SOTA - S3DIS
● Achieves state-of-the-art result
● Our method doesn't require crop-and-stitch post-processing, unlike Yang et al.

Ablation study
Train without sparsity pruning
➔ Fails to train due to out of memory error
Train without Generative Sparse Tensor Decoder
➔

Scalability and generalization - S3DIS
Train on small rooms, test on the entire building 5 of S3DIS
● 78M points, 13984 m³ volume, and 53 rooms
● Single fully-convolutional network feed-forward
● Takes 20 seconds including data pre-processing and post-processing
● Uses 5 GB of GPU memory to detect 573 instances of 3D objects

Scalability and generalization - S3DIS
How does our method achieve high scalability and generalization capacity?
Consistent information regardless of the size of input:
● Fully-convolutional: translation invariant
● Consistent density of input: voxels, no fixed-sized random subsampling
Minimal runtime and memory footprint
● Fully sparse
  ○ Sparse encoder: sparse convolution
  ○ Sparse decoder: pruning to prevent cubic growth of generated coordinates
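The "cubic growth" that pruning is meant to prevent can be made concrete by counting voxels. This sketch assumes a toy object (a hollow cube of side `n` voxels) and compares a dense grid with a surface-only representation. The function names are illustrative only.

```python
def dense_voxels(n):
    """Cells in a dense n x n x n grid: grows cubically with resolution."""
    return n ** 3

def surface_voxels(n):
    """Cells on the shell of an n x n x n cube: grows quadratically."""
    if n < 2:
        return n ** 3
    return n ** 3 - (n - 2) ** 3

for n in (10, 20, 40, 80):
    d, s = dense_voxels(n), surface_voxels(n)
    print(f"n={n:3d}  dense={d:7d}  surface={s:6d}  ratio={d / s:5.1f}")
```

Doubling the resolution multiplies the dense count by 8 but the surface count by roughly 4, so the gap between a dense and a sparse representation widens as scenes get larger or finer.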
