Orientation-boosted Voxel Nets for 3D Object Recognition
Nima Sedaghat, Mohammadreza Zolfaghari, Ehsan Amiri, Thomas Brox (BMVC 2017)
Håkon Hukkelås, 26 September 2018
The Idea
Related Work
Handcrafted feature descriptors
- Point Feature Histogram, 3D Shape Context, Spin Images
3D Convolutional Neural Networks
- 3D ShapeNets (Wu et al.), VoxNet (Maturana & Scherer)
- Do not take object rotation into account
Multi-View CNN (Su et al.)
- Requires dense surfaces to render the object
Method - ORION
$L = (1 - \gamma)\, L_{\text{classification}} + \gamma\, L_{\text{orientation}}, \quad \gamma = 0.5$
Multi-task Learning
Orientation as a classification problem:
- Relaxes the constraints on the dataset
- Treats different orientations of an object differently
Orientation classes are specific to each object class:
- Do not want to learn features shared among classes to determine orientation
$L = (1 - \gamma)\, L_{\text{classification}} + \gamma\, L_{\text{orientation}}, \quad \gamma = 0.5$
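The weighted loss and the class-specific orientation labels can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes both terms are softmax cross-entropy losses (the slides only say orientation is treated as classification), and the helper names `softmax_cross_entropy`, `orion_loss`, `orientation_label` and the parameter `bins_per_class` are hypothetical.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """Cross-entropy of a softmax over `logits` against an integer label.
    Assumption: the slides do not state the exact loss used per task."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - np.max(logits)  # subtract max for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[label]

def orion_loss(class_logits, class_label, orient_logits, orient_label, gamma=0.5):
    """L = (1 - gamma) * L_classification + gamma * L_orientation."""
    l_cls = softmax_cross_entropy(class_logits, class_label)
    l_ori = softmax_cross_entropy(orient_logits, orient_label)
    return (1.0 - gamma) * l_cls + gamma * l_ori

def orientation_label(class_id, bin_id, bins_per_class):
    """Map (object class, orientation bin) to one index in a joint
    orientation output, so no orientation class is shared between
    object classes. `bins_per_class` is a hypothetical list giving the
    number of orientation bins of each object class."""
    if not 0 <= bin_id < bins_per_class[class_id]:
        raise ValueError("orientation bin out of range for this class")
    return int(sum(bins_per_class[:class_id])) + bin_id
```

With γ = 0.5 both tasks contribute equally; setting γ = 0 recovers a plain classification loss.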
Method - Voting
During the test phase: feed multiple rotations of the object to the network to obtain the final prediction
- x: input
- r: rotation index
- S_k: output of the network for the k-th node
- c_final: final class prediction
$c_{\text{final}} = \arg\max_k \sum_r S_k(x_r)$
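The voting rule can be sketched in a few lines. The sketch assumes the network outputs for all rotations are already collected in an array of shape (rotations, classes); the function name `vote` and that layout are illustrative choices, not taken from the paper.

```python
import numpy as np

def vote(scores):
    """Return argmax_k of sum_r S_k(x_r).

    scores[r, k] holds S_k(x_r): the network output for class node k
    when fed the r-th rotation of the input (assumed layout)."""
    scores = np.asarray(scores, dtype=float)
    summed = scores.sum(axis=0)  # sum over the rotation index r
    return int(np.argmax(summed))

# Three rotations, two classes: per-class sums are 1.7 and 1.3
scores = [[0.9, 0.1],
          [0.4, 0.6],
          [0.4, 0.6]]
print(vote(scores))
```

Summing scores differs from a majority vote over per-rotation predictions: in the example above, two of three rotations favour class 1, but the summed scores favour class 0.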
Datasets
Sydney Urban Objects
- LIDAR / point cloud
- 631 objects
- 26 classes
- (State-of-the-art)
NYUv2
- Kinect / RGBD
- 2808 objects
- 10 classes
- (State-of-the-art)
ModelNet 10/40
- Synthetic / CAD
- 4899/12311 objects
- 10 or 40 classes
- (State-of-the-art)
KITTI
- LIDAR / point cloud + RGB
- 7481 (train) + 7518 (test) objects
- 8 classes
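Experiments and Results - Classification
Group                 | Method            | # Conv | # Param | Sydney (F1) | NYUv2 | ModelNet10
----------------------|-------------------|--------|---------|-------------|-------|-----------
Hand-crafted Features | Recursive D       | -      | -       | -           | 37.6  | -
Hand-crafted Features | Recursive D+C     | -      | -       | -           | 44.8  | -
Hand-crafted Features | Triangle + SVM    | -      | -       | 67.1        | -     | -
Hand-crafted Features | GFH + SVM         | -      | -       | 71.0        | -     | -
Deep Network          | FusionNet         |        | 118M    | -           | -     | 93.1
Deep Network          | VRN               | 43     | 18M     | -           | -     | 93.6
Shallow Network       | ShapeNet          | 3      | -       | -           | 57.9  | 83.5
Shallow Network       | DeepPano          | 4      | -       | -           | -     | 85.5
Shallow Network       | VoxNet (baseline) | 2      | 890K    | 72          | 71    | 92
Shallow Network       | ORION (paper)     | 2      | 910K    | 77.8        | 75.4  | 93.8
Shallow Network       | ORION (paper)     | 4      | 4M      | 77.5        | 75.5  | 93.9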
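Experiments and Results - Alignment
ModelNet40 accuracy (%)
Method            | Batch Norm | Conv. Layers | No Alignment | Rough, Automatic Alignment | Perfect, Manual Alignment
------------------|------------|--------------|--------------|----------------------------|--------------------------
VoxNet (baseline) | ╳          | 2            | 83           | -                          | -
                  | ╳          | 2            | -            | 88.1                       | 87.5
                  | ✓          | 2            | -            | 88.6                       | 88.2
ORION (paper)     | ✓          | 4            | -            | 89.4                       | 89.7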
Experiments and Results - Detection
Sliding-window detection of cars:
- Uses the network's orientation output
- Uses only the 3D point cloud for prediction
- 18 rotation steps to cover 360 degrees
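As a small worked example, 18 steps over 360 degrees give 20 degrees per step. The sketch below generates those angles and rotates a point cloud by one of them. It assumes evenly spaced steps and rotation about the vertical (z) axis; neither detail is stated on the slide, and the function names are hypothetical.

```python
import numpy as np

def rotation_angles(n_steps=18):
    """Evenly spaced angles in degrees covering 360 degrees.
    Assumption: the 18 steps are uniform, i.e. 20 degrees apart."""
    return np.arange(n_steps) * (360.0 / n_steps)

def rotate_about_vertical(points, angle_deg):
    """Rotate an (N, 3) point cloud about the z axis by angle_deg.
    Assumption: z is the vertical axis of the point cloud."""
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    # Points are row vectors, so multiply by the transposed matrix
    return np.asarray(points, dtype=float) @ rot.T

angles = rotation_angles()  # 0, 20, 40, ..., 340
rotated = rotate_about_vertical([[1.0, 0.0, 0.0]], angles[1])
```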
Analysis
Activations from the first convolutional layer
- ORION is sensitive to orientation
Analysis - Dominant Signal Flow
Summary
- Use orientation prediction as an auxiliary task for object classification.
- Force the network to learn the underlying concept of object orientation.
- Achieves state-of-the-art results on all three datasets with a shallow network.
Contributions:
- Clearly shows that orientation prediction as an auxiliary task helps neural networks in the classification task
- Presents a perfectly aligned version of ModelNet40
- Visualises and contributes to the understanding of how neural networks classify objects and how object orientation affects them
- Presents an efficient approach to 3D sliding-window search for object detection
References
- Nima Sedaghat, Mohammadreza Zolfaghari, Ehsan Amiri, Thomas Brox. Orientation-boosted Voxel Nets for 3D Object Recognition. BMVC 2017.
- BMVC 2017 Spotlight Session-2 (https://www.youtube.com/watch?v=kl27gOI0BxU)
- ORION GitHub (https://github.com/lmb-freiburg/orion)