POINT CLOUD DEEP LEARNING
Innfarn Yoo, 3/29/2018
AGENDA
• Introduction
• Previous Work
• Method
• Result
• Conclusion
INTRODUCTION
2D OBJECT CLASSIFICATION
Deep Learning for 2D Object Classification
• Convolutional Neural Networks (CNNs) for 2D images work really well
• AlexNet, ResNet, & GoogLeNet
• R-CNN, Fast R-CNN, Faster R-CNN, Mask R-CNN
• Recent 2D image classification can even extract precise boundaries of objects (FCN, Mask R-CNN)
[1] He et al., Mask R-CNN (2017)
3D OBJECT CLASSIFICATION
Deep Learning for 3D Object Classification
• 3D object classification approaches are getting more attention
• Collecting 3D point data is easier and cheaper than before (LiDAR & other sensors)
• Size of data is bigger than 2D images
• Open datasets are increasing
• Recent research approaches human-level detection accuracy
• MVCNN, ShapeNet, PointNet, VoxNet, VoxelNet, & VRN Ensemble
[2] Zhou and Tuzel, VoxelNet (2017)
GOALS
The goals of our method
• Evaluating & comparing different types of neural network models for 3D object classification
• Providing a generic framework to test multiple 3D neural network models
• Simple & easy-to-implement neural network models
• Fast preprocessing (remove the bottleneck of loading, sampling, & jittering 3D data)
PREVIOUS WORK
3D POINT-BASED APPROACHES
3D Points → Neural Nets
• PointNet
  • First 3D point-based classification
  • Unordered dataset
  • Transform → Multi-Layer Perceptron (MLP) → Max Pool (MP) → Classification
[5] Qi et al., PointNet (2017)
PIXEL-BASED APPROACHES
3D → 2D Projections → Neural Nets
• Multi-Layer Perceptron (MLP)
• Convolutional Neural Network (CNN)
• Multi-View Convolutional Neural Network (MVCNN)
[3] Su et al., MVCNN (2015)
[4] Krizhevsky et al., AlexNet (2012)
VOXEL-BASED APPROACHES
3D Points → Voxels → Neural Nets
• VoxNet
• VRN Ensemble
• VoxelNet
[2] Zhou and Tuzel, VoxelNet (2017)
[6] Maturana and Scherer, VoxNet (2015)
[7] Brock et al., VRN Ensemble (2016)
METHOD
PREPROCESSING
Requirement
• Loading 3D polygonal objects
• Required operations on 3D objects
  • Sampling, Shuffling, Jittering, Scaling, & Rotating
  • Projection & Voxelization
• The Python interface is not well suited to multi-core processing (or multi-threading)
• The number of objects is notoriously large for single-core processing
FRAMEWORK
Basic Pipeline: nn3d_trainer (C++ program)
• Loading 3D objects: load 3D objects
• Trainer, epoch #i:
  • Sampler threads (Thread 1, Thread 2, Thread 3, … Thread N): each samples points and produces a 3D point sample
  • Converters (one per sample): convert each 3D point sample to pixel, point, or voxel form
  • Call Python NN model functions: Train, Test, Eval, Report, & Save
  • Increase epoch
3D DATASETS
• MODELNET10: Princeton ModelNet Data, http://modelnet.cs.princeton.edu/, 10 categories, 4,930 objects (2 GB), OFF (CAD) file format
• MODELNET40: Princeton ModelNet Data, http://modelnet.cs.princeton.edu/, 40 categories, 12,431 objects (10 GB), OFF (CAD) file format
• SHAPENET CORE V2: ShapeNet, https://www.shapenet.org/, 55 categories, 51,191 objects (90 GB), OBJ file format
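Since both ModelNet sets ship as OFF files, loading them comes down to reading a vertex list and a face list. The following is a minimal sketch in Python, not the deck's C++ loader: `parse_off` is a hypothetical helper name, it ignores the optional color fields, and it assumes polygons are convex so that fan triangulation is valid. It also tolerates a header where the counts sit on the same line as `OFF`.

```python
import io

def parse_off(stream):
    """Parse an OFF mesh from a text stream into (vertices, triangles)."""
    # Keep only non-empty lines that are not comments.
    lines = [ln.strip() for ln in stream
             if ln.strip() and not ln.lstrip().startswith("#")]
    header = lines[0]
    if not header.startswith("OFF"):
        raise ValueError("not an OFF file")
    rest = header[3:].strip()
    if rest:
        # Counts fused onto the header line, e.g. "OFF8 6 0".
        counts, body = rest.split(), lines[1:]
    else:
        counts, body = lines[1].split(), lines[2:]
    n_verts, n_faces = int(counts[0]), int(counts[1])
    vertices = [tuple(float(x) for x in body[i].split()[:3])
                for i in range(n_verts)]
    triangles = []
    for ln in body[n_verts:n_verts + n_faces]:
        parts = ln.split()
        k = int(parts[0])                      # vertex count of this face
        idx = [int(p) for p in parts[1:1 + k]]
        # Fan-triangulate polygons with more than three vertices.
        for j in range(1, k - 1):
            triangles.append((idx[0], idx[j], idx[j + 1]))
    return vertices, triangles

# Usage: a unit square stored as one quad.
example = io.StringIO("OFF\n4 1 0\n0 0 0\n1 0 0\n1 1 0\n0 1 0\n4 0 1 2 3\n")
verts, tris = parse_off(example)
print(len(verts), len(tris))
```

Returning triangles only keeps the later sampling step simple, at the cost of discarding the original polygon structure.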
NEURAL NETWORK MODELS
• Point-Based Models
• Pixel-Based Models
• Voxel-Based Models
POINT-BASED NEURAL NETWORK MODELS
Types of Models
Preprocessing:
• Rotate randomly
• Scale randomly
• Uniform sampling on 3D object surfaces
  • Sample 2048 points
• Shuffle points
Tested Models:
• Multi-Layer Perceptron (MLP)
• Multi Rotational MLPs
• Single Orientation CNN
• Multi Rotational CNNs
• Multi Rotational Resample & Max Pool Layers
• ResNet-like
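Uniform sampling on a triangle mesh is commonly done by picking triangles with probability proportional to area and then drawing a uniform point inside each picked triangle. The sketch below shows that standard approach in NumPy; it is an illustration, not the deck's C++ sampler, and `sample_surface` and its parameters are hypothetical names.

```python
import numpy as np

def sample_surface(vertices, faces, n_points=2048, rng=None):
    """Draw n_points uniformly from the surface of a triangle mesh."""
    rng = np.random.default_rng() if rng is None else rng
    v = np.asarray(vertices, dtype=np.float64)
    f = np.asarray(faces, dtype=np.int64)
    a, b, c = v[f[:, 0]], v[f[:, 1]], v[f[:, 2]]
    # Triangle areas from the cross product of two edges.
    areas = 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)
    # Pick triangles with probability proportional to their area.
    tri = rng.choice(len(f), size=n_points, p=areas / areas.sum())
    # Uniform barycentric weights via the square-root trick.
    r1 = np.sqrt(rng.random(n_points))
    r2 = rng.random(n_points)
    w = np.stack([1.0 - r1, r1 * (1.0 - r2), r1 * r2], axis=1)
    pts = w[:, 0:1] * a[tri] + w[:, 1:2] * b[tri] + w[:, 2:3] * c[tri]
    # Shuffle rows in place; the samples are already independent,
    # so this only mirrors the "Shuffle points" step on the slide.
    rng.shuffle(pts)
    return pts

# Usage: 2048 points on a unit square made of two triangles.
square_v = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
square_f = [(0, 1, 2), (0, 2, 3)]
cloud = sample_surface(square_v, square_f, rng=np.random.default_rng(0))
print(cloud.shape)
```

Area weighting matters because CAD meshes mix very large and very small triangles; sampling triangles uniformly would cluster points on finely tessellated parts.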
POINT-BASED NEURAL NETWORK MODELS
MLP
[Architecture diagram. Input: 3D points. Legend: Flatten Vector, Fully Connected Layer, ReLU + Dropout. Loss: Softmax Cross Entropy against the Class Onehot Vector.]
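The MLP slide names its building blocks: flatten the points, apply fully connected layers with ReLU and dropout, and train with softmax cross entropy against a one-hot class vector. The NumPy sketch below shows only the forward pass and the loss under stated assumptions: the layer sizes, the weight initialization, and the helper names are hypothetical, and the deck's actual models were built in TensorFlow.

```python
import numpy as np

def init_params(sizes, rng):
    """Small random weights for consecutive layer sizes, e.g. [6144, 512, 10]."""
    return [(rng.normal(0.0, 0.01, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(points, params, keep_prob=1.0, rng=None):
    """Flatten (B, N, 3) points and apply FC + ReLU (+ dropout) layers."""
    rng = np.random.default_rng() if rng is None else rng
    x = points.reshape(points.shape[0], -1)        # (B, N, 3) -> (B, 3N)
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)             # fully connected + ReLU
        if keep_prob < 1.0:
            mask = rng.random(x.shape) < keep_prob
            x = x * mask / keep_prob               # inverted dropout
    W, b = params[-1]
    return x @ W + b                               # class logits

def softmax_cross_entropy(logits, onehot):
    """Mean cross entropy between softmax(logits) and one-hot labels."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -(onehot * log_probs).sum(axis=1).mean()

# Usage: a batch of 4 clouds with 2048 points each, 10 classes.
rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 2048, 3))
labels = np.eye(10)[[0, 3, 3, 7]]
params = init_params([2048 * 3, 64, 10], rng)
loss = softmax_cross_entropy(mlp_forward(batch, params, 0.7, rng), labels)
print(float(loss))
```

Because the input is flattened, this model depends on point order, which is one reason shuffling appears in the preprocessing list.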
POINT-BASED NEURAL NETWORK MODELS
Multi Rotational MLPs
[Architecture diagram. Input: 3D points. Legend: Random 3x3 Rotation, 3D Conv Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer, ReLU + Dropout. Loss: Softmax Cross Entropy against the Class Onehot Vector.]
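The multi-rotational models feed the network several randomly rotated copies of the same point set. One common way to draw a random 3x3 rotation is to orthonormalize a Gaussian matrix with a QR decomposition; the sketch below uses that technique as an assumption, since the slides do not say how the rotations are generated or how the branches are combined. `random_rotation` and `rotated_copies` are hypothetical names.

```python
import numpy as np

def random_rotation(rng):
    """Random 3x3 rotation matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs so the result is unique
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]           # flip one axis to get determinant +1
    return q

def rotated_copies(points, n_rotations, rng):
    """Stack n_rotations randomly rotated copies: (N, 3) -> (K, N, 3)."""
    return np.stack([points @ random_rotation(rng).T
                     for _ in range(n_rotations)])

# Usage: 20 rotated copies of one 2048-point cloud.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(2048, 3))
copies = rotated_copies(cloud, 20, rng)
print(copies.shape)
```

Each copy preserves all pairwise distances, so the network sees the same shape in different orientations.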
POINT-BASED NEURAL NETWORK MODELS
Single Orientation CNN
[Architecture diagram. Input: 3D points. Legend: Random 3x3 Rotation, 3D Conv Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer, ReLU + Dropout. Loss: Softmax Cross Entropy against the Class Onehot Vector.]
POINT-BASED NEURAL NETWORK MODELS
Multi Rotational CNNs
[Architecture diagram. Input: 3D points. Legend: Random 3x3 Rotation, 3D Conv Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer, ReLU + Dropout. Loss: Softmax Cross Entropy against the Class Onehot Vector.]
POINT-BASED NEURAL NETWORK MODELS
Multi Rotational Resample & Max Pool Layers
[Architecture diagram. Input: 3D points. Legend: Random 3x3 Rotation, Resample Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer, ReLU + Dropout. Loss: Softmax Cross Entropy against the Class Onehot Vector.]
POINT-BASED NEURAL NETWORK MODELS
ResNet-like
[Architecture diagram. Input: 3D points. Legend: Random 3x3 Rotation, Resample Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer, ReLU + Dropout. Loss: Softmax Cross Entropy against the Class Onehot Vector.]
PIXEL-BASED NEURAL NETWORK MODELS
Types of Models
Preprocessing:
• Sample 8192 points
• Same as point-based models
• Depth-only orthogonal projection
  • 32x32 or 64x64
• Generating multiple rotations
  • 64x64x5 & 64x64x10
Tested Models:
• MLP
• Depth-Only Orthogonal MVCNN
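A depth-only orthogonal projection can be sketched as dropping each point straight onto an image grid and keeping one depth value per pixel. The NumPy version below is a minimal illustration under assumptions the slides do not state: the view direction is the z axis, the cloud is normalized to its bounding cube, each pixel keeps the largest normalized z, and empty pixels stay at 0. The function names are hypothetical.

```python
import numpy as np

def depth_image(points, res=32):
    """Orthographic depth map of a point cloud viewed along the z axis."""
    p = np.asarray(points, dtype=np.float64)
    lo = p.min(axis=0)
    extent = max((p.max(axis=0) - lo).max(), 1e-12)
    q = (p - lo) / extent                          # fit into the unit cube
    # Pixel column from x, pixel row from y, clamped to the last pixel.
    ij = np.minimum((q[:, :2] * res).astype(np.int64), res - 1)
    img = np.zeros((res, res))
    # Keep the largest normalized z that lands in each pixel.
    np.maximum.at(img, (ij[:, 1], ij[:, 0]), q[:, 2])
    return img

def multi_view(points, rotations, res=32):
    """Stack one depth image per 3x3 rotation: result is (res, res, n_views)."""
    p = np.asarray(points, dtype=np.float64)
    return np.stack([depth_image(p @ np.asarray(R).T, res) for R in rotations],
                    axis=-1)

# Usage: five views of a random cloud; identity matrices stand in for real view rotations.
rng = np.random.default_rng(0)
cloud = rng.random((8192, 3))
views = multi_view(cloud, [np.eye(3)] * 5, res=32)
print(views.shape)
```

With 8192 points and a 32x32 grid most pixels covered by the object receive several hits, which is why the per-pixel reduction (here a maximum) matters.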
PIXEL-BASED NEURAL NETWORK MODELS
MLP
[Architecture diagram. Input: Images (32x32x5). Legend: Flatten Vector, Fully Connected Layer. Loss: Softmax Cross Entropy against the Class Onehot Vector.]
PIXEL-BASED NEURAL NETWORK MODELS
Depth-Only Orthogonal MVCNN
[Architecture diagram. Input: Images (32x32x5). Legend: Image Separation, 3D Conv Layer, Max Pooling Layer, Concat, Flatten Vector, Fully Connected Layer. Loss: Softmax Cross Entropy against the Class Onehot Vector.]
VOXEL-BASED NEURAL NETWORK MODELS
Types of Models
Preprocessing:
• Sample 8192 points
• Same as point-based models
• Voxelization
  • 3D points → Voxels
  • Each voxel has intensity 0.0 ~ 1.0
  • how many points hit same voxel
  • 32x32x32 & 64x64x64
Tested Models:
• MLP
• CNN
• ResNet-like
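The voxelization step can be sketched as counting how many points fall into each cell of a regular grid and rescaling the counts to 0.0 ~ 1.0. In the NumPy sketch below, dividing by the largest count is an assumption, since the slide only says the intensity reflects how many points hit the same voxel; `voxelize` is a hypothetical name.

```python
import numpy as np

def voxelize(points, dim=32):
    """Convert an (N, 3) point cloud to a dim^3 occupancy-intensity grid."""
    p = np.asarray(points, dtype=np.float64)
    lo = p.min(axis=0)
    # Scale by the longest bounding-box side to keep the aspect ratio.
    extent = max((p.max(axis=0) - lo).max(), 1e-12)
    idx = np.minimum(((p - lo) / extent * dim).astype(np.int64), dim - 1)
    grid = np.zeros((dim, dim, dim))
    # Unbuffered add so repeated indices accumulate hit counts.
    np.add.at(grid, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    # Assumed normalization: the fullest voxel gets intensity 1.0.
    return grid / grid.max()

# Usage: 8192 random points into a 32x32x32 grid.
rng = np.random.default_rng(0)
grid = voxelize(rng.random((8192, 3)), dim=32)
print(grid.shape, float(grid.max()))
```

`np.add.at` is used instead of `grid[idx] += 1` because plain fancy-index assignment would count each voxel at most once per call.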
VOXEL-BASED NEURAL NETWORK MODELS
MLP
[Architecture diagram. Input as labeled on the slide: Images (32x32x5). Legend: Flatten Vector, Fully Connected Layer. Loss: Softmax Cross Entropy against the Class Onehot Vector.]
VOXEL-BASED NEURAL NETWORK MODELS
CNN
[Architecture diagram. Input: Voxels (32x32x32). Legend: 3D Conv Layer, Max Pooling Layer, Flatten Vector, Fully Connected Layer. Loss: Softmax Cross Entropy against the Class Onehot Vector.]
VOXEL-BASED NEURAL NETWORK MODELS
ResNet-like
[Architecture diagram. Input: Voxels (32x32x32). Legend: 3D Conv Layer, Avg Pooling Layer, Resample Layer, Max Pooling Layer, Concat, Flatten Vector, Fully Connected Layer. Loss: Softmax Cross Entropy against the Class Onehot Vector.]
IMPLEMENTATION
System Setup
• System: Ubuntu 16.04, RAM 32 GB & 64 GB, & SSD 512 GB
• NVIDIA Quadro P6000, Quadro M6000, & GeForce Titan X
• GCC 5.2.0 for C++11
• Python 3.5
• TensorFlow-GPU v1.5.0
• NumPy 1.0
HYPER PARAMETERS
• Object Perturbation
  • Random Rotations: -25 ~ 25 degrees
  • Random Scaling: 0.7 ~ 1.0
• Learning Rate: 0.0001
• Keep Probability (Dropout layer): 0.7
• Max Epochs: 1000
• Batch Size: 32
• Number of Random Rotations: 20
• Voxel Dim: 32x32x32
• MVCNN Number of Views: 5
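The object perturbation settings above translate into a small augmentation step: draw an angle in [-25, 25] degrees and a scale in [0.7, 1.0], then apply both to the point set. The sketch below assumes the rotation is about the z axis and that scaling is uniform across axes; the slides do not specify either, and `perturb` is a hypothetical name.

```python
import numpy as np

def perturb(points, rng, max_deg=25.0, scale_range=(0.7, 1.0)):
    """Randomly rotate (about z, by assumption) and uniformly scale a cloud."""
    theta = np.deg2rad(rng.uniform(-max_deg, max_deg))
    c, s = np.cos(theta), np.sin(theta)
    rot_z = np.array([[c, -s, 0.0],
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]])
    scale = rng.uniform(*scale_range)
    # Row-vector convention: each point p becomes scale * (rot_z @ p).
    return scale * (np.asarray(points, dtype=np.float64) @ rot_z.T)

# Usage: perturb one cloud per training sample.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(2048, 3))
augmented = perturb(cloud, rng)
print(augmented.shape)
```

Drawing a fresh angle and scale for every sample in every epoch means the network rarely sees exactly the same input twice.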
RESULT
MODELNET10 # OF TEST & TRAIN OBJECTS
[Bar chart: # of Test Models and # of Train Models per category (table, toilet, monitor, bathtub, sofa, chair, desk, dresser, night_stand, bed); y-axis 0 to 1000.]
MODELNET10 ACCURACY
Iter: 1000
[Bar chart: Train Accu, Test Accu, and mAP (%) for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet; y-axis 0 to 100.]
MODELNET40 # OF TEST & TRAIN OBJECTS
[Bar chart: # of Test Models and # of Train Models per category; y-axis 0 to 900.]
MODELNET40 ACCURACY
Iter: 1000
[Bar chart: Train Accu, Test Accu, and mAP (%) for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet; y-axis 0 to 100.]
MODELNET40 ACCURACY
7 CATEGORIES (# OF TRAIN OBJECTS > 400)
[Bar chart: # of Test Models and # of Train Models per category; y-axis 0 to 900.]
MODELNET40, 7 CATEGORIES ACCURACY
Iter: 1000
[Bar chart: Train Accu, Test Accu, and mAP (%) for PC MLP1, PC CNN1, PC MLPs, PC CNNs, PC MP, PC ResNet, PX MLP, PX MVCNN, VX MLP, VX CNN, VX ResNet; y-axis 0 to 100.]
MODELNET40 ACCURACY
10 CATEGORIES (# OF TRAIN OBJECTS > 300)
[Bar chart: # of Test Models and # of Train Models per category; y-axis 0 to 900.]