2x speedup
Example frames grouped by attribute.
Scenes: City, Tunnel, Parking, Highway, Residential
Weather: Rainy, Overcast, Sunny, Snowy
Time of Day: Dusk, Daytime, Night
Annotation types: Tagging (e.g., Sunny, City Street, Daytime), Lane & Drivable Area, Bounding Box Tracking, Instance Segmentation Tracking, Panoptic Segmentation
                        Pascal  COCO  Mapillary  Waymo  Argoverse  nuScenes  YouTube-BB  BDD100K
Images                  10K     328K  25K        -      -          -         -           -
Videos                  -       -     -          2K     113        1K        240K        100K
Crowd Sourced           √       √     √          x      x          x         √           √
Diverse Weather         √       √     √          √      √          √         √           √
>10 Objects per Image   x       √     √          √      √          √         x           √
Pixel Annotation        √       √     √          x      x          x         x           √
Tracking                x       x     x          √      √          √         √           √
Multitask               √       √     x          √      √          √         x           √
Tracking dataset scale: # Labeled Frames and # Instances, both in thousands (10^3).
Box tracking: KITTI, MOT17, BDD100K
Segmentation tracking: KITTI, MOTS, BDD100K
Quasi-Dense Instance Similarity Learning, Pang et al. ArXiv 2020
Object Detection: Frame 1 and Frame 2 pass through a shared Backbone and RPN, then RoI Align and a BBox Head with cls and reg outputs. Sparse GTs vs. Quasi-Dense Samples.
Instance Similarity Learning: Frame 1 and Frame 2 pass through a shared Backbone and RPN, then RoI Align and an Embedding Head; Contrastive Learning is applied between the embeddings of the two frames. Sparse GTs vs. Quasi-Dense Samples.
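The contrastive learning step named on this slide can be sketched as a softmax cross-entropy over region embeddings: an embedding from one frame should score higher against its match in the other frame than against other regions. This is a simplified single-positive version, not the exact loss from the paper; the function names and the dot-product similarity are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def contrastive_loss(query, positive, negatives):
    """Cross-entropy of picking `positive` among all candidates.

    query:     embedding of a region in one frame
    positive:  embedding of the same instance in the other frame
    negatives: embeddings of other regions in the other frame
    (all names here are hypothetical, chosen for this sketch)
    """
    logits = [dot(query, positive)] + [dot(query, n) for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


if __name__ == "__main__":
    aligned = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
    swapped = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
    print(aligned < swapped)
```

The loss is small when the query is most similar to its positive and grows when a negative scores higher.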
Object Association: Detections in the Current Frame are matched against Tracklets, Vanished Tracklets, and Backdrops from Previous Frames, using a shared Embedding Extractor and a Bi-directional Softmax.
Consistent, High Similarity: match to an existing tracklet
Inconsistent: Vanished Object
Low Similarity: New Object
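The bi-directional softmax used for object association can be sketched as follows: the similarity between a detection and a tracklet is normalized once over all tracklets and once over all detections, and the two values are averaged, so a pair scores high only when the match is consistent in both directions. The greedy matching and the 0.5 threshold are assumptions for this sketch, not settings taken from the slides.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bi_softmax(dets, tracks):
    """Return scores[i][j] for detection i and tracklet j."""
    exp = [[math.exp(dot(d, t)) for t in tracks] for d in dets]
    row_sum = [sum(row) for row in exp]  # per detection, over tracklets
    col_sum = [sum(exp[i][j] for i in range(len(dets)))
               for j in range(len(tracks))]  # per tracklet, over detections
    return [[0.5 * (exp[i][j] / row_sum[i] + exp[i][j] / col_sum[j])
             for j in range(len(tracks))]
            for i in range(len(dets))]


def associate(dets, tracks, match_thr=0.5):
    """Greedy one-to-one matching; match_thr is a hypothetical threshold.

    Returns {detection index: tracklet index}. Detections left out of the
    result would start new tracks.
    """
    scores = bi_softmax(dets, tracks)
    candidates = sorted(((scores[i][j], i, j)
                         for i in range(len(dets))
                         for j in range(len(tracks))), reverse=True)
    matches, used_tracks = {}, set()
    for score, i, j in candidates:
        if score < match_thr:
            break
        if i in matches or j in used_tracks:
            continue
        matches[i] = j
        used_tracks.add(j)
    return matches


if __name__ == "__main__":
    detections = [[4.0, 0.0], [0.0, 4.0], [-4.0, -4.0]]
    tracklets = [[0.0, 4.0], [4.0, 0.0]]
    print(associate(detections, tracklets))
```

In the usage example, the third detection is dissimilar to every tracklet, so it stays unmatched and would be treated as a new object.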
Tasks: Image Tagging, Lane Marking, Drivable Area, Domain Adaptation, Object Detection (Bounding Box), Trajectory Prediction, Bounding Box Tracking, Instance Segmentation, Semantic Segmentation, Panoptic Segmentation, Segmentation Tracking
Drivable Area Lane Markings
Lane ODS-F (%) vs. # Images (10K, 20K, 70K): Lane marking compared with Lane marking w/ Drivable area; values range from 45.4 to 54.5.
Drivable IoU (%) vs. # Images (10K, 20K, 70K): Drivable area compared with Drivable area w/ Lane marking; values range from 64.2 to 72.2.
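For reference, the IoU metric reported for drivable area can be computed on a pair of binary masks as below. This is a minimal sketch for a single class on nested lists; the function name and the convention of returning 1.0 when both masks are empty are assumptions, and the benchmark's own evaluation code may differ.

```python
def mask_iou(pred, target):
    """Intersection over union of two binary masks of the same shape.

    pred, target: nested lists of 0/1 values (rows of pixels).
    Returns 1.0 when both masks are empty (an assumed convention).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 1.0


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [1, 0]]
    print(mask_iou(predicted, ground_truth))
```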
Image -> Instance Segmentation
Image -> Box Detection (70K Labeled Images) -> Instance Segmentation (7K Labeled Images)
Instance Segmentation results
                   AP     AP50   AP75
Inst-Seg           21.8   40.5   20.5
Inst-Seg w/ Det    24.5   45.4   21.6
Learning Saliency Propagation for Semi-Supervised Instance Segmentation, Zhou et al. CVPR 2020
Abundant Box Annotations -> Backbone -> Box Head -> Loss
Limited Mask Annotations -> Backbone -> Mask Head -> Loss
Only a subset of object instances have mask annotations.
Abundant Box Annotations
Abundant boxes statistically provide knowledge of the salient regions of an instance (part of its shape).
Pixel relations can be inferred from low-level semantics (e.g., color, texture).
Well-generalized pixel relations can be learned from limited masks.
Backbone -> Box Head (Loss) -> ShapeProp Module -> Mask Head (Loss)
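ShapeProp module inside an existing instance segmentation framework:
#1 Activating Saliency: the Backbone ROI Feature and the Box Head's Box Detection yield an Instance Saliency map.
#2 Propagating Saliency: the Instance Saliency is propagated into a Shape Activation, which is fused with the ROI Feature and passed to the Mask Head (pixelwise classification) to give the Predicted Mask (e.g., Car).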
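Saliency propagation in detail: Conv Blocks on the ROI Feature produce Propagation Weights; the Instance Saliency is propagated over steps t = 0, 1, 2, ... up to t = max(H, W) to form the Shape Activation; a Conv layer with Normalize & Shuffle yields Propagated Features for reuse; a Reconstruction Loss compares against the GT Mask.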
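The iterative propagation can be illustrated with a generic message-passing sketch: at each step, every pixel's saliency becomes a weighted average of its 3x3 neighbourhood. The 3x3 neighbourhood, the per-pixel weight layout, and the border handling are assumptions for illustration; in the actual module the weights are learned, and the exact update may differ.

```python
def propagate(saliency, weights, steps):
    """Propagate a saliency map for `steps` iterations.

    saliency: H x W nested list of floats.
    weights:  H x W x 9 nested list of non-negative floats; weights[y][x][k]
              is the weight pixel (y, x) gives to its k-th 3x3 neighbour in
              row-major order (hypothetical layout for this sketch).
    """
    height, width = len(saliency), len(saliency[0])
    offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    current = [row[:] for row in saliency]
    for _ in range(steps):
        updated = [[0.0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                total, norm = 0.0, 0.0
                for k, (dy, dx) in enumerate(offsets):
                    ny, nx = y + dy, x + dx
                    # Neighbours outside the map are skipped.
                    if 0 <= ny < height and 0 <= nx < width:
                        total += weights[y][x][k] * current[ny][nx]
                        norm += weights[y][x][k]
                updated[y][x] = total / norm if norm > 0 else current[y][x]
        current = updated
    return current


if __name__ == "__main__":
    seed = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    uniform = [[[1.0] * 9 for _ in range(3)] for _ in range(3)]
    for row in propagate(seed, uniform, steps=1):
        print(row)
```

With uniform weights this is plain local averaging; learned weights would let saliency spread along pixels that belong together and stop at object boundaries.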
Class-wise semi-supervision (only a subset of classes have mask annotations): significant improvements over baselines, with more than 10 points of AP gain.
Image-wise semi-supervision (only a subset of images have mask annotations): significant improvements over baselines, for both single-stage and two-stage frameworks.
Full supervision (all instances have mask annotations): the learned shape representation also brings gains, improving the segmentation quality and generalization of existing frameworks.
Instance Saliency -> Shape Activation Saliency Propagation on BDD100K
Class-wise Semi-supervised Instance Segmentation on COCO: w/ ShapeProp vs. w/o ShapeProp
Image-wise Semi-supervised Instance Segmentation on BDD100K: w/o ShapeProp
Seg Tracking alone: Frame 1 and Frame 2 are linked by Seg Tracking.
Joint tasks: Frame 1 and Frame 2 each feed Detection and Segmentation, and are linked by Box Tracking and Seg Tracking.
Segmentation tracking results for four training configurations (Seg Track, w/ Instance Seg, w/ Box Track, All)
AP: 13.0, 18.7, 19.7, 23.3
MOTSA: 30.4, 33.7, 40.3, 41.4
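For readers unfamiliar with MOTSA, it is commonly defined as (TP - FP - IDS) / M, where M is the number of ground-truth masks. The sketch below assumes that definition and hypothetical argument names; it returns a fraction, whereas the slide reports percentages.

```python
def motsa(num_tp, num_fp, num_id_switches, num_gt_masks):
    """MOTSA = (TP - FP - IDS) / M, returned as a fraction.

    num_tp:          true-positive masks
    num_fp:          false-positive masks
    num_id_switches: identity switches
    num_gt_masks:    ground-truth masks (must be positive)
    """
    if num_gt_masks <= 0:
        raise ValueError("num_gt_masks must be positive")
    return (num_tp - num_fp - num_id_switches) / num_gt_masks


if __name__ == "__main__":
    print(motsa(num_tp=80, num_fp=10, num_id_switches=5, num_gt_masks=100))
```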
Paper: go.yf.io/bdd100k
Toolkit & Data: github.com/ucbdrive/bdd100k