UFO²: A Unified Framework towards Omni-supervised Object Detection
Zhongzheng Ren, Zhiding Yu, Xiaodong Yang, Ming-Yu Liu, Alexander G. Schwing, Jan Kautz
ECCV 2020
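Omni-supervised Object Detection
[Figure: example images (TV, cat, zebra, elephant) shown with each label type: unlabeled, tags, points, scribbles, boxes. Annotations on the figure: "commonly used in weakly-supervised", "strongly-supervised", "e.g., segmentation", "omni-supervised".]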
Omni-supervised Object Detection
Prior work (semi-supervised object detection: Gao et al., 2018; Radosavovic et al., 2019; Uijlings et al., 2018)
• Stage-wise training; pipelined
• Require some strong labels
Ours
• Unified
• Strong labels are not necessary
• More labels supported
UFO²: a Unified Framework
[Diagram: a unified network maps the image to features, pools RoI features for the proposals R, and feeds heads labeled Classification, Proposal, Regression, and Refinement. The heads are grouped under strong supervision and weak supervision.]
Partial Labels (tags, points, scribbles)
[Diagram: unified network with teacher and student heads. Partial labels supervise the teacher; proposal refinement (using points & scribbles) turns proposals R into pseudo GT for the student.]
• No GT boxes available
• Generate pseudo-GT online
Teacher heads:
o Image-level multi-label classification ($\mathcal{L}_{\mathrm{Tags}}$)
Student heads:
o RoI classification
o RoI regression
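The online pseudo-GT step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the teacher yields one score per proposal and class, picks the highest-scoring proposal for each tagged class, and, when a point is given, only considers proposals that contain it. The names `contains` and `pseudo_ground_truth` and the data layout are hypothetical.

```python
def contains(box, point):
    """True if point (x, y) lies inside box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    px, py = point
    return x1 <= px <= x2 and y1 <= py <= y2


def pseudo_ground_truth(proposals, teacher_scores, tags, points=None):
    """Pick pseudo-GT boxes from proposals (illustrative assumption, not the paper's rule).

    proposals:      list of boxes (x1, y1, x2, y2)
    teacher_scores: list of {class_name: score}, one dict per proposal
    tags:           class names known to be present in the image
    points:         optional {class_name: [(x, y), ...]} point annotations
    """
    points = points or {}
    pseudo = []
    for cls in tags:
        # One pseudo box per annotated point; one unconstrained box for a bare tag.
        hints = points.get(cls) or [None]
        for hint in hints:
            candidates = [
                i for i, box in enumerate(proposals)
                if hint is None or contains(box, hint)
            ]
            if not candidates:
                continue  # no proposal covers this point
            best = max(candidates, key=lambda i: teacher_scores[i].get(cls, 0.0))
            pseudo.append((cls, proposals[best]))
    return pseudo


proposals = [(0, 0, 10, 10), (5, 5, 20, 20), (30, 30, 40, 40)]
teacher_scores = [{"cat": 0.2}, {"cat": 0.7}, {"cat": 0.9}]
print(pseudo_ground_truth(proposals, teacher_scores, ["cat"]))
print(pseudo_ground_truth(proposals, teacher_scores, ["cat"], {"cat": [(8, 8)]}))
```

With only the tag, the globally best proposal is chosen; with the point, the choice is restricted to the proposals that cover it, which shows how a stronger partial label narrows the pseudo GT.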
Strong Labels (boxes)
[Diagram: unified network with teacher and student heads; GT boxes are the strong labels.]
• Naïve solution: directly supervise student heads using GT boxes
• Issue: Weak Teacher & Strong Students
Strong Labels (boxes)
[Diagram: unified network with teacher and student heads; GT boxes also supervise the teacher through three losses.]
Make Teacher Great Again!
o Image-level multi-label classification ($\mathcal{L}_{\mathrm{Tags}}$)
o RoI classification ($\mathcal{L}_{T1}$)
o RoI objectness regularization ($\mathcal{L}_{T2}$)
Unlabeled Data
[Diagram: the teacher predicts image-level classes (e.g., dog, book), which are then used to produce pseudo GT for the student.]
o Take confident classes from the teacher's prediction
o Then follow the “Tags” setting
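A minimal sketch of the first step, assuming the teacher outputs one image-level score per class and that "confident" means the score exceeds a fixed threshold. The threshold value and the function name `pseudo_tags` are hypothetical; the slides do not state how confidence is decided.

```python
def pseudo_tags(class_scores, threshold=0.9):
    """Keep the classes the teacher is confident about (threshold is an assumption)."""
    return sorted(name for name, score in class_scores.items() if score >= threshold)


# Hypothetical teacher scores for one unlabeled image.
scores = {"dog": 0.97, "book": 0.91, "cat": 0.20}
print(pseudo_tags(scores))
```

The resulting class list can then be treated exactly like human-provided tags for that image.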
Inference
[Diagram: Image → features → RoI features (from proposals R) → student heads: classification and regression.]
o Use only the student heads
o As efficient as standard supervised detectors
Dataset: Partial Labels Simulation
• Point: region → distance transform * Gaussian → point
• Scribble: region (mask) → skeleton, distance transform * Gaussian → scribble
Dataset: Partial Labels Simulation
[Figure: COCO images & boxes alongside the simulated points and scribbles.]
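The "distance transform * Gaussian" step for points can be sketched as follows. This is only an illustration under stated assumptions: the Gaussian is centered on the mask centroid with a hand-picked `sigma`, the product is used as a sampling weight over foreground pixels, and the distance transform is a brute-force version meant for tiny masks. None of these details are given in the slides, and `distance_transform` and `simulate_point` are hypothetical names.

```python
import math
import random


def distance_transform(mask):
    """Euclidean distance from each foreground pixel to the nearest background pixel."""
    h, w = len(mask), len(mask[0])
    background = [(y, x) for y in range(h) for x in range(w) if not mask[y][x]]
    dist = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                # default=1.0 covers the corner case of a mask with no background.
                dist[y][x] = min(
                    (math.hypot(y - by, x - bx) for by, bx in background),
                    default=1.0,
                )
    return dist


def simulate_point(mask, sigma=2.0, rng=None):
    """Sample one (row, col) click inside the mask, favoring central pixels."""
    rng = rng or random.Random(0)
    foreground = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not foreground:
        raise ValueError("mask has no foreground pixels")
    # Assumption: the Gaussian is centered on the mask centroid.
    cy = sum(y for y, _ in foreground) / len(foreground)
    cx = sum(x for _, x in foreground) / len(foreground)
    dist = distance_transform(mask)
    weights = [
        dist[y][x] * math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
        for y, x in foreground
    ]
    return rng.choices(foreground, weights=weights, k=1)[0]


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
row, col = simulate_point(mask)
print(row, col)
```

Multiplying the two terms keeps sampled points away from the object boundary (distance transform) and close to the object center (Gaussian), which roughly mimics where a human annotator would click.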
Experiments: Train from Scratch
[Bar chart: COCO val-2014 results (AP-50), training from scratch with a single label type (tags, points, scribbles, boxes); prior work vs. ours. Bar values present in the extraction, not matched to bars: 22.7, 24.3, 27, 29.8, 41.5, 46.4.]
Experiments: Improve Pre-trained Models
[Bar charts: COCO minival (AP), improving pre-trained models with mixed labels. Bar labels: pre-trained, tags, points, scribbles; pre-trained, unlabeled. Other labels: COCO-35, COCO-115. Bar values present in the extraction, not matched to bars: 29.1, 29.4, 30.1, 30.9, 32.7, 33.9.]
Experiments
[Figure panels: Tags, Points, Scribbles, Boxes.]
Budget-aware Omni-supervised Detection
Approx. per-image annotation budget on COCO: tags: 80s, points: 88.7s, scribbles: 160.4s, boxes: 346s
• Given a fixed annotation budget (time), the most common strategy:
  • STRONG: annotate using only boxes
• UFO² allows a promising new policy:
  • N%B: use N% of the budget for boxes and the remaining (100-N)% for points
Policy   Labels      # Labeled Images      AP
STRONG   B + U       2312 + 7688           13.97 ± 0.98
80%B     P + B + U   1804 + 1850 + 6346    14.11 ± 1.01
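The image counts in the table follow from the per-image costs. The sketch below assumes a total budget of 800,000 seconds and a pool of 10,000 images; neither number is stated on the slide, but both are consistent with it (2312 × 346s ≈ 800,000s, and each row sums to 10,000 images). The function name `allocate` is hypothetical.

```python
# Approximate per-image annotation cost on COCO, in seconds (from the slide).
COST_SECONDS = {"tags": 80.0, "points": 88.7, "scribbles": 160.4, "boxes": 346.0}


def allocate(budget_seconds, box_fraction, pool_size, partial="points"):
    """Split a time budget between boxes and one partial label type.

    box_fraction is the N% of the N%B policy expressed in [0, 1];
    images that receive no annotation stay unlabeled.
    """
    n_boxes = round(budget_seconds * box_fraction / COST_SECONDS["boxes"])
    n_partial = round(budget_seconds * (1.0 - box_fraction) / COST_SECONDS[partial])
    n_unlabeled = pool_size - n_boxes - n_partial
    return {"boxes": n_boxes, partial: n_partial, "unlabeled": n_unlabeled}


# Assumed budget and pool size, inferred from the table rather than stated in it.
print(allocate(800_000, 1.0, 10_000))  # STRONG policy
print(allocate(800_000, 0.8, 10_000))  # 80%B policy
```

Under these assumptions the two calls reproduce the STRONG row (2312 boxes, 7688 unlabeled) and the 80%B row (1804 points, 1850 boxes, 6346 unlabeled).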
Thanks for watching!