Recent Progress in Object Detection



  1. Recent Progress in Object Detection • Jiaqi Wang, Multimedia Laboratory, The Chinese University of Hong Kong

  2. Task definition • Image classification • Object detection • Semantic segmentation • Instance segmentation

  3. Task definition

  4. Progress (timeline figure, 2014 to 2018 and recent) • R-CNN, Fast R-CNN, Faster R-CNN, R-FCN, FPN, Mask R-CNN, Cascade R-CNN, Relation Network, SNIP • MultiBox, YOLO, SSD, YOLO v2, RetinaNet, CornerNet

  5. General pipeline: Two-stage detector • Feature generation: image -> backbone -> neck -> features • Region proposal: dense sliding-window anchors -> cls. & reg. -> proposals • Region recognition: RoI feature extractor -> RoI features -> task head

  6. General pipeline: Single-stage detector • Feature generation: image -> backbone -> neck • Dense sliding-window anchors -> cls. & reg.
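
The dense sliding-window anchors that appear in both pipelines can be sketched as follows. This is a minimal illustration only: the stride, scales, and aspect ratios are assumed values, not taken from the slides, and `sliding_window_anchors` is a hypothetical helper name.

```python
import math

def sliding_window_anchors(feat_h, feat_w, stride=16,
                           scales=(128, 256), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors centered on every feature-map cell.

    stride, scales and ratios are illustrative assumptions.
    ratio is interpreted as height / width; each anchor keeps area scale**2.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Center of this cell in image coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors
```

Every cell receives the same set of shapes regardless of image content, which is why the anchor count grows with the feature-map size.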

  7. Faster R-CNN • Region Proposal Network (RPN) • Training pipeline

  8. Faster R-CNN • RPN

  9. Faster R-CNN: Training pipeline • Joint (multi-task) training is preferred • The backbone feeds both the RPN and the Fast R-CNN head • RPN proposals are passed to the Fast R-CNN head with no gradient

  10. Feature Pyramid Network (FPN) • Top-down pathway • Multi-level prediction

  11. Feature Pyramid Network (FPN)

  12. Feature Pyramid Network (FPN)
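
The top-down pathway can be sketched on plain nested lists. This is a simplified illustration under stated assumptions: each pyramid level is exactly half the size of the previous one, upsampling is nearest-neighbour, and the lateral and smoothing convolutions of a real FPN are omitted. `upsample2x` and `top_down` are hypothetical helper names.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def top_down(laterals):
    """Merge feature maps from coarsest to finest.

    laterals: list of 2D maps, finest first, each half the size of the
    previous one (an assumption of this sketch).
    Returns the merged maps, finest first.
    """
    merged = [laterals[-1]]  # the coarsest level is used as-is
    for lateral in reversed(laterals[:-1]):
        up = upsample2x(merged[0])
        # Element-wise sum of the lateral map and the upsampled coarser map.
        merged.insert(0, [[a + b for a, b in zip(r1, r2)]
                          for r1, r2 in zip(lateral, up)])
    return merged
```

Multi-level prediction then runs the same head on every map in the returned list.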

  13. Mask R-CNN • RoIAlign • Mask branch

  14. Mask R-CNN: RoI Pooling vs. RoI Align
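
The difference between the two operators comes down to how a fractional coordinate is read from the feature map. The sketch below contrasts the two reads at a single sampling point; it is not a full RoI operator (no bins, no pooling over samples), and both function names are hypothetical.

```python
import math

def quantized_sample(fmap, y, x):
    """RoI Pooling style read: snap the coordinate to an integer cell first."""
    return fmap[int(y)][int(x)]

def bilinear_sample(fmap, y, x):
    """RoI Align style read: bilinear interpolation at the exact location."""
    h, w = len(fmap), len(fmap[0])
    # Clamp to the valid range so the four neighbours always exist.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy
```

On a 2x2 map `[[0, 10], [20, 30]]`, reading at (0.5, 0.5) gives 0 with quantization and 15 with interpolation, which illustrates the misalignment that RoI Align avoids.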

  15. Mask R-CNN: Mask branch

  16. Cascade R-CNN • Cascade architecture • Training distribution

  17. Cascade R-CNN: Cascade architecture (figure: Faster R-CNN vs. Cascade R-CNN)

  18. Cascade R-CNN: Training distribution (figure panels: Regressor, Detector)

  19. Cascade R-CNN: Training distribution
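
The training-distribution point can be made concrete by counting how many proposals qualify as positives as the IoU threshold rises from stage to stage. The thresholds 0.5, 0.6, 0.7 are the values commonly reported for Cascade R-CNN and are an assumption here, since the slides do not state them; `positives_per_stage` is a hypothetical helper.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def positives_per_stage(proposals, gt_box, thresholds=(0.5, 0.6, 0.7)):
    """Count proposals that would be labeled positive at each stage threshold."""
    return [sum(iou(p, gt_box) >= t for p in proposals) for t in thresholds]
```

With a fixed set of proposals the positive count shrinks as the threshold rises; the cascade counters this by feeding each stage the boxes refined by the previous one, which have higher IoU.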

  20. RetinaNet • FPN • Focal Loss

  21. RetinaNet • FPN • Heavier head than SSD / Faster R-CNN

  22. RetinaNet: Focal Loss • Problem: class imbalance (inefficient training; the loss is overwhelmed by negative samples) • Solutions by model: two-stage detectors use 1) proposals and 2) mini-batch sampling; SSD uses hard negative mining; RetinaNet uses focal loss

  23. RetinaNet: Focal Loss • Solution: down-weight well-classified samples (high confidence -> small loss)
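
The "high confidence -> small loss" idea can be written as a short function. This is a sketch for a single binary prediction; the defaults `gamma=2.0` and `alpha=0.25` are the values commonly cited for RetinaNet and are an assumption here, since the slides give no numbers.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class, strictly between 0 and 1.
    y: ground-truth label, 0 or 1.
    """
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of confidently correct predictions.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` the modulating factor disappears and the function reduces to alpha-weighted cross entropy, so `gamma` controls how strongly easy negatives are suppressed.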

  24. COCO Challenge 2018 • Comparison of our approach with 2017 winning entries on COCO test-dev. (Bar charts of mask AP and box AP for: 2017 winner (single model), 2017 winner, our single model, our final results.)

  25. COCO Challenge 2018 • 1. We developed a hybrid task cascade framework for detection and segmentation. (Detection & Segmentation)

  26. COCO Challenge 2018 • 1. We developed a hybrid task cascade framework for detection and segmentation. (Detection & Segmentation) • 2. We proposed a feature guided anchoring scheme to improve the average recall (AR) of RPN by 10 points. (Proposal)

  27. COCO Challenge 2018 • 1. We developed a hybrid task cascade framework for detection and segmentation. (Detection & Segmentation) • 2. We proposed a feature guided anchoring scheme to improve the average recall (AR) of RPN by 10 points. (Proposal) • 3. We designed a new backbone, FishNet. (Backbone)

  28. COCO Challenge 2018: Hybrid Task Cascade (HTC) • Cascade Mask R-CNN (Cascade R-CNN + Mask R-CNN) • Diagram: features F, RPN, and three stages, each pooling RoI features for a box branch (B1, B2, B3) and a mask branch (M1, M2, M3) • Problem: the two branches at each stage are executed in parallel, without interaction.

  29. COCO Challenge 2018: Hybrid Task Cascade (HTC) • Interleaved execution • Diagram: features F, RPN, box branches B1 to B3, mask branches M1 to M3, with an additional pooling step • Problem: no direct information flow between mask branches at different stages.

  30. COCO Challenge 2018: Hybrid Task Cascade (HTC) • Mask information flow • Diagram: as above, with connections between the mask branches M1, M2, M3 • Problem: spatial context is not much explored.

  31. COCO Challenge 2018: Hybrid Task Cascade (HTC) • Spatial context • Diagram: as above, with an added branch S

  32. COCO Challenge 2018 Hybrid Task Cascade (HTC)
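
The control flow built up across the HTC slides (interleaved execution plus mask information flow) can be sketched schematically. This is not the authors' implementation: the heads and the pooling function are passed in as plain callables, the branch S is left out, and `htc_forward` is a hypothetical name.

```python
def htc_forward(features, proposals, box_heads, mask_heads, pool):
    """Schematic interleaved cascade.

    box_heads[t](roi_feats) -> refined boxes
    mask_heads[t](roi_feats, prev_mask_feat) -> (masks, mask_feat)
    pool(features, boxes) -> RoI features
    All three are caller-supplied callables (assumptions of this sketch).
    """
    boxes = proposals
    mask_feat = None  # no previous mask feature before the first stage
    outputs = []
    for box_head, mask_head in zip(box_heads, mask_heads):
        # Interleaved execution: refine the boxes first ...
        boxes = box_head(pool(features, boxes))
        # ... then predict masks on the refined boxes, also receiving the
        # previous stage's mask feature (mask information flow).
        masks, mask_feat = mask_head(pool(features, boxes), mask_feat)
        outputs.append((boxes, masks))
    return outputs
```

In the parallel variant of slide 28, the mask head would instead pool with the boxes entering the stage and would not receive `mask_feat`.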

  33. COCO Challenge 2018: Guided anchoring • Feature generation: image -> backbone -> neck -> features • Region proposal: learnable anchors (in place of dense sliding-window anchors) -> cls. & reg. -> proposals • Region recognition: RoI feature extractor -> RoI features -> task head

  34. COCO Challenge 2018: Guided anchoring • Our goal: sparse anchors of arbitrary shape • General rules for anchor design: alignment, consistency

  35. COCO Challenge 2018: Guided anchoring • Location prediction

  36. COCO Challenge 2018: Guided anchoring • Shape prediction

  37. COCO Challenge 2018: Guided anchoring

  38. COCO Challenge 2018: Guided anchoring • Feature adaption
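
Location prediction and shape prediction together yield sparse anchors of arbitrary shape. The sketch below assumes the shape mapping w = sigma * stride * exp(dw) with sigma = 8 as described in the Guided Anchoring paper; the location threshold, the stride, and the cell-center convention are illustrative assumptions, and `guided_anchors` is a hypothetical helper.

```python
import math

def guided_anchors(loc_prob, shape_pred, stride=8, sigma=8.0, loc_threshold=0.5):
    """Build one anchor per location whose predicted probability passes the threshold.

    loc_prob[r][c]: predicted probability that an object center lies at cell (r, c).
    shape_pred[r][c]: raw (dw, dh) shape outputs for that cell.
    Returns a list of (x1, y1, x2, y2) anchors.
    """
    anchors = []
    for r, row in enumerate(loc_prob):
        for c, prob in enumerate(row):
            if prob < loc_threshold:
                continue  # sparse: most locations produce no anchor
            dw, dh = shape_pred[r][c]
            # Arbitrary shape: width and height are predicted, not predefined.
            w = sigma * stride * math.exp(dw)
            h = sigma * stride * math.exp(dh)
            cx, cy = (c + 0.5) * stride, (r + 0.5) * stride
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors
```

Compared with sliding-window anchors, the number of anchors now depends on the image content rather than on the feature-map size.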

  39. COCO Challenge 2018: Guided anchoring (Plot of AR 1000 against runtime on TITAN X (fps) for GA-RPN (SENet-154), GA-RPN (ResNet-50), RPN (SENet-154), RPN (ResNeXt-101), RPN (ResNet-152), and RPN (ResNet-50).)

  40. COCO Challenge 2018 Guided anchoring

  41. COCO Challenge 2018: Implementation details • 1. Training scales: short edge randomly sampled from 400 to 1400; long edge 1600 • 2. Test scales: (600, 900), (800, 1200), (1000, 1500), (1200, 1800), (1400, 2100) • 3. Pipeline: joint training; fine-tune with GA-RPN proposals; test with GA-RPN proposals • 4. Resources: 32 Tesla V100 GPUs (16 GB) for 3 days
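
The training-scale rule can be sketched as a resize helper. It assumes the aspect ratio is preserved and that both limits act as upper bounds (short edge at most the sampled value, long edge at most 1600); the slide does not spell this out, and `sample_train_size` is a hypothetical name.

```python
import random

def sample_train_size(width, height, short_range=(400, 1400),
                      long_edge=1600, rng=random):
    """Pick a random short-edge target and return the resized (width, height)."""
    short_target = rng.randint(*short_range)  # inclusive on both ends
    # Use the largest scale that respects both edge limits.
    scale = min(short_target / min(width, height),
                long_edge / max(width, height))
    return round(width * scale), round(height * scale)
```

For a 640x480 image the long-edge limit becomes the binding constraint whenever the sampled short edge exceeds 1200.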

  42. COCO Challenge 2018: Implementation details • Backbones • SENet-154: ~0.8 points higher • ResNeXt101 (64*4d), ResNeXt101 (32*8d), DPN-107, FishNet: comparable

  43. COCO Challenge 2018: Implementation details • Other tricks • w/ SoftNMS • w/o OHEM • w/o class-wise balanced sampling • w/o voting for bbox or mask
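
SoftNMS, the one trick listed as enabled, decays the scores of overlapping boxes instead of discarding them. The sketch below uses the Gaussian variant; `sigma=0.5` and the score threshold are assumed defaults, not settings taken from the slides.

```python
import math

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS. Returns (box, score) pairs in selection order."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the highest-scoring remaining box.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        box, score = candidates.pop(best)
        kept.append((box, score))
        rescored = []
        for other, s in candidates:
            # Decay the score by overlap instead of removing the box outright.
            s *= math.exp(-(iou(box, other) ** 2) / sigma)
            if s >= score_threshold:
                rescored.append((other, s))
        candidates = rescored
    return kept
```

A heavily overlapping box survives with a reduced score, so it can still count as a detection when it covers a second, nearby object.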

  44. COCO Challenge 2018 • With bells and whistles: mask AP on test-dev • Baseline (R-50 Cascade with mask): 36.9 • HTC: 38.4 (+1.5) • Deformable conv: 39.5 (+1.1) • Synchronized BN: 40.7 (+1.2) • Multi-scale training: 42.5 (+1.8) • Better backbone: 44.3 (+1.8) • GA-RPN finetune: 45.3 (+1.0) • Multi-scale & flip testing: 47.4 (+2.1) • Model ensemble: 49.0 (+1.6)

  45. mmdetection • Comprehensive: RPN, Fast/Faster R-CNN, Mask R-CNN, FPN, Cascade R-CNN, RetinaNet, and more • High performance: better performance, optimized memory consumption, faster speed • Handy to develop: written with PyTorch, modular design • GitHub: mmdetection

  46. References • Hybrid Task Cascade for Instance Segmentation (Accepted to CVPR 2019) https://arxiv.org/abs/1901.07518 • Region Proposal by Guided Anchoring (Accepted to CVPR 2019) https://arxiv.org/abs/1901.03278 • FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction (Accepted to NIPS 2018) https://arxiv.org/abs/1901.03495
