Credit: https://xkcd.com/1897/ ROI-10D: Monocular Lifting of - PowerPoint PPT Presentation

Credit: https://xkcd.com/1897/ ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape (F Manhardt, W Kehl, A Gaidon). Learning to Fuse Things and Stuff (J Li, A Raventos, A Bhargava, T Tagawa, A Gaidon).


  2. Credit: https://xkcd.com/1897/

  3. ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape. F Manhardt, W Kehl, A Gaidon. https://arxiv.org/abs/1812.02781. Learning to Fuse Things and Stuff. J Li, A Raventos, A Bhargava, T Tagawa, A Gaidon. https://arxiv.org/abs/1812.01192

  4. Credit: Ed Olson, May Mobility

  5. Image courtesy supervise.ly

  7. Toyota Safety Sense 2.0 Camera

  8. ICRA 2019 [arxiv + video]

  9. Easy to acquire Expensive / Difficult to acquire

  10. Easy to acquire

  11. Objective labels: depth model parameters; photometric loss via view synthesis; depth regularization (edge-aware depth smoothing); occlusion regularization

  12. Resolution Matters for View Synthesis!

  13. A. Odena, V. Dumoulin, and C. Olah, “Deconvolution and checkerboard artifacts,” Distill, vol. 1, no. 10, p. e3, 2016. W. Shi, J. Caballero, F. Huszár, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, and Z. Wang, “Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network,” CVPR 2016.

  14. Modified DispNet Architecture

  15. Priors learned by model due to occluded boundaries in fronto-parallel stereo case. Diagram labels: Left, Flipped Left, Network, Spatial Transformer, Disparity, Fused Left Disparity. M. Jaderberg, K. Simonyan, A. Zisserman, et al., “Spatial transformer networks,” NIPS 2015. C. Godard, O. Mac Aodha, and G. J. Brostow, “Unsupervised monocular depth estimation with left-right consistency,” CVPR 2017.

  16. Sub-pixel convolutions (SP), Differentiable Flip Augmentation (FA)

  18. ICLR 2019 [arxiv]

  19. Gaidon et al, "Virtual worlds as proxy for multiobject tracking analysis.", CVPR'16. Ros et al, "The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes", CVPR'16. de Souza et al, "Procedural Generation of Videos to Train Deep Action Recognition Networks.", CVPR'17.

  21. Loss terms: privileged regularization; adversarial loss; perceptual regularization (self-regularization); task loss (this is what we care about)

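The loss terms labelled on slide 11 can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: it assumes rectified stereo with grayscale images, a plain L1 photometric error, and first-order image gradients for the edge-aware weighting. The function names and `smooth_weight` are hypothetical, and the occlusion regularization term is left out.

```python
import numpy as np


def warp_right_to_left(right, disparity):
    """Synthesize the left view by sampling the right image at x - d(x, y).

    Uses linear interpolation along each row; samples outside the image
    are clamped to the border (an assumption made for simplicity).
    """
    h, w = disparity.shape
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity  # source columns
    xs = np.clip(xs, 0.0, w - 1.0)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def photometric_loss(left, right, disparity):
    """Mean L1 difference between the real and the synthesized left view."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disparity))))


def edge_aware_smoothness(disparity, image):
    """Penalize disparity gradients, down-weighted where the image has edges."""
    d_dx = np.abs(np.diff(disparity, axis=1))
    d_dy = np.abs(np.diff(disparity, axis=0))
    i_dx = np.abs(np.diff(image, axis=1))
    i_dy = np.abs(np.diff(image, axis=0))
    return float(np.mean(d_dx * np.exp(-i_dx)) + np.mean(d_dy * np.exp(-i_dy)))


def total_loss(left, right, disparity, smooth_weight=0.1):
    """Photometric term plus weighted smoothness (weight is a placeholder)."""
    return photometric_loss(left, right, disparity) + smooth_weight * edge_aware_smoothness(disparity, left)
```

Because the photometric term depends on sampling the other view at sub-pixel positions, its quality is tied to image resolution, which is the point made on slide 12.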
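The two components named on slide 16 can be illustrated in NumPy. This is a sketch under assumptions rather than the presented model: `pixel_shuffle` shows only the channel-to-space rearrangement that follows the convolution in a sub-pixel convolution layer (Shi et al., cited on slide 13), and `flip_augmented_disparity` fuses the two predictions by simple averaging, which is an assumption. `predict_disparity` is a hypothetical stand-in for the network.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange a (C * r * r, H, W) array into (C, H * r, W * r).

    Output pixel (c, h * r + i, w * r + j) is taken from input channel
    c * r * r + i * r + j at position (h, w).
    """
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)    # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)


def flip_augmented_disparity(image, predict_disparity):
    """Fuse predictions on an image and on its horizontally flipped copy.

    The second prediction is flipped back before averaging, so both maps
    are aligned with the original image.
    """
    d = predict_disparity(image)
    d_from_flipped = predict_disparity(image[:, ::-1])[:, ::-1]
    return 0.5 * (d + d_from_flipped)
```

A predictor with a left-to-right bias shows why the flip helps: the bias appears mirrored in the second prediction, so averaging spreads it evenly across the image instead of concentrating it at one border.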
