Learning 3D object models from 2D images


  1. Learning 3D object models from 2D images. Iasonas Kokkinos, Learning from Imperfect Data Workshop. [Figure labels: Cropped Input Image; ResNet-50; Latent Vector; Spatial Mesh Convolutional Decoder; Predicted Mesh; Predicted Landmarks; Mesh Loss; Generated Ground Truth; Iterative Model Fitting]

  2. Ariel AI: S. Zafeiriou, E. Schmitt, H. Wang, D. Kulon, G. Papandreou, R. A. Guler, B. Fulkerson, P. Koutras, E. Skordos, A. Kakolyris, H. Tam, A. Lazarou, S. Galanakis, D. Stoddard. UCL, Imperial College, FAIR, INRIA, Stony Brook: Natalia Neverova, M. Bronstein, Z. Shu, M. Sahasrabudhe, E. Bartrum, N. Paragios, D. Samaras (affiliation labels as they appear on the slide: Imperial College, FAIR, Stony Brook, INRIA, UCL, INRIA, Stony Brook).

  3. Human analysis: from coarse to fine. Stages: Image Classification, Object Detection, Part Segmentation, Pose Estimation, DensePose (our work). Highlighted: Image Classification. Is there a person in this image? Yes? No?

  4. Human analysis: from coarse to fine. Highlighted: Person Detection. Localize persons in the image.

  5. Human analysis: from coarse to fine. Highlighted: Part Segmentation. Segment semantically meaningful body parts.

  6. Human analysis: from coarse to fine. Highlighted: Pose Estimation. Localize joints of the persons in the images.

  7. Human analysis: from coarse to fine. Highlighted: Dense Pose Estimation (DensePose, our work). Find correspondence between all pixels and a 3D model.

  8. Holy grail: 3D human reconstruction. “Wide Open” (The Mill, 2015)

  9. Ariel AI: 3D human reconstruction on mobile

  10. Ariel AI: 3D human reconstruction on mobile. Applications: seamless augmented reality; immersive gaming; holographic telepresence; kinetic learning; universal motion capture; personalised, experiential retail.

  11. Challenges. Depth/height ambiguity. 3D from 2D: a fundamentally ill-posed problem. Scarce 3D supervision: almost impossible to obtain in-the-wild.
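
The depth/height ambiguity named on slide 11 can be illustrated with a minimal sketch. It assumes an ideal pinhole camera, which the slides do not specify; the function names, the focal length, and the 1.8 m person are hypothetical values chosen only for illustration. Scaling a subject and its distance by the same factor leaves the 2D projection unchanged, so a single image cannot separate height from depth.

```python
import math


def project(point, focal=1.0):
    """Pinhole projection of a 3D point (x, y, z) onto the image plane."""
    x, y, z = point
    return (focal * x / z, focal * y / z)


def scale_scene(points, s):
    """Scale every 3D point about the camera centre by the factor s."""
    return [(s * x, s * y, s * z) for x, y, z in points]


# Head and feet of a hypothetical 1.8 m person standing 3 m from the camera.
person = [(0.0, 0.9, 3.0), (0.0, -0.9, 3.0)]
# The same person made twice as tall and placed twice as far away.
giant = scale_scene(person, 2.0)

image_person = [project(p) for p in person]
image_giant = [project(p) for p in giant]

# Both subjects land on the same image coordinates.
same_image = all(
    math.isclose(a, b)
    for p, g in zip(image_person, image_giant)
    for a, b in zip(p, g)
)
print(same_image)
```

Because the two scenes are indistinguishable in the image, some prior over plausible body shapes and sizes is needed to pick one reconstruction, which is consistent with the deck's emphasis on model-based priors.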

  12. From imperfect vision to imperfect data
      Computer vision before deep learning:
      - Your "local evidence" is imperfect (classifier scores, unary terms, ...)
      - Compensate for it with a model-based prior during inference (AAMs, MRFs, ...)
      Computer vision after deep learning:
      - Your "local evidence" can become perfect
      - Your training data is imperfect
      - Compensate for it with a model-based prior, applied before or during training

  13. Imperfect Data for Semantic Segmentation. Bounding boxes + occupancy priors. George Papandreou, Liang-Chieh Chen, Kevin P. Murphy, Alan L. Yuille, “Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation”, ICCV 2015

  14. Imperfect Data for Instance Segmentation. 4 points + segmentation system. Kevis-Kokitsi Maninis, Sergi Caelles, Jordi Pont-Tuset, Luc Van Gool, “Deep Extreme Cut: From Extreme Points to Object Segmentation”

  15. Imperfect Data for Pose Estimation. Keypoints + temporal correspondence. Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, Lorenzo Torresani, “Learning Temporal Pose Estimation from Sparsely Labeled Videos”, NeurIPS 2019

  16. Part 1: Weakly- and semi-supervised learning for 3D. A. Guler and I. Kokkinos, “HoloPose: Holistic 3D Human Reconstruction In-the-Wild”, CVPR 2019. D. Kulon et al., “Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild”, CVPR 2020

  17. Part 2: Fully unsupervised learning for 3D. Unstructured face dataset goes in, deep magic happens, 3D model comes out. Includes all previous tasks as special cases. M. Sahasrabudhe, Z. Shu, E. Bartrum, A. Guler, D. Samaras and I. Kokkinos, “Lifting AutoEncoders: Unsupervised Learning of 3D Morphable Models Using Deep Non-Rigid Structure from Motion”, ICCV GMDL 2019

  18. DenseReg: From Image to Template to Task. R. A. Guler, G. Trigeorgis, E. Antonakos, P. Snape, S. Zafeiriou, I. Kokkinos, “DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild”, CVPR 2017

  19. DenseReg, Frame-by-Frame

  20. Supervision: from parametric model fitting to 2D keypoints. 2D canonical coordinates. Annotation effort: a few 2D landmarks per image. Density: morphable model prior.

  21. DensePose: dense image-to-body correspondence. DensePose-RCNN: ~25 FPS. http://densepose.org/ R. A. Guler, N. Neverova, I. Kokkinos, “DensePose: Dense Human Pose Estimation In The Wild”, CVPR 2018

  22. Annotation pipeline-II. TASK 1: Part Segmentation (input image, segmented parts). TASK 2: Marking Correspondences (sampled points, rendered images for the specific part). Surface Correspondence.

  23. DensePose-COCO dataset. Quantization replaced by part assignment. densepose.org [Figure panels: Image; U coordinates; V coordinates]
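
One way to read "quantization replaced by part assignment" on slide 23 is that each pixel is first assigned to a body part, and continuous U and V coordinates are then read off for that part. The sketch below shows that decoding step for a single pixel. It is an illustration under that assumption, not the DensePose-RCNN implementation; the function name `decode_iuv`, the argument layout, and the toy scores are all hypothetical.

```python
def decode_iuv(part_scores, u_values, v_values):
    """Decode one pixel into (part index, U, V).

    part_scores: scores with index 0 = background and 1..P = body parts.
    u_values, v_values: length-P lists holding the U and V coordinates
        regressed for each part at this pixel (hypothetical layout).
    Returns (0, 0.0, 0.0) when background has the highest score.
    """
    # Part assignment: pick the highest-scoring class for this pixel.
    part = max(range(len(part_scores)), key=lambda i: part_scores[i])
    if part == 0:
        return (0, 0.0, 0.0)
    # Read the continuous surface coordinates regressed for that part.
    return (part, u_values[part - 1], v_values[part - 1])


# Toy pixel with two parts: part 2 has the highest score.
foreground = decode_iuv([0.1, 0.2, 0.7], [0.5, 0.25], [0.1, 0.9])
# Toy pixel where background has the highest score.
background = decode_iuv([0.9, 0.05, 0.05], [0.5, 0.25], [0.1, 0.9])
print(foreground, background)
```

Keeping the part choice discrete and the surface coordinates continuous means the coordinates are only ever interpreted inside one part's own chart.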

  24. DensePose-RCNN in action. DensePose-RCNN Results Visualization. Quantization replaced by part assignment.

  25. HoloPose: multi-person 3D reconstruction results. R. A. Guler, I. Kokkinos, “HoloPose: Holistic 3D Human Reconstruction In The Wild”, CVPR 2019

  26. Surface-level human understanding, CVPR 2018.
      Dense UV coordinate regression:
      - R. A. Güler, N. Neverova, I. Kokkinos, “DensePose: Dense Human Pose Estimation In The Wild”, CVPR 2018
      SMPL parameter regression:
      - A. Kanazawa, M. J. Black, D. W. Jacobs, J. Malik, “End-to-end Recovery of Human Shape and Pose”, CVPR 2018
      - G. Pavlakos, L. Zhu, X. Zhou, K. Daniilidis, “Learning to Estimate 3D Human Pose and Shape from a Single Image”, CVPR 2018
      - Andrei Zanfir, Elisabeta Marinoiu, Cristian Sminchisescu, “Monocular 3D Pose and Shape Estimation of Multiple People”, CVPR 2018
      Comparison labels on the slide: Robust & accurate, “in-the-wild”; Parametric and 3D; Not 3D; Alignment.

  27. Bottom-up human body reconstruction
