DeepCap: Monocular Human Performance Capture Using Weak Supervision

DeepCap: Monocular Human Performance Capture Using Weak Supervision. Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, and Christian Theobalt. Human performance capture from a monocular camera.


  1. DeepCap: Monocular Human Performance Capture Using Weak Supervision. Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, and Christian Theobalt. Slide footer: Marc Habermann

  2. DeepCap: Human performance capture from a monocular camera

  3. Challenges § Monocular setting is inherently ambiguous § High-dimensional problem: pose and surface deformation (Image source: https://www.fiylo.de/)

  4. Related Work § Capture using parametric models. Shown: Xiang et al. 2018; Kanazawa et al. 2018. Further references: Metaxas et al. 1993, Plaenkers et al. 2001, Sminchisescu et al. 2003, Sigal et al. 2004, Joo et al. 2018, Pavlakos et al. 2018, Kanazawa et al. 2019, Pavlakos et al. 2019, …

  5. Related Work § Monocular template-free capture. Shown: Zheng et al. 2019; Saito et al. 2019. Further references: Huang et al. 2018, Varol et al. 2018, Natsume et al. 2019, …

  6. Related Work § Template-based capture. Shown: Habermann et al. 2019; Xu et al. 2018. Further references: Carranza et al. 2003, Bray et al. 2006, Starck et al. 2007, De Aguiar et al. 2008, Brox et al. 2010, Cagniart et al. 2010, …

  7. DeepCap § Learning-based approach § Pose + surface deformation § Weak multi-view supervision

  8. Personalized Character Model § Fully automatic § Template mesh § Skeleton § Embedded graph

  9. Inference Time

  10. Direct Supervision? § Ground truth 3D pose § Ground truth 3D surface § Difficult to obtain

  11. Weak Supervision § Multi-view 2D detections § Multi-view foreground masks § Differentiable 3D to 2D modules

  12. Training Data: Weak Multi-View § Calibrated multi-view images § OpenPose (Cao et al. 2019): 2D keypoints § Color keying: foreground mask
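The color keying step on slide 12 can be illustrated with a minimal sketch. The slides do not say how the keying is implemented; the function name `color_key_mask`, the Euclidean RGB distance, and the threshold are assumptions chosen only for illustration.

```python
import numpy as np

def color_key_mask(image, key_color, threshold):
    """Return a boolean foreground mask for an (H, W, 3) image.

    Hypothetical rule: a pixel is foreground when its RGB distance to the
    key (background) color exceeds `threshold`.
    """
    diff = np.asarray(image, dtype=float) - np.asarray(key_color, dtype=float)
    return np.linalg.norm(diff, axis=-1) > threshold

# One green-screen pixel and one red (subject) pixel.
image = np.array([[[0, 255, 0], [255, 0, 0]]])
mask = color_key_mask(image, key_color=[0, 255, 0], threshold=100.0)
```

A real capture setup would likely need a more robust rule (for example keying in a different color space), which this sketch does not attempt.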

  13. Pipeline

  14. PoseNet
      [Pipeline diagram: Segmented Input Image → PoseNet → Rotation $\alpha$, Joint Angles $\theta$ → Kinematics Layer → Root Relative Landmarks → Global Alignment Layer → Global Landmarks $P$; losses: Pose Prior Loss, Multi-view Sparse Keypoint Loss (using Joint Detections $p_{c,m}$)]
      § PoseNet outputs: root rotation $\alpha \in \mathbb{R}^{3}$ and joint angles $\theta \in \mathbb{R}^{27}$

  15. PoseNet: Kinematics Layer (PoseNet pipeline diagram repeated from slide 14)
      § Function $f_m(\alpha, \theta): \mathbb{R}^{30} \to \mathbb{R}^{3}$ per landmark $m$
      § Maps the skeletal pose to camera and root relative 3D landmark positions $P_{c',m}$

  16. PoseNet: Global Alignment Layer (PoseNet pipeline diagram repeated from slide 14)
      § Rigid transform for landmark $P_{c',m}$: from camera and root relative 3D space to global 3D space
      $P_m = R_{c'}^{T} P_{c',m} + t$
      § $R_{c'}^{T}$: inverse extrinsic rotation of the input camera $c'$ § $t$: global translation
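The rigid transform on slide 16 can be sketched in a few lines of NumPy. The function name `to_global` and the row-vector array layout are assumptions for illustration, not the authors' code.

```python
import numpy as np

def to_global(landmarks_cam, R_cam, t):
    """Map camera and root relative landmarks (M, 3) to global 3D space.

    R_cam: (3, 3) extrinsic rotation of the input camera.
    t:     (3,) global translation.
    Applies P_m = R_cam^T P_{c',m} + t to every landmark m.
    """
    landmarks_cam = np.asarray(landmarks_cam, dtype=float)
    # Row-vector form: (R^T p)^T = p^T R, so multiply by R on the right.
    return landmarks_cam @ np.asarray(R_cam, dtype=float) + np.asarray(t, dtype=float)

# Example: camera rotated by 90 degrees about the z-axis.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
P_global = to_global([[1.0, 0.0, 0.0]], R, [0.0, 0.0, 2.0])
```

Because a rotation matrix is orthogonal, its transpose is its inverse, which is why the slide can describe $R_{c'}^{T}$ as the inverse extrinsic rotation.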

  17. PoseNet: Multi-view Sparse Keypoint Loss (PoseNet pipeline diagram repeated from slide 14)
      $\mathcal{L}_{kp}(P) = \sum_{c} \sum_{m} \left\| \pi_c(P_m) - p_{c,m} \right\|^{2}$
      § $\pi_c$: projecting 3D landmark $P_m$ into camera view $c$ § Comparing to 2D joint detection $p_{c,m}$
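The keypoint loss on slide 17 can be sketched as follows. The slides do not specify the camera model; a pinhole projection with intrinsics `K` and extrinsics `(R, t)` is assumed here, and the names `project` and `keypoint_loss` are hypothetical.

```python
import numpy as np

def project(points, K, R, t):
    """Assumed pinhole projection of global 3D points (M, 3) to pixels (M, 2)."""
    cam = points @ R.T + t            # global -> camera coordinates
    uvw = cam @ K.T                   # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide

def keypoint_loss(landmarks, cameras, detections):
    """Sum over cameras c and landmarks m of ||pi_c(P_m) - p_{c,m}||^2."""
    loss = 0.0
    for (K, R, t), p_c in zip(cameras, detections):
        residual = project(landmarks, K, R, t) - p_c
        loss += float(np.sum(residual ** 2))
    return loss

K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
cameras = [(K, np.eye(3), np.zeros(3)), (K, np.eye(3), np.array([0.0, 0.0, 1.0]))]
P = np.array([[0.0, 0.0, 1.0], [0.1, 0.0, 1.0]])
# Synthetic "detections": exact projections, with one keypoint shifted by one pixel.
detections = [project(P, *cam) for cam in cameras]
detections[0] = detections[0] + np.array([[1.0, 0.0], [0.0, 0.0]])
loss = keypoint_loss(P, cameras, detections)
```

Only 2D detections from calibrated views enter the loss, which is what makes the supervision weak: no 3D ground truth is needed.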

  18. DefNet
      [Pipeline diagram: Segmented Input Image → PoseNet → Rotation $\alpha$, Joint Angles $\theta$; Segmented Input Image → DefNet → Rotation $A$, Translation $T$; both feed the Deformation Layer → Root Relative Landmarks and Vertices → Global Alignment Layer → Global Landmarks $M$, Global Vertices $V$; losses: Multi-view Sparse Keypoint Graph Loss (using Joint Detections $p_{c,m}$), Multi-view Silhouette Loss (using Foreground Masks), Non-rigid ARAP Loss]
      § DefNet regresses embedded deformation* in canonical pose
      § Per node $k$: rotation angles $A_k$ and translation $T_k$
      * (Sumner et al. 2007, Sorkine et al. 2007)

  19. DefNet: Deformation Layer (DefNet pipeline diagram repeated from slide 18)
      § Inputs: pose and embedded deformation
      § Deformation: Dual Quaternion Skinning (Kavan et al. 2007)
      § Outputs: posed and deformed landmarks $M_{c',m}$ and vertices $V_{c',i}$
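The embedded deformation that feeds the deformation layer on slide 19 can be sketched in the style of Sumner et al. 2007: each graph node rotates nearby vertices about itself and translates them, and the per-node results are blended with weights. The Euler angle convention, the weight matrix, and all function names below are assumptions; the skinning step (Dual Quaternion Skinning) is not included.

```python
import numpy as np

def euler_to_matrix(angles):
    """Rotation matrix from Euler angles in radians, assuming R = Rz @ Ry @ Rx."""
    ax, ay, az = angles
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def embedded_deformation(vertices, nodes, angles, translations, weights):
    """Deform vertices (N, 3) with K graph nodes.

    nodes, angles, translations: (K, 3); weights: (N, K) with rows summing to 1.
    Computes sum_k w_k * (R_k (v - g_k) + g_k + t_k) for every vertex v.
    """
    vertices = np.asarray(vertices, dtype=float)
    nodes = np.asarray(nodes, dtype=float)
    translations = np.asarray(translations, dtype=float)
    weights = np.asarray(weights, dtype=float)
    deformed = np.zeros_like(vertices)
    for k in range(len(nodes)):
        R = euler_to_matrix(angles[k])
        # Rotate about node k, then move with the node's translation.
        moved = (vertices - nodes[k]) @ R.T + nodes[k] + translations[k]
        deformed += weights[:, k:k + 1] * moved
    return deformed

# One vertex influenced equally by two nodes; only the second node translates.
v = embedded_deformation(
    vertices=[[1.0, 0.0, 0.0]],
    nodes=[[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
    angles=np.zeros((2, 3)),
    translations=[[0.0, 0.0, 0.0], [0.0, 2.0, 0.0]],
    weights=[[0.5, 0.5]],
)
```

With zero angles and zero translations the function returns the input vertices unchanged, which corresponds to the undeformed template in canonical pose.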

  20. DefNet: Global Alignment Layer (DefNet pipeline diagram repeated from slide 18)
      § Rigid transform for landmark $m$ and vertex $i$
      § From camera and root relative 3D landmark $M_{c',m}$ and vertex $V_{c',i}$ to global 3D landmark $M_m$ and vertex $V_i$

  21. DefNet: Multi-view Sparse Keypoint Graph Loss (DefNet pipeline diagram repeated from slide 18)
      $\mathcal{L}_{kpg}(M) = \sum_{c} \sum_{m} \left\| \pi_c(M_m) - p_{c,m} \right\|^{2}$
      § $M_m$: global 3D landmark

  22. DefNet: Non-rigid Silhouette Loss (DefNet pipeline diagram repeated from slide 18)
      $\mathcal{L}_{sil}(V) = \sum_{c} \sum_{i \in \mathcal{B}_c} \left\| D_c\left(\pi_c(V_i)\right) \right\|^{2}$
      § $\mathcal{B}_c$: set of boundary vertices for camera $c$ § $D_c$: distance transform image
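The silhouette loss on slide 22 can be sketched under two assumptions that the slides do not confirm: the distance transform image stores, per pixel, the Euclidean distance to the nearest silhouette edge pixel, and projected boundary vertices are looked up with nearest-pixel rounding. The names `distance_to_nonzero` and `silhouette_loss` are hypothetical, and the brute-force transform is only suitable for tiny images.

```python
import numpy as np

def distance_to_nonzero(edge_image):
    """Brute-force Euclidean distance from each pixel to the nearest nonzero pixel."""
    edge_image = np.asarray(edge_image)
    h, w = edge_image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    pixels = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)
    edge_pixels = pixels[edge_image.ravel() > 0]
    diffs = pixels[:, None, :] - edge_pixels[None, :, :]
    return np.linalg.norm(diffs, axis=2).min(axis=1).reshape(h, w)

def silhouette_loss(projected_boundary, dist_images):
    """Sum over cameras c and boundary vertices i of D_c(pi_c(V_i))^2.

    projected_boundary: one (B_c, 2) array of pixel coordinates (x, y) per camera.
    dist_images:        one distance transform image D_c per camera.
    """
    loss = 0.0
    for uv, D in zip(projected_boundary, dist_images):
        cols = np.clip(np.rint(uv[:, 0]).astype(int), 0, D.shape[1] - 1)
        rows = np.clip(np.rint(uv[:, 1]).astype(int), 0, D.shape[0] - 1)
        loss += float(np.sum(D[rows, cols] ** 2))
    return loss

# A single silhouette edge pixel at row 2, column 2 of a 5x5 image.
edges = np.zeros((5, 5))
edges[2, 2] = 1
D = distance_to_nonzero(edges)
# One boundary vertex on the edge, one two pixels to the right of it.
loss = silhouette_loss([np.array([[2.0, 2.0], [4.0, 2.0]])], [D])
```

Nearest-pixel lookup has no useful gradient, so a trainable version would need a differentiable sampling scheme such as bilinear interpolation.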

  23. Qualitative Evaluation § Compared: Habermann et al. 2019, Ours § Views: overlay on input image, overlay on reference view

  24. Qualitative Evaluation § Compared: Saito et al. 2019, Zheng et al. 2019, Ours § Views: overlay on input image, 3D view

  25. Quantitative Evaluation: surface reconstruction accuracy (on S4)

      Method                            Type                Multi-view IoU* (in %)
      HMR (Kanazawa et al. 2018)        Person-unspecific   65.1
      HMMR (Kanazawa et al. 2019)       Person-unspecific   63.79
      LiveCap (Habermann et al. 2019)   Person-specific     59.96
      Ours                              Person-specific     82.53

      *IoU = Intersection over Union
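Intersection over Union between a rendered silhouette and a foreground mask can be computed as below. Averaging the per-view IoU over all camera views is an assumption; slide 25 does not state how the multi-view score is aggregated, and the function names are hypothetical.

```python
import numpy as np

def mask_iou(pred, gt):
    """IoU of two binary masks with the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred, gt).sum() / union)

def multi_view_iou(pred_masks, gt_masks):
    """Mean IoU over views, in percent (assumed aggregation)."""
    scores = [mask_iou(p, g) for p, g in zip(pred_masks, gt_masks)]
    return 100.0 * float(np.mean(scores))

view_a = ([[1, 1], [0, 0]], [[1, 0], [0, 0]])   # intersection 1, union 2
view_b = ([[1, 0], [0, 1]], [[1, 0], [0, 1]])   # identical masks
score = multi_view_iou([view_a[0], view_b[0]], [view_a[1], view_b[1]])
```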

  26. More results

  27. Thank you! Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt
