DeepCap: Monocular Human Performance Capture Using Weak Supervision - PowerPoint PPT Presentation

DeepCap: Monocular Human Performance Capture Using Weak Supervision Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, and Christian Theobalt Marc Habermann

DeepCap Human performance capture from a monocular camera Marc Habermann 2

Challenges: (1) Monocular setting is inherently ambiguous. (2) High-dimensional problem: pose and surface deformation. Image source: https://www.fiylo.de/ Marc Habermann 3

Related Work: Capture using parametric models. Figures: Xiang et al. 2018; Kanazawa et al. 2018. Further work: Metaxas et al. 1993, Plaenkers et al. 2001, Sminchisescu et al. 2003, Sigal et al. 2004, Joo et al. 2018, Pavlakos et al. 2018, Kanazawa et al. 2019, Pavlakos et al. 2019, … Marc Habermann 4

Related Work: Monocular template-free capture. Figures: Zheng et al. 2019; Saito et al. 2019. Further work: Huang et al. 2018, Varol et al. 2018, Natsume et al. 2019, … Marc Habermann 5

Related Work: Template-based capture. Figures: Habermann et al. 2019; Xu et al. 2018. Further work: Carranza et al. 2003, Bray et al. 2006, Starck et al. 2007, De Aguiar et al. 2008, Brox et al. 2010, Cagniart et al. 2010, … Marc Habermann 6

DeepCap: Learning-based approach. Pose + surface deformation. Weak multi-view supervision. Marc Habermann 7

Personalized Character Model (fully automatic): Template mesh, skeleton, embedded graph. Marc Habermann 8

Inference Time Marc Habermann 9

Direct Supervision? Ground truth 3D pose and ground truth 3D surface are difficult to obtain. Marc Habermann 10

Weak Supervision: Multi-view 2D detections and multi-view foreground masks, compared against the 3D prediction through differentiable 3D to 2D modules. Marc Habermann 11

Training Data (Weak Multi View): Calibrated multi-view images → OpenPose (Cao et al. 2019) → 2D keypoints. Calibrated multi-view images → color keying → foreground mask. Marc Habermann 12
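
The color keying step that produces the foreground masks can be illustrated with a small sketch. This is not the authors' implementation: it assumes a green-screen studio, RGB `uint8` frames, and a hypothetical `margin` threshold chosen only for illustration.

```python
import numpy as np

def chroma_key_mask(image_rgb, margin=40):
    """Boolean foreground mask for an RGB uint8 image shot in front of a green screen.

    A pixel counts as background when its green channel exceeds both the red
    and the blue channel by more than `margin` (a hypothetical threshold).
    """
    img = image_rgb.astype(np.int32)  # avoid uint8 wrap-around in the subtraction
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    background = (g - np.maximum(r, b)) > margin
    return ~background

# Toy frame: green backdrop with a 2x2 skin-colored patch in the middle.
frame = np.zeros((4, 4, 3), dtype=np.uint8)
frame[...] = (20, 200, 30)
frame[1:3, 1:3] = (180, 120, 100)
mask = chroma_key_mask(frame)
```

A real studio setup would typically add morphological cleanup and spill suppression; the sketch only shows the thresholding idea.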

Pipeline Marc Habermann 13

PoseNet [Pipeline diagram: Segmented Input Image → PoseNet → Root Rotation α and Joint Angles θ → Kinematics Layer → Root Relative Landmarks → Global Alignment Layer → Global Landmarks P; losses: Pose Prior Loss, Multi-view Sparse Keypoint Loss against Joint Detections p_{c,m}] PoseNet: root rotation $\alpha \in \mathbb{R}^{3}$, joint angles $\theta \in \mathbb{R}^{27}$. Marc Habermann 14

PoseNet [PoseNet pipeline diagram] Kinematics Layer: function $f_m(\alpha, \theta): \mathbb{R}^{30} \to \mathbb{R}^{3}$ per landmark $m$. It maps the skeletal pose to camera and root relative 3D landmark positions $P_{c',m}$. Marc Habermann 15

PoseNet [PoseNet pipeline diagram] Global Alignment Layer: rigid transform for landmark $P_{c',m}$ from camera and root relative 3D space to global 3D space, $P_m = R_{c'}^{T} P_{c',m} + t$, where $R_{c'}^{T}$ is the inverse extrinsic rotation of the input camera $c'$ and $t$ is the global translation. Marc Habermann 16
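
The rigid transform of the global alignment layer can be written in a few lines. The following is a minimal sketch, assuming landmarks are stored as rows of an `(M, 3)` array; the function and variable names are illustrative and do not come from the slides.

```python
import numpy as np

def global_alignment(landmarks_cam, R_cam, t):
    """Map camera and root relative landmarks to global space: P_m = R^T P_{c',m} + t.

    landmarks_cam: (M, 3) array, one landmark per row.
    R_cam: (3, 3) extrinsic rotation of the input camera.
    t: (3,) global translation.
    """
    # For row vectors, (R^T p)^T equals p^T R, hence the right-multiplication by R.
    return landmarks_cam @ R_cam + t

# Example: camera rotated by 90 degrees about the z axis.
R_cam = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
t = np.array([0.0, 0.0, 2.0])
P_cam = np.array([[1.0, 0.0, 0.0]])
P_global = global_alignment(P_cam, R_cam, t)
```

Because the operation is a matrix product plus an offset, it stays differentiable with respect to both the landmarks and the translation.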

PoseNet [PoseNet pipeline diagram] Multi-view Sparse Keypoint Loss: $\mathcal{L}_{kp}(P) = \sum_{c} \sum_{m} \left\| \pi_c(P_m) - p_{c,m} \right\|^{2}$. The 3D landmark $P_m$ is projected ($\pi$) into camera view $c$ and compared to the 2D joint detection $p_{c,m}$. Marc Habermann 17
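
A sketch of the multi-view sparse keypoint loss follows. It assumes that the projection $\pi_c$ is a standard pinhole model with intrinsics `K` and extrinsics `(R, T)` per camera; the slides do not spell out the camera model, so these parameters and all names are assumptions for illustration.

```python
import numpy as np

def project(points, K, R, T):
    """Pinhole projection of (M, 3) world points into pixel coordinates (M, 2)."""
    cam = points @ R.T + T           # world space -> camera space
    pix = cam @ K.T                  # apply intrinsics
    return pix[:, :2] / pix[:, 2:3]  # perspective divide

def keypoint_loss(P, cameras, detections):
    """Sum over cameras c and landmarks m of ||pi_c(P_m) - p_{c,m}||^2."""
    total = 0.0
    for (K, R, T), p_c in zip(cameras, detections):
        residual = project(P, K, R, T) - p_c
        total += float(np.sum(residual ** 2))
    return total

K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
cameras = [(K, np.eye(3), np.zeros(3)),
           (K, np.eye(3), np.array([0.0, 0.0, 1.0]))]
P = np.array([[0.0, 0.0, 2.0], [0.2, 0.0, 2.0]])

perfect = [project(P, *cam) for cam in cameras]
# Shift one detection in the first view by one pixel in x.
noisy = [perfect[0] + np.array([[0.0, 0.0], [1.0, 0.0]]), perfect[1]]
loss_perfect = keypoint_loss(P, cameras, perfect)
loss_noisy = keypoint_loss(P, cameras, noisy)
```

With detections that match the projections exactly the loss vanishes, and a one-pixel offset on a single keypoint contributes a squared error of one.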

DefNet [Pipeline diagram: Segmented Input Image → DefNet → Rotation A and Translation T → Deformation Layer (together with Root Rotation α and Joint Angles θ from PoseNet) → Root Relative Landmarks and Vertices → Global Alignment Layer → Global Landmarks M and Global Vertices V; losses: Multi-view Sparse Keypoint Graph Loss against Joint Detections p_{c,m}, Multi-view Non-rigid Silhouette Loss against Foreground Masks, ARAP Loss] DefNet regresses the embedded deformation* in canonical pose: per node $k$, rotation angles $A_k$ and translation $T_k$. * (Sumner et al. 2007, Sorkine et al. 2007) Marc Habermann 18
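
For readers unfamiliar with embedded deformation, the sketch below applies the standard formulation of Sumner et al. 2007: each vertex is a weighted blend of the transforms of nearby graph nodes. It is an illustration, not the DeepCap code. It assumes that the per-node rotation angles have already been converted to rotation matrices and that the skinning weights per vertex sum to one; all names are hypothetical.

```python
import numpy as np

def embedded_deformation(vertices, nodes, rotations, translations, weights):
    """Deform vertices with an embedded graph.

    v' = sum_k w_k(v) * (R_k (v - g_k) + g_k + t_k)

    vertices: (N, 3), nodes g: (K, 3), rotations R: (K, 3, 3),
    translations t: (K, 3), weights w: (N, K) with rows summing to 1.
    """
    diff = vertices[:, None, :] - nodes[None, :, :]            # (N, K, 3)
    rotated = np.einsum('kij,nkj->nki', rotations, diff)       # R_k (v - g_k)
    per_node = rotated + nodes[None, :, :] + translations[None, :, :]
    return np.einsum('nk,nki->ni', weights, per_node)          # blend over nodes

vertices = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
nodes = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
rotations = np.stack([np.eye(3), np.eye(3)])
translations = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 3.0]])
weights = np.array([[1.0, 0.0], [0.5, 0.5]])
deformed = embedded_deformation(vertices, nodes, rotations, translations, weights)
```

In this toy case the first vertex follows node 0 only, while the second vertex averages the translations of both nodes.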

DefNet [DefNet pipeline diagram] Deformation Layer: pose and embedded deformation → deformation layer using Dual Quaternion Skinning (Kavan et al. 2007) → posed and deformed landmarks $M_{c',m}$ and vertices $V_{c',i}$. Marc Habermann 19

DefNet [DefNet pipeline diagram] Global Alignment Layer: rigid transform for landmark $m$ and vertex $i$, from the camera and root relative 3D landmark $M_{c',m}$ and vertex $V_{c',i}$ to the global 3D landmark $M_m$ and vertex $V_i$. Marc Habermann 20

DefNet [DefNet pipeline diagram] Multi-view Sparse Keypoint Graph Loss: $\mathcal{L}_{kpg}(M) = \sum_{c} \sum_{m} \left\| \pi_c(M_m) - p_{c,m} \right\|^{2}$, where $M_m$ is the global 3D landmark, $\pi_c$ the projection into camera view $c$, and $p_{c,m}$ the 2D joint detection. Marc Habermann 21

DefNet [DefNet pipeline diagram] Non-rigid Silhouette Loss: $\mathcal{L}_{sil}(V) = \sum_{c} \sum_{i \in \mathcal{B}_c} \left\| D_c(\pi_c(V_i)) \right\|^{2}$, where $\mathcal{B}_c$ is the set of boundary vertices for camera $c$ and $D_c$ is the distance transform image. Marc Habermann 22
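
The idea behind the silhouette loss can be sketched as follows. This is a simplified illustration under several assumptions: the distance transform is computed by brute force as the distance to the nearest foreground pixel, boundary vertices are assumed to be already projected to pixel coordinates `(x, y)`, and the lookup uses nearest-pixel rounding, which is not differentiable. All function names are hypothetical.

```python
import numpy as np

def distance_transform(mask):
    """Brute-force Euclidean distance from every pixel to the nearest foreground pixel."""
    fg = np.argwhere(mask)                                     # (F, 2) as (row, col)
    rows, cols = np.indices(mask.shape)
    grid = np.stack([rows, cols], axis=-1).reshape(-1, 1, 2)   # (H*W, 1, 2)
    dist = np.sqrt(((grid - fg[None, :, :]) ** 2).sum(axis=-1)).min(axis=1)
    return dist.reshape(mask.shape)

def silhouette_loss(boundary_pixels, dist_images):
    """Sum over cameras and boundary vertices of the squared distance transform value."""
    total = 0.0
    for px, D in zip(boundary_pixels, dist_images):
        xy = np.rint(px).astype(int)
        x = np.clip(xy[:, 0], 0, D.shape[1] - 1)
        y = np.clip(xy[:, 1], 0, D.shape[0] - 1)
        total += float(np.sum(D[y, x] ** 2))
    return total

# One camera, a 5x5 mask with a single foreground pixel in the center.
mask = np.zeros((5, 5), dtype=bool)
mask[2, 2] = True
D = distance_transform(mask)
# Two projected boundary vertices: one on the silhouette, one two pixels to the right.
boundary = [np.array([[2.0, 2.0], [4.0, 2.0]])]
loss = silhouette_loss(boundary, [D])
```

A training setup would replace the rounding with bilinear sampling so that gradients can flow back to the vertex positions.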

Qualitative Evaluation: Habermann et al. 2019 vs. ours, shown as overlay on input image and overlay on reference view. Marc Habermann 23

Qualitative Evaluation: Saito et al. 2019 vs. Zheng et al. 2019 vs. ours, shown as 3D view and overlay on input image. Marc Habermann 24

Quantitative Evaluation: Surface reconstruction accuracy
Method (on S4)                    Category            Multi-view IoU* (in %)
HMR (Kanazawa et al. 2018)        Person-unspecific   65.1
HMMR (Kanazawa et al. 2019)       Person-unspecific   63.79
LiveCap (Habermann et al. 2019)   Person-specific     59.96
Ours                              Person-specific     82.53
*IoU = Intersection over Union
Marc Habermann 25
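
The multi-view IoU metric can be illustrated with a short sketch. It assumes that the metric is the per-view intersection over union between the rendered and the reference foreground mask, averaged over views; the exact evaluation protocol is not given on the slide, and the function names are hypothetical.

```python
import numpy as np

def iou_percent(pred_mask, gt_mask):
    """Intersection over union of two boolean masks, in percent."""
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    if union == 0:
        return 100.0  # both masks empty: treated as perfect agreement (a convention)
    return 100.0 * intersection / union

def multiview_iou(pred_masks, gt_masks):
    """Average the per-view IoU over all camera views."""
    return float(np.mean([iou_percent(p, g) for p, g in zip(pred_masks, gt_masks)]))

# View 1: two 8-pixel bands that overlap in one row (intersection 4, union 12).
pred1 = np.zeros((4, 4), dtype=bool); pred1[0:2, :] = True
gt1 = np.zeros((4, 4), dtype=bool); gt1[1:3, :] = True
# View 2: identical masks.
pred2 = gt1.copy()
score = multiview_iou([pred1, pred2], [gt1, gt1])
```

Here the first view scores one third and the second view scores 100 percent, so the multi-view value is their mean.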

More results Marc Habermann 26

Thank you! Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt Marc Habermann 27
