VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment Chunyu Wang Microsoft Research Asia https://github.com/microsoft/voxelpose-pytorch
Broad Impact
• Intelligent retail (Microsoft Connected Store)
• Sports broadcasting/training/judging
• Human-robot interaction
• Augmented/virtual reality
VoxelPose: Hanyue Tu, Chunyu Wang, Wenjun Zeng
Previous Work
Multiview Images → 2D Keypoint Estimation → Cross-view Matching → Triangulation or Pictorial Model → Multi-person 3D Poses
Image credit: Dong, Junting, et al. "Fast and robust multi-person 3d pose estimation from multiple views." CVPR 2019
VoxelPose
Multiview Images → 2D Keypoint Estimation → Cross-view Matching → Triangulation or Pictorial Model → Multi-person 3D Poses
Single Model
(No hard decisions within steps)
(Delay decision until all views are available)
VoxelPose
1. Discretize 3D space by voxels
2. Compute a feature for each voxel by inversely projecting 2D features to 3D
3. The resulting feature is robust to occlusion
4. Predict whether each voxel contains body joints
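The four steps above can be sketched in a few lines. This is a minimal illustration, not the code in the VoxelPose repository: it assumes a simplified pinhole camera with translation only (no rotation), nearest-pixel sampling, and a plain average over the views that see the voxel. The names `project`, `voxel_feature`, `blank`, and the camera dictionary keys are hypothetical.

```python
def project(point, cam):
    """Pinhole projection of a world point to pixel coordinates.

    Assumption: the camera is only translated by cam["t"] (no rotation).
    Returns (u, v), or None when the point lies behind the camera.
    """
    x, y, z = (p - t for p, t in zip(point, cam["t"]))
    if z <= 0:
        return None
    return cam["f"] * x / z + cam["cx"], cam["f"] * y / z + cam["cy"]


def voxel_feature(center, cams, heatmaps):
    """Average the 2D heatmap values found where the voxel centre projects."""
    samples = []
    for cam, hm in zip(cams, heatmaps):
        uv = project(center, cam)
        if uv is None:
            continue  # voxel is behind this camera
        col, row = round(uv[0]), round(uv[1])
        if 0 <= row < len(hm) and 0 <= col < len(hm[0]):
            samples.append(hm[row][col])  # heatmap is indexed [row][col]
    return sum(samples) / len(samples) if samples else 0.0


def blank(size=5):
    """An all-zero square heatmap (a view in which the joint is not detected)."""
    return [[0.0] * size for _ in range(size)]


# Two hypothetical cameras, one unit apart along x, both looking along +z.
cams = [
    {"t": (0.0, 0.0, 0.0), "f": 4.0, "cx": 2.0, "cy": 2.0},
    {"t": (1.0, 0.0, 0.0), "f": 4.0, "cx": 2.0, "cy": 2.0},
]
hm_a = blank()
hm_a[2][2] = 1.0  # joint seen at pixel (col 2, row 2) in view A
hm_b = blank()
hm_b[2][1] = 1.0  # the same joint seen at pixel (col 1, row 2) in view B

joint = (0.0, 0.0, 4.0)
both_views = voxel_feature(joint, cams, [hm_a, hm_b])
one_occluded = voxel_feature(joint, cams, [hm_a, blank()])
empty_voxel = voxel_feature((1.0, 0.0, 4.0), cams, [hm_a, hm_b])
```

In this toy setup the voxel that contains the joint still receives a non-zero feature when one view misses it, which is the sense in which the fused feature tolerates occlusion.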
Hybrid Model - (1) Human Detection
(300mm x 300mm x 300mm)
(2000mm x 2000mm x 2000mm)
The proposals need not be very precise, since we refine them in the following step.
Hybrid Model - (2) Joint Detection
(30mm x 30mm x 30mm)
This is sufficiently accurate for body joint localization.
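A rough voxel count shows why the coarse-then-fine split helps. The sketch below is only arithmetic under stated assumptions: the 8000mm x 8000mm x 2000mm capture space is a hypothetical value that does not appear in the slides, and the 2000mm cube is read here as the region refined around each proposal, which the slides do not state explicitly.

```python
import math


def grid_size(extent_mm, voxel_mm):
    """Number of voxels needed to cover a box, rounding up along each axis."""
    count = 1
    for side in extent_mm:
        count *= math.ceil(side / voxel_mm)
    return count


SPACE_MM = (8000, 8000, 2000)   # hypothetical capture space, not from the slides
PERSON_MM = (2000, 2000, 2000)  # assumed per-person region around a proposal

coarse_whole_space = grid_size(SPACE_MM, 300)  # human detection grid
fine_per_person = grid_size(PERSON_MM, 30)     # joint detection grid
fine_whole_space = grid_size(SPACE_MM, 30)     # what a single fine grid would cost

print(coarse_whole_space, fine_per_person, fine_whole_space)
```

Under these assumptions, one fine grid over the whole space needs roughly sixteen times as many voxels as the fine grid around a single person, while the coarse detection grid is small by comparison.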
Technical Details of VoxelPose
Step 1: 2D Heatmap Estimation
HE: Heatmap Estimation
This step can use existing methods such as OpenPose, HRNet, and AlphaPose.
Step 2: 3D Person Detection
HE: Heatmap Estimation
3PN: 3D Proposal Estimation
Step 2: 3D Person Detection
3D feature volume (whole space), X x Y x Z x K → 3PN Network → Proposals, X x Y x Z x 1
X x Y x Z: number of voxels
A scalar for each voxel: the likelihood that a person is centered at the voxel
We keep the K largest voxels (proposals) after NMS
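The selection step (NMS followed by keeping the largest responses) can be sketched as below. This is an illustrative pure-Python version under assumptions: suppression uses a 3 x 3 x 3 neighbourhood, and the function name `nms_top_k`, the `max_people` parameter (used instead of K to avoid confusion with the channel dimension), and the `min_score` cut-off are all hypothetical.

```python
from itertools import product


def nms_top_k(prob, max_people, min_score=0.0):
    """Return up to max_people local maxima of prob[x][y][z], best first.

    A voxel survives NMS when none of its (up to 26) neighbours has a
    strictly larger score.
    """
    X, Y, Z = len(prob), len(prob[0]), len(prob[0][0])
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    peaks = []
    for x, y, z in product(range(X), range(Y), range(Z)):
        score = prob[x][y][z]
        if score <= min_score:
            continue
        suppressed = False
        for dx, dy, dz in offsets:
            nx, ny, nz = x + dx, y + dy, z + dz
            if 0 <= nx < X and 0 <= ny < Y and 0 <= nz < Z:
                if prob[nx][ny][nz] > score:
                    suppressed = True
                    break
        if not suppressed:
            peaks.append((score, (x, y, z)))
    peaks.sort(reverse=True)
    return peaks[:max_people]


# Toy 4 x 4 x 4 likelihood volume with two people and some clutter.
volume = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
volume[1][1][1] = 0.9  # person A
volume[1][1][2] = 0.5  # neighbour of A, removed by NMS
volume[3][3][3] = 0.7  # person B
volume[0][3][0] = 0.2  # weak isolated response

proposals = nms_top_k(volume, max_people=2)
```

Each returned tuple holds a score and a voxel index; converting the index to millimetres only needs the grid origin and the voxel size.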
Step 3: 3D Joint Detection
HE: Heatmap Estimation
3PN: 3D Proposal Estimation
Step 3: 3D Joint Detection
3D feature volume (proposal), X x Y x Z x K → PEN Network → 3D Pose Heatmap, X x Y x Z x K → Compute Expectation
X x Y x Z: number of voxels
3D Pose Heatmap: per-voxel likelihood for joints in the 3D space
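The "Compute Expectation" step can be read as a likelihood-weighted average of voxel centres for one joint. The sketch below assumes a non-negative heatmap that is normalised by its sum, and that `origin_mm` is the centre of voxel (0, 0, 0); the function name and parameters are hypothetical.

```python
def expected_position(heatmap, origin_mm, voxel_mm):
    """Likelihood-weighted mean of voxel centres for one joint, in millimetres.

    heatmap[x][y][z] holds non-negative likelihoods for a single joint.
    """
    total = 0.0
    weighted = [0.0, 0.0, 0.0]
    for x, plane in enumerate(heatmap):
        for y, row in enumerate(plane):
            for z, w in enumerate(row):
                total += w
                weighted[0] += w * x
                weighted[1] += w * y
                weighted[2] += w * z
    if total == 0.0:
        raise ValueError("heatmap has no mass")
    return tuple(o + voxel_mm * s / total for o, s in zip(origin_mm, weighted))


# Toy 3 x 3 x 3 heatmap whose mass is split between two adjacent voxels.
heat = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
heat[1][1][1] = 0.5
heat[2][1][1] = 0.5

joint_mm = expected_position(heat, origin_mm=(0.0, 0.0, 0.0), voxel_mm=30.0)
print(joint_mm)
```

Because the result is an average, it can fall between voxel centres, so the estimate is not limited to the 30mm grid spacing.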
Joint Training
HE: Heatmap Estimation
3PN: 3D Proposal Estimation
Proposal Quality
We project the 3D proposals to 2D for visualization.
Colored boxes mark proposals whose estimated confidence is larger than 0.1.
Proposal Recall Rate
With a voxel size of 300mm and a distance threshold of 140mm, we get about 95% recall.
This is sufficient for 3D pose estimation.
Using a smaller voxel improves the precision.
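One plausible way to compute such a recall figure is sketched below. The slides do not spell out the matching rule, so this assumes a ground-truth person counts as recalled when at least one proposal centre lies within the distance threshold; `proposal_recall` and the sample coordinates are hypothetical.

```python
import math


def proposal_recall(gt_centers, proposals, threshold_mm):
    """Fraction of ground-truth centres with a proposal within threshold_mm."""
    if not gt_centers:
        return 0.0
    hits = sum(
        1
        for gt in gt_centers
        if any(math.dist(gt, p) <= threshold_mm for p in proposals)
    )
    return hits / len(gt_centers)


# Two people; the second proposal is 200mm away from its person.
gt = [(0.0, 0.0, 0.0), (1000.0, 0.0, 0.0)]
found = [(100.0, 0.0, 0.0), (1200.0, 0.0, 0.0)]

recall_140 = proposal_recall(gt, found, threshold_mm=140.0)
recall_250 = proposal_recall(gt, found, threshold_mm=250.0)
```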
Impact of Camera Number

Camera Number   AP25 ↑   AP50 ↑   AP100 ↑   AP150 ↑   MPJPE ↓
5               83.59    98.33    99.76     99.91     17.68mm
3               58.94    93.88    98.45     99.32     24.29mm
1               0.860    23.47    80.69     93.32     66.95mm
5*              50.91    95.25    99.36     99.56     25.51mm

* means training/testing on different cameras.

The error increases mildly when we decrease the number from 5 to 3.
The error increases notably when using only one camera.
It generalizes to different camera configurations.
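For reading the MPJPE column, the metric is the mean Euclidean distance between predicted and ground-truth joints. The sketch below shows only that definition; the function name and sample joints are hypothetical, and it does not reproduce the AP computation used for the table.

```python
import math


def mpjpe(pred_joints, gt_joints):
    """Mean per-joint position error, in the units of the inputs (here mm)."""
    if len(pred_joints) != len(gt_joints) or not gt_joints:
        raise ValueError("need two non-empty joint lists of equal length")
    distances = [math.dist(p, g) for p, g in zip(pred_joints, gt_joints)]
    return sum(distances) / len(distances)


# Two joints: one off by 5mm (a 3-4-5 triangle), one exact.
pred = [(0.0, 0.0, 0.0), (10.0, 10.0, 10.0)]
truth = [(3.0, 4.0, 0.0), (10.0, 10.0, 10.0)]

error_mm = mpjpe(pred, truth)
```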
Demo