Fields of Parts & Friends (peter.gehler.net)


  1. Fields of Parts & Friends peter.gehler.net

  2. Detection + Geometry

  4. Human Pose Estimation: from an observation, predict bounding boxes or joint locations

  5. Human Pose Estimation • Observation: image $X$ with factors $F^{(1)}_{\text{top}}, F^{(2)}_{\text{top,head}}, \dots$ • Desired output: part variables $Y_{\text{top}}, Y_{\text{head}}, Y_{\text{torso}}, Y_{\text{rarm}}, Y_{\text{larm}}, Y_{\text{rhnd}}, Y_{\text{lhnd}}, Y_{\text{rleg}}, Y_{\text{lleg}}, Y_{\text{rfoot}}, Y_{\text{lfoot}}$ • Model: $p(y \mid I, w) \propto \sum_{p} \psi(y_p, I; w) + \sum_{p \sim p'} \psi(y_p, y_{p'}; w)$ P. Felzenszwalb, D. Huttenlocher, Pictorial Structures for Object Recognition, International Journal of Computer Vision (IJCV), 2005

  6. Pictorial Structures [Figure: tree over the part variables $Y_{\text{top}}, Y_{\text{head}}, Y_{\text{torso}}, \dots, Y_{\text{lfoot}}$; pairwise terms parametrized by rotation $\theta$ and displacement $(\Delta x, \Delta y)$] $p(y \mid I, w) \propto \sum_{p} \psi(y_p; I, w) + \sum_{p \sim p'} \psi(y_p, y_{p'}; I, w)$ [Johnson&Everingham, BMVC’10], [Yang&Ramanan, CVPR’11], [Eichner&Ferrari, ACCV’12], [Sapp et al., ECCV’10], [Tran&Forsyth, ECCV’10], [Wang et al., CVPR’11], [Agarwal&Triggs, PAMI’02], [Urtasun&Darrell, ICCV’09], [Ionescu et al., ICCV’11]
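
As a rough illustration of the model on this slide, the following sketch scores a pose as the sum of unary and pairwise potentials over the tree edges and finds the best pose by exhaustive search. It reads the sum of potentials as an unnormalized log-score, which is an assumption (the slide only states proportionality). The function names, the table-based potentials and the brute-force search are hypothetical choices for a toy example; real pictorial structures models use dynamic programming on the tree.

```python
import itertools

import numpy as np


def pose_score(y, unary, pairwise, edges):
    """Sum of unary and pairwise potentials for one pose y (one state index per part).

    unary[p] is a 1-D table over the states of part p (hypothetical layout).
    pairwise[(p, q)] is a 2-D table over the states of parts p and q.
    edges lists the connected part pairs (p ~ p') of the tree.
    """
    total = sum(unary[p][y[p]] for p in range(len(unary)))
    total += sum(pairwise[(p, q)][y[p], y[q]] for p, q in edges)
    return float(total)


def brute_force_map(unary, pairwise, edges):
    """Return the highest-scoring pose by enumerating every joint state.

    Only feasible for toy problems: the number of poses grows exponentially
    with the number of parts.
    """
    state_ranges = [range(len(u)) for u in unary]
    return max(
        itertools.product(*state_ranges),
        key=lambda y: pose_score(y, unary, pairwise, edges),
    )
```

With three parts connected as a chain, `brute_force_map` picks the pose where the pairwise terms outweigh a slightly better unary score of a single part, which is the behaviour the spatial terms are meant to provide.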

  7. Extensions • Since their introduction, many extensions have been proposed: • loopy … • mixture … • holistic approaches … [Johnson&Everingham, BMVC’10], [Yang&Ramanan, CVPR’11], [Eichner&Ferrari, ACCV’12], [Sapp et al., ECCV’10], [Tran&Forsyth, ECCV’10], [Wang et al., CVPR’11], [Agarwal&Triggs, PAMI’02], [Urtasun&Darrell, ICCV’09], [Ionescu et al., ICCV’11], …

  8. Poselet Conditioned Pictorial Structures [Figure: model overview with stages I to IV; labels: poselets conditioning, kinematic tree pairwise factors, unary factors (appearance, position/rotation), extra unary factors, result] L. Pishchulin, M. Andriluka, P. Gehler, B. Schiele, Poselet Conditioned Pictorial Structures, CVPR 2013

  9. Poselets • “Clusters” of several parts • Capture non-adjacent part dependencies [Figure: top detections; poselet cluster medoids] L. Bourdev, J. Malik, Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations, ICCV 2009

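  10. Conditioning Pairwise Terms [Figure: possible pairwise factors, parametrized by rotation $\theta$ and displacement $(\Delta x, \Delta y)$; possible body models] Factors for the head and torso pair: unaries $\psi(y_{\text{torso}}, x)$ and $\psi(y_{\text{head}}, x)$, pairwise $\psi(y_{\text{head}}, y_{\text{torso}})$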

  11. Results [Figure columns: top poselet detections; cluster medoids; Poselet Conditioned (prediction, result); Baseline PS (generic tree, result)]

  12. Results on Leeds Sports Poses • 1000 training, 1000 testing images • observer-centric annotation [Eichner&Ferrari, ACCV12] • Error measure: PCP (percentage of correct parts) S. Johnson, M. Everingham, Clustered Pose and Nonlinear Appearance Models for Human Pose Estimation, BMVC 2010
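
The slide only names the PCP measure. As a minimal sketch of how it is commonly computed, the code below counts a part as correct when both of its predicted endpoints lie within a fraction of the ground-truth part length from the true endpoints. The endpoint rule, the 0.5 threshold and the array layout are assumptions taken from common practice, not from the slides, and published PCP numbers differ between variants of this rule.

```python
import numpy as np


def pcp(pred, gt, threshold=0.5):
    """Percentage of correct parts for one or more stacked poses.

    pred, gt: arrays of shape (num_parts, 2, 2), i.e. parts x endpoints x (x, y).
    A part counts as correct if BOTH endpoint errors are at most
    threshold * (ground-truth part length). The 0.5 default is an assumption.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    part_lengths = np.linalg.norm(gt[:, 0] - gt[:, 1], axis=1)  # (num_parts,)
    endpoint_errors = np.linalg.norm(pred - gt, axis=2)         # (num_parts, 2)
    correct = np.all(endpoint_errors <= threshold * part_lengths[:, None], axis=1)
    return 100.0 * correct.mean()
```

Because the tolerance scales with the part length, short parts such as forearms are judged more strictly in pixel terms than long parts such as the torso.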

  13. Results (PCP) • Baseline PS: 55.7 • pairwise: 60.9 • unary: 60.8 • pairwise + unary: 62.9 [Figure: model overview with poselets conditioning, as on slide 8]

  17. Results [Figure rows: Full model (MAP, part marginals); Plain Pictorial Structures (MAP, part marginals)]

  18. Only 62.9%? Why not 100%? What are we missing?

  19. Expressive Spatial Models… • Joint model for body parts and body joints • Mid-level representation L. Pishchulin, M. Andriluka, P. Gehler, B. Schiele, Strong Appearance and Expressive Spatial Models for Human Pose Estimation, ICCV 2013

  20. … and Strong Appearance • Mixtures of DPM for local appearance • Rotation dependent part detectors L. Pishchulin, M. Andriluka, P. Gehler, B. Schiele, Strong Appearance and Expressive Spatial Models for Human Pose Estimation, ICCV 2013

  21. Empirical Results
      Setting                                                  PCP [%]
      model so far                                             62.9
      Andriluka et al., CVPR 09                                55.7
      + flexible body model                                    56.9
      + local mixtures                                         65.2
      + Poselet conditioned unaries                            68.5
      + Poselet conditioned pairwise                           69.0
      Yang & Ramanan, CVPR 11                                  60.8
      Eichner & Ferrari, ACCV 12                               64.3
      Ramakrishna et al., ECCV 14 (Pose Inference Machines)    67.6
      Chen & Yuille, arXiv 14 (CNNs)                           76.6

  22. Still not perfect …? • All remaining failure cases are of these types: self-occlusion, rare poses, strong foreshortening

  23. Only detection! Explain this then! Same color!

  24. Challenging Pose Dataset • 400 activities • 40000 examples • multiple people • video [Figure panels: joint positions and occlusions; part occlusions; 3D torso and head orientation; activity labels] M. Andriluka, L. Pishchulin, P. Gehler, B. Schiele, Human Pose Estimation: A new Benchmark and State of the Art Analysis, CVPR 2014

  25. Fields of Parts: Parametrization • for every body part $p = 1, \dots, P$ … • … and every possible state $i = 1, \dots, |\mathcal{Y}_p|$ … • … a binary random variable $x_{p,i} \in \{0, 1\}$ Kiefel & Gehler, Human Pose Estimation with Fields of Parts, ECCV 2014

  26. Fields of Parts: Energy • Pairwise binary CRF (looooooopy) Kiefel & Gehler, Human Pose Estimation with Fields of Parts, ECCV 2014
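
The energy itself did not survive on this slide. For orientation only, a generic pairwise binary CRF has the form below; this is the textbook form and not necessarily the exact energy from the talk. Here $a$ and $b$ index the binary variables (one per part and state), and $\psi_a$, $\psi_{ab}$ are placeholder names for the unary and pairwise factors.

```latex
E(x \mid I, \theta) = \sum_{a} \psi_a(x_a; I, \theta)
                    + \sum_{a < b} \psi_{ab}(x_a, x_b; I, \theta),
\qquad x_a \in \{0, 1\},
\qquad p(x \mid I, \theta) \propto \exp\bigl(-E(x \mid I, \theta)\bigr)
```

The model is loopy because the second sum runs over many pairs of variables rather than only over the edges of a tree.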

  27. Fields of Parts: Factors • Unary Factors: your usual HOG filter • Pairwise Factors: your usual displacement factor (and more) [Figure: rotation $\theta$, displacement $(\Delta x, \Delta y)$]

  28. Comparison to PS • Number of (body) parts: $p = 1, \dots, P$ • Pictorial Structures: few parts, huge state space, $y_p \in \{1, \dots, M\} \times \{1, \dots, N\} = \mathcal{Y}_p$ • Fields of Parts: many parts, small state space, $x_{p,i} \in \{0, 1\}$, $i = 1, \dots, |\mathcal{Y}_p|$
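
To make the two parametrizations concrete, the sketch below converts a pictorial structures labelling (one grid position per part) into a field of binary variables and back. The function names and the `(P, M, N)` array layout are hypothetical, indices are 0-based, and the one-active-variable-per-part field produced here is only one particular configuration; the slides do not say whether the model enforces that constraint.

```python
import numpy as np


def to_field(y, M, N):
    """Encode a PS labelling y = [(m, n), ...] as binary variables x[p, m, n]."""
    x = np.zeros((len(y), M, N), dtype=np.uint8)
    for p, (m, n) in enumerate(y):
        x[p, m, n] = 1
    return x


def from_field(q):
    """Pick, for each part, the grid position with the largest value in q.

    q has shape (P, M, N) and may hold binary states or marginals.
    """
    P, M, N = q.shape
    flat_best = q.reshape(P, -1).argmax(axis=1)
    return [tuple(int(v) for v in np.unravel_index(i, (M, N))) for i in flat_best]
```

For $P$ parts on an $M \times N$ grid, the first representation has $P$ variables with $M \cdot N$ states each, while the second has $P \cdot M \cdot N$ variables with two states each.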

  29. Gain: Bilateral • Locally image conditioned pairwise factors (bilateral, segmentation) • Not possible with the distance transform used for pictorial structures

  30. More connections • Block-dense connections already • New connections scale linearly

  31. Inference • Intractable Inference • Mean Field Approximation • Update Equation — Bilateral Filtering Operation (linear complexity) Krähenbühl & Koltun, Efficient inference in fully connected CRFs with Gaussian edge potentials , NIPS 2011

  32. Fields of Parts: Inference [Figure: unaries (step 0) → $Q_5(x \mid I, \theta)$ → $Q_{10}(x \mid I, \theta)$] • Mean Field updates (here 10): $Q_0(x \mid I, \theta) \to Q_1(x \mid I, \theta) \to \dots \to Q_{10}(x \mid I, \theta)$ • Predict the maximum marginal state: $\hat{i}_p = \operatorname{argmax}_{i \in \mathcal{Y}_p} Q_{10}(x_{p,i} = 1 \mid I)$
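
The inference steps on this slide can be sketched for a small binary pairwise model as follows. This is an illustration under stated assumptions, not the implementation from the talk: it uses a dense coupling matrix and parallel updates, whereas the slides describe the update as a bilateral filtering operation with linear complexity. It assumes a distribution of the form $p(x) \propto \exp(\sum_a u_a x_a + \sum_{a<b} W_{ab} x_a x_b)$; the names `unary`, `coupling` and `part_sizes` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def mean_field(unary, coupling, num_steps=10):
    """Run parallel mean-field updates and return [Q_0, Q_1, ..., Q_num_steps].

    unary: shape (V,), log-odds of each binary variable being 1.
    coupling: shape (V, V), symmetric with zero diagonal (dense, for clarity).
    Each Q_t holds the approximate marginals Q_t(x_a = 1).
    """
    unary = np.asarray(unary, dtype=float)
    coupling = np.asarray(coupling, dtype=float)
    q = sigmoid(unary)  # step 0: marginals from the unaries alone
    history = [q]
    for _ in range(num_steps):
        q = sigmoid(unary + coupling @ q)  # all variables updated at once
        history.append(q)
    return history


def predict_states(q, part_sizes):
    """For each part, return the state index with the largest marginal."""
    predictions, start = [], 0
    for size in part_sizes:
        predictions.append(int(np.argmax(q[start:start + size])))
        start += size
    return predictions
```

Calling `predict_states(mean_field(unary, coupling)[-1], part_sizes)` corresponds to reading off the maximum marginal state per part after ten updates. Parallel updates are not guaranteed to converge in general, so a fixed number of steps is used here, as on the slide.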

  33. Fields of Parts: Objective [Figure: unaries (step 0) → $Q_5(x \mid I, \theta)$ → $Q_{10}(x \mid I, \theta)$] • Objective: Max-Margin Max-Marginal (structured SVM) • Backpropagation Mean Field: autodiff through bilateral filtering, $Q_0(x \mid I, \theta) \to Q_1(x \mid I, \theta) \to \dots \to Q_{10}(x \mid I, \theta)$ J. Domke, Learning Graphical Model Parameters with Approximate Marginal Inference, PAMI 2013; P. Krähenbühl & V. Koltun, Parameter Learning and Convergent Inference for Dense Random Fields, ICML 2013

  34. Neural Network Interpretation [Figure: unaries (step 0) → $Q_5(x \mid I, \theta)$ → $Q_{10}(x \mid I, \theta)$] • Non-linear convolutional filter defined by dense graphical model and mean field inference: $Q_{i+1}(x \mid I, \theta) = F(Q_i(x \mid I, \theta))$

  35. Results: APK (average precision of keypoints) • On equal ground: same features, same “pairwise” terms • Pairwise conditionals improve results

  36. Disclaimer: Not state-of-the-art • PCP error measure

  37. Conclusion & Future Work • Parts are important for better models/understanding, not necessarily for performance • Richer image interpretation: joint pose estimation & image segmentation • More output: 3D pose, clothing, body measurements, etc • Robustness and speed • Will see more models that put tractable inference first

  38. Reference List • Teaching Geometry to Deformable Part Models, CVPR12 • 3D2DPM: 3D Deformable Part Models, ECCV12 • Poselet Conditioned Pictorial Structures, CVPR13 • Strong Appearance and Expressive Spatial Models for Human Pose Estimation, ICCV13 • Human Pose Estimation: A new Benchmark and State of the Art Analysis, CVPR14 • Human Pose Estimation with Fields of Parts, ECCV14

  39. Bernt Schiele, Micha Andriluka, Leonid Pishchulin, Martin Kiefel. Thank You! Feedback Welcome!
