  1. Xiao CHU (初晓). Supervisor: Xiaogang Wang • The Chinese University of Hong Kong • 4th-year Ph.D. student • Computer vision, human pose estimation • 1. Structured Feature Learning for Pose Estimation, Xiao Chu, Wanli Ouyang, Hongsheng Li, and Xiaogang Wang, CVPR 2016 • 2. CRF-CNN: Modeling Structured Information in Human Pose Estimation, Xiao Chu, Wanli Ouyang, Hongsheng Li, and Xiaogang Wang, NIPS 2016

  2. Structured Feature Learning for Human Pose Estimation Xiao Chu, Wanli Ouyang, Hongsheng Li, and Xiaogang Wang

  3. Human pose estimation is the task of estimating the location of each body joint.

  4. From structured prediction to structured features: Structured Prediction → CNN + Structured Prediction → Structured Feature (related work: Tompson et al., NIPS'2014; Chen&Yuille, NIPS'2014; Fan et al., CVPR'2015; Yang et al., CVPR'2016). Our approach: 1. Build up structure at the feature level. 2. Pass messages with a geometrical transfer kernel. 3. Bidirectional tree.

  5. Fully convolutional net for human pose estimation. Backbone: VGG conv1~6 \ {pool4, pool5}, followed by a fully convolutional layer fconv7 with 1 × 1 kernels that outputs one heatmap per joint (head, neck, ..., wrist). Input image: 448 × 448; prediction heatmaps: 56 × 56, with a high response (e.g. 0.9) at the joint and low responses (e.g. 0.02) elsewhere.
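A quick sanity check on the 448 × 448 → 56 × 56 resolution arithmetic: with pool4 and pool5 removed, only three 2× poolings remain in the VGG trunk, so 448 / 2³ = 56. A minimal sketch (the exact pooling configuration is an assumption, not stated explicitly on the slide):

```python
# Sketch: output heatmap resolution of the truncated VGG trunk.
# Assumption (not on the slide): conv layers preserve spatial size
# and only pool1-3 survive, each downsampling by a factor of 2.
def heatmap_size(input_size: int, num_pools: int = 3, stride: int = 2) -> int:
    """Spatial size after `num_pools` pooling layers of the given stride."""
    for _ in range(num_pools):
        input_size //= stride
    return input_size

print(heatmap_size(448))  # 448 -> 224 -> 112 -> 56
```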

  6. Fully convolutional net for human pose estimation: VGG conv1~6 \ {pool4, pool5}. [Figure: feature channels e1–e7 for one joint and h1–h7 for a neighboring joint; some channel pairs (e.g. e5 and h2) fire consistently, while others (e.g. e4 and h6) are exclusive.]

  7. Pipeline: VGG conv1~6 \ {pool4, pool5} → structured features → structured prediction.

  8. Structured feature learning: features from VGG conv1~6 \ {pool4, pool5} are passed along a bidirectional tree over the joints, with nodes A1–A10 in the positive direction and B1–B10 in the reverted direction.

  9. Message passing with a geometrical transfer kernel: the feature maps for the elbow are updated by element-wise addition (⊕) of the feature maps for the lower arm after they have been shifted toward the elbow, i.e. convolved (⊗) with a learned kernel.
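The update above can be sketched in NumPy. This is a toy illustration, not the paper's implementation: the feature maps, joint locations, and the "learned" kernel are all invented, with the kernel hard-coded to shift responses one pixel toward a hypothetical elbow location.

```python
import numpy as np

def shift_by_kernel(feat: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Cross-correlate a 2-D feature map with a kernel (zero pad, 'same' size)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(feat, ((ph, ph), (pw, pw)))
    out = np.zeros_like(feat)
    for i in range(feat.shape[0]):
        for j in range(feat.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# Invented example: a lower-arm response at (4, 4); the transfer kernel
# moves it one row up, toward a hypothetical elbow location at (3, 4).
lower_arm = np.zeros((8, 8)); lower_arm[4, 4] = 1.0
kernel = np.zeros((3, 3)); kernel[2, 1] = 1.0   # "shift up" under cross-correlation
elbow = np.zeros((8, 8)); elbow[3, 4] = 0.5
updated_elbow = elbow + shift_by_kernel(lower_arm, kernel)
print(updated_elbow[3, 4])  # 1.5: lower-arm evidence accumulated at the elbow
```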

  10. Experimental results on the FLIC dataset. [Figure: Percentage of Correct Parts (strict PCP) on U.ARMS, L.ARMS and MEAN for MODEC [1], Tompson et al. [2], Tompson et al. [3], Chen&Yuille [4], and Ours, together with Percentage of Detected Joints (PDJ) curves.]

  11. Experimental results on the LSP dataset. [Figure: strict PCP on TORSO, HEAD, U.ARMS, L.ARMS, U.LEGS, L.LEGS and MEAN for Andriluka et al. [5], Yang&Ramanan [6], Pishchulin et al. [7], Eichner&Ferrari [8], Ouyang et al. [9], Pishchulin et al. [10], Chen&Yuille [4], and Ours (a 6.1% improvement).]

  12. Pose estimation results on the FLIC dataset: robust to disturbance and to occlusion.

  13. Pose estimation results on the LSP dataset Correct reasoning on extreme poses.

  14. CRF-CNN: Modeling Structured Information in Human Pose Estimation Xiao Chu, Wanli Ouyang, Hongsheng Li, and Xiaogang Wang

  15. How to model structure with CNNs? 1. Make the CNN deeper with an advanced structured design. 2. Build up structure at the feature level or at the prediction level, e.g. FCN + CRF-RNN, where a conditional random field is solved with mean-field approximation, or a tree-structured graphical model over parts (lower arm, upper arm, elbow, wrist). We need a graphical model at the feature level to guide the design of structured features.

  16. [Figure: (a) multi-layer neural network, (b) structured output space, (c) structured hidden layer, (d) our implementation. z denotes output variables, h hidden variables, I the image; ε_z, ε_h and ε_zh are the edge sets among outputs, among hidden variables, and between the two.]
Model (a): E(z, h, I, Θ) = Σ_{(i,l)∈ε_zh} ψ_zh(z_i, h_l) + Σ_l φ_h(h_l, I)
Model (b): E(z, h, I, Θ) = Σ_{(i,k)∈ε_z, i<k} ψ_z(z_i, z_k) + Σ_{(i,l)∈ε_zh} ψ_zh(z_i, h_l) + Σ_l φ_h(h_l, I)
Model (c): E(z, h, I, Θ) = Σ_{(i,k)∈ε_z, i<k} ψ_z(z_i, z_k) + Σ_{(l,m)∈ε_h, l<m} ψ_h(h_l, h_m) + Σ_{(i,l)∈ε_zh} ψ_zh(z_i, h_l) + Σ_l φ_h(h_l, I)
Model (d): E(z, h, I, Θ) = Σ_{(l,m)∈ε_h, l<m} ψ_h(h_l, h_m) + Σ_{(i,k)∈ε_z, i<k} ψ_z(z_i, z_k) + Σ_i ψ_zh(z_i, h_i) + Σ_l φ_h(h_l, I)

  17. Target: q(h | I, Θ) = Π_j Q(h_j | I, Θ). Mean-field approximation:
Q(h_j | I, Θ) = (1 / Z_{h,j}) exp{ − Σ_{h_l ∈ h_j} φ_h(h_l, I) − Σ_{(j,k)∈ε_h, j<k} ψ_h(h_j, Q(h_k | I, Θ)) }

  18. Flooding update: every hidden variable is updated simultaneously from all of its neighbors,
Q^{t+1}(h_j) = f( φ(h_j) + Σ_{j'∈N(j)} Q^t(h_{j'}) ⊗ w_{j'→j} ),
where f is a normalizing nonlinearity. [Figure: tree over nodes h1–h5; in a single flooding step each node receives messages from all of its neighbors at once.]
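A toy flooding-style mean-field update, sketched in NumPy. The chain h1–h2–h3, the unary potentials, and the shared pairwise matrix are illustrative assumptions, not values from the paper; the point is that all nodes are updated in parallel from their neighbors' current distributions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy model: chain h1 - h2 - h3, two states per variable.
phi = np.array([[2.0, 0.0], [0.0, 0.0], [0.0, 2.0]])  # unaries for h1..h3
W = np.array([[1.0, 0.0], [0.0, 1.0]])                # pairwise: favors agreement
edges = [(0, 1), (1, 2)]

Q = np.full((3, 2), 0.5)                              # uniform initialization
for _ in range(10):
    msg = np.zeros_like(phi)                          # flooding: collect all
    for a, b in edges:                                # messages, then update
        msg[b] += Q[a] @ W                            # every node in parallel
        msg[a] += Q[b] @ W
    Q = np.array([softmax(phi[j] + msg[j]) for j in range(3)])

print(Q.argmax(axis=1))  # h1 settles on state 0, h3 on state 1
```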

  19. Serial update: h1 → h2. [Figure: tree with nodes h1–h4.]

  20. Serial update: h1 → h2, h4 → h2. [Figure: tree with nodes h1–h4.]

  21. Serial update: h1 → h2, h4 → h2, h2′ → h3. h3 is marginalized.

  22. Serial update: h1 → h2, h4 → h2, h2′ → h3 (h3 is marginalized), then h3 → h2′ (h2 is marginalized).

  23. Serial update: h1 → h2, h4 → h2, h2′ → h3 (h3 is marginalized), h3 → h2′ (h2 is marginalized), then h2″\h1 → h1 and h2″\h4 → h4 (h1 and h4 are marginalized).
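The serial schedule above marginalizes one node at a time by passing messages sequentially through the tree. A minimal sum-product sketch on an invented star tree (hub h2, leaves h1, h3, h4; all potentials are illustrative assumptions, not the paper's):

```python
import numpy as np

# Star tree: h2 is the hub; h1, h3, h4 are leaves; 2 states each.
phi = {1: np.array([0.8, 0.2]), 2: np.array([0.5, 0.5]),
       3: np.array([0.5, 0.5]), 4: np.array([0.7, 0.3])}
W = np.array([[0.9, 0.1], [0.1, 0.9]])   # shared pairwise potential

# Upward (serial) pass: each leaf sends its message into the hub.
m_to_2 = {leaf: W.T @ phi[leaf] for leaf in (1, 3, 4)}
belief_2 = phi[2] * m_to_2[1] * m_to_2[3] * m_to_2[4]
marginal_2 = belief_2 / belief_2.sum()

# Downward pass: the hub replies to each leaf with all OTHER evidence;
# the message to h3 excludes h3's own upward message.
m_2_to_3 = W @ (phi[2] * m_to_2[1] * m_to_2[4])
belief_3 = phi[3] * m_2_to_3
marginal_3 = belief_3 / belief_3.sum()   # exact on a tree: no loops

print(marginal_2, marginal_3)  # both lean toward state 0 (h1, h4 evidence)
```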

  24. CVPR'16 vs. NIPS'16. [Figure: the CVPR'16 model passes messages along one path per direction through the joint tree: nodes A1–A10 in the positive direction and B1–B10 in the reverted direction.]

  25. Results on LSP (PCP). [Figure: PCP on TORSO, HEAD, U.ARMS, L.ARMS, U.LEGS, L.LEGS and MEAN for Chen&Yuille NIPS'2014, Yang et al. CVPR'2016, Chu et al. CVPR'2016, and Ours.]

  26. Component analysis (PCP). [Figure: PCP on TORSO, HEAD, U.ARMS, L.ARMS, U.LEGS, L.LEGS and MEAN for Flooding-2itrs-tree, Flooding-2itrs-loopy, Serial-tree(ReLU), and Serial-tree(Softmax).]

  27. [Figure: qualitative results comparing (a) Flooding-2itr-tree, (b) Flooding-2itr-loopy, and (c) the final model.]

  28. Thank you!

  29. Conditional Random Field:
p(z | I, Θ) = Σ_h p(z, h | I, Θ),
where
p(z, h | I, Θ) = e^{−E(z, h, I, Θ)} / Σ_{z∈Z, h∈H} e^{−E(z, h, I, Θ)}
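The marginalization over h can be checked by brute force on a toy model. Here the energy table is an invented example with one output variable z and one hidden variable h, each with two states:

```python
import numpy as np

# Toy CRF: one output variable z, one hidden variable h, 2 states each.
# The energy table E[z, h] is an illustrative assumption.
E = np.array([[0.0, 1.0],
              [2.0, 0.5]])

joint = np.exp(-E)
joint /= joint.sum()          # p(z, h | I) = e^{-E} / Z
p_z = joint.sum(axis=1)       # p(z | I) = sum over h of p(z, h | I)

assert np.isclose(p_z.sum(), 1.0)
print(p_z)                    # z = 0 is the more probable state here
```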
