Learning Transferable Distance Functions For Human Action Recognition and Detection

Learning Transferable Distance Functions For Human Action Recognition and Detection Weilong Yang Simon Fraser University 1 Action Recognition and Detection Walking Running Jogging Boxing Waving T X Y 2 Applications Action related


  1. Learning Transferable Distance Functions For Human Action Recognition and Detection. Weilong Yang, Simon Fraser University

  2. Action Recognition and Detection • Example actions: Walking, Running, Jogging, Boxing, Waving • [Figure: video volume with axes X, Y, T]

  3. Applications • Action-related video search • Sports and dancing video search • Event detection • Automatic abnormality detection in surveillance videos

  4. Motivation • On the KTH and Weizmann action datasets, almost 100% accuracy has been achieved [Jhuang et al. ICCV07, Fathi & Mori CVPR08]. • Most methods rely on a large training set. • Half-half split or leave-one-out cross-validation • It is unrealistic to collect this many training samples for some actions. • [Figure: many clips vs. one clip]

  5. [Figure: a query action compared against a template set (Templates A, B, C, D); labels shown: Throwing, R-dancing, M-dancing, Kicking]

  6. Related Work • One-shot learning of object categories [Fei-Fei et al. ICCV03] • Visual object identification [Ferencz et al. IJCV07] • Transfer learning: the ability of a system to recognize and apply knowledge and skills learned in previous tasks to novel tasks [Pan & Yang, TKDE 2009]

  7. [Overview diagram: Query; Templates A, B, C; Hyper-Features Fq; Learning Distance; Action Recognition; Action Detection]

  8. Patch-based Action Comparison • Query vs. Template • Motion descriptor [Efros et al. ICCV 03] • Frame-to-frame distance

  9. Patch-based Action Comparison • Query vs. Template • Elementary patch-to-patch distance • Frame correspondence • Frame-to-frame distance
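
The comparison on slides 8 and 9 can be sketched in a few lines. This is only an illustration: the transcript does not preserve the matching rule, so the sketch assumes a squared Euclidean elementary distance, matches each query patch to its closest template patch, and matches each query frame to its closest template frame. All function names are hypothetical.

```python
def patch_distance(p, q):
    """Elementary patch-to-patch distance (squared Euclidean, an assumption)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def patch_distances(query_frame, template_frame):
    """For each query patch, the distance to its closest template patch."""
    return [min(patch_distance(p, q) for q in template_frame)
            for p in query_frame]


def frame_distance(query_frame, template_frame):
    """Frame-to-frame distance: sum of the per-patch elementary distances."""
    return sum(patch_distances(query_frame, template_frame))


def clip_distance(query_clip, template_clip):
    """Frame correspondence: match each query frame to its closest template
    frame, then sum the matched frame-to-frame distances."""
    return sum(min(frame_distance(f, g) for g in template_clip)
               for f in query_clip)


# Toy data: a frame is a list of patch descriptors.
query_frame = [[0, 0], [1, 1]]
template_frame = [[0, 1], [1, 1]]
print(patch_distances(query_frame, template_frame))
print(frame_distance(query_frame, template_frame))
```

Summing unweighted patch distances corresponds to the "direct comparison" baseline; the learned distance function reweights the per-patch terms.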

  10. [Overview diagram: Query; Templates A, B, C; Hyper-Features Fq; Learning Distance; Action Recognition; Action Detection]

  11. Local Distance Function [Frome et al. NIPS06]

  12. Local Distance Function [Frome et al. NIPS06] • Triplet • Large training set required

  13. Transferable Distance Function

  14. Transferable Distance Function

  15. Transferable Distance Function • Hyper-Feature • Transferable

  16. Max-Margin Formulation • Triplet • It is convex and similar to the primal problem of an SVM
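
The equations on this slide did not survive in the transcript, so the following is only a sketch of what a triplet-based, SVM-like objective could look like. It assumes each patch weight is the inner product of a shared parameter vector `omega` with that patch's hyper-feature, and that a triplet asks the distance to a different-action clip to exceed the distance to a same-action clip by a margin. The names, the margin of 1, and the constant `c` are illustrative assumptions, and constraints such as non-negative weights are left out.

```python
def weighted_distance(omega, hyper_features, elementary):
    """D = sum_i <omega, f_i> * d_i, with f_i the hyper-feature of patch i."""
    total = 0.0
    for f, d in zip(hyper_features, elementary):
        weight = sum(o * x for o, x in zip(omega, f))
        total += weight * d
    return total


def triplet_hinge(omega, hyper_features, d_same, d_diff, margin=1.0):
    """Slack needed by one triplet: the different-action distance should
    exceed the same-action distance by at least `margin`."""
    gap = (weighted_distance(omega, hyper_features, d_diff)
           - weighted_distance(omega, hyper_features, d_same))
    return max(0.0, margin - gap)


def objective(omega, triplets, c=1.0):
    """SVM-style primal value: 0.5 * ||omega||^2 + c * sum of triplet slacks."""
    reg = 0.5 * sum(o * o for o in omega)
    slack = sum(triplet_hinge(omega, f, ds, dd) for f, ds, dd in triplets)
    return reg + c * slack


# One toy triplet: two patches with one-hot hyper-features.
hf = [[1, 0], [0, 1]]
print(objective([1.0, 0.0], [(hf, [0.5, 3.0], [1.0, 0.0])]))
```

The sketch only evaluates the objective and does not minimize it. Because each hinge term is the maximum of zero and a function linear in `omega`, and the regularizer is quadratic, the sum is convex, which matches the slide's remark.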

  17. Hyper-Features • Codebook representation • Descriptor for each patch ○ HOG + Positions • Obtaining codebook with the size of ○ K-means clustering • Hyper-feature for each patch ○ A dimensional vector
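
The codebook size and the hyper-feature dimension are missing from the transcript. As a hedged illustration, the sketch below assumes a hard assignment: each patch descriptor is mapped to its nearest codeword, and the hyper-feature is a one-hot vector whose length equals the codebook size. The codebook here is hand-written toy data; in practice it would come from k-means over patch descriptors.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(descriptor, c))
             for c in codebook]
    return dists.index(min(dists))


def hyper_feature(descriptor, codebook):
    """One-hot vector over codewords (hard assignment is an assumption)."""
    one_hot = [0.0] * len(codebook)
    one_hot[nearest_codeword(descriptor, codebook)] = 1.0
    return one_hot


# Toy codebook with three codewords in a 2-D descriptor space.
codebook = [[0, 0], [10, 10], [0, 10]]
print(hyper_feature([1, 9], codebook))
```

With a one-hot hyper-feature, the inner product with a learned parameter vector simply selects one weight per codeword, which is why weights learned on one set of actions can be reused on patches from another.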

  18. Summary of Features • Patch Matching: Motion Cue • Patch Weighting: Hyper-Feature (Shape Cue & Positions)

  19. [Overview diagram: Query; Templates A, B, C; Hyper-Features Fq; Learning Distance; Action Recognition; Action Detection]

  20. Recognizing an Action • [Figure: Query compared with Templates A, B, C using Hyper-Features Fq]
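
Recognition from one template per action can be read as a nearest-template rule. The sketch below assumes that per-patch weights and elementary distances have already been computed for each template, and it labels the query with the template at the smallest weighted distance. The data layout and function names are hypothetical.

```python
def template_distance(weights, elementary):
    """Weighted sum of elementary patch distances for one template."""
    return sum(w * d for w, d in zip(weights, elementary))


def recognize(templates):
    """templates maps label -> (patch_weights, elementary_distances_to_query).
    Returns the label with the smallest weighted distance and all scores."""
    scores = {label: template_distance(w, d)
              for label, (w, d) in templates.items()}
    return min(scores, key=scores.get), scores


# Toy example with three templates and two patches each.
templates = {
    "A": ([1.0, 0.0], [2.0, 5.0]),
    "B": ([0.5, 0.5], [1.0, 1.0]),
    "C": ([1.0, 1.0], [3.0, 0.0]),
}
label, scores = recognize(templates)
print(label, scores)
```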

  21. [Overview diagram: Query; Templates A, B, C; Hyper-Features Fq; Learning Distance; Action Recognition; Action Detection]

  22. Experiments on Action Recognition • Train the transferable distance function on Weizmann, and test on KTH. • [Figure: source training set (Weizmann) transferred to testing set (KTH); action labels shown: skip, walk, side, clap, jack, run, bend, wave2, jump, jog, wave1, pjump] • The source training set does not contain the actions of the template set • Each action in the testing set has only one clip as its template

  23. Visualization • [Figure panels: Learnt Weights on Codeword; Testing Actions; Ranking]

  24. Five Rounds of Experiments • For each round, we randomly select one actor, then choose one clip per action from this actor as the template. • 5% improvement • Dc: Direct Comparison (W = 1) • Tr: Transferable Distance Function

  25. Confusion Matrices for Round 2 • Direct Comparison: Avg 70.9% • Transfer: Avg 76.7% • Clapping vs. Waving; Jogging vs. Running

  26. Efficiency • With the learnt distance function, we can sort the patches on each frame by their saliency. • Instead of using all patches, we can choose the top N patches with high weights for matching. • [Figure: 10 patches on each frame]
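
The pruning idea on this slide can be sketched as follows, assuming a list of learned per-patch weights for one frame and the matching elementary distances. The function names are hypothetical.

```python
def top_n_patches(weights, n):
    """Indices of the n highest-weight patches, most salient first."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:n]


def pruned_distance(weights, elementary, n):
    """Weighted distance computed from the top-n patches only."""
    keep = top_n_patches(weights, n)
    return sum(weights[i] * elementary[i] for i in keep)


# Toy frame with four patches; keep the two most salient ones.
weights = [0.25, 1.0, 0.5, 0.0]
elementary = [4.0, 1.0, 2.0, 8.0]
print(top_n_patches(weights, 2))
print(pruned_distance(weights, elementary, 2))
```

Only the selected patches need to be matched against the template, so the matching cost per frame scales with N rather than with the total number of patches.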

  27. [Overview diagram: Query; Templates A, B, C; Hyper-Features Fq; Learning Distance; Action Recognition; Action Detection]

  28. Human Action Detection

  29. Cascade Structure • [Figure: Cascade Stage 1, Cascade Stage 2, up to Cascade Stage N, with a Reject branch at each stage]

  30. Cascade Structure • [Figure: All Sub-Windows enter the cascade; stages use Hyper-Features Fq; each stage can Reject; surviving windows reach the Decision]
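
A rejection cascade of this kind can be sketched generically. The transcript does not say what each stage computes or how thresholds are chosen, so the stage functions and thresholds below are toy placeholders: each stage scores a sub-window with some distance function and rejects it when the distance exceeds that stage's threshold.

```python
def cascade_detect(windows, stages):
    """Pass sub-windows through the stages in order.

    `stages` is a list of (distance_fn, threshold) pairs, ideally with the
    cheapest stages first. A window is rejected as soon as one stage's
    distance exceeds its threshold; the survivors are returned."""
    survivors = list(windows)
    for distance_fn, threshold in stages:
        survivors = [w for w in survivors if distance_fn(w) <= threshold]
        if not survivors:
            break  # nothing left to test
    return survivors


# Toy example: windows are plain numbers and the "distance" is |w - 10|.
stages = [
    (lambda w: abs(w - 10), 6),  # loose first stage
    (lambda w: abs(w - 10), 2),  # stricter second stage
]
print(cascade_detect([1, 5, 9, 12], stages))
```

Most sub-windows are discarded by the early, cheap stages, so the more expensive comparisons run on only a small fraction of candidates.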

  31. Efficient Action Detection

  32. Contributions • Transferable distance function learning • Hyper-features based on appearance and positions • Max-margin learning framework • Action recognition from one clip • Template matching based on motion • Efficient action detection from one clip • Cascade structure

  33. Thank You!
