Learning Transferable Distance Functions For Human Action Recognition and Detection
Weilong Yang
Simon Fraser University
Action Recognition and Detection
[Figure: example actions (walking, running, jogging, boxing, waving); axes X, Y, T]
Applications
• Action-related video search
• Sports and dancing video search
• Event detection
• Automatic abnormality detection in surveillance videos
Motivation
• On the KTH and Weizmann action datasets, almost 100% accuracy has been achieved [Jhuang et al. ICCV07, Fathi & Mori CVPR08].
• Most methods rely on a large training set.
• Half-half split or leave-one-out cross-validation
• It is unrealistic to collect this many training samples for some actions.
[Figure: many clips vs. one clip]
[Figure: a query action compared against a template set (Templates A, B, C, D); labels shown: throwing, R-dancing, M-dancing, kicking]
Related Work
• One-shot learning of object categories [Fei-Fei et al. ICCV03]
• Visual object identification [Ferencz et al. IJCV07]
Transfer learning: the ability of a system to recognize and apply knowledge and skills learned in previous tasks to novel tasks [Pan & Yang, TKDE 2009].
[Diagram: system overview with query, hyper-features Fq, templates A, B, C, distance learning, action recognition, and action detection]
Patch-Based Action Comparison
[Figure: query and template frames; motion descriptor [Efros et al. ICCV03]; frame-to-frame distance]
[Figure: elementary patch-to-patch distance, frame-to-frame distance, and frame correspondence between query and template]
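A minimal sketch of the patch-based comparison, under assumptions that are not from the slides: each frame is a list of patch descriptors on a fixed grid, the elementary patch-to-patch distance is Euclidean, and each template frame is matched to its closest query frame. The function names are hypothetical.

```python
import math

def patch_distance(p, q):
    """Elementary patch-to-patch distance (Euclidean here, an assumption)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def frame_distance(query_frame, template_frame):
    """Frame-to-frame distance: sum of patch distances at the same grid position."""
    return sum(patch_distance(p, q) for p, q in zip(query_frame, template_frame))

def clip_distance(query_clip, template_clip):
    """Frame correspondence: match each template frame to its closest query frame."""
    return sum(
        min(frame_distance(q, t) for q in query_clip)
        for t in template_clip
    )

# Two frames, each with two 2-D patch descriptors.
frame_a = [[0.0, 0.0], [0.0, 0.0]]
frame_b = [[3.0, 4.0], [0.0, 0.0]]
print(frame_distance(frame_a, frame_b))
```

The motion descriptor of Efros et al. would replace the toy descriptors used here.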
Local Distance Function [Frome et al. NIPS06]
• Triplet
• Large training set required
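A hedged sketch of how a triplet constrains a local distance function: the distance is taken to be a weighted sum of elementary patch distances, and a triplet asks that the dissimilar clip lie farther from the reference than the similar clip by a margin. The names and the margin value are assumptions for illustration.

```python
def local_distance(weights, elementary):
    """Weighted sum of elementary patch distances."""
    return sum(w * d for w, d in zip(weights, elementary))

def triplet_hinge_loss(weights, d_similar, d_dissimilar, margin=1.0):
    """Zero when the dissimilar clip is at least `margin` farther than the similar one."""
    gap = local_distance(weights, d_dissimilar) - local_distance(weights, d_similar)
    return max(0.0, margin - gap)

weights = [1.0, 2.0]
d_similar = [0.5, 0.25]     # elementary distances to a clip of the same action
d_dissimilar = [2.0, 1.0]   # elementary distances to a clip of another action
print(triplet_hinge_loss(weights, d_similar, d_dissimilar))
```

Because each template carries its own weight vector in this setting, many triplets per template are needed, which is consistent with the slide's note that a large training set is required.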
Transferable Distance Function
[Diagram labels: hyper-feature, transferable]
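The equations on these slides did not survive extraction. One plausible reading, stated here as an assumption, is that each patch weight is a linear function of the patch's hyper-feature, so that a single parameter vector `P` (a hypothetical name) can be learnt on source actions and reused on unseen ones:

```python
def patch_weight(P, hyper_feature):
    """Assumed form: weight = inner product of P and the patch's hyper-feature."""
    return sum(p * f for p, f in zip(P, hyper_feature))

def transferable_distance(P, hyper_features, elementary):
    """Weighted sum of elementary distances, with weights predicted from hyper-features."""
    return sum(patch_weight(P, f) * d for f, d in zip(hyper_features, elementary))

P = [2.0, 0.0]                            # learnt on the source set (toy values)
hyper_features = [[1.0, 0.0], [0.0, 1.0]] # one hyper-feature per template patch
elementary = [3.0, 7.0]                   # patch-to-patch distances to the query
print(transferable_distance(P, hyper_features, elementary))
```

Under this reading, no weights are stored per template, so a new action needs only one template clip and its hyper-features.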
Max-Margin Formulation
• Triplet
• It is convex and similar to the primal problem of an SVM.
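The objective itself is missing from the extracted slide. A sketch of one objective that fits the description (triplet constraints, convex, SVM-like primal), in assumed notation: clip $i$ is the reference, $j$ shares its action, $k$ does not, $d_{ij,s}$ is the elementary distance for patch $s$, $f_s$ is that patch's hyper-feature, $P$ is the parameter vector, and $C$ is a trade-off constant.

```latex
\begin{aligned}
\min_{P,\;\xi}\quad & \tfrac{1}{2}\lVert P\rVert^{2} + C\sum_{(i,j,k)} \xi_{ijk} \\
\text{s.t.}\quad & \sum_{s}\langle P, f_{s}\rangle\,\bigl(d_{ik,s}-d_{ij,s}\bigr) \ge 1-\xi_{ijk}, \\
& \xi_{ijk}\ge 0 \quad \forall (i,j,k)
\end{aligned}
```

The objective is quadratic and every constraint is linear in $P$ and $\xi$, which is why a problem of this form is convex and mirrors the soft-margin SVM primal.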
Hyper-Features
• Codebook representation
• Descriptor for each patch
  ○ HOG + positions
• Obtaining codebook with the size of
  ○ K-means clustering
• Hyper-feature for each patch
  ○ A dimensional vector
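A minimal sketch of the codebook step, assuming each patch descriptor is already a numeric vector (for example HOG values concatenated with patch position). The soft assignment with a Gaussian kernel and the `sigma` parameter are assumptions; the slides only state that the hyper-feature is a vector tied to the codebook.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(descriptors, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k codewords (cluster centres)."""
    rng = random.Random(seed)
    centers = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            nearest = min(range(k), key=lambda c: sq_dist(d, centers[c]))
            clusters[nearest].append(d)
        for c, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers

def hyper_feature(descriptor, centers, sigma=1.0):
    """Soft assignment of one patch descriptor to every codeword (sums to 1)."""
    d2 = [sq_dist(descriptor, c) for c in centers]
    base = min(d2)  # subtract the minimum for numerical stability
    sims = [math.exp(-(v - base) / (2 * sigma ** 2)) for v in d2]
    total = sum(sims)
    return [s / total for s in sims]

data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
centers = kmeans(data, k=2)
print(hyper_feature([0.0, 0.0], centers))
```

The resulting vector has one entry per codeword, so its length equals the codebook size.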
Summary of Features
• Patch matching: motion cue
• Patch weighting: hyper-feature (shape cue & positions)
Recognizing an Action
[Diagram: query with hyper-features Fq compared against Templates A, B, and C]
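Recognition from one template per action can be sketched as a nearest-template rule: the query takes the label of the template with the smallest distance. The `recognize` helper and the toy distance below are hypothetical; any clip-to-clip distance function could be passed in.

```python
def recognize(query, templates, distance):
    """Return the label of the template closest to the query.

    `templates` maps an action label to its single template clip;
    `distance(query, template)` is any clip-to-clip distance function.
    """
    return min(templates, key=lambda label: distance(query, templates[label]))

# Toy example: clips are stand-in numbers and the distance is their absolute difference.
templates = {"walk": 1.0, "run": 5.0, "wave": 9.0}
label = recognize(4.2, templates, lambda q, t: abs(q - t))
print(label)
```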
Experiments on Action Recognition
• Train the transferable distance function on Weizmann, and test on KTH.
[Figure: source training set (Weizmann) and testing set (KTH), linked by transfer; action labels shown: skip, walk, side, clap, jack, run, bend, wave2, jump, jog, wave1, pjump]
• The source training set does not contain the actions of the template set.
• Each action in the testing set has only one clip as its template.
Visualization
[Figure: learnt weights on codewords; testing actions; ranking]
Five Rounds of Experiments
• For each round, we randomly select one actor, then choose one clip per action from this actor as the template.
[Figure: results per round; 5% improvement. Dc: direct comparison (W = 1); Tr: transferable distance function]
Confusion Matrices for Round 2
• Direct comparison: avg. 70.9%
• Transfer: avg. 76.7%
[Figure: confusion matrices; highlighted pairs: clapping vs. waving, jogging vs. running]
Efficiency
• With the learnt distance function, we can sort the patches on each frame by their saliency.
• Instead of using all patches, we can choose the top N patches with high weights for matching.
[Figure: 10 patches on each frame]
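The patch selection described above can be sketched as a sort on the learnt weights followed by a truncated weighted sum. The function names are hypothetical, and the weights are toy values.

```python
def top_n_patches(weights, n):
    """Indices of the n patches with the largest learnt weights."""
    ranked = sorted(range(len(weights)), key=lambda s: weights[s], reverse=True)
    return ranked[:n]

def pruned_frame_distance(weights, elementary, n):
    """Frame distance computed from the n most salient patches only."""
    keep = top_n_patches(weights, n)
    return sum(weights[s] * elementary[s] for s in keep)

weights = [0.1, 0.9, 0.5, 0.0]      # learnt saliency of four patches
elementary = [1.0, 2.0, 4.0, 8.0]   # their patch-to-patch distances
print(top_n_patches(weights, 2))
print(pruned_frame_distance(weights, elementary, 2))
```

Matching cost then scales with N rather than with the total number of patches per frame.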
Human Action Detection
Cascade Structure
[Diagram: cascade stage 1, cascade stage 2, ..., cascade stage N, with a reject branch at each stage]
[Diagram: all sub-windows enter the cascade with hyper-features Fq; each stage may reject a sub-window, and the last stage gives the decision]
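A hedged sketch of a rejection cascade over sub-windows: each stage scores the surviving windows and discards those above a threshold, so later stages see fewer candidates. The stage functions and thresholds below are toy assumptions, not the stages used in the talk.

```python
def cascade_detect(windows, stages):
    """Run sub-windows through a cascade of (distance_fn, threshold) stages.

    A window survives a stage only if distance_fn(window) <= threshold.
    """
    survivors = list(windows)
    for distance_fn, threshold in stages:
        survivors = [w for w in survivors if distance_fn(w) <= threshold]
        if not survivors:
            break  # every sub-window was rejected
    return survivors

# Toy example: windows are stand-in numbers.
windows = [1, 4, 6, 10]
stages = [
    (lambda w: w, 8),           # cheap first stage rejects window 10
    (lambda w: abs(w - 5), 1),  # stricter second stage rejects window 1
]
detections = cascade_detect(windows, stages)
print(detections)
```

A natural design is to make early stages cheap, for example by using few patches, and to reserve the full comparison for the windows that remain.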
Efficient Action Detection
Contributions
• Transferable distance function learning
• Hyper-features based on appearance and positions
• Max-margin learning framework
• Action recognition from one clip
• Template matching based on motion
• Efficient action detection from one clip
• Cascade structure
Thank You!