
Human Action Recognition using Pose-based Discriminant Embedding



  1. Human Action Recognition using Pose-based Discriminant Embedding Behrouz Saghafi

  2. Advantages of Silhouettes
     • Informative features for describing actions
     • Capture the spatio-temporal characteristics of motion at lower computational cost
     • No need for an explicit human body model

  3. General frameworks for using silhouettes in action recognition
     Frame recognition framework:
     • Classify sequences on a frame-by-frame basis
     • The label for the query sequence is obtained by a voting scheme
     • Ignores temporal information and kinematics
     Sequence recognition framework:
     • Classify the sequence as a whole
     • Compare actions using distances defined between sequences of points
     • Kinematics are taken into account
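
The voting step of the frame recognition framework can be sketched as a simple majority vote over per-frame labels. This is a minimal illustration only: the function name `vote_sequence_label` and the example labels are hypothetical, and the slides do not say which voting rule was actually used.

```python
from collections import Counter


def vote_sequence_label(frame_labels):
    """Return the label assigned to the largest number of frames."""
    if not frame_labels:
        raise ValueError("need at least one frame label")
    counts = Counter(frame_labels)
    # most_common(1) returns a one-element list of (label, count)
    return counts.most_common(1)[0][0]


# Hypothetical per-frame classifier outputs for one query sequence
labels = ["walk", "run", "walk", "walk", "run"]
predicted = vote_sequence_label(labels)
```

Because each frame is labelled independently, the order of the frames has no effect on the result, which is the sense in which this framework ignores temporal information.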

  4. Why embed into a lower-dimensional space?
     • Recognition methods operating in a high-dimensional space suffer from the curse of dimensionality
     • The high-dimensional image space carries far more information than is needed to describe an action
     • The structure of the human body constrains the set of possible postures

  5. Examples of postures for the run action and its trajectory in a possible action space

  6. Embeddings used to find the underlying action space
     • PCA (Principal Component Analysis)
     • LDA (Linear Discriminant Analysis)
     • LPP (Locality Preserving Projections)
     • LLE (Locally Linear Embedding)
     • LE (Laplacian Eigenmaps)
     • Kernel PCA
     • LSTDE (Local Spatio-Temporal Discriminant Embedding)
     In all of these methods, the embedding is defined in terms of distances between points rather than between sequences, so they are not guaranteed to give optimal results in the sequence recognition framework.

  7. Distances between sets of points
     • Median Hausdorff Distance (MHD)
     • Spatiotemporal Correlation Distance (SCD)
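
The two distances can be illustrated with a short sketch. The slide's formulas are not reproduced in this transcript, so both definitions below are assumptions: MHD is taken as the sum of the two directed median nearest-point distances, and SCD as the sum of squared distances between time-aligned frames of two equal-length sequences. All function names are hypothetical.

```python
import math
from statistics import median


def euclidean(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def directed_median_hausdorff(A, B):
    """Median over points in A of the distance to the nearest point in B."""
    return median(min(euclidean(a, b) for b in B) for a in A)


def mhd(A, B):
    """Assumed symmetric form: sum of the two directed median distances."""
    return directed_median_hausdorff(A, B) + directed_median_hausdorff(B, A)


def scd(A, B):
    """Assumed form: sum of squared distances between time-aligned frames."""
    if len(A) != len(B):
        raise ValueError("sequences must have the same length")
    return sum(sum((x - y) ** 2 for x, y in zip(a, b)) for a, b in zip(A, B))


# The same three poses in two different temporal orders
seq_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
seq_b = [(1.0, 0.0), (2.0, 0.0), (0.0, 0.0)]
mhd_ab = mhd(seq_a, seq_b)  # treats each sequence as an unordered set
scd_ab = scd(seq_a, seq_b)  # compares frame t of seq_a with frame t of seq_b
```

Under these assumed definitions, MHD is blind to frame order while SCD is sensitive to it, which is why SCD requires sequences of equal length.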

  8. Optimal Embedding Computation
     We propose an embedding such that, in the embedded space (action space) with SCD as the distance metric:
     • Intra-class sequences are as close as possible
     • Inter-class sequences are as far apart as possible

  9. Optimal Embedding Computation: intra-class sequences should be as close as possible in the action space, so the sum of all pairwise SCDs between embedded intra-class sequences is minimized with respect to A.

  10. Optimal Embedding Computation

  11. Optimal Embedding Computation

  12. Optimal Embedding Computation • For the optimization of inter-class sequences:

  13. Optimal Embedding Computation Generalized eigenvalue problem

  14. Overview of Approach: action recognition is performed by comparing test and training sequences in the low-dimensional action space within a nearest-neighbor framework.
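
A nearest-neighbor decision over embedded sequences might look like the sketch below. The names `nearest_neighbor_label` and `aligned_squared_distance` are hypothetical, and the distance used here (sum of squared frame-wise differences) is only a stand-in for whichever sequence distance is chosen.

```python
def aligned_squared_distance(A, B):
    """Stand-in sequence distance: sum of squared frame-wise differences."""
    if len(A) != len(B):
        raise ValueError("sequences must have the same length")
    return sum(sum((x - y) ** 2 for x, y in zip(a, b)) for a, b in zip(A, B))


def nearest_neighbor_label(query, training_set, distance):
    """Return the label of the training sequence closest to the query."""
    best_label, best_dist = None, float("inf")
    for sequence, label in training_set:
        d = distance(query, sequence)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Toy 1-D "action space" with one training sequence per class
training = [
    ([(0.0,), (1.0,), (0.0,)], "wave"),
    ([(5.0,), (6.0,), (7.0,)], "run"),
]
query = [(0.2,), (0.9,), (0.1,)]
predicted = nearest_neighbor_label(query, training, aligned_squared_distance)
```

Passing the distance as a parameter keeps the classifier independent of the metric, so a set-based or a time-aligned distance can be swapped in without changing the decision rule.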

  15. Period Estimation (1)
      • Actions can be considered semantically periodic.
      • Using a single period is computationally cheaper than using the entire sequence.
      • To estimate the action period, we use and improve a method based on the absolute correlation between frames. The object’s self-similarity is computed by:

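  16. Period Estimation (2) Pipeline: take a column $z$ of the self-similarity $S$, linearly detrend it to obtain $\hat{z}$, then compute the autocorrelation $R_{\hat{z}\hat{z}}(m) = E[\hat{z}_{n+m}\,\hat{z}_n]$, estimated from the $N - m$ available sample pairs.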

  17. Period Estimation (3) False peaks detected by the zero-derivative method are marked by red vertical lines
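
The detrend-then-autocorrelate idea from the period estimation slides can be sketched as follows. This is not the authors' improved method: the peak rule used here (first positive local maximum of the autocorrelation) is a simple stand-in, the input is a synthetic 1-D signal rather than a real self-similarity column, and all function names are hypothetical.

```python
import math


def linear_detrend(z):
    """Subtract the least-squares straight line from the signal."""
    n = len(z)
    mean_t = (n - 1) / 2.0
    mean_z = sum(z) / n
    var_t = sum((t - mean_t) ** 2 for t in range(n))
    slope = sum((t - mean_t) * (z[t] - mean_z) for t in range(n)) / var_t
    return [z[t] - (mean_z + slope * (t - mean_t)) for t in range(n)]


def autocorrelation(z_hat, max_lag):
    """Sample autocorrelation for lags 0..max_lag, averaged over N - m pairs."""
    n = len(z_hat)
    return [
        sum(z_hat[i + m] * z_hat[i] for i in range(n - m)) / (n - m)
        for m in range(max_lag + 1)
    ]


def estimate_period(z):
    """Return the lag of the first positive local maximum, or None."""
    if len(z) < 4:
        raise ValueError("signal too short to estimate a period")
    max_lag = len(z) // 2
    r = autocorrelation(linear_detrend(z), max_lag)
    for m in range(1, max_lag):
        if r[m] > 0 and r[m] > r[m - 1] and r[m] >= r[m + 1]:
            return m
    return None


# Synthetic signal: 10-frame period plus a linear drift
signal = [math.cos(2 * math.pi * t / 10) + 0.05 * t for t in range(60)]
period = estimate_period(signal)
```

On noisy data a first-local-maximum rule is prone to spurious peaks, which is the kind of failure the slide on false peak detections points to.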

  18. Warping: bicubic interpolation technique
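
Temporal warping to a fixed number of frames can be illustrated with a small resampling sketch. Two assumptions are made here: that warping means resampling one period of a sequence to a fixed length T, and that linear interpolation is an acceptable stand-in for the bicubic interpolation named on the slide. The function name `resample_sequence` is hypothetical.

```python
def resample_sequence(frames, target_len):
    """Resample a list of frame vectors to target_len frames (linear in time)."""
    n = len(frames)
    if n == 0 or target_len < 1:
        raise ValueError("need a non-empty sequence and a positive length")
    if n == 1 or target_len == 1:
        # Nothing to interpolate between: repeat the first frame
        return [list(frames[0]) for _ in range(target_len)]
    out = []
    for k in range(target_len):
        # Fractional position of output frame k on the input time axis
        pos = k * (n - 1) / (target_len - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


# Stretch a 3-frame sequence of 1-D feature vectors to 5 frames
warped = resample_sequence([[0.0], [2.0], [4.0]], 5)
```

Once every sequence has the same number of frames, frame-by-frame distances between sequences become well defined.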

  19. Aligning

  20. Experimental Results (Datasets) Weizmann database Maryland database KTH database

  21. Weizmann database: 9 subjects, 10 actions: bending (bend), jumping jack (jack), jumping-forward-on-two-legs (jump), jumping-in-place-on-two-legs (pjump), running (run), galloping sideways (side), skipping (skip), walking (walk), waving-one-hand (wave1), and waving-two-hands (wave2)

  22. Recognition accuracy vs. dimension for different values of T for SCD (Weizmann)

  23. Studying the effect of T: comparing the maximum and mean recognition rates for different values of T for SCD (Weizmann)

  24. Recognition accuracy vs. dimension for different values of T for MHD (Weizmann)

  25. Studying the effect of T: comparing the maximum and mean recognition rates for different values of T for MHD (Weizmann)

  26. Recognition accuracy vs. dimension, using test sequences without warping (Weizmann). T = 6 using MHD; best accuracy: 98.89%

  27. Comparison of different dimension reduction methods (Weizmann)

  28. Experimental Results (Weizmann): comparison with different dimension reduction methods [Figure: embedding plots in four panels: LDA, PCA, Supervised LPP (SLPP), and PDE]

  29. Comparison with other results on Weizmann dataset

  30. Results of Robustness to Noise (Weizmann)

  31. Weizmann’s robustness database for deformations

  32. Results of Weizmann’s Deformation robustness test

  33. Weizmann’s robustness database for viewpoint

  34. Results of Weizmann’s viewpoint robustness test

  35. Experimental Results (Maryland): 10 actions: pick up object, jog in place, push, squat, wave, kick, bend to the side, throw, turn around, and talk on cell phone. Result: 100% recognition rate

  36. Comparison of different dimension reduction methods (Maryland)

  37. KTH dataset: 6 actions, 25 subjects, 4 scenarios (s1, s2, s3, s4)

  38. Examples of computed edge maps for in-place actions of KTH dataset

  39. Comparison with other methods on KTH dataset for the in-place actions
