Human Action Recognition using Pose-based Discriminant Embedding
Behrouz Saghafi
Advantages of Silhouettes
• Informative features for describing actions
• Capture the spatio-temporal characteristics of motion at lower computational cost
• No need for an explicit human body model
General frameworks for using silhouettes in action recognition
Frame recognition framework
• Classifies sequences on a frame-by-frame basis
• The label for the query sequence is obtained by a voting scheme
• Ignores temporal information and kinematics
Sequence recognition framework
• Classifies the sequence as a whole
• Compares actions using distances defined between sequences of points
• Takes kinematics into account
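The voting step of the frame recognition framework can be illustrated with a minimal sketch. It assumes the per-frame classifier has already produced one label per frame; the function name `vote_sequence_label` and the example labels are hypothetical, and simple majority voting is only one possible voting scheme.

```python
from collections import Counter


def vote_sequence_label(frame_labels):
    """Return the most frequent per-frame label.

    Ties are resolved in favour of the label that appears first,
    which is how Counter.most_common orders equal counts.
    """
    if not frame_labels:
        raise ValueError("need at least one frame label")
    return Counter(frame_labels).most_common(1)[0][0]


# Hypothetical per-frame predictions for one query sequence.
frame_predictions = ["run", "run", "walk", "run", "skip"]
sequence_label = vote_sequence_label(frame_predictions)
```

Because each frame is labelled independently, shuffling `frame_predictions` would not change the result, which is exactly the loss of temporal information noted above.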
Why embedding into lower dimensional space?
• Recognition methods operating in high-dimensional space suffer from the curse of dimensionality
• The high-dimensional image space carries far more information than is needed to describe an action
• The structure of the human body constrains the set of possible postures
Examples of postures for run and its trajectory in a possible action space
Embeddings used to find the underlying action space
• PCA (Principal Components Analysis)
• LDA (Linear Discriminant Analysis)
• LPP (Locality Preserving Projections)
• LLE (Locally Linear Embedding)
• LE (Laplacian Eigenmaps)
• Kernel PCA
• LSTDE (Local Spatio-Temporal Discriminant Embedding)
> In all these methods, the embedding is defined based on the distance between points rather than between sequences.
> Thus they are not guaranteed to give optimal results in the sequence recognition framework.
Distances between sets of points
• Median Hausdorff Distance (MHD)
• Spatiotemporal Correlation Distance (SCD)
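The difference between a set-based and a sequence-based distance can be sketched as follows. Both definitions here are assumptions, since the slide's formulas are not reproduced in the text: the median Hausdorff distance is taken in its common symmetric form (sum of the two directed medians of nearest-point distances), and the frame-aligned distance (sum of squared distances between frames at the same time index) is only a stand-in for the SCD. All function names are hypothetical.

```python
import math
from statistics import median


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def directed_median_hausdorff(seq_a, seq_b):
    """Median, over points of seq_a, of the distance to the closest point of seq_b."""
    nearest = [min(math.sqrt(squared_distance(a, b)) for b in seq_b) for a in seq_a]
    return median(nearest)


def median_hausdorff_distance(seq_a, seq_b):
    """Assumed symmetric form: sum of both directed distances."""
    return (directed_median_hausdorff(seq_a, seq_b)
            + directed_median_hausdorff(seq_b, seq_a))


def frame_aligned_distance(seq_a, seq_b):
    """Assumed stand-in for the SCD: compare frames at equal time indices."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same number of frames")
    return sum(squared_distance(a, b) for a, b in zip(seq_a, seq_b))


# A trajectory and the same trajectory played backwards.
forward = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
backward = list(reversed(forward))

mhd = median_hausdorff_distance(forward, backward)
aligned = frame_aligned_distance(forward, backward)
```

In this example the set-based distance is zero because both trajectories visit the same points, while the frame-aligned distance is non-zero because it is sensitive to temporal order.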
Optimal Embedding Computation
We propose an embedding such that, in the embedded space (action space) with SCD as the distance metric:
- The intra-class sequences are as close as possible.
- The inter-class sequences are as far apart as possible.
Optimal Embedding Computation
Intra-class sequences should be as close as possible in the action space: the sum of all pairwise SCDs between embedded intra-class sequences is minimized with respect to A.
Optimal Embedding Computation • For the optimization of inter-class sequences:
Optimal Embedding Computation Generalized eigenvalue problem
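A minimal NumPy sketch of how such a generalized eigenvalue problem could be set up and solved. It rests on a loud assumption: the sequence distance is taken to be a sum of squared frame-wise differences, so that the summed pairwise distance after a linear map A equals tr(AᵀSA) for a scatter matrix S. The matrices `s_w` and `s_b`, the helper names, and the small ridge term are illustrative choices, not the formulation from the slides.

```python
import numpy as np


def pairwise_scatter(pairs):
    """Sum of (X - Y)^T (X - Y) over sequence pairs, each of shape (T, D)."""
    dim = pairs[0][0].shape[1]
    scatter = np.zeros((dim, dim))
    for x, y in pairs:
        diff = x - y
        scatter += diff.T @ diff
    return scatter


def solve_embedding(s_between, s_within, n_components, ridge=1e-6):
    """Solve S_b a = lambda (S_w + ridge*I) a and keep the largest eigenvalues."""
    dim = s_within.shape[0]
    regularized = s_within + ridge * np.eye(dim)   # keep S_w positive definite
    chol = np.linalg.cholesky(regularized)         # regularized = L L^T
    half = np.linalg.solve(chol, s_between)        # L^{-1} S_b
    whitened = np.linalg.solve(chol, half.T)       # L^{-1} S_b L^{-T}
    whitened = (whitened + whitened.T) / 2         # remove round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(whitened)    # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    projection = np.linalg.solve(chol.T, eigvecs[:, order])  # a = L^{-T} v
    return eigvals[order], projection


# Synthetic data: 5 intra-class and 5 inter-class pairs of 6-frame, 4-D sequences.
rng = np.random.default_rng(0)
intra_pairs = [(rng.normal(size=(6, 4)), rng.normal(size=(6, 4))) for _ in range(5)]
inter_pairs = [(rng.normal(size=(6, 4)), rng.normal(size=(6, 4)) + 2.0) for _ in range(5)]

s_w = pairwise_scatter(intra_pairs)
s_b = pairwise_scatter(inter_pairs)
eigenvalues, projection = solve_embedding(s_b, s_w, n_components=2)
```

The columns of `projection` span the low-dimensional space; the Cholesky whitening is used only so that the sketch needs nothing beyond `numpy.linalg`.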
Overview of Approach
> Action recognition is performed by comparing test and training sequences in the low-dimensional action space within a nearest-neighbor framework.
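The nearest-neighbor step can be sketched as below, assuming the sequences have already been embedded and brought to a common length. The distance is passed in as a parameter; the `sum_squared_frame_distance` helper, the labels, and the toy 2-D sequences are hypothetical placeholders rather than the distance or data used in the experiments.

```python
def sum_squared_frame_distance(seq_a, seq_b):
    """Placeholder sequence distance: squared differences of time-aligned frames."""
    return sum((a - b) ** 2
               for frame_a, frame_b in zip(seq_a, seq_b)
               for a, b in zip(frame_a, frame_b))


def nearest_neighbor_label(query, training_set, distance):
    """Return the label of the training sequence closest to the query.

    training_set is a list of (embedded_sequence, label) pairs.
    """
    _, best_label = min(training_set, key=lambda item: distance(query, item[0]))
    return best_label


# Toy embedded training sequences (two frames, two dimensions each).
training_set = [
    ([(0.0, 0.0), (1.0, 0.0)], "walk"),
    ([(0.0, 0.0), (0.0, 1.0)], "wave1"),
]
query = [(0.1, 0.0), (0.9, 0.1)]
predicted = nearest_neighbor_label(query, training_set, sum_squared_frame_distance)
```

Keeping the distance as an argument makes it easy to swap in a different sequence distance without touching the classifier.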
Period Estimation (1)
• Actions can be considered semantically periodic.
• Using a single period is more computationally efficient than using the entire sequence.
• To estimate the action period, we use and improve the method based on absolute correlation between frames. The object's self-similarity is computed by:
Period Estimation (2)
A column $z$ of $S$ → linearly detrend → $\hat{z}$ → autocorrelation → $\hat{R}_{\hat{z}\hat{z}}(m)$
$\hat{R}_{\hat{z}\hat{z}}(m) = E[\hat{z}_{n+m}\,\hat{z}_n] \approx \frac{1}{N-m}\sum_{n}\hat{z}_{n+m}\,\hat{z}_n$
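The detrend-then-autocorrelate pipeline can be sketched in plain Python. This is an illustration under assumptions: the input is a synthetic 1-D signal standing in for one column of the self-similarity matrix, the autocorrelation uses a 1/(N-m) normalization, and the period is read off as the lag with the largest value in a limited search range, which is a simplification and not the peak detection procedure described in the slides.

```python
import math


def linear_detrend(signal):
    """Subtract the least-squares straight line from the signal."""
    n = len(signal)
    mean_t = (n - 1) / 2
    mean_z = sum(signal) / n
    var_t = sum((t - mean_t) ** 2 for t in range(n))
    slope = sum((t - mean_t) * (z - mean_z) for t, z in enumerate(signal)) / var_t
    return [z - (mean_z + slope * (t - mean_t)) for t, z in enumerate(signal)]


def autocorrelation(signal, max_lag):
    """Estimate R(m) = 1/(N-m) * sum_n z[n+m] * z[n] for m = 0..max_lag."""
    n = len(signal)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the signal length")
    return [sum(signal[i + m] * signal[i] for i in range(n - m)) / (n - m)
            for m in range(max_lag + 1)]


# Synthetic stand-in for one column of S: period 8 plus a linear drift.
signal = [math.sin(2 * math.pi * t / 8) + 0.05 * t for t in range(64)]
detrended = linear_detrend(signal)
correlation = autocorrelation(detrended, max_lag=12)

# Simplified period estimate: best non-zero lag within the search range.
estimated_period = max(range(1, len(correlation)), key=lambda lag: correlation[lag])
```

Without the detrending step the drift dominates the autocorrelation, which then decays slowly instead of peaking at the period.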
Period Estimation (3)
False peaks detected by the zero-derivative method are marked by red vertical lines
Warping Bicubic interpolation technique
Aligning
Experimental Results (Datasets)
• Weizmann database
• Maryland database
• KTH database
Weizmann database
9 subjects, 10 actions: bending (bend), jumping jack (jack), jumping-forward-on-two-legs (jump), jumping-in-place-on-two-legs (pjump), running (run), galloping sideways (side), skipping (skip), walking (walk), waving-one-hand (wave1), and waving-two-hands (wave2)
Recognition accuracy vs. dimension for different values of T for SCD (Weizmann)
Studying the effect of T: comparing the maximum and mean recognition rate for different values of T for SCD (Weizmann)
Recognition accuracy vs. dimension for different values of T for MHD (Weizmann)
Studying the effect of T: comparing the maximum and mean recognition rate for different values of T for MHD (Weizmann)
Recognition accuracy vs. dimension, using test sequences without warping (Weizmann)
T = 6 using MHD
Best accuracy: 98.89%
Comparison of different dimension reduction methods (Weizmann)
Experimental Results (Weizmann)
Comparison with different dimension reduction methods
[Figure: 3-D embeddings, one panel each for LDA, PCA, Supervised LPP (SLPP), and PDE]
Comparison with other results on Weizmann dataset
Results of Robustness to Noise (Weizmann)
Weizmann’s robustness database for deformations
Results of Weizmann’s Deformation robustness test
Weizmann’s robustness database for viewpoint
Results of Weizmann’s viewpoint robustness test
Experimental Results (Maryland)
10 actions: pick up object, jog in place, push, squat, wave, kick, bend to the side, throw, turn around, and talk on cell phone
Result: 100% recognition rate
Comparison of different dimension reduction methods (Maryland)
KTH dataset
6 actions, 25 subjects, 4 scenarios (s1, s2, s3, s4)
Examples of computed edge maps for in-place actions of KTH dataset
Comparison with other methods on KTH dataset for the in-place actions