Probabilistic Tracking and Reconstruction of 3D Human Motion in Monocular Video Sequences


  1. Probabilistic Tracking and Reconstruction of 3D Human Motion in Monocular Video Sequences. Presentation of the thesis work of: Hedvig Sidenbladh, KTH. Thesis opponent: Prof. Bill Freeman, MIT

  2. Thesis supervisors • Prof. Jan-Olof Eklundh, KTH • Prof. Michael Black, Brown University Collaborators • Dr. David Fleet, Xerox PARC • Prof. Dirk Ormoneit, Stanford University

  3. A vision of the future from the past. Elektro and Sparky, New York World's Fair, 1939 (Westinghouse Historical Collection)

  4. Applications of computers looking at people • Human-machine interaction – Robots – Intelligent rooms • Video search • Entertainment: motion capture for games, animation, and film • Surveillance

  5. Technical Goal: Tracking a human in 3D

  6. Why is it hard? The appearance of people can vary dramatically.

  7. Why is it hard? People can appear in arbitrary poses. Structure is unobservable: it must be inferred from the visible parts.

  8. Why is it hard? Geometrically under-constrained.

  9. One solution: • Use markers • Use multiple cameras http://www.vicon.com/animation/

  10. State of the Art. Bregler and Malik '98 • Brightness constancy cue – Insensitive to appearance • Full-body required multiple cameras • Single hypothesis

  11. 2D vs. 3D tracking • Artist's models...

  12. State of the Art. Cham and Rehg '99 • Single camera, multiple hypotheses • 2D templates (no drift but view dependent) $I(\mathbf{x}, t) = I(\mathbf{x} + \mathbf{u}, 0) + \eta$

  13. 1999 state of the art. Pavlovic, Rehg, Cham, and Murphy, Intl. Conf. Computer Vision, 1999

  14. State of the Art. Deutscher, North, Bascle, & Blake '00 • Multiple hypotheses • Multiple cameras • Simplified clothing, lighting and background

  15. Note: we can fake it with clever system design. M. Krueger, “Artificial Reality”, Addison-Wesley, 1983.

  16. Game videos...

  17. Decathlete 100m hurdles • Black background • No other people in camera • Person at known distance and position • Display tells person what motion to do

  18. Performance specifications * No special clothing * Monocular, grayscale sequences (archival data) * Unknown, cluttered environment Task: Infer 3D human motion from 2D image

  19. Bayesian formulation $p(\text{model} \mid \text{cues}) = \frac{p(\text{cues} \mid \text{model})\, p(\text{model})}{p(\text{cues})}$ 1. Need a constraining likelihood model that is also invariant to variations in human appearance. 2. Need a prior model of how people move. 3. Posterior probability: Need an effective way to explore the model space (very high dimensional) and represent ambiguities.
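The Bayesian formulation on this slide can be illustrated with a toy example over a finite set of hypotheses. This is only a sketch: the two pose labels and all probability values are made up for illustration, and the real model space in the talk is continuous and very high dimensional.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a finite set of hypotheses.

    prior[h]      = p(model = h)
    likelihood[h] = p(cues | model = h)
    Returns p(model = h | cues) for every h.
    """
    unnormalized = {h: likelihood[h] * prior[h] for h in prior}
    evidence = sum(unnormalized.values())  # p(cues), the normalizing constant
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical numbers: two poses that are equally likely a priori,
# but the image cues fit one of them three times better.
prior = {"arm_forward": 0.5, "arm_backward": 0.5}
likelihood = {"arm_forward": 0.6, "arm_backward": 0.2}
post = posterior(prior, likelihood)
print(post)
```

Because the evidence term only rescales the result, it can be dropped whenever hypotheses are merely compared with each other.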

  20. System components • Representation for probabilistic analysis. • Models for human appearance (likelihood term). • Models for human motion (prior term). – Very general model – Very specific model – Example-based model

  21. System components • Representation for probabilistic analysis. • Models for human appearance (likelihood term). • Models for human motion (prior term). – Very general model – Very specific model – Example-based model

  22. Simple Body Model * Limbs are truncated cones * Parameter vector of joint angles and angular velocities = $\phi$

  23. Multiple Hypotheses • Posterior distribution over model parameters often multi-modal (due to ambiguities) • Represent whole distribution: – sampled representation – each sample is a pose – predict over time using a particle filtering approach

  24. Particle Filter Posterior $p(\phi_{t-1} \mid \vec{I}_{t-1})$ → sample → Temporal dynamics $p(\phi_t \mid \phi_{t-1})$ → sample → Likelihood $p(I_t \mid \phi_t)$ → normalize → Posterior $p(\phi_t \mid \vec{I}_t)$ Problem: Expensive representation of posterior! Approaches to solve problem: • Lower the number of samples. (Deutscher et al., CVPR00) • Represent the space in other ways (Choo and Fleet, ICCV01)
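The sample, predict, weight, normalize cycle of the particle filter can be sketched on a toy problem. Everything concrete here is an assumption made for illustration: the state is a single joint angle rather than a full pose vector, the dynamics are a Gaussian random walk, and the likelihood is a Gaussian around a hypothetical observed angle. The function name `particle_filter_step` is not from the talk.

```python
import math
import random


def particle_filter_step(particles, weights, dynamics, likelihood, rng):
    """One cycle: resample posterior, apply dynamics, weight, normalize."""
    n = len(particles)
    # Sample from the previous posterior p(phi_{t-1} | I_{t-1}).
    resampled = rng.choices(particles, weights=weights, k=n)
    # Sample from the temporal dynamics p(phi_t | phi_{t-1}).
    predicted = [dynamics(p, rng) for p in resampled]
    # Evaluate the likelihood p(I_t | phi_t) for each predicted sample.
    raw = [likelihood(p) for p in predicted]
    total = sum(raw)
    # Normalize so the weights represent the new posterior p(phi_t | I_t).
    if total == 0.0:
        new_weights = [1.0 / n] * n  # fall back to uniform if all weights vanish
    else:
        new_weights = [w / total for w in raw]
    return predicted, new_weights


# Toy setup (assumed): one angle, observed near 0.8 rad in every frame.
rng = random.Random(0)
particles = [rng.uniform(-math.pi, math.pi) for _ in range(500)]
weights = [1.0 / len(particles)] * len(particles)
observed_angle = 0.8

for _ in range(10):
    particles, weights = particle_filter_step(
        particles,
        weights,
        dynamics=lambda p, r: p + r.gauss(0.0, 0.1),
        likelihood=lambda p: math.exp(-0.5 * ((observed_angle - p) / 0.2) ** 2),
        rng=rng,
    )

estimate = sum(p * w for p, w in zip(particles, weights))
print(estimate)
```

The cost noted on the slide shows up directly: every particle needs one likelihood evaluation per frame, and the number of particles needed grows quickly with the dimension of the state.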

  25. System components • Representation for probabilistic analysis. • Models for human appearance (likelihood term). • Models for human motion (prior term). – Very general model – Very specific model – Example-based model

  26. What do people look like? • Changing background • Varying shadows • Occlusion • Deforming clothing • Low contrast limb boundaries What do non-people look like?

  27. Edge Detection? • Probabilistic model? • Under/over-segmentation, thresholds, …

  28. Key Idea #1 (Likelihood) 1. Use the 3D model to predict the location of limb boundaries (not necessarily features) in the scene. 2. Compute various filter responses steered to the predicted orientation of the limb. 3. Compute likelihood of filter responses using a statistical model learned from examples.

  29. Edge Filters Normalized derivatives of Gaussians (Lindeberg, Granlund and Knutsson, Perona, Freeman & Adelson, …) Edge filter response steered to limb orientation: $f_e(\mathbf{x}, \theta, \sigma) = \sin\theta \, f_x(\mathbf{x}, \sigma) + \cos\theta \, f_y(\mathbf{x}, \sigma)$ Filter responses steered to arm orientation.
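The steering formula can be tried on a tiny synthetic image. This sketch swaps in plain central differences for the normalized Gaussian derivatives $f_x$ and $f_y$ named on the slide, so it has no scale parameter $\sigma$; the image, the pixel position and the angle convention are assumptions for illustration.

```python
import math


def steered_edge_response(image, x, y, theta):
    """f_e = sin(theta) * f_x + cos(theta) * f_y at pixel (x, y).

    image is a list of rows; f_x and f_y are central differences,
    used here as a stand-in for Gaussian derivative filters.
    """
    f_x = (image[y][x + 1] - image[y][x - 1]) / 2.0
    f_y = (image[y + 1][x] - image[y - 1][x]) / 2.0
    return math.sin(theta) * f_x + math.cos(theta) * f_y


# Vertical step edge: dark on the left, bright on the right.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]

# Steered to theta = pi/2 the filter picks up the horizontal derivative;
# steered to theta = 0 it only sees the (zero) vertical derivative.
strong = steered_edge_response(image, 1, 1, math.pi / 2)
weak = steered_edge_response(image, 1, 1, 0.0)
print(strong, weak)
```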

  30. Example Training Images

  31. Edge Distributions Edge response steered to model edge: $f_e(\mathbf{x}, \theta, \sigma) = \sin\theta \, f_x(\mathbf{x}, \sigma) + \cos\theta \, f_y(\mathbf{x}, \sigma)$ Similar to Konishi et al., CVPR 99

  32. Edge Likelihood Ratio (plots: edge response, likelihood ratio)
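One plausible way to obtain such a likelihood ratio is to histogram filter responses measured on limb edges and on background in training images, then divide the two histograms bin by bin. The sketch below assumes exactly that; the helper names, the bin layout, the add-one smoothing and the training responses are all hypothetical and not taken from the talk.

```python
def bin_index(value, n_bins, lo, hi):
    """Map a response to a histogram bin, clamping out-of-range values."""
    index = int((value - lo) / (hi - lo) * n_bins)
    return min(n_bins - 1, max(0, index))


def histogram(samples, n_bins, lo, hi):
    """Normalized histogram with add-one smoothing to avoid empty bins."""
    counts = [1.0] * n_bins
    for s in samples:
        counts[bin_index(s, n_bins, lo, hi)] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def likelihood_ratio(response, p_on, p_off, lo, hi):
    """p(response | on limb edge) / p(response | background)."""
    b = bin_index(response, len(p_on), lo, hi)
    return p_on[b] / p_off[b]


# Made-up training responses in [0, 1]: strong on edges, weak off edges.
on_edge = [0.7, 0.8, 0.9, 0.6, 0.85]
off_edge = [0.1, 0.05, 0.2, 0.15, 0.3]
p_on = histogram(on_edge, 4, 0.0, 1.0)
p_off = histogram(off_edge, 4, 0.0, 1.0)

high = likelihood_ratio(0.9, p_on, p_off, 0.0, 1.0)
low = likelihood_ratio(0.1, p_on, p_off, 0.0, 1.0)
print(high, low)
```

A ratio above 1 means the response is better explained by a limb edge than by background, and a ratio below 1 means the opposite.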

  33. Other Cues • Ridges • Motion: $I(\mathbf{x}, t)$ and $I(\mathbf{x} + \mathbf{u}, t+1)$

  34. Ridge Distributions Ridge response steered to limb orientation: $f_r(\mathbf{x}, \theta, \sigma) = \left| \sin^2\theta \, f_{xx}(\mathbf{x}, \sigma) + \cos^2\theta \, f_{yy}(\mathbf{x}, \sigma) - 2 \sin\theta \cos\theta \, f_{xy}(\mathbf{x}, \sigma) \right| - \left| \cos^2\theta \, f_{xx}(\mathbf{x}, \sigma) + \sin^2\theta \, f_{yy}(\mathbf{x}, \sigma) + 2 \sin\theta \cos\theta \, f_{xy}(\mathbf{x}, \sigma) \right|$ Ridge response only on certain image scales!
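The ridge formula is easy to misread in flattened form, so here is a minimal sketch that evaluates it from given second derivatives. The function name and the example values are assumptions; in practice $f_{xx}$, $f_{yy}$ and $f_{xy}$ would come from second-order Gaussian derivative filters at a chosen scale.

```python
import math


def steered_ridge_response(f_xx, f_yy, f_xy, theta):
    """Difference of the two absolute second-derivative terms in the formula."""
    s, c = math.sin(theta), math.cos(theta)
    first = abs(s * s * f_xx + c * c * f_yy - 2.0 * s * c * f_xy)
    second = abs(c * c * f_xx + s * s * f_yy + 2.0 * s * c * f_xy)
    return first - second


# Assumed example: a bright vertical bar has strong curvature across x only.
f_xx, f_yy, f_xy = -2.0, 0.0, 0.0
aligned = steered_ridge_response(f_xx, f_yy, f_xy, math.pi / 2)
misaligned = steered_ridge_response(f_xx, f_yy, f_xy, 0.0)
print(aligned, misaligned)
```

With these values the response is positive when the filter is steered to $\theta = \pi/2$ and negative when it is steered to $\theta = 0$, so the sign separates matching from mismatching orientations.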

  35. Motion distributions Different underlying motion models

  36. Likelihood Formulation • Independence assumptions: – Cues: $p(\text{image} \mid \text{model}) = p(\text{cue}_1 \mid \text{model}) \, p(\text{cue}_2 \mid \text{model})$ – Spatial: $p(\text{image} \mid \text{model}) = \prod_{\mathbf{x} \in \text{image}} p(\text{image}(\mathbf{x}) \mid \text{model})$ – Scales: $p(\text{image} \mid \text{model}) = \prod_{\sigma = 1, \ldots} p(\text{image}(\sigma) \mid \text{model})$ • Combines cues and scales! • Simplification, in reality there are dependencies
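Under these independence assumptions the full likelihood is one large product over cues, scales and pixels, which is usually accumulated as a sum of logarithms to avoid numerical underflow. The nested dictionary layout and the numbers below are assumptions chosen for illustration.

```python
import math


def log_likelihood(cue_probs):
    """Sum of log-probabilities over cues, scales and pixels.

    cue_probs[cue][scale] is a list of per-pixel probabilities
    p(image(x) | model) for that cue at that scale.
    """
    total = 0.0
    for per_scale in cue_probs.values():
        for per_pixel in per_scale.values():
            total += sum(math.log(p) for p in per_pixel)
    return total


# Hypothetical per-pixel probabilities for two cues at one scale.
cue_probs = {
    "edge": {1: [0.5, 0.5]},
    "ridge": {1: [0.25]},
}
ll = log_likelihood(cue_probs)
print(ll, math.exp(ll))
```

Exponentiating the sum recovers the plain product, here 0.5 × 0.5 × 0.25.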

  37. The power of cue combination

  38. Using edge cues alone (figure: edge cues)

  39. Using ridge cues alone (figure: ridge cues)

  40. Using flow cue alone (figure: flow cues)

  41. Using edge, ridge, and motion cues together (figures: edge cues, ridge cues, flow cues)

  42. Key Idea #2 $p(\text{image} \mid \text{foreground}, \text{background}) \propto \frac{p(\text{foreground part of image} \mid \text{foreground})}{p(\text{foreground part of image} \mid \text{background})}$ Do not look in parts of the image considered background. (figure label: foreground part of image)

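  43. Likelihood $p(\text{image} \mid \text{fore}, \text{back}) = \prod_{\text{fore pixels}} p(\text{image} \mid \text{fore}) \prod_{\text{back pixels}} p(\text{image} \mid \text{back}) = \frac{\prod_{\text{all pixels}} p(\text{image} \mid \text{back}) \prod_{\text{fore pixels}} p(\text{image} \mid \text{fore})}{\prod_{\text{fore pixels}} p(\text{image} \mid \text{back})} = \frac{\text{const} \prod_{\text{fore pixels}} p(\text{image} \mid \text{fore})}{\prod_{\text{fore pixels}} p(\text{image} \mid \text{back})}$ (figure labels: foreground pixels, background pixels)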
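The derivation says that the full-image likelihood equals a pose-independent constant times a ratio evaluated over foreground pixels only. A small numeric check makes this concrete. The per-pixel probabilities and the two candidate foreground regions are made up, and the function names are assumptions.

```python
import math


def full_likelihood(p_fore, p_back, fore_idx):
    """Product over all pixels: foreground model inside, background outside."""
    fore = set(fore_idx)
    result = 1.0
    for i in range(len(p_back)):
        result *= p_fore[i] if i in fore else p_back[i]
    return result


def foreground_ratio(p_fore, p_back, fore_idx):
    """Product over foreground pixels only of p(pixel | fore) / p(pixel | back)."""
    result = 1.0
    for i in fore_idx:
        result *= p_fore[i] / p_back[i]
    return result


# Hypothetical per-pixel probabilities under each model for a 4-pixel image.
p_fore = [0.8, 0.6, 0.1, 0.2]
p_back = [0.2, 0.3, 0.5, 0.4]
const = math.prod(p_back)  # product over all pixels, the same for every pose

for fore_idx in ([0, 1], [2, 3]):
    lhs = full_likelihood(p_fore, p_back, fore_idx)
    rhs = const * foreground_ratio(p_fore, p_back, fore_idx)
    print(fore_idx, lhs, rhs)
```

Since `const` does not depend on which pixels a pose hypothesis covers, hypotheses can be ranked using the foreground ratio alone, which is why the background part of the image never needs to be visited.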

  44. System components • Representation for probabilistic analysis. • Models for human appearance (likelihood term). • Models for human motion (prior term). – Very general model – Very specific model – Example-based model

  45. The Prior term Bayesian formulation: $p(\text{model} \mid \text{cue}) \propto p(\text{cue} \mid \text{model}) \, p(\text{model})$ – Need a constraining likelihood model that is also invariant to variations in human appearance – Need a good model of how people move

  46. Very general model • Constant velocity motions • Not constrained by how people tend to move.

  47. Constant velocity model • All DOF in the model parameter space, $\phi$, independent • Angles are assumed to change with constant speed • Speed and position changes are randomly sampled from normal distribution
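A constant velocity prior of this kind can be sketched as follows. The exact update is an assumption: each angle advances by its velocity plus independent Gaussian noise, and each velocity is perturbed by independent Gaussian noise. The function name, the noise scales and the two-joint example are hypothetical.

```python
import random


def constant_velocity_step(angles, velocities, sigma_angle, sigma_velocity, rng):
    """Sample the next joint angles and angular velocities.

    Each degree of freedom is treated independently. With both sigmas
    set to zero the angles simply advance by their velocities.
    """
    new_velocities = [v + rng.gauss(0.0, sigma_velocity) for v in velocities]
    new_angles = [
        a + v + rng.gauss(0.0, sigma_angle)
        for a, v in zip(angles, new_velocities)
    ]
    return new_angles, new_velocities


rng = random.Random(0)
angles = [0.0, 1.0]        # hypothetical joint angles in radians
velocities = [0.1, -0.2]   # hypothetical angular velocities per frame

# Deterministic part of the model: no noise.
exact_angles, exact_velocities = constant_velocity_step(
    angles, velocities, 0.0, 0.0, rng
)

# Stochastic prediction, as used to propagate samples between frames.
noisy_angles, noisy_velocities = constant_velocity_step(
    angles, velocities, 0.05, 0.01, rng
)
print(exact_angles, noisy_angles)
```

Nothing in this update knows about joint limits or typical human motion, which matches the slide's remark that the very general model is not constrained by how people tend to move.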
