Motion Estimation (I) Ce Liu celiu@microsoft.com Microsoft Research New England
We live in a moving world
• Perceiving, understanding and predicting motion is an important part of our daily lives
Motion estimation: a core problem of computer vision
• Related topics:
– Image correspondence, image registration, image matching, image alignment, …
• Applications
– Video enhancement: stabilization, denoising, super resolution
– 3D reconstruction: structure from motion (SFM)
– Video segmentation
– Tracking/recognition
– Advanced video editing (label propagation)
Contents (today)
• Motion perception
• Motion representation
• Parametric motion: Lucas-Kanade
• Dense optical flow: Horn-Schunck
• Robust estimation
• Applications (1)
Contents (next time)
• Discrete optical flow
• Layer motion analysis
• Contour motion analysis
• Obtaining motion ground truth
• SIFT flow: generalized optical flow
• Applications (2)
Readings
• Rick's book: Chapter 8
• Ce Liu's PhD thesis (appendix A & B)
• S. Baker and I. Matthews. Lucas-Kanade 20 years on: a unifying framework. IJCV 2004
• Horn-Schunck (Wikipedia)
• A. Bruhn, J. Weickert, C. Schnörr. Lucas/Kanade meets Horn/Schunck: combining local and global optic flow methods. IJCV 2005
Contents
• Motion perception
• Motion representation
• Parametric motion: Lucas-Kanade
• Dense optical flow: Horn-Schunck
• Robust estimation
• Applications (1)
Seeing motion from a static picture? http://www.ritsumei.ac.jp/~akitaoka/index-e.html
More examples
How is this possible?
• The true mechanism remains to be revealed
• fMRI data suggest that the illusion is related to some component of eye movements
• We don't expect computer vision to "see" motion from these stimuli, yet
What do you see?
In fact, …
We still haven't touched these areas
Motion analysis: human vs. computer
• Computers can only analyze motion for opaque and solid objects
• Challenges:
– Shapeless or transparent scenes
• Key: motion representation
Contents
• Motion perception
• Motion representation
• Parametric motion: Lucas-Kanade
• Dense optical flow: Horn-Schunck
• Robust estimation
• Applications (1)
Motion forms
• Mapping: $(x_1, y_1) \to (x_2, y_2)$
• Global parametric motion: $(x_2, y_2) = f(x_1, y_1; \theta)$
• Motion types
– Translation: $\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = \begin{bmatrix} x_1 + a \\ y_1 + b \end{bmatrix}$
– Similarity: $\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = s \begin{bmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{bmatrix} \begin{bmatrix} x_1 + a \\ y_1 + b \end{bmatrix}$
– Affine: $\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = \begin{bmatrix} a x_1 + b y_1 + c \\ d x_1 + e y_1 + f \end{bmatrix}$
– Homography: $\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = \frac{1}{z} \begin{bmatrix} a x_1 + b y_1 + c \\ d x_1 + e y_1 + f \end{bmatrix}$, $z = g x_1 + h y_1 + 1$
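To make the four motion types concrete, here is a minimal numpy sketch (not from the slides) that applies each warp to a single point; the parameter names ($a$–$h$, $s$, $\alpha$) follow the formulas above.

```python
import numpy as np

def warp_point(x1, y1, motion, params):
    """Apply a global parametric motion (x2, y2) = f(x1, y1; theta) to one point."""
    if motion == "translation":        # theta = (a, b)
        a, b = params
        return x1 + a, y1 + b
    if motion == "similarity":         # theta = (s, alpha, a, b): scaled rotation of the shifted point
        s, alpha, a, b = params
        c, sn = np.cos(alpha), np.sin(alpha)
        return (s * ( c * (x1 + a) + sn * (y1 + b)),
                s * (-sn * (x1 + a) +  c * (y1 + b)))
    if motion == "affine":             # theta = (a, b, c, d, e, f)
        a, b, c, d, e, f = params
        return a * x1 + b * y1 + c, d * x1 + e * y1 + f
    if motion == "homography":         # theta = (a, ..., h); divide by z = g*x1 + h*y1 + 1
        a, b, c, d, e, f, g, h = params
        z = g * x1 + h * y1 + 1.0
        return (a * x1 + b * y1 + c) / z, (d * x1 + e * y1 + f) / z
    raise ValueError(motion)

print(warp_point(2.0, 3.0, "affine", (1.1, 0.0, 5.0, 0.0, 0.9, -2.0)))  # (7.2, 0.7)
```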
Illustration of motion types
[Figure: example warps for each motion type; visible panel label: translation]
Optical flow field
• Parametric motion is limited and cannot describe the motion of arbitrary videos
• Optical flow field: assign a flow vector $(u(x, y), v(x, y))$ to each pixel $(x, y)$
• Projection from the 3D world to 2D
Optical flow field visualization
• Too messy to plot a flow vector for every pixel
• Map each flow vector to a color
– Magnitude: saturation
– Orientation: hue
[Figure: input frames, ground-truth flow field, and its color-coded visualization; visualization code from Baker et al. 2007]
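A minimal sketch of this color coding, assuming matplotlib for the HSV-to-RGB conversion; it follows the hue/saturation mapping above rather than the exact Baker et al. color wheel.

```python
import numpy as np
from matplotlib.colors import hsv_to_rgb

def flow_to_color(u, v, max_mag=None):
    """Color-code a flow field: orientation -> hue, magnitude -> saturation."""
    mag = np.sqrt(u ** 2 + v ** 2)
    ang = np.arctan2(v, u)                    # orientation in [-pi, pi]
    if max_mag is None:
        max_mag = mag.max() + 1e-9            # normalize by the largest motion
    hsv = np.stack([
        (ang + np.pi) / (2 * np.pi),          # hue in [0, 1]
        np.clip(mag / max_mag, 0.0, 1.0),     # saturation: stronger motion, more saturated
        np.ones_like(mag),                    # full brightness
    ], axis=-1)
    return hsv_to_rgb(hsv)                    # H x W x 3 RGB image in [0, 1]

# Usage: a synthetic rotational flow field
yy, xx = np.mgrid[-32:32, -32:32].astype(float)
rgb = flow_to_color(-yy, xx)
```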
Matching criterion
• Brightness constancy assumption: $I_1(x, y) = I_2(x + u, y + v) + n + o$, with noise $n \sim N(0, \sigma^2)$ and outlier $o \sim U(-1, 1)$ (occlusion, lighting change)
• Matching criteria
– What's invariant between the two images? Brightness, gradients, phase, other features…
– Distance metric (L2, L1, truncated L1, Lorentzian): $E(u, v) = \sum_{x, y} \rho\big( I_1(x, y) - I_2(x + u, y + v) \big)$
– Correlation, normalized cross correlation (NCC)
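As a concrete (assumed, not from the slides) illustration, the sketch below evaluates $E(u, v)$ for integer displacements and recovers a known shift by brute force; np.roll wraps around at the image borders, which a real implementation would mask out.

```python
import numpy as np

def matching_cost(I1, I2, u, v, rho=lambda z: z ** 2):
    """E(u, v) = sum_{x,y} rho(I1(x, y) - I2(x + u, y + v)) for one integer shift."""
    I2_shifted = np.roll(np.roll(I2, -v, axis=0), -u, axis=1)   # samples I2 at (x + u, y + v)
    return rho(I1 - I2_shifted).sum()

# Brute-force search over small displacements (block matching on the whole image)
I1 = np.random.rand(64, 64)
I2 = np.roll(I1, (2, 3), axis=(0, 1))           # I2 is I1 moved by v = 2, u = 3
best = min((matching_cost(I1, I2, u, v), u, v)
           for u in range(-4, 5) for v in range(-4, 5))
print(best)                                     # cost 0.0 at (u, v) = (3, 2)
```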
Error functions
• L2 norm: $\rho(z) = z^2$
• L1 norm: $\rho(z) = |z|$
• Truncated L1 norm: $\rho(z) = \min(|z|, \eta)$
• Lorentzian: $\rho(z) = \log(1 + \gamma z^2)$
[Figure: plots of the four error functions over $z \in [-2, 2]$]
Robust statistics
• Traditional L2 norm: only noise, no outlier
• Example: estimate the average of 0.95, 1.04, 0.91, 1.02, 1.10, 20.01
• Estimate with minimum error: $z^* = \arg\min_z \sum_i \rho(z - z_i)$
– L2 norm: $z^* = 4.172$
– L1 norm: $z^* = 1.038$
– Truncated L1: $z^* = 1.0296$
– Lorentzian: $z^* = 1.0147$
[Figure: the four error functions, as on the previous slide]
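These estimates can be reproduced (approximately) with a grid search; the η and γ below are illustrative choices, since the slides do not state the values used, so the truncated-L1 and Lorentzian optima need not match the numbers above exactly.

```python
import numpy as np

z_i = np.array([0.95, 1.04, 0.91, 1.02, 1.10, 20.01])
z = np.linspace(0.0, 5.0, 50001)                  # candidate estimates
r = z[:, None] - z_i[None, :]                     # residuals z - z_i

penalties = {
    "L2":           r ** 2,                       # pulled toward the outlier
    "L1":           np.abs(r),                    # objective is flat on [1.02, 1.04]
    "truncated L1": np.minimum(np.abs(r), 1.0),   # eta = 1.0 (assumed)
    "Lorentzian":   np.log1p(5.0 * r ** 2),       # gamma = 5.0 (assumed)
}
for name, rho in penalties.items():
    print(name, z[rho.sum(axis=1).argmin()])      # L2 -> ~4.172, the rest -> ~1.0
```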
Contents
• Motion perception
• Motion representation
• Parametric motion: Lucas-Kanade
• Dense optical flow: Horn-Schunck
• Robust estimation
• Applications (1)
Lucas-Kanade: problem setup
• Given two images $I_1(x, y)$ and $I_2(x, y)$, estimate a parametric motion that transforms $I_1$ to $I_2$
• Let $\mathbf{x} = (x, y)^T$ be a column vector indexing pixel coordinates
• Two typical transforms
– Translation: $W(\mathbf{x}; \mathbf{p}) = \begin{bmatrix} x + p_1 \\ y + p_2 \end{bmatrix}$
– Affine: $W(\mathbf{x}; \mathbf{p}) = \begin{bmatrix} p_1 x + p_2 y + p_3 \\ p_4 x + p_5 y + p_6 \end{bmatrix} = \begin{bmatrix} p_1 & p_2 & p_3 \\ p_4 & p_5 & p_6 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
• Goal of the Lucas-Kanade algorithm: $\mathbf{p}^* = \arg\min_{\mathbf{p}} \sum_{\mathbf{x}} \big[ I_2(W(\mathbf{x}; \mathbf{p})) - I_1(\mathbf{x}) \big]^2$
An incremental algorithm
• It is difficult to optimize the objective function $\mathbf{p}^* = \arg\min_{\mathbf{p}} \sum_{\mathbf{x}} \big[ I_2(W(\mathbf{x}; \mathbf{p})) - I_1(\mathbf{x}) \big]^2$ directly
• Instead, we optimize one increment at a time: $\Delta\mathbf{p}^* = \arg\min_{\Delta\mathbf{p}} \sum_{\mathbf{x}} \big[ I_2(W(\mathbf{x}; \mathbf{p} + \Delta\mathbf{p})) - I_1(\mathbf{x}) \big]^2$
• The transform parameters are then updated: $\mathbf{p} \leftarrow \mathbf{p} + \Delta\mathbf{p}^*$
Taylor expansion
• The term $I_2(W(\mathbf{x}; \mathbf{p} + \Delta\mathbf{p}))$ is highly nonlinear
• Taylor expansion: $I_2(W(\mathbf{x}; \mathbf{p} + \Delta\mathbf{p})) \approx I_2(W(\mathbf{x}; \mathbf{p})) + \nabla I_2 \frac{\partial W}{\partial \mathbf{p}} \Delta\mathbf{p}$
• $\frac{\partial W}{\partial \mathbf{p}}$ is the Jacobian of the warp: if $W(\mathbf{x}; \mathbf{p}) = \big( W_x(\mathbf{x}; \mathbf{p}), W_y(\mathbf{x}; \mathbf{p}) \big)^T$, then
$$\frac{\partial W}{\partial \mathbf{p}} = \begin{bmatrix} \frac{\partial W_x}{\partial p_1} & \cdots & \frac{\partial W_x}{\partial p_n} \\ \frac{\partial W_y}{\partial p_1} & \cdots & \frac{\partial W_y}{\partial p_n} \end{bmatrix}$$
Jacobian matrix
• For the affine transform $W(\mathbf{x}; \mathbf{p}) = \begin{bmatrix} p_1 & p_2 & p_3 \\ p_4 & p_5 & p_6 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$, the Jacobian is $\frac{\partial W}{\partial \mathbf{p}} = \begin{bmatrix} x & y & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & x & y & 1 \end{bmatrix}$
• For translation $W(\mathbf{x}; \mathbf{p}) = \begin{bmatrix} x + p_1 \\ y + p_2 \end{bmatrix}$, the Jacobian is $\frac{\partial W}{\partial \mathbf{p}} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
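A direct transcription of these two Jacobians into code (a sketch; the function and transform names are chosen here for illustration):

```python
import numpy as np

def warp_jacobian(x, y, transform):
    """dW/dp evaluated at pixel (x, y): a 2 x n matrix, n = number of parameters."""
    if transform == "translation":   # W(x; p) = (x + p1, y + p2)
        return np.eye(2)
    if transform == "affine":        # W(x; p) = (p1*x + p2*y + p3, p4*x + p5*y + p6)
        return np.array([[x,   y,   1.0, 0.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0, x,   y,   1.0]])
    raise ValueError(transform)
```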
Taylor expansion (cont.)
• $\nabla I_2 = [I_x \; I_y]$ is the gradient of image $I_2$ evaluated at $W(\mathbf{x}; \mathbf{p})$: compute the gradients in the coordinate frame of $I_2$ and warp them back to the coordinate frame of $I_1$
• For the affine transform: $\nabla I_2 \frac{\partial W}{\partial \mathbf{p}} = [I_x x \;\; I_x y \;\; I_x \;\; I_y x \;\; I_y y \;\; I_y]$
• Let matrix $\mathbf{B} = [\mathbf{I}_x \mathbf{X} \;\; \mathbf{I}_x \mathbf{Y} \;\; \mathbf{I}_x \;\; \mathbf{I}_y \mathbf{X} \;\; \mathbf{I}_y \mathbf{Y} \;\; \mathbf{I}_y] \in \mathbb{R}^{n \times 6}$, where $\mathbf{I}_x$ and $\mathbf{X}$ are both column vectors (one entry per pixel) and $\mathbf{I}_x \mathbf{X}$ denotes element-wise multiplication
Gauss-Newton
• With the Taylor expansion, the objective function becomes $\Delta\mathbf{p}^* = \arg\min_{\Delta\mathbf{p}} \sum_{\mathbf{x}} \big[ I_2(W(\mathbf{x}; \mathbf{p})) + \nabla I_2 \frac{\partial W}{\partial \mathbf{p}} \Delta\mathbf{p} - I_1(\mathbf{x}) \big]^2$
• Or in vector form: $\Delta\mathbf{p}^* = \arg\min_{\Delta\mathbf{p}} (\mathbf{I}_t + \mathbf{B} \Delta\mathbf{p})^T (\mathbf{I}_t + \mathbf{B} \Delta\mathbf{p})$, where
$\mathbf{B} = [\mathbf{I}_x \mathbf{X} \;\; \mathbf{I}_x \mathbf{Y} \;\; \mathbf{I}_x \;\; \mathbf{I}_y \mathbf{X} \;\; \mathbf{I}_y \mathbf{Y} \;\; \mathbf{I}_y] \in \mathbb{R}^{n \times 6}$, $\quad \mathbf{I}_t = \mathbf{I}_2(W(\mathbf{p})) - \mathbf{I}_1$
• Solution: $\Delta\mathbf{p}^* = -(\mathbf{B}^T \mathbf{B})^{-1} \mathbf{B}^T \mathbf{I}_t$
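Putting the pieces together, here is a minimal sketch of one Gauss-Newton iteration for the affine warp, assuming scipy for the bilinear interpolation; out-of-bounds handling, weighting, and convergence tests are omitted.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def lk_affine_step(I1, I2, p):
    """One Gauss-Newton update for W(x; p) = (p1*x + p2*y + p3, p4*x + p5*y + p6).
    The identity warp is p = (1, 0, 0, 0, 1, 0)."""
    H, W = I1.shape
    Y, X = np.mgrid[0:H, 0:W].astype(float)
    Xw = p[0] * X + p[1] * Y + p[2]                         # warped x coordinates
    Yw = p[3] * X + p[4] * Y + p[5]                         # warped y coordinates
    # I2 and its gradients, evaluated at W(x; p) and brought back to I1's frame
    I2w = map_coordinates(I2, [Yw, Xw], order=1, mode='nearest')
    gy, gx = np.gradient(I2)
    Ix = map_coordinates(gx, [Yw, Xw], order=1, mode='nearest')
    Iy = map_coordinates(gy, [Yw, Xw], order=1, mode='nearest')
    It = (I2w - I1).ravel()                                 # I_t = I2(W(p)) - I1
    # B = [Ix*X, Ix*Y, Ix, Iy*X, Iy*Y, Iy], one row per pixel (n x 6)
    B = np.stack([Ix * X, Ix * Y, Ix, Iy * X, Iy * Y, Iy], axis=-1).reshape(-1, 6)
    dp = -np.linalg.solve(B.T @ B, B.T @ It)                # dp = -(B^T B)^{-1} B^T I_t
    return p + dp
```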
Translation
• Jacobian: $\frac{\partial W}{\partial \mathbf{p}} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
• $\nabla I_2 \frac{\partial W}{\partial \mathbf{p}} = [I_x \; I_y]$
• $\mathbf{B} = [\mathbf{I}_x \; \mathbf{I}_y] \in \mathbb{R}^{n \times 2}$
• Solution: $\Delta\mathbf{p}^* = -(\mathbf{B}^T \mathbf{B})^{-1} \mathbf{B}^T \mathbf{I}_t = -\begin{bmatrix} \mathbf{I}_x^T \mathbf{I}_x & \mathbf{I}_x^T \mathbf{I}_y \\ \mathbf{I}_y^T \mathbf{I}_x & \mathbf{I}_y^T \mathbf{I}_y \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{I}_x^T \mathbf{I}_t \\ \mathbf{I}_y^T \mathbf{I}_t \end{bmatrix}$
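For translation the update reduces to this 2×2 system; a compact sketch (again assuming scipy, here for the subpixel shift), iterated until the increment is small.

```python
import numpy as np
from scipy.ndimage import shift as nd_shift

def lk_translation(I1, I2, num_iters=20):
    """Lucas-Kanade for pure translation p = (u, v), via the 2x2 normal equations."""
    p = np.zeros(2)
    gy, gx = np.gradient(I2)                    # gradients in I2's coordinate frame
    for _ in range(num_iters):
        # Evaluate I2, Ix, Iy at the warped coordinates (x + u, y + v)
        I2w = nd_shift(I2, (-p[1], -p[0]), order=1, mode='nearest')
        Ix = nd_shift(gx, (-p[1], -p[0]), order=1, mode='nearest')
        Iy = nd_shift(gy, (-p[1], -p[0]), order=1, mode='nearest')
        It = (I2w - I1).ravel()
        B = np.stack([Ix.ravel(), Iy.ravel()], axis=-1)    # B = [Ix  Iy], n x 2
        p += -np.linalg.solve(B.T @ B, B.T @ It)           # dp = -(B^T B)^{-1} B^T I_t
    return p
```

Like all gradient-based Lucas-Kanade solvers, this only converges when the true displacement is small, which is what the coarse-to-fine scheme below addresses.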
How it works
Coarse-to-fine refinement
• Lucas-Kanade is a greedy algorithm that converges to a local minimum
• Initialization is crucial: if initialized with zero, the underlying motion must be small
• If the underlying transform is significant, coarse-to-fine estimation is a must (see the sketch below)
[Figure: Gaussian pyramid built by smoothing and ×2 downsampling; estimate $(u, v)$ at the coarsest level, then refine $(u_1, v_1)$, $(u_2, v_2)$, … upward, scaling the flow by 2 at each level]
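A sketch of the coarse-to-fine driver, assuming a step function with the signature lk_step(I1, I2, p0) (e.g. a variant of the translational solver above that accepts an initial estimate); the pyramid depth and smoothing sigma are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def coarse_to_fine(I1, I2, lk_step, num_levels=4):
    """Estimate a translation by refining from the coarsest pyramid level to the finest."""
    # Build Gaussian pyramids: smooth, then 2x downsample
    pyr1, pyr2 = [I1], [I2]
    for _ in range(num_levels - 1):
        pyr1.append(gaussian_filter(pyr1[-1], 1.0)[::2, ::2])
        pyr2.append(gaussian_filter(pyr2[-1], 1.0)[::2, ::2])
    p = np.zeros(2)
    for level in reversed(range(num_levels)):   # coarsest level first
        p = lk_step(pyr1[level], pyr2[level], p)
        if level > 0:
            p = 2.0 * p                         # motion doubles at the next finer level
    return p
```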
Variations
• Variations of Lucas-Kanade:
– Additive algorithm [Lucas & Kanade, 1981]
– Compositional algorithm [Shum & Szeliski, 1998]
– Inverse compositional algorithm [Baker & Matthews, 2001]
– Inverse additive algorithm [Hager & Belhumeur, 1998]
• Although the inverse algorithms run faster (they avoid recomputing the Hessian), they have the same complexity as the forward algorithms once robust error functions are used!