Structure From Motion EECS 442 – David Fouhey Fall 2019, University of Michigan http://web.eecs.umich.edu/~fouhey/teaching/EECS442_F19/

Structure from Motion

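Structure from motion
Have: 2D points $p_{ij}$ seen in $m$ images.
Assume: points generated from $n$ fixed 3D points $X_j$ and cameras $M_i$, i.e. $p_{ij} \equiv M_i X_j$, or equivalently $\lambda p_{ij} = M_i X_j$ with $\lambda \neq 0$.
Want: cameras $M_i$, points $X_j$.
(Remember: $M_i \equiv K_i [R_i, t_i]$.)
[Diagram: 3D point $X_j$ (unknown) projecting to $p_{1j}$, $p_{2j}$, $p_{3j}$ (known) in cameras $M_1$, $M_2$, $M_3$ (unknown).] Diagram credit: S. Lazebnik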

Is SFM always uniquely solvable? Counterexample: the Necker cube. Source: N. Snavely

Structure from motion ambiguities
Let’s first find one easy ambiguity: $p_{ij} \equiv M_i X_j$, with dimensions 3x1, 3x4, 4x1.

[Image: still from Zoolander, 2001]

Structure from motion ambiguities
Let’s first find one easy ambiguity: $p_{ij} \equiv M_i X_j$.
Can pick any arbitrary scaling factor $k$ and adjust the cameras and points: $p_{ij} \equiv (M_i k^{-1})(k X_j)$.
(Can usually be fixed in practice: just need a number, obtainable from heights of known objects or an IMU.)

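Structure from motion ambiguity
Does this diagram change meaning if I use this coordinate system? Versus this coordinate system? Coordinate system irrelevant! So global $R$, $t$ also ambiguous.
[Diagram: 3D point $X_j$ projecting to $p_{1j}$, $p_{2j}$, $p_{3j}$ in cameras $M_1$, $M_2$, $M_3$, drawn with two different choices of x, y, z axes and origin 0.]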

Structure from motion ambiguities
Not just limited to scale. Given: $p_{ij} \equiv M_i X_j$.
Can insert any global transform $H$: $p_{ij} \equiv M_i X_j = (M_i H^{-1})(H X_j)$.
$H$ is a 3D homography / perspective transform / projective transform.
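The ambiguity can be checked numerically. The following is a minimal sketch, assuming NumPy and made-up values for the camera, the points, and the 4x4 transform H (none of these numbers come from the lecture): transforming the camera by the inverse of H and the points by H leaves the projected pixels unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

def project(M, X):
    """Project homogeneous 4xN points X with a 3x4 camera M; return 2xN pixels."""
    p = M @ X
    return p[:2] / p[2]

# Illustrative setup: canonical camera looking at points 4 to 6 units away.
M = np.hstack([np.eye(3), np.zeros((3, 1))])
X = np.vstack([rng.uniform(-1, 1, size=(2, 5)),
               rng.uniform(4, 6, size=(1, 5)),
               np.ones((1, 5))])

# An arbitrary invertible 4x4 transform (a small perturbation of the identity).
H = np.eye(4) + 0.1 * rng.normal(size=(4, 4))

M_alt = M @ np.linalg.inv(H)   # transformed camera, M H^-1
X_alt = H @ X                  # transformed points, H X

same_pixels = np.allclose(project(M, X), project(M_alt, X_alt))
print(same_pixels)
```

The cameras and points differ between the two reconstructions, yet the images are identical, so image measurements alone cannot tell them apart.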

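Similarity/Affine/Perspective
Each transform preserves more than the one before it:
Perspective (preserves lines): $\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$
Affine (+ parallelism): $\begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix}$
Similarity (+ angles): $\begin{bmatrix} sR & t \\ 0 & 1 \end{bmatrix}$
3D: same idea, different dimensions.
[Figure: a given house image warped by each transform.] House image: A. Efros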

Projective ambiguity
With no constraints on camera matrices and scene, can only reconstruct up to a perspective ambiguity $H$: $p_{ij} \equiv M_i X_j = (M_i H^{-1})(H X_j)$.
Slide credit: S. Lazebnik

Projective ambiguity Slide credit: S. Lazebnik

Affine ambiguity
If we have constraints in the form of what lines are parallel, can reduce ambiguity to affine ambiguity:
$$H = \begin{bmatrix} A & t \\ 0 & 1 \end{bmatrix} \text{ (affine)}, \qquad p_{ij} \equiv M_i X_j = (M_i H^{-1})(H X_j)$$
Slide credit: S. Lazebnik

Affine ambiguity Slide credit: S. Lazebnik

Similarity ambiguity
If we have orthogonality constraints, get up to similarity transform. Really the best we can do. We get this if we have calibrated cameras.
$$H = \begin{bmatrix} sR & t \\ 0 & 1 \end{bmatrix}, \qquad p_{ij} \equiv M_i X_j = (M_i H^{-1})(H X_j)$$
Slide credit: S. Lazebnik

Similarity ambiguity Slide credit: S. Lazebnik

Affine structure from motion
We’ll do the math with affine / weak perspective cameras (math is much easier).
[Figure: perspective vs. weak perspective projection.]

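Recall: orthographic projection
Orthographic camera: things infinitely far away but you have an amazing camera.
Projection along the z direction from world to image, $(x, y, z, 1) \to (x, y)$:
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$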

Field of view and focal length
[Figure: the same scene shot with standard, wide-angle, and telephoto lenses.] Slide credit: F. Durand

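Affine Camera
$$M = \begin{bmatrix} A_{2D} & t_{2D} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} A_{3D} & t_{3D} \\ 0 & 1 \end{bmatrix}$$
Factors, left to right: 3x3 matrix (affine 2D), 3x4 orthographic projection, 4x4 matrix (affine 3D).
Tedious math…
$$M = \begin{bmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$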

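Affine Camera
So what? Who cares? Examine the projection:
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \equiv \begin{bmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
Projection becomes linear mapping + translation and doesn’t involve homogeneous coordinates!
$$\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$
$b$ is projection of origin. Can anyone see why?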
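A small numerical illustration of the affine camera, assuming NumPy and made-up values for the 2x3 linear part A and the 2-vector b. It shows that the homogeneous 3x4 form and the "linear map plus translation" form agree, and that the 3D origin lands on b.

```python
import numpy as np

# Made-up affine camera: 2x3 linear part A and 2-vector translation b.
A = np.array([[1.0, 0.0, 0.5],
              [0.0, 2.0, -1.0]])
b = np.array([3.0, 4.0])

def affine_project(A, b, X):
    """Project 3xN points with p = A X + b; no homogeneous divide is needed."""
    return A @ X + b[:, None]

# The same camera written as a 3x4 matrix whose last row is [0, 0, 0, 1].
M = np.vstack([np.hstack([A, b[:, None]]), [0.0, 0.0, 0.0, 1.0]])

# Columns are 3D points; the first column is the origin.
X = np.array([[0.0, 1.0, 2.0],
              [0.0, 1.0, 0.0],
              [0.0, 1.0, 5.0]])

p = affine_project(A, b, X)
p_h = M @ np.vstack([X, np.ones((1, X.shape[1]))])

print(p[:, 0])   # projection of the origin equals b
print(p_h[2])    # third homogeneous coordinate stays 1 for every point
```

Because the third coordinate is always 1, the perspective divide is a no-op, which is what makes the later factorization linear.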

Affine structure from motion
General structure from motion: $p_{ij} \equiv M_i X_j$ (3x1, 3x4, 4x1).
Assume $M$ is affine camera: $p_{ij} = A_i X_j + b_i$ (2x1 = 2x3 times 3x1, plus 2x1).
$mn$ 2D points, $m$ cameras, $n$ 3D points, up to arbitrary 3D affine (12 DOF).
Need: $2mn \ge 8m + 3n - 12$.
($m = 2$): $n \ge 4$ (for all $m$!)
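The counting inequality can be solved for n directly. This is a short worked step, not taken from the slides: each 2D point contributes 2 equations, each affine camera has 8 unknowns, each 3D point has 3, and 12 are absorbed by the 3D affine ambiguity. The factor (2m - 3) is positive for m >= 2, so it can be divided out.

```latex
\begin{align*}
2mn &\ge 8m + 3n - 12 \\
n(2m - 3) &\ge 8m - 12 = 4(2m - 3) \\
n &\ge 4 \quad \text{for every } m \ge 2
\end{align*}
```

This is why the bound of 4 points holds regardless of how many cameras are used.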

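One simplifying trick
Subtract off the average 2D point. Starting from $p_{ij} = A_i X_j + b_i$:
$$\hat{p}_{ij} = p_{ij} - \frac{1}{n}\sum_{k=1}^{n} p_{ik} = A_i X_j + b_i - \frac{1}{n}\sum_{k=1}^{n} \left( A_i X_k + b_i \right)$$
Gather terms involving $A_i$, push out $b_i$:
$$\hat{p}_{ij} = A_i \left( X_j - \frac{1}{n}\sum_{k=1}^{n} X_k \right) + \underbrace{b_i - \frac{1}{n}\sum_{k=1}^{n} b_i}_{0}$$
Set origin to mean of 3D points, so that $\frac{1}{n}\sum_{k} X_k = 0$. Can do this entirely in terms of $A$!
$$\hat{p}_{ij} = A_i X_j$$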
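The centering trick is easy to verify numerically. A minimal sketch, assuming NumPy and random made-up values for one affine camera (A, b) and its 3D points: subtracting the mean 2D point removes b, and the result equals A applied to the mean-centered 3D points.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6                                  # number of 3D points (arbitrary choice)

A = rng.normal(size=(2, 3))            # made-up affine camera, linear part
b = rng.normal(size=2)                 # made-up affine camera, translation
X = rng.normal(size=(3, n))            # made-up 3D points, one per column

p = A @ X + b[:, None]                            # observed 2D points
p_hat = p - p.mean(axis=1, keepdims=True)         # subtract the average 2D point
X_centered = X - X.mean(axis=1, keepdims=True)    # origin moved to the 3D mean

print(np.allclose(p_hat, A @ X_centered))
```

Note that only the 2D mean is needed in practice; the 3D mean is never computed, it is simply declared to be the origin.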

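Affine structure from motion
First, make data (measurement) matrix consisting of all the points stacked together:
$$D = \begin{bmatrix} \hat{p}_{11} & \cdots & \hat{p}_{1n} \\ \vdots & \ddots & \vdots \\ \hat{p}_{m1} & \cdots & \hat{p}_{mn} \end{bmatrix} = \begin{bmatrix} \hat{u}_{11} & \cdots & \hat{u}_{1n} \\ \hat{v}_{11} & \cdots & \hat{v}_{1n} \\ \vdots & \ddots & \vdots \\ \hat{u}_{m1} & \cdots & \hat{u}_{mn} \\ \hat{v}_{m1} & \cdots & \hat{v}_{mn} \end{bmatrix}$$
Rows: $m$ cameras. Columns: $n$ points. How big is this matrix?
C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992.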

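Affine structure from motion
Then, write all the equations in one in terms of product of cameras and points:
$$\begin{bmatrix} \hat{p}_{11} & \cdots & \hat{p}_{1n} \\ \vdots & \ddots & \vdots \\ \hat{p}_{m1} & \cdots & \hat{p}_{mn} \end{bmatrix} = \begin{bmatrix} A_1 \\ \vdots \\ A_m \end{bmatrix} \begin{bmatrix} X_1 & \cdots & X_n \end{bmatrix}$$
That is, $D = M S$ with $D$ of size $2m \times n$, $M$ of size $2m \times 3$, and $S$ of size $3 \times n$.
What’s the rank of $D$? 3!
C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992.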

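Making Matrices Rank Deficient
Repeat of epipolar geometry class, but important enough to see twice.
Given matrix $M$ of size $m \times n$, compute the SVD $M \to U \Sigma V^T$, where $U_{m \times m}$ and $V_{n \times n}$ are rotation matrices (more precisely, orthogonal matrices) and $\Sigma_{m \times n}$ is a diagonal scaling matrix:
$$\Sigma = \begin{bmatrix} \sigma_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_m \end{bmatrix}$$
Keep only the $k$ biggest $\sigma$; set the others to 0 to get $\hat{\Sigma}$, then $\hat{M} \leftarrow U \hat{\Sigma} V^T$.
This minimizes $\|M - \hat{M}\|_F$ (sum of squares) subject to $\mathrm{rank}(\hat{M}) \le k$.
See the Eckart-Young-Mirsky theorem if you’re interested.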
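A minimal sketch of the truncation step, assuming NumPy and a random made-up matrix. The helper name `low_rank_approx` is hypothetical. It also checks the Eckart-Young-Mirsky consequence that the Frobenius error equals the root sum of squares of the discarded singular values.

```python
import numpy as np

def low_rank_approx(M, k):
    """Rank-k approximation of M by zeroing all but the k largest singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_hat = s.copy()
    s_hat[k:] = 0.0                  # singular values come sorted in descending order
    return (U * s_hat) @ Vt, s       # (U * s_hat) is U @ diag(s_hat)

rng = np.random.default_rng(2)
M = rng.normal(size=(5, 4))          # made-up full-rank matrix
M_hat, s = low_rank_approx(M, k=2)

error = np.linalg.norm(M - M_hat)    # Frobenius norm is the default for 2D arrays
print(error, np.sqrt(np.sum(s[2:] ** 2)))
```

Using `full_matrices=False` avoids building the full square U, which matters when one dimension is much larger than the other.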

Affine structure from motion
We’d like to take the measurements $D$ and convert them into $M$, $S$: $D = M S$, where $D$ is $2m \times n$, $M$ is $2m \times 3$, and $S$ is $3 \times n$.
Remake of M. Hebert diagram

Affine structure from motion
Do SVD (typically you don’t make the full $U$, $\Sigma$, $V$): $D = U \Sigma V^T$, with $D$ and $U$ of size $2m \times n$ and $\Sigma$, $V^T$ of size $n \times n$.
Truncate to top 3 singular values: $D = U_3 \Sigma_3 V_3^T$.
Remake of M. Hebert diagram

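Affine structure from motion
Nearly there apart from this annoying $\Sigma_3$: $D = U_3 \Sigma_3 V_3^T$.
One solution (split $\Sigma_3$ in two): $D = (U_3 \Sigma_3^{1/2})(\Sigma_3^{1/2} V_3^T)$, with $M = U_3 \Sigma_3^{1/2}$ and $S = \Sigma_3^{1/2} V_3^T$.
But remember that in $D = M S$ we can put $H H^{-1}$ in the middle: $D = (M H)(H^{-1} S)$.
Remake of M. Hebert diagram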
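The steps so far (center, stack, SVD, truncate to rank 3, split the singular values) can be sketched end to end. This is a minimal sketch assuming NumPy, noise-free synthetic data, and every point visible in every image; the function name `affine_factorization` and the `(m, n, 2)` input layout are choices made for this example, not part of the lecture.

```python
import numpy as np

def affine_factorization(p):
    """Factor 2D tracks into affine cameras and 3D points.

    p: array of shape (m, n, 2), point j in image i.
    Returns D (2m x n), M (2m x 3) and S (3 x n) with D ~= M @ S.
    """
    m, n, _ = p.shape
    centered = p - p.mean(axis=1, keepdims=True)        # subtract each image's mean point
    D = centered.transpose(0, 2, 1).reshape(2 * m, n)   # rows: u_1, v_1, u_2, v_2, ...
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    root = np.sqrt(s[:3])                               # split Sigma_3 in two
    M = U[:, :3] * root                                 # U_3 Sigma_3^(1/2)
    S = root[:, None] * Vt[:3]                          # Sigma_3^(1/2) V_3^T
    return D, M, S

# Synthetic data: 4 random affine cameras observing 10 random 3D points.
rng = np.random.default_rng(0)
m, n = 4, 10
X = rng.normal(size=(3, n))
p = np.stack([(rng.normal(size=(2, 3)) @ X + rng.normal(size=(2, 1))).T
              for _ in range(m)])

D, M, S = affine_factorization(p)
print(np.allclose(M @ S, D))
```

The recovered M and S reproduce the measurements exactly here, but they match the true cameras and points only up to an unknown invertible 3x3 matrix, which is the affine ambiguity.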

Eliminating the affine ambiguity
Rows of $A_i$ give axes of camera. Can multiply each projection $A_i$ with $C$ to make $A_i C$ whose rows $a_1$, $a_2$ satisfy:
$$a_1^T a_2 = 0, \qquad \|a_1\| = 1, \qquad \|a_2\| = 1$$
Gives 3 equations per camera; can set $A_i C$ to new camera, and $C^{-1} S$ to new points.
In general, a recipe for eliminating ambiguities.
[Diagram: camera axes $a_1$, $a_2$ and 3D point $X$ projecting to $p$.] Remake of M. Hebert diagram
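One common way to solve for C, sketched below, is an assumption of this example rather than something spelled out on the slide: substitute the symmetric matrix Q = C C^T, which makes the three constraints per camera linear in the 6 unknowns of Q, solve by least squares, and recover C by Cholesky factorization. The helper names `sym_row` and `metric_upgrade` are hypothetical, and the data is synthetic and noise-free.

```python
import numpy as np

def sym_row(x, y):
    """Coefficients of x^T Q y in the unknowns (q11, q12, q13, q22, q23, q33) of symmetric Q."""
    return np.array([x[0] * y[0],
                     x[0] * y[1] + x[1] * y[0],
                     x[0] * y[2] + x[2] * y[0],
                     x[1] * y[1],
                     x[1] * y[2] + x[2] * y[1],
                     x[2] * y[2]])

def metric_upgrade(M):
    """M: 2m x 3 stacked affine cameras. Returns C so the rows of each A_i C are orthonormal."""
    rows, rhs = [], []
    for i in range(0, M.shape[0], 2):
        a1, a2 = M[i], M[i + 1]
        rows += [sym_row(a1, a1), sym_row(a2, a2), sym_row(a1, a2)]
        rhs += [1.0, 1.0, 0.0]                 # unit norms and orthogonality
    q, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    Q = np.array([[q[0], q[1], q[2]],
                  [q[1], q[3], q[4]],
                  [q[2], q[4], q[5]]])
    return np.linalg.cholesky(Q)               # raises if Q is not positive definite

# Synthetic test: orthonormal camera rows distorted by a made-up affine mix G.
rng = np.random.default_rng(3)
G = np.eye(3) + 0.1 * rng.normal(size=(3, 3))
cams = []
for _ in range(4):
    R, _unused = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
    cams.append(R[:2] @ np.linalg.inv(G))
M = np.vstack(cams)

C = metric_upgrade(M)
M_metric = M @ C                                # new cameras; new points would be inv(C) @ S
print(np.allclose(M_metric[0:2] @ M_metric[0:2].T, np.eye(2)))
```

With noisy measurements the least-squares Q may fail to be positive definite, in which case the Cholesky step raises an error and a different recovery of C would be needed. The result is still only defined up to a global rotation.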

Reconstruction results C. Tomasi and T. Kanade, Shape and motion from image streams under orthography: A factorization method, IJCV 1992

Dealing with missing data
So far, we have assumed we can see all points in all views. In reality, the measurement matrix typically looks like this:
[Figure: sparse cameras-by-points measurement matrix.]
Possible solution: find dense blocks, solve in each block, fuse. In general, finding these dense blocks is NP-complete.
Figure credit: S. Lazebnik

But cameras aren’t affine!
Want: $m$ cameras $M_i$, $n$ 3D points $X_j$. Given: $mn$ 2D points $p_{ij}$.
$$p_{ij} \equiv M_i X_j = (M_i H^{-1})(H X_j)$$

When is this Possible?
Want: $m$ cameras $M_i$, $n$ 3D points $X_j$. Given: $mn$ 2D points $p_{ij}$.
$$p_{ij} \equiv M_i X_j = (M_i H^{-1})(H X_j)$$
Degrees of freedom: 2D point (2), 3x4 camera matrix (11, why?), 3D point (3), 4x4 homography (15, why?).
Need $2mn \ge 11m + 3n - 15$:
($m = 2$): $n \ge 7$
($m = 3$): $n \ge 6$ (doesn’t get better after)
($m = 1$): $n \le 4$
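The counting argument can be tabulated with a few lines of plain Python. This is a minimal sketch; the function names and the `max_n` search bound are choices made for this example. It only checks the necessary counting condition and says nothing about degenerate configurations.

```python
def enough_constraints(m, n, cam_dof=11, ambiguity_dof=15):
    """True if 2mn measurements cover the unknowns: cam_dof*m + 3n - ambiguity_dof."""
    return 2 * m * n >= cam_dof * m + 3 * n - ambiguity_dof

def min_points(m, max_n=50):
    """Smallest n (up to max_n) satisfying the counting inequality for m cameras."""
    return next(n for n in range(1, max_n + 1) if enough_constraints(m, n))

for m in (2, 3, 4, 10):
    print(m, min_points(m))
# prints:
# 2 7
# 3 6
# 4 6
# 10 6
```

For a single camera the inequality reads n <= 4, so `enough_constraints(1, 4)` is true while `enough_constraints(1, 5)` is false. Passing `cam_dof=8, ambiguity_dof=12` gives the affine count instead.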

Two Camera Case
For two cameras, we need 7 points. Hmm. What else (in theory) requires 7 points?
Compute fundamental matrix $F$ and epipole $b$ s.t. $F^T b = 0$. Then:
$$M_1 = [I, 0], \qquad M_2 = [-[b]_\times F, \; b]$$
Remember: this is up to a projective ambiguity!
[Diagram: 3D point $X$ projecting to $p$ and $p'$ in cameras $M_1$ and $M_2$, with epipole $b$.]
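A minimal sketch of building this camera pair from a fundamental matrix, assuming NumPy, a synthetic rank-2 F, and the convention that points p in image 1 and p' in image 2 satisfy p'^T F p = 0 (the convention is an assumption of this example). The helper names are hypothetical. The epipole b is taken as the left singular vector of F for the zero singular value.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def cameras_from_F(F):
    """Return M1 = [I, 0], M2 = [-[b]_x F, b] and the epipole b with F^T b = 0."""
    U, s, Vt = np.linalg.svd(F)
    b = U[:, 2]                                   # left null vector of a rank-2 F
    M1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    M2 = np.hstack([-skew(b) @ F, b[:, None]])
    return M1, M2, b

# Synthetic rank-2 fundamental matrix: random 3x3 with its smallest singular value zeroed.
rng = np.random.default_rng(4)
U, s, Vt = np.linalg.svd(rng.normal(size=(3, 3)))
F = U @ np.diag([s[0], s[1], 0.0]) @ Vt

M1, M2, b = cameras_from_F(F)

# Any 3D points projected by this pair should satisfy the epipolar constraint.
X = np.vstack([rng.normal(size=(3, 8)), np.ones((1, 8))])
p1, p2 = M1 @ X, M2 @ X
residual = np.einsum('in,ij,jn->n', p2, F, p1)    # p2^T F p1 for each point
print(np.allclose(residual, 0.0, atol=1e-9))
```

The constraint holds for arbitrary X because the skew-symmetric term vanishes in the quadratic form and b^T F is zero, which is why this pair is consistent with F even though it is generally not the true metric pair.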

Incremental SFM
Key idea: incrementally add cameras, points.
[Diagram: grid of cameras ($M_1$, $M_2$, ...) versus points, with unknown entries marked “?”.] Remake of S. Lazebnik material. Note: numbers of points aren’t to scale.

Incremental SFM
Key idea: incrementally add cameras, points.
1. Initialize motion $M_i = [R_i, t_i]$ with fundamental matrix.
[Diagram: grid of cameras ($M_1$, $M_2$, ...) versus points, with unknown entries marked “?”.] Remake of S. Lazebnik material. Note: numbers of points aren’t to scale.
