
Structure From Motion, EECS 442, David Fouhey, Fall 2019, University of Michigan



  1. Structure From Motion. EECS 442, David Fouhey, Fall 2019, University of Michigan. http://web.eecs.umich.edu/~fouhey/teaching/EECS442_F19/

  2. Structure from Motion

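  3. Structure from motion. Have (known): 2D points $\mathbf{p}_{ij}$ seen in m images. Assume: points generated from n fixed 3D points $\mathbf{X}_j$ and cameras $\mathbf{M}_i$, or $\mathbf{p}_{ij} \equiv \mathbf{M}_i \mathbf{X}_j$. Want (unknown): cameras $\mathbf{M}_i$, points $\mathbf{X}_j$. (Remember) $\mathbf{M}_i \equiv \mathbf{K}_i [\mathbf{R}_i, \mathbf{t}_i]$ and $\lambda \mathbf{p}_{ij} = \mathbf{M}_i \mathbf{X}_j$, $\lambda \neq 0$. [Diagram: 3D point $\mathbf{X}_j$ projecting to $\mathbf{p}_{1j}$, $\mathbf{p}_{2j}$, $\mathbf{p}_{3j}$ in cameras $\mathbf{M}_1$, $\mathbf{M}_2$, $\mathbf{M}_3$.] Diagram credit: S. Lazebnik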

  4. Is SFM always uniquely solvable? Consider the Necker cube. Source: N. Snavely

  5. Structure from motion ambiguities. Let's first find one easy ambiguity: $\mathbf{p}_{ij} \equiv \mathbf{M}_i \mathbf{X}_j$, with sizes 3x1, 3x4, 4x1.

  6. Zoolander, 2001

  7. Structure from motion ambiguities. Let's first find one easy ambiguity: $\mathbf{p}_{ij} \equiv \mathbf{M}_i \mathbf{X}_j$. Can pick any arbitrary scaling factor k and adjust the cameras and points: $\mathbf{p}_{ij} \equiv \mathbf{M}_i k^{-1} k \mathbf{X}_j$. (Can usually be fixed in practice: just need a number, obtainable from heights of known objects or an IMU.)

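  8. Structure from motion ambiguity. Does this diagram change meaning if I use this coordinate system? Versus this coordinate system? Coordinate system irrelevant! So global R, t also ambiguous. [Diagram: the same $\mathbf{X}_j$, $\mathbf{p}_{1j}$, $\mathbf{p}_{2j}$, $\mathbf{p}_{3j}$, $\mathbf{M}_1$, $\mathbf{M}_2$, $\mathbf{M}_3$ configuration drawn with two different x, y, z axes at origin 0.]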

  9. Structure from motion ambiguities. Not just limited to scale. Given: $\mathbf{p}_{ij} \equiv \mathbf{M}_i \mathbf{X}_j$. Can insert any global transform H: $\mathbf{p}_{ij} \equiv \mathbf{M}_i \mathbf{X}_j = \mathbf{M}_i \mathbf{H}^{-1} \mathbf{H} \mathbf{X}_j$. H is a 3D homography / perspective transform / projective transform.
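The ambiguity above can be checked numerically. The following is a minimal sketch, assuming a synthetic scene (the cameras, points, and the matrix `H` are made up for illustration): it projects the points with the original cameras and again with cameras `M @ inv(H)` and points `H @ X`, and compares the resulting pixels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scene: 3 cameras [R | t] looking at 5 points near the origin.
t = np.array([[0.0], [0.0], [5.0]])          # keeps every point at positive depth
cameras = [np.hstack([np.linalg.qr(rng.normal(size=(3, 3)))[0], t])
           for _ in range(3)]
points = np.vstack([rng.uniform(-1, 1, size=(3, 5)), np.ones((1, 5))])

def project(M, X):
    """Project homogeneous 3D points X (4 x n) with camera M (3 x 4) to 2 x n pixels."""
    p = M @ X
    return p[:2] / p[2]

# An arbitrary invertible 4x4 matrix plays the role of the global transform H.
H = np.eye(4) + 0.1 * rng.normal(size=(4, 4))
H_inv = np.linalg.inv(H)

max_diff = 0.0
for M in cameras:
    original = project(M, points)
    transformed = project(M @ H_inv, H @ points)   # different cameras and points
    max_diff = max(max_diff, np.abs(original - transformed).max())

print(max_diff)  # differences are at floating-point round-off level
```

Both reconstructions explain the 2D measurements equally well, which is why the measurements alone cannot pick one of them.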

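  10. Similarity/Affine/Perspective. Given: [house image]. Perspective (preserves lines): $\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$. Affine (+parallelism): $\begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix}$. Similarity (+angles): $\begin{bmatrix} s\mathbf{R} & \mathbf{t} \\ \mathbf{0}^T & 1 \end{bmatrix}$. 3D: same idea, different dimensions. House image: A. Efros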

  11. Projective ambiguity. With no constraints on camera matrices and scene, can only reconstruct up to a perspective ambiguity H: $\mathbf{p}_{ij} \equiv \mathbf{M}_i \mathbf{X}_j = \mathbf{M}_i \mathbf{H}^{-1} \mathbf{H} \mathbf{X}_j$. Slide credit: S. Lazebnik

  12. Projective ambiguity Slide credit: S. Lazebnik

  13. Affine ambiguity. If we have constraints in the form of what lines are parallel, can reduce ambiguity to affine ambiguity: $\mathbf{H} = \begin{bmatrix} \mathbf{A} & \mathbf{t} \\ \mathbf{0}^T & 1 \end{bmatrix}$ (affine), with $\mathbf{p}_{ij} \equiv \mathbf{M}_i \mathbf{X}_j = \mathbf{M}_i \mathbf{H}^{-1} \mathbf{H} \mathbf{X}_j$. Slide credit: S. Lazebnik

  14. Affine ambiguity Slide credit: S. Lazebnik

  15. Similarity ambiguity. If we have orthogonality constraints, get up to similarity transform. Really the best we can do. We get this if we have calibrated cameras. $\mathbf{H} = \begin{bmatrix} s\mathbf{R} & \mathbf{t} \\ \mathbf{0}^T & 1 \end{bmatrix}$, with $\mathbf{p}_{ij} \equiv \mathbf{M}_i \mathbf{X}_j = \mathbf{M}_i \mathbf{H}^{-1} \mathbf{H} \mathbf{X}_j$. Slide credit: S. Lazebnik

  16. Similarity ambiguity Slide credit: S. Lazebnik

  17. Affine structure from motion. We'll do the math with affine / weak perspective cameras (math is much easier). [Figure panels: Perspective, Weak Perspective.]

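  18. Recall: orthographic projection. Orthographic camera: things infinitely far away but you have an amazing camera. Projection along the z direction from world to image: $\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \rightarrow (x, y)$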

  19. Field of view and focal length. [Figure panels: standard, wide-angle, telephoto.] Slide credit: F. Durand

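  20. Affine Camera. $\mathbf{M} = \begin{bmatrix} \mathbf{A}_{2D} & \mathbf{t}_{2D} \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \mathbf{A}_{3D} & \mathbf{t}_{3D} \\ \mathbf{0}^T & 1 \end{bmatrix}$: a 3x3 matrix (affine 2D) times a 3x4 orthographic projection times a 4x4 matrix (affine 3D). Tedious math... $\mathbf{M} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ 0 & 0 & 0 & 1 \end{bmatrix}$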

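  21. Affine Camera. So what? Who cares? Examine the projection: $\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \equiv \begin{bmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$. Projection becomes linear mapping + translation and doesn't involve homogeneous coordinates! $\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$. b is projection of origin. Can anyone see why?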
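A small numerical check of the affine camera, as a sketch with made-up camera entries: the homogeneous product and the "linear map plus translation" form give the same pixel, and the 3D origin projects to b because the linear part contributes nothing at X = 0.

```python
import numpy as np

# Hypothetical affine camera: the bottom row is [0, 0, 0, 1].
M = np.array([[0.9, 0.1, 0.0, 2.0],
              [-0.2, 0.8, 0.3, -1.0],
              [0.0, 0.0, 0.0, 1.0]])
A, b = M[:2, :3], M[:2, 3]      # linear part (2x3) and translation (2,)

X = np.array([1.0, 2.0, 3.0])   # an arbitrary 3D point

# Homogeneous form: the third coordinate is always 1, so no division is needed.
p_h = M @ np.append(X, 1.0)

# Inhomogeneous form: linear mapping plus translation.
p = A @ X + b

# The 3D origin maps to b, since A @ 0 = 0.
origin_image = A @ np.zeros(3) + b

print(p_h, p, origin_image)
```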

  22. Affine structure from motion. General structure from motion: $\mathbf{p}_{ij} \equiv \mathbf{M}_i \mathbf{X}_j$ (sizes 3x1, 3x4, 4x1). Assume M is an affine camera: $\mathbf{p}_{ij} = \mathbf{A}_i \mathbf{X}_j + \mathbf{b}_i$ (sizes 2x1, 2x3, 3x1, 2x1). Have mn 2D points, m cameras, n 3D points, all up to an arbitrary 3D affine transform (12 DOF). Need: $2mn \geq 8m + 3n - 12$. For m = 2: $n \geq 4$ (and the same bound holds for all m!).

  23. One simplifying trick. Subtract off the average 2D point. Starting from $\mathbf{p}_{ij} = \mathbf{A}_i \mathbf{X}_j + \mathbf{b}_i$: $\hat{\mathbf{p}}_{ij} = \mathbf{p}_{ij} - \frac{1}{n} \sum_{k=1}^{n} \mathbf{p}_{ik} = \mathbf{A}_i \mathbf{X}_j + \mathbf{b}_i - \frac{1}{n} \sum_{k=1}^{n} (\mathbf{A}_i \mathbf{X}_k + \mathbf{b}_i)$. Gather terms involving $\mathbf{A}_i$, push out $\mathbf{b}_i$: $\hat{\mathbf{p}}_{ij} = \mathbf{A}_i \left( \mathbf{X}_j - \frac{1}{n} \sum_{k=1}^{n} \mathbf{X}_k \right) + \left( \mathbf{b}_i - \frac{1}{n} \sum_{k=1}^{n} \mathbf{b}_i \right)$, where the second bracket is 0. Set origin to mean of 3D points: $\hat{\mathbf{p}}_{ij} = \mathbf{A}_i \hat{\mathbf{X}}_j$. Can do this entirely in terms of A!
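The centering trick can be verified on synthetic data. This sketch assumes one made-up affine camera (`A`, `b`) and random 3D points; after subtracting the mean 2D point, the translation drops out and the centered observations equal A times the centered 3D points.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.normal(size=(2, 3))      # hypothetical affine camera: linear part
b = rng.normal(size=(2, 1))      # hypothetical affine camera: translation
X = rng.normal(size=(3, n))      # 3D points stored as columns

p = A @ X + b                                # 2 x n observed image points
p_hat = p - p.mean(axis=1, keepdims=True)    # subtract the average 2D point
X_hat = X - X.mean(axis=1, keepdims=True)    # move the 3D origin to the centroid

# b has cancelled: p_hat = A @ X_hat up to round-off.
centering_error = np.abs(p_hat - A @ X_hat).max()
print(centering_error)
```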

  24. Affine structure from motion. First, make data measurement matrix consisting of all the points stacked together: $\mathbf{D} = \begin{bmatrix} \hat{\mathbf{p}}_{11} & \cdots & \hat{\mathbf{p}}_{1n} \\ \vdots & \ddots & \vdots \\ \hat{\mathbf{p}}_{m1} & \cdots & \hat{\mathbf{p}}_{mn} \end{bmatrix} = \begin{bmatrix} \hat{u}_{11} & \cdots & \hat{u}_{1n} \\ \hat{v}_{11} & \cdots & \hat{v}_{1n} \\ \vdots & \ddots & \vdots \\ \hat{u}_{m1} & \cdots & \hat{u}_{mn} \\ \hat{v}_{m1} & \cdots & \hat{v}_{mn} \end{bmatrix}$, with m cameras down the rows and n points across the columns. How big is this matrix? C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992.

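  25. Affine structure from motion. Then, write all the equations in one in terms of product of cameras and points: $\mathbf{D} = \begin{bmatrix} \hat{\mathbf{p}}_{11} & \cdots & \hat{\mathbf{p}}_{1n} \\ \vdots & \ddots & \vdots \\ \hat{\mathbf{p}}_{m1} & \cdots & \hat{\mathbf{p}}_{mn} \end{bmatrix} = \begin{bmatrix} \mathbf{A}_1 \\ \vdots \\ \mathbf{A}_m \end{bmatrix} \begin{bmatrix} \mathbf{X}_1 & \cdots & \mathbf{X}_n \end{bmatrix} = \mathbf{M}\mathbf{S}$, where D is 2m x n, M is 2m x 3, and S is 3 x n. What's the rank of D? It is 3. C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992.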

  26. Making Matrices Rank Deficient. Repeat of epipolar geometry class, but important enough to see twice. Given matrix M, decompose $\mathbf{M} \rightarrow \mathbf{U} \Sigma \mathbf{V}^T$ with rotation (orthogonal) matrices $\mathbf{U}_{m \times m}$, $\mathbf{V}_{n \times n}$ and diagonal scaling matrix $\Sigma_{m \times n} = \begin{bmatrix} \sigma_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_m \end{bmatrix}$. Keep only the k biggest σ; set others to 0, giving $\hat{\Sigma}$ and $\hat{\mathbf{M}} \leftarrow \mathbf{U} \hat{\Sigma} \mathbf{V}^T$. This minimizes $\| \mathbf{M} - \hat{\mathbf{M}} \|_F$ (sum of squares) subject to rank($\hat{\mathbf{M}}$) ≤ k. See the Eckart-Young-Mirsky theorem if you're interested.
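A minimal NumPy sketch of the truncation described above, on a random matrix (the helper name `low_rank_approx` is made up for illustration). It also checks the Eckart-Young-Mirsky statement that the Frobenius error equals the root sum of squares of the discarded singular values.

```python
import numpy as np

def low_rank_approx(M, k):
    """Best rank-k approximation of M in Frobenius norm, via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_trunc = s.copy()
    s_trunc[k:] = 0.0                    # keep only the k biggest singular values
    return (U * s_trunc) @ Vt, s         # U * s_trunc scales the columns of U

rng = np.random.default_rng(2)
M = rng.normal(size=(8, 5))              # hypothetical input matrix
M_hat, s = low_rank_approx(M, k=3)

frob_error = np.linalg.norm(M - M_hat, "fro")
expected_error = np.sqrt(np.sum(s[3:] ** 2))   # energy in the discarded sigmas
print(frob_error, expected_error)
```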

  27. Affine structure from motion. We'd like to take the measurements and convert them into M, S: $\mathbf{D}$ (2m x n) $= \mathbf{M}$ (2m x 3) $\times\ \mathbf{S}$ (3 x n). Remake of M. Hebert diagram

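  28. Affine structure from motion. Do SVD (typically you don't make full U, Σ, V): $\mathbf{D}$ (2m x n) $= \mathbf{U}$ (2m x n) $\times\ \Sigma$ (n x n) $\times\ \mathbf{V}^T$ (n x n). Truncate to top 3 singular values: $\mathbf{D} = \mathbf{U}_3 \Sigma_3 \mathbf{V}_3^T$. Remake of M. Hebert diagram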

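  29. Affine structure from motion. Nearly there apart from this annoying $\Sigma_3$: $\mathbf{D} = \mathbf{U}_3 \Sigma_3 \mathbf{V}_3^T$. One solution (split $\Sigma_3$ in two): $\mathbf{D} = (\mathbf{U}_3 \Sigma_3^{1/2}) (\Sigma_3^{1/2} \mathbf{V}_3^T)$, with $\mathbf{M} = \mathbf{U}_3 \Sigma_3^{1/2}$ and $\mathbf{S} = \Sigma_3^{1/2} \mathbf{V}_3^T$, so that $\mathbf{D} = \mathbf{M}\mathbf{S}$. But remember that we can put $\mathbf{H}\mathbf{H}^{-1}$ in the middle. Remake of M. Hebert diagram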
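The factorization steps (center, stack, SVD, truncate to rank 3, split the singular values) can be sketched on noise-free synthetic data. The function name `affine_factorization` and the random cameras and points are assumptions for illustration; the recovered M and S reproduce D but are only defined up to a 3x3 affine ambiguity.

```python
import numpy as np

def affine_factorization(D):
    """Factor a centered 2m x n measurement matrix into M (2m x 3) and S (3 x n)."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    sqrt_sigma = np.sqrt(s[:3])              # split Sigma_3 in two
    M = U[:, :3] * sqrt_sigma                # U_3 Sigma_3^{1/2}
    S = sqrt_sigma[:, None] * Vt[:3]         # Sigma_3^{1/2} V_3^T
    return M, S

# Hypothetical scene: m affine cameras observing n points, no noise.
rng = np.random.default_rng(3)
m, n = 4, 10
X = rng.normal(size=(3, n))
rows = []
for _ in range(m):
    A = rng.normal(size=(2, 3))
    b = rng.normal(size=(2, 1))
    p = A @ X + b
    rows.append(p - p.mean(axis=1, keepdims=True))   # per-camera centering
D = np.vstack(rows)                                   # 2m x n

M, S = affine_factorization(D)
reconstruction_error = np.abs(D - M @ S).max()
print(M.shape, S.shape, reconstruction_error)
```

With noisy measurements D is only approximately rank 3, and the same code returns the best rank-3 fit in the least-squares sense.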

  30. Eliminating the affine ambiguity. Rows $\mathbf{a}_1$, $\mathbf{a}_2$ of $\mathbf{A}_i$ give axes of camera. Can multiply each projection $\mathbf{A}_i$ with C to make $\mathbf{A}_i \mathbf{C}$ whose rows satisfy: $\mathbf{a}_1^T \mathbf{a}_2 = 0$, $\| \mathbf{a}_1 \| = 1$, $\| \mathbf{a}_2 \| = 1$. Gives 3 equations per camera; can set $\mathbf{A}_i \mathbf{C}$ to new camera, and $\mathbf{C}^{-1} \mathbf{S}$ to new points. In general, a recipe for eliminating ambiguities. Remake of M. Hebert diagram
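One common way to solve these constraints, sketched here as an assumption rather than as the slide's exact method: the three equations per camera are linear in the symmetric matrix L = C Cᵀ, so L can be found by least squares and C recovered with a Cholesky factorization. The helper names and the synthetic distortion `Q` are hypothetical.

```python
import numpy as np

def quad_row(u, v):
    """Coefficients of u^T L v in the 6 unique entries of a symmetric 3x3 L."""
    return np.array([u[0] * v[0],
                     u[0] * v[1] + u[1] * v[0],
                     u[0] * v[2] + u[2] * v[0],
                     u[1] * v[1],
                     u[1] * v[2] + u[2] * v[1],
                     u[2] * v[2]])

def metric_upgrade(M):
    """Find C such that every 2x3 block of M @ C has orthonormal rows."""
    rows, rhs = [], []
    for i in range(0, M.shape[0], 2):
        a1, a2 = M[i], M[i + 1]
        rows += [quad_row(a1, a1), quad_row(a2, a2), quad_row(a1, a2)]
        rhs += [1.0, 1.0, 0.0]                 # unit norms, zero dot product
    l, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    return np.linalg.cholesky(L)               # fails if L is not positive definite

# Hypothetical test data: orthographic cameras distorted by a known affine map Q.
rng = np.random.default_rng(4)
Q = np.array([[2.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 1.5]])
blocks = []
for _ in range(6):
    R = np.linalg.qr(rng.normal(size=(3, 3)))[0]   # random orthogonal matrix
    blocks.append(R[:2] @ np.linalg.inv(Q))        # distorted 2x3 camera
M = np.vstack(blocks)

C = metric_upgrade(M)
M_metric = M @ C
gram_error = max(np.abs(M_metric[i:i + 2] @ M_metric[i:i + 2].T - np.eye(2)).max()
                 for i in range(0, M.shape[0], 2))
print(gram_error)
```

The matching points would be updated as `np.linalg.inv(C) @ S`. With noisy data the least-squares L may fail to be positive definite, in which case the Cholesky step raises an error and a different strategy is needed.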

  31. Reconstruction results C. Tomasi and T. Kanade, Shape and motion from image streams under orthography: A factorization method, IJCV 1992

  32. Dealing with missing data. So far, assume we can see all points in all views. In reality, measurement matrix typically looks like this: [Figure: sparse cameras-by-points visibility matrix.] Possible solution: find dense blocks, solve in block, fuse. In general, finding these dense blocks is NP-complete. Figure credit: S. Lazebnik

  33. But cameras aren't affine! Want: m cameras $\mathbf{M}_i$, n 3D points $\mathbf{X}_j$. Given: mn 2D points $\mathbf{p}_{ij}$. $\mathbf{p}_{ij} \equiv \mathbf{M}_i \mathbf{X}_j = \mathbf{M}_i \mathbf{H}^{-1} \mathbf{H} \mathbf{X}_j$

  34. When is this Possible? Want: m cameras $\mathbf{M}_i$, n 3D points $\mathbf{X}_j$. Given: mn 2D points $\mathbf{p}_{ij}$. $\mathbf{p}_{ij} \equiv \mathbf{M}_i \mathbf{X}_j = \mathbf{M}_i \mathbf{H}^{-1} \mathbf{H} \mathbf{X}_j$. Degrees of freedom: 2D point (2), 3x4 camera matrix (11, why?), 3D point (3), 4x4 homography (15, why?). Need $2mn \geq 11m + 3n - 15$. (m = 2): n ≥ 7. (m = 3): n ≥ 6 (doesn't get better after). (m = 1): n ≤ 4.
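The counting argument can be turned into a tiny helper, a sketch in which the function name and its parameters are made up for illustration. It solves 2mn ≥ (camera DOF)·m + 3n − (ambiguity DOF) for the smallest n, covering the projective case (11 and 15) and the affine case (8 and 12). The count is a necessary condition only.

```python
def min_points(m, camera_dof, ambiguity_dof):
    """Smallest n with 2*m*n >= camera_dof*m + 3*n - ambiguity_dof, or None."""
    # Rearranged: n * (2m - 3) >= camera_dof*m - ambiguity_dof.
    coeff = 2 * m - 3
    if coeff <= 0:
        return None          # m = 1: the inequality gives no usable lower bound
    needed = camera_dof * m - ambiguity_dof
    return max(0, -(-needed // coeff))   # integer ceiling division

for m in (2, 3, 4, 10):
    print(m, min_points(m, 11, 15), min_points(m, 8, 12))
```

For a single camera the inequality reads n ≤ 4 instead, which is why the helper returns `None` there.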

  35. Two Camera Case. For two cameras, we need 7 points. Hmm. What else (in theory) requires 7 points? Compute fundamental matrix F and epipole b s.t. $\mathbf{F}^T \mathbf{b} = 0$. Then: $\mathbf{M}_1 = [\mathbf{I}, \mathbf{0}]$, $\mathbf{M}_2 = [-[\mathbf{b}]_\times \mathbf{F}, \mathbf{b}]$. Remember: this is up to a projective ambiguity! [Diagram: 3D point X seen as p in $\mathbf{M}_1$ and p' in $\mathbf{M}_2$, with epipole b.]
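A sketch of this construction in NumPy, assuming the convention p'ᵀ F p = 0 with p from the first camera and p' from the second; the random rank-2 F and the helper names are made up for illustration. The epipole b is taken as the null vector of Fᵀ, and the check confirms that points projected by the two cameras satisfy the epipolar constraint for F.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def cameras_from_F(F):
    """Canonical camera pair M1 = [I, 0], M2 = [-[b]_x F, b] with F^T b = 0."""
    _, _, Vt = np.linalg.svd(F.T)
    b = Vt[-1]                                   # null vector of F^T
    M1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    M2 = np.hstack([-skew(b) @ F, b[:, None]])
    return M1, M2, b

# Hypothetical fundamental matrix: a random 3x3 matrix forced to rank 2.
rng = np.random.default_rng(5)
U, s, Vt = np.linalg.svd(rng.normal(size=(3, 3)))
F = U @ np.diag([s[0], s[1], 0.0]) @ Vt

M1, M2, b = cameras_from_F(F)

# Project arbitrary homogeneous 3D points and evaluate p'^T F p for each.
X = np.vstack([rng.normal(size=(3, 8)), np.ones((1, 8))])
p1, p2 = M1 @ X, M2 @ X
epipolar_error = np.abs(np.sum(p2 * (F @ p1), axis=0)).max()
print(epipolar_error)
```

The cameras obtained this way are one member of a projective family, so the reconstructed points are generally far from metric.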

  36. Incremental SFM. Key idea: incrementally add cameras, points. [Diagram: cameras-by-points matrix with cameras $\mathbf{M}_1$, $\mathbf{M}_2$ and unknown entries marked "?".] Remake of S. Lazebnik material. Note: numbers of points aren't to scale.

  37. Incremental SFM. Key idea: incrementally add cameras, points. 1. Initialize motion $\mathbf{M}_i = [\mathbf{R}_i, \mathbf{t}_i]$ with fundamental matrix. [Diagram: cameras-by-points matrix with cameras $\mathbf{M}_1$, $\mathbf{M}_2$ and unknown entries marked "?".] Remake of S. Lazebnik material. Note: numbers of points aren't to scale.
