COMPUTER VISION Two-view Geometry, Emanuel Aldea - PowerPoint PPT Presentation

  1. COMPUTER VISION Two-view Geometry Emanuel Aldea < emanuel.aldea@u-psud.fr > http://hebergement.u-psud.fr/emi/ Computer Science and Multimedia Master - University of Pavia

  2. Outline The 3D representation of points The pinhole camera model Applying a coordinate transformation Homogeneous representations and algebraic operations The fundamental matrix The essential matrix Rectification E. Aldea (CS&MM- U Pavia) COMPUTER VISION (2/25)

  3. The 3D representation of points

     In 3D space, the same point is written in two coordinate systems:

     $$p = (X, Y, Z)^T \ \text{(initial point)}, \qquad p' = (X', Y', Z')^T \ \text{(same point in a different coordinate system)}$$

     The Euclidean transform $p' = Rp + t$ becomes, in homogeneous coordinates:

     $$\begin{pmatrix} X' \\ Y' \\ Z' \\ 1 \end{pmatrix} = \begin{pmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$

     or, written more compactly,

     $$\tilde{p}' = \begin{pmatrix} R & t \\ 0^T & 1 \end{pmatrix} \tilde{p}, \quad \text{with } R^T R = I,\ \det R = 1$$

     ◮ the transform has six degrees of freedom (three elementary rotations, three elementary translations)
     ◮ we drop the tilde for the sake of simplicity, but where it makes sense the variables are homogeneous
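The homogeneous form of the Euclidean transform can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper name `rigid_transform` and the example rotation and translation are hypothetical choices made for illustration only.

```python
import numpy as np

def rigid_transform(R, t):
    """Build the 4x4 homogeneous matrix [[R, t], [0^T, 1]]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical example: rotation of 90 degrees about the Z axis, then a shift.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, 2.0, 3.0])

p = np.array([1.0, 0.0, 0.0])
p_h = np.append(p, 1.0)                  # homogeneous representation (X, Y, Z, 1)
p_prime_h = rigid_transform(R, t) @ p_h  # one matrix product replaces R p + t
p_prime = p_prime_h[:3]
```

The single matrix product gives the same result as `R @ p + t`, which is what makes chaining several transforms convenient.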

  5. The pinhole camera model

     3D ⇒ 2D projection
     ◮ In the 3D focal plane: $(X, Y, Z)^T \Rightarrow (fX/Z, fY/Z, f)^T$
     ◮ In the 2D image plane: $(X, Y, Z)^T \Rightarrow (fX/Z, fY/Z) = (x, y)$

  6. The pinhole camera model

     The image plane projection $(fX/Z, fY/Z)$ gives, in homogeneous coordinates:

     $$\begin{pmatrix} fX \\ fY \\ Z \end{pmatrix} = \begin{pmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} = \mathrm{diag}(f, f, 1)\,[I \mid 0]\, X$$

     Problem: the origin usually chosen in the image plane is not the projection of the optical axis. In the commonly used reference system this gives:

     $$(X, Y, Z) \Rightarrow (fX/Z + p_x,\ fY/Z + p_y)$$

     $$\begin{pmatrix} fX + Z p_x \\ fY + Z p_y \\ Z \end{pmatrix} = \begin{pmatrix} f & 0 & p_x & 0 \\ 0 & f & p_y & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} = K\,[I \mid 0]\, X, \qquad K = \begin{pmatrix} f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{pmatrix}$$
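The projection with a principal point offset can be sketched as follows. This assumes NumPy and collects $f$, $p_x$, $p_y$ into a 3 × 3 matrix called `K` here; the function name `project` and all numerical values are hypothetical.

```python
import numpy as np

def project(K, X_cam):
    """Project a 3D point given in camera coordinates to 2D image coordinates."""
    P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # K [I | 0], a 3x4 matrix
    x_h = P @ np.append(X_cam, 1.0)                   # homogeneous image point
    return x_h[:2] / x_h[2]                           # divide by the third coordinate

# Hypothetical intrinsics: focal length and principal point, in pixels.
f, p_x, p_y = 800.0, 320.0, 240.0
K = np.array([[f,   0.0, p_x],
              [0.0, f,   p_y],
              [0.0, 0.0, 1.0]])

X_cam = np.array([0.5, -0.25, 2.0])
x = project(K, X_cam)   # equals (f X / Z + p_x, f Y / Z + p_y)
```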

  8. Transformation to an inertial (fixed) frame

     Final step of the modelling: we express the 3D variables in a fixed frame that is not attached to the camera (the typical setting in mobile robotics).

     Denoting by $C$ the center of the camera in "world" coordinates, the world-to-camera transform is expressed as

     $$X_{cam} = \begin{pmatrix} R & -RC \\ 0^T & 1 \end{pmatrix} X$$
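A small numerical check of the world-to-camera transform, assuming NumPy. The helper `world_to_camera` and the example pose are hypothetical; the sketch only illustrates that the block matrix computes $R(X - C)$ and therefore sends the camera center to the origin.

```python
import numpy as np

def world_to_camera(R, C):
    """Return the 4x4 matrix [[R, -R C], [0^T, 1]]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = -R @ C
    return T

# Hypothetical pose: camera rotated by 90 degrees about Z, centered at C.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
C = np.array([1.0, 1.0, 0.0])
T = world_to_camera(R, C)

X_world = np.array([2.0, 1.0, 5.0, 1.0])   # homogeneous world point
X_cam = T @ X_world                        # same point in camera coordinates
center_in_cam = T @ np.append(C, 1.0)      # the camera center maps to the origin
```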

  10. Homogeneous representation of 2D lines and points

      ◮ A 2D line is defined by $ax + by + c = 0$, i.e. by the parametrization $l = (a, b, c)^T$.
      ◮ However, $kax + kby + kc = 0$ corresponds to the same line, thus $l = (ka, kb, kc)^T$, $\forall k \in \mathbb{R} \setminus \{0\}$.
      ◮ A 2D point $(x, y)$ lies on a line $(a, b, c)$ if $ax + by + c = 0$.
      ◮ This may be expressed as $(x, y, 1)^T \cdot (a, b, c)^T = (x, y, 1)^T \cdot l = 0$.
      ◮ $\forall k \in \mathbb{R} \setminus \{0\}$, $(kx, ky, k)^T \cdot l = 0$ if and only if $(x, y, 1)^T \cdot l = 0$.
      ◮ $\forall k \in \mathbb{R} \setminus \{0\}$, we thus call $(kx, ky, k)^T$ a homogeneous representation of the 2D point $(x, y)$.
      ◮ An arbitrary homogeneous $x = (x_1, x_2, x_3)^T$ corresponds to the 2D point $(x_1/x_3,\ x_2/x_3)$.
      ◮ Result: the point $x$ lies on the line $l$ if and only if $x^T l = 0$.
      ◮ Result: the intersection of two lines $l$ and $l'$ is the point $x = l \times l'$.
      ◮ Result: the line through two points $x$ and $x'$ is $l = x \times x'$.
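The three results above translate directly into cross and dot products. A minimal sketch, assuming NumPy; the helper names and the example points and lines are hypothetical.

```python
import numpy as np

def line_through(x1, x2):
    """Line through two homogeneous points: l = x1 x x2."""
    return np.cross(x1, x2)

def intersection(l1, l2):
    """Intersection of two homogeneous lines: x = l1 x l2."""
    return np.cross(l1, l2)

def to_inhomogeneous(x):
    """Map (x1, x2, x3) to the 2D point (x1/x3, x2/x3)."""
    return x[:2] / x[2]

a = np.array([0.0, 0.0, 1.0])      # the 2D point (0, 0)
b = np.array([2.0, 2.0, 1.0])      # the 2D point (2, 2)
l1 = line_through(a, b)            # the line y = x, up to scale
l2 = np.array([1.0, 1.0, -2.0])    # the line x + y - 2 = 0

p = intersection(l1, l2)           # homogeneous intersection point
p_2d = to_inhomogeneous(p)
```

Note that `p` comes out with an arbitrary scale; only the ratio of its coordinates is meaningful. Parallel lines would give a third coordinate of zero, which the division in `to_inhomogeneous` does not handle.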

  11. Some quick vector operations

      $$x \times y = [x]_\times \cdot y = \begin{vmatrix} i & j & k \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix} = \begin{pmatrix} x_2 y_3 - x_3 y_2 \\ x_3 y_1 - x_1 y_3 \\ x_1 y_2 - y_1 x_2 \end{pmatrix}$$

      $$[x]_\times = \begin{pmatrix} 0 & -x_3 & x_2 \\ x_3 & 0 & -x_1 \\ -x_2 & x_1 & 0 \end{pmatrix}$$

      Mixed product: $x^T (y \times z) = |x\ y\ z|$ (up to sign, the volume of the parallelepiped defined by the three vectors)
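These identities are easy to verify numerically. A minimal sketch, assuming NumPy; the function name `skew` and the test vectors are hypothetical.

```python
import numpy as np

def skew(x):
    """Cross-product matrix of x, so that skew(x) @ y equals np.cross(x, y)."""
    return np.array([[0.0,  -x[2],  x[1]],
                     [x[2],  0.0,  -x[0]],
                     [-x[1], x[0],  0.0]])

x = np.array([1.0, 2.0, 3.0])
y = np.array([-1.0, 0.5, 4.0])
z = np.array([2.0, 0.0, 1.0])

cross_via_matrix = skew(x) @ y                    # matrix form of x cross y
mixed = x @ np.cross(y, z)                        # mixed product x^T (y x z)
det_xyz = np.linalg.det(np.column_stack([x, y, z]))
```

Two properties used later follow immediately: the cross-product matrix is skew-symmetric, and the mixed product vanishes whenever two of the three vectors coincide.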

  12. Singular value decomposition

      Theorem (SVD): Let $A$ be an $m \times n$ matrix. $A$ may be expressed as:

      $$A = U \Sigma V^T = \sum_{i=1}^{\min(m,n)} \sigma_i U_i V_i^T$$

      where $\Sigma$ is an $m \times n$ diagonal matrix with $\sigma_i = \Sigma_{ii} \geq 0$, and $U$ ($m \times m$) and $V$ ($n \times n$) have orthonormal columns $U_i$ and $V_i$.

      ◮ The rank of $A$ is the number of $\sigma_i > 0$
      ◮ An orthonormal basis for the null space of $A$ is composed of the $V_i$ for indices $i$ such that $\sigma_i = 0$ (together with the columns $V_i$, $i > m$, when $n > m$)
      ◮ By convention, decomposition algorithms return the $\sigma_i$ sorted in descending order.
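The rank and null-space statements can be illustrated with `numpy.linalg.svd`, which returns $V^T$ rather than $V$. This is a minimal sketch: the example matrix and the relative tolerance used to decide which singular values count as zero are assumptions.

```python
import numpy as np

# Hypothetical rank-2 matrix: the second row is twice the first.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

U, s, Vt = np.linalg.svd(A)        # s holds the sigma_i in descending order

tol = 1e-10 * s[0]                 # assumed threshold for "sigma_i = 0"
rank = int(np.sum(s > tol))
null_basis = Vt[rank:].T           # columns V_i with sigma_i = 0

# Rebuild A from the rank-one terms sigma_i U_i V_i^T.
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s)))
```

In floating point the "zero" singular values are only tiny, so some threshold is always needed in practice.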

  14. Why is this part “fundamental”? (cheap joke)

      What we can get from two views:
      ◮ Sparse 3D reconstruction
      ◮ Relative camera pose estimation
      ◮ Parametric surface fitting
      ◮ Dense 3D reconstruction (more complex work required for this)
      ◮ ... but also many multi-view algorithms extend nicely from two-view analysis

  15. The anatomy of two views

      Some important observations:
      ◮ the pixel projection is along the ray defined by the 3D point and the camera center (i.e. as for $x$, $X$ and $C$)
      ◮ conversely, if $x$ and $x'$ do correspond to the same 3D point, the two rays intersect
      ◮ the two rays define a plane $\pi$, called the epipolar plane
      ◮ the epipolar plane also contains the ray defined by the camera centers

  16. The anatomy of two views

      From the projection in the two views we have:

      $$\lambda x = K X, \qquad \lambda' x' = K'(RX + t)$$

      By eliminating $X$ we get:

      $$X = \lambda K^{-1} x$$
      $$\lambda' x' = K'(\lambda R K^{-1} x + t)$$
      $$\lambda' K'^{-1} x' = \lambda R K^{-1} x + t$$

      We eliminate the sum by applying a cross product with $t$:

      $$\lambda'\, t \times K'^{-1} x' = \lambda\, t \times R K^{-1} x$$

      We multiply by $K'^{-1} x'$ in order to get a null mixed product:

      $$0 = \lambda\, (K'^{-1} x') \cdot (t \times R K^{-1} x)$$

      Finally, by transposing $K'^{-1} x'$ and ignoring the scalar $\lambda$ we get:

      $$x'^T \underbrace{K'^{-T} [t]_\times R K^{-1}}_{F}\, x = 0$$
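The derivation can be checked on synthetic data: build $F$ from known $K$, $K'$, $R$, $t$, project random 3D points into both views, and evaluate $x'^T F x$. This is a minimal sketch, assuming NumPy; the function names, the intrinsics, the pose and the point cloud are all hypothetical.

```python
import numpy as np

def skew(t):
    """Cross-product matrix of t."""
    return np.array([[0.0,  -t[2],  t[1]],
                     [t[2],  0.0,  -t[0]],
                     [-t[1], t[0],  0.0]])

def fundamental_from_pose(K1, K2, R, t):
    """F = K2^{-T} [t]_x R K1^{-1}, following the derivation above."""
    return np.linalg.inv(K2).T @ skew(t) @ R @ np.linalg.inv(K1)

# Hypothetical setup: same intrinsics in both views, small rotation about Y.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])
theta = 0.1
R = np.array([[np.cos(theta),  0.0, np.sin(theta)],
              [0.0,            1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.2])
F = fundamental_from_pose(K, K, R, t)

# Random 3D points in front of both cameras (first-camera coordinates).
rng = np.random.default_rng(0)
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(20, 3))

x1 = (K @ X.T).T                        # lambda  x  = K X
x2 = (K @ (R @ X.T + t[:, None])).T     # lambda' x' = K'(R X + t)
x1 /= x1[:, 2:]                         # normalize to (x, y, 1)
x2 /= x2[:, 2:]

# x'^T F x for every correspondence; zero up to rounding error.
residuals = np.einsum('ij,jk,ik->i', x2, F, x1)
```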

  17. The fundamental matrix F

      $$x'^T F x = 0$$

      ◮ applying the $F$ constraint does not require information about the 3D structure of the scene
      ◮ a single $F$ is valid for the whole image, i.e. for every pair of corresponding points
      ◮ we may apply the constraint without performing or knowing the camera calibration
      ◮ For a point $x$ in the first image, we denote by $l'$ the corresponding epipolar line in the second image, on which the match $x'$ must lie. It follows from $x'^T F x = 0$ that $l' = Fx$
      ◮ Similarly, $l = F^T x'$
      ◮ The fundamental matrix constraint translates to a search along the epipolar line ...
      ◮ ... but also $F = K'^{-T} [t]_\times R K^{-1}$ encodes, along with the calibration matrices, the rotation and translation between views
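The epipolar-line relations can be illustrated on the simplest possible case. The sketch below assumes NumPy and a hypothetical configuration with $K = K' = I$, $R = I$ and $t = (1, 0, 0)^T$, for which $F$ reduces to the cross-product matrix of $t$ and the epipolar lines are the image rows; the helper `point_line_distance` is also hypothetical.

```python
import numpy as np

def point_line_distance(x, l):
    """Euclidean distance between homogeneous point x and line l = (a, b, c)."""
    return abs(l @ x) / (abs(x[2]) * np.hypot(l[0], l[1]))

# Hypothetical F for K = K' = I, R = I, t = (1, 0, 0): F is the matrix [t]_x.
F = np.array([[0.0, 0.0,  0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0,  0.0]])

x = np.array([0.3, 0.2, 1.0])          # point in the first image
l_prime = F @ x                        # its epipolar line in the second image

x_good = np.array([0.1, 0.2, 1.0])     # same row: satisfies the constraint
x_bad = np.array([0.1, 0.5, 1.0])      # different row: violates it

d_good = point_line_distance(x_good, l_prime)
d_bad = point_line_distance(x_bad, l_prime)

l = F.T @ x_good                       # epipolar line of x_good in the first image
```

A candidate match can then be accepted or rejected by thresholding its distance to the epipolar line, which is how the constraint reduces matching to a one-dimensional search.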

  18. The fundamental matrix F

      Theorem: the necessary and sufficient condition for a $3 \times 3$ matrix $F$ to be a fundamental matrix is that $\det(F) = 0$.

      Multiple ways to notice that $F$ is rank deficient:
      ◮ it follows from the fact that $\det([t]_\times) = 0$
      ◮ it follows from the fact that $Fe = 0$, where $e$ is the epipole in the first image
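Both observations can be checked numerically: the smallest singular value of $F$ vanishes, and the corresponding right singular vector is the epipole $e$. The sketch assumes NumPy and a hypothetical calibrated pair; it also uses the fact that the epipole is the image of the second camera center, which makes $e$ proportional to $K R^T t$.

```python
import numpy as np

def skew(t):
    """Cross-product matrix of t."""
    return np.array([[0.0,  -t[2],  t[1]],
                     [t[2],  0.0,  -t[0]],
                     [-t[1], t[0],  0.0]])

# Hypothetical setup, with the same intrinsics K in both views.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])
theta = 0.1
R = np.array([[np.cos(theta),  0.0, np.sin(theta)],
              [0.0,            1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.2])

K_inv = np.linalg.inv(K)
F = K_inv.T @ skew(t) @ R @ K_inv

U, s, Vt = np.linalg.svd(F)
e = Vt[-1]                    # right null vector of F: the epipole, up to scale

# The epipole is the projection of the second camera center, -R^T t.
e_expected = K @ R.T @ t
```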

  19. Computing F - the 8 point algorithm

      Straightforward approach:
      ◮ each observation (match) provides a constraint on $F$ as $x_i'^T F x_i = 0$
      ◮ if we group the unknowns as the column vector $f = [f_{11}\ f_{12}\ \dots\ f_{33}]^T$, the constraint may be expressed as $a_i f = 0$, with $a_i$ a row vector
      ◮ only 8 parameters are independent, since the scale is not determined
      ◮ the search for $f$ may be expressed as:

        $$\min_f \|Af\|, \quad \text{subject to } \|f\| = 1$$

        where $A$ stacks the rows $a_1, a_2, \dots, a_8$
      ◮ Solution: $f$ is the last column of $V$, where $A = UDV^T$ is the SVD of $A$
      ◮ Proof: $\|UDV^T f\| = \|DV^T f\|$, and $\|f\| = \|V^T f\|$. We have to minimize $\|DV^T f\|$ subject to $\|V^T f\| = 1$. If $y = V^T f$, then we minimize $\|Dy\|$ subject to $\|y\| = 1$. Since $D$ is diagonal with values in descending order, it means that $y = (0, 0, \dots, 1)^T$, and $f = Vy$ is the last column of $V$. (A5.3, Hartley and Zisserman)
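The linear solution can be sketched in a few lines. This is a minimal sketch, assuming NumPy and noise-free synthetic matches expressed in normalized image coordinates ($K = K' = I$, a hypothetical simplification that keeps $A$ well conditioned); the function name `eight_point` and the scene are illustrative only. Each row $a_i$ is built as the Kronecker product of $x_i'$ and $x_i$, which matches the row-major ordering of $f$.

```python
import numpy as np

def skew(t):
    """Cross-product matrix of t."""
    return np.array([[0.0,  -t[2],  t[1]],
                     [t[2],  0.0,  -t[0]],
                     [-t[1], t[0],  0.0]])

def eight_point(x1, x2):
    """Estimate F (up to scale) from matches given as rows of homogeneous points."""
    # Row a_i: coefficients of f = (f11, f12, ..., f33) in x2_i^T F x1_i = 0.
    A = np.stack([np.kron(p2, p1) for p1, p2 in zip(x1, x2)])
    _, _, Vt = np.linalg.svd(A)        # NumPy returns V^T
    return Vt[-1].reshape(3, 3)        # last column of V, reshaped row-major

# Hypothetical ground truth with K = K' = I, so that F = [t]_x R.
theta = 0.3
R = np.array([[np.cos(theta),  0.0, np.sin(theta)],
              [0.0,            1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.2, 0.1])
F_true = skew(t) @ R

# Eight random 3D points in front of both cameras, projected into both views.
rng = np.random.default_rng(0)
X = rng.uniform([-2.0, -2.0, 3.0], [2.0, 2.0, 8.0], size=(8, 3))
x1 = X / X[:, 2:]
X2 = X @ R.T + t
x2 = X2 / X2[:, 2:]

F_est = eight_point(x1, x2)            # unit Frobenius norm, arbitrary sign

# Compare up to scale and sign.
F_true_n = F_true / np.linalg.norm(F_true)
F_est_n = F_est * np.sign(np.sum(F_est * F_true_n))
residuals = np.einsum('ij,jk,ik->i', x2, F_est, x1)
```

With noisy matches the linear solution generally does not satisfy $\det(F) = 0$ exactly, and pixel coordinates are usually rescaled before building $A$; neither step is shown here.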
