Local/Global Scene Flow using Intensity and Depth Data
Julian Quiroga, Frederic Devernay, James Crowley
PRIMA team, INRIA Grenoble
julian.quiroga@inria.fr
July 8, 2013
Motivation
The scene flow is the 3D motion field of the scene (Vedula ICCV’99).
[Figure: Surface Flow, Morpheo-INRIA 2011]
Applications (using depth and/or color):
- Action recognition
- Interaction
- 3D reconstruction
- Navigation
[Figure: RGB-D SLAM Dataset, TUM]
Scene flow computation
Stereo or multiview:
- From several optical flows (Vedula et al. PAMI’05)
- Using structure constraints (Huguet & Devernay ICCV’07, Wedel et al. ECCV’08, Basha et al. CVPR’10)
[Figure labels: scene flow; 2 views and optical flow]
Scene flow computation
Color and depth:
- Optical flow and range flow under orthography (Spies et al. CVIU’02, Lukins et al. BMVC’04)
- Photometric constraints (Letouzey BMVC’11)
- Particle filtering (Hadfield & Bowden ICCV’11)
[Figure labels: range flow equation; optical flow equation; projective camera model; 3D motion field]
Our work
Assumptions:
- Fixed camera
- Brightness and depth consistency
- Scene composed of locally rigid moving parts
Approach:
- Local motion: 2D tracking of 3D surface patches in a Lucas-Kanade (LK) framework.
- Global motion: an adaptive 2D TV regularization of the 3D motion field.
- Large/small motions: a multi-scale scheme and a set of 3D correspondences.
Energy:
$$E(\mathbf{v}) = E_D(\mathbf{v}) + \alpha E_M(\mathbf{v}) + \beta E_R(\mathbf{v}), \quad \text{where } \mathbf{v} = \{v_X, v_Y, v_Z\}.$$
Presentation outline
- Motion model
- Data term
- Regularisation term
- Sparse matching term
- Optimisation
- Experimentation
- Conclusion
Motion model
Let $\mathbf{X} = (X, Y, Z)$ be a 3D point in the camera frame. The image flow $(u, v)$ induced by the 3D motion $\mathbf{v} = \{v_X, v_Y, v_Z\}$ is given by:
$$u = x' - x = \frac{X + v_X}{Z + v_Z} - \frac{X}{Z} = \frac{1}{Z}\left(\frac{v_X - x\,v_Z}{1 + v_Z/Z}\right)$$
and
$$v = y' - y = \frac{Y + v_Y}{Z + v_Z} - \frac{Y}{Z} = \frac{1}{Z}\left(\frac{v_Y - y\,v_Z}{1 + v_Z/Z}\right),$$
where $(x, y) = \hat{M}(\mathbf{X})$ and the new 3D point is $\mathbf{X}' = \mathbf{X} + \mathbf{v}$.
Using a Taylor series for the denominator term containing $v_Z$, we get
$$\frac{1}{1 + v_Z/Z} = 1 - \frac{v_Z}{Z} + \left(\frac{v_Z}{Z}\right)^2 - \dots = f(v_Z/Z) \approx 1 \ \text{ or } \ 1 - \frac{v_Z}{Z}.$$
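The effect of truncating the Taylor series can be checked numerically. The following is a minimal sketch, assuming normalized image coordinates $(x, y) = (X/Z, Y/Z)$; the function names and the sample values are hypothetical and only illustrate the formulas of this slide.

```python
def exact_image_flow(X, v):
    """Exact flow (u, v) in normalized image coordinates for a point X moved by v."""
    Xp = [X[i] + v[i] for i in range(3)]      # X' = X + v
    return (Xp[0] / Xp[2] - X[0] / X[2],      # u = x' - x
            Xp[1] / Xp[2] - X[1] / X[2])      # v = y' - y


def approx_image_flow(X, v, order=0):
    """Flow with 1 / (1 + vZ/Z) replaced by its Taylor series truncated at `order` (0 or 1)."""
    x, y, Z = X[0] / X[2], X[1] / X[2], X[2]
    r = v[2] / Z
    f = 1.0 if order == 0 else 1.0 - r        # f(vZ/Z) ~ 1  or  1 - vZ/Z
    return (f * (v[0] - x * v[2]) / Z,
            f * (v[1] - y * v[2]) / Z)
```

For a point at depth 2 moving 0.1 towards the camera, for instance, the first-order version is noticeably closer to the exact flow than the zeroth-order one, as expected when $|v_Z/Z|$ is small.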
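Motion model
[Figure: a surface point moving from $\mathbf{X}_t$ (surface at time $t$) to $\mathbf{X}_{t+1}$ (surface at time $t+1$), with its projections $\mathbf{x}_t$ and $\mathbf{x}_{t+1}$ on the image plane through the center of projection.]
- Surface point: $\mathbf{X} = (X, Y, Z)^T \in \mathbb{R}^3$
- Image point: $\mathbf{x} = (x, y)^T \in \mathbb{R}^2$
- Scene flow: $\mathbf{V} = (V_X, V_Y, V_Z)^T \in \mathbb{R}^3$, with $\mathbf{V} = \mathbf{X}_{t+1} - \mathbf{X}_t$
- Image flow: $(u, v)$
- Warp function: $\mathbf{x}_{t+1} = W(\mathbf{x}_t; \mathbf{V})$, with
$$W(\mathbf{x}_t; \mathbf{V}) = \mathbf{x}_t + \begin{pmatrix} u \\ v \end{pmatrix}, \qquad \begin{pmatrix} u \\ v \end{pmatrix} = \frac{1}{Z_t} \begin{pmatrix} 1 & 0 & -x_t \\ 0 & 1 & -y_t \end{pmatrix} \begin{pmatrix} V_X \\ V_Y \\ V_Z \end{pmatrix}.$$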
Data term
[Figure: intensity image and depth image]
Brightness constancy assumption (BCA):
$$I_2(W(\mathbf{x}; \mathbf{v})) = I_1(\mathbf{x})$$
Depth velocity constraint (DVC):
$$Z_2(W(\mathbf{x}; \mathbf{v})) = Z_1(\mathbf{x}) + v_Z(\mathbf{x})$$
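Data term
We solve for the local scene flow vector $\mathbf{v}$ that minimizes
$$\sum_{\{\mathbf{x}\}} \Psi\left(|\rho_I(\mathbf{x}, \mathbf{v})|^2\right) + \lambda\, \Psi\left(|\rho_Z(\mathbf{x}, \mathbf{v})|^2\right),$$
where $\Psi(s^2) = \sqrt{s^2 + \varepsilon^2}$ is a differentiable approximation of the $L^1$ norm.
Using iteratively reweighted least squares (IRLS), the scene flow increment is given by
$$\Delta\mathbf{v} = H^{-1} \sum_{\{\mathbf{x}\}} \left[ -\Psi'\left(\rho_I^2(\mathbf{x}, \mathbf{v})\right) (\nabla I\, J)^T \rho_I(\mathbf{x}, \mathbf{v}) - \lambda\, \Psi'\left(\rho_Z^2(\mathbf{x}, \mathbf{v})\right) (\nabla Z\, J - (0, 0, 1))^T \rho_Z(\mathbf{x}, \mathbf{v}) \right],$$
where the Jacobian is defined as
$$J = \frac{\partial W}{\partial \mathbf{v}} = \frac{1}{Z(\mathbf{x})} \begin{pmatrix} f_x & 0 & c_x - x \\ 0 & f_y & c_y - y \end{pmatrix}.$$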
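The robust function and the Jacobian above can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: it assumes a pinhole camera with focal lengths `fx`, `fy` and principal point `cx`, `cy`, pixel coordinates for $\mathbf{x}$, and a hypothetical default for $\varepsilon$. The helper `warp` is introduced only so the Jacobian can be compared against finite differences.

```python
import numpy as np


def psi(s2, eps=1e-3):
    """Psi(s^2) = sqrt(s^2 + eps^2), a differentiable approximation of |s|."""
    return np.sqrt(s2 + eps ** 2)


def psi_prime(s2, eps=1e-3):
    """Derivative of Psi with respect to s^2; it plays the role of the IRLS weight."""
    return 0.5 / np.sqrt(s2 + eps ** 2)


def warp(x, v, Z, fx, fy, cx, cy):
    """Pixel reached by pixel x (at depth Z) after the 3D motion v (assumed pinhole model)."""
    X = np.array([(x[0] - cx) * Z / fx, (x[1] - cy) * Z / fy, Z]) + np.asarray(v, float)
    return np.array([fx * X[0] / X[2] + cx, fy * X[1] / X[2] + cy])


def warp_jacobian(x, Z, fx, fy, cx, cy):
    """J = dW/dv evaluated at v = 0, as written on the slide."""
    return np.array([[fx, 0.0, cx - x[0]],
                     [0.0, fy, cy - x[1]]]) / Z
```

Comparing `warp_jacobian` with central differences of `warp` around $\mathbf{v} = 0$ is a quick way to confirm the signs of the third column.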
Data term
The matrix $H$ is the Gauss-Newton approximation of the Hessian:
$$H = \sum_{\{\mathbf{x}\}} \Psi'_{\rho_I} \begin{pmatrix} I_x^2 & I_x I_y & I_x I_\Sigma \\ I_x I_y & I_y^2 & I_y I_\Sigma \\ I_x I_\Sigma & I_y I_\Sigma & I_\Sigma^2 \end{pmatrix} + \lambda\, \Psi'_{\rho_Z} \begin{pmatrix} Z_x^2 & Z_x Z_y & Z_x (Z_\Sigma - 1) \\ Z_x Z_y & Z_y^2 & Z_y (Z_\Sigma - 1) \\ Z_x (Z_\Sigma - 1) & Z_y (Z_\Sigma - 1) & (Z_\Sigma - 1)^2 \end{pmatrix}$$
with $I_\Sigma = -(x I_x + y I_y)$ and $Z_\Sigma = -(x Z_x + y Z_y)$.
Final expression:
$$E_D(\mathbf{v}) = \sum_{\mathbf{x}} \sum_{\mathbf{x}' \in N(\mathbf{x})} \Psi\left(\left|\rho_I\left(\mathbf{x}', \mathbf{v}(\mathbf{x})\right)\right|^2\right) + \lambda\, \Psi\left(\left|\rho_Z\left(\mathbf{x}', \mathbf{v}(\mathbf{x})\right)\right|^2\right)$$
Regularisation term
The regularization term is given by:
$$E_R(\mathbf{v}) = \sum_{\mathbf{x}} \omega(\mathbf{x})\, |\nabla \mathbf{v}(\mathbf{x})|,$$
where we use the notation $|\nabla \mathbf{v}| := |\nabla v_X| + |\nabla v_Y| + |\nabla v_Z|$.
The decreasing positive function
$$\omega(\mathbf{x}) = \exp\left(-\alpha\, |\nabla Z_1(\mathbf{x})|^{\beta}\right)$$
prevents regularization of the motion field along strong depth discontinuities.
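A small sketch of the depth-adaptive weight and of the resulting energy may help here. The slides do not state how gradients are discretized, so the central differences of `np.gradient` are an assumption, as are the function names; `alpha` and `beta` are the two parameters of $\omega$ (their values are not given in the slides).

```python
import numpy as np


def edge_weights(Z1, alpha, beta):
    """omega(x) = exp(-alpha * |grad Z1(x)|^beta): close to 1 on smooth depth, small at depth edges."""
    gy, gx = np.gradient(np.asarray(Z1, dtype=float))   # assumed discretization
    return np.exp(-alpha * np.hypot(gx, gy) ** beta)


def regularization_energy(v, omega):
    """E_R(v) = sum_x omega(x) * (|grad vX| + |grad vY| + |grad vZ|), with v of shape (3, H, W)."""
    total = 0.0
    for component in v:
        gy, gx = np.gradient(component)
        total += np.sum(omega * np.hypot(gx, gy))
    return float(total)
```

On a depth map with a single vertical step, the weights drop only in the columns next to the step, so motion differences there cost less than elsewhere.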
Matching term
Let $\{(\mathbf{x}_1^1, \mathbf{x}_2^1), \dots, (\mathbf{x}_1^N, \mathbf{x}_2^N)\}$ be the set of correspondences. The matching term is defined as
$$E_M(\mathbf{v}) = \sum_{\mathbf{x}} p(\mathbf{x})\, \Psi\left(\left|\delta_{3D}(\mathbf{x}, m(\mathbf{x})) - \mathbf{v}(\mathbf{x})\right|^2\right)$$
with $p(\mathbf{x}) = 1$ if there is a descriptor in a region around the point $\mathbf{x}$. The matching function $m(\mathbf{x})$ gives the correspondence of each pixel $\mathbf{x}$. The function
$$\delta_{3D}(\mathbf{x}_1, \mathbf{x}_2) = M_{cam}^{-1}\left(\mathbf{x}_2 Z_2(\mathbf{x}_2) - \mathbf{x}_1 Z_1(\mathbf{x}_1)\right)$$
computes the 3D displacement for each correspondence.
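The displacement $\delta_{3D}$ can be illustrated as follows. This sketch assumes that $M_{cam}$ is the $3 \times 3$ pinhole intrinsic matrix and that pixels are used in homogeneous form $(x, y, 1)$; the slides do not spell this out, and the function name and arguments are hypothetical.

```python
import numpy as np


def delta_3d(x1, x2, Z1, Z2, M_cam):
    """3D displacement between the back-projections of pixel x1 (depth Z1) and pixel x2 (depth Z2)."""
    x1h = np.array([x1[0], x1[1], 1.0])   # homogeneous pixel in frame 1
    x2h = np.array([x2[0], x2[1], 1.0])   # homogeneous pixel in frame 2
    # M_cam^{-1} (x2 * Z2 - x1 * Z1), solved without forming the inverse explicitly
    return np.linalg.solve(M_cam, x2h * Z2 - x1h * Z1)
```

Projecting two known 3D points with the same matrix and feeding the resulting pixels and depths to `delta_3d` should return the difference of the two points.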
Optimization
To compute the scene flow we introduce an auxiliary flow $\mathbf{u}$ and solve for the 3D motion field $\mathbf{v}$ that minimizes
$$E(\mathbf{v}, \mathbf{u}) = E_D(\mathbf{v}) + \alpha E_M(\mathbf{v}) + \frac{1}{2\theta} |\mathbf{v} - \mathbf{u}|^2 + \beta E_R(\mathbf{u}),$$
where $\theta$ is a small constant.
Step 1. For a fixed $\mathbf{v}$, we solve for $\mathbf{u}$ that minimizes
$$\sum_{\mathbf{x}} \frac{1}{2\kappa} |\mathbf{u}(\mathbf{x}) - \mathbf{v}(\mathbf{x})|^2 + \omega(\mathbf{x})\, |\nabla \mathbf{u}(\mathbf{x})|,$$
where $\kappa = \beta\theta$. For every dimension this problem corresponds to a weighted version of the ROF model for image denoising.
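The slides do not say which solver is used for the weighted ROF step. As one possible illustration, the sketch below applies projected gradient ascent on the dual problem (in the spirit of Chambolle's algorithm) to a single component of the flow. The forward-difference discretization, the step size `1 / (8 * kappa)`, the iteration count and all function names are assumptions.

```python
import numpy as np


def grad(u):
    """Forward differences with zero gradient at the last column / row."""
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy


def div(px, py):
    """Discrete divergence, defined as the negative adjoint of grad."""
    d = np.zeros_like(px)
    d[:, :-1] += px[:, :-1]; d[:, 1:] -= px[:, :-1]
    d[:-1, :] += py[:-1, :]; d[1:, :] -= py[:-1, :]
    return d


def rof_energy(u, v, omega, kappa):
    """sum_x |u - v|^2 / (2 kappa) + omega |grad u| for one flow component."""
    gx, gy = grad(u)
    return float(np.sum((u - v) ** 2) / (2.0 * kappa) + np.sum(omega * np.hypot(gx, gy)))


def weighted_rof(v, omega, kappa, n_iter=300):
    """Approximate minimizer u of rof_energy(., v, omega, kappa); omega must be positive."""
    px = np.zeros_like(v); py = np.zeros_like(v)
    tau = 1.0 / (8.0 * kappa)                  # 1 / Lipschitz bound of the dual gradient
    for _ in range(n_iter):
        gx, gy = grad(v + kappa * div(px, py))  # gradient of the current primal estimate
        px = px + tau * gx; py = py + tau * gy
        scale = np.maximum(1.0, np.hypot(px, py) / omega)
        px /= scale; py /= scale                # project onto |p(x)| <= omega(x)
    return v + kappa * div(px, py)
```

In the scene flow setting the same routine would be called once per component ($v_X$, $v_Y$, $v_Z$) with the shared weight map $\omega$.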
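Optimization
Step 2. For a fixed $\mathbf{u}$, we solve for $\mathbf{v}$ that minimizes
$$E_D(\mathbf{v}) + \alpha E_M(\mathbf{v}) + \sum_{\mathbf{x}} \frac{1}{2\theta} |\mathbf{v}(\mathbf{x}) - \mathbf{u}(\mathbf{x})|^2.$$
The scene flow increment can be computed as
$$\Delta\mathbf{v} = H^{-1} \left\{ \sum_{\mathbf{x}' \in N(\mathbf{x})} \left[ -\Psi'\left(\rho_I^2(\mathbf{x}', \mathbf{v})\right) (\nabla I\, J)^T \rho_I(\mathbf{x}', \mathbf{v}) - \lambda\, \Psi'\left(\rho_Z^2(\mathbf{x}', \mathbf{v})\right) (\nabla Z\, J - D)^T \rho_Z(\mathbf{x}', \mathbf{v}) \right] + \alpha\, p(\mathbf{x})\, \Psi'\left(\rho_{3D}^2(\mathbf{x}, \mathbf{v})\right) \rho_{3D}(\mathbf{x}, \mathbf{v}) + \frac{1}{2\theta} (\mathbf{u} - \mathbf{v}) \right\},$$
where $\rho_{3D}$ is a 3D residue defined as $\rho_{3D}(\mathbf{x}, \mathbf{v}) = \delta_{3D}(\mathbf{x}, m(\mathbf{x})) - \mathbf{v}$, $D = (0, 0, 1)$ as in the data term, and $H$ is the Gauss-Newton approximation of the Hessian matrix.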
Optimization
The Gauss-Newton approximation of the Hessian matrix is given by
$$H = \sum_{\mathbf{x}' \in N(\mathbf{x})} \left[ \Psi'\left(\rho_I^2(\mathbf{x}', \mathbf{v})\right) (\nabla I\, J)^T (\nabla I\, J) + \lambda\, \Psi'\left(\rho_Z^2(\mathbf{x}', \mathbf{v})\right) (\nabla Z\, J - D)^T (\nabla Z\, J - D) \right] + \alpha\, p(\mathbf{x})\, \Psi'\left(\rho_{3D}^2(\mathbf{x}, \mathbf{v})\right) I_d + \frac{1}{2\theta} I_d,$$
with $I_d$ the $3 \times 3$ identity matrix.
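To make the structure of one update concrete, here is a hedged sketch of a single increment for one pixel $\mathbf{x}$. It assumes the rows $(\nabla I\, J)$ and $(\nabla Z\, J - D)$ and the residuals have already been evaluated for the $N$ pixels $\mathbf{x}'$ of the window; the function name, argument layout and default parameter values are hypothetical.

```python
import numpy as np


def psi_prime(s2, eps=1e-3):
    """Derivative of Psi(s^2) = sqrt(s^2 + eps^2) with respect to s^2."""
    return 0.5 / np.sqrt(s2 + eps ** 2)


def scene_flow_increment(A_I, A_Z, rho_I, rho_Z, v, u, delta=None,
                         lam=1.0, alpha=1.0, theta=0.1):
    """One IRLS / Gauss-Newton increment dv for a single pixel.

    A_I, A_Z     : (N, 3) arrays whose rows are (grad I J) and (grad Z J - D) over the window.
    rho_I, rho_Z : (N,) intensity and depth residuals at the current v.
    v, u         : (3,) current scene flow and auxiliary flow.
    delta        : (3,) matched 3D displacement, or None when p(x) = 0.
    """
    w_I = psi_prime(rho_I ** 2)[:, None]
    w_Z = lam * psi_prime(rho_Z ** 2)[:, None]
    H = (A_I * w_I).T @ A_I + (A_Z * w_Z).T @ A_Z
    b = -(A_I * w_I).T @ rho_I - (A_Z * w_Z).T @ rho_Z
    if delta is not None:                        # matching term, p(x) = 1
        rho_3d = delta - v
        w_M = alpha * psi_prime(rho_3d @ rho_3d)
        H = H + w_M * np.eye(3)
        b = b + w_M * rho_3d
    H = H + np.eye(3) / (2.0 * theta)            # coupling with the auxiliary flow u
    b = b + (u - v) / (2.0 * theta)
    return np.linalg.solve(H, b)
```

The coupling term adds a positive multiple of the identity to $H$, which keeps the $3 \times 3$ system solvable even where the image and depth gradients carry little information.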
Experimentation - Middlebury datasets
[Figure: input images $I_1$, $I_2$, depth $Z_1$, and ground truth (optical flow)]
Comparisons:
- LG SF: proposed method
- L SF: local scene flow
- TV-L1: optical flow + depth
- ORT SF: orthographic camera
- Hug 07: Huguet and Devernay, ICCV 2007
- Bas 10: Basha et al., CVPR 2010
- Had 11: Hadfield and Bowden, ICCV 2011
Details:
- Images: Teddy, Cones (2 and 6)
- 5 levels of pyramid decomposition
- Window size: 5 × 5
Error measures:
- Optical flow: NRMS_OF, AAE_OF
- Scene flow: NRMS_V, P_10%