Learning Two-View Stereo Matching
Jianxiong Xiao, Jingni Chen, Dit-Yan Yeung, Long Quan
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology
The 10th European Conference on Computer Vision (ECCV 2008)
Outline
1. Introduction
2. Semi-supervised Matching Framework
   - Local Label Preference Cost
   - Regional Surface Shape Cost
   - Global Epipolar Geometry Cost
   - Symmetric Visibility Consistency Cost
3. Iterative MV Optimization
4. Learning the Symmetric Affinity Matrix
5. More Details
6. Experiments
Introduction: Stereo Matching between Two Images
- Input: two wide-baseline images taken from the same static scene, neither calibrated nor rectified.
- For more general applications, such as robust motion estimation from structure.
Introduction: Related Work
- Small-baseline matching algorithms: cannot be extended easily when the epipolar lines are not parallel.
- Wide-baseline matching: depends heavily on the epipolar geometry, which has to be provided, often by off-line calibration.
- Sparse matching: the fundamental matrix estimated this way often fits only subsets of the image, not the whole image.
- Region-growing-based methods: greedy, with poor results when pixel scales differ greatly, because growing is discrete.
- Learning techniques: information learned from other, irrelevant images is very weak, and the quality of the result depends greatly on the training data.
Introduction: Our Semi-supervised Matching Approach
- Proposes a semi-supervised perspective on the matching problem, without training.
- Utilizes all information in a single optimization procedure: local, regional and global.
- More robust to noise: the label vector is affected not merely by one matched pair but by all pairs with weighted paths to it.
- Capable of handling real-valued labels, which is an inherent requirement of sub-pixel-accuracy matching.
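The robustness and real-valued-label points above can be illustrated with a small sketch. This is generic graph label propagation with clamped labels, not the optimization used in this work; the function name `propagate_labels`, the chain-shaped affinity matrix `W`, and the example offsets are all assumptions made for illustration.

```python
import numpy as np

def propagate_labels(W, Y0, labeled, n_iter=200):
    """Generic label propagation (illustrative, not the paper's method).

    W       : (n, n) non-negative affinity matrix with no all-zero rows
    Y0      : (n, 2) initial real-valued labels (e.g. position offsets)
    labeled : (n,) boolean mask of nodes whose labels are kept fixed
    """
    W = np.asarray(W, dtype=float)
    Y0 = np.asarray(Y0, dtype=float)
    labeled = np.asarray(labeled, dtype=bool)
    P = W / W.sum(axis=1, keepdims=True)  # row-normalised affinities
    Y = Y0.copy()
    for _ in range(n_iter):
        Y = P @ Y                  # each node takes the weighted mean of its neighbours
        Y[labeled] = Y0[labeled]   # clamp the known (labeled) nodes
    return Y

# Hypothetical 4-node chain 0 - 1 - 2 - 3 with unit affinities.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Y0 = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [3.0, 1.5]])
labeled = np.array([True, False, False, True])
Y = propagate_labels(W, Y0, labeled)
```

In this toy chain the two unlabeled nodes settle on real-valued offsets interpolated between the labeled ends, so every labeled node connected by a weighted path contributes to the result rather than only the nearest match.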
Semi-supervised Matching Framework: Three Main Categories of Learning Methods
- Supervised learning: given that [image] is √ and [image] is ×, is [image] √ or ×?
- Unsupervised learning: given [image], [image], and [image], is there any interesting structure in them?
- Semi-supervised learning: [image] √, [image] ?, [image] ×
Semi-supervised Matching Framework: Notations
For $p = 1$ or $2$, and $q = 3 - p$:
- $x^p_{(s^p - 1) \times c^p + t^p}$: the point at coordinate position $(s^p, t^p)$ in the $p$-th image space, with $s^p \in \{1, \cdots, r^p\}$, $t^p \in \{1, \cdots, c^p\}$, and $i = (s^p - 1) \times c^p + t^p$.
- $X^p$: input image with size $n^p = r^p \times c^p$ pixels, $X^p = \left( x^p_1, x^p_2, \ldots, x^p_{(s^p - 1) \times c^p + t^p}, \ldots, x^p_{n^p} \right)^T$.
- $x^q_j$: a matching point of $x^p_i$, located at coordinate position $(s^q, t^q)$ in the $q$-th continuous image space, $s^q, t^q \in \mathbb{R}$.
- Label vector: $y^p_i = \left( v^p_i, h^p_i \right)^T = \left( s^2, t^2 \right)^T - \left( s^1, t^1 \right)^T \in \mathbb{R}^2$, representing the position offset from the point in the first image to the point in the second image.
- Label matrix $Y^p = \left( y^p_1, \cdots, y^p_{n^p} \right)^T$ and visibility vector $O^p = \left( o^p_1, \cdots, o^p_{n^p} \right)^T$.
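The indexing and label conventions in the notation slide can be checked with a short sketch. The helper names `pixel_index`, `pixel_position`, and `label_vector`, and the example image size and match position, are hypothetical and chosen only to illustrate the formulas $i = (s - 1) \times c + t$ and $y = (s^2, t^2)^T - (s^1, t^1)^T$.

```python
import numpy as np

def pixel_index(s, t, c):
    """1-based linear index i = (s - 1) * c + t for row s, column t, c columns."""
    return (s - 1) * c + t

def pixel_position(i, c):
    """Inverse mapping: recover the 1-based (s, t) from the linear index i."""
    s, t = divmod(i - 1, c)
    return s + 1, t + 1

def label_vector(pos_first, pos_second):
    """Offset from a point in the first image to its match in the second image."""
    return np.asarray(pos_second, dtype=float) - np.asarray(pos_first, dtype=float)

# Hypothetical image with r = 3 rows and c = 4 columns (n = 12 pixels).
r, c = 3, 4
i = pixel_index(2, 3, c)                 # (2 - 1) * 4 + 3 = 7
s, t = pixel_position(i, c)              # back to (2, 3)
y = label_vector((2, 3), (2.5, 7.25))    # real-valued match position -> offset (0.5, 4.25)
```

Because the matching point lives in a continuous image space, the offset `y` is real-valued rather than an integer disparity, which is what allows sub-pixel matches to be represented.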