Convex Methods for Dense Semantic 3D Reconstruction
Christian Häne
Computer Vision and Geometry Group, ETHZ
May 2014
Christian Häne (ETHZ) Semantic 3D Reconstruction May 2014 1 / 45

Outline
1. Convex Multi-Label Formulation
2. Joint 3D Scene Reconstruction and Class Segmentation
3. Class Specific 3D Object Shape Priors Using Surface Normals

Outline
1. Convex Multi-Label Formulation
2. Joint 3D Scene Reconstruction and Class Segmentation
3. Class Specific 3D Object Shape Priors Using Surface Normals
- C. Zach, C. Häne, M. Pollefeys, What Is Optimized in Convex Relaxations for Multi-Label Problems: Connecting Discrete and Continuously-Inspired MAP Inference, TPAMI 2014
- C. Zach, C. Häne, M. Pollefeys, What Is Optimized in Tight Convex Relaxations for Multi-Label Problems?, CVPR 2012

Labeling Problems
- Given: a set of nodes (pixels, superpixels, voxels)
- Goal: assign one out of L labels to each node
- Local preference per node plus regularization
- Energy minimization problem
- Omnipresent in computer vision
- Most multi-label (L > 2) instances are NP-hard

Approaches
- Different solution approaches
  - Graph cuts
  - Belief propagation
  - Convex relaxation
  - ...
- This talk: convex relaxation only
  - Discrete domain (graphical model)
  - Continuous domain

Discrete Domain
- Describe the domain by a graph
- For images: a node per pixel and edges to the neighbors
- Assign one out of L labels to each node
- $\theta_s^i$: cost for assigning label i at node s
- $\theta_{st}^{ij}$: cost for assigning i at s and j at t
- Find the assignment that has minimal cost

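As a minimal sketch of the discrete problem above, the following brute-force search minimizes unary plus pairwise cost on a toy chain. The 4-node chain, the cost values, and the Potts smoothness are assumptions made for illustration and do not come from the talk. Exhaustive search is only feasible for tiny instances, which is why relaxations are needed in practice.

```python
from itertools import product

def labeling_energy(labels, unary, edges, pairwise):
    """Sum of unary costs theta_s^i plus pairwise costs theta_st^ij."""
    cost = sum(unary[s][labels[s]] for s in range(len(labels)))
    cost += sum(pairwise[labels[s]][labels[t]] for s, t in edges)
    return cost

def brute_force_minimum(unary, edges, pairwise):
    """Try every labeling; only viable for a handful of nodes."""
    num_nodes, num_labels = len(unary), len(unary[0])
    best = min(product(range(num_labels), repeat=num_nodes),
               key=lambda lab: labeling_energy(lab, unary, edges, pairwise))
    return list(best), labeling_energy(best, unary, edges, pairwise)

# Hypothetical toy instance: chain of 4 nodes, 3 labels, Potts smoothness.
unary = [[0.0, 2.0, 2.0],
         [1.0, 0.9, 0.6],
         [2.0, 0.0, 2.0],
         [2.0, 2.0, 0.0]]
edges = [(0, 1), (1, 2), (2, 3)]
potts = [[0.0 if i == j else 0.5 for j in range(3)] for i in range(3)]

# Per-node argmin ignores the smoothness term.
independent = [min(range(3), key=row.__getitem__) for row in unary]
best_labels, best_cost = brute_force_minimum(unary, edges, potts)
```

Here the per-node optimum and the joint optimum differ at node 1, because the smoothness term makes agreeing with a neighbor cheaper than following the weak local preference.
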
LP Relaxation
- LP relaxation by Schlesinger et al. 1976; review by Tomas Werner 2007

$$\min_x \; \sum_{s,i} \theta_s^i x_s^i + \sum_{s,t} \sum_{i,j} \theta_{st}^{ij} x_{st}^{ij}$$
$$\text{s.t.} \quad x_s^i = \sum_j x_{st}^{ij}, \qquad x_t^i = \sum_j x_{st}^{ji}$$
$$\sum_i x_s^i = 1, \qquad x_s^i \ge 0, \qquad x_{st}^{ij} \ge 0 \qquad \forall s, t, i, j$$

- $\theta_s^i$ and $\theta_{st}^{ij}$: cost for assigning a label or a transition
- $x_s^i \in \{0, 1\}$ and $x_{st}^{ij} \in \{0, 1\}$: exact solution but non-convex problem
- Relaxed to a linear program: $x_s^i \in [0, 1]$ and $x_{st}^{ij} \in [0, 1]$
- Label assignment through thresholding

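The constraints of the relaxation can be checked directly on small examples. The sketch below builds the indicator variables of two labelings, averages them, and verifies that the fractional midpoint still satisfies marginalization and normalization. The 3-node chain and all cost values are hypothetical, and no LP solver is involved; this only illustrates the feasible set.

```python
def indicators(labels, edges, num_labels):
    """Integral LP variables x_node[s][i] and x_edge[(s, t)][i][j] of a labeling."""
    x_node = [[float(labels[s] == i) for i in range(num_labels)]
              for s in range(len(labels))]
    x_edge = {(s, t): [[float(labels[s] == i and labels[t] == j)
                        for j in range(num_labels)]
                       for i in range(num_labels)]
              for s, t in edges}
    return x_node, x_edge

def midpoint(a, b):
    """Convex combination (weight 1/2) of two LP points."""
    (na, ea), (nb, eb) = a, b
    x_node = [[0.5 * (p + q) for p, q in zip(ra, rb)] for ra, rb in zip(na, nb)]
    x_edge = {k: [[0.5 * (p + q) for p, q in zip(ra, rb)]
                  for ra, rb in zip(ea[k], eb[k])] for k in ea}
    return x_node, x_edge

def is_feasible(x_node, x_edge, tol=1e-9):
    """Check normalization, non-negativity and both marginalization constraints."""
    num_labels = len(x_node[0])
    for row in x_node:
        if abs(sum(row) - 1.0) > tol or min(row) < -tol:
            return False
    for (s, t), m in x_edge.items():
        for i in range(num_labels):
            if min(m[i]) < -tol:
                return False
            if abs(x_node[s][i] - sum(m[i])) > tol:
                return False
            if abs(x_node[t][i] - sum(m[j][i] for j in range(num_labels))) > tol:
                return False
    return True

def lp_objective(x_node, x_edge, unary, pairwise):
    num_labels = len(unary[0])
    cost = sum(unary[s][i] * x_node[s][i]
               for s in range(len(unary)) for i in range(num_labels))
    cost += sum(pairwise[i][j] * m[i][j] for m in x_edge.values()
                for i in range(num_labels) for j in range(num_labels))
    return cost

# Hypothetical toy instance: chain of 3 nodes, 2 labels.
unary = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
edges = [(0, 1), (1, 2)]
pairwise = [[0.0, 0.3], [0.3, 0.0]]

point_a = indicators([0, 0, 1], edges, 2)
point_b = indicators([0, 1, 1], edges, 2)
fractional = midpoint(point_a, point_b)
```

The midpoint assigns node 1 the value 0.5 for both labels, which is feasible for the linear program but is not a labeling; this is the situation that the thresholding step has to resolve.
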
Metrication Artifacts
- Grid-graph-based representation
- Smoothness cost measured by crossing edges
- Inpainting example: multiple equally good solutions
- Penalize true boundary length: continuous formulation

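A small sketch of why counting crossing edges causes metrication artifacts: on a 4-connected grid, an axis-aligned boundary is measured correctly, while a diagonal boundary is measured along its staircase and therefore overestimated by a factor of about sqrt(2). The 6 x 6 grid and the two masks are assumptions chosen for illustration.

```python
import math

def crossing_edges(mask):
    """Number of 4-neighbor grid edges whose endpoints carry different labels."""
    rows, cols = len(mask), len(mask[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and mask[r][c] != mask[r][c + 1]:
                count += 1
            if r + 1 < rows and mask[r][c] != mask[r + 1][c]:
                count += 1
    return count

n = 6
# Straight vertical boundary in the middle of the grid.
vertical = [[1 if c >= n // 2 else 0 for c in range(n)] for r in range(n)]
# Staircase approximation of a diagonal boundary.
diagonal = [[1 if c > r else 0 for c in range(n)] for r in range(n)]

vertical_cost = crossing_edges(vertical)
diagonal_cost = crossing_edges(diagonal)
# Euclidean length of the straight diagonal that the staircase approximates.
diagonal_length = (n - 1) * math.sqrt(2.0)
overestimate = diagonal_cost / diagonal_length
```

Because the grid cost prefers axis-aligned boundaries, several differently shaped completions can have identical cost, which matches the ambiguity seen in the inpainting example.
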
Continuously Inspired Formulation [Chambolle et al. 2008]
- Domain continuous (e.g. image plane), label space discrete
- Domain segmented into areas that have one out of L labels assigned
- Smoothness cost: $\theta^{ij}$ times boundary length between labels i and j
- Smoothness needs to form a metric over the label space
- Original formulation: continuous primal-dual saddle point
- Our version: discretized pure primal formulation [Zach et al. 2012]

$$\min_{x,y} \; \sum_{s,i} \theta_s^i x_s^i + \sum_s \sum_{i,j : i < j} \theta_s^{ij} \, \| y_s^{ij} \|_2$$
$$\text{s.t.} \quad \nabla x_s^i = \sum_{j : j < i} y_s^{ji} - \sum_{j : j > i} y_s^{ij}$$
$$\sum_i x_s^i = 1, \qquad x_s^i \ge 0 \qquad \forall s, i, j$$

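For two labels the constraint above determines the single variable $y_s^{12}$ completely: $\nabla x_s^2 = y_s^{12}$ and $\nabla x_s^1 = -y_s^{12}$. The following sketch evaluates the discretized energy in that special case. Forward differences with zero gradient at the border, the 4 x 4 grid, and the cost values are assumptions made for this example; the talk does not specify a discretization.

```python
import math

def forward_gradient(field):
    """Forward differences (along columns, along rows); zero past the grid border."""
    rows, cols = len(field), len(field[0])
    grad = []
    for r in range(rows):
        grad_row = []
        for c in range(cols):
            dx = field[r][c + 1] - field[r][c] if c + 1 < cols else 0.0
            dy = field[r + 1][c] - field[r][c] if r + 1 < rows else 0.0
            grad_row.append((dx, dy))
        grad.append(grad_row)
    return grad

def two_label_energy(x2, unary1, unary2, theta12):
    """Energy for L = 2, where x1 = 1 - x2 and y12 equals the gradient of x2."""
    y12 = forward_gradient(x2)
    rows, cols = len(x2), len(x2[0])
    data = sum(unary1[r][c] * (1.0 - x2[r][c]) + unary2[r][c] * x2[r][c]
               for r in range(rows) for c in range(cols))
    smooth = sum(theta12 * math.hypot(dx, dy) for row in y12 for dx, dy in row)
    return data + smooth

# Hypothetical 4 x 4 example with zero unary costs.
zeros = [[0.0] * 4 for _ in range(4)]
sharp = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]   # hard vertical boundary
ramp = [[0.0, 0.5, 1.0, 1.0] for _ in range(4)]    # relaxed, fractional boundary

sharp_energy = two_label_energy(sharp, zeros, zeros, theta12=1.5)
ramp_energy = two_label_energy(ramp, zeros, zeros, theta12=1.5)
```

Both fields have the same smoothness cost (boundary length 4 times 1.5), which illustrates that the relaxed problem can have fractional minimizers alongside the binary one.
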
Interpretation of $y_s^{ij}$
- Consider the following segmentation result
- [Figure: a region with $x^i = 1$ and a region with $x^j = 1$, separated by a boundary]
- Constraint:

$$\nabla x_s^i = \sum_{j : j < i} y_s^{ji} - \sum_{j : j > i} y_s^{ij}$$

- It follows that $\nabla x_s^i = y_s^{ij}$ and $\nabla x_s^j = -y_s^{ij}$ (up to the sign convention implied by the ordering of i and j in the constraint)
- $y_s^{ij}$: normal direction of the boundary between i and j at position s

Extensions
- Original continuously inspired formulation: only metric smoothness

$$\theta_s^{ij} \le \theta_s^{ik} + \theta_s^{kj}, \qquad \theta_s^{ii} = 0$$

- Metric smoothness is meaningful for e.g. image denoising
- Not meaningful for semantic segmentation
- Anisotropic smoothness is sometimes desired
  - Aligning segmentation boundary direction with image edges
  - Well known for binary segmentation [Esedoglu and Osher 2004]
- Goal: formulation for non-metric and anisotropic smoothness

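The metric condition is easy to test for a given cost table. The sketch below checks the two stated requirements (zero diagonal and triangle inequality). The Potts table and the "semantic" table, including its label names and values, are hypothetical examples and not costs used in the talk.

```python
def is_metric(theta, tol=1e-12):
    """Check theta[i][i] = 0 and theta[i][j] <= theta[i][k] + theta[k][j]."""
    num_labels = len(theta)
    if any(abs(theta[i][i]) > tol for i in range(num_labels)):
        return False
    return all(theta[i][j] <= theta[i][k] + theta[k][j] + tol
               for i in range(num_labels)
               for j in range(num_labels)
               for k in range(num_labels))

# Potts model: every label change costs the same.
potts = [[0.0 if i == j else 1.0 for j in range(3)] for i in range(3)]

# Hypothetical semantic costs with labels 0 = sky, 1 = building, 2 = ground:
# a direct sky/ground transition is penalized far more than going via building.
semantic = [[0.0, 1.0, 5.0],
            [1.0, 0.0, 1.0],
            [5.0, 1.0, 0.0]]

potts_is_metric = is_metric(potts)
semantic_is_metric = is_metric(semantic)
```

The second table violates the triangle inequality (5 > 1 + 1), so it cannot be used with the original continuously inspired formulation, which motivates the extension.
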
Anisotropic Smoothness [Esedoglu and Osher 2004]
- [Figure: boundary between the regions $x^i = 1$ and $x^j = 1$, with segments of length $\ell_1$ and $\ell_2$ and normals $n_1^{ij}$ and $n_2^{ij}$]
- Goal: penalize boundary length $\ell$ weighted by its direction $n$
- Exchange $\theta_s^{ij} \| y_s^{ij} \|_2$ by $\phi_s^{ij}(y_s^{ij})$
- $\phi_s^{ij}(\cdot) : \mathbb{R}^N \to \mathbb{R}_0^+$ is a convex, positively 1-homogeneous function
- How do we specify such functions? Next slide
- $n_s^{ij}$: normal of the boundary between labels i and j at position s
- The regularizer penalizes by boundary length times $\phi_s^{ij}(n_s^{ij})$

Wulff Shape [Esedoglu and Osher 2004]
- Specifying a function $\phi(\cdot)$ can be hard
- Wulff shape $\mathcal{W}_\phi$
  - Convex shape
  - All possible $\phi(\cdot)$ can be specified by

$$\phi(y) = \max_{\mu \in \mathcal{W}_\phi} \mu \cdot y$$

- Defining $\phi(\cdot)$ through $\mathcal{W}_\phi$ is often easier
- [Figure: polar plot of an example Wulff shape]

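A minimal sketch of the formula above for polygonal Wulff shapes in 2D: since a linear function over a convex polygon attains its maximum at a vertex, it suffices to take the maximum over the vertices. The three shapes below (square, 64-gon, flattened box) are illustrative assumptions, not shapes from the talk.

```python
import math

def phi(wulff_vertices, y):
    """Support function of a convex polygon: max over vertices of mu . y."""
    return max(mx * y[0] + my * y[1] for mx, my in wulff_vertices)

# Square [-1, 1]^2: yields the l1 norm |y1| + |y2|.
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]

# Regular 64-gon inscribed in the unit circle: approximates the l2 norm.
disk = [(math.cos(2.0 * math.pi * k / 64), math.sin(2.0 * math.pi * k / 64))
        for k in range(64)]

# Flattened box: boundaries whose normal is (0, 1) are five times cheaper
# than boundaries whose normal is (1, 0).
flat = [(1.0, 0.2), (1.0, -0.2), (-1.0, 0.2), (-1.0, -0.2)]

l1_value = phi(square, (3.0, -4.0))
l2_value = phi(disk, (3.0, -4.0))
cheap = phi(flat, (0.0, 1.0))
expensive = phi(flat, (1.0, 0.0))
```

Any function built this way is convex and positively 1-homogeneous by construction, so the required properties of the regularizer never have to be verified separately.
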
Non-Metric
- The LP relaxation allows arbitrary smoothness
- The continuously inspired formulation allows only metrics
- Where is the difference?
  - The LP relaxation contains $x_{st}^{ij}$ variables that have to be non-negative
  - The continuously inspired formulation contains $y_s^{ij}$ that are in $[-1, 1]^N$
  - And hence no non-negative $x_s^{ii}$
- Fixed by introducing non-negative pseudo-marginals
  - Split the positive and negative part of $y_s^{ij}$ into individual variables
  - $x_s^{ij} := \max\{0, y_s^{ij}\}$ and $x_s^{ji} := -\min\{0, y_s^{ij}\}$
  - $y_s^{ij} = x_s^{ij} - x_s^{ji}$ is still present in the formulation
  - $x_s^{ii}$ and the non-negativity constraint are added
- This allows non-metric smoothness for the discretized case

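The splitting step can be written out componentwise. This is a small sketch with a made-up vector; it only shows that the two non-negative parts recover the signed variable.

```python
def split_signed(y):
    """Split y into non-negative parts: x_ij = max(0, y), x_ji = -min(0, y)."""
    x_ij = [max(0.0, c) for c in y]
    x_ji = [-min(0.0, c) for c in y]
    return x_ij, x_ji

# Hypothetical value of y_s^ij in N = 3 dimensions.
y = [0.5, -0.25, 0.0]
x_ij, x_ji = split_signed(y)

# The signed variable is recovered as the difference of the two parts.
recovered = [a - b for a, b in zip(x_ij, x_ji)]
```
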
Final Convex Multi-Label Formulation
- The final formulation allows for non-metric and anisotropic smoothness at the same time

$$\min_x \; \sum_{s,i} \theta_s^i x_s^i + \sum_s \sum_{i,j : i < j} \phi_s^{ij}\left( x_s^{ij} - x_s^{ji} \right)$$
$$\text{s.t.} \quad x_s^i = \sum_j \left( x_s^{ij} \right)_k, \qquad x_s^i = \sum_j \left( x_{s - e_k}^{ji} \right)_k$$
$$\sum_i x_s^i = 1, \qquad x_s^i \ge 0, \qquad x_s^{ij} \ge 0 \qquad \forall s, i, j, k$$

- $e_k$: k-th canonical basis vector

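To make the constraints concrete, the following sketch specializes them to a 1D chain, where there is a single direction k and $x_s^{ij}$ is a scalar describing a transition from label i at node s to label j at node s + 1. The direction-dependent cost table, the label names, and the omission of the constraints at the two ends of the chain are assumptions for illustration.

```python
def transition_variables(labels, num_labels):
    """x[s][i][j] = 1 if node s has label i and node s + 1 has label j."""
    return [[[float(labels[s] == i and labels[s + 1] == j)
              for j in range(num_labels)]
             for i in range(num_labels)]
            for s in range(len(labels) - 1)]

def satisfies_marginalization(labels, x_trans, num_labels):
    """Check x_s^i = sum_j x_s^ij and x_(s+1)^i = sum_j x_s^ji on the chain."""
    x_node = [[float(lab == i) for i in range(num_labels)] for lab in labels]
    for s, x in enumerate(x_trans):
        for i in range(num_labels):
            outgoing = sum(x[i][j] for j in range(num_labels))
            incoming = sum(x[j][i] for j in range(num_labels))
            if x_node[s][i] != outgoing or x_node[s + 1][i] != incoming:
                return False
    return True

def smoothness(x_trans, cost, num_labels):
    """Sum over i < j of phi^ij(x^ij - x^ji) with a direction-dependent phi.

    phi^ij(z) = cost[i][j] * max(z, 0) + cost[j][i] * max(-z, 0) is convex
    and positively 1-homogeneous for non-negative costs.
    """
    total = 0.0
    for x in x_trans:
        for i in range(num_labels):
            for j in range(i + 1, num_labels):
                z = x[i][j] - x[j][i]
                total += cost[i][j] * max(z, 0.0) + cost[j][i] * max(-z, 0.0)
    return total

# Hypothetical labels along a vertical column, listed bottom to top:
# 0 = ground, 1 = free space. Ground below free space is cheap, the reverse is not.
cost = [[0.0, 0.1],
        [5.0, 0.0]]
plausible = [0, 0, 1, 1]
implausible = [1, 1, 0, 0]

plausible_cost = smoothness(transition_variables(plausible, 2), cost, 2)
implausible_cost = smoothness(transition_variables(implausible, 2), cost, 2)
```

Note that the diagonal entries $x_s^{ii}$ are nonzero wherever the label does not change; they are what makes the marginalization constraints hold away from boundaries.
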
Summary
- LP relaxation
  - Arbitrary smoothness cost
  - Metrication artifacts
- Continuously inspired formulation
  - Penalizes boundary length
  - Only metric smoothness
- Extensions
  - Anisotropic costs in the multi-label case
  - Non-metric smoothness

Outline
1. Convex Multi-Label Formulation
2. Joint 3D Scene Reconstruction and Class Segmentation
3. Class Specific 3D Object Shape Priors Using Surface Normals
- C. Häne, C. Zach, A. Cohen, R. Angst, M. Pollefeys, Joint 3D Scene Reconstruction and Class Segmentation, CVPR 2013

Idea
- Two intrinsically ill-posed problems
  - Image segmentation
  - Dense 3D modeling
- Object category influences the desired surface smoothness
- Optimize both jointly

Formulation
- Baseline method: volumetric depth map fusion
  - Segmentation of a voxel space into free and occupied space: $u_s \in [0, 1]$
- Our joint fusion
  - Labeling of a voxel space into L labels: $x_s^i \in [0, 1]$ and $\sum_i x_s^i = 1$
  - We use: free space, building, ground, vegetation, clutter
- Convex energy
  - Unary term: connects image appearance and depth maps
  - Smoothness term: dependent on surface orientation and involved labels

Energy
- Objective

$$E(x, y) = \sum_{s \in \Omega} \left( \sum_i \rho_s^i x_s^i + \sum_{i,j : i < j} \phi^{ij}(y_s^{ij}) \right)$$

- Subject to marginalization and normalization constraints
- $x_s^i \in [0, 1]$: indicates whether label i is chosen at voxel s
- $y_s^{ij} \in [-1, 1]^3$: represents the local surface orientation
- $\rho_s^i$: joint unary term (from depth maps and class likelihoods)
- $\phi^{ij}$: convex smoothness term (trained from a cadastral city model)
- Optimized using the primal-dual algorithm [Chambolle and Pock 2011]

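To show the structure of the primal-dual iteration, here is a sketch on a much simpler problem than the one in the talk: 1D total variation denoising, $\min_x \frac{1}{2}\|x - f\|^2 + \lambda \|Dx\|_1$ with forward differences $D$. The problem, the step sizes, the iteration count, and the signal are all assumptions; only the alternating dual projection, primal proximal step, and over-relaxation follow the general scheme of Chambolle and Pock.

```python
def tv_denoise_1d(f, lam, iterations=500, tau=0.45, sigma=0.45):
    """Primal-dual iterations for 0.5 * ||x - f||^2 + lam * ||D x||_1 in 1D.

    Step sizes satisfy tau * sigma * ||D||^2 < 1, since ||D||^2 <= 4 in 1D.
    """
    n = len(f)
    x = list(f)
    x_bar = list(f)
    y = [0.0] * (n - 1)
    for _ in range(iterations):
        # Dual step: gradient ascent, then projection onto [-lam, lam].
        y = [max(-lam, min(lam, y[k] + sigma * (x_bar[k + 1] - x_bar[k])))
             for k in range(n - 1)]
        # Adjoint of the forward difference applied to y.
        dt_y = [(y[k - 1] if k > 0 else 0.0) - (y[k] if k < n - 1 else 0.0)
                for k in range(n)]
        # Primal step: proximal map of the quadratic data term.
        x_new = [(x[k] - tau * dt_y[k] + tau * f[k]) / (1.0 + tau)
                 for k in range(n)]
        # Over-relaxation of the primal variable.
        x_bar = [2.0 * x_new[k] - x[k] for k in range(n)]
        x = x_new
    return x

def tv_energy(x, f, lam):
    data = 0.5 * sum((a - b) ** 2 for a, b in zip(x, f))
    smooth = lam * sum(abs(x[k + 1] - x[k]) for k in range(len(x) - 1))
    return data + smooth

# Hypothetical noisy step signal.
f = [0.0, 0.1, -0.1, 0.0, 1.0, 0.9, 1.1, 1.0]
lam = 0.1
x = tv_denoise_1d(f, lam)
```

In the joint fusion energy the same pattern applies in principle, with the projection replaced by the projection onto the Wulff shapes and the data term by the linear unary term plus the simplex constraint; those details are not reproduced here.
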
Joint Fusion: Training Overview
- Image-based classifier
- Geometric priors

Joint Fusion: Inference Overview
- [Figure: pipeline diagram with the labels below]
- Input
- Camera poses
- Vertical direction [Cohen et al. 2012]
- Joint Fusion

Appearance Likelihoods
- Images classified using a boosted decision tree classifier [STAIR Vision Library, Gould et al. 2010]
- 5 classes: sky, building, ground, vegetation, clutter
- Negative log likelihoods $\sigma_{\text{class } i}$, per superpixel
- Figure: Best cost labels

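A short sketch of how class probabilities for one superpixel turn into costs and a best-cost label. The probability values and the clamping constant `eps` are assumptions; the classifier itself is not modeled.

```python
import math

CLASSES = ["sky", "building", "ground", "vegetation", "clutter"]

def negative_log_likelihoods(probabilities, eps=1e-12):
    """Convert class probabilities into costs; eps guards against log(0)."""
    return [-math.log(max(p, eps)) for p in probabilities]

def best_cost_label(costs):
    """Name of the class with the smallest cost."""
    return CLASSES[min(range(len(costs)), key=costs.__getitem__)]

# Hypothetical classifier output for one superpixel.
probabilities = [0.05, 0.60, 0.20, 0.10, 0.05]
costs = negative_log_likelihoods(probabilities)
label = best_cost_label(costs)
```
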