
Non-Convex Relaxations for Rank Regularization Carl Olsson - PowerPoint PPT Presentation



  1. Non-Convex Relaxations for Rank Regularization. Carl Olsson, 2019-05-01.

  2. Structure from Motion and Factorization. Measurements $W \odot X$ (Hadamard product with a visibility/weight matrix $W$ for missing data). Affine camera model:
$$X = \underbrace{\begin{bmatrix} P_1 \\ P_2 \\ \vdots \end{bmatrix}}_{\text{camera matrices}} \underbrace{\begin{bmatrix} X_1 & X_2 & \cdots \end{bmatrix}}_{\text{3D points}}$$

  3. General Motion/Deformation. Linear shape basis assumption: each frame is a linear combination of a small number of basis shapes, so the stacked measurement matrix has low rank, e.g.
$$\begin{bmatrix} 0.1581 & 0.4714 & -0.9782 & 2.0509 & 1.8610 & -2.4750 \\ -0.0366 & -0.0468 & 0.2511 & 0.0532 & 0.2687 & 0.5076 \\ 0.5402 & -1.9804 & 0.4749 & -0.4343 & 2.0293 & 0.3569 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \end{bmatrix}$$

  4. Rank and Factorization. If $\operatorname{rank}(X) = r$ then
$$X = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} = \underbrace{\begin{bmatrix} b_1 & b_2 & \cdots & b_r \end{bmatrix}}_{B\ (m \times r)} \underbrace{\begin{bmatrix} c_{11} & \cdots & c_{n1} \\ \vdots & \ddots & \vdots \\ c_{1r} & \cdots & c_{nr} \end{bmatrix}}_{C^T\ (r \times n)}.$$
The factorization is not unique: $X = BC^T = (BG)(G^{-1}C^T) = \tilde{B}\tilde{C}^T$ for any invertible $G$. Degrees of freedom: $(m+n)r - r^2 \ll mn$; one can reconstruct at most $mn - ((m+n)r - r^2)$ missing elements. For example, with $m = n = 100$ and $r = 4$ the DOF is $784$, versus $mn = 10000$ matrix entries. A small DOF is desirable, so incorporate as many constraints as possible.

  5. Structure from Motion. [Figures: rigid reconstruction vs. non-rigid version.]

  6. Low Rank Approximation. Find the best rank-$r_0$ approximation of $X_0$:
$$\min_{\operatorname{rank}(X) = r_0} \|X - X_0\|_F^2$$
Eckart, Young (1936): closed-form solution via the SVD. If $X_0 = \sum_{i=1}^n \sigma_i(X_0) u_i v_i^T$ then $X = \sum_{i=1}^{r_0} \sigma_i(X_0) u_i v_i^T$. Alternative formulation:
$$\min_X \mu \operatorname{rank}(X) + \|X - X_0\|_F^2$$
Eckart, Young: $\sigma_i(X) = \sigma_i(X_0)$ if $\sigma_i(X_0) \ge \sqrt{\mu}$, and $\sigma_i(X) = 0$ otherwise.
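To make this concrete, here is a minimal numpy sketch (not part of the talk) of both closed-form solutions; the function names are hypothetical:

```python
import numpy as np

def best_rank_r0(X0, r0):
    # Eckart-Young: keep the r0 largest singular values, zero the rest.
    U, s, Vt = np.linalg.svd(X0, full_matrices=False)
    s[r0:] = 0.0
    return (U * s) @ Vt  # scales column i of U by s[i]

def rank_penalty_prox(X0, mu):
    # Minimizer of mu*rank(X) + ||X - X0||_F^2:
    # hard thresholding of the singular values at sqrt(mu).
    U, s, Vt = np.linalg.svd(X0, full_matrices=False)
    s[s < np.sqrt(mu)] = 0.0
    return (U * s) @ Vt
```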

  7. Low Rank Approximation, Generalizations:
$$\min_X g(\operatorname{rank}(X)) + \|\mathcal{A}X - b\|^2 + C(X)$$
No closed-form solution; non-convex and discontinuous, so even local optimization can be difficult. Goal: find "flexible, easy to optimize" relaxations.

  8. The Nuclear Norm Approach (Recht, Fazel, Parrilo 2008). Replace $\operatorname{rank}(X)$ with $\|X\|_* = \sum_{i=1}^n \sigma_i(X)$:
$$\min_X \mu \|X\|_* + \|\mathcal{A}X - b\|^2$$
Convex and can be solved optimally, but it has a shrinking bias: not good for SfM! Closed-form solution to $\min_X \mu \|X\|_* + \|X - X_0\|_F^2$: if $X_0 = \sum_{i=1}^n \sigma_i(X_0) u_i v_i^T$ then $X = \sum_{i=1}^n \max\left(\sigma_i(X_0) - \tfrac{\mu}{2}, 0\right) u_i v_i^T$ (soft thresholding).
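For comparison, a sketch of the nuclear-norm prox from this slide, under the same hypothetical conventions as above; note how every retained singular value is shrunk by $\mu/2$, which is precisely the shrinking bias:

```python
import numpy as np

def nuclear_norm_prox(X0, mu):
    # Minimizer of mu*||X||_* + ||X - X0||_F^2: soft thresholding,
    # max(sigma_i - mu/2, 0). Large singular values are pulled down
    # by mu/2 even when they are clearly "signal".
    U, s, Vt = np.linalg.svd(X0, full_matrices=False)
    s = np.maximum(s - mu / 2.0, 0.0)
    return (U * s) @ Vt
```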

  9. Just a few prior works.
Low-rank recovery via the nuclear norm:
- Fazel, Hindi, Boyd. A rank minimization heuristic with application to minimum order system approximation. 2001.
- Candès, Recht. Exact matrix completion via convex optimization. 2009.
- Candès, Li, Ma, Wright. Robust principal component analysis? 2011.
Non-convex approaches:
- Mohan, Fazel. Iterative reweighted least squares for matrix rank minimization. 2010.
- Gong, Zhang, Lu, Huang, Ye. A general iterative shrinkage and thresholding algorithm for non-convex regularized optimization problems. 2013.
Sparse signal recovery using the ℓ1 norm:
- Tropp. Just relax: Convex programming methods for identifying sparse signals in noise. 2006.
- Candès, Romberg, Tao. Stable signal recovery from incomplete and inaccurate measurements. 2006.
- Candès, Tao. Near-optimal signal recovery from random projections: Universal encoding strategies? 2006.
Non-convex approaches:
- Candès, Wakin, Boyd. Enhancing sparsity by reweighted ℓ1 minimization. 2008.

  10. Our Approach. Replace $\mu \operatorname{rank}(X)$ with $\mathcal{R}_\mu(\sigma(X)) = \sum_i \left[\mu - \max(\sqrt{\mu} - \sigma_i(X), 0)^2\right]$:
$$\min_X \mathcal{R}_\mu(\sigma(X)) + \|\mathcal{A}X - b\|^2$$
$\mathcal{R}_\mu$ is continuous but non-convex. The global minimizer does not change if $\|\mathcal{A}\| < 1$. $\mathcal{R}_\mu(\sigma(X)) + \|\mathcal{A}X - b\|^2$ is a lower bound on $\mu \operatorname{rank}(X) + \|\mathcal{A}X - b\|^2$, and $f^{**}_\mu(X) = \mathcal{R}_\mu(\sigma(X)) + \|X - X_0\|_F^2$ is the convex envelope of $f_\mu(X) = \mu \operatorname{rank}(X) + \|X - X_0\|_F^2$. [Larsson, Olsson. Convex Low Rank Regularization. IJCV 2016.]
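A one-line evaluation of $\mathcal{R}_\mu$, sketched in numpy under the same hypothetical conventions; each term is capped at $\mu$, so singular values above $\sqrt{\mu}$ pay the same constant cost as under $\mu \operatorname{rank}(X)$:

```python
import numpy as np

def R_mu(sigma, mu):
    # R_mu(sigma) = sum_i [ mu - max(sqrt(mu) - sigma_i, 0)^2 ].
    # For sigma_i >= sqrt(mu) the term equals mu (same as mu*rank);
    # below sqrt(mu) it interpolates smoothly down to 0.
    # Example: R_mu(np.array([2.0, 0.3]), 1.0) == 1.0 + 0.51
    return float(np.sum(mu - np.maximum(np.sqrt(mu) - sigma, 0.0) ** 2))
```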

  11. Shrinking Bias. 1D versions: $\operatorname{rank}(X) = \sum_i |\sigma_i(X)|_0$, $\|X\|_* = \sum_i \sigma_i(X)$, $\mathcal{R}_\mu(\sigma(X)) = \sum_i \left[\mu - [\sqrt{\mu} - \sigma_i(X)]_+^2\right]$. [Figure: singular value thresholding for $2\sqrt{\mu}\|X\|_* + \|X - X_0\|_F^2$, $\mu \operatorname{rank}(X) + \|X - X_0\|_F^2$, and $\mathcal{R}_\mu(\sigma(X)) + \|X - X_0\|_F^2$.]
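The comparison on this slide can be reproduced in a few lines (a sketch; the singular values below are made up):

```python
import numpy as np

mu = 1.0
s0 = np.array([3.0, 1.5, 0.6, 0.2])  # hypothetical singular values of X0

# mu*rank(X) + ||X - X0||_F^2: hard threshold at sqrt(mu).
# R_mu's prox gives the same values here (slide 10: the global
# minimizer does not change), so kept values are unbiased.
hard = np.where(s0 >= np.sqrt(mu), s0, 0.0)

# 2*sqrt(mu)*||X||_* + ||X - X0||_F^2: soft threshold at sqrt(mu);
# every kept singular value is shrunk by sqrt(mu).
soft = np.maximum(s0 - np.sqrt(mu), 0.0)

print(hard)  # [3.  1.5 0.  0. ]
print(soft)  # [2.  0.5 0.  0. ]  <- shrinking bias on the kept values
```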

  12. More General Framework. Computation of the convex envelopes $f^{**}_g(X) = \mathcal{R}_g(\sigma(X)) + \|X - X_0\|_F^2$ of $f_g(X) = g(\operatorname{rank}(X)) + \|X - X_0\|_F^2$, where $g(k) = \sum_{i=1}^k g_i$ and $0 \le g_1 \le g_2 \le \dots$, together with the corresponding proximal operators. Another special case: the hard rank constraint $f_{r_0}(X) = \mathcal{I}(\operatorname{rank}(X) \le r_0) + \|X - X_0\|_F^2$, where $\mathcal{I}$ is the indicator function. [Larsson, Olsson. Convex Low Rank Regularization. IJCV 2016.]

  13. Results, General Case. If $f_g(X) = g(\operatorname{rank}(X)) + \|X - X_0\|_F^2$ then
$$f^{**}_g(X) = \max_{\sigma(Z)} \sum_{i=1}^n \min\left(g_i, \sigma_i^2(Z)\right) - \|\sigma(Z) - \sigma(X)\|^2 + \|X - X_0\|_F^2.$$
The maximization over $Z$ reduces to a 1D search (piecewise quadratic, concave objective function) and can be done in $O(n)$ time, where $n$ is the number of singular values.

  14. Convexity of $f^{**}_\mu$. [Figure: plots of $\mu - [\sqrt{\mu} - \sigma]_+^2 + (\sigma - \sigma_0)^2$ for $\sigma_0 = 0, 1, 2$, and of $\mu - [\sqrt{\mu} - \sigma]_+^2 + \sigma^2$; axis marks at $\sqrt{\mu}$, $2\sqrt{\mu}$ and $\mu$, $2\mu$.] If $\sigma_0 = 2$ the function will not try to make $\sigma = 0$!
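A short 1D verification, not spelled out on the slide, of why these functions are convex:
$$\mu - \left[\sqrt{\mu} - \sigma\right]_+^2 + \sigma^2 = \begin{cases} 2\sqrt{\mu}\,\sigma, & 0 \le \sigma \le \sqrt{\mu}, \\ \mu + \sigma^2, & \sigma \ge \sqrt{\mu}. \end{cases}$$
The two branches meet at $\sigma = \sqrt{\mu}$ with the same value ($2\mu$) and slope ($2\sqrt{\mu}$), so the function is convex; replacing $\sigma^2$ by $(\sigma - \sigma_0)^2 = \sigma^2 - 2\sigma\sigma_0 + \sigma_0^2$ only adds a term linear in $\sigma$, which preserves convexity for every $\sigma_0$. In particular, for $\sigma_0 = 2 \ge \sqrt{\mu}$ the minimum sits at $\sigma = \sigma_0$ and the relaxation does not shrink it toward $0$.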

  15. Interpretations: $f^{**}_{r_0}$, with $f_{r_0}(X) = \mathcal{I}(\operatorname{rank}(X) \le r_0) + \|X - X_0\|_F^2$. [Figure: level-set surfaces $\{X \mid \mathcal{R}_{r_0}(X) = \alpha\}$ for $X = \operatorname{diag}(x_1, x_2, x_3)$ with $r_0 = 1$ (left) and $r_0 = 2$ (middle).] Note that when $r_0 = 1$ the regularizer promotes solutions where only one of the $x_k$ is non-zero; for $r_0 = 2$ the regularizer instead favors solutions with two non-zero $x_k$. For comparison the level set of the nuclear norm is also included.

  16. Hankel Matrix Estimation. [Figures: signal, its Hankel matrix, and the matrix with noise.]
$$\min_{H \in \mathcal{H}} \mathcal{I}(\operatorname{rank}(H) \le r_0) + \|H - X_0\|_F^2$$
[Plots: mean error $\|H - H(f)\|_1$ versus noise level $\sigma \in [0.1, 1]$ for $f^{**}_\mu$, $f^{**}_{r_0}$, and the nuclear norm, with plain SVD as baseline.]
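A minimal sketch of the Hankel structure used on this slide (the helper names are hypothetical; a full solver would alternate this projection with the singular-value thresholding from the earlier slides):

```python
import numpy as np
from scipy.linalg import hankel

def signal_to_hankel(x, L):
    # L x (len(x)-L+1) Hankel matrix: constant anti-diagonals.
    return hankel(x[:L], x[L - 1:])

def project_to_hankel(H):
    # Nearest Hankel matrix in Frobenius norm:
    # average each anti-diagonal, then rebuild.
    L, K = H.shape
    x = np.zeros(L + K - 1)
    counts = np.zeros(L + K - 1)
    for i in range(L):
        x[i:i + K] += H[i]      # row i hits anti-diagonals i..i+K-1
        counts[i:i + K] += 1
    return signal_to_hankel(x / counts, L)
```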

  17. Smooth Linear Shape Basis.
$$f_N(S) = \|S - S_0\|_F^2 + \mu \|P(S)\|_* + \tau \operatorname{TV}(S)$$
$$f_R(S) = \|S - S_0\|_F^2 + \mathcal{R}_\mu(P(S)) + \tau \operatorname{TV}(S)$$
[Plot: reconstruction error $\|S - S_{GT}\|_F$ versus noise level $\sigma \in [0.01, 0.10]$ for $\mathcal{R}_\mu$ and the nuclear norm.]

  18. RIP Problems. Linear observations: $b = \mathcal{A}X_0 + \epsilon$, with $\mathcal{A}: \mathbb{R}^{m \times n} \to \mathbb{R}^p$, $X_0$ low rank, $\epsilon$ noise. Recover $X_0$ using rank penalties/constraints:
$$\mathcal{R}_\mu(\sigma(X)) + \|\mathcal{A}X - b\|^2 \quad \text{or} \quad \mathcal{R}_{r_0}(\sigma(X)) + \|\mathcal{A}X - b\|^2$$
Restricted Isometry Property (RIP):
$$(1 - \delta_q)\|X\|_F^2 \le \|\mathcal{A}X\|^2 \le (1 + \delta_q)\|X\|_F^2 \quad \text{for } \operatorname{rank}(X) \le q.$$
[Olsson, Carlsson, Andersson, Larsson. Non-Convex Rank/Sparsity Regularization and Local Minima. ICCV 2017. Olsson, Carlsson, Bylow. A Non-Convex Relaxation for Fixed-Rank Approximation. RSL-CV 2017.]

  19. Near-Convex Rank/Sparsity Estimation. Intuition: if RIP holds then $\|\mathcal{A}X\|^2$ behaves like $\|X\|_F^2$, and $\mathcal{R}_\mu(\sigma(X)) + \|X - X_0\|_F^2 = \mathcal{R}_\mu(\sigma(X)) + \|X\|_F^2 - 2\langle X, X_0 \rangle + \text{const}$ is convex. What about $\mathcal{R}_\mu(\sigma(X)) + \|\mathcal{A}X - b\|^2 = \mathcal{R}_\mu(\sigma(X)) + \|\mathcal{A}X\|^2 - 2\langle X, \mathcal{A}^* b \rangle + \text{const}$? Near convex? [Figure: 1D example $\mathcal{R}_\mu(x) + (\tfrac{1}{2}x - b)^2$ with $\mu = 1$, plotted for several values of $b$, roughly $b = 0, \tfrac{1}{\sqrt{2}}, 1, \sqrt{2}, 1.5$.]

  20. Main Result (Rank Penalty). Define $F_\mu(X) := \mathcal{R}_\mu(\sigma(X)) + \|\mathcal{A}X - b\|^2$ and $Z := (I - \mathcal{A}^*\mathcal{A})X_s + \mathcal{A}^* b$. Then $X_s$ is a stationary point of $F_\mu$ iff $X_s \in \arg\min_X \mathcal{R}_\mu(\sigma(X)) + \|X - Z\|_F^2$. Here $\|X - Z\|_F^2$ is a local approximation of $\|\mathcal{A}X - b\|^2$ around $X_s$, and $X_s$ is obtained by thresholding the SVD of $Z$.
Theorem. If $X_s$ is a stationary point of $F_\mu$ and the singular values of $Z$ fulfill $\sigma_i(Z) \notin [(1 - \delta_r)\sqrt{\mu},\ \sqrt{\mu}/(1 - \delta_r)]$, then for any other stationary point $X'_s$ we have $\operatorname{rank}(X'_s - X_s) > r$.
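The stationarity characterization suggests an obvious fixed-point scheme. A minimal numpy sketch, assuming `A` and `At` are callables for the measurement operator and its adjoint with $\|\mathcal{A}\| < 1$ as on slide 10; this is a generic forward-backward-style iteration, not necessarily the authors' solver:

```python
import numpy as np

def prox_R_mu(Z, mu):
    # argmin_X R_mu(sigma(X)) + ||X - Z||_F^2: hard thresholding of
    # the singular values of Z at sqrt(mu) (slide 10: the relaxation
    # keeps the global minimizer of the rank penalty).
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s[s < np.sqrt(mu)] = 0.0
    return (U * s) @ Vt

def fixed_point(A, At, b, X, mu, iters=200):
    # Iterate X <- argmin_X R_mu(sigma(X)) + ||X - Z||_F^2 with
    # Z = (I - A*A)X + A*b = X - At(A(X) - b). Fixed points are
    # exactly the stationary points characterized on this slide.
    for _ in range(iters):
        X = prox_R_mu(X - At(A(X) - b), mu)
    return X
```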

  21. Main Result (Rank Constraint). Define $F_{r_0}(X) := \mathcal{R}_{r_0}(\sigma(X)) + \|\mathcal{A}X - b\|^2$ and $Z := (I - \mathcal{A}^*\mathcal{A})X_s + \mathcal{A}^* b$. Then $X_s$ is a stationary point of $F_{r_0}$ iff $X_s \in \arg\min_X \mathcal{R}_{r_0}(\sigma(X)) + \|X - Z\|_F^2$. Here $\|X - Z\|_F^2$ is a local approximation of $\|\mathcal{A}X - b\|^2$ around $X_s$, and $X_s$ is obtained by thresholding the SVD of $Z$.
Theorem. If $X_s$ is a stationary point of $F_{r_0}$ with $\operatorname{rank}(X_s) = r_0$, and the singular values of $Z$ fulfill $\sigma_{r_0+1}(Z) < (1 - 2\delta_{2r_0})\,\sigma_{r_0}(Z)$, then any other stationary point $X'_s$ has $\operatorname{rank}(X'_s) > r_0$.
