Optimization and curve fitting on manifolds
Pierre-Antoine Absil (Dept. of Mathematical Engineering, UCLouvain)
Journée conjointe des GDR MIA et ISIS, Optimisation Géométrique sur les Variétés
21st November 2014
Optimization on Manifolds in one picture
[Figure: a real-valued cost function $f : M \to \mathbb{R}$ on a manifold $M$, with a point $x \in M$.]
A book
http://press.princeton.edu/titles/8586.html
Optimization Algorithms on Matrix Manifolds
P.-A. Absil, R. Mahony, R. Sepulchre
Princeton University Press, January 2008
1. Introduction
2. Motivation and applications
3. Matrix manifolds: first-order geometry
4. Line-search algorithms
5. Matrix manifolds: second-order geometry
6. Newton’s method
7. Trust-region methods
8. A constellation of superlinear algorithms
A toolbox
http://www.manopt.org/
Ref: Nicolas Boumal et al., Manopt, a Matlab toolbox for optimization on manifolds, JMLR 15(Apr):1455-1459, 2014.
Optimization on manifolds: an introduction
Motivation and problem formulation
Why general manifolds? – Motivating examples

Embedded submanifold:
Given $A = A^T \in \mathbb{R}^{n \times n}$ and $N = \mathrm{diag}(p, p-1, \dots, 1)$,
  $\min f(X) = -\mathrm{trace}(X^T A X N)$ subj. to $X \in \mathbb{R}^{n \times p}$, $X^T X = I$.
Feasible set: $\mathrm{St}(p,n) = \{ X \in \mathbb{R}^{n \times p} : X^T X = I \}$.

Quotient manifold:
Given $A = A^T \in \mathbb{R}^{n \times n}$,
  $\min f(Y) = -\mathrm{trace}\big( (Y^T Y)^{-1} (Y^T A Y) \big)$ subj. to $Y \in \mathbb{R}_*^{n \times p}$ (i.e., $Y$ full rank).
Invariance: $f(YM) = f(Y)$ for every $M \in \mathbb{R}_*^{p \times p}$.
Feasible set: $\mathrm{Gr}(p,n) = \big\{ \{ YM : M \in \mathbb{R}_*^{p \times p} \} : Y \in \mathbb{R}_*^{n \times p} \big\}$.
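The invariance $f(YM) = f(Y)$ is what makes the second problem live on a quotient manifold, and it is easy to check numerically. The sketch below is illustrative only: the sizes `n`, `p`, the random data, and the helper names `stiefel_cost` and `grassmann_cost` are assumptions, not part of the talk.

```python
import numpy as np

def stiefel_cost(A, X, N):
    """Cost -trace(X^T A X N), meant for X with orthonormal columns."""
    return -np.trace(X.T @ A @ X @ N)

def grassmann_cost(A, Y):
    """Cost -trace((Y^T Y)^{-1} (Y^T A Y)) for a full-rank Y."""
    return -np.trace(np.linalg.solve(Y.T @ Y, Y.T @ A @ Y))

rng = np.random.default_rng(0)           # fixed seed, illustrative data
n, p = 5, 2
B = rng.standard_normal((n, n))
A = B + B.T                              # symmetric A
Y = rng.standard_normal((n, p))          # full rank with probability one
M = np.array([[2.0, 1.0], [0.0, 3.0]])   # an invertible p x p matrix

# The Grassmann cost depends only on the column space of Y.
gap = abs(grassmann_cost(A, Y @ M) - grassmann_cost(A, Y))

# An orthonormal basis of the same subspace is a point on St(p, n).
X, _ = np.linalg.qr(Y)
feasibility = np.linalg.norm(X.T @ X - np.eye(p))

# With N = I the Stiefel cost at X equals the Grassmann cost at Y.
same = abs(stiefel_cost(A, X, np.eye(p)) - grassmann_cost(A, Y))

print(gap, feasibility, same)
```

All three printed quantities should be at rounding-error level. With $N = \mathrm{diag}(p, \dots, 1)$ instead of the identity, the Stiefel cost is no longer invariant under a change of basis, which is why that problem is posed on $\mathrm{St}(p,n)$ rather than on $\mathrm{Gr}(p,n)$.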
Specific manifolds
Specific manifolds, and where they appear
◮ Stiefel manifold $\mathrm{St}(p,n)$ and orthogonal group $O_n = \mathrm{St}(n,n)$
  $\mathrm{St}(p,n) = \{ X \in \mathbb{R}^{n \times p} : X^T X = I_p \}$
  Applications: computer vision; principal component analysis; independent component analysis...
◮ Grassmann manifold $\mathrm{Gr}(p,n)$
  Set of all $p$-dimensional subspaces of $\mathbb{R}^n$
  Applications: various dimension reduction problems...
◮ Set of fixed-rank PSD matrices $S_+(p,n)$
  A quotient representation: $X \sim Y \Leftrightarrow \exists Q \in O_p : Y = XQ$
  Applications: low-rank approximation of symmetric matrices; algorithms for (large-scale) semidefinite programming...
Specific manifolds, and where they appear
◮ Low-rank manifold $\mathbb{R}^{m \times n}_{\mathrm{rk}\, p}$
  $\mathbb{R}^{m \times n}_{\mathrm{rk}\, p} = \{ M \in \mathbb{R}^{m \times n} : \mathrm{rk}(M) = p \}$
  Applications: dimensionality reduction; model for matrix completion...
◮ Shape manifold $O_n \backslash \mathbb{R}_*^{n \times p}$
  $Y \sim X \Leftrightarrow \exists U \in O_n : Y = UX$
  Applications: shape analysis
◮ Oblique manifold $\mathbb{R}_*^{n \times p} / S_{\mathrm{diag}+}$
  $\mathbb{R}_*^{n \times p} / S_{\mathrm{diag}+} \simeq \{ Y \in \mathbb{R}_*^{n \times p} : \mathrm{diag}(Y^T Y) = I_p \}$
  Applications: blind source separation; factor analysis (oblique Procrustes problem)...
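Mathematical background
Smooth optimization problems on general manifolds
[Figure: a manifold $M$ with overlapping chart domains $U$ and $V$, charts $\varphi : U \to \mathbb{R}^d$ and $\psi : V \to \mathbb{R}^d$, and a function $f : M \to \mathbb{R}$. The transition maps $\psi \circ \varphi^{-1}$ on $\varphi(U \cap V)$ and $\varphi \circ \psi^{-1}$ on $\psi(U \cap V)$ are $C^\infty$.]
Is $f \in C^\infty(x)$? Yes iff $f \circ \varphi^{-1} \in C^\infty(\varphi(x))$.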
Optimization on manifolds in its most abstract formulation
[Figure: charts $\varphi$ on $U$ and $\psi$ on $V$ with $C^\infty$ transition maps $\psi \circ \varphi^{-1}$ and $\varphi \circ \psi^{-1}$. $f \in C^\infty(x)$ iff $f \circ \varphi^{-1} \in C^\infty(\varphi(x))$.]
Given:
◮ A set $M$ endowed (explicitly or implicitly) with a manifold structure (i.e., a collection of compatible charts).
◮ A function $f : M \to \mathbb{R}$, smooth in the sense of the manifold structure.
Task: Compute a local minimizer of $f$.
Algorithms on abstract manifolds
Algorithms formulated on abstract manifolds
◮ Steepest descent. Needs: Riemannian structure and retraction
◮ Newton. Needs: affine connection and retraction
◮ Conjugate Gradients. Needs: Riemannian structure, retraction, and vector transport
◮ BFGS. Needs: Riemannian structure, retraction, and vector transport
◮ Trust Region. Needs: Riemannian structure and retraction
Steepest descent on abstract manifolds
Required: Riemannian manifold $M$; retraction $R$ on $M$.
Iteration $x_k \in M \mapsto x_{k+1} \in M$ defined by
1. Compute the steepest-descent direction in $T_{x_k} M$: $\eta_k = -\mathrm{grad} f(x_k)$.
2. Set $x_{k+1} := R_{x_k}(t_k \eta_k)$, where $t_k$ is chosen using a line-search method.
[Figure: the direction $-\mathrm{grad} f(x)$ at $x$ and the next iterate $x_+$ on the manifold.]
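The two steps above can be sketched on a concrete manifold. The example below is a minimal illustration, not the talk's implementation: it assumes the unit sphere with the metric inherited from $\mathbb{R}^n$, the cost $f(x) = x^T A x$, the normalization retraction, and Armijo backtracking as the line search. The test matrix, the starting point, and the parameters `tol` and `c` are arbitrary choices.

```python
import numpy as np

def rgrad(A, x):
    """Riemannian gradient of f(x) = x^T A x on the unit sphere."""
    Ax = A @ x
    return 2.0 * (Ax - (x @ Ax) * x)    # tangent projection of 2 A x

def retract(x, xi):
    """Retraction by normalization: R_x(xi) = (x + xi) / ||x + xi||."""
    v = x + xi
    return v / np.linalg.norm(v)

def steepest_descent(A, x0, max_iter=500, tol=1e-6, c=1e-4):
    f = lambda z: z @ A @ z
    x = x0 / np.linalg.norm(x0)
    for _ in range(max_iter):
        g = rgrad(A, x)
        g2 = g @ g
        if np.sqrt(g2) < tol:           # stop once the gradient is small
            break
        eta = -g                        # step 1: steepest-descent direction
        t = 1.0
        # Step 2: Armijo backtracking (an assumed line-search rule).
        while f(retract(x, t * eta)) > f(x) - c * t * g2 and t > 1e-12:
            t *= 0.5
        x = retract(x, t * eta)
    return x

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
x = steepest_descent(A, np.ones(3))
lam_min = np.linalg.eigvalsh(A)[0]      # smallest eigenvalue of A
print(x @ A @ x, lam_min)
```

On this small example the final cost should agree with the smallest eigenvalue of `A` to many digits, since minimizers of the Rayleigh quotient on the sphere are eigenvectors for that eigenvalue.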
Newton on abstract manifolds
Required: Riemannian manifold $M$; retraction $R$ on $M$; affine connection $\nabla$ on $M$; real-valued function $f$ on $M$.
Iteration $x_k \in M \mapsto x_{k+1} \in M$ defined by
1. Solve the Newton equation
   $\mathrm{Hess} f(x_k)\, \eta_k = -\mathrm{grad} f(x_k)$
   for the unknown $\eta_k \in T_{x_k} M$, where $\mathrm{Hess} f(x_k)\, \eta_k := \nabla_{\eta_k} \mathrm{grad} f$.
2. Set $x_{k+1} := R_{x_k}(\eta_k)$.
Newton on submanifolds of $\mathbb{R}^n$
Required: Riemannian submanifold $M$ of $\mathbb{R}^n$; retraction $R$ on $M$; real-valued function $f$ on $M$.
Iteration $x_k \in M \mapsto x_{k+1} \in M$ defined by
1. Solve the Newton equation
   $\mathrm{Hess} f(x_k)\, \eta_k = -\mathrm{grad} f(x_k)$
   for the unknown $\eta_k \in T_{x_k} M$, where $\mathrm{Hess} f(x_k)\, \eta_k := P_{T_{x_k} M}\, \mathrm{D}\, \mathrm{grad} f(x_k)[\eta_k]$.
2. Set $x_{k+1} := R_{x_k}(\eta_k)$.
Newton on the unit sphere $S^{n-1}$
Required: real-valued function $f$ on $S^{n-1}$.
Iteration $x_k \in S^{n-1} \mapsto x_{k+1} \in S^{n-1}$ defined by
1. Solve the Newton equation
   $\begin{cases} P_{x_k}\, \mathrm{D}(\mathrm{grad} f)(x_k)[\eta_k] = -\mathrm{grad} f(x_k) \\ x_k^T \eta_k = 0 \end{cases}$
   for the unknown $\eta_k \in \mathbb{R}^n$, where $P_{x_k} = I - x_k x_k^T$.
2. Set $x_{k+1} := \dfrac{x_k + \eta_k}{\| x_k + \eta_k \|}$.
Newton for Rayleigh quotient optimization on unit sphere
Iteration $x_k \in S^{n-1} \mapsto x_{k+1} \in S^{n-1}$ defined by
1. Solve the Newton equation
   $\begin{cases} P_{x_k} A P_{x_k} \eta_k - \eta_k\, x_k^T A x_k = -P_{x_k} A x_k \\ x_k^T \eta_k = 0 \end{cases}$
   for the unknown $\eta_k \in \mathbb{R}^n$, where $P_{x_k} = I - x_k x_k^T$.
2. Set $x_{k+1} := \dfrac{x_k + \eta_k}{\| x_k + \eta_k \|}$.
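One way to solve this Newton equation numerically is through a bordered linear system. For tangent $\eta_k$ (so $P_{x_k} \eta_k = \eta_k$) and $\rho_k = x_k^T A x_k$, the equation holds exactly when $(A - \rho_k I)\eta_k + \mu x_k = -(A x_k - \rho_k x_k)$ for some scalar $\mu$, together with $x_k^T \eta_k = 0$. The sketch below uses that reformulation, which is a choice made here and not something the slide prescribes. The test matrix, the starting point, and the iteration count are illustrative.

```python
import numpy as np

def newton_step(A, x):
    """One Newton iteration for the Rayleigh quotient on the unit sphere."""
    n = x.size
    rho = x @ A @ x
    # Bordered system: unknowns (eta, mu), last row enforces x^T eta = 0.
    K = np.zeros((n + 1, n + 1))
    K[:n, :n] = A - rho * np.eye(n)
    K[:n, n] = x
    K[n, :n] = x
    rhs = np.concatenate([-(A @ x - rho * x), [0.0]])
    eta = np.linalg.solve(K, rhs)[:n]
    v = x + eta                          # step 2: retract by normalization
    return v / np.linalg.norm(v)

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
x = np.array([1.0, 0.0, 0.0])
for _ in range(8):
    x = newton_step(A, x)

rho = x @ A @ x
residual = np.linalg.norm(A @ x - rho * x)   # zero at an eigenpair
print(rho, residual)
```

Newton's method converges to a critical point of the Rayleigh quotient, that is, to some eigenvector of `A`, and not necessarily to the one with the smallest eigenvalue. The check above therefore looks only at the eigen-residual.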
Conjugate Gradients on abstract manifolds
Require: Riemannian manifold $M$; vector transport $\mathcal{T}$ on $M$ with associated retraction $R$; real-valued function $f$ on $M$; initial iterate $x_0 \in M$.
1: Set $\eta_0 = -\mathrm{grad} f(x_0)$.
2: for $k = 0, 1, 2, \dots$ do
3:   Compute a step size $\alpha_k$ and set $x_{k+1} = R_{x_k}(\alpha_k \eta_k)$. (1)
4:   Compute $\beta_{k+1}$ and set $\eta_{k+1} = -\mathrm{grad} f(x_{k+1}) + \beta_{k+1} \mathcal{T}_{\alpha_k \eta_k}(\eta_k)$. (2)
5: end for
Fletcher-Reeves: $\beta_{k+1} = \dfrac{\langle \mathrm{grad} f(x_{k+1}), \mathrm{grad} f(x_{k+1}) \rangle}{\langle \mathrm{grad} f(x_k), \mathrm{grad} f(x_k) \rangle}$.
Polak-Ribière: $\beta_{k+1} = \dfrac{\langle \mathrm{grad} f(x_{k+1}), \mathrm{grad} f(x_{k+1}) - \mathcal{T}_{\alpha_k \eta_k}(\mathrm{grad} f(x_k)) \rangle}{\langle \mathrm{grad} f(x_k), \mathrm{grad} f(x_k) \rangle}$.
Ref: PAA et al. [AMS08, §8.3], Sato & Iwai [SI13].
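As a rough illustration of steps 1 to 5 with the Polak-Ribière rule, the sketch below again works on the unit sphere with $f(x) = x^T A x$. Several ingredients are assumptions made for this example only: the vector transport is orthogonal projection onto the new tangent space, the step size comes from Armijo backtracking, and the direction is reset to the negative gradient whenever it fails to be a descent direction.

```python
import numpy as np

def proj(x, v):
    """Orthogonal projection of v onto the tangent space of the sphere at x."""
    return v - (x @ v) * x

def retract(x, xi):
    """Retraction by normalization."""
    v = x + xi
    return v / np.linalg.norm(v)

def riemannian_cg(A, x0, max_iter=2000, tol=1e-6, c=1e-4):
    f = lambda z: z @ A @ z
    grad = lambda z: 2.0 * proj(z, A @ z)
    x = x0 / np.linalg.norm(x0)
    g = grad(x)
    eta = -g                                 # line 1 of the algorithm
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        slope = g @ eta
        if slope >= 0.0:                     # safeguard (assumption): restart
            eta, slope = -g, -(g @ g)
        t = 1.0
        # Armijo backtracking (assumed rule for the step size alpha_k).
        while f(retract(x, t * eta)) > f(x) + c * t * slope and t > 1e-12:
            t *= 0.5
        x_new = retract(x, t * eta)          # equation (1)
        g_new = grad(x_new)
        # Polak-Ribiere; both transports are projections onto T_{x_new}.
        beta = (g_new @ (g_new - proj(x_new, g))) / (g @ g)
        eta = -g_new + beta * proj(x_new, eta)   # equation (2)
        x, g = x_new, g_new
    return x

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
x = riemannian_cg(A, np.ones(3))
lam_min = np.linalg.eigvalsh(A)[0]
print(x @ A @ x, lam_min)
```

Swapping in the Fletcher-Reeves rule only changes the line computing `beta` to `(g_new @ g_new) / (g @ g)`. The convergence analyses in the cited references rely on specific line-search conditions that this simple backtracking does not enforce.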
BFGS on abstract manifolds
1: Given: Riemannian manifold $M$ with Riemannian metric $g$; vector transport $\mathcal{T}$ on $M$ with associated retraction $R$; smooth real-valued function $f$ on $M$; initial iterate $x_0 \in M$; initial Hessian approximation $B_0$.
2: for $k = 0, 1, 2, \dots$ do
3:   Obtain $\eta_k \in T_{x_k} M$ by solving $B_k \eta_k = -\mathrm{grad} f(x_k)$.
4:   Compute step size $\alpha_k$ and set $x_{k+1} = R_{x_k}(\alpha_k \eta_k)$.
5:   Define $s_k = \mathcal{T}_{\alpha_k \eta_k}(\alpha_k \eta_k)$ and $y_k = \mathrm{grad} f(x_{k+1}) - \mathcal{T}_{\alpha_k \eta_k}(\mathrm{grad} f(x_k))$.
6:   Define the linear operator $B_{k+1} : T_{x_{k+1}} M \to T_{x_{k+1}} M$ by
     $B_{k+1} p = \tilde{B}_k p - \dfrac{g(s_k, \tilde{B}_k p)}{g(s_k, \tilde{B}_k s_k)} \tilde{B}_k s_k + \dfrac{g(y_k, p)}{g(y_k, s_k)} y_k$ for all $p \in T_{x_{k+1}} M$, (3)
     with $\tilde{B}_k = \mathcal{T}_{\alpha_k \eta_k} \circ B_k \circ (\mathcal{T}_{\alpha_k \eta_k})^{-1}$. (4)
7: end for
Ref: Qi et al. [QGA10], Ring & Wirth [RW12].
Trust region on abstract manifolds
[Figure: a trust-region step on $M$; labels: $\eta$, $y$, $y_+$, $m_y$, $T_y M$, $v_1$, $M$.]
Refs: PAA et al. [ABG07], Huang et al. [HAG14].
A brief history
Some classics on Optimization On Manifolds (I)
[Figure: cost function $f$ with an iterate $x$ and the next iterate $x_+$.]
Luenberger (1973), Introduction to Linear and Nonlinear Programming. Luenberger mentions the idea of performing line search along geodesics, “which we would use if it were computationally feasible (which it definitely is not)”.