
Nonsmooth trust region methods on Riemannian manifolds, S. Hosseini


  1. Nonsmooth trust region methods on Riemannian manifolds. S. Hosseini, Institut für Numerische Simulation, Universität Bonn, Bonn, Germany. S. Hosseini (Universität Bonn), Nonsmooth trust region methods on Riemannian manifolds, 1 / 32

  2. Trust Region Method

     $$\min_{x \in \mathbb{R}^n} f(x),$$

     where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. Each iteration minimizes a model function $Q_k$ defined by

     $$Q_k(x_k, d) = f(x_k) + \nabla f(x_k)^T d + \tfrac{1}{2} d^T B_k d$$

     over a restricted region centered at the current iterate. The matrix $B_k$ is chosen so that the model function preserves the first- and second-order information of the objective function $f$.

     The so-called trust region ratio measures the agreement between the model reduction and the actual objective reduction along the computed step. Based on this ratio, the step is accepted or rejected. The trust region radius is then updated and a new point is obtained.
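The model and the ratio described on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's code: the function names `quadratic_model` and `trust_region_ratio`, the toy objective, and the step `d` are all assumptions chosen for the example.

```python
import numpy as np

def quadratic_model(f_x, grad, B, d):
    """Q_k(x_k, d) = f(x_k) + grad^T d + 0.5 * d^T B d."""
    return f_x + grad @ d + 0.5 * d @ B @ d

def trust_region_ratio(f, x, d, grad, B):
    """Actual reduction of f divided by the reduction predicted by the model."""
    f_x = f(x)
    predicted = f_x - quadratic_model(f_x, grad, B, d)
    actual = f_x - f(x + d)
    return actual / predicted

# Toy objective (hypothetical): f(x) = x_0^2 + 2 x_1^2
f = lambda x: x[0] ** 2 + 2.0 * x[1] ** 2
x = np.array([1.0, 1.0])
grad = np.array([2.0, 4.0])   # gradient of f at x
B = np.diag([2.0, 4.0])       # exact Hessian, so the model reproduces f
d = -0.1 * grad               # a short steepest-descent step
rho = trust_region_ratio(f, x, d, grad, B)
```

Because the toy objective is quadratic and `B` is its exact Hessian, the model and the objective coincide and the ratio equals 1 up to rounding. A ratio far below 1 would signal that the model is a poor predictor at the current radius.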

  3. Nonsmooth Trust Region Method

     $$\min_{x \in \mathbb{R}^n} f(x),$$

     where $f : \mathbb{R}^n \to \mathbb{R}$ is locally Lipschitz. We need to construct $\Phi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ to build at each iteration a model $Q_k$ defined by

     $$Q_k(x_k, d) = f(x_k) + \Phi(x_k, d) + \tfrac{1}{2} d^T B_k d,$$

     which must be an approximation of $f(x_k + d)$ for small $d$. Nonsmooth trust region algorithms approximately solve the subproblem

     $$\min_{\{d \in \mathbb{R}^n \,:\, \|d\| \le \delta_k\}} Q_k(x_k, d)$$

     to obtain $d_k$. Using the trust region ratio, the step is either accepted or rejected.
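One common way to build such a $\Phi$, shown here only as an illustration, applies when $f = \max_i f_i$ with smooth pieces $f_i$: linearize each piece and take the maximum. The slides do not prescribe this choice; the name `phi_max` and the two example pieces are assumptions.

```python
import numpy as np

def phi_max(fs, grads, x, d):
    """Phi(x, d) = max_i [f_i(x) + grad f_i(x)^T d] - max_i f_i(x)."""
    values = np.array([fi(x) for fi in fs])
    slopes = np.array([gi(x) @ d for gi in grads])
    return np.max(values + slopes) - np.max(values)

def nonsmooth_model(fs, grads, B, x, d):
    """Q(x, d) = f(x) + Phi(x, d) + 0.5 * d^T B d for f = max_i f_i."""
    f_x = max(fi(x) for fi in fs)
    return f_x + phi_max(fs, grads, x, d) + 0.5 * d @ B @ d

# Hypothetical pieces: f_1(x) = x_0^2 + x_1 and f_2(x) = -x_1
fs = [lambda x: x[0] ** 2 + x[1], lambda x: -x[1]]
grads = [lambda x: np.array([2.0 * x[0], 1.0]),
         lambda x: np.array([0.0, -1.0])]

x = np.array([1.0, -0.5])   # both pieces equal 0.5 here, so f has a kink at x
d = np.array([0.1, 0.2])
phi_full = phi_max(fs, grads, x, d)
phi_half = phi_max(fs, grads, x, 0.5 * d)
```

With this choice $\Phi(x, 0) = 0$, and $\Phi(x, \cdot)$ is convex, so $\Phi(x, \alpha d) \le \alpha \Phi(x, d)$ for $0 \le \alpha \le 1$.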

  4. Nonsmooth Trust Region Method on Riemannian manifolds

     $$\min_{x \in M} f(x),$$

     where $f : M \to \mathbb{R}$ is a locally Lipschitz function on a complete Riemannian manifold $M$.

     We need to construct $\Phi : TM \to \mathbb{R}$ (modeling the derivative of $f$) to build at each iteration a model $Q_k$. We need a sequence $\{B_k : k = 1, 2, \ldots\}$ of $n \times n$ symmetric matrices (modeling the Hessian of $f$). Then we build a sequence of model functions

     $$Q_k : T_{x_k}M \to \mathbb{R}, \qquad (x_k, d) \mapsto f(x_k) + \Phi(x_k, d) + \tfrac{1}{2} \langle B_k d, d \rangle,$$

     analogous to a second-order Taylor expansion in the Euclidean case.

  5. A nonsmooth trust region algorithm on Riemannian manifolds I

     Data: an $n$-dimensional complete Riemannian manifold $(M, g)$; a real-valued locally Lipschitz function $f$ on $M$.
     Parameters: $\delta_0 > 0$, $\delta_0 > \delta_1 > 0$, $c_0, c_1, c_2, c_3, c_4 > 0$, $c_2 < c_1 < 1$, $c_0 \le 1$.
     Input: initial iterate $x_1 \in M$ and $B_1 \in S(n)$, where $S(n)$ denotes the space of symmetric $n \times n$ matrices.
     Output: sequence of iterates $\{x_k\}$.

     for $k = 1, 2, \ldots$ do
         find

         $$d_k^* = \operatorname{argmin} \{ Q_k(x_k, d_k) = f(x_k) + \Phi(x_k, d_k) + \tfrac{1}{2} \langle B_k d_k, d_k \rangle : d_k \in T_{x_k}M,\ \|d_k\| \le \delta_k \}, \tag{0.1}$$

         where $\Phi : TM \to \mathbb{R}$ is a given function. Assume $\bar d_k$ is an inexact solution of (0.1) in the sense that

         $$f(x_k) - Q_k(x_k, \bar d_k) \ge c_0 \, [f(x_k) - Q_k(x_k, d_k^*)] \quad \text{and} \quad \|\bar d_k\| \le \delta_k.$$

         if $\bar d_k = 0$ then stop.

  6. A nonsmooth trust region algorithm on Riemannian manifolds II

         else

             $$r_k = \frac{f(x_k) - f(\exp_{x_k}(\bar d_k))}{f(x_k) - Q_k(x_k, \bar d_k)},$$

             if $c_2 < r_k$ then $x_{k+1} = \exp_{x_k}(\bar d_k)$ and update $B_k$.
             end if
             if $r_k \le c_2$ then $x_{k+1} = x_k$, $\delta_{k+1} = c_3 \delta_k$.
             else
                 if $c_2 < r_k \le c_1$ then $\delta_{k+1} = \delta_k$.
                 else $\delta_{k+1} = \min\{c_4 \delta_k, \delta_0\}$.
                 end if
             end if
         end if
     end for
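The acceptance test and radius update at the end of the algorithm can be sketched as a small function. The name `trust_region_update` and the numeric parameter values are assumptions for illustration; in particular, the slides only require the constants to be positive with $c_2 < c_1 < 1$, and the choices $c_3 < 1$ (shrink) and $c_4 > 1$ (expand) below are the usual convention rather than something stated in the source.

```python
def trust_region_update(r_k, delta_k, delta_max, c1, c2, c3, c4):
    """Return (accept_step, delta_next) from the trust region ratio r_k.

    delta_max plays the role of delta_0 in the algorithm (cap on the radius).
    """
    if r_k <= c2:
        # Poor agreement: reject the step, keep x_k, scale the radius by c3.
        return False, c3 * delta_k
    if r_k <= c1:
        # Acceptable agreement: accept the step, keep the radius.
        return True, delta_k
    # Good agreement: accept the step and enlarge the radius up to the cap.
    return True, min(c4 * delta_k, delta_max)

# Illustrative parameter values (assumptions, not taken from the slides)
params = dict(delta_max=1.0, c1=0.75, c2=0.25, c3=0.5, c4=2.0)
rejected = trust_region_update(0.1, 0.4, **params)
kept = trust_region_update(0.5, 0.4, **params)
enlarged = trust_region_update(0.9, 0.4, **params)
capped = trust_region_update(0.9, 0.8, **params)
```

Returning the acceptance flag together with the new radius keeps the two decisions, which the algorithm writes as separate `if` blocks, consistent with each other.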

  7. Using retractions

     Remark. Instead of using the exponential map to update $x_k$, we can choose a retraction $R : TM \to M$. The notion of a retraction on a manifold includes all first-order approximations to the Riemannian exponential. The retraction can be used to take a step in the direction of a tangent vector. Using a good retraction amounts to finding an approximation of the exponential mapping that can be computed at low computational cost while not adversely affecting the behavior of the optimization algorithm.
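As a concrete instance of this remark, the sketch below compares the exponential map on the unit sphere with the projection retraction (add the tangent vector, then normalize). The unit sphere is chosen only as an example manifold; the function names are hypothetical.

```python
import numpy as np

def sphere_exp(x, d):
    """Exponential map on the unit sphere: follow the great circle from x along d."""
    nd = np.linalg.norm(d)
    if nd == 0.0:
        return x.copy()
    return np.cos(nd) * x + np.sin(nd) * (d / nd)

def sphere_retract(x, d):
    """Projection retraction: step in the ambient space, then renormalize."""
    y = x + d
    return y / np.linalg.norm(y)

x = np.array([0.0, 0.0, 1.0])      # point on the sphere
d = np.array([1e-2, 0.0, 0.0])     # tangent vector at x (orthogonal to x)
y_exp = sphere_exp(x, d)
y_ret = sphere_retract(x, d)
gap = np.linalg.norm(y_exp - y_ret)
```

Both maps return points on the sphere, and for a short step the two results differ by far less than the step length, which is what agreement to first order means in practice. The retraction avoids the trigonometric evaluations of the exponential map.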

  8. Critical Point

     Definition. With $f : M \to \mathbb{R}$ and $\Phi : TM \to \mathbb{R}$, define

     $$\psi(x, \delta) = \sup \{ -\Phi(x, d) : d \in T_xM,\ \|d\| \le \delta \}. \tag{0.2}$$

     The point $x \in M$ is called a critical point with respect to $\Phi$ of the objective function $f$ if there exists $\delta > 0$ such that $\psi(x, \delta) = 0$.

     Upper Dini directional derivative. The upper Dini directional derivative of $f$ at $x$ in the direction $d \in T_xM$, denoted by $f^+(x; d)$, is defined as

     $$f^+(x; d) := \limsup_{t \downarrow 0} \frac{f(\exp_x(td)) - f(x)}{t}.$$

     A point $x$ is called a Dini stationary point if $f^+(x; d) \ge 0$ for all $d \in T_xM$.
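The definition of $\psi$ can be explored numerically in the simplest setting. The sketch below takes $M = \mathbb{R}$ (so $\exp_x(d) = x + d$), $f(x) = |x|$, and the hypothetical choice $\Phi(x, d) = f(x + d) - f(x)$, and approximates the supremum in (0.2) on a grid. All of these choices are assumptions made for illustration.

```python
import numpy as np

def psi(phi, x, delta, n_grid=1001):
    """Grid approximation of psi(x, delta) = sup{-phi(x, d) : |d| <= delta}."""
    ds = np.linspace(-delta, delta, n_grid)
    return max(-phi(x, d) for d in ds)

def phi_abs(x, d):
    """Hypothetical Phi for f(x) = |x| on the real line: f(x + d) - f(x)."""
    return abs(x + d) - abs(x)

psi_at_kink = psi(phi_abs, 0.0, 0.5)   # x = 0 is the minimizer of |x|
psi_away = psi(phi_abs, 1.0, 0.5)      # x = 1 admits a descent direction
```

At $x = 0$ every direction increases $f$, so $-\Phi \le 0$ and $\psi$ vanishes: the point is critical. At $x = 1$ the step $d = -0.5$ decreases $f$ by $0.5$, so $\psi(1, 0.5) = 0.5 > 0$ and the point is not critical for that radius.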

  9. Assumption. Assume that $D$ is a bounded open convex set containing $N := \{ x \in M : f(x) \le f(x_1) \}$, where $x_1$ is the initial iterate, and that for all $x \in D$ and $d \in T_xM$ it holds that

     $$\liminf_{t \downarrow 0} \frac{\Phi(x, td)}{t} \le f^+(x; d),$$

     where $f^+(x; d)$ is the upper Dini directional derivative of $f$ at $x$ in the direction $d \in T_xM$.

  10. Critical Point

     If $x$ is a critical point of $f$ in the sense of the Definition, then $\psi(x, \delta) = 0$ for some $\delta > 0$, which forces $\Phi(x, d) \ge 0$ whenever $\|d\| \le \delta$. Hence $\Phi(x, td) \ge 0$ for $t$ small enough, and therefore

     $$\liminf_{t \downarrow 0} \frac{\Phi(x, td)}{t} \ge 0.$$

     Using the Assumption, it follows that $f^+(x; d) \ge 0$ for all $d \in T_xM$.

     One can also show that a local minimizer $x$ of a locally Lipschitz function $f : M \to \mathbb{R}$ is always a critical point, provided that the function $\Phi$ satisfies some natural assumptions.

  11. Assumptions on $\Phi$

     Assumption. Let $\Phi : TM \to \mathbb{R}$. Assume that

     $$\Phi(x, 0_x) = 0 \quad \forall x \in M, \tag{0.3}$$

     $$\Phi(x, \alpha d) \le \alpha \, \Phi(x, d) \quad \forall (x, d) \in TM,\ 0 \le \alpha \le 1, \tag{0.4}$$

     $$\text{for all } x \in M,\ \Phi|_{T_xM} \text{ is lower semicontinuous}, \tag{0.5}$$

     for any $(x, d) \in TM$ it holds that

     $$f(\exp_x(d)) - f(x) \le \Phi(x, d) + o(\|d\|), \tag{0.6}$$

     and there exists $\delta^*$ such that for all $\delta < \delta^*$

     $$\text{the function } \psi(\cdot, \delta) \text{ is lower semicontinuous}, \tag{0.7}$$

     where $\psi$ is defined in (0.2) and the implicit constant in the $o$-term is uniform over compact sets.

  12. Lemma. Suppose that $f : M \to \mathbb{R}$ and $\Phi : TM \to \mathbb{R}$ are such that Assumption 0.2 holds. Then every local minimizer of $f$ is a critical point in the sense of Definition 1.

     Assumption. Recall $N = \{ x \in M : f(x) \le f(x_1) \}$, where $x_1$ is the starting point of Algorithm 1. Assume that $N$ is bounded. Furthermore, assume that there exists $C > 0$ such that $\|B_k\| \le C$ for all $k = 1, 2, \ldots$.

     Theorem. Suppose that $\Phi$ and $(B_k)_k$ are such that Assumptions 0.2 and 0.3 hold true. If $\bar x$ is an accumulation point of $\{x_k\}$ generated by Algorithm 1, then $\bar x$ is a critical point of $f$ in the sense of Definition 1.

  13. The following lemma implies that if Algorithm 1 generates a sequence $\{x_k\}$ with $x_k = \bar x$ for all large $k$, then $\bar x$ is a critical point of $f$.

     Lemma. Suppose that $\bar x$ is an accumulation point of $\{x_k\}$ which is not a critical point. Then there exist $\epsilon > 0$ and $\beta > 0$ such that for all $k$ satisfying

     $$\operatorname{dist}(x_k, \bar x) < \epsilon, \quad 0 < \delta_k < \beta, \quad \|B_k\| \le C, \tag{0.8}$$

     we have

     $$r_k = \frac{f(x_k) - f(\exp_{x_k}(\bar d_k))}{f(x_k) - Q_k(x_k, \bar d_k)} > c_2,$$

     where $x_k$, $\delta_k$, $c_2$ are the same as in Algorithm 1.

  14. Nonsmooth analysis on Manifolds

     Definition (Clarke generalized directional derivative). Suppose $f : M \to \mathbb{R}$ is a locally Lipschitz function on a Riemannian manifold $M$. Let $\varphi_x : U_x \to T_xM$ be an exponential chart at $x$. Given another point $y \in U_x$, consider $\sigma_{y,v}(t) := \varphi_y^{-1}(tw)$, a geodesic passing through $y$ with derivative $w$, where $(\varphi_y, y)$ is an exponential chart around $y$ and $d(\varphi_x \circ \varphi_y^{-1})(0_y)(w) = v$. Then the Clarke generalized directional derivative of $f$ at $x \in M$ in the direction $v \in T_xM$, denoted by $f^\circ(x; v)$, is defined as

     $$f^\circ(x; v) = \limsup_{y \to x,\ t \downarrow 0} \frac{f(\sigma_{y,v}(t)) - f(y)}{t}.$$

  15. Nonsmooth analysis on Manifolds

     If $f$ is differentiable at $x \in M$, we define the gradient of $f$ as the unique vector $\operatorname{grad} f(x) \in T_xM$ which satisfies

     $$\langle \operatorname{grad} f(x), \xi \rangle = df(x)(\xi) \quad \text{for all } \xi \in T_xM.$$

     Definition (Subdifferential). We define the subdifferential of $f$, denoted by $\partial f(x)$, as the subset of $T_xM$ whose support function is $f^\circ(x; \cdot)$. It can be proved [4] that

     $$\partial f(x) = \operatorname{conv} \{ \lim_{i \to \infty} \operatorname{grad} f(x_i) : \{x_i\} \subseteq \Omega_f,\ x_i \to x \},$$

     where $\Omega_f$ is a dense subset of $M$ on which $f$ is differentiable.
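The convex-hull characterization can be illustrated in the one-dimensional Euclidean case, where the convex hull of a set of gradients is simply the interval between the smallest and the largest. The sketch below samples gradients of $f(x) = |x|$ near a point; the function name, sampling radius, sample count, and seed are assumptions, and random sampling only approximates the limit in the formula.

```python
import numpy as np

def sampled_subdifferential_1d(grad, x, radius=1e-6, n=200, seed=0):
    """Approximate the subdifferential at x by the interval spanned by
    gradients sampled at nearby points (1-D Euclidean case only)."""
    rng = np.random.default_rng(seed)
    points = x + rng.uniform(-radius, radius, size=n)
    gradients = np.array([grad(p) for p in points])
    return float(gradients.min()), float(gradients.max())

# f(x) = |x| is differentiable away from 0 with gradient sign(x)
grad_abs = np.sign

at_kink = sampled_subdifferential_1d(grad_abs, 0.0)
at_smooth_point = sampled_subdifferential_1d(grad_abs, 1.0)
```

At the kink the sampled gradients take both values $-1$ and $1$, giving the interval $[-1, 1]$, which is the subdifferential of $|x|$ at $0$. At $x = 1$ every sampled gradient equals $1$, so the interval collapses to the single gradient.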
