Quasi-Newton methods for minimization
Lectures for PhD course on Non-linear equations and numerical optimization
Enrico Bertolazzi
DIMS – Università di Trento
March 2005

Outline
1 Quasi Newton Method
2 The symmetric rank one update
3 The Powell-symmetric-Broyden update
4 The Davidon Fletcher and Powell rank 2 update
5 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
6 The Broyden class

Quasi Newton Method

Algorithm (General quasi-Newton algorithm)
  $k \leftarrow 0$; $x_0$ assigned;
  $g_0 \leftarrow \nabla f(x_0)$; $H_0 \leftarrow \nabla^2 f(x_0)^{-1}$;
  while $\|g_k\| > \epsilon$ do
    -- compute search direction
    $d_k \leftarrow H_k g_k$;
    approximate $\lambda_k \approx \arg\min_{\lambda > 0} f(x_k - \lambda d_k)$ by line search;
    -- perform step
    $x_{k+1} \leftarrow x_k - \lambda_k d_k$;
    $g_{k+1} \leftarrow \nabla f(x_{k+1})$;
    -- update $H_{k+1}$
    $H_{k+1} \leftarrow \text{some algorithm}(H_k, x_k, x_{k+1}, g_k, g_{k+1})$;
    $k \leftarrow k + 1$;
  end while
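
The loop above can be sketched in plain Python as follows. This is only a minimal sketch under stated assumptions: the backtracking (Armijo) rule stands in for the unspecified line search, `max_iter` is an added safety cap, and the names `quasi_newton`, `update` and `keep_H` are hypothetical and do not come from the slides.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def quasi_newton(f, grad, x0, H0, update, eps=1e-8, max_iter=200):
    """Generic loop: d = H g, line search along -d, then update H."""
    x, H = list(x0), [row[:] for row in H0]
    g = grad(x)
    for _ in range(max_iter):               # max_iter is an added safety cap
        if dot(g, g) ** 0.5 <= eps:
            break
        d = matvec(H, g)                    # search direction
        # Backtracking (Armijo) search: an assumed stand-in for "line search".
        lam, fx, slope = 1.0, f(x), dot(g, d)
        while lam > 1e-12:
            trial = [xi - lam * di for xi, di in zip(x, d)]
            if f(trial) <= fx - 1e-4 * lam * slope:
                break
            lam *= 0.5
        x_new = [xi - lam * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        H = update(H, x, x_new, g, g_new)   # plug in any update rule here
        x, g = x_new, g_new
    return x, H

# Example: f(x) = x0^2 + 2 x1^2 - 2 x0 - 4 x1, whose minimizer is (1, 1).
f = lambda x: x[0] ** 2 + 2 * x[1] ** 2 - 2 * x[0] - 4 * x[1]
grad = lambda x: [2 * x[0] - 2, 4 * x[1] - 4]
H0 = [[0.5, 0.0], [0.0, 0.25]]              # inverse Hessian at x0
keep_H = lambda H, x, x_new, g, g_new: H    # placeholder update: leave H unchanged
x_min, H_final = quasi_newton(f, grad, [0.0, 0.0], H0, keep_H)
print(x_min)
```

Passing the update rule as a function keeps the loop independent of the particular formula, which mirrors the "some algorithm" step of the pseudocode.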

The symmetric rank one update

Let $B_k$ be an approximation of the Hessian of $f(x)$. Given $x_k$, $x_{k+1}$, $g_k$ and $g_{k+1}$, if we use the Broyden update formula to force the secant condition on $B_{k+1}$ we obtain
$$B_{k+1} \leftarrow B_k + \frac{(y_k - B_k s_k) s_k^T}{s_k^T s_k},$$
where $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$.

By using the Sherman–Morrison formula and setting $H_k = B_k^{-1}$ we obtain the update
$$H_{k+1} \leftarrow H_k - \frac{(H_k y_k - s_k) s_k^T H_k}{s_k^T H_k y_k}.$$
For the full step $s_k = -H_k g_k$ the denominator can be written as $s_k^T H_k y_k = s_k^T s_k + s_k^T H_k g_{k+1}$.

The previous update does not maintain symmetry: if $H_k$ is symmetric, $H_{k+1}$ is not necessarily symmetric.

To avoid loss of symmetry we can consider an update of the form
$$H_{k+1} \leftarrow H_k + u u^T.$$
Imposing the secant condition (on the inverse)
$$H_{k+1} y_k = s_k \quad\Rightarrow\quad H_k y_k + u u^T y_k = s_k,$$
and multiplying the previous equality on the left by $y_k^T$,
$$y_k^T H_k y_k + y_k^T u u^T y_k = y_k^T s_k \quad\Rightarrow\quad y_k^T u = \left(y_k^T s_k - y_k^T H_k y_k\right)^{1/2},$$
we obtain
$$u = \frac{s_k - H_k y_k}{u^T y_k} = \frac{s_k - H_k y_k}{\left(y_k^T s_k - y_k^T H_k y_k\right)^{1/2}}.$$

Substituting the expression of $u$,
$$u = \frac{s_k - H_k y_k}{\left(y_k^T s_k - y_k^T H_k y_k\right)^{1/2}},$$
in the update formula, we obtain
$$H_{k+1} \leftarrow H_k + \frac{w_k w_k^T}{w_k^T y_k}, \qquad w_k = s_k - H_k y_k.$$
The previous update formula is the symmetric rank one formula (SR1).
To be well defined, the formula needs $w_k^T y_k \neq 0$. Moreover, if $w_k^T y_k < 0$ and $H_k$ is positive definite, then $H_{k+1}$ is not necessarily positive definite.
Keeping $H_k$ symmetric and positive definite is important for global convergence.

The following lemma is used in the theorems that follow.

Lemma
Let $q(x) = \frac{1}{2} x^T A x - b^T x + c$ with $A \in \mathbb{R}^{n \times n}$ symmetric and positive definite. Then
$$y_k = g_{k+1} - g_k = A x_{k+1} - b - A x_k + b = A s_k,$$
where $g_k = \nabla q(x_k)^T$.
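
As an illustration of the SR1 formula above, here is a minimal pure-Python sketch. The function name `sr1_update` and the test vectors are assumptions chosen for the example, not part of the slides. It checks that the updated matrix is symmetric and satisfies the secant condition $H_{k+1} y_k = s_k$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def sr1_update(H, s, y):
    """Return H + w w^T / (w^T y) with w = s - H y; needs w^T y != 0."""
    Hy = matvec(H, y)
    w = [si - hi for si, hi in zip(s, Hy)]
    wty = dot(w, y)
    if wty == 0.0:
        raise ZeroDivisionError("SR1 update undefined: w^T y = 0")
    n = len(s)
    return [[H[i][j] + w[i] * w[j] / wty for j in range(n)] for i in range(n)]

# Example data (arbitrary): H_k = I, s_k = (1, 0), y_k = (2, 1).
H = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 0.0]
y = [2.0, 1.0]
H_next = sr1_update(H, s, y)

print(H_next)               # symmetric by construction
print(matvec(H_next, y))    # should reproduce s up to roundoff
```

In this example $w_k^T y_k = -3 < 0$, yet $H_{k+1}$ is still positive definite, which is consistent with the statement that positive definiteness is "not necessarily" preserved in that case.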

Theorem (property of SR1 update)
Let $q(x) = \frac{1}{2} x^T A x - b^T x + c$ with $A \in \mathbb{R}^{n \times n}$ symmetric and positive definite. Let $x_0$ and $H_0$ be assigned. Let $x_k$ and $H_k$ be produced by
1 $x_{k+1} = x_k + s_k$;
2 $H_{k+1}$ updated by the SR1 formula
$$H_{k+1} \leftarrow H_k + \frac{w_k w_k^T}{w_k^T y_k}, \qquad w_k = s_k - H_k y_k.$$
If $s_0, s_1, \ldots, s_{n-1}$ are linearly independent (and $w_k^T y_k \neq 0$ at each update), then $H_n = A^{-1}$.

Proof (1/2).
We prove by induction the hereditary property $H_i y_j = s_j$ for $j < i$.
BASE: for $i = 1$ it is exactly the secant condition of the update, $H_1 y_0 = s_0$.
INDUCTION: suppose the relation is valid for some $k > 0$, i.e. $H_k y_j = s_j$ for $j < k$; then we prove that it is valid for $k + 1$. In fact, from the update formula
$$H_{k+1} y_j = H_k y_j + \frac{w_k^T y_j}{w_k^T y_k} w_k, \qquad w_k = s_k - H_k y_k.$$
By the induction hypothesis for $j < k$ and using the previous lemma ($y_j = A s_j$) we have
$$w_k^T y_j = s_k^T y_j - y_k^T H_k y_j = s_k^T y_j - y_k^T s_j = s_k^T A s_j - s_k^T A s_j = 0,$$
so that
$$H_{k+1} y_j = H_k y_j = s_j \quad \text{for } j = 0, 1, \ldots, k-1.$$
For $j = k$ we have $H_{k+1} y_k = s_k$ trivially by construction of the SR1 formula.

Proof (2/2).
To prove that $H_n = A^{-1}$ notice that
$$H_n y_j = s_j, \qquad A s_j = y_j, \qquad j = 0, 1, \ldots, n-1,$$
and combining the equalities
$$H_n A s_j = s_j, \qquad j = 0, 1, \ldots, n-1.$$
Due to the linear independence of the $s_j$ we have $H_n A = I$, i.e. $H_n = A^{-1}$.

Properties of SR1 update (1/2)
1 The SR1 update possesses the natural quadratic termination property (like CG).
2 SR1 satisfies the hereditary property $H_k y_j = s_j$ for $j < k$.
3 SR1 maintains the positive definiteness of $H_k$ if $w_k^T y_k > 0$. However, this condition is difficult to guarantee.
4 Sometimes $w_k^T y_k$ becomes very small or 0. This results in serious numerical difficulty (roundoff) or even in the breakdown of the algorithm. We can avoid this breakdown by the following strategy.

Breakdown workaround for SR1 update
1 If $|w_k^T y_k| \geq \epsilon \, \|w_k\| \, \|y_k\|$ (i.e. the angle between $w_k$ and $y_k$ is far from 90 degrees), then we update with the SR1 formula.
2 Otherwise we set $H_{k+1} = H_k$.
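
The quadratic termination property and the breakdown workaround can be tried together on a small example. The sketch below is illustrative only: the function name `sr1_safeguarded`, the threshold `eps=1e-8`, the matrix `A` and the steps are assumptions, and the explicit test for $w_k^T y_k = 0$ is an added guard for the case $w_k = 0$, which the slides do not discuss.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def sr1_safeguarded(H, s, y, eps=1e-8):
    """SR1 update of H, skipped when w and y are nearly orthogonal."""
    Hy = matvec(H, y)
    w = [si - hi for si, hi in zip(s, Hy)]
    wty = dot(w, y)
    bound = eps * dot(w, w) ** 0.5 * dot(y, y) ** 0.5
    # The wty == 0 check is an added guard covering w = 0 (assumption).
    if wty == 0.0 or abs(wty) < bound:
        return [row[:] for row in H]        # H_{k+1} = H_k
    n = len(s)
    return [[H[i][j] + w[i] * w[j] / wty for j in range(n)] for i in range(n)]

# Quadratic with A = [[2, 1], [1, 3]], so y_k = A s_k by the lemma,
# and A^{-1} = [[0.6, -0.2], [-0.2, 0.4]].
A = [[2.0, 1.0], [1.0, 3.0]]
steps = [[1.0, 0.0], [0.0, 1.0]]            # linearly independent s_0, s_1
H = [[1.0, 0.0], [0.0, 1.0]]                # H_0
for s in steps:
    H = sr1_safeguarded(H, s, matvec(A, s))
print(H)                                    # close to A^{-1} after n = 2 updates

# Skipped update: w = s - H y = (0, 1) is orthogonal to y = (1, 0).
I2 = [[1.0, 0.0], [0.0, 1.0]]
H_skip = sr1_safeguarded(I2, [1.0, 1.0], [1.0, 0.0])
print(H_skip == I2)
```

Note that the theorem does not require the steps to come from a line search; any linearly independent steps work, which is why fixed coordinate steps are used here.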

Properties of SR1 update (2/2)

Theorem (Convergence of nonlinear SR1 update)
Let $f(x)$ satisfy the standard assumptions. Let $\{x_k\}$ be a sequence of iterates such that $\lim_{k \to \infty} x_k = x_\star$. Suppose we use the breakdown workaround for the SR1 update and the steps $\{s_k\}$ are uniformly linearly independent. Then we have
$$\lim_{k \to \infty} \left\| H_k - \nabla^2 f(x_\star)^{-1} \right\| = 0.$$

A. R. Conn, N. I. M. Gould and P. L. Toint, Convergence of quasi-Newton matrices generated by the symmetric rank one update. Mathematics of Computation 50, 399–430, 1988.

The Powell-symmetric-Broyden update

The SR1 update, although symmetric, does not have a minimum property like the Broyden update for the nonsymmetric case.
The Broyden update
$$A_{k+1} = A_k + \frac{(y_k - A_k s_k) s_k^T}{s_k^T s_k}$$
solves the minimization problem
$$\|A_{k+1} - A_k\|_F \leq \|A - A_k\|_F \quad \text{for all } A \text{ such that } A s_k = y_k.$$
If we solve a similar problem in the class of symmetric matrices we obtain the Powell-symmetric-Broyden (PSB) update.

Lemma (Powell-symmetric-Broyden update)
Let $A \in \mathbb{R}^{n \times n}$ be symmetric and $s, y \in \mathbb{R}^n$ with $s \neq 0$. Consider the set
$$\mathcal{B} = \left\{ B \in \mathbb{R}^{n \times n} \mid Bs = y, \ B = B^T \right\}.$$
If $s^T y \neq 0$ (this is true if a Wolfe line search is performed), then there exists a unique matrix $B \in \mathcal{B}$ such that
$$\|A - B\|_F \leq \|A - C\|_F \quad \text{for all } C \in \mathcal{B}.$$
Moreover $B$ has the following form
$$B = A + \frac{\omega s^T + s \omega^T}{s^T s} - (\omega^T s) \frac{s s^T}{(s^T s)^2}, \qquad \omega = y - As,$$
so $B$ is a rank two perturbation of the matrix $A$.
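
The PSB formula of the lemma can be checked numerically with a short pure-Python sketch. The function name `psb_update` and the example data are assumptions made for illustration. The example verifies the secant condition and symmetry, and compares the Frobenius distance from $A$ with that of another element of $\mathcal{B}$, namely $y y^T / (s^T y)$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def psb_update(A, s, y):
    """B = A + (w s^T + s w^T)/(s^T s) - (w^T s) s s^T/(s^T s)^2, w = y - A s."""
    As = matvec(A, s)
    w = [yi - ai for yi, ai in zip(y, As)]
    ss = dot(s, s)
    ws = dot(w, s)
    n = len(s)
    return [[A[i][j] + (w[i] * s[j] + s[i] * w[j]) / ss
             - ws * s[i] * s[j] / ss ** 2
             for j in range(n)] for i in range(n)]

def frob_dist2(P, Q):
    """Squared Frobenius norm of P - Q."""
    return sum((p - q) ** 2 for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

# Example data (arbitrary): A = I, s = (1, 1), y = (3, 1).
A = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 1.0]
y = [3.0, 1.0]
B = psb_update(A, s, y)

# Another symmetric matrix satisfying C s = y, used only for comparison.
sty = dot(s, y)
C = [[y[i] * y[j] / sty for j in range(2)] for i in range(2)]

print(matvec(B, s))                          # should equal y
print(frob_dist2(A, B), frob_dist2(A, C))    # first value should not exceed the second
```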

Proof (1/11).
First of all notice that $\mathcal{B}$ is not empty; in fact
$$\frac{1}{s^T y} y y^T \in \mathcal{B}, \qquad \frac{1}{s^T y} y y^T s = y,$$
so the problem is feasible. Next we reformulate the problem as a constrained minimum problem:
$$\arg\min_{B \in \mathbb{R}^{n \times n}} \frac{1}{2} \sum_{i,j=1}^{n} (A_{ij} - B_{ij})^2 \quad \text{subject to } Bs = y \text{ and } B = B^T.$$
The solution is a stationary point of the Lagrangian
$$g(B, \lambda, M) = \frac{1}{2} \|A - B\|_F^2 - \lambda^T (Bs - y) - \sum_{i<j} \mu_{ij} (B_{ij} - B_{ji}).$$

Proof (2/11).
Taking the gradient (and changing sign) we have
$$-\frac{\partial}{\partial B_{ij}} g(B, \lambda, M) = A_{ij} - B_{ij} + \lambda_i s_j + M_{ij} = 0,$$
where
$$M_{ij} = \begin{cases} \mu_{ij} & \text{if } i < j; \\ -\mu_{ji} & \text{if } i > j; \\ 0 & \text{if } i = j. \end{cases}$$
The previous equality can be written in matrix form as
$$B = A + \lambda s^T + M.$$

Proof (3/11).
Imposing symmetry for $B$,
$$A + \lambda s^T + M = A^T + s \lambda^T + M^T = A + s \lambda^T - M,$$
and solving for $M$ we have
$$M = \frac{s \lambda^T - \lambda s^T}{2}.$$
Substituting in $B$ we have
$$B = A + \frac{s \lambda^T + \lambda s^T}{2}.$$

Proof (4/11).
Imposing $s^T B s = s^T y$,
$$s^T A s + \frac{s^T s \, \lambda^T s + s^T \lambda \, s^T s}{2} = s^T y \quad\Rightarrow\quad \lambda^T s = \frac{s^T \omega}{s^T s},$$
where $\omega = y - As$. Imposing $Bs = y$,
$$As + \frac{s \lambda^T s + \lambda s^T s}{2} = y \quad\Rightarrow\quad \lambda = \frac{2\omega}{s^T s} - \frac{(s^T \omega) s}{(s^T s)^2}.$$
Next we compute the explicit form of $B$.
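
As a sketch of the substitution just announced (not necessarily the route taken by the remaining steps of the proof), inserting the expression of $\lambda$ into $B = A + (s\lambda^T + \lambda s^T)/2$ reproduces the form stated in the lemma:

```latex
\begin{aligned}
s\lambda^T &= \frac{2\, s\,\omega^T}{s^T s} - \frac{(s^T\omega)\, s s^T}{(s^T s)^2}, \\
\lambda s^T &= \frac{2\, \omega\, s^T}{s^T s} - \frac{(s^T\omega)\, s s^T}{(s^T s)^2}, \\
B &= A + \frac{s\lambda^T + \lambda s^T}{2}
   = A + \frac{\omega s^T + s\,\omega^T}{s^T s}
     - (\omega^T s)\,\frac{s s^T}{(s^T s)^2}.
\end{aligned}
```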