
A Positive BB-Like Stepsize and An Extension for Symmetric Linear Systems

A Positive BB-Like Stepsize and An Extension for Symmetric Linear Systems. Yu-Hong Dai, Academy of Mathematics and Systems Science, Chinese Academy of Sciences. Joint with M. Al-Baali and Xiaoqi Yang. Peking University, 20140903.


  1. A Positive BB-Like Stepsize and An Extension for Symmetric Linear Systems. Yu-Hong Dai, Academy of Mathematics and Systems Science, Chinese Academy of Sciences. Joint with M. Al-Baali and Xiaoqi Yang. Peking University, 20140903. Yu-Hong Dai (AMSS, CAS), A Positive BB-like Stepsize, 20140903, 1 / 39

  2. Outline
     1 Introduction
     2 A Positive BB-Like Stepsize
     3 Analysis of The New Method
     4 An Extension for Symmetric Linear Systems
     5 Some Discussions

  3. Section I. Introduction

  4. Introduction
     Unconstrained Optimization: $\min_{x \in \mathbb{R}^n} f(x)$
     Convex Quadratic Minimization: $\min_{x \in \mathbb{R}^n} Q(x) := \frac{1}{2} x^T A x - b^T x$
     Linear System: $A x = b$, $x \in \mathbb{R}^n$

  5. Steepest Descent Method (Cauchy 1847)
     $x_{k+1} = x_k - \alpha_k g_k$, $\alpha_k = \arg\min_{\alpha \ge 0} f(x_k - \alpha g_k)$
     Fast during the first several iterations
     Linear convergence: $\|g_k\|_2 \approx \left( \frac{\kappa - 1}{\kappa + 1} \right)^k$, $\kappa = \mathrm{cond}\left( \nabla^2 f(x^*) \right)$
     Zigzagging
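The linear rate on this slide can be observed directly on a small quadratic. The sketch below is illustrative only: the matrix `A = diag(1, 10)`, the starting point, and the helper name `steepest_descent` are assumptions chosen for the demonstration, not taken from the talk. For a quadratic, the exact line search has the closed form $\alpha_k = g_k^T g_k / g_k^T A g_k$.

```python
import numpy as np

def steepest_descent(A, b, x0, iters):
    """Exact-line-search steepest descent for Q(x) = 0.5 x^T A x - b^T x."""
    x = x0.astype(float)
    grad_norms = []
    for _ in range(iters):
        g = A @ x - b                      # gradient of Q at x
        grad_norms.append(np.linalg.norm(g))
        if grad_norms[-1] == 0.0:
            break
        alpha = (g @ g) / (g @ (A @ g))    # exact minimizer of Q(x - alpha g)
        x = x - alpha * g
    return x, grad_norms

# Illustrative 2-D problem (assumption): kappa = 10, start at (kappa, 1),
# a starting point for which the iterates zigzag at the worst-case rate.
kappa = 10.0
A = np.diag([1.0, kappa])
b = np.zeros(2)
x0 = np.array([kappa, 1.0])

x, norms = steepest_descent(A, b, x0, 20)
rate = (kappa - 1.0) / (kappa + 1.0)
sd_ratios = [norms[k + 1] / norms[k] for k in range(len(norms) - 1)]
print(rate, sd_ratios[:3])
```

With this particular start, every ratio $\|g_{k+1}\| / \|g_k\|$ equals $(\kappa - 1)/(\kappa + 1)$ up to rounding, which is the zigzag behaviour the slide refers to.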

  6. Barzilai-Borwein (1988)
     $x_{k+1} = x_k - \alpha_k g_k$, i.e. $x_{k+1} = x_k - D_k^{-1} g_k$ with $D_k = \alpha_k^{-1} I$
     $D_k = \arg\min_{D = \alpha^{-1} I} \| D s_{k-1} - y_{k-1} \|_2$  ($s_{k-1} = x_k - x_{k-1}$, $y_{k-1} = g_k - g_{k-1}$)
     $\Rightarrow \alpha_k^{BB1} = \frac{s_{k-1}^T s_{k-1}}{s_{k-1}^T y_{k-1}}$
     Similarly, $\alpha_k^{BB2} = \frac{s_{k-1}^T y_{k-1}}{y_{k-1}^T y_{k-1}}$
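A minimal sketch of the two BB stepsizes for a convex quadratic follows. The function name `bb_gradient`, the test matrix, and the choice of an exact steepest-descent step for the very first iteration are assumptions made for illustration; the slides do not specify how the first stepsize is chosen.

```python
import numpy as np

def bb_gradient(A, b, x0, variant="BB1", tol=1e-8, max_iter=1000):
    """Gradient method with BB1 or BB2 stepsizes for 0.5 x^T A x - b^T x."""
    x = x0.astype(float)
    g = A @ x - b
    # Assumption: the first step uses the exact steepest-descent stepsize.
    alpha = (g @ g) / (g @ (A @ g))
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:
            return x, k
        x_new = x - alpha * g
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g        # s_{k-1}, y_{k-1}
        if variant == "BB1":
            alpha = (s @ s) / (s @ y)      # alpha^{BB1}
        else:
            alpha = (s @ y) / (y @ y)      # alpha^{BB2}
        x, g = x_new, g_new
    return x, max_iter

# Illustrative SPD problem (assumption, not from the talk).
A = np.diag([1.0, 2.0, 5.0, 10.0])
b = np.ones(4)
x_star = np.linalg.solve(A, b)

x_bb1, k_bb1 = bb_gradient(A, b, np.zeros(4), "BB1")
x_bb2, k_bb2 = bb_gradient(A, b, np.zeros(4), "BB2")
print(k_bb1, k_bb2)
```

No line search is used here; for strictly convex quadratics the plain BB iteration is known to converge, although the gradient norm does not decrease monotonically.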

  7. Fletcher (2005), "On the Barzilai-Borwein method":
     $\Delta u = -f$ on $[0, 1]^3$, $f = x(x-1)\, y(y-1)\, z(z-1)\, w(x, y, z)$
     $w = \exp\left( -\frac{1}{2} \sigma^2 \left( (x - \alpha)^2 + (y - \beta)^2 + (z - \gamma)^2 \right) \right)$
     $n = 10^6$, $A u = b \Leftrightarrow \min \frac{1}{2} u^T A u - b^T u$
     $u_1 = 0$, $\|g_k\|_2 \le 10^{-6} \|g_1\|_2$

  8. Numerical Results

     (σ, α, β, γ)          precision   BB          CG
     (20, 0.5, 0.5, 0.5)   double      543(859)    162(178)
                           single      462(964)    254(387)
     (50, 0.4, 0.7, 0.5)   double      640(1009)   285(306)
                           single      310(645)    290(443)

     But SD: after 2000 iterations, $\|g_{2000}\| / \|g_1\| = 0.18$!
     Google Scholar citations of BB: 806 (by Jan 5, 2014)
     Google Scholar citations of GPSR by Figueiredo, Wright and Nowak (2007): 1310 (by Jan 5, 2014)

  9. Efficiency Evidences of BB for Quadratic Minimization
     Barzilai-Borwein (1988): $n = 2$, R-superlinear; $\alpha_{k_{i_1}}^{-1} \to \lambda_1$, $\alpha_{k_{i_2}}^{-1} \to \lambda_2$
     Dai & Fletcher (2005): $n = 3$, R-superlinear
     Dai & Fletcher (2005): Cyclic SD method, $m \ge \frac{n}{2} + 1$, R-superlinear
     In theory, how to show that BB is better than SD for quadratic functions of any dimension?

  10. Quadratic Termination of Gradient Method
      $g_{k+1} = g_k - \alpha_k A g_k = (I - \alpha_k A) g_k = \left[ \prod_{j=1}^{k} (I - \alpha_j A) \right] g_1$
      Assuming that $\lambda(A) = \{ \lambda_1, \lambda_2, \ldots, \lambda_n \}$, by the Cayley-Hamilton theorem we must have $g_{n+1} = 0$ if
      $\{ \alpha_k : k = 1, \ldots, n \} = \{ \lambda_k^{-1} : k = 1, \ldots, n \}$
      This property was first due to Yan-Lian Lai (1983).
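The finite-termination property can be checked numerically. In the sketch below, the eigenvalues, the random orthogonal basis, and the starting point are all illustrative assumptions; the only ingredient taken from the slide is that the $n$ stepsizes are the reciprocals of the $n$ eigenvalues of $A$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a symmetric positive definite A with known eigenvalues (assumed values).
eigenvalues = np.array([1.0, 2.0, 5.0, 10.0])
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal basis
A = Q @ np.diag(eigenvalues) @ Q.T
b = rng.standard_normal(4)
x = rng.standard_normal(4)

g1_norm = np.linalg.norm(A @ x - b)

# n gradient steps, using each reciprocal eigenvalue exactly once as stepsize.
for lam_k in eigenvalues:
    g = A @ x - b
    x = x - g / lam_k

final_norm = np.linalg.norm(A @ x - b)   # zero up to rounding
print(g1_norm, final_norm)
```

Each step with $\alpha = 1/\lambda_k$ annihilates the component of the gradient along the corresponding eigenvector, so after $n$ steps nothing is left.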

  11. A Typical Nonmonotone Performance of BB [figure not included]

  12. For strictly convex quadratics of any dimension
      Raydan (1993): global convergence
      Dai & Liao (2002): R-linear convergence
      We can then show that the BB stepsize is asymptotically accepted by the nonmonotone line search in the context of unconstrained optimization. This property is similar to quasi-Newton methods, where the stepsize $\alpha_k = 1$ is usually tried first by the Wolfe line search and is eventually accepted.

  13. Globalization Technique for General Functions
      Raydan (1997): GLL nonmonotone line search
        $f(x_k - \alpha g_k) \le f_{ref} - \delta \alpha \|g_k\|^2$, $f_{ref} = \max_{j = 1, \ldots, m} f_{k-j}$
      Dai & Zhang (2001): Adaptive nonmonotone line search
        Initialization: $f_{ref} = +\infty$, $H \in [4, 10]$
        If $f_k \le f_{best}$: $f_{best} = f_k$, $f_c = f_k$, $h = 0$;
        Else: $f_c = \max\{ f_c, f_k \}$, $h = h + 1$; if $h = H$: $f_{ref} = f_c$, search, $f_c = f_k$, $h = 0$
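A rough sketch of a BB gradient method globalized by a GLL-type nonmonotone line search is given below. Several details are assumptions not stated on the slide: the reference value is taken as the maximum of the most recent `m` function values including the current one, the trial step is halved on failure, the first stepsize is $1/\|g_1\|$, and a fallback of 1 is used when $s^T y \le 0$. The test function is also purely illustrative.

```python
import numpy as np

def bb_gll(f, grad, x0, m=10, delta=1e-4, tol=1e-6, max_iter=500):
    """BB1 gradient method with a GLL-type nonmonotone line search (sketch)."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    alpha = 1.0 / max(np.linalg.norm(g), 1e-12)   # assumed initial stepsize
    history = [f(x)]
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:
            return x, k
        f_ref = max(history[-m:])                  # nonmonotone reference value
        step = alpha
        # Backtrack (halving is an assumption) until the GLL condition holds.
        while f(x - step * g) > f_ref - delta * step * (g @ g):
            step *= 0.5
        x_new = x - step * g
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s @ y
        alpha = (s @ s) / sy if sy > 0 else 1.0    # BB1 with assumed fallback
        alpha = min(max(alpha, 1e-10), 1e10)       # assumed safeguard interval
        x, g = x_new, g_new
        history.append(f(x))
    return x, max_iter

# Illustrative strictly convex, non-quadratic test function (assumption).
D = np.array([1.0, 10.0])
f = lambda x: 0.25 * np.sum(x**4) + 0.5 * np.sum(D * x**2)
grad = lambda x: x**3 + D * x

x_min, n_iter = bb_gll(f, grad, [2.0, -1.5])
print(x_min, n_iter)
```

Because the reference value is a maximum over several past iterates, the BB trial step is usually accepted without backtracking, which preserves the nonmonotone behaviour of the method.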

  14. Section II. A Positive BB-Like Stepsize

  15. Motivation
      What to do if the BB stepsize
        $\alpha_k^{BB1} = \frac{s_{k-1}^T s_{k-1}}{s_{k-1}^T y_{k-1}}$ or $\alpha_k^{BB2} = \frac{s_{k-1}^T y_{k-1}}{y_{k-1}^T y_{k-1}}$
      is very small or even negative?
      Project it onto the interval $[\alpha_k^{\min}, \alpha_k^{\max}]$? How to choose $\alpha_k^{\min}$ (and $\alpha_k^{\max}$)? $10^{-30}$, $10^{-8}$, $10^{-5}$, ...
      For a symmetric but not necessarily positive definite linear system $A x = b$, $x \in \mathbb{R}^n$, how to approximate the (inverse) Jacobian matrix by the form $\alpha I$, in which case it may have negative eigenvalues?

  16. The New Positive Stepsize
      $\alpha_k = \frac{\|s_{k-1}\|}{\|y_{k-1}\|}$  (1)
      Mentioned on several previous occasions, but not carefully studied [e.g., Dai & Yuan (2001), Dai (2003), Dai & Yang (2006), Mehiddin Al-Baali (2007)]
      Property 1: Geometric mean
      $\alpha_k = \sqrt{\alpha_k^{BB1} \cdot \alpha_k^{BB2}}$  (2)
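Property 1 and the positivity of stepsize (1) are easy to check on concrete vectors. The helper name `stepsizes` and the two $(s, y)$ pairs below are hypothetical examples; the second pair has $s^T y < 0$, the situation raised in the motivation, where both BB stepsizes are negative but (1) stays positive.

```python
import numpy as np

def stepsizes(s, y):
    """Return (alpha_BB1, alpha_BB2, alpha_new) for a pair (s, y)."""
    sy = s @ y
    bb1 = (s @ s) / sy
    bb2 = sy / (y @ y)
    new = np.linalg.norm(s) / np.linalg.norm(y)   # stepsize (1)
    return bb1, bb2, new

# Pair with positive curvature, s^T y = 4 > 0.
bb1, bb2, new = stepsizes(np.array([1.0, 2.0]), np.array([2.0, 1.0]))
geometric_mean = np.sqrt(bb1 * bb2)

# Pair with negative curvature, s^T y = -3 < 0.
bb1_neg, bb2_neg, new_neg = stepsizes(np.array([1.0, 2.0]), np.array([1.0, -2.0]))

print(bb1, bb2, new, geometric_mean)
print(bb1_neg, bb2_neg, new_neg)
```

The product $\alpha_k^{BB1} \alpha_k^{BB2} = \|s_{k-1}\|^2 / \|y_{k-1}\|^2$ does not involve $s_{k-1}^T y_{k-1}$ at all, which is why identity (2) holds whenever $s_{k-1}^T y_{k-1} \ne 0$, regardless of its sign.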

  17. The New Positive Stepsize (Cont.)
      Property 2: Certain quasi-Newton property
      Two features of $\nabla^2 f(x_k)$:
        $s_{k-1}^T \nabla^2 f(x_k) s_{k-1} \approx s_{k-1}^T y_{k-1}$  (3)
        $y_{k-1}^T \nabla^2 f(x_k)^{-1} y_{k-1} \approx s_{k-1}^T y_{k-1}$  (4)
      Approximation: $\nabla^2 f(x_k)^{-1} \leftarrow H = \alpha I$, $\nabla^2 f(x_k) \leftarrow H^{-1} = \alpha^{-1} I$
      $\alpha_k = \arg\min_{H = \alpha I \succ 0} \left\{ s_{k-1}^T H^{-1} s_{k-1} + y_{k-1}^T H y_{k-1} - 2 s_{k-1}^T y_{k-1} \right\}$
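The minimization in Property 2 can be carried out by hand. The short derivation below, with the auxiliary function $\varphi$ introduced only for this purpose, shows that its solution is exactly stepsize (1).

```latex
% Substitute H = \alpha I with \alpha > 0 into the objective of Property 2.
\varphi(\alpha) = \frac{\|s_{k-1}\|^2}{\alpha} + \alpha \|y_{k-1}\|^2 - 2 s_{k-1}^T y_{k-1},
\qquad \alpha > 0.

% First- and second-order conditions.
\varphi'(\alpha) = -\frac{\|s_{k-1}\|^2}{\alpha^2} + \|y_{k-1}\|^2 = 0
\;\Longrightarrow\;
\alpha_k = \frac{\|s_{k-1}\|}{\|y_{k-1}\|},
\qquad
\varphi''(\alpha) = \frac{2 \|s_{k-1}\|^2}{\alpha^3} > 0.

% Minimal value, nonnegative by the Cauchy-Schwarz inequality.
\varphi(\alpha_k) = 2 \left( \|s_{k-1}\| \, \|y_{k-1}\| - s_{k-1}^T y_{k-1} \right) \ge 0.
```

Since $\varphi$ is strictly convex on $\alpha > 0$, the stationary point is the unique minimizer, and it is positive whatever the sign of $s_{k-1}^T y_{k-1}$.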

  18. Property 3: One-retard extension of [Dai & Yang, 2006]
      $\alpha_k^{DY} = \frac{\|g_k\|}{\|A g_k\|}$  (5)
      The stepsize (5) is shown to tend to some optimal stepsize:
      $\liminf_{k \to \infty} \alpha_k^{DY} = \frac{2}{\lambda_1 + \lambda_n} = \arg\min_{\alpha \ge 0} \| I - \alpha A \|$.  (6)
      Both the solution and the minimal/maximal eigenpairs can be obtained simultaneously (one stone, two birds).
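The right-hand side of (6) can be sanity-checked with a simple grid search. The sketch assumes a symmetric positive definite $A$ with $\lambda_1$ and $\lambda_n$ its smallest and largest eigenvalues, and uses the fact that for symmetric $A$ the spectral norm $\|I - \alpha A\|_2$ equals $\max_i |1 - \alpha \lambda_i|$; the eigenvalues and the grid are illustrative choices. It checks only the minimizer of the norm, not the limit behaviour of $\alpha_k^{DY}$.

```python
import numpy as np

eigenvalues = np.array([1.0, 3.0, 10.0])     # assumed spectrum of A
alphas = np.linspace(0.0, 1.0, 100001)       # grid with spacing 1e-5

# ||I - alpha A||_2 = max_i |1 - alpha * lambda_i| for symmetric A.
norms = np.max(np.abs(1.0 - np.outer(alphas, eigenvalues)), axis=1)

alpha_grid = alphas[np.argmin(norms)]
alpha_opt = 2.0 / (eigenvalues.min() + eigenvalues.max())
print(alpha_grid, alpha_opt, norms.min())
```

At the minimizer the two extreme eigenvalues give the same value $|1 - \alpha \lambda_1| = |1 - \alpha \lambda_n| = (\lambda_n - \lambda_1)/(\lambda_n + \lambda_1)$.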

  19. Section III. Analysis of The New Method

  20. Some notations
      Assume that $A = \begin{pmatrix} 1 & 0 \\ 0 & \lambda \end{pmatrix}$, $b = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $\lambda > 1$
      Denote $g_k = (g_k^{(1)}, g_k^{(2)})^T$
      Assumption 1: $\lambda > 1$  (7)
      Assumption 2: $g_1^{(i)} \ne 0$, $g_2^{(i)} \ne 0$, $i = 1, 2$  (8)
      Define $q_k = \frac{(g_k^{(1)})^2}{(g_k^{(2)})^2}$  (9)

  21. Some basic relations
      $\alpha_k = \frac{\|s_{k-1}\|}{\|y_{k-1}\|} = \frac{\|g_{k-1}\|}{\|A g_{k-1}\|} = \sqrt{\frac{1 + q_{k-1}}{\lambda^2 + q_{k-1}}}$  (10)
      $g_{k+1} = (I - \alpha_k A) g_k$  (11)
      $\Rightarrow$
      $g_{k+1}^{(1)} = (1 - \alpha_k) g_k^{(1)} = \frac{\sqrt{\lambda^2 + q_{k-1}} - \sqrt{1 + q_{k-1}}}{\sqrt{\lambda^2 + q_{k-1}}} \, g_k^{(1)}$
      $g_{k+1}^{(2)} = (1 - \lambda \alpha_k) g_k^{(2)} = \frac{\sqrt{\lambda^2 + q_{k-1}} - \lambda \sqrt{1 + q_{k-1}}}{\sqrt{\lambda^2 + q_{k-1}}} \, g_k^{(2)}$  (12)

  22. Recurrence relation of $q_k$
      $q_{k+1} = q_k \left( \frac{\sqrt{\lambda^2 + q_{k-1}} - \sqrt{1 + q_{k-1}}}{\sqrt{\lambda^2 + q_{k-1}} - \lambda \sqrt{1 + q_{k-1}}} \right)^2$
      $= q_k \left( \frac{\left( \sqrt{\lambda^2 + q_{k-1}} - \sqrt{1 + q_{k-1}} \right) \left( \sqrt{\lambda^2 + q_{k-1}} + \lambda \sqrt{1 + q_{k-1}} \right)}{(\lambda^2 - 1) \, q_{k-1}} \right)^2$
      $= \frac{q_k}{q_{k-1}^2} \left( \frac{\lambda - q_{k-1} + \sqrt{\tau(q_{k-1})}}{\lambda + 1} \right)^2$,  (13)
      where
      $\tau(w) = (1 + w)(\lambda^2 + w)$, $w \ge 0$  (14)
      $h(w) = \frac{\lambda - w + \sqrt{\tau(w)}}{\lambda + 1}$, $w \ge 0$  (15)
      Define $M_k = \log q_k$. Then we obtain
      $M_{k+1} = M_k - 2 M_{k-1} + 2 \log\left( h(q_{k-1}) \right)$  (16)
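The recurrence can be verified numerically on the two-dimensional model problem. In the sketch below, $\lambda = 3$, $g_1 = (1, 1)^T$, and the arbitrary first stepsize $\alpha_1 = 0.5$ are assumptions chosen for illustration. The code runs the gradient method with stepsize (10) and compares each directly computed $q_{k+1}$ with the value $q_k \, h(q_{k-1})^2 / q_{k-1}^2$ obtained from (13) and (15).

```python
import numpy as np

lam = 3.0                                  # assumed value, satisfies lam > 1
A = np.diag([1.0, lam])

def h(w):
    """h(w) from (15), with tau(w) from (14)."""
    tau = (1.0 + w) * (lam**2 + w)
    return (lam - w + np.sqrt(tau)) / (lam + 1.0)

g_prev = np.array([1.0, 1.0])              # g_1 (assumed)
g = g_prev - 0.5 * (A @ g_prev)            # g_2, using an arbitrary alpha_1 = 0.5
q = [(g_prev[0] / g_prev[1]) ** 2, (g[0] / g[1]) ** 2]   # q_1, q_2

max_rel_err = 0.0
for _ in range(8):
    alpha = np.linalg.norm(g_prev) / np.linalg.norm(A @ g_prev)   # relation (10)
    g_prev, g = g, g - alpha * (A @ g)                            # relation (11)
    q_direct = (g[0] / g[1]) ** 2                                 # definition (9)
    q_recur = q[-1] * (h(q[-2]) / q[-2]) ** 2                     # recurrence (13)
    max_rel_err = max(max_rel_err, abs(q_direct - q_recur) / q_direct)
    q.append(q_direct)

print(q[2], max_rel_err)
```

With these choices $q_1 = q_2 = 1$, so $q_3 = h(1)^2 = \left( (1 + \sqrt{5})/2 \right)^2$, and the two ways of computing $q_{k+1}$ agree to rounding error at every step.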

  23. The difficulty: Previously, for the BB1 or BB2 method, we could get the linear recurrence relation $M_{k+1} = M_k - 2 M_{k-1}$. But now we have a nonlinear recurrence relation.
