ThRaSH Workshop 2012
Convergence of Local Search
Sebastian U. Stich (a,b), joint work with Christian L. Müller (a,b) and Bernd Gärtner (a)
(a) Institute of Theoretical Computer Science, Department of Computer Science, ETH Zürich
(b) Swiss Institute of Bioinformatics
May 3, 2012
S. Stich Random Pursuit
Table of contents
1 Introduction: Black-box setting; Convex functions
2 Local Search: Definition; Convergence; Examples
3 Outlook: Outlook and Open Problems
Black-box optimization
Given: $f : E \to \mathbb{R}$, $x \mapsto f(x)$
Goal: $\min_{x \in E} f(x)$
Problem class: $f$ convex
Oracle: access to the function values $f(x)$
Complexity: number of oracle calls sufficient to solve any problem of the class
Solution: a point $y$ with $f(y) - \min_{x \in E} f(x) \le \epsilon$
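The complexity measure above counts oracle calls, which is easy to make concrete. The following is a minimal sketch, not taken from the talk: `CountingOracle` and the test function `sphere` are hypothetical names introduced here for illustration only.

```python
from typing import Callable, List


class CountingOracle:
    """Wraps f so that every evaluation is counted as one oracle call."""

    def __init__(self, f: Callable[[List[float]], float]) -> None:
        self._f = f
        self.calls = 0

    def __call__(self, x: List[float]) -> float:
        self.calls += 1
        return self._f(x)


def sphere(x: List[float]) -> float:
    """Hypothetical convex test function f(x) = sum_i x_i^2 (minimum value 0)."""
    return sum(xi * xi for xi in x)


oracle = CountingOracle(sphere)
# Three queries: the optimizer only ever sees the returned values.
values = [oracle([t, -t]) for t in (1.0, 0.5, 0.25)]
print(values, oracle.calls)
```

Any of the search schemes discussed later can be handed such a wrapper in place of `f`, so that `oracle.calls` reports the cost in the sense of this slide.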
Convex functions
First-order condition: $f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle$ for all $x, y \in E$
[Figure: graph of $f(x)$, the point $(x_0, f(x_0))$, and the tangent $f(x_0) + \langle \nabla f(x_0), y - x_0 \rangle$]
Convex functions II
Quadratic upper bound: $f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + \frac{L}{2} \|y - x\|^2$
Quadratic lower bound (strongly convex): $f(y) \ge f(x) + \langle \nabla f(x), y - x \rangle + \frac{\mu}{2} \|y - x\|^2$
[Figures, one per bound: graph of $f(x)$, the point $(x_0, f(x_0))$, and the tangent $f(x_0) + \langle \nabla f(x_0), y - x_0 \rangle$]
We call $\kappa := L/\mu$ the condition number; $\mu \cdot I_n \preceq \nabla^2 f(x) \preceq L \cdot I_n$
Only (!) for strongly convex functions: $\|x - x^*\|^2 \le \frac{2}{\mu} \left( f(x) - f(x^*) \right)$
Local search
Iteration: $x_{k+1} = x_k + \sigma_k u_k$
Sufficient decrease, with $0 < \gamma \le \gamma_k \le 1$:
$f(x_{k+1}) \le (1 - \gamma_k) f(x_k) + \gamma_k \min_{t \in \mathbb{R}} f(x_k + t u_k)$
[Figure: step $\sigma_k$ from $x_k$; label $1 - \gamma_k$]
Sufficient decrease
Sufficient decrease, with $0 < \gamma \le \gamma_k \le 1$:
$f(x_{k+1}) \le (1 - \gamma_k) f(x_k) + \gamma_k \min_{t \in \mathbb{R}} f(x_k + t u_k)$
$\Rightarrow f(x_k) - f(x_{k+1}) \ge \gamma_k \left( f(x_k) - f(x_k + t u_k) \right)$ for all $t \in \mathbb{R}$
Set $t = -\beta_k \cdot \frac{\|\nabla f(x_k)\|_2}{L}$, where $\beta_k := \left\langle \frac{\nabla f(x_k)}{\|\nabla f(x_k)\|_2}, u_k \right\rangle$
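One way to see where this choice of $t$ comes from: it minimizes the quadratic upper bound along $u_k$. The short derivation below is a sketch that assumes $\|u_k\|_2 = 1$ and the smoothness constant $L$ from the quadratic upper bound.

```latex
\begin{align*}
f(x_k + t u_k) &\le f(x_k) + t \langle \nabla f(x_k), u_k \rangle + \frac{L}{2} t^2
  && \text{(quadratic upper bound, } \|u_k\|_2 = 1\text{)} \\
t &= -\frac{\langle \nabla f(x_k), u_k \rangle}{L}
   = -\beta_k \frac{\|\nabla f(x_k)\|_2}{L}
  && \text{(minimizer of the right-hand side)} \\
f(x_k) - f(x_k + t u_k) &\ge \frac{\langle \nabla f(x_k), u_k \rangle^2}{2L}
   = \frac{\beta_k^2 \, \|\nabla f(x_k)\|_2^2}{2L}
\end{align*}
```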
Single step progress
We use this $t$ together with our assumptions.
Quadratic upper bound: $f(x_k) - f(x_{k+1}) \ge \frac{\gamma_k}{2L} \beta_k^2 \|\nabla f(x_k)\|_2^2$
Quadratic lower bound: $f(x_k) - f(x_{k+1}) \ge \gamma \frac{\mu}{L} \beta_k^2 \left( f(x_k) - f(x^*) \right)$
Progress: $f_{k+1} \le \left( 1 - \frac{\gamma}{\kappa} \beta_k^2 \right) f_k$, where $f_k := f(x_k) - f(x^*)$ and $f_{k+1} := f(x_{k+1}) - f(x^*)$
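The progress inequality can be checked numerically on a small example. The sketch below makes several assumptions that are not from the talk: a diagonal quadratic with hypothetical eigenvalues 1, 4, 10 (so $\mu = 1$, $L = 10$, $f(x^*) = 0$), exact line search (so $\gamma_k = 1$), and unit directions obtained by normalizing Gaussian vectors.

```python
import math
import random
from typing import List

A = [1.0, 4.0, 10.0]            # hypothetical diagonal Hessian: mu = 1, L = 10
KAPPA = max(A) / min(A)         # condition number kappa = L / mu


def f(x: List[float]) -> float:
    return 0.5 * sum(a * xi * xi for a, xi in zip(A, x))


def grad(x: List[float]) -> List[float]:
    return [a * xi for a, xi in zip(A, x)]


def unit_vector(n: int, rng: random.Random) -> List[float]:
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(gi * gi for gi in g))
    return [gi / norm for gi in g]


def check_progress(trials: int, seed: int = 0) -> bool:
    """Return True if f_{k+1} <= (1 - beta_k^2 / kappa) * f_k held in every step."""
    rng = random.Random(seed)
    x = [1.0, -2.0, 0.5]
    for _ in range(trials):
        g = grad(x)
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        if g_norm == 0.0:
            break                                   # already at the minimizer
        u = unit_vector(len(x), rng)
        slope = sum(gi * ui for gi, ui in zip(g, u))
        beta = slope / g_norm
        curvature = sum(a * ui * ui for a, ui in zip(A, u))
        t = -slope / curvature                      # exact line search on a quadratic
        x_next = [xi + t * ui for xi, ui in zip(x, u)]
        bound = (1.0 - beta * beta / KAPPA) * f(x)
        if f(x_next) > bound * (1.0 + 1e-9) + 1e-15:
            return False
        x = x_next
    return True


ok = check_progress(trials=50)
print(ok)
```

The small tolerance in the comparison only absorbs floating-point rounding; the inequality itself is the one on the slide with $\gamma = 1$.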
Global convergence
After $N$ steps:
$\ln f_N - \ln f_0 \le \sum_{k=0}^{N-1} \left( \ln f_{k+1} - \ln f_k \right)$
$\le \sum_{k=0}^{N-1} \ln \left( 1 - \frac{\gamma}{\kappa} \beta_k^2 \right)$
$\le -\frac{\gamma}{\kappa} \sum_{k=0}^{N-1} \beta_k^2$
Hence $f_N \le f_0 \cdot \exp \left( -\frac{\gamma}{\kappa} \sum_{k=0}^{N-1} \beta_k^2 \right)$
Convergence with high probability
$\beta^2 = \langle v, u \rangle^2$ with $v \in S^{n-1}$ fixed and $u \sim S^{n-1}$
$E[\beta^2] = E[\langle v, u \rangle^2] = \frac{1}{n}$
$\mathrm{Var}[\beta^2] = \mathrm{Var}[\langle v, u \rangle^2] \le \frac{2}{n^2}$
$P\left[ \sum_{k=0}^{N-1} \langle v, u_k \rangle^2 < (1 - \epsilon) \frac{N}{n} \right] \le \frac{1}{N} \cdot \frac{\mathrm{Var}[\beta^2]}{\epsilon^2 E[\beta^2]^2} \le \frac{2}{\epsilon^2 N}$
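The two moments of $\beta^2$ used in this bound can be sanity-checked by simulation. The following Monte Carlo sketch is an illustration only; the dimension $n = 10$, the sample size, the seed, and the choice $v = e_1$ are assumptions made here (by symmetry the distribution of $\langle v, u \rangle^2$ does not depend on which unit vector $v$ is used).

```python
import random
from typing import Tuple


def sample_beta_squared(n: int, rng: random.Random) -> float:
    """Draw u uniformly from the unit sphere S^{n-1}; return <v, u>^2 for v = e_1."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    squared_norm = sum(gi * gi for gi in g)
    # Normalizing a standard Gaussian vector gives a uniform direction,
    # so <e_1, u>^2 = g_1^2 / ||g||^2.
    return g[0] * g[0] / squared_norm


def estimate_moments(n: int, samples: int, seed: int = 0) -> Tuple[float, float]:
    """Return the sample mean and the unbiased sample variance of beta^2."""
    rng = random.Random(seed)
    draws = [sample_beta_squared(n, rng) for _ in range(samples)]
    mean = sum(draws) / samples
    variance = sum((d - mean) ** 2 for d in draws) / (samples - 1)
    return mean, variance


n = 10
mean, variance = estimate_moments(n, samples=20000)
print(f"mean {mean:.4f} vs 1/n = {1 / n:.4f}")
print(f"variance {variance:.4f} vs bound 2/n^2 = {2 / n**2:.4f}")
```

The estimated mean should land close to $1/n$ and the estimated variance below $2/n^2$, up to sampling error.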
Convergence with high probability
For $N = \Omega(n)$:
$f_N \le f_0 \cdot \exp \left( -\frac{\gamma}{\kappa} \sum_{k=0}^{N-1} \beta_k^2 \right) \le f_0 \cdot \exp \left( -\frac{\gamma}{\kappa} \cdot \frac{(1 - \epsilon) N}{n} \right)$ w.h.p.
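Read as an iteration count, the bound says how many steps suffice to shrink the error by a given factor. In the worked example below, the target factor $\rho \in (0, 1)$ and all numerical values are hypothetical choices for illustration, and the statement holds only on the high-probability event above.

```latex
\begin{align*}
f_0 \exp\left( -\frac{\gamma (1 - \epsilon) N}{\kappa n} \right) \le \rho f_0
  \quad &\Longleftrightarrow \quad
  N \ge \frac{\kappa n}{\gamma (1 - \epsilon)} \ln \frac{1}{\rho} \\
\kappa = 10,\ n = 100,\ \gamma = 1,\ \epsilon = \tfrac{1}{2},\ \rho = 10^{-6}:
  \quad & N \ge 2000 \cdot \ln 10^{6} \approx 2.8 \cdot 10^{4}
\end{align*}
```

The count grows linearly in both the dimension $n$ and the condition number $\kappa$, and only logarithmically in the required accuracy.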
Example I - Random Pursuit
$u \sim S^{n-1}$ + approximate line search:
$\sigma_k \in \left[ 0.5 \cdot \arg\min_{h \in \mathbb{R}} f(x_k + h u),\ \arg\min_{h \in \mathbb{R}} f(x_k + h u) \right] + \delta$
[Figure: step $\sigma$ at the point $x$]
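A minimal sketch of this scheme, under assumptions that are not from the talk: the approximate line search is a golden-section search over a fixed hypothetical bracket `[-bracket, bracket]`, a step is kept only if it does not increase `f`, and the test function is a hypothetical diagonal quadratic. It illustrates the idea and is not the authors' implementation.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0


def random_unit_vector(n: int, rng: random.Random) -> Vector:
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(gi * gi for gi in g))
    return [gi / norm for gi in g]


def golden_section(phi: Callable[[float], float], lo: float, hi: float,
                   tol: float = 1e-8) -> float:
    """Approximate minimizer of a unimodal function phi on [lo, hi]."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = phi(c), phi(d)
    while hi - lo > tol:
        if fc < fd:                       # minimizer lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = phi(c)
        else:                             # minimizer lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = phi(d)
    return 0.5 * (lo + hi)


def random_pursuit(f: Callable[[Sequence[float]], float], x0: Sequence[float],
                   iterations: int, bracket: float = 10.0,
                   seed: int = 0) -> Tuple[Vector, List[float]]:
    rng = random.Random(seed)
    x = list(x0)
    fx = f(x)
    history = [fx]
    for _ in range(iterations):
        u = random_unit_vector(len(x), rng)
        sigma = golden_section(
            lambda h: f([xi + h * ui for xi, ui in zip(x, u)]), -bracket, bracket)
        candidate = [xi + sigma * ui for xi, ui in zip(x, u)]
        f_candidate = f(candidate)
        if f_candidate <= fx:             # keep only non-worsening steps
            x, fx = candidate, f_candidate
        history.append(fx)
    return x, history


def quadratic(x: Sequence[float]) -> float:
    """Hypothetical test function with mu = 1 and L = 10."""
    return 0.5 * sum(a * xi * xi for a, xi in zip((1.0, 4.0, 10.0), x))


x_final, history = random_pursuit(quadratic, [1.0, -2.0, 0.5], iterations=300)
print(history[0], history[-1])
```

The fixed bracket is the main simplification: it must contain the one-dimensional minimizer, which holds for this test function and starting point but would need care in general.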
Example II - Random Gradient Method [Nesterov 2011]
$u \sim S^{n-1}$ + estimated stepsize:
$\sigma_k \approx -\frac{1}{L} f'(u, x_k)$
[Figure: step $\sigma$ at the point $x$]
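A rough sketch of a step rule of this kind, under loudly stated assumptions: the directional derivative is estimated by a forward difference with a hypothetical step `h`, the constant $L$ is passed in as `lipschitz`, and directions are normalized Gaussian vectors. This is an illustration of the stepsize formula on the slide and not a faithful reproduction of the method in [Nesterov 2011].

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def random_unit_vector(n: int, rng: random.Random) -> Vector:
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(gi * gi for gi in g))
    return [gi / norm for gi in g]


def random_gradient_method(f: Callable[[Sequence[float]], float],
                           x0: Sequence[float], lipschitz: float,
                           iterations: int, h: float = 1e-6,
                           seed: int = 0) -> Tuple[Vector, List[float]]:
    rng = random.Random(seed)
    x = list(x0)
    fx = f(x)
    history = [fx]
    for _ in range(iterations):
        u = random_unit_vector(len(x), rng)
        shifted = [xi + h * ui for xi, ui in zip(x, u)]
        # Forward-difference estimate of the directional derivative along u.
        directional = (f(shifted) - fx) / h
        sigma = -directional / lipschitz          # sigma_k ~ -(1/L) f'(u, x_k)
        x = [xi + sigma * ui for xi, ui in zip(x, u)]
        fx = f(x)
        history.append(fx)
    return x, history


def quadratic(x: Sequence[float]) -> float:
    """Hypothetical test function with mu = 1 and L = 10."""
    return 0.5 * sum(a * xi * xi for a, xi in zip((1.0, 4.0, 10.0), x))


x_final, history = random_gradient_method(
    quadratic, [1.0, -2.0, 0.5], lipschitz=10.0, iterations=500)
print(history[0], history[-1])
```

Each iteration here costs two oracle calls, one for the shifted point and one for the new iterate.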
Example III & IV
Different spaces:
  Discrete space
  Matrices, $f : \mathbb{R}^{n \times n} \to \mathbb{R}$
Example V (?) - Optimal 1/5 rule
$u \sim N(0, I_n)$ + '1/5'-rule:
$P[ f(x_k + \sigma_k u) \le f(x_k) ] = \text{const}$
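For orientation, here is one common textbook form of a (1+1)-ES with a 1/5 success rule. It is a sketch under assumptions not taken from the talk: the stepsize is multiplied by a hypothetical factor `increase` after a success and by `increase ** (-1/4)` after a failure, so the stepsize is stationary on a log scale exactly when the success probability is 1/5.

```python
import random
from typing import Callable, List, Sequence, Tuple


def one_plus_one_es(f: Callable[[Sequence[float]], float], x0: Sequence[float],
                    sigma0: float, iterations: int, increase: float = 1.5,
                    seed: int = 0) -> Tuple[List[float], float, List[float]]:
    rng = random.Random(seed)
    x = list(x0)
    fx = f(x)
    sigma = sigma0
    decrease = increase ** (-0.25)    # balances one success against four failures
    history = [fx]
    for _ in range(iterations):
        u = [rng.gauss(0.0, 1.0) for _ in x]              # u ~ N(0, I_n)
        candidate = [xi + sigma * ui for xi, ui in zip(x, u)]
        f_candidate = f(candidate)
        if f_candidate <= fx:         # success: accept and enlarge the stepsize
            x, fx = candidate, f_candidate
            sigma *= increase
        else:                         # failure: keep x and shrink the stepsize
            sigma *= decrease
        history.append(fx)
    return x, sigma, history


def sphere(x: Sequence[float]) -> float:
    """Hypothetical test function f(x) = sum_i x_i^2."""
    return sum(xi * xi for xi in x)


x_final, sigma_final, history = one_plus_one_es(
    sphere, [1.0, -2.0, 0.5], sigma0=1.0, iterations=500)
print(history[0], history[-1], sigma_final)
```

Whether this rule fits the sufficient-decrease model of the talk is exactly the open point flagged by the question mark in the slide title.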
Outlook and Open Problems
Possible:
  Concentration for $N = \Omega(\log n)$
  Different search directions (not only $S^{n-1}$)
  Non-isotropic sampling
  Interesting spaces (e.g. matrices)
  Smooth convex functions
Would be very nice:
  Constraint handling
  Apply to the 1/5-rule stepsize rule [like (1+1)-ES]
Open:
  Extension of the model to "almost convex functions"
Thank you
References
S. U. Stich, C. L. Müller, B. Gärtner. Optimization of convex functions with Random Pursuit. 2011.
S. U. Stich. Convergence of Local Search. Manuscript, 2012.