
Properties of the Stochastic Approximation Schedule in the Wang-Landau Algorithm



  1. The algorithm | Unsettled issues | Flat Histogram in finite time | Parallel Interacting Chains
     Properties of the Stochastic Approximation Schedule in the Wang-Landau Algorithm
     Pierre E. Jacob, CEREMADE, Université Paris Dauphine, funded by AXA research
     MCQMC – February 2012
     Joint work with Luke Bornn (UBC), Arnaud Doucet (Oxford), Pierre Del Moral (INRIA & Université de Bordeaux), Robin J. Ryder (Dauphine)
     P.E.JACOB Wang-Landau 1/ 25

  2. Outline
     1 The algorithm
     2 Unsettled issues
     3 Flat Histogram in finite time
     4 Parallel Interacting Chains

  3. Motivation
     Figure (density against X): A normal distribution biased to get desired frequencies in specific parts of the space. Here we use $\phi = \{75\%, 25\%\}$ on $\{\,]-\infty, 0],\ [0, +\infty[\,\}$.

  4. Motivation
     Figure (histogram of the binned coordinate; density against binned coordinate): Normal distribution biased to get the same frequency in each of 5 bins.

  5. Setting
     Partition the state space: $\mathcal{X} = \bigcup_{i=1}^{d} \mathcal{X}_i$
     Desired frequencies: $\phi = (\phi_1, \ldots, \phi_d)$ such that $\sum_i \phi_i = 1$
     Penalized distribution: $\forall i \in \{1, \ldots, d\},\ \forall x \in \mathcal{X}_i: \quad \pi_\theta(x) \propto \frac{\pi(x)}{\theta(i)}$

  6. First algorithm
     Algorithm 1: Wang-Landau with deterministic schedule $(\gamma_t)$
       Init: $\forall i \in \{1, \ldots, d\}$ set $\theta_0(i) \leftarrow 1/d$.
       Init: $X_0 \in \mathcal{X}$.
       for $t = 1$ to $T$ do
         Sample $X_t$ from $K_{\theta_{t-1}}(X_{t-1}, \cdot)$, an MH kernel targeting $\pi_{\theta_{t-1}}$.
         Update the penalties: $\log\theta_t(i) \leftarrow \log\theta_{t-1}(i) + \gamma_t\,(\mathbb{1}_{\mathcal{X}_i}(X_t) - \phi_i)$
       end for
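The steps of Algorithm 1 can be sketched in Python as follows. This is only a minimal illustration under assumptions that are not in the slides: the target $\pi$ is taken to be a standard normal, the MH kernel is a Gaussian random walk, and the names `bin_index`, `log_target`, `wang_landau`, `edges` and `step` are hypothetical.

```python
import math
import random

def bin_index(x, edges):
    """Index i of the bin X_i containing x, for sorted interior bin edges."""
    return sum(x > e for e in edges)

def log_target(x):
    """Unnormalised log density of pi (assumed standard normal for illustration)."""
    return -0.5 * x * x

def wang_landau(T, edges, phi, gamma, step=1.0, seed=0):
    """Wang-Landau with a deterministic schedule; gamma is a function of t."""
    rng = random.Random(seed)
    d = len(phi)
    log_theta = [math.log(1.0 / d)] * d   # theta_0(i) = 1/d
    counts = [0] * d                      # number of visits to each bin
    x = 0.0                               # X_0 (arbitrary starting point)
    for t in range(1, T + 1):
        # One random-walk MH step targeting pi_theta(x), proportional to pi(x) / theta(bin(x)).
        y = x + rng.gauss(0.0, step)
        log_ratio = (log_target(y) - log_theta[bin_index(y, edges)]) \
                  - (log_target(x) - log_theta[bin_index(x, edges)])
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = y
        # Penalty update: log theta_t(i) += gamma_t * (1{X_t in X_i} - phi_i).
        k = bin_index(x, edges)
        g = gamma(t)
        for i in range(d):
            log_theta[i] += g * ((1.0 if i == k else 0.0) - phi[i])
        counts[k] += 1
    return log_theta, counts
```

For instance, `wang_landau(2000, [0.0], [0.75, 0.25], lambda t: 1.0 / t)` uses the two bins of the motivating figure with an assumed schedule $\gamma_t = 1/t$. Since $\sum_i \phi_i = 1$, each update leaves $\sum_i \log\theta_t(i)$ unchanged, which gives a simple sanity check.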

  7. Flat Histogram
     Issues with the first version: the choice of $\gamma$ has a huge impact on the results.
     Flat Histogram. Define the counters: $\nu_t(i) := \sum_{n=1}^{t} \mathbb{1}_{\mathcal{X}_i}(X_n)$
     Flat Histogram (FH) is reached when: $\max_{i \in \{1, \ldots, d\}} \left| \frac{\nu_t(i)}{t} - \phi_i \right| < c$
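The Flat Histogram criterion translates directly into a small check on the counters. The sketch below assumes the counters are stored in a list `counts` whose sum is the current time $t$; the function name is hypothetical.

```python
def flat_histogram_reached(counts, phi, c):
    """True when max_i |nu_t(i)/t - phi_i| < c, with t = sum of the counters."""
    t = sum(counts)
    if t == 0:
        return False  # no samples yet, so the frequencies are undefined
    return max(abs(n / t - p) for n, p in zip(counts, phi)) < c
```

With `counts = [60, 40]`, `phi = [0.75, 0.25]` and `c = 0.1`, the largest deviation is 0.15, so the criterion is not met.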

  8. Flat Histogram
     Idea: instead of decreasing $\gamma_t$ at each time step $t$, decrease it only when the Flat Histogram criterion is reached.
     In practice:
     Denote by $\kappa_t$ the number of FH criteria reached up to time $t$.
     Use $\gamma_{\kappa_t}$ instead of $\gamma_t$ at time $t$.
     If FH is reached at time $t$, reset $\nu_t(i)$ to 0 for all $i$.

  9. Wang–Landau with Flat Histogram
     Algorithm 2: Wang-Landau with stochastic schedule $(\gamma_{\kappa_t})$
       Init as before: $X_0$, $\theta_0(i)$.
       Init: $\kappa_0 \leftarrow 0$.
       for $t = 1$ to $T$ do
         Sample $X_t$ from $K_{\theta_{t-1}}(X_{t-1}, \cdot)$, an MH kernel targeting $\pi_{\theta_{t-1}}$.
         If (FH) then $\kappa_t \leftarrow \kappa_{t-1} + 1$, otherwise $\kappa_t \leftarrow \kappa_{t-1}$.
         Update the penalties: $\log\theta_t(i) \leftarrow \log\theta_{t-1}(i) + \gamma_{\kappa_t}\,(\mathbb{1}_{\mathcal{X}_i}(X_t) - \phi_i)$
       end for
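A possible Python sketch of Algorithm 2 follows. As assumptions not taken from the slides, the target is a standard normal, the MH kernel is a Gaussian random walk, the schedule `gamma` is passed as a function of $\kappa$, and frequencies are computed over the iterations since the last counter reset. All function and parameter names are hypothetical.

```python
import math
import random

def bin_index(x, edges):
    """Index i of the bin X_i containing x, for sorted interior bin edges."""
    return sum(x > e for e in edges)

def log_target(x):
    """Unnormalised log density of pi (assumed standard normal for illustration)."""
    return -0.5 * x * x

def wang_landau_fh(T, edges, phi, gamma, c, step=1.0, seed=0):
    """Wang-Landau where gamma only decreases when the FH criterion is met."""
    rng = random.Random(seed)
    d = len(phi)
    log_theta = [math.log(1.0 / d)] * d   # theta_0(i) = 1/d
    nu = [0] * d                          # counters, reset whenever FH is reached
    kappa = 0                             # number of FH criteria reached so far
    x = 0.0                               # X_0 (arbitrary starting point)
    for _ in range(T):
        # Random-walk MH step targeting the current penalized distribution.
        y = x + rng.gauss(0.0, step)
        log_ratio = (log_target(y) - log_theta[bin_index(y, edges)]) \
                  - (log_target(x) - log_theta[bin_index(x, edges)])
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = y
        k = bin_index(x, edges)
        nu[k] += 1
        # FH check on the frequencies observed since the last reset.
        n = sum(nu)
        if max(abs(nu[i] / n - phi[i]) for i in range(d)) < c:
            kappa += 1
            nu = [0] * d
        # Penalty update with the step size indexed by kappa.
        g = gamma(kappa)
        for i in range(d):
            log_theta[i] += g * ((1.0 if i == k else 0.0) - phi[i])
    return log_theta, kappa
```

A call such as `wang_landau_fh(5000, [0.0], [0.75, 0.25], lambda k: 1.0 / (1 + k), 0.05)` uses an assumed schedule $\gamma_k = 1/(1+k)$. In this bare form the criterion can be met by chance after very few iterations (for example counts of 3 and 1 after four steps), so a practical implementation might require a minimum number of iterations between checks; that safeguard is not part of the slides.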

  10. Understanding the algorithm
      Pros and cons: it works much better than the first version; however, it is a bit tricky to analyse.
      Putting a label on the algorithm:
      It is an adaptive MCMC algorithm, i.e. the kernel changes at every time step.
      Here the target distribution changes at every time step but the proposal stays the same.
      Between two FH, $\gamma_{\kappa_t}$ stays constant, so there is no diminishing adaptation.
      Hence the FH version is a bit more complicated than the deterministic version.

  11. Understanding the algorithm
      A reasonable first step: proof that FH is met in finite time (under strong assumptions).
      Note: it means the desired frequencies are reached when $\gamma$ stays constant.
      $\Rightarrow$ This might be a hint that the diminishing $\gamma$ does not play a big part in the algorithm.

  12. FH is met in finite time
      To be sure that eventually, for any $c > 0$: $\max_{i \in \{1, \ldots, d\}} \left| \frac{\nu_t(i)}{t} - \phi_i \right| < c$
      we want to prove: $\forall i \in \{1, \ldots, d\}: \quad \frac{\nu_t(i)}{t} \xrightarrow[t \to \infty]{P} \phi_i$

  13. Various updates
      Right update:
      $\log\theta_t(i) \leftarrow \log\theta_{t-1}(i) + \gamma\,(\mathbb{1}_{\mathcal{X}_i}(X_t) - \phi_i)$  (1)
      Wrong update:
      $\theta_t(i) \leftarrow \theta_{t-1}(i)\,[1 + \gamma\,(\mathbb{1}_{\mathcal{X}_i}(X_t) - \phi_i)]$
      $\Leftrightarrow \log\theta_t(i) \leftarrow \log\theta_{t-1}(i) + \log[1 + \gamma\,(\mathbb{1}_{\mathcal{X}_i}(X_t) - \phi_i)]$  (2)
      (actually not wrong if $\forall i\ \phi_i = \frac{1}{d}$)
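One possible way to read the parenthetical remark, offered here as an interpretation rather than the author's argument: $\pi_\theta$ is unchanged when all $\theta(i)$ are multiplied by a common constant, so only the differences $\log\theta_t(k) - \log\theta_t(i)$ matter. With equal frequencies, update (2) then acts like update (1) with a rescaled step $\gamma'$ (a symbol introduced only for this note), assuming $\gamma < d$.

```latex
% Suppose \phi_i = 1/d for all i and X_t \in \mathcal{X}_k.
% Update (2): increments of the log penalties
\log\theta_t(k) - \log\theta_{t-1}(k) = \log\!\left[1 + \gamma\left(1 - \tfrac{1}{d}\right)\right],
\qquad
\log\theta_t(i) - \log\theta_{t-1}(i) = \log\!\left[1 - \tfrac{\gamma}{d}\right] \quad (i \neq k).
% Update (1) with step \gamma':
\log\theta_t(k) - \log\theta_{t-1}(k) = \gamma'\left(1 - \tfrac{1}{d}\right),
\qquad
\log\theta_t(i) - \log\theta_{t-1}(i) = -\tfrac{\gamma'}{d} \quad (i \neq k).
% The change in \log\theta(k) - \log\theta(i) is the same for every i \neq k in both cases:
\text{(2):}\ \log\frac{d + \gamma(d-1)}{d - \gamma},
\qquad
\text{(1):}\ \gamma'.
% Hence the two updates induce the same sequence of targets when
\gamma' = \log\frac{d + \gamma(d-1)}{d - \gamma}.
```

When the $\phi_i$ differ, the increment of $\log\theta(k) - \log\theta(i)$ under (2) is no longer of the form produced by (1) for a single step size, so this equivalence breaks down.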

  14. Assumptions
      From now on, there are only two bins: $d = 2$. Additionally:
      Assumption: The bins are not empty with respect to $\mu$ and $\pi$: $\forall i \in \{1, 2\}$, $\mu(\mathcal{X}_i) > 0$ and $\pi(\mathcal{X}_i) > 0$.
      Assumption: The state space $\mathcal{X}$ is compact.

  15. Assumptions
      Assumption: The proposal distribution $Q(x, y)$ is such that: $\exists q_{\min} > 0\ \forall x \in \mathcal{X}\ \forall y \in \mathcal{X}: \quad Q(x, y) > q_{\min}$
      Assumption: The MH acceptance ratio is bounded from both sides: $\exists m > 0\ \exists M > 0\ \forall x \in \mathcal{X}\ \forall y \in \mathcal{X}: \quad m < \frac{\pi(y)}{\pi(x)} \frac{Q(y, x)}{Q(x, y)} < M$

  16. Theorem
      Theorem: Consider the sequence of penalties $\theta_t$ introduced in the WL algorithm. We define:
      $Z_t = \log\frac{\theta_t(1)}{\theta_t(2)} = \log\theta_t(1) - \log\theta_t(2)$
      Then: $\frac{Z_t}{t} \xrightarrow[t \to \infty]{L^1} 0$
      and consequently, with update (1), (FH) is reached in finite time for any precision threshold $c$, whereas this is not guaranteed for update (2).

  17. Consequence
      Recall: $\nu_t(i) := \sum_{n=1}^{t} \mathbb{1}_{\mathcal{X}_i}(X_n)$
      Using update (1), and starting from $Z_0 = 0$:
      $Z_t = \log\theta_t(1) - \log\theta_t(2)$
      $\quad = \big(\nu_t(1)\,\gamma(1 - \phi_1) - (t - \nu_t(1))\,\gamma\phi_1\big) - \big(\nu_t(2)\,\gamma(1 - \phi_2) - (t - \nu_t(2))\,\gamma\phi_2\big)$
      $\quad = \nu_t(1)\,(2\gamma) - t\,(2\gamma\phi_1)$
      using $\nu_t(1) + \nu_t(2) = t$ and $\phi_1 + \phi_2 = 1$.
      Hence if $\frac{Z_t}{t} \xrightarrow[t \to \infty]{L^1} 0$ then $\frac{\nu_t(1)}{t} \xrightarrow[t \to \infty]{L^1} \phi_1$.
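The closed form for $Z_t$ is purely algebraic, so it can be checked numerically on any sequence of bin visits. The sketch below applies update (1) with constant $\gamma$ to an arbitrary random sequence of bins (not an MH chain, which is an assumption made only to keep the example short) and compares the result with $2\gamma(\nu_t(1) - t\phi_1)$. The function name and default values are hypothetical.

```python
import math
import random

def check_identity(T=1000, gamma=0.3, phi1=0.75, seed=1):
    """Verify Z_t = 2*gamma*(nu_t(1) - t*phi_1) along an arbitrary bin sequence."""
    rng = random.Random(seed)
    phi = (phi1, 1.0 - phi1)
    log_theta = [0.0, 0.0]        # equal starting penalties, so Z_0 = 0
    nu1 = 0                       # counter nu_t(1) for the first bin
    z = 0.0
    for t in range(1, T + 1):
        k = rng.randrange(2)      # bin visited at time t (arbitrary sequence)
        for i in range(2):
            # Update (1) with constant gamma.
            log_theta[i] += gamma * ((1.0 if i == k else 0.0) - phi[i])
        nu1 += (k == 0)
        z = log_theta[0] - log_theta[1]
        closed_form = 2.0 * gamma * (nu1 - t * phi1)
        assert math.isclose(z, closed_form, abs_tol=1e-8)
    return z, nu1
```

Dividing the identity by $t$ gives $Z_t / t = 2\gamma\,(\nu_t(1)/t - \phi_1)$, which is why convergence of $Z_t / t$ to 0 is equivalent to convergence of the frequency to $\phi_1$.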

  18. Proof
      Figure ($Z_t$ against time): We prove that $Z_t$ returns below a given horizontal bar whenever it goes above it, and it does so in finite time. It then implies $Z_t / t \to 0$.

  19. Parallel Interacting Chains
      A parallel version of the algorithm runs $N$ chains in parallel (see e.g. F. Liang, JSP 2006).
      Target the same distribution: each new value $X_t^{(k)}$ is drawn from an MH kernel $K_{\theta_{t-1}}(X_{t-1}^{(k)}, \cdot)$ using the same penalties $(\theta_t)$.
      Interaction between chains: to update $\theta_t$, use the average $\frac{1}{N} \sum_{k=1}^{N} \mathbb{1}_{\mathcal{X}_i}(X_t^{(k)})$ instead of $\mathbb{1}_{\mathcal{X}_i}(X_t)$.
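The interaction step can be illustrated with a short function that replaces the single-chain indicator by the proportion of chains in each bin. This is a sketch only: it assumes the bin index of each of the $N$ chains has already been computed, and the function and argument names are hypothetical.

```python
def parallel_penalty_update(log_theta, chain_bins, phi, gamma):
    """Return updated log penalties using the average indicator over N chains.

    chain_bins[k] is the index of the bin containing chain k's current state.
    """
    n_chains = len(chain_bins)
    d = len(phi)
    freq = [0.0] * d
    for k in chain_bins:
        freq[k] += 1.0 / n_chains   # (1/N) * sum_k 1{X_t^(k) in X_i}
    return [lt + gamma * (f - p) for lt, f, p in zip(log_theta, freq, phi)]
```

With a single chain (`chain_bins` of length 1) this reduces to the single-chain update. When the chains already occupy the bins in the desired proportions, for example three chains in the first bin and one in the second with $\phi = (0.75, 0.25)$, the penalties are left unchanged.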

  20. Parallel Interacting Chains
      Reaching Flat Histogram
      Figure: number of FH criteria reached (#FH) against iterations, for $N = 1$, $N = 10$ and $N = 100$.

  21. Parallel Interacting Chains
      Stabilization of the log penalties
      Figure (value against iterations): $\log\theta_t$ against $t$, for $N = 1$.

  22. Parallel Interacting Chains
      Stabilization of the log penalties
      Figure (value against iterations): $\log\theta_t$ against $t$, for $N = 10$.
