Array-RQMC for Markov Chains with Random Stopping Times

Pierre L'Ecuyer, Maxime Dion, Adam L'Archevêque-Gaudet
Informatique et Recherche Opérationnelle, Université de Montréal

1. Markov chain setting, Monte Carlo, classical RQMC.
2. Array-RQMC: preserving the low discrepancy of the chain's states.
3. Least-squares Monte Carlo for optimal stopping times.
4. Examples.
Monte Carlo for Markov Chains

Setting: A Markov chain with state space $\mathcal{X} \subseteq \mathbb{R}^\ell$ evolves as
$$X_0 = x_0, \qquad X_j = \varphi_j(X_{j-1}, U_j), \quad j \ge 1,$$
where the $U_j$ are i.i.d. uniform random variables over $(0,1)^d$.

We want to estimate
$$\mu = \mathbb{E}[Y] \qquad \text{where} \qquad Y = \sum_{j=1}^{\tau} g_j(X_j)$$
and $\tau$ is a stopping time with respect to the filtration generated by $\{(j, X_j),\ j \ge 0\}$.

Ordinary MC: For $i = 0, \dots, n-1$, generate $X_{i,j} = \varphi_j(X_{i,j-1}, U_{i,j})$ for $j = 1, \dots, \tau_i$, where the $U_{i,j}$ are i.i.d. $U(0,1)^d$. Estimate $\mu$ by
$$\hat{\mu}_n = \frac{1}{n} \sum_{i=0}^{n-1} \sum_{j=1}^{\tau_i} g_j(X_{i,j}) = \frac{1}{n} \sum_{i=0}^{n-1} Y_i.$$
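To make the setting concrete, here is a minimal crude-MC sketch in Python. The transition `phi`, the cost `g`, and the stopping rule `stopped` are toy placeholders (with $d = 1$), not from the talk.

```python
import numpy as np

rng = np.random.default_rng(1234)

def phi(j, x, u):
    # Toy transition phi_j(x, u): small perturbation, wrapped into [0, 1).
    return (x + 0.1 * (u - 0.5)) % 1.0

def g(j, x):
    # Toy per-step cost g_j(x).
    return x

def stopped(j, x):
    # Toy stopping rule defining tau: stop once the state exceeds 0.9,
    # or after 100 steps.
    return x > 0.9 or j >= 100

def mc_estimate(n, x0=0.5):
    # Crude MC: n independent chains, each run until its stopping time tau_i.
    total = 0.0
    for _ in range(n):
        x, j, y = x0, 0, 0.0
        while True:
            j += 1
            x = phi(j, x, rng.random())  # one i.i.d. uniform per step (d = 1)
            y += g(j, x)
            if stopped(j, x):
                break
        total += y                       # accumulate Y_i
    return total / n

print(mc_estimate(10_000))
```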
Classical RQMC for Markov Chains

Put $V_i = (U_{i,1}, U_{i,2}, \dots)$. Estimate $\mu$ by
$$\hat{\mu}_{\mathrm{rqmc},n} = \frac{1}{n} \sum_{i=0}^{n-1} \sum_{j=1}^{\tau_i} g_j(X_{i,j}),$$
where $P_n = \{V_0, \dots, V_{n-1}\} \subset (0,1)^s$ has the following properties:
(a) each point $V_i$ has the uniform distribution over $(0,1)^s$;
(b) $P_n$ has low discrepancy.

The dimension is $s = \inf\{s' : P[d\tau \le s'] = 1\}$. For a Markov chain, this dimension $s$ is often very large!
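As an illustration of why $s$ blows up, here is a sketch of classical RQMC with a scrambled Sobol' net from SciPy, reusing the toy `phi`, `g`, and `stopped` above. It assumes $\tau \le$ `max_steps` so that $s$ is finite; this particular point set is my choice for the sketch, not necessarily the one used in the talk.

```python
from scipy.stats import qmc

def classical_rqmc_estimate(m=13, max_steps=100, x0=0.5):
    # One scrambled-Sobol' point V_i of dimension s = d * max_steps
    # (d = 1 here) drives an entire trajectory.
    sob = qmc.Sobol(d=max_steps, scramble=True, seed=1234)
    points = sob.random_base2(m)   # n = 2^m randomized points in (0,1)^s
    total = 0.0
    for v in points:
        x, j, y = x0, 0, 0.0
        while True:
            u = v[j]               # coordinate j of V_i feeds step j + 1
            j += 1
            x = phi(j, x, u)
            y += g(j, x)
            if stopped(j, x):
                break
        total += y
    return total / len(points)
```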
Array-RQMC for Markov Chains [Lécot, Tuffin, L'Ecuyer 2004, 2008]

Simulate $n$ chains in parallel. At each step, use an RQMC point set $P_n$ to advance all the chains by one step, while inducing global negative dependence across the chains.

Intuition: The empirical distribution of $S_{n,j} = \{X_{0,j}, \dots, X_{n-1,j}\}$ should approximate the theoretical distribution of $X_j$, for each $j$, more accurately than with crude Monte Carlo. The discrepancy between these two distributions should be as small as possible.

Then we will have small variance for the (unbiased) estimators
$$\mu_j = \mathbb{E}[g_j(X_j)] \approx \frac{1}{n} \sum_{i=0}^{n-1} g_j(X_{i,j}) \qquad \text{and} \qquad \mu = \mathbb{E}[Y] \approx \frac{1}{n} \sum_{i=0}^{n-1} Y_i.$$

How can we preserve the low discrepancy of $X_{0,j}, \dots, X_{n-1,j}$ as $j$ increases? Can we quantify the variance improvement?
To simplify, suppose each $X_j$ is a uniform random variable over $(0,1)^\ell$. Select a discrepancy measure $D$ for the point set $S_{n,j} = \{X_{0,j}, \dots, X_{n-1,j}\}$ over $(0,1)^\ell$, and a corresponding measure of variation $V$, such that
$$\mathrm{Var}[\hat{\mu}_{\mathrm{rqmc},j,n}] = \mathbb{E}[(\hat{\mu}_{\mathrm{rqmc},j,n} - \mu_j)^2] \le \mathbb{E}[D^2(S_{n,j})]\, V^2(g_j).$$

If $D$ is defined via a reproducing kernel Hilbert space, then, for some random $\xi_j$ (which generally depends on $S_{n,j}$),
$$\mathbb{E}[D^2(S_{n,j})] = \mathrm{Var}\left[\frac{1}{n} \sum_{i=0}^{n-1} \xi_j(X_{i,j})\right] = \mathrm{Var}\left[\frac{1}{n} \sum_{i=0}^{n-1} (\xi_j \circ \varphi_j)(X_{i,j-1}, U_{i,j})\right] \le \mathbb{E}[D_{(2)}^2(Q_n)]\, V_{(2)}^2(\xi_j \circ \varphi_j)$$
for some other discrepancy $D_{(2)}$ over $(0,1)^{\ell+d}$, where
$$Q_n = \{(X_{0,j-1}, U_{0,j}), \dots, (X_{n-1,j-1}, U_{n-1,j})\}.$$

Heuristic: Under appropriate conditions, we should have $V_{(2)}(\xi_j \circ \varphi_j) < \infty$ and $\mathbb{E}[D_{(2)}^2(Q_n)] = O(n^{-\alpha + \epsilon})$ for some $\alpha \ge 1$.
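For background (standard RKHS/QMC material, not specific to this talk): when $D$ is induced by a reproducing kernel Hilbert space with kernel $k$, its square has the closed form
$$D^2(S_{n,j}) = \frac{1}{n^2} \sum_{i=0}^{n-1} \sum_{i'=0}^{n-1} k(X_{i,j}, X_{i',j}) - \frac{2}{n} \sum_{i=0}^{n-1} \int_{(0,1)^\ell} k(X_{i,j}, y)\, dy + \int_{(0,1)^\ell} \int_{(0,1)^\ell} k(x, y)\, dx\, dy,$$
and the corresponding variation $V(g)$ is the norm of $g$ in that space, which is what yields the Koksma-Hlawka-type bound above.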
In the points $(X_{i,j-1}, U_{i,j})$ of $Q_n$, the $U_{i,j}$ can be defined via some RQMC scheme, but the $X_{i,j-1}$ cannot be chosen; they are determined by the history of the chains.

The idea is to select a low-discrepancy point set
$$\tilde{Q}_n = \{(w_0, U_0), \dots, (w_{n-1}, U_{n-1})\},$$
where the $w_i \in [0,1)^\ell$ are fixed and the $U_i \in (0,1)^d$ are randomized, and then to define a bijection between the states $X_{i,j-1}$ and the $w_i$ so that the $X_{i,j-1}$ are "close" to the $w_i$ (small discrepancy between the two sets). The bijection is defined by a permutation $\pi_j$ of $S_{n,j}$.

For a general state space in $\mathbb{R}^\ell$, the algorithm is essentially the same.
Array-RQMC algorithm

$X_{i,0} \leftarrow x_0$, for $i = 0, \dots, n-1$;
for $j = 1, 2, \dots, \max_i \tau_i$ do
    Randomize afresh $\{U_{0,j}, \dots, U_{n-1,j}\}$ in $\tilde{Q}_n$;
    $X_{i,j} = \varphi_j(X_{\pi_j(i),j-1}, U_{i,j})$, for $i = 0, \dots, n-1$;
    Compute the permutation $\pi_{j+1}$ (sort the states);
end for
Estimate $\mu$ by the average $\bar{Y}_n = \hat{\mu}_{\mathrm{rqmc},n}$.

Theorem: The average $\bar{Y}_n$ is an unbiased estimator of $\mu$. We can estimate $\mathrm{Var}[\bar{Y}_n]$ by the empirical variance of $m$ independent realizations.
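A minimal Python sketch of this loop for a one-dimensional state ($\ell = d = 1$), so that the sort at step $j$ is a plain sort of the states. It reuses the toy `phi`, `g`, and `stopped` above; the fixed coordinates $w_i = (i + 1/2)/n$ are implicit and the $U_{i,j}$ are freshly stratified at each step, which is one simple valid randomization, not necessarily the one used in the talk.

```python
import numpy as np

rng = np.random.default_rng(2024)

def array_rqmc_estimate(n=4096, x0=0.5, max_steps=100):
    states = np.full(n, x0)
    y = np.zeros(n)                 # accumulated costs Y_i
    alive = np.ones(n, dtype=bool)  # chains whose tau_i is not yet reached
    for j in range(1, max_steps + 1):
        idx = np.flatnonzero(alive)
        if idx.size == 0:
            break
        # Permutation pi_j: the i-th smallest state gets the i-th point.
        order = idx[np.argsort(states[idx])]
        # Fresh randomization at each step: one stratified uniform per chain.
        u = (np.arange(idx.size) + rng.random(idx.size)) / idx.size
        for rank, i in enumerate(order):
            states[i] = phi(j, states[i], u[rank])
            y[i] += g(j, states[i])
            if stopped(j, states[i]):
                alive[i] = False
    return y.mean()                 # the average \bar{Y}_n

print(array_rqmc_estimate())
```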
Mapping chains to points

Multivariate sort: Sort the states (chains) by their first coordinate into $n_1$ packets of size $n/n_1$. Sort each packet by its second coordinate into $n_2$ packets of size $n/(n_1 n_2)$, and so on. At the last level, sort each packet of size $n_\ell$ by the last coordinate. How should $n_1, n_2, \dots, n_\ell$ be chosen?

Generalization: Define a sorting function $v : \mathcal{X} \to [0,1)^c$ and apply the multivariate sort (in $c$ dimensions) to the transformed points $v(X_{i,j})$. Choice of $v$: two states mapped to nearby values of $v$ should be approximately equivalent.
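A sketch of this multivariate batch sort in Python, for states in $\mathbb{R}^\ell$ stored as an $(n, \ell)$ array. The packet counts $n_1, \dots, n_\ell$ are assumed to multiply to $n$; the function name and interface are my own, for illustration.

```python
import numpy as np

def batch_sort(states, packet_counts):
    # Multivariate batch sort: sort by coordinate 0 into packet_counts[0]
    # packets, then each packet by coordinate 1, and so on.  Returns the
    # resulting ordering of chain indices (the permutation pi_j).
    def rec(indices, coord):
        if coord == len(packet_counts):
            return indices
        # Sort this packet by the current coordinate.
        indices = indices[np.argsort(states[indices, coord], kind="stable")]
        # Split into packet_counts[coord] sub-packets and recurse on each.
        packets = np.array_split(indices, packet_counts[coord])
        return np.concatenate([rec(p, coord + 1) for p in packets])
    return rec(np.arange(len(states)), 0)

# Example: 16 chains with 2-D states and a (4, 4) sort, as in the figures.
pts = np.random.default_rng(7).random((16, 2))
print(batch_sort(pts, [4, 4]))
```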
[Figure: A (4,4) mapping: a Sobol' net in 2 dimensions with a digital shift (left) and the states of the chains (right).]

[Figure: The (4,4) mapping.]

[Figure: A (16,1) mapping, sorting along the first coordinate.]