Supercanonical convergence rates in quasi-Monte Carlo simulation of Markov chains - PowerPoint PPT Presentation

  1. Supercanonical convergence rates in quasi-Monte Carlo simulation of Markov chains. Pierre L'Ecuyer. Joint work with Christian Lécot, David Munger, and Bruno Tuffin.

  2. Markov chain setting. A Markov chain with state space $\mathcal{X}$ evolves as $X_0 = x_0$, $X_j = \varphi_j(X_{j-1}, U_j)$, $j \ge 1$, where the $U_j$ are i.i.d. uniform random variables over $(0,1)^d$. Payoff (or cost) function: $Y = \sum_{j=1}^{\tau} g_j(X_j)$ for some fixed time horizon $\tau$. We may want to estimate $\mu = \mathbb{E}[Y]$, some other functional of $Y$, or perhaps the entire distribution of $Y$.
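A minimal Python sketch of this recurrence (the transition `phi` and cost `g` below are hypothetical placeholders, not from the slides):

```python
import numpy as np

def simulate_chain(phi, g, x0, tau, d, rng):
    """One trajectory X_j = phi(j, X_{j-1}, U_j); returns Y = sum_j g(j, X_j)."""
    x, y = x0, 0.0
    for j in range(1, tau + 1):
        u = rng.random(d)      # U_j ~ uniform over (0,1)^d
        x = phi(j, x, u)       # one step of the chain
        y += g(j, x)           # accumulate the cost/payoff
    return y

# Toy example (hypothetical): a random walk with per-step cost max(X_j, 0).
rng = np.random.default_rng(1)
phi = lambda j, x, u: x + u[0] - 0.5
g = lambda j, x: max(x, 0.0)
print(simulate_chain(phi, g, x0=0.0, tau=10, d=1, rng=rng))
```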

  3. Ordinary Monte Carlo simulation. For $i = 0, \dots, n-1$, generate $X_{i,j} = \varphi_j(X_{i,j-1}, U_{i,j})$, $j = 1, \dots, \tau$, where the $U_{i,j}$ are i.i.d. $U(0,1)^d$. Estimate $\mu$ by $\hat\mu_n = \frac{1}{n} \sum_{i=0}^{n-1} \sum_{j=1}^{\tau} g_j(X_{i,j}) = \frac{1}{n} \sum_{i=0}^{n-1} Y_i$. Then $\mathbb{E}[\hat\mu_n] = \mu$ and $\mathrm{Var}[\hat\mu_n] = \frac{1}{n}\mathrm{Var}[Y_i] = O(n^{-1})$. The width of a confidence interval on $\mu$ converges as $O(n^{-1/2})$; that is, for each additional digit of accuracy, one must multiply $n$ by 100. One can also estimate the distribution (density) of $Y$ by the empirical distribution of $Y_0, \dots, Y_{n-1}$, by a histogram (perhaps smoothed), or by a kernel density estimator. The mean integrated square error (MISE) for the density typically converges as $O(n^{-2/3})$ for a histogram and $O(n^{-4/5})$ for the best density estimators. Can we do better than those rates?
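A crude-MC sketch with a 95% confidence interval whose half-width shrinks as $O(n^{-1/2})$ (the toy chain is a hypothetical stand-in for $\varphi_j$ and $g_j$):

```python
import numpy as np

rng = np.random.default_rng(2)

def one_payoff(tau=10):
    # Toy chain (hypothetical): random walk, cost max(X_j, 0) summed over steps.
    x, y = 0.0, 0.0
    for j in range(tau):
        x += rng.random() - 0.5
        y += max(x, 0.0)
    return y

n = 10**5
ys = np.array([one_payoff() for _ in range(n)])
mu_hat = ys.mean()
half_width = 1.96 * ys.std(ddof=1) / np.sqrt(n)  # CI half-width, O(n^{-1/2})
print(f"mu_hat = {mu_hat:.4f} +/- {half_width:.4f}")
```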

  4. Plenty of applications fit this setting: finance; queueing systems; inventory, distribution, and logistics systems; reliability models; MCMC in Bayesian statistics; and many more.

  5. Example: an Asian call option (two-dimensional state). Given observation times $t_1, t_2, \dots, t_\tau$, $s_0 > 0$, and $X(t_0) = 0$, let $X(t_j) = X(t_{j-1}) + (r - \sigma^2/2)(t_j - t_{j-1}) + \sigma (t_j - t_{j-1})^{1/2} Z_j$ and $S(t_j) = s_0 \exp[X(t_j)]$ (geometric Brownian motion), where $U_j \sim U[0,1)$ and $Z_j = \Phi^{-1}(U_j) \sim N(0,1)$. Running average: $\bar S_j = \frac{1}{j} \sum_{i=1}^{j} S(t_i)$. The payoff at step $j = \tau$ is $Y = g_\tau(X_\tau) = \max(0, \bar S_\tau - K)$. MC state: $X_j = (S(t_j), \bar S_j)$. Transition: $X_j = (S(t_j), \bar S_j) = \varphi_j(S(t_{j-1}), \bar S_{j-1}, U_j) = \left( S(t_j),\ \frac{(j-1)\bar S_{j-1} + S(t_j)}{j} \right)$. We want to estimate $\mathbb{E}[Y]$, the distribution of $Y$, etc.
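A sketch of this chain under the slide's GBM dynamics (the function name and argument layout are mine, not from the slides):

```python
import numpy as np
from scipy.stats import norm

def asian_payoff(u, s0, K, r, sigma, times):
    """One run: u holds tau uniforms, times = [t_0, ..., t_tau] with t_0 = 0."""
    x, s_bar = 0.0, 0.0
    t_prev = times[0]
    for j, t in enumerate(times[1:], start=1):
        z = norm.ppf(u[j - 1])          # Z_j = Phi^{-1}(U_j) ~ N(0,1)
        x += (r - sigma**2 / 2) * (t - t_prev) + sigma * np.sqrt(t - t_prev) * z
        s = s0 * np.exp(x)              # S(t_j)
        s_bar += (s - s_bar) / j        # running average S_bar_j
        t_prev = t
    return max(0.0, s_bar - K)          # Y = max(0, S_bar_tau - K)
```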

  6. Take $\tau = 12$, $T = 1$ (one year), $t_j = j/\tau$ for $j = 0, \dots, \tau$, $K = 100$, $s_0 = 100$, $r = 0.05$, $\sigma = 0.5$. We make $n = 10^6$ independent runs. Mean: 13.1; max: 390.8. In 53.47% of the cases, the payoff is 0. [Figure: histogram of the positive payoff values; average = 13.1.] A confidence interval on $\mathbb{E}[Y]$ converges as $O(n^{-1/2})$. Can we do better?
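This experiment can be reproduced (up to noise) with the `asian_payoff` sketch above in scope:

```python
import numpy as np

tau, K, s0, r, sigma = 12, 100.0, 100.0, 0.05, 0.5
times = [j / tau for j in range(tau + 1)]
rng = np.random.default_rng(3)

n = 10**6  # slow as a pure-Python loop; a vectorized version would be faster
ys = np.array([asian_payoff(rng.random(tau), s0, K, r, sigma, times)
               for _ in range(n)])
# Slide reports mean 13.1, max 390.8, and 53.47% zero payoffs.
print(ys.mean(), ys.max(), (ys == 0).mean())
```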

  7. Another histogram, with $n = 4096$ runs. [Figure: histogram of the payoffs from 0 to 150.] For a histogram: $\mathrm{MISE} = O(n^{-2/3})$. For polygonal interpolation: $\mathrm{MISE} = O(n^{-4/5})$; same with a KDE. Can we do better?
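A sketch of the two density estimators being compared, assuming `ys` holds the payoffs simulated in the previous snippet:

```python
import numpy as np
from scipy.stats import gaussian_kde

runs = ys[:4096]                        # n = 4096 runs, as on the slide
pos = runs[runs > 0]                    # positive payoffs only
hist, edges = np.histogram(pos, bins=30, density=True)  # MISE ~ O(n^{-2/3})
kde = gaussian_kde(pos)                 # kernel density estimator
grid = np.linspace(0.0, 150.0, 300)
density = kde(grid)                     # smooth estimate; MISE ~ O(n^{-4/5})
```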

  8. Randomized quasi-Monte Carlo (RQMC). To estimate $\mu = \int_{(0,1)^s} f(\mathbf{u})\, d\mathbf{u}$, the RQMC estimator is $\hat\mu_{n,\mathrm{rqmc}} = \frac{1}{n} \sum_{i=0}^{n-1} f(\mathbf{U}_i)$, with $P_n = \{\mathbf{U}_0, \dots, \mathbf{U}_{n-1}\} \subset (0,1)^s$ an RQMC point set: (i) each point $\mathbf{U}_i$ has the uniform distribution over $(0,1)^s$; (ii) $P_n$ as a whole is a low-discrepancy point set. Then $\mathbb{E}[\hat\mu_{n,\mathrm{rqmc}}] = \mu$ (unbiased) and $\mathrm{Var}[\hat\mu_{n,\mathrm{rqmc}}] = \frac{\mathrm{Var}[f(\mathbf{U}_i)]}{n} + \frac{2}{n^2} \sum_{i<j} \mathrm{Cov}[f(\mathbf{U}_i), f(\mathbf{U}_j)]$. We want to make the last sum as negative as possible. Weak attempts: antithetic variates ($n = 2$), Latin hypercube sampling, ...
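A sketch of an RQMC estimate using SciPy's scrambled Sobol' points, one common choice of randomized low-discrepancy set (the integrand `f` is a hypothetical stand-in with known mean 1):

```python
import numpy as np
from scipy.stats import qmc

s = 12
f = lambda u: np.prod(1 + (u - 0.5) / 3, axis=1)  # hypothetical integrand, mu = 1

sobol = qmc.Sobol(d=s, scramble=True, seed=4)     # scrambling = the randomization
points = sobol.random(2**14)                      # n = 2^14 RQMC points in (0,1)^s
mu_hat = f(points).mean()                         # the RQMC estimator
print(mu_hat)
```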

  9. Variance estimation: compute $m$ independent realizations $X_1, \dots, X_m$ of $\hat\mu_{n,\mathrm{rqmc}}$, then estimate $\mu$ and $\mathrm{Var}[\hat\mu_{n,\mathrm{rqmc}}]$ by their sample mean $\bar X_m$ and sample variance $S_m^2$. These can be used to compute a confidence interval. Temptation: assume that $\bar X_m$ has the normal distribution. Beware: this is usually wrong unless $m \to \infty$.
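A sketch of this replication scheme, reusing the hypothetical integrand above:

```python
import numpy as np
from scipy.stats import qmc

s, n, m = 12, 2**12, 30
f = lambda u: np.prod(1 + (u - 0.5) / 3, axis=1)  # hypothetical integrand, mu = 1

# m independent randomizations of the same point set, one estimate each.
reps = np.array([f(qmc.Sobol(d=s, scramble=True, seed=100 + k).random(n)).mean()
                 for k in range(m)])
x_bar, s2 = reps.mean(), reps.var(ddof=1)         # sample mean and variance
print(x_bar, s2 / m)                              # estimates of mu and Var[x_bar]
```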

  10. Stratification of the unit hypercube. Partition axis $j$ into $k_j \ge 1$ equal parts, for $j = 1, \dots, s$. Draw $n = k_1 \cdots k_s$ random points, one per box, independently. Example: $s = 2$, $k_1 = 12$, $k_2 = 8$, $n = 12 \times 8 = 96$. [Figure: one uniform point per box of the $12 \times 8$ grid over $(0,1)^2$.]

  11. Stratified estimator: $X_{s,n} = \frac{1}{n} \sum_{j=0}^{n-1} f(\mathbf{U}_j)$. The crude MC variance with $n$ points can be decomposed as $\mathrm{Var}[\bar X_n] = \mathrm{Var}[X_{s,n}] + \frac{1}{n} \sum_{j=0}^{n-1} (\mu_j - \mu)^2$, where $\mu_j$ is the mean over box $j$. The more the $\mu_j$ differ, the more the variance is reduced. If $f'$ is continuous and bounded, and all the $k_j$ are equal, then $\mathrm{Var}[X_{s,n}] = O(n^{-1-2/s})$. For large $s$, this is not practical; for small $s$, it is not really better than the midpoint rule with a grid when $f$ is smooth, but it can still be applied to a few important random variables. It gives an unbiased estimator, and the variance can be estimated by replicating $m \ge 2$ times.
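A sketch of this stratified sampler for $s = 2$, using the slide's $12 \times 8$ example (the integrand is a hypothetical stand-in):

```python
import numpy as np

def stratified_points(k1, k2, rng):
    """One uniform point per box of a k1 x k2 partition of (0,1)^2."""
    i, j = np.meshgrid(np.arange(k1), np.arange(k2), indexing="ij")
    u1 = (i.ravel() + rng.random(k1 * k2)) / k1   # box offset + jitter, axis 1
    u2 = (j.ravel() + rng.random(k1 * k2)) / k2   # box offset + jitter, axis 2
    return np.column_stack([u1, u2])

rng = np.random.default_rng(5)
f = lambda u: np.sin(np.pi * u[:, 0]) * u[:, 1]   # hypothetical smooth integrand
pts = stratified_points(12, 8, rng)               # n = 96 points, one per box
print(f(pts).mean())                              # X_{s,n}; Var = O(n^{-1-2/s})
```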

  12. Randomly-shifted lattice. Example: a lattice with $s = 2$, $n = 101$, $\mathbf{v}_1 = (1, 12)/101$. [Figure: the lattice points $(u_{i,1}, u_{i,2})$ in $(0,1)^2$, before and after the common random shift $\mathbf{U}$, modulo 1.]
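A sketch of this construction, generating the slide's $n = 101$, $\mathbf{v}_1 = (1, 12)/101$ lattice with one common random shift:

```python
import numpy as np

def shifted_lattice(n, v1, rng):
    """Rank-1 lattice {i * v1 / n mod 1} with one common random shift U mod 1."""
    i = np.arange(n).reshape(-1, 1)
    pts = (i * np.asarray(v1) / n) % 1.0   # the deterministic lattice points
    shift = rng.random(len(v1))            # U ~ uniform over (0,1)^s
    return (pts + shift) % 1.0             # each shifted point is still uniform

rng = np.random.default_rng(6)
pts = shifted_lattice(101, [1, 12], rng)   # the slide's example lattice
```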

  13. Example of a digital net in base 2: the Hammersley point set, $n = 2^8 = 256$, $s = 2$. [Figure: the points $(u_{i,1}, u_{i,2})$ in $(0,1)^2$.]
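A sketch of the two-dimensional Hammersley construction in base 2: the first coordinate is $i/n$ and the second is the base-2 radical inverse of $i$.

```python
import numpy as np

def van_der_corput(n, base=2):
    """Radical-inverse (van der Corput) sequence in the given base."""
    out = np.zeros(n)
    for i in range(n):
        k, f, x = i, 1.0 / base, 0.0
        while k > 0:
            x += (k % base) * f   # reflect the digits of i about the radix point
            k //= base
            f /= base
        out[i] = x
    return out

n = 2**8                                   # 256 points, as on the slide
pts = np.column_stack([np.arange(n) / n,   # first coordinate: i/n
                       van_der_corput(n)]) # second: radical inverse of i
```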
