The Array-RQMC method: Review of convergence results

Pierre L'Ecuyer, Christian Lécot, Bruno Tuffin
DIRO, Université de Montréal, Canada
LAMA, Université de Savoie, France
Inria–Rennes, France
Monte Carlo for Markov Chains

Setting: A Markov chain with state space $\mathcal{X} \subseteq \mathbb{R}^\ell$ evolves as
$$X_0 = x_0, \qquad X_j = \varphi_j(X_{j-1}, U_j), \quad j \ge 1,$$
where the $U_j$ are i.i.d. uniform r.v.'s over $(0,1)^d$.

Want to estimate
$$\mu = E[Y] \quad \text{where} \quad Y = \sum_{j=1}^{\tau} g_j(X_j)$$
for some fixed time horizon $\tau$.

Ordinary MC: For $i = 0, \ldots, n-1$, generate $X_{i,j} = \varphi_j(X_{i,j-1}, U_{i,j})$, $j = 1, \ldots, \tau$, where the $U_{i,j}$'s are i.i.d. $U(0,1)^d$. Estimate $\mu$ by
$$\hat\mu_n = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{\tau} g_j(X_{i,j}) = \frac{1}{n} \sum_{i=1}^{n} Y_i.$$
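As a minimal sketch (not from the slides), ordinary MC in this setting can be coded as follows in Python; `phi` and `g` are hypothetical stand-ins for the transition maps $\varphi_j$ and cost functions $g_j$.

```python
import numpy as np

def mc_estimate(phi, g, x0, tau, n, d, rng=None):
    """Plain Monte Carlo estimate of mu = E[sum_{j=1}^tau g(j, X_j)].

    phi(j, x, u): transition function, with u a vector in (0,1)^d.
    g(j, x): cost at step j.
    Both are placeholders for the slides' phi_j and g_j.
    """
    rng = rng or np.random.default_rng()
    total = 0.0
    for _ in range(n):                # n independent chains
        x, y = x0, 0.0
        for j in range(1, tau + 1):   # advance one chain for tau steps
            u = rng.random(d)         # U_{i,j} ~ U(0,1)^d
            x = phi(j, x, u)
            y += g(j, x)
        total += y                    # Y_i
    return total / n                  # hat(mu)_n
```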
Example: Asian Call Option

Given observation times $t_1, t_2, \ldots, t_\tau$, suppose
$$S(t_j) = S(t_{j-1}) \exp\!\left[(r - \sigma^2/2)(t_j - t_{j-1}) + \sigma (t_j - t_{j-1})^{1/2}\, \Phi^{-1}(U_j)\right],$$
where $U_j \sim U[0,1)$ and $S(t_0) = s_0$ is fixed.

Running average: $\bar S_j = \frac{1}{j} \sum_{i=1}^{j} S(t_i)$.

State: $X_j = (S(t_j), \bar S_j)$.

Transition:
$$X_j = (S(t_j), \bar S_j) = \varphi_j(S(t_{j-1}), \bar S_{j-1}, U_j) = \left( S(t_j),\ \frac{(j-1)\bar S_{j-1} + S(t_j)}{j} \right).$$

Payoff at step $j = \tau$ is $Y = g_\tau(X_\tau) = \max\left(0,\ \bar S_\tau - K\right)$.
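To make this concrete, here is a hedged Python sketch of one path of this chain and its payoff; the numerical parameter values are illustrative assumptions, not from the slides.

```python
import numpy as np
from scipy.stats import norm

def asian_payoff(u, s0=100.0, K=100.0, r=0.05, sigma=0.2, dt=1/52):
    """Simulate one path of the Asian-option chain from a vector u of
    tau uniforms and return the payoff max(0, bar(S)_tau - K).
    All parameter values are illustrative placeholders."""
    s, s_bar = s0, 0.0
    for j, uj in enumerate(u, start=1):
        # S(t_j) = S(t_{j-1}) exp[(r - sigma^2/2) dt + sigma sqrt(dt) Phi^{-1}(U_j)]
        s *= np.exp((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * norm.ppf(uj))
        s_bar += (s - s_bar) / j      # running average bar(S)_j
    return max(0.0, s_bar - K)

# Ordinary MC estimate with n paths of tau = 12 observation times:
rng = np.random.default_rng(42)
tau, n = 12, 10_000
est = np.mean([asian_payoff(rng.random(tau)) for _ in range(n)])
```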
Plenty of other applications:
- Finance
- Queueing systems
- Inventory, distribution, and logistics systems
- Reliability models
- MCMC in Bayesian statistics
- Etc.
Classical RQMC for Markov Chains

Put $V_i = (U_{i,1}, \ldots, U_{i,\tau}) \in (0,1)^s$, with $s = d\tau$. Estimate $\mu$ by
$$\hat\mu_{\mathrm{rqmc},n} = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{\tau} g_j(X_{i,j}),$$
where $P_n = \{V_0, \ldots, V_{n-1}\} \subset (0,1)^s$ satisfies:
(a) each point $V_i$ has the uniform distribution over $(0,1)^s$;
(b) $P_n$ covers $(0,1)^s$ very evenly (i.e., has low discrepancy).

The dimension $s$ is often very large!
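One standard way to build such a $P_n$ (our choice for illustration; the slides do not prescribe one) is a randomly scrambled Sobol' net, available in scipy.stats.qmc:

```python
import numpy as np
from scipy.stats import qmc

d, tau = 1, 12
s = d * tau                   # one RQMC point drives a whole path
n = 2**14                     # powers of 2 suit Sobol' nets best

sampler = qmc.Sobol(d=s, scramble=True)  # scrambling gives property (a)
P_n = sampler.random(n)                  # V_0, ..., V_{n-1} in (0,1)^s

# Each row V_i = (U_{i,1}, ..., U_{i,tau}) drives one chain, e.g. the
# Asian-option chain sketched above:
est = np.mean([asian_payoff(P_n[i]) for i in range(n)])
```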
Array-RQMC for Markov Chains

L., Lécot, Tuffin, et al. [2004, 2006, 2008, etc.]

Simulate an "array" of n chains in "parallel." At each step, use an RQMC point set $P_n$ to advance all the chains by one step, while inducing global negative dependence across the chains.

Goal: a small discrepancy (or "distance") between the empirical distribution of $S_{n,j} = \{X_{0,j}, \ldots, X_{n-1,j}\}$ and the theoretical distribution of $X_j$, for each $j$. If we succeed, these (unbiased) estimators will have small variance:
$$\mu_j = E[g_j(X_j)] \approx \frac{1}{n} \sum_{i=0}^{n-1} g_j(X_{i,j}) \quad \text{and} \quad \mu = E[Y] \approx \frac{1}{n} \sum_{i=0}^{n-1} Y_i.$$

How can we preserve the low discrepancy of $S_{n,j}$ as $j$ increases? Can we quantify the variance improvement? What is the convergence rate in $n$?
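As a purely illustrative diagnostic (not part of the method), when the theoretical distribution of $X_j$ is known one can monitor this distance at each step, e.g. with a Kolmogorov–Smirnov statistic for a one-dimensional state:

```python
import numpy as np
from scipy.stats import kstest

def state_distance(states_j, cdf="uniform"):
    """KS distance between the empirical distribution of the n states
    S_{n,j} at step j and a known target law (standard uniform here,
    an assumption chosen to keep the check self-contained)."""
    return kstest(np.asarray(states_j), cdf).statistic
```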
Some generalizations

- L., Lécot, and Tuffin [2008]: $\tau$ can be a random stopping time w.r.t. the filtration $\mathcal{F}\{(j, X_j),\ j \ge 0\}$.
- L., Demers, and Tuffin [2006, 2007]: combination with splitting techniques (multilevel and without levels), and with importance sampling and weight windows. Covers particle filters.
- L. and Sanvido [2010]: combination with coupling from the past, for exact sampling.
- Dion and L. [2010]: combination with approximate dynamic programming, for optimal stopping problems.
- Gerber and Chopin [2014]: sequential QMC (yesterday's talk).
Convergence results and applications

- L., Lécot, and Tuffin [2006, 2008]: special cases: convergence at the MC rate, one-dimensional case, stratification, etc.
- Lécot and Tuffin [2004]: deterministic, one-dimensional, discrete state space.
- El Haddad, Lécot, L. [2008, 2010]: deterministic, multidimensional.
- Fakhererredine, El Haddad, Lécot [2012, 2013, 2014]: LHS, stratification, Sudoku sampling, ...
- Wächter and Keller [2008]: applications in computer graphics.
Other QMC methods for Markov chains

Interested in the steady-state distribution. Introduce dependence between the steps $j$, so that a single chain visits the state space very uniformly.

- Owen, Tribble, Chen, Dick, Matsumoto, Nishimura, ... [2004–2010]: Markov chain quasi-Monte Carlo.
- Propp [2012] and earlier: rotor-router sampling.
To simplify, suppose each $X_j$ is a uniform r.v. over $(0,1)^\ell$. Select a discrepancy measure $D$ for the point set $S_{n,j} = \{X_{0,j}, \ldots, X_{n-1,j}\}$ over $(0,1)^\ell$, and a corresponding measure of variation $V$, such that
$$\mathrm{Var}[\hat\mu_{\mathrm{rqmc},j,n}] = E[(\hat\mu_{\mathrm{rqmc},j,n} - \mu_j)^2] \le E[D^2(S_{n,j})]\, V^2(g_j).$$

If $D$ is defined via a reproducing kernel Hilbert space then, for some random $\xi_j$ (which generally depends on $S_{n,j}$),
$$E[D^2(S_{n,j})] = \mathrm{Var}\!\left[\frac{1}{n}\sum_{i=1}^{n} \xi_j(X_{i,j})\right] = \mathrm{Var}\!\left[\frac{1}{n}\sum_{i=1}^{n} (\xi_j \circ \varphi_j)(X_{i,j-1}, U_{i,j})\right] \le E[D_{(2)}^2(Q_n)] \cdot V_{(2)}^2(\xi_j \circ \varphi_j)$$
for some other discrepancy $D_{(2)}$ over $(0,1)^{\ell+d}$, where $Q_n = \{(X_{0,j-1}, U_{0,j}), \ldots, (X_{n-1,j-1}, U_{n-1,j})\}$.

Goal: under appropriate conditions, obtain $V_{(2)}(\xi_j \circ \varphi_j) < \infty$ and $E[D_{(2)}^2(Q_n)] = O(n^{-\alpha+\epsilon})$ for some $\alpha \ge 1$.
Discrepancy bounds by induction?

Let $\ell = d = 1$, $\mathcal{X} = [0,1]$, and $X_j \sim U(0,1)$.

$L_2$-star discrepancy:
$$D^2(x_0, \ldots, x_{n-1}) = \frac{1}{12 n^2} + \frac{1}{n} \sum_{i=0}^{n-1} (w_i - x_i)^2,$$
where $w_i = (i + 1/2)/n$ and $0 \le x_0 \le x_1 \le \cdots \le x_{n-1}$.

We have
$$\xi_j(x) = -\frac{1}{n} \sum_{i=1}^{n-1} \left[ \mu(Y_i) + B_2((x - Y_i) \bmod 1) + B_1(x) B_1(Y_i) \right],$$
where $B_1(x) = x - 1/2$ and $B_2(x) = x^2 - x + 1/6$.

Problem: the two-dimensional function $\xi_j \circ \varphi_j$ has a mixed derivative that is not square-integrable, so it appears to have infinite variation. Otherwise, we would have a proof that $E[D^2(S_{n,j})] = O(n^{-2})$. Help!
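The $D^2$ formula above is easy to transcribe; this small Python helper (ours, for illustration) sorts the points itself, matching the assumption $x_0 \le \cdots \le x_{n-1}$:

```python
import numpy as np

def l2_star_discrepancy_sq(x):
    """Squared L2-star discrepancy of one-dimensional points, per the
    formula above: 1/(12 n^2) + (1/n) sum_i (w_i - x_(i))^2, with
    w_i = (i + 1/2)/n and x_(0) <= ... <= x_(n-1) the sorted points."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    w = (np.arange(n) + 0.5) / n
    return 1.0 / (12.0 * n**2) + np.mean((w - x)**2)
```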
In the points $(X_{i,j-1}, U_{i,j})$ of $Q_n$, the $U_{i,j}$ can be defined via some RQMC scheme, but the $X_{i,j-1}$ cannot be chosen: they are determined by the history of the chains.

The idea is to select a low-discrepancy point set
$$\tilde Q_n = \{(w_0, U_0), \ldots, (w_{n-1}, U_{n-1})\},$$
where the $w_i \in [0,1)^\ell$ are fixed and the $U_i \in (0,1)^d$ are randomized, and then define a bijection between the states $X_{i,j-1}$ and the $w_i$ so that the $X_{i,j-1}$ are "close" to the $w_i$ (small discrepancy between the two sets).

Example: if $\ell = 1$, we can take $w_i = (i + 0.5)/n$. The bijection is then defined by a permutation $\pi_j$ of $S_{n,j}$ (see the sketch below).

For a state space in $\mathbb{R}^\ell$: essentially the same algorithm.
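For $\ell = 1$ the bijection is simply a sort; a minimal sketch (the multidimensional case would need a multivariate sort, which we do not show):

```python
import numpy as np

def match_states_to_anchors(states):
    """Permutation pi_j matching the states to w_i = (i + 0.5)/n:
    pi[i] is the index of the state assigned to anchor w_i, i.e. the
    state of rank i.  One-dimensional illustration only."""
    return np.argsort(states)
```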
Array-RQMC algorithm

$X_{i,0} \leftarrow x_0$, for $i = 0, \ldots, n-1$;
for $j = 1, 2, \ldots, \tau$ do
  randomize afresh $\{U_{0,j}, \ldots, U_{n-1,j}\}$ in $\tilde Q_n$;
  $X_{i,j} = \varphi_j(X_{\pi_j(i),j-1}, U_{i,j})$, for $i = 0, \ldots, n-1$;
  compute the permutation $\pi_{j+1}$ (sort the states);
end for
Estimate $\mu$ by the average $\bar Y_n = \hat\mu_{\mathrm{rqmc},n}$.

Theorem: The average $\bar Y_n$ is an unbiased estimator of $\mu$. We can estimate $\mathrm{Var}[\bar Y_n]$ by the empirical variance of $m$ independent realizations.
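Putting the pieces together, here is a compact and deliberately simplified Python sketch for $\ell = d = 1$, with $m$ independent randomizations to estimate $\mathrm{Var}[\bar Y_n]$. The choice of a freshly scrambled Sobol' set in $(0,1)^2$ at each step is our assumption for $\tilde Q_n$; the slides leave the point set unspecified.

```python
import numpy as np
from scipy.stats import qmc

def array_rqmc(phi, g, x0, tau, n, m=20, seed=1):
    """Array-RQMC sketch for ell = d = 1.  phi(j, x, u) and g(j, x) are
    placeholders for the slides' phi_j and g_j.  Each step draws a fresh
    scrambled Sobol' set in (0,1)^2: after sorting, the first coordinate
    is close to w_i = (i + 0.5)/n and the second plays the role of U_{i,j}.
    Returns (mean of the m averages, empirical variance of bar(Y)_n)."""
    rng = np.random.default_rng(seed)
    averages = np.empty(m)
    for rep in range(m):                       # m independent randomizations
        x = np.full(n, float(x0))              # states of the n chains
        y = np.zeros(n)                        # accumulated costs Y_i
        for j in range(1, tau + 1):
            order = np.argsort(x)              # permutation pi_j: sort the states
            x, y = x[order], y[order]
            pts = qmc.Sobol(d=2, scramble=True,
                            seed=int(rng.integers(2**32))).random(n)
            pts = pts[np.argsort(pts[:, 0])]   # align first coordinates with ranks
            u = pts[:, 1]                      # U_{i,j} for the chain of rank i
            x = np.array([phi(j, x[i], u[i]) for i in range(n)])
            y += np.array([g(j, x[i]) for i in range(n)])
        averages[rep] = y.mean()               # bar(Y)_n for this randomization
    return averages.mean(), averages.var(ddof=1)
```

Use $n$ a power of 2 to suit the Sobol' net; averages.var(ddof=1) is the empirical variance of the $m$ realizations of $\bar Y_n$, as in the theorem above.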