A review of Array-RQMC: Sorting methods and convergence rates

Pierre L'Ecuyer, Christian Lécot, David Munger, Bruno Tuffin
DIRO, Université de Montréal, Canada; LAMA, Université de Savoie, France; Inria–Rennes, France

UNSW, Sydney, Feb. 2016
Monte Carlo for Markov Chains

Setting: A Markov chain with state space $\mathcal{X} \subseteq \mathbb{R}^\ell$ evolves as
$$X_0 = x_0, \qquad X_j = \varphi_j(X_{j-1}, U_j), \quad j \ge 1,$$
where the $U_j$ are i.i.d. uniform r.v.'s over $(0,1)^d$. We want to estimate
$$\mu = E[Y] \quad \text{where} \quad Y = \sum_{j=1}^{\tau} g_j(X_j),$$
for some fixed time horizon $\tau$.

Ordinary MC: For $i = 0, \ldots, n-1$, generate $X_{i,j} = \varphi_j(X_{i,j-1}, U_{i,j})$, $j = 1, \ldots, \tau$, where the $U_{i,j}$'s are i.i.d. $U(0,1)^d$. Estimate $\mu$ by
$$\hat{\mu}_n = \frac{1}{n} \sum_{i=0}^{n-1} \sum_{j=1}^{\tau} g_j(X_{i,j}) = \frac{1}{n} \sum_{i=0}^{n-1} Y_i.$$
Then $E[\hat{\mu}_n] = \mu$ and $\operatorname{Var}[\hat{\mu}_n] = \frac{1}{n}\operatorname{Var}[Y_i] = O(n^{-1})$.
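Not on the original slide: a minimal Python sketch of this ordinary MC estimator, with illustrative names `phi(j, x, u)` and `g(j, x)` for the transition and cost functions.

```python
import numpy as np

def mc_estimate(phi, g, x0, d, tau, n, rng=None):
    """Ordinary MC: average of Y_i = sum_{j=1}^{tau} g(j, X_{i,j})
    over n independent paths X_{i,j} = phi(j, X_{i,j-1}, U_{i,j})."""
    if rng is None:
        rng = np.random.default_rng()
    total = 0.0
    for i in range(n):
        x, y = x0, 0.0
        for j in range(1, tau + 1):
            u = rng.random(d)      # U_{i,j} ~ U(0,1)^d
            x = phi(j, x, u)       # advance the chain one step
            y += g(j, x)           # accumulate the cost
        total += y
    return total / n               # mu-hat_n; variance O(1/n)
```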
Example 1 (very simple, one-dimensional state)

Let $Y = \theta U + (1-\theta)V$, where $U$, $V$ are independent $U(0,1)$ and $\theta \in [0,1)$. This $Y$ has cdf $G_\theta$. The Markov chain is defined by $X_0 = U_0 \sim U(0,1)$ and
$$X_j = \varphi_j(X_{j-1}, U_j) = G_\theta(\theta X_{j-1} + (1-\theta)U_j), \quad j \ge 1,$$
where $U_j \sim U(0,1)$. Then $X_j \sim U(0,1)$.

We consider various functions $g_j$: $g_j(x) = x - 1/2$, $g_j(x) = x^2 - 1/3$, $g_j(x) = \sin(2\pi x)$, $g_j(x) = e^x - e + 1$, $g_j(x) = (x - 1/2)^+ - 1/8$, $g_j(x) = \mathbb{I}[x \le 1/3] - 1/3$. They all satisfy $E[g_j(X_j)] = 0$. We also examine discrepancies of the states $X_{0,j}, \ldots, X_{n-1,j}$.
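The slide does not spell out $G_\theta$; for $\theta \in (0,1)$ it is the trapezoidal cdf of a sum of two scaled uniforms. Here is a Python sketch with our own closed form for $G_\theta$, obtained by convolution (an assumption to verify, not part of the talk):

```python
def G(theta, y):
    """CDF of Y = theta*U + (1-theta)*V with U, V independent U(0,1):
    a trapezoidal distribution on [0,1], derived by convolution."""
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    a, b = min(theta, 1.0 - theta), max(theta, 1.0 - theta)
    if a == 0.0:                    # theta = 0: Y = V ~ U(0,1)
        return y
    if y < a:
        return y * y / (2.0 * a * b)
    if y <= b:
        return (y - a / 2.0) / b
    return 1.0 - (1.0 - y) ** 2 / (2.0 * a * b)

def step(theta, x, u):
    """One transition of Example 1: X_j = G_theta(theta*X_{j-1} + (1-theta)*U_j)."""
    return G(theta, theta * x + (1.0 - theta) * u)
```

Since $\theta X_{j-1} + (1-\theta)U_j$ has cdf $G_\theta$ when $X_{j-1} \sim U(0,1)$, applying $G_\theta$ maps it back to $U(0,1)$; this is why $X_j \sim U(0,1)$ at every step.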
Example 2: Asian Call Option (two-dimensional state)

Given observation times $t_1, t_2, \ldots, t_\tau$, suppose
$$S(t_j) = S(t_{j-1}) \exp\!\left[(r - \sigma^2/2)(t_j - t_{j-1}) + \sigma (t_j - t_{j-1})^{1/2} \Phi^{-1}(U_j)\right],$$
where $U_j \sim U[0,1)$ and $S(t_0) = s_0$ is fixed. Running average: $\bar{S}_j = \frac{1}{j}\sum_{i=1}^{j} S(t_i)$.
Payoff at step $j = \tau$: $Y = g_\tau(X_\tau) = \max\left(0,\ \bar{S}_\tau - K\right)$.
State: $X_j = (S(t_j), \bar{S}_j)$. Transition:
$$X_j = (S(t_j), \bar{S}_j) = \varphi_j(S(t_{j-1}), \bar{S}_{j-1}, U_j) = \left( S(t_j),\ \frac{(j-1)\bar{S}_{j-1} + S(t_j)}{j} \right).$$
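As an illustration, a sketch of this transition and payoff in Python; the parameter values and the equal spacing $t_j - t_{j-1} = \mathtt{dt}$ are our own assumptions, and `NormalDist().inv_cdf` plays the role of $\Phi^{-1}$.

```python
import math
from statistics import NormalDist   # Python 3.8+

def asian_step(j, state, u, r=0.05, sigma=0.2, dt=1.0 / 12.0):
    """One transition of the Asian-option chain; state = (S(t_{j-1}), Sbar_{j-1})."""
    s_prev, sbar_prev = state
    z = NormalDist().inv_cdf(u)                          # Phi^{-1}(U_j)
    s = s_prev * math.exp((r - 0.5 * sigma ** 2) * dt
                          + sigma * math.sqrt(dt) * z)   # geometric Brownian step
    sbar = ((j - 1) * sbar_prev + s) / j                 # running-average update
    return (s, sbar)

def asian_payoff(state, K=100.0):
    """g_tau(X_tau) = max(0, Sbar_tau - K)."""
    return max(0.0, state[1] - K)
```

Starting from `state = (s0, 0.0)`, the update at $j = 1$ gives $\bar{S}_1 = S(t_1)$, as required.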
Plenty of potential applications:
- Finance
- Queueing systems
- Inventory, distribution, logistics systems
- Reliability models
- MCMC in Bayesian statistics
- Etc.
Classical Randomized Quasi-Monte Carlo (RQMC) for Markov Chains

One RQMC point for each sample path. Put $V_i = (U_{i,1}, \ldots, U_{i,\tau}) \in (0,1)^s = (0,1)^{d\tau}$. Estimate $\mu$ by
$$\hat{\mu}_{\mathrm{rqmc},n} = \frac{1}{n} \sum_{i=0}^{n-1} \sum_{j=1}^{\tau} g_j(X_{i,j}),$$
where $P_n = \{V_0, \ldots, V_{n-1}\} \subset (0,1)^s$ satisfies:
(a) each point $V_i$ has the uniform distribution over $(0,1)^s$;
(b) $P_n$ covers $(0,1)^s$ very evenly (i.e., has low discrepancy).

The dimension $s$ is often very large!
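A sketch of the classical RQMC estimator, using SciPy's scrambled Sobol' generator as one possible RQMC point set (an implementation choice, not prescribed by the talk); $n$ should be a power of 2 for a Sobol' net.

```python
import numpy as np
from scipy.stats import qmc   # SciPy >= 1.7

def classical_rqmc(phi, g, x0, d, tau, n, seed=None):
    """Classical RQMC: one s-dimensional point per sample path, s = d * tau."""
    s = d * tau
    points = qmc.Sobol(d=s, scramble=True, seed=seed).random(n)  # each row ~ U(0,1)^s
    total = 0.0
    for i in range(n):
        x, y = x0, 0.0
        for j in range(1, tau + 1):
            u = points[i, (j - 1) * d : j * d]   # coordinates consumed at step j
            x = phi(j, x, u)
            y += g(j, x)
        total += y
    return total / n
```

This makes concrete why $s = d\tau$ can be very large: each step consumes $d$ fresh coordinates of the same point.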
Array-RQMC for Markov Chains

L., Lécot, Tuffin, et al. [2004, 2006, 2008, etc.]. Earlier deterministic versions: Lécot et al.

Simulate an "array" of $n$ chains in "parallel." At each step, use an RQMC point set $P_n$ to advance all the chains by one step. Seek global negative dependence across the chains.

Goal: we want a small discrepancy (or "distance") between the empirical distribution of $S_{n,j} = \{X_{0,j}, \ldots, X_{n-1,j}\}$ and the theoretical distribution of $X_j$. If we succeed, these (unbiased) estimators will have small variance:
$$\mu_j = E[g_j(X_j)] \approx \hat{\mu}_{\mathrm{arqmc},j,n} = \frac{1}{n} \sum_{i=0}^{n-1} g_j(X_{i,j}),$$
$$\operatorname{Var}[\hat{\mu}_{\mathrm{arqmc},j,n}] = \sum_{i=0}^{n-1} \frac{\operatorname{Var}[g_j(X_{i,j})]}{n^2} + \frac{2}{n^2} \sum_{i=0}^{n-1} \sum_{k=i+1}^{n-1} \operatorname{Cov}[g_j(X_{i,j}), g_j(X_{k,j})].$$
Some RQMC insight

To simplify the discussion, suppose $X_j \sim U(0,1)^\ell$. This can be achieved (in principle) by a change of variable. We estimate
$$\mu_j = E[g_j(X_j)] = E[g_j(\varphi_j(X_{j-1}, U))] = \int_{[0,1)^{\ell+d}} g_j(\varphi_j(x, u)) \, dx \, du$$
(we take a single $j$ here) by
$$\hat{\mu}_{\mathrm{arqmc},j,n} = \frac{1}{n} \sum_{i=0}^{n-1} g_j(X_{i,j}) = \frac{1}{n} \sum_{i=0}^{n-1} g_j(\varphi_j(X_{i,j-1}, U_{i,j})).$$
This is (roughly) RQMC with the point set $Q_n = \{(X_{i,j-1}, U_{i,j}),\ 0 \le i < n\}$. We want $Q_n$ to have low discrepancy (LD), i.e., to be highly uniform, over $[0,1)^{\ell+d}$.

We do not choose the $X_{i,j-1}$'s in $Q_n$: they come from the simulation. To construct the (randomized) $U_{i,j}$, select a LD point set
$$\tilde{Q}_n = \{(w_0, U_{0,j}), \ldots, (w_{n-1}, U_{n-1,j})\},$$
where the $w_i \in [0,1)^\ell$ are fixed and each $U_{i,j} \sim U(0,1)^d$. Permute the states $X_{i,j-1}$ so that $X_{\pi_j(i),j-1}$ is "close" to $w_i$ for each $i$ (LD between the two sets), and compute $X_{i,j} = \varphi_j(X_{\pi_j(i),j-1}, U_{i,j})$ for each $i$.

Example: If $\ell = 1$, we can take $w_i = (i + 0.5)/n$ and just sort the states. For $\ell > 1$, there are various ways to define the matching (multivariate sort).
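In code, the $\ell = 1$ matching is literally a sort, as in this small sketch (names illustrative; `u_points` stands for the $U_{\cdot,j}$ column of $\tilde{Q}_n$):

```python
import numpy as np

def match_and_advance(states, u_points, phi_j):
    """One Array-RQMC step for ell = 1: with w_i = (i + 0.5)/n, the state
    closest to w_i is the i-th smallest, so the matching pi_j is a sort."""
    pi_j = np.argsort(states)                # permutation pi_j
    return phi_j(states[pi_j], u_points)     # X_{i,j} = phi_j(X_{pi_j(i),j-1}, U_{i,j})
```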
Array-RQMC algorithm

$X_{i,0} \leftarrow x_0$ (or $X_{i,0} \leftarrow x_{i,0}$) for $i = 0, \ldots, n-1$;
for $j = 1, 2, \ldots, \tau$ do
    Compute the permutation $\pi_j$ of the states (for matching);
    Randomize afresh $\{U_{0,j}, \ldots, U_{n-1,j}\}$ in $\tilde{Q}_n$;
    $X_{i,j} = \varphi_j(X_{\pi_j(i),j-1}, U_{i,j})$, for $i = 0, \ldots, n-1$;
    $\hat{\mu}_{\mathrm{arqmc},j,n} = \bar{Y}_{n,j} = \frac{1}{n} \sum_{i=0}^{n-1} g_j(X_{i,j})$;
end for
Estimate $\mu$ by the average $\bar{Y}_n = \hat{\mu}_{\mathrm{arqmc},n} = \sum_{j=1}^{\tau} \hat{\mu}_{\mathrm{arqmc},j,n}$.

Proposition: (i) The average $\bar{Y}_n$ is an unbiased estimator of $\mu$. (ii) The empirical variance of $m$ independent realizations of $\bar{Y}_n$ gives an unbiased estimator of $\operatorname{Var}[\bar{Y}_n]$.
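Below is a hedged end-to-end sketch for $\ell = d = 1$, instantiating $\tilde{Q}_n$ as a two-dimensional scrambled Sobol' net re-randomized at every step (one of several valid choices, not the only one the talk allows); `phi(j, x, u)` and `g(j, x)` are assumed vectorized over numpy arrays, and $n$ should be a power of 2.

```python
import numpy as np
from scipy.stats import qmc   # SciPy >= 1.7

def array_rqmc(phi, g, x0, tau, n, seed=None):
    """Array-RQMC sketch for a one-dimensional state (ell = d = 1)."""
    rng = np.random.default_rng(seed)
    states = np.full(n, x0, dtype=float)
    estimate = 0.0
    for j in range(1, tau + 1):
        # Fresh randomization of Q~_n at each step j.
        pts = qmc.Sobol(d=2, scramble=True, seed=rng).random(n)
        u = pts[np.argsort(pts[:, 0]), 1]     # U_{i,j}, ordered by the w-coordinate
        states = phi(j, np.sort(states), u)   # sorted states matched to sorted points
        estimate += g(j, states).mean()       # mu-hat_{arqmc,j,n}
    return estimate                           # Ybar_n = sum over j
```

Note that unbiasedness (part (i) of the Proposition) does not depend on the quality of the matching, since each $U_{i,j}$ is uniform regardless of $\pi_j$; the matching only affects the variance.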
Key issues:
1. How can we preserve the LD of $S_{n,j}$ as $j$ increases?
2. Can we prove that $\operatorname{Var}[\hat{\mu}_{\mathrm{arqmc},j,n}] = O(n^{-\alpha})$ for some $\alpha > 1$? How? For what $\alpha$?

Intuition: Write the discrepancy measure of $S_{n,j}$ as the mean square integration error (or variance) when integrating some function $\psi : [0,1)^{\ell+d} \to \mathbb{R}$ using $Q_n$. Use RQMC theory to show it is small if $Q_n$ has LD. Then use induction over $j$.
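To make the intuition slightly more concrete (our paraphrase, not a formula from the slide): the quantity to bound at step $j$ has the form of a mean square RQMC integration error over $Q_n$,
$$E\left[\left(\frac{1}{n}\sum_{i=0}^{n-1} \psi(X_{i,j-1}, U_{i,j}) - \int_{[0,1)^{\ell+d}} \psi(x,u)\, dx\, du\right)^{2}\right],$$
which RQMC theory controls when $Q_n$ has LD; the induction then propagates the bound from step $j-1$ to step $j$.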
Some generalizations

- L., Lécot, and Tuffin [2008]: $\tau$ can be a random stopping time w.r.t. the filtration generated by $\{(j, X_j),\ j \ge 0\}$.
- L., Demers, and Tuffin [2006, 2007]: Combination with splitting techniques (multilevel and without levels), and with importance sampling and weight windows. Covers particle filters.
- L. and Sanvido [2010]: Combination with coupling from the past, for exact sampling.
- Dion and L. [2010]: Combination with approximate dynamic programming, for optimal stopping problems.
- Gerber and Chopin [2015]: Sequential QMC.
Convergence results and applications

- L., Lécot, and Tuffin [2006, 2008]: Special cases: convergence at the MC rate, one-dimensional case, stratification, etc. Var in $O(n^{-3/2})$.
- Lécot and Tuffin [2004]: Deterministic, one-dimensional, discrete state space.
- El Haddad, Lécot, L. [2008, 2010]: Deterministic, multidimensional.
- Fakhereddine, El Haddad, Lécot [2012, 2013, 2014]: LHS, stratification, Sudoku sampling, ...
- Wächter and Keller [2008]: Applications in computer graphics.
- Gerber and Chopin [2015]: Sequential QMC (particle filters), Owen nested scrambling and Hilbert sort. Variance in $o(n^{-1})$.
Mapping chains to points when $\ell \ge 2$

1. Multivariate batch sort: Sort the states (chains) by their first coordinate, into $n_1$ packets of size $n/n_1$. Sort each packet by the second coordinate, into $n_2$ packets of size $n/(n_1 n_2)$. $\cdots$ At the last level, sort each packet of size $n_\ell$ by the last coordinate. How to choose $n_1, n_2, \ldots, n_\ell$ (with $n_1 n_2 \cdots n_\ell = n$)? A sketch of this sort is given below.
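A Python sketch of the multivariate batch sort (the helper name is ours; the recursion mirrors the description above, and `packet_sizes` is $(n_1, \ldots, n_\ell)$ with product $n$; the last entry is implied by the others):

```python
import numpy as np

def batch_sort(states, packet_sizes):
    """Return the order (permutation) of the chains after a multivariate
    batch sort of the (n, ell) array 'states'."""
    n, ell = states.shape

    def sort_level(idx, k):
        # Sort this packet by coordinate k, then split into packet_sizes[k]
        # sub-packets and recurse on coordinate k + 1.
        idx = idx[np.argsort(states[idx, k], kind="stable")]
        if k + 1 < ell:
            size = len(idx) // packet_sizes[k]   # sub-packet size at this level
            idx = np.concatenate([sort_level(idx[p:p + size], k + 1)
                                  for p in range(0, len(idx), size)])
        return idx

    return sort_level(np.arange(n), 0)
```

For example, `batch_sort(states, (4, 4))` with $n = 16$ two-dimensional states produces the (4,4) mapping illustrated on the next slide; chain `order[i]` is then matched with the $i$-th point of the RQMC net sorted the same way.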
A (4,4) mapping

[Figure: two panels on $[0,1]^2$. Left: a Sobol' net in 2 dimensions after a random digital shift. Right: the states of the chains.]