1. Tutorial on quasi-Monte Carlo methods
Josef Dick, School of Mathematics and Statistics, UNSW, Sydney, Australia
josef.dick@unsw.edu.au

2. Comparison: MCMC, MC, QMC
Roughly speaking:
• Markov chain Monte Carlo and quasi-Monte Carlo are for different types of problems;
• if you have a problem where Monte Carlo does not work, then chances are quasi-Monte Carlo will not work either;
• if Monte Carlo works but you want a faster method ⇒ try (randomized) quasi-Monte Carlo (some tweaking might be necessary);
• quasi-Monte Carlo is an "experimental design" approach to Monte Carlo simulation.
In this talk we shall discuss how quasi-Monte Carlo can be faster than Monte Carlo under certain assumptions.

3. Outline
• Discrepancy, reproducing kernel Hilbert spaces and worst-case error
• Quasi-Monte Carlo point sets
• Randomizations
• Weighted function spaces and tractability

4. The task is to approximate an integral
$$I_s(f) = \int_{[0,1]^s} f(\mathbf{z})\, d\mathbf{z}$$
for some integrand $f$ by a quadrature rule
$$Q_{N,s}(f) = \frac{1}{N} \sum_{n=0}^{N-1} f(\mathbf{x}_n)$$
at sample points $\mathbf{x}_0, \ldots, \mathbf{x}_{N-1} \in [0,1]^s$.

5. $$\int_{[0,1]^s} f(\mathbf{x})\, d\mathbf{x} \approx \frac{1}{N} \sum_{n=0}^{N-1} f(\mathbf{x}_n)$$
In other words: area under the curve = volume of the cube × average function value.
• If $\mathbf{x}_0, \ldots, \mathbf{x}_{N-1} \in [0,1]^s$ are chosen randomly ⇒ Monte Carlo.
• If $\mathbf{x}_0, \ldots, \mathbf{x}_{N-1} \in [0,1]^s$ are chosen deterministically ⇒ quasi-Monte Carlo.
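To make the distinction concrete, here is a minimal sketch (not part of the slides; the integrand f, the sample size, and the choice of a Halton point set are all illustrative assumptions) comparing a Monte Carlo estimate on random points with a quasi-Monte Carlo estimate on deterministic points:

```python
# Minimal sketch: MC on random points vs. QMC on deterministic Halton points.
import numpy as np

def van_der_corput(n, base):
    """n-th term of the van der Corput sequence in the given base."""
    q, denom = 0.0, 1.0
    while n > 0:
        n, r = divmod(n, base)
        denom *= base
        q += r / denom
    return q

def halton(N, s):
    """First N points of the s-dimensional Halton sequence (skipping n = 0)."""
    primes = [2, 3, 5, 7, 11, 13][:s]
    return np.array([[van_der_corput(n, p) for p in primes] for n in range(1, N + 1)])

# Illustrative smooth integrand with exact integral 1 over [0, 1]^s.
f = lambda x: np.prod(1.0 + (x - 0.5) / 3.0, axis=1)

N, s = 1024, 3
rng = np.random.default_rng(0)
mc_est = f(rng.random((N, s))).mean()    # Monte Carlo: random points
qmc_est = f(halton(N, s)).mean()         # quasi-Monte Carlo: deterministic points
print(abs(mc_est - 1.0), abs(qmc_est - 1.0))
```

On smooth integrands like this one, the deterministic points typically give a noticeably smaller error at the same $N$.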

6. Smooth integrands
• Integrand $f : [0,1] \to \mathbb{R}$, say continuously differentiable.
• We want to study the integration error
$$\int_0^1 f(x)\, dx - \frac{1}{N} \sum_{n=0}^{N-1} f(x_n).$$
• Representation:
$$f(x) = f(1) - \int_x^1 f'(t)\, dt = f(1) - \int_0^1 1_{[0,t]}(x)\, f'(t)\, dt,$$
where
$$1_{[0,t]}(x) = \begin{cases} 1 & \text{if } x \in [0,t], \\ 0 & \text{otherwise}. \end{cases}$$

7. Integration error
• Substitute:
$$\int_0^1 f(x)\, dx = \int_0^1 \left( f(1) - \int_0^1 1_{[0,t]}(x)\, f'(t)\, dt \right) dx = f(1) - \int_0^1 \int_0^1 1_{[0,t]}(x)\, dx\, f'(t)\, dt,$$
$$\frac{1}{N} \sum_{n=0}^{N-1} f(x_n) = \frac{1}{N} \sum_{n=0}^{N-1} \left( f(1) - \int_0^1 1_{[0,t]}(x_n)\, f'(t)\, dt \right) = f(1) - \int_0^1 \frac{1}{N} \sum_{n=0}^{N-1} 1_{[0,t]}(x_n)\, f'(t)\, dt.$$
• Integration error:
$$\int_0^1 f(x)\, dx - \frac{1}{N} \sum_{n=0}^{N-1} f(x_n) = \int_0^1 \left( \frac{1}{N} \sum_{n=0}^{N-1} 1_{[0,t]}(x_n) - \int_0^1 1_{[0,t]}(x)\, dx \right) f'(t)\, dt.$$

8. Local discrepancy
Integration error:
$$\int_0^1 f(x)\, dx - \frac{1}{N} \sum_{n=0}^{N-1} f(x_n) = \int_0^1 \left( \frac{1}{N} \sum_{n=0}^{N-1} 1_{[0,t]}(x_n) - t \right) f'(t)\, dt.$$
Let $P = \{x_0, \ldots, x_{N-1}\} \subset [0,1]$. Define the local discrepancy by
$$\Delta_P(t) = \frac{1}{N} \sum_{n=0}^{N-1} 1_{[0,t]}(x_n) - t, \quad t \in [0,1].$$
Then
$$\int_0^1 f(x)\, dx - \frac{1}{N} \sum_{n=0}^{N-1} f(x_n) = \int_0^1 \Delta_P(t)\, f'(t)\, dt.$$
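As a minimal sketch (not from the slides; the helper name and the test point set are illustrative), the local discrepancy can be evaluated directly from its definition:

```python
import numpy as np

def local_discrepancy(points, t):
    """Delta_P(t) = (1/N) #{n : x_n <= t} - t, evaluated at each entry of t."""
    x = np.sort(np.asarray(points, dtype=float))
    t = np.atleast_1d(np.asarray(t, dtype=float))
    counts = np.searchsorted(x, t, side="right")   # #{n : x_n <= t} by binary search
    return counts / len(x) - t

P = (np.arange(8) + 0.5) / 8                       # centered equispaced points in [0, 1]
print(local_discrepancy(P, [0.25, 0.5, 0.9]))
```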

9. Koksma–Hlawka inequality
$$\left| \int_0^1 f(x)\, dx - \frac{1}{N} \sum_{n=0}^{N-1} f(x_n) \right| = \left| \int_0^1 \Delta_P(t)\, f'(t)\, dt \right| \le \left( \int_0^1 |\Delta_P(t)|^p\, dt \right)^{1/p} \left( \int_0^1 |f'(t)|^q\, dt \right)^{1/q} = \|\Delta_P\|_{L_p}\, \|f'\|_{L_q}, \qquad \frac{1}{p} + \frac{1}{q} = 1,$$
by Hölder's inequality, where $\|g\|_{L_p} = \left( \int |g|^p \right)^{1/p}$.

10. Interpretation of discrepancy
Let $P = \{x_0, \ldots, x_{N-1}\} \subset [0,1]$. Recall the definition of the local discrepancy:
$$\Delta_P(t) = \frac{1}{N} \sum_{n=0}^{N-1} 1_{[0,t]}(x_n) - t, \quad t \in [0,1].$$
The local discrepancy measures the difference between the uniform distribution and the empirical distribution of the quadrature points $P = \{x_0, \ldots, x_{N-1}\}$. Its supremum over $t$ is the Kolmogorov–Smirnov statistic for testing the empirical distribution of $\{x_0, \ldots, x_{N-1}\}$ against the uniform distribution.
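The supremum $\sup_t |\Delta_P(t)|$ can be computed exactly from the sorted points. The following sketch (not from the slides; it uses the standard closed form for the one-sample Kolmogorov–Smirnov statistic against the uniform distribution) illustrates this:

```python
import numpy as np

def star_discrepancy_1d(points):
    """sup_t |Delta_P(t)| for a point set in [0, 1], via the sorted points."""
    x = np.sort(np.asarray(points, dtype=float))
    N = len(x)
    i = np.arange(1, N + 1)
    # the supremum of |F_N(t) - t| is attained just before or at some x_(i)
    return max(np.max(i / N - x), np.max(x - (i - 1) / N))

# Centered equispaced points achieve the optimal value 1/(2N).
print(star_discrepancy_1d((np.arange(8) + 0.5) / 8))   # 1/16
```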

11. Function space
Representation:
$$f(x) = f(1) - \int_x^1 f'(t)\, dt.$$
Define the inner product
$$\langle f, g \rangle = f(1) g(1) + \int_0^1 f'(t)\, g'(t)\, dt$$
and norm
$$\|f\| = \sqrt{ |f(1)|^2 + \int_0^1 |f'(t)|^2\, dt }.$$
Function space: $H = \{ f : [0,1] \to \mathbb{R} : f \text{ absolutely continuous and } \|f\| < \infty \}$.

12. Worst-case error
The worst-case error is defined by
$$e(H, P) = \sup_{f \in H,\, \|f\| \le 1} \left| \int_0^1 f(x)\, dx - \frac{1}{N} \sum_{n=0}^{N-1} f(x_n) \right|.$$

13. Reproducing kernel Hilbert space
Recall:
$$f(y) = f(1) \cdot 1 + \int_0^1 f'(x) \left( -1_{[0,x]}(y) \right) dx$$
and
$$\langle f, g \rangle = f(1) g(1) + \int_0^1 f'(x)\, g'(x)\, dx.$$
Goal: find a set of functions $g_y \in H$, one for each $y \in [0,1]$, such that $\langle f, g_y \rangle = f(y)$ for all $f \in H$.
Conclusions:
• $g_y(1) = 1$ for all $y \in [0,1]$;
• $g_y'(x) = -1_{[0,x]}(y) = \begin{cases} -1 & \text{if } y \le x, \\ 0 & \text{otherwise}; \end{cases}$
• make $g_y$ continuous so that $g_y \in H$.

14. Reproducing kernel Hilbert space
It follows that
$$g_y(x) = 1 + \min\{1 - x, 1 - y\}.$$
The function
$$K(x, y) := g_y(x) = 1 + \min\{1 - x, 1 - y\}, \quad x, y \in [0,1],$$
is called the reproducing kernel. The space $H$ is called a reproducing kernel Hilbert space (with reproducing kernel $K$).
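A quick numerical check of the reproducing property (not from the slides; the test function f is an arbitrary smooth choice) uses the form of $g_y'$ derived above, so that $\langle f, g_y \rangle = f(1) - \int_y^1 f'(x)\, dx$:

```python
# Minimal numerical check of <f, K(., y)> = f(y) for K(x, y) = 1 + min{1-x, 1-y}.
import numpy as np
from scipy.integrate import quad

f  = lambda x: np.sin(3 * x) + x**2         # arbitrary smooth test function
fp = lambda x: 3 * np.cos(3 * x) + 2 * x    # its derivative
for y in (0.2, 0.5, 0.9):
    # <f, g_y> = f(1) g_y(1) + int_0^1 f'(x) g_y'(x) dx = f(1) - int_y^1 f'(x) dx
    inner = f(1.0) - quad(fp, y, 1.0)[0]
    print(y, inner, f(y))                   # the last two columns agree
```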

15. Numerical integration in reproducing kernel Hilbert spaces
Function representation:
$$\int_0^1 f(z)\, dz = \int_0^1 \langle f, K(\cdot, z) \rangle\, dz = \left\langle f, \int_0^1 K(\cdot, z)\, dz \right\rangle,$$
$$\frac{1}{N} \sum_{n=0}^{N-1} f(x_n) = \frac{1}{N} \sum_{n=0}^{N-1} \langle f, K(\cdot, x_n) \rangle = \left\langle f, \frac{1}{N} \sum_{n=0}^{N-1} K(\cdot, x_n) \right\rangle.$$
Integration error:
$$\int_0^1 f(z)\, dz - \frac{1}{N} \sum_{n=0}^{N-1} f(x_n) = \left\langle f, \int_0^1 K(\cdot, z)\, dz - \frac{1}{N} \sum_{n=0}^{N-1} K(\cdot, x_n) \right\rangle = \langle f, h \rangle,$$
where
$$h(x) = \int_0^1 K(x, z)\, dz - \frac{1}{N} \sum_{n=0}^{N-1} K(x, x_n).$$

16. Worst-case error in reproducing kernel Hilbert spaces
Thus
$$e(H, P) = \sup_{f \in H,\, \|f\| \le 1} \left| \int_0^1 f(z)\, dz - \frac{1}{N} \sum_{n=0}^{N-1} f(x_n) \right| = \sup_{f \in H,\, \|f\| = 1} |\langle f, h \rangle| = \sup_{f \in H,\, f \neq 0} \left| \left\langle \frac{f}{\|f\|}, h \right\rangle \right| = \frac{\langle h, h \rangle}{\|h\|} = \|h\|,$$
since the supremum is attained by choosing $f = h / \|h\| \in H$.

17. Worst-case error in reproducing kernel Hilbert spaces
$$e^2(H, P) = \int_0^1 \int_0^1 K(x, y)\, dx\, dy - \frac{2}{N} \sum_{n=0}^{N-1} \int_0^1 K(x, x_n)\, dx + \frac{1}{N^2} \sum_{n,m=0}^{N-1} K(x_n, x_m).$$
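This formula is directly computable. Here is a minimal sketch (not from the slides; the closed forms $\int_0^1 \int_0^1 K = 4/3$ and $\int_0^1 K(x, x_n)\, dx = 1 + (1 - x_n) - (1 - x_n)^2 / 2$ are elementary integrals of this particular kernel):

```python
import numpy as np

def worst_case_error(points):
    """Worst-case error e(H, P) for quadrature nodes `points` in [0, 1]."""
    x = np.asarray(points, dtype=float)
    N = len(x)
    a = 1.0 - x
    term1 = 4.0 / 3.0                                  # double integral of K
    term2 = (2.0 / N) * np.sum(1.0 + a - a**2 / 2.0)   # sum of int K(x, x_n) dx
    K = 1.0 + np.minimum(a[:, None], a[None, :])       # kernel matrix K(x_n, x_m)
    term3 = K.sum() / N**2
    return np.sqrt(term1 - term2 + term3)

for N in (8, 16, 32):
    print(N, worst_case_error((np.arange(N) + 0.5) / N))   # error decays like 1/N
```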

18. Numerical integration in higher dimensions
• Tensor product space $H_s = H \otimes \cdots \otimes H$.
• Reproducing kernel
$$K(\mathbf{x}, \mathbf{y}) = \prod_{i=1}^{s} \left[ 1 + \min\{1 - x_i, 1 - y_i\} \right],$$
where $\mathbf{x} = (x_1, \ldots, x_s), \mathbf{y} = (y_1, \ldots, y_s) \in [0,1]^s$.
• Functions $f \in H_s$ have mixed partial derivatives of order up to 1 in each variable,
$$\frac{\partial^{|u|} f}{\partial \mathbf{x}_u} \in L_2([0,1]^s),$$
where $u \subseteq \{1, \ldots, s\}$, $\mathbf{x}_u = (x_i)_{i \in u}$, $|u|$ denotes the cardinality of $u$, and
$$\frac{\partial^{|u|} f}{\partial \mathbf{x}_u}(\mathbf{x}_u, \mathbf{1}) = 0 \quad \text{for all } u \subset \{1, \ldots, s\}.$$

19. Worst-case error
Again
$$e^2(H_s, P) = \int_{[0,1]^s} \int_{[0,1]^s} K(\mathbf{x}, \mathbf{y})\, d\mathbf{x}\, d\mathbf{y} - \frac{2}{N} \sum_{n=0}^{N-1} \int_{[0,1]^s} K(\mathbf{x}, \mathbf{x}_n)\, d\mathbf{x} + \frac{1}{N^2} \sum_{n,m=0}^{N-1} K(\mathbf{x}_n, \mathbf{x}_m),$$
and the integration error can again be expressed via the local discrepancy,
$$\left| \int_{[0,1]^s} f(\mathbf{x})\, d\mathbf{x} - \frac{1}{N} \sum_{n=0}^{N-1} f(\mathbf{x}_n) \right| = \left| \int_{[0,1]^s} \Delta_P(\mathbf{x})\, \frac{\partial^s f}{\partial \mathbf{x}}(\mathbf{x})\, d\mathbf{x} \right|,$$
where
$$\Delta_P(\mathbf{x}) = \frac{1}{N} \sum_{n=0}^{N-1} 1_{[0,\mathbf{x}]}(\mathbf{x}_n) - \prod_{i=1}^{s} x_i.$$
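The worst-case error computation generalizes coordinatewise. A minimal sketch (not from the slides; it simply takes products of the one-dimensional closed forms used above):

```python
import numpy as np

def worst_case_error_s(points):
    """e(H_s, P) for an (N, s) array of points in [0, 1]^s."""
    x = np.asarray(points, dtype=float)
    N, s = x.shape
    a = 1.0 - x                                       # per-coordinate 1 - x
    term1 = (4.0 / 3.0) ** s                          # double integral of the product kernel
    term2 = (2.0 / N) * np.prod(1.0 + a - a**2 / 2.0, axis=1).sum()
    # kernel matrix: product over coordinates of 1 + min{1 - x_{n,i}, 1 - x_{m,i}}
    K = np.prod(1.0 + np.minimum(a[:, None, :], a[None, :, :]), axis=2)
    term3 = K.sum() / N**2
    return np.sqrt(term1 - term2 + term3)

rng = np.random.default_rng(0)
print(worst_case_error_s(rng.random((128, 4))))       # random points in [0, 1]^4
```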

20. Discrepancy in higher dimensions
Point set $P = \{\mathbf{x}_0, \ldots, \mathbf{x}_{N-1}\} \subset [0,1]^s$, $\mathbf{t} = (t_1, \ldots, t_s) \in [0,1]^s$.
• Local discrepancy:
$$\Delta_P(\mathbf{t}) = \frac{1}{N} \sum_{n=0}^{N-1} 1_{[0,\mathbf{t}]}(\mathbf{x}_n) - \prod_{i=1}^{s} t_i,$$
where $[0,\mathbf{t}] = \prod_{i=1}^{s} [0, t_i]$.

21. Koksma–Hlawka inequality
Let $f : [0,1]^s \to \mathbb{R}$ with
$$\|f\| = \left( \int_{[0,1]^s} \left| \frac{\partial^s f}{\partial \mathbf{x}}(\mathbf{x}) \right|^q d\mathbf{x} \right)^{1/q}$$
and
$$\frac{\partial^{|u|} f}{\partial \mathbf{x}_u}(\mathbf{x}_u, \mathbf{1}) = 0 \quad \text{for all } u \subset \{1, \ldots, s\}.$$
Then
$$\left| \int_{[0,1]^s} f(\mathbf{x})\, d\mathbf{x} - \frac{1}{N} \sum_{n=0}^{N-1} f(\mathbf{x}_n) \right| \le \|f\|\, \|\Delta_P\|_{L_p}, \qquad \frac{1}{p} + \frac{1}{q} = 1.$$
⇒ Construct point sets $P = \{\mathbf{x}_0, \ldots, \mathbf{x}_{N-1}\} \subset [0,1]^s$ with small discrepancy $L_p(P) = \|\Delta_P\|_{L_p}$.
It is often useful to consider different reproducing kernels, yielding different worst-case errors and discrepancies. We will see further examples later.
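For $p = 2$ the discrepancy itself has a closed form. As a sketch (not from the slides; this is Warnock's formula for the $L_2$ star discrepancy, stated under the convention for $\Delta_P$ above):

```python
# Sketch of Warnock's closed-form formula for ||Delta_P||_{L_2} in [0, 1]^s.
import numpy as np

def l2_star_discrepancy(points):
    x = np.asarray(points, dtype=float)
    N, s = x.shape
    term1 = 3.0 ** (-s)                                # integral of (prod t_i)^2
    term2 = (2.0 / N) * np.sum(np.prod((1.0 - x**2) / 2.0, axis=1))
    term3 = np.sum(np.prod(1.0 - np.maximum(x[:, None, :], x[None, :, :]), axis=2)) / N**2
    return np.sqrt(term1 - term2 + term3)

rng = np.random.default_rng(0)
print(l2_star_discrepancy(rng.random((64, 2))))
```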

22. Constructions of low-discrepancy sequences
• Lattice rules (Bilyk, Brauchart, Cools, D., Hellekalek, Hickernell, Hlawka, Joe, Keller, Korobov, Kritzer, Kuo, Larcher, L'Ecuyer, Lemieux, Leobacher, Niederreiter, Nuyens, Pillichshammer, Sinescu, Sloan, Temlyakov, Wang, Woźniakowski, ...)
• Digital nets and sequences (Baldeaux, Bierbrauer, Brauchart, Chen, D., Edel, Faure, Hellekalek, Hofer, Keller, Kritzer, Kuo, Larcher, Leobacher, Niederreiter, Owen, Özbudak, Pillichshammer, Pirsic, Schmid, Skriganov, Sobol', Wang, Xing, Yue, ...)
• Hammersley–Halton sequences (Atanassov, De Clerck, Faure, Halton, Hammersley, Kritzer, Larcher, Lemieux, Pillichshammer, Pirsic, White, ...)
• Kronecker sequences (Beck, Hellekalek, Larcher, Niederreiter, Schoissengeier, ...)

23. Lattice rules
Let $N \in \mathbb{N}$ and let $\mathbf{g} = (g_1, \ldots, g_s) \in \{1, \ldots, N-1\}^s$. Choose the quadrature points as
$$\mathbf{x}_n = \left\{ \frac{n \mathbf{g}}{N} \right\} \quad \text{for } n = 0, \ldots, N-1,$$
where $\{z\} = z - \lfloor z \rfloor$ for $z \in \mathbb{R}_0^+$, applied componentwise.
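Generating the points of a rank-1 lattice rule is a one-liner. A minimal sketch (not from the slides; the integrand is an illustrative assumption), using the Fibonacci example from the next slide:

```python
import numpy as np

def lattice_points(N, g):
    """Rank-1 lattice rule: x_n = fractional part of n * g / N, n = 0..N-1."""
    g = np.asarray(g)
    n = np.arange(N)[:, None]
    return (n * g % N) / N          # same as {n g / N} for integer g

# Fibonacci lattice rule with N = 55, g = (1, 34), as shown on the next slide.
P = lattice_points(55, (1, 34))
f = lambda x: np.prod(1.0 + (x - 0.5) / 3.0, axis=1)   # illustrative integrand, integral = 1
print(abs(f(P).mean() - 1.0))
```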

24. Fibonacci lattice rules
Lattice rule with $N = 55$ points and generating vector $\mathbf{g} = (1, 34)$.

25. Fibonacci lattice rules
Lattice rule with $N = 89$ points and generating vector $\mathbf{g} = (1, 55)$.
