Modern Computational Statistics
Lecture 4: Numerical Integration
Cheng Zhang
School of Mathematical Sciences, Peking University
September 23, 2019
Overview
◮ Statistical inference often depends on intractable integrals
$$I(f) = \int_\Omega f(x)\,dx$$
◮ This is especially true in Bayesian statistics, where the posterior distribution is usually not available in closed form
◮ In some situations, the likelihood itself may depend on intractable integrals, so frequentist methods would also require numerical integration
◮ In this lecture, we start by discussing some simple numerical methods that can be easily used in low dimensional problems
◮ Next, we will discuss several Monte Carlo strategies that can be implemented even when the dimension is high
Newton-Côtes Quadrature
◮ Consider a one-dimensional integral of the form
$$I(f) = \int_a^b f(x)\,dx$$
◮ A common strategy for approximating this integral is to use a tractable approximating function $\tilde{f}(x)$ that can be integrated easily
◮ We typically constrain the approximating function to agree with $f$ on a grid of points: $x_1, x_2, \ldots, x_n$
Newton-Côtes Quadrature
◮ Newton-Côtes methods use equally-spaced grids
◮ The approximating function is a polynomial
◮ The integral is then approximated with a weighted sum as follows
$$\hat{I} = \sum_{i=1}^{n} w_i f(x_i)$$
◮ In its simplest case, we can use the Riemann rule by partitioning the interval $[a, b]$ into $n$ subintervals of length $h = \frac{b-a}{n}$; then
$$\hat{I}_L = h \sum_{i=0}^{n-1} f(a + ih)$$
This is obtained using a piecewise constant function $\tilde{f}$ that matches $f$ at the left endpoint of each subinterval
Newton-Côtes Quadrature
◮ Alternatively, the approximating function could agree with the integrand at the right or middle point of each subinterval
$$\hat{I}_R = h \sum_{i=1}^{n} f(a + ih), \qquad \hat{I}_M = h \sum_{i=0}^{n-1} f\!\left(a + \left(i + \tfrac{1}{2}\right)h\right)$$
◮ In either case, the approximating function is a zero-order polynomial
◮ To improve the approximation, we can use the trapezoidal rule, which uses a piecewise linear function that agrees with $f(x)$ at both ends of each subinterval
$$\hat{I}_T = \frac{h}{2} f(a) + h \sum_{i=1}^{n-1} f(x_i) + \frac{h}{2} f(b)$$
Newton-Côtes Quadrature
◮ We can further improve the approximation by using higher order polynomials
◮ Simpson's rule uses a quadratic approximation over each subinterval
$$\int_{x_i}^{x_{i+1}} f(x)\,dx \approx \frac{x_{i+1} - x_i}{6} \left[ f(x_i) + 4 f\!\left(\frac{x_i + x_{i+1}}{2}\right) + f(x_{i+1}) \right]$$
◮ In general, we can use any polynomial of degree $k$
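The composite rules above are straightforward to implement. Below is a minimal Python sketch (the test integrand and grid sizes are illustrative choices, not from the slides) that approximates $\int_0^1 e^x\,dx$ with the left Riemann, trapezoidal, and Simpson rules.

```python
import numpy as np

def riemann_left(f, a, b, n):
    """Left Riemann rule: piecewise constant approximation."""
    h = (b - a) / n
    x = a + h * np.arange(n)          # left endpoints of the n subintervals
    return h * np.sum(f(x))

def trapezoid(f, a, b, n):
    """Trapezoidal rule: piecewise linear approximation."""
    h = (b - a) / n
    x = a + h * np.arange(n + 1)      # all n + 1 grid points
    return h * (0.5 * f(x[0]) + np.sum(f(x[1:-1])) + 0.5 * f(x[-1]))

def simpson(f, a, b, n):
    """Simpson's rule: quadratic approximation on each subinterval."""
    h = (b - a) / n
    x = a + h * np.arange(n + 1)
    mid = 0.5 * (x[:-1] + x[1:])      # midpoints of each subinterval
    return (h / 6) * np.sum(f(x[:-1]) + 4 * f(mid) + f(x[1:]))

# Example: integrate exp(x) on [0, 1]; the exact value is e - 1
f, exact = np.exp, np.e - 1
for rule in (riemann_left, trapezoid, simpson):
    print(rule.__name__, abs(rule(f, 0, 1, 100) - exact))
```

With 100 subintervals the errors shrink rapidly from the left Riemann rule to Simpson's rule, reflecting the increasing polynomial order.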
Gaussian Quadrature
◮ Newton-Côtes rules require equally spaced grids
◮ With a suitably flexible choice of $n + 1$ nodes, $x_0, x_1, \ldots, x_n$, and corresponding weights, $A_0, A_1, \ldots, A_n$,
$$\sum_{i=0}^{n} A_i f(x_i)$$
gives the exact integral for all polynomials of degree less than or equal to $2n + 1$
◮ This is called Gaussian quadrature, which is especially useful for integrals of the form $\int_a^b f(x) w(x)\,dx$, where $w(x)$ is a nonnegative function and $\int_a^b x^k w(x)\,dx < \infty$ for all $k \geq 0$
Orthogonal Functions
◮ In general, for square integrable functions,
$$\int_a^b f(x)^2 w(x)\,dx < \infty$$
denoted as $f \in L^2_{w,[a,b]}$, we define the inner product as
$$\langle f, g \rangle_{w,[a,b]} = \int_a^b f(x) g(x) w(x)\,dx$$
where $f, g \in L^2_{w,[a,b]}$
◮ We say two functions $f$ and $g$ are orthogonal if $\langle f, g \rangle_{w,[a,b]} = 0$. If $f$ and $g$ are also scaled so that $\langle f, f \rangle_{w,[a,b]} = 1$ and $\langle g, g \rangle_{w,[a,b]} = 1$, then $f$ and $g$ are orthonormal
Orthogonal Polynomials
◮ We can define a sequence of orthogonal polynomials by a recursive rule
$$T_{k+1}(x) = (\alpha_{k+1} + \beta_{k+1} x)\, T_k(x) - \gamma_{k+1}\, T_{k-1}(x)$$
◮ Example: Chebyshev polynomials (first kind)
$$T_0(x) = 1, \quad T_1(x) = x, \quad T_{n+1}(x) = 2x\, T_n(x) - T_{n-1}(x)$$
◮ $T_n(x)$ are orthogonal with respect to $w(x) = \frac{1}{\sqrt{1 - x^2}}$ on $[-1, 1]$:
$$\int_{-1}^{1} T_n(x)\, T_m(x)\, \frac{1}{\sqrt{1 - x^2}}\,dx = 0, \quad \forall\, n \neq m$$
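As a quick illustration (not part of the original slides), the three-term recursion above is easy to code directly; here it is checked against the closed form $T_n(\cos\theta) = \cos(n\theta)$:

```python
import numpy as np

def chebyshev_T(n, x):
    """Evaluate the first-kind Chebyshev polynomial T_n(x) by the recursion
    T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x)."""
    T_prev, T_curr = np.ones_like(x), x      # T_0 = 1, T_1 = x
    if n == 0:
        return T_prev
    for _ in range(1, n):
        T_prev, T_curr = T_curr, 2 * x * T_curr - T_prev
    return T_curr

# Check against the identity T_n(cos(theta)) = cos(n * theta)
x = np.linspace(-1, 1, 201)
for n in range(6):
    assert np.allclose(chebyshev_T(n, x), np.cos(n * np.arccos(x)))
print("recursion matches cos(n arccos x) for n = 0, ..., 5")
```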
Orthogonal Polynomials
◮ In general, orthogonal polynomials are not unique, since $\langle f, g \rangle = 0$ implies $\langle cf, dg \rangle = 0$
◮ To make the orthogonal polynomials unique, we can use one of the following standardizations
  ◮ make the polynomials orthonormal: $\langle f, f \rangle = 1$
  ◮ set the leading coefficient of $T_j(x)$ to 1
◮ Orthogonal polynomials form a basis for $L^2_{w,[a,b]}$, so any function in this space can be written as
$$f(x) = \sum_{n=0}^{\infty} a_n T_n(x), \quad \text{where } a_n = \frac{\langle f, T_n \rangle}{\langle T_n, T_n \rangle}$$
Gaussian Quadrature
◮ Let $\{T_n(x)\}_{n=0}^{\infty}$ be a sequence of orthogonal polynomials with respect to $w$ on $[a, b]$.
◮ Denote the $n + 1$ roots of $T_{n+1}(x)$ by $a < x_0 < x_1 < \ldots < x_n < b$.
◮ We can find weights $A_0, A_1, \ldots, A_n$ such that
$$\int_a^b P(x) w(x)\,dx = \sum_{i=0}^{n} A_i P(x_i), \quad \forall\, \deg(P) \leq 2n + 1$$
◮ To do that, we first show: there exist weights $A_0, A_1, \ldots, A_n$ such that
$$\int_a^b P(x) w(x)\,dx = \sum_{i=0}^{n} A_i P(x_i), \quad \forall\, \deg(P) < n + 1$$
Gaussian Quadrature
◮ Sketch of proof. We only need to satisfy
$$\int_a^b x^k w(x)\,dx = \sum_{i=0}^{n} A_i x_i^k, \quad \forall\, k = 0, 1, \ldots, n$$
This leads to a system of linear equations
$$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_0 & x_1 & \cdots & x_n \\ \vdots & \vdots & & \vdots \\ x_0^n & x_1^n & \cdots & x_n^n \end{pmatrix} \begin{pmatrix} A_0 \\ A_1 \\ \vdots \\ A_n \end{pmatrix} = \begin{pmatrix} I_0 \\ I_1 \\ \vdots \\ I_n \end{pmatrix}$$
where $I_k = \int_a^b x^k w(x)\,dx$. The determinant of the coefficient matrix is a Vandermonde determinant, and is non-zero since $x_i \neq x_j$ for all $i \neq j$
Gaussian Quadrature
◮ Now we show that the above Gaussian quadrature is exact for polynomials of degree $\leq 2n + 1$
◮ Let $P(x)$ be a polynomial with $\deg(P) \leq 2n + 1$. There exist polynomials $g(x)$ and $r(x)$ such that
$$P(x) = g(x)\, T_{n+1}(x) + r(x), \quad \deg(g) \leq n, \ \deg(r) \leq n$$
Since $\deg(g) \leq n$, $g$ is orthogonal to $T_{n+1}$, so $\int_a^b g(x) T_{n+1}(x) w(x)\,dx = 0$; moreover, $T_{n+1}(x_i) = 0$ implies $P(x_i) = r(x_i)$. Therefore,
$$\int_a^b P(x) w(x)\,dx = \int_a^b r(x) w(x)\,dx = \sum_{i=0}^{n} A_i r(x_i) = \sum_{i=0}^{n} A_i P(x_i)$$
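To make the construction concrete, here is a small Python sketch (an illustration, assuming the Legendre case $w(x) = 1$ on $[-1, 1]$; the degree and test polynomial are arbitrary choices) that takes the roots of $T_{n+1}$ as nodes, solves the Vandermonde system above for the weights, and checks exactness on a polynomial of degree $2n + 1$:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

n = 4                                         # use n + 1 = 5 nodes
nodes = Legendre.basis(n + 1).roots()         # roots of the (n+1)-st Legendre polynomial

# Vandermonde system: sum_i A_i x_i^k = I_k = int_{-1}^{1} x^k dx, for k = 0, ..., n
V = np.vander(nodes, n + 1, increasing=True).T
moments = np.array([(1 - (-1) ** (k + 1)) / (k + 1) for k in range(n + 1)])
weights = np.linalg.solve(V, moments)

# Exactness check on a random polynomial of degree 2n + 1
rng = np.random.default_rng(0)
P = np.polynomial.Polynomial(rng.normal(size=2 * n + 2))
exact = P.integ()(1) - P.integ()(-1)          # exact integral over [-1, 1]
approx = np.sum(weights * P(nodes))
print(abs(exact - approx))                    # near machine precision
```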
Monte Carlo Method
◮ We now discuss the Monte Carlo method mainly in the context of statistical inference
◮ As before, suppose we are interested in estimating
$$I(h) = \int_a^b h(x)\,dx$$
◮ If we can draw iid samples $x^{(1)}, x^{(2)}, \ldots, x^{(n)}$ uniformly from $(a, b)$, we can approximate the integral as
$$\hat{I}_n = (b - a)\, \frac{1}{n} \sum_{i=1}^{n} h(x^{(i)})$$
◮ Note that we can think of the integral as
$$\int_a^b (b - a)\, h(x) \cdot \frac{1}{b - a}\,dx$$
where $\frac{1}{b - a}$ is the density of $\mathrm{Uniform}(a, b)$
Monte Carlo Method
◮ In general, we are interested in integrals of the form $\int_{\mathcal{X}} h(x) f(x)\,dx$, where $f(x)$ is a probability density function
◮ Analogous to the above argument, we can approximate this integral (or expectation) by drawing iid samples $x^{(1)}, x^{(2)}, \ldots, x^{(n)}$ from the density $f(x)$ and then
$$\hat{I}_n = \frac{1}{n} \sum_{i=1}^{n} h(x^{(i)})$$
◮ Based on the law of large numbers, we know that
$$\hat{I}_n \xrightarrow{\;p\;} I \quad \text{as } n \to \infty$$
◮ And based on the central limit theorem
$$\sqrt{n}\,(\hat{I}_n - I) \to N(0, \sigma^2), \quad \sigma^2 = \mathrm{Var}(h(X))$$
Example: estimating π
◮ Let $h(x) = \mathbf{1}_{B(0,1)}(x)$; then $\pi = 4 \int_{[-1,1]^2} h(x) \cdot \frac{1}{4}\,dx$
◮ Monte Carlo estimate of π
$$\hat{I}_n = \frac{4}{n} \sum_{i=1}^{n} \mathbf{1}_{B(0,1)}(x^{(i)}), \quad x^{(i)} \sim \mathrm{Uniform}([-1, 1]^2)$$
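A minimal Python sketch of this estimator (the sample size and seed are illustrative), including a CLT-based standard error computed from the same samples:

```python
import numpy as np

rng = np.random.default_rng(2019)
n = 100_000

# Draw points uniformly on [-1, 1]^2 and test whether they fall in the unit disk
x = rng.uniform(-1, 1, size=(n, 2))
inside = np.sum(x ** 2, axis=1) <= 1.0

h = 4.0 * inside                      # h(x^{(i)}) scaled so its sample mean estimates pi
pi_hat = h.mean()
se = h.std(ddof=1) / np.sqrt(n)       # CLT-based standard error

print(f"pi_hat = {pi_hat:.4f} +/- {1.96 * se:.4f}")
```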
Monte Carlo vs Quadrature
◮ Convergence rate for Monte Carlo: $O(n^{-1/2})$
$$p\left( |\hat{I}_n - I| \leq \frac{\sigma}{\sqrt{n\delta}} \right) \geq 1 - \delta, \quad \forall\, \delta$$
often slower than quadrature methods ($O(n^{-2})$ or better)
◮ However, the convergence rate of Monte Carlo does not depend on the dimensionality
◮ On the other hand, quadrature methods are difficult to extend to multidimensional problems because of the curse of dimensionality: the actual convergence rate becomes $O(n^{-k/d})$ for an order-$k$ method in dimension $d$
◮ This makes the Monte Carlo strategy very attractive for high dimensional problems
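The $O(n^{-1/2})$ rate can be seen empirically. The sketch below (an illustration using the π estimator from the previous slide; sample sizes and repetition count are arbitrary choices) fits the slope of log-error against log-sample-size, which should come out close to $-1/2$:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_pi(n):
    """One Monte Carlo estimate of pi with n uniform points on [-1, 1]^2."""
    x = rng.uniform(-1, 1, size=(n, 2))
    return 4.0 * np.mean(np.sum(x ** 2, axis=1) <= 1.0)

ns = np.array([10 ** k for k in range(2, 7)])
# Average the absolute error over repeated runs to smooth out randomness
errs = np.array([np.mean([abs(mc_pi(n) - np.pi) for _ in range(20)]) for n in ns])

# Fit the slope of log(error) vs log(n); expected to be close to -1/2
slope = np.polyfit(np.log(ns), np.log(errs), 1)[0]
print(f"estimated error decay rate: n^{slope:.2f}")
```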
Exact Simulation
◮ Monte Carlo methods require sampling a set of points chosen randomly from a probability distribution
◮ For a simple distribution $f(x)$ whose inverse cumulative distribution function (CDF) exists, we can sample $x$ from $f$ as follows
$$x = F^{-1}(u), \quad u \sim \mathrm{Uniform}(0, 1)$$
where $F^{-1}$ is the inverse CDF of $f$
◮ Proof. $p(a \leq x \leq b) = p(F(a) \leq u \leq F(b)) = F(b) - F(a)$
Examples
◮ Exponential distribution: $f(x) = \theta \exp(-\theta x)$. The CDF is
$$F(a) = \int_0^a \theta \exp(-\theta x)\,dx = 1 - \exp(-\theta a)$$
therefore, $x = F^{-1}(u) = -\frac{1}{\theta} \log(1 - u) \sim f(x)$. Since $1 - u$ also follows the uniform distribution, we often use $x = -\frac{1}{\theta} \log(u)$ instead
◮ Normal distribution: $f(x) = \frac{1}{\sqrt{2\pi}} \exp(-\frac{x^2}{2})$. Box-Muller transform:
$$X = \sqrt{-2 \log U_1}\, \cos(2\pi U_2), \qquad Y = \sqrt{-2 \log U_1}\, \sin(2\pi U_2)$$
where $U_1 \sim \mathrm{Uniform}(0, 1)$, $U_2 \sim \mathrm{Uniform}(0, 1)$
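A minimal Python sketch of both samplers (the rate θ, sample size, and seed are illustrative choices), with sample moments as a quick sanity check:

```python
import numpy as np

rng = np.random.default_rng(42)
n, theta = 100_000, 2.0

# Exponential(theta) via the inverse CDF: x = -log(u) / theta
u = rng.uniform(size=n)
x_exp = -np.log(u) / theta
print("exponential sample mean:", x_exp.mean(), "vs 1/theta =", 1 / theta)

# Standard normal pairs via the Box-Muller transform
u1, u2 = rng.uniform(size=n), rng.uniform(size=n)
r = np.sqrt(-2.0 * np.log(u1))
x, y = r * np.cos(2 * np.pi * u2), r * np.sin(2 * np.pi * u2)
print("normal sample mean and variance:", x.mean(), x.var())
```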