On Stein's Method for Infinite Dimensional Gaussian Approximation
Hsin-Hung Shih
Department of Applied Mathematics, National University of Kaohsiung, Kaohsiung, Taiwan
Joint work with Y.-J. Lee
Contents
• Motivation
• Stein's lemma for abstract Wiener measures
• Stein's equation for abstract Wiener measures
• Application: A basic central limit theorem
Part I. Motivation

• In 1972, Charles Stein introduced a powerful method, now known as Stein's method, for estimating the distance from a probability distribution on R to a Gaussian distribution.

An Outline of Stein's Method

[Step 1.] Stein began with the following observation.

Stein's Lemma: Let Z be a real-valued random variable. Then Z has a standard normal distribution if and only if

E[f'(Z)] = E[Z f(Z)]

for all continuous and piecewise continuously differentiable functions f : R → R with E[|f'(Z)|] < +∞.

Thus the operator A defined by A f(x) = f'(x) − x f(x) is a characterizing operator: X ∼ Z if and only if E[A f(X)] = 0 holds for all smooth functions f.
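As a quick sanity check of Step 1 (not part of the original slides), the characterizing identity can be verified by simulation; the test function f(x) = sin x below is an arbitrary illustrative choice.

```python
# Monte Carlo check of Stein's lemma: for Z ~ N(0,1) and a smooth f,
# E[f'(Z)] and E[Z f(Z)] should agree.  f(x) = sin(x) is an arbitrary choice.
import numpy as np

rng = np.random.default_rng(0)
Z = rng.standard_normal(1_000_000)

f, f_prime = np.sin, np.cos

lhs = f_prime(Z).mean()    # Monte Carlo estimate of E[f'(Z)]
rhs = (Z * f(Z)).mean()    # Monte Carlo estimate of E[Z f(Z)]
print(lhs, rhs)            # both are close to exp(-1/2) ≈ 0.6065
```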
[Step 2.] Construct the Stein equation

A f(x) = f'(x) − x f(x) = h(x) − E[h(Z)]

for any bounded function h on R, and find a solution. Stein observed that the function

f_h(x) = e^{x^2/2} ∫_{−∞}^{x} { h(t) − E[h(Z)] } e^{−t^2/2} dt

satisfies such an equation.

[Step 3.] Then, for any class H of (bounded) test functions h, it follows that

d_H(W, Z) ≡ sup_{h ∈ H} |E[h(W)] − E[h(Z)]| = sup_{h ∈ H} |E[f_h'(W) − W f_h(W)]|.

Remark 1.
• When H = { h : R → R ; ‖h‖_Lip = sup_{x ≠ y} |h(x) − h(y)| / |x − y| ≤ 1 }, d_H is called the Wasserstein distance.
• When H = { 1_{(−∞, z]} ; z ∈ R }, d_H is called the Kolmogorov distance.

Reference. C. Stein, Approximate Computation of Expectations, IMS Lecture Notes, Monograph Series, vol. 7, Institute of Mathematical Statistics, Hayward, CA, 1986.
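The following numerical sketch (not from the slides) checks that f_h solves the Stein equation; the Lipschitz test function h(x) = |x| and the evaluation points are arbitrary illustrative choices.

```python
# Numerical check that
#   f_h(x) = e^{x^2/2} * int_{-inf}^{x} (h(t) - E[h(Z)]) e^{-t^2/2} dt
# solves f'(x) - x f(x) = h(x) - E[h(Z)].  h(x) = |x| is an arbitrary Lipschitz choice.
import numpy as np
from scipy.integrate import quad

h = np.abs
Eh = np.sqrt(2.0 / np.pi)                        # E|Z| for Z ~ N(0,1)

def f_h(x):
    integrand = lambda t: (h(t) - Eh) * np.exp(-t**2 / 2.0)
    val, _ = quad(integrand, -np.inf, x)
    return np.exp(x**2 / 2.0) * val

for x in (-1.5, 0.0, 0.7, 2.0):
    eps = 1e-4
    lhs = (f_h(x + eps) - f_h(x - eps)) / (2 * eps) - x * f_h(x)   # f_h'(x) - x f_h(x)
    print(x, lhs, h(x) - Eh)                                       # the last two columns agree
```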
• In order to extend Stein's method to Wiener process or other Markov process approximations, A. D. Barbour introduced the generator method.

An Outline of Barbour's Generator Method

[Step 1.] If µ is the stationary distribution of the Markov process, then X ∼ µ if and only if E[A f(X)] = 0 for all real-valued functions f for which A f is defined, where A is the infinitesimal generator of this Markov process.

[Step 2.] Since

T_t h − h = A ( ∫_0^t T_u h du ),

where T_t f(x) = E[f(X_t) | X(0) = x] is the transition operator of the Markov process, formally taking limits yields

∫ h dµ − h = A ( ∫_0^∞ T_u h du ),

if the right-hand side exists.

Remark 2. Barbour's generator method gives both a Stein equation

A f(x) = h(x) − ∫ h dµ

and a candidate for its solution

f = − ∫_0^∞ T_u h du.
Example 3. The operator

A h(x) = h''(x) − x h'(x)

is the generator of the Ornstein–Uhlenbeck process with stationary distribution µ ∼ N(0, 1). Putting f = h' gives the classical Stein characterization of N(0, 1).

Reference.
1. A. D. Barbour, Stein's method and Poisson process convergence, J. Appl. Probab. 25(A) (1988), 175–184.
2. A. D. Barbour, Stein's method for diffusion approximations, Probab. Theory Related Fields 84 (1990), 297–322.
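As a worked illustration of Remark 2 with this generator (added here for concreteness; not on the original slides), take h(x) = x^2. The Ornstein–Uhlenbeck semigroup gives

T_u h(x) = E[ h( e^{−u} x + √(1 − e^{−2u}) Z ) ] = e^{−2u} x^2 + (1 − e^{−2u}),   ∫ h dµ = 1,

so the candidate solution (with the constant ∫ h dµ subtracted so that the integral converges) is

f(x) = − ∫_0^∞ ( T_u h(x) − 1 ) du = − (x^2 − 1) / 2,

and indeed A f(x) = f''(x) − x f'(x) = −1 + x^2 = h(x) − ∫ h dµ.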
Part II. Stein's Lemma for Abstract Wiener Measures

Review of Abstract Wiener Space

• L. Gross introduced the notion of abstract Wiener space in the following paper:

L. Gross, Abstract Wiener spaces, Proc. 5th Berkeley Symp. Math. Statist. Probab., vol. 2 (1965), 31–42.

• Let H be a given real separable Hilbert space with the norm |·|_H induced by the inner product ⟨·,·⟩, and let ‖·‖ be another norm on H which is weaker than the |·|_H-norm.

• Let µ be the Gauss cylinder set measure on H, i.e., the non-negative set function defined on the collection of cylinder sets such that if E = { x ∈ H ; Px ∈ F }, then

µ(E) = (1/√(2π))^n ∫_F e^{−|x|^2/2} dx,

where n = dim PH and dx is the Lebesgue measure on PH.

• Let F be the partially ordered set of finite-dimensional orthogonal projections P of H, where P > Q means Q(H) ⊂ P(H) for P, Q ∈ F.

• If the ‖·‖-norm is measurable on H, i.e., for every ε > 0 there exists P_0 ∈ F such that

µ{ ‖Px‖ > ε } < ε   for all P ⊥ P_0, P ∈ F,

then the triple (i, H, B) is called an abstract Wiener space, where B is the completion of H with respect to the ‖·‖-norm and i is the canonical embedding of H into B.
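To make the measurability condition concrete, here is a hedged simulation sketch (not from the slides). It uses the classical Wiener space: H is the Cameron–Martin space over [0,1], the weaker norm is the sup norm, and e_k(s) = √2 sin((k − 1/2)πs)/((k − 1/2)π) is an orthonormal basis of H; the basis, the value of ε, and the index choices are all illustrative assumptions. For projections P onto span{e_{n0+1}, ..., e_{n0+m}}, which are orthogonal to the span P_0 of the first n0 basis vectors, the cylinder measure of {‖Px‖_sup > ε} is estimated by Monte Carlo and is seen to become small as n0 grows.

```python
# Illustration of Gross's measurable-norm condition on classical Wiener space.
# Under the Gauss cylinder set measure, the projection onto span{e_{n0+1},...,e_{n0+m}}
# of a random point is sum_k Z_k e_k with Z_k i.i.d. N(0,1); we estimate
# mu{ sup_s |P x(s)| > eps } and watch it shrink as n0 grows.
import numpy as np

rng = np.random.default_rng(3)
s = np.linspace(0.0, 1.0, 1000)      # time grid on [0,1]
eps, m, n_mc = 0.1, 50, 5_000        # threshold, projection dimension, sample size

def tail_probability(n0):
    ks = np.arange(n0 + 1, n0 + m + 1)
    freqs = (ks - 0.5) * np.pi
    basis = np.sqrt(2.0) * np.sin(np.outer(s, freqs)) / freqs   # e_k sampled on the grid
    Z = rng.standard_normal((n_mc, m))
    paths = Z @ basis.T                                         # samples of P x
    return np.mean(np.max(np.abs(paths), axis=1) > eps)

for n0 in (0, 10, 100, 500):
    print(n0, tail_probability(n0))   # the estimated probability decreases towards 0
```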
• As H is identified with a dense subspace of B, we identify the dual space B* of B with a dense subspace of H* ≈ H ⊂ B under the adjoint operator i* of i in the following way: for x ∈ H and η ∈ B*,

⟨x, i*(η)⟩ = (i(x), η),

where (·,·) is the B–B* pairing.

Fact. B carries a probability measure p_t on B(B) such that for any η ∈ B*,

∫_B e^{i(x, η)} p_t(dx) = e^{−(t/2) |η|_H^2}.

p_t is called the abstract Wiener measure on B with variance parameter t > 0.

Thus (·, η) is a random variable over (B, p_t) with mean 0 and variance t|η|_H^2. For any h ∈ H, let {η_n} be a sequence in B* such that |η_n − h|_H → 0 as n → ∞. Then {(·, η_n)} forms a Cauchy sequence in L^2(B, p_t), the L^2(B, p_t)-limit of which is denoted by h̃. One notes that h̃ is independent of the choice of {η_n} and h̃ is distributed by the law of N(0, t|h|_H^2).

Reference. H.-H. Kuo, Gaussian Measures in Banach Spaces, Lect. Notes in Math., vol. 463, Springer-Verlag, Berlin/New York, 1975.
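A minimal Monte Carlo check of the characteristic functional above (not from the slides). Purely for illustration it takes the finite-dimensional case H = B = R^d, where p_t is simply N(0, t I_d); the dimension, variance parameter, and η below are arbitrary assumptions.

```python
# Monte Carlo check of  int_B exp(i(x, eta)) p_t(dx) = exp(-(t/2) |eta|_H^2)
# in the finite-dimensional case H = B = R^d, where p_t = N(0, t I_d).
import numpy as np

rng = np.random.default_rng(2)
d, t = 5, 0.8
eta = rng.normal(size=d)                                # an arbitrary element of B* = R^d

x = rng.normal(scale=np.sqrt(t), size=(1_000_000, d))   # samples from p_t
lhs = np.mean(np.exp(1j * (x @ eta)))                   # Monte Carlo estimate of the left side
rhs = np.exp(-0.5 * t * (eta @ eta))

print(lhs.real, rhs)    # the real parts agree; the imaginary part of lhs is ≈ 0
```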
Stein's Lemma for Abstract Wiener Measures

• B is a real separable Banach space with the ‖·‖-norm;

• Z is a fixed Gaussian B-valued random variable with mean 0, which means that the distribution µ_Z ≡ P({Z ∈ ·}) of Z is a probability measure on (B, B(B)) such that for each η ∈ B*, the random variable (·, η) has a normal distribution with mean 0 with respect to µ_Z, where (·,·) is the B–B* pairing.

Without loss of generality, we may assume that µ_Z is non-degenerate, that is, every non-empty open subset of B has positive µ_Z-measure. If not, we replace B by the support of µ_Z.

Kuelbs' Theorem. Suppose µ is a Gaussian measure on a real separable Banach space B. Assume that every non-empty open subset of B has positive µ-measure. Then there exists a real separable Hilbert space H such that (i, H, B) is an abstract Wiener space and µ equals p_1.

By Kuelbs' theorem, there exists a real separable Hilbert space H such that (i, H, B) is an abstract Wiener space and µ_Z is the abstract Wiener measure on B with variance parameter 1.

Now, following the basic idea of Barbour's generator method, we construct our infinite-dimensional setting for Gaussian approximation as follows.

[The first step] Look for a characterizing operator A_Z for µ_Z. Such an operator is defined on a sufficiently large class D of complex-valued functions on B such that a B-valued random variable Y has the same distribution as Z if and only if E[A_Z f(Y)] = 0 for all f belonging to D.
For each t ≥ 0, let O_t be the mapping from B × B(B) into [0, 1] given by

O_t(x, E) ≡ p_{1−e^{−2t}}(E − e^{−t} x) = ∫_B 1_E( e^{−t} x + √(1 − e^{−2t}) y ) µ_Z(dy).

Then, for each E ∈ B(B), the mapping x ∈ B ↦ O_t(x, E) is B(B)-measurable, and, for each x ∈ B, {O_t(x, ·); t ≥ 0} is a family of probability measures on B(B) satisfying the Chapman–Kolmogorov equations:

∫_B O_s(y, E) O_t(x, dy) = O_{s+t}(x, E),   for all s, t ≥ 0.

Thus {O_t(·,·); t ≥ 0} forms a temporally homogeneous Markov transition family. It is well known that there exist a probability measure Λ_a on a probability space (Ω, F) and a B-valued process Θ = {Θ(t); t ≥ 0} on that space which is a temporally homogeneous Markov process such that Λ_a({Θ(t) ∈ dy}) = O_t(a, dy) and the transition probability satisfies

Λ_a(Θ(t) ∈ dy | Θ(s)) = O_{t−s}(Θ(s), dy),   for all 0 ≤ s ≤ t.

We call such a process Θ a canonical B-valued Ornstein–Uhlenbeck process starting at the point a.

[The second step] For any h ∈ X_0 (X_0 is some space of test functions), find a function f_h belonging to D solving the so-called Stein equation

A_Z f = h − E[h(Z)].

For each B(B)-measurable function f and t ≥ 0, let T_t f(x) = E[f(Θ(t)) | Θ(0) = x], x ∈ B, with respect to Λ_a. Then

T_t f(x) ≡ ∫_B f(y) O_t(x, dy) = ∫_B f( e^{−t} x + √(1 − e^{−2t}) y ) µ_Z(dy),   t ≥ 0.
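The Mehler-type formula above can be illustrated numerically. The sketch below (not from the slides) takes the classical Wiener space B = C_0[0,1], samples µ_Z as discretized Brownian paths, and uses the arbitrary test functional f(w) = max_t |w(t)|; it checks that µ_Z is invariant for the family O_t, i.e. one O_t step started from µ_Z again has law µ_Z.

```python
# Monte Carlo illustration of the Ornstein-Uhlenbeck transition kernel
#   O_t(x, E) = int_B 1_E(e^{-t} x + sqrt(1 - e^{-2t}) y) mu_Z(dy)
# on classical Wiener space: if X ~ mu_Z and Y ~ mu_Z are independent, then
# e^{-t} X + sqrt(1 - e^{-2t}) Y ~ mu_Z, so E[f] is unchanged by one O_t step.
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_grid = 50_000, 100
dt = 1.0 / n_grid

def sample_mu_Z():
    """Discretized Brownian paths on [0,1]: one row per sampled element of B."""
    return np.cumsum(rng.standard_normal((n_paths, n_grid)) * np.sqrt(dt), axis=1)

f = lambda w: np.max(np.abs(w), axis=1)     # test functional on paths (sup norm)

X, Y, Z = sample_mu_Z(), sample_mu_Z(), sample_mu_Z()
t = 0.7
moved = np.exp(-t) * X + np.sqrt(1.0 - np.exp(-2.0 * t)) * Y   # one O_t step from X ~ mu_Z

print(f(moved).mean(), f(Z).mean())   # the two estimates agree up to Monte Carlo error
```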