13. The Weak Law and the Strong Law of Large Numbers

James Bernoulli proved the weak law of large numbers (WLLN) around 1700; it was published posthumously in 1713 in his treatise Ars Conjectandi. Poisson generalized Bernoulli’s theorem around 1800, and in 1866 Tchebychev discovered the method bearing his name. Later, one of his students, Markov, observed that Tchebychev’s reasoning can be used to extend Bernoulli’s theorem to dependent random variables as well. In 1909 the French mathematician Emile Borel proved a deeper theorem, known as the strong law of large numbers, that further generalizes Bernoulli’s theorem. In 1926 Kolmogorov derived conditions that were necessary and sufficient for a set of mutually independent random variables to obey the law of large numbers.

PILLAI
Let $X_i$ be independent, identically distributed Bernoulli random variables such that
$$P(X_i = 1) = p, \qquad P(X_i = 0) = 1 - p = q,$$
and let $k = X_1 + X_2 + \cdots + X_n$ represent the number of “successes” in $n$ trials. Then the weak law due to Bernoulli states that [see Theorem 3-1, page 58, Text]
$$P\left\{\left|\frac{k}{n} - p\right| > \varepsilon\right\} \le \frac{pq}{n\varepsilon^2}, \tag{13-1}$$
i.e., the ratio “total number of successes to the total number of trials” tends to $p$ in probability as $n$ increases.

A stronger version of this result due to Borel and Cantelli states that the above ratio $k/n$ tends to $p$ not only in probability, but with probability 1. This is the strong law of large numbers (SLLN).
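The bound (13-1) can be checked by simulation. The following is a minimal sketch, not part of the original lecture: the function names, the number of Monte Carlo runs and the seed are illustrative assumptions. It estimates $P\{|k/n - p| > \varepsilon\}$ by repeated coin tossing and prints it next to $pq/(n\varepsilon^2)$.

```python
import random


def chebyshev_bound(p, n, eps):
    """Right-hand side of (13-1): pq / (n * eps^2)."""
    return p * (1.0 - p) / (n * eps * eps)


def empirical_deviation_prob(p, n, eps, runs=2000, seed=0):
    """Monte Carlo estimate of P{|k/n - p| > eps}.

    `runs` and `seed` are illustrative choices, not values from the text.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(runs):
        # k = number of successes in n independent Bernoulli(p) trials
        k = sum(1 for _ in range(n) if rng.random() < p)
        if abs(k / n - p) > eps:
            exceed += 1
    return exceed / runs


if __name__ == "__main__":
    p, eps = 0.5, 0.1
    for n in (50, 200, 1000):
        est = empirical_deviation_prob(p, n, eps)
        print(n, est, chebyshev_bound(p, n, eps))
```

The estimated frequency should fall below the bound for each $n$, and both should shrink as $n$ grows. The simulation only illustrates convergence in probability for each fixed $n$; it says nothing about the behavior of a single sequence of trials for all large $n$.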
What is the difference between the weak law and the strong law?

The strong law of large numbers states that there exists a sequence $\{\varepsilon_n\}$ of positive numbers converging to zero such that
$$\sum_{n=1}^{\infty} P\left\{\left|\frac{k}{n} - p\right| \ge \varepsilon_n\right\} < \infty. \tag{13-2}$$
From the Borel-Cantelli lemma [see (2-69) Text], when (13-2) is satisfied the events
$$A_n \triangleq \left\{\left|\frac{k}{n} - p\right| \ge \varepsilon_n\right\}$$
can, with probability 1, occur only for a finite number of indices $n$ in an infinite sequence, or equivalently, the events $\{|k/n - p| < \varepsilon_n\}$ occur for all but finitely many $n$, i.e., $k/n$ converges to $p$ almost surely.

Proof: To prove (13-2), we proceed as follows. Since
$$\left|\frac{k}{n} - p\right| \ge \varepsilon \;\Rightarrow\; (k - np)^4 \ge \varepsilon^4 n^4$$
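we have
$$\sum_{k=0}^{n} (k - np)^4\, p_n(k) \;\ge\; \sum_{|k - np| \ge n\varepsilon} (k - np)^4\, p_n(k) \;\ge\; \varepsilon^4 n^4\, P\left\{\left|\frac{k}{n} - p\right| \ge \varepsilon\right\}$$
and hence
$$P\left\{\left|\frac{k}{n} - p\right| \ge \varepsilon\right\} \le \frac{\sum_{k=0}^{n} (k - np)^4\, p_n(k)}{\varepsilon^4 n^4} \tag{13-3}$$
where
$$p_n(k) = P\left\{\sum_{i=1}^{n} X_i = k\right\} = \binom{n}{k} p^k q^{n-k}.$$
By direct computation
$$\sum_{k=0}^{n} (k - np)^4\, p_n(k) = E\left\{\Big(\sum_{i=1}^{n} X_i - np\Big)^4\right\} = E\left\{\Big(\sum_{i=1}^{n} (X_i - p)\Big)^4\right\}$$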
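which, writing $Y_i = X_i - p$ so that $E(Y_i) = 0$, becomes
$$E\left\{\Big(\sum_{i=1}^{n} Y_i\Big)^4\right\} = \sum_{i=1}^{n}\sum_{k=1}^{n}\sum_{j=1}^{n}\sum_{l=1}^{n} E(Y_i Y_k Y_j Y_l)$$
$$= \sum_{i=1}^{n} E(Y_i^4) + 4\sum_{i=1}^{n}\sum_{j \ne i} E(Y_i^3)\,E(Y_j) + 3\sum_{i=1}^{n}\sum_{j \ne i} E(Y_i^2)\,E(Y_j^2)$$
$$= n(p^3 + q^3)\,pq + 3n(n-1)(pq)^2 \le [\,n + 3n(n-1)\,]\,pq \le 3n^2 pq, \tag{13-4}$$
since
$$p^3 + q^3 = (p+q)^3 - 3p^2 q - 3pq^2 < 1, \qquad pq \le 1/4 < 1.$$
The middle double sum has $4n(n-1)$ terms, each equal to zero because $E(Y_j) = 0$. The last double sum has $3n(n-1)$ terms: the index $i$ can coincide with $j$, $k$ or $l$, and the second variable then takes $(n-1)$ values.

Substituting (13-4) into (13-3) we obtain
$$P\left\{\left|\frac{k}{n} - p\right| \ge \varepsilon\right\} \le \frac{3pq}{n^2 \varepsilon^4}.$$
Let $\varepsilon = 1/n^{1/8}$ so that the above inequality reads
$$P\left\{\left|\frac{k}{n} - p\right| \ge \frac{1}{n^{1/8}}\right\} \le \frac{3pq}{n^{3/2}}$$
and hence
$$\sum_{n=1}^{\infty} P\left\{\left|\frac{k}{n} - p\right| \ge \frac{1}{n^{1/8}}\right\} \le 3pq \sum_{n=1}^{\infty} \frac{1}{n^{3/2}} \le 3pq\left(1 + \int_{1}^{\infty} x^{-3/2}\,dx\right) = 3pq\,(1 + 2) = 9pq < \infty, \tag{13-5}$$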
thus proving the strong law by exhibiting a sequence of positive numbers $\varepsilon_n = 1/n^{1/8}$ that converges to zero and satisfies (13-2).

We return to the same question: “What is the difference between the weak law and the strong law?”

The weak law states that for every $n$ that is large enough, the ratio $\left(\sum_{i=1}^{n} X_i\right)/n = k/n$ is likely to be near $p$ with a certain probability that tends to 1 as $n$ increases. However, it does not say that $k/n$ is bound to stay near $p$ if the number of trials is increased. Suppose (13-1) is satisfied for a given $\varepsilon$ in a certain number of trials $n_0$. If additional trials are conducted beyond $n_0$, the weak law does not guarantee that the new $k/n$ is bound to stay near $p$ for such trials. In fact there can be events for which $k/n > p + \varepsilon$ for $n > n_0$ in some regular manner. The probability for such an event is the sum of a large number of very small probabilities, and the weak law is unable to say anything specific about the convergence of that sum.

However, the strong law states (through (13-2)) that not only all such sums converge, but the total number of all such events
where $k/n > p + \varepsilon$ is in fact finite! This implies that the probability of the events $\{|k/n - p| > \varepsilon\}$ as $n$ increases becomes and remains small, since with probability 1 only finitely many such violations take place as $n \to \infty$.

Interestingly, it is possible to arrive at the same conclusion using a powerful bound known as Bernstein’s inequality that is based on the WLLN.

Bernstein’s inequality: Note that
$$\frac{k}{n} - p > \varepsilon \;\Rightarrow\; k > n(p + \varepsilon)$$
and for any $\lambda > 0$, this gives $e^{\lambda (k - n(p + \varepsilon))} > 1$. Thus
$$P\left\{\frac{k}{n} - p > \varepsilon\right\} = \sum_{n(p+\varepsilon) < k \le n} \binom{n}{k} p^k q^{n-k}$$
$$\le \sum_{n(p+\varepsilon) < k \le n} e^{\lambda (k - n(p+\varepsilon))} \binom{n}{k} p^k q^{n-k}$$
$$\le \sum_{k=0}^{n} e^{\lambda (k - n(p+\varepsilon))} \binom{n}{k} p^k q^{n-k}$$
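so that
$$P\left\{\frac{k}{n} - p > \varepsilon\right\} \le e^{-\lambda n \varepsilon} \sum_{k=0}^{n} \binom{n}{k} \left(p e^{\lambda q}\right)^k \left(q e^{-\lambda p}\right)^{n-k} = e^{-\lambda n \varepsilon} \left(p e^{\lambda q} + q e^{-\lambda p}\right)^n. \tag{13-6}$$
Since $e^{x} \le x + e^{x^2}$ for any real $x$,
$$p e^{\lambda q} + q e^{-\lambda p} \le p\left(\lambda q + e^{\lambda^2 q^2}\right) + q\left(-\lambda p + e^{\lambda^2 p^2}\right) = p e^{\lambda^2 q^2} + q e^{\lambda^2 p^2} \le e^{\lambda^2}. \tag{13-7}$$
Substituting (13-7) into (13-6), we get
$$P\left\{\frac{k}{n} - p > \varepsilon\right\} \le e^{\lambda^2 n - \lambda n \varepsilon}.$$
But $\lambda^2 n - \lambda n \varepsilon$ is minimum for $\lambda = \varepsilon/2$, and hence
$$P\left\{\frac{k}{n} - p > \varepsilon\right\} \le e^{-n\varepsilon^2/4}, \qquad \varepsilon > 0. \tag{13-8}$$
Similarly
$$P\left\{\frac{k}{n} - p < -\varepsilon\right\} \le e^{-n\varepsilon^2/4}$$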
and hence we obtain Bernstein’s inequality
$$P\left\{\left|\frac{k}{n} - p\right| > \varepsilon\right\} \le 2e^{-n\varepsilon^2/4}. \tag{13-9}$$
Bernstein’s inequality is more powerful than Tchebychev’s inequality, as it states that the chance of the relative frequency $k/n$ deviating from its probability $p$ by more than $\varepsilon$ tends to zero exponentially fast as $n \to \infty$.

Tchebychev’s inequality bounds the probability that $k/n$ lies between $p - \varepsilon$ and $p + \varepsilon$ for a specific $n$. We can use Bernstein’s inequality to estimate the probability for $k/n$ to lie between $p - \varepsilon$ and $p + \varepsilon$ for all large $n$. Towards this, let
$$y_n = \left\{p - \varepsilon \le \frac{k}{n} \le p + \varepsilon\right\}$$
so that
$$P(y_n^c) = P\left\{\left|\frac{k}{n} - p\right| > \varepsilon\right\} \le 2e^{-n\varepsilon^2/4}.$$
To compute the probability of the event $\bigcap_{n=m}^{\infty} y_n$, note that its complement is given by
$$\Big(\bigcap_{n=m}^{\infty} y_n\Big)^c = \bigcup_{n=m}^{\infty} y_n^c$$
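and using Eq. (2-68) Text,
$$P\Big(\bigcup_{n=m}^{\infty} y_n^c\Big) \le \sum_{n=m}^{\infty} P(y_n^c) \le \sum_{n=m}^{\infty} 2e^{-n\varepsilon^2/4} = \frac{2e^{-m\varepsilon^2/4}}{1 - e^{-\varepsilon^2/4}}.$$
This gives
$$P\Big(\bigcap_{n=m}^{\infty} y_n\Big) = 1 - P\Big(\bigcup_{n=m}^{\infty} y_n^c\Big) \ge 1 - \frac{2e^{-m\varepsilon^2/4}}{1 - e^{-\varepsilon^2/4}} \to 1 \quad \text{as } m \to \infty$$
or,
$$P\left\{p - \varepsilon \le \frac{k}{n} \le p + \varepsilon, \ \text{for all } n \ge m\right\} \to 1 \quad \text{as } m \to \infty.$$
Thus $k/n$ is bound to stay near $p$ for all large enough $n$, in probability, a conclusion already reached by the SLLN.

Discussion: Let $\varepsilon = 0.1$. Thus if we toss a fair coin 1,000 times, from the weak law
$$P\left\{\left|\frac{k}{n} - \frac{1}{2}\right| \ge 0.1\right\} \le \frac{1}{40}.$$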
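The bounds derived in this section can be compared numerically. The sketch below is an illustration under stated assumptions, not part of the original lecture: the function names are hypothetical, and the exact binomial tail is evaluated in log-space with `math.lgamma` purely for numerical convenience. It evaluates the Tchebychev bound (13-1), Bernstein’s bound (13-9), the geometric-series bound on $P\big(\bigcup_{n \ge m} y_n^c\big)$, and the exact tail probability.

```python
import math


def chebyshev_bound(p, n, eps):
    """Weak-law bound (13-1): pq / (n * eps^2)."""
    return p * (1.0 - p) / (n * eps * eps)


def bernstein_bound(n, eps):
    """Bernstein's inequality (13-9): 2 * exp(-n * eps^2 / 4)."""
    return 2.0 * math.exp(-n * eps * eps / 4.0)


def uniform_tail_bound(m, eps):
    """Bound on P{|k/n - p| > eps for some n >= m}, i.e. the sum over
    n >= m of 2 * exp(-n * eps^2 / 4), in closed (geometric-series) form."""
    r = math.exp(-eps * eps / 4.0)
    return 2.0 * r ** m / (1.0 - r)


def exact_tail(p, n, eps):
    """Exact P{|k/n - p| > eps} for k ~ Binomial(n, p), with 0 < p < 1.

    Each binomial term is computed as exp(log term) to avoid overflow
    in the binomial coefficient for large n.
    """
    log_p, log_q = math.log(p), math.log(1.0 - p)
    total = 0.0
    for k in range(n + 1):
        if abs(k / n - p) > eps:
            log_term = (math.lgamma(n + 1) - math.lgamma(k + 1)
                        - math.lgamma(n - k + 1)
                        + k * log_p + (n - k) * log_q)
            total += math.exp(log_term)
    return total


if __name__ == "__main__":
    p, eps = 0.5, 0.1
    for n in (1000, 5000, 10000):
        print(n,
              exact_tail(p, n, eps),
              chebyshev_bound(p, n, eps),
              bernstein_bound(n, eps),
              uniform_tail_bound(n, eps))
```

For the fair-coin example with $n = 1000$ and $\varepsilon = 0.1$, Bernstein’s bound $2e^{-2.5} \approx 0.164$ is actually larger than the Tchebychev bound $1/40$; the exponential bound becomes the smaller of the two only for larger $n$. Likewise, the geometric-series bound exceeds 1 at $m = 1000$ and only becomes informative for larger $m$.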