CS70: Lecture 33. WLLN, Confidence Intervals (CI): Chebyshev vs. CLT
1. Review: Inequalities: Markov, Chebyshev
2. Law of Large Numbers
3. Review: CLT
4. Confidence Intervals: Chebyshev vs. CLT
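Inequalities: An Overview
[Figure: plots of the probabilities $p_n$ against $n$, with panels labeled Distribution, Markov, and Chebyshev. The panels mark the mean $\mu$, the threshold $a$ with the tail $\Pr[X > a]$, and the two-sided tail $\Pr[|X - \mu| > \varepsilon]$.]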
Fraction of H's
Here is a classical application of Chebyshev's inequality: how likely is it that the fraction of H's differs from 50%?
Let $X_m = 1$ if the $m$-th flip of a fair coin is H and $X_m = 0$ otherwise. Define
$$M_n = \frac{X_1 + \cdots + X_n}{n}, \quad \text{for } n \ge 1.$$
We want to estimate
$$\Pr[|M_n - 0.5| \ge 0.1] = \Pr[M_n \le 0.4 \text{ or } M_n \ge 0.6].$$
By Chebyshev,
$$\Pr[|M_n - 0.5| \ge 0.1] \le \frac{\mathrm{var}[M_n]}{(0.1)^2} = 100\,\mathrm{var}[M_n].$$
Now,
$$\mathrm{var}[M_n] = \frac{1}{n^2}\left(\mathrm{var}[X_1] + \cdots + \mathrm{var}[X_n]\right) = \frac{1}{n}\,\mathrm{var}[X_1] \le \frac{1}{4n},$$
because $\mathrm{var}[X_i] = p(1 - p) \le (0.5)(0.5) = \frac{1}{4}$.
Fraction of H's
$$M_n = \frac{X_1 + \cdots + X_n}{n}, \quad \text{for } n \ge 1.$$
$$\Pr[|M_n - 0.5| \ge 0.1] \le \frac{25}{n}.$$
For $n = 1{,}000$, this probability is at most 2.5%. As $n \to \infty$, it goes to zero.
In fact, for any $\varepsilon > 0$, as $n \to \infty$, the probability that the fraction of H's is within $\varepsilon$ of 50% approaches 1:
$$\Pr[|M_n - 0.5| \le \varepsilon] \to 1.$$
This is an example of the (Weak) Law of Large Numbers. We look at the general case next.
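To see how loose the Chebyshev bound $25/n$ is for coin flips, it can be compared with the exact binomial probability. The following is a minimal sketch; the function names `chebyshev_bound` and `exact_tail` are illustrative and not part of the lecture.

```python
from math import comb

def chebyshev_bound(n):
    """Chebyshev bound 25/n on Pr[|M_n - 0.5| >= 0.1] for n fair flips."""
    return 25 / n

def exact_tail(n):
    """Exact Pr[|M_n - 0.5| >= 0.1], summing binomial counts for a fair coin."""
    # Integer comparisons (10k <= 4n, 10k >= 6n) avoid rounding at 0.4n and 0.6n.
    favorable = sum(comb(n, k) for k in range(n + 1)
                    if 10 * k <= 4 * n or 10 * k >= 6 * n)
    return favorable / 2 ** n

for n in (10, 100, 1000):
    print(n, chebyshev_bound(n), exact_tail(n))
```

For small $n$ the bound exceeds 1 and says nothing; for large $n$ it is valid but far above the exact value, which is why the CLT is used later to get tighter intervals.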
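Weak Law of Large Numbers
Theorem (Weak Law of Large Numbers). Let $X_1, X_2, \ldots$ be pairwise independent with the same distribution and mean $\mu$. Then, for all $\varepsilon > 0$,
$$\Pr\left[\left|\frac{X_1 + \cdots + X_n}{n} - \mu\right| \ge \varepsilon\right] \to 0, \quad \text{as } n \to \infty.$$
Proof (assuming $\mathrm{var}[X_1]$ is finite, so that Chebyshev applies): Let $M_n = \frac{X_1 + \cdots + X_n}{n}$. Then
$$\Pr[|M_n - \mu| \ge \varepsilon] \le \frac{\mathrm{var}[M_n]}{\varepsilon^2} = \frac{\mathrm{var}[X_1 + \cdots + X_n]}{n^2 \varepsilon^2} = \frac{n\,\mathrm{var}[X_1]}{n^2 \varepsilon^2} = \frac{\mathrm{var}[X_1]}{n \varepsilon^2} \to 0, \quad \text{as } n \to \infty.$$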
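The statement can be checked empirically. The sketch below uses fair die rolls (mean 3.5, variance 35/12) as an assumed example distribution; the lecture does not specify one, and the function names are hypothetical. It estimates $\Pr[|M_n - \mu| \ge \varepsilon]$ by simulation and prints it next to the bound $\mathrm{var}[X_1]/(n\varepsilon^2)$ from the proof.

```python
import random

def deviation_frequency(n, eps, trials, rng):
    """Estimate Pr[|M_n - mu| >= eps] where M_n averages n fair die rolls."""
    mu = 3.5  # mean of a fair six-sided die (assumed example distribution)
    count = 0
    for _ in range(trials):
        mean = sum(rng.randint(1, 6) for _ in range(n)) / n
        if abs(mean - mu) >= eps:
            count += 1
    return count / trials

def chebyshev_wlln_bound(n, eps, variance=35 / 12):
    """Bound var[X_1] / (n * eps^2) from the proof; 35/12 is a fair die's variance."""
    return variance / (n * eps ** 2)

rng = random.Random(0)  # fixed seed, only so that runs are reproducible
for n in (10, 100, 1000):
    print(n, deviation_frequency(n, 0.25, 200, rng), chebyshev_wlln_bound(n, 0.25))
```

The simulated frequency should shrink toward 0 as $n$ grows, and stay below the bound whenever the bound is below 1.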
Recap: Normal (Gaussian) Distribution.
For any $\mu$ and $\sigma$, a normal (aka Gaussian) random variable $Y$, which we write as $Y = N(\mu, \sigma^2)$, has pdf
$$f_Y(y) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(y - \mu)^2 / (2\sigma^2)}.$$
Standard normal has $\mu = 0$ and $\sigma = 1$.
Note: $\Pr[|Y - \mu| > 1.65\sigma] \approx 10\%$; $\Pr[|Y - \mu| > 2\sigma] \approx 5\%$.
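The two tail values quoted in the note can be computed from the complementary error function, using the identity $\Pr[|Y - \mu| > k\sigma] = \mathrm{erfc}(k/\sqrt{2})$ for a normal $Y$. A short sketch (the helper name `two_sided_tail` is illustrative):

```python
from math import erfc, sqrt

def two_sided_tail(k):
    """Pr[|Y - mu| > k * sigma] for normal Y; independent of mu and sigma."""
    return erfc(k / sqrt(2))

print(two_sided_tail(1.65))  # about 0.099
print(two_sided_tail(2.0))   # about 0.0455
```

So 10% and 5% are convenient roundings; the exact 5% cutoff is closer to $1.96\sigma$.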
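Recap: Central Limit Theorem
Central Limit Theorem. Let $X_1, X_2, \ldots$ be i.i.d. with $E[X_1] = \mu$ and $\mathrm{var}(X_1) = \sigma^2$. With $A_n = \frac{X_1 + \cdots + X_n}{n}$, define
$$S_n := \frac{A_n - \mu}{\sigma / \sqrt{n}} = \frac{X_1 + \cdots + X_n - n\mu}{\sigma \sqrt{n}}.$$
Then, $S_n \to N(0, 1)$, as $n \to \infty$. That is,
$$\Pr[S_n \le \alpha] \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\alpha} e^{-x^2/2}\, dx.$$
Check of the normalization:
$$E(S_n) = \frac{1}{\sigma / \sqrt{n}}\,(E(A_n) - \mu) = 0, \qquad \mathrm{Var}(S_n) = \frac{1}{\sigma^2 / n}\,\mathrm{Var}(A_n) = 1.$$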
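For Bernoulli($p$) variables the left-hand side $\Pr[S_n \le \alpha]$ can be computed exactly and compared with the normal limit. The sketch below assumes that choice of distribution (the theorem itself covers any i.i.d. sequence with finite variance), and the helper names are hypothetical. The binomial pmf is evaluated through `lgamma` to avoid overflow for large $n$.

```python
from math import erf, exp, lgamma, log, sqrt

def normal_cdf(a):
    """Standard normal CDF, the limit of Pr[S_n <= a] in the CLT."""
    return 0.5 * (1 + erf(a / sqrt(2)))

def binomial_pmf(n, k, p):
    """Pr[X_1 + ... + X_n = k] for i.i.d. Bernoulli(p), 0 < p < 1, via logs."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

def standardized_cdf(n, p, a):
    """Exact Pr[S_n <= a], S_n = (X_1 + ... + X_n - n*p) / (sigma * sqrt(n))."""
    sigma = sqrt(p * (1 - p))
    threshold = n * p + a * sigma * sqrt(n)  # S_n <= a  iff  sum <= threshold
    return sum(binomial_pmf(n, k, p) for k in range(n + 1) if k <= threshold)

for n in (10, 100, 1000):
    print(n, standardized_cdf(n, 0.5, 1.0), normal_cdf(1.0))
```

The gap between the two columns shrinks as $n$ grows, which is the convergence the theorem describes.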
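Confidence Interval (CI) for Mean: CLT
Let $X_1, X_2, \ldots$ be i.i.d. with mean $\mu$ and variance $\sigma^2$. Let
$$A_n = \frac{X_1 + \cdots + X_n}{n}.$$
The CLT states that
$$\frac{A_n - \mu}{\sigma / \sqrt{n}} = \frac{X_1 + \cdots + X_n - n\mu}{\sigma \sqrt{n}} \to N(0, 1) \quad \text{as } n \to \infty.$$
Thus, for $n \gg 1$, one has
$$\Pr\left[-2 \le \frac{A_n - \mu}{\sigma / \sqrt{n}} \le 2\right] \approx 95\%.$$
Equivalently,
$$\Pr\left[\mu \in \left[A_n - \frac{2\sigma}{\sqrt{n}},\; A_n + \frac{2\sigma}{\sqrt{n}}\right]\right] \approx 95\%.$$
That is, $\left[A_n - \frac{2\sigma}{\sqrt{n}},\; A_n + \frac{2\sigma}{\sqrt{n}}\right]$ is a 95%-CI for $\mu$.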
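The interval is straightforward to compute from data when $\sigma$ is known. A minimal sketch, with a hypothetical function name and a deliberately tiny sample so the arithmetic can be checked by hand (the approximation itself is only meaningful for large $n$):

```python
from math import sqrt

def clt_confidence_interval(samples, sigma):
    """Approximate 95% CI [A_n - 2*sigma/sqrt(n), A_n + 2*sigma/sqrt(n)] for the mean."""
    n = len(samples)
    a_n = sum(samples) / n           # sample mean A_n
    half_width = 2 * sigma / sqrt(n)
    return a_n - half_width, a_n + half_width

# A_n = 2.5 and 2*sigma/sqrt(n) = 2*1/2 = 1
print(clt_confidence_interval([1.0, 2.0, 3.0, 4.0], sigma=1.0))  # (1.5, 3.5)
```

Note that the half-width shrinks like $1/\sqrt{n}$: quadrupling the number of samples halves the interval.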
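CI for Mean: CLT vs. Chebyshev
Let $X_1, X_2, \ldots$ be i.i.d. with mean $\mu$ and variance $\sigma^2$. Let $A_n = \frac{X_1 + \cdots + X_n}{n}$.
The CLT states that
$$\frac{X_1 + \cdots + X_n - n\mu}{\sigma \sqrt{n}} \to N(0, 1) \quad \text{as } n \to \infty.$$
Also, $\left[A_n - \frac{2\sigma}{\sqrt{n}},\; A_n + \frac{2\sigma}{\sqrt{n}}\right]$ is a 95%-CI for $\mu$.
What would Chebyshev's bound give us?
$\left[A_n - \frac{4.5\sigma}{\sqrt{n}},\; A_n + \frac{4.5\sigma}{\sqrt{n}}\right]$ is a 95%-CI for $\mu$. (Why? Since $\mathrm{var}[A_n] = \sigma^2/n$, Chebyshev gives $\Pr[|A_n - \mu| \ge k\sigma/\sqrt{n}] \le 1/k^2$, and $1/k^2 \le 5\%$ requires $k \ge \sqrt{20} \approx 4.47$.)
Thus, the CLT provides a smaller confidence interval, although it holds only approximately for large $n$, whereas the Chebyshev interval is guaranteed for every $n$.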
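The two half-widths can be compared numerically. This sketch solves $1/k^2 = \delta$ for the Chebyshev multiplier and uses the lecture's multiplier 2 for the CLT at the 95% level; the function names and the parameter `delta` (the allowed error probability) are illustrative.

```python
from math import sqrt

def chebyshev_multiplier(delta):
    """Smallest k with 1/k^2 <= delta, i.e. k = 1/sqrt(delta)."""
    return 1 / sqrt(delta)

def half_widths(sigma, n, delta=0.05, clt_multiplier=2.0):
    """Return (CLT half-width, Chebyshev half-width) for the mean of n samples.

    clt_multiplier=2.0 matches the 95% level only; change it together with delta.
    """
    scale = sigma / sqrt(n)
    return clt_multiplier * scale, chebyshev_multiplier(delta) * scale

print(chebyshev_multiplier(0.05))    # about 4.47, rounded up to 4.5 on the slide
print(half_widths(sigma=1.0, n=100))
```

At the 95% level the Chebyshev interval is a bit more than twice as wide as the CLT interval, for every $n$.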
Coins and CLT.
Let $X_1, X_2, \ldots$ be i.i.d. $B(p)$. Thus, $X_1 + \cdots + X_n = B(n, p)$.
Here, $\mu = p$ and $\sigma = \sqrt{p(1 - p)}$. The CLT states that
$$\frac{X_1 + \cdots + X_n - np}{\sqrt{p(1 - p)\,n}} \to N(0, 1)$$
and $\left[A_n - \frac{2\sigma}{\sqrt{n}},\; A_n + \frac{2\sigma}{\sqrt{n}}\right]$ is a 95%-CI for $\mu$, with $A_n = (X_1 + \cdots + X_n)/n$.
Hence, $\left[A_n - \frac{2\sigma}{\sqrt{n}},\; A_n + \frac{2\sigma}{\sqrt{n}}\right]$ is a 95%-CI for $p$.
Since $\sigma \le 0.5$, $\left[A_n - \frac{2 \cdot 0.5}{\sqrt{n}},\; A_n + \frac{2 \cdot 0.5}{\sqrt{n}}\right]$ is a 95%-CI for $p$.
Thus, $\left[A_n - \frac{1}{\sqrt{n}},\; A_n + \frac{1}{\sqrt{n}}\right]$ is a 95%-CI for $p$.
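Because the final interval does not involve the unknown $\sigma$, it can be computed from the flip counts alone. A minimal sketch, with a hypothetical function name and made-up counts (520 heads in 1,000 flips) used purely as an illustration:

```python
from math import sqrt

def coin_confidence_interval(heads, n):
    """95% CI [A_n - 1/sqrt(n), A_n + 1/sqrt(n)] for p, using sigma <= 0.5."""
    a_n = heads / n            # observed fraction of heads
    half_width = 1 / sqrt(n)   # 2 * 0.5 / sqrt(n)
    return a_n - half_width, a_n + half_width

# Illustrative counts, not data from the lecture.
low, high = coin_confidence_interval(heads=520, n=1000)
print(low, high)
```

With 1,000 flips the half-width is $1/\sqrt{1000} \approx 0.032$, so the observed fraction pins down $p$ to within about 3 percentage points at the 95% level.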
Summary: Inequalities and Confidence Intervals
1. Inequalities: Markov and Chebyshev Tail Bounds
2. Weak Law of Large Numbers
3. Confidence Intervals: Chebyshev Bounds vs. CLT Approx.
4. CLT: $X_n$ i.i.d. $\Rightarrow \frac{A_n - \mu}{\sigma / \sqrt{n}} \to N(0, 1)$
5. CI: $\left[A_n - \frac{2\sigma}{\sqrt{n}},\; A_n + \frac{2\sigma}{\sqrt{n}}\right]$ = 95%-CI for $\mu$.