Interval Estimation Edwin Leuven
Interval estimation While an estimator may be unbiased or consistent, a given estimate will never equal the true value ◮ this point estimate does not give a sense of “closeness”, while ◮ the variance estimate does not give a sense of location We could try to combine both in statements like: “We are confident that θ lies somewhere between . . . and . . . ” where we would like to 1. give a specific interval, and 2. be precise about how confident we are This is the aim of a confidence interval (CI), which is a particular type of probability interval 2/36
Interval estimation Formally we define a confidence interval as follows Pr(ˆ L < θ < ˆ U ) = 1 − α where we construct estimates of a lower bound ˆ L and an upper bound ˆ U such that the interval [ˆ L , ˆ U ] covers the parameter of interest with probability 1 − α We call 1 − α the confidence level CI’s are random intervals because they differ across random samples The confidence level is thus a probability relative to the sampling distribution! 3/36
Probability intervals We will consider probability intervals for continuous r.v. X Pr( a < X < b ) with density f ( x ) and support on the real line this equals � b Pr( a < X < b ) = f ( x ) dx a � b � a = f ( x ) dx − f ( x ) dx −∞ −∞ = F ( b ) − F ( a ) Note that since X is continuous Pr( X = x ) = 0 and Pr( a < X < b ) = Pr( a ≤ X < b ) = Pr( a < X ≤ b ) = Pr( a ≤ X ≤ b ) 4/36
Probability intervals ( X ∼ χ 2 (3)) Density Area = Pr(a < X < b) = F(b) − F(a) a b X 5/36
Probability intervals We often need to compute either p = F ( a ) ≡ Pr( X ≤ a ) or a = F − 1 ( p ) In R we can do this using the pxxx and qxxx functions, f.e. for the normal distribution: pnorm (1.96) ## [1] 0.9750021 qnorm (0.975) ## [1] 1.959964 6/36
What is the probability that our estimator is close to θ ? We can write this as: Pr( | ˆ θ − θ | < ε ) = Pr( θ − ε < ˆ θ < θ + ε ) Density Area = ? θ − ε θ θ + ε ^ θ 7/36
Interval estimation The probability that our estimator is no further than ε from θ equals Pr( θ − ε < ˆ θ < θ + ε ) which is the probability that the r.v. ˆ θ is in a fixed interval with unknown boundaries Note though that we can rewrite this as follows Pr( θ − ε < ˆ θ < θ + ε ) = Pr(ˆ θ − ε < θ < ˆ θ + ε ) which is the probability that the random interval (ˆ θ − ε, ˆ θ + ε ) covers the fixed number θ How do we construct such intervals and compute their corresponding confidence levels? 8/36
CI for the mean – Normal data, variance known Let X ∼ N ( µ, σ 2 ) then ¯ X ∼ N ( µ, σ 2 / n ) Now consider taking a random sample of size n , then ¯ � � ε X − µ ε Pr( µ − ε < ¯ X < µ + ε ) = Pr − σ/ √ n < σ/ √ n < σ/ √ n ε ε � � � � σ/ √ n σ/ √ n =Φ − Φ − ε � � σ/ √ n =2Φ − 1 = 1 − α For a given confidence level 1 − α we get the following ε ε = z 1 − α/ 2 · σ/ √ n where z 1 − α/ 2 = Φ − 1 (1 − α/ 2) 9/36
CI for the mean – Normal data, variance known 1 − α 2 Φ ( x ) α 2 µ z α 2 z 1 −α 2 x 10/36
CI for the mean – Normal data, variance known Since Pr( µ − ε < ¯ X < µ + ε ) = Pr(¯ X − ε < µ < ¯ X + ε ) , the following is a (1 − α )100% CI: X − z 1 − α/ 2 · σ/ √ n , ¯ X + z 1 − α/ 2 · σ/ √ n ) (¯ For a given sample, µ will be either inside or outside this interval But before drawing the sample, there is a (1 − α )100% chance that an interval constructed this way will cover the true parameter µ 11/36
CI for the mean – Normal data, variance known For example if we set the confidence level at 1 − α = 0 . 90, then z 1 − 0 . 10 / 2 = Φ − 1 (0 . 95) = − Φ − 1 (0 . 05) ≈ 1 . 645 qnorm (.95) ## [1] 1.6448536 With n = 10 random draws from X ∼ N ( µ, 1) mean ( rnorm (10)) ## [1] -0.38315741 we get the following 90% confidence interval: √ √ ( − 0 . 38 − 1 . 645 · 1 / 10 , − 0 . 38 + 1 . 645 · 1 / 10) ≈ ( − 0 . 90 , 0 . 14) 12/36
CI for the mean – Normal data, variance known We know that we need to cover the true parameter 90% of the time: n = 10; nrep = 1e5; z = qnorm (0.95) cover = rep (F, nrep) for (i in 1 : nrep) { x = rnorm (n, 0, 1) m = mean (x); se = 1 / sqrt (n) ci0 = m - z * se; ci1 = m + z * se; cover[i] = ci0 < 0 & 0 < ci1 } mean (cover) ## [1] 0.90217 13/36
90% CI for the mean, n = 10 50 50 40 40 Sample nr. 30 30 20 20 10 10 0 0 −2 −2 −1 −1 0 0 1 1 2 2 CI 14/36
90% CI for the mean, n = 40 50 50 40 40 Sample nr. 30 30 20 20 10 10 0 0 −2 −2 −1 −1 0 0 1 1 2 2 CI 15/36
90% CI for the mean, n = 160 50 50 40 40 Sample nr. 30 30 20 20 10 10 0 0 −2 −2 −1 −1 0 0 1 1 2 2 CI 16/36
90% CI for the mean, n = 10 50 50 40 40 Sample nr. 30 30 20 20 10 10 0 0 −2 −2 −1 −1 0 0 1 1 2 2 CI 17/36
95% CI for the mean, n = 10 50 50 40 40 Sample nr. 30 30 20 20 10 10 0 0 −2 −2 −1 −1 0 0 1 1 2 2 CI 18/36
99% CI for the mean, n = 10 50 50 40 40 Sample nr. 30 30 20 20 10 10 0 0 −2 −2 −1 −1 0 0 1 1 2 2 CI 19/36
Computing sample size Suppose you plan to collect data, and you want to know the sample size you need to achieve a certain level of confidence in our interval estimate Since ε = z 1 − α/ 2 · σ/ √ n Solving for n , we obtain � 2 � z 1 − α/ 2 · σ n = ε Note that we need a larger sample ( n increases) if ◮ we require greater precision ( ε decreases) ◮ we want to be more confident ( α decreases) ◮ there is more dispersion in the population ( σ increases) 20/36
CI for the Variance – Normal data The sample variance n 1 S 2 = ( X i − ¯ � X ) 2 n − 1 i =1 is our estimator for the population variance When the X i follow a normal distribution then ( n − 1) S 2 /σ 2 follows a so-called Chi-squared distribution with n − 1 degrees of freedom: Chi-squared distribution If Z i ∼ N (0 , 1) and V = � k i =1 Z 2 i then V ∼ χ 2 ( k ) where k are the degrees of freedom. E [ V ] = k and Var( V ) = 2 k . 21/36
CI for the Variance – χ 2 distribution χ 2 ( 1 ) χ 2 ( 2 ) χ 2 ( 3 ) χ 2 ( 9 ) Density 0 2 4 6 8 X 22/36
CI for the Variance – Normal data Because the Chi-square distribution is asymmetric we need to make sure that we set the boundaries of the CI such that we have α/ 2 probability mass on each side: . 025 < ( n − 1) S 2 /σ 2 < c n − 1 Pr( c n − 1 . 975 ) = 0 . 95 where we can compute c n − 1 in R using qchisq(p,k) p We can rewrite the above as . 975 < σ 2 < ( n − 1) S 2 / c n − 1 Pr(( n − 1) S 2 / c n − 1 . 025 ) = 0 . 95 23/36
CI for the Variance – Normal data 1.0 0.8 0.6 Density 0.4 0.2 0.0 0.0 0.0 0.5 0.5 1.0 1.0 1.5 1.5 2.0 2.0 2.5 2.5 S 2 24/36
CI for the Variance – Normal data If α = . 05 we know that we need to cover the true parameter 95% of the time: n = 4; nrep = 1e5 cover = rep (F, nrep) for (i in 1 : nrep) { v = var ( rnorm (n, 1, 10)) ci0 = (n - 1) * v / qchisq (.975, n - 1) ci1 = (n - 1) * v / qchisq (.025, n - 1) cover[i] = ci0 < 100 & 100 < ci1 } mean (cover) ## [1] 0.94955 25/36
CI for the mean – Normal data, variance unknown We considered X ∼ N ( µ, σ 2 ) and assumed we knew σ 2 In practice we probably don’t know σ 2 , but we have an estimator n 1 S 2 = � ( X i − ¯ X ) 2 n − 1 i =1 A simple solution is to replace σ with S : x − z α/ 2 S / √ n , ¯ x + z 1 − α/ 2 S / √ n � � ¯ But how does this work? 26/36
CI for the mean – Normal data, variance unknown If α = . 1 we know that we need to cover the true parameter 90% of the time: n = 10; nrep = 1e5; z = qnorm (.95) cover = rep (F, nrep) for (i in 1 : nrep) { x = rnorm (n, 1, 10) m = mean (x); se = sd (x) / sqrt (n) ci0 = m - z * se; ci1 = m + z * se; cover[i] = ci0 < 1 & 1 < ci1 } mean (cover) ## [1] 0.86477 27/36
Student’s t-distribution It turns out that X − µ ) / ( S / √ n ) Z = (¯ does not follow a Normal but a so-called t-distribution with n − 1 degrees of freedom t distribution If Z ∼ N (0 , 1) and V ∼ χ 2 ( k ), Z and V are independent, and Z T = � V / k then T ∼ t ( k ) where k are the degrees of freedom. E [ T ] = k and Var( T ) = 2 k . 28/36
Student’s t-distribution N ( 0 , 1 ) = t ( ∞ ) t ( 4 ) t ( 1 ) Density x 29/36
Student’s t-distribution 5 4 t 1 −α 2 ( k ) 3 z 0.995 z 0.975 z 0.95 2 5 10 20 50 k (degrees of freedom) 30/36
CI for the mean – Normal data, variance unknown We know that we need to cover the true parameter 95% of the time: n = 10; nrep = 1e5; z = qt (.975, n - 1) cover = rep (FALSE, nrep) for (i in 1 : nrep) { x = rnorm (n, 1, 10) m = mean (x); se = sd (x) / sqrt (n) ci0 = m - z * se; ci1 = m + z * se; cover[i] = ci0 < 1 & 1 < ci1 } mean (cover) ## [1] 0.95102 31/36
Recommend
More recommend