ST 380 Probability and Statistics for the Physical Sciences

Interval Estimation

Interval Estimates

A point estimate by itself provides no information about the precision and reliability of estimation. Consider, e.g., $\bar{X}$ as an estimator for $\mu$. From the point estimate alone, we have no idea how close $\bar{x}$ is to $\mu$.

An alternative to reporting a single number is to report an entire interval of plausible values, that is, an interval estimate.

Confidence Interval

For a 95% confidence interval, any value of the parameter $\theta$ in the interval is plausible at the 95% confidence level.

A confidence level of 95% implies that 95% of all samples would give an interval that includes $\theta$, and only 5% of all samples would yield an erroneous interval.

The most frequently used confidence levels are 90%, 95%, and 99%. The higher the confidence level, the more strongly we believe that the value of the parameter lies within the interval.

Basic Properties of Confidence Intervals

The basic properties of confidence intervals (CIs) are most easily introduced by first focusing on a simple, albeit somewhat unrealistic, problem.

Suppose that the parameter of interest is $\mu$, the population is normal, and the value of the standard deviation $\sigma$ is known.

The Assumptions

Normality of the population distribution is often a reasonable assumption, or at least a reasonable approximation. However, if the value of $\mu$ is unknown, it is typically implausible that the value of $\sigma$ is known.

Methods based on less restrictive assumptions will be shown later.

Recall that $\bar{X} \sim N(\mu, \sigma^2/n)$, and that

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$

So

$$\begin{aligned}
0.95 &= P(-1.96 < Z < 1.96) \\
     &= P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) \\
     &= P\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right).
\end{aligned}$$

This is a random interval for the fixed value $\mu$.

The interpretation is: "the probability is .95 that the random interval includes the true value of $\mu$."

If $\bar{x} = 80$, $n = 31$ and $\sigma = 2$, the 95% confidence interval would be

$$80.0 \pm 1.96 \times \frac{2.0}{\sqrt{31}} = (79.3, 80.7).$$

It is tempting to conclude that $\mu$ is within this (now fixed) interval with probability .95 ...
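
The arithmetic in this example can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `z_interval` is a hypothetical helper introduced for illustration, not part of the course material.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf=0.95):
    """Two-sided CI for a normal mean when sigma is known (hypothetical helper)."""
    alpha = 1.0 - conf
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}, about 1.96 for 95%
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Values from the example: xbar = 80, n = 31, sigma = 2
lower, upper = z_interval(xbar=80.0, sigma=2.0, n=31)
print(f"({lower:.1f}, {upper:.1f})")  # prints (79.3, 80.7)
```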

But $\mu$ is a constant, albeit unknown, and once we evaluate the interval, the end-points are also fixed. It is therefore incorrect to write the statement

$$P[\mu \in (79.3, 80.7)] = .95.$$

The correct interpretation is that if we repeatedly formed confidence intervals using this procedure, in the long run, 95% of them would contain the parameter $\mu$.

We might write that we are "95% confident" that $\mu$ lies within the interval.
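
The long-run interpretation can be illustrated by simulation. The sketch below assumes a normal population with known $\sigma$ and repeatedly draws samples, counting how often the interval contains $\mu$. The chosen values ($\mu = 80$, $\sigma = 2$, $n = 31$, 10,000 repetitions, the seed) and the function name `coverage` are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, conf=0.95, reps=10_000, seed=0):
    """Fraction of simulated known-sigma z-intervals that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    half_width = z * sigma / sqrt(n)  # fixed, because sigma is known
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps


rate = coverage(mu=80.0, sigma=2.0, n=31)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

Each individual interval either contains $\mu$ or it does not; only the proportion over many repetitions should be close to .95.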

For a 99% confidence interval, we would need to replace 1.96 by 2.58.

In general, a confidence level of $1 - \alpha$ is achieved by using $z_{\alpha/2}$ in place of 1.96. Recall that

$$P(Z > z_{\alpha/2}) = \alpha/2.$$
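
The critical values $z_{\alpha/2}$ for the common confidence levels can be computed from the standard normal quantile function. This is a minimal sketch using the standard library's `statistics.NormalDist`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(conf):
    """Return z_{alpha/2}, the upper alpha/2 point of N(0, 1), for level conf."""
    alpha = 1.0 - conf
    # P(Z > z) = alpha/2 is equivalent to P(Z <= z) = 1 - alpha/2
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for conf in (0.90, 0.95, 0.99):
    print(f"{conf:.0%}: z = {z_critical(conf):.3f}")
```

To three decimals the values are 1.645, 1.960 and 2.576; the last rounds to the 2.58 quoted for the 99% interval.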

Definition

A $100(1 - \alpha)\%$ confidence interval for the mean $\mu$ of a normal population, when the value of $\sigma$ is known, is given by

$$\left(\bar{x} - z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}\right).$$

We often write it more compactly as

$$\bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}.$$

Confidence Level, Precision, and Sample Size

Why settle for a 95% confidence interval when a 99% interval is available?

One issue is that the 99% interval is wider (it uses 2.58 instead of 1.96), and thus has less precision.

If we want both high confidence and high precision, we could fix both and then solve for the necessary sample size.
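
Solving for the sample size can be sketched as follows. The known-$\sigma$ interval has total width $w = 2 z_{\alpha/2}\,\sigma/\sqrt{n}$, so requiring a width of at most $w$ gives $n \ge (2 z_{\alpha/2}\,\sigma / w)^2$, rounded up to an integer. The values $\sigma = 2$ and $w = 1$ and the function name `required_n` are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def required_n(sigma, width, conf=0.95):
    """Smallest n for which the known-sigma z-interval is no wider than `width`."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    return ceil((2.0 * z * sigma / width) ** 2)


# Hypothetical target: total width 1.0 when sigma = 2.0
n95 = required_n(sigma=2.0, width=1.0, conf=0.95)
n99 = required_n(sigma=2.0, width=1.0, conf=0.99)
print(n95, n99)  # prints 62 107
```

Raising the confidence level from 95% to 99% at the same width requires a noticeably larger sample, which is the trade-off described above.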

Large Sample Confidence Intervals

Suppose as before that the parameter of interest is $\mu$, but the population is not known to be normal; we still assume for now that the value of the standard deviation $\sigma$ is known.

The Central Limit Theorem assures us that, for large $n$, $\bar{X}$ is approximately normally distributed as $N(\mu, \sigma^2/n)$, and hence

$$\bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$$

is a confidence interval for $\mu$ with a confidence level of approximately $100(1 - \alpha)\%$.

Suppose, more realistically, that $\sigma$ is also unknown.

Replacing $\sigma$ by $s$, the sample standard deviation, in the calculation of the confidence interval is an additional approximation, but it is still true that

$$\bar{x} \pm z_{\alpha/2} \times \frac{s}{\sqrt{n}}$$

is a confidence interval for $\mu$ with a confidence level of approximately $100(1 - \alpha)\%$.
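
A minimal sketch of the large-sample interval with $s$ in place of $\sigma$ follows. The data are simulated from an exponential distribution purely as a hypothetical example of a non-normal population; the sample size of 200, the seed, and the function name `large_sample_interval` are assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev


def large_sample_interval(data, conf=0.95):
    """Approximate CI for the mean: xbar +/- z_{alpha/2} * s / sqrt(n)."""
    n = len(data)
    xbar = fmean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical skewed data: exponential with rate 0.5, so the true mean is 2
rng = random.Random(1)
sample = [rng.expovariate(0.5) for _ in range(200)]
lower, upper = large_sample_interval(sample)
print(f"({lower:.2f}, {upper:.2f})")
```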

General Large Sample Case

In other situations, we may want to use an estimator $\hat{\theta}$ of some parameter $\theta$, we may know that $\hat{\theta}$ is approximately normally distributed with mean $\theta$, and we may have an estimated standard error $\hat{\sigma}_{\hat{\theta}}$ of $\hat{\theta}$.

Then

$$\hat{\theta} \pm z_{\alpha/2} \times \hat{\sigma}_{\hat{\theta}}$$

is a confidence interval for $\theta$ with a confidence level of approximately $100(1 - \alpha)\%$.
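
The general recipe needs only an estimate and its estimated standard error. The sketch below applies it to a sample proportion, which is one standard instance chosen here for illustration and not taken from the slides: with $x$ successes in $n$ trials, $\hat{p} = x/n$ has estimated standard error $\sqrt{\hat{p}(1 - \hat{p})/n}$. The function name `general_interval` and the counts (40 out of 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def general_interval(estimate, std_error, conf=0.95):
    """Approximate CI: estimate +/- z_{alpha/2} * (estimated standard error)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical data: 40 successes in 100 trials
successes, trials = 40, 100
p_hat = successes / trials
se_p_hat = sqrt(p_hat * (1.0 - p_hat) / trials)
lower, upper = general_interval(p_hat, se_p_hat)
print(f"({lower:.3f}, {upper:.3f})")  # prints (0.304, 0.496)
```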

Small Samples from a Normal Distribution

Recall the confidence interval for the mean $\mu$ of a normal distribution, when $\sigma$ is known:

$$\bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}.$$

If $\sigma$ is not known, we replace it by its estimate, $s$, the sample standard deviation.

To maintain the coverage probability of $100(1 - \alpha)\%$, we must adjust the multiplier $z_{\alpha/2}$.
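
Why the multiplier needs adjusting can be seen by simulation. The sketch below keeps $z_{\alpha/2}$ but substitutes $s$ for $\sigma$, and estimates how often the resulting interval contains $\mu$ for a small and a moderate sample size. The sample sizes (5 and 50), the number of repetitions, the seed, and the function name are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev


def z_with_s_coverage(n, conf=0.95, reps=5_000, seed=0, mu=0.0, sigma=1.0):
    """Coverage of xbar +/- z * s / sqrt(n) for normal samples of size n."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        half_width = z * stdev(x) / sqrt(n)  # s replaces the unknown sigma
        if abs(fmean(x) - mu) < half_width:
            hits += 1
    return hits / reps


small = z_with_s_coverage(n=5)
large = z_with_s_coverage(n=50)
print(f"n = 5: {small:.3f}, n = 50: {large:.3f}")
```

With $n = 5$ the estimated coverage falls well short of the nominal .95, while with $n = 50$ it is close to it, which is consistent with the large-sample approximation.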

The necessary probability result is that

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a known probability distribution, the Student's $t$-distribution with $\nu = n - 1$ degrees of freedom.

It follows that

$$\bar{x} \pm t_{\alpha/2,\nu} \times \frac{s}{\sqrt{n}}$$

is a $100(1 - \alpha)\%$ confidence interval for $\mu$, where $t_{\alpha/2,\nu}$ is the $1 - \alpha/2$ quantile of that distribution, that is, $P(T > t_{\alpha/2,\nu}) = \alpha/2$.
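
As a worked illustration, reuse $\bar{x} = 80$ and $n = 31$ from the earlier example, but assume that the value 2 is now the sample standard deviation $s$ rather than a known $\sigma$ (a hypothetical variation, not from the slides). Then $\nu = 30$, and the tabulated upper 2.5% point of the $t$-distribution is $t_{0.025,30} \approx 2.042$:

```latex
\begin{aligned}
\bar{x} \pm t_{0.025,30} \times \frac{s}{\sqrt{n}}
  &= 80.0 \pm 2.042 \times \frac{2.0}{\sqrt{31}} \\
  &\approx 80.0 \pm 0.73 \\
  &= (79.27,\ 80.73).
\end{aligned}
```

This is slightly wider than the known-$\sigma$ interval $(79.30, 80.70)$, reflecting the extra uncertainty from estimating $\sigma$ by $s$.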