UQ, STAT2201, 2017, Lecture 6 Unit 6 – Statistical Inference Ideas.
Statistical inference is the process of forming judgements about the parameters of a population, typically on the basis of random sampling.
The random variables $X_1, X_2, \ldots, X_n$ are an (i.i.d.) random sample of size $n$ if (a) the $X_i$'s are independent random variables and (b) every $X_i$ has the same probability distribution. A statistic is any function of the observations in a random sample, and the probability distribution of a statistic is called the sampling distribution.
Any statistic, being a function of the observations, is itself a random variable, which is why it has a sampling distribution. A point estimate of some population parameter $\theta$ is a single numerical value $\hat{\theta}$ of a statistic $\hat{\Theta}$. The statistic $\hat{\Theta}$ is called the point estimator.
The most common statistic we consider is the sample mean, $\overline{X}$, with a given value denoted by $\overline{x}$. The sample mean is an estimator of the population mean, $\mu$.
The Central Limit Theorem
Central Limit Theorem (for sample means): If $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ taken from a population with mean $\mu$ and finite variance $\sigma^2$, and if $\overline{X}$ is the sample mean, then the limiting form of the distribution of
$$Z = \frac{\overline{X} - \mu}{\sigma/\sqrt{n}}$$
as $n \to \infty$ is the standard normal distribution. This implies that, for large $n$, $\overline{X}$ is approximately normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
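The theorem can be checked numerically. The following is a minimal simulation sketch, not part of the original notes: it assumes an Exponential(1) population (so $\mu = 1$ and $\sigma = 1$), and the function name, sample size, and seed are illustrative choices.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Return `reps` sample means, each from n Exponential(1) draws."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

# Assumed population: Exponential(1), which has mu = 1 and sigma = 1.
mu, sigma, n = 1.0, 1.0, 50
means = simulate_sample_means(n, reps=5000)

se = sigma / math.sqrt(n)
# Fraction of standardized means with |Z| <= 1.96; roughly 0.95 if Z is near N(0, 1).
coverage = sum(abs((m - mu) / se) <= 1.96 for m in means) / len(means)

print("mean of sample means:", statistics.fmean(means))  # close to mu
print("sd of sample means:  ", statistics.stdev(means))  # close to sigma / sqrt(n)
print("fraction |Z| <= 1.96:", coverage)
```

Even though the exponential distribution is strongly skewed, the sample means should centre on $\mu$ with spread close to $\sigma/\sqrt{n} \approx 0.141$.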
The standard error of $\overline{X}$ is $\sigma/\sqrt{n}$. In most practical situations $\sigma$ is not known and must be estimated. In that case the estimated standard error (denoted "SE" in typical computer output) is $s/\sqrt{n}$, where the sample standard deviation $s$ is the point estimator for the population standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - n\overline{x}^2}{n-1}}.$$
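As a small sketch of these formulas, the code below computes $s$ using the shortcut expression and the estimated standard error $s/\sqrt{n}$. The data values and function names are hypothetical. The shortcut form can lose precision when the mean is large relative to the spread, so the result is compared against the standard library's `statistics.stdev`.

```python
import math
import statistics

def sample_sd_shortcut(xs):
    """Sample standard deviation via (sum of x_i^2 - n * xbar^2) / (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return math.sqrt((sum(x * x for x in xs) - n * xbar ** 2) / (n - 1))

def estimated_standard_error(xs):
    """Estimated standard error of the sample mean, s / sqrt(n)."""
    return sample_sd_shortcut(xs) / math.sqrt(len(xs))

data = [4.1, 5.3, 3.8, 4.9, 5.0, 4.4]  # hypothetical observations

print("s  =", sample_sd_shortcut(data))
print("SE =", estimated_standard_error(data))
print("statistics.stdev agrees:",
      math.isclose(sample_sd_shortcut(data), statistics.stdev(data), rel_tol=1e-9))
```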
Central Limit Theorem (for sums): Manipulate the central limit theorem for sample means using $\sum_{i=1}^{n} X_i = n\overline{X}$. This yields
$$Z = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sqrt{n\sigma^2}},$$
whose distribution approaches the standard normal distribution as $n \to \infty$. This implies that $\sum_{i=1}^{n} X_i$ is approximately normally distributed with mean $n\mu$ and variance $n\sigma^2$.
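The manipulation amounts to multiplying the numerator and denominator of the sample-mean statistic by $n$. Written out step by step (a routine algebraic sketch):

```latex
\begin{align*}
Z &= \frac{\overline{X} - \mu}{\sigma/\sqrt{n}}
   = \frac{n\overline{X} - n\mu}{n\,\sigma/\sqrt{n}} \\
  &= \frac{\sum_{i=1}^{n} X_i - n\mu}{\sqrt{n}\,\sigma}
   = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sqrt{n\sigma^2}} .
\end{align*}
```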
Confidence Intervals
Knowing the sampling distribution (or the approximate sampling distribution) of a statistic is the key to the two main tools of statistical inference that we study:
(a) Confidence intervals: a method for yielding error bounds on point estimates.
(b) Hypothesis testing: a methodology for making conclusions about population parameters.
The formulas for most statistical procedures use quantiles of the sampling distribution. When the distribution is $N(0, 1)$ (standard normal), the $\alpha$-quantile is denoted $z_\alpha$ and satisfies
$$\alpha = \int_{-\infty}^{z_\alpha} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx.$$
A common value to use for $\alpha$ is $0.05$, and in procedures the expressions $z_{1-\alpha}$ or $z_{1-\alpha/2}$ appear. Note that for $\alpha = 0.05$, $z_{1-\alpha/2} = z_{0.975} \approx 1.96 \approx 2$.
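Standard normal quantiles can be obtained numerically rather than from tables. A minimal sketch using Python's standard library (`statistics.NormalDist`, available from Python 3.8); the helper name `z_quantile` is an illustrative choice:

```python
from statistics import NormalDist

def z_quantile(p):
    """Return z_p, the p-quantile of the standard normal distribution."""
    return NormalDist().inv_cdf(p)

alpha = 0.05
print(round(z_quantile(1 - alpha), 3))      # one-sided procedures: 1.645
print(round(z_quantile(1 - alpha / 2), 3))  # two-sided procedures: 1.96
```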
A confidence interval estimate for $\mu$ is an interval of the form $l \le \mu \le u$, where the end-points $l$ and $u$ are computed from the sample data. Because different samples will produce different values of $l$ and $u$, these end-points are values of random variables $L$ and $U$, respectively. Suppose that
$$P(L \le \mu \le U) = 1 - \alpha.$$
The resulting confidence interval for $\mu$ is $l \le \mu \le u$. The end-points or bounds $l$ and $u$ are called the lower- and upper-confidence limits (bounds), respectively, and $1 - \alpha$ is called the confidence level.
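If $\overline{x}$ is the sample mean of a random sample of size $n$ from a normal population with known variance $\sigma^2$, a $100(1-\alpha)\%$ confidence interval on $\mu$ is given by
$$\overline{x} - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \;\le\; \mu \;\le\; \overline{x} + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}.$$
Note that for $\alpha = 0.05$ it is roughly of the form
$$\overline{x} - 2\,\mathrm{SE} \;\le\; \mu \;\le\; \overline{x} + 2\,\mathrm{SE}.$$
Learn how to do back-of-the-envelope calculations!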
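A short sketch of the interval computation, assuming a normal population with known $\sigma$ as stated above. The numbers ($\overline{x} = 10$, $\sigma = 2$, $n = 25$) and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, alpha=0.05):
    """100(1 - alpha)% confidence interval for mu when sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical summary values: SE = sigma / sqrt(n) = 2 / 5 = 0.4
lo, hi = z_confidence_interval(xbar=10.0, sigma=2.0, n=25)
print(round(lo, 3), round(hi, 3))  # 9.216 10.784
```

The back-of-the-envelope version gives $10 \pm 2 \times 0.4$, that is $(9.2, 10.8)$, which agrees with the exact interval to one decimal place.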
Confidence interval formulas give insight into the required sample size: If $\overline{x}$ is used as an estimate of $\mu$, we can be $100(1-\alpha)\%$ confident that the error $|\overline{x} - \mu|$ will not exceed a specified amount $\Delta$ when the sample size is not smaller than
$$n = \left(\frac{z_{1-\alpha/2}\,\sigma}{\Delta}\right)^{2}.$$
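Because $n$ must be a whole number, in practice the formula's value is rounded up. A minimal sketch with hypothetical inputs ($\sigma = 2$, $\Delta = 0.5$); the function name is illustrative:

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, delta, alpha=0.05):
    """Smallest integer n with z_{1-alpha/2} * sigma / sqrt(n) <= delta."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z * sigma / delta) ** 2)

# (1.96 * 2 / 0.5)^2 is about 61.5, so 62 observations are needed.
print(required_sample_size(sigma=2.0, delta=0.5))  # 62
```

Halving $\Delta$ roughly quadruples the required sample size, since $n$ scales with $1/\Delta^2$.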
Hypothesis Testing
A statistical hypothesis is a statement about the parameters of one or more populations. The null hypothesis, denoted $H_0$, is the claim that is initially assumed to be true based on previous knowledge. The alternative hypothesis, denoted $H_1$, is a claim that contradicts the null hypothesis.
For some arbitrary value $\mu_0$, a two-sided alternative hypothesis is expressed as:
$$H_0 : \mu = \mu_0, \qquad H_1 : \mu \ne \mu_0.$$
A one-sided alternative hypothesis is expressed as:
$$H_0 : \mu = \mu_0, \quad H_1 : \mu < \mu_0 \qquad \text{or} \qquad H_0 : \mu = \mu_0, \quad H_1 : \mu > \mu_0.$$
The standard use of hypothesis tests in scientific research is to "hope to reject" $H_0$, so as to have statistical evidence for the validity of $H_1$.
A hypothesis test is based on a decision rule that is a function of the test statistic. For example: reject $H_0$ if the test statistic is below a specified threshold; otherwise do not reject.
Rejecting the null hypothesis $H_0$ when it is true is defined as a type I error. Failing to reject the null hypothesis $H_0$ when it is false is defined as a type II error.
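| Decision | $H_0$ is true | $H_0$ is false |
| --- | --- | --- |
| Fail to reject $H_0$ | No error | Type II error |
| Reject $H_0$ | Type I error | No error |

$$\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true}).$$
$$\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \mid H_0 \text{ is false}).$$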
The power of a statistical test is the probability of rejecting the null hypothesis $H_0$ when the alternative hypothesis is true. The goal is a low $\alpha$ together with power, $1 - \beta$, as high as possible.
Simple Hypothesis Tests
A typical example of a simple hypothesis test has
$$H_0 : \mu = \mu_0, \qquad H_1 : \mu = \mu_1,$$
where $\mu_0$ and $\mu_1$ are some specified values for the population mean. This test isn't typically practical but is useful for understanding the concepts at hand. Assuming that $\mu_0 < \mu_1$ and setting a threshold $\tau$, reject $H_0$ if $\overline{x} > \tau$; otherwise do not reject.
Explicit calculation of the relationships between $\tau$, $\alpha$, $\beta$, $n$, $\sigma$, $\mu_0$ and $\mu_1$ is possible in this case.
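One way to make these relationships concrete is sketched below. It assumes $\sigma$ is known and that $\overline{X}$ is (at least approximately) normal with standard deviation $\sigma/\sqrt{n}$, so that $\alpha = P(\overline{X} > \tau \mid \mu = \mu_0)$ and $\beta = P(\overline{X} \le \tau \mid \mu = \mu_1)$. The function names and numerical values are hypothetical.

```python
import math
from statistics import NormalDist

def error_probabilities(tau, mu0, mu1, sigma, n):
    """Return (alpha, beta) for the rule 'reject H0 if xbar > tau'."""
    se = sigma / math.sqrt(n)
    alpha = 1 - NormalDist(mu0, se).cdf(tau)  # reject although mu = mu0
    beta = NormalDist(mu1, se).cdf(tau)       # fail to reject although mu = mu1
    return alpha, beta

def threshold_for_alpha(alpha, mu0, sigma, n):
    """Threshold tau that gives a type I error probability of exactly alpha."""
    return mu0 + NormalDist().inv_cdf(1 - alpha) * sigma / math.sqrt(n)

# Hypothetical setting: mu0 = 0, mu1 = 1, sigma = 2, n = 16 (so SE = 0.5)
tau = threshold_for_alpha(0.05, mu0=0.0, sigma=2.0, n=16)
alpha, beta = error_probabilities(tau, mu0=0.0, mu1=1.0, sigma=2.0, n=16)
print("tau =", tau, "alpha =", alpha, "beta =", beta)
```

Moving $\tau$ to the right lowers $\alpha$ but raises $\beta$; for a fixed $\alpha$, only a larger $n$ (or a smaller $\sigma$) reduces $\beta$.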
Practical Hypothesis Tests (focus of Units 7 and 8 of the course)
In most hypothesis tests used in practice (and in this course), a specified level of type I error, $\alpha$, is predetermined (e.g. $\alpha = 0.05$) and the type II error is not directly specified. The probability of making a type II error, $\beta$, increases (power decreases) rapidly as the true value of $\mu$ approaches the hypothesized value. The probability of making a type II error also depends on the sample size $n$: increasing the sample size results in a decrease in the probability of a type II error. The population (or natural) variability (e.g. described by $\sigma$) also affects the power.
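These dependencies can be tabulated for one concrete case. The sketch below assumes a one-sided test of $H_0 : \mu = \mu_0$ against $H_1 : \mu > \mu_0$ with known $\sigma$, for which the power at a true mean $\mu$ is $1 - \Phi\!\left(z_{1-\alpha} - (\mu - \mu_0)\sqrt{n}/\sigma\right)$. The function name and the grid of values are illustrative.

```python
import math
from statistics import NormalDist

def power_upper_z_test(mu_true, mu0, sigma, n, alpha=0.05):
    """Power of the level-alpha test that rejects H0 for large xbar."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    shift = (mu_true - mu0) * math.sqrt(n) / sigma
    return 1 - NormalDist().cdf(z_crit - shift)

# Rows: sample size n. Columns: true mean 0.1, 0.3, 0.5 (with mu0 = 0, sigma = 1).
for n in (10, 40, 160):
    row = [round(power_upper_z_test(mu, 0.0, 1.0, n), 3) for mu in (0.1, 0.3, 0.5)]
    print(n, row)
```

Reading across a row shows power growing as the true mean moves away from $\mu_0$; reading down a column shows power growing with $n$.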
The P-value is the smallest level of significance that would lead to rejection of the null hypothesis $H_0$ with the given data. That is, the P-value is based on the data. It is computed by considering the location of the test statistic under the sampling distribution based on $H_0$.
It is customary to consider the test statistic (and the data) significant when the null hypothesis $H_0$ is rejected; therefore, we may think of the P-value as the smallest $\alpha$ at which the data are significant. In other words, the P-value is the observed significance level.
Clearly, the P-value provides a measure of the credibility of the null hypothesis. Computing the exact P-value for a statistical test is not always doable by hand. It is typical to report the P-value in studies where $H_0$ was rejected (and new scientific claims were made). Typical ("convincing") values can be of the order $0.001$.
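For a test on $\mu$ with known $\sigma$, the P-value follows directly from the standard normal distribution. The sketch below is one possible implementation under that assumption; the function name, the `alternative` labels, and the input values are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_p_value(xbar, mu0, sigma, n, alternative="two-sided"):
    """P-value of a z-test of H0: mu = mu0, with sigma assumed known."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    phi = NormalDist().cdf
    if alternative == "two-sided":  # H1: mu != mu0
        return 2 * (1 - phi(abs(z)))
    if alternative == "greater":    # H1: mu > mu0
        return 1 - phi(z)
    if alternative == "less":       # H1: mu < mu0
        return phi(z)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")

# Hypothetical data summary: xbar = 10.5, mu0 = 10, sigma = 2, n = 100 (z = 2.5)
print(z_test_p_value(10.5, 10.0, 2.0, 100))
print(z_test_p_value(10.5, 10.0, 2.0, 100, alternative="greater"))
```

$H_0$ is rejected at level $\alpha$ exactly when the P-value is at most $\alpha$, which matches the description of the P-value as the smallest $\alpha$ at which the data are significant.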
A General Procedure for Hypothesis Tests is:
(1) Parameter of interest: From the problem context, identify the parameter of interest.
(2) Null hypothesis, $H_0$: State the null hypothesis, $H_0$.
(3) Alternative hypothesis, $H_1$: Specify an appropriate alternative hypothesis, $H_1$.
(4) Test statistic: Determine an appropriate test statistic.
(5) Reject $H_0$ if: State the rejection criteria for the null hypothesis.
(6) Computations: Compute any necessary sample quantities, substitute these into the equation for the test statistic, and compute the value.
(7) Draw conclusions: Decide whether or not $H_0$ should be rejected and report that in the problem context.
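The seven steps can be traced in code for one simple case. The sketch below assumes a two-sided z-test on a mean with known $\sigma$; the data, the value $\mu_0 = 5.0$, $\sigma = 0.25$, and the function name are all hypothetical, and the comments map each line to the step it illustrates.

```python
import math
from statistics import NormalDist, fmean

def two_sided_z_test(data, mu0, sigma, alpha=0.05):
    # (1) Parameter of interest: the population mean mu.
    # (2) H0: mu = mu0.   (3) H1: mu != mu0.
    # (4) Test statistic: Z = (xbar - mu0) / (sigma / sqrt(n)).
    # (5) Reject H0 if |z| > z_{1 - alpha/2}.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # (6) Computations from the sample.
    n = len(data)
    xbar = fmean(data)
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # (7) Conclusion, to be reported in the problem context.
    return {"xbar": xbar, "z": z, "z_crit": z_crit, "reject_H0": abs(z) > z_crit}

sample = [5.2, 4.8, 5.5, 5.1, 4.9, 5.3, 5.4, 5.0]  # hypothetical measurements
result = two_sided_z_test(sample, mu0=5.0, sigma=0.25)
print(result)
```

Here $\overline{x} = 5.15$ and $z \approx 1.70$, which is below $1.96$, so $H_0$ is not rejected at $\alpha = 0.05$ for these made-up numbers.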