STAT 113
Analytic Intervals and Tests for Differences Between Two Groups
Colin Reimer Dawson, Oberlin College
November 9, 2017

Outline: Variability of a Difference; CI and Test for Difference of Proportions; CI and Test for Difference of Means
Cases to Address
We will need standard errors to do CIs and tests for the following parameters:
1. Single Proportion (Last Time)
2. Single Mean (Wrap Up Today)
3. Difference of Proportions (Today)
4. Difference of Means (Today)
5. Mean of Differences (Next Week)
Example: Penguins Again!
Penguin Breeding
The scientists who studied whether metal bands were harmful to penguin survival also examined whether they affected the penguins’ breeding patterns. In the metal-band group, 39 of 122 breeding seasons (combined across penguins) resulted in offspring (32%), whereas the controls had offspring in 70 of 160 breeding seasons (44%).
• If we want to construct a confidence interval or do a test about the difference in the proportion of “breeding opportunities” that were successful, what is the relevant population parameter?
• What is the relevant sample statistic?
Variance and Standard Error of Differences
• With two independent samples, A and B, quantities such as $\hat{p}_A - \hat{p}_B$ that depend on both random samples have two independent sources of variability.
• So the difference is more variable than either sample statistic alone.
• Specifically, the variance of the difference is the sum of the separate variances:
$$s^2_{\hat{p}_A - \hat{p}_B} = s^2_{\hat{p}_A} + s^2_{\hat{p}_B}$$
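The additivity of variances stated above can be checked by simulation. The following is a minimal sketch, assuming hypothetical population proportions (0.30 and 0.45) and sample sizes (122 and 160) chosen only for illustration; the function names are not from the course materials.

```python
import random
from statistics import pvariance

def sample_proportion(rng, p, n):
    """Proportion of successes in one random sample of size n."""
    return sum(rng.random() < p for _ in range(n)) / n

def simulate_variances(p_a, n_a, p_b, n_b, reps=4000, seed=113):
    """Simulate many independent sample pairs and return the variances
    of p-hat_A, p-hat_B, and their difference."""
    rng = random.Random(seed)
    props_a = [sample_proportion(rng, p_a, n_a) for _ in range(reps)]
    props_b = [sample_proportion(rng, p_b, n_b) for _ in range(reps)]
    diffs = [a - b for a, b in zip(props_a, props_b)]
    return pvariance(props_a), pvariance(props_b), pvariance(diffs)

# Hypothetical population values, assumed for illustration only.
var_a, var_b, var_diff = simulate_variances(0.30, 122, 0.45, 160)
print(f"Var(A) + Var(B) = {var_a + var_b:.5f}")
print(f"Var(A - B)      = {var_diff:.5f}")
```

The two printed values should agree closely but not exactly, since the simulated samples are only approximately uncorrelated over a finite number of repetitions.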
Variance and Standard Error of Proportions
• Recall: across all random samples, the standard deviation of the sample proportions (i.e., the standard error) is
$$s_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
where $p$ is the population proportion and $n$ is the sample size.
• The variance of $\hat{p}$ is the square of this; i.e., the same thing without the square root.
Standard Error of Difference of Proportions
So the variance of the difference between two independent sample proportions is
$$s^2_{\hat{p}_A - \hat{p}_B} = \frac{p_A(1-p_A)}{n_A} + \frac{p_B(1-p_B)}{n_B}$$
and the standard deviation (i.e., standard error) of the difference is
$$s_{\hat{p}_A - \hat{p}_B} = \sqrt{\frac{p_A(1-p_A)}{n_A} + \frac{p_B(1-p_B)}{n_B}}$$
Standard Error of Difference of Means
The exact same reasoning applies to the standard error of a difference between means of two independent samples:
$$s_{\bar{x}_A} = \frac{\sigma_A}{\sqrt{n_A}} \qquad s_{\bar{x}_B} = \frac{\sigma_B}{\sqrt{n_B}}$$
$$s^2_{\bar{x}_A} = \frac{\sigma_A^2}{n_A} \qquad s^2_{\bar{x}_B} = \frac{\sigma_B^2}{n_B}$$
$$s^2_{\bar{x}_A - \bar{x}_B} = \frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}$$
$$s_{\bar{x}_A - \bar{x}_B} = \sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}}$$
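As a small illustration of the last formula, the sketch below computes the standard error of a difference of means from summary statistics. The function name and the numbers are hypothetical; in practice the population standard deviations are unknown, so the sample standard deviations would be plugged in.

```python
from math import sqrt

def se_diff_means(sd_a, n_a, sd_b, n_b):
    """Standard error of x-bar_A - x-bar_B for two independent samples."""
    # The variances of the two sample means add; take the root at the end.
    return sqrt(sd_a**2 / n_a + sd_b**2 / n_b)

# Hypothetical summary statistics, assumed for illustration only.
se = se_diff_means(sd_a=3.0, n_a=9, sd_b=4.0, n_b=16)
print(round(se, 4))  # each sample mean has SE 1 here, so the result is sqrt(2)
```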
Analytic Approximations of Sampling Distributions

Param. | Stat. | Randomization | Theory: SE | Theory: Test Dist.
$p$ | $\hat{p}$ | Simulate from $p_0$ | $\sqrt{\frac{p_0(1-p_0)}{n}}$ | Normal
$\mu$ | $\bar{x}$ | Bootstrap + shift | $\frac{s}{\sqrt{n}}$ | $t_{n-1}$
$p_A - p_B$ | $\hat{p}_A - \hat{p}_B$ | Scramble groups | $\sqrt{\frac{p_A(1-p_A)}{n_A} + \frac{p_B(1-p_B)}{n_B}}$ | Normal
$\mu_A - \mu_B$ | $\bar{x}_A - \bar{x}_B$ | Scramble groups | $\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}$ | $t_{\min(n_A, n_B) - 1}$
$\mu_D$ | $\bar{x}_D$ | Flip pairs* | $\frac{s_D}{\sqrt{n_D}}$ | $t_{n_D - 1}$
$\rho$ | $r$ | Scramble pairings | $\sqrt{\frac{1-r^2}{n-2}}$ | $t_{n-2}$

CI: Statistic ± Critical Value × $\widehat{SE}$
Standardized Test Statistic: $\dfrac{\text{Statistic} - \text{Null Param.}}{\widehat{SE}}$
Distribution of $\hat{p}_A - \hat{p}_B$
• Condition: The sampling distribution of $\hat{p}_A - \hat{p}_B$ is approximately Normal when there are at least 10 cases in each of the four group-by-outcome combinations:
$$n_A p_A \geq 10 \qquad n_A(1-p_A) \geq 10 \qquad n_B p_B \geq 10 \qquad n_B(1-p_B) \geq 10$$
• Mean: $p_A - p_B$
• Standard deviation (standard error):
$$SE_{\hat{p}_A - \hat{p}_B} = \sqrt{\frac{p_A(1-p_A)}{n_A} + \frac{p_B(1-p_B)}{n_B}}$$
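The four conditions above can be checked mechanically. This is a minimal sketch with a hypothetical function name; because the population proportions are unknown, it assumes the observed counts stand in for $n p$ and $n(1-p)$.

```python
def normal_approx_ok(successes_a, n_a, successes_b, n_b, minimum=10):
    """Return True if every group-by-outcome count is at least `minimum`."""
    counts = [
        successes_a, n_a - successes_a,  # stand-ins for n_A*p_A and n_A*(1-p_A)
        successes_b, n_b - successes_b,  # stand-ins for n_B*p_B and n_B*(1-p_B)
    ]
    return all(count >= minimum for count in counts)

# Penguin breeding counts from the example: 39 of 122 (metal), 70 of 160 (control).
print(normal_approx_ok(39, 122, 70, 160))  # True: the counts are 39, 83, 70, 90
```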
Confidence Interval for a Difference of Proportions
CI Summary: Difference of Proportions
To compute a confidence interval for a difference of two proportions when the sampling distribution for $\hat{p}_A - \hat{p}_B$ is approximately Normal (see the conditions above):
1. Find the standardized endpoints, $Z^*$, for the confidence level, using a standard Normal
2. “Destandardize” to get the endpoints
$$\hat{p}_A - \hat{p}_B \pm Z^* \cdot \sqrt{\frac{\hat{p}_A(1-\hat{p}_A)}{n_A} + \frac{\hat{p}_B(1-\hat{p}_B)}{n_B}}$$
Why do we use $\hat{p}_A$ and $\hat{p}_B$ in the standard error, again?
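Step 1 can be done with software instead of a Normal table. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `z_star` is a label chosen here, not one from the course.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Critical value Z* that leaves (1 - level) / 2 in each tail."""
    upper_tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - upper_tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z* = {z_star(level):.3f}")
```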
P-values for a difference of two sample proportions from a Standard Normal
Computing P-values when the null sampling distribution is approximately Normal (see previously stated conditions) is the reverse process:
1. Convert $\hat{p}_A - \hat{p}_B$ to a z-score within the theoretical null sampling distribution (i.e., using its mean and standard deviation).
$$Z_{\text{observed}} = \frac{\hat{p}_A - \hat{p}_B - 0}{?}$$
2. Find the relevant area beyond $Z_{\text{observed}}$ using a Standard Normal
Null Standard Error
• Problem: $H_0$ states what the difference is, but the standard error depends on each population proportion:
$$s_{\hat{p}_A - \hat{p}_B} = \sqrt{\frac{p_A(1-p_A)}{n_A} + \frac{p_B(1-p_B)}{n_B}}$$
• This is not a function of the difference.
• But $H_0$ says that $p_A$ and $p_B$ are the same thing, so we can estimate this single number using $\hat{p}_{\text{combined}}$, the proportion of the relevant category across both groups.
• Note: we already hold this proportion constant when doing a randomization test.
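Written out, the pooled estimate described above is the total number of cases in the relevant category divided by the total sample size. The count symbols $x_A$ and $x_B$ are notation introduced here for illustration; they do not appear elsewhere in the slides.

```latex
\hat{p}_{\text{combined}} = \frac{x_A + x_B}{n_A + n_B}
  = \frac{n_A \hat{p}_A + n_B \hat{p}_B}{n_A + n_B}
```

The second form shows that the pooled proportion is a weighted average of the two sample proportions, with weights proportional to the sample sizes.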
P-values for a difference of two proportions
Computing P-values when the null sampling distribution is approximately Normal (see previously stated conditions) is the reverse process:
1. Convert $\hat{p}_A - \hat{p}_B$ to a z-score within the theoretical null sampling distribution (i.e., using its mean and standard deviation).
$$Z_{\text{observed}} = \frac{\hat{p}_A - \hat{p}_B - 0}{\sqrt{\frac{\hat{p}_{\text{combined}}(1-\hat{p}_{\text{combined}})}{n_A} + \frac{\hat{p}_{\text{combined}}(1-\hat{p}_{\text{combined}})}{n_B}}} = \frac{\hat{p}_A - \hat{p}_B - 0}{\sqrt{\hat{p}_{\text{combined}}(1-\hat{p}_{\text{combined}})\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}}$$
2. Find the relevant area beyond $Z_{\text{observed}}$ using a Standard Normal
Example: Penguins Again!
Penguin Breeding
In the metal-band group, 39 of 122 breeding seasons (combined across penguins) resulted in offspring (32%), whereas the controls had offspring in 70 of 160 breeding seasons (44%).
$n_{\text{metal}}$ =
$n_{\text{control}}$ =
$\hat{p}_{\text{metal}}$ =
$\hat{p}_{\text{control}}$ =
$\hat{p}_{\text{metal}} - \hat{p}_{\text{control}}$ =
Penguins Breed Confidence (Interval)
Is the Normal approximation reasonable?
CI: point estimate ± $Z^* \cdot SE$
$$SE = \sqrt{\frac{\hat{p}_A(1-\hat{p}_A)}{n_A} + \frac{\hat{p}_B(1-\hat{p}_B)}{n_B}} =$$
$Z^*$ =
Find a 90% CI for $p_A - p_B$
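One way to check a hand calculation of this interval is sketched below. It assumes group A is the metal-band group and group B is the control group (the slide does not fix the labeling), and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def diff_props_ci(x_a, n_a, x_b, n_b, level=0.90):
    """Normal-approximation CI for p_A - p_B from success counts."""
    p_a, p_b = x_a / n_a, x_b / n_b
    # Unpooled SE: each sample proportion estimates its own population value.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p_a - p_b
    return diff - z * se, diff + z * se

# Assumed labeling: A = metal band (39 of 122), B = control (70 of 160).
low, high = diff_props_ci(39, 122, 70, 160, level=0.90)
print(f"90% CI for p_metal - p_control: ({low:.3f}, {high:.3f})")
```

With this labeling, a negative difference means the banded group bred successfully less often; swapping the labels flips the signs of both endpoints.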
Do metal bands reduce breeding chances?
$$Z_{\text{observed}} = \frac{\text{observed difference} - \text{null difference}}{\text{standard error}}$$
$$SE = \sqrt{\hat{p}_{\text{combined}}(1-\hat{p}_{\text{combined}})\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}$$
$\hat{p}_{\text{combined}}$ =
SE =
$Z_{\text{observed}}$ =
P-value =
Decision?
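The calculation can be sketched in code as follows. This assumes a one-sided alternative ($p_{\text{metal}} < p_{\text{control}}$), which is one reading of “reduce”, with A as the metal-band group; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def pooled_z_test(x_a, n_a, x_b, n_b):
    """Z statistic and lower-tail P-value for H0: p_A = p_B vs Ha: p_A < p_B."""
    p_a, p_b = x_a / n_a, x_b / n_b
    # Under H0 both groups share one proportion, estimated by pooling.
    p_combined = (x_a + x_b) / (n_a + n_b)
    se = sqrt(p_combined * (1 - p_combined) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b - 0) / se
    p_value = NormalDist().cdf(z)  # area below the observed z (assumed lower tail)
    return z, p_value

# Assumed labeling: A = metal band (39 of 122), B = control (70 of 160).
z_obs, p_val = pooled_z_test(39, 122, 70, 160)
print(f"Z = {z_obs:.2f}, one-sided P-value = {p_val:.3f}")
```

For a two-sided alternative, the P-value would instead be twice the smaller tail area.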