STAT 113: Independent vs. Paired Samples
Colin Reimer Dawson
Oberlin College
November 16, 2015
Summary: Comparing Two Means
◮ Properties of Differences Between Sample Statistics
◮ Difference Between Means of Two Independent Groups
◮ Inference About the Mean of Paired Differences
Cases to Address
We will need standard errors to do CIs and tests for the following parameters:
1. Single Proportion
2. Single Mean
3. Difference of Proportions
4. Difference of Means (last Friday)
5. Mean of Differences (last Friday / today)
Variance and Standard Error of Differences
◮ The variance of a difference is the sum of the separate variances, due to having two sources of randomness:
$$s^2_{\hat{p}_A - \hat{p}_B} = s^2_{\hat{p}_A} + s^2_{\hat{p}_B}$$
Standard Error of Difference of Proportions
So the variance of the difference between two sample proportions is
$$s^2_{\hat{p}_A - \hat{p}_B} = \frac{p_A(1 - p_A)}{n_A} + \frac{p_B(1 - p_B)}{n_B}$$
and the standard deviation (i.e., standard error) of the difference is
$$s_{\hat{p}_A - \hat{p}_B} = \sqrt{\frac{p_A(1 - p_A)}{n_A} + \frac{p_B(1 - p_B)}{n_B}}$$
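As a quick, self-contained illustration (not from the slides), the standard-error formula can be evaluated directly in R; the sample proportions and sizes below are made up, and the sample proportions stand in for the unknown population proportions, as is done in practice.

## Hypothetical sample proportions and sizes (made-up numbers, for illustration only)
p.hat.A <- 0.62; n.A <- 200
p.hat.B <- 0.55; n.B <- 180

## Plug into the formula: SE of the difference in sample proportions
(se.diff.prop <- sqrt(p.hat.A * (1 - p.hat.A) / n.A +
                      p.hat.B * (1 - p.hat.B) / n.B))  # approximately 0.051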
Standard Error of Difference of Means
The exact same reasoning applies to the standard error of a difference between means of two independent samples:
$$s^2_{\bar{x}_A - \bar{x}_B} = \frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}$$
and the standard deviation (i.e., standard error) of the difference is
$$s_{\bar{x}_A - \bar{x}_B} = \sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}$$
Distribution of Difference of Sample Means
◮ The statistic $\bar{x}_A - \bar{x}_B$ is Normal for large enough samples (roughly $n \geq 27$) and/or when the populations are Normal.
◮ We saw before that
$$s_{\bar{x}_A - \bar{x}_B} = \sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}$$
◮ But, as with one mean, we do not know $\sigma^2_A$ or $\sigma^2_B$. So, substitute $s^2_A$ and $s^2_B$.
◮ But then the distribution is not Normal; it is $t$.
◮ Conservatively: use $df = \min(n_A - 1, n_B - 1)$ (does not assume equal variances); a small R sketch of these last two steps follows.
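A minimal R sketch of those two steps (not from the slides; the summary statistics are made up): plug the sample variances into the standard-error formula, and use the smaller group's degrees of freedom.

## Hypothetical summary statistics (made-up numbers, for illustration only)
s.A <- 4.2; n.A <- 35
s.B <- 3.8; n.B <- 42

## Estimated SE of xbar.A - xbar.B, substituting s^2 for the unknown sigma^2
(se.diff.mean <- sqrt(s.A^2 / n.A + s.B^2 / n.B))

## Conservative degrees of freedom (does not assume equal variances)
(df.conservative <- min(n.A - 1, n.B - 1))  # here, 34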
Confidence Interval for a Difference of Means
CI Summary: Difference of Means
To compute a confidence interval for a difference of two means when the sampling distribution for $\bar{x}_A - \bar{x}_B$ is approximately $t$-distributed (i.e., Normal populations and/or large samples), use
$$\bar{x}_A - \bar{x}_B \pm T^* \cdot \sqrt{\frac{s^2_A}{n_A} + \frac{s^2_B}{n_B}}$$
where $T^*$ is the $T$-statistic of the endpoint appropriate for the confidence level, computed from a $t$-distribution with $\min(n_A - 1, n_B - 1)$ degrees of freedom.
P-values for a Difference of Two Means
Computing P-values when the null sampling distribution is approximately Normal (Normal populations and/or large samples) is the reverse process:
1. Convert $\bar{x}_A - \bar{x}_B$ to a $t$-statistic within the theoretical null sampling distribution (i.e., using its mean and standard deviation):
$$T_{\text{observed}} = \frac{(\bar{x}_A - \bar{x}_B) - (\mu_A - \mu_B)_0}{\sqrt{\frac{s^2_A}{n_A} + \frac{s^2_B}{n_B}}}$$
2. Find the relevant area beyond $T_{\text{observed}}$ using a $t$-distribution with $\min(n_A - 1, n_B - 1)$ degrees of freedom.
Example: Credit Card Use and Tip Percentage
Credit Cards and Tip Percentage: We analyze the percent tip left on 157 bills from the First Crush Bistro in Northern New York State. The mean percent tip left on the 106 bills paid in cash was 16.39, with a standard deviation of 5.05. The mean percent tip left on the 51 bills paid with a credit card was 17.10, with a standard deviation of 2.47.
$$n_{\text{card}} = 51 \qquad n_{\text{cash}} = 106$$
$$\bar{x}_{\text{card}} = 17.10 \qquad \bar{x}_{\text{cash}} = 16.39$$
$$s^2_{\text{card}} = 2.47^2 = 6.10 \qquad s^2_{\text{cash}} = 5.05^2 = 25.50$$
$$\bar{x}_{\text{card}} - \bar{x}_{\text{cash}} = 17.10 - 16.39 = 0.71$$
Card vs. Cash Confidence Interval
Is the Normal approximation reasonable? Yes, the sample sizes are well above 27.
Find a 99% CI for $\mu_{\text{card}} - \mu_{\text{cash}}$.
CI: point estimate $\pm\; T^* \cdot SE$
$$SE = \sqrt{\frac{s^2_A}{n_A} + \frac{s^2_B}{n_B}} = \sqrt{\frac{6.10}{51} + \frac{25.50}{106}} = 0.60$$
$$T^* = \pm 2.68 \text{ (the 0.005 and 0.995 quantiles of the } t_{50} \text{ distribution)}$$
$$0.71 \pm 2.68 \cdot 0.60 = (-0.898,\ 2.318)$$
Do people who pay differently tip differently?
$$T_{\text{observed}} = \frac{\text{observed difference} - \text{null difference}}{\text{standard error}}$$
$$SE = \sqrt{\frac{s^2_A}{n_A} + \frac{s^2_B}{n_B}} = 0.60 \text{ (same as before)}$$
$$T_{\text{observed}} = \frac{0.71 - 0}{0.60} = 1.18$$
$$P\text{-value} = 0.24 \text{ (two-tailed area beyond 1.18 in } t_{50})$$
Decision? Fail to reject $H_0: \mu_{\text{card}} - \mu_{\text{cash}} = 0$.
R Code: Key Data
require(mosaic)

## We are given the means and standard deviations, but could compute them
## from a hypothetical dataset (with, say, a PctTip column holding the percent tip):
# n.card <- nrow(filter(TippingNY, PaymentMethod == "card"))
# n.cash <- nrow(filter(TippingNY, PaymentMethod == "cash"))
# xbar.card <- mean(~PctTip, data = filter(TippingNY, PaymentMethod == "card"))
# xbar.cash <- mean(~PctTip, data = filter(TippingNY, PaymentMethod == "cash"))
# s.card <- sd(~PctTip, data = filter(TippingNY, PaymentMethod == "card"))
# s.cash <- sd(~PctTip, data = filter(TippingNY, PaymentMethod == "cash"))

n.card <- 51; n.cash <- 106
xbar.card <- 17.10; xbar.cash <- 16.39
s.card <- 2.47; s.cash <- 5.05
R Code: Confidence Interval
## Compute the standard error
se.diff <- sqrt(s.card^2 / n.card + s.cash^2 / n.cash)

## Compute critical t-values
(tstar.99 <- xqt(c(0.005, 0.995), df = min(n.card - 1, n.cash - 1)))

[Plot: t-distribution density with the 0.005 and 0.995 quantiles marked]

[1] -2.677793  2.677793

## Compute the confidence interval
(CI.99 <- (xbar.card - xbar.cash) + tstar.99 * se.diff)

[1] -0.8971559  2.3171559
R Code: Hypothesis Test
## We already have the standard error, but here it is again
se.diff <- sqrt(s.card^2 / n.card + s.cash^2 / n.cash)

## Compute T.observed (slightly different without any rounding)
(T.observed <- ((xbar.card - xbar.cash) - 0) / se.diff)

[1] 1.18298

## Compute the two-tailed P-value (slightly different without rounding)
(P.value <- 2 * xpt(abs(T.observed), df = min(n.card - 1, n.cash - 1),
                    lower.tail = FALSE))

[Plot: t-distribution density with the two-tailed area beyond T.observed shaded]

[1] 0.2424117
Independent Groups vs. Paired Samples Design
◮ Independent Groups: Randomly assign cases to groups (or get two separate groups, as in an observational study).
  ◮ Random assignment removes systematic differences between groups.
  ◮ However, there are still chance differences between groups.
  ◮ Chance differences are a source of variability that we need to account for in CIs and tests.
◮ Paired Samples: Each observation in group A has a matched (presumably relatively similar) observation in group B.
  ◮ Removes some of the chance differences between the groups.
  ◮ Results in sample differences closer (on average) to any systematic difference due to the grouping.
  ◮ Narrower CIs.
  ◮ Easier to reject $H_0$ (one less competing "chance" explanation of a difference).
Kinds of Paired Samples Designs
1. Repeated Measures Design: Have each case produce an observation in each group.
2. Matched Pairs Design: Observations come from different sources (e.g., people), but cases are selected to be similar in key ways (e.g., identical twins that share genetics; or pairs matched by gender, race, age, etc.).
3. Pairing by Time: Alternate cases between groups so that pairs are collected close in time (e.g., because you think time of day / week / year / etc. is a source of variability).
From Two Samples to One Sample
◮ In a paired samples design, we can convert from two sets of the original response variable to a single set of difference scores. Then:
  ◮ Parameter of interest: $\mu_D$, the population mean difference score
  ◮ Null hypothesis: $H_0: \mu_D = 0$
  ◮ Alternative hypothesis: $H_1: \mu_D \neq 0$ (or $> 0$ or $< 0$)
  ◮ CI: $A \leq \mu_D \leq B$ with C% confidence
◮ This reduces to one-sample inference for a mean: the sampling distribution is $t$ with $n_D - 1$ df ($n_D$ is the number of pairs), provided (a) $n_D$ is large, and/or (b) the population of differences is approximately Normal. (A small R sketch follows.)
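A minimal R sketch of this reduction (not from the slides; the paired scores below are made up): subtract the paired scores, then do one-sample t inference on the differences.

## Hypothetical paired data: a "before" and "after" score for each of 6 cases
## (made-up numbers, for illustration only)
before <- c(12.1, 9.8, 14.3, 11.0, 10.5, 13.2)
after  <- c(13.0, 10.4, 14.1, 12.2, 11.3, 13.9)

## Convert two paired samples into one sample of difference scores
d   <- after - before
n.D <- length(d)

## One-sample t inference on the differences (df = n.D - 1)
xbar.D <- mean(d); s.D <- sd(d); se.D <- s.D / sqrt(n.D)

tstar.95 <- qt(0.975, df = n.D - 1)
(CI.95 <- xbar.D + c(-1, 1) * tstar.95 * se.D)

(T.observed <- (xbar.D - 0) / se.D)
(P.value <- 2 * pt(abs(T.observed), df = n.D - 1, lower.tail = FALSE))

## Equivalently: t.test(after, before, paired = TRUE)

Base qt() and pt() are used here; mosaic's xqt() and xpt() from the earlier slides would work the same way and also draw the picture.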