ST 380 Probability and Statistics for the Physical Sciences

Comparing Several Samples

We are often interested in comparing measurements made under more than two different sets of conditions.

Examples:
- Strengths of concrete beams manufactured with three different levels of a plasticizer.
- Effects of five different brands of gasoline on fuel consumption.
- Effects of four different sugar solutions on bacterial growth.
Analysis of Variance

When we compare more than two samples, we first ask: are there any differences among the populations? If we detect differences, the next question is: which means are different, and by how much?

When there are only two samples, these questions are basically the same; with more than two, we have to address them separately. For historical reasons, the techniques are known as the "analysis of variance" (ANOVA).
Notation

I  = the number of samples
µ1 = mean of population 1
   ...
µI = mean of population I
σ² = the variance in each population

Note that the variance is assumed to be the same in every population; no extension of Welch's method (which allows unequal variances) is available.
Example 10.1

Four types of boxes were compared in terms of compressive strength (lbf).

boxes <- read.table("Data/Example-10-01.txt", header = TRUE)
boxplot(Strength ~ Type, boxes)

Type 4 appears to have lower strength than the other types, and Type 2 appears to be the strongest. How do we make objective statements about these appearances?
The first question suggests a hypothesis test:

H0: µ1 = µ2 = ... = µI

versus

Ha: at least two of the means are unequal.

We look for a test statistic that compares the differences among the sample means with the differences we would expect under H0.
The conventional statistic is a ratio of sums of squares that are involved in estimating variances, hence "analysis of variance". Under H0 it follows the F-distribution, and it is denoted F.

The F-statistic is a generalization of the pooled t-statistic used to compare two samples; with I = 2 it is exactly the square of t.

The calculations are tedious, and best left to software.
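For reference, the quantities behind the ratio, in standard one-way ANOVA notation with n_i observations in sample i and n = n_1 + ... + n_I in total, are:

```latex
\begin{align*}
\text{SSTr} &= \sum_{i=1}^{I} n_i\,(\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})^2
  &\text{(treatment sum of squares, } I-1 \text{ df)}\\
\text{SSE}  &= \sum_{i=1}^{I} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{i\cdot})^2
  &\text{(error sum of squares, } n-I \text{ df)}\\
F &= \frac{\text{SSTr}/(I-1)}{\text{SSE}/(n-I)}
\end{align*}
```

Under H0, F has the F-distribution with I − 1 and n − I degrees of freedom; large values of F indicate that the sample means differ more than chance alone would explain.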
Using R

boxesAov <- aov(Strength ~ factor(Type), boxes)
summary(boxesAov)

Output

             Df Sum Sq Mean Sq F value   Pr(>F)
factor(Type)  3 127375   42458   25.09 5.53e-07 ***
Residuals    20  33839    1692
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
On both the factor(Type) and Residuals lines, the Mean Sq is the Sum Sq divided by the Df.

On the factor(Type) line, the F value is the ratio of the mean square for factor(Type) to the mean square for Residuals, and is the required test statistic.

On the same line, Pr(>F) is the P-value, which in this case is less than 10^-6; that tells us that there is very strong evidence against H0. No surprise there, given the differences among the box plots.
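The arithmetic behind the table can be checked directly from the printed sums of squares; this sketch just reproduces the Mean Sq, F value, and P-value columns:

```r
# Sums of squares and degrees of freedom, read off the ANOVA table
ss_trt <- 127375; df_trt <- 3    # factor(Type) line
ss_res <- 33839;  df_res <- 20   # Residuals line

ms_trt <- ss_trt / df_trt        # mean square for Type: 42458.33
ms_res <- ss_res / df_res        # mean square for Residuals: 1691.95
f_stat <- ms_trt / ms_res        # F value: about 25.09

# P-value: upper tail of the F distribution with (3, 20) df
p_val <- pf(f_stat, df_trt, df_res, lower.tail = FALSE)
```

Reassuringly, round(f_stat, 2) gives 25.09 and p_val is below 10^-6, matching the summary() output.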
Multiple Comparisons

When, as in this example, we decide that there are significant differences, the next question is: what are they?

One approach is to take each pair of samples and compare them using either:
- a hypothesis test that the means are equal;
- a confidence interval for the difference.

We have seen that these alternatives are essentially equivalent.
Sometimes this "pairwise" approach is reasonable. However, its error rate may be unacceptable.

Among I samples, there are I(I − 1)/2 pairwise comparisons. If we construct I(I − 1)/2 pairwise confidence intervals, each with probability α of being incorrect, we should expect αI(I − 1)/2 of them to be wrong.

If α = 0.05 and I = 4, as in the example, αI(I − 1)/2 = 0.3, so the "per-family" error rate is 30%.
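A one-line check of this arithmetic, which also shows how quickly the problem grows with I:

```r
# Expected number of incorrect intervals among all I(I-1)/2 pairwise
# comparisons, each constructed at error level alpha
expected_errors <- function(alpha, I) alpha * I * (I - 1) / 2

expected_errors(0.05, 4)    # 0.3, as in the example
expected_errors(0.05, 10)   # 2.25: with 10 samples, expect ~2 wrong intervals
```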
Tukey's HSD

Tukey's "Honest Significant Difference" (HSD) method constructs the pairwise confidence intervals in such a way that the probability that all of them are correct is the desired level 1 − α.

Using R

boxesHSD <- TukeyHSD(boxesAov)
boxesHSD
Output

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Strength ~ factor(Type), data = boxes)

$`factor(Type)`
          diff        lwr         upr     p adj
2-1   43.93333  -22.53671  110.403377 0.2804669
3-1  -14.93333  -81.40338   51.536711 0.9215560
4-1 -150.98333 -217.45338  -84.513289 0.0000185
3-2  -58.86667 -125.33671    7.603377 0.0942542
4-2 -194.91667 -261.38671 -128.446623 0.0000004
4-3 -136.05000 -202.52004  -69.579956 0.0000726
The results can also be presented graphically:

plot(boxesHSD)

The implications, from both presentations, are:
- The confidence intervals for the comparisons not involving Type 4 all contain zero, so those differences are not significant.
- The confidence intervals comparing Type 4 with each of the other three types lie entirely below zero, so Type 4 has significantly lower strength than the other types.
- Even though Type 2 appears to be the strongest, that is not confirmed by this experiment.
Inconsistency

In some data sets, the F-test and the HSD may give inconsistent results:
- the F-test may reject H0, and yet no pair of means is significantly different using the HSD;
- conversely, the F-test may fail to reject H0, and yet at least two means are significantly different using the HSD.

If your interest really is in pairwise comparisons, for instance to rank the populations, or to find the best or worst, you should ignore the F-test: just reject H0 if and only if at least two means are significantly different.
False Discovery Rate

Methods other than Tukey's have been proposed for managing the multiplicity problem. When I is large, Tukey's method may be unnecessarily conservative, meaning that it may fail to detect real differences.

Yoav Benjamini and Yosef Hochberg developed the idea of the "false discovery rate" (FDR) as an alternative: instead of controlling the probability of any false rejection, it controls the expected proportion of false rejections among all rejections, which is less stringent when many comparisons are made.
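The Benjamini–Hochberg adjustment is built into base R as p.adjust(). A sketch on a hypothetical vector of six raw pairwise P-values (illustrative numbers, not derived from the boxes data):

```r
# Hypothetical raw P-values from six pairwise comparisons
p_raw <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.060)

# Benjamini-Hochberg adjustment: controls the expected proportion of
# false discoveries among the comparisons declared significant
p_bh <- p.adjust(p_raw, method = "BH")
round(p_bh, 4)
# 0.0060 0.0240 0.0504 0.0504 0.0504 0.0600
```

Comparing the adjusted values to α = 0.05 then gives the FDR-controlled set of discoveries; here the first two comparisons would be declared significant.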