Two-Tailed Tests and Stat. Significance Worksheet: Love is Blind STAT 113 Tests and Confidence Intervals Colin Reimer Dawson Oberlin College October 10th, 2016
Reminders and Announcements
• HW online, due Friday (but it is OK to turn it in during break)
Two-Tailed Tests

Two-Tailed Test: In a two-tailed test, H1 does not specify the direction (sign) of a difference/correlation/slope, so outcomes at either extreme count in its favor. The P-value therefore counts outcomes at or beyond the observed one, together with the equally extreme outcomes in the other “tail”.

We should prefer two-tailed tests unless only one side of the alternative is plausible a priori.
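The tail-counting idea can be sketched in a few lines. The setup here is an assumption made purely for illustration and does not come from the slides: the test statistic is the number of successes in 20 trials, H0 says the success probability is 0.5, and 15 successes were observed. A symmetric null is used because it makes the "mirror image" outcomes on the other tail easy to identify.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def one_tailed_p(observed, n, p=0.5):
    """P(X >= observed) under H0: outcomes at or beyond the observed one."""
    return sum(binom_pmf(k, n, p) for k in range(observed, n + 1))

def two_tailed_p(observed, n):
    """Two-tailed P-value under a symmetric null (p = 0.5).

    Counts every outcome at least as far from the center n/2 as the
    observed one, so both tails contribute.
    """
    center = n / 2
    distance = abs(observed - center)
    return sum(binom_pmf(k, n, 0.5) for k in range(n + 1)
               if abs(k - center) >= distance)

# Hypothetical data: 15 successes out of 20 trials.
n, observed = 20, 15
print("one-tailed P:", one_tailed_p(observed, n))
print("two-tailed P:", two_tailed_p(observed, n))
```

Because this particular null is symmetric, the two-tailed P-value comes out to exactly twice the one-tailed value. With an asymmetric null distribution, "equally extreme" has to be defined more carefully, and simple doubling is only one of several conventions.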
What is low enough?

Significance level (α): We need to decide for ourselves, before collecting data, what we will count as a “low enough” P-value to achieve statistical significance. This threshold is called the significance level of the test, written α.
Making a Decision

Reject H0 or not? Compare P to α.
(a) P ≥ α: Do not reject H0. (The data would not be that surprising if H0 were true. H0 is “presumed innocent”.)
(b) P < α: Reject H0. (The data would be too surprising if H0 were true. Beyond a “reasonable doubt”.)

We do not “accept H0”; we “fail to reject” it. (There is not enough evidence to decide.)
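The decision rule is simple enough to write as a small function. This is only a sketch: the function name `decide` and the example P-values are hypothetical, and α = 0.05 is used as a common but arbitrary choice.

```python
def decide(p_value, alpha):
    """Apply the decision rule: reject H0 only when P < alpha."""
    if not (0 <= p_value <= 1 and 0 < alpha < 1):
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    # Note the wording: we never "accept" H0, we only fail to reject it.
    return "reject H0" if p_value < alpha else "fail to reject H0"

alpha = 0.05  # fixed in advance, before looking at the data
for p in (0.003, 0.05, 0.20):  # hypothetical P-values
    print(f"P = {p}: {decide(p, alpha)}")
```

Note the boundary case: a P-value exactly equal to α falls under (a), so H0 is not rejected.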
Types of Errors

2 × 2 table of possibilities:
• Truth: Is H0 actually false (does the treatment actually work)?
• Action: Did we reject H0 (did we conclude that it works)?

                      Action
Truth                 H0 rejected        H0 not rejected
H0 is false           True Discovery     Missed Discovery
H0 is true            False Discovery    No Error

Table: Possible outcomes of a null hypothesis significance test
Type I vs. Type II Errors
• A Type I Error is a false discovery (rejecting H0 when it is true). A Type II Error is a missed discovery (failing to reject H0 when it is false).
• We can set α to whatever we want. The lower it is, the less often we make Type I Errors.
• Tradeoff: Fewer Type I Errors → More Type II Errors.
Type I vs. Type II Errors

Decreasing α moves the rejection threshold out toward the tail of the H0 distribution.

[Figure, three panels: spike plot of Probability (0.00 to 0.20) against Values (0 to 20), with the rejection threshold marked. Blue spikes: distribution of outcomes if H0 is true.]
• α = 0.15, threshold = 8
• α = 0.05, threshold = 9
• α = 0.01, threshold = 11
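To see how a threshold follows from α, here is a minimal sketch. It assumes that the H0 distribution is Binomial(20, 0.25) and that H0 is rejected when the outcome is at or above the threshold. Neither assumption is stated in the slides. They are chosen because together they reproduce the three thresholds quoted (8, 9, and 11).

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(t, n, p):
    """P(X >= t): the Type I Error rate if we reject whenever X >= t."""
    return sum(binom_pmf(k, n, p) for k in range(t, n + 1))

def rejection_threshold(alpha, n, p_null):
    """Smallest t with P(X >= t | H0) <= alpha (one-tailed, upper)."""
    for t in range(n + 2):  # t = n + 1 gives an empty tail (probability 0)
        if upper_tail(t, n, p_null) <= alpha:
            return t

N, P_NULL = 20, 0.25  # assumed null model, not given in the slides
for alpha in (0.15, 0.05, 0.01):
    t = rejection_threshold(alpha, N, P_NULL)
    print(f"alpha = {alpha}: threshold = {t}, "
          f"actual Type I rate = {upper_tail(t, N, P_NULL):.4f}")
```

Because the outcomes are discrete, the actual Type I Error rate at each threshold is usually somewhat below α rather than exactly equal to it.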
Type I vs. Type II Errors

We retain H0 when we do not exceed the threshold. But if H1 is correct, this is a Type II Error. More stringent threshold → missed discoveries.

[Figure, three panels: spike plot of Probability (0.00 to 0.20) against Values (0 to 20), with the rejection threshold marked. Blue spikes: distribution of outcomes if H0 is true. Orange spikes: distribution of outcomes for one possible parameter value under H1.]
• α = 0.15, threshold = 8
• α = 0.05, threshold = 9
• α = 0.01, threshold = 11
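The tradeoff can be made concrete with a rough calculation. Everything about the two distributions here is an assumption for illustration: the null is taken to be Binomial(20, 0.25), the alternative is a hypothetical Binomial(20, 0.5) (the slides do not say which parameter value the orange spikes use), and H0 is rejected when the outcome is at or above the threshold.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def type_i_rate(threshold, n, p_null):
    """P(reject H0 | H0 true) = P(X >= threshold) under the null."""
    return sum(binom_pmf(k, n, p_null) for k in range(threshold, n + 1))

def type_ii_rate(threshold, n, p_alt):
    """P(retain H0 | H1 true) = P(X < threshold) under the alternative."""
    return sum(binom_pmf(k, n, p_alt) for k in range(threshold))

N = 20
P_NULL = 0.25  # assumed null parameter (not stated in the slides)
P_ALT = 0.50   # hypothetical parameter value under H1

# (alpha, threshold) pairs as quoted on the slides
for alpha, threshold in ((0.15, 8), (0.05, 9), (0.01, 11)):
    print(f"alpha = {alpha}, threshold = {threshold}: "
          f"Type I rate = {type_i_rate(threshold, N, P_NULL):.4f}, "
          f"Type II rate = {type_ii_rate(threshold, N, P_ALT):.4f}")
```

As the threshold moves from 8 to 11, the Type I rate falls while the Type II rate rises. The Type II rate also depends on which parameter value under H1 is actually true, so a different choice of `P_ALT` would give different numbers with the same qualitative pattern.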
Worksheet: Love is Blind, Continued