
Null Hypothesis Significance Testing
p-values, significance level, power, t-tests
18.05 Spring 2018
NO CLASS Monday April 16 (Patriots’ Day)
Problem set due Wednesday April 18
Watch class web site for RESCHEDULED OFFICE HOURS

Understand this figure
[Figure: the null pdf $f(x \mid H_0)$ plotted over the $x$-axis, which is divided into "reject $H_0$", "don’t reject $H_0$", "reject $H_0$".]
$x$ = test statistic
$f(x \mid H_0)$ = pdf of null distribution = green curve
Rejection region is a portion of the $x$-axis.
Significance = probability over the rejection region = red area.
April 14, 2018 2 / 26

Simple and composite hypotheses
Simple hypothesis: the sampling distribution is fully specified. Usually the parameter of interest has a specific value.
Composite hypothesis: the sampling distribution is not fully specified. Usually the parameter of interest has a range of values.
Example. A coin has probability $\theta$ of heads. Toss it 30 times and let $x$ be the number of heads.
(i) $H$: $\theta = 0.4$ is simple. $x \sim \text{binomial}(30, 0.4)$.
(ii) $H$: $\theta > 0.4$ is composite. $x \sim \text{binomial}(30, \theta)$ depends on which value of $\theta$ is chosen.
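The coin example can be sketched numerically. This is a minimal illustration in Python using only the standard library; the helper name `binomial_pmf` and the particular values of θ tried for the composite case (0.5 and 0.7) are assumptions made for illustration, not part of the slides.

```python
from math import comb


def binomial_pmf(k: int, n: int, theta: float) -> float:
    """P(x = k) when x ~ binomial(n, theta)."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


n = 30

# Simple hypothesis theta = 0.4: exactly one sampling distribution.
simple = [binomial_pmf(k, n, 0.4) for k in range(n + 1)]

# Composite hypothesis theta > 0.4: every theta gives a different distribution.
# 0.5 and 0.7 are arbitrary illustrative choices.
for theta in (0.4, 0.5, 0.7):
    print(f"theta = {theta}: P(x = 15) = {binomial_pmf(15, n, theta):.4f}")
```

The same outcome (15 heads) has a different probability under each θ, which is why a composite hypothesis does not pin down a single null or alternative distribution.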

Extreme data and p-values
Hypotheses: $H_0$, $H_A$.
Test statistic: value $x$, computed from data.
Null distribution: $f(x \mid H_0)$ (assumes the null hypothesis is true).
Sides: $H_A$ determines if the rejection region is one or two-sided.
Rejection region/Significance: $P(x \text{ in rejection region} \mid H_0) = \alpha$.
The p-value is a tool to check if the test statistic is in the rejection region. It is also a measure of the evidence for rejecting $H_0$.
p-value: $P(\text{data at least as extreme as } x \mid H_0)$
“Data at least as extreme” is defined by the sidedness of the rejection region.

Extreme data and p-values
Example. Suppose we have the right-sided rejection region shown below. Also suppose we see data with test statistic $x = 4.2$. Should we reject $H_0$?
[Figure: null pdf $f(x \mid H_0)$ with critical value $c_\alpha$ separating "don’t reject $H_0$" (left) from "reject $H_0$" (right); $x = 4.2$ lies to the right of $c_\alpha$.]
answer: The test statistic is in the rejection region, so reject $H_0$.
Alternatively: blue area < red area.
Significance: $\alpha = P(x \text{ in rejection region} \mid H_0)$ = red area.
p-value: $p = P(\text{data at least as extreme as } x \mid H_0)$ = blue area.
Since $p < \alpha$ we reject $H_0$.

Extreme data and p-values
Example. Now suppose $x = 2.1$ as shown. Should we reject $H_0$?
[Figure: null pdf $f(x \mid H_0)$ with critical value $c_\alpha$ separating "don’t reject $H_0$" (left) from "reject $H_0$" (right); $x = 2.1$ lies to the left of $c_\alpha$.]
answer: Test statistic not in the rejection region: don’t reject $H_0$.
Alternatively: blue area > red area.
Significance: $\alpha = P(x \text{ in rejection region} \mid H_0)$ = red area.
p-value: $p = P(\text{data at least as extreme as } x \mid H_0)$ = blue area.
Since $p > \alpha$ we don’t reject $H_0$.

Critical values
The boundaries of the rejection region are called critical values.
Critical values are labeled by the probability to their right.
They are complementary to quantiles: $c_p = q_{1-p}$.
Example: for a standard normal $c_{0.025} = 1.96$ and $c_{0.975} = -1.96$.
In R, for a standard normal $c_{0.025}$ = qnorm(0.975).
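The relation $c_p = q_{1-p}$ can be checked with a short sketch. The slide uses R's qnorm; the version below is a Python equivalent built on the standard library's `statistics.NormalDist`, and the function name `critical_value` is an illustrative assumption.

```python
from statistics import NormalDist


def critical_value(p: float, dist: NormalDist = NormalDist()) -> float:
    """Critical value c_p: the point with probability p to its right."""
    return dist.inv_cdf(1 - p)  # c_p = q_{1-p}, like R's qnorm(1 - p)


print(round(critical_value(0.025), 2))  # 1.96
print(round(critical_value(0.975), 2))  # -1.96
```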

Two-sided p-values
These are trickier: what does ‘at least as extreme’ mean in this case?
The p-value is a tool for deciding if the test statistic is in the rejection region.
For a two-sided test, $p = 2 \min(\text{left tail prob. of } x, \text{ right tail prob. of } x)$. If the null distribution is symmetric around zero this equals $P(|X| \ge |x| \mid H_0)$.
[Figure: null pdf $f(x \mid H_0)$ with critical values $c_{1-\alpha/2}$ and $c_{\alpha/2}$; the regions are "reject $H_0$", "don’t reject $H_0$", "reject $H_0$", and $x$ lies between the two critical values.]
$x$ is outside the rejection region, so $p > \alpha$: do not reject $H_0$.
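As a sketch of the two-sided rule, the function below doubles the smaller of the two tail probabilities of the observed statistic. It assumes a normal null distribution purely for illustration (the slide does not fix one), and the name `two_sided_p` is hypothetical.

```python
from statistics import NormalDist


def two_sided_p(x: float, null: NormalDist = NormalDist()) -> float:
    """Two-sided p-value: twice the smaller tail probability of x."""
    left_tail = null.cdf(x)       # P(X <= x | H0)
    right_tail = 1 - null.cdf(x)  # P(X >= x | H0)
    return 2 * min(left_tail, right_tail)


alpha = 0.05
for x in (-2.5, 1.0, 1.96):
    p = two_sided_p(x)
    print(f"x = {x}: p = {p:.4f}, reject H0: {p < alpha}")
```

For a null distribution symmetric about zero, `two_sided_p(x)` and `two_sided_p(-x)` agree, which matches the picture of two mirror-image rejection tails.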

Concept question
1. You collect data from an experiment and do a left-sided z-test with significance 0.1. You find the z-value is 1.8.
(i) Which of the following computes the critical value for the rejection region?
(a) pnorm(0.1, 0, 1) (b) pnorm(0.9, 0, 1) (c) pnorm(0.95, 0, 1)
(d) pnorm(1.8, 0, 1) (e) 1 - pnorm(1.8, 0, 1) (f) qnorm(0.05, 0, 1)
(g) qnorm(0.1, 0, 1) (h) qnorm(0.9, 0, 1) (i) qnorm(0.95, 0, 1)
(ii) Which of the above computes the p-value for this experiment?
(iii) Should you reject the null hypothesis? (a) Yes (b) No
answer: (i) g. (ii) d. (iii) No. (Draw a picture!)
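The answers can be checked numerically. The sketch below redoes the two R calls from the answer in Python with `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(0, 1)
alpha = 0.1
z = 1.8

# Left-sided test: the rejection region is everything at or below the alpha-quantile.
critical = std_normal.inv_cdf(alpha)  # R: qnorm(0.1, 0, 1)
p_value = std_normal.cdf(z)           # R: pnorm(1.8, 0, 1), the left tail of z
reject = z <= critical

print(f"critical value = {critical:.4f}")
print(f"p-value = {p_value:.4f}")
print(f"reject H0: {reject}")
```

Because z = 1.8 sits far to the right of a left-sided rejection region, the p-value is close to 1 and the null hypothesis is not rejected.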

Error, significance and power: a tale of a President
                                     $H_0$ is true       $H_A$ is true
Our decision   Reject $H_0$          Type I error        correct decision
               Don’t reject $H_0$    correct decision    Type II error
Significance level = $P$(type I error)
= probability we incorrectly reject $H_0$
= $P(\text{test statistic in rejection region} \mid H_0)$
= $P$(false positive)
Power = probability we correctly reject $H_0$
= $P(\text{test statistic in rejection region} \mid H_A)$
= $1 - P$(type II error)
= $P$(true positive)
• $H_A$ determines the power of the test.
• Significance and power are both probabilities of the rejection region.
• Want significance level near 0 and power near 1.

Table question: significance level and power
The rejection region, boxed in red on the slide, is $x \in \{0, 1, 2\} \cup \{8, 9, 10\}$. The corresponding probabilities for different hypotheses are shaded below it.
x                              0      1      2      3      4      5      6      7      8      9      10
H_0: p(x | θ = 0.5)          .001   .010   .044   .117   .205   .246   .205   .117   .044   .010   .001
H_A: p(x | θ = 0.6)          .000   .002   .011   .042   .111   .201   .251   .215   .121   .040   .006
H_A: p(x | θ = 0.7)          .000   .0001  .001   .009   .037   .103   .200   .267   .233   .121   .028
1. Find the significance level of the test.
2. Find the power of the test for each of the two alternative hypotheses.
answer:
1. Significance level = $P(x \text{ in rejection region} \mid H_0) = 0.11$
2. $\theta = 0.6$: power = $P(x \text{ in rejection region} \mid H_A) = 0.18$
$\theta = 0.7$: power = $P(x \text{ in rejection region} \mid H_A) = 0.383$
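The answers can be reproduced with a short sketch. Two things here are assumptions rather than statements from the slide: that the table rows are binomial(10, θ) probabilities (the tabulated values agree with that), and that the rejection region is x ∈ {0, 1, 2, 8, 9, 10} (the region whose probabilities add up to the stated answers). Exact binomial values differ slightly from sums of the rounded table entries.

```python
from math import comb

N_TOSSES = 10                           # assumption: table rows are binomial(10, theta)
REJECTION_REGION = (0, 1, 2, 8, 9, 10)  # assumption: inferred from the stated answers


def binomial_pmf(k: int, n: int, theta: float) -> float:
    """P(x = k) when x ~ binomial(n, theta)."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def prob_reject(theta: float) -> float:
    """Probability that x lands in the rejection region for a given theta."""
    return sum(binomial_pmf(k, N_TOSSES, theta) for k in REJECTION_REGION)


significance = prob_reject(0.5)  # rejection probability under H0
power_06 = prob_reject(0.6)      # rejection probability under H_A: theta = 0.6
power_07 = prob_reject(0.7)      # rejection probability under H_A: theta = 0.7

print(f"significance = {significance:.3f}")
print(f"power (theta = 0.6) = {power_06:.3f}")
print(f"power (theta = 0.7) = {power_07:.3f}")
```

Significance and power come from the same function evaluated at different θ, which is the point of the slide: both are probabilities of the rejection region.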

Concept question
1. The power of the test in the graph is given by the area of
[Figure: pdfs $f(x \mid H_A)$ and $f(x \mid H_0)$ over the $x$-axis, which is split into a "reject $H_0$" region and a "non-reject $H_0$" region; four areas under the curves are labeled $R_1$, $R_2$, $R_3$, $R_4$.]
(a) $R_1$ (b) $R_2$ (c) $R_1 + R_2$ (d) $R_1 + R_2 + R_3$
answer: (c) $R_1 + R_2$. Power = $P(\text{rejection region} \mid H_A)$ = area $R_1 + R_2$.

Concept question
2. Which test has higher power?
[Figure: two graphs, top and bottom. Each shows pdfs $f(x \mid H_A)$ and $f(x \mid H_0)$ over an $x$-axis split into a "reject $H_0$" region and a "do not reject $H_0$" region.]
(a) Top graph (b) Bottom graph

Solution
answer: (a) The top graph.
Power = $P(x \text{ in rejection region} \mid H_A)$. In the top graph almost all the probability of $H_A$ is in the rejection region, so the power is close to 1.
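The effect of how far $H_A$ sits from $H_0$ can be sketched with a hypothetical setup that is not taken from the slide's graphs: a right-sided test whose null is N(0, 1) and whose alternatives are N(mu_alt, 1). The function name `power_right_sided` and the chosen alternative means are illustrative assumptions.

```python
from statistics import NormalDist


def power_right_sided(mu_alt: float, alpha: float = 0.05, sigma: float = 1.0) -> float:
    """Power of a right-sided test of H0: N(0, sigma^2) against N(mu_alt, sigma^2)."""
    critical = NormalDist(0, sigma).inv_cdf(1 - alpha)  # rejection region is x >= critical
    return 1 - NormalDist(mu_alt, sigma).cdf(critical)  # P(reject | H_A)


for mu_alt in (0.0, 1.0, 3.0, 5.0):
    print(f"mu_alt = {mu_alt}: power = {power_right_sided(mu_alt):.3f}")
```

When the alternative coincides with the null (mu_alt = 0) the power equals the significance level, and it climbs toward 1 as more of the alternative's probability falls inside the rejection region.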

Discussion question
The null distribution for test statistic $x$ is $N(4, 8^2)$. The rejection region is $\{x \ge 20\}$.
What are the significance level and power of this test?
answer: 20 is two standard deviations above the mean of 4. Thus,
significance = $P(x \ge 20 \mid H_0) \approx 0.025$
The question about power was a trick: we can’t compute the power without an alternative distribution.
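A quick numerical check of the significance level, sketched with Python's `statistics.NormalDist`. The slide's 0.025 comes from the rule of thumb that about 95% of a normal distribution lies within two standard deviations of the mean; the exact tail is slightly smaller.

```python
from statistics import NormalDist

null = NormalDist(mu=4, sigma=8)  # null distribution N(4, 8^2)
significance = 1 - null.cdf(20)   # P(x >= 20 | H0), the right tail beyond 2 sigma

print(f"significance = {significance:.4f}")
```

No analogous line can be written for power, since that would need the distribution of x under a specific alternative.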

One-sample t-test
Data: we assume normal data with both $\mu$ and $\sigma$ unknown: $x_1, x_2, \ldots, x_n \sim N(\mu, \sigma^2)$.
Null hypothesis: $\mu = \mu_0$ for some specific value $\mu_0$.
Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$, where $s^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$.
Here $t$ is the Studentized mean and $s^2$ is the sample variance.
Null distribution: $f(t \mid H_0)$ is the pdf of $T \sim t(n-1)$, the t distribution with $n-1$ degrees of freedom.
Two-sided p-value: $p = P(|T| > |t|)$.
R command: pt(x,n-1) is the cdf of $t(n-1)$.
http://mathlets.org/mathlets/t-distribution/

Board question: z and one-sample t-test
For both problems use significance level $\alpha = 0.05$.
Assume the data 2, 4, 4, 10 is drawn from a $N(\mu, \sigma^2)$.
Suppose $H_0$: $\mu = 0$; $H_A$: $\mu \ne 0$.
1. Is the test one or two-sided? If one-sided, which side?
2. Assume $\sigma^2 = 16$ is known and test $H_0$ against $H_A$.
3. Now assume $\sigma^2$ is unknown and test $H_0$ against $H_A$.
Answer on next slide.

Solution
We have $\bar{x} = 5$, $s^2 = \dfrac{9 + 1 + 1 + 25}{3} = 12$.
1. Two-sided. A standardized sample mean far above or below 0 is evidence against $H_0$, and consistent with $H_A$.
2. We’ll use the standardized mean $z$ for the test statistic (we could also use $\bar{x}$). The null distribution for $z$ is $N(0, 1)$. This is a two-sided test so the rejection region is
($z \le z_{0.975}$ or $z \ge z_{0.025}$) $= (-\infty, -1.96] \cup [1.96, \infty)$
Since $z = (\bar{x} - 0)/(4/2) = 2.5$ is in the rejection region we reject $H_0$ in favor of $H_A$.
Repeating the test using a p-value:
$p = P(|z| \ge 2.5 \mid H_0) = 0.012$
Since $p < \alpha$ we reject $H_0$ in favor of $H_A$.
Continued on next slide.
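The z-test in part 2 can be retraced in a few lines. This is a sketch in Python with the standard library only; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean

data = [2, 4, 4, 10]
mu0 = 0       # H0: mu = 0
sigma = 4     # known standard deviation, since sigma^2 = 16
alpha = 0.05

# Standardized mean: z = (xbar - mu0) / (sigma / sqrt(n))
z = (mean(data) - mu0) / (sigma / sqrt(len(data)))

std_normal = NormalDist()
critical = std_normal.inv_cdf(1 - alpha / 2)   # upper critical value z_{0.025}
p_value = 2 * (1 - std_normal.cdf(abs(z)))     # two-sided: P(|Z| >= |z| | H0)

print(f"z = {z}, critical value = {critical:.2f}")
print(f"p-value = {p_value:.3f}, reject H0: {p_value < alpha}")
```

Both routes give the same decision: the statistic lies beyond the critical value exactly when the p-value is below α.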

Solution continued
3. We’ll use the Studentized $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ for the test statistic. The null distribution for $t$ is $t_3$. For the data we have $t = 5/\sqrt{3}$. This is a two-sided test so the p-value is
$p = P(|t| \ge 5/\sqrt{3} \mid H_0) = 0.06318$
Since $p > \alpha$ we do not reject $H_0$.
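The t-test in part 3 can be checked the same way. Python's standard library has no t distribution, so this sketch integrates the t density numerically with Simpson's rule instead of calling a library cdf such as R's pt; the helper names `t_pdf` and `two_sided_t_p` and the step count are illustrative assumptions.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev


def t_pdf(x: float, df: int) -> float:
    """Density of the t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_t_p(t: float, df: int, steps: int = 2000) -> float:
    """P(|T| > |t|) using Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_center = total * h / 3  # P(0 <= T <= |t|)
    return 1 - 2 * half_center   # the density is symmetric about 0


data = [2, 4, 4, 10]
n = len(data)
mu0 = 0

# Studentized mean; stdev() is the sample standard deviation (divides by n - 1).
t = (mean(data) - mu0) / (stdev(data) / sqrt(n))
p_value = two_sided_t_p(t, df=n - 1)

print(f"t = {t:.4f}, p-value = {p_value:.5f}")
```

With the same data, replacing the known σ by the sample estimate s moves the decision from reject to do not reject at α = 0.05, because the t(3) distribution has heavier tails than the standard normal.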

Two-sample t-test: equal variances
Data: we assume normal data with $\mu_x$, $\mu_y$ and (same) $\sigma$ unknown: $x_1, \ldots, x_n \sim N(\mu_x, \sigma^2)$, $y_1, \ldots, y_m \sim N(\mu_y, \sigma^2)$.
Null hypothesis $H_0$: $\mu_x = \mu_y$.
Pooled variance: $s_p^2 = \dfrac{(n-1)s_x^2 + (m-1)s_y^2}{n+m-2} \left( \dfrac{1}{n} + \dfrac{1}{m} \right)$.
Test statistic: $t = \dfrac{\bar{x} - \bar{y}}{s_p}$.
Null distribution: $f(t \mid H_0)$ is the pdf of $T \sim t(n+m-2)$.
In general (so we can compute power) we have $\dfrac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{s_p} \sim t(n+m-2)$.
Note: there are more general formulas for unequal variances.
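A minimal sketch of the pooled statistic follows. The function name `pooled_t_statistic` and the two small samples are made up for illustration; only the formulas come from the slide, where the pooled variance already includes the factor (1/n + 1/m).

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(xs: list[float], ys: list[float]) -> tuple[float, int]:
    """Return (t, degrees of freedom) for the equal-variance two-sample t-test."""
    n, m = len(xs), len(ys)
    # variance() is the sample variance (divides by n - 1).
    pooled_var = (
        ((n - 1) * variance(xs) + (m - 1) * variance(ys)) / (n + m - 2)
    ) * (1 / n + 1 / m)
    t = (mean(xs) - mean(ys)) / sqrt(pooled_var)
    return t, n + m - 2


# Hypothetical samples, not data from the slides.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
t, df = pooled_t_statistic(xs, ys)
print(f"t = {t:.4f} with {df} degrees of freedom")
```

The returned t would then be compared with the t(n + m − 2) distribution, for instance through a two-sided p-value.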
