ECON 626: Applied Microeconomics Lecture 9: Multiple Test Corrections Professors: Pamela Jakiela and Owen Ozier

Multiple Hypothesis Testing: The Problem

Consider testing 100 true null hypotheses: how many will be rejected?

UMD Economics 626: Applied Microeconomics Lecture 9: Multiple Test Corrections, Slide 2

Multiple Hypothesis Testing: The Problem

Number of tests    1       2         k
Test size          0.05    0.05      0.05
No rejections      0.95    0.9025    $0.95^k$
Any rejections     0.05    0.0975    $1 - 0.95^k$
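The arithmetic in the table can be checked with a minimal sketch. It assumes, as the table does, that the tests are independent and that every null is true; the function names are hypothetical and chosen only for illustration.

```python
def prob_any_rejection(k, alpha=0.05):
    """Probability of at least one rejection across k independent tests
    of true nulls, each run at test size alpha: 1 - (1 - alpha)^k."""
    return 1.0 - (1.0 - alpha) ** k


def expected_rejections(k, alpha=0.05):
    """Expected number of (false) rejections among k true nulls.
    By linearity of expectation this is k * alpha, with or without independence."""
    return k * alpha


if __name__ == "__main__":
    # Print the "Any rejections" row for a few values of k.
    for k in (1, 2, 10, 100):
        print(k, round(prob_any_rejection(k), 4), expected_rejections(k))
```

For the 100 true nulls in the motivating question, the expected number of rejections is 100 × 0.05 = 5.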

Multiple Hypothesis Testing: The Problem

[Figure: probability of falsely rejecting a null hypothesis (vertical axis, 0 to 1) plotted against the number of (independent) hypotheses tested (horizontal axis, 0 to 100)]

Under the null, the probability of rejecting at least one hypothesis increases rapidly with the number of independent hypothesis tests

Multiple Hypothesis Testing: The Problem

How can we (credibly) test multiple hypotheses?

• What sort of ninny would test 100 hypotheses?
• Valid reasons for testing many hypotheses:
  ◮ Studies often have 2 or 3 treatment arms (and rightly so!)
  ◮ Difficult to predict which outcomes will be affected
  ◮ Particularly true for secondary hypotheses/treatment effects
  ◮ Different measures of the same outcome often available
  ◮ Heterogeneity in treatment effects (across sub-samples)

Multiple Hypothesis Testing: The Problem

Published empirical papers include a lot of hypothesis tests!

Source: Young (2019)

Bonferroni Corrections

Most conservative approach is the Bonferroni method∗

• Problem: you wish to test hypotheses $H_1, \ldots, H_k$ using a test size of $\alpha$
• Solution (of sorts): use a test size of $\alpha/k$ instead
  ◮ Family-wise error rate (FWER): probability of rejecting at least one true null
  ◮ Bonferroni correction holds FWER at or below $\alpha$
  ◮ Bonferroni corrections are too conservative:
    ◮ FWER ≈ 0.04877 (rather than 0.05) when number of independent tests is large
    ◮ Bonferroni corrections can be extremely conservative when tests are not independent (consider example of perfectly correlated tests)

Good news: if you are testing k hypotheses and a Bonferroni correction works (i.e. your results hold up), you don’t need the rest of this lecture

∗ Purportedly developed by Olive Jean Dunn and not, ahem, Carlo Emilio Bonferroni

Bonferroni Corrections

Number of tests           1       k
Test size (per test)      0.05    $\alpha/k$
1 - (single) test size    0.95    $1 - \alpha/k$
No rejections             0.95    $(1 - \alpha/k)^k$
Any rejections            0.05    $1 - (1 - \alpha/k)^k$

Bonferroni Corrections

Number of tests           1       2           10
Test size (per test)      0.05    0.025       0.005
1 - (single) test size    0.95    0.975       0.995
No rejections             0.95    0.950625    0.951110
Any rejections            0.05    0.049375    0.048890
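The "Any rejections" row, and the large-k value of roughly 0.04877, can be reproduced with a short sketch. It assumes independent tests of true nulls and $\alpha = 0.05$; the function names are hypothetical. The limiting value follows from $(1 - \alpha/k)^k \to e^{-\alpha}$ as k grows.

```python
import math


def bonferroni_fwer(k, alpha=0.05):
    """FWER under a Bonferroni correction with k independent tests of
    true nulls: 1 - (1 - alpha/k)^k."""
    per_test_size = alpha / k
    return 1.0 - (1.0 - per_test_size) ** k


def bonferroni_fwer_limit(alpha=0.05):
    """Large-k limit of the expression above: 1 - exp(-alpha)."""
    return 1.0 - math.exp(-alpha)


if __name__ == "__main__":
    for k in (1, 2, 10, 1000):
        print(k, round(bonferroni_fwer(k), 6))
    print("limit", round(bonferroni_fwer_limit(), 6))
```

The FWER falls slightly below 0.05 as k increases, which is the sense in which the correction is (mildly) too conservative even under independence.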

Stepdown Methods

Holm (1979) proposes a less conservative stepdown method:

0. Order k p-values from smallest to largest, $p_{(1)}, p_{(2)}, \ldots, p_{(k)}$
1a. If $p_{(1)} > \alpha/k$, stop. Fail to reject all hypotheses.
1b. Reject $H_{(1)}$ if $p_{(1)} < \alpha/k$. Proceed to Step 2.
2a. If $p_{(2)} > \alpha/(k-1)$, stop. Fail to reject all remaining hypotheses.
2b. Reject $H_{(2)}$ if $p_{(2)} < \alpha/(k-1)$. Proceed to Step 3.
...
j. Repeat as needed until you stop rejecting hypotheses because $p_{(j)} > \alpha/(k-(j-1))$ or all k hypotheses have been rejected

More good news: Romano & Wolf (JASA, 2005) state “This procedures holds under arbitrary dependence on the joint distribution of p-values.”
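The steps above can be sketched in a few lines. This is an illustrative sketch rather than a reference implementation: the function name `holm_reject` is hypothetical, and ties at the threshold (which the steps leave unspecified) are treated as rejections, which is an assumption.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans (same order as p_values) marking which
    hypotheses the Holm stepdown procedure rejects at level alpha."""
    k = len(p_values)
    # Step 0: visit hypotheses from smallest to largest p-value.
    order = sorted(range(k), key=lambda i: p_values[i])
    reject = [False] * k
    for step, idx in enumerate(order):
        # At step j (1-based) the threshold is alpha / (k - (j - 1)).
        threshold = alpha / (k - step)
        if p_values[idx] <= threshold:  # assumption: ties count as rejections
            reject[idx] = True
        else:
            break  # stop: fail to reject this and all remaining hypotheses
    return reject


if __name__ == "__main__":
    p = [0.3, 0.001, 0.02, 0.015]
    print(holm_reject(p))                 # Holm decisions
    print([x <= 0.05 / len(p) for x in p])  # Bonferroni decisions, for comparison
```

With these illustrative p-values, Bonferroni rejects only the hypothesis with p = 0.001 (the common threshold is 0.0125), while Holm also rejects those with p = 0.015 and p = 0.02 because the threshold loosens at each step.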

Stepdown Methods: Holm vs. Bonferroni

Adjusted p-values for k = 5 hypotheses (Bonferroni: $p \times k$; Holm: $p_{(j)} \times (k - j + 1)$):

p-value    Bonferroni    Holm
0.010      0.050         0.050
0.010      0.050         0.040
0.015      0.075         0.045
0.050      0.250         0.100
0.100      0.500         0.100

In the original slide, blue indicates hypotheses that would not be rejected using a test size of $\alpha = 0.05$
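The adjusted p-values in this comparison can be reproduced with the sketch below. The function names and the `monotone` flag are hypothetical. One caveat: the Holm column here is the raw product $p_{(j)} \times (k - j + 1)$; many implementations additionally take a running maximum so that adjusted p-values never decrease down the ordered list, and the flag switches between the two conventions.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: p * k, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]


def holm_adjust(p_values, monotone=True):
    """Holm-adjusted p-values: p_(j) * (k - j + 1), capped at 1.
    If monotone is True, also enforce a running maximum over the
    ordered p-values (a common convention, not used in the table above)."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])  # stable for ties
    adjusted = [0.0] * k
    running_max = 0.0
    for step, idx in enumerate(order):
        value = min(1.0, p_values[idx] * (k - step))
        if monotone:
            running_max = max(running_max, value)
            value = running_max
        adjusted[idx] = value
    return adjusted


if __name__ == "__main__":
    p = [0.010, 0.010, 0.015, 0.050, 0.100]
    print([round(x, 3) for x in bonferroni_adjust(p)])
    print([round(x, 3) for x in holm_adjust(p, monotone=False)])
    print([round(x, 3) for x in holm_adjust(p, monotone=True)])
```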

Resampling-Based Stepdown Methods

More complicated/powerful bootstrap-based stepdown methods exist

• Examples: Westfall & Young (1993), Romano & Wolf (2005)
• These procedures exploit additional assumptions to increase power (so you don’t need them if simpler methods “work” in your setting)
• They are also more computationally intensive; the papers often include phrases like “efficient computation” or “computationally feasible”
• Approaches use some form of stepdown structure
  ◮ At each step, “accept”/reject decisions use the empirical distribution of bootstrapped p-values associated with not-yet-rejected hypotheses
  ◮ Can be modified to generate adjusted p-values

Example: Romano and Wolf (2005)

For each of k hypotheses, let $t_k^{*,m}$ be a resampling-based test statistic, defined for $m = 1, \ldots, M$ bootstrap replications, permutations, etc.

• Test statistics defined so that higher indicates greater significance
• Unadjusted p-value: $\hat{p}_k = \#\{t_k^{*,m} \geq t_k\} / M$

To simplify notation, assume hypotheses are ordered: $t_1 \geq t_2 \geq \ldots \geq t_k$

• For $j = 1, \ldots, k$ and $m = 1, \ldots, M$, define:

  $\text{max}_j^{*,m} = \max\{t_j^{*,m}, t_{j+1}^{*,m}, \ldots, t_k^{*,m}\}$
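The two quantities defined so far can be computed as in the sketch below. It covers only the unadjusted p-values and the successive maxima, not the full adjustment procedure. The function names and the data layout (a list of M resampled draws, each a list of k statistics) are assumptions made for illustration.

```python
def unadjusted_p_values(t_obs, t_star):
    """Share of resampled statistics at least as large as the observed one,
    computed separately for each hypothesis.
    t_obs: list of k observed statistics; t_star: list of M draws of length k."""
    M = len(t_star)
    k = len(t_obs)
    return [sum(1 for draw in t_star if draw[j] >= t_obs[j]) / M for j in range(k)]


def successive_maxima(t_obs, t_star):
    """Order hypotheses from most to least significant observed statistic,
    then for each draw m and ordered position j return the maximum of the
    resampled statistics over positions j, j+1, ..., k."""
    k = len(t_obs)
    order = sorted(range(k), key=lambda j: -t_obs[j])
    maxima = []
    for draw in t_star:
        row = [0.0] * k
        running = float("-inf")
        # Walk from the least significant hypothesis back to the most significant.
        for pos in range(k - 1, -1, -1):
            running = max(running, draw[order[pos]])
            row[pos] = running
        maxima.append(row)
    return order, maxima


if __name__ == "__main__":
    t_obs = [2.5, 1.0, 3.0]            # made-up observed statistics
    t_star = [[0.5, 1.2, 0.1],         # made-up resampled statistics, M = 4
              [2.6, 0.3, 3.5],
              [1.0, 0.9, 0.2],
              [0.2, 2.0, 1.1]]
    print(unadjusted_p_values(t_obs, t_star))
    print(successive_maxima(t_obs, t_star))
```

All numbers in the usage example are invented purely to exercise the functions.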
