Multiple Testing: Applied Multivariate Statistics, Spring 2012

Overview  Problem of multiple testing  Controlling the FWER: - Bonferroni - Bonferroni-Holm  Controlling the FDR: - Benjamini-Hochberg  Case study 1

Package repositories in R
• Comprehensive R Archive Network (CRAN):
  - packages from diverse backgrounds
  - install packages using the function "install.packages"
  - homepage: http://cran.r-project.org/
• Bioconductor:
  - biology context
  - download the package manually, unzip it, and load it into R using "library(…, lib.loc = 'path where you saved the folder of the package')"
  - current Bioconductor releases can also be installed from within R using BiocManager::install("multtest")
  - homepage: http://www.bioconductor.org
• We are going to use the package "multtest" from Bioconductor

Example: Effect of a "wonder pill"
• Claim: the wonder pill has an effect!
• Random group of people
• Measure 100 variables before and after taking the pill: weight, blood pressure, heart rate, blood parameters, etc.
• Compare before and after using a paired t-test for each variable at the 5% significance level
• Breaking news: 5 out of 100 variables indeed showed a significant effect!

The problem of Multiple Testing
• Single test at the 5% significance level: by definition, the probability of a type 1 error is (at most) 5%
• Type 1 error: reject $H_0$ although $H_0$ is actually true
  - In the example: declare that the wonder pill changes a variable when in reality there is no change
• Let's assume that the wonder pill has no effect at all. Then every variable has a 5% chance of being "significantly changed by the drug"
• Like a lottery: number of significant tests ~ Bin(100, 0.05)
[Figure: out of all 100 tests, each has a 5% chance of ending up among the significant tests]
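
The lottery can be simulated directly. The following is a minimal sketch in base R, assuming a hypothetical group of 30 people and normally distributed measurements; neither number comes from the slides. It runs 100 paired t-tests on data with no true change and counts how many come out significant.

```r
set.seed(1)          # arbitrary seed, only for reproducibility
n_people <- 30       # hypothetical group size (not given in the slides)
n_vars   <- 100      # number of measured variables
alpha    <- 0.05

# "Before" and "after" measurements; the pill has no effect,
# so "after" differs from "before" only by mean-zero noise
before <- matrix(rnorm(n_people * n_vars), nrow = n_people)
after  <- before + matrix(rnorm(n_people * n_vars), nrow = n_people)

# One paired t-test per variable
p_values <- sapply(seq_len(n_vars), function(j) {
  t.test(after[, j], before[, j], paired = TRUE)$p.value
})

n_significant <- sum(p_values < alpha)   # "discoveries" produced by chance alone

# What Bin(100, 0.05) predicts
expected_hits     <- n_vars * alpha                              # mean number of false positives
prob_five_or_more <- 1 - pbinom(4, size = n_vars, prob = alpha)  # P(at least 5 significant tests)
```

Under the binomial model, five or more "significant" variables is an outcome that occurs more often than not, so the breaking news is what pure chance would be expected to deliver.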

Family Wise Error Rate (FWER)
• Family: the group of tests that is carried out
• FWER = probability of getting at least one wrong significance (= at least one false positive test)
• $\mathrm{FWER} = P(V \ge 1)$

                 Declared non-sign.   Declared sign.   Total
  True $H_0$     U                    V                $M_0$
  False $H_0$    T                    S                $M_1$
  Total          M - R                R                M

• Clinical trials: the Food and Drug Administration (FDA) typically requires the FWER to be less than 5%

FWER in example
• V: number of incorrectly significant tests
• V ~ Bin(100, 0.05)
• $\mathrm{FWER} = P(V \ge 1) = 1 - P(V = 0) = 1 - 0.95^{100} \approx 0.99$ (assuming independence among variables)
• We will almost certainly have at least one false positive test!
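
To see how quickly the FWER grows with the number of tests, the same formula can be evaluated for several family sizes. This is a small sketch in base R; the function name `fwer_independent` is made up for illustration, and the calculation assumes independent tests in which every null hypothesis is true.

```r
# FWER for m independent tests, each at level alpha, when all nulls are true
fwer_independent <- function(m, alpha = 0.05) {
  1 - (1 - alpha)^m
}

m <- c(1, 5, 10, 50, 100)
fwer_table <- data.frame(tests = m, fwer = fwer_independent(m))
print(fwer_table)
```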

Controlling FWER: Bonferroni Method
• "Corrects" p-values; only count a test as significant if the corrected p-value is less than the significance level
• If you do M tests, reject each $H_{0i}$ only if its p-value $P_i$ satisfies $M \cdot P_i < \alpha$
• The FWER of this procedure is at most $\alpha$
• In the example: reject $H_0$ only if 100 * p-value is less than 0.05
• Very conservative: the power to detect $H_A$ gets very small
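
Why the Bonferroni rule keeps the FWER at or below $\alpha$ can be sketched with the union bound. The index set $I_0$ of true null hypotheses is notation introduced here for illustration; $M_0 = |I_0|$ is the number of true nulls. The argument only uses that a p-value under a true null satisfies $P(P_i < t) \le t$, and it needs no independence assumption.

```latex
\begin{align*}
\mathrm{FWER}
  &= P(V \ge 1)
   = P\Big(\bigcup_{i \in I_0} \{ M \cdot P_i < \alpha \}\Big) \\
  &\le \sum_{i \in I_0} P\Big(P_i < \frac{\alpha}{M}\Big)
   \le M_0 \cdot \frac{\alpha}{M}
   \le \alpha
\end{align*}
```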

Example: Bonferroni
• P-values (sorted): $H_{0(1)}$: 0.005, $H_{0(2)}$: 0.011, $H_{0(3)}$: 0.02, $H_{0(4)}$: 0.04, $H_{0(5)}$: 0.13
• M = 5 tests; significance level: 0.05
• Corrected p-value: 0.005*5 = 0.025 < 0.05: reject $H_{0(1)}$
• Corrected p-value: 0.011*5 = 0.055: don't reject $H_{0(2)}$
• Corrected p-value: 0.02*5 = 0.1: don't reject $H_{0(3)}$
• Corrected p-value: 0.04*5 = 0.2: don't reject $H_{0(4)}$
• Corrected p-value: 0.13*5 = 0.65: don't reject $H_{0(5)}$
• Conclusion: reject $H_{0(1)}$; don't reject $H_{0(2)}$, $H_{0(3)}$, $H_{0(4)}$, $H_{0(5)}$
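
The Bonferroni example can be reproduced in base R. This sketch uses `p.adjust` from the standard `stats` package rather than the multtest function named later in the slides; the variable names are chosen for illustration.

```r
p     <- c(0.005, 0.011, 0.02, 0.04, 0.13)   # p-values from the example
alpha <- 0.05

# By hand: multiply every p-value by the number of tests, capped at 1
p_bonf_manual <- pmin(1, length(p) * p)

# Same correction via base R
p_bonf <- p.adjust(p, method = "bonferroni")

# Reject only where the corrected p-value is below the significance level
reject_bonf <- p_bonf < alpha
print(data.frame(p, p_bonf, reject_bonf))
```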

Improving Bonferroni: Holm-Bonferroni Method
• "Corrects" p-values; only count a test as significant if the corrected p-value is less than the significance level
• Sort all M p-values in increasing order: $P_{(1)}, \dots, P_{(M)}$
  - $H_{0(i)}$ denotes the null hypothesis for p-value $P_{(i)}$
• Multiply $P_{(1)}$ by M, $P_{(2)}$ by M-1, etc.
• If the corrected $P_{(i)}$ is smaller than the cutoff 0.05, reject $H_{0(i)}$ and carry on
  - If at some point $H_{0(j)}$ cannot be rejected, stop and don't reject $H_{0(j)}, H_{0(j+1)}, \dots, H_{0(M)}$
• The FWER of this procedure is at most $\alpha$
• Method "Holm" never has worse power than "Bonferroni" and is often better; still conservative

Example: Holm-Bonferroni
• P-values: $H_{0(1)}$: 0.005, $H_{0(2)}$: 0.011, $H_{0(3)}$: 0.02, $H_{0(4)}$: 0.04, $H_{0(5)}$: 0.13
• M = 5 tests; significance level: 0.05
• Corrected p-value: 0.005*5 = 0.025 < 0.05: reject $H_{0(1)}$
• Corrected p-value: 0.011*4 = 0.044 < 0.05: reject $H_{0(2)}$
• Corrected p-value: 0.02*3 = 0.06: don't reject $H_{0(3)}$ and stop
• Conclusion: reject $H_{0(1)}$ and $H_{0(2)}$; don't reject $H_{0(3)}$, $H_{0(4)}$, $H_{0(5)}$
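
The step-down procedure can be written out in a few lines of base R. In this sketch the running maximum (`cummax`) plays the role of the "stop" rule: once one corrected p-value exceeds the cutoff, all later ones do as well. The result is compared with `p.adjust(method = "holm")` from the `stats` package; variable names are illustrative.

```r
p     <- c(0.005, 0.011, 0.02, 0.04, 0.13)   # p-values from the example
alpha <- 0.05
m     <- length(p)

ord    <- order(p)                           # positions of the sorted p-values
scaled <- (m - seq_len(m) + 1) * p[ord]      # multiply by M, M-1, ..., 1

# Running maximum keeps the corrected p-values monotone, capped at 1
p_holm_sorted <- pmin(1, cummax(scaled))

# Put the corrected p-values back into the original order
p_holm <- numeric(m)
p_holm[ord] <- p_holm_sorted

reject_holm <- p_holm < alpha
print(data.frame(p, p_holm, reject_holm))
```

For p-values that are not already sorted, the `ord` bookkeeping returns each corrected value to the position of its original test.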

False Discovery Rate (FDR)
• Controlling the FWER is extremely conservative
  - We might be willing to accept A FEW false positives
• FDR = fraction of "false significant results" among the significant results you found
• $\mathrm{FDR} = V / R$ (more precisely, the expected value of this ratio, with $V/R$ taken as 0 when $R = 0$)

                 Declared non-sign.   Declared sign.   Total
  True $H_0$     U                    V                $M_0$
  False $H_0$    T                    S                $M_1$
  Total          M - R                R                M

• FDR = 0.1 is often acceptable for screening

Controlling FDR: Benjamini-Hochberg
• "Corrects" p-values; only count a test as significant if the corrected p-value is less than the significance level
• The method is a bit more involved; it is sequential, like Holm-Bonferroni
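
The slides do not spell out the Benjamini-Hochberg rule. In its standard form, with the p-values sorted in increasing order, one finds the largest k such that $P_{(k)} \le k \cdot q / M$ and rejects $H_{0(1)}, \dots, H_{0(k)}$, where q is the target FDR. The sketch below applies this in base R to the five p-values from the earlier examples, assuming q = 0.1 (the screening level mentioned above), and compares the result with `p.adjust(method = "BH")`. Variable names are illustrative.

```r
p <- c(0.005, 0.011, 0.02, 0.04, 0.13)   # same p-values as in the FWER examples
q <- 0.10                                # target FDR (assumed here)
m <- length(p)

ord      <- order(p)
p_sorted <- p[ord]

# Step-up rule: largest k with P_(k) <= k * q / M
passes <- which(p_sorted <= seq_len(m) * q / m)
k      <- if (length(passes) > 0) max(passes) else 0

# Reject the k smallest p-values
reject_bh <- logical(m)
reject_bh[ord[seq_len(k)]] <- TRUE

# Equivalent route: BH-corrected p-values compared with q
p_bh <- p.adjust(p, method = "BH")
print(data.frame(p, p_bh, reject_bh))
```

With these inputs Benjamini-Hochberg at q = 0.1 rejects four hypotheses, whereas the FWER methods at level 0.05 rejected one (Bonferroni) and two (Holm-Bonferroni). The comparison mixes two error criteria and two levels, so it illustrates the trade-off rather than a like-for-like ranking.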

Correcting for Multiple Testing in R
• Function "mt.rawp2adjp" in package "multtest" from Bioconductor
• Use option "proc":
  - Bonferroni: "Bonferroni"
  - Holm-Bonferroni: "Holm"
  - Benjamini-Hochberg: "BH"
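
A usage sketch follows. It assumes, as recalled from the multtest documentation, that `mt.rawp2adjp` returns a list with a matrix `adjp` (rows sorted by raw p-value) and a vector `index` that maps the rows back to the input order; please check this against the package manual. If multtest is not installed, the sketch falls back to base R's `p.adjust`, which is a substitute and not the function named on the slide.

```r
rawp <- c(0.04, 0.005, 0.13, 0.011, 0.02)   # illustrative raw p-values, unsorted

if (requireNamespace("multtest", quietly = TRUE)) {
  # Assumed return structure: list(adjp = <matrix>, index = <row order>)
  res  <- multtest::mt.rawp2adjp(rawp, proc = c("Bonferroni", "Holm", "BH"))
  adjp <- res$adjp[order(res$index), ]      # restore the original order
} else {
  # Fallback with base R (stats::p.adjust)
  adjp <- cbind(rawp       = rawp,
                Bonferroni = p.adjust(rawp, method = "bonferroni"),
                Holm       = p.adjust(rawp, method = "holm"),
                BH         = p.adjust(rawp, method = "BH"))
}
print(adjp)
```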

When to correct for multiple testing?
Options, ordered from many hits / many false positives to few hits / few false positives:
• Don't correct: exploratory analysis; when generating hypotheses
  - Report the number of tests you do (e.g.: "We investigated 40 features, but only report on 10; 7 of those show a significant difference.")
• Control FDR (typically FDR < 10%): exploratory analysis; screening
  - Select some features for further, more expensive investigation
  - Balance between high power and a low number of false positives
• Control FWER (typically FWER < 5%): confirmatory analysis
  - Use if you really don't want any false positives

Case study: Detecting Leukemia types
• 38 tumor mRNA samples from one patient each:
  - 27 acute lymphoblastic leukemia (ALL) cases (code 0)
  - 11 acute myeloid leukemia (AML) cases (code 1)
• Expression of 3051 genes for each sample
• Which genes are associated with the different tumor types?
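
The workflow for such a study can be sketched on simulated data of the same shape (3051 genes, 27 + 11 samples). Everything below is a stand-in: the expression values are random numbers, the 100 "truly different" genes and the shift of 1.5 are invented for illustration, and a Welch t-test per gene is only one possible choice of test. The real data described above appear to match the `golub` dataset shipped with multtest, which could be substituted for the simulated matrix.

```r
set.seed(2012)                                   # arbitrary seed
n_genes <- 3051
labels  <- rep(c(0, 1), times = c(27, 11))       # 0 = ALL, 1 = AML

# Simulated expression matrix: genes in rows, samples in columns
expr <- matrix(rnorm(n_genes * length(labels)), nrow = n_genes)

# Hypothetical signal: the first 100 genes are shifted in the AML group
expr[1:100, labels == 1] <- expr[1:100, labels == 1] + 1.5

# One two-sample t-test per gene
rawp <- apply(expr, 1, function(x) {
  t.test(x[labels == 0], x[labels == 1])$p.value
})

# Number of genes declared significant at 0.05 under each correction
hits <- c(
  uncorrected = sum(rawp < 0.05),
  bonferroni  = sum(p.adjust(rawp, method = "bonferroni") < 0.05),
  holm        = sum(p.adjust(rawp, method = "holm") < 0.05),
  BH          = sum(p.adjust(rawp, method = "BH") < 0.05)
)
print(hits)
```

Whatever the data, the counts are ordered: Bonferroni never declares more genes significant than Holm-Bonferroni, which never declares more than Benjamini-Hochberg, which never declares more than the uncorrected tests.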

Concepts to know
• When to control FWER, FDR
• Bonferroni, Holm-Bonferroni, Benjamini-Hochberg

R functions to know
• "mt.rawp2adjp" in Bioconductor package "multtest"

Online Resources
• http://www.bioconductor.org/packages/release/bioc/html/multtest.html
• There: section "Documentation"
  - "multtest.pdf": practical introduction to the multtest package
  - "MTP.pdf": theoretical introduction to multiple testing
