Testing proportions BIO5312 FALL2017 STEPHANIE J. SPIELMAN, PHD

Estimation
An estimator is a statistic (~formula) for estimating a parameter
A good estimator is unbiased
◦ The expected value (expectation) of the estimator should equal the parameter being estimated
◦ Mean of the sampling distribution of the statistic should equal the parameter being estimated
A good estimator is consistent
◦ Increasing the sample size produces an estimate with smaller SE
A good estimator is efficient
◦ Has the smallest SE among any estimator you could have chosen

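We are usually interested in point estimate, SE, and CI
Normally-distributed variable
◦ $\hat{\mu} = \bar{y}$
◦ $\hat{\sigma}^2 = s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$
◦ Known σ
  ◦ SE = $\sigma / \sqrt{n}$
  ◦ 95% CI = $\bar{y} \pm z_{0.025}\,\mathrm{SE}$
◦ Unknown σ
  ◦ SE = $s / \sqrt{n}$
  ◦ 95% CI = $\bar{y} \pm t_{0.025}\,\mathrm{SE}$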
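The point estimate, SE, and 95% CI for a normally distributed variable can be sketched in R as follows. The sample `y` and the "known" `sigma` are made-up values for illustration only; they are not from the lecture.

```r
# Hypothetical sample, for illustration only
y <- c(4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.9)
n <- length(y)

y_bar <- mean(y)  # point estimate of the population mean
s     <- sd(y)    # sample standard deviation (divides by n - 1)

# Unknown sigma: SE uses s, critical value comes from t with n - 1 df
se_t <- s / sqrt(n)
ci_t <- y_bar + c(-1, 1) * qt(0.975, df = n - 1) * se_t

# Known sigma (assumed, hypothetical value): critical value comes from
# the standard normal distribution
sigma <- 0.7
se_z  <- sigma / sqrt(n)
ci_z  <- y_bar + c(-1, 1) * qnorm(0.975) * se_z
```

The t-based interval computed this way should agree with the interval reported by `t.test(y)`.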

Hypothesis testing frameworks
t-tests compare means for continuous quantitative data
Today we will learn to analyze discrete count data ("proportions"):
◦ Binomial test
◦ χ² goodness-of-fit
◦ Contingency table analysis
  ◦ χ² association/homogeneity and Fisher exact test

Binomial test
$P(k \text{ successes}) = \binom{n}{k} p^k (1-p)^{n-k} = \binom{n}{k} p^k q^{n-k}$
◦ Binomial coefficient: $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$
◦ p is the null proportion of successes to test against
Hypothesis test:
◦ H₀: The relative frequency of success in the underlying population is p₀
◦ H_A: The relative frequency of success in the underlying population is not p₀
◦ H_A: The relative frequency of success in the underlying population is > / < p₀
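As a quick check of the binomial formula, it can be written out by hand and compared with R's built-in `dbinom()`. The helper name `binom_pmf` is hypothetical, chosen only for this sketch.

```r
# Binomial PMF written directly from the formula:
# choose(n, k) * p^k * (1 - p)^(n - k)
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

k <- 5
n <- 12
p <- 0.3

by_hand  <- binom_pmf(k, n, p)              # formula written out
built_in <- dbinom(k, size = n, prob = p)   # R's built-in PMF
```

Both values should agree up to floating-point error.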

Binomial test assumption: BInS conditions are satisfied
Binary outcomes
Independent trials (outcomes do not influence each other)
n is fixed before the trials begin
Same probability of success, p, for all trials

Binomial test: Example In a certain species of wasp, each wasp has a 30% chance of being male. I collect 12 wasps, of which 5 are male. Does my sample show evidence that 30% of wasps are male? Use α=0.05. In other words, is the observed success proportion 5/12 (41.67%) consistent with a population whose probability of success is 0.3?

Verifying assumptions
Binary outcomes: Male or female
Independent trials: Wasp sex does not influence sex of other wasps
n is fixed before the trials begin: I collect 12 wasps
Same probability of success, p, for all trials: P(male) = 0.3 for every wasp

Performing the binomial test
My sample:
◦ p̂ = 5/12 = 0.417
◦ n = 12
◦ X = 5 (by convention, we generally say X instead of k when performing hypothesis tests)
H₀: The probability of being a male wasp is p₀ = 0.3
H_A: The probability of being a male wasp differs from p₀ = 0.3

The PMF for wasp sex
The sampling distribution for the binomial test statistic is binomial: this is effectively our null.
[Figure: binomial PMF for n = 12, p = 0.3. x-axis: Number of males (successes), 0 to 12; y-axis: Probability mass. Bar labels for 0 to 12 males: 0.014, 0.071, 0.17, 0.24, 0.23, 0.16, 0.079, 0.029, 0.0078, 0.0015, 0.00019, 1.5e-05, 5.3e-07]
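The null distribution in the figure can be tabulated with `dbinom()`. This is a minimal sketch; the variable names are my own, and the plotting call is left commented out.

```r
n  <- 12    # number of wasps collected
p0 <- 0.3   # null probability of being male
k  <- 0:n   # every possible number of males

# Probability of each possible count under the null hypothesis
null_pmf <- dbinom(k, size = n, prob = p0)
names(null_pmf) <- k
round(null_pmf, 3)

# The single most likely count under the null
most_likely <- k[which.max(null_pmf)]

# Optional bar chart of the PMF:
# barplot(null_pmf, xlab = "Number of males (successes)",
#         ylab = "Probability mass")
```

Because the probabilities cover every possible outcome, they sum to 1.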

Performing the test
p₀ = 0.3, n = 12, X = 5
Recall, the P-value is the probability of obtaining a result as extreme or more extreme than the one observed, assuming the null hypothesis is true
◦ Therefore, P-value is P(number of successes >= 5)
$P(X \geq 5) = \binom{12}{5} 0.3^{5}\, 0.7^{(12-5)} + \binom{12}{6} 0.3^{6}\, 0.7^{(12-6)} + \cdots + \binom{12}{12} 0.3^{12}\, 0.7^{(12-12)}$
> 1 - pbinom(4, 12, 0.3)
[1] 0.2763445
[Figure: binomial PMF for n = 12, p = 0.3. x-axis: Number of males (successes), 0 to 12; y-axis: Probability mass]
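The upper-tail probability P(X >= 5) can be computed in a few equivalent ways in R. The sketch below uses only base R; the variable names are my own.

```r
n  <- 12
p0 <- 0.3
x  <- 5

# 1. Sum the PMF over every outcome at least as large as the observed one
p_sum <- sum(dbinom(x:n, size = n, prob = p0))

# 2. One minus the CDF at x - 1 (note x - 1, not x, because X is discrete)
p_cdf <- 1 - pbinom(x - 1, size = n, prob = p0)

# 3. Ask pbinom() for the upper tail directly: P(X > x - 1)
p_upper <- pbinom(x - 1, size = n, prob = p0, lower.tail = FALSE)
```

All three give the same one-sided P-value, about 0.276.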

Conclusions, round 1 Our P-value of 0.276 is much greater than α. Therefore we fail to reject the null hypothesis and we have no evidence that the population proportion of males corresponding to our sample differs from 0.3.

Notes on binomial tests
Computing two-sided P-values is non-trivial
◦ Binomial distribution symmetric only when p=0.5
> binom.test(5, 12, 0.3)
Exact binomial test
data: 5 and 12
number of successes = 5, number of trials = 12, p-value = 0.3614
alternative hypothesis: true probability of success is not equal to 0.3
95 percent confidence interval:
0.1516522 0.7233303
sample estimates:
probability of success
0.4166667
Note on the p-value: this is not 0.276*2!
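One common way to define the two-sided P-value for an asymmetric binomial, and the one that reproduces the `binom.test()` result here, is to add up the probabilities of all outcomes that are no more likely than the observed one. A minimal sketch (the small tolerance factor is an assumption on my part, used only to guard against floating-point ties):

```r
n  <- 12
p0 <- 0.3
x  <- 5

pmf   <- dbinom(0:n, size = n, prob = p0)  # null probability of each outcome
p_obs <- dbinom(x, size = n, prob = p0)    # null probability of the observed outcome

# Two-sided: total probability of outcomes as unlikely as, or less likely
# than, the observed count (here 0, 1, and 5 through 12 males)
p_two_sided <- sum(pmf[pmf <= p_obs * (1 + 1e-7)])

# One-sided upper tail, for comparison
p_one_sided <- sum(dbinom(x:n, size = n, prob = p0))
```

Here the two-sided value is smaller than twice the one-sided value, because the lower tail contributes less probability than the upper tail.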

Computing the binomial standard error
$\mathrm{SE}_{\hat{p}} = \sqrt{\hat{p}(1-\hat{p})/n} = \sqrt{\frac{0.417(1-0.417)}{12}} = 0.142$
What is this value?
1. The standard deviation of the sampling distribution of the probability of success
2. Quantifies the precision of p̂, our estimate of the population prob. of success
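To see why the SE is "the standard deviation of the sampling distribution", the sketch below computes the SD of X/n directly from the binomial PMF and compares it with the formula. It plugs the sample estimate 5/12 in for the unknown p, which is an assumption of the illustration.

```r
n <- 12
p <- 5 / 12   # plug-in estimate of the success probability

# SE from the formula sqrt(p(1 - p)/n)
se_formula <- sqrt(p * (1 - p) / n)

# SD of the sample proportion X/n, computed from the PMF of X ~ Binomial(n, p)
k            <- 0:n
probs        <- dbinom(k, size = n, prob = p)
p_hat_values <- k / n
mean_p_hat   <- sum(probs * p_hat_values)
sd_p_hat     <- sqrt(sum(probs * (p_hat_values - mean_p_hat)^2))
```

The two quantities agree, and the mean of the sampling distribution equals p, which is the unbiasedness property from the Estimation slide.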

Computing the binomial confidence interval
Classically, we use the Wald method
◦ Note: Only "precise" when p is not very large (>0.8) or small (<0.2)
$\hat{p} - z_{0.025} \cdot \mathrm{SE}_{\hat{p}} < p < \hat{p} + z_{0.025} \cdot \mathrm{SE}_{\hat{p}}$
◦ $\hat{p}$ is the estimated proportion of success, X/n = 0.417
◦ $z_{0.025}$ is 1.96
◦ $\mathrm{SE}_{\hat{p}} = \sqrt{\hat{p}(1-\hat{p})/n} = \sqrt{\frac{0.417(1-0.417)}{12}} = 0.142$
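The Wald interval for the wasp sample can be sketched in a few lines of base R. The variable names are my own.

```r
x <- 5    # observed number of males
n <- 12   # number of wasps

p_hat <- x / n                           # estimated proportion of success
se    <- sqrt(p_hat * (1 - p_hat) / n)   # binomial standard error
z     <- qnorm(0.975)                    # about 1.96 for a 95% CI

margin  <- z * se
wald_ci <- c(lower = p_hat - margin, upper = p_hat + margin)
```

With n = 12 the interval is wide, which reflects how little information a dozen binary observations carry.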

Calculating the binomial CI
$\hat{p} - z_{0.025} \cdot \mathrm{SE}_{\hat{p}} < p < \hat{p} + z_{0.025} \cdot \mathrm{SE}_{\hat{p}}$
0.417 - 0.278 < p < 0.417 + 0.278 → 0.417 ± 0.278
> binom.test(5, 12, 0.3)
Exact binomial test
data: 5 and 12
number of successes = 5, number of trials = 12, p-value = 0.3614
alternative hypothesis: true probability of success is not equal to 0.3
95 percent confidence interval:
0.1516522 0.7233303
sample estimates:
probability of success
0.4166667
R uses a more exact method, the Clopper-Pearson interval
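For reference, the Clopper-Pearson limits can be written as quantiles of beta distributions. The sketch below assumes 0 < x < n (the boundary cases need separate handling) and compares the result with the interval returned by `binom.test()`.

```r
x     <- 5
n     <- 12
alpha <- 0.05

# Clopper-Pearson limits expressed as beta quantiles (valid for 0 < x < n)
cp_lower <- qbeta(alpha / 2,     x,     n - x + 1)
cp_upper <- qbeta(1 - alpha / 2, x + 1, n - x)

# Interval reported by R's exact binomial test
r_ci <- as.numeric(binom.test(x, n, p = 0.3)$conf.int)
```

Unlike the Wald interval, this interval is not symmetric around the estimate 0.417.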

Final conclusions
Our two-sided P-value of 0.3614 is much greater than α. Therefore we fail to reject the null hypothesis and we have no evidence that the population proportion of males corresponding to our sample differs from 0.3. Our estimated proportion of success is 0.417 with SE = 0.142 and a 95% CI of 0.417 ± 0.278.

Pause: Binomial exercise

Use χ² goodness-of-fit test if we do not have binary outcomes
Goodness-of-fit test asks if observed proportions are equal to a null proportion
df = (number of categories) - 1 - (number of parameters estimated from data)
◦ The number of parameters estimated from data is 0 for the goodness-of-fit test

Example: Are babies born with the same frequency every day of the week?
Day in 1999    # births
Sunday         33
Monday         41
Tuesday        63
Wednesday      63
Thursday       47
Friday         56
Saturday       47
[Figure: bar chart of births by day of the week, Sun. through Sat.; y-axis: Frequency, 0 to 70]
H₀: The probability of birth was the same every day of the week in 1999.
H_A: The probability of birth was not the same every day of the week in 1999.

Test statistic
$\chi^2 = \sum_i \frac{(\mathrm{Observed}_i - \mathrm{Expected}_i)^2}{\mathrm{Expected}_i}$
Expected births = expected proportion × total observed births (350)
Day          # Observed births   # days in 1999   Expected prop      # Expected births
Sunday       33                  52               52/365 = 0.142     0.142*350 = 49.863
Monday       41                  52               0.142              49.863
Tuesday      63                  52               0.142              49.863
Wednesday    63                  52               0.142              49.863
Thursday     47                  52               0.142              49.863
Friday       56                  53               0.145              50.822
Saturday     47                  52               0.142              49.863
Total        350                 365              1                  350
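The expected counts in the table can be reproduced by scaling each day's share of 1999 by the total number of observed births. A minimal sketch, with variable names of my own choosing:

```r
births <- c(Sun = 33, Mon = 41, Tue = 63, Wed = 63, Thu = 47, Fri = 56, Sat = 47)
days   <- c(Sun = 52, Mon = 52, Tue = 52, Wed = 52, Thu = 52, Fri = 53, Sat = 52)

# Null proportions: each weekday's share of the 365 days in 1999
expected_prop <- days / sum(days)

# Expected counts: null proportion times the total number of births
expected_births <- expected_prop * sum(births)
round(expected_births, 3)
```

The expected counts add up to the same total as the observed counts, which is a useful sanity check before computing the statistic.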

Calculating the test statistic and df
$\chi^2 = \sum_i \frac{(\mathrm{Observed}_i - \mathrm{Expected}_i)^2}{\mathrm{Expected}_i}$
$= \frac{(33-49.863)^2}{49.863} + \frac{(41-49.863)^2}{49.863} + \frac{(63-49.863)^2}{49.863} + \frac{(63-49.863)^2}{49.863} + \frac{(47-49.863)^2}{49.863} + \frac{(56-50.822)^2}{50.822} + \frac{(47-49.863)^2}{49.863} = 15.05$
Our categorical variable is Days of week
It has seven categories
df = #categories - 1 = 7 - 1 = 6
Day          # Observed births   # Expected births
Sunday       33                  0.142*350 = 49.863
Monday       41                  49.863
Tuesday      63                  49.863
Wednesday    63                  49.863
Thursday     47                  49.863
Friday       56                  50.822
Saturday     47                  49.863
Total        350                 350
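The sum above can be checked numerically. This sketch recomputes the expected counts without rounding, so the result is slightly more precise than the hand calculation.

```r
observed <- c(33, 41, 63, 63, 47, 56, 47)   # births, Sunday through Saturday
days     <- c(52, 52, 52, 52, 52, 53, 52)   # number of each weekday in 1999

# Expected counts under the null of equal birth probability per day
expected <- days / sum(days) * sum(observed)

# Chi-squared statistic: sum of (observed - expected)^2 / expected
chi2 <- sum((observed - expected)^2 / expected)

# Degrees of freedom: categories minus 1 (no parameters estimated from data)
df <- length(observed) - 1
```

The statistic comes out at about 15.06 with 6 degrees of freedom.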

Reports and conclusions
> 1 - pchisq(15.05, 6)
[1] 0.01987137
[Figure: χ² density with 6 degrees of freedom. x-axis: χ²₆, 0 to 20; y-axis: Probability density; observed statistic χ² = 15.05 marked]
With P = 0.0199, we reject the null hypothesis that births are equally distributed across days in 1999. We have evidence that frequency of births differs across days.
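The same decision can be reached either from the P-value or from the critical value of the χ² distribution. A short sketch, assuming α = 0.05 as in the binomial example:

```r
chi2  <- 15.05   # observed test statistic
df    <- 6
alpha <- 0.05    # assumed significance level

# Upper-tail P-value (equivalent to 1 - pchisq(chi2, df))
p_value <- pchisq(chi2, df, lower.tail = FALSE)

# Critical value: statistics larger than this are rejected at level alpha
critical <- qchisq(1 - alpha, df)

reject <- chi2 > critical   # same decision as p_value < alpha
```

Using `lower.tail = FALSE` avoids the subtraction from 1, which matters numerically only when the P-value is extremely small.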

Notes on χ² goodness-of-fit test
Assumptions for all χ² tests
◦ Randomly sampled data from population
◦ Two or more categories of a categorical variable (data are counts)
◦ Expected frequencies must be >= 1
◦ No more than 20% of expected frequencies are < 5
The P-value uses only the upper tail: the probability of a statistic >= the observed test statistic
◦ General to all χ² tests
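The two conditions on expected frequencies are easy to check before running the test. A minimal sketch using the births example (variable names are my own):

```r
observed <- c(33, 41, 63, 63, 47, 56, 47)
days     <- c(52, 52, 52, 52, 52, 53, 52)
expected <- days / sum(days) * sum(observed)

# Condition 1: every expected frequency is at least 1
all_at_least_1 <- all(expected >= 1)

# Condition 2: no more than 20% of expected frequencies fall below 5
share_below_5 <- mean(expected < 5)

assumptions_ok <- all_at_least_1 && share_below_5 <= 0.20
```

For the births data every expected count is close to 50, so both conditions hold comfortably.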

χ² goodness-of-fit in R
#### Prepare data: Observed counts and expected proportions ####
> births <- c(33,41,63,63,47,56,47)
> expected <- c(52,52,52,52,52,53,52)
> expected <- expected/sum(expected)
> expected
[1] 0.1424658 0.1424658 0.1424658 0.1424658 0.1424658 0.1452055 0.1424658
>
> chisq.test(births, p = expected)
Chi-squared test for given probabilities
data: births
X-squared = 15.057, df = 6, p-value = 0.01982
