Sample Size
Vorasith Sornsrivichai, MD., FETP
Epidemiology Unit, Faculty of Medicine
Prince of Songkla University
“All nature is but art, unknown to thee; All chance, direction, which thou canst not see; All discord, harmony not understood; All partial evil, universal good; And spite of pride, in erring reason's spite, One truth is clear, Whatever is, is right” ~ Alexander Pope ~
How Much Is Enough?
• “Is a sample size of 30 subjects enough?”
• “If I sample 10% of the population, will it be OK?”
• “Can I just use all 24 patients I have?”
Objectives
• To learn how to calculate the sample size needed to obtain a specified precision for an estimate of a parameter
• To learn how to calculate the sample size needed to provide a specified power for a comparative study
Outline of Presentation
• Review of basic principles
• Determination of sample size
• Sample size calculation
Source: http://trochim.human.cornell.edu/kb/random.htm
Two Types of Study Objective
• Estimation: approximation of some parameter (a magnitude, difference, or ratio)
  - The critical feature is the precision of the estimate.
  - e.g. “A public health officer seeks to estimate the proportion of children in the district receiving vaccinations.”
• Hypothesis testing: examination of a proposed assumption
  - The critical feature is the power of the study.
  - e.g. “Is drug B more effective than drug A?”
Determinants of the Sample Size
• Effect size
• Level of significance
• Power of the test
• Variation of the outcome
Other Determinants of the Sample Size
• Research questions and objective of the study
• Definition of the population and the population size
• Type of outcome, e.g. dichotomous, continuous
• Outcome measurement, e.g. single, repeated measurements
• Sampling technique, e.g. cluster sampling
• Type of statistical methods
• Type of analysis, e.g. subgroup analysis
• Non-response or loss to follow-up
Effect Size
• Expressed as a relative risk (RR), odds ratio (OR), risk difference (RD), etc.
• The larger the effect size, the smaller the sample size needed
“To err is human” (to forgive, divine) ~ Alexander Pope ~
Errors

                        Truth
Study result            H0 is not true         H0 is true
Reject H0               Power (1 − β)          Type I error (α)
Fail to reject H0       Type II error (β)      Confidence (1 − α)
Significance
• α, or Type I error: false detection of a difference/association by “chance”
• Statistical significance vs. epidemiological and clinical significance
Power of the Test
• Power (1 − β) is the probability of rejecting H0 when H0 is not true
[Figure: axes labelled Ha, H0 and Study number]
Power = 9/10 × 100 = 90%
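The idea of power as the share of repeated studies that reject H0 when H0 is false can be illustrated with a small simulation. This is only a sketch under stated assumptions: a two-sided one-sample z-test of H0: mean = 0 with known SD, and the function name `simulated_power` and all numeric inputs are hypothetical, not taken from the slides.

```python
import random
import statistics
from statistics import NormalDist


def simulated_power(true_mean, sd, n, alpha=0.05, num_studies=2000, seed=1):
    """Fraction of simulated studies that reject H0: mean = 0 (two-sided z-test, known SD)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    se = sd / n ** 0.5                            # standard error of the sample mean
    rejections = 0
    for _ in range(num_studies):
        sample_mean = statistics.fmean(rng.gauss(true_mean, sd) for _ in range(n))
        if abs(sample_mean / se) > z_crit:
            rejections += 1
    return rejections / num_studies


# Hypothetical inputs: true effect 0.5 SD units, n = 42 subjects per study.
print(simulated_power(true_mean=0.5, sd=1.0, n=42))
# With no true effect, the rejection rate approximates alpha (Type I error).
print(simulated_power(true_mean=0.0, sd=1.0, n=42))
```

With these hypothetical inputs the first rate should land near 90% and the second near 5%, mirroring the "9 out of 10 studies" reading of power.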
“Knowledge is an unending adventure at the edge of uncertainty.” ~ Jacob Bronowski ~
Uncertainty…
• Variability in the population: not all samples would give exactly the same finding, i.e., there is uncertainty in making an inference
• However, the uncertainty can usually be quantified
• Uncertainty can be reduced by using a sufficiently large sample
[Figure panels: Population; Sample n = 2; n = 5; n = 20]
Central Limit Theorem
• If samples are drawn from a non-normally distributed parent population, the distribution of the sample means approaches the normal distribution as the sample size increases.
[Figure panels: Population; Sample n = 2; n = 5; n = 20]
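The behaviour described by the theorem can be checked with a short simulation. The sketch below assumes a skewed exponential parent population with mean 1 (an arbitrary choice, not from the slides) and uses the same sample sizes as the figure; `sample_means` is a hypothetical helper name.

```python
import random
import statistics


def sample_means(n, num_samples=5000, seed=0):
    """Means of num_samples samples of size n drawn from an exponential population (mean 1)."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


for n in (2, 5, 20):
    means = sample_means(n)
    # The centre stays near the population mean while the spread shrinks as n grows.
    print(n, round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```

Plotting a histogram of each list (not done here) would show the shape moving from skewed towards a symmetric bell as n goes from 2 to 20.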
Sampling Distributions
As the sample size increases:
• the sample means tend to be distributed normally
• the width of the distribution decreases
As the number of samples increases:
• the mean of the distribution of sample means tends to the mean of the population
The above is also true for sample estimates of a population proportion
• as long as the proportion is not too close to 0 or 1
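Standard Normal Distribution
[Figure: normal curve centred on X, with the horizontal axis marked at X ± 1 SE, X ± 2 SE and X ± 3 SE and at z = ±1, ±1.96 and ±2.58]
• Area within X ± 1 SE: 0.6826
• Area within X ± 2 SE: 0.9545
• Area within X ± 3 SE: 0.9973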
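The areas under the standard normal curve, and the Z multipliers used in the later formulas, can be computed with Python's standard library. A minimal sketch; `area_within` is a hypothetical helper name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1


def area_within(z):
    """Probability that a standard normal value lies between -z and +z."""
    return std_normal.cdf(z) - std_normal.cdf(-z)


for z in (1, 1.96, 2, 2.576, 3):
    print(f"within ±{z} SE: {area_within(z):.4f}")

# Z multiplier for a two-sided 95% confidence level (alpha = 0.05).
z_95 = std_normal.inv_cdf(1 - 0.05 / 2)
print(round(z_95, 2))
```

The same `inv_cdf` call gives the multiplier for any other confidence level, for example `inv_cdf(1 - 0.01 / 2)` for 99%.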
Estimation
• Big n: narrow SE
• Small n: wide SE
[Figure: distribution of estimates of the mean from many samples]
Estimation (large sample)
• Range of population values of X bar compatible with our study value
• d = precision
[Figure: sampling distributions from populations with various values of X bar, with the study value of X bar and the distances d marked]
Estimation (small sample)
• Range of population values of X bar compatible with our study value
• d = precision
[Figure: sampling distributions from populations with various values of X bar, with the study value of X bar and the distances d marked]
Estimate of mean height (α = 0.05, d = 3 cm)
• Population 1: μ1 ≈ 150 cm, σ1 ≈ 5 cm, required n1 = 12
• Population 2: μ2 ≈ 150 cm, σ2 ≈ 10 cm, required n2 = 45
[Figure: distribution of means of hypothetical samples x1 and x2, with d marked on either side of X]
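The sample sizes n1 = 12 and n2 = 45 can be reproduced with the formula for estimating a mean that appears later in this deck, n = Z²σ²/d². The sketch below is an illustration only: the slide does not state which Z was used, and the match with Z rounded to 2 (rather than 1.96) is an inference. `n_for_mean` is a hypothetical helper name.

```python
import math


def n_for_mean(sd, d, z):
    """Smallest whole n with z * sd / sqrt(n) <= d, i.e. n = (z * sd / d)^2 rounded up."""
    return math.ceil((z * sd / d) ** 2)


for sd in (5, 10):
    # Compare the exact 95% multiplier (1.96) with the rounded value (2).
    print(sd, n_for_mean(sd, d=3, z=1.96), n_for_mean(sd, d=3, z=2))
```

Doubling the standard deviation roughly quadruples the required sample size, which is the point of the comparison between the two populations.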
Population (N = ∞) with mean μ and standard deviation σ
• Sample A: n = 100, SD_A ≈ σ, estimate X_A, SE_A = SD_A / √100 (uncertainty in the measure from sample A)
• Sample B: n = 25, SD_B ≈ σ, estimate X_B, SE_B = SD_B / √25 (uncertainty in the measure from sample B)
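The comparison of the two samples comes down to SE = SD/√n. A minimal sketch, assuming a hypothetical SD of 10 for both samples (the slide gives no numeric SD):

```python
import math


def standard_error(sd, n):
    """Standard error of a sample mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)


sd = 10.0  # hypothetical value, used for both samples
se_a = standard_error(sd, 100)  # sample A, n = 100
se_b = standard_error(sd, 25)   # sample B, n = 25
print(se_a, se_b)
```

Quadrupling the sample size from 25 to 100 halves the standard error, so the estimate from sample A carries half the uncertainty of the estimate from sample B.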
Sample Size Calculation
Sample Size Calculation
• Available tables
• Nomogram
• Manual calculation
• Software: EpiInfo, STATA, R, OpenEpi
Available Table
• e.g. sample size to estimate P within d absolute percentage points with 99% confidence
Nomogram
Source: http://ccforum.com/content/6/4/335
OpenEpi Open Source Epidemiologic Statistics for Public Health http://www.openepi.com
Considerations
• The appropriate sample size may not be the same for all objectives in a study.
  - Therefore, calculate the sample size for each objective, then decide
• All sample size calculations considered here, and in most computer programs, assume simple random sampling
  - Other sampling methods, e.g. cluster sampling, may require adjustments
Considerations
• The calculated sample size is the minimum sample needed
• Add more (~10–30%) for non-response and loss to follow-up
  - E.g. suppose 10% of subjects in the study are expected to refuse to participate or to drop out before the study ends.
  - A total of n/(1 − 0.1) eligible subjects would then have to be approached in the first instance
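The adjustment for non-response is a single division, n/(1 − rate), rounded up. A minimal sketch; `inflate_for_nonresponse` is a hypothetical helper name, and the starting value of 278 is borrowed from Example 1 in this deck purely for illustration.

```python
import math


def inflate_for_nonresponse(n, rate):
    """Number of eligible subjects to approach so that about n remain after dropout."""
    if not 0 <= rate < 1:
        raise ValueError("rate must be in [0, 1)")
    return math.ceil(n / (1 - rate))


# 10% expected refusal or dropout on a calculated minimum of 278 subjects.
print(inflate_for_nonresponse(278, 0.10))
```

Dividing by (1 − rate) is slightly more generous than simply adding the same percentage to n, because the dropout applies to the enlarged group that is approached.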
Inappropriate Sample Size
• Too SMALL
  - wide CI
  - unable to detect a real effect
  - may miss an important association
• Too BIG
  - waste of resources (effort, time, money)
  - even very small effects become statistically significant
  - may be unethical
“Although our intellect always longs for clarity and certainty, our nature often finds uncertainty fascinating.” ~ Karl von Clausewitz ~
Sample Size Calculation
• One sample
  - Estimating: proportion, mean
  - Hypothesis testing: proportion, mean
• Two samples
  - Estimating: difference between two proportions, two means
  - Hypothesis testing: difference between two proportions, two means
Sample size calculations for estimation are based on:
$d = Z_{1-\alpha/2} \times SE$
[Figure: sampling distribution with the precision d and the tail area α/2 marked]
In each case, we just put in the appropriate expression for the standard error, e.g. $SD/\sqrt{n}$
Estimating a Population Mean
$d = Z_{1-\alpha/2} \times SE$
$SE = \sigma / \sqrt{n}$
$\therefore d = Z_{1-\alpha/2} \, \sigma / \sqrt{n}$
$\therefore n = Z_{1-\alpha/2}^{2} \, \sigma^{2} / d^{2}$
Example 1 (Estimating a population mean)
• An estimate is desired of the average retail price of 20 tablets of a tranquilizer. The estimate is required to be within 10% of the true average price with 95% confidence. The SD of the price was estimated as 85% of the average price. How many pharmacies should be randomly selected?
$n = Z_{1-\alpha/2}^{2} \, \sigma^{2} / d^{2}$
• n = (1.96)² (0.85)² / (0.1)² ≈ 278
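Example 1 can be checked numerically. The sketch below applies n = Z²σ²/d² with both σ and d expressed as fractions of the mean price (0.85 and 0.10); `n_to_estimate_mean` is a hypothetical helper name, and the Z multiplier is computed from the confidence level instead of being hard-coded as 1.96.

```python
import math
from statistics import NormalDist


def n_to_estimate_mean(sd, d, confidence=0.95):
    """Sample size so that a mean is estimated within +/- d at the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sd / d) ** 2)


# SD = 85% of the mean, precision = 10% of the mean, 95% confidence.
print(n_to_estimate_mean(sd=0.85, d=0.10))
```

The result is rounded up because a fractional pharmacy cannot be sampled and rounding down would miss the stated precision.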
Estimating a Population Proportion
$d = Z_{1-\alpha/2} \times SE$
$SE = \sqrt{p(1-p)/n}$
$\therefore n = Z_{1-\alpha/2}^{2} \, p(1-p) / d^{2}$
Example 2 (Estimating a population proportion)
• A district public health officer seeks to estimate the proportion of children in the district receiving appropriate childhood vaccinations. How many children must be studied if the resulting estimate is to fall within 10 percentage points of the true proportion with 95% confidence?
$n = Z_{1-\alpha/2}^{2} \, p(1-p) / d^{2}$
• Taking p = 0.5, so that p(1 − p) = 0.25: n = (1.96)² (0.25) / (0.1)² = 96.04, i.e. at least 97 children
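Example 2 can be reproduced in the same way. In this sketch `n_to_estimate_proportion` is a hypothetical helper name, p = 0.5 is the conservative choice used when the true proportion is unknown, and the second call with p = 0.1 is an added illustration that is not part of the slide.

```python
import math
from statistics import NormalDist


def n_to_estimate_proportion(p, d, confidence=0.95):
    """Sample size so that a proportion is estimated within +/- d (absolute) at the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


print(n_to_estimate_proportion(p=0.5, d=0.10))  # conservative choice of p
print(n_to_estimate_proportion(p=0.1, d=0.10))  # hypothetical smaller p needs fewer subjects
```

Because p(1 − p) peaks at p = 0.5, the first call gives the largest sample size that this precision and confidence level can require.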
Parameter Estimation
• The required sample size is largest when P = 0.5
• When one has no idea what the level of P is in the population, choosing 0.5 for P will always provide enough observations.

P      P(1 − P)
0.5    0.25
0.4    0.24
0.3    0.21
0.2    0.16
0.1    0.09
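Hypothesis Testing
• Δ = mean of B − mean of A is the minimum effect worth detecting
[Figure: sampling distributions under H0 and Ha, a distance Δ apart, with the critical value * and the areas for α and β marked]
$* = Z_{1-\alpha/2} \times SE_0$
$* = \Delta - (Z_{1-\beta} \times SE_1)$
$\therefore \Delta = (Z_{1-\alpha/2} \times SE_0) + (Z_{1-\beta} \times SE_1)$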
Basic Equations Underlying Sample Size
(1) $d = Z_{1-\alpha/2} \times SE$
(2a) $\Delta = Z_{1-\alpha/2} \times SE_0 + Z_{1-\beta} \times SE_1$
If $SE_0 = SE_1 = SE$, then
(2b) $\Delta = SE \, (Z_{1-\alpha/2} + Z_{1-\beta})$
Most sample size calculations for estimation and hypothesis testing are based on these equations.
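Equation 2b can be turned into a sample size once an expression for SE is substituted. The sketch below assumes the simplest case, a one-sample comparison of a mean with SE = σ/√n, which gives n = (Z for significance + Z for power)² σ²/Δ². That substitution, the function name `n_to_detect_mean_shift`, and the numeric inputs are illustrative assumptions, not taken from the slides.

```python
import math
from statistics import NormalDist


def n_to_detect_mean_shift(delta, sd, alpha=0.05, power=0.90):
    """Sample size from equation 2b with SE = sd / sqrt(n) (one-sample mean, two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # multiplier for the significance level
    z_beta = NormalDist().inv_cdf(power)           # multiplier for the power (1 - beta)
    return math.ceil(((z_alpha + z_beta) * sd / delta) ** 2)


# Hypothetical inputs: detect a shift of half a standard deviation.
print(n_to_detect_mean_shift(delta=0.5, sd=1.0, power=0.90))
print(n_to_detect_mean_shift(delta=0.5, sd=1.0, power=0.80))
```

Two-sample designs use a different SE expression, so the formula inside the function would change while equation 2b itself stays the same.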