Sample Size Vorasith Sornsrivichai, MD., FETP Epidemiology Unit, - PowerPoint PPT Presentation


  1. Sample Size Vorasith Sornsrivichai, MD., FETP Epidemiology Unit, Faculty of Medicine Prince of Songkla University

  2. “All nature is but art, unknown to thee; All chance, direction, which thou canst not see; All discord, harmony not understood; All partial evil, universal good; And spite of pride, in erring reason's spite, One truth is clear, Whatever is, is right” ~ Alexander Pope ~

  3. How Much Is Enough?
     - “Is a sample size of 30 subjects enough?”
     - “If I sample 10% of the population, will it be OK?”
     - “Can I just use all 24 patients I have?”

  4. Objectives
     - To learn how to calculate the sample size needed to obtain a specified precision for an estimate of a parameter
     - To learn how to calculate the sample size needed to provide a specified power for a comparative study

  5. Outline of Presentation
     - Review of basic principles
     - Determination of sample size
     - Sample size calculation

  6. Source: http://trochim.human.cornell.edu/kb/random.htm

  7. Two Types Of Study Objective
     - Estimation: approximation of some parameter (magnitude, difference, or ratio)
       - The critical feature is the precision of the estimate.
       - e.g. “A public health officer seeks to estimate the proportion of children in the district receiving vaccinations.”
     - Hypothesis testing: examination of a proposed assumption
       - The critical feature is the power of the study.
       - e.g. “Is drug B more effective than drug A?”

  8. Determinants of The Sample Size
     - Effect size
     - Level of significance
     - Power of the test
     - Variation of the outcome

  9. Other Determinants of The Sample Size
     - Research questions and objective of the study
     - Definition of the population and the population size
     - Type of outcome, e.g. dichotomous, continuous
     - Outcome measurement, e.g. single or repeated measurements
     - Sampling technique, e.g. cluster sampling
     - Type of statistical methods
     - Type of analysis, e.g. subgroup analysis
     - Non-response or loss to follow-up

  10. Effect Size
      - Relative risk (RR), odds ratio (OR), risk difference (RD), etc.
      - The larger the effect size, the smaller the sample size needed

  11. “To err is human (to forgive, divine)” ~ Alexander Pope ~

  12. Errors

                              Truth
      Study results           H0 is not true        H0 is true
      Reject H0               Power (1 − β)         Type I error (α)
      Fail to reject H0       Type II error (β)     Confidence (1 − α)

  13. Significance
      - α, or Type I error: false detection of a difference/association by “chance”
      - Statistical significance vs. epidemiological and clinical significance

  14. Power of the test
      - Power (1 − β) is the probability of rejecting H0 when H0 is not true
      - [Figure: results plotted by study number against H0 and Ha] Power = 9/10 × 100 = 90%

  15. “Knowledge is an unending adventure at the edge of uncertainty.” ~ Jacob Bronowski ~

  16. Uncertainty…
      - Variability in the population: not all samples would give exactly the same finding, i.e., there is uncertainty in making an inference
      - However, the uncertainty can usually be quantified
      - Uncertainty can be reduced by using a sufficiently large sample

  17. [Figure: population distribution and distributions of sample means for n = 2, n = 5, n = 20]

  18. Central Limit Theorem
      - If samples are drawn from a non-normally distributed parent population, the frequency distribution of the population of sample means approaches the normal distribution as the sample size increases.
      - [Figure: population distribution and distributions of sample means for n = 2, n = 5, n = 20]
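The statement on this slide can be checked with a small simulation. The sketch below is illustrative only: the exponential parent population, the number of repetitions, and the helper names `skewness` and `sample_means` are assumptions, not part of the original slides. Skewness is used as a rough measure of departure from normality (0 for a symmetric distribution).

```python
import random
import statistics


def skewness(values):
    """Skewness of a list of numbers; close to 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def sample_means(n, n_samples=5000, seed=0):
    """Means of n_samples samples of size n drawn from a skewed (exponential) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(n_samples)
    ]


# The distribution of sample means becomes less skewed as n grows.
for n in (2, 5, 20):
    print(f"n = {n:2d}: skewness of sample means = {skewness(sample_means(n)):.2f}")
```

The skewness shrinks toward 0 as n goes from 2 to 20, which is the behaviour the three panels of the slide illustrate.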

  19. Sampling Distributions
      - As the sample size increases:
        - the sample means tend to be distributed normally
        - the width of the distribution decreases
      - As the number of samples increases:
        - the mean of the distribution of sample means tends to the mean of the population
      - The above is also true for sample estimates of a population proportion, as long as the proportion is not too close to 0 or 1
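How quickly the width decreases can be sketched numerically: the standard deviation of the sample means should be close to σ/√n. The uniform population on [0, 1], the repetition count, and the function name `sd_of_sample_means` below are assumptions chosen for illustration.

```python
import math
import random
import statistics

# Standard deviation of a Uniform(0, 1) population (assumed example population).
SIGMA = math.sqrt(1 / 12)


def sd_of_sample_means(n, n_samples=5000, seed=1):
    """Observed SD of the means of n_samples samples of size n from Uniform(0, 1)."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)


for n in (2, 5, 20):
    observed = sd_of_sample_means(n)
    expected = SIGMA / math.sqrt(n)
    print(f"n = {n:2d}: observed SE = {observed:.4f}, sigma/sqrt(n) = {expected:.4f}")
```

Quadrupling the sample size roughly halves the width, so precision improves with the square root of n rather than with n itself.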

  20. Standard Normal Distribution
      - [Figure: normal curve with areas 0.6826 within ±1 SE, 0.9545 within ±2 SE, and 0.9973 within ±3 SE; z-axis marked at 0, ±1, ±1.96, ±2.56; x-axis from X − 3SE to X + 3SE]
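The areas under the standard normal curve can be reproduced with Python's standard library. This is a minimal sketch; the function name `central_area` is an assumption, and the z values listed are common choices rather than a reading of the slide's axis.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def central_area(z):
    """Probability that a standard normal value falls between -z and +z."""
    return STANDARD_NORMAL.cdf(z) - STANDARD_NORMAL.cdf(-z)


for z in (1.0, 1.96, 2.0, 3.0):
    print(f"within +/-{z} SE: {central_area(z):.4f}")

# The reverse lookup gives the Z value for a chosen confidence level.
print(f"Z for 95% confidence: {STANDARD_NORMAL.inv_cdf(1 - 0.05 / 2):.3f}")
```

The reverse lookup with `inv_cdf` is the step used in sample size formulas, where a confidence level is converted into Z for 1 − α/2.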

  21. Estimation
      - Big n: narrow SE
      - Small n: wide SE
      - [Figure: distribution of estimates of the mean from many samples]

  22. Estimation (large sample)
      - [Figure: sampling distributions from populations with various values of X bar; the range of population values of X bar compatible with our study value of X bar extends d on either side of it, where d = precision]

  23. Estimation (small sample)
      - [Figure: sampling distributions from populations with various values of X bar; the range of population values of X bar compatible with our study value of X bar extends d on either side of it, where d = precision]

  24. Estimate of mean height (α = 0.05, d = 3 cm)
      - Population 1: μ1 ≈ 150 cm, σ1 ≈ 5 cm, n1 = 12
      - Population 2: μ2 ≈ 150 cm, σ2 ≈ 10 cm, n2 = 45
      - [Figure: distribution of means of hypothetical samples, with x1 and x2 and precision d on either side of X]

  25. Estimation from two samples (population N = ∞, mean μ, standard deviation σ)
      - Sample A: n = 100, $SD_A \approx \sigma$, $SE_A = SD_A / \sqrt{100}$
      - Sample B: n = 25, $SD_B \approx \sigma$, $SE_B = SD_B / \sqrt{25}$
      - SE is the uncertainty in the measure X bar obtained from each sample

  26. Sample Size Calculation

  27. Sample Size Calculation
      - Available tables
      - Nomogram
      - Manual calculation
      - Software: EpiInfo, STATA, R, OpenEpi

  28. Available Table
      - e.g. sample size to estimate P within d absolute percentage points with 99% confidence

  29. Nomogram
      - Source: http://ccforum.com/content/6/4/335

  30. OpenEpi Open Source Epidemiologic Statistics for Public Health http://www.openepi.com

  49. Considerations
      - The appropriate sample size may not be the same for all objectives in a study.
      - Therefore, calculate the sample size for every objective, then decide.
      - All sample size calculations considered here, and in most computer programs, assume simple random sampling.
      - Other sampling methods, e.g. cluster sampling, may require adjustments.

  50. Considerations
      - The calculated sample size is the minimum sample needed.
      - Add more (~10–30%) for non-response and loss to follow-up.
      - E.g. suppose 10% of subjects in the study are expected to refuse to participate or to drop out before the study ends. Then a total of n/(1 − 0.1) eligible subjects would have to be approached in the first instance.
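The adjustment n/(1 − 0.1) from this slide can be written as a small helper. This is a sketch only: the function name `number_to_approach` and the input of 278 subjects are illustrative assumptions, and rounding up to a whole subject is a choice made here.

```python
import math


def number_to_approach(n_required, loss_rate):
    """Eligible subjects to approach so that n_required remain after expected losses.

    loss_rate is the expected proportion refusing or dropping out (0.1 for 10%).
    """
    if not (0.0 <= loss_rate < 1.0):
        raise ValueError("loss_rate must be at least 0 and less than 1")
    return math.ceil(n_required / (1.0 - loss_rate))


# A calculated minimum of 278 subjects with 10% expected loss.
print(number_to_approach(278, 0.10))  # 309
```

Dividing by (1 − loss rate) gives a slightly larger number than simply adding 10% to n, because the losses apply to everyone approached, not only to the minimum sample.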

  51. Inappropriate Sample Size
      - Too SMALL
        - wide CI
        - unable to detect a real effect
        - may miss an important association
      - Too BIG
        - waste of resources (effort, time, money)
        - even very small effects become statistically significant
        - may be unethical

  52. “Although our intellect always longs for clarity and certainty, our nature often finds uncertainty fascinating.” ~ Karl von Clausewitz ~

  53. Sample Size Calculation
      - One sample
        - Estimating: proportion, mean
        - Hypothesis testing: proportion, mean
      - Two samples
        - Estimating: difference between two proportions, two means
        - Hypothesis testing: difference between two proportions, two means

  54. Sample size calculations for estimation are based on:
      $d = Z_{1-\alpha/2} \times SE$
      In each case, we just put in the appropriate expression for the standard error, e.g. $SD / \sqrt{n}$.
      [Figure: sampling distribution with d and α/2 marked]

  55. Estimating A Population Mean
      $d = Z_{1-\alpha/2} \times SE$
      $SE = \sigma / \sqrt{n}$
      $\therefore d = Z_{1-\alpha/2} \, \sigma / \sqrt{n}$
      $\therefore n = Z_{1-\alpha/2}^2 \, \sigma^2 / d^2$

  56. Example 1 (Estimating a population mean)
      - An estimate is desired of the average retail price of 20 tablets of a tranquilizer. It is required to be within 10% of the true average price with 95% confidence. The SD of the price was estimated as 85% of the average price. How many pharmacies should be randomly selected?
        $n = Z_{1-\alpha/2}^2 \, \sigma^2 / d^2$
      - $n = (1.96)^2 (0.85)^2 / (0.1)^2 = 277.6$, rounded up to 278
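The calculation in Example 1 can be reproduced in a few lines. This is a minimal sketch, assuming Z is taken from the standard normal distribution and the result is rounded up to a whole subject; the function name `sample_size_mean` and its parameter names are illustrative.

```python
import math
from statistics import NormalDist


def sample_size_mean(sd, d, confidence=0.95):
    """Sample size to estimate a mean to within +/- d, given standard deviation sd.

    sd and d must be in the same units (here both are fractions of the mean).
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95% confidence
    return math.ceil((z ** 2) * (sd ** 2) / (d ** 2))


# Example 1: SD = 85% of the mean, required precision = 10% of the mean.
print(sample_size_mean(sd=0.85, d=0.10))  # 278
```

Because sd and d enter only as the ratio sd/d, expressing both as percentages of the mean gives the same answer as expressing both in currency units.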

  57. Estimating A Population Proportion
      $d = Z_{1-\alpha/2} \times SE$
      $SE = \sqrt{p(1-p)/n}$
      $\therefore n = Z_{1-\alpha/2}^2 \, p(1-p) / d^2$

  58. Example 2 (Estimating a population proportion)
      - A district public health officer seeks to estimate the proportion of children in the district receiving appropriate childhood vaccinations. How many children must be studied if the resulting estimate is to fall within 10 percentage points of the true proportion with 95% confidence?
        $n = Z_{1-\alpha/2}^2 \, p(1-p) / d^2$
      - With p unknown, take p = 0.5, so p(1 − p) = 0.25: $n = (1.96)^2 (0.25) / (0.1)^2 = 96.04$, rounded up to 97
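Example 2 can be checked the same way. This is a sketch under the assumptions that p = 0.5 is used because the true proportion is unknown and that the result is rounded up; the function name `sample_size_proportion` is illustrative.

```python
import math
from statistics import NormalDist


def sample_size_proportion(p, d, confidence=0.95):
    """Sample size to estimate a proportion p to within +/- d (absolute)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95% confidence
    return math.ceil((z ** 2) * p * (1.0 - p) / (d ** 2))


# Example 2: p unknown (use 0.5), precision of 10 percentage points.
print(sample_size_proportion(p=0.5, d=0.10))  # 97
```

Halving d to 5 percentage points roughly quadruples the required sample, since d appears squared in the denominator.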

  59. Parameter Estimation
      - The sample size will be largest when P = 0.5
      - When one has no idea what the level of P is in the population, choosing 0.5 for P will always provide enough observations.

      P      P(1−P)
      0.5    0.25
      0.4    0.24
      0.3    0.21
      0.2    0.16
      0.1    0.09
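The effect of P on the required sample can be tabulated directly. The sketch below assumes 95% confidence and a precision of 10 percentage points purely for illustration; the function name `n_for_proportion` is hypothetical.

```python
import math
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)  # Z for 95% confidence
D = 0.10                            # assumed precision: 10 percentage points


def n_for_proportion(p):
    """Required sample size for proportion p at the assumed confidence and precision."""
    return math.ceil((Z_95 ** 2) * p * (1.0 - p) / (D ** 2))


sizes = {p: n_for_proportion(p) for p in (0.5, 0.4, 0.3, 0.2, 0.1)}
for p, n in sizes.items():
    print(f"P = {p:.1f}  P(1-P) = {p * (1 - p):.2f}  n = {n}")
```

The required n follows P(1 − P), so it peaks at P = 0.5 and a sample sized for 0.5 covers any other value of P.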

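  60. Hypothesis Testing
      - Δ (mean of B − mean of A) is the minimum effect worth detecting
      - Δ* is the critical value separating the α and β regions:
        $\Delta^* = Z_{1-\alpha/2} \times SE_0 = \Delta - (Z_{1-\beta} \times SE_1)$
        $\Delta = (Z_{1-\alpha/2} \times SE_0) + (Z_{1-\beta} \times SE_1)$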

  61. Basic Equations Underlying Sample Size
      (1)  $d = Z_{1-\alpha/2} \times SE$
      (2a) $\Delta = Z_{1-\alpha/2} \times SE_0 + Z_{1-\beta} \times SE_1$
      If $SE_0 = SE_1 = SE$, then
      (2b) $\Delta = SE \, (Z_{1-\alpha/2} + Z_{1-\beta})$
      Most sample size calculations for estimation and hypothesis testing are based on these equations.
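As an illustration of equation (2b), suppose the standard error takes the one-sample form SE = σ/√n (an assumption made here, not stated on the slide). Solving (2b) for n then gives n = (Z for 1 − α/2 plus Z for 1 − β)² σ² / Δ². The function name `sample_size_for_power` and the input values below are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_power(sigma, delta, alpha=0.05, power=0.90):
    """Solve equation (2b) for n, assuming SE = sigma / sqrt(n).

    delta is the minimum effect worth detecting; power is 1 - beta.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # Z for 1 - alpha/2
    z_beta = NormalDist().inv_cdf(power)               # Z for 1 - beta
    return math.ceil(((z_alpha + z_beta) ** 2) * (sigma ** 2) / (delta ** 2))


# Hypothetical inputs: sigma = 10, minimum effect of 5, alpha = 0.05, power = 90%.
print(sample_size_for_power(sigma=10, delta=5))  # 43
```

Other designs, such as two-sample comparisons, use a different expression for SE, so the resulting formula for n changes while the structure of (2b) stays the same.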
