Sampling and Probability

Learning Objectives
• Understand probability sampling
• Compute and interpret unconditional and conditional probabilities
• Evaluate and interpret independence of events

Two Areas of Biostatistics
[Figure: a SAMPLE (n, X̄) drawn from a POPULATION (µ = ?). Descriptive statistics summarize the sample; the goal is statistical inference about the population.]
Population and Sample
• It is usually impractical to include the entire population in a study, so we select a representative sample from the population.
• Every member of the population should have the same chance of being selected into the sample.
• Probability is important both for selecting the sample and for making statistical inferences.

Sampling from a Population
[Figure: many samples, each of size n, drawn from a population of size N]

Probability Sampling
• Simple random sample
  • Enumerate all N members of the population (the sampling frame), then select n individuals at random, so that each has the same probability of being selected.
• Systematic sample
  • Start with the sampling frame and determine the sampling interval (N/n). Select the first person at random from the first N/n members, then select every (N/n)th member thereafter.
• Stratified sample
  • Organize the population into mutually exclusive strata, then select individuals at random within each stratum.
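The three probability sampling methods above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the sampling frame, the strata, the sample sizes, and all function names are hypothetical, and the systematic sample assumes n ≤ N and rounds the interval N/n down to a whole number.

```python
import random

def simple_random_sample(frame, n, rng):
    """Select n members at random; every member has the same chance."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start in the first interval, then every k-th member after it."""
    k = len(frame) // n        # sampling interval N/n, rounded down (assumption)
    start = rng.randrange(k)   # random position within the first k members
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of fixed size within each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(0)                 # fixed seed so the run is repeatable
frame = list(range(1, 101))            # hypothetical sampling frame, N = 100

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)

# Hypothetical strata: two mutually exclusive groups covering the frame.
strata = {"boys": list(range(1, 51)), "girls": list(range(51, 101))}
strat = stratified_sample(strata, 5, rng)
```

Passing an explicit random generator to each function keeps the selection reproducible, which is convenient when a sample must be documented or re-drawn.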
Non-Probability Sampling
Non-probability sampling is useful in many scenarios where it is not possible to generate a sampling frame.
• Convenience sample
  • Individuals are selected into the sample through any convenient contact (not suitable for inference).
• Quota sample
  • A pre-determined number of individuals is selected into the sample from each group of interest (not selected at random).

Basics of Probability
• Probability reflects the likelihood that an outcome will occur.
• 0 ≤ Probability ≤ 1
• Probability = (Number with outcome) / N

Study of Obesity in Children

Age     5     6     7     8     9     10    Total
Boys    432   379   501   410   420   418   2560
Girls   408   513   412   436   461   500   2730
Total   840   892   913   846   881   918   5290

See more details on Page 69, Example 5.1
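As a rough sketch, the probability formula above can be applied directly to the counts in the obesity study table. The variable and function names are illustrative choices, not part of the course material; the counts are copied from the table.

```python
# Counts from the obesity study table, indexed by age 5..10.
ages = [5, 6, 7, 8, 9, 10]
counts = {
    "boys":  [432, 379, 501, 410, 420, 418],
    "girls": [408, 513, 412, 436, 461, 500],
}

# N is the total number of children in the table.
N = sum(sum(row) for row in counts.values())

def probability(number_with_outcome, total):
    """Probability = (number with outcome) / N."""
    return number_with_outcome / total

# Probability of selecting a boy.
p_boy = probability(sum(counts["boys"]), N)

# Probability of selecting a boy who is 10 years old.
p_boy_age_10 = probability(counts["boys"][ages.index(10)], N)

# Probability of selecting a child who is at least 8 years old.
n_at_least_8 = sum(
    counts[sex][i]
    for sex in counts
    for i, age in enumerate(ages)
    if age >= 8
)
p_at_least_8 = probability(n_at_least_8, N)

print(round(p_boy, 3), round(p_boy_age_10, 3), round(p_at_least_8, 3))
# prints: 0.484 0.079 0.5
```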
Basic Probability Calculation
P(Select a boy) = 2560/5290 = 0.484
P(Select boy age 10) = 418/5290 = 0.079
P(Select child at least 8 years of age) = (846+881+918)/5290 = 2645/5290 = 0.500

Conditional Probability
• Probability of an outcome in a specific sub-population
P(Select 9 year old from among girls) = P(Select 9 year old|girl) = 461/2730 = 0.169
P(Select boy|6 years of age) = 379/892 = 0.425

Evaluation of PSA Test

               Prostate Cancer   No Prostate Cancer   Total
Low PSA        3                 61                   64
Moderate PSA   13                28                   41
High PSA       12                3                    15
Total          28                92                   120
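A conditional probability restricts the denominator to the sub-population of interest. The sketch below applies that idea to the PSA table, computing the probability of prostate cancer within each PSA level. The dictionary layout and the function name are assumptions made for illustration; the counts come from the table.

```python
# Counts from the PSA table: rows are PSA levels, columns are diagnoses.
psa = {
    "low":      {"cancer": 3,  "no_cancer": 61},
    "moderate": {"cancer": 13, "no_cancer": 28},
    "high":     {"cancer": 12, "no_cancer": 3},
}

def conditional_probability(table, outcome, given_row):
    """P(outcome | given_row): count with the outcome divided by the row total."""
    row = table[given_row]
    return row[outcome] / sum(row.values())

# P(Prostate Cancer | PSA level) for each level in the table.
p_cancer_given = {
    level: conditional_probability(psa, "cancer", level) for level in psa
}

for level, p in p_cancer_given.items():
    print(f"P(cancer | {level} PSA) = {p:.3f}")
```

The only change from an unconditional probability is the denominator: the row total (for example, 64 men with low PSA) replaces the overall total of 120.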
Conditional Probability
P(Prostate Cancer|Low PSA) = 3/64 = 0.047
P(Prostate Cancer|Moderate PSA) = 13/41 = 0.317
P(Prostate Cancer|High PSA) = 12/15 = 0.80

Independence
• Two events, A and B, are independent if P(A|B) = P(A) or, equivalently, if P(B|A) = P(B).
• Is the screening test independent of prostate cancer diagnosis?
  • P(Prostate Cancer) = 28/120 = 0.233
  • P(Prostate Cancer|Low PSA) = 0.047
  • P(Prostate Cancer|Moderate PSA) = 0.317
  • P(Prostate Cancer|High PSA) = 0.80
• No. In these data the conditional probabilities differ from P(Prostate Cancer) = 0.233, so the PSA result and prostate cancer diagnosis are not independent.

Summary
We have covered the following topics in this learning unit.
• Population and sample
• Probability sampling methods
• Basic probability calculation
• Conditional probability and independence
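Returning to the independence question for the PSA test, the definition P(A|B) = P(A) can be checked by comparing each conditional probability with the unconditional one. This is a minimal sketch that only compares proportions in the observed table; the function name, the tolerance, and the second table are hypothetical and included purely for contrast.

```python
# Counts from the PSA table: rows are PSA levels, columns are diagnoses.
psa = {
    "low":      {"cancer": 3,  "no_cancer": 61},
    "moderate": {"cancer": 13, "no_cancer": 28},
    "high":     {"cancer": 12, "no_cancer": 3},
}

def unconditional(table, outcome):
    """P(outcome): count with the outcome divided by the grand total."""
    total = sum(sum(row.values()) for row in table.values())
    return sum(row[outcome] for row in table.values()) / total

def is_independent(table, outcome, tol=1e-9):
    """True if P(outcome | row) equals P(outcome) for every row of the table."""
    p = unconditional(table, outcome)
    return all(
        abs(row[outcome] / sum(row.values()) - p) <= tol
        for row in table.values()
    )

p_cancer = unconditional(psa, "cancer")        # 28/120
psa_independent = is_independent(psa, "cancer")

# Hypothetical table in which the outcome has the same proportion in each row.
balanced = {
    "a": {"cancer": 1, "no_cancer": 3},
    "b": {"cancer": 2, "no_cancer": 6},
}
balanced_independent = is_independent(balanced, "cancer")

print(round(p_cancer, 3), psa_independent, balanced_independent)
# prints: 0.233 False True
```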