Mathematics for Computer Science
MIT 6.042J/18.062J

Sampling & Confidence

Albert R Meyer, May 13, 2013

Pairwise Independent Sampling

Theorem: Let $R_1, \ldots, R_n$ be pairwise independent random variables with the same finite mean $\mu$ and variance $\sigma^2$. Let

$$A_n ::= \frac{R_1 + R_2 + \cdots + R_n}{n}.$$

Then

$$\Pr[\,|A_n - \mu| > \delta\,] \le \frac{1}{n}\left(\frac{\sigma}{\delta}\right)^2.$$

Sampling

Coliform count in the Charles River: for swimming, the EPA requires average CMD < 200 (CMD = Coliform Microbial Density).

Sampling Questions

Make 32 measurements of CMD at random times and locations.
Sampling Questions

A few of the 32 counts turn out to be > 200, but their average is 180. Can we convince the EPA that the average in the whole river is < 200?

That is, can we convince the EPA that the estimate based on 32 samples is within 20 of the actual average?

Pairwise Independent Sampling

$$\Pr[\,|A_n - \mu| > \delta\,] \le \frac{1}{n}\left(\frac{\sigma}{\delta}\right)^2$$

with $n = 32$, $\mu = c$, $\delta = 20$.

Sampling parameters

c ::= actual average CMD in the river
CMD sample ↔ random variable with $\mu = c$
n samples ↔ n mutually independent random variables with $\mu = c$
$A_n$ ::= average of the n CMD samples
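The slides state the sampling bound without proof. As a sketch of the missing step (this is the standard Chebyshev argument, not something shown on the slides): pairwise independence is already enough for the variances of the $R_i$ to add, so

```latex
\begin{align*}
\mathrm{E}[A_n] &= \frac{1}{n}\sum_{i=1}^{n}\mathrm{E}[R_i] = \mu, \\
\mathrm{Var}[A_n] &= \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}[R_i] = \frac{\sigma^2}{n}
  && \text{(pairwise independence)}, \\
\Pr[\,|A_n - \mu| > \delta\,] &\le \frac{\mathrm{Var}[A_n]}{\delta^2}
  = \frac{1}{n}\left(\frac{\sigma}{\delta}\right)^2
  && \text{(Chebyshev)}.
\end{align*}
```

Mutually independent samples are in particular pairwise independent, so the theorem applies to the n CMD samples.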
Pairwise Independent Sampling

$$\Pr[\,|A_{32} - c| > 20\,] \le \frac{1}{32}\left(\frac{\sigma}{20}\right)^2$$

with $n = 32$, $\mu = c$, $\delta = 20$. But $\sigma$ = ?? We don't know $\sigma$.

Bound for $\sigma$

Suppose L is the max possible difference of samples. Then the worst case is $\sigma = L/2$. With $L = 50$, this gives $\sigma \le 25$.

Pairwise Independent Sampling

$$\Pr[\,|A_{32} - c| > 20\,] \le \frac{1}{32}\left(\frac{25}{20}\right)^2 < 0.05$$

so

$$\Pr[\,|A_{32} - c| \le 20\,] > 0.95.$$

Confidence, not Probable Reality

It is tempting to say: "the probability that c = 180 ± 20 is at least 0.95". This is technically wrong!
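The arithmetic behind the "< 0.05" step can be checked directly. The following is a minimal Python sketch using the slide's values n = 32, σ = 25, δ = 20; the function names `sampling_bound` and `samples_needed` are illustrative and do not come from the lecture.

```python
import math


def sampling_bound(n: int, sigma: float, delta: float) -> float:
    """Upper bound on Pr[|A_n - mu| > delta] from the sampling theorem."""
    return (sigma / delta) ** 2 / n


def samples_needed(sigma: float, delta: float, max_error_prob: float) -> int:
    """Smallest n for which the bound is at most max_error_prob."""
    return math.ceil((sigma / delta) ** 2 / max_error_prob)


# Values from the slides: 32 samples, worst-case sigma = 25, tolerance 20.
bound = sampling_bound(n=32, sigma=25.0, delta=20.0)
n_min = samples_needed(sigma=25.0, delta=20.0, max_error_prob=0.05)

print(bound, bound < 0.05)
print(n_min)
```

Under these assumptions the bound first drops below 0.05 at n = 32, which is consistent with the choice of 32 measurements.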
Confidence

The outcome of our sampling process is a random variable. We can say that "the probability that our sampling process will yield an average that is within ±20 of the true average is at least 0.95".

c is the actual average in the river. c is unknown, but not a random variable!

For simplicity we say that c = 180 ± 20 at the 95% confidence level.

Tell the EPA that with probability at least 0.95 our estimation method for avg CMD will be within 20 of the actual avg, c, in the river.
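A small simulation can make the distinction concrete: c stays fixed, and only the sampling process is random. This is a sketch under loudly stated assumptions that are not in the slides: the true average is set to a hypothetical `TRUE_C = 185`, and each CMD sample is drawn uniformly from an interval of width 50 around it (so the max difference of samples is L = 50). The real CMD distribution is unknown.

```python
import random


def within_delta_fraction(c, half_range, n, delta, trials, seed=0):
    """Fraction of repeated sampling experiments whose average lands within delta of c."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # One experiment: n independent samples, then their average A_n.
        samples = [rng.uniform(c - half_range, c + half_range) for _ in range(n)]
        average = sum(samples) / n
        if abs(average - c) <= delta:
            hits += 1
    return hits / trials


TRUE_C = 185.0  # hypothetical fixed value; in reality c is unknown but not random

fraction = within_delta_fraction(
    c=TRUE_C, half_range=25.0, n=32, delta=20.0, trials=2000
)
print(fraction)
```

The 0.95 guarantee is a statement about how often this repeated process succeeds, not about c. Because the Chebyshev-style bound is conservative, the simulated fraction should sit well above 0.95 for this assumed distribution.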
Confidence

Moral: when you are told that some fact holds at a high confidence level, remember that a random experiment lies behind this claim. Ask yourself "what experiment?"

Moral: Also ask "Why am I hearing about this particular experiment? How many others were tried and not reported?" See http://xkcd.com/882/
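To see why unreported experiments matter, here is a rough Python sketch. It assumes a hypothetical batch of 20 independent experiments, each with exactly a 5% chance of producing a misleading result at the 95% confidence level. Both the count and the independence are assumptions for illustration, and the function name is made up.

```python
def prob_some_false_alarm(num_experiments: int, error_prob: float) -> float:
    """Probability that at least one of several independent experiments misleads."""
    # Complement of "every experiment gives a correct result".
    return 1.0 - (1.0 - error_prob) ** num_experiments


p = prob_some_false_alarm(num_experiments=20, error_prob=0.05)
print(round(p, 3))
```

Under these assumptions the chance of at least one misleading result is about 0.64, so one "95% confidence" finding picked out of many attempts carries much less weight than it appears to.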
MIT OpenCourseWare
http://ocw.mit.edu

6.042J / 18.062J Mathematics for Computer Science
Spring 2015

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.