
Sampling & Confidence: Pairwise Independent Sampling



  1. Mathematics for Computer Science, MIT 6.042J/18.062J: Sampling & Confidence. Albert R Meyer, May 13, 2013.
     Pairwise Independent Sampling. Theorem: Let $R_1, \ldots, R_n$ be pairwise independent random variables with the same finite mean $\mu$ and variance $\sigma^2$. Let $A_n ::= (R_1 + R_2 + \cdots + R_n)/n$. Then $\Pr[|A_n - \mu| > \delta] \le \frac{1}{n}\left(\frac{\sigma}{\delta}\right)^2$.
     Sampling: coliform count in the Charles River. For swimming, the EPA requires average CMD < 200 (CMD = Coliform Microbial Density).
     Sampling Questions: Make 32 measurements of CMD at random times and locations.
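The theorem only asks for pairwise independence, and a small exact check can make that concrete. The sketch below is not from the slides: it assumes a standard textbook construction in which k fair seed bits yield n = 2^k - 1 pairwise independent fair bits (one XOR per nonempty subset of the seed bits), each with mean 1/2 and variance 1/4. The function names `sampling_bound` and `xor_tail_probability` are hypothetical.

```python
from fractions import Fraction

def sampling_bound(variance, delta, n):
    """Theorem's upper bound (1/n) * (sigma/delta)^2, with variance = sigma^2."""
    return Fraction(variance) / (n * Fraction(delta) ** 2)

def xor_tail_probability(k, delta):
    """Exact Pr[|A_n - 1/2| > delta] for the n = 2^k - 1 XOR bits.

    Assumed construction (not from the slides): one variable per nonempty
    subset of k uniformly random seed bits, equal to the XOR of that subset.
    """
    masks = range(1, 2 ** k)      # each nonzero mask encodes a nonempty subset
    n = len(masks)
    bad_seeds = 0
    for seed in range(2 ** k):    # every seed value is equally likely
        # R_mask is the parity (XOR) of the seed bits selected by mask.
        ones = sum(bin(seed & mask).count("1") % 2 for mask in masks)
        if abs(Fraction(ones, n) - Fraction(1, 2)) > delta:
            bad_seeds += 1
    return Fraction(bad_seeds, 2 ** k), n

exact, n = xor_tail_probability(k=3, delta=Fraction(1, 4))
bound = sampling_bound(Fraction(1, 4), Fraction(1, 4), n)
print(exact, bound, exact <= bound)
```

With k = 3 the seven variables are far from mutually independent (they are all 0 whenever the seed is 0), yet the exact tail probability 1/8 still sits below the bound 4/7, as the theorem promises.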

  2. Sampling Questions: A few of the 32 counts turn out to be > 200, but their average is 180. Can we convince the EPA that the average in the whole river is < 200?
     Sampling Questions: That is, can we convince the EPA that the estimate based on 32 samples is within 20 of the actual average?
     Pairwise Independent Sampling: $\Pr[|A_n - \mu| > \delta] \le \frac{1}{n}\left(\frac{\sigma}{\delta}\right)^2$ with $n = 32$, $\mu = c$, $\delta = 20$.
     Sampling parameters: $c$ ::= actual average CMD in the river. A CMD sample ↔ a random variable with $\mu = c$. $n$ samples ↔ $n$ mutually independent random variables with $\mu = c$. $A_n$ ::= average of the $n$ CMD samples.

  3. Pairwise Independent Sampling: $\Pr[|A_{32} - c| > 20] \le \frac{1}{32}\left(\frac{\sigma}{20}\right)^2$ with $n = 32$, $\mu = c$, $\delta = 20$. But $\sigma$?? We don't know it.
     Bound for $\sigma$: suppose $L$ is the max possible difference of samples. Then the worst case is $\sigma = L/2$; with $L = 50$ this gives $\sigma = 25$.
     Pairwise Independent Sampling: $\Pr[|A_{32} - c| > 20] \le \frac{1}{32}\left(\frac{25}{20}\right)^2 < 0.05$, so $\Pr[|A_{32} - c| \le 20] > 0.95$.
     Confidence - not Probable Reality: it is tempting to say "the probability that c = 180 ± 20 is at least 0.95", but this is technically wrong!
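The arithmetic behind the 0.05 figure is easy to check directly. The sketch below plugs the slide's values (σ = 25, δ = 20, n = 32) into the bound, and also searches for the smallest n that pushes the bound below 0.05. The helper names are hypothetical, and the linear search is just the simplest way to do it.

```python
def sampling_error_bound(sigma, delta, n):
    """Upper bound (1/n) * (sigma/delta)**2 on Pr[|A_n - mu| > delta]."""
    return (sigma / delta) ** 2 / n

def smallest_sample_size(sigma, delta, target):
    """Smallest n whose bound is strictly below target (simple linear search)."""
    n = 1
    while sampling_error_bound(sigma, delta, n) >= target:
        n += 1
    return n

bound_32 = sampling_error_bound(sigma=25, delta=20, n=32)
n_min = smallest_sample_size(sigma=25, delta=20, target=0.05)
print(bound_32)  # 0.048828125
print(n_min)     # 32
```

Under these values, 32 is the smallest sample count for which this particular bound drops below 0.05; with 31 samples the bound is still slightly above it.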

  4. Confidence: $c$ is the actual average in the river. $c$ is unknown, but it is not a random variable!
     Confidence: The outcome of our sampling process is a random variable. We can say that "the probability that our sampling process will yield an average that is within ±20 of the true average is at least 0.95".
     Confidence: Tell the EPA that with probability 0.95 our estimation method for average CMD will be within 20 of the actual average, $c$, in the river.
     Confidence: For simplicity we say that c = 180 ± 20 at the 95% confidence level.
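One way to see that the 0.95 describes the sampling procedure rather than c is to fix c and rerun the procedure many times. The sketch below is an illustration under made-up assumptions: it pretends the true average is c = 170 and that each sample is c - 25 or c + 25 with equal probability (a hypothetical two-point distribution with σ = 25, matching the worst case used in the bound). None of these values come from real river data.

```python
import random

def coverage(true_mean, half_range, n, delta, trials, seed=0):
    """Fraction of repeated sampling runs whose average lands within delta of true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(trials):
        # Hypothetical worst-case samples: true_mean +/- half_range, equally likely.
        samples = [true_mean + rng.choice((-half_range, half_range)) for _ in range(n)]
        estimate = sum(samples) / n
        if abs(estimate - true_mean) <= delta:
            hits += 1
    return hits / trials

# c stays fixed at 170 in every run; only the sample average varies.
frac = coverage(true_mean=170, half_range=25, n=32, delta=20, trials=2000)
print(frac)
```

The printed fraction should be at least 0.95, and in practice far higher, because the theorem's bound is conservative. In every run c is the same number; what changes is whether the randomly obtained average falls within 20 of it.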

  5. Confidence. Moral: when you are told that some fact holds at a high confidence level, remember that a random experiment lies behind this claim. Ask yourself "what experiment?"
     Confidence. Moral: Also ask "Why am I hearing about this particular experiment? How many others were tried and not reported?" See http://xkcd.com/882/
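A short calculation suggests why the number of unreported experiments matters. The sketch below assumes, purely for illustration, that several experiments are run independently, that none of them tests a real effect, and that each is reported at the 95% confidence level, so each has a 0.05 chance of producing a spurious result. The count of 20 experiments is an illustrative choice.

```python
def prob_at_least_one_false_alarm(num_experiments, confidence=0.95):
    """Chance that at least one of several independent null experiments looks significant."""
    return 1 - confidence ** num_experiments

p20 = prob_at_least_one_false_alarm(20)
print(round(p20, 4))  # 0.6415
```

Under these assumptions, about 64% of the time at least one of the 20 experiments yields a "95% confidence" finding by chance alone, which is why the single reported result needs the question "how many others were tried?"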

  6. MIT OpenCourseWare, http://ocw.mit.edu. 6.042J / 18.062J Mathematics for Computer Science, Spring 2015. For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
