STAT 113: Working with Theoretical Distributions, Colin Reimer Dawson

  1. STAT 113: Working with Theoretical Distributions
     Colin Reimer Dawson, Oberlin College, November 2, 2017
     Sections: Analytic Approximations; Density Functions; Properties of Normal Distributions; Normal Bootstrap Distribution

  3. Outline
     Analytic Approximations; Density Functions; Properties of Normal Distributions; Normal Bootstrap Distribution

  4. P-value = Proportion of Randomized Sample Statistics
     [Figure: Randomization distribution for the number of heads in 500 coin flips, highlighting the one-tailed P-value testing H1: p > 0.5 for an observation of 270 heads. x-axis: Values (200 to 300); y-axis: Probability; annotation: P(X ≥ 270) ≈ 0.04.]
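As a rough check on the P-value in the figure, the randomization distribution can be simulated directly. This is a minimal sketch in base R; the seed and the number of simulated samples (`n_sims`) are arbitrary choices, not values from the slides.

```r
# Assumptions: base R only; seed and n_sims are arbitrary illustrative choices.
set.seed(113)

n_flips  <- 500    # coin flips per simulated sample
observed <- 270    # observed number of heads
n_sims   <- 10000  # number of randomization samples (hypothetical choice)

# Simulate the number of heads under H0: p = 0.5
sim_heads <- rbinom(n_sims, size = n_flips, prob = 0.5)

# One-tailed P-value: proportion of simulated statistics at least as extreme
p_sim <- mean(sim_heads >= observed)

# Exact binomial tail probability P(X >= 270) for comparison
p_exact <- pbinom(observed - 1, size = n_flips, prob = 0.5, lower.tail = FALSE)

print(c(simulated = p_sim, exact = p_exact))
```

Both numbers should land near the 0.04 shown on the slide; the simulated value changes slightly with the seed.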

  5. Confidence Level = Proportion of Bootstrap Samples
     [Figure: Bootstrap distribution for mean mercury level in fish in Florida lakes (from the FloridaLakes dataset). The middle 95% is highlighted, illustrating a 95% confidence interval. x-axis: Mercury (0.4 to 0.7); y-axis: Density.]

  6. Properties of Sampling Distributions
     Most (about 95%) simple random samples have a sample mean (x̄) that is within 2 standard errors (SE) of the population mean (μ). Therefore, about 95% of the time, the population mean will be within 2 SE of the sample mean!
     A similar statement holds for some other statistics/parameters, under a particular condition. What condition? The sampling distribution needs to be (approximately) symmetric and bell-shaped.
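The "within 2 SE" claim can be checked by simulation. The sketch below assumes a made-up Normal population (`mu`, `sigma`) and sample size `n`; none of these values come from the slides.

```r
# Assumptions: hypothetical population N(50, 10) and sample size 40.
set.seed(113)

mu     <- 50     # hypothetical population mean
sigma  <- 10     # hypothetical population standard deviation
n      <- 40     # size of each simple random sample
n_sims <- 10000  # number of simulated samples

# Standard error of the sample mean
se <- sigma / sqrt(n)

# Sample mean of each simulated sample
xbars <- replicate(n_sims, mean(rnorm(n, mean = mu, sd = sigma)))

# Proportion of sample means within 2 SE of the population mean
within_2se <- mean(abs(xbars - mu) <= 2 * se)
print(within_2se)  # roughly 0.95
```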

  7. So what's with all these bell shapes?
     • Q: Why are so many distributions "bell-shaped"?
     • A: The Central Limit Theorem
     • One of the most important results in probability: for sufficiently large samples, sample means have an approximately Normal (bell-shaped) distribution.
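One way to see the Central Limit Theorem at work is to draw samples from a strongly skewed population and watch the skew of the sample means shrink as the sample size grows. The sketch below is illustrative only: the Exponential population, the sample sizes, and the helper names `skewness` and `sample_means` are all assumptions made for this example.

```r
# Assumptions: Exponential(rate = 1) population; helper names are hypothetical.
set.seed(113)

# Simple moment-based skewness (0 for a symmetric distribution)
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Sample means from n_sims samples of size n
sample_means <- function(n, n_sims = 10000) {
  replicate(n_sims, mean(rexp(n, rate = 1)))
}

skew_small <- skewness(sample_means(2))   # small samples: still clearly skewed
skew_large <- skewness(sample_means(50))  # larger samples: much closer to symmetric

print(c(n_2 = skew_small, n_50 = skew_large))
```

Replacing `skewness()` with `hist(sample_means(50))` gives a visual version of the same comparison.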

  8. Sample Means Show Up A Lot
     • Sample means are sample means (did you know this?)
     • Sample proportions are sample means (encode the binary variable as 0s and 1s)
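The 0/1 encoding can be illustrated with a few lines of R. The `answers` vector is made-up data for this sketch.

```r
# Assumption: 'answers' is hypothetical survey data.
answers <- c("yes", "no", "yes", "yes", "no", "yes", "no", "yes")

# Encode the binary variable as 1 (yes) and 0 (no)
is_yes <- as.numeric(answers == "yes")

# Sample proportion computed the usual way
prop <- sum(answers == "yes") / length(answers)

# The mean of the 0/1 variable is the same number
print(c(proportion = prop, mean_of_indicators = mean(is_yes)))
```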

  9. Even More Stuff is Normal
     Also...
     • Sum of two independent Normals is Normal
     • Rescaling a Normal by a constant is Normal
     • Difference of independent Normals is Normal
     So...
     • Sampling distribution for difference of sample means is approximately Normal
     • Sampling distribution for difference of sample proportions is approximately Normal
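A quick simulation can make the "difference of Normals" property concrete. The sketch assumes two independent Normals with made-up parameters; for independent variables the means subtract and the variances add, so the difference should behave like N(6, 5) here.

```r
# Assumptions: X ~ N(10, 3) and Y ~ N(4, 4) are hypothetical and independent.
set.seed(113)
n_sims <- 100000

x <- rnorm(n_sims, mean = 10, sd = 3)
y <- rnorm(n_sims, mean = 4,  sd = 4)
d <- x - y

theory_mean <- 10 - 4            # means subtract
theory_sd   <- sqrt(3^2 + 4^2)   # variances add, so sd = 5

# If d is Normal, about 95% of values fall within 2 sd of its mean
within_2sd <- mean(abs(d - theory_mean) <= 2 * theory_sd)

print(c(mean = mean(d), sd = sd(d), within_2sd = within_2sd))
```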

  11. Approximating with a Smooth Curve
      [Figure: Frequency histograms of babies' birth weights (Nolan and Speed, 2000), six panels. x-axis: Birthweight in oz (50 to 200); y-axis: Frequency, with a different scale in each panel.]

  12. Density
      Proportion = Area = Height × Width
      Density = Height = Proportion / Width
      This quantity (proportion divided by width) is called "density" by analogy to physics: "amount of stuff" divided by "amount of space".
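The relationship between proportion, width, and density can be verified on a histogram object in base R. The `weights` vector below is simulated stand-in data, not the Nolan and Speed birth weights.

```r
# Assumption: 'weights' is simulated stand-in data (hypothetical mean and sd).
set.seed(113)
weights <- rnorm(1000, mean = 120, sd = 18)

# Compute the histogram without drawing it
h <- hist(weights, breaks = 20, plot = FALSE)

widths      <- diff(h$breaks)            # width of each bin
proportions <- h$counts / sum(h$counts)  # proportion of cases in each bin

# Density = proportion / width
density_by_hand <- proportions / widths

# Matches the density R reports, and the bar areas sum to 1
print(all.equal(density_by_hand, h$density))
print(sum(h$density * widths))
```

Calling `hist(weights, freq = FALSE)` draws the same bars on the density scale.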

  13. Density Histograms
      [Figure: Density histograms of babies' birth weights (Nolan and Speed, 2000), six panels. x-axis: Birthweight in oz (50 to 200); y-axis: Density (0.000 to 0.030).]

  14. Density Functions
      [Figure: Densities of babies' birth weights (Nolan and Speed, 2000), six panels. x-axis: Birthweight in oz (50 to 200); y-axis: Density (0.000 to 0.030).]

  15. Proportion = Area Under the Density Curve
      [Figure: Approximating the birth weight distribution using a Normal. The shaded area is P(Weight ≥ 148 oz), labeled P = 0.067. x-axis: Birthweight in oz (50 to 200); y-axis: Density.]

  17. Normal Distributions
      Normal distributions are completely specified by their mean (μ) and their standard deviation (σ). We can write N(0, 1) as shorthand for a Normal with mean 0 and standard deviation 1.
      [Figure: density curves for N(0, 1), N(2, 1), N(0, 0.5), and N(−4, 0.3), plotted for x from −6 to 6.]
      $$\text{density}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}},$$
      but we won't use this directly.

  18. Normal Distributions
      [Figure: Normal curve with a shaded region; axis marked at μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ.]
      Pairs: (Approximately) what proportion of the area under the curve is shaded?
      In a bell-shaped (Normal) distribution, about 95% of cases lie within 2 standard deviations of the mean. So about 5% lie more than 2σ from μ.
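The 95% figure can be checked with base R's `pnorm()`. The helper name `within_k_sd` and the values of `mu` and `sigma` below are illustrative choices; the result is the same for any mean and standard deviation.

```r
# Proportion of a Normal distribution within k standard deviations of the mean
# (helper name is hypothetical, chosen for this sketch)
within_k_sd <- function(k) pnorm(k) - pnorm(-k)

print(within_k_sd(1))  # about 0.6827
print(within_k_sd(2))  # about 0.9545

# The same proportion holds for any mean and sd (values here are arbitrary)
mu    <- 29.11
sigma <- 0.93
p_general <- pnorm(mu + 2 * sigma, mean = mu, sd = sigma) -
             pnorm(mu - 2 * sigma, mean = mu, sd = sigma)
print(p_general)
```

The exact cutoff for a central 95% is about 1.96 standard deviations (`qnorm(0.975)`), which is why "2" is described as approximate.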

  19. Area Under Normal Curve
      [Figure: standard Normal curve, x-axis from −2 to 2.]
      Area under a curve using calculus:
      $$\int_{1.5}^{\infty} \frac{1}{1\cdot\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-0}{1}\right)^{2}}\,dx$$
      but this integrand doesn't have a closed-form antiderivative.
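Even without a closed-form antiderivative, the area can be approximated numerically. The sketch below writes out the standard Normal density (μ = 0, σ = 1) by hand and integrates it with base R's `integrate()`, then compares the result with `pnorm()`; the function name `std_normal_density` is just a label for this example.

```r
# Standard Normal density written out by hand (mu = 0, sigma = 1)
std_normal_density <- function(x) exp(-x^2 / 2) / sqrt(2 * pi)

# Numerical integration from 1.5 to infinity
area_numeric <- integrate(std_normal_density, lower = 1.5, upper = Inf)$value

# Built-in Normal tail probability for comparison
area_pnorm <- pnorm(1.5, mean = 0, sd = 1, lower.tail = FALSE)

print(c(numeric = area_numeric, pnorm = area_pnorm))  # both about 0.0668
```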

  20. StatKey to the Rescue!

  21. R Works Too
      library("mosaic")
      ## Area to the right of 1.5
      xpnorm(1.5, mean = 0, sd = 1, lower.tail = FALSE)
      Output:
      If X ~ N(0, 1), then
      P(X <= 1.5) = P(Z <= 1.5) = 0.9331928
      P(X > 1.5) = P(Z > 1.5) = 0.0668072
      [Figure: N(0, 1) density with 1.5 (z = 1.5) marked; area 0.9332 to the left and 0.0668 to the right.]

  23. Quantiles of a Normal Curve
      Suppose that the bootstrap distribution of means for samples of 500 Atlanta commute times is N(29.11, 0.93). Find an endpoint (percentile) so that just 5% of the bootstrap means are smaller.

  24. StatKey...

  25. And in R...
      ## 5th percentile of N(29.11, 0.93); xqnorm() needs library("mosaic")
      xqnorm(0.05, mean = 29.11, sd = 0.93)
      Output:
      P(X <= 27.5802861269351) = 0.05
      P(X > 27.5802861269351) = 0.95
      [1] 27.58029
      [Figure: N(29.11, 0.93) density with 27.5803 (z = −1.645) marked; area 0.05 to the left and 0.95 to the right.]
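The same percentile can be found without the mosaic package. The sketch below uses base R's `qnorm()` and also shows the equivalent "mean plus z times standard deviation" arithmetic; the variable names are illustrative.

```r
boot_mean <- 29.11  # mean of the Normal bootstrap distribution (from the slide)
boot_sd   <- 0.93   # its standard deviation (from the slide)

# 5th percentile directly
q_direct <- qnorm(0.05, mean = boot_mean, sd = boot_sd)

# Same value via the standard Normal quantile z, about -1.645
z <- qnorm(0.05)
q_via_z <- boot_mean + z * boot_sd

print(c(direct = q_direct, via_z = q_via_z))  # both about 27.58

# Round trip: the area to the left of the endpoint is 0.05
print(pnorm(q_direct, mean = boot_mean, sd = boot_sd))
```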
