
Sampling Distributions & Probability. Paul Gribble, Winter 2019


  1. Sampling Distributions & Probability. Paul Gribble, Winter 2019

  2. McCall Chapter 3
     ▶ measures of central tendency
     ▶ mean
     ▶ deviations about the mean
     ▶ minimum variability of scores about the mean
     ▶ median
     ▶ mode

  3. McCall Chapter 3
     ▶ measures of variability
     ▶ range
     ▶ variance
     ▶ standard deviation

  4. Population vs Sample
     ▶ why do we sample the population?
     ▶ because in many cases we cannot feasibly measure the entire population
     ▶ the idea is that we can use characteristics of our sample to estimate characteristics of the population

  5. McCall Chapter 3
     ▶ populations vs samples
     ▶ estimators of population parameters
     ▶ based on a sample
     ▶ e.g. for estimating parameters of a normal distribution
     ▶ mean, variance

  6. McCall Chapter 7
     ▶ sampling
     ▶ sampling distribution
     ▶ sampling error
     ▶ probability & hypothesis testing
     ▶ estimation

  7. Methods of Sampling
     ▶ simple random sampling
     ▶ all elements of the population have an equal probability of being selected for the sample
     ▶ yields samples that are representative of all aspects of the population (for large samples)

  8. Methods of Sampling
     ▶ proportional stratified random sample
     ▶ mainly used for small samples
     ▶ random sampling within groups but not between
     ▶ e.g. political polls
     ▶ random sampling within each province
     ▶ but not between provinces
     ▶ the number of samples drawn from each province is predetermined by its share of the overall population
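The proportional stratified scheme on this slide can be sketched in Python. This is a minimal illustration, not the course's own code: the function name `proportional_stratified_sample`, the three "provinces", and their sizes are all hypothetical, and quotas are simply rounded, so in general they may not sum exactly to the requested total.

```python
import random

def proportional_stratified_sample(strata, total_n, seed=None):
    """Draw a proportional stratified random sample.

    strata:  dict mapping a stratum name to the list of its members.
    total_n: overall sample size to split across the strata.
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # The quota is fixed in advance by the stratum's share of the population.
        quota = round(total_n * len(members) / population_size)
        # Simple random sampling (without replacement) inside the stratum.
        sample[name] = rng.sample(members, quota)
    return sample

# Hypothetical population: three "provinces" of different sizes.
provinces = {
    "A": [f"A{i}" for i in range(600)],
    "B": [f"B{i}" for i in range(300)],
    "C": [f"C{i}" for i in range(100)],
}
poll = proportional_stratified_sample(provinces, total_n=50, seed=1)
quotas = {name: len(chosen) for name, chosen in poll.items()}
print(quotas)  # {'A': 30, 'B': 15, 'C': 5}
```

Who is chosen within each province is random, but how many are chosen from each province is not.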

  9. Random Sampling
     ▶ each subject is selected independently of other subjects
     ▶ selection of one element of the population does not alter the likelihood of selecting any other element of the population

  10. Sampling in Practice
      ▶ the set of population elements available to be sampled is often biased
      ▶ willingness of subjects to participate
      ▶ certain subjects sign up for certain kinds of experiments
      ▶ Psych 1000 subject pool: is it representative of the general population?

  11. Sampling Distributions
      ▶ sampling is an imprecise process
      ▶ an estimate will almost never be exactly the same as the population parameter
      ▶ a set of multiple estimates based on multiple samples is called an empirical sampling distribution

  12. Sampling Distribution
      Definition (sampling distribution): the distribution of a statistic (e.g. the mean) determined on separate independent samples of size N drawn from a given population
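To make the definition concrete, the following Python sketch builds an empirical sampling distribution of the mean by simulation. The population (normal, mean 7, standard deviation 2), the sample size N = 25, and the number of samples are assumptions chosen purely for illustration.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population of raw scores (mean 7, sd 2, chosen for illustration).
population = [rng.gauss(7.0, 2.0) for _ in range(100_000)]

N = 25            # size of each sample
n_samples = 2000  # number of separate, independently drawn samples

# One statistic (here the mean) per sample; the collection of these
# values is the empirical sampling distribution of the mean.
sample_means = [
    statistics.fmean(rng.sample(population, N)) for _ in range(n_samples)
]

print(statistics.fmean(sample_means))  # close to the population mean
print(statistics.stdev(sample_means))  # much smaller than the raw-score sd
```

Swapping `statistics.fmean` for another statistic (e.g. `statistics.median`) would give the sampling distribution of that statistic instead.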

  13. Empirical Sampling Distribution

  14. Sampling Distributions
      ▶ mean, standard deviation and variance in raw score distributions vs sampling distributions:

  15. Population Estimates
      ▶ by using the mean of a sample of raw scores we can estimate both:
      ▶ mean of sampling distribution of means
      ▶ mean of population raw scores
      ▶ we can estimate the standard deviation of the sampling distribution of the means using: $s_{\bar{x}} = \frac{s_x}{\sqrt{N}}$
      ▶ standard deviation of raw scores in sample divided by the square root of the size of the sample
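The estimate on this slide (sample standard deviation divided by the square root of N) takes only a few lines of Python. The helper name `standard_error` and the ten scores are hypothetical, and the sketch assumes the sample standard deviation uses the N - 1 denominator, which is what `statistics.stdev` computes.

```python
import math
import statistics

def standard_error(scores):
    """Estimate the standard error of the mean: s_x / sqrt(N)."""
    n = len(scores)
    s_x = statistics.stdev(scores)  # sample sd (N - 1 denominator, an assumption)
    return s_x / math.sqrt(n)

# Hypothetical sample of ten raw scores.
scores = [5, 9, 7, 6, 8, 7, 10, 4, 7, 7]
sem = standard_error(scores)
print(round(sem, 4))  # 0.5578
```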

  16. Standard error of the mean
      ▶ all that’s required to estimate it is
      ▶ standard deviation of raw scores
      ▶ N (# scores in sample)
      ▶ it represents an estimate of the amount of variability (or sampling error) in means from all possible samples of size N of the population of raw scores

  17. Standard error of the mean
      ▶ this is great news: it means that it’s not necessary to select several samples in order to estimate the population sampling error of the mean
      ▶ we only need 1 sample, and based on its standard deviation, we can compute an estimate of how our estimate of the mean would vary if we were to repeatedly sample
      ▶ we can then use our estimate $s_{\bar{x}}$ as a measure of the precision of our estimate of the population mean

  18. Standard error of the mean
      $s_{\bar{x}} = \frac{s_x}{\sqrt{N}}$
      ▶ we are dividing by $\sqrt{N}$
      ▶ thus $s_{\bar{x}}$ (standard error of the mean) is always smaller than $s_x$ (standard deviation of raw scores in a sample) for any sample with N > 1
      ▶ said differently: the variability of means from sample to sample will always be smaller than the variability of raw scores within a sample

  19. Standard error of the mean
      ▶ as N increases, $s_{\bar{x}}$ decreases
      ▶ for large samples (large N), the mean will be less variable from sample to sample
      ▶ and so will be a more accurate estimate of the true mean of the population
      ▶ larger samples produce more accurate and more precise estimates
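A short Python calculation shows how quickly the standard error shrinks as N grows. The raw-score standard deviation of 2.0 and the particular sample sizes are assumptions chosen for illustration.

```python
import math

s_x = 2.0  # hypothetical standard deviation of raw scores

# Standard error of the mean, s_x / sqrt(N), for increasing sample sizes.
sems = {n: s_x / math.sqrt(n) for n in (4, 16, 64, 256)}
for n, sem in sems.items():
    print(n, sem)
# 4 1.0
# 16 0.5
# 64 0.25
# 256 0.125
```

Because N enters through its square root, quadrupling the sample size only halves the standard error.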

  20. Normal Distribution
      ▶ given random sampling, the sampling distribution of the mean:
      ▶ is a normal distribution if the population distribution of the raw scores is normal
      ▶ approaches a normal distribution as the size of the sample increases even if the population distribution of raw scores is not normal
      ▶ Central Limit Theorem
      ▶ the sum of a large number of independent observations from the same distribution has, under certain general conditions, an approximate normal distribution
      ▶ the approximation steadily improves as the number of observations increases
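The Central Limit Theorem can be checked informally by simulation. The Python sketch below assumes an exponential population (strongly right-skewed) and uses sample skewness as a rough stand-in for "how non-normal" the distribution of sample means is. Both choices, and the helper names `skewness` and `sample_means`, are illustrative assumptions rather than part of the slides.

```python
import random
import statistics

rng = random.Random(42)

def skewness(values):
    """Sample skewness: mean cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def sample_means(n, n_samples=4000):
    """Means of n_samples independent samples of size n drawn from an
    exponential population (a skewed, clearly non-normal distribution)."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(n_samples)
    ]

# Skewness of the empirical sampling distribution of the mean, by sample size.
skew_by_n = {n: skewness(sample_means(n)) for n in (1, 5, 30, 100)}
for n, skew in skew_by_n.items():
    print(n, round(skew, 2))
```

With N = 1 the "means" are just raw scores, so the skewness is that of the population. As N grows it should drift toward 0, the value for a normal distribution.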

  21. Normal Distribution
      ▶ why do we care about whether populations or samples are normally distributed?
      ▶ all sorts of parametric statistical tests are based on the assumption of a particular theoretical sampling distribution
      ▶ t-test (normal)
      ▶ F-test (normal)
      ▶ others...
      ▶ assuming an underlying theoretical distribution allows us to compute population estimates, and probabilities of particular outcomes, quickly and easily
      ▶ non-parametric methods can be used in other cases but they are more work

  22. Normal Distribution
      ▶ given two parameters (mean, variance):
      ▶ we can look up in a table (or compute in R) the proportion of population scores that fall above (or below) a given value (allowing us to compute probabilities of particular outcomes)
      ▶ we can assume the shape of the entire distribution based only on the mean and variance of our sample
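The table lookup described here amounts to standardizing the value and reading off the standard normal cumulative distribution function. In the notation below, $\Phi$ denotes that function; the symbol is an assumption of this note and does not appear on the slides.

```latex
P(X < x) = \Phi\!\left(\frac{x - \mu}{\sigma}\right),
\qquad
P(X > x) = 1 - \Phi\!\left(\frac{x - \mu}{\sigma}\right)
```

Here $\mu$ and $\sigma$ are the mean and standard deviation (the square root of the variance) of the assumed normal distribution.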

  23. Violations of Normality
      ▶ what if the assumption of normality is violated?
      ▶ we can perform non-parametric statistical tests
      ▶ we could determine how serious the violation is (what impact it will have on our statistical tests and the resulting conclusions)
      ▶ pre-existing rules of thumb about how sensitive a given statistical test is to particular kinds of violations of normality
      ▶ Monte Carlo simulations
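One way to read "Monte Carlo simulations" here: repeatedly draw samples from a population where the null hypothesis is true, run the test each time, and count how often it wrongly rejects. The Python sketch below does this for a one-sample t-test with N = 10. The exponential population, the simulation count, and the helper names are assumptions for illustration; the critical value 2.262 is the standard two-tailed t table entry for α = .05 with 9 degrees of freedom.

```python
import math
import random
import statistics

rng = random.Random(7)

N = 10
T_CRIT = 2.262  # two-tailed critical t for alpha = .05, df = N - 1 = 9

def t_statistic(sample, mu0):
    """One-sample t: (sample mean - mu0) / (s / sqrt(N))."""
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.fmean(sample) - mu0) / sem

def false_positive_rate(draw, mu0, n_sims=10_000):
    """Proportion of simulated samples in which a true null is rejected."""
    rejections = 0
    for _ in range(n_sims):
        sample = [draw() for _ in range(N)]
        if abs(t_statistic(sample, mu0)) > T_CRIT:
            rejections += 1
    return rejections / n_sims

# Normal population: the assumption holds, so the rate should be near .05.
rate_normal = false_positive_rate(lambda: rng.gauss(0.0, 1.0), mu0=0.0)
# Exponential population (true mean 1): skewed, so the assumption is violated.
rate_skewed = false_positive_rate(lambda: rng.expovariate(1.0), mu0=1.0)
print(rate_normal, rate_skewed)
```

Comparing the two rates gives a rough sense of how much this particular violation distorts the test at this sample size.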

  24. A single case
      ▶ suppose it is known:
      ▶ for a population asked to remember 15 nouns, the mean number of nouns recalled after 1 hour is 7.0, and the standard deviation is 2.0 ($\mu = 7.0$; $\sigma = 2.0$)
      ▶ in R use dnorm() to compute probability density
      [Figure: normal probability density curve; x-axis: # items recalled (0 to 14); y-axis: probability (0.00 to 0.20)]
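The slide uses R's dnorm() for the density. As a stand-in, the Python sketch below uses `statistics.NormalDist` from the standard library to evaluate the same normal density with the slide's parameters (mean 7.0, standard deviation 2.0). The variable names are hypothetical.

```python
from statistics import NormalDist

# Population of recall scores from the slide: mu = 7.0, sigma = 2.0.
recall = NormalDist(mu=7.0, sigma=2.0)

# Density at each whole number of nouns recalled, from 0 to 15.
density = {x: recall.pdf(x) for x in range(16)}
print(round(density[7], 4))  # 0.1995, the peak of the curve at the mean
```

The peak just under 0.20 agrees with the top of the y-axis in the slide's figure.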

  25. A single case
      ▶ does taking a new drug improve memory?
      ▶ test a single person after taking the drug
      ▶ they score 11 nouns recalled
      ▶ what can we conclude?
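One quantity that bears on this question is how unusual a score of 11 would be in the untreated population (mean 7.0, standard deviation 2.0). The Python sketch below computes it, assuming the normal model applies and treating recall as a continuous score. It is an illustration of the calculation, not the slide's own answer.

```python
from statistics import NormalDist

# Untreated population from the previous slide: mu = 7.0, sigma = 2.0.
recall = NormalDist(mu=7.0, sigma=2.0)
score = 11

# Distance of the score from the mean, in standard deviations.
z = (score - recall.mean) / recall.stdev
# Proportion of the untreated population scoring at least this high.
p_at_least = 1.0 - recall.cdf(score)
print(z, round(p_at_least, 4))  # 2.0 0.0228
```

A score this high or higher is expected in roughly 2% of untreated people, so whether it counts as evidence for the drug depends on the decision criterion adopted.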
