Introductory Statistics Refresher, Dr. Julia L. Sharp, Short Course


  1. Introductory Statistics Refresher Dr. Julia L. Sharp Short Course on Introductory Statistics Part III Sharp (Clemson University) ASA 1 / 26

  2. Hypothesis Testing
     As an example, suppose that I claim that I am an excellent free throw shooter, making 80% or more of my free throw shots.
     - Given a claim.
     - Gathered evidence.
     - Assessed the evidence using the claim.

  3. Hypothesis Testing
     - State the null and alternative hypotheses.
     - State the Type I and Type II Errors for the hypotheses.
     - State the level of significance (maximum acceptable α).
     - Check assumptions.
     - Compute the test statistic.
     - Calculate the p-value.
     - Compare the p-value with the level of significance.
     - Make a decision regarding the null hypothesis.
     - Draw a conclusion in terms of the problem.

  4. Hypothesis Testing Definitions
     - Null Hypothesis ($H_o$): a statement of no effect or no change. This statement is assumed to be true unless sufficient evidence is gathered to reject it.
     - Alternative Hypothesis ($H_a$): the research hypothesis. This is the statement that one wishes to support as being true, which is done by gathering evidence against the null hypothesis.
     - Type I Error: an error that occurs if the null hypothesis is rejected when it is true. The probability of a Type I error is denoted α.
     - Type II Error: an error that occurs if the null hypothesis is not rejected when it is false. The probability of a Type II error is denoted β.

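  5. Hypothesis Testing Definitions
     Decisions against the state of nature (cells follow from the Type I and Type II error definitions):

     | Decision | $H_o$ is True | $H_o$ is False |
     |---|---|---|
     | Reject $H_o$ | Type I error | Correct decision |
     | Fail to Reject $H_o$ | Correct decision | Type II error |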

  6. More Hypothesis Testing Definitions
     - Test statistic: a quantity computed from sample data that depends on the value of the parameter being tested.
     - Level of significance: the maximum chance of making a Type I error that the researcher is willing to accept.
     - P-value: the probability, computed assuming the null hypothesis is true, that a test statistic will be as extreme as or more extreme than the test statistic that was actually observed.

  7. Small Sample P-value Method
     $H_o: \mu = \mu_0$, with test statistic $t_{obs} = \dfrac{\bar{y} - \mu_0}{s/\sqrt{n}}$
     - $H_a: \mu < \mu_0$: P-value $= P(T < t_{obs})$
     - $H_a: \mu > \mu_0$: P-value $= P(T > t_{obs})$
     - $H_a: \mu \neq \mu_0$: P-value $= 2P(T > |t_{obs}|)$
     Decision Rule:
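The three P-value cases on this slide can be sketched in Python. This is a minimal sketch and not part of the original slides: `t_pdf`, `t_cdf`, and `p_value` are hypothetical helper names, and the t distribution's CDF is approximated with Simpson's rule from the standard library rather than taken from a statistics package. The decision rule used at the end (reject $H_o$ when the P-value is below α) is the usual convention and is an assumption here, since the slide leaves it blank.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T < t), via Simpson's rule on [0, |t|] plus symmetry about 0."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # approximates P(0 < T < |t|)
    return 0.5 + area if t > 0 else 0.5 - area

def p_value(t_obs, df, alternative):
    """P-value for H_a of type 'less', 'greater', or 'two-sided'."""
    if alternative == "less":
        return t_cdf(t_obs, df)
    if alternative == "greater":
        return 1.0 - t_cdf(t_obs, df)
    if alternative == "two-sided":
        return 2.0 * (1.0 - t_cdf(abs(t_obs), df))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

# Summary statistics from the Phosphorus leaching example in this deck.
n, y_bar, s, mu_0 = 32, 44.7166, 7.8069, 50.0
t_obs = (y_bar - mu_0) / (s / math.sqrt(n))
p = p_value(t_obs, n - 1, "less")
alpha = 0.05
print(f"t_obs = {t_obs:.4f}, df = {n - 1}, P-value = {p:.7f}")
print("reject H_o" if p < alpha else "fail to reject H_o")  # assumed rule
```

Numerical integration keeps the sketch dependency-free; in practice a statistics library's t distribution would normally be used instead.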

  8. P-value Method Example
     Suppose that we would like to conduct a test to determine if the average Phosphorus leaching is less than 50 mm. Recall that the sample mean from 32 lysimeter samples is 44.7166 and the sample standard deviation is 7.8069. Use a significance level of 0.05.
     - State the hypotheses.
     - Compute the test statistic.
     - Determine the p-value.

  9. P-value Method Example
     Suppose that we would like to conduct a test to determine if the average Phosphorus leaching is less than 50 mm. Use a significance level of 0.05.
     - Make a decision regarding $H_o$.
     - State the conclusion in terms of the problem.

  10. Example
      Riddle and Bergström (2013) describe several experiments to examine Phosphorus leaching from two soils. A table of results from one of the experiments is reproduced below. There were four different rain simulations used and two soil types (clay and sand). The amount of drainage water collected from lysimeters was recorded.
      Riddle, M. U. and Bergström, L. (2013). “Phosphorus leaching from two soils with catch crops

  11. Hypothesis Test: Phosphorus Leaching
      Conduct a test to determine if the average Phosphorus leaching is less than 50 mm.

              One Sample t-test

          data:  drain$drainage
          t = -3.8283, df = 31, p-value = 0.0002936
          alternative hypothesis: true mean is less than 50
          95 percent confidence interval:
               -Inf 47.05657
          sample estimates:
          mean of x
           44.71664
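As a check on the output in slide 11, the test statistic can be reproduced by hand from the summary statistics given in slide 8 ($n = 32$, $\bar{y} = 44.7166$, $s = 7.8069$, $\mu_0 = 50$). The hypotheses written here, $H_o: \mu = 50$ against $H_a: \mu < 50$, are inferred from the "less than 50" alternative in the output.

```latex
\begin{align*}
t_{obs} &= \frac{\bar{y} - \mu_0}{s/\sqrt{n}}
         = \frac{44.7166 - 50}{7.8069/\sqrt{32}}
         \approx \frac{-5.2834}{1.3801}
         \approx -3.828, \\
df &= n - 1 = 31 .
\end{align*}
```

This agrees with the reported t = -3.8283 on 31 degrees of freedom. The reported p-value, 0.0002936, is far below the 0.05 significance level, so under the usual decision rule $H_o$ is rejected.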

  12. Inferences Comparing Two Population Central Values
      Compare the average responses in two groups.
      Assumptions:
      - Independent random samples of $n_1$ observations from one population and $n_2$ observations from a second population are selected.
      - Samples are selected from normal distributions or large sample sizes are used.
      GOAL: Make inference about the difference between the population means.

      | Population | Population Mean | Population Standard Deviation | Sample Size | Sample Mean | Sample Standard Deviation |
      |---|---|---|---|---|---|
      | 1 | $\mu_1$ | $\sigma_1$ | $n_1$ | $\bar{y}_1$ | $s_1$ |
      | 2 | $\mu_2$ | $\sigma_2$ | $n_2$ | $\bar{y}_2$ | $s_2$ |

  13. Inference for Two Population Means: Example
      Riddle and Bergström (2013) describe several experiments to examine Phosphorus leaching from two soils. A table of results from one of the experiments is reproduced below. There were four different rain simulations used and two soil types (clay and sand). The amount of drainage water collected from lysimeters was recorded.
      Suppose that we would like to compare the average amount of drainage water collected from clay soil to the average amount of drainage water collected from sandy soil.

  14. Sampling Distribution of $\bar{Y}_1 - \bar{Y}_2$
      Suppose two independent random variables $Y_1$ and $Y_2$ are normally distributed with appropriate means and variances:
      - The sampling distributions of $\bar{Y}_1$ and $\bar{Y}_2$ are:
      - The sampling distribution of $\bar{Y}_1 - \bar{Y}_2$ is:
      - The mean of the sampling distribution is:
      - The standard error of the sampling distribution is:
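The slide leaves these items blank. Under its stated assumptions (independent random samples of sizes $n_1$ and $n_2$ from normal populations), the standard results can be written out as follows. The symbols $\mu_i$ and $\sigma_i^2$ for the population means and variances follow the notation used elsewhere in the deck.

```latex
\begin{align*}
Y_1 &\sim N(\mu_1, \sigma_1^2), &
Y_2 &\sim N(\mu_2, \sigma_2^2), \\
\bar{Y}_1 &\sim N\!\left(\mu_1, \frac{\sigma_1^2}{n_1}\right), &
\bar{Y}_2 &\sim N\!\left(\mu_2, \frac{\sigma_2^2}{n_2}\right), \\
\bar{Y}_1 - \bar{Y}_2 &\sim
  N\!\left(\mu_1 - \mu_2,\; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right), \\
E(\bar{Y}_1 - \bar{Y}_2) &= \mu_1 - \mu_2, &
SE(\bar{Y}_1 - \bar{Y}_2) &= \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} .
\end{align*}
```

The variances add because the two samples are independent. When $\sigma_1^2 = \sigma_2^2 = \sigma^2$, the variance simplifies to $\sigma^2 (1/n_1 + 1/n_2)$.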

  15. Inference for Comparing Two Population Means: Independent Samples, $\sigma_1^2$ and $\sigma_2^2$

      | | Equal ($\sigma_1^2 = \sigma_2^2$) | Unequal ($\sigma_1^2 \neq \sigma_2^2$) |
      |---|---|---|
      | Variance of $\bar{Y}_1 - \bar{Y}_2$ | $\sigma^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)$ | $\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$ |
      | Variance Estimate | $s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)$ | $\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}$ |

  16. Independent Samples, Equal Variances: Hypothesis Tests for Comparing Two Population Means
      $H_o: \mu_1 - \mu_2 = D_0$ against one of
      - $H_a: \mu_1 - \mu_2 < D_0$
      - $H_a: \mu_1 - \mu_2 > D_0$
      - $H_a: \mu_1 - \mu_2 \neq D_0$
      Test statistic: $t_{obs} = \dfrac{(\bar{y}_1 - \bar{y}_2) - D_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $s_p^2 = \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
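The pooled test statistic on this slide can be sketched as a small Python function working from summary statistics. This is an illustrative sketch: the function name `pooled_t` and the input values are hypothetical, chosen only to make the arithmetic easy to follow, and are not data from the drainage experiment.

```python
import math

def pooled_t(y1_bar, s1, n1, y2_bar, s2, n2, d0=0.0):
    """Return (t_obs, df, s_p) for the equal-variance two-sample t test."""
    df = n1 + n2 - 2
    # Pooled variance: df-weighted average of the two sample variances.
    sp_sq = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    s_p = math.sqrt(sp_sq)
    std_err = s_p * math.sqrt(1 / n1 + 1 / n2)
    t_obs = ((y1_bar - y2_bar) - d0) / std_err
    return t_obs, df, s_p

# Hypothetical summary statistics (not from the slides).
t_obs, df, s_p = pooled_t(y1_bar=10.0, s1=2.0, n1=5, y2_bar=8.0, s2=2.0, n2=5)
print(f"s_p = {s_p:.4f}, t_obs = {t_obs:.4f}, df = {df}")
```

With both sample standard deviations equal to 2, the pooled standard deviation is also 2, so $t_{obs} = 2 / (2\sqrt{0.4}) \approx 1.58$ on 8 degrees of freedom.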

  17. Independent Samples, Equal Variances: Hypothesis Test P-values
      $H_o: \mu_1 - \mu_2 = D_0$
      - $H_a: \mu_1 - \mu_2 < D_0$: P-value $= P(T < t_{obs})$
      - $H_a: \mu_1 - \mu_2 > D_0$: P-value $= P(T > t_{obs})$
      - $H_a: \mu_1 - \mu_2 \neq D_0$: P-value $= 2P(T > |t_{obs}|)$
      Decision Rule:

  18. Random Assignment of Treatment to Experimental Units
      Is the average amount of drainage water collected from clay soil different from the average amount of drainage water collected from sandy soil?

  19. Inference for Two Means
      Is the average amount of drainage water collected from clay soil different from the average amount of drainage water collected from sandy soil? Use a significance level of 0.05.

  20. Inference for Two Means, Independent Samples, Equal Variances: Confidence Interval
      A $100(1 - \alpha)\%$ confidence interval for the difference in population means is
      $(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2,\,(n_1 + n_2 - 2)} \; s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

              Two Sample t-test

          data:  drain$drainage by drain$soil
          t = 1.5148, df = 30, p-value = 0.1403
          alternative hypothesis: true difference in means is not equal to 0
          95 percent confidence interval:
           -1.42650  9.61915
          sample estimates:
          mean in group clay mean in group sand
                    46.76480           42.66848
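The interval in the slide 20 output can be checked against the confidence interval formula. The slide does not report $s_p$, so in this sketch the standard error is backed out from the reported t statistic (with $D_0 = 0$). The critical value $t_{0.025,\,30} \approx 2.042$ is taken from a standard t table rather than from the slides.

```latex
\begin{align*}
\bar{y}_1 - \bar{y}_2 &= 46.76480 - 42.66848 = 4.09632, \\
s_p \sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}}
  &= \frac{\bar{y}_1 - \bar{y}_2}{t_{obs}}
   = \frac{4.09632}{1.5148} \approx 2.704, \\
4.09632 \pm 2.042 \times 2.704 &\approx 4.096 \pm 5.523
  = (-1.43,\; 9.62).
\end{align*}
```

This matches the reported interval (-1.42650, 9.61915) up to rounding. The interval contains 0 and the p-value 0.1403 exceeds 0.05, so under the usual decision rule $H_o$ is not rejected at the 0.05 level.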

  21. Inference for Comparing Two Population Means: Independent Samples, $\sigma_1^2$ and $\sigma_2^2$

      | | Equal ($\sigma_1^2 = \sigma_2^2$) | Unequal ($\sigma_1^2 \neq \sigma_2^2$) |
      |---|---|---|
      | Variance of $\bar{Y}_1 - \bar{Y}_2$ | $\sigma^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)$ | $\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$ |
      | Variance Estimate | $s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)$ | $\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}$ |

  22. Independent Samples, Unequal Variances: Hypothesis Tests for Comparing Two Population Means
      $H_o: \mu_1 - \mu_2 = D_0$ against one of
      - $H_a: \mu_1 - \mu_2 < D_0$
      - $H_a: \mu_1 - \mu_2 > D_0$
      - $H_a: \mu_1 - \mu_2 \neq D_0$
      Test statistic: $t'_{obs} = \dfrac{(\bar{y}_1 - \bar{y}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

  23. Distribution of the Test Statistic
      $t' \mathrel{\dot\sim} t(df)$, where $df = \dfrac{(n_1 - 1)(n_2 - 1)}{(1 - c)^2 (n_1 - 1) + c^2 (n_2 - 1)}$ and $c = \dfrac{s_1^2 / n_1}{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
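The unequal-variance statistic and its approximate degrees of freedom can be sketched together in Python, using the $c$ and $df$ formulas from this slide. The function name `welch_t` and the summary statistics below are hypothetical and only illustrate the calculation. They are not values from the drainage data.

```python
import math

def welch_t(y1_bar, s1, n1, y2_bar, s2, n2, d0=0.0):
    """Return (t_prime, df) for the unequal-variance two-sample t test."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    t_prime = ((y1_bar - y2_bar) - d0) / math.sqrt(v1 + v2)
    c = v1 / (v1 + v2)
    df = ((n1 - 1) * (n2 - 1)) / ((1 - c)**2 * (n1 - 1) + c**2 * (n2 - 1))
    return t_prime, df

# Hypothetical summary statistics (not from the slides).
t_prime, df = welch_t(y1_bar=10.0, s1=1.0, n1=8, y2_bar=8.0, s2=3.0, n2=12)
print(f"t' = {t_prime:.4f}, approximate df = {df:.3f}")
```

The resulting degrees of freedom are generally not an integer. When the two sample sizes and sample variances are equal, $c = 1/2$ and the formula reduces to $n_1 + n_2 - 2$, the equal-variance degrees of freedom.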

  24. Independent Samples, Unequal Variances: Hypothesis Test P-values
      $H_o: \mu_1 - \mu_2 = D_0$
      - $H_a: \mu_1 - \mu_2 < D_0$: P-value $= P(T < t'_{obs})$
      - $H_a: \mu_1 - \mu_2 > D_0$: P-value $= P(T > t'_{obs})$
      - $H_a: \mu_1 - \mu_2 \neq D_0$: P-value $= 2P(T > |t'_{obs}|)$
      Decision Rule:

  25. Inference for Two Means (Unequal Variances): Example Using PROC TTEST
      Is the average amount of drainage water collected from clay soil different from the average amount of drainage water collected from sandy soil? Use a significance level of 0.05.
