Statistical Methods (PowerPoint presentation)

Statistical methods divide into descriptive statistics and inferential statistics; inferential statistics covers estimation, hypothesis testing, and others. Outline of today: hypothesis testing for one population mean; hypothesis testing for two samples comparing …


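  1. Rejection Regions (Two-Sided Test), H_a: µ ≠ µ_0. [Figure: t distribution centered at 0. The nonrejection region in the middle has area 1 − α (the level of confidence); each tail is a rejection region of area α/2, bounded by the critical values $t_{n-1,\,\alpha/2}$ and $t_{n-1,\,1-\alpha/2}$.]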

  2. Rejection Regions (Two-Sided Test), continued. [Figure: the same two-sided diagram, with the observed t statistic marked on the horizontal axis relative to the critical values $t_{n-1,\,\alpha/2}$ and $t_{n-1,\,1-\alpha/2}$.]

  5. Birth Weight Example
     - The mean birth weight in the United States is 120 oz. You get a list of birth weights from 100 consecutive, full-term, live-born deliveries from the maternity ward of a hospital in a low-SES area.
     - The sample mean birth weight is 115 oz and the standard deviation is 24 oz.
     - Can we actually say the underlying mean birth weight from this hospital is lower than the national average?

  6. One-Sided t Test Solution (items to fill in): H_0; H_a; α; df; critical value(s); test statistic; decision; conclusion.

  7. One-Sided t Test Solution (continued). H_0: µ = 120; H_a: µ < 120; α = 0.05. Test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{115 - 120}{24/\sqrt{100}} = -2.08$.

  8. One-Sided t Test Solution (continued). df = 100 − 1 = 99. Critical value: $t_{99,\,0.05} = -1.66$. EXCEL: $t_{99,\,0.05}$ = -TINV(0.1,99).

  9. One-Sided t Test Solution (continued). Decision: since −2.08 < −1.66, reject H_0 at significance level 0.05. Conclusion: the true mean birth weight is significantly lower in this hospital than in the general population.
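
The one-sided solution above can be checked with a few lines of Python. This is a minimal sketch, not part of the original deck: the function name `one_sample_t` is an assumption of this example, and the critical value −1.66 is copied from the slide rather than computed.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """t statistic for H0: mu = mu0, from summary statistics."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Birth weight example: sample mean 115 oz, national mean 120 oz, s = 24 oz, n = 100
t_obs = one_sample_t(115, 120, 24, 100)

# Lower-tail critical value t_{99, 0.05}, as quoted on the slide (not computed here)
t_crit = -1.66

# Lower-tail test: reject H0 when the observed t falls below the critical value
reject = t_obs < t_crit
print(f"t = {t_obs:.2f}, reject H0: {reject}")
```

The comparison `t_obs < t_crit` is specific to the lower-tail alternative H_a: µ < 120; an upper-tail or two-sided alternative needs a different rule.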

  10. EX Cardiovascular Disease (two-sided t-test). Test the hypothesis that the mean cholesterol level of recent female Asian immigrants is different from the mean in the general U.S. population, 190 mg/dL. Blood tests are performed on 100 female Asian immigrants; the mean level is $\bar{x} = 181.52$ mg/dL with standard deviation s = 40 mg/dL.

  11. EX Cardiovascular Disease (two-sided t-test), continued. Hypotheses: H_0: µ = 190 vs H_a: µ ≠ 190.

  12. EX Cardiovascular Disease (two-sided t-test), continued. Test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{181.52 - 190}{40/\sqrt{100}} = -2.12$.

  13. EX Cardiovascular Disease (two-sided t-test), continued. With α = 0.05, the critical values are $t_{99,\,.025}$ = -TINV(0.05,99) = −1.98, so $t_{99,\,.975}$ = 1.98.

  14. EX Cardiovascular Disease (two-sided t-test), continued. [Figure: t distribution centered at 0 with the critical values $t_{99,\,.025}$ and $t_{99,\,.975}$ marked.] Thus t = −2.12 is in the rejection region.

  15. EX Cardiovascular Disease (two-sided t-test), continued. H_0 is rejected at the 5% level of significance. Conclusion: the mean cholesterol level of recent Asian immigrants is significantly different from the mean for the general U.S. population.

  16. Determination of Statistical Significance
     - Critical-value method: calculate the critical value from α and the t distribution.
     - p-value method: once we calculate the actual t statistic, we can ask what is the smallest significance level at which we could still reject H_0 (the observed level of significance).

  17. p-value
     - Definition: the probability of obtaining a test statistic as extreme as or more extreme than the actual sample value, given that H_0 is true.
     - For H_1: µ < µ_0, p-value = $P(t_{n-1} \le t_{obs})$.
     - For H_1: µ > µ_0, p-value = $P(t_{n-1} \ge t_{obs})$.
     - For H_1: µ ≠ µ_0, p-value = $2\,P(t_{n-1} \le t_{obs})$ if $t_{obs} \le 0$, and $2\,(1 - P(t_{n-1} \le t_{obs}))$ if $t_{obs} > 0$.
     - Used to make the rejection decision: if p-value ≥ α, do not reject H_0; if p-value < α, reject H_0.
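
The three p-value definitions above can be sketched in plain Python. The deck itself uses Excel's TDIST; the version below is only an illustration that integrates the t density numerically with Simpson's rule. The function names and the `alternative` labels are assumptions of this sketch, not anything defined in the slides.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): integrate the density from 0 to |x| (Simpson's rule, even steps)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    # The distribution is symmetric about 0, so P(T <= 0) = 0.5
    return 0.5 + area if x >= 0 else 0.5 - area

def p_value(t_obs, df, alternative):
    """alternative: 'less' (mu < mu0), 'greater' (mu > mu0) or 'two-sided' (mu != mu0)."""
    if alternative == "less":
        return t_cdf(t_obs, df)
    if alternative == "greater":
        return 1 - t_cdf(t_obs, df)
    if alternative == "two-sided":
        return 2 * t_cdf(-abs(t_obs), df)
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

# Inputs taken from the deck's two examples (df = 99 in both)
p_birth = p_value(-2.08, 99, "less")        # birth weight, one-sided
p_chol = p_value(-2.12, 99, "two-sided")    # cholesterol, two-sided
print(p_birth, p_chol)
```

For the two-sided case, `2 * t_cdf(-abs(t_obs), df)` covers both branches of the definition at once because the t distribution is symmetric.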

  18. Birth Weight Example
     - The mean birth weight in the United States is 120 oz. You get a list of birth weights from 100 consecutive, full-term, live-born deliveries from the maternity ward of a hospital in a low-SES area.
     - The sample mean birth weight is $\bar{x}$ = 115 oz and the standard deviation is s = 24 oz.
     - Find the p-value.

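  19. One-sided test: 1. t value of the sample statistic (observed): $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{115 - 120}{24/\sqrt{100}} = -2.08$. [Figure: t distribution centered at 0 with the observed value −2.08 marked in the left tail.]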

  20. One-sided test: use the alternative hypothesis to find the direction. The p-value is $P(t_{99} \le -2.08) = 0.020$. EXCEL function: TDIST(2.08,99,1) = 0.02. [Figure: $t_{n-1}$ distribution with the area to the left of −2.08 shaded, p-value = .02.] Since p-value < α, H_0 is rejected and we conclude that the true mean birth weight is significantly lower in this hospital than in the general population.

  21. EX Cardiovascular Disease (two-sided t-test). Test the hypothesis that the mean cholesterol level of recent female Asian immigrants is different from the mean in the general U.S. population, 190 mg/dL. Blood tests are performed on 100 female Asian immigrants; the mean level is $\bar{x} = 181.52$ mg/dL with standard deviation s = 40 mg/dL. H_0: µ = 190 vs H_a: µ ≠ 190. What is the p-value?

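  22. Two-sided test: 1. t value of the sample statistic (observed): $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{181.52 - 190}{40/\sqrt{100}} = -2.12$. [Figure: t distribution centered at 0 with the observed value −2.12 marked in the left tail.]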

  23. Two-sided test: the p-value is $2 \times P(t_{99} \le -2.12)$ = TDIST(2.12,99,2) = 0.037 < α. [Figure: area to the left of −2.12 shaded, equal to half the p-value, .0185.]

  24. Two-sided test (continued). [Figure: both tails shaded, beyond −2.12 and beyond 2.12, each with area equal to half the p-value, .0185.] Conclusion: the mean cholesterol level of recent Asian immigrants is significantly different from the mean for the general U.S. population.

  25. Relationship Between Hypothesis Testing and Confidence Intervals (2-sided case). For H_0: µ = µ_0 versus H_1: µ = µ_1 ≠ µ_0, H_0 is rejected at level α if and only if the 100% × (1 − α) CI does not contain µ_0. The p-value tells exactly how significant the results are; the CI gives a range of values that may contain µ.

  26. Hypothesis Tests vs. Confidence Intervals There are three ways to test hypotheses (assume α = 0.05): 1. By critical value method 2. By computing p-value 3. By constructing confidence interval

  27. EX Cardiovascular Disease (two-sided t-test), CI method
     - Test the hypothesis that the mean cholesterol level of recent female Asian immigrants is different from the mean in the general U.S. population, 190 mg/dL. Blood tests are performed on 100 female Asian immigrants; the mean level is $\bar{x} = 181.52$ mg/dL with standard deviation s = 40 mg/dL.
     - H_0: µ = 190 vs H_a: µ ≠ 190. Test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{181.52 - 190}{40/\sqrt{100}} = -2.12$.
     - With α = 0.05, the critical values are $t_{99,\,.025}$ = -TINV(0.05,99) = −1.98, so $t_{99,\,.975}$ = 1.98.
     - Then the 95% CI for µ is $181.52 \pm 1.98 \times 40/\sqrt{100} = [173.6,\ 189.4]$.
     - Conclusion: since the interval does not include µ = 190, we conclude that the mean cholesterol level of recent Asian immigrants is significantly different from the mean for the general U.S. population.
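
The CI method amounts to building the interval and checking whether µ_0 falls inside it. A minimal sketch follows, assuming the critical value 1.98 quoted on the slide (it is copied, not computed); the variable names are this example's own.

```python
import math

# Summary statistics from the cholesterol example
xbar, s, n = 181.52, 40, 100
mu0 = 190

# Two-sided critical value t_{99, 0.975}, as quoted on the slide
t_crit = 1.98

half_width = t_crit * s / math.sqrt(n)
ci = (xbar - half_width, xbar + half_width)

# Reject H0 at level alpha exactly when mu0 lies outside the (1 - alpha) CI
reject = not (ci[0] <= mu0 <= ci[1])
print(f"95% CI = [{ci[0]:.1f}, {ci[1]:.1f}], reject H0: {reject}")
```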

  28. Hypothesis Testing: Two-Sample Inference

  29. We’ll learn… how to use hypothesis testing for comparing the difference between: 1. the means of two independent populations; 2. the means of two related populations; 3. the variances of two populations.

  30. Two-sample Inference. Possible scenarios:
     1. Two independent samples, from two independent populations.
     2. Paired samples: a single population (before/after measurements) or two related populations (matched pairs).

  31. Example. Let’s say we are interested in the relationship between use of oral contraceptives (OC) and level of blood pressure (BP) in women.
     1. Follow up non-OC users and measure the change when they become OC users: a longitudinal study; paired samples from a single population.
     - Same patients: each woman is used as her own control, so an observed difference is more likely to be due to OC use.
     - Hard to follow up all patients.
     - Expensive.

  32. Example. Let’s say we are interested in the relationship between use of oral contraceptives (OC) and level of blood pressure (BP) in women.
     2. Measure the difference in BP between a group of OC users and a group of non-OC users: a cross-sectional study; two independent samples.
     - The participants are seen at only one visit, and the groups could be very different due to other factors such as age.
     - Financially feasible.

  33. The Paired t Test

  34. Related Populations: Paired t Test
     - Tests the means of 2 related populations: paired samples; repeated measures (before/after).
     - Uses the difference between paired values.
     - Assumptions: both populations are normally distributed; if not normal, use large samples (n ≥ 30).

  35. Example
     - Want to study the effect of OC on blood pressure.
     - Study design: recruit 10 women who are not using OC; follow up after 1 year of using OC.
     - Interested in knowing the BP difference.

     SBP level, 10 women:

     | Woman | Not using OC (baseline) | Using OC | Difference |
     |---|---|---|---|
     | 1 | 115 | 128 | 13 |
     | 2 | 112 | 115 | 3 |
     | 3 | 107 | 106 | -1 |
     | 4 | 119 | 128 | 9 |
     | … | … | … | … |

  36. Mean Difference, σ_D Known
     - The i-th paired difference is $D_i = X_{1i} - X_{2i}$.
     - The point estimate for the population mean paired difference is $\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i$, where n is the number of pairs in the paired sample.
     - Suppose the population standard deviation of the difference scores, σ_D, is known. The test statistic for the mean difference is a Z value: $Z = \frac{\bar{D} - \mu_D}{\sigma_D/\sqrt{n}}$, where µ_D = hypothesized mean difference, σ_D = population standard deviation of the differences, and n = the sample size (number of pairs).

  37. Mean Difference, σ_D Unknown
     - If σ_D is unknown, we can estimate the unknown population standard deviation with a sample standard deviation: $S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$.
     - Use a paired t test: the test statistic for $\bar{D}$ is now a t statistic with n − 1 d.f.: $t = \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}}$.

  38. Hypothesis Testing for Mean Difference, σ_D Unknown
     - Lower-tail test: H_0: µ_D = 0, H_1: µ_D < 0. Rejection region of area α in the lower tail; reject H_0 if $t < -t_{\alpha}$.
     - Upper-tail test: H_0: µ_D = 0, H_1: µ_D > 0. Rejection region of area α in the upper tail; reject H_0 if $t > t_{\alpha}$.
     - Two-tail test: H_0: µ_D = 0, H_1: µ_D ≠ 0. Rejection regions of area α/2 in each tail; reject H_0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$.
     - Here t has n − 1 d.f.
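
The three rejection rules above translate directly into a small decision function. This is an illustrative sketch only: `reject_h0` and its `alternative` labels are hypothetical names, and `t_crit` is the positive critical value ($t_{\alpha}$ for one-tail tests, $t_{\alpha/2}$ for the two-tail test) looked up separately.

```python
def reject_h0(t_obs, t_crit, alternative):
    """Apply the rejection rule for a t test.

    t_crit is the positive critical value: t_alpha for 'lower' and 'upper',
    t_{alpha/2} for 'two-tail'. Labels are this sketch's own convention.
    """
    if alternative == "lower":
        return t_obs < -t_crit
    if alternative == "upper":
        return t_obs > t_crit
    if alternative == "two-tail":
        return t_obs < -t_crit or t_obs > t_crit
    raise ValueError("alternative must be 'lower', 'upper' or 'two-tail'")

# Values from the deck's paired example: t = 3.32, two-tail critical value 2.26 (9 d.f.)
print(reject_h0(3.32, 2.26, "two-tail"))
```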

  39. Confidence Interval. The confidence interval for µ_D is:
     1. σ_D known: $\left(\bar{D} - Z\,\frac{\sigma_D}{\sqrt{n}},\ \bar{D} + Z\,\frac{\sigma_D}{\sqrt{n}}\right)$
     2. σ_D unknown: $\left(\bar{D} - t_{n-1}\,\frac{S_D}{\sqrt{n}},\ \bar{D} + t_{n-1}\,\frac{S_D}{\sqrt{n}}\right)$
     where n = the sample size (number of pairs in the paired sample).

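  40. Paired t Test Example. SBP level, 10 women:

     | Woman | Not using OC | Using OC | Difference, D_i |
     |---|---|---|---|
     | 1 | 115 | 128 | 13 |
     | 2 | 112 | 115 | 3 |
     | 3 | 107 | 106 | -1 |
     | 4 | 119 | 128 | 9 |
     | 5 | 115 | 122 | 7 |
     | 6 | 138 | 145 | 7 |
     | 7 | 126 | 132 | 6 |
     | 8 | 105 | 109 | 4 |
     | 9 | 104 | 102 | -2 |
     | 10 | 115 | 117 | 2 |

     $\bar{D} = \frac{\sum D_i}{n} = 4.8$, $S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 4.566$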

  41. Paired t Test: Solution. Has the use of OC made a difference in their blood pressure (at the 0.05 level)? H_0: µ_D = 0; H_1: µ_D ≠ 0; α = .05, split as α/2 in each tail. d.f. = 10 − 1 = 9. Critical values: $t_{9,\,0.025} = -2.26$ and $t_{9,\,0.975} = 2.26$, i.e. ±2.26. EXCEL: $t_{9,\,0.975}$ = TINV(0.05,9) = 2.26.

  42. Paired t Test: Solution (continued). Test statistic: $t = \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}} = \frac{4.8 - 0}{4.566/\sqrt{10}} = 3.32$.

  43. Paired t Test: Solution (continued). Decision: reject H_0, since t = 3.32 > 2.26 lies in the rejection region. Conclusion: there is a significant change in the blood pressure.
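
The paired-test arithmetic can be reproduced from the raw table with the standard library. This is a sketch under the table's own convention, where each difference is "using OC" minus "not using OC"; the variable names are assumptions of this example.

```python
import math
import statistics

# SBP for the same 10 women, from the paired example table
not_using_oc = [115, 112, 107, 119, 115, 138, 126, 105, 104, 115]
using_oc = [128, 115, 106, 128, 122, 145, 132, 109, 102, 117]

# Paired differences, in the table's direction (using OC minus not using OC)
diffs = [after - before for before, after in zip(not_using_oc, using_oc)]

n = len(diffs)
d_bar = statistics.mean(diffs)   # slide reports 4.8
s_d = statistics.stdev(diffs)    # sample SD with n - 1 in the denominator; slide reports 4.566

# Paired t statistic for H0: mu_D = 0, with n - 1 degrees of freedom
t_obs = (d_bar - 0) / (s_d / math.sqrt(n))
print(f"D-bar = {d_bar:.1f}, S_D = {s_d:.3f}, t = {t_obs:.2f}, df = {n - 1}")
```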

  44. Confidence Interval for the True Difference (µ_D) Between the Underlying Means of Two Paired Samples. So in the above example, the 95% confidence interval is [1.53, 8.07].
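
The interval quoted above can be reproduced from the summary statistics of the paired example ($\bar{D} = 4.8$, $S_D = 4.566$, n = 10). The worked arithmetic below is a sketch added for illustration; it assumes the two-tail critical value 2.262 (shown as 2.26 on the solution slide and 2.262157 in the Excel output).

```latex
\bar{D} \pm t_{9,\,0.975}\,\frac{S_D}{\sqrt{n}}
  = 4.8 \pm 2.262 \times \frac{4.566}{\sqrt{10}}
  \approx 4.8 \pm 3.27
  = [1.53,\ 8.07]
```

Because the interval excludes 0, it agrees with the two-sided test's decision to reject H_0: µ_D = 0.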

  45. Install the Excel 2007 Analysis ToolPak. Although the Analysis ToolPak comes with Excel 2007, it doesn’t come pre-installed. Use the following link to install it in Excel: http://www.dummies.com/how-to/content/how-to-install-the-excel-2007-analysis-toolpak.html After installation, please restart Excel; you will then see the Data Analysis button in the Analysis group added to the end of the Ribbon’s Data tab.

  46. EXCEL Paired T-test Analysis. EXCEL → Data → Data Analysis → t-Test: Paired Two Sample for Means

  47. EXCEL Paired T-test Analysis Results. t-Test: Paired Two Sample for Means

     | | Variable 1 | Variable 2 |
     |---|---|---|
     | Mean | 120.4 | 115.6 |
     | Variance | 174.9333 | 106.2667 |
     | Observations | 10 | 10 |
     | Pearson Correlation | 0.954777 | |
     | Hypothesized Mean Difference | 0 | |
     | df | 9 | |
     | t Stat | 3.324651 | |
     | P(T<=t) one-tail | 0.004437 | |
     | t Critical one-tail | 1.833113 | |
     | P(T<=t) two-tail | 0.008874 | |
     | t Critical two-tail | 2.262157 | |

  48. Independent Samples

  49. Independent Samples
     - Different data sources: unrelated and independent. The sample selected from one population has no effect on the sample selected from the other population.
     - Goal: test a hypothesis or form a confidence interval for the difference between two population means, µ_1 − µ_2.
     - The point estimate for the difference is $\bar{X}_1 - \bar{X}_2$.

  50. Difference Between Two Means. Population means, independent samples; three cases:
     - σ_1 and σ_2 known
     - σ_1 and σ_2 unknown, assumed equal
     - σ_1 and σ_2 unknown, not assumed equal

  51. Difference Between Two Means. Population means, independent samples:
     - σ_1 and σ_2 known: use a Z test statistic.
     - σ_1 and σ_2 unknown, assumed equal: use S_p to estimate the unknown σ; use a t test statistic and the pooled standard deviation.
     - σ_1 and σ_2 unknown, not assumed equal: use S_1 and S_2 to estimate the unknown σ_1 and σ_2; use a separate-variance t test.

  52. σ_1 and σ_2 Known. Assumptions:
     - Samples are randomly and independently drawn.
     - Population distributions are normal or both sample sizes are ≥ 30.
     - Population standard deviations are known.
     The test statistic is a Z-value…

  53. σ_1 and σ_2 Known (continued). …and the standard error of $\bar{X}_1 - \bar{X}_2$ is $\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.

  54. σ_1 and σ_2 Known (continued). The test statistic for µ_1 − µ_2 is: $Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$.
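
The deck gives no numerical example for the known-σ case, so the sketch below uses hypothetical inputs invented purely for illustration (they are not from the slides). The function name `two_sample_z` and its `delta0` parameter (the hypothesized value of µ_1 − µ_2) are likewise assumptions of this example.

```python
import math
from statistics import NormalDist

def two_sample_z(x1_bar, x2_bar, sigma1, sigma2, n1, n2, delta0=0.0):
    """Z statistic for H0: mu1 - mu2 = delta0 when sigma1 and sigma2 are known."""
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)  # standard error of X1-bar - X2-bar
    return (x1_bar - x2_bar - delta0) / se

# Hypothetical summary values, for illustration only
z = two_sample_z(x1_bar=50.0, x2_bar=48.0, sigma1=4.0, sigma2=3.0, n1=40, n2=36)

# Two-tail p-value from the standard normal distribution
p_two_sided = 2 * NormalDist().cdf(-abs(z))
print(f"Z = {z:.2f}, two-sided p-value = {p_two_sided:.4f}")
```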

  55. Hypothesis Tests for Two Population Means. Two population means, independent samples:
     - Lower-tail test: H_0: µ_1 = µ_2, H_1: µ_1 < µ_2; i.e., H_0: µ_1 − µ_2 = 0, H_1: µ_1 − µ_2 < 0.
     - Upper-tail test: H_0: µ_1 = µ_2, H_1: µ_1 > µ_2; i.e., H_0: µ_1 − µ_2 = 0, H_1: µ_1 − µ_2 > 0.
     - Two-tail test: H_0: µ_1 = µ_2, H_1: µ_1 ≠ µ_2; i.e., H_0: µ_1 − µ_2 = 0, H_1: µ_1 − µ_2 ≠ 0.

  56. Hypothesis tests for µ_1 − µ_2
     - Lower-tail test: H_0: µ_1 − µ_2 = 0, H_1: µ_1 − µ_2 < 0. Rejection region of area α in the lower tail; reject H_0 if $Z < -Z_{\alpha}$.
     - Upper-tail test: H_0: µ_1 − µ_2 = 0, H_1: µ_1 − µ_2 > 0. Rejection region of area α in the upper tail; reject H_0 if $Z > Z_{\alpha}$.
     - Two-tail test: H_0: µ_1 − µ_2 = 0, H_1: µ_1 − µ_2 ≠ 0. Rejection regions of area α/2 in each tail; reject H_0 if $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$.

  57. Confidence Interval, σ_1 and σ_2 Known. The confidence interval for µ_1 − µ_2 is: $(\bar{X}_1 - \bar{X}_2) \pm Z\,\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.

  58. σ_1 and σ_2 Unknown, Assumed Equal. Assumptions:
     - Samples are randomly and independently drawn.
     - Populations are normally distributed or both sample sizes are at least 30.
     - Population variances are unknown but assumed equal.

  59. σ_1 and σ_2 Unknown, Assumed Equal (continued). Forming interval estimates:
     - The population variances are assumed equal, so use the two sample variances and pool them to estimate the common σ².
     - The pooled variance is $S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$.

  60. σ_1 and σ_2 Unknown, Assumed Equal (continued). The test statistic for µ_1 − µ_2 is a t value with (n_1 + n_2 − 2) degrees of freedom: $t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$.

  61. Confidence Interval, σ_1 and σ_2 Unknown (assumed equal). The confidence interval for µ_1 − µ_2 is: $(\bar{X}_1 - \bar{X}_2) \pm t_{n_1 + n_2 - 2}\,\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$.

  62. Pooled-Variance t Test: Example. Compare the mean systolic pressure between OC and non-OC users. Assuming both populations are approximately normal with equal variances, is there a difference in average SBP (α = 0.05)?

     SBP level:

     | ID | Not using OCs | Using OC |
     |---|---|---|
     | 1 | 115 | 128 |
     | 2 | 112 | 115 |
     | 3 | 107 | 106 |
     | 4 | 119 | 128 |
     | 5 | 115 | 122 |
     | 6 | 138 | 145 |
     | 7 | 126 | 132 |
     | 8 | 105 | 109 |
     | 9 | 104 | 102 |
     | 10 | 115 | 117 |

  63. EXCEL DATA Analysis. With raw data, we can perform the t-test using the following analysis: EXCEL → Data → Data Analysis → t-Test: Two Sample Assuming Equal Variances

  64. EXCEL T-test Analysis Results. t-Test: Two-Sample Assuming Equal Variances (sample results)

     | | Variable 1 | Variable 2 |
     |---|---|---|
     | Mean | 115.6 | 120.4 |
     | Variance | 106.2666667 | 174.9333333 |
     | Observations | 10 | 10 |
     | Pooled Variance | 140.6 | |
     | Hypothesized Mean Difference | 0 | |
     | df | 18 | |
     | t Stat | -0.905177144 | |
     | P(T<=t) one-tail | 0.188664198 | |
     | t Critical one-tail | 1.734063592 | |
     | P(T<=t) two-tail | 0.377328397 | |
     | t Critical two-tail | 2.100922037 | |

  65. Solution
     - H_0: µ_1 − µ_2 = 0; H_a: µ_1 − µ_2 ≠ 0
     - α = 0.05; df = (10 − 1) + (10 − 1) = 18
     - Critical value(s): ±2.1, from TINV(0.05, 18) = 2.1. [Figure: t distribution with rejection regions of area .025 below −2.1 and above 2.1.]
     - Test statistic: −0.91
     - Decision: do not reject H_0 at α = 0.05 (p-value = 0.38 > 0.05).
     - Conclusion: the mean blood pressures of the OC and non-OC groups do not significantly differ from each other.

  66. Calculating the Test Statistic
     - The test statistic is: $t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(115.6 - 120.4) - 0}{\sqrt{140.6\left(\frac{1}{10} + \frac{1}{10}\right)}} = -0.91$
     - with pooled variance: $S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(10 - 1)\,106.3 + (10 - 1)\,174.9}{(10 - 1) + (10 - 1)} = 140.6$
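
The pooled-variance calculation can be reproduced from the raw SBP columns of the example. This is a minimal standard-library sketch; the variable names are this example's own, and the data are treated as two independent samples, as the example does.

```python
import math
import statistics

# SBP columns from the pooled-variance example, treated as independent samples
non_oc = [115, 112, 107, 119, 115, 138, 126, 105, 104, 115]
oc = [128, 115, 106, 128, 122, 145, 132, 109, 102, 117]

n1, n2 = len(non_oc), len(oc)
s1_sq = statistics.variance(non_oc)   # sample variance S_1^2
s2_sq = statistics.variance(oc)       # sample variance S_2^2

# Pooled variance: variances weighted by their degrees of freedom
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / ((n1 - 1) + (n2 - 1))

# t statistic for H0: mu1 - mu2 = 0, with n1 + n2 - 2 degrees of freedom
t_obs = (statistics.mean(non_oc) - statistics.mean(oc)) / math.sqrt(sp_sq * (1 / n1 + 1 / n2))
df = n1 + n2 - 2
print(f"S_p^2 = {sp_sq:.1f}, t = {t_obs:.2f}, df = {df}")
```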

  67. σ_1 and σ_2 Unknown, Not Assumed Equal. Assumptions:
     - Samples are randomly and independently drawn.
     - Populations are normally distributed or both sample sizes are at least 30.
     - Population variances are unknown but cannot be assumed to be equal.

  68. σ_1 and σ_2 Unknown, Unequal Variances (continued). The test statistic for µ_1 − µ_2 is: $t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$.

  69. σ_1 and σ_2 Unknown, Not Assumed Equal (continued). Forming the test statistic:
     - The population variances are not assumed equal, so include the two sample variances in the computation of the test statistic.
     - The test statistic can be approximated by a t distribution with ν degrees of freedom (see next slide).

  70. σ_1 and σ_2 Unknown, Assumed Unequal (Satterthwaite’s method) (continued). The number of degrees of freedom is the integer portion of: $\nu = \frac{\left(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}\right)^2}{\frac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$.
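
Satterthwaite's degrees of freedom can be evaluated for the SBP example using the sample variances and sizes from the Excel output (106.2667 and 174.9333, n = 10 each). The function name `satterthwaite_df` is an assumption of this sketch.

```python
def satterthwaite_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom for the unequal-variance t test."""
    v1 = s1_sq / n1   # estimated variance of the first sample mean
    v2 = s2_sq / n2   # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Sample variances and sizes from the deck's Excel output for the SBP data
nu = satterthwaite_df(106.2667, 10, 174.9333, 10)
print(f"nu = {nu:.3f}, integer portion = {int(nu)}, rounded = {round(nu)}")
```

For these data ν comes out just under 17, so taking the integer portion gives 16, while the Excel output in the deck reports df = 17, which would be consistent with rounding to the nearest integer instead of truncating. The two conventions give slightly different critical values.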

  71. σ_1 and σ_2 Unknown, Unequal Variances (continued). The confidence interval for µ_1 − µ_2 is: $(\bar{X}_1 - \bar{X}_2) \pm t_{\nu}\,\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$.

  72. Unequal Variances: Example. Compare the mean systolic pressure between OC and non-OC users. Assuming both populations are approximately normal with unequal variances, test for the equality of the mean SBP of the two groups (α = 0.05).

     SBP level:

     | ID | Not using OCs | Using OC |
     |---|---|---|
     | 1 | 115 | 128 |
     | 2 | 112 | 115 |
     | 3 | 107 | 106 |
     | 4 | 119 | 128 |
     | 5 | 115 | 122 |
     | 6 | 138 | 145 |
     | 7 | 126 | 132 |
     | 8 | 105 | 109 |
     | 9 | 104 | 102 |
     | 10 | 115 | 117 |

  73. EXCEL DATA Analysis. EXCEL → Data → Data Analysis → t-Test: Two Sample Assuming Unequal Variances

  74. EXCEL DATA Analysis. t-Test: Two-Sample Assuming Unequal Variances

     | | Variable 1 | Variable 2 |
     |---|---|---|
     | Mean | 115.6 | 120.4 |
     | Variance | 106.2667 | 174.9333 |
     | Observations | 10 | 10 |
     | Hypothesized Mean Difference | 0 | |
     | df | 17 | |
     | t Stat | -0.90518 | |
     | P(T<=t) one-tail | 0.189011 | |
     | t Critical one-tail | 1.739607 | |
     | P(T<=t) two-tail | 0.378022 | |
     | t Critical two-tail | 2.109816 | |

  75. Solution (items to fill in). H_0: µ_1 − µ_2 = 0; H_a: µ_1 − µ_2 ≠ 0; α = 0.05. Degrees of freedom?? Critical value(s): Test statistic: Decision: Conclusion:
