
Inference for Two Samples - PowerPoint PPT Presentation



  1. ST 435/535 Statistical Methods for Quality and Productivity Improvement / Statistical Process Control
     Inference for Two Samples
     In designed experiments, we compare the distribution of some variable under various conditions, defined by the levels of experimental factors. In the simplest case, there is only one factor, and it has just two levels. That is, we have samples from two populations, and we use those samples to make inferences about the differences between the populations.
     (Slide 1 of 34. Inferences About Process Quality: Statistical Inference for Two Samples)

  2. Comparing Means
     The two samples are $X_{1,1}, X_{1,2}, \dots, X_{1,n_1}$ from population 1, and $X_{2,1}, X_{2,2}, \dots, X_{2,n_2}$ from population 2, assumed to be independent.
     Point estimator: the natural unbiased estimator of $\mu_1 - \mu_2$ is $\bar{X}_1 - \bar{X}_2$. Its variance is
     $$\operatorname{Var}(\bar{X}_1 - \bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2},$$
     so its standard error is
     $$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

  3. Known variances
     If $\sigma_1^2$ and $\sigma_2^2$ are known, we can use the fact that
     $$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1)$$
     to construct confidence intervals for $\mu_1 - \mu_2$ and to test hypotheses about this difference, in exactly the same way as for the mean $\mu$ of a single population.

  4. Example: Paint formulations
     Population 1 is the drying time of a standard formulation of paint, and population 2 is the drying time for a modified formulation that is intended to dry more quickly.
     Sample sizes are $n_1 = n_2 = 10$, and we assume $\sigma_1 = \sigma_2 = 8$ minutes, so the standard error of $\bar{X}_1 - \bar{X}_2$ is $\sqrt{8^2/10 + 8^2/10} = 3.58$ minutes.
     The sample means are $\bar{x}_1 = 121$ minutes and $\bar{x}_2 = 112$ minutes.

  5. Point estimate: the point estimate of $\mu_1 - \mu_2$ is 9 minutes, with standard error 3.58 minutes.
     Interval estimate: the 95% confidence interval for $\mu_1 - \mu_2$ is
     $$9 \pm 1.96 \times 3.58 = (1.98, 16.02) \text{ minutes}.$$
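The interval above can be reproduced with a short sketch in Python. The function name `z_interval` and its argument names are illustrative choices, not part of the slides; the normal quantile comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, level=0.95):
    """Two-sided confidence interval for mu1 - mu2 with known variances."""
    # Standard error of the difference of the two sample means.
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # Upper alpha/2 quantile of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se


# Paint example: means 121 and 112 minutes, sigma = 8, n = 10 in each sample.
lower, upper = z_interval(121, 112, 8, 8, 10, 10)
print(f"({lower:.2f}, {upper:.2f})")
```

Because the sketch keeps the unrounded standard error and quantile, it gives roughly (1.99, 16.01), within 0.01 of the hand calculation that uses 3.58 and 1.96.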

  6. Hypothesis test
     The null hypothesis is that the new formulation dries no faster: $H_0: \mu_2 \ge \mu_1$, or equivalently $H_0: \mu_1 - \mu_2 \le 0$, so the alternative is $H_1: \mu_1 - \mu_2 > 0$.
     The test statistic is
     $$z_{\text{obs}} = \frac{\bar{x}_1 - \bar{x}_2}{\text{standard error}} = \frac{9}{3.58} = 2.516.$$
     The $P$-value is $1 - \Phi(2.516) = 0.0059$, so $H_0$ is rejected in any test with Type I error rate $\alpha > 0.0059$. The data give strong evidence that the new formulation dries faster.
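As a check on the arithmetic, here is a minimal sketch of the one-sided test for the paint example. The function name `upper_tail_z_test` is a hypothetical helper, not something defined in the course material.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(xbar1, xbar2, sigma1, sigma2, n1, n2):
    """Test H0: mu1 - mu2 <= 0 against H1: mu1 - mu2 > 0, variances known."""
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z_obs = (xbar1 - xbar2) / se
    # Upper-tail probability under the standard normal distribution.
    p_value = 1 - NormalDist().cdf(z_obs)
    return z_obs, p_value


z_obs, p_value = upper_tail_z_test(121, 112, 8, 8, 10, 10)
print(f"z_obs = {z_obs:.3f}, P-value = {p_value:.4f}")
```

The statistic 2.516 comes from the unrounded standard error; dividing 9 by the rounded 3.58 gives a slightly smaller value.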

  7. Unknown variances, assumed equal
     If the variances are unknown, but we assume that $\sigma_1^2 = \sigma_2^2 = \sigma^2$, we estimate the common variance $\sigma^2$ with the pooled estimator
     $$S_p^2 = \frac{\sum_{i=1}^{n_1} (X_{1,i} - \bar{X}_1)^2 + \sum_{i=1}^{n_2} (X_{2,i} - \bar{X}_2)^2}{n_1 + n_2 - 2} = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)},$$
     a weighted average of $S_1^2$ and $S_2^2$.
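The two forms of the pooled estimator can be compared numerically. The sketch below computes the sum-of-squares form directly and checks it against the weighted-average form; the sample values are made up purely for illustration and do not come from the slides.

```python
from statistics import mean, variance


def pooled_variance(sample1, sample2):
    """Pooled variance: combined sums of squares over n1 + n2 - 2."""
    m1, m2 = mean(sample1), mean(sample2)
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    return (ss1 + ss2) / (len(sample1) + len(sample2) - 2)


# Illustrative values only (assumed, not data from the slides).
a = [10.1, 9.8, 10.4, 10.0]
b = [9.5, 9.9, 9.7, 10.2, 9.6]

sp2 = pooled_variance(a, b)
# Weighted average of the two sample variances, weights n_i - 1.
weighted = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / (
    len(a) + len(b) - 2
)
print(sp2, weighted)
```

The two printed numbers agree up to floating-point rounding, and the pooled value always lies between the two sample variances.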

  8. The estimated standard error of $\bar{X}_1 - \bar{X}_2$ is
     $$S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}.$$
     The point estimator, interval estimator, and hypothesis tests are constructed in the same way as for a single population, with: the standard error replaced by the estimated standard error; the normal quantiles $z_\alpha$ or $z_{\alpha/2}$ replaced by the corresponding $t$-quantiles $t_{\alpha, n_1 + n_2 - 2}$ or $t_{\alpha/2, n_1 + n_2 - 2}$; and, for a $P$-value, $\Phi(\cdot)$ replaced by $F_{t, n_1 + n_2 - 2}(\cdot)$.

  9. Example: Comparing mean yields
     Population 1 is the yield of a process using the current catalyst; population 2 is the yield of the process using an alternative, cheaper catalyst.
     Sample sizes are $n_1 = n_2 = 8$, and the sample means are $\bar{x}_1 = 92.255$, $\bar{x}_2 = 92.733$.
     Sample standard deviations are $s_1 = 2.39$, $s_2 = 2.98$, so
     $$s_p = \sqrt{\frac{7 \times 2.39^2 + 7 \times 2.98^2}{7 + 7}} = 2.701.$$
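A small sketch of the same calculation from summary statistics follows. The helper name `pooled_sd` is an assumption for illustration; the inputs are the sample sizes and standard deviations quoted in the example.

```python
from math import sqrt


def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation from two sample standard deviations."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


s_p = pooled_sd(2.39, 8, 2.98, 8)
# Estimated standard error of the difference in sample means.
se = s_p * sqrt(1 / 8 + 1 / 8)
print(f"s_p = {s_p:.3f}, estimated SE = {se:.3f}")
```

With equal sample sizes the pooled variance is simply the average of the two sample variances, which is why the weights cancel here.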

  10. Point estimate: we estimate $\mu_1 - \mu_2$ by $\bar{x}_1 - \bar{x}_2 = -0.478$, with estimated standard error $s_p \times \sqrt{2/8} = 1.351$. Note that catalyst 2 actually gives a higher sample mean yield, in addition to being cheaper.
      Interval estimate: with $t_{0.025, 14} = 2.145$, the 95% confidence interval for $\mu_1 - \mu_2$ is
      $$-0.478 \pm t_{0.025, 14} \times 1.351 = -0.478 \pm 2.898 = (-3.376, 2.420).$$
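Evaluating the expression $-0.478 \pm t_{0.025,14} \times 1.351$ directly gives a margin of about 2.90. The sketch below does this in Python; since the standard library has no $t$ quantile function, the critical value 2.145 is hard-coded from a standard $t$ table, and the function name is illustrative.

```python
from math import sqrt

# Upper 2.5% point of the t distribution with 14 degrees of freedom,
# taken from a standard t table (assumed, not computed here).
T_CRIT_0025_14 = 2.145


def pooled_t_interval(xbar1, xbar2, s_p, n1, n2, t_crit):
    """Confidence interval for mu1 - mu2 using the pooled standard deviation."""
    se = s_p * sqrt(1 / n1 + 1 / n2)
    diff = xbar1 - xbar2
    return diff - t_crit * se, diff + t_crit * se


lower, upper = pooled_t_interval(92.255, 92.733, 2.701, 8, 8, T_CRIT_0025_14)
print(f"({lower:.3f}, {upper:.3f})")
```

The result is roughly (-3.37, 2.42). The interval contains zero, which matches the two-sided test not rejecting equality of the means at the 5% level.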

  11. Hypothesis test
      Since the new catalyst is cheaper, we would switch to it unless it gives substantially lower yield. We would test the null hypothesis that its yield is no worse than the current catalyst: $H_0: \mu_1 \le \mu_2$ versus $H_1: \mu_1 > \mu_2$.
      Montgomery instead tests for no change: $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \ne \mu_2$.
      In either case, the test statistic is
      $$t_{\text{obs}} = \frac{\bar{x}_1 - \bar{x}_2}{\text{estimated standard error}} = \frac{-0.478}{1.351} = -0.35.$$

  12. The two-sided $P$-value is $2(1 - F_{t,14}(|-0.35|)) = 0.73$, so $H_0$ is not rejected at any conventional level.
      The data are consistent with the null hypothesis that there is no difference in mean yield, or in other words that changing the catalyst has no significant effect on mean yield.
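The $P$-value can be checked without a statistics package. The sketch below is one possible approach, not the method used in the course: it evaluates the $t$ density with `math.lgamma` and integrates it with Simpson's rule. The function names and the step count are arbitrary choices.

```python
from math import exp, lgamma, log, pi


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


t_obs = -0.35
p_two_sided = 2 * (1 - t_cdf(abs(t_obs), 14))
print(f"two-sided P-value = {p_two_sided:.2f}")
```

In practice a statistics library would supply the $t$ distribution function directly; the numerical integration here only keeps the example self-contained.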

  13. Comparing Variances
      If we have samples of size $n_1$ and $n_2$ from $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, we may need to decide whether the variances are equal.
      We could test $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 \ne \sigma_2^2$, or, equivalently, construct a confidence interval for $\sigma_1^2 / \sigma_2^2$.
      Both depend on the fact that
      $$F = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2}$$
      follows the $F$-distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom.
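As an illustration only, the variance-ratio test can be applied to the catalyst example from the earlier slides ($s_1 = 2.39$, $s_2 = 2.98$, $n_1 = n_2 = 8$); the slides themselves do not carry out this calculation. Under $H_0$ the statistic reduces to $s_1^2 / s_2^2$. The critical value 4.99 is the upper 2.5% point of $F_{7,7}$ taken from a standard table, hard-coded because the Python standard library has no $F$ distribution.

```python
# Upper 2.5% point of the F distribution with (7, 7) degrees of freedom,
# from a standard F table (assumed, not computed here).
F_UPPER_0025_7_7 = 4.99


def variance_ratio(s1, s2):
    """Observed F statistic under H0: sigma1^2 = sigma2^2."""
    return s1**2 / s2**2


f_obs = variance_ratio(2.39, 2.98)
# With equal degrees of freedom, the lower 2.5% point is the reciprocal
# of the upper 2.5% point.
lower_crit = 1 / F_UPPER_0025_7_7
reject = f_obs < lower_crit or f_obs > F_UPPER_0025_7_7
print(f"F_obs = {f_obs:.3f}, reject H0 at 5%: {reject}")
```

Here the observed ratio falls well inside the acceptance region, so this check gives no evidence against the equal-variance assumption used for the pooled $t$ procedure.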

  14. Inference for Two Proportions
      If we have samples of size $n_1$ and $n_2$ from Bernoulli populations with proportions $p_1$ and $p_2$, we may need to decide whether the proportions are equal.
      We assume that the normal approximations to the sampling distributions of $\hat{p}_1$ and $\hat{p}_2$ are good.
      We could test $H_0: p_1 = p_2$ versus $H_1: p_1 \ne p_2$, or construct a confidence interval for $p_1 - p_2$. In this case, the test and the interval may not be exactly equivalent, because the test estimates the standard error under $H_0$.

  15. Both the test and the confidence interval are based on the fact that
      $$Z = \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1 (1 - p_1)}{n_1} + \dfrac{p_2 (1 - p_2)}{n_2}}}$$
      is approximately $N(0, 1)$.
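For the confidence interval, a common way to use this result is to plug the sample proportions into the denominator. The sketch below takes that plug-in approach as an assumption, since the slides shown here do not spell out the interval; the counts (30 of 100 and 20 of 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_interval(x1, n1, x2, n2, level=0.95):
    """Approximate interval for p1 - p2 with an unpooled plug-in standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Hypothetical counts, chosen only to illustrate the calculation.
lower, upper = two_proportion_interval(30, 100, 20, 100)
print(f"({lower:.4f}, {upper:.4f})")
```

With these made-up counts the interval is roughly (-0.019, 0.219), which includes zero.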

  16. In the hypothesis test, when $p_1 = p_2 = p$, the denominator is
      $$\sqrt{p (1 - p) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)},$$
      which we estimate by replacing $p$ with
      $$\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}.$$
      The test statistic is
      $$z_{\text{obs}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} (1 - \hat{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}.$$
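A minimal sketch of the pooled test follows. The function name and the counts (30 successes out of 100 versus 20 out of 100) are hypothetical; note that $(x_1 + x_2)/(n_1 + n_2)$ is the same quantity as $(n_1 \hat{p}_1 + n_2 \hat{p}_2)/(n_1 + n_2)$.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic and two-sided P-value for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled estimate of the common proportion under H0.
    p_pool = (x1 + x2) / (n1 + n2)
    se0 = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z_obs = (p1 - p2) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))
    return z_obs, p_value


# Hypothetical counts, chosen only to illustrate the calculation.
z_obs, p_value = two_proportion_z(30, 100, 20, 100)
print(f"z_obs = {z_obs:.3f}, P-value = {p_value:.3f}")
```

For these made-up counts the statistic is about 1.63 with a two-sided $P$-value of about 0.10.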

  17. Continuity correction
      A continuity correction improves the normal approximation to the distribution of the test statistic:
      $$z_{\text{obs}} = \frac{\hat{p}_1 - \hat{p}_2 - \operatorname{sgn}(\hat{p}_1 - \hat{p}_2) \, \dfrac{1}{2} \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}{\sqrt{\hat{p} (1 - \hat{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}.$$
      Equivalently, replace the success counts $x_1 = n_1 \hat{p}_1$ by $x_1 \pm \frac{1}{2}$ and $x_2 = n_2 \hat{p}_2$ by $x_2 \mp \frac{1}{2}$, where the signs are chosen to make $|z_{\text{obs}}|$ smallest.
      Montgomery does not give the corrected form, so you can use the uncorrected form from the previous slide for exercises.
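The equivalence between the two descriptions of the correction can be checked numerically. In the sketch below the function names and the counts (30 of 100 versus 20 of 100) are hypothetical, and the corrected statistic is implemented exactly as written in the formula, with no clipping when the difference is smaller than the correction.

```python
from math import sqrt


def pooled_z(x1, n1, x2, n2):
    """Uncorrected pooled z statistic for H0: p1 = p2."""
    p_pool = (x1 + x2) / (n1 + n2)
    se0 = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se0


def corrected_z(x1, n1, x2, n2):
    """Pooled z statistic with the continuity correction."""
    p_pool = (x1 + x2) / (n1 + n2)
    se0 = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    diff = x1 / n1 - x2 / n2
    sign = (diff > 0) - (diff < 0)  # sgn(p1_hat - p2_hat)
    return (diff - sign * 0.5 * (1 / n1 + 1 / n2)) / se0


# Hypothetical counts, chosen only to illustrate the calculation.
z_corr = corrected_z(30, 100, 20, 100)
# Same result from shifting the counts by half a unit toward each other.
z_shift = pooled_z(29.5, 100, 20.5, 100)
print(f"corrected z = {z_corr:.4f}, shifted-count z = {z_shift:.4f}")
```

Shifting the counts leaves $x_1 + x_2$, and therefore $\hat{p}$, unchanged, which is why the two versions agree. For these counts the corrected statistic is about 1.47, smaller in magnitude than the uncorrected 1.63.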
