m8s1 regression inference

M8S1 - Regression Inference Professor Jarad Niemi STAT 226 - Iowa - PowerPoint PPT Presentation



  1. M8S1 - Regression Inference Professor Jarad Niemi STAT 226 - Iowa State University November 29, 2018 Professor Jarad Niemi (STAT226@ISU) M8S1 - Regression Inference November 29, 2018 1 / 13

  2. Regression Inference
     Review of population mean inference: assumptions, confidence interval, p-value, hypothesis test.
     Regression inference: assumptions, confidence interval, p-value, hypothesis test.

  3. Population mean assumptions
     What is an inference? Making a statement about the population based on a sample.
     What are our assumptions when making an inference about a population mean?
     - Data are independent
     - Data are normally distributed
     - Data are identically distributed with a common mean and a common variance
     This is encapsulated with the statistical notation $Y_i \overset{iid}{\sim} N(\mu, \sigma^2)$.

  4. Statistics for a population mean
     Suppose we have the assumption $Y_i \overset{iid}{\sim} N(\mu, \sigma^2)$.
     What is our estimator for $\mu$? The sample mean:
     $$\hat{\mu} = \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$$
     What is our estimator for $\sigma^2$? The sample variance:
     $$\hat{\sigma}^2 = s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$$
     What is the standard error of $\hat{\mu}$?
     $$SE[\hat{\mu}] = SE[\bar{y}] = s/\sqrt{n}$$
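
The three formulas above can be sketched in a few lines of Python. This is only an illustration: the function name `mean_inference_stats` and the data values are made up for the example and do not come from the slides.

```python
import math

def mean_inference_stats(y):
    """Return (mu_hat, s2, se) for a sample y of size n >= 2."""
    n = len(y)
    mu_hat = sum(y) / n                                  # sample mean
    s2 = sum((yi - mu_hat) ** 2 for yi in y) / (n - 1)   # sample variance
    se = math.sqrt(s2 / n)                               # s / sqrt(n)
    return mu_hat, s2, se

# Hypothetical data, chosen only so the arithmetic is easy to follow.
y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, s2, se = mean_inference_stats(y)
print(mu_hat, s2, se)
```

Note the divisor `n - 1` in the variance: dividing by `n` instead would give a different estimator from the $s^2$ used on the slide.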

  5. Confidence intervals for a population mean
     If we have the assumption $Y_i \overset{iid}{\sim} N(\mu, \sigma^2)$, what is the formula to construct a $100(1-\alpha)\%$ confidence interval for the population mean $\mu$?
     $$\bar{y} \pm t_{n-1,\alpha/2} \, s/\sqrt{n}$$
     where $P(T_{n-1} > t_{n-1,\alpha/2}) = \alpha/2$. More generally, we have
     $$\hat{\mu} \pm t^* \times SE[\hat{\mu}]$$
     where
     - $\hat{\mu}$ is the estimator of the population mean
     - $t^*$ is the appropriate t-critical value
     - $SE[\hat{\mu}]$ is the standard error of the estimator
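
A minimal sketch of the interval computation, assuming the critical value is read from a t table rather than computed. The helper name `mean_ci` and the data are hypothetical; the value 2.262 is the tabulated $t_{9,\,0.025}$ for a sample of size 10.

```python
import math
import statistics

def mean_ci(y, t_crit):
    """Confidence interval y_bar +/- t_crit * s / sqrt(n)."""
    n = len(y)
    y_bar = statistics.mean(y)
    se = statistics.stdev(y) / math.sqrt(n)   # stdev uses the n - 1 divisor
    return y_bar - t_crit * se, y_bar + t_crit * se

# Hypothetical sample with n = 10, so the degrees of freedom are 9.
y = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9, 5.2, 4.8]
lower, upper = mean_ci(y, t_crit=2.262)       # 95% interval
print(lower, upper)
```

Passing the critical value in as an argument keeps the sketch free of any dependency beyond the standard library; a different confidence level only requires a different table entry.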

  6. t-statistic for a population mean
     Suppose you have the null hypothesis $H_0: \mu = m_0$. What is the formula for the t-statistic?
     $$t = \frac{\bar{y} - m_0}{s/\sqrt{n}} = \frac{\hat{\mu} - m_0}{SE[\hat{\mu}]}$$
     Thus we have the estimator minus the hypothesized value in the numerator and the standard error of the estimator in the denominator.
     If the null hypothesis is true, what is the distribution for $t$?
     $$t \sim T_{n-1}$$

  7. Hypothesis test for population mean
     Suppose you have the hypotheses $H_0: \mu = m_0$ versus $H_a: \mu > m_0$. How can you calculate the p-value for this test?
     $$p\text{-value} = P(T_{n-1} > t) = P\left(T_{n-1} > \frac{\hat{\mu} - m_0}{SE[\hat{\mu}]}\right)$$
     At level $\alpha$, you
     - reject $H_0$ if p-value $\le \alpha$ and conclude that there is statistically significant evidence that $\mu > m_0$, or
     - fail to reject $H_0$ if p-value $> \alpha$ and conclude that there is insufficient evidence that $\mu > m_0$.
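
The one-sided test can be sketched with the standard library alone. Because Python's standard library has no t distribution, this sketch approximates the upper-tail probability by integrating the t density with Simpson's rule; that numerical shortcut, the function names, and the data are all assumptions made for illustration, not something taken from the slides.

```python
import math
import statistics

def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return (math.exp(log_const) / math.sqrt(df * math.pi)
            * (1 + x * x / df) ** (-(df + 1) / 2))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T_df > t) using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_density(0.0, df) + t_density(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_density(k * h, df)
    area = total * h / 3                      # P(0 < T_df < |t|)
    return 0.5 - area if t >= 0 else 0.5 + area

def one_sided_test(y, m0, alpha=0.05):
    """Test H0: mu = m0 versus Ha: mu > m0; returns (t, p_value, reject)."""
    n = len(y)
    se = statistics.stdev(y) / math.sqrt(n)
    t = (statistics.mean(y) - m0) / se
    p_value = t_upper_tail(t, n - 1)
    return t, p_value, p_value <= alpha

# Hypothetical sample and hypothesized value.
y = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9, 5.2, 4.8]
t, p_value, reject = one_sided_test(y, m0=4.5)
print(t, p_value, reject)
```

For the alternative $H_a: \mu < m_0$ the tail would flip, so the p-value would be `1 - t_upper_tail(t, n - 1)`.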

  8. Regression assumptions
     In statistical notation, the regression assumptions can be written as
     $$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad \epsilon_i \overset{iid}{\sim} N(0, \sigma^2)$$
     for some unknown population intercept ($\beta_0$), population slope ($\beta_1$), and error for individual $i$ ($\epsilon_i$).
     What are the assumptions for the regression model?
     - Errors are independent
     - Errors are normal
     - Errors are identically distributed with a mean of 0 and a variance of $\sigma^2$
     - Linear relationship between the explanatory variable and the mean of the response: $E[Y_i] = \beta_0 + \beta_1 x_i$
     You might also see regression written like $Y_i \overset{ind}{\sim} N(\beta_0 + \beta_1 x_i, \sigma^2)$.
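
One way to make the model concrete is to simulate from it. The sketch below draws independent normal errors and adds them to the line; the function name `simulate_regression` and the parameter values are hypothetical, chosen only for illustration.

```python
import random

def simulate_regression(x, beta0, beta1, sigma, seed=0):
    """Draw y_i = beta0 + beta1 * x_i + eps_i with eps_i iid N(0, sigma^2)."""
    rng = random.Random(seed)                 # seeded for reproducibility
    return [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]

# Hypothetical parameter values.
x = [0.0, 0.25, 0.5, 0.75, 1.0]
y = simulate_regression(x, beta0=1.0, beta1=2.0, sigma=0.5)
print(y)
```

Each error is drawn separately from the same normal distribution, which mirrors the independence, normality, and common-variance assumptions; the linearity assumption is the `beta0 + beta1 * xi` term.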

  9. Statistics for regression (You do not need to know the formulas.)
     $$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad \epsilon_i \overset{iid}{\sim} N(0, \sigma^2)$$
     For the slope ($\beta_1$), the estimator is the sample slope
     $$\hat{\beta}_1 = b_1 = r \times s_y / s_x$$
     For the intercept ($\beta_0$), the estimator is the sample intercept
     $$\hat{\beta}_0 = b_0 = \bar{y} - b_1 \bar{x}$$
     For the variance ($\sigma^2$), the estimator is
     $$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} e_i^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$$
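
These estimators can be computed directly from the formulas. The following is a minimal sketch; the function name `fit_line` and the five data points are hypothetical.

```python
import statistics

def fit_line(x, y):
    """Return (b0, b1, sigma2_hat) for simple linear regression."""
    n = len(x)
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    s_x, s_y = statistics.stdev(x), statistics.stdev(y)
    # Sample correlation r between x and y.
    r = sum((xi - x_bar) * (yi - y_bar)
            for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)
    b1 = r * s_y / s_x                        # sample slope
    b0 = y_bar - b1 * x_bar                   # sample intercept
    residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    sigma2_hat = sum(e * e for e in residuals) / (n - 2)
    return b0, b1, sigma2_hat

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1, sigma2_hat = fit_line(x, y)
print(b0, b1, sigma2_hat)   # approximately 2.2, 0.6, 0.8
```

The variance estimate divides by `n - 2` rather than `n - 1` because two parameters, the intercept and the slope, are estimated in the mean.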

  10. Standard errors for regression (You do not need to know the formulas.)
     $$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad \epsilon_i \overset{iid}{\sim} N(0, \sigma^2)$$
     The important standard errors are
     $$SE[\hat{\beta}_1] = SE[b_1] = \hat{\sigma} \sqrt{\frac{1}{(n-1) s_x^2}}$$
     and
     $$SE[\hat{\beta}_0] = SE[b_0] = \hat{\sigma} \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{(n-1) s_x^2}}$$
     We can use these to construct confidence intervals and p-values.
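
A sketch of the two standard errors, under the assumption that the slope is computed as the ratio of the cross-product sum to `sxx`, which is algebraically the same as $r \times s_y / s_x$. The function name and the data are hypothetical.

```python
import math

def regression_standard_errors(x, y):
    """Return (se_b0, se_b1) for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)  # equals (n - 1) * s_x^2
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(sse / (n - 2))
    se_b1 = sigma_hat * math.sqrt(1 / sxx)
    se_b0 = sigma_hat * math.sqrt(1 / n + x_bar ** 2 / sxx)
    return se_b0, se_b1

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
se_b0, se_b1 = regression_standard_errors(x, y)
print(se_b0, se_b1)
```

Both standard errors shrink as `sxx` grows, so spreading the explanatory variable over a wider range gives more precise estimates.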

  11. Confidence intervals for regression
     $$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad \epsilon_i \overset{iid}{\sim} N(0, \sigma^2)$$
     $100(1-\alpha)\%$ confidence interval for the slope:
     $$b_1 \pm t_{n-2,\alpha/2} \times SE[b_1]$$
     $100(1-\alpha)\%$ confidence interval for the intercept:
     $$b_0 \pm t_{n-2,\alpha/2} \times SE[b_0]$$
     To remember the degrees of freedom, it is always the sample size minus the number of parameters in the mean. In this case, there are two parameters in the mean: $\beta_0$ and $\beta_1$.
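
The degrees-of-freedom rule can be sketched as a lookup: subtract 2 from the sample size, then read the critical value from a t table. The dictionary below holds standard tabulated values of $t_{df,\,0.025}$ for small df, so this sketch only covers 95% intervals; the function name and the summary numbers are hypothetical.

```python
# Tabulated critical values t_{df, 0.025} for 95% intervals (standard t table).
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}

def coefficient_ci_95(estimate, se, n):
    """95% interval estimate +/- t_{n-2, 0.025} * se for a regression coefficient."""
    df = n - 2                                # two parameters in the mean
    margin = T_CRIT_95[df] * se
    return estimate - margin, estimate + margin

# Hypothetical summary values: slope 0.6 with standard error 0.2828 from n = 5 points.
lower, upper = coefficient_ci_95(estimate=0.6, se=0.2828, n=5)
print(lower, upper)
```

The same function works for the intercept: pass `b0` and `SE[b0]` instead, since both intervals share the `n - 2` degrees of freedom.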

  12. Hypothesis tests
     Although hypothesis tests can be constructed for different hypothesized values, the vast majority of the time we test against a hypothesized value of 0 and typically care only about the slope.
     Suppose you have these hypotheses about the slope: $H_0: \beta_1 = 0$ versus $H_a: \beta_1 \ne 0$. Then our t-statistic is
     $$t = \frac{\hat{\beta}_1 - 0}{SE[\hat{\beta}_1]} = \frac{b_1 - 0}{SE[b_1]} \sim T_{n-2}$$
     and the p-value is
     $$p\text{-value} = 2 P(T_{n-2} > |t|).$$
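
As a worked illustration with hypothetical summary values (not from the slides), suppose $n = 5$, $b_1 = 0.6$, and $SE[b_1] = 0.283$. The tabulated critical value is $t_{3,\,0.025} = 3.182$.

```latex
\begin{align*}
t &= \frac{b_1 - 0}{SE[b_1]} = \frac{0.6}{0.283} \approx 2.12,
  \qquad df = n - 2 = 3 \\
|t| &= 2.12 < t_{3,\,0.025} = 3.182
  \;\Longrightarrow\; p\text{-value} = 2P(T_3 > 2.12) > 0.05
\end{align*}
```

So at level $\alpha = 0.05$ we would fail to reject $H_0: \beta_1 = 0$ for these values. Comparing $|t|$ with the critical value gives the same decision as comparing the two-sided p-value with $\alpha$.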

  13. Why do we care about $\beta_1 = 0$?
     If $\beta_1 = 0$, then $y_i = \beta_0 + \epsilon_i$, i.e. our response variable is independent of our explanatory variable.
     [Figure: plot of y against x.]
