
STAT 213: Simple Linear Regression I (Colin Reimer Dawson, Oberlin College)


  1. STAT 213: Simple Linear Regression I. Colin Reimer Dawson, Oberlin College, 5 October 2016

  2. Outline: Simple Linear Regression Model

  3. The Project
     Find a relationship between a response variable ($Y$) and one or more predictor/explanatory variables, $X_1, \dots, X_k$.
     $$Y = f(X) + \varepsilon$$
     DATA = PATTERN + IDIOSYNCRASIES
     • One vs. two means: $Y$ quantitative, $X$ categorical
     • Simple Linear Regression: both quantitative (but still just one $X$)

  4. Examples
     • $Y$ = Home price, $X$ = Home size
     • $Y$ = Exam score, $X$ = Hours spent studying
     • $Y$ = State % in poverty, $X$ = State % with no health insurance
     • $Y$ = SAT score, $X$ = Family income

  5. The Simple Linear Model
     $$Y = \beta_0 + \beta_1 \cdot X + \varepsilon$$
     aka Response = Intercept + Slope · Predictor + Random Error
     Standard form: assume the errors $\varepsilon \sim N(0, \sigma_\varepsilon)$ and are independent.
     Parameters to estimate: $\beta_0$, $\beta_1$ and $\sigma_\varepsilon$
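
A minimal Python sketch of what the standard form says: each response is a point on the line plus an independent Normal error. The function name `simulate_slm` and every numeric value below are hypothetical choices for illustration; none of them come from the slides.

```python
import random


def simulate_slm(xs, beta0, beta1, sigma, seed=0):
    """Draw one Y per x from Y = beta0 + beta1 * x + eps, with eps ~ N(0, sigma)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]


# Hypothetical predictor values and parameters (assumptions, not course data).
xs = [30, 40, 50, 60, 70]
ys = simulate_slm(xs, beta0=-30.0, beta1=0.7, sigma=5.0)

for x, y in zip(xs, ys):
    print(f"x = {x:3d}   y = {y:7.2f}")
```

Setting `sigma=0.0` makes every point fall exactly on the line, which isolates the PATTERN part of DATA = PATTERN + IDIOSYNCRASIES.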

  6. SLM Visualized

  7. SLM With Data

  8. Presidential Approval and Re-election Margin
     [Figure: scatterplot of Reelection Margin (%) against Incumbent Approval (%)]

  9. Conditions for SLM
     Pattern
       1. Mean $Y$ at each $X$ is a linear function of $X$: $\mu_Y(X) = f(X) = \beta_0 + \beta_1 X$
     Residuals
       2. Zero mean: residuals centered at 0
       3. Constant variance: same variability at all $X$ (homoskedasticity)
       4. Independence: no relationship among errors
       5. Normality (for standard form): at each $X$, the $Y$s are Normally distributed

  10. Exploring violations of conditions: https://gallery.shinyapps.io/slr_diag/

  11. Re-election Margin: Two Models
      [Figure: two scatterplots of Reelection Margin (%) against Incumbent Approval (%)]
      Figure: Left: Constant Model $Y = \beta_0 + \varepsilon$; Right: Best Fit Linear Model $Y = \beta_0 + \beta_1 X + \varepsilon$

  12. FIT: What parameters?
      The Simple Linear Model
      $$Y = \beta_0 + \beta_1 \cdot X + \varepsilon$$
      aka Response = Intercept + Slope · Predictor + Random Error
      Standard form: assume the errors $\varepsilon \sim N(0, \sigma_\varepsilon)$ and are independent.
      Parameters to estimate: $\beta_0$, $\beta_1$ and $\sigma_\varepsilon$

  13. Minimizing Sum of Squared Residuals
      • From data, pick estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ to define an estimated $f(X)$ (can write $\hat{f}(X)$). This defines the prediction equation:
        $$\hat{Y}_i = \hat{f}(X_i) = \hat{\beta}_0 + \hat{\beta}_1 X_i$$
      • If we want $\hat{f}(X_i)$ to represent mean $Y$ at $X_i$, choose $\hat{\beta}_0$ and $\hat{\beta}_1$ to minimize the sum of squared residuals:
        $$SSR = \sum (Y_i - \hat{Y}_i)^2$$
      • How? Multivariable calculus gives us formulae:
        $$\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$
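
The closed-form estimates above can be sketched in a few lines of plain Python. The helper names `fit_slr` and `ssr` and the five data points are made up for illustration (they are not the re-election data); the code is only meant to show the formulas at work.

```python
def fit_slr(xs, ys):
    """Return (b0_hat, b1_hat) from the closed-form least-squares formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx            # slope estimate
    b0 = y_bar - b1 * x_bar   # intercept estimate
    return b0, b1


def ssr(xs, ys, b0, b1):
    """Sum of squared residuals for the line b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Toy data, invented for this example only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_slr(xs, ys)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"SSR at the fit       = {ssr(xs, ys, b0, b1):.3f}")
print(f"SSR at a nearby line = {ssr(xs, ys, b0, b1 + 0.1):.3f}")
```

Nudging the slope or the intercept away from the fitted values can only increase the SSR, which is what "minimize" means here.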

  14. Re-election Margin: Two Models
      [Figure: two scatterplots of Reelection Margin (%) against Incumbent Approval (%)]
      Figure: Left: Best Fit Constant Model $Y = \bar{Y} + \hat{\varepsilon}$; Right: Best Fit Linear Model $Y = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\varepsilon}$

  15. Estimating $\sigma_\varepsilon$
      • The standard estimate of the population standard deviation of residuals, $\sigma_\varepsilon$, is (almost) the sample standard deviation of the residuals:
        $$\hat{\sigma}_\varepsilon = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{n - 2}} = \sqrt{\frac{SSR}{n - 2}}$$
      • We usually have $n - 1$ in the denominator when computing sample variance. Why $n - 2$ here?
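
As a rough illustration of the estimate above, the sketch below fits the line and then divides the SSR by $n - 2$ (two fitted parameters, $\hat{\beta}_0$ and $\hat{\beta}_1$) before taking the square root. The function name `residual_standard_error` and the toy data are assumptions made for this example, not part of the slides.

```python
import math


def residual_standard_error(xs, ys):
    """Estimate sigma_eps as sqrt(SSR / (n - 2)) after a least-squares fit."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 points so that n - 2 > 0")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    ssr = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(ssr / (n - 2))


# Toy data, invented for this example only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

print(f"sigma_hat = {residual_standard_error(xs, ys):.4f}")
```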

  16. ASSESS: Check conditions with residual plots: https://gallery.shinyapps.io/slr_diag/
