
Welcome Back! EDUC 7610, Chapter 2: The Simple Regression Model



  1. Welcome Back!

  2. EDUC 7610, Chapter 2: The Simple Regression Model. Fall 2018. Tyson S. Barrett, PhD. $Y_i = \beta_0 + \beta_1 X_{1i} + \epsilon_i$

  3. Let’s start with scatterplots. Each point represents a single observation, and the red line is the line of best fit. [Figure: scatterplot of y against x with the fitted line.] The line happens to go through each conditional mean; that is, it goes through the mean of y at each value of x. E.g., when x = 1, the mean of y is 2.5 (the conditional mean of y at x = 1 is 2.5).

  4. Conditional Means and Prediction. [Figure: the same scatterplot, with open circles marking the conditional means.] The open circles are where the conditional means are. In this case, all conditional means run along the line; when this happens (or approximately happens) we have linearity. The line is the linear model’s predicted level of y for each level of x.

  5. Why is that line the “best”? That line minimizes the error between the predicted values and the observed values, i.e., the “residual” or “error”: $SS_{\text{residual}} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$. This approach is called Ordinary Least Squares (OLS) regression. [Figure: scatterplot of y against x with the fitted line.]
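The minimization idea can be illustrated with a short sketch (Python is my choice, not the slides’; the data values are made up): the least-squares line yields a smaller sum of squared residuals than any perturbed line.

```python
# Sum of squared residuals for a candidate line y-hat = b0 + b1 * x
def ssr(x, y, b0, b1):
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Made-up example data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.5, 3.0, 4.5, 5.0, 6.5]

# Closed-form OLS estimates (the formulas on slide 6)
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

# The OLS line beats every nearby perturbed line
best = ssr(x, y, b0, b1)
assert best < ssr(x, y, b0 + 0.1, b1)
assert best < ssr(x, y, b0, b1 + 0.1)
assert best < ssr(x, y, b0 - 0.1, b1 - 0.1)
```

Because the sum of squared residuals is strictly convex in the two coefficients, any change to the OLS estimates strictly increases it.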

  6. Features of the “Best” Line (Simple Regression). Slope = $\hat{\beta}_1$; intercept = $\hat{\beta}_0$.
     $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$
     $\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(X)}$
     The line: $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
     where $\mathrm{Var}(X) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$ and $\mathrm{Cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$

  7. The “Best” Line and Correlation. $\hat{\beta}_1 = r_{xy} \frac{s_y}{s_x}$: we unstandardized the $r_{xy}$ by $\frac{s_y}{s_x}$. How the two differ: • $r_{xy}$ has no scale, but $\hat{\beta}_1$ is in the units of the outcome. • $r_{xy}$ is affected by the range of the variables measured. • $\hat{\beta}_1$ is the effect of X on Y, while $r_{xy}$ is the relative importance of X on Y. • $\hat{\beta}_1$ is only affected by variables that influence both X and Y, while $r_{xy}$ is affected by variables that only influence Y.

  8. We unstandardized the $r_{xy}$ by $\frac{s_y}{s_x}$. That is, $r_{xy}$ is the standardized version of $\hat{\beta}_1$. If we standardize our variables before using regression, $z_x = \frac{X - \bar{X}}{s_x}$, both $r_{xy}$ and $\hat{\beta}_1$ are the same. Why?

  9. $r_{xy}$ has no scale, but $\hat{\beta}_1$ is in the units of the outcome. $r_{xy}$ has a range of -1 to 1. $\hat{\beta}_1$ is in the range of the outcome (approximately), often from $-\infty$ to $\infty$. “For a one-unit increase in X there is an associated increase of $\hat{\beta}_1$ units in the outcome.”

  10. $r_{xy}$ is affected by the range of the variables measured. The value of $\hat{\beta}_1$ is not affected by the range of X (the significance is…). $r_{xy}$ is affected by having a less-than-representative range of X. Why?

  11. $r_{xy}$ is affected by the range of the variables measured. [Figure: two scatterplots of y against x, one over the full range of x (0 to 6) and one over a restricted range (1 to 4).]

  12. $\hat{\beta}_1$ is only affected by variables that influence both X and Y, while $r_{xy}$ is affected by variables that only influence Y. $\hat{\beta}_1$ is the effect of X on Y, while $r_{xy}$ is the relative importance of X on Y. • $r_{xy}$ is a measure of relative importance compared to other variables; if other variables are important, $r_{xy}$ will be relatively smaller. • $\hat{\beta}_1$ is a measure of the effect of X on Y and therefore shouldn’t change much based on the range of X. • The standard error is affected, though (we’ll discuss this later).

  13. Back to Residuals. The estimate of $\hat{\beta}_1$ depends on minimizing the residuals, so they are kind of a big deal: $SS_{\text{residual}} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$. [Figure: scatterplot of y against x with the fitted line.]

  14. Back to Residuals. Our $Y_i$ values can be separated into three parts: $Y_i = \bar{Y} + (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)$, i.e., a part that is the same for everyone (a constant), the explained component, and the unexplained component (the residuals).
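The three-part split can be verified directly (a Python sketch with made-up data). As a standard consequence of OLS with an intercept, squaring and summing the parts also decomposes the total sum of squares into explained and residual pieces:

```python
from statistics import mean

# Made-up example data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.5, 3.0, 4.5, 5.0, 6.5]

xbar, ybar = mean(x), mean(y)
b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
     sum((a - xbar) ** 2 for a in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

# Each observation splits exactly into the three parts on the slide
for yi, yh in zip(y, yhat):
    constant = ybar          # the same for everyone
    explained = yh - ybar    # explained component
    residual = yi - yh       # unexplained component (residual)
    assert abs(yi - (constant + explained + residual)) < 1e-12

# Summed squares: SS_total = SS_explained + SS_residual
ss_total = sum((yi - ybar) ** 2 for yi in y)
ss_explained = sum((yh - ybar) ** 2 for yh in yhat)
ss_residual = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
assert abs(ss_total - (ss_explained + ss_residual)) < 1e-9
```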

  16. Properties of the Residuals. 1. The mean is exactly zero. 2. The correlation with X is exactly zero. 3. The variance is $\mathrm{Var}(Y.X) = \mathrm{Var}(Y)(1 - r_{xy}^2)$, so $\frac{\mathrm{Var}(Y.X)}{\mathrm{Var}(Y)} = 1 - r_{xy}^2$ is the proportion of variance in Y not explained by X.

  17. Properties of the Residuals. 1. The mean is exactly zero. 2. The correlation with X is exactly zero. 3. The variance is $\mathrm{Var}(Y.X) = \mathrm{Var}(Y)(1 - r_{xy}^2)$, so $\frac{\mathrm{Var}(Y.X)}{\mathrm{Var}(Y)} = 1 - r_{xy}^2$ is the proportion of variance in Y not explained by X, and $r_{xy}^2$ is the proportion of variance in Y explained by X.

  18. Residuals tell us stuff. 1. Partial relationships, because the residual is what remains in Y after adjusting for X. 2. Residual analysis to detect anomalies. 3. Detecting non-linearities. 4. Assessing the homoskedasticity assumption.
