Getting to Regression: The Workhorse of Quantitative Political Analysis


  1. Correlation | Regression. Getting to Regression: The Workhorse of Quantitative Political Analysis. Department of Government, London School of Economics and Political Science

  2. Outline: (1) Correlation, (2) Regression

  5. Correlation as Measure of Bivariate Relationship
     Covariance:
     $$\mathrm{Cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
     Correlation:
     $$\mathrm{Corr}(X, Y) = r_{x,y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(n - 1)\, s_x s_y}$$
     where $s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$, and $s_y$ is defined analogously
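
The two formulas on this slide can be checked by hand in a few lines. The following is a minimal sketch in plain Python; the helper names (`mean`, `covariance`, `std_dev`, `correlation`) and the data are illustrative assumptions, not part of the slides. The last example uses a made-up quadratic relationship to show that correlation only picks up linear association.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    x_bar, y_bar = mean(x), mean(y)
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return cross / (len(x) - 1)


def std_dev(values):
    """Sample standard deviation (square root of the sample variance)."""
    return sqrt(covariance(values, values))


def correlation(x, y):
    """Pearson correlation: covariance rescaled by both standard deviations."""
    return covariance(x, y) / (std_dev(x) * std_dev(y))


# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(covariance(x, y))   # 1.5
print(correlation(x, y))  # about 0.775

# A perfect but nonlinear (quadratic) relationship has zero correlation.
x_sym = [-2, -1, 0, 1, 2]
y_sq = [xi ** 2 for xi in x_sym]
print(correlation(x_sym, y_sq))  # 0.0
```

Dividing by both standard deviations is what bounds the result between -1 and 1, which is why correlation is easier to compare across variables than the covariance.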

  6. Correlation is linear! [Figure; source: Wikimedia]

  7. Guess the Correlation!
     1. Go to: http://guessthecorrelation.com/
     2. Play a few rounds

  8. Outline: (1) Correlation, (2) Regression

  11. Regression
      Definition: a statistical method for measuring the relationships between one variable and many other variables
      Uses of Regression:
      1. Description
      2. Prediction
      3. Causal Inference
      Ordinary least squares (OLS) regression

  14. Interpretations of OLS
      1. Line (or surface) of best fit
      2. Ratio of $\mathrm{Cov}(X, Y)$ and $\mathrm{Var}(X)$
      3. Minimizing residual sum of squares (SSR)
      4. Estimating unit-level causal effect

  16. Bivariate Regression I
      - $Y$ is continuous
      - $X$ is a randomized treatment indicator/dummy $(0, 1)$
      - How do we know if $X$ had an effect on $Y$?
      - Look at the outcome mean-difference: $E[Y \mid X = 1] - E[Y \mid X = 0]$

  18. Bivariate Regression I
      - The mean difference ($E[Y \mid X = 1] - E[Y \mid X = 0]$) is the regression line slope
      - Slope ($\beta$) is defined as $\frac{\Delta Y}{\Delta X}$
      - $\Delta Y = E[Y \mid X = 1] - E[Y \mid X = 0]$
      - $\Delta X = 1 - 0 = 1$
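
The claim that the mean difference equals the regression slope for a 0/1 treatment can be verified numerically. This is a minimal sketch under stated assumptions: the six observations are made up for illustration (three control, three treated), and `mean` and `ols_slope` are hypothetical helper names.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def ols_slope(x, y):
    """Bivariate OLS slope: sum of cross-deviations over sum of squared x-deviations."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical data: x is a 0/1 treatment indicator, y is the outcome.
x = [0, 0, 0, 1, 1, 1]
y = [1, 2, 3, 4, 5, 6]

control_mean = mean([yi for xi, yi in zip(x, y) if xi == 0])
treated_mean = mean([yi for xi, yi in zip(x, y) if xi == 1])
mean_difference = treated_mean - control_mean

slope = ols_slope(x, y)
intercept = mean(y) - slope * mean(x)

print(mean_difference)  # 3.0
print(slope)            # 3.0
print(intercept)        # 2.0, the control-group mean
```

With a binary regressor the intercept is the control-group mean and the slope is the difference in group means, so the fitted line simply connects the two means.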

  21. Three Equations
      1. Population: $Y = \beta_0 + \beta_1 X \; (+\, \epsilon)$
      2. Sample estimate: $y = \hat{\beta}_0 + \hat{\beta}_1 x + e$
      3. Unit: $y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + e_i = \bar{y}_0 + (y_{1i} - y_{0i}) x_i + (y_{0i} - \bar{y}_0)$

  22-28. [Figure, built up over seven slides: plot of $y$ (axis 1 to 7) against a binary $x$ (0, 1). The builds mark the group means $\bar{y}_0$ and $\bar{y}_1$, the rise $\Delta y = \beta_1$ over the run $\Delta x$, and the intercept $\hat{\beta}_0$; then the fitted line $y = \hat{\beta}_0 + \hat{\beta}_1 x$, the example $\hat{y} = 2 + 3x$, and the unit-level equation $y_i = 2 + 3x_i + e_i$ with a residual $e_i$.]

  29. Questions?

  30. Continuous X
      - If $x$ is continuous, the calculation is more complicated
      - Rather than $\beta_1$ being the mean-difference in outcomes, it is the slope across all values of $x$
      - $\hat{\beta}_1 = \mathrm{Cov}(x, y) / \mathrm{Var}(x)$

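  31. Calculations

      | $x_i$ | $y_i$ | $x_i - \bar{x}$ | $y_i - \bar{y}$ | $(x_i - \bar{x})(y_i - \bar{y})$ | $(x_i - \bar{x})^2$ |
      |---|---|---|---|---|---|
      | 1 | 1 | ? | ? | ? | ? |
      | 2 | 5 | ? | ? | ? | ? |
      | 3 | 3 | ? | ? | ? | ? |
      | 4 | 6 | ? | ? | ? | ? |
      | 5 | 2 | ? | ? | ? | ? |
      | 6 | 7 | ? | ? | ? | ? |
      | $\bar{x}$ | $\bar{y}$ | | | $\mathrm{Cov}(x, y)$ | $\mathrm{Var}(x)$ |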

  32-35. [Figure, built up over four slides: scatterplot of $y$ (axis 1 to 7) against $x$ (axis 0 to 7), with the means $\bar{x}$ and $\bar{y}$ marked.]

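  38. Calculations
      If $x$ is continuous, the calculation is more complicated: $\hat{\beta}_1 = \mathrm{Cov}(x, y) / \mathrm{Var}(x)$

      | $x_i$ | $y_i$ | $x_i - \bar{x}$ | $y_i - \bar{y}$ | $(x_i - \bar{x})(y_i - \bar{y})$ | $(x_i - \bar{x})^2$ |
      |---|---|---|---|---|---|
      | 1 | 1 | -2.5 | -3 | 7.5 | 6.25 |
      | 2 | 5 | -1.5 | +1 | -1.5 | 2.25 |
      | 3 | 3 | -0.5 | -1 | 0.5 | 0.25 |
      | 4 | 6 | +0.5 | +2 | 1.0 | 0.25 |
      | 5 | 2 | +1.5 | -2 | -3.0 | 2.25 |
      | 6 | 7 | +2.5 | +3 | 7.5 | 6.25 |
      | $\bar{x} = 3.5$ | $\bar{y} = 4$ | | | $\sum = 12$ | $\sum = 17.5$ |

      Dividing each sum by $n - 1 = 5$ gives $\mathrm{Cov}(x, y) = 2.4$ and $\mathrm{Var}(x) = 3.5$, so $\hat{\beta}_1 = 2.4 / 3.5 = 12 / 17.5 \approx 0.6857$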
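
The table can be reproduced column by column from the six $(x_i, y_i)$ pairs it lists. The following is a small sketch in plain Python; the variable names are illustrative assumptions. It computes each deviation column, sums the last two columns, and takes their ratio as the slope.

```python
# Data from the calculation table: x runs from 1 to 6 with the listed y values.
x = [1, 2, 3, 4, 5, 6]
y = [1, 5, 3, 6, 2, 7]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# One tuple per table row:
# (x_i, y_i, x_i - x_bar, y_i - y_bar, cross-product, squared x-deviation)
rows = []
for xi, yi in zip(x, y):
    dx, dy = xi - x_bar, yi - y_bar
    rows.append((xi, yi, dx, dy, dx * dy, dx ** 2))

sum_cross = sum(row[4] for row in rows)  # numerator of the covariance
sum_sq = sum(row[5] for row in rows)     # numerator of the variance

cov_xy = sum_cross / (n - 1)
var_x = sum_sq / (n - 1)
slope = cov_xy / var_x

for row in rows:
    print(row)
print("x_bar:", x_bar, "y_bar:", y_bar)
print("sum of cross-products:", sum_cross, "sum of squares:", sum_sq)
print("slope:", round(slope, 4))
```

Because the same $n - 1$ divides both the covariance and the variance, it cancels in the ratio, so the slope can be read directly off the two column sums.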

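  42. Intercept $\hat{\beta}_0$
      - Simple formula: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
      - Intuition: the OLS fit always runs through the point $(\bar{x}, \bar{y})$
      - Ex.: with $\bar{x} = 3.5$, $\bar{y} = 24 / 6 = 4$, and $\hat{\beta}_1 = 12 / 17.5 \approx 0.6857$ from the example data, $\hat{\beta}_0 = 4 - 0.6857 \times 3.5 = 1.6$
      - $\hat{y} = 1.6 + 0.6857x$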
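
The intercept formula and the "runs through the means" intuition can both be checked on the six $(x_i, y_i)$ pairs from the calculation table. This is a minimal sketch; `fit_line` is a hypothetical helper name, not something defined in the slides.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the bivariate OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the "simple formula" for the intercept
    return intercept, slope


# Data from the calculation table.
x = [1, 2, 3, 4, 5, 6]
y = [1, 5, 3, 6, 2, 7]

intercept, slope = fit_line(x, y)
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# The fitted value at x_bar should equal y_bar.
fitted_at_mean = intercept + slope * x_bar

print("intercept:", round(intercept, 4))
print("slope:", round(slope, 4))
print("fitted value at x_bar:", round(fitted_at_mean, 4), "y_bar:", y_bar)
```

The check works for any data with variation in $x$, since substituting $\bar{x}$ into the fitted line gives $\bar{y} - \hat{\beta}_1 \bar{x} + \hat{\beta}_1 \bar{x} = \bar{y}$.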

  43. [Figure: scatterplot of $y$ (axis 1 to 7) against $x$ (axis 0 to 7), with the means $\bar{x}$ and $\bar{y}$ marked.]

  46. Systematic versus unsystematic components
      - Systematic: regression line (slope)
        - Linear regression estimates the conditional means of the population data (i.e., $E[Y \mid X]$)
      - Unsystematic: the error term is the deviation of observations from the line
        - The difference between each value $y_i$ and $\hat{y}_i$ is the residual: $e_i$
        - OLS produces an estimate of $\beta$ that minimizes the residual sum of squares
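
The minimization claim can be illustrated numerically: nudging the OLS intercept or slope in any direction should only increase the residual sum of squares. The sketch below uses the six $(x_i, y_i)$ pairs from the calculation table; `fit_line` and `ssr` are hypothetical helper names, and the step size of 0.1 is an arbitrary choice.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the bivariate OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def ssr(b0, b1, x, y):
    """Residual sum of squares for the line y = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Data from the calculation table.
x = [1, 2, 3, 4, 5, 6]
y = [1, 5, 3, 6, 2, 7]

b0, b1 = fit_line(x, y)

# Residuals: the unsystematic part left over after the fitted line.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
best = ssr(b0, b1, x, y)

# SSR for lines slightly different from the OLS fit.
steps = (-0.1, 0, 0.1)
nearby = [
    ssr(b0 + d0, b1 + d1, x, y)
    for d0 in steps
    for d1 in steps
    if (d0, d1) != (0, 0)
]

print("SSR at the OLS estimates:", round(best, 4))
print("smallest SSR among nearby lines:", round(min(nearby), 4))
print("sum of residuals:", round(sum(residuals), 10))
```

A grid of nearby lines is only a spot check, not a proof; the residuals summing to zero is a general property of an OLS fit that includes an intercept.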

  50. Why are there residuals?
      - Fundamental randomness
      - Measurement error
      - Omitted variables

  56. Minimum Mathematical Requirements
      1. Do we need variation in $X$? Yes, otherwise we would be dividing by zero
      2. Do we need variation in $Y$? No, $\hat{\beta}_1$ can equal zero ($\mathrm{Cov}(X, Y) = 0$ when $Y$ is constant)
      3. How many observations do we need? $n \geq k$, where $k$ is the number of parameters to be estimated
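
Each of the three requirements can be seen in a small experiment. The sketch below is illustrative only: `fit_line` is a hypothetical helper, all data are made up, and raising `ValueError` when $x$ has no variation is a design choice of this sketch rather than anything prescribed by the slides.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the bivariate OLS fit.

    Raises ValueError when x has no variation, because the slope
    formula would then divide by zero.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("no variation in x: the slope is undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# 1. No variation in x: the fit fails.
try:
    fit_line([2, 2, 2], [1, 5, 3])
    constant_x_fits = True
except ValueError:
    constant_x_fits = False
print("constant x can be fitted:", constant_x_fits)  # False

# 2. No variation in y: the fit works and the slope is zero.
flat_fit = fit_line([1, 2, 3], [4, 4, 4])
print("constant y:", flat_fit)  # (4.0, 0.0)

# 3. Two observations for two parameters (intercept and slope):
#    the line passes exactly through both points.
two_point_fit = fit_line([0, 1], [2, 5])
print("two points:", two_point_fit)  # (2.0, 3.0)
```

A single observation fails for the same reason as a constant $x$: with $n = 1 < k = 2$ there is no variation in $x$ to divide by.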
