BS2247 Introduction to Econometrics

Lecture 3: The Simple Regression Model
OLS, Algebraic Properties, Goodness-of-Fit

Dr. Kai Sun
Aston Business School
The Simple Regression Model: y = β0 + β1x + u

In the simple linear regression of y on x, we typically refer to y as the
◮ Dependent Variable, or
◮ Left-Hand Side Variable, or
◮ Explained Variable, or
◮ Regressand
The Simple Regression Model: y = β0 + β1x + u

We typically refer to x as the
◮ Independent Variable, or
◮ Right-Hand Side Variable, or
◮ Explanatory Variable, or
◮ Regressor, or
◮ Covariate, or
◮ Control Variable
The Simple Regression Model: y = β0 + β1x + u

◮ β0: intercept parameter, or constant term
◮ β1: slope parameter
◮ u: error term, or disturbance, or unobservable (there would be no Econometrics without it!)
A Simple Assumption

The expected value of u, the error term, in the population is 0:

E(u) = 0

"There is no error on average." But this assumption on its own is trivial: any nonzero mean of u could simply be absorbed into the intercept β0, so nothing is lost by imposing it.
A Crucial Assumption: Zero Conditional Mean

◮ Assume how u and x are related: E(u|x) = E(u), i.e. u is mean independent of x. This is a stronger requirement than x and u merely being uncorrelated.
◮ Intuitively, this means that knowing something about x does not give us any information about the average of u.
◮ It is crucial to assume that E(u|x) = 0, which, combined with E(u) = 0 above, is exactly what mean independence delivers.
◮ E(u|x) = 0 is the zero conditional mean assumption.
◮ E(u|x) = 0 implies that E(y|x) = β0 + β1x.

Proof: y = β0 + β1x + u, so u = y − β0 − β1x. Then
E(u|x) = E(y − β0 − β1x | x) = E(y|x) − β0 − β1x = 0,
hence E(y|x) = β0 + β1x. QED.

◮ E(y|x) is called the population regression function.
◮ E(y|x) is the expected value of y given a particular value of x.
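Two further consequences of E(u|x) = 0 are used repeatedly in the derivations below; both follow from the law of iterated expectations (LIE). A short derivation, added here for completeness:

```latex
\begin{align*}
E(u) &= E\big[E(u \mid x)\big] = E(0) = 0 && \text{(LIE)} \\
E(ux) &= E\big[E(ux \mid x)\big] = E\big[x \, E(u \mid x)\big] = E(x \cdot 0) = 0 \\
\operatorname{Cov}(u, x) &= E(ux) - E(u)E(x) = 0 - 0 \cdot E(x) = 0
\end{align*}
```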
Ordinary Least Squares (OLS)

◮ The basic idea of regression is to estimate the unknown population parameters, β0 and β1, from a sample.
◮ Let {(xi, yi): i = 1, ..., n} denote a random sample of size n from the population.
◮ For each observation in this sample, it will be the case that

yi = β0 + β1xi + ui
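A minimal simulation sketch of such a random sample (assuming Python with NumPy; the parameter values β0 = 1, β1 = 2 are hypothetical): when E(u|x) = 0 holds, the average of y within narrow bins of x tracks the population regression function β0 + β1x.

```python
import numpy as np

rng = np.random.default_rng(42)
beta0, beta1 = 1.0, 2.0               # hypothetical population parameters

n = 100_000
x = rng.uniform(0.0, 10.0, size=n)    # regressor
u = rng.normal(0.0, 1.0, size=n)      # error drawn independently of x, so E(u | x) = 0
y = beta0 + beta1 * x + u             # y_i = beta0 + beta1 * x_i + u_i for each i

# Average y within narrow bins of x approximates E(y | x) at the bin midpoints
edges = np.linspace(0.0, 10.0, 11)
for lo, hi in zip(edges[:-1], edges[1:]):
    mid = 0.5 * (lo + hi)
    mean_y = y[(x >= lo) & (x < hi)].mean()
    print(f"x ~ {mid:4.1f}: mean y = {mean_y:6.3f}, beta0 + beta1*x = {beta0 + beta1 * mid:6.3f}")
```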
Population regression line, sample data points and the associated error terms

[Figure: the population regression line E(y|x) = β0 + β1x, with sample points (x1, y1), ..., (x4, y4) plotted around it and the error terms u1, ..., u4 drawn as vertical deviations from the line.]
Deriving OLS Estimates

◮ Use the assumption E(u|x) = E(u) = 0. It implies that x and u are uncorrelated, i.e. Cov(u, x) = 0.
◮ Recall that E(ux) = E(u)E(x) + Cov(u, x). Since E(u) = 0 and Cov(u, x) = 0, we have E(ux) = 0.
Deriving OLS Estimates

◮ Use the two "algebraically feasible" assumptions:
(1) E(u) = 0
(2) E(ux) = 0
◮ Since u = y − β0 − β1x, these two assumptions become:
(1) E(y − β0 − β1x) = 0
(2) E[(y − β0 − β1x)x] = 0
◮ These are called population moment restrictions.
Deriving OLS Estimates using M.O.M.

Use the population moment restrictions to solve for the population parameters:
(1) E(yi − β0 − β1xi) = 0
(2) E[(yi − β0 − β1xi)xi] = 0
(These moment restrictions hold for each observation in the population.)
Deriving OLS Estimates using M.O.M.

This leaves us with two unknowns (β0 and β1) and two equations. To solve them, we can rewrite the first condition as

E(yi) = β0 + β1E(xi), or β0 = E(yi) − β1E(xi)
Deriving OLS Estimates using M.O.M.

Substituting β0 = E(yi) − β1E(xi) into the second population moment restriction gives

E[(yi − (E(yi) − β1E(xi)) − β1xi)xi] = 0

Solving for β1 gives

β1 = E[(xi − E(xi))(yi − E(yi))] / E[(xi − E(xi))²]
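The slide compresses the algebra behind "solving for β1"; a sketch of the intermediate steps:

```latex
\begin{align*}
0 &= E\big[\big((y_i - E(y_i)) - \beta_1 (x_i - E(x_i))\big)\, x_i\big] \\
  &= E\big[(y_i - E(y_i))\, x_i\big] - \beta_1\, E\big[(x_i - E(x_i))\, x_i\big] \\
\Rightarrow\quad \beta_1
  &= \frac{E\big[(x_i - E(x_i))(y_i - E(y_i))\big]}{E\big[(x_i - E(x_i))^2\big]}
   = \frac{\operatorname{Cov}(x_i, y_i)}{\operatorname{Var}(x_i)}.
\end{align*}
```

The last step uses the fact that demeaning the factor xi changes nothing, since E[(yi − E(yi))E(xi)] = E(xi)E[yi − E(yi)] = 0, and similarly in the denominator.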
Deriving OLS Estimates using M.O.M.

To emphasize that β0 and β1 are estimated from a finite sample, which is mostly the case in empirical work, we replace the population moments by their sample counterparts and put "hats" over the β's:

β̂0 = ȳ − β̂1x̄

β̂1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²
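A minimal sketch of these sample formulas in code (assuming Python with NumPy; the data-generating values β0 = 1, β1 = 2 are hypothetical), cross-checked against NumPy's built-in least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(5.0, 2.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n)   # hypothetical population: beta0 = 1, beta1 = 2

# Sample-counterpart (method of moments / OLS) formulas from the slide
xbar, ybar = x.mean(), y.mean()
beta1_hat = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
beta0_hat = ybar - beta1_hat * xbar

# Cross-check: np.polyfit solves the same least-squares problem
slope, intercept = np.polyfit(x, y, deg=1)
print(beta0_hat, beta1_hat)    # should match (intercept, slope) up to rounding
print(intercept, slope)
```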
Summary of OLS slope estimate

◮ The slope estimate is the sample covariance between x and y divided by the sample variance of x
◮ If x and y are positively correlated, the slope will be positive
◮ If x and y are negatively correlated, the slope will be negative
◮ We only need x to vary in our sample: if all xi are equal, the denominator Σi (xi − x̄)² is zero and the slope estimate is undefined
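The covariance-over-variance reading of the slope can be checked numerically (a sketch under the same hypothetical data-generating process as above):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(5.0, 2.0, size=500)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=500)

# Slope = sample covariance of (x, y) / sample variance of x.
# The degrees-of-freedom correction appears in both the numerator and
# the denominator, so it cancels in the ratio.
cov_xy = np.cov(x, y, ddof=1)[0, 1]
var_x = np.var(x, ddof=1)
print(cov_xy / var_x)    # equals the OLS slope estimate beta1_hat
```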