applied statistics
Applied Statistics. Lecturer: Serena Arima

  1. Applied Statistics. Lecturer: Serena Arima. Outline: Introduction, Binary model, Example, Fit, Test.

  2. Introduction. Until now we have covered: (1) the linear regression model; (2) the Analysis of Variance model (ANOVA); (3) the Analysis of Covariance model (ANCOVA). In practical applications, one often has to cope with phenomena that are of a discrete or mixed discrete-continuous nature.

  4. Introduction. Suppose we want to explain whether a family owns a car or not. Let the sole explanatory variable be the family income. We have $n$ families, and the response variable is defined as $y_i = 1$ if family $i$ owns a car and $y_i = 0$ if family $i$ does not own a car; $x_{i1}$ is the income of family $i$.

  5. Introduction. We estimate the relationship between $y$ and $x_1$ using the linear model $y_i = \beta_0 + \beta_1 x_{i1} + \epsilon_i = x_i'\beta + \epsilon_i$. It seems reasonable to make the standard assumption that $E[\epsilon_i \mid x_i] = 0$, so that $E[y_i \mid x_i] = x_i'\beta$. This implies that $E[y_i \mid x_i] = 1 \cdot \Pr(y_i = 1 \mid x_i) + 0 \cdot \Pr(y_i = 0 \mid x_i) = \Pr(y_i = 1 \mid x_i) = x_i'\beta$.

  7. Introduction. We can use the OLS method to estimate the model, and we get $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1}$. [Figure: "Regression model"; vertical axis: Car, horizontal axis: Family.]

  8. Introduction. Thus, the linear model implies that $x_i'\beta$ is a probability and should therefore lie between 0 and 1. This is only possible if the $x_i$ values are bounded and if certain restrictions on $\beta$ are satisfied. Usually this is hard to achieve in practice. In addition, because $y_i$ has only two possible outcomes (0 and 1), the error term has two possible outcomes as well.
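
The range problem can be seen on a toy data set. The sketch below is only an illustration: the incomes and ownership indicators are made-up values, not the data behind the lecture's plot, and `ols_simple` is a hypothetical helper implementing the closed-form OLS estimates for a single regressor.

```python
# Hypothetical data (not from the lecture): incomes in arbitrary units
# and car ownership indicators (1 = owns a car, 0 = does not).
incomes = [5.0, 8.0, 10.0, 12.0, 15.0, 18.0, 20.0, 25.0]
owns_car = [0, 0, 0, 1, 0, 1, 1, 1]


def ols_simple(x, y):
    """Closed-form OLS estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = ols_simple(incomes, owns_car)
fitted = [b0 + b1 * xi for xi in incomes]

# Fitted "probabilities" that fall outside the unit interval.
outside_unit_interval = [round(f, 3) for f in fitted if f < 0.0 or f > 1.0]
print("intercept:", round(b0, 4), "slope:", round(b1, 4))
print("fitted values outside [0, 1]:", outside_unit_interval)
```

With these made-up numbers the lowest income gets a negative fitted value and the highest income a fitted value above one, which is exactly the difficulty described above.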

  10. Introduction. In particular, the distribution of the error term $\epsilon_i$ is $P(\epsilon_i = -x_i'\beta) = P(y_i = 0 \mid x_i) = 1 - x_i'\beta$ and $P(\epsilon_i = 1 - x_i'\beta) = P(y_i = 1 \mid x_i) = x_i'\beta$. Hence, the variance of the error term is $V(\epsilon_i \mid x_i) = x_i'\beta \, (1 - x_i'\beta)$. The error term is therefore not Normal, and it is also heteroskedastic! Moreover, its variance depends on the model parameters $\beta$.
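
The variance formula can be checked directly from the two-point distribution of the error. In this short sketch, `p` stands for $x_i'\beta$ and `error_moments` is a hypothetical helper name; the values of `p` are arbitrary illustrations.

```python
def error_moments(p):
    """Mean and variance of the error term when p = x_i' beta = P(y_i = 1 | x_i).

    The error takes the value -p with probability 1 - p (when y_i = 0)
    and the value 1 - p with probability p (when y_i = 1).
    """
    outcomes = [(-p, 1.0 - p), (1.0 - p, p)]  # (error value, probability)
    mean = sum(e * w for e, w in outcomes)
    var = sum((e - mean) ** 2 * w for e, w in outcomes)
    return mean, var


for p in (0.1, 0.5, 0.9):
    mean, var = error_moments(p)
    # The variance changes with p, i.e. with x_i: heteroskedasticity.
    print(p, round(var, 6), round(p * (1.0 - p), 6))
```

The mean is zero for every `p`, while the variance equals `p * (1 - p)` and so varies across observations.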

  11. Binary choice model. To overcome these problems, there exists a class of binary choice models designed to model the choice between two discrete alternatives. In general, we have $P(y_i = 1 \mid x_i) = G(x_i, \beta)$ for some function $G(\cdot)$ that takes values in $[0, 1]$. Usually, one restricts attention to functions of the form $G(x_i, \beta) = F(x_i'\beta)$, where $F$ is some distribution function.

  13. Binary choice model. A common choice is the standard Normal distribution function $F(w) = \Phi(w) = \int_{-\infty}^{w} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} t^2\right) dt$, leading to the so-called probit model, in which $P(y_i = 1 \mid x_i) = \Phi(x_i'\beta) = \Phi(\beta_0 + \beta_1 x_{i1})$.

  14. Binary choice model. Another choice is the standard logistic distribution function $F(w) = L(w) = \frac{e^w}{1 + e^w}$, leading to the so-called logit model, in which $P(y_i = 1 \mid x_i) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} = \frac{\exp(\beta_0 + \beta_1 x_{i1})}{1 + \exp(\beta_0 + \beta_1 x_{i1})}$.
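
As a rough illustration, the probit and logit probabilities can be computed with the Python standard library. The coefficient values `beta0` and `beta1` and the income values below are hypothetical, chosen only to show that both functions always return values inside the unit interval.

```python
import math
from statistics import NormalDist


def probit_prob(index):
    """P(y = 1 | x) under the probit model: standard Normal CDF at x' beta."""
    return NormalDist().cdf(index)


def logit_prob(index):
    """P(y = 1 | x) under the logit model.

    Uses 1 / (1 + e^{-w}), which is algebraically equal to e^w / (1 + e^w).
    """
    return 1.0 / (1.0 + math.exp(-index))


# Hypothetical coefficients and incomes, for illustration only.
beta0, beta1 = -2.0, 0.15
for income in (5.0, 15.0, 25.0):
    index = beta0 + beta1 * income
    print(income, round(probit_prob(index), 4), round(logit_prob(index), 4))
```

Both functions are increasing in the index $x_i'\beta$ and bounded by 0 and 1, so neither can produce the out-of-range predictions of the linear model.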

  15. Binary choice model. This model can also be written as $\log \frac{P(y_i = 1 \mid x_i)}{1 - P(y_i = 1 \mid x_i)} = x_i'\beta$. The left-hand side is referred to as the log odds ratio. An odds ratio of 3 means that the odds of $y_i = 1$ are 3 times those of $y_i = 0$. Using this equality, the $\beta$ coefficients can be interpreted as describing the effect on the odds ratio. For example, if $\beta_k = 0.1$, a unit increase in $x_{ik}$ increases the odds ratio by about 10%.
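
The "about 10%" figure follows from $e^{0.1} \approx 1.105$. The sketch below checks this numerically; the starting index value `0.4` is an arbitrary assumption, and the helper names are hypothetical.

```python
import math


def logit_prob(index):
    """Logit probability P(y = 1 | x) for a given index x' beta."""
    return 1.0 / (1.0 + math.exp(-index))


def odds(p):
    """Odds of y = 1 against y = 0 for a probability p."""
    return p / (1.0 - p)


beta_k = 0.1
index = 0.4  # arbitrary starting value of x' beta

# A unit increase in x_k raises the index by beta_k.
odds_before = odds(logit_prob(index))
odds_after = odds(logit_prob(index + beta_k))
ratio = odds_after / odds_before

print(round(ratio, 4), round(math.exp(beta_k), 4))
```

The ratio of the odds equals `exp(beta_k)` whatever the starting index, which is why the effect on the odds, unlike the effect on the probability, does not depend on $x_i$.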

  16. Binary choice model. Another common choice is the uniform distribution over the interval $[0, 1]$, with distribution function $F(w) = 0$ for $w < 0$, $F(w) = w$ for $0 \le w \le 1$, and $F(w) = 1$ for $w > 1$. This results in the so-called linear probability model, defined as $\Pr(y_i = 1 \mid x_i) = 0$ if $x_i'\beta < 0$; $\Pr(y_i = 1 \mid x_i) = x_i'\beta$ if $0 \le x_i'\beta \le 1$; $\Pr(y_i = 1 \mid x_i) = 1$ if $x_i'\beta > 1$.
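
A minimal sketch of this piecewise definition, assuming hypothetical coefficients and incomes: the uniform distribution function simply clips the linear index to the unit interval.

```python
def uniform_cdf(w):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, w))


def lpm_probability(x, beta):
    """Linear probability model: uniform CDF applied to the index x' beta."""
    index = sum(xj * bj for xj, bj in zip(x, beta))
    return uniform_cdf(index)


# Hypothetical coefficients (intercept, income effect), for illustration only.
beta = (-0.3, 0.05)
for income in (2.0, 12.0, 30.0):
    x = (1.0, income)  # the leading 1.0 multiplies the intercept
    print(income, round(lpm_probability(x, beta), 4))
```

Here the three incomes give an index below 0, inside the unit interval, and above 1, so the three branches of the definition are each exercised once.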

  17. Binary choice model: interpretation. A main difficulty with these models is the interpretation of the parameters: apart from their signs, the coefficients in these binary choice models are best interpreted through the marginal effects of changes in the explanatory variables. For a continuous explanatory variable $x_{ik}$, the marginal effect is defined as the partial derivative, with respect to $x_{ik}$, of the probability that $y_i$ equals one.

  19. Binary choice model: interpretation. For the probit model the marginal effect is $\frac{\partial \Phi(x_i'\beta)}{\partial x_{ik}} = \phi(x_i'\beta) \, \beta_k$, where $\phi$ denotes the standard normal density function, that is, $\phi(w) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} w^2\right)$.

  20. Binary choice model: interpretation. For the logit model the marginal effect is $\frac{\partial L(x_i'\beta)}{\partial x_{ik}} = \frac{e^{x_i'\beta}}{\left(1 + e^{x_i'\beta}\right)^2} \, \beta_k$. For the linear probability model the marginal effect is $\frac{\partial x_i'\beta}{\partial x_{ik}} = \beta_k$ (or 0 when $x_i'\beta$ lies outside $[0, 1]$).

  21. Example 1: probit model. Suppose we have $n = 2380$ individuals and the following variables have been recorded (in 1920-1940): Loan: binary variable equal to 1 if the bank loan is rejected, 0 if it is approved; Income: monthly income of each individual; Race: race of each individual (0 = white, 1 = black) (R); LoanPayment: ratio of income to loan payment (LP), income/payment.

  22. Example 1: probit model. We would like to study whether the rejection of a loan is related to other variables, such as the income, the race and the income/payment ratio. The response variable is binary, and the explanatory variables are both continuous and discrete. Let's try to interpret different models!

  24. Example 0: linear model. We start with a simple linear model. The estimated model is $P(\text{loanRejection} = 1 \mid LP) = -0.07991 + 0.60353 \, LP_i$. Increasing the income/loan ratio by 0.1 increases the predicted probability that the loan is rejected by about 0.06. What is the probability that the loan is rejected when the income/loan ratio is 0.5? The predicted probability is $-0.07991 + 0.60353 \cdot 0.5 = 0.22$. What is the probability that the loan is rejected when the income/loan ratio is 0.01? The predicted probability is $-0.07991 + 0.60353 \cdot 0.01 = -0.074$ (!!!)
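
The arithmetic of this example can be reproduced in a few lines. The two coefficients are the estimates quoted above; the function and constant names are hypothetical.

```python
# Estimated coefficients of the linear model quoted in the example.
INTERCEPT = -0.07991
SLOPE = 0.60353


def lpm_rejection_probability(lp_ratio):
    """Predicted rejection probability from the fitted linear model."""
    return INTERCEPT + SLOPE * lp_ratio


for lp_ratio in (0.5, 0.01):
    prediction = lpm_rejection_probability(lp_ratio)
    flag = "" if 0.0 <= prediction <= 1.0 else "  <- not a valid probability"
    print(lp_ratio, round(prediction, 3), flag)

# Effect of raising the ratio by 0.1: the slope times 0.1, about 0.06.
print(round(SLOPE * 0.1, 3))
```

The second prediction is negative, which shows the same range problem on estimated coefficients rather than on a hypothetical data set.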
