Applied Statistics Lecturer: Serena Arima Introduction Binary - PowerPoint PPT Presentation

Introduction Binary model Example Fit Test Applied Statistics Lecturer: Serena Arima

Introduction
Until now we have covered:
1. the linear regression model;
2. the Analysis of Variance model (ANOVA);
3. the Analysis of Covariance model (ANCOVA).
In practical applications, one often has to cope with phenomena that are of a discrete or mixed discrete-continuous nature.

Introduction
Suppose we want to explain whether a family owns a car or not. Let the sole explanatory variable be the family income. We have $n$ families, and the response variable is defined as
$$y_i = \begin{cases} 1 & \text{if family } i \text{ owns a car,} \\ 0 & \text{if family } i \text{ does not own a car,} \end{cases}$$
while $x_{i1}$ is the income of family $i$.

Introduction
We estimate the relationship between $y$ and $x_1$ using the linear model
$$y_i = \beta_0 + \beta_1 x_{i1} + \epsilon_i = x_i'\beta + \epsilon_i,$$
where $x_i = (1, x_{i1})'$ and $\beta = (\beta_0, \beta_1)'$. It seems reasonable to make the standard assumption that $E[\epsilon_i \mid x_i] = 0$, so that $E[y_i \mid x_i] = x_i'\beta$. This implies that
$$E[y_i \mid x_i] = 1 \cdot \Pr(y_i = 1 \mid x_i) + 0 \cdot \Pr(y_i = 0 \mid x_i) = \Pr(y_i = 1 \mid x_i) = x_i'\beta.$$

Introduction
We can use the OLS method to estimate the model, and we get
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1}.$$
[Figure: "Regression model". Vertical axis: Car (from -2 to 2); horizontal axis: Family (from 5 to 25).]

Introduction
Thus, the linear model implies that $x_i'\beta$ is a probability and should therefore lie between 0 and 1. This is only possible if the $x_i$ values are bounded and if certain restrictions on $\beta$ are satisfied. Usually this is hard to achieve in practice. In addition, because $y_i$ has only two possible outcomes (0 and 1), the error term has two possible outcomes as well.
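To make the range problem concrete, here is a minimal Python sketch that fits the line by OLS on a small made-up data set and checks whether the fitted values stay inside [0, 1]. The incomes and ownership indicators are hypothetical and are not taken from the lecture; the helper name `ols_fit` is likewise only illustrative.

```python
# Hypothetical data: family income (arbitrary units) and car ownership (0/1).
incomes = [5, 8, 10, 12, 15, 18, 20, 22, 25, 30]
owns_car = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def ols_fit(x, y):
    """Return (intercept, slope) of the simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


beta_0_hat, beta_1_hat = ols_fit(incomes, owns_car)
fitted = [beta_0_hat + beta_1_hat * xi for xi in incomes]

# Fitted "probabilities" that are not valid probabilities.
invalid = [(xi, round(fi, 3)) for xi, fi in zip(incomes, fitted) if fi < 0 or fi > 1]
print("intercept:", round(beta_0_hat, 4), "slope:", round(beta_1_hat, 4))
print("fitted values outside [0, 1]:", invalid)
```

On this toy sample the lowest incomes receive negative fitted values and the highest income receives a fitted value above 1, which is exactly the behaviour described above.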

Introduction
In particular, the distribution of the error term $\epsilon_i$, conditional on $x_i$, is
$$P(\epsilon_i = -x_i'\beta) = P(y_i = 0 \mid x_i) = 1 - x_i'\beta,$$
$$P(\epsilon_i = 1 - x_i'\beta) = P(y_i = 1 \mid x_i) = x_i'\beta.$$
Hence, the variance of the error term is
$$V(\epsilon_i \mid x_i) = x_i'\beta\,(1 - x_i'\beta).$$
The error term is therefore not Normal, and it is also heteroskedastic! Moreover, its variance depends on the model parameters $\beta$.
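The variance formula follows directly from the two-point distribution of the error term. As a short worked derivation, write $p_i = x_i'\beta$ (a shorthand introduced here only for brevity, not used on the slides):

```latex
\begin{align*}
E[\epsilon_i \mid x_i] &= (1 - p_i)(-p_i) + p_i\,(1 - p_i) = 0, \\
V(\epsilon_i \mid x_i) &= E[\epsilon_i^2 \mid x_i]
   = (1 - p_i)\,p_i^2 + p_i\,(1 - p_i)^2 \\
  &= p_i\,(1 - p_i)\,\bigl[\,p_i + (1 - p_i)\,\bigr] = p_i\,(1 - p_i).
\end{align*}
```

The variance is largest at $p_i = 0.5$ and shrinks towards 0 as $p_i$ approaches 0 or 1, so it necessarily changes from one observation to the next whenever $x_i'\beta$ does.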

Binary choice model
To overcome these problems, there exists a class of binary choice models designed to model the choice between two discrete alternatives. In general, we have
$$P(y_i = 1 \mid x_i) = G(x_i, \beta)$$
for some function $G(\cdot)$ that takes values in $[0, 1]$. Usually, one restricts attention to functions of the form
$$G(x_i, \beta) = F(x_i'\beta),$$
where $F$ is some distribution function.

Binary choice model
A common choice is the standard Normal distribution function
$$F(w) = \Phi(w) = \int_{-\infty}^{w} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} t^2\right) dt,$$
leading to the so-called probit model, in which
$$P(y_i = 1 \mid x_i) = \Phi(x_i'\beta) = \Phi(\beta_0 + \beta_1 x_{i1}).$$
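As an illustration of how the probit probability can be evaluated numerically, the sketch below computes $\Phi$ through the error function in Python's standard library, using the identity $\Phi(w) = \tfrac{1}{2}\,[1 + \mathrm{erf}(w/\sqrt{2})]$. The coefficient values are hypothetical and chosen only to show that the output always stays between 0 and 1.

```python
import math


def std_normal_cdf(w):
    """Standard Normal distribution function Phi(w), computed via math.erf."""
    return 0.5 * (1.0 + math.erf(w / math.sqrt(2.0)))


def probit_prob(x, beta_0, beta_1):
    """P(y = 1 | x) under a probit model with a single regressor."""
    return std_normal_cdf(beta_0 + beta_1 * x)


# Hypothetical coefficients (assumption, not estimates from the lecture).
beta_0, beta_1 = -2.0, 0.1
probs = [probit_prob(x, beta_0, beta_1) for x in (0, 20, 100)]
print(probs)
```

However large or small the linear index $\beta_0 + \beta_1 x_{i1}$ becomes, the resulting value is a valid probability, which is the property the linear model lacks.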

Binary choice model
Another choice is the standard logistic distribution function
$$F(w) = L(w) = \frac{e^{w}}{1 + e^{w}},$$
leading to the so-called logit model, in which
$$P(y_i = 1 \mid x_i) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} = \frac{\exp(\beta_0 + \beta_1 x_{i1})}{1 + \exp(\beta_0 + \beta_1 x_{i1})}.$$
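A minimal Python sketch of the logistic distribution function is given below. The branch on the sign of $w$ is an implementation choice made here to avoid overflow in `math.exp` for large positive arguments; it is not part of the lecture material, and the function names are illustrative.

```python
import math


def logistic_cdf(w):
    """L(w) = e^w / (1 + e^w), written so that math.exp never overflows."""
    if w >= 0:
        return 1.0 / (1.0 + math.exp(-w))
    e = math.exp(w)
    return e / (1.0 + e)


def logit_prob(x, beta_0, beta_1):
    """P(y = 1 | x) under a logit model with a single regressor."""
    return logistic_cdf(beta_0 + beta_1 * x)


# Extreme arguments still give values inside [0, 1].
print(logistic_cdf(-1000.0), logistic_cdf(0.0), logistic_cdf(1000.0))
```

Both branches are algebraically the same function, so the symmetry $L(w) + L(-w) = 1$ holds up to floating-point error.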

Binary choice model
This model can also be written as
$$\log \frac{P(y_i = 1 \mid x_i)}{1 - P(y_i = 1 \mid x_i)} = x_i'\beta.$$
The left-hand side is referred to as the log odds ratio. An odds ratio of 3 means that the odds of $y_i = 1$ are 3 times those of $y_i = 0$. Using this equality, the $\beta$ coefficients can be interpreted as describing the effect upon the odds ratio. For example, if $\beta_k = 0.1$, a unit increase of $x_{ik}$ increases the odds ratio by about 10% (since $e^{0.1} \approx 1.105$).
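The interpretation of $\beta_k$ can be checked numerically. The sketch below, with a hypothetical intercept and the slope 0.1 from the example, computes the odds before and after a unit increase of the regressor at several starting points and confirms that their ratio is $e^{0.1}$ regardless of where the increase starts.

```python
import math


def logit_prob(x, beta_0, beta_k):
    """P(y = 1 | x) under a logit model with a single regressor."""
    return 1.0 / (1.0 + math.exp(-(beta_0 + beta_k * x)))


def odds(p):
    """Odds of the event: p / (1 - p)."""
    return p / (1.0 - p)


beta_0 = -1.0   # hypothetical intercept (assumption)
beta_k = 0.1    # slope used in the example in the text

ratios = []
for x in (0.0, 2.0, 10.0):
    before = odds(logit_prob(x, beta_0, beta_k))
    after = odds(logit_prob(x + 1.0, beta_0, beta_k))
    ratios.append(after / before)

print(ratios, math.exp(beta_k))
```

The multiplicative effect on the odds is constant in $x$, whereas the effect on the probability itself is not.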

Binary choice model
Another common choice is the uniform distribution over the interval $[0, 1]$, with distribution function
$$F(w) = \begin{cases} 0 & w < 0, \\ w & 0 \le w \le 1, \\ 1 & w > 1. \end{cases}$$
This results in the so-called linear probability model, defined as
$$\Pr(y_i = 1 \mid x_i) = \begin{cases} 0 & \text{if } x_i'\beta < 0, \\ x_i'\beta & \text{if } 0 \le x_i'\beta \le 1, \\ 1 & \text{if } x_i'\beta > 1. \end{cases}$$
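In code, the linear probability model amounts to clipping the linear index to the unit interval. A minimal sketch, with hypothetical coefficients chosen so that all three cases occur:

```python
def uniform_cdf(w):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, w))


def lpm_prob(x, beta_0, beta_1):
    """P(y = 1 | x) under the linear probability model defined in the text."""
    return uniform_cdf(beta_0 + beta_1 * x)


# Hypothetical coefficients (assumption, for illustration only).
beta_0, beta_1 = -0.2, 0.05
for x in (0.0, 10.0, 40.0):
    print(x, lpm_prob(x, beta_0, beta_1))
```

The three inputs give an index below 0, inside $[0, 1]$, and above 1, so the printed probabilities are 0, the index itself, and 1 respectively.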

Binary choice model: interpretation
A main difficulty with these models is the interpretation of the parameters: apart from their signs, the coefficients in these binary choice models are best interpreted through the marginal effects of changes in the explanatory variables. For a continuous explanatory variable $x_{ik}$, the marginal effect is defined as the partial derivative, with respect to $x_{ik}$, of the probability that $y_i$ equals one.

Binary choice model: interpretation
For the probit model the marginal effect is
$$\frac{\partial \Phi(x_i'\beta)}{\partial x_{ik}} = \phi(x_i'\beta)\,\beta_k,$$
where $\phi$ denotes the standard normal density function, that is
$$\phi(w) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} w^2\right).$$
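The probit marginal effect can be verified against a numerical derivative. The sketch below uses hypothetical coefficients and a central finite difference; the step size `h` is an arbitrary choice made for this illustration.

```python
import math


def std_normal_cdf(w):
    """Standard Normal distribution function Phi(w)."""
    return 0.5 * (1.0 + math.erf(w / math.sqrt(2.0)))


def std_normal_pdf(w):
    """Standard Normal density phi(w)."""
    return math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)


# Hypothetical coefficients and evaluation point (assumptions).
beta_0, beta_k = -1.0, 0.3
x = 2.0

# Analytic marginal effect: phi(x'beta) * beta_k.
analytic = std_normal_pdf(beta_0 + beta_k * x) * beta_k

# Central finite-difference approximation of dPhi(x'beta)/dx.
h = 1e-5
numeric = (std_normal_cdf(beta_0 + beta_k * (x + h))
           - std_normal_cdf(beta_0 + beta_k * (x - h))) / (2.0 * h)

print(analytic, numeric)
```

Because $\phi(x_i'\beta)$ varies with $x_i$, the marginal effect differs across observations even though $\beta_k$ is fixed; only its sign is determined by $\beta_k$ alone.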

Binary choice model: interpretation
For the logit model the marginal effect is
$$\frac{\partial L(x_i'\beta)}{\partial x_{ik}} = \frac{e^{x_i'\beta}}{\left(1 + e^{x_i'\beta}\right)^2}\,\beta_k.$$
For the linear probability model the marginal effect is
$$\frac{\partial x_i'\beta}{\partial x_{ik}} = \beta_k$$
(or 0, when $x_i'\beta$ lies outside $[0, 1]$).
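The following sketch evaluates the logit and linear probability marginal effects and checks the logit one against a central finite difference. The coefficients, evaluation point, and step size are hypothetical; for the linear probability model the derivative is not defined exactly at the kinks 0 and 1, and returning $\beta_k$ there is a convention adopted here.

```python
import math


def logistic_cdf(w):
    """L(w) = e^w / (1 + e^w)."""
    return 1.0 / (1.0 + math.exp(-w))


def logit_marginal_effect(index, beta_k):
    """e^index / (1 + e^index)^2 * beta_k, with index = x'beta."""
    e = math.exp(index)
    return e / (1.0 + e) ** 2 * beta_k


def lpm_marginal_effect(index, beta_k):
    """beta_k when the index lies in [0, 1], and 0 otherwise."""
    return beta_k if 0.0 <= index <= 1.0 else 0.0


# Hypothetical coefficients and evaluation point (assumptions).
beta_0, beta_k = -1.0, 0.4
x = 1.5
index = beta_0 + beta_k * x

analytic = logit_marginal_effect(index, beta_k)
h = 1e-5
numeric = (logistic_cdf(beta_0 + beta_k * (x + h))
           - logistic_cdf(beta_0 + beta_k * (x - h))) / (2.0 * h)

print(analytic, numeric)
print(lpm_marginal_effect(0.5, beta_k), lpm_marginal_effect(1.7, beta_k))
```

At an index of 0 the logit marginal effect reaches its maximum, $\beta_k / 4$, which gives a quick upper bound on the effect of a regressor on the probability.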

Example 1: probit model
Suppose we have $n = 2380$ individuals and the following variables have been recorded (in 1920-1940):
Loan: binary variable, equal to 1 if the bank loan is rejected and 0 if it is granted;
Income: monthly income of each individual;
Race: race of each individual (0 = white, 1 = black) (R);
LoanPayment: ratio between income and loan payment (LP), income / payment.

Example 1: probit model
We would like to study whether the rejection of a loan is related to other variables, such as income, race and the income/payment ratio. The response variable is binary, and the explanatory variables are both continuous and discrete. Let's try to interpret different models!

Example 0: linear model
We start with a simple linear model. The estimated model is
$$\hat{P}(\text{loanRejection}_i = 1 \mid LP_i) = -0.07991 + 0.60353\, LP_i.$$
Increasing the income/loan ratio by 0.1 increases the probability that the loan is rejected by about 0.06.
What is the probability that the loan is rejected when the income/loan ratio is 0.5? The predicted probability is $-0.07991 + 0.60353 \cdot 0.5 \approx 0.22$.
What is the probability that the loan is rejected when the income/loan ratio is 0.01? The predicted probability is $-0.07991 + 0.60353 \cdot 0.01 \approx -0.074$ (!!!), a negative value that cannot be a probability.
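The arithmetic of this example can be reproduced with a few lines of Python. The intercept and slope are the estimates quoted in the text; the function and constant names are illustrative only.

```python
# Estimated coefficients of the linear model, as quoted in the text.
INTERCEPT = -0.07991
SLOPE = 0.60353


def predicted_rejection_prob(lp):
    """Fitted value of the linear model for a given ratio LP."""
    return INTERCEPT + SLOPE * lp


effect_of_0_1 = SLOPE * 0.1            # change in fitted value for +0.1 in LP
p_at_half = predicted_rejection_prob(0.5)
p_at_small = predicted_rejection_prob(0.01)

print(round(effect_of_0_1, 3), round(p_at_half, 3), round(p_at_small, 3))
```

The last fitted value is negative, which illustrates why the linear model is unsuitable for a binary response.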
