Logistic Regression for Nominal Response Variables



  1. Logistic Regression for Nominal Response Variables, Edpsy/Psych/Soc 589. Carolyn J. Anderson, Department of Educational Psychology, University of Illinois at Urbana-Champaign. © Board of Trustees, University of Illinois. Spring 2017

  2. Introduction Multinomial/Baseline SAS Inference Grouped Data Latent Variable Conditional Model Mixed model
     Outline
     ◮ Introduction and Extending binary model
     ◮ Nominal Responses (baseline model)
     ◮ SAS
     ◮ Inference
     ◮ Grouped Data
     ◮ Latent variable interpretation
     ◮ Discrete choice model (“conditional” model)
     C.J. Anderson (Illinois) Logistic Regression for Nominal Responses Spring 2017 2.1/ 98

  3. Additional References
     General References:
     ◮ Agresti, A. (2013). Categorical Data Analysis, 3rd edition. NY: Wiley.
     ◮ Long, J.S. (1997). Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage.
     ◮ Powers, D.A. & Xie, Y. (2000). Statistical Methods for Categorical Data Analysis. San Diego, CA: Academic Press.
     Fitting (Conditional) Multinomial Models using SAS:
     ◮ SAS Institute (1995). Logistic Regression Examples Using the SAS System (version 6). Cary, NC: SAS Institute.
     ◮ Kuhfeld, W.F. (2001). Marketing Research Methods in the SAS System, Version 8.2 Edition, TS-650. Cary, NC: SAS Institute. (reports TS-650A – TS-560I).

  4. Additional References (continued)
     Some on my web-site:
     ◮ http://faculty.education.illinois.edu/cja/ Handbook of Quantitative Psychology
     ◮ http://faculty.education.illinois.edu/cja/BestPractices/index.html
     ◮ Course web-site is most up-to-date.

  5. Situation
     ◮ Situation:
        ◮ One response variable Y with J levels.
        ◮ One or more explanatory or predictor variables. The predictor variables may be quantitative, qualitative, or both.
     ◮ Model: “Multinomial” logistic regression.
     ◮ What if you have multiple predictor or explanatory variables? Do they describe the individuals, describe the categories, or both?

  6. Differences w/rt Binary Logistic Regression
     There are 3 basic differences:
     ◮ Forming logits.
     ◮ The distribution.
     ◮ Connections with other models (not mentioned before).

  7. Forming Logits
     ◮ When J = 2, Y is dichotomous and we can model logs of odds that an event occurs or does not occur. There is only 1 logit that we can form:
        $\text{logit}(\pi) = \log\left(\frac{\pi}{1 - \pi}\right)$
     ◮ When J > 2, . . .
        ◮ We have a multicategory or “polytomous” or “polychotomous” response variable.
        ◮ There are J(J − 1)/2 logits (odds) that we can form, but only (J − 1) are non-redundant.
        ◮ There are different ways to form a set of (J − 1) non-redundant logits.
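
The counting claim on this slide can be checked numerically. The sketch below is illustrative only: the probabilities are made-up values for a hypothetical J = 4 response, and the last category is arbitrarily chosen as the baseline. It forms all J(J − 1)/2 pairwise logits and confirms that each one is a difference of two of the J − 1 baseline logits.

```python
from itertools import combinations
from math import log, isclose

# Hypothetical category probabilities for a J = 4 response (illustrative values only).
probs = [0.1, 0.2, 0.3, 0.4]
J = len(probs)

# All pairwise logits log(pi_j / pi_k) for j < k: there are J(J - 1)/2 of them.
pairwise = {(j, k): log(probs[j] / probs[k]) for j, k in combinations(range(J), 2)}

# One non-redundant set: the J - 1 logits against the last category (the baseline).
baseline = [log(probs[j] / probs[J - 1]) for j in range(J - 1)]
baseline.append(0.0)  # log(pi_J / pi_J) = 0, appended only for convenient indexing

# Every pairwise logit equals a difference of two baseline logits.
recovered = {(j, k): baseline[j] - baseline[k] for (j, k) in pairwise}

print(len(pairwise), "pairwise logits;", J - 1, "non-redundant baseline logits")
for pair, value in pairwise.items():
    print(pair, round(value, 4), round(recovered[pair], 4))
```

Any other category could serve as the baseline; the choice changes the individual logits but not the set of probabilities they imply.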

  8. How to “dichotomize” the response Y?
     The most common ones:
     ◮ Nominal Y
        ◮ “Baseline” logit models or “Multinomial” logistic regression.
        ◮ “Conditional” or “Multinomial” logit models.
     ◮ Ordinal Y
        ◮ Cumulative logits (Proportional Odds).
        ◮ Adjacent categories.
        ◮ Continuation ratios.

  9. The Multinomial Distribution
     ◮ $(Y_1, Y_2, \ldots, Y_J) \sim \text{Multinomial}(\pi_1, \pi_2, \ldots, \pi_J)$, where
        ◮ $\sum_j \pi_j = 1$
        ◮ $Y_j$ = number of cases in the $j$th category ($Y_j = 0, 1, \ldots, n$).
        ◮ $n = \sum_j Y_j$, the number of “trials”.
     ◮ Mean: $E(Y_j) = n\pi_j$
     ◮ Variance: $\text{var}(Y_j) = n\pi_j(1 - \pi_j)$
     ◮ Covariance: $\text{cov}(Y_j, Y_k) = -n\pi_j\pi_k$, for $j \neq k$.
     ◮ Probability mass function:
        $P(y_1, y_2, \ldots, y_J) = \left(\frac{n!}{y_1!\, y_2! \cdots y_J!}\right) \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_J^{y_J}$
     ◮ Binomial distribution is a special case.
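
As a rough check on the formulas above, the following sketch computes the multinomial mean, variance, covariance, and probability mass function using only the Python standard library. The function names and the numeric values (n = 10 and the probability vectors) are hypothetical choices for illustration, not part of the course materials.

```python
from math import comb, factorial, isclose, prod

def multinomial_pmf(counts, probs):
    """P(y_1, ..., y_J) = n! / (y_1! ... y_J!) * prod_j pi_j ** y_j."""
    coef = factorial(sum(counts))
    for y in counts:
        coef //= factorial(y)  # stays an exact integer at every step
    return coef * prod(p ** y for p, y in zip(probs, counts))

def multinomial_moments(n, probs):
    """Return the mean vector and covariance matrix of the category counts."""
    mean = [n * p for p in probs]
    cov = [
        [n * pj * (1 - pj) if j == k else -n * pj * pk
         for k, pk in enumerate(probs)]
        for j, pj in enumerate(probs)
    ]
    return mean, cov

# Hypothetical example: n = 10 trials over J = 3 categories.
n, probs = 10, [0.2, 0.3, 0.5]
mean, cov = multinomial_moments(n, probs)
print("means:", mean)
print("variances:", [cov[j][j] for j in range(len(probs))])
print("cov(Y_1, Y_2):", cov[0][1])

# Binomial special case (J = 2): 3 successes in 10 trials with pi = 0.4.
binomial_as_multinomial = multinomial_pmf([3, 7], [0.4, 0.6])
print("P(3 successes):", binomial_as_multinomial)
```

Because the counts always sum to n, each row of the covariance matrix sums to zero, which is one way to see why the covariances must be negative.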

  10. Example of Multinomial
      ◮ High School & Beyond program types
         ◮ General
         ◮ Academic
         ◮ Vo/Tech
      ◮ US 2006 Progress in International Reading Literacy Study (PIRLS) responses to item “How often do you use the Internet as a source of information for school-related work” with responses
         ◮ Every day or almost every day ($y_1$ = 746, $p_1$ = .1494)
         ◮ Once or twice a week ($y_2$ = 1,240, $p_2$ = .2483)
         ◮ Once or twice a month ($y_3$ = 1,377, $p_3$ = .2757)
         ◮ Never or almost never ($y_4$ = 1,631, $p_4$ = .3266)
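
The sample proportions are each category count divided by the total of the four counts. The sketch below recomputes them from the PIRLS counts listed on the slide and, as an extra illustration that is not on the slide, prints the observed log odds of each category against “Never or almost never” (an arbitrary choice of baseline).

```python
from math import log

# Category counts as listed on the slide.
counts = {
    "Every day or almost every day": 746,
    "Once or twice a week": 1240,
    "Once or twice a month": 1377,
    "Never or almost never": 1631,
}

n = sum(counts.values())
proportions = {label: y / n for label, y in counts.items()}

# Observed baseline-category log odds, using the last category as the baseline
# (an assumption made here for illustration).
baseline_label = "Never or almost never"
log_odds = {label: log(y / counts[baseline_label]) for label, y in counts.items()}

print("n =", n)
for label, y in counts.items():
    print(f"{label:32s} y = {y:5d}  p = {proportions[label]:.4f}  "
          f"log odds vs baseline = {log_odds[label]: .4f}")
```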

  11. Graph of PIRLS Distribution [figure]

  12. Graph of PIRLS Distribution [figure]

  13. Connections with Other Models
      ◮ Some are equivalent to Poisson regression or loglinear models.
      ◮ Some can be derived from (are equivalent to) discrete choice models (e.g., Luce, McFadden).
      ◮ Some can be derived from latent variable models.
      ◮ Those that are equivalent to conditional multinomial models are equivalent to proportional hazards models (models for survival data), which in turn are equivalent to Poisson regression models.
      ◮ Some multicategory logit models are very similar to IRT models in terms of their parametric form. The difference is that in IRT models the predictor is unobserved (latent), whereas in the models we discuss here the predictor variable is observed.
      ◮ Others.

  14. Multicategory Logit Models for Nominal Responses
      ◮ Baseline or Multinomial logistic regression model. Uses characteristics of individuals as predictor variables. The parameters differ for each category of the response variable.
      ◮ Conditional Logit model. Uses characteristics of the categories of the response variable as the predictors. The model parameters are the same for each category of the response variable.
      ◮ Conditional or Mixed logit model. Uses characteristics or attributes of both the individuals and the categories as predictor variables.
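
One common way to write the three model families is sketched below. The notation is an assumption made here for illustration and is not taken from the slides: $\mathbf{x}_i$ holds characteristics of individual $i$, $\mathbf{z}_{ij}$ holds attributes of category $j$ as seen by individual $i$, and category $J$ is taken as the baseline.

```latex
% Baseline (multinomial) logit: individual characteristics x_i,
% parameters that differ by response category j.
\pi_{ij} = \frac{\exp(\alpha_j + \boldsymbol{\beta}_j^{\top}\mathbf{x}_i)}
                {\sum_{h=1}^{J} \exp(\alpha_h + \boldsymbol{\beta}_h^{\top}\mathbf{x}_i)},
\qquad \alpha_J = 0,\ \boldsymbol{\beta}_J = \mathbf{0}

% Conditional logit: category attributes z_{ij},
% one parameter vector shared by all categories.
\pi_{ij} = \frac{\exp(\boldsymbol{\gamma}^{\top}\mathbf{z}_{ij})}
                {\sum_{h=1}^{J} \exp(\boldsymbol{\gamma}^{\top}\mathbf{z}_{ih})}

% Mixed logit (in the sense used on this slide): both kinds of predictors.
\pi_{ij} = \frac{\exp(\alpha_j + \boldsymbol{\beta}_j^{\top}\mathbf{x}_i
                      + \boldsymbol{\gamma}^{\top}\mathbf{z}_{ij})}
                {\sum_{h=1}^{J} \exp(\alpha_h + \boldsymbol{\beta}_h^{\top}\mathbf{x}_i
                      + \boldsymbol{\gamma}^{\top}\mathbf{z}_{ih})}
```

In the first form the subscript $j$ sits on the parameters; in the second it sits on the predictors. That is the distinction the bullets above describe in words.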

  15. Confusion
      There is no standard terminology for these models.
      ◮ Agresti (90), on the “Conditional Logit model”: “Originally referred to by McFadden as a conditional logit model, it is now usually called the multinomial logit model.”
      ◮ Long (97): Refers to the “Baseline or Multinomial logistic regression model” as a “multinomial logit” model and calls the “Conditional Logit model” the “conditional logit” model.
      ◮ Powers & Xie (00), on the “Conditional” and “Multinomial” models: “However, it is often called a multinominal logit model, leading to a great deal of confusion.”
      ◮ Agresti (2013) calls all of them “multinomial models” and refers to the Baseline or Multinomial logistic regression model as the “Baseline-category” model.

  16. Further Contribution to Confusion
      The models are related (connections):
      ◮ Baseline model is a special case of the conditional model.
      ◮ Conditional model can be fit as a proportional hazards model (have to do this in R).
      ◮ All are special cases of Poisson log-linear models.
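
The first connection can be illustrated numerically. The sketch below is a minimal illustration with made-up parameter values and hypothetical function names. It builds category-specific attribute vectors by interacting the individual's covariates with category indicators, stacks the baseline model's category-specific parameters into one shared vector, and checks that the conditional-logit probabilities match the baseline-logit probabilities.

```python
from math import exp, isclose

def softmax(scores):
    """Convert linear predictors into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def baseline_probs(x, alphas, betas):
    """Baseline logit: category-specific intercepts and slopes on x."""
    return softmax([a + sum(b * v for b, v in zip(bs, x))
                    for a, bs in zip(alphas, betas)])

def conditional_probs(z_by_category, gamma):
    """Conditional logit: one shared parameter vector, category-specific attributes."""
    return softmax([sum(g * z for g, z in zip(gamma, zj)) for zj in z_by_category])

def expand_to_category_attributes(x, J):
    """Place (1, x) in the block for category j; the baseline (last) category gets zeros."""
    block = [1.0] + list(x)
    width = len(block)
    rows = []
    for j in range(J):
        row = [0.0] * (width * (J - 1))
        if j < J - 1:
            row[j * width:(j + 1) * width] = block
        rows.append(row)
    return rows

# Hypothetical values: two covariates, J = 3 categories, last category is the baseline.
x = [0.5, -1.2]
alphas = [0.3, -0.7, 0.0]
betas = [[1.0, 0.4], [-0.5, 0.9], [0.0, 0.0]]

# Stack (alpha_j, beta_j) for the non-baseline categories into one shared vector.
gamma = []
for a, b in zip(alphas[:-1], betas[:-1]):
    gamma.extend([a] + b)

p_baseline = baseline_probs(x, alphas, betas)
p_conditional = conditional_probs(expand_to_category_attributes(x, len(alphas)), gamma)
print(p_baseline)
print(p_conditional)
```

The same data-expansion idea (one row per individual per category) is what conditional-logit software typically expects as input, though the exact layout depends on the package.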
