Parametric Links for Binary Response

Roger Koenker and Jungmo Yoon
University of Illinois, Urbana-Champaign
UseR! 2006

Abstract: There is more to life than logit and probit.

Some Preliminary Market Research: A Googoloscopy

Link       Google Hits
Logit        2,800,000
Probit       1,900,000
Cloglog          1,700
Cauchit            433

A Meta-Analysis Proposal: Factors determining the use of Logit vs. Probit in binary response applications.

Should we use logit or probit for the analysis?
Cauchit?

As in the Cauchy distribution, also known as the Witch of Agnesi.

Available in R since 2.1.0.

Not to be confused with. . . .

Cauchit is much more tolerant of a few surprising observations than is either logit or probit.
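As a rough illustration of the Cauchit link that ships with R, the sketch below fits the same binary response model under the logit, probit, and cauchit links and compares their log-likelihoods. The data are simulated here purely for illustration (the variable names `x`, `y`, and `fits` are assumptions, not taken from the talk).

```r
# Simulated data only; nothing here comes from the talk.
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = pcauchy(0.5 + x))  # responses generated under a Cauchit model

# Fit the same linear predictor under three built-in links.
links <- c("logit", "probit", "cauchit")
fits <- lapply(links, function(l) glm(y ~ x, family = binomial(link = l)))
names(fits) <- links

# Log-likelihood of each fit, for a quick side-by-side comparison.
loglik <- sapply(fits, function(f) as.numeric(logLik(f)))
print(round(loglik, 2))
```

Which link fits best will depend on the data; with responses generated under a Cauchit model one would usually expect the cauchit fit to do at least as well as the others.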
Why Do We Need Parametric Links?

The three canonical human motivations:

Guilt: For 20 years I've been teaching Daryl Pregibon's (1980) paper "A Goodness of Link Test", but I could never answer the obvious question: "What should we do if we reject the logistic specification?"

Boredom: There must be more to life than probit or logit.

Fear: Maybe we are all missing something interesting that could be revealed by more general link functions.
What is a Link Function?

Latent variable model for binary response,

$$ y_i^* = x_i^\top \beta + u_i, \qquad u_i \sim \text{iid } F $$

Observed response is:

$$ y_i = \{ y_i^* > 0 \} = \{ u_i > -x_i^\top \beta \} $$

Probability of the event is:

$$ P\{ y_i = 1 \} = 1 - F(-x_i^\top \beta) \equiv \pi $$

Link function is just the quantile function of the error distribution,

$$ g(\pi) = -F^{-1}(1 - \pi) = x_i^\top \beta $$

When F is symmetric about zero, this reduces to $g(\pi) = F^{-1}(\pi)$.
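The relation g(π) = −F^{-1}(1 − π) can be checked numerically against the links built into R. The following is a minimal sketch; the helper name `link_from_quantile` is hypothetical and introduced only for this illustration.

```r
# Build g(p) = -F^{-1}(1 - p) from a quantile function qF.
link_from_quantile <- function(qF) function(p) -qF(1 - p)

p <- c(0.05, 0.25, 0.5, 0.9)

g_logit   <- link_from_quantile(qlogis)   # logistic errors
g_probit  <- link_from_quantile(qnorm)    # Gaussian errors
g_cauchit <- link_from_quantile(qcauchy)  # Cauchy errors

# Compare with the link functions that ship with R.
ok_logit   <- all.equal(g_logit(p),   make.link("logit")$linkfun(p))
ok_probit  <- all.equal(g_probit(p),  make.link("probit")$linkfun(p))
ok_cauchit <- all.equal(g_cauchit(p), make.link("cauchit")$linkfun(p))
print(c(ok_logit, ok_probit, ok_cauchit))
```

All three error distributions are symmetric about zero, which is why the results agree with the plain quantile functions used by the built-in links.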
Two Parametric Families of Link Functions

Gosset: The Student t family with degrees of freedom ν provides a convenient nesting of probit and Cauchit.

Pregibon: The (generalized) Tukey λ family

$$ g(\pi) = \frac{\pi^{\alpha+\delta}}{\alpha+\delta} - \frac{(1-\pi)^{\alpha-\delta}}{\alpha-\delta} $$

provides a nice nesting of logit: (α, δ) = (0, 0); the parameters α and δ can be interpreted as kurtosis and skewness, respectively.

The Pregibon Family

[Figure: Pregibon densities for various (α, δ)'s, shown as nine panels with α and δ each taking the values −0.25, 0, and 0.25. All densities scaled to have the same interquartile range.]
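The two families can be sketched as plain R functions to check the nesting claims numerically. The function names `gosset_link` and `pregibon_link` are hypothetical. For the Pregibon link, this sketch assumes each power term is centered as (u^λ − 1)/λ, with log(u) as its limit at λ = 0, so that the value at (α, δ) = (0, 0) is well defined; that centering is an assumption of this example and is not shown in the formula on the slide.

```r
# Gosset link: quantile function of Student's t with nu degrees of freedom.
gosset_link <- function(p, nu) qt(p, df = nu)

# Pregibon link, assuming centered power terms (u^lam - 1) / lam with a log limit at 0.
pregibon_link <- function(p, alpha = 0, delta = 0) {
  bc <- function(u, lam) if (abs(lam) < 1e-8) log(u) else (u^lam - 1) / lam
  bc(p, alpha + delta) - bc(1 - p, alpha - delta)
}

p <- c(0.1, 0.3, 0.5, 0.8)

# nu = 1 is the Cauchit; a very large nu is close to the probit.
cauchit_ok <- all.equal(gosset_link(p, nu = 1), qcauchy(p))
probit_gap <- max(abs(gosset_link(p, nu = 1e6) - qnorm(p)))

# (alpha, delta) = (0, 0) is the logit.
logit_ok <- all.equal(pregibon_link(p, 0, 0), qlogis(p))

print(list(cauchit_ok = cauchit_ok, probit_gap = probit_gap, logit_ok = logit_ok))
```

Setting δ away from zero makes the two tails differ, which is the sense in which δ acts as a skewness parameter.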
Implementation in R

Crucial change is to permit "..." in glm families:

    family = binomial('Gosset', ...)

Provide p-d-q functions for the new link.
  - Thanks to Luke Tierney for an R-devel suggestion to expand the range of qt().
  - Thanks to Robert King for the gld package for the generalized Tukey λ family.

Choose optimizer for the profiled likelihood:
  - Gosset: optimize() for ν ∈ (0.15, 30)
  - Pregibon: optim() for (α, δ) ∈ [−0.5, 0.5]^2

Plea to R-core: Quite minor changes in glm() and friends would be sufficient to allow users to (more easily) "roll their own links."

Performance of the Gosset Link

A model of job tenure at Western Electric (R.I.P.): the probability π_i of quitting within 6 months of initial employment is given by

$$ g_\nu(\pi_i) = \beta_0 + \beta_1 \text{SEX}_i + \beta_2 \text{DEX}_i + \beta_3 \text{LEX}_i + \beta_4 \text{LEX}_i^2 $$

[Figure: Profile likelihood for the Gosset link parameter ν. Vertical axis: 2 log likelihood; horizontal axis: ν.]
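The profiling step for the Gosset link can be sketched in current R, where `binomial()` accepts a user-built object of class "link-glm". This is not the authors' implementation: the constructor `gosset()`, the simulated data, and the clamping of fitted probabilities are all assumptions made for this example. Only the search interval (0.15, 30) and the use of `optimize()` come from the slides.

```r
# Hypothetical constructor: a "link-glm" object for the Student t link with fixed nu.
gosset <- function(nu) {
  eps <- .Machine$double.eps
  structure(list(
    linkfun  = function(mu)  qt(mu, df = nu),
    # keep fitted probabilities strictly inside (0, 1)
    linkinv  = function(eta) pmin(pmax(pt(eta, df = nu), eps), 1 - eps),
    mu.eta   = function(eta) pmax(dt(eta, df = nu), eps),
    valideta = function(eta) TRUE,
    name     = paste0("Gosset(", nu, ")")
  ), class = "link-glm")
}

# Simulated data (not the Western Electric data), generated with nu = 1.
set.seed(42)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = pt(-0.5 + x, df = 1))

# Profiled log-likelihood: for each nu, maximize over beta with glm().
profile_loglik <- function(nu) {
  tryCatch(
    as.numeric(logLik(suppressWarnings(
      glm(y ~ x, family = binomial(link = gosset(nu)))
    ))),
    error = function(e) -Inf  # treat a failed fit as an inadmissible nu
  )
}

# One-dimensional search over the interval given on the slide.
opt <- optimize(profile_loglik, interval = c(0.15, 30), maximum = TRUE)
print(opt$maximum)    # estimated nu
print(opt$objective)  # profiled log-likelihood at that nu
```

Profiling keeps the inner problem an ordinary glm fit, so only the one-dimensional search over ν needs a general-purpose optimizer; the two-parameter Pregibon case would use `optim()` in the same way.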