Lecture 7: GLMs: Score Equations, Residuals
Author: Nick Reich / Transcribed by Bing Miu and Yukun Li
Course: Categorical Data Analysis (BIOSTATS 743)
Made available under the Creative Commons Attribution-ShareAlike 4.0 International License.
Likelihood Equations for GLMs

◮ The GLM log-likelihood function is given as follows:
$$L(\vec{\beta}) = \sum_i \log f(y_i \mid \theta_i, \phi) = \sum_i \left[ \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + C(y_i, \phi) \right] = \sum_i \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + \sum_i C(y_i, \phi)$$
◮ $\phi$ is a dispersion parameter; it is not indexed by $i$ and is assumed to be fixed.
◮ $\theta_i$ contains $\vec{\beta}$, through $\eta_i$.
◮ $C(y_i, \phi)$ comes from the random component.
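To make the exponential-family form concrete, here is a minimal sketch (not from the lecture; the data values are made up) for a Poisson outcome, where $\theta_i = \log \mu_i$, $b(\theta) = e^{\theta}$, $a(\phi) = 1$, and $C(y_i, \phi) = -\log(y_i!)$. The check against R's built-in dpois() confirms the two expressions agree.

    # Exponential-family log-likelihood for a Poisson outcome (illustrative values):
    # theta = log(mu), b(theta) = exp(theta), a(phi) = 1, C(y, phi) = -log(y!)
    y  <- c(2, 0, 5, 3)
    mu <- c(1.5, 0.8, 4.2, 2.9)
    theta   <- log(mu)
    loglik1 <- sum((y * theta - exp(theta)) / 1 - lgamma(y + 1))  # exponential-family form
    loglik2 <- sum(dpois(y, lambda = mu, log = TRUE))             # built-in Poisson density
    all.equal(loglik1, loglik2)                                   # TRUE: the two forms agree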
Score Equations

◮ Take the derivative of the log-likelihood function and set it equal to 0:
$$\frac{\partial L(\vec{\beta})}{\partial \beta_j} = \sum_i \frac{\partial L_i}{\partial \beta_j} = 0, \quad \forall j$$
◮ Since $\frac{\partial L_i}{\partial \theta_i} = \frac{y_i - \mu_i}{a(\phi)}$, $\mu_i = b'(\theta_i)$, $\mathrm{Var}(Y_i) = b''(\theta_i)\,a(\phi)$, and $\eta_i = \sum_j \beta_j x_{ij}$:
$$0 = \sum_i \frac{\partial L_i}{\partial \beta_j} = \sum_i \frac{y_i - \mu_i}{a(\phi)} \cdot \frac{a(\phi)}{\mathrm{Var}(Y_i)} \cdot \frac{\partial \mu_i}{\partial \eta_i}\, x_{ij} = \sum_i \frac{(y_i - \mu_i)\, x_{ij}}{\mathrm{Var}(Y_i)} \frac{\partial \mu_i}{\partial \eta_i}$$
◮ $V(\theta) = b''(\theta)$ is the variance function of the GLM.
◮ $\mu_i = E[Y_i \mid x_i] = g^{-1}(X_i \vec{\beta})$. These equations are typically non-linear in the $\beta$'s and thus require iterative computational solutions.
Example: Score Equation from Binomial GLM (Ch 5.5.1)

$Y_i \sim \mathrm{Binomial}(n_i, \pi_i)$
◮ The joint probability mass function:
$$\prod_{i=1}^N \pi(x_i)^{y_i}\, [1 - \pi(x_i)]^{n_i - y_i}$$
◮ The log-likelihood:
$$L(\vec{\beta}) = \sum_i y_i \left( \sum_j x_{ij} \beta_j \right) - \sum_i n_i \log\left[ 1 + \exp\left( \sum_j \beta_j x_{ij} \right) \right]$$
◮ The score equation:
$$\frac{\partial L(\vec{\beta})}{\partial \beta_j} = \sum_i (y_i - n_i \hat{\pi}_i)\, x_{ij}, \quad \text{note that } \hat{\pi}_i = \frac{e^{X_i \vec{\beta}}}{1 + e^{X_i \vec{\beta}}}$$
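Because these score equations have no closed-form solution, they are solved iteratively. Below is a minimal Fisher-scoring (IRLS) sketch for the binomial example, written with simulated data and hypothetical variable names (X, y, n), and checked against R's glm().

    # Illustrative Fisher-scoring loop for a logistic GLM with n_i trials (not the
    # lecture's code; data are simulated and names are made up).
    set.seed(743)
    n_obs <- 200
    X <- cbind(1, rnorm(n_obs))                     # design matrix with intercept
    n <- rep(10, n_obs)                             # binomial denominators n_i
    beta_true <- c(-0.5, 1.0)
    y <- rbinom(n_obs, size = n, prob = plogis(X %*% beta_true))

    beta <- rep(0, ncol(X))                         # starting value
    for (iter in 1:25) {
      pi_hat <- as.vector(plogis(X %*% beta))              # pi_i = e^(X_i beta) / (1 + e^(X_i beta))
      score  <- t(X) %*% (y - n * pi_hat)                  # sum_i (y_i - n_i pi_i) x_ij
      info   <- t(X) %*% (n * pi_hat * (1 - pi_hat) * X)   # X^T W X with w_i = n_i pi_i (1 - pi_i)
      beta_new <- beta + solve(info, score)                # Fisher-scoring update
      if (max(abs(beta_new - beta)) < 1e-10) { beta <- beta_new; break }
      beta <- beta_new
    }

    fit <- glm(cbind(y, n - y) ~ X[, 2], family = binomial)  # built-in fit for comparison
    cbind(manual = as.vector(beta), glm = coef(fit))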
Asymptotic Covariance of $\hat{\beta}$:

◮ The likelihood function determines the asymptotic covariance of the ML estimate $\hat{\beta}$.
◮ The information matrix $\mathcal{I}$ has $(h, j)$ elements:
$$\mathcal{I}_{hj} = E\left[ -\frac{\partial^2 L(\vec{\beta})}{\partial \beta_h\, \partial \beta_j} \right] = \sum_{i=1}^N \frac{x_{ih} x_{ij}}{\mathrm{Var}(Y_i)} \left( \frac{\partial \mu_i}{\partial \eta_i} \right)^2$$
where $w_i$ denotes
$$w_i = \frac{1}{\mathrm{Var}(Y_i)} \left( \frac{\partial \mu_i}{\partial \eta_i} \right)^2$$
Asymptotic Covariance Matrix of $\hat{\beta}$:

◮ The information matrix $\mathcal{I}$ is equivalent to $\mathcal{I}_{hj} = \sum_{i=1}^N x_{ih} x_{ij} w_i$, i.e., $\mathcal{I} = X^T W X$.
◮ $W$ is a diagonal matrix with $w_i$ as the diagonal elements. In practice, $W$ is evaluated at the MLE $\hat{\beta}$ and depends on the link function.
◮ The square roots of the main diagonal elements of $(X^T W X)^{-1}$ are the estimated standard errors of $\hat{\beta}$.
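As a check on this formula, the sketch below (simulated Bernoulli data with a logit link, so $w_i = \hat{\mu}_i(1 - \hat{\mu}_i)$) forms $(X^T W X)^{-1}$ by hand and compares it with the covariance matrix that glm() reports.

    # Recover (X^T W X)^{-1} by hand for a Bernoulli logistic GLM (illustrative data).
    set.seed(1)
    x <- rnorm(500)
    y <- rbinom(500, 1, plogis(-0.3 + 0.8 * x))
    fit <- glm(y ~ x, family = binomial)

    X      <- model.matrix(fit)
    mu_hat <- fitted(fit)
    # Logit link, Bernoulli outcome: d(mu)/d(eta) = mu(1 - mu) and Var(Y_i) = mu(1 - mu),
    # so w_i = (d mu_i / d eta_i)^2 / Var(Y_i) = mu_i (1 - mu_i).
    w <- mu_hat * (1 - mu_hat)
    cov_manual <- solve(t(X) %*% (w * X))     # (X^T W X)^{-1}, scaling rows of X by w_i

    all.equal(unname(cov_manual), unname(vcov(fit)))  # matches glm's covariance (up to tolerance)
    sqrt(diag(cov_manual))                            # estimated standard errors of beta-hat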
Analogous to SLR

◮ $\widehat{\mathrm{Var}}(\hat{\beta}_i)$ — SLR: $\sigma^2 / \sum_{i=1}^N (x_i - \bar{x})^2$; GLM: the $i$th main diagonal element of $(X^T W X)^{-1}$
◮ $\widehat{\mathrm{Cov}}(\hat{\beta})$ — SLR: $\sigma^2 (X^T X)^{-1}$; GLM: $(X^T W X)^{-1}$
Residuals and Diagnostics

◮ Deviance Tests
  ◮ A measure of goodness of fit in GLMs based on the likelihood
  ◮ Most useful as a comparison between models (used as a screening method to identify important covariates)
  ◮ Use the saturated model as a baseline for comparison with other model fits
  ◮ For a Poisson or binomial GLM: $D = -2[L(\hat{\mu} \mid y) - L(y \mid y)]$
◮ Examples of deviance $D(y, \hat{\mu})$ by model:
  Gaussian: $\sum_i (y_i - \hat{\mu}_i)^2$
  Poisson: $2 \sum_i \left[ y_i \ln\left(\frac{y_i}{\hat{\mu}_i}\right) - (y_i - \hat{\mu}_i) \right]$
  Binomial: $2 \sum_i \left[ y_i \ln\left(\frac{y_i}{\hat{\mu}_i}\right) + (n_i - y_i) \ln\left(\frac{n_i - y_i}{n_i - \hat{\mu}_i}\right) \right]$
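For example, the Poisson row of this list can be verified directly against R's deviance(); the data below are simulated purely for illustration.

    # Check the Poisson deviance formula against deviance() on an illustrative fit.
    set.seed(7)
    x   <- rnorm(100)
    y   <- rpois(100, lambda = exp(0.5 + 0.3 * x))
    fit <- glm(y ~ x, family = poisson)

    mu_hat <- fitted(fit)
    # D(y, mu-hat) = 2 * sum[ y_i log(y_i / mu-hat_i) - (y_i - mu-hat_i) ],
    # taking y_i log(y_i / mu-hat_i) = 0 when y_i = 0
    dev_manual <- 2 * sum(ifelse(y == 0, 0, y * log(y / mu_hat)) - (y - mu_hat))
    all.equal(dev_manual, deviance(fit))   # TRUE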
Deviance tests for nested models

◮ Consider two models, $M_0$ with fitted values $\hat{\mu}_0$ and $M_1$ with fitted values $\hat{\mu}_1$, where $M_0$ is nested within $M_1$:
$$\eta_1 = \beta_0 + \beta_1 X_{11} + \beta_2 X_{12}, \qquad \eta_0 = \beta_0 + \beta_1 X_{11}$$
◮ Simpler models have smaller log-likelihoods and larger deviances: $L(\hat{\mu}_0 \mid y) \le L(\hat{\mu}_1 \mid y)$ and $D(y \mid \hat{\mu}_1) \le D(y \mid \hat{\mu}_0)$.
◮ The likelihood-ratio statistic comparing the two models is the difference between the deviances:
$$-2[L(\hat{\mu}_0 \mid y) - L(\hat{\mu}_1 \mid y)] = -2[L(\hat{\mu}_0 \mid y) - L(y \mid y)] - \left\{ -2[L(\hat{\mu}_1 \mid y) - L(y \mid y)] \right\} = D(y \mid \hat{\mu}_0) - D(y \mid \hat{\mu}_1)$$
Hypothesis test with differences in deviance

◮ $H_0: \beta_{i1} = \dots = \beta_{ij} = 0$; fit a full and a reduced model.
◮ Use the difference in deviance as the test statistic, where $df$ is the difference in the number of parameters between $\hat{\mu}_1$ and $\hat{\mu}_0$:
$$D(y \mid \hat{\mu}_0) - D(y \mid \hat{\mu}_1) \sim \chi^2_{df}$$
◮ Reject $H_0$ if the calculated chi-square value is larger than $\chi^2_{df,\, 1-\alpha}$.
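A minimal worked example of this test in R, with simulated data and hypothetical variable names: the reduced model drops x2, and the deviance difference is referred to a $\chi^2$ with $df$ equal to the number of dropped parameters.

    # Nested-model deviance (likelihood-ratio) test on illustrative data.
    set.seed(22)
    n  <- 300
    x1 <- rnorm(n)
    x2 <- rnorm(n)
    y  <- rbinom(n, 1, plogis(-0.2 + 0.7 * x1))     # x2 has no true effect here

    m0 <- glm(y ~ x1,      family = binomial)       # reduced model (null hypothesis)
    m1 <- glm(y ~ x1 + x2, family = binomial)       # full model

    lrt_stat <- deviance(m0) - deviance(m1)         # D(y | mu0-hat) - D(y | mu1-hat)
    df_diff  <- df.residual(m0) - df.residual(m1)   # number of extra parameters in m1
    pchisq(lrt_stat, df = df_diff, lower.tail = FALSE)  # p-value; reject H0 when small

    anova(m0, m1, test = "Chisq")                   # equivalent built-in comparison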
Residual Examinations

◮ Pearson residuals: $e^p_i = \frac{y_i - \hat{\mu}_i}{\sqrt{V(\hat{\mu}_i)}}$, where $\hat{\mu}_i = g^{-1}(\hat{\eta}_i) = g^{-1}(X_i \hat{\beta})$
◮ Deviance residuals: $e^d_i = \mathrm{sign}(y_i - \hat{\mu}_i)\sqrt{d_i}$, where $d_i$ is the deviance contribution of the $i$th observation and $\mathrm{sign}(x) = \begin{cases} 1 & x > 0 \\ -1 & x \le 0 \end{cases}$
◮ Standardized residuals: $r_i = \frac{e_i}{\sqrt{1 - \hat{h}_i}}$, where $e_i = \frac{y_i - \hat{\mu}_i}{\sqrt{V(\hat{\mu}_i)}}$, $\hat{h}_i$ is the measure of leverage, and approximately $r_i \sim N(0, 1)$
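In R, these residual types can be pulled from any fitted glm object; the fit below is purely illustrative.

    # Extracting the residual types above from a fitted GLM (illustrative fit).
    set.seed(5)
    x   <- rnorm(200)
    y   <- rbinom(200, 1, plogis(0.4 * x))
    fit <- glm(y ~ x, family = binomial)

    e_pearson  <- residuals(fit, type = "pearson")   # (y_i - mu-hat_i) / sqrt(V(mu-hat_i))
    e_deviance <- residuals(fit, type = "deviance")  # sign(y_i - mu-hat_i) * sqrt(d_i)
    r_std      <- rstandard(fit, type = "pearson")   # Pearson residual / sqrt(1 - h_i)
    h_i        <- hatvalues(fit)                     # leverage h_i of each observation
    head(cbind(e_pearson, e_deviance, r_std, h_i))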
Residual Plot

Problem: the raw residual plot is hard to interpret for logistic regression.

[Figure: raw residuals vs. expected values for a logistic regression fit]
Binned Residual Plot

◮ Group observations into ordered groups (by $x_j$, $\hat{y}$, or $x_{ij}$), with an equal number of observations per group.
◮ Compute the group-wise average of the raw residuals.
◮ Plot the average residuals vs. the predicted values; each dot represents a group (a by-hand sketch follows below).

[Figure: binned average residuals vs. expected values]
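Here is a minimal by-hand version of this construction on simulated data (bin count chosen arbitrarily); in practice the binnedplot() function shown on the next slide does the same thing.

    # By-hand binned residual plot (illustrative data; 40 equal-count bins).
    set.seed(12)
    x   <- rnorm(1000)
    y   <- rbinom(1000, 1, plogis(-0.5 + x))
    fit <- glm(y ~ x, family = binomial)

    fitted_vals <- fitted(fit)
    raw_resid   <- y - fitted_vals                       # raw residuals y_i - mu-hat_i
    n_bins      <- 40
    breaks      <- quantile(fitted_vals, probs = seq(0, 1, length.out = n_bins + 1))
    bins        <- cut(fitted_vals, breaks = breaks, include.lowest = TRUE)  # equal-count groups
    bin_fitted  <- tapply(fitted_vals, bins, mean)       # group-wise average fitted value
    bin_resid   <- tapply(raw_resid, bins, mean)         # group-wise average raw residual
    plot(bin_fitted, bin_resid, xlab = "Expected Values", ylab = "Average Residuals")
    abline(h = 0, lty = 2)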
Binned Residual Plot (Part 2)

◮ Red lines indicate ±2 standard-error bounds, within which one would expect about 95% of the binned residuals to fall.
◮ An R function is available:

    library(arm)
    binnedplot(x, y, nclass = ...)
    # x      <- expected (fitted) values
    # y      <- residual values
    # nclass <- number of bins

[Figure: binnedplot output, average residuals vs. expected values with ±2 SE bounds]
Binned Residual Plot (Part 3)

◮ In practice you may need to fiddle with the number of observations per group. By default, nclass is chosen according to the number of observations n:
  – if n ≥ 100, nclass = floor(sqrt(length(x)));
  – if 10 < n < 100, nclass = 10;
  – if n < 10, nclass = floor(n / 2).
Ex: Binned Residual Plot with different bin sizes

[Figure: four binned residual plots (average residuals vs. expected values) with bin sizes of 10, 50, 100, and 500]