Workshop 10.4: Generalized linear models
Murray Logan
August 16, 2016


Table of contents

1. Exponential family distributions

0.1. Linear models

$$
y_i = \underbrace{\beta_0 + \beta_1 \times x_i}_{\text{Linearity}} + \varepsilon_i, \qquad \underbrace{\varepsilon_i \sim N(0, \sigma^2)}_{\text{Normality}}
$$

Homogeneity of variance and zero covariance (= independence):

$$
\mathbf{V} = \mathrm{cov} = \begin{pmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{pmatrix}
$$

0.2. Other data types

• Binary - only 0 and 1 (dead/alive, present/absent)
• Proportional abundance - range from 0 to 100
• Count data - minimum of zero
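The assumptions listed in section 0.1 can be made concrete with a small simulation. The sketch below is not from the workshop materials; it is a minimal Python illustration with made-up parameter values showing data that satisfy linearity, normality, homogeneity of variance and independence.

```python
import numpy as np

# Minimal simulation of the linear-model assumptions from section 0.1
# (all parameter values are hypothetical).
rng = np.random.default_rng(0)
n = 50
x = rng.uniform(0, 10, n)
beta0, beta1, sigma = 2.0, 1.5, 1.0

eps = rng.normal(0, sigma, n)      # Normality + homogeneity of variance: eps_i ~ N(0, sigma^2)
y = beta0 + beta1 * x + eps        # Linearity: y_i = beta0 + beta1 * x_i + eps_i

# Zero covariance (= independence): V = cov(eps) is sigma^2 on the diagonal, 0 elsewhere
V = sigma**2 * np.eye(n)
```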

0.3. Linear models

[Figure: presence/absence observations (Present/Absent) and their frequency plotted against X, with a) the fitted linear model and b) the predicted probability of presence]

• expected values outside logical bounds
• response not normally distributed

0.4. Logistic models

[Figure: the same presence/absence observations plotted against X, with the fitted logistic model and the predicted probability of presence]

• expected values outside logical bounds
• response not normally distributed
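The contrast between sections 0.3 and 0.4 can be sketched in code. This is a hedged Python/statsmodels illustration (not the workshop's own code) on simulated presence/absence data: nothing constrains an ordinary linear model's fitted values to [0, 1], whereas a binomial GLM with a logit link always predicts within (0, 1).

```python
import numpy as np
import statsmodels.api as sm

# Simulated presence/absence data along a gradient X (values are made up).
rng = np.random.default_rng(1)
X = np.linspace(0, 1, 30)
y = rng.binomial(1, 1 / (1 + np.exp(-(X - 0.5) * 10)))

Xc = sm.add_constant(X)

# Ordinary linear model: nothing constrains the fitted values to [0, 1].
lm = sm.OLS(y, Xc).fit()

# Logistic model: a binomial GLM with a logit link keeps predictions in (0, 1).
glm = sm.GLM(y, Xc, family=sm.families.Binomial()).fit()

print(lm.predict(Xc).min(), lm.predict(Xc).max())    # can stray outside [0, 1]
print(glm.predict(Xc).min(), glm.predict(Xc).max())  # always strictly between 0 and 1
```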

1. Exponential family distributions

1.1. Gaussian distribution

Virtually unbounded measurements (weights, lengths, etc.)

[Figure: probability density function and cumulative density function for μ = 25, σ² = 5; μ = 25, σ² = 2; μ = 10, σ² = 2]

$$
f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\sigma^2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}
$$

1.2. Binomial distribution

Presence/absence and data bound to the range [0, 1]

[Figure: probability density function and cumulative density function for n = 50, n = 20, n = 3]

$$
f(k \mid n, p) = \binom{n}{k} p^k (1 - p)^{n - k}
$$
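As a quick numerical check of the two formulas above, the following Python sketch (scipy assumed available; the particular x, k, n and p values are arbitrary) compares a hand-coded evaluation with scipy.stats.

```python
import numpy as np
from math import comb
from scipy import stats

# Gaussian density, using the slide's parameters mu = 25, sigma^2 = 5
mu, sigma2, x = 25.0, 5.0, 20.0
by_hand = 1 / np.sqrt(2 * sigma2 * np.pi) * np.exp(-(x - mu) ** 2 / (2 * sigma2))
print(np.isclose(by_hand, stats.norm.pdf(x, loc=mu, scale=np.sqrt(sigma2))))  # True

# Binomial probability mass: f(k | n, p) = C(n, k) p^k (1 - p)^(n - k)
n, p, k = 20, 0.3, 5
by_hand = comb(n, k) * p ** k * (1 - p) ** (n - k)
print(np.isclose(by_hand, stats.binom.pmf(k, n, p)))  # True
```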

1.3. Poisson distribution

Count data (or count derivatives - like low densities)

[Figure: probability density function and cumulative density function for λ = 25, λ = 15, λ = 3]

$$
f(x \mid \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!}
$$

1.4. Negative binomial distribution

Count data (or count derivatives - like low densities)

[Figure: probability density function and cumulative density function for n = 25, n = 10, n = 1.5]

$$
f(x \mid \mu, \omega) = \frac{\Gamma(x + \omega)}{\Gamma(\omega)\, x!} \times \frac{\mu^{x}\, \omega^{\omega}}{(\mu + \omega)^{x + \omega}}
$$
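The same kind of check for the Poisson and negative binomial formulas, again a Python sketch with arbitrary parameter values rather than workshop code. Note that scipy's nbinom uses a (size, probability) parameterisation, so the slide's (μ, ω) is mapped to n = ω and p = ω / (ω + μ).

```python
import numpy as np
from math import factorial, lgamma
from scipy import stats

# Poisson: f(x | lambda) = exp(-lambda) * lambda^x / x!
lam, x = 3.0, 4
by_hand = np.exp(-lam) * lam ** x / factorial(x)
print(np.isclose(by_hand, stats.poisson.pmf(x, lam)))  # True

# Negative binomial with mean mu and dispersion omega (the slide's parameterisation)
mu, omega, x = 10.0, 1.5, 4
by_hand = (np.exp(lgamma(x + omega) - lgamma(omega)) / factorial(x)
           * mu ** x * omega ** omega / (mu + omega) ** (x + omega))
print(np.isclose(by_hand, stats.nbinom.pmf(x, omega, omega / (omega + mu))))  # True
```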

1.5. General linear models

Homogeneity of variance:

$$
y_i = \underbrace{\beta_0 + \beta_1 \times x_i}_{\text{Linearity}} + \varepsilon_i, \qquad \underbrace{\varepsilon_i \sim N(0, \sigma^2)}_{\text{Normality}}
$$

$$
\mathbf{V} = \mathrm{cov} = \begin{pmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{pmatrix}
\quad \text{Zero covariance (= independence)}
$$

$$
\underbrace{E(Y)}_{\text{Link function}} = \underbrace{\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p}_{\text{Systematic}} + \varepsilon, \qquad \varepsilon \sim \mathrm{Dist}(\dots)
$$

1.6. General linear models

$$
\underbrace{E(Y)}_{\text{Link function}} = \underbrace{\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p}_{\text{Systematic}} + \underbrace{\varepsilon}_{\text{Random}}
$$

• Random component: $E(Y_i) \sim N(\mu_i, \sigma^2)$ - a nominated distribution (Gaussian, Poisson, Binomial, Gamma, Beta, ...)

1.7. General linear models

$$
\underbrace{E(Y)}_{\text{Link function}} = \underbrace{\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p}_{\text{Systematic}} + \underbrace{\varepsilon}_{\text{Random}}
$$

• Random component
• Systematic component: $\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$
• Link function
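One way to see the three components together is in software. The sketch below is a Python/statsmodels illustration (not the workshop's code; the simulated data and coefficient values are hypothetical) that nominates the random component (Poisson), the systematic component (the linear predictor) and the link function (log, which is also the Poisson default) explicitly.

```python
import numpy as np
import statsmodels.api as sm

# Simulated count data (hypothetical coefficients and predictor).
rng = np.random.default_rng(2)
x = rng.uniform(0, 2, 100)
y = rng.poisson(np.exp(0.5 + 1.2 * x))

X = sm.add_constant(x)                                       # systematic component: beta0 + beta1*x
family = sm.families.Poisson(link=sm.families.links.Log())   # random component + link function
fit = sm.GLM(y, X, family=family).fit()                      # generalized linear model
print(fit.params)                                            # estimates of beta0 and beta1
```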

1.8. Generalized linear models

Response variable        Probability distribution   Link function                                    Model name
-----------------        ------------------------   -------------                                    ----------
Continuous measurements  Gaussian                   identity: μ                                      Linear regression
Binary, proportions      Binomial                   logit: log(π / (1 − π))                          Logistic regression
                                                    probit: ∫_{−∞}^{α+β·X} (1/√(2π)) e^{−Z²/2} dZ    Probit regression
                                                    complementary log-log: log(−log(1 − π))          Logistic regression
                         Quasi-binomial             logit: log(π / (1 − π))                          Logistic regression
Counts                   Poisson                    log: log μ                                       Poisson regression / log-linear model
                         Negative binomial          log: log(μ / (μ − θ))                            Negative binomial regression
                         Quasi-Poisson              log: log μ                                       Poisson regression

1.9. OLS

[Figure: sum of squares plotted against candidate parameter estimates (6-14), with its minimum at the estimate μ = 10]
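The OLS figure in section 1.9 can be reproduced numerically: over a grid of candidate values, the sum of squared deviations is smallest at the sample mean. A small Python sketch with made-up data:

```python
import numpy as np

# Grid version of the OLS picture: the estimate of mu minimises the sum of squares.
x = np.array([8.0, 9.5, 10.2, 10.8, 11.5, 10.0])        # made-up observations
candidates = np.linspace(6, 14, 801)                     # candidate parameter estimates
ss = np.array([np.sum((x - m) ** 2) for m in candidates])

print(candidates[np.argmin(ss)])   # ~ 10.0
print(x.mean())                    # the OLS estimate is the sample mean
```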

1.10. Maximum Likelihood

$$
f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\sigma^2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}
$$

$$
\ln L(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2
$$

Maximum likelihood estimates:

$$
\hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2
$$

1.11. Maximum Likelihood

[Figure: log-likelihood plotted against candidate parameter estimates (6-14), with its maximum at the estimate μ = 10]
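A matching sketch for the likelihood figure in section 1.11 (Python, same made-up data as the OLS example): evaluating the Gaussian log-likelihood over a grid of candidate μ values puts the maximum at the sample mean, in line with the closed-form estimators above.

```python
import numpy as np

# Gaussian log-likelihood over a grid of candidate mu values (made-up data).
x = np.array([8.0, 9.5, 10.2, 10.8, 11.5, 10.0])
n = len(x)
sigma2 = x.var()                     # MLE of sigma^2: mean squared deviation from x-bar

def loglik(mu):
    return (-n / 2 * np.log(2 * np.pi) - n / 2 * np.log(sigma2)
            - np.sum((x - mu) ** 2) / (2 * sigma2))

candidates = np.linspace(6, 14, 801)
ll = np.array([loglik(m) for m in candidates])

print(candidates[np.argmax(ll)])     # ~ 10.0, the maximum-likelihood estimate
print(x.mean())                      # matches the closed-form MLE, mu-hat = x-bar
```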
