Workshop 4: Statistical modelling intro

Murray Logan

March 10, 2019

Table of contents

1 Introduction

1. Introduction

1.1. Statistical modelling

What is a statistical model?

1.2. Statistical modelling

What is a statistical model?

[Figure: two scatterplots of y against x (x from 0 to 6, y from 0 to 12). The mathematical model is the exact line y = 2 + 1.5x; the statistical model adds an error term, y = 2 + 1.5x + ε, so the points scatter around the line.]
1.3. Statistical modelling

[Figure: scatterplot of y against x with the fitted line y = 2 + 1.5x + ε.]

What is a statistical model?

• a stochastic mathematical expression
• a low-dimensional summary
• relates one or more dependent random variables to one or more independent variables

1.4. Statistical modelling

A random variable is one whose values depend on a set of random events and are described by a probability distribution.

1.5. Statistical modelling

What is a statistical model?

• embodies a data generation process along with the distributional assumptions underlying this generation
• incorporates error (uncertainty): response = model + error

1.6. Statistical modelling

What is the purpose of statistical modelling?
• describe relationships / effects
• estimate effects
• predict outcomes

1.7. Statistical models

How do we estimate model parameters, e.g. for Y ∼ β₀ + β₁X? What criterion do we use to assess best fit?

• Depends on how we assume Y is distributed

1.8. Statistical models

If we assume Y is drawn from a normal (Gaussian) distribution. . .

• Ordinary Least Squares (OLS)

1.9. Estimation

• parameters
  – location (mean)
  – spread (variance)
• uncertainty
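Before turning to estimation, the "response = model + error" idea from slide 1.5 can be sketched directly. This is a minimal illustration only: the coefficients come from the slides' example line y = 2 + 1.5x, while the Gaussian noise with σ = 1 and the fixed seed are assumptions made here for reproducibility.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Coefficients from the slides' example line; sigma is an assumed value
beta0, beta1, sigma = 2.0, 1.5, 1.0
x = [0, 1, 2, 3, 4, 5, 6]

# Mathematical model: the deterministic part only
y_det = [beta0 + beta1 * xi for xi in x]

# Statistical model: response = model + error
y = [yd + random.gauss(0, sigma) for yd in y_det]

print(y_det)  # exact line: [2.0, 3.5, 5.0, 6.5, 8.0, 9.5, 11.0]
print(y)      # the same line, plus random scatter
```

The two printed lists are the two panels of the slide's figure: the second differs from the first only by the random error term.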
1.10. Estimation

1.10.1. Least squares

[Figure: sum of squares plotted against candidate parameter estimates (6 to 14); the minimum occurs at µ = 10.]

1.11. Estimation

1.11.1. Least squares estimates

• Minimize the sum of the squared residuals
• Solve the simultaneous equations

Data:

Y      X
3.0    0
2.5    1
6.0    2
5.5    3
9.0    4
8.6    5
12.0   6

$3.0 = \beta_0 \times 1 + \beta_1 \times 0 + \varepsilon_1$
$2.5 = \beta_0 \times 1 + \beta_1 \times 1 + \varepsilon_2$
$6.0 = \beta_0 \times 1 + \beta_1 \times 2 + \varepsilon_3$
$5.5 = \beta_0 \times 1 + \beta_1 \times 3 + \varepsilon_4$
$9.0 = \beta_0 \times 1 + \beta_1 \times 4 + \varepsilon_5$
$8.6 = \beta_0 \times 1 + \beta_1 \times 5 + \varepsilon_6$
$12.0 = \beta_0 \times 1 + \beta_1 \times 6 + \varepsilon_7$

1.12. Estimation

1.12.1. Least squares estimates

• Minimize the sum of the squared residuals
• Solve the simultaneous equations

Provided the data (and residuals) are Gaussian.
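A minimal sketch of least squares for the table's data, using the closed-form normal-equation solution rather than literally solving the seven simultaneous equations:

```python
# Data from the slide's table
Y = [3.0, 2.5, 6.0, 5.5, 9.0, 8.6, 12.0]
X = [0, 1, 2, 3, 4, 5, 6]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Minimizing the sum of squared residuals gives
# beta1 = Sxy / Sxx and beta0 = y_bar - beta1 * x_bar
Sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
Sxx = sum((xi - x_bar) ** 2 for xi in X)

beta1 = Sxy / Sxx
beta0 = y_bar - beta1 * x_bar
print(beta0, beta1)  # close to the slides' example line y = 2 + 1.5x
```

The estimates come out near β₀ ≈ 2.14 and β₁ ≈ 1.51, consistent with the line drawn through the figures.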
1.13. Gaussian distribution

[Figure: Gaussian probability density functions for µ = 25, σ² = 5; µ = 25, σ² = 2; and µ = 10, σ² = 2, over x from 0 to 40.]

$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

1.14. Linear model assumptions

• Normality
• Homogeneity of variance
• Linearity
• Independence
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)$

$V = \mathrm{cov} = \begin{pmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{pmatrix}$

• Linearity: the systematic part, β₀ + β₁xᵢ
• Normality: εᵢ ∼ N(0, σ²)
• Homogeneity of variance: the same σ² on every diagonal element of V
• Zero covariance (= independence): all off-diagonal elements of V are zero

1.15. Linear model assumptions

What do we do if the data do not satisfy the assumptions?
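The diagonal structure of V encodes two of the four assumptions. A minimal sketch (the values of n and σ² here are arbitrary, chosen only for illustration):

```python
# Variance-covariance matrix of iid Gaussian errors: V = sigma^2 * I.
# Constant sigma^2 down the diagonal is homogeneity of variance;
# zeros off the diagonal are zero covariance, i.e. independence.
n, sigma2 = 4, 2.5
V = [[sigma2 if i == j else 0.0 for j in range(n)] for i in range(n)]

for row in V:
    print(row)
```

Relaxing either property means leaving this matrix: unequal diagonal entries give heteroscedasticity, non-zero off-diagonal entries give correlated errors.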
1.16. Scale transformations

[Figure: histograms of leaf length (cm) on a linear scale and on a log₁₀ scale; the log scale makes the right-skewed distribution far more symmetric.]

1.17. Linear model

$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$

• model embodies data generation processes
• pertains to:
  • effects (linear predictor)
  • distribution

1.18. Data types

Type           Example            Distribution        Range
Measurements   length, weight     Gaussian            real, −∞ < x < ∞
                                  logNormal           real, 0 < x < ∞
                                  Gamma               real, 0 < x < ∞
Counts         abundance          Poisson             discrete, 0 ≤ x < ∞
                                  Negative binomial   discrete, 0 ≤ x < ∞
Binary         presence/absence   Binomial            discrete, x = 0, 1
Proportions    ratio              Binomial            discrete, 0 ≤ x ≤ n
Percentages    percent cover      Binomial            real, 0 ≤ x ≤ 1
                                  Beta                real, 0 ≤ x ≤ 1
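Returning to the scale transformation in 1.16: its effect can be sketched numerically. The data here are simulated from a lognormal distribution (an assumption standing in for real leaf-length measurements), and sample skewness is used to quantify the asymmetry before and after the log₁₀ transform.

```python
import math
import random

random.seed(1)
# Simulated zero-bound, right-skewed "leaf lengths" (lognormal; parameters arbitrary)
leaf = [math.exp(random.gauss(0.5, 0.6)) for _ in range(500)]

def skewness(xs):
    # Standardized third moment: positive for a long right tail
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum(((x - m) / s) ** 3 for x in xs) / n

skew_linear = skewness(leaf)
skew_log = skewness([math.log10(x) for x in leaf])
print(skew_linear)  # strongly positive on the linear scale
print(skew_log)     # near zero on the log10 scale
```

As in the slide's histograms, the transformed values are much closer to symmetric, which is why a log scale can rescue Gaussian-based models for this kind of data.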
What about density?

1.19. Gamma

Zero-bound variables with large variance.

[Figure: Gamma probability density functions for µ = 15 with σ² = 15 (a = 15, s = 1), σ² = 30 (a = 7.5, s = 2), and σ² = 60 (a = 3.75, s = 4), over x from 0 to 40.]

$f(x \mid a, s) = \frac{1}{s^a \Gamma(a)}\, x^{a-1} e^{-x/s}$

a = shape, s = scale; µ = as, σ² = as²

1.20. Poisson distribution

Count data
[Figure: Poisson probability mass functions for λ = 25, λ = 15 and λ = 3, over x from 0 to 40.]

$f(x \mid \lambda) = \frac{e^{-\lambda} \lambda^x}{x!}$

$\mu = \sigma^2 = \lambda$, so the dispersion $\theta = \sigma^2/\mu = 1$

1.21. Negative Binomial

Count data
[Figure: negative binomial probability mass functions, two panels over x from 0 to 40. Left: µ = 15 with ω = 7.5 (θ = 0.133; σ² = 3µ), ω = 3 (θ = 0.333; σ² = 6µ) and ω = 1.667 (θ = 0.6; σ² = 10µ). Right: ω = ∞ (θ = 0, the Poisson limit) for µ = 25, µ = 15 and µ = 3.]

$f(x \mid \mu, \omega) = \frac{\Gamma(x + \omega)}{\Gamma(\omega)\, x!} \times \left(\frac{\mu}{\mu + \omega}\right)^{x} \left(\frac{\omega}{\mu + \omega}\right)^{\omega}$

$\theta\,(\text{dispersion}) = 1/\omega, \qquad \omega = \frac{\mu^2}{\sigma^2 - \mu}, \qquad \theta = 0 \text{ when } \omega = \infty$

1.22. Binomial distribution

Proportions or presence/absence

$f(x \mid n, p) = \binom{n}{x} p^x (1-p)^{n-x}$

$\mu = np, \qquad \sigma^2 = np(1-p)$

For presence/absence, n = 1.

1.23. Beta

Continuous between 0 and 1
[Figure: Beta probability density functions for µ = 0.5, σ² = 0.023 (a = 5, b = 5); µ = 0.167, σ² = 0.019 (a = 1, b = 5); µ = 0.833, σ² = 0.019 (a = 5, b = 1); and µ = 0.5, σ² = 0.125 (a = 0.5, b = 0.5), over x from 0 to 1.]

$f(x \mid a, b) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, x^{a-1} (1-x)^{b-1}$

$\mu = \frac{a}{a+b}, \qquad \sigma^2 = \frac{ab}{(a+b)^2 (a+b+1)}$

• must consider zero-one inflation

1.24. Generalized linear models

$Y = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p + \varepsilon$

$\underbrace{g(\mu)}_{\text{link function}} = \underbrace{\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p}_{\text{systematic}}$

• Random component: Y ∼ Dist(µ, ...)
• Systematic component: the linear predictor, on the scale [−∞, ∞]
• Link function g(): g(µ) = β₀ + β₁x₁ + ... + βₚxₚ

1.25. Generalized linear models

The linear model is just a special case:

• Random component: Y ∼ N(µ, σ²)
• Systematic component: the linear predictor, on the scale [−∞, ∞]
• Link function: the identity, I(µ) = β₀ + β₁x₁ + ... + βₚxₚ
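Before the table of GLM families, the moment relations quoted on slides 1.19–1.23 can be checked numerically. This sketch implements the Poisson and negative binomial mass functions exactly as written above (the negative binomial in log space, since Γ(x + ω) overflows for large x) and recovers their means and variances by summation; the Gamma and Beta moments follow directly from their closed-form parameter relations.

```python
import math

def dpois(x, lam):
    # Poisson: f(x | lambda) = e^-lambda * lambda^x / x!
    return math.exp(-lam) * lam ** x / math.factorial(x)

def dnbinom(x, mu, omega):
    # Negative binomial, evaluated via lgamma to avoid overflow
    lp = (math.lgamma(x + omega) - math.lgamma(omega) - math.lgamma(x + 1)
          + x * math.log(mu / (mu + omega))
          + omega * math.log(omega / (mu + omega)))
    return math.exp(lp)

def moments(pmf, upper):
    # Mean and variance of a discrete distribution by direct summation
    probs = [pmf(x) for x in range(upper)]
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, var

# Poisson: mu = sigma^2 = lambda
print(moments(lambda x: dpois(x, 3), 60))           # approximately (3, 3)

# Negative binomial: sigma^2 = mu + mu^2/omega;
# mu = 15, omega = 7.5 gives sigma^2 = 45 = 3*mu, matching the figure legend
print(moments(lambda x: dnbinom(x, 15, 7.5), 400))  # approximately (15, 45)

# Gamma: mu = a*s, sigma^2 = a*s^2 (the a = 7.5, s = 2 case from the figure)
a, s = 7.5, 2
print(a * s, a * s ** 2)  # 15.0 30.0

# Beta: mu = a/(a+b), sigma^2 = ab/((a+b)^2 (a+b+1)) (the a = b = 5 case)
a, b = 5, 5
print(a / (a + b), a * b / ((a + b) ** 2 * (a + b + 1)))
```

Each result reproduces a (µ, σ²) pair quoted in the corresponding figure legend, confirming the parameterizations are consistent.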
1.26. Generalized linear models

Response variable         Probability distribution   Canonical link function          Model name
Continuous measurements   Gaussian                   identity: µ                      Linear regression
                          Gamma                      inverse: 1/µ                     Gamma regression
Counts                    Poisson                    log: log(µ)                      Poisson regression / log-linear model
                          Negative binomial          log: log(µ)                      Negative binomial regression
                          Quasi-Poisson              log: log(µ)                      Poisson regression
Binary, proportions       Binomial                   logit: log(π/(1−π))              Logistic regression
                                                     probit: Φ⁻¹(π)                   Probit regression
                                                     complementary log-log:           Complementary log-log regression
                                                       log(−log(1−π))
                          Quasi-binomial             logit: log(π/(1−π))              Logistic regression
Percentages               Beta                       logit: log(π/(1−π))              Beta regression

(Φ is the standard normal cumulative distribution function.)

1.27. OLS

[Figure: sum of squares plotted against candidate parameter estimates (6 to 14); the minimum occurs at µ = 10.]
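The logit link from the table above maps a probability in (0, 1) onto the whole real line, where the linear predictor lives. A minimal sketch of the link and its inverse:

```python
import math

def logit(p):
    # Link: probability -> linear predictor scale (-inf, inf)
    return math.log(p / (1 - p))

def inv_logit(eta):
    # Inverse link: linear predictor -> probability in (0, 1)
    return 1 / (1 + math.exp(-eta))

for p in [0.1, 0.5, 0.9]:
    print(p, logit(p), inv_logit(logit(p)))  # round-trips back to p
```

This is why a logistic regression can use an unbounded linear predictor yet always return fitted probabilities between 0 and 1.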
1.28. Maximum Likelihood

$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

$\ln L(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$

Maximum likelihood estimates:

$\hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$

1.29. Maximum Likelihood

[Figure: log-likelihood plotted against candidate parameter estimates (6 to 14); the maximum occurs at µ = 10.]
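The closed-form estimates above can be verified numerically: a sketch that computes the Gaussian log-likelihood from the formula on this slide and checks that the closed-form µ̂ and σ̂² do maximize it. The sample reuses the Y values from the least-squares table earlier, purely as illustrative data.

```python
import math

def loglik(xs, mu, sigma2):
    # Gaussian log-likelihood, exactly as on the slide
    n = len(xs)
    return (-n / 2 * math.log(2 * math.pi)
            - n / 2 * math.log(sigma2)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma2))

xs = [3.0, 2.5, 6.0, 5.5, 9.0, 8.6, 12.0]

# Closed-form maximum likelihood estimates
mu_hat = sum(xs) / len(xs)
s2_hat = sum((x - mu_hat) ** 2 for x in xs) / len(xs)

best = loglik(xs, mu_hat, s2_hat)
print(mu_hat, s2_hat, best)

# Nudging either estimate away from its MLE lowers the log-likelihood
print(best > loglik(xs, mu_hat + 0.1, s2_hat))
print(best > loglik(xs, mu_hat, s2_hat + 0.1))
```

This is the numeric analogue of the 1.29 figure: the log-likelihood profile peaks exactly at the closed-form estimates.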