LMS and GAMLSS: Flexible Regression and Smoothing, Mikis Stasinopoulos


  1. LMS and GAMLSS: Flexible Regression and Smoothing. Mikis Stasinopoulos and Bob Rigby, STORM/FLSC, London Metropolitan University. Royal Statistical Society, February 2016.

  2. Outline. (1) The Lambda, Mu and Sigma method (LMS): the problem; the method; extensions of the method. (2) The Generalised Additive Model for Location Scale and Shape: what is GAMLSS?; distributions and additive terms; R implementation. (3) The lung function data. (4) Conclusions.

  3. The LMS method. The LMS method is a good example of a model built to solve a problem. This is also typical of Tim’s work: he developed the growth curve methodology to solve specific problems and then applied it in a variety of fields (nutrition, medicine, ...).

  4. The problem: the Dutch boys data. BMI: the body mass index of 7294 boys. age: the age in years. Source: van Buuren and Fredriks (2001).

  5. The Dutch boys data: statistical challenges. [Figure: BMI plotted against age.]

  6. The Dutch boys data: histograms by age. [Figure: histograms (percent of total) in one-year age bands, from (1,2] to (19,20].]

  7. The Dutch boys data: centile estimation. [Figure: centile curves using BCT, BMI against age.]

  8. The Dutch boys data: centiles. [Figure: centile curves using BCT, BMI against age.]

  9. The Dutch boys data: conditional distributions plot. [Figure: bmi against age.]

  10. World Health Organisation Child Growth Standards: Girls. [Figure: BMI-for-age, girls 5 to 19 years (z-scores), 2007 WHO Reference. BMI (kg/m²) against age in completed months and years; z-score curves at -3, -2, 0, 1 and 2, with regions labelled Severe thinness, Thinness, Normal, Overweight and Obesity.]

  11. World Health Organisation Child Growth Standards: Boys. [Figure: BMI-for-age, boys 5 to 19 years (z-scores), 2007 WHO Reference. BMI (kg/m²) against age in completed months and years; z-score curves at -3, -2, 0, 1 and 2, with regions labelled Severe thinness, Thinness, Normal, Overweight and Obesity.]

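  12. The method: the LMS method of Cole and Green (1992). Let $Y$ be a random variable with range $Y > 0$, defined through the transformed variable $Z$ given by
      $$
      Z = \begin{cases}
      \dfrac{1}{\sigma\lambda}\left[\left(\dfrac{Y}{\mu}\right)^{\lambda} - 1\right], & \text{if } \lambda \neq 0, \\[1ex]
      \dfrac{1}{\sigma}\log\left(\dfrac{Y}{\mu}\right), & \text{if } \lambda = 0,
      \end{cases}
      $$
      where $Z \sim N(0, 1)$ (strictly a truncated normal, because $Y > 0$) and
      $$
      \mu = h_1(\text{age}), \qquad \sigma = h_2(\text{age}), \qquad \lambda = h_3(\text{age}),
      $$
      where the $h()$ are smooth functions.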
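As a minimal sketch of the transformation on this slide, the following Python functions compute the z-score of an observation and invert it again. The function names and the parameter values in the usage lines are hypothetical, chosen only for illustration; in practice mu, sigma and lam would be the fitted smooth curves evaluated at a given age.

```python
import math


def lms_z_score(y, mu, sigma, lam):
    """Cole-Green transformation of a positive observation y to a z-score."""
    if y <= 0 or mu <= 0 or sigma <= 0:
        raise ValueError("y, mu and sigma must all be positive")
    if lam == 0:
        # Limiting case of the Box-Cox power as lam tends to 0.
        return math.log(y / mu) / sigma
    return ((y / mu) ** lam - 1.0) / (sigma * lam)


def lms_inverse(z, mu, sigma, lam):
    """Value y whose z-score is z (the inverse of lms_z_score)."""
    if lam == 0:
        return mu * math.exp(sigma * z)
    base = 1.0 + sigma * lam * z
    if base <= 0:
        # Such a z cannot arise from any y > 0; this is why Z is truncated.
        raise ValueError("z is outside the range reachable with y > 0")
    return mu * base ** (1.0 / lam)


# Hypothetical parameter values at one age (not taken from the slides).
mu, sigma, lam = 18.0, 0.1, -1.5
z = lms_z_score(22.0, mu, sigma, lam)
y_back = lms_inverse(z, mu, sigma, lam)
```

An observation equal to mu maps to a z-score of zero, which is consistent with reading mu as an approximate median.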

  13. The LMS method.
      1. $Y \sim D(\mu, \sigma, \lambda)$.
      2. All three parameters are smooth functions of one explanatory variable, age.
      3. The parameters can be interpreted as: $\mu$, a location parameter (approximately the median); $\sigma$, a scale parameter (approximately the coefficient of variation); $\lambda$ (which from now on I will call $\nu$), a skewness parameter.
      Is there anything missing?

  14. Extensions of the method: the LMS method and extensions.
      1. $Y \sim D(\mu, \sigma, \nu, \tau)$.
      2. $g_1(\mu) = h_1(\text{age})$: modelling the location parameter; $g_2(\sigma) = h_2(\text{age})$: modelling the scale parameter; $g_3(\nu) = h_3(\text{age})$: modelling the skewness parameter; $g_4(\tau) = h_4(\text{age})$: modelling the kurtosis parameter.
      3. The $g()$ functions are link functions.

  15. The LMS method and extensions. Different assumptions about $Z$ produce different LMS methods:
      1. if $Z \sim N(0, 1)$ then $Y \sim BCCG(\mu, \sigma, \nu)$: the LMS method;
      2. if $Z \sim t_{\tau}$ then $Y \sim BCT(\mu, \sigma, \nu, \tau)$: the LMST method;
      3. if $Z \sim PE(0, 1, \tau)$ then $Y \sim BCPE(\mu, \sigma, \nu, \tau)$: the LMSP method, adopted by WHO.
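To see how the choice of distribution for $Z$ feeds through to the centile curves, the transformation can be inverted. The following is a sketch of the usual approximation, which ignores the truncation of $Z$; here $z_p$ is an assumed notation for the $100p$-th centile of whichever distribution is chosen for $Z$ (normal, $t_{\tau}$ or power exponential).

```latex
y_p =
\begin{cases}
\mu \left( 1 + \sigma \nu z_p \right)^{1/\nu}, & \text{if } \nu \neq 0, \\[1ex]
\mu \exp\left( \sigma z_p \right), & \text{if } \nu = 0.
\end{cases}
```

Under this approximation the three methods differ only through $z_p$: heavier tails in $Z$ push the outer centiles further from the median while leaving $y_{0.5} = \mu$ unchanged.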

  16. Generalised additive model for location scale and shape, Rigby and Stasinopoulos (2005):
      $$
      \begin{aligned}
      y &\sim D(\mu, \sigma, \nu, \tau) \\
      g_{\mu}(\mu) &= X_{\mu}\beta_{\mu} + h_{1,\mu}(x_{1,\mu}) + \dots + h_{k,\mu}(x_{k,\mu}) \\
      g_{\sigma}(\sigma) &= X_{\sigma}\beta_{\sigma} + h_{1,\sigma}(x_{1,\sigma}) + \dots + h_{k,\sigma}(x_{k,\sigma}) \\
      g_{\nu}(\nu) &= X_{\nu}\beta_{\nu} + h_{1,\nu}(x_{1,\nu}) + \dots + h_{k,\nu}(x_{k,\nu}) \\
      g_{\tau}(\tau) &= X_{\tau}\beta_{\tau} + h_{1,\tau}(x_{1,\tau}) + \dots + h_{k,\tau}(x_{k,\tau})
      \end{aligned}
      $$
      where $D(\mu, \sigma, \nu, \tau)$ can be any distribution and the $h_j(x_j)$ are smooth functions of the $X$’s.
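The structure of each equation above (a linear part plus smooth terms, mapped back to the parameter through the inverse link) can be sketched in a few lines of Python. Everything named here is hypothetical: the coefficients, the stand-in "smooth" function and the choice of links are illustrative assumptions, not fitted values or package defaults.

```python
import math


def additive_predictor(x_linear, beta, smooth_terms):
    """eta = X beta + sum_j h_j(x_j) for a single observation."""
    linear = sum(b * x for b, x in zip(beta, x_linear))
    smooth = sum(h(x) for h, x in smooth_terms)
    return linear + smooth


# Inverse link functions: they map the predictor eta back to the parameter.
INVERSE_LINKS = {
    "identity": lambda eta: eta,
    "log": math.exp,  # keeps the parameter positive
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),  # keeps it in (0, 1)
}


def parameter(link, x_linear, beta, smooth_terms):
    """Distribution parameter implied by a link and an additive predictor."""
    return INVERSE_LINKS[link](additive_predictor(x_linear, beta, smooth_terms))


# Hypothetical example: mu has an intercept plus one smooth term in age,
# sigma has an intercept and a linear age effect under a log link.
age = 10.0
mu = parameter("identity", [1.0], [15.0], [(lambda a: 0.3 * math.sqrt(a), age)])
sigma = parameter("log", [1.0, age], [-2.5, 0.02], [])
```

In a fitted model the lambda standing in for a smooth term would be replaced by an estimated smoother; the point of the sketch is only that each parameter gets its own predictor and its own link.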

  17. GAMLSS assumptions. [Figure: y plotted against x.]

  18. What is GAMLSS? GAMLSS are semi-parametric regression-type models:
      regression type: we have many explanatory variables X and one response variable y, and we believe that X → y;
      parametric: a parametric distribution is assumed for the response variable;
      semi: the parameters of the distribution, as functions of explanatory variables, may involve non-parametric smoothing functions.
      GAMLSS philosophy: try different models.
      GAMLSS is a generalisation of GLM and GAM models.

  19. GAMLSS: distributions. There are around 80 discrete, continuous and mixed distributions, implemented as gamlss.family in R, including highly skew and kurtotic distributions. Creating a new distribution is relatively easy:
      truncating an existing distribution;
      using a censored version of an existing distribution;
      mixing different distributions to create a new finite mixture distribution;
      discretising continuous distributions;
      creating a log or logit version of any continuous distribution on (−∞, ∞).
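The last item can be illustrated with a change of variables. The sketch below, in Python rather than R, builds the density of Y = exp(Z) on (0, ∞) and of Y = 1/(1 + exp(−Z)) on (0, 1) from the density of any Z on the real line. The function names are hypothetical, and the standard normal is used only as an example base distribution; this is not the gamlss implementation.

```python
import math
from statistics import NormalDist


def log_version_pdf(base_pdf, y):
    """Density on (0, inf) of Y = exp(Z), where Z has density base_pdf."""
    if y <= 0.0:
        return 0.0
    # Jacobian of z = log(y) is 1 / y.
    return base_pdf(math.log(y)) / y


def logit_version_pdf(base_pdf, y):
    """Density on (0, 1) of Y = 1 / (1 + exp(-Z)), where Z has density base_pdf."""
    if not 0.0 < y < 1.0:
        return 0.0
    # Jacobian of z = log(y / (1 - y)) is 1 / (y * (1 - y)).
    return base_pdf(math.log(y / (1.0 - y))) / (y * (1.0 - y))


# Example base distribution on the real line (an illustrative choice).
base = NormalDist(0.0, 1.0)
lognormal_at_one = log_version_pdf(base.pdf, 1.0)
logitnormal_at_half = logit_version_pdf(base.pdf, 0.5)
```

The same two wrappers work for any base density on the real line, which is why one continuous family yields a positive-valued and a unit-interval family with no extra modelling effort.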

  20. Additive terms (additive term: R name).
      P-splines: pb(), pbm(), cy()
      Varying coefficient: pvc()
      Cubic splines: cs()
      Loess / neural networks: lo(), nn()
      Fractional/piecewise polynomials: fp(), fk()
      Non-linear fit: nl()
      Random effects: random(), re()
      Ridge regression: ri()
      Simon Wood’s GAM: ga()
      Decision trees: tr()
      Random walk and AR: rw(), ar()

  21. GAMLSS: R implementation. GAMLSS is implemented in a series of packages in R:
      gamlss: the original package
      gamlss.dist: for distributions
      gamlss.data: for data sets
      gamlss.demo: for demos
      gamlss.nl: for non-linear terms
      gamlss.tr: for truncated distributions
      gamlss.cens: for censored (left, right or interval) response variables
      gamlss.mx: for finite mixtures and random effects
      gamlss.spatial: for Gaussian Markov random fields
      The GAMLSS packages can be downloaded from CRAN, the R library at http://www.r-project.org/

  22. The lung function data: 3164 male observations. [Figure: FEV1/FVC plotted against height.]

  23. The lung function data. Y = FEV1/FVC: the spirometric lung function, an established index for diagnosing airway obstruction (3164 males). height: the height in cm. Source: Stanojevic et al. (2009).

  24. The lung function data: four different models, Hossain et al. (2015).
      1. The LMS model
      2. Beta inflated (at 1)
      3. Logit skew Student t distribution (logitSST) inflated (at 1)
      4. Generalised Tobit models

  25. The lung function data: the LMS model.
      $$
      \begin{aligned}
      Y &\sim BCT(\mu, \sigma, \nu, \tau) \\
      g_1(\mu) &= h_1(x) \\
      g_2(\sigma) &= h_2(x) \\
      g_3(\nu) &= h_3(x) \\
      g_4(\tau) &= h_4(x) \\
      x &= v^{\xi}
      \end{aligned}
      $$

  26. The lung function data: the Beta inflated (at 1) model.
      $$
      \begin{aligned}
      Y &\sim BEINF1(\mu, \sigma, \nu) \\
      g_1(\mu) &= h_1(x) \\
      g_2(\sigma) &= h_2(x) \\
      g_3(\nu) &= h_3(x) \\
      x &= v^{\xi}
      \end{aligned}
      $$
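An "inflated at 1" model is a mixed distribution: a point probability at Y = 1 plus a continuous density on (0, 1). The sketch below shows that structure in Python with a beta density as the continuous part. It is an illustration under stated assumptions: the shape parameters a and b use the textbook beta parametrisation, not the (μ, σ, ν) parametrisation of BEINF1, and p1 and all numeric values are hypothetical.

```python
import math


def beta_pdf(y, a, b):
    """Beta density on (0, 1) with shape parameters a and b."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def inflated_at_one_likelihood(y, p1, a, b):
    """Likelihood contribution of one observation y in (0, 1].

    With probability p1 the response is exactly 1; otherwise it follows
    the continuous beta density on (0, 1).
    """
    if y == 1.0:
        return p1
    if 0.0 < y < 1.0:
        return (1.0 - p1) * beta_pdf(y, a, b)
    return 0.0


# Hypothetical values: 10% of responses equal to 1, the rest skewed towards 1.
p1, a, b = 0.1, 8.0, 2.0
at_one = inflated_at_one_likelihood(1.0, p1, a, b)
at_half = inflated_at_one_likelihood(0.5, p1, a, b)
```

A ratio such as FEV1/FVC can equal 1 exactly, which a purely continuous distribution on (0, 1) cannot represent; the point mass handles those observations separately.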

  27. The lung function data: the Logit-SST inflated (at 1) model.
      $$
      \begin{aligned}
      Y &\sim logitSSTat1(\mu, \sigma, \nu, \tau, \xi) \\
      \mu &= h_1(x) \\
      \log \sigma &= h_2(x) \\
      \log \nu &= h_3(x) \\
      \log \tau &= h_4(x) \\
      \log\left(\frac{p_1}{1 - p_1}\right) &= h_5(x) \\
      x &= v^{\xi}
      \end{aligned}
      $$
