Introduction to General and Generalized Linear Models: Mixed effects models, Part IV


  1. Introduction to General and Generalized Linear Models. Mixed effects models, Part IV. Henrik Madsen and Poul Thyregod, Informatics and Mathematical Modelling, Technical University of Denmark, DK-2800 Kgs. Lyngby. January 2011.

  2. This lecture

  - General mixed effects models
  - Laplace approximation

  3. General mixed effects models

  Let us now look at methods to deal with nonlinear and non-normal mixed effects models. In general it will be impossible to obtain closed form solutions, and hence numerical methods must be used. Estimation and inference will be based on likelihood principles.

  4. General mixed effects models

  The general mixed effects model can be represented by its likelihood function

  $$L_M(\theta; y) = \int_{\mathbb{R}^q} L(\theta; u, y) \, du$$

  where $y$ denotes the observed random variables, $\theta$ the model parameters to be estimated, and $u$ the $q$ unobserved random variables (random effects). The likelihood function $L$ is the joint likelihood of both the observed and the unobserved random variables. The likelihood function for estimating $\theta$ is the marginal likelihood $L_M$, obtained by integrating out the unobserved random variables.
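
  To make the marginal likelihood concrete, here is a minimal numerical sketch for a toy model with a single Gaussian random effect ($q = 1$) and Gaussian observations. The model, parameter values and data are illustrative assumptions, not taken from the slides.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

y = np.array([1.2, 0.8, 1.5])            # illustrative observations
beta, sigma, psi = 1.0, 0.5, 1.0         # fixed effect, obs. std, random-effect std

def joint_likelihood(u):
    # L(theta; u, y) = f(y | u) * f(u): data model times random-effect model
    f_y_given_u = stats.norm.pdf(y, loc=beta + u, scale=sigma).prod()
    f_u = stats.norm.pdf(u, scale=psi)
    return f_y_given_u * f_u

# Marginal likelihood L_M(theta; y): integrate the random effect out.
# With q = 1 a one-dimensional quadrature over a wide range suffices.
L_M, _ = quad(joint_likelihood, -10, 10)
print(L_M)
```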

  5. General mixed effects models

  The integral shown on the previous slide is generally difficult to solve when the number of unobserved random variables is more than a few, i.e. for large values of $q$. A large value of $q$ significantly increases the computational demands because of the product rule: if an integral is sampled at $m$ points per dimension, the total number of integrand evaluations is $m^q$, which rapidly becomes infeasible even for a limited number of random effects. The likelihood function gives a very broad definition of mixed models: the only requirement for using mixed modeling is to define a joint likelihood function for the model of interest. In this way mixed modeling can be applied to any likelihood based statistical modeling. Examples of applications are linear mixed models (LMM), nonlinear mixed models (NLMM) and generalized linear mixed models, but also models based on Markov chains, ODEs or SDEs.
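
  A quick numerical illustration of the product rule (the choice of $m = 10$ points per dimension is arbitrary):

```python
# With m sample points per dimension, a q-dimensional grid-based
# integration needs m**q integrand evaluations.
m = 10                                    # points per dimension (illustrative)
for q in (1, 2, 5, 10, 20):
    print(f"q = {q:2d}: {m**q:.1e} integrand evaluations")
```

  Already at $q = 20$ the grid requires $10^{20}$ evaluations, which is why one-point schemes such as the Laplace approximation become attractive.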

  6. General mixed effects models: Hierarchical models

  As for the Gaussian linear mixed models, it is useful to formulate the model as a hierarchical model containing a first stage model $f_{Y|u}(y; u, \beta)$, which is a model for the data given the random effects, and a second stage model $f_U(u; \Psi)$, which is a model for the random effects. The total set of parameters is $\theta = (\beta, \Psi)$. Hence the joint likelihood is given as

  $$L(\beta, \Psi; u, y) = f_{Y|u}(y; u, \beta) \, f_U(u; \Psi)$$
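
  As an illustration of the two-stage formulation, the sketch below assembles a joint log-likelihood from a Poisson first stage and a Gaussian second stage, i.e. a generalized linear mixed model with a single random effect. The specific model and values are assumptions made for the example.

```python
import numpy as np
from scipy import stats

def joint_loglik(beta, psi, u, y):
    # First stage: model for the data given the random effect,
    # y | u ~ Pois(exp(beta + u)) (log-link Poisson)
    l_first = stats.poisson.logpmf(y, mu=np.exp(beta + u)).sum()
    # Second stage: model for the random effect, u ~ N(0, psi)
    l_second = stats.norm.logpdf(u, scale=np.sqrt(psi))
    # Joint log-likelihood: log f_{Y|u}(y; u, beta) + log f_U(u; Psi)
    return l_first + l_second

y = np.array([3, 5, 2, 4])                # illustrative count data
print(joint_loglik(beta=1.0, psi=0.5, u=0.2, y=y))
```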

  7. General mixed effects models: Hierarchical models

  To obtain the likelihood for the model parameters $(\beta, \Psi)$, the unobserved random effects are again integrated out. The likelihood function for estimating $(\beta, \Psi)$ is, as before, the marginal likelihood

  $$L_M(\beta, \Psi; y) = \int_{\mathbb{R}^q} L(\beta, \Psi; u, y) \, du$$

  where $q$ is the number of random effects, and $\beta$ and $\Psi$ are the parameters to be estimated.

  8. General mixed effects models: Grouping structures and nested effects

  For nonlinear mixed models where no closed form solution to the likelihood function is available, it is necessary to invoke some form of numerical approximation in order to estimate the model parameters. The complexity of this problem depends mainly on the dimensionality of the integration problem, which in turn depends on the dimension of $u$ and in particular on the grouping structure of the random effects in the data. These structures include a single grouping, nested grouping, and partially crossed and crossed random effects. For problems with only one level of grouping the marginal likelihood simplifies to

  $$L_M(\beta, \Psi; y) = \prod_{i=1}^{M} \int_{\mathbb{R}^{q_i}} f_{Y|u_i}(y; u_i, \beta) \, f_{U_i}(u_i; \Psi) \, du_i$$

  where $q_i$ is the number of random effects for group $i$ and $M$ is the number of groups.
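
  The factorization can be exploited directly. The following sketch evaluates the marginal log-likelihood as a sum of $M$ one-dimensional integrals, one per group, using Gauss-Hermite quadrature for an assumed Gaussian model with one random intercept per group (all values illustrative).

```python
import numpy as np
from scipy import stats

def marginal_loglik(beta, sigma, psi, groups):
    # Probabilists' Gauss-Hermite rule: weight function exp(-x^2/2),
    # weights sum to sqrt(2*pi).
    nodes, weights = np.polynomial.hermite_e.hermegauss(30)
    total = 0.0
    for y_i in groups:                    # one 1-D integral per group
        u = np.sqrt(psi) * nodes          # quadrature nodes for u_i ~ N(0, psi)
        # f(y_i | u) evaluated at each node
        f = np.array([stats.norm.pdf(y_i, loc=beta + uk, scale=sigma).prod()
                      for uk in u])
        total += np.log((weights * f).sum() / np.sqrt(2 * np.pi))
    return total

groups = [np.array([1.2, 0.8]), np.array([2.1, 1.7, 1.9])]  # M = 2 groups
print(marginal_loglik(beta=1.0, sigma=0.5, psi=1.0, groups=groups))
```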

  9. General mixed effects models: Grouping structures and nested effects

  Instead of having to solve an integral of dimension $q$, it is only necessary to solve $M$ smaller integrals of dimension $q_i$. In typical applications there is often just one or only a few random effects per group, which greatly reduces the complexity of the integration problem. If the data has a nested grouping structure, a reduction of the dimensionality of the integral similar to that shown on the previous slide can be performed. An example of a nested grouping structure is data collected from a number of schools, with a number of classes within each school and a number of students from each class.

  10. General mixed effects models: Grouping structures and nested effects

  If the nonlinear mixed model is extended to include an arbitrary structure of random effects, such as crossed or partially crossed random effects, it is required to evaluate the full multi-dimensional integral. Estimation in these models can be handled efficiently using the multivariate Laplace approximation, which only samples the integrand at one point common to all dimensions.

  11. Laplace approximation: The Laplace approximation

  For a given set of model parameters $\theta$ the joint log-likelihood $\ell(\theta, u, y) = \log L(\theta, u, y)$ is approximated by a second order Taylor expansion around the optimum $\tilde{u} = \hat{u}_\theta$ of the log-likelihood function with respect to the unobserved random variables $u$:

  $$\ell(\theta, u, y) \approx \ell(\theta, \tilde{u}, y) - \frac{1}{2} (u - \tilde{u})^T H(\tilde{u}) (u - \tilde{u})$$

  where the first-order term of the Taylor expansion disappears since the expansion is done around the optimum $\tilde{u}$, and

  $$H(\tilde{u}) = -\ell''_{uu}(\theta, u, y) \big|_{u = \tilde{u}}$$

  is the negative Hessian of the joint log-likelihood evaluated at $\tilde{u}$, which will simply be referred to as "the Hessian".

  12. Laplace approximation: The Laplace approximation

  Using this approximation, the Laplace approximation of the marginal log-likelihood becomes

  $$\ell_{M,LA}(\theta, y) = \log \int_{\mathbb{R}^q} \exp\left( \ell(\theta, \tilde{u}, y) - \tfrac{1}{2} (u - \tilde{u})^T H(\tilde{u}) (u - \tilde{u}) \right) du = \ell(\theta, \tilde{u}, y) - \frac{1}{2} \log \left| \frac{H(\tilde{u})}{2\pi} \right|$$

  The integral is eliminated by transforming it into an integration of a multivariate Gaussian density with mean $\tilde{u}$ and covariance $H^{-1}(\tilde{u})$.
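
  This recipe translates directly into code. The sketch below applies the Laplace approximation to a scalar random effect in an assumed Poisson model: an inner optimization finds $\tilde{u}$, central differences approximate $H(\tilde{u})$, and the formula above gives the approximate marginal log-likelihood. The model and the finite-difference step size are illustrative choices.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize_scalar

y = np.array([3, 5, 2, 4])                # illustrative count data

def joint_loglik(u, beta=1.0, psi=0.5):
    # Joint log-likelihood for a toy model with a scalar Gaussian random
    # effect: y | u ~ Pois(exp(beta + u)), u ~ N(0, psi).
    return (stats.poisson.logpmf(y, mu=np.exp(beta + u)).sum()
            + stats.norm.logpdf(u, scale=np.sqrt(psi)))

# Inner problem: u_tilde maximizes the joint log-likelihood in u.
u_tilde = minimize_scalar(lambda u: -joint_loglik(u)).x

# H(u_tilde): negative second derivative of the joint log-likelihood,
# approximated by central differences (step size is an arbitrary choice).
h = 1e-5
H = -(joint_loglik(u_tilde + h) - 2 * joint_loglik(u_tilde)
      + joint_loglik(u_tilde - h)) / h**2

# Laplace approximation of the marginal log-likelihood; in the scalar case
# log|H / (2*pi)| reduces to log(H / (2*pi)).
loglik_LA = joint_loglik(u_tilde) - 0.5 * np.log(H / (2 * np.pi))
print(loglik_LA)
```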

  13. Laplace approximation: The Laplace approximation

  The Laplace likelihood only approximates the marginal likelihood for mixed models with nonlinear random effects, and thus maximizing the Laplace likelihood will result in some amount of error in the resulting estimates. It can be shown that the joint log-likelihood converges to a quadratic function of the random effect as the number of observations per random effect increases, and thus that the Laplace approximation is asymptotically exact. In practical applications the accuracy of the Laplace approximation may still be of concern, but more accurate numerical approximations of the marginal likelihood (such as Gaussian quadrature) may easily be computationally infeasible. Another option for improving the accuracy is importance sampling.
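
  As a sketch of the importance sampling option, the example below estimates the log marginal likelihood of the same toy Poisson model used in the Laplace sketch (redefined here so the snippet is self-contained), drawing proposals from the Laplace Gaussian $N(\tilde{u}, H^{-1}(\tilde{u}))$. The sample size and seed are arbitrary.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize_scalar

y = np.array([3, 5, 2, 4])                # illustrative count data

def joint_loglik(u, beta=1.0, psi=0.5):
    # Same toy model as above: y | u ~ Pois(exp(beta + u)), u ~ N(0, psi)
    return (stats.poisson.logpmf(y, mu=np.exp(beta + u)).sum()
            + stats.norm.logpdf(u, scale=np.sqrt(psi)))

u_tilde = minimize_scalar(lambda u: -joint_loglik(u)).x
h = 1e-5
H = -(joint_loglik(u_tilde + h) - 2 * joint_loglik(u_tilde)
      + joint_loglik(u_tilde - h)) / h**2

# Importance sampling with proposal g = N(u_tilde, 1/H): the marginal
# likelihood is E_g[L(theta; u, y) / g(u)].
rng = np.random.default_rng(0)
u_s = rng.normal(u_tilde, np.sqrt(1.0 / H), size=10_000)
log_w = (np.array([joint_loglik(u) for u in u_s])
         - stats.norm.logpdf(u_s, loc=u_tilde, scale=np.sqrt(1.0 / H)))

# log of the mean importance weight, computed stably (log-mean-exp)
c = log_w.max()
loglik_IS = c + np.log(np.exp(log_w - c).mean())
print(loglik_IS)
```

  Because the weights are handled on the log scale and recentered before exponentiation, the estimate stays numerically stable even when the joint likelihood is tiny.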

  14. Laplace approximation: Two-level hierarchical model

  For the two-level or hierarchical model it is readily seen that the joint log-likelihood is

  $$\ell(\theta, u, y) = \ell(\beta, \Psi, u, y) = \log f_{Y|u}(y; u, \beta) + \log f_U(u; \Psi)$$

  which implies that the Laplace approximation becomes

  $$\ell_{M,LA}(\theta, y) = \log f_{Y|u}(y; \tilde{u}, \beta) + \log f_U(\tilde{u}; \Psi) - \frac{1}{2} \log \left| \frac{H(\tilde{u})}{2\pi} \right|$$

  It is clear that as long as a likelihood function of the random effects and model parameters can be defined, it is possible to use the Laplace likelihood for estimation in a mixed model framework.

  15. Laplace approximation: Gaussian second stage model

  Let us assume that the second stage model is zero mean Gaussian, i.e.

  $$u \sim N(0, \Psi)$$

  which means that the random effect distribution is completely described by its covariance matrix $\Psi$. In this case the Laplace likelihood becomes

  $$\ell_{M,LA}(\theta, y) = \log f_{Y|u}(y; \tilde{u}, \beta) - \frac{1}{2} \log |\Psi| - \frac{1}{2} \tilde{u}^T \Psi^{-1} \tilde{u} - \frac{1}{2} \log |H(\tilde{u})|$$

  Note that we still make no assumptions on the first stage model $f_{Y|u}(y; u, \beta)$.
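
  The Gaussian second stage formula can be implemented directly. The sketch below does so for an assumed Poisson first stage with one random intercept per observation; for this particular first stage the Hessian has a simple closed form, but in general it would be obtained numerically.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

# Illustrative model: y_i | u_i ~ Pois(exp(beta + u_i)), u ~ N(0, Psi),
# one random intercept per observation. All values are arbitrary.
y = np.array([3, 5, 2, 4])
beta = 1.0
Psi = 0.5 * np.eye(len(y))                # random-effect covariance

def first_stage_loglik(u):
    # log f_{Y|u}(y; u, beta) for the Poisson first stage with log link
    return stats.poisson.logpmf(y, mu=np.exp(beta + u)).sum()

def neg_joint_loglik(u):
    # Joint log-likelihood up to an additive constant; constants do not
    # affect the location of the optimum u_tilde.
    return -(first_stage_loglik(u) - 0.5 * u @ np.linalg.solve(Psi, u))

u_tilde = minimize(neg_joint_loglik, x0=np.zeros(len(y))).x

# For this Poisson/log-link first stage the negative Hessian of the joint
# log-likelihood is available in closed form: diag(exp(beta + u)) + Psi^{-1}.
H = np.diag(np.exp(beta + u_tilde)) + np.linalg.inv(Psi)

# Gaussian second stage Laplace likelihood from the slide:
loglik_LA = (first_stage_loglik(u_tilde)
             - 0.5 * np.linalg.slogdet(Psi)[1]
             - 0.5 * u_tilde @ np.linalg.solve(Psi, u_tilde)
             - 0.5 * np.linalg.slogdet(H)[1])
print(loglik_LA)
```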
