Conditional Inference Functions for Mixed-Effects Models with Unspecified Random-Effects Distribution
Annie Qu
Joint work with Peng Wang and Cindy Tsai
Department of Statistics, University of Illinois at Urbana-Champaign
International Workshop on Perspectives on High-dimensional Data Analysis
Fields Institute, Toronto, June 2011

Motivating Example
- A longitudinal observational study of the effect of non-surgical periodontal treatment on tooth loss
- 722 subjects with 7 years of follow-up
- Main covariate: non-surgical periodontal treatment (1 or 0) during the three years before the study
- Other covariates: gender, age, and variables measuring tooth health
- Subjects differ from one another (subject-specific variation)

[Figure: A Graph of Longitudinal Data]

Longitudinal Data
- Tooth loss and other covariates are recorded repeatedly over a 7-year period
- Measurements within the same subject are correlated
- Major approaches for correlated data:
  - Marginal models
  - Mixed-effects models

Marginal Models
- Inference about the population average is the main focus
- Generalized Estimating Equations (GEE) (Liang & Zeger, 1986); Quadratic Inference Functions (QIF) (Qu et al., 2000):
  - Do not require a likelihood function
  - Are consistent even if the correlation structure is misspecified
  - Give an efficient estimator when the working correlation is correct
  - Provide a robust sandwich variance estimator

Mixed Models
- There is heterogeneity among subjects
- Able to incorporate several sources of variation: random effects and serial correlation
- Limitations:
  - Require a parametric assumption for the random effects, usually normality
  - Involve high-dimensional integration for non-normal random effects

Existing Methods for Generalized Linear Mixed-Effects Models
- Penalized quasilikelihood (PQL) (Breslow and Clayton, 1993)
- Hierarchical generalized linear model (HGLM) (Lee and Nelder, 1996, 2001)
- Conditional likelihood (Jiang, 1999)
- Conditional second-order generalized estimating equations (Vonesh et al., 2002)

Limitations and Assumptions
- Require a normality assumption for the random effects (PQL, conditional second-order GEE)
- Require estimation of variance components (PQL, conditional second-order GEE)
- Do not incorporate serial correlation (PQL, HGLM, conditional likelihood)

Advantages of the Proposed Approach
- A new approach using the conditional quadratic inference function
- Does not require a distributional assumption for the random effects
- Does not require the likelihood function; only the first two moments are involved
- Accommodates variation from both random effects and serial correlation
- Does not require estimation of unknown variance components or correlation parameters
- Challenge: the dimension of the random-effects parameters increases with the sample size

GEE
Generalized estimating equations (Liang & Zeger, 1986) can be represented as
\[
\sum_{i=1}^{N} \left(\frac{\partial \mu_i}{\partial \beta}\right)' A_i^{-1/2} R^{-1}(\alpha) A_i^{-1/2} (y_i - \mu_i) = 0,
\]
where
- $y_i = (y_{i1}, \ldots, y_{it})$ is the response vector for the $i$th subject,
- $\mu_i = E(y_i) = (\mu_{i1}, \ldots, \mu_{it})$ is the mean vector for the $i$th subject,
- $A_i$ is a diagonal matrix of variance components of $y_i$, and
- $R(\alpha)$ is the working correlation.

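A minimal sketch of the estimating equation above, under assumptions that are not on the slide: an identity link with constant variance (so $A_i$ is proportional to the identity and $\partial\mu_i/\partial\beta = X_i$), a balanced design, and a fixed working correlation `R`. In that special case the equation is linear in $\beta$ and can be solved directly. The function name `gee_linear` and the simulated data are illustrative only.

```python
import numpy as np

def gee_linear(X, y, R):
    """Solve sum_i X_i' R^{-1} (y_i - X_i beta) = 0 for beta.

    X: (N, t, p) covariates, y: (N, t) responses, R: (t, t) working correlation.
    Assumes an identity link and constant variance (A_i proportional to I).
    """
    R_inv = np.linalg.inv(R)
    lhs = np.einsum("itp,ts,isq->pq", X, R_inv, X)   # sum_i X_i' R^{-1} X_i
    rhs = np.einsum("itp,ts,is->p", X, R_inv, y)     # sum_i X_i' R^{-1} y_i
    return np.linalg.solve(lhs, rhs)

rng = np.random.default_rng(0)
N, t, p = 50, 4, 2
X = rng.normal(size=(N, t, p))
beta_true = np.array([1.0, -2.0])
y = X @ beta_true                                # noise-free responses, for checking only
R = 0.3 * np.ones((t, t)) + 0.7 * np.eye(t)      # exchangeable working correlation
beta_hat = gee_linear(X, y, R)
```

With noise-free responses the solution reproduces `beta_true` for any positive-definite `R`, which is a quick sanity check that the working correlation affects efficiency rather than the target of estimation.
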
Representation of Correlation Matrix
- Approximate $R^{-1}$ by $\sum_{j=1}^{m} a_j M_j$
  - $M_1, \ldots, M_m$ are known basis matrices
  - $a_1, \ldots, a_m$ are unknown constants
- The linear representation can accommodate most common working correlation structures, such as AR-1, exchangeable or block diagonal

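The linear representation can be checked numerically. The sketch below uses the standard basis choices for the exchangeable and AR-1 structures (identity plus off-diagonal ones; identity, first off-diagonals, and the two corner entries). The helper names and the least-squares fit are illustrative assumptions, used here only to verify that the representation is exact for these two structures.

```python
import numpy as np

def exchangeable_basis(t):
    """M_1 = I, M_2 = ones off the diagonal."""
    return [np.eye(t), np.ones((t, t)) - np.eye(t)]

def ar1_basis(t):
    """M_1 = I, M_2 = ones on the first off-diagonals, M_3 = ones at the two corners."""
    m3 = np.zeros((t, t))
    m3[0, 0] = m3[-1, -1] = 1.0
    return [np.eye(t), np.eye(t, k=1) + np.eye(t, k=-1), m3]

def fit_coefficients(r_inv, basis):
    """Least-squares a_j such that sum_j a_j M_j is closest to r_inv."""
    design = np.column_stack([m.ravel() for m in basis])
    coef, *_ = np.linalg.lstsq(design, r_inv.ravel(), rcond=None)
    approx = sum(a * m for a, m in zip(coef, basis))
    return coef, approx

t, rho = 4, 0.5
r_exch = (1 - rho) * np.eye(t) + rho * np.ones((t, t))
lags = np.abs(np.subtract.outer(np.arange(t), np.arange(t)))
r_ar1 = rho ** lags

coef_exch, approx_exch = fit_coefficients(np.linalg.inv(r_exch), exchangeable_basis(t))
coef_ar1, approx_ar1 = fit_coefficients(np.linalg.inv(r_ar1), ar1_basis(t))
```

In QIF the constants $a_j$ are never estimated; the fit here only confirms that such constants exist for these structures.
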
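QIF Approach (Qu et al., 2000)
GEE:
\[
\sum_{i=1}^{N} \left(\frac{\partial \mu_i}{\partial \beta}\right)' A_i^{-1/2} R^{-1}(\alpha) A_i^{-1/2} (y_i - \mu_i) = 0
\]
Substitute $R^{-1} \approx \sum_{j=1}^{m} a_j M_j$ into the GEE, writing $\dot\mu_i = \partial\mu_i/\partial\beta$:
\[
g = \sum_{i=1}^{N} \dot\mu_i' A_i^{-1/2} \Big( \sum_{j=1}^{m} a_j M_j \Big) A_i^{-1/2} (y_i - \mu_i)
\]
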
QIF Approach
Define the extended score
\[
\bar G_N(\beta) = \frac{1}{N} \sum_{i=1}^{N} g_i(\beta) = \frac{1}{N}
\begin{pmatrix}
\sum_{i=1}^{N} \dot\mu_i' A_i^{-1/2} M_1 A_i^{-1/2} (y_i - \mu_i) \\
\vdots \\
\sum_{i=1}^{N} \dot\mu_i' A_i^{-1/2} M_m A_i^{-1/2} (y_i - \mu_i)
\end{pmatrix}
\]
- The GEE is a linear combination of $\bar G_N(\beta)$
- The QIF estimator is
\[
\hat\beta = \arg\min_\beta \bar G_N' C_N^{-1} \bar G_N, \qquad C_N = \frac{1}{N} \sum_{i=1}^{N} g_i(\beta) g_i'(\beta)
\]
- The QIF estimator $\hat\beta$ is more efficient than the GEE estimator when the correlation structure is misspecified
- It provides an objective and inference function for model checking and testing

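The quadratic inference function can be evaluated in a few lines. This is a sketch under simplifying assumptions that are not on the slide: an identity link with $A_i = I$ (so $\dot\mu_i = X_i$), a balanced design, and the exchangeable basis $M_1 = I$, $M_2 = $ ones off the diagonal. The function name `qif_objective` and the simulated data are hypothetical.

```python
import numpy as np

def qif_objective(beta, X, y, basis):
    """Return Q(beta) = G_N' C_N^{-1} G_N for an identity-link model with A_i = I."""
    resid = y - X @ beta                                    # (N, t)
    # g_i stacks X_i' M_j (y_i - mu_i) over the basis matrices M_1..M_m
    g = np.concatenate(
        [np.einsum("itp,ts,is->ip", X, M, resid) for M in basis], axis=1
    )                                                       # (N, m * p)
    G = g.mean(axis=0)                                      # extended score
    C = g.T @ g / len(g)                                    # (1/N) sum_i g_i g_i'
    return float(G @ np.linalg.solve(C, G))

rng = np.random.default_rng(1)
N, t, p = 200, 4, 2
X = rng.normal(size=(N, t, p))
beta_true = np.array([1.0, -2.0])
# a shared subject-level shift induces exchangeable within-subject correlation
y = X @ beta_true + rng.normal(size=(N, 1)) + rng.normal(size=(N, t))
basis = [np.eye(t), np.ones((t, t)) - np.eye(t)]

q_true = qif_objective(beta_true, X, y, basis)
q_far = qif_objective(beta_true + 5.0, X, y, basis)
```

The objective is small near the true parameter and larger away from it, so $\hat\beta$ can be found by passing `qif_objective` to any general-purpose minimizer.
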
Mixed-Effects Model
A mixed-effects model for longitudinal data, conditional on the random effects $b_i$, is
\[
E(y_{it} \mid x_{it}, b_i) = \mu(x_{it}'\beta + z_{it}' b_i), \qquad i = 1, \ldots, N, \quad t = 1, \ldots, n_i
\]
- $y_{it}$ is the response variable
- $x_{it}$ are the covariates
- $z_{it}$ are the covariates for the random effects
- $\beta$ are the fixed-effect parameters
- $b = (b_1, \ldots, b_N)$ are the random-effects parameters; there is one $b_i$ per subject, so the dimension of $b$ grows with the sample size

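To make the conditional mean concrete, here is a small sketch that evaluates $\mu(x_{it}'\beta + z_{it}'b_i)$ for every subject and time point. It assumes a balanced design ($n_i = t$ for all $i$) and, for the usage example, a logistic inverse link with a random intercept drawn from a centred exponential distribution; these choices are illustrative and are not taken from the slides.

```python
import numpy as np

def conditional_mean(X, Z, beta, b, inv_link):
    """mu_it = inv_link(x_it' beta + z_it' b_i) for every subject i and time t.

    X: (N, t, p) fixed-effect covariates, Z: (N, t, q) random-effect covariates,
    beta: (p,), b: (N, q) with one random-effect vector per subject.
    """
    eta = X @ beta + np.einsum("itq,iq->it", Z, b)
    return inv_link(eta)

def expit(eta):
    """Inverse logit link."""
    return 1.0 / (1.0 + np.exp(-eta))

rng = np.random.default_rng(2)
N, t = 5, 3
X = rng.normal(size=(N, t, 2))
Z = np.ones((N, t, 1))                        # random intercept only
b = rng.exponential(size=(N, 1)) - 1.0        # centred and deliberately non-normal
mu = conditional_mean(X, Z, np.array([0.5, -0.5]), b, expit)
```
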
Penalized Conditional Quasilikelihood
- The conditional quasi-likelihood of $y$ given the random effects $b$ is
\[
l_q = -\frac{1}{2\phi} \sum_{i=1}^{N} d_i(y_i, \mu_i^b), \qquad d_i(y, u) = -2 \int_y^{u} \frac{y - u}{a_i v(u)}\, du
\]
- A constraint is required to ensure identifiability: $P_A b = 0$
  - $P_A$ is the projection matrix on the null space of $(I - P_X) Z$
- Penalized conditional quasilikelihood (Jiang, 1999):
\[
l_q = -\frac{1}{2\phi} \sum_{i=1}^{N} d_i(y_i, \mu_i^b) - \frac{1}{2} \lambda |P_A b|^2
\]
- The penalty $\lambda$ is fixed, and is chosen as 1 in Jiang (1999)
- Jiang's approach does not converge

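As a worked instance of the quasi-deviance $d_i$, assume a Poisson-type variance function $v(u) = u$ and $a_i = 1$ (both are illustrative assumptions, not stated on the slide), and write $s$ for the integration variable and $\mu$ for the upper limit. Then
\[
d_i(y, \mu) = -2 \int_y^{\mu} \frac{y - s}{s}\, ds
= -2 \Big[ y \log s - s \Big]_{s = y}^{s = \mu}
= 2 \Big\{ y \log \frac{y}{\mu} - (y - \mu) \Big\},
\]
which is the familiar Poisson deviance contribution. Under these assumptions, maximizing the unpenalized $l_q$ is the same as minimizing the summed Poisson deviance.
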
Conditional Extended Score Corresponding to β and b
- Take the derivatives of the penalized conditional quasilikelihood $l_q$ with respect to $\beta$ and $b$
- The quasi-score equation corresponding to the fixed effect $\beta$ is
\[
\sum_{i=1}^{N} \left(\frac{\partial \mu_i^b}{\partial \beta}\right)' (W_i^b)^{-1} (y_i - \mu_i^b) = 0
\]
- The quasi-score equations corresponding to the random effects $b$ are
\[
\begin{aligned}
h_1 &= \left(\frac{\partial \mu_1^b}{\partial b_1}\right)' (W_1^b)^{-1} (y_1 - \mu_1^b) - \lambda \frac{\partial P_A b}{\partial b_1} P_A b = 0 \\
&\;\;\vdots \\
h_N &= \left(\frac{\partial \mu_N^b}{\partial b_N}\right)' (W_N^b)^{-1} (y_N - \mu_N^b) - \lambda \frac{\partial P_A b}{\partial b_N} P_A b = 0
\end{aligned}
\]

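Extended Score for β
Construct extended scores associated with the fixed effect $\beta$:
\[
\bar G_N^f = \frac{1}{N} \sum_{i=1}^{N} g_i^f(\beta) = \frac{1}{N}
\begin{pmatrix}
\sum_{i=1}^{N} \left(\frac{\partial \mu_i^b}{\partial \beta}\right)' A_i^{-1/2} M_1 A_i^{-1/2} (y_i - \mu_i^b) \\
\vdots \\
\sum_{i=1}^{N} \left(\frac{\partial \mu_i^b}{\partial \beta}\right)' A_i^{-1/2} M_m A_i^{-1/2} (y_i - \mu_i^b)
\end{pmatrix}
\]
Conditional on $b$,
\[
\hat\beta = \arg\min_\beta\, (\bar G_N^f)' (\bar C_N^f)^{-1} (\bar G_N^f), \qquad \bar C_N^f = \frac{1}{N} \sum_{i=1}^{N} g_i^f(\beta)\, g_i^f(\beta)'
\]
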
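A sketch of one subject's contribution $g_i^f$ to the extended score, assuming a logistic link with a scalar random intercept $b_i$ and the exchangeable basis. These modelling choices, and the function name `extended_score_beta`, are assumptions made for illustration. For the logistic link, $\partial\mu_i^b/\partial\beta = A_i X_i$ with $A_i = \mathrm{diag}\{\mu_{it}^b(1 - \mu_{it}^b)\}$.

```python
import numpy as np

def extended_score_beta(X_i, y_i, b_i, beta, basis):
    """g_i^f for one subject under a logistic link with a random intercept b_i."""
    mu = 1.0 / (1.0 + np.exp(-(X_i @ beta + b_i)))   # conditional mean mu_i^b
    var = mu * (1.0 - mu)                            # diagonal of A_i
    D = var[:, None] * X_i                           # d mu_i^b / d beta, shape (t, p)
    a_inv_half = np.diag(var ** -0.5)                # A_i^{-1/2}
    resid = y_i - mu
    # one block D' A^{-1/2} M_j A^{-1/2} (y_i - mu_i^b) per basis matrix
    return np.concatenate(
        [D.T @ a_inv_half @ M @ a_inv_half @ resid for M in basis]
    )

rng = np.random.default_rng(3)
t, p = 4, 2
X_i = rng.normal(size=(t, p))
y_i = rng.integers(0, 2, size=t).astype(float)
beta = np.array([0.3, -0.2])
basis = [np.eye(t), np.ones((t, t)) - np.eye(t)]
g_f = extended_score_beta(X_i, y_i, 0.4, beta, basis)
```

With $M_1 = I$ the first block collapses to $X_i'(y_i - \mu_i^b)$, the usual logistic score given $b_i$, which gives a simple way to check the implementation.
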
Extended Score Corresponding to b
- For the $i$th subject, the quasi-score associated with the random effect is
\[
h_i = \left(\frac{\partial \mu_i^b}{\partial b_i}\right)' (W_i^b)^{-1} (y_i - \mu_i^b) - \lambda \frac{\partial P_A b}{\partial b_i} P_A b = 0
\]
- Substitute $W_i = A_i^{1/2} R A_i^{1/2}$ and assume an independent structure for $R$
- The extended score for the random effect $b$ for subject $i$ is
\[
g_i^r = \begin{pmatrix}
\left(\frac{\partial \mu_i^b}{\partial b_i}\right)' A_i^{-1} (y_i - \mu_i^b) \\
\lambda \frac{\partial P_A b}{\partial b_i} P_A b
\end{pmatrix}
\]
- In a simple random intercept model, $\frac{\partial P_A b}{\partial b_i} P_A b = \sum_{i=1}^{N} b_i / N$
- Jiang (1999) only considers the constraint $P_A b = 0$ for the random effects
- This constraint is not sufficient to ensure that the algorithm converges

Extended Score for b
- The convergence problem becomes more serious when the model involves high-dimensional random effects
- We include an additional penalty term $\lambda b_i$, which also controls the variance of the random-effects estimators, to ensure that the algorithm converges
- The new extended scores for $b$ are
\[
g^r = \big( (g_1^r)', \lambda b_1', \ldots, (g_N^r)', \lambda b_N' \big)'
\]
- For given fixed effects $\beta$,
\[
\hat b = \arg\min_b\, (g^r)'(g^r)
\]
- There is no replicate for each $g_i^r$, so there is no weighting matrix in the estimation

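For intuition, the minimization of $(g^r)'(g^r)$ has a closed least-squares form in the simplest case. The sketch below assumes a linear random-intercept model with identity link and unit variance ($\mu_{it}^b = x_{it}'\beta + b_i$, $A_i = I$), a balanced design, and the random-intercept simplification in which the constraint part of each $g_i^r$ is $\lambda \sum_i b_i / N$. The function name `estimate_random_intercepts` and its inputs are hypothetical, not the authors' implementation.

```python
import numpy as np

def estimate_random_intercepts(resid_sum, n_obs, lam):
    """Minimise (g^r)'(g^r) over b for a linear random-intercept model.

    resid_sum[i] = sum_t (y_it - x_it' beta) for the current beta,
    n_obs = observations per subject, lam = penalty lambda.
    """
    N = len(resid_sum)
    rows = [
        n_obs * np.eye(N),                # score part: resid_sum_i - n_obs * b_i
        lam * np.full((N, N), 1.0 / N),   # constraint part: lam * mean(b), once per subject
        lam * np.eye(N),                  # additional penalty part: lam * b_i
    ]
    target = np.concatenate([resid_sum, np.zeros(2 * N)])
    b_hat, *_ = np.linalg.lstsq(np.vstack(rows), target, rcond=None)
    return b_hat

rng = np.random.default_rng(4)
N, n_obs = 6, 5
resid_sum = rng.normal(size=N) * n_obs
b_unpenalised = estimate_random_intercepts(resid_sum, n_obs, lam=0.0)
b_penalised = estimate_random_intercepts(resid_sum, n_obs, lam=1.0)
```

With `lam=0.0` each estimate is just the subject's mean residual; a positive `lam` shrinks the estimates toward zero, which is the variance-controlling effect the penalty is meant to have.
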
Regularity Conditions
- The parameter space $S$ is compact
- There is a unique $\beta_0 \in S$ which satisfies $E[g(\beta_0 \mid b_0)] = 0$
- The derivative of the score function satisfies $\dot g_{i,b}(\hat\beta \mid b_0) = O_p(1)$
- Expectation of the continuous score $E[g(\beta \mid b)]$ is continuous and differentiable in both $\beta$ and $b$
- The weighting matrix $C_N(\beta \mid b) \to_{a.s.} C_0(\beta \mid b)$ and $A_N(\beta \mid b) \to_{a.s.} A_0(\beta \mid b)$, where $C_0^{-1}(\beta \mid b) = A_0(\beta \mid b) A_0(\beta \mid b)'$
- The estimating functions conditional on the estimated random effects converge to 0 in probability:
\[
E[E\{g_i(\beta_0 \mid \hat b)\}] \to_p 0 \quad \text{as } N \to \infty
\]
  This condition is much weaker than consistency of the random-effects estimator

Asymptotic Properties
Theorem 1: Under some regularity conditions, the QIF estimator for the fixed effects $\hat\beta_1$ has the following properties as $N \to \infty$:
I. (Consistency) $\hat\beta_1 \to_p \beta_0$.
II. (Asymptotic Normality) $\sqrt{N}(\hat\beta_1 - \beta_0) \to_d N(0, \Omega_1)$.
Difficulties:
- No normality assumption for the random effects
- $\hat b$ is not required to be a consistent estimator of the true $b$
III. If $\hat b$ is a consistent estimator of $b_0$, then
\[
\Omega_1 = \lim_{n, N \to \infty} \ddot Q^{-1}_{\beta\beta}(\hat\beta_1 \mid \hat b) = \Omega_0,
\]
where
\[
\ddot Q^{-1}_{\beta\beta}(\hat\beta_1 \mid \hat b) \approx \big\{ \dot G_{N,\beta}'(\hat\beta_1 \mid \hat b)\, C_N^{-1}(\hat b)\, \dot G_{N,\beta}(\hat\beta_1 \mid \hat b) \big\}^{-1}
\]

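The approximation in part III can be turned into a small computation once the derivative matrix $\dot G_{N,\beta}$ and the weighting matrix $C_N$ are available. The sketch below reads the product as $\dot G' C^{-1} \dot G$ so that the dimensions conform (an assumption about the intended notation), and converts the result into standard errors using the $\sqrt{N}$ scaling of part II. The function names and the numeric inputs are illustrative only.

```python
import numpy as np

def qif_covariance(G_dot, C):
    """Return {G_dot' C^{-1} G_dot}^{-1} for a (k, p) matrix G_dot and a (k, k) matrix C."""
    return np.linalg.inv(G_dot.T @ np.linalg.solve(C, G_dot))

def standard_errors(G_dot, C, N):
    """Standard errors implied by sqrt(N) (beta_hat - beta_0) -> N(0, Omega)."""
    return np.sqrt(np.diag(qif_covariance(G_dot, C)) / N)

# illustrative values only, not estimates from any data set
G_dot = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
C = 2.0 * np.eye(3)
omega = qif_covariance(G_dot, C)
se = standard_errors(G_dot, C, N=100)
```
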