Multiple imputation methods for incomplete longitudinal ordinal data: a simulation study
Anne-Francoise DONNEAU
Medical Informatics and Biostatistics, School of Public Health, University of Liège
Promotor: Pr. A. Albert
14 September 2012
AFr. Donneau (ULg) Multiple imputation methods for incomplete longitudinal ordinal data: a simulation study 1 / 21
Outline of the presentation
◮ Introduction
◮ Methods for (incomplete) non-Gaussian longitudinal data
   Generalized Estimating Equations (GEE)
   Multiple imputation based GEE (MI-GEE)
◮ Simulation plan
◮ Results
◮ Conclusions
Introduction: Ordinal longitudinal data
Analysis of ordinal longitudinal data
Units: Subjects, objects ($i = 1, \dots, N$)
Outcome: Ordinal variable $Y$ with $K$ levels
Measurement: Repeated at $T$ time points, $Y_i = (Y_{i1}, \dots, Y_{iT})'$
Covariates: $T \times p$ covariate matrix $X_i = (x_{i1}, \dots, x_{iT})'$ (time, gender, age, ...)
Methods: Methods for non-Gaussian longitudinal data, here generalized estimating equations (GEE)
Problem: Missing data
Missing data: Missingness
Missing data patterns:
◮ Dropout / attrition
◮ Non-monotone missingness
Missing data mechanism (Little and Rubin, 1987):
MCAR - Missing completely at random
◮ missingness is independent of both observed and unobserved measurements
MAR - Missing at random
◮ conditional on the observed measurements, missingness is independent of the unobserved measurements
MNAR - Missing not at random
◮ missingness depends on the unobserved and possibly also on the observed measurements
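To make the difference between the MCAR and MAR mechanisms concrete, here is a minimal sketch that imposes monotone dropout on a complete N x T response matrix. The function name `simulate_dropout`, the constant dropout probability and the logistic dropout coefficients are all hypothetical choices made for illustration; they are not the settings of the simulation study.

```python
import numpy as np

def simulate_dropout(y, mechanism, rng, p_mcar=0.2):
    """Return a float copy of y (N x T) with monotone dropout coded as NaN.

    mechanism="MCAR": constant dropout probability at each visit.
    mechanism="MAR":  dropout probability at visit t depends on the
                      response at visit t-1, which is still observed
                      for every subject who has not yet dropped out.
    """
    y = np.asarray(y, dtype=float)
    n, t_max = y.shape
    out = y.copy()
    dropped = np.zeros(n, dtype=bool)
    for t in range(1, t_max):  # the first visit is always observed
        if mechanism == "MCAR":
            p = np.full(n, p_mcar)
        elif mechanism == "MAR":
            # hypothetical logistic dropout model (illustrative coefficients)
            p = 1.0 / (1.0 + np.exp(-(-2.0 + 0.5 * y[:, t - 1])))
        else:
            raise ValueError("mechanism must be 'MCAR' or 'MAR'")
        dropped |= rng.random(n) < p   # once dropped, a subject stays dropped
        out[dropped, t] = np.nan
    return out

rng = np.random.default_rng(2012)
y_complete = rng.integers(1, 5, size=(500, 4))   # ordinal levels 1..4
y_mar = simulate_dropout(y_complete, "MAR", rng)
```

An MNAR mechanism could be sketched the same way by letting the dropout probability at visit t depend on `y[:, t]` itself, the value that becomes unobserved.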
Analysis - GEE
Methods for non-Gaussian longitudinal data: GEE
◮ GEE: extension of generalized linear models to longitudinal data
◮ Ordinal data (proportional odds model): the response must first be recoded
◮ Define a $(K-1)$ expanded vector of binary responses $Y^*_{ij} = (Y^*_{ij1}, \dots, Y^*_{ij,(K-1)})'$, where $Y^*_{ijk} = 1$ if $Y_{ij} \le k$ and 0 otherwise
◮ $\text{logit}[\Pr(Y_{ij} \le k)] = \text{logit}[\Pr(Y^*_{ijk} = 1)] = \beta_{0k} + x_{ij}'\beta$
Estimating equations:
$$\sum_{i=1}^{N} \frac{\partial \pi_i'}{\partial \beta} W_i^{-1} (Y_i^* - \pi_i) = 0$$
where $Y_i^* = (Y^*_{i1}, \dots, Y^*_{iT})'$, $\pi_i = E(Y_i^*)$ and $W_i = V_i^{1/2} R_i V_i^{1/2}$, with $V_i$ the diagonal matrix of the variances of the elements of $Y_i^*$. The matrix $R_i$ is the 'working' correlation matrix that expresses the dependence among the repeated observations on subject $i$.
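The recoding step can be illustrated with a short sketch. It uses the cumulative coding $Y^*_{ijk} = 1$ when $Y_{ij} \le k$, which is the coding under which $\text{logit}[\Pr(Y^*_{ijk} = 1)]$ equals $\text{logit}[\Pr(Y_{ij} \le k)]$. The function names and the parameter values in the usage lines are hypothetical and chosen only for illustration.

```python
import numpy as np

def expand_ordinal(y, K):
    """Expand ordinal responses in 1..K into K-1 cumulative indicators.

    The last axis of the result holds (Y*_1, ..., Y*_{K-1}),
    with Y*_k = 1 if y <= k and 0 otherwise.
    """
    y = np.asarray(y)
    levels = np.arange(1, K)                 # k = 1, ..., K-1
    return (y[..., None] <= levels).astype(int)

def cumulative_probs(x, beta0, beta):
    """Pr(Y <= k | x) for k = 1..K-1 under the proportional odds model."""
    eta = np.asarray(beta0, dtype=float) + np.dot(x, beta)
    return 1.0 / (1.0 + np.exp(-eta))        # inverse logit

def category_probs(x, beta0, beta):
    """Pr(Y = k | x) for k = 1..K, by differencing the cumulative probabilities."""
    cum = np.concatenate([[0.0], cumulative_probs(x, beta0, beta), [1.0]])
    return np.diff(cum)

# Illustrative values only: K = 4 levels, two covariates (e.g. time, gender)
y_star = expand_ordinal([1, 3, 4], K=4)
probs = category_probs(x=np.array([1.0, 0.0]),
                       beta0=[-1.0, 0.0, 1.5],   # intercepts must be increasing
                       beta=np.array([0.3, -0.2]))
```

The intercepts $\beta_{0k}$ must be increasing in $k$ for the differenced category probabilities to be positive; the slope vector $\beta$ is shared across all $K-1$ cut points, which is the proportional odds assumption.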
Analysis - GEE
Methods for non-Gaussian longitudinal data: GEE - large sample properties
$$\sqrt{N}(\hat\beta - \beta) \to N(0, I_0^{-1} I_1 I_0^{-1})$$
◮ $\hat\beta$ is consistent even if the working correlation matrix is misspecified
◮ incorrect specification of the correlation structure affects the efficiency of $\hat\beta$
◮ valid only under MCAR
◮ What if not MCAR?
◮ Solution: use multiple imputation (MI) as a preliminary step
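The slide does not define $I_0$ and $I_1$. Assuming it follows the usual GEE sandwich form (an assumption, not stated in the source), and writing $D_i = \partial \pi_i / \partial \beta'$, the two matrices would be:

```latex
% Assumed standard sandwich components; D_i = \partial \pi_i / \partial \beta'
I_0 = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} D_i' W_i^{-1} D_i ,
\qquad
I_1 = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N}
      D_i' W_i^{-1} \operatorname{Var}(Y_i^*) W_i^{-1} D_i .
```

Under this reading, when the working covariance is correct, $W_i = \operatorname{Var}(Y_i^*)$, so $I_1 = I_0$ and the asymptotic covariance reduces to $I_0^{-1}$. This is why a misspecified $R_i$ costs efficiency but not consistency.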
Multiple imputation
Idea
Replace each missing value by a set of $M > 1$ plausible values drawn from the conditional distribution of the unobserved values given the observed ones.
How
1. Imputation stage - $Y_{ij}^{\text{missing}} \Rightarrow Y_{ij}^{1}, \dots, Y_{ij}^{M}$
2. Analysis stage - Analyze the $M$ completed datasets using GEE, giving $(\hat\beta_m, \widehat{\text{var}}(\hat\beta_m))$, $m = 1, \dots, M$
3. Pooling stage - Combine the $M$ results:
$$\hat\beta^* = \frac{1}{M} \sum_{m=1}^{M} \hat\beta_m, \qquad T = W + \left(1 + \frac{1}{M}\right) B$$
where $W = \frac{1}{M} \sum_{m=1}^{M} \widehat{\text{var}}(\hat\beta_m)$ and $B = \frac{1}{M-1} \sum_{m=1}^{M} (\hat\beta_m - \hat\beta^*)(\hat\beta_m - \hat\beta^*)'$
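The pooling stage is simple enough to sketch directly. The function below combines M estimates and their covariance matrices into the pooled estimate and the total variance $T = W + (1 + 1/M) B$. The name `pool_rubin` and the numbers in the usage lines are hypothetical; in practice the inputs would come from the M GEE fits of the analysis stage.

```python
import numpy as np

def pool_rubin(betas, covs):
    """Pool M completed-data analyses.

    betas: array of shape (M, p), the estimates beta_hat_m
    covs:  array of shape (M, p, p), the estimated covariances var_hat(beta_hat_m)
    Returns (beta_star, T): pooled estimate and total covariance matrix.
    """
    betas = np.asarray(betas, dtype=float)
    covs = np.asarray(covs, dtype=float)
    M = betas.shape[0]
    beta_star = betas.mean(axis=0)       # pooled point estimate
    W = covs.mean(axis=0)                # within-imputation variance
    dev = betas - beta_star
    B = dev.T @ dev / (M - 1)            # between-imputation variance
    T = W + (1.0 + 1.0 / M) * B          # total variance
    return beta_star, T

# Illustrative numbers only: M = 3 imputations, p = 1 parameter
beta_star, T = pool_rubin(betas=[[1.0], [2.0], [3.0]],
                          covs=[[[0.5]], [[0.5]], [[0.5]]])
```

With these inputs, $W = 0.5$ and $B = 1$, so $T = 0.5 + \tfrac{4}{3}$. The factor $1 + 1/M$ inflates the between-imputation variance to account for using a finite number of imputations.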