
Longitudinal Data Analysis with Mixed Models: A Graphical Overview (PowerPoint presentation)

Longitudinal Data Analysis with Mixed Models: A Graphical Overview
Georges Monette, Ph.D., P.Stat. (georges@yorku.ca, www.math.yorku.ca/~georges)
Workshop on Longitudinal Data Analysis for Business and Biostatistics, May 10, 2007


  1. SAS PROC GLM output for the fixed-effects model:

     Source        DF   Type I SS     Mean Square   F Value   Pr > F
     Subject       26   518.3796296    19.9376781     11.62   <.0001
     age            1   235.3560185   235.3560185    137.14   <.0001
     age*Subject   26    71.2814815     2.7415954      1.60   0.0735

     Source        DF   Type III SS   Mean Square   F Value   Pr > F
     Subject       26    66.9693122     2.5757428      1.50   0.1040
     age            1   235.3560185   235.3560185    137.14   <.0001
     age*Subject   26    71.2814815     2.7415954      1.60   0.0735

     Parameter          Estimate          Standard Error   t Value   Pr > |t|
     Intercept          16.95000000 B     3.28817325          5.15   <.0001
     Subject F01         0.30000000 B     4.65017921          0.06   0.9488
     Subject F02        -2.75000000 B     4.65017921         -0.59   0.5567
     Subject F03        -2.55000000 B     4.65017921         -0.55   0.5857
     Subject F04         2.70000000 B     4.65017921          0.58   0.5639
     Subject F05         2.65000000 B     4.65017921          0.57   0.5711
     Subject F06         0.05000000 B     4.65017921          0.01   0.9915
     Subject F07         0.00000000 B     4.65017921          0.00   1.0000
     Subject F08         .                .                   .      .
     age*Subject M11    -0.22500000 B     0.41427089         -0.54   0.5893
     age*Subject M12     0.45000000 B     0.41427089          1.09   0.2822
     age*Subject M13     1.40000000 B     0.41427089          3.38   0.0014
     age*Subject M14    -0.02500000 B     0.41427089         -0.06   0.9521
     age*Subject M15     0.57500000 B     0.41427089          1.39   0.1708
     age*Subject M16     0.00000000 B     .                   .      .

  2. Estimated variance matrix for each subject:

     $\hat\sigma^2 I_4 = \mathrm{diag}(1.31^2,\; 1.31^2,\; 1.31^2,\; 1.31^2)$

     Problems:
     - No autocorrelation in time.
     - No estimate of a Sex effect.
     - A Sex effect can be constructed, but the CI is for the difference in this
       sample, not for the difference in the population.

     [Trellis plot: distance vs. age (8-14) for each subject, F01-F11 and M01-M16.]

  3. Fitted lines in data space.
     [Figure: distance vs. age (8-14), Male and Female panels.]

  4. Fitted lines in beta space.
     [Figure: intercept $\hat\psi_0$ vs. slope $\hat\psi_{age}$, Male and Female panels.]

  5. Each within-subject least squares estimate
     $\hat\psi_i = \begin{bmatrix}\hat\psi_{0i}\\ \hat\psi_{1i}\end{bmatrix}$
     has variance $\sigma^2 (X_i'X_i)^{-1}$, which is used to construct a
     confidence ellipse for the 'fixed effect'
     $\psi_i = \begin{bmatrix}\psi_{0i}\\ \psi_{1i}\end{bmatrix}$
     for the ith subject. Each CI uses only the information from that subject
     (except for the estimate of $\sigma^2$).
     [Figure: confidence ellipses in beta space, Male and Female panels.]

  6. The dispersion of the $\hat\psi_i$s, and the information they provide on the
     dispersion of the $\psi_i$s, is not used in the model. The standard error of
     the estimate of each average Sex line uses the sample distribution of the
     $\varepsilon_{it}$s but not the variability in the $\psi_i$s.
     [Figure: beta space, Male and Female panels.]

  7. Other approaches
     - Repeated measures (univariate and multivariate): needs the same times for
       each subject and allows no other time-varying variables.
     - Two-stage approach: use the $\hat\psi_i$s in a second-level analysis. If
       the design is not balanced, the $\hat\psi_i$s have different variances and
       would need different weights. Using $\sigma^2(X_i'X_i)^{-1}$ does not work
       because the relevant weight is based on the marginal variance, not the
       conditional variance given the ith subject.

  8. Multilevel Models

     Start with the within-subject model (same as the fixed-effects model above):

         $y_{it} = \psi_{0i} + \psi_{1i} X_{it} + \varepsilon_{it}, \quad
          \varepsilon_{it} \sim N(0, \sigma^2), \quad
          i = 1, \dots, N, \;\; t = 1, \dots, T_i$

     $\psi_{0i}$ is the 'true' intercept and $\psi_{1i}$ is the 'true' slope with
     respect to $X$; $\sigma^2$ is the within-subject residual variance.
     $X$ (age in our example) is a time-varying variable. We could have more than one.

     Then add the between-subject model (the new part): $\psi_{0i}$ and
     $\psi_{1i}$ vary randomly from subject to subject. But the distribution
     might be different for different Sexes (a 'between-subject' or
     'time-invariant' variable). So we assume a multivariate distribution:

  9.  $\psi_i = \begin{bmatrix}\psi_{0i}\\ \psi_{1i}\end{bmatrix}
       = \begin{bmatrix}\beta_{00} + \beta_{01}W_i + \gamma_{0i}\\
                        \beta_{10} + \beta_{11}W_i + \gamma_{1i}\end{bmatrix}
       = \begin{bmatrix}\beta_{00} & \beta_{01}\\ \beta_{10} & \beta_{11}\end{bmatrix}
         \begin{bmatrix}1\\ W_i\end{bmatrix}
       + \begin{bmatrix}\gamma_{0i}\\ \gamma_{1i}\end{bmatrix},
       \quad i = 1, \dots, N$

      $\gamma_i = \begin{bmatrix}\gamma_{0i}\\ \gamma_{1i}\end{bmatrix}
       \sim N\!\left(\begin{bmatrix}0\\ 0\end{bmatrix},\;
       G = \begin{bmatrix}g_{00} & g_{01}\\ g_{10} & g_{11}\end{bmatrix}\right)$

      where $W_i$ is a coding variable for Sex, e.g. 0 for Males and 1 for Females. Then

      $E\begin{pmatrix}\psi_{0i}\\ \psi_{1i}\end{pmatrix}
       = \begin{bmatrix}\beta_{00}\\ \beta_{10}\end{bmatrix}$ among Males, and
      $= \begin{bmatrix}\beta_{00}+\beta_{01}\\ \beta_{10}+\beta_{11}\end{bmatrix}$ among Females.

      Some software packages use the multilevel formulation, e.g. MLwiN. SAS and R
      use the 'mixed model' formulation. It is very useful to know how to go from
      one formulation to the other.

  10. From Multilevel Model to Mixed Model

      Combine the two levels of the multilevel model by substituting the
      between-subject model into the within-subject model. Then gather together
      the fixed terms and the random terms:

      $y_{it} = \psi_{0i} + \psi_{1i}X_{it} + \varepsilon_{it}$
      $\quad = (\beta_{00} + \beta_{01}W_i + \gamma_{0i})
             + (\beta_{10} + \beta_{11}W_i + \gamma_{1i})X_{it} + \varepsilon_{it}$
      $\quad = (\beta_{00} + \beta_{01}W_i + \beta_{10}X_{it} + \beta_{11}W_iX_{it})
             + (\gamma_{0i} + \gamma_{1i}X_{it} + \varepsilon_{it})$
      $\quad = \beta_{00} + \beta_{01}W_i + \beta_{10}X_{it} + \beta_{11}W_iX_{it}$ (fixed part of the model)
      $\qquad + \gamma_{0i} + \gamma_{1i}X_{it} + \varepsilon_{it}$ (random part of the model)

      Anatomy of the fixed part:
      - $\beta_{00}$: intercept
      - $\beta_{01}W_i$: between-subject, time-invariant variable
      - $\beta_{10}X_{it}$: within-subject, time-varying variable
      - $\beta_{11}W_iX_{it}$: cross-level interaction

      Interpretation of the fixed part: the parameters reflect population
      average values.
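The four columns of the combined fixed-effects design can be sketched numerically; a minimal NumPy illustration with hypothetical values of W (Sex) and X (age) for one subject:

```python
import numpy as np

# Fixed-effects design implied by the combined model: columns for the
# intercept, W (time-invariant), X (time-varying) and the cross-level
# interaction W*X. The values below are illustrative, not from the data.
W = 1.0                                  # e.g. Sex coded 1 = Female
X = np.array([8.0, 10.0, 12.0, 14.0])    # occasions for one subject

Xmat = np.column_stack([np.ones_like(X),       # beta_00
                        np.full_like(X, W),    # beta_01 * W
                        X,                     # beta_10 * X
                        W * X])                # beta_11 * W*X
```

The interaction column is, by construction, the elementwise product of the W and X columns.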

  11. Anatomy of the random part. For one occasion:

      $\delta_{it} = \gamma_{0i} + \gamma_{1i}X_{it} + \varepsilon_{it}$

      Putting the observations of one subject together:

      $\begin{bmatrix}\delta_{i1}\\ \delta_{i2}\\ \delta_{i3}\\ \delta_{i4}\end{bmatrix}
       = \begin{bmatrix}1 & X_{i1}\\ 1 & X_{i2}\\ 1 & X_{i3}\\ 1 & X_{i4}\end{bmatrix}
         \begin{bmatrix}\gamma_{0i}\\ \gamma_{1i}\end{bmatrix}
       + \begin{bmatrix}\varepsilon_{i1}\\ \varepsilon_{i2}\\ \varepsilon_{i3}\\ \varepsilon_{i4}\end{bmatrix}$

      i.e. $\delta_i = Z_i\gamma_i + \varepsilon_i$.

      Note: the random-effects design $Z_i$ uses only time-varying variables.

      Distribution assumption: $\gamma_i \sim N(0, G)$ independent of
      $\varepsilon_i \sim N(0, R_i)$ where, so far, $R_i = \sigma^2 I$.

  12. Notes:
      - $G$ (usually) does not vary with i. It is usually a free positive-definite
        matrix, or it may be a structured positive-definite matrix. More on $G$ later.
      - $R_i$ (usually) does change with i, as it must if $T_i$ is not constant.
        $R_i$ is expressed as a function of parameters. The simplest example is
        $R_i = \sigma^2 I_{n_i \times n_i}$. Later we will use $R_i$ to include
        autoregressive parameters for longitudinal modeling.
      - We can't estimate $G$ and $R_i$ directly. We estimate them through:
        $V_i = \mathrm{Var}(\delta_i) = Z_iGZ_i' + R_i$
      - Some things can be parametrized either on the G-side or on the R-side. If
        they're done in both, you lose identifiability. Ill-conditioning due to
        "collinearity" between the G-side and R-side models is a common problem.
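The relation $V_i = Z_iGZ_i' + R_i$ can be computed directly; a minimal NumPy sketch, with hypothetical values for $G$, $\sigma^2$ and the occasions (none taken from the fitted model):

```python
import numpy as np

# Hypothetical values: G (2x2 random-effects covariance), sigma^2,
# and a subject observed at ages 8, 10, 12, 14.
G = np.array([[11.0, -0.8],
              [-0.8,  0.09]])
sigma2 = 1.3
ages = np.array([8.0, 10.0, 12.0, 14.0])

# Random-effects design: a column of ones (intercept) and the ages.
Z = np.column_stack([np.ones_like(ages), ages])
R = sigma2 * np.eye(len(ages))        # simplest case: R_i = sigma^2 I

# Marginal covariance of one subject's observations.
V = Z @ G @ Z.T + R
```

With $R_i$ diagonal, all serial correlation in $V_i$ comes from the G-side; the AR(1) structure introduced later moves some of it to the R-side.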

  13. Mixed Model in SAS:

      PROC SORT DATA = ortho;
        BY subject age;
      PROC MIXED;
        CLASS subject sex;
        MODEL distance = age sex sex*age;
        RANDOM INTERCEPT age / TYPE = FA0(2) SUB = subject;
        REPEATED / TYPE = AR(1) SUB = subject;

      - SORT: the data set must be sorted on the SUBJECT variable; otherwise
        PROC MIXED will silently report nonsense. Many longitudinal analyses
        also require sorting on the time variable.
      - MODEL statement:
        - specifies the fixed model
        - includes the INTERCEPT by default
        - contains time-varying, time-invariant and cross-level variables together

  14. - RANDOM statement:
        - specifies the variables in the random model
        - TYPE: specifies the G matrix. Most people use TYPE = UN (for
          'unstructured'), but then G is not constrained to be non-negative
          definite. FA0(q), where q is the size of the matrix, generates a free
          non-negative definite G using the Cholesky factor for parametrization.
        - SUB: name of the grouping variable
      - REPEATED statement:
        - specifies the model for the $R_i$ matrices
        - omitted to get the default: $R_i = \sigma^2 I_{n_i \times n_i}$
        - here we illustrate the use of an AR(1) structure, producing for example

          $R_i = \sigma^2 \begin{bmatrix}
            1 & \rho & \rho^2 & \rho^3\\
            \rho & 1 & \rho & \rho^2\\
            \rho^2 & \rho & 1 & \rho\\
            \rho^3 & \rho^2 & \rho & 1\end{bmatrix}$

          in a cluster with 4 occasions.
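The AR(1) structure above is just $\sigma^2\rho^{|j-k|}$; a minimal NumPy sketch with illustrative values of $\sigma^2$ and $\rho$ (loosely mimicking the fitted values that appear later, not taken from them):

```python
import numpy as np

# AR(1) within-subject covariance: R[j, k] = sigma^2 * rho^|j-k|,
# for a cluster with n equally spaced occasions (illustrative values).
sigma2, rho, n = 1.19, -0.47, 4
idx = np.arange(n)                                # occasion indices 0..n-1
R = sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])
```

Note that a negative $\rho$, as estimated later in the deck, produces alternating signs along each row.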

  15. Mixed Model in Matrices

      In the ith cluster:

      $y_{it} = \beta_{00} + \beta_{01}W_i + \beta_{10}X_{it} + \beta_{11}W_iX_{it}
              + \gamma_{0i} + \gamma_{1i}X_{it} + \varepsilon_{it}$

      $\begin{bmatrix}y_{i1}\\ y_{i2}\\ y_{i3}\\ y_{i4}\end{bmatrix}
       = \begin{bmatrix}1 & W_i & X_{i1} & W_iX_{i1}\\
                        1 & W_i & X_{i2} & W_iX_{i2}\\
                        1 & W_i & X_{i3} & W_iX_{i3}\\
                        1 & W_i & X_{i4} & W_iX_{i4}\end{bmatrix}
         \begin{bmatrix}\beta_{00}\\ \beta_{01}\\ \beta_{10}\\ \beta_{11}\end{bmatrix}
       + \begin{bmatrix}1 & X_{i1}\\ 1 & X_{i2}\\ 1 & X_{i3}\\ 1 & X_{i4}\end{bmatrix}
         \begin{bmatrix}\gamma_{0i}\\ \gamma_{1i}\end{bmatrix}
       + \begin{bmatrix}\varepsilon_{i1}\\ \varepsilon_{i2}\\ \varepsilon_{i3}\\ \varepsilon_{i4}\end{bmatrix}$

      i.e. $y_i = X_i\beta + Z_i\gamma_i + \varepsilon_i$.

      [Could we fit this model in cluster i?]

      where $\gamma_i \sim N(0, G)$, $\varepsilon_i \sim N(0, R_i)$, and
      $\delta_i = Z_i\gamma_i + \varepsilon_i \sim N(0,\; Z_iGZ_i' + R_i)$.

  16. For the whole sample:

      $\begin{bmatrix}y_1\\ \vdots\\ y_N\end{bmatrix}
       = \begin{bmatrix}X_1\\ \vdots\\ X_N\end{bmatrix}\beta
       + \begin{bmatrix}Z_1 & & 0\\ & \ddots & \\ 0 & & Z_N\end{bmatrix}
         \begin{bmatrix}\gamma_1\\ \vdots\\ \gamma_N\end{bmatrix}
       + \begin{bmatrix}\varepsilon_1\\ \vdots\\ \varepsilon_N\end{bmatrix}$

      Finally, making the complex look deceptively simple:

      $y = X\beta + Z\gamma + \varepsilon = X\beta + \delta$

      $\mathrm{Var}(\varepsilon) = R = \begin{bmatrix}R_1 & & 0\\ & \ddots & \\ 0 & & R_N\end{bmatrix},
       \quad \mathrm{Var}(\gamma) = G,
       \quad \delta = Z\gamma + \varepsilon,
       \quad \mathrm{Var}(\delta) = V = ZGZ' + R$
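The block-diagonal stacking of the $Z_i$ can be sketched with plain NumPy (two hypothetical clusters with different numbers of occasions):

```python
import numpy as np

# Stacking clusters: y, X, and epsilon stack vertically, while the
# random-effects design Z becomes block diagonal. Illustrative Z_i blocks
# for a 2-occasion and a 3-occasion subject.
Z1 = np.array([[1.0, 8.0], [1.0, 10.0]])
Z2 = np.array([[1.0, 9.0], [1.0, 12.0], [1.0, 14.0]])

blocks = [Z1, Z2]
n = sum(b.shape[0] for b in blocks)      # total observations
p = sum(b.shape[1] for b in blocks)      # total random-effect columns
Z = np.zeros((n, p))
r = c = 0
for b in blocks:                         # place each Z_i on the diagonal
    Z[r:r + b.shape[0], c:c + b.shape[1]] = b
    r += b.shape[0]
    c += b.shape[1]
```

Off-diagonal blocks are zero because each subject's random effects touch only that subject's rows.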

  17. Fitting the mixed model

      Use Generalized Least Squares on $y \sim N(X\beta,\; ZGZ' + R)$:

      $\hat\beta_{GLS} = (X'\hat V^{-1}X)^{-1} X'\hat V^{-1} y$

      We need $\hat V$ to get $\hat\beta_{GLS}$ and vice versa, so one algorithm
      iterates from one to the other until convergence.

      [COULD SAY SOMETHING ABOUT ML AND REML]

      We used OLS above:

      $\hat\beta_{OLS} = (X'X)^{-1} X'y$

      - How does OLS differ from GLS?
        - Do they differ only in that GLS produces more accurate standard errors?
        - Or can $\hat\beta_{OLS}$ be very different from $\hat\beta_{GLS}$?
        - With balanced data they will be the same. With unbalanced data they can
          be dramatically different.
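One GLS step is a single weighted least squares solve; a minimal NumPy sketch on simulated data (all values illustrative), using $V = I$ so that GLS visibly reduces to OLS:

```python
import numpy as np

# One GLS step, assuming V is known:
#   beta_gls = (X' V^{-1} X)^{-1} X' V^{-1} y
# In a real fit this step alternates with re-estimating V until convergence.
rng = np.random.default_rng(0)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([2.0, 0.5]) + rng.normal(size=n)

V = np.eye(n)                          # with V = I, GLS coincides with OLS
Vinv = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
```

With a non-spherical $V$ and unbalanced data, `beta_gls` and `beta_ols` can differ substantially, which is the point of the bullet list above.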

  18. SAS code and output for the Mixed Model: the ESTIMATE statement

      One last detail: the ESTIMATE statement works the same as in GLM for fixed
      effects. Suppose we want to estimate and test the difference between Sexes
      at Age 14. Set out the values of the SAS X matrix for a Male at Age 14 and
      for a Female at Age 14, then subtract:

      Variable            INT   Age   Sex=F   Sex=M   Age*Sex=F   Age*Sex=M
      Male at Age 14       1    14      0       1         0          14
      Female at Age 14     1    14      1       0        14           0
      Difference (F-M)     0     0      1      -1        14         -14

      SAS code:

      ODS HTML;
      ODS GRAPHICS ON;   /* new for diagnostics in Version 9 */
      PROC SORT DATA = ORTHO;
        BY subject age;
      PROC MIXED ASYCOV ASYCORR;
        CLASS subject sex;

  19.   MODEL distance = age sex sex*age / S CORRB COVB COVBI CL
          INFLUENCE INFLUENCE ( EFFECT = subject );
        RANDOM INTERCEPT age / TYPE = FA0(2) SUB = subject G GC GCORR V VCORR;
        REPEATED / TYPE = AR(1) SUB = subject R RC RCI RCORR;
        ESTIMATE 'gap at 14' sex 1 -1 age*sex 14 -14;
        ESTIMATE 'gap at 11.5' sex 1 -1 age*sex 11.5 -11.5;

      Output:

      The Mixed Procedure

      Model Information
      Data Set                WORK.ORTHO
      Dependent Variable      distance
      Covariance Structures   Factor Analytic, Autoregressive
      Subject Effects         Subject, Subject
      Estimation Method       REML

  20. Residual Variance Method    Profile
      Fixed Effects SE Method     Model-Based
      Degrees of Freedom Method   Containment

      Class Level Information
      Class     Levels   Values
      Subject   27       F01 F02 F03 F04 F05 F06 F07 F08 F09 F10 F11 M01 M02 M03
                         M04 M05 M06 M07 M08 M09 M10 M11 M12 M13 M14 M15 M16
      Sex       2        Female Male

      Dimensions
      Covariance Parameters      5
      Columns in X               6
      Columns in Z Per Subject   2
      Subjects                   27
      Max Obs Per Subject        4

  21. Number of Observations
      Number of Observations Read       108
      Number of Observations Used       108
      Number of Observations Not Used   0

      Iteration History
      Iteration   Evaluations   -2 Res Log Like   Criterion
      0           1             483.55911746
      1           2             428.86186900      0.00042400
      2           1             428.80826312      0.00000516
      3           1             428.80764501      0.00000000

      Convergence criteria met.

      Estimated R Matrix for Subject F01
      Row    Col1      Col2      Col3      Col4
      1      1.1924   -0.5644    0.2671   -0.1265
      2     -0.5644    1.1924   -0.5644    0.2671
      3      0.2671   -0.5644    1.1924   -0.5644
      4     -0.1265    0.2671   -0.5644    1.1924

  22. Estimated R Correlation Matrix for Subject F01
      Row    Col1      Col2      Col3      Col4
      1      1.0000   -0.4733    0.2240   -0.1060
      2     -0.4733    1.0000   -0.4733    0.2240
      3      0.2240   -0.4733    1.0000   -0.4733
      4     -0.1060    0.2240   -0.4733    1.0000

      Estimated Chol(R) Matrix for Subject F01
      Row    Col1      Col2      Col3      Col4
      1      1.0920
      2     -0.5169    0.9619
      3      0.2447   -0.4553    0.9619
      4     -0.1158    0.2155   -0.4553    0.9619

      Estimated InvChol(R) Matrix for Subject F01
      Row    Col1      Col2      Col3      Col4
      1      0.9158
      2      0.4921    1.0396
      3      0.4921    1.0396
      4      0.4921    1.0396

  23. Estimated G Matrix
      Row   Effect      Subject    Col1       Col2
      1     Intercept   F01        11.3783   -0.8151
      2     age         F01        -0.8151    0.08455

      Estimated Chol(G) Matrix
      Row   Effect      Subject    Col1       Col2
      1     Intercept   F01         3.3732
      2     age         F01        -0.2416    0.1618

      Estimated G Correlation Matrix
      Row   Effect      Subject    Col1       Col2
      1     Intercept   F01         1.0000   -0.8310
      2     age         F01        -0.8310    1.0000

      Estimated V Matrix for Subject F01
      Row   Col1     Col2     Col3     Col4
      1     4.9411   2.9071   3.4613   2.7905
      2     2.9071   4.7248   3.0290   3.9214
      3     3.4613   3.0290   5.1849   3.8273
      4     2.7905   3.9214   3.8273   6.3214

  24. Estimated V Correlation Matrix for Subject F01
      Row   Col1     Col2     Col3     Col4
      1     1.0000   0.6017   0.6839   0.4993
      2     0.6017   1.0000   0.6120   0.7175
      3     0.6839   0.6120   1.0000   0.6685
      4     0.4993   0.7175   0.6685   1.0000

      Covariance Parameter Estimates
      Cov Parm   Subject   Estimate
      FA(1,1)    Subject    3.3732
      FA(2,1)    Subject   -0.2416
      FA(2,2)    Subject    0.1618
      AR(1)      Subject   -0.4733
      Residual              1.1924

      Asymptotic Covariance Matrix of Estimates
      Row   Cov Parm    CovP1      CovP2      CovP3      CovP4      CovP5
      1     FA(1,1)     0.5338    -0.04666    0.001766  -0.04933   -0.04393
      2     FA(2,1)    -0.04666    0.005763  -0.00036    0.006055   0.004865
      3     FA(2,2)     0.001766  -0.00036    0.000649  -0.00097   -0.00059
      4     AR(1)      -0.04933    0.006055  -0.00097    0.03500    0.009928
      5     Residual   -0.04393    0.004865  -0.00059    0.009928   0.05547

      Asymptotic Correlation Matrix of Estimates
      Row   Cov Parm    CovP1      CovP2      CovP3      CovP4      CovP5
      1     FA(1,1)     1.0000    -0.8412     0.09491   -0.3609    -0.2553
      2     FA(2,1)    -0.8412     1.0000    -0.1875     0.4264     0.2721

  25.   3     FA(2,2)     0.09491   -0.1875     1.0000    -0.2043    -0.09846
        4     AR(1)      -0.3609     0.4264    -0.2043     1.0000     0.2253
        5     Residual   -0.2553     0.2721    -0.09846    0.2253     1.0000

      Fit Statistics
      -2 Res Log Likelihood      428.8
      AIC (smaller is better)    438.8
      AICC (smaller is better)   439.4
      BIC (smaller is better)    445.3

      Null Model Likelihood Ratio Test
      DF   Chi-Square   Pr > ChiSq
      4    54.75        <.0001

      Solution for Fixed Effects
      Effect      Sex      Estimate   Std Error   DF   t Value   Pr > |t|   Alpha   Lower     Upper
      Intercept            16.1524    0.9985      25   16.18     <.0001     0.05    14.0960   18.2088
      age                   0.7980    0.08707     25    9.16     <.0001     0.05     0.6186    0.9773
      Sex         Female    1.2647    1.5643      54    0.81     0.4224     0.05    -1.8715    4.4010
      Sex         Male      0         .           .     .        .          .        .         .
      age*Sex     Female   -0.3222    0.1364      54   -2.36     0.0218     0.05    -0.5957   -0.04876
      age*Sex     Male      0         .           .     .        .          .        .         .

      Covariance Matrix for Fixed Effects
      Row   Effect      Sex      Col1       Col2       Col3      Col4   Col5       Col6
      1     Intercept             0.9969   -0.07620   -0.9969           0.07620

  26.   2     age                  -0.07620   0.007581   0.07620         -0.00758
        3     Sex         Female   -0.9969    0.07620    2.4470          -0.1870
        4     Sex         Male
        5     age*Sex     Female    0.07620  -0.00758   -0.1870           0.01861
        6     age*Sex     Male

      Correlation Matrix for Fixed Effects
      Row   Effect      Sex      Col1      Col2      Col3      Col4     Col5      Col6
      1     Intercept             1.0000   -0.8765   -0.6383            0.5595
      2     age                  -0.8765    1.0000    0.5595           -0.6383
      3     Sex         Female   -0.6383    0.5595    1.0000           -0.8765
      4     Sex         Male                                   1.0000
      5     age*Sex     Female    0.5595   -0.6383   -0.8765            1.0000
      6     age*Sex     Male                                                     1.0000

      Type 3 Tests of Fixed Effects
      Effect    Num DF   Den DF   F Value   Pr > F
      age       1        25       87.18     <.0001
      Sex       1        54        0.65     0.4224
      age*Sex   1        54        5.58     0.0218

      Estimates
      Label        Estimate   Std Error   DF   t Value   Pr > |t|
      gap at 14    -3.2467    0.9258      54   -3.51     0.0009

  27.   gap at 11.5  -2.4411    0.7785      54   -3.14     0.0028

      Diagnostics

      Regression diagnostics are available in SAS Version 9. You must specify:

      ODS HTML;
      ODS GRAPHICS ON;

      before invoking PROC MIXED, and you must use the INFLUENCE and/or the
      RESIDUAL options in the MODEL statement. The INFLUENCE option can be used
      to see the effect of dropping an occasion or an entire subject. To drop
      occasions, use INFLUENCE alone; to drop subjects, use
      INFLUENCE ( EFFECT = subject-variable ). Often it will be desirable to do
      both; as far as I can see, to get diagnostics for both, you need to run
      PROC MIXED twice.

      The following output shows diagnostics for deleting entire subjects. The
      influential case most clearly identified is subject 24, who has the lowest
      starting value and the steepest growth. Subject 20, with zigzagging
      growth, is influential but not, generally, as much.

      Try rerunning the code with the following options:
      INFLUENCE ( EFFECT = SUBJECT ITER = 3 ). This will also produce output
      showing influence on the covariance structure.
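The ESTIMATE results above can be checked by hand: a contrast estimate is just $L\hat\beta$. A minimal NumPy sketch using the fixed-effect estimates reported earlier (which are rounded, so the results match SAS only approximately):

```python
import numpy as np

# Fixed effects in SAS order: Intercept, age, Sex=F, Sex=M,
# age*Sex=F, age*Sex=M (reference levels are 0).
beta = np.array([16.1524, 0.7980, 1.2647, 0.0, -0.3222, 0.0])

# Contrast rows from the ESTIMATE statements: Female minus Male at a given age.
L14 = np.array([0.0, 0.0, 1.0, -1.0, 14.0, -14.0])
L115 = np.array([0.0, 0.0, 1.0, -1.0, 11.5, -11.5])

gap_at_14 = L14 @ beta      # approx -3.25, vs. SAS's -3.2467
gap_at_11_5 = L115 @ beta   # approx -2.44, vs. SAS's -2.4411
```

The standard errors would come from $\sqrt{L\,\widehat{\mathrm{Var}}(\hat\beta)\,L'}$, which requires the full covariance matrix of the fixed effects.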

  28. An example of a diagnostic plot:
      [Figure: influence diagnostics by subject.]

  29. Modeling dependencies in time

      The main difference between using mixed models for multilevel modeling as
      opposed to longitudinal modeling is the assumptions about the
      $\varepsilon_{it}$. For observations observed in time, part of the
      correlation between the $\varepsilon$s should be related to their distance
      in time. The R-side model allows the modeling of temporal and spatial
      dependence, via the TYPE option of the REPEATED statement:

      AR(1):
      $\sigma^2 \begin{bmatrix}1 & \rho & \rho^2 & \rho^3\\
        \rho & 1 & \rho & \rho^2\\ \rho^2 & \rho & 1 & \rho\\
        \rho^3 & \rho^2 & \rho & 1\end{bmatrix}$

      ARMA(1,1):
      $\sigma^2 \begin{bmatrix}1 & \gamma & \gamma\rho & \gamma\rho^2\\
        \gamma & 1 & \gamma & \gamma\rho\\ \gamma\rho & \gamma & 1 & \gamma\\
        \gamma\rho^2 & \gamma\rho & \gamma & 1\end{bmatrix}$

      SP(POW)(time), i.e. AR(1) in continuous time:
      $\sigma^2 \begin{bmatrix}1 & \rho & \rho^{4.5} & \rho^{9}\\
        \rho & 1 & \rho^{3.5} & \rho^{8}\\ \rho^{4.5} & \rho^{3.5} & 1 & \rho^{4.5}\\
        \rho^{9} & \rho^{8} & \rho^{4.5} & 1\end{bmatrix}$
      e.g. supposing a subject with times 1, 2, 5.5 and 10.

      Note that the times, and the number of times (hence the indices), can
      change from subject to subject, but $\sigma$ and $\rho$ have the same
      value.
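The SP(POW) structure is $\sigma^2\rho^{|t_j - t_k|}$ in the actual observation times; a minimal NumPy sketch using the example subject's times and an illustrative $\rho$:

```python
import numpy as np

# SP(POW)(time): R[j, k] = sigma^2 * rho^|t_j - t_k|, the continuous-time
# analogue of AR(1). Times from the example; sigma2 and rho illustrative.
sigma2, rho = 1.0, 0.6
t = np.array([1.0, 2.0, 5.5, 10.0])
R = sigma2 * rho ** np.abs(t[:, None] - t[None, :])
# e.g. R[1, 2] = rho^{5.5 - 2} = rho^3.5, matching the matrix above
```

Unlike plain AR(1), this handles irregular and subject-specific measurement times, which is why it suits unbalanced longitudinal data.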

  30. G-side vs. R-side
      - A few things can be done with either side. But don't do it with both in
        the same model: the redundant parameters will not be identifiable. For
        example, the G-side random intercept model is 'almost' equivalent to the
        R-side compound symmetry model.
      - With OLS the linear parameters are orthogonal to the variance parameter.
        Collinearity among the linear parameters is determined by the design,
        X, and does not depend on the values of parameters. Computational
        problems due to collinearity can be addressed by orthogonalizing the X
        matrix.
      - With mixed models the variance parameters are generally not orthogonal
        to each other and, with unbalanced data, the linear parameters are not
        orthogonal to the variance parameters.
      - G-side parameters can be highly collinear even if the X matrix is
        orthogonal. Centering the matrix around the "point of minimal variance"
        will help, but the resulting design matrix may still be highly
        collinear.
      - G-side and R-side parameters can be highly collinear. The degree of
        collinearity may depend on the values of the parameters.

  31. - For example, our model identifies $\rho$ through:

        $\hat V = \begin{bmatrix}1 & -3\\ 1 & -1\\ 1 & 1\\ 1 & 3\end{bmatrix}
          \begin{bmatrix}g_{00} & g_{01}\\ g_{10} & g_{11}\end{bmatrix}
          \begin{bmatrix}1 & 1 & 1 & 1\\ -3 & -1 & 1 & 3\end{bmatrix}
        + \sigma^2 \begin{bmatrix}1 & \rho & \rho^2 & \rho^3\\
          \rho & 1 & \rho & \rho^2\\ \rho^2 & \rho & 1 & \rho\\
          \rho^3 & \rho^2 & \rho & 1\end{bmatrix}$

        For values of $\rho$ above 0.5, the Hessian is very ill-conditioned.
        The lesson may be that to use AR, ARMA models effectively, you need
        some subjects observed on many occasions.
      - R-side only: population average models.
      - G-side only: hierarchical models with conditionally independent
        observations in each cluster.
      - Population average longitudinal models can be done on the R-side with
        AR, ARMA structures, etc.
      - The absence of the G-side may be less crucial with balanced data.

  32. - The G-side is not enough to provide control for unmeasured
        between-subject confounders if the time-varying predictors are
        unbalanced (more on this soon).
      - A G-side random-effects model DOES NOT provide the equivalent of
        temporal correlation.

  33. Simpler Models

      The model we've looked at is deliberately complex, including examples of
      the main typical components of a mixed model. We can use mixed models for
      simpler problems. Using X as a generic time-varying (within-subject)
      predictor and W as a generic time-invariant (between-subject) predictor,
      we have the following:

      Model                       MODEL          RANDOM   Formula
      One-way ANOVA with          INT            INT      $y_{it} = \beta_{00} + \gamma_{0i} + \varepsilon_{it}$
        random effects
      Means as outcomes           INT W          INT      $y_{it} = \beta_{00} + \beta_{01}W_i + \gamma_{0i} + \varepsilon_{it}$
      One-way ANCOVA with         INT X          INT      $y_{it} = \beta_{00} + \beta_{10}X_{it} + \gamma_{0i} + \varepsilon_{it}$
        random effects
      Random coefficients model   INT X          INT X    $y_{it} = \beta_{00} + \beta_{10}X_{it} + \gamma_{0i} + \gamma_{1i}X_{it} + \varepsilon_{it}$
      Intercepts and slopes       INT X W X*W    INT X    $y_{it} = \beta_{00} + \beta_{01}W_i + \beta_{10}X_{it} + \beta_{11}W_iX_{it} + \gamma_{0i} + \gamma_{1i}X_{it} + \varepsilon_{it}$
        as outcomes
      Non-random slopes           INT X W X*W    INT      $y_{it} = \beta_{00} + \beta_{01}W_i + \beta_{10}X_{it} + \beta_{11}W_iX_{it} + \gamma_{0i} + \varepsilon_{it}$

  34. BLUPs: Estimating Within-Subject Effects

      We've seen how to estimate $\beta$, $G$ and $R$. Now we consider
      $\psi_i = \begin{bmatrix}\psi_{0i}\\ \psi_{1i}\end{bmatrix}$.

      We've already estimated $\psi_i$ using the fixed-effects model with an
      OLS regression within each subject. Call this estimator $\hat\psi_i$.
      How good is it?

      $\mathrm{Var}(\hat\psi_i - \psi_i) = \sigma^2 (X_i'X_i)^{-1}$

      Can we do better? We have another 'estimator' of $\psi_i$. Suppose we
      know the $\beta$s. We could also predict* $\psi_i$ by using the
      within-Sex mean intercepts and slopes, e.g. for Males we could use
      $\begin{bmatrix}\beta_{00}\\ \beta_{10}\end{bmatrix}$, with error variance

      $\mathrm{Var}\!\left(\begin{bmatrix}\beta_{00}\\ \beta_{10}\end{bmatrix} - \psi_i\right) = G$

      * Non-statisticians are always thrown for a loop when we 'predict'
      something that happened in the past. We know what we mean.

  35. We could then combine $\hat\psi_i$ and
      $\begin{bmatrix}\beta_{00}\\ \beta_{10}\end{bmatrix}$ by weighting them by
      inverse variance. This yields the BLUP (Best Linear Unbiased Predictor):

      $\left\{G^{-1} + \left[\sigma^2(X_i'X_i)^{-1}\right]^{-1}\right\}^{-1}
       \left\{G^{-1}\begin{bmatrix}\beta_{00}\\ \beta_{10}\end{bmatrix}
       + \left[\sigma^2(X_i'X_i)^{-1}\right]^{-1}\hat\psi_i\right\}$

      If we replace the unknown parameters with their estimates, we get the
      EBLUP (Empirical BLUP):

      $\tilde\psi_i =
       \left\{\hat G^{-1} + \left[\hat\sigma^2(X_i'X_i)^{-1}\right]^{-1}\right\}^{-1}
       \left\{\hat G^{-1}\begin{bmatrix}\hat\beta_{00}\\ \hat\beta_{10}\end{bmatrix}
       + \left[\hat\sigma^2(X_i'X_i)^{-1}\right]^{-1}\hat\psi_i\right\}$

      The EBLUP 'optimally' combines the information from the ith cluster with
      the information from the other clusters: we borrow strength from the
      other clusters. The process 'shrinks' $\hat\psi_i$ towards
      $\begin{bmatrix}\hat\beta_{00}\\ \hat\beta_{10}\end{bmatrix}$ along a path
      determined by the locus of osculation of the families of ellipses with
      shape $\hat\sigma^2(X_i'X_i)^{-1}$ around $\hat\psi_i$ and shape $\hat G$
      around $\begin{bmatrix}\hat\beta_{00}\\ \hat\beta_{10}\end{bmatrix}$.
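The inverse-variance weighting behind the BLUP can be sketched directly; a minimal NumPy illustration with hypothetical values for $G$, $\sigma^2(X_i'X_i)^{-1}$, the population line, and a subject's OLS estimate (none taken from the fitted model):

```python
import numpy as np

# Precision-weighted combination behind the BLUP (illustrative values).
G = np.array([[11.0, -0.8], [-0.8, 0.09]])     # between-subject covariance
Vw = np.array([[2.0, -0.15], [-0.15, 0.02]])   # sigma^2 (X_i' X_i)^{-1}
mu = np.array([16.3, 0.78])                    # population line (e.g. Males)
psi_hat = np.array([18.0, 0.40])               # within-subject OLS estimate

Ginv, Vwinv = np.linalg.inv(G), np.linalg.inv(Vw)
blup = np.linalg.solve(Ginv + Vwinv, Ginv @ mu + Vwinv @ psi_hat)
```

The result shrinks `psi_hat` towards `mu` in the metric defined by the two precision matrices; componentwise the BLUP need not lie between them, which is the 'paradox' noted later in the deck.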

  36. [Trellis plot: population line ('Popn'), BLUE and BLUP for each subject,
      distance vs. age.]

      The slope of the BLUP is close to the population slope, but the level of
      the BLUP is close to the level of the BLUE. This suggests that G has a
      large variance for intercepts and a small variance for slopes.

  37. Population estimate, BLUE and BLUP in beta space.
      [Trellis plot: intercept vs. slope for each subject.]

  38. The marginal dispersion of the BLUEs comes from:

      $\mathrm{Var}(\hat\psi_i) = G + \sigma^2(X_i'X_i)^{-1}$

      so, for the slope,
      $\mathrm{Var}(\hat\psi_{1i}) \approx g_{11} + \dfrac{\sigma^2}{T_i S^2_{X_i}}$.

      - $\mathrm{Var}(\psi_i) = G$  [population variance]
      - $\mathrm{Var}(\hat\psi_i \mid \psi_i) = \sigma^2(X_i'X_i)^{-1}$
        [conditional variance, resampling from the ith subject]
      - $E(\hat\psi_i \mid \psi_i) = \psi_i$  [BLUE]

      So:
      $\mathrm{Var}(\hat\psi_i)
       = \mathrm{Var}(E(\hat\psi_i \mid \psi_i)) + E\{\mathrm{Var}(\hat\psi_i \mid \psi_i)\}
       = \mathrm{Var}(\psi_i) + E\{\mathrm{Var}(\hat\psi_i \mid \psi_i)\}
       = G + \sigma^2(X_i'X_i)^{-1}$

      [Figure: beta space, Male and Female panels.]

  39. While the expected variance of the BLUEs is larger than G, the expected
      variance of the BLUPs is smaller than G. Beware of drawing conclusions
      about G from the dispersion of the BLUPs.

      The estimate of G can be unstable and often collapses to singularity,
      leading to non-convergence for many methods. Possible remedies:
      - recentre X near the point of minimal variance,
      - use a smaller G,
      - change the model.

      [Figure: beta space, Male and Female panels.]

  40. Where the EBLUP comes from: looking at a single subject (M11)

      Note that the EBLUP's slope is close to the slope of the population
      estimate (i.e. the male population, conditioning on between-subject
      predictors), while the level of the line is close to the level of the
      BLUE. The relative precisions of the BLUE and of the population estimate
      on slope and level are reflected through the shapes of $G$ and
      $\sigma^2(X_i'X_i)^{-1}$.

      [Figure: M11 in data space with population, BLUE and BLUP lines.]

  41. The same picture in 'beta space'.
      [Figure: M11; population, BLUE and BLUP as points in (Int, slope) space.]

  42. The population estimate with an SD ellipse.
      [Figure: M11, beta space.]

  43. The population estimate with an SD ellipse, and the BLUE with its SE
      ellipse.
      [Figure: M11, beta space.]

  44. The EBLUP is an inverse-variance-weighted mean of the BLUE and of the
      population estimate. We can think of taking the BLUE and 'shrinking' it
      towards the population estimate along a path that optimally combines the
      two components. The path is formed by the osculation points of the
      families of ellipses around the BLUE and the population estimate.
      [Figure: M11, beta space with shrinkage path.]

  45. The amount and direction of shrinkage depends on the relative shapes and
      sizes of $G$ and
      $\mathrm{Var}(\hat\psi_i \mid \psi_i) = \sigma^2(X_i'X_i)^{-1}
       \approx \dfrac{\sigma^2}{T_i}S_{X_i}^{-1}$.

      The BLUP is at an osculation point of the families of ellipses generated
      around the BLUE and the population estimate. Imagine what could happen if
      G were oriented differently: paradoxically, both the slope and the
      intercept could be far outside the population estimate and the BLUE.
      [Figure: M11, beta space.]

  46. When is a BLUP a BLUPPER?

      The rationale behind BLUPs is based on exchangeability. No outside
      information should make this cluster stand out from the others, and the
      mean of the population deserves the same weight in prediction for this
      cluster as it deserves for any other cluster that doesn't stand out. If a
      cluster does stand out somehow, then the BLUP might be a BLUPPER.

  47. Interpreting G

      The parameters of G give the variance of the intercepts, the variance of
      the slopes, and the covariance between intercepts and slopes. Would it
      make sense to assume that the covariance is 0 to reduce the number of
      parameters in the model?

      To address this, consider that the variance of the heights of individual
      regression lines at a fixed value of X is:

      $\mathrm{Var}(\psi_{0i} + \psi_{1i}X)
       = \begin{bmatrix}1 & X\end{bmatrix}
         \mathrm{Var}\begin{pmatrix}\psi_{0i}\\ \psi_{1i}\end{pmatrix}
         \begin{bmatrix}1\\ X\end{bmatrix}
       = \begin{bmatrix}1 & X\end{bmatrix}
         \begin{bmatrix}g_{00} & g_{01}\\ g_{10} & g_{11}\end{bmatrix}
         \begin{bmatrix}1\\ X\end{bmatrix}
       = g_{00} + 2g_{01}X + g_{11}X^2$

      So $\mathrm{Var}(\psi_{0i} + \psi_{1i}X)$ has a minimum at
      $X = -\,g_{01}/g_{11}$

  48. and the minimum variance is $g_{00} - g_{01}^2/g_{11}$.

      Thus assuming the covariance is 0 is equivalent to assuming that the
      minimum variance occurs at X = 0. This assumption is not invariant under
      location transformations of X. It is similar to removing a main effect
      that is marginal to an interaction in a model, something that should not
      be done without a thorough understanding of its consequences.

      Example: let
      $\beta = \begin{bmatrix}\beta_0\\ \beta_1\end{bmatrix}
             = \begin{bmatrix}20\\ -1\end{bmatrix}$ and
      $G = \begin{bmatrix}10.5 & -1\\ -1 & 0.1\end{bmatrix}$
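The quadratic in X and its minimum can be checked numerically; a short sketch using illustrative values consistent with the example above:

```python
import numpy as np

# Height variance of random lines: Var(psi0 + psi1 * X)
#   = g00 + 2*g01*X + g11*X^2  (illustrative g values)
g00, g01, g11 = 10.5, -1.0, 0.1

def height_var(X):
    return g00 + 2 * g01 * X + g11 * X ** 2

X_min = -g01 / g11                       # location of minimum variance
min_var = g00 - g01 ** 2 / g11           # value of the minimum
```

Setting $g_{01}=0$ would force `X_min` to 0, which is exactly the location-dependence the text warns about.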

  49. A sample of lines in beta space.
      [Figure: $\psi_0$ vs. $\psi_1$ with concentration ellipse.]

  50. The same lines in data space.
      [Figure: Y vs. X.]

  51. The same lines in data space with the population mean line and lines at
      one SD above and below the population mean line.
      [Figure: Y vs. X.]

  52. The parameters of G determine the location and value of the minimum
      standard deviation of the lines: the minimum,
      $\sqrt{g_{00} - g_{01}^2/g_{11}}$, occurs at $X = -g_{01}/g_{11}$.
      [Figure: Y vs. X, with the minimum-SD location marked.]

  53. With two time-varying variables with random effects, the G matrix would
      look like:

      $\mathrm{Var}\begin{pmatrix}\psi_{0i}\\ \psi_{1i}\\ \psi_{2i}\end{pmatrix}
       = \begin{bmatrix}g_{00} & g_{01} & g_{02}\\
                        g_{10} & g_{11} & g_{12}\\
                        g_{20} & g_{21} & g_{22}\end{bmatrix}$

      $g_{01}$ and $g_{02}$ are related to the point of minimum variance in
      $(X_1, X_2)$ space, but $g_{12}$ has an interpretation that is
      origin-invariant and could be a candidate for an assumption that it is 0.
      PROC MIXED code to do this is shown in an appendix.

  54. Differences between PROC GLM and PROC MIXED with balanced data

      Just looking at regression coefficients:

      PROC GLM;
        MODEL RESPONSE = SEX AGE AGE*SEX;

      Parameter          Estimate          Standard Error   t Value   Pr > |t|
      Intercept          16.34062500 B     1.41622419         11.54   <.0001
      age                 0.78437500 B     0.12616728          6.22   <.0001
      Sex Female          1.03210227 B     2.21879688          0.47   0.6428
      Sex Male            0.00000000 B     .                   .      .
      age*Sex Female     -0.30482955 B     0.19766614         -1.54   0.1261
      age*Sex Male        0.00000000 B     .                   .      .

  55. PROC MIXED;
        MODEL RESPONSE = SEX AGE AGE*SEX;
        RANDOM INT / SUB = SUBJECT TYPE = FA0(1);

      Solution for Fixed Effects
      Effect      Sex      Estimate   Std Error   DF    t Value   Pr > |t|
      Intercept            16.3406    0.9813      104   16.65     <.0001
      age                   0.7844    0.07750      79   10.12     <.0001
      Sex         Female    1.0321    1.5374      104    0.67     0.5035
      Sex         Male      0         .             .    .        .
      age*Sex     Female   -0.3048    0.1214       79   -2.51     0.0141
      age*Sex     Male      0         .             .    .        .

  56. PROC MIXED;
        MODEL RESPONSE = SEX AGE AGE*SEX;
        RANDOM INT AGE / SUB = SUBJECT TYPE = FA0(2);

      Effect      Sex      Estimate   Std Error   DF   t Value   Pr > |t|
      Intercept            16.3406    1.0185      25   16.04     <.0001
      age                   0.7844    0.08600     25    9.12     <.0001
      Sex         Female    1.0321    1.5957      25    0.65     0.5237
      Sex         Male      0         .            .    .        .
      age*Sex     Female   -0.3048    0.1347      25   -2.26     0.0326
      age*Sex     Male      0         .            .    .        .

      PROC MIXED also gives estimates of $G$ and $V_i = Z_iGZ_i' + R_i$:

      Estimated G Matrix
      Row   Effect      Subject    Col1      Col2
      1     Intercept   F01        5.7864   -0.2896
      2     age         F01       -0.2896    0.03252

  57. Estimated Chol(G) Matrix
      Row   Effect      Subject    Col1      Col2
      1     Intercept   F01        2.4055
      2     age         F01       -0.1204    0.1343

      Estimated G Correlation Matrix
      Row   Effect      Subject    Col1      Col2
      1     Intercept   F01        1.0000   -0.6676
      2     age         F01       -0.6676    1.0000

      Estimated V Matrix for Subject F01
      Row   Col1     Col2     Col3     Col4
      1     4.9502   3.1751   3.1162   3.0574
      2     3.1751   4.9625   3.3176   3.3888
      3     3.1162   3.3176   5.2351   3.7202
      4     3.0574   3.3888   3.7202   5.7679

  58. Estimated V Correlation Matrix for Subject F01
      Row   Col1     Col2     Col3     Col4
      1     1.0000   0.6406   0.6122   0.5722
      2     0.6406   1.0000   0.6509   0.6334
      3     0.6122   0.6509   1.0000   0.6770
      4     0.5722   0.6334   0.6770   1.0000

      When data are balanced, PROC GLM and PROC MIXED produce the same
      $\hat\beta$s with different SEs.

  59. Take 2: Learning lessons from unbalanced data

      What can happen with unbalanced data? Here is some data similar to the
      Potthoff and Roy data but with:
      - different age ranges for different subjects
      - a between-subject effect of age that is different from the
        within-subject effect of age

  60. [Trellis plot: y vs. raw age for each subject, F01-F11 and M01-M16.]

  61. [Figure: y vs. raw age, Male and Female panels.]

  62. Using age centered at 25. Why? Like the ordinary regression model, the
      mixed model is equivariant under global centering, but convergence may be
      improved because the G matrix is less eccentric.
      [Figure: y vs. centered age, Male and Female panels.]

  63. SAS code and output

      PROC MIXED;
        CLASS subject sex;
        MODEL y = age sex sex*age / SOLUTION DDFM = SATTERTH;
        RANDOM age INTERCEPT / TYPE = FA0(2) SUB = subject;
      RUN;

      Solution for Fixed Effects
      Effect      Sex      Estimate   Std Error   DF     t Value   Pr > |t|
      Intercept            23.3275    2.8521      20      8.18     <.0001
      age                  -0.3778    0.1083      87.3   -3.49     0.0008
      Sex         Female   -0.1476    4.4687      20     -0.03     0.9740
      Sex         Male      0         .            .      .        .
      age*Sex     Female   -0.1268    0.1724      83     -0.74     0.4642
      age*Sex     Male      0         .            .      .        .

64. Between, Within and Pooled Models

We first focus on one group, the female data. What models could we fit to these data?

[Scatterplot: y versus age for the female subjects.]

65. marginal SD ellipse

[Scatterplot: y versus age with the marginal SD (dispersion) ellipse.]

66. marginal regression line

Regressing with the pooled data, ignoring Subject, yields the marginal (unconditional) estimate β̂_P of the slope.

[Scatterplot with the marginal regression line.]

67. We could replace each Subject by its means for x and y and use the resulting aggregated data, with one point for each Subject.

[Scatterplot of subject means.]

68. dispersion ellipse of subject means

[Scatterplot with the dispersion ellipse of the subject means.]

69. regression on subject means = ecological regression

Performing a regression on the aggregated data yields the 'between-subject' regression, in some contexts called an 'ecological regression'.

[Scatterplot with the between-subject regression line.]

70. within-subject regression

We can combine all the within-subject regressions to get a combined estimate of the within-subject slope. This is the estimate obtained with a fixed-effects model using age and Subject additively. Equivalently, we can perform a regression of (y minus the within-subject mean of y) on (age minus the within-subject mean of age).

Q: Which is better: β̂_B, β̂_W, or β̂_P?
A: None. They answer different questions. Typically, β̂_P would be used for prediction across the population; β̂_W for 'causal' inference controlling for between-subject confounders.

[Scatterplot with the within-subject regression lines.]

71. The relationship among estimators

β̂_P combines β̂_B and β̂_W:

    β̂_P = (W_B + W_W)⁻¹ (W_B β̂_B + W_W β̂_W)

The weights depend only on the design, not on estimated variances of the response.

[Scatterplot with the between-subject, marginal, and within-subject regression lines.]
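This decomposition is easy to verify numerically: for a simple regression, W_B and W_W are just the between- and within-subject sums of squares of age. A minimal Python sketch on hypothetical balanced toy data (subjects, ages, and y values are invented purely for illustration):

```python
# Hypothetical balanced toy data: 3 subjects, 3 occasions each
data = {
    "S1": [(0, 10.0), (1, 9.0), (2, 8.0)],     # (age, y)
    "S2": [(3, 14.0), (4, 13.5), (5, 12.0)],
    "S3": [(6, 18.0), (7, 17.0), (8, 16.5)],
}

def slope_and_ssx(pairs):
    """Simple-regression slope and sum of squares of x."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    ssx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / ssx, ssx

# Marginal (pooled) slope: ignore Subject entirely
beta_P, _ = slope_and_ssx([p for ps in data.values() for p in ps])

# Between-subject (ecological) slope: regression on the subject means
means = [(sum(x for x, _ in ps) / len(ps), sum(y for _, y in ps) / len(ps))
         for ps in data.values()]
beta_B, ssx_means = slope_and_ssx(means)
W_B = 3 * ssx_means          # n_i = 3 observations behind each mean

# Within-subject slope: pool the centered sums of squares and products
sxy_W = W_W = 0.0
for ps in data.values():
    b, ssx = slope_and_ssx(ps)
    sxy_W += b * ssx
    W_W += ssx
beta_W = sxy_W / W_W

# The pooled slope equals the weighted combination of between and within
combo = (W_B * beta_B + W_W * beta_W) / (W_B + W_W)
print(round(beta_P, 6), round(combo, 6))  # both ~ 1.133333
```

Note that W_B and W_W involve only the ages, confirming the slide's point that the weights depend on the design, not on the response.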

72. The Mixed Model

The mixed model estimate³ also combines β̂_B and β̂_W:

    β̂_MM = (W_B^MM + W_W)⁻¹ (W_B^MM β̂_B + W_W β̂_W)

but with a lower weight on β̂_B:

    W_B^MM = [ (σ²/T) / (σ²/T + g₀₀) ] × W_B = [ 1 / (1 + T g₀₀ / σ²) ] × W_B

Note that W_B^MM ≤ W_B.

[Scatterplot with the between-subject, mixed model, marginal, and within-subject regression lines.]

³ Using a random intercept model

73.

• The mixed model estimator is a variance-optimal combination of β̂_B and β̂_W.
• It makes perfect sense if β̂_B and β̂_W estimate the same thing, i.e. if β_B = β_W!
• Otherwise, it is an arbitrary combination of estimates that estimate different things. The weights in the combination have no substantive interpretation.
• i.e. it is an optimal answer to a meaningless question.

Summary of the relationships among 4 models:

    Model                     Estimate of slope    Precision
    Between Subjects          β̂_B                  W_B
    Marginal (pooled data)    β̂_P
    Mixed Model               β̂_MM
    Within Subjects           β̂_W                  W_W

The pooled estimate combines β̂_B and β̂_W:

    β̂_P = (W_B + W_W)⁻¹ (W_B β̂_B + W_W β̂_W)

Mixed model:

74. With a random intercept model:

    y_it = β₀ + β₁ X_it + γ₀ᵢ + ε_it,    γ₀ᵢ ~ N(0, g₀₀),    ε_it ~ N(0, σ²)

with σ² and g₀₀ known, β̂_MM is also a weighted combination of β̂_B and β̂_W but with less weight on β̂_B:

    W_B^MM = [ (σ²/T) / (σ²/T + g₀₀) ] × W_B
           = f( Between-Subject Information / Within-Subject Information ) × W_B,    f monotone

• β̂_MM is between β̂_W and β̂_P, i.e. it does better than β̂_P in the sense of being closer to β̂_W, but it is not equivalent to β̂_W.
• With balanced data, β̂_W = β̂_MM = β̂_P.
• As (σ²/T) × (1/g₀₀) → 0, β̂_MM → β̂_W, so a mixed model estimates the within effect asymptotically in T.
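The limiting behaviour follows directly from the shrunken between weight. A minimal numeric sketch, writing ratio = (σ²/T)/g₀₀ (the slopes and design weights below are invented for illustration):

```python
# Hypothetical between/within slopes and design weights (illustrative only)
beta_B, beta_W = 1.36, -0.92
W_B, W_W = 54.0, 6.0

def beta_MM(ratio):
    """Mixed-model slope using the shrunken between weight
    W_B_MM = [ratio / (1 + ratio)] * W_B, where ratio = (sigma^2/T) / g00."""
    W_B_MM = W_B * ratio / (1.0 + ratio)
    return (W_B_MM * beta_B + W_W * beta_W) / (W_B_MM + W_W)

# Unshrunken (pooled) combination for comparison
beta_P = (W_B * beta_B + W_W * beta_W) / (W_B + W_W)

print(round(beta_MM(1e-9), 4))  # ~ beta_W: within slope gets all the weight
print(round(beta_MM(1e9), 4))   # ~ beta_P: the full between weight is restored
```

Under this weight formula the large-ratio limit restores the full between weight W_B; when W_B is much larger than W_W, as is typical, that combination is dominated by β̂_B, which is the sense in which the mixed model then fails to control for between-subject confounding.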

75.

• As (σ²/T) × (1/g₀₀) → ∞, β̂_MM → β̂_B. Thus the mixed model estimate fails to control for between-subject confounding factors.

76. A serious problem? A simulation

1,000 simulations showing mixed model estimates of the slope, using the same configuration of Xs with β_W = 1 and β_B = −1/2, keeping g₀₀ = 1/2 and allowing σ² to vary from 0.005 to 5.

[Scatterplot: β̂ versus σ.]

77. What happened?

As σ gets larger, the relatively small value of g₀₀ is harder to identify, and both sources of variability (within-subject and between-subject) are attributed to σ. The blue line is the diagonal σ̂ = σ, and the equation of the red line is σ̂ = σ + 1. When ĝ₀₀ ≈ 0, the between-subject relationship is treated as if it had very high precision, and it dominates in forming the mixed model estimate.

[Scatterplot: σ̂ versus σ.]

78. Splitting age into two variables

Since age has a within-subject effect that is inconsistent with its between-subject effect, we can split it into two variables:

1. a between-subject 'contextual' predictor, e.g. age.mean, the mean age of each subject (or the starting age), and
2. a within-subject predictor:
   a. age itself, or
   b. the within-subject residual: age.diff = age − age.mean

So we model:

    E(y_it) = β₀ + β_age.mean age.mean_i + β_age age_it

or

    E(y_it) = β₀* + β*_age.mean age.mean_i + β*_age.diff age.diff_it

Surprisingly,

79.

    β*_age.diff = β_age

but

    β*_age.mean = β_age.mean + β*_age.diff = β_age.mean + β_age
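These identities are pure reparameterization: since age = age.mean + age.diff, the two models span the same column space, so least squares must satisfy them exactly. A Python check on invented toy data (the `ols` helper and all numbers are illustrative, not from the deck):

```python
# Hypothetical toy data: 3 subjects x 3 occasions, (age, y) pairs
subjects = {
    "A": [(9, 10.0), (10, 9.5), (11, 9.0)],
    "B": [(10, 11.0), (11, 10.2), (12, 9.8)],
    "C": [(11, 12.5), (12, 12.0), (13, 11.0)],
}

rows, y = [], []
for ps in subjects.values():
    m = sum(a for a, _ in ps) / len(ps)          # age.mean (contextual)
    for a, yy in ps:
        rows.append((m, a))
        y.append(yy)

def ols(X, y):
    """Least squares via normal equations and Gaussian elimination."""
    n, k = len(X), len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         for r in range(k)]
    b = [sum(X[i][r] * y[i] for i in range(n)) for r in range(k)]
    for col in range(k):                          # forward elimination
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):                # back substitution
        beta[r] = (b[r] - sum(A[r][c] * beta[c]
                              for c in range(r + 1, k))) / A[r][r]
    return beta

# Model 1: intercept + age.mean + age
b0, b_mean, b_age = ols([[1.0, m, a] for m, a in rows], y)

# Model 2: intercept + age.mean + age.diff, with age.diff = age - age.mean
c0, c_mean, c_diff = ols([[1.0, m, a - m] for m, a in rows], y)

print(round(c_diff - b_age, 8))             # ~0: beta*_age.diff = beta_age
print(round(c_mean - (b_mean + b_age), 8))  # ~0: beta*_age.mean = beta_age.mean + beta_age
```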

80.–86. [Seven annotated versions of a plot of y versus age (both axes roughly 9–12), built up step by step: the data points, the within-subject lines with slope β_age = β*_age.diff, and the between-subject direction with slope β_age.mean through the subject means.]
