Checking model assumptions with regression diagnostics
Graeme L. Hickey University of Liverpool
@graemeleehickey www.glhickey.com graeme.hickey@liverpool.ac.uk
Conflicts of interest
None
Assistant Editor (Statistical …
(raise your hand if the answer is Yes)
* My views do not reflect those of the EJCTS, ICVTS, or of other statistical reviewers
$\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_p$
[Figure: panels A and C plot Y against X; panels B and D plot the residuals against X]
$Y = \beta_0 + \beta_1 X + \varepsilon$
$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$
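The effect of leaving out the squared term can be sketched numerically. This is a minimal illustration, not the analysis behind the slides: the simulated data, the seed, and the coefficients are all assumptions made up for the example. It fits the straight-line and the quadratic model by least squares with NumPy and measures how much curvature is left in each set of residuals.

```python
import numpy as np

rng = np.random.default_rng(1)            # arbitrary seed (assumption)
x = np.linspace(0, 20, 50)
# Hypothetical truth: a quadratic trend plus normal noise
y = 40 + 0.5 * x + 0.1 * x**2 + rng.normal(0, 1, x.size)

def fit_residuals(x, y, degree):
    """Least-squares polynomial fit of the given degree; returns y - fitted."""
    coefs = np.polyfit(x, y, degree)
    return y - np.polyval(coefs, x)

res_linear = fit_residuals(x, y, 1)       # Y = b0 + b1 X
res_quadratic = fit_residuals(x, y, 2)    # Y = b0 + b1 X + b2 X^2

# Leftover curvature: correlation of the residuals with the centred squared term
x_sq = (x - x.mean()) ** 2
curv_linear = abs(np.corrcoef(x_sq, res_linear)[0, 1])
curv_quadratic = abs(np.corrcoef(x_sq, res_quadratic)[0, 1])
print(f"linear fit: {curv_linear:.2f}, quadratic fit: {curv_quadratic:.2f}")
```

A large value for the straight-line fit corresponds to the U-shaped pattern seen in a residual plot; for the quadratic fit the value is zero up to rounding, because least-squares residuals are uncorrelated with every term in the model.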
[Figure: residuals against fitted values. Panel A: homoscedastic residuals. Panel B: heteroscedastic residuals]
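A residual plot is the primary check here; a formal test can back it up. The sketch below uses Koenker's studentised version of the Breusch-Pagan test, which is a technique swapped in for illustration and is not named on the slides. The simulated data, seed, and the function name `breusch_pagan_p` are assumptions for the example, and the test is written for a single predictor only.

```python
import math
import numpy as np

def breusch_pagan_p(x, y):
    """Koenker's studentised Breusch-Pagan test with one predictor.

    Regress y on x, regress the squared residuals on x, and compare
    LM = n * R^2 with a chi-squared distribution on 1 degree of freedom.
    """
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ beta) ** 2
    gamma, *_ = np.linalg.lstsq(X, e2, rcond=None)
    r2 = 1.0 - np.sum((e2 - X @ gamma) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    lm = max(len(y) * r2, 0.0)
    # Upper tail of chi-squared(1): P(Z^2 > lm) = erfc(sqrt(lm / 2))
    return math.erfc(math.sqrt(lm / 2.0))

rng = np.random.default_rng(42)                  # arbitrary seed (assumption)
x = rng.uniform(1, 20, 200)
y_homo = 2 + x + rng.normal(0, 3, 200)           # constant spread
y_hetero = 2 + x + rng.normal(0, 0.3 * x)        # spread grows with x

p_homo = breusch_pagan_p(x, y_homo)
p_hetero = breusch_pagan_p(x, y_hetero)
print(f"homoscedastic p = {p_homo:.3f}, heteroscedastic p = {p_hetero:.2g}")
```

A small p-value suggests the residual variance changes with the predictor; a large one only means the test found no evidence of it.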
[Figure: normal Q-Q plots (sample quantiles against theoretical quantiles) and histograms (frequency against residuals), shown for normal residuals and for skewed residuals]
independent
[Figure: panels A and B plot the residuals against X]
$\mathrm{VIF}_j = \dfrac{1}{1 - R_j^2}$
Rule of thumb: VIF > 10 indicates multicollinearity
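The variance inflation factor can be computed directly from its usual definition: regress each predictor on all the others and take 1 / (1 − R²). The sketch below does this with NumPy. The function name `vif` and the simulated predictors are assumptions for the example, with one predictor deliberately built as a near-copy of another.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of the predictor matrix X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        # Regress predictor j on an intercept plus all the other predictors
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

rng = np.random.default_rng(0)                   # arbitrary seed (assumption)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)        # nearly a copy of x1
x3 = rng.normal(size=200)                        # unrelated predictor
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs.round(1))
```

Under the rule of thumb, the first two predictors would be flagged (VIF well above 10) while the third would not.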
[Figure: four panels (Dataset 1 to Dataset 4) plotting Measurement 2 (y) against Measurement 1 (x); every panel has the same fitted line, y = 3.00 + 0.500x]
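Four datasets sharing the fitted line y = 3.00 + 0.500x match the well-known Anscombe's quartet. Assuming that is the data behind the panels (the slides do not name it), the identical fits can be reproduced with a hand-written least-squares line in plain Python. The helper name `least_squares_line` is made up for this sketch.

```python
# Anscombe's quartet (assumed to be the data shown in the four panels)
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "Dataset 1": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                         7.24, 4.26, 10.84, 4.82, 5.68]),
    "Dataset 2": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                         6.13, 3.10, 9.13, 7.26, 4.74]),
    "Dataset 3": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                         6.08, 5.39, 8.15, 6.42, 5.73]),
    "Dataset 4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
                  [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
                   5.25, 12.50, 5.56, 7.91, 6.89]),
}

def least_squares_line(x, y):
    """Return (intercept, slope) of the simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

fits = {name: least_squares_line(x, y) for name, (x, y) in quartet.items()}
for name, (intercept, slope) in fits.items():
    print(f"{name}: y = {intercept:.2f} + {slope:.3f}x")
```

The coefficients agree to the precision shown even though the scatter plots look nothing alike, which is why the fitted equation alone cannot replace looking at the data and the residuals.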
Outlier High leverage point
$\hat{\beta}_0^{(-i)}, \hat{\beta}_1^{(-i)}, \hat{\beta}_2^{(-i)}, \ldots, \hat{\beta}_p^{(-i)}$
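The coefficient estimates with observation i removed can be obtained by brute force: refit the model once per observation and record how far each coefficient moves. The sketch below reports the raw, unstandardised changes. The toy data, with one high-leverage point pulled off the line, and the function names are assumptions for illustration.

```python
import numpy as np

def ols(X, y):
    """Ordinary least-squares coefficients for design matrix X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def leave_one_out_changes(X, y):
    """Row i holds (full-data estimate) - (estimate with observation i deleted)."""
    full = ols(X, y)
    n = len(y)
    changes = np.empty((n, X.shape[1]))
    for i in range(n):
        keep = np.arange(n) != i
        changes[i] = full - ols(X[keep], y[keep])
    return changes

# Hypothetical data: nine points exactly on y = 3 + 0.5x, plus one
# high-leverage point (x = 20) moved 8 units above the line
x = np.array([1.0, 2, 3, 4, 5, 6, 7, 8, 9, 20])
y = 3.0 + 0.5 * x
y[-1] += 8.0
X = np.column_stack([np.ones_like(x), x])

changes = leave_one_out_changes(X, y)
most_influential = int(np.argmax(np.abs(changes[:, 1])))
print(most_influential, changes[most_influential].round(3))
```

Here the last observation dominates: deleting it moves the slope far more than deleting any other point. In practice these changes are usually scaled by a standard error before being compared with a cut-off.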
GLMs (incl. logistic regression)
Cox regression
Patient    T0   T1   T2   T0 – T1   T0 – T2   T1 – T2
1          30   27   20      3        10         7
2          35   30   28      5         7         2
3          25   30   20     −5         5        10
4          15   15   12      0         3         3
5           9   12    7     −3         2         5
Variance                   17.0      10.3      10.3
$\sigma^2_{T0-T1} = \sigma^2_{T0-T2} = \sigma^2_{T1-T2}$
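The variances in the table can be checked with a few lines of standard-library Python. This is only a sketch of the arithmetic: it takes the five patients' measurements, forms every pairwise difference between time points, and computes the sample variance of each.

```python
from itertools import combinations
from statistics import variance

# Measurements from the table: one list per time point, one entry per patient
scores = {
    "T0": [30, 35, 25, 15, 9],
    "T1": [27, 30, 30, 15, 12],
    "T2": [20, 28, 20, 12, 7],
}

diff_variances = {}
for a, b in combinations(scores, 2):
    diffs = [float(u - v) for u, v in zip(scores[a], scores[b])]
    diff_variances[f"{a}-{b}"] = round(variance(diffs), 1)   # sample variance (n - 1)

print(diff_variances)
# {'T0-T1': 17.0, 'T0-T2': 10.3, 'T1-T2': 10.3}
```

Sphericity asks whether these variances are plausibly equal in the population, so the comparison of interest is 17.0 against the two values of 10.3.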
* Grambsch & Therneau. Biometrika. 1994; 81: 515-26.
[Figure: Schoenfeld residual plots of Beta(t) against time for age, sex, and wt.loss]
Schoenfeld individual test, age: p = 0.5385
Schoenfeld individual test, sex: p = 0.1253
Schoenfeld individual test, wt.loss: p = 0.8769
Global Schoenfeld test: p = 0.416
North Central Cancer Treatment Group lung cancer data set*
not see any association between the residuals and time
covariate
proportionality
* Loprinzi CL et al. Journal of Clinical Oncology. 1994; 12(3): 601-7.