Cox’s proportional hazards/regression model - model assessment
Rasmus Waagepetersen
October 19, 2020
Topics:
◮ Plots based on estimated cumulative hazards
◮ Cox-Snell residuals: overall check of fit
◮ Martingale residuals: assessment of functional form of covariate
◮ Deviance residuals: detection of outliers
◮ Score-process residual: check of proportional hazards for each covariate
◮ Detection of influential observations
Why not just proceed as for linear normal models?
Issues:
◮ censoring.
◮ for the Cox ph model we do not have a fully specified model - thus we do not know the distribution of the residuals.
Generally, residual analysis is a bit tricky not only for survival data but for non-normal data in general - residuals tend to look ‘ugly’ even if the model is correct.
Model with one factor
Suppose we have observations $(t_{ij}, \delta_{ij})$, $i = 1, \ldots, K$, and the model for the $i$th group

$$h_i(t_{ij}) = h_0(t_{ij}) \exp(\beta_i)$$

Compute a cumulative hazard estimate $\hat H_i$ for each group. Recall

$$H_i(t) = H_0(t) \exp(\beta_i) \;\Leftrightarrow\; \log H_i(t) = \log H_0(t) + \beta_i$$

Various types of plots can be considered:
1. $\log \hat H_i(t)$'s against $t$
2. $\log \hat H_i$ vs $\log \hat H_j$
3. $\hat H_i$ vs $\hat H_j$
4. $\log \hat H_i(t) - \log \hat H_1(t)$'s vs $t$.
Alternatives 2.-4. require a bit of programming since the estimates are not obtained for the same $t$s.
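The "bit of programming" needed for plot 4 can be sketched as follows. This is a minimal illustration, assuming Nelson-Aalen estimates of the group-wise cumulative hazards, treated as right-continuous step functions and evaluated on the union of the observed death times. The function names and the toy data are hypothetical and not part of the slides.

```python
import math

def nelson_aalen(times, events):
    """Return [(death time, Nelson-Aalen estimate)] at the distinct death times."""
    death_times = sorted({t for t, d in zip(times, events) if d == 1})
    H, curve = 0.0, []
    for s in death_times:
        at_risk = sum(1 for t in times if t >= s)
        deaths = sum(1 for t, d in zip(times, events) if t == s and d == 1)
        H += deaths / at_risk
        curve.append((s, H))
    return curve

def step_value(curve, t):
    """Evaluate the right-continuous step function at t (0 before the first jump)."""
    value = 0.0
    for s, H in curve:
        if s > t:
            break
        value = H
    return value

def log_hazard_difference(curve_i, curve_1, grid):
    """log H_i(t) - log H_1(t) at grid points where both estimates are positive."""
    result = []
    for t in grid:
        h_i, h_1 = step_value(curve_i, t), step_value(curve_1, t)
        if h_i > 0 and h_1 > 0:
            result.append((t, math.log(h_i) - math.log(h_1)))
    return result

# Toy data (made up for illustration): (times, death indicators) per group.
group1 = ([1, 3, 4, 6], [1, 1, 0, 1])
group2 = ([2, 3, 5, 7], [1, 0, 1, 1])
curve1 = nelson_aalen(*group1)
curve2 = nelson_aalen(*group2)

# Common grid: all death times from both groups.
grid = sorted({s for s, _ in curve1} | {s for s, _ in curve2})
diffs = log_hazard_difference(curve2, curve1, grid)
```

Under proportional hazards the values in `diffs` should fluctuate around a constant (an estimate of the difference between the two group effects) rather than show a trend in `t`.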
Stratified Cox process
Suppose we have several covariates and the first is a factor dividing subjects into $K$ groups. Then a stratified Cox model is specified by

$$h_i(t \mid z) = h_{0i}(t) \exp(z_{-1}^T \beta_{-1})$$

where $h_i(\cdot \mid z_{-1})$ is the hazard for a subject in the $i$th group with remaining covariate vector $z_{-1} = (z_2, \ldots, z_p)^T$. That is, there is a separate baseline hazard $h_{0i}$ for each group/stratum.
If proportional hazards holds for the factor used for stratification, then

$$H_{0i}(t) = H_0(t) \exp(\beta_i).$$

So we can make plots similar to those on the previous slide to assess proportional hazards for the factor considered.
If we want to assess ph for a quantitative covariate, we can initially discretize it into a factor variable.
Martingale residuals
Martingale residuals:

$$r_i^M = \delta_i - \hat H_0(t_i) \exp(z_i^T \hat\beta)$$

Very skewed with values in the interval $]-\infty, 1]$. Not useful for detecting outliers.
May be used for assessing the functional form of a covariate by computing $r_i^M$ for the model without the covariate and plotting $r_i^M$ against the omitted covariate. A curve fitted to the scatter plot may indicate a possible transformation of the covariate.
The reason for the terminology will become clearer when we later discuss counting processes and martingales.
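The martingale residual formula can be illustrated with a small sketch. It assumes a single covariate, the Breslow estimator for $\hat H_0$, and a placeholder value of `beta` that is not a fitted estimate. The function names and toy data are hypothetical.

```python
import math

def breslow_cumhaz(times, events, z, beta):
    """Breslow estimate of the baseline cumulative hazard H_0 at each subject's time t_i."""
    risk = [math.exp(beta * z_i) for z_i in z]
    death_times = sorted({t for t, d in zip(times, events) if d == 1})
    jumps = []
    for s in death_times:
        deaths = sum(1 for t, d in zip(times, events) if t == s and d == 1)
        s0 = sum(r for t, r in zip(times, risk) if t >= s)  # sum over the risk set at s
        jumps.append((s, deaths / s0))
    return [sum(j for s, j in jumps if s <= t) for t in times]

def martingale_residuals(times, events, z, beta):
    """r_i^M = delta_i - H_0(t_i) * exp(z_i * beta)."""
    H0 = breslow_cumhaz(times, events, z, beta)
    return [d - h * math.exp(beta * z_i) for d, h, z_i in zip(events, H0, z)]

# Toy data (made up); beta = 0.4 is a placeholder, not a fitted value.
times = [2, 3, 5, 7, 8, 11]
events = [1, 1, 0, 1, 0, 1]
z = [0.5, 1.2, 0.3, 2.0, 1.1, 0.7]
r_m = martingale_residuals(times, events, z, beta=0.4)
```

With the Breslow estimator the residuals sum to zero for any value of `beta`, and each residual is at most 1, in line with the range stated above. To assess functional form one would compute the residuals from a model that omits the covariate of interest and plot them against that covariate.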
Cox-Snell
Cox-Snell residuals are based on results for a continuous random variable $X$ with survivor function $S$ and cumulative hazard $H$:

$$S(X) \sim \mathrm{Unif}(]0,1[) \qquad H(X) \sim \mathrm{Exp}(1).$$

Cox-Snell residual:

$$r_i^C = \hat H_0(t_i) \exp(z_i^T \hat\beta) = \delta_i - r_i^M$$

Cox-Snell residuals should look like a censored sample of unit-rate exponential random variables, which have $H(t) = t$. This can be checked by considering the estimated cumulative hazard for the $r_i^C$.
Cox-Snell residuals may be used for checking the overall fit of the model - but see reservations in the practical notes in KM pages 358-359.
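The two distributional results can be verified in a couple of lines. The derivation below is a sketch that assumes $S$ is continuous and strictly decreasing on the support of $X$, so that $S^{-1}$ exists.

```latex
% For 0 < u < 1:
P\bigl(S(X) \le u\bigr) = P\bigl(X \ge S^{-1}(u)\bigr) = S\bigl(S^{-1}(u)\bigr) = u,
\quad\text{so } S(X) \sim \mathrm{Unif}(]0,1[).

% Since H = -\log S, for x > 0:
P\bigl(H(X) > x\bigr) = P\bigl(S(X) < e^{-x}\bigr) = e^{-x},
\quad\text{so } H(X) \sim \mathrm{Exp}(1).
```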
Deviance residuals
Deviance residuals are obtained by applying a ‘symmetrizing’ transformation to the martingale residuals:

$$r_i^D = \mathrm{sign}(r_i^M) \bigl[ -2 \bigl( r_i^M + \delta_i \log(\delta_i - r_i^M) \bigr) \bigr]^{1/2}.$$

These residuals should look (approximately) like a sample of iid normal random variables if the model is correct. However, under heavy censoring the distribution becomes bimodal.
May be useful for spotting outliers.
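The transformation can be written as a short function. This is a minimal sketch, assuming the convention that the term $\delta_i \log(\delta_i - r_i^M)$ is zero for censored observations and that $r_i^M < 1$ for observed deaths. The function name is hypothetical.

```python
import math

def deviance_residual(r_m, delta):
    """Deviance residual from a martingale residual r_m and death indicator delta (0 or 1)."""
    # For censored subjects (delta == 0) the log term is multiplied by 0 and dropped.
    # For deaths, delta - r_m is the Cox-Snell residual, assumed strictly positive.
    log_term = math.log(delta - r_m) if delta == 1 else 0.0
    inner = -2.0 * (r_m + log_term)
    # max(...) guards against tiny negative values caused by rounding.
    return math.copysign(math.sqrt(max(inner, 0.0)), r_m)

# A censored subject with r_m = -0.5 and a death with r_m = 0.5.
examples = [deviance_residual(-0.5, 0), deviance_residual(0.5, 1)]
```

The sign of the martingale residual is preserved, while large negative values are pulled towards zero, which is what makes the result more symmetric.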
Schoenfeld residuals and score process
For a time $t$ let $R_t$ denote the random index of the person who dies at $t$, given that the persons in $R(t)$ are at risk and that a death occurs at time $t$.
Recall that the score function $u(\beta)$ for Cox’s partial likelihood is a sum of terms ($p$-dimensional vectors)

$$u_i(\beta) = z_i - E[z_{R_{t_i}} \mid \mathcal{H}(t_i)] = z_i - e_i, \quad i \in D$$

where $\mathcal{H}(t_i)$ is the history up to time $t_i$ (it determines $R(t_i)$ and that a death occurs at time $t_i$).
The components of these terms are also known as Schoenfeld residuals (KM page 376).
We can define the score process (KM page 376) as

$$u(\beta, t) = \sum_{l \in D:\, t_l \le t} u_l(\beta)$$

By definition $u(\hat\beta, t) = 0$ for $t$ greater than the maximal observed death time.
KM suggest plotting the score process $u(\hat\beta, t)$ against time and comparing it with 95% boundaries of a Brownian bridge process.
Martinussen and Scheike (2006), Dynamic regression models for survival data, suggest comparing with simulations of the score process under the assumed model.
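The definitions of the Schoenfeld residuals and the score process can be sketched for a single covariate as follows. The sketch assumes no tied death times, takes $e_i$ to be the risk-weighted average of the covariate over the risk set $R(t_i)$, and uses `beta = 0` as a placeholder rather than a fitted estimate. The function names and toy data are hypothetical.

```python
import math

def schoenfeld_residuals(times, events, z, beta):
    """Return [(t_i, z_i - e_i)] for each death, ordered by time (no ties assumed)."""
    out = []
    for t_i, d_i, z_i in sorted(zip(times, events, z)):
        if d_i != 1:
            continue
        # Risk set at t_i: subjects with observed time >= t_i, weighted by exp(beta * z_k).
        weighted = [(math.exp(beta * z_k), z_k) for t_k, z_k in zip(times, z) if t_k >= t_i]
        e_i = sum(w * z_k for w, z_k in weighted) / sum(w for w, _ in weighted)
        out.append((t_i, z_i - e_i))
    return out

def score_process(residuals):
    """Cumulative sums of the Schoenfeld residuals over the ordered death times."""
    total, path = 0.0, []
    for t, r in residuals:
        total += r
        path.append((t, total))
    return path

# Toy data (made up); with beta = 0 each e_i is the plain mean of z over the risk set.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
z = [1.0, 0.0, 2.0, 3.0]
res = schoenfeld_residuals(times, events, z, beta=0.0)
path = score_process(res)
```

Evaluated at the fitted $\hat\beta$ instead of the placeholder, the last value of `path` would be zero, as noted above. The path in between is what gets compared with the Brownian bridge boundaries or with simulations.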
The score process can also be expressed as

$$u(\beta, t) = \sum_{i=1}^{n} \Bigl[ \delta_i (z_i - e_i) - \exp(z_i^T \beta) \sum_{l \in D:\, t_l \le t} \frac{z_l - e(l)}{\sum_{k \in R(t_l)} \exp(z_k^T \beta)} \Bigr]$$

(we will see later why, when considering counting processes and martingales).
The score residuals are given by the components of $u(\beta, t_i)$, $i = 1, \ldots, n$ (i.e. in total $np$ residuals). These are also available from the residuals function and can be cumulated to obtain the score process.
Assessment of time-varying effects
Suppose that we do not have proportional hazards for the $j$th covariate, in the sense that the true effect of $z_j$ is time-varying:

$$\beta_j(t) = \beta_j + \gamma_j g(t).$$

Let $r^S_{j,i}$ be the Schoenfeld residual scaled with the covariance matrix of $\hat\beta$. Then the expected value of $r^S_{j,i}$ is approximately equal to $\gamma_j g(t_i)$.
Thus a plot of scaled Schoenfeld residuals versus time may reveal deviations from proportional hazards.
Implemented in the cox.zph procedure.
This is not covered in KM. See e.g. the book by Collett.
Influential observations
Do some observations have an unusually large influence on the estimation of $\beta$?
Let $\hat\beta$ and $\hat\beta_{-i}$ denote the estimates of $\beta$ based on the full data set and on the data with the $i$th observation omitted.
We want to look for $i$ where $\hat\beta - \hat\beta_{-i}$ is an outlier.
Based on the score process residuals it is possible to compute an approximation of $\hat\beta_{-i}$ - i.e. we do not need to fit the Cox model for all datasets obtained by omitting one observation.
The resulting estimates of $\hat\beta - \hat\beta_{-i}$ are called dfbeta in the residuals function for coxph objects.
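One common form of this approximation is a single Newton-type step from $\hat\beta$. The display below is a sketch of that idea rather than a statement of exactly what the software computes. The symbols $I(\hat\beta)$, for the observed information matrix of the partial likelihood, and $r_i^{\mathrm{sc}}$, for the $p$-dimensional score residual of subject $i$, are notation introduced here.

```latex
\hat\beta - \hat\beta_{-i} \;\approx\; I(\hat\beta)^{-1}\, r_i^{\mathrm{sc}},
\qquad i = 1, \ldots, n .
```

Each component of the right-hand side can then be plotted against the observation index to look for subjects with unusually large values.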
Use of formal testing?
KM note 5 on page 380 advocates the use of graphical checks rather than formal tests.
This is because we know that any statistical model is just an approximation and thus is bound to be rejected if the sample size is large enough.
Remember the famous quote by Box: ‘all models are wrong but some are useful’.
Graphical checks may reveal whether there are any serious deviations between model and data, and possibly also hint at the cause of such deviations.