Analysis of multistate data with realistic rate models and multiple time scales: A dogmatic approach



  1. Analysis of multistate data with realistic rate models and multiple time scales: A dogmatic approach
     Bendix Carstensen, Steno Diabetes Center, Gentofte, Denmark
     http://BendixCarstensen.com
     IARC, Lyon, France, 11 April 2018
     1/51

     The dogma [1]
     ◮ do not condition on the future: indisputable
     ◮ do not count people after they are dead: disputable
     ◮ stick to this world: expandable
     2/51

     do not condition on the future
     ◮ commonly seen in connection with "immortal time bias"
     ◮ allocation of follow-up (risk time) to a covariate value only assumed in the future
     ◮ all follow-up among persons ever on insulin allocated to the insulin group, including the time prior to insulin use (when not on insulin)
     ◮ events always with the correct covariate values
     ◮ ⇒ too much PY in insulin group; rates too small
     ◮ ⇒ too little PY in non-insulin group; rates too large
     ◮ ⇒ insulin vs. non-insulin rate ratio underestimated
     3/51

     do not count people after they are dead
     ◮ Reference to Fine & Gray's paper on models for the subdistribution hazard [2]
     ◮ Recall: hazard and cumulative risk for all-cause death:
       $$F(t) = 1 - \exp\bigl(-\Lambda(t)\bigr) \quad\Leftrightarrow\quad \lambda(t) = \Lambda'(t) = \Bigl(-\log\bigl(1 - F(t)\bigr)\Bigr)'$$
     ◮ Subdistribution hazard: with more causes of death (competing risks), for the cumulative risk of cause $c$, $F_c(t)$:
       $$\tilde\lambda_c(t) = \Bigl(-\log\bigl(1 - F_c(t)\bigr)\Bigr)'$$
     ◮ Note: $F_c$ depends on all cause-specific hazards
     4/51

     do not count people after they are dead
     ◮ The estimation of the subdistribution hazard boils down to:
       $$\tilde h(t) = P\{X(t + dt) = j \mid X(t) \neq j\} / dt$$
       that is, the instantaneous rate of failure per time unit from cause $j$ among those who are either alive or have died from causes other than $j$ at time $t$
     ◮ . . . sounds crazy, but. . .
     ◮ when modeling the cumulative risk you must refer back to the size of the original population, which includes those dead from other causes.
     ◮ The debate is rather whether the subdistribution hazard is a useful scale for modeling and reporting from competing risk settings
     5/51

     stick to this world
     ◮ the "net" survival or "cause specific survival":
       $$S_c(t) = \exp\Bigl(-\int_0^t \lambda_c(s)\, ds\Bigr)$$
     ◮ not a proper probability
     ◮ the probability of survival if
       ◮ all other causes of death than $c$ were absent
       ◮ the $c$-specific mortality rate were still the same
     ◮ so it is just a transformation of the cause-specific rate with no real world interpretation
     ◮ . . . do not label quantities "survival" or "probability" when they are not (of this world)
     6/51

     (further) dogma for "sticking to this world"
     ◮ rates are continuous in time (and "smooth")
     ◮ rates may depend on more than one time scale
     ◮ which is an empirical question
     7/51

     A look at the Cox model
     $$\lambda(t, x) = \lambda_0(t) \times \exp(x'\beta)$$
     A model for the rate as a function of $t$ and $x$.
     Covariates:
     ◮ $x$
     ◮ $t$
     ◮ . . . often the effect of $t$ is ignored (forgotten?)
     ◮ i.e. left unreported
     8/51

     The Cox-likelihood as profile likelihood
     ◮ One parameter per death time to describe the effect of time (i.e. the chosen timescale):
       $$\log \lambda(t, x_i) = \underbrace{\log \lambda_0(t)}_{\alpha_t} + \underbrace{\beta_1 x_{1i} + \cdots + \beta_p x_{pi}}_{\eta_i} = \alpha_t + \eta_i$$
     ◮ Profile likelihood:
       ◮ Derive estimates of $\alpha_t$ as a function of data and $\beta$s, assuming constant rate between death/censoring times
       ◮ Insert in likelihood, now only a function of data and $\beta$s
       ◮ This turns out to be Cox's partial likelihood
     ◮ Cumulative intensity ($\Lambda_0(t)$) obtained via the Breslow-estimator
     9/51

     Mayo Clinic lung cancer data: 60 year old woman
     [Figure: survival (0.0 to 1.0) against days since diagnosis (0 to 800).]
     10/51
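The profile-likelihood argument on the Cox-likelihood slide can be written out roughly as follows. This is a sketch, not the author's own derivation: the symbols $d_{it}$ (event indicator), $y_{it}$ (risk time), $R_t$ (risk set) and $D_t$ (number of deaths) are notation assumed here, and the risk time is taken to be the same, $y_t$, for everyone at risk in the interval ending at death time $t$, which holds when follow-up is split at every death and censoring time.

```latex
\begin{align*}
\ell(\alpha,\beta)
  &= \sum_t \sum_{i \in R_t}
     \bigl[ d_{it}(\alpha_t + \eta_i) - y_{it}\, e^{\alpha_t + \eta_i} \bigr]
  && \text{Poisson log-likelihood, } \eta_i = x_i'\beta \\
\frac{\partial \ell}{\partial \alpha_t} = 0
  &\;\Rightarrow\;
  e^{\hat\alpha_t(\beta)} = \frac{D_t}{\sum_{i \in R_t} y_{it}\, e^{\eta_i}},
  \qquad D_t = \sum_{i \in R_t} d_{it} \\
\ell\bigl(\hat\alpha(\beta),\beta\bigr)
  &= \sum_t \Bigl[ \sum_{i \in R_t} d_{it}\,\eta_i
     - D_t \log \sum_{i \in R_t} e^{\eta_i} \Bigr] + \text{const}
  && \text{using } y_{it} = y_t \text{ for all } i \in R_t \\
\hat\Lambda_0(t)
  &= \sum_{s \le t} y_s\, e^{\hat\alpha_s}
   = \sum_{s \le t} \frac{D_s}{\sum_{i \in R_s} e^{\hat\eta_i}}
\end{align*}
```

The third line is the Breslow form of the partial log-likelihood, and the last line is the Breslow estimator of the cumulative baseline rate; the terms collected in the constant ($D_t \log D_t - D_t \log y_t - D_t$) do not involve $\beta$.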

  2. Splitting the dataset a priori
     ◮ The Poisson approach needs a dataset of empirical rates (d, y) with suitably small values of y.
     ◮ each individual contributes many empirical rates
     ◮ (one per risk-set contribution in Cox-modelling)
     ◮ From each empirical rate we get:
       ◮ Poisson-response d
       ◮ Risk time y → log(y) as offset
       ◮ time scale covariates: current age, current date, . . .
       ◮ other covariates
     ◮ Contributions not independent, but likelihood is a product
     ◮ Same likelihood as for independent Poisson variates
     ◮ Poisson glm with spline/factor effect of time
     11/51

     Example: Mayo Clinic lung cancer
     ◮ Survival after lung cancer
     ◮ Covariates:
       ◮ Age at diagnosis
       ◮ Sex
       ◮ Time since diagnosis
     ◮ Cox model
     ◮ Split data:
       ◮ Poisson model, time as factor
       ◮ Poisson model, time as spline
     12/51

     Mayo Clinic lung cancer: 60 year old woman
     [Figure: survival (0.0 to 1.0) against days since diagnosis (0 to 800).]
     13/51

     Example: Mayo Clinic lung cancer I

         > library( survival )
         > library( Epi )
         > Lung <- Lexis( exit = list( tfe=time ),
         +   exit.status = factor(status,labels=c("Alive","Dead")),
         +   data = lung )
         NOTE: entry.status has been set to "Alive" for all.
         NOTE: entry is assumed to be 0 on the tfe timescale.
         > summary( Lung )
         Transitions:
              To
         From    Alive Dead  Records:  Events: Risk time:  Persons:
           Alive    63  165       228      165      69593       228

     14/51

     Example: Mayo Clinic lung cancer II

         > system.time(
         + mL.cox <- coxph( Surv( tfe, tfe+lex.dur, lex.Xst=="Dead" ) ~
         +                  age + factor( sex ),
         +                  method="breslow", data=Lung ) )
            user  system elapsed
           0.010   0.001   0.009
         > Lung.s <- splitLexis( Lung,
         +                       breaks=c(0,sort(unique(Lung$time))),
         +                       time.scale="tfe" )
         > summary( Lung.s )
         Transitions:
              To
         From    Alive Dead  Records:  Events: Risk time:  Persons:
           Alive 19857  165     20022      165      69593       228
         > subset( Lung.s, lex.id==96 )[,1:11] ; nlevels( factor( Lung.s$tfe ) )

     15/51

     Example: Mayo Clinic lung cancer III

              lex.id tfe lex.dur lex.Cst lex.Xst inst time status age sex ph.ecog
         9235     96   0       5   Alive   Alive   12   30      2  72   1       2
         9236     96   5       6   Alive   Alive   12   30      2  72   1       2
         9237     96  11       1   Alive   Alive   12   30      2  72   1       2
         9238     96  12       1   Alive   Alive   12   30      2  72   1       2
         9239     96  13       2   Alive   Alive   12   30      2  72   1       2
         9240     96  15      11   Alive   Alive   12   30      2  72   1       2
         9241     96  26       4   Alive    Dead   12   30      2  72   1       2
         [1] 186
         > system.time(
         + mLs.pois.fc <- glm( lex.Xst=="Dead" ~ - 1 + factor( tfe ) +
         +                     age + factor( sex ),
         +                     offset = log(lex.dur),
         +                     family=poisson, data=Lung.s, eps=10^-8, maxit=25 )
         + )
            user  system elapsed
          13.550  17.334   8.761

     16/51

     Example: Mayo Clinic lung cancer IV

         > length( coef(mLs.pois.fc) )
         [1] 188
         > t.kn <- c(0,25,100,500,1000)
         > dim( Ns(Lung.s$tfe,knots=t.kn) )
         [1] 20022     4
         > system.time(
         + mLs.pois.sp <- glm( lex.Xst=="Dead" ~ Ns( tfe, knots=t.kn ) +
         +                     age + factor( sex ),
         +                     offset = log(lex.dur),
         +                     family=poisson, data=Lung.s, eps=10^-8, maxit=25 )
         + )
            user  system elapsed
           0.418   0.510   0.317

     17/51

     Example: Mayo Clinic lung cancer V

         > ests <-
         + rbind( ci.exp(mL.cox),
         +        ci.exp(mLs.pois.fc,subset=c("age","sex")),
         +        ci.exp(mLs.pois.sp,subset=c("age","sex")) )
         > cmp <- cbind( ests[c(1,3,5) ,],
         +               ests[c(1,3,5)+1,] )
         > rownames( cmp ) <- c("Cox","Poisson-factor","Poisson-spline")
         > colnames( cmp )[c(1,4)] <- c("age","sex")
         > round( cmp, 7 )
                             age      2.5%    97.5%       sex      2.5%     97.5%
         Cox            1.017158 0.9989388 1.035710 0.5989574 0.4313720 0.8316487
         Poisson-factor 1.017158 0.9989388 1.035710 0.5989574 0.4313720 0.8316487
         Poisson-spline 1.016189 0.9980329 1.034676 0.5998287 0.4319932 0.8328707

     18/51

     [Figure, two panels: survival (0.0 to 1.0) and mortality rate per year (log scale, 0.1 to 10.0) against days since diagnosis (0 to 800).]
     19/51

     [Figure, two panels: survival (0.0 to 1.0) and mortality rate per year (linear scale, 0 to 10) against days since diagnosis (0 to 800).]
     19/51
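The figures on the slides show rate and survival curves for a 60 year old woman, but the code that produces them is not part of this extract. The following is one possible sketch of how such curves could be derived from the spline Poisson model, not the author's code. Assumptions made here: the model name `m.sp`, the 10-day prediction grid `nd`, the offset being moved into the formula (so that `predict()` reads it from `newdata`), and a midpoint-sum approximation of the cumulative rate.

```r
library(survival)  # provides the lung data
library(Epi)       # Lexis(), splitLexis(), Ns()

# Same set-up as in the slides: one record per person, then follow-up
# split at every observed exit time.
Lung <- Lexis(exit = list(tfe = time),
              exit.status = factor(status, labels = c("Alive", "Dead")),
              data = lung)
Lung.s <- splitLexis(Lung,
                     breaks = c(0, sort(unique(Lung$time))),
                     time.scale = "tfe")

# Spline model as in the slides, but with the offset inside the formula
# (an assumption of this sketch) so predict() takes it from newdata.
t.kn <- c(0, 25, 100, 500, 1000)
m.sp <- glm((lex.Xst == "Dead") ~ Ns(tfe, knots = t.kn) + age + factor(sex) +
              offset(log(lex.dur)),
            family = poisson, data = Lung.s)

# Hypothetical prediction grid: midpoints of 10-day intervals, age 60,
# sex = 2 (female in the lung data), lex.dur = 1 so rates are per day.
step <- 10
nd <- data.frame(tfe = seq(step / 2, 900, by = step),
                 age = 60, sex = 2, lex.dur = 1)

# Predicted mortality rate per day with a Wald interval on the log scale.
pr   <- predict(m.sp, newdata = nd, type = "link", se.fit = TRUE)
rate <- exp(cbind(est = pr$fit,
                  lo  = pr$fit - 1.96 * pr$se.fit,
                  hi  = pr$fit + 1.96 * pr$se.fit))

# Survival = exp(-cumulative rate); the integral is approximated by a
# midpoint sum over the 10-day intervals.
surv <- exp(-cumsum(rate[, "est"] * step))

# Rate per year and survival at the end of each interval.
head(cbind(day = nd$tfe + step / 2,
           rate.per.year = rate[, "est"] * 365.25,
           surv = surv))
```

Because the rate is modelled as a smooth function of time, both the rate and the survival curve are available at any time point, which the factor-of-time (Cox-equivalent) model does not offer directly.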
