Coxs proportional hazards model and Coxs partial likelihood Rasmus - PowerPoint PPT Presentation

Cox’s proportional hazards model and Cox’s partial likelihood Rasmus Waagepetersen October 8, 2020 1 / 27

Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function. Approaches: ◮ Non-parametric using Kaplan-Meier. Advantage: no assumption regarding type of distribution. Disadvantage: requires iid observations. ◮ Parametric model. Advantage: we only need to estimate a few parameters that completely characterize distribution (e.g. exponential or Weibull) - gives low variance of estimates. Can be extended to non- iid observations using regression on covariates. Disadvantage: assumed model class may be (or always is) incorrect leading to model error or in other words, bias. Possible to combine the best of two approaches ? 2 / 27

Semi-parametric approach - Cox’s proportional hazards model Sir David Cox in a ground-breaking paper (‘Regression models and life tables’, 1972) suggested the following model for the hazard function given covariates z ∈ R p : h ( t ; z ) = h 0 ( t ) exp( z T β ) , β ∈ R p . Here h 0 ( · ) completely unspecified function except that it must be non-negative. Thus model combines great flexibility via non-parametric h 0 ( · ) with the possibility of introducing covariate effects via exponential term exp( z T β ) This model has become standard in medical statistics. 3 / 27

Some properties Cumulative hazard: � t H ( t ; z ) = exp( z T β ) h 0 ( u ) d u = exp( z T β ) H 0 ( t ) 0 Survival function S ( t ; z ) = S 0 ( t ) exp( z T β ) S 0 ( t ) = exp( − H 0 ( t )) Proportional hazards: h ( t ; z ) h ( t ; z ′ ) = exp(( z − z ′ ) T β ) i.e. constant hazard ratio for two different subjects - curves can not cross ! - this should be checked in any application. 4 / 27

Estimation - partial likelihood Model useless if we can not estimate parameter β . Problem: we can not use likelihood when h 0 ( · ) unspecified. Second break-through contribution of Cox: invention of partial likelikehood for estimating β . Suppose we have observations ( t i , δ i ) as well as (fixed) covariates z 1 , . . . , z n , i = 1 , . . . , n . We assume no ties (all t i distinct) and define D ⊆ { 1 , . . . , n } as D = { l | δ l = 1 } - i.e. the index set of death times. For any t ≥ 0 we further define the risk set R ( t ) = { l | t l ≥ t } i.e. the index set of subjects at risk at time t . 5 / 27

The partial likelihood The partial likelihood is exp( z T l β ) � L ( β ) = k ∈ R ( t l ) exp( z T � k β ) l ∈ D Cox suggested to estimate β by maximizing L ( β ). ◮ does not depend on h 0 ◮ does not depend on actual death times - only their order ◮ censored observations only appear in risk set (as for Kaplan-Meier) Cox’s idea has proven to work very well - but why ? Lots of people have tried to make sense of this partial likelihood. 6 / 27

Cox’s intuition Consider for simplicity the case of no censoring and let t (1) , . . . , t ( n ) denote the set of ordered death times. We can equivalently represent data as the set of inter-arrival times v i = t ( i ) − t ( i − 1) (taking t 0 = 0) together with the information r 1 , r 2 , . . . , r n about which subject died at each time of death - i.e. r i = l if subject l was the i th subject to die. Cox then factored likelihood of ( v 1 , . . . , v n , r 1 , . . . , r n ) as (using generic notation for densities and probabilities) f ( v 1 ) p ( r 1 | v 1 ) f ( v 2 | v 1 , r 1 ) p ( r 2 | v 1 , v 2 , r 1 ) · · · f ( v n | v 1 , . . . , v n − 1 , r 1 , . . . , r n − 1 ) p ( r n | v 1 , . . . , v n , r 1 , . . . , r n − 1 ) 7 / 27

Cox argued that terms f ( v i | . . . ) could not contribute with information regarding β since the interarrival times can be fitted arbitrary well when h 0 is unrestricted - we can essentially just choose h 0 to consist of ‘spikes’ at each death time. Thus estimation of β should be based on remaining factors n � L ( β ) = p ( r i | H i ) i =1 where H i = { v 1 , . . . , v i , r 1 , . . . , r i − 1 } history/previous observations. Here p ( r i | H i ) is the probability that subject r i is the i th person to die given the previous observations. 8 / 27

More precisely, let R i denote the random index of the i th subject that dies ( R i = l means that T l is the i th smallest death time, i.e. T R i = T ( i ) = T l ). Assume that p ( l | H i ) only depends on H i through the knowledge that the i th death happens at time t ( i ) and that R ( t ( i ) ) are the ones at risk at time t ( i ) . Thus p ( l | H i ) = P ( R i = l | T R i ∈ [ t ( i ) , t ( i ) + dt [ , R ( t ( i ) ) = A ) This is the probability that l is the i th person to die given that the i th death happens at time t ( i ) and that the persons in A are at risk at time t ( i ) (thus probability is zero if l �∈ A ) 9 / 27

We now express the conditional probability in terms of the hazard function: P ( R i = l , T R i ∈ [ t ( i ) , t ( i ) + dt [ | R ( t ( i ) ) = A ) = P ( T l ∈ [ t ( i ) , t ( i ) + dt [ , T k > T l , k ∈ A \ { l }| R ( t ( i ) ) = A ) � ‘ = ′ h 0 ( t ( i ) ) exp( z T (1 − h 0 ( t ( i ) ) exp( z T l β ) d t k β ) d t ) k ∈ A \{ l } Note ‘=’ is because we actually replace T k > T l by T k > t ( i ) + d t . This does not really matter since d t infinitesimal. NB: if R i = l then t ( i ) = t l so in the following we replace t ( i ) with t l . 10 / 27

Finally, P ( R i = l | T R i ∈ [ t l , t l + dt [ , R ( t l ) = A ) = P ( R i = l , T R i ∈ [ t l , t l + dt [ | R ( t l ) = A ) P ( T R i ∈ [ t l , t l + dt [ | R ( t l ) = A ) P ( R i = l , T R i ∈ [ t l , t l + dt [ | R ( t l = A )) = � j ∈ R ( t l ) P ( R i = j , T R i ∈ [ t l , t l + dt [ | R ( t l ) = A ) h 0 ( t l ) exp( z T k ∈ R ( t l ) \{ l } (1 − h 0 ( t l ) exp( z T l β ) d t � k β ) d t ) = j ∈ R ( t l ) h 0 ( t l ) exp( z T k ∈ R ( t l ) \{ j } (1 − h 0 ( t l ) exp( z T � j β ) d t � k β ) d t ) exp( z T l β ) = k ∈ R ( t l ) exp( z T � k β ) Note: last = follows after cancelling h 0 ( t l ) d t and noting that (1 − h 0 ( t l ) exp( z T k β ) d t ) tends to one when d t tends to zero. NB: denominator is hazard for minimum of T k , k ∈ R ( t l ) (exercise 18) 11 / 27

Conditional likelihood for matched case-control study Cox’s idea very closely related to conditional likelihood for matched case-control studies. Let X denote a binary random variable (e.g. sick/healthy) for an individual in a population. We want to study the impact of a covariate z on X . Assume that the population can be divided into homogeneous groups (strata) so that probability of being ill is given by a logistic regression exp( α i + β z ) P ( X = 1) = p i ( z ) = 1 + exp( α i + β z ) for an individual in the i th strata and with the covariate z . 12 / 27

Suppose X 1 = 1 with covariate z 1 is observed for a sick person in the i th stratum. In a matched case-control study this observation is paired with an observation X 2 = 0 with covariate z 2 for a randomly selected healthy person in the same stratum. The conditional likelihood is now based on the conditional probabilities P ( X 1 = 1 | X 1 = 1 , X 2 = 0 or X 1 = 0 , X 2 = 1) = p i ( z 1 )(1 − p i ( z 2 )) p i ( z 1 )(1 − p i ( z 2 )) + (1 − p i ( z 1 )) p i ( z 2 ) This reduces to exp( β z 1 ) exp( β z 1 ) + exp( β z 2 ) which is free of the strata specific intercept α i . Note α i is a nuisance parameter when we are just interested in β . 13 / 27

Invariance argument Again consider the case of no censoring. Kalbfleisch and Prentice noticed that if one applies a strictly monotone differentiable function g to the survival times T 1 , . . . , T n then ˜ T i = g ( T i ) again follows a proportional hazards model with a completely unspecified hazard function ˜ h 0 (exercise 17). Hence estimation problem for β the same regardless of whether we consider T i ’s or ˜ T i ’s. They thus concluded that only the ordering (ranks) of the survival times and not the magnitudes of the survival times could matter for inference on β . One can verify (exercise 23) that for the ranks R i , P ( R 1 = r 1 , . . . , R n = r n ) = P ( T r 1 < T r 2 < · · · < T r n ) is precisely Cox’s partial likelihood. 14 / 27

Profile likelihood Cox’s partial likelihood can also be derived as a profile likelihood. Consider likelihood (assuming no ties) � t i n l β )] δ i exp[ − exp( z T � [ h 0 ( t i ) d t exp( z T i β ) h 0 ( u ) d u ] . 0 i =1 Let’s try to maximize wrt h 0 . First, we need h 0 ( t l ) > 0 for l ∈ D . At the same time we should take h 0 ( u ) = 0 between death times. So we let h 0 ( t ) d t = α l in very small intervals around death times, [ t l , t l + d t [, l ∈ D , and zero elsewhere. Note likelihood does not inform about h 0 ( t ) for t larger than max i t i . 15 / 27

Then likelihood becomes �� n α l exp[ z T � exp( z T � L ( α, β ) = l β ] exp( − i β ) α l ) l ∈ D i =1 l ∈ D : t l ≤ t i �� α l exp[ z T � � exp( z T = l β ] exp( − α l i β )) l ∈ D l ∈ D i ∈ R ( t l ) Taking log and differentiating wrt α l we obtain ∂ log L ( α, β ) = 1 � exp( z T − j β ) ∂α l α l j ∈ R ( t l ) Setting equal to zero and solving wrt α l gives 1 α l ( β ) = ˆ j ∈ R ( t l ) exp( z T � j β ) 16 / 27

Coxs proportional hazards model and Coxs partial likelihood Rasmus - PowerPoint PPT Presentation

Coxs proportional hazards model and Coxs partial likelihood Rasmus Waagepetersen October 8, 2020 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function. Approaches: Non-parametric

The Challenge of Natural Hazards This PowerPoint will cover information on: Natural Hazards

Occupational Health Hazards PPT-SM-OCPHLTHHAZ 1 V.A.0.0 Occupational Health Hazards Three

Health Hazards in Construction Health Hazards Potential exposures to health hazards: Worker

Coxs proportional hazards/regression model - model assessment Rasmus Waagepetersen October

LTS Efforts in Network Mapping LTS Efforts in Network Mapping Dr B Ann Cox Dr B Ann Cox Dr. B.

CSCI341 Lecture 36, Pipelining & Hazards RECALL... RECALL... HAZARDS Data Hazards

Middle Grades Proportional Reasoning Middle Grades Proportional Reasoning Middle Grades

Lecture 17: Survival Analysis -- Cox proportional Hazards Ani Manichaikul amanicha@jhsph.edu 14

The Dantzig selector in Coxs proportional hazards model A. Antoniadis 1 , P . Fryzlewicz 2 , F

Overview Partial Constituent Fronting in German The phenomenon: Partial constituent fronting

Algorithms for Cox rings Simon Keicher ICERM May 2018 Algorithms for Cox rings S. Keicher

EE 457 Unit 6c Control Hazards 2 Control Hazards Control (branch) hazards are named such

SIO15-SS1 2020: Topic 6 Earthquake Hazards { SIO15-SS1 2020: Topic 6 Earthquake Hazards {

The Semantics of Partial Model Introduction Transformations Partial Models Transforming

Visualizing covariates in proportional hazards Model comparison with rank-hazard plots

CHAPTER 12: PRACTICAL ISSUES Feedback Controller - P, I and D Proportional - The proportional

Who needs the Cox model anyway Bendix Carstensen Steno Diabetes Center Copenhagen Gentofte,

ConnectHome Nation Webinar Connecthome Nation Webinar Cox Communications Connect2Compete

The Cox Model Heidi Seibold Statistician at LMU Munich DataCamp Survival Analysis in R Why use

(Un)Successful Adaptation NERRS Science Collaborative 1 Welcome Enjoy your lunch and dive right

Transition to Adulthood Learning Collaborative (TALC) FY20 Quarter 4 Meeting August 10, 2020

Measures of Fit for Logistic Regression Paul D. Allison, Ph.D. Statistical Horizons LLC Paper

OVERVIEW OF STATISTICAL DISCLOSURE LIMITATION Lawrence H. Cox, Associate Director National

Manchester UK Professor Jorge Ribeiro Patrick Ribeiro 1 SAS/ETS Econometrics Time Series

Coxs proportional hazards model and Coxs partial likelihood Rasmus - PowerPoint PPT Presentation

Coxs proportional hazards model and Coxs partial likelihood Rasmus Waagepetersen October 8, 2020 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function. Approaches: Non-parametric

The Challenge of Natural Hazards This PowerPoint will cover information on: Natural Hazards

Occupational Health Hazards PPT-SM-OCPHLTHHAZ 1 V.A.0.0 Occupational Health Hazards Three

Health Hazards in Construction Health Hazards Potential exposures to health hazards: Worker

Coxs proportional hazards/regression model - model assessment Rasmus Waagepetersen October

LTS Efforts in Network Mapping LTS Efforts in Network Mapping Dr B Ann Cox Dr B Ann Cox Dr. B.

CSCI341 Lecture 36, Pipelining &amp; Hazards RECALL... RECALL... HAZARDS Data Hazards

Middle Grades Proportional Reasoning Middle Grades Proportional Reasoning Middle Grades

Lecture 17: Survival Analysis -- Cox proportional Hazards Ani Manichaikul amanicha@jhsph.edu 14

The Dantzig selector in Coxs proportional hazards model A. Antoniadis 1 , P . Fryzlewicz 2 , F

Overview Partial Constituent Fronting in German The phenomenon: Partial constituent fronting

Algorithms for Cox rings Simon Keicher ICERM May 2018 Algorithms for Cox rings S. Keicher

EE 457 Unit 6c Control Hazards 2 Control Hazards Control (branch) hazards are named such

SIO15-SS1 2020: Topic 6 Earthquake Hazards { SIO15-SS1 2020: Topic 6 Earthquake Hazards {

The Semantics of Partial Model Introduction Transformations Partial Models Transforming

Visualizing covariates in proportional hazards Model comparison with rank-hazard plots

CHAPTER 12: PRACTICAL ISSUES Feedback Controller - P, I and D Proportional - The proportional

Who needs the Cox model anyway Bendix Carstensen Steno Diabetes Center Copenhagen Gentofte,

ConnectHome Nation Webinar Connecthome Nation Webinar Cox Communications Connect2Compete

The Cox Model Heidi Seibold Statistician at LMU Munich DataCamp Survival Analysis in R Why use

(Un)Successful Adaptation NERRS Science Collaborative 1 Welcome Enjoy your lunch and dive right

Transition to Adulthood Learning Collaborative (TALC) FY20 Quarter 4 Meeting August 10, 2020

Measures of Fit for Logistic Regression Paul D. Allison, Ph.D. Statistical Horizons LLC Paper

OVERVIEW OF STATISTICAL DISCLOSURE LIMITATION Lawrence H. Cox, Associate Director National

Manchester UK Professor Jorge Ribeiro Patrick Ribeiro 1 SAS/ETS Econometrics Time Series

CSCI341 Lecture 36, Pipelining & Hazards RECALL... RECALL... HAZARDS Data Hazards