Models for time-to-event data



  1. Models for time-to-event data: From Cox’s proportional hazards model to deep learning
     Sebastian Pölsterl, Artificial Intelligence in Medical Imaging (AI-Med) | Ludwig Maximilian Universität Munich
     October 2nd, 2018, École Centrale de Nantes

  2. Outline
     1. What is Survival Analysis?
     2. Parametric Survival Models
     3. Semiparametric Survival Models
     4. Non-Linear Survival Models
     5. Survival Analysis with Deep Learning
     6. Conclusion

  3. Time-to-event Data in Medical Research
     Alzheimer’s disease progression [Figure; source: Jack et al. (2013)]
     • Mild cognitive impairment (MCI) is a common precursor to dementia in Alzheimer’s disease and is associated with isolated memory loss.
     • Some patients with MCI remain stable, whereas others progress to Alzheimer’s disease.
     • For an effective therapy, we want to know the probability of conversion at any time point.

  4. Time-to-event Data in Maintenance
     Remaining useful life of equipment [Figure; source: MathWorks]
     • Most equipment, such as a pump, will eventually fail.
     • Failure is usually determined by threshold values on various sensors, e.g. temperature must not exceed 74 °C and pressure must stay below 10 bar.
     • We want to know the probability of failure at any time point, so that replacement can be scheduled in advance to minimize downtime.

  5. Time-to-event Data in Economics
     Customer relationship management
     [Figure: customer lifecycle diagram with stages from first value and value growth (increase users, increase usage, expand functionality) to value decrease and churn; source: For Entrepreneurs]
     • Every business will lose some of its customers (customer churn).
     • For each customer, we have a record of purchases and previous interactions with the company.
     • We want to know how likely a customer is to turn away (churn) at any given time point, so that we can offer targeted incentives to stay.


  9. Censoring
     [Figure: follow-up timelines of patients A to E (A lost to follow-up, B and D with an event †, C dropped out), with the end of study marked; left panel: time in months, right panel: time since enrollment in months]
     • A record is uncensored if an event was observed during the study period: the exact time of the event is known.
     • A record is right-censored if a patient remained event-free: it is unknown whether an event occurred after the study ended.

  13. Types of Censoring
      Let $y_i$ denote the observable time, $t_i$ the actual time of an event, and $c_i$ the time of censoring.
      • Right censoring: $y_i = \min(c_i^{\text{right}}, t_i)$
      • Left censoring: $y_i = \max(c_i^{\text{left}}, t_i)$
      • Interval censoring: $t_i \in (\tau_i^{l}; \tau_i^{r}]$
      • Any combination of left, right, or interval censoring may occur in a study.
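
The three censoring rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names, the boolean "event observed" flag, and the example numbers are assumptions made for this sketch, and in practice the true event time t_i is of course unknown for censored records.

```python
import bisect
import math


def observe_right_censored(event_times, censor_times):
    """Apply y_i = min(c_i, t_i); the flag is True if the event itself was observed."""
    return [(min(c, t), t <= c) for t, c in zip(event_times, censor_times)]


def observe_left_censored(event_times, censor_times):
    """Apply y_i = max(c_i, t_i); the flag is True if the exact event time is known."""
    return [(max(c, t), t >= c) for t, c in zip(event_times, censor_times)]


def observe_interval_censored(event_time, inspection_times):
    """Return the interval (tau_l, tau_r] between inspections that contains t_i.

    Assumes inspection_times is sorted and starts at or before the event time
    (hypothetical setup). If the event happens after the last inspection,
    tau_r is infinity, which amounts to right censoring.
    """
    idx = bisect.bisect_left(inspection_times, event_time)
    tau_l = inspection_times[idx - 1] if idx > 0 else 0.0
    tau_r = inspection_times[idx] if idx < len(inspection_times) else math.inf
    return tau_l, tau_r


# Hypothetical true event times and censoring times, in months.
true_times = [5.0, 12.0, 3.0]
censor_times = [8.0, 8.0, 2.0]

right = observe_right_censored(true_times, censor_times)
left = observe_left_censored(true_times, censor_times)
interval = observe_interval_censored(3.0, [0.0, 2.0, 4.0, 6.0])
```

With these inputs, the second and third records are right-censored at 8 and 2 months, the first record is left-censored at 8 months, and an event at month 3 is only known to lie in the interval (2, 4].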

  16. Basic Quantities
      Let $T$ denote a continuous non-negative random variable corresponding to a patient’s survival time with probability density function $f(t)$.
      Survival function: $S(t) = P(T > t) = 1 - P(T \le t) = 1 - F(t) = \int_t^{\infty} f(u)\,du$
      Hazard function: $h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} \ge 0$
      Cumulative hazard function: $H(t) = \int_0^t h(u)\,du$

  17. Survival and Hazard Function
      [Figure: two panels plotting the survival probability S(t) and the hazard h(t) against time t]
      $h(t) = \frac{f(t)}{S(t)}; \quad H(t) = -\log S(t)$
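
The two identities on this slide can be checked numerically. The sketch below is not from the slides: it assumes a Weibull distribution (with arbitrarily chosen shape 1.5 and scale 5) purely as an example, computes the hazard as f(t)/S(t), integrates it with a simple trapezoidal rule, and compares the result with −log S(t).

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale)^shape) for a Weibull-distributed survival time."""
    return math.exp(-((t / scale) ** shape))


def weibull_density(t, shape, scale):
    """f(t) = -dS/dt for the Weibull distribution."""
    return (shape / scale) * (t / scale) ** (shape - 1) * weibull_survival(t, shape, scale)


def hazard(t, shape, scale):
    """h(t) = f(t) / S(t)."""
    return weibull_density(t, shape, scale) / weibull_survival(t, shape, scale)


def cumulative_hazard(t, shape, scale, steps=10_000):
    """H(t) = integral of h(u) from 0 to t, approximated with the trapezoidal rule."""
    width = t / steps
    total = 0.5 * (hazard(0.0, shape, scale) + hazard(t, shape, scale))
    for j in range(1, steps):
        total += hazard(j * width, shape, scale)
    return total * width


shape, scale, t = 1.5, 5.0, 2.0  # arbitrary example values
H_numeric = cumulative_hazard(t, shape, scale)
H_from_survival = -math.log(weibull_survival(t, shape, scale))
```

Both quantities should agree up to the integration error, which is what H(t) = −log S(t) predicts.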

  20. Discrete Survival Times
      Let $T$ be a discrete random variable, which can take on values $t_i$ ($i \in \mathbb{N}$) with probability mass function $P(T = t_i)$, and $t_i < t_j$ if and only if $i < j$.
      Survival function: $S(t) = \sum_{\{i \mid t_i > t\}} P(T = t_i) \Leftrightarrow P(T = t_i) = S(t_{i-1}) - S(t_i)$
      Hazard function: $h(t_i) = P(T = t_i \mid T \ge t_i)$
      Cumulative hazard function: $H(t) = \sum_{\{i \mid t_i \le t\}} h(t_i)$
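
As a small worked example of the discrete definitions, the following sketch computes S, h, and H from a probability mass function. The four time points and their probabilities are made up for illustration, and the function names are assumptions of this sketch. The final product formula, S(t_i) = ∏_{j ≤ i} (1 − h(t_j)), is a standard identity that does not appear on the slide; it is included only as a consistency check.

```python
import math

# Hypothetical discrete survival distribution: P(T = t_i) for t_i = 1, 2, 3, 4.
times = [1, 2, 3, 4]
pmf = [0.1, 0.2, 0.3, 0.4]


def survival(t):
    """S(t) = sum of P(T = t_i) over all t_i > t."""
    return sum(p for t_i, p in zip(times, pmf) if t_i > t)


def hazard(i):
    """h(t_i) = P(T = t_i | T >= t_i) = P(T = t_i) / P(T >= t_i)."""
    at_risk = sum(pmf[i:])  # P(T >= t_i), since times are sorted
    return pmf[i] / at_risk


def cumulative_hazard(t):
    """H(t) = sum of h(t_i) over all t_i <= t."""
    return sum(hazard(i) for i, t_i in enumerate(times) if t_i <= t)


hazards = [hazard(i) for i in range(len(times))]

# P(T = t_i) = S(t_{i-1}) - S(t_i), using S(t_0) = S(0) = 1.
pmf_from_survival = [survival(t - 1) - survival(t) for t in times]

# Consistency check (standard identity, not on the slide).
survival_from_hazard = [
    math.prod(1.0 - h for h in hazards[: i + 1]) for i in range(len(times))
]
```

Here the hazard at the last time point equals 1, because every subject still at risk at t = 4 must experience the event there.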


  22. Maximum Likelihood Optimization
      • Assume we have a dataset of $d$ covariates for each of $n$ observations: $\mathcal{D} = \{(y_i, x_i)\}_{i=1}^{n}$
      • We want to fit a model with parameters $\Theta$ to estimate $S(t)$, the probability of survival beyond time $t$, via maximum likelihood optimization.
      • Observed times $y_i$ can be
        1. uncensored
        2. right-censored
        3. left-censored
        4. interval-censored
      • We need to consider carefully what information each observation gives us.

  23. Noninformative Censoring
      Definition (Noninformative Censoring): Usually, we assume that the distribution of survival times $T$ is independent of the distribution of censoring times $C$, given the covariates: $T \perp C \mid x$
      This assumption would be violated if the prognosis of individuals who get censored is worse compared to those who are not censored.

  24. Constructing the Likelihood Function
      Exact time of event is known
      [Figure: time axis t with the event time $y_i$ marked]
      $\arg\max_{\Theta} P(T = y_i; \Theta \mid x_i) = f(y_i; \Theta \mid x_i)$
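
To make the likelihood construction concrete, here is a minimal sketch for the simplest parametric case: an exponential model with a single rate parameter and no covariates (both simplifications are assumptions of this sketch, not of the slides). An uncensored record contributes the density f(y_i), as on the slide above. For right-censored records the sketch uses the survival function S(y_i), which is the standard contribution under noninformative censoring but is not shown in this excerpt, so treat it as an assumption here. The data are made up.

```python
import math


def exponential_log_likelihood(rate, times, events):
    """Log-likelihood of an exponential model with f(y) = rate * exp(-rate * y).

    Uncensored records (event observed) contribute log f(y_i).
    Right-censored records contribute log S(y_i) = -rate * y_i
    (assumed standard form, not taken from the slide).
    """
    total = 0.0
    for y, observed in zip(times, events):
        if observed:
            total += math.log(rate) - rate * y
        else:
            total += -rate * y
    return total


def exponential_mle(times, events):
    """Closed-form maximizer: number of observed events / total observed time."""
    return sum(events) / sum(times)


# Hypothetical observed times (months) and event indicators (1 = event observed).
times = [2.0, 3.0, 5.0, 4.0]
events = [1, 0, 1, 0]

rate_hat = exponential_mle(times, events)
best = exponential_log_likelihood(rate_hat, times, events)
```

Evaluating the log-likelihood slightly below or above `rate_hat` gives a smaller value, which is a quick sanity check that the closed-form estimate is indeed the maximizer for this toy data.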
