
Lecture 16: Survival Analysis I – Kaplan-Meier and Log-rank Test
Ani Manichaikul
amanicha@jhsph.edu
11 May 2007

Survival Analysis
- Statistical methods for the study of time to an event
- Accounts for:
  - Times at which events occur
  - Different follow-up times

Survival Analysis
- Survival analysis methods let us use information about both how often events occur and how long they take to occur
- Subjects are followed until they have an “event,” or the study ends

Endpoint
- The endpoint doesn’t have to be ‘death’; it can be any well-defined event:
  - Death
  - Disease onset
  - Menopause
  - Pregnancy
  - Relapse

Time Scale
- When do you start the clock?
  - Time from diagnosis of disease to death
  - Time from HIV infection to AIDS
  - Time from birth (chronological age)
  - Time from randomization in a clinical trial

Why Is Survival Analysis Tricky?
- We need a method that can incorporate the information carried by censored observations into the analysis

The Survival Curve
- S(t) is an estimate of the proportion of individuals still alive (who have not had the event) at time t
[Figure: survival curve, S(t) plotted against time]

The Survival Curve
- The survival curve is an important and complete summary:

$$S(t) = \frac{\#\,\text{alive at follow-up time } t}{\#\,\text{alive at time } 0}$$

- Time 0: “start of clock”
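When every subject is followed until the event (no censoring), the ratio of subjects alive at time t to subjects alive at time 0 can be computed directly. The following is a minimal sketch under that assumption; the function name and the event times are made up for illustration and are not data from the lecture.

```python
def survival_proportion(event_times, t):
    """Fraction of subjects still event-free beyond time t.

    Assumes complete follow-up: every subject has an observed event time.
    """
    alive_at_start = len(event_times)
    alive_at_t = sum(1 for x in event_times if x > t)
    return alive_at_t / alive_at_start


# Hypothetical event times in weeks (illustration only).
times = [3, 5, 8, 8, 12, 20]
for t in (0, 8, 12, 20):
    print(t, survival_proportion(times, t))
```

With censored observations this simple ratio no longer applies, because the numerator is unknown for subjects whose follow-up stopped before t.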

Survival Curve Facts:
- The curve starts at 1 and decreases
- Estimating these curves and comparing them among groups constitutes a “survival analysis”
- Need to decide which summary is important:
  - Mean survival time
  - Median survival time
  - Height at a specific time: one- and two-year survival rates
  - Difference of curves: $S_1(12) - S_2(12)$

Estimating Median Survival
[Figure: survival curve, S(t) against time; the median m is the time at which the curve reaches .50]
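Reading the median off a step-shaped survival curve amounts to finding the first time at which S(t) is at or below 0.50. The sketch below uses that convention, which is one common choice rather than the only one; the function name and the curve values are hypothetical.

```python
def median_survival(curve):
    """Return the first time at which the step curve is at or below 0.5.

    `curve` is a list of (time, survival) pairs sorted by time, one pair per
    step. Returns None when the curve never drops to 0.5, in which case the
    median cannot be estimated from the data.
    """
    for time, surv in curve:
        if surv <= 0.5:
            return time
    return None


# Hypothetical step curve (illustration only).
curve = [(5, 0.9), (8, 0.7), (12, 0.5), (23, 0.3)]
print(median_survival(curve))
```

Returning `None` covers the case where more than half of the subjects are still event-free at the end of follow-up.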

Caveat: Medians Do Not Describe the Whole Curve
[Figure: survival curves S(t) against time, with median m marked where S(t) = .50]

Survival Function
- The survival function, denoted S(t), is a better way to represent the probability distribution of the survival time T when some of the observed times are censored
  - For a censored subject we only know that T > t, rather than T = t
- S(t) = Pr(T > t) = Pr(No event by time t)
- S(t) is the probability of surviving beyond t

- Uncensored data: The event has occurred
- Censored data: The event has yet to occur
  - Event-free at the current follow-up time
  - A competing event that is not an endpoint stops follow-up
    - Death (if not part of the endpoint)
    - Clinical event that requires treatment, etc.

- Important issue: If no events are reported in the interval from last follow-up to “now”, we need to choose between:
  - No news is good news?
  - No news is no news

- Ignore the incomplete cases; drop them
  - Produces bias in the estimated curve
  - Unbalanced censoring produces biased comparisons
- Impute an event time
  - Depends on a model
- Use the available information on each participant

$$\text{Event rate} = \frac{\#\,\text{events}}{\text{total observation time}}$$

- Example: 5 events in 600 person-months
  - 5/600 = 1/120 events per person-month
  - = 0.1 events per person-year = 10 events per 100 person-years
- Gives an average event rate over the follow-up period
- For a finer time resolution, do the above for small intervals
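The unit conversions in the example can be checked with a few lines of arithmetic. This is only a sketch; the function and variable names are illustrative.

```python
def event_rate(events, person_time):
    """Average event rate: number of events per unit of observation time."""
    return events / person_time


per_person_month = event_rate(5, 600)          # 5 events in 600 person-months
per_person_year = per_person_month * 12        # 12 months per year
per_100_person_years = per_person_year * 100

print(per_person_month, per_person_year, per_100_person_years)
```

Applying the same function to the events and person-time within short intervals gives the finer time resolution mentioned in the last bullet.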

Quantities of Interest
- The survivor function S(t)
  - S(t) = P(T > t) = P(No event by time t)
- Hazard function λ(t)
  - λ(t) “=” P(T = t) / P(T > t) = risk of event occurring at time t
  - The above form is true for discrete time, but involves more complicated calculus-based notation for continuous time.

Quantities of Interest
- Often, we are interested in comparing the hazard between groups, for example, the relative hazard of relapse comparing those on chemo to those not on chemo
  - Relative Risk
  - Hazard Ratio
  - Risk Ratio

Estimation
- Kaplan-Meier survivor function estimator
- Cox proportional hazards model (PHM) for hazard ratio
- We’ll start with Kaplan-Meier (K-M)

Central Problem
- Estimation of the survival curve
- S(t) = proportion surviving beyond time t

The Survival Curve
- S(0) always equals 1: all subjects are alive at the beginning of the study
[Figure: S(t) against time, starting at 1.0 at time 0]

The Survival Curve
- The curve can only remain at the same value or decrease as time progresses
[Figure: S(t) against time, starting at 1.0 at time 0]

The Survival Curve
- If not all subjects experience the event by the end of the study window, the curve may never reach zero
[Figure: S(t) against time, starting at 1.0 at time 0]

Example
- Consider a clinical trial in patients with acute myelogenous leukemia (AML) comparing two groups of patients: no maintenance treatment with chemotherapy (X = 0) versus maintenance chemotherapy treatment (X = 1)

Example: Data
[Table: relapse times by treatment group; not preserved]

Why Survival Methods?
- We are interested in estimating the relationship between chemotherapy and the time to AML relapse in weeks.
- We need some tools because:
  - Data are censored, so linear regression is not appropriate
  - We are interested in time to relapse, not just relapse (yes/no), so logistic regression is not appropriate

Kaplan-Meier Estimate
- Curve can be estimated at each event, but not at censoring times
- S(t) = proportion of individuals surviving beyond time t

Kaplan-Meier Estimate
- Curve can be estimated at each event, but not at censoring times

$$S(t) = \left[\frac{n(t) - y(t)}{n(t)}\right] \times S(\text{previous event time})$$

- y(t) = # events at time t
- n(t) = # subjects at risk for event at time t

Kaplan-Meier Estimate
- Curve can be estimated at each event, but not at censoring times

$$S(t) = \left[\frac{n(t) - y(t)}{n(t)}\right] \times S(\text{previous event time})$$

- S(previous event time): proportion of original sample making it to time t

Kaplan-Meier Estimate
- Curve can be estimated at each event, but not at censoring times

$$S(t) = \left[\frac{n(t) - y(t)}{n(t)}\right] \times S(\text{previous event time})$$

- Bracketed term: proportion of those surviving to time t who survive beyond time t

Kaplan-Meier Estimate
- Start estimate at first event time
- No Chemotherapy group: Time = 5

$$S(5) = \frac{n(5) - y(5)}{n(5)} = \frac{12 - 2}{12} = \frac{10}{12} = .833$$

Kaplan-Meier Estimate
- No Chemotherapy group: Time = 8
- 2nd event time

$$S(8) = S(5) \times \left[\frac{n(8) - y(8)}{n(8)}\right] = (.833) \times \left[\frac{10 - 2}{10}\right] = .833 \times \frac{8}{10} = .666$$

Kaplan-Meier Estimate
- Skip over censoring times: remove censored subjects from the number at risk for the next event time
- Continue through the final event time

Alternative Notation

$$\hat{S}(t) = \prod_{i:\, t_i \le t} \left[\frac{n_i - y_i}{n_i}\right]$$

$$\hat{S}(0) = 1 \quad \text{(by convention)}$$
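The product over event times translates into a short loop: at each distinct event time, count the subjects still at risk and the events, then multiply the running estimate by (n - y)/n. The sketch below is one plain-Python way to do this, not the lecture's own code. The times and censoring flags are illustrative values chosen to match the counts quoted on the slides (12 at risk with 2 events at week 5, 10 at risk with 2 events at week 8, one censored observation at week 16); the lecture's actual data table is not available here, so treat them as an assumption.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of S(t).

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    Returns one (event_time, n_at_risk, n_events, survival) row per
    distinct event time. Censoring times produce no row.
    """
    rows = []
    survival = 1.0  # S(0) = 1 by convention
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # At risk: everyone whose follow-up lasted at least until t.
        n_at_risk = sum(1 for x in times if x >= t)
        n_events = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        survival *= (n_at_risk - n_events) / n_at_risk
        rows.append((t, n_at_risk, n_events, survival))
    return rows


# ASSUMPTION: illustrative relapse times in weeks, chosen to agree with the
# counts quoted on the slides; 0 marks a censored observation (week 16).
times = [5, 5, 8, 8, 12, 16, 23, 27, 30, 33, 43, 45]
events = [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]

km_rows = kaplan_meier(times, events)
for t, n, y, s in km_rows:
    print(f"t={t:>2}  at risk={n:>2}  events={y}  S(t)={s:.3f}")
```

Working with unrounded fractions gives S(8) = 8/12, about .667; the slide's .666 comes from carrying the rounded .833 forward. The censored subject at week 16 never produces a row but is absent from the risk set at every later event time.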

Notice
- Time 16 was not included in the table, yet 2 people were subtracted from the risk set at time 23
- The estimated survivor function does not change at censoring times when no event occurs
- Censored individuals are subtracted from the risk set at subsequent times because they are “lost to follow-up”

Kaplan-Meier Estimate
- Graph is a step function
- “Jumps” at each observed event time
- Nothing is assumed about the shape of the curve between observed event times

Kaplan-Meier Estimate

Kaplan-Meier Estimate
- Product limit estimate
  - Order survival times
  - Computed at observed events
  - Multiplying conditional probabilities
- Next time we’ll discuss Confidence Intervals for S(t)!

Big Assumption
- Independence of censoring and survival
- Those censored at time t have the same prognosis as those not censored at t

Comparing Survival Curves
- Common statistical tests:
  - Generalized Wilcoxon (Breslow, Gehan)
  - Logrank

Comparing Survival Curves
- Both compare survival curves across multiple time points to answer the question: “Is overall survival different between any of the groups?”
- $H_0$: No difference in S(t)
- $H_a$: Difference in S(t)

Comparing Survival Curves
- Wilcoxon (Breslow, Gehan) is more sensitive to early survival differences
[Figure: Kaplan-Meier curves by group (Group 1, Group 2); survival from 0.00 to 1.00 against analysis time from 0 to 400]

Comparing Survival Curves
- Logrank is more sensitive to later survival differences
[Figure: Kaplan-Meier curves by group (Group 1, Group 2); survival from 0.00 to 1.00 against analysis time from 0 to 400]

Comparing Survival Curves
- Neither test is very good if the curves “cross over”
[Figure: Kaplan-Meier curves by group (Group 1, Group 2); survival from 0.00 to 1.00 against analysis time from 0 to 400]

Logrank Test
- Answers the question: Are two survivor curves the same?
- Use the times of events: $t_1, t_2, \ldots$ (do not include censoring times)
- At each event time $t_j$, form a 2 × 2 table from the events and the “set of persons still at risk” (i.e., the risk set), and treat each table as independent

Logrank Test: Recipe
- Make a 2 × 2 table at each $t_j$

Logrank Test
- At each event time $t_j$, under the assumption of equal survival ($S_A(t) = S_B(t)$), the expected number of events in Group A out of the total events ($d_j = a_j + c_j$) is proportional to the share of those at risk at $t_j$ who are in Group A:

$$E(a_j) = d_j \times \frac{n_{jA}}{n_j}$$
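The expected count E(a_j) = d_j × n_jA / n_j can be accumulated over event times and compared with the observed number of events in Group A. The sketch below does that in plain Python. The variance term and the chi-square statistic are the standard logrank construction (hypergeometric variance for each 2 × 2 table) and are not taken from the slides above; the two small groups are hypothetical values for illustration only.

```python
def logrank(times_a, events_a, times_b, events_b):
    """Two-group logrank statistic from one 2x2 table per event time.

    events_* : 1 if the event was observed, 0 if censored.
    Returns (observed_a, expected_a, variance, chi_square).
    """
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e == 1}
        | {t for t, e in zip(times_b, events_b) if e == 1}
    )
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in Group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in Group B
        a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        c = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, a + c
        observed_a += a
        expected_a += d * n_a / n  # E(a_j) = d_j * n_jA / n_j
        if n > 1:
            # ASSUMPTION: standard hypergeometric variance of a_j,
            # not shown on the slides above.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi_square = (observed_a - expected_a) ** 2 / variance
    return observed_a, expected_a, variance, chi_square


# Hypothetical follow-up times; 0 marks a censored observation.
obs_a, exp_a, var_a, chi_sq = logrank(
    times_a=[2, 4, 6], events_a=[1, 1, 0],
    times_b=[3, 5, 7], events_b=[1, 0, 1],
)
print(obs_a, exp_a, var_a, chi_sq)
```

Under the null hypothesis of equal survival, the statistic is usually compared with a chi-square distribution with 1 degree of freedom. With groups this small the approximation is rough, so the numbers only illustrate the bookkeeping.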
