Lecture 16: Survival Analysis I: Kaplan-Meier and Log-rank Test

Ani Manichaikul (amanicha@jhsph.edu), 11 May 2007


  1. Lecture 16: Survival Analysis I – Kaplan-Meier and Log-rank Test
     - Ani Manichaikul, amanicha@jhsph.edu
     - 11 May 2007

  2. Survival Analysis
     - Statistical methods for the study of time to an event
     - Accounts for:
       - Time that events occur
       - Different follow-up times

  3. Survival Analysis
     - Survival analysis methods let us use both how often events occur and how long they take to occur
     - Subjects are followed until they have an “event,” or the study ends

  4. Endpoint
     - The endpoint doesn’t have to be ‘death’; it can be any well-defined event
       - Death
       - Disease onset
       - Menopause
       - Pregnancy
       - Relapse

  5. Time Scale
     - When do you start the clock?
       - Time from diagnosis of disease to death
       - Time from HIV infection to AIDS
       - Time from birth (chronological age)
       - Time from randomization in a clinical trial

  6. Why Is Survival Analysis Tricky?
     - We need a method that can incorporate information about censored data into an analysis

  7. The Survival Curve
     - S(t) is an estimate of the proportion of individuals still alive (have not had the event) at time t
     - [Figure: survival curve, S(t) against time]

  8. The Survival Curve
     - The survival curve is an important and complete summary:
       $S(t) = \dfrac{\#\,\text{alive at follow-up time } t}{\#\,\text{alive at time } 0}$
     - Time 0: “start of clock”
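
When every subject is followed until the event (no censoring), the ratio above can be computed by simple counting. The sketch below is only an illustration: the function name `empirical_survival` and the six event times are hypothetical, not taken from the lecture.

```python
def empirical_survival(event_times, t):
    """Fraction of subjects still event-free beyond time t (assumes no censoring)."""
    n_start = len(event_times)                       # alive at time 0
    n_alive = sum(1 for x in event_times if x > t)   # event has not happened by t
    return n_alive / n_start

# Hypothetical event times in months for six fully observed subjects.
times = [2, 3, 3, 7, 10, 12]
print(empirical_survival(times, 0))   # 1.0: everyone is alive at the start of the clock
print(empirical_survival(times, 3))   # 0.5: three of six survive beyond month 3
```

This simple count stops working as soon as some follow-up times are censored, which is what motivates the Kaplan-Meier estimator later in the lecture.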

  9. Survival Curve Facts:
     - The curve starts at 1 and decreases
     - Estimating these curves and comparing them among groups constitutes a “survival analysis”
     - Need to decide on what summary is important
       - Mean survival time
       - Median survival time
       - Height at a specific time: one- and two-year survival rates
       - Difference of curves: $S_1(12) - S_2(12)$

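  10. Estimating Median Survival
      - [Figure: survival curve S(t) against time; the median survival time m is read off where the curve reaches S(t) = .50]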
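
Reading the median off a step-shaped survival curve can be sketched as follows. This assumes one common convention (the first event time at which the curve is at or below 0.5); the function name and the curve values are hypothetical.

```python
def median_survival(event_times, surv):
    """Smallest event time at which the step curve is at or below 0.5."""
    for t, s in zip(event_times, surv):
        if s <= 0.5:
            return t
    return None  # curve never reaches 0.5, so the median cannot be estimated

# Hypothetical step curve: value of S(t) just after each event time.
event_times = [5, 8, 12, 23]
surv = [0.83, 0.67, 0.58, 0.49]

print(median_survival(event_times, surv))          # 23
print(median_survival(event_times[:3], surv[:3]))  # None
```

The second call shows the case where follow-up ends before half the subjects have had the event.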

  11. Caveat: Medians Do Not Describe the Whole Curve
      - [Figure: survival curves S(t) against time, with the median m marked at S(t) = .50]

  12. Survival Function
      - The survival function, denoted S(t), is a better way to represent the probability distribution of the survival time T when some of the observed times are censored
        - For a censored subject we only know that T > t, rather than T = t
      - S(t) = Pr(T > t) = Pr(No event by time t)
      - S(t) is the probability of surviving beyond t

  13. - Uncensored data: the event has occurred
      - Censored data: the event has yet to occur
        - Event-free at the current follow-up time
        - A competing event that is not an endpoint stops follow-up
          - Death (if not part of the endpoint)
          - Clinical event that requires treatment, etc.

  14. - Important issue: if no events are reported in the interval from last follow-up to “now”, we need to choose between:
        - No news is good news?
        - No news is no news

  15. - Ignore the incomplete cases; drop them
        - Produces bias in the estimated curve
        - Unbalanced censoring produces biased comparisons
      - Impute an event time
        - Depends on a model
      - Use the available information on each participant

  16. - $\text{Event Rate} = \dfrac{\#\,\text{events}}{\text{total observation time}}$
      - Example: 5 events in 600 person-months
        - 5/600 = 1/120 events per person-month = 0.1 events per person-year = 10 events per 100 person-years
      - Gives an average event rate over the follow-up period
      - For a finer time resolution, do the above for small intervals
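
The unit conversions in this example can be checked with exact fractions. This is only a small arithmetic sketch using the slide's numbers (5 events, 600 person-months); the variable names are illustrative.

```python
from fractions import Fraction

events = 5
person_months = 600

rate_per_month = Fraction(events, person_months)   # 1/120 events per person-month
rate_per_year = rate_per_month * 12                # 1/10 events per person-year
rate_per_100_py = rate_per_year * 100              # 10 events per 100 person-years

print(rate_per_month, float(rate_per_year), rate_per_100_py)   # 1/120 0.1 10
```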

  17. Quantities of Interest
      - The survivor function S(t): S(t) = P(T > t) = P(No event by time t)
      - Hazard function λ(t): λ(t) “=” P(T = t) / P(T > t) = risk of the event occurring at time t
      - The above form holds for discrete time; continuous time requires more complicated calculus-based notation.
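
For reference, the usual continuous-time definition that the slide alludes to can be written as below. This goes slightly beyond the slide: writing the hazard as λ(t) is a notational assumption here, and f(t), the probability density of T, is not named in the lecture.

```latex
\lambda(t)
  \;=\; \lim_{\Delta t \to 0^{+}}
        \frac{\Pr\left(t \le T < t + \Delta t \mid T \ge t\right)}{\Delta t}
  \;=\; \frac{f(t)}{S(t)}
  \;=\; -\frac{d}{dt}\,\log S(t)
```

The middle expression mirrors the discrete form: the chance of the event at t, relative to the chance of having survived that long.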

  18. Quantities of Interest
      - Often, we are interested in comparing the hazard between groups, for example, the relative hazard of relapse comparing those on chemo to those not on chemo
        - Relative Risk
        - Hazard Ratio
        - Risk Ratio

  19. Estimation
      - Kaplan-Meier survivor function estimator
      - Cox proportional hazards model (PHM) for the hazard ratio
      - We’ll start with Kaplan-Meier (K-M)

  20. Central Problem
      - Estimation of the survival curve
      - S(t) = proportion surviving beyond time t

  21. The Survival Curve
      - S(0) always equals 1: all subjects are alive at the beginning of the study
      - [Figure: S(t) against time, starting at 1.0 at time 0]

  22. The Survival Curve
      - The curve can only remain at the same value or decrease as time progresses
      - [Figure: S(t) against time, starting at 1.0 at time 0]

  23. The Survival Curve
      - If not all subjects experience the event by the end of the study window, the curve may never reach zero
      - [Figure: S(t) against time, starting at 1.0 at time 0]

  24. Example
      - Consider a clinical trial in patients with acute myelogenous leukemia (AML) comparing two groups of patients: no maintenance treatment with chemotherapy (X = 0) vs. maintenance chemotherapy treatment (X = 1)

  25. Example: Data

  26. Why Survival Methods?
      - We are interested in estimating the relationship between chemotherapy and the time to AML relapse in weeks.
      - We need some tools because:
        - Data are censored, so linear regression is not appropriate
        - We are interested in time to relapse, not just relapse (yes/no), so logistic regression is not appropriate

  27. Kaplan-Meier Estimate
      - Curve can be estimated at each event, but not at censoring times
      - S(t) = proportion of individuals surviving beyond time t

  28. Kaplan-Meier Estimate
      - Curve can be estimated at each event, but not at censoring times:
        $S(t) = S(\text{previous event time}) \times \dfrac{n(t) - y(t)}{n(t)}$
      - y(t) = # events at time t
      - n(t) = # subjects at risk for the event at time t

  29. Kaplan-Meier Estimate
      - Curve can be estimated at each event, but not at censoring times:
        $S(t) = S(\text{previous event time}) \times \dfrac{n(t) - y(t)}{n(t)}$
      - $S(\text{previous event time})$: proportion of the original sample making it to time t

  30. Kaplan-Meier Estimate
      - Curve can be estimated at each event, but not at censoring times:
        $S(t) = S(\text{previous event time}) \times \dfrac{n(t) - y(t)}{n(t)}$
      - $\dfrac{n(t) - y(t)}{n(t)}$: proportion of those surviving to time t who survive beyond time t

  31. Kaplan-Meier Estimate
      - Start the estimate at the first event time
      - No Chemotherapy group: Time = 5
        $S(5) = \dfrac{n(5) - y(5)}{n(5)} = \dfrac{12 - 2}{12} = \dfrac{10}{12} = .833$

  32. Kaplan-Meier Estimate
      - No Chemotherapy group: Time = 8
      - 2nd event time
        $S(8) = S(5) \times \dfrac{n(8) - y(8)}{n(8)} = (.833) \times \dfrac{10 - 2}{10} = \dfrac{10}{12} \times \dfrac{8}{10} = \dfrac{8}{12} \approx .667$
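
The first two steps of the estimate can be reproduced with exact fractions, which avoids rounding the intermediate value S(5). This is just an arithmetic check using the numbers at risk and event counts quoted on the slides; the variable names are illustrative.

```python
from fractions import Fraction

# n(t) and y(t) as quoted for the no-chemotherapy group.
s5 = Fraction(12 - 2, 12)        # S(5) = 10/12
s8 = s5 * Fraction(10 - 2, 10)   # S(8) = S(5) * 8/10

print(s5, round(float(s5), 3))   # 5/6 0.833
print(s8, round(float(s8), 3))   # 2/3 0.667
```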

  33. Kaplan-Meier Estimate
      - Skip over censoring times: remove censored subjects from the number at risk for the next event time
      - Continue through the final event time

  34. Alternative Notation
      $\hat{S}(t) = \prod_{i:\, t_i \le t} \dfrac{n_i - y_i}{n_i}$
      $\hat{S}(0) = 1$ (by convention)
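
The product formula translates almost directly into code. The sketch below is a minimal illustration, not the lecture's own implementation. The function name `kaplan_meier` is hypothetical, and the data are an assumption: the data slide is not reproduced here, so the example uses the no-maintenance arm of the AML data as distributed with R's survival package (`aml`), whose first values agree with n(5) = 12, y(5) = 2, n(8) = 10, y(8) = 2 from the worked slides.

```python
def kaplan_meier(times, events):
    """Return [(event_time, n_at_risk, n_events, S_hat)] for each distinct event time.

    times  : observed follow-up time per subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    s_hat = 1.0                      # S(0) = 1 by convention
    table = []
    for t in event_times:
        n = sum(1 for x in times if x >= t)                              # risk set at t
        y = sum(1 for x, e in zip(times, events) if x == t and e == 1)   # events at t
        s_hat *= (n - y) / n         # multiply in the conditional survival at t
        table.append((t, n, y, s_hat))
    return table

# ASSUMPTION: no-maintenance arm of the AML data (R `survival::aml`), in weeks.
times  = [5, 5, 8, 8, 12, 16, 23, 27, 30, 33, 43, 45]
events = [1, 1, 1, 1, 1,  0,  1,  1,  1,  1,  1,  1]

for t, n, y, s in kaplan_meier(times, events):
    print(f"t={t:>2}  at risk={n:>2}  events={y}  S(t)={s:.3f}")
```

Under this assumed data, the censored subject at week 16 never produces a row of its own, but it does leave the risk set before the next event at week 23.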


  36. Notice
      - Time 16 was not included in the table, yet 2 people were subtracted from the risk set at time 23
      - The estimated survivor function does not change at censoring times when no event occurs
      - Censored individuals are subtracted from the risk set at subsequent times because they are “lost to follow-up”


  38. Kaplan-Meier Estimate
      - Graph is a step function
      - “Jumps” at each observed event time
      - Nothing is assumed about the shape of the curve between observed event times
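
Because the estimate is a step function, evaluating it at an arbitrary time means looking up the most recent event time. A minimal sketch, assuming the curve is stored as sorted event times with the value of S(t) just after each one; the function name and the curve values are hypothetical.

```python
from bisect import bisect_right

def step_value(event_times, surv, t):
    """Evaluate a right-continuous step curve: S(t) holds its last value until the next event."""
    k = bisect_right(event_times, t)   # number of event times <= t
    return 1.0 if k == 0 else surv[k - 1]

# Hypothetical curve: S(t) just after each event time.
event_times = [5, 8, 12]
surv = [0.833, 0.667, 0.583]

print(step_value(event_times, surv, 3))    # 1.0: before the first event
print(step_value(event_times, surv, 8))    # 0.667: the jump happens at the event time
print(step_value(event_times, surv, 10))   # 0.667: flat until the next event at 12
```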

  39. Kaplan-Meier Estimate

  40. Kaplan-Meier Estimate
      - Product limit estimate
      - Order survival times
      - Computed at observed events
      - Multiplying conditional probabilities
      - Next time we’ll discuss confidence intervals for S(t)!

  41. Big Assumption
      - Independence of censoring and survival
      - Those censored at time t have the same prognosis as those not censored at t

  42. Comparing Survival Curves
      - Common statistical tests:
        - Generalized Wilcoxon (Breslow, Gehan)
        - Logrank

  43. Comparing Survival Curves
      - Both compare survival curves across multiple time points to answer the question: “Is overall survival different between any of the groups?”
      - $H_0$: No difference in S(t)
      - $H_a$: Difference in S(t)

  44. Comparing Survival Curves
      - Wilcoxon (Breslow, Gehan) is more sensitive to early survival differences
      - [Figure: Kaplan-Meier curves by group (Group 1, Group 2); survival from 0.00 to 1.00 against analysis time from 0 to 400]

  45. Comparing Survival Curves
      - Logrank is more sensitive to later survival differences
      - [Figure: Kaplan-Meier curves by group (Group 1, Group 2); survival from 0.00 to 1.00 against analysis time from 0 to 400]

  46. Comparing Survival Curves
      - Neither test performs well if the curves cross over
      - [Figure: Kaplan-Meier curves by group (Group 1, Group 2); survival from 0.00 to 1.00 against analysis time from 0 to 400]

  47. Logrank Test
      - Answers the question: are two survivor curves the same?
      - Use the times of events: $t_1, t_2, \ldots$ (do not include censoring times)
      - Treat each event time $t_j$ and its “set of persons still at risk” (i.e., risk set) as an independent table

  48. Logrank Test: Recipe
      - Make a 2 × 2 table at each $t_j$

  49. Logrank Test
      - At each event time $t_j$, under the assumption of equal survival ($S_A(t) = S_B(t)$), the expected number of events in group A, out of the total events $d_j = a_j + c_j$, is proportional to group A's share of the total at risk at time $t_j$:
        $E(a_j) = d_j \, n_{jA} / n_j$
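
The expected-count formula above can be accumulated over event times to form a test statistic. The sketch below is hedged in several ways. The function name `logrank` is hypothetical. The variance term and the chi-square statistic (O − E)² / V are the standard hypergeometric form and are not shown in this part of the lecture. The data are an assumption: both arms of the AML data as distributed with R's survival package (`aml`), since the data slide is not reproduced here.

```python
def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test: returns (observed_A, expected_A, variance, chi_square)."""
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    obs_a = exp_a = var = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)                         # at risk in A
        n_b = sum(1 for x in times_b if x >= t)                         # at risk in B
        a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)   # events in A
        c = sum(1 for x, e in zip(times_b, events_b) if x == t and e)   # events in B
        n, d = n_a + n_b, a + c
        obs_a += a
        exp_a += d * n_a / n                  # E(a_j) = d_j * n_jA / n_j
        if n > 1:                             # hypergeometric variance (beyond the slide)
            var += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    chi_square = (obs_a - exp_a) ** 2 / var   # compare to chi-square with 1 df
    return obs_a, exp_a, var, chi_square

# ASSUMPTION: AML data from R `survival::aml`, in weeks. A = no maintenance (X = 0),
# B = maintenance chemotherapy (X = 1). 1 = relapse observed, 0 = censored.
times_a  = [5, 5, 8, 8, 12, 16, 23, 27, 30, 33, 43, 45]
events_a = [1, 1, 1, 1, 1,  0,  1,  1,  1,  1,  1,  1]
times_b  = [9, 13, 13, 18, 23, 28, 31, 34, 45, 48, 161]
events_b = [1, 1,  0,  1,  1,  0,  1,  1,  0,  1,  0]

obs_a, exp_a, var, chi_square = logrank(times_a, events_a, times_b, events_b)
print(f"observed A = {obs_a:.0f}, expected A = {exp_a:.2f}, chi-square = {chi_square:.2f}")
```

Each pass through the loop corresponds to one 2 × 2 table from the recipe; only the group A cell is tracked because the other cells follow from the margins.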
