boxhed b oosted e x act h azard e stimator with d ynamic
play

BoXHED : B oosted e X act H azard E stimator with D ynamic covariates - PowerPoint PPT Presentation

BoXH BoXHED : B oosted e X act H azard E stimator with D ynamic covariates Xiaochen Wang Yale University With Donald K.K. Lee (Emory U.), Bobak J. Mortazavi (TAMU), Arash Pakbin (TAMU), Hongyu Zhao (Yale U.) Motivation Dynamic Features in


  1. BoXH BoXHED : B oosted e X act H azard E stimator with D ynamic covariates Xiaochen Wang Yale University With Donald K.K. Lee (Emory U.), Bobak J. Mortazavi (TAMU), Arash Pakbin (TAMU), Hongyu Zhao (Yale U.)

  2. Motivation Dynamic Features in Survival Analyses

  3. Motivation Dynamic Features in Survival Analyses High frequency health vitals in ICU

  4. Motivation Longitudinal data from clinical studies Dynamic Features in Survival Analyses High frequency health vitals in ICU

  5. Motivation Longitudinal data from clinical studies Dynamic Features in Survival Analyses High frequency health Mobile data and vitals in ICU wearables devices Behavioral data in financial risk assessment

  6. Challenges & Our contributions • Challenges: • ML survival methods mainly focus on time-static features. (Ishwaran et al. 08; Ranganath et al. 16; Bellot & van der Schaar 18, 19; Lee et al. 19) • Methods dealing with dynamic features are very sparse: • Non-parametric: kernel smoothing for low-dimensional covariate settings. • Parametric : ‘flexsurv’ R package.

  7. Challenges & Our contributions • Challenges: • ML survival methods mainly focus on time-static features. (Ishwaran et al. 08; Ranganath et al. 16; Bellot & van der Schaar 18, 19; Lee et al. 19) • Methods dealing with dynamic features are very sparse: • Non-parametric: kernel smoothing for low-dimensional covariate settings. • Parametric : ‘flexsurv’ R package. • Contributions: 1. First publicly available software for boosted hazard estimation with time- dependent features. https://github.com/BoXHED 2. Novel algorithmic implementation of Lee, Chen, Ishwaran “ Boosted nonparametric hazards with time-dependent covariates ” (2017)

  8. Problem statement Each participant 𝑗 is represented by a triplet (𝑌 $ 𝑢 &∈ (,* + , Δ $ , 𝑈 $ ) . • 𝑌 $ 𝑢 is a set of continuously-monitored features. • Δ $ is a binary event indicator: 1 for an uncensored instance and 0 for a censored instance. • 𝑈 $ is the observed time, i.e. Event bme if Δ $ = 1 𝑈 $ = 2 Censoring bme if Δ $ = 0 Goal: Given above information of 𝑜 participants, we want to estimate log-hazard function 𝐺 𝑢, 𝑦 .

  9. Loss function • Loss function – negative log-likelihood. = * + 𝑆 𝐺 = 1 𝑓 @(&,A + & ) 𝑒𝑢 − Δ $ 𝐺(𝑈 𝑜 : > $ , 𝑌 $ 𝑈 $ ) ( $;< • Challenge: Likelihood risk 𝑆(𝐺) is too complex to be optimized using traditional techniques. Solution provided in Lee, Chen, Ishwaran 17 .

  10. Algorithm Overview

  11. Algorithm Overview

  12. Constructing the tree 𝑕 E Tree Construction Demo • Select candidate splits based on percentiles (adjustable). + Trajectory X i ( t ) + X + + + + + + + Time + Candidate splits on time and feature

  13. Constructing the tree 𝑕 E Tree Construction Demo • Select candidate splits based on percentiles (adjustable). + • Splits chosen to minimize R(F). How? Trajectory X i ( t ) + X + + + + + + + Time + Candidate splits on time and feature

  14. Constructing the tree 𝑕 E What’s the risk reduction if we split here? Tree Construction Demo • Select candidate splits based on percentiles (adjustable). 𝐵 < 𝐵 G + • Splits chosen to minimize R(F). How? 1. A split creates two new sub-regions 𝐵 < and 𝐵 G . Trajectory X i ( t ) + X + + + + + + + Time + Candidate splits on time and feature

  15. Constructing the tree 𝑕 E 𝑉 < Tree Construction Demo • Select candidate splits based on percentiles (adjustable). 𝐵 < 𝐵 G + • Splits chosen to minimize R(F). How? 1. A split creates two new sub-regions 𝐵 < and 𝐵 G . Trajectory X i ( t ) 2. Split score: + X O P O R SO T G 𝑒 = ∑ I;< 𝑊 I 1 + log Q P − (𝑊 < + 𝑊 G )(1 + log T ) , where Q R SQ + * + 𝑓 @ W &,A + & 𝐽 Y P 𝑢, 𝑌 $ 𝑢 = 𝑉 I = ∑ $;< 𝑒𝑢 , ∫ + ( 𝑊 I = # 𝑝𝑔 𝑝𝑐𝑡𝑓𝑠𝑤𝑓𝑒 𝑓𝑤𝑓𝑜𝑢𝑡 𝑗𝑜 𝐵 I . + + + + + Time + Candidate splits on time and feature

  16. Constructing the tree 𝑕 E 𝑉 < Tree Construction Demo • Select candidate splits based on percentiles (adjustable). 𝐵 < 𝐵 G + • Splits chosen to minimize R(F). How? 1. A split creates two new sub-regions 𝐵 < and 𝐵 G . Trajectory X i ( t ) 2. Split score: + X O P O R SO T G 𝑒 = ∑ I;< 𝑊 I 1 + log Q P − (𝑊 < + 𝑊 G )(1 + log T ) , where Q R SQ + * + 𝑓 @ W &,A + & 𝐽 Y P 𝑢, 𝑌 $ 𝑢 = 𝑉 I = ∑ $;< 𝑒𝑢 , ∫ + ( 𝑊 I = # 𝑝𝑔 𝑝𝑐𝑡𝑓𝑠𝑤𝑓𝑒 𝑓𝑤𝑓𝑜𝑢𝑡 𝑗𝑜 𝐵 I . + + + + + 3. Choose the split that minimized 𝑒 . Time + Candidate splits on time and feature

  17. Constructing the tree 𝑕 E 𝑉 < Tree Construction Demo • Select candidate splits based on percentiles (adjustable). 𝐵 < 𝐵 G + • Splits chosen to minimize R(F). How? 1. A split creates two new sub-regions 𝐵 < and 𝐵 G . Trajectory X i ( t ) 2. Split score: + X O P O R SO T G 𝑒 = ∑ I;< 𝑊 I 1 + log Q P − (𝑊 < + 𝑊 G )(1 + log T ) , where Q R SQ + * + 𝑓 @ W &,A + & 𝐽 Y P 𝑢, 𝑌 $ 𝑢 = 𝑉 I = ∑ $;< 𝑒𝑢 , ∫ + ( 𝑊 I = # 𝑝𝑔 𝑝𝑐𝑡𝑓𝑠𝑤𝑓𝑒 𝑓𝑤𝑓𝑜𝑢𝑡 𝑗𝑜 𝐵 I . + + + + + 3. Choose the split that minimized 𝑒 . Time • Choose subsequent splits to also minimize split score. + Candidate splits on time and feature

  18. Results • Simulation data • Framingham heart study data

  19. Simulation Data Four hazard functions (Pérez et al. 13) 𝜇 < 𝑢, 𝑦 & = 𝐶𝑓𝑢𝑏 𝑢, 2, 2 ×𝐶𝑓𝑢𝑏 𝑦 & , 2, 2 , 𝑢 ∈ 0, 1 ; 𝜇 G 𝑢, 𝑦 & = 𝐶𝑓𝑢𝑏 𝑢, 4, 4 ×𝐶𝑓𝑢𝑏 𝑦 & , 4, 4 , 𝑢 ∈ 0, 1 ; 𝜇 h 𝑢, 𝑦 & = 1 𝜚(log 𝑢 − 𝑦 & ) Φ(𝑦 & − log 𝑢) , 𝑢 ∈ 0, 5 ; 𝑢 𝜇 l 𝑢, 𝑦 & = 3 2 𝑢 (.n exp − 1 2 cos 2𝜌𝑦 & − 3 2 , 𝑢 ∈ 0, 5 . 0, 20, and 40 irrelevant features from standard normal distribution are added to above four hazards.

  20. Methods Can handle time- Nonparametric? Variable selection Parameter tuning dependent features? Cross-validated on BoXHED √ √ √ training data Kernel bandwidth Kernel Smoothing √ √ tuned directly to test data Best parametric FlexSurv √ √ family for test data Best parametric family and Black-boost √ #iterations for test data

  21. RMSE error RMSE error with 95% confidence interval.

  22. RMSE error The kernel function is a beta density, resembling 𝜇 < and 𝜇 G . RMSE error with 95% confidence interval.

  23. RMSE error The kernel function is a beta density, resembling 𝜇 < and 𝜇 G . RMSE error with 95% confidence interval. flexsurv is correctly specified for 𝜇 h (log-normal distribution)

  24. Time-dependent AUC AUC versus time 𝑢 for the estimators when applied to data simulated from 𝜇 < . Larger AUC values are better. Left: No irrelevant covariates; right: 20 irrelevant covariates.

  25. Framingham heart study data • 9,697 participants enrolled by 1975 with event follow-up through 2017. • Many features were measured repeatedly in physical exams almost every two years. • Risk factors: age, gender, systolic blood pressure (SBP), diastolic blood pressure (DBP), total cholesterol (TC), smoking, diabetes, and BMI. • Outcome: first occurrence of a CVD event.

  26. Relationship between SBP and CVD Conflicting clinical literature on how SBP affects CVD risk. • CVD risk increases with SBP; • CVD risk decreases with SBP, and then increases (U-shaped); • some more complicated interaction patterns ... BoXHED identified novel interaction effects that may partially explain these conflicting findings.

  27. Estimated hazard by SBP

  28. Novel clinical finding • Hypotheses: The interaction effects SBP × BMI and SBP × Gender are responsible for the reported clinical findings on SBP and CVD risk. • Validation: SBP × BMI interaction effect is validated using the conventional odds ratio analyses.

  29. Conclusions • BoXHED is first publicly available software for boosted hazard estimation that is • completely nonparametric • able to handle time-dependent features • applicable to high-dimensional data • Uncovered a novel interaction effect that may explain conflicting findings on CVD risk in clinical literature. https://github. com /BoXHED

  30. Q&A

Recommend


More recommend