DR. TED: Deep learning Recommendation of Treatment from Electronic Data. David Ledbetter, Melissa Aczon, Randall Wetzel, M.D. Children’s Hospital Los Angeles (CHLA), Virtual Pediatric ICU (VPICU). GTC, April 7th 2016 1
Outline ● Problem ● Data ● Models ● Results ● Summary 2
Problem For an individual patient, can we recommend the most effective treatment? There’s actually a patient there 3
Traditional Approach ● Doctors generate implicit models ○ Requires significant training ○ Combination of academic, clinical experience, and medical research ● Use model to generate treatment strategies for new patients ○ Limited time (other patients, rapid deteriorations) ○ Limited capacity* to ingest data *Miller. The Magical Number Seven, Plus or Minus Two. Psychological Review, 63 (2): 81-97, 1956 4
Moving Forward ● Generate explicit model from clinical data to predict which treatments will give best patient outcomes ○ Leverage 10+ years of electronic health records (EHR) ■ ~12,000 patient encounters from CHLA PICU ■ (patient, treatment, outcome) triples ● Learn the most important relationships utilizing state-of-the-art information extraction techniques 5
Moving Forward My CPU is a neural-network processor; a learning computer. The more CHLA PICU data I have, the more I learn. 6
DRTED Input Model Output 7
Data Structure - Overview ● Convert non-uniformly sampled time-series data into image representation ● Image representation enables exploitation via advanced computer vision algorithms 9
Data Structure - Patient Snapshots ● 161 measurements ○ 53 labs/vitals ○ 108 drugs/interventions ● 12 hours of data ○ Sampled every 5 minutes ○ (144 samples) [Figure: patient snapshot image; rows grouped as Labs, Vitals, Drugs, Inter. (interventions); horizontal axis is Time] 10
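The conversion from irregular readings to a fixed 161 x 144 snapshot can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function name `build_snapshot`, the channel names, and the rule that the last reading in a 5-minute bin wins are all assumptions, since the slides do not say how readings are binned.

```python
from typing import Dict, List, Optional, Tuple

STEP_MINUTES = 5                          # sampling interval stated on the slide
WINDOW_MINUTES = 12 * 60                  # 12-hour window stated on the slide
N_STEPS = WINDOW_MINUTES // STEP_MINUTES  # 144 columns


def build_snapshot(
    series: Dict[str, List[Tuple[float, float]]],
    channels: List[str],
) -> List[List[Optional[float]]]:
    """Return a len(channels) x 144 grid of values.

    `series` maps a channel name to (minute_offset, value) readings taken at
    irregular times. None marks a bin with no reading. Binning rule
    (assumption): the latest reading inside a 5-minute bin wins.
    """
    grid: List[List[Optional[float]]] = []
    for name in channels:
        row: List[Optional[float]] = [None] * N_STEPS
        for minute, value in sorted(series.get(name, [])):
            if 0 <= minute < WINDOW_MINUTES:
                row[int(minute // STEP_MINUTES)] = value
        grid.append(row)
    return grid
```

With all 161 channels listed in `channels`, the result has the 161 x 144 shape described on the slide; empty bins still need an imputation step before the grid can be treated as an image.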
DRTED Input Model Output 11
v1: Convolutional Neural Network ● Utilizes patient snapshot as input ‘image’ ● VGG-style architecture ○ Heavily exploit temporal relationships with 1-D convolutions ● Generates mortality prediction given fixed time window ● NVIDIA GTX Titan X used for training 12
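The idea of exploiting temporal relationships with 1-D convolutions can be illustrated on a single measurement row. The sketch below is not the VGG-style network from the talk: it applies one hand-picked filter to one channel in plain Python, and the names `conv1d_time` and `relu` are illustrative only.

```python
from typing import List


def conv1d_time(row: List[float], kernel: List[float]) -> List[float]:
    """Valid-mode 1-D cross-correlation of one measurement row along time."""
    k = len(kernel)
    return [
        sum(kernel[j] * row[i + j] for j in range(k))
        for i in range(len(row) - k + 1)
    ]


def relu(values: List[float]) -> List[float]:
    """Element-wise rectifier, the usual nonlinearity between conv layers."""
    return [max(0.0, v) for v in values]


# A difference kernel responds to changes between consecutive samples.
heart_rate = [120.0, 121.0, 125.0, 119.0]
rising_edges = relu(conv1d_time(heart_rate, [-1.0, 1.0]))
```

A VGG-style model stacks many such small learned filters with pooling in between, so later layers see progressively longer stretches of the 12-hour window.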
v2: Recurrent Neural Network ● Basic structure is a feedback loop ○ At each time, t, a vector X is input ○ An output is generated and fed back into the network ● Advantages: ○ Native comprehension of the temporal dimension ■ Including non-uniform samples ○ Increased temporal memory ○ Formal feedback mechanism ○ Generate predictions for all vitals [Figure: recurrent kernel with inputs Patient Vitals X_V and Patient Treatments X_T, and outputs Mortality Prediction y_m and Physiology forecasting y_p] 13
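The feedback loop can be sketched with a vanilla recurrent update, h_t = tanh(W_x x_t + W_h h_{t-1} + b). This is a simplification chosen for brevity: the backup slides point to LSTMs, which add gating on top of this loop. The weight names `W_x`, `W_h`, and `b` are assumptions for illustration.

```python
import math
from typing import List

Matrix = List[List[float]]


def rnn_step(x: List[float], h_prev: List[float],
             W_x: Matrix, W_h: Matrix, b: List[float]) -> List[float]:
    """One recurrent update: the previous state h_prev is fed back in."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_x[i][j] * x[j] for j in range(len(x)))
        total += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h


def run_sequence(xs: List[List[float]],
                 W_x: Matrix, W_h: Matrix, b: List[float]) -> List[List[float]]:
    """Feed input vectors one time step at a time, starting from a zero state."""
    h = [0.0] * len(b)
    states = []
    for x in xs:
        h = rnn_step(x, h, W_x, W_h, b)
        states.append(h)
    return states
```

In the talk's setting each input vector would concatenate patient vitals and treatments, and separate output layers on top of the state would produce the mortality and physiology predictions.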
DRTED Input Model Output 14
Assessment - Mortality ● Results for holdout set of 3372 patients with encounter length of at least 12 hours ○ DRTED AUC - 90.3% ○ PIM 2* AUC - 83.0%** Notes: *Pediatric Index of Mortality **Published PIM 2 AUC 15
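For readers unfamiliar with the AUC figures above, the metric can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below is a generic pairwise implementation, not the evaluation code used for these results; the function name `roc_auc` and the label convention (1 = died, 0 = survived) are assumptions.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a holdout set of a few thousand encounters; a rank-based formulation gives the same value faster.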
Predictions ● Predictions generated for 5 key vitals + Mortality ○ Heart Rate ○ Diastolic Blood Pressure ○ Systolic Blood Pressure ○ Respiratory Rate ○ Pulse Oximetry ● Accurate prediction of vitals and mortality enables prediction of treatment effects [Figure: six panels plotted against Time (hours): Heart Rate, Diastolic Pressure, Systolic Pressure, Respiratory Rate, Pulse Oximetry, Probability of Survival] 16
Predicted Treatment Effect ● Patient diagnosed with: Cardiac Arrest, Cardiomyopathy, Epileptic Seizures, Pneumothorax ● Eventually treated with: Piperacillin, Vancomycin, Epinephrine, Phenylephrine ● Utilize machinery to predict effect of each treatment on patient 17
Summary ● Applied deep learning methods on 10+ years of Pediatric ICU data ○ Able to generate state-of-the-art mortality predictions ○ Able to generate physiology predictions ○ Able to generate predictions of treatment/therapy effects ● Framework/machinery is being extended to provide additional Decision Support Services to clinicians 18
Machine Learning in Healthcare Contact: ledbetdr@gmail.com macs.aczon@gmail.com Interested? Machine Learning in Healthcare Conference August 19th, 20th Children’s Hospital Los Angeles http://www.mucmd.org/ 19
Backups 20
Augment Data - Example Original 0-8 Hours 21
Patient Snapshots - High Survive: Survivors ● Sample surviving patients with high predicted probability of survival [Figure: patient snapshots; rows grouped as Labs, Vitals, Drugs, Inter.] 22
Patient Snapshots - Low Survive: Died ● Sample non-surviving patients with low predicted probability of survival [Figure: patient snapshots; rows grouped as Labs, Vitals, Drugs, Inter.] 23
Patient Snapshots - Low Survive: Survived ● Sample surviving patients with low predicted probability of survival [Figure: patient snapshots; rows grouped as Labs, Vitals, Drugs, Inter.] 24
Patient Snapshots - High Survive: Died ● Sample non-surviving patients with high predicted probability of survival ● Figure annotation: patient encounter lasts for 4 days but no data during first 72 hours [Figure: patient snapshots; rows grouped as Labs, Vitals, Drugs, Inter.] 25
Optimization ● Minimize: $\lVert x^{V}_{t+1} - h^{V}_{t} \rVert_{\alpha} + \beta \, \lvert y - h^{y}_{t} \rvert$ ● First term represents ability to predict future vital readings from current information ● Second term represents ability to predict outcome from current information ● Alpha term represents vector cost weight for vital prediction ● Beta term represents cost weight of mortality prediction 26
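A per-time-step version of this objective might look like the sketch below, reading it as an alpha-weighted vitals error plus beta times the absolute mortality error. The slide does not say which norm the first term uses, so the squared error here is an assumption, as are the function and argument names.

```python
from typing import List


def step_loss(x_next: List[float], h_vitals: List[float],
              y: float, h_mortality: float,
              alpha: List[float], beta: float) -> float:
    """Cost at one time step.

    x_next      -- observed vitals at time t+1
    h_vitals    -- model's forecast of those vitals made at time t
    y           -- true outcome label
    h_mortality -- model's outcome prediction at time t
    alpha       -- per-vital cost weights (one weight per vital)
    beta        -- scalar cost weight of the mortality term
    """
    # Assumption: alpha-weighted squared error; the slide leaves the norm open.
    vital_term = sum(a * (x - h) ** 2 for a, x, h in zip(alpha, x_next, h_vitals))
    outcome_term = beta * abs(y - h_mortality)
    return vital_term + outcome_term
```

Raising `beta` relative to `alpha` pushes training toward the mortality prediction at the expense of the physiology forecast, and the per-vital entries of `alpha` let differently scaled vitals contribute comparably.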
Implementation ● Each patient is represented as a sequence of $n_t$ vectors, each of length $m+1$ ○ $m+1$ → # of measurements + Δt ○ $n_t$ → # of discrete time steps ● Vitals receive forward fill + median imputation ● Intermittent exogenous inputs are delta functions ○ Continuous drugs/interventions are propagated ● Δt informs the algorithm how far into the future it needs to predict ○ Training is allowed to ‘cheat’: it knows when the next measurement is ○ But that’s OK, we just want to learn the relationships ○ At test time, we specify Δt to predict a precise point in time 27
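The input encoding described above can be sketched as follows: forward-fill each vital, fall back to a median before the first observation, and append Δt (time until the next step) to every vector. The helper names, the use of externally supplied medians, and the choice of Δt = 0 for the final step are assumptions; the slide does not specify them.

```python
from typing import Dict, List, Optional


def forward_fill(values: List[Optional[float]], fallback: float) -> List[float]:
    """Carry the last observation forward; use `fallback` before the first one."""
    filled: List[float] = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last if last is not None else fallback)
    return filled


def to_sequence(times: List[float],
                vitals: Dict[str, List[Optional[float]]],
                medians: Dict[str, float]) -> List[List[float]]:
    """Build n_t vectors of length m+1: m imputed measurements plus delta-t."""
    names = sorted(vitals)  # fixed channel order
    columns = [forward_fill(vitals[n], medians[n]) for n in names]
    sequence = []
    for t in range(len(times)):
        # Assumption: the last step has no successor, so delta-t is set to 0.
        dt = times[t + 1] - times[t] if t + 1 < len(times) else 0.0
        sequence.append([col[t] for col in columns] + [dt])
    return sequence
```

At test time the final element of a vector can be overwritten with the desired horizon, which is how a specific prediction time would be requested under this encoding.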
LSTM http://colah.github.io/posts/2015-08-Understanding-LSTMs/ 28
Assessment - Treatment Response (cont.) ● Intrinsic difficulty of assessment is ambiguity of truth ● Best metric would be A/B test ○ Average outcome of patients whose doctors have access to decision aid vs. ○ Average outcome of patients whose doctors do not have access to decision aid ○ Not practical for initial development or iteration ● Instead, develop intuitive quantification ○ Provide adequate feedback for iteration ○ Based on a simple assumption: ■ Maximizing frequency of recommendations of actual treatments used in successful cases is good 29
Assessment - Treatment Response ● Compress interventions and drugs into treatment response: $y = \Delta\mathrm{HealthIndex} \cdot \mathrm{treatment}$, where: ○ $\Delta\mathrm{HealthIndex} = \mathrm{HealthIndex}_{i+1} - \mathrm{HealthIndex}_{i}$ ○ $\mathrm{HealthIndex}_{i}$ is the expected survival at time $t_i$ as computed by the survival model ○ treatment is a vector $[t_1, t_2, \ldots, t_n]$, with $t_i \in \{0, 1\}$ indicating presence of treatment categories ● Elements of $y$ contain: ○ positive values for treatments that contributed to improvement ○ negative values for treatments detrimental to patient condition ○ 0 for treatments not utilized 30
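The treatment response defined above reduces to scaling a binary treatment vector by the change in health index. The sketch below assumes that the treatment vector recorded at step i is paired with the change from step i to step i+1; that alignment, and the function name, are not stated in the slides.

```python
from typing import List


def treatment_response(health_index: List[float],
                       treatments: List[List[int]]) -> List[List[float]]:
    """Compute y = delta-HealthIndex * treatment for each consecutive pair.

    health_index[i] -- expected survival at step i from the survival model
    treatments[i]   -- 0/1 flags for the treatment categories present at step i
    """
    responses = []
    for i in range(len(health_index) - 1):
        delta = health_index[i + 1] - health_index[i]
        responses.append([delta * flag for flag in treatments[i]])
    return responses


# Health index rises after treatment 0, then falls after treatment 1.
y = treatment_response([0.5, 0.75, 0.5], [[1, 0], [0, 1]])
```

Each row of `y` is positive where a used treatment coincided with improvement, negative where it coincided with deterioration, and zero for treatments that were not given.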