Reliable Decision Support using Counterfactual Models
Suchi Saria - PowerPoint PPT Presentation

  1. Reliable Decision Support using Counterfactual Models
 Suchi Saria, Assistant Professor, Computer Science, Applied Math & Stats, and Health Policy, Institute for Computational Medicine
 with Peter Schulam, PhD candidate

  2. Example: Customer Churn. Model P(Cancels Account | customer features)

  3. Example: Customer Churn. Supervised learning fits an estimate P̂(Cancels Account | customer features) from historical data.

  4. Example: Customer Churn. Supervised learning fits an estimate P̂(Cancels Account | customer features) from historical data.
 Supervised ML models can be biased for decision-making problems!

  5. Why? Past actions (ad emails, discounts, etc.) were determined by some policy.

  6. Why? Future actions (ad emails, discounts, etc.) are determined by a policy based on your learned model P̂.

  7. Why? P(Cancels Account | π_train) ≠ P(Cancels Account | π_test(P̂))
 Supervised ML leads to models that are unstable to shifts in the policy between train and test.

  8. Example: Risk Monitoring. Adverse event onset: is the patient at risk of septic shock?

  9.
 • Rise in temperature and rise in WBC are indicators of sepsis and death
 • But doctors in hospital H1 aggressively treat patients with high temperature
 • As doctors treat more aggressively, the supervised learning model learns that high temperature is associated with low risk.
 Dyagilev and Saria, Machine Learning 2015

  10. [Figure: policies "treat based on temp" vs. "treat based on WBC"]
 Increasing discrepancy in physician prescription behavior between the train and test environments: a predictive model trained using classical supervised ML creates unsafe scenarios where sick patients are overlooked.
 Dyagilev and Saria, Machine Learning 2015
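The inversion described on these slides is easy to reproduce in simulation. Below is a minimal, purely illustrative sketch (all variables and numbers are hypothetical, not from the paper): the training-environment policy aggressively treats high-temperature patients, and treatment strongly reduces the event rate, so a naive supervised model ends up associating high temperature with low risk.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Latent severity drives both temperature and the (untreated) risk of the event.
severity = rng.normal(size=n)
temp = 37.0 + severity + 0.3 * rng.normal(size=n)

# Training-environment policy: physicians aggressively treat high-temperature
# patients, and treatment sharply reduces the probability of the adverse event.
treated = (temp > 37.5).astype(float)
p_event = 1.0 / (1.0 + np.exp(-(severity - 3.0 * treated)))
event = rng.binomial(1, p_event)

# Naive supervised model of P(event | temp): logistic regression by gradient descent.
X = np.column_stack([np.ones(n), (temp - temp.mean()) / temp.std()])
w = np.zeros(2)
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 1.0 * X.T @ (p - event) / n

# Because treatment tracks temperature, the fitted model associates high
# temperature with LOW risk: the unsafe inversion described on the slide.
print("learned temperature coefficient:", w[1])
```

If the test-environment policy stops treating on temperature, this learned (negative) association no longer holds, and high-risk patients are scored as low risk.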

  11. Run an experiment: observe the outcome under different scenarios
 • Clone the customer; give a 10% discount code to one clone and a 20% discount code to the other
 • Choose the action with the better outcome
 {Y(d10), Y(d20)}, where Y(d10) is the outcome under the 10% discount.

  12. Run an experiment: observe the outcome under different scenarios
 • Clone the customer; give a 10% discount code to one clone and a 20% discount code to the other
 • Choose the action with the better outcome
 {Y(d10), Y(d20)}, where Y(d20) is the outcome under the 20% discount.

  13. Can we learn models of these outcomes from observational data?
 • Factual: the outcome observed in the data
 • Counterfactual: the outcome that is unobserved
 {Y(d10), Y(d20)}

  14. Potential Outcomes
 {Y(a) : a ∈ A}, where A is the set of actions, a is an action, and each Y(a) is a random variable.
 Potential outcomes model the outcome under each possible action (or intervention).
 Neyman et al., 1923; Rubin, 1974; Rubin, 2005
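The notation can be made concrete with a toy two-action example (the numbers are purely illustrative): each unit carries one potential outcome per action, but the data only ever reveal the outcome of the action actually taken.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# Action set A = {0, 1}: e.g. no discount (0) vs. a 10% discount (1).
# Each unit has one potential outcome per action, Y(0) and Y(1);
# here Y(1) = Y(0) + 2, a constant effect chosen purely for illustration.
y0 = rng.normal(loc=10.0, scale=1.0, size=n)
y1 = y0 + 2.0

# The data only ever reveal the potential outcome of the action taken:
a = rng.integers(0, 2, size=n)
y_obs = np.where(a == 1, y1, y0)   # consistency: Y = Y(a) when A = a

# The counterfactual outcome (the other column) is never observed.
y_cf = np.where(a == 1, y0, y1)
```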

  15. Sequential Decisions in Continuous-Time
 [Figure: PFVC (lung capacity) vs. years since first symptom]

  22. Counterfactual GP
 [Figure: PFVC trajectory with an unobserved future value marked "?"]

  23. Counterfactual GP
 [Figure: PFVC trajectory with one counterfactual prediction E[Y(a) | H = h]]

  24. Counterfactual GP
 [Figure: PFVC trajectory with two counterfactual predictions E[Y(a) | H = h] under different action sequences]

  25. Counterfactual GP
 [Figure: PFVC trajectory with three counterfactual predictions E[Y(a) | H = h] under different action sequences]

  26. Related Work
 • Counterfactual models: see Schulam and Saria, NIPS 2017 for a discussion of related work.
 • Ads; single intervention: Brodersen et al., 2015; Bottou et al., 2013
 • Epidemiology; multiple sequential interventions: Taubman et al., 2009; Lok et al., 2008
 • Sparse, irregularly sampled longitudinal data; functional outcomes: Xu, Xu, and Saria, 2016; Schulam and Saria, 2017
 • Off-policy evaluation: re-weighting to evaluate the reward of a policy when learning from offline data, e.g. Dudik et al., 2011; Jiang and Li, 2016; Paduraru et al., 2013

  27. Critical Assumptions
 • To learn the potential outcome models, we will use three important assumptions:
 • (1) Consistency: links observed outcomes to potential outcomes
 • (2) Treatment positivity: ensures that we can learn potential outcome models
 • (3) No unmeasured confounders (NUC): ensures that we do not learn biased models
 Neyman et al., 1923; Rubin, 1974; Rubin, 2005

  28. (1) Consistency
 • Consider a dataset containing observed outcomes, observed treatments, and covariates: {y_i, a_i, x_i}_{i=1}^n
 • E.g.: blood pressure, exercise, BMI
 • Consistency allows us to replace the observed response with the potential outcome of the observed treatment: Y = Y(a) when A = a
 • Under consistency, our dataset satisfies {y_i, a_i, x_i}_{i=1}^n = {y_i(a_i), a_i, x_i}_{i=1}^n

  29. (2) Positivity
 • When working with observational data, for any covariates x we need to assume a non-zero probability of seeing each treatment
 • Otherwise, in general, we cannot learn a conditional model of the potential outcomes given those covariates
 • Formally, we assume that P_obs(A = a | X = x) > 0 ∀ a ∈ A, ∀ x ∈ X
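A quick empirical sanity check of positivity is to verify that every (covariate-stratum, action) cell of the dataset is populated. A toy sketch, with hypothetical strata and treatment probabilities chosen strictly inside (0, 1):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Hypothetical discretized covariate (e.g. a BMI bucket) and a binary
# treatment (e.g. exercise) whose probability lies strictly inside (0, 1).
x = rng.integers(0, 4, size=n)
p_treat = np.array([0.1, 0.4, 0.6, 0.9])   # P(A = 1 | X = x) for each stratum
a = rng.binomial(1, p_treat[x])

def positivity_ok(x, a, n_strata=4, actions=(0, 1)):
    """Empirical check: every (stratum, action) cell should be populated."""
    return all(np.any((x == s) & (a == act))
               for s in range(n_strata) for act in actions)

print(positivity_ok(x, a))
```

An empty cell does not prove that positivity fails in the population, but it does mean the conditional model for that (x, a) pair cannot be learned from this sample.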

  30. (3) No Unmeasured Confounders (NUC)
 • Formally, NUC is a statistical independence assertion: Y(a) ⊥ A | X = x, ∀ a ∈ A, ∀ x ∈ X

  31. (3) No Unmeasured Confounders (NUC)
 • Formally, NUC is a statistical independence assertion: Y(a) ⊥ A | X = x, ∀ a ∈ A, ∀ x ∈ X
 [Figure: causal diagrams over exercise (A), blood pressure (Y), and BMI (X)]
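NUC can be illustrated by simulation: when the only confounder X is measured, the potential outcome Y(a) and the treatment A are dependent marginally (X drives both) but independent within each stratum of X. A toy sketch, with all variables and numbers hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Measured confounder X (e.g. BMI bucket) drives both the treatment A
# (e.g. exercise) and the potential outcome Y(1) (e.g. blood pressure).
x = rng.integers(0, 2, size=n)
a = rng.binomial(1, np.where(x == 1, 0.8, 0.2))
y1 = x + rng.normal(size=n)   # potential outcome under A = 1

# Marginally, Y(1) and A are correlated because X drives both...
marginal = np.corrcoef(y1, a)[0, 1]

# ...but within each stratum of X they are (nearly) uncorrelated: Y(a) ⊥ A | X.
within = [np.corrcoef(y1[x == s], a[x == s])[0, 1] for s in (0, 1)]
```

If an additional confounder were left out of X, the within-stratum correlation would not vanish, and the models learned on this data would be biased.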

  32. Learning Potential Outcome Models
 • The assumptions allow estimation of potential outcomes from observational data:
 P(Y(a) | X = x) = P(Y(a) | X = x, A = a)   (by NUC, A3)
                 = P(Y | X = x, A = a)      (by consistency, A1)
 • Positivity (A2) ensures the conditional on the right is well-defined for every action a.
 • Estimation then requires only a statistical model for this conditional.
 • To simulate data from a new policy, we need to learn the potential outcome models.
 • If we have an observational dataset where assumptions 1-3 hold, then this is possible!
 UAI Tutorial: Saria and Soleimani, 2017
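The identity above can be turned into an estimator directly: fit a regression of Y on (X, A), then average its predictions with A pinned to each action. A minimal sketch with simulated data in which all three assumptions hold (the numbers are hypothetical, and ordinary least squares stands in for whatever conditional model one prefers):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000

# Simulated observational data satisfying all three assumptions:
# X confounds both the treatment A and the outcome Y; the true effect of A is 3.
x = rng.normal(size=n)
a = rng.binomial(1, 1.0 / (1.0 + np.exp(-2.0 * x)))   # P(A=1|X) in (0,1): positivity
y = 1.0 + 2.0 * x + 3.0 * a + rng.normal(size=n)

# The naive contrast is confounded: treated units also have higher X.
naive = y[a == 1].mean() - y[a == 0].mean()

# Adjustment: fit E[Y | X, A] by least squares, then average its predictions
# under A=1 and A=0 over the empirical distribution of X (the slide's identity).
Z = np.column_stack([np.ones(n), x, a])
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
ey1 = np.mean(beta[0] + beta[1] * x + beta[2] * 1.0)   # estimate of E[Y(1)]
ey0 = np.mean(beta[0] + beta[1] * x + beta[2] * 0.0)   # estimate of E[Y(0)]
adjusted = ey1 - ey0
```

Here `naive` is biased upward (treated units have systematically higher X), while `adjusted` recovers the true effect of 3.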

  33. Observational Traces
 Creatinine is a test used to measure kidney function. The timing between measurements is irregular and random.

  34. Observational Traces
 And so are the times between treatments.

  35. Challenges with Observational Traces
 In the discrete-time setting, we did not treat the timing of events as random.

  36. Counterfactual GP
 • A collection of Gaussian processes: { {Y_t(a) : t ∈ [0, τ]} : a ∈ C }, where τ is a fixed time period and C is the set of finite sequences of actions.
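As a much-simplified sketch of this idea (not the actual model of Schulam and Saria, 2017), one can place a GP prior on the untreated trajectory and add a hypothetical parametric response curve after each action; the kernel, the response shape, and every number below are illustrative assumptions:

```python
import numpy as np

def rbf(s, t, length=2.0, var=25.0):
    """Squared-exponential kernel between time points s and t."""
    return var * np.exp(-0.5 * (s[:, None] - t[None, :]) ** 2 / length ** 2)

def treatment_bump(t, t_action, height=10.0, decay=3.0):
    """Assumed additive response curve, switched on after the action time."""
    dt = t - t_action
    return np.where(dt > 0, height * np.exp(-dt / decay), 0.0)

def cgp_predict(t_obs, y_obs, t_query, t_action=None, mean=80.0, noise=1.0):
    """GP posterior mean of Y at t_query, optionally under a hypothetical action."""
    m_obs = np.full_like(t_obs, mean)
    m_query = np.full_like(t_query, mean)
    if t_action is not None:  # shift the prior mean by the assumed response curve
        m_obs = m_obs + treatment_bump(t_obs, t_action)
        m_query = m_query + treatment_bump(t_query, t_action)
    K = rbf(t_obs, t_obs) + noise * np.eye(len(t_obs))
    Ks = rbf(t_query, t_obs)
    return m_query + Ks @ np.linalg.solve(K, y_obs - m_obs)

# Irregularly timed PFVC-style history, then two counterfactual queries:
t_obs = np.array([0.0, 0.7, 1.1, 2.5, 4.0])
y_obs = np.array([95.0, 92.0, 90.0, 85.0, 82.0])
t_query = np.array([6.0])
y_no_treat = cgp_predict(t_obs, y_obs, t_query)              # E[Y(no action) | h]
y_treat = cgp_predict(t_obs, y_obs, t_query, t_action=4.5)   # E[Y(treat at 4.5) | h]
```

Conditioning on the same history, the two calls answer "what if we do nothing?" and "what if we treat at year 4.5?"; the real CGP additionally learns the response curve and handles the random timing of observations and treatments.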

  37. Learning from Observational Traces
 [Figure: four marker panels (tss, pfvc, pdlco, rvsp), marker value vs. years since diagnosis, annotated with medications: Prednisone, Methotrexate, Cyclophosphamide (Cytoxan)]

  38. Learning from Observational Traces
 [Figure: the same four marker panels (tss, pfvc, pdlco, rvsp) vs. years since diagnosis, with medication annotations]
 Treatments were administered according to an unknown policy (i.e., not an RCT).
