
MACHINE LEARNING FOR HEALTHCARE 6.S897, HST.S53 Lecture 3: Causal inference



  1. MACHINE LEARNING FOR HEALTHCARE 6.S897, HST.S53 Lecture 3: Causal inference Prof. David Sontag MIT EECS, CSAIL, IMES (Thanks to Uri Shalit for many of the slides)

  2. *Last week: Type 2 diabetes [Figure: panels for 1994, 2000, and 2013; legend: <4.5%, 4.5%–5.9%, 6.0%–7.4%, 7.5%–8.9%, >9.0%] Early detection of Type 2 diabetes: (Razavian et al., Big Data, 2016)

  3. *Last week: Discovered risk factors (Razavian et al., Big Data, 2016)
     Highly weighted features, with odds ratio (interval as printed on the slide):
     • Impaired Fasting Glucose (Code 790.21): 4.17 (3.87, 4.49)
     • Abnormal Glucose NEC (790.29): 4.07 (3.76, 4.41)
     • Hypertension (401): 3.28 (3.17, 3.39)
     • Obstructive Sleep Apnea (327.23): 2.98 (2.78, 3.20)
     • Obesity (278): 2.88 (2.75, 3.02)
     • Abnormal Blood Chemistry (790.6): 2.49 (2.36, 2.62)
     • Hyperlipidemia (272.4): 2.45 (2.37, 2.53)
     • Shortness Of Breath (786.05): 2.09 (1.99, 2.19)
     • Esophageal Reflux (530.81): 1.85 (1.78, 1.93)
     Additional Disease Risk Factors Include: Pituitary dwarfism (253.3), Hepatomegaly (789.1), Chronic Hepatitis C (070.54), Hepatitis (573.3), Calcaneal Spur (726.73), Thyrotoxicosis without mention of goiter (242.90), Sinoatrial Node dysfunction (427.81), Acute frontal sinusitis (461.1), Hypertrophic and atrophic conditions of skin (701.9), Irregular menstruation (626.4), …
     [Figure labels: Diabetes, 1-year gap]

  4. Thinking about interventions 1. Do highly weighted features suggest avenues for preventing onset of diabetes? • Example: Gastric bypass surgery. Highest negative weight (9th most predictive feature) • What is the mathematical justification for thinking of highly weighted features in this way? 2. What happens if the patient did not get diabetes because of an intervention made in the gap? • How do we deconvolve the effect of interventions from the prediction task? 3. Solution is to reframe as a causal inference problem: predict for which patients an intervention will reduce chances of getting T2D

  5. Randomized trials vs. observational studies Which treatment works better? A or B

  6. Randomized controlled trial (RCT) [Figure: treatment labels A B A B A A A B A B B A A B A B along a socio-economic class axis, poor to wealthy] Which treatment works better? A or B

  7. Observational study [Figure: treatment labels A A A B A A A A A B A B B B B B along a socio-economic class axis, poor to wealthy] Which treatment works better? A or B

  8. Observational study: socio-economic class is a potential confounder [Figure: treatment labels A A A B A A A A A B A B B B B B along a socio-economic class axis, poor to wealthy] Which treatment works better? A or B
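The confounding problem on these slides can be made concrete with a small deterministic sketch. Everything here is hypothetical and not from the lecture: the baseline outcomes per socio-economic class, the treatment effects, and the unit counts are made-up numbers chosen so that A is truly better than B by 5 units, while poor patients mostly receive A and wealthy patients mostly receive B.

```python
from statistics import mean

# Hypothetical numbers (assumptions for illustration only).
BASELINE = {"poor": 50.0, "wealthy": 70.0}  # outcome without any treatment benefit
EFFECT = {"A": 10.0, "B": 5.0}              # true benefit of each treatment


def outcome(ses, treatment):
    """Outcome of one unit: class baseline plus the treatment's true effect."""
    return BASELINE[ses] + EFFECT[treatment]


def make_units(counts):
    """counts maps (ses, treatment) -> number of units; returns (ses, t, y) tuples."""
    return [(ses, t, outcome(ses, t))
            for (ses, t), n in counts.items()
            for _ in range(n)]


def naive_difference(units):
    """Mean outcome under A minus mean outcome under B, ignoring class."""
    mean_a = mean(y for _, t, y in units if t == "A")
    mean_b = mean(y for _, t, y in units if t == "B")
    return mean_a - mean_b


def stratified_difference(units):
    """Compare A and B within each class, then average weighted by class size."""
    total = 0.0
    for ses in BASELINE:
        stratum = [u for u in units if u[0] == ses]
        total += naive_difference(stratum) * len(stratum)
    return total / len(units)


# Observational data: treatment choice is correlated with class.
observational = make_units({("poor", "A"): 7, ("poor", "B"): 1,
                            ("wealthy", "A"): 1, ("wealthy", "B"): 7})
# Balanced assignment, as randomization would give on average.
balanced = make_units({("poor", "A"): 4, ("poor", "B"): 4,
                       ("wealthy", "A"): 4, ("wealthy", "B"): 4})

print(naive_difference(observational))
print(stratified_difference(observational))
print(naive_difference(balanced))
```

With these made-up numbers the naive comparison on the observational data makes B look better by 10, while comparing within each class, or using the balanced assignment, recovers the true advantage of 5 for A.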

  9. In many fields randomized studies are the gold standard for causal inference, but…

  10. • Does inhaling asbestos cause cancer? • Does decreasing the interest rate reinvigorate the economy? • We have a budget for one new anti-diabetic drug experiment. Can we use past health records of 100,000 diabetics to guide us?

  11. Even randomized controlled trials have flaws • Not personalized – only population effect • Study population might not represent true population • Recruiting is hard • People might drop out of study • Study in one company/hospital/state/country could fail to generalize to others

  12. Example 1 Precision medicine: Individualized Treatment Effect (ITE)

  13. Which treatment is best for me? • Which anti-hypertensive treatment? • Calcium channel blocker (A) • ACE inhibitor (B) • Patient measurements: blood pressure = 150/95, WBC count = 6×10^9/L, temperature = 98°F, HbA1c = 6.6%, thickness of heart artery plaque = 3 mm, weight = 65 kg • Current situation: • Clinical trials • Doctor’s knowledge & intuition • Use datasets of patients and their histories

  14. Which treatment is best for me ? • Which anti-hypertensive treatment? • Calcium channel blocker (A) • ACE inhibitor (B) • Future blood pressure: treatment A vs. B • Individualized Treatment Effect (ITE)

  15. Which treatment is best for me ? • Which anti-hypertensive treatment? • Calcium channel blocker (A) • ACE inhibitor (B) • Potential confounder : maybe rich patients got medication A more often, and poor patients got medication B more often

  16. Example 2 Job training: Average Treatment Effect (ATE)

  17. Should the government fund job-training programs? • Existing job training programs seem to help unemployed and underemployed find better jobs • Should the government fund such programs? • Maybe training helps but only marginally? Is it worth the investment? • Average Treatment Effect (ATE) • Potential confounder : Maybe only motivated people go to job training? Maybe they would have found better jobs anyway?

  18. Observational studies A major challenge in causal inference from observational studies is how to control or adjust for the confounding factors

  19. Counterfactuals and causal inference • Does treatment $T$ cause outcome $Y$? • If $T$ had not occurred, $Y$ would not have occurred (David Hume) • Counterfactuals: Kim received job training ($T$), and her income one year later ($Y$) is $20,000. What would have been Kim’s income had she not had job training?

  20. Counterfactuals and causal inference • Counterfactuals: Kim received job training ($T$), and her income one year later ($Y$) is $20,000. What would have been Kim’s income had she not had job training? • If her income would have been $18,000, we say that job training caused an increase of $2,000 in Kim’s income • The problem: you never know what might have been

  21. Sliding Doors

  22. Potential Outcomes Framework (Rubin-Neyman Causal Model) • Each unit $x_i$ has two potential outcomes: • $Y_0(x_i)$ is the potential outcome had the unit not been treated: “control outcome” • $Y_1(x_i)$ is the potential outcome had the unit been treated: “treated outcome” • Individual Treatment Effect for unit $i$: $ITE(x_i) = \mathbb{E}_{Y_1 \sim p(Y_1 \mid x_i)}[Y_1 \mid x_i] - \mathbb{E}_{Y_0 \sim p(Y_0 \mid x_i)}[Y_0 \mid x_i]$ • Average Treatment Effect: $ATE := \mathbb{E}[Y_1 - Y_0] = \mathbb{E}_{x \sim p(x)}[ITE(x)]$
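As a minimal sketch of the two definitions on this slide, write Y_0 and Y_1 for the control and treated potential outcomes, ITE for the individual treatment effect, and ATE for the average treatment effect. The code below assumes hypothetical, deterministic potential-outcome functions of age (so each conditional expectation is just the function value) and a made-up population of three ages. None of these numbers come from the lecture.

```python
from statistics import mean


def y0(age):
    """Hypothetical control outcome: blood pressure without treatment."""
    return 90.0 + age / 2


def y1(age):
    """Hypothetical treated outcome: treatment lowers pressure more at higher age."""
    return y0(age) - age / 4


def ite(age):
    """Individual treatment effect: treated outcome minus control outcome."""
    return y1(age) - y0(age)


def ate(ages):
    """Average treatment effect: the mean of the ITE over the population."""
    return mean(ite(a) for a in ages)


ages = [40, 60, 80]  # assumed population of covariate values
ite_by_age = {a: ite(a) for a in ages}
average_effect = ate(ages)

print(ite_by_age)
print(average_effect)
```

In this toy setting the ITE differs by age (-10, -15 and -20) while the ATE summarizes the population with the single number -15, which is the distinction the slide draws between individual and average effects.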

  23. Potential Outcomes Framework (Rubin-Neyman Causal Model) • Each unit $x_i$ has two potential outcomes: • $Y_0(x_i)$ is the potential outcome had the unit not been treated: “control outcome” • $Y_1(x_i)$ is the potential outcome had the unit been treated: “treated outcome” • Observed factual outcome: $y_i = t_i Y_1(x_i) + (1 - t_i) Y_0(x_i)$ • Unobserved counterfactual outcome: $y_i^{CF} = (1 - t_i) Y_1(x_i) + t_i Y_0(x_i)$
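The factual and counterfactual formulas can be checked against the Kim example from the earlier slides: she was treated (t = 1), her observed income was $20,000, and the slides suppose her income without training would have been $18,000. The function names below are illustrative, and the sketch pretends both potential outcomes are known, which is never the case in real data.

```python
def factual(t, y0, y1):
    """Observed outcome: y1 if the unit was treated (t=1), y0 otherwise (t=0)."""
    return t * y1 + (1 - t) * y0


def counterfactual(t, y0, y1):
    """Unobserved outcome: what would have happened under the other treatment."""
    return (1 - t) * y1 + t * y0


# Kim received job training; the untreated income is the slide's supposition.
kim = {"t": 1, "y0": 18_000, "y1": 20_000}

observed = factual(**kim)
unobserved = counterfactual(**kim)
effect = kim["y1"] - kim["y0"]

print(observed, unobserved, effect)
```

Flipping `t` to 0 swaps which of the two values is observed, but the effect `y1 - y0` is the same quantity either way.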

  24. Terminology • Unit : data point, e.g. patient, customer, student • Treatment : binary indicator (in this tutorial) Also called intervention • Treated : units who received treatment=1 • Control : units who received treatment=0 • Factual : the set of observed units with their respective treatment assignment • Counterfactual : the factual set with flipped treatment assignment

  25. Example – Blood pressure and age [Figure: axes $y$ = blood_pres. and $x$ = age; curves $Y_1(x)$ and $Y_0(x)$; legend: Treated]

  26. Blood pressure and age [Figure: same axes and curves, with $ITE(x)$ marked]

  27. Blood pressure and age [Figure: same axes and curves, with $ATE$ marked]

  28. Blood pressure and age [Figure: same axes and curves, with Treated and Control units shown]

  29. Blood pressure and age [Figure: same axes and curves, with Treated, Control, Counterfactual treated, and Counterfactual control units shown]

  30. “The fundamental problem of causal inference”: We only ever observe one of the two outcomes

  31. “The Assumptions” – no unmeasured confounders $Y_0, Y_1$: potential outcomes for control and treated; $x$: unit covariates (features); $T$: treatment assignment. We assume: $(Y_0, Y_1) \perp\!\!\!\perp T \mid x$. The potential outcomes are independent of treatment assignment, conditioned on covariates $x$

  32. “The Assumptions” – no unmeasured confounders $Y_0, Y_1$: potential outcomes for control and treated; $x$: unit covariates (features); $T$: treatment assignment. We assume: $(Y_0, Y_1) \perp\!\!\!\perp T \mid x$ (Ignorability)
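One way to see why ignorability matters is the standard identity below, which is a textbook consequence and not something shown on these slides. It writes $y$ for the observed outcome, $T$ for treatment assignment, $x$ for covariates, and $Y_0, Y_1$ for the potential outcomes, and it also assumes that both treatment arms have positive probability at every $x$, so the conditional expectations are defined.

```latex
\begin{align*}
\mathbb{E}[Y_1 \mid x]
  &= \mathbb{E}[Y_1 \mid x, T = 1]
  && \text{(ignorability: } (Y_0, Y_1) \perp\!\!\!\perp T \mid x \text{)} \\
  &= \mathbb{E}[y \mid x, T = 1]
  && \text{(the observed outcome equals } Y_1 \text{ when } T = 1 \text{)} \\
\mathbb{E}[Y_0 \mid x]
  &= \mathbb{E}[y \mid x, T = 0]
  && \text{(same argument with } T = 0 \text{)} \\
\mathrm{ITE}(x)
  &= \mathbb{E}[y \mid x, T = 1] - \mathbb{E}[y \mid x, T = 0] \\
\mathrm{ATE}
  &= \mathbb{E}_{x \sim p(x)}\big[\, \mathbb{E}[y \mid x, T = 1] - \mathbb{E}[y \mid x, T = 0] \,\big]
\end{align*}
```

Every term on the right-hand side involves only observed quantities, which is what makes estimation from observational data possible when the assumption holds.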

  33. Ignorability [Figure: graph with nodes $x$ (covariates / features), $T$ (treatment), and $Y_0$, $Y_1$ (potential outcomes)] $(Y_0, Y_1) \perp\!\!\!\perp T \mid x$

  34. Ignorability [Figure: same graph with $x$ = age, gender, weight, diet, heart rate at rest, …; $T$ = anti-hypertensive medication; $Y_0$ = blood pressure after medication A; $Y_1$ = blood pressure after medication B] $(Y_0, Y_1) \perp\!\!\!\perp T \mid x$

  35. No Ignorability [Figure: same graph with an additional variable $h$ = diabetic that is not among the covariates $x$] $(Y_0, Y_1) \not\perp\!\!\!\perp T \mid x$

  36. “The Assumptions” – common support $Y_0, Y_1$: potential outcomes for control and treated; $x$: unit covariates (features); $T$: treatment assignment. We assume: $p(T = t \mid X = x) > 0 \;\; \forall t, x$
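A rough empirical check of common support, assuming a discrete covariate and binary treatment, is to count how many units in each covariate stratum received each treatment. The function names and the toy data below are hypothetical. A zero count in a sample only suggests a violation; it does not prove that the true probability is zero.

```python
from collections import Counter


def empirical_propensity(units):
    """Return {(x, t): fraction of units with covariate x that received t}."""
    pair_counts = Counter(units)
    stratum_counts = Counter(x for x, _ in units)
    return {(x, t): n / stratum_counts[x] for (x, t), n in pair_counts.items()}


def overlap_violations(units, treatments=(0, 1)):
    """List (x, t) pairs where no unit with covariate x received treatment t."""
    pair_counts = Counter(units)
    strata = sorted({x for x, _ in units})
    return [(x, t) for x in strata for t in treatments
            if pair_counts[(x, t)] == 0]


# Toy data: (covariate stratum, treatment) pairs; no old unit is in control.
units = [("young", 0), ("young", 1), ("young", 1), ("old", 1), ("old", 1)]

propensity = empirical_propensity(units)
violations = overlap_violations(units)

print(propensity)
print(violations)
```

Here the stratum "old" has no control units, so any comparison of treated and control outcomes for old patients would rest entirely on extrapolation.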

  37. Average Treatment Effect: the expected causal effect of $T$ on $Y$: $ATE := \mathbb{E}[Y_1 - Y_0]$
