Causal inference: challenges for health data analysts
Dr Jeremy Wyatt DM FRCP, Professor of Digital Healthcare, University of Southampton; Clinical Advisor on New Technologies, Royal College of Physicians
IMAGE: KACPER PEMPEL

Asthmopolis

Some advantages of big health data & “real world evidence”
• Datasets 100-1000 times larger than for RCTs, so can examine patient subgroups
• Data captured from routine care, so more representative / pragmatic
• Wider variety of data items, so can answer more questions, e.g. on side effects, effect modifiers
• Uses existing data, so quicker to start up and cheaper to answer questions (but EPIC in Cambridge cost £200M + 1-2 years of lower Care Quality Commission ratings)
Sherman et al. FDA view on RWE. N Engl J Med 2016
Lars Hemkens, Ioannidis et al. Routinely collected data, promises & limitations. CMAJ 2016

Concerns about making inferences from routine data
https://utmost.org/going-through-spiritual-confusion/

Simpson’s Paradox: mortality in diabetes
[Figure: mortality compared between Type 1 and Type 2 diabetes; surviving labels: 64% of 358 (Type 1), 97% of 544 (Type 2)]
Data from Poole Diabetes cohort, cited by Julious et al. BMJ 1994
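The reversal that defines Simpson's paradox can be reproduced with a few lines of arithmetic. The sketch below uses hypothetical counts (they are not the Poole cohort data) and hypothetical names (`group_A`, `group_B`, `younger`, `older`): the crude comparison favours one group, while the comparison within each stratum favours the other.

```python
# Hypothetical (deaths, patients) counts per group and stratum.
# These numbers are invented for illustration; they are NOT the Poole data.
counts = {
    "group_A": {"younger": (4, 200), "older": (66, 150)},
    "group_B": {"younger": (1, 100), "older": (200, 500)},
}


def crude_rate(strata):
    """Mortality ignoring the stratifying variable."""
    deaths = sum(d for d, _ in strata.values())
    patients = sum(n for _, n in strata.values())
    return deaths / patients


# Crude mortality per group, and mortality per group within each stratum.
crude = {group: crude_rate(strata) for group, strata in counts.items()}
by_stratum = {
    group: {name: d / n for name, (d, n) in strata.items()}
    for group, strata in counts.items()
}

for group in counts:
    print(group, "crude:", round(crude[group], 3), "by stratum:", by_stratum[group])
```

In this invented example group A is mostly made up of low-risk "younger" patients, so its crude mortality looks lower even though it is higher in both strata.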

Association vs. causation: Rochester library study
Study question: is hospital length of stay (LOS) shorter in patients whose doctors used the Rochester NY library?
Method: compared LOS in patients of library-using doctors vs. patients of doctors who did not use it (case-control)
Result: LOS was 1 day shorter for library-using doctors; the savings would easily pay for the library!
Possible interpretations:
a) Library use is the cause of reduced LOS
b) Library use is a marker of doctors who keep their patients in hospital for less time
c) Library use results from doctors keeping patients in hospital for less time!
A better question: what is the impact on LOS of providing a sample of doctors with access to the library?

Confounding by indication
• 40% of cancer patients treated with a new drug survive 5 years versus 30% of patients treated with the old drug
• The difference persists despite taking account of differences in age, baseline cancer severity, genetic markers…
• Conclusion: the new drug reduces 5-year mortality by 10 percentage points
• But maybe allocation to the new drug depends on the doctor’s intuition about who will survive (a subtle predictive feature not recorded in any database)
• So, receipt of the new drug is a marker of better outcome, not the cause
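A small simulation can make this mechanism concrete. The sketch below is built entirely on assumptions: a hypothetical unrecorded feature (`frail`), invented allocation and survival probabilities, and a new drug that has no effect at all. The naive comparison still shows better survival on the new drug, and the gap disappears only when the normally unavailable feature is used to stratify.

```python
import random

# All probabilities below are invented for illustration.
rng = random.Random(0)
rows = []
for _ in range(40000):
    frail = rng.random() < 0.5                    # unrecorded prognostic feature
    p_new = 0.2 if frail else 0.8                 # doctor's intuition drives allocation
    new_drug = rng.random() < p_new
    p_survive = 0.2 if frail else 0.5             # survival depends on frailty only
    survived = rng.random() < p_survive           # the drug itself has NO effect
    rows.append((frail, new_drug, survived))


def survival(rows, new_drug, frail=None):
    """5-year survival among patients on the given drug, optionally within one stratum."""
    sel = [s for f, d, s in rows if d == new_drug and (frail is None or f == frail)]
    return sum(sel) / len(sel)


# Naive comparison: new drug vs old drug, ignoring the hidden feature.
naive_diff = survival(rows, True) - survival(rows, False)

# Comparison within strata of the hidden feature (impossible with real routine data).
stratified_diffs = {
    frail: survival(rows, True, frail) - survival(rows, False, frail)
    for frail in (False, True)
}

print("naive difference in survival:", round(naive_diff, 3))
print("difference within strata:", {k: round(v, 3) for k, v in stratified_diffs.items()})
```

Adjusting for recorded covariates such as age or severity would not remove the naive difference here, because the feature driving both allocation and outcome is absent from the dataset.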

The impact of bias on estimating mortality for ezetimibe in 2233 post-MI deaths (all-cause mortality)
[Figure: hazard ratio for death (axis 0 to 1.2) for the ezetimibe and intensified statin groups compared to the simvastatin group, under three analyses: Cox model; propensity scoring; further modelling, e.g. first incident MI, missing cholesterol levels, medication covariates]
Source: Pauriah et al. Ezetimibe Use and Mortality in Survivors of an Acute Myocardial Infarction: A Population-based Study. Heart 2014

Estimating causality from big health data: some possible solutions
Understand & quantify the biases & apply expertise in relevant analytical methods:
• life course epidemiology
• multi-level modelling
• functional data analysis for intermittent monitoring data
• case-crossover design (Farrington)
• mediation and Rubin causal modelling
• instrumental variable analysis, e.g. regression discontinuity

Regression discontinuity design
• Some drugs / procedures are used according to a threshold in a continuous variable, e.g. test result or predicted risk
• But due to measurement error, people just above & just below an allocation threshold are very similar
• So, if you have enough people to compare, you can estimate the impact of the intervention, just like an RCT…
Thistlethwaite & Campbell, 1960
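The idea can be sketched on simulated data. The example below assumes a sharp design (everyone at or above the threshold is treated, nobody below it), a hypothetical score from 0 to 10 with a cutoff of 5, an invented true effect, and a simple local linear fit on each side of the cutoff. The names `fit_line` and `rdd_estimate` are illustrative helpers, not part of any library.

```python
import random

TRUE_EFFECT = 2.0   # invented treatment effect used to generate the data
CUTOFF = 5.0        # hypothetical allocation threshold on the score


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def rdd_estimate(data, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff, using points within the bandwidth."""
    below = [(x - cutoff, y) for x, y in data if cutoff - bandwidth <= x < cutoff]
    above = [(x - cutoff, y) for x, y in data if cutoff <= x <= cutoff + bandwidth]
    intercept_below, _ = fit_line(*zip(*below))
    intercept_above, _ = fit_line(*zip(*above))
    # With x centred on the cutoff, each intercept is the fitted value at the threshold.
    return intercept_above - intercept_below


# Simulate: outcome rises with the score, plus a jump for treated people.
rng = random.Random(1)
data = []
for _ in range(10000):
    score = rng.uniform(0, 10)
    treated = score >= CUTOFF
    outcome = 1.0 + 0.3 * score + (TRUE_EFFECT if treated else 0.0) + rng.gauss(0, 1)
    data.append((score, outcome))

estimate = rdd_estimate(data, CUTOFF, bandwidth=1.0)
print("estimated effect at the threshold:", round(estimate, 2))
```

The estimate applies only to people near the threshold, and it relies on allocation actually following the threshold in practice.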

Our attempted RDD study in 45,000 Scottish women with breast cancer
• NHS Predict score is an accurate, well calibrated algorithm for predicting p(Response|Chemotherapy)
• NICE: doctors should usually offer women chemotherapy when p(R|C) >5%, be reluctant to give it if <3% and discuss it with the woman if 3-5%
• However, this is what happens in Scotland:
Gray, Hall, Marti, Brewster, Wyatt, to be submitted. Funded by CSO Scotland

Beware: non-randomised study designs are associated with replication failure!

Intervention studied | Original study design | Claim from original study | Findings from later studies / SRs
--- | --- | --- | ---
Post menopausal HRT | Non randomised | Prevents CAD & stroke | Ineffective
Vitamin E | RCT | 1° CAD prevention | Ineffective
Vitamin E | Non randomised | 2° CAD prevention | Ineffective
Inhaled nitric oxide | Non randomised | Treats ARDS | Ineffective
Endotoxin antibodies | Non randomised | Treats gram neg sepsis | Ineffective
Flavonoids | Non randomised | Prevents CAD | Effect smaller
Carotid endarterectomy | Non randomised | Treats high grade stenosis | Effect smaller
Coronary stent vs. PTCA | Non randomised | Treats CAD | Effect smaller
Zidovudine | Non randomised | Treats HIV infection | Effect smaller

Ioannidis et al. Contradicted and initially stronger effects in highly cited clinical research. JAMA 2005 [original articles with 1000+ citations, 1990-2003]

Conclusions
• We must use routine health data to improve patient safety, target interventions, evaluate process innovations and create the “Learning Health System”
• But it’s often hard to know if our data is biased or lacks key unmeasured variables
• Propensity scoring can help sometimes, but not at other times
• More research is needed to understand when we can trust the results of PS, RDD and other inferential methods
