Doubly robust treatment effect estimation with missing attributes

Doubly robust treatment effect estimation with missing attributes. Effect of tranexamic acid on mortality of patients with traumatic brain injury. Imke Mayer, Julie Josse, Stefan Wager, Tobias Gauss, Jean-Denis Moyer. Group EHESS; Ecole


  1. Doubly robust treatment effect estimation with missing attributes
     Effect of tranexamic acid on mortality of patients with traumatic brain injury
     Imke Mayer, Julie Josse, Stefan Wager, Tobias Gauss, Jean-Denis Moyer
     Group EHESS; École Polytechnique; Stanford Business School; Traumabase®
     Statistique, Mathématique et Applications, Fréjus, 3 Sept. 2019

  2. Introduction

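  3. Traumabase
     • 20,000 patients
     • 250 continuous and categorical variables: heterogeneous
     • 16 hospitals: multilevel data
     • 4,000 new patients/year

     | Center  | Accident | Age | Sex | Weight | Lactates | BP  | Shock | ... |
     |---------|----------|-----|-----|--------|----------|-----|-------|-----|
     | Beaujon | fall     | 54  | m   | 85     | NM       | 180 | yes   |     |
     | Pitie   | gun      | 26  | m   | NR     | NA       | 131 | no    |     |
     | Beaujon | moto     | 63  | m   | 80     | 3.9      | 145 | yes   |     |
     | Pitie   | moto     | 30  | w   | NR     | Imp      | 107 | no    |     |
     | HEGP    | knife    | 16  | m   | 98     | 2.5      | 118 | no    |     |
     | ...     |          |     |     |        |          |     |       |     |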

  4. Traumabase (same overview and data excerpt as the previous slide)
     ⇒ Estimate causal effect: administration of the treatment "tranexamic acid" (within 3 hours after the accident) on the outcome mortality for traumatic brain injury patients.

  5. Missing values
     [Figure: bar chart of the percentage of missing values (0 to 100%) per variable, from AIS.external to Temperature.min, split by type of missingness: Impossible, Not Applicable, Not made, Not Informed, NA.]

  6. Causal inference: classical framework

  7. Potential outcome framework (Neyman, 1923; Rubin, 1974)
     Causal effect: binary treatment $w \in \{0, 1\}$ on the $i$-th individual with potential outcomes $Y_i(1)$ and $Y_i(0)$. Individual causal effect of the treatment: $\Delta_i = Y_i(1) - Y_i(0)$
     • Problem: $\Delta_i$ is never observed (only one outcome is observed per individual). Causal inference as a missing value problem?

     Columns: covariates $X_1, X_2, X_3$; treatment $W$; outcomes $Y(0), Y(1)$.

     | $X_1$ | $X_2$ | $X_3$ | $W$ | $Y(0)$ | $Y(1)$ |
     |-------|-------|-------|-----|--------|--------|
     | 1.1   | 20    | F     | 1   | NA     | T      |
     | -6    | 45    | F     | 0   | F      | NA     |
     | 0     | 15    | M     | 1   | NA     | F      |
     | ...   | ...   | ...   | ... | ...    | ...    |
     | -2    | 52    | M     | 0   | T      | NA     |

  8. Potential outcome framework (Neyman, 1923; Rubin, 1974)
     Causal effect: binary treatment $w \in \{0, 1\}$ on the $i$-th individual with potential outcomes $Y_i(1)$ and $Y_i(0)$. Individual causal effect of the treatment: $\Delta_i = Y_i(1) - Y_i(0)$
     • Problem: $\Delta_i$ is never observed (only one outcome is observed per individual). Causal inference as a missing value problem?
     • Average treatment effect (ATE) $\tau = E[\Delta_i] = E[Y_i(1) - Y_i(0)]$: the ATE is the difference between the average outcome had everyone been treated and the average outcome had nobody been treated.
     ⇒ First solution: estimate $\tau$ with randomized controlled trials (RCT).
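The slide does not spell out why an RCT solves the problem. A short sketch of the standard argument, assuming that randomization means $\{Y_i(0), Y_i(1)\}$ is independent of $W_i$ and that the observed outcome is $Y_i = Y_i(W_i)$:

```latex
\begin{align*}
E[Y_i \mid W_i = 1] - E[Y_i \mid W_i = 0]
  &= E[Y_i(1) \mid W_i = 1] - E[Y_i(0) \mid W_i = 0]
     && \text{(observed outcome is } Y_i(W_i)\text{)} \\
  &= E[Y_i(1)] - E[Y_i(0)]
     && \text{(randomization: } \{Y_i(0), Y_i(1)\} \perp\!\!\!\perp W_i\text{)} \\
  &= \tau .
\end{align*}
```

Under randomization, the simple difference in group means therefore targets the ATE; without it, the second equality can fail.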

  9. Observational data
     Non-random assignment → confounding
     Mortality rate: 20% overall, 38% among treated, 16% among not treated: does the treatment kill?

     |                     | survived    | deceased  | Pr(survived \| treatment) | Pr(deceased \| treatment) |
     |---------------------|-------------|-----------|---------------------------|---------------------------|
     | TA not administered | 2,167 (68%) | 399 (13%) | 0.84                      | 0.16                      |
     | TA administered     | 374 (12%)   | 228 (7%)  | 0.62                      | 0.38                      |

     Table 1: Occurrence and frequency table for traumatic brain injury patients (total number: 3,168).
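As a quick check of the arithmetic behind Table 1, the following sketch recomputes the mortality rates from the four counts on the slide. The variable names are illustrative only; the resulting naive difference is a raw comparison of groups, not a causal effect.

```python
# Counts taken from Table 1 of the slide.
counts = {
    "TA not administered": {"survived": 2167, "deceased": 399},
    "TA administered": {"survived": 374, "deceased": 228},
}

def mortality(row):
    """Share of deceased patients within one treatment group."""
    return row["deceased"] / (row["survived"] + row["deceased"])

total = sum(sum(row.values()) for row in counts.values())
overall = sum(row["deceased"] for row in counts.values()) / total
m_treated = mortality(counts["TA administered"])
m_control = mortality(counts["TA not administered"])

# Naive (confounded) comparison: difference of raw mortality rates.
naive_diff = m_treated - m_control

print(f"total patients:       {total}")
print(f"overall mortality:    {overall:.2f}")
print(f"mortality, treated:   {m_treated:.2f}")
print(f"mortality, untreated: {m_control:.2f}")
print(f"naive difference:     {naive_diff:.2f}")
```

The positive naive difference is what the slide's question alludes to: because treatment was not assigned at random, this gap mixes the treatment effect with differences between the patients who did and did not receive it.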

  10. Unconfoundedness and the propensity score
      Assumptions
      • $n$ i.i.d. samples $(X_i, Y_i, W_i)$,
      • $Y_i = W_i Y_i(1) + (1 - W_i) Y_i(0)$ (SUTVA)
      • Treatment assignment is random conditionally on $X_i$: $\{Y_i(0), Y_i(1)\} \perp\!\!\!\perp W_i \mid X_i$ ≡ unconfoundedness assumption.
      Propensity score and overlap assumption
      $$e(x) \triangleq P(W_i = 1 \mid X_i = x) \quad \forall x \in \mathcal{X}.$$
      We will assume overlap, i.e. $0 < e(x) < 1 \ \forall x \in \mathcal{X}$.
      Key property: $e$ is a balancing score, i.e. under unconfoundedness, it satisfies $\{Y_i(0), Y_i(1)\} \perp\!\!\!\perp W_i \mid e(X_i)$.

  11. Propensity based estimators
      Inverse Propensity Weighted estimator
      $$\hat{\tau}_{IPW} \triangleq \frac{1}{n} \sum_{i=1}^{n} \left( \frac{W_i Y_i}{\hat{e}(X_i)} - \frac{(1 - W_i) Y_i}{1 - \hat{e}(X_i)} \right)$$

  12. Propensity based estimators
      Inverse Propensity Weighted estimator
      $$\hat{\tau}_{IPW} \triangleq \frac{1}{n} \sum_{i=1}^{n} \left( \frac{W_i Y_i}{\hat{e}(X_i)} - \frac{(1 - W_i) Y_i}{1 - \hat{e}(X_i)} \right)$$
      Augmented IPW: a doubly robust estimator
      Define $\mu_{(w)}(x) := E[Y_i(w) \mid X_i = x]$.
      $$\hat{\tau}_{AIPW} := \frac{1}{n} \sum_{i=1}^{n} \left( \hat{\mu}_{(1)}(X_i) - \hat{\mu}_{(0)}(X_i) + W_i \frac{Y_i - \hat{\mu}_{(1)}(X_i)}{\hat{e}(X_i)} - (1 - W_i) \frac{Y_i - \hat{\mu}_{(0)}(X_i)}{1 - \hat{e}(X_i)} \right)$$
      is consistent if either the $\hat{\mu}_{(w)}(x)$ are consistent or $\hat{e}(x)$ is consistent.
      ⇒ The AIPW has better statistical properties than IPW (Robins et al., 1994; Chernozhukov et al., 2018).
      ⇒ Possibility to use any (machine learning) procedure such as random forests, deep nets, etc. to estimate $\hat{e}(x)$ and $\hat{\mu}_{(w)}(x)$ without harming the interpretability of the causal effect estimation. R package grf (Athey et al., 2019)
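To make the two formulas concrete, here is a minimal sketch of IPW and AIPW on simulated data. Everything about the simulation is an assumption made for illustration (the data-generating process, the true effect `TAU = 2.0`, the function names), and the nuisance functions are plugged in as known "oracle" functions rather than estimated as they would be in practice, for instance with the grf package mentioned on the slide. The last two calls illustrate double robustness by deliberately breaking one nuisance function at a time.

```python
import math
import random
from statistics import mean

def ipw(data, e_hat):
    """Inverse propensity weighted estimate; data is a list of (x, w, y)."""
    return mean(w * y / e_hat(x) - (1 - w) * y / (1 - e_hat(x))
                for x, w, y in data)

def aipw(data, e_hat, mu1_hat, mu0_hat):
    """Augmented IPW estimate from propensity and outcome-model functions."""
    terms = []
    for x, w, y in data:
        e, m1, m0 = e_hat(x), mu1_hat(x), mu0_hat(x)
        terms.append(m1 - m0 + w * (y - m1) / e - (1 - w) * (y - m0) / (1 - e))
    return mean(terms)

# --- Hypothetical simulation (not from the slides) ---
TAU = 2.0  # true average treatment effect in this toy example

def e_true(x):
    """True propensity score: treatment is more likely for large covariates."""
    return 1.0 / (1.0 + math.exp(-(0.8 * x[0] + 0.4 * x[1])))

def mu0_true(x):
    """True mean outcome without treatment."""
    return 2.0 * x[0] + x[1]

def mu1_true(x):
    """True mean outcome with treatment."""
    return mu0_true(x) + TAU

random.seed(0)
data = []
for _ in range(20000):
    x = (random.gauss(0, 1), random.gauss(0, 1))
    w = 1 if random.random() < e_true(x) else 0
    y = (mu1_true(x) if w else mu0_true(x)) + random.gauss(0, 1)
    data.append((x, w, y))

# Confounded comparison of raw group means.
naive = (mean(y for _, w, y in data if w == 1)
         - mean(y for _, w, y in data if w == 0))

tau_ipw = ipw(data, e_true)
tau_aipw = aipw(data, e_true, mu1_true, mu0_true)
# Double robustness: one nuisance function wrong, the other correct.
tau_aipw_wrong_e = aipw(data, lambda x: 0.5, mu1_true, mu0_true)
tau_aipw_wrong_mu = aipw(data, e_true, lambda x: 0.0, lambda x: 0.0)

for name, value in [("naive", naive), ("IPW", tau_ipw), ("AIPW", tau_aipw),
                    ("AIPW, wrong e", tau_aipw_wrong_e),
                    ("AIPW, wrong mu", tau_aipw_wrong_mu)]:
    print(f"{name:>15}: {value:.3f}")
```

With the outcome models set to zero, the AIPW formula reduces to the IPW formula, so the last estimate coincides with `tau_ipw`. The sketch omits refinements such as cross-fitting and any variance estimate.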

  13. Causal inference: with missing attributes?

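  14. Unconfoundedness with missing attributes?
      Without any changes to the previous framework, the only straightforward, but generally biased, solution is complete-case analysis.

      Columns: covariates $X_1, X_2, X_3$; treatment $W$; outcomes $Y(0), Y(1)$.

      | $X_1$ | $X_2$ | $X_3$ | $W$ | $Y(0)$ | $Y(1)$ |
      |-------|-------|-------|-----|--------|--------|
      | NA    | 20    | F     | 1   | NA     | T      |
      | -6    | 45    | NA    | 0   | F      | NA     |
      | 0     | NA    | M     | 1   | NA     | F      |
      | NA    | 32    | F     | 1   | NA     | T      |
      | 1     | 63    | M     | 1   | NA     | F      |
      | -2    | NA    | M     | 0   | T      | NA     |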

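  15. Unconfoundedness with missing attributes?
      Without any changes to the previous framework, the only straightforward, but generally biased, solution is complete-case analysis.

      Columns: covariates $X_1, X_2, X_3$; treatment $W$; observed outcome $Y$.

      | $X_1$ | $X_2$ | $X_3$ | $W$ | $Y$ |
      |-------|-------|-------|-----|-----|
      | NA    | 20    | F     | 1   | T   |
      | -6    | 45    | NA    | 0   | F   |
      | 0     | NA    | M     | 1   | F   |
      | NA    | 32    | F     | 1   | T   |
      | 1     | 63    | M     | 1   | F   |
      | -2    | NA    | M     | 0   | T   |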

  17. Unconfoundedness with missing attributes?
      Without any changes to the previous framework, the only straightforward, but generally biased, solution is complete-case analysis.
      → Often not a good idea! What are the alternatives?
      Two families of methods
      • Unconfoundedness despite missingness
      • Classical missing values mechanisms (MCAR, MAR, MNAR; Rubin, 1976)

  18. Unconfoundedness with missing attributes?
      Unconfoundedness despite missingness
      Adapt the initial assumptions such that treatment assignment is unconfounded given only the observed information, that is, the observed covariates and the response pattern.

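  19. Unconfoundedness with missing attributes?
      Notations
      • response pattern $R \in \{\mathrm{NA}, 1\}^p$, $R_j \triangleq 1\{X_j \text{ is observed}\} + \mathrm{NA}\, 1\{X_j \text{ is missing}\}$,
      • $X^* = R \odot X \in \{\mathbb{R} \cup \mathrm{NA}\}^p$
      Unconfoundedness despite missingness
      Treatment is unconfounded given $X^*$:
      $$\{Y_i(1), Y_i(0)\} \perp\!\!\!\perp W_i \mid X_i^*, \qquad (1)$$
      or alternatively:
      $$\{Y_i(1), Y_i(0)\} \perp\!\!\!\perp W_i \mid X_i, R_i \quad \text{and} \quad \begin{cases} \text{CIT: } W_i \perp\!\!\!\perp X_i \mid X_i^*, R_i \\ \quad \text{or} \\ \text{CIO: } Y_i(t) \perp\!\!\!\perp X_i \mid X_i^*, R_i \ \text{ for } t \in \{0, 1\} \end{cases} \qquad (2)$$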
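The notation for the response pattern $R$ and the masked covariates $X^* = R \odot X$ can be illustrated with a few lines of code. This is only a sketch: the complete matrix `X_full` and the indicators `observed` are hypothetical (in real data only $X^*$ is available), and NA is represented by a floating-point NaN so that the element-wise product propagates missingness.

```python
import math

NA = float("nan")  # NaN stands in for NA; NaN * x is NaN for any x

# Hypothetical complete covariates and observation indicators (p = 2).
X_full = [[1.1, 20.0], [-6.0, 45.0], [0.0, 15.0]]
observed = [[True, True], [True, False], [False, True]]

# Response pattern R in {NA, 1}^p: 1 where observed, NA where missing.
R = [[1.0 if is_obs else NA for is_obs in row] for row in observed]

# Masked covariates X* = R ⊙ X (element-wise product).
X_star = [[r * x for r, x in zip(r_row, x_row)]
          for r_row, x_row in zip(R, X_full)]

for row in X_star:
    print(["NA" if math.isnan(v) else v for v in row])
```

Observed entries are left unchanged by the multiplication by 1, while missing entries become NA regardless of the underlying value.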

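  20. Unconfoundedness with missing attributes?
      Unconfoundedness despite missingness
      Treatment is unconfounded given $X^*$:
      $$\{Y_i(1), Y_i(0)\} \perp\!\!\!\perp W_i \mid X_i^*, \qquad (1)$$
      or alternatively:
      $$\{Y_i(1), Y_i(0)\} \perp\!\!\!\perp W_i \mid X_i, R_i \quad \text{and} \quad \begin{cases} \text{CIT: } W_i \perp\!\!\!\perp X_i \mid X_i^*, R_i \\ \quad \text{or} \\ \text{CIO: } Y_i(t) \perp\!\!\!\perp X_i \mid X_i^*, R_i \ \text{ for } t \in \{0, 1\} \end{cases} \qquad (2)$$
      [Figure: causal graphs over the nodes $X$, $X^*$, $R$, $W$ and $Y(w)$ for (a) CIT and (b) CIO.]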

  21. Generalized propensity score and random forests
      Generalized propensity score (Rosenbaum and Rubin, 1984)
      $$e^*(X^*) = P(W = 1 \mid X^*).$$
      → Makes it possible to balance treatment and control groups on the observed information $X^*$ when values are missing, under assumption (1).

  22. Generalized propensity score and random forests
      Generalized propensity score (Rosenbaum and Rubin, 1984)
      $$e^*(X^*) = P(W = 1 \mid X^*).$$
      → Makes it possible to balance treatment and control groups on the observed information $X^*$ when values are missing, under assumption (1).
      → Random forests can incorporate missing values directly since they allow semi-discrete variables (e.g. $X^* \in \{\mathbb{R} \cup \mathrm{NA}\}^p$).
      → With a specific representation/encoding of missing values (MIA), splits are possible either on observed variables or on the response pattern (Josse et al., 2019).

  23. Generalized propensity score and random forests
      Generalized propensity score (Rosenbaum and Rubin, 1984)
      $$e^*(X^*) = P(W = 1 \mid X^*).$$
      → Random forests can incorporate missing values directly since they allow semi-discrete variables (e.g. $X^* \in \{\mathbb{R} \cup \mathrm{NA}\}^p$).
      → With a specific representation/encoding of missing values (MIA), splits are possible either on observed variables or on the response pattern (Josse et al., 2019).
      → Recursively find the partition that minimizes the empirical risk. For every covariate $X_j$ and threshold $z$, there are three possibilities:
      • $\{X_j^* \le z \text{ or } X_j^* = \mathrm{NA}\}$ vs $\{X_j^* > z\}$
      • $\{X_j^* \le z\}$ vs $\{X_j^* > z \text{ or } X_j^* = \mathrm{NA}\}$
      • $\{X_j^* \ne \mathrm{NA}\}$ vs $\{X_j^* = \mathrm{NA}\}$
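The three candidate partitions can be sketched for a single covariate as follows. This is an illustrative toy, not the implementation used by any particular forest library: the function names are made up, NA is encoded as NaN, and the empirical risk is assumed to be the within-node sum of squared errors.

```python
import math

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty node)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def mia_candidates(x):
    """Yield (description, threshold, goes_left flags) for the MIA splits."""
    missing = [math.isnan(v) for v in x]
    # Split on the response pattern only: missing vs observed.
    yield "NA vs observed", None, missing
    for z in sorted({v for v in x if not math.isnan(v)}):
        below = [(not m) and v <= z for v, m in zip(x, missing)]
        # Missing values are sent to the left child ...
        yield "x <= z or NA vs x > z", z, [b or m for b, m in zip(below, missing)]
        # ... or to the right child.
        yield "x <= z vs x > z or NA", z, below

def best_mia_split(x, y):
    """Return (risk, description, threshold) of the best non-trivial split."""
    best = None
    for name, z, left in mia_candidates(x):
        y_left = [yi for yi, go_left in zip(y, left) if go_left]
        y_right = [yi for yi, go_left in zip(y, left) if not go_left]
        if not y_left or not y_right:
            continue  # skip splits that leave one child empty
        risk = sse(y_left) + sse(y_right)
        if best is None or risk < best[0]:
            best = (risk, name, z)
    return best

nan = float("nan")
# Missingness itself is informative: the best split isolates the NA rows.
split_a = best_mia_split([1.0, 2.0, 3.0, nan, nan, 4.0],
                         [0.0, 0.0, 0.0, 5.0, 5.0, 0.0])
# Missing rows behave like small values: NA is grouped with x <= 2.
split_b = best_mia_split([1.0, 2.0, nan, 3.0, 4.0, nan],
                         [0.0, 0.0, 0.0, 1.0, 1.0, 0.0])
print(split_a)
print(split_b)
```

In the first toy data set the outcome depends only on whether the covariate is missing, so the response-pattern split wins; in the second, the missing rows resemble the low-valued ones and are routed to the same child.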
