Identification of Causal Effect in the Presence of Selection Bias
Juan D. Correa, Jin Tian, Elias Bareinboim
AAAI, Honolulu, 2019
Challenge 1: Confounding Bias
What's the causal effect of Exercise on Cholesterol? What about $P(\text{cholesterol} \mid \text{exercise})$?
[Figure: causal diagram over Age, Exercise, and Cholesterol; scatter plot of Cholesterol against Exercise (Hours)]
[Figure: the scatter plot of Cholesterol against Exercise (Hours), with the points grouped by age: Age 10, Age 20, Age 30, Age 40, Age 50]
$P(\text{cholesterol} \mid do(\text{exercise})) \neq P(\text{cholesterol} \mid \text{exercise})$
This difference is called Confounding Bias.
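The gap between the two quantities can be illustrated with a small simulation. This is only a sketch under assumed numbers: the linear model, its coefficients (a true effect of -5 per hour of exercise), and the names `simulate` and `slope` are hypothetical and do not come from the slides. Age drives both Exercise and Cholesterol, so the observational slope has the opposite sign from the slope under $do(\text{exercise})$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def simulate(randomize_exercise):
    """Draw n units from an assumed linear model:
    Age -> Exercise, Age -> Cholesterol, Exercise -> Cholesterol."""
    age = rng.choice([10, 20, 30, 40, 50], size=n)
    if randomize_exercise:
        # do(exercise): hours are assigned at random, which cuts Age -> Exercise
        exercise = rng.uniform(0.0, 8.0, size=n)
    else:
        # observational regime: older people exercise more (assumption)
        exercise = age / 10 + rng.normal(0.0, 1.0, size=n)
    # assumed true causal effect: -5 cholesterol units per hour of exercise
    cholesterol = 150 + 2.0 * age - 5.0 * exercise + rng.normal(0.0, 5.0, size=n)
    return age, exercise, cholesterol

def slope(x, y):
    """Least-squares slope of y on x."""
    return np.polyfit(x, y, 1)[0]

age, ex, ch = simulate(randomize_exercise=False)
obs_slope = slope(ex, ch)                                # pooled observational data
within_age_slope = slope(ex[age == 30], ch[age == 30])   # one age group only

_, ex_do, ch_do = simulate(randomize_exercise=True)
do_slope = slope(ex_do, ch_do)                           # under do(exercise)

print(f"observational slope:    {obs_slope:+.2f}")
print(f"slope within Age 30:    {within_age_slope:+.2f}")
print(f"slope under do(exercise): {do_slope:+.2f}")
```

In this toy model the pooled observational slope is positive, while both the within-age slope and the interventional slope sit near the assumed effect of -5, which mirrors the age-grouped scatter plot on the slide.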
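Challenge 2: Selection Bias
Variables in the system affect the inclusion of units in the sample.
[Figure: causal diagram over Age, Exercise, Cholesterol, Fitness, and the selection node S; scatter plot of Cholesterol against Exercise (Hours), with units marked S=1 or S=0]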
$P(\text{age}, \text{ex}, \text{ch}, \text{fit}) \neq P(\text{age}, \text{ex}, \text{ch}, \text{fit} \mid S = 1)$
This difference is due to Selection Bias.
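A minimal simulation can make the selection effect concrete. It is a sketch under assumptions that are not in the slides: Age is left out for brevity, Fitness is taken to depend on both Exercise and Cholesterol, a unit enters the sample only when its Fitness exceeds a threshold, and all coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Assumed structural model (illustrative only).
exercise = rng.uniform(0.0, 8.0, size=n)
cholesterol = 200 - 5.0 * exercise + rng.normal(0.0, 10.0, size=n)
fitness = exercise - 0.1 * (cholesterol - 200) + rng.normal(0.0, 1.0, size=n)

# Selection mechanism: S = 1 only for sufficiently fit units.
selected = fitness > 6.0

def slope(x, y):
    """Least-squares slope of y on x."""
    return np.polyfit(x, y, 1)[0]

full_mean = cholesterol.mean()
sel_mean = cholesterol[selected].mean()
full_slope = slope(exercise, cholesterol)
sel_slope = slope(exercise[selected], cholesterol[selected])

print(f"mean cholesterol, population: {full_mean:.1f}")
print(f"mean cholesterol, S = 1 only: {sel_mean:.1f}")
print(f"slope, population: {full_slope:+.2f}")
print(f"slope, S = 1 only: {sel_slope:+.2f}")
```

Because S depends on a variable that is itself affected by Cholesterol, both the marginal mean and the Exercise-Cholesterol slope computed from the S = 1 rows differ from their population values in this toy setup.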
Current literature
No Selection, No Confounding: Association = Causation; No control; RCE
No Selection, Confounding: Complete Algorithms [Tian and Pearl '02; Huang and Valtorta '06; Shpitser and Pearl '06; Bareinboim and Pearl '12]
Selection, No Confounding: Controlling Selection Bias [Bareinboim and Pearl '12]; Recovering from Selection Bias in Causal and Statistical Inference [Bareinboim, Tian, Pearl '14]
Selection, Confounding: [Bareinboim, Tian, Pearl '15]; Generalized Adjustment [Correa, Tian, Bareinboim '18]; IDSB [Correa, Tian, Bareinboim '19]
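Problem I
Given: a causal diagram $\mathcal{G}$, variables $\mathbf{X}, \mathbf{Y}$, and the selection-biased distribution $P_1 = P(\mathbf{V} \mid S = 1)$ (a data table in which every row has $S = 1$).
Is there a function $f$ such that $P(\mathbf{y} \mid do(\mathbf{x})) = f(P_1)$?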
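Result I
Theorem 1: Let $\mathbf{X}, \mathbf{Y} \subset \mathbf{V}$ be two disjoint sets of variables and $\mathcal{G}$ a causal diagram over $\mathbf{V}$ and $S$. If [graphical condition illegible in the source: a $\perp$-type separation statement evaluated in a graph derived from $\mathcal{G}$], then $P_{\mathbf{x}}(\mathbf{y})$ is not recoverable from $P(\mathbf{V} \mid S = 1)$ in $\mathcal{G}$.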
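Problem II
Given: a causal diagram $\mathcal{G}$, variables $\mathbf{X}, \mathbf{Y}$, the selection-biased distribution $P_1 = P(\mathbf{V} \mid S = 1)$, and unbiased external data $P_2 = P(\mathbf{T})$ over a subset $\mathbf{T}$ of the variables.
Is there a function $f$ such that $P(\mathbf{y} \mid do(\mathbf{x})) = f(P_1, P_2)$?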
Result II: Algorithm IDSB
Given a causal diagram, a selection-biased distribution, external data over a subset of the variables, and the variables of interest ($\mathbf{X}, \mathbf{Y}$), IDSB returns an expression for $P_{\mathbf{x}}(\mathbf{y})$ in terms of the input, or failure.
It is strictly more powerful than the best known algorithm that accepts both biased and unbiased data.
Decomposing the Problem: Intervention
[Figure: causal diagram over X, W1, W2, W3, Y, and S, shown before and after the intervention on X]
$P_x(y) = \sum_{w_1, w_2, w_3} P_x(y, w_1, w_2, w_3)$
Decomposing the Problem: C-Components
[Figure: the diagram over X, W1, W2, W3, Y, and S, split into the C-components {W2, Y} and {W1, W3}]
$P_x(y) = \sum_{w_1, w_2, w_3} P_x(y, w_1, w_2, w_3) = \sum_{w_1, w_2, w_3} P_{x, w_1, w_3}(y, w_2)\, P_{w_2, \ast}(w_1, w_3)$
(The second subscript of the last factor, marked $\ast$, is illegible in the source.)
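The C-component split named on this slide can be computed mechanically: C-components are the connected components of the graph once only its bidirected (latent-confounder) edges are kept. The sketch below assumes bidirected edges W1 <-> W3 and W2 <-> Y, chosen only because they reproduce the two components shown; the slide text does not list the edges, and the helper name `c_components` is hypothetical. X is removed first, and S is left out for simplicity.

```python
def c_components(nodes, bidirected_edges):
    """Group nodes into C-components: connected components of the graph
    that keeps only bidirected edges between the given nodes."""
    adjacency = {v: set() for v in nodes}
    for a, b in bidirected_edges:
        # ignore edges that touch a node outside the node set (e.g. removed X)
        if a in adjacency and b in adjacency:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        # depth-first search over bidirected edges from this start node
        stack, component = [start], set()
        while stack:
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(adjacency[v] - component)
        seen |= component
        components.append(sorted(component))
    return components

# Nodes remaining after removing X; the edges below are assumptions.
nodes = ["W1", "W2", "W3", "Y"]
bidirected = [("W1", "W3"), ("W2", "Y")]
components = c_components(nodes, bidirected)
print(components)  # [['W1', 'W3'], ['W2', 'Y']]
```

Each returned component corresponds to one factor in the product on the slide, so the joint interventional distribution splits into one term over (y, w2) and one over (w1, w3).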
Summary
1. Complete characterization of the causal effects recoverable from a causal diagram and a selection-biased probability distribution.
2. Sufficient procedure to recover causal effects from a causal diagram, selection-biased distributions, and auxiliary unbiased data, which is strictly more powerful than the state-of-the-art procedure.
Thanks!