Two Challenges
• Some possible explanations for the discrepancy in those results are:
1. Transportability
There is a mismatch between the study (source) population 𝛒 and the general clinical (target) population 𝛒* regarding ethnicity, race, and income (covariates named E):
$$P^*(e) \neq P(e)$$
[Figure: study/source population 𝛒 and entire/target population 𝛒*.]
2. Selection Bias
FDA's studies sampled from a distinct population by excluding youths with elevated baseline risk for suicide (B) from their cohorts. For the sampled individuals (S = 1):
$$P(y, b, e \mid do(x), S=1) \neq P(y, b, e \mid do(x))$$
$$P(x, y, b, e \mid S=1) \neq P(x, y, b, e)$$
[Figure: sampled individuals (S = 1) within the study/source population 𝛒, alongside the entire/target population 𝛒*.]
Formalizing the Problem
We use an indicator named T to mark the variables that differ between domains 𝛒 and 𝛒*. Similarly, the indicator S is defined such that S = 1 for every unit sampled in the study, and S = 0 otherwise.
[Figure: selection diagram D over the variables B, E, X, Y with indicator nodes T and S.]
In this example, the causal effect can be estimated by recalibrating the experimental findings using observations from the target domain:
$$P^*(y \mid do(x)) = \sum_{b,e} P(y \mid do(x), b, e, S=1)\, P^*(b, e)$$
where
• $P^*(y \mid do(x))$ is the causal effect in the target domain,
• $P(y \mid do(x), b, e, S=1)$ is the experimental data from the source under selection bias,
• $P^*(b, e)$ comes from observations in the target domain.
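As a minimal sketch of the recalibration formula above, the following Python function reweights stratum-specific experimental results by the target covariate distribution. The function name `st_adjust`, the dictionary layout, and all numbers are illustrative assumptions, not values from the study.

```python
def st_adjust(p_y_source, p_star_cov):
    """Compute sum over strata of P(y | do(x), b, e, S=1) * P*(b, e).

    p_y_source: dict mapping a stratum (b, e) to P(y | do(x), b, e, S=1),
                estimated from the selection-biased experiment in the source.
    p_star_cov: dict mapping a stratum (b, e) to P*(b, e), measured in the target.
    Both layouts are assumptions made for this sketch.
    """
    total = 0.0
    for stratum, weight in p_star_cov.items():
        if weight > 0 and stratum not in p_y_source:
            # A stratum present in the target but never observed in the
            # experiment cannot be recalibrated.
            raise KeyError(f"no experimental estimate for stratum {stratum}")
        if weight > 0:
            total += p_y_source[stratum] * weight
    return total


# Made-up numbers for binary B and E, for illustration only.
example_source = {(0, 0): 0.10, (0, 1): 0.20, (1, 0): 0.30, (1, 1): 0.40}
example_target = {(0, 0): 0.4, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.1}
example_effect = st_adjust(example_source, example_target)
```

The explicit check for missing strata reflects a design choice in this sketch: the sum is only meaningful when every stratum with positive target probability has an experimental estimate.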
Problem Statement
Inputs:
• Selection diagram D over B, E, X, Y with indicator nodes T and S
• $P(v \mid do(x), S=1)$: selection-biased experimental distribution $P_1$ from 𝛒
• $P^*(w)$: covariate distribution $P_2$ from 𝛒*
Question: Is there a function f such that
$$P^*(y \mid do(x)) = f(P_1, P_2)?$$
Output: yes (together with f) / no
Related Work

Criterion                                                               confounding   type of input   selection bias   transportability   complete
Backdoor Criterion [Pearl ’93];
  Extended Backdoor [Pearl and Paz ’10]                                 ✔             obs.
Adjustment Criterion [Shpitser et al. ’10; Perkovic et al. ’15,’18]     ✔             obs.                                                ✔
Selection Backdoor [Bareinboim, Tian and Pearl ’14]                     ✔             obs.            ✔
Generalized Adjustment Criterion [Correa, Tian and Bareinboim ’18]      ✔             obs.            ✔                                   ✔
st-Adjustment Criterion [Correa, Tian and Bareinboim ’19]               -             exp.            ✔                ✔                  ✔
Solution: Covariate st-Adjustment
• Strategy: Recalibrate the results from experiments in the studied population using observations from the target population.
$$P^*(y \mid do(x)) = \sum_{z} P(y \mid do(x), z, S=1)\, P^*(z)$$
where
  $P^*(y \mid do(x))$ is the unbiased target effect in 𝛒*,
  $P(y \mid do(x), z, S=1)$ is the experiment results in the source domain 𝛒,
  $P^*(z)$ comes from observations in the target domain 𝛒*.
• Questions:
1. How to determine if st-adjustment holds for a set of covariates Z?
2. How to find admissible covariate sets?
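When only samples are available, the st-adjustment expression can be approximated with a plug-in estimate. The sketch below assumes discrete covariates, a randomized treatment within the study (so that stratum-wise outcome frequencies among treated units estimate P(y | do(x), z, S=1)), and a record layout invented for illustration; `estimate_target_effect` is a hypothetical helper, not part of any library.

```python
from collections import Counter


def estimate_target_effect(trial_records, target_covariates, x, y):
    """Plug-in estimate of sum_z P(y | do(x), z, S=1) * P*(z).

    trial_records: iterable of (x_assigned, y_observed, z) tuples from the
                   randomized study, i.e. units with S=1 (assumed layout).
    target_covariates: iterable of z values sampled from the target domain.
    """
    n_arm = Counter()  # units per stratum z that received treatment x
    n_hit = Counter()  # of those, units whose outcome equals y
    for x_i, y_i, z_i in trial_records:
        if x_i == x:
            n_arm[z_i] += 1
            if y_i == y:
                n_hit[z_i] += 1

    z_counts = Counter(target_covariates)
    n_target = sum(z_counts.values())

    estimate = 0.0
    for z, count in z_counts.items():
        if n_arm[z] == 0:
            raise ValueError(f"stratum {z!r} has no treated units in the study")
        estimate += (n_hit[z] / n_arm[z]) * (count / n_target)
    return estimate


# Tiny made-up data set, for illustration only.
toy_trial = [(1, 1, "a"), (1, 0, "a"), (1, 1, "b"), (1, 1, "b"),
             (0, 0, "a"), (0, 0, "b")]
toy_target = ["a", "a", "a", "b"]
toy_estimate = estimate_target_effect(toy_trial, toy_target, x=1, y=1)
```

In this toy data the stratum "a" is more common in the target than in the treated arm of the study, so its outcome frequency receives more weight than it would in an unadjusted average.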
Challenge I. Covariate Admissibility
• In general, adjusting for variables that are affected by the treatment can introduce more bias instead of controlling for the existing bias.
• In our setting, in particular, special attention must be paid to those covariates (affected by the treatment) that are correlated with the outcome given the pre-treatment covariates.
• Let's call this set $Z_p$.
• For example, when adjusting for $Z = \{Z_1, Z_2, Z_3\}$ in this model, $Z_p = \{Z_3\}$.
[Figure: selection diagram over Z1, Z2, Z3, X, Y with indicator nodes T and S.]
Main Result I: Complete Graphical Condition
A set of covariates Z is admissible for st-adjustment in D relative to treatment X and outcome Y if:
(i) Variables in $Z_p$ are independent of the treatment given all other covariates, and
(ii) The outcome Y is independent of all the transportability (T) and selection bias (S) nodes given the covariates Z and the treatment X.
Thm. The causal effect $P^*(y \mid do(x))$ is identifiable by st-adjustment on a set Z in D if and only if the conditions above hold for Z relative to X and Y.
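To see why condition (ii) does the work, consider the special case where every covariate in Z is pre-treatment, so that $Z_p$ is empty and condition (i) holds trivially. The following derivation is only a sketch under that simplifying assumption; it is not the general proof behind the theorem.

```latex
\begin{align*}
P^*(y \mid do(x))
  &= \sum_{z} P^*(y \mid do(x), z)\, P^*(z \mid do(x))
     && \text{(law of total probability)} \\
  &= \sum_{z} P^*(y \mid do(x), z)\, P^*(z)
     && \text{(assumption: $Z$ is pre-treatment, so $do(x)$ does not affect it)} \\
  &= \sum_{z} P(y \mid do(x), z, S=1)\, P^*(z)
     && \text{(condition (ii): $Y$ is independent of $T$ and $S$ given $Z$ and $X$)}
\end{align*}
```

The last step replaces the target-domain conditional with the selection-biased source conditional, which is exactly what independence of Y from the T and S nodes licenses.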
Understanding the criterion
Task: Compute $P^*(y \mid do(x))$
• The outcome Y is affected by differences in the distribution of $Z_1$ between the source and target domains.
• The variable $Z_3$ affects the likelihood of units being sampled.
• If we adjust for $Z_3$ to control for selection bias, we introduce spurious correlation. Hence, we should also control for $Z_2$.
[Figure: selection diagram over Z1, Z2, Z3, X, Y with indicator nodes T and S, repeated with the relevant paths highlighted for each point above.]
Getting the intuition behind the rules
Example
By making $Z = \{Z_1, Z_2, Z_3\}$, we can verify the st-adjustment conditions, i.e.:
(i) The variable in $Z_p = \{Z_3\}$ is independent of X given the other covariates $\{Z_1, Z_2\}$.
(ii) The outcome Y is independent of S and T given Z.
Hence, the st-adjustment is guaranteed to hold, i.e.:
$$P^*(y \mid do(x)) = \sum_{z_1, z_2, z_3} P(y \mid do(x), z_1, z_2, z_3, S=1)\, P^*(z_1, z_2, z_3)$$
where
• $P^*(y \mid do(x))$ is the causal effect in the target domain,
• $P(y \mid do(x), z_1, z_2, z_3, S=1)$ is the experimental data from the source under selection bias,
• $P^*(z_1, z_2, z_3)$ comes from measurements in the target domain.
[Figure: selection diagram over Z1, Z2, Z3, X, Y with indicator nodes T and S.]
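Independence statements such as those in (i) and (ii) are read off the diagram via d-separation. The sketch below implements a standard d-separation test (ancestral graph, moralization, reachability) and applies it to a hypothetical edge list. The arrows of the diagram are not recoverable from the text here, so `toy_edges` is an assumption chosen only to be consistent with the bullet points above, and the test is run on the unmodified graph, which may differ from the exact graph the criterion prescribes.

```python
from itertools import combinations


def d_separated(edges, xs, ys, zs):
    """Return True if xs and ys are d-separated given zs in the DAG `edges`."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
        parents.setdefault(u, set())

    # 1. Restrict to the ancestors of xs, ys and zs (including themselves).
    relevant = xs | ys | zs
    stack = list(relevant)
    while stack:
        node = stack.pop()
        for p in parents.get(node, ()):
            if p not in relevant:
                relevant.add(p)
                stack.append(p)

    # 2. Moralize: link each child to its parents and "marry" co-parents.
    nbrs = {n: set() for n in relevant}
    for child in relevant:
        ps = list(parents.get(child, ()))
        for p in ps:
            nbrs[p].add(child)
            nbrs[child].add(p)
        for a, b in combinations(ps, 2):
            nbrs[a].add(b)
            nbrs[b].add(a)

    # 3. Delete zs and test whether any node in ys is reachable from xs.
    frontier = [x for x in xs if x not in zs]
    seen = set(frontier)
    while frontier:
        node = frontier.pop()
        if node in ys:
            return False
        for m in nbrs[node]:
            if m not in zs and m not in seen:
                seen.add(m)
                frontier.append(m)
    return True


# Hypothetical edges (an assumption, not taken from the slides).
toy_edges = [("T", "Z1"), ("Z1", "Y"), ("X", "Y"), ("Z2", "Y"),
             ("Z2", "Z3"), ("X", "Z3"), ("Z3", "S")]

# Condition (ii)-style check: is Y separated from {S, T} given Z and X?
full_set_ok = d_separated(toy_edges, {"Y"}, {"S", "T"}, {"X", "Z1", "Z2", "Z3"})
# Conditioning on the collider Z3 opens a path between X and Z2.
collider_opens = not d_separated(toy_edges, {"X"}, {"Z2"}, {"Z3"})
```

In this toy graph, X and Z2 are separated with an empty conditioning set but become dependent once Z3 is conditioned on, which mirrors the remark that adjusting for Z3 introduces spurious correlation unless Z2 is also controlled.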
Challenge II. Searching for Admissible Sets
• Given a candidate set Z, we have a condition to determine whether it is admissible.
• The natural question that follows is how to find an admissible set without resorting to trial and error. There could be exponentially many candidates (and even exponentially many valid ones!).
• How to determine the existence of at least one admissible set?
• Some admissible sets may be preferred over others due to certain properties (e.g., cost, variance).
Main Result II: Listing Algorithm
Inputs:
• Selection diagram D over B, E, X, Y with indicator nodes T and S
• $P(v \mid do(x), S=1)$: selection-biased experimental distribution from 𝛒
• Set W of covariates measurable in 𝛒*
Question: What are all the admissible sets satisfying st-adjustment?
Output: List of sets $Z_1, Z_2, \ldots \subseteq W$ such that for each $Z_i$:
$$P^*(y \mid do(x)) = \sum_{z_i} P(y \mid do(x), z_i, S=1)\, P^*(z_i)$$
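To make the input/output contract of the listing task concrete, here is a naive baseline that enumerates every subset of W and keeps those passing an admissibility test. This is deliberately the trial-and-error approach, with exponential cost in the size of W, and is not the listing algorithm presented in the talk. The predicate `is_admissible` is a caller-supplied placeholder; the toy predicate and covariate names below are arbitrary and do not encode the st-adjustment criterion.

```python
from itertools import combinations


def list_admissible_sets(candidates, is_admissible):
    """Yield every subset of `candidates` accepted by `is_admissible`.

    Subsets are produced in order of increasing size, so smaller
    (often cheaper to measure) sets appear first.
    """
    ordered = sorted(candidates)
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            z = frozenset(subset)
            if is_admissible(z):
                yield z


def toy_predicate(z):
    # Arbitrary stand-in rule for illustration: "A" is required,
    # and "C" is only allowed together with "B".
    return "A" in z and ("C" not in z or "B" in z)


toy_sets = list(list_admissible_sets({"A", "B", "C"}, toy_predicate))
```

Writing the enumeration as a generator lets a caller stop at the first admissible set, which answers the existence question without listing everything, or rank the yielded sets by a property such as measurement cost.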