Two Optimal Strategies for Active Learning of Causal Models from Interventions
Alain Hauser, Peter Bühlmann
Seminar für Statistik, ETH Zürich
PGM 2012, Granada

Causal model: example

Random variables:
- $X_1$: taxis honking
- $X_2$: Jonas awake
- $X_3$: Alain awake
- $X_4$: watermelons eaten

Directed acyclic graph (DAG) $D$ of causal dependencies, with arrows 1 → 2, 1 → 3, 2 → 4, 3 → 4:
[Figure: DAG $D$ on vertices 1, 2, 3, 4]

Factorization of density:
$f(x) = f(x_1)\, f(x_2 \mid x_1)\, f(x_3 \mid x_1)\, f(x_4 \mid x_2, x_3)$

$f$ has the Markov property of $D$.
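
As a minimal sketch of this factorization, the joint density can be assembled from one conditional per vertex. The binary encoding and all probability values below are hypothetical and serve only to illustrate the product structure; they are not taken from the talk.

```python
from itertools import product

# Parents of each vertex in the example DAG: 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4.
PARENTS = {1: (), 2: (1,), 3: (1,), 4: (2, 3)}

# Hypothetical values of P(X_i = 1 | parent values); illustrative numbers only.
P_ONE = {
    1: {(): 0.3},
    2: {(0,): 0.1, (1,): 0.8},
    3: {(0,): 0.2, (1,): 0.7},
    4: {(0, 0): 0.05, (0, 1): 0.4, (1, 0): 0.5, (1, 1): 0.9},
}

def joint(x):
    """f(x) as the product of f(x_i | x_pa(i)) over all vertices i."""
    prob = 1.0
    for node, parents in PARENTS.items():
        p1 = P_ONE[node][tuple(x[p] for p in parents)]
        prob *= p1 if x[node] == 1 else 1.0 - p1
    return prob

# Enumerate all 2^4 configurations; the factorized density must sum to one.
states = [dict(zip(PARENTS, values)) for values in product((0, 1), repeat=4)]
total = sum(joint(x) for x in states)
```

Because every factor is a proper conditional distribution, the product is automatically normalized, which is what `total` checks.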

Intervention: example

Random variables: $X_1$ (taxis honking), $X_2$ (Jonas awake), $X_3$ (Alain awake), $X_4$ (watermelons eaten).

[Figure: true DAG $D$ and intervention DAG $D^{(\{2\})}$ on vertices 1, 2, 3, 4]

Intervention at $X_2$: waking Jonas. In the intervention DAG $D^{(\{2\})}$ the arrow 1 → 2 is absent, matching the replaced factor in the interventional density.

Observational density:
$f(x) = f(x_1)\, f(x_2 \mid x_1)\, f(x_3 \mid x_1)\, f(x_4 \mid x_2, x_3)$

Interventional density:
$f(x \mid \mathrm{do}(X_2 = U)) = f(x_1)\, \tilde{f}(x_2)\, f(x_3 \mid x_1)\, f(x_4 \mid x_2, x_3)$
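
The effect of the intervention on the factorization can be sketched as follows: the factor $f(x_2 \mid x_1)$ is swapped for $\tilde{f}(x_2)$, assumed here to be the density of the intervention variable $U$. The binary encoding, the conditional probabilities and the fair-coin choice for $\tilde{f}$ are hypothetical.

```python
from itertools import product

# Example DAG: 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4.
PARENTS = {1: (), 2: (1,), 3: (1,), 4: (2, 3)}

# Hypothetical P(X_i = 1 | parent values); illustrative numbers only.
P_ONE = {
    1: {(): 0.3},
    2: {(0,): 0.1, (1,): 0.8},
    3: {(0,): 0.2, (1,): 0.7},
    4: {(0, 0): 0.05, (0, 1): 0.4, (1, 0): 0.5, (1, 1): 0.9},
}
P_ONE_DO = 0.5  # hypothetical f~: the intervention variable U is a fair coin

def density(x, target=None):
    """Observational density if target is None, else density under do(X_target = U)."""
    prob = 1.0
    for node, parents in PARENTS.items():
        if node == target:
            p1 = P_ONE_DO  # conditional on the parents replaced by f~
        else:
            p1 = P_ONE[node][tuple(x[p] for p in parents)]
        prob *= p1 if x[node] == 1 else 1.0 - p1
    return prob

def prob_x2_given_x1(x1, target=None):
    """P(X_2 = 1 | X_1 = x1) computed from the (interventional) joint density."""
    states = [dict(zip(PARENTS, v)) for v in product((0, 1), repeat=4)]
    num = sum(density(x, target) for x in states if x[1] == x1 and x[2] == 1)
    den = sum(density(x, target) for x in states if x[1] == x1)
    return num / den
```

Observationally, `prob_x2_given_x1` depends on `x1`; with `target=2` it no longer does, because the intervention cuts the dependence of $X_2$ on its parent.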

Markov equivalence

A probability density in general obeys the Markov properties of several DAGs; those DAGs are called Markov equivalent.
⇒ limited identifiability under observational data

[Figure: Markov equivalent DAGs $D_1$, $D_2$, $D$ on vertices 1, 2, 3, 4]

On the other hand, intervention effects do depend on the DAG.
⇒ improved identifiability of causal models under interventional data
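
A brute-force sketch of the observational equivalence class of the example DAG follows. It relies on the classical criterion of Verma and Pearl, which the slides do not state: two DAGs are Markov equivalent exactly when they share skeleton and v-structures. The function names are illustrative, and the enumeration is only feasible for tiny graphs.

```python
from itertools import combinations, product

SKELETON = [(1, 2), (1, 3), (2, 4), (3, 4)]
D = frozenset({(1, 2), (1, 3), (2, 4), (3, 4)})  # arrow a -> b stored as (a, b)

def is_acyclic(dag):
    """Repeatedly strip vertices without incoming arrows; fail if none exists."""
    nodes = {v for e in dag for v in e}
    edges = set(dag)
    while nodes:
        sources = {v for v in nodes if all(b != v for _, b in edges)}
        if not sources:
            return False
        nodes -= sources
        edges = {(a, b) for a, b in edges if a not in sources}
    return True

def v_structures(dag):
    """Triples (a, c, b) with a -> c <- b and a, b non-adjacent."""
    adjacent = {frozenset(e) for e in dag}
    result = set()
    for c in {b for _, b in dag}:
        parents = sorted(a for a, b in dag if b == c)
        for a, b in combinations(parents, 2):
            if frozenset((a, b)) not in adjacent:
                result.add((a, c, b))
    return result

def orientations(skeleton):
    """All acyclic orientations of the given skeleton."""
    for flips in product((False, True), repeat=len(skeleton)):
        dag = frozenset((b, a) if flip else (a, b)
                        for (a, b), flip in zip(skeleton, flips))
        if is_acyclic(dag):
            yield dag

# Same skeleton by construction, so only the v-structures need comparing.
equivalence_class = [g for g in orientations(SKELETON)
                     if v_structures(g) == v_structures(D)]
```

For this skeleton the class has three members, in line with the three DAGs shown on the slide: the arrows into vertex 4 are fixed by the v-structure 2 → 4 ← 3, while the edges at vertex 1 may be oriented in any way that avoids a new v-structure.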

Interventional Markov equivalence

Assume an experiment in which different interventions at targets $I_1, I_2, \ldots$ are performed, summarized as a family of targets $\mathcal{I} = \{I_1, I_2, \ldots\}$.

Note: the observational case corresponds to the special family $\mathcal{I} = \{\emptyset\}$.

Definition ($\mathcal{I}$-Markov equivalence; Hauser and Bühlmann, 2012): Given a family of targets $\mathcal{I}$, two DAGs $D_1$ and $D_2$ are called $\mathcal{I}$-Markov equivalent if they produce the same class of tuples of interventional densities.

In words: two DAGs $D_1$ and $D_2$ are $\mathcal{I}$-Markov equivalent if they are statistically indistinguishable from data produced from interventions at the targets in $\mathcal{I}$.

Interventional essential graph

Definition: Let $\mathcal{I}$ be a family of targets. The $\mathcal{I}$-essential graph of some DAG $D$ is defined as
$$\mathcal{E}_{\mathcal{I}}(D) := \bigcup_{D' \sim_{\mathcal{I}} D} D'.$$

In words: $\mathcal{E}_{\mathcal{I}}(D)$ is a partially directed graph having the same skeleton as $D$, with
- a directed edge where the corresponding arrows of all DAGs $\mathcal{I}$-equivalent to $D$ have the same orientation;
- an undirected edge where the orientation of the corresponding arrow is not common to all DAGs $\mathcal{I}$-equivalent to $D$.

Properties:
- unique representation of an $\mathcal{I}$-Markov equivalence class
- chain graph with chordal chain components (Hauser and Bühlmann, 2012)
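
The union in the definition can be sketched directly: take all arrows of all class members, and treat an edge that occurs in both directions as undirected. The three DAGs below are the orientations of the example skeleton that share the v-structure 2 → 4 ← 3 with $D$; the variable names are illustrative and do not follow the slide's labels.

```python
# Arrows a -> b are stored as pairs (a, b).
SHARED = {(2, 4), (3, 4)}  # arrows common to all class members
CLASS = [
    frozenset({(1, 2), (1, 3)} | SHARED),  # 2 <- 1 -> 3 (the DAG D)
    frozenset({(2, 1), (1, 3)} | SHARED),  # 2 -> 1 -> 3
    frozenset({(1, 2), (3, 1)} | SHARED),  # 3 -> 1 -> 2
]

def essential_graph(dags):
    """Union of the DAGs: one-way arrows stay directed, two-way arrows become lines."""
    union = set().union(*dags)
    directed = {(a, b) for a, b in union if (b, a) not in union}
    undirected = {frozenset((a, b)) for a, b in union if (b, a) in union}
    return directed, undirected

directed, undirected = essential_graph(CLASS)
```

Here `directed` holds the arrows 2 → 4 and 3 → 4, and `undirected` holds the lines 1 - 2 and 1 - 3, which is the observational essential graph of the example.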

Interventional Markov equivalence: example

[Figure: essential graph $\mathcal{E}_{\{\emptyset\}}(D)$ and DAGs $D_1$, $D_2$, $D$ on vertices 1, 2, 3, 4]

Observational Markov equivalence class of $D$ with corresponding essential graph.

Interventional Markov equivalence: example

[Figure: essential graph $\mathcal{E}_{\{\emptyset, \{2\}\}}(D)$ and DAGs $D_1$, $D$ on vertices 1, 2, 3, 4]

Interventional Markov equivalence class of $D$ for the family of targets $\mathcal{I} = \{\emptyset, \{2\}\}$. Corresponds to an experiment which measures
- observational data ($I = \emptyset$)
- interventional data from an intervention at $X_2$ ($I = \{2\}$)
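
A sketch of how the class shrinks under $\mathcal{I} = \{\emptyset, \{2\}\}$ follows. It assumes the graphical criterion from Hauser and Bühlmann (2012), which is not spelled out on the slides: two DAGs are $\mathcal{I}$-Markov equivalent when they share skeleton and v-structures and, for every target $I$, the intervention DAGs $D^{(I)}$ (arrows into $I$ removed) have the same skeleton. The code starts from the three observationally equivalent DAGs, so only the last condition is tested; all names are illustrative.

```python
# Arrows a -> b are stored as pairs (a, b).
SHARED = {(2, 4), (3, 4)}
D = frozenset({(1, 2), (1, 3)} | SHARED)
OBSERVATIONAL_CLASS = [
    D,
    frozenset({(2, 1), (1, 3)} | SHARED),  # 2 -> 1 -> 3
    frozenset({(1, 2), (3, 1)} | SHARED),  # 3 -> 1 -> 2
]

def intervention_skeleton(dag, target):
    """Skeleton of D^(I): drop every arrow pointing into an intervened vertex."""
    return {frozenset((a, b)) for a, b in dag if b not in target}

def same_under_family(d1, d2, family):
    """Assumed criterion, valid for DAGs that are already observationally equivalent."""
    return all(intervention_skeleton(d1, t) == intervention_skeleton(d2, t)
               for t in family)

FAMILY = [frozenset(), frozenset({2})]
interventional_class = [g for g in OBSERVATIONAL_CLASS
                        if same_under_family(g, D, FAMILY)]

# Essential graph of the refined class: union of its members.
union = set().union(*interventional_class)
directed = {(a, b) for a, b in union if (b, a) not in union}
undirected = {frozenset((a, b)) for a, b in union if (b, a) in union}
```

Under this criterion two DAGs remain, the edge between 1 and 2 becomes directed as 1 → 2, and only the edge between 1 and 3 stays undirected.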

Active learning: overview

Up to now: given list of interventions; characterization of identifiability via interventional essential graphs.

Problem: Given the list of interventions performed so far and the corresponding interventional essential graph, find the "optimal" intervention target for maximal improvement of identifiability of causal models.

[Figure: example graph on vertices 1, 2, 3, 4, 5]

Objectives: assessing identifiability
- Number of edges orientable after one (single-vertex) intervention ⇒ OptSingle
- Number of interventions (at arbitrary targets) needed for full identifiability ⇒ OptUnb

OptSingle: overview

- Yields the single-vertex intervention that maximizes the number of orientable edges in the worst case.
- Implementation: local algorithm that finds the optimal intervention target in "local" fashion, only considering the neighborhood of candidate vertices.
- Complexity: exponential in the worst case, depending on the clique number of the $\mathcal{I}$-essential graph.
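
The worst-case objective can be illustrated with a brute-force sketch. This is not the local algorithm from the talk: it enumerates the whole equivalence class, which is only feasible for tiny graphs, and it assumes the graphical criterion of Hauser and Bühlmann (2012) that a new target $\{v\}$ separates two equivalent DAGs exactly when their skeletons differ after removing the arrows into $v$. The example is the observational class of the four-vertex DAG with arrows 1 → 2, 1 → 3, 2 → 4, 3 → 4; all names are illustrative.

```python
# Arrows a -> b are stored as pairs (a, b).
SHARED = {(2, 4), (3, 4)}
CLASS = [
    frozenset({(1, 2), (1, 3)} | SHARED),  # 2 <- 1 -> 3
    frozenset({(2, 1), (1, 3)} | SHARED),  # 2 -> 1 -> 3
    frozenset({(1, 2), (3, 1)} | SHARED),  # 3 -> 1 -> 2
]

def surviving_skeleton(dag, v):
    """Skeleton of the intervention DAG for target {v}: arrows into v removed."""
    return {frozenset((a, b)) for a, b in dag if b != v}

def num_directed(dags):
    """Number of directed edges in the essential graph (union) of the given DAGs."""
    union = set().union(*dags)
    return sum(1 for a, b in union if (b, a) not in union)

def worst_case_gain(v, dags):
    """Fewest newly oriented edges after intervening at v, over all possible true DAGs."""
    before = num_directed(dags)
    gains = []
    for true_dag in dags:
        refined = [g for g in dags
                   if surviving_skeleton(g, v) == surviving_skeleton(true_dag, v)]
        gains.append(num_directed(refined) - before)
    return min(gains)

scores = {v: worst_case_gain(v, CLASS) for v in (1, 2, 3, 4)}
best = max(scores, key=scores.get)
```

In this example an intervention at vertex 1 orients both remaining edges whatever the true DAG is, vertices 2 and 3 guarantee only one edge, and vertex 4 guarantees none.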

OptSingle: worst case example

[Figure: example graph on vertices 1, 2, 3, 4, 5]

OptSingle: find the vertex that guarantees orientability of a maximum number of edges after intervention.

Intervention at vertex 2.