Predicting perturbation effects in large-scale systems from observational data
Marloes Maathuis
Seminar für Statistik, ETH Zürich, Switzerland
Joint work with Peter Bühlmann, Diego Colombo, and Markus Kalisch
Research question
• In short: Can we learn perturbation effects without doing perturbation experiments?
• Concretely: Can we learn the gene regulatory network of yeast from observational data?
  • Predict perturbation effects between all pairs of genes
  • Identify pairs of genes between which there is a large effect
Why use observational data?
• Thousands of perturbation experiments are needed to estimate all perturbation effects ⇒ time-consuming and expensive
• Questions:
  • Do observational data provide some information on perturbation effects?
  • Can this information be used to guide and prioritize perturbation experiments?
Definition of perturbation effect
• Consider the effect of gene $i$ on gene $j$. Let $X_i$ and $X_j$ be the expression levels of the genes.
• If we experimentally change $X_i$, what happens to $X_j$?
• Hypothetical experiment: genetically modified such that $X_i \approx a$, denoted $do(X_i = a)$, versus genetically modified such that $X_i \approx a + 1$, denoted $do(X_i = a + 1)$
• Perturbation effect of gene $i$ on gene $j$: $E(X_j \mid do(X_i = a + 1)) - E(X_j \mid do(X_i = a))$ (the value of $a$ drops out if the system is linear)
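To see why $a$ drops out, suppose, purely for illustration, that the interventional mean is linear in the intervened value. The intercept $\alpha_{ij}$ and slope $\beta_{ij}$ are hypothetical symbols introduced only for this sketch:

```latex
\begin{align*}
E(X_j \mid do(X_i = a)) &= \alpha_{ij} + \beta_{ij}\, a, \\
E(X_j \mid do(X_i = a + 1)) - E(X_j \mid do(X_i = a))
  &= \bigl(\alpha_{ij} + \beta_{ij}(a + 1)\bigr) - \bigl(\alpha_{ij} + \beta_{ij}\, a\bigr)
   = \beta_{ij}.
\end{align*}
```

Under this assumption the perturbation effect is the single number $\beta_{ij}$, whatever the baseline level $a$.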
Estimating perturbation effects from observational data
• It is easy to estimate associations from observational data. But association is not causation!
• Pearl (2003):
  • “An associational concept is any relationship that can be defined in terms of the joint distribution of observed variables.”
  • “A causal concept [such as a perturbation effect] is any relationship that cannot be defined from the distribution alone (...) Any claim invoking causal concepts must be traced to some premises that invoke such concepts; it cannot be inferred or derived from statistical associations alone.”
• An assumption that is often made: the data were generated by a known directed acyclic graph (DAG)
Directed acyclic graph (DAG)
[Figure: DAG on the nodes $X_2$, $X_1$, $X_3$]
• Nodes represent random variables and edges represent conditional independence relationships
• The DAG encodes causal assumptions:
  • Edge $X_2 \to X_1$: $X_2$ may have a direct causal effect on $X_1$
  • No edge $X_1 \to X_3$: $X_1$ cannot have a direct causal effect on $X_3$ (but $X_1$ and $X_3$ will be correlated!)
Pearl’s intervention-calculus / do-calculus
[Figure: DAG on the nodes $X_2$, $X_1$, $X_3$]
• The perturbation effect of $X_1$ on $X_3$: $E(X_3 \mid do(X_1 = a + 1)) - E(X_3 \mid do(X_1 = a))$
• The do-operator stands for a hypothetical experiment. So $E(X_3 \mid do(X_1 = a))$ is not the usual conditional expectation! In the example:
  • $E(X_3 \mid X_1 = a) \neq E(X_3)$
  • $E(X_3 \mid do(X_1 = a)) = E(X_3)$
• Pearl’s do-calculus uses the DAG to write expressions involving the do-operator in terms of pre-intervention conditional distributions
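The gap between conditioning and intervening can be checked with a small simulation. The sketch below assumes the DAG is $X_1 \leftarrow X_2 \to X_3$, which is consistent with the statements above but is an assumption, and it uses made-up linear coefficients (0.8 and 1.5) with standard Gaussian noise.

```python
import random
import statistics

def simulate(n, do_x1=None, seed=0):
    """Draw n samples from the assumed linear model X1 <- X2 -> X3.

    If do_x1 is given, X1 is set to that value regardless of X2,
    mimicking the hypothetical experiment do(X1 = do_x1).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x2 = rng.gauss(0.0, 1.0)
        if do_x1 is None:
            x1 = 0.8 * x2 + rng.gauss(0.0, 1.0)  # hypothetical coefficient
        else:
            x1 = do_x1                           # intervention cuts X2 -> X1
        x3 = 1.5 * x2 + rng.gauss(0.0, 1.0)      # hypothetical coefficient
        samples.append((x1, x2, x3))
    return samples

a = 1.0
obs = simulate(50_000)

# Observational: average X3 over samples whose X1 happens to be close to a.
cond_mean = statistics.fmean([x3 for x1, _, x3 in obs if abs(x1 - a) < 0.1])

# Interventional: force X1 = a in every sample, then average X3.
do_mean = statistics.fmean([x3 for _, _, x3 in simulate(50_000, do_x1=a, seed=1)])

# Marginal mean of X3 in the observational data.
overall_mean = statistics.fmean([x3 for _, _, x3 in obs])

print(f"E(X3 | X1 = a)     ~ {cond_mean:.2f}")
print(f"E(X3 | do(X1 = a)) ~ {do_mean:.2f}")
print(f"E(X3)              ~ {overall_mean:.2f}")
```

In this assumed model the conditional mean is pulled away from zero because observing $X_1$ carries information about $X_2$, while the interventional mean stays close to the marginal mean of $X_3$.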
Pearl’s intervention-calculus / do-calculus
[Figure: DAG on the nodes $X_2$, $X_1$, $X_3$]
• Summary: If the DAG is given, one can estimate perturbation effects (or causal effects) from observational data
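For linear Gaussian models, one standard way to use a known DAG is to regress the target on the intervened variable together with that variable's parents and read off the coefficient of the intervened variable. The sketch below illustrates this under the same assumed DAG $X_1 \leftarrow X_2 \to X_3$ with made-up coefficients; the helper names `slope_unadjusted` and `slope_adjusted` are hypothetical.

```python
import random

def cov(u, v):
    """Sample covariance of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def slope_unadjusted(x, y):
    """Coefficient of x in the least-squares regression of y on x alone."""
    return cov(x, y) / cov(x, x)

def slope_adjusted(x, z, y):
    """Coefficient of x in the least-squares regression of y on x and z."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    # Solve the 2x2 normal equations for the coefficient of x.
    return (sxy * szz - sxz * szy) / (sxx * szz - sxz ** 2)

# Observational data only, from the assumed model X1 <- X2 -> X3.
rng = random.Random(0)
x2 = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
x1 = [0.8 * v + rng.gauss(0.0, 1.0) for v in x2]  # hypothetical coefficient
x3 = [1.5 * v + rng.gauss(0.0, 1.0) for v in x2]  # hypothetical coefficient

naive = slope_unadjusted(x1, x3)        # association between X1 and X3
adjusted = slope_adjusted(x1, x2, x3)   # adjusts for X2, the parent of X1

print(f"regression of X3 on X1:        {naive:.2f}")
print(f"regression of X3 on X1 and X2: {adjusted:.2f}")
```

The unadjusted slope reflects the correlation induced by $X_2$, whereas the adjusted coefficient should be close to zero, matching the absence of a causal effect of $X_1$ on $X_3$ in the assumed DAG.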
Main points in this talk
• Present IDA (Intervention calculus when the DAG is Absent)
• Requires observational data:
  • generated from an unknown DAG
  • multivariate Gaussian
  • no hidden confounders
  • potentially high-dimensional system
• Returns (summary measures of) the estimated set of possible causal effects
• Consistent in sparse high-dimensional settings
• Validation on yeast data
What to do when the DAG is unknown?
• A DAG encodes conditional independence relationships
• So given all conditional independence relationships of the data, can we infer the DAG?
• Almost... several DAGs can encode the same conditional independence relationships. They form an equivalence class, described by a CPDAG.
• One can estimate this CPDAG, for example using the PC-algorithm of Peter Spirtes and Clark Glymour (Spirtes et al., 2000)
  • Fast implementation in the R-package pcalg
  • Consistent in sparse high-dimensional settings (Kalisch and Bühlmann, JMLR 2007)
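The idea of an equivalence class can be made concrete on a toy case. The sketch below is not the PC-algorithm; it brute-forces all orientations of a hypothetical three-node skeleton $X_1 - X_2 - X_3$ and groups them using the standard criterion that DAGs with the same skeleton are equivalent exactly when they share the same v-structures.

```python
from itertools import product

# Undirected skeleton X1 - X2 - X3 (hypothetical toy example).
skeleton = [(1, 2), (2, 3)]

def v_structures(edges):
    """Return the colliders a -> c <- b in which a and b are not adjacent."""
    adjacent = {frozenset(e) for e in edges}
    found = set()
    for a, c in edges:
        for b, c2 in edges:
            if c == c2 and a < b and frozenset((a, b)) not in adjacent:
                found.add((a, c, b))
    return frozenset(found)

# Every orientation of a tree-shaped skeleton is acyclic, so no cycle check
# is needed for this particular example.
classes = {}
for flips in product([False, True], repeat=len(skeleton)):
    dag = tuple((v, u) if flip else (u, v)
                for (u, v), flip in zip(skeleton, flips))
    classes.setdefault(v_structures(dag), []).append(dag)

for colliders, dags in classes.items():
    label = sorted(colliders) if colliders else "no v-structure"
    print(label, "->", dags)
```

In this example the two chains and the fork fall into one class, which a CPDAG would represent with undirected edges, while the collider $X_1 \to X_2 \leftarrow X_3$ forms a class of its own.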
IDA (oracle version)
[Diagram: PC-algorithm → oracle CPDAG → DAG 1, DAG 2, ..., DAG m → do-calculus → effect 1, effect 2, ..., effect m → multi-set Θ]
The multi-set $\Theta$
• Why a multi-set instead of a unique value?
• Recall the quote of Pearl. We make “weak” causal assumptions:
  • The data are generated from an unknown DAG
  • There are no hidden confounders
• What information does $\Theta$ provide? Examples:
  • $\Theta = \{1.5\}$ ⇒ causal effect is $1.5$
  • $\Theta = \{1.5, 0.5, 3.1\}$ ⇒ causal effect is positive
  • $\Theta = \{1.5, 1.5, -1\}$ ⇒ absolute value of causal effect $\geq 1$
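The three examples above can be reproduced with two simple summaries of a multi-set. This is a minimal sketch; the function names `min_abs_effect` and `common_sign` are hypothetical and only one possible choice of summary measure.

```python
def min_abs_effect(theta):
    """Smallest absolute value in the multi-set: a lower bound on |causal effect|."""
    return min(abs(t) for t in theta)

def common_sign(theta):
    """Return +1 or -1 if all possible effects share that sign, otherwise 0."""
    if all(t > 0 for t in theta):
        return 1
    if all(t < 0 for t in theta):
        return -1
    return 0

# The multi-sets from the examples, stored as lists to keep repeated values.
for theta in ([1.5], [1.5, 0.5, 3.1], [1.5, 1.5, -1]):
    print(theta, "lower bound:", min_abs_effect(theta), "sign:", common_sign(theta))
```

A list is used rather than a set so that repeated values, such as the two copies of 1.5, are kept.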