Weakening Faithfulness: Some Heuristic Causal Discovery Algorithms

Zhalama (1), Jiji Zhang (2), Wolfgang Mayer (1)
(1) University of South Australia, (2) Lingnan University


  1. Weakening Faithfulness: Some Heuristic Causal Discovery Algorithms
     Zhalama (1), Jiji Zhang (2), Wolfgang Mayer (1)
     (1) University of South Australia, (2) Lingnan University

  2. Causal DAG
     • Causal DAG G = (V, E): each edge X → Y represents a direct causal relation, i.e. X is a direct cause of Y relative to V.
     • Assumption: V is causally sufficient.
     [Figure: a causal DAG over A, B, C, D, E and its distribution, linked by Causal Sufficiency]

  3. Causal Inference Assumptions
     • Causal Markov Condition: every conditional independence statement entailed by the causal DAG over V is satisfied by the joint probability distribution of V, i.e., X and Y are (causally) d-separated by Z ⟹ X ⊥ Y | Z.
     [Figure: a causal DAG over A, B, C, D, E and its distribution, linked by Causal Sufficiency and the Causal Markov Assumption]

  4. Causal Inference Assumptions
     • Faithfulness assumption: every conditional independence statement satisfied by the joint distribution of V is entailed by the causal DAG over V, i.e., X ⊥ Y | Z ⟹ X and Y are (causally) d-separated by Z.
     [Figure: a causal DAG over A, B, C, D, E and its distribution, linked by Causal Sufficiency, the Causal Markov Assumption, and Causal Faithfulness]

  5. Causal Faithfulness Assumption
     • More dubious than the Causal Markov assumption.
     • Even if Faithfulness is not exactly violated, the distribution may be close enough to unfaithful to cause trouble with finite data.
     • Can we relax the Faithfulness assumption and adjust the causal discovery method to make it more robust against unfaithfulness?
       • Adjacency unfaithfulness
       • Orientation unfaithfulness

  6. Adjacency Faithfulness Violation
     • Adjacency-Faithfulness: for every X, Y ∈ V, if X and Y are adjacent in the true causal DAG, then they are not independent conditional on any subset of V\{X, Y}.
     • The distribution satisfies A ⊥ D | {C} and C ⊥ B | {A, D}, plus one extra independence: A ⊥ B | ∅.
     [Figure: true graph over A, B, C, D]
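
One way such an extra independence can arise is path cancellation in a linear model. The sketch below is an illustration only. It assumes the true graph is A → C → D → B plus A → B (the slide's figure is not recoverable from this transcript, so this graph is a guess that is consistent with the independences listed), and it uses hypothetical coefficients chosen so that the direct effect of A on B cancels the effect through C and D. `implied_covariance` is a helper written for this example, not part of any library.

```python
def implied_covariance(order, weights, noise_var):
    """Covariance implied by a linear SEM X_j = sum_p w[(p, j)] * X_p + e_j.

    `order` must be a topological order of the DAG; errors are independent.
    """
    cov = {}
    for i, j in enumerate(order):
        parents = [p for p in order[:i] if (p, j) in weights]
        # Covariance with every earlier variable follows from the parents.
        for k in order[:i]:
            c = sum(weights[(p, j)] * cov[(p, k)] for p in parents)
            cov[(j, k)] = cov[(k, j)] = c
        # Var(X_j) = sum_p w_pj * Cov(X_p, X_j) + Var(e_j)
        cov[(j, j)] = noise_var[j] + sum(weights[(p, j)] * cov[(p, j)] for p in parents)
    return cov


# Assumed graph (hypothetical): A -> C -> D -> B and A -> B.
order = ["A", "C", "D", "B"]
noise_var = {v: 1.0 for v in order}

# Hypothetical coefficients: the path A -> C -> D -> B carries 0.5 * 0.5 * 0.5 = 0.125,
# and the direct edge A -> B carries -0.125, so the two contributions cancel.
weights = {("A", "C"): 0.5, ("C", "D"): 0.5, ("D", "B"): 0.5, ("A", "B"): -0.125}
cov = implied_covariance(order, weights, noise_var)
print("Cov(A, B) with cancellation:", cov[("A", "B")])  # cancels to zero

# Any other direct coefficient breaks the cancellation.
perturbed = implied_covariance(order, {**weights, ("A", "B"): -0.3}, noise_var)
print("Cov(A, B) without cancellation:", perturbed[("A", "B")])
```

In the first model A and B are adjacent yet marginally uncorrelated (and therefore independent in the Gaussian case), which is exactly the situation Adjacency-Faithfulness rules out.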

  7. PC under Adjacency Faithfulness Failure
     1. Adjacency step: for every pair of variables X and Y, search for a set of variables given which X and Y are conditionally independent, and infer them to be adjacent if and only if no such set is found.
        • Justified by the adjacency faithfulness assumption
     2. Orientation step: for every unshielded triple (X, Y, Z), infer that it is a collider if and only if the set found in step 1 that renders X and Z conditionally independent does not include Y.
        • Justified by the orientation faithfulness assumption
     [Figure: true graph over A, B, C, D; distribution P with A ⊥ B | ∅; graph returned by PC]
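
The adjacency step can be sketched with an independence oracle in place of a statistical test. This is a simplified illustration, not the PC implementation used in the talk: it searches all subsets of the remaining variables (real PC restricts the search to current neighbours), and the oracle simply looks up the independences listed on the previous slide, which are assumed here to be the complete list. The names `pc_adjacency` and `make_oracle` are hypothetical.

```python
from itertools import combinations


def pc_adjacency(variables, is_independent):
    """Keep an edge between x and y iff no separating set is found."""
    skeleton, sepset = set(), {}
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        found = None
        # Try conditioning sets in order of increasing size.
        for size in range(len(others) + 1):
            for cond in combinations(others, size):
                if is_independent(x, y, frozenset(cond)):
                    found = frozenset(cond)
                    break
            if found is not None:
                break
        if found is None:
            skeleton.add(frozenset((x, y)))
        else:
            sepset[frozenset((x, y))] = found
    return skeleton, sepset


def make_oracle(statements):
    """Build an oracle from (pair, conditioning set) independence statements."""
    facts = {(frozenset(pair), frozenset(cond)) for pair, cond in statements}

    def is_independent(x, y, cond):
        return (frozenset((x, y)), frozenset(cond)) in facts

    return is_independent


# Independences entailed by the graph, as listed on the slide.
faithful = [("AD", "C"), ("BC", "AD")]
# The same list plus the extra, unfaithful independence between A and B.
unfaithful = faithful + [("AB", "")]

skeleton_f, sepset_f = pc_adjacency("ABCD", make_oracle(faithful))
skeleton_u, sepset_u = pc_adjacency("ABCD", make_oracle(unfaithful))
print(sorted("".join(sorted(e)) for e in skeleton_f))
print(sorted("".join(sorted(e)) for e in skeleton_u))
```

With the extra independence the A, B edge is dropped even though the two variables are adjacent in the true graph, which is the failure this slide describes.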

  8. GES
     • Searches for a pattern that maximizes a score over the space of patterns
     • Proceeds from one pattern to a neighbor by adding or removing edges, one at a time
     • Forward phase: greedily add edges until the score cannot improve further
     • Backward phase: remove edges until the score cannot improve further
     • GES seems to be robust against adjacency unfaithfulness
     [Figure: graph over A, B, C, D]

  9. Orientation Faithfulness Violation
     • Orientation-Faithfulness: for every unshielded triple (X, Y, Z):
       • If X → Y ← Z is a collider, then X and Z are not conditionally independent given any subset of V\{X, Z} that includes Y.
       • Otherwise, X and Z are not conditionally independent given any subset of V\{X, Z} that excludes Y.
     • The distribution satisfies A ⊥ D | {B, C} and B ⊥ C | {A}, plus one extra independence: A ⊥ D | ∅.
     [Figure: true graph over A, B, C, D]

  10. GES under Orientation Faithfulness Violation
      • The distribution satisfies A ⊥ D | {B, C} and B ⊥ C | {A}, plus one extra independence: A ⊥ D | ∅.
      [Figure: true graph over A, B, C, D and the graph returned by GES]

  11. α-Conservative Orientation
      • Given a skeleton and an unshielded triple (X, Y, Z) therein, consider all subsets of the variables adjacent to X, or of the variables adjacent to Z, that render (X, Z) conditionally independent, and compute
        r = (number of sets that include Y) / (number of sets)
      • If r ≤ α, the triple is marked as a collider.
      • If r ≥ 1 − α, the triple is marked as a non-collider.
      • Otherwise it is marked as ambiguous.
      • CPC (Ramsey et al., 2006): α = 0, too cautious
      • Majority rule orientation (Colombo and Maathuis, 2014): α = 0.5, not conservative enough
      • We use α = 0.4
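
The decision rule on this slide can be written as a small function. This is a minimal sketch: `classify_triple` and its arguments are hypothetical names, the separating sets are assumed to have been collected already, and two cases the slide does not cover are treated as ambiguous by assumption (no separating set found, and a ratio that satisfies both conditions at once, which can only happen when the threshold is 0.5).

```python
def classify_triple(separating_sets, middle, threshold=0.4):
    """Classify an unshielded triple from the sets that separate its endpoints.

    separating_sets: list of sets that render the two endpoints independent.
    middle: the middle variable of the triple.
    threshold: the conservativeness parameter (0.4 is the value used on the slide).
    """
    if not separating_sets:
        return "ambiguous"  # assumption: the slide does not cover this case
    r = sum(1 for s in separating_sets if middle in s) / len(separating_sets)
    is_collider = r <= threshold
    is_non_collider = r >= 1 - threshold
    if is_collider and is_non_collider:
        return "ambiguous"  # assumption: ties are left undecided
    if is_collider:
        return "collider"
    if is_non_collider:
        return "non-collider"
    return "ambiguous"


# One of three separating sets contains the middle node, so r = 1/3.
sets = [set(), {"W"}, {"Y"}]
for t in (0.0, 0.4, 0.5):
    print(t, classify_triple(sets, "Y", threshold=t))
```

With a threshold of 0 the rule only commits when every separating set agrees, which matches the behaviour the slide attributes to CPC; raising the threshold lets a minority of disagreeing sets be overruled.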

  12. Proposed Hybrid Methods
      • PC+GES
        • Run PC first, then use the output pattern as the starting point for GES
        • Mitigates PC's vulnerability to adjacency faithfulness violations
      • GES+c
        • Run GES first, then apply the α-conservative orientation rules and Meek's orientation rules (Meek, 1996)
        • Mitigates GES's vulnerability to orientation faithfulness violations
      • PC+GES+c
        • Run PC+GES first, then apply the α-conservative orientation rules and Meek's orientation rules (Meek, 1996)
        • Mitigates both vulnerabilities

  13. Simulations – Examples of exact Faithfulness violations

      Adjacency unfaithfulness [Figure: graph over A, B, C, D]

                          PC     PC-stable   PC+GES   GES    MMHC
        True adj. rate    0.75   0.75        0.95     0.93   0.76
        False adj. rate   0.01   0.01        0.02     0.06   0.02

      Orientation unfaithfulness [Figure: graph over A, B, C, D]

                                GES    GES+c   PC     CPC    MMHC
        Mean arrow precision    0.35   0.96    0.49   0.99   0.56

  14. More comprehensive simulations (without exact unfaithfulness)
      • Number of variables (dimension) ∈ {10, 20, 30, 40}
      • Expected vertex degree (sparsity) ∈ {2, 4}
      • Sample size ∈ {200, 500, 1000, 5000}
      • For each setting, 100 random DAGs are generated, and on each DAG a linear Gaussian model is randomly built:
        • Edge coefficients are drawn uniformly from [-1, -0.1] ∪ [0.1, 1]
        • Variances of error terms are drawn uniformly from [0.5, 1]
      • From each model, 50 datasets are generated at each sample size.
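
The model-generation step can be sketched as follows. The coefficient and variance ranges come from the slide; everything else is an assumption made for illustration. In particular, the slide does not say how the random DAGs are drawn, so this sketch includes each edge i → j (i < j in a fixed ordering) independently with probability degree / (n − 1), and the function names are hypothetical.

```python
import random


def random_linear_gaussian_model(n_vars, expected_degree, rng):
    """Draw a random DAG with edge weights and error variances."""
    p_edge = expected_degree / (n_vars - 1)  # assumed edge-inclusion scheme
    weights = {}
    for j in range(n_vars):
        for i in range(j):  # edges only go from lower to higher index: acyclic
            if rng.random() < p_edge:
                magnitude = rng.uniform(0.1, 1.0)
                # Random sign gives a draw from the two intervals of the slide.
                weights[(i, j)] = magnitude if rng.random() < 0.5 else -magnitude
    noise_var = [rng.uniform(0.5, 1.0) for _ in range(n_vars)]
    return weights, noise_var


def sample(weights, noise_var, n_samples, rng):
    """Sample rows from the linear Gaussian model in topological order."""
    n_vars = len(noise_var)
    rows = []
    for _ in range(n_samples):
        x = [0.0] * n_vars
        for j in range(n_vars):
            parent_part = sum(weights.get((i, j), 0.0) * x[i] for i in range(j))
            x[j] = parent_part + rng.gauss(0.0, noise_var[j] ** 0.5)
        rows.append(x)
    return rows


rng = random.Random(0)
weights, noise_var = random_linear_gaussian_model(n_vars=10, expected_degree=2, rng=rng)
data = sample(weights, noise_var, n_samples=200, rng=rng)
print(len(weights), "edges,", len(data), "rows,", len(data[0]), "columns")
```

Repeating the two calls over the grid of dimensions, degrees and sample sizes listed above would reproduce the shape of the experiment, though not necessarily the authors' exact graph distribution.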

  15. Adjacency on Random Graphs

  16. Orientation on Random Graphs

  17. Conclusion and Outlook
      • PC and GES are vulnerable to violations of Faithfulness
      • The heuristic hybrid algorithms are shown to mitigate some adjacency and orientation issues
        • even if Faithfulness is not exactly violated
      • Outlook: try to develop efficient methods for causal inference under weaker faithfulness assumptions (e.g. triangle faithfulness)
