Weakening Faithfulness: Some Heuristic Causal Discovery Algorithms
Zhalama¹ · Jiji Zhang² · Wolfgang Mayer¹
¹ University of South Australia · ² Lingnan University
Causal DAG
• Causal DAG G = (V, E): each edge X → Y represents a direct causal relation, i.e. X is a direct cause of Y relative to V.
• Assumption: V is causally sufficient.
[Figure: a causal DAG over A, B, C, D, E and its distribution, connected by Causal Sufficiency]
Causal Inference Assumptions
• Causal Markov Condition: every conditional independence statement entailed by the causal DAG over V is satisfied by the joint probability distribution of V, i.e. X and Y are (causally) d-separated by Z ⟹ X ⊥ Y | Z.
[Figure: a causal DAG over A, B, C, D, E and its distribution, connected by Causal Sufficiency and the Causal Markov Assumption]
Causal Inference Assumptions
• Faithfulness assumption: every conditional independence statement satisfied by the joint distribution of V is entailed by the causal DAG over V, i.e. X ⊥ Y | Z ⟹ X and Y are (causally) d-separated by Z.
[Figure: a causal DAG over A, B, C, D, E and its distribution, connected by Causal Sufficiency, the Causal Markov Assumption, and Causal Faithfulness]
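Both assumptions are stated in terms of d-separation, so it may help to see how it can be checked mechanically. The following is a minimal sketch using the standard ancestral-moral-graph criterion. The four-node DAG (A → B, A → C, B → D, C → D), the `parents` dictionary layout, and the function names are illustrative assumptions, not taken from the slides.

```python
from itertools import combinations

def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(parents, x, y, z):
    """True if x and y are d-separated by the set z (x, y not in z)."""
    z = set(z)
    keep = ancestors(parents, {x, y} | z)
    # Moralize the ancestral subgraph: link children to parents, marry co-parents.
    nbrs = {v: set() for v in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for p, q in combinations(ps, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # d-separation holds iff z blocks every path from x to y in the moral graph.
    seen, stack = {x}, [x]
    while stack:
        for w in nbrs[stack.pop()]:
            if w == y:
                return False
            if w not in seen and w not in z:
                seen.add(w)
                stack.append(w)
    return True

# Hypothetical DAG for illustration: A -> B, A -> C, B -> D, C -> D
PARENTS = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
print(d_separated(PARENTS, "A", "D", {"B", "C"}))  # A and D given {B, C}
print(d_separated(PARENTS, "B", "C", {"A", "D"}))  # conditioning on the collider D
```

Under the Causal Markov Condition, every pair for which `d_separated` returns True must be independent in the distribution; Faithfulness is the converse claim.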
Causal Faithfulness Assumption
• More dubious than the Causal Markov assumption.
• Even if Faithfulness is not exactly violated, the distribution may be close enough to unfaithful to cause trouble with finite data.
• Can we relax the Faithfulness assumption and adjust the causal discovery method to make it more robust against unfaithfulness?
• Adjacency unfaithfulness
• Orientation unfaithfulness
Adjacency Faithfulness Violation
• Adjacency-Faithfulness: for every X, Y ∈ V, if X and Y are adjacent in the true causal DAG, then they are not independent conditional on any subset of V\{X, Y}.
• Example: the distribution satisfies A ⊥ D | {C} and B ⊥ C | {A, D}, plus one extra independence, A ⊥ B | ∅.
[Figure: true graph over A, B, C, D]
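One way an extra independence between adjacent variables can arise in a linear model is path cancellation. The sketch below assumes a graph with edges A → B, A → C, C → D, D → B; this is one graph consistent with the independences listed in the example, not necessarily the one drawn on the slide. The coefficients and the helper `total_effect` are hypothetical.

```python
def total_effect(weights, source, target):
    """Sum, over all directed paths source -> target, of the product of edge weights."""
    if source == target:
        return 1.0
    return sum(
        w * total_effect(weights, child, target)
        for (parent, child), w in weights.items()
        if parent == source
    )

# Hypothetical coefficients: the direct edge A -> B is chosen to cancel
# the indirect path A -> C -> D -> B (0.5 * 0.8 * 0.5 = 0.2).
WEIGHTS = {
    ("A", "B"): -0.2,
    ("A", "C"): 0.5,
    ("C", "D"): 0.8,
    ("D", "B"): 0.5,
}

effect_ab = total_effect(WEIGHTS, "A", "B")
print(effect_ab)
```

Because A has no parents and the error terms are independent, cov(A, B) equals var(A) times this total effect. A vanishing total effect therefore makes A and B marginally independent in the Gaussian case even though they are adjacent.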
PC under Adjacency Faithfulness Failure
1. Adjacency step: for every pair of variables X and Y, search for a set of variables given which X and Y are conditionally independent, and infer them to be adjacent if and only if no such set is found.
   • Justified by the adjacency faithfulness assumption.
2. Orientation step: for every unshielded triple (X, Y, Z), infer that it is a collider if and only if the set found in step 1 that renders X and Z conditionally independent does not include Y.
   • Justified by the orientation faithfulness assumption.
[Figure: true graph over A, B, C, D, whose distribution P satisfies A ⊥ B | ∅, and the graph output by PC]
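The adjacency step can be sketched against an independence oracle. This is a simplified illustration, not PC itself: it searches all subsets of the remaining variables, whereas PC restricts conditioning sets to current neighbours. The oracle holds exactly the three independences of the adjacency-unfaithfulness example (A ⊥ D | {C}, B ⊥ C | {A, D}, A ⊥ B | ∅); the names `skeleton`, `FACTS`, and `oracle` are hypothetical.

```python
from itertools import combinations

def skeleton(variables, independent):
    """Keep the edge X - Y only if no separating set is found for (X, Y)."""
    edges, sepset = set(), {}
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        found = None
        for size in range(len(others) + 1):
            for cond in combinations(others, size):
                if independent(x, y, frozenset(cond)):
                    found = frozenset(cond)
                    break
            if found is not None:
                break
        if found is None:
            edges.add(frozenset((x, y)))
        else:
            sepset[frozenset((x, y))] = found
    return edges, sepset

# (pair, conditioning set) for each independence in the example
FACTS = {
    (frozenset("AD"), frozenset("C")),
    (frozenset("BC"), frozenset("AD")),
    (frozenset("AB"), frozenset()),
}

def oracle(x, y, cond):
    """Answer independence queries from the fixed list of facts."""
    return (frozenset((x, y)), cond) in FACTS

edges, sepset = skeleton("ABCD", oracle)
found_edges = sorted("".join(sorted(e)) for e in edges)
print(found_edges)
```

Since the oracle reports A ⊥ B | ∅, the search finds a separating set for (A, B) and leaves that pair non-adjacent, which is the failure mode described in step 1.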
GES
• Searches for a pattern that maximizes a score over the space of patterns.
• Proceeds from one pattern to a neighbor by adding or removing edges, one at a time.
• Forward phase: greedily add edges until the score cannot improve further.
• Backward phase: remove edges until the score cannot improve further.
• GES seems to be robust against adjacency unfaithfulness.
[Figure: graph over A, B, C, D]
Orientation Faithfulness Violation
• Orientation-Faithfulness: for every unshielded triple (X, Y, Z):
  • If X → Y ← Z is a collider, then X and Z are not conditionally independent given any subset of V\{X, Z} that includes Y.
  • Otherwise, X and Z are not conditionally independent given any subset of V\{X, Z} that excludes Y.
• Example: the distribution satisfies A ⊥ D | {B, C} and B ⊥ C | {A}, plus one extra independence, A ⊥ D | ∅.
[Figure: true graph over A, B, C, D]
GES under Orientation Faithfulness Violation
• The distribution satisfies A ⊥ D | {B, C} and B ⊥ C | {A}, plus one extra independence, A ⊥ D | ∅.
[Figure: true graph over A, B, C, D and the graph output by GES]
α-Conservative Orientation
• Given a skeleton and an unshielded triple (X, Y, Z) therein, consider all subsets of the variables adjacent to X, or of the variables adjacent to Z, that render X and Z conditionally independent, and let
  r = (number of sets that include Y) / (number of sets)
• If r ≤ α, the triple is marked as a collider.
• If r ≥ 1 − α, the triple is marked as a non-collider.
• Otherwise it is marked as ambiguous.
• CPC (Ramsey et al., 2006): α = 0, too cautious.
• Majority rule orientation (Colombo and Maathuis, 2014): α = 0.5, not conservative enough.
• We use α = 0.4.
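The thresholded rule above can be sketched as a small function. It takes the separating sets already found for the end points of the triple, so the search itself is outside its scope. The function name, the return labels, and the handling of the tie case (both inequalities holding at once, which can only happen at threshold 0.5) are assumptions of this sketch, not taken from the slides.

```python
def classify_triple(sepsets, y, threshold=0.4):
    """Classify an unshielded triple (X, Y, Z) from the sets separating X and Z."""
    if not sepsets:
        raise ValueError("need at least one separating set")
    # Fraction of separating sets that include the middle variable Y.
    ratio = sum(1 for s in sepsets if y in s) / len(sepsets)
    is_collider = ratio <= threshold
    is_non_collider = ratio >= 1 - threshold
    if is_collider and not is_non_collider:
        return "collider"
    if is_non_collider and not is_collider:
        return "non-collider"
    return "ambiguous"  # also covers the tie at threshold 0.5 (assumption)

# Y appears in one of three separating sets: ratio = 1/3
sets_one_third = [{"B"}, {"C"}, set()]
print(classify_triple(sets_one_third, "B", threshold=0.4))
print(classify_triple(sets_one_third, "B", threshold=0.0))
```

With a ratio of 1/3, a threshold of 0.4 commits to a collider, while a threshold of 0 (the CPC setting) leaves the triple ambiguous.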
Proposed Hybrid Methods
• PC+GES
  • Run PC first, then use the output pattern as a starting point for GES.
  • Mitigates PC's vulnerability to adjacency faithfulness violations.
• GES+c
  • Run GES first, then apply the α-conservative orientation rules and Meek's orientation rules (Meek, 1996).
  • Mitigates GES's vulnerability to orientation faithfulness violations.
• PC+GES+c
  • Run PC+GES first, then apply the α-conservative orientation rules and Meek's orientation rules (Meek, 1996).
  • Mitigates both vulnerabilities.
Simulations: Examples of exact Faithfulness violations

Adjacency unfaithfulness [Figure: graph over A, B, C, D]
                  PC     PC-stable  PC+GES  GES    MMHC
True adj. rate    0.75   0.75       0.95    0.93   0.76
False adj. rate   0.01   0.01       0.02    0.06   0.02

Orientation unfaithfulness [Figure: graph over A, B, C, D]
                       GES    GES+c  PC     CPC    MMHC
Mean Arrow Precision   0.35   0.96   0.49   0.99   0.56
More comprehensive simulations (without exact unfaithfulness)
• Number of variables (dimension) ∈ {10, 20, 30, 40}
• Expected vertex degree (sparsity) ∈ {2, 4}
• Sample size ∈ {200, 500, 1000, 5000}
• For each setting, 100 random DAGs are generated, and on each DAG a linear Gaussian model is randomly built:
  • Edge coefficients are drawn uniformly from [-1, -0.1] ∪ [0.1, 1]
  • Variances of error terms are drawn uniformly from [0.5, 1]
• From each model, 50 datasets are generated at each sample size.
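A setup of this kind could be generated along the following lines. The slides do not say how the random DAGs are drawn, so the construction here (a random variable ordering with independent edge inclusion at probability degree / (n − 1)) is an assumption, as are the function names. The coefficient and variance ranges follow the list above.

```python
import random

def random_linear_gaussian_model(n_vars, expected_degree, rng):
    """Draw a random DAG with edge coefficients and error variances."""
    order = list(range(n_vars))
    rng.shuffle(order)  # random causal order (assumed construction)
    p_edge = expected_degree / (n_vars - 1)
    coef = {}
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            if rng.random() < p_edge:
                # Uniform on [-1, -0.1] U [0.1, 1]: random sign, uniform magnitude.
                magnitude = rng.uniform(0.1, 1.0)
                coef[(order[i], order[j])] = rng.choice((-1, 1)) * magnitude
    variances = [rng.uniform(0.5, 1.0) for _ in range(n_vars)]
    return order, coef, variances

def sample(order, coef, variances, n_samples, rng):
    """Sample rows from the linear Gaussian model, parents before children."""
    rows = []
    for _ in range(n_samples):
        x = [0.0] * len(order)
        for child in order:
            mean = sum(w * x[p] for (p, c), w in coef.items() if c == child)
            x[child] = mean + rng.gauss(0.0, variances[child] ** 0.5)
        rows.append(x)
    return rows

rng = random.Random(0)
order, coef, variances = random_linear_gaussian_model(10, 2, rng)
data = sample(order, coef, variances, 200, rng)
print(len(coef), "edges;", len(data), "rows")
```

Repeating the two calls over the grid of dimensions, degrees, and sample sizes would give the 100 models and 50 datasets per setting described above.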
Adjacency on Random Graphs
Orientation on Random Graphs
Conclusion and Outlook
• PC and GES are vulnerable to violations of Faithfulness.
• Heuristic hybrid algorithms are shown to mitigate some adjacency and orientation issues, even if Faithfulness is not exactly violated.
• Future work: develop efficient methods for causal inference under weaker faithfulness assumptions (e.g. triangle faithfulness).