JOINT PROBABILISTIC INFERENCE OF CAUSAL STRUCTURE
Dhanya Sridhar, Lise Getoor
U.C. Santa Cruz
KDD Workshop on Causal Discovery, August 14th, 2016
Outline
• Motivation
• Problem Formulation
• Our Approach
• Preliminary Results
Traditional to Hybrid Approaches
[Figure: a constraint-based learner prunes edges with independence tests; a search-and-score learner optimizes Score(G, D) over candidate DAGs]
Hybrid approaches:
• PC-based DAG search – Dash and Druzdzel, UAI 99
• Max-Min Hill-Climbing – Tsamardinos et al., Machine Learning 06
Joint Inference for Structure Discovery
[Figure: graph over X1...X4 with a random variable for each candidate causal edge Cij and adjacency edge Aij]
Joint inference over variables:
• Causal edges Cij
• Adjacency edges Aij
Joint inference approaches:
• Linear programming relaxations – Jaakkola et al., AISTATS 10
• MAX-SAT – Hyttinen et al., UAI 13
Outline
• Motivation
• Problem Formulation
• Our Approach
• Preliminary Results
Probabilistic Joint Model of Causal Structure
[Figure: the same graph over X1...X4 with causal-edge variables Cij and adjacency variables Aij]
Extending joint approaches: a probabilistic model over causal structures
Probabilistic Joint Model of Causal Structure
Independence tests provide the evidence for inference
Probabilistic Joint Model of Causal Structure
Combining logical and structural constraints with probabilistic reasoning
Outline
• Motivation
• Problem Formulation
• Our Approach
• Preliminary Results
Probabilistic Soft Logic (PSL)
• Logic-like syntax with probabilistic, soft constraints
• Describes an undirected graphical model
Weighted rules, e.g.: 5.0 : Causes(A, B) ^ Causes(B, C) ^ Linked(A, C) → Causes(A, C)
Bach et al. (2015). "Hinge-Loss Markov Random Fields and Probabilistic Soft Logic." arXiv. Open-source software: https://psl.umiacs.umd.edu
Probabilistic Soft Logic (PSL)
• Predicates are continuous random variables!
• Logical operators are relaxed to soft, piecewise-linear functions
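For reference, PSL uses the Łukasiewicz relaxations of the logical operators, so every grounded rule stays piecewise linear in the truth values (standard PSL semantics, summarized here):

```latex
% Lukasiewicz relaxations used by PSL, for truth values a, b in [0, 1]:
\begin{align*}
  a \wedge b &= \max\{a + b - 1,\ 0\} \\
  a \vee b   &= \min\{a + b,\ 1\}     \\
  \neg a     &= 1 - a
\end{align*}
```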
Probabilistic Soft Logic (PSL)
• Rules are instantiated with the variables of the network at hand
5.0 : Causes(A, B) ^ Causes(B, C) ^ Linked(A, C) → Causes(A, C)
[Figure: the causal-structure graph over X1...X4]
Probabilistic Soft Logic (PSL)
• A grounding of the rule:
5.0 : Causes(X1, X2) ^ Causes(X2, X4) ^ Linked(X1, X4) → Causes(X1, X4)
Soft Logic Relaxation
5.0 : Causes(X1, X2) ^ Causes(X2, X4) ^ Linked(X1, X4) → Causes(X1, X4)
The convex relaxation of the implication gives a linear distance to rule satisfaction
Bach et al. (2015), arXiv; Bach et al., NIPS 12; Bach et al., UAI 13
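Concretely, under the Łukasiewicz relaxation the distance to satisfaction of this grounded rule is a hinge function of the atoms' truth values. A minimal sketch (our own helper for illustration, not PSL code):

```python
def rule_distance(causes_12, causes_24, linked_14, causes_14):
    """Distance to satisfaction of the grounded rule
    Causes(X1,X2) ^ Causes(X2,X4) ^ Linked(X1,X4) -> Causes(X1,X4),
    with every truth value in [0, 1]."""
    # Lukasiewicz conjunction of the three body atoms.
    body = max(causes_12 + causes_24 + linked_14 - 2.0, 0.0)
    # The rule is violated exactly to the degree the body exceeds the head.
    return max(body - causes_14, 0.0)

# A fully confident body with a weak head leaves a large distance to satisfaction.
print(rule_distance(1.0, 1.0, 1.0, 0.2))  # 0.8
```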
Hinge-loss Markov Random Fields
A conditional random field over the continuous variables Y given evidence X:

$$p(\mathbf{Y} \mid \mathbf{X}) = \frac{1}{Z(\mathbf{w}, \mathbf{X})} \exp\left[-\sum_{j=1}^{m} w_j \max\{\ell_j(\mathbf{Y}, \mathbf{X}),\, 0\}^{p_j}\right], \qquad p_j \in \{1, 2\}$$

• Feature functions are hinge-loss functions, one per instantiated rule, e.g.
  5.0 : Causes(X1, X2) ^ Causes(X2, X4) ^ Linked(X1, X4) → Causes(X1, X4)
• MAP inference intuition: minimize the weighted distances to satisfaction!
Bach et al. (2015), arXiv; Bach et al., NIPS 12; Bach et al., UAI 13
Fast Inference in Hinge-loss MRFs
Convex, continuous inference objective → convex optimization!
• Solved with an efficient message-passing algorithm, the Alternating Direction Method of Multipliers (ADMM)
• Algorithms for weight learning and for reasoning with latent variables
Bach et al. (2015), arXiv; Bach et al., NIPS 12; Bach et al., UAI 13. Open-source software: https://psl.umiacs.umd.edu
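As a toy illustration of the MAP objective (a generic one-dimensional solver stands in for PSL's ADMM; the evidence values and the weight-1.0 negative prior on the head atom are our assumptions, not part of the talk):

```python
from scipy.optimize import minimize_scalar

# Assumed observed truth values for the body atoms of the grounded rule
# 5.0 : Causes(X1,X2) ^ Causes(X2,X4) ^ Linked(X1,X4) -> Causes(X1,X4).
causes_12, causes_24, linked_14 = 0.9, 0.8, 1.0
body = max(causes_12 + causes_24 + linked_14 - 2.0, 0.0)  # Lukasiewicz AND = 0.7

def map_objective(y):
    # Weighted hinge loss of the rule, plus a weight-1.0 prior rule
    # !Causes(X1,X4) whose distance to satisfaction is simply y.
    return 5.0 * max(body - y, 0.0) + 1.0 * y

res = minimize_scalar(map_objective, bounds=(0.0, 1.0), method="bounded")
print(round(res.x, 2))  # ~0.7: the strong rule pulls the head up to the body's value
```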
Encoding the PC Algorithm with PSL
PC algorithm:
• Assumes no latent variables or confounders
• Constraint-based approach
PC with PSL:
• Use all independence tests
• All rule weights set to 1.0
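As a flavor of the encoding, PSL-style rules along these lines can tie test outcomes to structure (illustrative only: the predicate names and exact rule forms are our guesses, not necessarily the model from the talk):

1.0 : Indep(A, B, S) → !Adj(A, B)
1.0 : !Indep(A, B, S) → Adj(A, B)
1.0 : Adj(A, B) → Causes(A, B) || Causes(B, A)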
PSL Causal Structure Discovery
• Use multiple independence tests with various separation sets
• No early pruning!
PSL Causal Structure Discovery
• Orient colliders in unshielded triples using d-separation
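The PC collider orientation (for an unshielded triple A – B – C, if B is absent from the separating set that made A and C independent, orient A → B ← C) might be expressed with rules like these (again, hypothetical predicate names, not necessarily the authors' model):

1.0 : Adj(A, B) ^ Adj(B, C) ^ !Adj(A, C) ^ Indep(A, C, S) ^ !InSepSet(B, S) → Causes(A, B)
1.0 : Adj(A, B) ^ Adj(B, C) ^ !Adj(A, C) ^ Indep(A, C, S) ^ !InSepSet(B, S) → Causes(C, B)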
Outline
• Motivation
• Problem Formulation
• Our Approach
• Preliminary Results
Evaluation Dataset
Synthetic causal DAG dataset (LUCAS) – 2,000 examples
Causality Challenge: http://www.causality.inf.ethz.ch/data/LUCAS.html
Evaluation
Experimental setup:
• G² independence tests for both PC and PC-PSL
• Maximum separation-set size of 3
Evaluation details:
• Run the PC and PC-PSL algorithms and compare to the causal ground truth
• For PSL, round predictions with a threshold selected by cross-validation on causal edges
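For reference, a minimal sketch of a G² conditional-independence test on discrete data (a textbook implementation written for illustration, not the code used in these experiments; the function name and interface are our own):

```python
import numpy as np
from scipy.stats import chi2

def g2_test(data, x, y, cond=()):
    """p-value of the G^2 test of X _||_ Y | cond on integer-coded data
    (rows = samples, columns = variables)."""
    g2, dof = 0.0, 0
    cond_vals = data[:, list(cond)] if cond else np.zeros((len(data), 0), int)
    nx, ny = data[:, x].max() + 1, data[:, y].max() + 1
    # Stratify by each observed assignment of the conditioning set.
    for s in {tuple(r) for r in cond_vals}:
        stratum = data[(cond_vals == s).all(axis=1)]
        table = np.zeros((nx, ny))
        for row in stratum:
            table[row[x], row[y]] += 1
        # Expected counts under independence within this stratum.
        expected = np.outer(table.sum(1), table.sum(0)) / max(table.sum(), 1)
        keep = (table > 0) & (expected > 0)
        g2 += 2.0 * (table[keep] * np.log(table[keep] / expected[keep])).sum()
        dof += (nx - 1) * (ny - 1)
    return chi2.sf(g2, max(dof, 1))

# Hypothetical usage on integer-coded samples D: is X0 independent of X3 given X1, X2?
# p = g2_test(D, 0, 3, cond=(1, 2))
```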
Causal Edge Prediction Results
Average causal edge prediction accuracy and F1 score over 3-fold cross-validation:

Method          Accuracy       F1 Score
PC Algorithm    0.91 ± 0.06    0.53 ± 0.26
PC-PSL          0.94 ± 0.02    0.58 ± 0.19
Summary and Future Directions
• Joint inference of causal structure using probabilistic, soft constraints
• Future: incorporate prior and domain knowledge about causal edges from text mining, ontological constraints, and variable-selection methods
• Future: extensive cross-validation experiments on multiple datasets