Entropic Causal Inference
Murat Kocaoglu, Alexandros G. Dimakis, Sriram Vishwanath and Babak Hassibi
University of Texas at Austin
Presented by Amirkasra Jalaldoust, November 28, 2019
Outline
Problem Definition
Approach
Background and Notation
Identifiability (H_0)
Identifiability (H_1)
Greedy Entropy Minimization
Experiments
Problem Definition
Pair of random variables: (X, Y) ~ p_{X,Y}
Causal discovery: X → Y or Y → X?
Structural Causal Model: E ~ p_E, Y = f(X, E)
Causal sufficiency: X ⊥⊥ E
Example: additive noise, f(X, E) = f(X) + E; linear causal mechanism, f(X) = A·X + µ
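As a concrete instance of the structural causal model above, the snippet below samples a toy discrete additive-noise model Y = f(X) + E with X independent of E (the specific mechanism and parameters are illustrative assumptions, not taken from the slides):

```python
import numpy as np

# Toy discrete SCM: Y = f(X) + E with X drawn independently of E.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=1000)    # cause: 4 categories, 0..3
E = rng.integers(-1, 2, size=1000)   # exogenous noise in {-1, 0, 1}, independent of X
Y = 2 * X + E                        # linear additive-noise mechanism f(X) = 2X
```

Causal sufficiency holds by construction here: E is sampled without reference to X.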
Approach
The use of information theory as a tool for causal discovery, e.g. Granger causality, directed information, etc.
Approach
Key Assumption: the exogenous noise E is "simple" in the correct causal direction.
Occam's Razor: there should not be too much complexity left outside the causal model.
Approach
Focus on discrete random variables, i.e. categorical variables: p_X(i) = P(X = i)
Notion of simplicity: Renyi entropy
H_a(X) = 1/(1 − a) · log( Σ_i p_X(i)^a )
This work emphasizes:
Shannon entropy: H_1 (the limit a → 1)
Cardinality: H_0
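The two special cases the slides emphasize fall out of the Renyi formula directly; a minimal sketch (base-2 logs, a hypothetical helper not from the slides):

```python
import numpy as np

def renyi_entropy(p, a):
    """Renyi entropy H_a(X) = 1/(1-a) * log2(sum_i p(i)^a), in bits.
    a = 0 gives log2 of the support size; a -> 1 recovers Shannon entropy."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                        # restrict to the support
    if a == 0:
        return float(np.log2(len(p)))           # H_0: log of cardinality
    if a == 1:
        return float(-(p * np.log2(p)).sum())   # H_1: Shannon entropy (limit case)
    return float(np.log2((p ** a).sum()) / (1 - a))
```

For a uniform distribution all orders agree, e.g. `renyi_entropy([0.5, 0.5], a)` is 1 bit for every a; for skewed distributions H_a decreases in a.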
Approach
Objective: Find the minimum H(E) such that Y = f(X, E) is feasible
Identifiability (H_0)
Causal model: M = ({X, Y}, E, f, X → Y, p_{X,E})
Independent identically distributed samples: {(x_i, y_i)}_i ~ p_{X,Y}
Decide X → Y or Y → X, given the joint distribution p_{X,Y}.
Identifiability (H_0)
Have in mind that both X, Y have cardinality n and E has cardinality m.

Definition (Conditional distribution matrix). The n × n matrix Y|X with entries Y|X(i, j) := P(Y = i | X = j). The vector vec(Y|X), defined by vec(Y|X)(i + (j − 1)n) = Y|X(i, j), is called the conditional distribution vector.

Definition (Block partition matrices). Consider a matrix M ∈ {0, 1}^{n² × m}. Let m_{i,j} denote the (i + (j − 1)n)-th row of M, and let S_{i,j} = {k ∈ [m] : m_{i,j}(k) ≠ 0}. The matrix M is called a block partition matrix if it belongs to
C := { M ∈ {0, 1}^{n² × m} : ∪_{i ∈ [n]} S_{i,j} = [m] and S_{i,j} ∩ S_{l,j} = ∅ for all i ≠ l }.
Identifiability (H_0)
Equivalent condition for the existence of a causal mechanism:

Lemma 1. Given discrete random variables X, Y with distribution p_{X,Y}, there exists a causal model M = ({X, Y}, E, f, X → Y, p_{X,E}) with H_0(E) = m if and only if there exist M ∈ C and e ∈ R^m_+ with Σ_i e(i) = 1 that satisfy vec(Y|X) = M e.
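A tiny numerical check of the factorization in Lemma 1, for a hypothetical 2 × 2 conditional distribution (the particular f, M, and e below are made-up illustrations, not from the slides):

```python
import numpy as np

n, m = 2, 2
# Columns are p(Y | X = j): column 1 is [P(Y=1|X=1), P(Y=2|X=1)], etc.
cond = np.array([[0.7, 0.3],
                 [0.3, 0.7]])
vec = cond.flatten(order='F')   # vec(Y|X)(i + (j-1)n) = Y|X(i, j)

# f(x=1, e) = e and f(x=2, e) = 3 - e (swaps the two labels).
# Row (i, j) of M marks the e-values with f(j, e) = i; within each
# block of n rows (fixed j) the e-values partition [m], so M is in C.
M = np.array([[1, 0],   # (y=1, x=1): reached by e = 1
              [0, 1],   # (y=2, x=1): reached by e = 2
              [0, 1],   # (y=1, x=2): reached by e = 2
              [1, 0]])  # (y=2, x=2): reached by e = 1
e = np.array([0.7, 0.3])  # distribution of E, so H_0(E) = 2

assert np.allclose(M @ e, vec)  # vec(Y|X) = M e, as Lemma 1 requires
```

So this p_{X,Y} admits a causal model X → Y with an exogenous E of only two states.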
Identifiability (H_0)
Lemma (Upper bound on the minimum cardinality of E). Let X, Y be two random variables with joint probability distribution p_{X,Y}(x, y), where H_0(X) = H_0(Y) = n. Then there exists a causal model Y = f(X, E), X ⊥⊥ E, that induces p_{X,Y}, where m = H_0(E) ≤ n(n − 1) + 1.
If the columns of Y|X are uniformly sampled points in the (n − 1)-dimensional simplex, then n(n − 1) states are necessary for E.
Identifiability (H_0)
True causal direction: Y = f(X, E), E ⊥⊥ X
Wrong causal direction: X = g(Y, Ẽ), Ẽ ⊥⊥ Y
Under mild assumptions on the generation process of the causal mechanism f, X, E (rather than on Y|X), we can obtain the same lower bound.
Identifiability (H_0)
Definition (Generic function). Let Y = f(X, E), where the variables X, Y, E have supports 𝒳, 𝒴, ℰ, respectively. Let S_{y,x} = f_x^{−1}(y) ⊂ ℰ be the inverse map, i.e., S_{y,x} = {e ∈ ℰ : y = f(x, e)}. A function f is called "generic" if for each triple (x_1, x_2, y), f_{x_1}^{−1}(y) ≠ f_{x_2}^{−1}(y), and for every pair (x, y), f_x^{−1}(y) ≠ ∅.
Causal mechanism f will be generic almost surely (!)
Identifiability (H_0)
Theorem (Identifiability). Consider the causal model M = ({X, Y}, E, f, X → Y, p_{X,E}), where the random variables X, Y have n states, E ⊥⊥ X has θ states, and f is a generic function. If the distributions of X and E are uniformly randomly selected from the (n − 1)- and (θ − 1)-simplices, then with probability 1, any Ẽ ⊥⊥ Y that satisfies X = g(Y, Ẽ) for some deterministic function g has cardinality at least n(n − 1).
Identifiability (H_0)
Assume we have an algorithm A that, given the joint distribution of X, Y, outputs E and f such that Y = f(X, E) with E of minimum cardinality.

Corollary. The causal direction can be recovered with probability 1 if the original exogenous random variable E has cardinality less than n(n − 1), the causal mechanism f is generic, and the distributions of X and E are selected uniformly randomly from the proper simplices.
Identifiability (H_0)
Proposition (Inference algorithm). Suppose X → Y. Let X ∈ 𝒳, Y ∈ 𝒴, |𝒳| = n, |𝒴| = m. Assume that A is the algorithm that finds the exogenous variables E, Ẽ with minimum cardinality. Then, if the underlying exogenous variable has cardinality less than n(m − 1), with probability 1 we have
H_0(X) + H_0(E) < H_0(Y) + H_0(Ẽ).
Unfortunately, it turns out that no efficient algorithm A exists, unless P = NP.

Definition (Subset sum problem). Given a set of integers V and an integer a, decide whether there exists a subset S ⊆ V such that Σ_{u ∈ S} u = a.
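For intuition on the problem invoked in the hardness statement, here is the classic pseudo-polynomial dynamic program for subset sum (a standard textbook routine, not an algorithm from the slides; its running time depends on the magnitude of `target`, which is why it does not contradict NP-completeness):

```python
def subset_sum(values, target):
    """Decide whether some subset of `values` (positive integers)
    sums exactly to `target`, by tracking all reachable partial sums."""
    reachable = {0}
    for v in values:
        # extend every previously reachable sum by v, capped at target
        reachable |= {s + v for s in reachable if s + v <= target}
    return target in reachable
```

For example, `subset_sum([3, 34, 4, 12, 5, 2], 9)` is `True` (4 + 5), while `subset_sum([3, 34, 4], 6)` is `False`.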
Identifiability (H_1)
THE EXACT SAME STORY!
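The outline's "Greedy Entropy Minimization" section is where the H_1 version becomes practical: since exactly minimizing H(E) is intractable, one builds E greedily from the conditionals p(Y | X = i). Below is a sketch of one natural greedy scheme of this kind, peeling off the smallest of the per-column maxima at each step; the specific details are an assumption for illustration, not a transcription of the paper's algorithm:

```python
import numpy as np

def greedy_min_entropy(conditionals, tol=1e-9):
    """Greedily construct a (hopefully low-entropy) distribution for E
    that can simultaneously generate every conditional p(Y | X = i)."""
    cols = [np.array(p, dtype=float) for p in conditionals]
    e = []
    while cols[0].sum() > tol:   # all columns retain the same total mass
        # take the largest remaining mass in each column ...
        argmaxes = [int(c.argmax()) for c in cols]
        # ... and peel off the smallest of those maxima as one state of E
        r = min(c[a] for c, a in zip(cols, argmaxes))
        e.append(r)
        for c, a in zip(cols, argmaxes):
            c[a] -= r            # at least one entry hits zero each round
    return np.array(e)

def shannon_entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-12]
    return float(-(p * np.log2(p)).sum())

# p(Y | X = 1) = [1, 0] and p(Y | X = 2) = [0.5, 0.5]
e = greedy_min_entropy([[1.0, 0.0], [0.5, 0.5]])
```

Here the greedy pass yields e = [0.5, 0.5], i.e. H(E) = 1 bit, which matches the obvious two-state construction for this pair of conditionals.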