Causal Inference and Graphical Models - II
Jin Tian, Iowa State University

Outline
- Computing the effects of manipulations
- Inferring constraints implied by DAGs with hidden variables
  - on nonexperimental data
  - on experimental data
- Determining the causes of effects
  - Counterfactuals
  - Probabilities of causation

Causal Bayesian Networks
Causal graph: a DAG.
- Nodes: random variables.
- Edges: direct causal influence.
[Figure: X (Smoking) -> Z (Tar in lungs) -> Y (Cancer), with a hidden U influencing X and Y]
Modularity: each parent-child relationship represents an autonomous causal mechanism.
- Functional: $v_i = f_i(pa_i, \varepsilon_i)$
- Probabilistic: $P(v_i \mid pa_i)$

Atomic Intervention/Manipulation
$do(T = t)$: fixing a set $T$ of variables to some constants $t$.
[Figure: the smoking model before and after $do(X = \text{False})$; the intervention removes the mechanism $P(X|U)$ and fixes X]
$P(u, x, z, y) = P(u)\, P(x \mid u)\, P(z \mid x)\, P(y \mid z, u)$
$P_{X=\text{False}}(u, z, y) = P(u)\, P(z \mid X=\text{False})\, P(y \mid z, u)$

Terminology and Notation
Effects of manipulations/interventions/actions.
The causal effect of $T$ on $S$: $P_t(s)$.
Equivalent notations: $P_t(s) = P(s \mid do(t)) = P(s \mid set(t)) = P(s \mid \hat{t}) = P(s \,\|\, t)$

Computing Causal Effects
Given:
- observational data: a distribution $P(v)$
- qualitative causal assumptions: a causal graph
Can we compute the causal effect $P_t(s)$?
Causal BNs with no hidden common causes (truncated factorization):
$P(v) = \prod_i P(v_i \mid pa_i)$
$P_t(v) = \prod_{\{i \mid V_i \notin T\}} P(v_i \mid pa_i)$, for $v$ consistent with $t$.

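Below is a minimal sketch (not from the slides) of the truncated factorization for a Markovian version of the chain X -> Z -> Y; all CPT numbers are made up for illustration.

```python
# Truncated factorization for a Markovian causal BN (no hidden common causes),
# using the chain X -> Z -> Y. CPT numbers are hypothetical.

P_X = {0: 0.6, 1: 0.4}                       # P(x): the factor removed by do(X = x)
P_Z_given_X = {(0, 0): 0.8, (1, 0): 0.2,
               (0, 1): 0.3, (1, 1): 0.7}     # P(z | x), keyed by (z, x)
P_Y_given_Z = {(0, 0): 0.9, (1, 0): 0.1,
               (0, 1): 0.25, (1, 1): 0.75}   # P(y | z), keyed by (y, z)

def joint(x, z, y):
    """Observational factorization P(x, z, y) = P(x) P(z|x) P(y|z)."""
    return P_X[x] * P_Z_given_X[(z, x)] * P_Y_given_Z[(y, z)]

def truncated(x_fixed, z, y):
    """Truncated factorization P_x(z, y): drop the factor for the intervened X."""
    return P_Z_given_X[(z, x_fixed)] * P_Y_given_Z[(y, z)]

# Causal effect of X on Y: marginalize Z out of the truncated product.
P_x1_y1 = sum(truncated(1, z, 1) for z in [0, 1])
print(P_x1_y1)   # P_{X=1}(Y=1)
```
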
Computing Causal Effects
In the presence of unobserved (hidden, latent) variables:
[Figure: X -> Y with a hidden common cause U of X and Y]
Input: causal graph + $P(x, y)$. Can we predict $P_x(y)$?

Computing Causal Effects
Unidentifiable: two models $M_1$ and $M_2$ can agree on $P(x, y)$ yet disagree on $P_x(y)$.
$P(x, y) = \sum_u P^{M_1}(x \mid u) P^{M_1}(y \mid x, u) P^{M_1}(u) = \sum_u P^{M_2}(x \mid u) P^{M_2}(y \mid x, u) P^{M_2}(u)$
$P^{M_1}_x(y) = \sum_u P^{M_1}(y \mid x, u) P^{M_1}(u)$
$P^{M_2}_x(y) = \sum_u P^{M_2}(y \mid x, u) P^{M_2}(u)$
$P^{M_1}_x(y) \neq P^{M_2}_x(y)$

Computing Causal Effects
[Figure: X -> Z -> Y, with a hidden common cause U of X and Y]
Input: causal graph + $P(x, y, z)$.
Output: $P_x(y) = \sum_z P(z \mid x) \sum_{x'} P(y \mid x', z) P(x')$
Identifiable (this is the front-door formula).

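A small sketch of evaluating the formula above directly from an observed joint $P(x, z, y)$; the joint table below is invented purely for illustration.

```python
from itertools import product

vals = [0, 1]
# Hypothetical observed joint P(x, z, y), normalized; the numbers are arbitrary.
raw = {(x, z, y): 1 + x + 2 * z + 3 * y + x * z * y for x, z, y in product(vals, repeat=3)}
total = sum(raw.values())
P = {k: v / total for k, v in raw.items()}

def marg(**fixed):
    """Marginal probability of a partial assignment, e.g. marg(x=1, z=0)."""
    return sum(p for (x, z, y), p in P.items()
               if all({'x': x, 'z': z, 'y': y}[k] == v for k, v in fixed.items()))

def front_door(x, y):
    """P_x(y) = sum_z P(z|x) * sum_{x'} P(y|x',z) P(x')."""
    effect = 0.0
    for z in vals:
        p_z_given_x = marg(x=x, z=z) / marg(x=x)
        adjustment = sum(marg(x=xp, z=z, y=y) / marg(x=xp, z=z) * marg(x=xp)
                         for xp in vals)
        effect += p_z_given_x * adjustment
    return effect

print(front_door(x=1, y=1))   # estimated causal effect P_{X=1}(Y=1)
```
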
Causal Calculus
Pearl's do-calculus:
Rule 1 (Ignoring observations): $P_x(y \mid z, w) = P_x(y \mid w)$ if $(Y \perp\!\!\!\perp Z \mid X, W)_{G_{\overline{X}}}$
Rule 2 (Action/observation exchange): $P_{x,z}(y \mid w) = P_x(y \mid z, w)$ if $(Y \perp\!\!\!\perp Z \mid X, W)_{G_{\overline{X}\underline{Z}}}$
Rule 3 (Ignoring actions): $P_{x,z}(y \mid w) = P_x(y \mid w)$ if $(Y \perp\!\!\!\perp Z \mid X, W)_{G_{\overline{X}\,\overline{Z(W)}}}$
Here $G_{\overline{X}}$ is $G$ with all edges into $X$ removed, $G_{\underline{Z}}$ is $G$ with all edges out of $Z$ removed, and $Z(W)$ is the set of $Z$-nodes that are not ancestors of any $W$-node in $G_{\overline{X}}$.

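The rule preconditions are d-separation tests in mutilated graphs. The sketch below is an illustration (not code from the lecture): it checks the Rule 2 precondition for the front-door graph using the standard moralized-ancestral-graph test for d-separation; the graph encoding and function names are my own.

```python
# Check a do-calculus precondition by d-separation in a mutilated graph.
# Graphs are dicts: node -> set of parents.

def ancestors(G, nodes):
    """All ancestors of `nodes` (including the nodes themselves)."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(G[n])
    return seen

def d_separated(G, X, Y, Z):
    """True iff X is d-separated from Y given Z in DAG G (moral ancestral graph test)."""
    keep = ancestors(G, set(X) | set(Y) | set(Z))
    # Moral graph over the ancestral set: link each node to its parents, marry co-parents.
    adj = {n: set() for n in keep}
    for n in keep:
        pars = [p for p in G[n] if p in keep]
        for p in pars:
            adj[n].add(p)
            adj[p].add(n)
        for i in range(len(pars)):
            for j in range(i + 1, len(pars)):
                adj[pars[i]].add(pars[j])
                adj[pars[j]].add(pars[i])
    blocked = set(Z)
    frontier, reached = [n for n in X if n not in blocked], set()
    while frontier:
        n = frontier.pop()
        if n in reached:
            continue
        reached.add(n)
        frontier.extend(m for m in adj[n] if m not in blocked)
    return not (set(Y) & reached)

# Front-door graph: U -> X, U -> Y, X -> Z, Z -> Y (U hidden but part of the DAG).
G = {'U': set(), 'X': {'U'}, 'Z': {'X'}, 'Y': {'U', 'Z'}}

# Rule 2 precondition for exchanging do(z) and observing z given do(x):
# (Y _||_ Z | X) in G with edges into X removed and edges out of Z removed.
G_mut = {n: set(ps) for n, ps in G.items()}
G_mut['X'] = set()                                                   # remove edges into X
G_mut = {n: {p for p in ps if p != 'Z'} for n, ps in G_mut.items()}  # remove edges out of Z
print(d_separated(G_mut, {'Y'}, {'Z'}, {'X'}))   # True -> Rule 2 applies
```
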
Computing in Do-calculus
[Figure: X -> Z -> Y, with a hidden common cause U of X and Y]
$P_x(y) = \sum_z P_x(y \mid z)\, P_x(z)$
$\quad = \sum_z P_x(y \mid z)\, P(z \mid x)$  (Rule 2)
$\quad = \sum_z P_{x,z}(y)\, P(z \mid x)$  (Rule 2)
$\quad = \sum_z P_z(y)\, P(z \mid x)$  (Rule 3)
$\quad = \sum_z \sum_{x'} P_z(y \mid x')\, P_z(x')\, P(z \mid x) = \ldots$
$\quad = \sum_z P(z \mid x) \sum_{x'} P(y \mid x', z)\, P(x')$
When to use which rule of do-calculus?

Semi-Markovian Models
For convenience of presentation, consider models in which each hidden variable is a root node with exactly two observed children.
Represent the presence of hidden variables with bidirected links.
[Figure: the two earlier example graphs, first drawn with explicit hidden U nodes, then redrawn with a bidirected link between the two observed children of each U]

C-components
The observed variables are partitioned into c-components. Two variables are in the same c-component iff they are connected by a bidirected path (a path on which every link is bidirected).
[Figure: example graph over X, Z_1, Z_2, Y with bidirected links X <-> Z_2 (via U_1) and Z_1 <-> Y (via U_2)]
Two c-components: $S_1 = \{X, Z_2\}$, $S_2 = \{Z_1, Y\}$.

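Computing the c-component partition is just a connected-components computation over the bidirected edges. Here is a minimal sketch (data structures and names are my own) applied to the example graph.

```python
# Partition observed variables into c-components: connected components of the
# graph formed by the bidirected edges alone.

from collections import defaultdict

def c_components(nodes, bidirected_edges):
    """Return the partition of `nodes` induced by bidirected connectivity."""
    adj = defaultdict(set)
    for a, b in bidirected_edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend(adj[n] - comp)
        seen |= comp
        components.append(comp)
    return components

nodes = ['X', 'Z1', 'Z2', 'Y']
bidirected = [('X', 'Z2'), ('Z1', 'Y')]
print(c_components(nodes, bidirected))   # [{'X', 'Z2'}, {'Z1', 'Y'}]
```
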
Decomposition of P(v)
$P(v) = \sum_u \prod_{\{i \mid V_i \in V\}} P(v_i \mid pa_{v_i}) \prod_{\{i \mid U_i \in U\}} P(u_i)$
For any set $S \subseteq V$, define
$Q[S](v) = P_{v \setminus s}(s) = \sum_u \prod_{\{i \mid V_i \in S\}} P(v_i \mid pa_{v_i}) \prod_{\{i \mid U_i \in U\}} P(u_i)$
Theorem (decomposition of the joint). Let the causal graph be partitioned into c-components $S_1, \ldots, S_k$. Then
$P(v) = \prod_i Q[S_i](v) = \prod_i P_{v \setminus s_i}(s_i)$

Decomposition of P(v)
Example:
[Figure: the example graph with hidden U_1 (shared by X and Z_2) and U_2 (shared by Z_1 and Y)]
Two c-components: $S_1 = \{X, Z_2\}$, $S_2 = \{Z_1, Y\}$.
$P(x, y, z_1, z_2) = \sum_{u_1, u_2} P(x \mid u_1) P(z_1 \mid x, u_2) P(z_2 \mid z_1, u_1) P(y \mid x, z_1, z_2, u_2) P(u_1) P(u_2)$
$\quad = \Big[ \sum_{u_1} P(x \mid u_1) P(z_2 \mid z_1, u_1) P(u_1) \Big] \Big[ \sum_{u_2} P(z_1 \mid x, u_2) P(y \mid x, z_1, z_2, u_2) P(u_2) \Big]$
$\quad = Q[S_1](x, z_1, z_2)\; Q[S_2](x, z_1, z_2, y)$
$\quad = P_{y, z_1}(x, z_2)\; P_{x, z_2}(y, z_1)$

Computing the Q[S_i]'s
Theorem. Let the causal graph be partitioned into c-components $S_1, \ldots, S_k$, and let $V_1 < \cdots < V_n$ be a topological order over $V$. Then each $Q[S_i]$ is identifiable and is given by
$Q[S_i](v) = P_{v \setminus s_i}(s_i) = \prod_{\{j \mid V_j \in S_i\}} P(v_j \mid v_1, \ldots, v_{j-1})$

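A small sketch of the theorem in action (not from the slides): given any observed joint over the topological order X < Z1 < Z2 < Y, each $Q[S_i]$ is a product of conditionals, and by the decomposition theorem $Q[S_1] \cdot Q[S_2]$ recovers $P(v)$. The joint table below is made up for illustration.

```python
from itertools import product

order = ['X', 'Z1', 'Z2', 'Y']
vals = [0, 1]

# Hypothetical observed joint P(x, z1, z2, y), normalized.
raw = {v: 1 + sum((i + 1) * b for i, b in enumerate(v)) for v in product(vals, repeat=4)}
Z = sum(raw.values())
P = {v: p / Z for v, p in raw.items()}

def marginal(assignment):
    """P of a partial assignment, given as {'X': 0, 'Z1': 1, ...}."""
    return sum(p for v, p in P.items()
               if all(v[order.index(name)] == val for name, val in assignment.items()))

def Q(S, full_assignment):
    """Q[S] at a full assignment of V: product of P(v_j | v_1, ..., v_{j-1}) over V_j in S."""
    q = 1.0
    for j, name in enumerate(order):
        if name not in S:
            continue
        given = {order[k]: full_assignment[order[k]] for k in range(j)}
        with_vj = dict(given, **{name: full_assignment[name]})
        q *= marginal(with_vj) / marginal(given) if given else marginal(with_vj)
    return q

v = {'X': 1, 'Z1': 0, 'Z2': 1, 'Y': 1}
# With c-components S1 = {X, Z2} and S2 = {Z1, Y}, the decomposition theorem
# says Q[S1](v) * Q[S2](v) equals P(v).
print(Q({'X', 'Z2'}, v) * Q({'Z1', 'Y'}, v), P[(1, 0, 1, 1)])
```
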
Conditional Independences
Theorem. Let $V_1 < \cdots < V_n$ be a topological order over $V$. Then
$P(v_i \mid v_1, \ldots, v_{i-1}) = P(v_i \mid pa(T_i) \setminus \{v_i\})$
where $T_i$ is the c-component of the subgraph $G_{\{V_1, \ldots, V_i\}}$ that contains $V_i$.
In the presence of hidden variables, each variable is independent of its non-descendants given its parents, the non-descendant variables in its c-component, and the parents of those non-descendant variables.

An Example
[Figure: the example graph]
Two c-components: $S_1 = \{X, Z_2\}$, $S_2 = \{Z_1, Y\}$.
Topological order: $X < Z_1 < Z_2 < Y$.
$P(x, y, z_1, z_2) = Q[\{X, Z_2\}]\; Q[\{Z_1, Y\}]$
$Q[\{X, Z_2\}] = P_{y, z_1}(x, z_2) = P(x)\, P(z_2 \mid x, z_1)$
$Q[\{Z_1, Y\}] = P_{x, z_2}(y, z_1) = P(z_1 \mid x)\, P(y \mid x, z_1, z_2)$

Decomposition of $P_{v \setminus h}(h)$
Theorem. Let $H \subseteq V$, and let $G_H$ denote the subgraph of $G$ composed only of the variables in $H$. Assume $G_H$ is partitioned into c-components $H_1, \ldots, H_l$. Then
1. $Q[H] = \prod_i Q[H_i]$, i.e., $P_{v \setminus h}(h) = \prod_i P_{v \setminus h_i}(h_i)$.
2. Each $Q[H_i] = P_{v \setminus h_i}(h_i)$ is computable in terms of $Q[H] = P_{v \setminus h}(h)$.

Computing Q[S]
A procedure is developed for computing $Q[S](v) = P_{v \setminus s}(s)$ that
1. determines the identifiability of $Q[S]$, and
2. expresses an identifiable $Q[S]$ in terms of $P(v)$.

Identifying Causal Effects $P_t(s)$
Let $D = An(S)_{G_{V \setminus T}}$, and assume that the subgraph $G_D$ is partitioned into c-components $D_1, \ldots, D_k$. Then
$P_t(s) = \sum_{(v \setminus t) \setminus s} P_t(v \setminus t) = \sum_{(v \setminus t) \setminus s} Q[V \setminus T] = \cdots = \sum_{d \setminus s} \prod_i Q[D_i]$
$P_t(s)$ is identifiable iff each $Q[D_i]$ is identifiable.

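As a concrete worked instance (my own sketch, not from the slides), applying the theorem to the front-door graph from the earlier slides with $T = \{X\}$, $S = \{Y\}$ recovers the front-door formula. The step computing $Q[\{Y\}]$ from $Q[\{X,Y\}]$ uses a standard lemma, not stated explicitly on these slides, that if $W$ is an ancestral set in $G_C$ then $Q[W] = \sum_{c \setminus w} Q[C]$.

```latex
% Front-door graph: X -> Z -> Y with X <-> Y; take T = {X}, S = {Y}.
% In G_{V \setminus T} = G_{\{Z,Y\}}: D = An(Y) = {Z, Y}, with c-components
% D_1 = {Z} and D_2 = {Y}, so P_x(y) = \sum_z Q[{Z}] Q[{Y}].
% The c-components of the full graph G are {X,Y} and {Z}; with order X < Z < Y:
\begin{align*}
Q[\{Z\}]   &= P(z \mid x) \\
Q[\{X,Y\}] &= P(x)\,P(y \mid x, z) \\
% {Y} is ancestral in G_{\{X,Y\}} (there is no directed edge X -> Y), so Q[{Y}]
% is obtained from Q[{X,Y}] by summing out X:
Q[\{Y\}]   &= \sum_{x} Q[\{X,Y\}] \;=\; \sum_{x'} P(x')\,P(y \mid x', z) \\
P_x(y)     &= \sum_z Q[\{Z\}]\,Q[\{Y\}] \;=\; \sum_z P(z \mid x) \sum_{x'} P(y \mid x', z)\,P(x')
\end{align*}
```
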
Computing $P_t(s)$ - Summary
- A complete algorithm is developed that either determines $P_t(s)$ to be unidentifiable or expresses $P_t(s)$ in terms of $P(v)$.
- Do-calculus is complete for computing causal effects.
- Open question: computing causal effects in partially known DAGs, or PAGs.

Outline
- Computing the effects of manipulations
- Inferring constraints implied by DAGs with hidden variables
- Determining the causes of effects

Implications of Causal Models
The validity of a causal model can be tested only if it has empirical implications, that is, only if it imposes constraints on the data.
- No hidden variables: the observational implications of a BN are completely captured by the conditional independence relationships read off by d-separation.
- When hidden variables are present: other types of constraints on the observed distribution arise.

An Example
[Figure: A -> B -> C -> D with a hidden common cause U of B and D]
$P(a, b, c, d)$ must satisfy
$\sum_b P(d \mid a, b, c)\, P(b \mid a) = f(c, d)$
i.e., $\sum_b P(d \mid a, b, c)\, P(b \mid a) = \sum_b P(d \mid a', b, c)\, P(b \mid a')$
Functional constraints.

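A minimal numeric check of the functional constraint above (my own sketch, assuming the graph A -> B -> C -> D with a hidden common cause U of B and D): random CPTs are drawn for a binary model, and the quantity $\sum_b P(d \mid a, b, c) P(b \mid a)$ should come out the same for every value of $a$.

```python
import itertools
import random

random.seed(0)
vals = [0, 1]

def rand_cpt(n_parents):
    """Random conditional distribution of a binary variable given n_parents binary parents."""
    cpt = {}
    for pa in itertools.product(vals, repeat=n_parents):
        p1 = random.random()
        cpt[pa] = {0: 1 - p1, 1: p1}
    return cpt

P_U, P_A = rand_cpt(0), rand_cpt(0)
P_B = rand_cpt(2)   # B | A, U
P_C = rand_cpt(1)   # C | B
P_D = rand_cpt(2)   # D | C, U

# Observed joint P(a, b, c, d) = sum_u P(u) P(a) P(b|a,u) P(c|b) P(d|c,u)
P = {}
for a, b, c, d in itertools.product(vals, repeat=4):
    P[(a, b, c, d)] = sum(P_U[()][u] * P_A[()][a] * P_B[(a, u)][b]
                          * P_C[(b,)][c] * P_D[(c, u)][d] for u in vals)

def cond(num_idx, num_val, den_idx, den_val):
    """P(vars at num_idx = num_val | vars at den_idx = den_val), from the joint."""
    num = sum(p for k, p in P.items()
              if all(k[i] == v for i, v in zip(num_idx, num_val))
              and all(k[i] == v for i, v in zip(den_idx, den_val)))
    den = sum(p for k, p in P.items()
              if all(k[i] == v for i, v in zip(den_idx, den_val)))
    return num / den

# Indices: 0 = A, 1 = B, 2 = C, 3 = D.
for c, d in itertools.product(vals, repeat=2):
    values = [sum(cond([3], [d], [0, 1, 2], [a, b, c]) * cond([1], [b], [0], [a])
                  for b in vals) for a in vals]
    print(c, d, values)   # the entries for a = 0 and a = 1 should agree
```
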
Applications
- Empirically validating causal models.
- Distinguishing causal models that share the same set of conditional independence relationships.
[Figure: two models (a) and (b) over A, B, C, D, each with a hidden variable U]
Both models imply the same independence statement: A is independent of C given B.
