causal inference as computational learning

CAUSAL INFERENCE AS COMPUTATIONAL LEARNING Judea Pearl - PowerPoint PPT Presentation



  1. CAUSAL INFERENCE AS COMPUTATIONAL LEARNING
     Judea Pearl, University of California, Los Angeles (www.cs.ucla.edu/~judea)

  2. OUTLINE
     • Inference: statistical vs. causal, distinctions and mental barriers
     • Formal semantics for counterfactuals: definition, axioms, graphical representations
     • Inference to three types of claims:
       1. Effect of potential interventions
       2. Attribution (causes of effects)
       3. Direct and indirect effects

  3. TRADITIONAL STATISTICAL INFERENCE PARADIGM
     [Diagram: Joint Distribution $P$ → Data → Inference → $Q(P)$ (aspects of $P$)]
     e.g., Infer whether customers who bought product A would also buy product B: $Q = P(B \mid A)$

  4. FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES
     Probability and statistics deal with static relations.
     [Diagram: Joint Distribution $P$ → (change) → Joint Distribution $P'$; Data → Inference → $Q(P')$ (aspects of $P'$)]
     What happens when $P$ changes?
     e.g., Infer whether customers who bought product A would still buy A if we were to double the price.

  5. FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES
     What remains invariant when $P$ changes, say, to satisfy $P'(\text{price} = 2) = 1$?
     [Diagram: Joint Distribution $P$ → (change) → Joint Distribution $P'$; Data → Inference → $Q(P')$ (aspects of $P'$)]
     Note: $P'(v) \neq P(v \mid \text{price} = 2)$
     $P$ does not tell us how it ought to change.
     e.g., Curing symptoms vs. curing diseases
     e.g., Analogy: mechanical deformation

  6. FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES (CONT)
     1. Causal and statistical concepts do not mix.
        CAUSAL: Spurious correlation; Randomization; Confounding / Effect; Instrument; Holding constant; Explanatory variables
        STATISTICAL: Regression; Association / Independence; “Controlling for” / Conditioning; Odds and risk ratios; Collapsibility; Propensity score

  7. FROM STATISTICAL TO CAUSAL ANALYSIS: 2. MENTAL BARRIERS
     1. Causal and statistical concepts do not mix.
        CAUSAL: Spurious correlation; Randomization; Confounding / Effect; Instrument; Holding constant; Explanatory variables
        STATISTICAL: Regression; Association / Independence; “Controlling for” / Conditioning; Odds and risk ratios; Collapsibility; Propensity score
     2. No causes in, no causes out (Cartwright, 1989):
        {statistical assumptions + data, causal assumptions} ⇒ causal conclusions
     3. Causal assumptions cannot be expressed in the mathematical language of standard statistics.

  8. FROM STATISTICAL TO CAUSAL ANALYSIS: 2. MENTAL BARRIERS
     1. Causal and statistical concepts do not mix.
        CAUSAL: Spurious correlation; Randomization; Confounding / Effect; Instrument; Holding constant; Explanatory variables
        STATISTICAL: Regression; Association / Independence; “Controlling for” / Conditioning; Odds and risk ratios; Collapsibility; Propensity score
     2. No causes in, no causes out (Cartwright, 1989):
        {statistical assumptions + data, causal assumptions} ⇒ causal conclusions
     3. Causal assumptions cannot be expressed in the mathematical language of standard statistics.
     4. Non-standard mathematics:
        a) Structural equation models (Wright, 1920; Simon, 1960)
        b) Counterfactuals (Neyman-Rubin $Y_x$; Lewis $x \mathrel{\Box\!\!\rightarrow} Y$)

  9. WHY CAUSALITY NEEDS SPECIAL MATHEMATICS
     Scientific equations (e.g., Hooke’s Law) are non-algebraic.
     e.g., Pricing Policy: “Double the competitor’s price”
     Process information: $Y = 2X$, $X = 1$
     The solution: $Y = 2$, $X = 1$
     Correct notation: $Y := 2X$
     Had $X$ been 3, $Y$ would be 6.
     If we raise $X$ to 3, $Y$ would be 6.
     Must “wipe out” $X = 1$.

  10. WHY CAUSALITY NEEDS SPECIAL MATHEMATICS
      Scientific equations (e.g., Hooke’s Law) are non-algebraic.
      e.g., Pricing Policy: “Double the competitor’s price”
      Process information: $Y \leftarrow 2X$, $X = 1$
      The solution: $Y = 2$, $X = 1$
      Correct notation: $Y := 2X$ (or) $Y \leftarrow 2X$
      Had $X$ been 3, $Y$ would be 6.
      If we raise $X$ to 3, $Y$ would be 6.
      Must “wipe out” $X = 1$.
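The difference between the assignment Y := 2X and the algebraic equation Y = 2X can be sketched in a few lines of Python. This is only an illustration: the function name `run_model` and its keyword arguments are hypothetical, not taken from the slides.

```python
# Illustrative sketch (names are hypothetical): the pricing policy Y := 2X
# treated as a directional mechanism rather than a symmetric equation.

def run_model(do_x=None, do_y=None):
    """Solve the two-mechanism model, optionally overriding a mechanism."""
    x = 1 if do_x is None else do_x        # mechanism for X (X = 1), or wiped out
    y = 2 * x if do_y is None else do_y    # mechanism for Y (Y := 2X), or wiped out
    return x, y

observed = run_model()            # the solution of the unmodified model
raised_x = run_model(do_x=3)      # "wipe out" X = 1 and set X to 3
forced_y = run_model(do_y=6)      # setting Y does not propagate back to X

print(observed, raised_x, forced_y)
```

Read algebraically, Y = 2X with Y forced to 6 would imply X = 3; under the assignment reading, forcing Y leaves X at 1, which is the asymmetry the ":=" notation is meant to capture.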

  11. THE STRUCTURAL MODEL PARADIGM
      [Diagram: Data Generating Model $M$ → Joint Distribution → Data → Inference → $Q(M)$ (aspects of $M$)]
      $M$: the invariant strategy (mechanism, recipe, law, protocol) by which Nature assigns values to variables in the analysis.

  12. FAMILIAR CAUSAL MODEL: ORACLE FOR MANIPULATION
      [Figure: diagram labeled X, Y, Z, INPUT, OUTPUT]

  13. STRUCTURAL CAUSAL MODELS
      Definition: A structural causal model is a 4-tuple $\langle V, U, F, P(u) \rangle$, where
      • $V = \{V_1, \ldots, V_n\}$ are observable variables
      • $U = \{U_1, \ldots, U_m\}$ are background variables
      • $F = \{f_1, \ldots, f_n\}$ are functions determining $V$, $v_i = f_i(v, u)$
      • $P(u)$ is a distribution over $U$
      $P(u)$ and $F$ induce a distribution $P(v)$ over observable variables.
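As a rough sketch of how $P(u)$ and $F$ induce $P(v)$, the following Python enumerates a tiny binary model. The variables X and Y, the two mechanisms, and all probabilities are assumptions chosen for illustration; they do not appear in the slides.

```python
from collections import defaultdict
from itertools import product

# Assumed toy model: two independent binary background variables.
P_U1 = {0: 0.5, 1: 0.5}
P_U2 = {0: 0.9, 1: 0.1}

# F: one function per observable variable, listed in an order in which
# each function only needs values that were already computed.
F = {
    "X": lambda v, u: u["U1"],            # x = f_X(u1)
    "Y": lambda v, u: v["X"] ^ u["U2"],   # y = f_Y(x, u2)
}

def solve(functions, u):
    """Compute the unique values of V for one background state u."""
    v = {}
    for name, f in functions.items():
        v[name] = f(v, u)
    return v

def induced_distribution(functions):
    """P(v): sum P(u) over all background states u that produce v."""
    dist = defaultdict(float)
    for u1, u2 in product(P_U1, P_U2):
        v = solve(functions, {"U1": u1, "U2": u2})
        dist[(v["X"], v["Y"])] += P_U1[u1] * P_U2[u2]
    return dict(dist)

P_V = induced_distribution(F)
print(P_V)
```

The randomness lives entirely in $P(u)$; the functions in $F$ are deterministic, so each background state maps to exactly one assignment of the observables.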

  14. STRUCTURAL MODELS AND CAUSAL DIAGRAMS
      The arguments of the functions $v_i = f_i(v, u)$ define a graph:
      $v_i = f_i(pa_i, u_i)$, $PA_i \subseteq V \setminus V_i$, $U_i \subseteq U$
      Example: Price – Quantity equations in economics
      $q = b_1 p + d_1 i + u_1$
      $p = b_2 q + d_2 w + u_2$
      [Figure: causal diagram over $I$, $W$, $Q$, $P$ with background variables $U_1$, $U_2$; the parent set $PA_Q$ is marked]

  15. STRUCTURAL MODELS AND INTERVENTION
      Let $X$ be a set of variables in $V$.
      The action $do(x)$ sets $X$ to constants $x$ regardless of the factors which previously determined $X$.
      $do(x)$ replaces all functions $f_i$ determining $X$ with the constant functions $X = x$, to create a mutilated model $M_x$.
      $q = b_1 p + d_1 i + u_1$
      $p = b_2 q + d_2 w + u_2$
      [Figure: causal diagram over $I$, $W$, $Q$, $P$ with background variables $U_1$, $U_2$]

  16. STRUCTURAL MODELS AND INTERVENTION
      Let $X$ be a set of variables in $V$.
      The action $do(x)$ sets $X$ to constants $x$ regardless of the factors which previously determined $X$.
      $do(x)$ replaces all functions $f_i$ determining $X$ with the constant functions $X = x$, to create a mutilated model $M_x$.
      Mutilated model $M_p$:
      $q = b_1 p + d_1 i + u_1$
      $p = p_0$ (replacing $p = b_2 q + d_2 w + u_2$)
      [Figure: causal diagram over $I$, $W$, $Q$, $P$ with background variables $U_1$, $U_2$, with $P = p_0$]
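The mutilation step can be made concrete with a small numeric sketch of the price-quantity model. The function names and all coefficient values below are assumptions made for illustration; only the two structural equations come from the slides.

```python
# Sketch of do(P = p0) on the price-quantity equations
#   q = b1*p + d1*i + u1
#   p = b2*q + d2*w + u2
# All numeric values are made up for illustration.

def solve_original(b1, d1, b2, d2, i, w, u1, u2):
    """Solve both equations simultaneously (requires b1*b2 != 1)."""
    denom = 1.0 - b1 * b2
    if denom == 0:
        raise ValueError("equations have no unique solution when b1*b2 == 1")
    q = (b1 * (d2 * w + u2) + d1 * i + u1) / denom
    p = b2 * q + d2 * w + u2
    return q, p

def solve_do_p(b1, d1, i, u1, p0):
    """Mutilated model: the price equation is replaced by p = p0."""
    q = b1 * p0 + d1 * i + u1   # the demand equation is left untouched
    return q, p0

params = dict(b1=-0.5, d1=1.0, b2=0.5, d2=1.0)
q, p = solve_original(**params, i=2.0, w=1.0, u1=0.0, u2=0.0)
q_do, p_do = solve_do_p(params["b1"], params["d1"], i=2.0, u1=0.0, p0=3.0)
print((q, p), (q_do, p_do))
```

In the mutilated model, $w$, $u_2$, and $b_2$ no longer influence the price, while the equation for $q$ keeps its original form; that is the sense in which the remaining mechanisms stay invariant under the intervention.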

  17. CAUSAL MODELS AND COUNTERFACTUALS
      Definition: The sentence “$Y$ would be $y$ (in situation $u$), had $X$ been $x$,” denoted $Y_x(u) = y$, means:
      The solution for $Y$ in a mutilated model $M_x$ (i.e., the equations for $X$ replaced by $X = x$) with input $U = u$ is equal to $y$.
      The Fundamental Equation of Counterfactuals:
      $Y_x(u) = Y_{M_x}(u)$

  18. CAUSAL MODELS AND COUNTERFACTUALS
      Definition: The sentence “$Y$ would be $y$ (in situation $u$), had $X$ been $x$,” denoted $Y_x(u) = y$, means:
      The solution for $Y$ in a mutilated model $M_x$ (i.e., the equations for $X$ replaced by $X = x$) with input $U = u$ is equal to $y$.
      The Fundamental Equation of Counterfactuals:
      $Y_x(u) = Y_{M_x}(u)$
      • Joint probabilities of counterfactuals:
        $P(Y_x = y, Z_w = z) = \sum_{u:\, Y_x(u) = y,\, Z_w(u) = z} P(u)$
      In particular:
        $P(y \mid do(x)) \triangleq P(Y_x = y) = \sum_{u:\, Y_x(u) = y} P(u)$
        PN: $P(Y_{x'} = y' \mid x, y) = \sum_{u:\, Y_{x'}(u) = y'} P(u \mid x, y)$
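The sums over $u$ above can be evaluated by brute-force enumeration when the background variables are discrete. The sketch below uses an assumed toy model (X = U1, Y = X OR U2, with made-up probabilities); the helper names are hypothetical.

```python
from itertools import product

# Assumed toy model: X = U1, Y = X OR U2, with independent binary U1, U2.
P_U1 = {0: 0.5, 1: 0.5}
P_U2 = {0: 0.8, 1: 0.2}
P_U = {(u1, u2): P_U1[u1] * P_U2[u2] for u1, u2 in product(P_U1, P_U2)}

def solve(u, do_x=None):
    """Return (x, y) in the original model, or in M_x when do_x is given."""
    u1, u2 = u
    x = u1 if do_x is None else do_x
    y = x | u2
    return x, y

def p_counterfactual(x, y):
    """P(Y_x = y): total P(u) over states u where Y_x(u) = y."""
    return sum(p for u, p in P_U.items() if solve(u, do_x=x)[1] == y)

def p_counterfactual_given(x_alt, y_alt, x, y):
    """P(Y_{x'} = y' | x, y): the same sum, weighted by P(u | x, y)."""
    evidence = {u: p for u, p in P_U.items() if solve(u) == (x, y)}
    total = sum(evidence.values())
    hits = sum(p for u, p in evidence.items()
               if solve(u, do_x=x_alt)[1] == y_alt)
    return hits / total

p_y1_do_x0 = p_counterfactual(0, 1)
pn = p_counterfactual_given(0, 0, 1, 1)   # would Y be 0 had X been 0, given X=1, Y=1?
print(p_y1_do_x0, pn)
```

Conditioning on the evidence $(x, y)$ only reweights the background states; the counterfactual itself is still read off the mutilated model for each $u$.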

  19. AXIOMS OF CAUSAL COUNTERFACTUALS
      $Y_x(u) = y$: $Y$ would be $y$, had $X$ been $x$ (in state $U = u$)
      1. Definiteness: $\exists x \in X$ s.t. $X_y(u) = x$
      2. Uniqueness: $(X_y(u) = x) \;\&\; (X_y(u) = x') \Rightarrow x = x'$
      3. Effectiveness: $X_{xw}(u) = x$
      4. Composition: $W_x(u) = w \Rightarrow Y_{xw}(u) = Y_x(u)$
      5. Reversibility: $(Y_{xw}(u) = y) \;\&\; (W_{xy}(u) = w) \Rightarrow Y_x(u) = y$
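For a small recursive model, effectiveness, composition, and reversibility can be checked exhaustively. The sketch below does this for an assumed three-variable binary model; the mechanisms and the function names are illustrative only, and passing the check for one model is of course not a proof of the axioms.

```python
from itertools import product

def solve(u, do=None):
    """Solve an assumed recursive model X -> W -> Y under optional interventions."""
    do = do or {}
    u0, u1, u2 = u
    v = {}
    v["X"] = do.get("X", u0)
    v["W"] = do.get("W", v["X"] ^ u1)
    v["Y"] = do.get("Y", (v["X"] & v["W"]) | u2)
    return v

def check_axioms():
    """Return True if the three axioms hold for every u and every value choice."""
    for u in product((0, 1), repeat=3):
        for x, w, y in product((0, 1), repeat=3):
            # Effectiveness: X_{xw}(u) = x
            if solve(u, {"X": x, "W": w})["X"] != x:
                return False
            y_x = solve(u, {"X": x})["Y"]
            # Composition: W_x(u) = w  =>  Y_{xw}(u) = Y_x(u)
            if solve(u, {"X": x})["W"] == w:
                if solve(u, {"X": x, "W": w})["Y"] != y_x:
                    return False
            # Reversibility: Y_{xw}(u) = y and W_{xy}(u) = w  =>  Y_x(u) = y
            if (solve(u, {"X": x, "W": w})["Y"] == y
                    and solve(u, {"X": x, "Y": y})["W"] == w):
                if y_x != y:
                    return False
    return True

axioms_hold = check_axioms()
print(axioms_hold)
```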

  20. INFERRING THE EFFECT OF INTERVENTIONS
      The problem: To predict the impact of a proposed intervention using data obtained prior to the intervention.
      The solution (conditional): Causal Assumptions + Data → Policy Claims
      1. Mathematical tools for communicating causal assumptions formally and transparently.
      2. Deciding (mathematically) whether the assumptions communicated are sufficient for obtaining consistent estimates of the prediction required.
      3. Deriving (if (2) is affirmative) a closed-form expression for the predicted impact.
      4. Suggesting (if (2) is negative) a set of measurements and experiments that, if performed, would render a consistent estimate feasible.

  21. NON-PARAMETRIC STRUCTURAL MODELS
      Given $P(x, y, z)$, should we ban smoking?
      [Figure: two causal diagrams over $X$ (Smoking), $Z$ (Tar in Lungs), $Y$ (Cancer) with background variables $U_1$, $U_2$, $U_3$; the linear diagram carries path coefficients $\alpha$, $\beta$, the nonparametric diagram carries functions $f_1$, $f_2$, $f_3$]
      Linear Analysis:
        $x = u_1$
        $z = \alpha x + u_2$
        $y = \beta z + \gamma u_1 + u_3$
        Find: $\alpha \cdot \beta$
      Nonparametric Analysis:
        $x = f_1(u_1)$
        $z = f_2(x, u_2)$
        $y = f_3(z, u_1, u_3)$
        Find: $P(y \mid do(x))$

  22. EFFECT OF INTERVENTION: AN EXAMPLE
      Given $P(x, y, z)$, should we ban smoking?
      [Figure: the same two causal diagrams over $X$ (Smoking), $Z$ (Tar in Lungs), $Y$ (Cancer) with background variables $U_1$, $U_2$, $U_3$; in the nonparametric diagram $X$ is fixed, $X = x$]
      Linear Analysis:
        $x = u_1$
        $z = \alpha x + u_2$
        $y = \beta z + \gamma u_1 + u_3$
        Find: $\alpha \cdot \beta$
      Nonparametric Analysis:
        $x = \text{const.}$
        $z = f_2(x, u_2)$
        $y = f_3(z, u_1, u_3)$
        Find: $P(y \mid do(x)) \triangleq P(Y = y)$ in the new model
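In the linear model, holding the background variables fixed and moving $X$ by one unit under $do(x)$ changes $Y$ by exactly $\alpha \cdot \beta$, because the $\gamma u_1$ term is no longer tied to $x$. A minimal sketch, with made-up coefficient and background values and a hypothetical function name:

```python
# Sketch of the linear analysis under intervention. The equation x = u1 is
# replaced by x = const., so u1 only enters y through the gamma*u1 term.
# All numeric values are assumptions for illustration.

def y_under_do(x, u1, u2, u3, alpha, beta, gamma):
    """Solve z and y in the mutilated linear model with X set to x."""
    z = alpha * x + u2
    y = beta * z + gamma * u1 + u3
    return y

alpha, beta, gamma = 2.0, 3.0, 5.0
u = dict(u1=0.7, u2=-0.4, u3=1.1)

# Unit change in X with the same background state u.
effect = (y_under_do(1.0, **u, alpha=alpha, beta=beta, gamma=gamma)
          - y_under_do(0.0, **u, alpha=alpha, beta=beta, gamma=gamma))
print(effect, alpha * beta)
```

The value of `gamma` drops out of `effect`, whereas in the unmodified model (where $x = u_1$) a unit difference in $x$ would also shift $y$ through $\gamma$.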

  23. EFFECT OF INTERVENTION: AN EXAMPLE (cont)
      Given $P(x, y, z)$, should we ban smoking?
      [Figure: two causal diagrams over $X$ (Smoking), $Z$ (Tar in Lungs), $Y$ (Cancer) with $U$ (unobserved); in one diagram $X$ is fixed, $X = x$]

  24. EFFECT OF INTERVENTION: AN EXAMPLE (cont)
      Given $P(x, y, z)$, should we ban smoking?
      [Figure: two causal diagrams over $X$ (Smoking), $Z$ (Tar in Lungs), $Y$ (Cancer) with $U$ (unobserved); in the post-intervention diagram $X$ is fixed, $X = x$]
      Pre-intervention:
        $P(x, y, z) = \sum_u P(u)\, P(x \mid u)\, P(z \mid x)\, P(y \mid z, u)$
      Post-intervention:
        $P(y, z \mid do(x)) = \sum_u P(u)\, P(z \mid x)\, P(y \mid z, u)$
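The two factorizations differ only in the factor $P(x \mid u)$, which the intervention removes. The sketch below evaluates both for binary variables; every conditional probability table is made up for illustration, and the sums over $u$ are possible here only because the toy model specifies $U$ explicitly.

```python
from itertools import product

# Assumed conditional probability tables (all values are illustrative).
P_u = {0: 0.5, 1: 0.5}
P_x_given_u = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}   # [u][x]
P_z_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}   # [x][z]
P_y_given_zu = {                                           # [(z, u)][y]
    (0, 0): {0: 0.9, 1: 0.1}, (0, 1): {0: 0.6, 1: 0.4},
    (1, 0): {0: 0.5, 1: 0.5}, (1, 1): {0: 0.2, 1: 0.8},
}

def pre(x, y, z):
    """Pre-intervention joint P(x, y, z)."""
    return sum(P_u[u] * P_x_given_u[u][x] * P_z_given_x[x][z]
               * P_y_given_zu[(z, u)][y] for u in P_u)

def post(y, z, x):
    """Post-intervention P(y, z | do(x)): the factor P(x | u) is dropped."""
    return sum(P_u[u] * P_z_given_x[x][z]
               * P_y_given_zu[(z, u)][y] for u in P_u)

# Interventional vs. observational probability of Y = 1 when X = 1.
p_y1_do_x1 = sum(post(1, z, 1) for z in (0, 1))
p_x1 = sum(pre(1, y, z) for y, z in product((0, 1), repeat=2))
p_y1_given_x1 = sum(pre(1, 1, z) for z in (0, 1)) / p_x1
print(p_y1_do_x1, p_y1_given_x1)
```

With these tables the interventional probability is lower than the conditional one, since conditioning on $X = 1$ also shifts the distribution of the unobserved $U$, while $do(x)$ leaves it at $P(u)$.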
