
Distinguishing Cause and Effect
Balram Meena, Lohit Jain
Indian Institute of Technology Kanpur

Motivation
• Causal questions are pervasive in science, medicine, economics and many aspects of everyday life.
• What affects health, the economy, climate change?
• Gold standard: randomized controlled experiments.
• But experiments can be costly, unethical or infeasible!
• Observational (non-experimental) routine data are easily available.

Causal Graph Example
[Figure: causal graph over the variables Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Attention Disorder, Lung Cancer, Coughing, Fatigue, Car Accident]
Source: http://causality.inf.ethz.ch/cause-effect.php?page=data

Causality Challenge #3: Cause Effect Pairs
• Part of the IJCNN 2013 contests
• Results discussed at NIPS 2013
• Proceedings: Journal of Machine Learning Research, Workshop and Conference Proceedings (JMLR W&CP)

Causality Challenge #3: Cause Effect Pairs
• Challenge: rank pairs of variables {A, B} to prioritize experimental verification of the conjecture that A causes B.
• Determine from joint observations of samples of the two variables A and B whether A -> B.
• But "correlation does not mean causation"!
• A and B could both be consequences of a common cause.

Setup
• No feedback loops.
• No explicit time information.
• Variables are aggregate statistics, e.g. temperature, life expectancy.
• Pairs are independent of each other.

Datasets
• Pairs of real variables intermixed with
  • controls (dependent but not causally related) and
  • semi-artificial cause-effect pairs (real variables mixed in various ways to produce a given outcome)
• 4050 training pairs
• 4050 validation pairs
• 4050 test pairs

Cause Effect Pair problem
Relation    A              B
A -> B      Smoking        Lung Cancer
A <- B      Lung Cancer    Fatigue
A – B       Lung Cancer    Attention Disorder (common cause: Genetics)
A | B       Lung Cancer    Born an Even Day
Source: http://causality.inf.ethz.ch/cause-effect.php?page=data

Evaluation Scheme
• For each pair, give a score between -Inf and +Inf:
  • Large positive values: A is a cause of B with certainty
  • Large negative values: B is a cause of A with certainty
  • Near zero: neither A causes B nor B causes A
• Scores are used as a ranking criterion.
• Entries are evaluated with two Area Under the ROC Curve (AUC) scores.

Area Under the ROC curve
• The results of classification, obtained by thresholding the prediction score, may be represented in a confusion matrix, where tp (true positive), fn (false negative), tn (true negative) and fp (false positive) represent the number of examples falling into each possible outcome.
• We define the sensitivity (also called true positive rate or hit rate) and the specificity (true negative rate) as:
  • Sensitivity = tp/pos
  • Specificity = tn/neg
  where pos = tp + fn is the total number of positive examples and neg = tn + fp the total number of negative examples.
• The AUC is the area under the curve obtained by plotting sensitivity against specificity while varying a threshold on the prediction values to determine the classification result.
• The AUC is calculated using the trapezoid method.
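The definitions above can be turned into a small computation. The following is a minimal sketch, not the challenge's official scoring code: the function name `auc_trapezoid` and the toy labels and scores are assumptions made for illustration. It plots sensitivity against 1 - specificity (the usual ROC convention), which encloses the same area as the sensitivity-versus-specificity curve described above.

```python
import numpy as np

def auc_trapezoid(labels, scores):
    """AUC by thresholding the scores and integrating with the trapezoid rule.

    labels: 1/True for positive examples, 0/False for negative examples.
    scores: prediction scores; higher means "more likely positive".
    (Hypothetical helper, written only for this illustration.)
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = labels.sum()        # pos = tp + fn
    neg = (~labels).sum()     # neg = tn + fp

    # Start at the point where nothing is predicted positive.
    sensitivity = [0.0]
    false_pos_rate = [0.0]    # equals 1 - specificity
    # Sweep the threshold from the highest score to the lowest.
    for threshold in np.unique(scores)[::-1]:
        predicted = scores >= threshold
        tp = np.sum(predicted & labels)
        fp = np.sum(predicted & ~labels)
        sensitivity.append(tp / pos)
        false_pos_rate.append(fp / neg)

    sensitivity = np.array(sensitivity)
    false_pos_rate = np.array(false_pos_rate)
    # Trapezoid rule: segment width times average segment height.
    widths = np.diff(false_pos_rate)
    heights = (sensitivity[1:] + sensitivity[:-1]) / 2.0
    return float(np.sum(widths * heights))

# Toy example: one negative example is ranked above one positive example.
toy_auc = auc_trapezoid([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

In the toy example three of the four (positive, negative) pairs are ranked correctly, so the area is 0.75.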

Causality in two variables: Intuitively
• Intuition: factorizing the joint distribution P(cause, effect) into P(cause) P(effect | cause) typically yields models of lower total complexity than factorizing it into P(effect) P(cause | effect).
• Making this intuition precise (how to define "complexity") is not obvious!

Previous Models
• The methods define classes of conditionals C and marginal distributions M, and prefer X -> Y whenever P(X) ∈ M and P(Y | X) ∈ C but P(Y) ∉ M or P(X | Y) ∉ C.
• Notion of model complexity: all probability distributions inside the class are simple, and those outside the class are complex.
• This a priori restriction poses serious practical limitations.

Causality in two variables
• Deterministic: f(X,E) = F(X)
• Non-deterministic:
  I. AN (additive noise): f(X,E) = F(X) + E
  II. PNL (post-nonlinear model): f(X,E) = G(F(X) + E)
  III. LiNGAM (f is linear): f(X,E) = pX + qE
  IV. HS (heteroscedastic noise): f(X,E) = F(X) + E·G(X)
• The idea is to fit the restricted model in both directions (X -> Y and Y -> X).
• The inferred direction is the one that yields the best fit.
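As an illustration of "fit in both directions and keep the better fit", here is a minimal sketch for the additive noise (AN) case. Several choices are assumptions of this sketch and are not taken from the slides: polynomial regression stands in for the unknown F, "best fit" is measured by how independent the residuals are of the input, that dependence is estimated with a biased HSIC statistic using Gaussian kernels of fixed bandwidth, and the names `hsic`, `residual_dependence` and `anm_direction` are hypothetical.

```python
import numpy as np

def rbf_gram(v, sigma=1.0):
    """Gaussian-kernel Gram matrix of a 1-D sample."""
    sq_dists = (v[:, None] - v[None, :]) ** 2
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def hsic(a, b):
    """Biased HSIC estimate; larger values indicate stronger dependence."""
    a = (a - a.mean()) / a.std()   # standardize so a fixed bandwidth is sensible
    b = (b - b.mean()) / b.std()
    n = len(a)
    centering = np.eye(n) - np.ones((n, n)) / n
    return float(np.trace(rbf_gram(a) @ centering @ rbf_gram(b) @ centering) / n ** 2)

def residual_dependence(cause, effect, degree=3):
    """Fit effect = F(cause) + E with a polynomial F, then score how
    strongly the residuals E depend on the assumed cause."""
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, residuals)

def anm_direction(x, y, degree=3):
    """Prefer the direction whose residuals look more independent of the input."""
    forward = residual_dependence(x, y, degree)    # model X -> Y
    backward = residual_dependence(y, x, degree)   # model Y -> X
    label = "X -> Y" if forward < backward else "Y -> X"
    return label, forward, backward

# Synthetic data generated as Y = X^3 + E, so the true direction is X -> Y.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=400)
y = x ** 3 + rng.normal(0.0, 1.0, size=400)
direction, forward_score, backward_score = anm_direction(x, y)
```

In the forward direction the residuals are essentially the injected noise. In the backward direction their spread changes with Y, which the dependence score picks up. A real analysis would use a more flexible regressor and a proper independence test with a significance level.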

Probabilistic Latent Variable: Additional Assumptions
A. Determinism (no other causes of Y): a function f exists such that Y = f(X,E).
B. X and E are independent.
C. The distribution of the cause is "independent" from the causal mechanism (f).
D. The noise has a standard-normal distribution: E ~ N(0,1).

Other Models
• Based on (A) and (B) with some additional restrictions on f (Slide 13).
• For these special cases, it has been shown that a model of the same (restricted) form in the reverse direction Y -> X that induces the same joint distribution on (X, Y) does not exist in general.
• But a limited model class may lead to wrong conclusions about the causal direction.

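Probabilistic Latent Variable Model
• In general, one can always construct a random variable E' ~ N(0,1) and a function f' : R^2 -> R such that X = f'(Y, E').
• So assumptions (A) and (B) alone give no asymmetry between the two directions; in combination with (C) and (D) there is an asymmetry!
• This asymmetry is used to infer the causal direction.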

Basic Idea
• Define non-parametric priors on the function f and on the input distribution that favor lower complexity.
• Infer the direction using standard Bayesian model selection.
• Prefer the model with the largest marginal likelihood.
• Bayesian approach: the noise is a latent variable summarizing the influence of all other unobserved causes.

Bayesian Model Selection
• Prefer the model with the highest evidence: $p(D \mid M) = \int p(D \mid \theta, M)\, p(\theta \mid M)\, d\theta$, where D = data, M = model, θ = parameters.
• Trade-off between likelihood (goodness of fit) and priors (model complexity).
• Causal discovery: compare the evidence for X -> Y and Y -> X.
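The trade-off between fit and complexity can be seen in a toy example that is unrelated to the causal models themselves. The sketch below is an assumption-laden illustration: the data D are a sequence of n coin flips with k heads, model M0 is a fair coin with no free parameter, and model M1 has an unknown bias θ with a uniform prior on [0, 1]. The function names are hypothetical, and the integral over θ is approximated with the trapezoid rule on a grid.

```python
import numpy as np

def evidence_fair(k, n):
    """p(D | M0): a fair coin has no free parameter, so no integral is needed."""
    return 0.5 ** n

def evidence_unknown_bias(k, n, grid_size=20001):
    """p(D | M1) = integral of p(D | theta, M1) p(theta | M1) d(theta),
    with a uniform prior p(theta | M1) = 1 on [0, 1]."""
    theta = np.linspace(0.0, 1.0, grid_size)
    integrand = theta ** k * (1.0 - theta) ** (n - k)   # likelihood times prior
    # Trapezoid rule over the theta grid.
    return float(np.sum((integrand[1:] + integrand[:-1]) * np.diff(theta)) / 2.0)

# Balanced data (5 heads in 10 flips): the simpler model M0 has higher evidence.
balanced = (evidence_fair(5, 10), evidence_unknown_bias(5, 10))
# Skewed data (9 heads in 10 flips): the flexible model M1 has higher evidence.
skewed = (evidence_fair(9, 10), evidence_unknown_bias(9, 10))
```

The flexible model spreads its prior mass over many possible biases, so it is penalized when the simple model already explains the data, and it wins only when the data clearly call for a bias. Comparing the evidence for X -> Y and Y -> X follows the same logic with far richer models.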

References
• Mooij, Joris M., et al. "Probabilistic latent variable models for distinguishing between cause and effect." NIPS. 2010.
• Daniusis, Povilas, et al. "Inferring deterministic causal relations." arXiv preprint arXiv:1203.3475 (2012).
• Hoyer, Patrik O., et al. "Nonlinear causal discovery with additive noise models." NIPS. Vol. 21. 2008.
• Peters, Jonas, Dominik Janzing, and Bernhard Schölkopf. "Causal inference on discrete data using additive noise models." Pattern Analysis and Machine Intelligence, IEEE Transactions on 33.12 (2011): 2436-2450.
• Janzing, Dominik, et al. "Information-geometric approach to inferring causal directions." Artificial Intelligence 182 (2012): 1-31.

Thank You! Questions …
