
Type-II errors of independence tests can lead to arbitrarily large errors in estimated causal effects: an illustrative example
Workshop UAI 2014
Nicholas Cornia & Joris M. Mooij
University of Amsterdam
27/07/2014

1. Problem Setting
2. Estimation of the causal effect error from the observed covariance matrix
3. Discussion
4. Conclusions and future work

Introduction
Task: Inferring causation from observational data.
Challenge: Presence of hidden confounders.
Approach: Causal discovery algorithms based on conditional independence (CI) tests.
Simplest case: Three random variables and a single CI test (the LCD-Trigger setting).
Contribution: Causal predictions are extremely unstable when type II errors arise.

LCD-Trigger Algorithm
Cooper (1997) and Chen et al. (2007).
The causal model $X_1 \to X_2 \to X_3$ is implied by the following prior assumptions and statistical test results.
Prior assumptions:
  No selection bias
  Acyclicity
  Faithfulness
  $X_2$, $X_3$ do not cause $X_1$
Statistical tests:
  $X_1 \not\perp\!\!\!\perp X_2$
  $X_2 \not\perp\!\!\!\perp X_3$
  $X_1 \perp\!\!\!\perp X_3 \mid X_2$
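The decision rule on this slide can be sketched in a few lines. The sketch below assumes jointly Gaussian data and uses Fisher z-tests on (partial) correlations as the independence test; that choice of test, the function names, and the significance level `alpha` are illustrative assumptions, not details taken from the slides. The prior assumptions (no selection bias, acyclicity, faithfulness, $X_2$ and $X_3$ do not cause $X_1$) are taken for granted and are not checked by the code.

```python
import math
import numpy as np

def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for a (partial) correlation r estimated from
    n samples with k conditioning variables (Fisher z-transform)."""
    r = min(max(r, -0.999999), 0.999999)       # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2))    # equals 2 * (1 - Phi(|z|))

def partial_corr_13_given_2(c):
    """Partial correlation of columns 0 and 2 given column 1,
    computed from a 3x3 correlation matrix c."""
    num = c[0, 2] - c[0, 1] * c[1, 2]
    den = math.sqrt((1 - c[0, 1] ** 2) * (1 - c[1, 2] ** 2))
    return num / den

def lcd_pattern(data, alpha=0.05):
    """True if the three test outcomes of the LCD pattern are observed:
    X1, X2 dependent; X2, X3 dependent; X1, X3 independent given X2.
    `data` has shape (n, 3) with columns X1, X2, X3."""
    n = data.shape[0]
    c = np.corrcoef(data, rowvar=False)
    dep_12 = fisher_z_pvalue(c[0, 1], n, 0) < alpha
    dep_23 = fisher_z_pvalue(c[1, 2], n, 0) < alpha
    # "Independent" only means the test failed to reject, which is
    # exactly where a type II error can enter.
    indep_13_given_2 = fisher_z_pvalue(partial_corr_13_given_2(c), n, 1) >= alpha
    return dep_12 and dep_23 and indep_13_given_2

# Illustrative run on data simulated from a chain X1 -> X2 -> X3.
rng = np.random.default_rng(0)
x1 = rng.normal(size=1000)
x2 = 0.8 * x1 + rng.normal(size=1000)
x3 = -1.5 * x2 + rng.normal(size=1000)
print("LCD pattern found:", lcd_pattern(np.column_stack([x1, x2, x3])))
```

Note that the third test treats a non-rejection as independence, so a weak conditional dependence combined with a small sample can pass unnoticed.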

Application of the LCD in biology
Example (gene expression): SNP $\to$ G $\to$ P, where SNP is a single nucleotide polymorphism, G is a gene expression level, and P is a phenotype.
Example (disease treatment): $X \to Y \to Z$, where $X$ is gender, $Y$ is disease 1, and $Z$ is disease 2.

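Linear Gaussian model
For simplicity: the linear-Gaussian case.
Structural equations:
$X_i = \sum_{j \neq i} \alpha_{ji} X_j + E_i$, or in matrix form $X = A^\top X + E$,
where $E \sim \mathcal{N}(0, \Delta)$, $\Delta = \mathrm{diag}(\delta_i^2)$, and $A = \{\alpha_{ij}\}$ is the weighted adjacency matrix of the causal graph ($\alpha_{ij} \neq 0 \iff X_i \to X_j$).
Example: for the chain $X_1 \xrightarrow{\alpha_{12}} X_2 \xrightarrow{\alpha_{23}} X_3$,
$X_1 = E_1$
$X_2 = \alpha_{12} X_1 + E_2$
$X_3 = \alpha_{23} X_2 + E_3$
Then $X \sim \mathcal{N}(0, \Sigma)$ with $\Sigma = \Sigma(A, \Delta)$.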
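As a small illustration of $\Sigma = \Sigma(A, \Delta)$, the sketch below computes the covariance implied by the chain example. It assumes the convention that `A[i, j]` holds the weight of the edge $X_i \to X_j$, so that $X = (I - A^\top)^{-1} E$ and $\Sigma = (I - A^\top)^{-1} \Delta (I - A^\top)^{-\top}$. The function name and the numerical values are illustrative choices, not values from the slides.

```python
import numpy as np

def implied_covariance(A, noise_var):
    """Covariance of X for a linear-Gaussian model in which
    A[i, j] is the weight of the edge X_i -> X_j and
    noise_var[i] is the variance of the noise term E_i."""
    d = A.shape[0]
    B = np.linalg.inv(np.eye(d) - A.T)      # X = B @ E
    return B @ np.diag(noise_var) @ B.T

# Chain X1 -> X2 -> X3 with illustrative weights and noise variances.
a12, a23 = 0.8, -1.5
A = np.zeros((3, 3))
A[0, 1] = a12
A[1, 2] = a23
noise_var = np.array([1.0, 0.5, 2.0])

Sigma = implied_covariance(A, noise_var)
print(Sigma)
```

For a chain, the entries can be checked by hand, for example $\Sigma_{12} = \alpha_{12}\delta_1^2$ and $\Sigma_{22} = \alpha_{12}^2\delta_1^2 + \delta_2^2$.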

Causal effect estimator
Causal effect of $X_2$ on $X_3$:
$A \ni \alpha_{23} = \frac{\partial}{\partial x_2} \mathbb{E}[X_3 \mid do(X_2 = x_2)]$
Under the LCD assumptions,
$\frac{\partial}{\partial x_2} \mathbb{E}[X_3 \mid X_2 = x_2] = \frac{\Sigma_{32}}{\Sigma_{22}}$
is a valid estimator for the causal effect of $X_2$ on $X_3$.
Example:
Structural equations (observed):
$X_1 = E_1$
$X_2 = \alpha_{12} X_1 + E_2$
$X_3 = \alpha_{23} X_2 + E_3$
Structural equations after an intervention:
$X_1 = E_1$
$X_2 = x_2$
$X_3 = \alpha_{23} x_2 + E_3$
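A quick simulation can make the claim concrete. The sketch below draws data from the chain, estimates $\Sigma_{32}/\Sigma_{22}$ from the sample covariance, and compares it with the slope of $\mathbb{E}[X_3 \mid do(X_2 = x_2)]$ obtained by simulating the intervened equations. The weights, sample sizes, seed, and the helper name `mean_x3_do` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, a12, a23 = 200_000, 0.8, -1.5     # illustrative values

# Observational regime: X1 -> X2 -> X3 with unit-variance noise.
x1 = rng.normal(size=n)
x2 = a12 * x1 + rng.normal(size=n)
x3 = a23 * x2 + rng.normal(size=n)

S = np.cov(np.vstack([x1, x2, x3]))  # rows are variables
slope_obs = S[2, 1] / S[1, 1]        # estimator Sigma_32 / Sigma_22

def mean_x3_do(x2_value, m=200_000):
    """Monte Carlo mean of X3 after the intervention do(X2 = x2_value)."""
    e3 = rng.normal(size=m)
    return float(np.mean(a23 * x2_value + e3))

# Finite-difference slope of E[X3 | do(X2 = x2)] in x2.
slope_do = mean_x3_do(1.0) - mean_x3_do(0.0)

print("observational slope:", slope_obs)
print("interventional slope:", slope_do)
print("true alpha_23:", a23)
```

In this unconfounded chain both slopes agree with $\alpha_{23}$ up to sampling noise, which is what makes the regression coefficient usable as a causal effect estimate.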

Fundamental question
What happens to the error in the causal effect estimator if in reality there is a weak dependence $X_1 \not\perp\!\!\!\perp X_3 \mid X_2$, but we do not have enough data to detect it?
Type II error: erroneously accepting the null hypothesis of independence in the statistical test $X_1 \perp\!\!\!\perp X_3 \mid X_2$.
Can we still guarantee some kind of bound for the distance $\left| \mathbb{E}[X_3 \mid X_2] - \mathbb{E}[X_3 \mid do(X_2)] \right|$?

From LCD to our model
We start from the chain $X_1 \to X_2 \to X_3$, for which $X_1 \perp\!\!\!\perp X_3 \mid X_2$.
If we allow for a weak dependence that our test did not detect, the causal graph suddenly gains complexity:
[Graph over $X_1, X_2, X_3, X_4$], for which $X_1 \not\perp\!\!\!\perp X_3 \mid X_2$,
where $X_4$ is a confounding variable between $X_2$ and $X_3$.

True model
[Graph over $X_1, X_2, X_3, X_4$]
Prior assumptions:
  No selection bias
  Acyclicity
  Faithfulness
  $X_2$, $X_3$ do not cause $X_1$
  No confounders between $X_1$ and $X_2$, or $X_3$, or both (for simplicity)
Statistical tests:
  $X_1 \not\perp\!\!\!\perp X_2$
  $X_2 \not\perp\!\!\!\perp X_3$
  A weak conditional dependence $X_1 \not\perp\!\!\!\perp X_3 \mid X_2$

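Causal effect estimation error function
Belief: $X_1 \to X_2 \xrightarrow{\alpha_{23}} X_3$, so that $\alpha_{23} = \frac{\Sigma_{32}}{\Sigma_{22}}$.
True model: [graph over $X_1, X_2, X_3, X_4$ containing the edge $X_2 \xrightarrow{\alpha_{23}} X_3$], so that $\alpha_{23} \neq \frac{\Sigma_{32}}{\Sigma_{22}}$.
Error in the causal effect estimation function:
$g(A, \Sigma) = \frac{\Sigma_{32}}{\Sigma_{22}} - \alpha_{23}$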
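The error function $g$ can be evaluated numerically once a concrete true model is fixed. The sketch below assumes a hidden confounder $X_4$ with edges $X_4 \to X_2$ and $X_4 \to X_3$ (weights `a42`, `a43`), plus an optional direct edge $X_1 \to X_3$ (weight `a13`, zero by default). The exact edge set of the graph did not survive in this transcript, so these edges, the function names, and the numbers are assumptions made for illustration.

```python
import numpy as np

def observed_covariance(a12, a23, a42, a43, noise_var, a13=0.0):
    """Covariance of the observed variables (X1, X2, X3) when X4 is hidden.
    Variable order is X1, X2, X3, X4 and A[i, j] is the weight of X_i -> X_j.
    The edge set is an assumption of this sketch."""
    A = np.zeros((4, 4))
    A[0, 1], A[1, 2], A[0, 2] = a12, a23, a13
    A[3, 1], A[3, 2] = a42, a43
    B = np.linalg.inv(np.eye(4) - A.T)          # X = B @ E
    full = B @ np.diag(noise_var) @ B.T
    return full[:3, :3]                         # marginalize out X4

def effect_error(Sigma, a23):
    """g = Sigma_32 / Sigma_22 - alpha_23."""
    return Sigma[2, 1] / Sigma[1, 1] - a23

unit_noise = np.ones(4)                         # illustrative noise variances

# Without confounding the estimator is exact.
g_chain = effect_error(observed_covariance(1.0, 0.5, 0.0, 0.0, unit_noise), 0.5)

# With a confounder of X2 and X3 the estimator is biased.
g_confounded = effect_error(observed_covariance(1.0, 0.5, 1.0, 1.0, unit_noise), 0.5)

print("g without confounder:", g_chain)
print("g with confounder:", g_confounded)
```

In the second case $\Sigma_{22} = 3$ and the confounder adds $\alpha_{42}\alpha_{43}\delta_4^2 = 1$ to $\Sigma_{32}$, so the error is $1/3$.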

Constraint equations
Proposition: There exists a map $\Phi : (A, \Delta) \mapsto \Sigma$ from the model parameters to the observed covariance matrix that defines a set of polynomial equations.
From a geometrical point of view, given $\Sigma$, we have $(A, \Delta) \in M \subset \mathbb{R}^9$.
[Diagram: $\Phi$ maps the parameters $(A, \Delta) \in M$ to $\Sigma$]

Non-identification of the model parameters
In our model the map $\Phi$ is not injective. Thus, the manifold $M$ does not reduce to a single point.
[Diagram: $\Phi$ maps $(A, \Delta) \in M$ to $\Sigma$; $\Phi^{-1} = ?$]
Nevertheless, it is still an interesting question whether the function $g$ is bounded on $M$ or not.

Main result
Theorem: There exists a map $\Psi(\Sigma, \delta_2^2, \delta_3^2, s_1, s_2) = A$, where $s_1, s_2$ are two signs and $\delta_2^2, \delta_3^2$ are the variances of the noise sources of $X_2$ and $X_3$ respectively.
Corollary: It is possible to express the error in the causal effect estimation function $g$ as
$g\left(\Sigma, \Psi(\Sigma, \delta_2^2, \delta_3^2, s_1, s_2)\right) = \frac{\vartheta\, \Sigma_{12}}{m\, \Sigma_{22}} + s_1 s_2 \frac{\sqrt{\det\Sigma - m\,\delta_3^2}\; \sqrt{m - \Sigma_{11}\,\delta_2^2}}{m \sqrt{\delta_2^2}}$
where $\vartheta = \Sigma_{13}\Sigma_{22} - \Sigma_{12}\Sigma_{23}$ and $m = \Sigma_{11}\Sigma_{22} - \Sigma_{12}^2$.
The first term is small for weak dependence; the second term can be arbitrarily large.
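The corollary can be checked numerically on one parameter setting. The sketch below assumes a true model with edges $X_1 \to X_2$, $X_1 \to X_3$, $X_2 \to X_3$, $X_4 \to X_2$, $X_4 \to X_3$, and takes $s_1, s_2$ to be the signs of the two confounder weights; the edge set, that reading of the signs, the function names, and the numbers are assumptions of this sketch rather than statements from the slides. It reads the corollary as $g = \frac{\vartheta\Sigma_{12}}{m\Sigma_{22}} + s_1 s_2 \frac{\sqrt{\det\Sigma - m\delta_3^2}\sqrt{m - \Sigma_{11}\delta_2^2}}{m\sqrt{\delta_2^2}}$ and compares it with $g$ computed directly from the parameters.

```python
import numpy as np

def observed_covariance(a12, a13, a23, a42, a43, noise_var):
    """Covariance of (X1, X2, X3) with X4 hidden; A[i, j] weights X_i -> X_j.
    The edge set is an assumption of this sketch."""
    A = np.zeros((4, 4))
    A[0, 1], A[0, 2], A[1, 2] = a12, a13, a23
    A[3, 1], A[3, 2] = a42, a43
    B = np.linalg.inv(np.eye(4) - A.T)
    return (B @ np.diag(noise_var) @ B.T)[:3, :3]

def g_formula(Sigma, d2, d3, s1, s2):
    """Closed-form error g as a function of the observed covariance,
    the noise variances d2 = delta_2^2, d3 = delta_3^2 and two signs."""
    m = Sigma[0, 0] * Sigma[1, 1] - Sigma[0, 1] ** 2
    theta = Sigma[0, 2] * Sigma[1, 1] - Sigma[0, 1] * Sigma[1, 2]
    det = np.linalg.det(Sigma)
    small = theta * Sigma[0, 1] / (m * Sigma[1, 1])
    # max(..., 0) guards against tiny negative values from rounding.
    large = (np.sqrt(max(det - m * d3, 0.0))
             * np.sqrt(max(m - Sigma[0, 0] * d2, 0.0))
             / (m * np.sqrt(d2)))
    return small + s1 * s2 * large

# Illustrative parameters; noise_var holds delta_1^2 ... delta_4^2.
a12, a13, a23, a42, a43 = 0.7, 0.3, -1.2, 0.9, -0.4
noise_var = np.array([1.0, 0.5, 0.8, 1.5])

Sigma = observed_covariance(a12, a13, a23, a42, a43, noise_var)
g_direct = Sigma[2, 1] / Sigma[1, 1] - a23
g_closed = g_formula(Sigma, noise_var[1], noise_var[2], np.sign(a42), np.sign(a43))

print("g from parameters:", g_direct)
print("g from formula:   ", g_closed)
```

Agreement on one setting is only a sanity check under the stated assumptions, not a proof of the corollary.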

Approaching the singularity
Proposition: $\lim_{\delta_2^2 \to 0} |g| = +\infty$ for all $\delta_3^2 \in [0, \det\Sigma / m]$ and all $(s_1, s_2) \in \{-1, 1\}^2$.
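The divergence can be seen numerically by holding $\Sigma$ fixed and shrinking $\delta_2^2$. The sketch below assumes the closed form $g = \frac{\vartheta\Sigma_{12}}{m\Sigma_{22}} + s_1 s_2 \frac{\sqrt{\det\Sigma - m\delta_3^2}\sqrt{m - \Sigma_{11}\delta_2^2}}{m\sqrt{\delta_2^2}}$ with $\vartheta = \Sigma_{13}\Sigma_{22} - \Sigma_{12}\Sigma_{23}$ and $m = \Sigma_{11}\Sigma_{22} - \Sigma_{12}^2$. The covariance matrix, the grid of variances, and the function name are illustrative choices.

```python
import numpy as np

def g_formula(Sigma, d2, d3, sign_product):
    """Closed-form error g for noise variances d2, d3 and s1 * s2 = sign_product."""
    m = Sigma[0, 0] * Sigma[1, 1] - Sigma[0, 1] ** 2
    theta = Sigma[0, 2] * Sigma[1, 1] - Sigma[0, 1] * Sigma[1, 2]
    det = np.linalg.det(Sigma)
    small = theta * Sigma[0, 1] / (m * Sigma[1, 1])
    large = (np.sqrt(max(det - m * d3, 0.0))
             * np.sqrt(max(m - Sigma[0, 0] * d2, 0.0))
             / (m * np.sqrt(d2)))
    return small + sign_product * large

# An illustrative positive definite covariance of (X1, X2, X3);
# here m = 2 and det(Sigma) = 3.
Sigma = np.array([[1.0, 1.0, 0.5],
                  [1.0, 3.0, 2.5],
                  [0.5, 2.5, 3.75]])

d3 = 0.0                                   # any value below det(Sigma) / m
grid = [1.0, 1e-2, 1e-4, 1e-6]             # delta_2^2 shrinking to zero
abs_errors = [abs(g_formula(Sigma, d2, d3, 1.0)) for d2 in grid]

for d2, err in zip(grid, abs_errors):
    print(f"delta_2^2 = {d2:g}  ->  |g| = {err:.3f}")
```

The same observed covariance is therefore compatible with errors of very different size, growing roughly like $1/\sqrt{\delta_2^2}$.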

Probabilistic estimation of the error
$(\delta_2^2, \delta_3^2) \in D(\Sigma) \subset \mathbb{R}^2$
$\mathcal{M}_M = \{ (\delta_2^2, \delta_3^2) : |g| \le M \}$
If we put a uniform prior on the noise variances,
$\Pr(|g| \le M) = \frac{\|\mathcal{M}_M\|}{\|D(\Sigma)\|}$
What would be a reasonable prior distribution for $\delta_2^2, \delta_3^2$?
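The area ratio can be approximated by Monte Carlo. The sketch below assumes that $D(\Sigma)$ is the rectangle $0 < \delta_2^2 \le m/\Sigma_{11}$, $0 \le \delta_3^2 \le \det\Sigma/m$ on which both square roots in the closed form of $g$ are real; the slides do not spell out $D(\Sigma)$, so this domain, the closed form used for $g$, the function names, and the example matrix are all assumptions.

```python
import numpy as np

def g_values(Sigma, d2, d3, sign_product):
    """Vectorized closed-form error g for arrays of noise variances d2, d3."""
    m = Sigma[0, 0] * Sigma[1, 1] - Sigma[0, 1] ** 2
    theta = Sigma[0, 2] * Sigma[1, 1] - Sigma[0, 1] * Sigma[1, 2]
    det = np.linalg.det(Sigma)
    small = theta * Sigma[0, 1] / (m * Sigma[1, 1])
    large = (np.sqrt(np.maximum(det - m * d3, 0.0))
             * np.sqrt(np.maximum(m - Sigma[0, 0] * d2, 0.0))
             / (m * np.sqrt(d2)))
    return small + sign_product * large

def prob_error_below(Sigma, M, sign_product=1.0, n_samples=100_000, seed=0):
    """Monte Carlo estimate of Pr(|g| <= M) under a uniform prior
    on the assumed rectangle D(Sigma)."""
    rng = np.random.default_rng(seed)
    m = Sigma[0, 0] * Sigma[1, 1] - Sigma[0, 1] ** 2
    det = np.linalg.det(Sigma)
    d2 = rng.uniform(1e-12, m / Sigma[0, 0], n_samples)   # avoid d2 == 0
    d3 = rng.uniform(0.0, det / m, n_samples)
    g = g_values(Sigma, d2, d3, sign_product)
    return float(np.mean(np.abs(g) <= M))

Sigma = np.array([[1.0, 1.0, 0.5],
                  [1.0, 3.0, 2.5],
                  [0.5, 2.5, 3.75]])

p_small = prob_error_below(Sigma, 0.5)
p_large = prob_error_below(Sigma, 5.0)
print("Pr(|g| <= 0.5) ~", p_small)
print("Pr(|g| <= 5.0) ~", p_large)
```

The estimate depends strongly on the prior: a prior that puts more mass near $\delta_2^2 = 0$ would assign more probability to large errors.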

Looking for an approximate bound
The causal effect error function $g$ can be optimized over the parameter $\delta_3^2$, giving a confidence interval for the causal weight $\alpha_{23}$:
$\alpha_{23} \in [b_-, b_+] \subset \mathbb{R}$
where
$b_\pm(\delta_2^2) = \frac{\gamma}{m} \pm \frac{\sqrt{\det\Sigma}\; \sqrt{m - \Sigma_{11}\,\delta_2^2}}{m \sqrt{\delta_2^2}}$
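The interval can be computed directly from an observed covariance matrix. The slide does not define $\gamma$; the sketch below assumes $\gamma = \Sigma_{11}\Sigma_{23} - \Sigma_{12}\Sigma_{13}$, which is what follows from $\alpha_{23} = \Sigma_{32}/\Sigma_{22} - g$ with the closed form of $g$, and this should be treated as an assumption. The example matrix is the covariance implied by an illustrative confounded model with $\alpha_{12} = 1$, $\alpha_{23} = 0.5$, $\alpha_{42} = \alpha_{43} = 1$ and unit noise variances.

```python
import numpy as np

def alpha23_interval(Sigma, d2):
    """Interval [b_minus, b_plus] for alpha_23 given delta_2^2 = d2,
    obtained by taking the worst case delta_3^2 = 0."""
    m = Sigma[0, 0] * Sigma[1, 1] - Sigma[0, 1] ** 2
    det = np.linalg.det(Sigma)
    # Assumed definition of gamma (not stated on the slide).
    gamma = Sigma[0, 0] * Sigma[1, 2] - Sigma[0, 1] * Sigma[0, 2]
    half_width = (np.sqrt(det)
                  * np.sqrt(max(m - Sigma[0, 0] * d2, 0.0))
                  / (m * np.sqrt(d2)))
    return gamma / m - half_width, gamma / m + half_width

# Observed covariance of (X1, X2, X3) for the illustrative model above.
Sigma = np.array([[1.0, 1.0, 0.5],
                  [1.0, 3.0, 2.5],
                  [0.5, 2.5, 3.75]])
true_alpha23 = 0.5

for d2 in (0.5, 1.0, 1.5):
    lo, hi = alpha23_interval(Sigma, d2)
    print(f"delta_2^2 = {d2}: alpha_23 in [{lo:.3f}, {hi:.3f}]")
```

The interval narrows as $\delta_2^2$ grows and degenerates to a point at $\delta_2^2 = m/\Sigma_{11}$, where all residual variance of $X_2$ given $X_1$ is attributed to its own noise.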

Looking for an approximate bound
Suppose we had a lower bound $\delta_2^2 \ge \hat{\delta}_2^2$. This implies an upper bound on $|g|$.
What would be a practical example where we can assume such a lower bound for the variance $\delta_2^2$?
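One way to make the implied bound explicit, assuming the error has the form $g = \frac{\vartheta\Sigma_{12}}{m\Sigma_{22}} + s_1 s_2 \frac{\sqrt{\det\Sigma - m\delta_3^2}\sqrt{m - \Sigma_{11}\delta_2^2}}{m\sqrt{\delta_2^2}}$ with $\vartheta = \Sigma_{13}\Sigma_{22} - \Sigma_{12}\Sigma_{23}$ and $m = \Sigma_{11}\Sigma_{22} - \Sigma_{12}^2$: the factor $\sqrt{\det\Sigma - m\delta_3^2}$ is at most $\sqrt{\det\Sigma}$, and $\sqrt{m - \Sigma_{11}\delta_2^2}/\sqrt{\delta_2^2} = \sqrt{m/\delta_2^2 - \Sigma_{11}}$ is decreasing in $\delta_2^2$. This is a sketch of the reasoning under that assumed form, not a statement taken from the slides.

```latex
\[
  |g| \;\le\; \frac{|\vartheta\,\Sigma_{12}|}{m\,\Sigma_{22}}
  \;+\; \frac{\sqrt{\det\Sigma}}{m}\,
        \sqrt{\frac{m}{\hat{\delta}_2^2} - \Sigma_{11}}
  \qquad \text{for all } \hat{\delta}_2^2 \le \delta_2^2 \le \frac{m}{\Sigma_{11}},
  \quad 0 \le \delta_3^2 \le \frac{\det\Sigma}{m}.
\]
```

The bound is finite for any strictly positive $\hat{\delta}_2^2$ and grows without limit as $\hat{\delta}_2^2 \to 0$.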

Conclusions
The causal effect estimation error is sensitive to erroneous conclusions in conditional independence tests.
The result is in accord with Robins et al. (2003) on the lack of uniform consistency of causal discovery algorithms, but in this paper we wish to emphasize the more practical matter of type II errors.
In our case it was not possible to identify the model parameters explicitly.

Proposal for future work
Bayesian model selection: What would be a reasonable prior distribution for the model parameters?
Bayesian Information Criterion: Will the BIC still give reasonable results even though the model parameters are not identifiable? Could it deal with irregular or even singular models?

Proposal for future work
Adding an "environment" variable: Might it be reasonable to assume that a part, or most, of the external variability is carried by the covariance between the environment variable $W$ and the other measured ones, including possible confounders?
[Graph over $W, X_1, X_2, X_3, X_4$]

Thanks for your attention!
