

  1. Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting
  Nic Dalmasso 1, Rafael Izbicki 2, Ann B. Lee 1
  1 Department of Statistics & Data Science, Carnegie Mellon University
  2 Department of Statistics, Federal University of São Carlos
  International Conference on Machine Learning (ICML), July 12-18, 2020
  Nic Dalmasso (Carnegie Mellon University) 1 / 17

  2. Motivation: Likelihood in Studying Complex Phenomena
  For some complex phenomena in science and engineering, however, an explicit likelihood function might not be available.

  3. Likelihood-Free Inference
  1. The true likelihood cannot be evaluated.
  2. Samples can be generated for fixed settings of θ, so the likelihood is implicitly defined.
  Inference on parameters θ in this setting is known as likelihood-free inference (LFI).

  4. Likelihood-Free Inference Literature
  Approximate Bayesian computation [1]. More recent developments:
  - Direct posterior estimation (bypassing the likelihood) [2]
  - Likelihood estimation [3]
  - Likelihood ratio estimation [4]
  Hypothesis testing and confidence sets can be considered cornerstones of classical statistics, but they have not received much attention in LFI.
  [1] Beaumont et al., 2002; Marin et al., 2012; Sisson et al., 2018
  [2] Marin et al., 2016; Izbicki et al., 2019; Greenberg et al., 2019
  [3] Thomas et al., 2016; Price et al., 2018; Ong et al., 2018; Lueckmann et al., 2019; Papamakarios et al., 2019
  [4] Izbicki et al., 2014; Cranmer et al., 2015; Frate et al., 2016

  5. A Frequentist Approach to LFI
  Our goal is to develop:
  1. valid hypothesis testing procedures
  2. confidence intervals with the correct coverage
  Main challenges:
  - dealing with high-dimensional and different types of simulated data
  - computational efficiency
  - assessing validity and coverage

  6. Hypothesis Testing and Confidence Sets
  Key ingredients:
  - data D = {X_1, ..., X_n}
  - a test statistic, such as the likelihood ratio statistic Λ(D; θ_0)
  - an α-level critical value C_{θ_0,α}
  Reject the null hypothesis H_0 if Λ(D; θ_0) < C_{θ_0,α}.
  Theorem (Neyman inversion, 1937): Building a 1 − α confidence set for θ is equivalent to testing H_0: θ = θ_0 vs. H_A: θ ≠ θ_0 for every θ_0 across the parameter space.
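The Neyman inversion above can be sketched numerically: scan a grid of θ_0 values, test each one, and collect the non-rejected values into the confidence set. A minimal sketch for a toy N(θ, 1) model with a known likelihood (the model, grid, and critical value via the Wilks approximation are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 25
D = rng.normal(1.0, 1.0, size=n)  # observed data; true theta = 1

def log_lr(D, theta0):
    # Log likelihood ratio for H0: theta = theta0 in the N(theta, 1) model.
    # In closed form this is -n/2 * (xbar - theta0)^2.
    return -0.5 * len(D) * (D.mean() - theta0) ** 2

# Critical value from Wilks' approximation: -2 log Lambda ~ chi2 with 1 df,
# so reject when log Lambda < -0.5 * chi2_{1, 1-alpha} (3.8415 at alpha=0.05).
alpha = 0.05
C = -0.5 * 3.8415

# Invert the test: the confidence set collects every non-rejected theta0.
grid = np.linspace(-2, 4, 601)
conf_set = grid[np.array([log_lr(D, t) for t in grid]) >= C]
```

The resulting `conf_set` is (up to grid resolution) the usual interval x̄ ± 1.96/√n, which is exactly what test inversion should recover in this model.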

  7. ACORE: Approximate Computation via Odds Ratio Estimation
  Key realization: the following are conditional distribution functions, which often vary smoothly as a function of the (unknown) parameters of interest θ:
  1. the likelihood ratio statistic log Λ(D; Θ_0),
  2. the critical value of the test C_{θ_0,α},
  3. the coverage of the confidence sets.
  Rather than relying solely on samples at fixed parameter settings (standard Monte Carlo solutions), we can interpolate across the parameter space with ML models.

  8. Likelihood Ratio Statistic (I)
  1. Forward simulator F_θ
     - Identifiable model, i.e. F_{θ_1} ≠ F_{θ_2} for θ_1 ≠ θ_2 ∈ Θ
  2. Proposal distribution r(θ) for the parameters over Θ
  3. Reference distribution G over the data space X
     - Does not depend on θ
     - G needs to be a dominating measure of F_θ for every θ (it is OK if G = F_θ for one specific θ ∈ Θ)
  Train a probabilistic classifier m : (θ, x) → P(Y = 1 | x, θ) to discriminate samples from G (Y = 0) from samples from F_θ (Y = 1), given θ. Then
      O(θ; x) := P(Y = 1 | x, θ) / P(Y = 0 | x, θ) = F_θ(x) / G(x)
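As a concrete illustration of this construction, here is a minimal sketch in Python, assuming a hypothetical 1-D Gaussian simulator F_θ = N(θ, 1), a uniform proposal r(θ), and reference G = N(0, 2²); the classifier and all settings are illustrative choices, not prescribed by the talk:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
B = 5000

# Label Y=1 samples from the simulator F_theta (theta ~ r), Y=0 from G.
theta = rng.uniform(-5, 5, size=B)
y = rng.integers(0, 2, size=B)
x = np.where(y == 1, rng.normal(theta, 1.0), rng.normal(0.0, 2.0, size=B))

# Classifier m takes (theta, x) jointly, so it interpolates across theta.
clf = GradientBoostingClassifier().fit(np.column_stack([theta, x]), y)

def odds(theta0, x_obs):
    """O(theta0; x) = P(Y=1 | x, theta0) / P(Y=0 | x, theta0) ~ F(x)/G(x)."""
    feats = np.column_stack([np.full(len(x_obs), theta0), x_obs])
    p = np.clip(clf.predict_proba(feats)[:, 1], 1e-6, 1 - 1e-6)
    return p / (1 - p)
```

Sanity check: at θ_0 = 0, the estimated odds should be larger for x near 0 (likely under F_0) than for x far in the tails (likely only under G).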

  9. Likelihood Ratio Statistic (II)
  log OR(x; θ_0, θ_1) = log [ O(θ_0; x) / O(θ_1; x) ]   (log-odds ratio)
  Suppose we want to test H_0: θ ∈ Θ_0 vs. H_1: θ ∉ Θ_0. We define the test statistic
      τ(D; Θ_0) := sup_{θ_0 ∈ Θ_0} inf_{θ_1 ∈ Θ} Σ_{i=1}^n log ÔR(X_i^obs; θ_0, θ_1),
  where ÔR is the odds ratio estimated by the classifier.
  Theorem (Fisher consistency): if P̂(Y = 1 | θ, x) = P(Y = 1 | θ, x) for all θ, x, then τ(D; Θ_0) = log Λ(D; Θ_0).

  10. Likelihood Ratio Statistic (III)
  Suppose we want to test H_0: θ ∈ Θ_0 vs. H_1: θ ∉ Θ_0, with the test statistic
      τ(D; Θ_0) := sup_{θ_0 ∈ Θ_0} inf_{θ_1 ∈ Θ} Σ_{i=1}^n log ÔR(X_i^obs; θ_0, θ_1)
  By fitting a classifier m we can:
  - estimate ÔR(x; θ_0, θ_1) for all x, θ_0, θ_1,
  - leverage ML probabilistic classifiers to deal with high-dimensional x,
  - use the loss function as a relative comparison of which classifier performs best among a set of classifiers.
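The sup-inf statistic above can be computed over parameter grids once log-odds are available. A sketch using the exact odds of a toy Gaussian model in place of the classifier estimate (so the statistic reduces to the true log-likelihood ratio; model and grids are illustrative):

```python
import numpy as np

# Toy stand-in for the estimated log-odds: with F_theta = N(theta, 1) and
# reference G = N(0, 2^2), log O(theta; x) = log f_theta(x) - log g(x).
def log_odds(theta, x):
    return -0.5 * (x - theta) ** 2 + x ** 2 / 8.0 + np.log(2.0)

def tau(x_obs, theta0_grid, theta_grid):
    """sup over the null grid of sum_i log O(theta0; X_i) minus the sup over
    the full grid: the inf over theta1 of -sum log O(theta1; X_i) equals
    minus the sup, so the sup-inf splits into a difference of two maxima."""
    def sums(grid):
        return np.array([log_odds(t, x_obs).sum() for t in grid])
    return sums(theta0_grid).max() - sums(theta_grid).max()

x_obs = np.random.default_rng(1).normal(0.0, 1.0, size=50)
# Simple null H0: theta = 0, scanned against the full grid over Theta.
t = tau(x_obs, np.array([0.0]), np.linspace(-3, 3, 121))
```

Because Θ_0 is contained in the grid over Θ, the statistic is never positive; it is close to 0 when the data are consistent with the null.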

  11. Determine Critical Values C_{θ_0,α}
  We reject the null hypothesis when τ(D; Θ_0) ≤ C_{θ_0,α}, where C_{θ_0,α} is chosen so that the test has size α:
      C_{θ_0,α} = sup { C ∈ ℝ : sup_{θ_0 ∈ Θ_0} P( τ(D; Θ_0) < C | θ_0 ) ≤ α }
  Problem: we need to estimate P(τ(D; Θ_0) < C | θ_0) over any θ ∈ Θ.
  Solution: P(τ(D; Θ_0) < C | θ_0) is a (conditional) CDF, so we can estimate its α-quantile via quantile regression.
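A sketch of this quantile-regression step, using scikit-learn's gradient boosting with the quantile loss. The toy statistic stands in for τ (here the exact log-likelihood ratio of a N(θ, 1) model, whose α-quantile is the same for every θ, namely −0.5 · χ²_{1,1−α} ≈ −1.92 at α = 0.05); all sample sizes are illustrative:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
B_prime, n, alpha = 2000, 10, 0.05

# For each simulated theta, draw a dataset and compute the toy statistic
# (closed-form log LR for N(theta, 1): -n/2 * (xbar - theta)^2, always <= 0).
theta = rng.uniform(-5, 5, size=B_prime)
stats = np.empty(B_prime)
for b in range(B_prime):
    x = rng.normal(theta[b], 1.0, size=n)
    stats[b] = -0.5 * n * (x.mean() - theta[b]) ** 2

# Quantile regression: learn the alpha-quantile of the statistic given theta.
qr = GradientBoostingRegressor(loss="quantile", alpha=alpha)
qr.fit(theta.reshape(-1, 1), stats)
C_hat = qr.predict(np.array([[0.0]]))[0]  # estimated critical value at theta0 = 0
```

The same fitted regressor yields a critical value at every θ_0 in one pass, which is the computational point of this step.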

  12. Assessing Confidence Set Coverage
  Set coverage: E[ I(θ_0 ∈ R(D)) ] = P( θ_0 ∈ R(D) ) ≥ 1 − α
  Marginal coverage ✗: build R for different θ_0^(1), ..., θ_0^(n) and check the overall coverage.
  Estimate via regression ✓: run ACORE for different θ_0^(1), ..., θ_0^(n) and estimate the coverage, i.e. use { (θ_0^(i), R(D_i)) }_{i=1}^n to learn E[ I(θ_0 ∈ R(D)) ].
  We can then check that 1 − α is within the prediction interval for each θ_0.
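The regression diagnostic can be sketched as follows: simulate pairs (θ_0, D), record whether θ_0 fell in the confidence set, and regress that indicator on θ_0. Here a hypothetical exact 90% interval for the mean of a N(θ, 1) model plays the role of R(D), so the estimated coverage should hover near 0.9 everywhere:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_sims, n, alpha = 2000, 25, 0.1

# Simulate (theta0, data), record whether theta0 landed in R(D). R(D) is
# the exact 90% interval xbar +/- z_{0.95}/sqrt(n), so true coverage is 0.9.
theta0 = rng.uniform(-5, 5, size=n_sims)
xbar = rng.normal(theta0, 1.0 / np.sqrt(n))
covered = (np.abs(xbar - theta0) <= 1.645 / np.sqrt(n)).astype(int)

# Regress the coverage indicator on theta0 to estimate coverage across Theta.
reg = LogisticRegression().fit(theta0.reshape(-1, 1), covered)
est_cov = reg.predict_proba(np.linspace(-5, 5, 11).reshape(-1, 1))[:, 1]
```

A flat curve near 1 − α across the θ grid indicates correct coverage; systematic dips below 1 − α flag regions where the confidence sets undercover.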


  14. ACORE Relies on 5 Key Components

  15. A Practical Strategy
  To apply ACORE, we need to choose five key components:
  - a reference distribution G
  - a probabilistic classifier
  - a training sample size B for learning odds ratios
  - a quantile regression algorithm for estimating critical values
  - a training sample size B′ for the quantile regression
  Empirical strategy:
  1. Use prior knowledge or the marginal distribution of a separate simulated sample to build G;
  2. Use the cross-entropy loss to select the classifier and B;
  3. Use the goodness-of-fit procedure to select the quantile regression method and B′.

  16. Also Included in Our Work
  1. Theoretical results
  2. Toy examples to showcase ACORE in situations where the true likelihood is known
  3. A signal detection example inspired by the particle physics literature
  4. A comparison with existing methods
  5. An open-source Python implementation (GitHub: Mr8ND/ACORE-LFI), based on numpy, sklearn and PyTorch

  17. THANKS FOR WATCHING!
