A consistent approach to inconsistencies
Fabian Köhlinger (Kavli IPMU), in collaboration with Benjamin Joachimi (UCL)
SCLSS workshop, Oxford, 19th April 2018

I. Motivation

Typical questions arising in a (LSS) data analysis:
1. Is model 0 (e.g. wCDM) more likely than my fiducial model 1 (e.g. ΛCDM)?
2. Is data set 1 (e.g. Planck) consistent with data set 0 (e.g. cosmic shear)?
3. Is split 1 of my data set (e.g. z-bin X) consistent with another split 0 of the same data set (e.g. all other z-bins)?

II. Bayesian approach to (in)consistency

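1. Bayesian evidence:

$$ \underbrace{P(d \mid H)}_{\text{evidence}} = \int \mathrm{d}\theta \; \underbrace{P(d \mid \theta, H)}_{\text{likelihood}} \; \underbrace{P(\theta \mid H)}_{\text{prior}} $$

d: data
H: hypothesis, model
θ: parameters

The evidence is the average of the likelihood over the prior, so it automatically implements Occam’s razor.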
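The statement that the evidence averages the likelihood over the prior can be illustrated with a toy problem. The sketch below is not taken from the talk: it assumes a single datum with a unit-variance Gaussian likelihood and a flat prior on the mean, and the function names and numbers are hypothetical. Widening the prior without improving the fit lowers the evidence, which is the Occam's razor effect.

```python
import math


def gaussian_likelihood(d, theta, sigma=1.0):
    """Toy likelihood P(d | theta): Gaussian with mean theta and width sigma."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((d - theta) / sigma) ** 2) / norm


def evidence_flat_prior(d, lo, hi, n=20000):
    """Evidence Z = integral of P(d | theta) P(theta) over theta.

    Assumes a flat prior P(theta) = 1 / (hi - lo) on [lo, hi] and uses a
    simple midpoint rule, which is adequate for this one-parameter toy case.
    """
    width = (hi - lo) / n
    prior = 1.0 / (hi - lo)
    total = 0.0
    for i in range(n):
        theta = lo + (i + 0.5) * width
        total += gaussian_likelihood(d, theta) * prior * width
    return total


d_obs = 0.3  # hypothetical observed datum

# Same likelihood, two prior ranges: the wider prior spreads its mass over
# parameter values the data rule out, so the averaged likelihood drops.
z_narrow = evidence_flat_prior(d_obs, -2.0, 2.0)
z_wide = evidence_flat_prior(d_obs, -20.0, 20.0)

print(f"evidence, narrow prior: {z_narrow:.4f}")
print(f"evidence, wide prior:   {z_wide:.4f}")
```

For realistic cosmological likelihoods with many parameters the integral is not done on a grid; this is only meant to make the definition concrete.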

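2. Bayes factor:

Calculate the ratio of probabilities that each model is correct (given the data), using Bayes’ theorem:

$$ \frac{P(H_0 \mid d)}{P(H_1 \mid d)} = \underbrace{\frac{P(d \mid H_0)}{P(d \mid H_1)}}_{\text{Bayes factor}} \times \underbrace{\frac{P(H_0)}{P(H_1)}}_{\text{a priori, typically set to 1}} $$

H_0: ‘hypothesis for model 0’
H_1: ‘hypothesis for model 1’
d: data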

2. Bayes factor:

Bayes factor > 1: H_0 is more likely to be true than H_1

(Nested) model comparison:
H_0: ‘wCDM’
vs.
H_1: ‘ΛCDM’

Data set comparison:
H_0: ‘there is one common set of parameters describing e.g. Planck and cosmic shear’
vs.
H_1: ‘each data set requires its own set of parameters’
e.g. Marshall, Rajguru & Slosar (2006)
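The data set comparison can be sketched for a toy case. The example below is an illustration under stated assumptions, not the analysis from the talk: the two data sets are independent, each datum has a unit-variance Gaussian likelihood, the only parameter is a mean with a flat prior on [-10, 10], and all names and numbers are hypothetical. Under H_0 both data sets share one mean; under H_1 each has its own, so the H_1 evidence factorises into the product of the individual evidences.

```python
import math


def gauss(x, mean, sigma):
    """Gaussian probability density."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / norm


def evidence(data, lo=-10.0, hi=10.0, sigma=1.0, n=4000):
    """Evidence for independent Gaussian data with one shared mean.

    Assumes a flat prior on [lo, hi]; midpoint-rule integration.
    """
    width = (hi - lo) / n
    prior = 1.0 / (hi - lo)
    total = 0.0
    for i in range(n):
        theta = lo + (i + 0.5) * width
        like = 1.0
        for x in data:
            like *= gauss(x, theta, sigma)
        total += like * prior * width
    return total


def bayes_factor(data_a, data_b):
    """Bayes factor of H_0 (one common mean) over H_1 (separate means).

    The product in the denominator is only valid for independent data sets.
    """
    return evidence(data_a + data_b) / (evidence(data_a) * evidence(data_b))


# Hypothetical measurements: the first pair agrees, the second does not.
consistent = bayes_factor([0.1, -0.2, 0.3], [0.0, 0.2, -0.1])
discrepant = bayes_factor([0.1, -0.2, 0.3], [4.0, 4.2, 3.9])

print(f"consistent pair: Bayes factor = {consistent:.2f}")
print(f"discrepant pair: Bayes factor = {discrepant:.2e}")
```

In this toy setup the Bayes factor scales with the width of the flat prior, so the same data can look more or less consistent depending on the prior range chosen.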

2. Bayes factor:

Bayes factor > 1: H_0 is more likely to be true than H_1

Data set comparison:
H_0: ‘one common set of parameters is sufficient for describing the fiducial (= split 1 + … + split N) data set’
vs.
H_1: ‘each split i of the data set requires its own set of parameters’

This does NOT hold for correlated data sets!

3. Posterior predictive distribution (PPD):

$$ \underbrace{P(\hat{d} \mid d)}_{\text{PPD}} = \int \mathrm{d}\theta \; \underbrace{P(\hat{d} \mid \theta)}_{\text{likelihood of new data}} \; \underbrace{P(\theta \mid d)}_{\text{posterior}} $$

The PPD is the average of the likelihood of the new data over the posterior of the parameters of a given model.

d: original data
\hat{d}_s: PPD split samples
\hat{d}_j: PPD joint sample

Can the model(s) describe the data? Are ‘split’ models consistent? Quantify this by:
- comparing the difference between joint and split PPDs to zero
- comparing the (Gaussian) data distribution to the corresponding PPDs
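Drawing from a PPD amounts to sampling parameters from the posterior and then sampling new data from the likelihood at those parameters. The sketch below shows this for a toy Gaussian-mean problem, where the posterior under a flat prior is known analytically. It is a minimal illustration under those assumptions, with hypothetical data and function names, and it draws the joint and split samples independently, which a real analysis of correlated splits would not do.

```python
import random
import statistics


def posterior_samples(data, sigma, n_samples, rng):
    """Posterior of a Gaussian mean under a flat prior.

    For known sigma this is N(mean(data), sigma^2 / len(data)).
    """
    centre = statistics.fmean(data)
    scale = sigma / len(data) ** 0.5
    return [rng.gauss(centre, scale) for _ in range(n_samples)]


def ppd_samples(data, sigma, n_samples, rng):
    """PPD draws: theta from the posterior, then a new datum from N(theta, sigma^2)."""
    thetas = posterior_samples(data, sigma, n_samples, rng)
    return [rng.gauss(theta, sigma) for theta in thetas]


rng = random.Random(42)  # fixed seed so the sketch is reproducible
sigma = 1.0

# Hypothetical split of one data set into two parts.
split_a = [0.1, -0.2, 0.3, 0.0]
split_b = [0.2, 0.1, -0.3, 0.1]
joint = split_a + split_b

ppd_joint = ppd_samples(joint, sigma, 20000, rng)
ppd_split = ppd_samples(split_a, sigma, 20000, rng)

# Compare the difference between split and joint PPD draws to zero.
diff = [s - j for s, j in zip(ppd_split, ppd_joint)]
mean_diff = statistics.fmean(diff)
std_diff = statistics.stdev(diff)

print(f"split minus joint PPD: mean = {mean_diff:.3f}, std = {std_diff:.3f}")
```

A mean difference that is small compared with its scatter indicates that the split and joint models predict compatible data in this toy case.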

3. Posterior predictive distribution (PPD):

Quantify tension between the Gaussian data distribution and the PPDs by calculating the overlap with the mσ-region.
FK+ in prep.

III. Test case: cosmic shear correlation functions from KiDS-450

a) Systematics in z-bin 3?

1. Data and PPDs: ξ+, z-bin 3 (incl. cross-correlations) vs. all other correlations
[Figure. black: data from KiDS-450 (Hildebrandt+ 2017); red: mode of joint PPD; blue: modes of split PPDs. FK+ in prep.]

2. Comparison of key parameters: z-bin 3 (incl. cross-correlations) vs. all other correlations
[Figure. Labelled parameter: amplitude of intrinsic alignment model. FK+ in prep.]

3. Comparison in data space: ξ+, z-bin 3 (incl. cross-correlations) vs. all other correlations
[Figure. red: mode of joint PPD; blue: modes of split PPDs. FK+ in prep.]

4. Comparing difference of PPDs: ξ+, z-bin 3 (incl. cross-correlations) vs. all other correlations
[Figure. red: mode of joint PPD; blue: modes of split PPDs. FK+ in prep.]

b) Scale-dependent systematics?

1. Data and PPDs: ξ+, large scales vs. small scales
[Figure. black: data from KiDS-450 (Hildebrandt+ 2017); red: mode of joint PPD; blue: modes of split PPDs. FK+ in prep.]

2. Comparison of key parameters: large scales vs. small scales
[Figure. Labelled parameter: amplitude of intrinsic alignment model. FK+ in prep.]

3. Comparison in data space: ξ+, large scales vs. small scales
[Figure. red: mode of joint PPD; blue: modes of split PPDs. FK+ in prep.]

4. Comparing difference of PPDs: ξ+, large scales vs. small scales
[Figure. red: mode of joint PPD; blue: modes of split PPDs. FK+ in prep.]

IV. Summary

Summary:
1. Bayesian evidence and the Bayes factor are powerful concepts for model comparison
   • can be expanded to consistency checks of (correlated) datasets
2. Quantification of consistency with the Bayes factor is not optimal:
   • all information is compressed into one number
   • no hints as to where systematics arise
   • mind the priors…
3. Complementary tool: PPDs
   • systematics apparent in data space
   • can be compressed into various numbers (σ-levels)
