Bootstrapping Sensitivity Analysis - Qingyuan Zhao - PowerPoint PPT Presentation

1. Bootstrapping Sensitivity Analysis

Qingyuan Zhao
Statistical Laboratory, University of Cambridge
August 3, 2020 @ JSM

2. Sensitivity analysis

The broader concept [Saltelli et al., 2004]
◮ Sensitivity analysis is "the study of how the uncertainty in the output of a mathematical model or system (numerical or otherwise) can be apportioned to different sources of uncertainty in its inputs".
◮ Model inputs may be any factor that "can be changed in a model prior to its execution", including "structural and epistemic sources of uncertainty".

In observational studies
◮ The most typical question is: how do the qualitative and/or quantitative conclusions of the observational study change if the no-unmeasured-confounding assumption is violated?

3. Sensitivity analysis for observational studies

State of the art
◮ Gazillions of methods specifically designed for different problems.
◮ Various forms of statistical guarantees.
◮ Often not straightforward to interpret.

Goals of this talk
1. What is the common structure behind various methods for sensitivity analysis?
2. Can we bootstrap sensitivity analysis?

4. What is a sensitivity model?

General setup
◮ Observed data O ⟹ (infer) distribution of the full data F.
◮ Prototypical example: observe iid copies of O = (X, A, Y) from the underlying full data F = (X, A, Y(0), Y(1)), where A is a binary treatment, X is covariates, Y is outcome.

An abstraction
A sensitivity model is a family of distributions F_{θ,η} of F that satisfies:
1. Augmentation: setting η = 0 corresponds to a primary analysis assuming no unmeasured confounders.
2. Model identifiability: given η, the implied marginal distribution O_{θ,η} of the observed data O is identifiable.

Statistical problem
Given η (or the range of η), use the observed data to make inference about some causal parameter β = β(θ, η).

5. Understanding sensitivity models

Observational equivalence
◮ F_{θ,η} and F_{θ′,η′} are said to be observationally equivalent if O_{θ,η} = O_{θ′,η′}. We write this as F_{θ,η} ≃ F_{θ′,η′}.
◮ Equivalence class: [F_{θ,η}] = { F_{θ′,η′} | F_{θ,η} ≃ F_{θ′,η′} }.

Types of sensitivity models
Testable models: when F_{θ,η} is not rich enough, [F_{θ,η}] is a singleton and η can be identified from the observed data (should be avoided in practice).
Global models: for any (θ, η) and η′, there exists F_{θ′,η′} ≃ F_{θ,η}.
Separable models: for any (θ, η), F_{θ,η} ≃ F_{θ,0}.

6. A visualization

[Figure: equivalence classes [F_{θ,η}] drawn in the (θ, η) plane. Left: global sensitivity models; Right: separable sensitivity models.]

7. Statistical inference

Modes of inference
1. Point identified: sensitivity analysis is performed at a fixed η.
2. Partially identified: sensitivity analysis is performed simultaneously over η ∈ H for a given range H.

Statistical guarantees of interval estimators
1. A confidence interval [C_L(O_{1:n}; η), C_U(O_{1:n}; η)] satisfies
   inf_{θ₀,η₀} P_{θ₀,η₀}( β(θ₀, η₀) ∈ [C_L(η₀), C_U(η₀)] ) ≥ 1 − α.
2. A sensitivity interval [C_L(O_{1:n}; H), C_U(O_{1:n}; H)] satisfies
   inf_{θ₀,η₀} P_{θ₀,η₀}( β(θ₀, η₀) ∈ [C_L(H), C_U(H)] ) ≥ 1 − α.   (1)

They look almost the same, but because the latter interval only depends on H, (1) is actually equivalent to
   inf_{θ₀,η₀} inf_{F_{θ,η} ≃ F_{θ₀,η₀}} P_{θ₀,η₀}( β(θ, η) ∈ [C_L(H), C_U(H)] ) ≥ 1 − α.
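To spell out the asserted equivalence, here is a short sketch of the argument (not a full proof; it uses only the definitions above: the sensitivity interval is a function of the observed data alone, and observationally equivalent models generate the same observed-data distribution):

```latex
% For any F_{\theta,\eta} \simeq F_{\theta_0,\eta_0}, the event
% \{\beta(\theta,\eta) \in [C_L(H), C_U(H)]\} depends on the data only, and
% O_{\theta,\eta} = O_{\theta_0,\eta_0}, so the event has the same probability
% under P_{\theta_0,\eta_0} as under P_{\theta,\eta}. Therefore
\[
\inf_{\theta_0,\eta_0}\;
\inf_{\mathcal{F}_{\theta,\eta}\simeq\mathcal{F}_{\theta_0,\eta_0}}
  \mathbb{P}_{\theta_0,\eta_0}\!\left(\beta(\theta,\eta)\in[C_L(H),C_U(H)]\right)
= \inf_{\theta,\eta}\;
  \mathbb{P}_{\theta,\eta}\!\left(\beta(\theta,\eta)\in[C_L(H),C_U(H)]\right).
\]
% The right-hand side is exactly the guarantee (1) after relabelling
% (\theta,\eta) as (\theta_0,\eta_0): every pair appears in its own
% equivalence class, so the two coverage statements coincide.
```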

8. Approaches to sensitivity analysis

◮ Point identified sensitivity analysis is basically the same as a primary analysis with a known "offset" η.
◮ Partially identified sensitivity analysis is much harder. Let F_{θ₀,η₀} be the truth. The fundamental problem is to make inference about
   inf_{η ∈ H} { β(θ, η) | F_{θ,η} ≃ F_{θ₀,η₀} }   and   sup_{η ∈ H} { β(θ, η) | F_{θ,η} ≃ F_{θ₀,η₀} }.

Method 1: Solve the population optimization problems analytically.
◮ Not always feasible.
Method 2: Solve the sample approximation problem and use asymptotic normality.
◮ Central limit theorems not always true or established.
Method 3: Take the union of confidence intervals
   [C_L(H), C_U(H)] = ⋃_{η ∈ H} [C_L(η), C_U(η)].
◮ By the union bound, this is a (1 − α)-sensitivity interval if all [C_L(η), C_U(η)] are (1 − α)-confidence intervals.

9. Computational challenges for Method 3

   [C_L(H), C_U(H)] = ⋃_{η ∈ H} [C_L(η), C_U(η)].

◮ Using asymptotic theory, it is often not difficult to construct asymptotic confidence intervals of the form
   [C_L(η), C_U(η)] = β̂(η) ∓ z_{α/2} · σ̂(η) / √n.
◮ Unlike Method 2, which only needs to optimize β̂(η), Method 3 further needs to optimize the usually much more complicated σ̂(η) over η ∈ H (a grid-based sketch follows below).
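A minimal sketch of what Method 3 involves computationally, approximating the union over H by a grid; beta_hat(eta, data) and sigma_hat(eta, data) are hypothetical user-supplied callables (not from the slides) returning the point estimate and its asymptotic standard deviation at a fixed η:

```python
import numpy as np
from scipy.stats import norm

def union_sensitivity_interval(data, eta_grid, beta_hat, sigma_hat, alpha=0.05):
    """Method 3: union of per-eta asymptotic confidence intervals over a grid of H.

    beta_hat and sigma_hat are placeholders for problem-specific estimators;
    in practice the costly part is deriving and optimizing sigma_hat over eta.
    """
    z = norm.ppf(1 - alpha / 2)
    n = len(data)
    lowers, uppers = [], []
    for eta in eta_grid:
        b, s = beta_hat(eta, data), sigma_hat(eta, data)
        lowers.append(b - z * s / np.sqrt(n))
        uppers.append(b + z * s / np.sqrt(n))
    return min(lowers), max(uppers)  # endpoints of the union interval
```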

10. Method 4: Percentile bootstrap

1. For fixed η, use the percentile bootstrap confidence interval (b indexes the data resamples):
   [C_L(η), C_U(η)] = [ Q_{α/2}( β̂_b(η) ), Q_{1−α/2}( β̂_b(η) ) ].
2. Use the generalized minimax inequality to interchange quantile and infimum/supremum:
   Q_{α/2}( inf_η β̂_b(η) ) ≤ inf_η Q_{α/2}( β̂_b(η) ) ≤ sup_η Q_{1−α/2}( β̂_b(η) ) ≤ Q_{1−α/2}( sup_η β̂_b(η) ).
   The outer pair is the percentile bootstrap sensitivity interval; the inner pair is the union sensitivity interval, so the former always contains the latter.

Advantages
◮ Computation is reduced to repeating Method 2 over data resamples (a sketch follows below).
◮ Only need a coverage guarantee for [C_L(η), C_U(η)] at fixed η.
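A sketch of Method 4, assuming a hypothetical solver estimate_extrema(data) that returns (inf over η ∈ H, sup over η ∈ H) of the point estimate β̂(η) on one data set (for the IPW estimator of slide 15 it would wrap the linear fractional program sketched there):

```python
import numpy as np

def percentile_bootstrap_si(data, estimate_extrema, n_boot=1000, alpha=0.05, seed=0):
    """Method 4: percentile bootstrap sensitivity interval.

    estimate_extrema(data) is a placeholder for a problem-specific optimizer
    returning the extrema of beta_hat(eta) over eta in H.  The interval is
    [Q_{alpha/2} of the bootstrapped infima, Q_{1-alpha/2} of the bootstrapped suprema].
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    lows = np.empty(n_boot)
    highs = np.empty(n_boot)
    for b in range(n_boot):
        resample = data[rng.integers(0, n, size=n)]  # resample units with replacement
        lows[b], highs[b] = estimate_extrema(resample)
    return np.quantile(lows, alpha / 2), np.quantile(highs, 1 - alpha / 2)
```

The only model-specific ingredient is the extrema computation, which is exactly the point of the minimax inequality above: the quantiles are taken after the optimization, not before.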

11. Bootstrapping sensitivity analysis

Point-identified parameter: Efron's bootstrap
   Point estimator ⟹ (bootstrap) Confidence interval

Partially identified parameter: three ideas
   Extrema estimator ⟹ (optimization + percentile bootstrap + minimax inequality) Sensitivity interval

Rest of the talk
Apply this idea to IPW estimators for a marginal sensitivity model.

12. Our sensitivity model

◮ Consider the prototypical example: A is a binary treatment, X is covariates, Y is outcome.
◮ U "summarizes" unmeasured confounding, so A ⊥⊥ Y(0), Y(1) | X, U.
◮ Let e₀(x) = P₀(A = 1 | X = x) and e(x, u) = P(A = 1 | X = x, U = u).

Marginal sensitivity models
   E_M(Γ) = { e(x, u) : 1/Γ ≤ OR( e(x, u), e₀(x) ) ≤ Γ, ∀ x ∈ X, u }.
◮ Compare this to the Rosenbaum [2002] model:
   E_R(Γ) = { e(x, u) : 1/Γ ≤ OR( e(x, u₁), e(x, u₂) ) ≤ Γ, ∀ x ∈ X, u₁, u₂ }.
◮ Tan [2006] first considered the marginal model, but he did not consider statistical inference in finite samples.
◮ Relationship between the two models: E_M(√Γ) ⊆ E_R(Γ) ⊆ E_M(Γ).¹

¹ The second part needs "compatibility": e(x, u) should marginalize to e₀(x).

13. Parametric extension

◮ In practice, the propensity score e₀(X) = P₀(A = 1 | X) is often estimated by a parametric model.

Parametric marginal sensitivity models
   E_M(Γ, β₀) = { e(x, u) : 1/Γ ≤ OR( e(x, u), e_{β₀}(x) ) ≤ Γ, ∀ x ∈ X, u }.
◮ e_{β₀}(x) is the best parametric approximation to e₀(x). This sensitivity model covers both
1. Model misspecification, that is, e_{β₀}(x) ≠ e₀(x); and
2. Missing not at random, that is, e₀(x) ≠ e(x, u).

14. Logistic representations

1. Rosenbaum's sensitivity model:
   logit( e(x, u) ) = g(x) + u log Γ, where 0 ≤ U ≤ 1.
2. Marginal sensitivity model:
   logit( e_η(x, u) ) = logit( e₀(x) ) + η(x, u), where η ∈ H_Γ = { η(x, u) : ‖η‖_∞ = sup |η(x, u)| ≤ log Γ } (a small numerical check is sketched below).
3. Parametric marginal sensitivity model:
   logit( e_η(x, u) ) = logit( e_{β₀}(x) ) + η(x, u), where η ∈ H_Γ.
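A small numerical sketch of representation 2, checking that a logit-scale shift η with ‖η‖_∞ ≤ log Γ keeps the odds ratio between e_η(x, u) and e₀(x) inside [1/Γ, Γ]; the propensity values are made up for illustration:

```python
import numpy as np
from scipy.special import expit, logit

def shifted_propensity(e0, eta):
    """Marginal sensitivity model: logit(e_eta) = logit(e0) + eta."""
    return expit(logit(e0) + eta)

gamma = 2.0
e0 = np.array([0.1, 0.3, 0.5, 0.8])  # fitted propensity scores e_0(x), illustrative values
rng = np.random.default_rng(0)
eta = rng.uniform(-np.log(gamma), np.log(gamma), size=e0.size)  # any eta in H_Gamma

e_eta = shifted_propensity(e0, eta)
odds_ratio = (e_eta / (1 - e_eta)) / (e0 / (1 - e0))
# The odds ratio equals exp(eta), so it always lies in [1/Gamma, Gamma].
assert np.all((odds_ratio >= 1 / gamma - 1e-12) & (odds_ratio <= gamma + 1e-12))
```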

15. Computation

Bootstrapping partially identified sensitivity analysis:
   Extrema estimator ⟹ (optimization + percentile bootstrap + minimax inequality) Sensitivity interval

◮ Stabilized inverse-probability weighted (IPW) estimator for β = E[Y(1)]:
   β̂(η) = [ (1/n) Σ_{i=1}^n A_i / ê_η(X_i, U_i) ]^{-1} · (1/n) Σ_{i=1}^n A_i Y_i / ê_η(X_i, U_i),
   where ê_η can be obtained by plugging in an estimator of β₀.
◮ Computing the extrema of β̂(η) is a linear fractional program: let h_i = exp{ −η(X_i, U_i) } and g_i = 1 / e_{β̂₀}(X_i), then
   maximize or minimize   Σ_{i=1}^n A_i Y_i [1 + h_i (g_i − 1)] / Σ_{i=1}^n A_i [1 + h_i (g_i − 1)]
   subject to             h_i ∈ [Γ⁻¹, Γ], i = 1, ..., n.
   This can be converted to a linear program and can in fact be solved in O(n) time (the optimal rate); a simple threshold-scan sketch follows below.
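A sketch of this optimization under the notation above (y and g hold the outcomes and inverse fitted propensities of the treated units; ipw_extrema is a name introduced here). Because the objective is a weighted mean of y with positive weights w_i = 1 + h_i (g_i − 1), each extremum puts h_i at a boundary depending on whether Y_i lies above or below the optimal value, so scanning thresholds in sorted order finds it; this simple version is O(n log n) rather than the O(n) rate mentioned on the slide:

```python
import numpy as np

def ipw_extrema(y, g, gamma):
    """Extrema of sum(Y_i * w_i) / sum(w_i) over the treated units,
    where w_i = 1 + h_i * (g_i - 1) and each h_i ranges over [1/gamma, gamma].

    y : outcomes Y_i of the treated units (numpy array)
    g : 1 / e_{beta_hat_0}(X_i) for the same units, so g_i >= 1
    """
    order = np.argsort(y)
    y, g = y[order], g[order]
    w_lo = 1 + (g - 1) / gamma   # weight when h_i = 1/gamma
    w_hi = 1 + (g - 1) * gamma   # weight when h_i = gamma

    def scan(w_below, w_above):
        # Threshold k: the k smallest outcomes get w_below, the rest get w_above.
        num = np.concatenate(([0.0], np.cumsum(y * w_below))) \
            + np.concatenate((np.cumsum((y * w_above)[::-1])[::-1], [0.0]))
        den = np.concatenate(([0.0], np.cumsum(w_below))) \
            + np.concatenate((np.cumsum(w_above[::-1])[::-1], [0.0]))
        return num / den

    # Minimum up-weights the small outcomes; maximum up-weights the large ones.
    return scan(w_hi, w_lo).min(), scan(w_lo, w_hi).max()
```

In the bootstrap sketch after slide 10, estimate_extrema(resample) could simply call ipw_extrema on the treated rows of each resample.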

16. Example: fish consumption and blood mercury

◮ 873 controls: ≤ 1 serving of fish per month.
◮ 234 treated: ≥ 12 servings of fish per month.
◮ Covariates: gender, age, income (very imbalanced), race, education, ever smoked, # cigarettes.

Implementation details
◮ Rosenbaum's method: 1-1 matching, confidence interval constructed by the Hodges-Lehmann method (assuming the causal effect is constant).
◮ Our method (percentile bootstrap): stabilized IPW for the ATT, with and without augmentation by a linear outcome regression.
