

  1. Bootstrapping Sensitivity Analysis
     Qingyuan Zhao, Department of Statistics, The Wharton School, University of Pennsylvania
     June 2, 2019 @ OSU Bayesian Causal Inference Workshop
     (Joint work with Bhaswar B. Bhattacharya and Dylan S. Small)

  2. Why sensitivity analysis?
     ◮ Unless we have a perfectly executed randomized experiment, causal inference is based on some unverifiable assumptions.
     ◮ In observational studies, the most commonly used assumption is ignorability, or no unmeasured confounding:
           A ⊥⊥ Y(0), Y(1) | X.
       We can only say this assumption is "plausible".
     ◮ Sensitivity analysis asks: what if this assumption does not hold? Does our qualitative conclusion still hold?
     ◮ This question appears in many settings:
       1. Confounded observational studies.
       2. Survey sampling with missing not at random (MNAR) data.
       3. Longitudinal studies with non-ignorable dropout.
     ◮ In general, this means that the target parameter (e.g. the average treatment effect) is only partially identified.

  3. Overview: Bootstrapping sensitivity analysis
     Point-identified parameter (Efron's bootstrap):
         Point estimator  ==[ bootstrap ]==>  Confidence interval
     Partially identified parameter (an analogy):
         Extrema estimator  ==[ optimization + percentile bootstrap + minimax inequality ]==>  Confidence interval
     Rest of the talk: apply this idea to IPW estimators in a marginal sensitivity model.

  4. Some existing sensitivity models
     Generally, we need to specify how unconfoundedness is violated.
     1. Y models: consider a specific difference between the conditional distributions Y(a) | X, A and Y(a) | X.
        ◮ Commonly called "pattern mixture models".
        ◮ Robins (1999, 2002); Birmingham et al. (2003); Vansteelandt et al. (2006); Daniels and Hogan (2008).
     2. A models: consider a specific difference between the conditional distributions A | X, Y(a) and A | X.
        ◮ Commonly called "selection models".
        ◮ Scharfstein et al. (1999); Gilbert et al. (2003).
     3. Simultaneous models: consider a range of A models and/or Y models and report the "worst case" result.
        ◮ Cornfield et al. (1959); Rosenbaum (2002); Ding and VanderWeele (2016).
     Our sensitivity model: a hybrid of the 2nd and 3rd, similar to Rosenbaum's.

  5. Rosenbaum's sensitivity model
     ◮ Imagine there is an unobserved confounder U that "summarizes" all confounding, so A ⊥⊥ Y(0), Y(1) | X, U.
     ◮ Let e0(x, u) = P0(A = 1 | X = x, U = u).
     Rosenbaum's sensitivity model:
         R(Γ) = { e(x, u) : 1/Γ ≤ OR( e(x, u1), e(x, u2) ) ≤ Γ, for all x ∈ X and u1, u2 },
     where OR(p1, p2) := [p1 / (1 − p1)] / [p2 / (1 − p2)] is the odds ratio.
     ◮ Rosenbaum's question: can we reject the sharp null hypothesis Y(0) ≡ Y(1) for every e0(x, u) ∈ R(Γ)?
     ◮ Robins (2002): we don't need to assume the existence of U. Let U = Y(1) when the goal is to estimate E[Y(1)].

  6. Our sensitivity model
     ◮ Let e0(x) = P0(A = 1 | X = x) be the propensity score.
     Marginal sensitivity model:
         M(Γ) = { e(x, y) : 1/Γ ≤ OR( e(x, y), e0(x) ) ≤ Γ, for all x ∈ X and y }.
     ◮ Compare this to Rosenbaum's model:
         R(Γ) = { e(x, u) : 1/Γ ≤ OR( e(x, u1), e(x, u2) ) ≤ Γ, for all x ∈ X and u1, u2 }.
     ◮ Tan (2006) first considered this model, but he did not consider statistical inference in finite samples.
     ◮ Relationship between the two models: M(√Γ) ⊆ R(Γ) ⊆ M(Γ).¹
     ◮ For observational studies, we assume both P0(A = 1 | X, Y(1)) and P0(A = 1 | X, Y(0)) are in M(Γ).
     ¹ The second inclusion needs "compatibility": e(x, y) marginalizes to e0(x).

  7. Parametric extension
     ◮ In practice, the propensity score e0(X) = P0(A = 1 | X) is often estimated by a parametric model.
     Definition (Parametric marginal sensitivity model):
         M_{β0}(Γ) = { e(x, y) : 1/Γ ≤ OR( e(x, y), e_{β0}(x) ) ≤ Γ, for all x ∈ X and y },
     where e_{β0}(x) is the best parametric approximation of e0(x).
     This sensitivity model covers both
     1. Model misspecification, that is, e_{β0}(x) ≠ e0(x); and
     2. Missing not at random, that is, e0(x) ≠ e0(x, y).

  8. Logistic representations
     1. Rosenbaum's sensitivity model:
            logit( e(x, u) ) = g(x) + γ u,   where 0 ≤ U ≤ 1 and γ = log Γ.
     2. Marginal sensitivity model:
            logit( e^(h)(x, y) ) = logit( e0(x) ) + h(x, y),   where ‖h‖_∞ = sup |h(x, y)| ≤ γ.
        Due to this representation, we also call it a marginal L∞-sensitivity model.
     3. Parametric marginal sensitivity model:
            logit( e^(h)(x, y) ) = logit( e_{β0}(x) ) + h(x, y),   where ‖h‖_∞ = sup |h(x, y)| ≤ γ.
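     To make the logistic representation concrete, here is a minimal Python sketch (numpy only; the function names are mine, not from the talk) that computes the shifted propensity score e^(h)(x, y) and checks that a logit shift of at most γ = log Γ is the same constraint as the odds-ratio bound defining M(Γ):

     ```python
     import numpy as np

     def expit(t):
         return 1.0 / (1.0 + np.exp(-t))

     def logit(p):
         return np.log(p / (1.0 - p))

     def shifted_propensity(e0, h):
         """Marginal sensitivity model: logit(e^(h)) = logit(e0) + h, with |h| <= gamma."""
         return expit(logit(e0) + h)

     def odds_ratio(p1, p2):
         return (p1 / (1 - p1)) / (p2 / (1 - p2))

     # A shift of h on the logit scale multiplies the odds by exp(h), so
     # |h| <= log(Gamma) is exactly 1/Gamma <= OR(e^(h), e0) <= Gamma.
     Gamma = 2.0
     e0, h = 0.3, -np.log(Gamma)          # largest allowed downward shift
     e_h = shifted_propensity(e0, h)
     print(e_h, odds_ratio(e_h, e0))      # odds ratio equals 1/Gamma = 0.5
     ```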

  9. Confidence interval I
     ◮ For simplicity, consider the "missing data" problem where Y = Y(1) is only observed if A = 1.
     ◮ Observe i.i.d. samples (A_i, X_i, A_i Y_i), i = 1, ..., n.
     ◮ The estimand is μ0 = E0[Y]; however, it is only partially identified under a simultaneous sensitivity model.
     Goal 1 (Coverage of the true parameter): construct a data-dependent interval [L, U] such that
         P0( μ0 ∈ [L, U] ) ≥ 1 − α
     whenever e0(X, Y) = P0(A = 1 | X, Y) ∈ M(Γ).

 10. Confidence interval II
     ◮ The inverse probability weighting (IPW) identity:
         E0[Y] = E[ AY / e0(X, Y) ]  = (under MAR) =  E[ AY / e0(X) ].
     ◮ Define
         μ^(h) = E0[ AY / e^(h)(X, Y) ].
     ◮ Partially identified region: { μ^(h) : e^(h) ∈ M(Γ) }.
     Goal 2 (Coverage of the partially identified region): construct a data-dependent interval [L, U] such that
         P0( { μ^(h) : e^(h) ∈ M(Γ) } ⊆ [L, U] ) ≥ 1 − α.
     ◮ Imbens and Manski (2004) have discussed the difference between these two Goals.
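     As a quick sanity check of the IPW identity, here is a small simulation sketch (my own illustration, not from the talk; the data-generating process is made up) showing that E[AY / e0(X)] recovers E[Y] when missingness depends only on X:

     ```python
     import numpy as np

     rng = np.random.default_rng(0)
     n = 200_000

     # Hypothetical data-generating process with missingness at random given X
     X = rng.normal(size=n)
     e0 = 1.0 / (1.0 + np.exp(-X))        # true propensity P(A = 1 | X)
     A = rng.binomial(1, e0)
     Y = X + rng.normal(size=n)           # outcome, observed only when A = 1

     mu_true = Y.mean()                   # E[Y], available here because we simulated
     mu_ipw = np.mean(A * Y / e0)         # IPW identity: E[AY / e0(X)] = E[Y] under MAR
     print(mu_true, mu_ipw)               # the two agree up to Monte Carlo error
     ```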

 11. An intuitive idea: "the union method"
     ◮ Suppose for any h, we have a confidence interval [L^(h), U^(h)] such that
         liminf_{n→∞} P0( μ^(h) ∈ [L^(h), U^(h)] ) ≥ 1 − α.
     ◮ Let L = inf_{‖h‖∞ ≤ γ} L^(h) and U = sup_{‖h‖∞ ≤ γ} U^(h), so [L, U] is the union interval.
     Theorem
     1. [L, U] satisfies Goal 1 asymptotically.
     2. Furthermore, if the intervals are "congruent", i.e. there exists α′ < α such that
            limsup_{n→∞} P0( μ^(h) < L^(h) ) ≤ α′   and   limsup_{n→∞} P0( μ^(h) > U^(h) ) ≤ α − α′,
        then [L, U] satisfies Goal 2 asymptotically.

 12. Practical challenge: how to take the union?
     ◮ Suppose ĝ(x) is an estimate of logit( e0(x) ).
     ◮ For a specific difference h, we can estimate e^(h)(x, y) by
         ê^(h)(x, y) = 1 / ( 1 + exp{ h(x, y) − ĝ(x) } ).
     ◮ This leads to a (stabilized) IPW estimate of μ^(h):
         μ̂^(h) = [ (1/n) Σ_{i=1}^n A_i / ê^(h)(X_i, Y_i) ]^{−1} · (1/n) Σ_{i=1}^n A_i Y_i / ê^(h)(X_i, Y_i).
     ◮ Under regularity conditions, Z-estimation theory tells us
         √n ( μ̂^(h) − μ^(h) ) →_d N( 0, (σ^(h))² ),
       so we can use [L^(h), U^(h)] = μ̂^(h) ∓ z_{α/2} · σ̂^(h) / √n.
     ◮ However, computing the union interval requires solving a complicated optimization problem.
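     A minimal numpy sketch of this estimator (function and variable names are mine, not from the talk; it follows the slide's convention ê^(h) = 1 / (1 + exp{h − ĝ})):

     ```python
     import numpy as np

     def stabilized_ipw(A, Y, g_hat, h):
         """Stabilized IPW estimate of mu^(h) for a fixed shift h.

         A      : 0/1 array of treatment / response indicators
         Y      : outcomes (only used where A == 1)
         g_hat  : g_hat(X_i), the fitted logit of the propensity score
         h      : h(X_i, Y_i), the hypothesized logit shift, |h| <= log(Gamma)
         """
         e_h = 1.0 / (1.0 + np.exp(h - g_hat))    # shifted propensity score
         w = A / e_h                               # inverse probability weights
         y_obs = np.where(A == 1, Y, 0.0)          # Y enters only for observed units
         return np.sum(w * y_obs) / np.sum(w)      # Hajek (stabilized) form
     ```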

 13. Bootstrapping sensitivity analysis
     Point-identified parameter (Efron's bootstrap):
         Point estimator  ==[ bootstrap ]==>  Confidence interval
     Partially identified parameter (an analogy):
         Extrema estimator  ==[ optimization + percentile bootstrap + minimax inequality ]==>  Confidence interval
     A simple procedure for simultaneous sensitivity analysis
     1. Generate B random resamples of the data. For each resample, compute the extrema of the IPW estimates over M_{β0}(Γ).
     2. Construct the confidence interval using L = Q_{α/2} of the B minima and U = Q_{1−α/2} of the B maxima.
     Theorem: [L, U] achieves Goal 2 for M_{β0}(Γ) asymptotically.
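     A sketch of this two-step procedure in numpy (illustrative only; `extrema_fn` is an assumed helper that returns the minimum and maximum of the stabilized IPW estimate over the sensitivity model — one possible implementation is sketched after the computation slide below):

     ```python
     import numpy as np

     def bootstrap_sensitivity_ci(A, Y, g_hat, gamma, extrema_fn,
                                  B=1000, alpha=0.05, seed=0):
         """Percentile-bootstrap interval [L, U] intended to cover the partially
         identified region {mu^(h) : |h| <= gamma} with probability ~ 1 - alpha.

         extrema_fn(A, Y, g_hat, gamma) must return (min, max) of the
         stabilized IPW estimate over the sensitivity model (assumed helper).
         """
         rng = np.random.default_rng(seed)
         n = len(A)
         mins, maxs = np.empty(B), np.empty(B)
         for b in range(B):
             idx = rng.integers(0, n, size=n)      # resample with replacement
             mins[b], maxs[b] = extrema_fn(A[idx], Y[idx], g_hat[idx], gamma)
         L = np.quantile(mins, alpha / 2)           # Q_{alpha/2} of the B minima
         U = np.quantile(maxs, 1 - alpha / 2)       # Q_{1-alpha/2} of the B maxima
         return L, U
     ```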

 14. Proof of the Theorem
     Partially identified parameter, three ideas: (1) percentile bootstrap, (2) minimax inequality, (3) optimization.
         Extrema estimator  ==>  Confidence interval
     1. The sampling variability of μ̂^(h) can be captured by the bootstrap. The percentile bootstrap CI is
            [ Q_{α/2}( μ̂^(h)_b ), Q_{1−α/2}( μ̂^(h)_b ) ],
        where b indexes the bootstrap resamples.
     2. Generalized minimax inequality (so the percentile bootstrap CI of the extrema contains the union CI):
            Q_{α/2}( inf_h μ̂^(h)_b ) ≤ inf_h Q_{α/2}( μ̂^(h)_b ) ≤ sup_h Q_{1−α/2}( μ̂^(h)_b ) ≤ Q_{1−α/2}( sup_h μ̂^(h)_b ).
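     A small numerical illustration of the generalized minimax inequality (my own sanity check, not part of the talk): for any matrix of bootstrap replicates indexed by (resample b, shift h), the quantiles of the row-wise extrema bracket the extrema of the column-wise quantiles.

     ```python
     import numpy as np

     rng = np.random.default_rng(1)
     vals = rng.normal(size=(1000, 50))   # vals[b, j] plays the role of mu_hat^(h_j) on resample b
     a = 0.025

     lo_outer = np.quantile(vals.min(axis=1), a)          # Q_a of the per-resample minima
     lo_inner = np.min(np.quantile(vals, a, axis=0))      # inf_h of the per-h Q_a
     hi_inner = np.max(np.quantile(vals, 1 - a, axis=0))  # sup_h of the per-h Q_{1-a}
     hi_outer = np.quantile(vals.max(axis=1), 1 - a)      # Q_{1-a} of the per-resample maxima

     assert lo_outer <= lo_inner <= hi_inner <= hi_outer  # the minimax inequality chain
     ```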

 15. Computation
     Partially identified parameter, third idea: optimization.
         Extrema estimator  ==>  Confidence interval
     3. Computing the extrema of μ̂^(h) is a linear fractional program. Let z_i = e^{h(X_i, Y_i)}; we just need to solve
            max or min   Σ_{i=1}^n A_i Y_i [ 1 + z_i e^{−ĝ(X_i)} ]  /  Σ_{i=1}^n A_i [ 1 + z_i e^{−ĝ(X_i)} ]
            subject to   z_i ∈ [Γ^{−1}, Γ],  i = 1, ..., n.
     ◮ This can be converted to a linear program.
     ◮ Moreover, the solution z must have the same/opposite order as Y, so the time complexity can be reduced to O(n) (optimal).
     The role of the bootstrap
     Compared to the union method, the workflow is greatly simplified:
     1. No need to derive σ^(h) analytically (though we could).
     2. No need to optimize σ^(h) (which is very challenging).
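     A sketch of the extrema computation that exploits this monotone structure (my own implementation under the slide's conventions; it sorts the observed Y's and scans every threshold assignment of z_i ∈ {Γ^{−1}, Γ}, so it is O(n log n) because of the sort rather than the O(n) claimed with a more careful implementation):

     ```python
     import numpy as np

     def extrema_ipw(A, Y, g_hat, gamma):
         """Min and max of the stabilized IPW estimate over the marginal
         sensitivity model, i.e. over z_i = exp(h_i) in [exp(-gamma), exp(gamma)].

         Because the weight 1 + z_i * exp(-g_hat_i) is increasing in z_i and the
         objective is a weighted average of Y, the optimum puts z at one endpoint
         below some threshold of Y and at the other endpoint above it.
         """
         z_lo, z_hi = np.exp(-gamma), np.exp(gamma)
         obs = A == 1
         y = Y[obs]
         q = np.exp(-g_hat[obs])                 # exp(-g_hat(X_i)) for observed units
         order = np.argsort(y)
         y, q = y[order], q[order]

         def scan(z_small_y, z_large_y):
             """Objective values when the k smallest Y's get z_small_y and the
             rest get z_large_y, for every threshold k = 0, ..., len(y)."""
             sw = np.sum(1 + z_large_y * q)       # start with all units at z_large_y
             swy = np.sum((1 + z_large_y * q) * y)
             vals = [swy / sw]
             for k in range(len(y)):              # move unit k to z_small_y
                 d = (z_small_y - z_large_y) * q[k]
                 sw += d
                 swy += d * y[k]
                 vals.append(swy / sw)
             return vals

         max_val = max(scan(z_lo, z_hi))          # upweight the large Y's
         min_val = min(scan(z_hi, z_lo))          # upweight the small Y's
         return min_val, max_val
     ```

     Passing this function as `extrema_fn` to the `bootstrap_sensitivity_ci` sketch above completes the illustrative workflow.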
