Stat 451 Lecture Notes 08: Bootstrap
Ryan Martin, UIC (www.math.uic.edu/~rgmartin)
Based on Chapter 9 in Givens & Hoeting and Chapter 24 in Lange
Updated: April 4, 2016
Outline
1. Introduction
2. Nonparametric bootstrap
3. Parametric bootstrap
4. Bootstrap in regression
5. Better bootstrap CIs
6. Remedies for bootstrap failure
7. Further remarks
Motivation

For hypothesis testing and confidence intervals, one needs the sampling distribution of some statistic. For example, to test H_0: \mu = \mu_0 based on a sample from a N(\mu, \sigma^2) population, we use the t-statistic

    T = \frac{\bar X - \mu_0}{S / \sqrt{n}},

whose null distribution (under the stated conditions) is Student-t. But almost any deviation from this basic setup leads to a tremendously difficult distributional calculation. The goal of the bootstrap is to give a simple approximate solution based on simulation.
Notation

For a distribution with cdf F, suppose we are interested in a parameter \theta = \varphi(F), written as a functional of F. Examples:
- Mean: \varphi(F) = \int x \, dF(x);
- Median: \varphi(F) = \inf\{x : F(x) \geq 0.5\};
- ...

Given data X = \{X_1, \ldots, X_n\} from F, the empirical cdf is

    \hat F(x) = \frac{1}{n} \sum_{i=1}^n I\{X_i \leq x\}, \quad x \in \mathbb{R}.

Then a natural estimate of \theta is \hat\theta = \varphi(\hat F), the same functional applied to the empirical cdf.
Notation (cont.)

For inference, some statistic T(X, F) is used; e.g.,

    T(X, F) = \frac{\bar X - \mu_0}{S / \sqrt{n}}.

Again, the sampling distribution of T(X, F) may be very complicated, unknown, or may depend on the unknown F.

Bootstrap Idea: Replace the unknown cdf F with the empirical cdf \hat F. Produce a numerical approximation of the sampling distribution of T(X, F) by repeated sampling from \hat F.
Notation (cont.)

Let X^\star = \{X^\star_1, \ldots, X^\star_n\} be an iid sample from \hat F, i.e., a sample of size n drawn with replacement from X. Given X^\star, the statistic T^\star = T(X^\star, \hat F) can be evaluated. Repeated sampling of X^\star gives a sequence of T^\star's which can be used to approximate the distribution of T(X, F). For example,

    V\{T(X, F)\} \approx \mathrm{Var}\{T^\star_1, \ldots, T^\star_B\}.

Why should the bootstrap work? The Glivenko-Cantelli theorem says that \hat F \to F as n \to \infty. So, iid sampling from \hat F should be approximately the same as iid sampling from F when n is large.^3

^3 Lots of difficult theoretical work has been done to determine what it means for this approximation to be good and in what kinds of problems it fails.
Basic setup

The procedure above is essentially the nonparametric bootstrap. The sampling distribution of T(X, F) is approximated directly by the empirical distribution of the bootstrap sample T^\star_1, \ldots, T^\star_B. For example:
- Quantiles of T(X, F) are approximated by sample quantiles of T^\star_1, \ldots, T^\star_B.
- The variance of T(X, F) is approximated by the sample variance of T^\star_1, \ldots, T^\star_B.

The bootstrap sample size is usually rather large, e.g., B ~ 1000, but computationally manageable.
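This recipe takes only a few lines. The slides' own computations are in R (see the set.seed(77) footnote later); the following is a hypothetical Python translation, with the helper name, toy data, and B chosen purely for illustration:

```python
import random
import statistics

def bootstrap(data, stat, B=1000, seed=0):
    """Approximate the sampling distribution of stat(data) by
    evaluating stat on B with-replacement resamples of the data."""
    rng = random.Random(seed)
    n = len(data)
    return [stat(rng.choices(data, k=n)) for _ in range(B)]

# Toy data; the statistic here is the sample median.
data = [2.1, 3.4, 1.7, 5.0, 2.8, 3.9, 4.2, 2.5, 3.1, 4.8]
t_star = sorted(bootstrap(data, statistics.median, B=2000))

boot_var = statistics.variance(t_star)       # approximates Var of the median
lo, hi = t_star[int(0.025 * 2000)], t_star[int(0.975 * 2000)]
print(boot_var, lo, hi)                      # variance and 2.5%/97.5% quantiles
```

The same `bootstrap` helper works for any statistic computable from a resample; only the `stat` argument changes.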
Example: variance of a sample median

Example 29.4 in DasGupta (2008). X_1, \ldots, X_n iid Cauchy with median \mu. The sample mean \bar X is a bad estimator of \mu (why?), so use the sample median M_n instead. For odd n, say n = 2k + 1, there is an exact formula:

    V(M_n) = \frac{2\, n!}{(k!)^2 \pi^n} \int_0^{\pi/2} x^k (\pi - x)^k (\cot x)^2 \, dx.

A CLT-type asymptotic approximation is \tilde V(M_n) = \pi^2 / (4n). What about a bootstrap approximation?
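As a check on the two displayed formulas, the exact integral can be evaluated numerically. The routine below is a sketch using composite Simpson's rule in pure Python; the function name and grid size are my own choices:

```python
import math

def exact_var_median_cauchy(n, N=20000):
    """Exact V(M_n) for the median of n = 2k+1 iid standard Cauchy
    observations, via composite Simpson's rule on (0, pi/2)."""
    assert n % 2 == 1 and N % 2 == 0
    k = (n - 1) // 2
    const = 2 * math.factorial(n) / (math.factorial(k) ** 2 * math.pi ** n)
    def f(x):
        if x == 0.0 or x == math.pi / 2:
            return 0.0                 # integrand vanishes at both endpoints
        return x ** k * (math.pi - x) ** k / math.tan(x) ** 2
    h = (math.pi / 2) / N
    s = f(0.0) + f(math.pi / 2)
    for i in range(1, N):
        s += (4 if i % 2 else 2) * f(i * h)
    return const * s * h / 3

v_exact = exact_var_median_cauchy(21)      # ≈ 0.1367, matching the next slide
v_clt = math.pi ** 2 / (4 * 21)            # ≈ 0.1175
print(v_exact, v_clt)
```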
Example: variance of sample median (cont.)

With n = 21 we have V(M_n) = 0.1367 and \tilde V(M_n) = 0.1175. A bootstrap^4 estimate of V(M_n) using B = 5000 is

    \hat V(M_n)_{boot} = 0.1102.

A slight under-estimate of the variance... The main point is that we got a pretty good answer with essentially no effort; the computer does all the hard work.

^4 Note that I used set.seed(77) in the code...
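For reference, a nonparametric bootstrap version of this calculation might look as follows in Python. The slides' actual code is R with set.seed(77); the seed and Cauchy sampler below are my own stand-ins, so the numerical result will differ from 0.1102:

```python
import math
import random
import statistics

rng = random.Random(77)    # analogous to the slides' set.seed(77), but a
                           # different generator, so results will not match

# One observed sample of n = 21 standard Cauchy variates (inverse-cdf method)
n = 21
x = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

# Nonparametric bootstrap: resample x with replacement, recompute the median
B = 5000
m_star = [statistics.median(rng.choices(x, k=n)) for _ in range(B)]
v_boot = statistics.variance(m_star)
print(v_boot)    # compare with the exact value V(M_n) = 0.1367
```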
Technical points

What does it mean for the bootstrap to "work"?
- H_n(x) is the true distribution function of \hat\theta_n.
- H^\star_n(x) is the distribution function of the bootstrap estimator \hat\theta^\star_n.
- The bootstrap is "consistent" if the distance between H_n(x) and H^\star_n(x) converges to 0 (in probability) as n \to \infty.

The bootstrap is successful in many problems, but there are known situations where it may fail:
- the support depends on the parameter;
- the true parameter sits on the boundary of the parameter space;
- the estimator's convergence rate differs from n^{-1/2};
- ...

The bootstrap can detect skewness in the distribution of \hat\theta_n while CLT-type approximations cannot; it often has a "second-order accuracy" property. The bootstrap often underestimates variances.
Bootstrap confidence intervals (CIs)

A primary application of the bootstrap is to construct CIs. The simplest approach is the percentile method. Let \hat\theta^\star_1, \ldots, \hat\theta^\star_B be a bootstrap sample of point estimators. A two-sided 100(1 - \alpha)% bootstrap percentile CI is

    [\xi^\star_{\alpha/2}, \xi^\star_{1-\alpha/2}],

where \xi^\star_p is the 100p-th percentile of the bootstrap sample. Simple and intuitive, but there are "better" methods.
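A percentile interval takes only a few lines. Here is a minimal Python sketch; the helper name, toy data, and B are invented for illustration:

```python
import random
import statistics

def percentile_ci(data, stat, alpha=0.05, B=2000, seed=1):
    """Two-sided 100(1-alpha)% bootstrap percentile interval for stat."""
    rng = random.Random(seed)
    n = len(data)
    t = sorted(stat(rng.choices(data, k=n)) for _ in range(B))
    lo = t[int((alpha / 2) * B)]
    hi = t[min(B - 1, int((1 - alpha / 2) * B))]
    return lo, hi

data = [4.1, 5.2, 3.8, 6.0, 4.9, 5.5, 4.4, 5.1, 3.9, 5.8]
ci = percentile_ci(data, statistics.mean)
print(ci)    # 95% percentile interval for the population mean
```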
Definition

The parametric bootstrap is a variation on the standard (nonparametric) bootstrap discussed previously. Let F = F_\theta be a parametric model. The parametric bootstrap replaces sampling iid from \hat F with sampling iid from F_{\hat\theta}, where \hat\theta is some estimator of \theta. It is potentially more complicated than the nonparametric bootstrap because sampling from F_{\hat\theta} might be more difficult than sampling from \hat F.
Example: variance of sample median

X_1, \ldots, X_n iid Cauchy with median \mu. Write M_n for the sample median. The parametric bootstrap samples X^\star_1, \ldots, X^\star_n from a Cauchy distribution with median M_n. Using B = 5000, the parametric bootstrap gives

    \hat V(M_n)_{p\text{-}boot} = 0.1356.

A bit closer to the true variance, V(M_n) = 0.1367, compared to the nonparametric bootstrap.
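A Python sketch of this parametric bootstrap follows. Note one assumption the slides leave implicit: the Cauchy scale is treated as known and equal to 1, so only the location is plugged in:

```python
import math
import random
import statistics

rng = random.Random(77)
n, B = 21, 5000

def rcauchy(loc, rng):
    """Inverse-cdf draw from Cauchy(location=loc, scale=1)."""
    return loc + math.tan(math.pi * (rng.random() - 0.5))

x = [rcauchy(0.0, rng) for _ in range(n)]   # observed sample, true median 0
m = statistics.median(x)

# Parametric bootstrap: draw fresh Cauchy samples centered at the estimate
m_star = [statistics.median([rcauchy(m, rng) for _ in range(n)])
          for _ in range(B)]
v_pboot = statistics.variance(m_star)
print(v_pboot)    # close to the exact V(M_n) = 0.1367, up to Monte Carlo error
```

Because Cauchy(m, 1) is a pure location shift of Cauchy(0, 1), the bootstrap medians here have exactly the distribution of M_n shifted by m, which is why the parametric bootstrap does so well in this example.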
Example: random effect model

Hierarchical model:

    \mu_1, \ldots, \mu_n iid ~ N(\lambda, \psi^2),
    Y_i | \mu_i ~ N(\mu_i, \sigma_i^2), i = 1, \ldots, n.

Parameters (\lambda, \psi) are unknown but the \sigma_i's are known. The parameter of interest is \psi \geq 0, and values \psi \approx 0 are of interest because they suggest homogeneity. Non-hierarchical version: Y_i ind~ N(\lambda, \sigma_i^2 + \psi^2), i = 1, \ldots, n. Can estimate \psi via maximum likelihood. Use the parametric bootstrap to get confidence intervals?
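To make the boundary issue concrete, here is a hypothetical simulation sketch. It uses a simple truncated moment estimator of \psi in place of the MLE, and made-up values for \lambda and the \sigma_i's, so it only illustrates the mechanics, not the slides' exact experiment:

```python
import math
import random
import statistics

rng = random.Random(0)
n = 100
lam, psi = 1.0, n ** -0.5                        # psi near the boundary 0
sigma = [0.5 + rng.random() for _ in range(n)]   # "known" sigma_i's (made up)

# Marginal model: Y_i ~ N(lam, sigma_i^2 + psi^2), independent
y = [rng.gauss(lam, math.sqrt(s * s + psi * psi)) for s in sigma]

def psi_hat(y, sigma):
    """Moment estimator of psi, truncated at the boundary 0
    (a stand-in for the MLE used in the slides)."""
    ybar = statistics.mean(y)
    resid2 = statistics.mean((yi - ybar) ** 2 for yi in y)
    return math.sqrt(max(0.0, resid2 - statistics.mean(s * s for s in sigma)))

# Parametric bootstrap percentile interval for psi
B = 1000
ph, lh = psi_hat(y, sigma), statistics.mean(y)
psi_star = sorted(
    psi_hat([rng.gauss(lh, math.sqrt(s * s + ph * ph)) for s in sigma], sigma)
    for _ in range(B)
)
ci = (psi_star[int(0.025 * B)], psi_star[int(0.975 * B)])
frac_zero = sum(p == 0.0 for p in psi_star) / B
print(ci, frac_zero)    # a chunk of bootstrap mass piles up at psi = 0
```

The pile-up of bootstrap estimates at exactly 0 is the boundary effect behind the low coverages reported on the next slide.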
Example: random effect model (cont.)

Want to see what happens when \psi \approx 0. Take \psi = n^{-1/2}, near the boundary of \psi \geq 0. Two-sided 95% parametric bootstrap percentile intervals have pretty low coverage in this case, even for large n. It is possible to get intervals with exact coverage...

    n     Coverage   Length
    50    0.758      0.183
    100   0.767      0.138
    250   0.795      0.079
    500   0.874      0.039
Setup

Consider an observational study where pairs z_i = (x_i^\top, y_i)^\top are sampled from a joint predictor-response distribution. Let Z = \{z_1, \ldots, z_n\}. Following the basic bootstrap principle above, repeatedly sample Z^\star = \{z^\star_1, \ldots, z^\star_n\} with replacement from Z. Then do the same approximation of sampling distributions based on the empirical distribution from the bootstrap sample. This is called the paired bootstrap.

What about a fixed design? The complication is that the y_i's are not iid. In such cases, first resample the residuals e_i = y_i - \hat y_i from the original LS fit, and then set y^\star_i = x_i^\top \hat\beta + e^\star_i.
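For the fixed-design case, the residual-resampling recipe can be sketched as follows; the closed-form simple regression fit, design points, and error scale are made up for illustration:

```python
import random
import statistics

def ls_fit(x, y):
    """Closed-form simple linear regression (least squares)."""
    xbar, ybar = statistics.mean(x), statistics.mean(y)
    b1 = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
          / sum((a - xbar) ** 2 for a in x))
    return ybar - b1 * xbar, b1              # (beta0_hat, beta1_hat)

rng = random.Random(0)
x = [i / 10 for i in range(1, 21)]                 # fixed design
y = [1.0 + 2.0 * xi + rng.gauss(0, 0.3) for xi in x]

b0, b1 = ls_fit(x, y)
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Residual bootstrap: keep x fixed, resample residuals, refit
b1_star = []
for _ in range(1000):
    e_star = rng.choices(resid, k=len(x))
    y_star = [fi + ei for fi, ei in zip(fitted, e_star)]
    b1_star.append(ls_fit(x, y_star)[1])
se_boot = statistics.stdev(b1_star)
print(se_boot)    # bootstrap standard error of the slope
```

Keeping the x_i's fixed and resampling only the residuals respects the fixed-design assumption that the y_i's differ only through the errors.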
Example: ratio of slope coefficients

Consider the simple linear regression model

    y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \ldots, n,

where \varepsilon_1, \ldots, \varepsilon_n are iid with mean zero, not necessarily normal. Assume this is an observational study. Suppose the parameter of interest is \theta = \beta_1/\beta_0. A natural estimate of \theta is \hat\theta = \hat\beta_1/\hat\beta_0. To get the (paired) bootstrap distribution of \hat\theta:
1. Sample Z^\star = \{z^\star_1, \ldots, z^\star_n\} with replacement from Z.
2. Fit the regression model with data Z^\star to obtain \hat\beta^\star_0 and \hat\beta^\star_1.
3. Evaluate \hat\theta^\star = \hat\beta^\star_1/\hat\beta^\star_0.
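The steps above can be sketched directly; the data-generating values, seed, and B below are made up for illustration (true \theta = -1/5 = -0.2 in this toy setup):

```python
import random
import statistics

def ls_fit(x, y):
    """Closed-form simple linear regression (least squares)."""
    xbar, ybar = statistics.mean(x), statistics.mean(y)
    b1 = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
          / sum((a - xbar) ** 2 for a in x))
    return ybar - b1 * xbar, b1              # (beta0_hat, beta1_hat)

rng = random.Random(0)
n = 50
x = [rng.uniform(0, 10) for _ in range(n)]           # observational: x random
y = [5.0 - 1.0 * xi + rng.gauss(0, 1.0) for xi in x]  # true theta = -0.2
z = list(zip(x, y))

theta_star = []
for _ in range(2000):
    zs = rng.choices(z, k=n)                 # step 1: resample pairs
    xs, ys = zip(*zs)
    b0, b1 = ls_fit(xs, ys)                  # step 2: refit
    theta_star.append(b1 / b0)               # step 3: evaluate the ratio

theta_star.sort()
ci = (theta_star[int(0.025 * 2000)], theta_star[int(0.975 * 2000)])
print(ci)    # percentile CI for theta = beta1/beta0
```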
Example: ratio of slope coefficients (cont.)

[Figure: density histogram of the paired-bootstrap sample theta.paired, i.e., of \hat\theta = \hat\beta_1/\hat\beta_0, supported on roughly (-0.24, -0.16).]

The 95% bootstrap percentile confidence interval for \theta is (-0.205, -0.173).