parametric bootstrap

Parametric bootstrap, August 30, 2017



  1. Parametric bootstrap, August 30, 2017. Sections: Resampling from the data or from distribution; Simple Example; Spline Example

  2. Bootstrap + Monte Carlo = Parametric Bootstrap

  3. There is no more in data than the data: one view
     - The bootstrap is a general tool for assessing statistical accuracy by ‘creating’ data from the data.
     - It samples randomly, with replacement, from the data to study how a quantity of interest behaves under this resampling.
     - It is used to assess the variability of a given characteristic of the data.

  4. There is a model behind the data: another view
     - Study a mathematical model theoretically.
     - Fit the model statistically using the data.
     - Use the theory to assess variability or other properties.
     - What if the model is difficult to study?

  5. Combine model with sampling
     - Fit the theoretical model statistically using the data.
     - Take Monte Carlo samples from the fitted model to investigate variability or other properties.
     - We use the model to generate new samples, as opposed to the non-parametric bootstrap, where the samples are drawn from the data directly.
     - Since the model is fitted to the data, the data are still used, but indirectly.

  6. Simple example: bootstrap and Monte Carlo

     Bootstrap:

         #Data
         x=scan("Table2_1.txt")
         n=length(x)
         mean(x)
         sd(x)

         #Bootstrapping variances
         B=1000
         Bvar=vector('numeric',B)
         for(i in 1:B)
         {
           Bvar[i]=var(sample(x,n,replace=TRUE))
         }
         sd(Bvar)
         hist(Bvar,nclass=10)

     Monte Carlo (uses n from the bootstrap code):

         #Monte Carlo study of variances
         N=15000
         MCvar=vector('numeric',N)
         for(i in 1:N){
           MCx=rnorm(n,50,0.1)
           MCvar[i]=var(MCx)
         }
         mean(MCvar)
         sd(MCvar)
         X11() #graphical window in Unix
               #windows() in Windows
               #quartz() in Mac
         hist(MCvar,nclass=10)
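
As a theoretical cross-check of the Monte Carlo code above, the following standard normal-theory result (not stated on the slide) gives the values that mean(MCvar) and sd(MCvar) should approach. Here $S^2$ denotes the sample variance of $n$ independent $N(\mu, \sigma^2)$ observations; the symbols are introduced for this note only.

$$
\frac{(n-1)\,S^2}{\sigma^2} \sim \chi^2_{n-1}
\quad\Longrightarrow\quad
E\!\left[S^2\right] = \sigma^2,
\qquad
\operatorname{sd}\!\left(S^2\right) = \sigma^2 \sqrt{\frac{2}{n-1}} .
$$

With $\sigma = 0.1$, as in rnorm(n,50,0.1), mean(MCvar) should be close to $0.01$ and sd(MCvar) close to $0.01\sqrt{2/(n-1)}$, where $n$ is the length of the data vector.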

  7. Simple example: parametric bootstrap

     Fitting the data by a normal model:

         #Fit the model --
         # -- normal distribution

         #Data
         x=scan("Table2_1.txt")
         n=length(x)
         mu=mean(x)
         sigma=sd(x)

     Parametric bootstrap:

         #Simulate from the fit
         PB=1000
         PBvar=vector('numeric',PB)
         PBsd=PBvar
         for(i in 1:PB)
         {
           PBx=rnorm(n,mu,sigma)
           PBvar[i]=var(PBx)
         }
         mean(PBvar)
         sd(PBvar)
         X11()
         hist(PBvar,nclass=10)
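
The two resampling schemes can be compared side by side in a self-contained sketch. Since Table2_1.txt is not available here, the data vector below is simulated and purely hypothetical, as are the names np_var and pb_var; this is an illustration under those assumptions, not the slides' own analysis.

```r
# Hypothetical stand-in for Table2_1.txt: 30 simulated measurements.
set.seed(1)
x <- rnorm(30, mean = 50, sd = 0.1)
n <- length(x)

B <- 1000

# Non-parametric bootstrap: resample the observed values with replacement.
np_var <- replicate(B, var(sample(x, n, replace = TRUE)))

# Parametric bootstrap: fit a normal model, then simulate from the fit.
mu_hat <- mean(x)
sigma_hat <- sd(x)
pb_var <- replicate(B, var(rnorm(n, mu_hat, sigma_hat)))

# Bootstrap standard errors of the sample variance under each scheme.
c(nonparametric = sd(np_var), parametric = sd(pb_var))
```

When the normal model is reasonable for the data, the two standard errors should be of similar size; a large gap would suggest that the fitted model describes the data poorly.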

  8. Fitting cubic splines: B-spline basis
     We fit the data on the right by the cubic B-splines $h_j(x)$, $j = 1, \dots, 7$, on the left.
     Review question: Why are there seven B-splines?

  9. Fitting through linear regression
     We look for a fit of the form
     $$\mu(x) = \sum_{j=1}^{7} \beta_j h_j(x).$$
     From the standard regression solution we get
     $$\hat\beta = (H^T H)^{-1} H^T y,$$
     where $H$ is the matrix with entries $h_j(x_i)$, so that the fit is
     $$\hat\mu(x) = \sum_{j=1}^{7} \hat\beta_j h_j(x).$$
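
A minimal sketch of this regression fit in R, under stated assumptions: the data are simulated (they are not the data set from the slides), and the basis is built with bs() from the splines package using three interior knots, which for cubic splines with an intercept gives 3 + 4 = 7 basis functions. The slides do not say how their basis was constructed.

```r
library(splines)

# Hypothetical data: a smooth signal plus noise (not the slides' data set).
set.seed(2)
N <- 50
x <- sort(runif(N, 0, 3))
y <- sin(2 * x) + rnorm(N, sd = 0.3)

# N x 7 basis matrix: cubic B-splines, three interior knots, intercept included.
H <- bs(x, df = 7, degree = 3, intercept = TRUE)

# Least-squares coefficients, beta_hat = (H^T H)^{-1} H^T y.
beta_hat <- solve(crossprod(H), crossprod(H, y))

# Fitted curve at the observed x values.
mu_hat <- drop(H %*% beta_hat)

# Cross-check against lm() on the same basis (no extra intercept).
fit_lm <- lm(y ~ H - 1)
max(abs(mu_hat - fitted(fit_lm)))
```

Solving the normal equations with solve(crossprod(H), ...) mirrors the formula on the slide; in practice lm() is the more numerically stable route to the same fit.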

  10. Assessing uncertainty of the fit
      - We have obtained the fit, but we also want to assess its uncertainty (variability).
      - The variability of a curve is not as straightforward a concept as the variability of a point estimate; there are many ways to define it.
      - The best approach is to observe how the curve varies across different fits of the model.
      - For this we need many samples of data.
      - The bootstrap can be suitable.
      - But how do we resample from the data?
      - One could resample directly from the data (both the y's and the x's). However, when the variability of the x's is not of interest, it is better to sample from the residuals.

  11. Resampling from the residuals
      - Residuals: $\hat\varepsilon = y - \hat\mu(x)$.
      - Compute bootstrap samples $\varepsilon^*$ from the residuals $\hat\varepsilon$.
      - For each new sample $\varepsilon^*$, evaluate the bootstrap version of the output data, $y^* = \hat\mu(x) + \varepsilon^*$.
      - Fit new cubic splines to each bootstrap sample and plot them on the graph.
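
The residual-resampling steps above can be sketched as follows. The data are simulated for illustration only, and the helper name fit_curve, the number of replicates B = 200 and the bs() basis with three interior knots are all assumptions of this sketch rather than details taken from the slides.

```r
library(splines)

# Hypothetical data, simulated only for illustration.
set.seed(3)
N <- 50
x <- sort(runif(N, 0, 3))
y <- sin(2 * x) + rnorm(N, sd = 0.3)
H <- bs(x, df = 7, degree = 3, intercept = TRUE)

# Helper (hypothetical name): least-squares spline fit evaluated at x.
fit_curve <- function(y) drop(H %*% solve(crossprod(H), crossprod(H, y)))

mu_hat <- fit_curve(y)
res <- y - mu_hat                      # residuals eps_hat = y - mu_hat(x)

B <- 200
boot_curves <- replicate(B, {
  eps_star <- sample(res, N, replace = TRUE)  # resample the residuals
  fit_curve(mu_hat + eps_star)                # refit on y* = mu_hat + eps*
})

# Plot the data, the bootstrap curves, and the original fit.
plot(x, y, pch = 16, col = "grey40")
matlines(x, boot_curves, lty = 1, col = rgb(0, 0, 1, 0.1))
lines(x, mu_hat, lwd = 2)
```

Each column of boot_curves is one refitted curve; the x values stay fixed throughout, which is the point of resampling residuals rather than (x, y) pairs.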

  12. Parametric bootstrap
      - One can assume a normal model for the errors, with mean zero and variance
        $$\hat\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} \bigl(y_i - \hat\mu(x_i)\bigr)^2.$$
      - Compute parametric bootstrap samples $\varepsilon^*$ by sampling from $N(0, \hat\sigma^2)$.
      - For each new sample $\varepsilon^*$, evaluate the bootstrap version of the output data, $y^* = \hat\mu(x) + \varepsilon^*$.
      - Fit new cubic splines to each bootstrap sample and plot them on the graph.
      - The result will be similar to that seen on the previous graph (but not identical).
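
A matching sketch of the parametric version, again on simulated data and with hypothetical names (fit_curve, pb_curves, pointwise_sd). The only change from residual resampling is that the errors are drawn from the fitted normal model, with the variance estimate dividing by N as in the formula above. The pointwise standard deviation at the end is just one possible way to summarize the variability of the curve, not something the slides prescribe.

```r
library(splines)

# Hypothetical data, simulated only for illustration.
set.seed(4)
N <- 50
x <- sort(runif(N, 0, 3))
y <- sin(2 * x) + rnorm(N, sd = 0.3)
H <- bs(x, df = 7, degree = 3, intercept = TRUE)

# Helper (hypothetical name): least-squares spline fit evaluated at x.
fit_curve <- function(y) drop(H %*% solve(crossprod(H), crossprod(H, y)))

mu_hat <- fit_curve(y)

# Error variance estimate: mean squared residual (divisor N).
sigma_hat <- sqrt(mean((y - mu_hat)^2))

B <- 200
pb_curves <- replicate(B, {
  eps_star <- rnorm(N, mean = 0, sd = sigma_hat)  # draw errors from N(0, sigma_hat^2)
  fit_curve(mu_hat + eps_star)                    # refit on y* = mu_hat + eps*
})

# One possible summary: standard deviation of the refitted curves at each x.
pointwise_sd <- apply(pb_curves, 1, sd)

plot(x, y, pch = 16, col = "grey40")
matlines(x, pb_curves, lty = 1, col = rgb(1, 0, 0, 0.1))
lines(x, mu_hat, lwd = 2)
```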
