Initial values and convergence diagnostics
Applied Bayesian Statistics
Dr. Earvin Balderama
Department of Mathematics & Statistics, Loyola University Chicago
October 12, 2017
Some things to consider in an MCMC algorithm

We need to make sure the algorithm has converged. Here are some decisions that need to be made:

1. Selecting the initial values.
2. Determining if/when the chain(s) have converged.
3. Selecting the number of samples needed to approximate the posterior.
Initial values

In theory, the algorithm will eventually converge no matter what initial values you select. However, taking time to select good initial values can speed up convergence. It is important to try a few initial values to verify they all give the same result; usually, 3-5 separate chains are sufficient.

How to select initial values? A couple of options (see the sketch below):

1. Use frequentist estimates, like MOM or MLE.
2. Purposely pick bad but different initial values for each chain to check convergence.
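A minimal sketch of both options, using a toy random-walk Metropolis sampler for a normal mean. run_mcmc() is a hypothetical helper written here for illustration, not code from the slides:

set.seed(1)
y <- rnorm(100, mean = 3, sd = 2)   # toy data with known sd = 2

run_mcmc <- function(y, init, n_iter = 5000, prop_sd = 0.5) {
  theta <- numeric(n_iter)
  theta[1] <- init                                      # chain starts here
  for (i in 2:n_iter) {
    cand <- rnorm(1, theta[i - 1], prop_sd)             # propose a move
    log_r <- sum(dnorm(y, cand, 2, log = TRUE)) -       # log acceptance ratio
             sum(dnorm(y, theta[i - 1], 2, log = TRUE)) # (flat prior on the mean)
    theta[i] <- if (log(runif(1)) < log_r) cand else theta[i - 1]
  }
  theta
}

init_mle <- mean(y)             # option 1: frequentist (MLE) starting value
inits <- c(init_mle, -20, 25)   # option 2: add purposely bad, dispersed starts
chains <- lapply(inits, function(i) run_mcmc(y, init = i))

If the three chains end up sampling the same region, that is evidence the choice of starting values no longer matters.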
Convergence

The first few samples are probably not draws from the posterior distribution. It can take a few dozen, a few hundred, or even thousands of iterations to move from the initial values to the posterior. When the sampler reaches the posterior, we say the chain has converged. Samples drawn before convergence should be discarded as burn-in.

Note: After convergence, the samples should NOT converge to a single point! They are all draws from the posterior, and the traceplot should ideally look like a caterpillar.
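A minimal sketch of discarding burn-in, assuming theta is a vector of MCMC samples; the burn-in length of 1000 is illustrative, and in practice you would choose it by inspecting the traceplot:

burnin <- 1000
theta_post <- theta[-(1:burnin)]   # drop the pre-convergence samples

# Equivalently, with a coda mcmc object:
# library(coda)
# window(mcmc(theta), start = burnin + 1)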
Examples of traceplots

[Figure: four example traceplots of MCMC sample vs. iteration number, 1000 iterations each.]
Convergence diagnostics

1. Visual inspection: We can visually inspect the chains for convergence (traceplots).
2. Formal diagnostics: There are many measures of convergence. The coda package in R provides many diagnostics, giving a measure of convergence for each parameter. Checking convergence using some of these one-number summaries is more efficient and objective than visual inspection.
Autocorrelation

Ideally, we want independent samples across iterations. But, it's a Markov chain! The autocorrelation function, ρ(h), is the correlation between samples h iterations apart. Of course, we want significant autocorrelation to exist only at low lags h; the lower the better. But large values are OK if the chain can be made long enough.

Thinning: If autocorrelation is zero after some lag t, you can "thin" the samples by keeping every t-th draw to achieve approximate independence (see the sketch below).
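A minimal sketch of checking autocorrelation and thinning, assuming theta holds the post-burn-in samples; t = 10 is an illustrative thinning interval:

acf(theta)                                           # autocorrelation plot (base R)

t <- 10
theta_thin <- theta[seq(1, length(theta), by = t)]   # keep every t-th draw

# With coda, thinning can be done when windowing the mcmc object:
# library(coda)
# window(mcmc(theta), thin = t)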
Effective sample size

Correlated samples carry less information than independent samples of the same size. The effective sample size is the number of independent samples that would give the same amount of precision as the MCMC samples. In other words, we gain the same amount of information from S autocorrelated samples as we do from S_eff independent samples. If S is the actual number of MCMC samples, then the effective sample size is

S_{\text{eff}} = \frac{S}{1 + 2 \sum_{h=1}^{\infty} \rho(h)}.

Note: Obviously, S_eff ≤ S. S_eff should be at least a few thousand for all parameters.
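A minimal sketch of this formula in R, truncating the infinite sum at the first non-positive autocorrelation (a common heuristic; coda's effectiveSize() uses a more careful spectral estimate):

S <- length(theta)
rho <- acf(theta, lag.max = 100, plot = FALSE)$acf[-1]   # drop lag 0
cutoff <- which(rho <= 0)[1]                             # truncate the sum
if (!is.na(cutoff)) rho <- rho[seq_len(cutoff - 1)]
S_eff <- S / (1 + 2 * sum(rho))

# Compare with coda's estimator:
# library(coda); effectiveSize(theta)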
Standard errors of posterior mean estimates

The mean of the MCMC samples is an estimate of the posterior mean; the standard error of this estimate is another diagnostic.

Assuming independence, the naive standard error is

\text{Naive SE} = \frac{s}{\sqrt{S}},

where s is the sample standard deviation and S is the number of samples. More realistically, the time-series standard error is

\text{Time-series SE} = \frac{s}{\sqrt{S_{\text{eff}}}}.
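Both standard errors are easy to compute directly; a sketch, assuming theta holds the post-burn-in samples (coda's summary() of an mcmc object reports both):

library(coda)
s <- sd(theta)
naive_se <- s / sqrt(length(theta))          # pretends the draws are independent
ts_se    <- s / sqrt(effectiveSize(theta))   # accounts for autocorrelation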
Gelman-Rubin statistic

If we run multiple chains, we hope that all chains give the same result. The Gelman-Rubin statistic is a measure of agreement between chains; it essentially performs an ANOVA test of the chain means. It is scaled so that 1 is perfect agreement, and about 1.1 is decent but not great convergence. When the statistic reaches 1, this indicates convergence.
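A minimal sketch with coda, where chain1, chain2, chain3 are hypothetical vectors of samples from independently started runs:

library(coda)
chains <- mcmc.list(mcmc(chain1), mcmc(chain2), mcmc(chain3))

gelman.diag(chains)   # potential scale reduction factor; ~1 indicates agreement
gelman.plot(chains)   # how the statistic evolves as iterations accumulate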
Convergence diagnostics

Using the coda package, we can easily compute some convergence diagnostics. Suppose theta is the output vector containing your MCMC samples.

# 1. Load the coda package
library(coda)

# 2. Create an mcmc or mcmc.list object containing your Markov chains
chain <- mcmc(theta)

# 3. Use coda functions for diagnostic plots and summaries
effectiveSize(chain)   # effective sample size
autocorr.plot(chain)   # autocorrelation plot
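A few more coda functions that fit the same workflow (these are actual coda exports, though their use here is my addition, not from the slides):

traceplot(chain)      # visual inspection of mixing
summary(chain)        # posterior summaries with naive and time-series SEs
geweke.diag(chain)    # Geweke's z-scores comparing early vs. late chain segments
heidel.diag(chain)    # Heidelberger-Welch stationarity test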