Markov chain Monte Carlo sampling
SPiNCOM reading group, Jun. 10th, 2016
Dimitris Berberidis

Problem statement - Motivation
Goal: Draw samples from a given pdf $p(x)$
Impact of sampling:
- Bayesian inference ($x$: unknowns, $y$: data); our focus
  - Normalization
  - Marginalization
  - Expectation
- Optimization: non-convex multimodal objectives
- Statistical mechanics
- Penalized likelihood model selection
- Simulation of physical systems

Roadmap
- Motivation
- Basic Monte Carlo
- Rejection sampling
- Markov chain Monte Carlo
  - Metropolis-Hastings
  - Gibbs sampling
- Importance sampling
  - Relation to rejection sampling
  - Sequential importance sampling (particle filtering)
- Conclusions
C. Andrieu, N. de Freitas, A. Doucet and M. Jordan, "An Introduction to MCMC for Machine Learning," Machine Learning, pp. 5-43, Jan. 2003.

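The Monte Carlo principle
- Draw $N$ i.i.d. samples $\{x^{(i)}\}_{i=1}^{N}$ from $p(x)$
- Approximate $p(x)$ with $p_N(x) = \frac{1}{N} \sum_{i=1}^{N} \delta_{x^{(i)}}(x)$
- Approximate integrals $I(f) = \int f(x) p(x) dx$ with tractable sums $I_N(f) = \frac{1}{N} \sum_{i=1}^{N} f(x^{(i)})$
- $I_N(f)$ is unbiased for finite $N$, with $\mathrm{var}(I_N(f)) = \sigma_f^2 / N$ and $\sigma_f^2 = E_p[f^2(x)] - I^2(f)$
- Approximate the maximum of $p(x)$ as $\hat{x} = \arg\max_{x^{(i)}} p(x^{(i)})$
Challenge: What if $p(x)$ does not have a standard form (e.g. Gaussian)?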
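As a minimal sketch of the principle, the snippet below replaces an expectation by an average over i.i.d. draws. The standard normal target and the test function f(x) = x^2 are illustrative assumptions (they do not come from the slides), chosen because the exact value E[x^2] = 1 is known.

```python
import random


def monte_carlo_expectation(f, sampler, n_samples):
    """Approximate E_p[f(x)] by the average of f over n_samples draws from p."""
    return sum(f(sampler()) for _ in range(n_samples)) / n_samples


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed target p(x): standard normal; f(x) = x^2, so the exact answer is 1.
estimate = monte_carlo_expectation(lambda x: x * x,
                                   lambda: rng.gauss(0.0, 1.0),
                                   20000)
print(estimate)
```

The error of such an estimate shrinks like 1/sqrt(N) regardless of the dimension of x, which is what makes the approach attractive when sampling from the target is possible.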

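Rejection Sampling
- Instead of $p(x)$, draw i.i.d. samples from an "easy" proposal pdf $q(x)$
- Proposal pdf should satisfy: $p(x) \le M q(x)$ for some $M < \infty$
- Rejection sampling algorithm: draw $x^{(i)} \sim q(x)$ and $u \sim U(0,1)$; accept $x^{(i)}$ if $u < p(x^{(i)}) / (M q(x^{(i)}))$, otherwise reject it
- Accepted samples are distributed according to $p(x)$
- Severe limitation in practice: $M$ can be too large, so most proposals are rejected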
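The accept/reject rule can be sketched as follows. The Beta(2,2) target, the uniform proposal, and the bound M = 1.5 are assumptions made only for illustration; any pair with p(x) <= M q(x) would do.

```python
import random


def rejection_sample(p, q_sample, q_pdf, M, n_samples, rng):
    """Draw n_samples from p using proposal q, assuming p(x) <= M * q(x) for all x.

    Returns the accepted samples and the empirical acceptance rate.
    """
    accepted, n_proposed = [], 0
    while len(accepted) < n_samples:
        x = q_sample(rng)          # candidate from the easy proposal
        u = rng.random()           # uniform on [0, 1)
        n_proposed += 1
        if u < p(x) / (M * q_pdf(x)):
            accepted.append(x)
    return accepted, len(accepted) / n_proposed


def beta22_pdf(x):
    """Assumed target: Beta(2,2) density 6x(1-x) on [0, 1], maximum 1.5 at x = 0.5."""
    return 6.0 * x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0


rng = random.Random(1)
samples, acc_rate = rejection_sample(beta22_pdf,
                                     lambda r: r.random(),  # q = U(0, 1)
                                     lambda x: 1.0,         # its density
                                     1.5, 5000, rng)
print(sum(samples) / len(samples), acc_rate)
```

For a normalized target the acceptance rate is 1/M, which is why a loose bound M makes the method impractical.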

Basics of Markov chains
- Discrete stochastic process $\{x^{(i)}\}$ is a Markov chain (MC) if $p(x^{(i)} \mid x^{(i-1)}, \ldots, x^{(1)}) = T(x^{(i)} \mid x^{(i-1)})$
- MC is homogeneous if $T$ is time invariant
- After $t$ steps, probability of state is: $\mu_t = \mu_0 T^t$ (row vector $\mu_0$: initial distribution)
- MC reaches stationary distribution $\pi$ if: $\pi T = \pi$
- MC converges to a stationary distribution if it is
  - Irreducible: all states are visited (transition graph connected)
  - Aperiodic: does not get trapped into cycles
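A small numerical illustration of convergence to a stationary distribution: repeatedly applying the transition matrix to any initial distribution. The 3-state matrix below is an assumed example (irreducible, and aperiodic thanks to the self-loop on the second state), not one taken from the slides.

```python
def step(mu, T):
    """One step of the chain: mu_next[j] = sum_i mu[i] * T[i][j]."""
    n = len(T)
    return [sum(mu[i] * T[i][j] for i in range(n)) for j in range(n)]


def distribution_after(mu0, T, t):
    """State distribution after t steps, starting from the distribution mu0."""
    mu = list(mu0)
    for _ in range(t):
        mu = step(mu, T)
    return mu


# Assumed transition matrix; each row sums to 1.
T = [[0.0, 1.0, 0.0],
     [0.0, 0.1, 0.9],
     [0.6, 0.4, 0.0]]

pi_a = distribution_after([1.0, 0.0, 0.0], T, 200)
pi_b = distribution_after([0.0, 0.0, 1.0], T, 200)
print(pi_a, pi_b)
```

Both starting points end at the same vector, and applying one more step leaves it unchanged, which is the stationarity condition.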

Markov chain Monte Carlo
- Goal: Construct MC with target $p(x)$ as stationary distribution
- Sufficient condition: the detailed balance condition (DBC) $p(x) T(x' \mid x) = p(x') T(x \mid x')$
- Continuous states
  - Transition kernel: $K(x' \mid x)$, with $\int p(x) K(x' \mid x) dx = p(x')$
  - DBC remains the same
- Run MC to convergence and obtain non-i.i.d. samples
- Design $K$ to achieve fast convergence (e.g. small mixing time)
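Detailed balance can be checked numerically on a small discrete state space. The sketch below builds one possible transition matrix that satisfies it (uniform proposal over the other states, acceptance min(1, p[j]/p[i])); this construction and the 4-state target are assumptions for illustration, and the helper names are hypothetical.

```python
def metropolis_matrix(p):
    """Transition matrix on states 0..n-1 for a target p (may be unnormalized).

    Proposal: uniform over the other n-1 states; acceptance: min(1, p[j]/p[i]).
    """
    n = len(p)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j != i:
                T[i][j] = (1.0 / (n - 1)) * min(1.0, p[j] / p[i])
        # Rejected proposals keep the chain at i (diagonal is still 0 here).
        T[i][i] = 1.0 - sum(T[i])
    return T


def satisfies_dbc(p, T, tol=1e-12):
    """True if p[i] * T[i][j] == p[j] * T[j][i] for every pair of states."""
    n = len(p)
    return all(abs(p[i] * T[i][j] - p[j] * T[j][i]) <= tol
               for i in range(n) for j in range(n))


def is_stationary(p, T, tol=1e-12):
    """True if sum_i p[i] * T[i][j] == p[j] for every state j."""
    n = len(p)
    return all(abs(sum(p[i] * T[i][j] for i in range(n)) - p[j]) <= tol
               for j in range(n))


p = [1.0, 2.0, 3.0, 4.0]  # assumed unnormalized target
T = metropolis_matrix(p)
print(satisfies_dbc(p, T), is_stationary(p, T))
```

The condition is sufficient but not necessary: a chain can leave p invariant without being reversible, so failing this check does not by itself rule out stationarity.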

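The Metropolis-Hastings sampler
- Propose $x^* \sim q(x^* \mid x)$ and accept it with probability $A(x, x^*) = \min\{1, \frac{p(x^*) q(x \mid x^*)}{p(x) q(x^* \mid x)}\}$; otherwise stay at $x$
- MH transition kernel: $K_{MH}(x' \mid x) = q(x' \mid x) A(x, x') + \delta_x(x') r(x)$, with rejection probability $r(x) = \int q(x^* \mid x) (1 - A(x, x^*)) dx^*$
- $K_{MH}$ satisfies DBC, so it admits $p(x)$ as stationary distribution
- Scale of $p(x)$ not needed! (recall $p(x \mid y) \propto p(y \mid x) p(x)$)
- MH always aperiodic; irreducible if support of $q$ includes support of $p$
- Special cases of MH
  - Independent sampler: $q(x^* \mid x) = q(x^*)$
  - Metropolis sampler: $q(x^* \mid x) = q(x \mid x^*)$, so $A(x, x^*) = \min\{1, p(x^*) / p(x)\}$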
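A minimal sketch of a Metropolis-Hastings loop, written in log form so that only an unnormalized target is needed. The Gaussian target centred at 3, the random-walk proposal, and the burn-in length are illustrative assumptions; the arbitrary constant added to the log target is there only to show that the scale cancels in the acceptance ratio.

```python
import math
import random


def metropolis_hastings(log_p, propose, log_q, x0, n_steps, rng):
    """Generic MH sampler.

    log_p(x): log of the (possibly unnormalized) target.
    propose(x, rng): draws a candidate from q(. | x).
    log_q(x_to, x_from): log q(x_to | x_from), up to an additive constant.
    """
    x, chain, n_accepted = x0, [], 0
    for _ in range(n_steps):
        x_new = propose(x, rng)
        log_ratio = (log_p(x_new) + log_q(x, x_new)) - (log_p(x) + log_q(x_new, x))
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            x = x_new
            n_accepted += 1
        chain.append(x)  # a rejected move repeats the current state
    return chain, n_accepted / n_steps


SIGMA = 1.0  # assumed random-walk step size


def log_target(x):
    return -0.5 * (x - 3.0) ** 2 + 7.0  # unnormalized N(3, 1); the +7 is arbitrary


def propose(x, rng):
    return x + rng.gauss(0.0, SIGMA)


def log_q(x_to, x_from):
    return -0.5 * ((x_to - x_from) / SIGMA) ** 2


chain, acc_rate = metropolis_hastings(log_target, propose, log_q,
                                      0.0, 20000, random.Random(3))
kept = chain[2000:]  # discard an assumed burn-in period
print(sum(kept) / len(kept), acc_rate)
```

Because this proposal is symmetric the two log_q terms cancel, which is the Metropolis special case; the function keeps them so that an asymmetric proposal can be plugged in unchanged.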

Example of MH sampling
- Three different Gaussians as proposal distributions
- Choice of proposal distribution is critical!

MCMC with mixture of transition kernels
- Key property
  - Let $K_1$ and $K_2$ be transition kernels that converge to $p(x)$
  - Then the mixture $\nu K_1 + (1 - \nu) K_2$, $0 \le \nu \le 1$, also converges to $p(x)$
- Intuition
  - Local random walk reduces the number of rejections
  - Global proposal helps discover other modes
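The mixture idea can be sketched by choosing, at every iteration, between a global and a local MH move. The bimodal target (modes at -4 and 4), the proposal scales, and the mixing weight nu = 0.2 are all assumptions for illustration; the function names are hypothetical.

```python
import math
import random


def log_target(x):
    """Unnormalized log of an equal mixture of N(-4, 1) and N(4, 1) (assumed)."""
    a, b = -0.5 * (x + 4.0) ** 2, -0.5 * (x - 4.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical safety
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def accept(log_ratio, rng):
    return math.log(1.0 - rng.random()) < log_ratio


def local_step(x, rng, step=0.5):
    """Random-walk Metropolis move (symmetric proposal)."""
    x_new = x + rng.gauss(0.0, step)
    return x_new if accept(log_target(x_new) - log_target(x), rng) else x


def global_step(x, rng, scale=6.0):
    """Independent-sampler MH move with proposal N(0, scale^2)."""
    x_new = rng.gauss(0.0, scale)
    log_q_cur = -0.5 * (x / scale) ** 2
    log_q_new = -0.5 * (x_new / scale) ** 2
    log_ratio = (log_target(x_new) + log_q_cur) - (log_target(x) + log_q_new)
    return x_new if accept(log_ratio, rng) else x


def mixture_chain(x0, n_steps, nu, rng):
    """Use the global kernel with probability nu, the local kernel otherwise."""
    x, chain = x0, []
    for _ in range(n_steps):
        x = global_step(x, rng) if rng.random() < nu else local_step(x, rng)
        chain.append(x)
    return chain


chain = mixture_chain(-4.0, 20000, 0.2, random.Random(2))
frac_right = sum(1 for x in chain if x > 0.0) / len(chain)
print(frac_right)
```

Started in the left mode, the chain should spend roughly half of its time in each mode; the local moves refine the exploration within a mode, while the occasional global move carries the chain across the low-probability region between them.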

Example of MH with mixture of kernels (figure: target and proposal densities)

Experiment with mixture of kernels

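Simulated Annealing
- Simple modification of the MH algorithm for global optimization
- Simulates a non-homogeneous MC whose invariant distribution at iteration $i$ is $p_i(x) \propto p^{1/T_i}(x)$, with temperature $T_i \to 0$
- Intuition: $p^{1/T}(x)$ concentrates around the global max. of $p(x)$ as $T \to 0$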
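A minimal sketch of simulated annealing as a Metropolis sampler whose log acceptance ratio is divided by a decreasing temperature. The objective (local maximum near -2, global maximum near 3), the geometric cooling schedule, and the step size are assumptions for illustration. Tracking the best state visited is a common practical addition and is not something the slides state.

```python
import math
import random


def log_p(x):
    """Assumed unnormalized log objective: exp(-(x+2)^2) + 2*exp(-(x-3)^2)."""
    a, b = -(x + 2.0) ** 2, math.log(2.0) - (x - 3.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical safety
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def simulated_annealing(log_p, x0, n_steps, rng, t0=5.0, cooling=0.999, step=1.0):
    """Random-walk Metropolis targeting p(x)^(1/T_i) with T_i = t0 * cooling^i.

    Returns the final state and the best (highest log_p) state visited.
    """
    x, best_x, temp = x0, x0, t0
    for _ in range(n_steps):
        x_new = x + rng.gauss(0.0, step)
        log_ratio = (log_p(x_new) - log_p(x)) / temp  # tempered acceptance ratio
        if math.log(1.0 - rng.random()) < log_ratio:
            x = x_new
            if log_p(x) > log_p(best_x):
                best_x = x
        temp *= cooling  # geometric cooling (an assumed schedule)
    return x, best_x


final_x, best_x = simulated_annealing(log_p, -2.0, 10000, random.Random(4))
print(final_x, best_x)
```

At high temperature the chain crosses between the two maxima freely; as the temperature drops, downhill moves are accepted less and less often, so the chain settles near a maximum. How fast to cool is a tuning choice: cooling too quickly can freeze the chain in the local maximum.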

Experiment with simulated annealing

Cycles of MH kernels
- Multivariate state $x$ is split into blocks $x = (x_{b_1}, \ldots, x_{b_{n_b}})$
- Each block is updated separately
- Transition kernel: the cycle (composition) of the per-block MH kernels
- Block correlated variables together for fast convergence
- Trade-off on block size
  - Small block size: chain takes a long time to explore the space
  - Large block size: acceptance probability is small

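Gibbs sampling
- For $x = (x_1, \ldots, x_n)$ assume that we know the full conditionals $p(x_j \mid x_{-j})$, where $x_{-j}$ denotes all components except $x_j$
- Gibbs sampling proposal distribution: $q(x^* \mid x) = p(x_j^* \mid x_{-j})$ if $x_{-j}^* = x_{-j}$, and 0 otherwise
- Acceptance probability = 1
- Combined with MH if a full conditional is not easy to sample
- To sample Markov networks, condition on the "Markov blanket"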
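A small Gibbs sampler sketch for a case where the full conditionals are known in closed form. The zero-mean, unit-variance bivariate Gaussian with correlation rho = 0.8 is an assumed example; its conditionals are x1 | x2 ~ N(rho * x2, 1 - rho^2), and symmetrically for x2.

```python
import math
import random


def gibbs_bivariate_gaussian(rho, n_steps, rng, x1=0.0, x2=0.0):
    """Alternate exact draws from the two full conditionals; no move is rejected."""
    cond_std = math.sqrt(1.0 - rho * rho)
    samples = []
    for _ in range(n_steps):
        x1 = rng.gauss(rho * x2, cond_std)  # x1 | x2
        x2 = rng.gauss(rho * x1, cond_std)  # x2 | new x1
        samples.append((x1, x2))
    return samples


def correlation(pairs):
    """Sample correlation coefficient of a list of (a, b) pairs."""
    n = len(pairs)
    ma = sum(a for a, _ in pairs) / n
    mb = sum(b for _, b in pairs) / n
    cov = sum((a - ma) * (b - mb) for a, b in pairs) / n
    va = sum((a - ma) ** 2 for a, _ in pairs) / n
    vb = sum((b - mb) ** 2 for _, b in pairs) / n
    return cov / math.sqrt(va * vb)


samples = gibbs_bivariate_gaussian(0.8, 20000, random.Random(6))
print(correlation(samples))
```

The stronger the correlation, the smaller each conditional step is relative to the joint spread and the slower the chain mixes, which is one motivation for updating correlated variables together as a block.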

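Importance sampling - Basics
- Key idea: sample from a proposal $q(x)$ and weight with $w(x) = p(x) / q(x)$
- Draw $N$ i.i.d. samples $x^{(i)}$ from $q(x)$ to obtain: $\hat{I}_N(f) = \frac{1}{N} \sum_{i=1}^{N} f(x^{(i)}) w(x^{(i)})$
- Target is approximated by $\hat{p}_N(x) = \frac{1}{N} \sum_{i=1}^{N} w(x^{(i)}) \delta_{x^{(i)}}(x)$
- Estimate is unbiased and: $\hat{I}_N(f) \to I(f)$ almost surely as $N \to \infty$
- If scale of $p(x)$ unknown, set $w(x) \propto p(x) / q(x)$ and normalize: $\tilde{w}^{(i)} = w(x^{(i)}) / \sum_{j=1}^{N} w(x^{(j)})$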
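The weighting step, in its self-normalized form for a target known only up to scale, can be sketched as follows. The unnormalized N(1, 1) target and the wider N(0, 2^2) proposal are assumptions chosen for illustration.

```python
import math
import random


def self_normalized_is(f, p_tilde, q_sample, q_pdf, n_samples, rng):
    """Estimate E_p[f] from draws of q, with weights p_tilde/q normalized to sum to 1.

    p_tilde may be unnormalized. Returns the estimate and the normalized weights.
    """
    xs = [q_sample(rng) for _ in range(n_samples)]
    ws = [p_tilde(x) / q_pdf(x) for x in xs]
    total = sum(ws)
    ws = [w / total for w in ws]
    return sum(w * f(x) for w, x in zip(ws, xs)), ws


def p_tilde(x):
    return math.exp(-0.5 * (x - 1.0) ** 2)  # N(1, 1) without its constant


def q_pdf(x):
    return math.exp(-0.5 * (x / 2.0) ** 2) / (2.0 * math.sqrt(2.0 * math.pi))


estimate, weights = self_normalized_is(lambda x: x, p_tilde,
                                       lambda r: r.gauss(0.0, 2.0),
                                       q_pdf, 20000, random.Random(7))
print(estimate)
```

Normalizing the weights introduces a small bias for finite N that vanishes as N grows. The proposal here has heavier tails than the target, which keeps the weights bounded; a proposal with lighter tails can make the weight variance infinite.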

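Efficiency of importance sampling
- Proposal pdf $q(x)$ selected to minimize the variance $\mathrm{var}_q(f(x) w(x)) = E_q[f^2(x) w^2(x)] - I^2(f)$
- Variance lower bound (using Jensen's ineq.): $E_q[f^2(x) w^2(x)] \ge (E_q[|f(x)| w(x)])^2 = (\int |f(x)| p(x) dx)^2$
- Optimum importance distribution: $q^*(x) = |f(x)| p(x) / \int |f(x)| p(x) dx$
- IS can be super efficient!
- Generally difficult to sample from $q^*(x)$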

RS as a special case of IS
- Recall the rejection sampling method
- Define a new target distribution $p^*(x, u)$ in the extended space of $(x, u)$
- IS with target $p^*(x, u)$ and proposal $q(x) U_{[0,1]}(u)$
- Equivalent to RS if samples are used to obtain an estimate of $E_p[f(x)]$
- IS generally (and provably) more efficient for this purpose
Y. Chen, "Another look at rejection sampling through importance sampling," Statistics & Probability Letters, pp. 277-283, May 2005.

Hidden Markov model
- The hidden Markov model
  - State transition model: $p(x_t \mid x_{t-1})$
  - Observation model: $p(y_t \mid x_t)$
- Goal of filtering: approximate $p(x_{0:t} \mid y_{1:t})$ and $p(x_t \mid y_{1:t})$

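Sequential Importance Sampling (particle filtering)
- Target density: $p(x_{0:t} \mid y_{1:t})$
- Importance density: $q(x_{0:t} \mid y_{1:t}) = q(x_{0:t-1} \mid y_{1:t-1}) q(x_t \mid x_{0:t-1}, y_{1:t})$ (leave the past unchanged)
- How to sample from $q(x_{0:t} \mid y_{1:t})$?
  - At time $t-1$ we have: particles $\{x_{0:t-1}^{(i)}\}_{i=1}^{N}$ with weights $\{w_{t-1}^{(i)}\}_{i=1}^{N}$
  - Sample for $i = 1, \ldots, N$: $x_t^{(i)} \sim q(x_t \mid x_{0:t-1}^{(i)}, y_{1:t})$
  - Importance weights: $w_t^{(i)} \propto w_{t-1}^{(i)} \frac{p(y_t \mid x_t^{(i)}) p(x_t^{(i)} \mid x_{t-1}^{(i)})}{q(x_t^{(i)} \mid x_{0:t-1}^{(i)}, y_{1:t})}$
- Augment $x_{0:t-1}^{(i)}$ with $x_t^{(i)}$ without changing the past (filtering)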

Particle degeneracy - How to fix it
Theorem: The unconditional variance of the weights (with the observations $y_{1:t}$ interpreted as r.v.'s) increases with time.
Proof: The weight sequence is a martingale random process.
- Martingale definition: $E[w_t \mid w_{t-1}, \ldots, w_0] = w_{t-1}$
- Variance of a martingale is always non-decreasing (Rao-Blackwell)
Fixes:
- Theoretical fix: sample from the optimal proposal $q(x_t \mid x_{0:t-1}, y_{1:t}) = p(x_t \mid x_{t-1}, y_t)$
- Practical fix: resample particles after each iteration
A. Kong, J. S. Liu, and W. H. Wong, "Sequential imputations and Bayesian missing data problems," J. of the American Statistical Association, pp. 278-288, March 1994.

The particle filter with resampling
- Many available methods for selection (resampling)
- Simplest is to "clone" particle $x_t^{(i)}$ w.p. equal to its normalized weight $\tilde{w}_t^{(i)}$
- Particles that are not cloned are "killed"
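The simplest selection scheme, drawing N particles with replacement with probabilities given by the weights (often called multinomial resampling), can be sketched with the standard library. The function name and the toy particles are hypothetical.

```python
import random


def multinomial_resample(particles, weights, rng):
    """Clone particle i with probability proportional to weights[i].

    Returns N particles drawn with replacement and their new, equal weights.
    """
    n = len(particles)
    resampled = rng.choices(particles, weights=weights, k=n)
    return resampled, [1.0 / n] * n


rng = random.Random(8)
particles = ["a", "b", "c", "d"]
weights = [0.7, 0.3, 0.0, 0.0]  # "c" and "d" have zero weight and are killed
new_particles, new_weights = multinomial_resample(particles, weights, rng)
print(new_particles, new_weights)
```

After resampling all particles carry equal weight, so the information previously held in the weights is now carried by how many copies of each particle survive.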

The bootstrap particle filter
- Simple, non-adaptive proposal distribution: $q(x_t \mid x_{0:t-1}, y_{1:t}) = p(x_t \mid x_{t-1})$, so that $w_t \propto w_{t-1} p(y_t \mid x_t)$
- Convenient for non-linear models with additive Gaussian noise
  - Transition prob. and likelihood are both Gaussian (easy to sample)
- Simple to implement; modular structure; amenable to parallelization
- Resampling is very critical!
  - Ensures that the particles "follow" the target
A. Doucet, N. de Freitas and N. Gordon, "Sequential Monte Carlo Methods in Practice," Springer, 2001.
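A compact sketch of a bootstrap particle filter: propagate with the transition prior, weight by the likelihood, then resample. The scalar linear-Gaussian model used here (x_t = a x_{t-1} + v_t, y_t = x_t + e_t) and all of its parameter values are assumptions chosen to keep the example short; they are not the tracking model of the slides.

```python
import math
import random


def simulate(n_steps, rng, a=0.9, sigma_v=1.0, sigma_e=0.5):
    """Generate hidden states and noisy observations from the assumed model."""
    x, xs, ys = rng.gauss(0.0, 1.0), [], []
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, sigma_v)
        xs.append(x)
        ys.append(x + rng.gauss(0.0, sigma_e))
    return xs, ys


def bootstrap_pf(ys, n_particles, rng, a=0.9, sigma_v=1.0, sigma_e=0.5):
    """Return the filtered posterior-mean estimate of x_t for every observation."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    means = []
    for y in ys:
        # Sampling step: propagate each particle through the transition prior.
        particles = [a * x + rng.gauss(0.0, sigma_v) for x in particles]
        # Weights: Gaussian likelihood of y given each particle, then normalize.
        weights = [math.exp(-0.5 * ((y - x) / sigma_e) ** 2) for x in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        means.append(sum(w * x for w, x in zip(weights, particles)))
        # Selection step: multinomial resampling, which resets the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means


rng = random.Random(10)
xs, ys = simulate(100, rng)
means = bootstrap_pf(ys, 500, rng)
rmse = math.sqrt(sum((m - x) ** 2 for m, x in zip(means, xs)) / len(xs))
print(rmse)
```

Since the particles are resampled at every step, the previous weights are all equal and drop out of the update, leaving only the likelihood term. The unnormalized weights can underflow to zero when the observation lies far from every particle; working with log-weights is a common safeguard that this sketch omits.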

Example: target tracking
- State: position and constant velocity
- Speed corrections (Gaussian noise with cov. Q)

Distance and bearing measurements
- Uncorrelated Gaussian noise

Tracking
- Bootstrap PF with $N$ particles:
  - Sampling step (propagation of particles)
  - Evaluation of weights (likelihood of particles)
  - Randomized resampling w.p. proportional to the weights

Result

Conclusions
- MCMC and IS: powerful, all-around tools for Bayesian inference
- Applicable to any problem if tuned properly
  - Proposal distributions
  - Resampling schemes (in PF)
- Other MCMC derivatives
  - MCMC expectation-maximization algorithms
  - Hybrid MC
  - Slice sampler
  - Reversible jump MCMC for model selection
