Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees
Ruqi Zhang and Christopher De Sa, Cornell University
Scale Gibbs Sampling by Subsampling

Gibbs sampling is one of the most popular Markov chain Monte Carlo (MCMC) methods
+ Converges asymptotically to the desired distribution
+ Works very well in practice
– Prohibitive cost on large-scale datasets or models

Subsampling methods to scale MCMC
+ Reduce computational cost significantly
– No guarantees on accuracy or efficiency

We show how to scale Gibbs sampling by subsampling, with guarantees on accuracy, convergence rate, and computational efficiency.
Inference on Graphical Models

Consider factor graphs

π(x_{1:n}) = (1/Z) · exp( ∑_{φ∈Φ} φ(x_{1:n}) )

Sample from π by Gibbs sampling:

Loop
  Select a variable x_i to sample at random
  Compute the conditional distribution of x_i based on all factors φ that depend on x_i
  Resample variable x_i from the conditional distribution
End Loop

Very expensive when the factor set is large! Can we subsample factors to compute conditional distributions?
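A minimal sketch of this loop body for discrete variables follows. The data layout (factors as callables on the full state, adjacency[i] listing the indices of factors that depend on x_i) and all names are our own illustration, not the paper's implementation.

```python
import numpy as np

def gibbs_step(x, domains, factors, adjacency, rng):
    """One plain Gibbs update: resample one randomly chosen variable
    from its exact conditional, touching every factor adjacent to it."""
    i = rng.integers(len(x))                        # select a variable at random
    energies = np.empty(len(domains[i]))
    for k, v in enumerate(domains[i]):              # score each candidate value
        x[i] = v
        # the conditional uses ALL factors that depend on x_i
        energies[k] = sum(factors[j](x) for j in adjacency[i])
    p = np.exp(energies - energies.max())           # normalize via softmax
    p /= p.sum()
    x[i] = domains[i][rng.choice(len(p), p=p)]      # resample x_i
    return x
```

Note the cost: every update evaluates all factors adjacent to x_i, once per candidate value, which is exactly the expense that subsampling aims to avoid.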
Previous Work

Scale MCMC with subsampling methods: [Welling and Teh, 2011], [Maclaurin and Adams, 2014], [Bardenet et al., 2017], ...

Christopher De Sa, Vincent Chen, and Wing Wong. Minibatch Gibbs Sampling on Large Graphical Models. ICML 2018.

Main idea:
• Use conditional distributions based on subsampled factors as proposal distributions
• Add a Metropolis-Hastings (M-H) step to correct the bias

Limitations:
• The Metropolis-Hastings step is expensive
• Only supports sampling from discrete distributions
Poisson-Minibatching

Introduce an auxiliary Poisson variable for each factor to control whether that factor is used (here M_φ is an upper bound on φ and L = ∑_{φ∈Φ} M_φ):

s_φ | x_{1:n} ∼ Poisson( λ M_φ / L + φ(x_{1:n}) )

The joint distribution:

π(x_{1:n}, s_{φ∈Φ}) ∝ exp( ∑_{φ∈Φ} [ s_φ log(1 + L φ(x_{1:n}) / (λ M_φ)) + s_φ log(λ M_φ / L) − log(s_φ!) ] )

A factor φ contributes to the energy only when s_φ > 0, so the algorithm computes conditional distributions using only a subset of factors.
• Expected number of factors being used ≪ the factor set size
• Stationary distribution of x_{1:n} does not change, even without the M-H step
• Sampling a set of Poisson variables is cheap
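A one-line expectation calculation supports the first bullet; this is our reading of the construction above (using ∑_{φ∈Φ} M_φ = L), not a formula from the slides:

```latex
\mathbb{E}\left[\sum_{\varphi \in \Phi} s_\varphi \,\middle|\, x_{1:n}\right]
  = \sum_{\varphi \in \Phi} \left( \frac{\lambda M_\varphi}{L} + \varphi(x_{1:n}) \right)
  = \lambda + \sum_{\varphi \in \Phi} \varphi(x_{1:n}).
```

So the expected number of nonzero s_φ is at most λ plus the total energy, regardless of how large the factor set Φ is.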
Algorithm of Poisson-Minibatching Gibbs Sampling (Poisson-Gibbs)

Loop
  Select a variable x_i to sample at random
  Resample s_φ from its conditional distribution given x_{1:n}
  Compute the conditional distribution based on the chosen factors φ such that s_φ > 0
  Resample variable x_i from the conditional distribution
End Loop

• Simple to implement
• No Metropolis-Hastings step
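Below is a hedged sketch of one Poisson-Gibbs step for a discrete variable, following the update order above. It assumes 0 ≤ factors[j](x) ≤ M[j] and L = sum(M); lam plays the role of λ. All names and the data layout are illustrative, matching the plain Gibbs sketch earlier.

```python
import numpy as np

def poisson_gibbs_step(x, domains, factors, adjacency, M, L, lam, rng):
    i = rng.integers(len(x))                      # select a variable at random
    # Resample the auxiliary Poisson variables for factors adjacent to x_i;
    # only factors with s_phi > 0 enter the minibatch.
    minibatch = []
    for j in adjacency[i]:
        s = rng.poisson(lam * M[j] / L + factors[j](x))
        if s > 0:
            minibatch.append((j, s))
    # Conditional of x_i given s: each active factor contributes
    # s_phi * log(1 + L * phi(x) / (lam * M_phi)) to the energy.
    energies = np.empty(len(domains[i]))
    for k, v in enumerate(domains[i]):
        x[i] = v
        energies[k] = sum(s * np.log(1.0 + L * factors[j](x) / (lam * M[j]))
                          for j, s in minibatch)
    p = np.exp(energies - energies.max())         # normalize via softmax
    p /= p.sum()
    x[i] = domains[i][rng.choice(len(p), p=p)]    # resample x_i
    return x
```

Unlike the minibatch M-H approach, there is no accept/reject step: the auxiliary variables are part of an extended state whose x_{1:n}-marginal is exactly π.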
Theoretical Guarantees on Convergence Rate

The convergence rate of our method can be slowed down by at most a constant factor compared to that of Gibbs sampling.
• We provide a recipe for setting the minibatch-size hyperparameter so that this constant is O(1)
Sample from Continuous Distributions

Difficulty: it is non-trivial to sample from continuous conditional distributions.

Our Solution: the Double Chebyshev Approximation method
• Get a polynomial approximation of the PDF by applying Chebyshev approximation twice
• Generate a sample by inverse transform sampling

Theoretical guarantees on accuracy and efficiency:
• Stationary distribution of x_{1:n} does not change
• The convergence rate of our method can be slowed down by at most a constant compared to that of Gibbs sampling
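A minimal sketch of the double Chebyshev idea for one continuous conditional on an interval [a, b]. Here log_f is the unnormalized log-density; the degrees, node counts, and error control are placeholders (the paper's construction chooses them to obtain its guarantees), and we use numpy's Chebyshev utilities.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def double_chebyshev_sample(log_f, a, b, deg, rng):
    # First Chebyshev approximation: fit the log-density at Chebyshev nodes.
    xs1 = C.chebpts1(deg + 1) * (b - a) / 2 + (a + b) / 2
    log_poly = C.Chebyshev.fit(xs1, [log_f(x) for x in xs1], deg, domain=[a, b])
    # Second Chebyshev approximation: fit exp(log-polynomial), i.e. the PDF.
    xs2 = C.chebpts1(2 * deg + 1) * (b - a) / 2 + (a + b) / 2
    pdf_poly = C.Chebyshev.fit(xs2, np.exp(log_poly(xs2)), 2 * deg, domain=[a, b])
    # Inverse transform sampling: integrate the polynomial PDF to get a
    # polynomial CDF, then invert it by bisection (valid while the fitted
    # PDF stays nonnegative, so the CDF is monotone).
    cdf = pdf_poly.integ()
    u = cdf(a) + rng.uniform() * (cdf(b) - cdf(a))
    lo, hi = a, b
    for _ in range(60):                # ~2^-60 interval width at the end
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if cdf(mid) < u else (lo, mid)
    return 0.5 * (lo + hi)
```

Fitting the log-density first keeps the approximation well-behaved even when the density has a large dynamic range; the second fit makes the PDF itself a polynomial, which can be integrated exactly and inverted to bisection precision.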
Summary

• Scaling MCMC methods while maintaining theoretical guarantees is hard
• We propose Poisson-minibatching Gibbs sampling, which solves this problem using the auxiliary variable method
• We provide theoretical guarantees on accuracy, convergence rate, and computational efficiency
• For more details, including experiments, come see our poster!

Thank you! Poster #158, 5:30–7:30 today