Sampling from log-concave density

Alain Durmus, Eric Moulines, Marcelo Pereyra
Telecom ParisTech, Ecole Polytechnique, Bristol University

Séminaire des jeunes probabilistes et statisticiens, 2016
1 Motivation
2 Framework
3 Sampling from strongly log-concave density
4 Sampling from log-concave density
5 Non-smooth potentials
6 Numerical illustrations
7 Conclusion
Introduction

Sampling distributions over high-dimensional state spaces has recently attracted a lot of research effort in the computational statistics and machine learning communities.

Applications (non-exhaustive):
1 Bayesian inference for high-dimensional models and Bayesian nonparametrics
2 Bayesian linear inverse problems (typically function-space problems converted to high-dimensional problems by a Galerkin method)
3 Aggregation of estimators and experts

Most of the sampling techniques known so far do not scale to high dimension, and the challenges in this area are numerous.
Bayesian setting (I)

In a Bayesian setting, a parameter β ∈ R^d is endowed with a prior distribution ξ, and the observations are given by a probabilistic model:

Y ∼ ℓ(· | β).

The inference is then based on the posterior distribution:

π(dβ | Y) = ξ(dβ) ℓ(Y | β) / ∫ ℓ(Y | u) ξ(du).

In most cases the normalizing constant is not tractable:

π(dβ | Y) ∝ ξ(dβ) ℓ(Y | β).
Bayesian setting (II)

Bayesian decision theory relies on computing expectations of the form

∫_{R^d} f(β) π(dβ | Y) ∝ ∫_{R^d} f(β) ℓ(Y | β) ξ(dβ).

Generic problem: estimation of an expectation E_π[f], where
- π is known only up to a multiplicative factor;
- we do not know how to sample from π (no basic Monte Carlo estimator).
Examples: Logistic and probit regression

Likelihood: binary regression set-up in which the binary observations (responses) (Y_1, ..., Y_n) are conditionally independent Bernoulli random variables with success probability F(β^T X_i), where
1 X_i is a d-dimensional vector of known covariates,
2 β is a d-dimensional vector of unknown regression coefficients,
3 F is a distribution function.

Two important special cases:
1 probit regression: F is the standard normal distribution function,
2 logistic regression: F is the standard logistic distribution function, F(t) = e^t / (1 + e^t).
Examples: Logistic and probit regression

The posterior density of β is given by Bayes' rule, up to a proportionality constant, by π(β | (Y, X)) ∝ exp(−U(β)), where the potential U(β) is given by

U(β) = − Σ_{i=1}^n { Y_i log F(β^T X_i) + (1 − Y_i) log(1 − F(β^T X_i)) } + g(β),

where g is minus the log-density of the prior distribution. Two important cases:
- Gaussian prior: g(β) = (1/2) β^T Σ β, ridge regression.
- Laplace prior: g(β) = λ Σ_{k=1}^d |β_k|, lasso regression.
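To make the potential concrete, here is a minimal NumPy sketch of U and its gradient for the logistic link with a Gaussian (ridge) prior. The function names, the regularisation parameter `lam`, and the numerical safeguards are illustrative assumptions, not part of the slides.

```python
import numpy as np

def sigmoid(t):
    # Logistic link F(t) = e^t / (1 + e^t); clip to avoid overflow in exp
    t = np.clip(t, -500, 500)
    return 1.0 / (1.0 + np.exp(-t))

def potential_U(beta, X, Y, lam=1.0):
    """Negative log-posterior (up to an additive constant) for logistic
    regression with ridge penalty g(beta) = (lam/2) ||beta||^2."""
    p = sigmoid(X @ beta)
    eps = 1e-12  # guard against log(0)
    loglik = np.sum(Y * np.log(p + eps) + (1 - Y) * np.log(1 - p + eps))
    return -loglik + 0.5 * lam * beta @ beta

def grad_U(beta, X, Y, lam=1.0):
    # Gradient of U: X^T (F(X beta) - Y) + lam * beta
    return X.T @ (sigmoid(X @ beta) - Y) + lam * beta
```

The gradient is what the Langevin-type algorithms discussed later actually consume; note it is available in closed form even though the normalizing constant of the posterior is not.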
New challenges

Problem: the number of predictor variables d is large (10^4 and up).

Examples:
- text categorization,
- genomics and proteomics (gene expression analysis),
- other data mining tasks (recommendations, longitudinal clinical trials, ...).
Data Augmentation

The most popular algorithms for Bayesian inference in ridge binary regression models are based on data augmentation:
1 probit link: Albert and Chib (1993);
2 logistic link: Pólya-Gamma sampler, Polson and Scott (2012).

Bayesian lexicon:
- Data augmentation: instead of sampling π(β | (Y, X)), sample π(β, W | (Y, X)) and marginalize W.
- Typical application of the Gibbs sampler: sample in turn π(β | W, Y, X) and π(W | β, X, Y).
- The choice of the DA should make these two steps reasonably easy.
Data Augmentation algorithms

These two algorithms have been shown to be uniformly geometrically ergodic, BUT the constants depend strongly on the dimension. The algorithms are very demanding in terms of computational resources:
- applicable only when d is small (≈ 10) to moderate (≈ 100), but certainly not when d is large (10^4 or more);
- convergence time prohibitive as soon as d ≥ 10^2.
A daunting problem?

In the case of ridge regression, the potential β ↦ U(β) is smooth and strongly convex.
In the case of lasso regression, the potential β ↦ U(β) is non-smooth but still convex.
A wealth of reasonably fast optimisation algorithms is available to solve these problems in high dimension.
Framework

Denote by π a target density w.r.t. the Lebesgue measure on R^d, known up to a normalisation factor:

x ↦ e^{−U(x)} / ∫_{R^d} e^{−U(y)} dy.

Implicitly, d ≫ 1.

Assumption: U is L-smooth, i.e. continuously differentiable and there exists a constant L such that for all x, y ∈ R^d,

‖∇U(x) − ∇U(y)‖ ≤ L ‖x − y‖.
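As a sanity check on this definition: for a quadratic potential U(x) = x^T A x / 2 with A symmetric positive definite, ∇U(x) = A x, and the smallest valid constant L is the largest eigenvalue of A. A small numerical sketch (the matrix and test points are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
A = M @ M.T + np.eye(4)          # symmetric positive definite
L = np.linalg.eigvalsh(A).max()  # smoothness constant of U(x) = x^T A x / 2

grad_U = lambda x: A @ x

# Empirically verify ||grad U(x) - grad U(y)|| <= L ||x - y|| on random pairs
for _ in range(100):
    x, y = rng.normal(size=4), rng.normal(size=4)
    assert np.linalg.norm(grad_U(x) - grad_U(y)) <= L * np.linalg.norm(x - y) + 1e-9
```

For the ridge-logistic potential of the previous section, a similar bound holds with L controlled by the largest eigenvalue of X^T X (up to the curvature of the link) plus the ridge parameter.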
Langevin diffusion

Langevin SDE:

dY_t = −∇U(Y_t) dt + √2 dB_t,

where (B_t)_{t≥0} is a d-dimensional Brownian motion. Denote by (P_t)_{t≥0} the semigroup of the diffusion, P_t(x, A) = P_x(Y_t ∈ A).

(P_t)_{t≥0} is
- aperiodic and strong Feller (all compact sets are small),
- reversible w.r.t. π ∝ e^{−U}, which is therefore its unique invariant probability measure.

For all x ∈ R^d and all bounded measurable functions f : R^d → R,

lim_{t→+∞} P_t f(x) = lim_{t→+∞} E_x[f(Y_t)] = ∫_{R^d} f(y) π(dy).
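One standard way to turn this diffusion into a sampling scheme is its Euler–Maruyama discretization. A minimal sketch for a standard Gaussian target, where U(x) = ‖x‖²/2 and so ∇U(x) = x; the step size, chain length, and burn-in are illustrative choices, not values from the talk:

```python
import numpy as np

def euler_langevin(grad_U, x0, step, n_steps, rng):
    """Euler-Maruyama discretization of dY_t = -grad U(Y_t) dt + sqrt(2) dB_t:
    X_{k+1} = X_k - step * grad U(X_k) + sqrt(2 * step) * Z_k,  Z_k ~ N(0, I)."""
    x = np.array(x0, dtype=float)
    path = np.empty((n_steps, x.size))
    for k in range(n_steps):
        x = x - step * grad_U(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.size)
        path[k] = x
    return path

# Standard Gaussian target on R^2: U(x) = ||x||^2 / 2, grad U(x) = x
rng = np.random.default_rng(1)
path = euler_langevin(lambda x: x, np.zeros(2), step=0.05, n_steps=100_000, rng=rng)
samples = path[5_000:]  # discard burn-in
```

Because of the fixed step size, the chain targets a slightly biased version of π (here the empirical variance is close to, but not exactly, 1); quantifying that bias in high dimension is precisely what the following sections address.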