Scaling Analysis of MCMC Algorithms



  1. The Scaling Analysis Method / High Dimensional MCMC / Concentration near a manifold. Scaling Analysis of MCMC algorithms. Alexandre Thiéry, University of Warwick. MCQMC, February 2012. Collaboration with Andrew Stuart (Warwick), Gareth Roberts (Warwick), Natesh Pillai (Harvard) and Alex Beskos (UCL). Funded by CRISM.

  2. Outline: (1) The Scaling Analysis Method, (2) High Dimensional MCMC, (3) Concentration near a manifold.

  3. Outline: (1) The Scaling Analysis Method, (2) High Dimensional MCMC, (3) Concentration near a manifold.

  4. Purposes. Analysis of asymptotic complexity [Roberts and co-workers, 1997]. Avoids spectral gaps, log-Sobolev inequalities, etc. Provides more intuition on the behaviour of algorithms. Easy-to-follow guidelines for tuning MCMC algorithms.

  5. Sequence of MCMC algorithms. A sequence of target distributions $\pi^\alpha$ indexed by a parameter $\alpha$. A sequence of MCMC proposals, (almost always) local proposals of the form $x^\star = a(\alpha)\,x + \sigma(\alpha)\,Z$. A sequence of MCMC chains indexed by the parameter $\alpha$: $x^\alpha = (x_{1,\alpha}, x_{2,\alpha}, x_{3,\alpha}, \ldots)$. We are interested in the limit $\alpha \to \alpha_\infty$.

  6. Example of Limiting Regime. First example: $\alpha$ = dimension of the state space, with interest in $\alpha \to \alpha_\infty = \infty$. Second example: consider a target distribution with density of the form $\pi^\alpha(x) \propto \exp\big(-\Psi(x)/\alpha\big)$, with interest in $\alpha \to \alpha_\infty = 0$.

  7. Interpolation. Choose a time discretisation parameter $\delta = \delta(\alpha)$ such that $\delta \to 0$ as $\alpha \to \alpha_\infty$. Define the accelerated process $z^\alpha$ by $z^\alpha(t) = x_{t/\delta(\alpha),\,\alpha}$.
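
The accelerated process can be sketched in a few lines. This is only an illustration: the slide defines $z^\alpha$ at the grid times $t = k\delta$, and the piecewise-linear interpolation between grid points is an assumption made here so that paths are continuous. The name `accelerated_process` and the zero-based indexing of the chain are hypothetical choices.

```python
import math

def accelerated_process(chain, delta):
    """Return a function z with z(k * delta) = chain[k] for t >= 0.

    Between grid points the value is linearly interpolated (an assumption,
    chosen so that the path is continuous). Past the end of the chain the
    last state is returned.
    """
    last = len(chain) - 1

    def z(t):
        s = t / delta                      # position on the chain's index scale
        k = min(int(math.floor(s)), last)  # index of the grid point to the left
        if k == last:
            return chain[last]
        w = s - k                          # weight of the right-hand neighbour
        return (1.0 - w) * chain[k] + w * chain[k + 1]

    return z

# A short deterministic "chain", viewed on the accelerated time scale.
chain = [0.0, 1.0, 4.0, 9.0]
z = accelerated_process(chain, delta=0.5)
print(z(0.5), z(0.75), z(1.5))
```

One unit of time for $z^\alpha$ corresponds to $1/\delta$ steps of the chain, which is what makes $\delta(\alpha)^{-1}$ the natural measure of cost.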

  8. Scaling Limit

  9. Scaling Limit. A scaling limit result is a theorem of the form: Theorem (Scaling Limit). $\lim_{\alpha \to \alpha_\infty} z^\alpha = z$. The convergence is on the path space $C([0,T], H)$. The limiting process is typically a non-trivial diffusion, jump or Lévy process.

  10. Interpretation. The limiting process $z(t)$ takes time $T_{\mathrm{mix}}$ to mix. Using the approximation $x_{k,\alpha} = z^\alpha(k\delta) \approx z(k\delta)$, it follows that the chain $x^\alpha$ takes roughly $k \approx T_{\mathrm{mix}}/\delta(\alpha)$ steps to mix. Consequently, as $\alpha \to \alpha_\infty$ the complexity of the MCMC algorithm grows as $\delta(\alpha)^{-1}$.
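
The $\delta^{-1}$ cost can be checked on a toy chain that is not taken from the slides: the stationary AR(1) recursion $x_{k+1} = (1-\delta)x_k + \sqrt{2\delta}\,\xi_k$, whose lag-$k$ autocorrelation is exactly $(1-\delta)^k$ and whose continuous-time limit is an Ornstein-Uhlenbeck process. The sketch below counts how many steps are needed for the autocorrelation to fall below $e^{-1}$; the function name and the threshold are illustrative assumptions.

```python
import math

def steps_to_decorrelate(delta, threshold=math.exp(-1.0)):
    """Smallest k such that (1 - delta)**k <= threshold.

    (1 - delta)**k is the lag-k autocorrelation of the stationary AR(1)
    chain x_{k+1} = (1 - delta) x_k + sqrt(2 delta) xi_k.
    """
    return math.ceil(math.log(threshold) / math.log(1.0 - delta))

for delta in (0.1, 0.01, 0.001):
    k = steps_to_decorrelate(delta)
    # k * delta stays close to 1: the number of steps scales like 1 / delta.
    print(delta, k, k * delta)
```

The product `k * delta` plays the role of $T_{\mathrm{mix}}$ here: it stays of order one while the step count grows like $\delta^{-1}$.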

  11. Mixing

  12. Outline: (1) The Scaling Analysis Method, (2) High Dimensional MCMC, (3) Concentration near a manifold.

  13. Some motivations: Bayesian Inverse Problems. Consider an infinite dimensional Hilbert space $H$. Reconstruction of unknown data $x \in H$ from a noisy observation $y = F(x) + \text{(noise)}$. Suppose that the noise is Gaussian and put a Gaussian prior $\pi_0 = N(0, C)$ on the data $x$ to be estimated. The posterior probability distribution $\pi$ (living on $H$) is given by $\frac{d\pi}{d\pi_0}(x) \propto e^{-\Phi(x)}$ where $\Phi(x) = \frac{1}{2}\|F(x) - y\|_\Gamma^2$, that is, $\frac{d\pi}{d\pi_0}(x) \propto \exp\big(-\frac{1}{2}\|F(x) - y\|_\Gamma^2\big)$, with $\|\cdot\|_\Gamma$ the norm weighted by the noise covariance $\Gamma$.

  14. Temperature Field Reconstruction

  15. Some motivations: Conditioned Diffusions. Consider a diffusion with constant volatility coefficient (see Lamperti) on the interval $I = [0, T]$: $dX = -\nabla U(X)\,dt + \sigma\,dW$ with $X_0 = x^-$, $X_T = x^+$. Call $\pi$ the law of $(X_t)_{t \in I} \in H = L^2(I)$. The law of the diffusion $X$ is absolutely continuous (Girsanov) with respect to the Wiener bridge measure $\pi_0$, the law of $dY = \sigma\,dW$ with $Y_0 = x^-$, $Y_T = x^+$. One can explicitly write down (without stochastic integral) the change of probability $\frac{d\pi}{d\pi_0}(x) \propto e^{-\Phi(x)}$.

  16. Conditioned Diffusion

  17. Finite Dimensional Discretisation. Let $\varphi_1, \varphi_2, \ldots, \varphi_k, \ldots$ be the eigenfunctions of the covariance operator $C$. Let $P^N(\cdot)$ denote the orthogonal projection, in $H$, onto $\mathrm{span}(\varphi_1, \ldots, \varphi_N)$, and let $\pi_0^N$ be the law of $P^N(\xi)$ for $\xi \overset{D}{\sim} \pi_0$. The finite dimensional (but living on $H$) posterior $\pi^N$ is given by $\frac{d\pi^N}{d\pi_0^N}(x) \propto e^{-\Phi(P^N x)}$. One can implement all the algorithms in $\mathbb{R}^N$ but analyse them in $H$. Other (more natural) discretisations are possible.
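
A minimal sketch of this discretisation, working with coefficients in the eigenbasis of $C$. The eigenvalue decay $\lambda_j = j^{-2}$, the potential `phi`, and all function names are hypothetical choices made for illustration; the slides do not specify them.

```python
import math
import random

def prior_eigenvalues(n, decay=2.0):
    """Hypothetical spectrum lambda_j = j**(-decay) of the covariance operator C."""
    return [j ** (-decay) for j in range(1, n + 1)]

def sample_projected_prior(n, rng, decay=2.0):
    """Coefficients of P^N(xi), xi ~ N(0, C), in the eigenbasis phi_1, ..., phi_N."""
    return [math.sqrt(lam) * rng.gauss(0.0, 1.0) for lam in prior_eigenvalues(n, decay)]

def project(coeffs, n):
    """Orthogonal projection onto span(phi_1, ..., phi_n): zero the later coefficients."""
    return [c if j < n else 0.0 for j, c in enumerate(coeffs)]

def projected_potential(phi, coeffs, n):
    """Evaluate Phi(P^N x), the potential that defines the discretised posterior."""
    return phi(project(coeffs, n))

rng = random.Random(0)
x = sample_projected_prior(8, rng)
phi = lambda c: 0.5 * sum(v * v for v in c)  # hypothetical potential, for illustration
print(projected_potential(phi, x, 4))
```

Storing only the first $N$ coefficients is what "implement in $\mathbb{R}^N$" amounts to in this basis, while the zero padding in `project` reflects that the object still lives in $H$.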

  18. Random Walk Metropolis (RWM) algorithm. RWM for target distribution $\pi^\alpha = \pi^N$: $x^\star = x + \sqrt{\delta(N)}\,P^N(\xi)$ with $\xi \overset{D}{\sim} \pi_0 = N(0, C)$. This is a discretisation of Brownian motion with covariance $P^N(C)$ between $t$ and $t + \delta(N)$. Diffusion Limit: take $\delta(N) = N^{-p}$ for any $p \geq 1$. The limit $\lim_{N \to \infty} z^N = z$ exists and is a non-trivial ergodic diffusion process.

  19. Scaling Analysis of RWM (J. Mattingly, N. Pillai and A.M. Stuart, 2011). Theorem: Consider RWM with increment $\delta(N) \approx N^{-1}$. The limit $z^N \Rightarrow z$ holds weakly in $C([0,T], H^s)$. The limit process $z$ is an $H$-valued Langevin diffusion that is reversible with respect to $\pi$. For $\delta(N) \propto N^{-1}$, the limiting acceptance probability satisfies $0 < p < 1$. For $\delta(N) \propto N^{-(1+\varepsilon)}$, the limiting acceptance probability is $p = 1$. For $\delta(N) \propto N^{-(1-\varepsilon)}$, the acceptance probability is exponentially small. The complexity of RWM grows as $O(N)$ as $N \to \infty$.
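
The three acceptance regimes can be observed numerically in a simplified setting. The sketch below assumes $\Phi \equiv 0$ and works in whitened coordinates $y_j = x_j/\sqrt{\lambda_j}$, so that the target is $N(0, I_N)$ and the proposal is $y^\star = y + \sqrt{\delta}\,\eta$ with $\eta \sim N(0, I_N)$. It estimates the mean acceptance probability at stationarity by Monte Carlo; it is not the setting of the theorem, and the function name is hypothetical.

```python
import math
import random

def rwm_mean_acceptance(n, delta, n_samples, rng):
    """Estimate the stationary mean acceptance probability of RWM.

    Target: N(0, I_n) (whitened coordinates, Phi = 0 assumed).
    Proposal: y* = y + sqrt(delta) * eta with eta ~ N(0, I_n).
    """
    step = math.sqrt(delta)
    total = 0.0
    for _ in range(n_samples):
        log_ratio = 0.0
        for _ in range(n):
            y = rng.gauss(0.0, 1.0)                 # current state, drawn from the target
            y_new = y + step * rng.gauss(0.0, 1.0)  # random walk proposal
            log_ratio += 0.5 * (y * y - y_new * y_new)
        total += math.exp(min(0.0, log_ratio))      # min(1, pi(y*) / pi(y))
    return total / n_samples

rng = random.Random(0)
for n in (50, 200):
    # delta = N**(-p) for p below, at, and above the critical exponent p = 1
    rates = [rwm_mean_acceptance(n, n ** (-p), 400, rng) for p in (0.5, 1.0, 1.5)]
    print(n, [round(r, 3) for r in rates])
```

As $N$ grows, the estimate for $p = 1$ should stabilise strictly between 0 and 1, while the estimates for $p < 1$ and $p > 1$ should drift towards 0 and 1 respectively.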

  20. Langevin Diffusion. Probability distribution with density $\pi(x) \propto e^{-L(x)}$. The diffusion $dx = -\nabla L(x)\,dt + \sqrt{2}\,dW$ is $\pi$-reversible. The preconditioned diffusion $dx = -M\nabla L(x)\,dt + \sqrt{2M}\,dW$ is also $\pi$-reversible. Case $\frac{d\pi}{d\pi_0}(x) \propto e^{-\Phi(x)}$ with $\pi_0 = N(0, C)$: informally $\pi(x) \propto e^{-\frac{1}{2}\langle x, C^{-1}x\rangle - \Phi(x)}$. Because $\nabla\big(\frac{1}{2}\langle x, C^{-1}x\rangle + \Phi(x)\big) = C^{-1}x + \nabla\Phi(x)$, the choice $M = C$ shows that $dx = -(x + C\nabla\Phi(x))\,dt + \sqrt{2C}\,dW = \mathrm{drift}(x)\,dt + \sqrt{2C}\,dW$ is $\pi$-reversible. Notice the diffusion term.

  21. MALA algorithm. MALA for target distribution $\pi^\alpha = \pi^N$: $x^\star = x + \mathrm{drift}(x)\,\delta(N) + \sqrt{2C\,\delta(N)}\,P^N(\xi)$, where $\mathrm{drift}(x) = -(x + C\nabla\Phi(x))$. This is the Euler discretisation of the Langevin diffusion between $t$ and $t + \delta(N)$. Diffusion Limit: take $\delta(N) = N^{-p}$ for any $p \geq \frac{1}{3}$. The limit $\lim_{N \to \infty} z^N = z$ exists and is a non-trivial ergodic diffusion process.

  22. Scaling Analysis of MALA (N. Pillai, A.M. Stuart and A.T., 2011). Theorem: Consider MALA with increment $\delta(N) \approx N^{-1/3}$. The limit $z^N \Rightarrow z$ holds weakly in $C([0,T], H^s)$. The limit process $z$ is an $H$-valued Langevin diffusion that is reversible with respect to $\pi$. For $\delta(N) \propto N^{-1/3}$, the limiting acceptance probability satisfies $0 < p < 1$. For $\delta(N) \propto N^{-(1/3+\varepsilon)}$, the limiting acceptance probability is $p = 1$. For $\delta(N) \propto N^{-(1/3-\varepsilon)}$, the acceptance probability is exponentially small. The complexity of MALA grows as $O(N^{1/3})$ as $N \to \infty$.
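
The $N^{-1/3}$ threshold can be probed in a simplified setting. The sketch below assumes $\Phi \equiv 0$ and whitened coordinates, so the target is $N(0, I_N)$ and the MALA proposal reduces to $x^\star = (1-\delta)x + \sqrt{2\delta}\,\xi$ with $\xi \sim N(0, I_N)$. It estimates the stationary mean acceptance probability from the full Metropolis-Hastings ratio. The function name is hypothetical and the experiment is only an illustration, not the setting of the theorem.

```python
import math
import random

def mala_mean_acceptance(n, delta, n_samples, rng):
    """Estimate the stationary mean acceptance probability of MALA.

    Target: N(0, I_n) (whitened coordinates, Phi = 0 assumed).
    Proposal: y = (1 - delta) x + sqrt(2 delta) xi, i.e. q(x, .) = N((1 - delta) x, 2 delta I).
    """
    shrink = 1.0 - delta
    step = math.sqrt(2.0 * delta)
    total = 0.0
    for _ in range(n_samples):
        log_ratio = 0.0
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)                  # current state, drawn from the target
            y = shrink * x + step * rng.gauss(0.0, 1.0)
            log_target = 0.5 * (x * x - y * y)       # log pi(y) - log pi(x)
            # log q(y, x) - log q(x, y) for the Gaussian proposal kernel
            log_proposal = ((y - shrink * x) ** 2 - (x - shrink * y) ** 2) / (4.0 * delta)
            log_ratio += log_target + log_proposal
        total += math.exp(min(0.0, log_ratio))       # min(1, MH ratio)
    return total / n_samples

rng = random.Random(0)
n = 512
for p in (1.0 / 6.0, 1.0 / 3.0, 1.0):
    print(round(p, 3), round(mala_mean_acceptance(n, n ** (-p), 300, rng), 3))
```

For this Gaussian target the log ratio simplifies to $-\frac{\delta}{4}\big(|x^\star|^2 - |x|^2\big)$, which has mean $-N\delta^3/4$ at stationarity, so $\delta \propto N^{-1/3}$ is exactly the scaling at which it stays of order one.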

  23. What is going wrong? Consider RWM and MALA for Gaussian targets $\pi = \pi_0 = N(0, C)$ and $\Phi \equiv 0$. (RWM) $x^\star = x + \sqrt{\delta}\,\xi$ with $\xi \overset{D}{\sim} N(0, C)$. (MALA) $x^\star = (1-\delta)\,x + \sqrt{2\delta}\,\xi$ with $\xi \overset{D}{\sim} N(0, C)$. Consequently, if $x \overset{D}{\sim} \pi = N(0, C)$ we have (RWM) $x^\star \overset{D}{\sim} N(0, (1+\delta)\,C)$ and (MALA) $x^\star \overset{D}{\sim} N(0, (1+\delta^2)\,C)$. In the infinite dimensional setting, the Gaussian measures $N(0, C)$ and $N(0, (1+\varepsilon)\,C)$ are singular. RWM and MALA are NOT well-defined on $H$.
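
The variance inflation, and why it leads to singularity, can be sketched as follows. The first two functions just evaluate the covariance factor of each proposal. The third computes the statistic $\frac{1}{N}\sum_{j \le N} x_j^2/\lambda_j$ in the eigenbasis of $C$: by the law of large numbers it tends to 1 under $N(0, C)$ and to $1+\varepsilon$ under $N(0, (1+\varepsilon)C)$, so in infinite dimensions the two measures concentrate on disjoint sets. All function names are hypothetical.

```python
import math
import random

def rwm_variance_factor(delta):
    """c such that x* = x + sqrt(delta) xi has law N(0, c C) for independent x, xi ~ N(0, C)."""
    return 1.0 + delta

def mala_variance_factor(delta):
    """c such that x* = (1 - delta) x + sqrt(2 delta) xi has law N(0, c C)."""
    return (1.0 - delta) ** 2 + 2.0 * delta   # equals 1 + delta**2

def normalised_energy(n, factor, rng):
    """(1/n) * sum_j x_j**2 / lambda_j for x ~ N(0, factor * C).

    In the eigenbasis of C, x_j = sqrt(factor * lambda_j) * g_j with g_j
    standard normal, so each term equals factor * g_j**2 regardless of lambda_j.
    """
    return sum(factor * rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) / n

rng = random.Random(0)
delta = 0.5
for n in (100, 10000):
    print(n,
          round(normalised_energy(n, 1.0, rng), 3),
          round(normalised_energy(n, rwm_variance_factor(delta), rng), 3))
```

With increasing `n` the two columns separate ever more sharply, which is the finite-dimensional shadow of the singularity stated on the slide.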

  24. A Robust Algorithm. Target density $\frac{d\pi}{d\pi_0}(x) \propto e^{-\Phi(x)}$. The 'right' proposal (called pCN, for preconditioned Crank-Nicolson) should be $x^\star = \sqrt{1-\delta}\,x + \sqrt{\delta}\,P^N(\xi)$. It is well-defined on $H$ and preserves $\pi_0 = N(0, C)$. Theorem (N. Pillai, A.M. Stuart and A.T., 2011): Under growth conditions on the potential $\Phi$, the pCN algorithm is robust. For any fixed parameter $\delta > 0$ the average acceptance probability stays bounded away from 0. Summary: RWM complexity grows as $O(N)$, MALA complexity grows as $O(N^{1/3})$, pCN complexity is $O(1)$.
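
A minimal pCN sketch in the eigenbasis of $C$. Because the proposal preserves $\pi_0$, the Metropolis-Hastings acceptance probability reduces to $\min\big(1, e^{\Phi(x) - \Phi(x^\star)}\big)$. The eigenvalue decay $\lambda_j = j^{-2}$, the potential `phi`, and the function name `pcn_chain` are hypothetical choices for illustration; the slides do not specify them.

```python
import math
import random

def pcn_chain(phi, n, delta, n_steps, rng, decay=2.0):
    """Run pCN on the first n eigen-coefficients; return (final state, acceptance rate).

    Proposal: x* = sqrt(1 - delta) x + sqrt(delta) xi with xi ~ N(0, C),
    where C has the hypothetical eigenvalues lambda_j = j**(-decay).
    """
    sqrt_lam = [j ** (-decay / 2.0) for j in range(1, n + 1)]
    x = [s * rng.gauss(0.0, 1.0) for s in sqrt_lam]   # start from a prior draw
    a, b = math.sqrt(1.0 - delta), math.sqrt(delta)
    accepted = 0
    for _ in range(n_steps):
        y = [a * xj + b * s * rng.gauss(0.0, 1.0) for xj, s in zip(x, sqrt_lam)]
        log_alpha = phi(x) - phi(y)                   # the prior terms cancel exactly
        if log_alpha >= 0.0 or rng.random() < math.exp(log_alpha):
            x = y
            accepted += 1
    return x, accepted / n_steps

# Hypothetical potential that only depends on the first coefficient.
phi = lambda x: 0.5 * (x[0] - 1.0) ** 2

rng = random.Random(0)
for n in (10, 100, 1000):
    _, rate = pcn_chain(phi, n, delta=0.2, n_steps=500, rng=rng)
    print(n, round(rate, 3))
```

With $\Phi \equiv 0$ every proposal is accepted in any dimension, since the log ratio is identically zero. That is the simplest instance of the dimension robustness stated in the theorem.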

  25. Optimal Proposal Design. Principle: Designing proposals which are well-defined on the infinite dimensional parameter space results in MCMC methods which do not suffer from the curse of dimensionality.

  26. Outline: (1) The Scaling Analysis Method, (2) High Dimensional MCMC, (3) Concentration near a manifold.
