Latent Gaussian models: Main ideas (II)

Construct approximations to
1. π(θ | y)
2. π(x_i | θ, y)

then integrate

    π(x_i | y) = ∫ π(θ | y) π(x_i | θ, y) dθ
    π(θ_j | y) = ∫ π(θ | y) dθ_{−j}
Gaussian Markov random fields (GMRFs): definition

A Gaussian Markov random field (GMRF), x = (x_1, …, x_n)ᵀ, is a normally distributed random vector with additional Markov properties:

    x_i ⊥ x_j | x_{−ij}  ⟺  Q_ij = 0

where Q is the precision matrix (inverse covariance). Sparse matrices give fast computations!
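The sparsity claim can be made concrete. Below is a minimal R sketch (my own illustration, not from the slides): a stationary AR(1) process is a GMRF with a tridiagonal precision matrix, so one draw costs essentially O(n) through the sparse Cholesky factor.

```r
library(Matrix)

n   <- 1000
phi <- 0.9
## Precision of a stationary AR(1) with unit innovation variance:
## tridiagonal, so Q_ij = 0 whenever |i - j| > 1, i.e. x_i and x_j are
## conditionally independent given the rest of the vector.
Q <- bandSparse(n, k = c(0, 1),
                diagonals = list(c(1, rep(1 + phi^2, n - 2), 1),
                                 rep(-phi, n - 1)),
                symmetric = TRUE)
R <- chol(Q)                 # sparse Cholesky factor, Q = t(R) %*% R
x <- solve(R, rnorm(n))      # one draw x ~ N(0, Q^{-1}) via a backsolve
```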
The GMRF approximation

    π(x | θ, y) ∝ exp( −½ xᵀQx + Σ_i log π(y_i | x_i) )
                ≈ exp( −½ (x − µ)ᵀ (Q + diag(c_i)) (x − µ) )
                = π̃(x | θ, y)

Constructed as follows (see the sketch below):
• Locate the mode x*
• Expand to second order

Markov and computational properties are preserved.
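As a concrete illustration, here is a minimal R sketch of the Newton iteration that produces π̃(x | θ, y). The Poisson likelihood y_i | x_i ~ Po(exp(x_i)) is my assumption, and Q can be any sparse prior precision such as the AR(1) one above.

```r
library(Matrix)

## Gaussian (GMRF) approximation by matching mode and curvature:
## maximise -x'Qx/2 + sum_i log pi(y_i | x_i) with Newton steps.
gmrf_approx <- function(Q, y, n_iter = 25) {
  x <- rep(0, nrow(Q))
  for (it in seq_len(n_iter)) {
    grad <- y - exp(x)   # d/dx_i of the Poisson log-likelihood y_i*x_i - exp(x_i)
    c_i  <- exp(x)       # minus its second derivative
    ## Newton step in canonical form: (Q + diag(c_i)) x_new = c_i * x + grad
    x <- as.numeric(solve(Q + Diagonal(x = c_i), c_i * x + grad))
  }
  list(mean = x, precision = Q + Diagonal(x = exp(x)))  # mu and Q + diag(c_i)
}
```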
Part I. Some more background: The Laplace approximation
Outline I

Background: The Laplace approximation
  The Laplace approximation for π(θ | y)
  The Laplace approximation for π(x_i | θ, y)
The integrated nested Laplace approximation (INLA)
  Summary
  Assessing the error
Examples
  Stochastic volatility
  Longitudinal mixed effect model
  Log-Gaussian Cox process
Extensions
  Model choice
  Automatic detection of “surprising” observations
Summary and discussion
Bonus
Outline II

High(er) number of hyperparameters
Parallel computing using OpenMP
Spatial GLMs
The Laplace approximation: The classic case

Compute an approximation to the integral

    ∫ exp(n g(x)) dx

where n is the parameter going to ∞. Let x_0 be the mode of g(x) and assume g(x_0) = 0:

    g(x) = ½ g''(x_0)(x − x_0)² + ⋯
The Laplace approximation: The classic case...

Then

    ∫ exp(n g(x)) dx = √( 2π / (n(−g''(x_0))) ) + ⋯

• As n → ∞, the integrand becomes more and more peaked.
• The error should tend to zero as n → ∞.
• Detailed analysis gives relative error(n) = 1 + O(1/n).
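A quick numeric check in R (my own example, with g(x) = 1 − cosh(x), so that at the mode x_0 = 0 we have g(x_0) = 0 and g''(x_0) = −1) shows the relative error shrinking like 1/n:

```r
g <- function(x) 1 - cosh(x)
for (n in c(1, 10, 100)) {
  exact   <- integrate(function(x) exp(n * g(x)), -Inf, Inf)$value
  laplace <- sqrt(2 * pi / (n * 1))   # -g''(x0) = 1
  cat(sprintf("n = %3d  exact = %.6f  laplace = %.6f  rel.diff = %+.4f\n",
              n, exact, laplace, exact / laplace - 1))
}
```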
Extension I

If

    g_n(x) = (1/n) Σ_{i=1}^n g_i(x)

then the mode x_0 depends on n as well.
Extension II

For

    ∫ exp(n g(x)) dx

with multivariate x, we get

    ∫ exp(n g(x)) dx = √( (2π)^n / |−nH| )

where H is the Hessian (matrix) at the mode,

    H_ij = ∂²g(x) / (∂x_i ∂x_j) |_{x = x_0}
Computing marginals

• Our main task is to compute marginals
• We can use the Laplace approximation for this as well
• A more “statistical” derivation might be appropriate
Computing marginals...

Consider the general problem:
• θ is a hyperparameter with prior π(θ)
• x is latent with density π(x | θ)
• y is observed with likelihood π(y | x)

Then

    π(θ | y) = π(x, θ | y) / π(x | θ, y)    for any x!
Computing marginals...

Further,

    π(θ | y) = π(x, θ | y) / π(x | θ, y)
             ∝ π(θ) π(x | θ) π(y | x) / π(x | θ, y)
             ≈ [ π(θ) π(x | θ) π(y | x) / π̃_G(x | θ, y) ] |_{x = x*(θ)}

where π̃_G(x | θ, y) is the Gaussian approximation of π(x | θ, y) and x*(θ) is its mode.
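Putting the pieces together, here is a minimal R sketch evaluating this approximation to log π(θ | y) at a single θ, up to an additive constant. The names `make_Q` and `log_prior` are hypothetical, the Poisson likelihood is my assumption, and `gmrf_approx` is the earlier sketch.

```r
log_post_theta <- function(theta, y, make_Q, log_prior) {
  Q   <- make_Q(theta)            # prior precision of x for this theta
  fit <- gmrf_approx(Q, y)        # Gaussian approximation pi_G(x | theta, y)
  xs  <- fit$mean                 # the mode x*(theta)
  ## log pi(theta) + log pi(x*|theta) + log pi(y|x*) - log pi_G(x*|theta,y);
  ## the (2*pi)-constants of the two Gaussians cancel.
  log_prior(theta) +
    0.5 * determinant(Q)$modulus - 0.5 * sum(xs * as.numeric(Q %*% xs)) +
    sum(dpois(y, exp(xs), log = TRUE)) -
    0.5 * determinant(fit$precision)$modulus
}
```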
Computing marginals...

Error: With n repeated measurements of the same x, the error is

    π(θ | y) = π̃(θ | y) (1 + O(n^{−3/2}))

after renormalisation. A relative error bound is a very nice property!
The Laplace approximation for π(θ | y)

The Laplace approximation for π(θ | y) is

    π(θ | y) = π(x, θ | y) / π(x | y, θ)                      (any x)
             ≈ π(x, θ | y) / π̃(x | y, θ) |_{x = x*(θ)}  =: π̃(θ | y)    (1)
Remarks

The Laplace approximation π̃(θ | y) turns out to be accurate, because x | y, θ appears almost Gaussian in most cases:
• x is a priori Gaussian.
• y is typically not very informative.
• The observational model is usually ‘well-behaved’.

Note: π̃(θ | y) itself does not look Gaussian. Thus a Gaussian approximation of (θ, x) would be inaccurate.
Approximating π(x_i | y, θ)

This task is more challenging, since
• the dimension n of x is large,
• and there are potentially n marginals to compute, or at least O(n).

An obvious, simple and fast alternative is to use the GMRF approximation (sketched below):

    π̃(x_i | θ, y) = N(x_i; µ_i(θ), σ²_i(θ))
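Continuing the earlier hypothetical Poisson example, the marginal means and variances come straight from the Gaussian approximation. The dense inverse below is purely for illustration; INLA instead obtains the diagonal of the inverse by sparse partial-inversion recursions.

```r
fit    <- gmrf_approx(Q, y)                       # from the earlier sketch
mu     <- fit$mean                                # mu_i(theta)
sigma2 <- diag(solve(as.matrix(fit$precision)))   # sigma_i^2(theta); O(n^3) here
```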
Laplace approximation of π(x_i | θ, y)

• The Laplace approximation:

      π̃(x_i | y, θ) ≈ π(x, θ | y) / π̃(x_{−i} | x_i, y, θ) |_{x_{−i} = x*_{−i}(x_i, θ)}

• Again, the approximation is very good, as x_{−i} | x_i, θ is ‘almost Gaussian’,
• but it is expensive: to get all n marginals we must perform n optimisations and n factorisations of (n−1) × (n−1) matrices.

This can be solved.
Simplified Laplace approximation

A series expansion of the Laplace approximation for π(x_i | θ, y):
• computationally much faster: O(n log n) for each i
• corrects the Gaussian approximation for errors in shift and skewness:

      log π̃(x_i | θ, y) = −½ x_i² + b x_i + ⅙ d x_i³ + ⋯

• fit a skew-Normal density 2 φ(x) Φ(ax) (see the sketch below)
• sufficiently accurate for most applications
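A minimal R sketch of the final fitting step (my own construction: the coefficients b and d are hypothetical, and the cubic expansion is only local, so the grid truncates the tails): match a skew-Normal 2 φ(x) Φ(ax) to the expansion by its first three moments.

```r
b <- 0.3; d <- -0.15                       # hypothetical expansion coefficients
xs <- seq(-6, 6, by = 0.01)
w  <- exp(-xs^2 / 2 + b * xs + d * xs^3 / 6)
w  <- w / sum(w)                           # normalised weights on the grid
m1 <- sum(w * xs)                          # mean
m2 <- sum(w * (xs - m1)^2)                 # variance
g1 <- sum(w * (xs - m1)^3) / m2^1.5        # skewness
## Invert the skew-Normal moment relations (delta = a / sqrt(1 + a^2)):
r     <- (2 * abs(g1) / (4 - pi))^(1/3)
delta <- sign(g1) * sqrt(pi / 2) * r / sqrt(1 + r^2)
a     <- delta / sqrt(1 - delta^2)
omega <- sqrt(m2 / (1 - 2 * delta^2 / pi))     # scale
xi    <- m1 - omega * delta * sqrt(2 / pi)     # location
c(xi = xi, omega = omega, a = a)
```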
The integrated nested Laplace approximation (INLA) I

Step I: Explore π̃(θ | y) (see the sketch below)
• Locate the mode
• Use the Hessian to construct new variables
• Grid-search
• Can be case-specific
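A minimal R sketch of Step I. Everything here is my own illustration: `lpost` is any function returning log π̃(θ | y) up to a constant (such as `log_post_theta` above), and θ is taken to be 2-dimensional.

```r
explore_theta <- function(lpost, theta0, dz = 0.75, cutoff = 2.5) {
  ## Locate the mode of lpost by minimising its negative
  opt <- optim(theta0, function(th) -lpost(th), hessian = TRUE)
  ## The Hessian of -lpost at the mode defines new (rotated, scaled) variables
  eig   <- eigen(opt$hessian)
  scale <- eig$vectors %*% diag(1 / sqrt(eig$values))
  ## Grid-search in the z-parameterisation: theta = mode + scale %*% z
  zgrid <- as.matrix(expand.grid(z1 = seq(-5, 5, by = dz),
                                 z2 = seq(-5, 5, by = dz)))
  theta <- t(opt$par + scale %*% t(zgrid))
  lp    <- apply(theta, 1, lpost)
  keep  <- lp > max(lp) - cutoff     # keep points with non-negligible mass
  list(theta = theta[keep, , drop = FALSE], logdens = lp[keep])
}
```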
The integrated nested Laplace approximation (INLA) II

Step II: For each θ_j
• For each i, evaluate the Laplace approximation at selected values of x_i
• Fit a skew-Normal or a log-spline-corrected Gaussian,

      N(x_i; µ_i, σ²_i) × exp(spline),

  to represent the conditional marginal density.
The integrated nested Laplace approximation (INLA) III

Step III: Sum out θ_j (see the sketch below)
• For each i, sum out θ:

      π̃(x_i | y) ∝ Σ_j π̃(x_i | y, θ_j) × π̃(θ_j | y)

• Fit a log-spline-corrected Gaussian,

      N(x_i; µ_i, σ²_i) × exp(spline),

  to represent π̃(x_i | y).
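Step III is then a weighted sum over the retained grid points. A minimal R sketch with my own notation: row j of `dens` holds π̃(x_i | y, θ_j) evaluated on a common, equally spaced `xgrid`, and `ldens` holds log π̃(θ_j | y) from the exploration step.

```r
marginal_xi <- function(xgrid, dens, ldens) {
  w  <- exp(ldens - max(ldens))           # unnormalised integration weights
  px <- colSums(dens * w)                 # sum_j pi~(x_i | y, theta_j) * w_j
  px / (sum(px) * (xgrid[2] - xgrid[1]))  # renormalise to integrate to 1
}
```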
Computing posterior marginals for θ_j (I)

Main idea:
• Use the integration points and build an interpolant
• Use numerical integration on that interpolant
Computing posterior marginals for θ_j (II)

Practical approach (high accuracy):
• Rerun using a fine integration grid
• Possibly with no rotation
• Just sum up at the grid points, then interpolate
Computing posterior marginals for θ_j (II)

Practical approach (lower accuracy):
• Use the Gaussian approximation at the mode θ*
• ...BUT adjust the standard deviation in each direction (see the sketch below)
• Then use numerical integration
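A minimal R sketch of the adjusted-Gaussian idea (my own construction): walk out from the mode along the j-th axis until the log-posterior has dropped by 1/2, separately in each direction, and use those two distances as side-specific standard deviations.

```r
asym_sd <- function(lpost, mode, j, step = 0.05) {
  lp0 <- lpost(mode)
  one_side <- function(sgn) {
    s <- step
    ## for a Gaussian, the log-density drops by 1/2 exactly one sd from the mode
    while (lpost(mode + sgn * s * (seq_along(mode) == j)) > lp0 - 0.5)
      s <- s + step
    s
  }
  c(sd_lower = one_side(-1), sd_upper = one_side(+1))
}
```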
[Figure: the scaled Gaussian density dnorm(x)/dnorm(0) plotted for x in (−4, 4).]
How can we assess the error in the approximations?

Tool 1: Compare a sequence of increasingly accurate approximations:
1. Gaussian approximation
2. Simplified Laplace
3. Laplace
How can we assess the error in the approximations?

Tool 2: Estimate the error using Monte Carlo:

    π̃_u(θ | y) ∝ E_{π̃_G}[ exp{ r(x; θ, y) } ] × π̃(θ | y)

where r(·) is the sum of the log-likelihood terms minus their second-order Taylor expansions.
How can we assess the error in the approximations?

Tool 3: Estimate the “effective” number of parameters, as defined for the Deviance Information Criterion:

    p_D(θ) = E[D(x; θ)] − D(E[x]; θ)

and compare it with the number of observations; a low ratio is good. This criterion has theoretical justification. (A sketch follows below.)
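A minimal R sketch (my own, assuming posterior draws of x are available, e.g. from a short MCMC check run, and reusing the Poisson likelihood from the earlier sketches):

```r
## p_D = E[D(x; theta)] - D(E[x]; theta), with D = -2 * log-likelihood
p_D <- function(xdraws, y) {             # xdraws: one draw x^(s) per row
  dev <- function(x) -2 * sum(dpois(y, exp(x), log = TRUE))
  mean(apply(xdraws, 1, dev)) - dev(colMeans(xdraws))
}
```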
Stochastic Volatility model

[Figure: time series of roughly 1000 daily values, ranging over about (−2, 4).]

Log of the daily difference of the pound-dollar exchange rate from October 1st, 1981, to June 28th, 1985.
Stochastic Volatility model

A simple model:

    x_t | x_1, …, x_{t−1}, τ, φ ∼ N(φ x_{t−1}, 1/τ)

where |φ| < 1 to ensure a stationary process. Observations are taken to be

    y_t | x_1, …, x_t, µ ∼ N(0, exp(µ + x_t))
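In practice this model fits in a few lines with the R-INLA package; a sketch under the assumption that the package is installed and that `y` holds the log-return series above (`"stochvol"` and `"ar1"` are R-INLA's names for this likelihood and latent model):

```r
library(INLA)

d <- data.frame(y = y, t = seq_along(y))    # y: the log-return series
r <- inla(y ~ 1 + f(t, model = "ar1"),      # mu + x_t, with x_t an AR(1) GMRF
          family = "stochvol", data = d)
summary(r)                                  # posterior summaries for the hyperparameters
plot(r$summary.random$t$mean, type = "l")   # posterior mean of the latent x_t
```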
Results

Using only the first 50 data points, which makes the problem much harder.