
Bayesian nonparametric inference for diffusion models with discrete sampling
Jakob Söhl (Delft University of Technology), joint work with Richard Nickl
Van Dantzig Seminar, Leiden, 26 October 2016

Outline
1 Diffusion Processes
  • Background on Diffusion Processes
  • Statistics for Diffusion Processes
2 Contraction Result
  • Prior Distributions
  • Contraction Theorem
  • General Contraction Theorem
3 Main Ideas of Proof
  • Information Theoretic Distance
  • Concentration Inequality
4 Conclusion

Diffusion Markov Processes
Consider a process $(X_t : t \ge 0)$ that solves the stochastic differential equation
$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \quad t \ge 0.$$
Here $b$ is the drift coefficient, $\sigma$ the diffusion coefficient and $(W_t)_{t \ge 0}$ a Brownian motion.
Under mild assumptions on $(\sigma, b)$, $(X_t : t \ge 0)$ is a unique Markov process with transition densities $p_{t,\sigma b}(x, y)$ describing the operator
$$E_{\sigma b}[f(X_{t+s}) \mid X_s = x] = \int_Y f(y)\, p_{t,\sigma b}(x, y)\,dy =: P_t f(x), \quad f \in C_b(Y),\ s \ge 0.$$
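To make the stochastic differential equation concrete, here is a minimal simulation sketch using the Euler–Maruyama scheme. This is a standard discretisation and not something taken from the talk. The function name `euler_maruyama` and the Ornstein–Uhlenbeck-type coefficients in the example are illustrative assumptions.

```python
import math
import random


def euler_maruyama(b, sigma, x0, T, n_steps, rng):
    """Approximate a path of dX_t = b(X_t) dt + sigma(X_t) dW_t on [0, T].

    Returns the list [X_0, X_dt, ..., X_T] with dt = T / n_steps.
    """
    dt = T / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x = x + b(x) * dt + sigma(x) * dw
        path.append(x)
    return path


# Hypothetical example: mean-reverting drift b(x) = -x, constant sigma = 0.5.
rng = random.Random(0)
path = euler_maruyama(lambda x: -x, lambda x: 0.5, x0=1.0, T=10.0,
                      n_steps=10_000, rng=rng)
```

Keeping every 1000th point of `path` (for example `path[::1000]`) would mimic observations at a fixed distance Δ = 1, up to the discretisation error of the scheme.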

Applications
→ Diffusion models are ubiquitous in modern science: they serve as fundamental building blocks in the modelling of dynamic phenomena in
• physics, biology, geosciences
• evolutionary dynamics and life sciences
• engineering
• economics & finance
They are closely related to stochastic models that describe a dynamical system by some differential operator $L$ that propagates the system state perturbed by statistical noise. Buzzwords: ‘data assimilation, uncertainty quantification, filtering problems, Hidden Markov Models’.
→ Often the parameters $(\sigma, b)$ are unknown and one wants to infer their values from some form of sample of the diffusion.

Statistical Inference & Observation Schemes
• An idealised assumption would be to observe an entire trajectory $(X_t : 0 \le t \le T)$ up to time $T$. Inference on $b$ becomes possible as $T \to \infty$. (Note that $\sigma$ is known in this case, since it can be read off from the quadratic variation of the path.)
• More realistic: discrete observations $X_0, X_\Delta, X_{2\Delta}, \dots, X_{n\Delta}$ of the continuous process, where $\Delta$ is the ‘observation distance’.
  • high-frequency observations: $\Delta \to 0$ and $n\Delta = T \to \infty$
  • low-frequency observations: $\Delta > 0$ fixed as $n \to \infty$
• The high-frequency regime asymptotically reflects the ‘continuous data’ setting. Low-frequency is harder.

Some Spectral Theory
When the diffusion is restricted to a regular compact space by reflection, say $[0, 1]$ for simplicity, the transition operator $P_t$ coincides with the action of the semigroup $(e^{tL} : t \ge 0)$ on $L^2(\mu)$, where the infinitesimal generator
$$L = L_{\sigma b} = b(x)\,\frac{d}{dx} + \frac{\sigma(x)^2}{2}\,\frac{d^2}{dx^2}$$
admits (subject to suitable boundary conditions) a discrete spectrum of eigenfunctions $u_k$, $k = 0, 1, 2, \dots$, with eigenvalues $\lambda_k \in [-Ck^2, -C'k^2]$, $k \ge 1$. Here $\mu$ is the invariant density of the Markov process. We deduce the expansion
$$p_{t,\sigma b}(x, y) = \sum_k e^{\lambda_k t}\, u_k(x)\, u_k(y)\, \mu(y), \quad x, y \in [0, 1].$$
→ In the case of a scalar diffusion reflected at $\{0, 1\}$ the boundary conditions are of Neumann type ($u_k'(0) = u_k'(1) = 0$). If $b = 0$ and $\sigma = 1$ we have reflected Brownian motion. Dirichlet conditions correspond to killed Brownian motion.
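As an illustration of the spectral expansion, the sketch below evaluates it for reflected Brownian motion on [0, 1] ($b = 0$, $\sigma = 1$). It assumes the standard Neumann eigen-pairs for this case, which are not spelled out on the slide: $\mu \equiv 1$, $u_0 = 1$, $u_k(x) = \sqrt{2}\cos(k\pi x)$ and $\lambda_k = -k^2\pi^2/2$. The function names and the truncation level `n_terms` are illustrative choices.

```python
import math


def rbm_transition_density(t, x, y, n_terms=200):
    """Truncated spectral expansion of p_t(x, y) for reflected Brownian
    motion on [0, 1]: sum_k exp(lambda_k t) u_k(x) u_k(y) mu(y) with mu = 1."""
    total = 1.0  # k = 0 term: lambda_0 = 0, u_0 = 1
    for k in range(1, n_terms + 1):
        lam = -0.5 * (k * math.pi) ** 2
        # u_k(x) u_k(y) = 2 cos(k pi x) cos(k pi y)
        total += (2.0 * math.exp(lam * t)
                  * math.cos(k * math.pi * x) * math.cos(k * math.pi * y))
    return total


def total_mass(t, x, n_grid=400):
    """Midpoint-rule approximation of the integral of p_t(x, y) over y."""
    h = 1.0 / n_grid
    return h * sum(rbm_transition_density(t, x, (j + 0.5) * h)
                   for j in range(n_grid))
```

For moderate `t` the series converges very quickly because of the quadratic decay of the eigenvalues; for very small `t` many more terms would be needed.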

Frequentist Estimation at Low Frequency
• In a seminal paper, Gobet, Hoffmann & Reiß (2004) studied the above model in the nonparametric setting. They started from the spectral identities
$$\sigma^2 = \frac{2\lambda_1 \int_0^{\cdot} u_1\,d\mu}{u_1'\,\mu}, \qquad b = \lambda_1\, \frac{u_1 u_1' \mu - u_1'' \int_0^{\cdot} u_1\,d\mu}{(u_1')^2\,\mu}.$$
• While estimation of $\mu$ is straightforward, recovery of the first eigen-pair $(u_1, \lambda_1)$ requires estimation of the entire transition operator $P_\Delta$. GHR show that this can be done empirically in a minimax optimal way, with resulting $L^2$-convergence rates $n^{-s/(2s+3)}$ for $\sigma^2$ and $n^{-(s-1)/(2s+3)}$ for $b$ whenever, for $C^s$ an $s$-Hölder or Sobolev space,
$$(\sigma, b) \in \Theta_s = \{\|\sigma\|_{C^s} + \|b\|_{C^{s-1}} \le B,\ \sigma \ge c > 0\}.$$
These rates reveal an ill-posed nonlinear inverse problem of order 1 and 2.
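The spectral identities can be sanity-checked numerically in the one case where everything is explicit. The sketch below assumes reflected Brownian motion on [0, 1], for which $\mu \equiv 1$, $u_1(x) = \sqrt{2}\cos(\pi x)$ and $\lambda_1 = -\pi^2/2$ (standard facts, not stated on the slide), and recovers $\sigma^2 = 1$ and $b = 0$ from them. It is not the GHR estimator, which works with an empirical transition operator. All function names are illustrative.

```python
import math

LAMBDA_1 = -0.5 * math.pi ** 2  # first non-trivial eigenvalue (assumed case)


def u1(x):
    return math.sqrt(2.0) * math.cos(math.pi * x)


def du1(x):
    return -math.sqrt(2.0) * math.pi * math.sin(math.pi * x)


def d2u1(x):
    return -math.sqrt(2.0) * math.pi ** 2 * math.cos(math.pi * x)


def mu(x):
    return 1.0  # invariant density of reflected Brownian motion on [0, 1]


def integral_u1_dmu(x, n_grid=2000):
    """Midpoint-rule approximation of int_0^x u1(y) mu(y) dy."""
    h = x / n_grid
    return h * sum(u1((j + 0.5) * h) * mu((j + 0.5) * h) for j in range(n_grid))


def sigma2_from_spectrum(x):
    """sigma^2(x) = 2 lambda_1 int_0^x u1 dmu / (u1'(x) mu(x))."""
    return 2.0 * LAMBDA_1 * integral_u1_dmu(x) / (du1(x) * mu(x))


def b_from_spectrum(x):
    """b(x) = lambda_1 (u1 u1' mu - u1'' int_0^x u1 dmu) / ((u1')^2 mu)."""
    integral = integral_u1_dmu(x)
    numerator = u1(x) * du1(x) * mu(x) - d2u1(x) * integral
    return LAMBDA_1 * numerator / (du1(x) ** 2 * mu(x))
```

Both formulas divide by $u_1'$, which vanishes at the endpoints under the Neumann boundary conditions, so the check should be evaluated at interior points such as `x = 0.3`.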

Bayesian Methods
From a Bayesian perspective it is natural to put a prior $\Pi$ on the pair $(\sigma, b)$. The resulting posterior distribution is obtained from Bayes’ formula. For instance, if the process is started in equilibrium, $X_0 \sim \mu_{\sigma b}$, then
$$d\Pi((\sigma, b) \mid X_0, X_\Delta, \dots, X_{n\Delta}) = \frac{\mu_{\sigma b}(X_0) \prod_{i=1}^n p_{\Delta,\sigma b}(X_{(i-1)\Delta}, X_{i\Delta})\, d\Pi(\sigma, b)}{\int \mu_{\sigma b}(X_0) \prod_{i=1}^n p_{\Delta,\sigma b}(X_{(i-1)\Delta}, X_{i\Delta})\, d\Pi(\sigma, b)}.$$
Direct evaluation is out of reach, since the transition probabilities depend in an analytically intractable, non-linear way on $\sigma, b$.
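Bayes' formula can be evaluated directly only in toy cases. The sketch below uses one such case as an assumption: reflected Brownian motion on [0, 1] with $b = 0$ and an unknown constant $\sigma$, so that $\mu \equiv 1$ and the transition density has the explicit cosine expansion with eigenvalues $-\sigma^2 k^2 \pi^2/2$. The prior is a uniform distribution on a finite grid of $\sigma$ values. This is a parametric illustration of the formula, not the nonparametric setting of the talk, and all names are hypothetical.

```python
import math
import random


def transition_density(delta, sigma, x, y, n_terms=60):
    """Cosine expansion of p_delta(x, y) for reflected BM with constant sigma."""
    total = 1.0
    for k in range(1, n_terms + 1):
        lam = -0.5 * (sigma * k * math.pi) ** 2
        total += (2.0 * math.exp(lam * delta)
                  * math.cos(k * math.pi * x) * math.cos(k * math.pi * y))
    # Floor guards against round-off making a tiny density non-positive.
    return max(total, 1e-300)


def fold(z):
    """Reflect a real number into [0, 1] (period-2 folding)."""
    z = z % 2.0
    return 2.0 - z if z > 1.0 else z


def simulate_observations(sigma, delta, n, rng):
    """Draw X_0, X_delta, ..., X_{n delta}, started in equilibrium (uniform)."""
    x = rng.random()
    obs = [x]
    for _ in range(n):
        x = fold(x + sigma * math.sqrt(delta) * rng.gauss(0.0, 1.0))
        obs.append(x)
    return obs


def grid_posterior(obs, delta, sigma_grid):
    """Posterior weights on sigma_grid under a uniform prior on the grid.

    mu(X_0) = 1 and the prior weights are constant, so only the product of
    transition densities matters; it is accumulated on the log scale.
    """
    log_post = [sum(math.log(transition_density(delta, s, x, y))
                    for x, y in zip(obs, obs[1:]))
                for s in sigma_grid]
    shift = max(log_post)
    weights = [math.exp(lp - shift) for lp in log_post]
    norm = sum(weights)
    return [w / norm for w in weights]


rng = random.Random(0)
obs = simulate_observations(sigma=0.8, delta=0.1, n=200, rng=rng)
sigma_grid = [0.4, 0.6, 0.8, 1.0, 1.2]
posterior = grid_posterior(obs, 0.1, sigma_grid)
```

Working on the log scale and subtracting the maximum before exponentiating avoids underflow in the product over $n$ transition densities.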

Sampling from the Posterior Distribution
Papaspiliopoulos, Pokern, Roberts & Stuart (2012) showed how one can sample from the posterior distribution when $\sigma = 1$ (or parametric) and the prior on $b$ comes from a Gaussian process. One uses conjugacy under continuous sampling, combined with a ‘latent’ variables sampling idea.
Can this ‘work’, particularly if the prior only models the regularity of $\sigma, b$, and so is ignorant of the ‘inverse problem’?
The same question can be asked about many similar Bayesian ‘solutions’ of inverse problems (Stuart (2010)).

Frequentist Posterior Contraction Rates for Inverse Problems
• Following the program of van der Vaart, Ghosal et al., one can ask whether the posterior distribution contracts about the ‘true value’ $(\sigma_0, b_0)$ at the right rate. Do we have, for large enough $M > 0$, that
$$\Pi\Big((\sigma, b) : n^{s/(2s+3)} \|\sigma - \sigma_0\| + n^{(s-1)/(2s+3)} \|b - b_0\| > M \,\Big|\, X_0, \dots, X_{n\Delta}\Big) \to 0$$
in $P_{\sigma_0 b_0}$-probability as $n \to \infty$?
• For general linear inverse problems
$$Y = Af + \varepsilon; \quad A : H_1 \to H_2 \text{ linear, compact},$$
with Gaussian white noise $\varepsilon$, results are available: see Knapik, van der Vaart & van Zanten (2011), Agapiou, Larsson & Stuart (2013) for the Gaussian conjugate setting, and Ray (2013) for a general approach.

Bayesian Estimation for Low-Frequency Observations
For nonlinear settings, very little is known. In particular, in the diffusion model with low-frequency observations, only consistency in a weak topology (with $\sigma = 1$ known) has been proved so far (van der Meulen & van Zanten, 2013).
There are extensions to multidimensional diffusions (Gugushvili & Spreij, 2014) and to jump diffusions (Koskela, Spano & Jenkins, 2015).
All three papers assume $\sigma = 1$ known and show consistency in a weak topology.

Wavelet Series Priors I
Let $\psi_{lk}$ be boundary corrected Daubechies wavelets, $0 < \alpha < \beta < 1$, and $\mathcal{I} = \{(l, k) : \psi_{lk} \text{ supported in } [\alpha, \beta]\}$.
Model the diffusion coefficient $\sigma$ by
$$\log(\sigma^{-2}(x)) = \sum_{(l,k) \in \mathcal{I}} \frac{2^{-l(s+1/2)}}{l^2}\, u_{lk}\, \psi_{lk}(x), \qquad u_{lk} \sim^{iid} U(-B, B).$$
Comments:
• Could replace the uniform distributions $U(-B, B)$ by any distribution with bounded support and density bounded away from zero.
• Could truncate the sum in $l$ at $L_n \to \infty$ sufficiently fast.
• By the connection between Hölder norms and wavelet series, $\log(\sigma^{-2})$ is modelled as a typical $s$-Hölder smooth function (with a ‘convenient’ log-factor).
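A draw from a prior of this form can be sketched as follows. As a loudly labelled simplification, the code uses Haar wavelets on [0, 1] as a stand-in for the boundary corrected Daubechies wavelets, so the Hölder smoothness interpretation does not carry over literally. It also assumes that levels start at $l = 1$ and that the sum is truncated at a fixed `max_level`. All function names are illustrative.

```python
import math
import random


def haar(l, k, x):
    """Haar wavelet psi_lk(x) = 2^(l/2) psi(2^l x - k), supported on
    [k 2^-l, (k+1) 2^-l]. Stand-in for the Daubechies wavelets of the slide."""
    z = (2 ** l) * x - k
    if 0.0 <= z < 0.5:
        return 2.0 ** (l / 2)
    if 0.5 <= z < 1.0:
        return -(2.0 ** (l / 2))
    return 0.0


def index_set(alpha, beta, max_level):
    """All (l, k), 1 <= l <= max_level, with psi_lk supported in [alpha, beta]."""
    indices = []
    for l in range(1, max_level + 1):
        for k in range(2 ** l):
            lower, upper = k / 2 ** l, (k + 1) / 2 ** l
            if alpha <= lower and upper <= beta:
                indices.append((l, k))
    return indices


def draw_coefficients(indices, B, rng):
    """Independent u_lk ~ U(-B, B), one per index."""
    return {lk: rng.uniform(-B, B) for lk in indices}


def sigma_at(coeffs, s, x):
    """sigma(x) from log(sigma^-2(x)) = sum 2^(-l(s+1/2)) l^-2 u_lk psi_lk(x)."""
    log_inv_sigma2 = sum(2.0 ** (-l * (s + 0.5)) / l ** 2 * u * haar(l, k, x)
                         for (l, k), u in coeffs.items())
    return math.exp(-0.5 * log_inv_sigma2)


rng = random.Random(1)
indices = index_set(alpha=0.25, beta=0.75, max_level=6)
coeffs = draw_coefficients(indices, B=1.0, rng=rng)
sigma_values = [sigma_at(coeffs, s=2.0, x=j / 100) for j in range(101)]
```

Because the coefficients are bounded and the weights are summable, every draw of $\sigma$ is bounded away from zero and infinity, and $\sigma = 1$ outside $[\alpha, \beta]$.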
