Stochastic (partial) differential equations and Gaussian processes Simo Särkkä Aalto University, Finland
Contents
1. Basic ideas
2. Stochastic differential equations and Gaussian processes
3. Stochastic partial differential equations and Gaussian processes
4. Conclusion
Kernel vs. SPDE representations of GPs

GP model ($x \in \mathbb{R}^d$, $t \in \mathbb{R}$) and the equivalent S(P)DE model:

- Spatial, $k(x, x')$: SPDE model ($\mathcal{L}$ is an operator), $\mathcal{L} f(x) = w(x)$
- Temporal, $k(t, t')$: state-space/SDE model, $\frac{d\mathbf{f}(t)}{dt} = A \mathbf{f}(t) + L w(t)$
- Spatio-temporal, $k(x, t; x', t')$: stochastic evolution equation, $\frac{\partial \mathbf{f}(x, t)}{\partial t} = A_x \mathbf{f}(x, t) + L w(x, t)$
Why use S(P)DE solvers for GPs?

The $O(n^3)$ computational complexity of GP regression in the number of data points $n$ is a challenge.

What do we get:
- $O(n)$ state-space methods for SDEs/SPDEs.
- Sparse approximations developed for SPDEs.
- Reduced-rank Fourier/basis function approximations.
- Path to non-Gaussian processes.

Downsides:
- We often need to approximate.
- Mathematics can become messy.
Ornstein-Uhlenbeck process

The mean and covariance functions:
$$ m(x) = 0 $$
$$ k(x, x') = \sigma^2 \exp(-\lambda |x - x'|) $$

This has a path representation as a stochastic differential equation (SDE):
$$ \frac{df(t)}{dt} = -\lambda f(t) + w(t), $$
where $w(t)$ is a white noise process and $x$ has been relabeled as $t$.

The Ornstein-Uhlenbeck process is a Markov process.

What does this actually mean? ⇒ white board.
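As a rough illustration of the Markov property, the sketch below discretizes the SDE exactly between sample times and checks that the resulting one-step recursion reproduces the exponential covariance. It assumes (this is not stated on the slide) that the white noise has spectral density $q = 2 \lambda \sigma^2$, which makes the stationary variance equal to $\sigma^2$. The function names are hypothetical.

```python
import math

def ou_kernel(t1, t2, sigma2, lam):
    """Covariance k(t, t') = sigma2 * exp(-lam * |t - t'|)."""
    return sigma2 * math.exp(-lam * abs(t1 - t2))

def ou_discretize(dt, sigma2, lam):
    """Exact discretization of df/dt = -lam * f + w over a step dt.

    Assumes white noise with spectral density q = 2 * lam * sigma2.
    Returns (a, q_d) such that f_{k+1} = a * f_k + e_k, e_k ~ N(0, q_d).
    """
    a = math.exp(-lam * dt)
    q_d = sigma2 * (1.0 - a * a)
    return a, q_d

def markov_covariance(ts, sigma2, lam):
    """Covariance of (f(t_1), ..., f(t_n)) implied by the Markov recursion.

    ts must be sorted in increasing order.
    """
    n = len(ts)
    C = [[0.0] * n for _ in range(n)]
    C[0][0] = sigma2  # start from the stationary distribution
    for k in range(1, n):
        a, q_d = ou_discretize(ts[k] - ts[k - 1], sigma2, lam)
        for j in range(k):
            # e_k is independent of the past, so only the factor a survives
            C[k][j] = C[j][k] = a * C[k - 1][j]
        C[k][k] = a * a * C[k - 1][k - 1] + q_d
    return C
```

Each new value depends on the past only through the previous value, yet the full covariance matrix built this way agrees with `ou_kernel` evaluated on the same time points.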
Ornstein-Uhlenbeck process (cont.)

Consider a Gaussian process regression problem
$$ f(x) \sim \mathrm{GP}(0, \sigma^2 \exp(-\lambda |x - x'|)) $$
$$ y_k = f(x_k) + \varepsilon_k $$

This is equivalent to the state-space model
$$ \frac{df(t)}{dt} = -\lambda f(t) + w(t) $$
$$ y_k = f(t_k) + \varepsilon_k $$

that is, with $f_k = f(t_k)$ we have a Gauss-Markov model
$$ f_{k+1} \sim p(f_{k+1} \mid f_k) $$
$$ y_k \sim p(y_k \mid f_k) $$

Solvable in $O(n)$ time using the Kalman filter/smoother.
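The $O(n)$ claim can be made concrete with a minimal sketch of a scalar Kalman filter followed by a Rauch-Tung-Striebel smoother for this model. The assumptions here go beyond the slide: the measurement noise is taken to be Gaussian, $\varepsilon_k \sim N(0, r)$, the white noise spectral density is $2 \lambda \sigma^2$ so that the prior is stationary with variance $\sigma^2$, and the time points are sorted. The function name `ou_kalman_smoother` is hypothetical.

```python
import math

def ou_kalman_smoother(ts, ys, sigma2, lam, r):
    """Posterior mean and variance of f(t_k) given all measurements.

    Runs one forward (filter) and one backward (smoother) pass,
    so the cost is linear in the number of data points.
    """
    n = len(ts)
    mp, Pp = [0.0] * n, [0.0] * n   # predicted mean / variance
    mf, Pf = [0.0] * n, [0.0] * n   # filtered mean / variance
    trans = [0.0] * n               # transition coefficient into step k
    m, P = 0.0, sigma2              # stationary prior (assumption)
    for k in range(n):
        if k > 0:
            # Prediction with the exact OU discretization
            a = math.exp(-lam * (ts[k] - ts[k - 1]))
            m, P = a * m, a * a * P + sigma2 * (1.0 - a * a)
            trans[k] = a
        mp[k], Pp[k] = m, P
        # Measurement update for y_k = f_k + eps_k
        gain = P / (P + r)
        m, P = m + gain * (ys[k] - m), (1.0 - gain) * P
        mf[k], Pf[k] = m, P
    # Backward Rauch-Tung-Striebel pass
    ms, Ps = mf[:], Pf[:]
    for k in range(n - 2, -1, -1):
        G = Pf[k] * trans[k + 1] / Pp[k + 1]
        ms[k] = mf[k] + G * (ms[k + 1] - mp[k + 1])
        Ps[k] = Pf[k] + G * G * (Ps[k + 1] - Pp[k + 1])
    return ms, Ps
```

Under these assumptions the smoothed means and variances should agree with the usual GP regression posterior evaluated at the training inputs, without ever forming or inverting the $n \times n$ kernel matrix.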
State Space Form of Linear Time-Invariant SDEs

Consider an $N$th-order LTI SDE of the form
$$ \frac{d^N f}{dt^N} + a_{N-1} \frac{d^{N-1} f}{dt^{N-1}} + \cdots + a_0 f = w(t). $$

If we define $\mathbf{f} = (f, \ldots, d^{N-1} f / dt^{N-1})$, we get a state space model:
$$
\frac{d\mathbf{f}}{dt} =
\underbrace{\begin{pmatrix}
0 & 1 & & \\
 & \ddots & \ddots & \\
 & & 0 & 1 \\
-a_0 & -a_1 & \cdots & -a_{N-1}
\end{pmatrix}}_{A} \mathbf{f}
+ \underbrace{\begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}}_{L} w(t)
$$
$$
f(t) = \underbrace{\begin{pmatrix} 1 & 0 & \cdots & 0 \end{pmatrix}}_{H} \mathbf{f}.
$$

The vector process $\mathbf{f}(t)$ is Markovian although the scalar process $f(t)$ isn't.
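The construction of $A$, $L$ and $H$ from the coefficients can be sketched in a few lines of plain Python. The function name `lti_sde_to_state_space` and the nested-list representation are illustrative choices, not part of the original slides.

```python
def lti_sde_to_state_space(a):
    """Build (A, L, H) for d^N f/dt^N + a_{N-1} d^{N-1} f/dt^{N-1} + ... + a_0 f = w(t).

    a is the coefficient list [a_0, a_1, ..., a_{N-1}].
    A is returned as a nested list (companion matrix), L and H as flat lists.
    """
    N = len(a)
    A = [[0.0] * N for _ in range(N)]
    for i in range(N - 1):
        A[i][i + 1] = 1.0              # each state is the derivative of the previous one
    A[N - 1] = [-float(c) for c in a]  # last row carries the negated coefficients
    L = [0.0] * (N - 1) + [1.0]        # noise drives only the highest derivative
    H = [1.0] + [0.0] * (N - 1)        # measurement picks out f itself
    return A, L, H
```

For example, `lti_sde_to_state_space([lam])` gives the one-dimensional model with `A = [[-lam]]`, which matches the drift of the Ornstein-Uhlenbeck SDE above.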