Least-Action Filtering

L. C. G. Rogers
Statistical Laboratory, University of Cambridge

Least-Action Filtering – p. 1/1
Summary

• Basics of least-action filtering
• Finding the least-action path
• The approximate conditional distribution of the hidden path
• Example(s)
• Relationship to particle filtering (SMC)
• SAMSI Program on Sequential Monte Carlo Methods, 9/08–9/09. Organisers: Arnaud Doucet, Simon Godsill. Working group on continuous-time methods (Fearnhead, Voss, ....)
• Markussen (SPA 119, 208–231, 2009) uses similar techniques to approximate the density of a discretely-sampled diffusion process
The setting.

Diffusion $Z_t \equiv [X_t; Y_t]$ in $\mathbb{R}^d$, solving

$$dZ_t = \sigma(t, Z_t)\,dW_t + \mu(t, Z_t)\,dt$$

where $\sigma$, $\mu$ and $\sigma^{-1}$ are $C^2_b$. We observe $(Y_t)_{0\le t\le T}$ and want to find the conditional distribution of $(X_t)_{0\le t\le T}$.

Closely related is the Euler scheme

$$dz^{(n)}_t = \sigma(t_n, z^{(n)}_{t_n})\,dW_t + \mu(t_n, z^{(n)}_{t_n})\,dt$$

where $t_n \equiv 2^{-n}[2^n t]$. Despite appearances, this can be viewed as a discrete scheme. We also have

$$\sup_{0\le t\le T}\bigl|z^{(n)}_t - Z_t\bigr| \xrightarrow{\text{a.s.}} 0.$$

Use continuous time for guidance, discrete time for numerics and proof.
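The Euler scheme above can be sketched in a few lines; this is an illustrative implementation (not the talk's code), and the particular 2-d model with $\sigma = I$ and $\mu(t,z) = (-x, x)$ is an assumption for the example.

```python
import numpy as np

def euler_path(n, T, z0, mu, sigma, rng):
    """Simulate z^(n) on the dyadic grid of mesh h = 2**-n up to time T."""
    h = 2.0 ** -n
    steps = int(round(T / h))
    z = np.array(z0, dtype=float)
    path = [z.copy()]
    for j in range(steps):
        t = j * h
        # freeze sigma and mu at the left grid point t_n, as in the scheme
        dW = np.sqrt(h) * rng.standard_normal(len(z))
        z = z + sigma(t, z) @ dW + mu(t, z) * h
        path.append(z.copy())
    return np.array(path)

rng = np.random.default_rng(1)
mu = lambda t, z: np.array([-z[0], z[0]])   # assumed toy drift
sigma = lambda t, z: np.eye(2)              # assumed unit diffusion
path = euler_path(n=6, T=1.0, z0=[1.0, 0.0], mu=mu, sigma=sigma, rng=rng)
print(path.shape)   # (65, 2): 2**6 steps plus the initial point
```

Refining $n$ makes the grid dyadic and nested, which is what makes the almost-sure uniform convergence statement natural.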
Log Likelihoods.

See $y(j2^{-n})_{0\le j\le 2^nT}$ and want conditional law of $x(j2^{-n})_{0\le j\le 2^nT}$.

Log-likelihood is (to within additive constant) ($N \equiv 2^nT$, $h \equiv 2^{-n}$)

$$\lambda(x|y) = -\tfrac12 \sum_{j=0}^{N-1} \frac{1}{h}\,\bigl|\sigma(jh, z_{jh})^{-1}\bigl(z_{jh+h} - z_{jh} - h\mu(jh, z_{jh})\bigr)\bigr|^2 - \varphi(x_0)$$
$$= -\tfrac12 \sum_{j=0}^{N-1} h\,\bigl|\sigma(jh, z_{jh})^{-1}\bigl(h^{-1}(z_{jh+h} - z_{jh}) - \mu(jh, z_{jh})\bigr)\bigr|^2 - \varphi(x_0)$$
$$\text{``}=\text{''}\; -\tfrac12 \int_0^T \bigl|\sigma(s, z_s)^{-1}\bigl(\dot z_s - \mu(s, z_s)\bigr)\bigr|^2\,ds - \varphi(x_0),$$

where $\exp(-\varphi)$ is the (prior) density of $X_0$.

Maximising the log-likelihood is like maximising

$$\Lambda(x|y) = -\tfrac12 \int_0^T \bigl|\sigma(s, z_s)^{-1}\bigl(\dot z_s - \mu(s, z_s)\bigr)\bigr|^2\,ds - \varphi(x_0) \equiv -\int_0^T \psi(s, x_s, p_s)\,ds - \varphi(x_0)$$

where $p_s \equiv \dot x_s$. This is a task for the calculus of variations....
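The first sum on the slide translates directly into code; here is a minimal sketch (the function name and signature are assumptions, not from the talk) that evaluates $\lambda(x|y)$ up to its additive constant for a path on the grid of mesh $h$.

```python
import numpy as np

def log_likelihood(z, h, mu, sigma_inv, phi):
    """z: (N+1, d) array of the full path [x; y] on the grid of mesh h."""
    total = 0.0
    for j in range(len(z) - 1):
        t = j * h
        # increment of the path minus its predicted drift over one step
        incr = z[j + 1] - z[j] - h * mu(t, z[j])
        v = sigma_inv(t, z[j]) @ incr
        total += v @ v / h
    return -0.5 * total - phi(z[0])

# toy check: 1-d path with zero drift, sigma = 1, flat prior (phi = 0)
h = 0.25
z = np.array([[0.0], [0.5], [0.0]])
ll = log_likelihood(z, h,
                    mu=lambda t, z: np.zeros(1),
                    sigma_inv=lambda t, z: np.eye(1),
                    phi=lambda x0: 0.0)
print(ll)   # -0.5 * (0.25 + 0.25) / 0.25 = -1.0
```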
Calculus of Variations.

If we perturb the optimal $x^*$ to $x^* + \xi$, the first-order change is

$$\Delta\Lambda = \Delta\Bigl(-\int_0^T \psi(s, x^*_s, p^*_s)\,ds - \varphi(x_0)\Bigr)$$
$$= -\int_0^T \bigl\{\xi\cdot D_x\psi + \dot\xi\cdot D_p\psi\bigr\}\,ds - \xi(0)\cdot D_x\varphi$$
$$= -[\xi\cdot D_p\psi]_0^T + \int_0^T \xi\cdot\bigl\{D_{tp}\psi + (D_{px}\psi)\dot x + (D_{pp}\psi)\dot p - D_x\psi\bigr\}\,ds - \xi(0)\cdot D_x\varphi.$$

Since $\xi$ is arbitrary, we conclude that

$$D_p\psi(0, x^*_0, p^*_0) - D_x\varphi(x^*_0) = 0$$
$$0 = D_{tp}\psi + (D_{px}\psi)\dot x^* + (D_{pp}\psi)\dot p^* - D_x\psi$$
$$D_p\psi(T, x^*_T, p^*_T) = 0$$

which is a second-order ODE for the optimal $x^*$, with boundary conditions at 0 and at $T$.
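As a worked special case (an illustration added here, not on the original slide): take $d = 1$, $\sigma \equiv 1$ and time-homogeneous $\mu$, so $\psi(s, x, p) = \tfrac12\bigl(p - \mu(x)\bigr)^2$. Then the three conditions above become explicit:

```latex
% derivatives of psi(s, x, p) = (p - mu(x))^2 / 2
D_p\psi = p - \mu(x), \qquad D_x\psi = -\mu'(x)\bigl(p - \mu(x)\bigr)
% the interior equation  (d/ds) D_p\psi = D_x\psi  simplifies to
\ddot x_s = \mu(x_s)\,\mu'(x_s) = \tfrac12\,(\mu^2)'(x_s)
% and the two endpoint conditions read
\dot x_0 = \mu(x_0) + \varphi'(x_0), \qquad \dot x_T = \mu(x_T)
```

So the least-action path solves a second-order ODE driven by $\tfrac12(\mu^2)'$, with one boundary condition tying the initial velocity to the prior and one freeing the terminal velocity to match the drift.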
Discrete Calculus of Variations.

With $p_j \equiv (x_{jh+h} - x_{jh})/h$, must minimise

$$\sum_{j=0}^{N-1} h\,\psi(t_j, x_j, p_j) + \varphi(x_0)$$

Perturbing $x^*$ to $x^* + \xi$ as before gives

$$0 = \sum_{j=0}^{N-1} h\,\Bigl\{\xi_j\cdot D_x\psi(t_j, x_j, p_j) + \frac{\xi_{j+1} - \xi_j}{h}\cdot D_p\psi(t_j, x_j, p_j)\Bigr\} + \xi_0\cdot D\varphi(x_0)$$
$$= \sum_{j=1}^{N-1} h\,\xi_j\cdot\bigl\{D_x\psi(t_j, x_j, p_j) - h^{-1}\bigl(D_p\psi(t_j, x_j, p_j) - D_p\psi(t_{j-1}, x_{j-1}, p_{j-1})\bigr)\bigr\}$$
$$\qquad + \xi_0\cdot\{D\varphi(x_0) - D_p\psi(t_0, x_0, p_0)\} + \xi_N\cdot D_p\psi(T, x_{N-1}, p_{N-1})$$
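The discrete minimisation can be carried out numerically; below is a minimal sketch, not the talk's code. The two-dimensional model ($dX = -X\,dt + dW^1$, $dY = X\,dt + dW^2$, $\sigma = I$, prior $\varphi(x) = x^2/2$), the step size, and the crude finite-difference gradient descent are all assumptions for illustration.

```python
import numpy as np

h, N = 0.05, 40
rng = np.random.default_rng(0)

# simulate a "true" hidden path x and an observed path y from the toy model
x_true = np.zeros(N + 1)
y = np.zeros(N + 1)
for j in range(N):
    x_true[j + 1] = x_true[j] - x_true[j] * h + np.sqrt(h) * rng.standard_normal()
    y[j + 1] = y[j] + x_true[j] * h + np.sqrt(h) * rng.standard_normal()

def action(x):
    """Discrete action: sum_j h * psi(t_j, x_j, p_j) + phi(x_0), y held fixed."""
    dx = (x[1:] - x[:-1]) / h          # p_j for the hidden component
    dy = (y[1:] - y[:-1]) / h          # increments of the observed path
    psi = 0.5 * ((dx + x[:-1]) ** 2 + (dy - x[:-1]) ** 2)
    return h * psi.sum() + 0.5 * x[0] ** 2

# crude minimisation over the hidden path: finite-difference gradient descent
x = np.zeros(N + 1)
eps, lr = 1e-6, 0.01
for _ in range(400):
    g = np.zeros_like(x)
    for i in range(N + 1):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (action(x + e) - action(x - e)) / (2 * eps)
    x -= lr * g

print(action(x))
```

In practice one would exploit the banded structure of the first-order conditions above (each $\xi_j$ couples only neighbouring grid points), which makes the least-action path computable by a tridiagonal solve or a fast Newton iteration rather than generic descent.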