Travel Time Estimation using Approximate Belief States on a Hidden Markov Model
Walid Krichene
Overview
◮ Context
◮ Inference on a HMM
◮ Modeling framework and exact inference
◮ Approximate Inference: the Boyen-Koller algorithm
◮ Graph Partitioning
Context
◮ Mobile Millennium project
◮ Travel time estimation on an arterial network
◮ Input data: probe vehicles that send their GPS locations periodically
  ◮ processed using path inference
  ◮ observation = (path, travel time along the path)
Objective
Improve the inference algorithm
◮ Time complexity is exponential in the size of the network (number of links)
◮ Solution: assume links are independent
  ◮ but then we lose the structure of the network
◮ Need approximate inference to keep the structure
Inference on a HMM
Graphical Model
◮ Nodes: random variables
◮ Conditional independence: $x$ and $y$ are independent conditionally on $(n_1, n_2)$, but not on $n_1$ alone
Hidden Markov Model
◮ Hidden variables $s_t \in \{s^1, \ldots, s^N\}$
◮ Observed variables $y_t$
◮ $(s_0, \ldots, s_t)$ is a Markov process
◮ Hidden variables are introduced to simplify the model
◮ Interesting because it provides efficient algorithms for inference and parameter estimation
Parametrization of a HMM
◮ Initial probability distribution: $\pi_i = P(s_0^i)$
◮ Transition matrix: $T_{i,j} = P(s_{t+1}^j \mid s_t^i)$
◮ Observation model: $P(y_t \mid s_t)$
◮ This completely characterizes the HMM: we can compute the probability of any event.
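As a minimal sketch of this parametrization (not the talk's code; a small discrete observation alphabet is assumed for simplicity, whereas the travel-time model later in the talk is continuous):

```python
import numpy as np

# Illustrative sizes: N hidden states, M possible observation symbols.
N, M = 3, 4
rng = np.random.default_rng(0)

pi = np.full(N, 1.0 / N)               # pi[i]   = P(s_0 = i)
T = rng.dirichlet(np.ones(N), size=N)  # T[i, j] = P(s_{t+1} = j | s_t = i)
O = rng.dirichlet(np.ones(M), size=N)  # O[i, k] = P(y_t = k | s_t = i)

# These three objects completely characterize the HMM: for example, the
# joint probability of a state sequence and an observation sequence is
def joint_prob(states, obs):
    p = pi[states[0]] * O[states[0], obs[0]]
    for t in range(1, len(states)):
        p *= T[states[t - 1], states[t]] * O[states[t], obs[t]]
    return p

print(joint_prob([0, 1, 2], [3, 0, 1]))
```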
Inference
General inference problem: compute $P(s_t \mid y_{0:T})$
◮ Filtering if $t = T$
◮ Prediction if $t > T$
◮ Smoothing if $t < T$

Let $y = y_{0:T}$. Then
$$P(s_t \mid y) = \frac{P(s_t, y)}{P(y)} = \frac{\alpha(s_t)\,\beta(s_t)}{\sum_{s_t} \alpha(s_t)\,\beta(s_t)}$$
where
$$\alpha(s_t) \triangleq P(y_0, \ldots, y_t, s_t), \qquad \beta(s_t) \triangleq P(y_{t+1}, \ldots, y_T \mid s_t)$$
Message passing algorithms
Recursive algorithm to compute $\alpha(s_t)$ and $\beta(s_t)$:
◮ $\alpha(s_{t+1}) = \sum_{s_t} \alpha(s_t)\, T_{s_t, s_{t+1}}\, P(y_{t+1} \mid s_{t+1})$
◮ $\beta(s_t) = \sum_{s_{t+1}} \beta(s_{t+1})\, P(y_{t+1} \mid s_{t+1})\, T_{s_t, s_{t+1}}$
◮ Complexity: $O(N^2 T)$ operations
  ◮ $\alpha$ recursion: for every $t$, there are $N$ possible values of $s_{t+1}$, and each $\alpha(s_{t+1})$ requires $N$ multiplications
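A sketch of the two recursions in numpy, reusing the hypothetical `pi`, `T`, `O` arrays from the previous block; each step is one $N \times N$ matrix-vector product, giving the $O(N^2 T)$ total (in practice the messages should be rescaled at each step to avoid underflow):

```python
import numpy as np

def forward_backward(pi, T, O, obs):
    """alpha/beta recursions; O(N^2) work per time step, O(N^2 T) total."""
    Tlen, N = len(obs), len(pi)
    alpha = np.zeros((Tlen, N))
    beta = np.zeros((Tlen, N))
    # alpha(s_0) = P(y_0, s_0) = pi(s_0) P(y_0 | s_0)
    alpha[0] = pi * O[:, obs[0]]
    for t in range(Tlen - 1):
        # alpha(s_{t+1}) = sum_{s_t} alpha(s_t) T[s_t, s_{t+1}] P(y_{t+1} | s_{t+1})
        alpha[t + 1] = (alpha[t] @ T) * O[:, obs[t + 1]]
    beta[-1] = 1.0
    for t in range(Tlen - 2, -1, -1):
        # beta(s_t) = sum_{s_{t+1}} beta(s_{t+1}) P(y_{t+1} | s_{t+1}) T[s_t, s_{t+1}]
        beta[t] = T @ (O[:, obs[t + 1]] * beta[t + 1])
    # smoothed posterior gamma(s_t) = P(s_t | y_{0:T})
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return alpha, beta, gamma
```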
Parameter estimation
Parameters of the HMM: $\theta = (\pi, T, \eta)$
◮ $T$: transition matrix
◮ $\pi$: initial state probability distribution
◮ $\eta$: parameters of the observation model $P(y_t \mid s_t, \eta)$
Parameter estimation: maximize the log likelihood with respect to $\theta$:
$$\ln \sum_{s_0} \sum_{s_1} \cdots \sum_{s_T} \pi_{s_0} \prod_{t=0}^{T} P(y_t \mid s_t, \eta) \prod_{t=0}^{T-1} T_{s_t, s_{t+1}}$$
Expectation Maximization algorithm
◮ E step: estimate the hidden (unobserved) variables given the observed variables and the current estimate of $\theta$
◮ M step: maximize the likelihood function under the assumption that the latent variables are known (they are "filled in" with their expected values)
Expectation Maximization algorithm
In the case of HMMs:
◮ $\hat{T}_{ij} = \dfrac{\sum_{t=0}^{T-1} \xi(s_t^i, s_{t+1}^j)}{\sum_{t=0}^{T-1} \gamma(s_t^i)}$
◮ $\hat{\eta}_{ij} = \dfrac{\sum_{t=0}^{T} \gamma(s_t^i)\, y_t^j}{\sum_{t=0}^{T} \gamma(s_t^i)}$
◮ $\hat{\pi}_i = \dfrac{\alpha(s_0^i)\,\beta(s_0^i)}{\sum_{s_0} \alpha(s_0)\,\beta(s_0)}$
where $\xi$ and $\gamma$ are simple functions of $\alpha$ and $\beta$. Time complexity: $O(N^2 T)$ operations.
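A sketch of one Baum-Welch iteration built on the `forward_backward` function above; only the $\hat{T}$ and $\hat{\pi}$ updates are shown, since the $\hat{\eta}$ update depends on the chosen observation model:

```python
import numpy as np

def em_step(pi, T, O, obs):
    """One Baum-Welch iteration for the discrete HMM sketched above."""
    alpha, beta, gamma = forward_backward(pi, T, O, obs)
    Tlen, N = len(obs), len(pi)
    # xi[t, i, j] = P(s_t = i, s_{t+1} = j | y_{0:T})
    xi = np.zeros((Tlen - 1, N, N))
    for t in range(Tlen - 1):
        x = alpha[t][:, None] * T * (O[:, obs[t + 1]] * beta[t + 1])[None, :]
        xi[t] = x / x.sum()
    # hat{T}_{ij} = sum_t xi(s_t^i, s_{t+1}^j) / sum_t gamma(s_t^i)
    T_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    # hat{pi}_i = gamma(s_0^i) = alpha(s_0^i) beta(s_0^i) / sum_{s_0} alpha beta
    pi_new = gamma[0]
    return pi_new, T_new
```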
Modeling framework and exact inference
Modeling framework
◮ System modeled using a Hidden Markov Model
◮ $L$ links

Hidden variables
◮ Link $l$ has a discrete state $S_t^l \in \{1, \ldots, K\}$
◮ State of the entire system: $S_t = (S_t^1, \ldots, S_t^L)$, so $N = K^L$ possible states
◮ Markov process: $P(S_{t+1} \mid S_0, \ldots, S_t) = P(S_{t+1} \mid S_t)$

Observed variables
◮ We observe travel times: random variables whose distributions depend on the states of the links
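For illustration, a small sketch of how the joint state space can be indexed; it makes the $N = K^L$ blow-up tangible (the names and sizes are hypothetical):

```python
K, L = 3, 4  # illustrative: 3 congestion states per link, 4 links
N = K ** L   # number of joint states, exponential in L

def state_to_index(link_states):
    """Map (S^1, ..., S^L), each in {0, ..., K-1}, to an index in {0, ..., K^L - 1}."""
    idx = 0
    for s in link_states:
        idx = idx * K + s
    return idx

def index_to_state(idx):
    """Inverse mapping: recover the per-link states from the joint index."""
    link_states = []
    for _ in range(L):
        idx, s = divmod(idx, K)
        link_states.append(s)
    return tuple(reversed(link_states))

assert index_to_state(state_to_index((2, 0, 1, 2))) == (2, 0, 1, 2)
print(N)  # already 81 joint states, for only 4 links
```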
HMM
[Figure: graphical model of the HMM — hidden states $s_t$ linked in a Markov chain, each emitting an observation $y_t$]
Parametrization of the HMM
Transition model
$$T_t(s^i \to s^j) \triangleq P(s_{t+1}^j \mid s_t^i)$$
Transition matrix of size $K^L \times K^L$.
Parametrization of the HMM
Observation model
Probability of observing response $y = (l, x_i, x_f, \delta)$ given state $s$ at time $t$:
$$O_t(s \to y) \triangleq P(y_t \mid s_t) = g_t^{l,s}(\delta) \times \int_{x_i}^{x_f} \rho_t^l(x)\, dx$$
◮ $g_t^{l,s}$: distribution of the total travel time on link $l$ in state $s$
◮ $\rho_t^l$: probability distribution of vehicle locations (results from traffic assumptions)

Assumptions
Processes are time invariant during one-hour time slices.
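A sketch of this observation likelihood, assuming for illustration a Gaussian travel-time distribution $g$ and a numerical integral of the location density $\rho$; these functional forms are placeholders, not the project's actual models:

```python
import numpy as np
from scipy.stats import norm

def observation_likelihood(y, g_mean, g_std, rho):
    """O_t(s -> y) = g(delta) * integral_{x_i}^{x_f} rho(x) dx
    y = (l, x_i, x_f, delta): link, start/end offsets, travel time.
    g is assumed Gaussian here purely for illustration."""
    l, x_i, x_f, delta = y
    g_delta = norm.pdf(delta, loc=g_mean, scale=g_std)
    # numerical integration of the location density over [x_i, x_f]
    xs = np.linspace(x_i, x_f, 200)
    mass = np.trapz(rho(xs), xs)
    return g_delta * mass

# usage: uniform location density on a 100 m link, state with mean travel time 12 s
rho = lambda x: np.ones_like(x) / 100.0
print(observation_likelihood((0, 20.0, 60.0, 11.5),
                             g_mean=12.0, g_std=3.0, rho=rho))
```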
Travel time estimation
◮ Estimate the state of the system
◮ Estimate the parameters of the models (observation)
◮ Update the estimation when new responses are observed

Belief state
$$p_t(s) \triangleq P(s_t \mid y_{0:t})$$
A probability distribution over possible states.
Travel time estimation
Bayesian tracking of the belief state: forward-backward propagation ($O(N^2 T)$ time). Each update can be done in $O(N^2)$:
$$p_t \xrightarrow{\;T[\cdot]\;} q_{t+1} \xrightarrow{\;O_y[\cdot]\;} p_{t+1}$$

Parameter estimation of the model
◮ Update the parameters of the probability distribution of vehicle locations: solve
$$\max \sum_{x \in X_t^l} \ln \rho_t^l(x)$$
where $X_t^l$ are the observed vehicle locations.
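A sketch of one $O(N^2)$ tracking step (predict through $T$, then condition on the observation likelihood and renormalize); the array names are hypothetical:

```python
import numpy as np

def belief_update(p_t, T, obs_lik):
    """One step of Bayesian tracking:  p_t -> q_{t+1} -> p_{t+1}.
    p_t:     belief state, shape (N,)
    T:       transition matrix, T[i, j] = P(s_{t+1} = j | s_t = i)
    obs_lik: obs_lik[j] = P(y_{t+1} | s_{t+1} = j)."""
    q = p_t @ T          # prediction step T[.]  -- the O(N^2) part
    p = q * obs_lik      # conditioning step O_y[.]
    return p / p.sum()   # normalize back to a distribution
```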
Parameter estimation of the model
◮ Update the transition matrix: EM algorithm in $O(N^2 T)$ operations
◮ Exact inference and parameter estimation are thus done in $O(N^2 T) = O(K^{2L} T)$ time
Computational intractability
Exact inference and the EM algorithm are not tractable: the size of the belief state and of the transition matrix is exponential in the size of the network, and the EM algorithm takes time exponential in $L$.
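To make the blow-up concrete (illustrative numbers, not from the talk): with $K = 4$ states per link and a network of $L = 50$ links,
$$N = K^L = 4^{50} \approx 1.3 \times 10^{30}, \qquad N^2 = K^{2L} = 4^{100} \approx 1.6 \times 10^{60},$$
so even storing the belief state, let alone the transition matrix, is out of reach. This motivates the approximate inference (Boyen-Koller) approach announced in the overview.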