Approximation of continuous LMPs [1]
Alexandre Bouchard-Côté
Supervisors: Prakash Panangaden, Doina Precup
Reasoning and Learning Lab, McGill University
Sponsored by: NSERC, McGill School of Computer Science
[1] LMP: Labelled Markov Process
Motivation: example
Continuous system: state space, dynamics
Possible finite-state approximations: “geometric”, better
The approx. scheme [2]
[Figure: level 0, level 1, level 2, level 3; labelled nodes X and (X, 3)]
[2] J. Desharnais, V. Gupta, R. Jagadeesan, P. Panangaden (2002). Approximating Labelled Markov Processes.
[Figure: construction of level k from level k-1; labelled nodes (X, k), (C_i, k-1), (B, k-1)]
m: number of states in the preceding level
Partition P of the range of the probability kernels into intervals A_j of length $\varepsilon/m$
$\inf_{t \in X} \tau_a(t, B)$
Partition of the states: $\{\tau_a^{-1}(\cdot, C_i)(A_j) : A_j \in P\}$
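The refinement step can be sketched on a finite sample of states: two states stay in the same block when, for every label and every block of the preceding level, their kernel values fall in the same interval of length ε/m. This is only an illustration under assumptions not in the slides: the names `refine_partition`, `interval_index` and `example_kernel` are hypothetical, the state space is replaced by a finite grid, and the intervals are taken to be {0}, (0, w], (w, 2w], and so on.

```python
import math
from collections import defaultdict


def interval_index(p, width):
    """Index of the interval containing p: 0 for p == 0, k for p in ((k-1)*width, k*width]."""
    return math.ceil(p / width)


def refine_partition(states, prev_blocks, kernel, labels, eps):
    """Group states whose kernel values into each previous-level block
    land in the same interval of length eps/m, for every label."""
    m = len(prev_blocks)
    width = eps / m
    groups = defaultdict(list)
    for s in states:
        # The signature records, for each (label, block) pair, which interval
        # of the kernel's range the value kernel(a, s, block) falls into.
        signature = tuple(
            interval_index(kernel(a, s, block), width)
            for a in labels
            for block in prev_blocks
        )
        groups[signature].append(s)
    return list(groups.values())


def example_kernel(a, s, block):
    """Hypothetical kernel: from state s, the action is accepted with probability s."""
    return s


# Level 0 has a single block, the whole space "X", so m = 1.
states = [i / 10 for i in range(10)]
blocks = refine_partition(states, ["X"], example_kernel, ["a"], eps=0.5)
```

With ε = 0.5 and m = 1 the range is cut at 0.5, so the grid splits into the state 0.0, the states 0.1 to 0.5, and the states 0.6 to 0.9.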
Implementation difficulties
Infimum of measurable functions: $\inf_{t \in X} \tau_a(t, B)$
Generate partition (check if a set is empty)
Invert a measurable function: $\{\tau_a^{-1}(\cdot, C_i)(A_j)\}$
How to “invert” the kernels
Representation of $\tau_a^{-1}(\cdot, C)((a, b])$
Instance’s variables: $f_a$, $C$, $(a, b]$
Operations: check if $s_0 \in \tau_a^{-1}(\cdot, C)((a, b])$
Output true iff $\int_C f_a(s_0, x)\, d\mu(x) \in (a, b]$
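One way to read this slide is as an object that never enumerates the preimage but only answers membership queries. The sketch below follows that reading under assumptions not in the slides: C is a real interval [c_lo, c_hi], μ is Lebesgue measure, the integral is approximated with a midpoint rule, and the names `KernelPreimage` and `example_density` are hypothetical.

```python
class KernelPreimage:
    """Lazy representation of the set { s : integral over C of f_a(s, x) dx lies in (a, b] }."""

    def __init__(self, density, c_lo, c_hi, a, b, steps=1000):
        self.density = density            # f_a(s, x), the density of the kernel
        self.c_lo, self.c_hi = c_lo, c_hi  # the set C, assumed to be an interval
        self.a, self.b = a, b              # the half-open interval (a, b]
        self.steps = steps                 # resolution of the midpoint rule

    def mass(self, s0):
        """Midpoint-rule approximation of the integral of f_a(s0, .) over C."""
        h = (self.c_hi - self.c_lo) / self.steps
        return h * sum(
            self.density(s0, self.c_lo + (i + 0.5) * h) for i in range(self.steps)
        )

    def __contains__(self, s0):
        """True iff the approximated mass lies in (a, b]."""
        return self.a < self.mass(s0) <= self.b


def example_density(s, x):
    """Hypothetical sub-probability density: constant s on [0, 1], zero elsewhere."""
    return s if 0.0 <= x <= 1.0 else 0.0


# C = [0, 0.5] and (a, b] = (0.2, 0.4]: the mass from s0 is 0.5 * s0.
preimage = KernelPreimage(example_density, 0.0, 0.5, 0.2, 0.4)
inside = 0.6 in preimage    # mass is about 0.3
outside = 0.9 in preimage   # mass is about 0.45
```

Membership tests near the endpoints a and b depend on the quadrature error, so a real implementation would need an error bound on the integral before deciding such cases.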
Infimum
[Figure: graph of $g(x)$ with $\operatorname{ess\,inf} g(x)$ and $\inf g(x)$ marked; annotation: measure zero sets]
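The gap between the two quantities shows up as soon as the infimum is estimated from random samples: a dip of g on a measure zero set is almost surely never sampled, so the estimate tracks the essential infimum rather than the infimum. The example below is a hypothetical illustration, not the method of the slides; `g` and `sampled_infimum` are made-up names and the uniform distribution on [0, 1) is assumed.

```python
import random


def g(x):
    """Equals 1 everywhere except at the single point x = 0.5, where it is 0."""
    return 0.0 if x == 0.5 else 1.0


def sampled_infimum(fn, n_samples, rng):
    """Minimum of fn over n_samples points drawn uniformly from [0, 1)."""
    return min(fn(rng.random()) for _ in range(n_samples))


rng = random.Random(0)
estimate = sampled_infimum(g, 10_000, rng)      # misses the single point 0.5
grid_min = min(g(i / 10) for i in range(11))    # the grid contains 0.5 exactly
```

Here inf g = 0 while ess inf g = 1; the grid happens to hit the exceptional point, whereas the random samples do not.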
Proof of correctness (sketch)
[Figure: diagram with nodes labelled Q (some carrying a tilde) and arrows labelled “bisimulation approximation” and “sampling approximation”]
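ε-homogeneity
$M_1 = (S_1, A, R_1, P_1)$, $M = (S, A, R, P)$
$M_1$ is ε-homogeneous w.r.t. $M$ if there exists a surjection $\Phi : S \to S_1$ such that, for all $s \in S$ and all $a \in A$,
$\sum_{s' \in S_1} \left| P_1(\Phi(s), s', a) - \sum_{t \in \Phi^{-1}(\{s'\})} P(s, t, a) \right| \le \varepsilon$
[Figure: $\Phi$ maps $s \in S$ to $\Phi(s) \in S_1$; $s'$ is a state of $M_1$]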
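For finite MDPs the condition can be checked directly: for each state and action, sum over the abstract states the absolute difference between the abstract transition probability and the lumped concrete probability, then take the worst case. The sketch below assumes transition probabilities stored in dictionaries keyed by (source, target, action), with missing keys meaning probability 0; the function name `homogeneity_gap` and the three-state example are hypothetical.

```python
def homogeneity_gap(states, abstract_states, actions, P, P1, phi):
    """Smallest eps such that the abstract MDP is eps-homogeneous w.r.t. the
    concrete one under the mapping phi (a dict from states to abstract states)."""
    assert {phi[s] for s in states} == set(abstract_states), "phi must be surjective"
    worst = 0.0
    for s in states:
        for a in actions:
            total = 0.0
            for s1 in abstract_states:
                # Probability of moving from s into the block phi^{-1}({s1}).
                lumped = sum(P.get((s, t, a), 0.0) for t in states if phi[t] == s1)
                total += abs(P1.get((phi[s], s1, a), 0.0) - lumped)
            worst = max(worst, total)
    return worst


states = [0, 1, 2]
abstract_states = ["u", "v"]
actions = ["a"]
phi = {0: "u", 1: "u", 2: "v"}

# Concrete model: states 0 and 1 behave alike once lumped together.
P = {(0, 0, "a"): 0.5, (0, 2, "a"): 0.5,
     (1, 1, "a"): 0.5, (1, 2, "a"): 0.5,
     (2, 2, "a"): 1.0}
P1 = {("u", "u", "a"): 0.5, ("u", "v", "a"): 0.5, ("v", "v", "a"): 1.0}

exact_gap = homogeneity_gap(states, abstract_states, actions, P, P1, phi)

# Perturb state 1 so that it no longer matches the abstract state "u".
P_perturbed = dict(P)
P_perturbed[(1, 1, "a")] = 0.25
P_perturbed[(1, 2, "a")] = 0.75
perturbed_gap = homogeneity_gap(states, abstract_states, actions, P_perturbed, P1, phi)
```

In the first model the gap is 0, so the abstraction is 0-homogeneous; after the perturbation state 1 contributes |0.5 - 0.25| + |0.5 - 0.75| = 0.5.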
Link between 0-homogeneity and bisimulation
Let $R \equiv 0$, and let $M_1 = (S_1, A, R, P_1)$ and $M = (S, A, R, P)$ be MDPs (and therefore LMPs). Then $M_1$ is 0-homogeneous w.r.t. $M$ with mapping $\Phi$ iff $\{\Phi^{-1}(\{s'\}) : s' \in S_1\}$ is the set of equivalence classes of a bisimulation equivalence relation on $M$.
Proof idea
Enough: if $s_1, s_2$ are s.t. $\Phi(s_1) = \Phi(s_2) = s$, then they satisfy the same formulas in $L_0$.
Structural induction on $L_0$. As usual, the “hard” step is $\langle a \rangle_q \varphi$.
By induction hypothesis, $[\![\varphi]\!]$ has the form $[\![\varphi]\!] = \bigcup_i \Phi^{-1}(\{s'_i\})$.
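[Figure: $\Phi$ maps $S$ onto $S_1$; $[\![\varphi]\!] = \bigcup_i \Phi^{-1}(\{s'_i\})$]
For each of these $s'_i$, we have, by 0-homogeneity:
$\sum_{t \in \Phi^{-1}(\{s'_i\})} P(s_j, t, a) = P_1(\Phi(s_j), s'_i, a)$ for $j = 1, 2$
$\therefore P(s_j, [\![\varphi]\!], a) = \sum_i \sum_{t \in \Phi^{-1}(\{s'_i\})} P(s_j, t, a) = \sum_i P_1(s, s'_i, a)$ for $j = 1, 2$
$\therefore P(s_1, [\![\varphi]\!], a) = P(s_2, [\![\varphi]\!], a)$
$\therefore s_1 \models \langle a \rangle_q \varphi \Leftrightarrow s_2 \models \langle a \rangle_q \varphi$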