Observation Decoding with Sensor Models: Recognition Tasks via Classical Planning
Diego Aineto, Sergio Jimenez, Eva Onaindia
October 16, 2020
Universitat Politècnica de València
What is decoding?
Decoding: finding the most likely explanation for some evidence.
Basic reasoning tool in:
• "Plan recognition as Planning" (Ramirez and Geffner, 2009).
• "Diagnosis as Planning Revisited" (Sohrabi et al., 2010).
• "Counterplanning using Goal Recognition and Landmarks" (Pozanco et al., 2018).
• "Learning action models with minimal observability" (Aineto et al., 2019).
Contributions:
• Formalization of the decoding problem within a probabilistic framework.
• Extension of decoding to support sensor models.
Motivating example
[Figure: grid world showing the acting agent (transition probabilities 0.25), the observer (sensing probabilities 0.9 / 0.1 and 0 / 1), and the observations o1, o2, o3.]
$O = (\langle loc = (0,2) \rangle, \langle loc = (1,2) \rangle, \langle loc = (4,2) \rangle)$
$\tau_1 = (\langle (\texttt{at x0 y2}) \rangle, \langle (\texttt{at x1 y2}) \rangle, \langle (\texttt{at x2 y2}) \rangle, \langle (\texttt{at x3 y2}) \rangle, \langle (\texttt{at x4 y2}) \rangle)$
$\tau_2 = (\langle (\texttt{at x0 y2}) \rangle, \langle (\texttt{at x1 y2}) \rangle, \langle (\texttt{at x1 y1}) \rangle, \langle (\texttt{at x2 y1}) \rangle, \langle (\texttt{at x3 y1}) \rangle, \langle (\texttt{at x4 y1}) \rangle, \langle (\texttt{at x4 y2}) \rangle)$
Problem Definition
Probabilistic Framework
[Diagram: the planning model $M_p$ generates a trajectory $\tau$ (synthesis); the sensor model $M_s$ maps $\tau$ to the observations $O$ (sensing).]
Sensor Model and Observations
A sensor model $M_s = \langle X, Y, \Phi \rangle$
• $X$ are the state variables.
• $Y$ are the observable variables.
• $\Phi$ is the set of sensing functions $f_i : C_i \times Y_i \to [0,1]$, where the conditions in $C_i$ are
  • exhaustive ($\bigcup_{c \in C_i} S_c = S$), and
  • exclusive ($S_c \cap S_{c'} = \emptyset, \ \forall c, c' \in C_i, \ c \neq c'$).
Blindspots example
Clear tile ($x \geq 2$): $f_{loc}(at_{x,y}, loc = (x,y)) = 0.9$, $f_{loc}(at_{x,y}, loc = \epsilon) = 0.1$
Blindspot tile ($x \leq 1$): $f_{loc}(at_{x,y}, loc = \epsilon) = 1$
An observation $o = \langle Y_1 = w_1, \ldots, Y_{|Y|} = w_{|Y|} \rangle$ is a full assignment of $Y$.
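The blindspots sensing function above can be sketched in Python as follows. This is only an illustration, not the authors' implementation: the name `f_loc`, the `EPSILON` marker for the empty reading, and the encoding of states as `(x, y)` tuples are assumptions made here. Readings other than the true location or the empty reading are given probability 0, which is implied by the listed values summing to 1.

```python
EPSILON = None  # assumed stand-in for the empty reading (loc = epsilon)

def f_loc(state, reading):
    """Probability of perceiving `reading` for loc when the agent is at `state` = (x, y)."""
    x, _ = state
    if x >= 2:  # clear tile
        if reading == state:
            return 0.9
        if reading is EPSILON:
            return 0.1
        return 0.0
    # blindspot tile (x <= 1): only the empty reading is possible
    return 1.0 if reading is EPSILON else 0.0
```

For a fixed state, the probabilities over all readings sum to 1, so each condition defines a proper distribution over the observable variable.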
The Observation Decoding Problem
An observation decoding problem is a triplet $D = \langle M_p, M_s, O \rangle$ where:
• $M_p = \langle X, A \rangle$ is a planning model,
• $M_s = \langle X, Y, \Phi \rangle$ is a sensor model, and
• $O = \langle o_0, o_1, \ldots, o_m \rangle$ is an input observation sequence.
The solution to $D = \langle M_p, M_s, O \rangle$ is the most likely trajectory $\tau^*$, defined as
$$\tau^* = \arg\max_{\tau \in T} P(O, \tau \mid M_p, M_s)$$
Synthesis and Sensing Probabilities
$$\tau^* = \arg\max_{\tau \in T} P(O, \tau \mid M_p, M_s) = \arg\max_{\tau \in T} P(\tau \mid M_p) \, P(O \mid \tau, M_s)$$
Synthesis probability
The probability of generating $\tau$ with $M_p$:
$$P(\tau \mid M_p) = P(s_0) \prod_{i=1}^{|\tau|} P(s_i \mid s_{i-1}, M_p) \quad (1)$$
Sensing probability
The probability of perceiving $O$ from $\tau$:
$$P(O \mid \tau, M_s) = \prod_{i=1}^{|\tau|} P(o_i \mid s_i, M_s) \quad (2)$$
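Equations (1) and (2) can be evaluated together in log space. The sketch below assumes hypothetical callables `p_init`, `p_trans` and `p_sense` standing in for $P(s_0)$, $P(s_i \mid s_{i-1}, M_p)$ and $P(o_i \mid s_i, M_s)$, and it pairs `observations[i - 1]` with `states[i]`, following the index range $i = 1, \ldots, |\tau|$ used in both equations.

```python
import math

def _log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return -math.inf if p == 0 else math.log(p)

def log_joint(states, observations, p_init, p_trans, p_sense):
    """log P(O, tau | M_p, M_s) = log P(tau | M_p) + log P(O | tau, M_s)."""
    total = _log(p_init(states[0]))
    for i in range(1, len(states)):
        total += _log(p_trans(states[i - 1], states[i]))   # term of equation (1)
        total += _log(p_sense(observations[i - 1], states[i]))  # term of equation (2)
    return total
```

A trajectory containing any zero-probability step gets a log-probability of `-inf`, so it can never be selected as the most likely explanation.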
Observation decoding via Classical Planning
Compilation
From probability maximization to cost minimization:
$$\tau^* = \arg\max_{\tau \in T} P(O, \tau \mid M_p, M_s) \;\rightarrow\; \tau^* = \arg\min_{\tau \in T} -\log P(O, \tau \mid M_p, M_s)$$
Compile $D = \langle M_p, M_s, O \rangle$ to a planning problem $P' = \langle F', A', I', G' \rangle$ such that $A' = A_t \cup A_e$, where:
• transition actions $A_t$ are the cost-normalized versions of $A$, and
• sensing actions $A_e$ each process an observation.
If $\pi$ is a solution plan for $P'$, with $\pi_t$ and $\pi_e$ its transition and sensing actions, then:
• $cost(\pi_t) = -\log P(\tau_\pi \mid M_p)$.
• $cost(\pi_e) = -\log P(O \mid \tau_\pi, M_s)$.
• $cost(\pi) = -\log P(O, \tau_\pi \mid M_p, M_s)$.
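The move from maximization to minimization relies on $-\log$ being monotonically decreasing and turning products into sums. The small check below uses made-up per-step probabilities for two hypothetical candidate explanations (`tau_a`, `tau_b`); the helper names are illustrative only.

```python
import math

def action_cost(p):
    """Cost of a step with probability p: -log p (infinite when p == 0)."""
    return math.inf if p == 0 else -math.log(p)

def plan_cost(step_probs):
    """Additive plan cost, equal to -log of the product of the step probabilities."""
    return sum(action_cost(p) for p in step_probs)

# Hypothetical candidates, each given as a list of per-step probabilities.
candidates = {
    "tau_a": [0.25, 0.9, 0.25, 0.9],
    "tau_b": [0.25, 0.1, 0.25, 0.1],
}

most_likely = max(candidates, key=lambda k: math.prod(candidates[k]))
cheapest = min(candidates, key=lambda k: plan_cost(candidates[k]))
```

Both selections pick the same candidate, which is why a cost-optimal plan for the compiled problem corresponds to a most likely trajectory.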
Sensing Actions
$A_e$ contains a $sense_k$ action for each observation $o_k \in O$. The sensing actions:
• Implement an acceptor automaton for trajectories that satisfy the observations.
• Accumulate $-\log P(O \mid \tau, M_s)$.
[Diagram: acceptor automaton with start state $sensed_0$ and states $sensed_1, \ldots, sensed_{K-1}, sensed_K$, connected by the transitions $sense_1, \ldots, sense_K$.]
$\mathrm{guard}(sense_k) := P(o_k \mid s_i, M_s) > 0$
$\mathrm{reset}(sense_k) := x \leftarrow x - \log P(o_k \mid s_i, M_s)$
Sensing Actions
$O = (\langle loc = (0,2) \rangle, \langle loc = (1,2) \rangle, \langle loc = (4,2) \rangle)$
pre($sense_2$): $sensed_1$
eff($sense_2$): $sensed_2 \ \wedge$
  when (at x1 y2): increase total cost by $-\log(0.9)$
  when (not (at x1 y2)): (deadend)
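The behaviour of a sensing action can be sketched as a single step of the acceptor automaton. This is an illustrative reading of the slides, not the compiled PDDL itself: the function `apply_sense`, the integer counter `sensed`, and the callable `p_sense` (standing in for $P(o_k \mid s_i, M_s)$) are assumptions introduced here.

```python
import math

def apply_sense(k, sensed, state, observation, p_sense, total_cost):
    """Apply sense_k. Returns (new_sensed, new_total_cost), or None when the
    action is not applicable or leads to a dead end."""
    if sensed != k - 1:   # precondition: sensed_{k-1} must hold
        return None
    p = p_sense(observation, state)
    if p <= 0:            # guard fails: this state cannot produce the observation
        return None
    # effect: advance the automaton and accumulate -log P(o_k | s_i, M_s)
    return k, total_cost - math.log(p)
```

With the example above, applying `sense_2` in the state (at x1 y2) advances from `sensed_1` to `sensed_2` and adds $-\log(0.9)$ to the cost, while any other state yields a dead end.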
Experimental Evaluation
Setup
Evaluate the effectiveness of using a sensor model for decoding.
• $OD_N$: optimal plan that satisfies the observation.
• $OD_S$: the approach presented here.
Metric: plan diversity [1]
$$\delta_\alpha(\pi_i, \pi_j) = \frac{|S_i - S_j|}{|S_i| + |S_j|} + \frac{|S_j - S_i|}{|S_i| + |S_j|}$$
[1] "Domain independent approaches for finding diverse plans" (Srivastava et al., 2007).
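The diversity metric is straightforward to compute once each plan is represented by its set $S_i$. The sketch below takes any two iterables of hashable items; the function name `plan_diversity` and the convention of returning 0 for two empty sets are assumptions made here.

```python
def plan_diversity(s_i, s_j):
    """delta_alpha: normalized size of the symmetric difference between two sets."""
    s_i, s_j = set(s_i), set(s_j)
    denom = len(s_i) + len(s_j)
    if denom == 0:
        return 0.0  # assumed convention: two empty plans are identical
    return (len(s_i - s_j) + len(s_j - s_i)) / denom
```

The value ranges from 0 for identical sets to 1 for disjoint ones, so lower values in the results mean the decoded plan is closer to the reference plan.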
Results
Domain       H     L     OD_S   OD_N
Blindspots   100   0     0.03   0.18
             80    20    0.08   0.20
             60    40    0.11   0.17
Intrusion    100   0     0      0.58
             80    20    0.07   0.18
             60    40    0.13   0.14
Blocks 2h    100   0     0      0.34
             80    20    0.05   0.27
             60    40    0.07   0.26
Office       100   0     0      0.58
             80    20    0.23   0.38
             60    40    0.16   0.23
H: Observability of the high observability region
L: Observability of the low observability region
OD_S: $\delta_\alpha(\pi, \pi_S)$
OD_N: $\delta_\alpha(\pi, \pi_N)$
Conclusions
• Formalization of the decoding problem within a probabilistic framework.
• Extension of decoding to support sensor models.
• Unifying probabilistic framework (future work).