A Partition-Based First-Order Probabilistic Logic to Represent Interactive Beliefs
Alessandro Panella and Piotr Gmytrasiewicz
Fifth International Conference on Scalable Uncertainty Management, Dayton, OH, October 10, 2011
Panella and Gmytrasiewicz (CS Dept. - UIC) A FOPL for Interactive Beliefs October 10, 2011 1 / 18
Outline
1. Quick Look
2. Introduction
   - The Problem
   - Related Work
3. Proposed Formalization
   - 0-th Level Beliefs
   - 1st Level Beliefs
   - n-th Level Beliefs
4. Conclusion
5. Bibliography
Quick Look
Contribution: formalization of a theoretical framework for compactly representing interactive beliefs. It draws on:
- Probability theory
- (First-order) logic
- Maximum entropy
Main idea: recursive partitioning of the belief simplices.
[Figure: panels (a) to (d)]
Introduction / The Problem: Stochastic Planning
The need for compact representations:
- Use of first-order logic:
  - Describe sets of states
  - Capture regularities
Introduction / The Problem: Stochastic Planning, An Example
- n × n grid world
- Actions: UP, DOWN, LEFT, RIGHT
- Probabilistic transition function: for every location, P(succeed) = 0.9
[Figure: grid world illustrating the actions GO RIGHT and GO DOWN]
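The grid-world slide states a single rule that holds for every location, which is what makes the representation compact. A minimal Python sketch of such a rule follows. The slide does not say what happens when a move fails or would leave the grid; the sketch assumes the agent stays in place in both cases, and the names `transition_distribution` and `sample_next` are hypothetical.

```python
import random

# Assumption (not stated on the slide): a failed move leaves the agent in
# place, and a move that would leave the grid is treated as a no-op.
MOVES = {"UP": (0, 1), "DOWN": (0, -1), "LEFT": (-1, 0), "RIGHT": (1, 0)}
P_SUCCEED = 0.9  # from the slide: the same value for every location


def transition_distribution(pos, action, n):
    """Return {next_position: probability} for one action on an n x n grid."""
    x, y = pos
    dx, dy = MOVES[action]
    target = (x + dx, y + dy)
    if not (0 <= target[0] < n and 0 <= target[1] < n):
        return {pos: 1.0}  # move blocked by the border: stay put (assumed)
    return {target: P_SUCCEED, pos: 1.0 - P_SUCCEED}


def sample_next(pos, action, n, rng=random):
    """Sample a successor position from the distribution above."""
    dist = transition_distribution(pos, action, n)
    positions = list(dist)
    return rng.choices(positions, weights=[dist[p] for p in positions])[0]
```

One function covers all n² locations, whereas a flat transition table would need an entry per state and action.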
Introduction / The Problem: Interactive Settings
Representation needs are even more stringent.
Introduction / The Problem: (Finitely Nested) Interactive POMDPs
Level-n belief: $b_{i,n} \in \Delta(IS_{i,n})$, where
$IS_{i,0} = S$
$IS_{i,1} = S \times \Delta(IS_{j,0})$
$\vdots$
$IS_{i,n} = S \times \Delta(IS_{j,n-1})$
The value function of (I-)POMDPs is piecewise linear and convex, and it divides the simplex into behavior-equivalent partitions.
[Figure: belief simplex plotted on axes s_1 and s_2, with labeled points (1, 0), (0, 0), and (0, 1); from Kaelbling et al. (1998)]
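To illustrate how a piecewise linear and convex value function splits the simplex into behavior-equivalent regions, here is a small sketch for a two-state problem. It uses the standard representation of such a function as a maximum over linear pieces (alpha vectors). The three vectors and the plan names are made up for illustration and do not come from the slides.

```python
# Hypothetical alpha vectors for a two-state problem; each one gives the value
# of following one plan as a linear function of the belief.
ALPHA_VECTORS = {
    "plan_A": (1.0, 0.0),
    "plan_B": (0.6, 0.6),
    "plan_C": (0.0, 1.0),
}


def value_and_plan(belief):
    """Return (V(b), best plan), where V(b) = max over alpha of alpha . b."""
    scores = {
        name: sum(a * b for a, b in zip(alpha, belief))
        for name, alpha in ALPHA_VECTORS.items()
    }
    best = max(scores, key=scores.get)
    return scores[best], best


def behavior_partitions(steps=20):
    """Group grid points b = (p, 1 - p) of the simplex by their best plan."""
    regions = {}
    for k in range(steps + 1):
        p = k / steps
        _, plan = value_and_plan((p, 1.0 - p))
        regions.setdefault(plan, []).append(p)
    return regions
```

All beliefs inside one region prescribe the same plan, so under these assumptions a model of the other agent only needs to know which region that agent's belief falls in, not the exact belief.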
Introduction / Related Work
First-order probabilistic languages:
- Seminal theoretical work (Nilsson, 1986; Halpern, 1989)
- Recent practical approaches: BLOG (Milch et al., 2005), Markov Logic (Richardson and Domingos, 2006), ...
Relational stochastic planning:
- Relational MDPs (Boutilier et al., 2001; Sanner and Boutilier, 2009)
- Relational POMDPs (Sanner and Kersting, 2010; Wang and Khardon, 2010)
Belief hierarchies:
- Extensive treatment in game theory, starting from Bayesian games (Harsanyi, 1967; Aumann, 1999)
- Probabilistic modal logics (Fagin and Halpern, 1994; Shirazi and Amir, 2008)
- Interactive POMDPs (Gmytrasiewicz and Doshi, 2005)
Proposed Formalization / 0-th Level Beliefs: Grid World Example
- n × n grid
- Agent i tagging a moving target j
- Uncertainty about the target's position: predicate jPos(x, y)
- Auxiliary deterministic predicates:
  geq(x, k) ≡ x ≥ k
  leq(x, k) ≡ x ≤ k
[Figure: grid with both axes labeled 0 to 6]
Proposed Formalization / 0-th Level Beliefs: Level-0 Belief Base
i's belief about the state of the world:
$$B_{i,0} = \{ \langle \phi_1, \alpha_1 \rangle, \ldots, \langle \phi_m, \alpha_m \rangle \}$$
- The $\phi_k$'s are arbitrary sentences in predicate logic, and $\alpha_k \in [0, 1]$;
- The $\psi$'s are the induced partitions, with probability space $(\Psi, 2^{\Psi}, p_{i,0})$.
- The belief base is only a partial specification of the distribution.
- To obtain a unique distribution, use maximum entropy (max-ent):
$$\max_{p_{i,0}} \Big( - \sum_{\psi \in \Psi_B} p_{i,0}(S(\psi)) \log p_{i,0}(S(\psi)) \Big)$$
[Figure: state space S with sentences φ_0 and φ_1 inducing partitions ψ_0, ψ_1, ψ_2]
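The role of max-ent is easiest to see when the belief base does not pin the distribution down. The sketch below assumes a hypothetical base with two sentences, P(φ_0) = 0.8 and P(φ_1) = 0.5, in which all four truth-value combinations are satisfiable. This differs from the grid-world example on the next slide, where only three partitions exist. That leaves one free parameter, t = P(φ_0 ∧ φ_1), and the code picks the value of t that maximizes entropy by a plain grid search. The function names are illustrative, not the authors' implementation.

```python
import math


def entropy(p):
    """Shannon entropy, with the convention 0 * log 0 = 0."""
    return -sum(q * math.log(q) for q in p if q > 0)


def partition_distribution(t, a0, a1):
    """Distribution over the four partitions, given t = P(phi0 and phi1).

    Order: (phi0 & phi1, phi0 & ~phi1, ~phi0 & phi1, ~phi0 & ~phi1).
    """
    return (t, a0 - t, a1 - t, 1.0 - a0 - a1 + t)


def max_ent(a0, a1, steps=10_000):
    """Grid search over the single free parameter t for the max-ent solution."""
    lo = max(0.0, a0 + a1 - 1.0)  # keeps the last entry non-negative
    hi = min(a0, a1)              # keeps the two middle entries non-negative
    best_t = max(
        (lo + (hi - lo) * k / steps for k in range(steps + 1)),
        key=lambda t: entropy(partition_distribution(t, a0, a1)),
    )
    return partition_distribution(best_t, a0, a1)
```

For `max_ent(0.8, 0.5)` the search lands near t = 0.4, giving approximately (0.4, 0.4, 0.1, 0.1). Setting the derivative of the entropy with respect to t to zero gives t = 0.8 × 0.5, so with only these two marginal constraints max-ent treats the sentences as independent.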
Proposed Formalization / 0-th Level Beliefs: Grid World Example (cont'd)
Assume the agent is interested in the horizontal position of the target w.r.t. the center:
$$B_{i,0} = \{ \langle \exists x, y\, (jPos(x, y) \wedge leq(x, \lfloor n/2 \rfloor)),\ 0.8 \rangle,\ \langle \exists x, y\, (jPos(x, y) \wedge geq(x, \lfloor n/2 \rfloor)),\ 0.5 \rangle \}$$
[Figure: state space S with sentences φ_0 and φ_1 inducing partitions ψ_0, ψ_1, ψ_2]
In this case, the distribution is unique: $p_{i,0} = (0.5, 0.3, 0.2)$.
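The numbers on this slide can be checked with a short sketch. It assumes n = 7 (the grid figure's axes run from 0 to 6) and that the target occupies exactly one cell, so the two sentences split the columns into mutually exclusive, exhaustive partitions. The ordering ψ_0 = left of center, ψ_1 = center column, ψ_2 = right of center is inferred from the stated result, not given explicitly on the slide.

```python
from fractions import Fraction

N = 7            # assumed grid size (the figure's axes run from 0 to 6)
CENTER = N // 2  # floor(n / 2)


def phi0(x):
    """leq(x, floor(n/2)) applied to the target's column x."""
    return x <= CENTER


def phi1(x):
    """geq(x, floor(n/2)) applied to the target's column x."""
    return x >= CENTER


def induced_partitions(columns):
    """Group columns by the truth values they give to (phi0, phi1)."""
    parts = {}
    for x in columns:
        parts.setdefault((phi0(x), phi1(x)), []).append(x)
    return parts


def solve_distribution(a0, a1):
    """Solve p0 + p1 = a0, p1 + p2 = a1, p0 + p1 + p2 = 1 exactly."""
    p1 = a0 + a1 - 1
    return (a0 - p1, p1, a1 - p1)


parts = induced_partitions(range(N))
p = solve_distribution(Fraction(8, 10), Fraction(5, 10))
```

Only three signatures occur, because no column is both strictly left and strictly right of the center. With three unknowns and three linear equations, the constraints alone fix p = (1/2, 3/10, 1/5), and max-ent has nothing left to choose.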
Proposed Formalization / 1st Level Beliefs: Level-1 Belief Base
i's belief about j's belief:
$$B_{i,1} = \{ \langle \phi^{j,0}_1, \alpha_1 \rangle, \ldots, \langle \phi^{j,0}_m, \alpha_m \rangle \} \qquad (1)$$
- $\phi^{j,0}_k$ is of the form $P_j(\phi) \mathbin{\triangle} \beta$, with $\triangle \in \{<, \le, =, \ge, >\}$ and $\beta \in [0, 1]$;
- The sentences $\phi^{j,0}_k$ induce a partitioning on j's level-0 belief simplex.
[Figure, panels (a) to (c); recoverable labels: $S$, $\Psi_B$, $\Delta(\Psi_B)$, $\Psi^{j,0}_B$, $p_{i,1}$]
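One way to picture a level-1 sentence is as a test applied to a point of the other agent's level-0 simplex. The sketch below is a hypothetical encoding, not the authors' data structure: φ is represented by the set of level-0 partitions on which it holds, and the class name `BeliefSentence` and helper `cell` are illustrative.

```python
import operator
from dataclasses import dataclass
from fractions import Fraction

COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}


@dataclass(frozen=True)
class BeliefSentence:
    """A sentence P_j(phi) <cmp> beta about the other agent's level-0 belief."""

    partitions: frozenset  # indices of the level-0 partitions where phi holds
    cmp: str               # one of the keys of COMPARATORS
    beta: Fraction

    def holds(self, belief):
        """Evaluate the sentence at one point of j's level-0 simplex."""
        prob_phi = sum(belief[k] for k in self.partitions)
        return COMPARATORS[self.cmp](prob_phi, self.beta)


def cell(sentences, belief):
    """Truth-value signature identifying the simplex cell that contains belief."""
    return tuple(s.holds(belief) for s in sentences)
```

Two beliefs with the same signature lie in the same cell of the partitioning, so under this encoding a level-1 distribution assigns probability to signatures rather than to individual points of the simplex.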
Proposed Formalization / 1st Level Beliefs: Grid World Example (cont'd)
Target j models agent i's beliefs about j's position:
$$B_{j,1} = \{ \langle P_i(\phi_0) \ge 0.4,\ 0.4 \rangle,\ \langle P_i(\phi_1) > 0.5,\ 0.7 \rangle \}$$
[Figure: left, the state of the world S with sentences φ_0, φ_1 and partitions ψ_0, ψ_1, ψ_2; right, i's level-0 simplex with vertices ψ_0, ψ_1, ψ_2, cut by the lines p_i(φ_0) = 0.4 and p_i(φ_1) = 0.5 into cells ψ^{i,0}_0, ψ^{i,0}_1, ψ^{i,0}_2 induced by φ^{i,0}_0 and φ^{i,0}_1]
Unique consistent distribution: $p_{j,1} = (0.3, 0.1, 0.6)$.
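The level-1 result can be reproduced in two steps: find which cells of i's level-0 simplex are non-empty, then solve for their probabilities. The sketch below assumes, as in the level-0 example, that P_i(φ_0) = p_0 + p_1 and P_i(φ_1) = p_1 + p_2. The cell ordering (first sentence only, both, second sentence only) is inferred from the stated result. The grid scan is a sanity check, not a proof.

```python
from fractions import Fraction
from itertools import product

STEPS = 20  # resolution of the grid laid over i's level-0 simplex


def signature(p):
    """Truth values of (P_i(phi0) >= 0.4, P_i(phi1) > 0.5) at belief p."""
    p0, p1, p2 = p
    return (p0 + p1 >= Fraction(2, 5), p1 + p2 > Fraction(1, 2))


def nonempty_cells():
    """Collect the signatures that occur somewhere on the gridded simplex."""
    cells = set()
    for a, b in product(range(STEPS + 1), repeat=2):
        if a + b <= STEPS:
            p = (
                Fraction(a, STEPS),
                Fraction(b, STEPS),
                Fraction(STEPS - a - b, STEPS),
            )
            cells.add(signature(p))
    return cells


def solve(a0, a1):
    """Solve q0 + q1 = a0, q1 + q2 = a1, q0 + q1 + q2 = 1 exactly."""
    q1 = a0 + a1 - 1
    return (a0 - q1, q1, a1 - q1)


cells = nonempty_cells()
q = solve(Fraction(2, 5), Fraction(7, 10))
```

The signature (False, False) never occurs: if p_0 + p_1 < 0.4 then p_2 > 0.6, which forces p_1 + p_2 > 0.5. Three cells remain, and the two constraints plus normalization give q = (3/10, 1/10, 3/5), matching the distribution on the slide.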