Artificial Intelligence in Robotics, Lecture 13: Patrolling (PowerPoint presentation)


  1. Artificial Intelligence in Robotics, Lecture 13: Patrolling. Viliam Lisý, Artificial Intelligence Center, Department of Computer Science, Faculty of Electrical Eng., Czech Technical University in Prague

  2. Mathematical programming. LP. MILP: some of the variables are integer; the objective and constraints are still linear. Convex program: optimize a convex function over a convex set. Non-convex program.
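The LP/MILP distinction above can be illustrated with a tiny two-variable program (the constraints and objective are hypothetical, not from the lecture). A minimal sketch: the LP optimum lies at a vertex of the feasible polytope, so for two variables we can enumerate constraint intersections; restricting the variables to integers (MILP) can only lower the optimum.

```python
from itertools import combinations

# Feasible region: 6x + 4y <= 24, x + 2y <= 6, x >= 0, y >= 0,
# written as a*x + b*y <= c.
cons = [(6, 4, 24), (1, 2, 6), (-1, 0, 0), (0, -1, 0)]

def feasible(x, y, eps=1e-9):
    return all(a * x + b * y <= c + eps for a, b, c in cons)

def lp_max(obj):
    """LP: a linear objective attains its optimum at a vertex of the
    polytope, so enumerate intersections of constraint pairs."""
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            val = obj[0] * x + obj[1] * y
            if best is None or val > best[0]:
                best = (val, x, y)
    return best

def milp_max(obj, bound=10):
    """MILP: same constraints, x and y integer -- brute force on a small grid."""
    best = None
    for x in range(bound + 1):
        for y in range(bound + 1):
            if feasible(x, y):
                val = obj[0] * x + obj[1] * y
                if best is None or val > best[0]:
                    best = (val, x, y)
    return best

print(lp_max((5, 4)))   # LP relaxation: 21.0 at (3.0, 1.5)
print(milp_max((5, 4))) # integer optimum: 20 at (4, 0)
```

The fractional LP vertex (3, 1.5) is infeasible for the MILP, which is exactly why integer variables make the problem harder.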

  3. Task Taxonomy. Robin, C., & Lacroix, S. (2016). Multi-robot target detection and tracking: taxonomy and survey. Autonomous Robots, 40(4), 729–760.

  4. Resource allocation games. Developed by the team of Prof. M. Tambe at USC (2008–now). In daily use by various organizations and security agencies.

  5. Resource allocation games.
  Target:            1     2     3     4     5     6     7     8
  Unprotected:      10    11     9    15    11    15    14     6
  Protected:         5     4     5     7     6     5     7     3
  Optimal coverage:  0  0.14     0  0.62  0.20  0.49  0.56     0

  6. Resource allocation games. Set of targets T = {t_1, …, t_n}; limited (homogeneous) security resources m ∈ ℕ; each resource can fully protect (cover) a single target; the attacker attacks a single target. Attacker's utility for a covered/uncovered attack: U_a^c(t) < U_a^u(t). Defender's utility for a covered/uncovered attack: U_d^c(t) > U_d^u(t).
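Given a coverage vector, the model above determines the attacker's best response and the defender's value directly. A small sketch with hypothetical payoffs (the numbers are not from the slides); the tie-breaking rule, attacker breaking ties in the defender's favor, is the usual strong-Stackelberg assumption:

```python
# Hypothetical payoffs for a 3-target instance:
# U_a^u > U_a^c (attacker prefers uncovered targets), U_d^c > U_d^u.
U_d_c = [ 2,  1,  3]   # defender utility, covered attack
U_d_u = [-5, -4, -6]   # defender utility, uncovered attack
U_a_c = [-3, -2, -4]   # attacker utility, covered attack
U_a_u = [ 4,  5,  6]   # attacker utility, uncovered attack

def expected(c, U_cov, U_unc, t):
    """Expected utility of an attack on target t under coverage vector c."""
    return c[t] * U_cov[t] + (1 - c[t]) * U_unc[t]

def attacker_best_response(c):
    """The attacker attacks a target maximizing his own expected utility;
    ties are broken in the defender's favor."""
    best = max(expected(c, U_a_c, U_a_u, t) for t in range(len(c)))
    tied = [t for t in range(len(c))
            if abs(expected(c, U_a_c, U_a_u, t) - best) < 1e-9]
    return max(tied, key=lambda t: expected(c, U_d_c, U_d_u, t))

c = [0.5, 0.3, 0.7]              # a coverage vector using ~1.5 resources
t = attacker_best_response(c)
print(t, expected(c, U_d_c, U_d_u, t))
```

Raising coverage on the attacked target pushes the attacker toward other targets, which is the trade-off the optimal coverage vector balances.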

  7. Stackelberg equilibrium. The leader (l) publicly commits to a strategy; the follower (f) plays a best response to the leader: argmax_{σ_l ∈ Δ_l, σ_f ∈ BR_f(σ_l)} u_l(σ_l, σ_f). Example (row player is the leader, payoffs are (leader, follower)): U: (4,2) against L, (6,1) against R; D: (3,1) against L, (5,2) against R. Why? The defender needs to commit in practice (laws, regulations, etc.), and commitment may lead to better expected utility.
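The claim that commitment may improve the leader's expected utility can be checked on the 2x2 example by a grid search over the leader's mixed commitments; the follower best-responds, with ties broken in the leader's favor (the strong Stackelberg convention). This is a brute-force sketch, not the algorithm from the lecture:

```python
# The 2x2 example: leader rows (U, D), follower columns (L, R),
# payoff tables indexed by (row, column).
leader   = {('U', 'L'): 4, ('U', 'R'): 6, ('D', 'L'): 3, ('D', 'R'): 5}
follower = {('U', 'L'): 2, ('U', 'R'): 1, ('D', 'L'): 1, ('D', 'R'): 2}

def leader_value(p, col):
    """Leader's expected payoff when committing to U with probability p."""
    return p * leader[('U', col)] + (1 - p) * leader[('D', col)]

def follower_value(p, col):
    return p * follower[('U', col)] + (1 - p) * follower[('D', col)]

def best_commitment(steps=1000):
    """Search over commitments p = Pr(U); the follower best-responds,
    breaking ties in the leader's favor."""
    best = None
    for i in range(steps + 1):
        p = i / steps
        fv = max(follower_value(p, c) for c in ('L', 'R'))
        br = [c for c in ('L', 'R') if abs(follower_value(p, c) - fv) < 1e-9]
        val = max(leader_value(p, c) for c in br)
        if best is None or val > best[0]:
            best = (val, p)
    return best

val, p = best_commitment()
print(val, p)   # 5.5 at p = 0.5
```

The pure Nash equilibrium of this game is (U, L) with leader payoff 4, so committing to the mixed strategy p = 0.5 (making the follower indifferent and steering him to R) raises the leader's value to 5.5.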

  8. Solving resource allocation games. Kiekintveld, et al.: Computing Optimal Randomized Resource Allocations for Massive Security Games, AAMAS 2009. Only the coverage vector c (the marginal protection probability c_t of each target) matters; Z is a sufficiently large constant (big-M) in the MILP.

  9. Sampling the coverage vector c. [Figure: bar chart of the coverage probabilities (0 to 1) of targets t1–t6.]
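One standard way to turn a coverage vector into an actual deployment is the "comb" construction the figure suggests: stack the bars of c end to end and cut them with m teeth spaced exactly one apart at a random offset, so each tooth selects one covered target. A sketch, assuming every c_i ≤ 1 and sum(c) is an integer m; the example vector is illustrative (close to the slide-5 strategy, adjusted to sum exactly to 2):

```python
import random

def comb_sample(c, rng=random):
    """Sample a set of m covered targets whose marginals equal c.
    Assumes 0 <= c[i] <= 1 and sum(c) = m (an integer): bars of c are laid
    end to end on [0, m] and cut with m comb teeth spaced exactly 1 apart
    at a uniformly random offset; each tooth lands in a distinct bar."""
    m = round(sum(c))
    offset = rng.random()
    covered, acc, i = set(), 0.0, 0
    for tooth in (offset + k for k in range(m)):
        # advance to the bar containing this tooth
        while i + 1 < len(c) and acc + c[i] <= tooth:
            acc += c[i]
            i += 1
        covered.add(i)
    return covered

# Empirically, the coverage frequency of each target approaches c[i].
rng = random.Random(0)
c = [0.0, 0.14, 0.0, 0.62, 0.2, 0.49, 0.55, 0.0]   # sums to 2 -> m = 2
counts = [0] * len(c)
N = 20000
for _ in range(N):
    for t in comb_sample(c, rng):
        counts[t] += 1
print([round(n / N, 2) for n in counts])
```

Because the teeth are exactly one apart and each bar has width at most one, every draw covers exactly m distinct targets while matching the marginals, which is what makes the coverage vector a sufficient representation of the defender's mixed strategy.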

  10. Scalability. 25 resources, 3000 targets ⇒ about 5 × 10^61 defender actions: no chance for a matrix-game representation. The algorithm explained above is ERASER.
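The count above is the number of ways to choose which 25 targets the resources cover, C(3000, 25), which is easy to check directly:

```python
import math

# Each pure defender strategy covers a distinct set of 25 of the 3000 targets.
actions = math.comb(3000, 25)
print(f"{actions:.2e}")   # about 5e61 -- far too many to enumerate a payoff matrix
```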

  11. Studied extensions. Complex structured defender strategies; probabilistically failing actions; attacker types; resource types and teams; boundedly rational attackers.

  12. Resource allocation (security) games. Advantages: wide existing literature (many variations); good scalability; real-world deployments. Limitation: the attacker cannot react to observations (e.g., the defender's position).

  13. Perimeter patrolling. Agmon et al.: Multi-Robot Adversarial Patrolling: Facing a Full-Knowledge Opponent. JAIR 2011. The attacker can see the patrol!

  14. Perimeter patrolling. Polygon Q, perimeter split into N segments. The defender has k > 1 homogeneous robots; each robot moves 1 segment per time step and needs τ time steps to turn to the opposite direction. The attacker can wait infinitely long and sees everything; it chooses a segment where to attack and requires t time steps to penetrate.

  15. Interesting parameter settings. Let d = N/k be the distance between equidistant robots. There is a perfect deterministic patrol strategy if t ≥ d: the robots can just continue in one direction. What about d = (5/4)t? The attacker can guarantee success if t + 1 < d − (t − τ), i.e., if t < (d + τ − 1)/2.
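The two conditions above translate directly into code. A small sketch in the slide's notation (N segments, k robots, penetration time t, turn time τ); the inequalities follow the reconstruction above and the example numbers are hypothetical:

```python
def perfect_deterministic_patrol(N, k, t):
    """k equidistant robots moving in one direction revisit every segment
    every d = N/k steps; this is a perfect patrol iff d <= t."""
    return N / k <= t

def attacker_wins(d, t, tau):
    """The attacker can guarantee a successful penetration if
    t + 1 < d - (t - tau), i.e. t < (d + tau - 1) / 2."""
    return t + 1 < d - (t - tau)

print(perfect_deterministic_patrol(N=20, k=5, t=4))   # d = 4 <= t: True
print(attacker_wins(d=12, t=4, tau=1))                # 5 < 9: True
print(attacker_wins(d=5, t=4, tau=1))                 # 5 < 2: False
```

In the middle regime, where the deterministic patrol is not perfect (d > t) but the attacker cannot guarantee success either, a randomized strategy is needed, which is what the next slides develop.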

  16. Optimal patrolling strategy. Class of strategies: continue with probability p, else turn around. Theorem: in the optimal strategy, all robots are equidistant and face in the same direction. Proof sketch: (1) the probability of visiting the worst-case segment between two robots decreases as the distance between them increases; (2) making a move in different directions increases the distance.

  17. Probability of penetration. For simplicity assume τ = 1. The probability of visiting segment s_i at least once in the next t steps equals the probability of reaching the absorbing end state from s_i, as the sum over each direction of approach computed separately.

  18. Probability of penetration. All computations are symbolic; the results are functions ppd_i : [0,1] → [0,1] (probability of penetration detection at segment s_i as a function of the turn parameter p).

  19. Optimal turn probability. Maximin value over p. Each line represents one segment (ppd_i). Iterate over all pairs of intersection points and maximal points to find the solution: the ppd_i are all polynomials in p.
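The symbolic ppd functions and the maximin search over p can be mimicked numerically. A sketch under stated assumptions: τ = 1 is modeled as a turning robot spending one step reversing direction in place, a single patroller is observed at segment 0, and the cycle size and penetration time are illustrative, not from the lecture:

```python
def ppd(p, N, t, target, start=(0, +1)):
    """Probability the patroller visits `target` within t steps, starting
    from state (segment, direction) on a cycle of N segments.  Each step:
    continue with prob p, or spend the step turning (tau = 1) with 1 - p."""
    probs = {start: 1.0}
    caught = 1.0 if start[0] == target else 0.0
    for _ in range(t):
        nxt = {}
        for (s, d), pr in probs.items():
            for q, state in ((p, ((s + d) % N, d)), (1 - p, (s, -d))):
                if state[0] == target:
                    caught += pr * q        # absorbed: attack detected
                else:
                    nxt[state] = nxt.get(state, 0.0) + pr * q
        probs = nxt
    return caught

def maximin_turn_probability(N=8, t=6, grid=200):
    """Grid search over p for the turn parameter maximizing the
    worst-case detection probability over all attack segments."""
    best = None
    for i in range(grid + 1):
        p = i / grid
        worst = min(ppd(p, N, t, x) for x in range(1, N))
        if best is None or worst > best[0]:
            best = (worst, p)
    return best

worst, p = maximin_turn_probability()
print(round(worst, 3), p)
```

Note that both extremes are useless here: p = 1 never covers the segment just behind the robot, and p = 0 leaves it turning in place, so the maximin optimum is interior, exactly the situation the polynomial intersection analysis on the slide resolves exactly.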

  20. Perimeter patrol – summary. Split the perimeter into segments traversable in unit time. Distribute the patrollers uniformly along the perimeter and coordinate them to always face the same way. Continue with probability p; turn around with probability (1 − p).

  21. Area patrolling. Basilico et al.: Patrolling security games: Definition and algorithms for solving large instances with single patroller and single intruder. AIJ 2012.

  22. Area patrolling – formal model. Environment represented as a graph. Targets T = {6, 8, 12, 14, 18} (vertices of the graph). Penetration times d(t) and target values (v_d(t), v_a(t)). Defender: Markov policy over the graph. Attacker: actions wait and attack(t).

  23. Solving the zero-sum patrolling game. We assume ∀t ∈ T: v_a(t) = v_d(t). a(i, j) = 1 if the patrol can move from i to j in one step, else 0. P_c(t, h) is the probability of stopping an attack at target t started when the patrol was at vertex h. γ_{i,j}^{w,t} is the probability that the patrol reaches vertex j from i in w steps without visiting target t. α_{i,j} is the probability of moving from i to j.
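The quantities defined above fit together as: P_c(t, h) = 1 minus the total mass of length-d(t) walks from h that avoid t, i.e. 1 minus a sum of γ terms, which can be computed by forward propagation over the non-target vertices. A sketch on a hypothetical 4-vertex cycle with a hypothetical Markov policy α (not an instance from the paper):

```python
# alpha[i][j]: probability the patrol moves from vertex i to vertex j.
alpha = [
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
]

def capture_probability(alpha, target, h, d):
    """P_c(target, h): probability the patrol, starting at vertex h, visits
    `target` within d steps -- 1 minus the mass of all length-d walks from h
    that avoid the target (the gamma terms, summed over end vertices)."""
    if h == target:
        return 1.0
    avoid = {h: 1.0}                 # distribution over non-target vertices
    for _ in range(d):
        nxt = {}
        for i, pr in avoid.items():
            for j, a in enumerate(alpha[i]):
                if a > 0 and j != target:
                    nxt[j] = nxt.get(j, 0.0) + pr * a
        avoid = nxt
    return 1.0 - sum(avoid.values())

print(capture_probability(alpha, target=2, h=0, d=2))
```

On this cycle an attack at vertex 2 with penetration time 2, launched when the patrol sits at vertex 0, is stopped with probability 0.5: both one-step moves avoid the target, and each second step reaches it with probability 0.5.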

  24. AI (game-theoretic) problems can often be solved by transformation to mathematical programming.
