
Learning to Trick Robots into Cooperative Behavior
Jen Jen Chung
Autonomous Agents and Distributed Intelligence Lab
Oregon State University

UAV Package Delivery
• Increasing interest in delivery drones: UPS, Amazon, etc.
• Dense UAV traffic in cluttered urban environment
• No current framework for large-scale coordination
Jen Jen Chung | Oregon State University DEMUR 2015

A Cross-Section of the Airspace
• Automated UAV traffic management
• Challenges:
  – Narrow thoroughfares of dense traffic
  – Heterogeneous UAVs
  – Dynamic obstacle landscape
• Goals:
  – Minimize conflict occurrences
  – Avoid cascading effects
  – Maintain throughput

Multiagent UAV Traffic Management (UTM)
• Divide airspace into sectors
  – Assign a single UTM agent to manage each sector
• Multiagent team:
  – UTM agents individually learn a policy for assigning sector traversal costs
  – Reward is the total number of conflicts in the global system

A Hierarchical Approach
• Sector agents
  – Define cost of travel in each sector according to current UAV density
• UAVs
  – Sector-level planner: plans across sector cost graph
  – Low-level planner: plans across obstacle map according to sector traversal plan
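The sector-level planning step can be sketched as a shortest-path search over a small sector graph. This is a minimal illustration only: it uses Dijkstra's algorithm (A* with a zero heuristic) rather than the A* planner named later in the deck, and the function name, graph layout, and cost values are all hypothetical.

```python
import heapq

def plan_sector_path(neighbors, sector_cost, start, goal):
    """Cheapest sector sequence, paying sector_cost[s] on entering sector s."""
    frontier = [(0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        cost, sector, path = heapq.heappop(frontier)
        if sector == goal:
            return cost, path
        if cost > best.get(sector, float("inf")):
            continue  # stale queue entry
        for nxt in neighbors[sector]:
            new_cost = cost + sector_cost[nxt]
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt, path + [nxt]))
    return float("inf"), []

# Hypothetical 4-sector airspace; sector 1 is congested, so its agent set a high cost.
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
sector_cost = {0: 1.0, 1: 5.0, 2: 2.0, 3: 1.0}
cost, path = plan_sector_path(neighbors, sector_cost, 0, 3)
print(cost, path)
```

Raising the cost of one sector is enough to reroute the plan through its neighbour, which is the lever the sector agents learn to use.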

UTM Learning Agents
• Learn the cost of travel to apply to UAVs in the sector
• Neural network control
  – Inputs: UAV counts in sector
    § Separated into traffic types, e.g. heading, priority, platform, etc.
  – Outputs: cost of through-sector travel for each traffic type
• Cooperative coevolution to learn NN weights
  – Fitness value: number of conflicts
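A sector agent's policy, as described on this slide, maps per-type UAV counts to per-type traversal costs. The sketch below assumes a single hidden layer of 8 tanh units and sigmoid outputs; the slides do not specify the architecture, and all names here are hypothetical.

```python
import math
import random

def make_policy(n_types, n_hidden, rng):
    """Random weights for a one-hidden-layer network (sizes are assumptions)."""
    def layer(n_out, n_in):
        # One row of weights per output unit; the extra column is the bias.
        return [[rng.gauss(0.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]
    return {"hidden": layer(n_hidden, n_types), "output": layer(n_types, n_hidden)}

def apply_layer(weights, inputs, activation):
    extended = list(inputs) + [1.0]  # constant input for the bias term
    return [activation(sum(w * x for w, x in zip(row, extended))) for row in weights]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sector_costs(policy, uav_counts):
    """Map UAV counts per traffic type to a traversal cost per traffic type."""
    hidden = apply_layer(policy["hidden"], uav_counts, math.tanh)
    return apply_layer(policy["output"], hidden, sigmoid)  # costs in (0, 1), an assumption

rng = random.Random(0)
policy = make_policy(n_types=4, n_hidden=8, rng=rng)
costs = sector_costs(policy, [3, 0, 5, 1])  # e.g. counts for four traffic types
print(costs)
```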

Evolutionary Algorithms for Learning Control Policies
1. Initialize population of k NNs
2. Mutate each to create total population of 2k NNs
3. Test each NN and assess fitness
4. Retain k best performing NNs, then repeat from step 2
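The loop on this slide can be sketched with each NN flattened to a list of weights. Here the fitness function is a toy stand-in (sum of squared weights) for the conflict count that the simulation would return, lower is treated as better, and the Gaussian mutation with standard deviation 0.1 is an assumption.

```python
import random

def mutate(weights, rng, noise=0.1):
    """Return a copy of the weights with Gaussian noise added (noise level assumed)."""
    return [w + rng.gauss(0.0, noise) for w in weights]

def evolve(fitness, n_weights, k=10, epochs=100, seed=0):
    rng = random.Random(seed)
    # Initialize a population of k weight vectors.
    population = [[rng.gauss(0.0, 1.0) for _ in range(n_weights)] for _ in range(k)]
    for _ in range(epochs):
        # Mutate each member to reach a total population of 2k.
        population = population + [mutate(p, rng) for p in population]
        # Assess fitness and retain the k best (lowest score).
        population = sorted(population, key=fitness)[:k]
    return min(population, key=fitness)

def toy_fitness(weights):
    # Stand-in for "number of conflicts"; a real run would call the simulator.
    return sum(w * w for w in weights)

best = evolve(toy_fitness, n_weights=5, epochs=50)
print(toy_fitness(best))
```

Because the parents survive alongside their mutants, the best score in the population can never get worse from one epoch to the next.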

Cooperative Coevolutionary Algorithms (CCEAs)
1. Initialize M populations of k NNs
2. Mutate each to create M populations of 2k NNs
3. Randomly select one NN from each population to create team T_i
4. Assess team performance and assign fitness to team members
5. Retain k best performing NNs of each population, then repeat from step 2
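A minimal sketch of the CCEA cycle on this slide follows. It assumes that teams are formed by shuffling each population and taking the i-th member of each, so every NN is evaluated exactly once per epoch; the slides only say that team members are selected randomly. The team fitness is a toy stand-in for the global conflict count, and all names are hypothetical.

```python
import random

def ccea(team_fitness, n_agents, n_weights, k=10, epochs=50, seed=0):
    rng = random.Random(seed)
    # M = n_agents populations of k weight vectors each.
    pops = [[[rng.gauss(0.0, 1.0) for _ in range(n_weights)] for _ in range(k)]
            for _ in range(n_agents)]
    for _ in range(epochs):
        # Mutate: every population grows to 2k members.
        pops = [pop + [[w + rng.gauss(0.0, 0.1) for w in p] for p in pop] for pop in pops]
        # Random team formation: shuffle, then team i takes member i of each population.
        for pop in pops:
            rng.shuffle(pop)
        scored = [[] for _ in pops]
        for i in range(2 * k):
            team = [pop[i] for pop in pops]
            score = team_fitness(team)  # one global score shared by the whole team
            for agent, member in enumerate(team):
                scored[agent].append((score, member))
        # Retain the k members of each population whose teams scored lowest.
        pops = [[m for _, m in sorted(s, key=lambda t: t[0])[:k]] for s in scored]
    return [pop[0] for pop in pops]

def toy_team_fitness(team):
    # Stand-in for the total number of conflicts in the global system.
    return sum(w * w for member in team for w in member)

best_team = ccea(toy_team_fitness, n_agents=3, n_weights=4, k=5, epochs=20)
print(toy_team_fitness(best_team))
```

Each member inherits the score of the team it happened to join, so a good NN can be discarded after a bad draw of teammates; that noise is inherent to assigning a global reward to individuals.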

Simulation Experiments
• Urban airspace
  – 256×256 cell map of San Francisco
  – 15 Voronoi partitions
• Fitness calculation
  – Linear: number of conflicts at each cell, summed
  – Quadratic: number of conflicts at each cell, squared and summed
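The two fitness calculations differ only in how per-cell conflict counts are aggregated. The short sketch below uses made-up counts to show that the quadratic form scores the same number of conflicts as worse when they are concentrated in one cell.

```python
def linear_fitness(conflicts_per_cell):
    """Number of conflicts at each cell, summed."""
    return sum(conflicts_per_cell)

def quadratic_fitness(conflicts_per_cell):
    """Number of conflicts at each cell, squared and then summed."""
    return sum(c * c for c in conflicts_per_cell)

# Hypothetical counts: four conflicts spread out vs. four conflicts in one cell.
spread = [1, 1, 1, 1]
clustered = [4, 0, 0, 0]

print(linear_fitness(spread), linear_fitness(clustered))        # 4 4
print(quadratic_fitness(spread), quadratic_fitness(clustered))  # 4 16
```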

Simulation Experiments
• Sector agents
  – Initialized with population of 10 NN control policies, 10% mutation noise
  – Inputs: {n_N, n_S, n_E, n_W}
  – Outputs: {c_N, c_S, c_E, c_W}
  – Fitness: number of conflicts
• UAVs
  – Stochastically generated from predefined set of start and goal locations
  – Approximately 100 UAVs in airspace during a single learning epoch
  – A* planning at both sector level and low level
  – Conflict radius: 2 cells (approx. 4 m)
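One way to turn the conflict radius into a count is to tally UAV pairs that are within 2 cells of each other at a given time step. The slides do not say how conflicts are counted, so the pairwise Euclidean test below is an assumption, and the positions are made up.

```python
import math
from itertools import combinations

def count_conflicts(positions, radius=2.0):
    """Count UAV pairs whose cell-space distance is within the conflict radius."""
    return sum(1 for a, b in combinations(positions, 2) if math.dist(a, b) <= radius)

# Hypothetical snapshot of four UAV positions in cell coordinates.
snapshot = [(0, 0), (1, 1), (10, 10), (11, 10)]
print(count_conflicts(snapshot))
```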

Learning Results: Total Conflicts
• Team performance over 100 learning epochs
• Averaged over 20 trials
• 16% reduction in total system conflicts

Congestion Reduction: Linear Cost Fitness Function
[Figure panels: random initialized sector costs; learned sector costs]

Congestion Reduction: Quadratic Cost Fitness Function
[Figure panels: random initialized sector costs; learned sector costs]

Extensions to Sector Agent Control Policies
• Not all UAVs in the airspace are equal
• Account for UAV type in NN inputs and outputs
[Figure: package delivery UAVs and emergency medical UAVs; network variants labeled weighted, cross-weighted, multi-mind]

A Hierarchical Approach
• Sector agents
  – Define cost of travel in each sector according to current UAV density
• UAVs
  – Sector-level planner: plans across sector cost graph
  – Low-level planner: plans across obstacle map according to sector traversal plan

Risk-Aware Graph Search (RAGS)
• Graph search with uncertain edge costs
  – Normal distributions
• Bound path set
  – Domination according to mean and variance:
    $A < B \leftrightarrow (A.c < B.c) \land (A.\sigma^2 < B.\sigma^2)$
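The domination rule on this slide can be sketched as a filter over candidate paths: a path is dropped when some other path has both a lower mean cost and a lower variance. The `PathCost` type and the example values are hypothetical; `mean` plays the role of `c` in the slide's notation.

```python
from typing import NamedTuple

class PathCost(NamedTuple):
    mean: float  # expected path cost (A.c on the slide)
    var: float   # path cost variance (A.sigma^2 on the slide)

def dominates(a, b):
    """True when a is better than b in both mean and variance."""
    return a.mean < b.mean and a.var < b.var

def non_dominated(paths):
    """Keep only the paths that no other path dominates."""
    return [p for p in paths if not any(dominates(q, p) for q in paths)]

candidates = [PathCost(10.0, 1.0), PathCost(12.0, 2.0), PathCost(9.0, 5.0)]
print(non_dominated(candidates))
```

The path with mean 9 and variance 5 survives even though it is riskier, because no other candidate beats it on both criteria.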

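RAGS Path Execution
[Figure: from Start, the Goal can be reached via node A (first edge cost $c_{A_0}$, onward paths $A_1, \dots, A_m$ with $A_m \sim \mathcal{N}(\mu_{A_m}, \sigma_{A_m}^2)$) or via node B (first edge cost $c_{B_0}$, onward paths $B_1, \dots, B_n$ with $B_n \sim \mathcal{N}(\mu_{B_n}, \sigma_{B_n}^2)$)]
The probability that traveling via B will yield a cheaper path than traveling via A:
$$\sum_{i=1}^{m} \int_{-\infty}^{\infty} P\left(c_{A_i} = x;\ c_{A_j} > x,\ \forall j \neq i\right) \cdot \left(1 - P\left(c_{B_i} > x,\ \forall i \in \{1, \dots, n\}\right)\right) dx$$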
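The probability on this slide is the chance that the cheapest path through B costs less than the cheapest path through A. The sketch below evaluates it by trapezoidal integration, assuming the path costs are independent normal variables and that each (mean, variance) pair already includes the first-edge cost; neither assumption is stated on the slide, and the function names are hypothetical.

```python
import math

def pdf(x, mu, var):
    """Normal density with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def survival(x, mu, var):
    """P(c > x) for c ~ N(mu, var)."""
    return 0.5 * math.erfc((x - mu) / math.sqrt(2.0 * var))

def prob_b_cheaper(paths_a, paths_b, steps=20000):
    """P(min cost via B < min cost via A); paths are (mean, variance) pairs."""
    all_paths = paths_a + paths_b
    lo = min(mu - 8.0 * math.sqrt(var) for mu, var in all_paths)
    hi = max(mu + 8.0 * math.sqrt(var) for mu, var in all_paths)
    dx = (hi - lo) / steps
    total = 0.0
    for s in range(steps + 1):
        x = lo + s * dx
        # Density of "path A_i costs x and every other A path costs more", summed over i.
        density_min_a = sum(
            pdf(x, mu_i, var_i) * math.prod(
                survival(x, mu_j, var_j)
                for j, (mu_j, var_j) in enumerate(paths_a) if j != i)
            for i, (mu_i, var_i) in enumerate(paths_a))
        # 1 - P(every B path costs more than x).
        p_some_b_below = 1.0 - math.prod(survival(x, mu, var) for mu, var in paths_b)
        weight = 0.5 if s in (0, steps) else 1.0  # trapezoid end points
        total += weight * density_min_a * p_some_b_below * dx
    return total

print(prob_b_cheaper([(10.0, 1.0), (12.0, 4.0)], [(11.0, 2.0)]))
```

When both branches have identical path sets the result is one half by symmetry, which gives a quick sanity check on the integration.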

RAGS vs. Existing Planning Algorithms
• Testing on graph with 100 vertices
  – 3 sets of edge cost distributions:
    Edge cost = Euclidean distance + $\varepsilon$, $\varepsilon \sim \mathcal{N}(\mu, \sigma^2)$, $\mu \in [0, 100]$, $\sigma^2 \in [0, \sigma_{\max}^2]$, $\sigma_{\max}^2 = \{5, 10, 20\}$
• Compared against
  – Naïve A* on the mean
  – Greedy on bounded path set
  – D*
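The edge cost model on this slide can be sketched as follows. The slide gives only the ranges for the mean and variance of the noise term, so drawing them uniformly is an assumption, and the function names and dictionary layout are hypothetical.

```python
import math
import random

def make_edge(p, q, var_max, rng):
    """Attach a noise distribution to the edge between vertices p and q."""
    return {
        "distance": math.dist(p, q),
        "mu": rng.uniform(0.0, 100.0),    # mean of the noise term, in [0, 100]
        "var": rng.uniform(0.0, var_max)  # variance of the noise term, in [0, var_max]
    }

def mean_cost(edge):
    """Expected edge cost, the quantity a planner working on the mean would use."""
    return edge["distance"] + edge["mu"]

def sample_cost(edge, rng):
    """One realized edge cost: Euclidean distance plus Gaussian noise."""
    return edge["distance"] + rng.gauss(edge["mu"], math.sqrt(edge["var"]))

rng = random.Random(0)
edge = make_edge((0, 0), (3, 4), var_max=5.0, rng=rng)
print(mean_cost(edge), sample_cost(edge, rng))
```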

RAGS vs. Existing Planning Algorithms
[Figure panels: $\sigma^2 \in (0, 5)$; $\sigma^2 \in (0, 10)$; $\sigma^2 \in (0, 20)$]

RAGS Integration with UTM Agents

Comparison of A* and RAGS
[Figure panels: UAVs planning with A*; UAVs planning with RAGS]

Conclusions and Future Work
• Implicit cooperation by learning individual control policies trained on global reward structures
• Risk-aware graph search accounts for modeled uncertainties in the environment
• Initial integration of high- and low-level decision making shows faster learning rates
• Future work
  – Reward shaping to improve UTM agent policies
  – Theoretical guarantees of RAGS
  – Validation and verification

Acknowledgements
Professors, Graduate Students, Undergrads, Interns


