Contextual Awareness for Robot Autonomy (FA2386-10-1-4138)
PI: Reid Simmons (Carnegie Mellon University)
AFOSR Joint Program Review: Cognition and Decision Program; Human-System Interaction and Robust Decision Making Program; Robust Computational Intelligence Program
(Jan 28 - Feb 1, 2013, Washington DC)
Contextual Awareness (Simmons)
Objective: Develop approaches that provide robots with contextual awareness: awareness of surroundings, capabilities, and intent.
Technical Approach: Anticipate possible failures and respond appropriately by planning and reasoning about uncertainty explicitly. Current work switches between policies at run time to maximize the probability of exceeding a reward threshold, and finds anomalies in execution data using deviation from a nominal model.
Budget (Actual / Planned, $K, as of 12/31/12): FY11: 179,479 / 166,995; FY12: 155,232 / 173,353; FY13: 74,670 / 186,394
Annual Progress Report Submitted? FY11: Y; FY12: Y; FY13: NA
Project End Date: 8/23/2013
DoD Benefit: Robots that are more robust to uncertainty in the environment; robots that are capable of understanding their limitations and responding intelligently.
List of Project Goals
1. Develop algorithms to reliably detect, diagnose and recover from exceptional (and uncertain) situations
2. Develop approaches to determine robot's own limitations and ask for assistance
3. Develop algorithms to explain actions to people
4. Develop approaches to learn from people
Progress Towards Goals
1. Develop algorithms to reliably detect, diagnose and recover from exceptional (and uncertain) situations (ongoing work by Breelyn Kane, Robotics PhD)
2. Develop approaches to determine robot's own limitations and ask for assistance (ongoing work by Juan Mendoza, Robotics PhD)
3. Develop algorithms to explain actions to people (postponed)
4. Develop approaches to learn from people (PhD thesis "Graph-based Trajectory Planning through Programming by Demonstration," Nik Melchior, defended December 2010, thesis completed August 2012)
Policy Switching to Exceed Reward Thresholds (Breelyn Kane, PhD student, Robotics)
"Risk-Variant Policy Switching to Exceed Reward Thresholds," B. Kane and R. Simmons, In Proceedings of the International Conference on Automated Planning and Scheduling, Sao Paulo, Brazil, June 2012.
Acting to Exceed Reward Thresholds
• In competitive domains, second is no better than last
  – "The person that said winning isn't everything, never won anything" (Mia Hamm)
  – "If you're not first, you're last!" (Ricky Bobby, Talladega Nights)
  – Arcade games: not just beating a level, going for the top score
Straightforward Approach
• Add time and cumulative reward to the state space
• Generate optimal plan and execute
• Significantly increases the state space
  – Planning probably infeasible for real-world domains
Our Approach
Offline
• Generate different policies: policies of varying risk attitude
• Estimate the reward distributions
Online
• Switch between policies: calculate the maximum probability of being over a threshold at each time step, based on the current cumulative (discounted) reward
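A minimal sketch of the offline policy-generation step for a small tabular MDP. Standard value iteration yields the risk-neutral policy; the risk-variant policies described in the paper would substitute a risk-weighted backup, which is not reproduced here. The `P`/`R` array layouts are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, iters=1000, tol=1e-6):
    """P[a][s, s'] = transition probabilities, R[s, a] = expected rewards.
    Returns a greedy (risk-neutral) policy and its value function."""
    n_actions, n_states = len(P), R.shape[0]
    V = np.zeros(n_states)
    for _ in range(iters):
        # Q[s, a] = R[s, a] + gamma * sum_s' P[a][s, s'] * V[s']
        Q = np.stack([R[:, a] + gamma * (P[a] @ V) for a in range(n_actions)], axis=1)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    policy = Q.argmax(axis=1)   # one action per state
    return policy, V
```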
Distribution of Rewards
• Our work reasons about the complete, non-parametric reward distribution, including the distribution tails
  – Estimate the reward distribution by running the policy and gathering statistics
[Figure: reward distribution P(V(s)) plotted against x]
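A hedged sketch of estimating that non-parametric distribution by Monte Carlo rollouts, keeping every sampled return so the tails are preserved. The environment-sampling function `step(s, a, rng)` is a hypothetical stand-in for the pizza-domain simulator.

```python
import numpy as np

def rollout_returns(step, policy, s0, gamma=0.95, horizon=200, n_runs=5000, rng=None):
    """Run the policy many times from s0 and collect discounted returns."""
    rng = rng or np.random.default_rng(0)
    returns = np.empty(n_runs)
    for i in range(n_runs):
        s, g, discount = s0, 0.0, 1.0
        for _ in range(horizon):
            s, r, done = step(s, policy[s], rng)   # sample one transition
            g += discount * r
            discount *= gamma
            if done:
                break
        returns[i] = g
    return np.sort(returns)        # sorted samples = empirical distribution

def prob_at_least(samples, x):
    """Empirical P(V >= x) from sorted return samples."""
    return 1.0 - np.searchsorted(samples, x, side="left") / len(samples)
```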
Switching Decision Criterion
Overall objective: $\max \; P\big(R(s_0) \geq \mathrm{thresh}\big)$
Policy selected at each time step $t$: $\pi_t = \arg\max_{\pi} \; P\big(V_{\pi}(s_t) \geq \mathrm{thresh} - R_{0:t-1}\big)$
where $R_{0:t-1}$ is the cumulative (discounted) reward accrued so far.
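A sketch of the online switching rule using the empirical distributions gathered offline. `return_samples[p][s]` (an illustrative data layout, not the paper's) holds sorted rollout returns for policy `p` started in state `s`; `prob_at_least` is from the sketch above.

```python
def choose_policy(return_samples, state, cumulative_reward, threshold):
    """Pick the policy with the highest estimated probability of pushing the
    total reward over the threshold from the current state."""
    needed = threshold - cumulative_reward   # reward still required
    # Note: with discounting, `needed` would also be rescaled by gamma**-t
    # to stay in the value function's units (assumption made for brevity).
    best_p, best_prob = None, -1.0
    for p, per_state in return_samples.items():
        prob = prob_at_least(per_state[state], needed)
        if prob > best_prob:
            best_p, best_prob = p, prob
    return best_p, best_prob
```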
Pizza Delivery Domain: "30 Minutes or It's Free"
[Figures: Risk-Neutral policy vs. Risk = 1.2 policy]
Pizza Domain Results
• Execute 10,000 runs in the original MDP
• Same start state every time
• Risk-neutral vs. switching (with risky policy, risk = 1.2)
Failures to exceed the threshold:
  Threshold = -100: Risk-Neutral fails 3120; Switching fails 2166. Fails 9.5% less using the switching strategy; reduces losses by 30.6%.
  Threshold = -70: Risk-Neutral fails 8026; Switching fails 5790. Fails 22.4% less using the switching strategy; reduces losses by 27.9%.
Augmented State Approach
• Augment the state space with cumulative reward
  – Integer-valued, no discounting
  – Reward capped to [-150, 0]
  – Action rewards based on location and current cumulative reward
  – State space increases by two orders of magnitude
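An illustrative sketch of the augmented-state construction: folding the capped, integer cumulative reward into the state multiplies the state count by 151, roughly two orders of magnitude.

```python
def augment(state, cumulative_reward, cap=-150):
    """Augmented state = (original state, cumulative reward capped to [cap, 0])."""
    cr = int(max(cap, min(0, cumulative_reward)))
    return (state, cr)

def augmented_index(state, cr, cap=-150):
    """Flatten (state, capped cumulative reward) into a single integer index."""
    return state * (abs(cap) + 1) + (-cr)
```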
Comparison of Approaches • Execute 10,000 runs in original MDP • Same start state every time • No discounting Augmented space vs switching (with risky policy d =1.2) • Fails to Exceed the Threshold (-70) Risk Neutral Fails: 7946 Switching Fails: 6945 Augmented State Fails: 6256 Augmented state fails 16.9% less Augmented state fails 6.8% less than risk-neutral, original-space; than switching strategy; Reduces losses by 21.2% Reduces losses by 9.9%
Comparison of Approaches
Planning (Offline) Time:
  Augmented State: Solve policy: 18 hours. Total: 18 hours
  Risk-Variant Switching: Solve policy: < 1 min; Generate reward distribution: 5-10 min; Construct CDF: 1 min. Total: 12 min x 2 policies = 24 min
Execution (Online) Time:
  Augmented State: 0.015 s
  Risk-Variant Switching: Eval + Switch: 0.02 s
+ Augmented state approach performs close to optimal
- Very large planning time
- Must re-generate the policy when the threshold changes
- State space is enormous if discounting is needed
Robust Execution Monitoring (Juan Pablo Mendoza, PhD student, Robotics)
"Motion Interference Detection in Mobile Robots," J.P. Mendoza, M. Veloso and R. Simmons, In Proceedings of the International Conference on Intelligent Robots and Systems, Vilamoura, Portugal, October 2012.
"Mobile Robot Fault Detection based on Redundant Information Statistics," J.P. Mendoza, M. Veloso and R. Simmons, In IROS Workshop on Safety in Human-Robot Coexistence and Interaction, Vilamoura, Portugal, October 2012.
Learning to Detect Motion Interference
• Learn HMM from robot data
  – Includes nominal and Motion Interference (MI) states
  – Hand-labeled training data
  – Learn transition probabilities to nominal states
  – Learn observation probabilities of all states
  – Transition probability to the MI state is a tunable parameter
[HMM state diagram: Accel, Decel, Constant, Stop, MI]
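A hedged sketch of online MI detection with such an HMM: forward filtering over the nominal states plus MI, where the transition probability into MI is the tunable parameter. Gaussian observation models stand in for the ones learned from labeled data; all parameter layouts are illustrative.

```python
import numpy as np
from scipy.stats import multivariate_normal

def forward_filter(obs_seq, trans, means, covs, init):
    """obs_seq: (T, d) observations; trans: (K, K) transition matrix;
    means/covs: per-state Gaussian mean vectors and covariance matrices;
    init: (K,) prior over states.  Returns (T, K) filtered state probabilities."""
    K = trans.shape[0]
    beliefs = np.empty((len(obs_seq), K))
    b = np.asarray(init, dtype=float)
    for t, o in enumerate(obs_seq):
        lik = np.array([multivariate_normal.pdf(o, means[k], covs[k]) for k in range(K)])
        b = lik * (b @ trans)     # predict with transition model, then correct
        b /= b.sum()
        beliefs[t] = b
    return beliefs

# Declare motion interference whenever the MI state's filtered probability is high,
# e.g.: mi_detected = beliefs[:, MI_INDEX] > 0.5
```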
Learning Behavior Model
• Observations
  – Commanded Velocity
  – Velocity Difference: difference between commanded and perceived velocity, from encoders
  – Acceleration: linear regression of the last N measures of velocity
  – Jerk: linear regression of the last M measures of acceleration
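A sketch of computing those observation features, assuming buffers of recent commanded and perceived (encoder) velocities sampled at a fixed period `dt`; buffer names and window sizes are illustrative.

```python
import numpy as np

def slope(y, dt):
    """Least-squares slope of evenly spaced samples y with spacing dt."""
    t = np.arange(len(y)) * dt
    return np.polyfit(t, y, 1)[0]

def observation(cmd_vel, enc_vel, accel_hist, dt, N=5, M=5):
    """cmd_vel, enc_vel: recent velocity samples (newest last);
    accel_hist: list of recent acceleration estimates (appended to here).
    Returns [commanded velocity, velocity difference, acceleration, jerk]."""
    accel = slope(enc_vel[-N:], dt)                  # regression over last N velocities
    accel_hist.append(accel)
    jerk = slope(accel_hist[-M:], dt) if len(accel_hist) >= 2 else 0.0
    vel_diff = cmd_vel[-1] - enc_vel[-1]             # commanded minus perceived
    return np.array([cmd_vel[-1], vel_diff, accel, jerk])
```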
Example Runs
Overall Performance
• Precision/recall as the MI transition probability varies
• With precision = 1 and recall = 0.93, median detection time was 0.36 s (mean = 0.647 s)
Detecting Unexpected Anomalies
• Basic idea
  – Model nominal behavior
  – Detect significant deviation from nominal
  – Determine extent of anomaly
• Make execution monitoring efficient, effective, and informative
Modeling Nominal Behavior
• Define a residual function that is (close to) zero during nominal behavior
  – For instance, velocity difference or the difference between estimates from redundant sensors
  – Future work: make the residual function dependent on the current state (e.g., using an HMM)
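Two toy residual functions of the kind described above, assuming synchronized sensor streams; each should stay near zero while the robot behaves nominally. The signal names are illustrative.

```python
import numpy as np

def velocity_residual(cmd_vel, enc_vel):
    """Commanded minus perceived velocity (near zero when tracking well)."""
    return cmd_vel - enc_vel

def redundancy_residual(odom_pose, visual_pose):
    """Disagreement between two redundant pose estimates."""
    return np.linalg.norm(np.asarray(odom_pose) - np.asarray(visual_pose))
```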
Detecting Deviation from Nominal
• Compute the sample mean of the residual function from observed data; under nominal behavior $f_r \sim \mathcal{N}(0, \sigma^2)$, so the mean of $N$ residuals satisfies $\bar{f}_r \sim \mathcal{N}(0, \sigma^2/N)$
• Compute the probability $a(D)$ that the mean residual over the data $D$ is not within $\epsilon$ of zero, using the normal approximation $z' = \dfrac{|\bar{f}_r| - \epsilon}{\sigma/\sqrt{N}}$
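A hedged sketch of that deviation test: given residual samples that are N(0, sigma^2) under nominal behavior, approximate the probability that the mean residual is not within epsilon of zero. This is a generic normal-approximation version; the exact statistic used in the paper may differ.

```python
import numpy as np
from scipy.stats import norm

def anomaly_score(residuals, sigma, epsilon):
    """Approximate P(|true mean| > epsilon) given the observed residuals."""
    residuals = np.asarray(residuals, dtype=float)
    n = len(residuals)
    mean = residuals.mean()
    se = sigma / np.sqrt(n)                       # std. error of the sample mean
    # Two-sided probability under mean ~ N(sample mean, se^2)
    return norm.cdf(-(epsilon - mean) / se) + norm.cdf(-(epsilon + mean) / se)
```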
Estimate Extent of Anomaly
• Define a region around the current state
  – Currently: grid the state space
  – Future: maintain a continuous state space
• Extend the region in the direction that increases a(d(R)) the most
  – Currently: extend to form axis-aligned hyper-rectangles
  – Future: general convex-shaped regions
• Stop when a locally maximal anomaly region is found
  – Current: continue while a(d(R)) is non-decreasing
  – Future: skip over "gaps"
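A greedy sketch of that region-growing step over a gridded state space: starting from the cell containing the current state, repeatedly apply the axis-aligned extension that raises the anomaly score the most, and stop once no extension is non-decreasing. The `score(region)` callback is assumed to wrap anomaly_score() over the residuals whose states fall inside the region bounds.

```python
def grow_region(seed_cell, score, grid_shape):
    """Grow an axis-aligned hyper-rectangle (inclusive cell bounds) around seed_cell."""
    lo, hi = list(seed_cell), list(seed_cell)
    best = score((tuple(lo), tuple(hi)))
    while True:
        candidates = []
        for d in range(len(grid_shape)):
            if lo[d] > 0:
                candidates.append((d, "lo", -1))
            if hi[d] < grid_shape[d] - 1:
                candidates.append((d, "hi", +1))
        best_step, best_s = None, best
        for d, side, step in candidates:
            corner = lo if side == "lo" else hi
            corner[d] += step                     # tentative one-cell extension
            s = score((tuple(lo), tuple(hi)))
            corner[d] -= step                     # revert
            if s >= best_s:                       # keep while non-decreasing
                best_step, best_s = (d, side, step), s
        if best_step is None:                     # locally maximal region found
            return (tuple(lo), tuple(hi)), best
        d, side, step = best_step
        (lo if side == "lo" else hi)[d] += step   # commit the best extension
        best = best_s
```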
Example Runs
[Figures: Pull from Stop, Collision, Push from Stop]
Discovering a Global Anomaly
• The region grows and becomes more certain as the robot travels
• Ideally, keep upper and lower bounds on the anomaly region
Interaction with Other Groups and Organizations
• Received Infinite Mario software and support from John Laird's group at the University of Michigan; interaction with student Shiwali Mohan
• Interacted with Sven Koenig (USC/NSF) and former student Yaxin Liu regarding generation of risk-sensitive policies
• Interaction with Manuela Veloso (CMU CSD/RI): co-advising Juan Pablo Mendoza
List of Publications Attributed to the Grant
• "Risk-Variant Policy Switching to Exceed Reward Thresholds," B. Kane and R. Simmons, ICAPS, Sao Paulo, Brazil, June 2012.
• "Graph-based Trajectory Planning through Programming by Demonstration," Nik Melchior, PhD Thesis, CMU-RI-TR-11-40, August 2012.
• "Motion Interference Detection in Mobile Robots," J.P. Mendoza, M. Veloso and R. Simmons, In Proceedings of the International Conference on Intelligent Robots and Systems, Vilamoura, Portugal, October 2012.
• "Mobile Robot Fault Detection based on Redundant Information Statistics," J.P. Mendoza, M. Veloso and R. Simmons, In IROS Workshop on Safety in Human-Robot Coexistence and Interaction, Vilamoura, Portugal, October 2012.