
CS 188: Artificial Intelligence, Spring 2011



  1. CS 188: Artificial Intelligence, Spring 2011
     Advanced Applications: Robotics
     Pieter Abbeel – UC Berkeley
     A few slides from Sebastian Thrun, Dan Klein

     Announcements
     § Practice Final out (optional)
       § Similar extra credit system as practice midterm
     § Contest (optional)
       § Tomorrow night 11pm deadline for final submission
     § Project 5 Classification is out: due next week Friday

     Advanced Applications

     So Far Mostly Foundational Methods

     Robotic Control Tasks
     § Perception / Tracking
       § Where exactly am I?
       § What’s around me?
     § Low-Level Control
       § How to move the robot and/or objects from position A to position B
     § High-Level Control
       § What are my goals?
       § What are the optimal high-level actions?

     Robot folds towels
     § [pile of 5 video]
     [Maitin-Shepard, Cusumano-Towner, Lei & Abbeel, 2010]

  2. Low-Level Planning
     § Low-level: move from configuration A to configuration B

     A Simple Robot Arm
     § Configuration Space
       § What are the natural coordinates for specifying the robot’s configuration?
       § These are the configuration space coordinates
       § Can’t necessarily control all degrees of freedom directly
     § Work Space
       § What are the natural coordinates for specifying the effector tip’s position?
       § These are the work space coordinates

     Coordinate Systems
     § Workspace
       § The world’s (x, y) system
       § Obstacles specified here
     § Configuration space
       § The robot’s state
       § Planning happens here
       § Obstacles can be projected to here

     Obstacles in C-Space
     § What / where are the obstacles?
     § Remaining space is free space

     Two-link manipulator / Example Obstacles in C-Space
     [figures: two-link arm with link lengths d1, d2, joint angles α1, α2, and effector tip at (x, y); axes X, Y]
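The relationship between the two coordinate systems on this page can be sketched in code. The following is a minimal illustration, not taken from the slides: it assumes α1 is measured from the X axis, α2 is measured relative to the first link, the arm base sits at the origin, and obstacles are discs in the work space. All function names and default link lengths are hypothetical.

```python
import math

def forward_kinematics(alpha1, alpha2, d1=1.0, d2=1.0):
    """Map a configuration (alpha1, alpha2) to work-space points.

    Assumed convention: alpha1 is measured from the X axis and alpha2 is
    measured relative to the first link. Returns (elbow, tip).
    """
    elbow = (d1 * math.cos(alpha1), d1 * math.sin(alpha1))
    tip = (elbow[0] + d2 * math.cos(alpha1 + alpha2),
           elbow[1] + d2 * math.sin(alpha1 + alpha2))
    return elbow, tip

def segment_hits_disc(p, q, center, radius):
    """True if the segment p-q passes within `radius` of `center`."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0
    else:
        t = ((center[0] - p[0]) * dx + (center[1] - p[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp to the segment
    nearest = (p[0] + t * dx, p[1] + t * dy)
    return math.hypot(center[0] - nearest[0], center[1] - nearest[1]) <= radius

def in_collision(alpha1, alpha2, center, radius, d1=1.0, d2=1.0):
    """A configuration is blocked if either link touches the disc obstacle."""
    elbow, tip = forward_kinematics(alpha1, alpha2, d1, d2)
    return (segment_hits_disc((0.0, 0.0), elbow, center, radius)
            or segment_hits_disc(elbow, tip, center, radius))

def cspace_grid(center, radius, steps=36):
    """Project a work-space obstacle onto a coarse (alpha1, alpha2) grid."""
    angles = [2 * math.pi * i / steps for i in range(steps)]
    return [[in_collision(a1, a2, center, radius) for a2 in angles]
            for a1 in angles]
```

Cells of the grid that come out `False` form the free space in which planning would happen; the grid resolution is a trade-off between accuracy and cost.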

  3. Two-link manipulator
     § Demo
       http://www-inst.eecs.berkeley.edu/~cs188/fa08/demos/robot.html

     Probabilistic Roadmaps
     § Idea: sample random points as nodes in a visibility graph
     § This gives probabilistic roadmaps
       § Very successful in practice
       § Lets you add points where you need them
       § If insufficient points, incomplete or weird paths

     Robotic Control Tasks
     § Perception / Tracking
       § Where exactly am I?
       § What’s around me?
     § Low-Level Control
       § How to move the robot and/or objects from position A to position B
     § High-Level Control
       § What are my goals?
       § What are the optimal high-level actions?

     Perception
       1. Find a point seen in two camera views
       2. Find 3D coordinates by finding the intersection of the rays
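The probabilistic roadmap idea above can be sketched as follows. This is a simplified illustration under several assumptions that are not in the slides: the space is the unit square, obstacles are discs given as (cx, cy, radius), edges are only tried between nodes within a fixed connection radius, visibility is approximated by sampling points along each edge, and the roadmap is searched with breadth-first search. All names are hypothetical.

```python
import math
import random
from collections import deque

def is_free(point, obstacles):
    """A point is free if it lies outside every (cx, cy, radius) disc."""
    return all(math.hypot(point[0] - cx, point[1] - cy) > r
               for cx, cy, r in obstacles)

def edge_is_free(p, q, obstacles, checks=20):
    """Approximate visibility test: sample points along the segment p-q."""
    return all(
        is_free((p[0] + (q[0] - p[0]) * i / checks,
                 p[1] + (q[1] - p[1]) * i / checks), obstacles)
        for i in range(checks + 1))

def build_roadmap(start, goal, obstacles, n_samples=200, radius=0.3, seed=0):
    """Sample free points in the unit square and connect mutually visible ones."""
    rng = random.Random(seed)
    nodes = [start, goal]  # index 0 is the start, index 1 is the goal
    while len(nodes) < n_samples + 2:
        candidate = (rng.random(), rng.random())
        if is_free(candidate, obstacles):
            nodes.append(candidate)
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            close = math.dist(nodes[i], nodes[j]) <= radius
            if close and edge_is_free(nodes[i], nodes[j], obstacles):
                edges[i].append(j)
                edges[j].append(i)
    return nodes, edges

def find_path(edges, start_index=0, goal_index=1):
    """Breadth-first search over the roadmap; returns node indices or None."""
    parent = {start_index: None}
    queue = deque([start_index])
    while queue:
        current = queue.popleft()
        if current == goal_index:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for neighbor in edges[current]:
            if neighbor not in parent:
                parent[neighbor] = current
                queue.append(neighbor)
    return None
```

With `n_samples=0` and an obstacle between start and goal, `find_path` returns `None`, which is the "insufficient points" failure mode from the slide; adding samples gives the search nodes to route around the obstacle.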


  6. Motivating Example
     § How do we specify a task like this?
     [demo: autorotate / tictoc]

     Autonomous Helicopter Flight
     § Key challenges:
       § Track helicopter position and orientation during flight
       § Decide on control inputs to send to helicopter

  7. Autonomous Helicopter Setup
     § On-board inertial measurement unit (IMU)
     § Position
     § Send out controls to helicopter

     HMM for Tracking the Helicopter
     § State: $s = (x, y, z, \phi, \theta, \psi, \dot{x}, \dot{y}, \dot{z}, \dot{\phi}, \dot{\theta}, \dot{\psi})$
     § Measurements: [observation update]
       § 3-D coordinates from vision, 3-axis magnetometer, 3-axis gyro, 3-axis accelerometer
     § Transitions (dynamics): [time elapse update]
       § $s_{t+1} = f(s_t, a_t) + w_t$
         [f encodes helicopter dynamics]
         [w is a probabilistic noise model]

     Helicopter MDP
     § State: $s = (x, y, z, \phi, \theta, \psi, \dot{x}, \dot{y}, \dot{z}, \dot{\phi}, \dot{\theta}, \dot{\psi})$
     § Actions (control inputs):
       § $a_{lon}$: Main rotor longitudinal cyclic pitch control (affects pitch rate)
       § $a_{lat}$: Main rotor latitudinal cyclic pitch control (affects roll rate)
       § $a_{coll}$: Main rotor collective pitch (affects main rotor thrust)
       § $a_{rud}$: Tail rotor collective pitch (affects tail rotor thrust)
     § Transitions (dynamics):
       § $s_{t+1} = f(s_t, a_t) + w_t$
         [f encodes helicopter dynamics]
         [w is a probabilistic noise model]
     § Can we solve the MDP yet?

     Problem: What’s the Reward?
     § Rewards for hovering:
       [demo: hover]
     § Rewards for “Tic-Toc”?
       § Problem: what’s the target trajectory?
       § Just write it down by hand?
       [demo: bad]

     Helicopter Apprenticeship?
     [demo: unaligned]

     Probabilistic Alignment using a Bayes’ Net
     [figure: intended trajectory, expert demonstrations, time indices]
     § Intended trajectory satisfies dynamics.
     § Expert trajectory is a noisy observation of one of the hidden states.
       § But we don’t know exactly which one.
     [Coates, Abbeel & Ng, 2008]
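The tracking loop above alternates a time elapse update with a measurement update. A minimal sketch of that loop follows, under strong simplifying assumptions that are not in the slides: the state is a single number (altitude) rather than the full 12-dimensional state, the dynamics are the toy model f(s, a) = s + a·dt with the action read as a commanded climb rate, and both noise models are Gaussian, so the belief stays Gaussian (a one-dimensional Kalman filter). All names and default values are hypothetical.

```python
def time_elapse(mean, var, action, dt=0.1, process_var=0.01):
    """Predict step for s_{t+1} = f(s_t, a_t) + w_t with f(s, a) = s + a * dt.

    The noise w_t is assumed Gaussian with variance process_var, so
    uncertainty grows at every step.
    """
    return mean + action * dt, var + process_var

def observe(mean, var, measurement, measurement_var=0.25):
    """Measurement step: combine the Gaussian belief with a noisy reading."""
    gain = var / (var + measurement_var)  # how much to trust the reading
    return mean + gain * (measurement - mean), (1.0 - gain) * var

def track(mean, var, actions, measurements):
    """Alternate time elapse and observation updates; return each belief."""
    beliefs = []
    for action, measurement in zip(actions, measurements):
        mean, var = time_elapse(mean, var, action)
        mean, var = observe(mean, var, measurement)
        beliefs.append((mean, var))
    return beliefs
```

For example, `track(0.0, 1.0, [1.0, 1.0, 1.0], [0.1, 0.2, 0.3])` returns three (mean, variance) pairs whose variance shrinks as measurements accumulate. A real helicopter tracker would need the nonlinear dynamics f and the full sensor suite listed on the slide.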

  8. Alignment of Samples
     [demo: alignment]
     § Result: inferred sequence is much cleaner!

     Final Behavior
     [demo: airshow]

     Advanced Applications

     Quadruped
     § Low-level control problem: moving a foot into a new location → similar search as for moving robot arm
     § High-level control problem: where should we place the feet?
       § Reward function $R(s) = w \cdot f(s)$ [25 features]
     [Kolter, Abbeel & Ng, 2008]

     Without learning

     Apprenticeship Learning
     § Goal: learn reward function from expert demonstration
     § Assume
     § Get expert demonstrations
     § Guess initial policy
     § Repeat:
       § Find w which makes the expert better than
       § Solve MDP for new weights w:
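The linear reward R(s) = w · f(s) and the "find w which makes the expert better" step can be illustrated with a small sketch. The slides do not show how w is found, so the update below is a perceptron-style stand-in chosen for brevity, not the method from the lecture or the cited paper. The "solve MDP for new weights" step is omitted, and all names are hypothetical.

```python
def reward(w, features):
    """Linear reward R(s) = w . f(s)."""
    return sum(wi * fi for wi, fi in zip(w, features))

def feature_sum(trajectory, feature_fn):
    """Sum the feature vectors f(s) over the states of one trajectory.

    Because the reward is linear, a trajectory's total reward equals
    reward(w, feature_sum(trajectory, feature_fn)).
    """
    total = None
    for state in trajectory:
        f = feature_fn(state)
        total = list(f) if total is None else [t + fi for t, fi in zip(total, f)]
    return total

def update_weights(w, expert_features, policy_features, step=1.0):
    """Nudge w so the expert's total reward rises relative to the policy's.

    Perceptron-style update (an assumption): move w along the difference
    between expert and policy feature sums.
    """
    return [wi + step * (e - p)
            for wi, e, p in zip(w, expert_features, policy_features)]
```

In a full apprenticeship-learning loop, each new w would be handed to an MDP solver to produce the next policy, whose feature sums then feed the next weight update.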

  9. With learned reward function

     Advanced Applications

     Autonomous Vehicles
     Autonomous vehicle slides adapted from Sebastian Thrun

     Grand Challenge: Barstow, CA, to Primm, NV
     § 150-mile off-road robot race across the Mojave desert
     § Natural and manmade hazards
     § No driver, no remote control
     § No dynamic passing

     Inside an Autonomous Car
     [figure labels: E-stop, 5 Lasers, GPS, Camera, GPS compass, Radar, 6 Computers, Control Screen, IMU, Steering motor]

     Readings: No Obstacles
     [figure: readings labelled 1, 2, 3]

  10. Readings: Obstacles

      Obstacle Detection
      § Trigger if $|z_i - z_j| > 15$ cm for nearby $z_i$, $z_j$
      [figure: height difference $\Delta z$]
      § Raw Measurements: 12.6% false positives

      Probabilistic Error Model
      [figure: HMM with hidden states $x_t, x_{t+1}, x_{t+2}$, measurements $z_t, z_{t+1}, z_{t+2}$, and GPS and IMU nodes at each time step]

      HMMs for Detection
      § Raw Measurements: 12.6% false positives
      § HMM Inference: 0.02% false positives
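The raw trigger rule above can be written out directly. This is a minimal sketch with assumptions that are not in the slides: laser returns are (x, y, z) points in metres, and "nearby" means within a fixed horizontal radius. The function name and the `neighbor_radius` default are hypothetical; only the 15 cm threshold comes from the slide.

```python
import math

def detect_obstacles(points, height_threshold=0.15, neighbor_radius=0.5):
    """Return index pairs (i, j) of nearby points whose heights differ
    by more than height_threshold.

    points: list of (x, y, z) laser returns in metres.
    neighbor_radius: assumed definition of "nearby" (horizontal distance).
    """
    triggers = []
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            nearby = math.hypot(xi - xj, yi - yj) <= neighbor_radius
            if nearby and abs(zi - zj) > height_threshold:
                triggers.append((i, j))
    return triggers
```

A rule like this fires on every noisy reading, which is consistent with the high false-positive rate reported for raw measurements; the HMM slides address that by reasoning about measurement error over time.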
