monoDrive™: Autonomous driving made safe
Founder Bio: Celite Milbrandt, Austin, Texas since 1998
● Founder of Slacker Radio (2008)
  ○ 35M active users
  ○ In-dash for Tesla, GM, and Ford
● Chief Product Officer of RideScout
  ○ Acquired by MBUSA/Daimler (2014)
Mission statement: Making autonomous vehicle travel safe
Current scenario verification challenges
● Large vehicle fleets
● Driver needed to manage in the event of system error
● Expensive, ad hoc, and incomplete
● Many simple scenarios are missed
● Scenario generation and verification happens in real time and is not easily repeatable
Solution
● Automate scenario test generation for planning testing
● Deep learning system for automated scenario modification and re-generation
● Leverages existing gaming systems to enable multiphysics simulation
● Generation of realistic Lidar, Radar, Camera, and IMU sensor information for perception system testing
● Enable automated vehicle control performance metrics
● Fast error case regeneration, with derivative regeneration
Testing Perception and Planning
[Diagram: the Simulation Engine generates Camera, Stereo Vision, Lidar, Radar, and Inertial Measurement Unit (IMU) streams, plus Ground Truth w/ Scene Labeling, feeding the Control System Under Test]
Training Realistic Traffic Behavior
● Each agent/driver must have separate behaviors
● Behaviors must be learned based on different reward structures during training
● Examples of learned behaviors:
  ○ Speeder
  ○ Brake Happy
  ○ Cell Phone Driver
  ○ Drunk Driver
● Behaviors are distributed based on the type of scenarios we want to test against (see the sketch below)
● Accidents result from the distribution of agents with various learned behaviors
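A toy sketch of how behavior labels with distinct reward weightings might be distributed across a fleet for a given scenario. The behavior names mirror the slide; the weights and the sampling scheme are assumptions, not monoDrive's actual reward library:

```python
import random

# Hypothetical per-behavior reward weights; names mirror the slide's examples.
# Values are illustrative only.
BEHAVIOR_REWARD_WEIGHTS = {
    "speeder":     {"speed": +2.0, "lane_keep": 0.5,  "collision": -10.0},
    "brake_happy": {"speed": -1.0, "lane_keep": 1.0,  "collision": -10.0},
    "cell_phone":  {"speed":  0.0, "lane_keep": 0.1,  "collision":  -5.0},
    "drunk":       {"speed": +0.5, "lane_keep": -0.5, "collision":  -2.0},
}

def sample_fleet(n_agents, distribution):
    """Draw a behavior label for each agent from a scenario-specific mix."""
    labels = list(distribution)
    weights = [distribution[b] for b in labels]
    return [random.choices(labels, weights)[0] for _ in range(n_agents)]

# A scenario that stresses the planner with mostly aggressive traffic.
fleet = sample_fleet(20, {"speeder": 0.4, "brake_happy": 0.2,
                          "cell_phone": 0.2, "drunk": 0.2})
print(fleet)
```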
Reinforcement Trained Neural Network
● Input layer is an image in our case
● Output layers are the log probabilities to apply throttle or turn right (an illustrative network is sketched below)
● More negative log probabilities represent apply brake or turn left, respectively
● Number of layers and number of neurons per layer are selected based on the convergence characteristics given your desired value function and/or policy
● Reward function is chosen based on the desired behavior you are trying to emulate
● Comment: control belongs in the CPU, computation lives in the GPU
[Diagram: rewards feed the Neuron Updater]
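A minimal sketch of a network with this shape, using tf.keras (the deck lists Tensorflow-GPU as a requirement). The layer sizes and filter counts are assumptions; the deck only says they are tuned to the convergence characteristics of your value function and/or policy:

```python
import tensorflow as tf

def build_policy_net(height=80, width=80):
    """Image in, two raw scores out: throttle/brake and turn-right/turn-left.

    Per the slide, the sign of each output picks the direction (negative
    reads as brake / turn-left). All layer sizes here are illustrative.
    """
    inputs = tf.keras.Input(shape=(height, width, 1))
    x = tf.keras.layers.Conv2D(16, 8, strides=4, activation="relu")(inputs)
    x = tf.keras.layers.Conv2D(32, 4, strides=2, activation="relu")(x)
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    outputs = tf.keras.layers.Dense(2)(x)  # [throttle score, turn score]
    return tf.keras.Model(inputs, outputs)

policy = build_policy_net()
policy.summary()
```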
Reinforcement Learning
● Simulator Interface
  ○ Socket-based
  ○ Python, C++
  ○ Single simulator instance
● Per Agent Reward Modifiers
  ○ Library of reward modifiers
● Agent Hyperparameters
  ○ Continuous action space
  ○ Multiple concurrent agents
● Downsampling (sketched below)
  ○ Full resolution -> 80x80
  ○ Top-down view or perspective
[Diagram: camera frame → Downsampler ↓N → n×m image → Agent → P(throttle|s), P(turnRight|s), with a per-agent Reward Modifier feeding the agent]
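The Downsampler block reduces full-resolution frames to 80x80 before they reach the agent. A minimal sketch of such a ↓N stage, assuming simple stride sampling (the real pipeline may crop, average, or render a top-down projection instead):

```python
import numpy as np

def downsample(frame, out_h=80, out_w=80):
    """Naive downsampler: grayscale, then stride-sample to out_h x out_w."""
    if frame.ndim == 3:                        # RGB -> grayscale
        frame = frame.mean(axis=2)
    sh = max(frame.shape[0] // out_h, 1)       # vertical stride
    sw = max(frame.shape[1] // out_w, 1)       # horizontal stride
    return frame[::sh, ::sw][:out_h, :out_w].astype(np.float32)

frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
print(downsample(frame).shape)  # (80, 80)
```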
Learning to Drive
Example of a basic reward system:
● Stay in lane
● Don't hit other vehicles
● Maintain safe distance from leading vehicle
● Change lanes only to avoid collision
Basic System Details:
● Examples of our reward functions for different types of drivers (sketched below):
  ○ Modulate reward with speed
  ○ Generate negative/positive rewards based on different collision boundaries
  ○ Generate reward for causing opposing cars to move, swerve, or change direction
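A sketch of the basic reward plus a per-driver modifier. The state field names (in_lane, collided, gap_to_lead_m, speed_mps) and all constants are assumptions for illustration:

```python
def basic_reward(state):
    """Slide's basic reward: stay in lane, avoid collisions, keep a safe gap."""
    r = 0.0
    r += 1.0 if state["in_lane"] else -1.0
    r -= 10.0 if state["collided"] else 0.0
    r += 0.5 if state["gap_to_lead_m"] > 10.0 else -0.5
    return r

def speeder_modifier(state, r):
    """Per-agent modifier: modulate reward with speed, as the deck describes
    for the 'Speeder' behavior. The 0.1 scale is an assumption."""
    return r + 0.1 * state["speed_mps"]

state = {"in_lane": True, "collided": False,
         "gap_to_lead_m": 15.0, "speed_mps": 30.0}
print(speeder_modifier(state, basic_reward(state)))
```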
Scalable multi-agent training and testing for A3C...
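The title refers to A3C (Asynchronous Advantage Actor-Critic). As a toy illustration of the scaling pattern only, the sketch below runs several worker threads that apply lock-free updates to one shared parameter vector, with a quadratic loss standing in for the actor-critic objective; none of this is monoDrive code:

```python
import threading
import numpy as np

# Toy illustration of the asynchronous-update pattern behind A3C: many
# workers, each nominally driving its own agent/environment, push gradient
# updates into one shared parameter vector without locks (Hogwild-style).
shared_theta = np.zeros(4)
target = np.array([1.0, -2.0, 0.5, 3.0])

def worker(steps=1000, lr=0.01):
    for _ in range(steps):
        grad = 2.0 * (shared_theta - target)        # gradient of ||θ-target||²
        shared_theta[:] = shared_theta - lr * grad  # in-place, lock-free update

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared_theta)  # converges near `target`
```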
Example monoDrive Reinforcement Agent
● Agent based on Andrej Karpathy's post: http://karpathy.github.io/2016/05/31/rl/ (minimal core sketched below)
● Up to 20 agents (200 in the future)
● Continuous action space
● Reward based on agent reward function/modifier
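A stripped-down sketch of the Karpathy-style policy-gradient core the slide cites, reduced to a single sigmoid head over a flattened 80x80 frame. The simulator socket loop, the per-agent reward modifier, and a second (steering) head are omitted; all names here are illustrative and this is not the actual agent_cm.py:

```python
import numpy as np

H, D = 200, 80 * 80                     # hidden units, flattened 80x80 input
rng = np.random.default_rng(0)
W1 = rng.standard_normal((H, D)) / np.sqrt(D)
W2 = rng.standard_normal(H) / np.sqrt(H)

def policy_forward(x):
    h = np.maximum(0, W1 @ x)           # ReLU hidden layer
    p = 1.0 / (1.0 + np.exp(-(W2 @ h))) # P(throttle | state)
    return p, h

def discount_rewards(r, gamma=0.99):
    out, running = np.zeros_like(r), 0.0
    for t in reversed(range(len(r))):
        running = running * gamma + r[t]
        out[t] = running
    return out

def policy_gradient_step(xs, ys, rs, lr=1e-3):
    """One REINFORCE update from a recorded episode (frames, actions, rewards)."""
    global W1, W2
    adv = discount_rewards(np.asarray(rs, dtype=np.float64))
    adv = (adv - adv.mean()) / (adv.std() + 1e-8)
    dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
    for x, y, a in zip(xs, ys, adv):
        p, h = policy_forward(x)
        dlogp = (y - p) * a             # grad of log-prob, scaled by advantage
        dW2 += dlogp * h
        dh = dlogp * W2
        dh[h <= 0] = 0                  # backprop through ReLU
        dW1 += np.outer(dh, x)
    W1 += lr * dW1
    W2 += lr * dW2

# Tiny synthetic episode to show the call shape.
xs = [rng.random(D) for _ in range(5)]
ys = [1, 0, 1, 1, 0]
rs = [0.0, 0.0, 1.0, 0.0, -1.0]
policy_gradient_step(xs, ys, rs)
```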
Try it out!
● Download simulator at www.monodrive.io
  ○ Coming soon! Early version available on request to info@monodrive.io
● Download sample agent and sample reward at: www.github.com/celite/agent_cm.py
● System Requirements:
  ○ Windows, Mac, Ubuntu
  ○ Tensorflow-GPU (or Tensorflow if you have more time than money)
  ○ 32 GB memory (64 GB recommended)
● Example agent is Python-based but can be anything
● Control interface based on IP sockets (see the sketch below)
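A hypothetical client for the IP-socket control interface. The host, port, and length-prefixed JSON message schema are all assumptions for illustration; the real protocol ships with the simulator download:

```python
import json
import socket

def send_control(sock, throttle, steer):
    """Send one control command as a length-prefixed JSON frame (assumed format)."""
    msg = json.dumps({"throttle": throttle, "steer": steer}).encode()
    sock.sendall(len(msg).to_bytes(4, "big") + msg)

# Requires a running simulator listening on the (assumed) host and port.
with socket.create_connection(("localhost", 9090)) as sock:
    send_control(sock, throttle=0.4, steer=-0.1)
```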
Contact Information: info@monodrive.io