Query-Efficient Imitation Learning for End-to-End Simulated Driving
Jiakai Zhang, Kyunghyun Cho
New York University

Overview  Introduction • End-to-end learning for self-driving • Related work  Learning method • Convolutional neural network • Imitation learning using SafeDAgger  Experiment • Setup • Results  Conclusion and future work

Introduction  End-to-end learning for self-driving • Sensory input from front-facing camera • Control signal Steering Brake

Introduction  Related work • Supervised learning • ALVINN net [Pomerleau 1989] • DeepDriving [Chen et al. 2015] • End-to-end learning for self-driving cars [Bojarski et al. 2016] • Imitation learning • DAgger [Ross, Gordon, and Bagnell 2010] • SafeDAgger [Zhang and Cho 2017]

DAgger algorithm
• Initialize: dataset E_0, policy ρ_1
• Iteration:
  • Mixed policy: ρ_j = γ_j ρ* + (1 − γ_j) ρ_j
  • Collect dataset E′
  • Aggregate: E_j = E′ ∪ E_{j−1}
  • Update policy ρ_j
• Return: best policy ρ_j
Disadvantage:
• Queries a reference policy constantly
• Safety issue for the environment
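The loop on the DAgger slide can be sketched on a toy problem. Everything concrete below is an assumption made for illustration and is not from the slides: the one-dimensional "lane offset" state, the hypothetical linear expert `reference_policy`, the least-squares learner `fit_linear`, and the decay schedule for γ_j. The sketch also returns the last policy instead of selecting the best one.

```python
import random

def reference_policy(x):
    """Hypothetical expert (rho*): steer back toward the lane centre x = 0."""
    return -0.5 * x

def fit_linear(dataset):
    """Least-squares fit of a linear policy: action = w * state."""
    den = sum(x * x for x, _ in dataset)
    return sum(x * a for x, a in dataset) / den if den > 0 else 0.0

def rollout(policy, rng, steps=50):
    """Drive with `policy` in the toy environment; return the visited states."""
    x, states = rng.uniform(-1.0, 1.0), []
    for _ in range(steps):
        states.append(x)
        x = x + policy(x) + rng.gauss(0.0, 0.05)
    return states

def dagger(n_iters=5, gamma0=0.5, seed=0):
    rng = random.Random(seed)
    # Initialise: dataset E_0 from the reference policy, then a first policy.
    dataset = [(x, reference_policy(x)) for x in rollout(reference_policy, rng)]
    w = fit_linear(dataset)
    for j in range(1, n_iters + 1):
        gamma_j = gamma0 ** j  # assumed schedule: rely less on the expert over time

        def mixed(x, w=w, g=gamma_j):
            # Stochastic mixture: expert with probability gamma_j, else the learner.
            return reference_policy(x) if rng.random() < g else w * x

        # Every visited state is labelled by the reference policy (the costly query).
        new_data = [(x, reference_policy(x)) for x in rollout(mixed, rng)]
        dataset = dataset + new_data   # E_j = E' U E_{j-1}
        w = fit_linear(dataset)        # retrain on the aggregated dataset
    return w, len(dataset)
```

Calling `dagger()` returns the fitted weight and the size of the aggregated dataset. Note that the reference policy is queried at every visited state, which is the cost the slide lists as a disadvantage.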

SafeDAgger algorithm
• Initialize: dataset E_0, policy ρ_1, safety classifier d_1
• Iteration:
  • Mixed policy: ρ_j = γ_j ρ* + (1 − γ_j) ρ_j
  • Collect dataset E′ from the states judged not safe by the safety classifier
  • Aggregate: E_j = E′ ∪ E_{j−1}
  • Update policy ρ_j and safety classifier d_j
• Return: best policy ρ_j and safety classifier d_j
Advantage:
• Query-efficient
• Safety feature
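A minimal sketch of the SafeDAgger idea on a toy problem, assuming that the reference policy takes over (and is queried) only in states the safety classifier judges not safe. The one-dimensional state, the saturating expert `reference_policy`, the tolerance `TAU`, and the threshold-style classifier are all hypothetical stand-ins chosen so the example runs with the standard library; the slides use neural networks for both the policy and the classifier.

```python
import random

TAU = 0.05  # assumed tolerance on the deviation from the reference action

def reference_policy(x):
    """Hypothetical expert: steer toward x = 0 with a saturated steering angle."""
    return max(-0.3, min(0.3, -0.5 * x))

def fit_policy(dataset):
    """Least-squares fit of the primary policy: action = w * state."""
    den = sum(x * x for x, _ in dataset)
    return sum(x * a for x, a in dataset) / den if den > 0 else 0.0

def fit_safety_threshold(dataset, w):
    """Toy safety classifier: 'safe iff |x| <= t', with t chosen to best
    match the label |w * x - reference action| <= TAU on the dataset."""
    labelled = [(abs(x), abs(w * x - a) <= TAU) for x, a in dataset]
    candidates = [0.0] + [ax for ax, _ in labelled]

    def errors(t):
        return sum((ax <= t) != safe for ax, safe in labelled)

    return min(candidates, key=errors)

def safe_dagger(n_iters=5, steps=50, seed=0):
    rng = random.Random(seed)
    # Initialise: dataset E_0 collected while the reference policy drives.
    x, dataset = rng.uniform(-2.0, 2.0), []
    for _ in range(steps):
        a = reference_policy(x)
        dataset.append((x, a))
        x += a + rng.gauss(0.0, 0.05)
    w = fit_policy(dataset)
    t = fit_safety_threshold(dataset, w)
    queries = 0
    for _ in range(n_iters):
        x, unsafe = rng.uniform(-2.0, 2.0), []
        for _step in range(steps):
            if abs(x) <= t:
                a = w * x                    # judged safe: primary policy drives
            else:
                a = reference_policy(x)      # judged not safe: reference drives
                unsafe.append((x, a))        # only these states are labelled
                queries += 1
            x += a + rng.gauss(0.0, 0.05)
        dataset = dataset + unsafe           # E_j = E' U E_{j-1}
        w = fit_policy(dataset)
        t = fit_safety_threshold(dataset, w)
    return w, t, queries, len(dataset)
```

`safe_dagger()` returns the policy weight, the learned safety threshold, the number of reference queries made during the iterations, and the dataset size. The query count can be at most the number of driven steps and is usually smaller, which is what "query-efficient" refers to.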

Safety classifier
• Safety classifier
  • Deviation of a primary policy from a reference policy (defining equation not preserved)
  • Optimal safety classifier (defining equation not preserved)
• Learning safety classifier
  • Minimize a binary cross-entropy loss
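The defining equations did not survive on this slide, so the following is only a sketch under stated assumptions: the deviation is taken to be the squared distance between the primary and reference actions, and a state is labelled safe (1) when that deviation is at most a threshold `tau`. Both choices and all function names are hypothetical. The binary cross-entropy itself is the standard formula.

```python
import math

def deviation(primary_action, reference_action):
    """Assumed deviation: squared Euclidean distance between the two actions."""
    return sum((p - r) ** 2 for p, r in zip(primary_action, reference_action))

def optimal_safety_label(primary_action, reference_action, tau):
    """Assumed target: 1 (safe) if the deviation is within tau, else 0."""
    return 1 if deviation(primary_action, reference_action) <= tau else 0

def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean of -[y log p + (1 - y) log(1 - p)] over all examples."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)
```

Here `labels` would come from `optimal_safety_label` on states where both actions are known, and `probs` would be the classifier's predicted probability that each state is safe.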

Experiment – Setup
• TORCS – open-source racing game
[Figure: training tracks and test tracks]

Experiment – Model
• Input image – 3x160x72
• Convolutional layer – 64x3x3, x 4
• Max pooling – 2x2
• Convolutional layer – 128x5x5
• Feature map
• Primary policy: fully connected layer x 2, outputs control signals and environment variables
• Safety classifier: fully connected layer x 2, outputs safety value
• Optimization algorithm: stochastic gradient descent

Results
[Figure: example safe frames and unsafe frames]

Results  Evaluation on test tracks 1. Mean squared error of steering angle 2. Damage per lap 3. Number of laps 4. Portion of time driven by a reference policy

Results – Mean squared error of steering angle
[Plot: MSE (steering angle) vs. # of DAgger iterations; dashed curve – with traffic, solid curve – without traffic]

Results – Damage per lap
[Plot: damage per lap vs. # of DAgger iterations; dashed curve – with traffic, solid curve – without traffic]

Results – Number of laps
[Plot: avg. # of laps vs. # of DAgger iterations; dashed curve – with traffic, solid curve – without traffic]

Results – Portion of time driven by a reference policy
[Plot: % of c_safe = 0 vs. # of DAgger iterations; dashed curve – with traffic, solid curve – without traffic]

Conclusion  Proposed SafeDAgger algorithm • Query efficient • Safety feature  End-to-end simulated driving • Trained a convolutional neural network to drive in TORCS with traffic Future work  Evaluate SafeDAgger in the real world  Learn to use temporal information
