Studying Autonomous Driving Corner Cases Powered by the Jetson TX Series Sascha Hornauer, Baladitya Yellapragada
Autonomous Driving (video: Marcus Erbar, https://www.youtube.com/watch?v=IHhLpB5MNTQ; Vehicle Detection: Deep Learning with lane detection, https://github.com/merbar/CarND-Vehicle-Detection)
Autonomous Driving
Failures of predominant approaches: “You need to paint the bloody roads here!” - Volvo's North America CEO Lex Kerssemakers to Los Angeles Mayor Eric Garcetti, as the car failed to drive autonomously at the Los Angeles Auto Show
Autonomous Off-Road Driving
Our Results
Platform Description: Model car with Jetson TX1/TX2
Stage 1: Collection of Over 100 Hours of Driving
Stage 2: Imitation Learning - Behavioral Cloning of Driving
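A minimal sketch of the behavioral-cloning setup described on this slide, assuming a PyTorch regression network that maps stacked camera frames to recorded steering/throttle commands. The class name `SteeringNet`, the channel counts, and the `drive_loader` iterable are hypothetical placeholders, not the actual Z2 Color implementation.

```python
# Behavioral-cloning sketch (not the original Z2 Color code).
# Assumes `drive_loader` yields (frame_stack, commands) batches, where
# `commands` holds the human driver's steering/throttle values recorded
# during the 100+ hours of driving data collection.
import torch
import torch.nn as nn

class SteeringNet(nn.Module):                      # hypothetical stand-in network
    def __init__(self, in_channels=12, n_outputs=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, n_outputs)       # steering, throttle

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def train_epoch(model, drive_loader, optimizer):
    criterion = nn.MSELoss()                       # clone the recorded commands
    for frames, commands in drive_loader:
        optimizer.zero_grad()
        loss = criterion(model(frames), commands)
        loss.backward()
        optimizer.step()
```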
Outdoor Domain Results
Domain Change - Indoor Results
Follow Behavior and Failure Cases
Benefits ● Complementary Solution (checkmark comparison table of benefits)
Challenges and Further Directions
● Learned behavior not completely understood
● Metrics reflect only coarse success; better metrics are needed (1)
autonomy = (1 − (number of interventions · 6 seconds) / elapsed time [seconds]) · 100
- How simple can our network be? How complex must it be?
- Can we understand our network performance before deployment/testing?
- What has our network learned?
(1) Bojarski, M., Del Testa, D., Dworakowski, D., et al., 2016, arXiv:1604.07316
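As a concreteness check on metric (1), a minimal sketch of the autonomy computation, following the convention from Bojarski et al. that each human intervention is penalized as 6 seconds of non-autonomous driving; the function name and example numbers are illustrative only.

```python
def autonomy_percent(num_interventions: int, elapsed_seconds: float) -> float:
    """Autonomy metric of Bojarski et al. 2016: each intervention counts
    as 6 seconds of manual (non-autonomous) driving."""
    return (1.0 - (num_interventions * 6.0) / elapsed_seconds) * 100.0

# Example: 5 interventions over a 10-minute (600 s) run -> 95% autonomy.
print(autonomy_percent(5, 600.0))
```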
Visualizing Ego-motion Semantics Implicitly Learned by a Neural Network for Self-Driving Baladitya Yellapragada
Outline ● Need for Visualization-Based Experiments ● Semantic Label Creation for Ego-motion Features ● Ego-motion Feature Experiments
Spatiotemporal Input Novelty: General Network vs. Our Network
Insights from Visualization – Activation Maps Zhou, et al. “Object Detectors Emerge in Deep Scene CNNs”. 2015. ICLR
Potential Optical Flow Relevance ● Optical flow asymptotes are cues for path centers (dashed) Saunders. “View rotation is used to perceive path curvature from optic flow”. 2010. Journal of Vision.
Insights from Visualization – Gradient Ascent Zeiler and Fergus. “Visualizing and Understanding Convolutional Networks”. 2013. ArXiv
Gradient Ascent with our Z2 Color Network
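A hedged sketch of the gradient-ascent visualization idea referenced above (Zeiler and Fergus style input optimization), assuming a trained PyTorch model; the input shape, step count, and learning rate are placeholders, not values from the actual Z2 Color experiments.

```python
import torch

def gradient_ascent_input(model, unit_index, input_shape=(1, 12, 94, 168),
                          steps=200, lr=0.1):
    """Optimize a random input so that one output unit of the trained network
    is maximized; the result hints at what that unit responds to (sketch)."""
    model.eval()
    x = torch.randn(input_shape, requires_grad=True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        activation = model(x)[0, unit_index]
        (-activation).backward()          # gradient *ascent* on the activation
        optimizer.step()
    return x.detach()
```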
Outline ● Need for Visualization-Based Experiments ● Semantic Label Creation for Ego-motion Features ● Ego-motion Feature Experiments
Visual Representation of Network Semantics (panels: Auto vs. Human) Bau, et al. “Network Dissection: Quantifying Interpretability of Deep Visual Representations”. 2017. CVPR.
Semantic Representations in Networks Bau, et al. “Network Dissection: Quantifying Interpretability of Deep Visual Representations”. 2017. CVPR.
Creating Optical Flow Labels per Input Video ● Consistent true driving signals with tight standard deviation
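One way to produce per-video optical-flow labels of the kind described here is dense Farneback flow from OpenCV, averaged over each clip. The OpenCV call is real, but the labeling scheme below is a sketch of the idea, not the exact pipeline used in the experiments.

```python
import cv2
import numpy as np

def mean_flow_label(frames):
    """Average dense optical flow over a clip as a coarse per-video label
    (sketch). `frames` is a list of same-sized grayscale uint8 images."""
    flows = []
    for prev, curr in zip(frames[:-1], frames[1:]):
        flow = cv2.calcOpticalFlowFarneback(
            prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)
    flow = np.mean(flows, axis=0)              # H x W x 2 mean flow field
    magnitude = np.linalg.norm(flow, axis=2)   # per-pixel speed
    return flow, magnitude.mean()
```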
Filtering for Well-Predicted Videos ● Reject videos whose driving signals are not predicted well by the network ○ The remaining task-relevant videos each keep optical flow labels across the shared space
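The filtering step can be sketched as a simple error threshold on the network's driving-signal predictions; the threshold value and data structures below are assumptions for illustration, not figures from the talk.

```python
import numpy as np

def keep_well_predicted(videos, predictions, true_signals, max_mse=0.01):
    """Keep only videos whose driving signals the network predicts well
    (sketch; `max_mse` is an assumed threshold)."""
    kept = []
    for video, pred, truth in zip(videos, predictions, true_signals):
        mse = float(np.mean((np.asarray(pred) - np.asarray(truth)) ** 2))
        if mse <= max_mse:
            kept.append(video)
    return kept
```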
Outline ● Need for Visualization-Based Experiments ● Semantic Label Creation for Ego-motion Features ● Ego-motion Feature Experiments
Implicit Optical Flow Sensitivity Experiment Meyer, et al. “Phase-Based Frame Interpolation for Video”. 2015. Computer Vision and Pattern Recognition.
Speed Condition Examples: One-Third Speed, Normal Speed, Triple Speed
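The three speed conditions can be approximated by resampling frame indices, as in the sketch below; real phase-based interpolation (Meyer et al.) would synthesize in-between frames rather than repeat or skip them, so this is only an illustration of the condition design.

```python
import numpy as np

def resample_speed(frames, speed_factor):
    """Return a clip played at `speed_factor` times normal speed by index
    resampling (sketch): 1/3 repeats frames, 3 skips frames."""
    n = len(frames)
    idx = np.arange(0, n, speed_factor)                 # fractional stride
    idx = np.clip(np.round(idx).astype(int), 0, n - 1)
    return [frames[i] for i in idx]

# third  = resample_speed(frames, 1/3)   # one-third-speed condition
# triple = resample_speed(frames, 3)     # triple-speed condition
```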
Results
Controls ● Temporal Order ● Stereoscopic Disparity ● Domain Correlations
Compare Time in Other Self-Driving Networks ● Single Image Networks [1] ● Recurrent Units [2] ● Spatiotemporal Convolutions [3] [1] Mengxi Wu. “Self-driving car in a simulator with a tiny neural network”. 2017. Medium. [2] Xu, et al. “End-to-end Learning of Driving Models from Large-scale Video Datasets”. 2017. ArXiv. [3] Chi and Mu. “Deep Steering: Learning End-to-End Driving Model from Spatial and Temporal Visual Cues”. 2017. ArXiv.
Control Condition: Temporal Order Normal Order Random Order Reverse Order
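The temporal-order control conditions reduce to permutations of the frame indices; a minimal sketch, with the condition names taken from the slide and a fixed seed added only so the random condition is reproducible:

```python
import random

def reorder_frames(frames, condition, seed=0):
    """Return frames in normal, random, or reverse temporal order (sketch)."""
    if condition == "normal":
        return list(frames)
    if condition == "reverse":
        return list(frames)[::-1]
    if condition == "random":
        shuffled = list(frames)
        random.Random(seed).shuffle(shuffled)
        return shuffled
    raise ValueError(f"unknown condition: {condition}")
```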
Results with Temporal Controls Normal Order Random Order Reverse Order
Control Condition: Stereoscopic Disparity Normal Stereo Reverse Stereo No Stereo
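The stereo control conditions can be expressed as swapping or duplicating the left/right camera streams; the assumption below that the inputs arrive as separate left/right frame lists is for illustration, not the actual input format of the network.

```python
def stereo_condition(left_frames, right_frames, condition):
    """Build (left, right) frame pairs for the three stereo conditions (sketch)."""
    if condition == "normal":        # true stereo pair
        return list(zip(left_frames, right_frames))
    if condition == "reverse":       # swapped cameras -> reversed disparity
        return list(zip(right_frames, left_frames))
    if condition == "none":          # duplicate one camera -> no disparity
        return list(zip(left_frames, left_frames))
    raise ValueError(f"unknown condition: {condition}")
```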
Results with Stereo Controls No Stereo Reverse Stereo Normal Stereo
Control Conditions: Domain Correlations (Forest vs. Sidewalk)
Confounding Domain Correlations Forest w/ Sidewalks
Results by Domain (Trained on / Tested on): Combined on Combined, Sidewalk on Sidewalk, Forest on Forest
Results by Domain (Trained on / Tested on): Combined on Forest, Sidewalk on Forest, Forest on Forest
Conclusions
● Temporal variability is a consistent predictor of the driving signal (bigger label = faster, smaller label = slower).
● Even though it was never explicitly specified, the natural temporal order is preferred over shuffled or reversed order
○ Evidence for potential optical flow sensitivity
● Even though they were never explicitly specified, natural left/right camera differences are preferred over swapped or absent ones
○ Evidence for potential stereoscopic disparity sensitivity
● Optical flow may be implicitly learned, but its salience may not generalize across domains, so more explicit training is better.