Gated Path Planning Networks
Lisa Lee, Machine Learning Department, Carnegie Mellon University
Joint work with Emilio Parisotto, Devendra Chaplot, Eric Xing, & Ruslan Salakhutdinov
ICML 2018
Path Planning
Path Planning
Path planning is a fundamental part of any application that requires navigation:
● Autonomous vehicles
● Drones
● Factory robots
● Household robots
[GIF: https://giphy.com/gifs/battlefield-navigate-selfdriving-AmqDSvVwywm7m]
Path Planning
A* search (a popular heuristic algorithm) ⇒ not differentiable, so it cannot be trained end-to-end inside a neural architecture.
Path Planning
Value Iteration Networks (Tamar et al., 2016) ⇒ fully differentiable!
• Can be used as a path-planner module in neural architectures while maintaining end-to-end differentiability.
• VINs have become an important path-planner component used in many recent works:
  • QMDP-Net: Deep Learning for Planning under Partial Observability (Karkus et al., 2017)
  • Cognitive Mapping and Planning for Visual Navigation (Gupta et al., 2017)
  • Unifying Map and Landmark Based Representations for Visual Navigation (Gupta et al., 2017)
  • Memory Augmented Control Networks (Khan et al., 2018)
  • Deep Transfer in RL by Language Grounding (Narasimhan et al., 2017)
Outline of this talk
Problem: VINs are difficult to optimize.
1. Overview of VIN.
2. We reframe VIN as a recurrent-convolutional network.
3. From this perspective, we propose architectural improvements to VIN ⇒ Gated Path Planning Networks (GPPN).
4. We show that GPPN performs better & alleviates many optimization issues of VIN.
Methods
Overview of VIN
[Figure: VIN recurrence. The map design & goal location (2 × M × M) are mapped to a reward $\bar{R}$ (M × M); a convolution produces action-state values $\bar{Q}^{(k)}$ (N × M × M); max-pooling over the action channels yields the state values $\bar{V}^{(k)}$ (M × M), which feed back into the recurrence.]

Value Iteration (Bellman, 1957):
$$Q(s, a) = R(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V(s'), \qquad V(s) = \max_a Q(s, a)$$

VIN approximates value iteration with a convolution (kernel size 3) followed by max-pooling:
$$\bar{Q}^{(k)}[l] = W_R[l] \circledast \bar{R} + W_V[l] \circledast \bar{V}^{(k-1)}, \qquad \bar{V}^{(k)} = \max_l \bar{Q}^{(k)}[l]$$
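To make the recurrence concrete, here is a minimal sketch of the VIN update in PyTorch. The function name, shapes, and random weights are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn.functional as F

def vin_update(r, v, w_r, w_v, K):
    # r, v: reward and value maps, each of shape (B, 1, M, M)
    # w_r, w_v: conv kernels of shape (n_actions, 1, 3, 3)
    for _ in range(K):
        q = F.conv2d(r, w_r, padding=1) + F.conv2d(v, w_v, padding=1)  # (B, A, M, M)
        v = q.max(dim=1, keepdim=True)[0]                              # max over actions
    return v

# Example: 8 actions on a 15 x 15 map, K = 20 recurrence steps.
B, M, A = 2, 15, 8
r = torch.randn(B, 1, M, M)
v = vin_update(r, torch.zeros(B, 1, M, M),
               torch.randn(A, 1, 3, 3), torch.randn(A, 1, 3, 3), K=20)
```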
Overview of VIN
Viewed this way, VIN is a recurrent-convolutional network with:
• An unconventional nonlinearity (max-pooling)
• Kernel sizes restricted to 3
• A hidden dimension of 1

VIN update (kernel size 3): $\bar{V}^{(k)} = \max_l \big( W_R[l] \circledast \bar{R} + W_V[l] \circledast \bar{V}^{(k-1)} \big)$

Non-gated RNNs are known to be difficult to optimize.
Gated Path Planning Networks (GPPN)
GPPN makes two changes to VIN:
• Replace the max-pooling activation with a well-established gated recurrent operator (e.g., an LSTM).
• Allow kernel sizes F > 3.

GPPN update (kernel size F): $\bar{V}^{(k)} = \mathrm{LSTM}\big( W_R[l] \circledast \bar{R} + W_V[l] \circledast \bar{V}^{(k-1)} \big)$
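As a rough illustration of the gated update, here is a sketch of a GPPN-style planner in PyTorch: a convolution over the reward and hidden value maps feeds an LSTM cell shared across all spatial locations. The class and parameter names (GPPNPlanner, hidden_dim, kernel_size) are our own assumptions; the paper's exact parameterization may differ:

```python
import torch
import torch.nn as nn

class GPPNPlanner(nn.Module):
    def __init__(self, hidden_dim=32, kernel_size=9):
        super().__init__()
        pad = kernel_size // 2
        # Convolution over the reward map plus the current hidden value map.
        self.conv = nn.Conv2d(1 + hidden_dim, hidden_dim, kernel_size, padding=pad)
        # One LSTM cell shared across all M x M spatial locations.
        self.cell = nn.LSTMCell(hidden_dim, hidden_dim)

    def forward(self, r, K):
        # r: reward map of shape (B, 1, M, M); K: number of recurrence steps.
        B, _, M, _ = r.shape
        h = r.new_zeros(B * M * M, self.cell.hidden_size)
        c = torch.zeros_like(h)
        for _ in range(K):
            hmap = h.view(B, M, M, -1).permute(0, 3, 1, 2)    # (B, H, M, M)
            x = self.conv(torch.cat([r, hmap], dim=1))         # (B, H, M, M)
            x = x.permute(0, 2, 3, 1).reshape(B * M * M, -1)   # one row per map cell
            h, c = self.cell(x, (h, c))                        # gated LSTM update
        return h.view(B, M, M, -1).permute(0, 3, 1, 2)         # (B, H, M, M) features

# Usage: planner = GPPNPlanner(); feats = planner(torch.randn(2, 1, 15, 15), K=20)
```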
Gated Path Planning Networks (GPPN)
The gated LSTM update is well known to alleviate many of the optimization problems of standard recurrent networks.

VIN update:  $\bar{V}^{(k)} = \max_l \big( W_R[l] \circledast \bar{R} + W_V[l] \circledast \bar{V}^{(k-1)} \big)$
GPPN update: $\bar{V}^{(k)} = \mathrm{LSTM}\big( W_R[l] \circledast \bar{R} + W_V[l] \circledast \bar{V}^{(k-1)} \big)$
Experimental Setup
Maze environments
We test VIN & GPPN across a variety of settings:
• Training dataset size
• Maze size
• Maze transition models: NEWS, Moore, and Differential Drive (see the sketch below)
[Figure: example 2D mazes with agent and goal locations under each transition model.]
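For concreteness, here is a hedged sketch of the three transition models; the names and the step function below are illustrative, not taken from the paper's code. NEWS allows moves to the four adjacent cells, Moore allows moves to all eight surrounding cells, and Differential Drive adds an orientation to the state with forward/turn actions:

```python
# NEWS: move to one of the four adjacent cells (North, East, West, South).
NEWS = [(-1, 0), (0, 1), (0, -1), (1, 0)]

# Moore: move to any of the eight surrounding cells.
MOORE = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]

# Differential Drive: the state includes an orientation, and the actions are
# {move forward, turn left, turn right} rather than absolute moves.
CLOCKWISE = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # headings: N, E, S, W

def diff_drive_step(i, j, heading, action):
    """One differential-drive transition (collision checking omitted)."""
    if action == "turn_left":
        return i, j, (heading - 1) % 4
    if action == "turn_right":
        return i, j, (heading + 1) % 4
    di, dj = CLOCKWISE[heading]   # action == "forward"
    return i + di, j + dj, heading
```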
Maze environments
3D ViZDoom environment with first-person RGB image observations.
[Figure: first-person RGB frames from the ViZDoom mazes.]
Experimental Results
Our GPPN outperforms VIN in a variety of metrics:
• Learning speed: GPPN learns faster, while VIN's learning curve is unstable.

% Optimal: percentage of states whose predicted paths have optimal length (a sketch of this metric follows below).

[Plot: % Optimal vs. number of epochs for GPPN and VIN. Test performance on 15 × 15 mazes with NEWS mechanism, dataset size 25k, and best (K, F) settings for each model.]
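As a small illustration of how this metric can be computed (our own sketch; the actual evaluation code may differ):

```python
def percent_optimal(pred_lens, opt_lens):
    """Percentage of start states whose predicted path has optimal length."""
    hits = sum(p == o for p, o in zip(pred_lens, opt_lens))
    return 100.0 * hits / len(opt_lens)

# Example: 3 of 4 predicted paths match the optimal length -> 75.0
print(percent_optimal([4, 7, 5, 9], [4, 7, 5, 8]))
```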
Our GPPN outperforms VIN in a variety of metrics:
• Learning speed
• Performance: GPPN performs better across all transition models.

[Bar plot: % Optimal of GPPN vs. VIN under NEWS, Moore, and Differential Drive. Test performance on 15 × 15 mazes with dataset size 10k and best (K, F) settings for each model.]
Our GPPN outperforms VIN in a variety of metrics:
• Learning speed
• Performance
• Generalization: GPPN generalizes better with less data.

[Plot: % Optimal vs. training dataset size (10k, 25k, 100k) for GPPN and VIN. Test performance on 15 × 15 mazes with NEWS mechanism and best (K, F) settings for each model.]
Our GPPN outperforms VIN in a variety of metrics:
• Learning speed
• Performance
• Generalization
• Hyperparameter sensitivity: GPPN is more stable to hyperparameter changes (its curve across settings is flatter).

[Plot: % Optimal across hyperparameter settings, ordered by % Optimal, for GPPN and VIN. Test performance on 15 × 15 mazes with Differential Drive mechanism, dataset size 100k, and best (K, F) settings for each model.]
Our GPPN outperforms VIN in a variety of metrics:
• Learning speed
• Performance
• Generalization
• Hyperparameter sensitivity
• Random seed sensitivity: GPPN exhibits less variance across random seeds.

[Box plot: variance in % Optimal for GPPN and VIN under NEWS and Differential Drive. Test performance on 15 × 15 mazes with dataset size 100k and best (K, F) settings for each model.]