  1. Gated Path Planning Networks Lisa Lee Machine Learning Department Carnegie Mellon University Joint work with Emilio Parisotto, Devendra Chaplot, Eric Xing, & Ruslan Salakhutdinov ICML 2018

  2. Path Planning Gated Path Planning Networks (Lee & Parisotto et al., 2018) Carnegie Mellon University 2

  3. Path Planning is a fundamental part of any application that requires navigation:
     • Autonomous vehicles
     • Drones
     • Factory robots
     • Household robots
     https://giphy.com/gifs/battlefield-navigate-selfdriving-AmqDSvVwywm7m

  4. Path Planning: A* search (a popular heuristic algorithm) ⇒ not differentiable.
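For contrast with the differentiable planners discussed next, here is a minimal pure-Python sketch of A* on a 4-connected grid (the maze, unit step costs, and Manhattan heuristic are illustrative assumptions, not from the slides). The discrete priority-queue pops and argmin choices are exactly what make the planner non-differentiable:

```python
import heapq

def astar(grid, start, goal):
    """A* over a 4-connected grid; grid[i][j] == 1 means wall.

    Classical and fast, but popping the lowest-f node is a hard,
    discrete choice: no gradient can flow through the planner.
    """
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    frontier = [(h(start), 0, start)]  # (f = g + h, g, node)
    best_g = {start: 0}
    while frontier:
        f, g, node = heapq.heappop(frontier)
        if node == goal:
            return g  # length of a shortest path
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = node[0] + di, node[1] + dj
            if 0 <= ni < rows and 0 <= nj < cols and grid[ni][nj] == 0:
                ng = g + 1
                if ng < best_g.get((ni, nj), float("inf")):
                    best_g[(ni, nj)] = ng
                    heapq.heappush(frontier, (ng + h((ni, nj)), ng, (ni, nj)))
    return None  # goal unreachable

maze = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(maze, (0, 0), (2, 0)))  # prints: 6
```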

  5. Path Planning: Value Iteration Networks (Tamar et al., 2016) ⇒ fully differentiable!
     • Can be used as a path-planner module in neural architectures while maintaining end-to-end differentiability.
     • VINs have become an important path-planner component used in many recent works:
       • QMDP-Net: Deep Learning for Planning under Partial Observability (Karkus et al., 2017)
       • Cognitive Mapping and Planning for Visual Navigation (Gupta et al., 2017)
       • Unifying Map and Landmark Based Representations for Visual Navigation (Gupta et al., 2017)
       • Memory Augmented Control Networks (Khan et al., 2018)
       • Deep Transfer in RL by Language Grounding (Narasimhan et al., 2017)

  6. Outline of this talk. Problem: VINs are difficult to optimize.
     1. Overview of VIN.
     2. We reframe VIN as a recurrent-convolutional network.
     3. From this perspective, we propose architectural improvements to VIN ⇒ Gated Path Planning Networks (GPPN).
     4. We show that GPPN performs better & alleviates many optimization issues of VIN.

  7. Methods

  8. Overview of VIN. Architecture: the map design & goal location (2 × M × M) pass through a convolution to produce a reward map R̄ (M × M); a second convolution (kernel size 3) produces the action-state value Q̄ (N × M × M); max-pooling over the action channels yields the state value V̄ (M × M), which is fed back through the recurrence.
     Value Iteration (Bellman, 1957):
       Q^(k)(s, a) = R(s, a) + γ Σ_{s'} P(s' | s, a) V^(k−1)(s')
       V^(k)(s) = max_a Q^(k)(s, a)
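The value-iteration recurrence on this slide can be sketched on a toy deterministic grid (the 2 × 3 grid, reward of 1 for stepping onto the goal, and γ = 0.9 are made-up example choices, not the paper's setup):

```python
# Value iteration (Bellman, 1957) on a tiny deterministic grid MDP.
GAMMA = 0.9
ROWS, COLS = 2, 3
GOAL = (0, 2)
ACTIONS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def step(s, a):
    # Deterministic transitions; moving off the grid leaves the state unchanged.
    ni, nj = s[0] + a[0], s[1] + a[1]
    return (ni, nj) if 0 <= ni < ROWS and 0 <= nj < COLS else s

def reward(s, a):
    # Reward 1 for stepping onto the goal, 0 otherwise (example choice).
    return 1.0 if step(s, a) == GOAL else 0.0

V = {(i, j): 0.0 for i in range(ROWS) for j in range(COLS)}
for _ in range(50):  # K iterations of the Bellman recurrence
    # Q(s, a) = R(s, a) + gamma * V(next state);  V(s) = max_a Q(s, a)
    V = {s: 0.0 if s == GOAL else  # goal is absorbing
            max(reward(s, a) + GAMMA * V[step(s, a)] for a in ACTIONS)
         for s in V}

print(round(V[(0, 0)], 3))  # prints: 0.9 -- value decays with distance to the goal
```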

  9. Overview of VIN. The same architecture, viewed as a recurrent-convolutional network,
       V^(k) = max_a ( W^R_a ⊛ R̄ + W^V_a ⊛ V^(k−1) ),
     has:
     • an unconventional nonlinearity (max-pooling over actions),
     • kernel sizes restricted to 3,
     • a hidden dimension of 1.
     Non-gated RNNs are known to be difficult to optimize.

  10. Gated Path Planning Networks (GPPN). Same recurrent-convolutional structure, with two changes:
     • Replace the max-pooling activation with a well-established gated recurrent operator (e.g., LSTM):
         h^(k) = LSTM( W^R ⊛ R̄ + W^h ⊛ h^(k−1) )
     • Allow kernel sizes F > 3.
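To illustrate the gated operator that GPPN substitutes for max-pooling, here is a minimal scalar LSTM cell in pure Python (the weights are toy values; the actual GPPN applies such a cell at every maze location, with convolutions producing its input):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell(x, h_prev, c_prev, w):
    """One LSTM step with scalar state; w holds per-gate (w_x, w_h, b).

    Sketch of the gated recurrent update: the additive, gated cell-state
    path is what alleviates the optimization problems of non-gated RNNs.
    """
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])   # input gate
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])   # forget gate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])   # output gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2]) # candidate
    c = f * c_prev + i * g   # gated, additive cell update
    h = o * math.tanh(c)     # bounded hidden state
    return h, c

w = {k: (0.5, 0.5, 0.0) for k in "ifog"}  # toy shared weights
h, c = 0.0, 0.0
for x in (1.0, -0.5, 0.25):  # unroll K recurrence steps
    h, c = lstm_cell(x, h, c, w)
print(-1.0 < h < 1.0)  # prints: True -- the hidden state stays bounded
```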

  11. Gated Path Planning Networks (GPPN). The gated LSTM update is well known to alleviate many of the optimization problems of standard recurrent networks.
     VIN update:   V^(k) = max_a ( W^R_a ⊛ R̄ + W^V_a ⊛ V^(k−1) )
     GPPN update:  h^(k) = LSTM( W^R ⊛ R̄ + W^h ⊛ h^(k−1) )

  12. Experimental Setup

  13.–16. Maze environments. Test VIN & GPPN on a variety of settings such as:
     • Training dataset size
     • Maze size
     • Maze transition models: NEWS, Moore, Differential Drive

  17. Maze environments: 3D ViZDoom environment, first-person RGB images.

  18. Experimental Results

  19. Our GPPN outperforms VIN in a variety of metrics: learning speed. GPPN learns faster, and its learning curve is more stable than VIN's.
     % Optimal: percentage of states whose predicted paths have optimal length.
     [Plot: % Optimal vs. # epochs for GPPN and VIN; the VIN curve is unstable.]
     Test performance on 15 × 15 mazes with NEWS mechanism, dataset size 25k, and best (K, F) settings for each model.
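A minimal sketch of how the % Optimal metric defined on this slide could be computed from predicted and ground-truth shortest path lengths (the helper name and inputs are hypothetical, for illustration only):

```python
def percent_optimal(pred_lengths, optimal_lengths):
    """% Optimal: share of start states whose predicted path
    has exactly the optimal (shortest-path) length."""
    hits = sum(p == o for p, o in zip(pred_lengths, optimal_lengths))
    return 100.0 * hits / len(optimal_lengths)

# 3 of 4 predicted paths match the optimal length.
print(percent_optimal([4, 6, 5, 9], [4, 6, 5, 7]))  # prints: 75.0
```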

  20. Our GPPN outperforms VIN in a variety of metrics: learning speed, performance. GPPN performs better.
     [Plot: % Optimal for GPPN vs. VIN across maze transition models (NEWS, Moore, Diff. Drive).]
     Test performance on 15 × 15 mazes with dataset size 10k and best (K, F) settings for each model.

  21. Our GPPN outperforms VIN in a variety of metrics: learning speed, performance, generalization. GPPN generalizes better with less data.
     [Plot: % Optimal vs. training dataset size (10k, 25k, 100k) for GPPN and VIN.]
     Test performance on 15 × 15 mazes with NEWS mechanism and best (K, F) settings for each model.

  22. Our GPPN outperforms VIN in a variety of metrics: learning speed, performance, generalization, hyperparameter sensitivity. GPPN is more stable to hyperparameter changes (its curve is flatter).
     [Plot: % Optimal vs. hyperparameter setting index (ordered by % Optimal) for GPPN and VIN.]
     Test performance on 15 × 15 mazes with Differential Drive mechanism, dataset size 100k, and best (K, F) settings for each model.

  23. Our GPPN outperforms VIN in a variety of metrics: learning speed, performance, generalization, hyperparameter sensitivity, random seed sensitivity. GPPN exhibits less variance across random seeds.
     [Plot: % Optimal with variance across seeds for GPPN and VIN on NEWS and Diff. Drive.]
     Test performance on 15 × 15 mazes with dataset size 100k and best (K, F) settings for each model.
