Learning to Navigate at City Scale


  1. Learning to Navigate … at City Scale. Raia Hadsell, Senior Research Scientist. [BBH Brazil for Renault / Art: Pedro Utzeri]

  2. Navigation Where am I going? Where am I? Where did I start? How distant is A from B? What is the shortest path from A to B? Have I been here before? How long until we get there?

  3. Research themes: the real world; exploration; multi-task prediction; modularity and transfer learning; representation of sensory data; memory; one-shot navigation in unseen environments; grounding in neuroscience.

  4. Research themes (roadmap repeated).

  5. Can we teach agents to explore partially observed environments? Learning to Navigate in Complex Environments. Piotr Mirowski*, Razvan Pascanu*, Fabio Viola, Hubert Soyer, Andy Ballard, Andrea Banino, Misha Denil, Ross Goroshin, Laurent Sifre, Koray Kavukcuoglu, Dharshan Kumaran and Raia Hadsell. [MIT News / Photo: Mark Ostow] arxiv.org/abs/1602.01783 (ICLR 2017)

  6. Navigation mazes [Beattie et al. (2016), “DeepMind Lab”, github.com/deepmind/lab]. Rewards: +10 at the goal, +1 for apples. Within an episode: fixed goal (static, or randomly changing between episodes); random respawns.

  7. Given sparse rewards… explore and learn spatial knowledge. Accelerate reinforcement learning through auxiliary losses. Derive spatial knowledge from auxiliary tasks: depth prediction and local loop-closure prediction. Assess navigation skills through position decoding.
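
A minimal sketch of how such auxiliary losses can be folded into the RL objective (Python/PyTorch assumed for illustration; the head modules, loss weights, and the use of plain regression for depth are assumptions rather than the paper's exact formulation):

```python
import torch.nn.functional as F

def augmented_loss(rl_loss, features, depth_head, loop_head,
                   depth_target, loop_target, beta_depth=1.0, beta_loop=1.0):
    """Add auxiliary prediction losses to the base RL loss (illustrative sketch)."""
    depth_pred = depth_head(features)                  # coarse depth prediction from hiddens
    loop_logit = loop_head(features).squeeze(-1)       # binary loop-closure prediction
    depth_loss = F.mse_loss(depth_pred, depth_target)  # regression kept simple here
    loop_loss = F.binary_cross_entropy_with_logits(loop_logit, loop_target)
    return rl_loss + beta_depth * depth_loss + beta_loop * loop_loss
```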

  8. Agent training: advantage actor-critic reinforcement learning [Mnih, Badia et al. (2016), “Asynchronous Methods for Deep Reinforcement Learning”]. The agent observes state $s_t$ and takes action $a_t$. The value and policy heads ($v$ and $\pi$, on top of a CNN and a policy LSTM) are updated with an estimate of the policy gradient given by the k-step advantage function $A$. Policy term: $\nabla_\theta \log \pi(a_t \mid s_t; \theta)\, A(s_t, a_t; \theta_v)$.
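
To make the update concrete, here is a compact sketch of a k-step advantage actor-critic loss for one rollout (PyTorch assumed for illustration; the entropy bonus and coefficients are standard A3C choices, not values from the slide):

```python
import torch

def a3c_loss(log_probs, entropies, values, rewards, bootstrap_value,
             gamma=0.99, value_coef=0.5, entropy_coef=0.01):
    """log_probs, entropies, values, rewards: length-k tensors for one rollout;
    bootstrap_value: V(s_{t+k}) used to bootstrap the k-step return."""
    returns, R = [], bootstrap_value
    for r in reversed(rewards):                  # discounted k-step returns, backwards
        R = r + gamma * R
        returns.append(R)
    returns = torch.stack(returns[::-1])
    advantages = returns - values
    policy_loss = -(log_probs * advantages.detach()).mean()   # grad_theta log pi * A
    value_loss = advantages.pow(2).mean()                      # regress v towards returns
    return policy_loss + value_coef * value_loss - entropy_coef * entropies.mean()
```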

  9. Navigation agent architectures. [Diagram of three variants: (i) feedforward, CNN → π, v; (ii) recurrent, CNN → LSTM → π, v; (iii) navigation agent, CNN → stacked LSTMs with additional inputs reward_{t-1}, velocity_t and action_{t-1}, outputting π, v and an auxiliary depth prediction.]
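
A rough PyTorch-style sketch of the stacked-LSTM navigation agent described above (layer sizes, the velocity dimensionality and the depth-head shape are assumptions for illustration, not the paper's exact network):

```python
import torch
import torch.nn as nn

class NavAgentSketch(nn.Module):
    """CNN encoder, two stacked LSTMs fed with the previous reward, current velocity
    and previous action, policy/value heads, and an auxiliary depth head."""
    def __init__(self, n_actions, hidden=256, velocity_dim=6, depth_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 9 * 9, hidden), nn.ReLU())           # for 84x84 RGB inputs
        self.lstm1 = nn.LSTMCell(hidden + 1, hidden)             # features + reward_{t-1}
        self.lstm2 = nn.LSTMCell(2 * hidden + velocity_dim + n_actions, hidden)
        self.policy = nn.Linear(hidden, n_actions)               # pi
        self.value = nn.Linear(hidden, 1)                        # v
        self.depth = nn.Linear(hidden, depth_dim)                # auxiliary depth prediction

    def forward(self, obs, prev_reward, velocity, prev_action_onehot, state1, state2):
        x = self.encoder(obs)
        h1, c1 = self.lstm1(torch.cat([x, prev_reward], dim=-1), state1)
        h2, c2 = self.lstm2(torch.cat([x, h1, velocity, prev_action_onehot], dim=-1), state2)
        return self.policy(h2), self.value(h2), self.depth(h1), (h1, c1), (h2, c2)
```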

  10. Results on large static mazes. [Plots of reward at goal vs. environment steps: “Depth prediction as auxiliary task” and “Importance of auxiliary tasks”.] Depth prediction as an auxiliary task outperforms using depth as an input.

  11. Mirowski, Pascanu et al. (2017), “Learning to Navigate in Complex Environments”

  12. 3D, first-person environment; partially observed; procedural variations… but it’s not real.

  13. Research themes (roadmap repeated).

  14. Can we solve navigation tasks in the real world? Learning to Navigate in Cities Without a Map. Piotr Mirowski*, Matthew Koichi Grimes, Mateusz Malinowski, Karl Moritz Hermann, Keith Anderson, Denis Teplyashin, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman and Raia Hadsell. arxiv.org/abs/1804.00168

  15. Can we solve navigation tasks in the real world? Street View.

  16. Street View as an RL environment: StreetLearn. Observations: RGB panoramic Street View images, which we crop and render at 84x84. The environment is built on the Google Maps graph of panoramas. Actions: move to the next node, turn left/right.
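
A toy sketch of such a panorama-graph environment (this is not the released StreetLearn API; the graph format, crop logic and action names are assumptions for illustration):

```python
import numpy as np

class PanoGraphEnv:
    """Nodes are panoramas, edges carry bearings; actions rotate the agent in place
    or step along the outgoing edge closest to its current heading."""
    ACTIONS = ("turn_left", "turn_right", "move_forward")

    def __init__(self, graph, panoramas, fov_deg=60.0, turn_deg=22.5):
        self.graph = graph            # node_id -> list of (neighbor_id, bearing_deg)
        self.panoramas = panoramas    # node_id -> equirectangular RGB array (H, W, 3)
        self.fov_deg, self.turn_deg = fov_deg, turn_deg
        self.node, self.heading = next(iter(graph)), 0.0

    def observation(self):
        # Crop the panorama around the current heading; a full implementation would
        # also project and resize the crop to 84x84.
        pano = self.panoramas[self.node]
        width = pano.shape[1]
        center = int((self.heading % 360.0) / 360.0 * width)
        half = max(1, int(self.fov_deg / 360.0 * width) // 2)
        return np.take(pano, range(center - half, center + half), axis=1, mode="wrap")

    def step(self, action):
        if action == "turn_left":
            self.heading = (self.heading - self.turn_deg) % 360.0
        elif action == "turn_right":
            self.heading = (self.heading + self.turn_deg) % 360.0
        else:  # move_forward: follow the edge best aligned with the heading
            self.node = min(self.graph[self.node],
                            key=lambda nb: abs((nb[1] - self.heading + 180.0) % 360.0 - 180.0))[0]
        return self.observation()
```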

  17. New York, London, Paris. 14,000 to 60,000 nodes (panoramas) per “city”, covering a range of 3.5-5 km. The discrete action space allows rotating in place and stepping to the next node. The multi-city dataset and RL environment will be released later this year.

  18. The Courier Task

  19. The Knowledge: the test to get a black-cab licence in London. Candidates study for 3-4 years and memorize 25,000 roads and 20,000 named locations. By the time they have passed the exam, their hippocampi are ‘significantly enlarged’. [Woollett & Maguire (2011), “Acquiring ‘the Knowledge’ of London’s Layout Drives Structural Brain Changes”, Current Biology]


  21. The Courier Task: random start and target; navigation without a map; shaped reward when close to the goal (<200 m); actions: rotate left, rotate right, or step forward. Inputs for the agent at every time step t: 84x84 RGB image observations and a landmark-based goal description.
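
As an illustration of the two agent inputs and the shaped reward above, a small sketch (distances in metres; the kernel width, reward scale and thresholds are assumptions, not the paper's constants):

```python
import numpy as np

def landmark_goal_vector(goal_xy, landmark_xy, scale=100.0):
    """Encode a goal by its normalised proximity to a fixed set of landmarks."""
    d = np.linalg.norm(landmark_xy - goal_xy, axis=1)   # distance to each landmark (m)
    scores = np.exp(-d / scale)
    return scores / scores.sum()

def shaped_reward(distance_to_goal_m, goal_radius_m=10.0, shaping_radius_m=200.0,
                  goal_reward=1.0):
    """Full reward at the goal; a small proximity bonus inside the shaping radius."""
    if distance_to_goal_m <= goal_radius_m:
        return goal_reward
    if distance_to_goal_m <= shaping_radius_m:
        return 0.1 * goal_reward * (1.0 - distance_to_goal_m / shaping_radius_m)
    return 0.0
```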

  22. Architecture [Mnih, Badia et al. (2016), “Asynchronous Methods for Deep Reinforcement Learning”]

  23. Architecture
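
The architecture itself is shown as a diagram; slide 28 later refers to a convnet, a goal LSTM and a policy LSTM, so a rough sketch under that reading (wiring and sizes are assumptions for illustration, not the paper's exact network):

```python
import torch
import torch.nn as nn

class CityNavSketch(nn.Module):
    """Convnet encoder; a goal LSTM that consumes the landmark-based goal description
    together with visual features; a policy LSTM with policy and value heads."""
    def __init__(self, n_actions, goal_dim, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * 9 * 9, hidden), nn.ReLU())
        self.goal_lstm = nn.LSTMCell(hidden + goal_dim, hidden)
        self.policy_lstm = nn.LSTMCell(2 * hidden, hidden)
        self.policy = nn.Linear(hidden, n_actions)
        self.value = nn.Linear(hidden, 1)

    def forward(self, obs, goal_vec, goal_state, policy_state):
        x = self.encoder(obs)
        hg, cg = self.goal_lstm(torch.cat([x, goal_vec], dim=-1), goal_state)
        hp, cp = self.policy_lstm(torch.cat([x, hg], dim=-1), policy_state)
        return self.policy(hp), self.value(hp), (hg, cg), (hp, cp)
```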

  24. Successful learning on all three cities. [Plots of reward at goal vs. environment steps for New York City (around NYU) and Central London.]

  25. Analysis of goal acquisition. [Figures: examples of 1000-step episodes, and examples of the value function for the same target.]

  26. Generalization to new goal areas: goal locations and landmark locations held out during training.

  27. Architecture

  28. Multi-city modular transfer. Given a sequence of cities (regions of NYC), compare the following training regimes: single, joint, and modular transfer. Navigation in the target city succeeds even though the convnet and policy LSTM are frozen and only the goal LSTM is trained. Moreover, transfer success is correlated with the number of cities seen during pre-training.
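
A minimal sketch of that transfer recipe (PyTorch assumed; the attribute names encoder, policy_lstm and goal_lstm follow the sketch after slide 23 and are assumptions about the module layout):

```python
import torch

def prepare_for_target_city(agent, lr=1e-4):
    """Freeze the convnet and policy LSTM trained on the source cities and train only
    a (re-initialised) goal LSTM on the target city. Illustrative, not the paper's code."""
    for module in (agent.encoder, agent.policy_lstm):
        for p in module.parameters():
            p.requires_grad = False
    return torch.optim.RMSprop(agent.goal_lstm.parameters(), lr=lr)
```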

  29. Many thanks to many collaborators!
  • Learning to Navigate in Complex Environments (ICLR 2017): Piotr Mirowski*, Razvan Pascanu*, Fabio Viola, Hubert Soyer, Andy Ballard, Andrea Banino, Misha Denil, Ross Goroshin, Laurent Sifre, Koray Kavukcuoglu, Dharshan Kumaran and Raia Hadsell
  • Learning to Navigate in Cities Without a Map (NIPS 2018): Piotr Mirowski*, Matthew Koichi Grimes, Keith Anderson, Denis Teplyashin, Mateusz Malinowski, Karl Moritz Hermann, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell
  www.deepmind.com www.raiahadsell.com
