Consortium ANR Jeunes Chercheurs Jeunes Chercheuses Programme - PowerPoint PPT Presentation

ANR ExTra-Learn: Extraction and Transfer of Knowledge in Reinforcement Learning. A. LAZARIC. ANR project kick-off meeting, Paris. SequeL, INRIA Lille – Nord Europe. November 4th, 2014

Consortium
ANR "Jeunes Chercheurs Jeunes Chercheuses" Programme
INRIA Lille – Nord Europe, SequeL Team
A. Lazaric (CR1), J. Mary (MdC), R. Munos (DR1), M. Valko (CR1)
PhD Student; Post-doc (2yrs)
A. LAZARIC - ExTra-Learn November 4th, 2014 - 2

Reinforcement Learning
[Diagram labels: Environment, Critic, Agent; action, observation, reward]

Reinforcement Learning
[Diagram labels: Environment, Critic, Learning Agent; action, observation, reward]
Learning of a behavior strategy (a policy) which maximizes the long-term sum of rewards (delayed reward) by direct interaction (trial-and-error) with an unknown and uncertain environment.
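The trial-and-error loop described on this slide can be sketched with tabular Q-learning, one standard RL algorithm. This is not taken from the presentation: the five-state chain task, the constants, and all function names below are hypothetical, chosen only to show an agent acting, observing, receiving a delayed reward, and improving its policy.

```python
import random

# Hypothetical toy task (an assumption, not from the slides):
# a chain of states 0..4; reaching state 4 ends the episode with reward 1.
N_STATES = 5
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor applied to delayed rewards
ALPHA = 0.5        # learning rate
EPSILON = 0.2      # probability of a random (exploratory) action


def step(state, action):
    """Environment and critic: return (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, rng):
    """Epsilon-greedy action choice with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=200, seed=0):
    """Learn action values by direct interaction with the environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q, state, rng)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy for the non-terminal states 0..3 (1 means "move right").
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

The reward arrives only at the end of the chain, so the value of moving right has to propagate backwards over several episodes. That propagation is the "delayed reward" aspect of the definition.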

Reinforcement Learning
[Diagram labels: Task, Critic, Agent; reward, observation, action; prior knowledge, designer]

Transfer in Reinforcement Learning
[Diagram labels: Task 1, …, Task n (past knowledge); Transfer; Task n+1, Critic, Agent; reward, observation, action; transferred knowledge]
Transfer of knowledge across tasks to improve the performance of the learning process

Objectives
ExTra-Learn (2014-2017)
Objective 1: Reduce sample complexity
Objective 2: Improve accuracy
Objective 3: Solve problems with complex structure

Tasks
ExTra-Learn
Objective 1: Reduce sample complexity. Task 1: Transfer of Exploration-Exploitation Strategies
Objective 2: Improve accuracy. Task 2: Transfer Solutions for Approximated RL
Objective 3: Solve problems with complex structure. Task 3: Hierarchical Transfer RL

Expected Results
ExTra-Learn
Objective 1: Reduce sample complexity. Task 1: Transfer of Exploration-Exploitation Strategies. Expected result: Algorithms with provable smaller regret
Objective 2: Improve accuracy. Task 2: Transfer Solutions for Approximated RL. Expected result: Algorithms with provable smaller prediction error
Objective 3: Solve problems with complex structure. Task 3: Hierarchical Transfer RL. Expected result: Models and algorithms for automatic hierarchical decomposition

Expected Impact
ExTra-Learn
Objective 1: Reduce sample complexity. Task 1: Transfer of Exploration-Exploitation Strategies. Expected result: Algorithms with provable smaller regret
Objective 2: Improve accuracy. Task 2: Transfer Solutions for Approximated RL. Expected result: Algorithms with provable smaller prediction error
Objective 3: Solve problems with complex structure. Task 3: Hierarchical Transfer RL. Expected result: Models and algorithms for automatic hierarchical decomposition
Novel learning algorithms with potential application to recommendation systems, games, education, online trading, autonomous robotics, online advertising, energy management…

ExTra-Learn https://project.inria.fr/ExTra-Learn/ (under construction) Agence Nationale de Recherche (ANR) Paris www.inria.fr
