Fast Adaptation via Policy-Dynamics Value Functions
Roberta Raileanu (NYU), Max Goldstein (NYU), Arthur Szlam (FAIR), Rob Fergus (NYU)
ICML 2020
Dynamics Often Change in the Real World
How can agents rapidly adapt to changes in the environment's dynamics?
Answer: learn a general value function in the space of policies and dynamics.
Policy-Dynamics Value Function (PD-VF): a value function that predicts the total future reward of a fixed policy under fixed dynamics.
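In symbols, a minimal restatement (the notation z_pi and z_d for the policy and dynamics embeddings anticipates the embedding slides below):

```latex
% PD-VF: total future reward of a fixed policy \pi under fixed dynamics d,
% expressed as a function of their embeddings z_\pi and z_d.
V(z_\pi, z_d) \;=\; \mathbb{E}\!\left[\, \sum_{t \ge 0} r_t \;\middle|\;
    a_t \sim \pi(\cdot \mid s_t),\; s_{t+1} \sim \mathcal{T}_d(\cdot \mid s_t, a_t) \right]
```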
Fast Adaptation to New Dynamics
A family of environments, where each environment has a different, unobserved transition function. Train on a family of different but related dynamics; test on new dynamics.
Training Recipe
1. Reinforcement Learning Phase: train individual policies on each training environment.
2. Self-Supervised Learning Phase: learn policy and dynamics embeddings using the collected trajectories.
3. Supervised Learning Phase: learn a value function over this space of policies and environments.
4. Evaluation Phase: infer the dynamics of a new environment from a few steps, then find the policy that maximizes the learned value function.
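A minimal orchestration sketch of the four phases. All helper callables here (train_policy, collect, the fit_* functions, collect_steps, best_policy_embedding) are hypothetical placeholders, not the authors' code:

```python
# Hedged sketch of the PD-VF training recipe; helpers are passed in as
# callables so the skeleton stays self-contained.

def train_pd_vf(train_envs, train_policy, collect,
                fit_policy_encoder, fit_dynamics_encoder, fit_value_function):
    # Phase 1 (RL): train one policy per training environment (e.g., with PPO).
    policies = [train_policy(env) for env in train_envs]
    # Phase 2 (self-supervised): collect trajectories from (policy, environment)
    # pairs and fit the two encoders on them.
    trajs = [collect(env, pi) for env in train_envs for pi in policies]
    policy_enc = fit_policy_encoder(trajs)    # trajectory  -> z_pi
    dyn_enc = fit_dynamics_encoder(trajs)     # transitions -> z_d
    # Phase 3 (supervised): fit a value function over (z_pi, z_d) pairs.
    value_fn = fit_value_function(trajs, policy_enc, dyn_enc)
    return policy_enc, dyn_enc, value_fn

def adapt(new_env, dyn_enc, value_fn, collect_steps, n_steps=10):
    # Phase 4 (evaluation): infer z_d from a few steps in the new environment,
    # then pick the policy embedding that maximizes the learned value function
    # (closed form; see the OPE sketch below).
    z_d = dyn_enc(collect_steps(new_env, n_steps))
    return value_fn.best_policy_embedding(z_d)
```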
Learning Policy and Dynamics Embeddings: learn a policy embedding and a dynamics embedding from the collected trajectories.
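A minimal sketch of the two self-supervised objectives in PyTorch, assuming mean-pooling encoders and MLP decoders (the architectures and dimensions are illustrative assumptions, not the paper's exact networks): the policy decoder reconstructs actions from states given z_pi, and the dynamics decoder reconstructs next states from state-action pairs given z_d.

```python
import torch
import torch.nn as nn

class TrajectoryEncoder(nn.Module):
    """Mean-pools per-step features into a unit-norm embedding (assumed design)."""
    def __init__(self, in_dim, emb_dim):
        super().__init__()
        self.step = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                  nn.Linear(64, emb_dim))
    def forward(self, x):                 # x: (T, in_dim)
        z = self.step(x).mean(dim=0)      # aggregate over time
        return z / z.norm()               # unit norm, matching the OPE constraint

state_dim, act_dim, emb_dim = 8, 2, 8    # illustrative sizes

# Policy embedding: encode (s, a) pairs; decode actions from (s, z_pi).
policy_enc = TrajectoryEncoder(state_dim + act_dim, emb_dim)
policy_dec = nn.Sequential(nn.Linear(state_dim + emb_dim, 64), nn.ReLU(),
                           nn.Linear(64, act_dim))

# Dynamics embedding: encode (s, a, s') transitions; decode s' from (s, a, z_d).
dyn_enc = TrajectoryEncoder(state_dim + act_dim + state_dim, emb_dim)
dyn_dec = nn.Sequential(nn.Linear(state_dim + act_dim + emb_dim, 64), nn.ReLU(),
                        nn.Linear(64, state_dim))

def embedding_losses(states, actions, next_states):
    # states: (T, state_dim), actions: (T, act_dim), next_states: (T, state_dim)
    T = len(states)
    z_pi = policy_enc(torch.cat([states, actions], dim=-1))
    pred_a = policy_dec(torch.cat([states, z_pi.expand(T, -1)], dim=-1))
    z_d = dyn_enc(torch.cat([states, actions, next_states], dim=-1))
    pred_s = dyn_dec(torch.cat([states, actions, z_d.expand(T, -1)], dim=-1))
    return (nn.functional.mse_loss(pred_a, actions),
            nn.functional.mse_loss(pred_s, next_states))
```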
Learning the Policy-Dynamics Value Function
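A hedged sketch of one plausible PD-VF parameterization: a network maps the initial state and dynamics embedding to a matrix A, the value is the quadratic form z_pi^T A z_pi, and the prediction is regressed onto observed episode returns. This specific form is an assumption chosen to be consistent with the closed-form solution on the next slide, not necessarily the paper's exact network.

```python
import torch
import torch.nn as nn

class PDValueFunction(nn.Module):
    """V(s0, z_pi, z_d) = z_pi^T A z_pi, with A predicted from (s0, z_d)."""
    def __init__(self, state_dim, emb_dim):
        super().__init__()
        self.emb_dim = emb_dim
        # Maps (s0, z_d) to a matrix A of shape (emb_dim, emb_dim).
        self.net = nn.Sequential(
            nn.Linear(state_dim + emb_dim, 128), nn.ReLU(),
            nn.Linear(128, emb_dim * emb_dim),
        )
    def matrix(self, s0, z_d):
        return self.net(torch.cat([s0, z_d], dim=-1)).view(self.emb_dim,
                                                           self.emb_dim)
    def forward(self, s0, z_pi, z_d):
        A = self.matrix(s0, z_d)
        return z_pi @ A @ z_pi            # predicted total future reward

# Supervised phase: regress predicted values onto empirical episode returns.
vf = PDValueFunction(state_dim=8, emb_dim=8)
opt = torch.optim.Adam(vf.parameters(), lr=1e-3)

def value_loss(s0, z_pi, z_d, episode_return):
    return (vf(s0, z_pi, z_d) - episode_return) ** 2
```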
Evaluation Phase: the Optimal Policy Embedding (OPE) has a closed-form solution: the top singular vector of the SVD of A.
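A minimal sketch of the closed-form OPE. For a quadratic form in a unit-norm z_pi, the maximizer over the unit sphere is the top eigenvector of the symmetric part of A, which coincides with the top singular vector named on the slide when A is symmetric positive semi-definite:

```python
import torch

def optimal_policy_embedding(A: torch.Tensor) -> torch.Tensor:
    """Closed-form argmax of z^T A z over unit-norm z."""
    A_sym = 0.5 * (A + A.T)                    # symmetrize
    eigvals, eigvecs = torch.linalg.eigh(A_sym)  # eigenvalues in ascending order
    return eigvecs[:, -1]                      # eigenvector of the largest eigenvalue

# Usage: infer z_d from a few steps in the new environment, then
# z_pi_star = optimal_policy_embedding(vf.matrix(s0, z_d))
```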
Environments: Spaceship, Swimmer, and Ant-Wind (continuous dynamics); Ant-Legs (discrete dynamics).
Evaluation on Unseen Environments
Learned Embeddings [figure: policy embeddings, colored by policy; dynamics embeddings, colored by dynamics]
Takeaways
● Learn a value function in a space of policies and dynamics
● Infer the dynamics of a new environment from only a few interactions
● No need for parameter updates, long rollouts, or dense rewards to adapt
● Improved performance on unseen environments
Future Work ● Reward function variation → condition W on a task embedding ● Multi-agent settings → dynamics given by the others’ policies ● Continual learning ● Integrate prior knowledge / constraints ● Estimate other metrics apart from reward
Thank you!