  1. Variational Inference and Generative Models CS 294-112: Deep Reinforcement Learning Sergey Levine

  2. Class Notes 1. Homework 3 due next Wednesday 2. Accept CMT peer review invitations • These are required (part of your final project grade) • If you have not received (or cannot find) the invitation, email Kate Rakelly!

  3. Where we are in the course [course roadmap figure: core RL algorithms → advanced topics]

  4. Today’s Lecture 1. Probabilistic latent variable models 2. Variational inference 3. Amortized variational inference 4. Generative models: variational autoencoders • Goals • Understand latent variable models in deep learning • Understand how to use (amortized) variational inference

  5. Probabilistic models

  6. Latent variable models: mixture models (the latent variable z indexes the mixture element)
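
The slide's equations were images and did not survive extraction; the standard mixture-of-Gaussians form it refers to, following the lecture's notation, is:

$$
p(x) = \sum_{k} p(z = k)\, p(x \mid z = k), \qquad p(x \mid z = k) = \mathcal{N}(x;\ \mu_k, \Sigma_k)
$$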

  7. Latent variable models in general: p(x) is built from two “easy” distributions: the prior p(z) (e.g., Gaussian) and the conditional p(x | z) (e.g., a conditional Gaussian whose parameters are output by a neural network)
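
In equations (a reconstruction of the slide's missing math):

$$
p(x) = \int p(x \mid z)\, p(z)\, dz, \qquad p(z) = \mathcal{N}(0, I), \qquad p(x \mid z) = \mathcal{N}\big(\mu_{\mathrm{nn}}(z),\ \sigma_{\mathrm{nn}}(z)\big)
$$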

  8. Latent variable models in RL: conditional latent variable models for multi-modal policies, and latent variable models for model-based RL
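
For instance (a hedged reconstruction; the slide's figures are missing), a multi-modal policy and a latent-state dynamics model can both be written as marginals over a latent variable:

$$
\pi(a \mid s) = \int \pi(a \mid s, z)\, p(z)\, dz, \qquad p(s_{t+1} \mid s_t, a_t) = \int p(s_{t+1} \mid s_t, a_t, z)\, p(z)\, dz
$$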

  9. Other places we’ll see latent variable models: using RL/control + variational inference to model human behavior (Muybridge c. 1870; Mombaur et al. ’09; Li & Todorov ’06; Ziebart ’08), and using generative models and variational inference for exploration

  10. How do we train latent variable models?

  11. Estimating the log-likelihood
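
The maximum-likelihood objective this slide refers to (equations reconstructed from the surrounding slides):

$$
\theta \leftarrow \arg\max_\theta \frac{1}{N} \sum_{i=1}^{N} \log p_\theta(x_i), \qquad p_\theta(x_i) = \int p_\theta(x_i \mid z)\, p(z)\, dz
$$

The integral inside the log is intractable, which is what motivates the variational approximation on the next slides.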

  12. The variational approximation

  13. The variational approximation (via Jensen’s inequality)
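
The bound itself (a standard reconstruction of the slide's derivation): approximate the intractable posterior p(z | x_i) with a distribution q_i(z), then apply Jensen's inequality:

$$
\log p(x_i) = \log \mathbb{E}_{z \sim q_i(z)}\!\left[\frac{p(x_i \mid z)\, p(z)}{q_i(z)}\right] \;\ge\; \mathbb{E}_{z \sim q_i(z)}\big[\log p(x_i \mid z) + \log p(z)\big] + \mathcal{H}(q_i) \;=\; \mathcal{L}_i(p, q_i),
$$

the evidence lower bound (ELBO).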

  14. A brief aside… Entropy. Intuition 1: how random is the random variable? Intuition 2: how large is the log probability in expectation under itself? [figure: a high-entropy (wide) distribution vs. a low-entropy (peaked) one] Looking back at the bound: putting q’s mass where the log probability is large maximizes the first part, and making q as wide as possible also maximizes the second (entropy) part.
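
The definition (a standard reconstruction of the slide's equation):

$$
\mathcal{H}(p) = -\,\mathbb{E}_{x \sim p(x)}[\log p(x)] = -\int p(x) \log p(x)\, dx
$$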

  15. A brief aside… KL-divergence. Intuition 1: how different are two distributions? Intuition 2: how small is the expected log probability of one distribution under another, minus the entropy? Why entropy? Without it the bound could be maximized by collapsing q onto a single high-probability point: the first part is maximized by putting q where the log probability is large, and the second part is maximized by making q as wide as possible.
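
The definition (reconstructed):

$$
D_{\mathrm{KL}}\big(q \,\|\, p\big) = \mathbb{E}_{z \sim q(z)}\!\left[\log \frac{q(z)}{p(z)}\right] = -\,\mathbb{E}_{z \sim q(z)}[\log p(z)] - \mathcal{H}(q)
$$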

  16. The variational approximation

  17. The variational approximation
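
Slides 16 and 17 rewrite the log-likelihood exactly (equations reconstructed):

$$
\log p(x_i) = D_{\mathrm{KL}}\big(q_i(z) \,\|\, p(z \mid x_i)\big) + \mathcal{L}_i(p, q_i)
$$

Since the KL term is non-negative, L_i is a lower bound on log p(x_i), and it is tight exactly when q_i(z) = p(z | x_i); maximizing L_i with respect to q_i therefore minimizes the KL-divergence to the true posterior.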

  18. How do we use this? (The training procedure is sketched below.)
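
A hedged reconstruction of this slide's procedure (the equations were images): for each x_i (or minibatch), sample z ∼ q_i(z); estimate the gradient ∇_θ L_i ≈ ∇_θ log p_θ(x_i | z); take a step θ ← θ + α ∇_θ L_i; then update q_i to maximize L_i(p, q_i). The dangling “how?” is that last step (how to update q_i), which the following slides address.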

  19. What’s the problem?

  20. Break

  21. What’s the problem?

  22. Amortized variational inference (how do we calculate this?)
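
The fix for the previous slide's problem (a separate q_i for every datapoint, so the number of variational parameters grows with the dataset size) is to amortize: train a single inference network that outputs the variational distribution for any x. Reconstructed from the slide:

$$
q_\phi(z \mid x) = \mathcal{N}\big(\mu_\phi(x),\ \sigma_\phi(x)\big), \qquad \mathcal{L}_i = \mathbb{E}_{z \sim q_\phi(z \mid x_i)}\big[\log p_\theta(x_i \mid z) + \log p(z)\big] + \mathcal{H}\big(q_\phi(z \mid x_i)\big)
$$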

  23. Amortized variational inference: look up the formula for the entropy of a Gaussian; for the gradient of the expectation term we can just use policy gradient! What’s wrong with this gradient?
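
Spelling this out (a reconstruction): write the first term as J(φ) = E_{z∼q_φ(z|x_i)}[r(x_i, z)] with r(x_i, z) = log p_θ(x_i | z) + log p(z). The score-function (policy-gradient) estimator is

$$
\nabla_\phi J(\phi) \approx \frac{1}{M} \sum_{m=1}^{M} \nabla_\phi \log q_\phi(z_m \mid x_i)\; r(x_i, z_m), \qquad z_m \sim q_\phi(z \mid x_i),
$$

and the entropy of a diagonal Gaussian is available in closed form, H = ½ Σ_j (1 + log 2πσ_j²). What's wrong with this gradient is the same thing that's wrong with the policy gradient: it has high variance.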

  24. The reparameterization trick. Is there a better way? Most autodiff software (e.g., TensorFlow) will compute this for you!
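
The trick: instead of sampling z ∼ q_φ(z | x) directly, write z = μ_φ(x) + ε σ_φ(x) with ε ∼ N(0, I), so the stochasticity is pushed into an input and gradients flow through μ_φ and σ_φ. A minimal runnable sketch, assuming PyTorch (the lecture mentions TensorFlow; any autodiff library works the same way) and a hypothetical one-layer encoder:

```python
import torch

# hypothetical encoder: maps x to the mean and log-std of q(z | x)
enc = torch.nn.Linear(10, 2 * 4)

x = torch.randn(64, 10)                 # fake data batch, for illustration only
mu, log_std = enc(x).chunk(2, dim=-1)

eps = torch.randn_like(mu)              # noise is sampled outside the computation graph
z = mu + log_std.exp() * eps            # z is a differentiable function of (mu, log_std)

loss = (z ** 2).mean()                  # any objective built from z...
loss.backward()                         # ...backpropagates into the encoder parameters
```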

  25. Another way to look at it… the KL term often has a convenient analytical form (e.g., the KL-divergence between Gaussians)
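
The rewritten objective and the closed form it alludes to (reconstructed):

$$
\mathcal{L}_i = \mathbb{E}_{z \sim q_\phi(z \mid x_i)}\big[\log p_\theta(x_i \mid z)\big] - D_{\mathrm{KL}}\big(q_\phi(z \mid x_i) \,\|\, p(z)\big),
$$

where, for a diagonal Gaussian q and a standard Gaussian prior,

$$
D_{\mathrm{KL}}\big(\mathcal{N}(\mu, \sigma^2) \,\|\, \mathcal{N}(0, I)\big) = \tfrac{1}{2} \sum_j \big(\sigma_j^2 + \mu_j^2 - 1 - \log \sigma_j^2\big).
$$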

  26. Reparameterization trick vs. policy gradient • Policy gradient • Can handle both discrete and continuous latent variables • High variance, requires multiple samples & small learning rates • Reparameterization trick • Only continuous latent variables • Very simple to implement • Low variance

  27. The variational autoencoder
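
Putting the pieces together, a compact, hedged sketch of a VAE training step (PyTorch again; the architecture, sizes, and names are illustrative, not from the lecture):

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=32, h=256):
        super().__init__()
        # encoder q_phi(z|x) outputs mean and log-std; decoder p_theta(x|z) outputs a mean
        self.enc = nn.Sequential(nn.Linear(x_dim, h), nn.ReLU(), nn.Linear(h, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, h), nn.ReLU(), nn.Linear(h, x_dim))

    def neg_elbo(self, x):
        mu, log_std = self.enc(x).chunk(2, dim=-1)
        z = mu + log_std.exp() * torch.randn_like(mu)        # reparameterization trick
        x_mean = self.dec(z)
        # closed-form KL( q(z|x) || N(0, I) ), summed over latent dimensions
        kl = 0.5 * (log_std.exp() ** 2 + mu ** 2 - 1.0 - 2.0 * log_std).sum(-1)
        # Gaussian log-likelihood with unit variance, up to an additive constant
        recon = 0.5 * ((x - x_mean) ** 2).sum(-1)
        return (recon + kl).mean()                           # minimize the negative ELBO

model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(128, 784)                                    # fake batch, for illustration only
loss = model.neg_elbo(x)
opt.zero_grad(); loss.backward(); opt.step()
```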

  28. Using the variational autoencoder
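
To use the trained model as a generative model, sample from the prior and decode (continuing the sketch above):

```python
with torch.no_grad():
    z = torch.randn(16, 32)      # z ~ p(z) = N(0, I)
    x_new = model.dec(z)         # the decoder mean serves as the generated sample
```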

  29. Conditional models
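
A conditional model marginalizes the latent variable inside a conditional distribution (reconstructed; e.g., y = action and x = state for a multi-modal policy):

$$
p(y \mid x) = \int p(y \mid x, z)\, p(z)\, dz
$$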

  30. Examples

  31. (1) Collect data; (2) learn an embedding of the image and a dynamics model (jointly); (3) run iLQG to learn to reach the goal image. This is a type of variational autoencoder with a temporally decomposed latent state!

  32. Local models with images

  33. Local models with images: a variational autoencoder with stochastic dynamics

  34. We’ll see more of this for: using RL/control + variational inference to model human behavior (Muybridge c. 1870; Mombaur et al. ’09; Li & Todorov ’06; Ziebart ’08), and using generative models and variational inference for exploration
