Meta-Amortized Variational Inference and Learning
Kristy Choi, CS236: December 4th, 2019
Probabilistic Inference
Probabilistic inference is a particular way of viewing the world: prior belief + observations = updated (posterior) belief. Typically the beliefs are "hidden" (unobserved), and we want to model them using latent variables.
Probabilistic Inference
Many machine learning applications can be cast as probabilistic inference queries:
● Medical diagnosis
● Bioinformatics
● Human cognition
● Computer vision
Medical Diagnosis Example
(Graphical model: latent z = identity of disease, observed x = symptoms, plate over N patients.)
Goal: infer the identity of the disease given a set of observed symptoms from a patient population.
Exact Inference
Same graphical model (disease z, symptoms x, N patients): the exact posterior requires the marginal p(x), an intractable integral over z. We can't compute this even if we want to.

Approximate Variational Inference
Approximate the posterior with a member of a family of tractable distributions, learning a new q per data point (the dependence on x). This turns an intractable inference problem into an optimization problem; a standard sketch of the objective follows below.
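To make this concrete, here is the standard variational setup the two slides above refer to (standard VI notation, not copied from the talk): the marginal likelihood is an intractable integral, so we instead maximize a tractable lower bound, the ELBO.

```latex
% Intractable marginal (exact inference):
p_\theta(x) = \int p_\theta(x \mid z)\, p(z)\, dz

% Tractable surrogate: maximize the evidence lower bound (ELBO)
% over a variational distribution q from a tractable family.
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q(z)}\big[\log p_\theta(x \mid z)\big]
  - \mathrm{KL}\big(q(z)\,\|\,p(z)\big)
```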
Amortized Variational Inference
Instead of optimizing a separate q per data point, learn a deterministic mapping that predicts z as a function of x. This is the key to scalability, and is exactly the VAE formulation (sketched in code below).
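A minimal code sketch of the amortization step (illustrative PyTorch; the architecture, sizes, and names are my assumptions, not the talk's implementation): one shared encoder predicts the parameters of q(z | x) for every data point.

```python
import torch
import torch.nn as nn

class AmortizedEncoder(nn.Module):
    """Maps each observation x directly to the parameters of q(z | x).
    Sizes and architecture here are illustrative assumptions."""
    def __init__(self, x_dim=784, z_dim=16, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)       # mean of q(z | x)
        self.logvar = nn.Linear(hidden, z_dim)   # log-variance of q(z | x)

    def forward(self, x):
        h = self.net(x)
        return self.mu(h), self.logvar(h)

# One shared network amortizes inference across all data points:
encoder = AmortizedEncoder()
mu, logvar = encoder(torch.randn(32, 784))  # batch of 32 "patients"
z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterized sample
```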
Multiple Patient Populations
(Figure: five copies of the same graphical model, one per population, each with disease z, symptoms x, and plate N.)
With a separate inference network per population, the doctor is equivalently re-learning how to diagnose the illness for each one :/
Multiple Patient Populations
Share statistical strength across different populations to infer latent representations that transfer to similar, but previously unseen, populations (distributions).
(Naive) Meta-Amortized Variational Inference
(Figure: datasets drawn from a meta-distribution, each with its own inference network.)
Meta-Amortized Variational Inference
(Figure: a single shared meta-inference network serving all datasets drawn from the meta-distribution.)
Meta-Inference Network
● The meta-inference model takes in 2 inputs:
○ the marginal distribution
○ a query point
● Mapping: from (marginal, query point) to a posterior over z
● Parameterize the encoder with a neural network
● Dataset: represent each marginal distribution as a set of samples
(One possible form of the resulting objective is sketched below.)
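Putting the pieces together, one plausible form of the objective (my notation, in the spirit of the slides; see the paper for the exact statement) averages the ELBO over datasets D drawn from the meta-distribution, with a single shared inference network q_φ(z | x, D):

```latex
\max_{\phi,\,\theta}\;
  \mathbb{E}_{D \sim p_{\text{meta}}}\,
  \mathbb{E}_{x \sim D}
  \Big[
    \mathbb{E}_{q_\phi(z \mid x, D)}\big[\log p_\theta(x \mid z)\big]
    - \mathrm{KL}\big(q_\phi(z \mid x, D)\,\|\,p(z)\big)
  \Big]
```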
In Practice: MetaVAE
● Summary network ingests samples from each dataset
● Aggregation network performs inference, given the dataset summary and the query point
● Each dataset i gets its own decoder_i
(A code sketch of both networks follows below.)
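A hedged sketch of the two networks just described (the names, sizes, and mean-pooling choice are my assumptions, not the released code): the summary network embeds each sample and pools across the dataset, and the aggregation network then performs inference for a query point given that summary.

```python
import torch
import torch.nn as nn

class MetaInferenceNetwork(nn.Module):
    """Summary network + aggregation network, as on the slide.
    All architectural details below are illustrative assumptions."""
    def __init__(self, x_dim=784, z_dim=16, summary_dim=128, hidden=256):
        super().__init__()
        # Summary network: embeds each sample of a dataset D.
        self.summary = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.ReLU(), nn.Linear(hidden, summary_dim))
        # Aggregation network: infers q(z | x, D) from [summary(D); query x].
        self.aggregate = nn.Sequential(
            nn.Linear(summary_dim + x_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)

    def forward(self, query, dataset):
        # dataset: (n_samples, x_dim); represent the marginal by its samples
        # and pool (here: mean) so the summary is permutation-invariant.
        s = self.summary(dataset).mean(dim=0)            # (summary_dim,)
        s = s.expand(query.shape[0], -1)                 # broadcast to batch
        h = self.aggregate(torch.cat([s, query], dim=-1))
        return self.mu(h), self.logvar(h)

net = MetaInferenceNetwork()
D = torch.randn(200, 784)   # samples representing one marginal distribution
x = torch.randn(32, 784)    # query points from the same distribution
mu, logvar = net(x, D)
```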
Related Work: Neural Statistician, Variational Homoencoder (VHE), VAE, MetaVAE
(Figure: side-by-side graphical models over datasets D, per-example latents z, and observations x; Neural Statistician and VHE additionally posit a dataset-level latent c, and the models differ in whether test points x_T come from the training datasets D_T.)
Key difference: the MetaVAE avoids the restrictive assumption of a global prior over datasets p(c).
Intuition: Clustering Mixtures of Gaussians
(Figure: two-component mixture with clusters labeled z = 0 and z = 1.)
MetaVAE learns how to cluster: over 50 datasets, MetaVAE achieves 9.9% clustering error, while the VAE gets 27.9%.
Learning Invariant Representations
● Apply various transformations
● Amortize over subsets of transformations, learn representations
● Test representations on held-out transformations (classification)
(A sketch of this dataset construction follows below.)
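As referenced above, a minimal sketch of how such transformation-indexed datasets might be constructed (MNIST rotations as a stand-in; the specific transformations, angles, and splits are my assumptions, not the paper's protocol):

```python
import torchvision
import torchvision.transforms.functional as TF

# Each "dataset" D_t is the same image collection under one transformation t
# (here: rotation by a fixed angle). Train-time and held-out angles are
# disjoint, so representations are probed on unseen transformations.
mnist = torchvision.datasets.MNIST(root="data", download=True)
train_angles = [0, 30, 60, 90]    # amortize over this subset
heldout_angles = [120, 150]       # evaluate transfer here

def make_dataset(angle, n=1000):
    return [TF.rotate(TF.to_tensor(mnist[i][0]), angle) for i in range(n)]

train_datasets = {a: make_dataset(a) for a in train_angles}
test_datasets = {a: make_dataset(a) for a in heldout_angles}
```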
Invariance Experiment Results
MetaVAE representations consistently outperform NS/VHE on the MNIST + NORB datasets.
Analysis
MetaVAE representations tend not to change very much within a family of transformations they were amortized over, as desired.
Conclusion
● Developed an algorithm for a family of probabilistic models: the meta-amortized inference paradigm
● MetaVAE learns transferable representations that generalize well across similar data distributions in downstream tasks
● Limitations:
○ No sampling
○ Semi-parametric
○ Arbitrary dataset construction
● Paper: https://arxiv.org/pdf/1902.01950.pdf
Encoding Musical Style with Transformer Autoencoders
Generative Models for Music
● Generating music is a challenging problem, as music contains structure at multiple timescales:
○ periodicity, repetition
○ coherence in style and rhythm across (long) time periods!
● Raw audio: WaveNet, GANs, etc.; symbolic: RNNs, LSTMs, etc.
Music Transformer
● Symbolic: event-based representation that allows for generation of expressive performances (without generating a score)
● Current SOTA in music generation
○ Can generate music over 60 seconds in length
● Attention-based
○ Replaces self-attention with relative attention (sketched below)
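A compact single-head sketch of relative self-attention in the spirit of Shaw et al. [9] (Music Transformer [7] additionally uses a memory-efficient "skewing" trick not shown here; all names and sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelativeSelfAttention(nn.Module):
    """Single-head self-attention with learned relative position embeddings.
    A causal mask (needed for autoregressive generation) is omitted for
    brevity."""
    def __init__(self, d_model=256, max_len=1024):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # One embedding per relative distance in [-(max_len-1), max_len-1].
        self.rel = nn.Embedding(2 * max_len - 1, d_model)
        self.max_len = max_len

    def forward(self, x):                          # x: (batch, T, d_model)
        B, T, D = x.shape
        q, k, v = self.q(x), self.k(x), self.v(x)
        content = q @ k.transpose(-2, -1)          # content scores (B, T, T)
        idx = torch.arange(T, device=x.device)
        # distance (j - i) for positions (i, j), shifted to be non-negative
        r = self.rel(idx[None, :] - idx[:, None] + self.max_len - 1)  # (T,T,D)
        position = torch.einsum('bid,ijd->bij', q, r)  # relative-position scores
        attn = F.softmax((content + position) / D ** 0.5, dim=-1)
        return attn @ v

layer = RelativeSelfAttention()
out = layer(torch.randn(2, 128, 256))  # -> (2, 128, 256)
```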
What We Want
● Control music generation using either (1) performance or (2) melody + performance as conditioning
● Generate pieces that sound similar in style to input pieces!
Transformer Autoencoder
Three ways to combine the conditioning (style) embedding with the decoder input:
1. Sum
2. Concatenation
3. Tile (along the temporal dimension)
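A sketch of the three combination strategies listed above (tensor shapes, names, and how the tiled copy is consumed are my assumptions; the paper's exact wiring may differ):

```python
import torch

B, T, D = 8, 512, 256
decoder_inputs = torch.randn(B, T, D)  # per-timestep decoder embeddings
style = torch.randn(B, D)              # one performance/style vector per piece

# 1. Sum: broadcast-add the style vector to every timestep.
summed = decoder_inputs + style[:, None, :]                       # (B, T, D)

# 2. Concatenation: append the style vector along the feature dimension.
concatenated = torch.cat(
    [decoder_inputs, style[:, None, :].expand(B, T, D)], dim=-1)  # (B, T, 2D)

# 3. Tile: replicate the style vector along the temporal dimension, so every
#    timestep carries an identical copy of the conditioning information.
tiled = style[:, None, :].expand(B, T, D)                         # (B, T, D)
```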
Quantitative Metrics
The Transformer autoencoder (both performance-only and melody & performance) outperforms baselines in generating similar pieces!
Samples
(Two audio examples: for the "Twinkle, Twinkle" melody and for "Clair de Lune", a conditioning performance is given, and the generated performance renders that melody in the style of the conditioning performance.)
Conclusion
● Developed a method for controllable music generation with high-level controls
○ Demonstrated efficacy both quantitatively and through qualitative listening tests
● Thanks!
○ Stanford: Mike Wu, Noah Goodman, Stefano Ermon
○ Magenta @ Google Brain: Jesse Engel, Ian Simon, Curtis "Fjord" Hawthorne, Monica Dinculescu
References
1. Edwards, H., and Storkey, A. Towards a Neural Statistician. 2016.
2. Hewitt, L. B., Nye, M. I., Gane, A., Jaakkola, T., and Tenenbaum, J. B. The Variational Homoencoder. 2018.
3. Kingma, D. P., and Welling, M. Auto-Encoding Variational Bayes. 2013.
4. Gershman, S., and Goodman, N. Amortized Inference in Probabilistic Reasoning. 2014.
5. Jordan, M. I., Ghahramani, Z., Jaakkola, T. S., and Saul, L. K. An Introduction to Variational Methods for Graphical Models. 1999.
6. Blei, D. M., Kucukelbir, A., and McAuliffe, J. D. Variational Inference: A Review for Statisticians. 2017.
7. Huang, C. Z., Vaswani, A., Uszkoreit, J., Shazeer, N., Simon, I., Hawthorne, C., Dai, A. M., Hoffman, M. D., Dinculescu, M., and Eck, D. Music Transformer. 2019.
8. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. Attention Is All You Need. 2017.
9. Shaw, P., Uszkoreit, J., and Vaswani, A. Self-Attention with Relative Position Representations. 2018.
10. https://magenta.tensorflow.org/music-transformer
11. Engel, J., Agrawal, K. K., Chen, S., Gulrajani, I., Donahue, C., and Roberts, A. Adversarial Neural Audio Synthesis. 2019.
12. van den Oord, A., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. WaveNet: A Generative Model for Raw Audio. 2016.
13. Kalchbrenner, N., Elsen, E., Simonyan, K., Noury, S., Casagrande, N., Lockhart, E., Stimberg, F., van den Oord, A., Dieleman, S., and Kavukcuoglu, K. Efficient Neural Audio Synthesis. 2018.