Learning to Learn Kernels with Variational Random Features


  1. Learning to Learn Kernels with Variational Random Features
     - Presenter: Haoliang Sun
     - Authors: Xiantong Zhen*, Haoliang Sun*, Yingjun Du*, Jun Xu, Yilong Yin, Ling Shao, Cees Snoek
     - ICML | 2020

  2. Meta-Learning (Learning to Learn)
     - [Figure: datasets D1, D2, D3, …, D’; tasks t1, t2, t3, …, t’; Base Learner1, Base Learner2, Base Learner3, new Base Learner; Meta Learner.]
     - Meta knowledge:
       - Good parameter initialization (Finn et al., 2017)
       - Efficient optimization update rules (Ravi et al., 2017)
       - General feature extractors (Vinyals et al., 2016)
       - ...
     - Meta-learning:
       - Extract prior (meta) knowledge from related tasks (meta learner)
       - Fast adaptation to a new task (base learner)

  3. Few-Shot Learning (FSL) with Meta-Learning (ML)
     - The episodic training-testing strategy
       - Meta-training: a meta-learner is trained to improve the base-learners’ performance on the meta-training set, using a batch of few-shot learning tasks.
       - Meta-testing: base-learners are evaluated on the meta-test set, which contains novel categories of data.
     - An episode (task)
       - 𝐷-way 𝑙-shot classification tasks are sampled from the meta-training (or meta-testing) set.
       - 𝑙 is the number of labelled examples for each of the 𝐷 classes.
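
The episode construction described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `sample_episode`, the parameters `n_way`, `k_shot`, and `n_query` (standing in for the slide's 𝐷, 𝑙, and a query-set size the slide does not specify), and the toy label pool are all hypothetical and do not come from the presentation.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Sample one n_way-way k_shot-shot episode from a list of class labels.

    Returns (support, query): lists of (example_index, episode_label) pairs.
    """
    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)

    # Pick the classes for this task; they are relabelled 0..n_way-1 below.
    classes = rng.sample(sorted(by_class), n_way)

    support, query = [], []
    for episode_label, cls in enumerate(classes):
        picked = rng.sample(by_class[cls], k_shot + n_query)
        support += [(i, episode_label) for i in picked[:k_shot]]
        query += [(i, episode_label) for i in picked[k_shot:]]
    return support, query


# Toy pool (assumption): 10 classes with 20 examples each.
labels = [c for c in range(10) for _ in range(20)]
support, query = sample_episode(
    labels, n_way=5, k_shot=1, n_query=3, rng=random.Random(0)
)
```

In this sketch the support and query examples of an episode never overlap, because both are drawn from a single sample without replacement per class.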

  4. Few-Shot Learning (FSL) with Meta-Learning (ML)
     - [Figure: example of the few-shot learning setup, showing Episode 1, Episode 2, … (Ravi et al., 2017).]

  5. An Effective Meta-Learning Scenario
     - Base-learner:
       - is powerful enough to solve individual tasks
       - is able to absorb common information
     - Meta-learner:
       - extracts valid prior knowledge
     - Key idea:
       - Integrate kernel learning with random features and variational inference (VI) into the ML framework for FSL.
       - Formulate the optimization as a VI problem by deriving a new ELBO.
       - Use context inference, which puts the inference of the random bases of the current task into the context of all previous, related tasks.

  6. Problem Statement
     - Meta-learning with kernels: each task has a support set, a query set, a predictor, a base-learner, a loss, and a mapping function.
     - A practical base-learner: kernel ridge regression, which has a closed-form solution that in turn defines the predictor.
     - Learning adaptive kernels with data-driven random Fourier features
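
The slide's formulas did not survive extraction, so the following is a generic sketch of kernel ridge regression in its standard closed form, not a reproduction of the authors' notation. The names `krr_fit`, `krr_predict`, `alpha`, and `lam`, the linear kernel on toy features, and the one-hot targets are all assumptions made for illustration.

```python
import numpy as np


def krr_fit(K_support, Y_support, lam):
    """Closed-form KRR weights: alpha = (K + lam * I)^{-1} Y."""
    n = K_support.shape[0]
    return np.linalg.solve(K_support + lam * np.eye(n), Y_support)


def krr_predict(K_query_support, alpha):
    """Predict for query points from the query-vs-support kernel matrix."""
    return K_query_support @ alpha


rng = np.random.default_rng(0)
phi_s = rng.normal(size=(5, 16))  # toy support features (assumption)
phi_q = rng.normal(size=(3, 16))  # toy query features (assumption)
Y_s = np.eye(5)                   # one-hot labels, 5-way 1-shot

lam = 1e-3
K_ss = phi_s @ phi_s.T            # support-vs-support kernel (linear here)
K_qs = phi_q @ phi_s.T            # query-vs-support kernel
alpha = krr_fit(K_ss, Y_s, lam)
Y_hat = krr_predict(K_qs, alpha)  # one row of class scores per query point
```

Because the solution is a single linear solve, the base-learner can be fitted per task without an inner optimization loop, which is one reason a closed-form learner is convenient in an episodic setting.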

  7. Problem Statement
     - Random Fourier Features (RFFs)
       - Learn adaptive kernels in a data-driven way.
       - Leverage the shared knowledge by exploring dependencies among related tasks to generate rich features.
       - Construct approximate translation-invariant kernels using explicit feature maps via random bases (Bochner’s theorem).
     - Learning data-driven adaptive kernels amounts to finding the posterior over the random bases.
     - This is formulated as a variational inference problem.
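
As a concrete instance of the random-feature construction mentioned on this slide, the sketch below approximates a fixed Gaussian (RBF) kernel with random Fourier features. It illustrates only the standard, non-learned case: the bases are drawn from the known spectral density of the RBF kernel rather than inferred from data as in MetaVRF. The names `rff_features`, `omega`, `b`, and `sigma` and all sizes are assumptions.

```python
import numpy as np


def rff_features(X, omega, b):
    """Map X of shape (n, d) to random Fourier features of shape (n, D)."""
    D = omega.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ omega + b)


rng = np.random.default_rng(0)
d, D, sigma = 3, 5000, 1.0

# By Bochner's theorem, the RBF kernel exp(-||x - y||^2 / (2 sigma^2))
# has a Gaussian spectral density with standard deviation 1 / sigma.
omega = rng.normal(scale=1.0 / sigma, size=(d, D))  # random bases
b = rng.uniform(0.0, 2.0 * np.pi, size=D)           # random phases

X = rng.normal(size=(6, d))
Z = rff_features(X, omega, b)
K_approx = Z @ Z.T  # inner products of the explicit feature maps

sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_exact = np.exp(-sq_dists / (2.0 * sigma**2))
max_err = np.abs(K_approx - K_exact).max()
```

The approximation error shrinks as the number of bases `D` grows; replacing the fixed sampling distribution for `omega` with a learned one is what makes the resulting kernel data-driven.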

  8. Meta Variational Random Features (MetaVRF)
     - The objective function
       - The posterior is intractable; it is approximated with a meta variational distribution.
       - The Evidence Lower Bound (ELBO)
       - The objective: maximizing the ELBO over tasks
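
The slide's ELBO formula was lost in extraction. As a generic illustration only, an ELBO usually takes the form of an expected log-likelihood minus a KL divergence between the variational distribution and a prior. The sketch below assumes diagonal Gaussian distributions over the random bases, which the slides do not confirm; the names `sample_bases`, `kl_diag_gaussians`, and `negative_elbo` are hypothetical.

```python
import numpy as np


def sample_bases(mu, log_var, rng):
    """Reparameterized sample: omega = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_diag_gaussians(mu_q, log_var_q, mu_p, log_var_p):
    """KL[q || p] between diagonal Gaussians, summed over all dimensions."""
    var_q, var_p = np.exp(log_var_q), np.exp(log_var_p)
    return 0.5 * np.sum(
        log_var_p - log_var_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )


def negative_elbo(log_lik, kl):
    """Loss to minimize: -(expected log-likelihood - KL)."""
    return -(log_lik - kl)


rng = np.random.default_rng(0)
mu_q, log_var_q = np.ones(4), np.zeros(4)   # toy variational parameters
mu_p, log_var_p = np.zeros(4), np.zeros(4)  # toy prior parameters

omega = sample_bases(mu_q, log_var_q, rng)  # one sampled set of bases
kl = kl_diag_gaussians(mu_q, log_var_q, mu_p, log_var_p)
loss = negative_elbo(log_lik=-1.5, kl=kl)   # log_lik is a placeholder value
```

In a full training loop the log-likelihood term would come from the base-learner's predictions on the query set using the sampled bases, and the loss would be accumulated over the tasks in a batch.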

  9. Context Inference
     - Generate rich random bases to build strong kernels.
     - Put the inference of the bases of the current task into the context of all previous, related tasks.
     - The context of related tasks
     - [Figure: the directed graphical model, with task dependency.]

  10. An LSTM-Based Context Inference Network
      - An LSTM transformation takes the support set and the previous cell states as input.
      - Shared MLPs for inference output the parameters of the variational distribution.
      - The optimization objective with the context inference
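
The data flow described on this slide can be sketched with a hand-written LSTM cell. This is a minimal illustration under several assumptions that are not taken from the presentation: the support set is summarized by mean pooling, single linear heads stand in for the shared MLPs, the variational distribution is parameterized by a mean and a log-variance, and all sizes and weight initializations are arbitrary.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x, h_prev, c_prev, W, U, bias):
    """One LSTM cell step; gates are stacked as [input, forget, output, candidate]."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + bias
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2 * H]), sigmoid(z[2 * H:3 * H])
    g = np.tanh(z[3 * H:])
    c = f * c_prev + i * g  # new cell state carries context from earlier tasks
    h = o * np.tanh(c)      # new hidden state for the current task
    return h, c


def infer_bases_params(h, V_mu, V_lv):
    """Linear heads (stand-ins for the shared MLPs): mean and log-variance."""
    return V_mu @ h, V_lv @ h


rng = np.random.default_rng(0)
feat_dim, hidden, bases_dim = 8, 16, 12
W = rng.normal(scale=0.1, size=(4 * hidden, feat_dim))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
bias = np.zeros(4 * hidden)
V_mu = rng.normal(scale=0.1, size=(bases_dim, hidden))
V_lv = rng.normal(scale=0.1, size=(bases_dim, hidden))

h, c = np.zeros(hidden), np.zeros(hidden)
params_per_task = []
for t in range(3):
    support = rng.normal(size=(5, feat_dim))  # toy support-set features for task t
    x = support.mean(axis=0)                  # mean-pooled summary (assumption)
    h, c = lstm_step(x, h, c, W, U, bias)     # state is passed on to the next task
    params_per_task.append(infer_bases_params(h, V_mu, V_lv))
```

The point of the recurrence is that the parameters inferred for task `t` depend on the cell state accumulated over the earlier tasks, not only on the current support set.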

  12. Experiments
      - Few-shot regression
        - Fitting a target sine function
      - Few-shot classification
        - Three benchmarks
      - Further analysis
        - Deep embedding
        - Efficiency
        - Versatility

  13. Evaluation: Few-Shot Regression
      - [Figure 1: few-shot regression results in three panels: 3-shot, 5-shot, 10-shot.]
      - Values shown in the panel legends, top to bottom:
        - 3-shot: 1.913, 1.072, 0.722, 0.700
        - 5-shot: 0.415, 0.063, 0.047, 0.022
        - 10-shot: 0.294, 0.024, 0.009, 0.003

  14. Evaluation: Few-Shot Classification

  15. Evaluation: Few-Shot Classification

  16. Further Analysis

  17. Further Analysis

  18. Further Analysis

  19. Conclusion
      - MetaVRF is a novel meta-learning framework that introduces RFFs into meta-learning and leverages VI to infer the spectral distribution in a data-driven way.
      - The LSTM-based context inference explores the shared knowledge and generates rich random features.
      - MetaVRF achieves state-of-the-art performance.
      - The learned kernels exhibit high representational power with a low spectral sampling rate.
      - MetaVRF is robust and flexible across a wide variety of testing conditions.
