CS 4803 / 7643: Deep Learning – Topics: Low-label ML Formulations (continued)


SLIDE 1

CS 4803 / 7643: Deep Learning

Zsolt Kira, Georgia Tech

Topics:

– Low-label ML formulations (continued)

SLIDE 2

Administrative

  • Projects!

– Poster details are out on Piazza
– Note: no late days for anything project-related!
– Also note: keep track of your GCP usage and costs! Set limits on spending


SLIDE 3

Meta-Learning for Few-Shot Recognition

  • Key idea: we want to learn from a few examples (called the support set) in order to make predictions on a query set for novel classes

– Assume: we have a larger labeled dataset for a different set of categories (the base classes)

  • How do we test this?

– N-way, k-shot test (see the episode-sampling sketch below)
– k: the number of examples in the support set
– N: the number of classes ("confusers") among which the target class must be chosen
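This protocol is straightforward to express in code. Below is a minimal sketch of N-way k-shot episode sampling; the names (`sample_episode`, `images_by_class`, `n_query`) are our own illustration, not from the slides.

```python
import random

def sample_episode(images_by_class, n_way=5, k_shot=1, n_query=15):
    """Draw one N-way k-shot episode: a labeled support set and a query set."""
    classes = random.sample(sorted(images_by_class), n_way)  # pick N novel classes
    support, query = [], []
    for label, cls in enumerate(classes):
        examples = random.sample(images_by_class[cls], k_shot + n_query)
        support += [(x, label) for x in examples[:k_shot]]   # k examples per class
        query += [(x, label) for x in examples[k_shot:]]     # held out for evaluation
    return support, query
```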


[Figure: target and query set examples]

SLIDE 4

Normal Approach

  • Do what we always do: fine-tuning (a sketch follows below)

– Train a classifier on the base classes
– Freeze the features
– Learn classifier weights for the new classes using the small amount of labeled data (at "inference" time!)
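A minimal PyTorch sketch of this baseline (our illustration of the recipe, not the authors' code), assuming a `backbone` already trained on the base classes and support tensors `support_x`, `support_y`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fit_linear_head(backbone, support_x, support_y, n_way, steps=100, lr=0.01):
    """Freeze the pretrained features and fit only a new linear classifier."""
    backbone.eval()
    for p in backbone.parameters():
        p.requires_grad = False                  # frozen feature extractor
    with torch.no_grad():
        feats = backbone(support_x)              # (n_way * k_shot, d)
    head = nn.Linear(feats.shape[1], n_way)      # the only trainable part
    opt = torch.optim.SGD(head.parameters(), lr=lr)
    for _ in range(steps):                       # train on the few labels only
        opt.zero_grad()
        F.cross_entropy(head(feats), support_y).backward()
        opt.step()
    return head
```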


A Closer Look at Few-shot Classification, Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang

SLIDE 5

Cons of Normal Approach

  • The training we do on the base classes does not take the task into account

  • There is no notion that we will be performing a bunch of N-way tests

  • Idea: simulate what we will see at test time


SLIDE 6

Meta-Training Approach

  • Set up a set of smaller tasks during training that simulate what we will be doing during testing (see the training-loop sketch below)

– Can optionally pre-train features on held-out base classes (not typical)

  • The testing stage is now the same, but with new classes
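A sketch of that episodic loop (assumed structure, reusing `sample_episode` from the earlier sketch; `meta_learner` stands for any model that maps a support set plus query inputs to query logits):

```python
import torch
import torch.nn.functional as F

def meta_train(meta_learner, images_by_class, optimizer, steps=10000):
    """Every training step is a small N-way k-shot task, mirroring test time."""
    for _ in range(steps):
        support, query = sample_episode(images_by_class)
        sx = torch.stack([x for x, _ in support])
        sy = torch.tensor([y for _, y in support])
        qx = torch.stack([x for x, _ in query])
        qy = torch.tensor([y for _, y in query])
        loss = F.cross_entropy(meta_learner(sx, sy, qx), qy)  # query-set loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```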


SLIDE 7

Meta-Learning Approaches

  • Learning a model conditioned on the support set


SLIDE 8

Meta-Learner

  • How to parametrize learning algorithms?
  • Two approaches to defining a meta-learner:

– Take inspiration from a known learning algorithm

  • kNN / kernel machine: Matching Networks (Vinyals et al., 2016)
  • Gaussian classifier: Prototypical Networks (Snell et al., 2017)
  • Gradient descent: Meta-Learner LSTM (Ravi & Larochelle, 2017), MAML (Finn et al., 2017)

– Derive it from a black-box neural network

  • MANN (Santoro et al., 2016)
  • SNAIL (Mishra et al., 2018)


Slide Credit: Hugo Larochelle

SLIDE 9

Meta-Learner


Slide Credit: Hugo Larochelle

SLIDE 10

Matching Networks


Slide Credit: Hugo Larochelle
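The figure on this slide is not reproduced in the transcript. As a rough sketch of the core idea in Matching Networks (our simplification, omitting the paper's full-context LSTM embeddings): a query is classified by a soft, attention-weighted nearest-neighbor vote over the embedded support set.

```python
import torch
import torch.nn.functional as F

def matching_net_predict(embed, sx, sy, qx, n_way):
    """Soft kNN: attend over support examples by cosine similarity."""
    s = F.normalize(embed(sx), dim=1)         # (n_support, d)
    q = F.normalize(embed(qx), dim=1)         # (n_query, d)
    attn = F.softmax(q @ s.t(), dim=1)        # each query attends to the support set
    onehot = F.one_hot(sy, n_way).float()     # (n_support, n_way)
    return attn @ onehot                      # per-query class probabilities
```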

SLIDE 11

Prototypical Networks


Slide Credit: Hugo Larochelle
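The figure is again omitted; a minimal sketch of the method as we read Snell et al. (2017): each class is summarized by the mean ("prototype") of its embedded support points, and queries are scored by negative squared Euclidean distance to the prototypes, which behaves like a Gaussian classifier with shared isotropic covariance.

```python
import torch

def proto_net_logits(embed, sx, sy, qx, n_way):
    """Class prototype = mean support embedding; logits = -squared distance."""
    s, q = embed(sx), embed(qx)
    protos = torch.stack([s[sy == c].mean(dim=0) for c in range(n_way)])
    return -torch.cdist(q, protos) ** 2       # (n_query, n_way), softmax-ready
```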

SLIDE 12

Prototypical Networks


Slide Credit: Hugo Larochelle

SLIDE 13

More Sophisticated Meta-Learning Approaches

  • Learn gradient descent itself:

– Parameter initialization and update rules

  • Or learn just an initialization and use normal gradient descent (MAML); a sketch of the resulting objective follows below
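A sketch of that MAML objective (our simplification to a single inner step; `model_fn(params, x)` is an assumed functional forward pass): the inner loop is plain gradient descent from the shared initialization, and the outer loss backpropagates through that step via `create_graph=True`.

```python
import torch
import torch.nn.functional as F

def maml_task_loss(model_fn, params, support, query, inner_lr=0.01):
    """Meta-objective for one task: adapt with one gradient step, score on the query set."""
    (sx, sy), (qx, qy) = support, query
    inner = F.cross_entropy(model_fn(params, sx), sy)
    grads = torch.autograd.grad(inner, params, create_graph=True)  # keep graph for meta-gradient
    adapted = [p - inner_lr * g for p, g in zip(params, grads)]    # inner-loop update
    return F.cross_entropy(model_fn(adapted, qx), qy)              # outer (meta) loss
```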


SLIDE 14

Meta-Learner LSTM


Slide Credit: Hugo Larochelle
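The figures are not reproduced here. The key observation, as we understand Ravi & Larochelle (2017), is that a gradient-descent update has the same algebraic shape as an LSTM cell-state update, so the learner's parameters can live in the cell state and the gates can be learned. Schematically (hypothetical names, not the paper's code):

```python
def lstm_style_update(theta, grad, forget_gate, input_gate):
    """c_t = f_t * c_{t-1} + i_t * ctilde_t, with c = parameters and ctilde = -gradient.
    Plain SGD is the special case forget_gate = 1, input_gate = learning rate."""
    return forget_gate * theta + input_gate * (-grad)
```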

SLIDE 15

Meta-Learner LSTM


Slide Credit: Hugo Larochelle

SLIDE 16

Meta-Learner LSTM


Slide Credit: Hugo Larochelle

SLIDE 17

Meta-Learner LSTM


Slide Credit: Hugo Larochelle

SLIDE 18

Meta-Learning Algorithm


SLIDE 19

Meta-Learner LSTM


Slide Credit: Hugo Larochelle

SLIDE 20

Model-Agnostic Meta-Learning (MAML)


Slide Credit: Hugo Larochelle

SLIDE 21

Model-Agnostic Meta-Learning (MAML)


SLIDE 22

Model-Agnostic Meta-Learning (MAML)


Slide Credit: Sergey Levine

SLIDE 23

Model-Agnostic Meta-Learning (MAML)


Slide Credit: Sergey Levine

SLIDE 24

Comparison


Slide Credit: Sergey Levine

SLIDE 25

Meta-Learner


Slide Credit: Hugo Larochelle

SLIDE 26

Experiments


Slide Credit: Hugo Larochelle

SLIDE 27

Memory-Augmented Neural Network


Slide Credit: Hugo Larochelle

SLIDE 28

But beware


Slide Credit: Hugo Larochelle

A Closer Look at Few-shot Classification, Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang

SLIDE 29


SLIDE 30

Distribution Shift

  • What if there is a distribution shift (cross-domain)?

  • Lesson: methods that are successful within-domain might be worse across domains!


SLIDE 31

Distribution Shift


SLIDE 32

Random Task Proposals


SLIDE 33

Does it Work?


SLIDE 34

Discussions


  • What is the right definition of distributions over problems?

– Varying numbers of classes / examples per class (meta-training vs. meta-testing)?
– Semantic differences between meta-training and meta-testing classes?
– Overlap between meta-training and meta-testing classes (see the recent "low-shot" literature)?

  • Move from static to interactive learning

– How should this impact how we generate episodes?
– Meta-active learning? (few successes so far)

Slide Credit: Hugo Larochelle