Adapted Deep Embeddings: A Synthesis of Methods for k-Shot Inductive Transfer Learning
Tyler R. Scott 1,2, Karl Ridgeway 1,2, Michael C. Mozer 1,3
1 University of Colorado, Boulder; 2 Sensory Inc.; 3 Presently at Google Brain
Inductive Transfer Learning
[Diagram: source-domain data and target-domain data are both used to train a target-domain model, which maps a target-domain input to a prediction.]
Inductive Transfer Learning: Weight Transfer
• Train a network on the source domain, then retrain the output layer and adapt the remaining weights to the target domain (Yosinski et al., 2014).
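The weight-transfer recipe above can be sketched in a few lines. This is not the authors' code; it is a minimal numpy illustration in which the hidden layer is a stand-in for a pretrained source network, and the layer sizes and `init_layer` helper are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out, rng):
    """Small random dense layer (weights, bias) -- illustrative only."""
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

# Source-domain network: input -> hidden -> source classes.
n_in, n_hidden, n_src_classes, n_tgt_classes = 784, 64, 5, 5
W1, b1 = init_layer(n_in, n_hidden, rng)            # stand-in for pretrained weights
W2_src, b2_src = init_layer(n_hidden, n_src_classes, rng)

# Weight transfer: copy the hidden-layer weights, reinitialize the output
# layer for the target classes; all weights would then be fine-tuned on the
# k labeled target examples.
W1_tgt, b1_tgt = W1.copy(), b1.copy()
W2_tgt, b2_tgt = init_layer(n_hidden, n_tgt_classes, rng)

def target_logits(x):
    h = np.maximum(0.0, x @ W1_tgt + b1_tgt)        # ReLU hidden layer
    return h @ W2_tgt + b2_tgt

x = rng.normal(size=(3, n_in))
print(target_logits(x).shape)  # (3, 5)
```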
Inductive Transfer Learning: Deep Metric Learning
• Learn a single embedding shared by the source and target domains, e.g., with the histogram loss (Ustinova & Lempitsky, 2016), which separates the distributions of within-class and between-class distances.
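The within/between-class idea behind the histogram loss can be sketched as follows. This is an assumption-laden simplification, not the paper's implementation: the original uses soft (linearly interpolated) binning, whereas this numpy sketch uses hard binning to stay short. The returned quantity estimates the probability that a random between-class pair is more similar than a random within-class pair (lower is better).

```python
import numpy as np

def histogram_loss(embeddings, labels, n_bins=51):
    """Hard-binned sketch of the histogram loss (Ustinova & Lempitsky, 2016)."""
    # Cosine similarities of all distinct pairs (embeddings L2-normalized).
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = e @ e.T
    iu = np.triu_indices(len(labels), k=1)
    same = (labels[:, None] == labels[None, :])[iu]
    pos, neg = sim[iu][same], sim[iu][~same]

    # Normalized histograms of within-class (pos) and between-class (neg)
    # similarities over [-1, 1].
    bins = np.linspace(-1.0, 1.0, n_bins + 1)
    h_pos, _ = np.histogram(pos, bins=bins)
    h_neg, _ = np.histogram(neg, bins=bins)
    h_pos = h_pos / h_pos.sum()
    h_neg = h_neg / h_neg.sum()

    # P(between-class sim > within-class sim) ~ sum_r h_neg[r] * CDF_pos[r]
    return float(np.sum(h_neg * np.cumsum(h_pos)))

# Well-separated toy classes -> loss near 0.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])
print(histogram_loss(emb, labels))  # 0.0
```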
Inductive Transfer Learning: Few-Shot Learning
• Learn a shared source- and target-domain embedding with an episodic method such as prototypical nets (Snell et al., 2017).
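The classification rule of prototypical nets, once an embedding exists, can be sketched directly: each class prototype is the mean embedding of its support examples, and queries are assigned to the nearest prototype. This is a hedged numpy sketch of that rule only (the toy 2-D embeddings are invented); it omits the episodic training of the embedding network itself.

```python
import numpy as np

def prototypes(support_emb, support_labels):
    """Class prototype = mean embedding of that class's support examples."""
    classes = np.unique(support_labels)
    return classes, np.stack([support_emb[support_labels == c].mean(axis=0)
                              for c in classes])

def classify(query_emb, classes, protos):
    """Assign each query to the nearest prototype (squared Euclidean)."""
    d2 = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(d2, axis=1)]

# Toy 2-shot episode with 2 classes in a 2-D embedding space.
support = np.array([[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]])
support_labels = np.array([0, 0, 1, 1])
classes, protos = prototypes(support, support_labels)
queries = np.array([[0.05, 0.05], [0.95, 0.95]])
print(classify(queries, classes, protos))  # [0 1]
```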
Inductive Transfer Learning: Adapted Deep Embeddings
1. Train a network on the source domain using an embedding loss (histogram loss or prototypical nets).
2. Adapt the weights using the limited target-domain data.
Inductive Transfer Learning: Why hasn't a comparison been explored?
Each method targets a different regime of k, the number of labeled examples per target class:
• Weight Transfer: k > 100
• Deep Metric Learning: agnostic to k
• Few-Shot Learning: k < 20
MNIST
• Source domain: 2200 labeled examples per class
• Target domain: k labeled examples per class
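Constructing the k-shot target domain amounts to keeping exactly k labeled examples per target class. A minimal numpy sketch, with invented stand-in data (the real experiment uses MNIST images):

```python
import numpy as np

def k_shot_subset(X, y, k, rng):
    """Keep exactly k labeled examples per class (the target-domain setup)."""
    idx = np.concatenate([rng.choice(np.flatnonzero(y == c), size=k,
                                     replace=False)
                          for c in np.unique(y)])
    return X[idx], y[idx]

rng = np.random.default_rng(0)
# Stand-in for target-domain data: 200 examples each of 5 classes.
y = np.repeat(np.arange(5), 200)
X = rng.normal(size=(len(y), 16))
Xk, yk = k_shot_subset(X, y, k=10, rng=rng)
print(np.bincount(yk))  # [10 10 10 10 10]
```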
MNIST
[Figure: MNIST, n = 5. Target-domain test accuracy vs. number of labeled examples per target class (k = 1 to 1000), adding methods cumulatively across slides: Baseline, Weight Adaptation, Prototypical Net, Histogram Loss, Adapted Prototypical Net, Adapted Histogram Loss.]
[Figure: additional results. tinyImageNet (n = 5, 10, 50) and Isolet (n = 5, 10): test accuracy vs. k; Omniglot (k = 1, 5, 10): test accuracy vs. n. Methods compared: Adapted Histogram Loss, Adapted Prototypical Net, Histogram Loss, Prototypical Net, Weight Adaptation, Baseline.]
Conclusion
• Weight transfer is the least effective of the compared methods for inductive transfer learning.
• The histogram loss is robust regardless of the amount of labeled data in the target domain.
• Adapted embeddings outperform every previously proposed static-embedding method.
Poster #167 Room 210 & 230 AB Today, 5:00 - 7:00 PM