LEEP: A New Measure to Evaluate Transferability of Learned Representations

Cuong V. Nguyen (Amazon Web Services)
Tal Hassner (Facebook AI)*
Matthias Seeger (Amazon Web Services)
Cedric Archambeau (Amazon Web Services)

* Work done prior to joining Facebook AI
Correspondence to: nguycuo@amazon.com
Problem: Transferability estimation

Estimating how easy it is to transfer knowledge from one classification task to another

◮ Given a pre-trained source model and a target data set
◮ Develop a measure (a score) of how effectively knowledge can be transferred from the source model to the target data
◮ The transferability measure should be easy and cheap to compute → ideally without any training
Why do we need transferability estimation?

◮ Help understand the relationships/structures between tasks
◮ Select groups of highly transferable tasks for joint training
◮ Select good source models for transfer learning
◮ Potentially reduce training data size and training time
Our contributions

◮ We develop a novel transferability measure, Log Expected Empirical Prediction (LEEP), for deep networks
◮ Properties of LEEP:
  ◮ Very simple
  ◮ Clear interpretation: average log-likelihood of the expected empirical predictor
  ◮ Easy to compute: no training needed, only one forward pass through the target data set
  ◮ Applicable to most modern deep networks
Log Expected Empirical Prediction (LEEP) (1)

◮ Assume a source model θ and a target data set D = {(x_1, y_1), ..., (x_n, y_n)}
◮ We compute the LEEP score between θ and D in 3 steps.

1. Apply θ to each input x_i to get the dummy label distribution θ(x_i).
   ◮ θ(x_i) is a distribution over the source label set Z
   ◮ Labels in Z may not semantically relate to the true label y_i of x_i,
     e.g., Z is the ImageNet label set but (x_i, y_i) comes from CIFAR
2. Compute the empirical conditional distribution of the target label y given the dummy source label z:
   Empirical joint dist:       $\hat{P}(y, z) = \frac{1}{n} \sum_{i : y_i = y} \theta(x_i)_z$
   Empirical marginal dist:    $\hat{P}(z) = \sum_{y} \hat{P}(y, z)$
   Empirical conditional dist: $\hat{P}(y \mid z) = \hat{P}(y, z) / \hat{P}(z)$
Log Expected Empirical Prediction (LEEP) (2)

Expected Empirical Predictor (EEP)
A classifier that predicts the label y of an input x as follows:
◮ First, randomly draw a dummy label z from θ(x)
◮ Then, randomly draw y from $\hat{P}(y \mid z)$
Equivalently, $y \sim \sum_{z} \hat{P}(y \mid z)\, \theta(x)_z$

3. LEEP is the average log-likelihood of the EEP given the data D:
   $T(\theta, D) = \frac{1}{n} \sum_{i=1}^{n} \log \Big( \sum_{z} \hat{P}(y_i \mid z)\, \theta(x_i)_z \Big)$
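The three steps above translate almost directly into code. Below is a minimal NumPy sketch (not the authors' released implementation), assuming `source_probs` holds the pre-computed softmax outputs θ(x_i) of the source model over the target inputs (shape [n, |Z|], from a single forward pass) and `target_labels` holds the integer target labels y_i (shape [n]):

```python
import numpy as np

def leep_score(source_probs: np.ndarray, target_labels: np.ndarray) -> float:
    """Minimal LEEP sketch: source_probs has shape [n, |Z|], target_labels shape [n]."""
    n, _ = source_probs.shape
    num_target_classes = int(target_labels.max()) + 1

    # Step 2a: empirical joint distribution P_hat(y, z) = (1/n) * sum_{i: y_i = y} theta(x_i)_z
    joint = np.stack([source_probs[target_labels == y].sum(axis=0) / n
                      for y in range(num_target_classes)])

    # Step 2b: empirical marginal P_hat(z) and conditional P_hat(y | z)
    marginal = joint.sum(axis=0, keepdims=True)              # shape [1, |Z|]
    conditional = joint / np.clip(marginal, 1e-12, None)     # shape [|Y|, |Z|]; clip guards underflow

    # Step 3: EEP prediction p(y | x_i) = sum_z P_hat(y | z) * theta(x_i)_z,
    # then the average log-likelihood of the true labels under the EEP.
    eep = source_probs @ conditional.T                       # shape [n, |Y|]
    return float(np.mean(np.log(eep[np.arange(n), target_labels])))
```

With a pre-trained classifier, the whole score therefore costs one forward pass over the target data plus a few matrix operations.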
Experiment: overview

◮ Aim: show that LEEP can predict actual transfer accuracy
◮ Procedure:
  ◮ Consider many random transfer learning tasks
  ◮ Compute LEEP scores for these tasks
  ◮ Compute the actual test accuracy of transfer learning methods on these tasks
  ◮ Evaluate correlations between the LEEP scores and the test accuracies
◮ Transfer methods:
  ◮ Retrain head: only retrain the last fully connected layer on the target set
  ◮ Fine-tune: replace the head classifier and fine-tune all model parameters with SGD
Experiment: LEEP vs. Transfer Accuracy

◮ Compare LEEP scores with the test accuracy of transferred models on 200 random target tasks
◮ Result: LEEP scores are highly correlated with actual test accuracies (correlation coefficients > 0.94)

[Figure: test accuracy vs. LEEP score for fine-tune and retrain head, on ImageNet → CIFAR100 (ResNet18) and CIFAR10 → CIFAR100 (ResNet20)]
Experiment: LEEP with Small Data

◮ Restrict target data sets to 5 random classes and 50 examples per class
◮ Partition the LEEP scores' range into 5 transferability levels and average the test accuracies of the tasks within each level (sketched below)
◮ Result: the higher the transferability level according to LEEP, the easier the transfer
◮ Similar results when target data sets are imbalanced

[Figure: average test accuracy vs. transferability level for fine-tune and retrain head]
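The level-based analysis used here and on the following slides can be reproduced with a few lines of NumPy. The sketch below uses illustrative stand-in values for the per-task LEEP scores and test accuracies; the real numbers come from the paper's experiments.

```python
import numpy as np

# Illustrative stand-ins for per-task results (not the paper's actual numbers)
scores = np.array([-3.8, -3.1, -2.6, -2.0, -1.4, -1.1])       # LEEP scores of random target tasks
accuracies = np.array([0.35, 0.42, 0.55, 0.63, 0.74, 0.81])   # corresponding transfer test accuracies

# Partition the LEEP score range into 5 equal-width transferability levels
edges = np.linspace(scores.min(), scores.max(), num=6)
levels = np.digitize(scores, edges[1:-1])                      # level index 0..4 per task

# Average test accuracy of the tasks falling in each level
for level in range(5):
    mask = levels == level
    if mask.any():
        print(f"transferability level {level + 1}: mean test accuracy {accuracies[mask].mean():.2f}")
```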
Experiment: LEEP vs. Meta-Transfer Accuracy

◮ Compare LEEP scores with the test accuracy of Conditional Neural Adaptive Processes (CNAPs) (Requeima et al., 2019)
◮ CNAPs was trained on Meta-Dataset (Triantafillou et al., 2020)
◮ Target tasks are drawn from CIFAR100
◮ Result: higher transferability level according to LEEP → easier to meta-transfer

[Figure: average test accuracy vs. transferability level]
Experiment: LEEP vs. Convergence of Fine-tuned Models

◮ Compare convergence speed to a reference model
◮ Reference model: trained from scratch using only the target data set
◮ Result: higher transferability level according to LEEP → better convergence

[Figure: accuracy difference vs. number of epochs for transferability levels 1–5, on ImageNet → CIFAR100 (ResNet18) and CIFAR10 → CIFAR100 (ResNet20)]
Experiment: LEEP for Source Model Selection

◮ Select from 9 candidate models and transfer to CIFAR100
◮ Compare with:
  ◮ Negative Conditional Entropy (NCE) (Tran et al., 2019)
  ◮ H score (Bao et al., 2019)
  ◮ ImageNet top-1 accuracy (Kornblith et al., 2019)
◮ Result: LEEP predicts the test accuracies better than the baselines

[Figure: test accuracy of the 9 candidate models (ResNet18, ResNet34, ResNet50, MobileNet 0.25/0.5/0.75/1.0, SENet154, DarkNet53) plotted against LEEP score, NCE score, H score, and ImageNet accuracy]
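A hedged sketch of how LEEP could drive this kind of selection is shown below, reusing the `leep_score` function sketched earlier; `candidate_probs` and `target_labels` are assumed pre-computed inputs (one forward pass per candidate model over the target data), not objects from the paper's code.

```python
# Assumed inputs: `candidate_probs` maps each candidate model's name to its
# softmax outputs on the CIFAR100 training images (shape [n, |Z_model|]),
# and `target_labels` holds the CIFAR100 labels; `leep_score` is defined above.
ranking = sorted(
    candidate_probs,
    key=lambda name: leep_score(candidate_probs[name], target_labels),
    reverse=True,                     # higher LEEP → expected to transfer better
)
print("Source model ranking by LEEP:", ranking)
print("Selected source model:", ranking[0])
```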
Discussion

◮ Model selection results are very sensitive to the architecture and the size of the source networks
  → May need to calibrate/normalize the scores for better performance
◮ Potentially useful for feature selection as well
◮ For very small data sets, re-training the head directly with 2nd-order optimization methods could also be efficient
Thank you.
Correspondence to: nguycuo@amazon.com