LEEP: A New Measure to Evaluate Transferability of Learned Representations


  1. LEEP: A New Measure to Evaluate Transferability of Learned Representations
  Cuong V. Nguyen (Amazon Web Services), Tal Hassner* (Facebook AI), Matthias Seeger (Amazon Web Services), Cedric Archambeau (Amazon Web Services)
  * Work done prior to joining Facebook AI
  Correspondence to: nguycuo@amazon.com

  2. Problem: transferability estimation
  Estimating how easy it is to transfer knowledge from one classification task to another.
  ◮ Given a pre-trained source model and a target data set
  ◮ Develop a measure (a score) of how effectively knowledge can be transferred from the source model to the target data
  ◮ The transferability measure should be easy and cheap to compute → ideally without any training

  3. Why do we need transferability estimation?
  ◮ Help understand the relationships/structures between tasks
  ◮ Select groups of highly transferable tasks for joint training
  ◮ Select good source models for transfer learning
  ◮ Potentially reduce training data size and training time

  4. Our contributions
  ◮ We develop a novel transferability measure, the Log Expected Empirical Prediction (LEEP), for deep networks
  ◮ Properties of LEEP:
    ◮ Very simple
    ◮ Clear interpretation: the average log-likelihood of the expected empirical predictor
    ◮ Easy to compute: no training needed, only one forward pass through the target data set
    ◮ Can be applied to most modern deep networks

  5. Log Expected Empirical Prediction (LEEP) (1)
  ◮ Assume a source model θ and a target data set D = {(x_1, y_1), ..., (x_n, y_n)}
  ◮ We compute the LEEP score between θ and D in 3 steps.
  1. Apply θ to each input x_i to get the dummy label distribution θ(x_i).
    ◮ θ(x_i) is a distribution over the source label set Z
    ◮ Labels in Z may not semantically relate to the true label y_i of x_i, e.g., Z is the ImageNet label set but (x_i, y_i) comes from CIFAR
  2. Compute the empirical conditional distribution of the target label y given the dummy source label z (sketched in code below):
    Empirical joint distribution: $\hat{P}(y, z) = \frac{1}{n} \sum_{i : y_i = y} \theta(x_i)_z$
    Empirical marginal distribution: $\hat{P}(z) = \sum_{y} \hat{P}(y, z)$
    Empirical conditional distribution: $\hat{P}(y \mid z) = \hat{P}(y, z) / \hat{P}(z)$
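
To make step 2 concrete, here is a minimal NumPy sketch (an illustration, not the authors' reference code); the function and variable names are assumptions:

```python
import numpy as np

def empirical_conditional(theta_x, y, num_target_classes):
    """Step 2 of LEEP: the empirical conditional distribution P_hat(y | z).

    theta_x: (n, |Z|) array whose row i is the dummy label distribution theta(x_i)
             over the source label set Z (softmax outputs of one forward pass).
    y:       (n,) integer array of target labels in {0, ..., num_target_classes - 1}.
    """
    n, num_source_classes = theta_x.shape
    # Empirical joint distribution: P_hat(y, z) = (1/n) * sum over {i : y_i = y} of theta(x_i)_z
    P_joint = np.zeros((num_target_classes, num_source_classes))
    for cls in range(num_target_classes):
        P_joint[cls] = theta_x[y == cls].sum(axis=0) / n
    # Empirical marginal distribution: P_hat(z) = sum over y of P_hat(y, z)
    P_marginal = P_joint.sum(axis=0)
    # Empirical conditional distribution: P_hat(y | z) = P_hat(y, z) / P_hat(z)
    return P_joint / P_marginal  # shape (num_target_classes, |Z|)
```

Because each row of theta_x is a softmax output, every marginal P_hat(z) is strictly positive, so the final division is well defined.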

  6. Log Expected Empirical Prediction (LEEP) (2)
  Expected Empirical Predictor (EEP): a classifier that predicts the label y of an input x as follows:
  ◮ First, randomly draw a dummy label z from θ(x)
  ◮ Then, randomly draw y from $\hat{P}(y \mid z)$
  Equivalently, $y \sim \sum_{z} \hat{P}(y \mid z)\, \theta(x)_z$
  3. LEEP is the average log-likelihood of the EEP on the data D (see the sketch below):
    $T(\theta, D) = \frac{1}{n} \sum_{i=1}^{n} \log \Big( \sum_{z} \hat{P}(y_i \mid z)\, \theta(x_i)_z \Big)$
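
Putting the three steps together, the following self-contained sketch computes the LEEP score from the source model's softmax outputs on the target set (again an illustration under assumed names, not the authors' code):

```python
import numpy as np

def leep_score(theta_x, y):
    """LEEP score T(theta, D): average log-likelihood of the EEP on the target data.

    theta_x: (n, |Z|) dummy label distributions theta(x_i) from the source model.
    y:       (n,) integer target labels in {0, ..., Y - 1}.
    """
    n, _ = theta_x.shape
    num_target_classes = int(y.max()) + 1
    # Steps 1-2: empirical joint, marginal, and conditional distributions (see previous sketch).
    P_joint = np.stack([theta_x[y == c].sum(axis=0) / n for c in range(num_target_classes)])
    P_cond = P_joint / P_joint.sum(axis=0, keepdims=True)   # P_hat(y | z), shape (Y, |Z|)
    # Step 3: the EEP assigns x_i the likelihood sum over z of P_hat(y_i | z) * theta(x_i)_z;
    # LEEP is the average log of that likelihood over the target set.
    eep_likelihood = (theta_x @ P_cond.T)[np.arange(n), y]  # shape (n,)
    return np.log(eep_likelihood).mean()
```

Since the EEP likelihood is a probability, T(θ, D) is always negative; a larger (less negative) score indicates better expected transferability. For source model selection (slide 12), candidate models can simply be ranked by this score on the target data.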

  7. Experiment: overview
  ◮ Aim: show that LEEP can predict actual transfer accuracy
  ◮ Procedure:
    ◮ Consider many random transfer learning tasks
    ◮ Compute LEEP scores for these tasks
    ◮ Compute the actual test accuracy of transfer learning methods on these tasks
    ◮ Evaluate correlations between LEEP scores and the test accuracies
  ◮ Transfer methods (sketched below):
    ◮ Retrain head: only retrain the last fully connected layer using the target data set
    ◮ Fine-tune: replace the head classifier and fine-tune all model parameters with SGD
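
For concreteness, a minimal PyTorch sketch of the two transfer methods, assuming an ImageNet-pretrained ResNet18 and a 100-class target task (the setup and names are illustrative, not the exact experimental code):

```python
import torch.nn as nn
from torchvision import models

# Pre-trained source model theta (newer torchvision versions use the `weights=` argument instead).
model = models.resnet18(pretrained=True)
# Replace the head with a new classifier for the (assumed) 100-class target task.
model.fc = nn.Linear(model.fc.in_features, 100)

# "Retrain head": freeze every backbone parameter and train only the new last layer.
for name, param in model.named_parameters():
    if not name.startswith("fc."):
        param.requires_grad = False

# "Fine-tune": skip the freezing loop above and update all parameters with SGD
# on the target data set (training loop not shown).
```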

  8. Experiment: LEEP vs. Transfer Accuracy
  ◮ Compare the LEEP score with the test accuracy of transferred models on 200 random target tasks
  ◮ Result: LEEP scores are highly correlated with actual test accuracies (correlation coefficients > 0.94)
  [Scatter plots: test accuracy vs. LEEP score for the fine-tune and retrain-head methods; left: ImageNet → CIFAR100 (ResNet18), right: CIFAR10 → CIFAR100 (ResNet20)]

  9. Experiment: LEEP with Small Data
  ◮ Restrict target data sets to 5 random classes and 50 examples per class
  ◮ Partition the range of LEEP scores into 5 transferability levels and average the test accuracies of tasks within each level (see the sketch below)
  ◮ Result: a higher transferability level according to LEEP → easier to transfer
  ◮ Similar results when the target data sets are imbalanced
  [Plot: average test accuracy vs. transferability level (1–5) for the fine-tune and retrain-head methods]
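
A small sketch of how such levels can be formed, assuming equal-width bins over the observed score range (the binning rule and all names here are assumptions for illustration):

```python
import numpy as np

def transferability_levels(leep_scores, num_levels=5):
    """Assign each task a level 1..num_levels by binning its LEEP score (equal-width bins)."""
    edges = np.linspace(leep_scores.min(), leep_scores.max(), num_levels + 1)
    # np.digitize maps the maximum score to bin num_levels + 1, so clip it back into range.
    return np.clip(np.digitize(leep_scores, edges), 1, num_levels)

# Example: average the test accuracy of all tasks within each level.
# levels = transferability_levels(leep_scores)
# avg_acc = [test_acc[levels == k].mean() for k in range(1, 6)]
```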

  10. Experiment: LEEP vs. Meta-Transfer Accuracy
  ◮ Compare the LEEP score with the test accuracy of Conditional Neural Adaptive Processes (CNAPs) (Requeima et al., 2019)
  ◮ CNAPs was trained on the Meta-Dataset (Triantafillou et al., 2020)
  ◮ Target tasks are drawn from CIFAR100
  ◮ Result: a higher transferability level according to LEEP → easier to meta-transfer
  [Plot: average test accuracy vs. transferability level (1–5)]

  11. Experiment: LEEP vs. Convergence of Fine-tuned Models
  ◮ Compare convergence speed against a reference model
  ◮ Reference model: trained from scratch using only the target data set
  ◮ Result: a higher transferability level according to LEEP → better convergence
  [Plots: accuracy difference to the reference model vs. training epoch (1–15) for transferability levels 1–5; left: ImageNet → CIFAR100 (ResNet18), right: CIFAR10 → CIFAR100 (ResNet20)]

  12. Experiment: LEEP for Source Model Selection
  ◮ Select from 9 candidate models and transfer to CIFAR100
  ◮ Compare with:
    ◮ Negative Conditional Entropy (NCE) (Tran et al., 2019)
    ◮ H-score (Bao et al., 2019)
    ◮ ImageNet top-1 accuracy (Kornblith et al., 2019)
  ◮ Result: LEEP predicts the test accuracies better
  [Scatter plots: CIFAR100 test accuracy of the 9 candidate architectures (ResNet18, ResNet34, ResNet50, SENet154, DarkNet53, MobileNet 0.25/0.5/0.75/1.0) vs. LEEP score, NCE score, H score, and ImageNet accuracy]

  13. Discussion
  ◮ Model selection results are very sensitive to the architecture and size of the source networks → the scores may need to be calibrated/normalized for better performance
  ◮ LEEP is potentially useful for feature selection as well
  ◮ For very small data sets, retraining the head directly with 2nd-order optimization methods could also be efficient

  14. Thank you.
  Correspondence to: nguycuo@amazon.com
