Standardizing Evaluation of Neural Network Pruning
Jose Javier Gonzalez, Davis Blalock, John V. Guttag
Overview

ShrinkBench: Open-source library to facilitate development and standardized evaluation of neural network pruning methods
• Rapid prototyping of NN pruning methods
• Makes it easy to use standardized datasets, pretrained models, and finetuning setups
• Controls for potential confounding factors
Neural Network Pruning

• Pretrained networks are often quite accurate but large
• Pruning: Systematically remove parameters from a network
Neural Network Pruning

• Goal: Reduce the size of the network as much as possible with minimal drop in accuracy
• Often requires finetuning afterwards

[Figure: Accuracy of Pruned Networks, plotting accuracy against compression ratio]
Traditional Pipeline

Need a whole pipeline for performing experiments:
Data + Model → Pruning Algorithm → Finetuning → Evaluation
Traditional Pipeline

But only the pruning algorithm usually changes; rebuilding the rest of the pipeline for every method means duplicate effort and confounding variables.
Data + Model → Pruning Algorithm → Finetuning → Evaluation
ShrinkBench

Library to facilitate standardized evaluation of pruning methods.
Modules: Data, Model, Pruning Algorithm, Finetuning, Evaluation, Utils
ShrinkBench

• Provides standardized datasets, pretrained models, and evaluation metrics
• Simple and generic parameter-masking API
• Measures nonzero parameters, activations, and FLOPs
• Controlled experiments show the need for standardized evaluation
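To make the size metric concrete: once pruning is expressed as binary masks, the compression ratio follows from the masks alone. A minimal pure-Python sketch of that bookkeeping (illustrative only, not ShrinkBench's actual API):

```python
def compression_ratio(masks):
    """Compression ratio = total parameters / nonzero (kept) parameters.

    `masks` is a list of per-layer binary masks, each a list of rows.
    """
    total = sum(len(row) for mask in masks for row in mask)
    kept = sum(sum(row) for mask in masks for row in mask)
    return total / kept

# A single 4x4 layer keeping 4 of its 16 weights -> compression ratio 4.0
mask = [[0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]
ratio = compression_ratio([mask])  # 4.0
```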
Towards Standardization

But how do we standardize?

Standardized datasets
• Larger datasets (ImageNet) will be more insightful than smaller ones (CIFAR10)
Standardized architectures
• Crucial to match the complexity of the network with the complexity of the dataset/task
Pretrained models
• This can be a confounding factor, so it's important to use the same weights
Finetuning setup
• We want improvement coming from pruning, not just better hyperparameters
Towards Standardization

But how do we standardize?

Standardized datasets
• Widely adopted datasets, representative of real-world tasks
Standardized architectures
• With a reproducibility record, matched in complexity to the chosen dataset
Pretrained models
• Even for a fixed architecture and dataset, the exact weights may affect results
Finetuning setup
• We want improvement from pruning, not from better hyperparameters
Masking API

We can capture an arbitrary removal pattern using binary masks.

Model weights (+ data):          Pruning masks:
-2.1   4.6   0.8  -0.1           0  1  0  0
 0.2   1.5  -4.9   2.3           0  0  1  0
-2.5   2.7   4.2  -1.1           1  1  1  0
-0.3   5.0   3.1   4.7           0  1  0  1
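Applying a mask is just an elementwise multiply: weights where the mask is 0 are zeroed, the rest pass through unchanged. A minimal sketch using the weights and mask from the example above (pure Python for illustration, not ShrinkBench's implementation):

```python
def apply_mask(weights, mask):
    """Zero out every weight whose mask entry is 0; shapes must match."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]

weights = [[-2.1, 4.6,  0.8, -0.1],
           [ 0.2, 1.5, -4.9,  2.3],
           [-2.5, 2.7,  4.2, -1.1],
           [-0.3, 5.0,  3.1,  4.7]]
mask = [[0, 1, 0, 0],
        [0, 0, 1, 0],
        [1, 1, 1, 0],
        [0, 1, 0, 1]]
pruned = apply_mask(weights, mask)
# First row keeps only 4.6: [0.0, 4.6, 0.0, 0.0]
```

Because the mask is separate from the weights, any removal pattern (unstructured, channel-wise, block-sparse) fits the same interface.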
Masks → Accuracy

Given a pruning method expressed in terms of masks, ShrinkBench finetunes the model and systematically evaluates it.

[Figure: pruning masks on the left; resulting accuracy-vs-compression-ratio curve on the right]
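In this API, a pruning method is just a function from weights to masks. As a concrete (hypothetical) example, here is a sketch of global magnitude pruning, a standard baseline in which the largest-magnitude weights are kept at a target compression ratio; this is an illustration, not the library's exact interface:

```python
def magnitude_mask(weights, compression):
    """Keep the top 1/compression fraction of weights by absolute value.

    `weights` is a list of rows; returns a binary mask of the same shape.
    """
    flat = sorted((abs(w) for row in weights for w in row), reverse=True)
    n_keep = max(1, round(len(flat) / compression))
    threshold = flat[n_keep - 1]
    return [[1 if abs(w) >= threshold else 0 for w in row]
            for row in weights]

weights = [[-2.1, 4.6,  0.8, -0.1],
           [ 0.2, 1.5, -4.9,  2.3],
           [-2.5, 2.7,  4.2, -1.1],
           [-0.3, 5.0,  3.1,  4.7]]
mask = magnitude_mask(weights, compression=4)
# Keeps the 4 largest-magnitude weights: 5.0, -4.9, 4.7, 4.6
```

Sweeping `compression` over 1, 2, 4, 8, 16 and finetuning/evaluating at each point yields exactly the accuracy-vs-compression curve shown in the figure.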
ShrinkBench Results I

• ShrinkBench reports both compression and speedup, since they interact differently with pruning

[Figure: model compression vs. speedup]
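Why do compression and speedup interact differently? A weight in a convolutional layer is reused at every spatial position, while a fully connected weight is used once, so removing the same number of parameters from each yields the same compression but very different FLOP savings. A toy illustration with assumed layer sizes (all numbers hypothetical, counting only multiply FLOPs of nonzero weights):

```python
# Hypothetical two-layer network: a conv layer whose 100 weights are each
# applied at 32*32 spatial positions, and a fully connected layer whose
# 1000 weights are each used once.
CONV_PARAMS, CONV_REUSE = 100, 32 * 32
FC_PARAMS = 1000

def metrics(conv_kept, fc_kept):
    """Return (compression ratio, speedup) relative to the dense network."""
    dense_params = CONV_PARAMS + FC_PARAMS
    dense_flops = CONV_PARAMS * CONV_REUSE + FC_PARAMS
    params = conv_kept + fc_kept
    flops = conv_kept * CONV_REUSE + fc_kept
    return dense_params / params, dense_flops / flops

# Removing 50 weights from the conv layer vs. 50 from the fc layer:
# identical compression, very different speedup.
conv_pruned = metrics(conv_kept=50, fc_kept=1000)  # ~1.05x, ~1.98x
fc_pruned = metrics(conv_kept=100, fc_kept=950)    # ~1.05x, ~1.0005x
```

Reporting only one of the two metrics would make these two prunings look interchangeable, which is exactly the ambiguity the dual reporting avoids.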
ShrinkBench Results II

• ShrinkBench evaluates with varying compression and with several (dataset, architecture) combinations
ShrinkBench Results III

• ShrinkBench controls for confounding factors such as pretrained weights or finetuning hyperparameters
Summary

• ShrinkBench: an open-source library to facilitate development and standardized evaluation of neural network pruning methods
• Our controlled experiments across hundreds of models demonstrate the need for standardized evaluation

https://shrinkbench.github.io