When Ensembling Smaller Models is More Efficient than Single Large Models
WebVision 2020
Dan Kondratyuk, Mingxing Tan, Matthew Brown, Boqing Gong
{dankondratyuk,tanmingxing,mtbr,bgong}@google.com
Model Ensembles
Train multiple models and average their predictions during inference.
● Easy method to reduce prediction error
  ○ E.g., train a neural network architecture with different random initializations
● Introduces heavy efficiency penalties
● Most commonly reserved for the largest models
  ○ Can small ensembles be efficient?
[Figure: an input example is fed to models 1 … N, whose predictions are aggregated into a single prediction]
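As a minimal sketch of the aggregation step described above (not the authors' code; the list of `models` callables and the probability shapes are assumptions for illustration):

```python
import numpy as np

def ensemble_predict(models, x):
    """Average the class-probability outputs of N independently trained models.

    `models` is assumed to be a list of callables, each mapping an input
    batch to probabilities of shape (batch_size, num_classes).
    """
    probs = np.stack([model(x) for model in models])  # (N, batch, classes)
    return probs.mean(axis=0)  # uniform average over ensemble members
```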
Image Classification - Wide ResNet - CIFAR-10
● Ensembles can be both more accurate and more efficient
  ○ Each line represents one model architecture
  ○ Each point indicates the number of models ensembled
  ○ As model sizes get larger, the performance gap widens
  ○ Larger ensembles produce diminishing returns and become less efficient
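To make "more efficient" concrete: the inference cost of an ensemble is roughly the sum of its members' costs, so the comparison against one large model reduces to simple arithmetic (the numbers below are illustrative, not figures from the poster):

```python
def ensemble_flops(member_flops, num_members):
    """Total inference FLOPs of an ensemble of identical members."""
    return member_flops * num_members

# Illustrative only: three 0.5-GFLOP models cost 1.5 GFLOPs total,
# still cheaper than a single 2-GFLOP large model.
assert ensemble_flops(0.5e9, 3) < 2.0e9
```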
Image Classification - EfficientNet - ImageNet
● This trend appears for highly optimized models on larger datasets as well
  ○ EfficientNet scales the width, depth, and resolution of each model size
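For context, the joint scaling mentioned above is the compound scaling rule from the EfficientNet paper (Tan & Le, 2019), not restated on the poster; a sketch using the base coefficients reported in that paper:

```python
# Compound scaling: depth, width, and input resolution grow jointly with a
# single compound coefficient phi. The defaults below are the coefficients
# from the EfficientNet paper, constrained so alpha * beta**2 * gamma**2 ~= 2
# (each +1 in phi roughly doubles FLOPs).
def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    return {
        "depth_multiplier": alpha ** phi,
        "width_multiplier": beta ** phi,
        "resolution_multiplier": gamma ** phi,
    }

print(compound_scale(phi=1))  # roughly EfficientNet-B1 relative to B0
```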
NAS Ensemble - ImageNet
● Can we use NAS to generate diverse ensemble architectures?
  ○ Can architecture diversity boost the accuracy to FLOPs/latency ratio?
  ○ Pareto curve shown for model ensembles searched with NAS
  ○ Surprisingly, a single searched model performs nearly the same as a diverse ensemble
[Figure: accuracy vs. latency (ms) Pareto curve for searched ensembles]
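A sketch of how a Pareto curve like the one on the poster can be extracted from (latency, accuracy) measurements; `pareto_front` is a hypothetical helper, not part of the authors' search code:

```python
def pareto_front(points):
    """Return the (latency_ms, accuracy) points no other point dominates,
    i.e., nothing else is both faster and more accurate."""
    ordered = sorted(points, key=lambda p: (p[0], -p[1]))  # fastest first
    front = []
    for latency, acc in ordered:
        if not front or acc > front[-1][1]:  # keep only strict accuracy gains
            front.append((latency, acc))
    return front

# Example with made-up numbers: (latency in ms, top-1 accuracy)
print(pareto_front([(10, 0.70), (12, 0.69), (20, 0.75), (30, 0.74)]))
# -> [(10, 0.70), (20, 0.75)]
```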
Conclusion
● Ensembles of smaller models can be more accurate and more efficient than single large models, especially as model size grows
  ○ One can use ensembles as a more flexible trade-off between a model's inference speed and accuracy
  ○ Ensembles can be easily distributed across multiple workers, further increasing efficiency
● A NAS search can find a single well-optimized architecture for ensembling
  ○ However, ensembling diverse architectures from a search over multiple models performs nearly the same as ensembling one model architecture from the search
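A minimal sketch of the distribution point above, assuming each ensemble member is an independent callable that could live on its own worker; Python's ThreadPoolExecutor stands in here for real multi-device dispatch:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_ensemble_predict(models, x):
    """Evaluate all ensemble members concurrently, one worker per model.

    Because members are independent, wall-clock latency approaches that
    of a single member rather than the sum over all members.
    """
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        outputs = list(pool.map(lambda model: model(x), models))
    return sum(outputs) / len(outputs)
```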