THE LOTTERY TICKET HYPOTHESIS: FINDING SPARSE, TRAINABLE NEURAL NETWORKS Slides prepared for reading club by Nolan Dey
Motivation
- Pruning techniques can reduce parameter counts by 90% without harming accuracy
- Diagram: randomly initialize weights → train → prune → 90% accuracy both before and after pruning
- Diagram: randomly re-initialize the pruned architecture and train from scratch → only 60% accuracy
The Lottery Ticket Hypothesis A randomly-initialized, dense neural network contains a subnetwork that is initialized such that—when trained in isolation—it can match the test accuracy of the original network after training for at most the same number of iterations.
The Lottery Ticket Hypothesis
- Diagram: randomly initialize weights → train → prune → 90% accuracy
- Diagram: train the pruned subnetwork from the same original initialization → 90% accuracy
Lottery Analogy - If you want to win the lottery, just buy a lot of tickets and some will likely win - Buying a lot of tickets = having an overparameterized neural network for your task - Winning the lottery = training a network with high accuracy - Winning ticket = pruned subnetwork which achieves high accuracy
Identifying Winning Tickets
One-shot pruning
1. Randomly initialize a neural network
2. Train the network
3. Prune the p%** of weights with the lowest magnitude in each layer (set them to 0)
4. Reset the remaining weights to their values from the original random initialization
Iterative pruning (sketched in code below)
- Repeat the train-prune-reset cycle, pruning a smaller fraction of the surviving weights each round
- Yields smaller winning tickets than one-shot pruning
**Connections to the output layer are pruned at 50% of the pruning rate
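A minimal PyTorch sketch of the iterative procedure is below. `build_model` and `train` are hypothetical placeholders for your own architecture and training loop (the training loop is assumed to re-apply the mask after every optimizer step), and the paper's detail of pruning output-layer connections at half the rate is omitted.

```python
# Sketch of iterative magnitude pruning with weight rewinding (lottery-ticket style).
# Not the authors' code: `build_model()` and `train(model, masks)` are assumed helpers.
import copy
import torch

def update_masks(model, masks, prune_frac):
    """Prune the lowest-magnitude surviving weights in each layer."""
    new_masks = {}
    for name, param in model.named_parameters():
        if "weight" not in name:                     # leave biases untouched
            new_masks[name] = masks[name]
            continue
        alive = param.data[masks[name].bool()].abs()            # surviving weights
        threshold = torch.quantile(alive, prune_frac)           # magnitude cutoff
        new_masks[name] = masks[name] * (param.data.abs() > threshold).float()
    return new_masks

def find_winning_ticket(build_model, train, rounds=15, prune_per_round=0.2):
    model = build_model()
    init_state = copy.deepcopy(model.state_dict())              # 1. save the init
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()}

    for _ in range(rounds):
        train(model, masks)                                     # 2. train (mask re-applied each step)
        masks = update_masks(model, masks, prune_per_round)     # 3. prune lowest-magnitude weights
        model.load_state_dict(init_state)                       # 4. rewind to the original init
        for name, param in model.named_parameters():
            param.data *= masks[name]                           # zero out pruned connections
    return model, masks
```

With prune_per_round = 0.2 and 15 rounds, roughly 0.8^15 ≈ 3.5% of the weights survive, which is the regime where the paper reports winning tickets.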
Results - Tested with fully connected, convolutional, and ResNet architectures on MNIST and CIFAR-10 - Winning tickets are 10-20% of the size of the original network and meet or exceed its test accuracy in at most the same number of training iterations - Works with different optimizers (SGD, momentum, Adam), dropout, weight decay, batchnorm, residual connections - Sensitive to learning rate: at higher learning rates, a number of “warmup” iterations is required to find winning tickets
Discussion - Are winning initializations already close to their fully-trained values? - No! They actually change more during training than the other parameters - Winning initializations may land in a region of the loss landscape that is particularly amenable to optimization - The authors conjecture that SGD seeks out and trains a winning ticket within an overparameterized network - Pruned subnetworks generalize better (smaller gap between train and test accuracy)
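The "do winning weights move more?" check could be reproduced roughly as below. This is an assumed helper, not from the paper; `init_state`, `final_state`, and `masks` follow the sketch above.

```python
# Sketch: compare how far winning-ticket weights move from their initialization
# versus the weights that end up pruned away.
import torch

def weight_movement(init_state, final_state, masks):
    moved_kept, moved_pruned = [], []
    for name, mask in masks.items():
        delta = (final_state[name] - init_state[name]).abs()
        moved_kept.append(delta[mask.bool()])       # weights kept in the winning ticket
        moved_pruned.append(delta[~mask.bool()])    # weights that were pruned
    kept = torch.cat(moved_kept).mean().item()
    pruned = torch.cat(moved_pruned).mean().item()
    return kept, pruned   # the paper's finding: kept > pruned on average
```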
Limitations - Iterative pruning is computationally intensive -> involves training a network 15 or more times per trial - Hard to study larger datasets like ImageNet - Future work: find more efficient methods of identifying winning tickets - The winning tickets found are unstructured sparse networks, not optimized for modern libraries or hardware - Future work: non-magnitude-based pruning methods might find smaller winning tickets earlier
Let’s Discuss