Overcoming Multi-Model Forgetting


  1. Overcoming Multi-Model Forgetting. Y. Benyahia, K. Yu, K. Bennani-Smires, M. Jaggi, A. Davison, M. Salzmann, C. Musat

  2. The Weight Sharing. In one of the first NAS papers using reinforcement learning, Zoph et al. (Google) used more than 800 GPUs in parallel for two weeks. Weight sharing was introduced in NAS to speed up the search: Efficient Neural Architecture Search (Pham et al.). A minimal sketch of the mechanism follows.
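
A minimal sketch of ENAS-style weight sharing, assuming PyTorch; the layer sizes and the two candidate architectures are illustrative stand-ins, not the paper's actual search space:

```python
import torch
import torch.nn as nn

# One set of parameters shared across all candidate architectures.
shared = nn.Linear(32, 32)

class CandidateA(nn.Module):
    def __init__(self):
        super().__init__()
        self.shared = shared            # same Parameter objects as CandidateB
        self.private = nn.Linear(32, 10)

    def forward(self, x):
        return self.private(torch.relu(self.shared(x)))

class CandidateB(nn.Module):
    def __init__(self):
        super().__init__()
        self.shared = shared            # reuses the shared weights
        self.private = nn.Sequential(
            nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

    def forward(self, x):
        return self.private(torch.relu(self.shared(x)))
```

Training either candidate updates the parameters of `shared`, so the other candidate's function changes even though it was never touched. This is the mechanism behind multi-model forgetting.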

  3. Assumptions. Our hypotheses: 1. Weight sharing can negatively affect the trained architectures. 2. If this holds, it can lead to a wrong evaluation of candidates in NAS, making the evaluation phase closer to random.

  4. Multi-Model Forgetting. Training a second model that shares weights with a first overwrites the shared parameters and degrades the first model's performance; the sketch below illustrates this.
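
A hedged demonstration of the effect, reusing `CandidateA` and `CandidateB` from the sketch above; the random data, optimizer, and step counts are placeholders:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(256, 32)
y = torch.randint(0, 10, (256,))

def train(model, steps=200):
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()

a, b = CandidateA(), CandidateB()
train(a)
with torch.no_grad():
    before = F.cross_entropy(a(x), y).item()
train(b)                  # updates the shared weights behind model A's back
with torch.no_grad():
    after = F.cross_entropy(a(x), y).item()
# Model A's loss typically rises after B is trained, even though A's own
# private parameters were never updated.
print(f"model A loss before/after training B: {before:.3f} / {after:.3f}")
```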

  5. Study of Weight-Sharing. Simple scenario of two models sharing parameters: $m_1 = \{\theta_s, \theta_{p_1}\}$ and $m_2 = \{\theta_s, \theta_{p_2}\}$, where $\theta_s$ denotes the shared weights. Assume that we have access to the optimal parameters $\hat{\theta}_1 = \{\hat{\theta}_s, \hat{\theta}_{p_1}\}$ of the first model. Maximizing the posterior distribution $p(\theta \mid \mathcal{D})$ then yields a loss of the form $\mathcal{L}(\theta_2) = \mathcal{L}_{CE}(\theta_2) + \frac{1}{2} \sum_i \Omega_i (\theta_s^i - \hat{\theta}_s^i)^2$: the cross-entropy loss of the second model plus a weight-importance ($\Omega_i$) weighted L2 regularization anchoring the shared weights to the first model's optimum. A sketch of this objective follows.
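
A minimal sketch of this WPL-style objective, assuming PyTorch. The strength `alpha` and the uniform `omega` below are placeholder assumptions (the paper derives the weight importance $\Omega$ from the posterior around the first model's optimum), and `shared` refers to the shared layer in the first sketch:

```python
import torch
import torch.nn.functional as F

def wpl_loss(logits, targets, shared_params, shared_star, omega, alpha=1.0):
    """Cross-entropy plus importance-weighted L2 anchoring of shared weights."""
    ce = F.cross_entropy(logits, targets)
    reg = sum((w * (p - p0).pow(2)).sum()
              for p, p0, w in zip(shared_params, shared_star, omega))
    return ce + 0.5 * alpha * reg

# After training the first model, snapshot its shared weights; a uniform
# omega stands in here for the importance weights used in the paper.
shared_star = [p.detach().clone() for p in shared.parameters()]
omega = [torch.ones_like(p) for p in shared_star]
```

The penalty only touches the shared parameters, so the second model's private weights remain free to learn while the shared ones are discouraged from drifting away from the first model's optimum.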

  6. Experiments on Two Models. - WPL reduces multi-model forgetting. - WPL has a minimal effect on the learning of the second model.

  7. ENAS on PTB (Penn Treebank)

  8. Summing Up. To recap, our main contributions are: 1. Showing that weight sharing negatively impacts NAS. 2. Showing that weight sharing can cause the search phase in NAS to become closer to random. 3. WPL, which reduces multi-model forgetting. Poster: Pacific Ballroom #19 (6:30pm-9pm).
