LSTM: A Search Space Odyssey

  1. LSTM: A Search Space Odyssey. Klaus Greff, Rupesh K. Srivastava, Jan Koutník, Bas R. Steunebrink, Jürgen Schmidhuber, 2015. Presenters: Yijun Tian, Zhenyu Liu

  2. Abstract ● In this paper, the authors analyze the performance of the vanilla LSTM and eight of its variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. ● Hyperparameters for each variant were optimized individually using random search, and their importance was gauged using fANOVA (a tool for assessing hyperparameter importance).

  3. Datasets ● TIMIT: the TIMIT Speech Corpus (speech recognition) ● IAM Online: the IAM Online Handwriting Database (handwriting recognition) ● JSB Chorales: a collection of 382 four-part harmonized chorales by J. S. Bach (polyphonic music modeling)

  4. Vanilla LSTM. N: number of LSTM blocks; M: input size.
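The slide shows only the figure from the paper; as a rough sketch (not the authors' code), the vanilla LSTM forward pass with peephole connections can be written as below. Variable and parameter names are hypothetical NumPy placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def vanilla_lstm_step(x, y_prev, c_prev, params):
    """One time step of a vanilla LSTM block with peephole connections.

    x: input vector (size M); y_prev: previous block output (size N);
    c_prev: previous cell state (size N). params holds, for each gate/input
    key in {"z", "i", "f", "o"}: W (N x M input weights), R (N x N recurrent
    weights), p (N peephole weights), b (N biases).
    """
    W, R, p, b = params["W"], params["R"], params["p"], params["b"]
    z = np.tanh(W["z"] @ x + R["z"] @ y_prev + b["z"])                      # block input
    i = sigmoid(W["i"] @ x + R["i"] @ y_prev + p["i"] * c_prev + b["i"])    # input gate
    f = sigmoid(W["f"] @ x + R["f"] @ y_prev + p["f"] * c_prev + b["f"])    # forget gate
    c = z * i + c_prev * f                                                  # cell state
    o = sigmoid(W["o"] @ x + R["o"] @ y_prev + p["o"] * c + b["o"])         # output gate
    y = np.tanh(c) * o                                                      # block output
    return y, c
```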

  5. LSTM Variants. The eight variants each change one aspect of the vanilla LSTM: NIG (no input gate), NFG (no forget gate), NOG (no output gate), NIAF (no input activation function), NOAF (no output activation function), NP (no peephole connections), CIFG (coupled input and forget gate), and FGR (full gate recurrence).

  6. Experiments ● Performed 27 random searches (one for each combination of the nine variants and three datasets). ● Each random search encompassed 200 trials, each sampling the following hyperparameters at random: number of LSTM blocks per hidden layer, learning rate, momentum, and standard deviation of Gaussian input noise.
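A minimal sketch of one random-search trial as described on this slide. The sampling ranges below are illustrative assumptions, not the exact ranges used in the paper.

```python
import random

def sample_hyperparameters():
    """Draw one random-search trial; ranges are assumed, not the paper's."""
    return {
        # number of LSTM blocks per hidden layer (log-uniform, assumed range)
        "hidden_size": int(round(10 ** random.uniform(1.3, 2.3))),
        # learning rate (log-uniform, assumed range)
        "learning_rate": 10 ** random.uniform(-6, -2),
        # momentum (assumed range)
        "momentum": 1 - 10 ** random.uniform(-2, 0),
        # standard deviation of Gaussian input noise (assumed range)
        "input_noise_std": random.uniform(0.0, 1.0),
    }

# 200 trials per (variant, dataset) combination, as described on the slide
trials = [sample_hyperparameters() for _ in range(200)]
```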

  7. Results: NOAF (no output activation function) and NFG (no forget gate) perform significantly worse than the vanilla LSTM.

  8. Results: Learning rate and network size are the most important hyperparameters.

  9. Conclusions and Insights ● None of the variants improves significantly upon the standard LSTM architecture. ● Coupling the input and forget gates (CIFG) and removing peephole connections (NP) simplify the LSTM without significantly hurting performance, making them attractive variants. ● The forget gate and the output activation function are the most critical components of the LSTM block. ● Learning rate and network size are the most important hyperparameters. ● Hyperparameter interactions show no apparent structure, so hyperparameters can be tuned almost independently.

  10. Take-home message: the most commonly used LSTM architecture (vanilla LSTM) performs reasonably well on a variety of datasets. Thank you! Questions?
