Gated Orthogonal Recurrent Units: On Learning to Forget
Li Jing, Çağlar Gülçehre, John Peurifoy, Yichen Shen, Max Tegmark, Marin Soljačić, Yoshua Bengio
Gradient Vanishing/Explosion Problem
• During backpropagation through time, the hidden-to-hidden Jacobian matrix is multiplied many times; if its spectral norm is not close to 1, the gradient norm shrinks or grows exponentially with sequence length.
• Vanishing/exploding gradients make RNNs hard to train on long sequences.
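A small numerical sketch (illustrative, not from the slides) of this effect: repeatedly multiplying a gradient vector by the same recurrent Jacobian, as happens in backpropagation through time, makes its norm collapse or blow up depending on the matrix's scale.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 64, 200
grad = rng.standard_normal(n)

for scale, label in [(0.9, "contractive W"), (1.1, "expansive W")]:
    # random orthogonal matrix rescaled so all singular values equal `scale`
    W = scale * np.linalg.qr(rng.standard_normal((n, n)))[0]
    g = grad.copy()
    for _ in range(T):
        g = W.T @ g          # one step of backpropagation through the hidden state
    print(f"{label}: |grad| after {T} steps = {np.linalg.norm(g):.3e}")
# contractive W -> norm vanishes; expansive W -> norm explodes
```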
Conventional Solution: LSTM
• In practice, gradient clipping is still required to control exploding gradients
• Slow to learn long-term dependencies
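For reference, a minimal sketch of global-norm gradient clipping as it is commonly applied when training LSTMs; the threshold value here is an arbitrary example, not one taken from the slides.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm=5.0):
    """Rescale a list of gradient arrays so their combined L2 norm is at most max_norm."""
    total = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total > max_norm:
        grads = [g * (max_norm / total) for g in grads]
    return grads
```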
Unitary/Orthogonal RNN
Unitary/orthogonal matrices preserve the norm of vectors: ‖Ux‖ = ‖x‖. By enforcing the hidden-to-hidden transition matrix to be unitary/orthogonal, the norm of the gradient stays the same no matter how many time steps are propagated.
Prior work:
• Restricted-capacity unitary matrix parametrization (Arjovsky et al., ICML 2016)
• Full-capacity unitary matrix by projection (Wisdom et al., NIPS 2016)
• Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs (Jing et al., ICML 2017)
• Orthogonal matrix parametrization by reflections (Mhammedi et al., ICML 2017)
• Orthogonal matrix by regularization (Vorontsov et al., ICML 2017)
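A short sketch of one of the parametrization strategies listed above: building an orthogonal matrix as a product of Householder reflections (in the spirit of the reflection-based parametrization), then checking that it preserves vector norms. The helper name and sizes are illustrative assumptions.

```python
import numpy as np

def householder_orthogonal(vectors):
    """Product of Householder reflections H = I - 2 uu^T / ||u||^2; each H is
    orthogonal, so the product is orthogonal as well."""
    n = vectors.shape[1]
    U = np.eye(n)
    for u in vectors:
        H = np.eye(n) - 2.0 * np.outer(u, u) / (u @ u)
        U = H @ U
    return U

rng = np.random.default_rng(1)
n = 32
U = householder_orthogonal(rng.standard_normal((n, n)))  # n reflections

x = rng.standard_normal(n)
print(np.allclose(U.T @ U, np.eye(n)))            # True: U is orthogonal
print(np.linalg.norm(x), np.linalg.norm(U @ x))   # equal norms: no vanishing/exploding
```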
Limitations of the Basic Orthogonal RNN
• No forgetting mechanism
• Limited memory size
Applying a Gated Mechanism to the Orthogonal RNN
[Figure: Gated Orthogonal Recurrent Unit cell, with input x, hidden state h, update gate z (and 1-z), reset gate r, input matrix W, orthogonal recurrent matrix U, and a modReLU nonlinearity]
• Unitary/orthogonal matrices → long-term dependency
• Gated mechanism → forgetting
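A minimal NumPy sketch of a GRU-style gated cell whose candidate state uses an orthogonal recurrent matrix and a modReLU nonlinearity, following the structure suggested by the diagram (gates z and r, matrices W and U, modReLU). The exact gate placement and bias conventions here are assumptions for illustration, not a verbatim transcription of the GORU equations.

```python
import numpy as np

def modrelu(x, b):
    """modReLU: keeps the sign of x and applies ReLU to (|x| + bias)."""
    return np.sign(x) * np.maximum(np.abs(x) + b, 0.0)

def goru_step(x, h, params):
    """One GORU-style step (sketch): GRU-like gates, orthogonal recurrent matrix U."""
    Wr, Ur, br, Wz, Uz, bz, W, U, b = params
    r = 1.0 / (1.0 + np.exp(-(Wr @ x + Ur @ h + br)))   # reset gate
    z = 1.0 / (1.0 + np.exp(-(Wz @ x + Uz @ h + bz)))   # update (forget) gate
    h_tilde = modrelu(W @ x + r * (U @ h), b)            # candidate state; U is orthogonal
    return z * h + (1.0 - z) * h_tilde                   # mix old and candidate state

# Illustrative usage with random (untrained) parameters.
rng = np.random.default_rng(2)
n_in, n_h = 8, 16
U, _ = np.linalg.qr(rng.standard_normal((n_h, n_h)))    # orthogonal recurrent matrix
params = (rng.standard_normal((n_h, n_in)), rng.standard_normal((n_h, n_h)), np.zeros(n_h),
          rng.standard_normal((n_h, n_in)), rng.standard_normal((n_h, n_h)), np.zeros(n_h),
          rng.standard_normal((n_h, n_in)), U, np.zeros(n_h))
h = np.zeros(n_h)
for x in rng.standard_normal((5, n_in)):                # run a short input sequence
    h = goru_step(x, h, params)
```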
Experimental Results
Synthetic tasks: GORU is the only model that succeeds on all of them
• Parenthesis task
• Copying task
• Denoise task
Experimental Results
Real tasks: GORU outperforms all other models
• Question answering task
• Speech task
Thank you