Recurrent Language Models CMSC 470 Marine Carpuat
Toward a Neural Language Model Figures by Philipp Koehn (JHU)
Count-based n-gram models vs. feedforward neural networks
• Pros of feedforward neural LM
  • Word embeddings capture generalizations across word types
• Cons of feedforward neural LM
  • Closed vocabulary
  • Training/testing is more computationally expensive
• Weaknesses of both types of model
  • Only work well for word prediction if the test corpus looks like the training corpus
  • Only capture short-distance context
Language Modeling with Recurrent Neural Networks Figure by Philipp Koehn
Recurrent Neural Networks (RNN)
• The hidden layer includes a recurrent connection as part of its input
• Unrolling the RNN over the time sequence gives a feed-forward network
• The hidden layer from the previous time step plays the role of memory, remembering earlier context
Figures from Jurafsky & Martin
Unrolled RNN illustrated: the weights U, V, W are shared across all timesteps
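A minimal sketch of the recurrence the unrolled figure depicts, assuming NumPy, tanh as the hidden activation, and toy randomly initialized weights (none of these choices come from the slides). The point is that the same U and W are reused at every time step, and the previous hidden state carries the earlier context forward:

```python
import numpy as np

hidden_size, embed_size = 4, 3
U = np.random.randn(hidden_size, hidden_size) * 0.1  # hidden -> hidden (recurrent connection)
W = np.random.randn(hidden_size, embed_size) * 0.1   # input  -> hidden

h = np.zeros(hidden_size)                                  # h_0: initial "memory"
inputs = [np.random.randn(embed_size) for _ in range(5)]   # stand-in word embeddings

for x_t in inputs:
    # h_t = g(U h_{t-1} + W x_t): the previous hidden state acts as memory of earlier context
    h = np.tanh(U @ h + W @ x_t)
```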
Prediction/Inference with RNNs
For language modeling, f is the softmax function, which provides a normalized probability distribution over the possible output classes (the words of the vocabulary)
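A minimal sketch of the prediction step under the same toy NumPy assumptions as above: the output matrix V maps the hidden state to one score per vocabulary word, and softmax normalizes those scores into a probability distribution over the next word.

```python
import numpy as np

vocab_size, hidden_size = 10, 4
V = np.random.randn(vocab_size, hidden_size) * 0.1   # hidden -> vocabulary scores
h_t = np.random.randn(hidden_size)                    # hidden state at time step t

def softmax(z):
    z = z - z.max()            # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

p_next = softmax(V @ h_t)      # P(w_{t+1} = k | history), entries sum to 1
predicted_word = int(p_next.argmax())
```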
Training RNNs with backpropagation
• Training goal: estimate parameter values for U, V, W
• Use the same loss as for feedforward language models
• Given the unrolled network, run the forward and backpropagation algorithms as usual (a minimal training sketch follows)
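A minimal training sketch of those bullets. PyTorch, the layer sizes, and the random toy data are assumptions for illustration, not part of the lecture; the structure follows the slide: the same cross-entropy loss as the feedforward LM, with forward and backward passes run over the unrolled network.

```python
import torch
import torch.nn as nn

vocab_size, embed_size, hidden_size = 100, 32, 64
embed = nn.Embedding(vocab_size, embed_size)
rnn = nn.RNN(embed_size, hidden_size, batch_first=True)    # holds U and W
out = nn.Linear(hidden_size, vocab_size)                   # holds V
params = list(embed.parameters()) + list(rnn.parameters()) + list(out.parameters())
optimizer = torch.optim.SGD(params, lr=0.1)
loss_fn = nn.CrossEntropyLoss()                            # same log-loss as the feedforward LM

tokens = torch.randint(0, vocab_size, (1, 21))             # one toy "sentence"
inputs, targets = tokens[:, :-1], tokens[:, 1:]            # predict the next word at each position

hidden_states, _ = rnn(embed(inputs))                      # forward pass over the unrolled timesteps
logits = out(hidden_states)                                # shape (1, 20, vocab_size)
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()                                            # backpropagation through the unrolled network
optimizer.step()                                           # update U, V, W (and the embeddings)
```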
Practical Training Issues: vanishing/exploding gradients
Multiple ways to work around this problem:
- ReLU activations help
- Dedicated RNN architectures (Long Short-Term Memory networks)
Figure by Graham Neubig
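A small numerical illustration of the problem (toy numbers chosen for this sketch, not from the lecture): backpropagating through T time steps multiplies the gradient by roughly the same recurrent factor T times, so it shrinks or grows geometrically depending on that factor.

```python
import numpy as np

grad = np.ones(4)
W_small = 0.5 * np.eye(4)      # recurrent weights with small spectral norm
W_large = 1.5 * np.eye(4)      # recurrent weights with large spectral norm

g_vanish, g_explode = grad.copy(), grad.copy()
for t in range(20):            # 20 steps of backpropagation through time
    g_vanish = W_small.T @ g_vanish    # shrinks geometrically -> vanishing gradient
    g_explode = W_large.T @ g_explode  # grows geometrically  -> exploding gradient

print(np.linalg.norm(g_vanish))   # about 2 * 0.5**20 ~= 1.9e-6
print(np.linalg.norm(g_explode))  # about 2 * 1.5**20 ~= 6.6e+3
```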
Aside: Long Short-Term Memory (LSTM) Networks
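A minimal sketch of a single LSTM cell step, again in NumPy with toy sizes and with bias terms omitted for brevity (all assumptions of this sketch). The gates decide what to forget from, write to, and read from the cell state, which is what lets information and gradients survive over longer distances.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden_size, embed_size = 4, 3
concat_size = hidden_size + embed_size
# one weight matrix per gate plus one for the candidate cell update
W_f, W_i, W_o, W_c = (np.random.randn(hidden_size, concat_size) * 0.1 for _ in range(4))

h_prev, c_prev = np.zeros(hidden_size), np.zeros(hidden_size)
x_t = np.random.randn(embed_size)
z = np.concatenate([h_prev, x_t])

f = sigmoid(W_f @ z)              # forget gate: how much of the old cell state to keep
i = sigmoid(W_i @ z)              # input gate: how much new information to write
o = sigmoid(W_o @ z)              # output gate: how much of the cell state to expose
c_tilde = np.tanh(W_c @ z)        # candidate cell update
c_t = f * c_prev + i * c_tilde    # new cell state ("long-term memory")
h_t = o * np.tanh(c_t)            # new hidden state
```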
What do Recurrent Language Models Learn? Figure from Karpathy 2015
What do Recurrent Language Models Learn?
• Parameters are hard to interpret, so we can gain insights by analyzing their output behavior instead
• Can capture (some) long-distance dependencies:
  After much economic progress over the years, the country has …
  The country, which has made much economic progress over the years, still has …
Recurrent neural network language models
• Have all the strengths of the feedforward language model
• And do a better job at modeling long-distance context
• However:
  • Training is trickier due to vanishing/exploding gradients
  • Performance on test sets is still sensitive to distance from the training data