Deep Learning for Natural Language Processing
Introduction to transfer learning and pre-trained embeddings
Richard Johansson
richard.johansson@gu.se
recap: embeddings
◮ in a neural network, an embedding layer represents a symbol as a continuous vector
◮ we’ve seen how word embeddings are used as the first layer in NLP systems such as categorizers
◮ so far, we trained the word embeddings from scratch (a small code sketch follows below)
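To make the recap concrete, here is a minimal sketch of an embedding layer, assuming PyTorch; the vocabulary size, embedding dimension, and word indices are made up purely for illustration.

```python
import torch
import torch.nn as nn

# hypothetical sizes, chosen only for illustration
vocab_size = 10_000   # number of distinct word types
embed_dim = 100       # length of each word vector

# an embedding layer is essentially a lookup table with one row per word
embedding = nn.Embedding(vocab_size, embed_dim)

# a toy "sentence" encoded as word indices (made-up ids)
word_ids = torch.tensor([[12, 507, 4, 981]])

vectors = embedding(word_ids)
print(vectors.shape)   # torch.Size([1, 4, 100])
```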
transfer learning: idea and motivation
◮ in transfer learning, we try to exploit previously learned knowledge when solving new tasks
◮ in practice: after training, we reuse some part of the model (see the sketch below)
◮ why? because it can reduce the need for training data for the target task
◮ commonly used when training ML models for vision tasks
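As a rough illustration of "reusing some part of the model", the PyTorch sketch below copies a trained embedding layer from one model into a new model for the target task; both architectures are invented for the example and not taken from the lecture.

```python
import torch.nn as nn

# two hypothetical models, only to illustrate reusing part of a trained model
source_model = nn.Sequential(          # imagine this was trained on a large source task
    nn.Embedding(10_000, 100),
    nn.Linear(100, 50),
    nn.ReLU(),
    nn.Linear(50, 5),
)

target_model = nn.Sequential(          # new model for the target task
    nn.Embedding(10_000, 100),         # same shape, so the weights can be copied over
    nn.Linear(100, 2),                 # task-specific part, trained from scratch
)

# transfer: reuse the learned embedding layer in the new model
target_model[0].load_state_dict(source_model[0].state_dict())
```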
transfer learning in vision
[figure]
transfer learning in NLP
this lecture: pre-trained word embeddings
later: [figure]
key challenges for transfer learning
◮ learning generally useful representations
◮ so we need fairly general training tasks
◮ finding training data
◮ ideally, an unlimited supply!
◮ in NLP, we prefer to use raw text (unannotated) for pre-training representations
predicting contexts
◮ all pre-training methods for word embeddings are based on predicting what kind of context a word appears in
◮ for instance, the surrounding words
◮ easy to generate large amounts of training data (a sketch follows below)
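For instance, treating the surrounding words within a small window as the context gives (target, context) training pairs directly from raw text; the Python sketch below uses a made-up window size and simple whitespace tokenization purely for illustration.

```python
def context_pairs(tokens, window=2):
    """Collect (target word, context word) pairs using a sliding window."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

# toy example: every word is paired with its nearby neighbours
print(context_pairs("we bake and eat the cake".split()))
```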
justification in terms of linguistic theory
◮ “you shall know a word by the company it keeps” (Firth, 1957)
◮ two words probably have a similar “meaning” if they tend to appear in similar contexts
◮ the distributional hypothesis (Harris, 1954): the distribution of contexts in which a word appears is a good proxy for the “meaning” of that word
example: most frequent verbs near cake and pizza
◮ cake: eat, bake, throw, cut, buy, get, decorate, garnish, make, serve, order
◮ pizza: eat, bake, order, munch, buy, serve, garnish, name, get, make, heat
so what kinds of “contexts” can we use?
◮ surrounding words: rest of today’s talk
◮ alternatives:
◮ documents (Landauer and Dumais, 1997)
◮ syntax (Padó and Lapata, 2007)
◮ images (Lazaridou et al., 2015)
using word embeddings in NLP applications
◮ the pre-trained word embeddings can then be “plugged” into NLP applications
◮ how? two alternatives (both sketched in the code below):
◮ let the word embeddings be fixed
◮ fine-tune the embeddings for the application
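In PyTorch, for example, the two alternatives correspond to freezing or not freezing the embedding weights; the pre-trained matrix here is random and only stands in for vectors loaded from a real embedding file.

```python
import torch
import torch.nn as nn

# stand-in for a matrix of pre-trained vectors (e.g. loaded from a word2vec file)
pretrained_vectors = torch.randn(10_000, 100)

# alternative 1: keep the word embeddings fixed during training
frozen_emb = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)

# alternative 2: fine-tune the embeddings for the application
tuned_emb = nn.Embedding.from_pretrained(pretrained_vectors, freeze=False)

# either layer can then serve as the first layer of the application model
```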
next lecture clips
◮ the SGNS (word2vec) training algorithm
◮ evaluation and interpretation
◮ more training methods
◮ research outlook
references
J. Firth. 1957. Papers in Linguistics 1934–1951. OUP.
Z. Harris. 1954. Distributional structure. Word 10(23):146–162.
T. K. Landauer and S. T. Dumais. 1997. A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction and representation of knowledge. Psychological Review 104:211–240.
A. Lazaridou, N. T. Pham, and M. Baroni. 2015. Combining language and vision with a multimodal skip-gram model. In NAACL.
S. Padó and M. Lapata. 2007. Dependency-based construction of semantic space models. Computational Linguistics 33(2):161–199.