Algorithms for NLP CS 11711, Fall 2019 Lecture 1: Introduction

Algorithms for NLP CS 11711, Fall 2019 Lecture 1: Introduction Yulia Tsvetkov 1

Welcome! Yulia Bob Sachin Anjalie Chan 2

Course Website http://demo.clab.cs.cmu.edu/11711fa19/ 3

Communication with Machines ▪ ~50s-70s 4

Communication with Machines ▪ ~80s 5

Communication with Machines ▪ Today 6

Slide by Noah Smith 7

What is NLP? ▪ NL ∈ {Mandarin, Hindi, Spanish, Arabic, English, … Inuktitut} ▪ Automation of NLs: ▪ analysis ( NL → R ) ▪ generation ( R → NL ) ▪ acquisition of R from knowledge and data 8

What language technologies are required to write such a program? 9

Language Technologies A conversational agent contains ▪ Speech recognition ▪ Language analysis ▪ Dialog processing ▪ Information retrieval ▪ Text to speech 10

Language Technologies 11

Language Technologies ▪ What does “divergent” mean? ▪ What year was Abraham Lincoln born? ▪ How many states were in the United States that year? ▪ How much Chinese silk was exported to England at the end of the 18th century? ▪ What do scientists think about the ethics of human cloning? 12

NLP
Applications ▪ Machine Translation ▪ Information Retrieval ▪ Question Answering ▪ Dialogue Systems ▪ Information Extraction ▪ Summarization ▪ Sentiment Analysis ▪ ...
Core technologies ▪ Language modelling ▪ Part-of-speech tagging ▪ Syntactic parsing ▪ Named-entity recognition ▪ Coreference resolution ▪ Word sense disambiguation ▪ Semantic Role Labelling ▪ ... 13

What does an NLP system need to ‘know’? ▪ Language consists of many levels of structure ▪ Humans fluently integrate all of these in producing/understanding language ▪ Ideally, so would a computer! 14

What does it mean to “know” a language? 15

Levels of linguistic knowledge Slide by Noah Smith

Phonetics, phonology ▪ Pronunciation modeling 17

Words ▪ Language modeling ▪ Tokenization ▪ Spelling correction 18

Morphology ▪ Morphological analysis ▪ Tokenization ▪ Lemmatization 19

Parts of speech ▪ Part-of-speech tagging 20

Syntax ▪ Syntactic parsing 21

Semantics ▪ Named entity recognition ▪ Word sense disambiguation ▪ Semantic role labelling 22

Discourse ▪ Reference resolution ▪ Discourse parsing 23

Where are we now? Li et al. (2016), "Deep Reinforcement Learning for Dialogue Generation" EMNLP 24

Where are we now? ▪ https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist ▪ Zhao, J., Wang, T., Yatskar, M., Ordonez, V. and Chang, K.-W. (2017), "Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints" EMNLP 25

Why is NLP Hard? 1. Ambiguity 2. Scale 3. Sparsity 4. Variation 5. Expressivity 6. Unmodeled variables 7. Unknown representation R 26

Ambiguity ▪ Ambiguity at multiple levels: ▪ Word senses: bank (finance or river?) ▪ Part of speech: chair (noun or verb?) ▪ Syntactic structure: I can see a man with a telescope ▪ Multiple: I saw her duck 27

Ambiguity + Scale 28

Tokenization 29

Word Sense Disambiguation 30

Tokenization + Disambiguation 31

Part of Speech Tagging 32

Tokenization + Morphological Analysis ▪ Quechua 33

Morphology unfriend, Obamacare, Manfuckinghattan 34

Syntactic Parsing, Word Alignment 35

Semantic Analysis ▪ Every language sees the world in a different way ▪ For example, it could depend on cultural or historical conditions ▪ Russian has very few words for colors, Japanese has hundreds ▪ Multiword expressions (e.g. “it’s raining cats and dogs” or “wake up”) and metaphors (e.g. “love is a journey”) are very different across languages 36

Semantics Every fifteen minutes a woman in this country gives birth. 37

Semantics Every fifteen minutes a woman in this country gives birth. Our job is to find this woman, and stop her! – Groucho Marx 38

Syntax + Semantics We saw the woman with the telescope wrapped in paper. ▪ Who has the telescope? ▪ Who or what is wrapped in paper? ▪ An event of perception, or an assault? 39

Dealing with Ambiguity ▪ How can we model ambiguity and choose the correct analysis in context? ▪ Non-probabilistic methods (FSMs for morphology, CKY parsers for syntax) return all possible analyses. ▪ Probabilistic models (HMMs for POS tagging, PCFGs for syntax) and algorithms (Viterbi, probabilistic CKY) return the best possible analysis, i.e., the most probable one according to the model. ▪ But the “best” analysis is only good if our probabilities are accurate. Where do they come from? 40
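
The idea of returning the most probable analysis can be sketched with Viterbi decoding over a tiny HMM tagger. The two-tag tag set, every probability, and the names `viterbi`, `TAGS`, `START_P`, `TRANS_P`, and `EMIT_P` are made up for illustration; none of them come from the lecture or from a real corpus.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for `words` under a toy HMM."""
    def logp(p):
        # Work in log space; a zero probability becomes -infinity.
        return math.log(p) if p > 0 else float("-inf")

    # best[i][t]: log-probability of the best tag path that ends in tag t at position i
    best = [{t: logp(start_p[t]) + logp(emit_p[t].get(words[0], 0.0)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Pick the previous tag that maximizes the path score into t.
            prev = max(tags, key=lambda s: best[i - 1][s] + logp(trans_p[s][t]))
            best[i][t] = (best[i - 1][prev] + logp(trans_p[prev][t])
                          + logp(emit_p[t].get(words[i], 0.0)))
            back[i][t] = prev

    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path))

# Hypothetical toy model (all numbers invented for this sketch).
TAGS = ["NOUN", "VERB"]
START_P = {"NOUN": 0.8, "VERB": 0.2}
TRANS_P = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
           "VERB": {"NOUN": 0.6, "VERB": 0.4}}
EMIT_P = {"NOUN": {"ducks": 0.5, "swim": 0.1},
          "VERB": {"ducks": 0.1, "swim": 0.6}}

print(viterbi(["ducks", "swim"], TAGS, START_P, TRANS_P, EMIT_P))  # ['NOUN', 'VERB']
```

With different numbers the same code would prefer a different tagging, which is the slide's point: the "best" analysis is only as good as the probabilities behind it.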

Corpora ▪ A corpus is a collection of text ▪ Often annotated in some way ▪ Sometimes just lots of text ▪ Examples ▪ Penn Treebank: 1M words of parsed WSJ ▪ Canadian Hansards: 10M+ words of aligned French / English sentences ▪ Yelp reviews ▪ The Web: billions of words of who knows what 41

Corpus-Based Methods ▪ Give us statistical information [Figure panels: All NPs, NPs under S, NPs under VP] 42

Corpus-Based Methods ▪ Let us check our answers [Figure: corpus divided into TRAINING, DEV, and TEST portions] 43
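
A minimal sketch of dividing a corpus into training, development, and test portions so that answers can be checked on held-out data. The 80/10/10 proportions, the fixed seed, and the function name `split_corpus` are assumptions for illustration; the slide does not specify them.

```python
import random

def split_corpus(sentences, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle a list of sentences and split it into (train, dev, test)."""
    items = list(sentences)
    random.Random(seed).shuffle(items)  # fixed seed so the split is reproducible
    n_test = int(len(items) * test_frac)
    n_dev = int(len(items) * dev_frac)
    test = items[:n_test]
    dev = items[n_test:n_test + n_dev]
    train = items[n_test + n_dev:]
    return train, dev, test

train, dev, test = split_corpus([f"sentence {i}" for i in range(100)])
print(len(train), len(dev), len(test))  # 80 10 10
```

The usual practice is to fit on the training portion, tune on the development portion, and touch the test portion only for the final evaluation.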

Statistical NLP Like most other parts of AI, NLP is dominated by statistical methods ▪ Typically more robust than earlier rule-based methods ▪ Relevant statistics/probabilities are learned from data ▪ Normally requires lots of data about any particular phenomenon 44

Why is NLP Hard? 1. Ambiguity 2. Scale 3. Sparsity 4. Variation 5. Expressivity 6. Unmodeled variables 7. Unknown representation 45

Sparsity Sparse data due to Zipf’s Law ▪ To illustrate, let’s look at the frequencies of different words in a large text corpus ▪ Assume “word” is a string of letters separated by spaces 46

Word Counts Most frequent words in the English Europarl corpus (out of 24m word tokens) 47

Word Counts But also, out of 93,638 distinct words (word types), 36,231 occur only once. Examples: ▪ cornflakes, mathematicians, fuzziness, jumbling ▪ pseudo-rapporteur, lobby-ridden, perfunctorily, ▪ Lycketoft, UNCITRAL, H-0695 ▪ policyfor, Commissioneris, 145.95, 27a 48

Plotting word frequencies Order words by frequency. What is the frequency of the nth-ranked word? 49
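
A small sketch of the counting behind these slides, using the simplifying definition of a word as a space-separated string. The sample sentence and the function name `rank_frequency` are placeholders for illustration; a real experiment would read a large corpus such as Europarl.

```python
from collections import Counter

def rank_frequency(text):
    """Return ([(rank, word, count), ...] by descending count, number of types seen once)."""
    counts = Counter(text.lower().split())
    ranked = [(rank, word, count)
              for rank, (word, count) in enumerate(counts.most_common(), start=1)]
    singletons = sum(1 for c in counts.values() if c == 1)
    return ranked, singletons

sample = "the cat saw the dog and the dog saw a bird"
ranked, singletons = rank_frequency(sample)
for rank, word, count in ranked:
    print(rank, word, count)
print("types occurring once:", singletons)
```

On a large corpus, Zipf's law predicts that rank times count stays roughly constant; a sample this small cannot show that, but the same function applies unchanged to bigger inputs.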

Zipf’s Law Implications ▪ Regardless of how large our corpus is, there will be a lot of infrequent (and zero-frequency!) words ▪ This means we need to find clever ways to estimate probabilities for things we have rarely or never seen 50
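
One of the simplest ways to give unseen words nonzero probability is add-one (Laplace) smoothing, sketched here for unigrams. The slide does not name a specific method, so this choice, the toy counts, and the assumed vocabulary size are illustrative only.

```python
from collections import Counter

def add_one_prob(word, counts, vocab_size):
    """Add-one (Laplace) estimate of P(word) from unigram counts and an assumed vocabulary size."""
    total = sum(counts.values())
    return (counts.get(word, 0) + 1) / (total + vocab_size)

counts = Counter("the cat saw the dog".split())
vocab_size = 10  # assumption: the 4 observed types plus 6 types never observed

print(add_one_prob("the", counts, vocab_size))   # (2 + 1) / (5 + 10)
print(add_one_prob("bird", counts, vocab_size))  # (0 + 1) / (5 + 10), nonzero although unseen
```

Add-one smoothing moves a lot of probability mass to unseen events when the vocabulary is large, so it is best read as a baseline rather than a recommended estimator.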

Variation ▪ Suppose we train a part of speech tagger or a parser on the Wall Street Journal ▪ What will happen if we try to use this tagger/parser for social media?? 52

Why is NLP Hard? 53

Expressivity Not only can one form have different meanings (ambiguity) but the same meaning can be expressed with different forms: ▪ She gave the book to Tom vs. She gave Tom the book ▪ Some kids popped by vs. A few children visited ▪ Is that window still open? vs. Please close the window 55

Unmodeled variables “Drink this milk” World knowledge ▪ I dropped the glass on the floor and it broke ▪ I dropped the hammer on the glass and it broke 56

Unknown Representation ▪ Very difficult to capture what R is, since we don’t even know how to represent the knowledge a human has/needs: ▪ What is the “meaning” of a word or sentence? ▪ How to model context? ▪ Other general knowledge? 57

Desiderata for NLP models ▪ Sensitivity to a wide range of phenomena and constraints in human language ▪ Generality across languages, modalities, genres, styles ▪ Strong formal guarantees (e.g., convergence, statistical efficiency, consistency) ▪ High accuracy when judged against expert annotations or test data ▪ Ethical 58

Symbolic and Probabilistic NLP 59

Probabilistic and Connectionist NLP 60

NLP ≟ Machine Learning ▪ To be successful, a machine learner needs bias/assumptions; for NLP, that might be linguistic theory/representations. ▪ Symbolic, probabilistic, and connectionist ML have all seen NLP as a source of inspiring applications. 61
