SI425 : NLP Missing Topics and the Future

Who cares about NLP?
• NLP has expanded quickly
• Most top-tier universities now have NLP faculty (Stanford, Cornell, Berkeley, MIT, UPenn, CMU, Hopkins, etc.)
• Commercial NLP hiring: Google, Amazon, Microsoft, IBM, LinkedIn
• Web startups in Silicon Valley are eating up NLP students
• Navy, DoD, NSA, NIH: all funding NLP research

What NLP topics did we miss?
• Speech Recognition

What NLP topics did we miss?
• Machine Translation
  • IBM Models (1 through 5)
  • Neural Network Translation

Machine Translation

Learning Translations
• Huge corpus of “aligned sentences”
• Europarl
  • Corpus of European Parliament proceedings
  • The EU is mandated to translate proceedings into all of its official languages
  • The corpus covers 21 languages, (semi-)aligned to each other
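How word translations can be learned from aligned sentences is easiest to see on a toy example. The sketch below is not from the slides: it is a minimal version of IBM Model 1 (the first of the IBM Models named earlier), trained with EM and without the NULL word, on three invented English-German sentence pairs. The function names `train_ibm_model1` and `best_translation` and the toy corpus are assumptions made for illustration.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate word-translation probabilities t(f | e) with EM.

    pairs: list of (source_tokens, target_tokens) aligned sentence pairs.
    Simplified sketch: no NULL word, uniform initialisation.
    """
    src_vocab = {e for src, _ in pairs for e in src}
    tgt_vocab = {f for _, tgt in pairs for f in tgt}

    # Start with every target word equally likely for every source word.
    t = {(f, e): 1.0 / len(tgt_vocab) for f in tgt_vocab for e in src_vocab}

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of e translating to f
        total = defaultdict(float)  # expected count of e translating to anything

        # E-step: split each target word's credit among the source words
        # of the same sentence, in proportion to the current t(f | e).
        for src, tgt in pairs:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac

        # M-step: renormalise the expected counts into probabilities.
        for (f, e) in t:
            t[(f, e)] = count[(f, e)] / total[e]

    return t


def best_translation(t, e):
    """Return the target word f with the highest t(f | e)."""
    candidates = [(prob, f) for (f, src), prob in t.items() if src == e]
    return max(candidates)[1]


# Invented toy corpus of aligned sentences (English -> German).
toy_pairs = [
    ("the house".split(), "das Haus".split()),
    ("the book".split(), "das Buch".split()),
    ("a book".split(), "ein Buch".split()),
]

table = train_ibm_model1(toy_pairs)
for word in ["the", "house", "book", "a"]:
    print(word, "->", best_translation(table, word))
```

Even though no pair says which word goes with which, the repeated co-occurrence of "the" with "das" and "book" with "Buch" is enough for EM to separate them. A real system would train on a corpus such as Europarl and add the refinements of the later IBM Models.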

Machine Translation Technology
• Hand-held devices for the military
  • Speak English -> recognition -> translation -> generate Urdu
• Translate web documents
• Education technology?
  • Doesn’t yet receive much focus

Information Extraction

What NLP topics did we miss?
• Dialogue Systems
  “Do you think Anakin likes me?”
  “I don’t care.”

Dialogue Systems
• Why? Heavy interest in human-robot communication.
• UAVs require teams of 5+ people for each operating machine
• Goal: reduce the number of people
• Give the computer high-level dialogue commands, rather than low-level system commands

Dialogue Systems
• Dialogue is a fascinating topic. Not only do we need to understand language, but now also discourse cues:
  • Questions require replies
  • Imperatives/Commands
  • Acknowledgments: “ok”
  • Back-channels: “uh huh”, “mm hmm”

Dialogue Systems
• BERT-like models
• Input:
  • [CLS] how are you ? [SEP] great thanks [END]
  • [CLS] hello [SEP] hi what’s up [END]
  • …
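The input format above can be produced with a few lines of string handling. This is a minimal sketch, assuming plain whitespace-separated text and the `[CLS]`, `[SEP]`, and `[END]` markers exactly as written on the slide. The helper names `format_dialogue_pair` and `split_dialogue_pair` are hypothetical, and a real BERT-style tokenizer would insert its own special tokens instead.

```python
CLS, SEP, END = "[CLS]", "[SEP]", "[END]"


def format_dialogue_pair(prompt, reply):
    """Join an utterance and its reply into one training string."""
    return f"{CLS} {prompt} {SEP} {reply} {END}"


def split_dialogue_pair(example):
    """Recover (prompt, reply) from a string built by format_dialogue_pair."""
    body = example[len(CLS):-len(END)]   # drop the outer markers
    prompt, reply = body.split(SEP, 1)   # split at the first separator
    return prompt.strip(), reply.strip()


# The two utterance/reply pairs shown on the slide.
dialogue_pairs = [
    ("how are you ?", "great thanks"),
    ("hello", "hi what's up"),
]

examples = [format_dialogue_pair(p, r) for p, r in dialogue_pairs]
for line in examples:
    print(line)
```

Packing the utterance and its reply into a single sequence lets the model attend across both turns, which is how the separator token is typically used for sentence-pair inputs.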

El Fin
• Secret 1: I intentionally made some of our labs ambiguous
  • Under-defined tasks with unclear expected results
• Secret 2: I tried to teach you skills that have nothing to do with NLP
  • Experimentation
  • Error Analysis
• Secret 3: I appreciate the hard work you put into the class

What NLP topics did we miss?
Unsupervised Learning
• Most of this semester used data that had human labels.
• Bootstrapping was our main counterexample: it is mostly unsupervised.
• Many algorithms are being researched that learn language and knowledge from text alone, without human labels.
