Natural Language Processing STOR 390 4/18/17

Jul 03, 2023


Natural Language Processing STOR 390 4/18/17
Kurt Vonnegut on the Shapes of Stories https://www.youtube.com/watch?v=oP3c1h8v2ZQ
We know how to work with tidy data. Regression: linear model, polynomial terms. Classification: K-nearest-neighbors, SVM. Clustering: K-means.
Unstructured data: not all data is tidy. Examples: networks, text, images.
Network data
Image data http://www.dailytarheel.com/article/2017/04/a-title-to-remember-north-carolina-wins-its-sixth-ncaa-championship http://dogtime.com/puppies/255-puppies
Text data https://emeraldcitybookreview.com/2014/06/beautiful-books-picturing-jane-austen_20.html
Unstructured ≠ no structure
Two strategies: (1) invent new tools, e.g. PageRank; (2) turn it into tidy data.
Images are numbers https://medium.com/@ageitgey/machine-learning-is-fun-part-3-deep-learning-and-convolutional-neural-networks-f40359318721
https://research.googleblog.com/2016/09/show-and-tell-image-captioning-open.html
Text data. One document = string of words. Corpus = collection of documents.
“A token is a meaningful unit of text, most often a word, that we are interested in using for further analysis, and tokenization is the process of splitting text into tokens.” (Text Mining with R)
Tokenization turns text into tidy format. Possible tokens: word, sentence, paragraph, chapter.
Jane Austen’s books tokenized by word
Make text lower case to make words more comparable: Door -> door
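The tokenization and lower-casing steps above can be sketched in a few lines. This is a minimal illustration in plain Python, not the tooling used in the lecture; the function name `tokenize`, the sample text, and the regular expression used to pick out words are all assumptions made for the example.

```python
import re

def tokenize(doc_id, text):
    """Return one row per word: (document id, line number, lower-cased word)."""
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Lower-case first so that "Door" and "door" become the same token.
        for word in re.findall(r"[a-z']+", line.lower()):
            rows.append((doc_id, line_no, word))
    return rows

# Hypothetical two-line document.
sample = "The Door was open.\nShe closed the door."
rows = tokenize("example", sample)
for row in rows:
    print(row)
```

Each row holds exactly one token, which is the "tidy" one-token-per-row layout. Keeping the line number in every row makes it possible to group tokens by position later.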
Tokenization loses information: for example, it ignores word order.
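A small sketch of the information loss: once a sentence is reduced to word counts, two sentences with opposite meanings can look identical. The helper name `bag_of_words` and the example sentences are assumptions made for illustration.

```python
from collections import Counter

def bag_of_words(text):
    """Count lower-cased, whitespace-separated words, discarding their order."""
    return Counter(text.lower().split())

a = bag_of_words("dog bites man")
b = bag_of_words("man bites dog")

# The two sentences mean different things but have the same word counts.
same_counts = (a == b)
print(same_counts)
```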
Most frequently appearing words
Remove stop words: commonly occurring words such as "the", "to", "and". One approach: hand code a list of words.
Most frequently occurring words (no stop words)
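Stop word removal followed by a frequency count might look like the sketch below. The stop word list is a tiny hand-coded set invented for this example (real lists are much longer), and `top_words` is a hypothetical helper name.

```python
from collections import Counter

# Hand-coded, illustrative stop word list; not a standard lexicon.
STOP_WORDS = {"the", "to", "and", "a", "of", "was"}

def top_words(words, n=3, stop_words=STOP_WORDS):
    """Return the n most frequent words after dropping stop words."""
    kept = [w for w in words if w not in stop_words]
    return Counter(kept).most_common(n)

words = "the door to the garden and the door to the house".split()

print(top_words(words, n=1, stop_words=set()))  # no filtering
print(top_words(words, n=1))                    # stop words removed
```

Without filtering, the most frequent word is a stop word that says little about the text; with filtering, a content word comes out on top.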
Sentiment analysis attempts to quantify emotional content. Assign each word an emotional value: positive/negative; trust, fear, sadness, anger, surprise, disgust, joy, anticipation; or a score -5, -4, …, 4, 5.
There are precompiled lexicons: hand coded; crowdsourced (Amazon Mechanical Turk); online reviews (Yelp).
Assign each word a sentiment
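Assigning a sentiment to each word amounts to a lookup of each token in a lexicon. The sketch below uses a toy lexicon invented for this example; it is not one of the precompiled lexicons mentioned above, and `tag_sentiment` is a hypothetical name.

```python
# Toy lexicon for illustration only; real lexicons hold thousands of entries.
LEXICON = {
    "happy": "positive",
    "love": "positive",
    "sad": "negative",
    "angry": "negative",
}

def tag_sentiment(words, lexicon=LEXICON):
    """Keep only words found in the lexicon and pair each with its sentiment."""
    return [(w, lexicon[w]) for w in words if w in lexicon]

words = ["she", "was", "happy", "but", "he", "was", "angry"]
tagged = tag_sentiment(words)
print(tagged)
```

Words missing from the lexicon are dropped, so the result depends heavily on which lexicon is chosen.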
Sentiment analysis is noisy. Lexicons may not generalize. Unigrams miss context.
Sentiment analysis is noisy: "Statistics is so much fun" vs. "Statistics is so much fun" (the words are identical; only the tone differs).
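One way to see why unigrams miss context: a word-by-word score cannot tell a statement from its negation. The word sets and the function `unigram_score` below are assumptions made for this sketch.

```python
# Illustrative word sets; not taken from any published lexicon.
POSITIVE = {"good", "fun"}
NEGATIVE = {"bad", "boring"}

def unigram_score(text):
    """Score = (# positive words) - (# negative words), one word at a time."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

plain = unigram_score("this movie is good")
negated = unigram_score("this movie is not good")

# "not" is in neither set, so the negated sentence gets the same score.
print(plain, negated)
```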
Jane Austen novels are fairly balanced
Different ways to quantify "time": chapter, paragraph, line, sentence. We choose one unit of time = 80 lines.
index = line number %/% 80 (integer division); sentiment = (# positive words) - (# negative words)
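The two formulas above can be sketched as follows. The word sets and the input rows are made up for illustration, and `sentiment_by_chunk` is a hypothetical name. Python's `//` plays the role of R's `%/%` here.

```python
from collections import defaultdict

# Illustrative word sets; not a real sentiment lexicon.
POSITIVE = {"happy", "love", "joy"}
NEGATIVE = {"sad", "angry"}

def sentiment_by_chunk(rows, lines_per_chunk=80):
    """rows: iterable of (line_number, word). Returns {index: sentiment}."""
    scores = defaultdict(int)
    for line_no, word in rows:
        index = line_no // lines_per_chunk  # integer division, like R's %/%
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        scores[index] += (word in POSITIVE) - (word in NEGATIVE)
    return dict(sorted(scores.items()))

# Hypothetical tokenized text: (line number, word).
rows = [(1, "happy"), (5, "sad"), (79, "love"),
        (80, "angry"), (81, "the"), (160, "joy")]
scores = sentiment_by_chunk(rows)
print(scores)
```

Lines 0 to 79 fall into index 0, lines 80 to 159 into index 1, and so on, so each index is one unit of "time".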
Smooth the time series with a low-pass filter http://www.matthewjockers.net/2015/02/02/syuzhet/
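As a rough illustration of smoothing, the sketch below uses a centered moving average, which is one simple kind of low-pass filter. It is a stand-in chosen for brevity and is not necessarily the filter used in the linked Syuzhet post; the function name, window size, and input values are assumptions.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

# Hypothetical per-chunk sentiment scores.
raw = [1, -1, 3, 0, -4, -2]
smoothed = moving_average(raw, window=3)
print(smoothed)
```

A larger window removes more of the chunk-to-chunk noise and leaves a broader arc, at the cost of flattening short-lived swings.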
References: Text Mining with R, http://tidytextmining.com/; Revealing Sentiment and Plot Arcs with the Syuzhet Package, http://www.matthewjockers.net/2015/02/02/syuzhet/
