Deep learning for natural language processing



  1. Deep learning for natural language processing: Introduction to natural language processing
     Benoit Favre <benoit.favre@univ-mrs.fr>, Aix-Marseille Université (AMU), LIF/CNRS
     DL4NLP: NLP intro, 20 Feb 2017

  2. Deep learning for Natural Language Processing
     Day 1
     ▶ Class: intro to natural language processing
     ▶ Class: quick primer on deep learning
     ▶ Tutorial: neural networks with Keras
     Day 2
     ▶ Class: word embeddings
     ▶ Tutorial: word embeddings
     Day 3
     ▶ Class: convolutional neural networks, recurrent neural networks
     ▶ Tutorial: sentiment analysis
     Day 4
     ▶ Class: advanced neural network architectures
     ▶ Tutorial: language modeling
     Day 5
     ▶ Tutorial: image and text representations
     ▶ Test

  3. What is Natural Language Processing?
     What is Natural Language Processing (NLP)?
     ▶ Allow computers to communicate with humans using everyday language
     ▶ Teach computers to reproduce human behavior regarding language manipulation
     ▶ Linked to the study of human language through computers (Computational Linguistics)
     Why is it difficult?
     ▶ People do not follow rules strictly when they talk or write: “r u ready?”
     ▶ Language is ambiguous: “time flies like an arrow”
     ▶ Input can be noisy: speech recognition in the subway

  4. NLP is everywhere
     ▶ Writing recognition (cheque processing)
     ▶ Dialog systems (Siri/OK Google/Alexa...)
     ▶ Speech synthesis (in-car GPS)
     ▶ Voice dictation (Dragon, Nuance)
     ▶ Spam filtering (email)
     ▶ Spell checker / grammar correction (Word)
     ▶ Sentiment analysis (Amazon)
     ▶ Call routing (telcos)
     ▶ Automatic summarization (Google News)
     ▶ Question answering (Jeopardy)
     ▶ Information extraction (Ask.com)
     ▶ Machine translation (Google)
     ▶ Information retrieval / search (Google)

  5. Domains related to NLP
     ▶ Artificial intelligence
     ▶ Formal language theory
     ▶ Machine learning
     ▶ Linguistics
     ▶ Psycholinguistics
     ▶ Cognitive sciences
     ▶ Philosophy of language

  6. Communication channel
     From the point of view of the source (the speaker)
     1 Intent: the message we want to communicate
     2 Generation: the message in linguistic form
     3 Production: the muscular action which leads to sound production
     From the point of view of the receiver (the listener)
     1 Perception: how the sound is transmitted to neurons
     2 Analysis: interpretation of the linguistic message (syntactic, semantic...)
     3 Integration: believe the information or not, reply...

  7. Processing levels
     Example: “John loves Mary”
     1 Lexical: segment the character stream into words, identify linguistic units
       John/firstname-male loves/verb-love Mary/firstname-female
     2 Syntax: identify grammatical structures
       (S (NP (NNP John)) (VP (VBZ loves) (NP (NNP Mary))) (. .))
     3 Semantic: represent meaning
       love(person(John), person(Mary))
     4 Pragmatic: what is the function of that sentence in context?
       What does it entail? know(John, Mary). Since when? Is it reciprocal?
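The bracketed string on the syntax level is a tree written in parenthesized form. As a minimal sketch (not part of the slides), the following Python reads such a string into nested lists and recovers the words at the leaves. The function names `parse_tree` and `leaves` are illustrative choices, and the reader assumes well-formed input with balanced parentheses.

```python
def parse_tree(text):
    """Parse a bracketed tree such as "(NP (NNP John))" into nested lists.

    Each node becomes [label, child, child, ...]; words stay plain strings.
    Assumes balanced parentheses (no error recovery).
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "(":
            return token  # a label or a word
        node = []
        while tokens[pos] != ")":
            node.append(read())
        pos += 1  # skip the closing parenthesis
        return node

    return read()


def leaves(node):
    """Collect the words at the bottom of the tree, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:  # node[0] is the label
        words.extend(leaves(child))
    return words


tree = parse_tree("(S (NP (NNP John)) (VP (VBZ loves) (NP (NNP Mary))) (. .))")
print(tree[0])       # label of the root node
print(leaves(tree))  # the words of the sentence
```

Nested lists are only one possible representation; a toolkit such as NLTK provides a dedicated tree class for the same notation.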

  8. Modular approach
     [Diagram: question → automatic transcription → words → syntactic analysis → syntactic tree → semantic analysis → concepts, relations → dialog manager → logical representation → syntactic generation → primitive syntax → lexical generation → words, prosody → speech synthesis → answer]

  9. Language ambiguity
     Phonetic
     ▶ I don’t know! – I don’t - no!
     Graphical (wikipedia)
     Phonetic and graphical
     ▶ I live by the bank (river bank or financial institution)
     Etymology
     ▶ I met an Indian (from India or native American)
     ▶ I love American wine (from USA or from the Americas)
     Syntactic
     ▶ He looks at the man with a telescope
     ▶ He gave her cat food
     Referential
     ▶ She is gone. Who?
     Notational conventions
     ▶ Birth date: 08/01/05
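The birth date example can be made concrete. The sketch below (an illustration, not from the slides) parses the same string under three notational conventions with the standard `datetime.strptime`. The convention names and the helper `date_readings` are assumptions for this example, and two-digit years follow Python's default century rule.

```python
from datetime import datetime

# Three common conventions for a date written with two-digit fields.
CONVENTIONS = {
    "day/month/year": "%d/%m/%y",
    "month/day/year": "%m/%d/%y",
    "year/month/day": "%y/%m/%d",
}


def date_readings(text):
    """Return every valid interpretation of `text`, keyed by convention."""
    readings = {}
    for name, fmt in CONVENTIONS.items():
        try:
            readings[name] = datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            pass  # the string is not a valid date under this convention
    return readings


for convention, date in date_readings("08/01/05").items():
    print(convention, "->", date)
```

All three conventions accept "08/01/05" and each yields a different date, so the string alone does not determine the birth date.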

  10. Basic NLP tasks
     Syntax
     ▶ Word / sentence segmentation
     ▶ Morphological analysis
     ▶ Part-of-speech tagging
     ▶ Syntactic chunking
     ▶ Syntactic parsing
     Semantic
     ▶ Word sense disambiguation
     ▶ Semantic role labeling
     ▶ Logical form creation
     Pragmatic
     ▶ Coreference resolution
     ▶ Discourse parsing

  11. Word segmentation
     Character sequence → word sequence (tokenization)
     ▶ Split according to delimiters [ :,.!?’]
     ▶ What about compounds? Multiword expressions?
     ▶ URLs (http://www.google.com), variable names (theMaximumInTheTable)
     In Chinese, no spaces between words:
     ▶ 男孩喜歡冰淇淋。 → 男孩 (the boy) 喜歡 (likes) 冰淇淋 (ice cream) 。
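A delimiter-based tokenizer can be sketched in a few lines of Python with the standard `re` module. The pattern below is an assumption made for illustration: it keeps `http`/`https` URLs whole, groups word characters, and emits each punctuation mark as its own token. The helper `split_identifier` is likewise a hypothetical addition for camelCase variable names.

```python
import re

# URL first, so that "http://..." is not broken at ":" and "/".
TOKEN = re.compile(r"https?://\S+|\w+|[^\w\s]")

# One capitalized or lowercase run, or a run of capitals (acronym).
CAMEL = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])")


def tokenize(text):
    """Split text into words, URLs and single punctuation marks."""
    return TOKEN.findall(text)


def split_identifier(name):
    """Split a camelCase variable name into its parts."""
    return CAMEL.findall(name)


print(tokenize("r u ready?"))
print(tokenize("See http://www.google.com now!"))
print(split_identifier("theMaximumInTheTable"))
print(tokenize("男孩喜歡冰淇淋。"))
```

On the Chinese sentence this tokenizer only separates the final punctuation mark and returns the rest as a single token, which is why languages written without spaces need a dedicated segmentation model. The pattern also splits contractions at the apostrophe and would absorb punctuation glued to the end of a URL.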

  12. Morphological analysis
     Split words into relevant factors
     Gender and number
     ▶ flower, flower+s, floppy, flopp+ies
     Verb tense
     ▶ parse, pars+ing, pars+ed
     Prefixes, roots and suffixes
     ▶ geo+caching
     ▶ re+do, un+do, over+do
     ▶ pre+fix, suf+fix
     ▶ geo+local+ization
     Rich morphology
     ▶ pronouns are glued to the verb (Arabic, Spanish...)
     Agglutinative languages
     ▶ Turkish, Finnish
     → Lemmatization task: find the canonical word form
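To illustrate the lemmatization task, here is a naive rule-based sketch in Python. It is not the method taught in the course: the suffix rules and the tiny lexicon are hypothetical, chosen only to cover the English examples above, and a candidate lemma is accepted only if it appears in the lexicon.

```python
# Hypothetical (suffix, replacement) rules, tried in order.
SUFFIX_RULES = [
    ("ies", "y"),  # floppies -> floppy
    ("ing", "e"),  # parsing -> parse
    ("ing", ""),
    ("ed", "e"),   # parsed -> parse
    ("ed", ""),
    ("s", ""),     # flowers -> flower
]

# Hypothetical lexicon of known canonical forms.
LEXICON = {"flower", "floppy", "parse", "do"}


def lemmatize(word, lexicon=LEXICON):
    """Return the canonical form of `word`, or `word` itself if no rule applies."""
    if word in lexicon:
        return word
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            if candidate in lexicon:
                return candidate
    return word  # unknown word: leave it unchanged


for form in ["flowers", "floppies", "parsing", "parsed", "running"]:
    print(form, "->", lemmatize(form))
```

Checking candidates against a lexicon keeps the rules from producing non-words, at the cost of leaving out-of-vocabulary forms such as "running" untouched.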

  13. Part-of-speech tagging
     Syntactic categories: Noun, Verb, Adjective, Adverb, Determiner, Preposition, Pronouns, Conjunctions, Proper name, Punctuation, Foreign words, Discourse marker
     Each word can have multiple categories
     Example: time flies like an arrow
     ▶ flies: verb or noun?
     ▶ like: preposition or verb?
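The ambiguity in the example can be enumerated directly. In this sketch the lexicon is hypothetical and lists only the alternatives named on the slide ("flies" and "like" each have two categories, the other words one); `tag_sequences` then generates every combination with `itertools.product`.

```python
from itertools import product

# Hypothetical lexicon: the candidate categories of each word.
LEXICON = {
    "time": ["noun"],
    "flies": ["verb", "noun"],
    "like": ["preposition", "verb"],
    "an": ["determiner"],
    "arrow": ["noun"],
}


def tag_sequences(sentence, lexicon=LEXICON):
    """List every way of assigning one candidate category to each word."""
    words = sentence.split()
    options = [lexicon[word] for word in words]
    return [list(zip(words, tags)) for tags in product(*options)]


sequences = tag_sequences("time flies like an arrow")
print(len(sequences), "candidate taggings")
for sequence in sequences:
    print(" ".join(f"{word}/{tag}" for word, tag in sequence))
```

A tagger has to pick one of these sequences from context. The number of candidates is the product of the per-word ambiguities, so it grows quickly with sentence length.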

  14. Syntactic analysis
     ▶ Constituency parsing (source: http://www.nltk.org/book/tree_images/ch08-tree-1.png)
     ▶ Dependency parsing (source: http://www.nltk.org/images/depgraph0.png)

  15. Word sense disambiguation (WSD)
     What is the sense of each word in its context?
     ▶ red: color? wine? communist?
     ▶ fly: what birds do? insect?
     ▶ bank: river? financial institution?
     ▶ book: made of paper? make a reservation?
     Word meaning highly depends on domain
     ▶ apple: fruit? company?
     ▶ to pitch: a ball? a product? a note?
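One simple way to choose a sense from context is to count the words shared between the context and a short definition of each sense, in the spirit of the simplified Lesk algorithm. The slides do not name a method, so this is a swapped-in illustration; the glosses below are hypothetical and written for this example only.

```python
# Hypothetical one-line definitions (glosses) for each sense.
GLOSSES = {
    "bank": {
        "river": "sloping land beside a river or lake",
        "financial institution": "institution that keeps money and grants loans",
    },
}


def disambiguate(word, context, glosses=GLOSSES):
    """Pick the sense whose gloss shares the most words with the context."""
    context_words = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense, gloss in glosses[word].items():
        overlap = len(context_words & set(gloss.split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "I deposited money at the bank"))
print(disambiguate("bank", "we fished from the bank of the river"))
```

Word overlap is brittle: it depends on the exact wording of the glosses, and ties fall back to the first sense listed.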

  16. Semantic parsing
     Semantic roles
     ▶ Who performed the action? the agent
     ▶ Who receives the action? the patient
     ▶ Who or what helps perform the action? the instrument
     ▶ When, where, why?
     Example: “John sold his car this morning to his brother”, annotated with the roles predicate, agent, patient, instrument and time
     Syntax is ambiguous
     ▶ The man opens the door
     ▶ The key opens the door
     ▶ The door opens

  17. Reference resolution
     Link all references to the same entity
     ▶ “Alexander Graham Bell (March 3, 1847 – August 2, 1922) was a Scottish-born scientist, inventor, engineer, and innovator who is credited with patenting the first practical telephone.” (Wikipedia)
     Ambiguity
     ▶ Pronouns (it, she, he, we, you, who, whose, both...)
     ▶ Noun phrases (the young man, the former president, the company...)
     ▶ Proper names (“Victoria”: South-African city, Canadian region, Queen, model...)

  18. Discourse analysis
     Relationship between sentences of a text, argument structure.
     Relation type (Rhetorical Structure Theory): Cause, Reformulation, Justification, Interpretation, Circumstances, Objective, Contrast, Preparation, Elaboration, Background
     Source: Kuyten et al, 2012, “Fully Automated Generation of Question-Answer Pairs for Scripted Virtual Instruction”

  19. Create a logical form
     Predicate representation
     ▶ Can be used to infer new ...
     John loves Mary but it is not reciprocal.
     ∃x, y, name(x, “John”) ∧ name(y, “Mary”) ∧ loves(x, y) ∧ not(loves(y, x))
     John sold his car this morning to his brother.
     ∃x, y, z, name(x, “John”) ∧ brother(x, y) ∧ car(z) ∧ owns(x, z) ∧ sell(x, y, z) ∧ time(“morning”)

  20. History of natural language processing
     1950: Theory (Turing test, Chomsky grammars)
     ▶ Automatic translation during the cold war
     1960: Toy systems
     ▶ SHRDLU “place the red box next to the blue circle”, ELIZA “the therapist”
     1970:
     ▶ Prolog (logic-based language for NLP), dictionaries of semantic frames
     1980: Dictation, development of grammars
     1990
     ▶ Transition “introspection” → “corpus”
     ▶ Evaluation campaigns
     ▶ Neural networks are “forgotten”
     2000
     ▶ Machine learning
     ▶ Applications: speech recognition, machine translation
     2010...
     ▶ Deep learning

  21. Notion of corpus
     Language in the wild
     ▶ Email
     ▶ Forums
     ▶ Chats
     ▶ Speech recordings
     ▶ Video
     Manual annotation of all elements we want to predict
     ▶ Text → topic
     ▶ Sentence → parse tree
     ▶ Review → sentiment

  22. Methodology
     Corpus-based natural language processing
     1 Define a task
     2 Write an annotation guide
     3 Collect raw data
     4 Ask people to annotate that data
     5 Create a system to perform the task
     6 Evaluate the output of the system
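The last step of this methodology compares system output with the human annotations. The slides do not name a metric; accuracy, the fraction of items where the two agree, is one common choice and is what this sketch assumes. The labels and the `accuracy` helper below are hypothetical.

```python
def accuracy(predicted, reference):
    """Fraction of items where the system output matches the annotation."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot evaluate on an empty corpus")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Hypothetical sentiment labels for four reviews.
annotated = ["pos", "neg", "neg", "neg"]      # human annotation
system_output = ["pos", "neg", "pos", "neg"]  # labels produced by the system

print(accuracy(system_output, annotated))
```

For the figure to mean anything, the items used for evaluation should be kept separate from the data used to build the system.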
