
Natural Language Processing: History & Limits Roman Kern - PowerPoint PPT Presentation



  1. Natural Language Processing: History & Limits. SCIENCE PASSION TECHNOLOGY. Roman Kern <rkern@tugraz.at>, Institute for Interactive Systems and Data Science, 2020-03-05

  2. Outline: 1 History; 2 Language Basics; 3 Limitations; 4 Applications, Tools, Tasks

  3. History: Where are we coming from (as a discipline)?

  4. History: Motivational Example. Recall the Turing Test (1950): a test designed to assess whether a machine achieves a human level of intelligence, via communication using a teleprinter, i.e., written text. Hence, NLP is often seen as a key technology for AI.

  5. History: Teleprinter Example. Figure: Teleprinter (teletypewriter, Teletype or TTY).

  6. History: Early Machine Translation. Georgetown-IBM experiment (1952-54): 60 sentences were translated from Russian to English by a rule-based system, using a highly constrained selection of sentences and a vocabulary of 250 words. It sparked interest and funding. The authors claimed that machine translation would be a solved problem within three to five years.

  7. History: Influential Early Work. Syntactic Structures by Noam Chomsky (1957): a book (lecture notes) proposing to analyse the structure of text and to transform it so that machines can process it. Phrase-structure grammar. “Colorless green ideas sleep furiously”: grammatically correct but semantically meaningless, and an example of a sentence that had never been formulated before.

  8. History: Influential Early Work. “Early claims that computers can translate languages were vastly exaggerated” (Anthony Oettinger, 1966). “Time flies like an arrow” as an example of an ambiguous sentence: does time move quickly (figurative reading)? Should one measure the speed of flies (imperative reading)? Or does the species “time flies” have a preference for arrows?

  9. History: ELIZA. Developed by Joseph Weizenbaum at MIT (1964-66).

  10. History: First AI Winter. Little progress: NLP (and other AI-related topics) received less funding due to a failure to deliver, e.g., a working machine translation system. → Relatively little (visible) progress was achieved from the late 1960s to the early 1980s.

  11. History: Knowledge Representation. History of knowledge representation: a field of AI, closely related to NLP. General Problem Solver (1959): a computer program that could solve “toy examples”, came with a dedicated programming language, and separated knowledge from the solving itself. Expert systems: introduced by Feigenbaum (1965); knowledge-based and reasoning systems.

  12. History: Knowledge Representation. Types of knowledge representation. Frames: inspired by psychological research (1930s); structure knowledge in hierarchical relationships; e.g., KL-ONE (1977), FrameNet (1997), Kicktionary: http://www.kicktionary.de. Semantic networks: inspired by the associative memory of humans, e.g., Aschaffenburg (early 20th century); Cyc (1984). Ontologies, e.g., RDF & OWL.

  13. History: Semantic Net. Example of an early semantic net (Collins and Quillian, about 1960s).

  14. History: Ontologies. In philosophy, ontology deals with the question of existence. Since the 1980s the term has been used in computer science. Main components: individuals (instances), classes (concepts), attributes and relations, where relations can often be freely defined. Upper ontologies vs. domain ontologies; only a few upper ontologies exist.

  15. History: Knowledge Graph. Ontologies are still popular today. The term was coined by a Google initiative (2012): a knowledge base represented as a graph. Well-known example: Freebase (2007), a graph database (triple store). Similar projects: YAGO, DBpedia, Wikidata. Relevant for NLP: WordNet, ConceptNet.
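
The idea of a knowledge base stored as subject-predicate-object triples can be sketched in a few lines. This is a minimal in-memory illustration, not how Freebase or any of the projects named on the slide is implemented. The `TripleStore` class, the `build_example_store` helper and the relation names (`is_a`, `can`) are hypothetical, and the example facts are chosen only for illustration.

```python
class TripleStore:
    """A toy in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples matching the pattern, sorted; None is a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple
            for triple in self._triples
            if all(p is None or p == v for p, v in zip(pattern, triple))
        )


def build_example_store():
    """Hypothetical example facts, used for illustration only."""
    store = TripleStore()
    store.add("canary", "is_a", "bird")
    store.add("bird", "is_a", "animal")
    store.add("bird", "can", "fly")
    return store


if __name__ == "__main__":
    store = build_example_store()
    # Everything the store knows about "bird" as a subject.
    for triple in store.match(subject="bird"):
        print(triple)
```

A query is a triple pattern with wildcards, which is roughly the shape of the basic graph patterns used by query languages for triple stores. Real systems add indexing, typed identifiers and inference on top.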

  16. History: Logic. History of reasoning: initial combination of rules and logic for inference and reasoning, e.g., first-order (predicate) logic. Notations, e.g., context-free grammar (BNF). Fuzzy logic: introduced by Lotfi A. Zadeh (1965), following the intuition that decisions do not have “hard borders”.
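
The "no hard borders" intuition behind fuzzy logic can be illustrated with a membership function that returns a degree between 0 and 1 instead of a yes/no answer. The sketch below assumes a triangular membership function and the min/max/complement operators commonly attributed to Zadeh. The function names and the temperature thresholds for "warm" are illustrative assumptions, not taken from the slides.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak.

    Assumes left < peak < right.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_and(a, b):
    """Conjunction as the minimum of two membership degrees."""
    return min(a, b)


def fuzzy_or(a, b):
    """Disjunction as the maximum of two membership degrees."""
    return max(a, b)


def fuzzy_not(a):
    """Negation as the complement of a membership degree."""
    return 1.0 - a


def warm(temperature_celsius):
    """Hypothetical 'warm' set: fully warm at 22 degrees, not warm below 15 or above 30."""
    return triangular(temperature_celsius, 15.0, 22.0, 30.0)


if __name__ == "__main__":
    for t in (10.0, 18.5, 22.0, 26.0):
        print(t, warm(t))
```

With a crisp rule, 18.5 degrees would be either warm or not warm. Here it is warm to a degree of 0.5, which is the kind of graded decision the slide refers to.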

  17. History: Corpus Linguistics. History of language corpora. Brown corpus (1961): 500 samples of English-language text. “Computational Analysis of Present-Day American English” by Henry Kučera and W. Nelson Francis (1967) → the frequency of words follows Zipf’s law. The Brown corpus was later also tagged: each word was annotated with its word class (part of speech).
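
Zipf’s law says, roughly, that a word's frequency is inversely proportional to its frequency rank, so that rank × frequency stays approximately constant. The following sketch builds the rank-frequency table needed to check this on any text. The function names, the simple regular-expression tokenizer and the toy sample sentence are assumptions made for illustration. A corpus of realistic size, such as the Brown corpus, would be needed to actually observe the effect.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Very simple tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(ordered, start=1)
    ]


def zipf_products(table):
    """Under an idealised Zipf's law, rank * count is roughly constant."""
    return [rank * count for rank, _, count in table]


if __name__ == "__main__":
    sample = "the cat saw the dog and the dog saw the cat near the tree"
    table = rank_frequency(sample)
    for rank, word, count in table[:5]:
        print(rank, word, count, rank * count)
```

On a large corpus, plotting rank against count on log-log axes should give an approximately straight line. The toy sample is far too small to show that and only demonstrates the bookkeeping.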

  18. History: Paradigm Shift in NLP. Until the mid-1980s, the majority of work in NLP was based on rules, mostly hand-crafted rules using domain knowledge (linguists). Shift toward statistical and stochastic models, e.g., machine learning in combination with corpus linguistics. “Every time I fire a linguist, the performance of the speech recognizer goes up” (Frederick Jelinek, 1985).

  19. History: Machine Translation as an Example of NLP History. History of machine translation: https://vas3k.com/blog/machine_translation/

  20. History: Recent History of Deep Learning Based NLP. 2001: neural language models; 2008: multi-task learning; 2013: word embeddings; 2013: neural networks for NLP; 2014: sequence-to-sequence models; 2015: attention; 2015: memory-based networks; 2018: pretrained language models. Taken from: https://ruder.io/a-review-of-the-recent-history-of-nlp/

  21. History: Overview of Terms Related to NLP. Speech recognition: automatic speech recognition (ASR), speech to text (STT). Natural language understanding (NLU): “machine reading”; builds upon NLP. Natural language generation (NLG): language production; often the input to a text-to-speech system. Computational linguistics: interdisciplinary field of linguistics and computer science.

  22. Language Basics: Main basic concepts and terminology

  23. Language Basics: Fun Facts about Human Languages. Some languages do not have words for left or right. More than 6,000 languages are spoken today. Languages differ in their word ordering; sometimes a change in order also changes the meaning. The human brain has specific regions for language processing. Language affects cognitive processes, e.g., speed. Some aspects of language are arbitrary ...

  24. Language Basics: Basic Elements of Linguistics. Building blocks of spoken language. Phonetics: the sounds that make up languages (phoneme → phones vs. grapheme → glyph). Phonology: the combination of sounds. Morphology: word formation (lexical). Syntax: word combinations for phrases and sentences. Semantics: the meaning of, e.g., sentences. Pragmatics: understanding of the context.
