Machine Learning for NLP: Introduction session. Aurélie Herbelot, 2020

  1. Machine Learning for NLP. Introduction session. Aurélie Herbelot, 2020. Centre for Mind/Brain Sciences, University of Trento.

  2. Material, contact. All material will be posted at: http://aurelieherbelot.net/teaching/ml-for-nlp/ Any question, worry, complaint... write to: aurelie.herbelot@unitn.it

  3. Course overview

  4. Goals
     1. Understand core machine learning algorithms used in NLP:
        1.1 for science;
        1.2 for applications.
     2. Be able to read and criticise related literature.
     3. Acquire some fundamental computational skills to run ML code and interpret its output.

  5. Session structure
     • An introductory week, followed by 9 topics, each associated with 3 classes:
       1. A lecture presenting the topic for that week.
       2. A reading group on one or two papers using the presented algorithm(s) / metric(s).
       3. A practical with a task and/or some code to play with. All code will be provided on GitHub. Some practicals will focus on linguistic questions, some on applications.

  6. Syllabus

  7. What for?

  8. NLP for science

  9. Computational tools for language sciences: why?

  10. Simulation of star formation. http://burro.astr.cwru.edu/models/sfrmm009.gif

  11. Right: Prusinkiewicz (2004), modelling plant growth with a grammar.

  12. Current work in the CALM group: simulating the tension between linguistic creativity and communication needs. What shape can lexical meaning take without breaking alignment between speakers?

  13. Modelling
     • A model is an approximation of reality.
     • Aim: observe the behaviour of the model and ensure it does not produce states incompatible with reality.
     • Computational models and their implementation (simulations) allow for fast counter-checking of hypotheses.

  14. Issues
     • Assumptions: a model rests on a number of simplifying assumptions. Q: What might be simplified in a model of language?
     • Evaluation: a model should be compatible with past / future observations (corpora, linguistic judgements, behavioural data, brain states...). Q: In which ways might linguistic data be biased / flawed?
     • Access to reality: it took 50 years to (partially) confirm the existence of the Higgs boson. Most of what makes language is invisible, including parts we take for granted. Q: Which ones?
     • Replicability: will others be able to reproduce your experiment? A: Why shouldn’t they?

  15. Example question: Can language be learned from scratch? Or does it need innate mechanisms?

  18. Example question: Can language be learned from scratch? Or does it need innate mechanisms? Let’s build a hypothesis.

  20. Example question: Can language be learned from scratch? Or does it need innate mechanisms? Let’s build a hypothesis. Then let’s choose a model to match the hypothesis.

  23. Example question: Can language be learned from scratch? Or does it need innate mechanisms? Let’s build a hypothesis. Then let’s choose a model to match the hypothesis. And data to train the model under various conditions.

  25. Example question: Can language be learned from scratch? Or does it need innate mechanisms? Let’s build a hypothesis. Then let’s choose a model to match the hypothesis. And data to train the model under various conditions. Finally, test the hypothesis.

  26. A real model
     Actual CHILDES data: “What kind of food did you buy?”
     RNN-generated data: “What kind of little girl?”
     [Figure: dependency trees for the two sentences.]
     Catenae (CHILDES): kind -[DET] → what; kind -[PREP] → of; of -[POBJ] → food; of -[POBJ] → NOUN; ...
     Catenae (RNN): kind -[DET] → what; kind -[PREP] → of; of -[POBJ] → girl; of -[POBJ] → NOUN; ...
     Given a) some training data and b) some generated output from an RNN, processed with the same formalism, we can investigate which structures the RNN can reproduce, and compare their distribution to the original data. (Ongoing work by Ludovica Pannitto.)

  27. A real model
     Actual CHILDES data: “Are you teasing me?” and “It’s a steering wheel.”
     RNN-generated data: “It’s a jelly graving me.”
     [Figure: dependency trees for the three sentences.]
     Catenae shown on the slide include: VERB -[AUX] → are; NOUN -[DET] → a; ’s -[NSUBJ] → it; graving -[DOBJ] → me; graving -[ADVMOD] → jelly; ...
     We can do this check even when the RNN produces partially nonsensical sentences.
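
The comparison described on the two slides above can be sketched in a few lines of Python. This is only an illustrative toy, not the method used in the ongoing work: it assumes hand-written dependency edges instead of a real parser, restricts itself to single-edge catenae (catenae in general can span larger connected pieces of the tree), and the part-of-speech tags, the `OPEN_CLASS` set and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical hand-written parses: (head, relation, dependent, dependent_pos).
# The edges follow the slide; the POS tags are assumptions made for this sketch.
CHILDES_EDGES = [
    ("kind", "DET", "what", "PRON"),
    ("kind", "PREP", "of", "ADP"),
    ("of", "POBJ", "food", "NOUN"),
]
RNN_EDGES = [
    ("kind", "DET", "what", "PRON"),
    ("kind", "PREP", "of", "ADP"),
    ("of", "POBJ", "girl", "NOUN"),
]

# Assumption: only open-class dependents are abstracted to their POS tag.
OPEN_CLASS = {"NOUN", "VERB", "ADJ", "ADV"}


def single_edge_catenae(edges):
    """Count lexical and POS-abstracted single-edge catenae."""
    counts = Counter()
    for head, rel, dep, dep_pos in edges:
        counts[(head, rel, dep)] += 1          # e.g. of -[POBJ] -> food
        if dep_pos in OPEN_CLASS:
            counts[(head, rel, dep_pos)] += 1  # e.g. of -[POBJ] -> NOUN
    return counts


def compare(train, generated):
    """Return the shared catenae and the share of generated types seen in training."""
    shared = set(train) & set(generated)
    coverage = len(shared) / len(generated) if generated else 0.0
    return shared, coverage


train = single_edge_catenae(CHILDES_EDGES)
generated = single_edge_catenae(RNN_EDGES)
shared, coverage = compare(train, generated)
print(sorted(shared))
print(f"coverage of generated catenae: {coverage:.2f}")
```

With real data, the two `Counter` objects would also allow the frequency distributions to be compared, not just the overlap of types.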

  28. NLP for applications

  29. Computational tools for technology: how?

  30. Software development
     • Requirement analysis: what should the software do? With what resources?
     • Design: modelling (as in science), choice of language / hardware, etc.
     • Implementation: programming and documenting.
     • Evaluation: check that requirements are satisfied, performance is acceptable, etc.

  31. Issues
     • Model: as in science, some assumptions must be made and constraints respected (e.g. hardware / internet access available to the end user).
     • Implementation: For whom are you writing? Will the code be open? How do you document? Do you need training to be reproducible?
     • Ethics: Should you be doing this? If your app is 99% accurate, does the 1% matter? Where did the data come from? Etc.

  32. An example task: Build a Web search engine.

  35. An example task: Build a Web search engine. The system’s architecture must be compatible with developer / user resources.

  37. An example task: Build a Web search engine. The system’s architecture must be compatible with developer / user resources. Representations must be adequate for the task.

  41. An example task: Build a Web search engine. The system’s architecture must be compatible with developer / user resources. Representations must be adequate for the task. The system must be evaluated.
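
To make the last two requirements concrete, here is a minimal sketch of one possible representation and one possible evaluation. It assumes bag-of-words vectors, cosine similarity for ranking and precision at k as the metric. None of this is taken from the slides or from any particular search engine; the documents, the query and the function names are made up for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a text as a term-frequency vector (a very simple representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs):
    """Return the ids of documents with non-zero similarity, best match first."""
    q = bag_of_words(query)
    scored = [(cosine(q, bag_of_words(text)), doc_id) for doc_id, text in docs.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score > 0]


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are in the set of relevant documents."""
    return sum(1 for d in ranked[:k] if d in relevant) / k if k else 0.0


# Hypothetical toy collection and relevance judgement.
DOCS = {
    "d1": "fruit flies and unsupervised learning",
    "d2": "support vector machines for text classification",
    "d3": "learning word meaning from text",
}

ranked = search("unsupervised learning", DOCS)
print(ranked)
print(precision_at_k(ranked, {"d1"}, 2))
```

Whether such a representation is adequate depends on the task: word order, synonymy and document length are all ignored here, which is exactly the kind of simplifying assumption the evaluation step should expose.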

  42. The PeARS search engine. More information at https://pearsearch.org/

  43. What to expect in the course

  44. • ML for NLP: what’s it good for?
      • Basic principles of statistical language learning.
      • Practical: how unique is each person’s language? Can we model inter-speaker differences?

  45. Q: In what sense can data be good or bad?
      • How to generate (good) data for training and evaluation.
      • Reading: do people agree on what the world is like?
      • Practical: produce data for your search engine. Pre-process Wikipedia, extract specific category pages, design page/query representations.

  46. Q: How do machines build models?
      • Introduction to supervised learning and regression techniques.
      • Reading: predicting words from brain images (‘mind-reading’).
      • Practical: to what extent do people’s conceptual spaces align across languages?

  47. Q: Why does complexity matter?
      • Clustering and dimensionality reduction.
      • Reading: locality-sensitive hashing: how do fruit flies implement unsupervised learning?
      • Practical: document clustering for the backend of your search engine. First search attempts.

  48. Q: What kind of decisions are binary?
      • Introduction to Support Vector Machines.
      • Reading: detection of semantic errors in the prose of non-English speakers with SVMs.
      • Practical: is there a correlation between a person’s writing and the onset of certain medical conditions?
