
Sequence Labeling: Prof. Sameer Singh, CS 295: STATISTICAL NLP


  1. Sequence Labeling
     Prof. Sameer Singh
     CS 295: STATISTICAL NLP, WINTER 2017
     January 31, 2017
     Based on slides from Nathan Schneider, Noah Smith, Yejin Choi, and everyone else they copied from.

  2. Outline
     • Sequence Labeling and POS Tagging
     • Generative Modeling: HMMs
     • Inference in HMMs: Viterbi and Forward-Backward (F/B)
     • Unsupervised Tagging using EM

  3. Outline
     • Sequence Labeling and POS Tagging
     • Generative Modeling: HMMs
     • Inference in HMMs: Viterbi and Forward-Backward (F/B)
     • Unsupervised Tagging using EM

  4. Classification: Sentiment Analysis, Identify Topic, Language Model

  5. Sequence Labeling

  6. Parts of Speech
     This  is  a    simple  sentence  .
     DET   VB  DET  ADJ     NOUN      .
     Applications:
     • Text to speech: record, lead, …
     • Machine translation: run, walk, …
     • Noun phrases: `grep {JJ | NN}* {NN | NNS}`
     • and many others…
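The noun-phrase pattern on slide 6 can be sketched in a few lines of Python. This is a minimal illustration, assuming Penn Treebank tags (JJ, NN, NNS) and already-tagged input; the function name `noun_phrases` and the example sentence are made up for this sketch and do not come from the slides.

```python
import re

def noun_phrases(tagged):
    """Return word spans whose tags match (JJ | NN)* (NN | NNS)."""
    # Encode each tag as one character so a regex can scan the tag sequence.
    code = {"JJ": "J", "NN": "N", "NNS": "S"}
    tag_string = "".join(code.get(tag, "x") for _, tag in tagged)
    phrases = []
    for match in re.finditer(r"[JN]*[NS]", tag_string):
        words = [word for word, _ in tagged[match.start():match.end()]]
        phrases.append(" ".join(words))
    return phrases

# Hypothetical tagged sentence, used only to exercise the pattern.
example = [("the", "DT"), ("simple", "JJ"), ("sentence", "NN"),
           ("has", "VBZ"), ("short", "JJ"), ("noun", "NN"), ("phrases", "NNS")]
print(noun_phrases(example))
```

Mapping tags to single characters keeps the regex positions aligned with token positions, so each match can be sliced straight out of the token list.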

  7. Parts of Speech: Tags
     “Open classes”
     • Nouns, verbs, adjectives, adverbs, numbers
     “Closed classes”
     • Modal verbs
     • Prepositions (on, to)
     • Particles (off, up)
     • Determiners (the, some)
     • Pronouns (she, they)
     • Conjunctions (and, or)

  8. Named Entity Recognition
     Barack  Obama  spoke  from  the  White  House  today  .
     PER     PER    O      O     O    LOC    LOC    O      O

  9. Field Segmentation: Ads
     3BR   flat  in  Bruntsfield  ,  near  main  roads  .  Bright  ,  well  maintained  ...
     SIZE  TYPE  O   LOC          O  LOC   LOC   LOC    O  FEAT    O  FEAT  FEAT        ...

  10. Field Segmentation: Citations
      Authors Title Publication Venue

  11. Outline
      • Sequence Labeling and POS Tagging
      • Generative Modeling: HMMs
      • Inference in HMMs: Viterbi and Forward-Backward (F/B)
      • Unsupervised Tagging using EM

  12. Naïve Bayes Classifier

  13. “Transitions” matter
      “Impossible” Transitions
      • Two determiners never follow each other
      • Two base form verbs never follow each other
      • Determiner is followed by adjective or noun
      Based on semantics
      • Fruit flies like a bird.
      • Fruit flies like bananas.
      How do we select a “consistent” set of POS tags?

  14. “Transitions” matter

  15. “Transitions” matter
      Transition on Words versus Tags
      • Too many words, learn the same thing again
      • Support for unseen words: “I like tenguizino!”

  16. Hidden Markov Models
      (figure: chain of tag states from start state S to end state E)

  17. Example Sentence
          This  is  a    simple  sentence
      S   DET   VB  DET  ADJ     NOUN      E
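The example on slide 17 can be written out as a joint probability. The factorization below assumes the standard first-order HMM with a start state S and an end state E (the states shown in the diagram); the notation $x_i$ for words and $y_i$ for tags is chosen here and is not taken from the slides.

```latex
P(x_{1:n}, y_{1:n}) = \Big[ \prod_{i=1}^{n} P(y_i \mid y_{i-1}) \, P(x_i \mid y_i) \Big] \, P(E \mid y_n),
\qquad y_0 = S

% For "This is a simple sentence" tagged DET VB DET ADJ NOUN:
P(\text{DET} \mid S) \, P(\text{This} \mid \text{DET}) \cdot
P(\text{VB} \mid \text{DET}) \, P(\text{is} \mid \text{VB}) \cdot
P(\text{DET} \mid \text{VB}) \, P(\text{a} \mid \text{DET}) \cdot
P(\text{ADJ} \mid \text{DET}) \, P(\text{simple} \mid \text{ADJ}) \cdot
P(\text{NOUN} \mid \text{ADJ}) \, P(\text{sentence} \mid \text{NOUN}) \cdot
P(E \mid \text{NOUN})
```

Each word contributes one transition term and one emission term, and the final factor accounts for stopping after the last tag.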

  18. Estimating Emissions
      (figure: HMM chain from S to E)
      Smoothing
      • Unknown/rare words get inaccurate probabilities
      • Reminder: Laplace Smoothing (Add-k)
      • Next lecture: we will look at “features”
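Add-k smoothing for emissions might look like the following sketch. It assumes emissions are estimated from counts of (word, tag) pairs in tagged sentences and that unseen words are mapped to a single `<UNK>` token; the function name `estimate_emissions`, the `<UNK>` convention, and the one-sentence corpus are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def estimate_emissions(tagged_sentences, k=1.0):
    """Return p(word, tag) = P(word | tag) with add-k smoothed counts."""
    counts = defaultdict(Counter)   # counts[tag][word]
    vocab = {"<UNK>"}               # reserve one slot for unseen words
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[tag][word] += 1
            vocab.add(word)

    def prob(word, tag):
        if word not in vocab:
            word = "<UNK>"
        total = sum(counts[tag].values())
        # Add k to every count; the denominator grows by k per vocabulary entry.
        return (counts[tag][word] + k) / (total + k * len(vocab))

    return prob

# Toy corpus: the example sentence from slide 17.
data = [[("This", "DET"), ("is", "VB"), ("a", "DET"),
         ("simple", "ADJ"), ("sentence", "NOUN")]]
p = estimate_emissions(data, k=1.0)
print(p("a", "DET"), p("tenguizino", "NOUN"))
```

With smoothing, an unseen word such as “tenguizino” still gets a small non-zero emission probability under every tag, and the probabilities for each tag still sum to one over the vocabulary.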

  19. Estimating Transitions
      (figure: HMM chain from S to E)
      Interpolation
      • If there are too many tags, or too little data, some combinations are too rare
      • Same as N-gram language models, “backoff” to simpler models
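One way to sketch the interpolation idea is to mix the tag-bigram estimate with a tag-unigram estimate. The mixing weight `lam`, the function name `estimate_transitions`, and the choice of a bigram/unigram mixture are assumptions for illustration; the slides do not specify which simpler models are used.

```python
from collections import Counter

def estimate_transitions(tag_sequences, lam=0.8):
    """Return p(cur, prev) = lam * P_ML(cur | prev) + (1 - lam) * P_ML(cur)."""
    bigrams, context, unigrams = Counter(), Counter(), Counter()
    for tags in tag_sequences:
        path = ["S"] + list(tags) + ["E"]   # pad with start and end states
        for prev, cur in zip(path, path[1:]):
            bigrams[(prev, cur)] += 1
            context[prev] += 1
            unigrams[cur] += 1
    total = sum(unigrams.values())

    def prob(cur, prev):
        # Maximum-likelihood bigram estimate; 0 if the context was never seen.
        bigram = bigrams[(prev, cur)] / context[prev] if context[prev] else 0.0
        unigram = unigrams[cur] / total
        return lam * bigram + (1 - lam) * unigram

    return prob

# Toy data: the tag sequence from slide 17.
q = estimate_transitions([["DET", "VB", "DET", "ADJ", "NOUN"]], lam=0.8)
print(q("ADJ", "DET"), q("NOUN", "DET"))
```

The transition DET to NOUN never occurs in the toy data, yet it receives non-zero probability through the unigram term, which is the point of backing off to a simpler model.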

  20. Outline
      • Sequence Labeling and POS Tagging
      • Generative Modeling: HMMs
      • Inference in HMMs: Viterbi and Forward-Backward (F/B)
      • Unsupervised Tagging using EM

  21. Predicting from HMMs

  22. Brute Force Inference

  23. Conditional Independence
      (figure: HMM chain from S to E)

  24. Dynamic Programming

  25. State Lattice
             Fruit    flies    like     bananas
             R(1,N)   R(2,N)   R(3,N)   R(4,N)
      S      R(1,V)   R(2,V)   R(3,V)   R(4,V)    E
             R(1,IN)  R(2,IN)  R(3,IN)  R(4,IN)

  26. Viterbi Decoding Algorithm
      • Initialization
      • Iterative Computation (forward)
      • Follow pointers (backward)
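The three steps named on slide 26 can be sketched as follows for a first-order HMM with start state S and end state E. This is an illustrative sketch, not the lecture's own code: the dictionary layout of `trans` and `emit` is an assumption, and every probability in the toy model is invented for the “Fruit flies like bananas” lattice.

```python
def viterbi(words, tags, trans, emit):
    """Most likely tag sequence and its joint probability.

    trans[(prev, cur)] and emit[(tag, word)] hold probabilities; missing
    entries count as 0. Assumes a non-empty sentence.
    """
    # Initialization: best score for each tag at the first word, coming from S.
    best = [{t: trans.get(("S", t), 0.0) * emit.get((t, words[0]), 0.0)
             for t in tags}]
    back = [{}]
    # Iterative computation (forward): extend the best path into each cell.
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev = max(tags, key=lambda p: best[i - 1][p] * trans.get((p, t), 0.0))
            best[i][t] = (best[i - 1][prev] * trans.get((prev, t), 0.0)
                          * emit.get((t, words[i]), 0.0))
            back[i][t] = prev
    # The transition into E decides which final tag wins.
    last = max(tags, key=lambda t: best[-1][t] * trans.get((t, "E"), 0.0))
    score = best[-1][last] * trans.get((last, "E"), 0.0)
    # Follow pointers (backward) to recover the path.
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path)), score

# Toy model: all numbers below are invented for illustration.
words = ["Fruit", "flies", "like", "bananas"]
tags = ["N", "V", "IN"]
trans = {("S", "N"): 0.8, ("S", "V"): 0.1, ("S", "IN"): 0.1,
         ("N", "N"): 0.3, ("N", "V"): 0.4, ("N", "IN"): 0.2, ("N", "E"): 0.1,
         ("V", "N"): 0.5, ("V", "V"): 0.05, ("V", "IN"): 0.35, ("V", "E"): 0.1,
         ("IN", "N"): 0.8, ("IN", "V"): 0.1, ("IN", "IN"): 0.05, ("IN", "E"): 0.05}
emit = {("N", "Fruit"): 0.4, ("N", "flies"): 0.2, ("N", "bananas"): 0.4,
        ("V", "Fruit"): 0.1, ("V", "flies"): 0.4, ("V", "like"): 0.5,
        ("IN", "like"): 1.0}
path, score = viterbi(words, tags, trans, emit)
print(path, score)
```

Plain probabilities are multiplied here to keep the sketch close to the lattice picture; for long sentences, summing log-probabilities is the usual way to avoid underflow.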

  27. Computational Complexity
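As a rough comparison for a sentence of $n$ words and $K$ tags, the standard operation counts for a first-order HMM are given below. The symbols $n$ and $K$ are chosen here for illustration and do not come from the slides.

```latex
\text{Brute force: } K^{n} \text{ tag sequences, each scored in } O(n) \text{ time}
\;\Rightarrow\; O(n K^{n})

\text{Viterbi: } n K \text{ lattice cells, each maximizing over } K \text{ predecessors}
\;\Rightarrow\; O(n K^{2})
```

For the lattice on slide 25 ($n = 4$, $K = 3$), brute force would enumerate $3^4 = 81$ tag sequences, while dynamic programming fills only 12 cells.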

  28. Outline
      • Sequence Labeling and POS Tagging
      • Generative Modeling: HMMs
      • Inference in HMMs: Viterbi and Forward-Backward (F/B)
      • Unsupervised Tagging using EM

  29. Unsupervised Tagging
      Supervision is not always appropriate
      • A linguist has to read and understand each sentence
      • Time-consuming and expensive
      • Contains domain-specific signal in the labels
      • WSJ doesn’t generalize to Twitter, for example
      • Difficult to agree on the universal part-of-speech tags (C5 tags: 61, Brown: 87)
      • Want to apply it to low-resource/unknown languages
      Generalize the notion of “clustering” to sequence labeling.

  30. Expectation Maximization
      K-Means
      • Initialization: Pick K random centroids
      • Compute Expectations: Cluster all the points
      • Update Parameters: Update centroids
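The K-Means analogy on slide 30 can be made concrete with a short sketch that labels each step with the corresponding EM phase. It works on 1-D points to stay small; the function name `kmeans`, the fixed iteration count, the seed, and the toy data are all assumptions made for illustration.

```python
import random

def kmeans(points, k, iterations=10, seed=0):
    """1-D K-Means, written to mirror the three steps of EM."""
    rng = random.Random(seed)
    # Initialization: pick K random centroids from the data.
    centroids = rng.sample(points, k)
    assignment = []
    for _ in range(iterations):
        # Compute expectations: assign every point to its nearest centroid.
        assignment = [min(range(k), key=lambda c: (p - centroids[c]) ** 2)
                      for p in points]
        # Update parameters: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, assignment

# Toy data with two well-separated groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centroids, assignment = kmeans(points, k=2)
print(sorted(centroids), assignment)
```

K-Means makes a hard assignment in the expectation step; EM for HMM tagging would instead keep soft, probability-weighted counts, computed with forward-backward.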

  31. Upcoming…
      Homework
      • Homework 2 is due (~10 days): February 9, 2017
      • Write-up, data, and code for Homework 2 are up
      • Ask questions early!
      Project
      • Proposal is due in a week: February 7, 2017
      • Only 2 pages
      Summaries
      • Paper summaries: February 17, February 28, March 14
      • Only 1 page each
