Discriminative Language Models


  1. Discriminative Language Models. Prof. Sameer Singh. CS 295: STATISTICAL NLP, WINTER 2017. January 26, 2017. Based on slides from Noah Smith, Richard Socher, and everyone else they copied from.

  2. Language Models
     Probability of a Sentence
     • Is a given sentence something you would expect to see?
     • Syntactically (grammar) and semantically (meaning)
     Probability of the Next Word
     • Predict what comes next for a given sequence of words.
     • Think of it as V-way classification
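The "V-way classification" view can be sketched as follows: the model assigns one score per vocabulary word, and a softmax turns the scores into a distribution over the next word. This is a minimal illustration, not code from the lecture; the toy vocabulary and the scores are made-up assumptions.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy vocabulary (V = 4) and made-up scores for the context "I love".
vocab = ["I", "love", "food", "<eos>"]
scores = [0.1, 0.2, 2.0, 0.5]

probs = softmax(scores)                        # one probability per vocabulary word
prediction = vocab[probs.index(max(probs))]    # the most probable next word
```

Everything that follows in the deck (logistic regression, feed-forward and recurrent networks) can be read as different ways of producing the scores.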

  3. Outline: Discriminative Language Models; Feed-forward Neural Networks; Recurrent Neural Networks; Upcoming..

  4. Outline: Discriminative Language Models; Feed-forward Neural Networks; Recurrent Neural Networks; Upcoming..

  5. Logistic Regression Model

  6. N-Grams as Logistic Regression
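The body of this slide did not survive extraction, so the following is only one standard way to see the connection, not necessarily the one presented. Assume one indicator feature per (previous word, next word) pair and a softmax over next words. If each feature weight is set to the log of the bigram count, the model reproduces the maximum-likelihood bigram estimate. The toy corpus is a made-up assumption.

```python
import math
from collections import Counter

# Hypothetical toy corpus.
corpus = [["I", "love", "food"], ["I", "love", "NLP"], ["I", "eat", "food"]]

bigram_counts = Counter()
for sentence in corpus:
    for prev, nxt in zip(sentence, sentence[1:]):
        bigram_counts[(prev, nxt)] += 1

vocab = sorted({w for sentence in corpus for w in sentence})

# One weight per indicator feature "previous word is prev AND next word is nxt".
weights = {pair: math.log(count) for pair, count in bigram_counts.items()}

def next_word_probs(prev):
    """Softmax over next words; unseen pairs act as weight -inf (probability 0)."""
    exps = {w: math.exp(weights[(prev, w)]) if (prev, w) in weights else 0.0
            for w in vocab}
    z = sum(exps.values())
    if z == 0.0:
        raise ValueError(f"no bigram starting with {prev!r} was observed")
    return {w: e / z for w, e in exps.items()}

probs_after_i = next_word_probs("I")  # "love" was seen twice, "eat" once
```

The benefit of the logistic-regression view is that the feature set is no longer limited to n-gram indicators, and the weights can be learned instead of being set from counts.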

  7. Other features…

  8. Outline: Discriminative Language Models; Feed-forward Neural Networks; Recurrent Neural Networks; Upcoming..

  9. Logistic Regression with Embeddings

  10. Neural Networks

  11. Activation Functions: sigmoid, softmax, tanh. And many others… ReLUs, PReLUs, ELU, step, max, and so on..
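For reference, three of the element-wise activations named on this slide can be written in a few lines using their standard textbook definitions. The function names are illustrative choices; softmax differs from these in that it acts on a whole vector of scores.

```python
import math

def sigmoid(x):
    """Squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes any real number into (-1, 1); zero-centered."""
    return math.tanh(x)

def relu(x):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)

# Evaluate each activation on a few sample inputs.
samples = [-2.0, 0.0, 2.0]
table = {name: [f(x) for x in samples]
         for name, f in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu)]}
```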

  12. Why do they work? https://colah.github.io

  13. Why do they work? (figure; labels: x1, x2, y, z)

  14. Simulated Example: https://github.com/clab/cnn/blob/master/examples/xor.cc
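The linked example trains a small network on XOR. As a simpler sketch of why a hidden layer helps, the network below computes XOR with hand-picked weights rather than learned ones. The weights and the choice of ReLU are assumptions made for illustration; they are not taken from the linked code.

```python
def relu(x):
    return max(0.0, x)

def xor_net(x1, x2):
    """One hidden layer with two ReLU units, followed by a linear output."""
    h1 = relu(x1 + x2)         # active when at least one input is on
    h2 = relu(x1 + x2 - 1.0)   # active only when both inputs are on
    return h1 - 2.0 * h2       # subtracting cancels the "both on" case

# Evaluate the network on all four input combinations.
outputs = {(a, b): xor_net(a, b) for a in (0.0, 1.0) for b in (0.0, 1.0)}
```

No single linear unit over x1 and x2 can produce this input-output table, because XOR is not linearly separable; the hidden layer re-represents the inputs so that a linear output suffices.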

  15. Simple Feedforward NN LM: Bigram Model

  16. Simple Feedforward NN LM: N-gram Model

  17. Deep Feedforward NN LM (Bengio et al. 2003)
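A forward pass of a feed-forward n-gram language model in this style can be sketched as: look up an embedding for each context word, concatenate, apply a tanh hidden layer, and take a softmax over the vocabulary. This is a simplified, untrained sketch; the sizes, the random weights, and the omission of bias terms and of the optional direct input-to-output connections are assumptions made to keep it short.

```python
import math
import random

random.seed(0)
vocab = ["<s>", "I", "love", "food", "<eos>"]
V, D, H, N_CTX = len(vocab), 3, 4, 2  # vocab size, embedding dim, hidden dim, context length

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

E = rand_matrix(V, D)          # one embedding row per vocabulary word
W = rand_matrix(H, N_CTX * D)  # hidden layer over the concatenated embeddings
U = rand_matrix(V, H)          # output layer: one score per vocabulary word

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_word_distribution(context):
    """P(next word | context) for a context of exactly N_CTX words."""
    x = [value for word in context for value in E[vocab.index(word)]]
    hidden = [math.tanh(a) for a in matvec(W, x)]
    return softmax(matvec(U, hidden))

probs = next_word_distribution(["I", "love"])
```

With N_CTX set to 1 this reduces to the bigram case from the earlier slide; because words share the embedding matrix E, similar words can receive similar predictions even for contexts never seen in training.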

  18. Outline: Discriminative Language Models; Feed-forward Neural Networks; Recurrent Neural Networks; Upcoming..

  19. Sequence View of Simple NNs

  20. Recurrent Neural Networks

  21. Example: “I love food” (diagram; inputs: I, love, food; outputs: love, food, <eos>)
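The example can be unrolled in code: at each step the network reads one word, updates its hidden state from the previous state, and predicts the next word, with <eos> as the final target. This is a minimal sketch of a simple (Elman-style) RNN with random, untrained weights; the hidden size and weight names are assumptions.

```python
import math
import random

random.seed(0)
vocab = ["I", "love", "food", "<eos>"]
V, H = len(vocab), 4  # vocabulary size and (hypothetical) hidden size

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

W_xh = rand_matrix(H, V)  # input-to-hidden weights
W_hh = rand_matrix(H, H)  # hidden-to-hidden (recurrent) weights
W_hy = rand_matrix(V, H)  # hidden-to-output weights

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(word):
    return [1.0 if w == word else 0.0 for w in vocab]

sentence = ["I", "love", "food"]
inputs = sentence
targets = sentence[1:] + ["<eos>"]  # each target is the following word

h = [0.0] * H  # initial hidden state
loss = 0.0
for x_word, y_word in zip(inputs, targets):
    pre = [a + b for a, b in zip(matvec(W_xh, one_hot(x_word)), matvec(W_hh, h))]
    h = [math.tanh(v) for v in pre]              # the new state depends on the old state
    probs = softmax(matvec(W_hy, h))             # distribution over the next word
    loss += -math.log(probs[vocab.index(y_word)])  # cross-entropy for this step
```

Unlike the fixed-window feed-forward model, the same three weight matrices are reused at every position, so the hidden state can in principle carry information from arbitrarily far back.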

  22. Power of RNNs: Characters! http://karpathy.github.io/2015/05/21/rnn-effectiveness/

  23. Char-RNNs: Shakespeare!

  24. Char-RNNs: Wikipedia!

  25. Char-RNNs: Linux Code!

  26. Extension: Stacking

  27. Extension: Bidirectional RNNs

  28. Deep Bidirectional RNNs

  29. Extension: GRUs (Gated Recurrent Units)

  30. Extension: GRUs (Gated Recurrent Units)
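The equations on the GRU slides did not survive extraction, so the following uses the commonly cited formulation as an assumption: an update gate z, a reset gate r, a candidate state computed from the reset-gated previous state, and an interpolation between the old state and the candidate. Some references swap the roles of z and 1 - z; bias terms are omitted, and the weights are random and untrained.

```python
import math
import random

random.seed(0)
D, H = 3, 2  # hypothetical input and hidden sizes

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

# W_* act on the input x; U_* act on the previous hidden state.
params = {name: rand_matrix(H, D if name.startswith("W") else H)
          for name in ["W_z", "U_z", "W_r", "U_r", "W_h", "U_h"]}

def gru_step(x, h_prev):
    z = [sigmoid(a + b) for a, b in
         zip(matvec(params["W_z"], x), matvec(params["U_z"], h_prev))]  # update gate
    r = [sigmoid(a + b) for a, b in
         zip(matvec(params["W_r"], x), matvec(params["U_r"], h_prev))]  # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    candidate = [math.tanh(a + b) for a, b in
                 zip(matvec(params["W_h"], x), matvec(params["U_h"], gated))]
    # Interpolate between keeping the old state and adopting the candidate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, candidate)]

h = [0.0] * H
for x in [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]:
    h = gru_step(x, h)
```

When the update gate is near 0 the old state is copied through almost unchanged, which is what helps information and gradients survive over many time steps.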

  31. Estimating Parameters
      Beyond the scope of the course
      • Lots of tricks, heuristics, “domain knowledge”
      • Lots of engineering for efficiency, e.g. GPUs
      • New training algorithms being proposed every year
        • sometimes architecture-specific
      • Lots of available tools you can use!
        • TensorFlow, Torch, Keras, MXNet, etc.

  32. Outline: Discriminative Language Models; Feed-forward Neural Networks; Recurrent Neural Networks; Upcoming..

  33. Homework 1 so far… (figure; panels: Public, Private)

  34. Ruslan Salakhutdinov
      Professor at Carnegie Mellon University; Director of Artificial Intelligence, Apple Inc.
      Learning Deep Unsupervised and Multimodal Models
      Location: DBH 6011
      Time: 11am - 12pm
      Date: January 27, 2017
      Meeting with PhD students, will post on Piazza

  35. Upcoming…
      Homework
      • Homework 1 is due tonight: January 26, 2017
      • Write-up, data, and code for Homework 2 are up
      • Homework 2 is due: February 9, 2017
      Project
      • Proposal is due: February 7, 2017 (~2 weeks)
      • Only 2 pages
