  1. Squashing Computational Linguistics Noah A. Smith Paul G. Allen School of Computer Science & Engineering University of Washington Seattle, USA @nlpnoah Research supported in part by: NSF, DARPA DEFT, DARPA CWC, Facebook, Google, Samsung, University of Washington.

  2. data

  3. Applications of NLP in 2017 • Conversation, IE, MT, QA, summarization, text categorization

  4. Applications of NLP in 2017 • Conversation, IE, MT, QA, summarization, text categorization • Machine-in-the-loop tools for (human) authors: revise your message with help from NLP (Chenhao Tan); collaborate with an NLP model through an “exquisite corpse” storytelling game (Elizabeth Clark) tremoloop.com

  5. Applications of NLP in 2017 • Conversation, IE, MT, QA, summarization, text categorization • Machine-in-the-loop tools for (human) authors • Analysis tools for measuring social phenomena: sensationalism in science news (Lucy Lin); track ideas, propositions, frames in discourse over time (Dallas Card) bit.ly/sensational-news … bookmark this survey!

  6. data ?

  7. Squash

  8. Squash Networks • Parameterized differentiable functions composed out of simpler parameterized differentiable functions, some nonlinear

  9. Squash Networks • Parameterized differentiable functions composed out of simpler parameterized differentiable functions, some nonlinear From Jack (2010), Dynamic System Modeling and Control, goo.gl/pGvJPS *Yes, rectified linear units (relus) are only half-squash; hat-tip Martha White.

  10. Squash Networks • Parameterized differentiable functions composed out of simpler parameterized differentiable functions, some nonlinear From existentialcomics.com • Estimate parameters using Leibniz (1676)

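The recipe on these slides (parameterized affine maps composed with squashing nonlinearities, parameters estimated with the chain rule that Leibniz gave us) can be sketched in a few lines of NumPy. This toy two-layer network learning XOR is an illustrative sketch under my own choices of sizes and learning rate, not code from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

losses = []
for step in range(2000):
    h = np.tanh(X @ W1 + b1)                # squash
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))    # squash again
    losses.append(float(np.mean((p - y) ** 2)))
    # gradients of mean squared error, via the chain rule
    dp = (p - y) * p * (1 - p)
    dW2 = h.T @ dp; db2 = dp.sum(0)
    dh = dp @ W2.T * (1 - h ** 2)           # d/dx tanh(x) = 1 - tanh(x)^2
    dW1 = X.T @ dh; db1 = dh.sum(0)
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 0.5 * grad

print(f"MSE: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Swapping tanh for a relu would make the hidden layer only "half-squash," per the footnote on the previous slide.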
  11. Who wants an all-squash diet? very Cucurbita much festive many dropout wow

  12. Linguistic Structure Prediction: input (text) → output (structure)

  13. Linguistic Structure Prediction: output structures are sequences, trees, graphs, …

  14. Linguistic Structure Prediction: plus a “gold” output to learn from

  15. Linguistic Structure Prediction: plus an input representation

  16. Linguistic Structure Prediction: input representations draw on clusters, lexicons, embeddings, …

  17. Linguistic Structure Prediction: plus a training objective

  18. Linguistic Structure Prediction: training objectives can be probabilistic, cost-aware, …

  19. Linguistic Structure Prediction: plus part representations

  20. Linguistic Structure Prediction: parts are segments/spans, arcs, graph fragments, …

  21. Linguistic Structure Prediction (recap): input (text) → input representation → part representations → output (structure), trained against the “gold” output via the training objective

  22. Linguistic Structure Prediction, sources of inductive bias: the “task” (error definitions & weights, annotation conventions & theory, data selection); regularization of the training objective; constraints & independence assumptions on the output structure

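One concrete part space from the list above, contiguous segments (spans) up to a maximum length, is easy to enumerate. A minimal sketch; the function and variable names are mine, not the talk's:

```python
def candidate_spans(n_tokens, max_len):
    """All (start, end) pairs with 1 <= end - start <= max_len (end exclusive)."""
    return [(i, j)
            for i in range(n_tokens)
            for j in range(i + 1, min(i + max_len, n_tokens) + 1)]

tokens = "When Democrats wonder why".split()
spans = candidate_spans(len(tokens), max_len=2)
print(spans)   # [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
```

Each such span would then be scored and labeled (or assigned a null label) by the model; arcs and graph fragments give analogous part spaces for trees and graphs.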
  23. Inductive Bias • What does your learning algorithm assume? • How will it choose among good predictive functions? See also: No Free Lunch Theorem (Mitchell, 1980; Wolpert, 1996)

  24. bias data

  25. Three New Models • Parsing sentences into predicate-argument structures • Fillmore frames • Semantic dependency graphs • Language models that dynamically track entities

  26. When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992. Original story on Slate.com: http://goo.gl/Hp89tD

  27. Frame-Semantic Analysis When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992. FrameNet: https://framenet.icsi.berkeley.edu

  29. Frame-Semantic Analysis cognizer: Democrats topic: why … Clinton When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992. FrameNet: https://framenet.icsi.berkeley.edu

  30. Frame-Semantic Analysis (full analysis): every predicate in the sentence evokes a frame whose roles are filled by spans of the sentence, e.g. cognizer: Democrats; topic: why … Clinton; degree: so much; helper: Stephanopoulos … Carville; goal: to put over; time: in 1992; some roles go unfilled (experiencer: ?, benefited_party: ?). When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992. FrameNet: https://framenet.icsi.berkeley.edu

  31. categorize, class, classify, bracket, translate, …; brood, consider, contemplate, deliberate, …; appraise, assess, evaluate, …; agonize, fret, fuss, lose sleep, …; commit to memory, learn, memorize, … FrameNet: https://framenet.icsi.berkeley.edu

  33. When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need … words + frame

  34. biLSTM (contextualized word vectors) When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need … words + frame

  35. … parts: segments up to length d scored by another biLSTM, with labels biLSTM (contextualized word vectors) When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need … words + frame

  36. … output: covering sequence of nonoverlapping segments parts: segments up to length d scored by another biLSTM, with labels biLSTM (contextualized word vectors) When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need … words + frame

  37. Training objective: segmental RNN log loss (Lingpeng Kong, Chris Dyer, N.A.S., ICLR 2016). Output: covering sequence of non-overlapping segments, recovered in O(Ldn) time; see Sarawagi & Cohen (2004). Parts: segments up to length d, scored with labels by another biLSTM, on top of a biLSTM giving contextualized word vectors over the input sequence.

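The O(Ldn) decoder behind that objective is the semi-Markov dynamic program of Sarawagi & Cohen (2004): L labels, maximum segment length d, n tokens. A hedged sketch; the scoring function below is a toy stand-in for the segmental biLSTM scores, and all names are illustrative:

```python
import math

def segmental_viterbi(n, labels, max_len, score):
    """best[j] = score of the best labeled segmentation of tokens[0:j]."""
    best = [0.0] + [-math.inf] * n
    back = [None] * (n + 1)
    for j in range(1, n + 1):
        for length in range(1, min(max_len, j) + 1):
            i = j - length
            for lab in labels:              # O(L * d * n) overall
                s = best[i] + score(i, j, lab)
                if s > best[j]:
                    best[j] = s
                    back[j] = (i, lab)
    segs, j = [], n                         # recover the covering sequence
    while j > 0:
        i, lab = back[j]
        segs.append((i, j, lab))
        j = i
    return best[n], segs[::-1]

# toy scorer standing in for biLSTM span scores
score = lambda i, j, lab: 1.0 if (lab == "X" and j - i == 2) else 0.0
total, segments = segmental_viterbi(4, ["X", "Y"], max_len=2, score=score)
print(total, segments)   # 2.0 [(0, 2, 'X'), (2, 4, 'X')]
```

The same table, with max replaced by log-sum-exp, gives the partition function needed for the log loss.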
  38. When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need …

  39. When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need …

  40. Each segment of the sentence gets a frame-element label or ∅: [When]∅ [Democrats]cognizer wonder (target, Cogitation) [why there is so much resentment of Clinton]topic [they]∅ [don’t need …]∅

  41. Each labeled segment (∅, cognizer, topic, ∅, ∅) is scored conditioned on the target and its frame (wonder, Cogitation). When Democrats wonder[Cogitation] why there is so much resentment of Clinton, they don’t need …

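The per-segment labeling step above can be sketched as scoring each candidate span against the role inventory plus a null label. Here the "biLSTM states" are random toy vectors and the endpoint-concatenation scorer is my own illustrative choice, not the talk's exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)
labels = ["NULL", "cognizer", "topic"]   # "NULL" stands in for the slide's ∅
dim = 4
H = rng.normal(size=(5, dim))            # toy stand-ins for contextualized vectors
W = rng.normal(size=(2 * dim, len(labels)))

def label_span(i, j):
    """Score a span by its endpoint vectors and pick the best label."""
    feats = np.concatenate([H[i], H[j - 1]])   # endpoint concatenation
    return labels[int(np.argmax(feats @ W))]

for i, j in [(0, 1), (1, 2), (3, 5)]:
    print((i, j), label_span(i, j))
```

In the full model these span scores feed the segmental dynamic program, and the features would also condition on the target predicate and frame.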