  1. Squashing Computational Linguistics Noah A. Smith Paul G. Allen School of Computer Science & Engineering University of Washington Seattle, USA @nlpnoah Research supported in part by: NSF, DARPA DEFT, DARPA CWC, Facebook, Google, Samsung, University of Washington.

  2. data

  3. Applications of NLP in 2017 • Conversation, IE, MT, QA, summarization, text categorization

  4. Applications of NLP in 2017 • Conversation, IE, MT, QA, summarization, text categorization • Machine-in-the-loop tools for (human) authors: revise your message with help from NLP (Chenhao Tan); collaborate with an NLP model through an “exquisite corpse” storytelling game (Elizabeth Clark) tremoloop.com

  5. Applications of NLP in 2017 • Conversation, IE, MT, QA, summarization, text categorization • Machine-in-the-loop tools for (human) authors • Analysis tools for measuring social phenomena: sensationalism in science news (Lucy Lin); track ideas, propositions, frames in discourse over time (Dallas Card) bit.ly/sensational-news … bookmark this survey!

  6. data ?

  7. Squash

  8. Squash Networks • Parameterized differentiable functions composed out of simpler parameterized differentiable functions, some nonlinear

  9. Squash Networks • Parameterized differentiable functions composed out of simpler parameterized differentiable functions, some nonlinear From Jack (2010), Dynamic System Modeling and Control, goo.gl/pGvJPS *Yes, rectified linear units (relus) are only half-squash; hat-tip Martha White.

  10. Squash Networks • Parameterized differentiable functions composed out of simpler parameterized differentiable functions, some nonlinear From existentialcomics.com • Estimate parameters using Leibniz (1676)

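The recipe on these slides (parameterized affine maps composed with squashing nonlinearities, parameters estimated with the chain rule that Leibniz gave us) can be sketched in a few lines of NumPy. This toy two-layer network learning XOR is an illustrative sketch under my own choices of sizes and learning rate, not code from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

losses = []
for step in range(2000):
    h = np.tanh(X @ W1 + b1)                # squash
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))    # squash again
    losses.append(float(np.mean((p - y) ** 2)))
    # gradients of mean squared error, via the chain rule
    dp = (p - y) * p * (1 - p)
    dW2 = h.T @ dp; db2 = dp.sum(0)
    dh = dp @ W2.T * (1 - h ** 2)           # d/dx tanh(x) = 1 - tanh(x)^2
    dW1 = X.T @ dh; db1 = dh.sum(0)
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 0.5 * grad

print(f"MSE: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Swapping tanh for a relu would make the hidden layer only "half-squash," per the footnote on the previous slide.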
  11. Who wants an all-squash diet? very Cucurbita much festive many dropout wow

  12. Linguistic Structure Prediction: input (text) → output (structure)

  13. Linguistic Structure Prediction: output structures are sequences, trees, graphs, …

  14. Linguistic Structure Prediction: plus a “gold” output to learn from

  15. Linguistic Structure Prediction: plus an input representation

  16. Linguistic Structure Prediction: input representations draw on clusters, lexicons, embeddings, …

  17. Linguistic Structure Prediction: plus a training objective

  18. Linguistic Structure Prediction: training objectives can be probabilistic, cost-aware, …

  19. Linguistic Structure Prediction: plus part representations

  20. Linguistic Structure Prediction: parts are segments/spans, arcs, graph fragments, …

  21. Linguistic Structure Prediction (recap): input (text) → input representation → part representations → output (structure), trained against the “gold” output via the training objective

  22. Linguistic Structure Prediction, sources of inductive bias: the “task” (error definitions & weights, annotation conventions & theory, data selection); regularization of the training objective; constraints & independence assumptions on the output structure

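One concrete part space from the list above, contiguous segments (spans) up to a maximum length, is easy to enumerate. A minimal sketch; the function and variable names are mine, not the talk's:

```python
def candidate_spans(n_tokens, max_len):
    """All (start, end) pairs with 1 <= end - start <= max_len (end exclusive)."""
    return [(i, j)
            for i in range(n_tokens)
            for j in range(i + 1, min(i + max_len, n_tokens) + 1)]

tokens = "When Democrats wonder why".split()
spans = candidate_spans(len(tokens), max_len=2)
print(spans)   # [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
```

Each such span would then be scored and labeled (or assigned a null label) by the model; arcs and graph fragments give analogous part spaces for trees and graphs.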
  23. Inductive Bias • What does your learning algorithm assume? • How will it choose among good predictive functions? See also: No Free Lunch Theorem (Mitchell, 1980; Wolpert, 1996)

  24. bias data

  25. Three New Models • Parsing sentences into predicate-argument structures • Fillmore frames • Semantic dependency graphs • Language models that dynamically track entities

  26. When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992. Original story on Slate.com: http://goo.gl/Hp89tD

  27. Frame-Semantic Analysis When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992. FrameNet: https://framenet.icsi.berkeley.edu

  29. Frame-Semantic Analysis cognizer: Democrats topic: why … Clinton When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992. FrameNet: https://framenet.icsi.berkeley.edu

  30. Frame-Semantic Analysis (full analysis): every predicate in the sentence evokes a frame whose roles are filled by spans of the sentence, e.g. cognizer: Democrats; topic: why … Clinton; degree: so much; helper: Stephanopoulos … Carville; goal: to put over; time: in 1992; some roles go unfilled (experiencer: ?, benefited_party: ?). When Democrats wonder why there is so much resentment of Clinton, they don’t need to look much further than the Big Lie about philandering that Stephanopoulos, Carville helped to put over in 1992. FrameNet: https://framenet.icsi.berkeley.edu

  31. categorize, class, classify, bracket, translate, …; brood, consider, contemplate, deliberate, …; appraise, assess, evaluate, …; agonize, fret, fuss, lose sleep, …; commit to memory, learn, memorize, … FrameNet: https://framenet.icsi.berkeley.edu

  33. When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need … words + frame

  34. biLSTM (contextualized word vectors) When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need … words + frame

  35. … parts: segments up to length d scored by another biLSTM, with labels biLSTM (contextualized word vectors) When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need … words + frame

  36. … output: covering sequence of nonoverlapping segments parts: segments up to length d scored by another biLSTM, with labels biLSTM (contextualized word vectors) When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need … words + frame

  37. Training objective: segmental RNN log loss (Lingpeng Kong, Chris Dyer, N.A.S., ICLR 2016). Output: covering sequence of non-overlapping segments, recovered in O(Ldn) time; see Sarawagi & Cohen (2004). Parts: segments up to length d, scored with labels by another biLSTM, on top of a biLSTM giving contextualized word vectors over the input sequence.

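The O(Ldn) decoder behind that objective is the semi-Markov dynamic program of Sarawagi & Cohen (2004): L labels, maximum segment length d, n tokens. A hedged sketch; the scoring function below is a toy stand-in for the segmental biLSTM scores, and all names are illustrative:

```python
import math

def segmental_viterbi(n, labels, max_len, score):
    """best[j] = score of the best labeled segmentation of tokens[0:j]."""
    best = [0.0] + [-math.inf] * n
    back = [None] * (n + 1)
    for j in range(1, n + 1):
        for length in range(1, min(max_len, j) + 1):
            i = j - length
            for lab in labels:              # O(L * d * n) overall
                s = best[i] + score(i, j, lab)
                if s > best[j]:
                    best[j] = s
                    back[j] = (i, lab)
    segs, j = [], n                         # recover the covering sequence
    while j > 0:
        i, lab = back[j]
        segs.append((i, j, lab))
        j = i
    return best[n], segs[::-1]

# toy scorer standing in for biLSTM span scores
score = lambda i, j, lab: 1.0 if (lab == "X" and j - i == 2) else 0.0
total, segments = segmental_viterbi(4, ["X", "Y"], max_len=2, score=score)
print(total, segments)   # 2.0 [(0, 2, 'X'), (2, 4, 'X')]
```

The same table, with max replaced by log-sum-exp, gives the partition function needed for the log loss.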
  38. When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need …

  39. When Democrats wonder Cogitation why there is so much resentment of Clinton, they don’t need …

  40. Each segment of the sentence gets a frame-element label or ∅: [When]∅ [Democrats]cognizer wonder (target, Cogitation) [why there is so much resentment of Clinton]topic [they]∅ [don’t need …]∅

  41. Each labeled segment (∅, cognizer, topic, ∅, ∅) is scored conditioned on the target and its frame (wonder, Cogitation). When Democrats wonder[Cogitation] why there is so much resentment of Clinton, they don’t need …

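The per-segment labeling step above can be sketched as scoring each candidate span against the role inventory plus a null label. Here the "biLSTM states" are random toy vectors and the endpoint-concatenation scorer is my own illustrative choice, not the talk's exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)
labels = ["NULL", "cognizer", "topic"]   # "NULL" stands in for the slide's ∅
dim = 4
H = rng.normal(size=(5, dim))            # toy stand-ins for contextualized vectors
W = rng.normal(size=(2 * dim, len(labels)))

def label_span(i, j):
    """Score a span by its endpoint vectors and pick the best label."""
    feats = np.concatenate([H[i], H[j - 1]])   # endpoint concatenation
    return labels[int(np.argmax(feats @ W))]

for i, j in [(0, 1), (1, 2), (3, 5)]:
    print((i, j), label_span(i, j))
```

In the full model these span scores feed the segmental dynamic program, and the features would also condition on the target predicate and frame.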