MWE vs. NLP: MWEs from a Natural Language Processing perspective

PARSEME/ENeL workshop on MWE e-lexicons
Héctor Martínez Alonso
University of Paris-Diderot & INRIA (France)
hector.martinez-alonso@inria.fr

Overview
1. Common ground
2. MWE for NLP
   Machine translation
   Relation extraction
3. NLP for MWE, word association
   Some applications
   Pointwise mutual information
4. Wrap-up

MWE
Definition 2.1 from Ramisch (2015). MWEs are lexical items that:
1. Are decomposable into multiple lexemes,
2. Present idiomatic behaviour at some level of linguistic analysis and, as a consequence,
3. Must be treated as a unit at some level of computational processing.


1) Tokenization
Don’t you know I’m John Mayer’s taken-for-dead son, ma’am?
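To make the difficulty concrete, here is a minimal sketch that runs two naive tokenizers over the example sentence. Both functions are illustrative assumptions, not tools named on the slide, and the sentence is typed with straight apostrophes for simplicity.

```python
import re

SENTENCE = "Don't you know I'm John Mayer's taken-for-dead son, ma'am?"


def whitespace_tokens(text):
    """Split on whitespace only; punctuation stays glued to the words."""
    return text.split()


def punctuation_tokens(text):
    """Separate every run of letters from every single punctuation mark."""
    return re.findall(r"[A-Za-z]+|[^\sA-Za-z]", text)


print(whitespace_tokens(SENTENCE))
print(punctuation_tokens(SENTENCE))
```

Neither output is satisfactory: the first keeps "son," and "ma'am?" with their punctuation attached, while the second breaks "Don't", "Mayer's" and "taken-for-dead" into fragments that no longer behave like words.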

1) Tokenization and wordness status
To day (until XVI century)
To-day (until early XX century)
Today (well, today)

2) Idiomaticity: Morphosyntactic
By and large, they were criminals at large.

2) Variation in morphosyntactic fixedness
Ulica Obi-Wana Kenobiego (Obi-Wan Kenobi Street, with the name inflected in the Polish genitive) in Grabowiec, Poland

MWE for NLP
1. Statistical Machine Translation
2. Relation Extraction

1) Statistical Machine Translation
(Counterargument: Maybe the idiom is already fixed at It’s.)

2) Relation extraction
We were trying to extract e.g. profession-product/activity pairs, using patterns like Person Created Entity, with
1. Person, a list of human terms, e.g. plumber, child, Galileo.
2. Created, a list of creation verbs, e.g. invent, make.
3. Entity, the product or activity we want to identify.
E.g. Galileo invented the telescope.

2) Relation extraction: Person Created Entity
1. True positive: Cobblers made shoes
2. True negative: Mankind brought conflict
3. False positive: Teenagers made out with their classmates
4. False negative: Diplomats brought about negotiations
Ignoring MWEs limited our predictive power.
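The four cases can be reproduced with a small sketch of the Person Created Entity pattern. Everything here is an assumption made for illustration: the word lists, the tiny MWE lexicon, treating "brought about" as a creation verb, and the `extract` function itself are hypothetical and not the system described in the talk.

```python
# Hypothetical word lists, kept tiny on purpose.
PERSONS = {"cobblers", "mankind", "teenagers", "diplomats", "galileo"}
CREATION_VERBS = {"made", "invented", "brought about"}
# Hypothetical MWE lexicon: verb-particle pairs to treat as one unit.
MWE_VERBS = {"made out", "brought about"}


def extract(sentence, use_mwes):
    """Return (person, verb, entity) if the pattern matches, else None."""
    words = sentence.lower().rstrip(".").split()
    if len(words) < 3 or words[0] not in PERSONS:
        return None
    verb, rest = words[1], words[2:]
    # With the lexicon on, merge verb and particle before the verb lookup.
    if use_mwes and f"{verb} {rest[0]}" in MWE_VERBS:
        verb, rest = f"{verb} {rest[0]}", rest[1:]
    if verb in CREATION_VERBS and rest:
        return (words[0], verb, " ".join(rest))
    return None


for s in ["Cobblers made shoes",
          "Mankind brought conflict",
          "Teenagers made out with their classmates",
          "Diplomats brought about negotiations"]:
    print(s, "|", extract(s, use_mwes=False), "|", extract(s, use_mwes=True))
```

Without the lexicon, "made out" is read as the creation verb "made" (a false positive) and "brought about" is missed because "brought" alone is not a creation verb (a false negative). With the lexicon, both cases come out right, and the first two sentences are unaffected.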

NLP for MWE lexicography
1. Estimate compositionality
2. Help find glosses and examples
3. Identify synonymy
4. Detect MWEs

A two-word idiom
red herring (noun):
1. a dried smoked herring, turned red by the smoke.
2. a clue or information which is misleading or distracting.
bluff, ruse, feint, deception, subterfuge, hoax, trick...

Association between words: Pointwise Mutual Information
\[ \mathrm{PMI}(x; y) = \log \left( \frac{p(x, y)}{p(x)\, p(y)} \right) \]

PMI, with words w_1 and w_2
\[ \mathrm{PMI}(w_1; w_2) = \log \left( \frac{p(w_1, w_2)}{p(w_1)\, p(w_2)} \right) \]

PMI, contribution of terms
\[ \mathrm{PMI}(w_1; w_2) = \log \left( \frac{p(w_1, w_2)}{p(w_1)\, p(w_2)} \right) \]

PMI, w_1 = red and w_2 = herring
\[ \mathrm{PMI}(\text{red}; \text{herring}) = \log \left( \frac{p(\text{red herring})}{p(\text{red})\, p(\text{herring})} \right) \]
What is the contribution of the numerator and of the two terms of the denominator to the score?
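One way to approach the question is to split the logarithm into its three terms. This is only a rewriting of the definition using the standard rules for logarithms:

```latex
\[
\mathrm{PMI}(w_1; w_2) = \log p(w_1, w_2) - \log p(w_1) - \log p(w_2)
\]
```

Read this way, the score rises with the probability of the pair and falls as either word becomes more frequent on its own. A rare word such as herring that occurs comparatively often right after red therefore gets a high score, while a pair of very frequent words needs a very high joint probability to score well.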

Association between words: Mutual Information
\[ \mathrm{PMI}(x; y) = \log \left( \frac{p(x, y)}{p(x)\, p(y)} \right) \]
1. Related but not equal to the conditional probability \( P(x \mid y) = \frac{P(x, y)}{P(y)} \)
2. PMI is not a probability and can be < 0 and > 1
3. \( \mathrm{PMI}(x; y) \neq \mathrm{PMI}(y; x) \) in general when p(x, y) is the probability of the ordered pair "x y"

Association between words: Mutual Information
Compare associations of red car, red herring, and fresh herring

w         p(w)         w1 w2           p(w1 w2)
red       0.00012      red car         0.00000004
fresh     0.00006      red herring     0.00000018
car       0.00007      fresh herring   0.000000015
herring   0.0000025    ...             ...

Association between words: Mutual Information
\[ \mathrm{MI}(x; y) = p(x, y) \log \left( \frac{p(x, y)}{p(x)\, p(y)} \right) \]
MI(red herring) = 6.4
MI(red car) = 1.6
MI(fresh herring) = 4.3
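The scores can be checked against the probabilities in the table with a short sketch. Two choices here are assumptions, since the slide does not state them: the natural logarithm, and the unweighted log ratio (without the p(x, y) factor). With those choices the result matches the listed scores for red herring and red car, while fresh herring comes out near 4.6 rather than the listed 4.3.

```python
import math

# Probabilities copied from the table.
UNIGRAM_P = {"red": 0.00012, "fresh": 0.00006, "car": 0.00007, "herring": 0.0000025}
BIGRAM_P = {
    ("red", "car"): 0.00000004,
    ("red", "herring"): 0.00000018,
    ("fresh", "herring"): 0.000000015,
}


def pmi(w1, w2):
    """Unweighted log ratio, natural logarithm (both are assumptions)."""
    return math.log(BIGRAM_P[(w1, w2)] / (UNIGRAM_P[w1] * UNIGRAM_P[w2]))


for w1, w2 in BIGRAM_P:
    print(f"{w1} {w2}: {pmi(w1, w2):.1f}")
# red car: 1.6
# red herring: 6.4
# fresh herring: 4.6
```

Multiplying each value by p(x, y), as the formula is written, would give numbers on the order of 1e-6 or smaller, so the listed scores appear to be plain PMI values.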

A single metric does not explain it all... but it explains a lot!

w1 w2            PMI
puerto rico      10.03
hong kong         9.73
los angeles       9.56
carbon dioxide    9.10
prize laureate    8.86
san francisco     8.83
nobel prize       8.69
ice hockey        8.66
star trek         8.64
car driver        8.41
...
and of           -2.80
a and            -2.92
of and           -3.71
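A ranking like this one can be produced from raw text by estimating the probabilities from counts. The sketch below is a minimal illustration under stated assumptions: the toy corpus, the `min_count` threshold and the function name are hypothetical, probabilities are plain relative frequencies, and the logarithm is natural. It is not the procedure used to produce the scores above.

```python
import math
from collections import Counter


def rank_bigrams_by_pmi(tokens, min_count=2):
    """Rank adjacent word pairs by PMI, highest first."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # very rare pairs get unreliable, inflated scores
        p_joint = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log(p_joint / (p_w1 * p_w2))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy corpus, only to exercise the function.
TOY_CORPUS = ("the nobel prize and the ice hockey cup and the nobel prize "
              "of the ice hockey team and of the cup").split()

for pair, score in rank_bigrams_by_pmi(TOY_CORPUS):
    print(" ".join(pair), round(score, 2))
```

Even on this tiny sample, "nobel prize" and "ice hockey" rank at the top and the function-word pair "and the" at the bottom, which mirrors the shape of the list above.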

Wrapping up
1. NLP benefits from MWE knowledge
2. Lexicography

Questions and remarks
Thank you!
