Tagging: An Overview Rule-based Disambiguation Example - PowerPoint PPT Presentation

Tagging: An Overview

Rule-based Disambiguation
• Example after-morphology data (using Penn tagset):
    I      watch   a     fly    .
    NN     NN      DT    NN     .
    PRP    VB      NN    VB
           VBP           VBP
• Rules using
  – word forms, from context & current position
  – tags, from context and current position
  – tag sets, from context and current position
  – combinations thereof
2018/2019 UFAL MFF UK NPFL068/Intro to statistical NLP II/Jan Hajic and Pavel Pecina

Example Rules
    I      watch   a     fly
    NN     NN      DT    NN
    PRP    VB      NN    VB
           VBP           VBP
• If-then style:
  • DT eq,-1,Tag ⇒ NN (implies NN in,0,Set as a condition)
  • PRP eq,-1,Tag and DT eq,+1,Tag ⇒ VBP
  • {DT,NN} sub,0,Set ⇒ DT
  • {VB,VBZ,VBP,VBD,VBG} inc,+1,Tag ⇒ not DT
• Regular expressions:
  • not (<*,*,DT>,<*,*, not NN>)
  • not (<*,*,PRP>,<*,*, not VBP>,<*,*,DT>)
  • not (<*,{DT,NN} sub, not DT>)
  • not (<*,*,DT>,<*,*,{VB,VBZ,VBP,VBD,VBG}>)
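The regular-expression rules can be read as negative constraints on a tag sequence. The sketch below is one possible literal reading, not the lecture's implementation: it enumerates every tag path through the ambiguous input and keeps those that no rule rejects. The function names (r1 to r4, disambiguate) and the data layout are assumptions made for illustration.

```python
from itertools import product

# Ambiguous "after-morphology" input from the slide: (word form, candidate tags).
SENTENCE = [
    ("I", {"NN", "PRP"}),
    ("watch", {"NN", "VB", "VBP"}),
    ("a", {"DT", "NN"}),
    ("fly", {"NN", "VB", "VBP"}),
    (".", {"."}),
]

VERBS = {"VB", "VBZ", "VBP", "VBD", "VBG"}


def r1(tags, sets):
    # not (<*,*,DT>, <*,*, not NN>): DT must not be followed by a non-NN tag
    return not any(a == "DT" and b != "NN" for a, b in zip(tags, tags[1:]))


def r2(tags, sets):
    # not (<*,*,PRP>, <*,*, not VBP>, <*,*,DT>)
    return not any(
        a == "PRP" and b != "VBP" and c == "DT"
        for a, b, c in zip(tags, tags[1:], tags[2:])
    )


def r3(tags, sets):
    # not (<*,{DT,NN} sub, not DT>): tag set within {DT, NN} but tag is not DT
    return not any(s <= {"DT", "NN"} and t != "DT" for t, s in zip(tags, sets))


def r4(tags, sets):
    # not (<*,*,DT>, <*,*,{VB,VBZ,VBP,VBD,VBG}>)
    return not any(a == "DT" and b in VERBS for a, b in zip(tags, tags[1:]))


RULES = [r1, r2, r3, r4]


def disambiguate(sentence):
    """Return every tag path that all rules accept (brute force, for clarity)."""
    sets = [tag_set for _, tag_set in sentence]
    paths = product(*(sorted(s) for s in sets))
    return [tags for tags in paths if all(rule(tags, sets) for rule in RULES)]


for path in disambiguate(SENTENCE):
    print(" ".join(path))
```

Under this reading the path PRP VBP DT NN . survives, and every surviving path has a/DT and fly/NN. Paths with I/NN survive as well, since nothing in these four rules excludes that tag.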

Implementation
• Finite State Automata
  – parallel (each rule ~ automaton);
    • algorithm: keep all paths on which all automata say yes
  – compile into a single FSA (intersection)
• Algorithm:
  – a version of Viterbi search, but:
    • no probabilities (“categorical” rules)
    • multiple input:
      – keep track of all possible paths
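A minimal sketch of the "Viterbi without probabilities" idea, assuming the rules can be expressed as a yes/no check on adjacent tags (true of R1 and R4, not of R2). Instead of one best back-pointer per trellis node, each node keeps all allowed predecessors, so every surviving path can be recovered. The names viable_paths and allowed are hypothetical, and the input is assumed to be non-empty.

```python
def viable_paths(tag_sets, allowed):
    """Categorical Viterbi-like search: keep every allowed path, score none.

    tag_sets: non-empty list of candidate-tag sets, one per word.
    allowed:  function (previous_tag, tag) -> bool.
    """
    # back[i][tag] = set of predecessor tags at position i-1 that may precede tag
    back = [{tag: {None} for tag in tag_sets[0]}]
    for i in range(1, len(tag_sets)):
        column = {}
        for tag in tag_sets[i]:
            preds = {p for p in back[i - 1] if allowed(p, tag)}
            if preds:  # keep the node only if some live predecessor reaches it
                column[tag] = preds
        back.append(column)

    def unwind(i, tag):
        # Follow all back-pointers, not just one, to list every path.
        if i == 0:
            yield (tag,)
            return
        for p in back[i][tag]:
            for prefix in unwind(i - 1, p):
                yield prefix + (tag,)

    last = len(tag_sets) - 1
    return sorted(path for tag in back[last] for path in unwind(last, tag))


def r1_allowed(prev, tag):
    # R1 as a bigram check: DT must be followed by NN
    return not (prev == "DT" and tag != "NN")


a_fly = [{"DT", "NN"}, {"NN", "VB", "VBP"}, {"."}]
for path in viable_paths(a_fly, r1_allowed):
    print(path)
```

On the fragment "a fly ." this keeps a/DT fly/NN and a/NN fly/{NN, VB, VBP}, and drops the two paths in which DT is followed by a verb tag.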

Example: the FSA
• R1: not (<*,*,DT>,<*,*, not NN>)
• R2: not (<*,*,PRP>,<*,*, not VBP>,<*,*,DT>)
• R3: not (<*,{DT,NN} sub, not DT>)
• R4: not (<*,*,DT>,<*,*,{VB,VBZ,VBP,VBD,VBG}>)
• R1 as an automaton: [Figure: states F1, F2, N3; arcs labeled "<*,*,DT>", "<*,*, not NN>", "anything", "else"]
• R3 as an automaton: [Figure: states F1, N2; arcs labeled "<*,{DT,NN} sub, not DT>", "anything", "else"]
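The two automata can be sketched as transition functions run in parallel over one candidate path. The state names F1, F2, N3 and N2 come from the slide; the reading that F states accept and N states are rejecting sinks, and the exact arcs, are assumptions, since the figure itself did not survive. The function names are hypothetical.

```python
def r1_step(state, tag):
    """R1: not (DT followed by a non-NN tag). F1 = start, F2 = just saw DT."""
    if state == "N3":                  # rejecting sink: stay there on anything
        return "N3"
    if state == "F2" and tag != "NN":  # DT followed by something other than NN
        return "N3"
    return "F2" if tag == "DT" else "F1"


def r3_step(state, tag, tag_set):
    """R3: not (tag set within {DT, NN} and chosen tag not DT)."""
    if state == "N2" or (tag_set <= {"DT", "NN"} and tag != "DT"):
        return "N2"
    return "F1"


def accepts(path):
    """path: list of (tag, tag_set) pairs; True if both automata say yes."""
    s1, s3 = "F1", "F1"
    for tag, tag_set in path:
        s1 = r1_step(s1, tag)
        s3 = r3_step(s3, tag, tag_set)
    return s1.startswith("F") and s3.startswith("F")


A = {"DT", "NN"}           # candidate tags of "a"
FLY = {"NN", "VB", "VBP"}  # candidate tags of "fly"

print(accepts([("DT", A), ("NN", FLY)]))  # a/DT fly/NN
print(accepts([("DT", A), ("VB", FLY)]))  # a/DT fly/VB, blocked by R1
print(accepts([("NN", A), ("NN", FLY)]))  # a/NN fly/NN, blocked by R3
```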

Applying the FSA
    I      watch   a     fly
    NN     NN      DT    NN
    PRP    VB      NN    VB
           VBP           VBP
• R1: not (<*,*,DT>,<*,*, not NN>)
• R2: not (<*,*,PRP>,<*,*, not VBP>,<*,*,DT>)
• R3: not (<*,{DT,NN} sub, not DT>)
• R4: not (<*,*,DT>,<*,*,{VB,VBZ,VBP,VBD,VBG}>)
• R1 blocks: a/DT fly/{VB, VBP}
  remains: a/DT fly/NN or a/NN fly/{NN, VB, VBP}
• R2 blocks: I/PRP watch/{NN, VB} a/DT
  remains e.g.: I/PRP watch/VBP a/DT (and more)
• R3 blocks: a/NN
  remains only: a/DT
• R4 ⊂ R1! (everything R4 blocks is already blocked by R1)

Applying the FSA (Cont.)
    I      watch   a     fly
    NN     NN      DT    NN
    PRP    VB      NN    VB
           VBP           VBP
• Combine:
  – a/DT fly/NN, a/NN fly/{NN, VB, VBP}
  – I/PRP watch/VBP a/DT
  – a/DT
• Result:
    I      watch   a     fly    .
    PRP    VBP     DT    NN     .

Tagging by Parsing
• Build a parse tree from the multiple input:
    [Figure: parse tree with nodes S, VP, NP over the ambiguous input]
    I      watch   a     fly
    NN     NN      DT    NN
    PRP    VB      NN    VB
           VBP           VBP
• Track down rules: e.g., NP → DT NN: extract (a/DT fly/NN)
• More difficult than tagging itself; results mixed
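As a rough illustration of tagging by parsing, the sketch below runs a small CKY-style chart over the ambiguous input and reads the tags off whatever parses as S. Only NP → DT NN and the node labels S, VP, NP appear on the slide; the other rules of this toy grammar (S → NP VP, VP → VBP NP, NP → PRP) and the function name parse_tags are assumptions, and the final period is left out to keep the grammar tiny.

```python
# Toy grammar (assumed, except NP -> DT NN): binary rules and one unary rule.
BINARY = {("NP", "VP"): "S", ("VBP", "NP"): "VP", ("DT", "NN"): "NP"}
UNARY = {"PRP": "NP"}


def parse_tags(tag_sets, goal="S"):
    """Return the tag sequences for which the toy grammar derives `goal`."""
    n = len(tag_sets)
    # chart[(i, j)][symbol] = tag tuples deriving symbol over words i..j-1
    chart = {}
    for i, tags in enumerate(tag_sets):
        cell = {}
        for t in tags:
            cell.setdefault(t, set()).add((t,))
            if t in UNARY:
                cell.setdefault(UNARY[t], set()).add((t,))
        chart[(i, i + 1)] = cell
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):  # every split point of the span
                for left, lpaths in chart[(i, k)].items():
                    for right, rpaths in chart[(k, j)].items():
                        parent = BINARY.get((left, right))
                        if parent:
                            cell.setdefault(parent, set()).update(
                                l + r for l in lpaths for r in rpaths
                            )
            chart[(i, j)] = cell
    return sorted(chart[(0, n)].get(goal, set()))


tag_sets = [
    {"NN", "PRP"},        # I
    {"NN", "VB", "VBP"},  # watch
    {"DT", "NN"},         # a
    {"NN", "VB", "VBP"},  # fly
]
print(parse_tags(tag_sets))
```

With this grammar only one tag sequence yields a complete parse: I/PRP watch/VBP a/DT fly/NN.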

Statistical Methods (Overview)
• “Probabilistic”:
  • HMM – Merialdo and many more (XLT)
  • Maximum Entropy – DellaPietra et al., Ratnaparkhi, and others
• Rule-based:
  • TBEDL (Transformation Based, Error Driven Learning) – Brill’s tagger
  • Example-based – Daelemans, Zavrel, others
  • Feature-based (inflective languages)
  • Classifier Combination (Brill’s ideas)
