  1. Multi-source projection of coreference chains Yulia Grishina and Manfred Stede Applied Computational Linguistics FSP Cognitive Sciences University of Potsdam / Germany

  2. Outline (I) idea (II) strategies (III) results (IV) error analysis (V) outcomes

  3. (1) Idea & Methodology 3

  4. Annotation projection • automatically transfer annotations from source to target

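A minimal sketch of this projection step (my illustration, not the authors' code): a source mention, given as a set of token indices, is mapped to the target sentence through the word alignment, and the projected mention is taken to be the span covering all aligned target tokens.

```python
# Illustrative sketch: project a coreference mention from a source
# sentence to a word-aligned target sentence.

def project_mention(mention, alignment):
    """Map a source mention (set of token indices) to the target side.

    `alignment` is a dict {source_index: [target_indices]}.
    Returns the projected target span as (start, end), or None if no
    source token of the mention is aligned.
    """
    target_tokens = sorted(
        t for s in mention for t in alignment.get(s, [])
    )
    if not target_tokens:
        return None
    # Take the contiguous span covering all aligned target tokens.
    return (target_tokens[0], target_tokens[-1])

# Example: "A fat lady" (tokens 0-2) aligned one-to-one to
# "Eine dicke Dame" (tokens 0-2).
print(project_mention({0, 1, 2}, {0: [0], 1: [1], 2: [2]}))  # (0, 2)
```

Taking the covering span is one simple heuristic; unaligned or discontinuously aligned mentions are a major error source for any projection approach.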

  7. New: multi-src projection • (Yarowsky et al., 2001): multiple translations of the Bible • (Agić et al., 2016): POS tags • (Rasooli and Collins, 2015; Johannsen et al., 2016): dependency trees • … coreference?

  8. The parallel corpus • 38 parallel texts • 3 languages: English, German, Russian • 3 text genres: newswire¹, narratives², medicine instruction leaflets³ (only EN-DE) ¹ the multilingual newswire agency Project Syndicate (www.project-syndicate.org) ² Daisy stories, short narratives for second language acquisition (http://www.lonweb.org) ³ the EMEA subcorpus of the OPUS collection of parallel corpora (Tiedemann, 2009)

  9. The parallel corpus • sentence-aligned • only sentences aligned across all three languages were extracted (this reduces the number of sentences by 5% and the number of coreference chains by 6% compared to (Grishina & Stede, 2015)) • word alignment using GIZA++ (Och & Ney, 2003)
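GIZA++ alignments are commonly exported in the "i-j" (Pharaoh) format, one sentence pair per line; a small parser for that format might look like this (a sketch assuming that format, not the authors' pipeline):

```python
# Hypothetical helper: read word alignments in the common "i-j"
# (Pharaoh) format, e.g. "0-0 1-1 2-1" aligns source token 2
# to target token 1.

def parse_alignment(line):
    """Return {source_index: [target_indices]} for one sentence pair."""
    alignment = {}
    for pair in line.split():
        s, t = (int(i) for i in pair.split("-"))
        alignment.setdefault(s, []).append(t)
    return alignment

print(parse_alignment("0-0 1-1 2-1"))  # {0: [0], 1: [1], 2: [1]}
```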

  10. Annotation • common coreference annotation guidelines • uniform annotations in 3 languages • identity relation • see (Grishina & Stede, 2016) 10

  11. Annotation guidelines • NP coreference: full NPs, proper names, pronouns • no generic NPs annotated • no singletons annotated

  12. The parallel corpus

                    Newswire            Narratives          Total
                    EN    DE    RU      EN    DE    RU      EN    DE    RU
  Tokens            5903  6268  5763    2619  2642  2343    8522  8910  8106
  Sentences         239   252   239     190   186   192     429   438   431
  REs               558   589   606     470   497   479     1028  1086  1085
  Chains            124   140   140     45    45    48      169   185   188
  REs per chain     4.5   4.2   4.3     10.4  11.0  10.0    6.1   5.9   5.8

  (Grishina and Stede, 2015), (Grishina, 2016)

  13. (2) Strategies 13

  14. Multi-src projection: cases (diagram: three languages L1, L2, L3, each a column of mentions a1…ak, b1…bm, c1…cn; chains A and B are annotated on the source columns L1 and L2, e.g. [a1]A, [b2]B, [a3]A [b3]B, and have to be projected onto the target column L3)

  15. Multi-src projection: trivial case (diagram: chain A projected from L1 and chain B projected from L2 cover exactly the same target mentions, so c1 and c3 end up labelled [c1]AB and [c3]AB: the projected chains are identical)

  18. Multi-src projection: simple case (diagram: chain A projected from L1 yields [c1]A and [c3]A, chain B projected from L2 yields [c2]B and [c4]B; the projected chains share no mentions: they are disjoint)

  21. Multi-src projection: typical case (diagram: chain A projected from L1 yields [c1]A and [c3]A, chain B projected from L2 yields [c2]B and also claims c3; c3 is disputed (A or B?): the projected chains overlap)
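The three cases (identical, disjoint, and overlapping projected chains) can be told apart with simple set operations, treating each projected chain as a set of target-mention ids (an illustrative sketch, not the authors' code):

```python
# Classify two chains projected onto the same target language.
# Each chain is a set of target-mention identifiers.

def classify(chain_a, chain_b):
    """Return which of the three projection cases applies."""
    if chain_a == chain_b:
        return "identical"      # trivial case: merge without conflict
    if chain_a.isdisjoint(chain_b):
        return "disjoint"       # simple case: keep both chains
    return "overlapping"        # typical case: a strategy must decide

print(classify({"c1", "c3"}, {"c1", "c3"}))  # identical
print(classify({"c1", "c3"}, {"c2", "c4"}))  # disjoint
print(classify({"c1", "c3"}, {"c2", "c3"}))  # overlapping
```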

  24. Strategies: voting, concatenation, intersection • intersect: intersection of mentions for overlapping chains • add: disjoint chains from one language are added to the other languages • concatenate: overlapping chains are merged together
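A possible reconstruction of the three strategies (my sketch of the slide's definitions, not the authors' implementation), with each chain again represented as a set of target-mention ids:

```python
# Illustrative merge strategies for chains projected from two sources.

def concatenate(chains_1, chains_2):
    """Merge chains that share at least one mention; keep the rest."""
    merged = [set(c) for c in chains_1]
    for c2 in chains_2:
        hits = [c for c in merged if c & c2]
        new_chain = set(c2).union(*hits)  # union c2 with all overlaps
        merged = [c for c in merged if not (c & c2)] + [new_chain]
    return merged

def intersect(chains_1, chains_2):
    """Keep only the mentions that both sources agree on."""
    result = []
    for c1 in chains_1:
        for c2 in chains_2:
            common = c1 & c2
            if common:
                result.append(common)
    return result

def add(chains_1, chains_2):
    """Keep chains_1; add only those chains_2 disjoint from all of them."""
    all_1 = set().union(*chains_1) if chains_1 else set()
    return [set(c) for c in chains_1] + [
        set(c) for c in chains_2 if not (c & all_1)
    ]
```

For identical chains all three strategies coincide; they differ only in how aggressively they resolve the overlapping (typical) case.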

  25. A real example EN: [A fat lady] [who] wore a fur around [her] neck came in. [She] said that [she] needs [Daisy’s] help and does not know what to do. DE: [Eine dicke Dame mit einer Pelzstola] kam rein. [Sie] hat gesagt, dass [sie] [Daisys] Hilfe braucht und dass [sie] nicht weiß, was [sie] tun soll. RU: Вошла [полная дама, носившая мех вокруг шеи]. [Она] сказала, что [ей] необходима помощь [Дэйзи] и что [она] не знает, что [ей] делать.

  26. A real example (same sentences; note the additional embedded mention in German) EN: [A fat lady] [who] wore a fur around [her] neck came in. [She] said that [she] needs [Daisy’s] help and does not know what to do. DE: [[Eine dicke Dame] mit einer Pelzstola] kam rein. [Sie] hat gesagt, dass [sie] [Daisys] Hilfe braucht und dass [sie] nicht weiß, was [sie] tun soll. RU: Вошла [полная дама, носившая мех вокруг шеи]. [Она] сказала, что [ей] необходима помощь [Дэйзи] и что [она] не знает, что [ей] делать.

  28. (3) Results 28


  30. Results (F1)

                 EN,RU->DE   +ment           EN,DE->RU   +ment
  add            46.6        52.6 (+6.0)     56.9        57.3 (+0.4)
  concatenate    49.6        57.0 (+7.4)     58.6        59.0 (+0.4)
  intersect      35.7        40.3 (+4.6)     40.7        40.8 (+0.1)

  31. Results: baselines

                   P     R     F1
  EN-DE            55.3  43.8  48.7
  RU-DE            40.9  26.7  31.9
  EN,RU-DE-con     53.3  46.5  49.6
  EN,RU-DE-int     63.0  25.7  35.7
  EN-RU            68.0  51.6  58.5
  DE-RU            54.4  28.9  37.3
  EN,DE-RU-con     67.2  52.2  58.6
  EN,DE-RU-int     78.0  28.1  40.7

  32. Results: baselines + ment

                   P     R     F1
  EN-DE            63.2  50.0  55.7
  RU-DE            41.7  27.0  32.3
  EN,RU-DE-con     62.3  52.7  57.0
  EN,RU-DE-int     71.8  29.1  40.3
  EN-RU            68.4  52.4  58.8
  DE-RU            54.9  29.0  37.6
  EN,DE-RU-con     67.7  52.5  59.0
  EN,DE-RU-int     79.1  28.1  40.8
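For reference, F1 is the harmonic mean of Precision and Recall. Since the reported scores come from a coreference metric computed over entities, the rounded table values need not reproduce the harmonic mean of the rounded P and R exactly; this is only a minimal helper for reading the tables:

```python
# Standard F1: the harmonic mean of Precision and Recall.
def f1(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

# e.g. EN-RU baseline with P = 68.0, R = 51.6:
print(round(f1(68.0, 51.6), 1))  # 58.7
```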

  33. (4) Error analysis 33

  34. Projected markables by type (chart: counts of projected NPs, NEs and pronouns for German and Russian)

  35. Markable accuracy by type (chart: accuracy of NPs, NEs and pronouns for DE, DE+ment, RU and RU+ment; maximum 95.2, minimum 53.4)

  37. Markable accuracy by # of tokens (charts for Russian and German)

  38. (5) Outcomes 38

  39. Outcomes • comparable results for both target languages: the highest Precision of 78.0/79.1 for German/Russian and the highest Recall of 52.7 for both; • multi-source projection outperforms single-source projection in terms of Precision and Recall, although the overall results are only slightly higher; • the different directions of projection are not equally good.

  40. Conclusions • implemented multi-source projection for coreference for the first time and tested several merging strategies • it outperforms single-source projection in Precision and Recall and achieves slightly better overall scores • NPs are more challenging for projection than pronouns; automatic mention extraction supports mention recovery for German.

  41. Future work • experimenting with more sophisticated strategies based upon this study • projection with more than two source languages • projection of automatic annotations & system training

  42. thank you!
