1 SemEval 2019 Task 1: Cross-lingual Semantic Parsing with UCCA Daniel Hershcovich, Leshem Choshen, Elior Sulem, Zohar Aizenbud, Ari Rappoport and Omri Abend June 6, 2019
2 Universal Conceptual Cognitive Annotation (UCCA) Cross-linguistically applicable semantic representation (Abend and Rappoport, 2013). Builds on Basic Linguistic Theory (R. M. W. Dixon). Stable in translation (Sulem et al., 2015). [Figure: UCCA graphs for the English sentence "After graduation, John moved to Paris" and its Hebrew counterpart "אחרי שסיים את הלימודים, ג'ון עבר לפריז" ("After he finished his studies, John moved to Paris"), with edge labels L, H, P, A, D, F, C and R.]
5 Applications Semantics-based evaluation of: • Machine translation (Birch et al., 2016) • Text simplification (Sulem et al., 2018a) • Grammatical error correction (Choshen and Abend, 2018) • Sentence splitting for text simplification (Sulem et al., 2018b). [Figure: UCCA graphs for "He gave John an apple" and the ungrammatical variant "He gve an apple for john".]
6 Universal Conceptual Cognitive Annotation (UCCA) Intuitive annotation interface and guidelines (Abend et al., 2017). ucca-demo.cs.huji.ac.il
7 Universal Conceptual Cognitive Annotation (UCCA) The Task: UCCA parsing in English, German and French in different domains.
8 Graph Structure Labeled directed acyclic graphs (DAGs). Complex units are non-terminal nodes. Phrases may be discontinuous. Remote edges enable reentrancy. Edge labels: A Participant, C Center, D Adverbial, E Elaborator, F Function, G Ground, H Parallel scene, L Linker, P Process, R Relator, S State, U Punctuation. Solid line: primary edge; dashed line: remote edge. [Figure: UCCA graph for "They thought about taking a short break".]
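The structural properties listed on the Graph Structure slide (non-terminal units, discontinuous phrases, reentrancy through remote edges) can be sketched in a few lines of Python. This is a minimal illustration, not the UCCA toolkit's data model: the `Node` class, the helper names and the exact edge labels assigned to the example sentence are assumptions made for the sketch, loosely following the slide's figure.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Node:
    # Token index for terminals; None for non-terminal units.
    position: Optional[int] = None
    # Outgoing edges as (label, child, is_remote).
    edges: List[Tuple[str, "Node", bool]] = field(default_factory=list)

    def add(self, label, child, remote=False):
        self.edges.append((label, child, remote))
        return child


def primary_yield(node):
    """Token positions reachable from node through primary edges only."""
    if node.position is not None:
        return {node.position}
    positions = set()
    for _label, child, remote in node.edges:
        if not remote:
            positions |= primary_yield(child)
    return positions


def is_discontinuous(node):
    span = primary_yield(node)
    return max(span) - min(span) + 1 != len(span)


def incoming(root, target):
    """Count all edges (primary and remote) that point at target."""
    count, stack, seen = 0, [root], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for _label, child, _remote in node.edges:
            if child is target:
                count += 1
            stack.append(child)
    return count


# Hand-built approximation of the slide's example (labels are illustrative).
tokens = "They thought about taking a short break".split()
they, thought, about, taking, a, short, brk = (
    Node(position=i) for i in range(len(tokens))
)
root = Node()
root.add("A", they)
root.add("P", thought)
sub = root.add("A", Node())           # embedded scene
sub.add("R", about)
sub.add("A", they, remote=True)       # remote edge: "They" is reused
proc = sub.add("P", Node())           # "taking a ... break"
sub.add("D", short)
proc.add("F", taking)
unit = proc.add("C", Node())
unit.add("F", a)
unit.add("C", brk)

print(sorted(primary_yield(proc)))    # token positions of "taking a ... break"
print(is_discontinuous(proc))         # the unit skips over "short"
print(incoming(root, they))           # primary + remote edge into "They"
```

Because remote edges are excluded from the yield, the graph stays a tree over primary edges, while the second incoming edge into "They" is what makes the full structure a DAG.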
11 Baseline TUPA, a transition-based UCCA parser (Hershcovich et al., 2017). bit.ly/tupademo [Figure: TUPA architecture, with LSTM layers over the tokens "They thought about taking a short break" feeding an MLP that predicts the next transition (here, Node C), alongside the partially built graph.]
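To make "transition-based" concrete, here is a deliberately simplified stack-and-buffer sketch. It is not TUPA's actual transition set or implementation: the `State` class, the four transitions shown and the hand-written action sequence are assumptions for illustration. In the real parser, a classifier (the LSTM and MLP in the slide's figure) chooses each transition, and the inventory is larger, for example to handle remote edges.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class State:
    buffer: List[str]
    stack: List[str] = field(default_factory=list)
    # Edges built so far, as (parent, label, child).
    edges: List[Tuple[str, str, str]] = field(default_factory=list)
    n_nodes: int = 0

    def shift(self):
        """Move the first buffer item onto the stack."""
        self.stack.append(self.buffer.pop(0))

    def reduce(self):
        """Discard the stack top once it needs no more edges."""
        self.stack.pop()

    def node(self, label):
        """Create a new parent of the stack top and put it on the buffer."""
        self.n_nodes += 1
        parent = f"n{self.n_nodes}"
        self.edges.append((parent, label, self.stack[-1]))
        self.buffer.insert(0, parent)

    def right_edge(self, label):
        """Add an edge from the second stack item to the stack top."""
        self.edges.append((self.stack[-2], label, self.stack[-1]))


# Hand-written action sequence that builds one unit over "a break".
state = State(buffer=["a", "break"])
state.shift()           # stack: [a]
state.node("F")         # n1 -F-> a; buffer: [n1, break]
state.reduce()          # stack: []
state.shift()           # stack: [n1]
state.shift()           # stack: [n1, break]
state.right_edge("C")   # n1 -C-> break
print(state.edges)
```

Creating the parent node before its remaining children are seen is what lets a transition system of this kind build non-terminal units incrementally, left to right.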
12 Data
• English Wikipedia articles (Wiki).
• English-French-German parallel corpus from Twenty Thousand Leagues Under the Sea (20K).
Corpus | sentences | tokens
English-Wiki | 5,142 | 158,573
English-20K | 492 | 12,574
French-20K | 492 | 12,954
German-20K | 6,514 | 144,531
13 Tracks • English { in-domain/out-of-domain } × { open/closed } • German in-domain { open/closed } • French low-resource (only 15 training sentences)
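As a quick check on how the track definitions combine, the sketch below enumerates them. The tuple layout is an assumption for illustration, and the French track is taken to be open-only, as the leaderboard's "French-20K open" entry suggests.

```python
from itertools import product

# English: every combination of domain and setting.
tracks = [
    ("English", domain, setting)
    for domain, setting in product(["in-domain", "out-of-domain"], ["open", "closed"])
]
# German: in-domain only, both settings.
tracks += [("German", "in-domain", setting) for setting in ("open", "closed")]
# French: a single low-resource track (assumed open-only).
tracks.append(("French", "low-resource", "open"))

for track in tracks:
    print(track)
print(len(tracks), "tracks in total")
```

The total of seven matches the "6/7 tracks" figure quoted in the Main Findings slide.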
14 Conversion [Figure: the sentence "After graduation, John moved to Paris" shown in the SDP, CoNLL-U and AMR formats, each marked with ⇔ for two-way conversion.]
15 Evaluation
1. Match primary edges by terminal yield + label.
2. Calculate precision, recall and F1 scores.
3. Repeat for remote edges.
Primary: P = 6/9 = 67%, R = 6/10 = 60%, F1 = 64%
Remote: P = 1/2 = 50%, R = 1/1 = 100%, F1 = 67%
[Figure: true (human-annotated) graph and automatically predicted graph for the same text, "After graduation, John moved to Paris".]
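The three evaluation steps can be sketched as follows. This is a minimal illustration, not the official scorer: the function names and the toy edge lists are assumptions, and each edge is reduced to its label plus the set of token positions it covers. The last two calls plug in the counts from the slide. Computed directly from those counts, the primary F1 is 12/19, about 63.2%, marginally below the 64% printed on the slide; the remote F1 of 67% is reproduced exactly.

```python
from collections import Counter


def prf(n_matched, n_predicted, n_gold):
    """Precision, recall and F1 from raw edge counts."""
    p = n_matched / n_predicted if n_predicted else 0.0
    r = n_matched / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def evaluate(predicted, gold):
    """Match edges by (label, terminal yield), then score."""
    pred = Counter((label, frozenset(span)) for label, span in predicted)
    ref = Counter((label, frozenset(span)) for label, span in gold)
    matched = sum((pred & ref).values())  # multiset intersection
    return prf(matched, sum(pred.values()), sum(ref.values()))


# Toy edges over three tokens (hypothetical, not the graphs on the slide).
gold = [("A", {0}), ("P", {1}), ("A", {2}), ("H", {0, 1, 2})]
predicted = [("A", {0}), ("S", {1}), ("A", {2}), ("H", {0, 1, 2})]
print(evaluate(predicted, gold))   # the P/S label mismatch costs one edge

# Counts taken from the slide.
print(prf(6, 9, 10))   # primary edges
print(prf(1, 2, 1))    # remote edges
```

Running the same function separately on primary and remote edges corresponds to step 3 on the slide.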
17 Participating Systems 8 groups in total: • MaskParse@Deskiñ: Orange Labs, Aix-Marseille University • HLT@SUDA: Soochow University • TüPa: University of Tübingen • UC Davis: University of California, Davis • GCN-Sem: University of Wolverhampton • CUNY-PekingU: City University of New York, Peking University • DANGNT@UIT.VNU-HCM: University of Information Technology VNU-HCM • XLangMo: Zhejiang University
18 Leaderboard
Track | 1st place | 2nd place | 3rd place | baseline
English-Wiki closed | HLT@SUDA 0.774 | baseline 0.728 | Davis 0.722 | 0.728
English-Wiki open | HLT@SUDA 0.805 | CUNY-PekingU 0.800 | TüPa 0.735 | 0.735
English-20K closed | HLT@SUDA 0.727 | baseline 0.672 | CUNY-PekingU 0.669 | 0.672
English-20K open | HLT@SUDA 0.767 | CUNY-PekingU 0.739 | TüPa 0.709 | 0.684
German-20K closed | HLT@SUDA 0.832 | CUNY-PekingU 0.797 | baseline 0.731 | 0.731
German-20K open | HLT@SUDA 0.849 | CUNY-PekingU 0.841 | baseline 0.791 | 0.791
French-20K open | CUNY-PekingU 0.796 | HLT@SUDA 0.752 | XLangMo 0.656 | 0.487
20 Main Findings
• HLT@SUDA won 6/7 tracks: neural constituency parser + multi-task + BERT. French: trained on all languages, with language embedding.
• CUNY-PekingU won the French (open) track: TUPA ensemble + synthetic data by machine translation.
• Surprisingly, results in French were close to English and German.
• Demonstrates viability of cross-lingual UCCA parsing.
• Is this because of UCCA’s stability in translation?
23 Conclusion • Substantial improvements to UCCA parsing • High variety of methods • Successful cross-lingual transfer. Thanks! Annotators, organizers, participants. Daniel Hershcovich, Leshem Choshen, Elior Sulem, Zohar Aizenbud, Ari Rappoport and Omri Abend. Please participate in the CoNLL 2019 Shared Task: Cross-Framework Meaning Representation Parsing (SDP, EDS, AMR and UCCA). mrp.nlpl.eu | Evaluation Period: July 8–22, 2019