Towards Deep Universal Dependencies
Kira Droganova, Daniel Zeman
{droganova,zeman}@ufal.mff.cuni.cz
http://universaldependencies.org/
Daniel Zeman (ÚFAL MFF UK), Towards Deep UD, Paris, 28.8.2019

Multiple Layers of Dependencies
Form
Surface syntax
Deep syntax
Semantics
Meaning

Meaning-Text Theory
[Figure: two MTT structures for the Spanish example below, one with relations I, II, ATTR and coref, the other with A1, A2, NUMBER (SG, PL) and TENSE (PRES) under a ROOT node.]
documento proponer este contrato afectar persona persona engrosar lista paro
document suggest this contract affect person person enlarge list unemployed
“The document suggests that this contract affect the persons who make the unemployment lists swell.”

Functional Generative Description (Prague Tectogrammatics)
[Figure: surface-syntax tree with labels Pred, Sb, Obj, Obj.m, Atr, Adv, Pnom, Apos, Coord.m, AuxC, AuxP, AuxY, and tectogrammatical tree with labels root, ACT, PAT, ADDR.m, BEN, RSTR, EXT, APPS, CONJ.m and a coref.gram link.]
Sentence: A similar technique is almost impossible to apply to other crops such as cotton, soybean and rice
Tectogrammatical nodes: similar technique be almost possible #Benef apply #Gen other crop such_as cotton soybean and rice

Proposition Banks
[Figure: arcs labeled ARG0, ARG1 and ARG-TMP over the bracketed sentence below, plus a coref link.]
{The … company} said { 0 it expects { -1 to obtain {reg. approval} and complete {the trans.} {by year-end}}}
“The thrift holding company said it expects to obtain regulatory approval and complete the transaction by year-end.”

Sequoia
[Figure: dependency graph with labels suj:suj, suj:obj, obj:obj, ats:ato, mod, det, aux.pass over the French example below.]
Le lot gros œuvre devra probablement être déclaré infructueux
The package structural works will have to probably be declared unsuccessful
“The structural system package should probably be declared unsuccessful.”

Universal Dependencies
[Figure: two dependency structures over the sentence below. First: conj, xcomp, obl, orphan, nsubj, mark, case, cc. Second: nsubj, mark, case, xcomp, obl:to, cc, conj.]
Kate wants to go to Florida and Jane (wants) (go) to Europe
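
The second structure on the slide above adds two empty nodes, (wants) and (go), so that Jane and Europe get ordinary relations instead of orphan. A minimal sketch of how such a graph could be read from the DEPS column of a CoNLL-U file follows. The token IDs, the empty-node IDs 8.1 and 8.2, and every attachment in `ROWS` are a hypothetical reconstruction from the slide's labels, not data taken from a released treebank; `parse_deps` and `heads_of` are illustrative helper names.

```python
def parse_deps(deps):
    """Split a CoNLL-U DEPS value into (head_id, relation) pairs."""
    if deps in ("", "_"):
        return []
    pairs = []
    for item in deps.split("|"):
        # Split on the first colon only: relations such as obl:to contain one too.
        head, rel = item.split(":", 1)
        pairs.append((head, rel))
    return pairs


# (ID, FORM, DEPS): assumed enhanced annotation of the slide's sentence.
ROWS = [
    ("1", "Kate", "2:nsubj|4:nsubj"),
    ("2", "wants", "0:root"),
    ("3", "to", "4:mark"),
    ("4", "go", "2:xcomp"),
    ("5", "to", "6:case"),
    ("6", "Florida", "4:obl:to"),
    ("7", "and", "8.1:cc"),
    ("8", "Jane", "8.1:nsubj|8.2:nsubj"),
    ("8.1", "wants", "2:conj"),    # empty node (elided predicate)
    ("8.2", "go", "8.1:xcomp"),    # empty node (elided predicate)
    ("9", "to", "10:case"),
    ("10", "Europe", "8.2:obl:to"),
]


def heads_of(token_id, rows=ROWS):
    """Return all enhanced (head_id, relation) pairs of one token."""
    for tid, _form, deps in rows:
        if tid == token_id:
            return parse_deps(deps)
    raise KeyError(token_id)


if __name__ == "__main__":
    for tid, form, deps in ROWS:
        print(tid, form, parse_deps(deps))
```

Unlike the basic tree, a token may have more than one head here: under these assumptions Jane is a subject of both empty nodes.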

Multilingual Annotation
MTT: Russian, English, Spanish, French
FGD: Czech, English
PropBank: English, Arabic, Chinese, Finnish, Hindi, Urdu, Persian, Portuguese, Turkish, German, French
AMR: English, Chinese, Portuguese, Korean, Vietnamese, Spanish, French, German
Sequoia: French
Enhanced UD: Arabic, Bulgarian, Czech, Dutch, English, Estonian, Finnish, Italian, Latvian, Lithuanian, Polish, Russian, Slovak, Swedish, Tamil, Ukrainian

Basic Universal Dependencies: 82 Languages and Growing
I.-E.: Armenian, Ancient Greek, Greek, Breton, Irish, Welsh
  ◮ Germanic: Afrikaans, Danish, Dutch, English, Faroese, German, Gothic, Norwegian, Swedish
  ◮ Romance: Catalan, French, Galician, Italian, Latin, Old French, Portuguese, Romanian, Spanish
  ◮ Balto-Slavic: Belarusian, Bulgarian, Croatian, Czech, Church Slavonic, Old Russian, Polish, Russian, Serbian, Slovak, Slovenian, Ukrainian, Upper Sorbian, Latvian, Lithuanian
  ◮ Indo-Iranian: Kurmanji, Persian, Hindi, Marathi, Sanskrit, Urdu
Uralic: Erzya, Estonian, Finnish, Hungarian, Karelian, Komi, Sámi
Dravidian: Tamil, Telugu
Turkic: Kazakh, Turkish, Uyghur
Af.-As.: Akkadian, Amharic, Arabic, Assyrian, Coptic, Hebrew, Maltese
Sino-Tibetan: Cantonese, Classical Chinese, Chinese; Aus.-As.: Vietnamese
Tai-Kadai: Thai; Austronesian: Indonesian, Tagalog
Other: Buryat, Japanese, Korean, Basque, Sw. Sign, Naija, Bambara, Wolof, Yoruba, Warlpiri, Mbyá Guaraní

Two-Speed Approach
Automatic part: derived from basic UD
Optional manual extras

A Mountain of Work
[Image credit: Pear Blossom, CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)]

A Mountain of Work
Work in progress
  ◮ Only the automatic part now
WANTED: Feedback

A Mountain of Work
Automatic part: derived from basic UD
Work in progress
Optional manual extras
WANTED: Feedback

Automatic vs. Manual Annotation
Automatically derived from UD:
  ◮ Enhanced Universal Dependencies
      ⋆ Grammatical coreference (partially)
      ⋆ Ellipsis: gapping
  ◮ Normalize syntactic alternations (cf. Candito et al. 2017)
  ◮ Ellipsis: pro-drop
Manual or with extra resources:
  ◮ Frames, semantic roles
  ◮ Textual coreference
  ◮ Everything else…
  ◮ … and improve the automatic part above

Normalization of Syntactic Alternations
[Figure: four example sentences, each with UD relations above and deep argument labels below, listed in word order.]
He killed the dragon
  UD relations: root, nsubj, obj, det
  Deep arguments: ARG1, ARG2
The dragon was killed by him
  UD relations: root, nsubj:pass, obl:agent, det, aux:pass, case
  Deep arguments: ARG2, ARG1
She made him kill the dragon
  UD relations: root, xcomp, obj, nsubj, obj, det
  Deep arguments: ARG1, ARG2
The dragon that was killed
  UD relations: root, acl:relcl, nsubj:pass, det, aux:pass
  Deep arguments: ARG2
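
The idea behind the normalization slide can be sketched as a lookup from voice-sensitive UD relations to voice-neutral argument labels. This is only an illustration under stated assumptions, not the authors' implementation: the table `ROLE_OF` contains just the four relations visible in the active and passive examples, the edge lists are hand-written readings of those two sentences, and `normalize` is a hypothetical helper name. The causative and relative-clause examples would need additional rules that are not covered here.

```python
# Assumed mapping, read off the slide's active and passive examples only.
ROLE_OF = {
    "nsubj": "ARG1",       # active subject
    "obj": "ARG2",         # active object
    "nsubj:pass": "ARG2",  # passive subject is the underlying object
    "obl:agent": "ARG1",   # passive agent is the underlying subject
}


def normalize(edges):
    """Map (head, dependent, relation) edges to (head, dependent, ARGn).

    Relations outside the table (det, aux:pass, case, ...) are dropped.
    """
    return [(h, d, ROLE_OF[r]) for h, d, r in edges if r in ROLE_OF]


# Hand-written edges for "He killed the dragon".
ACTIVE = [
    ("killed", "He", "nsubj"),
    ("killed", "dragon", "obj"),
    ("dragon", "the", "det"),
]

# Hand-written edges for "The dragon was killed by him".
PASSIVE = [
    ("killed", "dragon", "nsubj:pass"),
    ("killed", "him", "obl:agent"),
    ("killed", "was", "aux:pass"),
    ("him", "by", "case"),
    ("dragon", "The", "det"),
]

if __name__ == "__main__":
    print(normalize(ACTIVE))
    print(normalize(PASSIVE))
```

With this table both sentences yield the same labels for the same participants: the dragon is ARG2 of killed in either voice, and the killer is ARG1.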