VoxEL: A Benchmark Dataset for Multilingual Entity Linking † Henry Rosales-Méndez, Aidan Hogan and Barbara Poblete University of Chile {hrosales,ahogan,bpoblete}@dcc.uchile.cl October 10, 2018 † ISWC 2018 - The 17th International Semantic Web Conference
Example
Example - Entity Recognition
Example - Entity Disambiguation
Applications • Semantic Search • Semantic Annotations • Relation Extraction • Topic Extraction
Name Variations in Entity Linking Michael Joseph Jackson Michael J. Jackson King of Pop
Name Variations in Entity Linking Michael Jackson
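Name variation (the same entity appearing under many surface forms, as in the Michael Jackson example above) is typically handled by a candidate-generation step mapping surface forms to KB identifiers. A minimal sketch, assuming a small hand-written alias dictionary (illustrative data, not from any real KB dump):

```python
# Toy candidate generation for entity linking.
# The alias dictionary is hypothetical illustration data.
ALIASES = {
    "michael joseph jackson": "dbpedia:Michael_Jackson",
    "michael j. jackson": "dbpedia:Michael_Jackson",
    "king of pop": "dbpedia:Michael_Jackson",
    "michael jackson": "dbpedia:Michael_Jackson",
}

def candidates(surface_form):
    """Map a mention's surface form to a KB identifier, if known."""
    return ALIASES.get(surface_form.strip().lower())

print(candidates("King of Pop"))  # dbpedia:Michael_Jackson
```

Real systems generate ranked candidate sets from Wikipedia anchor texts and redirects rather than a flat dictionary; the disambiguation step then picks among candidates using context.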
Multilingual Entity Linking - English
Multilingual Entity Linking - Italian
Multilingual Entity Linking - Spanish
Datasets
Goals 1 Create a benchmark dataset for multilingual Entity Linking
Curated source: VoxEurop
Example - Any other entity?
Example annotations produced by four EL systems: Babelfy, Aida, TagME, DBpedia Spotlight
• What should Entity Linking link?
Goals 1 Create a benchmark dataset for multilingual Entity Linking 2 Create two versions of the dataset: strict and relaxed.
Strict version: class-based definition
Relaxed version: Knowledge Base definition
Creation of the VoxEL dataset • Based on curated text in five languages. • The same sentences in each corresponding document. • The same annotations for each corresponding sentence. • Manual revision process.
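The cross-lingual alignment just described (same sentences per document, same annotations per sentence) can be pictured as parallel records keyed by document and sentence index. A minimal sketch with invented sample data (the document IDs, texts and links below are purely illustrative):

```python
# Sketch of an aligned multilingual annotation record.
# Every language version of a sentence carries the same KB targets.
aligned = {
    ("doc1", 0): {
        "EN": {"text": "Berlin is the capital of Germany.",
               "annotations": [("Berlin", "dbpedia:Berlin")]},
        "DE": {"text": "Berlin ist die Hauptstadt Deutschlands.",
               "annotations": [("Berlin", "dbpedia:Berlin")]},
    },
}

# Alignment invariant: the set of linked entities is identical
# across all language versions of a sentence.
targets = {tuple(e for _, e in v["annotations"])
           for v in aligned[("doc1", 0)].values()}
assert len(targets) == 1
```

This invariant is what makes scores directly comparable across languages: any cross-language difference in system output is attributable to the system, not to the gold standard.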
Summary
Experiments 1 GERBIL Evaluation of state-of-the-art approaches
Experiments — [Figure: average Micro-F1 per language (DE, EN, ES, FR, IT) for Babelfy, TAGME, THD, DBpedia Spotlight and FREME; (a) results on the Relaxed version of VoxEL, (b) results on the Strict version of VoxEL]
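GERBIL reports micro-averaged F1, which the figure above averages over the five languages. Micro-averaging pools true/false positives and false negatives over all documents before computing precision and recall, so larger documents weigh more. A minimal sketch of the computation (the example counts are invented):

```python
def micro_f1(counts):
    """counts: iterable of (tp, fp, fn) tuples, one per document.
    Pools counts over all documents, then computes F1."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(round(micro_f1([(8, 2, 2), (4, 1, 3)]), 2))  # 0.75
```

Macro-averaging, by contrast, would compute F1 per document and average the scores, giving every document equal weight.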
Experiments 1 GERBIL Evaluation of state-of-the-art approaches 2 Evaluate the performance of state-of-the-art approaches using machine translation.
Experiments — [Diagram: input text in EN, ES, FR, IT or DE is machine-translated to match the language of the system configuration (FR, DE, EN, ES, IT)]
Experiments — [Figure: average Micro-F1, calibrated setting vs. translation to English, for Babelfy, DBpedia Spotlight, FREME, TAGME and THD; (a) results on the Relaxed version of VoxEL, (b) results on the Strict version of VoxEL]
Conclusion — Our main contribution is VoxEL (https://dx.doi.org/10.6084/m9.figshare.6539675) • Most systems perform (much) better for English. • Machine translation could be an option for addressing multilingual settings in Entity Linking.
Poster P20: Machine Translation vs. Multilingual Approaches for Entity Linking