Multilingualism in Linked Data



  1. Multilingualism in Linked Data
     G. Aguado, J. Gracia, A. Gómez-Pérez, E. Montiel-Ponsoda, D. Vila
     Ontology Engineering Group (OEG), Artificial Intelligence Department, Universidad Politécnica de Madrid (UPM)
     W3C Multilingual Web Workshop, Rome, 12-13 March 2013

  2. Foundations: the model, the data, URIs and links
     • Unique identifiers: URIs identify or name a resource
     • RDF(S) models (ontologies) and data
     • Equivalence links to other datasets (Same As)
     Figure (recovered from the flattened slide):
     • Ontology: Person "is creator of" Work, with Same As links to http://iflastandards.info/ns/fr/frbr/frbrer/C1001 and http://iflastandards.info/ns/fr/frbr/frbrer/C1005
     • Data: Cervantes (http://datos.bne.es/resource/XX1718747) "is a" Person; El Quijote (http://datos.bne.es/resource/XX3383563) "is a" Work; Cervantes "is creator of" El Quijote
     • Same As links for Cervantes: http://viaf.org/viaf/17220427 and http://dbpedia.org/resource/Miguel_de_Cervantes
     Asuncion Gomez-Perez Multilingualism in Linked Data. W3C Multilingual Web Workshop. Rome March 2013.
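
The Same As links on this slide can be read as an equivalence relation over URIs. The following is a minimal sketch, not the authors' tooling, that groups URIs connected by such links using a small union-find. The function name `same_as_clusters` is hypothetical, and the assignment of the BNE URI to Cervantes is an assumption read off the order of the flattened slide.

```python
def same_as_clusters(links):
    """Group URIs into clusters connected by (symmetric, transitive) Same As links."""
    parent = {}

    def find(uri):
        # Register unseen URIs as their own root, then walk up with path halving.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    for left, right in links:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    clusters = {}
    for uri in parent:
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())


# URIs taken from the slide; which BNE resource denotes Cervantes is an assumption.
BNE_CERVANTES = "http://datos.bne.es/resource/XX1718747"
VIAF_CERVANTES = "http://viaf.org/viaf/17220427"
DBPEDIA_CERVANTES = "http://dbpedia.org/resource/Miguel_de_Cervantes"

LINKS = [
    (BNE_CERVANTES, VIAF_CERVANTES),
    (BNE_CERVANTES, DBPEDIA_CERVANTES),
]

CLUSTERS = same_as_clusters(LINKS)
```

With these two links, the VIAF and DBpedia URIs end up in the same cluster even though no link connects them directly.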

  3. Sources of information in different languages
     Figure: diverse information sources (library and cultural heritage data, geographical information, sensor networks, Web 2.0 annotation, REST service) feed RDF generation and linking (e.g. shp2RDF), followed by visualisation (geographical Linked Data, library data, sensor data).

  4. Observatory of the Multilingual Web of Data
     • Analysis of BTC (Billion Triple Challenge) datasets
     • 2011
       § Analyzed literals: 1,072,386,405
       § Total literals with lang tag: 116,058,734
       § % Literals with lang tag: 10.822 %
       § % Literals tagged as English: 94.68 %
     • 2012
       § Analyzed literals: 543,933,327
       § Total literals with lang tag: 304,115,676
       § % Literals with lang tag: 55.91 %
       § % Literals tagged as English: 94.44 %
     Figure: chart comparing 2011 and 2012 per language (axis from 1 to 100,000,000); languages shown: English, French, German, English US, Spanish, Rumanian, Swedish, Chinese, Hungarian.
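
As a rough illustration of how figures like these could be gathered, the sketch below counts literals and their language tags in N-Triples or N-Quads lines. It is an assumption-laden simplification, not the Observatory's actual pipeline: it expects one statement per line, uses a regular expression rather than a full parser, and the names `LITERAL_RE`, `language_tag_stats` and `SAMPLE` are hypothetical.

```python
import re
from collections import Counter

# Matches a literal in object position and captures its optional language tag.
LITERAL_RE = re.compile(
    r'"(?:[^"\\]|\\.)*"'                  # quoted lexical form, allowing escapes
    r'(?:@([A-Za-z]+(?:-[A-Za-z0-9]+)*)'  # optional language tag (captured)
    r'|\^\^<[^>]*>)?'                     # or an optional datatype IRI
    r'\s*(?:<[^>]*>\s*)?\.\s*$'           # optional graph IRI, then the final dot
)


def language_tag_stats(lines):
    """Return (literals, tagged literals, % tagged, per-language counts)."""
    total = 0
    tags = Counter()
    for line in lines:
        match = LITERAL_RE.search(line)
        if match is None:
            continue  # the object is an IRI or blank node, not a literal
        total += 1
        if match.group(1):
            tags[match.group(1).lower()] += 1
    tagged = sum(tags.values())
    share = 100.0 * tagged / total if total else 0.0
    return total, tagged, share, tags


# Hypothetical sample data using example.org URIs.
SAMPLE = [
    '<http://example.org/a> <http://www.w3.org/2000/01/rdf-schema#label> "Madrid"@es .',
    '<http://example.org/a> <http://www.w3.org/2000/01/rdf-schema#label> "Madrid"@en .',
    '<http://example.org/a> <http://example.org/population> "3200000"^^<http://www.w3.org/2001/XMLSchema#integer> .',
    '<http://example.org/a> <http://example.org/note> "no tag" .',
    '<http://example.org/a> <http://www.w3.org/2002/07/owl#sameAs> <http://example.org/b> .',
]

STATS = language_tag_stats(SAMPLE)
```

The percentage of English-tagged literals would then follow from the per-language counts divided by the number of tagged literals.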

  5. A motivating example for using multilingual LD [1]
     • Query: "Dame farmacias de guardia en Colonia que tengan Beglan" (*)
     • Datasets involved: Cities (Köln), Medicine catalog (Serevent), German chemists
     (*) Give me the duty chemists in Cologne having Beglan
     [1] J. Gracia, E. Montiel-Ponsoda, P. Cimiano, A. Gómez-Pérez, P. Buitelaar, and J. McCrae, "Challenges for the multilingual Web of Data," Journal of Web Semantics

  6. Multilingualism and the Linked Data Process [2]
     Process steps: Specification, Modelling, RDF Generation, Links Generation, Publication, Exploitation
     • Monolingual or multilingual data resources
     • DB, documents, tables, etc.
     • Linguistic resources: Dictionaries, Lexicons, Thesauri, etc.
     • Ontology (TBox URIs): http://phenomenontology.linkeddata.es/ontology/Municipio, http://iflastandards.info/ns/fr/frbr/frbrer/C1005
     • Data (ABox URIs): http://geo.linkeddata.es/resource/Municipio/Madrid, http://datos.bne.es/resource/XX1718747
     [2] Villazón-Terrazas, B. et al., Methodological Guidelines for Publishing Government Linked Data. In D. Wood, ed. Linking Government Data. Springer.

  7. Multilingualism and the Linked Data Process
     How can we adapt and translate the lexical/terminological layer of an existing ontology into other languages?
     • Ontology Localization Algorithms: multilingual labeling approach, if the languages involved share a single view on a certain domain
     • Cross-lingual Mapping Algorithms: cross-lingual linking approach, if independent monolingual ontologies exist that cover the same or a similar subject domain (problems: conceptualization mismatches, or granularity and viewpoint differences)

  8. Multilingualism and the Linked Data Process
     How to represent multilingual Linked Data?
     § Traditional annotation properties for most cases
       dbpedia:Miguel_de_Cervantes
           rdfs:label "Miguel de Cervantes"@es ,
                      "ミゲル・デ・セルバンテス"@ja ,
                      "미겔 데 세르반테스"@ko .
     § Richer models for more demanding applications: SKOS-XL, LIR, LexInfo
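
To show how an application might consume such language-tagged labels, here is a minimal sketch that picks the label matching a user's language preferences. It is illustrative only: the function `pick_label` and its fallback policy (exact tag, then primary subtag, then the first label available) are assumptions, not something the slides prescribe.

```python
# Labels from the slide, as (text, language tag) pairs.
LABELS = [
    ("Miguel de Cervantes", "es"),
    ("ミゲル・デ・セルバンテス", "ja"),
    ("미겔 데 세르반테스", "ko"),
]


def pick_label(labels, preferred):
    """Return the label for the first matching preferred language, if any."""
    by_lang = {}
    for text, lang in labels:
        # Keep the first label seen for each language tag.
        by_lang.setdefault(lang.lower(), text)

    for lang in preferred:
        lang = lang.lower()
        if lang in by_lang:
            return by_lang[lang]
        primary = lang.split("-")[0]  # e.g. "es-MX" falls back to "es"
        if primary in by_lang:
            return by_lang[primary]

    # Assumed fallback: the first label available, or None if there are none.
    return labels[0][0] if labels else None
```

For example, `pick_label(LABELS, ["es-MX", "en"])` falls back from the regional tag to the Spanish label.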

  9. Main issues of cross-lingual linking
     • How to discover cross-lingual links?
     • How to represent cross-lingual links?
     • How to store and reuse cross-lingual links?

  10. Multilingualism and the Linked Data Process
      How to discover correspondences between ontologies and between LD expressed in different natural languages?
      Figure: two health ontologies and two medicines catalogs, one side in Spanish and one in German, joined by cross-lingual links.
      • Health ontology (Spanish): medicamento, principio activo
      • Health ontology (German): Medikament, Arzneistoff
      • Medicines catalog (Spanish): Beglan, salmeterol, inhalador
      • Medicines catalog (German): Serevent, salmeterol, der Inhalator
      • Catalog entries are connected to the ontologies via rdf:type

  11. Cross-lingual Link Discovery
      1. Projecting the lexical content of the ontology into a common language, then applying traditional ontology matching (OM) techniques
         Diagram: onto1 -> Translation -> onto1'; onto1' and onto2 -> Monolingual OM -> Alignment
      2. Comparing ontology entities directly by means of cross-lingual semantic measures (see CIDER-CL)
         Diagram: onto1 and onto2 -> Cross-lingual OM -> Alignment
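
The first approach (translate, then match monolingually) can be sketched as follows. This is a toy illustration under stated assumptions, not CIDER-CL or any tool named in the slides: the dictionaries `ES_TO_EN` and `DE_TO_EN` are hypothetical stand-ins for a translation service, English is assumed as the common language, and plain string similarity from `difflib` stands in for a real monolingual matcher.

```python
from difflib import SequenceMatcher

# Hypothetical toy dictionaries standing in for a real translation step.
ES_TO_EN = {
    "medicamento": "medicine",
    "principio activo": "active ingredient",
    "inhalador": "inhaler",
}
DE_TO_EN = {
    "Medikament": "medicine",
    "Arzneistoff": "active ingredient",
    "der Inhalator": "the inhaler",
}


def project(labels, dictionary):
    """Map each original label to its (lowercased) pivot-language translation."""
    return {label: dictionary.get(label, label).lower() for label in labels}


def align(labels1, dict1, labels2, dict2, threshold=0.7):
    """Link each label of ontology 1 to its most similar label in ontology 2."""
    projected1 = project(labels1, dict1)
    projected2 = project(labels2, dict2)
    links = []
    for label1, text1 in projected1.items():
        scored = [
            (SequenceMatcher(None, text1, text2).ratio(), label2)
            for label2, text2 in projected2.items()
        ]
        if not scored:
            continue
        score, label2 = max(scored)
        if score >= threshold:  # the threshold value is an arbitrary assumption
            links.append((label1, label2))
    return links


LINKS = align(list(ES_TO_EN), ES_TO_EN, list(DE_TO_EN), DE_TO_EN)
```

The second approach on the slide would replace the translation step and string similarity with a cross-lingual semantic measure applied directly to the original labels.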

  12. Cross-lingual Link Storage and Reuse
      • Links can be discovered:
        § at runtime -> need for scalable techniques
        § offline -> need for storage methods
      • Storage
        § Following Linked Data principles
        § Links can be stored jointly with some of the data sources that they relate (e.g., during LD generation)
        § Links can be stored in separate repositories to be accessed by semantic applications (e.g., for CL-Question Answering)
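
One simple way to keep offline-discovered links in a separate store is to serialise them as a standalone N-Triples link set. The sketch below assumes `owl:sameAs` as the link predicate (the slides do not prescribe one), uses hypothetical example.org URIs, and reads the link set back with naive whitespace splitting that only works for IRI-to-IRI statements.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def links_to_ntriples(links, predicate=OWL_SAME_AS):
    """Serialise (subject, object) URI pairs as N-Triples, one link per line."""
    return "".join(
        f"<{subject}> <{predicate}> <{obj}> .\n"
        for subject, obj in sorted(set(links))
    )


def ntriples_to_links(text):
    """Read back IRI-to-IRI statements written by links_to_ntriples."""
    links = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[3] == ".":
            # Strip the angle brackets around subject and object.
            links.append((parts[0][1:-1], parts[2][1:-1]))
    return links


# Hypothetical cross-lingual links between two example catalogs.
LINKS = [
    ("http://example.org/es/inhalador", "http://example.org/de/Inhalator"),
    ("http://example.org/es/medicamento", "http://example.org/de/Medikament"),
]

LINK_SET = links_to_ntriples(LINKS)
```

Keeping the link set in its own file or repository lets applications load it independently of the datasets it relates.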

  13. Multilingualism and the Linked Data Process
      How can a user pose questions in their own language to be processed against the web of Linked Data?
      Example: "farmacia", "Colonia" -> semantic query
      1. Multilingual query interpretation
      2. Query federation, ...
      How should the results of a semantic query be adapted to the linguistic and cultural background of a user?
      1. Adaptation and localization of user interfaces
      2. Natural language generation
      3. Presentation views tailored to specific linguistic and cultural contexts

  14. Services for the Multilingual Web of Data
      Figure elements:
      • Users
      • Services for cross-lingual access
      • Linked Data, multilingual mappings, multilingual linguistic information
      • Services for cross-lingual linkage
      • Services for generating multilingual Linked Data
      • Services for translation and ontology localization, ...
      • Data silos

  15. Thanks for your attention!

  16. Research agenda on Multilingual LD at OEG
      • Ontology lexica representation: Elena Montiel, Lupe Aguado
      • Lexico-syntactic patterns: Elena Montiel, Lupe Aguado
      • Ontology localisation (translation): Elena Montiel, Jorge Gracia, Asun Gómez-Pérez
      • Exploratory analysis of the Multilingual Web of Data: Daniel Vila, Asun Gómez-Pérez, Jorge Gracia
      • Cross-lingual ontology and instance matching: Jorge Gracia, Daniel Vila
      • Query federation: Oscar Corcho
