Topic Maps Extraction in Oveia: Specification and Processing

Giovani R. Librelotto, José Carlos Ramalho, Pedro R. Henriques
Department of Informatics, University of Minho, Portugal
GRLibrelotto & JCRamalho & PRHenriques, CLEI’04, September 2004

Motivation
• Suppose you have an information system with several heterogeneous data resources:
  – Relational databases, XML documents, etc.
• You want to achieve semantic interoperability between those data resources;
• You want to do it fast.

Motivation
• Ontologies are a good approach to overcoming the problem of semantic heterogeneity;
• This supports the usefulness of Topic Maps;
• However, tools to build Topic Maps are crucial, because creating a Topic Map is a hard task.

Index
• Basic Concepts
• Our approach
• Inside Oveia
• Case Study
• Conclusion

Ontology
• The branch of metaphysics concerned with existence and the nature of being.

Ontology
• An ontology is just a set of words and relationships that formally describes a universe of discourse or context.
[Figure: example ontology graph; recoverable labels: country, Five times, Pelé, Brasil, The best of, player, sport, Plays, World Champion, Football, Game, played, Most popular sport]

Ontology Specification
• Specification standards:
  – RDF(S): Resource Description Framework (Schema)
  – DAML+OIL: DARPA Agent Markup Language + Ontology Inference Layer
  – OWL: Web Ontology Language
  – XTM: XML Topic Maps (our choice)

Topic Maps
• “Topic maps are a new ISO standard for describing knowledge structures and associating them with information resources”
  The TAO of Topic Maps, Steve Pepper, 05-2000
• Topics
• Associations
• Occurrences
• However, creating a real Topic Map takes too much work.

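The three building blocks named on the Topic Maps slide (topics, associations, occurrences) can be pictured as a small data model. The following is only a minimal sketch in Python, not Oveia's internal representation: the class names, the `associate` helper, and the example URL are assumptions made for illustration, and the "plays" relation is borrowed loosely from the labels of the ontology figure.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A subject of discourse, identified by an id and a display name."""
    topic_id: str
    base_name: str
    # Occurrences link the topic to information resources (here: plain URLs).
    occurrences: list[str] = field(default_factory=list)


@dataclass
class Association:
    """A typed relationship between two or more topics."""
    assoc_type: str
    members: tuple[str, ...]  # ids of the topics taking part


@dataclass
class TopicMap:
    topics: dict[str, Topic] = field(default_factory=dict)
    associations: list[Association] = field(default_factory=list)

    def add_topic(self, topic: Topic) -> None:
        self.topics[topic.topic_id] = topic

    def associate(self, assoc_type: str, *members: str) -> None:
        # Refuse associations that mention topics not yet in the map.
        missing = [m for m in members if m not in self.topics]
        if missing:
            raise KeyError(f"unknown topics: {missing}")
        self.associations.append(Association(assoc_type, tuple(members)))


# Hypothetical content; the URL is a placeholder, not a real resource.
tm = TopicMap()
tm.add_topic(Topic("pele", "Pelé", ["http://example.org/pele.html"]))
tm.add_topic(Topic("football", "Football"))
tm.associate("plays", "pele", "football")
```

Even this tiny map needs one call per topic and per association, which hints at why building a real Topic Map by hand is so laborious.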
Ontology Support
• 94 tools and similar environments support the creation, use, and maintenance of ontologies
  – Ontology Tools Survey, Revisited, by Michael Denny, July 14, 2004, www.xml.com
• However, none of them creates Topic Maps automatically.

Metamorphosis
[Figure: Metamorphosis architecture; recoverable component labels: Metadata Extractor and Ontology Builder, Ontology Validator, Ontology Navigator]

Oveia
• A Topic Maps extractor for heterogeneous information systems, composed of two engines:
  – Metadata Extractor: collects pieces of information and stores them in an intermediate representation;
  – Ontology Builder: uses a specification to transform the intermediate representation into an ontology according to the Topic Maps standard.

Oveia
[Figure: Oveia = Metadata Extractor + Ontology Builder]

Metadata Extractor
• XSDS (XML Specification of Data Sources)
• Supports different kinds of sources (relational databases, XML files, …)
• Uses a driver for each data source
• Creates an intermediate representation (called Dataset)

Extractor Specification
    <resources>
      <datasources>
        <datasource extratorDriver="br.uneb.dcet.tmbuilder.drivers.DataBase" name="xata2004">
          <parameter name="connectionURL">jdbc:mysql://localhost/XATA2004</parameter>
          <parameter name="password"/>
          <parameter name="user">root</parameter>
          <parameter name="jdbcDriver">org.gjt.mm.mysql.Driver</parameter>
        </datasource>
      </datasources>
      <datasets>
        <dataset name="Authors" database="xata2004">
          SELECT code, name, url FROM author-table
        </dataset>
        <dataset name="Papers" database="xata2004">
          SELECT code, title FROM paper-table
        </dataset>
        ...
      </datasets>
    </resources>

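To make the extraction step concrete, here is a rough sketch of how an XSDS-like document could drive the creation of datasets. It is not the Oveia implementation: the function `extract_datasets` is hypothetical, an in-memory SQLite database stands in for the MySQL source and its Java driver, the driver parameters are omitted, and the table names are written with underscores (`author_table`, `paper_table`) so that the queries run unquoted.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Simplified XSDS-like document (assumption: same element names as on the
# slide, with driver parameters left out and underscore table names).
XSDS = """
<resources>
  <datasources>
    <datasource name="xata2004"/>
  </datasources>
  <datasets>
    <dataset name="Authors" database="xata2004">
      SELECT code, name, url FROM author_table
    </dataset>
    <dataset name="Papers" database="xata2004">
      SELECT code, title FROM paper_table
    </dataset>
  </datasets>
</resources>
"""


def extract_datasets(xsds_text, connections):
    """Run each <dataset> query; return {dataset name: list of row dicts}."""
    root = ET.fromstring(xsds_text.strip())
    datasets = {}
    for ds in root.iter("dataset"):
        # Pick the open connection registered for this dataset's database.
        conn = connections[ds.get("database")]
        cursor = conn.execute(ds.text.strip())
        columns = [col[0] for col in cursor.description]
        datasets[ds.get("name")] = [dict(zip(columns, row)) for row in cursor]
    return datasets


# Stand-in data source: an in-memory SQLite database with made-up rows.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE author_table (code TEXT, name TEXT, url TEXT)")
db.execute("CREATE TABLE paper_table (code TEXT, title TEXT)")
db.execute("INSERT INTO author_table VALUES ('a1', 'Ana', 'http://example.org/ana')")
db.execute("INSERT INTO paper_table VALUES ('p1', 'A sample paper')")

datasets = extract_datasets(XSDS, {"xata2004": db})
```

Each resulting dataset is a plain table (rows of named columns), which matches the intermediate representation the slides describe as the extractor's output.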
Datasets
• An intermediate representation;
• Contains all data extracted from the information resources;
• Is the input to the XS4TM processor;
• Data is stored in table format:
  – Rows × columns

Ontology Builder
• XS4TM (XML Specification for Topic Maps)
  – Ontology extraction specification
• XTM becomes a subset of XS4TM
• XS4TM has 2 parts:
  – Abstract Structure
  – Instances (catalog)

OntoBuilder Specification
    <instances>
      <topic dataset="Categorias">
        <instanceOf>
          <topicRef xlink:href="#Categorias"/>
        </instanceOf>
        <baseName>
          <baseNameString>@Categorias.Descricao</baseNameString>
        </baseName>
      </topic>
      ...
    </instances>
Slide annotation: "Reference to the extracted dataset"

XSDS x XS4TM
[Figure: diagram only; no text recoverable]

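One plausible reading of the OntoBuilder specification is that each `<topic dataset="...">` element acts as a template that is instantiated once per dataset row, with `@Dataset.Column` replaced by the value of that column. The sketch below illustrates that reading only. The slides do not describe the XS4TM processor's algorithm, so the function `expand_instances`, the removal of the `dataset` attribute from the output, and the sample rows are all assumptions.

```python
import copy
import re
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
TEMPLATE = f"""
<instances xmlns:xlink="{XLINK}">
  <topic dataset="Categorias">
    <instanceOf><topicRef xlink:href="#Categorias"/></instanceOf>
    <baseName><baseNameString>@Categorias.Descricao</baseNameString></baseName>
  </topic>
</instances>
"""

# Matches references of the form @Dataset.Column.
REFERENCE = re.compile(r"@(\w+)\.(\w+)")


def expand_instances(template_text, datasets):
    """Return one <topic> element per row of the dataset named in each template."""
    topics = []
    for template in ET.fromstring(template_text.strip()).iter("topic"):
        for row in datasets[template.get("dataset")]:
            topic = copy.deepcopy(template)
            # Assumption: the dataset attribute is a processing hint only.
            del topic.attrib["dataset"]
            for element in topic.iter():
                if element.text and REFERENCE.search(element.text):
                    # Assumption: the reference names the template's own
                    # dataset, so only the column part is looked up.
                    element.text = REFERENCE.sub(
                        lambda m: str(row[m.group(2)]), element.text.strip()
                    )
            topics.append(topic)
    return topics


# Made-up rows standing in for the extracted Categorias dataset.
datasets = {"Categorias": [{"Descricao": "Databases"}, {"Descricao": "XML"}]}
topics = expand_instances(TEMPLATE, datasets)
names = [t.findtext("baseName/baseNameString") for t in topics]
```

Under these assumptions, a two-row dataset yields two topics, both instances of `#Categorias`, each named after one row's `Descricao` value.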
Generated topic map
• After the XS4TM processing, Oveia generates a topic map stored in memory;
• Oveia has two possible output formats:
  – XTM file: an XML document.
  – OntologyDB: a relational database designed according to the Topic Maps standard.

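For the XTM output format, the step from an in-memory topic map to an XML document might look roughly like the sketch below. It is not Oveia's serializer: the function `to_xtm` and the triple-based input are hypothetical, and only topics with a type and a base name are covered (no associations or occurrences). The element names and namespace follow XTM 1.0.

```python
import xml.etree.ElementTree as ET

XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"
# Emit XTM as the default namespace and xlink with its usual prefix.
ET.register_namespace("", XTM_NS)
ET.register_namespace("xlink", XLINK_NS)


def to_xtm(topics):
    """Serialize (topic id, type id, base name) triples as an XTM 1.0 string."""
    root = ET.Element(f"{{{XTM_NS}}}topicMap")
    for topic_id, type_id, name in topics:
        topic = ET.SubElement(root, f"{{{XTM_NS}}}topic", {"id": topic_id})
        instance_of = ET.SubElement(topic, f"{{{XTM_NS}}}instanceOf")
        # The topic's type is referenced by a local fragment identifier.
        ET.SubElement(instance_of, f"{{{XTM_NS}}}topicRef",
                      {f"{{{XLINK_NS}}}href": f"#{type_id}"})
        base_name = ET.SubElement(topic, f"{{{XTM_NS}}}baseName")
        ET.SubElement(base_name, f"{{{XTM_NS}}}baseNameString").text = name
    return ET.tostring(root, encoding="unicode")


# Hypothetical topic; the id "cat-1" is invented for the example.
xtm = to_xtm([("cat-1", "Categorias", "Databases")])
```

The OntologyDB output would instead write the same topics into relational tables; its schema is not shown on the slides, so no sketch is attempted here.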