Darwin-SW: Darwin Core data for the Semantic Web Campbell Webb - PowerPoint PPT Presentation


  1. TDWG Annual Meeting; 2011-10-18
     Darwin-SW: Darwin Core data for the Semantic Web
     Campbell Webb & Steven Baskauf
     Arnold Arboretum of Harvard University / Dept. of Biological Sciences, Vanderbilt University
     Version 0.2-2-ge7575a4

  2. The Semantic Web
     • Persistent, de-referenceable identifiers (GUIDs)
     • A universal format for transmission (RDF)
     • Semantically-rich descriptions (self-documenting)
     • Opportunity for machine reasoning

  3. TDWG’s Darwin Core (DwC)
     • Biodiversity Information Standards (TDWG) standard
     • Stable, ‘technology-independent’ vocabulary of terms
     • ‘Classes’ are categories, with no formal domain declarations for terms:
       – Direct use in RDF is unclear
     • Foundation for building RDF classes and properties (TDWG RDF Task Group)

  4. Darwin-SW
     • We needed a GUID/RDF solution now
     • NOT officially associated with DwC or TDWG,
       – but much effort went into understanding and applying TDWG community consensus
     • Uses 5 existing DwC classes and adds 2 new ones
     • Relationships among classes are defined by pairs of inverse object properties (new)
     • Most existing DwC data properties can be used
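The idea of linking classes through pairs of inverse object properties can be sketched in a few lines of stdlib-only Python. This is an illustration rather than the authors' implementation: the property names (`dsw:hasOccurrence`/`dsw:occurrenceOf`, `dsw:hasEvidence`/`dsw:evidenceFor`) and the `ex:` resources are assumptions, since the slide text does not list the properties by name; check them against the published ontology before relying on them.

```python
# Sketch: for every statement whose predicate has a declared inverse,
# also materialize the inverse statement.
# Property names are ASSUMED for illustration; they are not listed in the slides.
INVERSE_PAIRS = [
    ("dsw:hasOccurrence", "dsw:occurrenceOf"),
    ("dsw:hasEvidence", "dsw:evidenceFor"),
]


def build_inverse_map(pairs):
    """Return a dict mapping each property to its inverse, in both directions."""
    inverse = {}
    for a, b in pairs:
        inverse[a] = b
        inverse[b] = a
    return inverse


def add_inverses(triples, pairs=INVERSE_PAIRS):
    """Return the input triples plus the inverse of every paired statement."""
    inverse = build_inverse_map(pairs)
    closed = set(triples)
    for s, p, o in triples:
        if p in inverse:
            closed.add((o, inverse[p], s))
    return closed


# Hypothetical resources: one organism, one occurrence, one photo.
data = {
    ("ex:tree360", "dsw:hasOccurrence", "ex:occ1"),
    ("ex:photo1", "dsw:evidenceFor", "ex:occ1"),
}
closed = add_inverses(data)
```

With inverse pairs declared, a data provider can state a link in whichever direction is convenient, and a consumer can still query it from either end.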

  5. Five core DwC classes

  6. dsw:IndividualOrganism (new)
     Needed for linkage to population data and observations (see Baskauf, 2010). Issues: clonal organisms, heterogeneous collection units.

  7. Definition of Occurrence class
     • DwC (‘class of data’): “The category of information pertaining to evidence of an occurrence in nature, in a collection, or in a dataset”
       – General; includes specimens, observations, even species-place ‘checklist’ records
     • DSW: “the instance of an individual organism at a place and time”
     • Much biodiversity data ‘hangs’ on an individual’s occurrence
     • Tidily incorporates location and time for other data objects

  8. dsw:Token (new)
     ‘Tokens’: specimens, photos, even plant cuttings, that provide evidence of the Occurrence
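The relationship between an individual organism, its occurrence at a place and time, and the tokens that document that occurrence can be sketched as plain Python data classes. This is a simplified illustration, not the Darwin-SW ontology itself: the class fields, the occurrence and token URIs under `example.org`, and the place and date values are hypothetical; only the organism URI comes from the use examples in this talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndividualOrganism:
    uri: str


@dataclass
class Token:
    """A piece of evidence (specimen, photo, cutting) for an occurrence."""
    uri: str
    kind: str


@dataclass
class Occurrence:
    """An individual organism at a place and time, with supporting tokens."""
    uri: str
    organism: IndividualOrganism
    place: str
    time: str
    evidence: List[Token] = field(default_factory=list)

    def describe(self) -> str:
        kinds = ", ".join(t.kind for t in self.evidence) or "no tokens"
        return f"{self.organism.uri} at {self.place} on {self.time} ({kinds})"


tree = IndividualOrganism("http://xmalesia.info/sw/indiv/360")
occ = Occurrence(
    uri="http://example.org/occ/1",      # hypothetical URI
    organism=tree,
    place="hypothetical-plot",           # hypothetical location label
    time="2011-01-01",                   # hypothetical date
    evidence=[
        Token("http://example.org/specimen/1", "specimen"),  # hypothetical
        Token("http://example.org/image/1", "image"),        # hypothetical
    ],
)
summary = occ.describe()
```

Location and time live on the occurrence, so every token attached to it inherits them without restating them.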

  9. Darwin-SW Full set of new object properties

  10. Modeling
      • Using OWL DL (Web Ontology Language; adds some necessary concepts to basic RDF Schema, RDFS)
      • Using Protégé 4
      • Took care in assigning domains and ranges to properties (Morris, pers. comm.; Horridge, 2009)
      • All classes other than Token are disjoint
      • Validation of data statements by eye
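As a rough illustration of what domain, range, and disjointness declarations make checkable, the following stdlib-only Python sketch flags statements that conflict with them. It is a closed-world check, which is simpler than, and not equivalent to, OWL DL reasoning (a reasoner would infer types from domains and ranges and then detect inconsistencies). The property name, its domain and range, and the `ex:` resources are assumptions for illustration.

```python
# ASSUMED declarations, for illustration only.
DOMAIN = {"dsw:hasOccurrence": "dsw:IndividualOrganism"}
RANGE = {"dsw:hasOccurrence": "dwc:Occurrence"}
# Token is deliberately left out: it is the one class not declared disjoint.
DISJOINT = {"dsw:IndividualOrganism", "dwc:Occurrence", "dwc:Taxon"}


def validate(types, statements):
    """Return a list of problems.

    types: dict mapping a resource to the set of classes it is typed with.
    statements: iterable of (subject, property, object) tuples.
    """
    problems = []
    for res, classes in sorted(types.items()):
        clash = sorted(classes & DISJOINT)
        if len(clash) > 1:
            problems.append(
                f"{res} is typed with disjoint classes: {', '.join(clash)}"
            )
    for s, p, o in statements:
        if p in DOMAIN and DOMAIN[p] not in types.get(s, set()):
            problems.append(f"{s} is not a {DOMAIN[p]} (domain of {p})")
        if p in RANGE and RANGE[p] not in types.get(o, set()):
            problems.append(f"{o} is not a {RANGE[p]} (range of {p})")
    return problems


types = {
    "ex:tree": {"dsw:IndividualOrganism"},
    "ex:occ": {"dwc:Occurrence"},
    "ex:bad": {"dwc:Occurrence", "dwc:Taxon"},
}
report = validate(types, [("ex:tree", "dsw:hasOccurrence", "ex:occ")])
```

Here `report` contains a single entry, for `ex:bad`, which is typed with two disjoint classes.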

  11. Treatment of Taxon class
      • DSW treats dwc:Taxon ≡ tc:TaxonConcept (from the TDWG RDF ontology)
      • A TaxonConcept combines both a TaxonName and a statement of name usage, but the usage is seldom given
      • Taxonomic names (genus, specific epithet) can hang on dwc:Taxon directly, or on tn:TaxonName
      • Eagerly awaiting GNUB URIs for taxon names
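Hanging name parts directly on a dwc:Taxon might look like the Turtle produced by this small Python sketch. The subject URI and the example name are hypothetical, and the sketch assumes the Darwin Core terms `dwc:genus` and `dwc:specificEpithet` in the `http://rs.tdwg.org/dwc/terms/` namespace; it does not show the alternative of attaching names to a tn:TaxonName.

```python
DWC = "http://rs.tdwg.org/dwc/terms/"  # assumed Darwin Core terms namespace


def taxon_turtle(subject_uri, genus, epithet):
    """Return Turtle with name parts attached directly to a dwc:Taxon.

    Literal escaping is omitted, so inputs must not contain quotes
    or backslashes.
    """
    lines = [
        f"@prefix dwc: <{DWC}> .",
        "",
        f"<{subject_uri}> a dwc:Taxon ;",
        f'    dwc:genus "{genus}" ;',
        f'    dwc:specificEpithet "{epithet}" .',
    ]
    return "\n".join(lines)


# Hypothetical taxon resource and example name.
ttl = taxon_turtle("http://example.org/taxon/1", "Quercus", "alba")
print(ttl)
```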

  12. Linking to other ontologies
      Ontology at http://xmalesia.info/sw/onto/bot.rdf. Links to OBOE, PO, PATO, CDAO.

  13. E.g., Observation of an organism

  14. Darwin-SW use examples
      • Steve’s Bioimages database
        – http://bioimages.vanderbilt.edu
        – A still image: <http://bioimages.vanderbilt.edu/kirchoff/em2072>
      • New collections (physical, images, DNA) in Indonesia (Xmalesia NSF project)
        – http://xmalesia.info
        – A tree in an ecological plot, with specimen: <http://xmalesia.info/sw/indiv/360>

  15. Moving Forward
      • Darwin-SW can contribute to discussion on TDWG RDF recommendations
      • Task Group meeting tomorrow PM
      • Discussion items:
        – Need for dsw:IndividualOrganism
        – A dsw:Token class?
        – Eliminate dwc:Event?
        – Linkage to other ontologies
        – Reasoning use cases
        – Timeline and plan for a TDWG-RDF BIS
      • Please join us!

  16. Acknowledgments
      • Important discussions: Pete DeVries, Paul Murray, Hilmar Lapp, Bob Morris, Matt Jones, Jim Balhoff, Shawn Bowers, Chris Mungall, Damian Gessler, many others
      • Current funding: National Science Foundation (DEB–1020868 to CW)
      • Software: Redland RDF Libraries, redstore, LaTeX, GraphViz, Protégé, xqilla, GNU gawk

  17. References
      Baskauf, S. J. 2010. Organization of occurrence-related biodiversity resources based on the process of their creation and the role of individual organisms as resource relationship nodes. Biodiversity Informatics 7:17–44.
      Horridge, M. 2009. A Practical Guide To Building OWL Ontologies Using Protégé 4 and CO-ODE Tools (Edition 1.2). Technical report, The University Of Manchester. URL http://owl.cs.manchester.ac.uk/tutorials/protegeowltutorial/.
