
Distributed data management, the power and challenge of metadata



  1. Distributed data management, the power and challenge of metadata Øystein Godøy

  2. Why bother with data management?
     ● Maximise public investment in data collection and production
     ● Promote scientific collaboration
     ● Promote interdisciplinary science
     ● Promote scientific transparency
     ● Leave a legacy
     ● Science paradigms – according to Jim Gray
       ● empirical science
       ● theoretical science
       ● computational science
       ● data exploration science

  3. Document data through metadata
     ● Discovery level
       ● What
       ● Where
       ● When
       ● Who
       ● Constraints on sharing and usage
       ● How to access data
       ● ...
     ● Use level
       ● Variable names/descriptions
       ● Missing values
       ● Units
       ● Coordinates
       ● Interdependency between variables
       ● Linkages between data
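The two metadata levels on this slide can be illustrated with a minimal sketch. The class and field names below are hypothetical and chosen only to mirror the bullets. They are not taken from METSIS or from any metadata standard.

```python
from dataclasses import dataclass


@dataclass
class DiscoveryMetadata:
    """Discovery level: enough to find a dataset (hypothetical fields)."""
    title: str               # what
    bounding_box: tuple      # where: (south, north, west, east) in degrees
    time_range: tuple        # when: (start, end) as ISO 8601 strings
    contact: str             # who
    use_constraints: str     # constraints on sharing and usage
    access_url: str          # how to access data


@dataclass
class VariableMetadata:
    """Use level: enough to read and interpret one variable (hypothetical fields)."""
    name: str
    description: str
    units: str
    missing_value: float
    coordinates: tuple       # names of the coordinate variables


def missing_fields(record):
    """Return the names of fields that were left empty in a record."""
    return [name for name, value in vars(record).items()
            if value in ("", None, ())]


record = DiscoveryMetadata(
    title="Example sea ice product",          # illustrative values only
    bounding_box=(60.0, 90.0, -180.0, 180.0),
    time_range=("2010-01-01", "2010-12-31"),
    contact="",
    use_constraints="Free to use with attribution",
    access_url="",
)
print(missing_fields(record))
```

A check of this kind could flag records that cannot be searched or accessed before they enter a catalogue.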

  4. Metadata

  5. The toolbox

  6. METSIS implementations
     ● Arctic Data Centre (WMO DCPC)
     ● WMO Global Cryosphere Watch
     ● EU FP6 DAMOCLES
     ● EU FP7 ACCESS
     ● EUMETSAT Ocean and Sea Ice SAF (High Latitude centre)
     ● International Polar Year
       ● National node in Norway
       ● International operational data coordination
     ● Norwegian Satellite Earth Observation Database for Marine and Polar Research (RCN)
     ● CryoClim – climate consistent satellite remote sensing products for the cryosphere (ESA/NRS)
     ● Svalbard Integrated Arctic Observing System (demo)
     ● SAON (demo)

  7. NORMAP
     ● Norwegian Satellite Earth Observation Database for Marine and Polar Research
     ● http://normap.nersc.no/
     ● Satellite products
       ● Level 2 and higher
     ● Distributed data repository
       ● Nansen Environmental and Remote Sensing Centre
       ● Norwegian Meteorological Institute
       ● Kongsberg Satellite Services
       ● (CERSAT)
     ● Features
       ● Subscription
       ● Collocated visualisation
       ● Transformation of individual products
       ● (Transformation of multiple products to a common reference)

  8. Global Cryosphere Watch
     ● A coordinated framework for
       ● Observations
       ● Data management
       ● Monitoring
       ● Assessment
       ● Product development
     ● Focusing on the current and future state of the cryosphere
     ● WMO Information System
       ● Relies on interoperability interfaces
       ● Much of the data is of scientific origin
       ● Data are served from the host data centre

  9. Collocated visualisation

  10. Lessons learned harvesting
     ● End point availability
     ● Few dedicated subsets of available data
     ● Filtering of records is required, but this requires e.g. proper keywords
     ● CSDGM and ISO19115 metadata often lack standardised controlled vocabularies
     ● Transformation of metadata to the search model is done using XSLT and SKOS
     ● Harvest semi-automatically at first, to fully understand the nature of each linked data centre
     ● Interfaces to data, and documentation of these, are frequently lacking
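The dependence of record filtering on proper keywords can be sketched as follows. This is an illustration under stated assumptions and not the METSIS implementation, which according to the slide uses XSLT and SKOS. Here a plain dictionary stands in for a controlled vocabulary that maps free-text labels to preferred terms. The vocabulary entries, record layout, and function names are all hypothetical.

```python
# Hypothetical stand-in for a controlled vocabulary: free-text label -> preferred term.
VOCABULARY = {
    "sea ice concentration": "Sea Ice Concentration",
    "ice concentration": "Sea Ice Concentration",
    "sst": "Sea Surface Temperature",
    "sea surface temperature": "Sea Surface Temperature",
}


def normalise_keywords(keywords):
    """Map free-text keywords to preferred terms, dropping unknown keywords."""
    terms = set()
    for keyword in keywords:
        term = VOCABULARY.get(keyword.strip().lower())
        if term is not None:
            terms.add(term)
    return terms


def filter_records(records, wanted_terms):
    """Keep harvested records whose normalised keywords overlap the wanted terms."""
    kept = []
    for record in records:
        terms = normalise_keywords(record.get("keywords", []))
        if terms & wanted_terms:
            kept.append({**record, "terms": sorted(terms)})
    return kept


harvested = [
    {"id": "a", "keywords": ["Ice Concentration ", "Arctic"]},
    {"id": "b", "keywords": ["bathymetry"]},
    {"id": "c", "keywords": []},   # no keywords: cannot be matched at all
]
selected = filter_records(harvested, {"Sea Ice Concentration"})
print([record["id"] for record in selected])
```

Records without keywords, or with keywords outside the vocabulary, drop out silently, which is one reason a first harvest from a new data centre benefits from manual inspection.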

  11. Main challenges
     ● Standardised controlled vocabularies in machine readable form
     ● Require agreement to propagate metadata
     ● Interpretation of standards
     ● Not all are using standard interfaces to data
     ● Propagation of metadata to global frameworks
     ● Duplication of records
     ● Filtering
     ● Transformation
     ● Identification of interfaces
     ● Metadata granularity differs
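Duplication arises when the same dataset reaches a catalogue through more than one harvesting route. A minimal sketch of one way to detect it follows. It assumes that a normalised title plus the temporal extent identifies a dataset, which is an illustrative choice and not a rule from the slides. The record layout and function names are hypothetical.

```python
def record_key(record):
    """Build a comparison key from the normalised title and temporal extent."""
    title = " ".join(record["title"].lower().split())  # collapse case and whitespace
    return (title, record["start"], record["end"])


def deduplicate(records):
    """Keep the first record seen for each key, preserving harvest order."""
    unique = {}
    for record in records:
        unique.setdefault(record_key(record), record)
    return list(unique.values())


harvested = [
    {"source": "centre-1", "title": "Sea Ice Extent", "start": "2010-01-01", "end": "2010-12-31"},
    {"source": "centre-2", "title": "sea  ice extent", "start": "2010-01-01", "end": "2010-12-31"},
    {"source": "centre-2", "title": "Sea Ice Extent", "start": "2011-01-01", "end": "2011-12-31"},
]
unique_records = deduplicate(harvested)
print(len(unique_records))
```

Differing metadata granularity between centres would defeat a simple key like this one, since one centre may describe a whole series where another describes individual files.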

  12. Summary
     ● Integration of metadata sources is challenged by integration of technology and science through documentation standards and controlled vocabularies
     ● An increasing number of data centres support interoperability standards, but interpretation of the standards often differs
     ● Metadata brokering is the short term key to integration of data centres
