OPeNDAP Hyrax: An extensible data access framework within the Earth System Grid Federation
Patrick West (1), Peter Fox (1), James Gallagher (2), Nathan Potter (2), Dan Halloway (2), Stephan Zednik (1)
1. Tetherless World Constellation (http://tw.rpi.edu), Rensselaer Polytechnic Institute (http://www.rpi.edu)
2. OPeNDAP (http://www.opendap.org)
AGUFM2011-IN43C-04
Motivation
• Data sets are growing in both number and size
• Researchers cannot download all of this data in order to get their work done
• More server-side functionality needs to be provided to work with the data
• More advanced services also need to be provided for additional client support and data manipulation
• The modular design of the OPeNDAP Hyrax framework allows these additional services and capabilities to be added
Key Points
• The Earth System Grid Federation (ESGF) is one such project, where more and larger datasets are being collected for use by researchers
• But … the most common way for researchers to use this data is still to download it
• OPeNDAP Hyrax is a data access service that can be used to access remote data
• But … there is still a lot of work that can be done within the Hyrax framework
• At the same time, there are other services that can be provided, but not within the context of a data-access framework
ESGF
• The Earth System Grid Federation (ESGF) is a non-profit organization, formed by the participants of GO-ESSP, that provides software for the access and dissemination of climate data
ESGF
[Diagram]
OPeNDAP
• OPeNDAP (Open-source Project for a Network Data Access Protocol) is a non-profit organization that provides software and services for the access and manipulation of data in support of the DAP2 (Data Access Protocol) standard
• OPeNDAP Hyrax is a software framework for serving scientific data over the network; it implements the DAP2 standard
• TDS (THREDDS Data Server) is not an OPeNDAP software product, but a server provided by Unidata that can act as a DAP server
• Different language bindings exist for DAP: C++, Pydap, JDap
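
As an illustration of why DAP2 access can replace a full download, the sketch below builds a DAP2 constraint expression that asks the server for a subset of one variable. It assumes the usual DAP2 hyperslab syntax `variable[start:stride:stop]` with an inclusive stop index. The server URL, the variable name `sst`, and the helper names `hyperslab` and `constrained_url` are hypothetical and chosen only for this example.

```python
def hyperslab(start, stop, stride=1):
    """Format one DAP2 index range; the stop index is inclusive."""
    if start < 0 or stop < start or stride < 1:
        raise ValueError("invalid hyperslab bounds")
    if stride == 1:
        return f"[{start}:{stop}]"
    return f"[{start}:{stride}:{stop}]"


def constrained_url(dataset_url, variable, slabs, suffix="ascii"):
    """Append a response suffix and a constraint expression to a dataset URL.

    `slabs` holds one (start, stop) or (start, stop, stride) tuple per
    dimension of the variable.
    """
    expression = variable + "".join(hyperslab(*slab) for slab in slabs)
    return f"{dataset_url}.{suffix}?{expression}"


if __name__ == "__main__":
    # Hypothetical dataset: first time step, a latitude band, every
    # second longitude index.
    print(constrained_url(
        "http://example.org/opendap/data/sst.nc",
        "sst",
        [(0, 0), (10, 20), (30, 40, 2)],
    ))
```

Only the requested slice crosses the network, which is the point made in the Motivation slide. Some clients percent-encode the square brackets before sending the request; that step is left out here.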
OPeNDAP Hyrax
[Architecture diagram. Labels: netCDF data files, SQL database, BES, BES commands, XML, SOAP-encapsulated response, OLFS, DAP2, DAP, THREDDS, netCDF, SQL, HTML, WxS, THREDDS catalogs]
• BES modules: dap, dap-service, cdf, cedar, csv, ferret, fileout_netcdf, fits, freeform, gateway, hdf4, hdf5, jgofs, ncml, netCDF, wcs, xml-rdf
• Response types: DAS, DDS, DDX, DataDDS, Ascii, HTML Form, Info, NetCDF, RDF/XML
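
A client usually selects one of the response types listed above by appending a suffix to the dataset URL. The sketch below maps the response names from the slide to the suffixes commonly used by DAP2 servers such as Hyrax. The mapping is an assumption to verify against the server being used, and the dataset URL and the helper name `response_url` are hypothetical.

```python
# Response name (as listed on the slide) -> URL suffix.
# Assumed from common DAP2/Hyrax conventions; check your server's documentation.
RESPONSE_SUFFIXES = {
    "DAS": "das",        # dataset attributes
    "DDS": "dds",        # dataset structure
    "DDX": "ddx",        # XML form combining structure and attributes
    "DataDDS": "dods",   # binary data response
    "Ascii": "ascii",    # data as text
    "HTML Form": "html", # browser form for building requests
    "Info": "info",      # human-readable summary
    "NetCDF": "nc",      # data returned as a netCDF file
    "RDF/XML": "rdf",    # RDF description of the dataset
}


def response_url(dataset_url, response_type):
    """Return the URL that requests the given response type for a dataset."""
    try:
        suffix = RESPONSE_SUFFIXES[response_type]
    except KeyError:
        raise ValueError(f"unknown response type: {response_type!r}") from None
    return f"{dataset_url}.{suffix}"


if __name__ == "__main__":
    for name in RESPONSE_SUFFIXES:
        print(name, "->", response_url("http://example.org/opendap/data/sst.nc", name))
```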
Hyrax BES - Extensible
• New responses
• New data types
• Reporting mechanism (metrics)
• Register server-side functions with libdap
• New BES commands
• Exception handling callbacks
• Initialization and termination callbacks
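
The extension points above follow a familiar plug-in pattern: a module registers handlers and callbacks with the server, and the server dispatches to them by name. The sketch below shows that pattern only. It is not the BES API (the BES is written in C++), and every class and method name in it is hypothetical.

```python
class ModuleRegistry:
    """Toy registry illustrating module-based extension; not the BES API."""

    def __init__(self):
        self._responses = {}
        self._init_callbacks = []
        self._term_callbacks = []

    def add_response(self, name, handler):
        """Register a new response type under a unique name."""
        if name in self._responses:
            raise ValueError(f"response {name!r} is already registered")
        self._responses[name] = handler

    def add_init_callback(self, callback):
        """Register a function to run when the server starts."""
        self._init_callbacks.append(callback)

    def add_term_callback(self, callback):
        """Register a function to run when the server shuts down."""
        self._term_callbacks.append(callback)

    def initialize(self):
        for callback in self._init_callbacks:
            callback()

    def terminate(self):
        for callback in self._term_callbacks:
            callback()

    def respond(self, name, dataset):
        """Dispatch a request to whichever module registered the response."""
        try:
            handler = self._responses[name]
        except KeyError:
            raise LookupError(f"no module provides response {name!r}") from None
        return handler(dataset)


if __name__ == "__main__":
    registry = ModuleRegistry()
    registry.add_init_callback(lambda: print("module loaded"))
    registry.add_response("info", lambda dataset: f"info for {dataset}")
    registry.initialize()
    print(registry.respond("info", "sst.nc"))
    registry.terminate()
```

The design choice this illustrates is that the core server never needs to know which response types exist; adding one means loading a module that registers it.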
ESG Requirements
• The only requirement we received: OPeNDAP Hyrax is to be a drop-in replacement for TDS (THREDDS Data Server)
• Basic DAP requests and responses
• Read THREDDS catalogs, including NcML documents embedded within them (ncml_module)
• Ferret integration (ferret_module)
  – Ferret data manipulation functionality only
  – LAS (Live Access Server) does more with Ferret, but the DAP server does not
Use Cases
[Diagram: "Drop-in Replacement" at one extreme, "400 page Requirements" at the other; Happy Medium = Use Cases]
• Clearly defined use case with:
  – a descriptive name and a clearly stated goal
  – a summary of what the final product of the use case will provide for the system
  – Actors
  – Preconditions
  – Triggers
  – Basic Flow
  – User interaction with the system
  – Resources required for the use case
  … and more
• Using use cases provides for a more iterative approach
From our Use Cases
• DAP responses: nothing new to do here, functionality already provided (DONE)
• Ferret module
  – First pass: data manipulation using Ferret (DONE)
  – Future passes: provide new response types (data product types) such as images and movies using Ferret
• THREDDS catalogs
  – First pass: be able to read THREDDS catalogs (DONE)
  – Future passes: integrate with portals that provide richer data inventory browsing, semantic knowledge incorporation, semantic knowledge provenance, etc. (ongoing)
• NcML documents
  – Be able to read NcML documents and perform aggregation (DONE)
  – Future passes: be able to dynamically pass in NcML documents and perform aggregations (DONE)
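
To make the NcML aggregation use case concrete, the sketch below generates a small NcML document that joins several files along an existing dimension. It assumes the standard NcML 2.2 namespace and the `joinExisting` aggregation type. The file names, the dimension name `time`, and the helper `join_existing` are hypothetical, and the slides do not say how such a document is passed to Hyrax.

```python
import xml.etree.ElementTree as ET

# Standard NcML 2.2 namespace.
NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def join_existing(dim_name, locations):
    """Return an NcML document that aggregates files along an existing dimension."""
    ET.register_namespace("", NCML_NS)
    root = ET.Element(f"{{{NCML_NS}}}netcdf")
    aggregation = ET.SubElement(
        root,
        f"{{{NCML_NS}}}aggregation",
        {"dimName": dim_name, "type": "joinExisting"},
    )
    # One nested <netcdf> element per file, in aggregation order.
    for location in locations:
        ET.SubElement(aggregation, f"{{{NCML_NS}}}netcdf", {"location": location})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical monthly files joined along the time dimension.
    print(join_existing("time", ["sst_jan.nc", "sst_feb.nc", "sst_mar.nc"]))
```

A server that understands NcML can then present the listed files to clients as a single dataset.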
Future Work
• Improvements for data access
  – Asynchronous support
  – Distributed data access, manipulation, and transformation
  – Storing intermediate data products for future access and manipulation
  – Sharing data products with other researchers
  – Building citation information with data products
  – Data access provenance
  – Semantic responses and features
  – Semantic data services descriptions
  – Data metrics: the who, what, when, where, and how of data access
  – Administrative features
  – Better and more advanced middle-tier capabilities
Future Work - but not for OPeNDAP (?)
• OPeNDAP is a great tool for data access
  – Data access is only part of the cyberinfrastructure framework
  – Keep to its strengths: reliable, scalable, good performance, extensible
• Data discovery mechanisms
• Data inventory browsing and querying
  – Faceted browsing / hierarchical browsing
• Embedded provenance support, from data collection to data product creation and visualization
• Data citation and attribution