INTRODUCTION TO EUDAT CDI AND B2 SERVICES SUITE Mark van de Sanden | EUDAT/SURFsara @eudat_eu eudat.eu
Outline
• EUDAT B2 services suite
• EUDAT CDI infrastructure
• Example use cases
• Q&A
CDI Data Domain
• PUBLISHED DATA DOMAIN: linking publications to digital objects; discovery of digital objects
• REGISTERED DATA DOMAIN: stage and register digital objects
• WORKSPACE (temporary, transient)
Data domain modeled on the ANDS Data Curation Continuum (ANDS: Australian National Data Service, www.ands.org.au)
Community-Driven Pilots
EUDAT services are designed, built and implemented based on user community requirements.
• EUDAT: generic data service provider (storage, workflows, processing, archive)
• Community repositories: thematic data centres
Service Diagram
Who: Anyone
What:
• Find collections of scientific data quickly and easily, irrespective of their origin, discipline or community
• Get quick overviews of available data
• Browse through collections using standardized facets
Why: Unique collection; ease of searching
Features:
• Guidelines for data providers which are DataCite and OpenAIRE compliant
• Harvesting via OAI-PMH, JSON-API and CSW 2.0
• Data selection on the basis of 9 facets, including Spatial, Time, Publication Year, Tags
• Full text search on metadata
http://b2find.eudat.eu/
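To make the OAI-PMH harvesting bullet concrete, here is a minimal sketch of how a harvester could build its requests. The verb and argument names (`ListRecords`, `metadataPrefix`, `set`, `from`, `resumptionToken`) come from the OAI-PMH 2.0 protocol; the endpoint URL and the function names are assumptions made for illustration and do not describe how B2FIND itself is implemented.

```python
from urllib.parse import urlencode

# Hypothetical endpoint of a community repository (assumption, not a real service).
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build the first OAI-PMH ListRecords request for a harvest run."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # restrict the harvest to one set
    if from_date:
        params["from"] = from_date    # selective harvesting, e.g. "2018-01-01"
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url, token):
    """Build the follow-up request for the next page of results.

    resumptionToken is an exclusive argument: only the verb accompanies it.
    """
    return f"{base_url}?{urlencode({'verb': 'ListRecords', 'resumptionToken': token})}"


url = list_records_url(BASE_URL, from_date="2018-01-01")
print(url)
```

A harvester would fetch the first URL, parse the XML response, and keep calling `resume_url` until the response no longer carries a resumption token.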
Faceted Search and Data Access
B2FIND provides ‘faceted’ search for
• Free text
• Geo spatial
• Temporal coverage
• Publication year
• Textual facets such as
  • Tags
  • Creator
  • Discipline
  • Language
  • Publisher
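The idea behind faceted search can be sketched in a few lines: every selected facet narrows the result set, and facet counts show which values remain available. The records, field names and functions below are invented for illustration; they are not the B2FIND metadata schema or its search API.

```python
from collections import Counter

# Illustrative records; field names are assumptions, not the B2FIND schema.
RECORDS = [
    {"title": "Argo float temperature profiles", "discipline": "Oceanography",
     "language": "en", "year": 2016, "tags": ["ocean", "temperature"]},
    {"title": "Seismic waveforms, Alpine array", "discipline": "Seismology",
     "language": "en", "year": 2014, "tags": ["earthquake"]},
    {"title": "Mediterranean salinity survey", "discipline": "Oceanography",
     "language": "fr", "year": 2011, "tags": ["ocean", "salinity"]},
]


def as_list(value):
    """Treat single-valued and multi-valued fields uniformly."""
    return value if isinstance(value, list) else [value]


def search(records, facets=None, text=None, years=None):
    """Return records matching all textual facets, a year range and a free-text term."""
    hits = []
    for record in records:
        if any(wanted not in as_list(record.get(field))
               for field, wanted in (facets or {}).items()):
            continue
        if years and not (years[0] <= record["year"] <= years[1]):
            continue
        if text and text.lower() not in record["title"].lower():
            continue
        hits.append(record)
    return hits


def facet_counts(records, field):
    """Count how many records carry each value of a facet."""
    counts = Counter()
    for record in records:
        counts.update(as_list(record.get(field)))
    return counts


hits = search(RECORDS, facets={"discipline": "Oceanography"}, years=(2015, 2018))
print([r["title"] for r in hits])
print(facet_counts(RECORDS, "tags"))
```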
Faceted Search and Data Access
Dataset view provides display of metadata:
• Spatial extent
• Table of field-value pairs
• Links to data resources
Who: Citizen scientists and small teams
What:
• Store and exchange data
• Synchronize multiple versions
• Ensure automatic desktop synchronization
Why: Ease of use; trusted European service
Features:
• Support for direct publishing of data sets in B2SHARE
• Personal quota of 20 GB, extended quota possible
• Fine-grained control to share data with other researchers
• Share data with researchers across other B2DROP and other ownCloud/Nextcloud instances
• Easy integration with other research platforms
https://b2drop.eudat.eu/
Easy sharing of data
Direct publishing to B2SHARE
Who: Small to medium teams
What:
• Store data (incl. software) and add domain metadata
• Share registered research data worldwide
• Preserve (small-scale) research data for the long term
Why: Register data for publications (FAIR); make known to wider community
Features:
• Minimum metadata compliant with DataCite and OpenAIRE, flexible support for community-specific metadata extensions
• Support for DOIs on dataset level
• Support for PIDs, checksums and download statistics on object level
• Dataset record lifecycle and versioning
• Authorisation for community domains
• Metadata automatically harvested by B2FIND
• Support for annotation via B2NOTE
• Direct uploads from B2DROP
• Easily installable as a local instance via Docker
https://b2share.eudat.eu/
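As an illustration of what "minimum metadata compliant with DataCite" can mean in practice, the sketch below checks a record for the six mandatory properties of the DataCite Metadata Schema 4.x (identifier, creator, title, publisher, publication year, resource type). The dictionary keys and the function name are assumptions chosen for this example; they are not the B2SHARE record format.

```python
# Mandatory DataCite 4.x properties, mapped to illustrative (hypothetical) dict keys.
REQUIRED_FIELDS = (
    "identifier",        # e.g. a DOI
    "creators",
    "titles",
    "publisher",
    "publication_year",
    "resource_type",
)


def missing_fields(record):
    """Return the mandatory fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


draft = {
    "titles": ["Mediterranean salinity survey"],
    "creators": ["Example, Author"],
    "publisher": "Example Community Repository",
    "publication_year": 2018,
}
print(missing_fields(draft))
```

A repository could run such a check before a draft record is published, and layer community-specific extension fields on top of it.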
Features
Community metadata extensions
An annotation is “a note added to a text, book, drawing, etc., as a comment or an explanation” (from Merriam-Webster)
• Provides a service to add annotations to digital assets
• New B2 service, launched in Jan 2018
• Can be integrated within community repositories and services
• Manual annotations via the WUI, or programmatic via a REST API
• Annotation on existing ontologies (in the biomedical domain)
• Integrated with B2SHARE on the basis of PIDs
• Uses the W3C Annotation Model standard (JSON-LD and RDF)
https://b2note.eudat.eu/
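For readers unfamiliar with the W3C annotation model, here is a minimal sketch of a keyword annotation serialized as JSON-LD. The `@context`, `type`, `motivation`, `body` and `target` keys follow the W3C Web Annotation Data Model; the target PID, the creator name and the helper function are made up for illustration, and the sketch does not claim to reproduce the exact documents B2NOTE stores.

```python
import json
from datetime import datetime, timezone


def make_annotation(target_pid, keyword, creator_name):
    """Build a Web Annotation that tags the object behind a PID with a keyword."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",
        "created": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "creator": {"type": "Person", "name": creator_name},
        "body": {"type": "TextualBody", "value": keyword, "purpose": "tagging"},
        "target": target_pid,  # the annotated object is identified by its PID
    }


# Hypothetical PID and creator, for illustration only.
annotation = make_annotation(
    "https://hdl.handle.net/12345/example-object", "salinity", "Example Researcher"
)
print(json.dumps(annotation, indent=2))
```

Because the target is just an identifier, the annotation can live in a separate service from the data it describes, which is what makes PID-based integration possible.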
B2SHARE integration
Who: Community data managers; ‘sophisticated’ organisations
What:
• Provide an abstraction layer which virtualizes large-scale data resources
• Guard against data loss in long-term archiving and preservation
• Optimize access for users from different regions and to computing resources
Why: Performance; replication between trusted sites; data preservation
Features:
• Data management on the basis of policies
• Support for different storage systems (e.g. Posix, NFS, S3, Tape, UMS)
• Policies for data replication, registration of PIDs, data integrity checks, versioning (alpha)
• Access via GridFTP and HTTP APIs
• Data downloads via PIDs
• Support for automated data publication via B2SHARE (alpha)
• Central policy management
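The combination of replication and data integrity checks can be sketched as "copy, then compare checksums". The example below does this on a local file system with SHA-256; the function names are hypothetical, and a real deployment would run such a check between storage sites through its own rule engine rather than through this code.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute a SHA-256 checksum without loading the whole file into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source, replica_dir):
    """Copy a file to a replica location and verify the copy against the original."""
    replica_dir.mkdir(parents=True, exist_ok=True)
    replica = replica_dir / source.name
    shutil.copy2(source, replica)
    if sha256_of(source) != sha256_of(replica):
        raise IOError(f"checksum mismatch for replica {replica}")
    return replica


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "site_a" / "dataset.bin"
    original.parent.mkdir()
    original.write_bytes(b"example payload")
    copy = replicate(original, Path(tmp) / "site_b")
    replica_ok = sha256_of(copy) == sha256_of(original)
    print(replica_ok)
```

Storing the checksum alongside the object also allows periodic re-verification long after the copy was made.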
Data Policy Manager
• Data policies are centrally managed
• Policy rules are implemented and enforced by site-local rule engines
• Policies are described in an abstract language
• Community data managers must authenticate to provide trust
• Supports policies for data replication and integrity checking
• Central logging for auditable data policies to monitor execution
• Active collaboration with the RDA Practical Policy WG
Who: Users and communities who want to interact with EUDAT CDI services
What:
• Provide a common access layer to B2 services
• Copy large data sets, ingesting them onto EUDAT data services
• Enable data transfer for large data collections from EUDAT storage to external HPC facilities for processing
• Support data transfers between PRACE and EGI
Why: Simplify data transfers
Features:
• Common API on the basis of GridFTP and HTTP
• Upload/download of data to/from B2SAFE
• Downloads via PIDs for both APIs (GridFTP and HTTP)
• Support for anonymous data access
• HTTP API defined via OpenAPI
http://petstore.swagger.io/?url=https://b2stage.cineca.it/api/specs&docExpansion=none
Who: Groups or communities who want to make their data citable
What:
• Follows policies to register data and make it referable and citable in the long term
• Reliability through mutual PID mirroring
• Provides an abstraction layer between a globally unique persistent identifier and the physical location of data objects
• PIDs are globally resolvable
Why: Simple integration; machine readable via HTTP RESTful API; technology agnostic
Features:
• Based on Handle v8
• EUDAT standardized PID record and data types
• B2HANDLE API Python library for easy integration in client services
• PID prefixes provided via ePIC
• Multiple B2HANDLE service providers
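The "abstraction layer" between an identifier and a physical location can be illustrated with the public Handle proxy REST interface, which answers `GET https://hdl.handle.net/api/handles/<handle>` with a JSON record of typed values. The sketch below builds such a request URL and extracts the `URL` value from a response; it uses only the standard library rather than the B2HANDLE Python library, and the handle and sample response are invented for illustration.

```python
import json
from urllib.parse import quote

HANDLE_PROXY = "https://hdl.handle.net/api/handles/"


def resolution_url(handle):
    """Build the REST URL that returns the record of a handle as JSON."""
    return HANDLE_PROXY + quote(handle, safe="/")


def location_from_response(payload):
    """Extract the current data location (the URL value) from a handle record."""
    record = json.loads(payload)
    if record.get("responseCode") != 1:   # 1 means success
        return None
    for value in record.get("values", []):
        if value.get("type") == "URL":
            return value["data"]["value"]
    return None


# Made-up response in the shape returned by the Handle proxy (hypothetical handle).
SAMPLE_RESPONSE = json.dumps({
    "responseCode": 1,
    "handle": "12345/example-object",
    "values": [
        {"index": 1, "type": "URL",
         "data": {"format": "string",
                  "value": "https://repository.example.org/objects/42"}},
    ],
})

print(resolution_url("12345/example-object"))
print(location_from_response(SAMPLE_RESPONSE))
```

If the data moves, only the `URL` value in the record changes; the identifier that was cited stays the same.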
Who: Anyone wanting to use the B2 Services
What:
• Complies with community ownerships and access rights, basis of trust
• Credential conversion approach (e.g. SAML, OpenID, X.509, username/password)
• Identity provider for citizen scientists
Why: Use your own ID in a federated environment
Features:
• Support for eduGAIN, social identities (Facebook, Google, Microsoft, Github), Orcid and local accounts
• IdP integration support for SAML, OpenID, OAuth2, X.509
• Community IdP (ELIXIR, PRACE, EGI)
• SP integration support for SAML, OIDC, OAuth2, X.509
• Integrated with B2SHARE, B2SAFE, B2STAGE, B2DROP, B2NOTE, SPMT, DPMT and Gitlab
• Joint proposal for the Life Sciences with GEANT and EGI
• Common Federated AAI planned in EOSC-hub
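For service providers integrating over OAuth2, the first step of the standard authorization-code flow is to redirect the user to the identity provider and later verify the `state` value on the callback. The sketch below shows only that generic step as defined by OAuth 2.0; the endpoint, client ID, redirect URI and scopes are placeholders, not actual EUDAT configuration values.

```python
import secrets
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical authorization endpoint (assumption, not a real service URL).
AUTHORIZE_ENDPOINT = "https://idp.example.org/oauth2/authorize"


def authorization_url(client_id, redirect_uri, scopes, state):
    """Build the redirect that starts an OAuth2 authorization-code flow."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,  # random value that ties the callback to this request
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def code_from_callback(callback_url, expected_state):
    """Check the state on the callback and return the authorization code."""
    query = parse_qs(urlsplit(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: possible forged callback")
    return query["code"][0]


state = secrets.token_urlsafe(16)
login_url = authorization_url(
    "example-client", "https://service.example.org/callback", ["profile", "email"], state
)
print(login_url)
```

The service would then exchange the returned code for a token at the provider's token endpoint, which is omitted here.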
EUDAT CDI Data Domain
• PUBLISHED DATA DOMAIN: linking publications to digital objects; discovery of digital objects
• REGISTERED DATA DOMAIN (data objects): stage and register digital objects
• WORKSPACE (temporary, transient): data entities
Data domain modeled on the ANDS Data Curation Continuum (ANDS: Australian National Data Service, www.ands.org.au)
User Documentation
• Total of 33 documents maintained and revised
• 3 levels of documentation:
  • Engage: for community decision-makers and data managers
  • Deploy: for system and support engineers
  • Use: for researchers and end users
• Participation from community experts
https://eudat.eu/services/userdoc
Training Material
• Total of 14 training modules developed and maintained
• Hands-on training environments for: B2SAFE, B2SHARE, B2FIND, B2HANDLE, B2NOTE
https://eudat.eu/training
https://github.com/EUDAT-Training
CDI members
More than 20 European research organisations, data and computing centres in 14 countries
CDI Infrastructure
Operational and support services:
• Service Portfolio & Catalogue Management Tool
• Project (Configuration) Management: DPMT
• A&R Monitoring, Software Version Monitoring
• Accounting, Reporting
• Helpdesk
• Vulnerability Scanning, CSIRT
Service Hosting Framework:
• Thematic Service Provider
• Repository Provider
• Generic Service Provider (Archive, Large Storage System, HTC/HPC)
• PID Service Provider (providing PIDs)
Helpdesk and Support Team
Requests via the Helpdesk Webform
Helpdesk channels:
• Support Request Webform
• Trouble Ticketing System: B2-Service queues, site queues
Responsibilities:
• 1st Level Support (BSC)
• 2nd Level Support: Project Enabling Team (OP)
• 3rd Level Support: Service Developer Teams (DEV)
https://eudat.eu/contact-support-request
Use Cases
• Example: SeaDataNet EUDAT CDI Cloud
• Example: EuroArgo Data Subscription service