Using iRODS to manage, share and publish research data Ton Smeele & Lazlo Westerhof ITS/ResearchIT, Utrecht University
ITS – Research IT
Agenda
• Profile Utrecht University
• Yoda introduction and concepts
• Demonstration
• Challenges, issues & lessons learned
Organisation & people
• Established: 1636
• Students: 30,523
• Staff members: 6,960 (incl. Medicine)
• Professors: 550
• Faculties: 7 + 2 teaching institutes
Top ranking
• Shanghai Ranking 2017: The Netherlands 1, Europe 13, World 47
• Nobel Prizes: 12
• Spinoza Prizes: 15
4 Strategic themes - focused research
DYNAMICS OF YOUTH
• Integrating Utrecht expertise on youth development, from synapse to society
INSTITUTIONS FOR OPEN SOCIETIES
• Cooperation, Self-regulation and Collective Action
• Sustainability and Resilience
• Innovation and Economic Growth
• Equality, Inclusiveness and Social Mobility
• Democratic Governance, Citizenship and Trust
LIFE SCIENCES
• One Health
• Personalised Medicine & Health
• Regenerative Medicine & Stem Cells
• Science for Life
PATHWAYS TO SUSTAINABILITY
• Towards Industry with Negative Emissions
• Future Food: Pathways towards Healthy Planet Diets
• Transforming Infrastructures for Sustainable Cities
• Water Climate & Future Deltas
Why iRODS as Research Data Management platform
• scalable platform: can be used to manage large/many data collections
  – can manage billions of files, petabytes of data
  – infrastructure/vendor neutral solution
• enforces data policies: supports demonstrable research integrity
  – secures sensitive data
  – auditable controls
• manages metadata alongside the data: facilitates research data workflows
  – metadata based data policy execution decisions
  – data workflow automation
Utrecht University iRODS managed research data
[Charts: zones (internal/external), users, data volume]
• 11 Zones
• 1400 Users
• 180 TB Data
Production instances only, figures are indicative
Our iRODS implementation is called "Yoda": a preconfigured iRODS-based system, delivered and supported as a service
– enhanced with (graphical) user interfaces, policies and rules
User interaction
• portal: PRODS (Apache Web Server)
• network-disk: Davrods (Apache Web Server)
• power-user: iCommands
iRODS API service
UU Data Policies and -services
• configuration: 10,000 lines of rules, 14 custom microservices
• iRODS data integration
Yoda Data compartments
• Research compartments: collaborate
• Vault compartments: deposit / read only
Each data compartment relates to an iRODS group
Yoda Communities ("category")
A community comprises multiple data compartments (Research and Vault)
Per community:
• cost calculation/invoicing
• appointed datamanager(s)
• metadata schema
The community concept is implemented as metadata on iRODS groups
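The idea of a community as metadata on groups can be sketched without an iRODS server. The following is a minimal Python model, assuming each group carries a `category` attribute; the group names, the `category` key and the helper `groups_by_community` are illustrative assumptions, not Yoda's actual rule code.

```python
from collections import defaultdict

# Hypothetical groups: each data compartment maps to one iRODS group,
# and the community is recorded as a metadata attribute on that group.
groups = [
    {"name": "research-geo-project1", "metadata": {"category": "geo"}},
    {"name": "research-geo-project2", "metadata": {"category": "geo"}},
    {"name": "research-youth-cohort", "metadata": {"category": "youth"}},
]


def groups_by_community(groups):
    """Collect group names per community, read from the group metadata."""
    communities = defaultdict(list)
    for group in groups:
        communities[group["metadata"]["category"]].append(group["name"])
    return dict(communities)


communities = groups_by_community(groups)
print(communities)
```

Because the community is only an attribute, per-community settings such as the metadata schema or the appointed datamanager(s) could be looked up by the same key.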
Collaborate during research via the Yoda disk
WebDAV access from anywhere, on any workstation, using Davrods
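Most users would simply mount the Yoda disk in their operating system, but the WebDAV access can also be scripted. The sketch below builds (without sending) a standard WebDAV `PROPFIND` request for a folder listing using Python's standard library. The URL is a hypothetical placeholder, and authentication is deliberately left out.

```python
import urllib.request

# Hypothetical WebDAV endpoint; replace with the address of your own Yoda disk.
YODA_DAV_URL = "https://yoda.example.org/research-example/"


def build_listing_request(url):
    """Build (but do not send) a WebDAV PROPFIND request for a folder listing."""
    return urllib.request.Request(
        url,
        method="PROPFIND",
        headers={"Depth": "1"},  # the folder itself plus its direct children
    )


request = build_listing_request(YODA_DAV_URL)
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` would additionally require valid credentials for the server.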
Data Deposit workflow
Research (data folder) -> Submit -> Approve -> Secured -> Vault (data package + metadata)
• Submit: Researcher requests to deposit
• Approve: Data manager checks that the metadata complies with policies
• Secured: System deposits a copy in the vault
A bypass is possible for communities that have no datamanager role
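The deposit workflow reads as a small state machine. The sketch below is one possible Python model of it, assuming that the bypass for communities without a datamanager means a submitted folder counts as approved immediately; the state names and function names are illustrative, not taken from Yoda's rule base.

```python
from enum import Enum


class DepositState(Enum):
    SUBMITTED = "submitted"  # researcher has requested the deposit
    APPROVED = "approved"    # metadata accepted (or approval bypassed)
    SECURED = "secured"      # system has copied the package to the vault


def submit(has_datamanager):
    """Researcher requests a deposit; approval is skipped without a datamanager."""
    return DepositState.SUBMITTED if has_datamanager else DepositState.APPROVED


def approve(state):
    """Data manager confirms that the metadata complies with the policies."""
    if state is not DepositState.SUBMITTED:
        raise ValueError("only a submitted folder can be approved")
    return DepositState.APPROVED


def secure(state):
    """System deposits a copy of the approved folder in the vault."""
    if state is not DepositState.APPROVED:
        raise ValueError("only an approved folder can be secured")
    return DepositState.SECURED


with_manager = secure(approve(submit(has_datamanager=True)))
without_manager = secure(submit(has_datamanager=False))
print(with_manager, without_manager)
```

Keeping each transition in its own function makes it explicit which role triggers which step.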
FAIR Data Publication workflow
Vault (data package) -> Submit -> Approve -> Published -> DOI + landingpage
• Submit: Researcher requests to publish
• Approve: Data manager checks that the metadata complies with publication policies
• Published: System publishes the metadata and provides internet access to the data if classified as "Open"
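The publication step can be illustrated with a short sketch: the metadata is always published, while the data itself becomes reachable only when the package is classified as "Open". The dictionary layout, the `publish` function and the placeholder DOI are assumptions made for illustration only.

```python
def publish(package, doi):
    """Return a landing-page record for an approved vault package."""
    if not package.get("approved"):
        raise ValueError("publication requires data manager approval")
    return {
        "doi": doi,
        "metadata": package["metadata"],  # metadata is always published
        "data_access": package["classification"] == "Open",
    }


# Hypothetical packages; the DOI strings are placeholders, not real identifiers.
open_package = {
    "approved": True,
    "classification": "Open",
    "metadata": {"title": "Example dataset"},
}
restricted_package = dict(open_package, classification="Restricted")

open_page = publish(open_package, doi="10.XXXX/example-1")
restricted_page = publish(restricted_package, doi="10.XXXX/example-2")
print(open_page["data_access"], restricted_page["data_access"])
```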
'FAIR' Research Data Management using iRODS
• Collaborate safely as a group ("Research" folder)
• Maintain integrity, deposit a folder in the vault
• Allow FAIR reuse, publish a data package
Demonstration
Challenges, issues and lessons learned
- Metadata form interaction with browser: was XML, now adopting JSON
- iRODS 4.1.11 is stable and reliable, except for the delayed rule engine (resolved in 4.2.2+)
- Many components and architectural layers: need to simplify implementation and configuration
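To make the XML-to-JSON move concrete, the sketch below converts a flat XML metadata record into JSON with Python's standard library. The element names and the flat structure are assumptions; real metadata schemas are typically nested and need a richer mapping.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical flat metadata record, as an XML-based form might store it.
xml_metadata = (
    "<metadata>"
    "<Title>Example dataset</Title>"
    "<Creator>A. Researcher</Creator>"
    "</metadata>"
)


def xml_to_json(xml_text):
    """Map each direct child element of the root to a JSON key/value pair."""
    root = ET.fromstring(xml_text)
    record = {child.tag: child.text for child in root}
    return json.dumps(record, indent=2)


json_metadata = xml_to_json(xml_metadata)
print(json_metadata)
```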
Yoda manages data during/after research
• Collaborate safely as a group ("Research" folder)
  -> membership self-managed by researchers
• Maintain integrity, deposit a folder in the vault
  -> metadata can vary per community
  -> datamanager approves deposit
• Allow FAIR reuse, publish a data package
  -> datamanager approves publication
  -> DOI citable data
Yoda is available under GPL license at https://github.com/UtrechtUniversity
Thank you
More info:
Ton Smeele, a.p.m.smeele@uu.nl
Lazlo Westerhof, l.r.westerhof@uu.nl