An EOSC-hub & EUDAT service: Addressing sustainable long-term preservation wall for scientific data: the European Trusted Digital Repository (ETDR) service
2018/10/08
Marion MASSOL (CINES)
EOSC-hub receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 777536.
« Raiders of the Lost Data »
Once upon a time, the computer engineer of the lab received a mission of the highest importance…
To find some expensive scientific digital data that was produced in… 1998!
… he found a floppy disk that seemed to be undamaged!!
… But with very little information on its label…
Was it the right floppy disk???
… Finding a compatible hardware reader was still left to do…
- Phew!
He opened the file with the latest version of MS Word…
And then?… And then?… And?…
He saw the Microsoft online knowledge base…
And…
Method #3: « Contact your system administrator »!
Finally, he found a solution to read this damned file…
… With an acceptable software / OS pair…
And the scientific content appeared…
He fetched the researcher and showed him the successful result of the mission.
And the answer was…
- And MY formatting! These IT guys are unprofessional!!! Definitely, we can’t trust you!!!
THE END
4 main risks:
- Media corruption: bitstream alteration
- Hardware obsolescence
- Software evolution – file format obsolescence
- Lack of documentation & metadata
2 main strategies:
- Procrastination
- Implementation
10/10/2018
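The first risk, bitstream alteration, is the one that lends itself most directly to an automated check. The following is a minimal sketch, not the ETDR implementation: it assumes a digest was recorded when the file was deposited and compares it with a freshly computed one. The function names `sha256_of` and `is_unaltered` are hypothetical, and SHA-256 is just one possible choice of algorithm.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path: Path, recorded_digest: str) -> bool:
    """True if the file still matches the digest recorded at deposit time.

    `recorded_digest` is assumed to be a hex string stored alongside the
    file's metadata (a hypothetical convention for this sketch).
    """
    return sha256_of(path) == recorded_digest.lower()
```

A check like this only detects alteration; recovering the data still depends on having another intact copy.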
Data curation process
- Metadata standards: Dublin Core; EAD/ISAD(G)/ISAAR; community schemas
- PID: Handle/EPIC/ARK/DOI
- Preservation strategy & policy: technological watch processes; file format policy + FF validation process + emulation + logical migration
- Physical migration
- Exchange protocols: DEPIP - ISO 20614; SWORD; SEDA
- Accreditation: CoreTrustSeal/DSA
- Certification: ISO 16363; SIAF certification; HADS (sensitive data); DIN 31644
- Storage strategy: regular checks on all copies; 3+ copies on 2+ technologies; 1+ copy on site >300 km; 1+ copy on site >2000 km
- Policy on digital signature
- SHA-512/SHA-256/MD5/…
- OAIS (ISO 14721)
- PAIMAS
- PAIS
- P2A
- Harvesting process
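The storage strategy figures on this slide (3+ copies on 2+ technologies, 1+ copy on a site >300 km away, 1+ copy on a site >2000 km away) can be read as a small set of rules. The sketch below checks a list of copies against those rules. It is an illustration under stated assumptions: the `Copy` class, the `policy_violations` function, and the idea that distance is measured from a primary site are all hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    site: str
    technology: str      # e.g. "disk" or "tape" (illustrative values)
    distance_km: float   # assumption: distance from the primary site


def policy_violations(copies: list[Copy]) -> list[str]:
    """Return the storage-strategy rules that the given copies do not satisfy."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.technology for c in copies}) < 2:
        problems.append("fewer than 2 storage technologies")
    if not any(c.distance_km > 300 for c in copies):
        problems.append("no copy on a site more than 300 km away")
    if not any(c.distance_km > 2000 for c in copies):
        problems.append("no copy on a site more than 2000 km away")
    return problems
```

For example, one disk copy on the primary site plus tape copies at 450 km and 2500 km would satisfy all four rules, and dropping the farthest copy would break two of them.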
Trusted Digital Repository
Findable, Accessible, Interoperable, Reusable
Trusted Digital Repository: Findable, Accessible, Interoperable, Reusable
And concretely…
[Architecture diagram of the TDR. Labels on the diagram: Community A and the wider community, each with an interface (plus a negotiated protocol); transfer entity; ingest entity (metadata management, quality checks); storage + data management entity (storage, PID, preservation processes); access entity (by default, limited direct access); specific entities for business processes (file format conversion over time, etc.), specific quality tools and specific treatments; interface to access tools such as B2FIND; interface to HPC, EGI facilities, etc.; administration + data preservation planning entities (data curation policies, restitution tools, etc.).]
ETDR (European Trusted Digital Repository)
The eTDR could cover 3 main possible use cases:
- Use case #1: a single TDR provided by a EUDAT SP
- Use case #2: a single ingest point into the eTDR + a few storage SPs (TDR operated by the community itself)
- Use case #3: a single ingest point into the eTDR + a few storage SPs (TDR operated by the EUDAT CDI)
The eTDR instance for community X is a technical & organizational solution:
- Certified
- Sustainable
- For the long-term preservation of, and access to, scientific data
A use case!... A use case!... A use case!!!...