
ARCLib – Development of Open Source Solution for Long-term Preservation



  1. ARCLib – Development of Open Source Solution for Long-term Preservation
     Martin Lhoták, Library of the Czech Academy of Sciences
     Open Repositories, Hamburg, 11 June 2019

  2. ARCLib – Complex Solution for Long-term Archiving of (Library) Digital Collections
     - Applied research grant from the Ministry of Culture of the Czech Republic
     - Technologies and methodologies for preservation of cultural heritage
     - 2016-2020
     - 850k euro

  3. ARCLib – Goals
     - Development of a complex open source long-term preservation (LTP) system using Archivematica as one of its components
     - Logical preservation methodology for Czech institutions
     - Bit-level preservation methodology

  4. ARCLib – State of archiving in Czech libraries
     - simple file system with TAR or ZIP packages, backups
     - uploads to cloud services (mostly just backups)
     - central (commercial) solution in the National Digital Library project (mainly for the National Library)
     - no reasonable alternative (40 libraries with Kramerius, 15 with DSpace, ...)
     - master copies, access copies, several generations of metadata standards, some metadata without standards
     - central registry of digitization, but no registry of archival packages

  5. ARCLib – inspiration
     Projects
     - CESNET LTP Pilot (testing of Archivematica)
     - NDK – National Digital Library
     - Czech National Archive (Archivematica)
     - Foreign projects (Finland, Germany)
     Systems
     - Archivematica
     - RODA
     - Commercial solutions (Rosetta, Preservica, ...)
     - Custom made solutions (NDK, NDA SVK)

  6. ARCLib – inspiration
     Archivematica
     - open source software
     - rapid development
     - too big and too general ... (Finland experience)
     - dependency, uncertainty
     - it could probably be simpler (we don’t need a universal product; we need a system for a defined environment and type of data)
     - inspiring approaches: microservices, the way ingest is managed

  7. ARCLib – standards
     Data and metadata standards
     - ISO 14721 (OAIS)
     - ISO 16363 (Audit and certification)
     - Data Seal of Approval
     - NDK (National Digital Library) standard
     - Kramerius and DSpace formats
     - Export from the ProArc production system (NDK or wider)

  8. ARCLib – tools
     Available open tools
     - format identification, validation, technical metadata extraction
     - DROID, FIDO, JHOVE, JPYLYZER, etc.
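
To illustrate what format identification means in practice, here is a minimal sketch of signature-based identification, the general idea behind tools such as DROID and FIDO. It does not call any of the tools listed on the slide; the `SIGNATURES` table and the `identify_format` function are hypothetical names introduced only for this example, and real tools rely on far richer signature registries.

```python
from typing import Optional

# Well-known leading byte signatures ("magic numbers") for a few formats.
# This tiny table is an illustrative assumption, not a real registry.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"II*\x00", "image/tiff"),  # little-endian TIFF
    (b"MM\x00*", "image/tiff"),  # big-endian TIFF
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "image/jp2"),
]


def identify_format(header: bytes) -> Optional[str]:
    """Return a MIME type if the header starts with a known signature."""
    for magic, mime in SIGNATURES:
        if header.startswith(magic):
            return mime
    return None  # unknown format: a real workflow would log or reject this
```

In use, the first few dozen bytes of each file in a SIP would be passed to `identify_format`, and files with no match would be flagged for manual review.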

  9. ARCLib – functional requirements

  10. ARCLib – AIP
      An AIP consists of two parts: the provided SIP + metadata
      - partly extracted from the SIP:
        - BibMD – DC + MODS
        - TechMD – type of scanner, date of scanning, operator, data from JHOVE, etc.
      - partly generated by ARCLib on ingest:
        - AdmMD – data provider, workflow, validation log, validation profile, format identification, date, etc.
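
The two-part AIP structure on slide 10 could be modelled roughly as follows. This is a sketch under stated assumptions: the `Aip` class, its field names, and `metadata_origin` are hypothetical and do not reflect the actual ARCLib AIP XML schema; they only mirror the split between metadata extracted from the SIP and metadata generated on ingest.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Aip:
    """Hypothetical in-memory view of an AIP (not the real ARCLib schema)."""

    sip_path: str  # the SIP exactly as provided
    bib_md: Dict[str, str] = field(default_factory=dict)   # extracted: DC + MODS
    tech_md: Dict[str, str] = field(default_factory=dict)  # extracted: scanner, scan date, operator, JHOVE data
    adm_md: Dict[str, str] = field(default_factory=dict)   # generated by the archive on ingest

    def metadata_origin(self) -> Dict[str, List[str]]:
        """Group metadata keys by where they came from."""
        return {
            "extracted_from_sip": sorted(list(self.bib_md) + list(self.tech_md)),
            "generated_on_ingest": sorted(self.adm_md),
        }
```

Keeping the three metadata groups separate makes it easy to see which values can be re-derived from the SIP and which exist only in the archive's own record.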

  11. ARCLib – Ingest
      Creation of AIP packages from SIP packages
      - antivirus and MD5 checks
      - validation according to validation profiles
      - extraction of metadata from the XML of the SIP
      - identification of formats in the SIP
      - creation of the ARCLib AIP and transfer to persistent storage
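
The MD5 check in the ingest steps above can be sketched as a simple fixity comparison. The function names `md5_of_file` and `verify_fixity` are assumptions made for this example and are not taken from ARCLib; only the standard-library `hashlib` calls are real.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, expected_md5: str) -> bool:
    """Return True if the file matches the checksum supplied with the SIP."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A failed check would normally stop the ingest before validation and metadata extraction, since later steps assume the SIP arrived intact.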

  12. ARCLib – Data Management
      Database of AIP packages (location, BibMD, TechMD, AdmMD)
      - indexing of the ARCLib AIP XML
      - indexing of the complete content of the SIP
      - editing of the ARCLib AIP XML
      - export of DIP
      - API

  13. ARCLib – Archival Storage
      Archival Storage is a complex service for bit-level preservation that enables replication of data across multiple geographical locations, using advanced storage technologies.
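
Replication with verification, as described on slide 13, might look like the sketch below. It is illustrative only: local directories stand in for geographically separate storage locations, SHA-256 is chosen here as an assumption (the slides mention only MD5, and only for ingest), and `replicate` is a hypothetical function, not part of ARCLib.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict, Iterable


def sha256_of_file(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(package: Path, locations: Iterable[Path]) -> Dict[Path, bool]:
    """Copy a package to every location and verify each replica's checksum."""
    source_digest = sha256_of_file(package)
    results: Dict[Path, bool] = {}
    for location in locations:
        location.mkdir(parents=True, exist_ok=True)
        replica = location / package.name
        shutil.copy2(package, replica)
        # A replica counts as good only if its digest matches the source.
        results[replica] = sha256_of_file(replica) == source_digest
    return results
```

Re-running the checksum comparison on a schedule, not just at copy time, is what turns plain replication into bit-level preservation.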

  14. ARCLib – Archival Storage

  15. ARCLib – Administration
      - Workflow configuration
      - Infrastructure control
      - Registry of users, their roles, profiles of SIPs, data providers, validation profiles and storage locations
      - Communication with the database
      - Control of storage capacity
      - Administration of jobs

  16. ARCLib – Access
      - ARCLib is a backend application not intended for end users
      - ARCLib doesn’t handle any access policies
      - export from AIP to DIP is 1:1

  17. ARCLib – Preservation Planning
      - format database – registry of used formats
      - registry of profiles / workflows
      - events in the archive

  18. ARCLib
      - ARCLib is a solution for logical and bit-level preservation/protection of digital data.
      - ARCLib doesn’t have a pre-ingest or deposit module; it doesn’t involve conversion to SIP.
      - ARCLib is a system for management of archival packages.
      - ARCLib is a dark archive. It is not a repository for end users.

  19. ARCLib – release schedule
      - 2018-2019: prototype testing
      - 2020: full final version
      - https://arclib.cz/

  20. Thank you for your attention
      Martin Lhoták
      Lhotak@knav.cz
      http://www.knav.cz
