ARCLib – Development of an Open Source Solution for Long-term Preservation
Martin Lhoták, Library of the Czech Academy of Sciences
11. 6. 2019, Open Repositories, Hamburg
ARCLib
Complex Solution for Long Term Archiving of (Library) Digital Collections
• Applied research grant from the Ministry of Culture of the Czech Republic
• Technologies and methodologies for preservation of cultural heritage
• 2016-2020
• 850k Euro
ARCLib Goals
• Development of a comprehensive open source long-term preservation (LTP) system using Archivematica as one of its components
• Logical preservation methodology for Czech institutions
• Bit-level preservation methodology
ARCLib – State of archiving in Czech libraries
• simple file system with TAR or ZIP packages, backups
• uploads to cloud services (mostly just backups)
• central (commercial) solution in the National Digital Library project (mainly for the National Library)
• no reasonable alternative (40 libraries with Kramerius, 15 with DSpace,...)
• master copies, access copies, several generations of metadata standards, some metadata without standards
• central registry of digitization, but no registry of archival packages
ARCLib – inspiration
Projects:
• CESNET LTP Pilot (testing of Archivematica)
• NDK – National Digital Library
• Czech National Archive (Archivematica)
• Foreign projects (Finland, Germany)
Systems:
• Archivematica
• RODA
• Commercial solutions (Rosetta, Preservica,...)
• Custom made solutions (NDK, NDA SVK)
ARCLib – inspiration
Archivematica
• open source software
• rapid development
• too big and too general ... (Finland experience)
• dependency, uncertainty
• it could probably be simpler (we don't need a universal product; we need a system for a defined environment and type of data)
• inspiring approaches – microservices, the way ingest is managed
ARCLib – standards
Data and metadata standards
• ISO 14721 (OAIS)
• ISO 16363 (Audit and certification)
• Data Seal of Approval
• NDK (National Digital Library) standard
• Kramerius and DSpace formats
• Export from the ProArc production system (NDK or wider)
ARCLib – tools
Available open tools
• format identification, validation, technical metadata extraction
• DROID, FIDO, JHOVE, Jpylyzer, etc.
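As a rough illustration of what signature-based format identification means, the sketch below matches the first bytes of a file against a small table of well-known magic numbers. It is a toy example only: the table and the function name `identify_format` are assumptions made for illustration, and this is not how DROID or FIDO are implemented or invoked (those tools work from much richer signature registries).

```python
# Toy signature table: (leading bytes, human-readable format name).
# Only a handful of widely documented magic numbers are listed here.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "JPEG 2000 (JP2)"),
]


def identify_format(header: bytes) -> str:
    """Return the name of the first signature that the header starts with."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


if __name__ == "__main__":
    # Hypothetical usage: pass the first bytes read from a file.
    print(identify_format(b"%PDF-1.7\n"))
```

A real ingest would also validate the file against its format (the role of JHOVE or Jpylyzer), which a prefix match like this cannot do.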
ARCLib – functional requirements
ARCLib AIP
An AIP consists of two parts: the provided SIP + metadata
Metadata partly extracted from the SIP:
• BibMD – DC + MODS
• TechMD – type of scanner, date of scanning, operator, data from JHOVE, etc.
and partly generated by ARCLib on ingest:
• AdmMD – data provider, workflow, validation log, validation profile, format identification, date, etc.
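The two-part structure of the AIP can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, field names, and the `build_aip` helper are hypothetical and do not reflect the actual ARCLib AIP XML schema.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AipMetadata:
    # Extracted from the SIP (hypothetical field names).
    bib_md: dict[str, Any] = field(default_factory=dict)   # DC + MODS
    tech_md: dict[str, Any] = field(default_factory=dict)  # scanner, operator, JHOVE data
    # Generated by the archive on ingest.
    adm_md: dict[str, Any] = field(default_factory=dict)   # provider, workflow, logs


@dataclass
class Aip:
    sip_path: str          # the provided SIP, kept unchanged
    metadata: AipMetadata  # extracted plus generated metadata


def build_aip(sip_path: str, extracted: dict, generated: dict) -> Aip:
    """Combine the SIP reference with extracted and generated metadata."""
    metadata = AipMetadata(
        bib_md=dict(extracted.get("bib_md", {})),
        tech_md=dict(extracted.get("tech_md", {})),
        adm_md=dict(generated),
    )
    return Aip(sip_path=sip_path, metadata=metadata)


if __name__ == "__main__":
    aip = build_aip(
        "sip/0001",
        {"bib_md": {"title": "Example"}, "tech_md": {"operator": "op1"}},
        {"data_provider": "example-library", "workflow": "default"},
    )
    print(aip)
```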
ARCLib Ingest
Creation of AIP packages from SIP packages
• antivirus and MD5 checksum checks
• validation according to validation profiles
• extraction of metadata from the XML of the SIP
• identification of formats in the SIP
• creation of the ARCLib AIP and transfer to persistent storage
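The MD5 check in the ingest steps above can be sketched as follows. This is a minimal illustration, assuming the expected checksums are already available as a mapping from relative file path to MD5 hex digest; the function names and the manifest shape are hypothetical and not taken from ARCLib.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(sip_dir: Path, expected: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose MD5 differs."""
    failures = []
    for relative_path, expected_md5 in expected.items():
        candidate = sip_dir / relative_path
        if not candidate.is_file() or md5_of(candidate) != expected_md5.lower():
            failures.append(relative_path)
    return failures


if __name__ == "__main__":
    # Hypothetical usage: an empty manifest trivially passes.
    print(verify_fixity(Path("."), {}))
```

An empty result means every listed file is present and unchanged; a workflow would typically stop the ingest and log the failing paths otherwise.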
ARCLib Data Management
• Database of AIP packages (location, BibMD, TechMD, AdmMD)
• indexing of the ARCLib AIP XML
• indexing of the complete content of the SIP
• editing of the ARCLib AIP XML
• export of DIP
• API
ARCLib Archival Storage
Archival Storage is a comprehensive service for bit-level preservation
• enabling replication of data across multiple geographic locations
• using advanced technologies to store data
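The idea of replicating a package to several locations and confirming each copy can be sketched as below. This is an illustration only: local directories stand in for geographically separate storage, SHA-256 is an assumed checksum choice, and the function names are hypothetical rather than part of the ARCLib Archival Storage service.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file (read fully; fine for a sketch)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def replicate(package: Path, locations: list[Path]) -> dict[str, bool]:
    """Copy the package to every location and report whether each copy matches."""
    reference = sha256_of(package)
    results = {}
    for location in locations:
        location.mkdir(parents=True, exist_ok=True)
        copy = location / package.name
        shutil.copy2(package, copy)
        # A copy counts as good only if its checksum equals the source checksum.
        results[str(location)] = sha256_of(copy) == reference
    return results
```

Re-running the checksum comparison periodically against the stored copies is the usual way to detect silent corruption at any one location.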
ARCLib Administration
• Workflow configuration
• Infrastructure control
• Registry of users and their roles, SIP profiles, data providers, validation profiles and storage locations
• Communication with the database
• Control of storage capacity
• Administration of jobs
ARCLib Access
• ARCLib is a backend application not intended for end users
• ARCLib does not handle access policies
• export from AIP to DIP is 1:1
ARCLib Preservation Planning
• format database – registry of used formats
• registry of profiles / workflows
• events in the archive
ARCLib
• ARCLib is a solution for logical and bit-level preservation/protection of digital data.
• ARCLib doesn't have a pre-ingest or deposit module – it doesn't involve conversion to SIP.
• ARCLib is a system for management of archival packages.
• ARCLib is a dark archive. It is not a repository for end users.
ARCLib release schedule
• 2018-2019 prototype testing
• 2020 full final version
https://arclib.cz/
Thank you for your attention
Martin Lhoták
Lhotak@knav.cz
http://www.knav.cz