Bring out yer SIPs: An Introduction to Digital Preservation with Archivematica


  1. Bring out yer SIPs: An Introduction to Digital Preservation with Archivematica
     iSkills Workshop, February 9, 2018
     Grant Hurley, Digital Preservation Librarian, Scholars Portal

  2. Agenda
     - Basic concepts in digital preservation
     - Introduction to Archivematica
     - Preparing transfers + Demo
     - Processing transfers + Demo
     - Looking at AIPs
     - Thinking about DIPs
     - Processing activity

  3. What’s this “digital preservation” thing? Uh oh

  4. - Digital objects (both born digital and digitized) need active management to ensure ongoing access
     - Quickly changing technological norms create risks that must be managed from the object’s creation
     - Digital preservation is a set of theories and practices that work to keep digital objects authentic, available and reliable over time.

  5. - Identity: what it is; format identification, descriptive information, provenance, etc.
     - Integrity: establishing that a file remains unaltered over time

  6. Identity: File formats
     filename : '/Users/hurleyg/Documents/Teaching/iSkills/CheckYourBits.jpg'
     filesize : 582231
     modified : 2018-01-24T15:50:08-05:00
     errors   :
     matches  :
       - ns      : 'pronom'
         id      : 'fmt/43'
         format  : 'JPEG File Interchange Format'
         version : '1.01'
         mime    : 'image/jpeg'
         basis   : 'extension match jpg; byte match at [[[0 14]] [[582229 2]]]'
         warning :
     - File format identifications/descriptions in PRONOM (UK National Archives)
     - ID = PRONOM identifier
     - Archivematica uses Siegfried or FIDO
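
The `basis` line in the output on slide 6 reports a byte match at the start and at the end of the file. As a rough illustration of that idea, here is a minimal Python sketch that checks a few well-known JPEG/JFIF markers. It is a simplified stand-in, not the PRONOM signature and not how Siegfried or FIDO work internally; the function names `looks_like_jfif` and `identify` are hypothetical.

```python
from pathlib import Path

JPEG_SOI = b"\xff\xd8"   # start-of-image marker at offset 0
JPEG_EOI = b"\xff\xd9"   # end-of-image marker in the last two bytes
JFIF_TAG = b"JFIF\x00"   # identifier stored at offset 6 of a JFIF file


def looks_like_jfif(data: bytes) -> bool:
    """Return True if the bytes start and end like a JFIF-flavoured JPEG."""
    return (
        len(data) >= 13
        and data.startswith(JPEG_SOI)
        and data[6:11] == JFIF_TAG
        and data.endswith(JPEG_EOI)
    )


def identify(path: str) -> str:
    """Very rough identification: byte match first, extension as a fallback."""
    p = Path(path)
    if looks_like_jfif(p.read_bytes()):
        return "image/jpeg (byte match)"
    if p.suffix.lower() in {".jpg", ".jpeg"}:
        return "image/jpeg (extension match only)"
    return "unknown"
```

A byte match is stronger evidence than an extension match, which is why the sketch reports the two cases separately.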

  7. Integrity: The almighty checksum
     md5 checksum = 2c93b97c3d7e53dab9161e389c98465c
     md5 checksum = 1148058955697062ca583d0cc0474322
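
To make the checksum idea concrete, here is a minimal Python sketch using the standard library's `hashlib`. The helper names `md5_of_file` and `is_unaltered` are hypothetical, and this is only an illustration of a fixity check, not Archivematica's own code.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the MD5 checksum of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path: str, recorded_checksum: str) -> bool:
    """Compare the current checksum with one recorded earlier."""
    return md5_of_file(path) == recorded_checksum.lower()
```

Changing even a single byte of the file produces a different checksum, so a mismatch against the recorded value signals that the file has been altered.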

  8. The even more almighty OAIS

  9. Other important concepts
     - Identification: determining a particular file’s format and version
     - Characterization: extracting metadata about the file’s intrinsic properties, for example the sample rate and number of channels of an MP3 file
     - Validation: determining whether a file is well-formed and valid according to its specification
     - Normalization: converting a file from a source format to a standardized format

  10. What is Archivematica?

  11. What it does
      - Creates well-formed data packages for long-term preservation and access
      - Takes a pre-structured transfer from a data source
      - Makes a Submission Information Package (SIP)
      - Transforms the SIP into an Archival Information Package (AIP)
      - Can also create a Dissemination Information Package (DIP) for access
      - Each of these functions has configurable tasks associated with it

  12. What it does
      - Stores and applies preservation policies for normalization, access copies, etc.
      - Allows access to, and deletion of, AIPs
      - Assists in ingest of descriptive metadata and rights information
      - Manages data flows in and out of the system through a separate Storage Service module
      - Can connect to access systems for DIP deposit (mostly just AtoM)
      - Can be fully automated

  13. Where it came from
      - Standards for digital preservation were developed in the late 1990s and early 2000s, but there was no easy way of applying them
      - UNESCO released a 2007 report advocating for an open source digital preservation system
      - Artefactual Systems started up by creating the Access to Memory (AtoM) system for archival description
      - Various small open source tools were also being developed by others for particular tasks
      - Artefactual developed Archivematica beginning in 2008
      - Beta release in 2012; current release is 1.6.1 (2017)

  14. What it is
      - Modular workflow created using a microservices design pattern
      - Data follows a structured, chained pathway, where the result of one step triggers the next step
      - Components can be replaced or turned off/on
      - Accessible through the browser
      - Requires a virtual machine to run on (Ubuntu or CentOS)
      - Runs in LAMP environment (Linux, Apache, MySQL, PHP)
      - Open source, developed by Artefactual Systems staff
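
The chained pathway described on slide 14 can be pictured with a toy Python sketch: each step receives the output of the previous one, and any step can be switched off. The step names and the `run_chain` helper are hypothetical and only illustrate the pattern; they are not Archivematica's actual microservices.

```python
from typing import Callable, Dict, List, Tuple

Package = Dict[str, object]
Step = Callable[[Package], Package]


def virus_scan(package: Package) -> Package:
    # Placeholder step: record that the scan happened.
    return {**package, "virus_scanned": True}


def identify_formats(package: Package) -> Package:
    # Placeholder step: record that formats were identified.
    return {**package, "formats_identified": True}


def normalize(package: Package) -> Package:
    # Placeholder step: record that normalization happened.
    return {**package, "normalized": True}


def run_chain(package: Package, chain: List[Tuple[Step, bool]]) -> Package:
    """Run each enabled step on the output of the previous one."""
    for step, enabled in chain:
        if enabled:
            package = step(package)
    return package


# Normalization is switched off here to show that components are optional.
chain = [(virus_scan, True), (identify_formats, True), (normalize, False)]
result = run_chain({"name": "photos-transfer"}, chain)
```

Because every step has the same input and output shape, a step can be replaced or disabled without touching the others.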

  15. What it isn’t
      - A storage system
      - An access system
      - Easy to install or maintain in production
      - User friendly
      - A complete digital archives workflow

  16. Who uses it
      Largely, memory institutions (libraries, archives, galleries, museums) with digital collections that need preserving
      - Libraries:
        - Digitized/born-digital content in institutional repositories
        - Research data management (several current projects are trying to develop Archivematica’s capacity in this domain)
        - Digital collections (books, journals, maps, etc.)
      - Archives:
        - Digitized collections (photographs, audio-visual materials, etc.)
        - Born-digital donations (all sorts of stuff)
        - Private papers/collections
        - Records from corporate bodies, institutions, etc.

  17. The Workflow
      - Pre-Transfer (*not in Archivematica)
        - Selection of objects to preserve
        - Metadata preparation
        - Packaging for transfer
      - Transfer
        - Generates METS file to be written to
        - Virus scan
        - File ID, characterization, validation
      - Backlog
        - You can send something here if you don’t want to continue processing it
      - Appraisal
        - File format view/analysis
        - Selection for retention
        - ID sensitive data
      - Ingest
        - Normalize files
        - Create & store AIP/DIP
      - Storage & Access (*linked to by Archivematica)
        - Store in location
        - Send access copies to other systems

  18. Preparing transfers

  19. Steps
      - Determine content and structure (1 SIP = 1 AIP = fonds, series, item? Or a section of one of these?)
      - Gather and structure metadata (next slide)
      - Gather submission documentation (not in demo)
      - Package and structure for ingest
      - All data needs to be in a directory, at minimum

  20. Metadata
      - Descriptive metadata
        - Uses simple Dublin Core as the key standard; other information is recorded as ‘Custom’
        - Transfer-level metadata can be added through the interface or imported
        - Item-level metadata must be imported via CSV file
      - Rights metadata
        - Mapped to PREMIS
        - Same import structure as above
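
As an illustration of item-level import, here is a small Python sketch that builds a Dublin Core CSV. The column names (`filename`, `dc.title`, and so on), the `objects/` path prefix, and the sample rows are assumptions based on my understanding of Archivematica's import convention and are not taken from the slides; check the documentation for your version before relying on them. The helper `build_metadata_csv` is hypothetical.

```python
import csv
import io
from typing import Dict, List

# Assumed column layout: one row per file, Dublin Core elements as dc.* columns.
FIELDS = ["filename", "dc.title", "dc.creator", "dc.date"]


def build_metadata_csv(items: List[Dict[str, str]]) -> str:
    """Serialize item-level descriptive metadata as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(item)  # missing columns are written as empty cells
    return buffer.getvalue()


# Hypothetical sample rows, for illustration only.
sample_items = [
    {"filename": "objects/photo-001.jpg", "dc.title": "Library entrance", "dc.date": "2018-01-24"},
    {"filename": "objects/photo-002.jpg", "dc.title": "Reading room", "dc.creator": "Unknown"},
]
metadata_csv = build_metadata_csv(sample_items)
```

Keeping one row per file makes it easy to check that every object in the transfer has a matching description before packaging.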

  21. Demo
      - Set of photos + metadata CSV file
      - Bagging using Python script
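
The script used in the demo is not reproduced in these slides. As a stand-in, here is a minimal standard-library sketch of what bagging involves under the BagIt layout: a `data/` payload directory, a `bagit.txt` declaration, and a checksum manifest. The function name `make_simple_bag` is hypothetical, and a production workflow would more likely use an established BagIt library, which also writes tag files such as `bag-info.txt`.

```python
import hashlib
import shutil
from pathlib import Path


def make_simple_bag(source_dir: str, bag_dir: str) -> Path:
    """Copy source_dir into bag_dir/data and write minimal BagIt tag files."""
    source, bag = Path(source_dir), Path(bag_dir)
    payload = bag / "data"
    shutil.copytree(source, payload)  # creates bag_dir and data/ as needed

    # Bag declaration: version and tag-file encoding.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one "<checksum>  <relative path>" line per file.
    lines = []
    for path in sorted(p for p in payload.rglob("*") if p.is_file()):
        checksum = hashlib.md5(path.read_bytes()).hexdigest()
        lines.append(f"{checksum}  {path.relative_to(bag).as_posix()}\n")
    (bag / "manifest-md5.txt").write_text("".join(lines), encoding="utf-8")
    return bag
```

The manifest is what lets the receiving system confirm that every file arrived unaltered.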

  22. Processing transfers

  23. Demo
      - Same materials as before
      - Uploaded to transfer source on Ontario Library Research Cloud (OLRC)
      - Process using standard workflow and settings
      - Briefly demo backlog/appraisal tabs
      - Store AIP on OLRC
      - No DIP

  24. Looking at AIPs

  25. AIP Contents
      - METS file
      - Originals + normalized copies in ‘objects’ folder
      - Materials that made up original transfer
      - Logs
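
One way to look inside an AIP is to read its METS file and list the files it describes. The sketch below uses Python's standard `xml.etree.ElementTree` with the METS and XLink namespaces. The sample document is hand-written for illustration (file names and `USE` values are assumptions), and a real Archivematica METS file is far larger; `list_mets_files` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

NS = {
    "mets": "http://www.loc.gov/METS/",
    "xlink": "http://www.w3.org/1999/xlink",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hand-written, heavily simplified sample (assumed structure, illustration only).
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                            xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="original">
      <mets:file ID="file-1">
        <mets:FLocat LOCTYPE="OTHER" xlink:href="objects/photo-001.jpg"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="preservation">
      <mets:file ID="file-2">
        <mets:FLocat LOCTYPE="OTHER" xlink:href="objects/photo-001.tif"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def list_mets_files(mets_xml: str) -> List[Tuple[str, Optional[str]]]:
    """Return (file group USE, file location) pairs from a METS document."""
    root = ET.fromstring(mets_xml)
    results = []
    for group in root.findall("mets:fileSec/mets:fileGrp", NS):
        use = group.get("USE", "")
        for locat in group.findall("mets:file/mets:FLocat", NS):
            results.append((use, locat.get(XLINK_HREF)))
    return results


files = list_mets_files(SAMPLE_METS)
```

Grouping by `USE` separates the originals from the normalized copies that sit alongside them in the ‘objects’ folder.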

  26. Thinking about DIPs

  27. DIPs
      - Set of normalized files for access, created with access policies in the preservation planning module
      - Archivematica can connect to AtoM to deposit a DIP to an existing description
      - Some metadata can be transferred over, which reduces description work, but only at the transfer/item level

  28. Activity time!
