
The Role of Trustworthy Digital Repositories in Sustainability



  1. The Role of Trustworthy Digital Repositories in Sustainability
     David Giaretta
     david@giaretta.org
     www.giaretta.org and www.iso16363.org
     Big Data to Knowledge AHM & Open Data Science Symposium
     Bethesda, MD, 29 Nov – 1 Dec 2016

  2. Interoperability, Re-use, Preservation and Sustainability
     [Diagram labels: Exploitation/Re-use, VALUE, Replication of results, Usability, Interoperability, Preservation, Sustainability]
     • What do the bits mean?
     • Need “metadata”
     • What kinds? How much of each kind?
     • EU Commissioner for the Digital Agenda said: “Data is the new Gold” but
       – Gold is precious because it is rare, and does not combine
       – Data is precious because there is so much and it becomes more valuable when it is combined

  3. Digitally encoded information – 1’s and 0’s
     • BITS: 01001110 01001101 01010001 01001101 01010000 01001010 00100000 00100000
     • HEX: 4e 4d 51 4d 50 4a 20 20 (Example: “ca fe ba be” at start indicates Java class file)
     • Two IEEE 754 32 bit real numbers: 8.6116461E8 1.35644119E10 (assuming “big-endian”)
     • Two 32 bit integers: 164211241 168379396
     • Actually... ASCII characters: NMQMPJ
     • What does this mean? It was my flight reference
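
The point of this slide, that the same bits support several equally valid readings, can be tried out directly. The following is a minimal Python sketch, not part of the talk: the function names are hypothetical, the byte string is copied from the HEX line above, and big-endian byte order is assumed as on the slide.

```python
import struct

# The eight bytes from the slide, written as hex.
RAW = bytes.fromhex("4e4d514d504a2020")


def interpretations(data: bytes) -> dict:
    """Return several readings of the same 8 bytes (hypothetical helper)."""
    return {
        "bits": " ".join(f"{b:08b}" for b in data),
        "hex": data.hex(" "),
        # Two IEEE 754 single-precision numbers, big-endian.
        "float32_be": struct.unpack(">2f", data),
        # Two signed 32-bit integers, big-endian.
        "int32_be": struct.unpack(">2i", data),
        # Eight 7-bit ASCII characters.
        "ascii": data.decode("ascii"),
    }


def looks_like_java_class(data: bytes) -> bool:
    """Check for the 'ca fe ba be' marker mentioned on the slide."""
    return data[:4] == bytes.fromhex("cafebabe")


if __name__ == "__main__":
    for name, value in interpretations(RAW).items():
        print(f"{name:>10}: {value!r}")
    print("Java class file?", looks_like_java_class(RAW))
```

Nothing in the bytes themselves says which reading is intended; that knowledge has to be recorded somewhere else. Running the sketch is also one way to check the figures quoted on the slide against the raw bytes.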

  4. …semantics…
     • Can anyone guess what this table means?
     • Could be Findable and Accessible - encoded as a Comma Separated Value (CSV) file in ASCII or Unicode, or encoded with XML markup
     [Table column labels: Longitude, Latitude, Ozone, Date]
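
To make the gap between "findable and accessible" and "understandable" concrete, here is a small Python sketch. It is an illustration only: the sample row is invented, and the units and conventions in SEMANTICS are assumptions, since the slide gives column labels but no units.

```python
import csv
import io

# Invented sample row, for illustration only; not data from the talk.
CSV_TEXT = "Longitude,Latitude,Ozone,Date\n-77.1,39.0,300,2016-11-29\n"

# Hypothetical semantic description. Every unit below is an assumption.
SEMANTICS = {
    "Longitude": {"type": float, "unit": "degrees east (assumed)"},
    "Latitude": {"type": float, "unit": "degrees north (assumed)"},
    "Ozone": {"type": float, "unit": "Dobson units (assumed)"},
    "Date": {"type": str, "unit": "ISO 8601 calendar date (assumed)"},
}


def read_rows(text: str) -> list:
    """Parse the CSV structure only: every value comes back as a string."""
    return list(csv.DictReader(io.StringIO(text)))


def apply_semantics(rows: list, semantics: dict) -> list:
    """Convert each value using the type recorded in the description."""
    return [
        {name: semantics[name]["type"](value) for name, value in row.items()}
        for row in rows
    ]


if __name__ == "__main__":
    rows = read_rows(CSV_TEXT)
    print(rows[0])
    for name, value in apply_semantics(rows, SEMANTICS)[0].items():
        print(f"{name}: {value!r} [{SEMANTICS[name]['unit']}]")
```

The CSV parser recovers the structure without any help, but the types, units and conventions exist only in the separate description, which is the kind of information the file alone does not carry.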

  5. OAIS (ISO 14721) and digital preservation
     • Reference Model for Open Archival Information System (OAIS) provides a very general approach
     • OAIS approach to digital preservation:
       – covers all types of digitally encoded information
       – provides a way to test whether preservation is successful
       – does not require seeing into the future
       – does require transparency: be clear what is being promised
     • but does not require “open access”
     • Very widely accepted and provides the basis for pretty well all work in digital preservation
     • OAIS provides a good basis for certification
     • Available free from https://public.ccsds.org/Pubs/650x0m2.pdf

  6. Preserving digitally encoded information
     • Using or understanding the bits requires what OAIS calls “Representation Information”: anything needed to allow the data to be interpreted by software or people. This certainly includes semantics and many other things
     • Additional things such as software which are readily available now may not be available in future
     • If the bits are unchanged we can keep hashes and be pretty sure of authenticity
     • If we have to change the bits, e.g. transform to another format, then
       – Evidence of Authenticity needs care
       – Probably needs other software etc
     • It may be that the information must be handed over
       – To a different system and/or a different organisation
     • Need to take care of the details which tend to be ignored
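
The hash-based check mentioned on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions: the slide does not name a hash algorithm, so SHA-256 is an arbitrary choice here, and the function names and sample content are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def bits_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded earlier."""
    return sha256_hex(data) == recorded_digest


# Hypothetical content and the digest recorded when it was received.
original = b"Longitude,Latitude,Ozone,Date\n"
recorded = sha256_hex(original)

# A format transformation: same information, different bits.
transformed = original.replace(b",", b"\t")

if __name__ == "__main__":
    print(bits_unchanged(original, recorded))     # True
    print(bits_unchanged(transformed, recorded))  # False
```

The second check fails even though nothing was lost in the transformation, which is why the slide says that evidence of authenticity needs more care once the bits have to change: a hash can only vouch for unchanged bits.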

  7. Partial Representation Information Network for MERIS Level 2 data

  8. Role of people (and automated systems)
     • Creation of data and capture/creation of the metadata required for use/exploitation now and into the future
     • Follow “Active” Data Management Plans (RDA and CCSDS/ISO)
     • Funding, Management and Operation of the repository
     • Defines the “Designated Community”, e.g. people who understand a particular sub-discipline
     • Undertakes preservation activities for the data, ensuring that the data will be usable by members of the Designated Community despite changes in h/w, s/w, environment etc
     • Use the data (including by the Designated Community)
     • Exploit and create value from the data
     • Judge the value of the data

  9. Many types of Audit and Certification
     • ISO 16363 focuses on keeping the Information understandable / usable
       – www.iso16363.org
       – based on OAIS concepts, including usability
       – 100+ metrics covering all aspects of the repository to ensure the auditor looks at the details
       – uses the ISO certification process on which our lives depend in so many areas, e.g. medical equipment, food safety, airlines, automobiles etc. - 3rd party visits and evaluation
     • ISO 27000 type audits focus on keeping the bits safe in the context of the needs of the organisation
       – the information is an asset of the business; what happens after the organisation ceases to exist is of no concern
       – Security certification may be needed for any information that can be used to identify an individual
     • DIN 31644
       – audit and certification process not clear
     • ISO 15489 – Records Management
       – No formal audit process
     • World Data System and Data Seal of Approval
       – Small set (16) of metrics, not detailed
       – Recognised as much “lower” than ISO 16363 (DSA as “bronze” and ISO 16363 as “gold”)

  10. ISO Standards for certification
      • ISO 16363: Audit and Certification of Trustworthy Digital Repositories
        – Available free from https://public.ccsds.org/Pubs/652x0m1.pdf
      • ISO 16919: Requirements For Bodies Providing Audit And Certification of Trustworthy Digital Repositories
        – Available free from https://public.ccsds.org/Pubs/652x1m2.pdf
        – Used for accreditation of auditors by National Accreditation Bodies
      • Auditors available early next year

  11. Sustainability and Trustworthiness
      • Requires resources ($ / £ / …)
      • Are the resources being well spent? Will the data be usable?
      • Is the Value (or potential value likely to be derived) worth the Cost?
        – An important factor in appraisal: cannot preserve everything
      • There are economies of scale
      • There are limits to the availability of expertise
      • Competition between repositories?
        – Trustworthiness is a way to choose between repositories
      • ISO 16363 certification requires detailed evidence and is fundamentally linked to usability, from which value, and hence sustainability, is derived

  12. Useful Links
      • OAIS
        – Web pages: www.oais.info
        – Site to gather proposals for OAIS updates in 2017: http://review.oais.info
      • ISO 16363: www.iso16363.org
      • Integrated glossary of digital preservation: http://www.alliancepermanentaccess.org/index.php/consultancy/dpglossary/
        – SKOS ontology to show relationship between terms from different glossaries
        – OAIS, APARSEN, DPC, ANZ, SNIA, INTERPARES, ISO16363
      • Active Data Management Plans:
        – CCSDS/ISO: http://cwe.ccsds.org/moims/default.aspx#_MOIMS-DAI
        – Research Data Alliance: https://www.rd-alliance.org/groups/active-data-management-plans.html
      • Me: www.giaretta.org

  13. END david@giaretta.org
