
PANIC: Scientific Data Preservation using Semantic Grid Services


  1. PANIC: Scientific Data Preservation using Semantic Grid Services
     Jane Hunter, jane@dstc.edu.au

  2. Objective
     Address the long-term preservation and accessibility of digital objects/scientific data.
     20th APAN Meeting Taipei Aug 2005

  3. Problems
     • Obsolescence of physical storage devices
     • Obsolescence of hardware
     • Obsolescence of software
       – Operating systems
       – Authoring software
       – Web, application, and database servers
       – Search and retrieval software
       – Rendering/display software
       – Browser plugins
     • Obsolescence of file formats

  4. Problems
     Within digital libraries/scientific data archives:
     • Wide range of file formats: different platforms, different authoring/display software
     • Massive collections
     • Composite mixed-media objects: web pages, images, video, audio, Flash, SMIL, SVG
     • Highly proprietary: software- and hardware-dependent
     • Dynamic and interactive
     • Difficult to capture (the boundary problem)
     • Few guides/recommendations

  5. Related Work
     • LoC National Digital Information Infrastructure and Preservation Program (NDIIPP)
     • CEDARS, CAMiLEON
     • National Library of Australia, PANDORA
     • Networked European Deposits Library (NEDLIB)
     • OCLC/RLG Preservation Metadata WG: PREMIS Preservation Metadata
     • International Internet Preservation Consortium
     • IBM: UVC (Universal Virtual Computer)
     • UK Digital Curation Centre

  6. Current Strategies
     • Maintenance: keep obsolete hardware/software running
     • Migration: convert the object through a sequence of new formats
     • Emulation: mimic the original software application in the current environment
     • Preservation Metadata/Encapsulation: gather information that assists in the process of preservation (e.g., METS); usually used in conjunction with Emulation or Migration
     • Normalisation: convert the original file into platform-independent XML

  7. Existing Tools
     • OCLC's INFORM, Cornell's VRC: risk assessment and notification services
     • GDFR, PRONOM, DCC-RR: format registries
     • VersionTracker, IIPC: software registries
     • XENA, TOM: conversion services
     • UVC: emulation services

  8. Objectives
     Provide an Integrated Preservation Framework which supports:
     • Large, heterogeneous, distributed collections
     • Multiple formats
     • Changing organizational needs
       – Range of solutions
     • Flexible, dynamic, scalable, extensible
     • New emerging formats, software, recommendations
     • New migration, emulation services
     • Recommender services/decision support
     • Sustainable: cost-effective, semi-automated

  9. Networked Distributed Archives (diagram)
     • Archives: Protein Data Bank, ESO Science Archive, SDSS SkyServer, ADIL, GenBank
     • Preservation Metadata Capture Tools (PREMINT, JHOVE, NLNZ)
     • PANIC Web services, connecting to:
       – Service Descriptions (OWL-S)
       – Software Registry (VersionTracker)
       – Format Registry (PRONOM, GDFR)
       – Risk Assessment & Notification Services (VRC, INFORM)
       – Preservation Services (XENA, TOM, UVC)
       – Recommendation Registry (INFORM)

  10. Steps
     • Archival: selection and capture of digital object + preservation metadata
     • Risk assessment and notification of potential obsolescence
       – New recommendations, format, software versions
     • Service Specification and Request
       – Emulation or Migration
       – Inputs/Outputs
       – Cost
       – Speed
       – Reliability
       – Lossiness
     • Select, Compose, Invoke Preservation Service
     • Record preservation events

  11. PANIC Architecture (diagram)
     • Custodial Organization: Multimedia Collections, Collection Manager, Preservation Metadata, Preservation Metadata input tool
     • Notification component: Registry(s), Obsolescence Detector, Notification Service
     • Invocation component: Service Discovery, Service Selection, Service Invocation, Service Requester Agent
     • Discovery component: Discovery Agent (e.g. Semantic Matchmaker), Preservation Service Registry, OWL-S Profiles, Sesame RDF Store
     • Provider component: Service Provider Agent, WSDL, Apache AXIS SOAP
     • Preservation Web Services (reached over the Internet): TIFF-to-JPEG2000, AIFF-to-MP3, Mac OS1 Emulator
     • Flow label: Retrieve and Invoke Appropriate Service(s)

  12. Preservation Metadata Input/Capture Tool
     • XML Schema based on extended METS schema
     • XML metadata is used by the Invocation component
     • PREMINT demo available: http://metadata.net/panic
     (Diagram: Multimedia Collections, Collection Manager, Preservation Metadata input tool, Preservation Metadata)

  13. METS
     • Metadata Encoding and Transmission Standard
     • Extended to include presentation and creator intention information
     • Structural metadata: use SMIL
     (Diagram of the extended METS record)
     • Descriptive Metadata
     • Administrative: Technical Metadata, Rights Metadata, Source Metadata, DigiProv Metadata
     • Extensions: Presentation Metadata, Intention Metadata
     • File Groups
     • Structural Map

  14. Notification component
     Obsolescence detector: periodically compares the preservation metadata for each object with the registries to determine when an object is at risk of obsolescence.
     (Diagram)
     • Input: Metadata Encoding and Transmission Standard (METS) record (Descriptive Metadata; Administrative: Technical, Rights, Source, DigiProv Metadata; Extensions: Presentation, Intention Metadata; File Groups; Structural Map)
     • Obsolescence Detector: Extract Format Details and Software Dependencies, Compare, Return Incompatibilities
     • Format Registry: FormatName, FormatType, CurrentVersion, PreviousVersion, ReleaseDate
     • Recommendation Registry: FormatName, FormatVersion, Recommendation, Authority, URL, ReleaseDate
     • Software Registry: SoftwareName, SoftwareType, FormatSupported, Company, CurrentVersion, PreviousVersion, Platform, ReleaseDate
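The comparison performed by the obsolescence detector can be sketched roughly as follows. This is a minimal illustration, not the PANIC implementation: the registry is modelled as an in-memory dict, only the Format Registry fields named on the slide are used, the class and function names (`FormatRecord`, `ObjectMetadata`, `detect_at_risk`) are hypothetical, and the sample values are for illustration only. Periodic scheduling and the Software and Recommendation registries are left out.

```python
from dataclasses import dataclass


@dataclass
class FormatRecord:
    """One Format Registry entry (field names follow the slide; class is hypothetical)."""
    format_name: str
    current_version: str
    previous_version: str
    release_date: str


@dataclass
class ObjectMetadata:
    """Format details extracted from an object's technical metadata (simplified)."""
    object_id: str
    format_name: str
    format_version: str


def detect_at_risk(objects, format_registry):
    """Return (object_id, reason) pairs for objects whose format looks at risk."""
    at_risk = []
    for obj in objects:
        record = format_registry.get(obj.format_name)
        if record is None:
            # The registry knows nothing about this format.
            at_risk.append((obj.object_id, "format not in registry"))
        elif obj.format_version != record.current_version:
            # The object uses a version other than the registry's current one.
            reason = (f"{obj.format_name} {obj.format_version} "
                      f"superseded by {record.current_version}")
            at_risk.append((obj.object_id, reason))
    return at_risk


# Illustrative sample data only.
registry = {"TIFF": FormatRecord("TIFF", "6.0", "5.0", "1992-06-03")}
objects = [
    ObjectMetadata("img-001", "TIFF", "5.0"),
    ObjectMetadata("img-002", "TIFF", "6.0"),
    ObjectMetadata("doc-003", "WordStar", "4"),
]
report = detect_at_risk(objects, registry)
```

In a deployment along the lines of the slide, `report` would feed the Notification Service rather than being inspected directly.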

  15. PANIC Architecture (same diagram as slide 11, with annotations)
     Step 2. Semi-automated Migration
     Annotations on the diagram:
     • Client-side software modules which control the invocation of preservation services
     • Provides an interface to Software Version, Format Version and Recommendations registries
     • Provides an interface to match service request to Web service registries
     • Delivers and invokes the chosen preservation service
     • Components communicate with each other using platform-neutral standards: OWL-S, WSDL and SOAP

  16. Invocation component
     • Service Discovery: provides a user interface so the collections manager can specify the type of preservation service they are looking for.
     • Service Selection: presents the services retrieved by the Discovery Agent for selection.
     • Service Invocation: invokes the chosen service and updates the preservation metadata where necessary.
     (Diagram: Obsolescence Detector, Service Discovery, Service Selection, Service Invocation, Service Requester Agent)
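The Service Invocation step (invoke the chosen service, then update the preservation metadata) might look something like the sketch below. Everything here is assumed for illustration: `tiff_to_jpeg2000` is a local stand-in for what would be a remote Web service call, and `PreservationRecord` and the event fields are hypothetical simplifications of the METS-based metadata.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PreservationRecord:
    """Simplified stand-in for an object's preservation metadata (hypothetical)."""
    object_id: str
    format_name: str
    events: list = field(default_factory=list)


def tiff_to_jpeg2000(data: bytes) -> bytes:
    """Placeholder migration service; it only tags the bytes, it does not convert."""
    return b"JP2:" + data


def invoke_service(record, data, service, target_format):
    """Invoke the chosen service and record the event in the metadata."""
    result = service(data)
    record.events.append({
        "type": "migration",
        "service": service.__name__,
        "from": record.format_name,
        "to": target_format,
        "when": datetime.now(timezone.utc).isoformat(),
    })
    # The object is now held in the target format.
    record.format_name = target_format
    return result


record = PreservationRecord("img-001", "TIFF")
migrated = invoke_service(record, b"raw-bytes", tiff_to_jpeg2000, "JPEG2000")
```

Appending an event rather than overwriting the record keeps a provenance trail, which matches the "Record preservation events" step listed earlier in the deck.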

  17. OWL-S Ontology for Web Services
     (Diagram)
     • Resource provides Service
     • Service presents ServiceProfile: what the service does (automatic discovery)
     • Service is describedBy ServiceModel: how it works (automatic composition)
     • Service supports ServiceGrounding: how to access it (automatic invocation)

  18. OWL-S Preservation Extensions
     (Diagram)
     • PreservationService (subClassOf Service)
       – ExecutionStatus, e.g. Remote Execution, Download
       – SystemRequirement, e.g. Windows XP
       – Creator, e.g. John Doe
       – ReleaseDate, e.g. 8-12-2003
       – ServiceQuality: Speed, e.g. Low; Reliability, e.g. High
     • Migration (subClassOf PreservationService)
       – OriginalObjectFormat, e.g. TIFF
       – OriginalObjectVersion, e.g. 5.12
       – TargetObjectFormat, e.g. JPEG 2000
       – TargetObjectVersion, e.g. 2.02
       – Lossiness, e.g. lossless
     • Emulation (subClassOf PreservationService)
       – EmulatedObject, e.g. MAC OS
       – EmulationType, e.g. OS
       – SystemSetting, e.g. 256 bit palette
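As a rough way to picture the extension hierarchy, the classes and properties on this slide can be mirrored as Python dataclasses. This is only an analogy, assuming a flat attribute model: the real descriptions are OWL-S (RDF/OWL), not Python objects. The snake_case field names are transliterations of the slide's property names, `name` is an added hypothetical field, and the instance values are the slide's own examples.

```python
from dataclasses import dataclass


@dataclass
class PreservationService:
    """Properties shared by every preservation service description."""
    name: str                # hypothetical label, not on the slide
    execution_status: str    # e.g. "Remote Execution" or "Download"
    system_requirement: str
    creator: str
    release_date: str
    speed: str               # part of ServiceQuality
    reliability: str         # part of ServiceQuality


@dataclass
class Migration(PreservationService):
    """A service that converts an object from one format to another."""
    original_object_format: str
    original_object_version: str
    target_object_format: str
    target_object_version: str
    lossiness: str


@dataclass
class Emulation(PreservationService):
    """A service that recreates an obsolete environment."""
    emulated_object: str
    emulation_type: str
    system_setting: str


# Instance filled with the example values shown on the slide.
example = Migration(
    name="Example TIFF migration",
    execution_status="Remote Execution",
    system_requirement="Windows XP",
    creator="John Doe",
    release_date="8-12-2003",
    speed="Low",
    reliability="High",
    original_object_format="TIFF",
    original_object_version="5.12",
    target_object_format="JPEG 2000",
    target_object_version="2.02",
    lossiness="lossless",
)
```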

  19. Discovery component
     • Discovery Agent: matches the service request against OWL-S descriptions of Preservation Web services
     • Returns a ranked list of Preservation Web services that match the request
     (Diagram: Discovery Agent (e.g. Semantic Matchmaker), Preservation Service Registry, OWL-S Profiles, Sesame RDF Store)
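The match-and-rank behaviour of the Discovery Agent can be approximated with plain attribute matching, as in the sketch below. Note the substitution: a semantic matchmaker reasons over OWL-S profiles, whereas this example only compares dictionary values. The function name `rank_services`, the choice of required versus preferred attributes, and the sample services are all assumptions made for illustration.

```python
def rank_services(request, services):
    """Return names of services matching the request, best match first."""
    required = ("original_object_format", "target_object_format")
    preferred = ("lossiness", "speed", "reliability")
    ranked = []
    for svc in services:
        # A service must convert between exactly the requested formats.
        if any(svc.get(key) != request[key] for key in required):
            continue
        # One point for each stated preference the service satisfies.
        score = sum(
            1 for key in preferred
            if key in request and svc.get(key) == request[key]
        )
        ranked.append((score, svc["name"]))
    # Highest score first; ties broken alphabetically by name.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in ranked]


# Hypothetical service descriptions.
services = [
    {"name": "svc-a", "original_object_format": "TIFF",
     "target_object_format": "JPEG2000", "lossiness": "lossy",
     "speed": "high", "reliability": "high"},
    {"name": "svc-b", "original_object_format": "TIFF",
     "target_object_format": "JPEG2000", "lossiness": "lossless",
     "speed": "low", "reliability": "high"},
    {"name": "svc-c", "original_object_format": "AIFF",
     "target_object_format": "MP3", "lossiness": "lossy",
     "speed": "high", "reliability": "high"},
]
request = {
    "original_object_format": "TIFF",
    "target_object_format": "JPEG2000",
    "lossiness": "lossless",
    "reliability": "high",
}
ranking = rank_services(request, services)
```

Here `svc-b` satisfies both stated preferences and `svc-a` only one, while `svc-c` is excluded because it handles different formats.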
