The Open Archives Initiative: a low-barrier framework for interoperability - PowerPoint PPT Presentation


  1. The Open Archives Initiative: a low-barrier framework for interoperability
      Carl Lagoze, Computing and Information Science, Cornell University
      lagoze@cs.cornell.edu
      www.openarchives.org

  2. Interoperability Trade-offs
      [Figure: cost plotted against functionality. One end is labeled "more function, less acceptance" (MARC/AACR2, SGML, FGDC); the other is labeled "less function, more acceptance" (Dublin Core, HTML, ASCII). OAI-PMH is marked on the figure.]

  3. The Open Archives Initiative
      "The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. The Open Archives Initiative has its roots in an effort to enhance access to e-print archives as a means of increasing the availability of scholarly communication. … The fundamental technological framework and standards that are developing to support this work are, however, independent of both the type of content offered and the economic mechanisms surrounding that content, and promise to have much broader relevance in opening up access to a range of digital materials." (OAI Mission Statement)

  4. OAI Protocol for Metadata Harvesting (OAI-PMH)
      The goal of the Open Archives Initiative Protocol for Metadata Harvesting … is to supply and promote an application-independent interoperability framework that can be used by a variety of communities who are engaged in publishing content on the Web. The OAI protocol … permits metadata harvesting.

  5. OAI-PMH: A simple two-party model for sharing structured information
      [Diagram: service providers (discovery, current awareness, preservation) and data providers, connected by metadata harvesting.]
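
The two-party model can be sketched from the service provider's side as a harvesting loop. This is a minimal sketch, not a reference harvester: the `harvest` function, its `fetch` parameter, and the `example.org` base URL in the usage note are illustrative assumptions. The `ListRecords` verb, the `metadataPrefix` and `resumptionToken` arguments, and the response namespace come from OAI-PMH 2.0.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator
from urllib.parse import urlencode

# Namespace of OAI-PMH 2.0 responses, in ElementTree's {uri}tag notation.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(base_url: str, fetch: Callable[[str], str],
            metadata_prefix: str = "oai_dc") -> Iterator[ET.Element]:
    """Yield each <record> from a data provider, following resumption tokens.

    `fetch` is a hypothetical callable that takes a URL and returns the
    response body as text; passing it in keeps the loop free of network code.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(f"{base_url}?{urlencode(params)}"))
        yield from root.iter(f"{OAI_NS}record")
        token = root.find(f".//{OAI_NS}resumptionToken")
        # A missing or empty token means the list is complete.
        if token is None or not (token.text or "").strip():
            return
        # A resumption token replaces all other arguments except the verb.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}
```

A service provider might call this as `harvest("http://example.org/oai", lambda url: urllib.request.urlopen(url).read().decode("utf-8"))`, after importing `urllib.request`, and then index the yielded records for discovery or current awareness.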

  6. Yes, it's about resource discovery over distributed collections
      [Diagram: metadata fields Author, Title, Abstract, Identifier.]
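
As an illustration of how these discovery fields might be read from a harvested record, here is a minimal sketch using unqualified Dublin Core in the `oai_dc` format. The sample record and the `discovery_fields` helper are invented for this example, and the mapping of Author to `dc:creator` and Abstract to `dc:description` is an assumption; the namespace URIs are those of OAI-PMH 2.0 and Dublin Core.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample record, shaped like one item of a ListRecords response.
SAMPLE_RECORD = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:1</identifier>
    <datestamp>2002-01-15</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>A sample e-print</dc:title>
      <dc:creator>Doe, Jane</dc:creator>
      <dc:description>A short abstract.</dc:description>
      <dc:identifier>http://example.org/eprints/1</dc:identifier>
    </oai_dc:dc>
  </metadata>
</record>"""


def discovery_fields(record_xml: str) -> dict:
    """Return author, title, abstract and identifier values as lists."""
    record = ET.fromstring(record_xml)
    dc = record.find("oai:metadata/oai_dc:dc", NS)

    def values(tag: str) -> list:
        # Every Dublin Core element is optional and repeatable.
        return [] if dc is None else [el.text for el in dc.findall(f"dc:{tag}", NS)]

    return {
        "author": values("creator"),
        "title": values("title"),
        "abstract": values("description"),
        "identifier": values("identifier"),
    }
```

Each field is returned as a list because unqualified Dublin Core allows any element to be absent or repeated.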

  7. Facilitating/Monitoring Longevity of Distributed Content
      [Diagram labels: preservation service, policy enforcer, policy/action pairs (P1/A1, P2/A2, P3/A3), event records, metadata harvesting, selective web crawling, preservation metadata, managed web site, managed repository.]

  8. Personalization of Content
      Options listed across View A and View B:
      • View Slides
      • Get Transcript of Audio
      • View Video
      • Search for keyword
      • View synchronized presentation using applet
      • Get Slides translated to French
      [Diagram labels: Portal A, Portal B, Tool, Repository, DigitalObject (structural metadata, PowerPoint presentation, SMIL synchronization metadata, RealAudio video).]

  9. Cross-Repository Reference Linking
      [Diagram: a linkage service connected to five repositories, each labeled with citation and metadata.]

  10. Brief History of the OAI
      • Motivation: expand impact of ePrint archives through federation
      • 1999: Santa Fe Meeting and convention
      • 2000: OAI-PMH formation
        – Scope broadens
        – OAI steering committee
      • 2001: OAI-PMH v. 1.0 “experimental” protocol
      • 2002: OAI-PMH v. 2.0 “stable” protocol

  11. OAI-PMH Key technical features
      • Deploy-now technology
        – 80/20 rule
      • Simple HTTP encoding
      • Foundation of established XML standards
      • Multiple metadata formats
      • Repository partitioning (sets)
      • Selective harvesting (sets and dates)
      • Clean partition between core and implementation-specific extensions
        – Multiple item-level metadata
        – Collection level metadata
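
To make the simple HTTP encoding and selective harvesting concrete, the sketch below builds a `ListRecords` request URL. The `list_records_url` helper and the `example.org` base URL are hypothetical; the `verb`, `metadataPrefix`, `set`, `from` and `until` arguments are those defined by OAI-PMH 2.0.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None,
                     from_date: Optional[str] = None,
                     until_date: Optional[str] = None) -> str:
    """Encode a selective-harvesting request as a plain HTTP GET URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec        # restrict to one repository partition
    if from_date:
        params["from"] = from_date      # datestamp lower bound, e.g. 2002-01-01
    if until_date:
        params["until"] = until_date    # datestamp upper bound
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository: harvest the "physics" set for changes since 2002-01-01.
print(list_records_url("http://example.org/oai",
                       set_spec="physics", from_date="2002-01-01"))
```

Because a request is only a URL with query arguments, a harvester can be written with any HTTP client, and incremental harvesting amounts to remembering the date of the last successful run.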

  12. OAI Verbs
      • Identify – repository characteristics
      • ListMetadataFormats – DC required
      • ListSets – repository partitioning
      • ListRecords – (selectively) harvest metadata
      • ListIdentifiers – (selectively) harvest metadata identifiers
      • GetRecord – known item retrieval
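
The six verbs each accept a small, fixed set of arguments. The sketch below tabulates them as defined in OAI-PMH 2.0 and checks a request against that table. The `check_request` helper and its message format are illustrative assumptions; `badVerb` and `badArgument` are the protocol's own error codes.

```python
from typing import Dict, List

# verb -> (required arguments, optional arguments), per OAI-PMH 2.0
VERB_ARGUMENTS = {
    "Identify": (set(), set()),
    "ListMetadataFormats": (set(), {"identifier"}),
    "ListSets": (set(), set()),
    "ListRecords": ({"metadataPrefix"}, {"from", "until", "set"}),
    "ListIdentifiers": ({"metadataPrefix"}, {"from", "until", "set"}),
    "GetRecord": ({"identifier", "metadataPrefix"}, set()),
}

# List verbs that may be continued with a resumptionToken.
RESUMABLE = {"ListSets", "ListRecords", "ListIdentifiers"}


def check_request(verb: str, args: Dict[str, str]) -> List[str]:
    """Return a list of problems with a request; an empty list means it is valid."""
    if verb not in VERB_ARGUMENTS:
        return [f"badVerb: {verb}"]
    if verb in RESUMABLE and "resumptionToken" in args:
        # resumptionToken is exclusive: no other argument may accompany it.
        if len(args) > 1:
            return ["badArgument: resumptionToken must be the only argument"]
        return []
    required, optional = VERB_ARGUMENTS[verb]
    given = set(args)
    problems = [f"badArgument: missing {name}" for name in sorted(required - given)]
    problems += [f"badArgument: unexpected {name}"
                 for name in sorted(given - required - optional)]
    return problems
```

For example, `check_request("GetRecord", {"identifier": "oai:example.org:1"})` reports the missing `metadataPrefix`, since known-item retrieval needs both the item identifier and the metadata format.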

  13. Measures of Success
      • Registered data providers
      • Adoption by major projects
      • Acceptance as ‘fundamental infrastructure’ for research and implementation

  14. OAI Registered Data Providers
      [Chart: total number of registered sites (y-axis, 0 to 120) by date (x-axis, monthly ticks that appear to run from 1/15/2001 to 7/15/2002).]

  15. National Science Digital Library (NSDL)
      • Very large scale distributed digital library
        – 1,000,000 users
        – 10,000,000 items
        – 100,000 collections
      • Large institutional and funding commitment
        – $25M+ funding
        – Over 80 collaborating institutions
      • Technical infrastructure builds on OAI-PMH foundation
        – Aggregation and dissemination of metadata
      • http://www.nsdl.org

  16. Fundamental Infrastructure
      • Eprints.org servers
        – e.g., Caltech ePrint framework
      • Open language archives community
      • JISC FAIR awards
      • Mellon OAI service providers
      • ECDL, DCADL, JCDL research papers

  17. Some questions remain
      • Is OAI-PMH really low-barrier infrastructure?
        – NSDL experience indicates that significant barriers remain
      • Utility of core metadata (unqualified DC)
        – NSDL and other experience raises doubts
      • Utility outside of resource discovery
        – Certification, Reference linking, etc.

  18. Future Questions and Directions
      • “Standardization”?
        – De-facto?
        – Maintenance agency?
        – Formal standards agency?
      • Future OAI-PMH versions?
        – Expanded functionality?
      • Targeted ‘application profiles’?
        – ePrints community?
