
EXPLOITING AND REMODELLING SEMANTIC RELATIONSHIPS - Fran Alexander



  1. TRANSFORMATION OF A LEGACY UDC-BASED CLASSIFICATION SYSTEM: EXPLOITING AND REMODELLING SEMANTIC RELATIONSHIPS
     Fran Alexander, Taxonomy Manager, BBC Information and Archives, London, UK
     Andy Heather, Chief Technical Officer, Dods Parliamentary Communications, London, UK (formerly Principal Programme Architect, BBC Technology, London, UK)
     *All views expressed here are entirely our own personal views and in no way represent the BBC or official BBC policy.

  2. INTRODUCTION TO THE BBC ARCHIVE
     • 2 million items of TV and video
     • 300,000 hours of audio
     • still photographs, sheet music, and documents
     • 4,000 loans per week
     • Lonclass (London Classification), based on UDC, introduced 1964
     • Telclass (Television Classification), used mainly by the Natural History Unit (NHU), established 1979

  3. DMI PROJECT – “Fabric”
     • launched in 2008
     • preserve intellectual property and semantic richness of classifications
     • facilitate publishing of classification data in semantically rich and interoperable forms

  4. FACET CLASSES AS A BASIS FOR ONTOLOGICAL RELATIONSHIP MODELLING

     Facet/Class              Example                 Format
     Subject                  Emergency Services      Polyhierarchy
     Geographic               Birmingham              Simple hierarchy
     Event date               1585                    Simple hierarchy
     Motion                   Takeoff                 Flat list
     Organisations            The British Library     Flat list, divided into sections
     Person                   Elizabeth I             Flat list, divided into sections
     Artistic work            The Mill on the Floss   Flat list, divided into sections
     Shot type                POV                     Flat list
     Shooting date (archive)  1971                    Simple hierarchy
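
The three formats in the table differ mainly in how many broader terms a term may have: none (flat list), at most one (simple hierarchy), or any number (polyhierarchy). The following is a minimal sketch of that distinction, not the Fabric implementation. The `Facet` class, the `max_parents` parameter, and the broader terms ("Public services", "Safety", "West Midlands") are illustrative assumptions that do not come from the slides.

```python
class Facet:
    """A facet whose terms may have a limited number of broader terms."""

    def __init__(self, name, max_parents):
        # max_parents: 0 = flat list, 1 = simple hierarchy, None = polyhierarchy
        self.name = name
        self.max_parents = max_parents
        self.broader = {}  # term -> set of broader terms

    def add_term(self, term, broader=()):
        parents = set(broader)
        if self.max_parents is not None and len(parents) > self.max_parents:
            raise ValueError(
                f"{self.name}: '{term}' allows at most "
                f"{self.max_parents} broader term(s)"
            )
        self.broader[term] = parents


# Polyhierarchy: a term may sit under several broader terms (hypothetical parents).
subject = Facet("Subject", max_parents=None)
subject.add_term("Emergency Services", broader=["Public services", "Safety"])

# Simple hierarchy: at most one broader term (hypothetical parent).
geographic = Facet("Geographic", max_parents=1)
geographic.add_term("Birmingham", broader=["West Midlands"])

# Flat list: no broader terms at all.
motion = Facet("Motion", max_parents=0)
motion.add_term("Takeoff")
```

Adding a broader term to a flat-list facet, for example `motion.add_term("Landing", broader=["Flight"])`, raises `ValueError` in this sketch.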

  5. ANALYSIS OF LONCLASS
     Lonclass: 370,000 concepts
     • 20,000 simple concepts
     • 350,000 compound concepts
     • some 150,000 KOS concepts (40%) used for only 1 catalogue item
     • 50,000 (14%) used for only 2 catalogue items
     • 300,000 (80%) used 10 times or fewer
     • less than 5% of the concepts (approximately 16,000) were used 100 times or more
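
The percentages on this slide are rounded shares of the 370,000 total. As a quick arithmetic check (the variable names are illustrative, the counts are those quoted on the slide):

```python
TOTAL_CONCEPTS = 370_000

# Counts as quoted on the slide.
simple, compound = 20_000, 350_000
usage = {
    "used for only 1 item": 150_000,
    "used for only 2 items": 50_000,
    "used 10 times or fewer": 300_000,
    "used 100 times or more": 16_000,
}

# Share of all concepts, as a percentage rounded to one decimal place.
shares = {label: round(100 * n / TOTAL_CONCEPTS, 1) for label, n in usage.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The exact shares come out at 40.5%, 13.5%, 81.1% and 4.3%, consistent with the rounded 40%, 14%, 80% and "less than 5%" on the slide.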

  6. ANALYSIS OF A LONCLASS COMPOUND TERM

  7. DECOMPOSITION METHODOLOGY
     • decompose the PCCs in Lonclass and build term hierarchies for each of the defined Classes of Concepts
     • use multiple, redundant classification points to mitigate loss of semantic accuracy
     • define a set of terms from the legacy KOS with value as classifications for clustering assets
     • utilise terms in the legacy KOS with additional semantic value
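
The decomposition step can be sketched as splitting a compound into its component terms, assigning each to a Class of Concepts, and keeping the original compound as a redundant classification point. This is a sketch under stated assumptions only: the colon-separated label form, the `TERM_CLASS` lookup and the `decompose` function are hypothetical, and real Lonclass notation is not reproduced here. The terms and classes are taken from the facet table on slide 4.

```python
# Hypothetical lookup from term label to Class of Concepts (labels from slide 4).
TERM_CLASS = {
    "Emergency Services": "Subject",
    "Birmingham": "Geographic",
    "1971": "Shooting date (archive)",
}


def decompose(compound, separator=":"):
    """Split a compound into (class, term) classification points.

    The original compound is kept alongside the decomposed points, so that
    semantics carried only by the combination are not lost.
    """
    points, unresolved = [], []
    for raw in compound.split(separator):
        term = raw.strip()
        if not term:
            continue
        facet = TERM_CLASS.get(term)
        if facet is None:
            unresolved.append(term)  # left for manual review
        else:
            points.append((facet, term))
    return {"legacy": compound, "points": points, "unresolved": unresolved}


result = decompose("Emergency Services : Birmingham : 1971")
```

Here `result["points"]` holds one classification point per facet class, while `result["legacy"]` retains the full compound as a further, redundant point.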

  8. TO ASSET PATHWAYS

  9. CLASSIFICATION DATA MODEL
     • nodes in the classification space modelled as Concepts with a variable number of alternate and preferred terms
     • URIs to provide access to concepts and terms: http://fabric.bbc.co.uk/classification/<UUID>
     • classification groups containing multiple sets of classification terms
     • classification groups attached at all levels in the Product Information hierarchy
     • classification groups attached at any point on the media timeline
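
One way to picture this model is as two record types: a Concept that carries its terms and a UUID-based URI, and a classification group that bundles concepts and is attached either to a level of the Product Information hierarchy or to a point on the timeline. The sketch below is an assumption-laden illustration, not the Fabric schema: the field names, the `attached_to` identifier, the `timeline_offset_s` field and the alternate term are all hypothetical. Only the URI pattern comes from the slide.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional

BASE_URI = "http://fabric.bbc.co.uk/classification/"  # URI pattern from the slide


@dataclass
class Concept:
    preferred_terms: List[str]
    alternate_terms: List[str] = field(default_factory=list)
    id: uuid.UUID = field(default_factory=uuid.uuid4)

    @property
    def uri(self) -> str:
        # Each concept is addressable at <base>/<UUID>.
        return f"{BASE_URI}{self.id}"


@dataclass
class ClassificationGroup:
    concepts: List[Concept]
    attached_to: str                           # hypothetical id of a hierarchy level
    timeline_offset_s: Optional[float] = None  # set when attached to the timeline


person = Concept(["Elizabeth I"], alternate_terms=["Queen Elizabeth I"])
group = ClassificationGroup([person], attached_to="programme-123",
                            timeline_offset_s=95.0)
```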

  10. INFORMATION DISCOVERY ENVIRONMENT
     • integrate the classification space into the Search environment
     • match queries against the taxonomy to increase the degree of relevance of the response
     • open source Solr search engine selected
     • classification space denormalised in the engine to allow runtime node counts to be calculated
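
Denormalising the classification space can be understood as storing, with each indexed asset, not only its own term but every broader term above it, so that a count per node becomes a simple tally over one multi-valued field. The sketch below shows the idea in plain Python and makes no Solr API calls. The `BROADER` hierarchy, the asset identifiers and the function name are illustrative assumptions, and a simple hierarchy is used for brevity.

```python
from collections import Counter

# Hypothetical simple hierarchy: term -> broader term.
BROADER = {
    "Birmingham": "England",
    "London": "England",
    "England": "United Kingdom",
}


def with_ancestors(term):
    """Return the term followed by all of its broader terms."""
    path = [term]
    while path[-1] in BROADER:
        path.append(BROADER[path[-1]])
    return path


# Hypothetical assets, each classified with a single geographic term.
assets = {"asset-1": "Birmingham", "asset-2": "London", "asset-3": "England"}

# Denormalised index: every asset carries its full path of nodes.
index = {asset: with_ancestors(term) for asset, term in assets.items()}

# Node counts are then a plain tally, with no tree walk at query time.
node_counts = Counter(node for path in index.values() for node in path)
```

In this example "England" and "United Kingdom" each count three assets, while "Birmingham" and "London" count one each.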

  11. PROBLEMS AND LIMITATIONS
     • inability of SKOS to fully model the order of relationships between multiple concept instances
     • SKOS vocabulary of relationship types is limited
     • stopping point for decomposition
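
One reading of the ordering limitation is that an RDF graph is a set of statements, so linking a compound to its components with a relation such as skos:related records which concepts are involved but not the sequence in which they were combined. The sketch below illustrates that reading only. The `ex:` identifiers and the helper function are hypothetical, and the slides do not say which SKOS properties were actually used.

```python
SKOS_RELATED = "skos:related"


def related_statements(compound_id, components):
    """Model the links from a compound to its components as a set of triples."""
    return {(compound_id, SKOS_RELATED, component) for component in components}


# The same components, combined in two different orders.
first = related_statements("ex:compound", ["ex:EmergencyServices", "ex:Birmingham"])
second = related_statements("ex:compound", ["ex:Birmingham", "ex:EmergencyServices"])

# Both orders yield the same set of statements, so the order is not recoverable.
order_preserved = first != second
```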

  12. BENEFITS OF EXPORTING TAXONOMIES IN OPEN FORMATS

  13. CONCLUSIONS
     • preserve semantics through migrations
     • export in open formats

  14. KEY REFERENCES
     • Ben-Yitzhak, Neumann, Sznajder et al. (2008). Beyond Basic Faceted Search. IBM Research Labs.
     • Bergman, M. K. (2009). Confronting Misconceptions with Adaptive Ontologies. [Blog post.] Available at: http://www.mkbergman.com/553/confronting-misconceptions-with-adaptive-ontologies/
     • Black, P. E. (2004). Dictionary of Algorithms and Data Structures [online], ed., U.S. National Institute of Standards and Technology. Available at: http://www.nist.gov/dads/HTML/directAcycGraph.html
     • Bosch, M. (2006). Ontologies, Different Reasoning Strategies, Different Logics, Different Kinds of Knowledge Representation: Working Together. Knowledge Organization, 33(3), pp. 153-159.
     • Brickley, D. (2010). Lonclass and RDF. [Blog post.] Available at: http://danbri.org/words/2010/11/18/585
     • Brickley, D. (2011). Video Linking: Archives and Encyclopedias. [Blog post.] Available at: http://danbri.org/words/2011/02/01/658
     • Foskett, A. C. (1971). The Subject Approach to Information. London, UK: Clive Bingley.
     • Frické, M. (2011). Classification, Facets, and Metaproperties. Journal of Information Architecture, 2(2). Available at: http://journalofia.org/volume2/issue2/04-fricke/
     • NoTube. http://notube.tv/about-3/partners/
     • Rodriguez-Castro, B.; Glaser, H.; Carr, L. (2010). How to Reuse a Faceted Classification and Put It on the Semantic Web. In ISWC 2010, Part I, LNCS 6496; P.F. Patel-Schneider et al. (eds.), pp. 663-678. Berlin/Heidelberg: Springer-Verlag.

     Acknowledgements: Nicholas Chivers; Ken Haylock; Kathryn Stickley; Helen Pritchard (DMI development team); Oliver Gardiner; John Jordan
     Map of the Semantic Web: http://www.flickr.com/photos/jurvetson/3277667570/
