
Extending Models for Controlled Vocabularies to Classification Systems: Modelling DDC with FRSAD



  1. Extending Models for Controlled Vocabularies to Classification Systems: Modelling DDC with FRSAD
     Joan S. Mitchell (OCLC, Inc.); Marcia Lei Zeng (Kent State University); Maja Žumer (University of Ljubljana, Slovenia)

  2. The big question: Can the FRSAD conceptual model be extended beyond subject authority data (its original focus) to model classification data?

  3. Outline
     • 1. From Knowledge Organisation Systems (KOS) to data and conceptual models
     • 2. FRSAD conceptual model
     • 3. FRSAD model for classification systems
     • 4. DDC case study
     • 5. Findings and limitations
     • 6. Future work

  4. 1. From Knowledge Organisation Systems to Data and Conceptual Models: Timeline
     • 1876: DDC
     • 1898: LCSH
     • 1905: UDC
     • 1967: TEST (Thesaurus of Engineering and Scientific Terms)
     • 1974: ISO 2788, Guidelines for the Establishment and Development of Monolingual Thesauri
     • 1985: ISO 5964, Guidelines for the Establishment and Development of Multilingual Thesauri
     • 1998: FRBR
     • 2004-2009: SKOS, OWL
     • 2009: FRAD
     • 2010: FRSAD

  5. From Knowledge Organisation Systems to Data and Conceptual Models: Modelling efforts
     [Timeline figure covering 1876 to 2010, with items labelled by kind: classification, subject headings, thesauri, KOS, ontology]
     • Thesauri: mostly comply with ISO 2788 and ISO 5964.
     • Subject heading schemes: have adopted the basic structure of the thesaurus since the 1990s.
     • Classification systems: have implemented different practices and are usually constructed according to specific conventions and examples.

  6. The “FRBR family”
     • FRBR: the original framework
       • All entities, focusing on Group 1 entities: work, expression, manifestation, item
       • Published 1998
     • FRAD: Functional Requirements for Authority Data
       • Focusing on Group 2 entities: person, corporate body, family
       • Published 2009
     • FRSAD: Functional Requirements for Subject Authority Data
       • Focusing on Group 3 entities
       • FRSAR WG established in 2005
       • Published 2010

  7. The FRBR family models: main entities and relationships FRSAD FRBR FRAD

  8. 2. FRSAD Conceptual Model: 2.1 The core of the FRSAD conceptual model

  9. FRSAD – generalisation of FRBR

  10. The core of the FRSAD conceptual model
      • FRSAD Part 1: WORK has as subject THEMA / THEMA is subject of WORK
      • FRSAD Part 2: THEMA has appellation NOMEN / NOMEN is appellation of THEMA
      • NOMEN = any sign or sequence of signs (alphanumeric characters, symbols, sound, etc.) that a thema is known by, referred to or addressed as
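The two parts of the model above can be sketched as three linked record types. This is only an illustrative sketch, not part of the FRSAD report: the class layout, the `label` field, the helper methods, and the example work title are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Nomen:
    """Any sign or sequence of signs that a thema is known by."""
    value: str
    # NOMEN is appellation of THEMA (back-reference, hidden from repr)
    thema: "Thema | None" = field(default=None, repr=False)


@dataclass(eq=False)
class Thema:
    """Subject entity; `label` is a hypothetical working label only."""
    label: str
    nomens: list = field(default_factory=list)              # THEMA has appellation NOMEN
    works: list = field(default_factory=list, repr=False)   # THEMA is subject of WORK

    def add_nomen(self, value: str) -> Nomen:
        nomen = Nomen(value, thema=self)
        self.nomens.append(nomen)
        return nomen


@dataclass(eq=False)
class Work:
    title: str
    subjects: list = field(default_factory=list)  # WORK has as subject THEMA

    def add_subject(self, thema: Thema) -> None:
        # Record both directions of the has-as-subject relationship.
        self.subjects.append(thema)
        thema.works.append(self)


# Hypothetical example data.
thema = Thema("information storage and retrieval systems")
nomen = thema.add_nomen("025.04")
work = Work("A hypothetical book on retrieval systems")
work.add_subject(thema)
```

Keeping the nomen as its own record, rather than as a string field on the thema, is what lets one thema carry several appellations.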

  11. The ‘has appellation’ relationship between thema and nomen in a controlled vocabulary. Note: in a given controlled vocabulary and within a domain, a nomen should be an appellation of only one thema.
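The constraint in this note can be checked mechanically. The sketch below is a minimal illustration under assumed names: `ControlledVocabulary`, `assign`, `nomens_of`, and the thema identifiers are hypothetical and do not come from FRSAD or from any DDC tooling.

```python
class ControlledVocabulary:
    """Registry that lets each nomen be an appellation of only one thema."""

    def __init__(self, scheme):
        self.scheme = scheme
        self._thema_by_nomen = {}

    def assign(self, nomen, thema_id):
        """Register `nomen` as an appellation of `thema_id`."""
        existing = self._thema_by_nomen.get(nomen)
        if existing is not None and existing != thema_id:
            raise ValueError(
                f"{nomen!r} is already an appellation of {existing!r} in {self.scheme}"
            )
        self._thema_by_nomen[nomen] = thema_id

    def nomens_of(self, thema_id):
        """Return all nomens of one thema, sorted for stable output."""
        return sorted(n for n, t in self._thema_by_nomen.items() if t == thema_id)


vocab = ControlledVocabulary("DDC")
vocab.assign("025.04", "thema:025.04")  # hypothetical thema identifier
vocab.assign("Information storage and retrieval systems", "thema:025.04")

try:
    vocab.assign("025.04", "thema:other")  # second thema for the same nomen
    conflict_detected = False
except ValueError:
    conflict_detected = True
```

A thema may still have many nomens; only the reverse direction is restricted.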

  12. An example of nomens in an authority record for a chemical compound [figure: record with nomens 1-8 and nomen 9 marked]. Source: STN Database Summary Sheet: USAN (The USP Dictionary of U.S. Adopted Names and International Drug Names). NOMEN = any sign or sequence of signs (alphanumeric characters, symbols, sound, etc.) that a thema is known by, referred to or addressed as.

  13. Nomens in different types of KOS: themas represented by
      • thesauri: terms (preferred & non-preferred)
      • classification schemes: notations
      • subject heading systems: terms of pre-coordinated strings
      • taxonomies: category labels (with or without notations)
      • controlled lists: terms or identifiers
      • … …

  14. 2.2 Relationships (1): Thema-to-thema relationships
      • Hierarchical
        • The generic relationship
        • The hierarchical whole-part relationship
        • The instance relationship
        • Other hierarchical relationships
      • Associative
        • [most commonly considered categories are listed in the report]
      • Other thema-to-thema relationships are domain- or implementation-dependent

  15. 2.2 Relationships (2): Nomen-to-nomen relationships
      • Equivalence: Two nomens are considered equivalent only if they are appellations of the same thema in a controlled vocabulary.
      • Partitive: An instance of a nomen may have parts. A whole-part relationship may exist between a nomen and its components.
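Both nomen-to-nomen relationships can be expressed as small functions. The sketch assumes a plain dictionary from nomen to a hypothetical thema identifier, and assumes that the components of a nomen are separated by "/"; neither choice comes from the FRSAD report.

```python
def are_equivalent(nomen_a, nomen_b, thema_of):
    """Equivalence: both nomens are appellations of the same thema."""
    thema_a = thema_of.get(nomen_a)
    return thema_a is not None and thema_a == thema_of.get(nomen_b)


def parts_of(nomen, separator="/"):
    """Partitive: split a nomen into its components (separator is an assumption)."""
    return [part.strip() for part in nomen.split(separator) if part.strip()]


# Hypothetical mapping from nomens to thema identifiers in one vocabulary.
thema_of = {
    "025.04": "T1",
    "http://dewey.info/class/025.04/": "T1",
    "025": "T2",
}

same = are_equivalent("025.04", "http://dewey.info/class/025.04/", thema_of)
different = are_equivalent("025.04", "025", thema_of)
components = parts_of(
    "Library & information sciences/Information storage and retrieval systems"
)
```

An unregistered nomen is never treated as equivalent to anything, which keeps the check conservative.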

  16. 2.3 Attributes
      • Some general attributes of thema and nomen are proposed
        (1) thema attributes: type of thema, scope note
            • In an implementation themas can be organized based on category, kind, or type
        (2) nomen attributes: see next slide
      • In an implementation additional attributes may be recorded

  17. Nomen attributes include, but are not limited to:
      • Type of nomen (identifier, controlled name, …)
      • Scheme (LCSH, DDC, UDC, ULAN, ISO 8601, …)
      • Reference source of nomen (Encyclopaedia Britannica, …)
      • Representation of nomen (alphanumeric, sound, visual, …)
      • Language of nomen (English, Japanese, Slovenian, …)
      • Script of nomen (Cyrillic, Thai, Chinese-simplified, …)
      • Script conversion (Pinyin, ISO 3601, Romanisation of Japanese, …)
      • Form of nomen (full name, abbreviation, formula, …)
      • Time of validity of nomen (until xxxx, after xxxx, from … to …)
      • Audience (English-speaking users, scientists, children, …)
      • Status of nomen (provisional, accepted, official, …)
      Note: examples of attribute values are given in parentheses

  18. 2.4 The importance of the THEMA-NOMEN model to the subject authority data
      • Separating what are usually called concepts (or topics, subjects, classes [of concepts]) from what they are known by, referred to, or addressed as
      • A general abstract model, not limited to any particular domain or implementation
      • Potential for interoperability within the library field and beyond

  19. 3. FRSAD model for classification systems
      • Each class corresponds to a thema
      • Notation associated with the class is the nomen
      • Thema is the full category description of the class
      • Nomen is the symbol (or surrogate) used to represent the full category description

  20. 4. DDC case study

  21. Thema: Class 025.04

  22. Nomens: DDC number, Full caption, URI
      • DDC number: 025.04
      • Full caption: Computer science, information & general works/Library & information sciences/Operations of libraries, archives, information centers/Information storage and retrieval systems
      • URI: http://dewey.info/class/025.04/
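The three nomens of class 025.04 can be written down as records carrying a few of the nomen attributes listed earlier. This is a sketch under assumptions: the field names, the attribute values chosen for each nomen ("identifier", "controlled name", "alphanumeric", "English"), and the helper `nomens_of_type` are illustrative choices, not values given in the presentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Nomen:
    value: str
    nomen_type: str                        # e.g. identifier, controlled name
    scheme: str                            # e.g. DDC
    representation: str = "alphanumeric"
    language: Optional[str] = None         # left empty for notations and URIs


CAPTION = (
    "Computer science, information & general works/"
    "Library & information sciences/"
    "Operations of libraries, archives, information centers/"
    "Information storage and retrieval systems"
)

# Thema: Class 025.04, with its three nomens (attribute values are assumptions).
NOMENS = [
    Nomen("025.04", nomen_type="identifier", scheme="DDC"),
    Nomen(CAPTION, nomen_type="controlled name", scheme="DDC", language="English"),
    Nomen("http://dewey.info/class/025.04/", nomen_type="identifier", scheme="DDC"),
]


def nomens_of_type(nomens, nomen_type):
    """Select the nomens whose 'type of nomen' attribute matches."""
    return [n for n in nomens if n.nomen_type == nomen_type]


identifiers = [n.value for n in nomens_of_type(NOMENS, "identifier")]
```

All three records name the same thema; only their attributes differ.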

  23. Thema: any topic co-extensive with the full meaning of the class; topics that are functionally equivalent to the class

  24. Scope note: text describing or defining the thema or specifying its scope within a particular system. Scope note (≠ thema/class)

  25. Thema-to-thema relationships: (poly)hierarchical relationship; associative relationship

  26. Alternative nomens: Relative Index terms with equivalence relationship to class

  27. [Diagram. Legend: SN = scope note; equivalence relationship; ? = unknown relationship]

  28. Derived alternative nomens
      150 ## $a Databanks
      260 ## $i see also $a Databases
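The two MARC-style lines on this slide can be split into tag, indicators, and subfields with a few lines of string handling. This is a rough sketch for display strings in exactly this layout, not a MARC parser; the function name `parse_field` and the returned dictionary shape are assumptions, and the meaning of the tags is deliberately not interpreted.

```python
import re

# Three-digit tag, two indicator characters, then "$"-prefixed subfields.
FIELD_PATTERN = re.compile(r"^(?P<tag>\d{3})\s+(?P<ind>\S{2})\s+(?P<body>\$.*)$")


def parse_field(line):
    """Split a display line such as '150 ## $a Databanks' into its parts."""
    match = FIELD_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a recognised field line: {line!r}")
    subfields = []
    for chunk in match.group("body").split("$")[1:]:
        if not chunk:
            continue  # skip an empty piece left by a doubled "$"
        subfields.append((chunk[0], chunk[1:].strip()))
    return {
        "tag": match.group("tag"),
        "indicators": match.group("ind"),
        "subfields": subfields,
    }


heading = parse_field("150 ## $a Databanks")
reference = parse_field("260 ## $i see also $a Databases")

# Candidate nomens: every $a value found in the two fields.
candidates = [
    value
    for record in (heading, reference)
    for code, value in record["subfields"]
    if code == "a"
]
```

Which relationship holds between each candidate and the class is exactly what the presentation leaves open, so the sketch only collects the strings.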

  29. [Diagram with derived nomens added. Legend: SN = scope note; equivalence relationship; ? = unknown relationship]

  30. 5. Findings and limitations
      • FRSAD conceptual model appears to accommodate DDC data at a broad level
      • Topic-to-topic relationships require further study
      • The study did not consider the usefulness of classification data modelled using FRSAD in real-world applications

  31. 6. Future work
      • Specify all relationships between Relative Index terms and classes (see earlier work by Green, Mitchell)

  32. [Diagram with derived nomens, repeated. Legend: SN = scope note; equivalence relationship; ? = unknown relationship]

  33. 6. Future work
      • Specify all relationships between Relative Index terms and classes (see earlier work by Green, Mitchell)
      • Investigate DDC translations and mappings in context of model

  34. [Figure: DDC translations and derived products. Product labels shown: DDC 22, DDC Sachgruppen, DDC Summaries, 200 Religion Class, Guide, A14. Languages shown: Afrikaans, Arabic, Chinese, English, French, German, Hebrew, Italian, Norwegian, Portuguese, Rhaeto-Romansch, Russian, Scots Gaelic, Spanish, Swedish, Vietnamese]

  35. Mappings and crosswalks. Schemes shown: SEARS, BISAC, CSH, RAMEAU, SAO, MeSH, SAB, DDC, LCSH, LCC, SWD, Nuovo Soggettario, UDC

  36. Thema-to-thema relationships across languages: Class 025.04 (22/swe) = Class 025.04 (22)

  37. Thema-to-thema relationships (complex case): T2 — 43414 (22) = T2 — 43414 (22/ger), but . . .
      • English (22): T2 — 43414 Giessen district (Giessen Regierungsbezirk), Including *Lahn River: not equivalent to thema/class T2 — 43414
      • German (22/ger): T2 — 43414 Regierungsbezirk Gießen; T2 — 434147 Lahn-Dill-Kreis, Hier auch: der Fluss *Lahn (here also: the river *Lahn): functionally equivalent to thema/class T2 — 434147

  38. 6. Future work
      • Specify all relationships between Relative Index terms and classes (see earlier work by Green, Mitchell)
      • Investigate DDC translations and mappings in context of model
      • Investigate modelling the Relative Index as a separate controlled vocabulary to provide a topic-centered view
      • Experiment with modelling other classification schemes
      • Investigate usefulness of classification data modelled using FRSAD
