Extending Models for Controlled Vocabularies to Classification Systems: Modelling DDC with FRSAD
Joan S. Mitchell, OCLC, Inc.
Marcia Lei Zeng, Kent State University
Maja Žumer, University of Ljubljana, Slovenia
The big question
Can the FRSAD conceptual model be extended beyond subject authority data (its original focus) to model classification data?
Outline
1. From Knowledge Organisation Systems (KOS) to data and conceptual models
2. FRSAD conceptual model
3. FRSAD model for classification systems
4. DDC case study
5. Findings and limitations
6. Future work
1. From Knowledge Organisation Systems to Data and Conceptual Models: Timeline
• 1876: DDC
• 1898: LCSH
• 1905: UDC
• 1967: TEST*
• 1974: ISO 2788*
• 1985: ISO 5964*
• 1998: FRBR
• 2004-2009: SKOS, OWL
• 2009: FRAD
• 2010: FRSAD
* TEST: Thesaurus of Engineering and Scientific Terms
* ISO 2788 (1974): Guidelines for the Establishment and Development of Monolingual Thesauri
* ISO 5964 (1985): Guidelines for the Establishment and Development of Multilingual Thesauri
From Knowledge Organisation Systems to Data and Conceptual Models: Modelling efforts
[Figure: the 1876-2010 timeline annotated with the kind of system each effort addresses: classification, subject headings, thesauri, KOS, ontology]
• Thesauri: mostly comply with ISO 2788 and ISO 5964.
• Subject heading schemes: have adopted the basic structure of the thesaurus since the 1990s.
• Classification systems: have implemented different practices and are usually constructed according to specific conventions and examples.
The “FRBR family”
• FRBR: the original framework
  All entities, focusing on Group 1 entities: work, expression, manifestation, item
  Published 1998
• FRAD: Functional Requirements for Authority Data
  Focusing on Group 2 entities: person, corporate body, family
  Published 2009
• FRSAD: Functional Requirements for Subject Authority Data
  Focusing on Group 3 entities
  FRSAR WG established in 2005
  Published 2010
The FRBR family models: main entities and relationships
[Figure: entity-relationship diagrams for FRBR, FRAD and FRSAD]
2. FRSAD Conceptual Model
2.1 The core of the FRSAD conceptual model
FRSAD – generalisation of FRBR
The core of the FRSAD conceptual model
FRSAD Part 1: WORK has as subject THEMA / THEMA is subject of WORK
FRSAD Part 2: THEMA has appellation NOMEN / NOMEN is appellation of THEMA
NOMEN = any sign or sequence of signs (alphanumeric characters, symbols, sound, etc.) that a thema is known by, referred to or addressed as
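The two relationships on this slide can be sketched as a small data structure. This is a minimal illustration only, not part of the FRSAD report: the class layout, the `key` field, the `works_about` helper and the work title are all assumptions made for the example. The two nomen values are the ones used later in the DDC case study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Nomen:
    value: str  # any sign or sequence of signs


@dataclass
class Thema:
    key: str  # hypothetical internal key, used only in this sketch
    appellations: List[Nomen] = field(default_factory=list)  # THEMA has appellation NOMEN


@dataclass
class Work:
    title: str
    subjects: List[Thema] = field(default_factory=list)  # WORK has as subject THEMA


def works_about(thema: Thema, works: List[Work]) -> List[Work]:
    """Inverse direction: THEMA is subject of WORK."""
    return [w for w in works if any(t is thema for t in w.subjects)]


retrieval = Thema("t1", [Nomen("025.04"), Nomen("http://dewey.info/class/025.04/")])
book = Work("An illustrative title", [retrieval])  # invented work
print([w.title for w in works_about(retrieval, [book])])
```

Keeping `Nomen` as its own type, rather than a plain string on `Thema`, is what allows attributes to be attached to each appellation separately.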
The ‘has appellation’ relationship between thema and nomen in a controlled vocabulary
Note: in a given controlled vocabulary and within a domain, a nomen should be an appellation of only one thema.
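The constraint in this note can be checked mechanically. The sketch below assumes appellations are available as (scheme, nomen, thema id) triples; that representation, the function name and the sample data are all hypothetical.

```python
from collections import defaultdict


def ambiguous_nomens(appellations):
    """Return {(scheme, nomen): thema_ids} for nomens that name more than one thema."""
    themas_by_nomen = defaultdict(set)
    for scheme, nomen, thema_id in appellations:
        themas_by_nomen[(scheme, nomen)].add(thema_id)
    return {key: ids for key, ids in themas_by_nomen.items() if len(ids) > 1}


# Hypothetical vocabulary "demo"; the thema ids are invented for this sketch.
pairs = [
    ("demo", "Mercury", "thema:planet"),
    ("demo", "Mercury", "thema:element"),
    ("demo", "Venus", "thema:planet2"),
]
print(ambiguous_nomens(pairs))
```

The key includes the scheme because the note applies within one controlled vocabulary: the same string may legitimately name different themas in different schemes.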
An example of nomens in an authority record for a chemical compound
[Figure: authority record with nomens 1-9 marked]
Source: STN Database Summary Sheet: USAN (The USP Dictionary of U.S. Adopted Names and International Drug Names)
Nomens in different types of KOS
Themas are represented by:
• thesauri: terms (preferred & non-preferred)
• classification schemes: notations
• subject heading systems: terms of pre-coordinated strings
• taxonomies: category labels (with or without notations)
• controlled lists: terms or identifiers
• …
2.2 Relationships (1)
Thema-to-thema relationships
• Hierarchical
  • The generic relationship
  • The hierarchical whole-part relationship
  • The instance relationship
  • Other hierarchical relationships
• Associative [most commonly considered categories are listed in the report]
• Other thema-to-thema relationships are domain- or implementation-dependent
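One way to hold typed thema-to-thema relationships is as labelled edges. The sketch below is an assumption-laden illustration: the edge layout and function name are invented, and the thema labels are simply the four caption segments of class 025.04 from the DDC case study, used as stand-ins. Because each thema keeps a set of broader themas, the traversal also works for polyhierarchies.

```python
from collections import defaultdict

# (narrower thema, relationship type, broader thema)
EDGES = [
    ("Information storage and retrieval systems", "hierarchical",
     "Operations of libraries, archives, information centers"),
    ("Operations of libraries, archives, information centers", "hierarchical",
     "Library & information sciences"),
    ("Library & information sciences", "hierarchical",
     "Computer science, information & general works"),
]


def broader_closure(thema, edges):
    """Collect every thema reachable by following hierarchical edges upward."""
    parents = defaultdict(set)
    for narrower, relation, broader in edges:
        if relation == "hierarchical":
            parents[narrower].add(broader)
    found, stack = set(), [thema]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(len(broader_closure("Information storage and retrieval systems", EDGES)))
```

Associative relationships could be stored in the same edge list under a different label; the traversal ignores them because it filters on the relationship type.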
2.2 Relationships (2)
Nomen-to-nomen relationships
• Equivalence: two nomens are considered equivalent only if they are appellations of the same thema in a controlled vocabulary.
• Partitive: an instance of a nomen may have parts. A whole-part relationship may exist between a nomen and its components.
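The equivalence rule reduces to a lookup: two nomens are equivalent exactly when they resolve to the same thema. A minimal sketch, assuming a single controlled vocabulary represented as a nomen-to-thema dictionary (the function name and the "025" entry are hypothetical; the other two nomens come from the DDC case study):

```python
def are_equivalent(nomen_a, nomen_b, thema_of):
    """True only if both nomens are appellations of the same known thema."""
    thema_a = thema_of.get(nomen_a)
    return thema_a is not None and thema_a == thema_of.get(nomen_b)


thema_of = {
    "025.04": "Class 025.04",
    "http://dewey.info/class/025.04/": "Class 025.04",
    "025": "Class 025",  # hypothetical extra entry for contrast
}

print(are_equivalent("025.04", "http://dewey.info/class/025.04/", thema_of))
print(are_equivalent("025.04", "025", thema_of))
```

Unknown nomens are treated as equivalent to nothing, so two unregistered strings never compare as equivalent by accident.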
2.3 Attributes
Some general attributes of thema and nomen are proposed.
(1) Thema attributes: type of thema, scope note. In an implementation, themas can be organized based on category, kind, or type.
(2) Nomen attributes: see next slide. In an implementation, additional attributes may be recorded.
Nomen attributes include, but are not limited to:
• Type of nomen (identifier, controlled name, …)
• Scheme (LCSH, DDC, UDC, ULAN, ISO 8601, …)
• Reference source of nomen (Encyclopaedia Britannica, …)
• Representation of nomen (alphanumeric, sound, visual, …)
• Language of nomen (English, Japanese, Slovenian, …)
• Script of nomen (Cyrillic, Thai, Chinese-simplified, …)
• Script conversion (Pinyin, ISO 3601, Romanisation of Japanese, …)
• Form of nomen (full name, abbreviation, formula, …)
• Time of validity of nomen (until xxxx, after xxxx, from … to …)
• Audience (English-speaking users, scientists, children, …)
• Status of nomen (provisional, accepted, official, …)
Note: examples of attribute values are given in parentheses.
2.4 The importance of the THEMA-NOMEN model to subject authority data
• Separates what are usually called concepts (or topics, subjects, classes [of concepts]) from what they are known by, referred to, or addressed as
• A general abstract model, not limited to any particular domain or implementation
• Potential for interoperability within the library field and beyond
3. FRSAD model for classification systems
• Each class corresponds to a thema
• Notation associated with the class is the nomen
• Thema is the full category description of the class
• Nomen is the symbol (or surrogate) used to represent the full category description
4. DDC case study
Thema: Class 025.04
Nomens: DDC number, Full caption, URI
• DDC number: 025.04
• Full caption: Computer science, information & general works/Library & information sciences/Operations of libraries, archives, information centers/Information storage and retrieval systems
• URI: http://dewey.info/class/025.04/
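As a rough sketch, the class and its three nomens could be recorded as follows. The record layout, the `kind` labels and the helper function are assumptions made for illustration; only the three nomen values come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nomen:
    value: str
    kind: str  # hypothetical label for the type of nomen
    scheme: str = "DDC"


CAPTION = (
    "Computer science, information & general works/"
    "Library & information sciences/"
    "Operations of libraries, archives, information centers/"
    "Information storage and retrieval systems"
)

class_025_04 = {
    "thema": "Class 025.04",
    "nomens": [
        Nomen("025.04", "DDC number"),
        Nomen(CAPTION, "full caption"),
        Nomen("http://dewey.info/class/025.04/", "URI"),
    ],
}


def nomen_of_kind(record, kind):
    """Return the first nomen value of the given kind, or None if absent."""
    return next((n.value for n in record["nomens"] if n.kind == kind), None)


print(nomen_of_kind(class_025_04, "DDC number"))
print(nomen_of_kind(class_025_04, "full caption").split("/")[-1])
```

Splitting the full caption on "/" yields its four segments, which is one possible reading of the partitive relationship between a nomen and its components.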
Thema:
• Any topic co-extensive with the full meaning of the class
• Topics that are functionally equivalent to the class
Scope note: Text describing or defining thema or specifying scope within particular system
[Figure callout: Scope note (≠ thema/class)]
Thema-to-thema relationships
[Figure callouts: associative relationship; (poly)hierarchical relationship]
Alternative nomens: Relative Index terms with equivalence relationship to class
[Figure legend: SN = scope note; equivalence relationship; ? = unknown relationship]
Derived alternative nomens
150 ## $a Databanks
260 ## $i see also $a Databases
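To work with field lines in the display form shown on this slide (tag, indicators, then `$`-coded subfields), a few lines of string handling are enough. This is a sketch for that printed notation only, under the assumption that values contain no `$`; it is not a parser for real MARC records, and `parse_field` is a hypothetical helper.

```python
import re


def parse_field(line):
    """Split 'TAG IND $c value ...' into (tag, indicators, [(code, value), ...])."""
    tag, indicators, rest = line.split(" ", 2)
    subfields = re.findall(r"\$(\w)\s+([^$]+)", rest)
    return tag, indicators, [(code, value.strip()) for code, value in subfields]


for line in ["150 ## $a Databanks", "260 ## $i see also $a Databases"]:
    print(parse_field(line))
```

Once split this way, the `$a` values can be collected as candidate alternative nomens, while the `$i` text records the instruction that links the two headings.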
[Figure legend: SN = scope note; Derived; equivalence relationship; ? = unknown relationship]
5. Findings and limitations
• FRSAD conceptual model appears to accommodate DDC data at a broad level
• Topic-to-topic relationships require further study
• The study did not consider the usefulness of classification data modelled using FRSAD in real-world applications
6. Future work
• Specify all relationships between Relative Index terms and classes (see earlier work by Green, Mitchell)
6. Future work
• Specify all relationships between Relative Index terms and classes (see earlier work by Green, Mitchell)
• Investigate DDC translations and mappings in context of model
[Figure: DDC translations. Recoverable edition and product labels: DDC 22, Mixed, A14, DDC Sachgruppen (German), Guide (French), 200 Religion Class, DDC Summaries. Recoverable language labels: Afrikaans, Arabic, Chinese, English, French, German, Hebrew, Italian, Norwegian, Portuguese, Rhaeto-Romansch, Russian, Scots Gaelic, Spanish, Swedish, Vietnamese]
Mappings and crosswalks
[Figure: schemes shown: SEARS, BISAC, CSH, RAMEAU, SAO, MeSH, SAB, DDC, LCSH, LCC, SWD, Nuovo Soggettario, UDC]
Thema-to-thema relationships across languages: Class 025.04 (22/swe) = Class 025.04 (22)
Thema-to-thema relationships (complex case):
T2 — 43414 (22) = T2 — 43414 (22/ger), but . . .
(22)
  T2 — 43414 Giessen district (Giessen Regierungsbezirk)
    Including *Lahn River [not equivalent to thema/class T2 — 43414]
(22/ger)
  T2 — 43414 Regierungsbezirk Gießen
  T2 — 434147 Lahn-Dill-Kreis
    Hier auch: der Fluss *Lahn ("Class here: the river *Lahn") [functionally equivalent to thema/class T2 — 434147]
6. Future work
• Specify all relationships between Relative Index terms and classes (see earlier work by Green, Mitchell)
• Investigate DDC translations and mappings in context of model
• Investigate modelling the Relative Index as a separate controlled vocabulary to provide a topic-centered view
• Experiment with modelling other classification schemes
• Investigate usefulness of classification data modelled using FRSAD