Documenting Heritage Science: A CIDOC CRM-based System for Modelling Scientific Data Lisa Castelli | INFN, Italy Achille Felicetti | PIN, University of Florence, Italy
Apologies I’m NOT A DIGITAL WOMAN … For any question about the technical details of the model please contact my colleague Achille Felicetti achille.felicetti@gmail.com
WHY AM I HERE? INFN-CHNet in ARIADNE plus
INFN-CHNet: Born to coordinate the cultural heritage activities of INFN facilities, it is opening to partners with different competencies (restoration centres, archaeology/chemistry departments in universities, …)
INFN-CHNet • Fixed Labs, Medium-large scale facilities (IBA, 14C, ...), Mobile Labs, Digital Labs • Techniques: mass spectrometry, TL dating, X-ray imaging, XRF, thermography, XRD • Digital Labs: data storage and web tools for data fruition
CHNet-DIGILAB: an ideal world • CHNet database @CNAF: data + metadata • Web services: fruition tools, analysis tools • User access control, browse/query interfaces • Expert user: fruition and analysis • Non-expert user: fruition
ARIADNE plus • Realization of a complex Digital Infrastructure • 40 partners from all over Europe! • Archaeology + historical sources, linguistic data, catalogue data, scientific data from archaeometric analyses • Innovative Services and Cloud Environment • Data annotation (image and text) • More advanced NLP and text mining (machine learning approach) • Definition of unambiguous spatial and temporal entities for archaeology • Geographic system: interoperability of national geographical systems and GIS, to be integrated into a single transnational system (ARIADNE GeoServer) • Cloud environment: shared resources, efficient management of services • Implementation of services for visualization and analysis, including scientific data, high-resolution images and online 3D visualization.
Goals of DIGILAB (CHNet and more…) • Interoperability between different Heritage Science communities • Data from analyses, conservation and restoration activities • Cross-disciplinary information integration • Interoperability with Humanities (History, Archaeology, Art History, [ … ]) • System for discovering, accessing, reusing integrated data and services • Solid metadata model for data and service description required • Design of the integrated system
Existing schemas and metadata models • International metadata standards for scientific research • CERIF (EU recommended format), OBOE (scientific observation and measurement), NeXus (neutron and x-ray), AVM (astronomical images), CIF (crystallography), [ … ] Too general/too specific: a chaos! • Formats in use by national labs/institutions (Italy) • Institute for the Conservation and Valorisation of Cultural Heritage (ICVBC-CNR): XRD data in XRDML format, CIF taxonomy of crystallographic terms • National Institute for Nuclear Physics (INFN) CHNet: C14 binary data + data in Excel tables, XRF data in RAW format • Istituto Superiore per la Conservazione ed il Restauro (ISCR): multispectral information in ICCD-based format • Opificio delle Pietre Dure (OPD): photographs, radiographs, 3D models in various formats • National Institute of Optics (CNR-INO): multispectral and OCT data in RAW and TIFF formats • Institute of Molecular Science and Technologies (ISTM-CNR): MOVIDA software (multi-technique diagnostics, information in relational database) -> MOLAB • Formats used by European labs/institutions: investigation planned for next period
“Going semantic”: the CIDOC CRM family • CIDOC CRM (ISO 21127:2006) - http://cidoc-crm.org • High-level conceptual model for standardisation, integration and interoperability of Cultural Heritage information • Core model + domain-specific extensions • CRMsci - http://cidoc-crm.org/crmsci • Scientific observation information • CRMdig - http://cidoc-crm.org/crmdig • Provenance metadata for digital objects • CRMpe • PARTHENOS Entities Model • Datasets, services and curators • Versioning, accessibility, licensing … • PARTHENOS Registry
Basis of the CIDOC CRM model • Entities: Devices (D8), Actor (E39), Man-made object (E22), Legal Bodies (E40), Activity (E7), Campaign, Measurement, Datasets, Places (E53)/Time (E52), Analysis (PE18) • Relationship labelled in the diagram: "Belong to"
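To make the idea of typed entities joined by relationships concrete, here is a minimal sketch in Python. It only reuses the class codes listed on the slide (E39, E40, D8, E22, E7). The instance names and the relationship labels (`belongs_to`, `carried_out_by`, `used_device`, `measured`) are hypothetical placeholders, not official CIDOC CRM property identifiers, and the structure is an illustration rather than the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A named thing tagged with a class code taken from the slide."""
    name: str
    crm_class: str


# Hypothetical instances, invented for illustration only.
lab = Entity("Example Lab", "E40")                 # Legal Body
researcher = Entity("Example Researcher", "E39")   # Actor
device = Entity("Example XRF scanner", "D8")       # Device
painting = Entity("Example painting", "E22")       # Man-made object
measurement = Entity("Example measurement", "E7")  # Activity

# Relationship labels are placeholders, not CRM property identifiers.
triples = [
    (researcher, "belongs_to", lab),
    (device, "belongs_to", lab),
    (measurement, "carried_out_by", researcher),
    (measurement, "used_device", device),
    (measurement, "measured", painting),
]


def objects_of(subject: Entity, predicate: str) -> list:
    """Return every entity the subject points to via the predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects_of(predicate: str, obj: Entity) -> list:
    """Return every entity that points to obj via the predicate."""
    return [s for s, p, o in triples if p == predicate and o == obj]
```

With this structure, `subjects_of("belongs_to", lab)` lists everything attached to the lab, and `objects_of(measurement, "measured")` returns the object that was analysed.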
Our model • XML file of metadata • Entities: Devices, Actor, Man-made object, Legal Bodies, Campaign, Measurement, Datasets, Places/Time, Analysis
Our model for scientific datasets description • Reuses logics and components of existing CIDOC models • Modular approach • “Complexity hiding” paradigm in content generation • User-friendly interfaces for rapid manual and semi-automatic metadata creation • Scripts for automatic generation of information (e.g.: from digital devices) or existing metadata in a transparent way • Semantic network for advanced queries (CIDOC CRM modelling principles) • Properties and relationships to link entities in a meaningful way • Platform independent intermediate meta-format (XML) • Dynamic mapping and re-encoding in existing standards (CSV, SQL, RDF, JSON …) • Mapping to CIDOC CRM for HS + CH interoperability implementation
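The step from an intermediate XML meta-format to other encodings can be sketched as follows. This is an illustration under stated assumptions, not the project's schema or code: the element names (`dataset`, `technique`, `actor`, `object`, `date`) and all values are hypothetical. Only the Python standard library is used.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical metadata record; tag names and values are illustrative only.
RECORD_XML = """
<dataset id="ds-001">
  <technique>XRF</technique>
  <actor>Example Lab</actor>
  <object>Example painting</object>
  <date>2019-05-14</date>
</dataset>
"""


def xml_to_dict(xml_text: str) -> dict:
    """Flatten a one-level XML record into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    record = {"id": root.get("id")}
    for child in root:
        record[child.tag] = (child.text or "").strip()
    return record


def to_json(record: dict) -> str:
    """Re-encode the record as a JSON object."""
    return json.dumps(record, sort_keys=True)


def to_csv(record: dict) -> str:
    """Re-encode the record as a header line plus one CSV row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(record), lineterminator="\n")
    writer.writeheader()
    writer.writerow(record)
    return buffer.getvalue()


record = xml_to_dict(RECORD_XML)
json_text = to_json(record)
csv_text = to_csv(record)
```

Keeping the XML record as the single source and deriving every other encoding from the same dictionary means a new target format only needs one extra function.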
Entities: standardising the content • Standard notations, Gazetteers, PIDs, Permalinks, URIs, URLs • Actors and Artefacts: VIAF/ORCID, DBPedia URIs, authority files, controlled lists of institution/person names • Time Spans and Periods: ISO 8601 encoding, xsd:date/xsd:dateTime formats, PeriodO URIs (period names and time spans) • Places: WGS84 (spatial coordinates), Geonames URIs (place identification), Pleiades/Pelagios URIs/URLs (historical places) • Persistent Identifiers: unambiguous, permanent identification of entities • Examples: Artefacts: http://dbpedia.org/resource/Mona_Lisa; People: https://viaf.org/258811 (Franco Niccolucci) • Thesauri, vocabularies, controlled lists of types • Widely used in Cultural Heritage • AAT (Getty’s Art and Architecture Thesaurus), Nomisma.org (numismatics), PICO (Italian ICCD thesaurus), [ … ] • Examples in science: CIF taxonomy of crystallographic terms • Existing ones: to be identified | New ones: to be created
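As an illustration of what standardised content implies in practice, the following sketch checks three of the conventions named above: an ISO 8601 calendar date, WGS84 coordinate ranges, and an HTTP(S) URI used as an identifier. The function names and the record layout are assumptions made for this example. The check only looks at syntax and does not verify that a URI actually resolves.

```python
from datetime import date
from urllib.parse import urlparse


def is_iso8601_date(text: str) -> bool:
    """True if text parses as an ISO 8601 calendar date such as 2019-05-14."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


def is_wgs84(lat: float, lon: float) -> bool:
    """True if latitude and longitude fall inside the valid WGS84 ranges."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def is_http_uri(text: str) -> bool:
    """True if text looks like an http or https URI with a host part."""
    parts = urlparse(text)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_entity(entity: dict) -> list:
    """Return a list of problems found in a hypothetical entity record."""
    problems = []
    if not is_http_uri(entity.get("uri", "")):
        problems.append("uri")
    if "date" in entity and not is_iso8601_date(entity["date"]):
        problems.append("date")
    if "lat" in entity and not is_wgs84(entity["lat"], entity.get("lon", 0.0)):
        problems.append("coordinates")
    return problems
```

For example, `check_entity({"uri": "http://dbpedia.org/resource/Mona_Lisa", "date": "14/05/2019"})` reports a problem with the date only, because the URI is well formed and no coordinates were given.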