Investigation as a member of research discourse
Vasily Bunakov
Science and Technology Facilities Council, United Kingdom
Digital Libraries: Advanced Methods and Technologies, Digital Collections. Dubna, Russia, October 16, 2014
STFC
Funds and operates large-scale instruments for UK and visiting researchers in:
- physics, astronomy
- chemistry, materials
- biology, medicine
Scientific Computing develops and operates computing infrastructure:
- High Performance Computing
- Petabyte data store
- CERN LHC Tier 1 hub
It also conducts applied research and develops software.
Big Facilities for Small Science
- ISIS neutron and muon source
- Central Laser Facility
- Diamond Light Source
- Facilities Support
Facilities science in Europe
- PaNdata Europe, 2010 – 2011. Preparation: common policies and standards. http://pan-data.eu/pandata/?q=PaNdataEurope
- PaNdata ODI, 2011 – 2014. Implementation: delivering new infrastructure. http://pan-data.eu/pandata/?q=ODIWP
Computing support throughout the scientific lifecycle
STFC Scientific Computing supports each stage in the work of researchers, from background research, through conducting simulations and experiments, to analysing and archiving data.
[Figure labels: JISCMail, e-pubs, Grid Computing]
Facilities Research Lifecycle
- Proposal: scientist submits application for beamtime
- Approval: facility committee approves application
- Scheduling: facility registers, trains, and schedules scientist's visit
- Experiment: scientist visits, facility runs experiment
- Data storage: raw data filtered and stored
- Data analysis: tools for processing made available
- Record publication: subsequent publication registered with facility
A corresponding intellectual entity: Investigation
ICAT data catalogue software: http://code.google.com/p/icatproject/
DOIs for experimental data
DOIs obtained through DataCite are much cheaper than DOIs obtained directly from the DOI Foundation.
www.DataCite.org
Our DOI landing pages are in fact for Investigations (series of Experiments)
LCDP 2013
ICAT data catalogue called from DOI landing page
https://data.isis.stfc.ac.uk/TOPCATWeb.jsp#view///&&tab=Search//&Model=INVESTIGATION&ServerName=ISIS&InvestigationId=24071239
What can cite what
Citations run "from" the entity in the first column "to" the entity in the header row. V = can cite, X = cannot cite.

From \ To       Publication   Investigation   Dataset                              Software
Publication     V             V               V                                    V
Investigation   V             V               V                                    V
Dataset         X             V               V (derived or aggregated datasets)   V (simulation)
Software        V             X               V (testing)                          V (software libraries, service calls)
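The citation matrix can be held as a small lookup structure. The sketch below is only an illustration: it reads the first column as the citing side (the reading under which Publication and Investigation can cite every entity type), and the names `CAN_CITE`, `can_cite` and `full_citers` are hypothetical, not part of ICAT or any other tool mentioned in these slides.

```python
# Citation matrix transcribed from the slide.
# Outer key = citing entity ("from"), inner key = cited entity ("to").
# Each value is (allowed, note); all names here are illustrative assumptions.
ENTITIES = ["Publication", "Investigation", "Dataset", "Software"]

CAN_CITE = {
    "Publication": {e: (True, "") for e in ENTITIES},
    "Investigation": {e: (True, "") for e in ENTITIES},
    "Dataset": {
        "Publication": (False, ""),
        "Investigation": (True, ""),
        "Dataset": (True, "derived or aggregated datasets"),
        "Software": (True, "simulation"),
    },
    "Software": {
        "Publication": (True, ""),
        "Investigation": (False, ""),
        "Dataset": (True, "testing"),
        "Software": (True, "software libraries, service calls"),
    },
}


def can_cite(source: str, target: str) -> bool:
    """Return True if the matrix marks source -> target with V."""
    return CAN_CITE[source][target][0]


def full_citers() -> list:
    """Entities that can cite every entity type in the matrix."""
    return [s for s in ENTITIES if all(can_cite(s, t) for t in ENTITIES)]


if __name__ == "__main__":
    print(full_citers())
```

Under this reading, only Publication and Investigation can cite every other kind of entity.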
Publication and Investigation similarity

No  Feature / aspect                                                         Publication  Investigation
1   Is an intellectual entity                                                V            V
2   Is a subject of peer review                                              V            V (via proposal approval)
3   Can cite all significant intellectual entities of a research discourse   V            V
4   Citation chains (steps of discourse) observed                            V            V
5   Universal identifiers "mints" available                                  V            V

This gives Investigation a potential for a "full membership" in the research discourse along with Publication. Datasets and software are likely to remain "associated members" because of weaker features 2, 3 and a de-facto weaker feature 5.
Publications and investigations network
How to link Publications and Investigations?
Each pair below shows a bibliographic reference in the ICAT Data Catalogue (a few thousand records) and the corresponding ePubs Publications Repository reference.

1. ICAT:  Pratt et al, Phys. Rev. Lett. 96, 247203 (2006)
   ePubs: Phys Rev Lett 96 247203 (2006)
2. ICAT:  Lancaster et al, Phys. Rev B73, 020410(R) (2005)
   ePubs: Phys Rev B 73 020410 (2006)
3. ICAT:  Blundell and Pratt, J. Phys.: Condens. Matter 16, R771 (2004)
   ePubs: J Phys Condens Matter 16 R771-R828 (2004)
4. ICAT:  M.T.F.Telling and S.H.Kilcoyne, Electron transfer in dextran, J. Phys.: Condens. Matter 19 No 2 (17 January 2007)
   ePubs: J Phys Condens Matter 19 2 026221 (2007)
5. ICAT:  J Tomkinson and M.T.F Telling, Ammonium ions in alkali metal halide crystals: Tunnelling and spin relaxation, PCCP 2006 8 38 4434
   ePubs: Phys Chem Chem Phys 8 4434-4440 (2006)
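One possible way to pair the free-text ICAT references with the shorter ePubs references is token containment: normalise both strings, then measure what share of the ePubs tokens also occurs in the ICAT reference. The sketch below is an assumption for illustration, not the matching procedure actually used; the alias table, the 0.7 threshold and all function names are hypothetical.

```python
import re

# Hypothetical alias table; a real one would cover many more journals.
ALIASES = {"pccp": "phys chem chem phys"}

EPUBS = [
    "Phys Rev Lett 96 247203 (2006)",
    "Phys Rev B 73 020410 (2006)",
    "J Phys Condens Matter 16 R771-R828 (2004)",
    "J Phys Condens Matter 19 2 026221 (2007)",
    "Phys Chem Chem Phys 8 4434-4440 (2006)",
]


def tokens(ref: str) -> set:
    """Lower-case, split letters glued to digits ('B73'), drop punctuation."""
    text = re.sub(r"(?<=[a-z])(?=\d)", " ", ref.lower())
    words = re.sub(r"[^a-z0-9]+", " ", text).split()
    expanded = []
    for word in words:
        expanded.extend(ALIASES.get(word, word).split())
    return set(expanded)


def containment(icat_ref: str, epubs_ref: str) -> float:
    """Share of the ePubs tokens that also occur in the ICAT reference."""
    short = tokens(epubs_ref)
    return len(short & tokens(icat_ref)) / len(short) if short else 0.0


def best_match(icat_ref: str, epubs_refs: list, threshold: float = 0.7):
    """Return the best-scoring ePubs reference, or None below the threshold."""
    scored = [(containment(icat_ref, e), e) for e in epubs_refs]
    score, ref = max(scored, default=(0.0, None))
    return ref if score >= threshold else None


if __name__ == "__main__":
    print(best_match("Lancaster et al, Phys. Rev B73, 020410(R) (2005)", EPUBS))
```

Containment rather than exact comparison tolerates the small disagreements visible in the pairs above, such as the differing year in the Lancaster et al. reference.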
Beyond bibliographic records matching

Publications repository records    Data catalogue records and their DOI landing pages (investigation descriptions)
Instruments / departments tags     Instruments to investigations mapping
Authors                            Researchers' names
Publication title                  Investigation title
Authors' organizations             Researchers' organizations
Publication date                   Investigation period
Abstract                           Investigation description
Keywords, e.g. PACS indices        ICAT keywords

The few thousand publications matched with the publications repository at the previous stage could be used for tuning and testing machine learning techniques.
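The field pairs above could be turned into numeric features for a (publication, investigation) candidate pair, which a classifier might then be tuned on using the already matched records. The following is a minimal sketch under that assumption; the record classes, field names and feature names are hypothetical and do not reflect the actual ePubs or ICAT schemas, and the classifier itself is not shown.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass
class Publication:  # hypothetical record shape, not the ePubs schema
    title: str
    authors: set
    organizations: set
    keywords: set
    instruments: set
    published: date


@dataclass
class Investigation:  # hypothetical record shape, not the ICAT schema
    title: str
    researchers: set
    organizations: set
    keywords: set
    instrument: str
    start: date
    end: date


def words(text: str) -> set:
    """Lower-cased alphanumeric tokens of a title."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(pub: Publication, inv: Investigation) -> dict:
    """One feature per field pair listed on the slide."""
    return {
        "instrument_match": float(inv.instrument in pub.instruments),
        "name_overlap": jaccard(pub.authors, inv.researchers),
        "title_overlap": jaccard(words(pub.title), words(inv.title)),
        "organization_overlap": jaccard(pub.organizations, inv.organizations),
        "keyword_overlap": jaccard(pub.keywords, inv.keywords),
        "published_after_period": float(pub.published >= inv.end),
    }
```

Abstract and investigation description could be compared in the same way as the titles; they are left out here to keep the sketch short.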
What represents facilities research?
- Yesterday: publications (Publications catalogue)
- Today: publications and data, in fact Investigations (Publications catalogue + Experiment descriptions catalogue)
- Tomorrow: "facility-centric" Linked Open Data cloud (Publications, Experiments, Research Awards, selected ontologies, selected external Linked Data)
Linked Data technology stack
[Figure: layered architecture diagram. Recoverable component labels, top to bottom:]
- Semantically enriched data; bespoke / customized software applications
- Command line tools (ARQ, loaders, optimizers, …); SPARQL (can be from a remote client); bespoke (Jena) application; Fuseki Web application; Linked Data API
- Data cleansers and mappers with vocabularies, ontologies, geolocation services and other Linked Data sources
- Triple store (Jena TDB)
- RDF extractors and loaders: OAI-PMH harvesters & mappers; database converters; Linked Data wrappers
- Sources: OAI-PMH sources; databases; other triple stores, SPARQL endpoints and Linked Data APIs
Legend: implemented or evaluated components; prospective components; facilities user community
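As an illustration of what the RDF extraction step might emit before loading into a triple store such as Jena TDB, the sketch below serialises a publication-to-investigation link as N-Triples using only the Python standard library. The `example.org` IRIs are placeholders, the choice of the Dublin Core `title` and CiTO `cites` properties is an assumption rather than the vocabulary used in this work, and the loading step is not shown.

```python
# Real vocabulary terms, chosen here only for illustration.
DCT_TITLE = "http://purl.org/dc/terms/title"
CITO_CITES = "http://purl.org/spar/cito/cites"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text: str) -> str:
    """Escape a plain string as an N-Triples literal."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triples) -> str:
    """Serialise (subject, predicate, object) terms, one triple per line."""
    return "".join(f"{s} {p} {o} .\n" for s, p, o in triples)


if __name__ == "__main__":
    # Placeholder identifiers, not real records.
    publication = "http://example.org/publication/1"
    investigation = "http://example.org/investigation/1"
    print(to_ntriples([
        (iri(publication), iri(DCT_TITLE), literal("An example title")),
        (iri(publication), iri(CITO_CITES), iri(investigation)),
    ]), end="")
```

N-Triples is line-based, which makes it convenient for bulk loaders; other RDF serialisations would serve equally well.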
Information entities circulating in your research domain
• What are they? (beyond publications)
• Do they have a clear identity?
• Do they circulate in your organization only or universally across organizations?
• Can they be linked with publications and other information entities?
• Can they be linked with the world-wide data cloud?
Thank you!
Scientific Computing Department