CISMeF Catalog & Index of Health Resources in French on the Internet Prof. SJ. Darmoni, MD, PhD TIBS, LITIS Lab Rouen University Hospital & Rouen Medical School, France Email: Stefan.Darmoni@chu-rouen.fr MIE Oslo August 2011
Introduction § Quality-controlled subject gateways (or portals) were defined by Koch as Internet services which apply a comprehensive set of quality measures to support systematic resource discovery. § CISMeF = quality-controlled health gateway for French institutional health resources ü www.cismef.org
Introduction § The objective of CISMeF (Catalog and Index of French-speaking resources) is to assist health professionals and lay people in searching for electronic information available on the Internet. CISMeF covers healthcare disciplines and medical sciences. § CISMeF was originally initiated by Rouen University Hospital (RUH). § URL: http://www.cismef.org & http://www.chu-rouen.fr/cismef § CISMeF began in February 1995 § Doc’CISMeF in 2000: creation of a generic search tool using the CISMeF semi-informal ontology § URL: http://doccismef.chu-rouen.fr/ Methods of Information in Medicine 2000 Jan;39(1):30-35 Medical Informatics & The Internet in Medicine 2001;26(3):165-178
CISMeF terminology § Two standard tools for organising information: ü the MeSH (Medical Subject Headings) thesaurus from the US National Library of Medicine ü Several metadata element sets • the Dublin Core metadata format + CISMeF-specific fields • For teaching resources, the IEEE 1484 LOM metadata format: 11 elements of the LOM Educational category => DC.Education • For evidence-based medicine resources, CISMeF-specific fields: level of evidence + method used to evaluate it • The HIDDEL metadata set is used to enhance transparency, trust and quality of health information on the Internet. § Do not reinvent the wheel +++ but adapt it DC-2004, International Conference on Dublin Core and Metadata Applications Stud Health Technol Inform. 2003;95:707-712
MeSH ‘enhancements’ § The heterogeneity of Internet health resources led the CISMeF team to enhance the MeSH thesaurus with two new concepts (resource types and metaterms), complemented by predefined queries ü resource types (N ≈ 300), ü metaterms (N ≈ 120), ü predefined queries (N ≈ 200) Health Information and Libraries Journal 2004 Dec;21(4):253-61
MeSH ‘enhancements’ § Improvement of the MeSH thesaurus itself ü Addition of 10,000 French synonyms, including (ambiguous) acronyms ü Manual translation of 6,000 definitions (semi-automatic translation of the rest of the MeSH planned soon) ü French translation of >20,000 MeSH Supplementary Concepts (SC) & addition of 6,000 synonyms
Strategic revolution in 2005 § Between 1995 & 2005, a mono-terminological world built around the MeSH § Since 2005, shift to a multi-terminological universe: ü CCAM, CIM10 (ICD-10), SNOMED Int., CIF/CIH (ICF/ICIDH), CISP2 (ICPC-2), DRC ü Creation of a French Health Multi-Terminological Server (HMTS): ANR, InterSTIS ü Multi-Terminological extraction (7th FP EU, PSIP) ü Multi-Terminological Information Retrieval (JFIM 2009) § Several health terminologies for automatic indexing and information retrieval in the CISMeF quality-controlled health portal… and beyond § Can be reused in any European language for which health terminologies are available, in particular in Norway!
Situation in August 2011 (TIBS: Information processing in biology and in health) [Diagram: the CISMeF Information System, built on 32 health terminologies, with four components: multi-terminology information retrieval (Doc’CISMeF search engine); multi-terminology automatic indexing (ECMT); InfoRoute; European Health multi-Termino-Ontology cross-lingual Portal (EHTOP)]
Multi-Terminological extraction § Collaboration with the Vidal company § F-MTI & ECMT tools ü 3 PhDs (A. Névéol, S. Pereira & S. Sakji) § Bag-of-words algorithm, stemming (or lemmatization) § Inclusion of health terminologies available in French ü SNOMED Int, ICD 10, MeSH, MeSH SC, ICDC (included in UMLS) ü ATC, CIF (WHO) ü CCAM, DRC, Orphanet, TUV, CIS, CIP, INN, Brand Names ü MedDRA, WHO-ART, LOINC (to be included) ü Recent study on the CISMeF corpus, mono-terminology vs. multi-terminology (AMIA 2009): +7% recall; -12% precision
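The bag-of-words matching with stemming mentioned above can be sketched as follows. This is a minimal illustration, not the F-MTI or ECMT implementation: the toy term list, the `stem` suffix stripper (a crude stand-in for a real stemmer or lemmatizer) and the function names are all assumptions made for the example.

```python
import re

def stem(word: str) -> str:
    """Crude plural stripper; a stand-in for a real stemmer or lemmatizer."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def bag(text: str) -> frozenset:
    """Lower-case the text, split it into words and keep the set of stems."""
    return frozenset(stem(w) for w in re.findall(r"[a-z]+", text.lower()))

# Toy entries only (terminology name, term label); not real terminology content.
TOY_TERMS = [
    ("MeSH", "Asthma"),
    ("MeSH", "Myocardial Infarction"),
    ("ICD-10", "Asthma"),
    ("ATC", "salbutamol"),
]

def extract(text: str, terms=TOY_TERMS) -> list:
    """Return every term whose bag of stems is contained in the text's bag."""
    words = bag(text)
    return sorted(t for t in terms if bag(t[1]) <= words)

print(extract("Infarctions of the myocardial wall in patients with asthma"))
```

Word order and plural forms are ignored, so "Infarctions of the myocardial wall" still matches the term "Myocardial Infarction". Searching several terminologies at once returns more candidate terms per text, which is one plausible reading of the recall and precision trade-off reported on the slide.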
Multi-Terminological extraction § New concept: automatic affiliation of a subheading to a MeSH term § Manual affiliation of a subheading to a MeSH Supplementary Concept (evaluation still to be performed) § Stoilos & Levenshtein distances (PhD Z. Moalla, 2nd year) § In the near future, MeSH indexing at the concept level and no longer at the descriptor level ü Interesting for rare diseases; potential collaboration
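For readers unfamiliar with the string distances named above, here is a minimal sketch of the Levenshtein (edit) distance and of how it might be used to find the closest term label for a misspelled query. The `closest_term` helper and the label list are hypothetical; the Stoilos metric is not implemented here, and the slides do not say how the two distances are combined in CISMeF.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def closest_term(query: str, labels: list) -> str:
    """Hypothetical helper: label with the smallest case-insensitive distance."""
    return min(labels, key=lambda label: levenshtein(query.lower(), label.lower()))

print(levenshtein("kitten", "sitting"))
print(closest_term("astma", ["Asthma", "Anemia", "Angina"]))
```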
IR in CISMeF: currently § Only three steps ü Step 1: Reserved terms (∈ CISMeF terminology) OR document's title ü Step 2: The CISMeF metadata: mixing the reserved terms, all fields and adjacency in the titles (word adjacency: (n-1)*5) ü Step 3: Adjacency in the plain texts: mixing the reserved terms, all fields and adjacency in the plain texts (word adjacency: (n-1)*10)
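The word-adjacency thresholds above can be illustrated with a short sketch. It assumes one possible reading of the slide: for a query of n words, all query words must occur within a span of at most (n-1)*5 word positions in a title, or (n-1)*10 in plain text. The function names and the tokenisation are assumptions for the example, not the Doc’CISMeF implementation.

```python
import re
from collections import Counter

TITLE_FACTOR = 5   # from the slide: (n-1)*5 for titles
TEXT_FACTOR = 10   # from the slide: (n-1)*10 for plain text

def tokens(text: str) -> list:
    return re.findall(r"\w+", text.lower())

def min_span(query_words: list, doc_words: list):
    """Smallest distance, in word positions, between the first and last word
    of a window containing every query word; None if a word is missing."""
    needed = set(query_words)
    if not needed:
        return None
    counts = Counter()
    best = None
    left = 0
    for right, word in enumerate(doc_words):
        if word in needed:
            counts[word] += 1
        while len(counts) == len(needed):  # window holds every query word
            span = right - left
            if best is None or span < best:
                best = span
            leaving = doc_words[left]
            if leaving in needed:
                counts[leaving] -= 1
                if counts[leaving] == 0:
                    del counts[leaving]
            left += 1
    return best

def adjacent(query: str, text: str, factor: int) -> bool:
    """True if all query words lie within (n-1)*factor word positions."""
    words = tokens(query)
    span = min_span(words, tokens(text))
    return span is not None and span <= (len(words) - 1) * factor

print(adjacent("heart failure", "Guidelines on chronic heart failure", TITLE_FACTOR))
```

Under this reading, a single-word query only needs the word to be present, and the looser plain-text factor lets query words sit further apart in long documents than in titles.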
CISMeF Information Retrieval § Since 2005, four levels of indexing in CISMeF ü Level 1: manual indexing (e.g. guidelines) ü Level 2: supervised indexing (e.g. technical reports or teaching documents from national medical societies) ü Level 3: automatic indexing (e.g. SCPs, teaching documents from a single medical school) ü Level 4: extending the CISMeF corpus => Google CISMeF (restricted to publishers included in CISMeF)
CISMeF Information Retrieval § Some differences with PubMed ü Automatically indexed resources are included § CISMeF resource ranking ü Analysis of the query ü MeSH Major (or Title) first (display of score) • Then, date (as in PubMed) ü Automatic (Title or SubTitle) ü Minor MeSH
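The ranking order above can be sketched as a two-level sort. This is only one possible reading of the slide: resources matched on a major MeSH term or the title come first, then automatically indexed matches, then minor MeSH matches, with the most recent resource first inside each tier. The `Resource` class, the `match` labels and the tier table are hypothetical names introduced for the example; the score shown to users is not modelled.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Resource:
    title: str
    published: date
    match: str  # assumed labels: "major", "automatic" or "minor"

# Assumed tier order, following the order of the bullets on the slide.
TIER = {"major": 0, "automatic": 1, "minor": 2}

def rank(resources: list) -> list:
    """Sort by tier first, then by publication date, newest first."""
    return sorted(resources, key=lambda r: (TIER[r.match], -r.published.toordinal()))

results = rank([
    Resource("A", date(2009, 1, 1), "minor"),
    Resource("B", date(2008, 5, 1), "major"),
    Resource("C", date(2010, 3, 1), "major"),
    Resource("D", date(2011, 1, 1), "automatic"),
])
print([r.title for r in results])
```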
Multi-Terminological Information Retrieval § RIMT (multi-terminological information retrieval) uses the same health terminologies and is integrated into the CISMeF back office ü Operational in Doc’CISMeF since April 2009 (test) ü Bi-terminological in the PSIP DIP since September 2008 § Bag-of-words algorithm, stemming § Double context ü Knowledge (CISMeF) + contextual knowledge • PhD Saoussen Sakji, Dec. 2010 (Tunisia) ü Electronic Health Record (EHR) • PhD AD Dirieh-Dibad, 4th year, planned March 2012 (Djibouti)
[Figure; labels: ATC, MeSH]
Results in the Doc’CISMeF search engine § Use of multi-terminology indexing with SNOMED & MedDRA + MeSH indexing [Figure; labels: multi-terminology information retrieval, multi-terminology manual indexing using PTS]
CISMeF & PTS § During 2009, in collaboration with 8 engineering students from INSA de Rouen, and with LERTIM & MONDECA, the CISMeF team developed a Multi-Terminology Health Portal (PTS, its French acronym). § Since 2007, design of a generic model to integrate the main terminologies and ontologies available in French § The health terminologies currently included in PTS are: ü MeSH (+ MeSH SC + CISMeF extension), SNOMED Int, CCAM, ICD10 & ICPC2 (InterSTIS project) ü ATC, ICF, WHO-ART, WHO-ICPS (WHO) ü DRC, MedDRA, MEDLINEPlus ü CIS, CIP, CAS, EC, INN, Brand Names, PSIP taxonomy, IUPAC, NCC-MERP (PSIP Project) ü Orphanet (rare diseases), LPP & Cladimed (medical devices) ü TUV (Vidal), FMA, SNOMED CT, LOINC ü ADICAP, NCIT (to be included)
[Figure: HMTP generic model]
Health Multi-Terminology Portal (HMTP; PTS) § URL: http://pts.chu-rouen.fr/ § Access for humans and computers (Web services) ü Since September 2010, used daily by the CISMeF team to index Web resources manually and automatically ü Since January 2011, MeSH is freely available (600 unique users per working day) § Restricted access to the other terminologies (230 registered users) § Cooperation with BioPortal: Clement Jonquet & Mark Musen
Main figures

Date         Terminologies  Concepts     Synonyms     Definitions  Relations & hierarchies
May 2010     25             > 580 000    > 840 000    > 220 000    > 1 200 000
August 2011  32             > 1 100 000  > 2 300 000  > 220 000    > 4 000 000
Future work § EHTOP ü ICD10 in five European languages ü URL: cispro.chu-rouen.fr/ehtop_site ü Procedures & medical devices T/O § RIDoPI: Information retrieval on EHR ü Numerical data ü Temporal data ü RAVEL (2012-4 ANR TecSan program) § Interface Terminologies § Multi-lingual search engine (already multi-T/O) § Teaching document: http://www.univ-rouen.fr/med/breeze/SDinserm/index.htm
Many thanks § Email: Stefan.Darmoni@chu-rouen.fr