Automatic generation of MedDRA terms groupings using an ontology
Gunnar DECLERCK a, Cédric BOUSQUET a,b and Marie-Christine JAULENT a
a INSERM, UMRS 872 EQ20, Université Paris Descartes, France.
b Department of Public Health, CHU University of Saint Etienne, France.
MIE 2012, August 27th, Pisa
Context and Rationale
MedDRA (Medical Dictionary for Regulatory Activities): standard terminology used to code adverse drug reactions (ADRs) in safety reports for postmarketing drug surveillance.
Used for signal detection:
• Case reports coded with MedDRA are stored in databases (e.g. FDA pharmacovigilance database for the US, WHO Vigibase for Europe).
• Data mining algorithms are used to find statistical correlations (a "signal") between ADRs (several MedDRA terms defining a unique medical condition) and drugs.
• Empirical studies then assess the causal relationship between the drug and the ADR.
[Figure: MedDRA hierarchical structure]
PROTECT WP3-SP6 "Novel techniques for grouping ADRs to improve signal detection"
Main goal: To develop and evaluate semantic-driven methods for grouping MedDRA terms to improve signal detection performance.
Hypothesis: Signal detection is improved when algorithms use groupings of MedDRA terms referring to the same ADR condition (rather than one single term).
Need: a way to assist human experts in building MedDRA groupings (currently made manually).
Method: Building an ADR ontology (OntoADR) that provides MedDRA terms with formal, machine-processable definitions to support automatic MedDRA terms grouping procedures (OWL queries selecting terms on the basis of their semantic properties).
PROTECT: Pharmacoepidemiological Research on Outcomes of Therapeutics by a European ConsorTium. IMI (Innovative Medicines Initiative) project (2009-2014).
WP   Work Package title
1    Project management and administration
2    Framework for pharmacoepidemiological studies
3    Methods for signal detection
4    New tools for data collection from consumers
5    Benefit-risk integration and representation
6    Validation studies involving an Extended Audience
7    Training and communication
OntoADR ontology
MedDRA terminology enriched with formal definitions built from Snomed-CT concepts
• 34994 concepts (20856 MedDRA 13.0 terms, the others from Snomed-CT)
• 26 Snomed-CT relations used to express the medical meaning of MedDRA concepts:
  HASFINDINGSITE: specifies the body site affected by a condition
  HASOCCURRENCE: refers to the specific period of life during which a condition first occurs
  …
Ontologizing MedDRA
• Hierarchical relations from SOC level to PT (Preferred Terms) level are converted to subsumption (subclass_of) relations.
• LLTs (Low Level Terms) are integrated as annotation labels of PT concepts.
[Figure: MedDRA hierarchy levels: SOC level, HLGT level, HLT level, PT level]
Mapping process
• When possible, MedDRA concepts are mapped to Snomed-CT concepts via the UMLS Metathesaurus.
• The semantic information describing Snomed-CT concepts is used to build the formal definition of MedDRA concepts.
• When mapping is impossible, the formal definition is made manually (via collaboration between knowledge engineers and medical experts).
[Figure: Formal definition of the MedDRA PT concept « Dilatation intrahepatic duct congenital » after Snomed-CT mapping (mapped with "Congenital dilatation of lobar intrahepatic bile duct")]
OntoADR generation: general schema
[Figure: schema linking MedDRA (SOC, HLGT, HLT, PT) and Snomed-CT through the mapping process, with medical experts validating the mappings and manually enriching OntoADR. Output: ONTOADR.owl = MedDRA terms + formal definitions]
• 55.6% of MedDRA 13.0 terms could be defined using (i) a direct mapping with a Snomed-CT concept (UMLS or other mapping methods) or (ii) a handmade definition.
• Those terms cover 97.02% of the MedDRA terms used in the FDA database (calculated for the period 2004-2010: 11 million reports).
Using OntoADR to perform automatic query-based MedDRA terms groupings
Thanks to OntoADR, OWL queries can be built to select the MedDRA PTs whose formal definition fits some definitional criteria.
Example: query to catch MedDRA terms related to "Upper gastrointestinal bleeding":
hasAssociatedMorphology some 'Hemorrhage' AND hasFindingSite some 'Upper gastrointestinal tract structure'
This selects from the MedDRA hierarchy all PTs matching those two properties:
• Duodenal ulcer haemorrhage
• Gastric haemorrhage
• Mallory-Weiss syndrome
• Peptic ulcer haemorrhage
• etc.
Using OntoADR to perform automatic query-based MedDRA terms groupings
Through the subsumption mechanism, MedDRA terms referring to hemorrhages of parts of the upper gastrointestinal tract are also selected, e.g. Oesophageal ulcer haemorrhage, Gastric varices haemorrhage, etc.
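The selection logic of the two slides above can be sketched in a few lines of Python. This is a minimal illustration, not the OntoADR implementation: the is-a hierarchy, the formal definitions and the helper names (`IS_A`, `DEFINITIONS`, `has_some`, `select_pts`) are all hypothetical, and a real setup would run the OWL query through a reasoner. The sketch only shows how a conjunction of "some" restrictions combined with subsumption on the fillers picks up PTs defined on parts of the upper gastrointestinal tract.

```python
# Toy is-a hierarchy (child -> direct parents). Illustrative only,
# not actual Snomed-CT content.
IS_A = {
    "Stomach structure": {"Upper gastrointestinal tract structure"},
    "Esophageal structure": {"Upper gastrointestinal tract structure"},
    "Rectum structure": {"Lower gastrointestinal tract structure"},
}

# Hypothetical formal definitions: PT -> {relation: filler concepts}.
DEFINITIONS = {
    "Gastric haemorrhage": {
        "hasAssociatedMorphology": {"Hemorrhage"},
        "hasFindingSite": {"Stomach structure"},
    },
    "Oesophageal ulcer haemorrhage": {
        "hasAssociatedMorphology": {"Hemorrhage", "Ulcer"},
        "hasFindingSite": {"Esophageal structure"},
    },
    "Rectal haemorrhage": {
        "hasAssociatedMorphology": {"Hemorrhage"},
        "hasFindingSite": {"Rectum structure"},
    },
    "Gastric ulcer": {
        "hasAssociatedMorphology": {"Ulcer"},
        "hasFindingSite": {"Stomach structure"},
    },
}


def ancestors_or_self(concept):
    """Return the concept together with all of its is-a ancestors."""
    seen = {concept}
    stack = [concept]
    while stack:
        for parent in IS_A.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def has_some(pt, relation, target):
    """True if the PT has a filler for `relation` subsumed by `target`."""
    fillers = DEFINITIONS[pt].get(relation, set())
    return any(target in ancestors_or_self(f) for f in fillers)


def select_pts(criteria):
    """Select PTs satisfying every (relation, target) pair (logical AND)."""
    return sorted(
        pt for pt in DEFINITIONS
        if all(has_some(pt, rel, tgt) for rel, tgt in criteria)
    )


grouping = select_pts([
    ("hasAssociatedMorphology", "Hemorrhage"),
    ("hasFindingSite", "Upper gastrointestinal tract structure"),
])
print(grouping)
```

In this toy data, "Gastric haemorrhage" and "Oesophageal ulcer haemorrhage" are selected because their finding sites are subsumed by the upper gastrointestinal tract, while "Rectal haemorrhage" (wrong site) and "Gastric ulcer" (no hemorrhage morphology) are not.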
Evaluation of the OntoADR-based grouping method
Focus on 13 ADR safety topics identified by Trifirò et al (2009) as first importance pharmacovigilance targets (EU-ADR project).
ADR topics identified by Trifirò et al (2009):
ST 1    Bullous eruptions
ST 2    Acute renal failure
ST 3    Anaphylactic shock
ST 4    Rhabdomyolysis
ST 5    Aplastic anaemia/pancytopenia
ST 6    Neutropenia
ST 7    Cardiac valve fibrosis
ST 8    Extrapyramidal disorders
ST 9    Confusional state
ST 10   Thrombocytopenia
ST 11   Upper gastrointestinal bleeding
ST 12   Peripheral neuropathy
ST 13   Maculo-papular erythematous eruptions
For each safety topic:
i. Groupings of MedDRA PTs have been built with OntoADR queries.
ii. The content of those groupings has been evaluated by comparison with existing handmade MedDRA groupings of PTs targeting the same or close conditions:
   A. Original MedDRA hierarchy groupings (HLTs or HLGTs)
   B. SMQs (Standardised MedDRA Queries): collections of MedDRA PTs developed manually by the MSSO (Maintenance and Support Services Organization) to describe a common clinical condition
Trifirò G, Pariente A, Coloma PM, Kors JA, Polimeni G, Miremont-Salamé G, Catania MA, Salvo F, David A, Moore N, Caputi AP, Sturkenboom M, Molokhia M, Hippisley-Cox J, Acedo CD, van der Lei J, Fourrier-Reglat A. Data mining on electronic health record databases for signal detection in pharmacovigilance: which events to monitor? Pharmacoepidemiol Drug Saf. 2009; 18(12):1176-84.
Neutropenia Safety Topic
Neutropenia: generally defined as "an abnormally low number of neutrophils" (*), which are the most abundant type of white blood cells (leukocytes) in mammals.
Leukopenia: sometimes used as a synonym of Neutropenia, but strictly speaking Leukopenia is semantically broader (it refers to a deficit in the number of all types of white blood cells).
To enable comparison with the query-based neutropenia grouping, only neutropenia-relevant PTs were selected in the SMQ.

HLT Neutropenias
Label                                      Id MedDRA
Agranulocytosis                            10001507
Autoimmune neutropenia                     10055128
Cyclic neutropenia                         10053176
Febrile neutropenia                        10016288
Felty's syndrome                           10016386
Granulocytopenia                           10018687
Granulocytopenia neonatal                  10018688
Idiopathic neutropenia                     10051645
Infantile genetic agranulocytosis          10052210
Neutropenia                                10029354
Neutropenia neonatal                       10029358
Neutropenic colitis                        10062959
Neutropenic infection                      10059482
Neutropenic sepsis                         10049151

SMQ Leukopenia (neutropenia-relevant PTs only)
Type (for SMQ)   Label                                      Id MedDRA
Narrow           Agranulocytosis                            10001507
Narrow           Band neutrophil count decreased            10057950
Narrow           Band neutrophil percentage decreased       10059130
Narrow           Cyclic neutropenia                         10053176
Narrow           Febrile neutropenia                        10016288
Narrow           Idiopathic neutropenia                     10051645
Narrow           Neutropenia                                10029354
Narrow           Neutropenic infection                      10059482
Narrow           Neutropenic sepsis                         10049151
Narrow           Neutrophil count decreased                 10029366
Broad            Myelocyte percentage decreased             10052227
Broad            Neutropenia neonatal                       10029358
Broad            Neutrophil count abnormal                  10061313
Broad            Neutrophil percentage decreased            10052223

(*) http://www.nlm.nih.gov/medlineplus/ency/article/007230.htm
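Comparing two groupings comes down to set operations on PT codes. The slides do not state which comparison measure was used, so the sketch below simply assumes a split into shared and exclusive terms; the `compare` helper and the variable names are hypothetical. Since the output of the query-based grouping is not listed on this slide, the example compares the two handmade reference groupings shown above (HLT Neutropenias and the selected SMQ Leukopenia PTs); a query-based grouping would be compared against either of them in the same way.

```python
# PT codes and labels copied from the two groupings on the slide.
HLT_NEUTROPENIAS = {
    10001507: "Agranulocytosis",
    10055128: "Autoimmune neutropenia",
    10053176: "Cyclic neutropenia",
    10016288: "Febrile neutropenia",
    10016386: "Felty's syndrome",
    10018687: "Granulocytopenia",
    10018688: "Granulocytopenia neonatal",
    10051645: "Idiopathic neutropenia",
    10052210: "Infantile genetic agranulocytosis",
    10029354: "Neutropenia",
    10029358: "Neutropenia neonatal",
    10062959: "Neutropenic colitis",
    10059482: "Neutropenic infection",
    10049151: "Neutropenic sepsis",
}

SMQ_LEUKOPENIA_SELECTED = {
    10001507: "Agranulocytosis",
    10057950: "Band neutrophil count decreased",
    10059130: "Band neutrophil percentage decreased",
    10053176: "Cyclic neutropenia",
    10016288: "Febrile neutropenia",
    10051645: "Idiopathic neutropenia",
    10029354: "Neutropenia",
    10059482: "Neutropenic infection",
    10049151: "Neutropenic sepsis",
    10029366: "Neutrophil count decreased",
    10052227: "Myelocyte percentage decreased",
    10029358: "Neutropenia neonatal",
    10061313: "Neutrophil count abnormal",
    10052223: "Neutrophil percentage decreased",
}


def compare(candidate, reference):
    """Split PT codes into shared, candidate-only and reference-only sets."""
    cand, ref = set(candidate), set(reference)
    return {
        "shared": cand & ref,
        "candidate_only": cand - ref,
        "reference_only": ref - cand,
    }


result = compare(HLT_NEUTROPENIAS, SMQ_LEUKOPENIA_SELECTED)
labels = {**HLT_NEUTROPENIAS, **SMQ_LEUKOPENIA_SELECTED}
for name, codes in result.items():
    print(name, len(codes), sorted(labels[c] for c in codes))
```

On these two lists, 8 PTs are shared, 6 appear only in the HLT (for instance Felty's syndrome and Neutropenic colitis) and 6 appear only in the SMQ selection (the laboratory-result terms such as Neutrophil count decreased).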
Neutropenia: Building the OWL query
hasDefinitionalManifestation some Neutropenia
OR
interprets some (hasComponent some 'Segmented neutrophil' OR hasComponent some 'Myelocyte' OR hasComponent some 'Stab form')
AND (hasInterpretation some 'Below reference range' OR hasInterpretation some 'Decreased' OR hasInterpretation some 'Abnormal')

Semantic relations used by the query (relation: description)
interprets: Refers to the entity being evaluated or interpreted, when an evaluation, interpretation or "judgment" is intrinsic to the meaning of a concept.
hasInterpretation: This attribute is grouped with the attribute "Interprets", and designates the judgment aspect being evaluated or interpreted for a concept (e.g., presence, absence, degree, normality, abnormality, etc.).
hasComponent: Refers to what is being observed or measured by a procedure.
hasDefinitionalManifestation: Links disorders to the manifestations (observations) that define them.
(*) http://en.wikipedia.org/wiki/Neutrophil_granulocyte
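The structure of the neutropenia query can be made explicit as a boolean predicate. The sketch below is only an illustration under two assumptions: the query is read as "definitional manifestation, OR (interpreted component AND interpretation)", which is one plausible grouping of the expression above, and subsumption between concepts is ignored to keep the example short. The dictionary layout, the function `matches_neutropenia_query` and the sample definitions are hypothetical, not taken from OntoADR.

```python
# Fillers named in the query on the slide.
COMPONENTS = {"Segmented neutrophil", "Myelocyte", "Stab form"}
INTERPRETATIONS = {"Below reference range", "Decreased", "Abnormal"}


def matches_neutropenia_query(definition):
    """Evaluate the query on one formal definition (no subsumption)."""
    # Branch 1: hasDefinitionalManifestation some Neutropenia
    by_manifestation = "Neutropenia" in definition.get(
        "hasDefinitionalManifestation", ()
    )
    # Branch 2: interprets some (hasComponent some ...) AND hasInterpretation some ...
    interpreted_components = {
        component
        for entity in definition.get("interprets", ())
        for component in entity.get("hasComponent", ())
    }
    by_measurement = bool(interpreted_components & COMPONENTS) and bool(
        set(definition.get("hasInterpretation", ())) & INTERPRETATIONS
    )
    return by_manifestation or by_measurement


# Hypothetical formal definitions, for illustration only.
SAMPLE_DEFINITIONS = {
    "Febrile neutropenia": {
        "hasDefinitionalManifestation": ["Neutropenia"],
    },
    "Neutrophil count decreased": {
        "interprets": [{"hasComponent": ["Segmented neutrophil"]}],
        "hasInterpretation": ["Below reference range"],
    },
    "Neutrophil count increased": {
        "interprets": [{"hasComponent": ["Segmented neutrophil"]}],
        "hasInterpretation": ["Above reference range"],
    },
    "Lymphocyte count decreased": {
        "interprets": [{"hasComponent": ["Lymphocyte"]}],
        "hasInterpretation": ["Below reference range"],
    },
}

selected = sorted(
    pt for pt, definition in SAMPLE_DEFINITIONS.items()
    if matches_neutropenia_query(definition)
)
print(selected)
```

The first branch catches disorder terms defined by neutropenia, while the second catches laboratory-result terms; a term must satisfy both the component and the interpretation condition to pass through the second branch, which is why the two counter-examples are rejected.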