Semantic Empowerment of Health Care and Life Science Applications
WWW 2006 W3C Track, May 26, 2006
Amit Sheth, LSDIS Lab & Semagix, University of Georgia
Joint work with Athens Heart Center and CCRC, UGA
Part I: A Healthcare Application
Active Semantic Electronic Medical Record @ Athens Heart Center (use Firefox) (deployed since Dec 2005)
Collaboration between LSDIS & Athens Heart Center (Dr. Agrawal, Dr. Wingate)
For online demo: Google: Active Semantic Documents
Active Semantic Document
A document (typically in XML) with
• Lexical and semantic annotations (tied to ontologies)
• Active/actionable information (rules over semantic annotations)
Application: Active Semantic EMR for Cardiology Practice
• EMRs in XML
• 3 ontologies (OWL), populated
• RDQL -> SPARQL, Rules
• Services, Web 2.0
Active Semantic Electronic Medical Record
Demonstrates use of Semantic Web technologies to
• reduce medical errors and improve patient safety
  – accurate completion of patient charts (by checking drug interactions and allergies, coding of impression, …)
• improve physician efficiency, extreme user friendliness, decision support
  – single window for all work; template-driven sentences, auto-complete, contextual information, exploration
• improve patient satisfaction in medical practice
  – formulary check
• improve billing through more accurate coding and adherence to medical guidelines
  – prevent errors and incomplete information that insurers can use to withhold payment
One of 3 ontologies used (part of drug ontology)
[Diagram: part of the drug ontology. Classes shown: Drug, Dosage Form, Intake Route, Formulary, Indication, Interaction, Type, MonographClass, Non-Drug Reactant, Physical Condition, CPNUMGrp, Allergy, Pregnancy, Generic, BrandName. Relationships shown: has_indication, has_formulary, has_interaction, has_type, has_class, reacts_with.]
Local, licensed and public (SNOMED) sources used to populate the ontologies
Example Rules
• drug-drug interaction check
• drug formulary check (e.g., whether the drug is covered by the patient's insurance company and, if not, which alternative drugs in the same drug class are)
• drug dosage range check
• drug-allergy interaction check
• ICD-9 annotation choices for the physician to validate, so that the best possible code is chosen for the treatment type
• preferred drug recommendation based on drug and patient insurance information
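A minimal sketch of how three of the rules above (drug-drug interaction, drug-allergy interaction, formulary check with same-class alternatives) could be evaluated. Everything here is an assumption made for illustration: the drug names, the insurance plan, the lookup tables and the `check_prescription` function are hypothetical, and in the deployed system this knowledge would come from the populated ontologies rather than from Python dictionaries.

```python
from dataclasses import dataclass, field

# Hypothetical toy knowledge base (placeholder names, not real drug data).
INTERACTIONS = {frozenset({"drug_a", "drug_b"})}      # unordered interacting pairs
ALLERGY_REACTANTS = {"drug_c": {"sulfa"}}             # drug -> allergens it conflicts with
FORMULARY = {"plan_x": {"drug_a", "drug_c"}}          # insurance plan -> covered drugs
DRUG_CLASS = {"drug_a": "class_1", "drug_b": "class_1", "drug_c": "class_1"}


@dataclass
class Patient:
    insurance: str
    allergies: set = field(default_factory=set)
    current_drugs: set = field(default_factory=set)


def check_prescription(patient, drug):
    """Return a list of alert strings for prescribing `drug` to `patient`."""
    alerts = []
    # Rule: drug-drug interaction check against current medication.
    for other in sorted(patient.current_drugs):
        if frozenset({drug, other}) in INTERACTIONS:
            alerts.append(f"interaction: {drug} reacts with {other}")
    # Rule: drug-allergy interaction check.
    for allergen in sorted(ALLERGY_REACTANTS.get(drug, set()) & patient.allergies):
        alerts.append(f"allergy: {drug} conflicts with {allergen} allergy")
    # Rule: formulary check, suggesting covered drugs of the same class.
    covered = FORMULARY.get(patient.insurance, set())
    if drug not in covered:
        alternatives = sorted(
            d for d in covered
            if d != drug and DRUG_CLASS.get(d) == DRUG_CLASS.get(drug)
        )
        alerts.append(f"formulary: {drug} not covered; alternatives: {alternatives}")
    return alerts


patient = Patient(insurance="plan_x", allergies={"sulfa"}, current_drugs={"drug_a"})
alerts = check_prescription(patient, "drug_b")
```

Keeping each rule as an independent check that appends to a shared alert list mirrors the slide's framing of rules as separate, actionable items over annotated data.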
Exploration of the neighborhood of the drug Tasmar
[Graph: nodes Tasmar, Tolcapone, Formulary_1498, CPNUMGroup_2119, CPNUMGroup_2118, CPNUMGroup_206, Neurological Agents, COMT Inhibitors. Edge labels: generic/brandname, belongsTo (twice), interacts_with (twice), classification.]
Active Semantic Doc with 3 Ontologies
Explore neighborhood for drug Tasmar
Explore: Drug Tasmar
Explore neighborhood for drug Tasmar
[Graph edge labels: classification (three times), belongs to group (twice), brand / generic, interaction]
Semantic browsing and querying: perform decision support (how many patients are using this class of drug, …)
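As a rough illustration of semantic browsing and the decision-support query mentioned above, the following sketch stores a few relationships as subject-predicate-object triples and walks them. The node names are taken from the Tasmar slides, but which node is connected to which is an assumption (the figure itself is not recoverable here), and the patient prescription table and both function names are hypothetical.

```python
# Assumed edges; node and predicate names come from the slides, the pairing does not.
TRIPLES = [
    ("Tasmar", "has_formulary", "Formulary_1498"),
    ("Tasmar", "belongsTo", "CPNUMGroup_2119"),
    ("CPNUMGroup_2119", "interacts_with", "CPNUMGroup_2118"),
    ("Tasmar", "classification", "COMT Inhibitors"),
]

# Hypothetical prescription data: patient id -> drugs currently taken.
PRESCRIPTIONS = {
    "patient_1": {"Tasmar"},
    "patient_2": {"other_drug"},
    "patient_3": {"Tasmar"},
}


def neighborhood(node, triples=TRIPLES):
    """List every edge touching `node` as (predicate, direction, neighbor)."""
    edges = []
    for subject, predicate, obj in triples:
        if subject == node:
            edges.append((predicate, "->", obj))
        elif obj == node:
            edges.append((predicate, "<-", subject))
    return edges


def patients_on_class(drug_class, triples=TRIPLES, prescriptions=PRESCRIPTIONS):
    """Count patients taking at least one drug classified under `drug_class`."""
    drugs = {s for s, p, o in triples if p == "classification" and o == drug_class}
    return sum(1 for taken in prescriptions.values() if taken & drugs)


tasmar_edges = neighborhood("Tasmar")
count = patients_on_class("COMT Inhibitors")
```

In the deployed application such queries would presumably be expressed in RDQL or SPARQL against the ontologies; the plain-Python version only shows the shape of the traversal.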
Software Architecture
ROI
Athens Heart Center Practice Growth
[Chart: appointments per month (excluding cancelled/rescheduled but including no-show cases), one series per year for 2003, 2004, 2005 and 2006; x-axis: month, y-axis: appts (0 to 1600).]
Increased efficiency demonstrated as more encounters supported without increasing clinical staff
Chart Completion before the preliminary deployment of the ASEMR
[Chart: number of charts (0 to 600) by Month/Year, 2004 to 2005; series: Same Day, Back Log.]
Chart Completion after the preliminary deployment of the ASEMR
[Chart: number of charts (0 to 700) by Month/Year, Sept 05 to Mar 06; series: Same Day, Back Log.]
Applying Semantic Technologies to the Glycoproteomics Domain
Quick take on bioinformatics ontologies and their use
• GlycO and ProPreO: among the largest populated ontologies in life sciences
• Interesting aspects of structuring and populating these ontologies, and their use
• GlycO: a comprehensive domain ontology; it uses simple 'canonical' entities to build complex structures, which avoids redundancy and ensures maintainability and scalability
  – Web process for entity disambiguation and common representational format → populated ontology from disparate data sources
  – Ability to display biological pathways
• ProPreO is a comprehensive ontology for data and process provenance in glycoproteomics
• Use in annotating experimental data, high-throughput workflow
GlycO
GlycO ontology
• Challenge: model hundreds of thousands of complex carbohydrate entities
• But the differences between the entities are small (e.g., just one component)
• How to model all the concepts but preclude redundancy → ensure maintainability, scalability
GlycoTree
β-D-GlcpNAc-(1-2)-α-D-Manp-(1-6)+
                                 β-D-Manp-(1-4)-β-D-GlcpNAc-(1-4)-β-D-GlcpNAc
β-D-GlcpNAc-(1-4)-α-D-Manp-(1-3)+
β-D-GlcpNAc-(1-2)+
N. Takahashi and K. Kato, Trends in Glycosciences and Glycotechnology, 15: 235-251
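Ontology population workflow
[Flowchart: the Semagix Freedom knowledge extractor produces instance data. Decision nodes: "Has CarbBank ID?" and "Already in KB?" (each with YES/NO branches; one branch is labeled "YES: next Instance"). Processing steps: IUPAC to LINUCS, LINUCS to GLYDE, Compare to Knowledge Base, Insert into KB.]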
GlycO population
[Same flowchart as the ontology population workflow, shown with an example instance in LINUCS notation:]
[][Asn]{[(4+1)][b-D-GlcpNAc]{[(4+1)][b-D-GlcpNAc]{[(4+1)][b-D-Manp]{[(3+1)][a-D-Manp]{[(2+1)][b-D-GlcpNAc]{}[(4+1)][b-D-GlcpNAc]{}}[(6+1)][a-D-Manp]{[(2+1)][b-D-GlcpNAc]{}}}}}}
Ontology Population Workflow
[Same flowchart as the ontology population workflow, shown with the example instance in GLYDE XML:]
<Glycan>
  <aglycon name="Asn"/>
  <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
    <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
      <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="Man">
        <residue link="3" anomeric_carbon="1" anomer="a" chirality="D" monosaccharide="Man">
          <residue link="2" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
          </residue>
          <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
          </residue>
        </residue>
        <residue link="6" anomeric_carbon="1" anomer="a" chirality="D" monosaccharide="Man">
          <residue link="2" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
          </residue>
        </residue>
      </residue>
    </residue>
  </residue>
</Glycan>
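The "LINUCS to GLYDE" step can be sketched as a small recursive parser. This is an illustrative reconstruction, not the converter used in the project: it assumes the LINUCS grammar is `[linkage][name]{children}` as in the example instance, derives the attribute names from the XML shown on the slide, and makes no claim to cover the full LINUCS or GLYDE specifications. The function names and regular expressions are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

NODE = re.compile(r"\[([^\]]*)\]\[([^\]]*)\]\{")       # [linkage][name]{
LINK = re.compile(r"\((\d+)\+(\d+)\)")                 # e.g. (4+1)
NAME = re.compile(r"([ab])-([DL])-([A-Za-z]+?)([pf])(NAc)?")  # e.g. b-D-GlcpNAc


def parse_nodes(text, pos):
    """Parse sibling nodes from `pos`; return (nodes, position after them)."""
    nodes = []
    while True:
        match = NODE.match(text, pos)
        if match is None:
            return nodes, pos
        children, pos = parse_nodes(text, match.end())
        if pos >= len(text) or text[pos] != "}":
            raise ValueError(f"expected closing brace at position {pos}")
        nodes.append((match.group(1), match.group(2), children))
        pos += 1  # consume the closing brace


def add_residue(parent, node):
    """Append one <residue> element (and its subtree) under `parent`."""
    linkage, name, children = node
    link, parts = LINK.fullmatch(linkage), NAME.fullmatch(name)
    if link is None or parts is None:
        raise ValueError(f"unsupported residue: [{linkage}][{name}]")
    element = ET.SubElement(
        parent, "residue",
        link=link.group(1), anomeric_carbon=link.group(2),
        anomer=parts.group(1), chirality=parts.group(2),
        # Drop the ring-form letter (p/f): GlcpNAc -> GlcNAc, Manp -> Man.
        monosaccharide=parts.group(3) + (parts.group(5) or ""),
    )
    for child in children:
        add_residue(element, child)


def linucs_to_xml(linucs):
    """Convert a LINUCS string into a <Glycan> element tree."""
    text = re.sub(r"\s+", "", linucs)
    roots, pos = parse_nodes(text, 0)
    if pos != len(text) or len(roots) != 1:
        raise ValueError("malformed LINUCS string")
    _, aglycon, children = roots[0]
    glycan = ET.Element("Glycan")
    ET.SubElement(glycan, "aglycon", name=aglycon)
    for child in children:
        add_residue(glycan, child)
    return glycan


LINUCS = ("[][Asn]{[(4+1)][b-D-GlcpNAc]{[(4+1)][b-D-GlcpNAc]{[(4+1)][b-D-Manp]"
          "{[(3+1)][a-D-Manp]{[(2+1)][b-D-GlcpNAc]{}[(4+1)][b-D-GlcpNAc]{}}"
          "[(6+1)][a-D-Manp]{[(2+1)][b-D-GlcpNAc]{}}}}}}")
glycan = linucs_to_xml(LINUCS)
xml_text = ET.tostring(glycan, encoding="unicode")
```

Because every residue becomes one nested element, structurally identical glycans from different sources end up with identical trees, which is what makes the subsequent comparison against the knowledge base straightforward.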
Pathway representation in GlycO
Pathways do not need to be explicitly defined in GlycO. The residue, glycan, enzyme and reaction descriptions contain the knowledge necessary to infer pathways.
Zooming in a little …
[Diagram labels: Reaction R05987; catalyzed by enzyme 2.4.1.145; adds_glycosyl_residue N-glycan_b-D-GlcpNAc_13]
The N-Glycan with KEGG ID 00015 is the substrate to the reaction R05987, which is catalyzed by an enzyme of the class EC 2.4.1.145. The product of this reaction is the Glycan with KEGG ID 00020.
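To illustrate how a pathway might be inferred from reaction descriptions alone, the sketch below chains reactions whose product is the substrate of the next. Only the first reaction record (R05987, EC 2.4.1.145, glycan 00015 to 00020) comes from the slide; the second record, the dictionary layout and the `infer_pathways` function are hypothetical and stand in for what an ontology query would return.

```python
REACTIONS = [
    # From the slide: glycan 00015 --R05987 (EC 2.4.1.145)--> glycan 00020.
    {"id": "R05987", "enzyme": "2.4.1.145", "substrate": "00015", "product": "00020"},
    # Hypothetical follow-up reaction, added only so there is something to chain.
    {"id": "R_hypothetical", "enzyme": "EC_hypothetical",
     "substrate": "00020", "product": "glycan_hypothetical"},
]


def infer_pathways(start, reactions):
    """Return every maximal chain of reaction ids reachable from `start`."""
    by_substrate = {}
    for reaction in reactions:
        by_substrate.setdefault(reaction["substrate"], []).append(reaction)

    pathways = []

    def walk(glycan, path, seen):
        # Skip reactions that would revisit a glycan, so cycles terminate.
        steps = [r for r in by_substrate.get(glycan, []) if r["product"] not in seen]
        if not steps:
            if path:
                pathways.append(path)
            return
        for reaction in steps:
            walk(reaction["product"], path + [reaction["id"]], seen | {reaction["product"]})

    walk(start, [], {start})
    return pathways


pathways = infer_pathways("00015", REACTIONS)
```

No pathway object is stored anywhere; the chain emerges from the substrate and product links, which is the point the slides make about GlycO.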
N-Glycosylation Process (NGP)
[Workflow diagram] Cell Culture → (extract) → Glycoprotein Fraction → (proteolysis) → 1 Glycopeptides Fraction → (Separation technique I) → n Glycopeptides Fractions → (PNGase) → n Peptide Fractions → (Separation technique II) → n*m Peptide Fractions → (Mass spectrometry) → ms data and ms/ms data → (Data reduction) → ms peaklist and ms/ms peaklist
Downstream steps and outputs shown: binning, N-dimensional array, Signal integration, Peptide identification, Peptide list, Data correlation, Glycopeptide identification and quantification
Semantic Annotation of MS Data
ms/ms peaklist data:
parent ion m/z    parent ion abundance    parent ion charge
830.9570          194.9604                2

fragment ion m/z    fragment ion abundance
580.2985            0.3592
688.3214            0.2526
779.4759            38.4939
784.3607            21.7736
1543.7476           1.3822
1544.7595           2.9977
1562.8113           37.4790
1660.7776           476.5043
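A small sketch of what attaching semantic labels to the raw peaklist could look like. It assumes the peaklist is plain whitespace-separated text whose first line holds the parent ion's m/z, abundance and charge and whose remaining lines hold fragment ion m/z and abundance pairs. The dictionary keys mirror the labels on the slide; they are not claimed to be actual ProPreO term identifiers, and `annotate_peaklist` is a hypothetical helper.

```python
PEAKLIST = """\
830.9570 194.9604 2
580.2985 0.3592
688.3214 0.2526
779.4759 38.4939
784.3607 21.7736
1543.7476 1.3822
1544.7595 2.9977
1562.8113 37.4790
1660.7776 476.5043
"""


def annotate_peaklist(text):
    """Turn a raw ms/ms peaklist into labeled parent and fragment ion records."""
    rows = [line.split() for line in text.strip().splitlines()]
    mz, abundance, charge = rows[0]  # first row describes the parent ion
    parent = {
        "parent_ion_mz": float(mz),
        "parent_ion_abundance": float(abundance),
        "parent_ion_charge": int(charge),
    }
    fragments = [
        {"fragment_ion_mz": float(f_mz), "fragment_ion_abundance": float(f_ab)}
        for f_mz, f_ab in rows[1:]
    ]
    return {"parent_ion": parent, "fragment_ions": fragments}


annotated = annotate_peaklist(PEAKLIST)
# With labels attached, questions can be asked by meaning rather than by column position.
strongest = max(annotated["fragment_ions"], key=lambda f: f["fragment_ion_abundance"])
```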