An introduction to analysing a SNOMED CT coded dataset using a FHIR terminology server
Matt Cordell, Terminology Specialist
A quick introduction to SNOMED CT, FHIR & Ontoserver

SNOMED CT
• Much larger than most other code systems traditionally used in healthcare (ICD, ICPC etc.)
• Primary purpose is recording clinical notes, with the specificity required by clinicians, and interoperability
  – Structure* supports secondary uses (analytics).
• Codes have no intrinsic meaning; they are simply identifiers, e.g. 278285008|Left hemiplegia| and 278284007|Right hemiplegia|
• Concepts in the terminology are associated by a range of relationships, forming an ontology.
• Expression Constraint Language (ECL): a language that supports sophisticated queries against the terminology.

FHIR
• Latest interoperability standard from HL7, supporting modern RESTful practices. (ValueSets)

Ontoserver
• Provides FHIR-based access to terminology, including ECL support
• Made available for use throughout Australia via the National Clinical Terminology Service (NCTS)
ECL in 90 seconds

<396234004|Infective arthritis|
    All (subtypes) of Infective arthritis

<64572001|Disease|:116676008|Associated morphology|=23583003|Inflammation|
    All Diseases associated with inflammation

<928000|Musculoskeletal disorder|:246075003|Causative agent|=<<49872002|Virus|
    Musculoskeletal disorders with some Viral involvement
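An ECL expression contains characters such as `<`, `|` and spaces that must be percent-encoded before it can travel in a URL. The following is a minimal sketch of that step, assuming the implicit ValueSet prefix `http://snomed.info/sct?fhir_vs=ecl/` and the Ontoserver endpoint that appear in this deck's own code; the function name `ecl_expand_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Prefix of the implicit SNOMED CT ValueSet URL; the ECL expression is appended to it.
SNOMED_ECL_VALUESET_PREFIX = "http://snomed.info/sct?fhir_vs=ecl/"


def ecl_expand_url(endpoint, ecl):
    """Build a ValueSet/$expand request URL for an ECL expression (hypothetical helper)."""
    implicit_value_set = SNOMED_ECL_VALUESET_PREFIX + ecl
    # urlencode percent-encodes '<', '|', '?', '=' and encodes spaces as '+'
    query = urlencode({"url": implicit_value_set})
    return f"{endpoint.rstrip('/')}/ValueSet/$expand?{query}"


ecl = "<396234004|Infective arthritis|"
url = ecl_expand_url("https://ontoserver.csiro.au/stu3-latest", ecl)

# Decoding the query string recovers the original implicit ValueSet URL.
decoded = parse_qs(urlsplit(url).query)["url"][0]
```

Passing the same dictionary to `requests.get(..., params=...)` performs this encoding automatically, so the helper is mainly useful for logging or for pasting a query into a browser.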
What might a SNOMED CT dataset look like?

Rows: 500,000
Unique Conditions: 24647
Unique Medications: 10128
* Randomly generated synthetic dataset

Index   Sex  DoB         PostCode  Condition  Medication
0       F    26/04/1998  B03       102930000  7086011000036102
1       F    24/01/1953  E00       49512000   1112071000168105
2       M    7/09/1943   E00       277627005  5604011000036100
3       M    1/01/1966   E00       3109008    3231000036108
4       F    14/02/1957  E00       723409007  6286011000036105
5       M    14/08/1961  E00       3272007    761951000168100
6       F    28/01/1986  C04       86225009   921045011000036104
7       F    15/06/1983  C04       163577001  NaN
8       F    23/05/1967  C04       191737008  927853011000036101
…       …    …           …         …          …
499998  M    16/01/1984  B09       443919007  36227011000036103
499999  M    28/03/1995  B09       723913009  5081011000036108
Basic outline of approach to SNOMED CT analytics
o Define aggregation categories using SNOMED CT Expression Constraint Language (ECL).
o Identify all the codes that match each category, using Ontoserver to perform ValueSet expansions.
o Store the results of each expansion in a Hash Set for fast lookup.
o Use the Sets to filter our dataset, and optionally create human-readable labels.
o Use standard analytic approaches to report and visualise the data.
Populate Set with ECL
• Create a GET request with the ECL parameter
• Parse the JSON response to a FHIR Value Set
• Iterate through the Value Set and populate the Hash with just the codes.
• Return the Hash.

    import requests  # for REST calls
    from fhir.resources.valueset import ValueSet

    def PopulateSetWithECL(ecl):
        endpoint = "https://ontoserver.csiro.au/stu3-latest"
        expandAPI = "/ValueSet/$expand"
        # Implicit SNOMED CT value set URL; the ECL expression is appended to it
        sctValueSetUrl = "http://snomed.info/sct?fhir_vs=ecl/"
        urlParam = {"url": sctValueSetUrl + ecl}
        response = requests.get(endpoint + expandAPI, params=urlParam)
        response.raise_for_status()  # fail early on an HTTP error
        j = response.json()
        # Constructor form as on the slide; newer fhir.resources releases may need a different call
        vs = ValueSet(j)
        _set = set()
        # contains may be absent when nothing matches the ECL
        for e in vs.expansion.contains or []:
            _set.add(e.code)
        return _set
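The "iterate and populate" step can also be done directly on the parsed JSON, without a FHIR model library. The sketch below assumes only the standard shape of a FHIR ValueSet expansion (`expansion.contains[]`, where each entry may carry a `code` and nested `contains`); the function name `codes_from_expansion` is hypothetical and the sample dictionary is hand-written for illustration, not a captured server response.

```python
def codes_from_expansion(value_set):
    """Collect every code in a ValueSet expansion given as a plain dict (hypothetical helper)."""
    codes = set()
    # Start from the top-level entries; tolerate a missing or empty expansion.
    stack = list(value_set.get("expansion", {}).get("contains", []))
    while stack:
        entry = stack.pop()
        if "code" in entry:
            codes.add(entry["code"])
        # Expansion entries may themselves nest further entries.
        stack.extend(entry.get("contains", []))
    return codes


# Hand-written sample in the shape of a $expand response (illustrative only).
sample = {
    "resourceType": "ValueSet",
    "expansion": {
        "contains": [
            {"system": "http://snomed.info/sct", "code": "278285008", "display": "Left hemiplegia"},
            {"system": "http://snomed.info/sct", "code": "278284007", "display": "Right hemiplegia"},
        ]
    },
}

hemiplegia_codes = codes_from_expansion(sample)
empty_codes = codes_from_expansion({"resourceType": "ValueSet"})
```

Keeping the parsing separate from the HTTP call also makes it easy to test against saved responses.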
Creating Health Condition Labels
o A list of tuples, each tuple consisting of an ECL definition and label
o Iterate through this list
o Create the Hash Set based on the ECL
o Create Boolean filter for concepts that match the Set
o Label accordingly in a new "Category" column.

    healthCategories = [
        ('<<106028002', 'Musculoskeletal problems'),
        ('<<106048009', 'Respiratory problems'),
        ('<<195967001', 'Asthma'),
        ('<<363346000', 'Cancer'),
        ('<<13645005', 'COPD'),
        ('<<73211009', 'Diabetes mellitus'),
        ('<<106063007', 'Cardiovascular problems'),
        ('<<249578005', 'Kidney problems'),
        ('<<74732009', 'Mental illness'),
        ('<<40733004', 'Infectious disease'),
        ('<<414022008', 'Blood disease')]

    # Default label (assumed; the output below shows 'Other Condition' for unmatched rows)
    codeSet["Category"] = "Other Condition"

    for category in healthCategories:
        categorySet = PopulateSetWithECL(category[0])
        # Boolean mask of rows whose Condition code is in the expansion
        mask = codeSet["Condition"].isin(categorySet)
        codeSet.loc[mask, "Category"] = category[1]

Index   Sex  Condition  Medication         Category
0       F    102930000  7086011000036102   Other Condition
1       F    49512000   1112071000168105   Mental illness
2       M    277627005  5604011000036100   Cancer
…       …    …          …                  …
499998  M    443919007  36227011000036103  Mental illness
499999  M    723913009  5081011000036108   Mental illness
codeSet.groupby(['Category','Sex']).size()

Category                  Sex  Count
Blood disease             F    7741
Blood disease             M    3295
Cancer                    F    1909
Cancer                    M    3298
Cardiovascular problems   F    13716
Cardiovascular problems   M    10481
Diabetes mellitus         F    18463
Diabetes mellitus         M    10362
Infectious disease        F    1435
Infectious disease        M    368
Kidney problems           F    531
Kidney problems           M    356
Mental illness            F    106980
Mental illness            M    104910
Musculoskeletal problems  F    1817
Musculoskeletal problems  M    1400
Other Condition           F    107163
Other Condition           M    105340
Respiratory problems      F    230
Respiratory problems      M    205
Category Overlap
Overlap managed by:
• Categories ordered by priority
  – Later categories overwrite; or
  – Only label unlabelled
• Build disjointness into ECL
  – <<106048009|Respiratory| MINUS ( <<363346000|Cancer| OR <<106028002|Musculoskeletal| OR <<40733004|Infectious| )
The right choice is use-case dependent, especially where double counting matters.
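The overlap strategies can be sketched in plain Python, independent of any dataframe library. Everything here is illustrative: `label_codes` and `exclusive_ecl` are hypothetical helper names, and the codes "A", "B" and "C" are placeholders, not SNOMED CT identifiers. The ECL built at the end uses the concept identifiers from the slide.

```python
def label_codes(codes, categories, overwrite=True):
    """Assign one label per code from (label, member_set) pairs given in priority order.

    overwrite=True  -> later categories overwrite earlier ones
    overwrite=False -> only codes that are still unlabelled get a label
    """
    labels = {}
    for label, members in categories:
        for code in codes:
            if code not in members:
                continue
            if overwrite or code not in labels:
                labels[code] = label
    return labels


def exclusive_ecl(category, others):
    """Build an ECL expression that excludes the other categories using MINUS."""
    return f"{category} MINUS ({' OR '.join(others)})"


# Placeholder codes; "B" belongs to both categories, "C" to neither.
categories = [("Respiratory problems", {"A", "B"}), ("Cancer", {"B"})]
later_wins = label_codes(["A", "B", "C"], categories, overwrite=True)
first_wins = label_codes(["A", "B", "C"], categories, overwrite=False)

respiratory_only = exclusive_ecl(
    "<<106048009", ["<<363346000", "<<106028002", "<<40733004"]
)
```

Building disjointness into the ECL moves the decision to the terminology server, so the resulting sets never overlap and the labelling order no longer matters.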
Counting Opioids
o Again, iterate through this list as before, adding an "Opioid" label

    opioids = [
        ('<34841011000036108', 'dihydrocodeine'),
        ('<21821011000036104', 'codeine'),
        ('<21705011000036108', 'pholcodine'),
        ('<21232011000036101', 'buprenorphine'),
        ('<21357011000036109', 'methadone'),
        ('<135971000036102', 'tapentadol'),
        ('<21258011000036102', 'fentanyl'),
        ('<21259011000036105', 'oxycodone'),
        # … (entries elided on the slide)
        ('<21252011000036100', 'morphine'),
        ('<21486011000036105', 'tramadol'),
        ('<21901011000036101', 'dextropropoxyphene'),
        ('<34839011000036106', 'pethidine'),
        ('<1247191000168104', 'sufentanil')]

    for opioid in opioids:
        OpioidSet = PopulateSetWithECL(opioid[0])
        # Boolean mask of rows whose Medication code is in the expansion
        mask = codeSet["Medication"].isin(OpioidSet)
        codeSet.loc[mask, "Opioid"] = opioid[1]

Index   Sex  Medication         Opioid
65      M    7349011000036100   oxycodone
219     M    1070441000168107   codeine
648     F    1048081000168105   buprenorphine
...     ...  ...                ...
499738  F    34022011000036100  methadone
499802  M    785911000168101    fentanyl
499951  M    36062011000036104  dextropropoxyphene
Opioids
Using AMT's "Concrete domain" in ECL

53798011000036101|Ecotrin 650 mg enteric tablet|

/* High Dose, 200 mg or greater */
<30497011000036103|medicinal product|:
    { 30364011000036101|has Au BoSS| = 1817011000036100|aspirin|,
      700000111000036105|Strength| >= #200,
      177631000036102|has unit| = 700000801000036102|mg/each| },
    [1..1] 700000081000036101|has intended active ingredient| = ANY

/* Low Dose, less than 200 mg */
<30497011000036103|medicinal product|:
    { 30364011000036101|has Au BoSS| = 1817011000036100|aspirin|,
      700000111000036105|Strength| < #200,
      177631000036102|has unit| = 700000801000036102|mg/each| },
    [1..1] 700000081000036101|has intended active ingredient| = ANY

/* Combination Aspirin Products */
<21719011000036107|aspirin (MP)|:
    [2..*] 700000081000036101|has intended active ingredient| = ANY
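The high-dose and low-dose expressions differ only in the comparison applied to the strength value, so they can be generated from one template. This is a sketch only: the function name `aspirin_strength_ecl` is hypothetical, the concept identifiers and terms are copied from the slide, and the result is a string to pass to a terminology server rather than something evaluated locally.

```python
# Numeric comparison operators available for concrete values in ECL.
ALLOWED_OPERATORS = {"<", "<=", "=", "!=", ">=", ">"}


def aspirin_strength_ecl(operator, milligrams):
    """Build the single-ingredient aspirin ECL for a strength threshold (hypothetical helper)."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported comparison operator: {operator!r}")
    return (
        "<30497011000036103|medicinal product|: "
        "{ 30364011000036101|has Au BoSS| = 1817011000036100|aspirin|, "
        f"700000111000036105|Strength| {operator} #{milligrams}, "
        "177631000036102|has unit| = 700000801000036102|mg/each| }, "
        "[1..1] 700000081000036101|has intended active ingredient| = ANY"
    )


high_dose_ecl = aspirin_strength_ecl(">=", 200)
low_dose_ecl = aspirin_strength_ecl("<", 200)
```

Either string could then be handed to an expansion function such as `PopulateSetWithECL` to obtain the matching product codes.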
"Concrete Domain" expansions

High Dose: 28 concepts
o Solprin 300 mg dispersible tablet
o Disprin Direct 300 mg chewable tablet
o Alka-Seltzer Lemon-Lime 324 mg effervescent tablet

Low Dose: 27 concepts
o Spren 100 mg tablet
o Cardasa 100 mg enteric tablet
o Aspirin Low Dose (Nyal) 100 mg enteric tablet

Combination Products: 54 concepts
o Clopidogrel/Aspirin 75/100 (AN) tablet
o Duoprel 75/100 tablet
o Action Cold and Flu effervescent tablet
Additional Resources
• Supplementary Jupyter Notebook: bit.ly/SNOMED_HDA19
• Agency ECL examples: github.com/AuDigitalHealth/ecl-examples
• SNOMED CT ECL Specification: snomed.org/ecl
• Shrimp Browser: ontoserver.csiro.au/shrimp
Contact us
• Help Centre: 1300 901 001
• Email: help@digitalhealth.gov.au
• Website: healthterminologies.gov.au
• Twitter: twitter.com/AuDigitalHealth