CONCEPT DRIFT IN ONTOLOGY MAPPING AND SEMANTIC ANNOTATION

SLIDE 1

CONCEPT DRIFT IN ONTOLOGY MAPPING AND SEMANTIC ANNOTATION ADAPTATION

Cédric PRUSKI

Drift-a-LOD@EKAW 2016, November 20th, Bologna, Italy

SLIDE 2

MOTIVATION

  • Outdated mappings and annotations may trigger undesirable results in biomedical systems
  • Keeping mappings and annotations valid is crucial
  • Large size and complexity prevent fully manual maintenance

[Figure: a mapping ("=") between "malignancy" and "Malignant neoplasm" across two KOS (KS, KT), with annotated data; after evolution the mapping is uncertain ("?") and data may become inaccessible]

SLIDE 3

PROBLEM STATEMENT

  • What is the impact of concept drift (or ontology evolution) on ontology mappings and semantic annotations?
      • Quantitative
      • Qualitative
  • How can we formally characterize concept drift?
      • Basic changes (addition/deletion of concepts)
      • Complex changes (split, merge, move of concepts)
  • Can we reuse information that characterizes concept drift to adapt ontology mappings and semantic annotations?
      • Avoids re-aligning / re-annotating whole datasets
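To make the notion of basic changes concrete, here is a minimal sketch. It assumes each ontology version is available as a set of concept identifiers; the identifiers and the `basic_changes` helper are made up for illustration. Complex changes (split, merge, move) cannot be read off such a set difference and need extra information about how concepts relate across versions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasicChanges:
    added: frozenset
    deleted: frozenset
    kept: frozenset


def basic_changes(old_ids, new_ids):
    """Compare the concept identifiers of two ontology versions."""
    old, new = frozenset(old_ids), frozenset(new_ids)
    return BasicChanges(added=new - old, deleted=old - new, kept=old & new)


# Hypothetical toy versions; the identifiers are invented.
v1 = {"C1", "C2", "C3"}
v2 = {"C1", "C3", "C4", "C5"}
changes = basic_changes(v1, v2)
print(sorted(changes.added), sorted(changes.deleted))
```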

SLIDE 4

AGENDA

① Concept drift for mapping adaptation
    a. DynaMO research project
    b. Change patterns

② Concept drift for semantic annotation maintenance
    a. ELISA research project
    b. Background knowledge

③ Discussion
    a. Concept drift for LOD

SLIDE 5

THE CASE OF MAPPING ADAPTATION

SLIDE 6

ONTOLOGY MAPPING ADAPTATION

Definition and problem statement

“Adaptation of existing mappings according to modifications affecting KOS elements at evolution time”

M_V1 = (s, t, r)  →  M_V2 = (s′, t, r′)
(s: source concept, t: target concept, r: semantic relation)

Hypothesis: there is a correlation between the way KOS elements evolve and the way mappings are adapted
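The mapping triple can be sketched as a small immutable record. This is only an illustration: the `Mapping` class, its field names and the `adapt` helper are assumptions, not part of the presented system. Adaptation keeps the target fixed and may change the source and the relation, as in (s, t, r) → (s′, t, r′).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Mapping:
    source: str    # s: concept of the evolving source KOS
    target: str    # t: concept of the target KOS
    relation: str  # r: semantic relation, e.g. "≡" or "≤"


def adapt(mapping, new_source=None, new_relation=None):
    """Return an adapted copy; the target concept is left untouched."""
    return replace(
        mapping,
        source=mapping.source if new_source is None else new_source,
        relation=mapping.relation if new_relation is None else new_relation,
    )


m_v1 = Mapping("s", "t", "≡")
m_v2 = adapt(m_v1, new_source="s'", new_relation="≤")
print(m_v1, m_v2)
```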

SLIDE 7

UNDERSTANDING MAPPING EVOLUTION

How does concept drift impact mappings?

  • Identify potential interdependencies between changes affecting KOS entities and the mapping evolution
  • Empirically examine official, real-world mappings over time
  • Evolution of SNOMED CT and ICD9CM as a case study
  • ~400 000 mappings analyzed

[Figure: timeline of SNOMED CT releases (Jan/10, Jul/10, Jan/11, Jul/11), ICD9CM releases (2009, 2010) and the mappings MST 1 (Jan/10), MST 2 (Jul/10), MST 3 (Jan/11), MST 4 (Jul/11)]
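Comparing two consecutive mapping releases can be sketched as a dictionary diff. The representation (a dict from (source, target) pairs to a relation symbol) and all identifiers below are assumptions chosen for illustration, not the format of the official mapping files.

```python
def diff_mapping_sets(old, new):
    """Split the differences between two mapping releases into three groups."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return added, removed, changed


# Hypothetical releases: (source id, target id) -> relation.
release_1 = {("S1", "T1"): "≡", ("S2", "T2"): "≤"}
release_2 = {("S1", "T1"): "≤", ("S3", "T2"): "≤"}
added, removed, changed = diff_mapping_sets(release_1, release_2)
print(added, removed, changed)
```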

SLIDE 8

KEY FINDINGS

How to identify these attributes?

[Figure: mappings between ICD9CM and SNOMED CT before and after evolution. ICD9CM codes: 560.39, 560.32. SNOMED CT concepts: 168000, 29162007, 40515007, 44635007, 197063004, linked by is-a relations. Mapping relations: ≡ and ≤. Callouts: "This concept changed", "This concept was added". Labels and attributes shown: Enterolith (disorder), Typhlolithiasis (disorder), Concretion of intestine (disorder), Impaction of intestine, Fecal impaction, Fecal impaction of colon. Other labels: similarity, Observed modifications, Time, Attributes.]

Mapping adaptation based on the evolution of relevant concept attributes
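One way to picture "relevant attributes" is to score each attribute of the source concept against the label of the mapped target concept and keep the best match. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure; the measure actually used in the study is not given on the slide. The labels come from the slide, but grouping them under a single concept is an assumption.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def most_relevant_attribute(attributes, target_label):
    """Return the (attribute, score) pair closest to the target label."""
    scored = ((attr, similarity(attr, target_label)) for attr in attributes)
    return max(scored, key=lambda pair: pair[1])


# Assumed attribute values of one source concept.
attributes = ["Enterolith", "Concretion of intestine", "Fecal impaction"]
best, score = most_relevant_attribute(attributes, "Fecal impaction")
print(best, score)
```

If the best-scoring attribute later moves to another concept, that concept becomes a natural candidate for receiving the mapping.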

SLIDE 9

CHARACTERIZATION OF CHANGES

Lexical change patterns

  • Total Copy (TC)
  • Total Transfer (TT)
  • Partial Copy (PC)
  • Partial Transfer (PT)

CONTEXT = SUP ∪ SUB ∪ SIB

[Figure: attributes a1, a2, …, an of a source concept cs at time j and time j+1, together with the attributes of its context concepts: super concepts (asup1, asup2, …, asupn), sub concepts (asub1, asub2, …, asubn) and siblings (asib1, asib2). Example attribute values: "unspecified mental behavioral problem", "specified behavioral problem", "bronzed diabetes", "inflammatory bowel diseases".]
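The four lexical change patterns can be sketched as a small classifier. The definitions used here are an interpretation of the pattern names, not taken verbatim from the slide: "total" means the whole attribute value shows up in a context concept at time j+1, "partial" means only some of its words do, "copy" means the source concept still carries the value, and "transfer" means it no longer does. Function and parameter names are hypothetical.

```python
def normalize(value):
    return " ".join(value.lower().split())


def tokens(value):
    return set(normalize(value).split())


def lexical_pattern(value, source_after, context_new):
    """Classify how an attribute value of the source concept evolved.

    value        -- attribute value of the source concept at time j
    source_after -- attribute values of the source concept at time j+1
    context_new  -- values that appeared in context concepts at time j+1
    """
    v = normalize(value)
    kept = v in {normalize(a) for a in source_after}
    total = v in {normalize(a) for a in context_new}
    partial = any(tokens(value) & tokens(a) for a in context_new)
    if total:
        return "TC" if kept else "TT"
    if partial:
        return "PC" if kept else "PT"
    return None  # no lexical change pattern detected


print(lexical_pattern("bronzed diabetes", [], ["bronzed diabetes"]))
print(lexical_pattern(
    "unspecified mental behavioral problem",
    ["unspecified mental behavioral problem"],
    ["specified behavioral problem"],
))
```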

SLIDE 10

CHARACTERIZATION OF CHANGES

Semantic change patterns

  • Equivalent (EQV)
  • Partial Match (PTM)
  • More Specific (MSP)
  • Less Specific (LSP)

CONTEXT = SUP ∪ SUB ∪ SIB

[Figure: attributes a1, a2, …, an of a source concept cs at time j and time j+1, together with the attributes of its context concepts (asup1, asup2, …, asupn; asub1, …, asubn; asib1, asib2). Example label pairs: "Diabetes type 1" / "Diabetes type I", "Focal atelectasis" / "Helical atelectasis", "familial chylomicronemia" / "familial hyperchylomicronemia", "Kappa chain disease" / "Kappa light chain disease".]
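As a rough illustration of the four semantic change patterns, the sketch below compares the word sets of two labels. This is a deliberately naive heuristic and not the method behind the slide: it assumes that a label with extra words is more specific, and it knows nothing about synonyms or numeral variants.

```python
def tokens(label):
    return set(label.lower().split())


def semantic_pattern(old_label, new_label):
    """Guess the relation of the new label with respect to the old one."""
    old, new = tokens(old_label), tokens(new_label)
    if old == new:
        return "EQV"  # same words
    if old < new:
        return "MSP"  # new label adds words: assumed more specific
    if new < old:
        return "LSP"  # new label drops words: assumed less specific
    if old & new:
        return "PTM"  # some words in common
    return None       # lexically disjoint labels


print(semantic_pattern("Kappa chain disease", "Kappa light chain disease"))
print(semantic_pattern("Focal atelectasis", "Helical atelectasis"))
```

With this heuristic, "Diabetes type 1" versus "Diabetes type I" comes out as a partial match rather than an equivalence, which shows why a real implementation needs normalization or background knowledge.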

SLIDE 11

LINKING CP AND MAINTENANCE ACTIONS

Heuristics

[Figure: a mapping mst between a source concept cs (KOS KS) and a target concept ct (KOS KT) with relation semType. The source concept, with attributes as1, as2, as3, …, asn and sibling attributes asib1, asib2, …, asibn, is affected by KOS changes between time j and time j+1; the relevant attributes a1, …, ak are highlighted. Example labels: "Kappa light chain disease", "Kappa chain disease".]

  • Condition: ∃! lexical CP (Total Transfer); semantic CP: unchanged
  • Action: MoveM(mst, ccand1)

CONTEXT = SUP ∪ SUB ∪ SIB
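The heuristic on this slide can be read as a rule: when exactly one Total Transfer pattern is found for the relevant attributes, move the mapping to the candidate concept that received them and keep its relation. The sketch below encodes that reading. The `Mapping` class, the `propose_action` helper and the "NoAction" fallback are hypothetical; only the name MoveM comes from the slide.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Mapping:
    source: str
    target: str
    sem_type: str


def propose_action(mapping, lexical_cps):
    """Suggest a maintenance action for a mapping.

    lexical_cps -- list of (pattern, candidate_concept) pairs detected
                   for the relevant attributes of the source concept.
    """
    transfers = [cand for pattern, cand in lexical_cps if pattern == "TT"]
    if len(transfers) == 1:
        # Unique Total Transfer: move the mapping, relation unchanged.
        return "MoveM", replace(mapping, source=transfers[0])
    return "NoAction", mapping  # left for other heuristics or an expert


m_st = Mapping("c_s", "c_t", "≡")
action, adapted = propose_action(m_st, [("TT", "c_cand1")])
print(action, adapted)
```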

SLIDE 12

CONCEPT DRIFT FOR MAPPING ADAPTATION

Lessons learned

  • Concept drift has a huge impact on ontology mappings, but some concept changes do not affect mappings
      • Drift of attribute values governs the mapping adaptation process
  • In most cases, concept drift results in local changes
      • Changes in super concepts, sub concepts and siblings
  • Considering ontology versions alone is not enough to characterize concept drift
      • External background knowledge is needed to better determine the semantic relationship between versions of a concept
      • Cf. semantic annotation adaptation
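Because changes tend to stay local, the neighbourhood CONTEXT = SUP ∪ SUB ∪ SIB used on the previous slides is the natural place to look for them. A minimal sketch, assuming the is-a hierarchy is stored as a dict from each concept to the set of its direct super concepts (the toy hierarchy and the `context` helper are made up, and only direct neighbours are considered):

```python
def context(concept, parents):
    """Direct super concepts, sub concepts and siblings of a concept."""
    sup = set(parents.get(concept, ()))
    sub = {c for c, ps in parents.items() if concept in ps}
    sib = {c for c, ps in parents.items() if c != concept and ps & sup}
    return sup | sub | sib


# Hypothetical hierarchy: B and C are children of A, D is a child of B.
parents = {"B": {"A"}, "C": {"A"}, "D": {"B"}}
print(sorted(context("B", parents)))
```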

SLIDE 13

THE CASE OF SEMANTIC ANNOTATIONS ADAPTATION

www.elisa-project.lu

SLIDE 14

SEMANTIC ANNOTATIONS ADAPTATION

Problem

SLIDE 15

METHODOLOGY

Impact of concept drift on semantic annotations

SLIDE 16

RESULTS

SLIDE 17

RESULTS

SLIDE 18

RESULTS

SLIDE 19

RESULTS

SLIDE 20

CONCEPT DRIFT FOR ANNOTATIONS

Use of external knowledge source

  • A concept may have labels before and after evolution that are disjoint from the syntactic or lexical point of view
      • Ex: Cancer → Malignant neoplasm
      • Lexical and semantic change patterns cannot be applied
  • External knowledge sources must be considered to characterize the evolution of concepts in such situations
  • We propose a method exploiting BioPortal to overcome this limitation
      • Ontologies
      • Mappings
  • The method is able to find the semantic relationship between two versions of the same concept
      • Equivalent, less specific, more specific, unrelated, partially matched
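A simple test can flag the situation described above, where the old and new labels share no words and label-based change patterns have nothing to work with. The function name and the whitespace tokenization are assumptions made for this sketch.

```python
def needs_background_knowledge(old_label, new_label):
    """True when the two labels have no word in common."""
    old = set(old_label.lower().split())
    new = set(new_label.lower().split())
    return not (old & new)


print(needs_background_knowledge("Cancer", "Malignant neoplasm"))
print(needs_background_knowledge("Kappa chain disease", "Kappa light chain disease"))
```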

SLIDE 21

USE OF EXTERNAL KNOWLEDGE SOURCE

Example: “Pituitary dwarfism” (MeSH) and “Pituitary dwarfism II” (MeSH)

  1. Search in ontologies (direct method)
      • “Pituitary dwarfism”: SNOMED CT, ICD9CM, MEDDRA, NCIT, DOID, RCD, HP, DERMLEX, NATPRO, CRISP, SOPHARM, BDO, SNMI
      • “Pituitary dwarfism II”: OMIM, NDFRT
      • No common ontologies
  2. Use mappings (indirect method)
      • 15 mappings available (OMIM ontology)
      • “Pituitary dwarfism II” (OMIM) Mapped_to “Laron-type isolated somatotropin defect” (SNOMED CT)
      • SNOMED CT is the common ontology
  3. “Laron-type isolated somatotropin defect” and “Pituitary dwarfism” have the same super concept (“short stature disorder”): they are siblings
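The direct and indirect lookups in this example can be sketched over small in-memory tables. Nothing below calls BioPortal: the `index`, `mappings` and `parents` tables are hand-built toy data that reproduce a fragment of the slide, and the function names are hypothetical.

```python
def find_common_ontology(label_a, label_b, index, mappings):
    """Return (method, ontology, term_a, term_b) or None.

    index    -- label -> set of ontologies that contain it
    mappings -- (label, ontology) -> list of (mapped label, ontology)
    """
    common = index.get(label_a, set()) & index.get(label_b, set())
    if common:  # direct method: both labels live in one ontology
        return "direct", sorted(common)[0], label_a, label_b
    # Indirect method: follow mappings of label_b into an ontology of label_a.
    for onto in sorted(index.get(label_b, set())):
        for mapped_label, mapped_onto in mappings.get((label_b, onto), []):
            if mapped_onto in index.get(label_a, set()):
                return "indirect", mapped_onto, label_a, mapped_label
    return None


def are_siblings(a, b, parents):
    """True when two distinct terms share a direct super concept."""
    return a != b and bool(parents.get(a, set()) & parents.get(b, set()))


index = {
    "Pituitary dwarfism": {"SNOMED CT", "ICD9CM"},  # subset of the slide's list
    "Pituitary dwarfism II": {"OMIM", "NDFRT"},
}
mappings = {
    ("Pituitary dwarfism II", "OMIM"): [
        ("Laron-type isolated somatotropin defect", "SNOMED CT"),
    ],
}
parents = {  # toy fragment of the SNOMED CT hierarchy, as stated on the slide
    "Pituitary dwarfism": {"short stature disorder"},
    "Laron-type isolated somatotropin defect": {"short stature disorder"},
}

found = find_common_ontology(
    "Pituitary dwarfism", "Pituitary dwarfism II", index, mappings
)
print(found, are_siblings(found[2], found[3], parents))
```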

SLIDE 22

CONCEPT DRIFT IN ANNOTATION ADAPTATION

Lessons learned (so far …)

  • Ontology regions do not evolve in the same way
      • Unstable regions → handle with care
      • Interesting for predicting concept drift
  • Concept drift has a different impact on annotation tools
      • GATE
      • NCBO Annotator
  • Background knowledge gives promising results for characterizing concept drift
      • BioPortal ontologies
      • RDF datasets, Web data under investigation
  • Will machine learning help in understanding concept drift?
      • Identification of relevant features
      • What ML techniques to use?

SLIDE 23

DISCUSSION

Concept drift for LOD

  • Linked Open Data requires vocabularies for semantic interoperability purposes
  • LOD for characterizing concept drift
      • Quality of LOD is problematic
      • Some datasets rely on outdated vocabularies
  • Concept drift impacting LOD:
      • FOAF, DC are not as dynamic as domain ontologies
      • No control over the datasets using controlled vocabularies

→ How to propagate changes observed in the vocabulary to RDF datasets?
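The slide leaves the propagation question open. Purely as an illustration of what the simplest case could look like, the sketch below rewrites triples when a vocabulary term has been replaced one-to-one by another. Triples are plain tuples, all names are invented, and splits, merges or deletions are not handled.

```python
def propagate(triples, replacements):
    """Rewrite every term that has a one-to-one replacement."""
    def fix(term):
        return replacements.get(term, term)
    return [(fix(s), fix(p), fix(o)) for s, p, o in triples]


# Hypothetical dataset and vocabulary change.
triples = [
    ("ex:doc1", "ex:annotatedWith", "voc:Cancer"),
    ("ex:doc2", "ex:annotatedWith", "voc:Diabetes"),
]
replacements = {"voc:Cancer": "voc:MalignantNeoplasm"}
updated = propagate(triples, replacements)
print(updated)
```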

SLIDE 24

COLLABORATORS

  • Silvio Cardoso
  • Dr. Marcos Da Silveira
  • Dr. Duy Dinh
  • Dr. Julio Dos Reis
  • Dr. Anika Gross
  • Prof. Erhard Rahm
  • Prof. Chantal Reynaud-Delaître
  • And all the others …
