KNOWLEDGE MANAGEMENT AND APPLICATIONS
David Sánchez
Department of Computer Science and Mathematics
April 2013

Tarragona

The university
- Created in 1991
- 52 programmes of study
- Over 12,000 students

The faculty
- Engineering degrees
  - Computer science
  - Telematics
- Masters
  - Computer Security and Intelligent Systems
  - Artificial Intelligence
  - Security of the Information and Communication technologies
- Doctoral program
  - Computer Engineering

Research group
- 9 professors and lecturers
- 6 postdoctoral researchers
- 7 Ph.D. students
- 7 research assistants
- Data privacy and electronic commerce
- Privacy and security in mobile environments
- Private information recovery and codes

Contents
- Introduction
- Knowledge acquisition
- Semantic operators
- Applications to privacy

Motivation
- Numerical data is easy to manage and transform
  - 3 < 4 = true
  - (1 + 2) / 2 = 1.5
  - {3, 2, 5} -> {2, 3, 5}
- A plethora of algorithms rely on arithmetical functions to deal with numerical data

Motivation
- What about text?
  - car > bike ?
  - (apple + orange) / 2 = ??
  - {flu, cold, pneumonia} -> {?, ?, ?}
- Arithmetical functions do not make sense
- Text (words, noun phrases) refers to concepts
- Concepts should be managed according to their formal semantics

Ontologies
- Provide a structured representation of a shared conceptualization
- Elements
  - Classes (concepts)
  - Instances (individuals)
- Semantics
  - Properties (semantic relationships)
  - Restrictions (logical definition of meanings)

Creating ontologies
- Manually
  - Knowledge formalization is challenging
  - Knowledge can be subjective
  - Time consuming
- Assisted
  - Proactive knowledge modelling tools
    - Wizards
    - Reasoners to check knowledge consistency
  - Knowledge engineering methods
    - 101, METHONTOLOGY, On-To-Knowledge

Ontology learning
- Semantics are implicitly expressed in text
- Textual corpora can be analysed to acquire knowledge
  - Discover concepts and individuals
  - Discover and label relations
    - Taxonomic (cancer is a disease)
    - Non-taxonomic (cancer is treated with radiotherapy)
    - Attributes (cancer is non-contagious)
  - Discover restrictions
    - Axioms (Spain borders France -> France borders Spain)

Ontology learning from the Web
- Corpora: the Web
  - The largest electronic repository
  - Heterogeneous
  - It approximates the distribution of information at a social scale
  - Availability of massive IR tools: Web search engines

Knowledge discovery from text
- NL processing tools to identify nouns, noun phrases and named entities
  - Concepts and individuals
- Linguistic patterns to discover semantics
  - Taxonomic
    - “cities such as (Nimes)”, “cancers like (melanoma)”
  - Non-taxonomic
    - “cancer is treated with (surgery)”
  - Attributes
    - “camera has (10MP resolution)”, “camera features (3x zoom)”
  - Axioms (functionality, transitivity, symmetry, reflexivity, etc.)
    - “Spain borders France”, “France borders Spain” -> Symmetry
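
The pattern-matching idea can be illustrated with a minimal Python sketch. It is an assumption-laden simplification: plain regular expressions stand in for the NL processing tools mentioned on the slide, only single-word terms are captured, and the names `PATTERNS` and `extract_relations` are hypothetical.

```python
import re

# Hypothetical pattern set: two of the slide's linguistic patterns written
# as regular expressions that capture one word on each side.
PATTERNS = {
    "taxonomic": re.compile(r"\b(\w+) such as (\w+)"),
    "non_taxonomic": re.compile(r"\b(\w+) is treated with (\w+)"),
}


def extract_relations(text):
    """Return (relation_type, left_term, right_term) triples found in text."""
    triples = []
    for relation_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            triples.append((relation_type, match.group(1), match.group(2)))
    return triples


sample = "We visited cities such as Nimes. Today cancer is treated with surgery."
relations = extract_relations(sample)
```

A real extractor would need noun-phrase chunking to capture multi-word terms such as "10MP resolution"; the regular expressions here only show how a pattern anchors the two related terms.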

Retrieval of suitable corpora
- Create appropriate web search queries
  - Taxonomic: “cities such as” […]
  - Non-taxonomic: “cancer is treated with” […]
  - Attributes: “camera features” […]
  - Axioms: “Spain borders” & “France borders”

Statistical assessment
- Statistical assessor
  - WSE (Web search engine) page count approximates query probabilities at a social scale
  - Use an association score to filter noisy extractions
    - Point-wise mutual information
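
As a rough illustration of the statistical filter, the sketch below computes the textbook point-wise mutual information, PMI(a, b) = log2(p(a, b) / (p(a) p(b))), with each probability approximated by a page count divided by the total number of indexed pages. The function name `pmi` and all counts are made up for the example; no real search engine API is called, and the exact score used in the referenced papers may differ.

```python
import math


def pmi(hits_a, hits_b, hits_ab, total_pages):
    """Point-wise mutual information estimated from page counts.

    hits_a, hits_b: page counts for each term queried alone.
    hits_ab: page count for the two terms queried together.
    total_pages: assumed size of the search engine index.
    """
    p_a = hits_a / total_pages
    p_b = hits_b / total_pages
    p_ab = hits_ab / total_pages
    if p_ab == 0:
        return float("-inf")  # the terms never co-occur
    return math.log2(p_ab / (p_a * p_b))


# Made-up counts: a strongly associated pair and a pair that co-occurs
# about as often as chance would predict.
strong = pmi(hits_a=1000, hits_b=2000, hits_ab=500, total_pages=1_000_000)
weak = pmi(hits_a=1000, hits_b=2000, hits_ab=2, total_pages=1_000_000)
```

An extraction would be kept only when its score exceeds a chosen threshold; the threshold itself is a tuning decision not specified on the slide.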

References
- Taxonomic learning
  - David Sánchez, Antonio Moreno: Pattern-based automatic taxonomy learning from the Web. AI Communications 21(1): 27-48 (2008)
- Non-taxonomic learning
  - David Sánchez, Antonio Moreno: Learning non-taxonomic relationships from web documents for domain ontology construction. Data & Knowledge Engineering 64(3): 600-623 (2008)
- Attribute learning
  - David Sánchez: A methodology to learn ontological attributes from the Web. Data & Knowledge Engineering 69(6): 573-597 (2010)
- Axiom learning
  - David Sánchez, Antonio Moreno, Luis Del Vasto Terrientes: Learning relation axioms from text: An automatic Web-based approach. Expert Systems with Applications 39(5): 5792-5805 (2012)

Exploiting ontologies
- Structured knowledge enables a semantically-coherent interpretation of textual data by defining semantically-grounded operators
- Semantic similarity is the most basic operator
  - Similarity(apple, orange) > Similarity(apple, bike)

Semantic similarity
- Semantic similarity
  - Degree of taxonomical resemblance
  - e.g., dogs and cats are similar as they are mammals
- Semantic relatedness
  - Other non-taxonomic relationships are also considered
  - e.g., car and wheel or pencil and paper
- Similarity measures can be grouped in several families according to
  - the type of knowledge exploited
  - the principles on which similarity estimation relies

Ontology-based similarity

Edge-counting measures
$\mathrm{Distance}(a, b) = |\mathrm{min\_path}(a, b)|$
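
A minimal sketch of an edge-counting distance, assuming a toy is-a taxonomy stored as a child-to-parent dictionary (the taxonomy and the name `path_length` are hypothetical). The shortest path is found with a breadth-first search over the undirected taxonomy graph.

```python
from collections import deque

# Toy taxonomy (hypothetical): child -> parent
PARENT = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}


def path_length(a, b, parent=PARENT):
    """Number of is-a edges on the shortest path between a and b."""
    # Build an undirected adjacency map from the child -> parent links.
    neighbours = {}
    for child, par in parent.items():
        neighbours.setdefault(child, set()).add(par)
        neighbours.setdefault(par, set()).add(child)

    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # a and b are not connected


dog_cat = path_length("dog", "cat")
dog_sparrow = path_length("dog", "sparrow")
```

In this toy taxonomy dog and cat are two edges apart (through mammal), while dog and sparrow are four edges apart (through animal).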

IC-based measures
$\mathrm{Sim}(a, b) = \mathrm{IC}(\mathrm{LCS}(a, b))$
- LCS: Least Common Subsumer

IC-based semantic similarity
- IC computation relies on probability assessments
  - $\mathrm{IC}(c) = -\log p(c)$
- Based on corpora
  - Requires general and heterogeneous corpora
  - Language ambiguity hampers results
  - Data sparseness produces weak statistics
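
The two formulas above can be combined in a short sketch, assuming a toy taxonomy and made-up corpus counts in which each concept's count already includes the occurrences of its specialisations. All names (`PARENT`, `COUNT`, `ic`, `lcs`, `sim`) are hypothetical.

```python
import math

# Toy taxonomy (hypothetical): child -> parent
PARENT = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}

# Made-up corpus counts; each count includes occurrences of the
# concept's specialisations, so the root holds the total.
COUNT = {"animal": 1000, "mammal": 600, "bird": 400,
         "dog": 350, "cat": 250, "sparrow": 300}


def ancestors(c):
    """c followed by its subsumers, from most to least specific."""
    chain = [c]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def ic(c):
    """Information content: IC(c) = -log2 p(c)."""
    return -math.log2(COUNT[c] / COUNT["animal"])


def lcs(a, b):
    """Least Common Subsumer: the most specific shared ancestor."""
    ancestors_b = set(ancestors(b))
    for c in ancestors(a):
        if c in ancestors_b:
            return c
    return None


def sim(a, b):
    """Sim(a, b) = IC(LCS(a, b))."""
    return ic(lcs(a, b))
```

With these counts, dog and cat share the subsumer mammal, which is more informative than the root animal shared by dog and sparrow, so `sim("dog", "cat")` is the larger value.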

Ontology-based IC computation
- Assumption: concepts with many hyponyms in an ontology are more likely to appear in corpora
- Concept probabilities are intrinsically approximated according to taxonomic knowledge
  - Number of hyponyms
  - $\mathrm{IC}(c) = -\log\left(\frac{\mathrm{hyponyms}(c)}{\mathrm{ontology\_size}}\right)$
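
A minimal sketch of the intrinsic computation on a toy taxonomy. One deviation from the slide's formula is an assumption of this example: the concept itself is counted together with its hyponyms (hyponyms + 1), so that leaf concepts do not produce log(0). The names `hyponyms`, `CONCEPTS` and `intrinsic_ic` are hypothetical.

```python
import math

# Toy taxonomy (hypothetical): child -> parent
PARENT = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}

# Every concept mentioned in the taxonomy.
CONCEPTS = set(PARENT) | set(PARENT.values())


def hyponyms(c):
    """All direct and indirect specialisations of c."""
    result = set()
    frontier = [c]
    while frontier:
        node = frontier.pop()
        for child, par in PARENT.items():
            if par == node and child not in result:
                result.add(child)
                frontier.append(child)
    return result


def intrinsic_ic(c):
    """IC estimated from the taxonomy alone.

    Assumption of this sketch: +1 counts the concept itself, which keeps
    the logarithm defined for leaves (zero hyponyms).
    """
    return -math.log2((len(hyponyms(c)) + 1) / len(CONCEPTS))
```

Under this smoothing the root gets an IC of zero and leaves get the maximum value, which matches the slide's assumption that concepts with many hyponyms are the most probable and therefore the least informative.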

Feature-based measures
$\mathrm{Sim}(a, b) = \frac{\mathrm{common\_features}(a, b)}{\mathrm{disjoint\_features}(a, b)}$
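
The ratio can be sketched as follows, assuming that the features of a concept are the concept itself plus all of its taxonomic subsumers. That choice of feature set, the toy taxonomy, and the names `features` and `feature_sim` are assumptions of this example; the measures published in the references may normalise the ratio differently.

```python
# Toy taxonomy (hypothetical): child -> parent
PARENT = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}


def features(c):
    """Assumed feature set: the concept plus all of its subsumers."""
    feats = {c}
    while c in PARENT:
        c = PARENT[c]
        feats.add(c)
    return feats


def feature_sim(a, b):
    """Ratio of common features to disjoint (non-shared) features."""
    feats_a, feats_b = features(a), features(b)
    common = len(feats_a & feats_b)
    disjoint = len(feats_a ^ feats_b)  # symmetric difference
    if disjoint == 0:
        return float("inf")  # identical feature sets
    return common / disjoint


dog_cat = feature_sim("dog", "cat")
dog_sparrow = feature_sim("dog", "sparrow")
```

Here dog and cat share two features (mammal, animal) and differ in two, giving a ratio of 1.0, while dog and sparrow share only animal and differ in four, giving 0.25.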

References
- Feature-based similarity measures
  - Montserrat Batet, David Sánchez, Aïda Valls: An ontology-based measure to compute semantic similarity in biomedicine. Journal of Biomedical Informatics 44(1): 118-125 (2011)
  - David Sánchez, Montserrat Batet, David Isern, Aïda Valls: Ontology-based semantic similarity: A new feature-based approach. Expert Systems with Applications 39(9): 7718-7728 (2012)
- IC-based similarity measures
  - Based on corpora
    - David Sánchez, Montserrat Batet, Aïda Valls, Karina Gibert: Ontology-driven web-based semantic similarity. Journal of Intelligent Information Systems 35(3): 383-413 (2010)
  - Based on ontologies
    - David Sánchez, Montserrat Batet, David Isern: Ontology-based information content computation. Knowledge-Based Systems 24(2): 297-303 (2011)
    - David Sánchez, Montserrat Batet: A New Model to Compute the Information Content of Concepts from Taxonomic Knowledge. International Journal on Semantic Web and Information Systems 8(2): 34-50 (2012)
    - David Sánchez, Montserrat Batet: Semantic similarity estimation in the biomedical domain: An ontology-based information-theoretic perspective. Journal of Biomedical Informatics 44(5): 749-759 (2011)

Other semantic operators
- Semantic similarity/distance is the basis for developing other semantically-grounded operators over a sample of textual data
- Aggregation (mean/centroid)
  - $\mathrm{Mean}(x_1, x_2, \ldots, x_n) = \arg\min_{c} \sum_{i=1}^{n} \mathrm{distance}(c, x_i)$

Aggregation
Sample: colic, lumbago, lumbago, lumbago, migraine, migraine, pain, appendicitis, gastritis

Distance from each mean candidate to each sample value (number of occurrences in the sample in parentheses); the last column is the sum of distances weighted by occurrences.

Mean candidate   colic  lumbago  migraine  appendicitis  gastritis  pain   Sum dist
                 (1)    (3)      (2)       (1)           (1)        (1)
colic            0      3        3         4             4          1      24
lumbago          3      0        2         5             5          2      19
migraine         3      2        0         5             5          2      21
appendicitis     4      5        5         0             2          3      34
gastritis        4      5        5         2             0          3      34
pain             1      2        2         3             3          0      17
ache             2      1        1         4             4          1      16
inflammation     3      4        4         1             1          2      27
symptom          2      3        3         2             2          1      22

The candidate with the minimum summed distance is ache (16).
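
The aggregation example can be reproduced with a short sketch that applies the arg-min definition of the mean to the distances listed on the slide. The data are copied from the slide; the names `FREQ`, `DIST`, `weighted_sum` and `centroid` are hypothetical.

```python
# Occurrences of each distinct value in the sample (from the slide).
FREQ = {"colic": 1, "lumbago": 3, "migraine": 2,
        "appendicitis": 1, "gastritis": 1, "pain": 1}
COLUMNS = list(FREQ)  # column order of the distance rows below

# Distance from each mean candidate to each distinct sample value.
DIST = {
    "colic":        [0, 3, 3, 4, 4, 1],
    "lumbago":      [3, 0, 2, 5, 5, 2],
    "migraine":     [3, 2, 0, 5, 5, 2],
    "appendicitis": [4, 5, 5, 0, 2, 3],
    "gastritis":    [4, 5, 5, 2, 0, 3],
    "pain":         [1, 2, 2, 3, 3, 0],
    "ache":         [2, 1, 1, 4, 4, 1],
    "inflammation": [3, 4, 4, 1, 1, 2],
    "symptom":      [2, 3, 3, 2, 2, 1],
}


def weighted_sum(candidate):
    """Sum of distances from candidate to every value in the sample."""
    return sum(d * FREQ[col] for d, col in zip(DIST[candidate], COLUMNS))


def centroid():
    """Candidate that minimises the summed distance to the sample."""
    return min(DIST, key=weighted_sum)


sums = {candidate: weighted_sum(candidate) for candidate in DIST}
mean_value = centroid()
```

Note that the selected mean, ache, does not appear in the sample itself: candidates are drawn from the ontology, so a generalisation of the sample values can be the best representative.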

Sorting algorithm
Algorithm. Sorting procedure
Inputs: P (dataset)
Output: P’ (P sorted)
  Compute the mean of all values in P
  Consider the most distant value f to the mean
  Add f to P’ and remove it from P
  while (|P| > 0) do
    Obtain the least distant value r to f
    Add r to P’ and remove it from P
  Output P’
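
The procedure can be sketched in Python as below. The sketch follows the slide literally, keeping f fixed while the remaining values are appended in order of increasing distance to it. The function name `semantic_sort` and its `distance` and `mean` parameters are hypothetical; the demonstration uses numbers only so the result is easy to check, whereas a semantic distance and an ontology-based mean would be plugged in for textual data.

```python
def semantic_sort(values, distance, mean):
    """Order values by distance to the value farthest from the mean.

    distance: callable(a, b) -> number (assumed to be supplied by the caller).
    mean: callable(list_of_values) -> central value of the dataset.
    """
    remaining = list(values)
    if not remaining:
        return []

    centre = mean(remaining)
    # Start from the value most distant from the mean.
    f = max(remaining, key=lambda v: distance(v, centre))
    remaining.remove(f)
    ordered = [f]

    # Repeatedly take the remaining value closest to f.
    while remaining:
        r = min(remaining, key=lambda v: distance(v, f))
        remaining.remove(r)
        ordered.append(r)
    return ordered


# Numeric stand-in for a semantic distance and mean.
result = semantic_sort(
    [5, 1, 9, 4],
    distance=lambda a, b: abs(a - b),
    mean=lambda xs: sum(xs) / len(xs),
)
```

For the numeric input the mean is 4.75, the farthest value is 9, and the remaining values follow by increasing distance to 9, giving `[9, 5, 4, 1]`.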

References
- Sergio Martínez, Aïda Valls, David Sánchez: Semantically-grounded construction of centroids for datasets with textual attributes. Knowledge-Based Systems 35: 160-172 (2012)
- Sergio Martínez, David Sánchez, Aïda Valls: A semantic framework to protect the privacy of electronic health records with non-numerical attributes. Journal of Biomedical Informatics 46(2): 294-303
- Josep Domingo-Ferrer, David Sánchez, Guillem Rufian-Torrell: Anonymization of Nominal Data Based on Semantic Marginality. Information Sciences (to appear)
