Language Technologies and the Semantic Web: An Essential Relationship. Enrico Motta Professor of Knowledge Technologies Knowledge Media Institute The Open University
Content of the Talk • Update on the Semantic Web – Beyond the hype • What it is • Why it is interesting • What’s its status? • Semantic Web and AI • Semantic Web Applications – Key features – Reasoning on the Semantic Web – Key role of Language Technologies • Conclusions
The Semantic Web in 2 minutes…
<foaf:Person rdf:about="http://identifiers.kmi.open.ac.uk/people/enrico-motta/">
<foaf:name>Enrico Motta</foaf:name>
<foaf:firstName>Enrico</foaf:firstName>
<foaf:surname>Motta</foaf:surname>
<foaf:phone rdf:resource="tel:+44-(0)1908-653506"/>
<foaf:homepage rdf:resource="http://kmi.open.ac.uk/people/motta/"/>
<foaf:workplaceHomepage rdf:resource="http://kmi.open.ac.uk/"/>
<foaf:depiction rdf:resource="http://kmi.open.ac.uk/img/members/enrico.jpg"/>
<foaf:topic_interest>Knowledge Technologies</foaf:topic_interest>
<foaf:topic_interest>Semantic Web</foaf:topic_interest>
<foaf:topic_interest>Ontologies</foaf:topic_interest>
<foaf:topic_interest>Problem Solving Methods</foaf:topic_interest>
<foaf:topic_interest>Knowledge Modelling</foaf:topic_interest>
<foaf:topic_interest>Knowledge Management</foaf:topic_interest>
<foaf:based_near>
<geo:Point>
<geo:lat>52.024868</geo:lat>
<geo:long>-0.707143</geo:long>
<contact:nearestAirport>
<airport:name>London Luton Airport</airport:name>
<airport:iataCode>LTN</airport:iataCode>
<airport:location>Luton, United Kingdom</airport:location>
<geo:lat>51.866666666667</geo:lat>
<geo:long>-0.36666666666667</geo:long>
<rdfs:seeAlso rdf:resource="http://www.daml.org/cgi-bin/airport?LTN"/>
<foaf:currentProject>
<foaf:Project>
<foaf:name>AquaLog</foaf:name>
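To make the "web of data" reading of this excerpt concrete, here is a minimal sketch of how such a FOAF description can be read as subject-property-object statements. It assumes a shortened, well-formed version of the fragment (the slide excerpt is truncated and omits the namespace declarations, which are added here), and it uses only Python's standard XML parser rather than a real RDF toolkit, so it handles just this flat shape. The function name `person_triples` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Shortened, closed-off version of the slide's fragment (an assumption:
# the rdf:RDF wrapper and namespace declarations are not on the slide).
FOAF_XML = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://identifiers.kmi.open.ac.uk/people/enrico-motta/">
    <foaf:name>Enrico Motta</foaf:name>
    <foaf:homepage rdf:resource="http://kmi.open.ac.uk/people/motta/"/>
    <foaf:topic_interest>Semantic Web</foaf:topic_interest>
    <foaf:topic_interest>Ontologies</foaf:topic_interest>
  </foaf:Person>
</rdf:RDF>
"""


def person_triples(xml_text):
    """Return (subject, property, object) triples for each foaf:Person."""
    root = ET.fromstring(xml_text)
    triples = []
    for person in root.findall(f"{{{FOAF_NS}}}Person"):
        subject = person.get(f"{{{RDF_NS}}}about")
        for prop in person:
            # Drop the namespace part of the tag: "{...}name" -> "name".
            # Every property in this fragment is in the FOAF namespace.
            name = prop.tag.split("}", 1)[1]
            # The value is either a linked resource or literal text.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, "foaf:" + name, obj))
    return triples


triples = person_triples(FOAF_XML)
for triple in triples:
    print(triple)
```

Each printed tuple is one statement about the person identified by the `rdf:about` URI; a production system would use a full RDF/XML parser instead of this hand-rolled reader.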
The foaf ontology
The SW as ‘Web of Data’
Current status of the semantic web • 10-20 million semantic web documents – Expressed in RDF, OWL, DAML+OIL • 7K-10K ontologies – These cover a variety of domains - multimedia, computing, management, bio-medical sciences, geography, entertainment, upper level concepts, etc… The above figures refer to resources which are publicly accessible on the web
The Semantic Web today • To a significant extent the Semantic Web is already in place and is characterized by a widespread production of formalized knowledge models (ontologies and metadata), from a variety of different groups and individuals – “The Next Knowledge Medium - An information network with semi-automated services for the generation, distribution, and consumption of knowledge” • Stefik, 1986 – “Knowledge modelling to become a new form of literacy?” • Stutt and Motta, 1997 • Still primarily a research enterprise, however interest is rapidly increasing in both governmental and business organizations • “early adopters” phase • The result is slowly emerging as an unprecedented knowledge resource, which can enable a new generation of intelligent applications on the web
Semantic Web Applications What can you do with the Semantic Web?
“Corporate Semantic Webs” • A ‘corporate ontology’ is used to provide a homogeneous view over heterogeneous data sources • Often tackle Enterprise Information Integration scenarios • Hailed by Gartner as one of the key emerging strategic technology trends – E.g., see personal information management in Garlik
Exploiting large scale semantics [diagram labels: Next Generation SW Applications; Semantic Web]
NGSW Applications in the context of AI research
Knowledge-Based Systems “Today there has been a shift in paradigm. The fundamental problem of understanding intelligence is not the identification of a few powerful techniques, but rather the question of how to represent large amounts of knowledge in a fashion that permits their effective use” Goldstein and Papert, 1977 [diagram labels: Large Body of Knowledge; Intelligent Behaviour]
The Knowledge Acquisition Bottleneck [diagram labels: Knowledge; KA Bottleneck; Large Body of Knowledge; Intelligent Behaviour]
SW as Enabler of Intelligent Behaviour Both a platform for knowledge publishing and a large scale source of knowledge [diagram label: Intelligent Behaviour]
KBS vs SW Systems
                  Classic KBS     SW Systems
Provenance        Centralized     Distributed
Size              Small/Medium    Extra Huge
Repr. Schema      Homogeneous     Heterogeneous
Quality           High            Very Variable
Degree of trust   High            Very Variable
Key Paradigm Shift
Intelligent Behaviour in Classic KBS: a function of sophisticated, logical, task-centric problem solving
Intelligent Behaviour in SW Systems: a side-effect of being able to integrate different types of reasoning to handle size and heterogeneous quality and representation
Next Generation SW Applications: Examples Case Study 1: Automatic Alignment of Thesauri in the Agricultural/Fishery Domain
Method - SCARLET - matching by Harvesting the SW - Automatically select and combine multiple online ontologies to derive a relation [diagram: Scarlet accesses the Semantic Web and deduces a semantic relation between Concept_A (e.g., Supermarket) and Concept_B (e.g., Building), here Concept_A ⊆ Concept_B]
Two strategies: deriving relations from (A) one ontology and (B) across ontologies.
(A) A single ontology on the Semantic Web states Supermarket ⊆ Shop ⊆ PublicBuilding ⊆ Building, so Scarlet derives Supermarket ⊆ Building.
(B) One ontology states Cholesterol ⊆ Steroid and another states Steroid ⊆ Lipid ⊆ OrganicChemical; linking the two Steroid concepts (≡), Scarlet derives Cholesterol ⊆ OrganicChemical.
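The two strategies on this slide can be sketched roughly as follows. This is an illustrative toy, not the Scarlet implementation: the ontologies are hand-written sets of subclass edges transcribed from the slide's two examples, equivalence (≡) across ontologies is approximated by exact label equality, and the function names `is_subsumed`, `strategy_one` and `strategy_two` are hypothetical.

```python
from collections import deque

# Toy ontologies as sets of (subclass, superclass) edges, taken from the
# slide's examples. Real online ontologies would be fetched and parsed.
ONTOLOGY_1 = {("Supermarket", "Shop"), ("Shop", "PublicBuilding"),
              ("PublicBuilding", "Building")}
ONTOLOGY_2 = {("Cholesterol", "Steroid")}
ONTOLOGY_3 = {("Steroid", "Lipid"), ("Lipid", "OrganicChemical")}
ONTOLOGIES = [ONTOLOGY_1, ONTOLOGY_2, ONTOLOGY_3]


def is_subsumed(edges, a, b):
    """Return True if a ⊆ b follows from the edges by transitivity."""
    seen, queue = {a}, deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return True
        for sub, sup in edges:
            if sub == node and sup not in seen:
                seen.add(sup)
                queue.append(sup)
    return False


def strategy_one(ontologies, a, b):
    """Strategy (A): succeed only if one single ontology relates a and b."""
    return any(is_subsumed(edges, a, b) for edges in ontologies)


def strategy_two(ontologies, a, b):
    """Strategy (B): pool all edges, so that concepts with the same label
    in different ontologies act as the join points (the slide's ≡)."""
    merged = set().union(*ontologies)
    return is_subsumed(merged, a, b)


print(strategy_one(ONTOLOGIES, "Supermarket", "Building"))
print(strategy_one(ONTOLOGIES, "Cholesterol", "OrganicChemical"))
print(strategy_two(ONTOLOGIES, "Cholesterol", "OrganicChemical"))
```

The Supermarket/Building pair is resolved inside one ontology, whereas the Cholesterol/OrganicChemical pair is found only when edges from two ontologies are combined through the shared Steroid concept.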
Experiment Matching: • AGROVOC – UN’s Food and Agriculture Organisation (FAO) thesaurus – 28,174 descriptor terms – 10,028 non-descriptor terms • NALT – US National Agricultural Library Thesaurus – 41,577 descriptor terms – 24,525 non-descriptor terms
226 Used Ontologies
http://139.91.183.30:9090/RDF/VRP/Examples/tap.rdf
http://reliant.teknowledge.com/DAML/SUMO.daml
http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml
http://reliant.teknowledge.com/DAML/Economy.daml
http://gate.ac.uk/projects/htechsight/Technologies.daml
Evaluation 1 - Precision • Manual assessment of 1000 mappings (15%) • Evaluators: – Researchers in the area of the Semantic Web – 6 people split in two groups • Results: – Comparable to best results for background knowledge based matchers.
Evaluation 2 – Error Analysis
Other Case Studies…
Giving meaning to tags
Example Cluster_1: {college commerce corporate course education high instructing learn learning lms school student}
[diagram labels, with numbers indicating source ontologies: activities 4 education learning 4 teaching 4 training 1, 4 school 2 qualification corporate 1 postSecondary institution School 2 studiesAt college 2 student 3 university 2,3 takesCourse offersCourse course 3]
1 http://gate.ac.uk/projects/htechsight/Employment.daml
2 http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml
3 http://www.mondeca.com/owl/moses/ita.owl
4 http://www.cs.utexas.edu/users/mfkb/RKF/tree/CLib-core-office.owl
Conclusions
Typical misconceptions… • “The SW is a long-term vision…” – Ehm… actually… it already exists… • “The SW will never work because nobody is going to annotate their web pages” – The SW is not about annotating web pages, the SW is a web of data, most of which are generated from DBs, or from web mining software, or from applications which produce SW data as a side effect of supporting users’ tasks • “The idea of a universal ontology has failed before and will fail again. Hence the SW is doomed” – The SW is not about a single universal ontology. Already there are around 10K ontologies and the number is growing… – SW applications may use 1, 2, 3, or even hundreds of ontologies.