Language Technology and the Semantic Web
Dr. Günter Neumann
http://www.dfki.de/~neumann
Language Technology-Lab, DFKI, Saarbrücken
7/2004, GN
Overview
• Language Technology
• Semantic Web
• Information Extraction
• Information Access
Human Language Technology
• Human Language Technology covers
  • The design and implementation of algorithms, data and electronic devices for processing of natural language (text and speech), and
  • Their integration into real-world applications and products
• Language Technology defines the engineering part of computational linguistics
LT-methods cover many areas
[Figure: diagram of the areas covered by LT methods; labels not recoverable]
Multi/cross-linguality is of great importance in all these areas!
LT as embedded part of applications
• Human-Machine Communication
• Data-oriented Knowledge Acquisition
• Real-time
• Modularity
• Robustness
• Multi-media
• Scalability
• Software-Engineering standards
• Adaptation
• Evaluation
Language Technology
• Named Entity-Recognition
• PoS/Sem-Tagging
• Efficient data structures
• Controlled Languages
• Weighted finite state automata
• Integration of shallow & deep NLP ("text zooming")
• Machine learning
• Statistical inference
• Reference-resolution
• NL-oriented ontologies
• Already a successful technology transfer
  • Industry (Microsoft, IBM, Siemens, Telekom, ...) & Spin-offs, competence centers, ...
  • Speech-systems, MT, Editors, Text-Mining, Knowledge-Mining, Content-Management, ...
• Newest Technology Hype: the Semantic Web
  • What role does it play for LT?
The Semantic Web (SW)
• Tim Berners-Lee, 1998:
  • "This document is a plan for achieving a set of connected applications for data on the Web in such a way as to form a consistent logical web of data (semantic web)."
• Tim Berners-Lee et al., 2001
  • "… an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."
SW – illustrated
1 Extension of the Current Web
  The existing web will continue to evolve, so that computers can understand content on-line, to better help humans to organize, search, and exchange information.
2 Add meta-data
  Data about data; structural linkage of heterogeneous data sources.
3 Ontologies associate meaning to meta-data
  The SW consists of meta-data and links to global ontologies, which define the meaning of terms.
4 Structured Web of Data
  An ontology serves as a structural vocabulary for the interpretation of domain-specific terms. Meta-data are defined via statements such as:
  • Person is-a human
  • Person has name
  • Person has Email-address
5 The SW does not only consider Web-pages
  [Figure labels: Meta, CV]
6 How will I use the SW?
  • Intelligent information search;
  • Automatic support for the management of my personal information on the SW
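The Person statements in panel 4 can be read as a tiny structured vocabulary. The following is a minimal Python sketch, assuming plain tuples for the statements. The helper names `superclasses` and `properties_of` and the extra statement `Student is-a Person` are hypothetical and appear only for illustration; they are not part of the slides.

```python
# Statements from panel 4, written as (subject, relation, object) tuples.
# "Student is-a Person" is a hypothetical addition to show inheritance.
ONTOLOGY = [
    ("Person", "is-a", "human"),
    ("Person", "has", "name"),
    ("Person", "has", "Email-address"),
    ("Student", "is-a", "Person"),
]


def superclasses(term, statements):
    """Return all classes reachable from `term` by following is-a links."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subj, rel, obj in statements:
            if subj == current and rel == "is-a" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


def properties_of(term, statements):
    """Collect the 'has' properties of `term`, including inherited ones."""
    classes = [term] + superclasses(term, statements)
    return sorted(
        obj for subj, rel, obj in statements
        if rel == "has" and subj in classes
    )


print(superclasses("Student", ONTOLOGY))
print(properties_of("Student", ONTOLOGY))
```

A program that meets the unknown term `Student` can follow the is-a links to `Person` and conclude that a name and an Email-address are expected, which is the sense in which the ontology supports interpretation of domain-specific terms.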
RDF and OWL: Modeling data on the SW
1 RDF: Resource Description Framework
  RDF is a language for the representation of meta-data over web resources.
  RDF-statements are triples (subject, predicate, object).
2 XML & N3 are alternative RDF syntaxes
  [Example RDF statements in XML and N3 not recoverable]
3 OWL: Web Ontology Language
  • Some RDF-statements have a fixed interpretation (is-a, =, inverseOf, card, ...)
  • […] of information between individuals from multiple documents
  • Semantics of OWL as basis for inference mechanisms over these data structures.
  [Figure: example ontology with classes B-Thing, Contractor, Employee, Expert, Manager, Analyst, ProgrammeMgr, ProjectMgr and relations advises[1-4], funds]
4 Relevant aspects for the SW
  Standardization, Web-globalization, distribution of resources ⇒ Web of data from heterogeneous sources
5 Ontology Mapping
  Mapping between distributed, local ontologies
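One way to picture the point that some RDF-statements have a fixed interpretation is a small inference step over triples. The sketch below assumes Python tuples for triples and handles only inverseOf. The property name `advisedBy` and the individuals `anna` and `ben` are hypothetical; `advises` is taken from the slide's diagram. This is an illustration of the idea, not an OWL reasoner.

```python
# Triples as (subject, predicate, object).
# "advisedBy", "anna" and "ben" are hypothetical illustration names.
TRIPLES = {
    ("advises", "inverseOf", "advisedBy"),
    ("anna", "advises", "ben"),
}


def apply_inverse_of(triples):
    """Add (o, q, s) for every (s, p, o) where p and q are declared inverses."""
    inverses = {}
    for s, p, o in triples:
        if p == "inverseOf":
            inverses[s] = o
            inverses[o] = s  # inverseOf holds in both directions
    inferred = set(triples)
    for s, p, o in triples:
        if p in inverses:
            inferred.add((o, inverses[p], s))
    return inferred


result = apply_inverse_of(TRIPLES)
print(("ben", "advisedBy", "anna") in result)
```

Because the meaning of inverseOf is fixed, any program reading these triples can derive the reverse statement without domain-specific code; the same idea extends to the other fixed constructs listed on the slide.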
The SW-pyramid
[Figure: Semantic Web layer pyramid; layers marked as Basic research, Current focus of major efforts, or Established standards]