Taaltheorie en Taalverwerking BSc Artificial Intelligence Raquel Fernández Institute for Logic, Language, and Computation Winter 2012, lecture 1a Raquel Fernández TtTv 2012 - lecture 1a 1 / 27
TTTV: Practical Matters
• Lecturer:
  ∗ Raquel Fernández, raquel.fernandez@uva.nl
  ∗ Office: C3.131. Office hours: Mondays 11-13h (by appointment)
• Tutors: meet them today at the practical session
  ∗ Sharon Gieske
  ∗ Elise Koster
  ∗ Tim van Rossum
  ∗ Kirsten Teulen
• Timetable:
  ∗ We have a slightly irregular timetable. Please check the schedule for each week online (time slots, rooms, etc.).

TTTV: Practical Matters
Every week:
• Two lectures (a, b)
  ∗ slides online on Blackboard
  ∗ readings that need to be done before the lecture
• Two practical sessions
  ∗ two groups in different rooms with 2 tutors per group
  ∗ some sessions will take place in a computer lab
• Homework exercises
  ∗ one deadline per week
  ∗ use practical sessions to resolve doubts about homework
• Materials
  ∗ see Course Information on Blackboard, where for each week you’ll find the learning objectives, readings, homework, deadlines, etc.

Taaltheorie en Taalverwerking 2012
Schedule for Part 1 of the course
Legend: Lectures; Practical Sessions; Recommended self-study times (for reading, studying, homework, etc.); Time reserved for other courses (e.g. Linear Algebra).
Days: Mon-Fri. Time slots: 9-11h, 11-13h, 13-15h, 15-17h, 17-19h.
• Week 1
  ∗ 13-15h: Lecture 1a, Practical
  ∗ 15-17h: Practical, Lecture 1b
  ∗ Fri 19h: submission deadline for HW on regular expressions and finite-state methods
• Week 2
  ∗ 11-13h: Lecture 2a
  ∗ 13-15h: Practical, Practical
  ∗ 15-17h: Lecture 2b
  ∗ Fri 19h: submission deadline for HW on formal language theory, syntax and DCGs in Prolog
• Week 3
  ∗ 13-15h: Lecture 3a, Lecture 3b
  ∗ 15-17h: Practical
  ∗ 17-19h: Practical
  ∗ Thu 17h: submission deadline for HW on more advanced syntax and parsing
• Week 4: Deeltoets (midterm exam), Tue 13-15h

TTTV: Evaluation
• Weekly homework exercises: 20% of the final grade.
  Can be done in pairs. One deadline per week.
• Two exams: 35% + 35% of the final grade.
  Must be done individually.
• Presentation and webpage: 10% of the final grade.
  Done in groups of 4-5 students. At the end of the course, each team will be expected to give a presentation about a natural language application of their choice and create a webpage summarizing their findings. More details will be provided in class.

TTTV: Evaluation
You will pass the course only if all the following conditions apply:
• you have submitted all homework assignments and gotten a minimum average homework grade of 4.5.
• you have taken the two exams and gotten a minimum grade of 4.5 for each of them.
• you have participated in a presentation and your team has gotten a minimum grade of 4.5.
• your overall weighted average grade is at least 5.5.

What is this course about?
Taaltheorie ≈ (Theoretical/Formal) Linguistics
en
Taalverwerking ≈ Computational Linguistics ≈ Natural Language Processing ≈ Human Language Technology

What is this course about?
Main goals of the course:
• to understand human linguistic abilities
  ∗ language is a cognitive ability that is exclusively human
  ∗ recall the Turing test
• to emulate these abilities using computational models
  ∗ no need to be committed to simulating actual cognitive processes
  ∗ but cognitive modelling might also be an underlying aim
• to study how these computational models can be used to get computers to perform useful tasks involving human language
  ∗ plenty of practical applications involving natural language processing, for instance: machine translation, email filtering, information retrieval, ...

Language and Communication
Diagram from Russell & Norvig (2003), Artificial Intelligence: A Modern Approach

What is this course about?
We’ll focus on the comprehension/hearer’s side.
• First part of the course: structure
  ∗ formal language theory
  ∗ syntax
  ∗ parsing
• Second part of the course: meaning
  ∗ compositional semantics
  ∗ lexical semantics
  ∗ pragmatics and dialogue
N.B.: The contents of the course are slightly different from previous years.

Related Courses in the AI Curriculum
We will build on knowledge and skills you have acquired during the first semester of the 1st year:
• Logisch Programmeren en Zoektechnieken
• Inleiding Logica
Other language-related courses in subsequent years:
• 2nd year: Natuurlijke Taalmodellen en Interfaces
• 3rd year: Discourse

Taaltheorie en Taalverwerking 2012
Raquel Fernández
Content and Overall Learning Objectives
The overall goal of this course is to introduce students to the fundamental topics in computational linguistics and to explain how linguistic knowledge can be used for natural language processing and other key problems in artificial intelligence, such as machine translation and conversational agents. By the end of the course, students should be able to:
1. demonstrate an understanding of the basic concepts in formal language theory, by being able to define formal languages with formalisms and automata and to compare languages, automata, and grammars with different levels of complexity.
2. analyse the syntactic structure of natural language sentences by means of formal grammars and implement some of those grammars in Prolog.
3. describe and compare different parsing algorithms for syntactic processing.
4. represent the meaning of natural language sentences with logic-based formulas and calculate those formulas in a systematic and compositional fashion, on paper and in Prolog.
5. describe the main computational tasks associated with word meanings, including the disambiguation of word meanings in context and the computation of relations between words.
6. describe the main computational challenges of modelling language interaction, including dialogue coherence and the automatic recognition of speech acts.
7. demonstrate an understanding of the inner workings of key natural language applications, by being able to explain how different types of linguistic knowledge bear on applications such as machine translation, information retrieval, and dialogue systems.
Materials
The main resource for the course is the textbook by Jurafsky & Martin (2009). This is a big book which

TTTV: Course Materials
Main resource:
• Jurafsky & Martin (2009) Speech and Language Processing, Second Edition, Pearson Education.
Draft versions of some chapters will be available on Blackboard.
Other materials, such as online articles and book chapters, will be pointed out during the course.
⇒ See Course Information on Blackboard.

Break

Overview
This week we’ll look into Formal Language Theory
• Today:
  ∗ formal languages: alphabets and strings
  ∗ regular expressions
• Next lecture:
  ∗ finite state automata
  ∗ finite state methods for simple natural language tasks

Formal Languages: strings and alphabets
A formal language is a set of strings, each string composed of symbols from a finite set called an alphabet (or a vocabulary).
Examples
• Let Σ₁ = {0, 1} be an alphabet. Then all binary numbers are strings over Σ₁. For instance: 01101, 000001, 1101.

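The definition above can be made concrete with a small Python sketch. It is only an illustration, not part of the course materials: the names `SIGMA_1`, `is_string_over`, `strings_up_to` and the example language `ENDS_IN_ONE` are hypothetical, chosen here for the example. It checks whether a string is built from the symbols of an alphabet, and builds one possible formal language as a set of strings.

```python
from itertools import product

# The alphabet Σ₁ = {0, 1} from the example (symbols as one-character strings).
SIGMA_1 = {"0", "1"}


def is_string_over(s, alphabet):
    """Return True if every symbol of s belongs to the alphabet."""
    return all(symbol in alphabet for symbol in s)


def strings_up_to(alphabet, max_len):
    """List all strings over the alphabet of length 0..max_len, shortest first."""
    symbols = sorted(alphabet)
    return [
        "".join(combo)
        for length in range(max_len + 1)
        for combo in product(symbols, repeat=length)
    ]


# A formal language is a set of strings over the alphabet. As a hypothetical
# example: the binary strings of length at most 3 that end in 1.
ENDS_IN_ONE = {s for s in strings_up_to(SIGMA_1, 3) if s.endswith("1")}

print(is_string_over("01101", SIGMA_1))  # a string over Σ₁
print(is_string_over("0121", SIGMA_1))   # contains 2, which is not in Σ₁
print(sorted(ENDS_IN_ONE, key=lambda s: (len(s), s)))
```

Note that `strings_up_to` includes the empty string, which is a string over any alphabet. The set of all strings over Σ₁ is infinite, so the sketch only enumerates strings up to a chosen length.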