Inf2A: Course Roadmap
John Longley, Stuart Anderson
Please read: J&M Chapter 1; Kozen Chapters 1 & 2
High Level Summary
■ This course is foundational: it tries to capture the fundamental concepts that underpin a wide range of phenomena, with special reference to natural language, artificial languages, and the possible behaviours of simple control systems.
■ The fundamental concepts are that of a language, and its description by means of grammars and automata. Broadly, grammars are oriented towards generating the sentences or strings of the language; automata are oriented towards processing existing sentences.
■ This course is also practical: you will use your knowledge of grammars and automata to design and analyse a variety of specific computational systems.
Revision
What is the language recognised by this FSM?
[Figure: a two-state FSM over {a, b}; transition labels b, a, b, a]
1. Any sequence of a's and b's with an even number of a's
2. Any sequence of a's and b's
3. The empty language
4. A sequence of b's of any length
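For concreteness, here is a minimal sketch of such an acceptor, under the assumption that the machine drawn is the usual two-state DFA whose b-transitions are self-loops and whose a-transitions toggle between the states, with the start state accepting (option 1 above). The state names and helper function are illustrative, not from the slides.

```python
# Minimal sketch: a two-state DFA that accepts strings over {a, b}
# containing an even number of a's.

EVEN, ODD = "even", "odd"          # states: parity of a's seen so far
TRANSITIONS = {
    (EVEN, "a"): ODD, (EVEN, "b"): EVEN,
    (ODD, "a"): EVEN, (ODD, "b"): ODD,
}

def accepts(string):
    """Run the DFA; accept iff we end in the EVEN state."""
    state = EVEN
    for symbol in string:
        state = TRANSITIONS[(state, symbol)]
    return state == EVEN

assert accepts("")           # zero a's is even
assert accepts("babab")      # two a's
assert not accepts("ab")     # one a
```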
Overview
■ For our present purposes, a language is a set (usually infinite) of finite sequences of symbols (e.g. letters or simple sounds). A particular such sequence is called a sentence of the language.
■ To specify a language, we specify the alphabet of symbols (usually finite), and then say which sequences of symbols are in the language.
■ Specifications of languages may be given by either:
– a grammar, that is, a set of rules for generating all possible sentences of the language (recall regular expressions), or
– an acceptor (recall finite state acceptors from Inf1A), that is, an automaton for deciding whether a given sentence is in the language. Sometimes there are also outputs: transducers.
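A small sketch (not from the slides) contrasting the two kinds of specification for one toy language, (ab)*: a generator that enumerates its sentences, and an acceptor that decides membership.

```python
# Illustrative sketch: one language, (ab)*, specified two ways --
# as a generator (grammar-style) and as an acceptor.
import re
from itertools import islice

def generate():
    """Enumerate the sentences of (ab)* in order of length (grammar view)."""
    sentence = ""
    while True:
        yield sentence
        sentence += "ab"

def accept(sentence):
    """Decide membership in (ab)* (acceptor view)."""
    return re.fullmatch(r"(ab)*", sentence) is not None

print(list(islice(generate(), 4)))    # ['', 'ab', 'abab', 'ababab']
print(accept("abab"), accept("aba"))  # True False
```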
Overview (continued)
■ We study different classes of grammars and acceptors: in each case, we're interested in the class of languages that can be described by a grammar or acceptor of a certain kind.
■ In particular, we'll study four classes of grammar and four corresponding classes of acceptor (plus variants). In order of increasing power, these are:
– regular grammars, context-free grammars, context-sensitive grammars, unrestricted grammars;
– finite-state automata, pushdown automata, linear-bounded automata, and Turing machines.
■ To some extent these were developed independently, but they are intimately connected.
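As a small illustration of why the hierarchy matters (a sketch, not part of the slides): the language a^n b^n is context-free but not regular, and a counter, the simplest use of a pushdown stack, is enough to accept it, whereas no finite-state acceptor can.

```python
# Sketch: a counter-based (pushdown-style) acceptor for a^n b^n, n >= 0.
def accepts_anbn(s):
    count = 0
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:
                return False      # an 'a' after a 'b' is never allowed
            count += 1
        elif ch == "b":
            seen_b = True
            count -= 1
            if count < 0:
                return False      # more b's than a's so far
        else:
            return False          # symbol outside the alphabet
    return count == 0             # equal numbers of a's and b's

assert accepts_anbn("aaabbb") and not accepts_anbn("aabbb")
```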
Ambiguity and probabilistic models
■ Perhaps the most important difference between natural and artificial languages (for our purposes) is that natural languages are riddled with ambiguity at many levels, whereas a well-designed artificial language usually won't be.
■ So in processing natural languages, we can't always be sure which interpretation of a sentence is the intended one. The best we can do is to try to gauge which is the most probable.
■ This leads us to add bells and whistles to the models already mentioned so as to make them "probabilistic". (E.g. FSMs become Hidden Markov Models or similar.)
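A minimal sketch of what "probabilistic" buys us, using a toy two-state HMM with made-up transition and emission numbers (none of them come from the course): the forward algorithm assigns a probability to an observation sequence by summing over all hidden state sequences.

```python
# Sketch: forward algorithm for a tiny, invented two-state HMM.
states = ["Rainy", "Sunny"]
start  = {"Rainy": 0.6, "Sunny": 0.4}
trans  = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
          "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit   = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def forward(observations):
    """P(observations), summed over all hidden state sequences."""
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {s: sum(alpha[p] * trans[p][s] for p in states) * emit[s][obs]
                 for s in states}
    return sum(alpha.values())

print(forward(["walk", "shop", "clean"]))
```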
Kinds of things we are concerned with (increasingly meta)
■ The design and construction of particular machines (e.g. a traffic light controller, or a parser for Java).
■ Questions about properties of particular machines (e.g. is it the case that two opposing traffic signals never both display green?).
■ Issues about relationships between machines (e.g. do two machines have "the same behaviour" in some sense?).
■ Issues about all machines of a particular class (e.g. is there any FSM that does such-and-such?).
■ Issues across classes of machines (e.g. can every machine in class X be "simulated" by one in class Y?).
When are two automata "the same"?
• Are these two FSMs equivalent?
• Why (not)?
[Figure: two small FSMs with transitions labelled a, b and c]
Equality
• Are these two FSMs equivalent? Why (not)?
• It depends what you mean!
– They are the same, because they recognise the same language.
– But they are different, because after accepting a, either b or c is acceptable in one but not in the other.
• The first answer is the more relevant one for the purpose of language theory (but not for most other purposes!)
[Figure: the same two FSMs as on the previous slide]
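One concrete way to back up the first answer (a sketch under the assumption that both machines are given as complete DFAs; the tuple encoding and state names are illustrative, not from the slides): two DFAs recognise the same language exactly when no reachable pair of states in their product disagrees on acceptance, which a small breadth-first search can check.

```python
# Sketch: language equivalence of two complete DFAs via the product construction.
from collections import deque

def same_language(dfa1, dfa2, alphabet):
    """dfa = (start_state, transition_dict, accepting_set)."""
    (s1, t1, f1), (s2, t2, f2) = dfa1, dfa2
    seen, queue = {(s1, s2)}, deque([(s1, s2)])
    while queue:
        p, q = queue.popleft()
        if (p in f1) != (q in f2):       # one accepts here, the other doesn't
            return False
        for a in alphabet:
            nxt = (t1[(p, a)], t2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Usage: the even-number-of-a's DFA and a renamed two-state copy of it.
d1 = (0,   {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}, {0})
d2 = ("p", {("p", "a"): "q", ("p", "b"): "p",
            ("q", "a"): "p", ("q", "b"): "q"}, {"p"})
print(same_language(d1, d2, "ab"))   # True
```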
More on Equality
[Figure: two machines with transitions labelled tick and tock]
1. Are these two machines equal?
• True, or
• False
• How would you convince me?
[Figure: a further machine with tick/tock transitions]
2. Is this machine equal to the two-state machine above?
• True, or
• False
• How would you convince me?
[Figure: another machine with tick/tock transitions]
3. Is this machine equal to the two-state machine above?
• True, or
• False
What do we do with Grammars and Machines?
■ We can use grammars and machines to describe particular languages we are interested in. We consider:
– Using these mechanisms (particularly grammars) to describe a naturally occurring language (e.g. English or Hindi):
• Here we are constructing a model of some mechanism we can empirically observe.
• We worry about the adequacy of the model, and whether the mechanism explains anything about the phenomenon.
– Using these mechanisms to design a new artificial language (e.g. a programming language or some interchange format between two computer systems):
• We worry about properties of the language, e.g. how easy it is to parse, whether it is unambiguous, and whether it is easy to detect and recover from errors in a sentence of the language.
Revision
■ What regular expression describes the language recognised by this machine?
– (a+b)*
– (a*b*)*
– (b*ab*a)*b*
– (aba)*
[Figure: an FSM with transitions labelled a,b, a, b, a]
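A hedged way to test a candidate answer (the machine itself is only in the figure, so `dfa_accepts` below is a placeholder for whichever machine is drawn): compare the regular expression against the acceptor on all short strings. The concrete calls at the end assume, purely for illustration, that the machine is the even-number-of-a's DFA from the earlier revision slide.

```python
# Sketch: brute-force comparison of a regular expression against an acceptor.
import re
from itertools import product

def agree_on_short_strings(regex, dfa_accepts, alphabet="ab", max_len=8):
    """True iff the regex and the acceptor agree on every string up to max_len."""
    pattern = re.compile(regex)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            s = "".join(letters)
            if (pattern.fullmatch(s) is not None) != dfa_accepts(s):
                return False
    return True

# Illustration only: if the machine is the even-number-of-a's DFA, the third
# option agrees with it on all short strings and the last one does not.
even_as = lambda s: s.count("a") % 2 == 0
print(agree_on_short_strings(r"(b*ab*a)*b*", even_as))  # True
print(agree_on_short_strings(r"(aba)*", even_as))       # False
```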
What do we do with Grammars and Machines?
■ We can explore the definitional power of a particular mechanism (either a grammar or a machine) and see how it relates to other mechanisms. This is the study of the foundations of computation. Our concerns are questions like:
– Is a particular mechanism more or less powerful than another?
– Is it possible to describe any conceivable language in one of these mechanisms?
– Are there languages that are impossible to describe?
– For a given language description, is it always possible to decide whether a sentence is in the language or not?
Natural Language
■ A complex, naturally occurring phenomenon, so our models are always approximate. Areas of study:
– Phonetics and phonology: the study of linguistic sounds
– Morphology: the study of the structure of words
  in.sur.mount.able, sale.s.manager
– Syntax: the study of sentence structure
  fruit flies like a banana
– Semantics: the study of meaning
  A student failed every course: (∃x)(student(x) ∧ (∀y)(course(y) → failed(x,y)))
– Pragmatics and discourse: the study of language use and of larger linguistic units (dialogues, texts)
  It's freezing in here (as a command: close the window)
Designing Artificial Languages
■ Here we are in control of the language, so we try to design in "good" properties: making it easy to check whether a sentence is correct, easy for the checker to recover from human errors (e.g. omissions, misspellings, …), and easy for a human to understand. Typically we study a subset of the areas studied for natural language:
– Lexical analysis (part of morphology): the study of how the symbols of the language are built from the components that make them up (e.g. a name and the letters making up a name).
– Syntax: the study of the structure of sentences.
– Semantics: how to relate meaning to sentences (e.g. in a programming language, relating the text of the program to its behaviour, ideally in a way that is independent of a particular implementation).
Contrast: The attitude to ambiguity
■ Natural utterances are full of ambiguity. The less context, the harder it is to decide what was intended:
– [J&M] "I made her duck"
– All meanings are valid in this context (and each has a different structure).
– We don't want to throw away any possibilities until we know more.
■ In the design of programming languages, ambiguity is usually not tolerated:
– y = 1; if x > 3 then if x < 5 then y = 2 else y = 3
– If x == 2, then what is the value of y after executing this?
– Solution: either don't allow such constructs, or ensure the syntax is always unambiguous.
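A sketch of the two possible readings of the dangling "else", transcribed into Python (where indentation forces a choice) purely to make the ambiguity concrete; the function names are illustrative.

```python
def reading_one(x):          # else attaches to the inner if
    y = 1
    if x > 3:
        if x < 5:
            y = 2
        else:
            y = 3
    return y

def reading_two(x):          # else attaches to the outer if
    y = 1
    if x > 3:
        if x < 5:
            y = 2
    else:
        y = 3
    return y

# With x == 2 the first reading leaves y == 1, the second sets y == 3,
# which is exactly why the original statement is ambiguous.
print(reading_one(2), reading_two(2))   # 1 3
```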