CSC 473 Automata, Grammars & Languages 8/15/10

Automata, Grammars and Languages
Discourse 01: Introduction

Fundamental Questions
Theory of Computation seeks to answer fundamental questions about computing
• What is computation?
    An ancient activity, going back as far as the Babylonians and Egyptians
    Not precisely settled until circa 1936
• What can be computed?
    Do different ways of computing (C, Lisp, …) result in the same “effectively computable” functions from input to output?
• What cannot be computed?
    Not √2, but we can get arbitrarily close
    Are there precisely defined tasks (“problems”) that cannot be carried out?
    Are there Yes/No decisions that cannot be computed?
• What can be computed efficiently? (Computational Complexity)
    Are there inherently difficult although computable problems?

Basic Concepts: Automata, Grammars & Languages
• Language: a set of strings over some finite alphabet Σ
    Ex: L = {TAA, TGA, TAG, …} (DNA codons), Σ = {A, G, C, T}
• Automaton (Machine): abstract (= simplified) model of a computing device. Used to “recognize” strings of a language L
    Ex: Finite Automaton (Finite State Machine)
    [State diagram: transitions labeled a and b]
• Grammar: finite set of string rewriting rules. Used to specify (derive) strings of a language
    Ex: Context-Free Grammar (CFG)
    [Rules: S → x, and a second rule whose right-hand side consists of S, S and +]
Languages
    L1 = {aa, ab, ba, bb}, Σ = {a, b}
    L2 = {ε, a, aa, aaa, aaaa, …}, Σ = {a}
    L3 = {e : e is a well-formed arithmetic expression in C}, Σ = {0-9, a-z, A-Z, +, -, *, /, (, ), ., &, !, …}
    L4 = {p : p is a well-formed C program}, Σ = {ASCII}
    L5 = {p : p is a well-formed C program that halts for all inputs}
    L6 = {(x, y) : x is a decimal integer and y is its binary representation}

Types of Machines
• Logic circuit
    memoryless; values combined using gates
    [Figure: logic circuit with inputs x, y, z and outputs c, s, including ⊕ gates; circuit size = 5, circuit depth = 3]

Types of Machines (cont.)
• Finite-state automaton (FSA)
    bounded number of memory states
    step: input and current state determine next state & output
    Ex: Mod 3 counter, a state/output (Moore) machine
    [State diagram: states q0/0, q1/1, q2/2 (state/output), transitions on input a, e.g. δ(q1, a) = (q2, 2)]
• models programs with a finite number of bounded registers
• reducible to 0 registers
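The mod 3 counter on the finite-state automaton slide can be sketched in C. This is only an illustration under stated assumptions: the three states advance cyclically q0 → q1 → q2 → q0 on each input a, the output of state q_i is i, and any symbol other than a leaves the state unchanged. The names `mod3_step`, `mod3_output`, and `mod3_run` are hypothetical and do not come from the slides.

```c
#include <stddef.h>

/* States of the mod 3 counter; the Moore output of state q_i is i. */
enum state { Q0 = 0, Q1 = 1, Q2 = 2 };

/* Transition function: on input 'a' advance cyclically q0 -> q1 -> q2 -> q0.
   Assumption: any other symbol leaves the state unchanged. */
static enum state mod3_step(enum state q, char symbol)
{
    if (symbol != 'a')
        return q;
    return (enum state)((q + 1) % 3);
}

/* Moore output function: depends on the current state only. */
static int mod3_output(enum state q)
{
    return (int)q;
}

/* Run the machine from q0 over a NUL-terminated input and return
   the output of the state reached at the end of the input. */
int mod3_run(const char *input)
{
    enum state q = Q0;
    for (size_t i = 0; input[i] != '\0'; i++)
        q = mod3_step(q, input[i]);
    return mod3_output(q);
}
```

For example, `mod3_run("aaaa")` returns 1, since four a's leave the counter in q1. The whole machine state fits in one small enum value, which is what "bounded number of memory states" means in practice.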
Types of Machines (cont.)
• Pushdown Automaton (PDA)
    A finite control and a single unbounded stack
    [Transition function δ: maps (state, input symbol, stack symbol) to (state, stack symbol)]
    Ex: L = {a^n b^n # : n ≥ 1}
    [State diagram with transitions labeled: ε, ε → $; a, ε → A; b, A → ε; b, A → ε; #, $ → ε]
    [Figure: stack with marked top and the symbol $]
    models finite program + one unbounded stack of bounded registers

Types of Machines (cont.)
• Random access machine (RAM)
    finite program and an unbounded, addressable random access memory of “registers”
    models general programs
    ◆ unbounded # of bounded registers
    ◆ Simple 1-addr instructions
    Example: R0 ← R0 + R1
        L0: JMPZ R1 L1
            INC R0
            DEC R1
            JMP L0
        L1: CONTINUE
    [Figure: memory of b-bit registers with addresses 0, 1, 2, 3, 4, …]

Types of Machines (cont.)
• Turing Machine (TM)
    finite control and a tape of bounded cells, unbounded in number to the right
    Input is left-adjusted on the tape at the start, terminated by a blank cell
    current state and cell scanned determine next state & overprint symbol
    control writes over the symbol in the cell and moves the head 1 cell L or R
    models simple “sequential” memory; no addressability
    fixed amount of information (b bits) per cell
    δ(q, X) = (p, Y, R)
    [Figure: finite-state control with a head scanning a tape of b-bit cells]
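The pushdown automaton slide earlier on this page lists the transitions ε, ε → $; a, ε → A; b, A → ε; #, $ → ε for L = {a^n b^n # : n ≥ 1}. A minimal C sketch of a recognizer built from those transitions follows. How the transitions attach to states is an assumption, because the state diagram itself is not reproduced here; the state names and the function `pda_accepts` are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Assumed control states: reading a's, reading b's, accepted, dead. */
enum pda_state { READ_A, READ_B, ACCEPT, REJECT };

/* Returns true iff input is in { a^n b^n # : n >= 1 }. */
bool pda_accepts(const char *input)
{
    size_t len = strlen(input);
    /* The stack holds '$' plus at most one 'A' per input symbol, so
       len + 1 cells suffice; a real PDA stack is unbounded. */
    char *stack = malloc(len + 1);
    if (stack == NULL)
        return false;
    size_t top = 0;

    stack[top++] = '$';                    /* epsilon, epsilon -> $ */
    enum pda_state q = READ_A;

    for (size_t i = 0; i < len && q != REJECT; i++) {
        char c = input[i];
        char head = top > 0 ? stack[top - 1] : '\0';
        switch (q) {
        case READ_A:
            if (c == 'a') {
                stack[top++] = 'A';        /* a, epsilon -> A */
            } else if (c == 'b' && head == 'A') {
                top--;                     /* b, A -> epsilon */
                q = READ_B;
            } else {
                q = REJECT;
            }
            break;
        case READ_B:
            if (c == 'b' && head == 'A') {
                top--;                     /* b, A -> epsilon */
            } else if (c == '#' && head == '$') {
                top--;                     /* #, $ -> epsilon */
                q = ACCEPT;
            } else {
                q = REJECT;
            }
            break;
        case ACCEPT:
            q = REJECT;                    /* no input allowed after '#' */
            break;
        case REJECT:
            break;
        }
    }
    free(stack);
    return q == ACCEPT;
}
```

With this sketch, `pda_accepts("aabb#")` is true and `pda_accepts("aab#")` is false: in the second case an A is still on top of the stack when # arrives, so the #, $ → ε move does not apply.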
Theory of Computation
Study of languages and functions that can be described by computation that is finite in space and time
• Grammar Theory
    Context-free grammars
    Right-linear grammars
    Unrestricted grammars
    Capabilities and limitations
    Application: programming language specification
• Automata Theory
    FA
    PDA
    Turing Machines
    Capabilities and limitations
    Characterizing “what is computable?”
    Application: parsing algorithms

Theory of Computation (cont'd)
• Computational Complexity Theory
    Inherent difficulty of “problems”
    Time/space resources needed for computation
    “Intractable” problems
    Ranking of problems by difficulty (hierarchies)
    Application: algorithm design, algorithm improvement, analysis

FSA Ex: Specifying/Recognizing C Identifiers
• Deterministic FA
    Λ = {a, …, z, A, …, Z, _}
    Δ = {0, …, 9}
    State diagram (labeled digraph)
    [State diagram: q0 goes to q_acc on Λ and to q_reject on Δ; q_acc loops on Λ and on Δ]
• Regular Expression
    (_ + a + … + A + …)(_ + a + … + A + … + 0 + … + 9)*
• Right-Linear Grammar
    S → aT | … | zT | AT | … | ZT | _T
    T → aT | … | zT | AT | … | ZT | _T | 0T | … | 9T | ε
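The deterministic FA for C identifiers can be sketched directly as a transition function in C. This is a minimal sketch under assumptions: q_reject is treated as a trap state, characters outside Λ ∪ Δ also lead to q_reject, and letters are contiguous in the character set (true for ASCII). The names `id_step` and `is_c_identifier` are hypothetical. Like the slide's automaton, the sketch does not exclude C keywords.

```c
#include <stdbool.h>
#include <stddef.h>

/* The three states of the identifier DFA. */
enum id_state { Q0, Q_ACC, Q_REJECT };

/* Lambda = { a..z, A..Z, _ } */
static bool is_lambda(char c)
{
    return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* Delta = { 0..9 } */
static bool is_delta(char c)
{
    return c >= '0' && c <= '9';
}

/* Transition function. Assumption: q_reject is a trap state and any
   character outside Lambda and Delta leads to it. */
static enum id_state id_step(enum id_state q, char c)
{
    switch (q) {
    case Q0:
        return is_lambda(c) ? Q_ACC : Q_REJECT;
    case Q_ACC:
        return (is_lambda(c) || is_delta(c)) ? Q_ACC : Q_REJECT;
    default:
        return Q_REJECT;
    }
}

/* Run the DFA from q0; accept iff it ends in q_acc. */
bool is_c_identifier(const char *s)
{
    enum id_state q = Q0;
    for (size_t i = 0; s[i] != '\0'; i++)
        q = id_step(q, s[i]);
    return q == Q_ACC;
}
```

For instance, `is_c_identifier("_tmp1")` is true, while `is_c_identifier("9lives")` and `is_c_identifier("")` are false: the first character must come from Λ, and the empty string never leaves q0.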
FSA Ex: C Floating Constants
• "A floating constant consists of an integer part, a decimal point, a fraction part, an e or E, an optionally signed integer exponent (and an optional type suffix …). The integer and fraction parts both consist of a sequence of digits. Either the integer part or the fraction part (not both) may be missing; either the decimal point or the e and the exponent (not both) may be missing. …"
    -- B. W. Kernighan and D. M. Ritchie, The C Programming Language, Prentice-Hall, 1978
    (The type is determined by the suffix; F or f makes it a float, L or l makes it a long double; otherwise it is double.)

FSA Ex: C Floats (cont'd)
    d = 0 | 1 | … | 9
    [State diagram: transitions labeled d, “.”, “e, E”, and “+, −”]
• Note: type suffixes f, F, l, L omitted
• “Either the integer part or the fraction part (not both) may be missing; either the decimal point or the e and the exponent (not both) may be missing”

CFG Ex: A Calculator Language
• Syntactic Classes
    Numerals: 3, 40
    Digits: 0, 1, 9
    Expressions: 3*9, 40-3*3
    Commands: 3*9=, 40-3*3=
    [Figure: calculator with display 20*30-12= and keys 0-9, +, -, ∗, =]
• Context-Free Grammar
    C → E=
    E → N
      → E+N
      → E-N
      → E∗N
    N → ND
    N → D
    D → 0 | … | 9
    Note: no division & no decimal point
    terminals: Σ = {=, +, −, ∗, 0, …, 9}
    variables: V = {N, D, E, C}
    rules: R
    start variable: C
    grammar: G = (V, Σ, R, C)
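The calculator grammar can be turned into a small evaluator in C. The slides give only the syntax, so this sketch assumes the usual integer meaning of +, - and ∗, applied in the order the grammar's parse tree dictates. The rules E → E+N | E-N | E∗N are left-recursive and have a single precedence level, so operators are applied strictly left to right. The names `parse_numeral` and `calc_eval` are hypothetical, the ASCII characters `*` and `-` stand in for ∗ and −, and overflow is not handled.

```c
#include <stdbool.h>
#include <stddef.h>

/* N -> ND | D : read one or more digits starting at *pos. */
static bool parse_numeral(const char *s, size_t *pos, long *value)
{
    if (s[*pos] < '0' || s[*pos] > '9')
        return false;
    *value = 0;
    while (s[*pos] >= '0' && s[*pos] <= '9') {
        *value = *value * 10 + (s[*pos] - '0');
        (*pos)++;
    }
    return true;
}

/* C -> E= with E -> N | E+N | E-N | E*N.
   Returns false if the command is not derivable from C. */
bool calc_eval(const char *command, long *result)
{
    size_t pos = 0;
    long acc, operand;

    if (!parse_numeral(command, &pos, &acc))        /* E -> N */
        return false;

    /* Each further "op N" extends the expression on its left,
       mirroring the left-recursive rules E -> E op N. */
    while (command[pos] == '+' || command[pos] == '-' || command[pos] == '*') {
        char op = command[pos++];
        if (!parse_numeral(command, &pos, &operand))
            return false;
        if (op == '+')
            acc += operand;
        else if (op == '-')
            acc -= operand;
        else
            acc *= operand;
    }

    /* The command must end with '=' and nothing after it. */
    if (command[pos] != '=' || command[pos + 1] != '\0')
        return false;
    *result = acc;
    return true;
}
```

With this sketch, `calc_eval("40-3*3=", &r)` stores 111 in `r`, that is (40-3)*3 and not 40-(3*3) = 31, because the grammar gives every operator the same precedence.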
Calculator Language (cont'd)
• Syntax trees exhibit “phrase structure”
• Numerals
• Expressions
• Commands
    [Figure: syntax trees for numerals, for expressions, and for the commands 3*9= and 40-3*3=]
    Is this the parse you expected? Under the grammar, 40-3*3= groups as (40-3)*3.

TM Ex: An “Algorithmically Unsolvable” Problem
• Q: Is there an algorithm for deciding if a given program P halts on a given input x?
    [Diagram: a Halting Decider takes P and x as input and outputs 1 if P(x) halts, 0 if P(x) does not halt]
• A: No. There is no program that works correctly for all P, x
• For the proof, we will need a simple programming language‡: NatC, a simplified C
    One data type: nat = {0, 1, 2, …}. All variables are of type nat
    All programs have one nat input and one nat output
‡ We will later on use Turing Machines to model a “simple programming language”. NatC is simpler to describe.

Unsolvable Problem (cont'd)
• Observations:
    A standard C compiler can be modified to accept only NatC programs as “legal”
    Every NatC program P computes a function f_P : nat → nat from natural numbers to natural numbers
    Note: f_P may not be defined for some inputs, i.e., it is a partial function
    Ex: P does not halt for some inputs

    nat P(nat x)
    {
        if (x == 3)
            return 6;
        else {
            /* x == x is always true, so this loop never exits */
            while (x == x)
                x = x + 1;
            return x;
        }
    }
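The example program P can be run in ordinary C if a step budget is added, since a non-terminating run cannot be observed directly. This is a sketch under assumptions: `nat` is modeled as `unsigned long` (bounded, unlike NatC's nat), and the budget parameter `max_steps` and the function name `run_P_bounded` are hypothetical additions that are not part of the slides. Exhausting the budget shows only that P has not halted yet; in general it proves nothing about halting, which is the point of the unsolvability result.

```c
#include <stdbool.h>

/* Stand-in for NatC's nat type (assumption: bounded, unlike nat). */
typedef unsigned long nat;

/* Runs P on x for at most max_steps loop iterations.
   Returns true and stores f_P(x) in *out if P halts within the budget;
   returns false if the budget runs out first. */
bool run_P_bounded(nat x, unsigned long max_steps, nat *out)
{
    if (x == 3) {
        *out = 6;          /* the only input on which P halts */
        return true;
    }

    unsigned long steps = 0;
    /* The slide's loop condition (x == x) is always true, so the loop is
       written as for (;;); the "return x" after it is unreachable. */
    for (;;) {
        if (steps++ >= max_steps)
            return false;  /* budget exhausted: no verdict on halting */
        x = x + 1;
    }
}
```

Here `run_P_bounded(3, 10, &out)` returns true with `out` equal to 6, while `run_P_bounded(0, 1000, &out)` returns false. For this particular P, inspecting the code shows that f_P is defined only at 3; a budget alone could never establish that.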