CSE443 Compilers Dr. Carl Alphonce alphonce@buffalo.edu 343 Davis - PowerPoint PPT Presentation

CSE443 Compilers Dr. Carl Alphonce alphonce@buffalo.edu 343 Davis Hall

Announcements HW-01 posted PR-01 posted Team formation: what is current status?

Phases of a compiler: lexical structure (Figure 1.6, page 5 of text)

Bird's eye view
language: a set of strings, e.g. { for, while, x, factorial, … }
grammar: rules for generating a language, G = (N, ∑, P, S)
regular expression (regex): a form of grammar
finite automaton: a machine for the language
C program generated by FLEX

languages & grammars Formally, a grammar is defined by 4 items: 1. N, a set of non-terminals 2. ∑ , a set of terminals 3. P, a set of productions 4. S, a start symbol G = (N, ∑ , P, S)

languages & grammars
N, a set of non-terminals
∑, a set of terminals (alphabet), with N ∩ ∑ = {}
P, a set of productions of the form (right linear)
    X -> a
    X -> aY
    X -> 𝜁
    where X ∈ N, Y ∈ N, a ∈ ∑, and 𝜁 denotes the empty string
S, a start symbol, S ∈ N

Lexical Analysis Lexical structure described by regular grammar Deterministic finite state machine performs analysis

LANGUAGE operations base cases { 𝜁 } is a regular language ∀ a ∈ ∑ , { a } is a regular language Recall, 𝜁 is the empty string

LANGUAGE operations
If L and M are regular, so are:
L ∪ M = { s | s ∈ L or s ∈ M } (union)
LM = { st | s ∈ L and t ∈ M } (concatenation)
L* = ∪_{i=0}^{∞} L^i (Kleene closure)
L^i is L concatenated with itself i times:
L^0 = { 𝜁 }, by definition
L^1 = L
L^2 = LL
L^3 = LLL, etc.
L* is the union of all these sets!

Example of L*
Suppose L is {a, bb}
L^0 = { 𝜁 }, by definition
L^1 = L = {a, bb}
L^2 = LL = {aa, abb, bba, bbbb}
L^3 = LLL = {aaa, aabb, abba, abbbb, bbaa, bbabb, bbbba, bbbbbb}
L^4 = …and so on…
L* = ∪_{i=0}^{∞} L^i = { 𝜁, a, bb, aa, abb, bba, bbbb, aaa, aabb, abba, abbbb, bbaa, bbabb, bbbba, bbbbbb, … }
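The example above can be checked mechanically. The following is a minimal sketch, assuming languages are represented as Python sets of strings and the empty string `""` stands in for 𝜁; the helper names `concat`, `power`, and `kleene_upto` are illustrative and not from the slides. Since L* is infinite, only a finite prefix of the union is computed.

```python
from itertools import product

def concat(L, M):
    """Concatenation LM = { st | s in L and t in M }."""
    return {s + t for s, t in product(L, M)}

def power(L, i):
    """L^i: L concatenated with itself i times, with L^0 = {""}."""
    result = {""}                 # "" plays the role of the empty string
    for _ in range(i):
        result = concat(result, L)
    return result

def kleene_upto(L, n):
    """Finite approximation of L*: the union of L^0 through L^n."""
    out = set()
    for i in range(n + 1):
        out |= power(L, i)
    return out

L = {"a", "bb"}
print(sorted(power(L, 2), key=lambda s: (len(s), s)))
print(sorted(power(L, 3), key=lambda s: (len(s), s)))
print(len(kleene_upto(L, 3)))
```

Because the languages are sets, a string that can be produced in more than one way is still listed only once.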

REGular EXpression (regex)
Given an alphabet ∑
Inductive definition
𝜁 is a regex, 𝓜(𝜁) = { 𝜁 }
For each a ∈ ∑, a is a regex, 𝓜(a) = {a}

Regular expressions (regex)
Inductive definition
Assume r and s are regexes.
r|s is a regex denoting 𝓜(r) ∪ 𝓜(s)
rs is a regex denoting 𝓜(r)𝓜(s)
r* is a regex denoting (𝓜(r))*
(r) is a regex denoting 𝓜(r)
Precedence: Kleene closure > concatenation > union
Associativity: all left-associative (minimize use of parentheses: (r|s)|t = r|s|t)
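The precedence rule can be tried out with Python's `re` module, which happens to use the same notation for union (`|`), concatenation (juxtaposition), and Kleene closure (`*`). This is only a sketch for checking which strings a pattern denotes: `re` is a backtracking matcher rather than the automaton built in these slides, and the helper name `denotes` is an assumption of this example.

```python
import re

def denotes(pattern, text):
    """True if the whole of text is in the language of pattern."""
    return re.fullmatch(pattern, text) is not None

# Kleene closure > concatenation > union, so a|bc* groups as a|(b(c*)).
print(denotes("a|bc*", "a"))
print(denotes("a|bc*", "bccc"))
print(denotes("a|bc*", "ac"))      # would require the grouping (a|b)c*
print(denotes("(a|b)c*", "ac"))

# Parentheses are only needed where the default grouping is not wanted.
print(denotes("(a|b)*abb", "babb"))
```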

Algebraic laws
Assume r, s and t are regexes.
Commutativity: r|s = s|r
Associativity: r|(s|t) = (r|s)|t and r(st) = (rs)t
Distributivity: r(s|t) = rs|rt and (s|t)r = sr|tr
Identity: 𝜁r = r𝜁 = r
Idempotency: r** = r*

We can describe a regular language using a regular expression

The language described by a regular expression can be recognized using a finite state machine.
Machines:
NFA: non-deterministic finite automaton
DFA: deterministic finite automaton

Process of building lexical analyzer
1) spell out the language
2) formulate a regular expression
3) build an NFA
4) transform NFA to DFA
5) transform DFA to a minimal DFA
The minimal DFA is our lexical analyzer: character stream -> lexical analyzer -> token stream
language -> regex -> NFA -> DFA -> minimal DFA

Focus for today: regex -> NFA

Nondeterministic Finite Automata (NFA)
A finite set of states S
An alphabet ∑, 𝜁 ∉ ∑
𝛆 ⊆ S × (∑ ∪ { 𝜁 }) × 𝒫(S) (transition function; 𝒫(S) is the power set of S)
s0 ∈ S (a single start state)
F ⊆ S (a set of final or accepting states)

Deterministic Finite Automata (DFA)
A finite set of states S
An alphabet ∑, 𝜁 ∉ ∑
𝛆 ⊆ S × ∑ × S (transition function)
s0 ∈ S (a single start state)
F ⊆ S (a set of final or accepting states)
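To make the DFA definition concrete, here is a minimal sketch of a table-driven recognizer. The transition function is assumed to be stored as a Python dict from (state, symbol) pairs to states; the particular four-state machine below is a hand-built illustration (it accepts strings over {a, b} that end in abb) and does not come from the slides.

```python
def dfa_accepts(delta, start, finals, text):
    """Run a DFA over text; accept if we end in a final state."""
    state = start
    for ch in text:
        if (state, ch) not in delta:   # symbol outside the alphabet: reject
            return False
        state = delta[(state, ch)]     # exactly one next state per (state, symbol)
    return state in finals

# Hypothetical example DFA: strings over {a, b} ending in "abb".
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
START, FINALS = 0, {3}

for word in ["abb", "babb", "ab", "abba"]:
    print(word, dfa_accepts(DELTA, START, FINALS, word))
```

The loop never has to choose between alternatives, which is what makes a DFA a convenient final form for a lexical analyzer.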

A state is a circle with its state number written inside.
Initial state has an arrow from nowhere pointing in. State 0 is often the initial state.
A final state is drawn with a double circle.
Arrows are labeled with 𝜁 (0 --𝜁--> 1) or with a ∈ ∑ (0 --a--> 1, for each a ∈ ∑).

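Regex -> NFA
𝜁: 0 --𝜁--> 1
a: 0 --a--> 1, for each a ∈ ∑
s|t: [figure: start state 0 and final state 1, with N(s) and N(t) in parallel between them, connected by 𝜁-transitions]
st: [figure: states 0 and 1, with N(s) followed by N(t) between them, connected by 𝜁-transitions]
s*: [figure: states 0 and 1 around N(s), connected by 𝜁-transitions]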
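The construction rules above can be sketched in code. This is a hedged illustration in the style of Thompson's construction, not a transcription of the slide figures: the class `NFA`, the helpers `symbol`, `union`, `concat`, `star`, and `accepts`, the use of `None` as the 𝜁 label, and the exact placement of 𝜁-edges (in particular joining N(s) to N(t) with one 𝜁-edge) are all assumptions of this example.

```python
from itertools import count

EPS = None            # label used for 𝜁-transitions (an assumption of this sketch)
_fresh = count()      # source of fresh state numbers

class NFA:
    """An NFA fragment: one start state, one final state, a list of edges."""
    def __init__(self, start, final, edges):
        self.start, self.final, self.edges = start, final, edges

def symbol(a):
    """Base case: a single edge labelled a (pass EPS for the regex 𝜁)."""
    s, f = next(_fresh), next(_fresh)
    return NFA(s, f, [(s, a, f)])

def union(n, m):
    """s|t: new start and final states around N(s) and N(t) in parallel."""
    s, f = next(_fresh), next(_fresh)
    extra = [(s, EPS, n.start), (s, EPS, m.start),
             (n.final, EPS, f), (m.final, EPS, f)]
    return NFA(s, f, n.edges + m.edges + extra)

def concat(n, m):
    """st: N(s) followed by N(t), joined here by one 𝜁-edge."""
    return NFA(n.start, m.final, n.edges + m.edges + [(n.final, EPS, m.start)])

def star(n):
    """s*: allow skipping N(s) entirely or looping back through it."""
    s, f = next(_fresh), next(_fresh)
    extra = [(s, EPS, n.start), (n.final, EPS, f),
             (n.final, EPS, n.start), (s, EPS, f)]
    return NFA(s, f, n.edges + extra)

def accepts(nfa, text):
    """Simulate the NFA by tracking the set of states it could be in."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for src, label, dst in nfa.edges:
                if src == q and label is EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    current = closure({nfa.start})
    for ch in text:
        moved = {dst for src, label, dst in nfa.edges
                 if src in current and label == ch}
        current = closure(moved)
    return nfa.final in current

# (a|b)*abb built bottom-up from the rules.
ab_star = star(union(symbol("a"), symbol("b")))
nfa = concat(concat(concat(ab_star, symbol("a")), symbol("b")), symbol("b"))
print(accepts(nfa, "babb"), accepts(nfa, "abba"))
```

Each rule adds at most two new states, so the NFA grows linearly with the size of the regex.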

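Simple example
static: 0 --s--> 1 --t--> 2 --a--> 3 --t--> 4 --i--> 5 --c--> 6

static | struct: [figure: NFA with one branch for static (states 0 to 6) and one branch for struct (states 7 to 13), joined by 𝜁-transitions to a shared initial state i and a shared final state F]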

Process of building lexical analyzer
The minimal DFA is our lexical analyzer: character stream -> lexical analyzer -> token stream
language -> regex -> NFA -> DFA -> minimal DFA

Focus above: build a non-deterministic recognizer (regex -> NFA)

Next step: make recognizer deterministic (NFA -> DFA)

(a|b)*abb
First we construct an NFA from this regular expression.
[Figure sequence: the NFA is built up step by step, starting with the a and b transitions for a|b, then adding the 𝜁-transitions for the union and for the closure (a|b)*, then the a, b, b transitions for abb. The final NFA has states numbered 0 through 10, with the a-transition of a|b from state 2 to state 3 and the b-transition from state 4 to state 5.]

Operations
𝜁-closure(t) is the set of states reachable from state t using only 𝜁-transitions.
𝜁-closure(T) is the set of states reachable from any state t ∈ T using only 𝜁-transitions.
move(T,a) is the set of states reachable from any state t ∈ T following a transition on symbol a ∈ ∑.
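These operations can be sketched as follows. The NFA used here is an assumption: it is the usual textbook NFA for (a|b)*abb with states 0 through 10, written out by hand as two dicts (`EPS_EDGES` for 𝜁-transitions, `SYM_EDGES` for labelled transitions), and may not match the slide figure edge for edge. The function names `eps_closure` and `move` are likewise illustrative.

```python
# Assumed NFA for (a|b)*abb, states 0..10, start state 0, final state 10.
EPS_EDGES = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}
SYM_EDGES = {(2, "a"): {3}, (4, "b"): {5},
             (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10}}

def eps_closure(states):
    """States reachable from any state in `states` using only 𝜁-transitions."""
    closure = set(states)          # every state reaches itself with zero steps
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in EPS_EDGES.get(q, ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def move(states, a):
    """States reachable from any state in `states` by one transition on a."""
    result = set()
    for q in states:
        result |= SYM_EDGES.get((q, a), set())
    return result

start_set = eps_closure({0})
print(sorted(start_set))
print(sorted(move(start_set, "a")))
print(sorted(eps_closure(move(start_set, "a"))))
```

Note that move does not follow 𝜁-transitions itself, which is why the two operations are composed as 𝜁-closure(move(T,a)).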

NFA -> DFA algorithm (set of states construction - page 153 of text)
INPUT: An NFA N = (S, ∑, 𝛆, s0, F)
OUTPUT: A DFA D = (S', ∑, 𝛆', s0', F') such that ℒ(D) = ℒ(N)
ALGORITHM:
Compute s0' = 𝜁-closure(s0), an unmarked set of states
Set S' = { s0' }
while there is an unmarked T ∈ S'
    mark T
    for each symbol a ∈ ∑
        let U = 𝜁-closure(move(T,a))
        if U ∉ S', add unmarked U to S'
        add transition: 𝛆'(T,a) = U
F' is the subset of S' all of whose members contain a state in F.
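The algorithm above might be implemented along these lines. This is a sketch under assumptions: the NFA is passed in as two dicts (𝜁-edges and labelled edges), DFA states are represented as `frozenset`s of NFA states, the worklist `unmarked` plays the role of the marking, and the example input is the usual textbook NFA for (a|b)*abb with states 0 through 10, which may differ in detail from the slide figure. All function and variable names are illustrative.

```python
def subset_construction(eps_edges, sym_edges, alphabet, start, finals):
    """Build a DFA (as sets of NFA states) equivalent to the given NFA."""
    def eps_closure(states):
        closure, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for r in eps_edges.get(q, ()):
                if r not in closure:
                    closure.add(r)
                    stack.append(r)
        return closure

    def move(states, a):
        result = set()
        for q in states:
            result |= sym_edges.get((q, a), set())
        return result

    s0 = frozenset(eps_closure({start}))
    dstates = {s0}                 # S'
    unmarked = [s0]                # states of S' not yet processed
    dtran = {}                     # the DFA transition function
    while unmarked:
        T = unmarked.pop()         # "mark T" by removing it from the worklist
        for a in alphabet:
            U = frozenset(eps_closure(move(T, a)))
            if U not in dstates:   # an empty U would act as a dead state
                dstates.add(U)
                unmarked.append(U)
            dtran[(T, a)] = U
    dfinals = {T for T in dstates if T & finals}
    return s0, dstates, dtran, dfinals

# Assumed NFA for (a|b)*abb: start state 0, final state 10.
EPS_EDGES = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}
SYM_EDGES = {(2, "a"): {3}, (4, "b"): {5},
             (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10}}

s0, dstates, dtran, dfinals = subset_construction(
    EPS_EDGES, SYM_EDGES, "ab", 0, {10})
for T in sorted(dstates, key=sorted):
    print(sorted(T), "final" if T in dfinals else "")
```

For this input the construction yields five DFA states, one of which is accepting. In the worst case the number of DFA states can be exponential in the number of NFA states, since S' is drawn from the power set of S.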
