cse443 compilers

CSE443 Compilers, Dr. Carl Alphonce, alphonce@buffalo.edu, 343 Davis Hall



  1. CSE443 Compilers Dr. Carl Alphonce alphonce@buffalo.edu 343 Davis Hall

  2. Team formation: Please sit with your teams starting next week. C vs SML?

  3. Phases of a compiler: lexical structure. Figure 1.6, page 5 of text.

  4. Languages & grammars
     Formally, a grammar is defined by 4 items:
       1. N, a set of non-terminals
       2. Σ, a set of terminals
       3. P, a set of productions
       4. S, a start symbol
     G = (N, Σ, P, S)

  5. Languages & grammars
     N, a set of non-terminals
     Σ, a set of terminals (alphabet), with N ∩ Σ = {}
     P, a set of productions of the form (right linear):
       X -> a
       X -> aY
       X -> ε
     where X ∈ N, Y ∈ N, a ∈ Σ, and ε denotes the empty string
     S, a start symbol, S ∈ N
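The right-linear form above can be sketched as plain data. This is a minimal illustration, not material from the slides: the grammar for the language a*b, the triple encoding of productions, and the helper names `is_right_linear` and `generate` are all assumptions made for the example.

```python
from collections import deque

# Hypothetical right-linear grammar for a*b (not taken from the slides).
N = {"S"}                      # non-terminals
SIGMA = {"a", "b"}             # terminals (alphabet)
# A production (X, a, Y) encodes X -> aY; Y is None for X -> a;
# a == "" together with Y is None encodes X -> ε.
P = {("S", "a", "S"), ("S", "b", None)}
START = "S"

def is_right_linear(N, sigma, P, start):
    """Check N ∩ Σ = {}, S ∈ N, and that every production has an allowed form."""
    if N & sigma or start not in N:
        return False
    for X, a, Y in P:
        if X not in N:
            return False
        if a == "":
            if Y is not None:      # X -> εY is not one of the allowed forms
                return False
        elif a not in sigma:
            return False
        if Y is not None and Y not in N:
            return False
    return True

def generate(P, start, max_len):
    """Return every terminal string of length <= max_len derivable from start."""
    results = set()
    seen = {("", start)}
    queue = deque(seen)
    while queue:
        prefix, X = queue.popleft()
        for lhs, a, Y in P:
            if lhs != X:
                continue
            s = prefix + a
            if len(s) > max_len:
                continue
            if Y is None:
                results.add(s)             # derivation ends here
            elif (s, Y) not in seen:
                seen.add((s, Y))
                queue.append((s, Y))       # keep deriving from Y
    return results

print(is_right_linear(N, SIGMA, P, START))
print(sorted(generate(P, START, 3)))
```

Because every production that continues the derivation emits exactly one terminal, the length bound guarantees the search terminates.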

  6. Lexical analysis: Lexical structure is described by a regular grammar. A deterministic finite state machine performs the analysis.

  7. Language operations
     If L and M are regular, so are:
       L ∪ M = { s | s ∈ L or s ∈ M }    (union)
       LM = { st | s ∈ L and t ∈ M }     (concatenation)
       L* = ∪_{i=0}^{∞} L^i               (Kleene closure)
     By definition, L^0 = { ε }
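For finite languages, the three operations can be tried out directly on Python sets of strings. This is only a sketch: the function names are made up for the example, and because the Kleene closure is infinite in general, `kleene_upto` truncates the union at a chosen power n.

```python
def union(L, M):
    """L ∪ M: strings in L or in M."""
    return L | M

def concat(L, M):
    """LM: every s from L followed by every t from M."""
    return {s + t for s in L for t in M}

def power(L, i):
    """L^i, with L^0 = {""} (the empty string stands in for ε)."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result

def kleene_upto(L, n):
    """Finite approximation of L*: the union of L^0 .. L^n."""
    result = set()
    for i in range(n + 1):
        result |= power(L, i)
    return result

L = {"a", "b"}
M = {"c"}
print(sorted(union(L, M)))
print(sorted(concat(L, M)))
print(sorted(kleene_upto({"ab"}, 2)))
```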

  8. REGular EXpression (regex): inductive definition
     Given an alphabet Σ:
       ε is a regex, 𝓛(ε) = { ε }
       For each a ∈ Σ, a is a regex, 𝓛(a) = { a }

  9. Regular expressions (regex): inductive definition
     Assume r and s are regexes.
       r|s is a regex denoting 𝓛(r) ∪ 𝓛(s)
       rs is a regex denoting 𝓛(r)𝓛(s)
       r* is a regex denoting (𝓛(r))*
       (r) is a regex denoting 𝓛(r)
     Precedence: Kleene closure > concatenation > union
     Associativity: all left-associative (minimize use of parentheses: (r|s)|t = r|s|t)
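The precedence rule can be observed with Python's `re` module, used here as a stand-in: its pattern syntax is a superset of the regexes defined above, and it applies the same ordering to `*`, concatenation and `|`. The helper name `matches` is an assumption for the example.

```python
import re

def matches(pattern, text):
    """True if the whole of text is in the language of pattern."""
    return re.fullmatch(pattern, text) is not None

# a|bc* groups as a|(b(c*)): closure binds tightest, then concatenation, then union.
print(matches("a|bc*", "a"))      # the left alternative
print(matches("a|bc*", "bccc"))   # b followed by any number of c
print(matches("a|bc*", "ac"))     # not in the language: c* belongs to b only

# Parentheses override the default grouping.
print(matches("(a|b)c*", "ac"))
```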

  10. Algebraic laws
      Assume r, s and t are regexes.
        Commutativity: r|s = s|r
        Associativity: r|(s|t) = (r|s)|t and r(st) = (rs)t
        Distributivity: r(s|t) = rs|rt and (s|t)r = sr|tr
        Identity: εr = rε = r
        Idempotency: r** = r*
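The laws can be spot-checked on concrete regexes by comparing the sets of strings they accept up to a length bound. This is a bounded check, not a proof. It assumes Python's `re` as the matcher, writes ε as the empty group `()`, and writes r** as `(a*)*` because Python rejects a doubled `*`; the helper names are hypothetical.

```python
import re
from itertools import product

def strings_upto(alphabet, n):
    """Yield every string over alphabet of length 0..n."""
    for k in range(n + 1):
        for letters in product(alphabet, repeat=k):
            yield "".join(letters)

def lang(pattern, alphabet="ab", n=4):
    """The strings of length <= n that pattern accepts in full."""
    rx = re.compile(pattern)
    return {s for s in strings_upto(alphabet, n) if rx.fullmatch(s)}

checks = {
    "commutativity":  lang("a|b") == lang("b|a"),
    "associativity":  lang("a|(b|ab)") == lang("(a|b)|ab"),
    "distributivity": lang("a(b|ab)") == lang("ab|aab"),
    "identity":       lang("()a") == lang("a") == lang("a()"),
    "idempotency":    lang("(a*)*") == lang("a*"),
}
for law, holds in checks.items():
    print(law, holds)
```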

  11. We can describe a regular language using a regular expression

  12. The language of a regular expression can be recognized by a finite state machine. Machines: NFA (non-deterministic finite automaton) and DFA (deterministic finite automaton).

  13. Process of building lexical analyzer: 1) spell out the language
      [Pipeline: language]

  14. Process of building lexical analyzer: 2) formulate a regular expression
      [Pipeline: language -> regex]

  15. Process of building lexical analyzer: 3) build an NFA
      [Pipeline: language -> regex -> NFA]

  16. Process of building lexical analyzer: 4) transform NFA to DFA
      [Pipeline: language -> regex -> NFA -> DFA]

  17. Process of building lexical analyzer: 5) transform DFA to a minimal DFA
      [Pipeline: language -> regex -> NFA -> DFA -> minimal DFA]

  18. Process of building lexical analyzer: 5) The minimal DFA is our lexical analyzer
      [Pipeline: language -> regex -> NFA -> DFA -> minimal DFA; character stream -> lexical analyzer -> token stream]

  19. Focus for today: regex -> NFA

  20. Nondeterministic Finite Automata (NFA)
      A finite set of states S
      An alphabet Σ, ε ∉ Σ
      δ ⊆ S × (Σ ∪ {ε}) × 𝒫(S) (transition function)
      s_0 ∈ S (a single start state)
      F ⊆ S (a set of final or accepting states)
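One way to make the NFA definition concrete is to simulate it by tracking the set of states the machine could be in. The sketch below is an assumption-laden illustration: the transition function is stored as a dict from (state, symbol) to a set of states, the empty string stands in for the empty-string label, and the example machine (accepting an a followed by any number of b) is invented for the demo.

```python
EPS = ""  # stand-in for the empty-string label on a transition

def eps_closure(states, delta):
    """All states reachable from `states` using only EPS transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def nfa_accepts(delta, start, finals, text):
    """Run the NFA on text, following every possible transition at once."""
    current = eps_closure({start}, delta)
    for ch in text:
        moved = set()
        for s in current:
            moved |= delta.get((s, ch), set())
        current = eps_closure(moved, delta)
    return bool(current & finals)

# Hypothetical NFA: 0 -a-> 1, 1 -EPS-> 2, 2 -b-> 1, accepting state 2.
delta = {(0, "a"): {1}, (1, EPS): {2}, (2, "b"): {1}}
for text in ["a", "abb", "", "ba"]:
    print(repr(text), nfa_accepts(delta, 0, {2}, text))
```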

  21. Deterministic Finite Automata (DFA)
      A finite set of states S
      An alphabet Σ, ε ∉ Σ
      δ ⊆ S × Σ × S (transition function)
      s_0 ∈ S (a single start state)
      F ⊆ S (a set of final or accepting states)
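A DFA needs no set tracking: each (state, symbol) pair leads to at most one next state, so recognition is a single table lookup per character. The machine below is a hypothetical example (strings over {a, b} that end in ab), and treating a missing table entry as rejection is a choice made for this sketch.

```python
def dfa_accepts(delta, start, finals, text):
    """Follow exactly one transition per input character."""
    state = start
    for ch in text:
        if (state, ch) not in delta:
            return False          # no transition defined: reject
        state = delta[(state, ch)]
    return state in finals

# Hypothetical DFA for strings over {a, b} ending in "ab".
delta = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
for text in ["ab", "aab", "abab", "", "aba", "abc"]:
    print(repr(text), dfa_accepts(delta, 0, {2}, text))
```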

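  22. Regex -> NFA
      Initial state: arrow from nowhere pointing in. Often labelled state 0.
      Final state: drawn with a double circle.
      Arrows are labeled with ε or a ∈ Σ.
      [Diagrams: NFA for ε; NFA for each a ∈ Σ; NFA for s|t, built from N(s) and N(t) joined to states 0 and 1 by ε arrows]

  23. Regex -> NFA
      [Diagrams: NFA for ε; NFA for each a ∈ Σ; NFA for s|t, built from N(s) and N(t) joined to states 0 and 1 by ε arrows]

  24. Regex -> NFA
      [Diagrams: NFA for st, built from N(s) and N(t) between states 0 and 1; NFA for s*, built from N(s) with ε arrows between states 0 and 1]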
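The regex -> NFA step can be sketched as one small builder per regex form. This follows the usual Thompson-style construction under stated assumptions: the diagrams did not survive extraction, so the exact wiring on the slides may differ (in particular, `concat` here links the two fragments with an extra empty-string edge), and the names `Frag`, `sym`, `union`, `concat`, `star` and `accepts` are all invented for the example.

```python
import itertools

EPS = ""                      # stand-in for the empty-string label
_ids = itertools.count()      # fresh state numbers

class Frag:
    """An NFA fragment with one start state and one final state."""
    def __init__(self, start, final, edges):
        self.start, self.final, self.edges = start, final, edges

def sym(a):
    """NFA for a single symbol a (or for the empty string if a is EPS)."""
    s, f = next(_ids), next(_ids)
    return Frag(s, f, [(s, a, f)])

def union(x, y):
    """NFA for x|y: new start and final states wired in with EPS edges."""
    s, f = next(_ids), next(_ids)
    extra = [(s, EPS, x.start), (s, EPS, y.start),
             (x.final, EPS, f), (y.final, EPS, f)]
    return Frag(s, f, x.edges + y.edges + extra)

def concat(x, y):
    """NFA for xy: the final state of x leads into the start of y."""
    return Frag(x.start, y.final, x.edges + y.edges + [(x.final, EPS, y.start)])

def star(x):
    """NFA for x*: allow skipping x entirely or repeating it."""
    s, f = next(_ids), next(_ids)
    extra = [(s, EPS, x.start), (x.final, EPS, f),
             (s, EPS, f), (x.final, EPS, x.start)]
    return Frag(s, f, x.edges + extra)

def accepts(frag, text):
    """Simulate the fragment as an NFA over the whole of text."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for src, label, dst in frag.edges:
                if src == q and label == EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    current = closure({frag.start})
    for ch in text:
        moved = {dst for src, label, dst in frag.edges
                 if src in current and label == ch}
        current = closure(moved)
    return frag.final in current

# (a|b)*c, assembled bottom-up from its subexpressions.
nfa = concat(star(union(sym("a"), sym("b"))), sym("c"))
for text in ["c", "abac", "", "ca"]:
    print(repr(text), accepts(nfa, text))
```

Each builder creates fresh states, so a fragment should be used only once when composing larger expressions.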

  25. Simple example: static

  26. Simple example: static
      [Diagram: 0 -s-> 1 -t-> 2 -a-> 3 -t-> 4 -i-> 5 -c-> 6]

  27. Simple example: static | struct
      [Diagram: i -ε-> 0 -s-> 1 -t-> 2 -a-> 3 -t-> 4 -i-> 5 -c-> 6 -ε-> F
                i -ε-> 7 -s-> 8 -t-> 9 -r-> 10 -u-> 11 -c-> 12 -t-> 13 -ε-> F]
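The combined machine for the two keywords can be written down as a transition table and run. The state numbers 0..6 and 7..13 and the labels i and F come from the slide; the wiring of the empty-string arrows is reconstructed from the flattened diagram and is therefore an assumption, as are the helper names.

```python
EPS = ""  # stand-in for the empty-string label

def chain(word, first_state):
    """Transitions for a straight line of states spelling out word."""
    return {(first_state + k, ch): {first_state + k + 1}
            for k, ch in enumerate(word)}

delta = {}
delta.update(chain("static", 0))    # states 0..6
delta.update(chain("struct", 7))    # states 7..13
delta[("i", EPS)] = {0, 7}          # assumed: new start state i branches to both chains
delta[(6, EPS)] = {"F"}             # assumed: both chains end in the final state F
delta[(13, EPS)] = {"F"}

def closure(states):
    """States reachable through EPS transitions only."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for t in delta.get((q, EPS), set()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def accepts(text):
    current = closure({"i"})
    for ch in text:
        moved = set()
        for q in current:
            moved |= delta.get((q, ch), set())
        current = closure(moved)
    return "F" in current

for word in ["static", "struct", "stat", "statuct"]:
    print(word, accepts(word))
```

After reading "st" the simulation is in states 2 and 9 at once; the next character decides which chain survives.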
