
CSCI 3136 Principles of Programming Languages: Lexical Analysis and Automata Theory

CSCI 3136 Principles of Programming Languages: Lexical Analysis and Automata Theory - 4. Summer 2013, Faculty of Computer Science, Dalhousie University.


  1. CSCI 3136 Principles of Programming Languages: Lexical Analysis and Automata Theory - 4. Summer 2013, Faculty of Computer Science, Dalhousie University

  2. Regular Expression to NFA (Example): d∗(.d | d.)d∗
     [Figure: NFA for the expression, with transitions labelled d, ".", and ε.]
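To get a feel for the language this expression describes before building the automaton, the sketch below tests a few strings against it. It assumes that `d` on the slide stands for any decimal digit; the names `PATTERN` and `matches` are illustrative, not from the course.

```python
import re

# Assumption: 'd' in the slide's expression stands for any decimal digit.
# The slide's d*(.d|d.)d* becomes \d*(\.\d|\d\.)\d* in Python's syntax.
PATTERN = re.compile(r"\d*(\.\d|\d\.)\d*")

def matches(s: str) -> bool:
    """Return True if the whole string s is in the language of d*(.d|d.)d*."""
    return PATTERN.fullmatch(s) is not None

if __name__ == "__main__":
    for s in ["3.14", ".5", "5.", "5", ".", "1.2.3"]:
        print(repr(s), matches(s))
```

The expression requires exactly one "." with at least one digit next to it, so strings such as "5" or "." are rejected.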

  3. NFA to DFA (Example): d∗(.d | d.)d∗
     [Figure: the NFA with its states numbered 1 to 14.]

     DFA state            on d                 on .
     {1}                  {2,3,4,5,8,9}        {6}
     {2,3,4,5,8,9}        {2,3,4,5,8,9}        {6,10,11,12,14}
     {6}                  {7,11,12,14}         ∅
     {6,10,11,12,14}      {7,11,12,13,14}      ∅
     {7,11,12,14}         {12,13,14}           ∅
     {7,11,12,13,14}      {12,13,14}           ∅
     {12,13,14}           {12,13,14}           ∅
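The table above can be reproduced with a short subset-construction sketch. The NFA edges below are an assumption: the figure did not survive extraction, so they are inferred from the table (a Thompson-style NFA with start state 1 and final state 14). With these edges the ε-closure of the start state is {1,2,4,5,8}, which the slide's table lists simply as {1}; all other rows come out as on the slide.

```python
from collections import deque

EPS = "ε"

# Assumed NFA, inferred from the slide's table: (state, symbol) -> set of states.
NFA = {
    (1, EPS): {2, 4}, (3, EPS): {2, 4}, (4, EPS): {5, 8},
    (7, EPS): {11}, (10, EPS): {11}, (11, EPS): {12, 14}, (13, EPS): {12, 14},
    (2, "d"): {3}, (6, "d"): {7}, (8, "d"): {9}, (12, "d"): {13},
    (5, "."): {6}, (9, "."): {10},
}

def eps_closure(states, nfa):
    """All states reachable from `states` using only ε-transitions."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in nfa.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(nfa, start, alphabet):
    """Return (start_set, table) where table[S][a] is the DFA successor of S."""
    start_set = eps_closure({start}, nfa)
    table, queue = {}, deque([start_set])
    while queue:
        current = queue.popleft()
        if current in table:
            continue
        table[current] = {}
        for a in alphabet:
            moved = {t for s in current for t in nfa.get((s, a), ())}
            target = eps_closure(moved, nfa)
            table[current][a] = target
            if target and target not in table:  # the empty set (∅) is not expanded
                queue.append(target)
    return start_set, table

START_SET, DFA_TABLE = subset_construction(NFA, 1, ["d", "."])

if __name__ == "__main__":
    for state, row in DFA_TABLE.items():
        print(sorted(state), sorted(row["d"]), sorted(row["."]))
```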

  4. NFA to DFA (Example): the DFA states, each a set of NFA states, are renamed 1 to 7.
     [Figure: the DFA drawn twice, once with set labels and once with the new names.]
     {1} ⇔ 1
     {2,3,4,5,8,9} ⇔ 2
     {6} ⇔ 3
     {6,10,11,12,14} ⇔ 4
     {7,11,12,14} ⇔ 5
     {7,11,12,13,14} ⇔ 6
     {12,13,14} ⇔ 7

  5. DFA Minimization Algorithm
     • Create lower-triangular table DISTINCT, initially blank
     • For every pair of states (p, q):
       ◮ If p is final and q is not, or vice versa, set DISTINCT(p, q) to be ε
     • Loop until there is no change in the table contents:
       ◮ For each pair of states (p, q) and each symbol a in the alphabet:
       ◮ If DISTINCT(p, q) is empty and DISTINCT(δ(p, a), δ(q, a)) is not empty, set DISTINCT(p, q) to be a
     • Combine all states that are not distinct
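The steps above can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation. It is run on the seven-state DFA from the earlier example, with one assumption added: an explicit dead state 0 stands in for ∅ so that the transition function is total. The names `distinct_table` and `merge_classes` are hypothetical.

```python
from itertools import combinations

def distinct_table(states, alphabet, delta, finals):
    """Map each distinguishable pair to the symbol (or 'ε') that marked it."""
    distinct = {}
    # Initial marking: one state is final and the other is not.
    for p, q in combinations(states, 2):
        if (p in finals) != (q in finals):
            distinct[frozenset((p, q))] = "ε"
    # Repeat until a full pass leaves the table unchanged.
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in distinct:
                continue
            for a in alphabet:
                succ = frozenset((delta[p][a], delta[q][a]))
                if len(succ) == 2 and succ in distinct:
                    distinct[pair] = a
                    changed = True
                    break
    return distinct

def merge_classes(states, distinct):
    """Group states whose pairs were never marked as distinct."""
    classes = []
    for s in states:
        for group in classes:
            if frozenset((s, group[0])) not in distinct:
                group.append(s)
                break
        else:
            classes.append([s])
    return classes

# DFA from the example; state 0 is an assumed dead state replacing ∅.
STATES = [0, 1, 2, 3, 4, 5, 6, 7]
ALPHABET = ["d", "."]
FINALS = {4, 5, 6, 7}
DELTA = {
    0: {"d": 0, ".": 0}, 1: {"d": 2, ".": 3}, 2: {"d": 2, ".": 4},
    3: {"d": 5, ".": 0}, 4: {"d": 6, ".": 0}, 5: {"d": 7, ".": 0},
    6: {"d": 7, ".": 0}, 7: {"d": 7, ".": 0},
}

DISTINCT = distinct_table(STATES, ALPHABET, DELTA, FINALS)
CLASSES = merge_classes(STATES, DISTINCT)

if __name__ == "__main__":
    print(CLASSES)
```

Comparing a state only against the first member of each group is enough here because, once the table is stable, "not distinct" is an equivalence relation.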

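  6. Minimizing the DFA (Example): DISTINCT table for the DFA with states 1 to 7.

         1   2   3   4   5   6
     2   .
     3   d   d
     4   ε   ε   ε
     5   ε   ε   ε
     6   ε   ε   ε
     7   ε   ε   ε

     The entries for the pairs among 4, 5, 6, and 7 stay blank, so these states are combined.
     [Figure: the original DFA and the minimized DFA with states 1, 2, 3, and the merged state 4,5,6,7.]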

  7. Limits of Regular Languages
     • You cannot construct a DFA to recognize these languages:
       ◮ L = { a^n b^n }, { (^n )^n } (parenthesis languages)
       ◮ L = { set of all syntactically valid C programs }
       ◮ L = { a^p : p is a prime number }
       ◮ ...
     • Not all languages are regular

  8. Pumping Lemma for RLs: For any regular language L, there exists a constant n such that any string w ∈ L with |w| ≥ n can be broken into w = xyz, such that:
       ◮ |xy| ≤ n
       ◮ |y| > 0
       ◮ x y^k z ∈ L for all k = 0, 1, 2, ...
     That is, the substring y can be pumped: it can be removed or repeated any number of times, and the resulting string is always in L.

  9. Proof Sketch for Pumping Lemma: Let L be defined by a DFA with n states. If a string w has length |w| ≥ n, then the walk on w visits |w| + 1 > n states, so by the pigeonhole principle some state q is repeated in the walk.

  10. Proof Sketch for Pumping Lemma: Let L be defined by a DFA with n states. If a string w has length |w| ≥ n, then the walk on w visits |w| + 1 > n states, so by the pigeonhole principle some state q is repeated in the walk.
      [Figure: the walk on w, looping back to the repeated state q.]
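The pigeonhole argument can be traced on a concrete DFA. The sketch below follows the walk on w, stops at the first repeated state, and splits w into x, y, z at the two visits. The DFA used is an assumption for illustration: the minimized automaton for d∗(.d | d.)d∗ with the merged state named 4 and an added dead state 0, so n = 5. The function names are hypothetical.

```python
def pump_split(delta, start, w):
    """Split w into (x, y, z), where y is read between the first two visits
    of the first repeated state. Returns None if no state repeats."""
    first_visit = {start: 0}  # state -> number of symbols read when first seen
    state = start
    for i, ch in enumerate(w, start=1):
        state = delta[state][ch]
        if state in first_visit:
            j = first_visit[state]
            return w[:j], w[j:i], w[i:]
        first_visit[state] = i
    return None

def accepts(delta, start, finals, w):
    """Run the DFA on w and report whether it ends in a final state."""
    state = start
    for ch in w:
        state = delta[state][ch]
    return state in finals

# Assumed DFA: states 1, 2, 3, merged state 4, and dead state 0 (n = 5).
DELTA = {
    0: {"d": 0, ".": 0}, 1: {"d": 2, ".": 3}, 2: {"d": 2, ".": 4},
    3: {"d": 4, ".": 0}, 4: {"d": 4, ".": 0},
}
START, FINALS = 1, {4}

W = "dd.dd"  # |w| = 5 >= n
X, Y, Z = pump_split(DELTA, START, W)

if __name__ == "__main__":
    print((X, Y, Z))
    for k in range(4):
        print(X + Y * k + Z, accepts(DELTA, START, FINALS, X + Y * k + Z))
```

Because the repetition is found within the first n symbols, the split satisfies |xy| ≤ n and |y| > 0, and every pumped string follows the same loop at q before continuing to the same final state.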

  11. Example: Prove that L = { a^n b^n | n > 0 } is not regular.
      Proof: Assume L is regular. Then the pumping lemma holds. Let m be the constant in the pumping lemma and choose w = a^m b^m ∈ L. (Note that w must be chosen such that |w| ≥ m.)
      Because of the restrictions |xy| ≤ m and |y| > 0, any partition w = xyz must have x containing 0 or more a's, y containing 1 or more a's, and z containing 0 or more a's followed by b^m. So the partition is:
          a^m b^m = (a···a)(a···a)(a···a b···b) = x y z, with y = a^k, k ≥ 1
      From the pumping lemma: x y^i z ∈ L for i = 0, 1, 2, ...
      Thus x y^2 z ∈ L. But x y^2 z = xyyz = a^(m+k) b^m, which has more a's than b's and so is not in L. A contradiction!
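The case analysis in this proof can be checked by brute force for small values of m. The sketch below enumerates every split of a^m b^m allowed by the lemma and confirms that pumping y once always leaves the language. It is a sanity check on the argument, not a replacement for the proof; `in_L` and `all_splits_fail` are illustrative names.

```python
def in_L(s: str) -> bool:
    """Membership test for L = { a^n b^n | n > 0 }."""
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n

def all_splits_fail(m: int) -> bool:
    """True if every split w = xyz of a^m b^m with |xy| <= m and |y| > 0
    gives x y^2 z outside L."""
    w = "a" * m + "b" * m
    for i in range(m):                 # i = |x|
        for j in range(i + 1, m + 1):  # j = |xy| <= m, so |y| = j - i >= 1
            x, y, z = w[:i], w[i:j], w[j:]
            if in_L(x + y * 2 + z):
                return False
    return True

if __name__ == "__main__":
    print(all(all_splits_fail(m) for m in range(1, 10)))
```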
