CFLs and Regular Languages



  1. CFLs and Regular Languages

- We can show that every regular language (RL) is also a CFL.
- We will show this using only regular expressions and context-free grammars. That is what we will do in this half.
- Note: Much of this lecture is not in the text!
- We will show that all regular languages are CFLs.
- If L_1 and L_2 are CFLs, then:
  - L_1 ∪ L_2 is a CFL
  - L_1 L_2 is a CFL
  - L_1* is a CFL
- With the above shown, the claim that every regular language is also a CFL follows by a basic inductive proof.

Union, Concatenation, and Kleene Star of CFLs
- Formally, let L_1 and L_2 be CFLs. Then there exist CFGs
  - G_1 = (V_1, T, S_1, P_1)
  - G_2 = (V_2, T, S_2, P_2)
  such that L(G_1) = L_1 and L(G_2) = L_2.
- Assume that V_1 ∩ V_2 = ∅.
- We will define:
  - G_u = (V_u, T, S_u, P_u) such that L(G_u) = L_1 ∪ L_2
  - G_c = (V_c, T, S_c, P_c) such that L(G_c) = L_1 L_2
  - G_k = (V_k, T, S_k, P_k) such that L(G_k) = L_1*

Union: basic idea
- Define the new CFG so that we can either
  - start with the start variable of G_1 and follow the production rules of G_1, or
  - start with the start variable of G_2 and follow the production rules of G_2.
- The first case will derive a string in L_1.
- The second case will derive a string in L_2.

Union: formally
- G_u = (V_u, T, S_u, P_u)
- V_u = V_1 ∪ V_2 ∪ {S_u}
- S_u is a new start variable, not in V_1 ∪ V_2
- P_u = P_1 ∪ P_2 ∪ {S_u → S_1 | S_2}
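The union construction can be sketched in Python. This is only an illustration, not part of the lecture: the `CFG` tuple, the `union` and `language` helpers, and the two sample grammars are assumptions made for the sketch. `language` enumerates strings by brute-force leftmost derivation and is meant only for small grammars like these.

```python
from typing import NamedTuple


class CFG(NamedTuple):
    variables: frozenset   # V
    terminals: frozenset   # T
    start: str             # S
    productions: dict      # P: variable -> list of right-hand sides (tuples of symbols)


def union(g1, g2, new_start="Su"):
    """Build G_u with L(G_u) = L(G_1) union L(G_2), assuming V_1 and V_2 are disjoint."""
    assert not (g1.variables & g2.variables), "rename variables so V_1 and V_2 are disjoint"
    assert new_start not in g1.variables | g2.variables
    productions = {**g1.productions, **g2.productions}
    productions[new_start] = [(g1.start,), (g2.start,)]   # S_u -> S_1 | S_2
    # The slides use one terminal set T; taking the union of both sets is a harmless generalization.
    return CFG(g1.variables | g2.variables | {new_start},
               g1.terminals | g2.terminals, new_start, productions)


def language(g, max_len):
    """Strings of length <= max_len derivable in g, found by leftmost derivations."""
    seen, found, stack = set(), set(), [(g.start,)]
    while stack:
        form = stack.pop()
        if form in seen:
            continue
        seen.add(form)
        # Index of the leftmost variable in the sentential form, if any.
        idx = next((i for i, s in enumerate(form) if s in g.variables), None)
        if idx is None:
            found.add("".join(form))
            continue
        for rhs in g.productions.get(form[idx], []):
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            # Prune forms that already contain too many terminals.
            if sum(s not in g.variables for s in new) <= max_len:
                stack.append(new)
    return found


# Hypothetical sample grammars: L_1 = {0^i 1^i | i >= 0}, L_2 = {1^m | m > 0}.
g1 = CFG(frozenset({"A"}), frozenset("01"), "A", {"A": [("0", "A", "1"), ()]})
g2 = CFG(frozenset({"B"}), frozenset("1"), "B", {"B": [("1", "B"), ("1",)]})
gu = union(g1, g2)

print(sorted(language(gu, 4)))
```

Every string printed for `gu` comes from either `g1` or `g2`, because the only productions for the new start variable hand control to one of the two original start variables.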

  2. Union, Concatenation, and Kleene Star of CFLs

Concatenation: general idea
- Define the new CFG so that
  - we force a derivation starting from the start variable of G_1 using the rules of G_1;
  - after that, we force a derivation starting from the start variable of G_2 using the rules of G_2.

Concatenation: formally
- G_c = (V_c, T, S_c, P_c)
- V_c = V_1 ∪ V_2 ∪ {S_c}
- S_c is a new start variable, not in V_1 ∪ V_2
- P_c = P_1 ∪ P_2 ∪ {S_c → S_1 S_2}

Kleene star: general idea
- Define the new CFG so that we can repeatedly concatenate derivations of strings in L_1.
- Since L_1* contains λ, we must be careful to ensure that the new CFG has productions that let λ be derived from the start variable.

Kleene star: formally
- G_k = (V_k, T, S_k, P_k)
- V_k = V_1 ∪ {S_k}
- S_k is a new start variable, not in V_1
- P_k = P_1 ∪ {S_k → S_1 S_k | λ}

CFLs and Regular Languages
- Now we can complete the proof.
- Use an inductive proof.

Regular Expression
Recursive definition of regular languages / expressions over Σ:
1. ∅ is a regular language and its regular expression is ∅.
2. {λ} is a regular language and λ is its regular expression.
3. For each a ∈ Σ, {a} is a regular language and its regular expression is a.

  3. Regular Expression (continued)

4. If L_1 and L_2 are regular languages with regular expressions r_1 and r_2, then
   - L_1 ∪ L_2 is a regular language with regular expression (r_1 + r_2)
   - L_1 L_2 is a regular language with regular expression (r_1 r_2)
   - L_1* is a regular language with regular expression (r_1*)
Only languages obtainable by using rules 1-4 are regular languages.

CFLs and Regular Languages
RE -> CFG, base cases:
1. ∅ can be expressed as a CFG with no productions.
2. {λ} can be expressed by a CFG with the single production S → λ.
3. For each a ∈ Σ, {a} can be expressed by a CFG with the single production S → a.

RE -> CFG, inductive step:
- Assume R_1 and R_2 are regular expressions that describe languages L_1 and L_2. Then, by the induction hypothesis, L_1 and L_2 are CFLs, and as such there are CFGs that describe L_1 and L_2.
- Create CFGs that describe the languages:
  - L_1 ∪ L_2
  - L_1 L_2
  - L_1*
- Which we just did. We are done!

[Figure: diagram relating Finite Languages, Regular Languages, and Context Free Languages]

What have we learned?
- CFLs are closed under union, concatenation, and Kleene star.
- Every regular language is also a CFL.
- We now have an algorithm, given a regular expression, to construct a CFG that describes the same language.

Example: find a CFG for L = (011 + 1)* (01)*
- (011 + 1) can be described by the CFG with productions:
  A → 011 | 1
- (011 + 1)* can be described by the CFG with productions:
  B → AB | λ
  A → 011 | 1
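The base cases and inductive step of RE -> CFG can be turned into a small recursive function. The sketch below is an illustration under stated assumptions, not the lecture's own code: the nested-tuple regular expression format, the generated variable names `V0`, `V1`, ..., and the helpers `to_cfg`, `word`, and `language` are all made up for the example. It uses the productions S → S_1 | S_2 for union, S → S_1 S_2 for concatenation, and S → S_1 S | λ for Kleene star, and compares the result against Python's `re` module on all short binary strings.

```python
import itertools
import re


def to_cfg(regex):
    """Return (start, productions) for a regex given as nested tuples."""
    productions = {}
    counter = itertools.count()

    def build(node):
        start = f"V{next(counter)}"        # fresh variable for this subexpression
        productions[start] = []
        kind = node[0]
        if kind == "empty":                # base case 1: no productions
            pass
        elif kind == "lambda":             # base case 2: S -> lambda
            productions[start].append(())
        elif kind == "sym":                # base case 3: S -> a
            productions[start].append((node[1],))
        elif kind == "union":              # S -> S1 | S2
            s1, s2 = build(node[1]), build(node[2])
            productions[start] += [(s1,), (s2,)]
        elif kind == "concat":             # S -> S1 S2
            s1, s2 = build(node[1]), build(node[2])
            productions[start].append((s1, s2))
        elif kind == "star":               # S -> S1 S | lambda
            s1 = build(node[1])
            productions[start] += [(s1, start), ()]
        else:
            raise ValueError(f"unknown node kind: {kind}")
        return start

    return build(regex), productions


def word(w):
    """Regex for a fixed nonempty string, as nested concatenations."""
    node = ("sym", w[0])
    for ch in w[1:]:
        node = ("concat", node, ("sym", ch))
    return node


def language(start, productions, max_len):
    """Strings of length <= max_len derivable from start (leftmost derivations)."""
    seen, found, stack = set(), set(), [(start,)]
    while stack:
        form = stack.pop()
        if form in seen:
            continue
        seen.add(form)
        idx = next((i for i, s in enumerate(form) if s in productions), None)
        if idx is None:
            found.add("".join(form))
            continue
        for rhs in productions[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            if sum(s not in productions for s in new) <= max_len:
                stack.append(new)
    return found


# (011 + 1)* (01)* written as a nested-tuple regular expression.
regex = ("concat",
         ("star", ("union", word("011"), word("1"))),
         ("star", word("01")))
start, productions = to_cfg(regex)

MAX_LEN = 6
expected = {"".join(chars)
            for n in range(MAX_LEN + 1)
            for chars in itertools.product("01", repeat=n)
            if re.fullmatch(r"(011|1)*(01)*", "".join(chars))}
print(language(start, productions, MAX_LEN) == expected)
```

The generated grammar has more variables than a hand-written one, since every subexpression gets its own start variable, but it follows the inductive proof step by step.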

  4. CFLs and Regular Languages

Example (continued): find a CFG for L = (011 + 1)* (01)*
- (01) can be described by the CFG with productions:
  D → 01
- (01)* can be described by the CFG with productions:
  C → DC | λ
  D → 01
- Putting it all together, (011 + 1)* (01)* can be described by the CFG with productions:
  S → BC
  B → AB | λ
  A → 011 | 1
  C → DC | λ
  D → 01
- Questions?

Union, Concatenation, and Kleene Star of CFLs
- You can use the proofs of the closure properties when building CFGs for CFLs.
- Example: find a CFG for L = {0^i 1^j 0^k | j > i + k}
  - The number of 1s is greater than the combined number of 0s.
  - This language can be expressed as L = {0^i 1^i 1^m 1^k 0^k | m > 0}.
  - This is the concatenation of 3 languages, L_1 L_2 L_3, where
    - L_1 = {0^i 1^i | i ≥ 0}
    - L_2 = {1^m | m > 0}
    - L_3 = {1^k 0^k | k ≥ 0}

Example: CFGs for the three languages
- CFG for L_1 = {0^i 1^i | i ≥ 0}:
  A → 0A1 | λ
- CFG for L_2 = {1^m | m > 0}:
  B → 1B | 1
- CFG for L_3 = {1^k 0^k | k ≥ 0}:
  C → 1C0 | λ
- CFG for L:
  S → ABC
  A → 0A1 | λ
  B → 1B | 1
  C → 1C0 | λ

Example: formally
- G = (V, T, S, P) where
  - V = {S, A, B, C}
  - T = Σ = {0, 1}
  - P = {S → ABC,
         A → 0A1 | λ,
         B → 1B | 1,
         C → 1C0 | λ}
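Both example grammars can be sanity-checked by brute force on short strings. The following is a sketch, not part of the lecture: the helper names (`language`, `binary_strings`, `in_count_language`) and the length bound are assumptions, and agreeing on all strings up to a fixed length is evidence, not a proof. Variables are single uppercase letters, so each right-hand side is written as a plain string, with "" standing for λ.

```python
import itertools
import re


def language(productions, start, max_len):
    """Strings of length <= max_len derivable from start (leftmost derivations)."""
    seen, found, stack = set(), set(), [start]
    while stack:
        form = stack.pop()
        if form in seen:
            continue
        seen.add(form)
        # Index of the leftmost variable in the sentential form, if any.
        idx = next((i for i, s in enumerate(form) if s in productions), None)
        if idx is None:
            found.add(form)
            continue
        for rhs in productions[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # Prune forms that already contain too many terminals.
            if sum(s not in productions for s in new) <= max_len:
                stack.append(new)
    return found


def binary_strings(max_len):
    """All strings over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in itertools.product("01", repeat=n):
            yield "".join(chars)


# Grammar for (011 + 1)* (01)* from the slides.
regex_grammar = {"S": ["BC"], "B": ["AB", ""], "A": ["011", "1"],
                 "C": ["DC", ""], "D": ["01"]}

# Grammar for {0^i 1^j 0^k | j > i + k} from the slides.
count_grammar = {"S": ["ABC"], "A": ["0A1", ""], "B": ["1B", "1"], "C": ["1C0", ""]}


def in_count_language(w):
    """Direct membership test for {0^i 1^j 0^k | j > i + k}."""
    m = re.fullmatch(r"(0*)(1*)(0*)", w)
    return bool(m) and len(m.group(2)) > len(m.group(1)) + len(m.group(3))


MAX_LEN = 7
expected_regex = {w for w in binary_strings(MAX_LEN)
                  if re.fullmatch(r"(011|1)*(01)*", w)}
expected_count = {w for w in binary_strings(MAX_LEN) if in_count_language(w)}

print(language(regex_grammar, "S", MAX_LEN) == expected_regex)
print(language(count_grammar, "S", MAX_LEN) == expected_count)
```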

  5. CFLs and Regular Languages

- Questions?

Practical uses for grammars
- How a compiler works:
  Source file → lexer → stream of tokens → parser → parse tree → codegen → object code

Theory Hall of Fame: The Bell Labs Gang
- Eric E. Schmidt and Mike Lesk: lex
- Ken Thompson: regular expressions in UNIX / grep / vi
- Stephen C Johnson: yacc

A real practical example
- Grammars for programming languages:
  <stmt> → … | <for-stmt> | <if-stmt> | …
  <stmt> → { <stmt> <stmt> } | ε
  <if-stmt> → if ( <expr> ) then <stmt>
  <for-stmt> → for ( <expr> ; <expr> ; <expr> ) <stmt>
- Keywords and punctuation are terminals.
- Program constructs are variables.
- Production rules define the syntax of the language.
- This is really the second step in building a compiler!

Famous programming language ambiguity
- Dangling else:
  <stmt> → if (<expr>) <stmt> |
           if (<expr>) <stmt> else <stmt> |
           <some_other_stmt>
- Example: if (expr1) if (expr2) f(); else g();
- To which if does the else belong?
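The ambiguity can be made concrete by counting parse trees. The sketch below is an illustration only: the `count_parse_trees` helper and the token encoding are assumptions, with the token "if" standing for `if (<expr>)` and "s" standing for `<some_other_stmt>`. The counter relies on every production of this grammar starting with a terminal, so it is not a general-purpose parser.

```python
from functools import lru_cache


def count_parse_trees(productions, start, tokens):
    """Count the parse trees that derive `tokens` from `start`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def derive_symbol(symbol, i, j):
        """Number of ways `symbol` derives tokens[i:j]."""
        if symbol not in productions:      # terminal: must match exactly one token
            return int(j == i + 1 and tokens[i] == symbol)
        return sum(derive_sequence(rhs, i, j) for rhs in productions[symbol])

    @lru_cache(maxsize=None)
    def derive_sequence(rhs, i, j):
        """Number of ways the symbol sequence `rhs` derives tokens[i:j]."""
        if not rhs:
            return int(i == j)
        total = 0
        for k in range(i, j + 1):
            left = derive_symbol(rhs[0], i, k)
            if left:                       # skip the rest when the first symbol cannot match
                total += left * derive_sequence(rhs[1:], k, j)
        return total

    return derive_symbol(start, 0, len(tokens))


# Dangling-else grammar from the slide, in the simplified token encoding.
dangling = {"stmt": [("if", "stmt"),
                     ("if", "stmt", "else", "stmt"),
                     ("s",)]}

# if (expr1) if (expr2) f(); else g();
print(count_parse_trees(dangling, "stmt", ["if", "if", "s", "else", "s"]))  # 2
# if (expr1) f(); else g();
print(count_parse_trees(dangling, "stmt", ["if", "s", "else", "s"]))        # 1
```

A count of 2 for the nested statement means the grammar assigns it two different parse trees, one for each possible owner of the else.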

  6. Famous programming language ambiguity

Two parse trees for: if (expr1) if (expr2) f(); else g();
- Tree 1: stmt → if ( expr ) stmt else stmt, with expr1 as the condition and g(x); as the else branch. The inner stmt is if ( expr ) stmt, with expr2 and f(x);. In this derivation, the else belongs to the 1st if.
- Tree 2: stmt → if ( expr ) stmt, with expr1 as the condition. The inner stmt is if ( expr ) stmt else stmt, with expr2, f(x);, and g(x);. In this derivation, the else belongs to the 2nd if.

A way to fix this
  <stmt> → <matched> | <unmatched>
  <matched> → if (<expr>) <matched> else <matched> |
              <otherstmt>
  <unmatched> → if (<expr>) <matched> |
                if (<expr>) <unmatched> |
                if (<expr>) <matched> else <unmatched>
- <matched> represents if statements with a matching else.
- <unmatched> represents if statements with at least 1 unmatched if.

Note what productions are not defined:
  <stmt> → if (<expr>) <unmatched> else <matched>
  <stmt> → if (<expr>) <unmatched> else <unmatched>
- <unmatched> cannot come between if and else.

Parse trees under the new grammar for: if (expr1) if (expr2) f(); else g();
- NOT ALLOWED: stmt → matched → if ( expr ) unmatched else stmt, with expr1 as the condition and g(x); as the else branch, where unmatched → if ( expr ) stmt with expr2 and f(x);. In this derivation, the else belongs to the 1st if.
- Allowed: stmt → unmatched → if ( expr ) matched, with expr1 as the condition, where matched → if ( expr ) s1 else s1 with expr2, f(x);, and g(x);. Here the else belongs to the 2nd if.
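The effect of the matched/unmatched grammar can be checked by counting parse trees. This is a sketch under assumptions, not the lecture's code: the `count_parse_trees` helper and the token encoding are made up for the example, with "if" standing for `if (<expr>)` and "s" standing for `<otherstmt>`. The counter assumes the grammar never derives a variable from itself over the same token span, which holds for both grammars here.

```python
import itertools
from functools import lru_cache


def count_parse_trees(productions, start, tokens):
    """Count the parse trees that derive `tokens` from `start`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def derive_symbol(symbol, i, j):
        """Number of ways `symbol` derives tokens[i:j]."""
        if symbol not in productions:      # terminal: must match exactly one token
            return int(j == i + 1 and tokens[i] == symbol)
        return sum(derive_sequence(rhs, i, j) for rhs in productions[symbol])

    @lru_cache(maxsize=None)
    def derive_sequence(rhs, i, j):
        """Number of ways the symbol sequence `rhs` derives tokens[i:j]."""
        if not rhs:
            return int(i == j)
        total = 0
        for k in range(i, j + 1):
            left = derive_symbol(rhs[0], i, k)
            if left:                       # skip the rest when the first symbol cannot match
                total += left * derive_sequence(rhs[1:], k, j)
        return total

    return derive_symbol(start, 0, len(tokens))


# Original ambiguous grammar and the matched/unmatched rewrite.
ambiguous = {"stmt": [("if", "stmt"), ("if", "stmt", "else", "stmt"), ("s",)]}
fixed = {
    "stmt": [("matched",), ("unmatched",)],
    "matched": [("if", "matched", "else", "matched"), ("s",)],
    "unmatched": [("if", "matched"), ("if", "unmatched"),
                  ("if", "matched", "else", "unmatched")],
}

# if (expr1) if (expr2) f(); else g();
nested = ["if", "if", "s", "else", "s"]
print(count_parse_trees(fixed, "stmt", nested))  # 1

# Largest number of parse trees over all token sequences of length 1..6.
worst = max(count_parse_trees(fixed, "stmt", seq)
            for n in range(1, 7)
            for seq in itertools.product(["if", "else", "s"], repeat=n))
print(worst)
```

The single remaining tree for the nested statement is the one in which the else belongs to the 2nd if, since `<unmatched>` cannot appear between if and else.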

  7. Summary

- All regular languages are CFLs.
- Use regular language operations in constructing CFGs.
- CFGs in compiler design.
