CFLs and Regular Languages

We can show that every regular language (RL) is also a context-free language (CFL).
We will show this using only regular expressions and context-free grammars (CFGs). That is what we will do in this half.
Note: Much of this lecture is not in the text!

We will show that all regular languages are CFLs:
- If L1 and L2 are CFLs, then
  - L1 ∪ L2 is a CFL
  - L1 L2 is a CFL
  - L1* is a CFL
- With the above shown, the claim that every regular language is also a CFL follows from a basic inductive proof.

Union, Concatenation, and Kleene Star of CFLs

Formally, let L1 and L2 be CFLs. Then there exist CFGs
  G1 = (V1, T, S1, P1)
  G2 = (V2, T, S2, P2)
such that L(G1) = L1 and L(G2) = L2. Assume that V1 ∩ V2 = ∅.
We will define:
  Gu = (Vu, T, Su, Pu) such that L(Gu) = L1 ∪ L2
  Gc = (Vc, T, Sc, Pc) such that L(Gc) = L1 L2
  Gk = (Vk, T, Sk, Pk) such that L(Gk) = L1*

Union: basic idea
Define the new CFG so that we can either
- start with the start variable of G1 and follow the production rules of G1, or
- start with the start variable of G2 and follow the production rules of G2.
The first case will derive a string in L1. The second case will derive a string in L2.

Union: formally
  Gu = (Vu, T, Su, Pu)
  Vu = V1 ∪ V2 ∪ {Su}
  Su is a new start variable, not in V1 ∪ V2
  Pu = P1 ∪ P2 ∪ {Su → S1 | S2}
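The union construction can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the encoding of a grammar as a (start, productions) pair, the helper name union_grammar, and the two sample grammars are assumptions made for the example.

```python
# A grammar is a pair (start, productions). productions maps each variable to
# a list of right-hand sides; each right-hand side is a tuple of symbols and
# the empty tuple () stands for λ. (This encoding is an assumption.)

def union_grammar(g1, g2, new_start="Su"):
    """Build Gu with Pu = P1 ∪ P2 ∪ {Su → S1 | S2}, assuming V1 ∩ V2 = ∅."""
    s1, p1 = g1
    s2, p2 = g2
    # The construction needs disjoint variable sets and a fresh start variable.
    assert not set(p1) & set(p2), "variable sets must be disjoint"
    assert new_start not in p1 and new_start not in p2, "start variable must be new"
    productions = {**p1, **p2, new_start: [(s1,), (s2,)]}
    return new_start, productions

# Hypothetical sample grammars: G1 generates {0^n 1^n | n >= 0}, G2 generates {1^m | m > 0}.
g1 = ("A", {"A": [("0", "A", "1"), ()]})
g2 = ("B", {"B": [("1", "B"), ("1",)]})

start, productions = union_grammar(g1, g2)
for variable, bodies in productions.items():
    print(variable, "→", " | ".join(" ".join(body) or "λ" for body in bodies))
```

The only new material in the result is the start variable and its two unit productions; every production of G1 and G2 is carried over unchanged, which is why disjoint variable sets matter.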
Concatenation: general idea
Define the new CFG so that
- we force a derivation starting from the start variable of G1, using the rules of G1;
- after that, we force a derivation starting from the start variable of G2, using the rules of G2.

Concatenation: formally
  Gc = (Vc, T, Sc, Pc)
  Vc = V1 ∪ V2 ∪ {Sc}
  Sc is a new start variable, not in V1 ∪ V2
  Pc = P1 ∪ P2 ∪ {Sc → S1 S2}

Kleene star: general idea
Define the new CFG so that
- we can repeatedly concatenate derivations of strings in L1;
- since L1* contains λ, we must be careful to ensure that the new CFG has productions that let λ be derived from the start variable.

Kleene star: formally
  Gk = (Vk, T, Sk, Pk)
  Vk = V1 ∪ {Sk}
  Sk is a new start variable, not in V1
  Pk = P1 ∪ {Sk → S1 Sk | λ}

CFLs and Regular Languages

Now we can complete the proof, using an inductive proof.

Regular Expression

Recursive definition of regular languages / expressions over Σ:
1. ∅ is a regular language and its regular expression is ∅
2. {λ} is a regular language and λ is its regular expression
3. For each a ∈ Σ, {a} is a regular language and its regular expression is a
4. If L1 and L2 are regular languages with regular expressions r1 and r2, then
   - L1 ∪ L2 is a regular language with regular expression (r1 + r2)
   - L1 L2 is a regular language with regular expression (r1 r2)
   - L1* is a regular language with regular expression (r1*)
Only languages obtainable by using rules 1-4 are regular languages.

RE -> CFG: base cases
1. ∅ can be expressed as a CFG with no productions
2. {λ} can be expressed by a CFG with the single production S → λ
3. For each a ∈ Σ, {a} can be expressed by a CFG with the single production S → a

RE -> CFG: inductive step
Assume R1 and R2 are regular expressions that describe languages L1 and L2. Then, by the induction hypothesis, L1 and L2 are CFLs, and as such there are CFGs that describe L1 and L2.
Create CFGs that describe the languages:
- L1 ∪ L2
- L1 L2
- L1*
Which we just did… We are done!

[Figure: nested regions showing Finite Languages inside Regular Languages inside Context Free Languages]

What have we learned?
- CFLs are closed under union, concatenation, and Kleene star
- Every regular language is also a CFL
- We now have an algorithm that, given a regular expression, constructs a CFG that describes the same language

Example: find a CFG for L = (011 + 1)*(01)*

(011 + 1) can be described by the CFG with productions:
  A → 011 | 1
(011 + 1)* can be described by the CFG with productions:
  B → AB | λ
  A → 011 | 1
Example (continued): find a CFG for L = (011 + 1)*(01)*

(01) can be described by the CFG with the production:
  D → 01
(01)* can be described by the CFG with productions:
  C → DC | λ
  D → 01
Putting it all together, (011 + 1)*(01)* can be described by the CFG with productions:
  S → BC
  B → AB | λ
  A → 011 | 1
  C → DC | λ
  D → 01

Questions?

Union, Concatenation, and Kleene Star of CFLs

You can use the proofs of the closure properties when building CFGs.

Example: find a CFG for L = {0^i 1^j 0^k | j > i + k}
- The number of 1s is greater than the combined number of 0s.
- This language can be expressed as L = {0^i 1^i 1^m 1^k 0^k | m > 0}.
- This is the concatenation of three languages, L = L1 L2 L3, where
    L1 = {0^i 1^i | i ≥ 0}
    L2 = {1^m | m > 0}
    L3 = {1^k 0^k | k ≥ 0}

CFG for L1 = {0^i 1^i | i ≥ 0}:
  A → 0A1 | λ
CFG for L2 = {1^m | m > 0}:
  B → 1B | 1
CFG for L3 = {1^k 0^k | k ≥ 0}:
  C → 1C0 | λ
CFG for L:
  S → ABC
  A → 0A1 | λ
  B → 1B | 1
  C → 1C0 | λ

Formally, G = (V, T, S, P) where
  V = {S, A, B, C}
  T = {0, 1}
  P = { S → ABC,
        A → 0A1 | λ,
        B → 1B | 1,
        C → 1C0 | λ }
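One way to gain confidence in the grammar for L = {0^i 1^j 0^k | j > i + k} is to enumerate every short string it derives and compare the result with the definition of L. The sketch below does this in Python. The single-character encoding of symbols, the helper names strings_up_to and in_target_language, and the length bound of 8 are assumptions made for illustration, and a bounded check is evidence rather than a proof.

```python
import re
from itertools import product

# Productions of the example grammar. Each body is a string of symbols;
# keys of the dict are variables, 0 and 1 are terminals, "" encodes λ.
PRODUCTIONS = {
    "S": ["ABC"],
    "A": ["0A1", ""],
    "B": ["1B", "1"],
    "C": ["1C0", ""],
}

def strings_up_to(productions, start, max_len):
    """Return every terminal string of length <= max_len derivable from start.

    Works by fixed-point iteration: the set of strings known for each
    variable grows until no production contributes anything new.
    """
    lang = {variable: set() for variable in productions}
    changed = True
    while changed:
        changed = False
        for variable, bodies in productions.items():
            for body in bodies:
                # Concatenate one known string per symbol of the body.
                partial = {""}
                for symbol in body:
                    choices = lang[symbol] if symbol in productions else {symbol}
                    partial = {p + c for p in partial for c in choices
                               if len(p) + len(c) <= max_len}
                if not partial <= lang[variable]:
                    lang[variable] |= partial
                    changed = True
    return lang[start]

def in_target_language(w):
    """Membership in {0^i 1^j 0^k | j > i + k}, straight from the definition."""
    match = re.fullmatch(r"(0*)(1*)(0*)", w)
    if match is None:
        return False
    i, j, k = (len(group) for group in match.groups())
    return j > i + k

MAX_LEN = 8
derived = strings_up_to(PRODUCTIONS, "S", MAX_LEN)
expected = {w
            for n in range(MAX_LEN + 1)
            for w in map("".join, product("01", repeat=n))
            if in_target_language(w)}
print(derived == expected)
```

The fixed-point formulation is used here because it terminates for any grammar once a length bound is set, including grammars with λ-productions.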
Questions?

Practical uses for grammars

How a compiler works:
  source file → lexer → stream of tokens → parser → parse tree → codegen → object code

Theory Hall of Fame: The Bell Labs Gang
- Eric E. Schmidt and Mike Lesk: lex
- Ken Thompson: regular expressions in UNIX / grep / vi
- Stephen C Johnson: yacc

A real practical example: grammars for programming languages
  <stmt> → … | <for-stmt> | <if-stmt> | …
  <stmt> → { <stmt> <stmt> } | ε
  <if-stmt> → if ( <expr> ) then <stmt>
  <for-stmt> → for ( <expr> ; <expr> ; <expr> ) <stmt>
- Keywords and punctuation are terminals
- Program constructs are variables
- Production rules define the syntax of the language
This is really the second step in building a compiler!

Famous programming language ambiguity: dangling else
  <stmt> → if (<expr>) <stmt> |
           if (<expr>) <stmt> else <stmt> |
           <some_other_stmt>

  if (expr1) if (expr2) f(); else g();

To which if does the else belong?
Famous programming language ambiguity

  if (expr1) if (expr2) f(); else g();

[Parse tree 1: stmt expands to if ( expr ) stmt else stmt, with expr1 as the condition, the inner stmt if ( expr ) stmt (condition expr2, body f(x);) before the else, and g(x); after it.]
In this derivation, the else belongs to the 1st if.

[Parse tree 2: stmt expands to if ( expr ) stmt, with expr1 as the condition; the inner stmt expands to if ( expr ) stmt else stmt, with condition expr2, f(x); before the else and g(x); after it.]

A way to fix this
  <stmt> → <matched> | <unmatched>
  <matched> → if (<expr>) <matched> else <matched> |
              <otherstmt>
  <unmatched> → if (<expr>) <matched> |
                if (<expr>) <unmatched> |
                if (<expr>) <matched> else <unmatched>
- <matched> represents if statements with a matching else
- <unmatched> represents if statements with at least 1 unmatched if

Note what productions are not defined:
  <stmt> → if (<expr>) <unmatched> else <matched>
  <stmt> → if (<expr>) <unmatched> else <unmatched>
<unmatched> cannot come between if and else.

  if (expr1) if (expr2) f(); else g();

[Parse tree 1 with the new grammar: stmt expands to matched, then to if ( expr ) unmatched else stmt, with expr1 as the condition, the inner if ( expr ) stmt (condition expr2, body f(x);) before the else, and g(x); after it. NOT ALLOWED.]
In this derivation, the else belongs to the 1st if.

[Parse tree 2 with the new grammar: stmt expands to unmatched, then to if ( expr ) matched, with expr1 as the condition; the inner matched expands to if ( expr ) s1 else s1, with condition expr2, f(x); before the else and g(x); after it.]
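The effect of the matched/unmatched rewrite can be checked by counting parse trees for the dangling-else statement under both grammars. The Python sketch below is an illustration only. It simplifies the token stream (parentheses are dropped, and the tokens expr and other stand for any expression and any non-if statement), and the helper name count_parses is an assumption, not something from the lecture.

```python
from functools import lru_cache

# Original grammar: one variable, two competing if-productions.
AMBIGUOUS = {
    "stmt": [("if", "expr", "stmt"),
             ("if", "expr", "stmt", "else", "stmt"),
             ("other",)],
}

# Rewritten grammar from the slides, with the same simplified tokens.
UNAMBIGUOUS = {
    "stmt": [("matched",), ("unmatched",)],
    "matched": [("if", "expr", "matched", "else", "matched"),
                ("other",)],
    "unmatched": [("if", "expr", "matched"),
                  ("if", "expr", "unmatched"),
                  ("if", "expr", "matched", "else", "unmatched")],
}

def count_parses(grammar, start, tokens):
    """Count the distinct parse trees deriving tokens from start.

    Assumes the grammar has no λ-productions and no left recursion,
    which holds for both grammars above.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_symbol(symbol, i, j):
        if symbol not in grammar:  # terminal: must match exactly one token
            return 1 if j == i + 1 and tokens[i] == symbol else 0
        return sum(count_body(body, i, j) for body in grammar[symbol])

    @lru_cache(maxsize=None)
    def count_body(body, i, j):
        if not body:
            return 1 if i == j else 0
        head, rest = body[0], body[1:]
        total = 0
        for k in range(i, j + 1):
            left = count_symbol(head, i, k)
            if left:  # only recurse on the rest when the head can match
                total += left * count_body(rest, k, j)
        return total

    return count_symbol(start, 0, len(tokens))

# if (expr1) if (expr2) f(); else g();
TOKENS = "if expr if expr other else other".split()
print(count_parses(AMBIGUOUS, "stmt", TOKENS))    # two trees: the else can attach to either if
print(count_parses(UNAMBIGUOUS, "stmt", TOKENS))  # one tree: the else attaches to the nearest if
```

Counting trees rather than building them keeps the sketch short; a grammar is ambiguous on an input exactly when this count exceeds one.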
Summary
- All regular languages are CFLs
- Use regular language operations in constructing CFGs
- CFGs in compiler design