context free languages

Context-Free Languages, Wen-Guey Tzeng, Department of Computer Science, National Chiao Tung University


  1. Context-Free Languages Wen-Guey Tzeng Department of Computer Science National Chiao Tung University

  2. Context-Free Grammars
     • A grammar G = (V, T, S, P) is context-free if all productions in P are of the form A → x, where A ∈ V and x ∈ (V ∪ T)*
       – The left side is a single variable.
     • Example: G = ({S, A, B}, {a, b}, S, {S → aAb | bBa, A → aAb | λ, B → bBb | λ}), where λ denotes the empty string
     2017 Spring

  3. • Derivation: [figure not recovered]

  4. • L(G) = {w ∈ T* | S ⇒* w}
     • A language L is context-free if and only if there is a context-free grammar G such that L = L(G).
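
The definition of L(G) can be made concrete by enumerating short sentences. The following is a minimal sketch, not part of the slides: it encodes the example grammar S → aAb | bBa, A → aAb | λ, B → bBb | λ with uppercase letters as variables and `""` for λ. The names `PRODUCTIONS` and `generate` are illustrative. The pruning rule is enough for this grammar because every sentential form contains at most one variable; a grammar with a production such as S → SS would need a stronger bound.

```python
from collections import deque

# Example grammar: S -> aAb | bBa, A -> aAb | λ, B -> bBb | λ.
# Uppercase letters are variables, lowercase letters are terminals,
# and the empty string "" stands for λ.
PRODUCTIONS = {
    "S": ["aAb", "bBa"],
    "A": ["aAb", ""],
    "B": ["bBb", ""],
}

def generate(productions, start, max_len):
    """Return every w in L(G) with |w| <= max_len, using leftmost derivations."""
    words, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost variable of the sentential form.
        idx = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if idx is None:
            words.add(form)  # no variables left: the form is a sentence
            continue
        for rhs in productions[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # Terminals are never erased, so a form with too many is hopeless.
            if sum(ch.islower() for ch in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(words, key=lambda w: (len(w), w))

print(generate(PRODUCTIONS, "S", 4))
```

Every string returned satisfies S ⇒* w, so the list is a finite slice of L(G).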

  5. Examples
     • G = ({S}, {a, b}, S, P), with P = {S → aSa | bSb | λ}
       – S ⇒ aSa ⇒ aaSaa ⇒ aabSbaa ⇒ aabλbaa = aabbaa
       – L(G) = {ww^R : w ∈ {a, b}*}

  6. • S → abB, A → aaBb | λ, B → bbAa
       – L(G) = {ab(bbaa)^n bba(ba)^n : n ≥ 0} ?

  7. Design CFGs
     • Give a CFG for L = {a^n b^n : n ≥ 0}

  8. • Give a CFG for L = {a^n b^m : n > m}

  9. • Give a CFG for L = {a^n b^m : n ≠ m ≥ 0}
       – Idea 1:
         • Split L into two cases (in general not necessarily disjoint): L = L_1 ∪ L_2, where L_1 = {a^n b^m : n > m} and L_2 = {a^n b^m : n < m}.
         • Then construct productions for L_1 and L_2 separately.

  10. • Give a CFG for L = {a^n b^m : n ≠ m ≥ 0}
        – Idea 2:
          • Produce the same number of a's and b's, then add extra a's or extra b's

  11. • Give a CFG for L = {a^n b^m c^k : m = n + k}
        – Match 'a' with 'b', and 'b' with 'c'
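
One possible answer to this exercise, offered as a sketch rather than as the slides' intended solution, is S → AB, A → aAb | λ, B → bBc | λ: A matches each 'a' with a 'b', and B matches each remaining 'b' with a 'c'. The code below compares that candidate grammar against the set definition on all short strings. The function names are illustrative.

```python
import re

def in_language(w):
    """Reference definition: w = a^n b^m c^k with m = n + k."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    return bool(m) and len(m.group(2)) == len(m.group(1)) + len(m.group(3))

def derivable(w):
    """Membership via the candidate grammar S -> AB, A -> aAb | λ, B -> bBc | λ."""
    def is_A(u):  # A generates exactly the strings a^i b^i
        h = len(u) // 2
        return len(u) % 2 == 0 and u == "a" * h + "b" * h

    def is_B(v):  # B generates exactly the strings b^j c^j
        h = len(v) // 2
        return len(v) % 2 == 0 and v == "b" * h + "c" * h

    # S -> AB: try every split point of w.
    return any(is_A(w[:i]) and is_B(w[i:]) for i in range(len(w) + 1))

print(derivable("abbc"), derivable("abc"))
```

Agreement on short strings does not prove the grammar correct, but it catches most design mistakes quickly.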

  12. • Give a CFG for L = {a^n b^m c^k : m > n + k}

  13. • Give a CFG for L = {w ∈ {a, b}* : n_a(w) = n_b(w)}
        – Find the "recursion"
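
A candidate grammar for this exercise is S → SS | aSb | bSa | λ, the same grammar used in the exhaustive-search example later in the deck. As a hedged check rather than a proof, the sketch below decides S ⇒* w by trying each production recursively and compares the result with a direct count of a's and b's. The name `derives` is illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def derives(w):
    """Decide S =>* w for the grammar S -> SS | aSb | bSa | λ."""
    if w == "":
        return True                                   # S -> λ
    if len(w) >= 2 and {w[0], w[-1]} == {"a", "b"} and derives(w[1:-1]):
        return True                                   # S -> aSb or S -> bSa
    # S -> SS with both parts non-empty (an empty part adds nothing new).
    return any(derives(w[:i]) and derives(w[i:]) for i in range(1, len(w)))

for w in ["abba", "aab", "baab"]:
    print(w, derives(w), w.count("a") == w.count("b"))
```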

  14. • Give a CFG for L = {w ∈ {a, b}* : n_a(w) > n_b(w)}
        – Find the relation with other languages
        – Consider strings starting with 'a' and strings starting with 'b' separately

  15. Leftmost and rightmost derivation
      • G = ({A, B, S}, {a, b}, S, P), where P contains S → AB, A → aaA, A → λ, B → Bb, B → λ
        – L(G) = {a^(2n) b^m : n, m ≥ 0}
      • For string aab
        – Rightmost derivation
        – Leftmost derivation
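
The two derivations of aab are not spelled out in the transcript. The sketch below reconstructs them from the productions S → AB, A → aaA, A → λ, B → Bb, B → λ. The helper `derive` is illustrative and not from the slides: it applies a list of productions while always rewriting the leftmost (or rightmost) variable, and asserts that the chosen variable is the expected one.

```python
def derive(start, steps, leftmost=True):
    """Apply productions in order, rewriting the leftmost (or rightmost) variable.

    steps is a list of (variable, right-hand side) pairs; "" stands for λ.
    """
    form, trace = start, [start]
    for var, rhs in steps:
        positions = [i for i, ch in enumerate(form) if ch.isupper()]
        i = positions[0] if leftmost else positions[-1]
        assert form[i] == var, f"expected {var} at position {i} of {form}"
        form = form[:i] + rhs + form[i + 1:]
        trace.append(form)
    return " => ".join(t or "λ" for t in trace)

LEFT = [("S", "AB"), ("A", "aaA"), ("A", ""), ("B", "Bb"), ("B", "")]
RIGHT = [("S", "AB"), ("B", "Bb"), ("B", ""), ("A", "aaA"), ("A", "")]

print(derive("S", LEFT, leftmost=True))    # S => AB => aaAB => aaB => aaBb => aab
print(derive("S", RIGHT, leftmost=False))  # S => AB => ABb => Ab => aaAb => aab
```

Both derivations use the same five productions and differ only in the order of application.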

  16. Derivation (parse) tree
      • A → abABc [tree figure not recovered]

  17. • S → aAB, A → bBb, B → A | λ

  18. Some comments
      • A derivation tree does not record the order in which productions are applied
      • Leftmost and rightmost derivations correspond to depth-first traversals of the tree
      • Derivation trees and derivation order are very important in programming languages and compiler design

  19. Grammar for C

  20. main()
      {
          int i = 1;
          printf("i starts out life as %d.", i);
          i = add(1, 1);  /* Function call */
          printf(" And becomes %d after function is executed.\n", i);
      }

  21. Parsing and ambiguity
      • Parsing of w ∈ L(G): find a sequence of productions by which w is derived.
      • Questions: given G and w
        – Is w ∈ L(G)? (membership problem)
        – Is there an efficient way to determine whether w ∈ L(G)?
        – How is w ∈ L(G) parsed? (build the parse tree)
        – Is the parse unique?

  22. Exhaustive search / top-down parsing
      • S → SS | aSb | bSa | λ
      • Determine whether aabb ∈ L(G)
        – 1st round: (1) S ⇒ SS; (2) S ⇒ aSb; (3) S ⇒ bSa; (4) S ⇒ λ
        – 2nd round:
          • From (1): S ⇒ SS ⇒ SSS, S ⇒ SS ⇒ aSbS, S ⇒ SS ⇒ bSaS, S ⇒ SS ⇒ S
          • From (2): S ⇒ aSb ⇒ aSSb, S ⇒ aSb ⇒ aaSbb, S ⇒ aSb ⇒ abSab, S ⇒ aSb ⇒ ab
        – 3rd round: …
      • Drawback: inefficiency
      • Other ways?

  23. • If there are no productions of the form A → λ or A → B, the exhaustive search for w ∈ L(G) takes at most |P| + |P|^2 + … + |P|^(2|w|) = O(|P|^(2|w|+1)) steps
        – Consider the leftmost parsing method.
        – w can be obtained within 2|w| derivation steps.
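
The bound above can be illustrated with a small breadth-first search. This is a sketch under stated assumptions, not the slides' own procedure: the grammar S → SS | aSb | bSa | ab | ba is a λ-free variant chosen here for illustration, and `GRAMMAR` and `exhaustive_parse` are hypothetical names. Because no production is of the form A → λ or A → B, every step either lengthens the sentential form or increases its number of terminals, so 2|w| rounds suffice and forms longer than w can be discarded.

```python
# λ-free, unit-free grammar for the non-empty strings with equally many a's and b's.
GRAMMAR = {"S": ["SS", "aSb", "bSa", "ab", "ba"]}

def exhaustive_parse(productions, start, w):
    """Breadth-first search over leftmost derivations; True iff w is derived.

    Assumes no productions A -> λ or A -> B.
    """
    forms = {start}
    for _ in range(2 * len(w)):            # at most 2|w| rounds are needed
        next_forms = set()
        for form in forms:
            idx = next((i for i, ch in enumerate(form) if ch.isupper()), None)
            if idx is None:
                continue                   # a sentence other than w: dead end
            for rhs in productions[form[idx]]:
                new = form[:idx] + rhs + form[idx + 1:]
                if new == w:
                    return True
                # Forms never shrink, and the terminal prefix must agree with w.
                if len(new) <= len(w) and w.startswith(new[:idx]):
                    next_forms.add(new)
        forms = next_forms
    return False

print(exhaustive_parse(GRAMMAR, "S", "aabb"))  # True
print(exhaustive_parse(GRAMMAR, "S", "aab"))   # False
```

The pruning keeps the search finite, but the number of surviving forms can still grow exponentially in |w|, which is the inefficiency the slides point out.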

  24. Bottom-up parsing
      • Reduce a string w to the start variable S
      • S → aSb | λ
        – w = aabb ⇐ aaSbb ⇐ aSb ⇐ S
      • Efficiency: O(|w|^3)
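
The slides do not name the algorithm behind the O(|w|^3) bound. One well-known bottom-up method with that running time is the CYK algorithm, which requires a grammar in Chomsky normal form. The sketch below swaps in CYK as an assumption and uses a CNF grammar for {a^n b^n : n ≥ 1} (S → AB | AC, C → SB, A → a, B → b), because the λ-production of the slide's grammar is not allowed in CNF. The names `cyk`, `UNIT` and `PAIR` are illustrative.

```python
UNIT = [("A", "a"), ("B", "b")]                          # productions A -> a
PAIR = [("S", "A", "B"), ("S", "A", "C"), ("C", "S", "B")]  # productions A -> BC

def cyk(w, unit_rules, pair_rules, start="S"):
    """CYK membership test for a grammar in Chomsky normal form."""
    n = len(w)
    if n == 0:
        return False  # this sketch ignores the special case S -> λ
    # table[i][l] = set of variables deriving the substring w[i : i + l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(w):
        table[i][1] = {A for A, a in unit_rules if a == ch}
    for length in range(2, n + 1):              # substring length
        for i in range(n - length + 1):         # start position
            for split in range(1, length):      # length of the left part
                left = table[i][split]
                right = table[i + split][length - split]
                for A, B, C in pair_rules:
                    if B in left and C in right:
                        table[i][length].add(A)
    return start in table[0][n]

print(cyk("aabb", UNIT, PAIR))  # True
print(cyk("aab", UNIT, PAIR))   # False
```

The three nested loops over length, start position and split point account for the cubic dependence on |w|.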

  25. Linear-time parsing
      • Simple grammar (s-grammar)
        – All productions are of the form A → ax, where x ∈ (V ∪ T)*
        – Any pair (A, a) occurs at most once in P.
      • Example: S → aS | bSS | c
        – Parsing for ababccc
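
Because each pair (A, a) selects at most one production, an s-grammar can be parsed in one left-to-right pass with a stack. The following is a minimal sketch for the example S → aS | bSS | c. The dictionary encoding and the name `parse_s_grammar` are assumptions made for illustration.

```python
# (variable, terminal) -> x for each production A -> ax; "" means x is empty.
RULES = {("S", "a"): "S", ("S", "b"): "SS", ("S", "c"): ""}

def parse_s_grammar(rules, start, w):
    """Return the leftmost derivation of w as a list of sentential forms, or None."""
    stack = [start]      # pending symbols; the top of the stack is the leftmost one
    steps = [start]
    done = ""            # terminals matched so far
    for ch in w:
        if not stack:
            return None                      # input left over
        top = stack.pop()
        if top.isupper():
            x = rules.get((top, ch))
            if x is None:
                return None                  # no production for (top, ch)
            stack.extend(reversed(x))        # push x so its first symbol is on top
            steps.append(done + ch + "".join(reversed(stack)))
        elif top != ch:
            return None                      # terminal mismatch
        done += ch
    return steps if not stack else None

print(" => ".join(parse_s_grammar(RULES, "S", "ababccc")))
```

Each input symbol triggers one dictionary lookup, so the number of derivation steps is at most |w|.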

  26. Ambiguous grammars
      • G is ambiguous if some w ∈ L(G) has two or more distinct derivation trees.
      • Example: S → aSb | SS | λ

  27. Example from programming languages
      • C-like grammar for arithmetic expressions: G = ({E, I}, {a, b, c, +, x, (, )}, E, P), where P contains
        E → I
        E → E+E
        E → ExE
        E → (E)
        I → a | b | c
      • w = a+bxc has two derivation trees
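
The claim that a+bxc has two derivation trees can be checked by counting. The sketch below is illustrative, not from the slides: it counts the derivation trees of a string under E → I | E+E | ExE | (E), I → a | b | c by trying every operator position as the root of the tree. The name `count_trees` is an assumption.

```python
from functools import lru_cache

def count_trees(w):
    """Number of derivation trees for w under E -> I | E+E | ExE | (E), I -> a|b|c."""
    @lru_cache(maxsize=None)
    def count(i, j):                      # trees deriving w[i:j] from E
        if j - i < 1:
            return 0
        total = 0
        if j - i == 1 and w[i] in "abc":
            total += 1                    # E -> I, then I -> a | b | c
        if j - i >= 3 and w[i] == "(" and w[j - 1] == ")":
            total += count(i + 1, j - 1)  # E -> (E)
        for k in range(i + 1, j - 1):     # root operator for E -> E+E | ExE
            if w[k] in "+x":
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(w))

print(count_trees("a+bxc"))    # 2
print(count_trees("(a+b)xc"))  # 1
```

The two trees for a+bxc correspond to the groupings (a+b)xc and a+(bxc); adding parentheses to the input leaves only one.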

  29. Ambiguous languages
      • A CFL L is inherently ambiguous if every CFG G with L(G) = L is ambiguous. Otherwise, it is unambiguous.
      • Note: an unambiguous language may have an ambiguous grammar.
      • Example: L = {a^n b^n c^m} ∪ {a^n b^m c^m} is inherently ambiguous.
        – Hard to prove.

  30. CFG and Programming Languages
      • Programming language: syntax + semantics
      • Syntax is defined by a grammar G
        – <expression> ::= <term> | <expression> + <term>
          <term> ::= <factor> | <term> * <factor>
        – <while_statement> ::= while <expression> <statement>
      • Syntax checking in compilers is done by a parser
        – Is a program p grammatically correct?
        – Is p ∈ L(G)?
        – We need efficient parsers.
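
The two expression rules can be turned into a small hand-written parser. This is a hedged sketch: the slides do not define <factor>, so the code assumes a factor is a single alphanumeric token, and it replaces the left recursion of the BNF with loops, a standard rewriting that keeps + and * left-associative. All function names are illustrative.

```python
def parse_expression(tokens):
    """Parse a token list with the <expression>/<term> rules and return a tree.

    Assumption (not in the source): <factor> is a single alphanumeric token.
    """
    pos = 0

    def factor():
        nonlocal pos
        if pos < len(tokens) and tokens[pos].isalnum():
            pos += 1
            return tokens[pos - 1]
        raise SyntaxError(f"factor expected at token {pos}")

    def term():                      # <term> ::= <factor> | <term> * <factor>
        nonlocal pos
        node = factor()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def expression():                # <expression> ::= <term> | <expression> + <term>
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("+", node, term())
        return node

    tree = expression()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse_expression(list("a+b*c")))  # ('+', 'a', ('*', 'b', 'c'))
```

Because <term> sits below <expression> in the grammar, * binds tighter than + without any extra precedence table, and each input has exactly one tree.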

  31. Restricted CFGs for Programming Languages
      • Goal:
        – Its expressive power is sufficient.
        – It has no ambiguity. For example (the dangling else), "if then if then else" can be read in two ways:
          • if then "if then else"
          • if then "if then" else
        – There exists an efficient parser.

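  32. • C -- LR(1)
      • PASCAL -- LL(1)
      • Hierarchy of classes of context-free languages
        – LL(1) ⊂ LR(1) = DCFL ⊂ CFL, and LR(0) ⊂ LR(1)
        – As language classes, LR(k) = LR(1) for every k ≥ 1; a larger k only enlarges the class of grammars, LR(1) ⊂ LR(2) ⊂ …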

  33. Syntactic Correctness
      • Lexical analyzer produces a stream of tokens: x = y + 2.1 ⇒ <id> <op> <id> <op> <real>
      • Parser (syntactic analyzer) verifies that this token stream is syntactically correct by constructing a valid parse tree for the entire program
        – Unique parse tree for each language construct
        – Program = collection of parse trees rooted at the top by a special start symbol

  34. CFG For Floating Point Numbers
      • ::= stands for production rule; <…> are non-terminals; | represents alternatives for the right-hand side of a production rule
      • Sample parse tree: [figure not recovered]

  35. CFG For Balanced Parentheses
      • Could we write this grammar using regular expressions or a DFA? Why?
      • Sample derivation: <balanced> ⇒ ( <balanced> ) ⇒ (( <balanced> )) ⇒ (( <empty> )) ⇒ (( ))
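
The grammar itself did not survive in the transcript, so the sketch below assumes only the two productions visible in the sample derivation, <balanced> ::= ( <balanced> ) | <empty>. Under that assumption the language is the set of strings of n opening parentheses followed by n closing ones, and a recursive-descent recognizer needs one recursive call per nesting level. The name `is_nested` is illustrative.

```python
def is_nested(s):
    """Recognize the language of <balanced> ::= ( <balanced> ) | <empty>."""
    def balanced(i):
        """Return the index just past a <balanced> starting at i, or None."""
        if i < len(s) and s[i] == "(":          # <balanced> ::= ( <balanced> )
            j = balanced(i + 1)
            if j is not None and j < len(s) and s[j] == ")":
                return j + 1
            return None
        return i                                # <balanced> ::= <empty>
    return balanced(0) == len(s)

print(is_nested("(())"))  # True
print(is_nested("(()"))   # False
```

The recursion depth equals the nesting depth, which is unbounded. A DFA has only finitely many states and cannot count that high, which is why this language is context-free but not regular.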

  36. CFG For Decimal Numbers (Redux)
      • This grammar is right-recursive
      • Sample top-down leftmost derivation: <num> ⇒ <digit> <num> ⇒ 7 <num> ⇒ 7 <digit> <num> ⇒ 7 8 <num> ⇒ 7 8 <digit> ⇒ 7 8 9

  37. Compiler-compiler
      • A compiler-compiler is a program that generates a compiler from a defined grammar
      • A parser can be built automatically from the BNF description of the language's CFG
      • Tools: yacc, Bison

  38. [Diagram] A programming language grammar G = (V, T, S, P) is fed to a compiler-compiler, which produces a compiler (parser + code generator). The compiler translates a program into execution code, which maps input data to a result.
