Context Free Languages and Grammars, Lecture 7, September 18, 2018

  1. CS/ECE 374: Algorithms & Models of Computation, Fall 2018. Context Free Languages and Grammars. Lecture 7, September 18, 2018. Nikita Borisov (UIUC).

  5. Regular Languages. Regular expressions allow us to describe a class of languages compactly and precisely. Equivalence with DFAs shows the following: given any regular expression $r$, there is a very efficient algorithm for the language recognition problem for $L(r)$: given $w \in \Sigma^*$, is $w \in L(r)$? In fact, the running time of the algorithm is linear in $|w|$. Disadvantage of regular expressions/languages: they are too simple and cannot express interesting features, such as balanced parentheses, that we need in programming languages. No recursion is allowed, even in limited form.
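
The linear running time can be seen in a minimal sketch: once a DFA is available, recognition is a single left-to-right pass with one table lookup per symbol. The DFA below (binary strings with an even number of 1s) and the names `dfa_accepts` and `EVEN_ONES` are illustrative assumptions, not taken from the lecture.

```python
def dfa_accepts(delta, start, accepting, w):
    """Run a DFA on w: one transition per input symbol, so O(|w|) time."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in accepting


# Hypothetical example DFA: accepts strings over {0, 1} with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

print(dfa_accepts(EVEN_ONES, "even", {"even"}, "1001"))
print(dfa_accepts(EVEN_ONES, "even", {"even"}, "1000"))
```

The fixed, finite set of states is also the limitation: no such table can track an unbounded nesting depth.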

  6. Language classes: Chomsky Hierarchy. Generative models for languages based on grammars. [Figure: nested classes, Regular ⊂ Context Free ⊂ Context Sensitive ⊂ Recursively Enumerable ⊂ All.]

  7. Chomsky Hierarchy and Machines. For each class one can define a corresponding class of machines: Recursively Enumerable, TM (Turing machine); Context Sensitive, LBA (linear bounded automaton); Context Free, PDA (pushdown automaton); Regular, DFA (deterministic finite automaton).

  8. Programming Language Design. Question: What is a valid C program? Or a Python program? Question: Given a string $w$, what is an algorithm to check whether $w$ is a valid C program? This is the parsing problem.
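
For Python, the question can be tried directly. As a rough sketch, the standard library's `ast.parse` raises `SyntaxError` when a string is not syntactically valid Python. The helper name `is_valid_python` is made up for this illustration, and the check covers syntax only, not whether the program behaves correctly when run.

```python
import ast


def is_valid_python(source):
    """Return True if `source` parses as Python source code (syntax check only)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(is_valid_python("x = (1 + 2)"))
print(is_valid_python("x = (1 + 2"))
```

The second string fails because of an unmatched parenthesis, exactly the kind of nesting that regular expressions cannot express.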

  9. Context Free Languages and Grammars. Applications: programming language specification; parsing; natural language understanding; generative models giving structure; ... CFLs provide a good balance between expressivity and tractability. Limited form of recursion.

  10. Programming Languages

  11. Natural Language Processing

  12. Models of Growth: L-systems. http://www.kevs3d.co.uk/dev/lsystems/

  13. Kolam drawing generated by grammar

  17. Context Free Grammar (CFG) Definition. A CFG is a quadruple $G = (V, T, P, S)$, where:
      - $V$ is a finite set of non-terminal symbols;
      - $T$ is a finite set of terminal symbols (the alphabet);
      - $P$ is a finite set of productions, each of the form $A \to \alpha$ where $A \in V$ and $\alpha$ is a string in $(V \cup T)^*$. Formally, $P \subset V \times (V \cup T)^*$;
      - $S \in V$ is the start symbol.
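
The quadruple maps naturally onto a small record type. The sketch below is one possible encoding, with assumptions stated up front: the class name `CFG`, its field names, and the `is_well_formed` check are invented for illustration, and the check additionally requires $V$ and $T$ to be disjoint, a common convention that the definition above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset  # V
    terminals: frozenset     # T
    productions: frozenset   # P: pairs (A, alpha), alpha a tuple of symbols
    start: str               # S

    def is_well_formed(self):
        """Check S in V and P a subset of V x (V u T)*; V, T disjoint is an added assumption."""
        symbols = self.nonterminals | self.terminals
        return (
            self.nonterminals.isdisjoint(self.terminals)
            and self.start in self.nonterminals
            and all(
                head in self.nonterminals and all(x in symbols for x in body)
                for head, body in self.productions
            )
        )


# Illustrative grammar: S -> epsilon | 0 S 1 (the empty tuple encodes epsilon).
G = CFG(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"0", "1"}),
    productions=frozenset({("S", ()), ("S", ("0", "S", "1"))}),
    start="S",
)
print(G.is_well_formed())
```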

  20. Example. $V = \{S\}$, $T = \{a, b\}$, $P = \{S \to \epsilon \mid a \mid b \mid aSa \mid bSb\}$ (abbreviation for $S \to \epsilon$, $S \to a$, $S \to b$, $S \to aSa$, $S \to bSb$). A sample derivation: $S \Rightarrow aSa \Rightarrow abSba \Rightarrow abbSbba \Rightarrow abbbba$. What strings can $S$ generate like this?
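
One way to explore the question is to enumerate what the grammar generates. The sketch below does a breadth-first search over sentential forms, encoded as Python strings in which the uppercase `S` is the only non-terminal; the function name `generate` and the length cutoff are assumptions made for this illustration.

```python
from collections import deque

# Bodies of the productions for S; "" encodes epsilon.
BODIES = ["", "a", "b", "aSa", "bSb"]


def generate(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    results = set()
    seen = {"S"}
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        i = form.find("S")
        if i == -1:  # no non-terminal left: a generated string
            results.add(form)
            continue
        for body in BODIES:
            new_form = form[:i] + body + form[i + 1:]
            # Terminals are never erased, so forms with too many can be pruned.
            if len(new_form.replace("S", "")) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return results


print(sorted(generate(3), key=lambda w: (len(w), w)))
```

Every string this produces reads the same forwards and backwards, which suggests the answer: the palindromes over $\{a, b\}$.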

  21. Palindromes. Madam in Eden I’m Adam. Dog doo? Good God! Dogma: I am God. A man, a plan, a canal, Panama. Are we not drawn onward, we few, drawn onward to new era? Doc, note: I dissent. A fast never prevents a fatness. I diet on cod. http://www.palindromelist.net

  23. Example. $L = \{0^n 1^n \mid n \ge 0\}$ is generated by $S \to \epsilon \mid 0S1$.
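
The two productions translate directly into a recursive membership test, shown here as a minimal sketch. The function name `in_L` is an assumption; the repeated slicing makes it slower than necessary, which is acceptable for illustrating how the recursion mirrors the grammar.

```python
def in_L(w):
    """Check membership in { 0^n 1^n : n >= 0 } by mirroring S -> epsilon | 0 S 1."""
    if w == "":
        return True  # S -> epsilon
    if len(w) >= 2 and w[0] == "0" and w[-1] == "1":
        return in_L(w[1:-1])  # S -> 0 S 1: strip one matching pair
    return False


print(in_L("000111"))
print(in_L("0101"))
```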

  24. Notation and Convention. Let $G = (V, T, P, S)$. Then:
      - $a, b, c, d, \ldots$ are in $T$ (terminals);
      - $A, B, C, D, \ldots$ are in $V$ (non-terminals);
      - $u, v, w, x, y, \ldots$ are in $T^*$ (strings of terminals);
      - $\alpha, \beta, \gamma, \ldots$ are in $(V \cup T)^*$;
      - $X, Y, Z$ are in $V \cup T$.

  25. “Derives” relation. Formalism for how strings are derived/generated. Definition: Let $G = (V, T, P, S)$ be a CFG. For strings $\alpha_1, \alpha_2 \in (V \cup T)^*$ we say $\alpha_1$ derives $\alpha_2$, denoted by $\alpha_1 \Rightarrow_G \alpha_2$, if there exist strings $\beta, \gamma, \delta$ in $(V \cup T)^*$ such that $\alpha_1 = \beta A \delta$, $\alpha_2 = \beta \gamma \delta$, and $A \to \gamma$ is in $P$. Examples: $S \Rightarrow \epsilon$, $S \Rightarrow 0S1$, $0S1 \Rightarrow 00S11$, $0S1 \Rightarrow 01$.
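
The definition can be read as a recipe: pick an occurrence of a non-terminal $A$, pick a production $A \to \gamma$, and splice $\gamma$ in place of $A$. A minimal sketch under assumed conventions: sentential forms are Python strings, each symbol is one character, and the name `derive_one_step` and the dictionary encoding of $P$ are invented for illustration.

```python
def derive_one_step(form, productions):
    """Return every alpha2 such that form => alpha2 in one step."""
    successors = set()
    for i, symbol in enumerate(form):
        # form = beta + A + delta, with A = symbol at position i
        for gamma in productions.get(symbol, ()):
            successors.add(form[:i] + gamma + form[i + 1:])
    return successors


# S -> epsilon | 0 S 1, with "" encoding epsilon.
P = {"S": ["", "0S1"]}

print(derive_one_step("S", P))
print(derive_one_step("0S1", P))
```

The two calls reproduce the four examples on the slide; a form with no non-terminal, such as `"01"`, has no successors.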

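  28. “Derives” relation continued. Definition: For integer $k \ge 0$, $\alpha_1 \Rightarrow^k \alpha_2$ is inductively defined: $\alpha_1 \Rightarrow^0 \alpha_2$ if $\alpha_1 = \alpha_2$; $\alpha_1 \Rightarrow^k \alpha_2$ if $\alpha_1 \Rightarrow \beta_1$ and $\beta_1 \Rightarrow^{k-1} \alpha_2$ for some $\beta_1$. Alternative definition: $\alpha_1 \Rightarrow^k \alpha_2$ if $\alpha_1 \Rightarrow^{k-1} \beta_1$ and $\beta_1 \Rightarrow \alpha_2$ for some $\beta_1$. $\Rightarrow^*$ is the reflexive and transitive closure of $\Rightarrow$: $\alpha_1 \Rightarrow^* \alpha_2$ if $\alpha_1 \Rightarrow^k \alpha_2$ for some $k$. Examples: $S \Rightarrow^* \epsilon$, $0S1 \Rightarrow^* 0000011111$.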
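
The inductive definition amounts to applying the one-step relation $k$ times. The sketch below follows that reading for the grammar $S \to \epsilon \mid 0S1$; the string encoding of sentential forms and the names `step` and `derives_in_k` are assumptions made for illustration.

```python
def step(forms, productions):
    """All forms reachable in exactly one derivation step from any form in `forms`."""
    result = set()
    for form in forms:
        for i, symbol in enumerate(form):
            for gamma in productions.get(symbol, ()):
                result.add(form[:i] + gamma + form[i + 1:])
    return result


def derives_in_k(alpha1, alpha2, k, productions):
    """True if alpha1 derives alpha2 in exactly k steps."""
    frontier = {alpha1}  # k = 0: only alpha1 itself
    for _ in range(k):
        frontier = step(frontier, productions)
    return alpha2 in frontier


P = {"S": ["", "0S1"]}  # "" encodes epsilon

print(derives_in_k("S", "", 1, P))
print(derives_in_k("0S1", "0000011111", 5, P))
```

Deciding $\alpha_1 \Rightarrow^* \alpha_2$ would mean trying every $k$, which this sketch does not attempt to bound.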

  30. Context Free Languages. Definition: The language generated by CFG $G = (V, T, P, S)$ is denoted by $L(G)$, where $L(G) = \{ w \in T^* \mid S \Rightarrow^* w \}$. Definition: A language $L$ is context free (a CFL) if it is generated by a context free grammar. That is, there is a CFG $G$ such that $L = L(G)$.

  35. Examples:
      - $L = \{0^n 1^n \mid n \ge 0\}$
      - $L = \{0^n 1^m \mid m > n\}$
      - $L = \{0^n 1^m \mid m < n\}$
      - $L = \{w \in \{(, )\}^* \mid w \text{ is a properly nested string of parentheses}\}$
      - $L = \{w \in \{0, 1\}^* \mid w \text{ has twice as many 1s as 0s}\}$
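
The slide lists these languages without grammars. The sketch below pairs each with a candidate grammar (chosen for this illustration, not taken from the slides) and compares what the grammar generates with the defining property for all strings up to a small length. The names `language_up_to`, `EXAMPLES`, and `agrees` are assumptions, and agreement on short strings is a sanity check, not a proof.

```python
from itertools import product


def language_up_to(productions, start, max_len):
    """Terminal strings of length <= max_len derivable from `start`.

    `productions` maps each one-character non-terminal to a list of bodies;
    any character that is not a key is treated as a terminal.
    """
    lang = {head: set() for head in productions}
    changed = True
    while changed:  # iterate until no production yields a new string
        changed = False
        for head, bodies in productions.items():
            for body in bodies:
                partial = {""}
                for symbol in body:
                    choices = lang[symbol] if symbol in productions else {symbol}
                    partial = {u + v for u in partial for v in choices
                               if len(u) + len(v) <= max_len}
                if not partial <= lang[head]:
                    lang[head] |= partial
                    changed = True
    return lang[start]


def is_block(w):
    """True if w consists of 0s followed by 1s."""
    return w == "0" * w.count("0") + "1" * w.count("1")


def balanced(w):
    """True if w is a properly nested string of parentheses."""
    depth = 0
    for c in w:
        depth += 1 if c == "(" else -1
        if depth < 0:
            return False
    return depth == 0


# name: (candidate productions for S, alphabet, defining property)
EXAMPLES = {
    "equal": ({"S": ["", "0S1"]}, "01",
              lambda w: is_block(w) and w.count("1") == w.count("0")),
    "more_ones": ({"S": ["1", "S1", "0S1"]}, "01",
                  lambda w: is_block(w) and w.count("1") > w.count("0")),
    "more_zeros": ({"S": ["0", "0S", "0S1"]}, "01",
                   lambda w: is_block(w) and w.count("1") < w.count("0")),
    "parens": ({"S": ["", "(S)", "SS"]}, "()", balanced),
    "twice": ({"S": ["", "SS", "0S1S1", "1S0S1", "1S1S0"]}, "01",
              lambda w: w.count("1") == 2 * w.count("0")),
}


def agrees(name, max_len):
    """Compare the candidate grammar with the property on all short strings."""
    productions, alphabet, has_property = EXAMPLES[name]
    expected = set()
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            w = "".join(symbols)
            if has_property(w):
                expected.add(w)
    return language_up_to(productions, "S", max_len) == expected


for name in EXAMPLES:
    print(name, agrees(name, 6))
```

The fixed-point computation avoids enumerating derivations, so ambiguous productions such as $S \to SS$ cause no blow-up at these lengths.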
