
Context-Free Grammars Z. Sawa (TU Ostrava) Introd. to Theoretical - PowerPoint PPT Presentation

Context-Free Grammars Z. Sawa (TU Ostrava) Introd. to Theoretical Computer Science April 20, 2020 1 / 63 Context-Free Grammars Example: We would like to describe a language of arithmetic expressions, containing expressions such as: 175


  12. Context-Free Grammars

      Grammars are used for generating words.

      Example: G = (Π, Σ, A, P) where Π = {A, B, C}, Σ = {a, b}, and P contains the rules

          A → aBBb | AaA
          B → ε | bCA
          C → AB | a | b

      For example, the word abbabb can be generated in grammar G as follows:

          A ⇒ aBBb ⇒ abCABb ⇒ abCaBBbBb ⇒ abCaBbBb ⇒ abbaBbBb ⇒ abbaBbb ⇒ abbabb

  13. Context-Free Grammars

      On strings from (Π ∪ Σ)* we define the relation ⇒ ⊆ (Π ∪ Σ)* × (Π ∪ Σ)* such that

          α ⇒ α′ iff α = β₁Aβ₂ and α′ = β₁γβ₂

      for some β₁, β₂, γ ∈ (Π ∪ Σ)* and A ∈ Π where (A → γ) ∈ P.

      Example: If (B → bCA) ∈ P then aCBbA ⇒ aCbCAbA.

      Remark: Informally, α ⇒ α′ means that α′ can be derived from α in one step, in which one occurrence of some nonterminal A in α is replaced with the right-hand side of some rule A → γ that has A on its left-hand side.
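The one-step relation can be sketched in a few lines of Python. This is only an illustration under assumed conventions that are not part of the slides: every symbol is a single character, the keys of `RULES` are the nonterminals, and the empty string stands for ε. The name `one_step` is hypothetical.

```python
# Rules of the example grammar G; "" stands for ε (an encoding assumed here).
RULES = {
    "A": ["aBBb", "AaA"],
    "B": ["", "bCA"],
    "C": ["AB", "a", "b"],
}


def one_step(alpha, rules=RULES):
    """Return the set of all alpha' such that alpha => alpha'."""
    results = set()
    for i, symbol in enumerate(alpha):
        for gamma in rules.get(symbol, []):
            # alpha = beta1 + symbol + beta2 and alpha' = beta1 + gamma + beta2
            results.add(alpha[:i] + gamma + alpha[i + 1:])
    return results


print("aCbCAbA" in one_step("aCBbA"))  # True
```

A string with no nonterminals has no successors, so `one_step("ab")` is the empty set.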

  15. Context-Free Grammars

      A derivation of length n is a sequence β₀, β₁, β₂, …, βₙ, where βᵢ ∈ (Π ∪ Σ)* and βᵢ₋₁ ⇒ βᵢ for all 1 ≤ i ≤ n, which can be written more succinctly as

          β₀ ⇒ β₁ ⇒ β₂ ⇒ … ⇒ βₙ₋₁ ⇒ βₙ

      The fact that for given α, α′ ∈ (Π ∪ Σ)* and n ∈ ℕ there exists some derivation β₀ ⇒ β₁ ⇒ β₂ ⇒ … ⇒ βₙ₋₁ ⇒ βₙ, where α = β₀ and α′ = βₙ, is denoted α ⇒ⁿ α′.

      The fact that α ⇒ⁿ α′ for some n ≥ 0 is denoted α ⇒* α′.

      Remark: The relation ⇒* is the reflexive and transitive closure of the relation ⇒ (i.e., the smallest reflexive and transitive relation containing ⇒).

  16. Context-Free Grammars

      Sentential forms are those α ∈ (Π ∪ Σ)* for which S ⇒* α, where S is the initial nonterminal.

  17. Context-Free Grammars

      The language L(G) generated by a grammar G = (Π, Σ, S, P) is the set of all words over the alphabet Σ that can be derived from the initial nonterminal S using rules from P, i.e.,

          L(G) = { w ∈ Σ* | S ⇒* w }
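To make the definition of L(G) concrete, the following sketch enumerates short words of L(G) by a breadth-first search over sentential forms. It assumes single-character symbols, a dictionary of rules whose keys are the nonterminals, and "" for ε. The function name `generate` and the cap on the length of sentential forms are assumptions made here to keep the search finite; with that cap the search may miss words of grammars that need long intermediate forms.

```python
from collections import deque


def generate(rules, start, max_len):
    """Collect words w with start =>* w and len(w) <= max_len (bounded search)."""
    words, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        if not any(s in rules for s in form):
            words.add(form)  # only terminals are left: a word of L(G)
            continue
        for i, s in enumerate(form):
            for gamma in rules.get(s, []):
                new = form[:i] + gamma + form[i + 1:]
                terminals = sum(1 for c in new if c not in rules)
                # Terminals never disappear, so a form with too many of them
                # is a dead end. The cap on len(new) is an extra assumption
                # that keeps the search finite for rules such as S -> SS.
                if (terminals <= max_len and len(new) <= 2 * max_len + 2
                        and new not in seen):
                    seen.add(new)
                    queue.append(new)
    return sorted(words, key=lambda w: (len(w), w))


print(generate({"S": ["", "aSb"]}, "S", 6))  # ['', 'ab', 'aabb', 'aaabbb']
```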

  20. Context-Free Grammars

      Example: We want to construct a grammar generating the language L = { aⁿbⁿ | n ≥ 0 }.

      Grammar G = (Π, Σ, S, P) where Π = {S}, Σ = {a, b}, and P contains

          S → ε | aSb

      Derivations of the shortest words:

          S ⇒ ε
          S ⇒ aSb ⇒ ab
          S ⇒ aSb ⇒ aaSbb ⇒ aabb
          S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
          S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaaaSbbbb ⇒ aaaabbbb
          …

  23. Context-Free Grammars

      Example: We want to construct a grammar generating the language consisting of all palindromes over the alphabet {a, b}, i.e.,

          L = { w ∈ {a, b}* | w = wᴿ }

      Remark: wᴿ denotes the reverse of a word w, i.e., the word w written backwards.

      Solution:

          S → ε | a | b | aSa | bSb

          S ⇒ aSa ⇒ abSba ⇒ abaSaba ⇒ abaaaba
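The rules of the palindrome grammar translate almost directly into a recursive membership test. The sketch below is only an illustration (the function name `derives` is hypothetical): the base case mirrors S → ε | a | b and the recursive case mirrors S → aSa | bSb.

```python
def derives(w):
    """Return True if S =>* w for the grammar S -> ε | a | b | aSa | bSb."""
    if any(c not in "ab" for c in w):
        return False  # the word uses a symbol outside the alphabet {a, b}
    if len(w) <= 1:
        return True   # S -> ε | a | b
    # S -> aSa | bSb: the outer symbols must agree, the inside is derived from S
    return w[0] == w[-1] and derives(w[1:-1])


print(derives("abaaaba"))  # True
print(derives("ab"))       # False
```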

  26. Context-Free Grammars

      Example: We want to construct a grammar generating the language L consisting of all correctly parenthesised sequences of the symbols ‘(’ and ‘)’.

      For example, (()())(()) ∈ L but )()) ∉ L.

      Solution:

          S → ε | (S) | SS

          S ⇒ SS ⇒ (S)S ⇒ (S)(S) ⇒ (SS)(S) ⇒ ((S)S)(S) ⇒ (()S)(S) ⇒ (()(S))(S) ⇒ (()())(S) ⇒ (()())((S)) ⇒ (()())(())

  29. Context-Free Grammars

      Example: We want to construct a grammar generating the language L consisting of all correctly constructed arithmetic expressions where operands are always of the form ‘a’ and where the symbols + and ∗ can be used as operators.

      For example, (a + a) ∗ a + (a ∗ a) ∈ L.

      Solution:

          E → a | E + E | E ∗ E | (E)

          E ⇒ E + E ⇒ E ∗ E + E ⇒ (E) ∗ E + E ⇒ (E + E) ∗ E + E ⇒ (a + E) ∗ E + E ⇒ (a + a) ∗ E + E ⇒ (a + a) ∗ a + E ⇒ (a + a) ∗ a + (E) ⇒ (a + a) ∗ a + (E ∗ E) ⇒ (a + a) ∗ a + (a ∗ E) ⇒ (a + a) ∗ a + (a ∗ a)

  52. Derivation Tree

      Grammar:

          A → aBBb | AaA
          B → ε | bCA
          C → AB | a | b

      Derivation:

          A ⇒ aBBb ⇒ abCABb ⇒ abCaBBbBb ⇒ abCaBbBb ⇒ abbaBbBb ⇒ abbaBbb ⇒ abbabb

      The corresponding derivation tree:

          A
          ├─ a
          ├─ B
          │  ├─ b
          │  ├─ C
          │  │  └─ b
          │  └─ A
          │     ├─ a
          │     ├─ B
          │     │  └─ ε
          │     ├─ B
          │     │  └─ ε
          │     └─ b
          ├─ B
          │  └─ ε
          └─ b

  53. Derivation Tree

      For each derivation there is some derivation tree:

      - Nodes of the tree are labelled with terminals and nonterminals.
      - The root of the tree is labelled with the initial nonterminal.
      - The leaves of the tree are labelled with terminals or with the symbol ε.
      - The remaining nodes of the tree are labelled with nonterminals.
      - If a node is labelled with some nonterminal A, then its children are labelled with the symbols from the right-hand side of some rewriting rule A → α.
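The properties listed above can be checked mechanically. The following sketch encodes the derivation tree of the word abbabb as nested tuples and tests that every inner node matches a rule. The encoding (a node is a pair of label and children, a leaf has `None` as its children) and the names `node`, `leaf`, `word` and `is_valid` are assumptions made for this illustration.

```python
RULES = {"A": ["aBBb", "AaA"], "B": ["", "bCA"], "C": ["AB", "a", "b"]}

EPS = ("ε", None)  # leaf used below a nonterminal rewritten to ε


def leaf(symbol):
    return (symbol, None)


def node(label, *children):
    return (label, list(children))


inner_a = node("A", leaf("a"), node("B", EPS), node("B", EPS), leaf("b"))
tree = node("A",
            leaf("a"),
            node("B", leaf("b"), node("C", leaf("b")), inner_a),
            node("B", EPS),
            leaf("b"))


def word(t):
    """Read the generated word off the leaves, left to right."""
    label, children = t
    if children is None:
        return "" if label == "ε" else label
    return "".join(word(c) for c in children)


def is_valid(t, rules=RULES):
    """Check that leaves are terminals or ε and inner nodes match some rule."""
    label, children = t
    if children is None:
        return label == "ε" or label not in rules
    rhs = "".join("" if c[0] == "ε" else c[0] for c in children)
    return rhs in rules.get(label, []) and all(is_valid(c, rules) for c in children)


print(word(tree), is_valid(tree))  # abbabb True
```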

  54. Left and Right Derivation

          E → a | E + E | E ∗ E | (E)

      A left derivation is a derivation where in every step we replace the leftmost nonterminal.

          E ⇒ E + E ⇒ E ∗ E + E ⇒ a ∗ E + E ⇒ a ∗ a + E ⇒ a ∗ a + a

      A right derivation is a derivation where in every step we replace the rightmost nonterminal.

          E ⇒ E + E ⇒ E + a ⇒ E ∗ E + a ⇒ E ∗ a + a ⇒ a ∗ a + a

      A derivation need not be left or right:

          E ⇒ E + E ⇒ E ∗ E + E ⇒ E ∗ a + E ⇒ E ∗ a + a ⇒ a ∗ a + a

  55. Left and Right Derivation

      There can be several different derivations corresponding to one derivation tree.

      For every derivation tree, there is exactly one left and exactly one right derivation corresponding to the tree.
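One way to see why a tree determines its left and right derivations is to read them off the tree. The sketch below keeps the current frontier of the tree and always expands the leftmost (or rightmost) inner node. It uses the tree of a ∗ a + a whose root rule is E → E + E; the tuple encoding and the names `node`, `leaf` and `derivation` are assumptions of this illustration, and ε-leaves are not handled because this grammar has no ε-rules.

```python
def leaf(symbol):
    return (symbol, None)


def node(label, *children):
    return (label, list(children))


# Derivation tree of a*a+a with E -> E + E at the root.
tree = node("E",
            node("E", node("E", leaf("a")), leaf("*"), node("E", leaf("a"))),
            leaf("+"),
            node("E", leaf("a")))


def derivation(t, leftmost=True):
    """List the sentential forms of the left (or right) derivation of tree t."""
    frontier = [t]
    forms = ["".join(n[0] for n in frontier)]
    while True:
        inner = [i for i, n in enumerate(frontier) if n[1] is not None]
        if not inner:
            return forms
        i = inner[0] if leftmost else inner[-1]
        frontier[i:i + 1] = frontier[i][1]  # replace the node by its children
        forms.append("".join(n[0] for n in frontier))


print(" => ".join(derivation(tree)))
print(" => ".join(derivation(tree, leftmost=False)))
```

For this tree the two calls reproduce the left and the right derivation of a ∗ a + a shown on the previous slide.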

  56. Equivalence of Grammars

      Grammars G₁ and G₂ are equivalent if they generate the same language, i.e., if L(G₁) = L(G₂).

      Remark: The problem of equivalence of context-free grammars is algorithmically undecidable. It can be shown that it is not possible to construct an algorithm that would decide for any pair of context-free grammars whether they are equivalent or not.

      Even the problem of deciding whether a grammar generates the language Σ* is algorithmically undecidable.

  57. Ambiguous Grammars

      A grammar G is ambiguous if there is a word w ∈ L(G) that has two different derivation trees, resp. two different left or two different right derivations.

      Example:

          E ⇒ E + E ⇒ E ∗ E + E ⇒ a ∗ E + E ⇒ a ∗ a + E ⇒ a ∗ a + a
          E ⇒ E ∗ E ⇒ E ∗ E + E ⇒ a ∗ E + E ⇒ a ∗ a + E ⇒ a ∗ a + a

      The two corresponding derivation trees of a ∗ a + a:

          E                     E
          ├─ E                  ├─ E
          │  ├─ E               │  └─ a
          │  │  └─ a            ├─ ∗
          │  ├─ ∗               └─ E
          │  └─ E                  ├─ E
          │     └─ a               │  └─ a
          ├─ +                     ├─ +
          └─ E                     └─ E
             └─ a                     └─ a
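Ambiguity of this particular grammar can be observed by counting derivation trees. The sketch below is specific to the grammar E → a | E + E | E ∗ E | (E), writes the operator ∗ as the ASCII character `*`, and expects the word without spaces; the function name `count_trees` is hypothetical. It sums, over every possible top-level rule and split position, the number of trees for the parts.

```python
from functools import lru_cache


def count_trees(w):
    """Number of derivation trees of w in E -> a | E+E | E*E | (E)."""

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of trees deriving the substring w[i:j] from E."""
        if i >= j:
            return 0
        total = 1 if w[i:j] == "a" else 0                 # E -> a
        if j - i >= 3 and w[i] == "(" and w[j - 1] == ")":
            total += count(i + 1, j - 1)                  # E -> (E)
        for k in range(i + 1, j - 1):
            if w[k] in "+*":
                total += count(i, k) * count(k + 1, j)    # E -> E+E | E*E
        return total

    return count(0, len(w))


print(count_trees("a*a+a"))    # 2
print(count_trees("(a+a)*a"))  # 1
```

A word with more than one tree, such as a ∗ a + a, witnesses that the grammar is ambiguous.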

  58. Ambiguous Grammars

      Sometimes it is possible to replace an ambiguous grammar with an unambiguous grammar generating the same language.

      Example: The grammar

          E → a | E + E | E ∗ E | (E)

      can be replaced with the equivalent grammar

          E → T | T + E
          T → F | F ∗ T
          F → a | (E)

      Remark: If there is no unambiguous grammar equivalent to a given ambiguous grammar, we say that the grammar (more precisely, the language it generates) is inherently ambiguous.

  59. Context-Free Languages

      Definition: A language L is context-free if there exists some context-free grammar G such that L = L(G).

      The class of context-free languages is closed with respect to:

      - concatenation
      - union
      - iteration

      The class of context-free languages is not closed with respect to:

      - complement
      - intersection

  60. Context-Free Languages

      We have two grammars G₁ = (Π₁, Σ, S₁, P₁) and G₂ = (Π₂, Σ, S₂, P₂), and can assume that Π₁ ∩ Π₂ = ∅ and S ∉ Π₁ ∪ Π₂.

      Grammar G such that L(G) = L(G₁) · L(G₂):

          G = (Π₁ ∪ Π₂ ∪ {S}, Σ, S, P₁ ∪ P₂ ∪ {S → S₁S₂})

      Grammar G such that L(G) = L(G₁) ∪ L(G₂):

          G = (Π₁ ∪ Π₂ ∪ {S}, Σ, S, P₁ ∪ P₂ ∪ {S → S₁, S → S₂})

      Grammar G such that L(G) = L(G₁)*:

          G = (Π₁ ∪ {S}, Σ, S, P₁ ∪ {S → ε, S → S₁S})
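The three constructions can be written down directly as operations on grammars. In the sketch below a grammar is a named tuple of nonterminals, terminals, start symbol and rules, where a rule is a pair of a left-hand side and a tuple of right-hand-side symbols (the empty tuple stands for ε). This representation and the names `Grammar`, `concat`, `union` and `star` are assumptions of the illustration; the terminal alphabets are simply united, which coincides with the slide when both grammars share Σ.

```python
from collections import namedtuple

Grammar = namedtuple("Grammar", "nonterminals terminals start rules")


def _check_fresh(s, *grammars):
    """Enforce the assumptions: disjoint nonterminals and a fresh start symbol."""
    used = set()
    for g in grammars:
        if used & g.nonterminals:
            raise ValueError("nonterminal sets must be disjoint")
        used |= g.nonterminals
    if s in used:
        raise ValueError("new start symbol must be fresh")


def concat(g1, g2, s="S"):
    _check_fresh(s, g1, g2)
    return Grammar(g1.nonterminals | g2.nonterminals | {s},
                   g1.terminals | g2.terminals, s,
                   g1.rules | g2.rules | {(s, (g1.start, g2.start))})


def union(g1, g2, s="S"):
    _check_fresh(s, g1, g2)
    return Grammar(g1.nonterminals | g2.nonterminals | {s},
                   g1.terminals | g2.terminals, s,
                   g1.rules | g2.rules | {(s, (g1.start,)), (s, (g2.start,))})


def star(g1, s="S"):
    _check_fresh(s, g1)
    return Grammar(g1.nonterminals | {s}, g1.terminals, s,
                   g1.rules | {(s, ()), (s, (g1.start, s))})


# Example: G1 generates { a^n b^n | n >= 0 }, G2 generates { c }.
g1 = Grammar({"S1"}, {"a", "b"}, "S1", {("S1", ()), ("S1", ("a", "S1", "b"))})
g2 = Grammar({"S2"}, {"c"}, "S2", {("S2", ("c",))})
print(sorted(union(g1, g2).rules))
```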

  66. A Context-Free Grammar for a Regular Expression

      Example: The construction of a context-free grammar for the regular expression ((a + b) · b)*. Each node of the syntax tree of the expression is assigned a nonterminal and the corresponding rules:

          ∗               S₅ → ε | S₄S₅
          └─ ·            S₄ → S₃S₂
             ├─ +         S₃ → S₁ | S₂
             │  ├─ a      S₁ → a
             │  └─ b      S₂ → b
             └─ b         S₂
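The bottom-up construction can be sketched as a recursive walk over the syntax tree of the regular expression. The tree encoding as nested tuples, the function name `regex_to_grammar`, and the choice to reuse one nonterminal for identical subexpressions (as the example does for the two occurrences of b) are assumptions of this illustration.

```python
def regex_to_grammar(tree):
    """Return (start, rules) for a regex tree such as ("*", ("·", ("+", "a", "b"), "b")).

    rules is a list of (nonterminal, alternatives); each alternative is a list
    of symbols, and the empty list stands for ε.
    """
    rules, names = [], {}

    def build(t):
        if t in names:
            return names[t]          # identical subexpressions share a nonterminal
        kids = [] if isinstance(t, str) else [build(c) for c in t[1:]]
        name = names[t] = f"S{len(names) + 1}"
        if isinstance(t, str):
            alts = [[t]]                     # S -> a
        elif t[0] == "+":
            alts = [[kids[0]], [kids[1]]]    # S -> S' | S''
        elif t[0] == "·":
            alts = [kids]                    # S -> S' S''
        else:                                # "*"
            alts = [[], [kids[0], name]]     # S -> ε | S' S
        rules.append((name, alts))
        return name

    return build(tree), rules


start, rules = regex_to_grammar(("*", ("·", ("+", "a", "b"), "b")))
for name, alts in rules:
    print(name, "→", " | ".join(" ".join(a) if a else "ε" for a in alts))
```

For the expression of the example this numbering yields the start symbol S5 and the same five groups of rules as above.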

  67. Lexical and Syntactic Analysis: an example

      Example: We would like to recognize a language of arithmetic expressions containing expressions such as:

          34
          x+1
          -x * 2 + 128 * (y - z / 3)

      - The expressions can contain number constants: sequences of digits 0, 1, …, 9.
      - The expressions can contain names of variables: sequences consisting of letters, digits, and the symbol “_”, which do not start with a digit.
      - The expressions can contain basic arithmetic operations: “+”, “-”, “*”, “/”, and unary “-”.
      - It is possible to use parentheses “(” and “)”, and the standard priority of arithmetic operations applies.

  68. Lexical and Syntactic Analysis: an example

      The problem we want to solve:

      Input: a sequence of characters (e.g., a string, a text file, etc.)

      Output: an abstract syntax tree representing the structure of the given expression, or information about a syntax error in the expression

  69. Lexical and Syntactic Analysis: an example

      It is convenient to decompose this problem into several parts:

      - Lexical analysis: recognizing lexical elements (so-called tokens) such as identifiers, number constants, operators, etc.
      - Syntactic analysis: determining whether a given sequence of tokens corresponds to an allowed structure of expressions. Basically, it means finding a derivation (resp. a derivation tree) for a given word in a context-free grammar representing the given language (in our case, the language of all well-formed expressions).
      - Construction of an abstract syntax tree: this phase is usually connected with syntactic analysis. The result actually produced by the program is typically not a derivation tree itself but rather some kind of abstract syntax tree, or the performing of some actions connected with the rules of the given grammar.

  70. Lexical and Syntactic Analysis: an example

      Terminals for the grammar representing well-formed expressions:

      - ⟨ident⟩: identifier, e.g. “x”, “q3”, “count_r12”
      - ⟨num⟩: number constant, e.g. “5”, “42”, “65535”
      - “(”: left parenthesis
      - “)”: right parenthesis
      - “+”: plus
      - “-”: minus
      - “*”: star
      - “/”: slash

      Remark: Recognizing the sequences of symbols that correspond to individual terminals is the goal of lexical analysis.

  71. Lexical and Syntactic Analysis: an example

      Example: The expression

          -x * 2 + 128 * (y - z / 3)

      is represented by the following sequence of symbols:

          - x * 2 + 1 2 8 * ( y - z / 3 )

      The following sequence of tokens corresponds to this sequence of symbols; these tokens are terminal symbols of the given context-free grammar:

          - ⟨ident⟩ * ⟨num⟩ + ⟨num⟩ * ( ⟨ident⟩ - ⟨ident⟩ / ⟨num⟩ )

  73. Lexical and Syntactic Analysis: an example

      The context-free grammar for the given language, the first try:

          E → ⟨ident⟩ | ⟨num⟩ | ( E ) | - E | E + E | E - E | E * E | E / E

      This grammar is ambiguous.

  74. Lexical and Syntactic Analysis: an example

      The context-free grammar for the given language, the second try:

          E → T | T + E | T - E
          T → F | F * T | F / T
          F → ⟨ident⟩ | ⟨num⟩ | ( E ) | - F

      Different levels of priority are represented by different nonterminals:

      - E: expression
      - T: term
      - F: factor

      This grammar is unambiguous.

  75. Lexical and Syntactic Analysis: an example

      The context-free grammar for the given language, the third try:

          E → T | T A E
          A → + | -
          T → F | F M T
          M → * | /
          F → ⟨ident⟩ | ⟨num⟩ | ( E ) | - F

      We create separate nonterminals for operators on different levels of priority:

      - A: additive operator
      - M: multiplicative operator

  76. Lexical and Syntactic Analysis: an example

      The context-free grammar for the given language, the fourth try:

          S → E ⟨eof⟩
          E → T | T A E
          A → + | -
          T → F | F M T
          M → * | /
          F → ⟨ident⟩ | ⟨num⟩ | ( E ) | - F

      It is useful to introduce a special terminal ⟨eof⟩ representing the end of input.

      Moreover, in this grammar the initial nonterminal S does not occur on the right-hand side of any rule.
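To illustrate how such a grammar guides syntactic analysis, here is a minimal recursive-descent sketch with one function per nonterminal E, T and F (the operator nonterminals A and M are folded into simple membership tests). It is not taken from the slides: the token encoding as plain strings ("ident", "num", "eof" and the operator characters), the tuple-shaped result, and the name `parse` are all assumptions. Because the rules E → T A E and T → F M T are right-recursive, the resulting tree nests to the right, exactly as the grammar does.

```python
def parse(tokens):
    """Parse a token list ending with "eof" and return a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos]

    def take(kind=None):
        nonlocal pos
        tok = tokens[pos]
        if kind is not None and tok != kind:
            raise SyntaxError(f"expected {kind}, got {tok}")
        pos += 1
        return tok

    def parse_e():
        left = parse_t()
        if peek() in ("+", "-"):          # E -> T A E
            op = take()
            return (op, left, parse_e())
        return left                       # E -> T

    def parse_t():
        left = parse_f()
        if peek() in ("*", "/"):          # T -> F M T
            op = take()
            return (op, left, parse_t())
        return left                       # T -> F

    def parse_f():
        tok = peek()
        if tok in ("ident", "num"):       # F -> <ident> | <num>
            return take()
        if tok == "(":                    # F -> ( E )
            take()
            inner = parse_e()
            take(")")
            return inner
        if tok == "-":                    # F -> - F
            take()
            return ("neg", parse_f())
        raise SyntaxError(f"unexpected {tok}")

    tree = parse_e()
    take("eof")                           # S -> E <eof>
    return tree


# Token sequence of  -x * 2 + 128 * (y - z / 3)
tokens = ["-", "ident", "*", "num", "+", "num", "*",
          "(", "ident", "-", "ident", "/", "num", ")", "eof"]
print(parse(tokens))
```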

  77. Implementation of Lexical Analysis

      Enumerated type Token_kind representing different kinds of tokens:

      - T_EOF: the end of input
      - T_Ident: identifier
      - T_Number: number constant
      - T_LParen: “(”
      - T_RParen: “)”
      - T_Plus: “+”
      - T_Minus: “-”
      - T_Star: “*”
      - T_Slash: “/”

  78. Implementation of Lexical Analysis

      Variable c: the currently processed character (resp. a special value ⟨eof⟩ representing the end of input):

      - at the beginning, the first character of the input is read into variable c
      - the function next-char() returns the next character from the input

      Some helper functions:

      - error(): outputs information about a syntax error and aborts the processing of the expression
      - is-ident-start-char(c): tests whether c is a character that can occur at the beginning of an identifier
      - is-ident-normal-char(c): tests whether c is a character that can occur in an identifier (at any position other than the beginning)
      - is-digit(c): tests whether c is a digit
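The pieces described above can be put together into a small lexer. The following Python sketch is only one possible arrangement, not the implementation from the slides: the token kinds follow the names listed earlier (written with underscores), `None` plays the role of ⟨eof⟩, `SyntaxError` stands in for error(), identifiers are assumed to use ASCII letters, digits and “_”, and skipping of whitespace is an added assumption.

```python
import string
from enum import Enum


class TokenKind(Enum):
    T_EOF = "eof"
    T_Ident = "ident"
    T_Number = "num"
    T_LParen = "("
    T_RParen = ")"
    T_Plus = "+"
    T_Minus = "-"
    T_Star = "*"
    T_Slash = "/"


SINGLE = {k.value: k for k in TokenKind
          if k not in (TokenKind.T_EOF, TokenKind.T_Ident, TokenKind.T_Number)}


def is_ident_start_char(c):
    return c is not None and (c in string.ascii_letters or c == "_")


def is_digit(c):
    return c is not None and c in string.digits


def is_ident_normal_char(c):
    return is_ident_start_char(c) or is_digit(c)


def tokenize(text):
    """Return a list of (kind, text) pairs ending with (T_EOF, None)."""
    chars = iter(text)

    def next_char():
        return next(chars, None)     # None plays the role of <eof>

    c = next_char()                  # the first character is read into c
    tokens = []
    while c is not None:
        if c.isspace():
            c = next_char()          # whitespace is skipped (an assumption)
        elif is_ident_start_char(c):
            name = ""
            while is_ident_normal_char(c):
                name += c
                c = next_char()
            tokens.append((TokenKind.T_Ident, name))
        elif is_digit(c):
            digits = ""
            while is_digit(c):
                digits += c
                c = next_char()
            tokens.append((TokenKind.T_Number, digits))
        elif c in SINGLE:
            tokens.append((SINGLE[c], c))
            c = next_char()
        else:
            raise SyntaxError(f"unexpected character {c!r}")  # stands in for error()
    tokens.append((TokenKind.T_EOF, None))
    return tokens


for kind, text in tokenize("-x * 2 + 128 * (y - z / 3)"):
    print(kind.name, text)
```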
