
Derivations (Informatics 2A: Lecture 4)



  1. Derivations
     Informatics 2A: Lecture 4
     Bonnie Webber (revised by Frank Keller)
     School of Informatics, University of Edinburgh, keller@inf.ed.ac.uk
     25 September 2007

     Outline
     1 Context-Free Grammars: Review; Derivations; Tree Diagrams; Non-Equivalent Derivations
     2 Context-Sensitive Grammars
     3 Normal Forms: Chomsky and Greibach Normal Forms; Converting to Chomsky Normal Form
     Reading: Kozen, ch. 21 (on Normal Form)

     Review
     A derivation is the sequence of strings over V produced by a sequence of PS rule applications, starting from a start symbol Σ.
     In a phrase structure grammar (either context-free or context-sensitive), only one symbol is rewritten at each step in a derivation.
         S ⇒ NP VP ⇒ NP verb NP ⇒ NP verb the book ⇒ NP took the book ⇒ the man took the book
     We distinguish those symbols that can be rewritten (non-terminal symbols) from those that cannot (terminal symbols), and take sentences of a language to be strings of terminal symbols.

     Derivations in Context-Free Grammars
     Consider a simple CFG with non-terminal symbols { S, A, B } and terminal symbols { a, b }:
         S → AB
         A → AB | a
         B → BA | b
     Example 1 (rewriting the leftmost NT):
         S ⇒ AB ⇒ ABB ⇒ aBB ⇒ aBAB ⇒ abAB ⇒ abaB ⇒ abaBA ⇒ ababA ⇒ ababAB ⇒ ababaB ⇒ ababab
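The leftmost derivation of ababab can be replayed mechanically. The following is a minimal sketch, not part of the lecture: the function name `leftmost_derivation` and the encoding of rules as `(lhs, rhs)` string pairs are assumptions made for illustration. It rewrites the leftmost non-terminal at every step and records each sentential form.

```python
# Toy CFG from the slides: S -> AB, A -> AB | a, B -> BA | b.
# Assumption for this sketch: every symbol is a single character.
NONTERMINALS = {"S", "A", "B"}


def leftmost_derivation(start, rules):
    """Apply each (lhs, rhs) rule to the leftmost non-terminal, in order."""
    form = start
    steps = [form]
    for lhs, rhs in rules:
        # Position of the leftmost non-terminal in the current form.
        pos = next(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        if form[pos] != lhs:
            raise ValueError(f"leftmost non-terminal is {form[pos]}, not {lhs}")
        form = form[:pos] + rhs + form[pos + 1:]
        steps.append(form)
    return steps


# Rule choices read off the leftmost derivation of ababab.
EXAMPLE_1_RULES = [
    ("S", "AB"), ("A", "AB"), ("A", "a"), ("B", "BA"), ("B", "b"),
    ("A", "a"), ("B", "BA"), ("B", "b"), ("A", "AB"), ("A", "a"), ("B", "b"),
]

steps = leftmost_derivation("S", EXAMPLE_1_RULES)
print(" => ".join(steps))
```

Exactly one symbol is rewritten per step, so eleven rule applications yield twelve sentential forms, the last of which contains only terminals.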

  2. Derivations in Context-Free Grammars
     Example 2 (rewriting the rightmost NT):
         S ⇒ AB ⇒ ABA ⇒ ABAB ⇒ ABAb ⇒ ABab ⇒ Abab ⇒ ABbab ⇒ ABAbab ⇒ ABabab ⇒ Ababab ⇒ ababab
     Example 3 (rewriting an NT chosen at random):
         S ⇒ AB ⇒ ABB ⇒ ABAB ⇒ ABaB ⇒ AbaB ⇒ AbaBA ⇒ AbabA ⇒ AbabAB ⇒ AbabaB ⇒ Ababab ⇒ ababab
     These are different derivations, but they differ only in the order in which the same PS rules have been applied to the same NTs.

     The order in which NTs are rewritten does not matter in CFG derivations. How can we represent such equivalent derivations in a simple way?
     Option 1: Always write a derivation in the same order (e.g., left-to-right). This is called a canonical order.
     Option 2: Use an immediate constituency diagram:
         [Constituency diagram: the terminals a b a b a b sit above A B A B A B, which group upward into B A, then A B, then S.]

     Tree Diagrams
     Option 3: Use a tree diagram:
         [Tree diagram: S at the root; below it A B; then A B B A; then B A A B; leaves a b a b a b.]
     Both tree diagrams and constituency diagrams show which rules have been applied, but hide the order of application.
     Given a tree diagram, we can associate a canonical order with how we unfold it, e.g., top-down left-to-right. We'll see this when we look at parsing CFGs in Week 5.

     Non-Equivalent Derivations
     Are derivations that produce the same string always equivalent? Let's look at two derivations of abab.
     Example 4:
         S ⇒ AB ⇒ aB ⇒ aBA ⇒ abA ⇒ abAB ⇒ abaB ⇒ abab
     Example 5:
         S ⇒ AB ⇒ ABB ⇒ aBB ⇒ aBAB ⇒ abAB ⇒ abaB ⇒ abab
     Both these derivations are left-to-right, and they produce the same string. Is there any difference between them?
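One way to compare derivations is to rebuild the tree each one describes and compare the trees. The sketch below is an illustration under stated assumptions, not the lecture's method: `tree_of` and `bracket` are hypothetical helper names, trees are written as bracketed strings, and each derivation is given as its list of sentential forms with no repeated form. For every step it looks for the single position and rule that explain the change.

```python
RULES = {"S": ["AB"], "A": ["AB", "a"], "B": ["BA", "b"]}


def bracket(node):
    """Render a node [symbol, children] as a bracketed string."""
    symbol, children = node
    if not children:
        return symbol
    return "[" + symbol + " " + " ".join(bracket(c) for c in children) + "]"


def tree_of(derivation):
    """Rebuild the tree behind a derivation (a list of sentential forms)."""
    root = [derivation[0], []]
    frontier = [root]  # leaf nodes, left to right, aligned with the string
    for before, after in zip(derivation, derivation[1:]):
        # Find the (position, rhs) pair that turns `before` into `after`.
        matches = [
            (i, rhs)
            for i, sym in enumerate(before)
            for rhs in RULES.get(sym, [])
            if before[:i] + rhs + before[i + 1:] == after
        ]
        if len(matches) != 1:
            raise ValueError(f"cannot explain step {before} => {after}")
        i, rhs = matches[0]
        children = [[sym, []] for sym in rhs]
        frontier[i][1] = children       # attach children to the rewritten node
        frontier[i:i + 1] = children    # and replace it in the frontier
    return bracket(root)


LEFTMOST = "S AB ABB aBB aBAB abAB abaB abaBA ababA ababAB ababaB ababab".split()
RIGHTMOST = "S AB ABA ABAB ABAb ABab Abab ABbab ABAbab ABabab Ababab ababab".split()
RANDOM = "S AB ABB ABAB ABaB AbaB AbaBA AbabA AbabAB AbabaB Ababab ababab".split()

# The two left-to-right derivations of abab.
ABAB_FIRST = "S AB aB aBA abA abAB abaB abab".split()
ABAB_SECOND = "S AB ABB aBB aBAB abAB abaB abab".split()

print(tree_of(LEFTMOST) == tree_of(RIGHTMOST) == tree_of(RANDOM))
print(tree_of(ABAB_FIRST))
print(tree_of(ABAB_SECOND))
```

The three derivations of ababab collapse to a single tree, while the two derivations of abab give different trees even though both rewrite left to right.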

  3. Non-Equivalent Derivations
     Tree diagrams make clear that the terminal symbols in the string "abab" come from different NTs.
         [Two tree diagrams for abab, one per derivation, each rooted in S.]
     When a string has more than one structural analysis with respect to a grammar, it is called ambiguous with respect to that grammar. Ambiguity is one of the key ideas in Inf2A.
     Ambiguity is not a property of the order of phrase-structure rule applications: order is still irrelevant.

     Quick in-class exercise
     Consider the toy CFG with non-terminals { S, NP, VP, Adj, N, V }, terminals { fish, police, scots } and PS rules:
         S → NP VP
         NP → Adj N | N
         VP → V | V NP
         Adj → scots
         N → fish | police | scots
         V → fish | police
     Does this grammar produce ambiguous strings? Through what derivations?

     Derivations in Context-Sensitive Grammars
     With context-sensitive grammars, the order of rule applications can matter. There are two ways of looking at this:
     1 Different canonical orders of rule applications can produce different string sets: grammars restricted to different orders can produce different languages.
     2 Given the sequence of rule applications that produces a string σ1, a different sequence may produce a different string σ2.
     Consider part of a toy CSG with non-terminals { S, W, X, Y, Z } and terminals { a, b, r, s, t }:
         S → aXbY
         Xb → ZWb
         XbY → Xbrs
         WbY → Wbst
     Example 6:
         S ⇒ aXbY ⇒ aZWbY ⇒ aZWbst
     Example 7:
         S ⇒ aXbY ⇒ aXbrs ⇒ aZWbrs
     In a left-to-right derivation (Example 6), only the second production for Y can be applied.
     In a right-to-left derivation (Example 7), only the first production for Y can be applied.
     With CSGs, but not CFGs, derivation order is significant.
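The effect of rule order in the toy CSG can be checked by exhaustive string rewriting. This is a rough sketch under assumptions: the helper names `successors` and `dead_ends` are invented here, rules are plain substring replacements, and only the four productions listed on the slide are used (so Z and W are never rewritten). It collects every form reachable from S to which no listed rule applies any longer.

```python
# The four productions shown for the toy CSG, as (lhs, rhs) string pairs.
CSG_RULES = [("S", "aXbY"), ("Xb", "ZWb"), ("XbY", "Xbrs"), ("WbY", "Wbst")]


def successors(form):
    """All strings reachable from `form` by a single rule application."""
    out = set()
    for lhs, rhs in CSG_RULES:
        start = form.find(lhs)
        while start != -1:
            out.add(form[:start] + rhs + form[start + len(lhs):])
            start = form.find(lhs, start + 1)
    return out


def dead_ends(start):
    """Reachable forms to which none of the listed rules applies."""
    seen, todo, finished = {start}, [start], set()
    while todo:
        form = todo.pop()
        nxt = successors(form)
        if not nxt:
            finished.add(form)
        for s in nxt - seen:
            seen.add(s)
            todo.append(s)
    return finished


print(sorted(dead_ends("S")))
```

From aXbY two rules compete for the same context: rewriting Xb first removes the context XbY, and rewriting XbY first removes the context WbY. The search therefore ends in two different strings, one per order.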

  4. Normal Forms
     There are two canonical (aka normal) forms for PS rules.
     Chomsky Normal Form: all productions are of the form
         A → BC
         A → a
     where A, B, C are NT symbols and a is a terminal symbol.
     Greibach Normal Form: all productions are of the form
         A → a B1 B2 ... Bk    (k ≥ 0)
     where A, B1, ..., Bk are NT symbols and a is a terminal symbol.
     The basic CKY parser (Week 6) assumes all production rules are in Chomsky Normal Form. Other efficient parsers use an extended version of Greibach Normal Form.

     Converting to Chomsky Normal Form
     Recall the simple CFG used for generating L1:
         V = { a, b, S }
         Σ = S
         S → aSb
         S → ab
     Convert this to Chomsky Normal Form by:
     1 Adding a new non-terminal symbol for each terminal symbol:
         A → a
         B → b
     2 Replacing terminal symbols in the original rules with these new non-terminals:
         S → ASB
         S → AB
     3 For any rule with k > 2 non-terminal symbols on the RHS, adding a new non-terminal that rewrites as the final k − 1 symbols on the RHS:
         S → AC
         C → SB
     4 Continuing to introduce such non-terminals until there is no rule whose RHS has more than two non-terminals.
     Chomsky Normal Form grammar for L1:
         S → AC    C → SB
         S → AB    A → a    B → b
     Kozen gives a proof that the two grammars produce the same string set. Do they assign the strings the same structure?

     Summary
     Derivation: sequence of strings produced by applications of grammar rules; can be left-most or right-most.
     Tree structure diagram: graphs the structure of a string independent of the derivation order.
     Ambiguity: a string can have more than one structure in a given grammar.
     Normal form: standardized form for grammar rules; Chomsky and Greibach normal forms are the most important.
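The four conversion steps can be sketched in a few lines. The code below is an illustration under assumptions rather than a general algorithm: the function name `to_cnf` is invented, symbols are single characters, each terminal's new non-terminal is simply its upper-case letter, fresh non-terminals are taken from the unused capital letters, and the input has no empty or unit productions (as is the case for the L1 grammar).

```python
import string


def to_cnf(rules, terminals):
    """Convert rules [(lhs, rhs), ...] to Chomsky Normal Form (sketch)."""
    # Step 1: one new non-terminal per terminal (assumed: its upper-case form).
    proxy = {t: t.upper() for t in sorted(terminals)}
    used = {lhs for lhs, _ in rules} | set(proxy.values())
    fresh = (c for c in string.ascii_uppercase if c not in used)

    # Step 2: replace terminals inside longer right-hand sides.
    work = [
        (lhs, rhs if len(rhs) == 1 else "".join(proxy.get(s, s) for s in rhs))
        for lhs, rhs in rules
    ]
    out = [(nt, t) for t, nt in proxy.items()]

    # Steps 3 and 4: split any RHS longer than two symbols until none remain.
    while work:
        lhs, rhs = work.pop(0)
        if len(rhs) > 2:
            new = next(fresh)
            out.append((lhs, rhs[0] + new))   # keep the first symbol
            work.append((new, rhs[1:]))       # new NT covers the final k - 1
        else:
            out.append((lhs, rhs))
    return out


cnf = to_cnf([("S", "aSb"), ("S", "ab")], {"a", "b"})
for lhs, rhs in cnf:
    print(lhs, "->", rhs)
```

Because a split-off rule goes back onto the work list, a right-hand side of any length is reduced one symbol at a time, which is the "continue until no rule has more than two non-terminals" part of the procedure.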
