
Pumping and Ogden Properties of Multiple Context-Free Grammars

Makoto Kanazawa, National Institute of Informatics and SOKENDAI, Japan


  1. Pumping and Ogden Properties of Multiple Context-Free Grammars (slide 2)
     Makoto Kanazawa, National Institute of Informatics and SOKENDAI, Japan

     Notes (brief biography):
     • 1994: Ph.D. in Linguistics, Stanford University
     • 1994: Chiba University, Faculty of Letters, Department of Behavioral Sciences
     • 2000: University of Tokyo, Interfaculty Initiative in Information Studies
     • 2004: National Institute of Informatics
     • 2018: Hosei University, Faculty of Science and Engineering, Department of Advanced Sciences

     Multiple Context-Free Grammars (slide 3)
     • Introduced by Seki, Matsumura, Fujii, and Kasami (1987–1991)
     • Independently by Vijay-Shanker, Weir, and Joshi (1987)
     • Many equivalent models
     • Often thought to be an adequate formalization of mildly context-sensitive grammars (Joshi 1985)

     Notes: MCFGs arose from concerns in computational linguistics. CFGs are almost good enough for natural-language grammars, but not quite; a mild extension of CFGs is needed. Several criteria were put forward as to what constitutes a "mild" extension.

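  2. Context-Free Grammars (slide 4)
     A production has the form $A \to w_0 B_1 w_1 \dots B_n w_n$ with $B_i \in N$ and $w_j \in \Sigma^*$.
     Top-down derivation:
     • $S \Rightarrow_G^* S$
     • if $S \Rightarrow_G^* \beta A \gamma$ and $A \to \alpha \in P$, then $S \Rightarrow_G^* \beta \alpha \gamma$
     $L(G) = \{ w \in \Sigma^* \mid S \Rightarrow_G^* w \}$

     Bottom-Up Interpretation (slide 5)
     If $B_i \Rightarrow_G^* v_i$ for $i = 1, \dots, n$ and $A \to w_0 B_1 w_1 \dots B_n w_n \in P$, then $A \Rightarrow_G^* w_0 v_1 w_1 \dots v_n w_n$.
     $L(G) = \{ w \in \Sigma^* \mid S \Rightarrow_G^* w \}$

     CFGs as Logic Programs on Strings (slide 6)
     The production $A \to w_0 B_1 w_1 \dots B_n w_n$ corresponds to the Horn clause
     $A(w_0 x_1 w_1 \dots x_n w_n) \leftarrow B_1(x_1), \dots, B_n(x_n)$.
     $L(G) = \{ w \in \Sigma^* \mid G \vdash S(w) \}$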
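
The bottom-up, Horn-clause reading of a CFG can be sketched as forward chaining over derivable facts. The toy grammar (S → a S b | ε), the rule encoding, and the names `RULES` and `derivable` are assumptions made for illustration; none of them appear in the slides.

```python
from itertools import product

# Hypothetical toy grammar, not from the slides: S -> a S b | epsilon.
# A rule is (head, [w0, B1, w1, ..., Bn, wn]); terminal strings sit at even
# positions and nonterminals at odd positions.
RULES = [
    ("S", ["a", "S", "b"]),   # S(a x1 b) <- S(x1)
    ("S", [""]),              # S(epsilon) <-
]

def derivable(rules, max_len):
    """Forward-chain the Horn clauses; keep only strings of length <= max_len."""
    facts = {head: set() for head, _ in rules}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            terminals = body[0::2]      # w0, w1, ..., wn
            nonterminals = body[1::2]   # B1, ..., Bn
            # Try every combination of already derived strings v1, ..., vn.
            for args in product(*(sorted(facts[b]) for b in nonterminals)):
                s = terminals[0] + "".join(v + w for v, w in zip(args, terminals[1:]))
                if len(s) <= max_len and s not in facts[head]:
                    facts[head].add(s)
                    changed = True
    return facts

print(sorted(derivable(RULES, 6)["S"], key=len))
```

The length bound only keeps the fixpoint computation finite; the language itself is the set of all `w` with a derivable fact `S(w)`.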

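  3. Multiple Context-Free Grammars (slide 7)
     Rules have the form
     $A(\alpha_1, \dots, \alpha_q) \leftarrow B_1(x_{1,1}, \dots, x_{1,q_1}), \dots, B_n(x_{n,1}, \dots, x_{n,q_n})$
     where $n \ge 0$, $q, q_i \ge 1$, $\alpha_k \in (\Sigma \cup \{ x_{i,j} \mid i \in [1,n],\ j \in [1,q_i] \})^*$, and each $x_{i,j}$ occurs exactly once in $(\alpha_1, \dots, \alpha_q)$.
     • $q = \dim(A)$ (dimension of $A$)
     • $\dim(S) = 1$
     • $L(G) = \{ w \in \Sigma^* \mid G \vdash S(w) \}$
     Notes: It's best to think of an MCFG as a kind of logic program. Each rule is a definite clause. Nonterminals are predicates on strings.

     Example: a 2-MCFG with 2-ary branching (slide 8)
     $S(x_1 \# x_2) \leftarrow D(x_1, x_2)$
     $D(\varepsilon, \varepsilon) \leftarrow$
     $D(x_1 y_1, y_2 x_2) \leftarrow E(x_1, x_2), D(y_1, y_2)$
     $E(a x_1 \bar{a}, \bar{a} x_2 a) \leftarrow D(x_1, x_2)$
     Language: $\{ w \# w^R \mid w \in D_1^* \}$
     Derivation tree:
     $S(aa\bar{a}\bar{a}a\bar{a} \# \bar{a}a\bar{a}\bar{a}aa)$
       $D(aa\bar{a}\bar{a}a\bar{a},\ \bar{a}a\bar{a}\bar{a}aa)$
         $E(aa\bar{a}\bar{a},\ \bar{a}\bar{a}aa)$
           $D(a\bar{a},\ \bar{a}a)$
             $E(a\bar{a},\ \bar{a}a)$
               $D(\varepsilon, \varepsilon)$
             $D(\varepsilon, \varepsilon)$
         $D(a\bar{a},\ \bar{a}a)$
           $E(a\bar{a},\ \bar{a}a)$
             $D(\varepsilon, \varepsilon)$
           $D(\varepsilon, \varepsilon)$
     Notes: An m-MCFG is an MCFG whose nonterminal dimensions do not exceed m. A 1-MCFG is a CFG. A derivation tree for w is a proof of S(w).

     Example: a non-branching m-MCFG (slide 9)
     $S(x_1 \dots x_m) \leftarrow A(x_1, \dots, x_m)$
     $A(\varepsilon, \dots, \varepsilon) \leftarrow$
     $A(a_1 x_1 a_2, \dots, a_{2m-1} x_m a_{2m}) \leftarrow A(x_1, \dots, x_m)$
     Language: $\{ a_1^n a_2^n \dots a_{2m-1}^n a_{2m}^n \mid n \ge 0 \}$, which is in m-MCFL but not in (m-1)-MCFL (Seki et al. 1991).
     Hierarchy: 1-MCFL = CFL $\subsetneq$ 2-MCFL $\subsetneq \dots \subsetneq$ (m-1)-MCFL $\subsetneq$ m-MCFL
     Notes: The languages of MCFGs form an infinite hierarchy.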
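
The derivation tree on slide 8 can be replayed mechanically by reading each rule as a function on tuples of strings. This is a minimal sketch: the function names are hypothetical, and the capital letter `A` is an assumed ASCII stand-in for the barred symbol ā.

```python
# Each MCFG rule becomes a function from the tuples proved for the body
# to the tuple proved for the head. "A" stands in for the barred a (assumption).

def d_empty():
    # D(epsilon, epsilon) <-
    return ("", "")

def d_concat(e, d):
    # D(x1 y1, y2 x2) <- E(x1, x2), D(y1, y2)
    (x1, x2), (y1, y2) = e, d
    return (x1 + y1, y2 + x2)

def e_wrap(d):
    # E(a x1 A, A x2 a) <- D(x1, x2)
    x1, x2 = d
    return ("a" + x1 + "A", "A" + x2 + "a")

def s_start(d):
    # S(x1 # x2) <- D(x1, x2)
    x1, x2 = d
    return x1 + "#" + x2

# Replay the slide's derivation tree from the leaves upward.
d0 = d_empty()                 # D(epsilon, epsilon)
d1 = d_concat(e_wrap(d0), d0)  # D(aA, Aa)
e2 = e_wrap(d1)                # E(aaAA, AAaa)
d2 = d_concat(e2, d1)          # D(aaAAaA, AaAAaa)
word = s_start(d2)
print(word)
```

Because the second component is always built in mirror order, the part after `#` is the reverse of the part before it, which is what the language description $w \# w^R$ says.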

  4. Chomsky Hierarchy (slide 10)
     Rewriting Grammars | Machines | Logic Programs on Strings | Languages
     Unrestricted | Turing | Elementary Formal Systems (Smullyan 1961) | r.e.
     Context-Sensitive | LBA | Length-Bounded EFS (Arikawa et al. 1989) | CSL = NSPACE(n)
     - | Poly-time Turing | Simple LMG (Groenink 1997) / Hereditary EFS (Ikeda and Arimura 1997) | P
     - | - | MCFG | MCFL
     Context-Free | PDA | Simple EFS (Arikawa 1970) | CFL
     Right-Linear | FA | - | Reg

     Which Properties of CFGs Are Shared by / Generalize to MCFGs? (slide 11)
     • Membership in LOGCFL
     • Semilinearity
     • …

     Pumping (slide 12)
     $S(x_1 \# x_2) \leftarrow D(x_1, x_2)$
     $D(\varepsilon, \varepsilon) \leftarrow$
     $D(x_1 y_1, y_2 x_2) \leftarrow E(x_1, x_2), D(y_1, y_2)$
     $E(a x_1 \bar{a}, \bar{a} x_2 a) \leftarrow D(x_1, x_2)$
     Derivation tree:
     $S(aa\bar{a}\bar{a}a\bar{a} \# \bar{a}a\bar{a}\bar{a}aa)$
       $D(aa\bar{a}\bar{a}a\bar{a},\ \bar{a}a\bar{a}\bar{a}aa)$
         $E(aa\bar{a}\bar{a},\ \bar{a}\bar{a}aa)$
           $D(a\bar{a},\ \bar{a}a)$
             $E(a\bar{a},\ \bar{a}a)$
               $D(\varepsilon, \varepsilon)$
             $D(\varepsilon, \varepsilon)$
         $D(a\bar{a},\ \bar{a}a)$
           $E(a\bar{a},\ \bar{a}a)$
             $D(\varepsilon, \varepsilon)$
           $D(\varepsilon, \varepsilon)$
     Notes: Derivation trees of MCFGs are similar to those of CFGs. When the same nonterminal occurs twice on the same path of a derivation tree,…

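  5. Pumping, continued (slide 13)
     Top part:
     $S(y_1 \# y_2)$
       $D(y_1, y_2)$
     Middle part:
     $D(a x_1 \bar{a} a \bar{a},\ \bar{a} a \bar{a} x_2 a)$
       $E(a x_1 \bar{a},\ \bar{a} x_2 a)$
         $D(x_1, x_2)$
       $D(a\bar{a},\ \bar{a}a)$
         $E(a\bar{a},\ \bar{a}a)$
           $D(\varepsilon, \varepsilon)$
         $D(\varepsilon, \varepsilon)$
     Bottom part:
     $D(a\bar{a},\ \bar{a}a)$
       $E(a\bar{a},\ \bar{a}a)$
         $D(\varepsilon, \varepsilon)$
       $D(\varepsilon, \varepsilon)$
     Result: $a^n a \bar{a} (\bar{a} a \bar{a})^n \# (\bar{a} a \bar{a})^n \bar{a} a a^n \in L(G)$ for all $n \ge 0$.
     Notes: You can decompose the derivation tree into three parts, and the middle part can be iterated any number of times, including zero times. In the overall derivation tree, the variables $x_1, x_2, y_1, y_2$ are instantiated by … The number of iterated substrings (factors) is larger than two (here, four).

     Iterative Properties (slide 14)
     $L$ is $k$-iterative iff
     $\exists p\, \forall z \in L\, (|z| \ge p \Rightarrow \exists u_1 \dots u_{k+1}\, v_1 \dots v_k\, (z = u_1 v_1 \dots u_k v_k u_{k+1} \wedge v_1 \dots v_k \ne \varepsilon \wedge \forall n \ge 0\, (u_1 v_1^n \dots u_k v_k^n u_{k+1} \in L)))$
     • $L \in$ CFL $\Rightarrow$ $L$ is 2-iterative
     • $L \in$ m-MCFL $\Rightarrow$ $L$ is 2m-iterative? (wrong claim in 1991)
     Notes: For MCFGs, we need to consider a generalized form of the condition of the pumping lemma. This is not straightforward; it was an open question for a long time.

     Difficulty with Pumping (slide 15)
     One copy of the middle part:
     $A(v_1 x_1 v_2,\ v_3 x_2 v_4)$
       $A(x_1, x_2)$
     Two copies stacked:
     $A(v_1^2 x_1 v_2^2,\ v_3^2 x_2 v_4^2)$
       $A(v_1 x_1 v_2,\ v_3 x_2 v_4)$
         $A(x_1, x_2)$
     Notes: The middle part of the derivation tree may look like this.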
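
The three-part decomposition on slide 13 can be checked by treating the middle part as a function that is applied n times between the bottom and top parts. This is a sketch under stated assumptions: `A` is an ASCII stand-in for ā, all function names are hypothetical, and the membership test assumes that $D_1^*$ denotes the balanced strings over a and ā.

```python
def context(d):
    # Middle part of the tree: D(x1, x2) |-> D(a x1 A aA, AaA x2 a).
    x1, x2 = d
    return ("a" + x1 + "A" + "aA", "AaA" + x2 + "a")

def pumped(n):
    d = ("aA", "Aa")           # bottom part proves D(aA, Aa)
    for _ in range(n):         # iterate the middle part n times
        d = context(d)
    return d[0] + "#" + d[1]   # top part: S(y1 # y2) <- D(y1, y2)

def closed_form(n):
    # a^n aA (AaA)^n # (AaA)^n Aa a^n, the string stated on the slide.
    return "a" * n + "aA" + "AaA" * n + "#" + "AaA" * n + "Aa" + "a" * n

def is_balanced(w):
    depth = 0
    for ch in w:
        if ch == "a":
            depth += 1
        elif ch == "A":
            depth -= 1
        else:
            return False
        if depth < 0:
            return False
    return depth == 0

def in_language(z):
    # Membership in { w # w^R | w balanced over a, A }.
    parts = z.split("#")
    return len(parts) == 2 and parts[1] == parts[0][::-1] and is_balanced(parts[0])

for n in range(4):
    print(n, pumped(n), in_language(pumped(n)))
```

With n = 1 the function reproduces the original string of the derivation tree; n = 0 removes the middle part altogether.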

  6. Difficulty with Pumping, continued (slide 16)
     One copy of the middle part:
     $A(v_1 x_1 v_2 x_2 v_3,\ v_4)$
       $A(x_1, x_2)$
     Two copies stacked ("uneven pump"):
     $A(v_1^2 x_1 v_2 x_2 v_3 v_2 v_4 v_3,\ v_4)$
       $A(v_1 x_1 v_2 x_2 v_3,\ v_4)$
         $A(x_1, x_2)$
     Notes: Or like this.

     The pumping lemma fails for 3-MCFGs (slide 17)
     $S(x_1 \# x_2 \# x_3) \leftarrow A(x_1, x_2, x_3)$
     $A(a x_1,\ y_1 c x_2 \bar{c} d y_2 \bar{d} x_3,\ y_3 b) \leftarrow A(x_1, x_2, x_3), A(y_1, y_2, y_3)$
     $A(a, \varepsilon, b) \leftarrow$
     The language of this 3-MCFG is not k-iterative for any k (Kanazawa et al. 2014).
     Diagram: the hierarchy 1-MCFL = CFL, 2-MCFL, 3-MCFL, …, m-MCFL.

     Pumping Lemma for Subclasses (slide 18)
     Diagram: the hierarchy 1-MCFL = CFL, 2-MCFL, …, m-MCFL next to the well-nested hierarchy 1-MCFL$_{wn}$ = CFL, 2-MCFL$_{wn}$, …, m-MCFL$_{wn}$.
     • m-MCFL$_{wn}$ (well-nested MCFGs): 2m-iterative
     • 2-MCFL: 4-iterative
     Kanazawa 2009
     Notes: Pumping is possible for special cases, such as well-nested MCFGs.
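
To see how the binary rule of the 3-MCFG on slide 17 interleaves the components of its two arguments, the rules can be written as functions on string triples. This sketch only builds a few small members of the language and makes no attempt to demonstrate the failure of pumping; the function names are hypothetical, and `C` and `D` are assumed ASCII stand-ins for the barred symbols c̄ and d̄.

```python
def a_base():
    # A(a, epsilon, b) <-
    return ("a", "", "b")

def a_branch(x, y):
    # A(a x1, y1 c x2 C d y2 D x3, y3 b) <- A(x1, x2, x3), A(y1, y2, y3)
    x1, x2, x3 = x
    y1, y2, y3 = y
    return ("a" + x1, y1 + "c" + x2 + "C" + "d" + y2 + "D" + x3, y3 + "b")

def s_start(t):
    # S(x1 # x2 # x3) <- A(x1, x2, x3)
    return "#".join(t)

smallest = s_start(a_base())
one_step = s_start(a_branch(a_base(), a_base()))
print(smallest)
print(one_step)
```

The middle component mixes material from both subtrees (`y1`, `x2`, `y2`, `x3`), which is the kind of interaction that a simple "repeat this subtree" argument has to cope with.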

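  7. Well-Nestedness (slide 19)
     Well-nested, language $\{ w \# w^R \mid w \in D_1^* \}$:
     $S(x_1 \# x_2) \leftarrow D(x_1, x_2)$
     $D(\varepsilon, \varepsilon) \leftarrow$
     $D(x_1 y_1, y_2 x_2) \leftarrow E(x_1, x_2), D(y_1, y_2)$
     $E(a x_1 \bar{a}, \bar{a} x_2 a) \leftarrow D(x_1, x_2)$
     Non-well-nested, language $\{ w \# w \mid w \in D_1^* \}$:
     $S(x_1 \# x_2) \leftarrow D(x_1, x_2)$
     $D(\varepsilon, \varepsilon) \leftarrow$
     $D(x_1 y_1, x_2 y_2) \leftarrow E(x_1, x_2), D(y_1, y_2)$
     $E(a x_1 \bar{a}, a x_2 \bar{a}) \leftarrow D(x_1, x_2)$
     $\{ w \# w \mid w \in D_1^* \} \notin$ MCFL$_{wn}$ (Kanazawa and Salvati 2010)
     Notes: Well-nestedness has a natural equivalent characterization: yCFT$_{sp}$.

     Difficulty with Pumping (slide 20)
     One copy of the middle part:
     $A(v_1 x_1 v_2 x_2 v_3,\ v_4)$
       $A(x_1, x_2)$
     Two copies stacked ("uneven pump"):
     $A(v_1^2 x_1 v_2 x_2 v_3 v_2 v_4 v_3,\ v_4)$
       $A(v_1 x_1 v_2 x_2 v_3,\ v_4)$
         $A(x_1, x_2)$
     Notes: Pumping is not easy to prove even for well-nested MCFGs: this situation can still arise.

     A very simple example (slide 21)
     $S(x_1 x_2) \leftarrow A(x_1, x_2)$
     $A(a x_1 b x_2 c,\ d) \leftarrow A(x_1, x_2)$
     $A(\varepsilon, \varepsilon) \leftarrow$
     The grammar is non-branching, and non-branching $\subseteq$ well-nested.
     Derivations, where i is the number of uses of the second rule:
     • i = 0: $S(\varepsilon)$ from $A(\varepsilon, \varepsilon)$
     • i = 1: $S(abcd)$ from $A(abc, d)$, $A(\varepsilon, \varepsilon)$
     • i = 2: $S(aabcbdcd)$ from $A(aabcbdc, d)$, $A(abc, d)$, $A(\varepsilon, \varepsilon)$
     • i = 3: $S(aaabcbdcbdcd)$ from $A(aaabcbdcbdc, d)$, $A(aabcbdc, d)$, $A(abc, d)$, $A(\varepsilon, \varepsilon)$
     Language: $\{ \varepsilon \} \cup \{ a^{i-1} abc (bdc)^{i-1} d \mid i \ge 1 \}$
     Notes: The only choice you can make is the number of times you use the second rule. The language is actually 2-iterative, but there is no straightforward connection between the iterated substrings and parts of derivation trees.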
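
The simple example on slide 21 is easy to check by machine: applying the second rule i times should give the closed form stated for the language. The sketch below also includes one possible 2-pump decomposition (factors `a` and `bdc`); that decomposition is chosen here for illustration and is not taken from the slides, and all function names are hypothetical.

```python
def a_rule(pair):
    # A(a x1 b x2 c, d) <- A(x1, x2)
    x1, x2 = pair
    return ("a" + x1 + "b" + x2 + "c", "d")

def derive(i):
    pair = ("", "")            # A(epsilon, epsilon) <-
    for _ in range(i):         # use the second rule i times
        pair = a_rule(pair)
    return pair[0] + pair[1]   # S(x1 x2) <- A(x1, x2)

def closed_form(i):
    # epsilon for i = 0, otherwise a^(i-1) abc (bdc)^(i-1) d
    return "" if i == 0 else "a" * (i - 1) + "abc" + "bdc" * (i - 1) + "d"

def pump(i, n):
    # Illustrative decomposition of derive(i), i >= 2, as u1 v1 u2 v2 u3
    # with v1 = "a" and v2 = "bdc", pumped n times.
    u1, v1 = "", "a"
    u2 = "a" * (i - 2) + "abc" + "bdc" * (i - 2)
    v2, u3 = "bdc", "d"
    return u1 + v1 * n + u2 + v2 * n + u3

for i in range(4):
    print(i, repr(derive(i)))
```

Note that the pumped factors `a` and `bdc` do not line up with a single repeated context of the derivation tree: `bdc` is assembled from the `b` and `c` of one rule application and the `d` passed up from the application below it.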

  8. Even m-pumps (slide 22)
     An even m-pump is a part of a derivation tree of the form
     $B(v_1 x_1 v_2, \dots, v_{2m-1} x_m v_{2m})$
       $B(x_1, \dots, x_m)$
     • If G is a well-nested m-MCFG, { T | T is a derivation tree of G without even m-pumps } may not be finite.
     • But there is a well-nested (m-1)-MCFG generating { yield(T) | T is a derivation tree of G without even m-pumps }.
     Notes: If the derivation tree contains an even m-pump, the string is 2m-pumpable. Otherwise, the string is in the language of some well-nested (m-1)-MCFG, and therefore is 2(m-1)-pumpable (disregarding finitely many exceptions). The proof is by induction on m.

     Pumping Lemma for Subclasses (slide 23)
     Diagram: the hierarchy 1-MCFL = CFL, 2-MCFL, …, m-MCFL next to the well-nested hierarchy 1-MCFL$_{wn}$ = CFL, 2-MCFL$_{wn}$, …, m-MCFL$_{wn}$.
     • m-MCFL$_{wn}$: 2m-iterative
     • 2-MCFL: 4-iterative
     Kanazawa 2009, by grammar splitting and transformation.
     What about Ogden's Lemma?
     Notes: My proof of the pumping lemma for m-MCFL$_{wn}$ and 2-MCFL is not straightforward.

     Ogden's Lemma for CFL (slide 24)
     $L \in$ CFL $\Rightarrow \exists p\, \forall z \in L$ (at least $p$ positions of $z$ are marked $\Rightarrow \exists u_1 u_2 u_3 v_1 v_2$ ($z = u_1 v_1 u_2 v_2 u_3$ $\wedge$ ($u_1, v_1, u_2$ each contain a marked position $\vee$ $u_2, v_2, u_3$ each contain a marked position) $\wedge$ $v_1 u_2 v_2$ contains no more than $p$ marked positions $\wedge$ $\forall n \ge 0\, (u_1 v_1^n u_2 v_2^n u_3 \in L)$))
     Ogden 1968
     Notes: Ogden's lemma can be used to show inherent ambiguity of some CFLs, e.g., $\{ a^m b^n c^p \mid m = n \vee n = p \}$.

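  9. The weak Ogden property (slide 25)
     $L$ has the weak Ogden property iff
     $\exists p\, \forall z \in L$ (at least $p$ positions of $z$ are marked $\Rightarrow \exists k \ge 1\, \exists u_1 \dots u_{k+1}\, v_1 \dots v_k$ ($z = u_1 v_1 \dots u_k v_k u_{k+1}$ $\wedge$ $\exists i$ ($v_i$ contains a marked position) $\wedge$ $\forall n \ge 0\, (u_1 v_1^n \dots u_k v_k^n u_{k+1} \in L)$))
     Notes: There are various ways of generalizing Ogden's lemma suitable for MCFGs. At least this much should be implied.

     The Failure of Ogden's Lemma (slide 26)
     Diagram: the hierarchy 1-MCFL = CFL, 2-MCFL, …, m-MCFL next to the well-nested hierarchy 1-MCFL$_{wn}$ = CFL, 2-MCFL$_{wn}$, 3-MCFL$_{wn}$, …, m-MCFL$_{wn}$.
     • m-MCFL$_{wn}$: 2m-iterative
     • 3-MCFL$_{wn}$: 6-iterative
     • 2-MCFL: 4-iterative
     The weak Ogden property fails for 3-MCFL$_{wn}$ and 2-MCFL.
     Notes: This is the first new result in this talk.

     A language for which the weak Ogden property fails (slide 27)
     $\{ a^{i_1} b^{i_0} \$ a^{i_2} b^{i_1} \$ a^{i_3} b^{i_2} \$ \dots \$ a^{i_n} b^{i_{n-1}} \mid n \ge 3,\ i_0, \dots, i_n \ge 0 \}$
     Diagram: the language is shown relative to the classes 3-MCFL$_{wn}$, 2-MCFL, 2-MCFL$_{wn}$, and CFL.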
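
The language on slide 27 has a simple shape: at least three blocks separated by `$`, each of the form a…ab…b, where the number of a's in one block equals the number of b's in the next. A membership test makes this chaining constraint explicit. The function name is hypothetical, and the sketch says nothing about why the weak Ogden property fails; it only tests membership.

```python
import re

def in_language(z):
    """Membership in { a^i1 b^i0 $ a^i2 b^i1 $ ... $ a^in b^i(n-1) | n >= 3 }."""
    blocks = z.split("$")
    if len(blocks) < 3:                     # the definition requires n >= 3
        return False
    counts = []
    for block in blocks:
        m = re.fullmatch(r"(a*)(b*)", block)
        if m is None:                       # block is not of the form a^j b^k
            return False
        counts.append((len(m.group(1)), len(m.group(2))))
    # a-count of block k must equal b-count of block k+1
    return all(counts[k][0] == counts[k + 1][1] for k in range(len(counts) - 1))

# Example with i0 = 1, i1 = 2, i2 = 0, i3 = 1: blocks a^2 b^1, a^0 b^2, a^1 b^0
print(in_language("aab$bb$a"))
```

Changing the number of a's in one block forces a matching change to the b's of the following block, so neighbouring blocks cannot be altered independently.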
