Why “working” with TAG? Formal reasons

Hypothesis of the adequacy of expressive power: TAG provides exactly the expressive power needed to treat NL. (The complexity of a language is determined by the weakest formal grammar that generates it.)

Why is the formal complexity of natural languages interesting? It allows one to gain insights into
⇒ the general structure of natural language
⇒ the general human language capacity
⇒ the adequacy of grammar formalisms
⇒ a lower bound on the computational complexity of NLP tasks

Kallmeyer, Lichte, Osswald & Petitjean (HHU Düsseldorf)

Expressive power is stated in terms of a specific generative capacity:
- Weak generative capacity (WGC): the capacity to generate string languages.
- Strong generative capacity (SGC): the capacity to generate tree languages.
- Derivational generative capacity (DGC): the capacity to generate string languages in a certain way.

In what follows we will consider the weak generative capacity.

How much expressive power do we need to treat NL?

The Chomsky(-Schützenberger) hierarchy (Chomsky & Schützenberger 1963), extended by the mildly context-sensitive class and listed from the largest class to the smallest, with example languages and the associated formalisms and automata:
- type 0, recursively enumerable: $a^{f(n)}$; HPSG, TG, TM
- type 1, context-sensitive: $a^{2^n}$, $a^n b^n c^n \ldots$, $W^k$, $W(\#W)^k$; LFG, LBA
- mildly context-sensitive (Joshi 1985): $a^n b^m c^n d^m$, $WW$, $W\#W$; TAG, EPDA
- type 2, context-free: $a^n b^m c^m d^n$, $WW^R$, $W\#W^R$; CFG, PDA
- type 3, regular: $a^n b^m c^k d^l$; FSA

Where does NL belong?
- NL is not regular! (Chomsky 1956; 1957) Center embedding with relative clauses: $n_1 n_2 n_3 v_3 v_2 v_1$
- NL is not context-free! (Shieber 1985) Cross-serial dependencies in Dutch and Swiss German: $n_1 n_2 n_3 v_1 v_2 v_3$
- NL is context-sensitive?
- NL is mildly context-sensitive? (Joshi 1985) This class contains the context-free languages (⊃ CFL), covers cross-serial dependencies, is semi-linear, and is in PTIME.
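
To make the example languages in the hierarchy concrete, here is a minimal Python sketch of membership tests for the nested pattern $a^n b^m c^m d^n$ and the crossed pattern $a^n b^m c^n d^m$, and for $WW^R$ versus $WW$. The function names are illustrative assumptions, not part of the slides. The functions only decide membership for a given string; they say nothing about which grammar class is needed to generate the language.

```python
import re

def _blocks(s):
    """Split s into its a-, b-, c- and d-blocks, or return None if s is not in a*b*c*d*."""
    m = re.fullmatch(r"(a*)(b*)(c*)(d*)", s)
    return None if m is None else [len(g) for g in m.groups()]

def is_nested(s):
    """Membership in a^n b^m c^m d^n (nested dependencies, context-free)."""
    k = _blocks(s)
    return k is not None and k[0] == k[3] and k[1] == k[2]

def is_crossed(s):
    """Membership in a^n b^m c^n d^m (cross-serial dependencies)."""
    k = _blocks(s)
    return k is not None and k[0] == k[2] and k[1] == k[3]

def is_mirror(s):
    """Membership in W W^R: the second half is the reversed first half."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s[:half] == s[half:][::-1]

def is_copy(s):
    """Membership in W W: the second half repeats the first half."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s[:half] == s[half:]
```

For instance, `is_crossed("aabccd")` holds while `is_nested("aabccd")` does not, mirroring the contrast between the verb orders $v_1 v_2 v_3$ and $v_3 v_2 v_1$.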

Why “working” with TAG? Linguistic reasons
- extended domain of locality, e.g. the elementary tree [S NP [VP [V repaired] NP]]
- long-distance dependencies / discontinuous constituents
  (3) Who did Mary say that Tom claimed ... repaired the fridge?
- multi-word expressions
  (4) to kick the bucket (‘to die’)
- incarnation of Construction Grammar

Outline of today’s course
1 Why “working” with TAG?
    Formal reasons
    Linguistic reasons
2 From CFG to TAG
    Context-Free Grammars
    Lexicalization
    Tree Substitution Grammars (TSG)
    Adding adjunction
3 Further related formalisms
4 Summary & outlook
5 Appendix: NL and the generative capacity of grammar formalisms

From CFG to TAG: Context-Free Grammar

String rewriting: replace non-terminals by strings of terminals and non-terminals.

$G_{CFG} = \langle N, T, S, P \rangle$ with
P = {
  S → NP VP
  VP → V NP | AP VP
  NP → N | Det N
  AP → A
  N → Peter | fridge
  Det → the
  A → easily
  V → repaired
}

Example derivation:
S
⇒ NP VP
⇒ N VP
⇒ Peter VP
⇒ Peter AP VP
⇒ Peter A VP
⇒ Peter easily VP
⇒ Peter easily V NP
⇒ Peter easily repaired NP
⇒ Peter easily repaired Det N
⇒ Peter easily repaired the N
⇒ Peter easily repaired the fridge

Example derivation history:
[S [NP [N Peter]] [VP [AP [A easily]] [VP [V repaired] [NP [Det the] [N fridge]]]]]
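
The string-rewriting derivation of "Peter easily repaired the fridge" can be replayed with a few lines of Python. This is a minimal sketch, assuming a leftmost rewriting order and a hypothetical encoding of P as a dictionary; `RULES`, `CHOICES` and `leftmost_derivation` are illustrative names, not part of the slides.

```python
# Productions of the example CFG; every key is a non-terminal.
RULES = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"], ["AP", "VP"]],
    "NP": [["N"], ["Det", "N"]],
    "AP": [["A"]],
    "N": [["Peter"], ["fridge"]],
    "Det": [["the"]],
    "A": [["easily"]],
    "V": [["repaired"]],
}

def leftmost_derivation(start, choices):
    """Rewrite the leftmost non-terminal once per entry in `choices`.

    Each entry is the index of the production to apply to that non-terminal.
    Returns the list of sentential forms, starting with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        i = next(k for k, sym in enumerate(form) if sym in RULES)
        form = form[:i] + RULES[form[i]][choice] + form[i + 1:]
        steps.append(" ".join(form))
    return steps

# Production indices that reproduce the derivation on the slide.
CHOICES = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1]

if __name__ == "__main__":
    for step in leftmost_derivation("S", CHOICES):
        print(step)
```

Each printed line is one sentential form; only the sequence of rule choices distinguishes this derivation from others licensed by the same grammar.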

Why not stick to CFGs (literally)?
- low generative capacity: CFGs cannot describe all NL phenomena, e.g.
    cross-serial dependencies ($a^n b^m c^n d^m$): Swiss German (Shieber 1985)
    duplication ($w\#w$): Bambara (Culy 1985)
    multiple agreement ($a^n b^n c^n$): Bantu languages
- poor support for expressing linguistic generalizations:
    Rules have a very limited domain of locality (⇒ no strong lexicalization).
    Non-terminals are atomic (⇒ massive proliferation).

First step: turn strings into trees!

Lexicalization

lexicalization → each structure of the grammar has at least one terminal (its lexical anchor)

Lexicalized grammar: A lexicalized grammar consists of
(i) a finite set of structures, each associated with a lexical item (anchor); and
(ii) operation(s) for composing these structures.

Lexicalization: A formalism F can be lexicalized by another formalism F′ if, for any finitely ambiguous grammar G in F, there is a grammar G′ in F′ such that
(i) G′ is a lexicalized grammar; and
(ii) G and G′ generate the same tree set.

Weak vs. strong lexicalization:
- weak lexicalization: preserve the string language
- strong lexicalization: preserve the tree structure

Formally interesting: a finite lexicalized grammar provides finitely many analyses for each string (finitely ambiguous).
Linguistically interesting: syntactic properties of lexical items can be accounted for more directly; each lexical item comes with the possibility of certain partial syntactic constructions.
Computationally interesting: the search space during parsing can be delimited (grammar filtering).

Lexicalization of CFGs

Greibach normal form (Greibach 1965): $A \to aX$ or $A \to a$, with $a \in V_T$, $A \in V_N$, $X \in (V_N)^*$.

Example: a CFG $G$: $S \to SS$, $S \to a$
Lexicalizing $G$ gives $G'$ (Greibach): $S \to aS$, $S \to a$
⇒ same string language, but not the same tree set

(Figure: sample trees generated by $G'$, which are strictly right-branching, next to sample trees generated by $G$, in which an S node can branch into two S nodes.)

⇒ only weak lexicalization possible
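
The gap between weak and strong lexicalization in this example can be checked by enumerating derived trees. The sketch below is illustrative only: it assumes a bracket-string encoding of trees, and the helper names `trees_G` and `tree_G_prime` are not from the slides.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def trees_G(n):
    """All derived trees of G (S -> S S | a) whose yield is a^n, in bracket notation."""
    if n == 1:
        return ("[S a]",)
    return tuple(
        f"[S {left} {right}]"
        for k in range(1, n)
        for left in trees_G(k)
        for right in trees_G(n - k)
    )

def tree_G_prime(n):
    """The only derived tree of G' (S -> a S | a) whose yield is a^n."""
    return "[S a]" if n == 1 else f"[S a {tree_G_prime(n - 1)}]"

if __name__ == "__main__":
    print(trees_G(3))        # two binary-branching trees for "aaa"
    print(tree_G_prime(3))   # one right-branching tree for "aaa"
```

Both grammars derive every string $a^n$ with $n \geq 1$, but for $n \geq 2$ the tree produced by $G'$ never occurs among the trees of $G$, which is why the Greibach construction gives only weak lexicalization here.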

From CFG to TAG: Tree Substitution Grammar (TSG)

First step: turn strings into trees!

Tree rewriting by substitution: replace a non-terminal leaf with a tree.

Each production of $G_{CFG} = \langle N, T, S, P \rangle$ corresponds (≈) to an elementary tree of $G_{TSG} = \langle N, T, S, I \rangle$:
I = {
  [S NP VP], [VP V NP], [VP AP VP],
  [NP N], [NP Det N], [AP A],
  [N Peter], [N fridge], [Det the], [A easily], [V repaired]
}

Example derivation (one substitution per step):
[S NP VP]
⇒ [S [NP N] VP]
⇒ [S [NP [N Peter]] VP]
⇒ [S [NP [N Peter]] [VP AP VP]]
⇒ [S [NP [N Peter]] [VP [AP A] VP]]
⇒ [S [NP [N Peter]] [VP [AP [A easily]] VP]]
⇒ [S [NP [N Peter]] [VP [AP [A easily]] [VP V NP]]]
⇒ [S [NP [N Peter]] [VP [AP [A easily]] [VP [V repaired] NP]]]
⇒ [S [NP [N Peter]] [VP [AP [A easily]] [VP [V repaired] [NP Det N]]]]
⇒ [S [NP [N Peter]] [VP [AP [A easily]] [VP [V repaired] [NP [Det the] N]]]]
⇒ [S [NP [N Peter]] [VP [AP [A easily]] [VP [V repaired] [NP [Det the] [N fridge]]]]]
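
The substitution operation can be sketched in Python as follows. The tuple encoding of trees, the helper names (`t`, `substitute`, `frontier`, `derive`) and the choice to always fill the leftmost matching leaf are assumptions made for illustration; the slides do not prescribe an implementation.

```python
def t(label, *children):
    """Build a node; t("NP") without children is an open substitution site."""
    return (label, tuple(children))

# Elementary trees of the example TSG (terminals are plain strings).
S_NP_VP = t("S", t("NP"), t("VP"))
VP_V_NP = t("VP", t("V"), t("NP"))
VP_AP_VP = t("VP", t("AP"), t("VP"))
NP_N = t("NP", t("N"))
NP_DET_N = t("NP", t("Det"), t("N"))
AP_A = t("AP", t("A"))
N_PETER, N_FRIDGE = t("N", "Peter"), t("N", "fridge")
DET_THE, A_EASILY, V_REPAIRED = t("Det", "the"), t("A", "easily"), t("V", "repaired")

def substitute(tree, initial):
    """Replace the leftmost open leaf labelled like the root of `initial`.

    Returns (new_tree, True) on success and (tree, False) if no such leaf exists.
    """
    if isinstance(tree, str):
        return tree, False
    label, children = tree
    if not children:
        return (initial, True) if label == initial[0] else (tree, False)
    out, done = [], False
    for child in children:
        if not done:
            child, done = substitute(child, initial)
        out.append(child)
    return (label, tuple(out)), done

def frontier(tree):
    """Left-to-right sequence of leaves (terminals and open non-terminal leaves)."""
    if isinstance(tree, str):
        return [tree]
    label, children = tree
    return [label] if not children else [w for c in children for w in frontier(c)]

# Order of substitutions used in the example derivation.
ORDER = [NP_N, N_PETER, VP_AP_VP, AP_A, A_EASILY,
         VP_V_NP, V_REPAIRED, NP_DET_N, DET_THE, N_FRIDGE]

def derive():
    """Replay the example derivation; return the derived tree and all frontiers."""
    tree, fronts = S_NP_VP, [frontier(S_NP_VP)]
    for initial in ORDER:
        tree, ok = substitute(tree, initial)
        if not ok:
            raise ValueError(f"no open leaf for {initial[0]}")
        fronts.append(frontier(tree))
    return tree, fronts
```

The frontiers returned by `derive()` coincide with the sentential forms of the CFG derivation, which illustrates the weak equivalence of the two grammars on this example.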

TSG versus CFG:
- weakly equivalent (same string languages, but more tree languages)
- [S NP [VP AP [VP [V repaired] NP]]]: single recursion!
- still no strong lexicalization of CFG, no cross-serial dependencies, etc.

Applications of TSG: Data-Oriented Parsing (DOP, Bod 2009)

Second step: add adjunction!

From CFG to TAG: Adding adjunction

Adjunction: replace a non-terminal node with an “auxiliary” tree and put the subtree of the replaced node under the footnode (*).

Example: adjoining [VP [AP [A easily]] VP*] at the VP node of [S NP [VP [V repaired] NP]]
⇒ [S NP [VP [AP [A easily]] [VP [V repaired] NP]]]

Adjunction at a footnode: adjoining [VP [AP [A easily]] VP*] at the footnode of [VP [AP [A easily]] VP*]
⇒ [VP [AP [A easily]] [VP [AP [A easily]] VP*]]

⇒ Adjunction at footnodes causes spurious ambiguities in derivations.
⇒ Therefore, this is usually forbidden.

From CFG to TAG: Example with adjunction

Tree rewriting:
- Substitution: replace a non-terminal leaf with a tree.
- Adjunction: replace a non-terminal node with an “auxiliary” tree.

The elementary trees of $G_{TSG} = \langle N, T, S, I \rangle$ correspond (≈) to the following larger elementary trees:
  [S NP [VP [V repaired] NP]]
  [VP [AP [A easily]] VP*]
  [NP [N Peter]]
  [NP Det [N fridge]]
  [Det the]

Example derivation:
[S NP [VP [V repaired] NP]]
⇒ [S [NP [N Peter]] [VP [V repaired] NP]]   (substitution)
⇒ [S [NP [N Peter]] [VP [AP [A easily]] [VP [V repaired] NP]]]   (adjunction)
⇒ [S [NP [N Peter]] [VP [AP [A easily]] [VP [V repaired] [NP Det [N fridge]]]]]   (substitution)
⇒ [S [NP [N Peter]] [VP [AP [A easily]] [VP [V repaired] [NP [Det the] [N fridge]]]]]   (substitution)
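
Substitution and adjunction together can be sketched as below. This is an illustrative Python sketch under stated assumptions: trees are nested tuples, a footnode is a childless node whose label ends in `*`, and both operations target the first matching node in preorder. All names are hypothetical.

```python
def t(label, *children):
    """Build a node; a node without children is an open leaf (or a footnode)."""
    return (label, tuple(children))

def rewrite_first(tree, match, build):
    """Replace the first node (preorder) satisfying `match` by `build(node)`."""
    if isinstance(tree, str):
        return tree, False
    if match(tree):
        return build(tree), True
    label, children = tree
    out, done = [], False
    for child in children:
        if not done:
            child, done = rewrite_first(child, match, build)
        out.append(child)
    return (label, tuple(out)), done

def substitute(tree, initial):
    """Fill the first open leaf that carries the label of `initial`'s root."""
    return rewrite_first(tree, lambda n: not n[1] and n[0] == initial[0],
                         lambda n: initial)

def plug_foot(aux, subtree):
    """Copy `aux`, putting `subtree` in place of its footnode."""
    if isinstance(aux, str):
        return aux
    label, children = aux
    if label.endswith("*"):
        return subtree
    return (label, tuple(plug_foot(c, subtree) for c in children))

def adjoin(tree, aux):
    """Adjoin `aux` at the first internal node labelled like `aux`'s root.

    Footnodes are labelled "X*", so they never match: adjunction at
    footnodes is excluded in this sketch.
    """
    return rewrite_first(tree, lambda n: bool(n[1]) and n[0] == aux[0],
                         lambda n: plug_foot(aux, n))

def frontier(tree):
    """Left-to-right sequence of leaves."""
    if isinstance(tree, str):
        return [tree]
    label, children = tree
    return [label] if not children else [w for c in children for w in frontier(c)]

ALPHA_REPAIRED = t("S", t("NP"), t("VP", t("V", "repaired"), t("NP")))
BETA_EASILY = t("VP", t("AP", t("A", "easily")), t("VP*"))
NP_PETER = t("NP", t("N", "Peter"))
NP_FRIDGE = t("NP", t("Det"), t("N", "fridge"))
DET_THE = t("Det", "the")

def derive():
    """Replay the example derivation and return the derived tree."""
    tree, _ = substitute(ALPHA_REPAIRED, NP_PETER)
    tree, _ = adjoin(tree, BETA_EASILY)
    tree, _ = substitute(tree, NP_FRIDGE)
    tree, _ = substitute(tree, DET_THE)
    return tree
```

Calling `adjoin(tree, BETA_EASILY)` repeatedly stacks further adverbs on the VP, which is the unbounded recursion that a single TSG elementary tree cannot express.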

From CFG to TAG: Restrictions on adjunction (I)

Restrictions on the shape of auxiliary trees:
- The root node and the footnode must carry the same non-terminal.

Specific adjunction constraints on target nodes:
- obligatory adjunction (OA): true/false
- null adjunction (NA): no adjoinable auxiliary tree
- selective adjunction (SA): a nonempty set of adjoinable auxiliary trees

Adjunction constraints are essential in generating non-context-free languages (e.g. the copy language $\{ww \mid w \in \{a,b\}^*\}$)!

Example grammar for the copy language $\{ww \mid w \in \{a,b\}^*\}$:
  initial tree: [S ε]
  auxiliary trees: [S_NA a [S S*_NA a]] and [S_NA b [S S*_NA b]]

Example derivation of abbabb:
[S ε]
⇒ [S_NA a [S [S_NA ε] a]]
⇒ [S_NA a [S_NA b [S [S_NA [S_NA ε] a] b]]]
⇒ [S_NA a [S_NA b [S_NA b [S [S_NA [S_NA [S_NA ε] a] b] b]]]]

⇒ TAG = TSG + adjunction + adjunction constraints
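
The role of the NA constraints in the copy-language grammar can be replayed in Python. This sketch rests on assumptions: trees are `(label, na, children)` tuples, ε is the empty string, and the node that takes the footnode's place keeps the footnode's NA constraint, as the derived trees on the slide suggest. The names `beta`, `adjoin` and `derive` are illustrative.

```python
FOOT = ("S*", True, ())            # footnode S*, marked NA
ALPHA = ("S", False, ("",))        # initial tree [S ε]; ε is the empty string

def beta(x):
    """Auxiliary tree [S_NA x [S S*_NA x]] for a terminal x."""
    return ("S", True, (x, ("S", False, (FOOT, x))))

def plug_foot(aux, subtree):
    """Copy `aux`, replacing its footnode by `subtree` (which becomes NA)."""
    if isinstance(aux, str):
        return aux
    label, na, children = aux
    if label.endswith("*"):
        return (subtree[0], True, subtree[2])
    return (label, na, tuple(plug_foot(c, subtree) for c in children))

def adjoin(tree, aux):
    """Adjoin `aux` at the first internal node (preorder) that is not NA."""
    if isinstance(tree, str):
        return tree, False
    label, na, children = tree
    if children and not na and label == aux[0]:
        return plug_foot(aux, tree), True
    out, done = [], False
    for child in children:
        if not done:
            child, done = adjoin(child, aux)
        out.append(child)
    return (label, na, tuple(out)), done

def tree_yield(tree):
    """Concatenate the terminal leaves from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(tree_yield(c) for c in tree[2])

def derive(w):
    """Adjoin beta(x) for each letter x of w, always at the single open S node."""
    tree = ALPHA
    for x in w:
        tree, ok = adjoin(tree, beta(x))
        if not ok:
            raise ValueError("no adjoinable node left")
    return tree
```

With these constraints each derived tree has exactly one node where adjunction is possible, so `tree_yield(derive("abb"))` gives the string abbabb; without the NA markings, adjunction at the root or at the footnode position would also produce strings outside the copy language.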

From CFG to TAG: Tree-Adjoining Grammar

A Tree Adjoining Grammar (TAG) is a tuple $G = \langle N, T, I, A, O, C \rangle$:
- $T$ and $N$ are disjoint alphabets, the terminals and nonterminals,
- $I$ is a finite set of initial trees, and
- $A$ is a finite set of auxiliary trees.
- $O : \{v \mid v \text{ is a node in a tree in } I \cup A\} \to \{1, 0\}$ is a function, and
- $C : \{v \mid v \text{ is a node in a tree in } I \cup A\} \to P(A)$ is a function.

Let $v$ be a node in a tree in $I \cup A$:
- obligatory adjunction (OA): $O(v) = 1$
- null adjunction (NA): $O(v) = 0$ and $C(v) = \emptyset$
- selective adjunction (SA): $O(v) = 0$ and $C(v) \neq \emptyset$ and $C(v) \neq A$

The trees in $I \cup A$ are called elementary trees.
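
The three constraint types follow mechanically from the values of $O(v)$ and $C(v)$. The short Python sketch below spells out the case distinction. The function name `classify` and the label "unconstrained" for the remaining case ($O(v) = 0$ and $C(v) = A$) are assumptions, since the definition does not name that case.

```python
def classify(obligatory, allowed, aux_trees):
    """Name the adjunction constraint encoded by O(v) and C(v).

    obligatory: the value O(v), either 1 or 0
    allowed:    the set C(v) of auxiliary trees adjoinable at v
    aux_trees:  the set A of all auxiliary trees of the grammar
    """
    if obligatory not in (0, 1):
        raise ValueError("O(v) must be 0 or 1")
    if not allowed <= aux_trees:
        raise ValueError("C(v) must be a subset of A")
    if obligatory == 1:
        return "OA"
    if not allowed:
        return "NA"
    if allowed != aux_trees:
        return "SA"
    return "unconstrained"  # assumed label: any auxiliary tree may adjoin, none must

if __name__ == "__main__":
    A = frozenset({"beta_a", "beta_b"})
    print(classify(0, frozenset(), A))            # NA
    print(classify(0, frozenset({"beta_a"}), A))  # SA
```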

TAG is mildly context-sensitive (MCS, Joshi 1985):
- generates the context-free languages
- generates cross-serial dependencies (i.e. $WW$)
- constant growth (or semi-linear, no $a^{2^n}$)
- polynomial-time parsing ($O(n^6)$)
(Schabes 1990; Joshi & Schabes 1997; Kallmeyer 2010)

TAG can strongly lexicalize finitely ambiguous CFGs. (Schabes 1990; Joshi & Schabes 1991)
- Formally interesting: a finite lexicalized grammar provides finitely many analyses for each string (finitely ambiguous).
- Linguistically interesting: syntactic properties of lexical items can be accounted for more directly.
- Computationally interesting: the search space during parsing can be delimited (grammar filtering).

Restricting TAG

Further adjunction constraints:
- no adjunction at the spine below the root node of auxiliary trees: off-spine TAG (osTAG, Swanson et al. 2013)
  ⇒ WGC of CFG ($O(n^3)$)
  ⇒ more compact grammars than CFG or TSG
  ⇒ strongly lexicalizes CFG?

Restrictions on the shape of auxiliary trees:
- Footnodes are at the left or right edge of an elementary tree (ET): Tree Insertion Grammar (TIG, Schabes & Waters 1995)
- further constraint: no adjunction of left auxiliary trees to the spine of right auxiliary trees
  ⇒ WGC of CFG ($O(n^3)$)
  ⇒ more compact grammars than CFG (or TSG?)
  ⇒ strongly lexicalizes CFG

MCS-alternatives to TAG and extensions
- Linear Indexed Grammar (LIG; Gazdar 1988; Keller & Weir 1995)
- Head Grammar (HG; Pollard 1984)
- Multicomponent TAG (MCTAG; Seki et al. 1991)
- Minimalist Grammar (MG; Stabler 1997)
- Combinatory Categorial Grammar (CCG; Steedman 1984)
- Linear Context-Free Rewriting Systems (LCFRS; Vijay-Shanker et al. 1987)

TAG, CCG (but not recent versions of CCG), LIG and HG are weakly equivalent. MCTAG and LCFRS subsume TAG, CCG, LIG and HG. (Kallmeyer 2010)

⇒ TAG cannot generate all MCSLs!
- $\{a^n b^n c^n d^n e^n \mid n \geq 1\}$, $\{www \mid w \in \{a,b\}^*\}$
- $\mathrm{MIX} := \{w \mid w \in \{a,b,c\}^*, |w|_a = |w|_b = |w|_c\}$ (Bach 1988)
- $\mathrm{SCR}_{ind} := \{\sigma(NP_1, \ldots, NP_m) V_m \ldots V_1 \mid m \geq 1 \text{ and } \sigma \text{ is a permutation}\}$ (Becker et al. 1992)
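
The set definitions above are easy to misread, so here is a small Python sketch of membership tests for three of the listed languages. The function names are illustrative assumptions, and the tests only check whether a given string satisfies the definition; they make no claim about generative capacity.

```python
from collections import Counter

def in_mix(w):
    """Membership in MIX: w over {a, b, c} with equally many a's, b's and c's."""
    if set(w) - set("abc"):
        return False
    counts = Counter(w)
    return counts["a"] == counts["b"] == counts["c"]

def in_triple_copy(w):
    """Membership in {www | w in {a, b}*}."""
    third, rest = divmod(len(w), 3)
    return (rest == 0 and set(w) <= set("ab")
            and w[:third] == w[third:2 * third] == w[2 * third:])

def in_count5(w):
    """Membership in {a^n b^n c^n d^n e^n | n >= 1}."""
    n, rest = divmod(len(w), 5)
    return rest == 0 and n >= 1 and w == "".join(ch * n for ch in "abcde")
```

Note that MIX places no restriction on the order of symbols, so `in_mix("cabbac")` holds just as `in_mix("aabbcc")` does.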

Summary & outlook

Summary
- motivation
- CFG → TSG → TSG + adjunction → TSG + adjunction + adjunction constraints = TAG

Tomorrow
- linguistic applications using LTAG
- the derivation tree
- subcategorization, extraction, modification
- adding feature structures