A constraint driven metagrammar Joseph Le Roux (1) - Benoît Crabbé (2) - Yannick Parmentier (3) (1) Calligramme Project LORIA - INPL (2) HCRC / ICCS University of Edinburgh (3) Langue Et Dialogue Project INRIA / LORIA - UHP TAG+8 – Sydney 15 July 2006 1 / 32
Introduction ◮ Our concern: semi-automatic development of real-scale Lexicalised TAGs. ◮ Related problems: design and maintenance issues raised by the redundancy inherent in strong lexicalisation. ◮ The MetaGrammar approach: ◮ capturing linguistic generalisations among grammatical structures (i.e., trees), ◮ describing the trees of a grammar as combinations of elementary tree fragments. ◮ This work was carried out in collaboration with Prof. Denys Duchier. 2 / 32
Outline eXtensible MetaGrammar Constraining admissible structures An efficient implementation of constraints Some features of the XMG system Conclusion 3 / 32
eXtensible MetaGrammar (1 / 2) ◮ Monotonic description of the grammar trees using an expressive and relatively intuitive language. ◮ MetaGrammar ≡ manipulation of elementary tree descriptions using a control language. ◮ (1) description of tree fragments and (2) combinations of these fragments. ◮ Two methodological axes of description (Crabbé, 05): 1. structure sharing (i.e. reusable elementary tree fragments). 2. alternatives (i.e. combination of fragments using conjunction and disjunction). 4 / 32
eXtensible MetaGrammar (2 / 2)
◮ A language to describe tree fragments:
Description ::= x → y | x →⁺ y | x →* y | x ≺ y | x ≺⁺ y | x ≺* y | x [f : E] | x (p : E)   (1)
◮ A language to combine tree fragments:
Class ::= Name → Content   (2)
Content ::= Description | Name | Content ∨ Content | Content ∧ Content   (3)
5 / 32
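The combination language in (2) and (3) can be pictured as a small syntax tree that is expanded into alternatives. The following is only a minimal sketch under stated assumptions: the Python names (`Desc`, `Name`, `And`, `Or`, `expand`) are hypothetical, atomic descriptions are kept as opaque placeholder strings, the `Subject` class is invented for illustration, and variable scoping is ignored. It is not how XMG itself is implemented.

```python
from dataclasses import dataclass
from itertools import product
from typing import Union


@dataclass(frozen=True)
class Desc:
    """An atomic tree description, kept here as an opaque string."""
    text: str


@dataclass(frozen=True)
class Name:
    """A reference to another class by its name."""
    ident: str


@dataclass(frozen=True)
class And:
    left: "Content"
    right: "Content"


@dataclass(frozen=True)
class Or:
    left: "Content"
    right: "Content"


Content = Union[Desc, Name, And, Or]


def expand(content, classes):
    """Return every alternative as a tuple of atomic descriptions."""
    if isinstance(content, Desc):
        return [(content.text,)]
    if isinstance(content, Name):
        return expand(classes[content.ident], classes)
    if isinstance(content, And):
        # Conjunction accumulates the descriptions of both sides.
        lefts = expand(content.left, classes)
        rights = expand(content.right, classes)
        return [l + r for l, r in product(lefts, rights)]
    if isinstance(content, Or):
        # Disjunction keeps both sides as separate alternatives.
        return expand(content.left, classes) + expand(content.right, classes)
    raise TypeError(f"unexpected content: {content!r}")


# Hypothetical metagrammar; the description strings are placeholders.
classes = {
    "SubjectCan": Desc("canonical-subject fragment"),
    "SubjectRel": Desc("relativised-subject fragment"),
    "Active": Desc("active-verb fragment"),
    "Subject": Or(Name("SubjectCan"), Name("SubjectRel")),
    "Intransitive": And(Name("Subject"), Name("Active")),
}

alternatives = expand(Name("Intransitive"), classes)
```

Each element of `alternatives` is one accumulated description that a tree description solver would then have to turn into trees.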
Example (1 / 2) ◮ Tree fragment #1: SubjectCan → ( X [cat : s] → Y [cat : v] ) ∧ ( X → Z (mark : subst) [cat : n] ) ∧ ( Z ≺ Y ) [Figure: tree with root X [cat:s] and children Z↓ [cat:n] and Y [cat:v]] ◮ Tree fragment #2: Active → ( X [cat : s] ∧ Y (mark : anchor) [cat : v] ) ∧ ( X → Y ) [Figure: tree with root X [cat:s] and child Y⋄ [cat:v]] ◮ Combination rule: Intransitive → SubjectCan ∧ Active (∗) 8 / 32
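The combination rule Intransitive → SubjectCan ∧ Active can be illustrated with a small sketch. It assumes that the variables X and Y are shared between the two classes, so that nodes with the same variable name are identified and their features merged. The dictionary layout and the function `conjoin` are hypothetical and chosen only for illustration.

```python
def conjoin(first, second):
    """Conjoin two fragments, identifying nodes that share a variable name."""
    nodes = {}
    for fragment in (first, second):
        for var, feats in fragment["nodes"].items():
            merged = nodes.setdefault(var, {})
            for key, value in feats.items():
                # A feature may be added, but never overwritten by another value.
                if merged.setdefault(key, value) != value:
                    raise ValueError(f"feature clash on {var}.{key}")
    return {
        "nodes": nodes,
        "dominance": first["dominance"] | second["dominance"],
        "precedence": first["precedence"] | second["precedence"],
    }


# Tree fragment #1 (SubjectCan): X -> Y, X -> Z, Z precedes Y.
subject_can = {
    "nodes": {
        "X": {"cat": "s"},
        "Y": {"cat": "v"},
        "Z": {"cat": "n", "mark": "subst"},
    },
    "dominance": {("X", "Y"), ("X", "Z")},
    "precedence": {("Z", "Y")},
}

# Tree fragment #2 (Active): X -> Y, where Y is the anchor.
active = {
    "nodes": {"X": {"cat": "s"}, "Y": {"cat": "v", "mark": "anchor"}},
    "dominance": {("X", "Y")},
    "precedence": set(),
}

intransitive = conjoin(subject_can, active)
```

The result describes a tree whose root X [cat:s] has a substitution node Z [cat:n] followed by the anchor Y [cat:v].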
Example (2 / 2) Some trees for intransitive verbs (e.g., the lexical item sleeps) [Figure: (Canonical Subject) ∧ (Active verb morph) ⇒ tree for "the boy sleeps"; (Extracted Subject) ∧ (Active verb morph) ⇒ tree for "the boy who sleeps"] 9 / 32
About namespaces ◮ What is the scope of the node variables used within descriptions? ◮ local scope by default. ◮ namespaces can be managed explicitly via Import / Export declarations. ◮ Furthermore, an inheritance mechanism is introduced whose semantics corresponds to class conjunction and namespace merging. 10 / 32
Outline eXtensible MetaGrammar Constraining admissible structures An efficient implementation of constraints Some features of the XMG system Conclusion 11 / 32
Constraining admissible structures ◮ Further constraining the tree structures generated from the metagrammar. ◮ Specifying constraints on the well-formedness of trees. ◮ Benefit: avoids manual checking (e.g. that no tree has more than one foot node). ◮ Classification of these constraints into 4 categories: 1. Formal constraints 2. Operational constraints 3. Language-dependent constraints 4. Theoretical constraints 12 / 32
Formal constraints ◮ Constraints ensuring that the trees generated by the model builder are regular TAG trees. ◮ On top of being trees, the output structures must meet some specific criteria: ◮ each node has a category label, ◮ leaf nodes are marked as either subst, foot or anchor, ◮ the category of the foot node is identical to that of the root node, ◮ etc. 13 / 32
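The criteria listed above can be checked on a finished tree. The sketch below is illustrative only: the `Node` class and the `violations` function are hypothetical names, and the check runs after the fact on a complete tree rather than inside a solver. It also includes the "at most one foot node" condition mentioned earlier in the talk.

```python
from dataclasses import dataclass, field
from typing import List, Optional

LEAF_MARKS = {"subst", "foot", "anchor"}


@dataclass
class Node:
    cat: Optional[str]
    mark: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def all_nodes(tree):
    """Yield every node of the tree in pre-order."""
    yield tree
    for child in tree.children:
        yield from all_nodes(child)


def violations(root):
    """Return the list of formal constraints that the tree violates."""
    found = []
    nodes = list(all_nodes(root))
    if any(n.cat is None for n in nodes):
        found.append("node without category")
    leaves = [n for n in nodes if not n.children]
    if any(n.mark not in LEAF_MARKS for n in leaves):
        found.append("unmarked leaf")
    feet = [n for n in leaves if n.mark == "foot"]
    if len(feet) > 1:
        found.append("more than one foot node")
    if any(f.cat != root.cat for f in feet):
        found.append("foot category differs from root category")
    return found


# A well-formed tree: S over a substitution node N and an anchor V.
good = Node("s", children=[Node("n", "subst"), Node("v", "anchor")])

# An ill-formed tree: the foot is an N under an S root, and V is unmarked.
bad = Node("s", children=[Node("n", "foot"), Node("v")])

good_report = violations(good)
bad_report = violations(bad)
```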
Operational constraints (1 / 3) ◮ Constraints controlling the combinations of tree fragments (closely linked to the concept of Resources / Needs). ◮ Constraints based on a colouring of the nodes. ◮ Each node of the description is labelled either Black, Red or White. ◮ During minimal model computation, nodes are identified according to the following rules: ◦w + ◦w = ◦w ; •b + ◦w = •b ; •b + •b = ⊥ ; •r + { ◦w ; •b ; •r } = ⊥ 14 / 32
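The four identification rules can be written as a small function. This is a sketch, not the solver's actual encoding: the names `merge` and `merge_all` are hypothetical, ⊥ is represented by `None`, and the rules are assumed to be symmetric (so that white + black also gives black).

```python
from functools import reduce

WHITE, BLACK, RED = "w", "b", "r"


def merge(first, second):
    """Colour of two identified nodes, or None when identification fails."""
    if first is None or second is None:
        return None  # a failure propagates
    if RED in (first, second):
        return None  # red + anything = bottom
    if first == BLACK and second == BLACK:
        return None  # black + black = bottom
    if BLACK in (first, second):
        return BLACK  # black + white = black
    return WHITE  # white + white = white


def merge_all(colours):
    """Identify a whole group of nodes at once."""
    return reduce(merge, colours)


one_black_two_white = merge_all([WHITE, BLACK, WHITE])
two_black = merge_all([BLACK, WHITE, BLACK])
```

Read this way, a black node can absorb any number of white nodes, two black nodes can never be identified, and a red node can never be identified with anything.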
Operational constraints (2 / 3) Benefits: ◮ Avoids node naming issues (no global names). ◮ Allows the metagrammatical description to be reduced (node equations are replaced with implicit coloured node identifications). ◮ Facilitates the reuse of the same tree fragment several times. 15 / 32
Operational constraints (3 / 3) Example: [Figure: coloured tree fragments (SubjectCan), (SubjectRel), (ObjectCan) and (Active), whose nodes are marked white (◦w), black (•b) or red (•r), combined using ∨ and ∧] 16 / 32
Language-dependent constraints (1 / 2) ◮ For French, the ordering and uniqueness of clitics. ◮ (Perlmutter, 70): first, clitics appear in front of the verb in a fixed order according to their rank (a-b); second, two different clitics in front of the verb cannot have the same rank (c). ◮ For instance, the clitics le and la have rank 3 and lui has rank 4 (rank is a node property). (a) Jean le₃ lui₄ donne "John gives it to him" (b) *Jean lui₄ le₃ donne "*John gives to him it" (c) *Jean le₃ la₃ donne "*John gives it it" 17 / 32
Language-dependent constraints (2 / 2) [Figure: tree fragments for (Jean), (le), (lui) and (donne), with clitic nodes Cl↓3 and Cl↓4 and precedence constraints ≺⁺, combined by ∧; ⇒ two candidate trees: (Jean le lui donne) and (Jean lui le donne)] 18 / 32
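The ordering and uniqueness of clitics can both be checked with a single test on ranks. The sketch below only inspects a linear sequence of clitics, which is an assumption made for brevity; the function name `clitics_ok` is hypothetical, and the rank table contains only the three clitics mentioned in the talk.

```python
# Ranks as given in the talk: le and la have rank 3, lui has rank 4.
RANK = {"le": 3, "la": 3, "lui": 4}


def clitics_ok(clitics):
    """True when ranks strictly increase from left to right.

    A strict increase enforces both the fixed order (a-b) and the
    uniqueness of each rank (c).
    """
    ranks = [RANK[c] for c in clitics]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


example_a = clitics_ok(["le", "lui"])  # Jean le lui donne
example_b = clitics_ok(["lui", "le"])  # *Jean lui le donne
example_c = clitics_ok(["le", "la"])   # *Jean le la donne
```

Of the two candidate trees for Jean, le, lui and donne, only the one with the order le lui passes this check.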
Theoretical principles ◮ Language-independent principles related to the grammatical formalism being described. ◮ For TAG, such a principle may be the Principle of Predicate-Argument Co-occurrence. ◮ NB: such principles are not yet implemented within the XMG system. 19 / 32
Outline eXtensible MetaGrammar Constraining admissible structures An efficient implementation of constraints Some features of the XMG system Conclusion 20 / 32
An efficient implementation of constraints ◮ A 3-step metagrammar compilation: 1. translation of the descriptions into intermediate code for a specific virtual machine (WAM-based), 2. execution of this code and accumulation of partial tree descriptions, 3. solving of tree descriptions. ◮ The third step is performed by a tree description solver implemented using the Constraint Satisfaction approach. ◮ In this context, the constraints introduced above can be expressed naturally. 21 / 32
Solving Tree Descriptions (1 / 3)
1. Setting the constraint framework:
◮ Each node in the input description is associated with an integer.
◮ Then, we use an abstract data type to refer to a node of a valid model in terms of the nodes that are equal to it, above it, below it, or to its left or right:
[Figure: the regions Eq, Up, Down, Left and Right around a node]
$N_x^i ::= \mathrm{node}(\ \mathrm{Eq} : \{ints\},\ \mathrm{Up} : \{ints\},\ \mathrm{Down} : \{ints\},\ \mathrm{Left} : \{ints\},\ \mathrm{Right} : \{ints\}\ )$
22 / 32
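To make the five fields concrete, the sketch below computes them for every node of a small, fully specified tree. It makes simplifying assumptions: each tree node corresponds to exactly one integer, so Eq is always a singleton, and the sets are computed from a finished tree rather than constrained by a solver. The names `NodeSets` and `node_sets` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSets:
    eq: frozenset
    up: frozenset
    down: frozenset
    left: frozenset
    right: frozenset


def node_sets(children, root):
    """children maps a node id to the ordered list of its child ids."""
    parent = {c: p for p, kids in children.items() for c in kids}

    order = []  # pre-order traversal, left to right

    def visit(n):
        order.append(n)
        for c in children.get(n, []):
            visit(c)

    visit(root)
    position = {n: i for i, n in enumerate(order)}

    def ancestors(n):
        result = set()
        while n in parent:
            n = parent[n]
            result.add(n)
        return result

    def descendants(n):
        result = set()
        for c in children.get(n, []):
            result.add(c)
            result |= descendants(c)
        return result

    sets = {}
    for n in order:
        up, down = ancestors(n), descendants(n)
        others = set(order) - up - down - {n}
        # Among unrelated nodes, those visited earlier lie to the left.
        left = {m for m in others if position[m] < position[n]}
        sets[n] = NodeSets(frozenset({n}), frozenset(up), frozenset(down),
                           frozenset(left), frozenset(others - left))
    return sets


# S (1) dominates N (2) and V (3), with N to the left of V.
sets = node_sets({1: [2, 3], 2: [], 3: []}, root=1)
```

For every node the five sets are disjoint and together cover all the integers in the tree.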
Solving Tree Descriptions (2 / 3)
◮ The input description is converted into relation constraints on node sets. For instance, the dominance relation x → y can be translated as:
$N_x^i \to N_y^j \equiv [\ N_x^i.\mathrm{EqUp} \subseteq N_y^j.\mathrm{Up} \ \wedge\ N_x^i.\mathrm{Down} \supseteq N_y^j.\mathrm{EqDown} \ \wedge\ N_x^i.\mathrm{Left} \subseteq N_y^j.\mathrm{Left} \ \wedge\ N_x^i.\mathrm{Right} \subseteq N_y^j.\mathrm{Right}\ ]$
◮ $N_x^i.\mathrm{Down} \supseteq N_y^j.\mathrm{EqDown}$
24 / 32
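The four inclusions can be checked directly on known node sets. This sketch assumes that EqUp abbreviates Eq ∪ Up and EqDown abbreviates Eq ∪ Down, which the slide does not spell out, and it tests fixed sets instead of propagating constraints over set variables as a real solver would. The function name and the dictionary layout are hypothetical.

```python
def dominance_constraint_holds(nx, ny):
    """Check the four set inclusions that translate x -> y."""
    eq_up_x = nx["eq"] | nx["up"]        # assumed: EqUp = Eq ∪ Up
    eq_down_y = ny["eq"] | ny["down"]    # assumed: EqDown = Eq ∪ Down
    return (
        eq_up_x <= ny["up"]              # x.EqUp ⊆ y.Up
        and nx["down"] >= eq_down_y      # x.Down ⊇ y.EqDown
        and nx["left"] <= ny["left"]     # x.Left ⊆ y.Left
        and nx["right"] <= ny["right"]   # x.Right ⊆ y.Right
    )


# Node sets for the tree S (1) over N (2) and V (3), N left of V.
s_node = {"eq": {1}, "up": set(), "down": {2, 3}, "left": set(), "right": set()}
n_node = {"eq": {2}, "up": {1}, "down": set(), "left": set(), "right": {3}}
v_node = {"eq": {3}, "up": {1}, "down": set(), "left": {2}, "right": set()}

s_over_n = dominance_constraint_holds(s_node, n_node)
n_over_v = dominance_constraint_holds(n_node, v_node)
```

Here S satisfies the constraint with respect to N, while N does not with respect to its sister V, because N's EqUp contains N itself, which is not above V.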
Solving Tree Descriptions (3 / 3) 2. Searching for solutions to the problem: ◮ A solution is an assignment of a value to each of the node sets associated with the nodes of the input description. ◮ A distribution strategy is used to explore the consistent assignments for these node sets. ◮ The implementation of the solver follows the ideas of (Duchier and Niehren, 2000) and uses the constraint programming support of the Oz/Mozart system. 25 / 32
Extension to specific constraints (1 / 4) ◮ This constraint framework can relatively easily be extended to solve specific constraints, such as those introduced previously. ◮ The idea: 1. extension of the node representation (tuples whose fields contain sets of nodes), 2. definition of additional constraints on these fields, reflecting the syntactic constraints we want to express. 26 / 32