Characterizations of subregular tree languages Andreas Maletti Universität Leipzig, Germany andreas.maletti@uni-leipzig.de CAALM, Chennai — January 24, 2019
Constituent Syntax Tree
Syntax tree for "We must bear in mind the Community as a whole":
(S
  (NP1 (PRP We))
  (VP2
    (MD must)
    (VP3
      (VB bear)
      (PP (IN in) (NP1 (NN mind)))
      (NP2
        (NP2 (DT the) (NN Community))
        (PP (IN as) (NP2 (DT a) (NN whole)))))))
Constituent Syntax Tree
Tree: $T_\Sigma(V)$ for sets $\Sigma$ and $V$ is the least set $T$ of trees such that
1. Variables: $V \subseteq T$
2. Top concatenation: $\sigma(t_1, \dots, t_k) \in T$ for all $k \in \mathbb{N}$, $\sigma \in \Sigma$, and $t_1, \dots, t_k \in T$
A tree language is a set of trees.
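The two closure rules can be mirrored in a few lines of Python. This is only an illustrative sketch: the encoding (variables as plain strings, other nodes as tuples headed by their label) and the names `top_concat` and `in_T` are assumptions made here, not notation from the talk.

```python
def top_concat(symbol, *children):
    """Top concatenation: build the tree symbol(children...)."""
    return (symbol, *children)

def in_T(tree, sigma, variables):
    """Check membership in T_Sigma(V) by following the two closure rules."""
    if isinstance(tree, str):            # rule 1: every variable is a tree
        return tree in variables
    symbol, *children = tree             # rule 2: top concatenation
    return symbol in sigma and all(in_T(c, sigma, variables) for c in children)

SIGMA = {"sigma", "delta", "alpha"}
alpha = top_concat("alpha")                                   # k = 0
t = top_concat("sigma", alpha, top_concat("delta", alpha, "x"))
```

With this encoding, `in_T(t, SIGMA, {"x"})` holds, while `in_T(t, SIGMA, set())` fails because `t` contains the variable `x`.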
Constituent Syntax Trees
Syntax tree is not unique (weights are used for disambiguation). Two syntax trees for "We saw her duck":
(S (NP1 (PRP We)) (VP2 (VBD saw) (S-BAR (S (NP1 (PRP her)) (VP1 (VBP duck))))))
(S (NP1 (PRP We)) (VP2 (VBD saw) (NP2 (PRP$ her) (NN duck))))
Parses
Representations:
- enumeration
- proof trees of combinatory categorial grammars
- local tree languages
- tree substitution languages
- regular tree languages
Regular tree language
$L \subseteq T_\Sigma(\emptyset)$ is regular iff there exists a congruence $\cong$ (with respect to top concatenation) on $T_\Sigma(\emptyset)$ such that
1. $\cong$ has finite index (finitely many equivalence classes)
2. $\cong$ saturates $L$; i.e., $L = \bigcup_{t \in L} [t]_{\cong}$
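Regular Tree Languages
Examples for $\Sigma = \{\sigma, \delta, \alpha\}$:
- $L = \{t \in T_\Sigma(\emptyset) \mid t \text{ contains an odd number of } \alpha\}$: 2 equivalence classes ($L$ and $T_\Sigma(\emptyset) \setminus L$)
- $L' = \{t \in T_\Sigma(\emptyset) \mid \sigma \text{ never below } \delta\}$: 3 equivalence classes ("no $\sigma$", "some $\sigma$, but legal", illegal)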
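For the first example, the two classes can be computed bottom-up. The sketch below assumes trees are encoded as tuples headed by their label; the name `parity_class` is hypothetical. It illustrates the congruence property: the class of a tree depends only on its root symbol and the classes of its direct subtrees.

```python
def parity_class(tree):
    """Return 1 if the tree contains an odd number of alpha nodes, else 0."""
    label, *children = tree
    own = 1 if label == "alpha" else 0
    # Only the classes of the children are needed, never the children themselves.
    return (own + sum(parity_class(c) for c in children)) % 2

a = ("alpha",)
t1 = ("sigma", a, a)      # two alphas
t2 = ("delta", t1, a)     # three alphas
```

Here `parity_class(t2)` is 1, so `t2` belongs to the language, while `t1` falls into the complement class.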
Regular Tree Languages
Regular tree grammar [Brainerd, 1969] $G = (Q, \Sigma, I, P)$
- alphabet $Q$ of nonterminals and initial nonterminals $I \subseteq Q$
- alphabet $\Sigma$ of terminals
- finite set of productions $P \subseteq T_\Sigma(Q) \times Q$ (we write $r \to q$ for productions $(r, q)$)
Example productions
[Figure: three example productions whose left-hand sides are trees rooted at VP3, S, and S, over the nonterminals q0 to q6]
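Regular Tree Languages
Derivation semantics and recognized tree language
Regular tree grammar $G = (Q, \Sigma, I, P)$
- derivation step: for each production $r \to q \in P$, a tree containing an occurrence of $r$ rewrites ($\Rightarrow_G$) to the tree in which that occurrence is replaced by $q$
- generated tree language: $L(G) = \{t \in T_\Sigma(\emptyset) \mid \exists q \in I \colon t \Rightarrow_G^* q\}$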
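Membership in $L(G)$ can be decided by computing, bottom-up, all nonterminals that a tree reduces to. The following is a minimal sketch under stated assumptions: trees are tuples headed by their label, nonterminals are plain strings, no left-hand side is a bare nonterminal, and the toy grammar as well as the names `matches` and `make_recognizer` are hypothetical.

```python
from functools import lru_cache

def matches(pattern, tree, states_of):
    """Can `tree` be reduced to `pattern`, whose leaves may be nonterminals?"""
    if isinstance(pattern, str):                  # nonterminal leaf
        return pattern in states_of(tree)
    if pattern[0] != tree[0] or len(pattern) != len(tree):
        return False
    return all(matches(p, s, states_of) for p, s in zip(pattern[1:], tree[1:]))

def make_recognizer(productions, initial):
    @lru_cache(maxsize=None)
    def states_of(tree):
        """All nonterminals q such that tree =>* q."""
        return frozenset(q for r, q in productions if matches(r, tree, states_of))
    return lambda tree: bool(states_of(tree) & initial)

# Hypothetical grammar generating right combs sigma(alpha, sigma(alpha, ... alpha)).
PRODUCTIONS = (
    (("alpha",), "q"),
    (("sigma", ("alpha",), "q"), "q"),
)
recognizes = make_recognizer(PRODUCTIONS, frozenset({"q"}))
a = ("alpha",)
```

For this toy grammar, `recognizes(("sigma", a, ("sigma", a, a)))` succeeds, whereas the left comb `("sigma", ("sigma", a, a), a)` is rejected.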
Regular Tree Languages
Recall: 3 equivalence classes ("no $\sigma$", "some $\sigma$, but legal", illegal) for $L' = \{t \in T_\Sigma(\emptyset) \mid \sigma \text{ never below } \delta\}$
$C_1 = [\alpha]$, $C_2 = [\sigma(\alpha, \alpha)]$, $C_3 = [\delta(\sigma(\alpha, \alpha), \alpha)]$
Productions with nonterminals $C_1, C_2, C_3$:
- $\alpha \to C_1$
- $\delta(C_1, C_1) \to C_1$
- $\sigma(C_1, C_1) \to C_2$, $\sigma(C_1, C_2) \to C_2$, $\sigma(C_2, C_1) \to C_2$, $\sigma(C_2, C_2) \to C_2$
- $\delta(C_1, C_2) \to C_3$, $\delta(C_1, C_3) \to C_3$, $\delta(C_2, C_1) \to C_3$, $\delta(C_2, C_2) \to C_3$, $\delta(C_2, C_3) \to C_3$, $\delta(C_3, C_1) \to C_3$, $\delta(C_3, C_2) \to C_3$, $\delta(C_3, C_3) \to C_3$
- $\sigma(C_1, C_3) \to C_3$, $\sigma(C_2, C_3) \to C_3$, $\sigma(C_3, C_1) \to C_3$, $\sigma(C_3, C_2) \to C_3$, $\sigma(C_3, C_3) \to C_3$
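The production table follows a simple pattern, which the sketch below regenerates and then uses as a deterministic bottom-up evaluator. The tuple encoding of trees and the function names are assumptions; so is the choice of $C_1$ and $C_2$ as initial nonterminals, which the slide does not state explicitly but which matches the two legal classes.

```python
from itertools import product

def build_productions():
    """Map (symbol, left class, right class) to the resulting class 1, 2, or 3."""
    table = {("alpha",): 1}
    for left, right in product((1, 2, 3), repeat=2):
        if 3 in (left, right):
            sigma_target = delta_target = 3       # illegal subtrees stay illegal
        else:
            sigma_target = 2                      # sigma above legal subtrees
            delta_target = 1 if (left, right) == (1, 1) else 3
        table[("sigma", left, right)] = sigma_target
        table[("delta", left, right)] = delta_target
    return table

PRODUCTIONS = build_productions()

def class_of(tree):
    """Evaluate a tree bottom-up to its class."""
    label, *children = tree
    return PRODUCTIONS[(label, *(class_of(c) for c in children))]

def in_L_prime(tree):
    return class_of(tree) in (1, 2)   # assumption: C1 and C2 are initial

a = ("alpha",)
```

The table has 19 entries, one per production listed above, and `class_of(("delta", ("sigma", a, a), a))` evaluates to 3.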
Regular Tree Languages
Properties
✓ simple
✓ most expressive class we consider
✗ ambiguity (several explanations for a generated tree), but it can be removed
✓ closed under all Boolean operations (union/intersection/complement: ✓/✓/✓)
✓ all relevant properties decidable (emptiness, inclusion, ...)
Regular Tree Languages
Characterizations
- finite index congruences
- regular tree grammars
- (deterministic) tree automata
- regular tree expressions
- monadic second-order formulas
- ...
Parses
Representations:
- enumeration
- proof trees of combinatory categorial grammars
- local tree languages
- tree substitution languages
- regular tree languages
Categories
category = tree of $T_S(A)$ with $S = \{/, \backslash\}$ and atomic categories $A$
e.g., $D/E/E\backslash C$ corresponds to $\backslash(/(/(D, E), E), C)$
Combinatory Categorial Grammars
Combinators (Compositions)
Composition rules of degree $k$ are
- $ax/c,\; cy \to axy$ (forward rule)
- $cy,\; ax\backslash c \to axy$ (backward rule)
with $y = |_1 c_1 |_2 c_2 \cdots |_k c_k$
Examples:
- $C,\; D/E/D\backslash C \to D/E/D$ (degree 0)
- $D/E/D,\; D/E\backslash C \to D/E/E\backslash C$ (degree 2)
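The composition rules can be tried out on the two examples with a short Python sketch. It assumes the standard CCG reading in which the forward rule consumes a forward-slash argument and the backward rule a backslash argument. A category is encoded as its atomic target plus a tuple of (slash, argument) pairs; this encoding and the names `atom`, `slash`, and `compose` are illustrative choices.

```python
def atom(name):
    return (name, ())

def slash(cat, direction, arg):
    """Append one argument to a category: cat/arg or cat\\arg."""
    return (cat[0], cat[1] + ((direction, arg),))

def compose(left, right, k):
    """Combine two adjacent categories by a degree-k rule; None if no rule fits."""
    for primary, secondary, needed in ((left, right, "/"), (right, left, "\\")):
        if not primary[1]:
            continue                        # atomic categories take no argument
        *x, (direction, c) = primary[1]     # primary category is a x |c
        split = len(secondary[1]) - k       # secondary must be c followed by k arguments
        if direction == needed and split >= 0 and (secondary[0], secondary[1][:split]) == c:
            return (primary[0], tuple(x) + secondary[1][split:])
    return None

C, D, E = atom("C"), atom("D"), atom("E")
d_long = slash(slash(slash(D, "/", E), "/", D), "\\", C)    # D/E/D\C
d_short = slash(slash(D, "/", E), "\\", C)                  # D/E\C
step1 = compose(C, d_long, 0)                               # backward rule, degree 0
step2 = compose(step1, d_short, 2)                          # forward rule, degree 2
```

In this encoding `step1` is the category D/E/D and `step2` is D/E/E\C, matching the two examples. The function returns the first applicable rule, which is enough for these cases.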
Combinatory Categorial Grammars
Combinatory Categorial Grammar (CCG) $(\Sigma, A, k, I, L)$
- terminal alphabet $\Sigma$ and atomic categories $A$
- maximal degree $k \in \mathbb{N} \cup \{\infty\}$ of composition rules
- initial categories $I \subseteq A$
- lexicon $L \subseteq \Sigma \times C(A)$ with $C(A)$ the categories over $A$
Notes:
- all rules up to the given degree $k$ are always allowed
- $k$-CCG = CCG using all composition rules up to degree $k$
Combinatory Categorial Grammars
Example: a 2-CCG generates a string language $L$ with $L \cap c^+ d^+ e^+ = \{c^i d^i e^i \mid i \geq 1\}$
- initial categories: $\{D\}$
- lexicon: $L(c) = \{C\}$, $L(d) = \{D/E\backslash C,\; D/E/D\backslash C\}$, $L(e) = \{E\}$
Derivation for the input c c d d e e with lexical categories $C$, $C$, $D/E/D\backslash C$, $D/E\backslash C$, $E$, $E$:
1. $C,\; D/E/D\backslash C \to D/E/D$
2. $D/E/D,\; D/E\backslash C \to D/E/E\backslash C$
3. $C,\; D/E/E\backslash C \to D/E/E$
4. $D/E/E,\; E \to D/E$
5. $D/E,\; E \to D$
Combinatory Categorial Grammars
- allow (deterministic) relabeling (to allow arbitrary labels)
- a tree $t$ is min-height bounded by $k$ if the minimal distance from each node to a leaf is at most $k$
Theorem (Under relabeling)
Class of proof trees of 0-CCGs = class of min-height bounded binary regular tree languages
joint work with Marco Kuhlmann
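The min-height condition can be checked directly on a single tree. The sketch below assumes trees encoded as tuples headed by their label; `min_height_profile` and `is_min_height_bounded` are hypothetical names. It computes, for every node, the distance to its nearest leaf and keeps the largest such value.

```python
def min_height_profile(tree):
    """Return (distance from the root to its nearest leaf, largest such distance over all nodes)."""
    label, *children = tree
    if not children:
        return 0, 0
    results = [min_height_profile(c) for c in children]
    own = 1 + min(r[0] for r in results)
    worst = max([own] + [r[1] for r in results])
    return own, worst

def is_min_height_bounded(tree, k):
    return min_height_profile(tree)[1] <= k

a = ("a",)
comb = ("s", a, ("s", a, ("s", a, a)))        # every inner node has a leaf child
full = ("s", ("s", a, a), ("s", a, a))        # the root is two steps from any leaf
```

The comb is min-height bounded by 1 no matter how long it grows, while the full binary tree of height 2 needs the bound 2.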
Combinatory Categorial Grammars
Theorem (Under relabeling)
Class of proof trees of 1-CCGs � class of binary regular tree languages
Theorem (Under relabeling*)
Class of proof trees of $\infty$-CCGs � class of simple context-free tree languages
joint work with Marco Kuhlmann
Combinatory Categorial Grammars
[Figure: four proof-tree rewriting rules R1, R2, R3, and R4 over the categories ax/(by), byα/c, c, axα/c, byα, and axα]
joint work with Marco Kuhlmann
Combinatory Categorial Grammars
Properties
✓ simple
✗ ambiguity (several explanations for each recognized tree)
✗ not closed under Boolean operations (union/intersection/complement: ✓/?/✗*)
✓ closed under (non-injective) relabelings
? decidability of membership for subregular classes (0-CCG & 1-CCG) of a regular tree language
Tree Languages
Representations:
- enumerate trees
- proof trees of combinatory categorial grammars
- local tree languages
- tree substitution languages
- regular tree languages
Local tree grammar [Gécseg, Steinby 1984]
Local tree grammar = finite set of legal branchings (together with a set of root labels)
$G = (\Sigma, I, P)$ with $I \subseteq \Sigma$ and $P \subseteq \bigcup_{k \in \mathbb{N}} \Sigma \times \Sigma^k$
Local Tree Languages
Example (with root label S):
  S → NP1 VP2
  VP2 → MD VP3
  NP2 → NP2 PP
  VP3 → VB PP NP2
  MD → must
  ...
Generated tree:
(S
  (NP1 (PRP We))
  (VP2
    (MD must)
    (VP3
      (VB bear)
      (PP (IN in) (NP1 (NN mind)))
      (NP2
        (NP2 (DT the) (NN Community))
        (PP (IN as) (NP2 (DT a) (NN whole)))))))
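A local tree grammar only inspects the root label and the individual branchings of a tree, which the following sketch makes explicit. Instead of guessing the productions elided on the slide, it reads the legal branchings off the example tree itself. The tuple encoding and the names `branchings` and `accepts` are assumptions, as is the simplification that leaves need no branching entry of their own.

```python
def branchings(tree):
    """Collect all (parent label, child labels) pairs that occur in a tree."""
    label, *children = tree
    found = set()
    if children:
        found.add((label, tuple(c[0] for c in children)))
    for child in children:
        found |= branchings(child)
    return found

def accepts(tree, roots, productions):
    """Local check: legal root label and only legal branchings."""
    return tree[0] in roots and branchings(tree) <= productions

A_WHOLE = ("NP2", ("DT", ("a",)), ("NN", ("whole",)))
IN_MIND = ("PP", ("IN", ("in",)), ("NP1", ("NN", ("mind",))))
OBJECT = ("NP2",
          ("NP2", ("DT", ("the",)), ("NN", ("Community",))),
          ("PP", ("IN", ("as",)), A_WHOLE))
SUBJECT = ("NP1", ("PRP", ("We",)))

def sentence(obj):
    return ("S", SUBJECT,
            ("VP2", ("MD", ("must",)), ("VP3", ("VB", ("bear",)), IN_MIND, obj)))

EXAMPLE = sentence(OBJECT)
P = branchings(EXAMPLE)            # branchings read off the example tree
RECOMBINED = sentence(A_WHOLE)     # same branchings, assembled differently
```

Both `EXAMPLE` and `RECOMBINED` pass `accepts(..., {"S"}, P)`: a local grammar cannot tell them apart, because every branching of the second tree already occurs in the first.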