M-monoid parsing and reduct generation
Richard Mörbitz
15th December 2017
Parsing?
Given G = (N, Σ, S, R) with
• N = {S, NP, VP, NNP, ...}
• Σ = {Fruit, bananas, ...}
• R = {S → NP VP, NNP → Fruit, ...}
Task: parse e = Fruit flies like bananas.
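Parse trees
[Figure: the two parse trees of "Fruit flies like bananas", given here in bracket notation]
• (S (NP (NNP Fruit) (NNP flies)) (VP (VBP like) (NP (NNS bananas))))
• (S (NP (NNP Fruit)) (VP (VBZ flies) (PP (IN like) (NP (NNS bananas)))))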
Abstract syntax trees
[Figure: the two parse trees redrawn as abstract syntax trees. Each node is labelled with the rule applied at that position (S → ..., NP → ..., VP → ..., PP → ..., NNP → ..., VBZ → ..., VBP → ..., IN → ..., NNS → ...).]
Recognition
[Figure: both abstract syntax trees annotated with truth values. Every leaf rule evaluates to ⊤, every inner rule node combines the values of its children with ∧, and the values of the two derivations are combined with ∨ at the root: ⊤ ∨ ⊤ = ⊤.]
String probability
[Figure: both abstract syntax trees annotated with probabilities. Every rule node multiplies (×) its rule probability with the values of its children. The two derivations evaluate to 0.000048 and 0.0000015, and the root adds them: 0.000048 + 0.0000015 = 0.0000495.]
Probability of the most likely derivation
[Figure: the same annotated trees as for the string probability, but the root takes the maximum instead of the sum: max(0.000048, 0.0000015) = 0.000048.]
Generic computation
[Figure: the same two abstract syntax trees with abstract operations. Every rule node is evaluated with ⊗, and the values of the two derivations are combined with ⊕ at the root to give parse(e).]
Semiring parsing
Semiring
Definition (Semiring)
A semiring (S, ⊕, ⊗, 0, 1) is an algebraic structure such that
• (S, ⊕, 0) is a commutative monoid,
• (S, ⊗, 1) is a monoid,
• ⊗ is left-distributive and right-distributive over ⊕, and
• a ⊗ 0 = 0 = 0 ⊗ a for every a ∈ S.
S is complete if the infinitary sum Σ⊕ exists.
Semiring parsing
Algorithm for a generic complete semiring (S, ⊕, ⊗, 0, 1) [Goo99]
Instances:
• Recognition: ({⊤, ⊥}, ∨, ∧, ⊥, ⊤)
• String probability: (ℝ_0^∞, +, ×, 0, 1)
• Probability of best derivation: (ℝ_0^1, max, ×, 0, 1)
• Derivation forest: (2^E, ∪, ·, ∅, {ε}), where E is the set of derivations (elements of R*)
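The instances above can be tried out on the running example with a small sketch. The `Semiring` class, the tuple encoding of derivations, and the `lift` function are illustrative choices, not part of the talk. The rule probabilities are assumptions, chosen so that the two derivations evaluate to 0.000048 and 0.0000015 as on the probability slides.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable

@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]
    zero: Any
    one: Any
    lift: Callable[[float], Any]  # maps a rule probability into the semiring

# A derivation is a pair (rule probability, list of sub-derivations).
# All probabilities below are assumed values for illustration.
def leaf(p):
    return (p, [])

NP_BANANAS = (0.2, [leaf(0.3)])                       # NP -> NNS, NNS -> bananas
D_VERB_FLIES = (0.5, [(0.2, [leaf(0.2)]),             # S -> NP VP, NP -> NNP
                      (0.5, [leaf(0.4),               # VP -> VBZ PP
                             (0.5, [leaf(0.4), NP_BANANAS])])])  # PP -> IN NP
D_VERB_LIKE = (0.5, [(0.05, [leaf(0.2), leaf(0.1)]),  # S -> NP VP, NP -> NNP NNP
                     (0.1, [leaf(0.5), NP_BANANAS])])  # VP -> VBP NP

def value(sr, derivation):
    """Combine the rule value with the values of all children using times."""
    p, children = derivation
    return reduce(sr.times, (value(sr, c) for c in children), sr.lift(p))

def total(sr, derivations):
    """Combine the values of all derivations using plus."""
    return reduce(sr.plus, (value(sr, d) for d in derivations), sr.zero)

BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True, lambda p: p > 0)
PROB = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0, float)
VITERBI = Semiring(max, lambda a, b: a * b, 0.0, 1.0, float)

DERIVATIONS = [D_VERB_FLIES, D_VERB_LIKE]
for name, sr in [("recognition", BOOLEAN), ("string probability", PROB), ("best derivation", VITERBI)]:
    print(name, total(sr, DERIVATIONS))
```

Only the semiring changes between the three runs; the traversal of the derivations stays the same, which is the point of the generic formulation.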
M-monoid parsing
Knuth’s generalization of Dijkstra’s algorithm [Knu77]
[Figure: hypergraph with the nodes A, B, and C]
G = (N, Σ, C, R), with
• N = {A, B, C}
• Σ = {α, β, γ, (, ), ,}
• R = {C → γ(A, B), B → β(A, A), A → α}
Mapping val: Σ* → ℝ_0^∞, where each σ ∈ Σ is a mapping σ: (ℝ_0^∞)^k → ℝ_0^∞
Generalization: from (ℝ_0^∞, ≤) to an ordered set (S, ⪯) [Jun06]
Multioperator monoid
Definition (Multioperator monoid)
An M-monoid is an algebraic structure (S, ⊕, 0, Ω) such that
• (S, ⊕, 0) is a commutative monoid,
• Ω is a set of operations on S such that for every ω ∈ Ω: ω(s_1, ..., s_k) = 0 if s_i = 0 for some i, and
• 0_k ∈ Ω for all k ∈ ℕ, where 0_k: S^k → S with 0_k(s_1, ..., s_k) = 0.
S is complete if the infinitary sum Σ⊕ exists.
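One way to obtain an M-monoid is to start from a semiring and take, for every element a, the operations (s_1, ..., s_k) ↦ a ⊗ s_1 ⊗ ... ⊗ s_k. The absorption requirement then follows from a ⊗ 0 = 0 = 0 ⊗ a. The sketch below does this for the best-derivation semiring; the helper name `mul_op` and the weight 0.5 are assumptions made for illustration.

```python
from functools import reduce

def mul_op(times, a):
    """Return the operation (s1, ..., sk) -> a * s1 * ... * sk for any arity k."""
    return lambda *args: reduce(times, args, a)

# Best-derivation semiring ([0, 1], max, x, 0, 1) viewed as an M-monoid.
plus, zero = max, 0.0
times = lambda x, y: x * y

omega = mul_op(times, 0.5)     # operation for a hypothetical rule of weight 0.5
zero_op = mul_op(times, zero)  # plays the role of the constant operations 0_k

print(omega(0.4, 0.012))  # product of the rule weight and both arguments
print(omega(0.4, zero))   # an argument equal to 0 is absorbed
print(zero_op(0.3, 0.7))  # always yields 0
```

The M-monoid view is more general than this construction: the operations in Ω need not be derived from a single multiplication, which the reduct M-monoid later in the talk makes use of.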
M-monoid parsing problem
Given
1. a complete M-monoid (S, ⊕, 0, Ω) with infinitary sum Σ⊕,
2. a weighted LCFRS (G, wt) over S, where G = (N, Σ, Z, R) is an LCFRS over ∆ and wt: R → Ω, and
3. a sentence e = e_1 ... e_n with n ≥ 1 and e_i ∈ ∆
Compute
parse(e) = \sum^{\oplus}_{d \in (T_R)_Z :\ [\![\pi_\Sigma(d)]\!] = e} h(d)
where h: T_R → S is the homomorphism from T_R to (S, Ω).
Weighted deductive parsing [Ned03]
Items: I = {[A, κ] | A ∈ N, κ range vector over e}
Inference rules:
SCAN:
                    
  ───────────────     if ρ = (A → ⟨e_i⟩) in R
  [A, (i−1, i)]
RULE:
  [B_1, κ_1] ... [B_k, κ_k]
  ─────────────────────────     if ρ = (A → σ(B_1, ..., B_k)) in R
  [A, σ(κ_1, ..., κ_k)]
Goal: [Z, (0, |e|)]
M-monoid parsing algorithm
Input
1. an M-monoid (S, ⊕, 0, Ω),
2. an LCFRS⁻ G = (N, Σ, Z, R) over ∆, and wt: R → Ω,
3. a function select: 2^I → I specific to the M-monoid, and
4. a sentence e = e_1 ... e_n with n ≥ 1 and e_i ∈ ∆
Variables
a mapping V: I → S
Output
parse(e)
Algorithm 2.1 M-monoid parsing for LCFRS⁻
(𝒜 is the agenda, 𝒞 the set of completed items)
𝒜, 𝒞 ← ∅
for each A ∈ N and κ range vector over e do
    V([A, κ]) ← 0
for each ρ = (A → σ) in R and [A, κ] generated by SCAN do
    V([A, κ]) ← V([A, κ]) ⊕ wt(ρ)()
    𝒜 ← 𝒜 ∪ {[A, κ]}
while 𝒜 ≠ ∅ do
    [A, κ] ← select(𝒜)
    𝒜 ← 𝒜 \ {[A, κ]}
    𝒞 ← 𝒞 ∪ {[A, κ]}
    for each ρ = (B → σ(B_1, ..., B_k)) in R and [B, η] deduced by RULE from [A, κ] and other items from 𝒞 do
        V([B, η]) ← V([B, η]) ⊕ wt(ρ)(V([B_1, κ_1]), ..., V([B_k, κ_k]))
        if [B, η] ∉ 𝒞 then
            𝒜 ← 𝒜 ∪ {[B, η]}
return V([Z, (0, n)])
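The agenda loop of Algorithm 2.1 can be sketched for the special case of a context-free grammar, where every range vector is a single range (i, j). This is a simplified illustration under several assumptions rather than the algorithm for general LCFRS⁻: right-hand sides have at most two nonterminals, `select` picks an item with the shortest span (sufficient for this toy grammar, not in general), and the rule weights are assumed values chosen to reproduce the numbers on the probability slides.

```python
from math import prod

# Toy grammar for "Fruit flies like bananas"; all weights are assumed values.
LEXICAL = [  # (left-hand side, word, weight)
    ("NNP", "Fruit", 0.2), ("NNP", "flies", 0.1), ("VBZ", "flies", 0.4),
    ("VBP", "like", 0.5), ("IN", "like", 0.4), ("NNS", "bananas", 0.3),
]
BRANCHING = [  # (left-hand side, right-hand side of length 1 or 2, weight)
    ("S", ("NP", "VP"), 0.5), ("NP", ("NNP",), 0.2), ("NP", ("NNP", "NNP"), 0.05),
    ("NP", ("NNS",), 0.2), ("VP", ("VBZ", "PP"), 0.5), ("VP", ("VBP", "NP"), 0.1),
    ("PP", ("IN", "NP"), 0.5),
]

def select(agenda):
    """Pick an item (A, i, j) with the shortest span; ties are broken by item order."""
    return min(agenda, key=lambda item: (item[2] - item[1], item))

def parse(words, plus, zero, wt, start="S"):
    """wt(weight) returns the M-monoid operation of a rule as a Python callable."""
    V, agenda, chart = {}, set(), set()

    def add(item, val):
        V[item] = plus(V.get(item, zero), val)
        if item not in chart:
            agenda.add(item)

    for lhs, word, w in LEXICAL:                      # SCAN
        for i, e in enumerate(words):
            if e == word:
                add((lhs, i, i + 1), wt(w)())
    while agenda:
        item = select(agenda)
        agenda.remove(item)
        chart.add(item)
        a, i, j = item
        for lhs, rhs, w in BRANCHING:                 # RULE
            if rhs == (a,):
                add((lhs, i, j), wt(w)(V[item]))
            elif len(rhs) == 2:
                if rhs[0] == a:                       # selected item is the left child
                    for right in list(chart):
                        if right[0] == rhs[1] and right[1] == j:
                            add((lhs, i, right[2]), wt(w)(V[item], V[right]))
                if rhs[1] == a:                       # selected item is the right child
                    for left in list(chart):
                        if left[0] == rhs[0] and left[2] == i:
                            add((lhs, left[1], j), wt(w)(V[left], V[item]))
    return V.get((start, 0, len(words)), zero)

WORDS = "Fruit flies like bananas".split()
weighted = lambda w: lambda *xs: w * prod(xs)
inside = parse(WORDS, lambda a, b: a + b, 0.0, weighted)
best = parse(WORDS, max, 0.0, weighted)
recognized = parse(WORDS, lambda a, b: a or b, False, lambda w: lambda *xs: w > 0 and all(xs))
print(inside, best, recognized)
```

Every rule instance contributes exactly once, namely when the last of its antecedents moves from the agenda to the chart. For a non-idempotent ⊕ such as +, this only gives the right result if the values of the antecedents are already final at that moment, which is why the choice of `select` depends on the M-monoid.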
Reduct generation with M-monoid parsing
Reduct of a grammar and a sentence
Given
• an LCFRS G = (N, Σ, Z, R) over ∆ (RTG-notation)
• a sentence e ∈ ∆*
Compute the LCFRS G ⊲_ψ e = (N′, Σ, Z′, R′) such that
1. ⟦L(G ⊲_ψ e)⟧_LCFRS = ⟦L(G)⟧_LCFRS ∩ {e}, and
2. with the mapping ψ: N′ → N, there exists a bijective mapping from the ASTs of G ⊲_ψ e to the ASTs of G
Preliminary definitions
Prototype nonterminals:
P_N = {[A, κ] | A ∈ N ∧ κ is a range vector over e}
Prototype rules:
P_R = {[A, κ] → σ([B_1, κ_1], ..., [B_k, κ_k]) | (A → σ(B_1, ..., B_k)) ∈ R ∧ κ, κ_1, ..., κ_k are range vectors over e}
The reduct problem as an instance of M-monoid parsing
Given:
1. (𝒢, Σ∪), where
   • 𝒢 = (2^{P_N} × 2^{P_R}, ∪, ∅ × ∅, Ω_REDUCT) is the reduct M-monoid
   • Σ∪_{i ∈ I} s_i = ⋃_{i ∈ I} s_i is used as the infinitary sum operation
2. (G, wt) for an arbitrary G over ∆ and wt: R → Ω_REDUCT
3. an arbitrary e ∈ ∆*
Compute:
parse(e) = ({A ∈ N′ | A is of the form [Z, κ]}, R′)
Then G ⊲_ψ e = (N′, Σ, Z′, R′), where
• R′ = v where (u, v) = parse(e),
• N′ = {[A, κ] | [A, κ] is the left-hand side of a rule in R′},
• Z′ = [Z, (0, |e|)]
The operations of Ω_REDUCT
• If ρ is of the form A → ⟨w⟩, then
  ω_ρ() = ⟨{[A, (i−1, i)] | e_i = w}, {[A, (i−1, i)] → ⟨w⟩ | e_i = w}⟩.
• If ρ is of the form A → σ(B_1, ..., B_k), then
  ω_ρ((U_1, V_1), ..., (U_k, V_k)) = ⟨U, V⟩, where
  U = {[A, σ(η_1, ..., η_k)] | (η_1, ..., η_k) ∈ fit_σ(U_1, ..., U_k)}
  V = ⋃_{1 ≤ i ≤ k} V_i ∪ {[A, σ(η_1, ..., η_k)] → σ([B_1, η_1], ..., [B_k, η_k]) | (η_1, ..., η_k) ∈ fit_σ(U_1, ..., U_k)}
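The two kinds of operations can be sketched for the context-free special case, in which σ concatenates its arguments and a tuple of ranges fits if neighbouring ranges are adjacent. The function names, the tuple encoding of prototype nonterminals and rules, and the explicit check for the zero element are assumptions made for this illustration.

```python
from itertools import product

ZERO = (frozenset(), frozenset())  # the pair (∅, ∅), neutral element of the reduct M-monoid

def plus(x, y):
    """Componentwise union of two pairs (prototype nonterminals, prototype rules)."""
    return (x[0] | y[0], x[1] | y[1])

def omega_lex(lhs, word, sentence):
    """Operation for a rule A -> <w>: one item per occurrence of w in the sentence."""
    items = frozenset((lhs, (i, i + 1)) for i, e in enumerate(sentence) if e == word)
    rules = frozenset((item, (word,)) for item in items)
    return (items, rules)

def omega_concat(lhs, *args):
    """Operation for a rule A -> B1 ... Bk; each argument is a pair (U_i, V_i)."""
    if any(arg == ZERO for arg in args):
        return ZERO  # operations of an M-monoid absorb the zero element
    new_items, new_rules = set(), set()
    for combo in product(*(U for U, _ in args)):
        ranges = [rng for _, rng in combo]
        # the ranges fit if each one ends where the next one starts
        if all(a[1] == b[0] for a, b in zip(ranges, ranges[1:])):
            head = (lhs, (ranges[0][0], ranges[-1][1]))
            new_items.add(head)
            new_rules.add((head, tuple(combo)))
    old_rules = frozenset().union(*(V for _, V in args))
    return (frozenset(new_items), old_rules | frozenset(new_rules))

sentence = "Fruit flies like bananas".split()
fruit = omega_lex("NNP", "Fruit", sentence)   # yields [NNP, (0, 1)]
flies = omega_lex("NNP", "flies", sentence)   # yields [NNP, (1, 2)]
np_items, np_rules = omega_concat("NP", fruit, flies)
print(sorted(np_items))
print(len(np_rules))
```

Applied to NP → NNP NNP, the operation produces the prototype nonterminal [NP, (0, 2)] and collects the two lexical prototype rules together with the new rule for [NP, (0, 2)], so the second component grows into the rule set R′ of the reduct.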
Example: reduct generation
Sentence positions: 0 Fruit 1 flies 2 like 3 bananas 4
Step 1 (SCAN) yields the prototype rules
• [NNP, (0, 1)] → ⟨Fruit⟩
• [NNP, (1, 2)] → ⟨flies⟩
• [VBZ, (1, 2)] → ⟨flies⟩
• [IN, (2, 3)] → ⟨like⟩
• [VBP, (2, 3)] → ⟨like⟩
• [NNS, (3, 4)] → ⟨bananas⟩
Step 2 adds [NP, (0, 1)] → ⟨x_1^1⟩([NNP, (0, 1)])
Step 3 adds [NP, (0, 2)] → ⟨x_1^1 x_2^1⟩([NNP, (0, 1)], [NNP, (1, 2)])