Semiring parsing (Arnd Hartmanns): presentation slides

  1. Semiring parsing Arnd Hartmanns

  2. Probabilistic grammars: Motivation
     Natural language is ambiguous.
     Sentence: "I saw the man with the telescope."

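  4. Probabilistic grammars: Probabilistic context-free grammars
     Context-free grammar G = (N, T, S, R) + probability distribution on derivations,
     e.g. P(first parse tree) = 0.001, but P(second parse tree) = 0.000001
     [Figure: two parse trees rooted in S, each ending in "telescope"]
     Use $p : R \to [0,1]$ s.t. $\forall A \in N : \sum_{(A \to \beta) \in R} p(A \to \beta) = 1$
     and get $P(\text{derivation}) = \prod_{(A \to \beta) \in \text{derivation}} p(A \to \beta)$, with one factor per rule application.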
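
The two conditions on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the rule table reuses the toy probabilities from the example PCFG on the following slides, and the names `RULES`, `is_proper` and `derivation_probability` are hypothetical, not part of the presentation.

```python
from collections import defaultdict
from math import isclose, prod

# Hypothetical rule table: (left-hand side, right-hand side) -> probability.
# The numbers are the toy values of the slides' example PCFG.
RULES = {
    ("A", ("A", "a")): 0.4,
    ("A", ("a", "A")): 0.1,
    ("A", ("a", "a")): 0.5,
}

def is_proper(rules):
    """Check that the rule probabilities sum to 1 for every left-hand side."""
    totals = defaultdict(float)
    for (lhs, _rhs), p in rules.items():
        totals[lhs] += p
    return all(isclose(total, 1.0) for total in totals.values())

def derivation_probability(derivation, rules):
    """Multiply the probabilities of all rule applications in a derivation."""
    return prod(rules[rule] for rule in derivation)

# One derivation of "a a a": apply A → Aa, then A → aa.
left_branching = [("A", ("A", "a")), ("A", ("a", "a"))]
print(is_proper(RULES))
print(derivation_probability(left_branching, RULES))
```

A derivation is represented here as a plain list of rule applications, which is enough for the product; a tree structure would only be needed to recover the parse itself.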

  5. Probabilistic grammars: Example PCFG
     A → A P | P A | P P
     P → I saw | the man | with the telescope
     [Figure: the two parse trees for "I saw the man with the telescope"]

  6. Probabilistic grammars: Example PCFG
     A → A a    p(A → Aa) = 0.4
     A → a A    p(A → aA) = 0.1
     A → a a    p(A → aa) = 0.5
     [Figure: the two parse trees, with each phrase P abbreviated to a]
     Lower probability: P = 0.1 × 0.5 = 0.05
     Higher probability: P = 0.4 × 0.5 = 0.2

  7. Probabilistic grammars: Interesting values
     Values of interest: inside probability, Viterbi, Viterbi-derivation, Viterbi-n-best, outside probability.
     Telescope grammar: p(A → Aa) = 0.4, p(A → aA) = 0.1, p(A → aa) = 0.5
     Example calculations for the input . a . a . a . (positions 1 2 3 4):
     inside(1, A, 4) = P(first tree) + P(second tree) = 0.2 + 0.05 = 0.25
     viterbi-derivation(1, A, 4) = [Figure: the higher-probability parse tree]
     viterbi(1, A, 4) = 0.2
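
The inside and Viterbi numbers above can be reproduced with a short recursion. The sketch below is hard-wired to the telescope grammar and to inputs consisting only of a's; the function name `value_of_A` and the `combine` parameter are assumptions made for this illustration.

```python
P_LEFT = 0.4   # p(A → Aa)
P_RIGHT = 0.1  # p(A → aA)
P_BASE = 0.5   # p(A → aa)

def value_of_A(length, combine):
    """Value of A over `length` consecutive a's.

    `combine` merges the values of alternative derivations:
    `sum` gives the inside probability, `max` the Viterbi value.
    """
    if length < 2:
        return 0.0      # A always derives at least two a's
    if length == 2:
        return P_BASE   # only A → aa applies
    shorter = value_of_A(length - 1, combine)
    # Either A → Aa (A covers the first length-1 a's)
    # or A → aA (A covers the last length-1 a's).
    return combine([P_LEFT * shorter, P_RIGHT * shorter])

inside = value_of_A(3, sum)    # inside(1, A, 4)
viterbi = value_of_A(3, max)   # viterbi(1, A, 4)
print(inside, viterbi)
```

Only the way alternatives are combined changes between the two values, which is the observation the rest of the deck generalises.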

  8. Semirings?

  9. Extending CKY: CKY parsing
     Input: w_1 … w_n; goal item: [1,S,n+1]
     [Figure: tree with root S over w_1 … w_n]
     Beyond recognition: [i,A,k] provable ⇔ V[i,A,k] = true
     [Figure: subtree with root A and children B and C over part of the input]

  10. Extending CKY: CKY parsing
      Input: w_1 … w_n; goal item: [1,S,n+1]
      Rules: $\dfrac{(A \to w_i) \in R}{[i,A,i+1]}$, $\dfrac{(A \to BC) \in R \quad [i,B,m] \quad [m,C,k]}{[i,A,k]}$
      Beyond recognition:
      Unary rule: (A → w_i) ∈ R ⇒ V[i,A,i+1] = true
      Binary rule: (A → BC) ∈ R ⇒ V[i,A,k] = V[i,A,k] ∨ (V[i,B,m] ∧ V[m,C,k])
      success = V[1,S,n+1]
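
A minimal recogniser following these two update rules might look as follows. It is a sketch, not the presenter's code: the function name `cky_recognize`, the rule encoding, and the test grammar (T → a, A → TT | AT | TA, borrowed from the derivation examples later in the deck) are assumptions.

```python
def cky_recognize(words, lexical, binary, start):
    """Boolean CKY for a grammar in Chomsky normal form.

    lexical: iterable of (A, word) pairs for rules A → word
    binary:  iterable of (A, B, C) triples for rules A → B C
    Items (i, A, k) use the slide's 1-based positions; a word w_i spans [i, i+1].
    """
    n = len(words)
    provable = set()
    # Unary rule: (A → w_i) ∈ R ⇒ V[i,A,i+1] = true
    for i, word in enumerate(words, start=1):
        for A, terminal in lexical:
            if terminal == word:
                provable.add((i, A, i + 1))
    # Binary rule: V[i,A,k] = V[i,A,k] ∨ (V[i,B,m] ∧ V[m,C,k]), shorter spans first
    for width in range(2, n + 1):
        for i in range(1, n - width + 2):
            k = i + width
            for m in range(i + 1, k):
                for A, B, C in binary:
                    if (i, B, m) in provable and (m, C, k) in provable:
                        provable.add((i, A, k))
    return (1, start, n + 1) in provable   # success = V[1,S,n+1]

LEXICAL = [("T", "a")]
BINARY = [("A", "T", "T"), ("A", "A", "T"), ("A", "T", "A")]
print(cky_recognize(["a", "a", "a"], LEXICAL, BINARY, "A"))
```

Storing only the provable items in a set is equivalent to a table V of booleans that defaults to false.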

  11. Extending CKY: CKY parsing
      Input: w_1 … w_n; goal item: [1,S,n+1]
      Rules: $\dfrac{(A \to w_i) \in R}{[i,A,i+1]}$, $\dfrac{(A \to BC) \in R \quad [i,B,m] \quad [m,C,k]}{[i,A,k]}$
      Beyond recognition:
      Unary rule: (A → w_i) ∈ R ⇒ V[i,A,i+1] = p(A → w_i)
      Binary rule: (A → BC) ∈ R ⇒ V[i,A,k] = V[i,A,k] + (V[i,B,m] × V[m,C,k] × p(A → BC))
      inside = V[1,S,n+1]

  12. Semirings: Semiring definition
      Recall: field → ring → semiring. A field has +, −x, 0 and ×, x⁻¹, 1; a ring drops x⁻¹; a semiring also drops −x.
      Complete semiring: the infinite sum $\bigoplus_{i=1}^{\infty} x_i$ is well-defined.
      Some semirings:
      Natural numbers: 〈ℕ[0,∞], +, ×, 0, 1〉
      Reals with max: 〈ℝ[0,1], max, ×, 0, 1〉
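
As a rough illustration of what the definition demands, the sketch below spot-checks the semiring axioms on a handful of sample values. It is a sanity check on finitely many samples, not a proof, and the `Semiring` tuple and `satisfies_axioms` helper are names invented for this example.

```python
from fractions import Fraction
from itertools import product
from typing import Any, Callable, NamedTuple

class Semiring(NamedTuple):
    plus: Callable[[Any, Any], Any]    # ⊕
    times: Callable[[Any, Any], Any]   # ⊗
    zero: Any                          # identity of ⊕, annihilator of ⊗
    one: Any                           # identity of ⊗

def satisfies_axioms(sr, samples):
    """Spot-check the semiring axioms on all triples of sample values."""
    for a, b, c in product(samples, repeat=3):
        ok = (
            sr.plus(a, b) == sr.plus(b, a)
            and sr.plus(sr.plus(a, b), c) == sr.plus(a, sr.plus(b, c))
            and sr.times(sr.times(a, b), c) == sr.times(a, sr.times(b, c))
            and sr.times(a, sr.plus(b, c)) == sr.plus(sr.times(a, b), sr.times(a, c))
            and sr.times(sr.plus(a, b), c) == sr.plus(sr.times(a, c), sr.times(b, c))
            and sr.plus(a, sr.zero) == a
            and sr.times(a, sr.one) == a and sr.times(sr.one, a) == a
            and sr.times(a, sr.zero) == sr.zero and sr.times(sr.zero, a) == sr.zero
        )
        if not ok:
            return False
    return True

# The two semirings named on the slide (without the element ∞).
naturals = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0, 1)
reals_with_max = Semiring(max, lambda a, b: a * b, Fraction(0), Fraction(1))
# A non-example: with max and +, the value 0 does not annihilate under +.
not_a_semiring = Semiring(max, lambda a, b: a + b, 0, 0)

print(satisfies_axioms(naturals, [0, 1, 2, 3]))
print(satisfies_axioms(reals_with_max, [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)]))
print(satisfies_axioms(not_a_semiring, [0, 1, 2, 3]))
```

Exact fractions are used for the [0,1] samples so that the equality tests are not disturbed by floating-point rounding.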

  13. Extending CKY: Derivations
      [Figure: grammar derivation tree for "a a a" using A → AT, A → TT and T → a (three times), next to the parser's item derivation tree with items [1,T,2], [2,T,3], [3,T,4], [1,A,3] and [1,A,4]]
      Derivation values
      Grammar: multiply all rule values

  14. Extending CKY: Derivations
      [Figure: grammar derivation tree for "a a a" using A → AT, A → TT and T → a (three times), next to the parser's item derivation tree: [1,A,4] from A → AT, [1,A,3], [3,T,4]; [1,A,3] from A → TT, [1,T,2], [2,T,3]; each T item from T → a]
      Derivation values
      Grammar: multiply all rule values
      Parser: multiply rule values recursively via item values

  15. Semiring computations: Notations
      Value of a rule: R(A → BC), taken from the semiring
      Grammar derivation: E = e_1 … e_m, a list of rules
      Item derivation tree: D = D_1 … D_m, whose leaves are rules
      Value of a derivation
        Grammar: $V_G(E) = \bigotimes_{i=1}^{m} R(e_i)$
        Parser: $V(D) = R(D)$ for a leaf node and $V(D) = \bigotimes_{i=1}^{m} V(D_i)$ for an inner node, so $V(D) = \bigotimes_{d \text{ leaf of } D} R(d)$
      Value of a word or item
        Grammar (word derivable by E_1 … E_k): $V_G = \bigoplus_{j=1}^{k} V_G(E_j)$
        Parser (item x heading D_1 … D_k): $V(x) = \bigoplus_{j=1}^{k} V(D_j)$
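
These definitions translate almost directly into code. In the sketch below, an item derivation tree is a nested list whose leaves are rule names; the rule values in `R` are hypothetical (the slides give none for this grammar), and ordinary + and × stand in for ⊕ and ⊗ as in the inside semiring.

```python
from functools import reduce
from operator import add, mul

# Hypothetical rule values, not taken from the slides.
R = {"T → a": 1.0, "A → TT": 0.5, "A → AT": 0.25, "A → TA": 0.25}

def grammar_value(E, times=mul, one=1.0):
    """V_G(E): ⊗ over the values of all rules e_1 … e_m."""
    return reduce(times, (R[e] for e in E), one)

def tree_value(D, times=mul, one=1.0):
    """V(D): R(D) for a leaf (a rule name), ⊗ over the subtrees for an inner node."""
    if isinstance(D, str):
        return R[D]
    return reduce(times, (tree_value(child, times, one) for child in D), one)

def item_value(trees, plus=add, zero=0.0, times=mul, one=1.0):
    """V(x): ⊕ over the values of all derivation trees headed by item x."""
    return reduce(plus, (tree_value(D, times, one) for D in trees), zero)

# The two item derivation trees of [1,A,4] for the input "a a a".
via_AT = ["A → AT", ["A → TT", ["T → a"], ["T → a"]], ["T → a"]]
via_TA = ["A → TA", ["T → a"], ["A → TT", ["T → a"], ["T → a"]]]
E = ["A → AT", "A → TT", "T → a", "T → a", "T → a"]

print(grammar_value(E), tree_value(via_AT), item_value([via_AT, via_TA]))
```

The grammar-side and parser-side values of the same derivation agree, which is the correspondence the notation is set up to express.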

  16. Semirings: Useful semirings
      Recognition: 〈{true, false}, ∨, ∧, false, true〉
      Derivation number: 〈ℕ[0,∞], +, ×, 0, 1〉
      Derivation forest: 〈2^𝔽, ∪, ∙, ∅, {〈〉}〉
      Inside probability: 〈ℝ[0,∞], +, ×, 0, 1〉
      Viterbi: 〈ℝ[0,1], max, ×, 0, 1〉
      Viterbi-derivation: 〈ℝ[0,1] × 2^𝔽, max_Vit, ×_Vit, 〈0, ∅〉, 〈1, {〈〉}〉〉
      Viterbi-n-best: way too complicated…
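
To make the "swap the operations" idea concrete, here is a sketch of a single CKY routine parameterised by a semiring and run with four of the semirings listed above. The `Semiring` tuple, the `cky` function, the `constant` helper and the rule probabilities are assumptions for illustration; the grammar T → a, A → TT | AT | TA is the one used in the deck's derivation examples.

```python
from collections import defaultdict
from operator import add, mul
from typing import Any, Callable, NamedTuple

class Semiring(NamedTuple):
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]
    zero: Any
    one: Any

RECOGNITION = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)
COUNTING = Semiring(add, mul, 0, 1)         # derivation number
INSIDE = Semiring(add, mul, 0.0, 1.0)
VITERBI = Semiring(max, mul, 0.0, 1.0)

def cky(words, lexical, binary, sr, start):
    """CKY over a semiring; lexical maps (A, word) and binary maps (A, B, C) to rule values."""
    n = len(words)
    V = defaultdict(lambda: sr.zero)         # unseen items have value zero
    for i, word in enumerate(words, start=1):
        for (A, terminal), r in lexical.items():
            if terminal == word:
                V[i, A, i + 1] = sr.plus(V[i, A, i + 1], r)
    for width in range(2, n + 1):            # shorter spans first
        for i in range(1, n - width + 2):
            k = i + width
            for m in range(i + 1, k):
                for (A, B, C), r in binary.items():
                    term = sr.times(r, sr.times(V[i, B, m], V[m, C, k]))
                    V[i, A, k] = sr.plus(V[i, A, k], term)
    return V[1, start, n + 1]                # value of the goal item

# Hypothetical rule probabilities for the example grammar.
LEXICAL_P = {("T", "a"): 1.0}
BINARY_P = {("A", "T", "T"): 0.5, ("A", "A", "T"): 0.25, ("A", "T", "A"): 0.25}

def constant(rules, value):
    """Give every rule the same value, e.g. true for recognition or 1 for counting."""
    return {rule: value for rule in rules}

words = ["a", "a", "a"]
recognized = cky(words, constant(LEXICAL_P, True), constant(BINARY_P, True), RECOGNITION, "A")
count = cky(words, constant(LEXICAL_P, 1), constant(BINARY_P, 1), COUNTING, "A")
inside = cky(words, LEXICAL_P, BINARY_P, INSIDE, "A")
viterbi = cky(words, LEXICAL_P, BINARY_P, VITERBI, "A")
print(recognized, count, inside, viterbi)
```

The rule value is multiplied on the left here; for the commutative semirings used in this sketch the order makes no difference, but it would matter for the derivation forest semiring.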

  17. Semiring computations: Derivation forest example
      Semiring: 〈2^𝔽, ∪, ∙, ∅, {〈〉}〉
      Input: . a . a . a . (positions 1 2 3 4)
      Rule: $\dfrac{(T \to a)}{[i,T,i+1]}$
      V([1,T,2]) = {〈T → a〉}
      V([2,T,3]) = {〈T → a〉}
      V([3,T,4]) = {〈T → a〉}

  18. Semiring computations: Derivation forest example
      Semiring: 〈2^𝔽, ∪, ∙, ∅, {〈〉}〉
      Input: . a . a . a . (positions 1 2 3 4)
      Rule: $\dfrac{(A \to TT) \quad [i,T,m] \quad [m,T,k]}{[i,A,k]}$
      V([1,T,2]) = {〈T → a〉}
      V([2,T,3]) = {〈T → a〉}
      V([3,T,4]) = {〈T → a〉}
      V([1,A,3]) = {〈A → TT, T → a, T → a〉}
      V([2,A,4]) = {〈A → TT, T → a, T → a〉}

  19. Semiring computations: Derivation forest example
      Semiring: 〈2^𝔽, ∪, ∙, ∅, {〈〉}〉
      Input: . a . a . a . (positions 1 2 3 4)
      Rule: $\dfrac{(A \to TA) \quad [i,T,m] \quad [m,A,k]}{[i,A,k]}$
      V([1,T,2]) = {〈T → a〉}
      V([2,T,3]) = {〈T → a〉}
      V([3,T,4]) = {〈T → a〉}
      V([1,A,3]) = {〈A → TT, T → a, T → a〉}
      V([2,A,4]) = {〈A → TT, T → a, T → a〉}
      V([1,A,4]) = {〈A → AT, A → TT, T → a, T → a, T → a〉} ∪ {〈A → TA, A → TT, T → a, T → a, T → a〉}
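
The forest values in this example can be recomputed with Python sets of tuples, taking ∪ as set union and ∙ as pairwise concatenation. This is a sketch under one stated assumption: each tuple lists the rule first, then the left item's rules, then the right item's rules, so the order of rules inside the A → TA derivation may differ from the order printed on the slide. The helper names `concat` and `rule` are hypothetical.

```python
def concat(X, Y):
    """The ∙ operation: concatenate every derivation in X with every derivation in Y."""
    return {x + y for x in X for y in Y}

def rule(name):
    """Value of a single rule: a forest containing one one-element derivation."""
    return {(name,)}

ZERO = set()     # ∅, the additive identity
ONE = {()}       # {〈〉}, the multiplicative identity

# Items for the input "a a a".
T12 = T23 = T34 = rule("T → a")
A13 = concat(rule("A → TT"), concat(T12, T23))
A24 = concat(rule("A → TT"), concat(T23, T34))
A14 = (concat(rule("A → AT"), concat(A13, T34))
       | concat(rule("A → TA"), concat(T12, A24)))

for derivation in sorted(A14):
    print(derivation)
```

Unlike the numeric semirings, ∙ is not commutative, so the order in which a parser multiplies rule and item values determines the order of rules inside each derivation.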

  20. Semiring parsing: Beyond CKY
      Works for many parsers, e.g. Earley, but also for TAGs.
      Omissions:
      Outside values: complicated, but similar proofs
      Infinite summation for A → A: semiring-dependent
      Further reading:
      Joshua Goodman: Semiring parsing … and his Ph.D. thesis
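
Why the infinite summation is semiring-dependent can be seen on a tiny, hypothetical grammar that is not from the slides: A → A with value p and A → a with value q. The word "a" then has infinitely many derivations, with values q, p ⊗ q, p ⊗ p ⊗ q, and so on. The sketch below compares partial sums in three semirings; all names and numbers are assumptions for illustration.

```python
from itertools import islice
from math import isclose
from operator import mul

def derivation_values(p, q, times):
    """Yield the values of A ⇒ a, A ⇒ A ⇒ a, A ⇒ A ⇒ A ⇒ a, …"""
    value = q
    while True:
        yield value
        value = times(p, value)

# Inside semiring: the sum is a geometric series converging to q / (1 - p).
inside_partial = sum(islice(derivation_values(0.5, 0.5, mul), 60))
inside_limit = 0.5 / (1 - 0.5)

# Derivation number: every derivation counts 1, so partial sums grow without bound
# and the total is only defined because the semiring includes ∞.
count_partial = sum(islice(derivation_values(1, 1, mul), 1000))

# Viterbi: ⊕ is max, and the best derivation is the one that never uses A → A.
viterbi_partial = max(islice(derivation_values(0.5, 0.5, mul), 60))

print(inside_partial, inside_limit, count_partial, viterbi_partial)
```

A parser therefore needs a separate way of evaluating such infinite sums for each semiring rather than one generic loop.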

  22. Semiring parsing: Summary
      Natural language processing problems
      Probabilistic grammars: p(A → Aa) = 0.4
      Inside probability, Viterbi, …
      Semiring operation substitution: ⊕, ⊗
      One parser, many values
