Neural Grammatical Error Correction with Finite State Transducers

Felix Stahlberg, Christopher Bryant, and Bill Byrne, Department of Engineering


  1. Neural Grammatical Error Correction with Finite State Transducers
     Felix Stahlberg, Christopher Bryant, and Bill Byrne, Department of Engineering

  2. Informal introduction to finite state transducers
     • FSTs are graph structures with a start state and a final state.
     • Arcs are annotated with:
       • an input symbol
       • an output symbol
       • a weight
     • The FST transduces an input string s₁ to an output string s₂ iff there is a path from the start state to the final state such that:
       • s₁ is the concatenation of all input symbols on the path
       • s₂ is the concatenation of all output symbols on the path
     • The cost of this mapping is the sum of the arc weights along the path (the minimum over all such paths).
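The definition on this slide can be made concrete with a small sketch. The arc-list representation, the example weights, and the function name `transduction_cost` are assumptions made for illustration, not part of the presentation; the search assumes non-negative weights so that a Dijkstra-style search finds the minimal cost.

```python
import heapq

# An arc is (source_state, input_symbol, output_symbol, weight, target_state).
# This toy FST maps the string "abc" to itself; the weights are made up.
ARCS = [
    (0, "a", "a", 0.5, 1),
    (1, "b", "b", 0.25, 2),
    (2, "c", "c", 0.25, 3),
]
START, FINAL = 0, 3


def transduction_cost(arcs, start, final, source, target):
    """Return the minimal sum of arc weights over all paths that read
    `source` and write `target`, or None if no such path exists.
    Assumes non-negative weights and single-character symbols."""
    # A search node is (state, symbols of source read, symbols of target written).
    queue = [(0.0, start, 0, 0)]
    visited = set()
    while queue:
        cost, state, i, j = heapq.heappop(queue)
        if (state, i, j) in visited:
            continue
        visited.add((state, i, j))
        if state == final and i == len(source) and j == len(target):
            return cost
        for src, in_sym, out_sym, weight, dst in arcs:
            if src != state:
                continue
            # The empty string "" plays the role of the empty symbol.
            ni, nj = i + len(in_sym), j + len(out_sym)
            if source[i:ni] == in_sym and target[j:nj] == out_sym:
                heapq.heappush(queue, (cost + weight, dst, ni, nj))
    return None


cost_same = transduction_cost(ARCS, START, FINAL, "abc", "abc")
cost_other = transduction_cost(ARCS, START, FINAL, "abc", "abd")
```

Here `cost_same` is the sum of the three arc weights, and `cost_other` is `None` because no path writes "abd".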

  3. Example FSTs
     • Maps s₁ = abc to itself

  4. Example FSTs
     • Maps s₁ = abc to itself
     [Figure: FST diagram with the start state, the final state, an input symbol, and an output symbol labeled]

  5. Example FSTs
     • Maps s₁ = abc to itself
     [Figure: FST diagram with the start state, the final state, and a combined input and output symbol labeled]

  6. Example FSTs
     • Maps s₁ = abc to itself
     • Maps any string consisting only of a symbols to itself

  7. Example FSTs
     ε: empty input/output symbol
     • Represents an n-best list

  8. Example FSTs
     ε: empty input/output symbol
     • Represents an n-best list
     • After determinization, ε-removal, minimization, weight pushing

  9. FST composition
     • Composition combines two FSTs T₁ and T₂ into a new FST T₁ ∘ T₂.
     • If T₁ maps s₁ to s₂ and T₂ maps s₂ to s₃, then T₁ ∘ T₂ maps s₁ to s₃.
     • The cost is the (minimum) sum of the path costs in T₁ and T₂.
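As a rough illustration of composition, the sketch below pairs up states of two FSTs and joins arcs whose symbols match. It assumes FSTs without empty symbols, represented as `(arcs, start, finals)` tuples; this representation and the name `compose` are hypothetical and are not taken from the presentation or from any FST library.

```python
def compose(fst1, fst2):
    """Compose two FSTs that contain no empty symbols.

    Each FST is (arcs, start, finals), where an arc is
    (source_state, input_symbol, output_symbol, weight, target_state).
    States of the result are pairs (state_in_fst1, state_in_fst2)."""
    arcs1, start1, finals1 = fst1
    arcs2, start2, finals2 = fst2
    start = (start1, start2)
    arcs, seen, stack = [], {start}, [start]
    while stack:
        q1, q2 = stack.pop()
        for src1, in1, out1, w1, dst1 in arcs1:
            if src1 != q1:
                continue
            for src2, in2, out2, w2, dst2 in arcs2:
                # The output of the first FST must be the input of the second.
                if src2 != q2 or in2 != out1:
                    continue
                dst = (dst1, dst2)
                # Arc weights add up, so path costs add up as well.
                arcs.append(((q1, q2), in1, out2, w1 + w2, dst))
                if dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
    finals = {s for s in seen if s[0] in finals1 and s[1] in finals2}
    return arcs, start, finals


# The first FST maps "a" to "b" at cost 1, the second maps "b" to "c" at cost 2.
first = ([(0, "a", "b", 1.0, 1)], 0, {1})
second = ([(0, "b", "c", 2.0, 1)], 0, {1})
composed_arcs, composed_start, composed_finals = compose(first, second)
```

The composed FST has a single arc that maps "a" to "c" with weight 3.0. When several paths realize the same mapping, the minimum is taken later by a shortest-path search over the composed FST.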

  10. FST composition examples
      • Composition and weights
      [Figure: T₁, T₂, and T₁ ∘ T₂]

  11. FST composition examples
      • Counting transducers
      [Figure: T₁, T₂, and T₁ ∘ T₂]

  12. FST composition examples
      • Language models
      [Figure: T₁, T₂, and T₁ ∘ T₂]

  13. FST composition examples
      • 1:1 corrections
      [Figure: T₁, T₂, and T₁ ∘ T₂]

  14. FST-based unsupervised grammatical error correction
      [Figure: transducers I (Input), E (Edit), P (Penalization), L (5-gram LM), …]

  15. FST-based unsupervised grammatical error correction
      • I: Input
      • E: Edit
      • P: Penalization
      • L: 5-gram LM
      [Figure: the intermediate compositions I ∘ E and I ∘ E ∘ P]
      I ∘ E ∘ P ∘ L: Non-neural unsupervised GEC with 5-gram LM scores
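The effect of composing the input, edit, penalization, and language model transducers can be imitated on a toy scale without any FST machinery. The sketch below enumerates candidate corrections by brute force and picks the cheapest one, which is what a shortest-path search over the composition would return. Everything specific in it is an assumption for illustration: the confusion sets, the per-edit penalty, and a bigram cost table standing in for the 5-gram LM.

```python
from itertools import product

# Hypothetical confusion sets (the role of the edit transducer).
CONFUSIONS = {"in": ["in", "on", "at"], "a": ["a", "an", "the"]}
# Hypothetical cost per edited word (the role of the penalization transducer).
EDIT_PENALTY = 1.0
# Hypothetical bigram costs (a stand-in for the 5-gram LM scores).
BIGRAM_COST = {
    ("<s>", "she"): 0.5, ("she", "sat"): 0.5,
    ("sat", "on"): 0.5, ("sat", "in"): 2.0,
    ("on", "a"): 0.5, ("in", "a"): 1.0,
    ("a", "chair"): 0.5, ("chair", "</s>"): 0.5,
}
UNSEEN_COST = 5.0


def lm_cost(words):
    """Sum the bigram costs of a sentence, with a flat cost for unseen bigrams."""
    padded = ["<s>"] + list(words) + ["</s>"]
    return sum(BIGRAM_COST.get(bigram, UNSEEN_COST)
               for bigram in zip(padded, padded[1:]))


def correct(sentence):
    """Return (cost, best_candidate) over all 1:1 word substitutions."""
    words = sentence.split()
    options = [CONFUSIONS.get(word, [word]) for word in words]
    best = None
    for candidate in product(*options):
        edits = sum(new != old for new, old in zip(candidate, words))
        cost = EDIT_PENALTY * edits + lm_cost(candidate)
        if best is None or cost < best[0]:
            best = (cost, " ".join(candidate))
    return best


result = correct("she sat in a chair")
```

With these made-up numbers, `result` is `(4.0, "she sat on a chair")`: one edit costs 1.0, but it lowers the language model cost from 5.0 to 3.0. Brute-force enumeration grows exponentially with sentence length, which is one reason to represent the candidate space compactly as a composed FST instead.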

  16. FST-based neural unsupervised GEC
      • I: Input
      • E: Edit
      • P: Penalization
      • L: 5-gram LM
      • T: Tokenization (word → BPE)
      • Idea: Use the constructed FSTs to constrain the output of a neural LM.
      • Neural sequence models normally use subwords or characters rather than words.
      • Build a transducer T that maps full words to subwords (byte-pair encoding, BPE).
      • Constrain the neural LM with I ∘ E ∘ P ∘ L ∘ T.
      • For constrained neural decoding we use our SGNMT decoder: http://ucam-smt.github.io/sgnmt/html/
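The idea of constraining a neural model to the subword sequences licensed by the FSTs can be sketched as follows. This is not the SGNMT API: the word-to-subword table, the trie used in place of a determinized constraint FST, the greedy search, and the `score` callback standing in for a neural LM are all simplifying assumptions.

```python
# Hypothetical word-to-subword segmentation (the role of the tokenization transducer).
BPE = {
    "she": ["she</w>"],
    "sat": ["sat</w>"],
    "in": ["in</w>"],
    "on": ["on</w>"],
    "chairs": ["chair", "s</w>"],
}


def to_subwords(words):
    """Map a word sequence to its subword sequence."""
    return [token for word in words for token in BPE[word]]


def build_trie(hypotheses):
    """Build a prefix tree over the subword sequences of all hypotheses.
    It acts as a deterministic acceptor for the allowed outputs."""
    root = {}
    for words in hypotheses:
        node = root
        for token in to_subwords(words) + ["</s>"]:
            node = node.setdefault(token, {})
    return root


def constrained_greedy(trie, score):
    """Greedy decoding restricted to paths in the trie.
    `score(prefix, token)` returns a cost and stands in for a neural LM."""
    prefix, node = [], trie
    while node:  # an empty node means the end-of-sentence token was emitted
        token = min(node, key=lambda t: score(prefix, t))
        prefix.append(token)
        node = node[token]
    return prefix


hypotheses = [["she", "sat", "in", "chairs"], ["she", "sat", "on", "chairs"]]
trie = build_trie(hypotheses)
# Toy scorer that prefers the subword "on</w>" over every other token.
decoded = constrained_greedy(trie, lambda prefix, token: 0.0 if token == "on</w>" else 1.0)
```

At every step the model may only choose among tokens that continue some allowed hypothesis, so the decoder cannot produce output outside the constraint. A practical decoder would typically use beam search and combine the model score with the FST weights rather than decode greedily.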

  17. Results (unsupervised)
      Systems are tuned with respect to the metric highlighted in grey.

  18. FST-based neural supervised GEC
      • If annotated training data is available:
        • Input I is a (Moses) SMT lattice rather than a single sentence.
        • In addition to the <corr> token, we use an <mcorr> token to count the edits by the SMT system.
        • We use an ensemble of a neural language model and a neural machine translation model.
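One way to picture the two counting tokens is as markers in a hypothesis that are counted, penalized with separate weights, and then removed. The sketch below assumes exactly that; the function name, the weights, and the way the markers are interleaved with the words are hypothetical and are not specified on the slide.

```python
def penalized_cost(tokens, base_cost, corr_weight, mcorr_weight):
    """Strip the marker tokens from a hypothesis and add one penalty
    per <corr> marker and one (separately weighted) per <mcorr> marker."""
    n_corr = tokens.count("<corr>")
    n_mcorr = tokens.count("<mcorr>")
    words = [t for t in tokens if t not in ("<corr>", "<mcorr>")]
    cost = base_cost + corr_weight * n_corr + mcorr_weight * n_mcorr
    return words, cost


# Hypothetical hypothesis with one edit of each kind.
hypothesis = ["she", "sat", "<mcorr>", "on", "<corr>", "the", "chair"]
words, cost = penalized_cost(hypothesis, base_cost=3.0,
                             corr_weight=1.0, mcorr_weight=0.5)
```

Keeping two separate weights lets the two kinds of edits be tuned independently.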

  19. FST-based supervised grammatical error correction
      • I: Input
      • E: Edit
      • P: Penalization
      • L: 5-gram LM
      • T: Tokenization (word → BPE)
      [Figure: I (Input SMT lattice) and the intermediate composition I ∘ E]
      I ∘ E ∘ P ∘ L ∘ T: Constraint for neural ensembles

  20. Results (supervised)
      Systems are tuned with respect to the metric highlighted in grey.

  21. Results (supervised)

  22. Thanks

  23. BACKUP