Learning Fine-Grained Image Representations for Mathematical Expression Recognition
Sidney Bender*, Monica Haurilet*, Alina Roitberg and Rainer Stiefelhagen
Computer Vision for Human Computer Interaction Lab, Institute for Anthropomatics and Robotics
KIT – The Research University in the Helmholtz Association, www.kit.edu

[Figure: Input Image -> FGFE, Encoder, Decoder -> LaTeX Markup]
LaTeX Markup: \left[\;\Lambda\;\right]_{R}\^{S}= \left(\begin{array}{ll}{\operatorname{cos}\Psi}& {\operatorname{sin}\Psi}\\ {\operatorname{sin}\Psi}& {\operatorname{cos}\Psi}\\ \end{array}\right),

Mathematical Expression Recognition (MER)

Problem Definition
Input Image -> Model -> Markup (e.g., LaTeX)
Example markup: \left[\;\Lambda\;\right]_{R}\^{S}= \left(\begin{array}{ll}{\operatorname{cos}\Psi}& {\operatorname{sin}\Psi}\\ {\operatorname{sin}\Psi}& {\operatorname{cos}\Psi}\\ \end{array}\right),

Different types of MER Tasks
- Online MER [CROHME]: the model receives the input over time steps (T=1, T=2, T=3, T=4 in the figure)
- Offline MER [IM2LATEX]: the model receives a single image

Related Work
- Infty [Suzuki et al.]
- Caption [Xu et al.]
- CTC [Graves et al.]
- Im2Tex [Deng et al.]

Overview of the Model
[Figure: Input Image (H x W x 1) -> FGFE, Encoder, Decoder -> LaTeX Markup]
LaTeX Markup (tokenized): \left[ \; \Lambda \; \right] _ { R } \ ^ { S } = \left( \begin{array}{ l l } { \operatorname{cos} \Psi } & { - \operatorname{sin} \Psi } \\ { \operatorname{sin} \Psi } & { \operatorname{cos} \Psi } \\ \end{array} \right)

Visual Encoder and LaTeX Decoder
[Figure: FGFE, Encoder, Decoder]
- LSTM Encoder: Encoder Input (H' x W' x C) -> Encoder Output (H' x W' x D)
- Decoder: for each t, the Decoder Input is the encoder output (H' x W' x D); the step is repeated to emit one token at a time (\left[ \; \Lambda \; … \Psi)

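The per-step read of the encoder output described on this slide (the decoder consumes the H' x W' x D grid once for each t) can be illustrated with a small attention step. This is a minimal sketch, not the authors' implementation: the scaled dot-product scoring, the function name attention_step, the decoder query vector, and the random toy inputs are all assumptions introduced here for illustration.

```python
import math
import random


def attention_step(features, query):
    """One hypothetical attention step over an H' x W' grid of D-dim vectors.

    features: nested lists of shape H' x W' x D (stand-in for the encoder output)
    query:    list of length D (stand-in for the decoder state at step t)
    Returns the context vector (length D) and the attention map (H' x W').
    """
    d = len(query)
    # Flatten the grid into a list of H' * W' feature vectors.
    cells = [vec for row in features for vec in row]
    # Assumed scoring function: scaled dot product between cell and query.
    scores = [sum(f * q for f, q in zip(vec, query)) / math.sqrt(d) for vec in cells]
    # Numerically stable softmax over all grid cells.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the feature vectors.
    context = [sum(w * vec[k] for w, vec in zip(weights, cells)) for k in range(d)]
    # Reshape the weights back to the H' x W' layout.
    width = len(features[0])
    attn_map = [weights[i:i + width] for i in range(0, len(weights), width)]
    return context, attn_map


# Toy input: H' = 4, W' = 16, D = 8 (arbitrary sizes chosen for the example).
random.seed(0)
H, W, D = 4, 16, 8
features = [[[random.gauss(0, 1) for _ in range(D)] for _ in range(W)] for _ in range(H)]
query = [random.gauss(0, 1) for _ in range(D)]
context, attn_map = attention_step(features, query)
```

The returned attn_map keeps the H' x W' layout of the encoder output, which is the kind of grid an attention-map visualization would display for a single decoding step.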
Final Results
[Table: performance on the IM2LATEX-100K test set, reported with image-based and text-based metrics]

Impact of Formula Length on Performance
[Figure: performance by formula length]
- Performance drops for short formulas
- Long formulas are difficult to recognize

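A per-length breakdown of this kind could be computed by grouping test formulas into length buckets and scoring each bucket separately. The following is a minimal sketch under stated assumptions: exact token match as the score, a bucket size of 20 tokens, the function name accuracy_by_length, and the toy samples are all hypothetical and not taken from the slides.

```python
from collections import defaultdict


def accuracy_by_length(samples, bucket_size=20):
    """Exact-match accuracy per formula-length bucket.

    samples: iterable of (reference_tokens, predicted_tokens) pairs.
    A formula with n reference tokens falls into the bucket starting at
    (n // bucket_size) * bucket_size.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for reference, predicted in samples:
        bucket = (len(reference) // bucket_size) * bucket_size
        counts[bucket] += 1
        hits[bucket] += int(reference == predicted)
    return {bucket: hits[bucket] / counts[bucket] for bucket in sorted(counts)}


# Made-up toy data: two short formulas and two longer ones.
toy = [
    (["x"], ["x"]),                      # short, correct
    (["a", "+", "b"], ["a", "-", "b"]),  # short, one wrong token
    (["t"] * 25, ["t"] * 25),            # longer, correct
    (["t"] * 30, ["t"] * 29),            # longer, one token missing
]
result = accuracy_by_length(toy)
```

Exact match is a strict choice; a softer per-bucket score (for example a normalized edit distance) would slot into the same grouping logic.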
Impact of Rare Token Classes

Importance of a Fine-grained Visual Representation

FE Type    Attention Maps     Predictions
Im2Tex     [attention map]    \alpha
Ours       [attention map]    \Delta

Recursive Behavior in Long Formulas
[Figure: Input Image -> FGFE, Encoder, Decoder -> LaTeX Prediction -> Compiled Image]
LaTeX Prediction (excerpt): +\frac{1}{2}\ [ … D_1((x-x\prime)^2) …
Decoding steps shown: T = 108, T = 109, T = 132, T = 133, T = 134
Predicted tokens at these steps: … \lambda \delta ) _ \lambda

Conclusion

Approach
- We tackled the offline MER task
- Our model was evaluated on the IM2LATEX dataset
- We were able to improve results by over 4% in Img-Abs

Analysis
- Analysis of the performance by formula length
- Visualization of attention maps
- Impact of rare tokens on performance
- Typical errors our model produced

References
[Suzuki et al.] M. Suzuki, F. Tamari, R. Fukuda, S. Uchida, and T. Kanahori. INFTY: an integrated OCR system for mathematical documents. In DocEng, 2003.
[Graves et al.] A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. In ICML, 2006.
[Xu et al.] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In ICML, 2015.
[Deng et al.] Y. Deng, A. Kanervisto, J. Ling, and A. Rush. Image-to-markup generation with coarse-to-fine attention. In ICML, 2017.

Learning Fine-Grained Image Representations for Mathematical Expression Recognition
Sidney Bender*, Monica Haurilet*, Alina Roitberg and Rainer Stiefelhagen
Karlsruhe Institute of Technology, Germany
haurilet@kit.edu