
Learning Fine-Grained Image Representations for Mathematical Expression Recognition - PowerPoint PPT Presentation



  1. Learning Fine-Grained Image Representations for Mathematical Expression Recognition. Sidney Bender*, Monica Haurilet*, Alina Roitberg and Rainer Stiefelhagen. Computer Vision for Human Computer Interaction Lab, Institute for Anthropomatics and Robotics. www.kit.edu. KIT – The Research University in the Helmholtz Association.
     [Figure: an Input Image passes through blocks labeled Encoder, Decoder and FGFE and yields the LaTeX Markup \left[\;\Lambda\;\right]_{R}\^{S}= \left(\begin{array}{ll}{\operatorname{cos}\Psi}& {\operatorname{sin}\Psi}\\ {\operatorname{sin}\Psi}& {\operatorname{cos}\Psi}\\ \end{array}\right),]

  2. Mathematical Expression Recognition (MER). Problem Definition: a Model maps an Input Image to Markup (e.g., LaTeX), here \left[\;\Lambda\;\right]_{R}\^{S}= \left(\begin{array}{ll}{\operatorname{cos}\Psi}& {\operatorname{sin}\Psi}\\ {\operatorname{sin}\Psi}& {\operatorname{cos}\Psi}\\ \end{array}\right),
     Different types of MER tasks: Online MER [CROHME] and Offline MER [IM2LATEX].
     [Figure: each task feeds a Model; panel labels T=1, T=2, T=3, T=4.]

  3. Related Work: Infty [Suzuki et al.], Caption [Xu et al.], CTC [Graves et al.], Im2Tex [Deng et al.].

  4. Overview of the Model.
     [Figure: Input Image (H x W x 1); blocks labeled Encoder, Decoder and FGFE; LaTeX Markup output, shown as tokens: \left[ \; \Lambda \; \right] _ { R } \ ^ { S } = \left( \begin{array}{ l l } { \operatorname{cos} \Psi } & {- \operatorname{sin} \Psi} \\ {\operatorname{sin} \Psi} & {\operatorname{cos} \Psi}]

  5. Visual Encoder and LaTeX Decoder.
     [Figure: blocks labeled Encoder, Decoder and FGFE. Encoder Input: H' x W' x C. Encoder Output: H' x W' x D. For each t, the Decoder Input is repeated to H' x W' x D and combined (·) with the encoder output; an LSTM emits the markup tokens \left[ \; \Lambda \; … \Psi]
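
The slide gives only tensor shapes, so the following is a minimal sketch of one attention step over an H' x W' x D feature grid, not the authors' implementation. It assumes dot-product scoring between each grid cell and a D-dimensional decoder state; the function name `soft_attention` and the scoring rule are illustrative assumptions.

```python
import math


def soft_attention(features, query):
    """One soft-attention step (assumed dot-product scoring).

    features: H' x W' grid of D-dimensional vectors (nested lists).
    query:    D-dimensional decoder state for the current step t.
    Returns (weights as an H' x W' grid, context vector of length D).
    """
    # Flatten the grid to a list of H'*W' cells, each of length D.
    cells = [cell for row in features for cell in row]
    # Score every cell against the decoder state (hypothetical scoring rule).
    scores = [sum(f * q for f, q in zip(cell, query)) for cell in cells]
    # Numerically stable softmax over all grid positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the cells.
    context = [
        sum(w * cell[d] for w, cell in zip(weights, cells))
        for d in range(len(query))
    ]
    # Reshape the weights back to H' x W' so they can be shown as a map.
    width = len(features[0])
    grid = [weights[i:i + width] for i in range(0, len(weights), width)]
    return grid, context


# Toy 2 x 2 grid with D = 2.
features = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [2.0, 0.0]]]
attention_map, context = soft_attention(features, [1.0, 0.0])
```

The returned `attention_map` has the same H' x W' layout as the encoder output, which is the form in which attention maps are usually visualized.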

  6. Final Results: Performance on the IM2LATEX-100K Test Set.
     [Table: metric columns grouped as img.-based and text-based.]
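
The slide does not name the individual text-based metrics, so the sketch below is only an illustration of what such a score can look like: exact match and a normalized token-level edit distance. Both choices, and the names `edit_distance` and `text_scores`, are assumptions rather than the metrics reported in the table.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def text_scores(references, hypotheses):
    """Exact-match rate and 1 - normalized edit distance over a test set."""
    pairs = list(zip(references, hypotheses))
    exact = sum(ref == hyp for ref, hyp in pairs)
    dist = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    longest = sum(max(len(ref), len(hyp)) for ref, hyp in pairs)
    return {
        "exact_match": exact / len(pairs),
        "edit_score": 1.0 - dist / max(longest, 1),
    }


# Toy example: LaTeX markup split on whitespace into tokens.
refs = [r"\frac { 1 } { 2 }".split(), "x ^ 2".split()]
hyps = [r"\frac { 1 } { 3 }".split(), "x ^ 2".split()]
scores = text_scores(refs, hyps)
```

Image-based metrics would instead compare the rendered prediction with the rendered ground truth, which needs a LaTeX toolchain and is left out here.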

  7. Impact of Formula Length on Performance. Performance drops at short formulas, and long formulas are difficult to recognize.
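
An analysis by formula length can be sketched as follows: group test formulas into length buckets and score each bucket separately. The bucket size, the use of exact match as the score, and the name `accuracy_by_length` are assumptions for illustration; the slide does not state how the plot was binned.

```python
from collections import defaultdict


def accuracy_by_length(references, hypotheses, bucket_size=20):
    """Exact-match accuracy per reference-length bucket.

    Returns {(first_length, last_length): accuracy}, ordered by length.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for ref, hyp in zip(references, hypotheses):
        # Bucket by the number of tokens in the ground-truth markup.
        start = (len(ref) // bucket_size) * bucket_size
        totals[start] += 1
        hits[start] += int(ref == hyp)
    return {
        (start, start + bucket_size - 1): hits[start] / totals[start]
        for start in sorted(totals)
    }


# Toy example: two short formulas (one wrong) and one longer formula.
refs = [["x", "^", "2"], ["a", "+", "b", "=", "c"], ["y"] * 25]
hyps = [["x", "^", "2"], ["a", "-", "b", "=", "c"], ["y"] * 25]
per_bucket = accuracy_by_length(refs, hyps, bucket_size=20)
```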

  8. Impact of Rare Token Classes.
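
One way to study rare token classes, sketched under assumptions: call a token rare when it occurs at most `max_count` times in the training markup, then measure how often those tokens are recovered in the predictions. The threshold, the bag-of-tokens recall measure, and the function names are hypothetical; the slide does not describe the procedure that was used.

```python
from collections import Counter


def rare_tokens(train_sequences, max_count=5):
    """Tokens that occur at most max_count times in the training markup."""
    freq = Counter(tok for seq in train_sequences for tok in seq)
    return {tok for tok, count in freq.items() if count <= max_count}


def token_recall(references, hypotheses, tokens):
    """Per-token recall, ignoring position (bag-of-tokens comparison)."""
    expected = Counter()
    found = Counter()
    for ref, hyp in zip(references, hypotheses):
        ref_counts, hyp_counts = Counter(ref), Counter(hyp)
        for tok, count in ref_counts.items():
            if tok in tokens:
                expected[tok] += count
                # Credit at most as many occurrences as the reference has.
                found[tok] += min(count, hyp_counts[tok])
    return {tok: found[tok] / expected[tok] for tok in expected}


# Toy example: "\Psi" and "=" each occur once in training.
train = [["x", "+", "y"], ["x", "+", "\\Psi"], ["x", "=", "y"]]
rare = rare_tokens(train, max_count=1)
recall = token_recall(
    [["\\Psi", "=", "x"], ["\\Psi"]],
    [["\\Phi", "=", "x"], ["\\Psi"]],
    rare,
)
```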

  9. Importance of a Fine-grained Visual Representation.
     [Figure: columns FE Type, Attention Maps, Predictions. Row Im2Tex: prediction \alpha. Row Ours: prediction \Delta.]

  10. Recursive Behavior in Long Formulas.
     [Figure: panels Input Image, LaTeX Prediction, Compiled Image; blocks labeled Encoder, Decoder and FGFE. Prediction excerpt: +\frac{1}{2}\ [ … D_1((x-x\prime)^2) … Time steps shown: T = 108, T = 109, …, T = 132, T = 133, T = 134, with tokens … \lambda \delta ) _ \lambda]

  11. Conclusion
     Approach
     - We tackled the offline MER task
     - Our model was evaluated on the IM2LATEX dataset
     - We were able to improve results by over 4% in Img-Abs
     Analysis
     - Analysis of the performance by formula length
     - Visualization of attention maps
     - Impact of rare tokens on performance
     - Typical errors our model produced

  12. References
     - [Suzuki et al.] INFTY: an integrated OCR system for mathematical documents. M. Suzuki, F. Tamari, R. Fukuda, S. Uchida and T. Kanahori. In DocEng, 2003.
     - [Graves et al.] Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. A. Graves, S. Fernández, F. Gomez and J. Schmidhuber. In ICML, 2006.
     - [Xu et al.] Show, attend and tell: Neural image caption generation with visual attention. K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel and Y. Bengio. In ICML, 2015.
     - [Deng et al.] Image-to-markup generation with coarse-to-fine attention. Y. Deng, A. Kanervisto, J. Ling and A. Rush. In ICML, 2017.

  13. Learning Fine-Grained Image Representations for Mathematical Expression Recognition. Sidney Bender*, Monica Haurilet*, Alina Roitberg and Rainer Stiefelhagen. Karlsruhe Institute of Technology, Germany. haurilet@kit.edu
