Accelerating and Compressing LSTM-based Model for Online Handwritten Chinese Character Recognition
Reporter: Zecheng Xie, South China University of Technology
August 5th, 2018
Outline
- Motivation
- Difficulties
- Our approach
- Experiments
- Conclusion
Motivation
Online handwritten Chinese character recognition (HCCR) is widely used in pen-input and touch-screen devices.
Motivation
The difficulties of online HCCR:
- Large number of character classes
- Similarity between characters
- Diversity of writing styles
Deep learning models are powerful but raise other problems:
- Models are too large, requiring a large footprint and much memory
- Computationally expensive, consuming much energy
The advantages of deploying models on mobile devices:
- Ease server pressure
- Better service latency
- Can work offline
- Privacy protection
Our goal: build fast and compact models for on-device inference.
Difficulties of deploying LSTM-based online HCCR models on mobile devices
- 3755 classes: the model tends to be large
- Dependencies between time steps make the inference slow; this is the nature of RNNs and unlikely to be changed
(Figure: unrolled RNN [1])
[1] http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Our approach
The proposed framework: the baseline model → reconstruct the baseline with SVD → prune redundant connections → cluster the remaining connections.
Our approach
Data preprocessing and augmentation:
- Randomly remove 30% of the points in each character
- Perform coordinate normalization
- Remove redundant points using the method proposed in [1]:
  - a point that is too close to the point before it
  - a middle point that nearly lies on the line through the points before and after it
Data transform and feature extraction [1]: a raw point sequence (x_j, y_j, s_j), j = 1, 2, 3, ..., where s_j is the stroke index, is transformed into the feature sequence (x_j, y_j, Δx_j, Δy_j, [s_j = s_{j+1}], [s_j ≠ s_{j+1}]), j = 1, 2, 3, ... (a small sketch follows).
[1] X.-Y. Zhang et al., "Drawing and recognizing Chinese characters with recurrent neural network", TPAMI, 2017
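A minimal NumPy sketch of the point-dropping augmentation and the 6-D feature transform described above. The normalization scheme, function names, and the omission of the redundant-point removal step are simplifying assumptions, not the authors' exact implementation.

```python
import numpy as np

def extract_features(points, strokes, drop_ratio=0.3, rng=None):
    # points: (N, 2) float array of (x, y); strokes: (N,) int array of stroke ids s_j
    rng = rng or np.random.default_rng()

    # Augmentation: randomly remove ~30% of the points in the character.
    keep = rng.random(len(points)) >= drop_ratio
    pts, stk = points[keep], strokes[keep]

    # Coordinate normalization (illustrative: zero mean, unit scale).
    pts = (pts - pts.mean(axis=0)) / (pts.std() + 1e-8)

    # 6-D feature per step j: (x_j, y_j, dx_j, dy_j, [s_j = s_{j+1}], [s_j != s_{j+1}])
    delta = pts[1:] - pts[:-1]
    same = (stk[:-1] == stk[1:]).astype(np.float32)[:, None]
    return np.concatenate([pts[:-1], delta, same, 1.0 - same], axis=1)  # (N'-1, 6)
```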
Our approach
Data preprocessing and augmentation (figure illustrating the preprocessing and augmentation steps)
[1] X.-Y. Zhang et al., "Drawing and recognizing Chinese characters with recurrent neural network", TPAMI, 2017
Our approach
Baseline model architecture: Input-100LSTM-512LSTM-512FC-3755FC-Output (sketched below)
(Figure: the network unrolled from t = 1 to t = T, with a 100-unit LSTM, a 512-unit LSTM, a 512-unit FC layer, and a 3755-way FC output layer.)
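A minimal PyTorch sketch of a model with this layer layout. The layer sizes follow the slide; the input feature dimension, temporal pooling, and activation choices are assumptions, not the authors' exact setup.

```python
import torch
import torch.nn as nn

class BaselineHCCR(nn.Module):
    # Input-100LSTM-512LSTM-512FC-3755FC, per the slide.
    def __init__(self, in_dim=6, num_classes=3755):
        super().__init__()
        self.lstm1 = nn.LSTM(in_dim, 100, batch_first=True)
        self.lstm2 = nn.LSTM(100, 512, batch_first=True)
        self.fc1 = nn.Linear(512, 512)
        self.fc2 = nn.Linear(512, num_classes)

    def forward(self, x):          # x: (batch, T, in_dim) point features
        h, _ = self.lstm1(x)
        h, _ = self.lstm2(h)
        h = h.mean(dim=1)          # assumed pooling over time steps
        return self.fc2(torch.relu(self.fc1(h)))
```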
Our approach
Reconstruct the network with singular value decomposition (SVD). The LSTM cell computes:
i_t = σ(W_xi x_t + W_hi h_{t-1} + b_i)
f_t = σ(W_xf x_t + W_hf h_{t-1} + b_f)
g_t = tanh(W_xg x_t + W_hg h_{t-1} + b_g)
o_t = σ(W_xo x_t + W_ho h_{t-1} + b_o)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t
h_t = o_t ⊙ tanh(c_t)
The main computation is the gate pre-activations, which can be stacked into a single matrix form:
[i_t; f_t; g_t; o_t] = [σ; σ; tanh; σ](W_x x_t + W_h h_{t-1} + b),
where W_x stacks the input weights W_xi, W_xf, W_xg, W_xo and W_h stacks the recurrent weights W_hi, W_hf, W_hg, W_ho.
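To make the cost concrete, here is a minimal NumPy sketch of one LSTM step with the four gates stacked; shapes and variable names are illustrative. The two matrix-vector products are the dominant cost that the SVD reconstruction targets.

```python
import numpy as np

def lstm_step(x_t, h_prev, c_prev, Wx, Wh, b):
    # Wx: (4H, D), Wh: (4H, H), b: (4H,) -- gates stacked as [i, f, g, o].
    z = Wx @ x_t + Wh @ h_prev + b        # main computation of each time step
    i, f, g, o = np.split(z, 4)
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c_t = f * c_prev + i * g
    h_t = o * np.tanh(c_t)
    return h_t, c_t
```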
Our approach
Reconstruct the network with singular value decomposition (SVD). In the stacked form, each step computes the gate pre-activations W_x x_t + W_h h_{t-1} + b. Apply SVD to W_x and W_h:
- W_x: input connections
- W_h: hidden-hidden connections
Our approach
Efficiency analysis of the SVD method. Suppose W ∈ ℝ^{m×n}; by SVD we have
W_{m×n} = U_{m×n} Σ_{n×n} V^T_{n×n}.
By retaining a proper number r of singular values,
W_{m×n} ≈ U_{m×r} Σ_{r×r} V^T_{r×n} = U_{m×r} N_{r×n}.
Replace W_{m×n} with U_{m×r} N_{r×n}, so that Wx → U(Nx) (see the sketch below).
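A sketch of this truncated-SVD factorization with NumPy, applied here to an LSTM's stacked recurrent weights. The rank r, matrix sizes, and variable names are placeholders for illustration.

```python
import numpy as np

def low_rank_factor(W, r):
    # W (m x n) ~= U_r (m x r) @ N (r x n), with N = diag(S_r) @ Vt_r.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r], S[:r, None] * Vt[:r]

# Example: factor the hidden-hidden weights of a 512-unit LSTM (4*512 x 512),
# then replace Wh @ h with Uh @ (Nh @ h) -- two thin products instead of one big one.
Wh = np.random.randn(4 * 512, 512).astype(np.float32)
Uh, Nh = low_rank_factor(Wh, r=64)            # the rank r is a tuning choice
h = np.random.randn(512).astype(np.float32)
approx = Uh @ (Nh @ h)                        # fast path used at inference
```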
Our approach
Efficiency analysis of the SVD method. For a matrix-vector multiplication Wx, with W ∈ ℝ^{m×n} and x ∈ ℝ^{n×1}, the acceleration rate and compression rate with r singular values retained are
S_a = S_c = mn / (mr + rn).
If m = 512, n = 128, r = 32, then S_a = S_c = 3.2.
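A one-line check of the rate formula from the slide, with the same numbers plugged in:

```python
def svd_rates(m, n, r):
    # Acceleration/compression rate of replacing an m x n matrix-vector
    # product with two rank-r factors: mn / (mr + rn).
    return (m * n) / (m * r + r * n)

print(svd_rates(512, 128, 32))  # -> 3.2
```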
Our approach
Adaptive drop weight (ADW) [1]:
- An improvement on "Deep Compression" [2], in which a hard pruning threshold is set
- ADW gradually prunes away the redundant connections with small absolute values in each layer (by sorting them during retraining)
- After ADW the network becomes sparse, and K-means based quantization is then applied to each layer for further compression (see the sketch below)
[1] X. Xiao, L. Jin, et al., "Building fast and compact convolutional neural networks for offline handwritten Chinese character recognition", Pattern Recognition, 2017
[2] S. Han, et al., "Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding", ICLR, 2016
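A simplified NumPy sketch of magnitude pruning followed by K-means quantization of the surviving weights. This is only in the spirit of ADW: the gradual pruning schedule during retraining, the per-layer settings, and the codebook size are assumptions, not the authors' algorithm.

```python
import numpy as np

def prune_by_magnitude(W, prune_ratio):
    # Zero out the prune_ratio fraction of weights with smallest |value|.
    # Real ADW increases this ratio gradually across retraining steps.
    thresh = np.quantile(np.abs(W), prune_ratio)
    mask = np.abs(W) >= thresh
    return W * mask, mask

def kmeans_quantize(W, mask, n_clusters=16, n_iter=20):
    # Cluster the surviving weights; store one small codebook per layer
    # plus a short index per nonzero weight.
    vals = W[mask]
    centroids = np.linspace(vals.min(), vals.max(), n_clusters)  # simple init
    for _ in range(n_iter):
        idx = np.abs(vals[:, None] - centroids[None, :]).argmin(axis=1)
        for k in range(n_clusters):
            if np.any(idx == k):
                centroids[k] = vals[idx == k].mean()
    Wq = W.copy()
    Wq[mask] = centroids[idx]
    return Wq, centroids

# Illustrative usage on one weight matrix (not the authors' exact settings).
W = np.random.randn(2048, 512).astype(np.float32)
Wp, mask = prune_by_magnitude(W, prune_ratio=0.9)
Wq, codebook = kmeans_quantize(Wp, mask)
```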
Our approach
The proposed framework (review): the baseline model → reconstruct the baseline with SVD → prune redundant connections → cluster the remaining connections.
Experiments
Training set: CASIA OLHWDB1.0 & OLHWDB1.1 (720 writers, 2,693,183 samples, 3755 classes)
Test set: ICDAR2013 online competition dataset (60 writers, 224,590 samples, 3755 classes)
Data preprocessing and augmentation as described before.
Experiments
Details of the baseline model:
- Main storage cost: LSTM2, FC1, FC2
- Main computation cost: LSTM2
Experiments
Considerations behind the experimental settings:
- In our experiments, we found that the LSTM is more sensitive to the input connections than to the hidden-hidden connections
- Most of the computation latency is introduced by the hidden-hidden connections
Experiments
Experimental results (Intel Core i7-4790, single thread):
- After SVD, the model is 10× smaller, and FLOPs are also reduced by 10×
- After ADW & quantization, the model is 31× smaller, and FLOPs are further reduced
- Only a minor 0.5% drop in accuracy
Experiments
Experimental results:
- Compared with [1], our model is 300× smaller and 4× faster on CPU
- Compared with [2], our model is 52× smaller and 109× faster on CPU
[1] W. Yang, L. Jin, et al., "DropSample: A new training method to enhance deep convolutional neural networks for large-scale unconstrained handwritten Chinese character recognition", Pattern Recognition, 2016
[2] X.-Y. Zhang, et al., "Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark", Pattern Recognition, 2017
Conclusion
- SVD is efficient for accelerating computation
- ADW also works well for LSTMs
- By combining SVD and ADW, we can build fast and compact LSTM-based models for online HCCR
Thank you!
Lianwen Jin (金连文), Ph.D., Professor — eelwjin@scut.edu.cn, lianwen.jin@gmail.com
Zecheng Xie (谢泽澄), Ph.D. student
Yafeng Yang (杨亚锋), Master's student
http://www.hcii-lab.net/