
Connectionist Temporal Classification with Maximum Entropy Regularization - PowerPoint PPT Presentation

Connectionist Temporal Classification with Maximum Entropy Regularization. Hu Liu, Sheng Jin, Changshui Zhang. Department of Automation, Tsinghua University. NeurIPS, 2018.


  1. Connectionist Temporal Classification with Maximum Entropy Regularization. Hu Liu, Sheng Jin, Changshui Zhang. Department of Automation, Tsinghua University. NeurIPS, 2018.

  2. Introduction to Connectionist Temporal Classification (CTC)
  The CTC error signal gives positive feedback to every feasible path:
  \frac{\partial \mathcal{L}_{ctc}}{\partial y_t^k} = -\frac{1}{p(l|X)\, y_t^k} \sum_{\{\pi \,|\, \pi \in \mathcal{B}^{-1}(l),\, \pi_t = k\}} p(\pi|X)
  [Figure: feasible paths such as dd_oo___g, doo____gg, _____dogg and dddoooggg all collapse to 'dog'; their probabilities sum to p('dog'|X), yet only one of them is the most suitable path.]
  CTC drawbacks:
  - CTC lacks exploration and is prone to falling into worse local minima.
  - It outputs overconfident paths (overfitting).
  - It outputs paths with a peaky distribution.
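  The sum over feasible paths can be made concrete with a tiny brute-force example. The sketch below is only an illustration, not the paper's implementation: the vocabulary {_, d, o, g}, the label 'dog' and the per-frame probabilities are made-up assumptions.

```python
# Brute-force sketch of the CTC objective p(l|X) = sum_{pi in B^{-1}(l)} p(pi|X)
# on a toy vocabulary; exponential in T, for illustration only.
import itertools
import numpy as np

BLANK = '_'
VOCAB = [BLANK, 'd', 'o', 'g']

def collapse(path):
    """The CTC mapping B: merge repeated symbols, then drop blanks."""
    merged = [k for k, _ in itertools.groupby(path)]
    return ''.join(k for k in merged if k != BLANK)

def ctc_prob(y, label):
    """p(label|X) by enumerating every length-T path over the vocabulary."""
    T = y.shape[0]
    total = 0.0
    for path in itertools.product(range(len(VOCAB)), repeat=T):
        if collapse([VOCAB[k] for k in path]) == label:
            total += np.prod([y[t, k] for t, k in enumerate(path)])
    return total

rng = np.random.default_rng(0)
y = rng.random((6, len(VOCAB)))          # 6 frames, 4 symbols (illustrative values)
y /= y.sum(axis=1, keepdims=True)        # per-frame softmax-like normalization
print('p("dog"|X) =', ctc_prob(y, 'dog'))
```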

  3. Maximum Conditional Entropy Regularization for CTC (EnCTC)
  Entropy-based regularization:
  \mathcal{L}_{enctc} = \mathcal{L}_{ctc} - \beta H(p(\pi|l,X))
  H(p(\pi|l,X)) = -\sum_{\pi \in \mathcal{B}^{-1}(l)} p(\pi|X,l) \log p(\pi|X,l)
  [Figure: path posteriors for 'd o g' under CTC vs. EnCTC; EnCTC is expected to keep a high-entropy, spread-out posterior.]
  Benefits:
  - Better generalization and exploration.
  - Solves the peaky distribution problem.
  - Depicts ambiguous segmentation boundaries.
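  A minimal brute-force sketch of the EnCTC objective on the same kind of toy input; the per-frame probabilities, the label and the value of beta are illustrative assumptions, not taken from the paper.

```python
# Brute-force sketch of L_enctc = L_ctc - beta * H(p(pi|l,X)) on a toy example.
import itertools
import numpy as np

BLANK, VOCAB = '_', ['_', 'd', 'o', 'g']

def collapse(path):
    merged = [k for k, _ in itertools.groupby(path)]
    return ''.join(k for k in merged if k != BLANK)

def enctc_loss(y, label, beta):
    T = y.shape[0]
    # p(pi|X) for every feasible path pi in B^{-1}(label)
    path_probs = []
    for path in itertools.product(range(len(VOCAB)), repeat=T):
        if collapse([VOCAB[k] for k in path]) == label:
            path_probs.append(np.prod([y[t, k] for t, k in enumerate(path)]))
    path_probs = np.array(path_probs)
    p_label = path_probs.sum()                        # p(l|X)
    posterior = path_probs / p_label                  # p(pi|l,X)
    entropy = -(posterior * np.log(posterior)).sum()  # H(p(pi|l,X))
    return -np.log(p_label) - beta * entropy

rng = np.random.default_rng(0)
y = rng.random((6, len(VOCAB)))
y /= y.sum(axis=1, keepdims=True)
print('L_enctc =', enctc_loss(y, 'dog', beta=0.2))
```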

  4. Equal Spacing CTC (EsCTC)
  In many sequential tasks, the spacing between two consecutive elements is nearly the same.
  [Figure: the word 'd o g' rendered several times with roughly equal letter spacing.]
  We adopt equal spacing as a pruning method to rule out unreasonable CTC paths.

  5. Equal Spacing CTC (EsCTC)
  Theorem 3.1. Among all segmentation sequences, the equally spaced one has the maximum entropy:
  \arg\max_{z} \max H(p(\pi|z,l,X)) = z_{es}
  Equal spacing is therefore the best prior in the absence of any subjective assumptions.
  The EsCTC loss sums only over segmentations whose segment lengths respect the spacing bound:
  \mathcal{L}_{esctc} = -\log \sum_{z \in C_{\tau,T}} \sum_{\pi \in \mathcal{B}_z^{-1}(l)} p(\pi|X), \qquad z_s \le \frac{\tau T}{|l|}
  [Figure: example segmentations z_1, z_2, ..., z_{|l|} of paths such as '_ _ d _ _ o _ _ g' into per-label segments.]
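  To make the pruning concrete, the sketch below brute-forces a spacing-constrained loss on a toy input. The segment definition it uses (first emission of a label up to the first emission of the next label) is a simplification of the paper's formal segmentation set C_{tau,T}; y, the label and tau are illustrative assumptions.

```python
# Simplified brute-force sketch of the equal-spacing idea: keep only CTC paths
# whose (approximate) label segments are at most tau*T/|l| frames long.
import itertools
import numpy as np

BLANK, VOCAB = '_', ['_', 'd', 'o', 'g']

def collapse(path):
    merged = [k for k, _ in itertools.groupby(path)]
    return ''.join(k for k in merged if k != BLANK)

def segment_lengths(path_syms, label):
    """Spacing between the first emissions of consecutive labels (a proxy segment)."""
    starts, pos = [], 0
    for t, c in enumerate(path_syms):
        if pos < len(label) and c == label[pos]:
            starts.append(t)
            pos += 1
    starts.append(len(path_syms))            # close the last segment at T
    return np.diff(starts)

def esctc_loss(y, label, tau):
    T, max_len = y.shape[0], tau * y.shape[0] / len(label)
    total = 0.0
    for path in itertools.product(range(len(VOCAB)), repeat=T):
        syms = [VOCAB[k] for k in path]
        if collapse(syms) != label:
            continue                          # not a feasible path for the label
        if np.all(segment_lengths(syms, label) <= max_len):
            total += np.prod([y[t, k] for t, k in enumerate(path)])
    return -np.log(total)

rng = np.random.default_rng(0)
y = rng.random((6, len(VOCAB)))
y /= y.sum(axis=1, keepdims=True)
print('L_esctc =', esctc_loss(y, 'dog', tau=1.5))
```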

  6. Algorithm and Complexity Analysis
  We propose dynamic programming algorithms for EnCTC, EsCTC and EnEsCTC.
  EnCTC: a forward variable \gamma(t,s) over the blank-extended label l' follows the same two-case recursion as the CTC forward pass, combining \gamma(t-1,s) and \gamma(t-1,s-1) when l'_s = b or l'_{s-2} = l'_s, and \gamma(t-1,s-2) as well otherwise; the entropy term is read out as Q(l) = \gamma(T,|l'|) + \gamma(T,|l'|-1).
  EsCTC: the forward variable \alpha_\tau(t,s) sums over admissible last-segment lengths t' \in \{1,\dots,\tau T/|l|\}, combining \alpha_\tau(t-t',s-1) with a segment-probability factor \sigma(\cdot) (plus an extra blank factor y_b when l_{s-1} = l_s); p_\tau(l|X_{1:T}) is read out from \alpha_\tau at t = T.
  EnEsCTC: combines both, propagating an entropy-related variable \gamma_\tau(t,s) with the corresponding log-probability terms alongside \alpha_\tau.
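  For reference, here is a sketch of the standard CTC forward recursion whose two-case structure the recursions above extend; it computes p(l|X) in O(T * |l|) time instead of enumerating paths. It is not the paper's EnCTC/EsCTC code, and the inputs are illustrative.

```python
# Standard CTC forward (alpha) recursion over the blank-extended label.
import numpy as np

def ctc_forward(y, label, blank=0):
    """p(label|X) by dynamic programming; y is (T, K) per-frame probabilities."""
    ext = [blank]                              # blank-extended label l'
    for c in label:
        ext += [c, blank]
    T, S = y.shape[0], len(ext)
    alpha = np.zeros((T, S))
    alpha[0, 0] = y[0, ext[0]]
    alpha[0, 1] = y[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1, s]
            if s >= 1:
                a += alpha[t - 1, s - 1]
            # skipping a blank is allowed only for non-blank, non-repeated labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[t - 1, s - 2]
            alpha[t, s] = a * y[t, ext[s]]
    return alpha[T - 1, S - 1] + alpha[T - 1, S - 2]

rng = np.random.default_rng(0)
y = rng.random((6, 4))
y /= y.sum(axis=1, keepdims=True)
print('p("dog"|X) =', ctc_forward(y, label=[1, 2, 3]))  # 1, 2, 3 = d, o, g
```

  With the same seed and vocabulary ordering as the brute-force sketch on slide 2, this value should match the enumerated one.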

  7. Qualitative Analysis
  [Figure: panels (a)-(d) compare CTC, EnCTC, EsCTC and EnEsCTC on the word 'parity': the error signal during training and an alignment evaluation.]

  8. Results on Scene Text Recognition Benchmarks
  Comparisons with the state-of-the-art methods (accuracy, %):

    Method     IC03   IC13   IIIT5K   SVT
    CRNN       89.4   86.7   78.2     80.8
    STAR-Net   89.9   89.1   83.3     83.6
    R2AM       88.7   90.0   78.4     80.7
    RARE       90.1   88.6   81.9     81.9
    EnCTC      90.8   90.0   82.6     81.5
    EsCTC      92.6   87.4   81.7     81.5
    EnEsCTC    92.0   90.6   82.0     80.6

  Evaluation of model generalization (accuracy, %):

    Method     Synth5K
    CTC        38.1
    CTC + LS   42.9
    CTC + CP   44.4
    EnCTC      45.5
    EsCTC      46.3
    EnEsCTC    47.2

  9. For more results and analyses, please come to our poster: Room 210 & 230 AB #106.
  Code: https://github.com/liuhu-bigeye/enctc.crnn
