
Weighted Automata Extraction from Recurrent Neural Networks via Regression on State Spaces

Takamasa Okudono, Masaki Waga, Taro Sekiyama, Ichiro Hasuo (SOKENDAI, the Graduate University for Advanced Studies, Japan / National Institute of Informatics, Japan)


  1. Weighted Automata Extraction from Recurrent Neural Networks via Regression on State Spaces
     Takamasa Okudono, Masaki Waga, Taro Sekiyama, Ichiro Hasuo
     SOKENDAI, the Graduate University for Advanced Studies, Japan / National Institute of Informatics, Japan
     LearnAut19, Vancouver, Canada, 23 June 2019

  2. RNN
     An RNN is a neural network equipped with an internal state.
     [Figure: drawing by François Deloche (CC BY-SA 4.0)]

  3. Goal
     • Input: an RNN $R$ whose output is in $\mathbb{R}$ (defines $f_R : \Sigma^* \to \mathbb{R}$)
     • Output: a WFA $A(R)$ (defines $f_{A(R)} : \Sigma^* \to \mathbb{R}$) s.t. $f_{A(R)} \simeq f_R$
     [Diagram: RNN components set against WFA components: initial state / initial vector, transition func. / transition matrix, final func. / final vector]

  4. Motivation
     • Obtain a lighter (faster to infer) model of an RNN
        • Because RNN inference is sometimes heavy
     • Investigate the behavior of an RNN $R$ via the extracted WFA $A_R$
        • WFAs come equipped with many operations, which might lead to model checking
     • In the research line of RNN ⇔ DFA conversion as an acceptor
        • Ours is a quantitative extension

  5. Contribution
     • Proposed a method that applies Balle and Mohri's algorithm to the extraction
        • The key is checking whether $R \simeq A$ by using regression
     • Our method extracts models that are 7% more accurate than the baseline
     • The extracted WFAs are about 1,000 times faster at inference than the target RNNs

  6. Def. of RNN (Mathematically, in this work)
     An RNN $R$ (of alphabet $\Sigma$ and dimension $d$) consists of
     • $\alpha \in \mathbb{R}^d$: initial state
     • $\beta : \mathbb{R}^d \to \mathbb{R}$: final function
     • $g_R : \mathbb{R}^d \times \Sigma \to \mathbb{R}^d$: transition function
        • $g_R : \mathbb{R}^d \times \Sigma^* \to \mathbb{R}^d$ is induced recursively
     (Slide annotation: need not be linear.)
     $f_R : \Sigma^* \to \mathbb{R}$ is induced by
          $f_R(w_1 \dots w_N) = \beta \circ g_R(\alpha, w_1 \dots w_N)$
     The configuration ("internal state") for $w_1 \dots w_N$ is defined by
          $\delta_R(w_1 \dots w_N) = g_R(\alpha, w_1 \dots w_N)$
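
The definition on slide 6 can be read as a left fold of the one-step transition function over the input word. The following is a minimal sketch of that reading, not the authors' code; the toy network (`ALPHA`, `toy_g`, `toy_beta`) and the function names are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Tuple

State = Tuple[float, ...]

def configuration(alpha: State, g: Callable[[State, str], State], word: str) -> State:
    """Configuration of the RNN after reading `word`: fold g over the symbols, starting from alpha."""
    state = alpha
    for symbol in word:
        state = g(state, symbol)
    return state

def value(alpha: State, beta: Callable[[State], float],
          g: Callable[[State, str], State], word: str) -> float:
    """Induced function: the final function applied to the configuration."""
    return beta(configuration(alpha, g, word))

# Hypothetical toy RNN of dimension 2 over the alphabet {"0", "1"} (an assumption, not from the slides).
ALPHA: State = (0.0, 0.0)

def toy_g(state: State, symbol: str) -> State:
    x = 1.0 if symbol == "1" else -1.0
    h0, h1 = state
    # A non-linear transition, in line with the slide's "need not be linear" note.
    return (math.tanh(0.5 * h0 + x), math.tanh(0.3 * h0 - 0.5 * h1 + 0.1 * x))

def toy_beta(state: State) -> float:
    return state[0] + 2.0 * state[1]

print(configuration(ALPHA, toy_g, "10"))
print(value(ALPHA, toy_beta, toy_g, "10"))
```

The empty word leaves the initial state untouched, so the configuration of the empty word is the initial state itself.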

  7. Def. of Weighted Finite Automaton (WFA)
     A WFA $A$ (of size $n$ and alphabet $\Sigma$) consists of
     • $\alpha \in \mathbb{R}^n$: initial vector
     • $\beta \in \mathbb{R}^n$: final vector
     • $A_\sigma \in \mathbb{R}^{n \times n}$: transition matrix ($\sigma \in \Sigma$)
     A WFA $A$ is a formalism to define $f_A : \Sigma^* \to \mathbb{R}$.
     (cf.) A DFA is a formalism to define $f : \Sigma^* \to 2$.
     WFA is an extension of DFA via the matrix representation.

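  8. Def. of WFA
     • A WFA $A$ induces the function $f_A : \Sigma^* \to \mathbb{R}$ as
          $f_A(w_1 \dots w_N) = \alpha^{\top} A_{w_1} \cdots A_{w_N}\, \beta$
     • The configuration ("internal state") of WFA $A$ is
          $\delta_A(w_1 \dots w_N) = \alpha^{\top} A_{w_1} \cdots A_{w_N} \in \mathbb{R}^n$
     For example:
     • $\Sigma = \{0, 1\}$, $\alpha = \begin{pmatrix} 0.8 \\ 0.2 \end{pmatrix}$, $\beta = \begin{pmatrix} 0.9 \\ 0.7 \end{pmatrix}$, $A_0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $A_1 = \begin{pmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{pmatrix}$
     • $f_A(10) = \begin{pmatrix} 0.8 & 0.2 \end{pmatrix} \begin{pmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0.9 \\ 0.7 \end{pmatrix} = 0.736$
     • $\delta_A(10) = \begin{pmatrix} 0.8 & 0.2 \end{pmatrix} \begin{pmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0.18 & 0.82 \end{pmatrix}$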
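
The worked example on slide 8 can be checked mechanically. The sketch below recomputes the configuration and the value for the word 10 with plain Python lists; the helper names (`vec_mat`, `wfa_config`, `wfa_value`) are assumptions made for this illustration, and the numbers are the ones given on the slide.

```python
from typing import Dict, List

Matrix = List[List[float]]

def vec_mat(v: List[float], m: Matrix) -> List[float]:
    """Row vector times matrix."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def wfa_config(alpha: List[float], transitions: Dict[str, Matrix], word: str) -> List[float]:
    """Configuration: the initial vector pushed through one transition matrix per symbol."""
    config = list(alpha)
    for symbol in word:
        config = vec_mat(config, transitions[symbol])
    return config

def wfa_value(alpha: List[float], beta: List[float],
              transitions: Dict[str, Matrix], word: str) -> float:
    """Induced function: inner product of the configuration with the final vector."""
    return sum(c * b for c, b in zip(wfa_config(alpha, transitions, word), beta))

# The example WFA from slide 8.
alpha = [0.8, 0.2]
beta = [0.9, 0.7]
transitions = {
    "0": [[0.0, 1.0], [1.0, 0.0]],
    "1": [[0.9, 0.1], [0.5, 0.5]],
}

print(wfa_config(alpha, transitions, "10"))       # approximately [0.18, 0.82]
print(wfa_value(alpha, beta, transitions, "10"))  # approximately 0.736
```

Reading the word left to right means the matrix for 1 is applied before the matrix for 0, which matches the order of the factors in the slide's product.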

  9. RNN and WFA
     An RNN $R$ (of alphabet $\Sigma$ and dimension $d$) consists of
     • $\alpha \in \mathbb{R}^d$: initial state
     • $\beta : \mathbb{R}^d \to \mathbb{R}$: final function
     • $g_R : \mathbb{R}^d \times \Sigma \to \mathbb{R}^d$: transition function
     A WFA $A$ (of alphabet $\Sigma$ and size $n$) consists of
     • $\alpha \in \mathbb{R}^n$: initial vector
     • $\beta \in \mathbb{R}^n$: final vector
     • $A_\sigma \in \mathbb{R}^{n \times n}$: transition matrix ($\sigma \in \Sigma$)
     Similar formalism! Can we approximate an RNN by a WFA?

  10. Goal and Our Approach
     Goal
     • Input: an RNN $R$ whose output is in $\mathbb{R}$ (defines $f_R : \Sigma^* \to \mathbb{R}$)
     • Output: a WFA $A(R)$ (defines $f_{A(R)} : \Sigma^* \to \mathbb{R}$) s.t. $f_{A(R)} \simeq f_R$
     Approach: use Balle and Mohri's algorithm
     • The challenge is to give a procedure that checks whether $f_A \simeq f_R$ for a candidate WFA $A$

  11. Balle and Mohri's Algorithm
     An extension of Angluin's L* algorithm to WFAs.
     • Input:
        • Membership query procedure $m : \Sigma^* \to \mathbb{R}$
        • Equivalence query procedure $e : \mathrm{WFAs} \to \{\mathrm{Equivalent}\} \sqcup \Sigma^*$ (a returned word is called a "counterexample")
     • Output:
        • Minimal WFA $A'$
     • Property: given a WFA $A$, if $m = f_A$ and
          $e(\tilde{A}) = \begin{cases} \mathrm{Equivalent} & \text{if } f_A = f_{\tilde{A}} \\ w \text{ such that } f_A(w) \neq f_{\tilde{A}}(w) & \text{otherwise} \end{cases}$
       then the algorithm terminates after calling $m$ and $e$ polynomially many times, and $f_A = f_{A'}$.

  12. Idea of Overall Architecture (Detailed)
     Implement
     • the membership query $m$ to be the RNN's induced function $f_R$
     • the equivalence query $e$ to be
          $e(\tilde{A}) = \begin{cases} \mathrm{Equivalent} & \text{if } f_R \simeq f_{\tilde{A}} \\ w \text{ such that } f_R(w) \neq f_{\tilde{A}}(w) & \text{otherwise} \end{cases}$
       (generally the first case cannot be "=")
     Then we would be able to get a WFA $\tilde{A}$ s.t. $f_R \simeq f_{\tilde{A}}$!
     But how can we implement such an equivalence query $e$?
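
To make the two query interfaces on slide 12 concrete, here is a deliberately naive stand-in for the equivalence query: it compares the two functions on every word up to a bounded length and reports the first word on which they differ by more than a tolerance. This is not the authors' method (the talk replaces it with a regression-based check); `make_queries`, `max_len`, and `tol` are hypothetical names, and the candidate WFA is represented only by its induced function to keep the sketch short.

```python
from itertools import product
from typing import Callable, Optional, Tuple

WordFunction = Callable[[str], float]

def make_queries(f_R: WordFunction, alphabet: str, max_len: int, tol: float
                 ) -> Tuple[WordFunction, Callable[[WordFunction], Optional[str]]]:
    """Build a membership query m and a brute-force approximate equivalence query e."""

    def m(word: str) -> float:
        # Membership query: simply ask the RNN's induced function.
        return f_R(word)

    def e(f_candidate: WordFunction) -> Optional[str]:
        # Equivalence query: None plays the role of "Equivalent",
        # any returned word is a counterexample.
        for length in range(max_len + 1):
            for letters in product(alphabet, repeat=length):
                word = "".join(letters)
                if abs(f_R(word) - f_candidate(word)) > tol:
                    return word
        return None

    return m, e

# Hypothetical toy functions standing in for the RNN and a candidate WFA.
def target(word: str) -> float:
    return 0.5 * word.count("1")

def capped(word: str) -> float:
    return 0.5 * min(word.count("1"), 1)

m, e = make_queries(target, "01", max_len=3, tol=1e-6)
print(e(target))  # None, i.e. "Equivalent"
print(e(capped))  # a counterexample word
```

The cost of this check grows exponentially with `max_len`, which is one way to see why the slide asks for a better implementation of the equivalence query.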

  13. How do we know $f_R \simeq f_A$?
          $f_R(w) \simeq f_A(w) \iff \beta_R(\delta_R(w_1 \dots w_N)) \simeq \delta_A(w_1 \dots w_N)\, \beta_A$
     Both sides first compute a configuration ("internal state").
     If there is a "good" relation between $\delta_R$ and $\delta_A$, then $A$ and $R$ would behave similarly.

  14. "Good" relation between $\delta_R$ and $\delta_A$
     • This work views a function $p : \mathbb{R}^d \to \mathbb{R}^n$ satisfying the following property as a good relation:
          $\forall w \in \Sigma^*.\ p(\delta_R(w)) \simeq \delta_A(w)$

  15. Equivalence Query by approximating $p$
     Let's approximate a configuration translator $p : \mathbb{R}^d \to \mathbb{R}^n$ such that
          $\forall w \in \Sigma^*.\ p(\delta_R(w)) \simeq \delta_A(w)$
     by applying regression to sampled data.
     The data is sampled by traversing $\Sigma^*$ in breadth-first order.
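
Slide 15 leaves the regression method open. As a rough sketch under stated assumptions, the code below enumerates words in breadth-first order, pairs each RNN configuration with the corresponding WFA configuration, and fits a linear map by least squares (normal equations with a small ridge term for numerical stability). The choice of a linear model, the ridge term, and all function names are assumptions made for illustration; they are not taken from the slides.

```python
from collections import deque
from typing import Callable, Iterator, List

Vector = List[float]

def bfs_words(alphabet: str, max_len: int) -> Iterator[str]:
    """Enumerate words over the alphabet in breadth-first (shortest-first) order."""
    queue = deque([""])
    while queue:
        word = queue.popleft()
        yield word
        if len(word) < max_len:
            queue.extend(word + s for s in alphabet)

def solve(a: List[Vector], b: List[Vector]) -> List[Vector]:
    """Solve a @ x = b (b may have several columns) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_linear_map(xs: List[Vector], ys: List[Vector], ridge: float = 1e-9) -> List[Vector]:
    """Least-squares matrix P with x @ P close to y for every sampled pair (x, y)."""
    d, n = len(xs[0]), len(ys[0])
    gram = [[sum(x[i] * x[j] for x in xs) + (ridge if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    cross = [[sum(x[i] * y[k] for x, y in zip(xs, ys)) for k in range(n)]
             for i in range(d)]
    return solve(gram, cross)

def fit_translator(delta_R: Callable[[str], Vector], delta_A: Callable[[str], Vector],
                   alphabet: str, max_len: int) -> Callable[[Vector], Vector]:
    """Sample configuration pairs in BFS order and return the fitted translator p."""
    words = list(bfs_words(alphabet, max_len))
    P = fit_linear_map([delta_R(w) for w in words], [delta_A(w) for w in words])
    return lambda x: [sum(x[i] * P[i][k] for i in range(len(x))) for k in range(len(P[0]))]
```

Here `delta_R` and `delta_A` are callables that return the configuration of the RNN and of the candidate WFA for a given word. A more expressive regressor can be swapped in for `fit_linear_map` without changing the sampling loop.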

  16-25. Relation $p$ between $R$ and $A$
     [Figure, built up step by step over slides 16 to 25: the configuration space of $R$ ($\mathbb{R}^d$) shown next to the configuration space of $A$ ($\mathbb{R}^n$). Points are added pair by pair in breadth-first order: $\alpha_R$ and $\alpha_A$; then $\delta_R(0)$ and $\delta_A(0)$; $\delta_R(1)$ and $\delta_A(1)$; $\delta_R(00)$ and $\delta_A(00)$; and finally $\delta_R(01)$ and $\delta_A(01)$, where $\delta_A(0) \simeq \delta_A(01)$.]

  26. BFS-based Equivalence Query
     [Flowchart: "Pop w from queue" → "Add w's next words to queue"]
     The equivalence query proceeds by breadth-first search.

  27. Maintaining $p$
     [Flowchart with the boxes "Pop w from queue", "Check if p should be refined" (a YES/NO branch), "Refine p", and "Add w's next words to queue"]
     We want $p$ to satisfy $\forall w \in W.\ p(\delta_R(w)) \simeq \delta_A(w)$
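
The loop on slide 27 can be sketched as follows. The slides do not spell out the refinement test or the regressor at this point, so both are assumptions here: `p` is refined when the translated RNN configuration is further than `tol` from the WFA configuration (maximum coordinate difference), refinement means refitting on every word visited so far, and a nearest-neighbour predictor stands in for the regression. All names are hypothetical.

```python
from collections import deque
from typing import Callable, List, Optional, Tuple

Vector = List[float]
Translator = Callable[[Vector], Vector]

def nearest_neighbour_fit(xs: List[Vector], ys: List[Vector]) -> Translator:
    """Stand-in regressor: predict the y paired with the closest sampled x."""
    def p(x: Vector) -> Vector:
        dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in xs]
        return ys[dists.index(min(dists))]
    return p

def maintain_translator(delta_R: Callable[[str], Vector], delta_A: Callable[[str], Vector],
                        alphabet: str, max_len: int, tol: float,
                        fit: Callable[[List[Vector], List[Vector]], Translator] = nearest_neighbour_fit
                        ) -> Tuple[Translator, int]:
    """Run the BFS loop, refining p whenever it mistranslates a newly popped word."""
    queue = deque([""])
    xs: List[Vector] = []   # RNN configurations of the visited words
    ys: List[Vector] = []   # WFA configurations of the visited words
    p: Optional[Translator] = None
    refinements = 0
    while queue:
        word = queue.popleft()                       # pop w from the queue
        x, y = delta_R(word), delta_A(word)
        xs.append(x)
        ys.append(y)
        # Assumed check: is p(delta_R(w)) within tol of delta_A(w)?
        off = p is None or max(abs(a - b) for a, b in zip(p(x), y)) > tol
        if off:
            p = fit(list(xs), list(ys))              # refine p on a snapshot of the visited data
            refinements += 1
        if len(word) < max_len:
            queue.extend(word + s for s in alphabet)  # add w's next words
    return p, refinements
```

Passing copies of the sample lists to `fit` keeps each fitted `p` fixed until the next explicit refinement. The length bound `max_len` is only there to make the sketch terminate.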

  28. Check if $p$ should be refined
     [Figure: $p$ maps the configuration space of $R$ ($\mathbb{R}^d$) to the configuration space of $A$ ($\mathbb{R}^n$); the point $\delta_R(w')$ is sent to $\delta_A(w') = p(\delta_R(w'))$.]
     $w'$: a word already visited in the BFS loop
     $w$: a word just popped

  29. Check if $p$ should be refined
     [Figure: as on slide 28, with the point $\delta_R(w)$ added; on the WFA side, $\delta_A(w') = p(\delta_R(w')) = \delta_A(w)$.]
     $w'$: a word already visited in the BFS loop
     $w$: a word just popped

  30. Check if $p$ should be refined
     [Figure, two panels. Upper panel: $\delta_A(w') = p(\delta_R(w')) = \delta_A(w) = p(\delta_R(w))$. Lower panel: $p(\delta_R(w))$ is drawn as a point separate from $\delta_A(w') = p(\delta_R(w')) = \delta_A(w)$.]
     $w'$: a word already visited in the BFS loop
     $w$: a word just popped
