Adversarial Models for Deterministic Finite Automata
K. Zhang¹, Q. Wang², and C. Lee Giles¹
¹ Information Sciences and Technology, Pennsylvania State University, United States
² School of Computer Science, McGill University, Canada
Outline
• Motivation
• Background
• Contribution
• Main Idea
• Main Results
  – Adversarial Model & Transition Importance
  – Critical Pattern & Synchronizing Word
  – Evaluation
• Summary and Future Work
• References
Motivation
• Deterministic finite automata (DFA) are widely used in compilers and language processing systems. [1] However, our understanding of DFAs remains relatively coarse-grained; it is crucial to explore their fine-grained characteristics.
• Most prior work focuses on identifying feature-level perturbations that significantly affect a learning model, [2] and little research studies sensitivity from a model-level perspective.
• The critical pattern is another important characteristic for identifying a specific DFA, and it is closely related to the synchronizing word. The bound on the length of the shortest synchronizing word, known as the Černý conjecture, remains open.
Background – DFA & Regular Grammar

A DFA 𝑁 can be described by a five-tuple {Σ, 𝑅, 𝜀, 𝑟₀, 𝐺}:
• Σ is the input alphabet;
• 𝑅 is a finite, non-empty set of states;
• 𝜀 denotes a set of deterministic production rules (the transition function);
• 𝑟₀ represents the initial state;
• 𝐺 represents the set of accept states.

[Figure: an example three-state DFA that accepts exactly the binary numbers that are multiples of 3.]

Tomita Grammars [3]:

  Grammar 1: 1*
  Grammar 2: (10)*
  Grammar 3: an odd number of consecutive 1s is always followed by an even number of consecutive 0s
  Grammar 4: any string not containing "000" as a substring
  Grammar 5: even number of 0s and even number of 1s
  Grammar 6: the difference between the number of 0s and the number of 1s is a multiple of 3
  Grammar 7: 0*1*0*1*
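The five-tuple above can be exercised directly in code. Below is a minimal sketch (not from the original slides) of the example DFA that accepts binary multiples of 3; the state simply tracks the value of the binary prefix mod 3, and the function names (`make_mod3_dfa`, `accepts`) are our own.

```python
# Sketch of the slide's example DFA: states R = {0, 1, 2} track the value of
# the binary prefix modulo 3; state 0 is both the initial and the only accept
# state, so a string is accepted iff the number it encodes is a multiple of 3.
def make_mod3_dfa():
    sigma = {"0", "1"}
    # Reading bit s after prefix value r gives new value (2*r + s) mod 3.
    delta = {(r, s): (2 * r + int(s)) % 3 for r in range(3) for s in sigma}
    return sigma, delta, 0, {0}

def accepts(dfa, word):
    sigma, delta, r0, accept = dfa
    state = r0
    for symbol in word:
        if symbol not in sigma:
            return False  # symbols outside the alphabet are rejected outright
        state = delta[(state, symbol)]
    return state in accept

dfa = make_mod3_dfa()
print(accepts(dfa, "110"))  # 6 is a multiple of 3
print(accepts(dfa, "101"))  # 5 is not
```

The transition matrices B₀ and B₁ used later in the deck are just the 0/1 matrix encodings of this `delta` for inputs 0 and 1.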
Background – Adversarial Sample

Vulnerability of neural networks: the adversarial sample problem plagues most statistical and machine learning models. [2]

  ŷ = argmaxᵧ M(y, g)   s.t.  E(y, y₀) < ζ

• ŷ: data points that can "trick" the model into making incorrect predictions;
• ζ: perturbations/manipulations are often "tiny" w.r.t. a certain distance metric E;
• M: loss function;
• g: different models.

Affected applications:
• Image recognition;
• Sentiment analysis;
• Malware analysis: manipulation of system calls;
• Threats to security-critical applications, e.g. autonomous driving, cyber-security, medical diagnosis.
Contribution
• Open a discussion on model-level analysis and introduce a general scheme for adversarial models. Study the transition importance of a DFA through model-level perturbations.
• Study critical patterns that can be used to identify a specific DFA. Develop an algorithm for finding the critical patterns of a DFA by transforming this task into a DFA synchronizing problem. Provide a theoretical approach for estimating the length of any existing perfect pattern.
• This analysis of DFA models will help research on the security of cyber-physical systems that are built on working DFAs, e.g., compilers, VLSI designs, elevators, and ATMs.
Main Idea

In this paper, we aim to study individual DFAs for their fine-grained characteristics, including transition importance and critical patterns.

[Diagram: transition importance is studied via an adversary; critical patterns are studied via synchronizing words.]

To gain a better understanding of a DFA directly, we follow a similar approach to adversarial-sample analysis, but study the sensitivity of a DFA through model-level perturbations.

Synchronizing word: a word over the input alphabet of the DFA that sends every state of the DFA to one and the same state. [4]
Adversarial Model & Transition Importance

Here we propose to transform the adversarial example problem into the adversarial model problem, which considers model-level perturbations:

  ĝ = argmax_{g̃ : d(g̃, g) < ζ} M(y, g̃)

To quantitatively evaluate the difference between the sets of strings accepted by two different DFAs, we introduce the intersection over union (IOU) metric:

  IOU(Y, Ŷ) = |Y ∩ Ŷ| / |Y ∪ Ŷ|

Theorem.

  IOU(Y, Ŷ) = ( [∑ₖ₌₁ (1⊗q)ᵀ (N₀⊗(B₀+B₁) + N₁⊗(B̂₀+B̂₁))ᵏ (1⊗p)] / [∑ₖ₌₁ (q⊗q)ᵀ (B₀⊗B̂₀ + B₁⊗B̂₁)ᵏ (p⊗p)] − 1 )⁻¹

where p is the initial state vector; q is the indicator vector of the set of accept states; B₀, B₁ (resp. B̂₀, B̂₁) are the transition matrices of the original (resp. perturbed) DFA; and

  N₀ = [1 0; 0 0],  N₁ = [0 0; 0 1].
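The theorem gives a closed form; for intuition, the IOU can also be estimated empirically by enumerating all strings up to a cutoff length and counting intersection and union directly. This is a sketch of our own (the DFA encoding and the name `empirical_iou` are assumptions, not the authors' implementation):

```python
from itertools import product

# Approximate IOU(Y, Ŷ) between the languages of two DFAs by enumerating all
# binary strings up to max_len. A DFA is encoded as (delta, r0, accept):
# a transition dict, an initial state, and a set of accept states.
def accepts(delta, r0, accept, word):
    state = r0
    for s in word:
        state = delta[(state, s)]
    return state in accept

def empirical_iou(dfa_a, dfa_b, max_len=10, alphabet=("0", "1")):
    inter = union = 0
    for k in range(1, max_len + 1):
        for w in product(alphabet, repeat=k):
            a = accepts(*dfa_a, w)
            b = accepts(*dfa_b, w)
            inter += a and b  # accepted by both DFAs
            union += a or b   # accepted by at least one DFA
    return inter / union if union else 1.0
```

For identical DFAs this returns 1.0; a single transition substitution typically drives it well below 1, which is exactly the signal the adversarial model search exploits.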
Adversarial Model & Transition Importance

The theorem directly formulates the adversarial model problem for a DFA as an optimization problem. By introducing several additional constraints, we solve:

  min_{(B̂₀, B̂₁) ∈ 𝒰}  [∑ₖ₌₁ (q⊗q)ᵀ (B₀⊗B̂₀ + B₁⊗B̂₁)ᵏ (p⊗p)] / [∑ₖ₌₁ (1⊗q)ᵀ (N₀⊗(B₀+B₁) + N₁⊗(B̂₀+B̂₁))ᵏ (1⊗p)]

  s.t.  ‖B₀ − B̂₀‖²_F + ‖B₁ − B̂₁‖²_F = 2

Constraints:
• The optimized matrices are transition matrices;
• Only one transition substitution may be applied to one of the transition matrices;
• The perturbed DFA remains strongly connected;
• The set of accept states remains the same;
• Changes to the absorbing states are prevented.
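Since the constraint allows exactly one transition substitution, the search space is small, and transition importance can be sketched by brute force: try every single redirected transition and rank substitutions by how far they push an empirically estimated IOU below 1. This is an illustrative sketch under our own encoding; it omits the strong-connectivity and absorbing-state constraints from the slide for brevity, and the function names are assumptions.

```python
from itertools import product

# Enumerate all languages up to max_len for a DFA given as (delta, r0, accept).
def language(delta, r0, accept, max_len, alphabet=("0", "1")):
    words = set()
    for k in range(1, max_len + 1):
        for w in product(alphabet, repeat=k):
            state = r0
            for s in w:
                state = delta[(state, s)]
            if state in accept:
                words.add(w)
    return words

# Try every single transition substitution (keeping the accept set fixed) and
# return the one that minimizes the empirical IOU with the original language.
def most_important_transition(delta, r0, accept, n_states, max_len=8):
    base = language(delta, r0, accept, max_len)
    best, best_iou = None, 1.0
    for (state, sym), tgt in delta.items():
        for new_tgt in range(n_states):
            if new_tgt == tgt:
                continue  # the substitution must actually change the transition
            perturbed = dict(delta)
            perturbed[(state, sym)] = new_tgt
            lang = language(perturbed, r0, accept, max_len)
            union = base | lang
            iou = len(base & lang) / len(union) if union else 1.0
            if iou < best_iou:
                best, best_iou = ((state, sym), new_tgt), iou
    return best, best_iou
```

The transition whose substitution yields the smallest IOU is the most important one in the sense of this deck: perturbing it changes the accepted language the most.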
Critical Pattern & Synchronizing Word

Another view for investigating the characteristics of a DFA: the critical pattern. Given two disjoint sets of strings P and N, write m ∼ y when the pattern m is a factor (substring) of the string y, and let

  a = # strings in P with m,   b = # strings in N with m,
  c = # strings in P w/o m,    d = # strings in N w/o m.

Definition (Critical Pattern).
  Absolute pattern:  m* = argmax_m |Pr_{m∼y}(y ∈ P) − Pr_{m∼y}(y ∈ N)| = argmax_m |a − b| / (a + b)
  Relative pattern:  m* = argmax_m |Pr_{y∈P}(m ∼ y) − Pr_{y∈N}(m ∼ y)| = argmax_m |a/(a+c) − b/(b+d)|

We will focus on the absolute pattern.

Definition (Perfect Absolute Pattern).
  m* = argmin_{m ∈ B*} |m|, where B* = {m : |Pr_{m∼y}(y ∈ P) − Pr_{m∼y}(y ∈ N)| = 1}.
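The two scores follow directly from the counts a, b, c, d. A small sketch (our own, assuming P and N are finite string samples and m ∼ y means substring containment):

```python
# Absolute pattern score: conditioned on containing m, how lopsided is
# membership between P and N?  |a - b| / (a + b)
def absolute_score(m, P, N):
    a = sum(m in y for y in P)  # strings in P containing m
    b = sum(m in y for y in N)  # strings in N containing m
    return abs(a - b) / (a + b) if a + b else 0.0

# Relative pattern score: difference of containment rates within each set.
# |a/(a+c) - b/(b+d)| = |a/|P| - b/|N||
def relative_score(m, P, N):
    a = sum(m in y for y in P)
    b = sum(m in y for y in N)
    return abs(a / len(P) - b / len(N))
```

A pattern with `absolute_score` exactly 1 occurs only in strings from one of the two sets; the perfect absolute pattern is the shortest such m.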
Critical Pattern & Synchronizing Word

• A perfect absolute pattern describes a substring which has minimal length among all absolute patterns and perfectly differentiates the strings from the two disjoint sets;
• Only the polynomial and exponential classes have perfect absolute patterns; [5]
• An absorbing state naturally fits this synchronizing scheme. As such, we can set the absorbing state as the state to be synchronized.

[Figure: example DFAs with synchronizing patterns 'bab' (3 states), 'babaab' (4 states), and 'babaabaab' (5 states).]

Theorem. The length of a perfect absolute pattern of a DFA with n states is at most n(n − 1)/2.

Theorem. The length of a perfect absolute pattern of a 5-state DFA is at most 9.
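A synchronizing word can be found in polynomial time with the standard greedy pair-merging approach (in the spirit of Eppstein's algorithm; this is a generic sketch, not necessarily the synchronizing algorithm proposed in this work, and it does not guarantee the shortest word):

```python
from collections import deque

# BFS over pairs of states for a word that collapses p and q to one state.
def merging_word(delta, alphabet, p, q):
    seen = {(p, q)}
    queue = deque([((p, q), "")])
    while queue:
        (u, v), w = queue.popleft()
        if u == v:
            return w
        for s in alphabet:
            nxt = (delta[(u, s)], delta[(v, s)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, w + s))
    return None  # this pair (hence the DFA) is not synchronizable

def apply_word(delta, state, word):
    for s in word:
        state = delta[(state, s)]
    return state

# Greedy synchronization: repeatedly merge two of the currently possible
# states and apply the merging word to the whole set of possible states.
def synchronizing_word(delta, states, alphabet):
    current, word = set(states), ""
    while len(current) > 1:
        p, q = sorted(current)[:2]
        w = merging_word(delta, alphabet, p, q)
        if w is None:
            return None
        word += w
        current = {apply_word(delta, s, w) for s in current}
    return word
```

Each merge needs a word of length at most O(n²), which is consistent in spirit with the n(n − 1)/2 bound stated above for perfect absolute patterns.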
Evaluation

Adversarial models:

              Optimized IOU   Random IOU
  Grammar 3   1.48e-3         0.342
  Grammar 5   0.152           0.289
  Grammar 7   0.025           0.225

[Figure: the extracted DFAs for grammars G3, G5, and G7.]

Critical patterns (G7):

  Pattern   Conf.   Prob.
  ab        0.6     0.674
  bab       0.8     0.912
  abab      1       1

Probability difference and confidence have a positive correlation.
Summary & Future Work

Summary
• This work extends the sample-level analysis framework proposed in prior work for feed-forward neural networks to a model-level analysis scheme, and studies the transition importance of a DFA under this scheme.
• This work defines the critical pattern to identify an individual DFA and proposes a synchronizing algorithm to effectively find the critical pattern. Furthermore, it provides some theoretical analysis of the minimal length of the defined critical pattern.
• This work will facilitate research on the security of cyber-physical systems that are based on working DFAs.

Future Work
• Understand more complex models and DFAs used in real applications.
References
[1] Hopcroft, J. E., Motwani, R., & Ullman, J. D. (2001). Introduction to Automata Theory, Languages, and Computation. ACM SIGACT News, 32(1), 60-65.
[2] Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.
[3] Tomita, M. (1982). Learning of Construction of Finite Automata from Examples Using Hill-Climbing. RR: Regular Set Recognizer (No. CMU-CS-82-127).
[4] Rystsov, I. C. (2004). Černý's conjecture: retrospects and prospects. Proc. Workshop on Synchronizing Automata (WSA 2004), Turku.
[5] Wang, Q., Zhang, K., Ororbia II, A. G., Xing, X., Liu, X., & Giles, C. L. (2018). A Comparative Study of Rule Extraction for Recurrent Neural Networks. arXiv preprint arXiv:1801.05420.
Q & A Thanks! If you are interested in more details, please contact: kuz22@psu.edu