Active Learning of State Machines (tutorial). Frits Vaandrager, Radboud University Nijmegen. Dagstuhl, March 2018.
Goal of Active Automaton Learning: What state machine governs the behavior of this black box? [Figure: the SUT as a black box with a reset, inputs 1 and 2, and outputs 1 and 2.]
Why Study Automata Learning? Fundamental: system identification. Useful: often we don't have models of software components, and when we have models we often don't know whether they are correct.
Machine Learning in General
Learning Regular Languages
Minimally Adequate Teacher. [Figure: the Learner poses membership queries to the Teacher, answered with Yes / No, and equivalence queries, answered with Yes or with No + counterexample.]
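The two query types can be made concrete with a small sketch. It assumes the target is a known DFA, which is only true in a textbook setting; the names `DFA`, `Teacher`, `membership_query` and `equivalence_query` are illustrative and are not taken from the slides or from any library.

```python
from collections import deque


class DFA:
    """A complete DFA; delta maps (state, symbol) to the next state."""

    def __init__(self, initial, accepting, delta):
        self.initial = initial
        self.accepting = set(accepting)
        self.delta = delta

    def accepts(self, word):
        state = self.initial
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.accepting


class Teacher:
    """Hypothetical minimally adequate teacher that knows the target DFA."""

    def __init__(self, target, alphabet):
        self.target = target
        self.alphabet = alphabet

    def membership_query(self, word):
        # Yes / No: is the word in the target language?
        return self.target.accepts(word)

    def equivalence_query(self, hypothesis):
        # Breadth-first search over the product automaton, so the
        # counterexample returned is a shortest distinguishing word.
        start = (self.target.initial, hypothesis.initial)
        queue = deque([(start, "")])
        seen = {start}
        while queue:
            (p, q), word = queue.popleft()
            if (p in self.target.accepting) != (q in hypothesis.accepting):
                return False, word
            for a in self.alphabet:
                nxt = (self.target.delta[(p, a)], hypothesis.delta[(q, a)])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, word + a))
        return True, None


# Target language: words over {a, b} with an even number of a's.
target = DFA(0, {0}, {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1})
teacher = Teacher(target, "ab")

# A deliberately wrong one-state hypothesis that accepts every word.
hypothesis = DFA(0, {0}, {(0, "a"): 0, (0, "b"): 0})
```

Here `teacher.membership_query("aab")` answers yes, and `teacher.equivalence_query(hypothesis)` answers no together with a counterexample. For a real black box no such omniscient teacher exists, so the equivalence query has to be approximated.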
Regular Sets and Congruences
Angluin's L* Algorithm
Black Box Checking (Peled, Vardi & Yannakakis, '99). Learner: formulate hypothesis. Model-Based Testing: test hypothesis.
Our Research Method: Tools, Theory, Applications
Application 1: EMV protocol. Inference of EMV protocol. Credit card with EMV chip. EMV = Europay, Mastercard and Visa. Compatibility between smartcards and terminals. EMV-compliance required for
Model of SecureCode app on Dutch banking card. EMV standard has over 700 pages. At most 1500 membership queries, less than 30 minutes.
Different cards, different state machines. Learned models provide unique fingerprints of cards! Specification?
Application 2: E.dentifier2
State Machines for Old and New E.dentifier2
A Theory of Abstractions (Aarts, Jonsson, Uijen & Vaandrager, 2015). [Figure: a Mapper sits between the Learner and the Teacher. The Learner works with abstract inputs and outputs over a small Σ; the Mapper translates these to and from the concrete inputs and outputs of the Teacher over a large Σ.]
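The role of the mapper can be illustrated with a toy example. The system below, its messages and the abstract symbols `REQ_VALID` / `REQ_INVALID` are all invented for illustration; the sketch only shows the general idea of a stateful translation between a small abstract alphabet and a large concrete one.

```python
class ToySUT:
    """Hypothetical system: accepts a request only with the expected sequence number."""

    def __init__(self):
        self.expected = 0

    def step(self, message):
        kind, seq = message
        if kind == "REQ" and seq == self.expected:
            self.expected += 1
            return ("ACK", self.expected)
        return ("NAK", self.expected)


class Mapper:
    """Translates abstract inputs to concrete ones and concrete outputs back."""

    def __init__(self, sut):
        self.sut = sut
        # Mapper state: the sequence number the SUT announced most recently.
        self.next_seq = 0

    def step(self, abstract_input):
        if abstract_input == "REQ_VALID":
            concrete = ("REQ", self.next_seq)
        elif abstract_input == "REQ_INVALID":
            concrete = ("REQ", self.next_seq + 1)
        else:
            raise ValueError(f"unknown abstract input: {abstract_input}")
        kind, seq = self.sut.step(concrete)
        self.next_seq = seq  # remember the concrete value
        return kind          # abstract output hides the number


mapper = Mapper(ToySUT())
trace = [mapper.step(x) for x in ["REQ_VALID", "REQ_INVALID", "REQ_VALID"]]
```

The learner only ever sees the two abstract inputs and the two abstract outputs, although the concrete system ranges over unboundedly many sequence numbers.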
Applications 3-5: Protocol Implementations. We found standard violations in implementations of major protocols: TCP (CAV'16, FMICS'17), TLS (Usenix Security '15), SSH (Spin'17).
SSH Learning Results
SSH Model Checking Results
Application 6: Power Control Service of Philips. Legacy component vs. refactored component: equivalent?
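Our Approach. [Flowchart: a model learner infers a model from the legacy implementation and another from the refactored implementation; an equivalence checker compares the two models. If they are equivalent: done. If not, the checker yields a counterexample. If the model(s) are correct for the counterexample, adapt the implementation(s); otherwise adapt the models using the counterexample.]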
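The equivalence-checker step can be sketched as a search over the product of two learned Mealy machines. This is a minimal illustration with invented toy models, not the tooling used in the Philips case study; `find_difference` and the machine encoding are assumptions.

```python
from collections import deque


def find_difference(m1, m2, alphabet):
    """Return a shortest input word on which two Mealy machines give
    different outputs, or None if they are equivalent."""
    start = (m1["initial"], m2["initial"])
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (s1, s2), word = queue.popleft()
        for a in alphabet:
            n1, o1 = m1["trans"][(s1, a)]
            n2, o2 = m2["trans"][(s2, a)]
            if o1 != o2:
                return word + [a]
            if (n1, n2) not in seen:
                seen.add((n1, n2))
                queue.append(((n1, n2), word + [a]))
    return None


# Toy model of a legacy component: switching on twice is silently accepted.
legacy = {"initial": 0, "trans": {
    (0, "on"): (1, "ok"), (1, "on"): (1, "ok"),
    (0, "off"): (0, "ok"), (1, "off"): (0, "ok"),
}}

# Toy model of the refactored component: the second "on" is rejected.
refactored = {"initial": 0, "trans": {
    (0, "on"): (1, "ok"), (1, "on"): (1, "err"),
    (0, "off"): (0, "ok"), (1, "off"): (0, "ok"),
}}

counterexample = find_difference(legacy, refactored, ["on", "off"])
```

The counterexample is then replayed on both implementations: if they really differ on it, an implementation has to be adapted; if not, one of the learned models was wrong and is refined.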
Application 7: Engine Status Manager Océ Printer. Goal: learn models of realistic printer controllers. Possible use: regression testing, generation of new implementations, ...
Adaptive Distinguishing Sequences (Lee & Yannakakis, 1994)
Results: Learned model from SUT equivalent to handcrafted model. 114 hypotheses generated. 8.5 hours needed. 29,933,643 membership queries with ≈35 inputs. 30,629,711 validity queries with ≈30 inputs.
Theory+Tools: Learning Register Automata. Three approaches: 1. Using adapted Myhill-Nerode (LearnLib, RALib); 2. Using mappers and CEGAR (Tomte); 3. Using NLambda Haskell library for nominal automata.
Theory: Learning Timed Mealy Machines (Jonsson & Vaandrager, 2018)
Future Work: Opening the Box. Some possible approaches: 1. Fuzzing; 2. Static analysis; 3. Tainting.
Other Research Challenges: I/O transition systems; nondeterminism; more complex (operations on) data; quality of learned models; …
Conclusions: Nice mix of theory and applications. Numerous challenges.