Learning of Automata Models Extended with Data
Bengt Jonsson
Uppsala University

Acknowledgments
Fides Aarts, Therese Bohlin, Olga Grinchtein, Falk Howar, Martin Leucker, Maik Merten, Harald Raffelt, Bernhard Steffen, Johan Uijen, Frits Vaandrager

Outline
• Motivation
• Formalisms for Automata with Data
• Abstraction
• Learning Setup
• Some Completeness Result
• Abstraction Refinement
• Applications and Evaluation
• Conclusion and Future Work

Motivating Use Case
BookingServiceInterface
• session = openSession(user,pwd)
• venue[] = getVenues(session)
• seat[] = getSeats(session,venue)
• receipt = bookSeat(session,venue,seat)
SeatBookerInterface
• venue[] = getVenues(user,pwd)
• seat[] = getSeats(user,pwd,venue)
• receipt = bookSeat(user,pwd,seat)
Mediator (sits between the SeatBooker and the BookingService)
• getVenues(u,p) is translated to openSession(u,p), which returns session, followed by getVenues(session); venues are passed back
• getSeats(u,p,venue) is translated to getSeats(session,venue); seats are passed back
• bookSeat(u,p,s) is translated to bookSeat(session,venue,s); the receipt is passed back

Data Relationships
• openSession(u,p): u and p must be a correct username - password combination; returns session
• getVenues(session): session must be equal to the one returned by openSession; returns venues
• getSeats(session,venue): venue ∈ venues; returns seats
• bookSeat(session,venue,s): s ∈ seats; returns receipt

Motivation: More examples
Interface Specifications
Container classes
• must keep track of identities of data
• relate data in input to data in subsequent output
Communication protocols
• SIP, TCP, …
• sequence numbers, identifiers, …

Practical Learning Scenario
(diagram labels: interface description; semantics; membership query, realized by test execution; equivalence query, realized by test execution)

Finite-State Mealy Machines
Finite state machines with input & output
• Σ_I  input symbols
• Σ_O  output symbols
• Q  states
• q_0  initial state
• δ : Q × Σ_I → Q  transition function
• λ : Q × Σ_I → Σ_O  output function
Notation: q --a/b--> q'
• Often used for protocol modeling
Assumptions:
• Deterministic
• Completely specified
(example diagram: states q_0, q_1, q_2 with transitions labeled input/output: a/1, b/1, a/0, b/0)

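The definition above can be written down directly as a small data structure. The following is a minimal sketch, not taken from the slides: the class name `MealyMachine`, the method `run`, and the two-state example machine `toggle` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class MealyMachine:
    """Deterministic, completely specified Mealy machine."""
    initial: str   # q_0
    delta: dict    # transition function: (state, input) -> next state
    lam: dict      # output function:     (state, input) -> output

    def run(self, word):
        """Feed an input word and return the list of produced outputs."""
        state, outputs = self.initial, []
        for symbol in word:
            outputs.append(self.lam[(state, symbol)])
            state = self.delta[(state, symbol)]
        return outputs


# Hypothetical example machine (not the one drawn on the slide):
# input 'a' toggles between q0 and q1, input 'b' stays in place.
toggle = MealyMachine(
    initial="q0",
    delta={("q0", "a"): "q1", ("q0", "b"): "q0",
           ("q1", "a"): "q0", ("q1", "b"): "q1"},
    lam={("q0", "a"): 1, ("q0", "b"): 0,
         ("q1", "a"): 0, ("q1", "b"): 0},
)
```

Because δ and λ are total dictionaries over Q × Σ_I, the two assumptions on the slide (deterministic, completely specified) hold by construction.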
Basic Learning Setup
Same as in L*
• Membership query (Learner to Teacher): is w accepted or rejected?
  Answer: w is accepted/rejected
• Equivalence query (Learner to Oracle): is H equivalent to A?
  Answer: yes, or a counterexample v

Baseline: Automata Learning
L* infers a finite state machine from membership queries:
1. Pose membership queries until "saturation"
2. Construct Hypothesis from obtained information
3. Pose equivalence query
4. If the answer is "no" (with a counterexample), go to 1; otherwise return Hypothesis
Needs O(n³) queries to form a Hypothesis of size n
• In practice, often O(n² log n) queries
• Domain-specific optimizations can help a lot
Has been used to learn large automata (≥ 20k states)
Adapted for Mealy machines (by Niese et al. 2003)

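The four steps above can be sketched as a compact observation-table learner for Mealy machines. This is an illustrative sketch under stated assumptions, not the exact algorithm from the talk: the equivalence query is approximated by testing all input words up to `max_test_len` (an assumption), counterexamples are handled by adding all their suffixes as distinguishing suffixes, and the names `lstar_mealy` and `mod3_counter` are hypothetical.

```python
from itertools import product


def lstar_mealy(alphabet, query, max_test_len=5):
    """query(word) must return the tuple of outputs for an input tuple."""
    S = [()]                       # access sequences, one per discovered state
    E = [(a,) for a in alphabet]   # distinguishing suffixes

    def row(s):
        # Outputs produced by each suffix e when run after prefix s.
        return tuple(query(s + e)[len(s):] for e in E)

    while True:
        # Step 1: membership queries until the table is closed ("saturation").
        changed = True
        while changed:
            changed = False
            rows = {row(s) for s in S}
            for s, a in product(list(S), alphabet):
                if row(s + (a,)) not in rows:
                    S.append(s + (a,))
                    changed = True
                    break

        # Step 2: hypothesis; states are the (pairwise distinct) rows of S.
        state_of = {row(s): s for s in S}

        def hyp(word):
            s, out = (), []
            for a in word:
                out.append(query(s + (a,))[-1])
                s = state_of[row(s + (a,))]
            return tuple(out)

        # Step 3: equivalence query, approximated here by bounded testing.
        cex = next((w for n in range(1, max_test_len + 1)
                    for w in product(alphabet, repeat=n)
                    if hyp(w) != query(w)), None)

        # Step 4: no counterexample found, so return the hypothesis.
        if cex is None:
            return S, hyp
        # Otherwise refine: add every suffix of the counterexample to E.
        for i in range(len(cex)):
            if cex[i:] not in E:
                E.append(cex[i:])


def mod3_counter(word):
    """Hypothetical target: 'a' counts modulo 3, output 1 on wrap-around."""
    state, out = 0, []
    for a in word:
        if a == "a":
            state = (state + 1) % 3
            out.append(1 if state == 0 else 0)
        else:
            out.append(0)
    return tuple(out)
```

Calling `lstar_mealy(("a", "b"), mod3_counter)` returns the access sequences and a function that runs the learned hypothesis. The sketch repeats many queries; a real implementation would cache them.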
How to Extend with Data?
Extend the Mealy machine model
• Input and output symbols are parameterized by data values
• State variables remember parameters in received input
• Types of parameters could be, e.g.,
  • Identifiers of connections, sessions, users
  • Sequence numbers
  • Time values
Extend the learning techniques
• Several conceivable approaches
• We will attempt to reuse the L* approach
  • Augment by abstraction techniques

Input and Output Symbols
Assume
• Domains, e.g.,
  STRING   e.g., 'Mary', '174', …
  SESSION  e.g., 0, 1, 2, 3, …
  SEAT     e.g., 1, 2, 3, …, 167
• (Input and output) actions with arities, e.g.,
  openSession : STRING × STRING × SESSION
  getSeat     : SESSION × SEAT
• Symbols: an action applied to parameters, e.g.,
  openSession('Mary', '188H#4', 42)

Input and Output Symbols
Assume
• Domains, e.g., STRING, SESSION, SEAT
• (Input and output) actions with arities, e.g.,
  openSession : STRING × STRING × SESSION
  getSeat     : SESSION × SEAT
• Parameterized symbols: an action applied to formal parameters, e.g.,
  openSession(u, p, s)

Guards and Expressions
Assume
• Domains, e.g., STRING, SESSION, SEAT
• Relations on data, e.g.,
  =
  ∈          : SEAT × SEATS
  has_passwd : STRING × STRING

Symbolic Mealy Machine
A Symbolic Mealy Machine consists of
• I    input actions
• O    output actions
• L    locations
• l_0  initial location
• X    state variables (typed)
• →    symbolic transitions
State variables (example)
  cur_session : SESSION
  cur_seats   : SEATS
  booked      : SEATS
Symbolic transition from l_0 to l_1 (example)
  getSeat(s,seat)                              parameterized input symbol: input action with formal parameters
  [s = cur_session ∧ seat ∈ cur_seats] /       guard
  booked := booked ∪ {seat} ;                  assignment
  bookedSeat(seat)                             output expression

Example
State variables
  cur_session : SESSION
  cur_seats   : SEATS
  booked      : SEATS
(* Maybe complete the Example Here *)
Symbolic transition from l_0 to l_1
  getSeat(s,seat)                              parameterized input symbol: input action with formal parameters
  [s = cur_session ∧ seat ∈ cur_seats] /       guard
  booked := booked ∪ {seat} ;                  assignment
  bookedSeat(seat)                             output expression

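One way to make the parts of a symbolic transition concrete is to represent guard, assignment and output expression as functions over the state variables and the actual parameters. The sketch below is an illustration only: the names `SymbolicTransition`, `fire`, `get_seat` and the initial variable values in `variables` are hypothetical and not part of the slides.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SymbolicTransition:
    source: str                            # source location
    target: str                            # target location
    action: str                            # input action name
    guard: Callable[[dict, tuple], bool]   # condition on variables and parameters
    update: Callable[[dict, tuple], None]  # assignment to state variables
    output: Callable[[dict, tuple], tuple] # output expression


def fire(t, location, variables, action, params):
    """Take transition t if it is enabled; return (new location, output) or None."""
    if location != t.source or action != t.action:
        return None
    if not t.guard(variables, params):
        return None
    t.update(variables, params)
    return t.target, t.output(variables, params)


# getSeat(s,seat) [s = cur_session and seat in cur_seats] /
#     booked := booked U {seat} ; bookedSeat(seat)
get_seat = SymbolicTransition(
    source="l0", target="l1", action="getSeat",
    guard=lambda x, p: p[0] == x["cur_session"] and p[1] in x["cur_seats"],
    update=lambda x, p: x["booked"].add(p[1]),
    output=lambda x, p: ("bookedSeat", p[1]),
)

# Hypothetical initial valuation of the state variables.
variables = {"cur_session": 42, "cur_seats": {1, 2, 3}, "booked": set()}
```

With this valuation, `fire(get_seat, "l0", variables, "getSeat", (42, 2))` takes the transition, while a call with a different session identifier is not enabled and returns `None`.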
Example: XMPP protocol
I: register, login : STRING × STRING
   pw : STRING
   logout, del
O: ok, rej
X: usr, pwd : STRING
Locations: l_0, l_1, l_2
Symbolic transitions (as labeled in the diagram):
• register(u,p) / usr := u ; pwd := p ; ok
• login(u,p) [u = usr ∧ p = pwd] / ok
• login(u,p) [u ≠ usr ∨ p ≠ pwd] / nok
• logout() / ok
• delete() / ok
• pw(p) / pwd := p ; ok

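The transitions of this example can be simulated with a few lines of code. The sketch below is hypothetical in several respects, because the flattened diagram does not show which locations each transition connects: it assumes register leads from l_0 to l_1, a successful login from l_1 to l_2, a failed login stays in l_1, logout leads from l_2 to l_1, pw loops on l_2, and delete leads from l_1 to l_0. Raising an error for inputs that are not enabled is also an assumption, as is the class name `XmppLikeSMM`.

```python
class XmppLikeSMM:
    """Simulator for the login example; the topology is an assumption."""

    def __init__(self):
        self.loc = "l0"    # current location
        self.usr = None    # state variable usr
        self.pwd = None    # state variable pwd

    def step(self, action, *params):
        if self.loc == "l0" and action == "register":
            self.usr, self.pwd = params          # usr := u ; pwd := p
            self.loc = "l1"
            return "ok"
        if self.loc == "l1" and action == "login":
            u, p = params
            if u == self.usr and p == self.pwd:  # guard [u = usr and p = pwd]
                self.loc = "l2"
                return "ok"
            return "nok"                         # guard [u != usr or p != pwd]
        if self.loc == "l2" and action == "pw":
            (self.pwd,) = params                 # pwd := p
            return "ok"
        if self.loc == "l2" and action == "logout":
            self.loc = "l1"
            return "ok"
        if self.loc == "l1" and action == "delete":  # assumed source location
            self.loc = "l0"
            return "ok"
        # Assumption: inputs that are not enabled are treated as errors.
        raise ValueError(f"{action} not enabled in {self.loc}")
```

The output of login depends on the values stored by an earlier register or pw, which is exactly the kind of data dependency a finite-state Mealy machine over unparameterized symbols cannot express.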
How to Adapt Learning?
How to use L* to infer Symbolic Mealy Machines?
• L* works on finite-state Mealy machines
• SMMs are infinite-state, with infinite alphabets
IDEA: Use abstraction (from verification/model checking)
• Fides Aarts, Bengt Jonsson, and Johan Uijen: Generating Models of Infinite-State Communication Protocols using Regular Inference with Abstraction. ICTSS 2010
• Falk Howar, Maik Merten, Bernhard Steffen: Automata Learning with Automated Alphabet Abstraction Refinement. VMCAI 2011

Abstraction: the General Idea
(diagram: the abstraction α maps the model M to the abstract model M^A, with M < M^A)

Abstraction in Verification
Problem: M satisfies ϕ ?
Transformed into: M^A satisfies ϕ^A ?

Adaptation in Learning
• Define an abstraction α
  • α transforms the model M into M^A
• Use L* to infer M^A
  • works if M^A is deterministic and finite-state
• Reverse the effect of α on M^A
  • i.e., M = α⁻¹(M^A)
• If M^A is not adequate, refine α

Abstraction in Learning?
Black-box setting -> we do not have access to the internal state of the SM
Define an abstraction on (input and output) symbols
• E.g., suppress parameters

Application to Example
Black-box setting -> no access to the internal state of the SM
Define an abstraction on (input and output) symbols
• E.g., suppress parameters
Concrete model (locations l_0, l_1, l_2):
• register(u,p) / usr := u ; pwd := p ; ok
• login(u,p) [u = usr ∧ p = pwd] / ok
• login(u,p) [u ≠ usr ∨ p ≠ pwd] / nok
• logout() / ok
• delete() / ok
• pw(p) / pwd := p ; ok

Inadequate Model
Abstract model (locations l_0, l_1, l_2):
• register / ok
• login / ok
• login / nok
• logout / ok
• delete / ok
• pw / ok
Problem: nondeterminism

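The nondeterminism can be observed directly on traces. The sketch below applies the parameter-suppressing abstraction to two concrete runs and reports an abstract input word that is seen with two different outputs. The function names `suppress_parameters` and `find_nondeterminism` and the trace encoding are hypothetical; the concrete values are the ones used for register and login elsewhere in these slides.

```python
def suppress_parameters(symbol):
    """Abstraction that keeps only the action name of a (name, params) symbol."""
    action, _params = symbol
    return action


def find_nondeterminism(traces, abstract):
    """Return (abstract input word, output 1, output 2) if the same abstract
    input word is observed with two different abstract outputs, else None."""
    seen = {}
    for trace in traces:
        prefix = ()
        for inp, out in trace:
            prefix += (abstract(inp),)
            a_out = abstract(out)
            if seen.setdefault(prefix, a_out) != a_out:
                return prefix, seen[prefix], a_out
    return None


# Two concrete runs, each a list of (input symbol, output symbol) pairs.
traces = [
    [(("register", ("Mary", "145#u")), ("ok", ())),
     (("login", ("Mary", "145#u")), ("ok", ()))],
    [(("register", ("Mary", "145#u")), ("ok", ())),
     (("login", ("Mary", "237#u")), ("nok", ()))],
]
```

Here `find_nondeterminism(traces, suppress_parameters)` reports that the abstract input word register, login produces both ok and nok, so the abstract model is not a deterministic Mealy machine and L* cannot be applied to it as is.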
Fixing the Nondeterminism Problem
Abstract model fragment (locations l_0, l_1):
• register / ok
• login / ok
• login / nok

Fixing the Nondeterminism Problem
The abstraction depends on parameters and previous history
Abstract transitions and corresponding concrete symbols (locations l_0, l_1):
• register / ok:  register('Mary', '145#u') / ok
• login / ok:     login('Mary', '145#u') / ok
• login / nok:    login('Mary', '237#u') / nok

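A history-dependent abstraction can be sketched as a small stateful mapper that remembers the parameters of the last register and classifies each login by comparing against them. This is an illustration of the idea only, not the construction used in the cited papers: the class name `Mapper`, the function `abstract_trace` and the abstract symbol names `login(match)` and `login(mismatch)` are hypothetical.

```python
class Mapper:
    """Abstraction with memory: maps concrete inputs to abstract inputs."""

    def __init__(self):
        self.usr = None   # user name seen in the last register
        self.pwd = None   # password seen in the last register

    def abstract_input(self, action, params):
        if action == "register":
            self.usr, self.pwd = params   # remember history
            return "register"
        if action == "login":
            u, p = params
            same = (u == self.usr and p == self.pwd)
            return "login(match)" if same else "login(mismatch)"
        return action                     # other actions: suppress parameters


def abstract_trace(trace):
    """Abstract a list of ((action, params), output) pairs with a fresh mapper."""
    mapper = Mapper()
    return [(mapper.abstract_input(action, params), out)
            for (action, params), out in trace]


# The concrete runs from the slide.
run_ok = [(("register", ("Mary", "145#u")), "ok"),
          (("login", ("Mary", "145#u")), "ok")]
run_nok = [(("register", ("Mary", "145#u")), "ok"),
           (("login", ("Mary", "237#u")), "nok")]
```

Under this mapper the two runs no longer share an abstract input word: one contains `login(match)` and the other `login(mismatch)`, so each abstract input word has a single output.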