Learning Interpretable Models Expressed in Linear Temporal Logic


  1. Learning Interpretable Models Expressed in Linear Temporal Logic. Alberto Camacho 1,2 and Sheila McIlraith 1,2. 1 Department of Computer Science, University of Toronto; 2 Vector Institute. {acamacho, sheila}@cs.toronto.edu. ICAPS 2019, July 13, 2019.

  2. Linear Temporal Logic on Finite Traces (LTL_f)
LTL_f extends propositional logic with temporal operators:
  next: ○ϕ    until: ψ U χ    always: □ϕ    eventually: ◇ϕ    release: ψ R χ
It is useful to define:
  the macro final, denoting the end of the trace: final := ¬○⊤
  the weak next operator: ●ϕ ≡ final ∨ ○ϕ
We presume LTL_f formulae are written in Negation Normal Form, where negations are pushed to the level of propositional formulae.
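To make these finite-trace semantics concrete, here is a minimal evaluator sketch in Python (my own illustration, not code from the talk): formulas are nested tuples, a trace is a list of states, and each state is a set of atomic propositions.

```python
# Minimal LTL_f evaluator over finite traces (illustrative sketch only).
# A trace is a list of states; a state is a set of atoms.
# Formulas are nested tuples, e.g. ("until", ("atom", "p"), ("atom", "q")).

def holds(phi, trace, t=0):
    """Return True iff the suffix trace[t:] satisfies formula phi."""
    op = phi[0]
    if op == "atom":
        return phi[1] in trace[t]
    if op == "not":                      # negation of an atom (NNF assumed)
        return phi[1] not in trace[t]
    if op == "and":
        return holds(phi[1], trace, t) and holds(phi[2], trace, t)
    if op == "or":
        return holds(phi[1], trace, t) or holds(phi[2], trace, t)
    if op == "next":                     # strong next: fails at the last state
        return t + 1 < len(trace) and holds(phi[1], trace, t + 1)
    if op == "wnext":                    # weak next: final or next
        return t + 1 >= len(trace) or holds(phi[1], trace, t + 1)
    if op == "always":
        return all(holds(phi[1], trace, k) for k in range(t, len(trace)))
    if op == "eventually":
        return any(holds(phi[1], trace, k) for k in range(t, len(trace)))
    if op == "until":                    # phi[2] eventually holds, phi[1] until then
        return any(holds(phi[2], trace, k)
                   and all(holds(phi[1], trace, j) for j in range(t, k))
                   for k in range(t, len(trace)))
    if op == "release":                  # dual of until
        return all(holds(phi[2], trace, k)
                   or any(holds(phi[1], trace, j) for j in range(t, k))
                   for k in range(t, len(trace)))
    raise ValueError(f"unknown operator {op!r}")
```

Note how strong next fails at the last state while weak next succeeds there, which is exactly the final / weak-next distinction above.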

  3. Passive Learning from Positive and Negative Example Traces
Passive Learning: learn an LTL_f formula that is consistent with a given set of positive (+) and negative (−) examples.
  • Examples are finite sequences of states.
  • States are truth assignments to atomic propositions (represented as sets).
(Figure: the passive learner maps the labelled example traces below to the model p U q.)
  + : {p} {p} {q}      − : {p} {r} {q}
  + : {p} {q}          − : {p} {r}
  + : {p, r} {q}       − : {r} {q}
                       − : {q, r}
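As a quick sanity check (my own illustration, reusing the holds evaluator sketched above), the candidate p U q can be tested against a few of the traces from the figure:

```python
# p U q, in the tuple representation of the evaluator sketch above.
p_until_q = ("until", ("atom", "p"), ("atom", "q"))

# A few of the slide's labelled traces (states are sets of atoms).
positives = [[{"p"}, {"p"}, {"q"}], [{"p"}, {"q"}]]
negatives = [[{"p"}, {"r"}, {"q"}], [{"r"}, {"q"}]]

assert all(holds(p_until_q, trace) for trace in positives)
assert not any(holds(p_until_q, trace) for trace in negatives)
```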

  4. Related Work
There is related work in learning automata (e.g., L* (Angluin, 1987)), and very (very!) recent work in learning LTL:
  Neider and Gavran (2018) learn LTL temporal properties (interpreted over infinite traces) that are consistent with a given set of finite observation traces.
  Shah et al. (NeurIPS 2018) and Kim et al. (IJCAI 2019) adopt a non-exact, Bayesian approach to infer models in terms of behaviors encoded as LTL templates.

  5. Skeleton Templates for Alternating Finite Automata (AFA)
  • A tree expansion is accepting if all the leaf nodes are accepting.
  • A state trace e = e_1 ... e_N satisfies ϕ iff A_ϕ has an accepting tree expansion.
(Figure: skeleton templates (a)–(j) for conjunction α ∧ β, disjunction α ∨ β, the unary temporal operators (next, weak next, always, eventually), until α U β, release α R β, a single literal p, and the composite example p ∨ (q U ¬r).)
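As a rough picture of the data structure these templates describe, one might represent an AFA skeleton structure like this (my own sketch and naming; it mirrors the per-skeleton variables introduced on the next slide):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Skeleton:
    """One AFA skeleton node (illustrative): a type, up to two child skeleton
    indices, and an optional literal used when the type is LIT."""
    sk_type: str                 # "AND", "OR", "NEXT", "WNEXT", "UNTIL",
                                 # "RELEASE", "EVENTUALLY", "ALWAYS", or "LIT"
    alpha: Optional[int] = None  # index of the 'alpha' child skeleton
    beta: Optional[int] = None   # index of the 'beta' child skeleton
    literal: Optional[str] = None

# e.g. a skeleton structure for p ∨ (q U ¬r), rooted at index 0:
afa = [
    Skeleton("OR", alpha=1, beta=2),
    Skeleton("LIT", literal="p"),
    Skeleton("UNTIL", alpha=3, beta=4),
    Skeleton("LIT", literal="q"),
    Skeleton("LIT", literal="¬r"),
]
```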

  6. Passive Learning of LTL_f
Theorem. The following algorithm returns a minimal LTL_f formula that is consistent with a set of positive and negative examples.
  1: Fix the size of the formula to N = 1.
  2: Construct an AFA A_ϕ with N skeletons that accepts all positive examples and rejects all negative examples.
  3: If no AFA exists, increment N by one and go to 2.
  4: Extract an LTL_f formula ϕ from the structure of A_ϕ. Return ϕ.
We use SAT to find A_ϕ. Recall: SAT is the problem of finding an assignment to variables V such that a Boolean formula holds true.
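In code, the search over formula sizes is a simple iterative-deepening loop around a SAT call. The sketch below is my own illustration: encode_afa_as_cnf and extract_formula are placeholders standing in for the paper's AFA-skeleton encoding and decoding, and the solver comes from the python-sat (pysat) package.

```python
from pysat.solvers import Glucose3   # pip install python-sat

def encode_afa_as_cnf(positives, negatives, n):
    """Placeholder for the paper's AFA-skeleton CNF encoding (not shown here)."""
    raise NotImplementedError

def extract_formula(model, n):
    """Placeholder for decoding a SAT model into an LTL_f formula."""
    raise NotImplementedError

def learn_ltlf(positives, negatives, max_size=20):
    """Iterative deepening over the number of skeletons N (sketch of the loop)."""
    for n in range(1, max_size + 1):                        # steps 1 and 3: grow N
        cnf = encode_afa_as_cnf(positives, negatives, n)    # step 2
        with Glucose3(bootstrap_with=cnf) as solver:
            if solver.solve():                # a consistent AFA with n skeletons exists
                return extract_formula(solver.get_model(), n)   # step 4
    return None                               # no formula within the size budget
```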

  7. Reduction to SAT
Skeletons s_1, s_2, ..., s_N; the AFA is rooted at skeleton s_1.
Each skeleton s has exactly one type:
  SkType(s) := { AND(s), OR(s), NEXT(s), WNEXT(s), UNTIL(s), RELEASE(s), EVENTUALLY(s), ALWAYS(s), LIT(s) }
Each skeleton s has one associated 'alpha' subformula skeleton, one associated 'beta' subformula skeleton, and one associated literal (used if LIT(s) holds).
Clauses:
  oneOf(SkType(s))
  oneOf({ A(s, s') | s + 1 ≤ s' ≤ N })
  oneOf({ B(s, s'') | s + 1 ≤ s'' ≤ N })
  oneOf({ L(s, v) | 1 ≤ v ≤ |V| } ∪ { L(s, −v) | 1 ≤ v ≤ |V| })
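The oneOf constraints are ordinary exactly-one constraints over Boolean variables. A minimal pairwise encoding (one standard option; the slide does not commit to a particular encoding) looks like this:

```python
def one_of(variable_ids):
    """Pairwise exactly-one encoding: at least one, and no two together.

    variable_ids are positive integers naming Boolean variables in DIMACS
    style; the function returns a list of CNF clauses.
    """
    clauses = [list(variable_ids)]                     # at least one holds
    for i, v in enumerate(variable_ids):
        for w in variable_ids[i + 1:]:
            clauses.append([-v, -w])                   # at most one holds
    return clauses

# e.g. exactly one skeleton type for skeleton s, with variables 7..15
# standing for AND(s), OR(s), NEXT(s), ..., LIT(s):
sktype_clauses = one_of(list(range(7, 16)))
```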

  8. Enforcing Acceptance of Positive Examples
Notation:
  • state trace example e
  • timestep t ∈ {1, ..., |e|}
  • v ∈ V represents a state variable
  • skeleton index s ∈ {1, ..., N}
Clauses, for each positive example e and skeleton s:
  • RUN(e, 1, 1) has to hold.
  • Implication clauses for each LTL_f operator, e.g.:
Subformula α ∧ β, timestep 1 ≤ t ≤ |e|:
  RUN(e, t, s') ← RUN(e, t, s) ∧ AND(s) ∧ A(s, s')
  RUN(e, t, s'') ← RUN(e, t, s) ∧ AND(s) ∧ B(s, s'')
Subformula ○α, timestep 1 ≤ t < |e|:
  RUN(e, t+1, s') ← RUN(e, t, s) ∧ NEXT(s) ∧ A(s, s')
Subformula ○α, timestep t = |e|:
  ⊥ ← RUN(e, |e|, s) ∧ NEXT(s) ∧ A(s, s')
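Each implication becomes a single CNF clause once it is rewritten as a disjunction. For instance (my own illustration, with the arguments naming integer SAT variables in a hypothetical encoding):

```python
# RUN(e,t,s') <- RUN(e,t,s) AND AND(s) AND A(s,s')  becomes the single clause
#   (-RUN(e,t,s) OR -AND(s) OR -A(s,s') OR RUN(e,t,s'))
def and_clause(run_e_t_s, and_s, a_s_sp, run_e_t_sp):
    return [-run_e_t_s, -and_s, -a_s_sp, run_e_t_sp]

# The "bottom" clauses, e.g. FALSE <- RUN(e,|e|,s) AND NEXT(s) AND A(s,s'),
# simply omit the head literal:
def next_at_end_clause(run_e_last_s, next_s, a_s_sp):
    return [-run_e_last_s, -next_s, -a_s_sp]
```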

  9. Enforcing Rejection of Negative Examples
How to enforce rejection of negative examples?
  • Naive way: enforce that all tree expansions are rejecting. This may be problematic because the number of tree expansions can be massive.
  • Clever way: enforce some accepting tree expansion in the dual of the AFA being learned.

  10. Dualizing an AFA Skeleton Structure
If A_ϕ is a skeleton structure for LTL_f formula ϕ, then A_dual(ϕ) is a skeleton structure for LTL_f formula ¬ϕ, where
  dual(p) := ¬p,  p ∈ AP          dual(α ∨ β) := dual(α) ∧ dual(β)
  dual(¬p) := p,  p ∈ AP          dual(α ∧ β) := dual(α) ∨ dual(β)
  dual(○α) := ●dual(α)            dual(α U β) := dual(α) R dual(β)
  dual(●α) := ○dual(α)            dual(α R β) := dual(α) U dual(β)
In other words, A_dual(ϕ) accepts the complement language of A_ϕ.
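In code, dualization is a straightforward structural recursion over the formula tree. The sketch below (my own illustration, on the tuple representation from the earlier sketches, extended with the eventually/always pair for completeness) mirrors the rules above:

```python
def dual(phi):
    """Return dual(phi): a formula in NNF equivalent to the negation of phi."""
    op = phi[0]
    if op == "atom":
        return ("not", phi[1])
    if op == "not":
        return ("atom", phi[1])
    if op == "and":
        return ("or", dual(phi[1]), dual(phi[2]))
    if op == "or":
        return ("and", dual(phi[1]), dual(phi[2]))
    if op == "next":
        return ("wnext", dual(phi[1]))
    if op == "wnext":
        return ("next", dual(phi[1]))
    if op == "until":
        return ("release", dual(phi[1]), dual(phi[2]))
    if op == "release":
        return ("until", dual(phi[1]), dual(phi[2]))
    if op == "always":
        return ("eventually", dual(phi[1]))
    if op == "eventually":
        return ("always", dual(phi[1]))
    raise ValueError(f"unknown operator {op!r}")
```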

  11. Enforcing Rejection of Negative Examples with Dualized Skeletons
Notation:
  • state trace example e
  • timestep t ∈ {1, ..., |e|}
  • skeleton index s ∈ {1, ..., N}
If e is a negative example:
  RUN(e, t, s') ← RUN(e, t, s) ∧ OR(s) ∧ A(s, s')
  RUN(e, t, s'') ← RUN(e, t, s) ∧ OR(s) ∧ B(s, s'')
And the dualization of skeletons that represent variables is:
  If v ∈ e[t]:  ⊥ ← RUN(e, t, s) ∧ LIT(s) ∧ L(s, v)
  If v ∉ e[t]:  ⊥ ← RUN(e, t, s) ∧ LIT(s) ∧ L(s, ¬v)
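Concretely, for a negative example the literal clauses are generated from the example's states (same illustrative CNF conventions as the earlier clause sketch; variable names are hypothetical):

```python
def dual_literal_clauses(run_var, lit_var, l_pos_var, l_neg_var, atom, state):
    """Clauses that make a dualized LIT skeleton fail on a negative example.

    state is the set of atoms true at e[t]; l_pos_var / l_neg_var are the SAT
    variables L(s, v) and L(s, ¬v) for the skeleton's literal v = atom.
    """
    if atom in state:     # the dual of literal v is ¬v, which fails when v holds
        return [[-run_var, -lit_var, -l_pos_var]]
    else:                 # the dual of literal ¬v is v, which fails when v is absent
        return [[-run_var, -lit_var, -l_neg_var]]
```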

  12. Active and Passive Learning
(Figure: (a) workflow of passive learning, mapping the labelled example traces of slide 3 to the model p U q; (b) interaction between the active learner and the oracle via queries and responses.)
Passive Learning: learn an LTL_f formula that is consistent with a given set of positive (+) and negative (−) examples.
Active Learning: learn an LTL_f formula from interaction with an oracle, by performing membership and equivalence queries.
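A counterexample-guided version of the equivalence-query loop can be sketched as follows (my own illustration: oracle.equivalent is an assumed interface, learn_ltlf is the passive-learner sketch from slide 6's note, and the full algorithm also uses membership queries to pick informative traces):

```python
def active_learn(oracle):
    """Equivalence-query loop on top of the passive learner (illustrative sketch).

    oracle.equivalent(phi) is assumed to return (True, None) when phi captures
    the target behaviour, and (False, (trace, is_positive)) with a labelled
    counterexample trace otherwise.
    """
    positives, negatives = [], []
    while True:
        phi = learn_ltlf(positives, negatives)       # passive learner sketched earlier
        ok, counterexample = oracle.equivalent(phi)
        if ok:
            return phi
        trace, is_positive = counterexample
        (positives if is_positive else negatives).append(trace)
```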

  13. Sample Complexity
LTL_f to DFA translation is worst-case doubly exponential. Learning a minimal DFA with N states needs a number of informative examples that is polynomial in N (cf. Angluin 1987).
Question: how many examples do we need to learn a minimal LTL_f formula?
Theorem. Active learning of an LTL_f formula ϕ can be done with a number of queries exponential in the size of ϕ.
Theorem. Passive learning of an LTL_f formula ϕ can be done with a number of informative examples exponential in the size of ϕ.
In other words, we can offer bounds on the number of examples that are exponentially smaller than those obtained by going through the DFA.

  14. Practical Applications
  • Few-shot Learning
  • Behavior Classification
  • Plan and Intent Recognition
  • Reward Function Learning
  • Knowledge Extraction
  • LTL Mining

  15. Experiments
(Table: number of positive (|E+|) and negative (|E−|) examples needed for active learning of each target LTL_f formula, and comparison with the number of characteristic samples (CS) that uniquely define the minimal DFA with S states. The targets, over |AP| = 3 propositions, include p ∧ q and temporally scoped variants of it, p U q, and p R q; in every case a handful of examples (1–5 positive, 3–6 negative) and well under a second of run time sufficed, whereas the corresponding DFA characteristic samples number in the tens.)

  16. Active Learning
(Figure: run time in seconds (top, log scale) and number of examples needed (bottom) to learn a variety of LTL_f formulas of sizes 2 through 11.)

  17. Behavior Classification
4 different behaviors in the Openstacks planning benchmark; 1000 plans generated per behavior (using a top-k planner).
Can we learn LTL_f formulae that discriminate between behaviors, given K examples per behavior? e.g. (not (shipped o5)) U (stacks avail n4)
(Figure: classification accuracy, precision, and recall as a function of K, for K = 1 to 10.)
LTL_f learning via SAT takes 0.1 seconds. In contrast, deep-learning LSTM time-series classification (Karim et al. 2018) takes 16 seconds.

  18. Conclusions
  • Novel mechanism to do passive and active learning of LTL_f formulae
  • Exploiting duality of the AFA to learn from negative examples
  • Exponential bounds on the number of examples (exponentially lower than with DFA)
  • Identification of applications

  19. Thank you. Questions?
