Two-Way Automata in Coq
Christian Doczkal and Gert Smolka
Saarland University, Computer Science
Interactive Theorem Proving (ITP), Nancy, France, August 24, 2016

Motivation

- Myhill-Nerode in Isabelle/HOL based on regular expressions (Wu, Zhang, Urban ’11)
- Various automata formalizations in dependent type theory:
  ◮ Myhill-Nerode based on automata in Nuprl (Constable ’00)
  ◮ Coq tactic for deciding RE equivalence (Coquand, Siles ’11)
  ◮ Coq tactic for deciding Kleene algebras (Braibant, Pous ’12)
- Student project: elegant formalization of automata/Myhill-Nerode based on Ssreflect’s finite types
- Equivalence of DFAs, NFAs, and REs, and a constructive variant of Myhill-Nerode in Coq (Doczkal, Kaiser, Smolka ’13)
- Today: reduction from two-way automata to DFAs, based on the constructive Myhill-Nerode theorem, formalized constructively in Coq

Two-Way Automata

- Another representation of regular languages
- Essentially: “read-only Turing machines without memory”
- Introduced together with one-way automata (Rabin, Scott ’59)
- Reductions to DFAs in (Rabin, Scott ’59) and (Shepherdson ’59)
- Reduction from 2NFAs to NFAs for the complement language (Vardi ’89)
- Recent survey paper by (Pighizzini ’13): “Two-Way Automata: Old and Recent Results”

Two-Way Automata

2NFA $M = (Q, s, F, \delta)$ where
- $Q$ is a finite type of states
- $s : Q$ is the starting state
- $F : 2^Q$ is the set of final states
- $\delta : Q \to \Sigma \uplus \{\triangleleft, \triangleright\} \to 2^{Q \times \{L, R\}}$ is the transition function

[Figure: the head of $M$, controlled by $\delta$, on the tape $\triangleleft\, a\, a\, a\, b\, b\, b\, \triangleright$, shown successively in the states $s$, $p$, and $q$.]

Language of a Two-Way Automaton

- Configurations on word $x$: $C_x := Q \times \{0, \dots, |x| + 1\}$
- Step relation on $x$: $(p, i) \to_x (q, j) : C_x \to C_x \to \mathbb{B}$
- $L(M) := \{ x \in \Sigma^* \mid \exists q \in F.\ (s, 1) \to_x^* (q, |x| + 1) \}$

[Figure: $M$ in state $q$ on the tape $\triangleleft\, b\, b\, a\, a\, a\, b\, \triangleright$.]

1. Language membership is (obviously) decidable
2. Main result: 2DFAs and 2NFAs accept exactly the regular languages

($M$ is deterministic (a 2DFA) if $\to_x$ is functional for all $x$.)

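The decidability claim can be illustrated outside of Coq with a small Python sketch: since $C_x$ is finite, membership reduces to a reachability search from $(s, 1)$. Everything named here is an assumption made for illustration and not part of the formalization: the ASCII characters `<` and `>` stand in for the end markers, $\delta$ is encoded as a dictionary from (state, symbol) to a set of (state, direction) pairs, moves that would leave the tape are dropped, and `suffix_2dfa` is a hypothetical two-way automaton for the language $(a+b)^* a (a+b)^{n-1}$.

```python
from itertools import product

LEFT_END, RIGHT_END = "<", ">"  # ASCII stand-ins for the end markers (assumption)

def symbol_at(x, i):
    """Symbol at position i of the extended word: 0 is the left marker, |x|+1 the right one."""
    if i == 0:
        return LEFT_END
    if i == len(x) + 1:
        return RIGHT_END
    return x[i - 1]

def accepts(M, x):
    """Decide x in L(M) by exploring all configurations reachable from (s, 1)."""
    Q, s, F, delta = M
    seen, todo = {(s, 1)}, [(s, 1)]
    while todo:
        p, i = todo.pop()
        for q, d in delta.get((p, symbol_at(x, i)), ()):
            j = i - 1 if d == "L" else i + 1
            # Moves that would leave the extended word are ignored (assumption).
            if 0 <= j <= len(x) + 1 and (q, j) not in seen:
                seen.add((q, j))
                todo.append((q, j))
    return any((q, len(x) + 1) in seen for q in F)

def suffix_2dfa(n):
    """Hypothetical 2DFA: scan right, walk back n letters, check for 'a', scan right again."""
    delta = {("r", RIGHT_END): {(("k", 1), "L")}}
    for c in "ab":
        delta[("r", c)] = {("r", "R")}
        delta[("f", c)] = {("f", "R")}
        for i in range(1, n):
            delta[(("k", i), c)] = {(("k", i + 1), "L")}
    delta[(("k", n), "a")] = {("f", "R")}
    Q = {"r", "f"} | {("k", i) for i in range(1, n + 1)}
    return Q, "r", {"f"}, delta

def words(max_len, alphabet="ab"):
    """All words over the alphabet up to the given length."""
    return ["".join(w) for k in range(max_len + 1) for w in product(alphabet, repeat=k)]
```

For example, `accepts(suffix_2dfa(2), "ab")` explores at most $|Q| \cdot (|x| + 2)$ configurations, which is why the search always terminates.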
Two-Way vs. One-Way

$I_n := (a + b)^* a (a + b)^{n-1}$

automata model    size of minimal automaton
DFA               $O(2^n)$
NFA               $O(n)$
2DFA              $O(n)$

- The cost (in the number of states) of simulating 2DFAs with DFAs is at least exponential.
- Conjecture (Sakoda & Sipser ’78): the cost of simulating NFAs using 2DFAs is exponential.

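The exponential gap for $I_n$ can be observed on small instances. The sketch below, in Python rather than Coq, runs the standard powerset construction on an assumed $(n+1)$-state NFA for $I_n$ and counts the reachable subsets. It only counts the states of the powerset DFA; it does not itself prove that this DFA is minimal. The function names and the state numbering are illustrative choices.

```python
def nfa_for_I(n):
    """Assumed NFA for I_n with states 0..n: guess the marked 'a', then count n-1 more letters."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, n):
        for c in "ab":
            delta[(i, c)] = {i + 1}
    return 0, n, delta  # start state, accepting state, transitions

def reachable_subsets(n):
    """States of the powerset DFA that are reachable from the start subset {0}."""
    start, _final, delta = nfa_for_I(n)
    init = frozenset({start})
    seen, todo = {init}, [init]
    while todo:
        S = todo.pop()
        for c in "ab":
            T = frozenset(q for p in S for q in delta.get((p, c), ()))
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return seen
```

Each reachable subset records which of the last $n$ letters were an `a`, so `len(reachable_subsets(n))` comes out as $2^n$ while the NFA itself has only $n + 1$ states.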
Main Results

1. Vardi ’89: $n$-state 2NFA for $L$ $\leadsto$ NFA for $\overline{L}$ with at most $2^{2n}$ states
2. Shepherdson ’59: $n$-state 2DFA for $L$ $\leadsto$ DFA for $L$ with at most $(n + 1)^{(n + 1)}$ states
3. Shepherdson ’59 ∪ folklore ∪ Vardi ’89: $n$-state 2NFA for $L$ $\leadsto$ DFA for $L$ with at most $2^{n^2 + n}$ states

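To get a feeling for how the three bounds compare, they can be evaluated for small $n$. This is plain arithmetic on the formulas above; the function and key names are made up for this sketch.

```python
def bounds(n):
    """Upper bounds on the number of states for an n-state two-way automaton."""
    return {
        "nfa_for_complement_from_2nfa": 2 ** (2 * n),   # Vardi '89
        "dfa_from_2dfa": (n + 1) ** (n + 1),            # Shepherdson '59
        "dfa_from_2nfa": 2 ** (n * n + n),              # combined result
    }
```

For $n = 2$ this gives 16, 27, and 64 respectively.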
Vardi Construction

Input: 2NFA $M = (Q, s, F, \delta)$
Output: NFA accepting $\overline{L(M)}$

[Figure: an (extended) word $\triangleleft\, b\, b\, a\, a\, a\, b\, \triangleright$ in $\overline{L(M)}$, annotated with sets of states $C_0, \dots, C_7$, one per position, forming a negative certificate.]

Definition (Negative Certificate for $x$): sets $C_0, \dots, C_{|x|+1} \subseteq Q$ such that
1. $s \in C_1$
2. If $p \in C_i$ and $(p, i) \to_x (q, j)$, then $q \in C_j$
3. $F \cap C_{|x|+1} = \emptyset$

Construct an NFA $N$ that accepts the words that have negative certificates.

Vardi Construction

$N := (Q', S', F', \delta')$ where
- $Q' := 2^Q \times 2^Q$
- $S' := \{ (C_0, C_1) \mid s \in C_1,\ \forall p\, q.\ p \in C_0 \land (q, R) \in \delta\, p\, \triangleleft \to q \in C_1 \}$
- $F' := \{ (C, D) \mid F \cap D = \emptyset,\ \forall p\, q.\ p \in D \land (q, L) \in \delta\, p\, \triangleright \to q \in C \}$
- $\delta'$ has a transition $(C, D) \xrightarrow{a} (D', E)$ if
  1. $D = D'$
  2. If $p \in D$ and $(q, L) \in \delta\, p\, a$, then $q \in C$
  3. If $p \in D$ and $(q, R) \in \delta\, p\, a$, then $q \in E$

[Figure: a run of $N$ through the states $(C_0, C_1)$, $(C_1, C_2)$, $(C_2, C_3)$, $(C_3, C_4)$.]

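The construction can be tried out on a tiny automaton. The following Python sketch is an illustration under stated assumptions, not the Coq development: `<` and `>` stand in for the end markers, $\delta$ is a dictionary from (state, symbol) to sets of (state, direction) pairs, moves off the tape are ignored, and `ENDS_IN_A` is a hypothetical three-state automaton for words ending in `a`. `vardi_accepts` simulates $N$ on the fly by tracking all pairs $(C, D)$ that $N$ could currently be in, using $S'$, $\delta'$, and $F'$ as defined above.

```python
from itertools import combinations, product

LEFT_END, RIGHT_END = "<", ">"  # ASCII stand-ins for the end markers (assumption)

def accepts_2nfa(M, x):
    """Reference semantics: is some (q, |x|+1) with q in F reachable from (s, 1)?"""
    Q, s, F, delta = M
    n, tape = len(x), LEFT_END + x + RIGHT_END
    seen, todo = {(s, 1)}, [(s, 1)]
    while todo:
        p, i = todo.pop()
        for q, d in delta.get((p, tape[i]), ()):
            j = i - 1 if d == "L" else i + 1
            if 0 <= j <= n + 1 and (q, j) not in seen:
                seen.add((q, j))
                todo.append((q, j))
    return any((q, n + 1) in seen for q in F)

def powerset(states):
    s = sorted(states)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def closed(delta, sym, D, left, right):
    """Moves from D on sym land in `left` (L) or `right` (R); None skips that direction."""
    for p in D:
        for q, d in delta.get((p, sym), ()):
            target = left if d == "L" else right
            if target is not None and q not in target:
                return False
    return True

def vardi_accepts(M, x):
    """Does the NFA N built from M accept x, i.e. does x have a negative certificate?"""
    Q, s, F, delta = M
    subsets = powerset(Q)
    # S': pairs (C0, C1) with s in C1, closed under right moves on the left marker.
    current = {(C0, C1) for C0 in subsets for C1 in subsets
               if s in C1 and closed(delta, LEFT_END, C0, None, C1)}
    for a in x:
        # delta': (C, D) --a--> (D, E) if moves from D on a land in C or E.
        current = {(D, E) for (C, D) in current for E in subsets
                   if closed(delta, a, D, C, E)}
    # F': no final state in D, closed under left moves on the right marker.
    return any(not (F & D) and closed(delta, RIGHT_END, D, C, None)
               for (C, D) in current)

# Hypothetical example: scan right, step back once, accept if the last letter is 'a'.
ENDS_IN_A = ({"r", "k", "f"}, "r", {"f"}, {
    ("r", "a"): {("r", "R")}, ("r", "b"): {("r", "R")},
    ("r", RIGHT_END): {("k", "L")}, ("k", "a"): {("f", "R")},
})

def words(max_len, alphabet="ab"):
    return ["".join(w) for k in range(max_len + 1) for w in product(alphabet, repeat=k)]
```

On this example, `vardi_accepts(ENDS_IN_A, w)` should hold exactly for the words that `accepts_2nfa` rejects, which matches the intended reading of $N$ as an automaton for the complement language.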
The Need for Runs

Lemma: $x \in L(N)$ iff there exists a negative certificate for $x$.

- Usually: generalize to arbitrary states of $N$ and use induction on $x$
- A direct inductive proof would require a nontrivial generalization

[Figure: certificate sets $C_0, \dots, C_n$ for the words $\triangleleft\, x\, \triangleright$ and $\triangleleft\, x a\, \triangleright$, with the sets that are not yet determined marked “?”.]

- The formalized proof employs an explicit notion of run (≈ 1/3 of the proof)
- Straightforward but tedious calculation

Vardi Result

Theorem: For every $n$-state 2NFA $M$ one can construct an NFA accepting $\overline{L(M)}$ that has at most $2^{2n}$ states.

(Recall: $Q' := 2^Q \times 2^Q$, and $|2^Q| = 2^{|Q|}$ due to the extensional representation.)

Corollary: For every $n$-state 2NFA $M$ there exists a DFA accepting $L(M)$ that has at most $2^{2^{2n}}$ states.

Shepherdson Construction

Input: 2NFA $M = (Q, s, F, \delta)$
Output: DFA accepting $L(M)$ having at most $2^{|Q|^2 + |Q|}$ states

How to collect all the information $M$ can gather about its input in a single sweep?

Tables

$T : \Sigma^* \to 2^Q \times 2^{Q \times Q}$ (a finite type)
- first component: possible states when first entering $z$
- second component: enter/return relation when crossing from the right

[Figure: runs of $M$ on $\triangleleft\, x z\, \triangleright$ and $\triangleleft\, y z\, \triangleright$ crossing the boundary between the prefix and $z$, with states labeled $p$, $q$, $q'$.]

“$T\,x$ abstracts away the first part of the composite word $xy$.”

If $T\,x = T\,y$, then $xz \in L(M)$ iff $yz \in L(M)$.

One-Way Sweep

$T : \Sigma^* \to 2^Q \times 2^{Q \times Q}$

[Figure: a single left-to-right sweep over $\triangleleft\, a\, b\, b\, a\, a\, b\, a\, \triangleright$, computing $T\,\varepsilon$, $T\,a$, $T\,ab$, and so on in turn.]

- $Q' := 2^Q \times 2^{Q \times Q}$
- $s' := T\,\varepsilon$
- $\delta'\,(T\,x)\,a := T\,(xa)$
- $F' := \{ T\,x \mid x \in L(M) \}$

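The table-based DFA can be sketched in Python for experimentation. This is an illustration under assumptions rather than the formalized construction: `<` and `>` stand in for the end markers, $\delta$ is a dictionary from (state, symbol) to sets of (state, direction) pairs, and the table is computed by direct simulation on the prefix instead of incrementally. In particular, `build_dfa` keeps one representative word per reachable table and evaluates $\delta'\,(T\,x)\,a$ as `table(M, x + a)` for that representative, which relies on $T\,(xa)$ depending only on $T\,x$ and $a$. `I2` is a hypothetical four-state 2DFA for words whose second-to-last letter is `a`.

```python
from itertools import product

LEFT_END, RIGHT_END = "<", ">"  # ASCII stand-ins for the end markers (assumption)

def accepts(M, x):
    """Reference semantics: is some (q, |x|+1) with q in F reachable from (s, 1)?"""
    Q, s, F, delta = M
    n, tape = len(x), LEFT_END + x + RIGHT_END
    seen, todo = {(s, 1)}, [(s, 1)]
    while todo:
        p, i = todo.pop()
        for q, d in delta.get((p, tape[i]), ()):
            j = i - 1 if d == "L" else i + 1
            if 0 <= j <= n + 1 and (q, j) not in seen:
                seen.add((q, j))
                todo.append((q, j))
    return any((q, n + 1) in seen for q in F)

def exits(delta, x, start):
    """States in which M can first reach position |x|+1 from `start`, staying on <x before."""
    n, tape = len(x), LEFT_END + x
    seen, todo, out = {start}, [start], set()
    while todo:
        p, i = todo.pop()
        for q, d in delta.get((p, tape[i]), ()):
            j = i - 1 if d == "L" else i + 1
            if j == n + 1:
                out.add(q)
            elif j >= 0 and (q, j) not in seen:
                seen.add((q, j))
                todo.append((q, j))
    return frozenset(out)

def table(M, x):
    """T x: (states when first entering the suffix, enter/return relation from the right)."""
    Q, s, F, delta = M
    n = len(x)
    first = frozenset({s}) if n == 0 else exits(delta, x, (s, 1))
    back = frozenset((p, q) for p in Q for q in exits(delta, x, (p, n)))
    return first, back

def build_dfa(M, alphabet="ab"):
    """Explore the reachable tables, keeping one representative word per table."""
    start = table(M, "")
    rep, trans, todo = {start: ""}, {}, [start]
    while todo:
        t = todo.pop()
        for a in alphabet:
            u = table(M, rep[t] + a)  # delta' (T x) a := T (xa)
            trans[(t, a)] = u
            if u not in rep:
                rep[u] = rep[t] + a
                todo.append(u)
    final = {t for t, x in rep.items() if accepts(M, x)}  # F' := { T x | x in L(M) }
    return start, trans, final

def dfa_accepts(dfa, x):
    state, trans, final = dfa
    for a in x:
        state = trans[(state, a)]
    return state in final

# Hypothetical 2DFA: scan right, step back two letters, check for 'a', scan right again.
I2 = ({"r", "k1", "k2", "f"}, "r", {"f"}, {
    ("r", "a"): {("r", "R")}, ("r", "b"): {("r", "R")}, ("r", RIGHT_END): {("k1", "L")},
    ("k1", "a"): {("k2", "L")}, ("k1", "b"): {("k2", "L")}, ("k2", "a"): {("f", "R")},
    ("f", "a"): {("f", "R")}, ("f", "b"): {("f", "R")},
})

def words(max_len, alphabet="ab"):
    return ["".join(w) for k in range(max_len + 1) for w in product(alphabet, repeat=k)]
```

Running `build_dfa(I2)` yields a DFA whose states are tables; the number of reachable tables is typically far below the worst-case bound $2^{|Q|^2 + |Q|}$.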