Derivation Beyond Regular Languages

Peter Thiemann, Albert-Ludwigs-Universität Freiburg (University of Freiburg), 14 Oct 2018


  1. Properties of Linear Factors
     Definition: $L(\langle a, r \rangle) = \{a\} \cdot L(r)$ and, for a set $\mathcal{L}$ of linear factors, $L(\mathcal{L}) = \bigcup \{ \{a\} \cdot L(r) \mid \langle a, r \rangle \in \mathcal{L} \}$.
     Lemma (Representation): $L(r) = L(N(r)) \cup L(\mathrm{lf}(r))$.
     Thiemann, Derivation Beyond Regular Languages, 14 Oct 2018, slide 18 / 77

  2. Partial Derivatives
     Definition: Let $r \in RE_\Sigma$, $a \in \Sigma$, $w \in \Sigma^*$.
     The partial derivative $\partial_a(r) \subseteq RE_\Sigma$ of $r$ by $a$ is defined by $\partial_a(r) = \{ r' \mid \langle a, r' \rangle \in \mathrm{lf}(r) \}$.
     The partial derivative $\partial_w(r) \subseteq RE_\Sigma$ of $r$ by $w$ is defined inductively by $\partial_\varepsilon(r) = \{r\}$ and $\partial_{aw}(r) = \partial_w(\partial_a(r))$, where $\partial_w$ is applied to each element of a set and the results are united.
     The iterated partial derivative $\partial_{\Sigma^*}(r) \subseteq RE_\Sigma$ is defined by $\partial_{\Sigma^*}(r) = \bigcup \{ \partial_w(r) \mid w \in \Sigma^* \}$.
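
The definition of lf is not part of this excerpt, so the following is only a minimal sketch that computes partial derivatives directly from Antimirov's standard equations rather than via linear factors. The names `Re`, `nullable`, `cat`, `pd` and `pdWord` are assumptions made for illustration, and `cat` drops a leading 1 as a simplification.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- Regular expressions over Char; Zero and One play the roles of 0 and 1.
data Re = Zero | One | Sym Char | Alt Re Re | Seq Re Re | Star Re
  deriving (Eq, Ord, Show)

-- N(r) as a Boolean: does r accept the empty word?
nullable :: Re -> Bool
nullable Zero      = False
nullable One       = True
nullable (Sym _)   = False
nullable (Alt r s) = nullable r || nullable s
nullable (Seq r s) = nullable r && nullable s
nullable (Star _)  = True

-- Append s to every expression of a set, simplifying 1.s to s.
cat :: Set Re -> Re -> Set Re
cat ps s = Set.map (\p -> if p == One then s else Seq p s) ps

-- Partial derivative by a single symbol.
pd :: Char -> Re -> Set Re
pd _ Zero      = Set.empty
pd _ One       = Set.empty
pd a (Sym b)   = if a == b then Set.singleton One else Set.empty
pd a (Alt r s) = pd a r `Set.union` pd a s
pd a (Seq r s) = (pd a r `cat` s) `Set.union` (if nullable r then pd a s else Set.empty)
pd a (Star r)  = pd a r `cat` Star r

-- Partial derivative by a word: start from {r} and derive symbol by symbol.
pdWord :: String -> Re -> Set Re
pdWord w r = foldl step (Set.singleton r) w
  where step rs a = Set.unions (map (pd a) (Set.toList rs))
```

In GHCi, `pdWord "ab" (Seq (Star (Sym 'a')) (Sym 'b'))` can be used to inspect the partial derivative of a*b by the word ab.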

  4. Properties of Linear Factors
     Linear factors automatically perform the simplifications needed to obtain finiteness of iterated derivatives.
     Definition: Let $S(r) \subseteq RE_\Sigma$ be defined by $s \in S(r)$ if $s = s_1 \dots s_n$ such that
       each $s_i$ is a subterm of $r$;
       if $1 \le i < n$, then $s_i$ occurs strictly below $s_{i+1}$ in $r$;
       (if $n = 0$ then $s = 1$).
     Proposition: $S(r)$ is finite.

  8. Finiteness
     Lemma: If $\langle a, s \rangle \in \mathrm{lf}(r)$, then $s \in S(r)$.
     Lemma: If $s \in S(r)$, then $\partial_a(s) \subseteq S(r)$.
     Finiteness Theorem for Partial Derivatives: $\partial_{\Sigma^*}(r) \subseteq S(r)$ and thus finite.
     Theorem (RE → NFA): Let $Q = \partial_{\Sigma^*}(r)$ and $F = \{ s \in Q \mid N(s) = 1 \}$. Then $A = (Q, \Sigma, \partial, \{r\}, F)$ is an NFA with $L(A) = L(r)$.
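
The RE → NFA theorem can be sketched as follows: the state set is obtained by closing {r} under partial derivatives, and a word is accepted if some reachable state is nullable. This is a sketch under assumptions, not the talk's implementation: the partial derivatives follow Antimirov's standard equations, `cat` simplifies 1.s to s, and all names (`states`, `accepts`, and so on) are hypothetical.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

data Re = Zero | One | Sym Char | Alt Re Re | Seq Re Re | Star Re
  deriving (Eq, Ord, Show)

nullable :: Re -> Bool
nullable Zero      = False
nullable One       = True
nullable (Sym _)   = False
nullable (Alt r s) = nullable r || nullable s
nullable (Seq r s) = nullable r && nullable s
nullable (Star _)  = True

-- Append s to every expression of a set, simplifying 1.s to s.
cat :: Set Re -> Re -> Set Re
cat ps s = Set.map (\p -> if p == One then s else Seq p s) ps

-- Partial derivative by one symbol; this is the NFA transition relation.
pd :: Char -> Re -> Set Re
pd _ Zero      = Set.empty
pd _ One       = Set.empty
pd a (Sym b)   = if a == b then Set.singleton One else Set.empty
pd a (Alt r s) = pd a r `Set.union` pd a s
pd a (Seq r s) = (pd a r `cat` s) `Set.union` (if nullable r then pd a s else Set.empty)
pd a (Star r)  = pd a r `cat` Star r

-- Q: close {r} under pd for every symbol of the alphabet sigma.
-- Termination relies on the finiteness theorem.
states :: [Char] -> Re -> Set Re
states sigma r = go (Set.singleton r) [r]
  where
    go seen []       = seen
    go seen (q : qs) =
      let next = Set.unions [pd a q | a <- sigma]
          new  = Set.toList (next `Set.difference` seen)
      in go (seen `Set.union` next) (new ++ qs)

-- Run the NFA on a word: track the set of current states,
-- accept if one of them is final (nullable).
accepts :: Re -> String -> Bool
accepts r w = any nullable (Set.toList (foldl step (Set.singleton r) w))
  where step qs a = Set.unions [pd a q | q <- Set.toList qs]
```

For (a+b)*.a, written `Seq (Star (Alt (Sym 'a') (Sym 'b'))) (Sym 'a')`, `states "ab"` yields the two states of the familiar "ends in a" automaton.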

  9. Alternative Finiteness Proof (Broda et al 2011)
     Definition (Approximate Iterated Derivatives): $\delta^+ : RE_\Sigma \to \mathcal{P}(RE_\Sigma)$
       $\delta^+(0) = \emptyset$
       $\delta^+(1) = \emptyset$
       $\delta^+(a) = \{1\}$
       $\delta^+(r + s) = \delta^+(r) \cup \delta^+(s)$
       $\delta^+(r . s) = \delta^+(r) \odot s \cup \delta^+(s)$
       $\delta^+(r^*) = \delta^+(r) \odot r^*$
     Remark: An over-approximation of the set of iterated derivatives!

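  12. Alternative Finiteness Proof (Broda et al 2011)
      Lemma: For all $r$, $\delta^+(r)$ is finite.
      Lemma: $\partial_a(r) \subseteq \delta^+(r)$, and $s \in \delta^+(r)$ implies $\partial_a(s) \subseteq \delta^+(r)$.
      Finiteness Theorem for Partial Derivatives: $\partial_{\Sigma^*}(r) \subseteq \delta^+(r)$ and thus finite. (Strictly, the lemmas cover derivatives by nonempty words; since $\partial_\varepsilon(r) = \{r\}$, the precise bound is $\partial_{\Sigma^*}(r) \subseteq \{r\} \cup \delta^+(r)$, which is still finite.)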
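
As a sketch of how the over-approximation behaves, the following transcribes the six equations for δ⁺ and checks the two inclusion lemmas on concrete expressions. It assumes Antimirov's standard equations for the partial derivative `pd` and an operator `cat` (standing in for ⊙) that simplifies 1.s to s; the function `lemmasHold` is a hypothetical helper for experimentation, not a proof.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

data Re = Zero | One | Sym Char | Alt Re Re | Seq Re Re | Star Re
  deriving (Eq, Ord, Show)

nullable :: Re -> Bool
nullable Zero      = False
nullable One       = True
nullable (Sym _)   = False
nullable (Alt r s) = nullable r || nullable s
nullable (Seq r s) = nullable r && nullable s
nullable (Star _)  = True

-- Stand-in for the slide's odot: append s, simplifying 1.s to s.
cat :: Set Re -> Re -> Set Re
cat ps s = Set.map (\p -> if p == One then s else Seq p s) ps

pd :: Char -> Re -> Set Re
pd _ Zero      = Set.empty
pd _ One       = Set.empty
pd a (Sym b)   = if a == b then Set.singleton One else Set.empty
pd a (Alt r s) = pd a r `Set.union` pd a s
pd a (Seq r s) = (pd a r `cat` s) `Set.union` (if nullable r then pd a s else Set.empty)
pd a (Star r)  = pd a r `cat` Star r

-- The approximate iterated derivatives, one clause per equation.
deltaPlus :: Re -> Set Re
deltaPlus Zero      = Set.empty
deltaPlus One       = Set.empty
deltaPlus (Sym _)   = Set.singleton One
deltaPlus (Alt r s) = deltaPlus r `Set.union` deltaPlus s
deltaPlus (Seq r s) = (deltaPlus r `cat` s) `Set.union` deltaPlus s
deltaPlus (Star r)  = deltaPlus r `cat` Star r

-- Check, for alphabet sigma, that pd a r and pd a s (for every s in
-- deltaPlus r) are all contained in deltaPlus r.
lemmasHold :: [Char] -> Re -> Bool
lemmasHold sigma r =
  and [pd a s `Set.isSubsetOf` dp | a <- sigma, s <- r : Set.toList dp]
  where dp = deltaPlus r
```

Note that δ⁺ needs no alphabet and no iteration: it is computed in a single structural pass over r.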

  13. Plan
      1 Derivatives for Regular Languages: Brzozowski Derivatives; Antimirov's Partial Derivatives; Applications: Equivalence and Containment; Computational Interpretation
      2 Derivatives for Context-Free Languages: Pragmatic Parsing; Deriving Automata; Computational Interpretation
      3 Derivation for ω-Regular Languages: Attempt #1: Brzozowski-style Derivatives; Attempt #2: Extending Partial Derivatives

  15. Applications
      Definition (Regular Expression Equivalence and Containment): Let $r, s \in RE_\Sigma$.
        Equivalence: $r \sim s$ if $L(r) = L(s)$
        Containment: $r \sqsubseteq s$ if $L(r) \subseteq L(s)$
      Equivalence and Containment are Decidable
        Standard proof for containment $r \sqsubseteq s$: construct a DFA for $L(r) \setminus L(s)$ and check it for emptiness.
        Equivalence and containment are equally powerful: $r \sim s$ iff $r \sqsubseteq s$ and $s \sqsubseteq r$; observe that $r \sqsubseteq s$ iff $r + s \sim s$.

  17. Derivative-based Equivalence Test
      Coinductive Definition of Equivalence:
      $$\frac{N(r) = N(s) \qquad (\forall a \in \Sigma)\ \sum \partial_a(r) \sim \sum \partial_a(s)}{r \sim s}$$
      Remarks:
        Establishes a bisimulation between $r$ and $s$
        Labeled transition system: $r \xrightarrow{a} \sum \partial_a(r)$
        Analogous to equivalence of finite automata

  18. Implementation
      Here nullable, deriv and sigma stand for the slide's N, ∂ and Σ; rel and work are its R and W.

          -- rel: pairs already assumed equivalent; work: pairs still to check
          collect rel [] = True
          collect rel ((r, s) : work) =
            if (r, s) `elem` rel then collect rel work
            else if nullable r == nullable s then
              collect ((r, s) : rel) (map (\a -> (deriv a r, deriv a s)) sigma ++ work)
            else False

          -- start with an empty relation and the single goal (r, s)
          equiv r s = collect [] [(r, s)]

  19. Derivative-based Equivalence Test
      Properties of collect:
        attempts to construct a bisimulation in its first argument R
        returns True iff $r \sim s$ (prove this!)
        termination guaranteed by finiteness of iterated derivatives
        straightforward extension with counterexample on failure

  20. Extension of collect with Counterexample
      Here nullable, deriv and sigma stand for the slide's N, ∂ and Σ; rel and work are its R and W.

          -- each work item carries the word read so far, most recent symbol first
          collect rel [] = Yes
          collect rel ((w, r, s) : work) =
            if (r, s) `elem` rel then collect rel work
            else if nullable r == nullable s then
              collect ((r, s) : rel) (map (\a -> (a : w, deriv a r, deriv a s)) sigma ++ work)
            else No (reverse w)  -- w was built back to front

          equiv r s = collect [] [([], r, s)]
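
A self-contained version of the test with counterexamples might look as follows. It is a sketch under stated assumptions: each side of a pair is kept as a set of expressions standing for the sum of its elements (matching the sum of partial derivatives in the coinductive rule), partial derivatives follow Antimirov's standard equations with 1.s simplified to s, and the alphabet is passed explicitly. The names `Sum`, `derivSum`, `nullSum` and `Result` are hypothetical.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

data Re = Zero | One | Sym Char | Alt Re Re | Seq Re Re | Star Re
  deriving (Eq, Ord, Show)

nullable :: Re -> Bool
nullable Zero      = False
nullable One       = True
nullable (Sym _)   = False
nullable (Alt r s) = nullable r || nullable s
nullable (Seq r s) = nullable r && nullable s
nullable (Star _)  = True

cat :: Set Re -> Re -> Set Re
cat ps s = Set.map (\p -> if p == One then s else Seq p s) ps

pd :: Char -> Re -> Set Re
pd _ Zero      = Set.empty
pd _ One       = Set.empty
pd a (Sym b)   = if a == b then Set.singleton One else Set.empty
pd a (Alt r s) = pd a r `Set.union` pd a s
pd a (Seq r s) = (pd a r `cat` s) `Set.union` (if nullable r then pd a s else Set.empty)
pd a (Star r)  = pd a r `cat` Star r

-- A set of expressions stands for the sum of its elements.
type Sum = Set Re

nullSum :: Sum -> Bool
nullSum = any nullable . Set.toList

derivSum :: Char -> Sum -> Sum
derivSum a rs = Set.unions [pd a r | r <- Set.toList rs]

data Result = Yes | No String deriving (Eq, Show)

-- rel: pairs assumed equivalent; work items carry the word read so far
-- in reverse order.
collect :: [Char] -> Set (Sum, Sum) -> [(String, Sum, Sum)] -> Result
collect _ _ [] = Yes
collect sigma rel ((w, r, s) : work)
  | (r, s) `Set.member` rel = collect sigma rel work
  | nullSum r == nullSum s  =
      collect sigma (Set.insert (r, s) rel)
              ([(a : w, derivSum a r, derivSum a s) | a <- sigma] ++ work)
  | otherwise               = No (reverse w)

equiv :: [Char] -> Re -> Re -> Result
equiv sigma r s = collect sigma Set.empty [("", Set.singleton r, Set.singleton s)]
```

The search is depth-first, so a returned counterexample distinguishes the two expressions but need not be the shortest such word.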

  22. Derivative-Based Containment
      Coinductive Definition of Containment (cf. Antimirov):
      $$\frac{N(r) \Rightarrow N(s) \qquad (\forall a \in \Sigma)\ \sum \partial_a(r) \sqsubseteq \sum \partial_a(s)}{r \sqsubseteq s}$$
      Remarks:
        Establishes a simulation between $r$ and $s$
        Implementation very similar to equivalence
        Finiteness follows for the same reason

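  23. Example
      Show that $1 + a^* b \sqsubseteq a^* b^*$.
        $1 + a^* b \sqsubseteq a^* b^*$ holds because $1 \Rightarrow 1$ (nullability), $a^* b \sqsubseteq a^* b^*$ (derivative by a) and $1 \sqsubseteq b^*$ (derivative by b).
        $a^* b \sqsubseteq a^* b^*$ holds because $0 \Rightarrow 1$, $a^* b \sqsubseteq a^* b^*$ (derivative by a, closed by coinduction) and $1 \sqsubseteq b^*$ (derivative by b).
        $1 \sqsubseteq b^*$ holds because $1 \Rightarrow 1$.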

  26. Regular Expressions Are Types
      Parse Trees Are Values:
        ⊢_r Eps : 1
        ⊢_r Sym a : a
        ⊢_r Empty : r*
        ⊢_r Cons p q : r*, if ⊢_r p : r and ⊢_r q : r*
        ⊢_r Seq p q : r.s, if ⊢_r p : r and ⊢_r q : s
        ⊢_r Inl p : r + s, if ⊢_r p : r
        ⊢_r Inr q : r + s, if ⊢_r q : s
      Relations Between Types Yield Functions on Parse Trees:
        Equivalence proof r ∼ s ⇒ isomorphism r ↔ s
        Containment proof r ⊑ s ⇒ embedding r → s

  30. Example (cont)
      Embedding for 1 + a*b ⊑ a*b*, mapping parse trees of 1 + a*b to parse trees of a*b*:
        Inl Eps ↦ Seq Empty Empty
        Inr (Seq Empty (Sym b)) ↦ Seq Empty (Cons (Sym b) Empty)
        Inr (Seq (Cons (Sym a) Empty) (Sym b)) ↦ Seq (Cons (Sym a) Empty) (Cons (Sym b) Empty)
      Goal: Derive functions between parse trees that witness containment
        1 Embedding and Projection from Derivation
        2 Coinductive Embedding

  33. Step 1: Embedding and Projection from Derivation
      Derivation of r by a yields a sublanguage of r: {a} · ⟦d(a, r)⟧ ⊆ ⟦r⟧
      Hence, there is an embedding a.d(a, r) → r and a projection r → a.d(a, r).
      Reformulate derivation to construct "embedding" and "projection" functions (leaving the a argument implicit):
        ⊢_a t : d(a, r) → r
        ⊢_a^- t : r → d(a, r)

  34. Auxiliary: Parse Tree for Empty Word
      Every nullable expression has a parse tree for the empty word: mkE(·) : {r | N(r)} → r
        mkE(1) = Eps
        mkE(r + s) = Inl mkE(r), if N(r)
        mkE(r + s) = Inr mkE(s), if N(s)
        mkE(r*) = Empty
        mkE(r.s) = Seq mkE(r) mkE(s)

  35. Auxiliary: Get First Symbol from Parse Tree
      fi(·) : r → Maybe Σ
        fi(p) = Nothing: p is a parse tree for the empty word
        fi(p) = Just a: a ∈ Σ is the first symbol of the parsed word
      Rules:
        fi(Eps) = Nothing
        fi(Sym a) = Just a
        fi(Inl p) = m, if fi(p) = m
        fi(Inr p) = m, if fi(p) = m
        fi(Seq p q) = Just a, if fi(p) = Just a
        fi(Seq p q) = m, if fi(p) = Nothing and fi(q) = m
        fi(Empty) = Nothing
        fi(Cons p q) = Just a, if fi(p) = Just a
        fi(Cons p q) = m, if fi(p) = Nothing and fi(q) = m
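
The two auxiliary functions translate almost directly into code. The sketch below makes two simplifying assumptions: parse trees are untyped (the slides index them by regular expression), so `mkE` returns `Maybe Tree` and yields `Nothing` for non-nullable expressions; and the regular-expression constructors carry an `R` prefix (a naming choice made here) so that the tree constructors can keep the slides' names.

```haskell
-- Regular expressions; R0 and R1 are the slides' 0 and 1.
data Re = R0 | R1 | RSym Char | RAlt Re Re | RSeq Re Re | RStar Re
  deriving (Eq, Show)

-- Untyped parse trees, constructor names as on the slides.
data Tree = Eps | Sym Char | Inl Tree | Inr Tree
          | Seq Tree Tree | Empty | Cons Tree Tree
  deriving (Eq, Show)

nullable :: Re -> Bool
nullable R0         = False
nullable R1         = True
nullable (RSym _)   = False
nullable (RAlt r s) = nullable r || nullable s
nullable (RSeq r s) = nullable r && nullable s
nullable (RStar _)  = True

-- A parse tree for the empty word, if the expression is nullable.
-- For r + s the left alternative is preferred when both are nullable.
mkE :: Re -> Maybe Tree
mkE R1 = Just Eps
mkE (RAlt r s)
  | nullable r = Inl <$> mkE r
  | nullable s = Inr <$> mkE s
mkE (RSeq r s) = Seq <$> mkE r <*> mkE s
mkE (RStar _)  = Just Empty
mkE _          = Nothing

-- First symbol of the word parsed by a tree, Nothing for the empty word.
fi :: Tree -> Maybe Char
fi Eps        = Nothing
fi (Sym a)    = Just a
fi (Inl p)    = fi p
fi (Inr p)    = fi p
fi (Seq p q)  = case fi p of
                  Just a  -> Just a
                  Nothing -> fi q
fi Empty      = Nothing
fi (Cons p q) = case fi p of
                  Just a  -> Just a
                  Nothing -> fi q
```

For a*.b*, `mkE` produces `Seq Empty Empty`, the same tree that appears as the image of `Inl Eps` in the running example.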

  36. Embedding and Projection from Derivation
      d(a, 0) = 0
        ⊢_a λy. abort y : 0 → 0
        ⊢_a^- λy. abort y : 0 → 0
      d(a, 1) = 0
        ⊢_a λy. abort y : 0 → 1
        ⊢_a^- λ() : 1 → 0
      d(a, b) = if a = b then 1 else 0
        ⊢_a λy. Sym a : 1 → a
        ⊢_a λy. abort y : 0 → b
        ⊢_a^- λ(Sym a). Eps : a → 1
        ⊢_a^- λ() : b → 0

  37. Embedding and Projection from Derivation (2)
      d(a, r + s) = d(a, r) + d(a, s)
      Embedding: if ⊢_a t_r : r' → r and ⊢_a t_s : s' → s, then
        ⊢_a λy. case y of Inl p → Inl (t_r p); Inr q → Inr (t_s q) : r' + s' → r + s
      Projection: if ⊢_a^- t_r : r → r' and ⊢_a^- t_s : s → s', then
        ⊢_a^- λy. case y of Inl p → Inl (t_r p); Inr q → Inr (t_s q) : r + s → r' + s'

  38. Embedding and Projection from Derivation (3)
      d(a, r.s) = d(a, r).s + N(r).d(a, s)
      Embedding: if ⊢_a t_r : r' → r and ⊢_a t_s : s' → s, then
        ⊢_a λy. case y of Inl (Seq p q) → Seq (t_r p) q; Inr (Seq Eps q) → Seq mkE(r) (t_s q) : r'.s + N(r).s' → r.s
      Projection: if ⊢_a^- t_r : r → r' and ⊢_a^- t_s : s → s', then
        ⊢_a^- λ(Seq p q). case fi(p) of Just a → Inl (Seq (t_r p) q); Nothing → Inr (Seq Eps (t_s q)) : r.s → r'.s + N(r).s'

  39. Embedding and Projection from Derivation (4)
      d(a, r*) = d(a, r).r*
      Embedding: if ⊢_a t_r : r' → r, then
        ⊢_a λ(Seq p q). Cons (t_r p) q : r'.r* → r*
      Projection: if ⊢_a^- t_r : r → r', then
        ⊢_a^- λy. case y of Cons p q → Seq (t_r p) q : r* → r'.r*
      Assumes that parse trees for r* are reduced:
        no unproductive iteration steps, even if N(r)
        Cons p q implies that fi(p) = Just a

  40. Example (cont'd)
      Embedding from Derivations
      d(a, a*.b*) = (1.a*).b* + 1.0 = (1.a*).b*
        u_a = λ(Seq (Seq Eps p) q). Seq (Cons (Sym a) p) q : (1.a*).b* → a*.b*
        t_a^- = λ(Seq (Cons (Sym a) p) q). Seq (Seq Eps p) q : a*.b* → (1.a*).b*
      d(b, a*.b*) = 0.b* + 1.b* = 1.b*
        u_b = λ(Seq Eps q). Seq Empty (Cons (Sym b) q) : 1.b* → a*.b*
        t_b^- = λ(Seq Empty (Cons (Sym b) q)). Seq Eps q : a*.b* → 1.b*

  41. Step 2: Constructive Embeddings
      Coinductive Definition of Embedding
        If N(s), then λEps. mkE(s) : 1 ⊑ s
        If ⊢_a t : s' → s and N(s'), then λ(Sym a). t (mkE(s')) : a ⊑ s
        If t_1 : r_1 ⊑ s and t_2 : r_2 ⊑ s, then λy. case y of Inl p → t_1 p | Inr q → t_2 q : r_1 + r_2 ⊑ s (the result is a parse tree of s, so the branches return t_1 p and t_2 q without re-wrapping them in Inl or Inr)

  42. Constructing the Embedding (2)
      Coinductive Definition of Embedding (2)
      If ⊢_a u_a : s' → s, ⊢_a^- t_a^- : r_1 → r_1', t_1 : r_1'.r_2 ⊑ s' and t_2 : r_2 ⊑ s, then
        λ(Seq p_1 p_2). if fi(p_1) = Nothing then (t_2 p_2) else (fi(p_1) = Just a) u_a (t_1 (Seq (t_a^- p_1) p_2)) : r_1.r_2 ⊑ s

  43. Constructing the Embedding (3)
      Coinductive Definition of Embedding (3)
      If ⊢_a u_a : s' → s, ⊢_a^- t_a^- : r → r', t_1 : r'.r* ⊑ s' and t_2 : r* ⊑ s, then
        λy. case y of Empty → mkE(s); Cons p q → if fi(p) = Nothing then (t_2 q) else (fi(p) = Just a) u_a (t_1 (Seq (t_a^- p) q)) : r* ⊑ s

  44. Example
      Embedding for 1 + a*b ⊑ a*b*
        t = λy. case y of Inl p → t_1 p | Inr q → t_2 q
        t_1 = λEps. mkE(a*b*) : 1 ⊑ a*b*
        t_2 = λ(Seq p_1 p_2). case fi(p_1) of Nothing → (t_22 p_2); Just a → u_a (t_21a (Seq (t_a^- p_1) p_2)); Just b → u_b (t_21b (Seq (t_b^- p_1) p_2)) : a*.b ⊑ a*.b*
        t_21a = λ(Seq p (Sym b)). Seq p (Cons (Sym b) Empty) : (1.a*).b ⊑ (1.a*).b*
        t_21b = λEps. mkE(1.b*) : 1 ⊑ 1.b*
        t_22 = λ(Sym b). t_22b (mkE(1.b*)) : b ⊑ a*.b*
        ⊢_b t_22b = λ(Seq Eps p). Seq Empty (Cons (Sym b) p) : 1.b* → a*.b*

  45. Embedding (continued)
      Embeddings From Derivation
        ⊢_a u_a = λ(Seq (Seq Eps p) q). Seq (Cons (Sym a) p) q : (1.a*).b* → a*.b*
        ⊢_a^- t_a^- = λ(Seq (Cons (Sym a) p) q). Seq (Seq Eps p) q : a*.b* → (1.a*).b*
        ⊢_b u_b = λ(Seq Eps q). Seq Empty (Cons (Sym b) q) : 1.b* → a*.b*
        ⊢_b^- t_b^- = λ(Seq Empty (Cons (Sym b) q)). Seq Eps q : a*.b* → 1.b*

  47. Context-Free Grammars
      Context-Free Grammar (CFG) $G = (N, \Sigma, P, S)$
        $N$: a finite set of nonterminal symbols
        $\Sigma$: an alphabet (terminal symbols)
        $P : N \to \mathcal{P}_{\mathrm{fin}}((N \cup \Sigma)^*)$: a (finite) set of productions
        $S \in N$: distinguished start symbol

  50. Derivative of Nonterminal A by a
      Algorithm: Let $A \in N$ and $a \in \Sigma$. Let $W = \{A\}$.
        Choose and remove $A$ from $W$. Exhaustively:
          If $A \to \alpha a \beta \in P$ and $N(\alpha)$, then add $A_a \to \beta$ to $P$.
          If $A \to \alpha B \beta \in P$ and $N(\alpha)$, then add $A_a \to B_a \beta$ to $P$; if $B_a \notin N$, then $W = W \cup \{B\}$.
        Repeat until $W = \emptyset$.
      Remark:
        Terminates with $L(A_a) = a^{-1} L(A)$ (prove this!)
        Iterated derivative does not always terminate
        Termination can be improved by mapping $A_a$ to an existing nonterminal with equivalent productions

  51. Pragmatic Parsing
      Algorithm: To parse $w$, compute $S_w$ and test $N(S_w)$.
      Remark:
        Effective procedure
        Basis for recent work on parsing with derivatives by Adams, Darais, Hollenbeck, Might, Spiewak
        Space leak for many grammars

  53. Pragmatic Parsing
      Example: CFG for $a^* b$ (right recursive): $S \to b$, $S \to aS$
        Derivatives: $S_a \to S$, $S_b \to \varepsilon$
        Iterated derivation terminates: $S_{aa} \approx S_a$, $S_{ab} \approx S_b$
      Example: CFG for $a^* b$ (left recursive): $S \to Ab$, $A \to \varepsilon$, $A \to Aa$
        Derivatives: $S_a \to A_a b$, $A_a \to A_a a$, $A_a \to \varepsilon$, $S_b \to \varepsilon$
        Iterated derivation terminates: $A_a \approx A \Rightarrow S_a \approx S$

  54. Pragmatic Parsing: Context-Free Example
      CFG for $a^n b^n$ (a linear language): $S \to \varepsilon$, $S \to aSb$
        Derivatives: $S_a \to Sb$, nothing for $S_b$
        Iterated derivatives: $S_{aa} \to S_a b$, $S_{ab} \to \varepsilon$
        Iterated derivatives: $S_{aaa} \to S_{aa} b$, $S_{aab} \to S_{ab} b$, etc.

  55. A Context-Free Derivative Construction [Winter, Bonsangue, Rutten 2011]
      Every CFL has a grammar in Greibach Normal Form (GNF), i.e. every production has the form $A_0 \to a A_1 \dots A_n$
      ⇒ Nonterminal $A_0$ has derivative $A_1 \dots A_n$ (by $a$)
      Drawback: Requires grammar in GNF
      Every CFG of size $n$ can be converted to a CFG in GNF of size $O(n^4)$
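
With a grammar in GNF, the derivative of a sentential form by a symbol only rewrites its leading nonterminal, which gives a very short recognizer. The sketch below is an illustration under assumptions, not the construction from the cited paper: nonterminals are strings, the grammar has no ε-productions (so a form is nullable only when it is empty), and the example grammar `anbn` for a^n b^n with n ≥ 1 is made up for this purpose.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type NT = String

-- A GNF grammar: each nonterminal maps to its productions a A1 ... An,
-- stored as (a, [A1, ..., An]).
type Gnf = Map.Map NT [(Char, [NT])]

-- Derivative of a sentential form by a: for every production
-- A0 -> a A1 ... An of the leading nonterminal A0, replace A0 by A1 ... An.
deriv :: Gnf -> Char -> [NT] -> [[NT]]
deriv _ _ []         = []
deriv g a (n : rest) =
  [body ++ rest | (b, body) <- Map.findWithDefault [] n g, a == b]

-- Derive the start symbol by each input symbol in turn; the word is
-- accepted if the empty sentential form is among the results.
accepts :: Gnf -> NT -> String -> Bool
accepts g start w = [] `Set.member` foldl step (Set.singleton [start]) w
  where step forms a = Set.fromList (concatMap (deriv g a) (Set.toList forms))

-- Hypothetical example: S -> a S B | a B,  B -> b
anbn :: Gnf
anbn = Map.fromList
  [ ("S", [('a', ["S", "B"]), ('a', ["B"])])
  , ("B", [('b', [])])
  ]
```

The set of sentential forms can grow with the input, which reflects that iterated derivatives of context-free languages need not be finite in number.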

  59. ω-Languages
      Let $\omega$ denote the set of natural numbers.
      $\Sigma^\omega$: the set of maps $\omega \to \Sigma$, i.e. the set of infinite sequences over $\Sigma$, the set of infinite $\Sigma$-words
      $\mathcal{P}(\Sigma^\omega)$: the set of ω-languages over $\Sigma$
      Concatenation is only defined for $u \in \Sigma^*$ and $v \in \Sigma^\omega$.

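  60. Operations on ω-Languages
      Product: Let $U \subseteq \Sigma^*$ and $V \subseteq \Sigma^\omega$. Then $U \cdot V = \{ u \cdot v \mid u \in U, v \in V \}$.
      Iteration: Let $U \subseteq \Sigma^*$ with $\varepsilon \notin U$. Then $U^\omega = \{ u_1 u_2 \cdots \mid u_i \in U \}$, which satisfies $U^\omega = U \cdot U^\omega$.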

  63. ω-Regular Languages
      ω-regular expressions: $R^\omega_\Sigma \ni \alpha, \beta ::= 0 \mid \alpha + \beta \mid r.\alpha \mid s^\omega$, where $r, s \in R_\Sigma$ and $N(s) = 0$.
      Semantics:
        $[\![ 0 ]\!] = \emptyset$
        $[\![ \alpha + \beta ]\!] = [\![ \alpha ]\!] \cup [\![ \beta ]\!]$
        $[\![ r.\alpha ]\!] = [\![ r ]\!] \cdot [\![ \alpha ]\!]$
        $[\![ s^\omega ]\!] = [\![ s ]\!]^\omega$
      Definition: An ω-language $V$ is regular if there is an ω-regular expression $\alpha$ such that $V = [\![ \alpha ]\!]$.

  64. ω-Automata
      A nondeterministic Büchi automaton (NBA) $M = (Q, \Sigma, I, \delta, F)$ consists of
        $Q$: finite set of states
        $\Sigma$: alphabet
        $I \subseteq Q$: initial states
        $\delta : Q \times \Sigma \to \mathcal{P}(Q)$: transition function
        $F \subseteq Q$: accepting states
      $M$ is deterministic (DBA) if $|I| = 1$ and, for all $q$, $a$, $|\delta(q, a)| = 1$.

  65. Runs and Languages
      Run: A run of $M$ on $a_1 a_2 \cdots \in \Sigma^\omega$ is a sequence $q_0, q_1, \dots \in Q^\omega$ such that $q_0 \in I$ and, for all $i \in \omega$, $q_{i+1} \in \delta(q_i, a_{i+1})$.
      The run is (Büchi-) accepting if $q_i \in F$ for infinitely many $i \in \omega$.
      Language of Automaton $M$: $L(M) = \{ w \in \Sigma^\omega \mid M \text{ has an accepting run on } w \}$
      Theorem: An ω-language is regular iff it is accepted by an NBA.

  67. Derivatives for ω-regular expressions: A Naive Attempt
      Derivatives (Attempt #1):
        $d_a(0) = 0$
        $d_a(\alpha + \beta) = d_a(\alpha) \oplus d_a(\beta)$
        $d_a(r.\alpha) = (d_a(r) \odot \alpha) \oplus (d_a(\alpha) \odot N(r))$
        $d_a(s^\omega) = d_a(s) \odot s^\omega$
      where $\oplus$, $\odot$ are "clever" constructors that normalize modulo $\approx$.
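
The four equations can be tried out with a small sketch. The slides do not spell out the "clever" constructors, so the normalization used here is an assumption: sums are flattened into a set of summands (associativity, commutativity, idempotence, unit 0), and products simplify 0 and 1. The term $d_a(\alpha) \odot N(r)$ is implemented as "include $d_a(\alpha)$ exactly when r is nullable". All names are hypothetical.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

data Re = Zero | One | Sym Char | Alt Re Re | Seq Re Re | Star Re
  deriving (Eq, Ord, Show)

-- omega-regular expressions: 0, alpha + beta, r.alpha, s^omega
data Om = OZero | OAlt Om Om | OSeq Re Om | Omega Re
  deriving (Eq, Ord, Show)

nullable :: Re -> Bool
nullable Zero      = False
nullable One       = True
nullable (Sym _)   = False
nullable (Alt r s) = nullable r || nullable s
nullable (Seq r s) = nullable r && nullable s
nullable (Star _)  = True

-- Simplifying constructors for ordinary regular expressions.
alt :: Re -> Re -> Re
alt Zero s = s
alt r Zero = r
alt r s    = if r == s then r else Alt r s

cat :: Re -> Re -> Re
cat Zero _ = Zero
cat _ Zero = Zero
cat One s  = s
cat r One  = r
cat r s    = Seq r s

-- Brzozowski derivative on regular expressions.
d :: Char -> Re -> Re
d _ Zero      = Zero
d _ One       = Zero
d a (Sym b)   = if a == b then One else Zero
d a (Alt r s) = alt (d a r) (d a s)
d a (Seq r s) = alt (cat (d a r) s) (if nullable r then d a s else Zero)
d a (Star r)  = cat (d a r) (Star r)

-- Assumed normalization for sums: collect the set of summands.
summands :: Om -> Set Om
summands OZero      = Set.empty
summands (OAlt x y) = summands x `Set.union` summands y
summands x          = Set.singleton x

oplus :: Om -> Om -> Om
oplus x y = case Set.toList (summands x `Set.union` summands y) of
  [] -> OZero
  zs -> foldr1 OAlt zs

odot :: Re -> Om -> Om
odot Zero _  = OZero
odot _ OZero = OZero
odot One x   = x
odot r x     = OSeq r x

-- Attempt #1: derivative of an omega-regular expression.
dOm :: Char -> Om -> Om
dOm _ OZero      = OZero
dOm a (OAlt x y) = dOm a x `oplus` dOm a y
dOm a (OSeq r x) = (d a r `odot` x) `oplus` (if nullable r then dOm a x else OZero)
dOm a (Omega s)  = d a s `odot` Omega s
```

Each expression has exactly one derivative per symbol, so any automaton built from `dOm` in the style of the regular case is deterministic by construction.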

  69. Properties of Brzozowski-style ω-Derivatives
      Quotient: $L_\omega(d_a(\alpha)) = a^{-1} L_\omega(\alpha)$
      Representation: $L_\omega(\alpha) = \bigcup_{a \in \Sigma} \{a\} \cdot L_\omega(d_a(\alpha))$
      Finiteness: For each $\alpha \in R^\omega_\Sigma$, the set $d_{\Sigma^*}(\alpha) / {\approx}$ is finite. ($\approx$ is a similarity relation on $R^\omega_\Sigma$.)
      ω-RE → ???: The analogous automaton construction does not work out because it yields deterministic automata.

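  70. Counterexample
      Example for DBA $\subsetneq$ NBA: finitely many 'a's, $\alpha = (a + b)^* . b^\omega$
      Wrong automaton constructed by naïve extension:
      [Figure: automaton with states $\alpha = (a + b)^* . b^\omega$ (start) and $\beta = \alpha + b^\omega$; transitions, as given by $d_a$ and $d_b$: $\alpha \xrightarrow{a} \alpha$, $\alpha \xrightarrow{b} \beta$, $\beta \xrightarrow{b} \beta$, $\beta \xrightarrow{a} \alpha$]
      deterministic! accepting states?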

  73. Insight
      Construction for nondeterministic automaton needed!
      Alternative: Construction based on Antimirov's partial derivatives
