nested word automata
Nested Word Automata, Jens Stimpfle, 30.6.2014


  1. Nested Word Automata Jens Stimpfle 30.6.2014

  2. Nested Words

  5. Nested Words
     ◮ Theoretically and practically pleasant model for representing data that has both:
       ◮ a linear ordering
       ◮ a hierarchically nested matching
     ◮ Applications in software verification and document processing
     ◮ This is the last list item

  6. Structure of this talk
     1. Motivation
     2. Nested words
     3. Nested word automata

  7. Section 1: Motivation

  8. Subsection 1: Data with both linear ordering and hierarchically nested matching
     1. Document trees (e.g. HTML)
     2. Executions of structured programs (with call-return semantics)

  9. Document trees (e.g. HTML)
     html
       head
         title
           "Hello"
       body
         h1
           "Hello"
         p
           "Hello, World!"

  10. Executions of structured programs (with call-return semantics)
      main()
        countToZero(1)
          printLn("1")
          countToZero(0)
            printLn("0")

  11. Subsection 2: Formal Languages
      ◮ Regular Languages
      ◮ Context-Free Languages

  13. Regular Languages
      Regular language over an alphabet Σ
      ◮ Most easily explained as generated by a regular expression (RE)
      ◮ Example RE: 0|[123456789][0123456789]*
      ◮ Typical implementation: DFA (Deterministic Finite Automaton)
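To make the link between the example RE and a DFA concrete, here is a minimal hand-built sketch of an automaton for 0|[123456789][0123456789]*. The state names and the helper functions `step` and `accepts` are illustrative assumptions, not taken from the slides.

```python
# States of a hand-built DFA for the regular expression 0|[1-9][0-9]*.
# State names are illustrative and do not come from the slides.
START, ZERO, NUMBER, DEAD = "start", "zero", "number", "dead"
ACCEPTING = {ZERO, NUMBER}


def step(state: str, ch: str) -> str:
    """Transition function: each (state, character) pair has exactly one successor."""
    if state == START:
        if ch == "0":
            return ZERO
        if ch in "123456789":
            return NUMBER
    elif state == NUMBER and ch in "0123456789":
        return NUMBER
    # ZERO has no outgoing edges to accepting states; DEAD is a trap state.
    return DEAD


def accepts(word: str) -> bool:
    """Run the DFA over the word and check whether it ends in an accepting state."""
    state = START
    for ch in word:
        state = step(state, ch)
    return state in ACCEPTING
```

The automaton needs only a fixed, finite set of states, which is why it cannot keep track of an unbounded nesting depth.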

  14. “Problems” with Regular Languages
      ◮ Cannot express arbitrarily deep nesting

  18. Context-free Languages
      Context-free language over Σ
      ◮ Superset of Regular Languages
      ◮ Most easily explained as generated by a Context-free Grammar (CFG)
        ◮ terminal symbols Σ and non-terminal symbols V
        ◮ start symbol S ∈ V
        ◮ Productions ⊆ V × (V ∪ Σ)*
      ◮ Example for real world usage:
          HTML    : "<html>" BODY "</html>"
          BODY    : "<body>" CONTENT "</body>"
          CONTENT : "Hello, world!" | "Hallo, Welt!"
      ◮ Typical implementation: Pushdown Automaton
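The stack discipline of a pushdown automaton can be sketched as follows. This is only an illustration under assumptions: the input is taken to be already split into tokens, the function name `tags_balanced` is hypothetical, and the check covers tag nesting in general rather than the specific grammar on the slide.

```python
def tags_balanced(tokens: list[str]) -> bool:
    """Pushdown-style check: push on an opening tag, pop and compare on a closing tag."""
    stack: list[str] = []
    for tok in tokens:
        if tok.startswith("</") and tok.endswith(">"):
            name = tok[2:-1]
            # A closing tag must match the most recently opened tag.
            if not stack or stack.pop() != name:
                return False
        elif tok.startswith("<") and tok.endswith(">"):
            stack.append(tok[1:-1])
        # Any other token is treated as text content and ignored.
    # All opened tags must have been closed.
    return not stack
```

For example, `tags_balanced(["<html>", "<body>", "Hello, world!", "</body>", "</html>"])` holds, while replacing `"</body>"` with `"</html>"` makes the check fail. The stack is what lets the recognizer handle nesting of unbounded depth.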

  21. “Problems” with Context-free Languages
      ◮ Not closed under intersection
      ◮ Not closed under complementation
      ◮ Not closed under difference
      ◮ Inclusion is undecidable
      ◮ Equivalence is undecidable
      ◮ Not determinizable (Deterministic Context-free languages are a strict subset of Context-free languages)

  22. Nested words
      ◮ Nested words were constructed to overcome the limitations of Context-free and Regular languages
      ◮ The class of nested word languages lies properly between deterministic context-free languages and Regular languages
      [Diagram: Context-free languages ⊋ Deterministic context-free languages ⊋ Nested word languages ⊋ Regular languages]

  23. Section 2 Nested words

  24. Nested words are ordinary words with extra information: The nesting structure is explicitly contained in the input. ⇒ automata for nested words need not parse the nesting.

  25. Definition: Nested word
      ◮ Later!
      ◮ For now: well-matched nested words

  28. Definition: Well-matched nested word
      A well-matched nested word over an alphabet $\Sigma$ is a pair $(a_1 \dots a_n, \rightsquigarrow)$
      ◮ $a_1 \dots a_n \in \Sigma^*$ is a word over $\Sigma$
      ◮ The matching $\rightsquigarrow$ matches “start tags” with their “end tags”:
        ◮ $\rightsquigarrow \subseteq [1..n] \times [1..n]$
        ◮ Given elements $(i, j) \neq (k, l)$ of $\rightsquigarrow$, named so that $i \le k$, either $i < j < k < l$ (one after the other) or $i < k < l < j$ (one inside the other)
      For $(i, j) \in \rightsquigarrow$, $i$ is a call position and $j$ is a return position
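The conditions on the matching can be checked mechanically. The sketch below assumes the matching is given as a set of 1-based position pairs; the function name `is_well_matched` is hypothetical. It also checks that each call position lies before its return position, which the ordering conditions imply.

```python
from itertools import combinations


def is_well_matched(n: int, matching: set[tuple[int, int]]) -> bool:
    """Check the conditions on a matching over the positions 1..n."""
    # Every pair must lie within [1..n], with the call before the return.
    for i, j in matching:
        if not (1 <= i < j <= n):
            return False
    # sorted() orders the pairs by call position, so i <= k holds for each
    # combination and only two arrangements are allowed.
    for (i, j), (k, l) in combinations(sorted(matching), 2):
        one_after_the_other = i < j < k < l
        one_inside_the_other = i < k < l < j
        if not (one_after_the_other or one_inside_the_other):
            return False
    return True
```

For instance, `{(1, 6), (2, 3), (4, 5)}` over six positions passes, while the crossing pairs `{(1, 4), (2, 6)}` do not.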

  29. Well-matched: N E S T E D [figure not preserved]

  30. Not well-matched: N E S T E D [figure not preserved]

  31. Not well-matched: N E S T E D [figure not preserved]

  32. Example: Simple HTML tree HTML HEAD /HEAD BODY "Hello, world" /BODY /HTML

  33. Example: Simple HTML tree HTML /HTML HEAD /HEAD BODY /BODY "Hello, world"

  34. Example: Process trace main() countDown(1) print(1) (print) countDown(0) print(0) (print) (countDown) (countDown) (main)

  35. Example: Process trace main() (main) countDown(1) (countDown) print(1) (print) countDown(0) (countDown) print(0) (print)
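The process trace example can be turned into an explicit pair of word and matching. This is a minimal sketch under assumptions: each position is tagged as `"call"`, `"internal"` or `"return"`, the helper name `matching_from_kinds` is hypothetical, and the convention that `f(...)` marks a call and `(f)` a return is read off the example, not stated on the slides.

```python
def matching_from_kinds(kinds: list[str]) -> set[tuple[int, int]]:
    """Recover the matching from per-position tags 'call', 'internal', 'return'."""
    stack: list[int] = []
    matching: set[tuple[int, int]] = set()
    # Positions are 1-based, as in the definition of nested words.
    for pos, kind in enumerate(kinds, start=1):
        if kind == "call":
            stack.append(pos)
        elif kind == "return":
            if not stack:
                raise ValueError(f"unmatched return at position {pos}")
            matching.add((stack.pop(), pos))
    if stack:
        raise ValueError(f"unmatched calls at positions {stack}")
    return matching


# Trace from the process example: calls are written f(...), returns (f).
word = ["main()", "countDown(1)", "print(1)", "(print)", "countDown(0)",
        "print(0)", "(print)", "(countDown)", "(countDown)", "(main)"]
kinds = ["return" if symbol.startswith("(") else "call" for symbol in word]
matching = matching_from_kinds(kinds)
```

Here `matching` contains the pairs (1, 10), (2, 9), (3, 4), (5, 8) and (6, 7): `main()` at position 1 is matched with `(main)` at position 10, and so on inward.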

  36. Section 3 Nested Word Automata (NWA)

  38. A Nested Word Automaton takes a nested word as input and (as automata do) accepts or rejects it. Nested word automata have much of the power of Pushdown Automata, but can take advantage of the fact that their inputs carry a “pre-parsed” hierarchical structure.

  41. Definition: Deterministic Nested word automaton
      A deterministic nested word automaton (DNWA) over an alphabet $\Sigma$ is a structure
      $(Q, Q_0, Q_f, P, P_0, P_f, \delta_c, \delta_i, \delta_r)$
      ◮ $Q, Q_0, Q_f$: linear states, initial, accepting
      ◮ $P, P_0, P_f$: hierarchical states, initial, accepting
      ◮ $\delta_c, \delta_i, \delta_r$: transitions: call, internal, return
      where $Q$ and $P$ are sets of symbols, $Q_0 \in Q$, $P_0 \in P$, $Q_f \subseteq Q$, $P_f \subseteq P$, and the three $\delta$ are transition functions
      ◮ $\delta_c : \Sigma \times Q \to Q \times P$
      ◮ $\delta_i : \Sigma \times Q \to Q$
      ◮ $\delta_r : \Sigma \times Q \times P \to Q$
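One way to hold the components of this definition in code is a plain record. The sketch below is an assumption-laden illustration: the class name `DNWA`, the method `is_consistent` and the toy automaton are hypothetical, and the transition functions are stored as dictionaries, which may be partial, whereas the definition speaks of functions.

```python
from dataclasses import dataclass


@dataclass
class DNWA:
    """Record mirroring the tuple (Q, Q0, Qf, P, P0, Pf, delta_c, delta_i, delta_r)."""
    Q: frozenset    # linear states
    Q0: str         # initial linear state
    Qf: frozenset   # accepting linear states
    P: frozenset    # hierarchical states
    P0: str         # initial hierarchical state
    Pf: frozenset   # accepting hierarchical states
    delta_c: dict   # (symbol, q) -> (q', p)
    delta_i: dict   # (symbol, q) -> q'
    delta_r: dict   # (symbol, q, p) -> q'

    def is_consistent(self) -> bool:
        """Check the membership conditions stated in the definition."""
        return (
            self.Q0 in self.Q and self.P0 in self.P
            and self.Qf <= self.Q and self.Pf <= self.P
            and all(q in self.Q and q2 in self.Q and p in self.P
                    for (_, q), (q2, p) in self.delta_c.items())
            and all(q in self.Q and q2 in self.Q
                    for (_, q), q2 in self.delta_i.items())
            and all(q in self.Q and p in self.P and q2 in self.Q
                    for (_, q, p), q2 in self.delta_r.items())
        )


# Toy automaton with one linear state and two hierarchical states (illustrative only).
toy = DNWA(
    Q=frozenset({"q0"}), Q0="q0", Qf=frozenset({"q0"}),
    P=frozenset({"p0", "pa"}), P0="p0", Pf=frozenset({"p0"}),
    delta_c={("<a>", "q0"): ("q0", "pa")},
    delta_i={("x", "q0"): "q0"},
    delta_r={("</a>", "q0", "pa"): "q0"},
)
```

Calling `toy.is_consistent()` confirms that the initial states, accepting sets and transition tables only mention declared states.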

  44. Definition: DNWA: Run
      The run of a DNWA over a nested word $(a_1 \dots a_n, \rightsquigarrow)$ is defined as
      ◮ a sequence $q_i$ for $i \in [1, n]$
      ◮ and a sequence $p_i$ for all call positions $i$
      so that for $i \in [1, n]$ it holds that:
      ◮ if $i$ is a call position, then $\delta_c(a_i, q_{i-1}) = (q_i, p_i)$
      ◮ else if $i$ is an internal position, then $\delta_i(a_i, q_{i-1}) = q_i$
      ◮ else if $i$ is a return position (let $h$ be its corresponding call position), then $\delta_r(a_i, q_{i-1}, p_h) = q_i$
      For $i = 1$ these conditions refer to $q_0$, presumably the initial state $Q_0$; the slide does not define it.
      Informally: $q_i$ is the linear trace and $p_i$ the hierarchical trace.
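The three cases of the run definition translate directly into a loop. The following is a minimal sketch for well-matched inputs only. Assumptions: the function name `run_dnwa` is hypothetical, the transition functions are passed as dictionaries, $q_0$ is taken to be the initial linear state, and the toy automaton (which saves the linear state at a call and restores it at the return) is invented for illustration. Acceptance is left out because this part of the talk does not define it.

```python
def run_dnwa(word, matching, q0, delta_c, delta_i, delta_r):
    """Compute the linear trace q_1..q_n and the hierarchical trace p_i of a run."""
    call_of = {j: i for i, j in matching}   # return position -> its call position
    calls = {i for i, _ in matching}        # all call positions
    q = {0: q0}   # assumption: q_0 is the initial linear state
    p = {}        # hierarchical states, indexed by call position
    for i, a in enumerate(word, start=1):
        if i in calls:
            q[i], p[i] = delta_c[(a, q[i - 1])]
        elif i in call_of:
            q[i] = delta_r[(a, q[i - 1], p[call_of[i]])]
        else:
            q[i] = delta_i[(a, q[i - 1])]
    return [q[i] for i in range(1, len(word) + 1)], p


# Toy automaton: the linear state records whether an "x" was seen at the
# current nesting level; a call saves that state, a return restores it.
STATES = ["no_x", "seen_x"]
delta_c = {("<", s): ("no_x", s) for s in STATES}
delta_i = {("x", s): "seen_x" for s in STATES}
delta_r = {(">", s, saved): saved for s in STATES for saved in STATES}

word = ["x", "<", "<", ">", "x", ">"]
matching = {(2, 6), (3, 4)}
linear, hierarchical = run_dnwa(word, matching, "no_x", delta_c, delta_i, delta_r)
```

The return transition reads the hierarchical state stored at the matching call position, so the automaton never has to search for that position itself: the matching supplies it.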
