
Slowing Down Top Trees for Better Worst-Case Compression



  1. Slowing Down Top Trees for Better Worst-Case Compression. Bartłomiej Dudek¹, Paweł Gawrychowski¹ (¹University of Wrocław). February 8, 2019. Dudek, Gawrychowski (University of Wrocław), Slowing Down Top Trees, February 8, 2019, 1 / 13

  2. Straight-line program (SLP): a context-free grammar in Chomsky normal form with exactly one production for each nonterminal, hence generating exactly one string. Example: Fibonacci words, with the derived string given in parentheses after each production:
     $F_0 \to a$ (a), $F_1 \to b$ (b), $F_2 \to F_1 F_0$ (ba), $F_3 \to F_2 F_1$ (bab), $F_4 \to F_3 F_2$ (babba), $F_5 \to F_4 F_3$ (babbabab), $F_6 \to F_5 F_4$ (babbababbabba).
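The Fibonacci grammar above can be checked mechanically. The following is a minimal sketch, not taken from the slides: the helper names `fibonacci_slp` and `expand`, and the dictionary encoding of productions, are assumptions made for illustration.

```python
def fibonacci_slp(k):
    """Productions for F0..Fk: a terminal string or a pair of nonterminals."""
    rules = {"F0": "a", "F1": "b"}
    for i in range(2, k + 1):
        # F_i -> F_{i-1} F_{i-2}, as on the slide
        rules[f"F{i}"] = (f"F{i - 1}", f"F{i - 2}")
    return rules


def expand(rules, symbol):
    """Derive the unique string generated from `symbol`."""
    rhs = rules[symbol]
    if isinstance(rhs, str):  # terminal production
        return rhs
    left, right = rhs
    return expand(rules, left) + expand(rules, right)


rules = fibonacci_slp(6)
print(expand(rules, "F6"))  # babbababbabba
```

The grammar has 7 productions, while the derived string has 13 characters; the gap grows exponentially with the index, which is what makes SLPs a compression scheme.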


  5. Straight-line program (SLP). What is the size of the smallest SLP deriving a string $s[1..n]$ over an alphabet of size $\sigma$? By a counting argument: $\Omega(n / \log_\sigma n)$. Constructing an SLP of size $O(n / \log_\sigma n)$:
     1. Let $b = \frac{1}{2} \log_\sigma n$.
     2. For every string $t$ such that $|t| \le b$, prepare a nonterminal deriving $t$.
     3. Cut $s$ into blocks of length $b$ and create a production $S \to B_1 B_2 \ldots B_{n/b}$, where $B_i$ derives the $i$-th block.
     4. Overall size is $O(n/b + \sum_{i=0}^{b} \sigma^i) = O(n/b + \sqrt{n})$.
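The construction above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the names `block_slp`, `expand` and `grammar_size` are hypothetical, nonterminals are encoded as `("str", t)` tuples, $b$ is rounded down and clamped to at least 1, the alphabet is assumed to have at least two symbols, and the grammar size is measured as the total length of all right-hand sides.

```python
from itertools import product


def block_slp(s, alphabet):
    """Build the block grammar for s; returns (rules, b). Assumes len(alphabet) >= 2."""
    n, sigma = len(s), len(alphabet)
    # Largest integer b >= 1 with sigma^(2b) <= n, i.e. floor((1/2) log_sigma n),
    # clamped to 1 for very short strings (the clamp is an assumption of this sketch).
    b = 1
    while sigma ** (2 * (b + 1)) <= n:
        b += 1
    rules = {}
    # One nonterminal ("str", t) for every string t with 1 <= |t| <= b.
    for length in range(1, b + 1):
        for t in map("".join, product(alphabet, repeat=length)):
            if length == 1:
                rules[("str", t)] = t  # terminal production
            else:
                rules[("str", t)] = (("str", t[:-1]), ("str", t[-1]))
    # Start production S -> B_1 B_2 ... over blocks of length b (the last may be shorter).
    rules["S"] = tuple(("str", s[i:i + b]) for i in range(0, n, b))
    return rules, b


def expand(rules, symbol):
    """Derive the unique string generated from `symbol`."""
    rhs = rules[symbol]
    if isinstance(rhs, str):
        return rhs
    return "".join(expand(rules, x) for x in rhs)


def grammar_size(rules):
    """Total length of all right-hand sides."""
    return sum(1 if isinstance(rhs, str) else len(rhs) for rhs in rules.values())


s = "abbabaab" * 8  # n = 64 over a binary alphabet, so b = 3
rules, b = block_slp(s, "ab")
print(b, len(rules["S"]), grammar_size(rules), expand(rules, "S") == s)
```

The start production here is long rather than binary, matching the slide; converting it to Chomsky normal form would add at most a constant factor to the size.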


  11. Top Tree Compression. Aim: to represent a tree with clusters. Cluster: a single edge or two clusters merged; a cluster has at most two "boundary" nodes. Five possible merges (Bille, Gørtz, Landau, Weimann [ICALP '13]): [figure of the five merge types not preserved]



  19. Top Tree Compression. [Figure: clusters A and B are merged into a cluster C; in the tree of clusters, C is a node with children A and B.] Compression:
     1. tree $T$ → binary tree $\mathcal{T}$ of clusters (goal: short)
     2. binary tree $\mathcal{T}$ → top DAG $\mathcal{TD}$ without repeating subtrees (goal: small)
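The second step, turning the binary tree of clusters into a DAG without repeating subtrees, can be illustrated by sharing identical subtrees. The sketch below is an assumption-laden illustration rather than the construction from the talk: the nested-tuple encoding, the leaf label `"e"` for a single-edge cluster, the merge label `"v"`, and the names `to_dag` and `tree_size` are all hypothetical.

```python
def tree_size(tree):
    """Number of nodes in a binary tree given as nested tuples (label, left, right)."""
    if isinstance(tree, str):  # leaf: a single-edge cluster
        return 1
    return 1 + tree_size(tree[1]) + tree_size(tree[2])


def to_dag(tree, table=None):
    """Return (node_id, table); table maps each distinct subtree signature to one id."""
    if table is None:
        table = {}
    if isinstance(tree, str):
        key = (tree,)
    else:
        label, left, right = tree
        left_id, _ = to_dag(left, table)
        right_id, _ = to_dag(right, table)
        # Two subtrees are identical iff their labels and child ids agree.
        key = (label, left_id, right_id)
    if key not in table:
        table[key] = len(table)  # first time this subtree shape is seen
    return table[key], table


# Hypothetical tree of clusters for a path of four edges, merged pairwise.
pair = ("v", "e", "e")
tree = ("v", pair, pair)
root, table = to_dag(tree)
print(tree_size(tree), len(table))  # 7 3
```

In this toy input the seven tree nodes collapse to three DAG nodes, because both halves of the path produce the same cluster subtree.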

