Slowing Down Top Trees for Better Worst-Case Compression
Bartłomiej Dudek, Paweł Gawrychowski
University of Wrocław
February 8, 2019
Straight-line program (SLP)

A context-free grammar in Chomsky normal form with exactly one production for each nonterminal, hence generating exactly one string.

Fibonacci words (production, then the derived string):
$F_0 = a$: a
$F_1 = b$: b
$F_2 = F_1 F_0$: ba
$F_3 = F_2 F_1$: bab
$F_4 = F_3 F_2$: babba
$F_5 = F_4 F_3$: babbabab
$F_6 = F_5 F_4$: babbababbabba
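The Fibonacci-word grammar above can be checked with a short Python sketch. The dictionary encoding (`RULES`) and the helper names `expand` and `length` are illustrative assumptions, not notation from the slides.

```python
from functools import lru_cache

# Hypothetical encoding of the SLP: each nonterminal has exactly one rule,
# either a single terminal (a str) or a pair of nonterminals (a tuple).
RULES = {
    "F0": "a",
    "F1": "b",
    "F2": ("F1", "F0"),
    "F3": ("F2", "F1"),
    "F4": ("F3", "F2"),
    "F5": ("F4", "F3"),
    "F6": ("F5", "F4"),
}


@lru_cache(maxsize=None)
def expand(symbol: str) -> str:
    """Return the unique string derived by `symbol`."""
    rule = RULES[symbol]
    if isinstance(rule, str):
        return rule
    left, right = rule
    return expand(left) + expand(right)


@lru_cache(maxsize=None)
def length(symbol: str) -> int:
    """Length of the derived string, computed without expanding it."""
    rule = RULES[symbol]
    if isinstance(rule, str):
        return 1
    left, right = rule
    return length(left) + length(right)


print(expand("F6"))  # babbababbabba
```

The grammar has one rule per Fibonacci word, while the derived strings grow exponentially in the index, which is why an SLP can be much smaller than the string it generates.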
Straight-line program (SLP)

What is the size of the smallest SLP deriving a string $s[1..n]$ over an alphabet of size $\sigma$?

By a counting argument: $\Omega(n / \log_\sigma n)$.

Constructing an SLP of size $O(n / \log_\sigma n)$:
1. Let $b = \frac{1}{2} \log_\sigma n$.
2. For every string $t$ such that $|t| \le b$, prepare a nonterminal deriving $t$.
3. Cut $s$ into blocks of length $b$ and create a production $S \to B_1 B_2 \ldots B_{n/b}$, where $B_i$ derives the $i$-th block.
4. Overall size is $O(n/b + \sum_{i=0}^{b} \sigma^i) = O(n/b + \sqrt{n})$.
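A minimal Python sketch of the block construction above, under a few assumptions: $b$ is rounded down to an integer (and kept at least 1), nonterminals are named by the string they derive, and the start production is left as a flat list of blocks rather than binarized (binarizing it would add only $O(n/b)$ further rules). The function names are hypothetical. The bound $\sum_{i=0}^{b} \sigma^i \le 2\sigma^b \le 2\sqrt{n}$ holds because $\sigma \ge 2$ and $\sigma^{2b} \le n$.

```python
from itertools import product


def build_block_slp(s: str, alphabet: str):
    """Build the short-string rules and the start production for s."""
    n, sigma = len(s), len(alphabet)
    if n == 0 or sigma < 2:
        raise ValueError("need a non-empty string and at least two letters")

    # Largest integer b with sigma^(2b) <= n, i.e. floor((1/2) log_sigma n),
    # computed with integers only; clamped to at least 1.
    b = 0
    while sigma ** (2 * (b + 1)) <= n:
        b += 1
    b = max(1, b)

    # One nonterminal per string t with 1 <= |t| <= b, named by t itself.
    # Length-1 strings map to a terminal; longer ones split off the last letter.
    rules = {}
    for k in range(1, b + 1):
        for letters in product(alphabet, repeat=k):
            t = "".join(letters)
            rules[t] = t if k == 1 else (t[:-1], t[-1])

    # Start production S -> B_1 B_2 ... (the last block may be shorter than b).
    start = [s[i:i + b] for i in range(0, n, b)]
    return rules, start, b


def expand(rules, name: str) -> str:
    """Return the string derived by the nonterminal `name`."""
    rule = rules[name]
    if isinstance(rule, str):
        return rule
    return expand(rules, rule[0]) + expand(rules, rule[1])


def derive(rules, start) -> str:
    """Return the string derived by the start production."""
    return "".join(expand(rules, block) for block in start)
```

For example, `build_block_slp("abracadabra" * 100, "abcdr")` can be followed by `derive(rules, start)` to confirm that the grammar reproduces the input.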
Top Tree Compression

Aim: to represent a tree with clusters.
Cluster: a single edge or two clusters merged (has at most two “boundary” nodes).
Five possible merges (Bille, Gørtz, Landau, Weimann [ICALP ’13]):
[Figure: the five merge types]
Top Tree Compression

[Figure: clusters A and B merged into a cluster C]

Compression:
1. tree $T$ → binary tree $\mathcal{T}$ of clusters (goal: short)
2. binary tree $\mathcal{T}$ → top DAG $\mathcal{TD}$ without repeating subtrees (goal: small)
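The second step, turning a binary tree into a DAG without repeating subtrees, can be sketched in Python by giving every distinct subtree a single identifier. This only illustrates the sharing step; it does not construct a top tree. The nested-tuple encoding, the merge label `"m"`, the edge label `"e"`, and the name `to_dag` are assumptions made for the example.

```python
def to_dag(tree):
    """Share identical subtrees of a binary tree.

    Assumed encoding: a leaf is a label (str); an internal node is a
    tuple (merge_kind, left_subtree, right_subtree).
    Returns (root_id, nodes), where nodes[i] is the signature of DAG node i.
    """
    ids = {}    # signature -> DAG node id
    nodes = []  # DAG node id -> signature

    def visit(t):
        if isinstance(t, tuple):
            kind, left, right = t
            # Children are replaced by their ids, so equal subtrees
            # produce equal signatures.
            sig = (kind, visit(left), visit(right))
        else:
            sig = ("leaf", t)
        if sig not in ids:
            ids[sig] = len(nodes)
            nodes.append(sig)
        return ids[sig]

    root = visit(tree)
    return root, nodes


# A tree with 7 nodes whose two halves are identical.
half = ("m", "e", "e")
root, nodes = to_dag(("m", half, half))
print(len(nodes))  # 3
```

Repeated subtrees are stored once, so the DAG can be much smaller than the tree it represents when the tree is repetitive.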