counting colours in compressed strings
Travis Gagie, Juha Kärkkäinen
CPM 2011

Theorem. Given a string $s[1..n]$, we can build a data structure that takes $nH_0(s) + O(n) + o(nH_0(s))$ bits such that later, given a substring's endpoints $i$ and $j$, we can count how many distinct characters it contains in $O(\log \ell)$ time, where $\ell = j - i + 1$.
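The slide does not define $H_0(s)$; the sketch below assumes it is the usual zeroth-order empirical entropy, $H_0(s) = \sum_c (n_c/n) \log_2 (n/n_c)$, where $n_c$ is the number of occurrences of character $c$. The function name `empirical_entropy_h0` is ours, chosen for illustration only.

```python
from collections import Counter
from math import log2


def empirical_entropy_h0(s):
    """Zeroth-order empirical entropy of s in bits per character.

    Assumption: this is the H_0(s) meant in the theorem (the standard
    definition); the slides themselves do not spell it out.
    """
    n = len(s)
    if n == 0:
        return 0.0
    counts = Counter(s)
    # Each character c with n_c occurrences contributes (n_c / n) * log2(n / n_c).
    return sum((c / n) * log2(n / c) for c in counts.values())


if __name__ == "__main__":
    s = "countingcoloursincompressedstrings"
    h0 = empirical_entropy_h0(s)
    # n * H_0(s) is the leading term of the space bound, in bits.
    print(f"H_0 = {h0:.3f} bits/char, n*H_0 = {len(s) * h0:.1f} bits")
```

For a string over a single character the value is 0, and for a string with two equally frequent characters it is 1 bit per character, which gives a feel for how the leading term $nH_0(s)$ shrinks on skewed inputs.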

source        space                              time
BKM&T         $O(n \log n)$                      $O(\log n)$
Muthu + WT    $n \log n + o(n \log n)$           $O(\log n)$
GN&P          $n \log \sigma + O(n \log\log n)$  $O(\log n)$
this paper    $nH_0(s) + O(n) + o(nH_0(s))$      $O(\log \ell)$

counting colours in compressed strings
s = [c, o, u, n, t, i, n, g, c, o, l, o, u, r, s, i, n, c, o, m, p, r, e, s, s, e, d, s, t, r, i, n, g, s]
C = [0, 0, 0, 0, 0, 0, 4, 0, 1, 2, 0, 10, 3, 0, 0, 6, 7, 9, 12, 0, 0, 14, 0, 15, 24, 23, 0, 25, 5, 22, 16, 17, 8, 28]
(each entry C[q] is the position of the previous occurrence of s[q], or 0 if there is none)
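The example above pairs each character with the position of its previous occurrence (0 if there is none). Under that reading, a character occurrence at position q is the first one inside s[i..j] exactly when its previous occurrence lies before i, so counting distinct characters reduces to counting positions q in [i..j] with C[q] < i. The following is a minimal sketch of that reduction using a plain linear scan; it is not the compressed data structure from the talk, and the function names are ours.

```python
def previous_occurrences(s):
    """Return C as a 1-based list (C[0] is unused padding).

    C[q] is the position of the previous occurrence of s[q], or 0 if
    s[q] has not occurred before position q.
    """
    last = {}
    C = [0] * (len(s) + 1)
    for q, ch in enumerate(s, start=1):
        C[q] = last.get(ch, 0)
        last[ch] = q
    return C


def count_distinct(C, i, j):
    """Count distinct characters in s[i..j] (1-based, inclusive).

    Linear scan: counts positions q in [i..j] whose previous occurrence
    falls before i. The talk answers this with wavelet trees instead.
    """
    return sum(1 for q in range(i, j + 1) if C[q] < i)


if __name__ == "__main__":
    s = "countingcoloursincompressedstrings"
    C = previous_occurrences(s)
    print(C[1:])
    print(count_distinct(C, 1, 8))  # distinct characters in "counting"
```

The scan takes time linear in the interval length; the point of the data structure in the theorem is to answer the same counting question in $O(\log \ell)$ time within compressed space.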

[Figure: two diagrams whose layout is not recoverable; the surviving labels are "a a b b" with values 5 and 3, and "a b b a" with values 5 and 9.]

Components:
◮ multiary wavelet tree assigning entries to blocks
◮ wavelet tree for each block (with a shared bitvector for each block size and depth)

Observations:
◮ if we use more block sizes, the C array becomes more like recency coding and compression is better (but queries take more time)
◮ if we use $\mathrm{polylog}(n)$ block sizes, then we can count the entries much bigger than $\ell$ in $O(1)$ time using the multiary wavelet tree

Calculation:
◮ if we use block sizes
$$ b_k = \begin{cases} 2 & k = 1 \\ 2^{\max\left(\prod_{h=1}^{k-1} \left(1 + 1/\alpha(b_h)\right),\; k\right)} & k > 1 \end{cases} $$
then we use a total of $nH_0(s) + O(n) + o(nH_0(s))$ bits and $O(\alpha(\ell) \log \ell \log\log(\ell + 1))$ query time

Observations:
◮ if a block $B$ smaller than $\ell$ contains the beginning $i$ of the interval, then it does not contain the end $j$
◮ we can count the entries $C[q] = p$ in $B$ with $p < i \le q$ by counting
  ◮ all the entries in $B$ (in $O(1)$ time with the multiary wavelet tree)
  ◮ all the entries in $B$ with $q < i$ (in $O(1)$ time with the multiary wavelet tree)
  ◮ all the entries in $B$ with $p \ge i$
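The three counts above combine by subtraction: since every entry has p < q, an entry with q < i cannot also have p ≥ i, so the two excluded groups are disjoint and the entries with p < i ≤ q are what remains. The sketch below checks that arithmetic with plain scans. Treating a block as a list of (q, p) pairs over a contiguous range of positions is an assumption made for illustration, as is the name `count_crossing`; in the talk each of the three counts comes from the wavelet trees rather than a scan.

```python
def count_crossing(entries, i):
    """Count entries (q, p), with C[q] = p, such that p < i <= q.

    `entries` is a hypothetical flat representation of one block B.
    The result is computed as on the slide:
    (all entries) - (entries with q < i) - (entries with p >= i).
    """
    total = len(entries)
    before = sum(1 for q, p in entries if q < i)   # entry lies left of the interval
    inside = sum(1 for q, p in entries if p >= i)  # previous occurrence is inside it
    return total - before - inside


if __name__ == "__main__":
    # Positions 22..29 of the example string "countingcoloursincompressedstrings",
    # paired with their previous-occurrence positions.
    block = [(22, 14), (23, 0), (24, 15), (25, 24),
             (26, 23), (27, 0), (28, 25), (29, 5)]
    i = 24
    direct = sum(1 for q, p in block if p < i <= q)
    print(count_crossing(block, i), direct)
```

Because the block contains i but not j, every entry with q ≥ i already lies inside the query interval, which is why no upper bound on q appears in the count.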

Calculation:
◮ if we store pointers to the wavelet-tree nodes at height $k$, then we use $O(n)$ more bits and can count all the entries in $B$ with $p \ge i$ in $O\left(\alpha(\ell) (\log\log(\ell + 1))^2\right) \subseteq o(\log \ell)$ time
