
Improved Address-Calculation Coding of Integer Arrays


1. Improved Address-Calculation Coding of Integer Arrays
   Jyrki Katajainen (1, 2), Amr Elmasry (3), Jukka Teuhola (4)
   (1) University of Copenhagen  (2) Jyrki Katajainen and Company  (3) Alexandria University  (4) University of Turku
   Performance Engineering Laboratory, SPIRE 2012, Cartagena

2. Problem formulation

Given: an array of integers { x_i | i ∈ {1, 2, ..., n} }.
Wanted: a compressed representation that supports fast random access.

Operations:
• access(i): retrieve x_i
• insert(i, v): insert v before x_i
• delete(i): remove x_i

Other operations (omitted in this talk):
• sum(j): retrieve Σ_{i=1..j} x_i
• search(p): find the rank of the given prefix sum p
• modify(i, v): change x_i to v

Many solutions are known; see the list of references in the paper.

Theoretical approaches:
• O(1) worst-case-time access
• overhead of o(n) bits with respect to some measure of compactness
• complicated

Practical approaches:
• slower access
• O(n) bits of overhead
• implementable
• fast in practice
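As a minimal sketch (not from the paper), the operations above could be declared in C as follows; the compressed_array type and the ca_* names are assumptions made for illustration only:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical handle for a compressed integer array; the concrete
   layout is given by the encodings discussed on the later slides.   */
typedef struct compressed_array compressed_array;

/* Core operations considered in the talk. */
uint64_t ca_access(const compressed_array *a, size_t i);        /* retrieve x_i          */
void     ca_insert(compressed_array *a, size_t i, uint64_t v);  /* insert v before x_i   */
void     ca_delete(compressed_array *a, size_t i);              /* remove x_i            */

/* Operations omitted in the talk. */
uint64_t ca_sum(const compressed_array *a, size_t j);           /* x_1 + ... + x_j       */
size_t   ca_search(const compressed_array *a, uint64_t p);      /* rank of prefix sum p  */
void     ca_modify(compressed_array *a, size_t i, uint64_t v);  /* change x_i to v       */
```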

3. Measures of compactness

Data-independent measures:
• n: # integers
• x̂ = max_{i=1..n} x_i
• s = Σ_{i=1..n} x_i

Data-aware measure:
• Raw representation: Σ_{i=1..n} ⌈lg(1 + x_i)⌉ bits

Overhead: in order to support random access we expect to need some more bits.

What is optimal?
• Compact representation: n lg(1 + s/n) + O(n) bits; apply Jensen's inequality to the raw representation and accept a linear overhead
• Lower bound 1: ⌈lg(x̂^n)⌉ bits, since x̂^n is the number of sequences of n positive integers whose values are at most x̂
• Lower bound 2: ⌈lg C(s − 1, n − 1)⌉ bits, since the binomial coefficient C(s − 1, n − 1) is the number of sequences of n positive integers that add up to s
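For concreteness (my own illustration, not part of the slides), the two leading measures can be computed for a given array as follows; the function names are made up for this sketch:

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Raw, data-aware size: sum of ceil(lg(1 + x_i)) bits over all integers. */
static double raw_bits(const uint64_t *x, size_t n) {
    double bits = 0.0;
    for (size_t i = 0; i < n; i++)
        bits += ceil(log2(1.0 + (double) x[i]));
    return bits;
}

/* Leading term of the compact bound: n * lg(1 + s/n); the O(n) overhead
   depends on the particular representation and is not included here.    */
static double compact_bits(const uint64_t *x, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += (double) x[i];
    return (double) n * log2(1.0 + s / (double) n);
}
```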

4. Two trivial "solutions"

Uncompressed array:
• a: array of n integers, one per word; w: size of a machine word
• Space: w · n + O(w) bits
• access(i): a[i]
• + fast
• – no compression

Fixed-length coding:
• x̂ = max_{i=1..n} x_i; β = ⌈lg(1 + x̂)⌉
• Space: β · n + O(w) bits
• access(i): compute the word address, read one or two words, mask the bits needed
• + relatively fast
• – one outlier ruins the compactness

Access times on my computer (ns per operation):
  n      sequential  random
  2^10   0.89        1.1
  2^15   0.74        1.4
  2^20   0.89        7.1
  2^25   0.74        10.9

Q: How would you support insert and delete for these structures?
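A minimal sketch of the fixed-length access(i) just described, under the assumptions (mine, not stated on the slide) that the β-bit fields are packed little-endian into an array of 64-bit words and that β ≤ 64:

```c
#include <stdint.h>
#include <stddef.h>

/* Read the i-th beta-bit field (0-indexed) from a packed bit array.
   Assumes little-endian packing into 64-bit words and 1 <= beta <= 64. */
static uint64_t fixed_length_access(const uint64_t *packed, size_t i, unsigned beta) {
    size_t bit_pos  = i * (size_t) beta;          /* first bit of the field   */
    size_t word_idx = bit_pos / 64;               /* compute the word address */
    unsigned offset = (unsigned) (bit_pos % 64);

    uint64_t value = packed[word_idx] >> offset;  /* read the first word      */
    if (offset + beta > 64)                       /* field spans two words    */
        value |= packed[word_idx + 1] << (64 - offset);

    uint64_t mask = (beta == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << beta) - 1);
    return value & mask;                          /* mask the bits needed     */
}
```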

5. Two examples

Example 1: x_1 = n, x_i = 1 for i ∈ {2, ..., n}
• Raw representation: n + O(lg n) bits
• Fixed-length coding: n ⌈lg(1 + n)⌉ bits
• Lower bound 1: ⌈n lg n⌉ bits

Example 2: x_1 = n^2, x_i = 1 for i ∈ {2, ..., n}
• Raw representation: n + O(lg n) bits
• Compact representation: n lg n + Θ(n) bits
• Lower bound 1: ⌈2n lg n⌉ bits
• Lower bound 2: n lg n + Θ(n) bits

N.B. All our representations are compact, but we do not claim them to be optimal.

6. Our contribution

Teuhola 2011 ("Interpolative coding of integer sequences supporting log-time random access", Inform. Process. Manag. 47(5), 742–761):
• Space: n lg(1 + s/n) + O(n) bits, i.e. compact
• access: O(lg(n + s)) worst-case time
• insert, delete: not supported

This paper:
• Space: n lg(1 + s/n) + O(n) bits, i.e. compact
• access: O(lg lg(n + s)) worst-case time in the static case and O(lg n) worst-case time in the dynamic case
• insert, delete: O(lg n + w^2) worst-case time

Notation: n: # integers (assume n ≥ w); s: sum of the integers; w: size of a machine word.

7. Address-calculation coding

[Figure: an example tree used by address-calculation coding; the extracted node values are 21; 14, 7; 9, 5, 2, 5; 2, 3, 2, 4, 4, 1, 5, 0, and the stored codewords are 01110 1001 0100 010 010 10 100 10101.]

• encoding in depth-first order
• yellow nodes not stored
• skip subtrees using the formula

Space: compact, by the magical formula.
access: O(lg n) worst-case time (assuming that the position of the most significant one bit in a word can be determined in O(1) time).
insert, delete: not supported.

Magical formula: with t = ⌈lg(1 + s)⌉,
  B(n, s) = n(t − lg n + 1) + ⌊s(n − 1)/2^(t−1)⌋ − t − 1,        if s ≥ n/2
  B(n, s) = 2t + ⌊s(2 − 1/2^(t−1))⌋ − t − 1 + s(lg n − t),       otherwise
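The access bound relies on finding the position of the most significant one bit in O(1) time; with gcc (the compiler used in the experiments) this is typically done with a builtin, as in this small sketch (my illustration, not the authors' code):

```c
#include <stdint.h>

/* Position of the most significant one bit of x (0-indexed), x > 0.
   __builtin_clzll is a gcc/clang builtin that compiles to a single
   instruction on modern CPUs, giving the O(1)-time assumption.      */
static inline unsigned msb_position(uint64_t x) {
    return 63u - (unsigned) __builtin_clzll(x);
}

/* t = ceil(lg(1 + s)), the code-length parameter used by the formula.
   For s >= 1, ceil(lg(1 + s)) equals floor(lg s) + 1.                 */
static inline unsigned code_length(uint64_t s) {
    return (s == 0) ? 0 : msb_position(s) + 1;
}
```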

8. Indexed address-calculation coding

c: a tuning parameter, c ≥ 1.

Structure:
• index: fixed-length coding
• chunk size: k = ⌊c · lg(n + s)⌋
• # chunks: t = ⌈n/k⌉
• root: ⌈lg(1 + s)⌉ bits
• pointer: lg n + lg lg(1 + s/n) + O(1) bits
• chunks: address-calculation coding

Analysis (s_i: sum of the numbers in the i-th chunk):
• roots: ⌈n/k⌉ · ⌈lg(1 + s)⌉ ≤ n/c + O(w)
• pointers: ⌈n/k⌉ · (lg n + lg lg(1 + s/n) + O(1)) ≤ n/c + O(w)
• chunks: Σ_{i=1..t} [k · lg(1 + s_i/k) + O(k)] ≤ n lg(1 + s/n) + O(n)
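A sketch (my own, with assumed names) of how an access(i) would be routed through the index: compute the chunk number from the fixed chunk size k, follow the stored pointer to the chunk's bit position, and decode within the chunk using the address-calculation code of the previous slide.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry: the chunk's root (its sum) and a pointer
   to the first bit of the chunk's address-calculation code.          */
typedef struct {
    uint64_t root;       /* ceil(lg(1 + s)) bits in the real structure */
    uint64_t bit_start;  /* lg n + lg lg(1 + s/n) + O(1) bits          */
} chunk_entry;

/* Assumed external decoder (not implemented here): walks the implicit
   tree of the address-calculation code, skipping subtrees via B(n, s). */
uint64_t ac_decode_in_chunk(const uint64_t *bits, uint64_t bit_start,
                            uint64_t chunk_sum, size_t pos_in_chunk);

/* access(i), with i 0-indexed and k = floor(c * lg(n + s)). */
static uint64_t indexed_access(const uint64_t *bits, const chunk_entry *index,
                               size_t i, size_t k) {
    size_t chunk  = i / k;          /* which chunk                  */
    size_t within = i % k;          /* position inside that chunk   */
    const chunk_entry *e = &index[chunk];
    return ac_decode_in_chunk(bits, e->bit_start, e->root, within);
}
```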

9. Other applications of indexing

Indexed Elias delta coding:
• c: a tuning parameter, c ≥ 1
• index: fixed-length coding
• chunk size: k = ⌊c · (lg n + lg lg s)⌋
• # chunks: t = ⌈n/k⌉
• pointer: lg n + lg lg(1 + s/n) + O(1) bits
• chunks: Elias delta coding
• Space: raw + O(Σ_{i=1..n} lg lg x_i)
• access: O(lg n + lg lg s) worst-case time

Indexed fixed-length coding:
• c: a tuning parameter, c ≥ 1; x̂ = max_{i=1..n} x_i
• index: fixed-length coding
• chunk size: k = ⌊c · (lg n + lg lg x̂)⌋
• # chunks: t = ⌈n/k⌉
• pointer: lg n + lg lg(1 + x̂) + O(1) bits
• offsets: fixed-length coding
• landmark + offset data: raw coding
• Space: raw + O(n lg lg(n + x̂))
• access: O(1) worst-case time
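For reference, a compact sketch of an Elias delta encoder in C; the bit-writer and the names are my own, and it assumes gcc builtins and a zero-initialized, sufficiently large output buffer:

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal MSB-first bit writer used only for this illustration. */
typedef struct {
    uint8_t *buf;     /* output buffer (zero-initialized, large enough) */
    size_t   bitpos;  /* number of bits written so far                  */
} bitwriter;

static void put_bit(bitwriter *w, unsigned bit) {
    if (bit) w->buf[w->bitpos / 8] |= (uint8_t) (0x80u >> (w->bitpos % 8));
    w->bitpos++;
}

/* Write the 'len' low-order bits of 'value', most significant first. */
static void put_bits(bitwriter *w, uint64_t value, unsigned len) {
    for (unsigned i = len; i > 0; i--)
        put_bit(w, (unsigned) ((value >> (i - 1)) & 1u));
}

/* Elias delta code of x >= 1:
   N = floor(lg x) + 1 bits in x, L = floor(lg N) + 1 bits in N;
   emit L-1 zeros, then N in L bits, then the N-1 low bits of x.  */
static void elias_delta_encode(bitwriter *w, uint64_t x) {
    unsigned n = 64u - (unsigned) __builtin_clzll(x);   /* N */
    unsigned l = 32u - (unsigned) __builtin_clz(n);     /* L */
    put_bits(w, 0, l - 1);      /* L-1 leading zeros        */
    put_bits(w, n, l);          /* N in binary, L bits      */
    put_bits(w, x, n - 1);      /* x without its leading 1  */
}
```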

10. Dynamization

c: a tuning parameter, c ≥ 1; w: size of a machine word.

Structure:
• index: balanced search tree
• chunk size: k between cw/2 and 2cw
• # chunks: t between ⌈n/(2cw)⌉ and ⌈2n/(cw)⌉
• root: w bits; pointer: w bits
• chunks: address-calculation coding

Use the zone technique:
• align chunks to word boundaries
• keep chunks of the same size in separate zones
• only w zones
• maintain zones as rotated arrays (one chunk may be split)

Space: still compact.
access: O(lg n) worst-case time (n ≥ w).
insert, delete: O(lg n + w^2) worst-case time.
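One way to picture the zone technique (a layout sketch with assumed field names, not the authors' implementation): all chunks of the same word-length live in one contiguous, rotated array, so a chunk that grows or shrinks only has to move between two zones.

```c
#include <stddef.h>
#include <stdint.h>

/* One zone holds all chunks that currently occupy exactly chunk_words
   machine words.  The zone is kept as a rotated (circular) array, so a
   chunk can be added or removed at either end cheaply; one chunk may be
   split across the wrap-around point.                                   */
typedef struct {
    uint64_t *words;        /* backing storage of the zone               */
    size_t    capacity;     /* capacity of the zone, in words            */
    size_t    first;        /* word index where the rotation starts      */
    size_t    used;         /* words currently occupied by chunks        */
    size_t    chunk_words;  /* size of every chunk in this zone          */
} zone;

/* The slides note that only w zones are needed; here w = 64 is taken as
   the machine word size and used as the number of zone slots.            */
#define WORD_BITS 64
typedef struct {
    zone zones[WORD_BITS];
} zone_store;
```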

11. Experimental setup

Benchmark data: n integers, either uniformly distributed or exponentially distributed.
Processor: Intel(R) Xeon(R) CPU, 1.8 GHz, x 2.
Programming language: C.
Compiler: gcc with optimization -O3.
Repetitions: each experiment repeated r times for sufficiently large r.
Reported value: measurement result divided by r × n.
Source code: available from Jukka's home page.

12. Experimental results: overhead

[Figure: two plots of bits per source integer for the indexed modifiable array, the indexed static array, the basic AC-coded array, and the entropy; one plot varies the range size (2 to 1024), the other varies lambda (1/64 to 8).]

• entropy of x_i: expected information content of x_i
• for a random floating-point number y_i, y_i ≥ 0, the exponentially distributed input is x_i = ⌊−ln(1 − y_i)/λ⌋
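The exponential inputs can be generated directly from that formula; a small sketch (assuming y_i is drawn uniformly from [0, 1); rand() is used only to keep the sketch self-contained):

```c
#include <math.h>
#include <stdint.h>
#include <stdlib.h>

/* Draw one integer x = floor(-ln(1 - y) / lambda) with y uniform in [0, 1). */
static uint64_t exponential_integer(double lambda) {
    double y = (double) rand() / ((double) RAND_MAX + 1.0);  /* y in [0, 1) */
    return (uint64_t) floor(-log(1.0 - y) / lambda);
}
```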

13. Experimental results: access, search, modify

[Figure: two plots of the time per operation (microseconds) as a function of the number of source integers (1 000 to 1 000 000); one plot shows access and search for the basic AC-coded array and the indexed static array, the other shows access and modify for the indexed modifiable array.]

• uniformly distributed integers drawn from [0..63]

14. Further work

Theory:
• Try to understand better the trade-off between the speed of access and the amount of overhead in the data-aware case.

Practice:
• As to the speed of access, we showed that O(lg lg(n + s)) is better than O(lg(n + s)). Can you show that O(1) is better than O(lg lg(n + s))?
• Independent of the theoretical running time, can one get the efficiency of access closer to that provided by uncompressed arrays?

Applications:
• Can some of you convince me that compressed arrays are useful, or even necessary, in some information-retrieval application(s)?

To do:
• A thorough experimental comparison!
