Bitmap Index Design and Evaluation

Chee-Yong Chan
University of Wisconsin-Madison

Yannis Ioannidis
University of Wisconsin-Madison
University of Athens

Introduction

• Tremendous growth in Decision Support Systems (DSS).
• Characteristics of DSS queries: read-mostly, complex, ad hoc, with large foundsets (i.e., high selectivity factors).
• "Resurrection" of interest in bitmap indexing.
• Not much known about space-time tradeoffs.

Example of a Bitmap Index

   A | B^10 B^9  B^8  B^7  B^6  B^5  B^4  B^3  B^2  B^1  B^0
   4 |  0    0    0    0    0    0    1    0    0    0    0
   9 |  0    1    0    0    0    0    0    0    0    0    0
   1 |  0    0    0    0    0    0    0    0    0    1    0
   3 |  0    0    0    0    0    0    0    1    0    0    0
   8 |  0    0    1    0    0    0    0    0    0    0    0
   2 |  0    0    0    0    0    0    0    0    1    0    0
  10 |  1    0    0    0    0    0    0    0    0    0    0
   0 |  0    0    0    0    0    0    0    0    0    0    1
   7 |  0    0    0    1    0    0    0    0    0    0    0
   5 |  0    0    0    0    0    1    0    0    0    0    0
   6 |  0    0    0    0    1    0    0    0    0    0    0
   3 |  0    0    0    0    0    0    0    1    0    0    0

Bitmap Index (cont.)

• Value-List Index [O'Neil & Quass, SIGMOD'97].
• Advantages:
  - Compact representation of index (especially for attributes with low cardinality) ⇒ space and I/O efficient.
  - Bitmap operations (AND, OR, XOR, NOT) are efficiently supported by hardware.

Scope of Talk

• Bitmap index design for selection queries of the form $(A \;\mathit{op}\; c)$, where $\mathit{op} \in \{\le, \ge, <, >, =, \ne\}$.
  - Range query: $\mathit{op} \in \{\le, \ge, <, >\}$.
  - Equality query: $\mathit{op} \in \{=, \ne\}$.
• Assumption: attribute values are in $\{0, 1, 2, \ldots, C - 1\}$, where $C$ is the attribute cardinality.
• 2-dimensional framework for the design space.
• Space-time tradeoff study.

Example of a Value-List Index

   A | B^10 B^9  B^8  B^7  B^6  B^5  B^4  B^3  B^2  B^1  B^0
   4 |  0    0    0    0    0    0    1    0    0    0    0
   9 |  0    1    0    0    0    0    0    0    0    0    0
   1 |  0    0    0    0    0    0    0    0    0    1    0
   3 |  0    0    0    0    0    0    0    1    0    0    0
   8 |  0    0    1    0    0    0    0    0    0    0    0
   2 |  0    0    0    0    0    0    0    0    1    0    0
  10 |  1    0    0    0    0    0    0    0    0    0    0
   0 |  0    0    0    0    0    0    0    0    0    0    1
   7 |  0    0    0    1    0    0    0    0    0    0    0
   5 |  0    0    0    0    0    1    0    0    0    0    0
   6 |  0    0    0    0    1    0    0    0    0    0    0
   3 |  0    0    0    0    0    0    0    1    0    0    0

Design Space of Bitmap Indexes for Selection Queries

• Design space consists of 2 orthogonal dimensions (inspired by [Wong et al., VLDB'85]):
  1. Attribute Value Decomposition: determines the number and size of index components.
  2. Bitmap Encoding Scheme: determines the encoding of bitmap components.
• Index → Component → Bitmap

1st Dimension: Attribute Value Decomposition

• Given a sequence of $n$ numbers $\langle b_n, b_{n-1}, \ldots, b_1 \rangle$, each attribute value $A$ is decomposed into $n$ digits $A_n A_{n-1} \ldots A_1$, where $A_i$ is a base-$b_i$ digit.
• Example: $C = 1000$ and attribute value $A = 256$.

  <b_n, ..., b_1>  | Decomposition of A
  <1000>           | 256
  <50, 20>         | 12 (20) + 16
  <32, 32>         | 8 (32) + 0
  <5, 20, 10>      | 1 (20)(10) + 5 (10) + 6

• Each $\langle b_n, b_{n-1}, \ldots, b_1 \rangle$ (base of index) defines an $n$-component index.

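The decomposition above is ordinary mixed-radix arithmetic. The following is a minimal sketch of it; the function names `decompose` and `compose` are illustrative and do not come from the slides.

```python
def decompose(value, base):
    """Split value into digits [A_n, ..., A_1] for base [b_n, ..., b_1]."""
    digits = []
    for b in reversed(base):            # peel off A_1 first, then A_2, ...
        value, digit = divmod(value, b)
        digits.append(digit)
    if value != 0:
        raise ValueError("value does not fit in the given base")
    return digits[::-1]                 # most significant digit first


def compose(digits, base):
    """Inverse of decompose: rebuild the attribute value from its digits."""
    result = 0
    for digit, b in zip(digits, base):
        result = result * b + digit
    return result


# The four decompositions of A = 256 from the example table.
for base in ([1000], [50, 20], [32, 32], [5, 20, 10]):
    digits = decompose(256, base)
    print(base, "->", digits, "->", compose(digits, base))
```

Note that the leading base number $b_n$ only bounds the largest representable value; it is never used as a multiplier when recomposing.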
Attribute Value Decomposition with Base <3, 4>

   A   A_2 × 4 + A_1   A_2  A_1
   4     1 × 4 + 0      1    0
   9     2 × 4 + 1      2    1
   1     0 × 4 + 1      0    1
   3     0 × 4 + 3      0    3
   8     2 × 4 + 0      2    0
   2     0 × 4 + 2      0    2
  10     2 × 4 + 2      2    2
   0     0 × 4 + 0      0    0
   7     1 × 4 + 3      1    3
   5     1 × 4 + 1      1    1
   6     1 × 4 + 2      1    2
   3     0 × 4 + 3      0    3

2nd Dimension: Bitmap Encoding Schemes

• Consider the $i$-th index component with base $b_i$.
• Two basic ways to encode a value $x$ ($0 \le x < b_i$):

  Encoding  | $b_i$-bit representation for value $x$
  Scheme    | positions $b_i - 1 \ldots x + 1$ | position $x$ | positions $x - 1 \ldots 0$
  Equality  | 0 ... 0                          | 1            | 0 ... 0
  Range     | 1 ... 1                          | 1            | 0 ... 0

• Equality-encoded bitmap: $B_i^x = \{\text{records with } A_i = x\}$
• Range-encoded bitmap: $B_i^x = \{\text{records with } A_i \le x\}$
  $B_i^{b_i - 1}$ is not materialized since all its bits are set to 1.

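As a rough illustration of the two encodings, the sketch below produces the bit pattern of one digit under each scheme and applies it to the base-<3, 4> example column. The helper names (`equality_bits`, `range_bits`, `encode_record`) are made up for this sketch; bits are listed from the highest bitmap position down to position 0.

```python
def equality_bits(x, b):
    """b bits for positions b-1 .. 0; only the bit at position x is set."""
    return [1 if j == x else 0 for j in range(b - 1, -1, -1)]


def range_bits(x, b):
    """b-1 bits for positions b-2 .. 0; bit j is set iff x <= j.
    Position b-1 would always be 1, so it is not materialized."""
    return [1 if x <= j else 0 for j in range(b - 2, -1, -1)]


def encode_record(value, base, digit_bits):
    """Encode one attribute value under base [b_n, ..., b_1].
    Returns one bit list per component, most significant component first."""
    rows = []
    for b in reversed(base):            # least significant digit first
        value, digit = divmod(value, b)
        rows.append(digit_bits(digit, b))
    return rows[::-1]


column = [4, 9, 1, 3, 8, 2, 10, 0, 7, 5, 6, 3]
for a in column:
    print(a, encode_record(a, [3, 4], equality_bits),
          encode_record(a, [3, 4], range_bits))
```

For the record with A = 9 (digits 2 and 1), this yields `[[1, 0, 0], [0, 0, 1, 0]]` under equality encoding and `[[0, 0], [1, 1, 0]]` under range encoding.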
An Equality-Encoded Base-<3, 4> Index

   A   A_2 A_1 | B_2^2 B_2^1 B_2^0 | B_1^3 B_1^2 B_1^1 B_1^0
   4    1   0  |   0     1     0   |   0     0     0     1
   9    2   1  |   1     0     0   |   0     0     1     0
   1    0   1  |   0     0     1   |   0     0     1     0
   3    0   3  |   0     0     1   |   1     0     0     0
   8    2   0  |   1     0     0   |   0     0     0     1
   2    0   2  |   0     0     1   |   0     1     0     0
  10    2   2  |   1     0     0   |   0     1     0     0
   0    0   0  |   0     0     1   |   0     0     0     1
   7    1   3  |   0     1     0   |   1     0     0     0
   5    1   1  |   0     1     0   |   0     0     1     0
   6    1   2  |   0     1     0   |   0     1     0     0
   3    0   3  |   0     0     1   |   1     0     0     0

A Range-Encoded Base-<3, 4> Index

   A   A_2 A_1 | B_2^1 B_2^0 | B_1^2 B_1^1 B_1^0
   4    1   0  |   1     0   |   1     1     1
   9    2   1  |   0     0   |   1     1     0
   1    0   1  |   1     1   |   1     1     0
   3    0   3  |   1     1   |   0     0     0
   8    2   0  |   0     0   |   1     1     1
   2    0   2  |   1     1   |   1     0     0
  10    2   2  |   0     0   |   1     0     0
   0    0   0  |   1     1   |   1     1     1
   7    1   3  |   1     0   |   0     0     0
   5    1   1  |   1     0   |   1     1     0
   6    1   2  |   1     0   |   1     0     0
   3    0   3  |   1     1   |   0     0     0

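Design Space of Bitmap Indexes

[Figure: the design space as a grid. One axis is the Bitmap Encoding Scheme (Equality, Range); the other is the Attribute Value Decomposition, with bases $\langle C \rangle$, $\langle b_2, b_1 \rangle$, $\langle b_3, b_2, b_1 \rangle$, ..., $\langle b, b, \ldots, b \rangle$ ($\log_b C$ times). The Value-List Index is marked at base $\langle C \rangle$ and the Bit-Sliced Index at base $\langle b, b, \ldots, b \rangle$.]
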
Space-Time Tradeoff Issues

[Figure: time versus space tradeoff curve, marking the space-optimal index, the time-optimal index, the optimal space-time tradeoff (knee), the time-optimal index under space constraint S, and the infeasible region.]

Analytical Cost Model

Cost metrics:
  Space: number of bitmaps.
  Time:  expected number of bitmap scans for a selection query evaluation.

• Uniform Query Distribution Assumption:
  Query space $= \{A \;\mathit{op}\; v : \mathit{op} \in \{\le, \ge, <, >, =, \ne\},\ 0 \le v < C\}$, where $C$ is the attribute cardinality.

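A small sketch of how the two metrics could be computed. The space count follows the slide examples (a base-<3, 4> index has 7 equality-encoded or 5 range-encoded bitmaps); whether an implementation stores fewer bitmaps for small bases is not stated here, so none is assumed. The slides do not give the per-query scan counts, so `scans` is a caller-supplied, hypothetical function, and the constant function used in the example is only a placeholder.

```python
OPS = ("<=", ">=", "<", ">", "=", "!=")


def space(base, encoding):
    """Number of materialized bitmaps for an index with the given base."""
    if encoding == "equality":
        return sum(base)                   # b_i bitmaps per component
    if encoding == "range":
        return sum(b - 1 for b in base)    # top bitmap of each component omitted
    raise ValueError("encoding must be 'equality' or 'range'")


def query_space(C):
    """All queries (A op v) with op in OPS and 0 <= v < C."""
    return [(op, v) for op in OPS for v in range(C)]


def expected_time(C, scans):
    """Mean of scans(op, v) over the query space, each query equally likely.
    scans is a hypothetical callback returning the bitmap scans of one query."""
    queries = query_space(C)
    return sum(scans(op, v) for op, v in queries) / len(queries)


print(space([3, 4], "equality"), space([3, 4], "range"))
print(len(query_space(100)))
print(expected_time(100, lambda op, v: 1))   # placeholder scan count
```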
Comparison of Encoding Schemes

[Figure: time (expected number of bitmap scans) versus space (number of bitmaps) for range-encoded and equality-encoded indexes. Panels: (a) C = 100, (b) C = 1000.]

Space-Time Tradeoff Results

• Class of n-component indexes:
  - Time-optimal index $= \langle \underbrace{2, 2, \ldots, 2}_{n-1}, \lceil C / 2^{n-1} \rceil \rangle$.
  - Space-optimal index $= \langle \underbrace{b - 1, \ldots, b - 1}_{n-r}, \underbrace{b, \ldots, b}_{r} \rangle$,
    where $b = \lceil \sqrt[n]{C} \rceil$ and $b^{r-1} (b - 1)^{n-r+1} < C \le b^{r} (b - 1)^{n-r}$.
• Time-optimal index = single-component index.
• Space-optimal index = maximal-component index.
• Knee index $\approx$ 2-component space-optimal index.

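The two closed-form bases can be evaluated directly. The sketch below assumes $C \ge 2$ and $1 \le n \le \lceil \log_2 C \rceil$, so that no base number degenerates to 1; the function names are illustrative. The integer loop for $b$ avoids floating-point error in the $n$-th root.

```python
def time_optimal_base(C, n):
    """Base <2, ..., 2, ceil(C / 2^(n-1))> with n-1 leading twos."""
    last = -(-C // 2 ** (n - 1))          # integer ceiling division
    return [2] * (n - 1) + [last]


def space_optimal_base(C, n):
    """Base <b-1, ..., b-1, b, ..., b> with n-r copies of b-1 and r copies of b."""
    b = 1
    while b ** n < C:                     # smallest b with b^n >= C
        b += 1
    for r in range(1, n + 1):
        low = b ** (r - 1) * (b - 1) ** (n - r + 1)
        high = b ** r * (b - 1) ** (n - r)
        if low < C <= high:
            return [b - 1] * (n - r) + [b] * r
    raise ValueError("no valid r found; check that C >= 2 and n >= 1")


for n in (1, 2, 3):
    print(n, time_optimal_base(100, n), space_optimal_base(100, n))
```

For example, with $C = 100$ and $n = 3$ this gives $b = 5$ and $r = 2$, i.e. base $\langle 4, 5, 5 \rangle$, whose product is exactly 100.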
Time-Optimal and Space-Optimal Indexes, C = 100

[Figure: time (expected number of bitmap scans) versus space (number of bitmaps) for the n-component time-optimal indexes and the n-component space-optimal indexes; points are labeled with the number of components n.]

Knee Index, C = 100

[Figure: time (expected number of bitmap scans) versus space (number of bitmaps) for the n-component space-optimal indexes against all indexes; points are labeled with the number of components n.]

Space-Time Tradeoff Results (cont.)

Time-Optimal Index under Space Constraint

• Search space for the optimal solution is large!
• A 2-step heuristic approach:
  1. Select an initial index that satisfies the space constraint.
  2. Iteratively adjust the base of the index to improve its time-efficiency.
• The heuristic approach is near-optimal.

Storage Schemes for Bitmap Compression

(N = number of tuples)

• Bitmap-Level Storage (BS): 6 files of N bits each
• Component-Level Storage (CS): 2 files of 3N bits each
• Index-Level Storage (IS): 1 file of 6N bits

Bitmap Compression

• Experimental data (from the TPC-D benchmark):
  - Attribute: Lineitem.Qty with C = 50 and 6M tuples.
  - Indexes: 6 n-component space-optimal indexes.
  - Compression code: zlib library (an LZ77 variant).
• Notation: cBS, cCS, cIS for the compressed storage schemes.

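To make the three storage schemes concrete, here is a toy sketch that lays out a small range-encoded index as BS, CS, and IS files and compresses each with Python's `zlib`. Several things are assumptions for illustration only: the base <4, 4> (chosen to give 6 bitmaps in 2 components), uniformly random data standing in for the TPC-D attribute, and a row-major bit order inside the CS and IS files. It is not the experimental setup from the talk.

```python
import random
import zlib


def range_encoded_index(values, base):
    """Components, most significant first; each holds bitmaps B^(b-2) .. B^0,
    and each bitmap is a list with one 0/1 entry per record."""
    components, rest = [], list(values)
    for b in reversed(base):                      # least significant digit first
        digits = [v % b for v in rest]
        rest = [v // b for v in rest]
        components.append([[1 if d <= x else 0 for d in digits]
                           for x in range(b - 2, -1, -1)])
    return components[::-1]


def pack(bits):
    """Pack 0/1 values into bytes, 8 per byte, most significant bit first."""
    out = bytearray((len(bits) + 7) // 8)
    for pos, bit in enumerate(bits):
        if bit:
            out[pos // 8] |= 0x80 >> (pos % 8)
    return bytes(out)


def interleave(bitmaps):
    """Row-major order (assumed): all bits of record 0, then record 1, ..."""
    return [bm[r] for r in range(len(bitmaps[0])) for bm in bitmaps]


def storage_files(components):
    all_bitmaps = [bm for comp in components for bm in comp]
    return {
        "BS": [pack(bm) for bm in all_bitmaps],              # one file per bitmap
        "CS": [pack(interleave(comp)) for comp in components],  # one per component
        "IS": [pack(interleave(all_bitmaps))],               # one file per index
    }


def compressed_size(files):
    return sum(len(zlib.compress(f)) for f in files)


random.seed(0)
N = 8000
values = [random.randrange(16) for _ in range(N)]
files = storage_files(range_encoded_index(values, [4, 4]))
for scheme, fs in files.items():
    raw = sum(len(f) for f in fs)
    print(scheme, len(fs), "file(s),", raw, "bytes raw,",
          compressed_size(fs), "bytes compressed")
```

All three layouts occupy the same raw space; only the compressed sizes differ, which is the quantity the cBS, cCS, and cIS notation refers to.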
Compressibility of Storage Schemes (relative to 1-component index under BS)

[Figure: compressibility versus number of components n (1 to 6) for BS/CS/IS, cBS, cCS, and cIS.]