How Achaeans Would Construct Columns in Troy Alekh Jindal, Felix Martin Schuhknecht, Jens Dittrich, Karen Khachatryan, Alexander Bunte
Number of Visas Received 1 0,75 0,5 0,25 0 Alekh Jens
Health Level 5 days before CIDR 100 percentage 75 50 25 0 Alekh Jens
Average Number of Slides per 20min talk 150 100 50 0 Alekh Jens
Number of Slides Actually Prepared 150 100 50 0 Alekh Jens
“
What is the problem?
Row-stores 8
Column-stores 9
OLTP OLAP 10
11
OLTP OLAP? Can we do efficient OLAP in Row-stores? 12
Any solutions out there?
C-Tables * Application User Database Query Processor Relations Physical Representation File 1 File 2 File 3 .... File n 14 * Nicolas Bruno. Teaching an Old Elephant New Tricks. CIDR 2009
C-Tables * Relation Customer name phone market_segment smith 2134 automobile john 3425 household kim 6756 furniture joe 9878 building mark 4312 building steve 2435 automobile jim 5766 household ian 8789 household 15 * Nicolas Bruno. Teaching an Old Elephant New Tricks. CIDR 2009
C-Tables * Sorted Relation Customer name phone market segment smith 2134 automobile steve 2435 automobile mark 4312 building joe 9878 building kim 6756 furniture john 3425 household jim 5766 household ian 8789 household Physical Table T market segment f v c 1 automobile 2 3 building 2 5 furniture 1 6 household 3 16 * Nicolas Bruno. Teaching an Old Elephant New Tricks. CIDR 2009
C-Tables * Sorted Relation Customer name phone market segment smith 2134 automobile steve 2435 automobile mark 4312 building joe 9878 building kim 6756 furniture john 3425 household jim 5766 household ian 8789 household Physical Table T market segment f v c 1 automobile 2 3 building 2 5 furniture 1 6 household 3 17 * Nicolas Bruno. Teaching an Old Elephant New Tricks. CIDR 2009
C-Tables * Sorted Relation Customer name phone market segment smith 2134 automobile steve 2435 automobile mark 4312 building joe 9878 building kim 6756 furniture john 3425 household jim 5766 household ian 8789 household Physical Table T market segment f v c 1 automobile 2 3 building 2 5 furniture 1 6 household 3 18 * Nicolas Bruno. Teaching an Old Elephant New Tricks. CIDR 2009
C-Tables * Sorted Relation Customer name phone market segment smith 2134 automobile steve 2435 automobile mark 4312 building joe 9878 building kim 6756 furniture john 3425 household jim 5766 household ian 8789 household Physical Table T market segment f v c 1 automobile 2 3 building 2 5 furniture 1 6 household 3 19 * Nicolas Bruno. Teaching an Old Elephant New Tricks. CIDR 2009
C-Tables * Sorted Relation Customer name phone market segment smith 2134 automobile steve 2435 automobile mark 4312 building joe 9878 building kim 6756 furniture john 3425 household jim 5766 household ian 8789 household Physical Table T market segment T phone T name f v c f v f v 1 automobile 2 1 2134 1 smith JOINS ! 3 building 2 2 2435 2 steve 5 furniture 1 3 4312 3 mark 6 household 3 4 9878 4 joe 5 6756 5 kim 6 3425 6 john 7 5766 7 jim 8 8789 8 ian 20 * Nicolas Bruno. Teaching an Old Elephant New Tricks. CIDR 2009
C-Tables * 10000 C-Table Standard Row 1000 Query Time (sec) 100 10 1 0.1 1 2 3 4 5 6 7 8 9 10 # referenced Attributes JOINS ! (a) Cardinality = 10 21 * Nicolas Bruno. Teaching an Old Elephant New Tricks. CIDR 2009
C-Tables * 10000 C-Table Standard Row 1000 Query Time (sec) 100 10 1 0.1 1 2 3 4 5 # referenced Attributes (b) Cardinality = 100 22 * Nicolas Bruno. Teaching an Old Elephant New Tricks. CIDR 2009
C-Tables * Application User Database Query Processor Relations Physical Representation File 1 File 2 File 3 .... File n 23 * Nicolas Bruno. Teaching an Old Elephant New Tricks. CIDR 2009
Column Index * Application User Database Query Processor Relations Physical Representation File 1 File 2 File 3 .... File n 24 * P . Larson et. al. SQL Server Column Store Indexes. SIGMOD 2011
Column Index * Relation Customer name phone market_segment smith 2134 automobile john 3425 household kim 6756 furniture joe 9878 building mark 4312 building steve 2435 automobile jim 5766 household ian 8789 household Physical Table Customer_trojan segment_ID attribute_ID blob_data 1 name smith, john, kim, joe 1 phone 2134, 3425, 6756, 9878 1 market_segment automobile, household, furniture, building 2 name mark, steve, jim, ian 2 phone 4312, 2435, 5766, 8789 2 market_segment building, automobile, household, household segment size = 4 25 * P . Larson et. al. SQL Server Column Store Indexes. SIGMOD 2011
Column Index * Relation Customer name phone market_segment smith 2134 automobile john 3425 household kim 6756 furniture joe 9878 building mark 4312 building steve 2435 automobile jim 5766 household ian 8789 household Physical Table Customer_trojan segment_ID attribute_ID blob_data 1 name smith, john, kim, joe 1 phone 2134, 3425, 6756, 9878 1 market_segment automobile, household, furniture, building 2 name mark, steve, jim, ian 2 phone 4312, 2435, 5766, 8789 2 market_segment building, automobile, household, household segment size = 4 26 * P . Larson et. al. SQL Server Column Store Indexes. SIGMOD 2011
Column Index * Relation Customer name phone market_segment smith 2134 automobile john 3425 household kim 6756 furniture joe 9878 building mark 4312 building steve 2435 automobile jim 5766 household ian 8789 household Physical Table Customer_trojan segment_ID attribute_ID blob_data 1 name smith, john, kim, joe 1 phone 2134, 3425, 6756, 9878 1 market_segment automobile, household, furniture, building 2 name mark, steve, jim, ian 2 phone 4312, 2435, 5766, 8789 2 market_segment building, automobile, household, household segment size = 4 27 * P . Larson et. al. SQL Server Column Store Indexes. SIGMOD 2011
Column Index * Application User Database Query Processor Relations Physical Representation File 1 File 2 File 3 .... File n DEEP CHANGES ! 28 * P . Larson et. al. SQL Server Column Store Indexes. SIGMOD 2011
Column Index * Application User Database Query Processor Relations Physical Representation File 1 File 2 File 3 .... File n LONG TIME ! 28 * P . Larson et. al. SQL Server Column Store Indexes. SIGMOD 2011
Column Index * Application User Database Query Processor Relations Physical Representation File 1 File 2 File 3 .... File n SOURCE CODE ! 28 * P . Larson et. al. SQL Server Column Store Indexes. SIGMOD 2011
Column Index * Application User Database Query Processor Relations Physical Representation File 1 File 2 File 3 .... File n 28 * P . Larson et. al. SQL Server Column Store Indexes. SIGMOD 2011
What do we propose?
Trojan Columns Application User Database Query Processor Relations UDF Storage Layer Physical Representation File 1 File 2 File 3 .... File n 31
Trojan Columns Relation Customer name phone market_segment smith 2134 automobile john 3425 household kim 6756 furniture joe 9878 building mark 4312 building steve 2435 automobile jim 5766 household ian 8789 household Physical Table Customer_trojan segment_ID attribute_ID blob_data 1 name smith, john, kim, joe 1 phone 2134, 3425, 6756, 9878 1 market_segment automobile, household, furniture, building 2 name mark, steve, jim, ian 2 phone 4312, 2435, 5766, 8789 2 market_segment building, automobile, household, household 32
Trojan Columns Relation Customer Tuple name phone market_segment Iterator write-UDF (a) Convert row (c) Get next smith 2134 automobile tuples into blobs row data john 3425 household kim 6756 furniture Data Data joe 9878 building Accesso Parser mark 4312 building r (b) Store blob data steve 2435 automobile jim 5766 household ian 8789 household Physical Table Customer_trojan segment_ID attribute_ID blob_data 1 name smith, john, kim, joe 1 phone 2134, 3425, 6756, 9878 1 market_segment automobile, household, furniture, building 2 name mark, steve, jim, ian 2 phone 4312, 2435, 5766, 8789 2 market_segment building, automobile, household, household 32
Trojan Columns Relation (g)End of table Customer Tuple name phone market_segment Iterator read-UDF (f) Fetch (e) Reconstruct smith 2134 automobile blob data row tuples john 3425 household kim 6756 furniture Data Data joe 9878 building Accessor Parser mark 4312 building (d) Parse blob data steve 2435 automobile jim 5766 household ian 8789 household Physical Table Customer_trojan segment_ID attribute_ID blob_data 1 name smith, john, kim, joe 1 phone 2134, 3425, 6756, 9878 1 market_segment automobile, household, furniture, building 2 name mark, steve, jim, ian 2 phone 4312, 2435, 5766, 8789 2 market_segment building, automobile, household, household 33
Recommend
More recommend