Design Tradeoffs of Data Access Methods Manos Athanassoulis and - PowerPoint PPT Presentation

Design Tradeoffs of Data Access Methods Manos Athanassoulis and Stratos Idreos

declarative interface ask ‘’what’’ you want the system decides “how” to best store and access data db system

applications api/sql algorithms/operators cpu data data memory hierarchy data data system kernel: a collection of access methods

an access method is a way to store and access data layout structure navigation

an access method is a way to store and access data layout e.g., array structure unordered navigation scan

an access method is a way to store and access data layout e.g., array e.g., array structure unordered ordered navigation scan binary search

TREES HASH TABLES SLOTTED PAGES COLUMN-GROUPS TRIES COLUMNS ARRAYS LOG-STRUCTURED TREES MULTI-DIMENTIONAL

isn’t this a solved problem?

isn’t this a solved problem? access method design is now as important as ever

2.5 data dai grows y 2 [IB today data systems are nearly everywhere… continuous need for new and tailored data systems

2.5 data dai grows y 2 [IB today data systems are nearly everywhere… continuous need for new and tailored data systems tomorrow

disk memory A B C D

disk memory A BC option1 row-store A B C D engine

disk memory X X X A BC option1 row-store A B C D engine A option2 column- store engine

how many more new access methods to design?

how many more new access methods to design? it is not about radical new designs only! design, tuning and variations

say the workload (read/write ratio) shifts (e.g., due to app features): should we use a different data layout for base data - diff updates? should we use different indexing or no indexing?

say the workload (read/write ratio) shifts (e.g., due to app features): should we use a different data layout for base data - diff updates? should we use different indexing or no indexing? say we buy new hardware X (flash/memory): should we change the size of b-tree nodes? should we change the merging strategy in our LSM-tree?

say the workload (read/write ratio) shifts (e.g., due to app features): should we use a different data layout for base data - diff updates? should we use different indexing or no indexing? say we buy new hardware X (flash/memory): should we change the size of b-tree nodes? should we change the merging strategy in our LSM-tree? say we want to improve response time: would it be beneficial if we would buy faster flash disks? would it be beneficial if we buy more memory?

conflicting goals moving target (hardware and requirements change continuously and rapidly) application requirements budget performance hardware energy profile

move from design based on intuition & experience only to a more formal and systematic way to design systems

goals and structure of the tutorial structure design space & tradeoffs highlight open problems towards easy to design methods

goals and structure of the tutorial structure design space & tradeoffs highlight open problems towards easy to design methods basic tradeoffs goals & vision ~30 min [slides available at daslab.seas.harvard.edu] design ~40 min space

target audience = beginner to expert no new designs but new connections & structure

NOT JUST SQL + operating systems, no sql, sciences

hardware is a big drive of access method (re)design (and it continuously evolves)

CPU faster registers ~1ns on chip cache ~10ns SRAM on board cache memory wall memory DRAM ~100ns disk cheaper it is not just memory and disk we want to move as few data items as possible all the way up to the CPU

random access & page-based access need to only read x … but have to read all of page 1 data value x … page1 page2 page3

what is the perfect access method?

what is the perfect access method? no single answer; it depends

what is the perfect access method? no single answer; it depends what is the application read patterns write patterns reads/writes ratios hardware (CPU, memory, etc) SLAs

a perfect access method for reads (point queries) find(x) oracle x

a perfect access method for reads (point queries) find(x) reads oracle updates memory x

a perfect access method for reads (point queries) but with no memory overhead binary search to find(x) sorted

a perfect access method for reads (point queries) but with no memory overhead reads updates binary search to find(x) memory sorted

a perfect access method for writes (point writes) update(x) x x x update log

a perfect access method for writes (point writes) reads updates update(x) memory x x x update log

design space it all starts with how we store data every bit matters

basic tradeoffs R eads U pdates M emory RUM conjecture, EDBT 2016

Read R eads min max min min U pdates Update Memory M emory

Read R eads min max min min U pdates Update Memory M emory read-op(mized update & memory memory-op(mized op-mized max max max min min min min min min

Fractional Partitioning Cascading Fractional Log-structured Cascading Updates Differential Logarithmic Sparse Updates Design Indexing study basic access methods design components how they affect the RUM tradeoffs how are they combined in existing access methods Read min max min min Update Memory

Fractional Partitioning Cascading Fractional Log-structured Cascading Updates Differential Logarithmic Sparse Updates Design Indexing study basic access methods design components how they affect the RUM tradeoffs how are they combined in existing access methods Read min Part 2 max min min Update Memory

can we make it easy to design/tune access methods?

… disk memory flash 1 easily utilize past concepts

35 28 # of citations 21 14 7 P. O’Neil, E. Cheng, D. Gawlick, E, O'Neil The log-structured merge-tree (LSM-tree) Acta Informatica 33 (4): 351–385, 1996 0 1996 1999 2002 2005 2008 2011 2014 2 do not miss out on cool ideas and concepts

35 28 # of citations 21 Google publishes BigTable 14 7 P. O’Neil, E. Cheng, D. Gawlick, E, O'Neil The log-structured merge-tree (LSM-tree) Acta Informatica 33 (4): 351–385, 1996 0 1996 1999 2002 2005 2008 2011 2014 2 do not miss out on cool ideas and concepts

move from design based on intuition & experience only to a more formal and systematic way to design systems

construct access methods out of basic components (and their tradeoffs) e.g., scan*, tree*, bloom filters, bitmaps, hash tables, etc.

data system designer INTERACTIVE DATA SYSTEM DESIGN/TUNING/TESTING

possible opportunities once we have a “complete” & navigable set of design modules learn from: s/w engineering, modular dbs, compilers, goes all the way back to basic texts

possible opportunities once we have a “complete” & navigable set of design modules learn from: s/w engineering, modular dbs, compilers, goes all the way back to basic texts easy to change/adapt easy to design

possible opportunities once we have a “complete” & navigable set of design modules learn from: s/w engineering, modular dbs, compilers, goes all the way back to basic texts easy to change/adapt easy to design universal development platform testing

possible opportunities once we have a “complete” & navigable set of design modules learn from: s/w engineering, modular dbs, compilers, goes all the way back to basic texts easy to change/adapt easy to design discovery of universal new combinations development of design options platform testing

Part 2: observe how papers fill in gaps in the structure and existing open gaps

Design Tradeoffs of Data Access Methods Manos Athanassoulis and - PowerPoint PPT Presentation

Design Tradeoffs of Data Access Methods Manos Athanassoulis and Stratos Idreos declarative interface ask what you want the system decides how to best store and access data db system applications api/sql

Area and Time Tradeoffs in FPGAs Examining the concept of area/time tradeoffs in FPGA design,

REDD+ within the WEL nexus Opportunities and tradeoffs Kristy Graham May 2011 Outline What

Space/time tradeoffs; dynamic programming; y g g transform and conquer 1. Space/time

Storage Tradeoffs in a Collaborative Backup Service for Mobile Devices Ludovic Courts,

Design Tradeoffs in Query Processing and Online Architectures T. Yang 293S 2017 Content

Tradeoffs in XML Database Compression James Cheney University of Edinburgh Data Compression

Evolving Data Access Evolving Data Access Evolving Data Access Evolving Data Access

Tradeoffs in Infinite Games Martin Zimmermann Saarland University May 15th, 2018 Scientific

AUTOMATIC TRADEOFFS: ACCURACY AND ENERGY Stephanie Forrest

Quantum Time-Space Tradeoffs for Deciding Systems of Linear Inequalities Robert Spalek

DESIGN & ANALYSIS METHODS DESIGN & ANALYSIS METHODS FOR DESIGN & ANALYSIS METHODS

Small is Beautiful: the design of Lua Roberto Ierusalimschy PUC-Rio Language design many

CSE 510 Web Data Engineering Data Access Object (DAO) Java Design Pattern UB CSE 510 Web Data

Solar Sailing Kinetic Energy Impactor (KEI) Bernd Dachwald Ralph Kahle Mission Design Tradeoffs

Meshless Meshless Methods Meshless Meshless Methods Methods Methods Contents

Access Control and Protection Overview Access control: What and Why Abstract Models of

FAIR Project Team Workshop PMO - Quality Assurance and BINP November 5th, 2018 Herbert Schwarz

Verification and Validation Steven Zeil February 13, 2013 Verification and Validation

Testing 101: How to Bulletproof Your Deployments with Automated Code Testing Chris Wiegman

Software testing Software Testing Introduction Testing levels Automated testing Principles and

Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #8

RCE and FELIX Raw Data Access for ProtoDUNE Discussion Tom Junk, with input from JJ Russell,

Objects and nesting and pointers, oh my! If we have a pointer variable ptr , we access the

Classes Ch 10.1 - 10.3 class vs array Arrays group together similar data types (any amount you

Sambuz

Useful Links

Newsletter

Mail Us

Design Tradeoffs of Data Access Methods Manos Athanassoulis and - PowerPoint PPT Presentation

Design Tradeoffs of Data Access Methods Manos Athanassoulis and Stratos Idreos declarative interface ask what you want the system decides how to best store and access data db system applications api/sql

Area and Time Tradeoffs in FPGAs Examining the concept of area/time tradeoffs in FPGA design,

REDD+ within the WEL nexus Opportunities and tradeoffs Kristy Graham May 2011 Outline What

Space/time tradeoffs; dynamic programming; y g g transform and conquer 1. Space/time

Storage Tradeoffs in a Collaborative Backup Service for Mobile Devices Ludovic Courts,

Design Tradeoffs in Query Processing and Online Architectures T. Yang 293S 2017 Content

Tradeoffs in XML Database Compression James Cheney University of Edinburgh Data Compression

Evolving Data Access Evolving Data Access Evolving Data Access Evolving Data Access

Tradeoffs in Infinite Games Martin Zimmermann Saarland University May 15th, 2018 Scientific

AUTOMATIC TRADEOFFS: ACCURACY AND ENERGY Stephanie Forrest

Quantum Time-Space Tradeoffs for Deciding Systems of Linear Inequalities Robert Spalek

DESIGN &amp; ANALYSIS METHODS DESIGN &amp; ANALYSIS METHODS FOR DESIGN &amp; ANALYSIS METHODS

Small is Beautiful: the design of Lua Roberto Ierusalimschy PUC-Rio Language design many

CSE 510 Web Data Engineering Data Access Object (DAO) Java Design Pattern UB CSE 510 Web Data

Solar Sailing Kinetic Energy Impactor (KEI) Bernd Dachwald Ralph Kahle Mission Design Tradeoffs

Meshless Meshless Methods Meshless Meshless Methods Methods Methods Contents

Access Control and Protection Overview Access control: What and Why Abstract Models of

FAIR Project Team Workshop PMO - Quality Assurance and BINP November 5th, 2018 Herbert Schwarz

Verification and Validation Steven Zeil February 13, 2013 Verification and Validation

Testing 101: How to Bulletproof Your Deployments with Automated Code Testing Chris Wiegman

Software testing Software Testing Introduction Testing levels Automated testing Principles and

Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #8

RCE and FELIX Raw Data Access for ProtoDUNE Discussion Tom Junk, with input from JJ Russell,

Objects and nesting and pointers, oh my! If we have a pointer variable ptr , we access the

Classes Ch 10.1 - 10.3 class vs array Arrays group together similar data types (any amount you

Sambuz

Useful Links

Newsletter

Mail Us

DESIGN & ANALYSIS METHODS DESIGN & ANALYSIS METHODS FOR DESIGN & ANALYSIS METHODS