

  1. Data Management Systems
• Storage Management
• Memory hierarchy
• Segments and file storage
• Database buffer cache
• Storage techniques in context: Cloud native databases (Snowflake)
Gustavo Alonso, Institute of Computing Platforms, Department of Computer Science, ETH Zürich
Storage in Context 1

  2. Architecture of a database (layers, top to bottom)
• Application: queries, transactions (SQL) over relations and views
• Logical view: logical data (tables, schemas)
• Record interface: logical records (tuples), access paths
• Record access: physical records, physical data in memory
• Page access: page structure, pages in memory
• File access: storage allocation; blocks, files, segments
• Cloud storage and file system (Amazon S3)

  3. Goals of the lecture
• Put the previous material regarding storage in perspective
• Use a modern example of a database (Snowflake)
• Illustrate how hardware and the cloud change things
• Illustrate the many possibilities available to a database engine in terms of storing and managing physical data
• Start introducing different database engine designs specialized for concrete applications

  4. Snowflake
• A data warehouse specialized for analytical queries, developed entirely on the cloud (cloud native)
• Separates compute (nodes running VMs with a local disk) from storage (Amazon’s S3)
Paper: https://dl.acm.org/doi/10.1145/2882903.2903741
Documentation: https://docs.snowflake.com/en/user-guide-intro.html

  5. Amazon’s S3
• Simple Storage Service (S3) is an object storage service in the cloud that acts as the persistent storage available to applications
• Unlike conventional local disks or distributed file systems!
• Object store (key-value) [object = file]
• HTTP(S) PUT/GET/DELETE interface
• No update in place (objects must be written in full)
• Can read parts (ranges) of an object instead of the whole object
• High CPU overhead (because of HTTP)
• I/O is extra expensive (network bandwidth, latency, interface)
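The interface above can be sketched as a minimal in-memory object store. The class and method names here are illustrative, not S3's actual API (a real client would issue HTTP PUT/GET/DELETE, e.g. via boto3); the point is that writes always replace a whole object, while reads may fetch a byte range:

```python
class ObjectStore:
    """Toy in-memory sketch of an S3-like store (hypothetical API).

    Objects are immutable: put() always writes a whole object (no update
    in place), but get() can read just a byte range of an object.
    """

    def __init__(self):
        self._objects = {}  # key -> bytes

    def put(self, key, data):
        # Whole-object write: replaces any previous version entirely.
        self._objects[key] = bytes(data)

    def get(self, key, byte_range=None):
        data = self._objects[key]
        if byte_range is not None:          # like an HTTP Range header
            start, end = byte_range
            return data[start:end + 1]      # range ends are inclusive
        return data

    def delete(self, key):
        del self._objects[key]


store = ObjectStore()
store.put("table/part-001", b"columnar bytes...")
header = store.get("table/part-001", byte_range=(0, 7))  # partial read
```

Range reads are what later lets a query engine fetch only a micro-partition's header, or only the columns it needs, instead of the whole file.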

  6. Something familiar
• A virtual warehouse is a collection of worker nodes (EC2 instances in Amazon)
• Each worker node has a cache on its local disk where it stores the objects (table files or parts thereof) accessed before
• The cache uses a simple LRU replacement policy
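An LRU replacement policy like the one described can be sketched in a few lines. This is a generic illustration, not Snowflake's implementation; for simplicity the capacity is counted in objects rather than bytes:

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU cache: evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # key -> cached object, oldest first

    def get(self, key):
        if key not in self._entries:
            return None                     # cache miss: fetch from S3
        self._entries.move_to_end(key)      # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used


cache = LRUCache(capacity=2)
cache.put("p1", "micro-partition 1")
cache.put("p2", "micro-partition 2")
cache.get("p1")                        # touch p1, so p2 is now the LRU entry
cache.put("p3", "micro-partition 3")   # full: evicts p2
```

A real worker cache would key entries by object name and byte range and account for file sizes, but the replacement logic is the same.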

  7. Micro-partitions
• Micro-partitions are Snowflake’s name for extents
• What is interesting is how they are organized to facilitate query processing
• Size ranges between 50 and 500 MB (before compression; the data is always compressed when in S3)
• Each micro-partition has metadata describing what is inside
• The metadata can be read without reading the whole micro-partition
• The metadata is used to read just the part of the micro-partition that is relevant
• Data in the micro-partition is stored in columnar form (by columns, not by rows)

  8. https://docs.snowflake.com/en/user-guide/tables-clustering-micropartitions.html

  9. A not so uncommon design
• Snowflake combines many tricks used before in different contexts:
• Horizontal partitioning of the tables (by row): allows the table to be read in parallel, different parts of the table to be placed on different processing nodes, and, if organized accordingly, only the needed tuples to be read instead of the whole table
• Columnar format (storage by column): the preferred storage format for analytics; improves cache locality, enables vectorized processing, facilitates projection operations (SQL), and allows processing only the part of the table that is relevant
• Storage-level processing to read only the part of the file that is needed (helped by the micro-partitions and the columnar format)
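The benefit of the columnar format for projections can be sketched as follows; the table and column names are made up for illustration:

```python
# Row layout: a projection must touch every field of every row.
rows = [
    {"id": 1, "name": "ann", "age": 34},
    {"id": 2, "name": "bob", "age": 51},
    {"id": 3, "name": "eve", "age": 47},
]
ages_from_rows = [r["age"] for r in rows]  # scans all columns of all rows

# Columnar layout: each column is stored contiguously, so
# SELECT age FROM T reads only the "age" array.
columns = {
    "id":   [1, 2, 3],
    "name": ["ann", "bob", "eve"],
    "age":  [34, 51, 47],
}
ages = columns["age"]  # projection touches one column only
```

On S3 the columnar arrays map to distinct byte ranges of the micro-partition file, so the unneeded columns are never even transferred over the network.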

  10. Pruning based on metadata
• The header for a micro-partition contains information about the data.

  SELECT * FROM T WHERE age > 45

• If the header contains the min and max age in the data, we can decide that we do not need that micro-partition simply by looking at the header
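A minimal sketch of min/max pruning for the query above; the metadata layout and file names are hypothetical, not Snowflake's actual header format:

```python
# Per-micro-partition metadata: min/max per column (hypothetical layout).
partitions = [
    {"file": "part-001", "age_min": 18, "age_max": 40},
    {"file": "part-002", "age_min": 30, "age_max": 60},
    {"file": "part-003", "age_min": 50, "age_max": 80},
]

def prune_for_age_gt(partitions, threshold):
    """Keep only partitions that might contain rows with age > threshold."""
    return [p["file"] for p in partitions if p["age_max"] > threshold]

# SELECT * FROM T WHERE age > 45:
# part-001 (max age 40) is skipped without ever being fetched from S3.
to_read = prune_for_age_gt(partitions, 45)
```

Note that pruning is conservative: a surviving partition may still contain no matching rows, but a pruned one is guaranteed to contain none.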

  11. Pruning
• Snowflake does not use indexes:
• Indexes require a lot of space
• Indexes induce random accesses (very bad for slow storage like S3)
• Indexes need to be maintained and selected correctly
• Instead, it uses the metadata to store information that allows micro-partitions to be filtered (min/max, #distinct values, #nulls, bloom filters, etc.)
• The metadata is much smaller than an index and easier to load than a whole index
• By splitting a table into potentially many micro-partitions, it can significantly optimize the data movement to and from storage

  12. Writing to disk
• S3 does not support update in place; a file is replaced in its entirety (immutable)
• Snowflake uses this feature to implement snapshots of the data (like shadow paging):
• When a micro-partition is modified, a new file is written
• The old micro-partition can be kept or discarded
• Allows time travel (read the data as it was in the past, up to 90 days) and provides fault tolerance (the old data can be recovered from the old micro-partitions)
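The snapshotting scheme above can be sketched as a versioned catalog over immutable files; the class, file names, and structure are illustrative, not Snowflake's internals:

```python
class VersionedTable:
    """Toy shadow-paging catalog: every write creates a new immutable
    file, and each version records which files make up the table."""

    def __init__(self):
        self.files = {}      # file name -> contents (never overwritten)
        self.versions = []   # version i -> list of file names

    def write(self, old_file, new_file, contents):
        """Replace old_file with a freshly written new_file."""
        self.files[new_file] = contents
        current = self.versions[-1] if self.versions else []
        self.versions.append(
            [f for f in current if f != old_file] + [new_file]
        )

    def read(self, version=-1):
        """Read the table at a given version (time travel)."""
        return [self.files[f] for f in self.versions[version]]


t = VersionedTable()
t.write(None, "part-001.v1", "rows A")
t.write("part-001.v1", "part-001.v2", "rows A'")  # an update writes a new file

latest = t.read()          # sees the new micro-partition
past = t.read(version=0)   # old snapshot is still readable (time travel)
```

Because old files are never modified, recovering a past state or recovering from a failure is just a matter of pointing the catalog at an earlier version list.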
