Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2014/15

Lecture II: Indexing

Indexing: Part I of this course

Database File Organization and Indexing
• Remember: Database tables are implemented as files of records:
  – A file consists of one or more pages.
  – Each page contains one or more records.
  – Each record corresponds to one tuple in a table.
• File organization: Method of arranging the records in a file when the file is stored on disk.
• Indexing: Building data structures that organize data records on disk in (multiple) ways to optimize search and retrieval operations on them.

File Organization
• Given a query such as the following:
  [Figure: a query selecting records from the CUSTOMERS table by ZIPCODE]
• How should we organize the storage of our data files on disk such that we can evaluate this query efficiently?

Heap Files?
• A heap file stores records in no particular order.
• Therefore, the CUSTOMERS table consists of records that are randomly ordered in terms of their ZIPCODE.
• The entire file must be scanned, because the qualifying records could appear anywhere in the file and we don’t know in advance how many such records exist.
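A minimal sketch of such a full scan, assuming a heap file modeled as a list of pages and each page as a list of (zipcode, name) records. The sample data, the function name heap_scan, and the lower bound 8000 are hypothetical; only the upper bound 8999 comes from the slides.

```python
# Hypothetical heap file: a list of pages, each page a list of (zipcode, name) records.
heap_file = [
    [(8050, "Ann"), (4001, "Bob")],
    [(9000, "Cy"), (8400, "Dee")],
    [(1200, "Eve"), (8999, "Fay")],
]

def heap_scan(pages, lo, hi):
    """Return all records with lo <= zipcode < hi and the number of pages read."""
    hits, pages_read = [], 0
    for page in pages:          # no order is known, so every page must be read
        pages_read += 1
        hits.extend(rec for rec in page if lo <= rec[0] < hi)
    return hits, pages_read

hits, pages_read = heap_scan(heap_file, 8000, 8999)
print(hits, pages_read)
```

Regardless of how few records qualify, the number of pages read equals the file size.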

Sorted Files?
• Sort the CUSTOMERS table in ZIPCODE order.
• Then use binary search to find the first qualifying record, and scan further as long as ZIPCODE < 8999.

Are Sorted Files good enough?
• Scan phase: We get sequential access during this phase.
• Search phase: We need to read log_2 N records during this phase (N: total number of records in the CUSTOMERS table).
  – We need to fetch as many pages as are required to access these records.
  – Binary search involves unpredictable jumps that make prefetching difficult.
• What about insertions and deletions?
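The search and scan phases on a sorted file can be sketched as follows. This is an illustration only: the records, the helper name range_search, and the lower bound 8000 are assumptions; the scan condition ZIPCODE < 8999 follows the slides. The last line estimates the number of binary-search probes for a hypothetical table of one million records.

```python
from bisect import bisect_left
import math

# Hypothetical CUSTOMERS records as (zipcode, name), kept sorted by zipcode.
records = sorted([(8050, "Ann"), (4001, "Bob"), (9000, "Cy"),
                  (8400, "Dee"), (1200, "Eve"), (8999, "Fay")])
keys = [zipcode for zipcode, _ in records]

def range_search(records, keys, lo, hi):
    """Binary search for the first record with key >= lo, then scan while key < hi."""
    start = bisect_left(keys, lo)      # search phase: about log_2 N probes
    hits = []
    for rec in records[start:]:        # scan phase: sequential access
        if rec[0] >= hi:
            break
        hits.append(rec)
    return hits

hits = range_search(records, keys, 8000, 8999)
probes = math.ceil(math.log2(1_000_000))   # probes needed for N = 1,000,000
print(hits, probes)
```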

Tree-based Indexing
• Can we reduce the number of pages fetched during the search phase?
• Tree-based indexing:
  – Arrange the data entries in sorted order by search key value (e.g., ZIPCODE).
  – Add a hierarchical search data structure on top that directs searches for given key values to the correct page of data entries.
  – Since the index data structure is much smaller than the data file itself, the binary search is expected to fetch a smaller number of pages.
  – Two alternative approaches: ISAM and B+-tree.

ISAM: Indexed Sequential Access Method
• All nodes are of the size of a page.
  – hundreds of entries per page
  – large fan-out, low depth
• Search cost ~ log_{fan-out} N
• Key k_i serves as a “separator” for the pages pointed to by p_{i-1} and p_i.
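To see why a large fan-out keeps the tree shallow, the following sketch counts how many index levels are needed to reach a given number of data pages. The function name levels_needed and the sample numbers are hypothetical; an integer loop is used instead of a floating-point logarithm to avoid rounding issues.

```python
def levels_needed(num_data_pages, fan_out):
    """Smallest number of index levels such that fan_out ** levels >= num_data_pages."""
    levels, reach = 0, 1
    while reach < num_data_pages:
        reach *= fan_out       # each extra level multiplies the reachable pages
        levels += 1
    return levels

binary = levels_needed(1_000_000, 2)     # fan-out 2 behaves like binary search
wide = levels_needed(1_000_000, 200)     # fan-out 200: a few hundred entries per page
print(binary, wide)
```

With one million data pages, a fan-out of 2 needs 20 levels, while a fan-out of 200 needs only 3.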

ISAM Index Structure
• Index pages stored at non-leaf nodes
• Data pages stored at leaf nodes
  – Primary data pages & Overflow data pages

Updates on ISAM Index Structure
• ISAM index structure is inherently static.
  – Deletion is not a big problem:
    • Simply remove the record from the corresponding data page.
    • If the removal makes an overflow data page empty, remove that overflow data page.
    • If the removal makes a primary data page empty, keep it as a placeholder for future insertions.
    • Don’t move records from overflow data pages to primary data pages even if the removal creates space for doing so.
  – Insertion requires more effort:
    • If there is space in the corresponding primary data page, insert the record there.
    • Otherwise, an overflow data page needs to be added.
    • Note that the overflow pages will violate the sequential order.
→ ISAM indexes degrade after some time.
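The update rules above can be sketched for a single leaf position. This is a simplified model under stated assumptions: the class name IsamLeaf, the page capacity of two entries, and the starting keys 40 and 46 are hypothetical, and the static index pages above the leaf are not modeled at all.

```python
CAPACITY = 2  # entries per page (assumed small to keep the demo readable)

class IsamLeaf:
    """One primary data page plus its chain of overflow data pages."""

    def __init__(self, keys):
        self.primary = sorted(keys)   # primary data page
        self.overflow = []            # overflow chain: a list of pages (lists)

    def insert(self, key):
        if len(self.primary) < CAPACITY:
            self.primary.append(key)  # space in the primary page: insert there
            self.primary.sort()
            return
        if not self.overflow or len(self.overflow[-1]) == CAPACITY:
            self.overflow.append([])  # otherwise add an overflow page when needed
        self.overflow[-1].append(key) # overflow entries are not kept in key order

    def delete(self, key):
        if key in self.primary:
            self.primary.remove(key)  # an empty primary page stays as a placeholder
            return
        for i, page in enumerate(self.overflow):
            if key in page:
                page.remove(key)
                if not page:          # an empty overflow page is removed
                    del self.overflow[i]
                return

leaf = IsamLeaf([40, 46])
for key in (48, 41, 42):
    leaf.insert(key)
print(leaf.primary, leaf.overflow)
leaf.delete(42)
print(leaf.primary, leaf.overflow)
```

Note that deleting from the primary page never pulls entries back from the overflow chain, which is why lookups slow down over time.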

ISAM Example
• Assume: Each node can hold two entries.

After Inserting 23*, 48*, 41*, 42*
• Overflow data pages had to be added.

… Then Deleting 42*, 51*, 97*
• 51 still appears in the index page, but not in the data page.
• The empty overflow data page is removed.

ISAM: Overflow Pages & Locking
• The non-leaf pages that hold the index data are static; updates affect only the leaf pages.
  → May lead to long overflow chains.
• Leave some free space during index creation.
  → Typically ~ 20% of each page is left free.
• Since ISAM indexes are static, index pages need not be locked during index access.
  – Locking can be a serious bottleneck in dynamic tree indexes (particularly near the root node).
• ISAM may be the index of choice for relatively static data.

B+-trees: A Dynamic Index Structure
• The B+-tree is derived from the ISAM index, but is fully dynamic with respect to updates.
  – No overflow chains; B+-trees remain balanced at all times.
  – Gracefully adjusts to insertions and deletions.
  – Minimum occupancy for all B+-tree nodes (except the root): 50% (typical occupancy: 67%).
  – Original version:
    • B-tree: R. Bayer and E. M. McCreight, “Organization and Maintenance of Large Ordered Indexes”, Acta Informatica, vol. 1, no. 3, September 1972.

B+-trees: Basics
• B+-trees look like ISAM indexes, where
  – leaf nodes are, generally, not in sequential order on disk
  – leaves are typically connected to form a doubly-linked list
  – leaves may contain actual data (like the ISAM index) or just references to data pages (e.g., record ids (rids))
    • We will assume the latter case, since it is the more common one.
  – each B+-tree node contains between d and 2d entries (d is the order of the B+-tree; the root is the only exception).

Searching a B+-tree
• Function search(k) returns a pointer to the leaf node that contains potential hits for search key k.
• Node page layout: [figure]
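A possible shape for search(k), given as a sketch rather than the algorithm from the slides. The Node class, its field names, and the sample keys are assumptions. Inner nodes hold sorted separator keys plus one more child pointer than keys, and a key equal to a separator is routed to the right of it.

```python
from bisect import bisect_right

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys            # sorted separators (inner node) or search keys (leaf)
        self.children = children    # None marks a leaf; otherwise len(keys) + 1 children

def search(node, k):
    """Descend from node to the leaf that contains potential hits for key k."""
    while node.children is not None:
        # bisect_right picks the child whose key range covers k
        node = node.children[bisect_right(node.keys, k)]
    return node

# Hypothetical two-level tree with three leaves.
leaf_a, leaf_b, leaf_c = Node([10, 15]), Node([20, 27]), Node([33, 37])
root = Node([20, 33], [leaf_a, leaf_b, leaf_c])
print(search(root, 27).keys)
```

The function returns the leaf even when k is absent, so the caller still has to check the leaf's entries.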

Insertion to a B+-tree: Overview
• The B+-tree needs to remain balanced after every update (i.e., every root-to-leaf path must be of the same length).
  → We cannot create overflow pages.
• Sketch of the insertion procedure for entry <k, p> (key value k pointing to data page p):
  1. Find leaf page n where we would expect the entry for k.
  2. If n has enough space to hold the new entry (i.e., at most 2d-1 entries in n), simply insert <k, p> into n.
  3. Otherwise, node n must be split into n and n’, and a new separator has to be inserted into the parent of n.
• Splitting happens recursively and may eventually lead to a split of the root node (increasing the height of the tree).

Insertion to a B+-tree: Example
• Insert new entry with key 4222.
  – Enough space in node 3, simply insert without split.
  – Keep entries sorted within nodes.

Insertion to a B+-tree: Example
• Insert key 6330.
  – Must split node 4.
  – New separator goes into node 1 (including pointer to new page).

Insertion to a B+-tree: Example
• After 8180, 8245, insert key 4104.
  – Must split node 3.
  – Node 1 overflows => split it!
  – New separator goes into root.
• Note: Unlike during a leaf split, the separator key does not remain in the inner node that was split; it moves up to the parent.

Insertion to a B+-tree: Root Node Split
• Splitting starts at the leaf level and continues upward as long as index nodes are fully occupied.
• Eventually, this can lead to a split of the root node:
  – Split like any other inner node.
  – Use the separator to create a new root.
• The root node is the only node that may have an occupancy of less than 50%.
• This is the only situation where the tree height increases.
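The insertion procedure, including leaf splits, inner-node splits, and the root split, can be sketched as below. This is not the pseudocode from the slides but a simplified illustration: the Node class and function names are assumptions, leaves store only keys (rids are omitted), and the order D = 1 is chosen to force splits quickly. On a leaf split the separator is copied up; on an inner split it moves up.

```python
from bisect import bisect_right, insort

D = 1  # order d (assumed): every node holds at most 2 * D keys

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children    # None marks a leaf

def _insert(node, k):
    """Insert k below node. Return (separator, new_right_node) on a split, else None."""
    if node.children is None:                   # leaf node
        insort(node.keys, k)                    # keep entries sorted within the node
        if len(node.keys) <= 2 * D:
            return None
        right = Node(node.keys[D:])             # leaf split: D keys left, D + 1 right
        node.keys = node.keys[:D]
        return right.keys[0], right             # separator stays in the right leaf
    i = bisect_right(node.keys, k)
    split = _insert(node.children[i], k)
    if split is None:
        return None
    sep, new_child = split
    node.keys.insert(i, sep)                    # new separator goes into the parent
    node.children.insert(i + 1, new_child)
    if len(node.keys) <= 2 * D:
        return None
    mid = node.keys[D]                          # inner split: middle key moves up
    right = Node(node.keys[D + 1:], node.children[D + 1:])
    node.keys, node.children = node.keys[:D], node.children[:D + 1]
    return mid, right

def insert(root, k):
    """Insert k and return the (possibly new) root."""
    split = _insert(root, k)
    if split is None:
        return root
    sep, right = split
    return Node([sep], [root, right])           # root split: tree height grows by one

root = Node([])
for key in [1, 2, 3, 4, 5]:
    root = insert(root, key)
print(root.keys, [child.keys for child in root.children])
```

Because a new root is created only when the old root splits, every root-to-leaf path grows at the same moment and the tree stays balanced.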

Insertion Algorithm
[Figure: insertion algorithm pseudocode, with annotations 2d+1 and d+1]

• insert(k, rid) is called from outside.
• Note how leaf node entries point to rids, while inner nodes contain pointers to other B+-tree nodes.

Deletion from a B+-tree
• If a node is sufficiently full (i.e., contains at least d+1 entries), we may simply remove the entry from the node.
  – Note: Afterwards, inner nodes may contain keys that no longer exist in the database. This is perfectly legal.
• Merge nodes in case of an underflow (i.e., “undo” a split):
• “Pull” separator (i.e., key 6423) into merged node.

Deletion from a B+-tree
• It is not that easy:
• Merging only works if two neighboring nodes were 50% full.
• Otherwise, we have to re-distribute:
  – “rotate” entry through parent
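The choice between merging and redistribution can be illustrated at the leaf level. This sketch rests on several assumptions: the function name rebalance_leaves is hypothetical, leaves are plain sorted key lists, the order is D = 2, and exactly one of the two neighbors has just dropped to D - 1 entries. Updating the parent node is left to the caller.

```python
D = 2  # order d (assumed): non-root nodes hold between D and 2 * D entries

def rebalance_leaves(left, right):
    """Repair an underflow in one of two adjacent leaves.

    Returns (left, right, separator). After a merge, right and separator are None
    and the caller must drop the old separator from the parent.
    """
    donor = max(left, right, key=len)
    if len(donor) <= D:                 # neighbor is only 50% full: merge
        return left + right, None, None
    entries = left + right              # neighbor has spare entries: redistribute
    half = len(entries) // 2
    left, right = entries[:half], entries[half:]
    return left, right, right[0]        # new separator replaces the old one in the parent

print(rebalance_leaves([10], [20, 27]))
print(rebalance_leaves([10], [20, 27, 33]))
```

For inner nodes the same decision applies, except that the separator key from the parent takes part in the merge or rotation.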

B+-trees in Real Systems
• Actual systems often avoid the cost of merging and/or redistribution, and instead relax the minimum occupancy rule.
• Example: IBM DB2 UDB
  – The “MINPCTUSED” parameter controls when the system should try a leaf node merge (“on-line index reorganization”).
  – This is particularly easy because of the pointers between adjacent leaf nodes.
  – Inner nodes are never merged (need to do a full table reorganization for that).
• To improve concurrency, systems sometimes only mark index entries as deleted and physically remove them later (e.g., IBM DB2 UDB “type-2 indexes”).

What is stored inside the leaves?
• Basically there are three alternatives:
  1. The full data entry k*. Such an index is inherently clustered (e.g., ISAM).
  2. A <k, rid> pair, where rid is the record id of the data entry.
  3. A <k, {rid_1, rid_2, …}> pair, where the items rid_i in the rid list are record ids of data entries with search key value k.
• Alternatives 2 and 3 are reasons why we want record ids to be stable.
• Alternative 2 seems to be the most common one.
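The three alternatives can be compared on a tiny example. Everything here is illustrative: the data file, the (page, slot) form of a rid, and the variable names alt1, alt2, alt3 are assumptions.

```python
from collections import defaultdict

# Hypothetical data file: rid -> record, where a rid is a (page, slot) pair
# and a record is (zipcode, name).
data_file = {
    (0, 0): (8050, "Ann"),
    (0, 1): (4001, "Bob"),
    (1, 0): (8050, "Cy"),
}

# Alternative 1: the leaves hold the full data entries, sorted by search key.
alt1 = sorted(data_file.values())

# Alternative 2: one <k, rid> pair per record.
alt2 = sorted((rec[0], rid) for rid, rec in data_file.items())

# Alternative 3: one <k, [rid_1, rid_2, ...]> pair per distinct key value.
groups = defaultdict(list)
for k, rid in alt2:
    groups[k].append(rid)
alt3 = sorted(groups.items())

print(alt1, alt2, alt3, sep="\n")
```

Alternatives 2 and 3 store rids inside the index, so a record that moves to another page would invalidate its index entries.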

B+-trees and Sorting
• A typical situation according to alternative 2 looks as follows:
  [Figure: index entries referencing records in a data file that is not sorted by the search key]

Clustered B+-trees
• If the data file were sorted by the search key, the scenario would look different:
• We call such an index a clustered index.
  – Scanning the index now leads to sequential access.
  – This is particularly good for range queries.
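The effect of clustering on a range scan can be made concrete by counting page switches while following rids in index order. This is a toy model under stated assumptions: rids are (page, slot) pairs, the function name pages_touched and both rid sequences are hypothetical, and only the most recently fetched page is assumed to stay in memory.

```python
def pages_touched(rids):
    """Count page fetches when records are retrieved in index order."""
    fetches, last_page = 0, None
    for page, _slot in rids:
        if page != last_page:      # a different page than the previous record
            fetches += 1
            last_page = page
    return fetches

# Six index entries in key order, pointing into three data pages.
clustered = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
unclustered = [(2, 0), (0, 1), (1, 0), (0, 0), (2, 1), (1, 1)]

print(pages_touched(clustered), pages_touched(unclustered))
```

In the clustered case each data page is fetched once, while the unclustered order triggers a fetch for every record.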
