Storing Data: Database Organization “Yea, from the table of my memory I’ll wipe away all trivial fond records.” -- Shakespeare, Hamlet 340151 Big Data & Cloud Computing (P. Baumann) 1
...Now Databases, Again Buffering Fixed/variable length records Arranging files on disk Catalogs 340151 Big Data & Cloud Computing (P. Baumann) 2
Disk Space Management Lowest layer of DBMS manages space on disk Higher levels call layer to: • allocate/de-allocate page • read/write page 340151 Big Data & Cloud Computing (P. Baumann) 3
Buffer Management in a DBMS Table of <frame#, pageid> pairs (plus more, see next) page requests from higher layers BUFFER POOL disk page free frame MAIN MEMORY choice of frame dictated DISK DB by replacement policy 340151 Big Data & Cloud Computing (P. Baumann) 4
When Page is Requested ... If page not in pool: • Choose frame for replacement • If frame dirty, write to disk • Read page into frame Pin page & return address NB: If possible, arrange blocks sequentially on disk 'page' 'block' • minimize seek and rotational delay For sequential scan (access predictable!), pre-fetching is a big win 340151 Big Data & Cloud Computing (P. Baumann) 5
More on Buffer Management Page requestor must unpin & indicate whether page has been modified dirty bit • Page in pool may be requested many times pin count: page is candidate for replacement iff pin count = = 0 • CC & recovery: additional I/O when replacing frame • Write-Ahead Log protocol 340151 Big Data & Cloud Computing (P. Baumann) 6
Buffer Replacement Policy Frame chosen for replacement by replacement policy: Least-recently-used (LRU), Clock, MRU etc. • Policy can have big impact on # of I/O’s; depends on access pattern 340151 Big Data & Cloud Computing (P. Baumann) 7
DBMS vs. OS File System OS does disk space & buffer mgmt: why not let OS manage these tasks? Differences in OS support: portability issues Some limitations • e.g., files can’t span disks Buffer management in DBMS requires ability to: • pin page in buffer pool, force page to disk (CC & recovery!) • adjust replacement policy + pre-fetch pages based on access patterns in typical DB operations 340151 Big Data & Cloud Computing (P. Baumann) 8
...Now Databases, Again Buffering Fixed/variable length records Arranging files on disk Catalogs 340151 Big Data & Cloud Computing (P. Baumann) 9
Record Formats: Fixed Length F1 F2 F3 F4 L1 L2 L3 L4 Base address (B) Address = B+L1+L2 Information about field types same for all records in a file; stored in system catalogs Finding i’th field does not require scan of record 340151 Big Data & Cloud Computing (P. Baumann) 10
Record Formats: Variable Length Two alternative formats (# fields fixed): F1 F2 F3 F4 Variant 1: 4 $ $ $ $ delimiters Fields Delimited by Special Symbols (eg, '\0') Field Count F1 F2 F3 F4 Variant 2: offset ptrs Array of Field Offsets Var2: direct access to i’th field; efficient storage of nulls ; small directory overhead 340151 Big Data & Cloud Computing (P. Baumann) 11
Page Formats: Fixed Length Records Slot 1 Slot 1 Slot 2 Slot 2 Free . . . . . . Space Slot N Slot N Slot M N 1 . . . 0 1 1 M M ... 3 2 1 number number PACKED of records of slots UNPACKED, BITMAP Record id = <page id, slot #> In first alternative, moving records for free space mgmnt changes rid • may not be acceptable 340151 Big Data & Cloud Computing (P. Baumann) 12
Page Formats: Variable Length Records Rid = (i,N) Page i Rid = (i,2) Rid = (i,1) N 20 16 24 Pointer to start N . . . 2 1 # slots of free space SLOT DIRECTORY Indirection can move records on page without changing rid attractive for fixed-length records too 340151 Big Data & Cloud Computing (P. Baumann) 13
...Now Databases, Again Buffering Fixed/variable length records BLOBs Arranging files on disk Catalogs 340151 Big Data & Cloud Computing (P. Baumann) 14
Files of Records Page or block is OK when doing I/O • but higher levels of DBMS operate on records, and files of records File = A collection of pages, each containing a collection of records. Must support: • insert/delete/modify record • read a particular record (specified using record id) • scan all records (possibly with some conditions on the records to be retrieved) 340151 Big Data & Cloud Computing (P. Baumann) 15
Unordered (Heap) Files Simplest file structure -- contains records in no particular order As file grows and shrinks, disk pages are allocated & de-allocated To support record level operations, we must: • keep track of the pages in a file • keep track of free space on pages • keep track of the records on a page many alternatives for keeping track of this 340151 Big Data & Cloud Computing (P. Baumann) 16
Heap File Implemented as a List … Data Data Data Full Pages Page Page Page Header Page Data Data Data … Pages with Page Page Page Free Space header page id & Heap file name must be stored someplace Each page contains 2 `pointers’ plus data 340151 Big Data & Cloud Computing (P. Baumann) 17
Heap File Using a Page Directory Data Page 1 Header Page Data Page 2 Data Page N DIRECTORY directory = collection of pages • linked list implementation just one alternative • Much smaller than linked list of all HF pages! entry for page can include number of free bytes on page 340151 Big Data & Cloud Computing (P. Baumann) 18
...Now Databases, Again Buffering Fixed/variable length records Arranging files on disk Catalogs 340151 Big Data & Cloud Computing (P. Baumann) 19
System Catalogs For each relation: name, file name, file structure (e.g., Heap file) • attribute name and type, for each attribute • index name, for each index • integrity constraints • For each index: structure (e.g., B+ tree) and search key fields • Catalogs themselves For each view: stored as relations ! view name and definition • Plus statistics, authorization, buffer pool size, etc. 340151 Big Data & Cloud Computing (P. Baumann) 20
Sample Catalog Table Attribute_Cat(attr_name, rel_name, type, position) attr_name rel_name type position attr_name Attribute_Cat string 1 rel_name Attribute_Cat string 2 type Attribute_Cat string 3 position Attribute_Cat integer 4 sid Students string 1 name Students string 2 login Students string 3 age Students integer 4 gpa Students real 5 1st entry? fid Faculty string 1 Key(s)? fname Faculty string 2 sal Faculty real 3 340151 Big Data & Cloud Computing (P. Baumann) 21
Databases & Disk: Practitioner's Tips File system type: ext4, or similar; not reiserfs! • Many discussions, though Place redo logs (and Oracle control files) on separate partitions • If possible, same for index (higher traffic!) fast disks for /tmp, cache files, log, and other high-traffic dirs SSD Big RAM never wrong 340151 Big Data & Cloud Computing (P. Baumann) 22
Summary Disks provide cheap, non-volatile storage • Random access, but cost depends on location of page on disk • important to arrange data sequentiallyto minimize seek and rotation delays Buffer manager brings pages into RAM • Page stays in RAM until released by requestor • Written to disk when frame chosen for replacement (which is sometime after requestor releases the page) • Choice of frame to replace based on replacement policy • Tries to pre-fetch several pages at a time 340151 Big Data & Cloud Computing (P. Baumann) 23
Summary (Contd.) DBMS vs. OS File Support: DBMS needs features not found in many OS’s • forcing page to disk • controlling order of page writes to disk • files spanning disks • control pre-fetching + page replacement policy based on predictable access patterns, etc. Variable length record format • field offset directory Slotted page format 340151 Big Data & Cloud Computing (P. Baumann) 24
Summary (Contd.) File layer • keeps track of pages in file + supports abstraction of "collection of records" • Pages with free space identified via linked list or directory structure • similar to how pages in file are kept track of Indexes • support efficient retrieval of records based on the values in some fields Catalog relations • store information about relations, indexes and views • Information that is common to all records in a given collection 340151 Big Data & Cloud Computing (P. Baumann) 25
Recommend
More recommend