Disks and Files (Garcia-Molina, Ullman, Widom; Ramakrishnan/Gehrke)


  1. Storing Data: Disks and Files
     Garcia-Molina, Ullman, Widom; Ramakrishnan/Gehrke Ch. 9
     "Digital information lasts forever - or five years, whichever comes first." -- Jeff Rothenberg, RAND Corp., 1997
     340151 Big Data & Cloud Computing (P. Baumann)

  2. Why Not Everything in Main Memory?
     • Costs too much
         • [Rama/Gehrke] $1000 will buy you either 128 MB of RAM or 7.5 GB of disk
         • Today: 80 EUR will buy you either 4 GB of RAM or 1 TB of disk
         • …but today we have multi-terabyte databases!
     • Main memory is volatile
         • want data to be saved between runs (obviously!)
     • Typical storage hierarchy:
         • Main memory (RAM) for currently used data
         • Disk for main database (secondary storage)
         • Tapes for archiving older versions of data (tertiary storage)
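
To make the cost gap concrete, the following sketch computes how many times more disk than RAM the same money buys, using the two price points quoted on the slide. The function name and the decimal unit conversion (1 GB = 1000 MB, 1 TB = 1000 GB) are assumptions made for illustration.

```python
def disk_to_ram_ratio(ram_gb: float, disk_gb: float) -> float:
    """How many times more disk capacity than RAM the same budget buys."""
    return disk_gb / ram_gb

# Price points from the slide; decimal units are an assumption.
textbook = disk_to_ram_ratio(ram_gb=0.128, disk_gb=7.5)   # $1000 buys either
today = disk_to_ram_ratio(ram_gb=4.0, disk_gb=1000.0)     # 80 EUR buys either

print(f"textbook era: ~{textbook:.0f}x more disk than RAM per unit of money")
print(f"slide's 'today': ~{today:.0f}x more disk than RAM per unit of money")
```

Under these assumptions the gap has widened rather than closed, which is consistent with the slide's point that large databases still do not fit in main memory economically.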

  3. Storage Capacity
     • Absolute times as of 2003, but ratios still ~ same

  4. Storage Cost
     • Again, absolute values as of 2003, but ratios still ~ same

  5. Storage Hierarchies
     • Primary memory: main memory
     • Secondary memory: magnetic disks, RAID systems
     • Tertiary memory: magneto-optical media, optical media, magnetic tapes
     • Going down the hierarchy, storage gets larger, cheaper, and slower (figure axis: storage capacity)

  6. Numbers
     source: http://carlos.bueno.org/2014/11/cache.html

  7. Nearline (Tertiary) Storage
     • Usually tape
         • Reel, today: cartridge
         • Capacity 10 GB to ~6 TB per tape
     • Tape robots
         • HSM = Hierarchical storage management
         • multi-petabytes

  8. Caching & Virtual Memory
     • Cache: fast memory holding frequently used parts of a slower, larger memory
         • a small (L1) cache holds the few kilobytes of memory "most recently used" by the processor
         • most operating systems keep the most recently used "pages" of memory in main memory and put the rest on disk
     • Virtual memory
         • programs don't know whether they are accessing main memory or a page on secondary memory (most operating systems)
     • Database systems usually take explicit control over secondary memory access
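
The idea of keeping the most recently used pages in fast memory can be sketched as a small page cache. The slide does not name a replacement policy; least-recently-used (LRU) eviction is swapped in here as one common choice. The class name, the `read_page_from_disk` callback, and the page contents are all hypothetical.

```python
from collections import OrderedDict

class LRUPageCache:
    """Keeps recently used pages in memory; evicts the least recently used."""

    def __init__(self, capacity, read_page_from_disk):
        self.capacity = capacity
        self.read_page_from_disk = read_page_from_disk  # hypothetical slow path
        self.pages = OrderedDict()  # page_id -> contents, least recent first
        self.hits = 0
        self.misses = 0

    def get(self, page_id):
        if page_id in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_id)   # mark as most recently used
            return self.pages[page_id]
        self.misses += 1
        page = self.read_page_from_disk(page_id)
        self.pages[page_id] = page
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)    # evict the least recently used page
        return page

# Stand-in for a disk read; a real system would fetch a block here.
cache = LRUPageCache(capacity=2,
                     read_page_from_disk=lambda pid: f"contents of page {pid}")
for pid in [1, 2, 1, 3, 2]:
    cache.get(pid)
print(cache.hits, cache.misses)
```

A database buffer manager plays this role explicitly instead of leaving it to the operating system's virtual memory, which is the point of the slide's last bullet.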

  9. Where Databases Reside
     • Hard disk is the secondary storage device of choice
         • Many flavors of disk: floppy (hard, soft); Winchester; RAM disks; optical, CD-ROM; arrays
     • Main advantage over tapes: random access vs. sequential
     • Data is stored and retrieved in units called disk blocks or pages
     • Unlike RAM, time to retrieve a disk page varies depending upon location on disk
         • relative placement of pages on disk has major impact on DBMS performance!

  10. The Miracle Called "Hard Disk"
      • Disk head contains a magnet, hovering over the spinning platter
      • Flight height: 10-20 nm
      • (x 5,000 gives the width of one hair!)

  11. Components of a Disk
      • Platters spin
      • Arm assembly moves in or out to position a head on the desired track
      • Tracks under heads = a cylinder (imaginary!)
      • Block size = N * sector size (sector size is fixed)
      • ...typical numbers?

  12. Typical Numbers
      • Diameter: 1 inch ... 15 inches
      • Cylinders: 40 (floppy) ... 20,000
      • Surfaces: 1 (old CDs) ... 2 (floppies) ... 30
      • Sector size: 512 B ... 50 kB
      • Capacity: 360 kB (old floppy) ... 4 TB

  13. Disk Access Time
      "I want block X" -> ? -> block X in memory

  14. Disk Access Time
      Time = Seek Time + Rotational Delay + Transfer Time + Other

  15. Seek Time
      Time = Seek Time + Rotational Delay + Transfer Time + Other

  16. Average Random Seek Time
      Time = Seek Time + Rotational Delay + Transfer Time + Other
      • Typical S: 10 ms ... 40 ms = millions of times slower than a RAM access!

  17. Average Rotational Delay
      Time = Seek Time + Rotational Delay + Transfer Time + Other
      • R = 1/2 revolution
      • typical R = 4.16 ms (7,200 RPM)
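
The typical value follows directly from the rotation speed: on average the desired sector is half a revolution away from the head. A minimal sketch of that arithmetic (the function name is an assumption):

```python
def avg_rotational_delay_ms(rpm: float) -> float:
    """Average rotational delay in milliseconds: half of one revolution."""
    ms_per_revolution = 60_000.0 / rpm   # one minute = 60,000 ms
    return ms_per_revolution / 2

print(f"{avg_rotational_delay_ms(7200):.2f} ms")
```

At 7,200 RPM one revolution takes about 8.33 ms, so half a revolution is about 4.17 ms; the slide's 4.16 ms is the same value truncated.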

  18. Transfer Rate
      Time = Seek Time + Rotational Delay + Transfer Time + Other
      • Transfer rate: t
          • typical t: 10 ... 50 MB/second
      • Transfer time T: T = block size / t
      • Ex: block size 32 kB, t = 32 MB/second: transfer time = …?
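
One way to work the slide's exercise is to apply T = block size / t directly. The sketch below assumes decimal units (1 MB = 1000 kB); with binary units the result is slightly smaller. The function and parameter names are illustrative.

```python
def transfer_time_ms(block_size_kb: float, rate_mb_per_s: float) -> float:
    """T = block size / t, in milliseconds (assumes 1 MB = 1000 kB)."""
    block_size_mb = block_size_kb / 1000.0
    seconds = block_size_mb / rate_mb_per_s
    return seconds * 1000.0

print(f"{transfer_time_ms(32, 32):.2f} ms")
```

For a 32 kB block at 32 MB/second this gives roughly 1 ms, which is small next to the 10 ms to 40 ms typical seek time quoted earlier in the deck.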

  19. Other Delays
      Time = Seek Time + Rotational Delay + Transfer Time + Other
      • CPU time to issue I/O
      • Contention for controller
      • Contention for bus, memory
      • Typical value: 0 (relative to other values)
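
Putting the four terms together, the access-time formula can be sketched as a single function. The parameter values in the example (10 ms seek, 7,200 RPM, 32 kB block, 32 MB/second) are taken from the ranges quoted on the slides; the function name and the decimal unit conversion are assumptions.

```python
def disk_access_time_ms(seek_ms, rpm, block_size_kb, rate_mb_per_s, other_ms=0.0):
    """Time = Seek Time + Rotational Delay + Transfer Time + Other (all in ms)."""
    rotational_ms = (60_000.0 / rpm) / 2                       # half a revolution
    transfer_ms = (block_size_kb / 1000.0) / rate_mb_per_s * 1000.0
    return seek_ms + rotational_ms + transfer_ms + other_ms

total = disk_access_time_ms(seek_ms=10, rpm=7200,
                            block_size_kb=32, rate_mb_per_s=32)
print(f"{total:.2f} ms")
```

With these numbers the mechanical terms (seek and rotation) account for about 14 of the roughly 15 ms, which is why "Other" can be treated as 0.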

  20. Sequential Read?
      • So far: random block access
      • What about: reading the next block?
      • Disks are optimized towards "consecutive" reading!
          • Blocks within track
          • Tracks within cylinder
          • Next cylinder

  21. "Next Block" Costs
      • "Next" block concept:
          • blocks on same track, followed by
          • blocks on same cylinder, followed by
          • blocks on adjacent cylinder
      • If we don't need to change cylinder:
        Time to get block = block size / t + negligible
          • + switch track (i.e., read with the next head)
          • + once in a while, next cylinder
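
The "next block" cost model can be sketched as follows: each block costs block size / t, plus a small charge whenever the read crosses into the next track or the next cylinder. The geometry (blocks per track, tracks per cylinder) and the switch costs below are hypothetical values chosen for illustration, and the read is assumed to start at the first block of a cylinder.

```python
def sequential_read_ms(n_blocks, block_size_kb, rate_mb_per_s,
                       blocks_per_track, tracks_per_cylinder,
                       track_switch_ms=0.0, cylinder_switch_ms=1.0):
    """Cost of reading n_blocks in 'next block' order (decimal units assumed)."""
    transfer_ms = n_blocks * (block_size_kb / 1000.0) / rate_mb_per_s * 1000.0
    # Boundaries crossed between the first and the last block read.
    track_changes = (n_blocks - 1) // blocks_per_track
    cylinder_changes = (n_blocks - 1) // (blocks_per_track * tracks_per_cylinder)
    head_switches = track_changes - cylinder_changes   # same-cylinder track changes
    return (transfer_ms
            + head_switches * track_switch_ms
            + cylinder_changes * cylinder_switch_ms)

# Hypothetical geometry: 10 blocks per track, 4 tracks per cylinder.
cost = sequential_read_ms(n_blocks=100, block_size_kb=32, rate_mb_per_s=32,
                          blocks_per_track=10, tracks_per_cylinder=4)
print(f"{cost:.1f} ms")
```

In this example the 100 blocks cost about 100 ms of transfer and only 2 ms of cylinder changes, so the per-block cost stays close to block size / t.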

  22. Random vs Sequential Read
      • Rule of thumb:
          • Random I/O: expensive
          • Sequential I/O: less expensive
      • Ex: 1 KB block:
          • Random I/O: ~ 20 ms
          • Sequential I/O: ~ 1 ms
      • Relative difference is smaller for larger blocks
      • Whenever possible, arrange file blocks sequentially on disk (by "next") to minimize seek and rotational delay
          • For a sequential scan, pre-fetching several pages at a time is a big win! ("burst read")
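
To see what the rule of thumb means for a whole file, the sketch below scales the slide's per-block figures (about 20 ms random, about 1 ms sequential for a 1 KB block) to a scan of many blocks. The file size of 100,000 blocks is a hypothetical example.

```python
RANDOM_IO_MS = 20.0      # slide's rule of thumb for a 1 KB block
SEQUENTIAL_IO_MS = 1.0   # slide's rule of thumb for a 1 KB block

def scan_time_s(n_blocks: int, per_block_ms: float) -> float:
    """Total time in seconds to read n_blocks at a fixed per-block cost."""
    return n_blocks * per_block_ms / 1000.0

n_blocks = 100_000  # hypothetical file: 100,000 blocks of 1 KB (about 100 MB)
random_s = scan_time_s(n_blocks, RANDOM_IO_MS)
sequential_s = scan_time_s(n_blocks, SEQUENTIAL_IO_MS)
print(f"random: {random_s:.0f} s, sequential: {sequential_s:.0f} s")
```

Under these figures the same scan takes 2000 s with the blocks scattered and 100 s with them laid out in "next" order, which is the motivation for sequential placement and pre-fetching.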

  23. ...Writing?
      • Cost for writing ≈ cost for reading
      • ... unless we want to verify!
          • Then, need to add: block size / t + (full) rotation
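
The verify surcharge can be sketched by extending the access-time formula: after the write, the disk waits a full rotation for the block to come around again and then reads it back. The parameter values reuse figures from earlier slides; the function name and decimal unit conversion are assumptions.

```python
def write_time_ms(seek_ms, rpm, block_size_kb, rate_mb_per_s, verify=False):
    """Write cost in ms; verification adds a full rotation plus one transfer."""
    revolution_ms = 60_000.0 / rpm
    transfer_ms = (block_size_kb / 1000.0) / rate_mb_per_s * 1000.0
    time_ms = seek_ms + revolution_ms / 2 + transfer_ms   # same as a read
    if verify:
        time_ms += revolution_ms + transfer_ms            # re-read after a full turn
    return time_ms

plain = write_time_ms(10, 7200, 32, 32)
verified = write_time_ms(10, 7200, 32, 32, verify=True)
print(f"write: {plain:.2f} ms, write + verify: {verified:.2f} ms")
```

With a 10 ms seek at 7,200 RPM, verification raises the cost from about 15.2 ms to about 24.5 ms.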

  24. ...To Modify a Block?
      • (a) Read block
      • (b) Modify in memory
      • (c) Write block
      • [ (d) Verify ]
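
The read-modify-write cycle can be sketched against an ordinary file standing in for the disk. The block size, the `modify_block` and `stamp` helpers, and the temporary file are all hypothetical; note also that the verification re-read here is likely served from the operating system's cache, so it only illustrates the step and does not prove the data reached the device.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # hypothetical block size in bytes

def modify_block(path, block_no, modify, verify=False):
    """Read one block, change it in memory, write it back, optionally verify."""
    offset = block_no * BLOCK_SIZE
    with open(path, "r+b") as f:
        f.seek(offset)
        block = bytearray(f.read(BLOCK_SIZE))   # (a) read block
        modify(block)                           # (b) modify in memory
        f.seek(offset)
        f.write(block)                          # (c) write block
        f.flush()
        os.fsync(f.fileno())                    # ask the OS to push it to the device
        if verify:                              # (d) verify by re-reading
            f.seek(offset)
            if f.read(BLOCK_SIZE) != bytes(block):
                raise IOError("verification failed")

def stamp(block):
    """Example in-memory change: overwrite the first five bytes."""
    block[0:5] = b"HELLO"

# Build a scratch "disk" of three zero-filled blocks.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(3 * BLOCK_SIZE))
    path = tmp.name

modify_block(path, 1, stamp, verify=True)
with open(path, "rb") as f:
    data = f.read()
os.remove(path)
print(data[BLOCK_SIZE:BLOCK_SIZE + 5])
```

Even a five-byte change costs a full block read and a full block write, because the block is the unit of transfer.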

  25. Wrap-Up
      • Capacities grow, data hunger grows larger
          • Moore's Law vs Greg's Law vs disk growth
      • Databases are heavily I/O-bound
          • Disk space management largely determines performance
      • Disk access time = Seek Time + Rotational Delay + Transfer Time + Other
