Lecture 21: Storage Systems

Disk insides, characteristics, performance, reliability, technology trends, RAID systems

Adapted from UCB CS252 S01, Revised by Zhao Zhang

I/O Systems

[Figure: Processor and Cache on a Memory - I/O Bus shared with Main Memory and three I/O Controllers (one for Graphics, one for two Disks, one for the Network); the controllers signal the processor with interrupts]

Storage Technology Drivers

Driven by the prevailing computing paradigm
• 1950s: migration from batch to on-line processing
• 1990s: migration to ubiquitous computing
  - computers in phones, books, cars, video cameras, …
  - nationwide fiber optical network with wireless tails
• Today: digital media everywhere
  - Digital forms of voice, picture, and video
  - Data from scientific computing such as earthquake simulation, high energy physics experiments, bioinformatics
  - In the form of personal storage, web servers, peer-to-peer storage, grid storage

Effects on storage industry:
• Embedded storage
  - smaller, cheaper, more reliable, lower power
• Data utilities
  - high capacity, hierarchically managed storage

Magnetic Disks

[Figure: disk geometry with labels Track, Sector, Cylinder, Platter, Head]

Purpose:
• Long-term, nonvolatile storage
• Large, inexpensive, slow level in the storage hierarchy

Characteristics:
• Seek Time (~8 ms avg)
  - positional latency
  - rotational latency

Transfer rate
• 10-40 MByte/sec
• Blocks

Capacity
• Gigabytes
• Quadruples every 2 years

Worked numbers:
• 7200 RPM = 120 RPS => about 8 ms per rev
• ave rot. latency = about 4 ms
• 128 sectors per track => about 0.0625 ms per sector
• 1 KB per sector => 16 MB/s

Response time = Queue + Controller + Seek + Rot + Xfer
(the slide marks part of this sum as "Service time")

Seagate Barracuda 180

• 181.6 GB, 3.5 inch disk
• 12 platters, 24 surfaces
• 24,247 cylinders
• 7,200 RPM (4.2 ms avg. latency)
• 7.4/8.2 ms avg. seek (r/w)
• 64 to 35 MB/s (internal)
• 0.1 ms controller time
• 10.3 watts (idle)

[Figure: disk drive with labels Track, Sector, Cylinder, Arm, Platter, Head, Track Buffer]

Latency = Queuing Time + Controller time + Seek Time + Rotation Time + Size / Bandwidth
(the first four terms are paid per access; Size / Bandwidth is paid per byte)

source: www.seagate.com

Photo of Disk Head, Arm, Actuator

[Photo: labels Spindle, Arm, Head, Actuator, Platters (12)]
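The latency formula on the Barracuda 180 slide can be tried out numerically. The following is a minimal sketch, not part of the original slides: the function names are hypothetical, the 64 KB request size is an assumed example, and the bandwidth is taken as the drive's best-case 64 MB/s internal rate with 1 MB = 10^6 bytes.

```python
def rotation_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def access_latency_ms(size_bytes, seek_ms, rpm, bandwidth_bytes_per_s,
                      controller_ms=0.0, queue_ms=0.0):
    """Latency = queue + controller + seek + rotation + size / bandwidth.

    Rotation is charged as half a revolution, i.e. the average wait
    for the target sector to arrive under the head.
    """
    rotation = 0.5 * rotation_ms(rpm)
    transfer = size_bytes / bandwidth_bytes_per_s * 1000.0
    return queue_ms + controller_ms + seek_ms + rotation + transfer


# Quoted Barracuda 180 figures from the slide (read seek, outer-track rate).
barracuda = dict(seek_ms=7.4, rpm=7200,
                 bandwidth_bytes_per_s=64e6, controller_ms=0.1)

# Assumed example request: 64 KB, empty queue.
latency = access_latency_ms(64 * 1024, **barracuda)
print(f"one revolution: {rotation_ms(7200):.2f} ms")
print(f"64 KB access:   {latency:.1f} ms")
```

Seek and rotation dominate the total here; the transfer term is only about 1 ms, which is why the per-access terms matter far more than bandwidth for small requests.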
Disk Performance Factors

Actual disk seek and rotation time depends on the current head position

Seek time: how far is the head from the track?
• Disk industry standard: assume random position of the head, e.g., average 8 ms seek time
• In practice: disk accesses have locality

Rotation time: how far is the head from the sector?
• Can safely assume ½ of rotation time (disk keeps rotating)
• 10000 Revolutions Per Minute ⇒ 166.67 Rev/sec
  1 revolution = 1/166.67 sec ⇒ 6.00 ms
  1/2 rotation (revolution) ⇒ 3.00 ms

Data Transfer time: What are the rotation speed, disk density, and sectors per transfer?
• 10000 RPM ⇒ a track of data per 6.00 ms
• Outer tracks are longer and may support higher bandwidth

Disk Performance Example

Rule of Thumb:
• Observed average seek time is typically about 1/4 to 1/3 of quoted seek time (i.e., 3X-4X faster)
• Rule of Thumb: disks deliver about 3/4 of internal media rate (1.3X slower) for data

Calculate time to read 64 KB for UltraStar 72, using 1/3 of the quoted 7.4 ms seek time and 3/4 of the 64 MB/s internal outer track bandwidth (these figures match the Seagate Barracuda 180 specifications listed earlier)

Disk latency = average seek time + average rotational delay + transfer time + controller overhead
  = (0.33 * 7.4 ms) + 0.5 * 1/(7200 RPM / (60000 ms/min)) + 64 KB / (0.75 * 64 MB/s) + 0.1 ms
  = 2.5 ms + 0.5 / (7200 RPM / (60000 ms/min)) + 64 KB / (47 KB/ms) + 0.1 ms
  = 2.5 + 4.2 + 1.4 + 0.1 ms = 8.2 ms

This is 64% of the 12.7 ms obtained with the full quoted seek time and the full internal bandwidth (7.4 + 4.2 + 1.0 + 0.1 ms).

Disk Characteristics in 2000

                                     Seagate Cheetah    IBM Travelstar     IBM 1GB Microdrive
                                     ST173404LC         32GH DJSA-232      DSCM-11000
                                     Ultra160 SCSI      ATA-4
Disk diameter (inches)               3.5                2.5                1.0
Formatted data capacity (GB)         73.4               32.0               1.0
Cylinders                            14,100             21,664             7,167
Disks                                12                 4                  1
Recording Surfaces (Heads)           24                 8                  2
Bytes per sector                     512 to 4096        512                512
Avg Sectors per track (512 byte)     ~424               ~360               ~140
Max. areal density (Gbit/sq.in.)     6.0                14.0               15.2
(row label lost in extraction)       $828               $447               $435

[Photo: IBM Microdrive]

Disk Performance/Cost Trends

Capacity
  + 100%/year (2X / 1.0 yrs)
Transfer rate (BW)
  + 40%/year (2X / 2.0 yrs)
Rotation + Seek time
  – 8%/year (1/2 in 10 yrs)
MB/$
  > 100%/year (2X / 1.0 yrs)
  Fewer chips + areal density

Seagate 120GB Internal Hard Drive ST3120026A, $150 at Staples (list price, 2003)
Maxtor 120GB 8MB Cache Hard Drive, $59.84 after rebate at OfficeDepot, 2003

Disk System Performance

System-level Metrics:
• Response Time
• Throughput

[Figure: Response Time (ms), 0 to 300, plotted against Throughput (% total BW), 0% to 100%]

[Figure: Proc, Queue, IOC, Device]

Response time = Queue + Device Service time

How About Queuing Time?

Queuing time can be the most significant component of disk response time

Response time = Queue + Controller + service time ( √ )

[Figure: Arrivals entering and Departures leaving the system]

More interested in long term, steady state than in startup => Arrivals = Departures

Little’s Law: Mean number tasks in system = arrival rate x mean response time

Applies to any system in equilibrium, as long as nothing in black box is creating or destroying tasks
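Little's Law and the growth of queuing time with load can be explored with a small single-server model. This is a minimal sketch under stated assumptions, not the lecture's own code: it assumes exponentially distributed arrivals and service (an M/M/1 queue), for which the standard result is time in queue = service time x u / (1 - u); the function name and the load values in the sweep are hypothetical.

```python
def mm1_metrics(arrival_rate, service_time):
    """Steady-state metrics for an assumed M/M/1 queue.

    arrival_rate is in requests per second, service_time in seconds.
    """
    u = arrival_rate * service_time          # server utilization
    if not 0 <= u < 1:
        raise ValueError("utilization must be below 1 for a steady state")
    t_queue = service_time * u / (1 - u)     # mean wait before service
    t_system = t_queue + service_time        # mean response time
    return {
        "utilization": u,
        "time_in_queue": t_queue,
        "time_in_system": t_system,
        # Little's Law: mean number of tasks = arrival rate x mean time
        "tasks_in_queue": arrival_rate * t_queue,
        "tasks_in_system": arrival_rate * t_system,
    }


# Hypothetical sweep: 12 ms service time at increasing request rates.
for rate in (10, 40, 70, 80):
    m = mm1_metrics(rate, 0.012)
    print(f"{rate:3d} req/s  u={m['utilization']:.2f}  "
          f"response={m['time_in_system'] * 1000:.1f} ms  "
          f"in system={m['tasks_in_system']:.2f}")
```

The sweep shows the shape of the response-time curve: response time stays close to the service time at low utilization and climbs steeply as utilization approaches 1.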
A Little Queuing Theory: Notation

[Figure: System = Queue + server; Proc, IOC, Device]

Queuing models assume state of equilibrium: input rate = output rate

Notation:
  r      average number of arriving customers/second
  T_ser  average time to service a customer (traditionally µ = 1/T_ser)
  u      server utilization (0..1): u = r x T_ser (or u = r / µ)
  T_q    average time/customer in queue = T_ser x u / (1 – u)
  T_sys  average time/customer in system: T_sys = T_q + T_ser
  L_q    average length of queue: L_q = r x T_q
  L_sys  average length of system: L_sys = r x T_sys

Little’s Law: Length_server = rate x Time_server
(Mean number customers = arrival rate x mean service time)

A Little Queuing Theory: Example

Processor sends 50 x 8KB disk I/Os per sec, requests & service exponentially distrib., avg. disk service = 12 ms
• On average, how is the disk utilized?
• What is the number of requests in the queue?
• What is the average time spent in the queue?
• What is the average response time for a disk request?

Notation:
  r      average number of arriving customers/second = 50
  T_ser  average time to service a customer = 12 ms
  u      server utilization (0..1): u = r x T_ser = 50/s x 0.012 s = 0.60
  T_q    average time/customer in queue = T_ser x u / (1 – u) = 12 x 0.60/(1 – 0.60) = 12 x 1.5 = 18 ms
  T_sys  average time/customer in system: T_sys = T_q + T_ser = 30 ms
  L_q    average length of queue: L_q = r x T_q = 50/s x 0.018 s = 0.9 requests in queue
  L_sys  average # tasks in system: L_sys = r x T_sys = 50/s x 0.030 s = 1.5

Look into the textbook when you need to work on I/O

Array Reliability

• Reliability of N disks = Reliability of 1 Disk ÷ N
  50,000 Hours ÷ 70 disks ≈ 700 hours
  Disk system MTTF: Drops from 6 years to 1 month!
  (MTTF: Mean Time to Failure)
• Arrays (without redundancy) too unreliable to be useful!

Solution: RAID (Redundant Arrays of Inexpensive Disks)

How to build Large Storage: Disk Array

[Figure: an Array Controller connected to several String Controllers, each driving a string of disks]

Not practical to build large disks

RAID: The Idea

[Figure: one logical record (10010011 11001101 10010011 ...) striped across the data disks as physical records, plus a parity disk P; RAID-3 shown]

P contains sum of other disks per stripe mod 2 (“parity”)

If disk fails, subtract P from sum of other disks to find missing information

RAID 4: High I/O Rate Parity

Insides of 5 disks (rows are stripes, columns are disks, logical disk address increases downward):

  D0    D1    D2    D3    P
  D4    D5    D6    D7    P
  D8    D9    D10   D11   P
  D12   D13   D14   D15   P
  D16   D17   D18   D19   P
  D20   D21   D22   D23   P
  . . .

Example: small read D0 & D5, large write D12-D15
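The parity rule from the RAID slides (P is the sum of the other disks mod 2, and a lost disk is recovered by subtracting) can be sketched with bytewise XOR, since addition and subtraction mod 2 are both XOR. This is an illustrative sketch only: the function names and the sample block contents are hypothetical, and real arrays operate on whole sectors rather than short byte strings.

```python
from functools import reduce


def parity(blocks):
    """Bytewise XOR (sum mod 2) of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))


def reconstruct(surviving_blocks, parity_block):
    """Recover the one missing block of a stripe.

    XOR-ing the parity with every surviving data block cancels them
    out and leaves the contents of the failed disk.
    """
    return parity(list(surviving_blocks) + [parity_block])


# Hypothetical stripe: four data blocks, one per data disk.
stripe = [b"D12.", b"D13.", b"D14.", b"D15."]
p = parity(stripe)

# Simulate losing the third disk and rebuilding its block.
survivors = stripe[:2] + stripe[3:]
rebuilt = reconstruct(survivors, p)
print(rebuilt == stripe[2])
```

The same XOR property explains small writes: the new parity can be computed from the old data, the new data, and the old parity, without reading the rest of the stripe.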
RAID 5: High I/O Rate Interleaved Parity

Rows are stripes, columns are disks, logical disk addresses increase downward:

  D0    D1    D2    D3    P
  D4    D5    D6    P     D7
  D8    D9    P     D10   D11
  D12   P     D13   D14   D15
  P     D16   D17   D18   D19
  D20   D21   D22   D23   P
  . . .

Independent writes possible because of interleaved parity

Example: write to D0, D5 uses disks 0, 1, 3, 4

No disk hot spot!

Future Storage Trends

Disks:
• Extraordinary advance in capacity/drive, $/GB
• Currently 17 Gbit/sq. inch; can continue past 100 Gbit/sq. inch?
• Bandwidth, seek time not keeping up: 3.5 inch form factor makes sense? 2.5 inch form factor in near future? 1.0 inch form factor in long term?

Tapes
• Old technique, no investment in innovation
• Are they already dead?
• What is a tapeless backup system?

Other Storage
• CD/DVD
• Compact Flash, USB key storage, MRAM
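The rotated parity placement in the RAID 5 layout can be expressed as a small address mapping. This sketch is inferred from the 5-disk layout on the RAID 5 slide, where parity moves one disk to the left on each successive stripe; the function names are hypothetical, and real controllers may use a different rotation order.

```python
def raid5_locate(block, n_disks=5):
    """Map logical data block Dn to (stripe, data_disk, parity_disk).

    Assumes the layout shown on the slide: parity starts on the last
    disk and moves one disk left per stripe, and data blocks fill the
    remaining disks from left to right.
    """
    data_per_stripe = n_disks - 1
    stripe, index = divmod(block, data_per_stripe)
    parity_disk = (n_disks - 1) - (stripe % n_disks)
    # Data blocks skip over the disk that holds this stripe's parity.
    data_disk = index if index < parity_disk else index + 1
    return stripe, data_disk, parity_disk


def disks_touched_by_write(block, n_disks=5):
    """A small write updates one data disk and that stripe's parity disk."""
    _, data_disk, parity_disk = raid5_locate(block, n_disks)
    return {data_disk, parity_disk}


# The slide's example: writes to D0 and D5 use disjoint disk pairs,
# so they can proceed independently.
print(sorted(disks_touched_by_write(0) | disks_touched_by_write(5)))  # [0, 1, 3, 4]
```

With RAID 4, both writes would need the single parity disk; rotating the parity is what removes that hot spot.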