ECE232: Hardware Organization and Design
Lecture 29: Computer Input/Output
Adapted from Computer Organization and Design, Patterson & Hennessy, UCB
Announcements
• ECE Honors Exhibition
  • Wednesday, April 30, 3:00-4:00 PM, M5
• SDP Demo Day
  • 10AM-2PM, Friday, April 25, Gunness Student Center
• ECE Picnic (tickets in ECE office/Eliza)
  • 3-7PM, Friday, April 25, Hadley Young Men’s Club
• ECE Banquet (tickets in ECE office/Eliza)
  • 6-9PM, Friday, May 2, Courtyard Marriott, Hadley
ECE232: Computer I/O
Overview
• Input and output are fundamental for computer operation
  • Typically much slower than computation
• Two types of transfer
  • Polling: processor constantly checks for data
  • Interrupts: processor is interrupted from its current activity
• Need to understand the requirements of data transfer
  • Tied to computer organization (bus, interfaces, etc.)
  • I/O bandwidth is important (how fast, how much)
• Most interfaces today are standardized (USB, monitor, Ethernet)
Anatomy: 5 components of any Computer
[Figure: the five components of a computer]
• Processor: Control and Datapath
• Memory
• Devices: Input (keyboard, mouse), Output (display, printer), Disk
[Figure: processor with cache connected over a Memory-I/O bus to main memory and to I/O controllers for graphics, disks, and network; the I/O controllers send interrupts to the processor]
Handling I/O
• Users like to connect devices to their computers
  • Keyboard, mouse, printer…
• External devices may require attention from the processor at unpredictable times
  • CPU doesn’t know when you’re about to hit a key
• I/O devices can be very fast or very slow
• Need a flexible way to control all devices
I/O Device Examples and Speeds
• I/O Speed: bytes transferred per second (from mouse to display: million-to-1)

Device            Behavior   Partner   Data Rate (Mbit/sec)
Keyboard          Input      Human     0.0001
Mouse             Input      Human     0.0038
Laser Printer     Output     Human     3.2000
Magnetic Disk     Storage    Machine   240-2560
Modem             I or O     Machine   0.016-0.064
Network-LAN       I or O     Machine   100-1000
Graphics Display  Output     Human     800-8000
Hardware Solution (875 Chipset)
[Figure: organization of an 875 chipset system]
• Pentium 4 processor
  • System bus (800 MHz, 6.4 GB/sec) to the memory controller hub
• Memory controller hub (north bridge), 82875P
  • Main memory DIMMs: two DDR 400 channels (3.2 GB/sec each)
  • Graphics output: AGP 8X (2.1 GB/sec)
  • 1 Gbit Ethernet: CSA (0.266 GB/sec)
  • Link to the I/O controller hub (266 MB/sec)
• I/O controller hub (south bridge), 82801EB
  • Disks: two Serial ATA links (150 MB/sec each)
  • CD/DVD and tape: two Parallel ATA links (100 MB/sec each)
  • Stereo (surround-sound): AC/97 (1 MB/sec)
  • USB 2.0 (60 MB/sec)
  • 10/100 Mbit Ethernet (20 MB/sec)
  • PCI bus (132 MB/sec)
Disk Device Terminology
[Figure: disk drive with labels Platter, Inner Track, Outer Track, Sector, Head, Arm, Actuator]
• Several platters, with information recorded magnetically on both surfaces (usually)
• Bits recorded in tracks, which in turn are divided into sectors (e.g., 512 bytes)
• Actuator moves head (end of arm, 1/surface) over track (“seek”), selects surface, waits for sector to rotate under head, then reads or writes
  • “Cylinder”: all tracks under heads
Disk Device Performance
• Disk Latency = Seek Time + Rotation Time + Transfer Time + Controller Overhead
• Seek Time: depends on the number of tracks the arm moves and the seek speed
  • Average number of tracks the arm moves?
    • Sum all possible seek distances from all possible tracks / total #
    • Assumes seek distance is random
    • Disk industry standard benchmark
• Rotation Time: depends on rotation speed and how far the sector is from the head
  • On average, 1/2 the time of a rotation
  • Example: 7200 revolutions per minute = 120 rev/sec
  • 1 revolution = 1/120 sec = 8.33 milliseconds
  • 1/2 rotation (revolution) = 4.17 ms
• Transfer Time: depends on data rate (bandwidth) of disk (bit density) and size of request
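The two averages above can be checked with a short sketch. It assumes that "all possible seek distances" means every ordered (start, target) track pair, including pairs where the arm does not move; the function names are hypothetical and not from the lecture.

```python
def average_seek_distance(num_tracks: int) -> float:
    """Mean |start - target| over every ordered (start, target) track pair."""
    total = 0
    for start in range(num_tracks):
        for target in range(num_tracks):
            total += abs(start - target)
    return total / (num_tracks * num_tracks)


def average_rotational_delay_ms(rpm: float) -> float:
    """Half of one revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


seek_300 = average_seek_distance(300)            # close to 300 / 3 tracks
rotation_7200 = average_rotational_delay_ms(7200)  # about 4.17 ms
```

Under this assumption the average seek distance comes out near one third of the track count, which is why average seek time is often quoted for a move across roughly a third of the disk.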
Disk Performance Model / Trends
• Capacity: +100%/year (2X / 1 yr)
• Transfer rate (BW): +40%/year (2X / 2 yrs)
• Rotation + seek time: -8%/year (1/2 in 10 yrs)
• MB/$: > 100%/year (2X / < 1.5 yrs)
Disk Performance
• Calculate time to read 1 sector (512 B) for UltraStar 72 using advertised performance; sector is on outer track
• Disk latency = average seek time + average rotational delay + transfer time + controller overhead
  = 5.3 ms + 0.5 * 1/(10000 RPM) + 0.5 KB / (50 MB/s) + 0.15 ms
  = 5.3 + 3.0 + 0.01 + 0.15 ms
  = 8.46 ms
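The same calculation as a minimal sketch, using the figures from the slide. It assumes 1 MB = 10^6 bytes and a 512-byte sector; the function and parameter names are illustrative only.

```python
def disk_latency_ms(seek_ms, rpm, request_bytes, transfer_bytes_per_s, controller_ms):
    """Seek + average rotational delay + transfer + controller overhead, in ms."""
    rotational_delay_ms = 0.5 * 60_000.0 / rpm           # half a revolution
    transfer_ms = request_bytes / transfer_bytes_per_s * 1000.0
    return seek_ms + rotational_delay_ms + transfer_ms + controller_ms


latency = disk_latency_ms(
    seek_ms=5.3,
    rpm=10_000,
    request_bytes=512,
    transfer_bytes_per_s=50e6,   # assumes 50 MB/s means 50 * 10^6 bytes/s
    controller_ms=0.15,
)
```

Seek and rotation dominate: the transfer term contributes only about 0.01 ms of the total.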
Instruction Set Architecture for I/O
• Some machines have special input and output instructions
• Alternative model (used by MIPS):
  • Input: ~ reads a sequence of bytes
  • Output: ~ writes a sequence of bytes
  • Memory is also a sequence of bytes, so use loads for input, stores for output
  • Called “Memory Mapped Input/Output”
• A portion of the address space is dedicated to communication paths to input or output devices (no memory there)
• These addresses are not regular memory; instead, they correspond to registers in I/O devices
[Figure: address space running from address 0 to 0xFFFFFFFF, with a device's cmd reg. and data reg. mapped at 0xFFFF0000]
Memory Mapped I/O
• Make control registers and I/O device data registers appear to be part of the system’s main memory
  • Reads and writes to the mapped region of memory are translated by memory controller hardware into accesses of the hardware device
  • Makes it easy to support variable numbers/types of devices: just map them onto different regions of memory
• Accessing I/O device registers and memory can be done by accessing data structures via the device pointers
  • Most device drivers are now written in C/C++. Memory mapped I/O makes this feasible without any changes to the way a CPU is programmed
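A toy simulation may help make the address routing concrete. This is only a sketch: the I/O region starts at 0xFFFF0000 as on the earlier slide, but the exact register addresses, the ConsoleDevice class, and the Bus class are assumptions made for illustration, not a real device or API.

```python
IO_BASE = 0xFFFF0000          # start of the I/O region shown on the earlier slide
CMD_REG = IO_BASE             # hypothetical command register address
DATA_REG = IO_BASE + 4        # hypothetical data register address


class ConsoleDevice:
    """Toy output device: a store to its data register emits one byte."""

    def __init__(self):
        self.command = 0
        self.output = []

    def read(self, addr):
        return self.command if addr == CMD_REG else 0

    def write(self, addr, value):
        if addr == CMD_REG:
            self.command = value
        elif addr == DATA_REG:
            self.output.append(value & 0xFF)


class Bus:
    """Routes loads and stores to RAM or to the device by address alone."""

    def __init__(self, device):
        self.ram = {}
        self.device = device

    def load(self, addr):
        if addr >= IO_BASE:
            return self.device.read(addr)
        return self.ram.get(addr, 0)

    def store(self, addr, value):
        if addr >= IO_BASE:
            self.device.write(addr, value)
        else:
            self.ram[addr] = value


bus = Bus(ConsoleDevice())
bus.store(0x1000, 42)            # ordinary memory store
for ch in b"hi":
    bus.store(DATA_REG, ch)      # same store operation, but it reaches the device
```

The program uses one store operation for both cases; only the address decides whether the value lands in memory or in a device register.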
Processor-I/O Speed Mismatch
• A 1 GHz microprocessor can execute 1000 million load or store instructions per second; at 4 bytes each, that is a 4 million KB/s data rate
  • I/O devices range from 0.01 KB/s to 30,000 KB/s
• Input: device may not be ready to send data as fast as the processor loads it
  • Also, might be waiting for a human to act
• Output: device may not be ready to accept data as fast as the processor stores it
• What to do?
Processor Checks Status before Acting: Polling
• Path to device generally has 2 registers:
  • 1 register says it’s OK to read/write (I/O ready), often called the Control Register
  • 1 register that contains data, often called the Data Register
• Processor reads from the Control Register in a loop, waiting for the device to set the Ready bit in the Control Register to say it’s OK (0 → 1)
• Processor then loads from (input) or writes to (output) the Data Register
  • Load from device/Store into Data Register resets the Ready bit (1 → 0) of the Control Register
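The polling protocol above can be sketched with a simulated input device. The InputDevice class, its polls_until_ready parameter, and poll_read are hypothetical names invented for this illustration; real hardware sets the Ready bit on its own schedule rather than counting polls.

```python
class InputDevice:
    """Toy input device with a Control register (Ready bit) and a Data register."""

    def __init__(self, data, polls_until_ready):
        self._pending = list(data)
        self._delay = polls_until_ready
        self._countdown = polls_until_ready
        self._ready = 0
        self._data = 0

    def read_control(self):
        # Each poll gives the simulated device a little more time to produce data.
        if not self._ready and self._pending:
            self._countdown -= 1
            if self._countdown <= 0:
                self._data = self._pending.pop(0)
                self._ready = 1          # Ready bit goes 0 -> 1
        return self._ready

    def read_data(self):
        value = self._data
        self._ready = 0                  # loading the data resets Ready (1 -> 0)
        self._countdown = self._delay
        return value


def poll_read(device, count):
    """Spin on the Control register and collect `count` values from the device.

    The caller must not ask for more values than the device will produce,
    otherwise the loop never ends (as with real spin-waiting).
    """
    received, polls = [], 0
    while len(received) < count:
        polls += 1
        if device.read_control() & 1:    # wait until Ready is set
            received.append(device.read_data())
    return received, polls


received, polls = poll_read(InputDevice(b"ok", polls_until_ready=3), 2)
```

In this run two bytes arrive after six reads of the Control register; four of those reads found nothing to do, which is the cost the next slide quantifies.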
Cost of Polling?
• Assume: a 1 GHz processor takes 400 clock cycles for a polling operation (calling the polling routine, accessing the device, and returning). Determine the % of processor time spent polling
  • Mouse: polled 30 times/sec so as not to miss user movement
  • Hard disk: transfers data in 16-byte chunks and can transfer at 8 MB/second. No transfer can be missed
• Mouse
  • Mouse polling clocks/sec = 30 * 400 = 12,000 clocks/sec
  • % Processor for polling = 12 * 10^3 / (1 * 10^9) = 0.0012%
  • Polling the mouse has little impact on the processor
• Disk
  • Times polling disk/sec = 8 MB/s / 16 B = 500K polls/sec
  • Disk polling clocks/sec = 500K * 400 = 200,000,000 clocks/sec
  • % Processor for polling = 2 * 10^8 / (1 * 10^9) = 20%
  • Unacceptable
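The arithmetic above, as a small sketch. It assumes 8 MB/s means 8 * 10^6 bytes per second; the function name is illustrative.

```python
def polling_overhead(clock_hz, cycles_per_poll, polls_per_second):
    """Fraction of processor cycles consumed by polling."""
    return cycles_per_poll * polls_per_second / clock_hz


CLOCK_HZ = 1_000_000_000       # 1 GHz
CYCLES_PER_POLL = 400

mouse = polling_overhead(CLOCK_HZ, CYCLES_PER_POLL, 30)
disk_polls_per_second = 8_000_000 / 16          # 16-byte chunks at 8 MB/s
disk = polling_overhead(CLOCK_HZ, CYCLES_PER_POLL, disk_polls_per_second)

print(f"mouse: {mouse:.4%}, disk: {disk:.0%}")
```

The overhead scales linearly with the polling rate, so the cost is set by how often the device must be checked, not by how much data it actually delivers.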
What is the alternative to polling? Interrupt
• Wasteful to have the processor spend most of its time “spin-waiting” for I/O to be ready
• Wish we could have an unplanned procedure call that would be invoked only when the I/O device is ready
• Solution: use the exception mechanism to help I/O. Interrupt the program when I/O is ready, return when done with the data transfer
• Polling is like picking up the phone every few seconds to see if you have a call. Interrupt is like letting the phone ring
I/O Interrupt
• Controller sends an interrupt to the processor along with additional information
  • Which device
  • Nature of interrupt: error, no paper, no ink, …
• Processor halts execution of the current program
  • Saves state
• Processor looks up which handler to start from the interrupt information
• When the interrupt is handled, the processor restores the program state and resumes
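The sequence above can be sketched as a toy model in which the program counter is the only state that needs saving. The Processor class, the handler table keyed by device name, and the handler address are all assumptions for illustration; a real processor saves more state and dispatches in hardware and operating system code.

```python
class Processor:
    """Toy processor whose only saved state is the program counter."""

    def __init__(self):
        self.pc = 0
        self.handlers = {}      # device name -> handler function
        self.handled = []       # record of serviced interrupts

    def register_handler(self, device, handler):
        self.handlers[device] = handler

    def run_instruction(self):
        self.pc += 4            # advance past one 4-byte instruction

    def interrupt(self, device, cause):
        saved_pc = self.pc                  # save state
        handler = self.handlers[device]     # look up handler from interrupt info
        handler(self, device, cause)        # service the interrupt
        self.pc = saved_pc                  # restore state and resume


def printer_handler(cpu, device, cause):
    cpu.pc = 0x80000180  # hypothetical handler address, only to show pc changing
    cpu.handled.append((device, cause))


cpu = Processor()
cpu.register_handler("printer", printer_handler)
for _ in range(3):
    cpu.run_instruction()
cpu.interrupt("printer", "no paper")
```

After the interrupt returns, the program counter is back where the user program left off, so the interrupted program never notices the detour.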
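Interrupt Driven Data Transfer
[Figure: memory holding a user program (add, sub, and, or) and an interrupt service routine (read, store, ..., jr)]
• (1) I/O interrupt arrives while the user program is running
• (2) Save PC
• (3) Jump to the interrupt service address
• (4) Interrupt service routine runs: read, store, ...
• (5) jr returns to the user program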
Benefit of Interrupt-Driven I/O
• 500 clock cycle overhead for each transfer, including the interrupt. Find the % of processor consumed if the hard disk is only active 5% of the time
• If interrupt rate = polling rate
  • Disk interrupts/sec = 8 MB/s / 16 B = 500K interrupts/sec
  • Disk interrupt clocks/sec = 500K * 500 = 250,000,000 clocks/sec
  • % Processor used during transfers = 250 * 10^6 / (1 * 10^9) = 25%
• If the disk is active 5% of the time: 5% * 25% = 1.25% busy
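A short sketch comparing the two approaches with the numbers from the lecture: 400 cycles per poll (from the polling cost example), 500 cycles per interrupt-driven transfer, and a disk that is active 5% of the time. It assumes 8 MB/s means 8 * 10^6 bytes per second; the function name is illustrative.

```python
CLOCK_HZ = 1_000_000_000   # 1 GHz processor


def busy_fraction(cycles_per_transfer, transfers_per_second, active_fraction=1.0):
    """Fraction of processor cycles spent on I/O overhead."""
    return cycles_per_transfer * transfers_per_second / CLOCK_HZ * active_fraction


transfers_per_second = 8_000_000 / 16      # 16-byte chunks at 8 MB/s

# Polling must run at the full rate whether or not the disk is transferring.
polling = busy_fraction(400, transfers_per_second)

# Interrupts cost more per transfer but only occur while the disk is active.
interrupts = busy_fraction(500, transfers_per_second, active_fraction=0.05)

print(f"polling: {polling:.2%}, interrupts: {interrupts:.2%}")
```

Each interrupt is more expensive than a poll, yet the total cost is far lower because the processor pays only when there is data to move.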