CPS 196
Systems and Networks
Jeff Chase
Spring 2006

CPS 196 Overview
1. Some philosophy
  • Breadth-first
  • Bottom-up
  • Broad systems view: architecture and complexity
  • Focus on principles, apply to many contexts
2. Topics
3. Reading: [SK]
4. Labs
  • Picking a hammer and a nail
5. What do you want to get out of this course?

Systems: Surfing the Technology Wave
1. Innovation happens.
2. Things change.
3. Things change fast.
  • Moore’s Law
  • Internet
4. Change ripples through systems.
  • [SK]: “incommensurate scaling” rule: “changing any system parameter by 10x requires a new design”.
5. Different things change at different rates.
  • Implications?

Example: memory and disk storage
Trends of annual improvement:
  • Capacity per $: 50% per year (more for personal disks)
  • Bandwidth: 28% per year
  • Latency: 7% to 9% per year
Exponential growth is a wonderful thing:
  • Time for bandwidth to double = 2.9 years
  • Every 3 years, bandwidth doubles, capacity improves by 3x
  • But latency improves by 20% for memory and 30% for disk.
[Patterson04]

Technology Trends
Disk Capacity doubles every 1.5 years
• Today: Processing Power Doubles Every 18 months
• Today: Memory Size Doubles Every 18 months(?)
• Today: Disk Capacity Doubles Every 12 months
• Disk Positioning Rate (Seek + Rotate) Doubles Every Ten Years!
The I/O Gap
[Kedem]
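The three-year figures on the memory and disk storage slide follow from compounding the annual rates, and the I/O gap follows from comparing doubling periods. The sketch below is only an illustration of that arithmetic; the function names are hypothetical, and the results land near the slide's rounded values rather than matching them exactly.

```python
import math

# Hypothetical helper names; the rates and periods are the ones quoted on the slides.

def growth_over(annual_rate, years):
    """Improvement factor after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years

def doubling_time(annual_rate):
    """Years needed to double at a constant compound annual_rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)

def growth_from_doubling(period_years, years):
    """Improvement factor over `years` if the quantity doubles every period_years."""
    return 2.0 ** (years / period_years)

capacity_3y = growth_over(0.50, 3)        # slide: "improves by 3x"
bandwidth_3y = growth_over(0.28, 3)       # slide: "bandwidth doubles"
latency_low_3y = growth_over(0.07, 3)     # slide: about 20%
latency_high_3y = growth_over(0.09, 3)    # slide: about 30%
bandwidth_doubling = doubling_time(0.28)  # slide quotes 2.9 years

# I/O gap over ten years: capacity doubling every 12 months versus
# positioning rate doubling every ten years.
capacity_10y = growth_from_doubling(1.0, 10)
positioning_10y = growth_from_doubling(10.0, 10)
io_gap = capacity_10y / positioning_10y

print(f"capacity over 3 years:  {capacity_3y:.2f}x")
print(f"bandwidth over 3 years: {bandwidth_3y:.2f}x")
print(f"latency over 3 years:   {latency_low_3y:.2f}x to {latency_high_3y:.2f}x")
print(f"bandwidth doubling time: {bandwidth_doubling:.2f} years")
print(f"ten-year gap, capacity vs. positioning: {io_gap:.0f}x")
```

Quantities that compound at different rates drift apart quickly, which is the point behind "different things change at different rates".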
September 11, 2001
The 9/11 load spike at CNN.com:
• complete collapse
• scramble to manually deploy new servers
How can we handle “flash crowds”?
• Buy/install enough hardware for worst-case load?
• Block traffic?
• Adaptive provisioning?
• Steal resources from less critical services?

Broader Importance of Distributed Software Technology
Today, the global community depends increasingly on distributed information systems technologies.
There are many recent examples of high-profile meltdowns of systems for distributed information exchange.
• Code Red worm: July 2001
• denial-of-service attacks against Yahoo etc. (spring 00)
• stored credit card numbers stolen from CDNow.com (spring 00)
  People were afraid to buy over the net at all just a few years ago!
• Network Solutions DNS root server failure (fall 00)
• MCI trunk drop interrupts Chicago Board of Exchange (summer 99)
These reflect the reshaping of business, government, and society brought by the global Internet and related software.
We have to “get it right”!

That Other September 11
This is a graph of request traffic to download the Starr Report on Pres. Clinton’s extracurricular pursuits, released on 9/11/98.

The Importance of Authentication
This is a picture of a $2.5B move in the value of Emulex Corporation, in response to a fraudulent press release by short-sellers through InternetWire in 2000. The release was widely disseminated by news media as a statement from Emulex management.
EMLX [reproduced from clearstation.com]

Manageability
Today, “cost” has a broader meaning than it once did:
• growth in administrative overhead with capacity
• no interruption of service to upgrade capacity
  “24 * 7 * 365 * .9999”
Where does the money go?
[Figure: pie charts of cost shares (vendor, staff, facility) in the “Old World” and the “New World”. Borrowed from Jim Gray]

Self-Managing Systems
IBM’s Autonomic Computing Challenge
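The “24 * 7 * 365 * .9999” target on the Manageability slide can be turned into a downtime budget. The sketch below assumes that .9999 is an availability fraction (the share of the year the service is up), which is an interpretation of the slide, not something it states; the function name is hypothetical.

```python
# Assumption: ".9999" is the fraction of time the service must be available.
MINUTES_PER_YEAR = 365 * 24 * 60  # the slide's 365-day, 24-hour year

def downtime_minutes_per_year(availability):
    """Minutes per year a service may be down at the given availability."""
    return MINUTES_PER_YEAR * (1.0 - availability)

four_nines = downtime_minutes_per_year(0.9999)

for target in (0.99, 0.999, 0.9999):
    minutes = downtime_minutes_per_year(target)
    print(f"availability {target}: {minutes:.1f} minutes of downtime per year")
```

Under that reading, four nines leaves under an hour per year for all upgrades and failures combined, which is why the slide pairs it with "no interruption of service to upgrade capacity".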
Complexity
[Figure labels: ???, Complexity, Growth and Change]

Operating Systems: The Classical View
The operating system (OS) defines an interface between programs and machines.
[Figure: layers, top to bottom: User Applications, system call interface, Operating System, machine interface, Architecture]
An OS implements a sort of virtual machine that is easier to program and share than the raw hardware.
[McKinley]

Studying Operating Systems
This course deals with “classical” operating systems issues:
• the services and facilities that operating systems provide;
• OS implementation on modern hardware;
  (and architectural support for modern operating systems)
• how hardware and software evolve together;
• the techniques used to implement software systems that are:
  large and complex,
  long-lived and evolving,
  concurrent,
  performance-critical.

The World Today
[Figure labels: desktop clients; servers (database, file, web, ...); Internet; server farms (clusters); mobile devices; Internet appliances; LAN/SAN network]

The Machine
Let’s start from where we left off in CPS 104…
[Figure labels: interrupts, Processor, Cache, Memory Bus, I/O Bridge, I/O Bus, Main Memory, Disk Controller, Graphics Controller, Network Interface, Disk, Disk, Graphics, Network]

The OS and the Hardware
The OS is the “permanent” software with the power to:
• control/abstract/mediate access to the hardware
  CPUs and memory
  I/O devices
• so user code can be:
  simpler
  device-independent
  portable
  even “transportable”
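One way to see the system call interface making user code device-independent is to send the same bytes to two different kinds of object through the same calls. This is a minimal sketch, assuming Python's `os` module as a thin layer over the underlying read and write calls; `write_all` and `MESSAGE` are hypothetical names introduced for illustration.

```python
import os
import tempfile

MESSAGE = b"hello from user code\n"

def write_all(fd, data):
    """Write every byte of data to the descriptor, retrying after short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]

# Target 1: a regular file on disk.
file_fd, path = tempfile.mkstemp()
try:
    write_all(file_fd, MESSAGE)
    os.lseek(file_fd, 0, os.SEEK_SET)  # rewind so the bytes can be read back
    from_file = os.read(file_fd, len(MESSAGE))
finally:
    os.close(file_fd)
    os.remove(path)

# Target 2: an in-memory pipe, a different kind of object behind the same calls.
read_fd, write_fd = os.pipe()
try:
    write_all(write_fd, MESSAGE)
    from_pipe = os.read(read_fd, len(MESSAGE))
finally:
    os.close(read_fd)
    os.close(write_fd)

print(from_file == from_pipe == MESSAGE)  # prints True
```

`write_all` never learns whether its descriptor names a disk file or a pipe; the OS supplies the device-specific part behind the interface.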
The OS and User Applications
The OS defines a framework for users and their programs to coexist, cooperate, and work together safely, supporting:
• concurrent execution/interaction of multiple user programs
• shared implementations of commonly needed facilities
  “The system is all the code you didn’t write.”
• mechanisms to share and combine software components
  Extensibility: add new components on-the-fly as they are developed.
• policies for safe and fair sharing of resources
  physical resources (e.g., CPU time and storage space)
  logical resources (e.g., data files, programs, mailboxes)

Overview of OS Services
Storage: primitives for files, virtual memory, etc.
  control devices and provide for the “care and feeding” of the memory system hardware and peripherals
Protection and security
  set boundaries that limit damage from faults and errors
  establish user identities, priorities, and accountability
  access control for logical and physical resources
Execution: primitives to create/execute programs
  support an environment for developing and running applications
Communication: “glue” for programs to interact

The Big Questions
1. How to divide function/state/trust across components?
  reason about flow of data and computation through the system
2. What abstractions/interfaces are sufficiently:
  powerful to meet a wide range of needs?
  efficient to implement and simple to use?
  versatile to enable construction of large/complex systems?
3. How can we build:
  reliable systems from unreliable components?
  trusted systems from untrusted components?
  unified systems from diverse components?
  coherent systems from distributed components?

Video: an old-timer’s view. (If time and inclination.)

Questions about (Operating) Systems
• What makes an OS good or not? What are the most important dimensions of “goodness”?
• What are the parts of an operating system? Where is the boundary between “the system” and the applications?
• What makes an OS different from any other big program? Is it any harder to build an OS that “doesn’t suck” than it is to build anything else?
  Unyielding foundations rule
• We can build bridges that do not fall down; why not an OS?

The Four Faces of Your Operating System
• service provider
  The OS exports commonly needed facilities with standard interfaces, so that programs can be simple and portable.
• executive/bureaucrat/juggler
  The OS controls access to hardware, and allocates physical resources (memory, disk, CPU time) for the greatest good.
• caretaker
  The OS monitors the hardware and intervenes to resolve exceptional conditions that interrupt smooth operation.
• cop/security guard
  The OS mediates access to resources and grants/denies requests.
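The Execution and Communication services listed under Overview of OS Services can be seen from user code in a few lines. The sketch below assumes Python's standard `subprocess` module as the way to reach them; `CHILD_SOURCE` and `run_child` are hypothetical names, and the child program is a toy that upper-cases its input.

```python
import subprocess
import sys

# Toy child program: read all of standard input, write it back upper-cased.
CHILD_SOURCE = "import sys; sys.stdout.write(sys.stdin.read().upper())"

def run_child(text):
    """Start a child process, send it text over a pipe, and return its reply."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],  # Execution: create and run a program
        input=text,                            # Communication: parent-to-child pipe
        capture_output=True,                   # Communication: child-to-parent pipe
        text=True,
        check=True,                            # raise if the child exits with an error
    )
    return result.stdout

reply = run_child("hello from the parent")
print(reply)  # prints HELLO FROM THE PARENT
```

The parent and child share no memory; everything they exchange passes through pipes that the OS creates and mediates.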
Classical View: The Questions
The basic issues/questions in this course are how to:
• allocate memory and storage to multiple programs?
• share the CPU among concurrently executing programs?
• suspend and resume programs?
• share data safely among concurrent activities?
• protect one executing program’s storage from another?
• protect the code that implements the protection, and mediates access to resources?
• prevent rogue programs from taking over the machine?
• allow programs to interact safely?

Where Do We Go From Here?
• Unix
• Virtual machines
• Linux and Xen
• Network system calls
• Internet servers and Web servers
• Threads and concurrency
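The question of how to share data safely among concurrent activities can be previewed with a small threads example. This is a sketch of one common answer, mutual exclusion with a lock, assuming Python's `threading` module; the `Counter` class and `run_workers` function are hypothetical names, not material from the course.

```python
import threading

class Counter:
    """A shared counter whose updates are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may perform the read-modify-write.
        with self._lock:
            self.value += 1

def run_workers(counter, n_threads, n_increments):
    """Run n_threads threads that each increment the counter n_increments times."""
    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

counter = Counter()
run_workers(counter, n_threads=4, n_increments=10000)
print(counter.value)  # prints 40000
```

Without the lock, two threads can read the same old value and both write back old + 1, losing an update; holding the lock across the whole read-modify-write rules that interleaving out.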