5/19/2018

Distributed Systems: Issues and Approaches

13A. Distributed Systems: Goals & Challenges
13B. Distributed Systems: Communication
13H. Public Key Encryption

Goals of Distributed Systems

• scalability and performance
  – apps require more resources than one computer has
  – grow system capacity/bandwidth to meet demand
• improved reliability and availability
  – 24x7 service despite disk/computer/software failures
• ease of use, with reduced operating expenses
  – centralized management of all services and systems
  – buy (better) services rather than computer equipment
• enable new collaboration and business models
  – collaborations that span system (or national) boundaries
  – a global free market for a wide range of new services

Peter Deutsch's "Seven Fallacies of Network Computing"

1. network is reliable
2. no latency (instant response time)
3. available bandwidth is infinite
4. network is secure
5. network topology & membership are stable
6. network admin is complete & consistent
7. cost of transporting additional data is zero

Bottom Line: true transparency is not achievable

The End of Self-Contained Systems

• authentication
  – Active Directory, LDAP, Kerberos, …
• configuration and control
  – Active Directory, LDAP, DHCP, CIM/WBEM, SNMP, …
• external data services
  – CIFS, NFS, Andrew, Amazon S3, …
• remote devices
  – X11, web user interfaces, network printers
• even power management, bootstrap, installation
  – vPro, PXE boot, bootp, live CDs, automatic s/w updates

Heterogeneous Interoperability

• heterogeneous clients
  – different instruction set architectures
  – different operating systems and versions
• heterogeneous servers
  – different implementations
  – offered by competing service providers
• heterogeneous networks
  – public and private
  – managed by different orgs in different countries

Fundamental Building Blocks Change

• the old model
  – programs run in processes
  – programs use APIs to access system resources
  – API services implemented by OS and libraries
• the new model
  – clients and servers run on nodes
  – clients use APIs to access services
  – API services are exchanged via protocols
• local is a (very important) special case
Changing Paradigms

• network connectivity becomes "a given"
  – new applications assume/exploit connectivity
  – new distributed programming paradigms emerge
  – new functionality depends on network services
• applications demand new kinds of services:
  – location independent operations
  – rendezvous between cooperating processes
  – WAN scale communication, synchronization

Performance, Scalability, Availability

• old model – better components (4-40%/yr)
  – find and optimize all avoidable overhead
  – get the OS to be as reliable as possible
  – run on the fastest and newest hardware
• new model – better systems (1000x)
  – add more $150 blades and a bigger switch
  – spreading the work over many nodes is a huge win
• performance – linear with number of blades
• availability – service continues despite node failures

General Paradigm – RPC

• procedure calls – a fundamental paradigm
  – primary unit of computation in most languages
  – unit of information hiding in most methodologies
  – primary level of interface specification
• a natural boundary between client and server
  – turn procedure calls into message send/receives
• a few limitations
  – no implicit parameters/returns (e.g. global variables)
  – no call-by-reference parameters
  – much slower than procedure calls (TANSTAAFL)

Remote Procedure Call Concepts

• Interface Specification
  – methods, parameter types, return types
• eXternal Data Representation
  – language/ISA independent data representations
  – may be abstract (e.g. XML) or efficient (binary)
• client stub
  – client-side proxy for a method in the API
• server stub (or skeleton)
  – server-side recipient for API invocations

Remote Procedure Calls – Data Flow

[Figure: on the client system, the client application calls the client stub; RPC messages carry the call to the server system, where the server skeleton calls the server application.]

Remote Procedure Calls – Tool Chain

[Figure: an RPC generation tool turns the RPC interface specification into client RPC stubs, a server RPC skeleton, and External Data Representation access functions. The client stubs are built with the client application code into the client; the server skeleton is built with the server implementation code into the server.]
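The stub and skeleton roles described above can be illustrated with a minimal in-process sketch. Everything named here is an assumption made for illustration: the `CalculatorImpl` interface with its single `add` method is hypothetical, JSON stands in for the external data representation, and a direct function call stands in for the network. A real RPC tool would generate the stub and skeleton from the interface specification.

```python
import json


class CalculatorImpl:
    """Server implementation code (hypothetical interface: add(a, b))."""

    def add(self, a, b):
        return a + b


class ServerSkeleton:
    """Server-side recipient: un-marshals a request, calls the
    implementation, and marshals the reply."""

    def __init__(self, impl):
        self.impl = impl

    def handle(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self.impl, request["method"])
        result = method(*request["params"])
        return json.dumps({"result": result}).encode("utf-8")


class ClientStub:
    """Client-side proxy: looks like a local object, but turns each
    call into a request message and a reply message."""

    def __init__(self, send):
        # send: bytes -> bytes. Here it is a plain function call that
        # stands in for the network transport (an assumption).
        self._send = send

    def add(self, a, b):
        request = json.dumps({"method": "add", "params": [a, b]})
        reply = self._send(request.encode("utf-8"))
        return json.loads(reply.decode("utf-8"))["result"]


skeleton = ServerSkeleton(CalculatorImpl())
calculator = ClientStub(skeleton.handle)
print(calculator.add(2, 3))
```

The client code only ever calls `calculator.add`; message formats and marshaling stay inside the stub. Because arguments are copied into the message, the sketch also shows why call-by-reference parameters and implicit parameters such as global variables do not carry across the boundary.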
(RPC – Key Features)

• client application links against local procedures
  – calls local procedures, gets results
• all RPC implementation is inside those procedures
• client application does not know about RPC
  – does not know about formats of messages
  – does not worry about sends, timeouts, resends
  – does not know about external data representation
• all of this is generated automatically by RPC tools
• the key to the tools is the interface specification

The Interoperability Challenge

• S/W, APIs and protocols evolve
  – to embrace new requirements, functionality
• a single node is running a single OS release
  – all s/w can be upgraded at same time as OS
• a distributed system is unlikely to be homogeneous
  – rolling upgrades do one server at a time
  – newly added servers may be up/down-rev
  – we may have no control over client s/w versions
• we must ensure they all "play well" together

Ensuring Interoperability

1. restricted evolution
  – all changes must be upwards compatible
2. compensation (run-time restriction)
  – all sessions begin with version negotiation
3. better tools that embrace polymorphism
  – every agent speaks its own protocol version
  – RPC language and tools are version-aware
    • messages are un-marshaled as each client expects
    • default behaviors are based on older expectations
  – equally applicable to messages and at-rest data

Extensible Data Representations

• upwards compatible serialized object formats
  – platform independent data representations
  – client-version sensitive translation
    • old clients never see new-version fields
    • new clients infer upwards compatible defaults
• Example: Google Protocol Buffers
  – very efficient translation
  – applicable to both protocols and persisted data
  – supports many representations (e.g. binary, JSON)
  – has adaptors for many languages (e.g. C, Python)

RPC is not a complete solution

• client/server binding model
  – expects to be given a live connection
• threading model implementation
  – a single thread services requests one at a time
  – numerous one-per-request worker threads
• failure handling
  – client must arrange for timeout and recovery
• higher level abstractions
  – e.g. Microsoft DCOM, Java RMI, DRb, Pyro

Evolving Interaction Paradigms

• HTTP is becoming the preferred transport
  – well supported, tunnels through firewalls
• Simple Object Access Protocol (SOAP)
  – HTTP transport of XML encoded RPC requests
  – options for other transports and encodings
  – supports non-RPC interactions (e.g. transactions)
• REpresentational State Transfer (REST)
  – stateless, scalable, cacheable, layerable
  – operations limited to Create/Read/Update/Delete
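The client-version sensitive translation described under Extensible Data Representations (old clients never see new-version fields, new clients infer upwards compatible defaults) can be sketched as follows. This is only an illustration of the idea, not Protocol Buffers' API or wire format: JSON stands in for the serialized form, and the schemas `SCHEMA_V1` and `SCHEMA_V2` and their fields are hypothetical.

```python
import json

# Hypothetical schemas: each maps a field name to its default value.
# Version 2 adds an "email" field in an upwards compatible way.
SCHEMA_V1 = {"name": "", "id": 0}
SCHEMA_V2 = {"name": "", "id": 0, "email": "unknown"}


def encode(record):
    """Marshal a record into a platform independent byte string."""
    return json.dumps(record).encode("utf-8")


def decode(data, schema):
    """Un-marshal a message as a reader with the given schema expects:
    fields the reader does not know are dropped, and fields missing
    from the message take the reader's default."""
    received = json.loads(data.decode("utf-8"))
    return {field: received.get(field, default)
            for field, default in schema.items()}


new_message = encode({"name": "ann", "id": 7, "email": "ann@example.com"})
old_message = encode({"name": "bob", "id": 8})

# An old (v1) reader of a new message never sees the "email" field.
print(decode(new_message, SCHEMA_V1))

# A new (v2) reader of an old message infers the default "email".
print(decode(old_message, SCHEMA_V2))
```

The same `decode` function works for messages on the wire and for records read back from storage, which is the sense in which the approach applies to both protocols and persisted data.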