Distributed Systems, Fall 2008
Course introduction
Defining distributed systems
“A distributed system is one in which components located at networked computers communicate and coordinate their actions by passing messages.” (Coulouris, Dollimore, Kindberg, 2005)
“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” (Leslie Lamport, 1987)
Outline
• Staff presentation
• Course presentation
• Lessons from last year
• This year’s course
• General information on middleware
• Different kinds of communication
• External data representation
• Interaction, failure handling, security
• Design considerations
Staff presentation
• Daniel Henriksson danielh@cs (teacher and assistant)
• Lars Larsson larsson@cs (teacher and assistant)
• P-O Östberg p-o@cs (teacher)
Questions about the practical assignment should be sent to both Daniel and Lars. Questions about the material covered in lectures should be sent to the appropriate teacher.
Course presentation
Students should obtain:
• Knowledge of theoretical models for distributed systems
• Knowledge of problems and solutions in the design and implementation of distributed systems
The course covers:
• Architectural models of distributed systems
• Client-server, peer-to-peer, transactions, transparency, naming, error handling, resource management, and synchronisation
• Computer security in a broad perspective
• Distributed programming and middleware
Course presentation
• Theoretical part (4.5 Swedish hp)
  – Theory, methods, algorithms, and principles
• Practical part (3 Swedish hp)
  – Obligatory practical assignment
Lessons from last year (positive)
• The practical assignment was praised as interesting and hard, but rewarding
  – It gave an opportunity to put the theory into practice
• The material was interesting, useful, and challenging
• Both the material and the assignment gave insight into several problems and how they can be solved
Lessons from last year (negative)
• The lectures were not satisfactory, and “not of any use” (too vague and not enough detail)
• The practical assignment was very time-consuming, and did not cover all material of the course
• The book was hard to read, and even if one started to study it “on time”, it was hard to study all of it
• The practical assignment was unstructured
• Mixed languages
  – Some material was only presented in Swedish, and students were allowed to hand in reports and exams in either Swedish or English
This year’s course
• New staff working with the course
• Lectures based on the course as it was given two years ago
• Only English answers and reports are allowed
• Practical assignment requires more structure (project plan with milestones) and provides more structure (common interface)
• Same book (it is second to none)
The practical assignment
• Group communication middleware
  – Group handling
  – Message ordering
  – Multicast communication
• Presentation of working implementation at the end of the course
• Solved in pairs (2 students per group)
• Deals with theory from the first couple of lectures
http://www.cs.umu.se/kurser/5DV020/HT-08/assignment.html
Teaching style
• Pedagogical approach
• Choice of teaching aids
• Transparencies (overhead slides)
  – Most come directly from the book’s website
• What should one learn, and at what level of detail?
Architectural models
“The architecture of a system is its structure in terms of separately specified components.” (Section 2.2)
• Placement of components across the network
• Relationships between components
  – Servers, clients, peers
• Dynamic systems (downloadable code, seamless modifications of network)
Definitions:
• Platform: lowest-level hardware and software (up to OS level)
• Middleware: software layer designed to mask heterogeneity and provide a convenient programming model
System architectures
“Pure” models:
• Client-server
  – HTTP, FTP, … countless others
• Peer-to-peer
  – BitTorrent, Freenet, Direct Connect, …
Variations:
• Multiple servers (load balancing, more reliable)
• Mobile code (download code from server and run it)
• Mobile agents
• Thin clients
Design of distributed systems
• Performance issues
  – Responsiveness
  – Throughput
  – Load balancing
• Quality of service (QoS)
  – Reliability, security, performance (time-critical)
  – Adaptability
• Caching and replication
  – Web caching and Content Distribution Networks
• Dependability
  – Correctness, security, and fault tolerance
Middleware
• Distributed systems often utilise middleware technologies to aid development
• Offers layers of abstraction
• Extends traditional programming models:
  – Local procedure call -> Remote Procedure Call
  – OOP -> Remote Method Invocation
  – Event-based programming model
[Figure: layer stack, top to bottom: applications and services; RMI and RPC; request-reply protocol; marshalling and external data representation; UDP and TCP. The layers between the applications and UDP/TCP are labelled "Middleware layers".]
Middleware challenges
• Heterogeneous systems
  – Networks, hardware, OS, languages, protocols
• Openness
• Security
• Scalability
  – Control physical resource costs and performance loss, resource conservation, bottleneck avoidance
• Failure handling
  – Detection, masking, tolerance, recovery, redundancy
• Concurrency
  – No global time, simultaneously running processes
• Transparency
  – Access, location, concurrency, replication, failure, mobility, performance, scaling
Detailed middleware issues
• How are parameters passed, and how is data converted?
• How are distributed resources (functions, methods, objects) published and discovered?
• How are errors handled in a distributed system?
• Security concerns
• Memory handling
  – Distributed garbage collection?
Inter-process communication
• Communication in several layers
  – Network protocol
  – Transport protocol
  – Message passing
  – Request/reply
  – Transactions
Message passing
send(destination, message);
receive(source, message);
Levels of synchronisation:
1. Non-blocking send: the call to send() returns as soon as the kernel has loaded the parameters
2. Blocking send: send() returns when the receiver’s kernel has received the message
3. (Explicit blocking send: send() returns when the receiver’s application process has received the message – application level)
4. Request and reply: send() returns when the receiver has created a response and sent it back
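The difference between the first and the last level can be illustrated with a minimal Python sketch. It assumes a local `socket.socketpair()` in place of a real network, and the helper names (`send`, `receive`, `request_reply`, `echo_server`) are hypothetical, chosen only to mirror the pseudocode on the slide.

```python
import socket
import threading


def send(sock: socket.socket, message: bytes) -> None:
    """Roughly level 1: returns once the local kernel has accepted the bytes."""
    sock.sendall(message)


def receive(sock: socket.socket, max_bytes: int = 1024) -> bytes:
    """Blocks until some data is available on the socket."""
    return sock.recv(max_bytes)


def request_reply(sock: socket.socket, message: bytes) -> bytes:
    """Level 4: returns only after the peer has produced and sent a response."""
    send(sock, message)
    return receive(sock)


def echo_server(sock: socket.socket) -> None:
    """Hypothetical peer: answers one request with an upper-cased copy."""
    request = receive(sock)
    send(sock, request.upper())


# Two connected sockets stand in for client and server endpoints.
client, server = socket.socketpair()
worker = threading.Thread(target=echo_server, args=(server,))
worker.start()

reply = request_reply(client, b"ping")  # blocks until the server has answered

worker.join()
client.close()
server.close()
```

With `send()` alone the caller learns nothing about whether the peer ever saw the message; `request_reply()` trades latency for that knowledge.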
Reliability
• Reliable: messages are guaranteed to be delivered despite a “reasonable” number of packets being dropped or lost
• Unreliable: messages are not guaranteed to be delivered if even a single packet is dropped or lost
• UDP is totally unreliable
• TCP is (almost) reliable
Request/Reply communication
• Choosing the underlying transport protocol
  – UDP vs TCP
• Message identifier
  – Sender and requestID
• Dealing with lost packets
  – How many retransmits?
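A client-side sketch of these points, under stated assumptions: the network is modelled by a callable that returns `None` when a timeout would have occurred, and the names `do_operation`, `make_lossy_channel`, and `RequestFailed` are hypothetical. The message identifier is built from the sender and a request ID, and the number of retransmits is bounded.

```python
from typing import Callable, Dict, Optional, Tuple


class RequestFailed(Exception):
    """Raised when no reply arrives within the allowed number of attempts."""


def do_operation(
    transmit: Callable[[bytes], Optional[bytes]],
    sender: str,
    request_id: int,
    payload: bytes,
    max_retransmits: int = 3,
) -> bytes:
    # The identifier (sender, request_id) is identical in every retransmission,
    # so the receiver is able to recognise duplicates.
    message = f"{sender}:{request_id}:".encode("ascii") + payload
    for _attempt in range(1 + max_retransmits):
        reply = transmit(message)  # None models a timeout (lost packet)
        if reply is not None:
            return reply
    raise RequestFailed(f"no reply after {1 + max_retransmits} attempts")


def make_lossy_channel(drops: int) -> Tuple[Callable[[bytes], Optional[bytes]], Dict]:
    """Hypothetical channel that loses the first `drops` messages."""
    state = {"remaining": drops, "seen": []}

    def transmit(message: bytes) -> Optional[bytes]:
        state["seen"].append(message)
        if state["remaining"] > 0:
            state["remaining"] -= 1
            return None
        return b"ok:" + message

    return transmit, state


channel, channel_state = make_lossy_channel(drops=2)
reply = do_operation(channel, "client-a", 7, b"read")
```

Here the first two transmissions are lost and the third succeeds; with more losses than `max_retransmits` allows, the caller gets `RequestFailed` instead of waiting forever.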
Request/Reply – Retransmit
• When retransmitting, the receiver may end up with duplicates
• Filter based on message ID
• Carry out the operation again (or resend cached results)
  – Idempotent operation: can be repeated with the same results each time
  – Some kind of history is required when resending cached results
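Duplicate filtering with a reply history might look like the following sketch. The class `ReplyHistory` and the `deposit` operation are hypothetical; `deposit` is deliberately not idempotent, so executing a duplicate would give a wrong result.

```python
from typing import Callable, Dict, Tuple


class ReplyHistory:
    """Server-side filter: executes each (sender, request ID) at most once."""

    def __init__(self, operation: Callable[[str], str]) -> None:
        self.operation = operation
        self.history: Dict[Tuple[str, int], str] = {}
        self.executions = 0

    def handle(self, sender: str, request_id: int, payload: str) -> str:
        key = (sender, request_id)
        if key in self.history:
            # Duplicate request: resend the cached reply, do not re-execute.
            return self.history[key]
        self.executions += 1
        reply = self.operation(payload)
        self.history[key] = reply
        return reply


balance = {"amount": 0}


def deposit(payload: str) -> str:
    """Not idempotent: every execution changes the balance."""
    balance["amount"] += int(payload)
    return f"balance={balance['amount']}"


server = ReplyHistory(deposit)
first = server.handle("client-a", 1, "10")
duplicate = server.handle("client-a", 1, "10")  # retransmission of request 1
second = server.handle("client-a", 2, "5")
```

A real history would also need a policy for discarding old entries, for instance when the client acknowledges a reply; that is left out here.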
Request/Reply – Protocols
• Request protocol – R
• Request-reply protocol – RR
• Request-reply-acknowledge protocol – RRA
Example RR protocol: HTTP
• Content negotiation
• Authentication
• Persistent connections (HTTP 1.1)
• ASCII-based protocol supporting binary data
• Methods
  – GET, POST, HEAD, PUT, DELETE, OPTIONS, TRACE
• Request:
  – Method, URL, protocol version, headers, (data)
• Reply:
  – Protocol version, status, reason, headers, (data)
• RFC 1945 (HTTP 1.0), RFC 2616 (HTTP 1.1)
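The request and reply layout can be sketched in a few lines of Python. The helpers `build_request` and `parse_status_line` are hypothetical, as is the host `www.example.org`; the sketch only formats and parses messages and does not open a connection.

```python
from typing import Dict, Tuple


def build_request(
    method: str,
    url: str,
    headers: Dict[str, str],
    body: bytes = b"",
    version: str = "HTTP/1.1",
) -> bytes:
    """Request line, header lines, an empty line, then the optional data."""
    lines = [f"{method} {url} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_status_line(reply: bytes) -> Tuple[str, int, str]:
    """Splits the first line of a reply into version, status, and reason."""
    status_line = reply.split(b"\r\n", 1)[0].decode("ascii")
    version, status, reason = status_line.split(" ", 2)
    return version, int(status), reason


request = build_request("GET", "/index.html", {"Host": "www.example.org"})
parsed = parse_status_line(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
```

The head of the message is plain ASCII text, while the body is passed through as raw bytes, which is how an ASCII-based protocol can still carry binary data.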
Data communication
• Problems when sending and receiving data
  – Data structures must be “flattened”
  – The representation of data types can differ
  – The representation of text can differ
  – Big endian (English digits) vs little endian (German digits)
• Solutions
  – Use the sender’s representation and include the information needed for translation
  – Use a standardized external format
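The byte-order problem is easy to reproduce with Python's standard `struct` module. The value `0x01020304` is an arbitrary example; the format codes `>I` and `<I` select big-endian and little-endian unsigned 32-bit integers.

```python
import struct

value = 0x01020304

# The same integer, laid out in the two byte orders.
big = struct.pack(">I", value)     # most significant byte first
little = struct.pack("<I", value)  # least significant byte first

# A receiver that assumes the wrong byte order reads a different number.
misread = struct.unpack("<I", big)[0]

# A receiver that agrees with the sender on the format recovers the value.
correct = struct.unpack(">I", big)[0]
```

Agreeing on one external format (here, always `>I`) removes the need for each receiver to know the sender's native byte order.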
External data representation
• Standard(s) for external representation of data structures and primitives
• Could be binary or text
Marshalling – Unmarshalling
• Marshalling
  – Convert an internal representation (e.g., an object) to an external data representation (e.g., text)
• Unmarshalling
  – Convert external data to an internal representation
• Marshalling and unmarshalling are typically handled by the middleware, without involving the application programmer.
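A minimal sketch of the round trip, assuming JSON text as the external representation. The `Person` class and the `marshal`/`unmarshal` helpers are hypothetical; in a real middleware this code would be generated or hidden behind the invocation layer.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Person:
    name: str
    place: str
    year: int


def marshal(person: Person) -> bytes:
    """Internal object -> flat external text representation (JSON, UTF-8)."""
    return json.dumps(asdict(person), sort_keys=True).encode("utf-8")


def unmarshal(data: bytes) -> Person:
    """External text representation -> new internal object."""
    return Person(**json.loads(data.decode("utf-8")))


original = Person(name="Smith", place="London", year=1984)
wire = marshal(original)
restored = unmarshal(wire)
```

The receiver ends up with an equal but separate object; only the flattened bytes in `wire` ever cross the network.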
External data representation
• Examples
  – Sun XDR (eXternal Data Representation)
  – Java Object Serialization (Java RMI)
  – CORBA CDR (Common Data Representation)
  – XML (eXtensible Markup Language)
    • Used in Web services
  – ASN.1 (Abstract Syntax Notation One)
    • Supports different encodings (such as XML)
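As an illustration of a binary external format, the following sketch encodes a record in the style of XDR: integers as 4-byte big-endian values, and strings as a 4-byte length followed by the bytes, zero-padded to a multiple of four. The helper names `xdr_int` and `xdr_string` and the sample record are hypothetical, and only these two types are covered.

```python
import struct


def xdr_int(value: int) -> bytes:
    """Signed 32-bit integer, big-endian."""
    return struct.pack(">i", value)


def xdr_string(text: str) -> bytes:
    """4-byte length, then the bytes, zero-padded to a 4-byte boundary."""
    raw = text.encode("ascii")
    padding = (4 - len(raw) % 4) % 4
    return struct.pack(">I", len(raw)) + raw + b"\x00" * padding


# A flattened record: two strings followed by an integer.
encoded = xdr_string("Smith") + xdr_string("London") + xdr_int(1984)
```

Unlike a text format such as XML, the encoded bytes carry no field names or type tags, so sender and receiver must agree on the record layout in advance.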