MC714: Distributed Systems
Prof. Lucas Wanner
Instituto de Computação, Unicamp
Communication
Lecture 5: Review and Socket Programming
Lecture 6: Message Passing and Multicast
Lecture 7: Remote Procedure Call
Revision: Threads and Distributed Systems
Improve performance:
- Starting a thread is typically much cheaper than starting a new process.
- A single-threaded server cannot simply be scaled up to a multiprocessor system.
- As with clients: hide network latency by handling the next request while the reply to the previous one is still being sent.
Better structure:
- Most servers have high I/O demands. Using simple, well-understood blocking calls simplifies the overall structure.
- Multithreaded programs tend to be smaller and easier to understand due to simplified flow of control.
Source: Maarten van Steen, Distributed Systems: Principles and Paradigms
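The latency-hiding point above can be illustrated with a minimal sketch. It assumes that each request spends its time in a blocking call, simulated here with `time.sleep`; the names `handle_request`, `serve_sequentially`, `serve_with_threads`, and `timed` are hypothetical and not part of the course material.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id: int, io_delay: float = 0.05) -> str:
    """Handle one request with a plain blocking call (simulated I/O)."""
    time.sleep(io_delay)  # assumption: stands in for a blocking disk or network call
    return f"reply-{request_id}"


def serve_sequentially(request_ids):
    """Single-threaded server: each request waits for the previous one."""
    return [handle_request(r) for r in request_ids]


def serve_with_threads(request_ids, workers: int = 8):
    """Multithreaded server: blocked threads overlap their waiting time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, request_ids))


def timed(fn, *args):
    """Return (result, elapsed seconds) for one call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Each worker still uses an ordinary blocking call, so the per-request code stays as simple as in the single-threaded version; only the dispatching changes.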
Revision: Architecture of VMs
Observation: Virtualization can take place at very different levels, strongly depending on the interfaces offered by the various system components.
[Figure: layered interfaces, from top to bottom: application, library (accessed through library functions), operating system (accessed through system calls), hardware (accessed through privileged and general instructions).]
Revision: Process VMs versus VM Monitors
[Figure: (a) process VM: an application on a runtime system, on an operating system, on hardware; (b) VM monitor: applications on several operating systems, all on a virtual machine monitor, on hardware.]
Process VM: A program is compiled to intermediate (portable) code, which is then executed by a runtime system (example: Java VM).
VM Monitor: A separate software layer mimics the instruction set of the hardware, so a complete operating system and its applications can be supported (examples: VMware, VirtualBox).
Revision: Servers and state
Stateless servers: Never keep accurate information about the status of a client after having handled a request:
- Don't record whether a file has been opened (simply close it again after access)
- Don't promise to invalidate a client's cache
- Don't keep track of your clients
Consequences:
- Clients and servers are completely independent
- State inconsistencies due to client or server crashes are reduced
- Possible loss of performance because, e.g., a server cannot anticipate client behavior (think of prefetching file blocks)
Question: Does connection-oriented communication fit into a stateless design?
Revision: Servers and state
Stateful servers: Keep track of the status of their clients:
- Record that a file has been opened, so that prefetching can be done
- Know which data a client has cached, and allow clients to keep local copies of shared data
Observation: The performance of stateful servers can be extremely high, provided clients are allowed to keep local copies.
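The contrast between the two designs can be sketched with a toy file service. This is an illustration under stated assumptions, not an actual file server: `FILES` is a hypothetical in-memory store, and the class and method names are invented for the example.

```python
FILES = {"notes.txt": b"distributed systems"}  # hypothetical in-memory file store


class StatelessServer:
    """Every request carries all the information needed to serve it."""

    def read(self, path: str, offset: int, count: int) -> bytes:
        # Nothing is remembered between calls: no open-file table, no cursor.
        return FILES[path][offset:offset + count]


class StatefulServer:
    """The server remembers which files are open and where each client is."""

    def __init__(self):
        self._open = {}        # handle -> [path, current position]
        self._next_handle = 0

    def open(self, path: str) -> int:
        handle = self._next_handle
        self._next_handle += 1
        self._open[handle] = [path, 0]
        return handle

    def read(self, handle: int, count: int) -> bytes:
        path, position = self._open[handle]
        data = FILES[path][position:position + count]
        self._open[handle][1] = position + len(data)  # state that is lost on a crash
        return data

    def close(self, handle: int) -> None:
        del self._open[handle]
```

If the stateful server crashes, its `_open` table disappears and clients hold dangling handles. The stateless server has nothing to recover, but it also has no record of access patterns that it could use for prefetching.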
Revision: Server clusters
[Figure: three-tier server cluster. Client requests arrive at a logical switch (possibly multiple) in the first tier, which dispatches each request to the application/compute servers in the second tier; these use a distributed file/database system in the third tier.]
Crucial element: The first tier is generally responsible for passing requests to an appropriate server.
Revision: Request Handling
Observation: Having the first tier handle all communication from/to the cluster may lead to a bottleneck.
Solution: Several exist; a popular one is TCP handoff.
[Figure: TCP handoff. The client sends its request to the switch, which hands the request off to a server; the server sends the response to the client, so that logically there is a single TCP connection.]
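Revision: Code Migration
Where code, state, and resources reside before and after execution (state* denotes the state after execution; arrows give the direction in which code or state moves):
- Client-Server: before, the server holds code, state, and resource; after, the server holds code, state*, and resource.
- Remote Evaluation (client → server): before, the client holds the code and the server holds state and resource; after, the server holds code, state*, and resource.
- Code-on-Demand (client ← server): before, the server holds the code and the client holds state and resource; after, the client holds code, state*, and resource.
- Mobile Agents (client → server): before, the client holds code, state, and resource, and the server holds a resource; after, the client holds its resource and the server holds code, state*, and resource.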
Revision: Exercises
1. Consider a service that takes a total of 10 ms to handle a request, provided the necessary data is in a main-memory cache. When the data is not in the cache, a disk operation that takes 90 ms is needed before the request can be completed, and during this time the thread handling the request is suspended. Assume the data is in the cache for 50% of the requests. How many requests per second can the server handle if it is implemented with a single thread? And if the server uses multiple threads?
2. Does it make sense to limit the number of threads in a server process? Justify your answer.
3. Are there cases in which a single-threaded server performs better than a multithreaded server? Justify your answer.
4. A multi-process server has some advantages and disadvantages compared with a multithreaded server. Give some examples.
5. Is a server that maintains a TCP/IP connection to a client stateful or stateless?
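One way to set up the arithmetic for exercise 1 is sketched below. It rests on assumptions the exercise does not state explicitly: a cache miss costs the 10 ms of processing plus the 90 ms disk operation, there is a single CPU, and with multiple threads all disk waits overlap fully with the processing of other requests (the disk itself never becomes the bottleneck). The function names are hypothetical.

```python
def single_thread_throughput(hit_ms: float = 10.0,
                             disk_ms: float = 90.0,
                             hit_ratio: float = 0.5) -> float:
    """Requests per second when one thread handles requests one at a time."""
    miss_ms = hit_ms + disk_ms  # assumption: disk time adds to processing time
    mean_ms = hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms
    return 1000.0 / mean_ms


def multi_thread_throughput(hit_ms: float = 10.0) -> float:
    """Requests per second when disk waits are fully overlapped.

    Assumption: while one thread is suspended on the disk, another thread
    uses the CPU, so only the processing time limits throughput.
    """
    return 1000.0 / hit_ms
```

Under these assumptions the mean service time for a single thread is 0.5 × 10 ms + 0.5 × 100 ms = 55 ms, or about 18 requests per second, against 100 requests per second when the disk waits are hidden. Different assumptions (for example, a disk that serves only one request at a time) give different numbers.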
Layered Protocols
- Low-level layers
- Transport layer
- Application layer
- Middleware layer
Basic networking model
[Figure: the OSI reference model. Each layer communicates with its peer through its own protocol; from top to bottom: 7 Application, 6 Presentation, 5 Session, 4 Transport, 3 Network, 2 Data link, 1 Physical, above the network.]
Drawbacks:
- Focus on message-passing only
- Often unneeded or unwanted functionality
- Violates access transparency
Low-level layers
Recap:
- Physical layer: contains the specification and implementation of bits, and their transmission between sender and receiver
- Data link layer: prescribes the transmission of a series of bits into a frame to allow for error and flow control
- Network layer: describes how packets in a network of computers are to be routed
Observation: For many distributed systems, the lowest-level interface is that of the network layer.
Transport Layer
Important: The transport layer provides the actual communication facilities for most distributed systems.
Standard Internet protocols:
- TCP: connection-oriented, reliable, stream-oriented communication
- UDP: unreliable (best-effort) datagram communication
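The difference between the two protocols shows up directly in the socket API. The sketch below uses Python's standard `socket` module on the loopback interface; the helper names `tcp_round_trip`, `tcp_echo_once`, and `udp_round_trip` are hypothetical, and the single `recv` call assumes a short message that arrives in one piece.

```python
import socket
import threading


def tcp_echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and echo back what the client sent."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)  # assumption: the short message arrives in one piece
        conn.sendall(data)


def tcp_round_trip(message: bytes) -> bytes:
    """TCP: a connection is set up first, then bytes flow as a reliable stream."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=tcp_echo_once, args=(server,), daemon=True)
        worker.start()
        with socket.create_connection(("127.0.0.1", port), timeout=2.0) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join(timeout=2.0)
    return reply


def udp_round_trip(message: bytes) -> bytes:
    """UDP: no connection; one datagram is sent on a best-effort basis."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)  # a lost datagram would otherwise block forever
        sender.sendto(message, receiver.getsockname())
        data, _addr = receiver.recvfrom(1024)
    return data
```

With TCP the server must `listen` and `accept` before any data moves. With UDP the sender simply addresses a datagram to the receiver, and nothing in the protocol reports a loss, which is why the receiver sets a timeout.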
Middleware Layer
Observation: Middleware was invented to provide common services and protocols that can be used by many different applications:
- A rich set of communication protocols
- (Un)marshaling of data, necessary for integrated systems
- Naming protocols, to allow easy sharing of resources
- Security protocols for secure communication
- Scaling mechanisms, such as for replication and caching
Note: What remains are truly application-specific protocols... such as?
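As an illustration of (un)marshaling, the sketch below converts a small record to a machine-independent byte string and back, using Python's standard `struct` module. The record layout (an integer id, a floating-point value, and a name) and the function names are assumptions made for the example.

```python
import struct

# "!" selects network (big-endian) byte order with no padding:
# uint32 record id, float64 value, uint16 length of the name that follows.
HEADER = struct.Struct("!IdH")


def marshal(record_id: int, value: float, name: str) -> bytes:
    """Flatten a record into bytes that any machine can decode."""
    name_bytes = name.encode("utf-8")
    return HEADER.pack(record_id, value, len(name_bytes)) + name_bytes


def unmarshal(payload: bytes):
    """Rebuild (record_id, value, name) from the marshaled bytes."""
    record_id, value, name_len = HEADER.unpack_from(payload, 0)
    start = HEADER.size
    name = payload[start:start + name_len].decode("utf-8")
    return record_id, value, name
```

Fixing the byte order and field sizes in the format is what lets sender and receiver differ in architecture; middleware typically generates this kind of code so that applications do not have to write it by hand.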
Types of communication
[Figure: timeline of a client request travelling through a storage facility to the server (where it causes a transmission interrupt) and of the reply coming back, with three synchronization points marked: at request submission, at request delivery, and after processing by the server.]
Distinguish:
- Transient versus persistent communication
- Asynchronous versus synchronous communication
Types of communication
Transient versus persistent:
- Transient communication: A communication server discards a message when it cannot be delivered to the next server or to the receiver.
- Persistent communication: A message is stored at a communication server for as long as it takes to deliver it.
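The two delivery policies can be contrasted with a toy communication server. This is a minimal sketch under the assumption of a single relay and named receivers; the `Relay` class and its methods are hypothetical.

```python
class Relay:
    """Toy communication server with a transient or persistent policy."""

    def __init__(self, persistent: bool):
        self.persistent = persistent
        self.inboxes = {}  # receiver name -> messages delivered while connected
        self.stored = {}   # receiver name -> messages held until delivery

    def connect(self, receiver: str) -> None:
        """Receiver comes online; any stored messages are delivered now."""
        self.inboxes[receiver] = self.stored.pop(receiver, [])

    def disconnect(self, receiver: str) -> None:
        self.inboxes.pop(receiver, None)

    def send(self, receiver: str, message: str) -> bool:
        """Return True if the message was delivered or stored, False if dropped."""
        if receiver in self.inboxes:
            self.inboxes[receiver].append(message)
            return True
        if self.persistent:
            self.stored.setdefault(receiver, []).append(message)
            return True
        return False  # transient: the message is discarded
```

With `Relay(persistent=False)` a message sent to a disconnected receiver is lost; with `Relay(persistent=True)` it waits in `stored` and reaches the receiver on the next `connect`, which is the behavior that electronic mail relies on.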
Types of communication
Places for synchronization:
- At request submission
- At request delivery
- After request processing
Client/Server
Some observations: Client/server computing is generally based on a model of transient synchronous communication:
- Client and server have to be active at the time of communication
- Client issues a request and blocks until it receives the reply
- Server essentially waits only for incoming requests, and subsequently processes them
Drawbacks of synchronous communication:
- Client cannot do any other work while waiting for the reply
- Failures have to be handled immediately: the client is waiting
- The model may simply not be appropriate (mail, news)
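The first drawback can be made concrete with a small timing sketch. It assumes a server whose processing takes a fixed `DELAY` and a client with an equal amount of unrelated work to do; the function names are hypothetical, and the asynchronous variant uses a `Future` from Python's standard `concurrent.futures` module purely as an illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.1  # assumed processing time, in seconds


def server(request: str) -> str:
    """Hypothetical server: takes a while, then replies."""
    time.sleep(DELAY)
    return request.upper()


def other_work() -> None:
    """Work the client could do that does not depend on the reply."""
    time.sleep(DELAY)


def synchronous_client(request: str):
    """Block until the reply arrives, and only then do the other work."""
    start = time.perf_counter()
    reply = server(request)
    other_work()
    return reply, time.perf_counter() - start


def asynchronous_client(request: str):
    """Submit the request, do the other work meanwhile, then collect the reply."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(server, request)
        other_work()
        reply = future.result()
    return reply, time.perf_counter() - start
```

The synchronous client takes roughly twice `DELAY` because waiting and working happen one after the other, while the asynchronous client overlaps them. Both variants are still transient: the server must be running when the request is issued.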