  1. Data Logistics in Network Computing Martin Swany

  2. Introduction and Motivation • The goal of Computational Grids is to mimic the electric power grid for computing power • Service-orientation to make computing power a utility • Compute cycles aren’t fungible • Data location and movement overhead is critical • In the case of Data Grids, data movement is the key problem • Managing the location of data is critical

  3. Data Logistics • The definition of Logistics “…the process of planning, implementing, and controlling the efficient, effective flow and storage of goods, services and related information from point of origin to point of consumption.” • Shipping and distribution enterprises make use of storage (and transformation) when moving material • Optimizing the flow, storage, and access of data is necessary to make distributed and Grid environments a viable reality

  4. The Logistical Session Layer • LSL allows systems to exploit “logistics” in stream-oriented communication • LSL service points (depots) provide short-term logistical storage and cooperative data forwarding • The primary focus is improved throughput for reliable data streams • Both unicast and multicast • A wide range of new functionality is possible

  5. The Logistical Session Layer

  6. Session Layer • A session is the end-to-end composition of segment-specific transports and signaling • More responsive control loop via reduction of signaling latency • Adapt to local conditions with greater specificity • Buffering in the network means retransmissions need not come from the source • [Diagram: end hosts and an intermediate depot, each with Physical/Data Link/Network/Transport stacks; a user-space Session layer at the end hosts spans the segment-specific transports]
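To make the buffering point concrete, here is a minimal store-and-forward relay sketch in Python. It only illustrates how a depot splits one end-to-end stream into per-segment TCP connections; the next-hop address, port, and chunk size are placeholders, and a real LSL depot adds signaling, storage management, and routing on top of this.

```python
import socket
import threading

NEXT_HOP = ("next-depot.example.org", 9000)   # placeholder next segment endpoint
LISTEN_ADDR = ("", 9000)                      # port this depot listens on
CHUNK = 64 * 1024

def relay(upstream):
    """Buffer data arriving on the upstream segment and forward it on a
    separate downstream TCP connection."""
    downstream = socket.create_connection(NEXT_HOP)
    try:
        while True:
            data = upstream.recv(CHUNK)
            if not data:
                break
            # The data is now held at this depot: the upstream sender's TCP
            # control loop terminates here, so retransmissions toward the
            # receiver can be served from this buffer rather than the source.
            downstream.sendall(data)
    finally:
        downstream.close()
        upstream.close()

def main():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(LISTEN_ADDR)
    srv.listen()
    while True:
        conn, _ = srv.accept()
        threading.Thread(target=relay, args=(conn,), daemon=True).start()

if __name__ == "__main__":
    main()
```

Each segment runs its own TCP control loop over a shorter RTT, which is what makes the composed session more responsive than a single end-to-end connection.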

  7. Initial Deployment

  8. LSL Performance Improvement

  9. TCP Overview • TCP provides reliable transmission of byte streams over best-effort packet networks • Sequence number to identify stream position inside segments • Segments are buffered until acknowledged • Congestion (sender) and flow control (receiver) “windows” • Everyone obeys the same rules to promote stability, fairness, and friendliness • Congestion-control loop uses ACKs to clock segment transmission • Round Trip Time (RTT) critical to responsiveness • Conservative congestion windows • Start with window O(1) and grow exponentially, then linearly • Additive increase, multiplicative decrease (AIMD) congestion window based on loss inference • “Sawtooth” steady state • Problems with high bandwidth-delay product networks
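The AIMD behaviour described above can be sketched in a few lines. This is a simplified per-RTT model of the window update, not an implementation of any particular TCP stack (fast retransmit/recovery and timeout handling are omitted).

```python
def aimd_step(cwnd: float, ssthresh: float, loss: bool):
    """Return the updated (cwnd, ssthresh), in segments, after one RTT."""
    if loss:
        # Multiplicative decrease: halve the window when loss is inferred.
        ssthresh = max(cwnd / 2.0, 2.0)
        cwnd = ssthresh
    elif cwnd < ssthresh:
        # Slow start: exponential growth, roughly doubling each RTT.
        cwnd *= 2.0
    else:
        # Congestion avoidance: additive increase of one segment per RTT.
        cwnd += 1.0
    return cwnd, ssthresh
```

The alternation between linear growth and halving on loss is what produces the “sawtooth” steady state, and the growth rate of one segment per RTT is why recovery is so slow on high bandwidth-delay product paths.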

  10. Synchronous Multicast with LSL • Each node sends the data and a recursively encoded control subtree to its children • The LSL connections exist simultaneously • Synchronous distribution • Reliability is provided by TCP • Distribution is logically half-duplex so the “upstream” channel can be used for negotiation and feedback
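A rough sketch of the recursive control-subtree idea follows, under the assumption that the tree is represented as a node-to-children map and that send() abstracts the actual LSL connection (neither is the LSL wire format). The point is only that each child receives the data plus just the subtree it is responsible for, so no node needs global knowledge.

```python
def distribute(node, subtree, data, send):
    """`subtree` maps each node to the list of its children; `send(child,
    data, child_subtree)` abstracts pushing data and control to a child."""
    for child in subtree.get(node, []):
        # Extract the subtree rooted at this child so the child can recurse
        # over its own descendants only.
        child_subtree = prune(subtree, child)
        send(child, data, child_subtree)

def prune(subtree, root):
    """Return the portion of the tree reachable from `root`."""
    kept, frontier = {}, [root]
    while frontier:
        n = frontier.pop()
        kids = subtree.get(n, [])
        kept[n] = kids
        frontier.extend(kids)
    return kept
```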

  11. Build a Distribution Tree

  12. Connections close once data is received

  13. Distribution Experiment • 52 nodes in 4 clusters • UIUC, UTK, UCSD • Distributions originate from a single host • Average times over 10 identical distributions • Without checksum • Control case is a “flat” tree within the same infrastructure

  14. Distribution Time

  15. Bandwidth Delivered

  16. Internet Backplane Protocol • LSL is closely related to IBP • Depots are similar in spirit but don’t yet share an implementation • Exposed network buffers • J. Plank, A. Bassi, M. Beck, T. Moore, M. Swany, R. Wolski, “The Internet Backplane Protocol: Storage in the Network,” IEEE Internet Computing, September/October 2001.

  17. LSL Implementation • The LSL client library provides compatibility with current socket applications • Although more functionality is available using the API directly • LD_PRELOAD for function override • socket(), bind(), connect(), setsockopt()… • Allows Un*x binaries to use LSL without recompilation • Daemon runs on all Un*x platforms • Forwarding is better on Linux than on BSD

  18. LSL Summary • Logistical data overlays can significantly improve performance for data movement • Demonstrated speedup • Think of a session as the composition of network-specific transport layers • There are many cases in which a single transport protocol from end to end might not be the best choice • Network heterogeneity • Wireless • Optical (with time-division multiplexing) • Potential to become a new model rather than short-term solution for TCP’s problems

  19. The End to End Arguments • Why aren’t techniques like this already in use? • Recall the end-to-end arguments • E2E Integrity • Network elements can’t be trusted • Duplication of function is inefficient • Fate sharing • State in the network related to a user • Scalability • Network transparency • Network opacity • The assumptions regarding scalability and complexity may not hold true any longer

  20. Cascaded TCP Dynamics • Recall TCP’s sequence number and ACKs • We can observe the progress of a TCP connection by plotting the sequence number acknowledged by the receiver • For this experiment, we captured packet-level traces of both LSL and end-to-end connections • 10 traces for each path and subpath were gathered • We compute the average growth of the sequence number with respect to time • The following graphs depict average progress of a set of transfers
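One plausible way to compute such average progress curves, assuming each packet-level trace has already been reduced to (time, acknowledged-bytes) samples; the sampling grid and field layout below are illustrative, not the original analysis scripts.

```python
def progress_curve(traces, step=0.5, duration=120.0):
    """Average sequence-number progress over a set of traces.

    Each trace is a list of (seconds_since_start, highest_acked_byte)
    samples; all traces are resampled onto a common time grid and averaged.
    """
    grid = [i * step for i in range(int(duration / step) + 1)]
    averaged = []
    for t in grid:
        # Highest byte acknowledged by time t in each trace (0 if none yet).
        values = [max((acked for ts, acked in trace if ts <= t), default=0)
                  for trace in traces]
        averaged.append(sum(values) / len(values))
    return grid, averaged
```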

  21. UCSB->Denver->UIUC (64M)

  22. UCSB->Houston->UFL (64M)

  23. Cost of Path Traversal • With pipelined use of LSL depots, there is some startup overhead, after which the time to transfer is dominated by the narrow link • Model as a graph, treating edge cost as time to transfer some amount of data • 1 / achievable bandwidth • The cost of a path is that of the maximum-valued link in the path from source to sink – max(c_{i,j}) for edge (i,j) in the path • Or, the achievable bandwidth on a path is constrained by the link with the smallest bandwidth • Optimization for this condition is minimax • Minimize the maximum value
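A tiny worked example of the cost model, with illustrative numbers rather than measured ones: edge cost is 1 / achievable bandwidth, and the path cost is the largest edge cost, i.e. the cost of the bottleneck link.

```python
link_bandwidths = [100.0, 45.0, 622.0]           # Mb/s on each hop (illustrative)
edge_costs = [1.0 / bw for bw in link_bandwidths]
path_cost = max(edge_costs)                      # 1/45, the maximum-valued edge
bottleneck_bw = 1.0 / path_cost                  # 45 Mb/s constrains the whole path
```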

  24. Routing Connections • Goal: Find the best path through the network for a given source and sink • Approach: Build a tree of best paths from a single source to all nodes with a greedy algorithm similar to Shortest Path • By walking the tree of Minimax paths (MMP) we can extract the best path from the source node to each of the other nodes • From source to a given destination, to build a complete source route • Or produce a table of destination/next-hop pairs for depot routing tables • O(m log n) operation for each m
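A sketch of such a greedy, Dijkstra-like construction in Python; the adjacency-map representation is an assumption, and this illustrates the minimax relaxation rather than reproducing the LSL routing code.

```python
import heapq

def minimax_tree(graph, source):
    """Build a tree of minimax paths from `source`.

    `graph[u]` is a dict of neighbor -> edge cost (e.g. 1 / bandwidth).
    Returns (cost, parent): cost[v] is the smallest possible value of the
    largest edge cost on any path from source to v, and parent[] encodes
    the tree of minimax paths.
    """
    cost = {source: 0.0}
    parent = {source: None}
    heap = [(0.0, source)]
    done = set()
    while heap:
        c, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph.get(u, {}).items():
            # The cost of extending the path is the worst edge seen so far.
            new_cost = max(c, w)
            if v not in cost or new_cost < cost[v]:
                cost[v] = new_cost
                parent[v] = u
                heapq.heappush(heap, (new_cost, v))
    return cost, parent
```

The greedy step is safe for the same reason it is in Shortest Path: once a node is popped its minimax cost cannot be improved, because any later path through an unfinished node already carries an equal or larger maximum edge.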

  25. A Tree of Minimax Paths

  26. Edge Equivalence • Bandwidth measurements vary slightly from moment to moment • Connections are bound by the same wide-area connection

  27. Edge Equivalence Threshold - ε • Modified algorithm considers edges within ε of one another to have the same cost
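For example, the relaxation test in the minimax sketch above could be softened with a tolerance ε, so that transient measurement noise does not flip the choice between paths that share the same underlying wide-area link. The value below is purely illustrative.

```python
EPSILON = 0.05  # assumed tolerance; in practice tied to measurement variability

def effectively_less(a: float, b: float, eps: float = EPSILON) -> bool:
    """Treat costs within eps of one another as equal."""
    return a < b - eps

# In the minimax relaxation, replace `new_cost < cost[v]` with
# `effectively_less(new_cost, cost[v])` to ignore insignificant differences.
```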

  28. Network Prediction/Forecasting • Predicting network performance is difficult, especially over links with high bandwidth-delay product • Predictions are best generated from a history of identical measurements • Frequent probes cannot be intrusive • How do we predict large transfers? • Instrumentation data is inexpensive • Approach: combine instrumentation data with current lightweight probes to improve application-specific forecasts

  29. What can short probes tell us? • [Plots: NWS 64KB network probes, ANL->UCSB, bandwidth (Mb/s) over 10 days, and HTTP 16MB transfers, ANL->UCSB, bandwidth (Mb/s) over the same 10 days]

  30. Multivariate Forecasting • CDF(x) = Pr(X ≤ x) • ECDF(x) = (count of observations ≤ x) / total • quantile = CDF_X(value_X); prediction_Y = CDF_Y⁻¹(quantile) • [Plots: empirical CDFs (ECDF probability vs. bandwidth in Mbit/sec) for the probe and transfer series]
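A minimal sketch of this CDF-matching step in Python, assuming two histories of bandwidth observations (small probes and large transfers); the helper names are illustrative, not the forecaster's actual interface.

```python
import bisect

def ecdf_quantile(history, x):
    """Empirical CDF: fraction of past observations <= x."""
    data = sorted(history)
    return bisect.bisect_right(data, x) / len(data)

def inverse_ecdf(history, q):
    """Empirical quantile: the observation at rank q (clamped to the range)."""
    data = sorted(history)
    idx = min(int(q * len(data)), len(data) - 1)
    return data[idx]

def multivariate_forecast(probe_history, transfer_history, current_probe):
    """Map the current small-probe value to its quantile in the probe
    history, then read the same quantile off the large-transfer history:
    prediction_Y = CDF_Y^-1(CDF_X(value_X))."""
    q = ecdf_quantile(probe_history, current_probe)
    return inverse_ecdf(transfer_history, q)
```

Intuitively, a probe in the 80th percentile of recent probe measurements predicts a transfer near the 80th percentile of recent transfer measurements, even though the two series live on very different bandwidth scales.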

  31. Experimental Configuration • Collect 64KB bandwidth measurements every 10 seconds • Time 16MB HTTP transfers every 60 seconds • Use the wget utility to get a file from the filesystem • Heavily used, general-purpose systems including a Solaris system at Argonne Nat’l Lab and a Linux machine at UCSB • Forecasting error as a measure of efficacy • Difference between the forecast and the measured value
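For reference, the error metrics used in the following comparisons might be computed as below; this is a straightforward sketch, and the choice of normalizing by the mean measured value is an assumption, not taken from the slides.

```python
def mean_absolute_error(forecasts, measurements):
    """MAE over paired (forecast, measurement) series."""
    return sum(abs(f - m) for f, m in zip(forecasts, measurements)) / len(measurements)

def root_mean_square_error(forecasts, measurements):
    """RMSE over paired (forecast, measurement) series."""
    return (sum((f - m) ** 2 for f, m in zip(forecasts, measurements))
            / len(measurements)) ** 0.5

def normalized_mae(forecasts, measurements):
    """MAE divided by the mean measured value (assumed normalization)."""
    return mean_absolute_error(forecasts, measurements) / (
        sum(measurements) / len(measurements))
```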

  32. Comparison of Forecasting 1 • [Plot: mean absolute error (Mb/s) of the univariate vs. multivariate forecaster, against time between HTTP transfers (1 to 450 minutes)]

  33. Comparison of Forecasting 2 • [Plot: root mean square error of the univariate vs. multivariate forecaster, against time between HTTP transfers (1 to 450 minutes)]

  34. Last Value Prediction • [Plot: normalized mean absolute error of last-value prediction vs. the multivariate forecaster, against time between HTTP transfers (1 to 450 minutes)]
