Design and Evaluation of a Virtual Experimental Environment for Distributed Systems

L. Sarzyniec, T. Buchert, E. Jeanvoine, L. Nussbaum
PDP 2013, Belfast, 27/02/2013
Short title: Distem - Design and Evaluation

Study of distributed systems

Many objects of study:
- Response time
- Throughput
- Performance
- Scalability
- Robustness
- Complexity
- Fault-tolerance
- Fairness
- etc.

Many experimental methodologies:
- In-situ: real applications on real platforms (Grid’5000, FutureGrid, PlanetLab, etc.)
- Simulation: modeled applications on modeled systems (SimGrid, ns-2, OMNeT++, etc.)

Different methodologies

In-situ methodology:
- ☺ more realistic
- ☺ uses real implementation
- ☹ limited to available environmental conditions
- ☹ usually unreproducible

Simulation:
- ☺ enables unprecedented experiments
- ☺ perfectly reproducible
- ☹ simplified assumptions
- ☹ lower realism

Is there a middle ground methodology?

Emulation

Technique used to efficiently simulate the behavior of a computer on another one, usually more powerful.

Idea:
1. Use an existing platform
2. Model your desired platform
3. Efficiently emulate the desired platform using the real platform

Emulation (cont.)

Advantages:
- Combines advantages of the simulation and in-situ approaches:
  - allows the use of real applications and infrastructure
  - enables complicated experiments
- Paves the way to reproducibility

Answers the following types of questions:
- How can I reproduce an experiment published in 2001 even if 1.5 GHz processors do not exist anymore?
- How can I evaluate my new P2P software designed for DSL networks?
- How does this runtime with advanced load-balancing capabilities perform on highly hierarchical networks?

Plan of the talk

During the rest of the talk I will:
1. present our emulation-based solution
2. describe its architecture
3. show and discuss evaluation results
4. conclude and outline future work

Distem - DISTributed systems EMulator

Distem is (freely available) software for building virtual distributed experimental environments.

[Figure: an equation-style diagram ("+", "=") whose result is labeled: Grid, Cloud, P2P, long-distance networks, heterogeneous nodes, …]

What can Distem do for you?

Features of Distem include:
- Introducing heterogeneity in an otherwise homogeneous cluster:
  - CPU heterogeneity: How does your solution perform when some nodes are slower?
  - Network heterogeneity: Does your solution work in an Internet-like infrastructure?
- Emulating complex network topologies: How does your solution perform on a Grid?
- Enlarging the scale of the experiment: How does your solution perform on several thousands of nodes?
- User-friendliness

CPU heterogeneity

Distem can host multiple virtual nodes on one physical node:
- with a different number of cores
- with different CPU performance

There are 2 strategies for degrading performance:
- CPU-Gov: based on hardware CPU throttling
- CPU-Hogs: advanced CPU burning

[Figure: one physical node with 8 CPU cores (0 to 7) hosting VN 1, VN 2, VN 3 and Virtual node 4; axes: CPU cores, CPU performance]

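As a rough illustration of the throttling idea behind CPU-Gov, the sketch below picks a hardware frequency step for a requested relative speed. This is a minimal sketch under stated assumptions, not Distem's actual algorithm: the function name `pick_frequency`, the rounding-down policy, and the list of frequency steps are all hypothetical.

```python
def pick_frequency(available_khz, target_ratio):
    """Pick a frequency step for a requested relative CPU speed.

    Hypothetical policy: choose the highest available step that does not
    exceed target_ratio * (maximum frequency); if every step is faster
    than the target, fall back to the slowest step.
    """
    if not available_khz:
        raise ValueError("no frequency steps given")
    if not 0 < target_ratio <= 1:
        raise ValueError("target_ratio must be in (0, 1]")
    target_khz = target_ratio * max(available_khz)
    candidates = [f for f in available_khz if f <= target_khz]
    return max(candidates) if candidates else min(available_khz)


# Illustrative frequency steps in kHz (made up, not measured on any machine).
steps = [1_200_000, 1_600_000, 2_000_000, 2_400_000]
print(pick_frequency(steps, 0.75))  # 75% of 2.4 GHz is 1.8 GHz, so the 1.6 GHz step is chosen
```

Because hardware exposes only a few discrete steps, a policy like this can only approximate the requested speed; the gap between the target and the chosen step is the kind of error a finer-grained strategy would have to cover.
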
Network heterogeneity

Distem can emulate properties of network links between nodes.

Each link can have a different:
- maximum bandwidth
- latency

They can be set for incoming and outgoing traffic independently.

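To make the per-direction settings concrete, here is a minimal Python sketch of a virtual interface with independent incoming and outgoing parameters. The class and function names are hypothetical, and the round-trip estimate assumes that the emulated latencies of each traversed direction simply add up (real network latency is ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Direction:
    bandwidth_mbps: float  # maximum bandwidth for this direction
    latency_ms: float      # emulated latency for this direction


@dataclass(frozen=True)
class VirtualInterface:
    outgoing: Direction
    incoming: Direction


def expected_rtt_ms(a, b):
    """Round trip a -> b -> a under the additive assumption.

    The request leaves a (outgoing) and enters b (incoming);
    the reply leaves b (outgoing) and enters a (incoming).
    """
    return (a.outgoing.latency_ms + b.incoming.latency_ms
            + b.outgoing.latency_ms + a.incoming.latency_ms)


# Illustrative values only: each node delays its outgoing traffic.
node_a = VirtualInterface(outgoing=Direction(10.0, 5.0), incoming=Direction(10.0, 0.0))
node_b = VirtualInterface(outgoing=Direction(1.0, 2.0), incoming=Direction(1.0, 0.0))
print(expected_rtt_ms(node_a, node_b))
```

Keeping the two directions as separate records mirrors the slide's point that incoming and outgoing traffic can be shaped independently, for example to model an asymmetric DSL-like link.
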
Complex network configuration

- Define properties of network links using network heterogeneity
- Use them to emulate several local networks linked together

[Figure: example topology with nodes n1 to n5 connected through interfaces if0 and if1; each link direction is annotated with a bandwidth (from 100 kbps to 100 Mbps) and a latency (from 1 ms to 40 ms)]

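When several emulated links are chained, the end-to-end behavior can be estimated from the per-link settings. The sketch below assumes the usual first-order model (latencies add, and the slowest link bounds the throughput). The function name is hypothetical and the link values are illustrative; they are not the topology from the figure.

```python
def path_properties(links):
    """Estimate end-to-end properties of a path.

    links: list of (bandwidth_mbps, latency_ms) tuples, one per hop.
    Returns (bottleneck_bandwidth_mbps, total_latency_ms).
    """
    if not links:
        raise ValueError("a path needs at least one link")
    bottleneck = min(bandwidth for bandwidth, _ in links)
    total_latency = sum(latency for _, latency in links)
    return bottleneck, total_latency


# Illustrative path: fast LAN hop, slow wide-area hop, medium LAN hop.
path = [(100.0, 1.0), (1.0, 30.0), (10.0, 5.0)]
print(path_properties(path))  # bottleneck is the 1 Mbps hop; latencies sum to 36 ms
```
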
Scale of the experiment

Distem uses lightweight virtualization to:
- share resources between nodes:
  - CPU
  - network
  - filesystem
- host many instances of virtual nodes on a single node

This powerful feature:
- enables challenging experiments of unprecedented scale
- saves resources and energy

User-friendliness

Distem strives to be user-friendly:
- complex and tedious tasks are automated:
  - configuring network interfaces
  - populating routing tables
  - distributing system images
  - etc.
- three interfaces of increasing complexity and feature set are offered:
  - command-line
  - Ruby library
  - REST interface (with JSON to represent data)

User interfaces

Command-line interface (distem command):
- ☺ Easy to use
- ☹ Hard to automate
- ☹ No access to more advanced features

Ruby library:
- ☺ Easy to automate
- ☺ Easy to use (if you know Ruby)
- ☹ Requires Ruby

REST API:
- ☺ Access to all features
- ☺ Language agnostic
- ☹ Requires REST knowledge

[Figure: the user reaches Distem through the CLI, the Ruby library, or the REST API]

CLI example:

    distem --create-vnetwork vnetwork=net,address=10.144.0.0/22
    distem --create-vnode vnode=node-1,rootfs=file:///image.tgz
    distem --create-viface vnode=node-1,iface=if0,vnetwork=net
    distem --start-vnode node-1
    distem --execute vnode=node-1,command="hostname"

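Since the command-line interface is listed as hard to automate, one workaround is to generate the command lines from a script. The Python sketch below only builds and prints the commands; it uses exclusively the options that appear in the CLI example above, while the function name `distem_commands` and its default arguments are hypothetical. Nothing is executed.

```python
import shlex


def distem_commands(node_names, network="net", address="10.144.0.0/22",
                    rootfs="file:///image.tgz", iface="if0"):
    """Build argv lists that create one virtual network and several nodes.

    Only the options shown in the slide's CLI example are used.
    """
    commands = [["distem", "--create-vnetwork",
                 f"vnetwork={network},address={address}"]]
    for name in node_names:
        commands.append(["distem", "--create-vnode",
                         f"vnode={name},rootfs={rootfs}"])
        commands.append(["distem", "--create-viface",
                         f"vnode={name},iface={iface},vnetwork={network}"])
        commands.append(["distem", "--start-vnode", name])
    return commands


# Print the commands for two nodes so they can be reviewed before running.
for argv in distem_commands(["node-1", "node-2"]):
    print(shlex.join(argv))
```

On a machine where `distem` is installed, each argv list could be passed to `subprocess.run`; for anything beyond this kind of repetition, the Ruby library or the REST API is the interface the slide recommends.
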
Distem internals

Distem uses modern Linux features:
- Control Groups and Linux containers (LXC)
- CPU frequency scaling
- advanced networking (network bridging)
- network traffic control (packet schedulers and shapers)

Note that this limits the scope of experiments to Linux/Unix.

[Figure: a cluster of Node 1, Node 2 and Node 3 behind a switch, shown on both sides of an equation-style diagram ("+", "=")]

Communication architecture

- User’s machine: uses the command line interface, the Ruby client library or a REST client
- Coordinator & Pnode 1 (distemd): starts and controls other Pnodes, keeps the global state of the platform and acts as the gateway to Pnodes and Vnodes
- Pnode 2 (distemd) and Pnode 3 (distemd): host Vnodes

[Figure: the user’s machine talks to the coordinator over REST; the coordinator talks to the distemd daemon of each Pnode over REST]

Evaluation

To evaluate Distem we designed a few experiments concerned with:
- network emulation:
  - precision of latency emulation
  - latency emulation over time
  - precision of bandwidth emulation
  - emulation of a simple topology
  - basic performance analysis of scp and rsync tools
- CPU emulation (Linpack, DGEMM and FFT benchmarks)
- scalability (by performing a large deployment with Distem)

Each measurement was repeated many times and results are averaged.
We used the Grid’5000 testbed.

Precision of latency emulation

- Purpose: test if latency is properly emulated
- Setup: 2 physical nodes, 1 virtual node on each
- Emulated latencies from 1 ms to 100 ms
- Data point: RTT (custom ping tool) between two virtual nodes

[Figure: log-log plot, both axes from 10^0 to 10^2; x-axis: Emulated latency (ms); y-axis: Measured latency (ms); series: Measured latency (in), Measured latency (out), Expected latency]

Conclusion: Emulation is accurate, especially for values above real network latency (0.3 ms).

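One way to see why precision improves above the real network latency is to compute the relative error of averaged measurements. The Python sketch below is not the authors' analysis: it assumes a simple additive model in which the real latency (0.3 ms, from the slide) is added on top of the emulated value, and the sample lists are placeholders rather than measured data.

```python
from statistics import mean


def relative_error(samples_ms, expected_ms):
    """Relative error of the averaged measurement against the expected value."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    return abs(mean(samples_ms) - expected_ms) / expected_ms


REAL_LATENCY_MS = 0.3  # real network latency quoted on the slide

# Placeholder samples generated from the additive assumption, not measurements.
for emulated in (1.0, 10.0, 100.0):
    samples = [emulated + REAL_LATENCY_MS] * 3
    print(emulated, relative_error(samples, emulated))
```

Under this assumption the same fixed offset weighs roughly 30% at 1 ms but only about 0.3% at 100 ms, which is consistent with the slide's remark that emulation is most accurate for values well above the real latency.
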
Latency emulation over time

- Purpose: test if latency is constant over time
- Setup: 2 physical nodes, 1 virtual node on each
- Data point: result of high-frequency RTT probing

[Figure: x-axis: Time (ms); y-axis: Latency (ms), with ticks between 9.8 and 10.4; series: Measured latency, Emulated latency]

Conclusion: Emulation is stable and correct during the measurement.
