Large-scale experiments on a cluster

Liang Wang
Supervisor: Prof. Jussi Kangasharju
Dept. of Computer Science, University of Helsinki, Finland
Large-scale experiments
• Motivation
  • Modern systems are large and distributed.
  • We need to evaluate their robustness, adaptability and performance.
• Three (four) options
  • Simulator
  • Internet
  • Cluster
  • (Analytical)
Why on the cluster
• With a cluster, we can
  • easily control all the participants and access all the data;
  • make large-scale experiments reproducible;
  • simulate different real-life scenarios by using different parameters.
• It looks beautiful; however,
  • the cluster is always “smaller” than the experiment scale we want;
  • designing and deploying experiments is non-trivial.
Ukko cluster
• Introduction
  • Computing infrastructure for research and education in the Dept. of Computer Science, Univ. of Helsinki.
  • Everyone in the department can access it.
• Specification
  • 240 Dell PowerEdge M610 nodes, connected with 10-Gb links;
  • each node has 32 GB of RAM and 2 Intel Xeon E5540 2.53 GHz CPUs;
  • each CPU has 4 cores, so a node supports 16 concurrent threads with hyper-threading.
• (Part of our work was done on the HIIT cluster.)
Our work & aims
• Aims in the long run
  • In a nutshell: measure and evaluate large-scale distributed systems in a systematic and consistent manner.
• Currently, we ...
  • focus on P2P system (BitTorrent) evaluation in a cluster environment;
  • develop simple but flexible tools to deploy experiments and automate the whole process (deployment, data collection, simple analysis); see the sketch below;
  • figure out the various restrictions on large-scale experiments on the Ukko cluster;
  • study how to design reasonable experiments;
  • try to gain experience for future evaluation of other systems.
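A minimal sketch of what such a deployment tool can look like, assuming the nodes are reachable over passwordless SSH; the hostnames, client command and log paths below are hypothetical, not the actual tool from this work:

```python
#!/usr/bin/env python
"""Sketch of an experiment driver: start clients on many nodes, wait,
then pull the logs back for analysis."""
import os
import subprocess

NODES = ["node%03d" % i for i in range(1, 4)]        # hypothetical hostnames
CLIENT_CMD = "~/mlbt/btclient ~/exp/test.torrent"    # hypothetical client command

def run_on(node, cmd):
    """Run a shell command on a remote node over SSH (non-blocking)."""
    return subprocess.Popen(["ssh", node, cmd])

def deploy():
    # One client per node here; real runs start many instances per node.
    procs = [run_on(n, CLIENT_CMD) for n in NODES]
    for p in procs:
        p.wait()

def collect():
    # Gather the per-node logs into a local directory for analysis.
    if not os.path.isdir("logs"):
        os.mkdir("logs")
    for n in NODES:
        subprocess.call(["scp", "%s:~/exp/client.log" % n, "logs/%s.log" % n])

if __name__ == "__main__":
    deploy()
    collect()
```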
BitTorrent experiment
• Why it is worth studying
  • The dominant file-sharing protocol in the world: real-world data can be used to validate the results from the cluster experiments.
  • A good starting point: there is abundant literature to refer to.
  • A typical complex system: peer-level behaviors are simple and easy to understand, yet the system’s overall behavior is complicated.
• Experiment target
  • Instrumented clients are widely used in research. There are several ready-made ones, but none is full-fledged. We use our own BitTorrent client, based on the official version.
  • Evaluate different implementations, mainly focusing on Mainline Ver4.
Some practical issues
• Bypass I/O
  • I/O operations to the hard disk are bypassed, not only because of the limited storage capacity, but also because disk I/O is the first performance bottleneck.
  • With the simplest experiment setting, one seeder and one leecher, and no limits on the transmission rate:

    Node A (MLBT) <--- 1-Gb link ---> Node B (MLBT)

    I/O bypassed?   Stable transmission rate   CPU resources on I/O wait
    No              70 MB/s                    over 85%
    Yes             115 MB/s                   almost 0%

  (A sketch of the bypass idea follows below.)
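The bypass can be pictured as a storage layer that drops writes and serves reads from memory. A minimal sketch, assuming the client funnels all piece I/O through one storage object; the class and method names are hypothetical, not the actual MLBT code:

```python
class BypassStorage(object):
    """Discard writes; serve reads from an in-memory copy of the file."""

    def __init__(self, data):
        self.data = data                   # entire payload held in RAM

    def write(self, offset, block):
        pass                               # bypass: the disk is never touched

    def read(self, offset, length):
        return self.data[offset:offset + length]

# A seeder pre-loads the real payload; a leecher can even use dummy bytes,
# since only piece hashes and lengths matter to the protocol logic.
storage = BypassStorage(b"\x00" * (64 * 1024 * 1024))   # 64 MB dummy payload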
Some practical issues (contd.)
• Running multiple instances on one node
  • Reason: maximize utilization; enlarge the experiment scale with limited resources.
  • Method: application-layer isolation, no hypervisor is used. Pros & cons?
  • Lots of nasty issues need to be taken care of, e.g. I/O overheads, storage issues, system parameters.
  • Bypass the write operations, redirect the read operations (see the launch sketch below).

  [Figure: many MLBT instances on one node, each doing send & recv; the WRITE path is cut off (X), and all READ operations are redirected to a single shared file.]
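A sketch of launching many instances on one node under application-layer isolation: each instance gets its own port and working directory, while reads are redirected to one shared copy of the file. The flags and paths are hypothetical, not the real MLBT command line:

```python
import os
import subprocess

SHARED_FILE = "/dev/shm/payload"    # one read-only copy in RAM-backed tmpfs
BASE_PORT = 20000
N_INSTANCES = 50

procs = []
for i in range(N_INSTANCES):
    workdir = "/tmp/mlbt-%d" % i    # per-instance state, logs, etc.
    if not os.path.isdir(workdir):
        os.makedirs(workdir)
    procs.append(subprocess.Popen(
        ["./btclient",
         "--port", str(BASE_PORT + i),   # distinct listening port per instance
         "--read-from", SHARED_FILE,     # redirect reads to the shared copy
         "--no-write"],                  # bypass write operations entirely
        cwd=workdir))

for p in procs:
    p.wait()
```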
Some practical issues (contd.)
• Tune the parameters
  • The default parameters may work well on a low-bandwidth home connection, but some of them are not suitable for a high-performance cluster.
  • Sending buffer (reduces write operations to the network interface), slice size (reduces read operations). Control the number of concurrent uploads, which the client calculates from the upload rate.
• Other restrictions
  • For example, ip_local_port_range = 32768 ~ 61000 (only 28232 ports available);
  • CPU, memory, max sockets, max open files, max processes, etc. (see the check script below).
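These limits are worth auditing on every node before a run. A small Linux-specific sketch; the output format is just illustrative:

```python
import resource

def port_range():
    """Ephemeral ports available for outgoing connections."""
    with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
        lo, hi = map(int, f.read().split())
    return lo, hi, hi - lo

def fd_limit():
    """Open-file limit: every peer connection costs one descriptor."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

if __name__ == "__main__":
    lo, hi, avail = port_range()
    print("ephemeral ports: %d-%d (%d usable)" % (lo, hi, avail))
    soft, hard = fd_limit()
    print("open files: soft=%d hard=%d" % (soft, hard))
```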
Some practical issues (contd.)

[Figure: measurement results.]
Some practical issues (contd.)

[Figure: two plots, each with a marked “safe region” of operating parameters.]
Two-node experiment
• Homogeneous experiment: all MLBT instances run the same configuration.
• Two types of experiments: upload-constrained & download-constrained.
• Two types of outgoing connections: connections to the native peers (instances on the same node) & connections to the foreign peers (instances on the other node); see the classification sketch below.

  [Figure: Node A and Node B, each running several MLBT instances.]
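The native/foreign split can be computed directly from connection logs: a connection is native when both endpoints run on the same physical node. A tiny sketch under that assumption; the peer addresses are examples:

```python
import socket

LOCAL_IP = socket.gethostbyname(socket.gethostname())

def is_native(peer_ip):
    """A peer is native if it runs on the same cluster node as we do."""
    return peer_ip == LOCAL_IP

peers = ["192.168.1.10", "192.168.1.11"]    # example peer addresses
native = sum(is_native(p) for p in peers)
print("ratio of native connections: %.2f" % (native / float(len(peers))))
```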
Change in BT’s behaviors
• Two-node experiment: upload-constrained

[Figure: experimental results.]
Change in BT’s behaviors
• Two-node experiment: download-constrained

[Figure: experimental results.]
How about three nodes?
• Homogeneous experiment: all MLBT instances with the same configuration.

[Figure: Nodes A, B and C, each running several MLBT instances.]
Change in BT’s behaviors
• How about 3 nodes? (download-constrained)

[Figure: experimental results.]
Conclusion
• To experiment on a cluster, we must consider
  • the experiment target (protocols and implementations);
  • platform configurations and limitations (which depend on the underlying OS);
  • network configurations and topology.
• Many things can become bottlenecks, so the experiment must be carefully designed!
Conclusion (contd.)
• Any other conclusions?
  • Experimenting on a cluster seems “dangerous”: too many underlying details, hacks and restrictions can mess up an experiment.
  • But don’t forget the benefits of the cluster!
  • It is feasible, but we need to be very careful:
    • always know, or at least try to know, every underlying detail;
    • always design rational experiments;
    • always play in the safe region.
Thank you!

Liang Wang, Dept. of Computer Science
Extra figures of experiments on Ukko

[Figure, four panels: (1) Mainline Ver4 on 1 and 2 nodes: peers/node vs. upload rate (MB/s), with capacity-planning curves y = 560/x and y = 244/x + 20; (2) CDF of average download rate (KB/s) for swarms of 5000 to 30000 nodes; (3) Aria2 evaluation: CDF of average upload & download rate (KB/s) on nodes cln023 and cln024; (4) Aria2 evaluation, 10450 peers, UTPEX enabled: ratio of upload connections to the native peers vs. peers/node.]