
Scalable Multi-Purpose Network Representation for Large Scale Distributed System Simulation

Laurent Bobelin (1), Arnaud Legrand (1), David A. González Márquez (2), Pierre Navarro (1), Martin Quinson (3), Frédéric Suter (4), Christophe Thiéry (3)

(1) LIG, Grenoble University, France
(2) Departamento de Computación, Universidad de Buenos Aires, Argentina
(3) LORIA, Nancy University, France
(4) IN2P3 Computing Center, CNRS/IN2P3, Lyon-Villeurbanne, France

ANR 08 SEGI 022, ANR 11 INFRA 13
CCGrid 2012

A. Legrand (CNRS), INRIA-MESCAL. Scalability vs. Accuracy


Large Scale Distributed Systems

LSDS (clusters, P2P, grids, volunteer computing, clouds, ...) are a pain:
◮ analytic methods quickly become intractable and often fail to capture key characteristics of real systems
◮ experiments in the field are tedious, time-consuming, non-reproducible, and sometimes even impossible

Hence, much of the research in our area relies on simulation.

LSDS simulation challenges
◮ scalability (both in terms of speed and memory)
◮ accuracy/validity/realism (a very context-dependent notion)
◮ genericity

Most works trade everything for scalability, although...
"Premature optimization is the root of all evil" (D. E. Knuth)


Validity: Community Requirements

Networking: protocol design requires accurate packet-level simulations.

Not everyone has such needs:
◮ P2P DHT: geographic diversity, jitter, churn → no need for contention, only delay
◮ P2P streaming: network proximity, asymmetry, interference on the edge → ignore the core
◮ Grid: heterogeneity, complex topology, contention with large transfers → no need to focus on packets
◮ Volunteer computing: dynamic availability, heterogeneity → little need for networking
◮ HPC: complex communication workload, protocol peculiarities → build on regularity and homogeneity
◮ Cloud: mixture of previous requirements

Consequence: most simulators are ad hoc and domain-specific (read "dead within a year or so").


Network Communication Models

Packet-level simulation
The networking community has standards and many popular open-source projects (NS, GTNetS, OMNeT++, ...)
◮ full simulation of the whole protocol stack
◮ complex models → hard to instantiate
◮ inherently slow

Delay-based models
The simplest ones...
◮ communication time = constant delay, statistical distribution, LogP → Θ(1) footprint and O(1) computation
◮ coordinate-based systems to account for geographic proximity → Θ(N) footprint and O(1) computation

Although very scalable, these models ignore network congestion and typically assume large bisection bandwidth.
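To make the footprint and computation costs of a coordinate-based delay model concrete, here is a minimal sketch. It assumes 2-D coordinates and a Euclidean distance. The class name, the `base_delay` parameter, and the coordinate values are hypothetical and do not come from the slides.

```python
import math


class CoordinateDelayModel:
    """Delay-based model: one coordinate per host, latency grows with distance."""

    def __init__(self, base_delay=0.0):
        self.base_delay = base_delay  # hypothetical constant offset, in seconds
        self.coords = {}              # one entry per host: Theta(N) footprint

    def add_host(self, name, x, y):
        self.coords[name] = (x, y)

    def delay(self, src, dst):
        # O(1): a single distance computation, independent of N.
        # Other ongoing communications are ignored, so there is no contention.
        (x1, y1), (x2, y2) = self.coords[src], self.coords[dst]
        return self.base_delay + math.hypot(x2 - x1, y2 - y1)


model = CoordinateDelayModel(base_delay=0.001)
model.add_host("a", 0.0, 0.0)
model.add_host("b", 0.003, 0.004)
print(model.delay("a", "b"))
```

The delay between two hosts never depends on what else is being transferred, which is why such a model scales well but cannot capture congestion.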


Network Communication Models (cont'd)

Flow-level models
A communication (flow) is simulated as a single entity:

$$T_{i,j}(S) = L_{i,j} + \frac{S}{B_{i,j}}$$

where $S$ is the message size, $L_{i,j}$ the latency between $i$ and $j$, and $B_{i,j}$ the bandwidth between $i$ and $j$.

Estimating $B_{i,j}$ requires accounting for interactions with other flows.
Assume steady state and share bandwidth every time a new flow appears or disappears.

Setting: a set of flows $F$ and a set of links $L$

Constraints: for every link $j$,

$$\sum_{i \,:\, \text{flow } i \text{ uses link } j} \varrho_i \le C_j$$

where $\varrho_i$ is the bandwidth allocated to flow $i$ and $C_j$ the capacity of link $j$.

Objective function
◮ Max-Min: $\max(\min_i \varrho_i)$
◮ or other fancy objectives, e.g., Reno $\sim \max(\sum_i \log(\varrho_i))$
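The Max-Min objective under the link capacity constraints can be sketched with the classical progressive-filling procedure. This is an illustrative sketch, not the solver used by the authors: the function names, the example topology, and the units (MB and MB/s) are assumptions. It also ignores latency-dependent weights, so it does not reproduce RTT-unfairness.

```python
def max_min_share(capacity, routes):
    """Max-min fair rates by progressive filling.

    capacity: {link: C_j}; routes: {flow: [links used by the flow]}.
    Assumes every flow crosses at least one link.
    """
    rate = {}
    remaining = dict(capacity)
    active = {f: set(links) for f, links in routes.items()}
    while active:
        # Fair share offered by each link still crossed by an unfrozen flow.
        share = {}
        for link, cap in remaining.items():
            n = sum(1 for links in active.values() if link in links)
            if n:
                share[link] = cap / n
        # The bottleneck is the link offering the smallest fair share.
        bottleneck = min(share, key=share.get)
        s = share[bottleneck]
        # Freeze every flow crossing the bottleneck at that share.
        for f in [f for f, links in active.items() if bottleneck in links]:
            rate[f] = s
            for link in active[f]:
                remaining[link] -= s
            del active[f]
    return rate


def transfer_time(latency, size, bandwidth):
    """T(S) = L + S / B for a single flow with a fixed allocated bandwidth."""
    return latency + size / bandwidth


# Hypothetical platform: two links, three flows.
capacity = {"A": 10.0, "B": 4.0}
routes = {"f1": ["A"], "f2": ["A", "B"], "f3": ["B"]}
rates = max_min_share(capacity, routes)
print(rates)
print(transfer_time(0.01, 16.0, rates["f1"]))
```

In this example link B is the first bottleneck, so f2 and f3 each get half of its capacity, and f1 then takes what is left on link A. In a simulator, the sharing would be recomputed each time a flow starts or ends, as the steady-state assumption above describes.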

Wrap up on flow-level models

Such fluid models can account for key TCP characteristics:
◮ slow-start
◮ flow-control limitation
◮ RTT-unfairness
◮ cross-traffic interference

They are a very reasonable approximation for most LSDC systems.
Yet, many people think they are too complex to scale. Let's prove them wrong!

How to achieve scalability

Platform description: N nodes and E links

Main issues with topology
◮ description size, expressiveness
◮ memory footprint
◮ computation time

Classical network representation
1. Flat representation: 5000 hosts don't fit in 4 GB!
2. Graph representation, assuming shortest-path routing

Representation   Input    Footprint      Parsing   Lookup
Flat             N^2      N^2            N^2       1
Dijkstra         N + E    E + N log N    N + E     E + N log N
Floyd            N + E    N^2            N^3       1
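The trade-off between the two graph representations can be sketched as follows. This is an illustration under stated assumptions, not the simulator's implementation: the function names and the four-node platform are hypothetical, links are treated as undirected, and the graph is assumed connected. The Dijkstra variant uses a binary heap, whose lookup cost is O((N + E) log N) rather than the E + N log N bound listed in the table.

```python
import heapq


def floyd_next_hops(nodes, links):
    """Precompute all routes once: N^3 time, N^2 memory, then table lookups."""
    inf = float("inf")
    dist = {u: {v: (0.0 if u == v else inf) for v in nodes} for u in nodes}
    nxt = {u: {} for u in nodes}
    for (u, v), w in links.items():
        for a, b in ((u, v), (v, u)):
            if w < dist[a][b]:
                dist[a][b] = w
                nxt[a][b] = b
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return nxt


def floyd_route(nxt, src, dst):
    """Follow precomputed next hops; each hop is one table access."""
    path = [src]
    while path[-1] != dst:
        path.append(nxt[path[-1]][dst])
    return path


def dijkstra_route(adj, src, dst):
    """Store only the graph (N + E memory) and search at every lookup."""
    dist, prev, done = {src: 0.0}, {}, set()
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical platform: link weights stand for latencies.
nodes = ["a", "b", "c", "d"]
links = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 5.0, ("c", "d"): 1.0}
adj = {n: [] for n in nodes}
for (u, v), w in links.items():
    adj[u].append((v, w))
    adj[v].append((u, w))

nxt = floyd_next_hops(nodes, links)
print(floyd_route(nxt, "a", "d"))
print(dijkstra_route(adj, "a", "d"))
```

Both functions return the same route. The difference is where the cost is paid: Floyd pays at parsing time and in memory, Dijkstra pays at every lookup.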
