Tendencias de Uso y Diseño de Redes de Interconexión en Computadores Paralelos


  1. Tendencias de Uso y Diseño de Redes de Interconexión en Computadores Paralelos (Trends in the Use and Design of Interconnection Networks in Parallel Computers), April 14, 2016, Universidad Complutense de Madrid. Ramón Beivide, Universidad de Cantabria. Outline: 1. Introduction 2. Network Basis 3. System networks 4. On-chip networks (NoCs) 5. Some current research 2

  2. 1. Intro: MareNostrum 3 1. Intro: MareNostrum: BSC, Infiniband FDR10 non-blocking Folded Clos (up to 40 racks). [Diagram: Mellanox 648-port Infiniband FDR core switches (latency 0.7 μs, bandwidth 40 Gb/s); 560 FDR10 links; 36-port FDR10 leaf switches with 2 or 3 links to each core switch.] 40 iDataPlex racks / 3360 dx360 M4 nodes 4

  3. 1. Intro: Infiniband core switches 5 1. Intro: Cost dominated by (optical) wires 6

  4. 1. Intro: Blades 7 1. Intro: Blades 8

  5. 1. Intro: Multicore E5-2670 Xeon Processor 9 1. Intro: A row of servers in a Google DataCenter, 2012. 10

  6. 3. WSCs Array: Enrackable boards or blades + rack router. Figure 1.1: Sketch of the typical elements in warehouse-scale systems: 1U server (left), 7’ rack with Ethernet switch (middle), and diagram of a small cluster with a cluster-level Ethernet switch/router (right). 11 3. WSC Hierarchy 12

  7. 1. Intro: Cray Cascade (XC30, XC40) 13 1. Intro: Cray Cascade (XC30, XC40) 14

  8. 1. Intro: An Architectural Model. [Diagram: end nodes with CPUs, L/S units, ATUs and memories M1…Mn, attached through S/R interfaces to the Interconnection Network.] 15 1. Intro: What we need for one ExaFlop/s. Networks are pervasive and critical components in supercomputers, datacenters, servers and mobile computers. Complexity is moving from system networks towards on-chip networks: fewer nodes, but more complex. 16

  9. Outline 1. Introduction 2. Network Basis (Crossbars & Routers; Direct vs Indirect Networks) 3. System networks 4. On-chip networks (NoCs) 5. Some current research 17 2. Network Basis: all networks are based on crossbar switches • Switch complexity increases quadratically with the number of crossbar input/output ports N, i.e., grows as O(N²) • Has the property of being non-blocking (supports all N! I/O permutations) • Bidirectional for exploiting communication locality • Minimize latency & maximize throughput. [Diagram: 8×8 crossbar.] 18
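
The two crossbar properties above lend themselves to a quick numerical check. The Python sketch below is my own illustration (not code from the talk): it models an N×N crossbar as a full matrix of crosspoints, so the cost count grows as N², and it verifies that any of the N! input-to-output permutations can be set up without two inputs contending for the same output, which is the non-blocking property.

```python
# Minimal crossbar model: one crosspoint per (input, output) pair.
# Illustrates O(N^2) switch cost and the non-blocking property
# (any of the N! permutations is routable without output conflicts).
# Hypothetical helper functions, written for this note only.

from itertools import permutations


def crosspoint_count(n: int) -> int:
    """Cost of an n x n crossbar grows quadratically with the port count."""
    return n * n


def routes_without_blocking(perm: tuple) -> bool:
    """Each input drives its own row and each requested output its own column,
    so the only possible conflict is two inputs asking for the same output."""
    requested_outputs = list(perm)
    return len(set(requested_outputs)) == len(requested_outputs)


if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        print(f"{n:>2}-port crossbar: {crosspoint_count(n)} crosspoints")

    # Every one of the 8! = 40320 permutations of an 8-port crossbar is routable.
    assert all(routes_without_blocking(p) for p in permutations(range(8)))
    print("All 8! permutations routed without blocking.")
```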

  10. 2. Blocking vs. Non-blocking • Reduction in cost comes at the price of performance – Some networks have the property of being blocking (not all N! permutations are routable) – Contention is more likely to occur on network links › Paths from different sources to different destinations share one or more links. [Diagrams: a non-blocking topology and a blocking topology for 8 nodes.] 19 2. Switch or Router Microarchitecture: pipelined switch microarchitecture with five stages, IB (Input Buffering), RC (Route Computation), SA (Switch Arbitration), ST (Switch Traversal) and OB (Output Buffering). The packet header traverses IB-RC-SA-ST-OB; each payload fragment traverses IB-IB-IB-ST-OB. Matching the throughput of the internal switch datapath to the external link bandwidth is the goal. [Diagram: input/output buffers, link control, DEMUX/MUX, routing control unit, arbitration unit and crossbar.] 20
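
As a concrete reading of the pipelining table on this slide, the sketch below (my own, assuming one flit advances one stage per cycle and that payload flits inherit the header's route and arbitration) prints which stage each flit of a packet occupies in each cycle. Once the pipeline fills, one flit reaches the output buffer per cycle, which is how the internal datapath keeps up with the external link bandwidth.

```python
# Sketch of the 5-stage router pipeline from the slide:
# IB (Input Buffering), RC (Route Computation), SA (Switch Arbitration),
# ST (Switch Traversal), OB (Output Buffering).
# The header flit uses all five stages; payload flits reuse the header's
# route and arbitration decisions and only need IB, ST and OB.

HEADER_STAGES = ["IB", "RC", "SA", "ST", "OB"]
PAYLOAD_STAGES = ["IB", "IB", "IB", "ST", "OB"]  # as drawn in the slide's table


def pipeline_timeline(num_payload_flits: int):
    """Return (flit name, per-cycle stage occupancy) rows.

    Flit i enters the pipeline one cycle after flit i-1, so one flit
    leaves on the link every cycle once the pipeline is full.
    """
    rows = [("header", 0, HEADER_STAGES)]
    for i in range(num_payload_flits):
        rows.append((f"payload {i + 1}", i + 1, PAYLOAD_STAGES))

    total_cycles = len(rows) - 1 + len(HEADER_STAGES)
    table = []
    for name, start, stages in rows:
        cells = ["  "] * total_cycles
        for offset, stage in enumerate(stages):
            cells[start + offset] = stage
        table.append((name, cells))
    return table


if __name__ == "__main__":
    for name, cells in pipeline_timeline(num_payload_flits=3):
        print(f"{name:>10}: " + " ".join(cells))
```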

  11. 2. Network Organization: Indirect (Centralized) and Direct (Distributed) Networks. [Diagram: end nodes and switches.] 21 2. Previous Myrinet core switches (Indirect, Centralized) 22

  12. 2. IBM BG/Q (Direct, Distributed) 23 2. Network Organization • As crossbars do not scale, they need to be interconnected to serve an increasing number of endpoints • Direct (Distributed) vs Indirect (Centralized) Networks • Concentration can be used to reduce network costs – “c” end nodes connect to each switch – Allows larger systems to be built from fewer switches and links – Requires larger switch degree (see the sizing sketch below). [Diagrams: a 32-node system with 8-port switches, and a 64-node system with 8-port switches and c = 4.] 24
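
As a rough sizing aid, here is my own back-of-the-envelope sketch of the trade-off behind concentration; the 8-port figures are only loosely inspired by the slide's two example systems and are not a reconstruction of them. Attaching c end nodes to each switch divides the switch count by roughly c, but leaves fewer ports per switch for inter-switch links.

```python
# Concentration trade-off: fewer switches, but fewer ports left for the network.
# Illustrative only; the slide's example topologies are not reproduced here.

import math


def concentrated_switch_count(num_nodes: int, switch_ports: int, c: int):
    """Switches needed and ports left for inter-switch links when
    c end nodes are concentrated on each switch of the given radix."""
    if c >= switch_ports:
        raise ValueError("concentration must leave ports for the network")
    switches = math.ceil(num_nodes / c)
    network_ports_per_switch = switch_ports - c
    return switches, network_ports_per_switch


if __name__ == "__main__":
    for n, c in ((32, 1), (64, 4)):
        s, free = concentrated_switch_count(n, switch_ports=8, c=c)
        print(f"{n} nodes, c={c}: {s} switches, {free} ports each left for the network")
```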

  13. Outline 1. Introduction 2. Network Basis 3. System networks (Folded Clos, Tori, Dragonflies) 4. On-chip networks (NoCs) 5. Some current research 25 3. MareNostrum: BSC, Infiniband FDR10 non-blocking Folded Clos (up to 40 racks). [Same diagram as slide 4: Mellanox 648-port Infiniband FDR core switches (latency 0.7 μs, bandwidth 40 Gb/s); 560 FDR10 links; 36-port FDR10 leaf switches with 2 or 3 links to each core switch; 40 iDataPlex racks / 3360 dx360 M4 nodes.] 26

  14. 3. Network Topology: Centralized Switched (Indirect) Networks. [Diagram: 16-port crossbar network.] 27 3. Network Topology: Centralized Switched (Indirect) Networks. [Diagram: 16-port, 3-stage Clos network.] 28

  15. 3. Network Topology: Centralized Switched (Indirect) Networks. [Diagram: 16-port, 5-stage Clos network.] 29 3. Network Topology: Centralized Switched (Indirect) Networks. [Diagram: 16-port, 7-stage Clos network = Benes topology.] 30
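
The progression from 3 to 5 to 7 stages can be checked with a couple of lines. The sketch below assumes the diagrams are built from 2x2 switching elements, which I am inferring from the figures rather than from the text: a Benes network on N = 2^k ports then has 2k - 1 stages of N/2 switches each, which yields the 7-stage, 16-port case shown on this slide.

```python
# Benes network dimensions, assuming 2x2 building-block switches.
# For N = 2^k ports: 2k - 1 stages of N/2 switches each.

import math


def benes_dimensions(num_ports: int):
    """Stage count, per-stage switch count and total switch count of a
    Benes network built from 2x2 switches."""
    k = int(math.log2(num_ports))
    if 2 ** k != num_ports:
        raise ValueError("port count must be a power of two")
    stages = 2 * k - 1
    switches_per_stage = num_ports // 2
    return stages, switches_per_stage, stages * switches_per_stage


if __name__ == "__main__":
    stages, per_stage, total = benes_dimensions(16)
    print(f"16-port Benes: {stages} stages x {per_stage} switches = {total} 2x2 switches")
```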

  16. 3. Network Topology: Centralized Switched (Indirect) Networks • Bidirectional MINs • Increase modularity • Reduce hop count, d • Folded Clos network – Nodes at tree leaves – Switches at tree vertices – Total link bandwidth is constant across all tree levels, with full bisection bandwidth. Folded Clos = Folded Benes ≠ Fat tree network!!! [Diagram: 16-node folded Clos with its network bisection marked.] 31 3. Other DIRECT System Network Topologies: Distributed Switched (Direct) Networks. [Diagrams: 2D mesh or grid of 16 nodes; 2D torus of 16 nodes; hypercube of 16 nodes (16 = 2⁴, so n = 4).] Network bisection ≤ full bisection bandwidth! 32
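
To make the closing remark concrete, the sketch below uses standard textbook formulas (not taken from the slides) to compare the three 16-node direct topologies on diameter and bisection width. The 4x4 mesh's bisection of 4 links falls short of the full-bisection value of N/2 = 8 links, which the 4x4 torus and the 4-dimensional hypercube do reach.

```python
# Diameter (worst-case hops) and bisection width (links cut when the network
# is split in half) of the 16-node direct topologies shown on the slide.
# Standard formulas for k x k meshes/tori (k even) and n-cubes.

def mesh_2d(k):
    """k x k mesh."""
    return {"diameter": 2 * (k - 1), "bisection_links": k}


def torus_2d(k):
    """k x k torus, k even."""
    return {"diameter": 2 * (k // 2), "bisection_links": 2 * k}


def hypercube(n):
    """n-dimensional hypercube with 2^n nodes."""
    return {"diameter": n, "bisection_links": 2 ** (n - 1)}


if __name__ == "__main__":
    print("4x4 mesh:      ", mesh_2d(4))
    print("4x4 torus:     ", torus_2d(4))
    print("hypercube n=4: ", hypercube(4))
```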

  17. 3. IBM BlueGene/L/P Network: prismatic 32x32x64 torus (a mixed-radix network); BlueGene/P reaches 32x32x72 in its maximum configuration. Mixed-radix prismatic tori are also used by Cray. 33 3. IBM BG/Q 34
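
The following sketch is plain torus arithmetic, not code from the talk, and the 40x40x40 shape is only a hypothetical comparison point, not a real machine. It computes node count, diameter and bisection width of a mixed-radix torus such as the 32x32x64 configuration above, showing how the longest dimension dominates both metrics in a prismatic torus.

```python
# Node count, diameter and bisection width of a 3D torus with arbitrary radices.
# Diameter: wraparound halves the distance in each dimension.
# Bisection: cut perpendicular to the longest dimension, 2 links per node
# in that cross-section because of the wraparound.

from math import prod


def torus_stats(radices):
    nodes = prod(radices)
    diameter = sum(k // 2 for k in radices)
    longest = max(radices)
    bisection = 2 * nodes // longest
    return nodes, diameter, bisection


if __name__ == "__main__":
    for shape in ((32, 32, 64), (40, 40, 40)):  # prismatic vs hypothetical cubic
        n, d, b = torus_stats(shape)
        print(f"{shape}: {n} nodes, diameter {d}, bisection {b} links")
```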

  18. 3. IBM BG/Q 35 3. BG Network Routing: Adaptive Bubble Routing (ATC-UC Research Group). [Diagram: X, Y and Z wires.] 36

  19. 3. Fujitsu Tofu Network 37 3. More Recent Network Topologies: Distributed Switched (Direct) Networks • Fully-connected network: all nodes are directly connected to all other nodes using bidirectional dedicated links. [Diagram: fully-connected network of 8 nodes.] 38

  20. 3. IBM PERCS 39 3. IBM PERCS 40

  21. 3. IBM PERCS 41 3. Dragonfly Interconnection Network: organized as groups of routers. Parameters: • a: routers per group • p: nodes per router • h: global links per router • Well-balanced dragonfly [1]: a = 2p = 2h. Intra-group: local links forming a complete graph among the routers of a group. Inter-group: global links forming a complete graph among groups. (A sizing sketch follows below.)
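
Using the parameter definitions above and in [1], the sketch below (mine, not the author's) computes the size of a maximal well-balanced dragonfly for a given number of global links per router h: with a = 2h routers per group, p = h nodes per router, and one global link between every pair of groups, there are g = a*h + 1 groups. For h = 8 this already gives over 16,000 end nodes, which is why the topology scales so far with modest router radix.

```python
# Maximal well-balanced dragonfly size for a given h (global links per router):
# a = 2h routers per group, p = h nodes per router, g = a*h + 1 groups.

def dragonfly_size(h: int):
    """Routers, groups and end nodes of a maximal well-balanced dragonfly."""
    a = 2 * h            # routers per group
    p = h                # nodes per router
    g = a * h + 1        # groups: complete graph over the global links
    routers = a * g
    nodes = p * routers
    return {"a": a, "p": p, "h": h, "groups": g, "routers": routers, "nodes": nodes}


if __name__ == "__main__":
    for h in (4, 8, 16):
        print(dragonfly_size(h))
```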

  22. 3. Dragonfly Interconnection Network: Minimal routing • Longest path of 3 hops: local - global - local • Good performance under uniform (UN) traffic. Adversarial traffic [1] • ADV+N: nodes in group i send traffic to group i+N • Saturates the global link between the two groups. [Diagram: source group i, destination group i+N, saturated global link.] 3. Dragonfly Interconnection Network: Valiant routing [2] • Randomly selects an intermediate group to misroute packets • Avoids the saturated channel • Longest path of 5 hops: local - global - local - global - local. [Diagram: source node, intermediate group, destination node.] [1] J. Kim, W. Dally, S. Scott, and D. Abts, “Technology-driven, highly-scalable dragonfly topology,” ISCA ’08. [2] L. Valiant, “A scheme for fast parallel communication,” SIAM Journal on Computing, vol. 11, p. 350, 1982.
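
The contrast between minimal and Valiant routing under ADV+N traffic can be shown with a tiny group-level simulation. This is my own sketch: it models one global link per group pair and ignores the local hops, which is a simplification of the real topology. Under minimal routing, all flows from group i cross the single global link to group i+N; Valiant routing spreads the same flows over randomly chosen intermediate groups.

```python
# Group-level load of global links under ADV+N traffic,
# comparing minimal routing with Valiant (random intermediate group) routing.

import random
from collections import Counter


def adversarial_load(num_groups: int, flows_per_group: int, valiant: bool, shift: int = 1):
    """Traffic per inter-group (global) link; one link per unordered group pair."""
    load = Counter()
    for src in range(num_groups):
        dst = (src + shift) % num_groups
        for _ in range(flows_per_group):
            if valiant:
                mid = random.choice([g for g in range(num_groups) if g not in (src, dst)])
                load[frozenset((src, mid))] += 1   # src group -> intermediate group
                load[frozenset((mid, dst))] += 1   # intermediate group -> dst group
            else:
                load[frozenset((src, dst))] += 1   # minimal: straight to the target group
    return load


if __name__ == "__main__":
    random.seed(0)
    minimal = adversarial_load(num_groups=9, flows_per_group=100, valiant=False)
    valiant = adversarial_load(num_groups=9, flows_per_group=100, valiant=True)
    # Minimal: every one of a group's 100 flows shares a single global link.
    print("minimal routing, max global-link load:", max(minimal.values()))
    # Valiant: the same flows are spread over many intermediate groups.
    print("valiant routing, max global-link load:", max(valiant.values()))
```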

  23. 3. Cray Cascade, electrical supernode 45 3. Cray Cascade, system and routing 46
