Center for Information Services and High Performance Computing (ZIH)
Introduction to High Performance Computing at ZIH
Architecture of the PC Farm (Deimos)
Zellescher Weg 12, Trefftz-Bau/HRSK 151, Phone +49 351 - 463 - 39871
Guido Juckeland (guido.juckeland@tu-dresden.de)
Agenda
PC Farm Components
AMD Opteron Processors and Systems
Infiniband Networks
Slide 2 - Guido Juckeland
PC Farm Components (Deimos) Slide 3 - Guido Juckeland
Linux Networx PC-Farm (Deimos)
1292 AMD Opteron x85 dual-core CPUs (2.6 GHz)
726 compute nodes with 2, 4 or 8 CPU cores
2 GiByte main memory per core
2 Infiniband interconnects (MPI and I/O fabric)
68 TByte SAN storage
70, 150 or 290 GByte scratch disk per node
OS: SuSE SLES 10
Batch system: LSF
Compilers: Pathscale, PGI, Intel, GNU
3rd-party applications: Ansys100, CFX, Fluent, Gaussian, LS-DYNA, Matlab, MSC, …
Slide 4 - Guido Juckeland
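The figures on this slide can be combined into rough system totals. The following is a minimal sketch, not an official specification: it counts every CPU listed on the slide, and the value of 2 double-precision floating-point operations per core and cycle is an assumption that does not appear on the slide.

```python
# Figures taken from the slide.
CPUS = 1292              # dual-core AMD Opteron x85 CPUs
CORES_PER_CPU = 2
CLOCK_GHZ = 2.6
MEM_PER_CORE_GIB = 2

# Assumption (not on the slide): 2 double-precision flops per core per cycle.
FLOPS_PER_CYCLE = 2


def cluster_totals(cpus=CPUS):
    """Return (cores, total memory in GiByte, theoretical peak in TFLOP/s)."""
    cores = cpus * CORES_PER_CPU
    memory_gib = cores * MEM_PER_CORE_GIB
    # GHz * flops/cycle gives GFLOP/s per core; divide by 1000 for TFLOP/s.
    peak_tflops = cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000
    return cores, memory_gib, peak_tflops


if __name__ == "__main__":
    cores, memory_gib, peak = cluster_totals()
    print(f"{cores} cores, {memory_gib} GiByte memory, ~{peak:.1f} TFLOP/s peak")
```

The peak value is a theoretical upper bound under the stated assumption; real applications reach only a fraction of it.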
Deimos - Partitions
2 master nodes
– Not accessible to users, PC farm management
4 login nodes
– 4-core nodes
– Accessible via DNS round robin under deimos.hrsk.tu-dresden.de
Single, dual and quad nodes
– 1, 2 or 4 CPUs
– 4, 8 or 16 GiByte main memory (24 quads with 32 GiByte)
– 80, 160 or 300 GByte local disks
Set up as phase 1 and phase 2 nodes
– Identical hardware
– Differences in the connection to the MPI and the I/O fabric (later)
Slide 5 - Guido Juckeland
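The node types can be written down as a small table to check that they match the 2 GiByte per core stated on slide 4. This is an illustrative sketch; the class and field names are hypothetical, and only the numbers come from the slides.

```python
from dataclasses import dataclass

CORES_PER_CPU = 2  # dual-core Opterons (slide 4)


@dataclass(frozen=True)
class NodeType:
    """One Deimos node type; names are illustrative, numbers are from the slide."""
    name: str
    cpus: int
    memory_gib: int

    @property
    def cores(self):
        return self.cpus * CORES_PER_CPU

    @property
    def memory_per_core_gib(self):
        return self.memory_gib / self.cores


NODE_TYPES = [
    NodeType("single", cpus=1, memory_gib=4),
    NodeType("dual", cpus=2, memory_gib=8),
    NodeType("quad", cpus=4, memory_gib=16),
    NodeType("quad (large memory)", cpus=4, memory_gib=32),  # the 24 special quads
]

if __name__ == "__main__":
    for node in NODE_TYPES:
        print(f"{node.name}: {node.cores} cores, "
              f"{node.memory_per_core_gib:g} GiByte per core")
```

Only the 24 large-memory quad nodes deviate from 2 GiByte per core, which matters when requesting memory-intensive jobs.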
AMD Opteron Processors and Systems Slide 6 - Guido Juckeland
AMD Opteron CPU - Design
AMD Opteron x85 (2.6 GHz)
On-chip memory controller (2 memory channels with 3.2 GiByte/s transfer bandwidth each)
Per core
– 64 KiByte each of level 1 instruction and data cache
– 1 MiByte level 2 cache
64-bit extension of the IA-32 (x86) architecture (x86-64, x64 or EM64T)
Now also available as quad-core CPUs
Slide 7 - Guido Juckeland
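The memory channel figures translate into a theoretical peak memory bandwidth per CPU and per node. The following sketch only multiplies the numbers from the slide; the function names are hypothetical, and the node-level value assumes that every CPU streams from its own locally attached memory.

```python
# Figures taken from the slide.
CHANNELS_PER_CPU = 2
BANDWIDTH_PER_CHANNEL = 3.2   # GiByte/s
CORES_PER_CPU = 2


def peak_memory_bandwidth(cpus):
    """Theoretical peak memory bandwidth in GiByte/s for a node with `cpus` CPUs.

    Assumes each CPU only accesses the memory attached to its own controller.
    """
    return cpus * CHANNELS_PER_CPU * BANDWIDTH_PER_CHANNEL


def peak_bandwidth_per_core(cpus):
    """Bandwidth share per core when all cores of the node stream at once."""
    return peak_memory_bandwidth(cpus) / (cpus * CORES_PER_CPU)


if __name__ == "__main__":
    for cpus in (1, 2, 4):
        print(f"{cpus} CPU(s): {peak_memory_bandwidth(cpus):.1f} GiByte/s per node, "
              f"{peak_bandwidth_per_core(cpus):.1f} GiByte/s per core")
```

Because the memory controller sits on the CPU, the per-core share stays the same for single, dual and quad nodes as long as data is placed in local memory.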
AMD Opteron – Block diagram
[Figure: block diagram of an Opteron core. Labelled components: Level 1 instruction cache with TLB, branch prediction (2k branch targets, 16k history counter, RAS & target address), fetch, three decode pipelines (Decode 1, Decode 2, Pack, Decode), three 8-entry schedulers feeding ALU/AGU pairs, one 36-entry scheduler feeding FADD, FMUL and FMISC, Level 1 data cache with ECC and TLB, Level 2 cache with tags and ECC, System Request Queue (SRQ), Cross Bar (XBAR), memory controller and HyperTransport]
Slide 8 - Guido Juckeland
Deimos – Layout of a single-CPU node
[Figure: one AMD Opteron 185 with 4 GiByte of memory attached; peripheral devices (Infiniband, Ethernet, disk) connected via HyperTransport]
Slide 9 - Guido Juckeland
Deimos – Layout of a dual-CPU node
[Figure: two AMD Opteron 285, each with 4 GiByte of memory attached, linked to each other via HyperTransport; peripheral devices (Infiniband, Ethernet, disk) connected via HyperTransport]
Slide 10 - Guido Juckeland
Deimos - Layout of a quad-CPU node
[Figure: four AMD Opteron 885, each with 4 GiByte of memory attached, linked to each other via HyperTransport; peripheral devices (Infiniband, Ethernet, disk) connected via HyperTransport]
Slide 11 - Guido Juckeland
Infiniband Networks Slide 12 - Guido Juckeland
Basic Layout Slide 13 - Guido Juckeland
More complicated structures Slide 14 - Guido Juckeland
Infiniband-Stack Slide 15 - Guido Juckeland
Consequences for the user
No standard Linux network interfaces (eth0, ...)
No IP addresses
No direct traffic monitoring possible
Very low MPI latency (about 5-15 µs)
High MPI bandwidth (up to 900 MiByte/s)
The batch system does not know about the state of the Infiniband fabric
Slide 16 - Guido Juckeland
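To get a feeling for what the latency and bandwidth figures mean for different message sizes, a simple linear model (time = latency + size / bandwidth) can be used. This model is not from the slides and ignores protocol switches and contention; the function names are hypothetical, and the defaults take the lower latency bound and the upper bandwidth bound quoted above.

```python
LATENCY_S = 5e-6                  # lower end of the 5-15 µs range
BANDWIDTH_BPS = 900 * 2**20       # 900 MiByte/s in bytes per second


def transfer_time(nbytes, latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Estimated time in seconds to send one message of `nbytes` bytes."""
    return latency + nbytes / bandwidth


def effective_bandwidth(nbytes, latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Bandwidth in bytes per second actually achieved for one message."""
    return nbytes / transfer_time(nbytes, latency, bandwidth)


def half_bandwidth_size(latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Message size in bytes at which half of the peak bandwidth is reached."""
    return latency * bandwidth


if __name__ == "__main__":
    for size in (8, 1024, 64 * 1024, 4 * 2**20):
        mib_per_s = effective_bandwidth(size) / 2**20
        print(f"{size:>8} bytes: {transfer_time(size) * 1e6:8.1f} µs, "
              f"{mib_per_s:6.1f} MiByte/s")
```

Under this model small messages are dominated by latency, so aggregating many small messages into fewer large ones usually pays off.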
Deimos Infiniband-Layout (rough sketch)
[Figure: every node is attached to two separate fabrics, the MPI network and the I/O network]
Slide 17 - Guido Juckeland
Deimos MPI-Fabric
3 288-port Voltaire ISR 9288 IB switches with 4x Infiniband ports

+-------------------+       +--------------------+       +-------------------+
|     Switch 1      |       |      Switch 2      |       |     Switch 3      |
|                   |  30x  |                    |  30x  |                   |
|      Rack 05      |-------|      Rack 20       |-------|      Rack 25      |
|                   |       |                    |       |                   |
| all Phase1 Nodes  |       | Phase2 Duals+Quads |       |  Phase 2 Singles  |
+-------------------+       +--------------------+       +-------------------+

Slide 18 - Guido Juckeland
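The chain of three switches has two practical consequences: inter-switch links take ports away from the nodes, and traffic between Switch 1 and Switch 3 has to pass through Switch 2. The sketch below illustrates both. It assumes that each of the 30 links in a bundle occupies exactly one port on either side and that no other ports are reserved, and it counts switch chassis only, not any internal switching stages.

```python
PORTS_PER_SWITCH = 288
INTER_SWITCH_LINKS = 30            # "30x" between neighbouring switches
SWITCH_ORDER = ["Switch 1", "Switch 2", "Switch 3"]   # linear chain


def node_ports(switch):
    """Ports left for nodes after subtracting the inter-switch links."""
    idx = SWITCH_ORDER.index(switch)
    neighbours = (idx > 0) + (idx < len(SWITCH_ORDER) - 1)
    return PORTS_PER_SWITCH - neighbours * INTER_SWITCH_LINKS


def switches_traversed(src, dst):
    """Number of switch chassis a message crosses between two switches."""
    return abs(SWITCH_ORDER.index(src) - SWITCH_ORDER.index(dst)) + 1


if __name__ == "__main__":
    for switch in SWITCH_ORDER:
        print(f"{switch}: {node_ports(switch)} ports available for nodes")
    print("Phase 1 node to Phase 2 single:",
          switches_traversed("Switch 1", "Switch 3"), "switches")
```

Jobs whose processes all sit on nodes attached to the same switch avoid the shared 30-link bundles altogether.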
Deimos I/O Fabric
Tree structure with
– 1 192-port Voltaire ISR 9288 IB switch with 4x Infiniband ports (Rack 07)
– 36 24-port Mellanox IB switches (4x), passive
[Figure: the 24-port Mellanox switches of phase 1 and phase 2 attach to the Voltaire core switch]
Slide 19 - Guido Juckeland