HPC and I/O Subsystems
Ratan K. Guha
School of Electrical Engineering and Computer Science
University of Central Florida, Orlando, FL 32816

Overview
- My experience and current projects
- Top 10 supercomputers
- Cluster computers
- Node-to-node and I/O communication
- Current and future trends
My Experience
- 1990-1997: BBN Butterfly, DEC MPP (NSF grant)
- 2004: Sun cluster (ARO grant W911NF04110100)

Cluster Computing Facilities: Ariel
- Nodes: 48 (Sun Fire V20z)
- CPU: dual AMD Opteron 242 1.6 GHz processors
- Memory: 2 GB
- Network: Gigabit Ethernet
- Disk: 2 x 36 GB internal
- OS: SunOS 5.9
Current Projects
- Composite cathodes for Intermediate Temperature SOFCs: A comprehensive approach to designing materials for superior functionality. PIs: N. Orlovskaya, A. Sleiti, J. Kapat (MMAE), A. Masunov (NSTC), R. Guha (CS). [CPMD, Fluent] NASA grant.
- VCluster: A Thread-Based Java Middleware for SMP and Heterogeneous Clusters (Ph.D. dissertation work)
- Parallel simulation: ARO grants DAAD19-01-1-0502, W911NF04110100

Top 5 Supercomputers
1. DOE/NNSA/LLNL, USA: BlueGene/L - eServer Blue Gene Solution, IBM
2. NNSA/Sandia National Laboratories, USA: Red Storm - Sandia/Cray Red Storm, Opteron 2.4 GHz dual-core, Cray Inc.
3. IBM Thomas J. Watson Research Center, USA: BGW - eServer Blue Gene Solution, IBM
4. DOE/NNSA/LLNL, USA: ASC Purple - eServer pSeries p5 575 1.9 GHz, IBM
5. Barcelona Supercomputing Center, Spain: MareNostrum - BladeCenter JS21 Cluster, PPC 970 2.3 GHz, Myrinet, IBM
Top 6-10 Supercomputers
6. NNSA/Sandia National Laboratories, USA: Thunderbird - PowerEdge 1850, 3.6 GHz, Infiniband, Dell
7. Commissariat a l'Energie Atomique (CEA), France: Tera-10 - NovaScale 5160, Itanium2 1.6 GHz, Quadrics, Bull SA
8. NASA/Ames Research Center/NAS, USA: Columbia - SGI Altix 1.5 GHz, Voltaire Infiniband, SGI
9. GSIC Center, Tokyo Institute of Technology, Japan: TSUBAME Grid Cluster - Sun Fire x4600 Cluster, Opteron 2.4/2.6 GHz and ClearSpeed Accelerator, Infiniband, NEC/Sun
10. Oak Ridge National Laboratory, USA: Jaguar - Cray XT3, 2.6 GHz dual-core, Cray Inc.

Some Statistics (Top 500 processor families)
- 261 Intel processors
- 113 AMD Opteron family
- 93 IBM Power processors
Cluster Computing
- Became popular with the availability of:
  - High-performance microprocessors
  - High-speed networks
  - Distributed computing tools
- Provides performance comparable to supercomputers at a much lower price

Cluster Computing
To run a parallel program on a cluster:
- Processes must be created on every machine in the cluster
- Processes must be able to communicate with each other (a minimal MPI sketch follows)
[Diagram: Process 1 through Process 4 on separate machines, connected by Ethernet]
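The pattern on this slide, one process per machine plus message passing between them, is what MPI provides in practice. Below is a minimal, hypothetical C sketch (not from the slides) that passes a token around a ring of processes; it assumes an MPI implementation such as Open MPI or MPICH is installed on every node, and the file name `ring.c` is illustrative.

```c
/* ring.c - a minimal sketch of one process per cluster node exchanging
 * messages with MPI. Hypothetical example; assumes an MPI library such
 * as Open MPI or MPICH is installed on all nodes of the cluster. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, token = 42;

    MPI_Init(&argc, &argv);               /* start-up: one process per node */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's id (0..size-1)  */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes      */

    if (size < 2) {
        printf("Only one process; nothing to communicate with.\n");
    } else if (rank == 0) {
        /* rank 0 starts the token around the ring and waits for its return */
        MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Token returned to rank 0 after visiting %d processes\n", size);
    } else {
        /* every other rank receives from its left neighbour, passes right */
        MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```

Launched with something like `mpirun -np 4 -hostfile hosts ./ring`, the runtime creates one process on each listed machine and the ring exchange travels over the cluster interconnect (Ethernet in the diagram above).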
Communications
- Fiber Channel
- Gigabit Ethernet
- Myrinet
- InfiniBand

Myrinet
- Designed by Myricom
- High-speed LAN used to interconnect machines
- Lightweight protocol (2 Gb/s)
- Low latency for short messages
- Sustained data rate for large messages
Gigabit Ethernet
- Standardized by IEEE 802.3
- Data rates of a gigabit per second
- Deployed in high-capacity backbone network links
- High end-to-end throughput and less expensive than Myrinet (a rough throughput probe is sketched after the Fiber Channel slide)
- Four physical-layer standards, using optical fiber, twisted-pair cable, or balanced copper cable

Fiber Channel
- Gigabit-speed network technology used for storage networking
- Runs on both twisted-pair copper and optical fiber
- Reliable and scalable
- 4 Gb/s bandwidth
- Supports many topologies and protocols
- Efficient
- Cons:
  - Although initially used for supercomputing, it is now more popular in storage markets
  - Growing standard definitions are increasing the complexity of the protocol
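One crude way to see the end-to-end throughput a Gigabit Ethernet link actually delivers is to stream data between two cluster nodes and time it. The hypothetical C probe below (not from the slides) sends 1 GiB over TCP to a sink on a remote host; it assumes a receiver such as `nc -l -p 5000 > /dev/null` is already running there, and the reported rate is approximate because the last buffers may still be in flight in the kernel when timing stops.

```c
/* net_probe.c - hypothetical throughput probe for a cluster link.
 * Streams 1 GiB of dummy data over TCP and reports the achieved rate.
 * Assumes a sink (e.g. `nc -l -p 5000 > /dev/null`) on the remote node. */
#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <host> <port>\n", argv[0]);
        return 1;
    }

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port   = htons((unsigned short)atoi(argv[2]));
    if (inet_pton(AF_INET, argv[1], &addr.sin_addr) != 1) {
        fprintf(stderr, "bad IPv4 address: %s\n", argv[1]);
        return 1;
    }
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("connect");
        return 1;
    }

    char buf[1 << 16];                  /* 64 KiB send buffer */
    memset(buf, 'x', sizeof buf);
    const long long total = 1LL << 30;  /* 1 GiB of payload   */
    long long sent = 0;

    struct timeval t0, t1;
    gettimeofday(&t0, NULL);
    while (sent < total) {
        ssize_t n = write(fd, buf, sizeof buf);
        if (n <= 0) { perror("write"); return 1; }
        sent += n;
    }
    gettimeofday(&t1, NULL);
    close(fd);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("%.2f Gb/s\n", (sent * 8.0) / secs / 1e9);  /* bits per second */
    return 0;
}
```

On a well-tuned Gigabit Ethernet link such a probe typically reports somewhat under the 1 Gb/s line rate, which is the kind of end-to-end figure the comparison with Myrinet above refers to.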
InfiniBand (IB)
- High-performance, low-latency I/O interconnect architecture for channel-based, switched-fabric servers
- Replacement for the shared PCI bus
- First version released in October 2000 by the InfiniBand Trade Association (IBTA), formed by Compaq, Dell, HP, IBM, Intel, Microsoft, and Sun; the IBTA is responsible for compliance and interoperability testing of commercial products
- June 2001: version 1.0a released

Why is it different?
- Unlike present I/O subsystems, IB is a network
- Uses IPv6 with its 128-bit addresses
- IB's revolutionary approach: instead of sending data in parallel across the backplane bus (data path), IB uses a serial (bit-at-a-time) bus
  - Fewer pins save cost and add reliability
  - A serial bus can multiplex a signal
- Supports multiple memory areas, which can be accessed by both processors and storage devices
Advantages of InfiniBand
- High performance
  - 20 Gb/s node-to-node
  - 60 Gb/s switch-to-switch
  - IB has a defined roadmap to 120 Gb/s (the fastest specification for any interconnect)
- Reduced complexity
  - Multiple I/Os on one cable
  - Consolidates clustering, communications, storage, and management traffic over a single connection

Advantages (contd.)
- Efficient interconnect
  - Communication processing is done in hardware rather than the CPU, leaving full resources available at each node
  - Employs Remote DMA (RDMA), an efficient data-transfer protocol (see the libibverbs sketch below)
- Reliability, stability, scalability
  - Reliable end-to-end data connections
  - Virtualization allows multiple applications to run on the same interconnect
  - IB fabrics have multiple paths, so a fault is limited to a single link
  - Can support tens of thousands of nodes in a single subnet
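As a concrete, though hypothetical, illustration of how software sees an IB fabric, the C sketch below uses libibverbs, the user-space verbs library through which RDMA-capable IB adapters are accessed, to enumerate the adapters on a node and print a few of their limits. The file name and build command are assumptions; a real RDMA transfer would additionally need queue pairs, registered memory, and connection setup, which are omitted here.

```c
/* ib_probe.c - minimal sketch: list InfiniBand devices via libibverbs
 * and print a few capability limits. Enumeration only; no RDMA traffic. */
#include <infiniband/verbs.h>
#include <stdio.h>

int main(void)
{
    int num_devices = 0;
    struct ibv_device **devices = ibv_get_device_list(&num_devices);
    if (!devices || num_devices == 0) {
        fprintf(stderr, "No InfiniBand devices found\n");
        return 1;
    }

    for (int i = 0; i < num_devices; i++) {
        struct ibv_context *ctx = ibv_open_device(devices[i]);
        if (!ctx)
            continue;

        struct ibv_device_attr attr;
        if (ibv_query_device(ctx, &attr) == 0)
            printf("%s: %d physical port(s), max %d queue pairs, max %d memory regions\n",
                   ibv_get_device_name(devices[i]),
                   attr.phys_port_cnt, attr.max_qp, attr.max_mr);

        ibv_close_device(ctx);
    }

    ibv_free_device_list(devices);
    return 0;
}
```

On a node with the verbs development headers installed this could be built with something like `gcc ib_probe.c -o ib_probe -libverbs`; the queue-pair and memory-region limits it prints are the building blocks the RDMA data path is constructed from.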
Integrating into a Data Center
- Connecting Fiber Channel storage fabrics to an IB infrastructure:
  - Bridges
    - Somewhat costly
    - Create a bottleneck that gates Fiber Channel access to speeds below what the array can typically deliver
  - Native interconnects
    - More cost-effective, easier-to-manage solution
    - Integrate the arrays directly into the IB fabric

InfiniBand and Gigabit Ethernet?
- IB is complementary to Gigabit Ethernet and Fiber Channel; the cost of Fiber Channel is quite high
- Gigabit Ethernet and Fiber Channel are expected to connect into the IB fabric to access IB-enabled compute resources
- Helps IT managers better balance I/O and processing resources within an IB fabric
- Allows applications to use IB's RDMA to fetch data, compute, and put intermediate results, which is good for HPC
Current and Future Trends
- HPC and I/O subsystem communication will become faster and easier to manage
- HPC scientific applications will continue to grow
- New multidisciplinary work will increase
- Financial businesses will use HPC systems
References
- http://www.infinibandta.org
- http://www.cray.com
- http://www.myri.com
- http://www.fibrechannel.org/
- Jens Mache, "An Assessment of Gigabit Ethernet as Cluster Interconnect," IWCC, p. 36, 1999.
- http://www.supercomp.org/sc2002/paperpdfs/pap.pap207.pdf
- http://compnetworking.about.com/cs/clustering/g/bldef_infiniban.htm
- Dave Ellis, "InfiniBand Today," http://www.wwpi.com/index.php?option=com_content&task=view&id=1163&Itemid=44
- http://www.mellanox.com/pdf/presentations/Top500_Nov_06.pdf
- http://www.mellanox.com/applications/top_500.php
- "Fiber Channel vs. InfiniBand vs. Ethernet," http://www.processor.com/editorial/article.asp?article=articles%2Fp2911%2F31p11%2F31p11%2Easp&guid=934C81176D3D40969DF5ABA3E28DC8CF&searchtype=&WordList=&bJumpTo=True