Evaluation of the HPC Challenge Benchmarks in Virtualized Environments
Vince Weaver
ICL Lunch Talk
8 July 2011
VHPC’11 Paper
6th Workshop on Virtualization in High-Performance Cloud Computing
Piotr Luszczek, Eric Meek, Shirley Moore, Dan Terpstra, Vince Weaver, Jack Dongarra
Traditional HPC
[diagram: traditional HPC setup]
Cloud-based HPC
[diagram: cloud-based HPC setup]
Cloud Tradeoffs
Pros:
• No AC bill
• No electricity bill
• No need to spend $$$ on infrastructure
Cons:
• Unexpected outages
• Data held hostage
• Infrastructure not designed for HPC
Measuring Performance in the Cloud
First let’s just measure runtime. This is difficult because in virtualized environments Time Loses All Meaning.
Simplified Model of Time Measurement
[diagram: time measured across the Application, Operating System, and Hardware layers]
Then the VM gets involved
[diagram: a VM Layer inserted between the Operating System and the Hardware]
Then you have multiple VMs
[diagram: multiple guest OSes (OS1, OS2) multiplexed on the VM Layer and Hardware; where the application’s time actually goes is unclear]
So What Can We Do?
Hope we have exclusive access and measure wall-clock time.
Measuring Time Externally
• Ideally have local hardware access, root, and hooks into the VM system
• Otherwise, you can sit there with a watch
• Danciu et al. send UDP packet to remote server (a rough sketch of the idea follows)
• Most of these are not possible in a true “cloud” setup
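The UDP-probe idea can be sketched roughly as follows (a loose illustration only, not Danciu et al.’s actual protocol; the server address, port, and reply format are made up): the guest marks an event by sending a packet to a machine outside the VM and records the timestamp that machine reports back.

    /* Rough sketch of an external UDP time probe (hypothetical server at
     * 192.0.2.1:9999 that replies with its own timestamp). */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in srv = {0};
        srv.sin_family = AF_INET;
        srv.sin_port   = htons(9999);                    /* assumed probe port   */
        inet_pton(AF_INET, "192.0.2.1", &srv.sin_addr);  /* assumed server addr  */

        char msg[] = "start";                            /* marks the event      */
        sendto(fd, msg, sizeof(msg), 0, (struct sockaddr *)&srv, sizeof(srv));

        char reply[64] = {0};                            /* server's timestamp   */
        recvfrom(fd, reply, sizeof(reply) - 1, 0, NULL, NULL);
        printf("external timestamp: %s\n", reply);

        close(fd);
        return 0;
    }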
Measuring Time From Within Guest
• Use gettimeofday() or clock_gettime() (example below)
• This might be the only interface we have
• How bad can it be?
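As a concrete illustration, a minimal sketch of timing a region of code from inside the guest with both interfaces; the busy loop is just a placeholder workload, not one of the benchmarks discussed here.

    /* Minimal sketch: timing a code region from inside the guest with
     * gettimeofday() and clock_gettime(). */
    #include <stdio.h>
    #include <sys/time.h>
    #include <time.h>

    int main(void)
    {
        struct timeval  tv_start, tv_end;
        struct timespec ts_start, ts_end;

        gettimeofday(&tv_start, NULL);
        clock_gettime(CLOCK_MONOTONIC, &ts_start);

        volatile double x = 0.0;
        for (long i = 0; i < 100000000L; i++) x += 1.0;   /* placeholder work */

        gettimeofday(&tv_end, NULL);
        clock_gettime(CLOCK_MONOTONIC, &ts_end);

        double gtod = (tv_end.tv_sec  - tv_start.tv_sec) +
                      (tv_end.tv_usec - tv_start.tv_usec) / 1e6;
        double cgt  = (ts_end.tv_sec  - ts_start.tv_sec) +
                      (ts_end.tv_nsec - ts_start.tv_nsec) / 1e9;

        printf("gettimeofday:  %f s\n", gtod);
        printf("clock_gettime: %f s\n", cgt);
        return 0;
    }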
Our Experimental Setup
• 8-core Core i7 machine (dual 4-core 2.93GHz Xeon X5570)
• VMware Player 3.1.4, VirtualBox 4.0.8, KVM 2.6.35
• HPC Challenge Benchmarks, Open MPI
• Time measured by MPI_Wtime(), which internally invokes gettimeofday() (usage sketch below)
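For reference, a minimal sketch of how a benchmark region might be timed with MPI_Wtime(); the barrier and the dummy computation stand in for the actual HPC Challenge kernels.

    /* Minimal sketch of timing with MPI_Wtime(). */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        MPI_Barrier(MPI_COMM_WORLD);           /* line ranks up before timing */
        double start = MPI_Wtime();

        volatile double x = 0.0;
        for (long i = 0; i < 10000000L; i++) x += 1.0;   /* placeholder work */

        double end = MPI_Wtime();

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0)
            printf("elapsed: %f s (timer resolution %g s)\n",
                   end - start, MPI_Wtick());

        MPI_Finalize();
        return 0;
    }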
Accuracy Drift
• Typical development model is to re-run an app over and over with slight changes while monitoring performance
• In a virtualized environment, factors inherent in the virtualization might change runtime from run to run more than any optimization tuning does
Ascending vs Descending – HPL
Bare metal showed no difference.
[plot: percentage difference vs. matrix size (0–18000) for VMware Player, VirtualBox, and KVM]
Performance Results
We use a relative metric, defined as:
    (performance_VM / performance_bare-metal) × 100%
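For instance (hypothetical numbers, not results from the paper): if a benchmark sustains 40 GFLOP/s inside a VM and 50 GFLOP/s on bare metal, the relative metric is 40 / 50 × 100% = 80% of bare-metal performance.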
HPL – Low OS/Communication Overhead
[plot: % of bare-metal performance vs. problem size (0–15000) for VMware Player, VirtualBox, and KVM]
MPIRandomAccess – High OS/Communication Overhead
[plot: % of bare-metal performance vs. log of problem size (18–28) for VMware Player, VirtualBox, and KVM]
Conclusion
• Virtualization exacerbates the existing problem of accurate performance measurement
• Different workloads can stress the VM layer in drastically different ways
• Extra care needs to be taken to generate repeatable results
Future Work
• Validate internal time measurements with external ones
• More analysis of sources of VM overhead
• Performance of larger systems with off-node network activity
Future Work – PAPI-V
• “Improved” timer support. Direct wall-clock access?
• Virtualized performance counters
• Components for the virtualized hardware: network interfaces, etc.
Questions?
vweaver1@eecs.utk.edu