
Implications of Cache Asymmetry on Server Consolidation Performance



  1. Implications of Cache Asymmetry on Server Consolidation Performance
     Presenter: Omesh Tickoo
     Padma Apparao, Ravi Iyer, Don Newell
     Hardware Architecture Lab, Intel Corporation
     IISWC 2008

  2. Outline
     • Server Consolidation
     • Asymmetric Caches
     • Performance Implications
     • Measurement-Based Analysis
     • Conclusions / Future Work

  3. Server Consolidation
     • Motivation
       – Virtualization and consolidation are a growing trend in datacenters
       – Majority of servers expected to run consolidated workloads within a few years
     [Figure: a server running a single workload on a single OS, beside a server running Workloads 1-3 in separate guest OSes on a VMM or hypervisor]
     • Problem
       – Performance analysis of consolidation scenarios is challenging
         • Different virtualization overheads depending on VMM & platform virtualization support
         • Resource contention (core, cache, memory, etc.) between VMs affects performance
     • Focus
       – Server consolidation performance as a function of cache contention & asymmetry

  4. Why study asymmetry?
     • CMP platforms today have symmetric caches
       – But space in cache is asymmetrically allocated depending on demand from virtual machines => Virtual Asymmetry
     • Future CMP platforms may have asymmetric caches
       – Asymmetry to reduce cache space domination of die area
       – Asymmetry due to process variability / faults => Physical Asymmetry

  5. Cache Asymmetry
     [Figure: four panels, each showing Task1 ... Taskn running on cores (C) backed by caches]
     (a) Symmetric: private caches of equal size
     (b) Virtually asymmetric: shared caches of equal size
     (c) Physically asymmetric: private caches of different size
     (d) Virtually & physically asymmetric: shared caches of different size
     What are the implications on server consolidation performance?

  6. Consolidation Benchmark
     • vConsolidate: 5 VMs
       – SPECjbb VM
       – Sysbench VM
       – Webbench VM
       – MailServer VM
       – Idle VM

     VM/Workload                    Vcpus   Memory (MB)
     Java/SPECjbb (bops/sec)        2       2056
     Database/Sysbench (Tx/sec)     2       1544
     Web/Webbench (Tx/sec)          2       1544
     Mail/Exchange (hits/sec)       1       1544
     Idle                           1       418
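As a rough illustration, the vConsolidate configuration table on slide 6 can be written down as data and totalled. The `VMConfig` class and its field names are hypothetical, introduced here only for the sketch; the vCPU counts and memory sizes are the ones listed on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMConfig:
    """One row of the slide 6 table (class and field names are illustrative)."""
    workload: str
    metric: str      # throughput unit reported for the workload
    vcpus: int
    memory_mb: int


# Values copied from the slide 6 configuration table.
VCONSOLIDATE = [
    VMConfig("Java/SPECjbb", "bops/sec", 2, 2056),
    VMConfig("Database/Sysbench", "Tx/sec", 2, 1544),
    VMConfig("Web/Webbench", "Tx/sec", 2, 1544),
    VMConfig("Mail/Exchange", "hits/sec", 1, 1544),
    VMConfig("Idle", "", 1, 418),
]

# Aggregate resource demand of the consolidated workload.
total_vcpus = sum(vm.vcpus for vm in VCONSOLIDATE)
total_memory_mb = sum(vm.memory_mb for vm in VCONSOLIDATE)

print(f"{len(VCONSOLIDATE)} VMs, {total_vcpus} vCPUs, {total_memory_mb} MB")
```

The totals only summarize what the guests are configured with; how vCPUs map onto physical cores is up to the VMM.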

  7. Platform Configuration
     • Hardware
       – Intel Xeon 5400 series
       – Quad-core per socket
       – 6MB+6MB cache per socket
       – Used 4MB, 3MB, 2MB cache configs also to create physical asymmetry
     • VMM
       – Xen 3.1
     [Figure: vConsolidate VMs running on Xen 3.1, above four LLCs and memory]

  8. Analyzing Implications
     • Four Key Configurations
       – 1 Virtual Machine
         • On physically symmetric cache
         • On physically asymmetric cache
       – Multi-Virtual Machine
         • On physically symmetric cache
           – But virtually asymmetric
         • On physically asymmetric cache
           – But virtually asymmetric also

  9. 1VM / Symmetric Caches
     [Figure: one virtual machine, with or without cache sharing, on a platform where all LLCs are the same size]
     [Charts: SPECjbb, Webbench, and Sysbench Performance (Symmetric Caches); Thruput, CPI, and MPI at 2MB, 3MB, 4MB, and 6MB, normalized to 2MB]
     • SPECjbb2005 is most sensitive to cache: 50% perf improvement from 2MB to 6MB
     • Sysbench and Webbench show less than 10% improvement
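The charts on slide 9 report each metric normalized to the 2MB configuration. A minimal sketch of that normalization follows, assuming raw per-configuration samples are available as a dictionary. The function name and every raw number below are hypothetical placeholders, not measurements from the talk; the throughput pair is chosen only so that the ratio matches the roughly 50% improvement stated for SPECjbb2005.

```python
def normalize_to_baseline(samples, baseline):
    """Divide every metric by its value in the baseline configuration."""
    base = samples[baseline]
    return {
        config: {name: value / base[name] for name, value in metrics.items()}
        for config, metrics in samples.items()
    }


# PLACEHOLDER raw values (not from the slides), for illustration only.
raw = {
    "2MB": {"thruput": 100.0, "cpi": 2.0, "mpi": 0.010},
    "6MB": {"thruput": 150.0, "cpi": 1.6, "mpi": 0.006},
}

normalized = normalize_to_baseline(raw, baseline="2MB")
for config, metrics in normalized.items():
    print(config, {name: round(value, 2) for name, value in metrics.items()})
```

After normalization the baseline row is 1.0 for every metric, so a throughput bar above 1.0 and CPI or MPI bars below 1.0 all indicate that the larger cache helped.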

  10. Multi-VM / Virtual Asymmetry
      [Figure: consolidated virtual machines (vCon) on a platform where all LLCs are the same size]
      [Charts: SPECjbb Performance with Virtual Cache Asymmetry (6MB) and (4MB); Thruput, CPI, and MPI for JBB alone, JBB+JBB, JBB+Webbench, JBB+Sysbench, and JBB in vCon, normalized to running alone]
      • Consolidation causes ~30% loss in performance
        – Cache interference => 20%
        – Core interference => 9%

  11. 1VM / Physical Asymmetry
      [Figure: an individual virtual machine on a platform where some LLCs are 6M and the others are smaller (4M, 3M or 2M)]
      [Charts: SPECjbb, Webbench, and Sysbench (Physically Asymmetric Caches); Thruput, CPI, and MPI for the 6-6, 6-4, 6-3, and 6-2 configurations, normalized to 6MB-6MB]
      • SPECjbb2005 is affected the most
      • Sysbench and Webbench are not affected much

  12. Multi-VM / Virtual+Physical Asymmetry
      [Figure: consolidated virtual machines (vCon) on a platform where some LLCs are 6M and the others are smaller (4M, 3M or 2M)]
      [Charts: SPECjbb, Sysbench, and WebBench Performance on Virtual+Physical Asymmetry; Thruput, CPI, and MPI for the 6-6, 6-4, 6-3, and 6-2 configurations, normalized to 6MB-6MB]
      • SPECjbb is affected the most (as expected)
      • Sysbench and Webbench are not affected much
      • Opportunity to move Sysbench and Webbench to smaller cache cores
        => can improve performance of SPECjbb?

  13. Inferences
      • Asymmetry-Aware Scheduling
        – Virtual Asymmetry
          • Monitor usage and interference
          • Modify VMM scheduler to take this into account
        – Physical Asymmetry
          • Monitor usage in large and small cores
          • Modify VMM scheduler to affinitize
            – Cache-sensitive VMs to large-cache-cores
            – Cache-insensitive VMs to small-cache-cores
      • Affinitization Experiment
        – Affinitize one vcpu to large core
        – Leave the other vcpu floating

      Workload   Metric   vcpu0 (affinitized to 6MB)   vcpu1 (floating)   % benefit
      JBB        CPI      1.51                         1.80               19%
      JBB        MPI      0.0051                       0.0070             39%
      Sysbench   CPI      2.51                         2.96               18%
      Sysbench   MPI      0.0016                       0.0020             25%
      Webbench   CPI      2.59                         2.88               11%
      Webbench   MPI      0.0023                       0.0026             11%

      Allows for detection of sensitivity for improved scheduling
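The placement idea on slide 13 can be sketched from the affinitization table alone. Everything below is an illustration under stated assumptions, not the authors' scheduler. The slide does not give the formula behind "% benefit"; the sketch assumes it is the relative increase from the affinitized vcpu to the floating vcpu, which reproduces the three CPI rows after rounding but differs by a few points on two of the MPI rows when computed from the rounded table values. The function names and the `large_cache_slots` parameter are hypothetical.

```python
# (affinitized vcpu0, floating vcpu1) pairs copied from the slide 13 table.
MEASURED = {
    "SPECjbb": {"CPI": (1.51, 1.80), "MPI": (0.0051, 0.0070)},
    "Sysbench": {"CPI": (2.51, 2.96), "MPI": (0.0016, 0.0020)},
    "Webbench": {"CPI": (2.59, 2.88), "MPI": (0.0023, 0.0026)},
}


def percent_benefit(affinitized, floating):
    """ASSUMED definition: how much worse the floating vcpu is, in percent."""
    return 100.0 * (floating - affinitized) / affinitized


def rank_by_cache_sensitivity(measured, metric="CPI"):
    """Order workloads from most to least helped by the large cache."""
    return sorted(
        measured,
        key=lambda vm: percent_benefit(*measured[vm][metric]),
        reverse=True,
    )


def assign_cores(ranked_vms, large_cache_slots):
    """Give the most cache-sensitive VMs the large-cache cores (hypothetical policy)."""
    return {
        vm: "large-cache" if index < large_cache_slots else "small-cache"
        for index, vm in enumerate(ranked_vms)
    }


ranking = rank_by_cache_sensitivity(MEASURED)
placement = assign_cores(ranking, large_cache_slots=1)

for vm in ranking:
    benefit = percent_benefit(*MEASURED[vm]["CPI"])
    print(f"{vm}: CPI benefit {benefit:.1f}% -> {placement[vm]} cores")
```

With a single large-cache slot, this places SPECjbb on the large-cache cores and Sysbench and Webbench on the small-cache cores, in line with the opportunity noted on slide 12. A real VMM scheduler would have to gather these counters online rather than from a fixed table.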

  14. Summary
      • Presented cache asymmetry
        – Symmetric
        – Virtual Asymmetry
        – Physical Asymmetry
        – Virtual + Physical Asymmetry
      • Studied the implications of cache asymmetry on a consolidation workload
        – Using vConsolidate & asymmetric CMP platform
      • Showed cache contention overheads and overall cache sensitivity
      • Discussed the potential for asymmetry-aware scheduling
