
Accelerators in Technical Computing: Is it Worth the Pain? A TCO Perspective
Sandra Wienke, Dieter an Mey, Matthias S. Müller
Center for Computing and Communication
JARA – High-Performance Computing
RWTH Aachen University
Rechen- und Kommunikationszentrum (RZ)

Agenda
- Introduction
- Modeling
  - Total Cost of Ownership (TCO)
  - Comparison Metrics
- Case Study on Accelerators
  - Programming Models & System Types
  - TCO Components @ RWTH
  - Real-World Application
  - Results
- Conclusion & Outlook
TCO of Accelerators 2 Sandra Wienke | Center for Computing and Communication

Introduction
- Today: variety of HPC clusters
  - Usage of accelerators (NVIDIA GPU, Intel Xeon Phi) motivated by promising performance-per-watt ratio
- System comparison by performance or performance per watt is not sufficient for a purchase decision
- Total cost of ownership (TCO)
  - Acquisition costs, housing, operation costs, ...
  - Inclusion of manpower costs (administration & programming)
  - Comparison of costs per program run (application-dependent)
- Investigation of a real-world software package: impact of manpower effort / programming model?
  - OpenMP on Intel Sandy Bridge
  - OpenMP + LEO on Intel Xeon Phi
  - OpenCL, OpenACC on NVIDIA Fermi GPU

Modeling – Total Cost of Ownership (TCO)
- Basis: single compute node, extrapolated to the whole cluster
- Investment $I = \mathrm{TCO}(n, \tau) = C_{ot}(n) + C_{pa}(n) \cdot \tau$
  - $n$: number of nodes
  - $\tau$: system lifetime
- One-time costs $C_{ot}$
  - Per node: HW acquisition, building/infrastructure, OS/env. installation
  - Per node type: OS/env. installation, programming effort
- Annual costs $C_{pa}$
  - Per node: HW maintenance, building/infrastructure, OS/env. maintenance, power consumption
  - Per node type: OS/env. maintenance, compiler/software, application maintenance
- TCO depends on architecture & application
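
The TCO formula above can be sketched in a few lines of Python. This is only an illustration, not the authors' spreadsheet: it assumes that per-node costs scale linearly with the node count while per-node-type costs are paid once, and all names (`NodeTypeCosts`, `tco`) and euro figures are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class NodeTypeCosts:
    """Cost inputs for one node type in EUR (hypothetical placeholder values)."""
    one_time_per_node: float  # HW acquisition, building/infrastructure, OS/env. installation
    one_time_per_type: float  # OS/env. installation, programming effort
    annual_per_node: float    # HW maintenance, building/infrastructure, OS/env. maintenance, power
    annual_per_type: float    # OS/env. maintenance, compiler/software, application maintenance


def one_time_costs(c: NodeTypeCosts, n: int) -> float:
    """C_ot(n): per-node costs scale with n (assumption), per-type costs occur once."""
    return n * c.one_time_per_node + c.one_time_per_type


def annual_costs(c: NodeTypeCosts, n: int) -> float:
    """C_pa(n): same per-node / per-type split for the recurring costs."""
    return n * c.annual_per_node + c.annual_per_type


def tco(c: NodeTypeCosts, n: int, tau_years: float) -> float:
    """TCO(n, tau) = C_ot(n) + C_pa(n) * tau."""
    return one_time_costs(c, n) + annual_costs(c, n) * tau_years


# Made-up numbers, only to exercise the formula.
example = NodeTypeCosts(one_time_per_node=5000.0, one_time_per_type=1000.0,
                        annual_per_node=800.0, annual_per_type=200.0)
print(tco(example, n=10, tau_years=4))
```

Keeping the per-node and per-node-type parts separate mirrors the slide's split and makes it easy to see which costs grow with the cluster size.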

Modeling – Comparison Metrics
- Costs per program run $C_{ppr}$
  - Includes investment/TCO & application performance

  $$C_{ppr}(n, \tau) = \frac{\mathrm{TCO}(n, \tau)}{n_{ex}(\tau) \cdot n} \quad \text{with} \quad n_{ex}(\tau) = \frac{k \cdot \tau}{t_{par}}$$

  - $n$: number of nodes, $\tau$: system lifetime, $n_{ex}$: number of application executions, $k$: system usage rate, $t_{par}$: parallel runtime
- Baseline used for comparison with system X: Intel Sandy Bridge (SNB) + OpenMP (OMP)

  $$\frac{C_{ppr,X}(n_X, \tau) - C_{ppr,OMP}(n_{OMP}, \tau)}{C_{ppr,OMP}(n_{OMP}, \tau)} \begin{cases} < 0 & \text{if X beneficial} \\ \geq 0 & \text{if OMP beneficial} \end{cases}$$

- Break-even investments
  - Min. budget needed so that system X is beneficial over OpenMP on SNB
  - Solve for $I$ with given fixed lifetime $\tau$:

  $$C_{ppr,X}(n_X, \tau) - C_{ppr,OMP}(n_{OMP}, \tau) = 0 \quad \text{with} \quad \mathrm{TCO}(n, \tau) = I$$
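
A minimal Python sketch of these metrics follows. The functions for costs per program run and the relative difference to the baseline transcribe the formulas directly (with the lifetime converted to seconds so that it matches the runtime). The break-even function rests on an additional assumption that is not stated in the slides: TCO is taken to be linear in the node count, TCO = fixed + n * per_node, and the node count may be fractional. All function names and input values are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def executions_per_node(tau_years, usage_rate, t_par_s):
    """n_ex(tau) = k * tau / t_par, with tau converted from years to seconds."""
    return usage_rate * tau_years * SECONDS_PER_YEAR / t_par_s


def costs_per_program_run(tco_eur, n, tau_years, usage_rate, t_par_s):
    """C_ppr(n, tau) = TCO(n, tau) / (n_ex(tau) * n)."""
    return tco_eur / (executions_per_node(tau_years, usage_rate, t_par_s) * n)


def relative_difference(c_ppr_x, c_ppr_omp):
    """(C_ppr,X - C_ppr,OMP) / C_ppr,OMP; a negative value means X is beneficial."""
    return (c_ppr_x - c_ppr_omp) / c_ppr_omp


def break_even_investment(t_x, per_node_x, fixed_x, t_omp, per_node_omp, fixed_omp):
    """Investment I at which C_ppr,X equals C_ppr,OMP.

    Assumption (not from the slides): TCO(n, tau) = fixed + n * per_node, both
    taken over the lifetime, with a fractional node count n = (I - fixed) / per_node.
    Setting the two C_ppr expressions equal then gives a linear equation in I.
    """
    a = t_x * per_node_x
    b = t_omp * per_node_omp
    if a == b:
        raise ValueError("no unique break-even point: both systems scale identically")
    return (b * fixed_x - a * fixed_omp) / (b - a)


# Hypothetical inputs: same budget and lifetime, different node counts and runtimes.
c_omp = costs_per_program_run(250_000.0, n=50, tau_years=4, usage_rate=0.8, t_par_s=200.0)
c_x = costs_per_program_run(250_000.0, n=40, tau_years=4, usage_rate=0.8, t_par_s=150.0)
rel = relative_difference(c_x, c_omp)
print(f"relative C_ppr difference: {rel:+.2%}")

i_be = break_even_investment(t_x=100.0, per_node_x=8000.0, fixed_x=1500.0,
                             t_omp=200.0, per_node_omp=5000.0, fixed_omp=500.0)
print(f"break-even investment: {i_be:.0f} EUR")
```

Under the linear assumption, the usage rate and lifetime cancel out of the break-even condition, which is why `break_even_investment` only needs runtimes and lifetime costs.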

Case Study on Accelerators – Programming Models & System Types

| Programming Model | Accelerator | Host | Compiler |
| Serial | - | 2x Intel Sandy Bridge, 16 cores, 2 GHz | Intel 13.0.1 |
| OpenMP (simple, vectorized) | - | 2x Intel Sandy Bridge, 16 cores, 2 GHz | Intel 13.0.1 |
| LEO + OpenMP | Intel Xeon Phi 5110P, 60 cores | 2x Intel Sandy Bridge, 16 cores, 2 GHz | Intel 13.0.1 |
| OpenACC | NVIDIA Tesla C2050 (Fermi), ECC on | 1x Intel Westmere, 4 cores, 2.4 GHz | PGI 12.9 |
| OpenCL | NVIDIA Tesla C2050 (Fermi), ECC on | 1x Intel Westmere, 4 cores, 2.4 GHz | Intel 13.0.1 |

Case Study on Accelerators – TCO Components @ RWTH
- One-time costs
  - HW purchase: list prices from Bull
  - Building/infrastructure: treated as annual costs since it is amortized over 25 years
  - OS/env. installation: -
  - Programming effort: full-time employee costs of 285.71 € per day
- Annual costs
  - HW maintenance: 5% of HW purchase costs
  - Building/infrastructure: 200,000 € per year; costs per node: divide by 1.6 MW, multiply by max. power consumption of each node
  - OS/env. maintenance: 4 admins at 75% on cluster maintenance (~2300 nodes): 180,000 € / 2300 = 78 € per node and year
  - Software/compiler: -
  - Power: PUE 1.5, regional electricity costs 0.15 €/kWh
  - Application maintenance: - (small kernels)
- Given a lifetime of 4 years & an investment: #nodes and #executions (usage rate 80%) follow, and from them $C_{ppr}$
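
The rates listed above translate into per-node figures with simple arithmetic. The following Python sketch uses the constants from the slide. The node power values (400 W maximum, 300 W average), the purchase price, the effort in days, and the choice to bill power for all hours of the year are illustrative assumptions, not values from the presentation.

```python
HOURS_PER_YEAR = 365 * 24

BUILDING_EUR_PER_YEAR = 200_000.0   # from the slide
BUILDING_CAPACITY_W = 1.6e6         # 1.6 MW, from the slide
ADMIN_EUR_PER_YEAR = 180_000.0      # from the slide
CLUSTER_NODES = 2300                # from the slide (approximate)
PUE = 1.5                           # from the slide
EUR_PER_KWH = 0.15                  # from the slide
EUR_PER_PERSON_DAY = 285.71         # from the slide
HW_MAINTENANCE_RATE = 0.05          # 5% of HW purchase costs, from the slide


def building_cost_per_node(max_power_w: float) -> float:
    """Annual building share: cost per watt of capacity times the node's max. power."""
    return BUILDING_EUR_PER_YEAR / BUILDING_CAPACITY_W * max_power_w


def admin_cost_per_node() -> float:
    """Annual OS/env. maintenance per node."""
    return ADMIN_EUR_PER_YEAR / CLUSTER_NODES


def power_cost_per_node(avg_power_w: float) -> float:
    """Annual electricity cost; assumes the node draws avg_power_w all year."""
    return avg_power_w / 1000.0 * HOURS_PER_YEAR * PUE * EUR_PER_KWH


def hw_maintenance_per_node(hw_purchase_eur: float) -> float:
    """Annual HW maintenance as a fraction of the purchase price."""
    return HW_MAINTENANCE_RATE * hw_purchase_eur


def programming_cost(effort_days: float) -> float:
    """One-time programming cost per node type."""
    return EUR_PER_PERSON_DAY * effort_days


# Hypothetical node: 400 W max., 300 W average, 6000 EUR purchase price.
print(f"building:    {building_cost_per_node(400.0):.2f} EUR/year")
print(f"admin:       {admin_cost_per_node():.2f} EUR/year")
print(f"power:       {power_cost_per_node(300.0):.2f} EUR/year")
print(f"maintenance: {hw_maintenance_per_node(6000.0):.2f} EUR/year")
print(f"programming: {programming_cost(5.0):.2f} EUR one-time")
```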

Case Study on Accelerators – Real-World Application
- Basis
  - Serial version
  - Small kernel
- Assumption: homogeneous app. landscape
- KegelSpan²
  - 3D simulation of bevel gear cutting process
- Kernel portion artificially increased from 25% to 90%
[Figure source: BMW, ZF, Klingelnberg]

² C. Brecher, C. Gorgels, and A. Hardjosuwito. Simulation based Tool Wear Analysis in Bevel Gear Cutting. In International Conference on Gears, volume 2108.2 of VDI-Berichte, pp. 1381–1384, Düsseldorf, VDI Verlag, 2010.

Case Study on Accelerators – TCO Components of Application
[Figure: bar charts of runtime [s], power consumption [W], and programming effort [days] for OpenCL (GPU), OpenACC (GPU), OpenMP+LEO (Phi), OpenMP-vec (SNB), and OpenMP-simp (SNB). Data labels: 158, 140, 119 (runtime/power panels); 5.0, 4.5, 3.5, 1.5, 0.5 (effort panel).]

Case Study on Accelerators – Results
[Figure: costs per program run relative to OMP-simp (-20% to 20%) over investment (0 € to 200K €) for OpenCL (GPU), OpenACC (GPU), OpenMP+LEO (Phi), and OpenMP-vec (SNB). Data labels: 3.62%, -12.09%, -16.82%, -17.15%.]
[Figure: break-even investment (0 € to 10,000 €). Data labels: 7,787 €, 7,231 €, 1,809 €.]

Conclusion
- Are accelerators beneficial? “It depends”
  - TCO spreadsheet¹ for own computations available
- Our results (w/ 90% kernel portion), relative to SNB-OMP (4 years, 250 K €), show:
  - GPU Fermi beneficial over 2-socket Intel SNB server: -17% $C_{ppr}$
  - Intel Xeon Phi results disappointing for now: +4% $C_{ppr}$
    - Mainly due to high acquisition costs
  - NVIDIA Kepler probably similar
- Programming effort impacts break-even investment (see OpenACC vs. OpenCL)
  - Bigger codes: increase of kernel size ~ increase of break-even investment
  - Projections possible (e.g. hybrid codes)

¹ Wienke, S., an Mey, D., Müller, M.S.: Accelerators for Technical Computing: Is it Worth the Pain? TCO Spreadsheet. https://sharepoint.campus.rwth-aachen.de/units/rz/HPC/public/Shared%20Documents/WienkeEtAl_Accelerators-TCO-Perspective.xlsx, 2013

Outlook
- Hybrid code implementation (compare to projections)
- Model extensions
  - New programming models & architectures (OpenMP 4.0, NVIDIA Kepler)
  - Network communication (MPI)
  - Mixed job execution (heterogeneous application landscape)
  - Assessment of decrease in runtime / gaining more results
- Comprehensive TCO calculation with predictive powers
  - Performance, power consumption, manpower
- Towards exascale computing, architectures might get more complex
  - More difficult to manage & program
  - Impact of manpower effort might get stronger

Thank you for your attention!
