Linux Kernel Co-Scheduling For Bulk Synchronous Parallel Applications
ROSS 2011, Tucson, AZ
Terry Jones
Oak Ridge National Laboratory
Outline
• Motivation
• Approach & Research
• Design Attributes
• Achieving Portable Performance
• Measurements
• Conclusion & Acknowledgements
We're Experiencing an Architectural Renaissance
• Increased core counts
• Increased transistor density
• Disruptive technologies
Factors to change:
• Moore's Law -- the number of transistors per IC doubles every 24 months
• No power headroom -- clock speed will not increase (and may decrease) because of power:
  Power ∝ Voltage² × Frequency
  Power ∝ Frequency (at fixed voltage)
  Power ∝ Voltage³ (since frequency scales with voltage)
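The power relations on this slide follow from the standard dynamic-power argument; the sketch below spells out the reasoning, under the usual assumption (not stated on the slide) that supply voltage must rise roughly linearly with clock frequency.

```latex
% Dynamic power of CMOS logic, with C the switched capacitance:
%   P is proportional to C V^2 f.
% If the supply voltage scales roughly linearly with frequency, V ~ f,
% then substituting gives P ~ C f^3 ~ V^3,
% which is why clock rates stalled while core counts kept growing.
\[
  P \;\propto\; C\,V^{2} f, \qquad V \propto f
  \;\Longrightarrow\; P \;\propto\; C f^{3} \;\propto\; V^{3}
\]
```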
A Key Component of the Colony Project: Adaptive System Software for Improved Resiliency and Performance
Collaborators: Terry Jones, Project PI; Laxmikant Kalé, UIUC PI; José Moreira, IBM PI

Objectives
• Provide technology to make portable scalability a reality.
• Remove the prohibitive cost of full POSIX APIs and full-featured operating systems.
• Enable easier leadership-class scaling for domain scientists by removing key system software barriers.

Approach
• Automatic and adaptive load balancing plus fault tolerance.
• High-performance peer-to-peer and overlay infrastructure.
• Address issues with Linux to provide the familiarity and performance needed by domain scientists.

Impact
• Full-featured environments allow a full range of development tools, including debuggers, memory tools, and system monitoring tools that depend on separate threads or other POSIX APIs.
• Automatic load balancing helps correct problems associated with long-running dynamic simulations.
• Coordinated scheduling removes the negative impact of OS jitter from full-featured system software.

Challenges
• Computational work often includes large amounts of state, which places additional demands on successful work migration schemes.
• For widespread acceptance from the Linux community, the effort to validate and incorporate HPC-originated advancements into the Linux kernel must be minimized.
Motivation – App Complexity
Don't Limit the Development Environment
• Linux → familiar, open source, support for common system calls
• Support for daemons & threading packages → debugging strategies, asynchronous strategies
• Support for administrative monitoring
OS Scalability
• Eliminate OS scalability issues through parallel-aware scheduling
The Need For Coordinated Scheduling
Bulk Synchronous Programs
The Need For Coordinated Scheduling
• Permit full Linux functionality
• Eliminate problematic OS noise
• Metaphor: cars and coordinated traffic lights
What About …
• Core specialization?
• A minimalist OS?
• Will apps always be bulk synchronous?
Yeah, but it's Linux.
HPC Colony Technology – Coordinated Scheduling
[Figure: two timelines across nodes 1a–2d illustrating uncoordinated vs. coordinated scheduling over time.]
• Ferreira, Bridges and Brightwell confirmed that a 1000 Hz, 25 µs noise interference (an amount measured on a large-scale commodity Linux cluster) can cause a 30% slowdown in application performance on ten thousand nodes.
• The Tau team at the University of Oregon has reported a 23% to 32% increase in runtime for parallel applications running at 1024 nodes with 1.6% operating system noise.
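To see why small, unsynchronized interruptions amplify at scale for bulk synchronous codes, here is a minimal simulation sketch (not from the talk; every parameter is made up): each iteration ends at the slowest rank, so the chance that *some* rank is delayed grows quickly with node count even when each individual node is rarely interrupted.

```c
/*
 * Illustrative-only sketch of noise amplification in bulk synchronous
 * programs.  Each iteration costs the maximum over ranks of
 * (work + noise), so a rare but long interruption (e.g., a daemon
 * wakeup) that barely affects one node stretches almost every
 * iteration once the node count is large.  Parameters are invented.
 */
#include <stdio.h>
#include <stdlib.h>

#define WORK_US     1000.0   /* compute time per iteration per rank      */
#define NOISE_US     300.0   /* length of one interruption               */
#define NOISE_PROB     0.001 /* chance a rank is hit in a given iteration */
#define ITERATIONS   500

static double simulate(int nodes)
{
    double total = 0.0;
    for (int it = 0; it < ITERATIONS; it++) {
        double slowest = WORK_US;
        for (int n = 0; n < nodes; n++) {
            double t = WORK_US;
            if ((double)rand() / RAND_MAX < NOISE_PROB)
                t += NOISE_US;      /* this rank was interrupted        */
            if (t > slowest)
                slowest = t;        /* the barrier waits for the last rank */
        }
        total += slowest;
    }
    return total / (ITERATIONS * WORK_US);  /* slowdown vs. noise-free */
}

int main(void)
{
    for (int nodes = 1; nodes <= 16384; nodes *= 4)
        printf("%6d nodes: slowdown factor %.3f\n", nodes, simulate(nodes));
    return 0;
}
```

With these made-up numbers a single node loses well under 0.1% on average, yet at ten thousand nodes nearly every iteration absorbs the full interruption, which is the qualitative effect the measurements on this slide describe.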
Goals
• Portable performance
• Make OS noise a non-issue for bulk-synchronous codes
• Permit sysadmin best practices
Proof of Concept – Blue Gene/L Core Counts (cont.)
Scaling with noise (noise level such that a serial task takes 30% longer)
[Plots: Allreduce and GLOB benchmark times on a log scale versus node count (1024, 2048, 4096, 8192) for CNK, Colony with SchedMods (quiet), Colony with SchedMods (30% noise), Colony (quiet), and Colony (30% noise).]
Approach
• Introduces two new process flags and two new tunables:
  – total time of epoch
  – percentage to the parallel app (the percentage of blue in the co-schedule figure)
• Dynamically turned on or off with a new system call
• Tunables are adjusted through a second new system call

Salient Features
• Utilizes a new clock synchronization scheme
• Uses the existing fair round-robin scheduler within both epochs
• Permits needed flexibility for time-out based and/or latency-sensitive apps
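The slide names the mechanism but not its interface. As a rough illustration only (none of these names, values, or system-call details come from the talk, and this is a user-space model rather than the actual kernel change), the epoch split driven by the two tunables might look like this:

```c
/*
 * Hypothetical user-space sketch of the epoch split described on this
 * slide: each globally synchronized epoch of length epoch_usec is
 * divided between the co-scheduled parallel application
 * (parallel_pct of the epoch) and everything else (daemons, kernel work).
 */
#include <stdio.h>

/* The two new tunables named on the slide (values here are made up). */
static long epoch_usec   = 10000;  /* total time of one epoch             */
static int  parallel_pct = 80;     /* percentage of the epoch for the app */

/* Return 1 if the parallel-application class is eligible to run now. */
static int parallel_class_runs(long now_usec)
{
    long offset = now_usec % epoch_usec;              /* position within epoch */
    long cutoff = (epoch_usec * parallel_pct) / 100;  /* end of the app window */
    return offset < cutoff;
}

int main(void)
{
    /* Walk through two epochs in 1 ms steps and show which class runs. */
    for (long t = 0; t < 2 * epoch_usec; t += 1000)
        printf("t = %5ld us -> %s\n", t,
               parallel_class_runs(t) ? "parallel app" : "daemons/other work");
    return 0;
}
```

In the actual kernel change, the same arithmetic would presumably be applied in the scheduler for tasks carrying the new co-scheduling flag, with the epoch boundary aligned across nodes by the synchronized clock so that every node's parallel window opens and closes at nearly the same instant.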
Results
…and in conclusion…
For further info contact: Terry Jones, trj@ornl.gov
http://www.hpc-colony.org
http://charm.cs.uiuc.edu

Partnerships and Acknowledgements
• Synchronized clock work done by Terry Jones and Gregory Koenig
• DOE Office of Science – major funding provided by FastOS 2
• Colony Team
Extra Viewgraphs
Improved Clock Synchronization Algorithms
Sponsor: DOE ASCR, FWP ERKJT17

Achievement
Developed a new clock synchronization algorithm. The new algorithm is a high-precision design suitable for large leadership-class machines like Jaguar. Unlike most high-precision algorithms, which reach their precision in a post-mortem analysis after the application has completed, the new ORNL-developed algorithm rapidly provides precise results during runtime.

Relevance
• To the Sponsor: makes more effective use of OLCF and ALCF systems possible.
• To the Laboratory, Directorate, and Division missions: demonstrates capabilities in critical system software for leadership-class machines.
• To the computer science research community: a high-precision global synchronized clock is of growing interest to system software needs including parallel analysis tools, file systems, and coordination strategies; demonstrates techniques for high precision coupled with a guaranteed answer at runtime.
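The slide does not describe the ORNL algorithm itself. Purely for orientation on what "clock synchronization" computes at runtime, here is a minimal, generic round-trip (Cristian-style) offset estimate; it is explicitly not the high-precision design the slide reports, and all names and numbers are illustrative.

```c
/*
 * NOT the ORNL algorithm from this slide -- just a generic round-trip
 * (Cristian-style) offset estimate: synchronizing clocks here means
 * estimating, during the run, the offset between a local clock and a
 * reference clock.  All values are illustrative.
 */
#include <stdio.h>

/* One probe: local send time, reference's reported time, local receive time. */
struct probe {
    double t_send;   /* local clock when the request left     */
    double t_ref;    /* reference clock when it answered      */
    double t_recv;   /* local clock when the reply arrived    */
};

/* Estimated offset = reference time minus the local midpoint of the round trip. */
static double estimate_offset(const struct probe *p)
{
    double midpoint = (p->t_send + p->t_recv) / 2.0;
    return p->t_ref - midpoint;
}

int main(void)
{
    /* Made-up probe: 10 us round trip, reference running 3 us ahead. */
    struct probe p = { 100.0, 108.0, 110.0 };
    printf("estimated offset: %.1f us\n", estimate_offset(&p));
    return 0;
}
```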
Test Setup: Jaguar XT5
• Interconnects: SeaStar2+ 3D torus (9.6 Gbit/sec); SION InfiniBand (16 Gbit/sec); InfiniBand (16 Gbit/sec); Serial ATA (3.0 Gbit/sec)
• Compute nodes: 18,688 nodes (12 Opteron cores per node)
• Gateway nodes: 192 nodes (2 Opteron cores per node)
• Commodity network: InfiniBand switches (3000+ ports)
• Storage nodes: 192 nodes (8 Xeon cores per node)
• Enterprise storage: 48 controllers (DataDirect S2A9900)
Test Setup (continued)
• Ping-pong latency: ~5.0 µsec
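A latency figure like this is typically obtained with a two-rank ping-pong microbenchmark. The talk does not include its benchmark source, so the following is only a generic sketch of that style of measurement; the message size and repetition count are arbitrary choices.

```c
/*
 * Generic MPI ping-pong latency microbenchmark sketch (not the code used
 * for the measurement on this slide).  Rank 0 sends a small message to
 * rank 1, which echoes it back; half the average round-trip time is
 * reported as the one-way latency.  Run with two ranks, e.g. mpirun -np 2.
 */
#include <mpi.h>
#include <stdio.h>

#define REPS      1000
#define MSG_BYTES 8

int main(int argc, char **argv)
{
    char buf[MSG_BYTES] = {0};
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double start = MPI_Wtime();

    for (int i = 0; i < REPS; i++) {
        if (rank == 0) {
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }

    double elapsed = MPI_Wtime() - start;
    if (rank == 0)
        printf("one-way latency: %.2f usec\n",
               elapsed / (2.0 * REPS) * 1e6);

    MPI_Finalize();
    return 0;
}
```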