

  1. SC-Camp 2017: Resource Manager & Job Scheduler
     On the efficient use of HPC facilities
     UL High Performance Computing (HPC) Team, Dr. S. Varrette
     Oct. 23, 2017, University of Luxembourg (UL), Luxembourg
     8th Intl. SuperComputing Camp (SC-Camp 2017), Cadiz, Spain
     Shared Etherpad notes: https://im.csc.uni.lu/p/SC-Camp2017

  2. Latest versions of these materials are available online:
     UL HPC tutorials: https://github.com/ULHPC/tutorials
     SC-Camp: http://www.sc-camp.org/

  3. Introduction: Summary
     1 Introduction
     2 The OAR Batch Scheduler
     3 The SLURM Batch Scheduler
       (Overview; Slurm commands; Slurm vs. OAR commands; ULHPC Slurm
       Configuration; Usage; Example Slurm Launcher)
     4 Conclusion

  4. Introduction: UL HPC general cluster organization
     [Diagram: per-site architecture. Each site connects to the local
     institution network and to other clusters through a site router over
     10/40 GbE; redundant load balancers and site access servers front
     redundant adminfront(s) running the management stack (OAR, Slurm,
     Puppet, Kadeploy, supervision, etc.); the site computing nodes use a
     fast local interconnect (Infiniband EDR, 100 Gb/s) to reach the site
     shared storage area (GPFS/Lustre disk enclosures).]

  5. Introduction: HPC components, software stack (see the sketch below)
     Remote connection to the platform: SSH
     Identity Management / SSO: LDAP, Kerberos, IPA...
     Resource management: job/batch scheduler
       ↪ SLURM, OAR, PBS, MOAB/Torque...
     (Automatic) node deployment
       ↪ FAI, Kickstart, Puppet, Chef, Ansible, Kadeploy...
     (Automatic) user software management
       ↪ EasyBuild, Environment Modules, Lmod
     Platform monitoring
       ↪ Nagios, Icinga, Ganglia, Foreman, Cacti, Alerta...
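As a concrete illustration of the remote-connection and user-software items above, a typical first session might look like the minimal sketch below. The user name, access-server hostname and module name are hypothetical placeholders; actual values are site-specific.

```bash
# Minimal sketch, assuming an SSH-reachable access server and a site that
# provides software through Environment Modules / Lmod.
ssh jdoe@access.cluster.example.org   # hypothetical user and access server

module avail                # list the software modules installed on the site
module load toolchain/foss  # load a compiler/MPI toolchain (name is site-specific)
module list                 # show which modules are currently loaded
module purge                # unload everything for a clean environment
```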

  6-7. Introduction: Resource and Job Management Systems
     Resource and Job Management System (RJMS)
       ↪ the "glue" that lets a parallel computer execute parallel jobs
       ↪ goal: satisfy users' demands for computation by assigning
         resources to user jobs in an efficient manner
     HPC resources:
       ↪ nodes (typically with a unique IP address): NUMA boards,
         sockets / cores / hyperthreads, memory, interconnect/switch
         resources
       ↪ generic resources (e.g. GPUs)
       ↪ licenses
     Strategic position:
       ↪ direct, constant knowledge of the resources
       ↪ launches and otherwise manages jobs

  8. Introduction: RJMS layers
     Resource allocation involves three principal abstraction layers
     (sketched below):
       ↪ Job Management: declaration of a job, i.e. the demand for
         resources and the job characteristics
       ↪ Scheduling: matching of the jobs to the resources
       ↪ Resource Management: launching and placement of job instances,
         along with control of the job's execution
     When there is more work than resources:
       ↪ the job scheduler manages queue(s) of work and supports complex
         scheduling algorithms
       ↪ resource limits can be enforced (by queue, user, group, etc.)
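To make the job-declaration layer concrete, here is a minimal OAR-style batch script: the #OAR directives carry the declaration, the scheduler matches it to free resources, and the resource manager launches the script on them. The job name and resource values are illustrative.

```bash
#!/bin/bash
# Minimal sketch of a job declaration, OAR-style (values are illustrative).
#OAR -n demo_job                          # job name
#OAR -l nodes=1/core=4,walltime=00:30:00  # demand: 4 cores on 1 node, 30 min

echo "Job $OAR_JOB_ID running on $(hostname)"
cat "$OAR_NODEFILE"   # OAR lists the allocated resources in this file
```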

  9-11. Introduction: RJMS detailed components
     Resource Management
       ↪ resource treatment (hierarchy, partitions, ...)
       ↪ job launching, propagation, execution control
       ↪ task placement (topology, binding, ...)
       ↪ advanced features: high availability, energy efficiency,
         topology-aware placement
     Job Management
       ↪ job declaration and control (signaling, reprioritizing, ...)
       ↪ monitoring (reporting, visualization, ...)
       ↪ advanced features: authentication (limitations, security, ...),
         QOS (checkpoint, suspend, accounting, ...), interfacing (MPI
         libraries, debuggers, APIs, ...)
     Scheduling
       ↪ queue management (priorities, multiple queues, ...)
       ↪ advance reservation

  12. Introduction: Job Scheduling [figure]

  13. Introduction: Job Scheduling, backfilling [figure]

  14. Introduction: Job Scheduling, suspension & requeue [figure]

  15-16. Introduction: Main job schedulers

     Name                               Company              Version*
     SLURM                              SchedMD              17.02.8
     LSF                                IBM                  10.1
     OpenLava                           (LSF fork)           2.2
     MOAB/Torque                        Adaptive Computing   6.1
     PBS                                Altair               13.0
     OAR (PBS fork)                     LIG                  2.5.7
     Oracle Grid Engine (formerly SGE)  Oracle               -

     *: as of Oct. 2017

  17. The OAR Batch Scheduler: Summary
     1 Introduction
     2 The OAR Batch Scheduler
     3 The SLURM Batch Scheduler
       (Overview; Slurm commands; Slurm vs. OAR commands; ULHPC Slurm
       Configuration; Usage; Example Slurm Launcher)
     4 Conclusion

  18-19. The OAR Batch Scheduler: the UL HPC resource manager
     http://oar.imag.fr
     OAR is a versatile resource and task manager:
       ↪ schedules jobs for users on the cluster resources
       ↪ an OAR resource = a node or a part of it (CPU/core)
       ↪ an OAR job = execution time (walltime) on a set of resources
     Main OAR features (see the sketch after this list):
       ↪ interactive vs. passive (aka batch) jobs
       ↪ best-effort jobs: use otherwise idle resources, but accept that
         they may be released (killed) at any time
       ↪ deploy jobs (Grid'5000 only): deploy a customized OS environment
         and get full (root) access to the resources
       ↪ powerful resource filtering/matching
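Hedged examples of these job types, using standard OAR submission flags; the exact resource hierarchy and node property names depend on the site configuration.

```bash
# Interactive job: get a shell on the allocated resources
oarsub -I -l nodes=1/core=2,walltime=1:00:00

# Best-effort job: runs on idle resources, may be killed at any time
oarsub -t besteffort -l core=8 ./myscript.sh

# Deploy job (Grid'5000 only): reserve whole nodes to install your own image
oarsub -t deploy -l nodes=2,walltime=2:00:00 -I

# Resource filtering on node properties (the property name is hypothetical)
oarsub -I -p "gpu='YES'" -l nodes=1,walltime=1:00:00
```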

  20. The OAR Batch Scheduler: Main OAR commands
     oarsub     submit/reserve a job (by default: 1 core for 2 hours)
     oardel     delete a submitted job
     oarnodes   show the states of the resources
     oarstat    show information about running or planned jobs
     Submission:
       interactive: oarsub [options] -I
       passive:     oarsub [options] scriptName
     Each created job receives an identifier, JobID
       ↪ default passive-job log files: OAR.JobID.std{out,err}
     You can make an advance reservation with -r "YYYY-MM-DD HH:MM:SS"
     (a typical workflow is sketched below)
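Putting these commands together, a typical session might look like the following sketch; the JobID 1234 and the launcher-script name are illustrative.

```bash
oarsub ./launcher.sh           # passive job; prints e.g. "OAR_JOB_ID=1234"
oarstat -u $USER               # list the state of your own jobs
oarstat -f -j 1234             # full details of one job
cat OAR.1234.stdout            # default log files, once the job has run
oarsub -r "2017-10-25 09:00:00" ./launcher.sh   # advance reservation
oardel 1234                    # cancel/delete a submitted job
```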
