Ganga: a job management and optimising tool for job submission to the grid


  1. Ganga: a job management and optimising tool for job submission to the grid
     Andrew Maier (CERN) for the Ganga Development team
     Enabling Grids for E-sciencE, www.eu-egee.org, INFSO-RI-508833

  2. Overview
     • People – Sponsors
     • Motivation
     • Ganga introduction
     • Architecture
     • Ganga in action
     • Current users of Ganga
     • Summary
     ISGC 2006, May 2006 - A. Maier (CERN)

  3. People - Sponsors
     • Ganga is an ATLAS/LHCb joint project
     • Development work supported by PPARC through GridPP and by EGEE through ARDA
     • Core team:
       – U. Egede (Imperial), K. Harrison (Cambridge), D. Liko (CERN), A. Maier (CERN), J.T. Moscicki (CERN), A. Soroko (Oxford), CL. Tan (Birmingham)
     • Contributions from many others, from summer students to senior researchers
     • Valued contributions from our colleagues of the Academia Sinica

  4. Motivation (1)
     • To submit a job to the grid, a number of “problems” have to be overcome:
       – get a certificate
       – write a JDL with the resources needed
       – create a script to run the application on the worker node
       – monitor the progress of your job
       – retrieve your output
     • The grid may not be the only computing resource you want to use:
       – local machine: for debugging or short tests
       – a local batch system: for small or intermediate datasets
       – grid: for intermediate to large-scale datasets

  5. Motivation (2)
     • Working on these resources can differ, e.g.:
       – A job running locally needs no JDL
       – For a job running on a batch system, a shared file system may make it easy to retrieve results
       – One may have to monitor the job manually
       – The commands to submit a job have different syntax
     • As a user you are probably not interested in these technicalities:
       ➔ Factor out these differences: use Ganga

  6. Ganga Overview
     [Diagram: Ganga4 sits between applications and execution backends and manages Job objects]
     • Applications: Athena, Gaudi, scripts
     • Backends: localhost, LSF, PBS, gLite, LCG2, DIRAC, DIAL, AtlasPROD
     • Operations: store & retrieve job definition; prepare, configure; submit, kill; get output, update status; plus split, merge, monitor

  7. Introduction to Ganga
     • What is Ganga
       – Ganga is an application written in Python that helps the user to configure, prepare, submit and monitor applications on the local host, a batch system or a grid system
       – The goal is to make job submission transparent with respect to the batch system used
       – Configure once, run anywhere
     [Diagram labels: User, GANGA, Localhost, LSF, DIRAC, AM]
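
The "configure once, run anywhere" idea can be illustrated with a small sketch in which one job description is translated into a different submission command for each backend. This is not Ganga code: the `JobSpec` class, the `submission_command` function and the file names are hypothetical. The commands (`bsub` for LSF, `qsub` for PBS, `edg-job-submit` for LCG2) are only built as argument lists here and are never executed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobSpec:
    """Hypothetical, backend-independent description of a job."""
    script: str            # wrapper script to run on the worker node
    jdl: str = "job.jdl"   # JDL file, only needed for grid submission
    queue: str = ""        # optional batch queue name


def submission_command(spec: JobSpec, backend: str) -> List[str]:
    """Return the command line that would submit `spec` to `backend`."""
    if backend == "localhost":
        return ["sh", spec.script]
    if backend == "LSF":
        queue = ["-q", spec.queue] if spec.queue else []
        return ["bsub", *queue, spec.script]
    if backend == "PBS":
        queue = ["-q", spec.queue] if spec.queue else []
        return ["qsub", *queue, spec.script]
    if backend == "LCG2":
        return ["edg-job-submit", spec.jdl]
    raise ValueError(f"unknown backend: {backend}")


# The same description is reused for every backend.
spec = JobSpec(script="run_analysis.sh", queue="short")
for name in ("localhost", "LSF", "PBS", "LCG2"):
    print(name, "->", " ".join(submission_command(spec, name)))
```

The user-facing description stays the same while only the backend name changes, which is the separation the slide describes.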

  8. Ganga job abstraction (1)
     • User tasks are represented in Ganga in terms of a set of building blocks
     • Application
       – Specification of the software to be run, including values for configurable parameters
     • Backend
       – Specification of the batch or Grid system to be used, including resource requirements (minimum memory, maximum CPU, etc.)
     • Dataset
       – Used to specify input and/or output, for example a collection of input files containing event data

  9. Ganga job abstraction (2)
     • Job
       – Full task specification (input Dataset, Application, output Dataset, Backend) and bookkeeping information such as ID and status
     • Splitter
       – Rule for dividing a Job into subjobs that can be run in parallel; the rule may relate to Application parameters and/or to the input Dataset
     • Merger
       – Rule for merging outputs from subjobs
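
The building blocks above can be modelled in a few lines of plain Python. The sketch below is a stand-in and not Ganga's own classes: the field names, the `split_by_files` rule (one subjob per chunk of input files) and the `merge_text` rule (concatenate text outputs) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass
class Application:
    exe: str
    args: List[str] = field(default_factory=list)


@dataclass
class Backend:
    name: str
    max_cpu_minutes: int = 60   # example of a resource requirement


@dataclass
class Job:
    application: Application
    backend: Backend
    inputdata: List[str] = field(default_factory=list)
    id: int = 0
    status: str = "new"


def split_by_files(job: Job, files_per_subjob: int) -> List[Job]:
    """Splitter: create one subjob per chunk of input files."""
    subjobs = []
    for n, start in enumerate(range(0, len(job.inputdata), files_per_subjob)):
        chunk = job.inputdata[start:start + files_per_subjob]
        subjobs.append(replace(job, inputdata=chunk, id=n))
    return subjobs


def merge_text(outputs: List[str]) -> str:
    """Merger: concatenate the text output of all subjobs in order."""
    return "".join(outputs)


job = Job(Application("/bin/echo"), Backend("LSF"),
          inputdata=["f1.root", "f2.root", "f3.root", "f4.root", "f5.root"])
subjobs = split_by_files(job, files_per_subjob=2)
print([sj.inputdata for sj in subjobs])
print(merge_text(["part 0\n", "part 1\n", "part 2\n"]), end="")
```

Each subjob keeps the parent's Application and Backend and differs only in its slice of the input Dataset, so the subjobs can run in parallel and be merged afterwards.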

  10. Internal architecture
      • Ganga 4 is decomposed into 4 functional components
      • These components also describe the components in a distributed model
      • Strategy: design each component so that it could be a separate service
      • But allow two or more components to be combined into a single service
      [Diagram: Client, Application Manager, Job Manager, Remote Registry]

  11. Client
      • Runs the Ganga interface (CLIP, GPI, GUI)
      • The user interacts exclusively through the client
      • With the client, the user creates, modifies, submits and monitors jobs
      • Job configuration is kept in a registry, which can be local (within the client) or remote

  12. Client
      • The client is a thin client (pure Python)
      • The client can be a command line client or a GUI
      • The client interacts with the application manager to configure applications
      • It submits and monitors jobs via the job manager
      • It keeps state by storing persistent information in the registry

  13. Application Manager
      • Prepares and configures the application
      • Compiles user code
      • Sets up the necessary environment
      • Provides information to the client on available applications, versions, platforms, etc.

  14. Job Manager
      • Submits the configured job to the submission backend
      • A submission handler submits a job to a backend
        – Creates the starter script and the JDL
        – Performs the monitoring
      • The application runtime handler
        – Prepares the application-dependent wrapper script, depending on the backend
        – E.g., DIRAC knows how to run LHCb applications with a different setup than LSF
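
As an illustration of what "creates the JDL" could involve, the sketch below renders a minimal job description for an executable and its arguments. `Executable`, `Arguments`, `StdOutput`, `StdError` and `OutputSandbox` are standard JDL attributes. The helper names `quote` and `make_jdl`, and the choice of which attributes to emit, are assumptions and do not come from Ganga's submission handler. In a real submission the executable would typically be the generated starter script.

```python
from typing import Iterable, List


def quote(value: str) -> str:
    """Quote a string for JDL, escaping embedded double quotes."""
    return '"' + value.replace('"', '\\"') + '"'


def make_jdl(exe: str, args: Iterable[str], outputs: Iterable[str] = ()) -> str:
    """Render a minimal JDL text for one job (illustrative subset of attributes)."""
    sandbox: List[str] = ["stdout", "stderr", *outputs]
    lines = [
        f"Executable = {quote(exe)};",
        f"Arguments = {quote(' '.join(args))};",
        'StdOutput = "stdout";',
        'StdError = "stderr";',
        "OutputSandbox = {" + ", ".join(quote(name) for name in sandbox) + "};",
    ]
    return "\n".join(lines) + "\n"


jdl = make_jdl("/bin/echo", ["Hello World"])
print(jdl, end="")
```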

  15. Remote Registry
      • Keeps track of jobs
      • Is a “passive” data store, typically using a database backend
      • Keeps a roaming profile of the user jobs
      • Ganga uses the AMGA metadata catalogue
      • Keeps track of the job status

  16. Design [diagram only]

  17. Ganga in action
      • The key operations for a user running jobs are typically
        – Job definition
        – Job submission
        – Job cancellation
        – Job monitoring
        – Output retrieval
      • These are performed in Ganga using simple GPI/User commands
      • Technicalities are hidden from the user

  18. Job definition
      • A job can be defined in Ganga starting from an instance of the Job class
      • Job properties can be passed as arguments to the constructor
        – j = Job(application=Executable(), backend=LCG())
      • Job properties and sub-properties can also be set through assignments
        – j.application.exe = "/bin/echo"
        – j.application.args = ["Hello World"]

  19. Job submission
      • User command: job.submit()
      • Outcome: job submitted, split into subjobs, command status returned
      • Beneath the surface:
        – Application Manager: perform application configuration for the job (job in, job and derived parameters out)
        – Job Manager: split the job into subjobs
        – Archivist: register subjobs, allocate workspace
        – Application Manager: perform application configuration for the subjobs (subjobs in, subjobs and derived parameters out)
        – Job Manager: create wrapper scripts, submit to backend (status out)
        – Archivist: register subjobs as submitted
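
The sequence behind job.submit() can be sketched as a short pipeline. Everything here is a hypothetical stand-in: the function names, the string job identifiers and the dictionary used as the Archivist's store are illustrative and not taken from Ganga. The sketch only records the order in which the three components are called.

```python
from typing import Dict, List

log: List[str] = []             # records the order of the steps
registry: Dict[str, str] = {}   # stand-in for the Archivist's store


def configure(names: List[str]) -> Dict[str, dict]:
    """Application Manager: attach derived parameters to each (sub)job."""
    log.append("configure " + ",".join(names))
    return {name: {"derived": True} for name in names}


def split(job: str, n: int) -> List[str]:
    """Job Manager: split a job into n subjobs."""
    log.append("split " + job)
    return [f"{job}.{i}" for i in range(n)]


def register(names: List[str], status: str) -> None:
    """Archivist: record the status of each subjob."""
    log.append(f"register {status}")
    for name in names:
        registry[name] = status


def submit_to_backend(configured: Dict[str, dict]) -> None:
    """Job Manager: wrapper-script creation and backend submission would go here."""
    log.append("submit")


def submit(job: str, n_subjobs: int) -> List[str]:
    """Run the six steps in the order shown on the slide."""
    configure([job])
    subjobs = split(job, n_subjobs)
    register(subjobs, "new")          # workspace allocation would also happen here
    configured = configure(subjobs)
    submit_to_backend(configured)
    register(subjobs, "submitted")
    return subjobs


subjobs = submit("job1", 2)
print(log)
print(registry)
```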

  20. Monitoring and output retrieval
      • User action: none
      • Outcome: changes in job status reported/updated periodically, output retrieved automatically when job completes
      • Beneath the surface (control and monitoring thread):
        – Job Manager: determine active jobs for each backend
        – For each backend with active jobs: query status of active jobs
        – For each job with a change of status: report the change of status to the user, retrieve output if the job completed
        – Archivist: register new job status
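
One pass of such a monitoring loop might look like the sketch below. The data layout (a dictionary of jobs with `backend` and `status` keys), the status names and the `query` and `report` callbacks are assumptions made for illustration. In the design described above this pass would run periodically in a background thread.

```python
from collections import defaultdict
from typing import Callable, Dict, List

ACTIVE = {"submitted", "running"}


def poll_once(jobs: Dict[str, dict],
              query: Callable[[str, List[str]], Dict[str, str]],
              report: Callable[[str, str, str], None]) -> List[str]:
    """Run one monitoring pass; return ids of jobs whose output should be retrieved."""
    # Determine active jobs for each backend.
    by_backend: Dict[str, List[str]] = defaultdict(list)
    for job_id, job in jobs.items():
        if job["status"] in ACTIVE:
            by_backend[job["backend"]].append(job_id)

    retrieved = []
    for backend, job_ids in by_backend.items():
        # One status query per backend that has active jobs.
        new_status = query(backend, job_ids)
        for job_id in job_ids:
            old, new = jobs[job_id]["status"], new_status[job_id]
            if new != old:
                report(job_id, old, new)
                jobs[job_id]["status"] = new     # register the new status
                if new == "completed":
                    retrieved.append(job_id)     # output retrieval would go here
    return retrieved


jobs = {
    "1": {"backend": "LSF", "status": "submitted"},
    "2": {"backend": "LCG", "status": "running"},
    "3": {"backend": "LCG", "status": "completed"},
}


def fake_query(backend, job_ids):
    # Pretend every LCG job has finished and every LSF job is now running.
    return {j: "completed" if backend == "LCG" else "running" for j in job_ids}


changes = []
done = poll_once(jobs, fake_query, lambda j, old, new: changes.append((j, old, new)))
print(done, changes)
```

Grouping by backend first keeps the number of status queries proportional to the number of backends in use instead of the number of jobs.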

  21. CLIP: simple job from first principles
      [Screenshot annotations: job defined in Cambridge, job run in Amsterdam]

  22. Running ATLAS jobs on LCG [screenshot only]

  23. Ganga GUI
      • Ganga ships with a GUI
      • Based on PyQt
      • Completely dockable
      • Dynamically built on the internal plugin architecture
      • Includes a job builder wizard

  24. Ganga GUI (top half)
      [Screenshot labels: List of Jobs, Job details, Logical Job Folder list]
