
A Practical Approach for a Workflow Management System



  1. A Practical Approach for a Workflow Management System Simone Pellegrini, Francesco Giacomini, Antonia Ghiselli INFN Cnaf Viale B. Pichat, 6/2 40127 Bologna {simone.pellegrini | francesco.giacomini | antonia.ghiselli}@cnaf.infn.it

  2. Outline ● Workflow Management Systems overview ● A practical approach for real workflows ● Implementation issues ● A case study: JDL to GWorkflowDL conversion

  3. Outline ● Workflow Management Systems overview ● A practical approach for real workflows ● Implementation issues ● A case study: JDL to GWorkflowDL conversion

  4. Workflow Management Systems Overview ● There is a lot of interest in WfMSs – Thanks to workflows, the business logic of processes can be easily expressed using graphs ● Appealing for users with limited programming skills ● However, the lack of a recognized standard causes incompatibility among WfMSs: – several languages for workflow description exist: ● Based on different modeling formalisms (DAGs, Petri Nets, Pi-Calculus, ...);

  5. Problems in Workflow Management ● The choice of a WfMS usually binds users to a specific workflow language – Changing the WfMS has a high cost: ● Legacy workflows must be rewritten. ● Interoperability among WfMSs is still an open issue ● Difficulty in managing large, complex, real-life scientific processes

  6. Outline ● Workflow Management Systems overview ● A practical approach for real workflows ● Implementation issues ● A case study: JDL to GWorkflowDL conversion

  7. A practical approach for real workflows ● Introducing a WfMS: – Petri Net based ● Formal semantics ● Turing-complete (deals with Workflow Patterns) ● Build-time analysis tools (reachability, boundedness, ...) – Independent from the underlying Grid middleware – Multi-language: ● GWorkflowDL used as internal representation – Deals with interoperability

  8. The WfMS Architecture ● Layered architecture ● Language interoperability: the Workflow Gateway aims at language independence (JDL, GWorkflowDL, BPEL) ● Engine interoperability: Workflow Engine ● The Grid Abstraction Layer aims at Grid middleware independence ● [Diagram, top to bottom: Workflow Management System (Workflow Gateway, Workflow Engine); Grid Abstraction Layer; Grid Middleware/s; storage nodes and execution nodes]

  9. Grid Abstraction Layer ● Abstracts the basic functions of a Grid, providing portability of the Workflow Engine over different Grid middlewares – Dispatcher: job submission/cancellation – Data Transfer: moves data between Grid nodes – Observer: monitors submitted job status – Reservation: resource reservation ● [Diagram: Grid Abstraction Layer composed of Dispatcher, Reservation, Data Transfer and Observer]
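The four functions listed on the slide could be grouped behind a single interface, so that the engine never talks to a specific middleware directly. The sketch below is only an illustration of that idea: the class names, method names and the in-memory backend are assumptions, not the authors' API.

```python
from abc import ABC, abstractmethod


class GridAbstractionLayer(ABC):
    """Hypothetical interface covering the four functions named on the slide."""

    @abstractmethod
    def submit(self, job_description: str) -> str:
        """Dispatcher: submit a job and return its identifier."""

    @abstractmethod
    def cancel(self, job_id: str) -> None:
        """Dispatcher: cancel a previously submitted job."""

    @abstractmethod
    def transfer(self, source: str, destination: str) -> None:
        """Data Transfer: move data between Grid nodes."""

    @abstractmethod
    def status(self, job_id: str) -> str:
        """Observer: report the status of a submitted job."""

    @abstractmethod
    def reserve(self, resource: str) -> str:
        """Reservation: reserve a resource and return a reservation id."""


class InMemoryGrid(GridAbstractionLayer):
    """Toy backend used only to exercise the interface; no real middleware."""

    def __init__(self):
        self.jobs = {}
        self.files = {}
        self.reservations = []

    def submit(self, job_description):
        job_id = f"job-{len(self.jobs) + 1}"
        self.jobs[job_id] = "SUBMITTED"
        return job_id

    def cancel(self, job_id):
        self.jobs[job_id] = "CANCELLED"

    def transfer(self, source, destination):
        self.files[destination] = source

    def status(self, job_id):
        return self.jobs[job_id]

    def reserve(self, resource):
        self.reservations.append(resource)
        return f"res-{len(self.reservations)}"
```

An engine written against `GridAbstractionLayer` would only need a different subclass to run over another middleware, which is the portability the slide describes.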

  10. Workflow Gateway ● Provides language and model converters in order to achieve compatibility with legacy workflows and legacy WfMSs – Parser: extracts the model from a workflow description – Compiler: produces a workflow representation using a target language (JDL, GWorkflowDL, BPEL, …) ● [Diagram: Workflow Gateway composed of Parser, Model Translator and Compiler; labels: BPEL, DAG, gLite (ClassAd), Pi-Calculus, JSDL]

  11. Language Interoperability ● The Gateway solves some interoperability issues in workflow management: – Part of a workflow (or a sub-workflow) can be translated and subsequently delegated to a third-party WfMS – Legacy workflows can be executed on our WfMS without being rewritten ● Every process can be expressed in terms of a Petri Net (Turing-complete)

  12. The Workflow Engine ● Our goal is to keep the engine as simple as possible and concentrate on interoperability issues ● Main characteristics – Petri Net based – Micro-kernel architecture: ● aims at modularity and extensibility – Distributed

  13. Outline ● Workflow Management Systems overview ● A practical approach for real workflows ● Implementation issues ● A case study: JDL to GWorkflowDL conversion

  14. EGEE/gLite as a Grid Middleware ● Choice of EGEE/gLite middleware because of: – Services maturity – Reliability – Large adoption ● Job management is done by the Workload Management System (WMS) ● Job monitoring is done by the Logging and Bookkeeping Service (LB)

  15. Project Goal: A WfMS over the WMS ● The practical outcome of our work is a WfMS relying on the WMS + LB ● Both WMS and LB provide a Web Service interface – this simplifies interaction with Grid services

  16. WfMS Deploy (1/2) ● WfMS running on a dedicated server: – Client sends the workflow description to the server; – WfMS server manages workflow execution; ● Interface to the Grid is provided by the WMS API. ● [Diagram: Client sends the Workflow Description to the WfMS Server and monitors it; the server's running workflow instance submits jobs to the Grid]

  17. WfMS Deploy (2/2) ● WfMS running as a Grid job: – Client submits a workflow to the Grid via the WMS and monitors it via the LB – The WfMS ends up running on a Grid node – The WfMS instance submits workflow tasks to the Grid using the WMS and monitors them via the LB ● Taking full advantage of the Grid resources and facilities (e.g. checkpointing) ● [Diagram: Client submits the Workflow Description to the Grid and monitors it; WfMS running instances execute inside the Grid]

  18. Outline ● Workflow Management Systems overview ● A practical approach for real workflows ● Implementation issues ● A case study: JDL to GWorkflowDL conversion

  19. The Job Description Language (JDL) ● The Job Description Language describes a job to be executed on the Grid ● The JDL adopted within the gLite middleware is based upon Condor's Classified Advertisements (ClassAds) – record-like structure composed of a finite number of attributes separated by semicolons (;) Attribute = Value;

  20. A JDL DAG Workflow ● JDL allows workflow (DAG based) definition:

      [
        Type = "dag";
        [ ... ]
        nodes = [
          father = [ ... ];
          son1 = [ ... ];
          son2 = [ ... ];
          final = [ ... ];
          dependencies = {
            {father, {son1, son2}},
            {son1, final},
            {son2, final}
          };
        ];
      ]

      [Diagram: the JDL parser + Model extractor produce the DAG Model, in which jobs are nodes and dependencies are arcs: father -> son1, son2 -> final]
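The `dependencies` attribute above pairs a parent (or set of parents) with a child (or set of children). As a minimal sketch of what a model extractor might do with it, the function below expands such pairs into individual DAG edges. It assumes the JDL has already been parsed into Python tuples; the function name and data layout are hypothetical and this is not a JDL parser.

```python
def expand_dependencies(dependencies):
    """Expand (parents, children) pairs into a sorted list of (parent, child) edges."""
    edges = set()
    for parents, children in dependencies:
        # A single job name and a set of job names are both allowed on each side.
        parents = [parents] if isinstance(parents, str) else parents
        children = [children] if isinstance(children, str) else children
        for parent in parents:
            for child in children:
                edges.add((parent, child))
    return sorted(edges)


# The dependencies from the slide, written as already-parsed Python values.
dependencies = [
    ("father", ["son1", "son2"]),
    ("son1", "final"),
    ("son2", "final"),
]
edges = expand_dependencies(dependencies)
```

For the slide's example this yields four edges: father to son1, father to son2, son1 to final and son2 to final.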

  21. DAGs in gLite ● Many legacy workflows expressed in JDL exist; ● JDL workflows are managed by the Condor DAG Manager (DAGMan): – Acts as a meta-scheduler for Condor jobs – Submits jobs respecting their inter-dependencies – In case of job failure, DAGMan continues until it can no longer make progress.

  22. DAG -> Petri Net ● The current DAG model is very limited, e.g. – lack of error handling – lack of task types other than computation ● DAGs can be easily described using Petri Nets – a DAG node can be represented by a Petri Net transition – the flow of data among DAG nodes is modeled using data tokens

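  23. DAG -> Petri Net ● [Diagram: the DAG to Petri Net Model Converter turns the DAG Model (father -> son1, son2 -> final) into a Petri Net Model with transitions father, son1, son2, final and places init, P1, P2, P3, P4, end, laid out as init -> father -> P1, P2 -> son1, son2 -> P3, P4 -> final -> end]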
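One simple way to obtain the net in the diagram is to map every DAG node to a transition and every dependency edge to a place, then attach an `init` place to the node without parents and an `end` place to the node without children. The sketch below follows that rule; it is an illustrative reading of the slides, not the authors' converter, and it assumes a DAG with a single root and a single sink. The function name and the dictionary layout are hypothetical.

```python
def dag_to_petri_net(nodes, edges):
    """Convert a DAG into (places, transitions).

    transitions maps each DAG node to its input and output places.
    Assumes one root and one sink, as in the slide's example.
    """
    transitions = {node: {"inputs": [], "outputs": []} for node in nodes}
    places = ["init"]
    # One place per dependency edge: the parent produces a token, the child consumes it.
    for index, (parent, child) in enumerate(edges, start=1):
        place = f"P{index}"
        places.append(place)
        transitions[parent]["outputs"].append(place)
        transitions[child]["inputs"].append(place)
    places.append("end")
    # Connect the root to the initial place and the sink to the final place.
    for transition in transitions.values():
        if not transition["inputs"]:
            transition["inputs"].append("init")
        if not transition["outputs"]:
            transition["outputs"].append("end")
    return places, transitions


nodes = ["father", "son1", "son2", "final"]
edges = [("father", "son1"), ("father", "son2"), ("son1", "final"), ("son2", "final")]
places, transitions = dag_to_petri_net(nodes, edges)
```

With this edge order the generated place names coincide with those on the slide: `father` consumes from `init` and produces into `P1` and `P2`, while `final` consumes from `P3` and `P4` and produces into `end`.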

  24. Petri Net -> GWorkflowDL ● The Compiler turns the Petri Net Model (init, father, P1, P2, son1, son2, P3, P4, final, end) into GWorkflowDL:

      <workflow [...]>
        <place ID="init" />
        <place ID="P1" />
        <place ID="P2" />
        [...]
        <place ID="end" />
        <transition ID="father">
          <inputPlace placeID="init"/>
          <outputPlace placeID="P1"/>
          <outputPlace placeID="P2"/>
          <operation />
        </transition>
        [...]
      </workflow>

      DEMO
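A compiler step of this kind can be sketched with Python's standard `xml.etree.ElementTree`. The example below emits only the elements and attributes visible on the slide (`workflow`, `place`, `transition`, `inputPlace`, `outputPlace`, `operation`); the attributes elided as `[...]` on the slide, such as namespaces, are left out, so the output is not claimed to be a schema-valid GWorkflowDL document. The function name and the input layout are assumptions.

```python
import xml.etree.ElementTree as ET


def compile_to_xml(places, transitions):
    """Serialize a Petri net into an XML string shaped like the slide's fragment."""
    root = ET.Element("workflow")
    for place in places:
        ET.SubElement(root, "place", {"ID": place})
    for name, arcs in transitions.items():
        transition = ET.SubElement(root, "transition", {"ID": name})
        for place in arcs["inputs"]:
            ET.SubElement(transition, "inputPlace", {"placeID": place})
        for place in arcs["outputs"]:
            ET.SubElement(transition, "outputPlace", {"placeID": place})
        # Empty operation, as on the slide: the abstract task is not yet refined.
        ET.SubElement(transition, "operation")
    return ET.tostring(root, encoding="unicode")


# Only the 'father' transition from the slide, with its three places.
example_places = ["init", "P1", "P2"]
example_transitions = {"father": {"inputs": ["init"], "outputs": ["P1", "P2"]}}
xml_text = compile_to_xml(example_places, example_transitions)
```

Parsing `xml_text` back with `ET.fromstring` is a quick way to check that the generated document is well-formed.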

  25. From abstract to concrete workflow (1/4) ● The workflow description needs a refinement process in order to perform concrete task operations ● In the case of the gLite middleware, task execution is asynchronous – The WMS serves job submission requests, returning an ID that identifies the job in its task queue – The ID is used to query (or register for notifications by) the LB service until task termination (or failure)

  26. From abstract to concrete workflow (2/4) ● [Diagram: the abstract JobExecute transition (task execution, from place P1 to place P2) is refined into a sub-net with places P1, P1-1, P1-2, P1-3, P1-4, P1-5 and P2. Labels in the sub-net: InputSandbox, data movement, JobRegister, JobStart, wait_for_termination, Recovery Strategy taken if(result.FAIL), data movement if(result.SUCCESS); tokens: jdl, jobID, result]

  27. From abstract to concrete workflow (3/4) ● Waits until the task is done (or failed) using polling: – getJobState() operation of the LB service. ● [Diagram: wait_for_termination (polling), between places P1-3 and P1-4: the jobID token feeds getJobState, which puts the state s in place P1-3-1; if(s.job_running) the query is repeated after N sec; if(s.job_done || s.job_fail) the net proceeds to P1-4]

  28. From abstract to concrete workflow (4/4) ● Same sub-workflow implementation using notifications; ● [Diagram: wait_for_termination (notify), between places P1-3 and P1-4: input token jobID, output token result; labels: N sec, do_polling]
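The polling variant of wait_for_termination amounts to a loop that queries the job state every N seconds until the job is done or has failed. The sketch below shows that loop only; `get_job_state` is a hypothetical callable standing in for the LB service's getJobState() operation, and the state strings are assumptions rather than the LB's actual values.

```python
import time


def wait_for_termination(job_id, get_job_state, interval_sec=1.0, sleep=time.sleep):
    """Poll get_job_state(job_id) until the job is done or failed; return the final state."""
    while True:
        state = get_job_state(job_id)
        # Corresponds to the guard if(s.job_done || s.job_fail) on the slide.
        if state in ("done", "failed"):
            return state
        # Corresponds to if(s.job_running): wait N seconds, then query again.
        sleep(interval_sec)


# Usage with a fake status source instead of a real LB client.
fake_states = iter(["running", "running", "done"])
naps = []
final_state = wait_for_termination(
    "job-42",
    get_job_state=lambda job_id: next(fake_states),
    interval_sec=5.0,
    sleep=naps.append,  # record the waits instead of actually sleeping
)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; a notification-based variant would replace the loop with a callback registered on the monitoring service.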

  29. Conclusions and Future Work ● Workflow Gateway as a standalone component – Other WfMSs can easily take advantage of language conversions – customizable target language depending on the underlying WfMS ● A lightweight WfMS with basic functionalities – solves low-level aspects of workflow management ● Investigate WfMS engine interoperability

  30. Thank You!
