CMS Software and Computing in LHC Run 2 (and Beyond) Matteo Cremonesi FNAL DPF - August 3, 2017
[Diagram labels: Detector, DAQ & Trigger, Software & Computing]
Experimental Particle Physics from a Computing Perspective
• Detect particle interactions (data), compare with theory predictions (simulation)
• Black dots: recorded data
• Blue shape: simulation
• Red shape: simulation of new theory (in this case the Higgs)
[Diagram labels: Detector, Data, Analysis, Reconstruction, Software Algorithms, Simulation, Central Production]
Outline
• The challenge for central production
• What is a workflow? How much work is there? Where can it run?
• Description of the system needed to get all the work done
• Work assignment tool - more detailed description
• Crucial for the efficient production of simulation and processing of detector data
• Minimizes time to delivery of datasets for physics analysis => maximizes resource utilization
Request: Definition of Workflow (MC, data)
• Abstract definition of processing and producing datasets
• Converted into an actual sequence of jobs => production system
• Defined by a set of algorithms, an input, and an output dataset
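As an illustration of the bullets above, here is a minimal Python sketch of a workflow as an abstract definition (algorithms, input, output) and its conversion into a concrete sequence of jobs. The class `Workflow`, the function `to_jobs`, the field names, and the event-based splitting are all assumptions made for this example; they are not the CMS production system's actual data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Workflow:
    """Abstract request (hypothetical model): algorithms, input, and output."""
    name: str
    algorithms: List[str]                 # ordered processing steps
    output_dataset: str
    input_dataset: Optional[str] = None   # None when simulating from scratch
    total_events: int = 0


def to_jobs(workflow: Workflow, events_per_job: int) -> List[dict]:
    """Turn the abstract definition into a concrete list of jobs.

    Splitting by a fixed number of events is an assumption for illustration.
    """
    if events_per_job <= 0:
        raise ValueError("events_per_job must be positive")
    jobs = []
    first = 0
    while first < workflow.total_events:
        n_events = min(events_per_job, workflow.total_events - first)
        jobs.append({
            "workflow": workflow.name,
            "first_event": first,
            "n_events": n_events,
            "steps": list(workflow.algorithms),
        })
        first += n_events
    return jobs
```

For example, a request for 2500 events split into jobs of 1000 events yields three jobs, the last one covering the remaining 500 events.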
Some Numbers
Analyzing CMS data requires a large volume of simulation
• Billions of events in tens of thousands of datasets
• Requires a flexible and automated production system that needs to support at all times:
  • Up to 5k workflows in parallel
  • Up to 200k jobs pending, 150k jobs running
• Record: 200k concurrent jobs
Data Processing at CMS
[Diagram: Request, Workflow Manager, Work Assignment, HTCondor, CMS Grid]
Simulation Processing at CMS
[Diagram: Request, MonteCarlo Manager, Work Assignment, Workflow Manager, HTCondor, CMS Grid]
Simulation Processing at CMS
[Diagram: Request, MonteCarlo Manager, Work Assignment, Workflow Manager, HTCondor, CMS Grid; annotated with Input, Decision making, Resources]
Some Technicalities of the System
McM (MonteCarlo Manager)
• Receive sample requests from the physics group.
• Inject consolidated workflows into the production system.
ReqMgr (Workflow Manager)
• Receive assembled configuration from McM, prepare the full tree of processing towards the production of the final outputs.
• Split jobs according to workload specifications and data content and submit jobs to HTCondor.
• Resubmit certain types of failures.
HTCondor
• Use shared resources between analyzers and central production in a global pool.
• New: allow multi-core applications, moving most workflows to 4+ threads
Work Assignment: Unified
• Software to drive the workflows from the requester through ReqMgr and back to the requester.
• It solves a multi-dimensional matching problem: data location, available resources, etc.
• It does everything automatically
  • less effort needed
  • higher efficiency
  • optimized resource utilization
Modules: injector, transferor, stagor, assignor, checkor, closor, recoveror
[Diagram: workflow status flow, with labels including assignment-approved, considered, staging, staged, away, close, done, forget, trouble, and the assistance states assistance-onhold, assistance-recovery, assistance-recovering, assistance-biglumi, assistance-duplicates, assistance-manual]
Work Assignment: Unified
Automation of transfer
• Parametrized number of copies of the input data to sites
• Destinations picked according to CPU pledge
• Monitoring of transfers
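The destination choice described above can be sketched as a weighted draw. This is only an illustration under stated assumptions: the function name `pick_destinations`, the use of a random weighted draw, and the site names in the usage note are hypothetical, and the source does not say how Unified actually turns CPU pledges into a choice of sites.

```python
import random
from typing import Dict, List, Optional


def pick_destinations(cpu_pledge: Dict[str, float],
                      n_copies: int,
                      rng: Optional[random.Random] = None) -> List[str]:
    """Pick up to n_copies distinct sites, favoring larger CPU pledges.

    Each draw is weighted by the pledge of the sites not yet chosen.
    Sites with zero pledge are never selected.
    """
    rng = rng or random.Random()
    candidates = {site: p for site, p in cpu_pledge.items() if p > 0}
    chosen: List[str] = []
    while candidates and len(chosen) < n_copies:
        sites = list(candidates)
        weights = [candidates[s] for s in sites]
        site = rng.choices(sites, weights=weights, k=1)[0]
        chosen.append(site)
        del candidates[site]   # one copy per site
    return chosen
```

With pledges such as `{"T1_A": 100, "T2_B": 50, "T2_C": 0}` and `n_copies=2`, the function returns the two sites with a nonzero pledge; the number of copies is the tunable parameter mentioned on the slide.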
Work Assignment: Unified
Automatic assignment to as many sites as possible:
• Mostly homogeneous resources, but not all sites are equivalent (performance, policy, availability, size, ...)
• Thousands of workflows with heterogeneous requirements (CPU bound, I/O bound, high memory, ...)
• Balance job priority with site availability
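A toy version of this matching problem is sketched below. It assumes only three criteria (site availability, memory per core, and presence of the input dataset) and processes workflows in priority order; the classes `Site` and `Request` and the functions `eligible_sites` and `assign` are invented for this example and do not reflect Unified's real code or its full set of criteria.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Site:
    name: str
    available: bool
    memory_mb_per_core: int
    datasets: Set[str]          # input datasets already hosted at the site


@dataclass
class Request:
    name: str
    priority: int
    memory_mb_per_core: int
    input_dataset: Optional[str]   # None when no input is needed


def eligible_sites(req: Request, sites: List[Site]) -> List[str]:
    """Sites that are up, have enough memory, and host the input (if any)."""
    return [
        s.name for s in sites
        if s.available
        and s.memory_mb_per_core >= req.memory_mb_per_core
        and (req.input_dataset is None or req.input_dataset in s.datasets)
    ]


def assign(requests: List[Request], sites: List[Site]) -> Dict[str, List[str]]:
    """Handle the highest priority first; assign each workflow to every eligible site."""
    plan: Dict[str, List[str]] = {}
    for req in sorted(requests, key=lambda r: r.priority, reverse=True):
        plan[req.name] = eligible_sites(req, sites)
    return plan
```

Assigning to every eligible site mirrors the "as many sites as possible" goal; a real system would also have to weigh performance, policy, and site size, which this sketch leaves out.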
Work Assignment: Unified
Automatic recovery
• Most workloads run without issues (transfers, job failures, site issues, ...)
• Issues are dealt with through increasing automation
Recent Developments on Unified
Overflow mechanism
• A site might come out of production status because of scheduled interventions, emergency shutdowns, or intermittent failures
• A workload backlog might develop on the local site queue
• Mechanism to overflow to neighboring sites
• Quicker delivery with reliable remote reads
• In the future, can be used to redirect jobs to resources as they become available
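The overflow decision can be illustrated with a short Python sketch. It assumes a simple rule: a site overflows when it is out of production or its backlog exceeds a threshold, and only healthy neighbors are used as targets. The function `overflow_targets`, the threshold `max_backlog`, and the neighbor map are hypothetical; the source does not describe the actual trigger conditions.

```python
from typing import Dict, List


def overflow_targets(site: str,
                     in_production: Dict[str, bool],
                     backlog: Dict[str, int],
                     neighbors: Dict[str, List[str]],
                     max_backlog: int) -> List[str]:
    """Return the sites where jobs queued for `site` may run.

    Jobs sent to a neighbor are assumed to read their input remotely
    from the original site's storage.
    """
    site_up = in_production.get(site, False)
    overloaded = (not site_up) or backlog.get(site, 0) > max_backlog
    if not overloaded:
        return [site]
    healthy_neighbors = [
        n for n in neighbors.get(site, [])
        if in_production.get(n, False) and backlog.get(n, 0) <= max_backlog
    ]
    # Keep the original site as a target only if it is still in production.
    return ([site] if site_up else []) + healthy_neighbors
```

The same function could, in principle, be pointed at newly available resources by adding them to the neighbor map, which is the future use hinted at in the last bullet.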
Conclusion
• CMS relies on a sophisticated infrastructure to process detector data and produce simulations
• Without the timely and efficient delivery of thousands of samples, the CMS physics program would not be possible
• The workflow assignment tool is instrumental in delivering datasets to physics analysis in time
  • Supports large scale production and reprocessing for LHC Run II
  • Automates all steps of the production and processing cycle
  • Constantly working on improvements by learning from operation and investing in development
Backup
MonteCarlo Management: McM
• Receive sample requests from generator contact person
• Inject consolidated workflows into the production system
• CMS Software configuration and ingredients for production steps aggregated in campaigns
• Subsequent steps of production materialize in chains of campaigns
• Flows implement campaign modifiers
• Allow for complex chaining
• Flexibility for defining any specific request
Workflow Management: ReqMgr
• Receive assembled configuration from McM.
• Prepare the full tree of processing towards the production of the final outputs.
• Split jobs according to workload specifications and data content.
• Submit jobs to broker.
• Resubmit certain types of failures.
• Inject the produced data with parentage into the bookkeeping system
• System composed of a central request manager and multiple agents supporting high load
  • 5k workflows
  • 200k jobs pending
  • 150k jobs running
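The "resubmit certain types of failures" step can be sketched as a small retry loop. This is a generic illustration, not ReqMgr's implementation: the function `run_with_resubmission`, the idea of failure categories as strings, and the attempt limit are assumptions introduced here.

```python
from typing import Callable, Optional, Set, Tuple


def run_with_resubmission(run_job: Callable[[], Optional[str]],
                          retryable: Set[str],
                          max_attempts: int = 3) -> Tuple[bool, int]:
    """Run a job, resubmitting only when the failure category is retryable.

    `run_job` returns None on success or a failure category string
    (the categories themselves are hypothetical).
    Returns (succeeded, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        failure = run_job()
        if failure is None:
            return True, attempt
        if failure not in retryable:
            # Non-retryable failures are left for operators or recovery tools.
            return False, attempt
    return False, max_attempts
```

Restricting retries to a set of categories reflects the wording on the slide: only certain failure types are resubmitted automatically, and the rest need a different treatment.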
Job Broker: HTCondor
• Job broker that uses shared resources between analyzers and central production in a global pool.
• Use GlideIn mechanism:
  • Wrapper job: pilot running on site
  • Receive and execute trusted jobs
• Double stage of matchmaking
  • Jobs to resource (start pilots)
  • Jobs to pilots (claim pilots)
• Migrated for a large fraction to multi-core partitionable pilots
  • Allows multi-threaded applications, moving most workflows to 4+ threads
• Performance:
  • Record: 200k concurrent jobs
  • Steady: >150k jobs
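The second matchmaking stage (jobs claiming pilots) and the idea of a partitionable multi-core pilot can be modeled with a toy Python sketch. It does not use or imitate the HTCondor API; the class `Pilot`, the function `match_jobs_to_pilots`, and the first-fit policy are assumptions chosen to keep the example short.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Pilot:
    """A pilot running on a site, holding cores that jobs can carve up."""
    site: str
    free_cores: int
    running: List[str] = field(default_factory=list)

    def claim(self, job_name: str, cores: int) -> bool:
        """Give the job a slice of this pilot if enough cores are free."""
        if cores <= self.free_cores:
            self.free_cores -= cores
            self.running.append(job_name)
            return True
        return False


def match_jobs_to_pilots(jobs: List[Tuple[str, int]],
                         pilots: List[Pilot]) -> List[str]:
    """First-fit matching of (job name, cores) pairs to pilots.

    Returns the names of jobs that could not be matched; in a real system
    these would drive the first stage (starting more pilots).
    """
    unmatched = []
    for name, cores in jobs:
        # any() stops at the first pilot that accepts the claim.
        if not any(p.claim(name, cores) for p in pilots):
            unmatched.append(name)
    return unmatched
```

With one 8-core pilot and three 4-thread jobs, two jobs share the pilot and the third stays unmatched until another pilot starts.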