Advancement of usage of TaskChain in production J-R Vlimant

In A Nutshell
● TaskChain is the most flexible type of workflow
● One cmsRun per “task”
● A root task either reads from an input dataset or generates events
  ● wmLHE and pLHE enabled
● Each subsequent task feeds from one of the output modules of one of the preceding tasks
● Trees of tasks possible
  ● A → B → C1 → D1 and B → C2 → D2 (C2 → D3 and so on)
● Job splitting is either done explicitly (#events/job, #lumis/job) or automatically using time/event (N.B. #events/lumi fully functioning)
● All outputs are exposed to computing up-front
● PROS
  ● In a multi-campaign mode of operation, reduces the number of workflows (items in request manager) from N>1 to 1
  ● No intermediate manipulation of datasets
  ● No latency in assigning the next workflows
  ● No latency and less manual operation in creating tape families
● CONS
  ● Full chain has to be tested at once: a change in the mode of operation for gen contacts
  ● Recovery workflows can become complicated with a large number of tasks: a change of operation for Ops
  ● The chain has one priority
  ● All requests need to run at the same site (no T2 → T1 relocation)

Post-MccM Discussion, J-R Vlimant, 9/19/14
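The tree structure described in the nutshell slide (one root task, every later task feeding from an output module of a preceding task) can be illustrated with a small sketch. This is not the WMCore or McM representation: the `Task` class, its field names, and the output module name `out` are hypothetical and chosen only for illustration. The sketch enumerates the root-to-leaf chains of the example tree A → B → C1 → D1, B → C2 → D2, C2 → D3.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Task:
    """Hypothetical task record; field names are illustrative only."""
    name: str
    input_task: Optional[str] = None    # None marks the root task
    input_module: Optional[str] = None  # parent output module feeding this task


def chains(tasks: List[Task]) -> List[List[str]]:
    """Return every root-to-leaf path through the task tree."""
    known: Dict[str, Task] = {}
    children: Dict[str, List[str]] = {}
    roots: List[str] = []
    for task in tasks:
        if task.input_task is None:
            roots.append(task.name)
        elif task.input_task not in known:
            # A task may only feed from a task declared before it.
            raise ValueError(f"{task.name} reads from unknown task {task.input_task}")
        else:
            children[task.input_task].append(task.name)
        known[task.name] = task
        children[task.name] = []
    if len(roots) != 1:
        raise ValueError("expected exactly one root task")

    paths: List[List[str]] = []

    def walk(name: str, prefix: List[str]) -> None:
        path = prefix + [name]
        if not children[name]:
            paths.append(path)  # leaf reached: one complete chain
        for child in children[name]:
            walk(child, path)

    walk(roots[0], [])
    return paths


# The example tree from the slide.
TASKS = [
    Task("A"),
    Task("B", "A", "out"),
    Task("C1", "B", "out"),
    Task("D1", "C1", "out"),
    Task("C2", "B", "out"),
    Task("D2", "C2", "out"),
    Task("D3", "C2", "out"),
]

for path in chains(TASKS):
    print(" → ".join(path))
```

Each printed line corresponds to what would otherwise be a separate sequence of workflows; in a TaskChain they all live in one request.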

Already Tested
● Years of operation of release validation samples
  ● Although job splitting was always set explicitly
● Treating eos-based .lhe files in input https://github.com/dmwm/WMCore/issues/4871
● #Events per lumi https://github.com/dmwm/WMCore/issues/4872
● Doing wmLHE and gen-sim in a single workflow
  ● 2 requests in McM
  ● 2 tasks in the TaskChain
  ● https://vlimant.web.cern.ch/vlimant/SUS-Fall13wmLHE-00011_dict_2t.json
  ● Outputs 2 datasets as if they were processed in two different workflows, without the dataset manipulation latency
● Doing trees of requests from SUS-Fall13wmLHE-00011
  ● https://cms-pdmv.cern.ch/mcm/chained_requests?root_request=SUS-Fall13wmLHE-0001
  ● 1 wmLHE, 1 gen-sim, 2 digi-reco, 1 mini-aod: 5 workflows compared to one TaskChain
  ● https://vlimant.web.cern.ch/vlimant/SUS-Fall13wmLHE-00011_dict_at.json
  ● The last clone made by Alan succeeded, with only an AODSIM output dataset collision due to a wrong assignment.
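The automatic job splitting based on time/event, and its interplay with #events/lumi, comes down to simple arithmetic. The sketch below shows one plausible way to derive it; the function names, the 8-hour target job length, and the rounding choices are assumptions for illustration, not the behaviour of WMCore.

```python
import math


def events_per_job(time_per_event_s: float, target_job_s: float) -> int:
    """Events that fit in one job of the target length (at least 1)."""
    if time_per_event_s <= 0:
        raise ValueError("time per event must be positive")
    return max(1, int(target_job_s // time_per_event_s))


def lumis_per_job(n_events_per_job: int, events_per_lumi: int) -> int:
    """Lumi sections needed to hold one job's events (rounded up)."""
    if events_per_lumi <= 0:
        raise ValueError("events per lumi must be positive")
    return max(1, math.ceil(n_events_per_job / events_per_lumi))


# Assumed numbers: 10 s/event, 8-hour jobs, 100 events per lumi section.
n_events = events_per_job(10.0, 8 * 3600)
n_lumis = lumis_per_job(n_events, 100)
print(n_events, n_lumis)
```

With explicit splitting, the operator supplies the #events/job or #lumis/job directly and no such derivation takes place.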

Already Developed (1/3)
● Testing script for the full chain request (March 2014) https://cms-pdmv.cern.ch/mcm/public/restapi/chained_requests/
  ● get_setup/<chained request id>
  ● Sets up and runs each request one after the other
● Testing API for chained requests (March 2014) https://cms-pdmv.cern.ch/mcm/restapi/chained_requests/
  ● Test/<chained request id>
  ● Threaded runtest of the chain
  ● Verification of performance & efficiency measured
  ● Requires certificates and xrootd enabled
● Creating the taskchain dictionary from
  ● A chained request ID (March 2014) https://cms-pdmv.cern.ch/mcm/public/restapi/chained_requests/get_dict/SUS-chain_Fall13wmLHE_flowWMLHEtoF13_flowS14P
    ● Handles only the requests that are part of the chain
    ● N.B. The link has scratch=true which unfolds the whole chain
  ● A request ID (August 2014) https://cms-pdmv.cern.ch/mcm/public/restapi/chained_requests/get_dict/SUS-Fall13wmLHE-00011?scratch=true
    ● Looks for the tree of requests from the chains the request is involved with
    ● N.B. The link has scratch=true which unfolds the whole chain
● Injection of taskchain (March 2014)
  ● wmcontrol is provided with the URL to the dictionary https://github.com/cms-PdmV/wmcontrol/commit/0a2352e7866a61cf41fb31afa334f4f268f8a415
  ● Everything is done within McM
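The public REST endpoints listed on this slide can be addressed with a few lines of Python. The sketch below only builds the URLs from the patterns shown above (`get_setup/<id>`, `get_dict/<id>`, optional `scratch=true`); the helper names are hypothetical, and `fetch_dict` assumes that the endpoint returns JSON and is reachable without extra authentication, which may not hold.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Base URL taken from the slide.
MCM_PUBLIC_API = "https://cms-pdmv.cern.ch/mcm/public/restapi/chained_requests"


def get_setup_url(chained_request_id: str) -> str:
    """URL of the test script for a full chained request."""
    return f"{MCM_PUBLIC_API}/get_setup/{quote(chained_request_id, safe='')}"


def get_dict_url(prepid: str, scratch: bool = False) -> str:
    """URL of the TaskChain dictionary for a chained request or request ID."""
    url = f"{MCM_PUBLIC_API}/get_dict/{quote(prepid, safe='')}"
    if scratch:
        # scratch=true unfolds the whole chain, as noted on the slide.
        url += "?" + urlencode({"scratch": "true"})
    return url


def fetch_dict(prepid: str, scratch: bool = False) -> dict:
    """Download the dictionary (assumes a JSON response; not verified here)."""
    with urlopen(get_dict_url(prepid, scratch), timeout=30) as response:
        return json.load(response)


print(get_dict_url("SUS-Fall13wmLHE-00011", scratch=True))
```

The printed URL matches the request-ID link quoted on the slide; `fetch_dict` is left uncalled because the service may require CERN credentials or may have changed since 2014.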

Already Developed (2/3)
● Labelling of the output dataset “processingstring” (March 2014)
  ● Application of experience with relvals
  ● Greatly simplifies the assignment of TaskChains
● Registering statistics and status of multiple output datasets (August 2014)
  ● Required for proper toggling of done status with completed events in McM
● Reduction of stats DB size by making the history a member of each doc (August 2014)
  ● From 23 GB to 500 MB
  ● Growth plot fully available and made simpler to make
● Button for chain request testing available to gen contact (September 2014)
  ● Fixed for unintentional reset of requests
● Approval toggling from gen contact & convener (September 2014)
  ● Once validation is finished, status is toggled
  ● Toggling to define then approve in the regular way
● Injection of taskchain and batch texting (September 2014)
  ● Injection is now threaded and locked
  ● Subject & text of the pilot batch was ambiguous https://hypernews.cern.ch/HyperNews/CMS/get/dataopsrequests/5546.html
  ● and now fixed https://github.com/cms-PdmV/cmsPdmV/pull/652

Already Developed (3/3)
● Toggling of status to done using multiple outputs (September 2014)
  ● Few typos fixed
  ● Worked out of the box, with regular request inspection https://cms-pdmv.cern.ch/mcm//requests?member_of_chain=HIG-chain_Summer12_flowS12to53-00264&page=0&shown=146297325599
● Protection for dataset name collision (September 2014)
  ● PR https://github.com/cms-PdmV/cmsPdmV/pull/658
  ● Required to prevent TaskChains from creating collisions with existing requests
  ● Functions with indirect injection of taskchain, i.e. when toggling submit approval
  ● Does not operate with direct injection, i.e. using /restapi/chained_requests/inject/<id>
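The idea behind the dataset name collision protection can be sketched as a simple lookup before injection. This is not the code of the pull request above: the function name, the mapping of dataset names to owning requests, and the dataset names and request IDs in the example are all made up for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def find_collisions(
    new_outputs: Iterable[str], existing: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return (dataset name, owning request) for each name already taken.

    `existing` maps an output dataset name to the ID of the request that
    already produces it (hypothetical bookkeeping, not the McM schema).
    """
    return sorted(
        (name, existing[name]) for name in set(new_outputs) if name in existing
    )


# Made-up names: one of the planned outputs already belongs to another request.
existing_outputs = {
    "/SamplePD/CampaignA-v1/GEN-SIM": "REQ-0001",
    "/SamplePD/CampaignB-v1/AODSIM": "REQ-0002",
}
planned = [
    "/SamplePD/CampaignB-v1/AODSIM",
    "/SamplePD/CampaignB-v1/MINIAODSIM",
]

collisions = find_collisions(planned, existing_outputs)
if collisions:
    print("refusing injection, collisions:", collisions)
```

A check of this kind has to run on every injection path to be effective, which is why the slide distinguishes indirect from direct injection.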

On-Going
● Pilot batch of TaskChain from McM
  ● From HIG mass scan https://cms-pdmv.cern.ch/mcm/requests?dataset_name=*FilterMuOrEle15*&member_of_campaign=Summer12
  ● Extra mass point (55) added, validated https://cms-pdmv.cern.ch/mcm/requests?prepid=HIG-Summer12-02258
  ➔ Completed after a few manual steps
  ➔ Issue with ACDC not solved yet
● Brainstorming on assignment (Ops), quoting chats with Alan
  ● Adapt the scripts that look for possible job location based on input datasets, being primary or pileup
  ● Adapt possible modification to job splitting made by assignment scripts
  ● Allocate TaskChains to sites based on resource availability
  ➔ No feedback yet
  ➔ Proper site white list wasn't used in the pilot and led to failures in digi-reco https://hypernews.cern.ch/HyperNews/CMS/get/dataopsrequests/5546/1/1/1/1/1/1/1/1/1/1/1/1/1/1.html
  ➔ Suspicion that this is also what is causing the ACDC to not start

Suggestions for Next Steps
● Get feedback on the Ops brainstorming and iron out the handshaking details
● Do a reservation campaign in Summer11 & Summer12*
● Put all new requests in Summer11 and Summer12* through TaskChain
● Extend to new requests in Fall13* → miniAOD
● Extend to new requests in Fall14wmLHE → Fall14

