

  1. Useful info on SNT workflows
     Nick Amin

  2. Overview
     ⚫ Two main parts
       • Data/metadata retrieval
         ‣ people usually use DAS
         ‣ many of us use DIS
         ‣ metadata about SNT samples (i.e., CMS4)
       • Job submission
         ‣ people usually use CRAB
         ‣ many of us use local condor (and Metis)
     ⚫ Also advertising: how to retrieve this information, and how to optimally request CMS4 samples :)

  3. Where do I get my data?
     ⚫ Many CMS services deal with data bookkeeping
     ⚫ DAS (Data Aggregation Service) covers many of them, but not all
       • It's old and slow
     ⚫ "Can I get the cmsRun configuration for the GENSIM used to make this MC sample?"
       • Surprise: you can't use DAS for this
       • Surprise: even using McM, that's 10-30 clicks the first time, and 4-5 clicks once you know which icons are relevant
     [Diagram: the user talks to each service separately. DBS: dataset/file info. PhEDEx: dataset/file location. pMp: campaign progress. McM: MC configurations. Other stuff: minor stuff nobody uses. CMS4 information: Skype with Nick, or ls /hadoop/...]

  4. Where do I get my data?
     ⚫ DIS alleviates this issue by querying the services directly
       • Faster, so you find out where that WJets sample is before you retire
     ⚫ Drops some DAS things that we don't use daily
     ⚫ Adds some things that combine multiple sources
     ⚫ Adds CMS4 bookkeeping
     ⚫ You don't need a proxy/cert for anything; only the person running the DIS server does
     [Diagram: the user talks only to DIS, which in turn queries DBS (dataset/file info), PhEDEx (dataset/file location), pMp (campaign progress), McM (MC configurations), and the CMS4 bookkeeping.]

  5. A DAS query
     [Screenshot of a DAS query; processing time: 8.1s]
     ⚫ ...not to mention it sometimes times out, and there's also this kind of page:
     [Screenshot of a DAS error page]

  6. A DIS query
     ⚫ http://uaf-8.t2.ucsd.edu/~namin/dis/?query=%2FEGamma%2FRun2018D-22Jan2019-v2%2FMINIAOD&type=files&short=short
     [Screenshot of the DIS result; processing time: 2.2s]
     ⚫ If DIS talks to DBS directly, and DAS talks to DBS for the same data, then how is DAS 4x slower? 🤸
     ⚫ dasgoclient (CLI) was written by the DAS author to bypass DAS and query DBS directly
       • But it's not a website, and it doesn't (nicely) do all the things we want
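     For comparison, the equivalent dasgoclient lookup (standard CLI syntax, using the dataset from the query above) looks something like:

         dasgoclient -query="file dataset=/EGamma/Run2018D-22Jan2019-v2/MINIAOD"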

  7. DIS knobs
     [Screenshot of the DIS web form, annotated:]
     • query: almost always just a dataset name
     • type: what kind of info do you want?
     • short: "short" output? if unchecked, display more details

  8. DIS options (1)
     ⚫ Basic
     ⚫ Files: by default, shows only 10 files (uncheck the short option to see all; example below)
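     As an illustration (not from the slides), the same files lookup via the CLI client described on slide 15 would be:

         dis_client.py -t files "/EGamma/Run2018D-22Jan2019-v2/MINIAOD"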

  9. DIS options (2)
     ⚫ Sites: where is my data? (example below)
       • If you put in a file, you get info about that file/block
       • If you put in a dataset, you get fractional T2 presence
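     A sketch of a sites query with the CLI client, assuming the type is spelled sites as in the web form:

         dis_client.py -t sites "/EGamma/Run2018D-22Jan2019-v2/MINIAOD"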

  10. DIS options (3)
      ⚫ Chain (example below)
        • returns McM info (fragment, driver, CMSSW version, gridpack)
          ‣ ...for all steps from GENSIM to NANOAODSIM
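      An illustrative chain query (the MC dataset name here is only an example, not from the slides):

          dis_client.py -t chain "/TTTT_TuneCP5_13TeV-amcatnlo-pythia8/RunIIAutumn18MiniAOD-102X_upgrade2018_realistic_v15-v1/MINIAODSIM"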

  11. DIS options (4)
      ⚫ Pick (pickevents)
        • Put in a dataset, then a comma separated list of run:lumi:event (example below)
        • Gives you the command to run to get a single root file
      ⚫ Pick_cms4 (pickevents to CMS4 level)
        • Or just skip the middleman and print out which CMS4 files have the events
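      A sketch of the pick input format described above (the run:lumi:event numbers are invented for illustration):

          dis_client.py -t pick "/MET/Run2018A-PromptReco-v1/MINIAOD,316239:123:123456789"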

  12. DIS options (5)
      ⚫ SNT (search CMS4 samples)
        • Two entries here, because there are two CMS4 tags

  13. How to summarize data?
      ⚫ How can we summarize lots of output?
      ⚫ "What's the total event count of all /TTTT* samples?" (see the sketch below)
        • Any list of json-like stuff can be piped into "grep"
        • Print out some statistics with "stats"
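      A sketch of how the /TTTT* question might look, with grep and stats piped inside the query string as on slide 15 (the field name nevents_out is a guess patterned on the nevents_in field shown there):

          dis_client.py -t snt "/TTTT* | grep nevents_out | stats"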

  14. How to select data?
      ⚫ Additionally, for SNT queries, put restrictions as a comma separated list after the dataset pattern
      ⚫ Example: print out the hadoop path and dataset name for Prompt 2018 data processed with the CMS4_V10-02-04 tag (see the sketch below)
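      One way the example above might look (dataset_name and cms3tag appear on slide 15; the field name for the hadoop path, location here, is an assumption):

          dis_client.py -t snt "/*/Run2018*Prompt*/MINIAOD,cms3tag=CMS4_V10-02-04 | grep location,dataset_name"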

  15. Python CLI client
      ⚫ Python command line client has the exact same syntax as the webpage (just give -t <type>), and you can make nice tables too
        • https://github.com/aminnj/dis/blob/master/dis_client.py

      dis_client.py -t snt "/MET/Run2018*Prompt*/MINIAOD,cms3tag=CMS4_V10-02-04 | grep dataset_name,gtag,nevents_in" --table

      dataset_name                          gtag                      nevents_in
      /MET/Run2018A-PromptReco-v2/MINIAOD   102X_dataRun2_Prompt_v11  5980578
      /MET/Run2018A-PromptReco-v1/MINIAOD   102X_dataRun2_Prompt_v11  30172992
      /MET/Run2018B-PromptReco-v1/MINIAOD   102X_dataRun2_Prompt_v11  28012780
      /MET/Run2018C-PromptReco-v1/MINIAOD   102X_dataRun2_Prompt_v11  1986935
      /MET/Run2018B-PromptReco-v2/MINIAOD   102X_dataRun2_Prompt_v11  1739672
      /MET/Run2018A-PromptReco-v3/MINIAOD   102X_dataRun2_Prompt_v11  17175066
      /MET/Run2018D-PromptReco-v2/MINIAOD   102X_dataRun2_Prompt_v11  162272551
      /MET/Run2018C-PromptReco-v2/MINIAOD   102X_dataRun2_Prompt_v11  14698298
      /MET/Run2018C-PromptReco-v3/MINIAOD   102X_dataRun2_Prompt_v11  14586790
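      The client can presumably also be imported as a module; this is a minimal sketch assuming dis_client exposes a query() helper that mirrors the -t <type> flag and returns JSON-like output (check the repo README for the actual interface):

          import dis_client  # assumes dis_client.py is on PYTHONPATH
          # typ mirrors the CLI's -t <type>; the returned structure is assumed JSON-like
          data = dis_client.query("/MET/Run2018*Prompt*/MINIAOD,cms3tag=CMS4_V10-02-04", typ="snt")
          print(data)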

  16. DIS (misc)
      ⚫ Other features
        • See the README of the repo

  17. ProjectMetis
      ⚫ "CRAB mostly works when it works, but it mostly doesn't work"
        • CRAB has a heavyweight server between you and your jobs
      ⚫ Luckily, we have local condor submission, and running lots of cmsRun isn't that complicated
      ⚫ Almost all data processing we do is based on dataset in → files out
        • Can be organized into "tasks" that take a "sample" (supplier of events) and produce files with events
        • CRAB takes a dataset, PSet, CMSSW code, and some other info in a configuration file
      ⚫ Metis (https://github.com/aminnj/ProjectMetis) makes it more functional
        • The tarfile contains the CMSSW source to eventually ship to condor worker nodes

      task = CMSSWTask(
          sample = DBSSample(dataset="/ZeroBias6/Run2017A-PromptReco-v2/MINIAOD"),
          events_per_output = 450e3,
          output_name = "merged_ntuple.root",
          tag = "CMS4_V00-00-03",
          pset = "pset_test.py",
          pset_args = "data=True prompt=True",
          cmssw_version = "CMSSW_9_2_1",
          tarfile = "/nfs-7/userdata/libCMS3/lib_CMS4_V00-00-03_workaround.tar.gz",
          is_data = True,
      )

  18. ProjectMetis (submitting)
      ⚫ Process a task
        • get list of inputs
        • make list of outputs
        • submit jobs
        • resubmit failed jobs
      ⚫ Make a summary of jobs and put it on a dashboard
      ⚫ Easily extendible to a loop over datasets

      import time
      # imports as used in the ProjectMetis examples (assumed layout)
      from metis.CMSSWTask import CMSSWTask
      from metis.Sample import DBSSample
      from metis.StatsParser import StatsParser

      def main():
          task = CMSSWTask(
              sample = DBSSample(dataset="/ZeroBias6/Run2017A-PromptReco-v2/MINIAOD"),
              events_per_output = 450e3,
              output_name = "merged_ntuple.root",
              tag = "CMS4_V00-00-03",
              pset = "pset_test.py",
              pset_args = "data=True prompt=True",
              cmssw_version = "CMSSW_9_2_1",
              tarfile = "/nfs-7/userdata/libCMS3/lib_CMS4_V00-00-03_workaround.tar.gz",
              is_data = True,
          )
          task.process()
          # collect this task's summary for the dashboard
          total_summary = {task.get_sample().get_datasetname(): task.get_task_summary()}
          StatsParser(data=total_summary, webdir="~/public_html/dump/metis_test/").do()

      if __name__ == "__main__":
          # Do stuff, sleep, do stuff, sleep, etc.
          for i in range(100):
              main()
              time.sleep(1.*3600)
      # Since everything is backed up, it's totally OK to Ctrl+C and pick up later

  19. ProjectMetis (chaining)
      ⚫ Can chain together tasks
        • Input of one is the output of the previous task
      ⚫ Allows one to make a GEN → CMS4 workflow in one script
        • Make 5 tasks
        • Loop through tasks and process them all
        • As tasks complete, the inputs for the subsequent ones become available
          ‣ parallel in a sense

      import time
      # imports as used in the ProjectMetis examples (assumed layout)
      from metis.CMSSWTask import CMSSWTask
      from metis.Sample import DirectorySample, DummySample
      from metis.StatsParser import StatsParser

      tag = "v1"
      total_summary = {}
      for _ in range(10000):
          gen = CMSSWTask(
              sample = DummySample(N=1, dataset="/WH_HtoRhoGammaPhiGamma/privateMC_102x/GENSIM"),
              events_per_output = 1000,
              total_nevents = 1000000,
              pset = "gensim_cfg.py",
              cmssw_version = "CMSSW_10_2_5",
              scram_arch = "slc6_amd64_gcc700",
              tag = tag,
              split_within_files = True,
          )
          raw = CMSSWTask(
              sample = DirectorySample(
                  location = gen.get_outputdir(),
                  dataset = gen.get_sample().get_datasetname().replace("GENSIM","RAWSIM"),
              ),
              open_dataset = True,
              files_per_output = 1,
              pset = "rawsim_cfg.py",
              cmssw_version = "CMSSW_10_2_5",
              scram_arch = "slc6_amd64_gcc700",
              tag = tag,
          )
          aod = CMSSWTask(
              sample = DirectorySample(
                  location = raw.get_outputdir(),
                  dataset = raw.get_sample().get_datasetname().replace("RAWSIM","AODSIM"),
              ),
              open_dataset = True,
              files_per_output = 5,
              pset = "aodsim_cfg.py",
              cmssw_version = "CMSSW_10_2_5",
              scram_arch = "slc6_amd64_gcc700",
              tag = tag,
          )
          miniaod = CMSSWTask(
              sample = DirectorySample(
                  location = aod.get_outputdir(),
                  dataset = aod.get_sample().get_datasetname().replace("AODSIM","MINIAODSIM"),
              ),
              open_dataset = True,
              flush = True,
              files_per_output = 5,
              pset = "miniaodsim_cfg.py",
              cmssw_version = "CMSSW_10_2_5",
              scram_arch = "slc6_amd64_gcc700",
              tag = tag,
          )
          cms4 = CMSSWTask(
              sample = DirectorySample(
                  location = miniaod.get_outputdir(),
                  dataset = miniaod.get_sample().get_datasetname().replace("MINIAODSIM","CMS4"),
              ),
              open_dataset = True,
              flush = True,
              files_per_output = 1,
              output_name = "merged_ntuple.root",
              pset = "psets_cms4/main_pset_V10-02-04.py",
              pset_args = "data=False year=2018",
              global_tag = "102X_upgrade2018_realistic_v12",
              cmssw_version = "CMSSW_10_2_5",
              scram_arch = "slc6_amd64_gcc700",
              tag = tag,
              tarfile = "/nfs-7/userdata/libCMS3/lib_CMS4_V10-02-04_1025.tar.xz",
          )
          tasks = [gen, raw, aod, miniaod, cms4]
          for task in tasks:
              task.process()
              summary = task.get_task_summary()
              total_summary[task.get_sample().get_datasetname()] = summary
          StatsParser(data=total_summary, webdir="~/public_html/dump/metis/").do()
          time.sleep(30*60)
