Status of StoRM+Lustre and Multi-VO Support
YAN Tian, for the Distributed Computing Group Meeting, Oct. 23, 2014
StoRM + Lustre: Test Bed
SE server configuration:
  Model: Dell PowerEdge R620
  CPU: Xeon E5-2609 v2 @ 2.50 GHz
  CPU cores: 8
  Memory: 64 GB
  HDD: SCSI 300 GB
  Network 1: eth0, 1 Gbps
  Network 2: eth4, 10 Gbps
This test machine was originally prepared as a dCache+Lustre frontend, and thus has good network performance.
A symbolic link points to the Lustre directory; users can access files in this directory through the StoRM WebDAV portal.
StoRM + Lustre Test 1: single-thread download
• Test time: Oct 15, 17:50--18:40
• Lustre was not busy (load 7%, out 80 MB/s); for comparison, when Lustre is busy: out 500~1400 MB/s
• 20 files of size 1 GB
• Average download speed: 10.6 MB/s
• with eth0: 1 Gbps
• Load of SE: 0.8~1.1 load, 11~13% wa
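To put the measured rate in context, the snippet below compares it with the nominal capacity of the 1 Gbps eth0 link. It is only a back-of-the-envelope sketch: it assumes decimal units (1 Gbps = 125 MB/s, 1 GB = 1000 MB) and ignores protocol overhead, and the function names are illustrative.

```python
def link_utilization(rate_mb_per_s, link_gbps):
    """Fraction of the nominal link capacity used by a transfer.

    Assumes decimal units: 1 Gbps = 1000 Mbit/s = 125 MB/s.
    """
    capacity_mb_per_s = link_gbps * 1000 / 8
    return rate_mb_per_s / capacity_mb_per_s


def transfer_time_s(total_mb, rate_mb_per_s):
    """Seconds needed to move total_mb megabytes at a steady rate."""
    return total_mb / rate_mb_per_s


# Figures from Test 1: 10.6 MB/s average on the 1 Gbps interface.
utilization = link_utilization(10.6, 1)
# 20 files of 1 GB each, taking 1 GB = 1000 MB (an assumption).
total_time = transfer_time_s(20 * 1000, 10.6)

print(f"link utilization: {utilization:.1%}")
print(f"time for 20 GB:   {total_time / 60:.0f} min")
```

Under these assumptions the single-thread download uses about 8.5% of the nominal link capacity, and moving all 20 files takes roughly 31 minutes.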
StoRM + Lustre Test 2: multi-thread/multi-process download
• Multi-thread download tool: mytget; it can't start multi-thread mode for Lustre
• Multi-process wget download does not improve much: 22~33 MB/s
• Tested with 4 processes and 8 processes
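A multi-process wget test of this kind could be driven by a small script like the one below. This is a sketch rather than the script used for the test: the function names are hypothetical, the URL list must be supplied by the caller, and files are simply distributed round-robin over the wget processes.

```python
import subprocess


def partition(urls, n_procs):
    """Split urls round-robin into n_procs lists, one per wget process."""
    return [urls[i::n_procs] for i in range(n_procs)]


def wget_command(urls, dest_dir):
    """Build one wget invocation that fetches several URLs in sequence."""
    return ["wget", "-q", "-P", dest_dir] + list(urls)


def parallel_download(urls, n_procs, dest_dir="."):
    """Run up to n_procs wget processes concurrently; return their exit codes."""
    procs = [
        subprocess.Popen(wget_command(chunk, dest_dir))
        for chunk in partition(urls, n_procs)
        if chunk  # skip empty chunks when there are fewer URLs than processes
    ]
    return [p.wait() for p in procs]
```

Calling parallel_download(urls, 4) or parallel_download(urls, 8) with the WebDAV URLs of the test files would correspond to the 4-process and 8-process runs.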
StoRM + Lustre Test 3: Symbolic Link Problem
Modifying namespace.xml is being tried.
StoRM + Lustre Test: To Do
• Solve symbolic link problem
• Dataset transfer test between IHEPD-USER
• Open ports 50000:55000
• Dataset transfer test between WHU/USTC-USER
ILC-DIRAC Study: User Interface
• Python code, which can be executed directly
• A job script example:

from DIRAC.Core.Base import Script
Script.parseCommandLine()

from ILCDIRAC.Interfaces.API.DiracILC import DiracILC
# the dirac instance is the job receiver
dirac = DiracILC(True, "my_job_repository.rep")

from ILCDIRAC.Interfaces.API.NewInterface.UserJob import UserJob
job = UserJob()
job.setName("MyJobName")
job.setJobGroup("Agroup")
job.setCPUTime(86400)

from ILCDIRAC.Interfaces.API.NewInterface.Applications import Mokka, Marlin
# define the application and set its parameters
mo = Mokka()
mo.setLogFile("sim-job.log")
mo.setInputFile("init.macro")
mo.setOutputFile("E250-CDR_wo_Pnnh.eL.eR.001.slcio")
mo.setNumberOfEvents(1000)
job.append(mo)

# applications stack: Marlin takes its input from Mokka
mar = Marlin()
mar.setParameters("value")
mar.getInputFromApp(mo)
job.append(mar)

job.submit(dirac)
ILC-DIRAC Study: Job Repository
• The repository contains all necessary information about jobs
• For job monitoring: $ dirac-repo-monitor repo.cfg
• For retrieving all the output sandboxes and output data (calls 3 methods): $ dirac-repo-retrieve-jobs-output -r -O repo.cfg
• Repository is a functionality provided by DIRAC
ILC-DIRAC Study: Applications
• Many applications
  Generation: Whizard, Pythia, StdHepCut
  Simulation: Mokka, SLIC
  Reconstruction: Marlin, LCSIM, SLICPandora
  Analysis: Marlin, ROOT, Druid, etc.
• A command for users to query the available applications and their versions: $ dirac-ilc-show-software
• Applications are all defined in the modules
  – ILCDIRAC.Interfaces.API.NewInterface.Application (base class)
  – ILCDIRAC.Interfaces.API.NewInterface.Applications
• In job script:

from ILCDIRAC.Interfaces.API.NewInterface.Applications import Mokka
mo = Mokka()
mo.setParameters1("value1")
mo.setParameters2("value2")
job.append(mo)

• A generic application for executables outside ILCsoft, e.g.:

ga = GenericApplication()
ga.setScript("boss.exe")
ga.setArguments("jobOptions.txt")
ILC-DIRAC Study: User Input Data
• For ILC analysis jobs, users always need their own library files (*.so)
• ILC solution: upload to SE, download to WN
$ tar czf lib.tar.gz lib/
$ dirac-dms-add-files /ilc/user/i/initial/some/path/lib.tar.gz lib.tar.gz CERN-SRM
$ dirac-dms-remove-files /ilc/user/i/initial/some/path/lib.tar.gz
• In job script: job.setInputSandbox("LFN:/ilc/user/i/initial/some/path/lib.tar.gz")
• ILC allows users to use $ dirac-dms-filecatalog
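The packaging step can also be done from Python before the upload, for instance inside a submission script. The helper below is a hypothetical sketch using only the standard library; it mirrors `tar czf lib.tar.gz lib/` and does not touch the SE or the file catalog.

```python
import tarfile


def pack_library(lib_dir="lib", archive="lib.tar.gz"):
    """Create a gzip-compressed tarball of lib_dir, like `tar czf`."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(lib_dir)  # adds the directory recursively
    return archive
```

The resulting archive is what would then be uploaded with the dirac-dms command shown on the slide.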
ILC-DIRAC Study: Class Inheritance
DIRAC classes -> ILC-DIRAC classes
Dirac -> DiracILC
Spliter?
Job -> UserJob, ProductionJob
Application -> Applications
ModuleBase -> MokkaAnalysis, MarlinAnalysis, PythiaAnalysis, etc.
ILC-DIRAC: Module Example (MokkaAnalysis)
Called by Job Agent:

from ILCDIRAC.Workflow.Modules import MokkaAnalysis
ma = MokkaAnalysis()
ma.execute()

In this module:
1. retrieve job parameters
2. write a shell script to
   a) set environment
   b) run application
   c) return status code
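The steps listed for the module can be illustrated with a standalone sketch. This is not the ILCDIRAC implementation: the function names, the layout of the params dictionary, and the script name run_app.sh are all assumptions made for the example, which only shows the pattern of generating a wrapper script, running it, and returning its status code.

```python
import os
import stat
import subprocess
import tempfile


def write_wrapper(path, env, command):
    """Write a shell script that sets the environment, then runs the application."""
    lines = ["#!/bin/sh"]
    lines += [f'export {name}="{value}"' for name, value in env.items()]
    lines.append(command)
    lines.append("exit $?")  # propagate the application's status code
    with open(path, "w") as script:
        script.write("\n".join(lines) + "\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def execute(params):
    """Generate the wrapper from job parameters, run it, return the status code."""
    with tempfile.TemporaryDirectory() as workdir:
        script = write_wrapper(
            os.path.join(workdir, "run_app.sh"), params["env"], params["command"]
        )
        return subprocess.call(["/bin/sh", script])
```

A call such as execute({"env": {"X": "1"}, "command": "some_app"}) writes the wrapper, runs it with /bin/sh, and hands back whatever exit status the (hypothetical) application produced.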