Raw-Data Reconstruction with PROOF
C. Cheshkov, P. Hristov
24/10/2008, ALICE Offline Week

Many thanks to: Andrei, Federico, Fons, Jan, Gerri, Latchezar, Rene for the discussions and help, and Marco & Jan Fiete for their great support on CAF!

What is it all about?
• Run AliRoot raw-data reco in parallel mode using PROOF
• Fast reconstruction feedback
• Tuning of reco-parameters and code
• Fast test before going to full-blast AliEn production
• It may sound a bit abstract, but in fact that was one of the options we needed urgently during the LHC start-up
• Use case:
  − Number of raw-data files << number of slaves
  − Higher (compared to ESD) event size and processing time

Contents
• Overview
• AliReconstruction implemented as TSelector
  − Input List
  − Code executed on slaves (also I/O)
  − Output files
• Performance on CAF
• Documentation
• Outlook

July 08, Federico's office: "Let's make a TSelector"

It turned out to be a bit more difficult...
[Diagram: original vs. modified AliReconstruction flow]
Original:
  − Init Run: Create RawReader; Init CDB; Set run# from RAW; Init Geometry; Init GRP; Init Reco-params; Init run-loaders; Init vertexers and trackers; Open ESD files; Init QA mgr
  − Reconstruct Event: ...; In-loop QA; ...
  − Finish Run: Out-of-loop QA; Close files etc.; ESD tags; Finish QA
Modified:
  − InitRun: Create RawReader; Init CDB; Set run# from RAW; Init Geometry; Init GRP; Load all needed OCDB entries; Unload used OCDB entries; Init Reco-params
  − SlaveBegin: Read selector input list; Init run-loaders; Init vertexers and trackers; Open ESD files; Init QA mgr
  − Process: Recreate RawReader; ...; In-loop QA; ...
  − SlaveTerminate: Out-of-loop QA; Close files etc.; Finish QA
  − Terminate: ESD tags
Also labelled in the diagram: Intermediate ESD files
TkDiff 4.1.4 report: number of diffs: 104 (17 regions were deleted, 20 regions were added, 67 regions were changed)

AliReconstruction :: public TSelector
[Diagram: CLIENT, MASTER, SLAVES]
CLIENT
  − Input: collection of raw-data files
  − InitRun: Create raw-reader; Get run #; Load OCDB
  − Input List (through gProof): AliReconstruction; TGeo geometry; OCDB entries; Mag field map
  − Run: Raw-data-chain->Process()
  − Terminate: Create tags
MASTER
  − Merging ESD files
SLAVES
  − SlaveBegin: Restore AliReconstruction; Init run-loaders; Init vertexers/trackers; Open ESD files; Init QA mgr
  − Process: Recreate raw-reader from tree entry; Standard event reco; Event QA
  − SlaveTerminate: Close files etc.; Finish QA
Other diagram labels: Raw data (xrootd), OCDB, ESD

AliReconstruction :: public TSelector
• Completely transparent for the user:
  − Prepare collection of raw-data files (one run)
  − Open PROOF session
  − Enable AliRoot (reco libs) on master and slaves
  − Run your favourite (or standard) rec.C by providing files collection as input (“collection://xxx”)
  − One can use any specific OCDB storage, custom reco options, reco-params etc.
  − If PROOF session is not opened -> runs locally and allows quick check/debugging of the AliRoot code
• Same code base used as if running sequentially
• Note: One has to patch v5-21-01-alice with the fixes in PROOF output files merging

Input List
• Contains all 'parameter-like' objects common to slaves
• Single access to OCDB from client machine
• Allows customizing the reconstruction (as if running locally)
• Size:
  − Dominated by OCDB entries
  − Depends on the active detectors
  − From a few MB to ~50-60 MB at most
• It took quite some time to debug the code, as the input list was copied on the master (the default constructor of TGeoManager deletes the previous instance)

On Slaves
• First read input list:
  − Restore AliReconstruction state from client session
  − Restore OCDB manager state <- OCDB entries
  − Set TGeo geometry and field map
• Initialize:
  − AliRoot run-loaders (for managing intermediate reco files)
  − Open ESD/ESD-friend files & initialize QA mgr
• Process:
  − Get AliRawEvent entry from raw-data tree
  − Recreate raw-reader out of it
  − Run standard single-event reco
• Finally:
  − Close all files & finalize QA
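
The worker-side sequence above (restore state from the input list, process entries one by one, close up, then merge) can be pictured with a small toy model. This is only a sketch in plain Python: `ToyRecoSelector`, `run_on_workers`, the run number and the input-list keys are all hypothetical names chosen for illustration, and none of it is the AliReconstruction or ROOT TSelector API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRecoSelector:
    """Toy stand-in for a selector running on one worker (hypothetical)."""
    input_list: dict
    state: dict = field(default_factory=dict)
    esd: list = field(default_factory=list)
    closed: bool = False

    def slave_begin(self):
        # Restore the state shipped from the client; no OCDB access here.
        self.state = dict(self.input_list)

    def process(self, raw_event):
        # Recreate a (toy) raw-reader from the tree entry, then "reconstruct".
        reader = {"run": self.state["run"], "payload": raw_event}
        self.esd.append(("esd", reader["run"], reader["payload"]))

    def slave_terminate(self):
        # Stand-in for closing files and finalizing QA.
        self.closed = True


def run_on_workers(input_list, packets):
    """Run one selector per packet of events and merge the per-worker output."""
    merged = []
    for packet in packets:
        selector = ToyRecoSelector(input_list)
        selector.slave_begin()
        for raw_event in packet:
            selector.process(raw_event)
        selector.slave_terminate()
        merged.extend(selector.esd)
    return merged


# Hypothetical input list and three packets of event indices.
input_list = {"run": 12345, "geometry": "toy-geo", "field_map": "toy-field"}
esds = run_on_workers(input_list, [[0, 1], [2], [3, 4]])
```

The point of the sketch is that the per-event step only needs what arrived in the input list, which is why a single OCDB access on the client is enough.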

On Slaves (Local I/O)
• AliRoot reco output files:
  − ESD
  − ESD-friend (switchable)
  − QA files (switchable)
  − Intermediate (RecPoints, Digits) files (was not switchable)
  − Log (switchable)
• If we get rid of intermediate files, the I/O would be minimal

File-less Run-Loaders
• Idea: TFile -> TDirectory (no I/O, all event objects in memory)
• Was implemented and tested (details on performance slides)
  − Controlled via the url of galice.root file
  − Some methods not implemented at TDirectory level
  − In AliReconstruction: disable unloading/loading and writing of rec-points, digits data
  − At the moment code is unstable (problems with ROOT garbage collection)
• Will be committed as soon as we get more confidence

Output Files
• Using output-file merging functionality in PROOF (based on TFileMerger)
• By default
  − Only ESD files are merged
  − Resulting file arrives locally
• Necessitates running an xrootd daemon on the client machine
• Check the way to run and configure xrootd on the CAF-reco web page (shown at the end of the talk)

Output Files
• How to merge ESD-friend files?
  − Opened transparently inside ROOT
  − How to create PROOF-output file?
• Possibility to specify another output ESD file location (url)
  − Make resulting ESDs available in CAF as data-set (optionally)
• Possibility to retrieve other output files (expert mode)
• AliReconstruction::SetOutput(url) ?
  − If location is file url – store ESD (and ESD-friends) there
  − If location is folder – store all output files (one sub-folder per slave)
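
The two cases proposed for SetOutput(url) can be sketched as a small planning function. This is a hedged illustration only: `plan_outputs` is a hypothetical helper, treating a location that ends in `.root` as a "file url" is an assumption of this sketch, the entries in `OUTPUT_FILES` (in particular the QA file name) are placeholders, and the host names are made up.

```python
# Placeholder output names for illustration; the QA name is an assumption.
OUTPUT_FILES = ("AliESDs.root", "AliESDfriends.root", "QA.root")


def plan_outputs(location, workers):
    """Map an output location to target paths (toy model of the proposal)."""
    if location.endswith(".root"):
        # File url: the merged ESD is stored at exactly this location.
        # How the ESD-friends are placed alongside is left open here.
        return {"merged": [location]}
    # Folder: keep every output file, one sub-folder per worker.
    base = location.rstrip("/")
    return {w: [f"{base}/{w}/{name}" for name in OUTPUT_FILES] for w in workers}


file_plan = plan_outputs("root://host.example//data/AliESDs.root", ["w0", "w1"])
folder_plan = plan_outputs("root://host.example//out/", ["w0", "w1"])
```

In the folder case each worker's files stay separate, which matches the "expert mode" idea of retrieving everything rather than only the merged ESD.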

Performance on CAF...

Performance – init time
• Linear dependence on #slaves
• Slope depends mainly on the size of OCDB objects sent to slaves
• Gerri is implementing the concept of 'input data'
  − Data is uploaded on slaves storage, similarly to PAR files
  − Transparent to the user
  − Updated only when input data changes
  − There should be no dependence on #slaves -> init time will be diminished to a few s
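
A rough way to read the linear dependence is a fixed cost plus a per-slave cost proportional to the input-list size. The model below is a sketch under stated assumptions, not a fit to the measurements: `init_time_s`, the fixed cost `fixed_s` and the effective transfer rate `mb_per_s` are hypothetical, and it assumes the input list is sent to the slaves one after another. Only the 50 MB input-list size is taken from the range quoted in the talk.

```python
def init_time_s(n_slaves, input_list_mb, fixed_s=5.0, mb_per_s=50.0):
    """Toy model: fixed cost plus one input-list transfer per slave."""
    return fixed_s + n_slaves * input_list_mb / mb_per_s


def init_time_cached_s(fixed_s=5.0):
    """Toy model for pre-uploaded 'input data': no per-slave transfer."""
    return fixed_s


t10 = init_time_s(10, 50.0)   # 10 slaves, 50 MB input list
t20 = init_time_s(20, 50.0)   # doubling the slaves doubles the variable part
slope = (t20 - t10) / 10      # seconds per additional slave in this model
```

In this picture the slope is simply the input-list size divided by the transfer rate, so uploading the data once removes the term that grows with the number of slaves.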

Performance – processing rate

Performance – local I/O
[Bar chart: processing rate in ev/s (4 files, 2.2 GB, 1200 evts), axis from 0 to 16, for the configurations: No QA; No QA, ESD-friend; No QA, ESD-friend, run-loaders; No QA, ESD-friend, run-loaders, log]

Performance – cache size
• Packets much shorter than tree cache (and xrootd read-ahead?) size
  − Leads to an overhead in the input data rate
  − More slaves -> packets become shorter -> effect is more pronounced
• One has to play with both sizes
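
The overhead described above can be illustrated with a simple ratio of fetched to useful bytes. This is a toy model under a strong assumption: the cache (or read-ahead) is filled once per packet and whatever lies beyond the packet is discarded. The function name `read_overhead` and all numbers are hypothetical and are not CAF measurements.

```python
def read_overhead(packet_events, event_bytes, cache_bytes):
    """Ratio of bytes fetched to bytes actually used for one packet."""
    useful = packet_events * event_bytes
    # At least one full cache fill is fetched, even for a short packet.
    fetched = max(useful, cache_bytes)
    return fetched / useful


# Hypothetical numbers: 1 MB events and a 30 MB cache.
short_packet = read_overhead(10, 1_000_000, 30_000_000)
long_packet = read_overhead(100, 1_000_000, 30_000_000)
```

With these numbers the short packet fetches three times what it uses while the long one has no overhead, which is the trend expected when more slaves share the same data and packets shrink.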

Test w/o read-ahead in xrootd client and smaller tree cache
• Processing rate:
  − No effect for the runs with more detectors
  − Performance of ITS-only greatly improved
• Overall slow-down is most likely due to AliRoot update ;-)

Test w/o read-ahead in xrootd client and smaller tree cache
• Input data rate:
  − Now the effect is pronounced not only for the ITS run, but also for all runs with small event size
  − I guess ROOT resets tree cache depending on the entry size
• Some crashes with smaller cache size