The Pilot WLCG Service: Last steps before full production




  1. ISGC – Taipei, May 2006
  The Pilot WLCG Service: Last steps before full production
  i) Definition of what the service actually is
  ii) Highlight steps that still need to be taken
  iii) Issues & concerns
  Jamie Shiers, CERN

  2. Abstract
  The production phase of Service Challenge 4 – also known as the Pilot WLCG Service – is due to start at the beginning of June 2006. This leads to the full production WLCG service from October 2006. The WLCG pilot is thus the final opportunity to shake down not only the services provided as part of the WLCG computing environment – including their functionality – but also the operational and support procedures that are required to offer a full production service.
  This talk will describe in detail all aspects of the service, together with the currently planned production and test activities of the LHC experiments to validate their computing models as well as the service itself.
  [ Probably no time for this, but slides included… ]
  There is a Service Challenge talk on Thursday – detail about the recent SC4 T0-T1 disk-disk and disk-tape activities + Tx-Ty transfers then…

  3. Caveat
  I realize that this presentation is primarily oriented towards the WLCG community. However, I had the choice of:
  - Something targeted at the general Grid community (and probably of little interest to WLCG…)
  - Something addressing the main concerns of delivering, in the IMMEDIATE FUTURE, reliable production services for the WLCG (with possibly some useful messages for the general community)
  I have chosen the latter…
  There will nevertheless be a few introductory slides…

  4. The Worldwide LHC Computing Grid (LCG)
  Purpose:
  - Develop, build and maintain a distributed computing environment for the storage and analysis of data from the four LHC experiments
  - Ensure the computing service
  - … and common application libraries and tools
  Phase I – 2002-05 – Development & planning
  Phase II – 2006-2008 – Deployment & commissioning of the initial services
  The solution!
  les.robertson@cern.ch – HEPiX Rome, 05 Apr 06

  5. Data Handling and Computation for Physics Analysis (CERN / LCG)
  [Dataflow diagram: the detector feeds the event filter (selection & reconstruction); raw data go to reconstruction, producing event summary data and processed data; batch physics analysis and reprocessing yield analysis objects (extracted by physics topic); event simulation feeds into the chain; interactive physics analysis works from the analysis objects.]
  les.robertson@cern.ch – HEPiX Rome, 05 Apr 06

  6. LCG Service Hierarchy
  Tier-0 – the accelerator centre:
  - Data acquisition & initial processing
  - Long-term data curation
  - Data distribution to Tier-1 centres
  Tier-1 – "online" to the data acquisition process; high availability:
  - Managed Mass Storage – grid-enabled data service
  - All re-processing passes
  - Data-heavy analysis
  - National, regional support
  The Tier-1 centres: Canada – TRIUMF (Vancouver); France – IN2P3 (Lyon); Germany – Karlsruhe; Italy – CNAF (Bologna); Netherlands – NIKHEF/SARA (Amsterdam); Nordic countries – distributed Tier-1; Spain – PIC (Barcelona); Taiwan – Academia Sinica (Taipei); UK – CLRC (Oxford); US – FermiLab (Illinois) and Brookhaven (NY)
  Tier-2 – ~100 centres in ~40 countries:
  - Simulation
  - End-user analysis – batch and interactive
  - Services, including Data Archive and Delivery, from Tier-1s
  les.robertson@cern.ch – HEPiX Rome, 05 Apr 06

  7. LCG Summary of Computing Resource Requirements
  All experiments – 2008. From LCG TDR – June 2005.

                          CERN   All Tier-1s   All Tier-2s   Total
  CPU (MSPECint2000s)       25            56            61     142
  Disk (PetaBytes)           7            31            19      57
  Tape (PetaBytes)          18            35             –      53

  [Pie charts of the shares – CPU: CERN 18%, Tier-1s 39%, Tier-2s 43%; Disk: CERN 12%, Tier-1s 55%, Tier-2s 33%; Tape: CERN 34%, Tier-1s 66%.]
  last update 02/05/2006 13:11 – les robertson – cern-it
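  The percentage splits in the pie charts follow directly from the table. As a quick cross-check, here is a minimal Python sketch with the numbers copied from the table above; the printed shares agree with the slide up to rounding:

```python
# Resource requirements for 2008 from the LCG TDR (June 2005), as in the
# table above. Units: CPU in MSPECint2000s, disk and tape in PetaBytes.
requirements = {
    "CPU":  {"CERN": 25, "Tier-1s": 56, "Tier-2s": 61},
    "Disk": {"CERN": 7,  "Tier-1s": 31, "Tier-2s": 19},
    "Tape": {"CERN": 18, "Tier-1s": 35},  # no Tier-2 tape requirement
}

for resource, shares in requirements.items():
    total = sum(shares.values())
    split = ", ".join(f"{site} {100 * value / total:.0f}%"
                      for site, value in shares.items())
    print(f"{resource} (total {total}): {split}")

# Output:
#   CPU (total 142): CERN 18%, Tier-1s 39%, Tier-2s 43%
#   Disk (total 57): CERN 12%, Tier-1s 54%, Tier-2s 33%
#   Tape (total 53): CERN 34%, Tier-1s 66%
```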

  8. What are the requirements for the WLCG?
  Over the past 18-24 months, we have seen:
  - The LHC Computing Model documents and Technical Design Reports;
  - The associated LCG Technical Design Report;
  - The finalisation of the LCG Memorandum of Understanding (MoU).
  Together, these define not only the functionality required (Use Cases), but also the requirements in terms of Computing, Storage (disk & tape) and Network – but not necessarily in a site-accessible format…
  We also have close-to-agreement on the Services that must be run at each participating site: Tier0, Tier1, Tier2, VO-variations (few) and specific requirements.
  We also have close-to-agreement on the roll-out of Service upgrades to address critical missing functionality.
  We have an on-going programme to ensure that the service delivered meets the requirements, including the essential validation by the experiments themselves.

  9. More information on the LCG
  Experiments' Computing Models – see the LCG Planning Page.
  Technical Design Reports:
  - LCG TDR – Review by the LHCC
  - ALICE TDR supplement: Tier-1 dataflow diagrams
  - ATLAS TDR supplement: Tier-1 dataflow
  - CMS TDR supplement: Tier-1 Computing Model
  - LHCb TDR supplement: Additional site dataflow diagrams
  GDB Workshops:
  - Mumbai Workshop – see the GDB Meetings page for experiment presentations and documents
  - Tier-2 workshop and tutorials – CERN, 12-16 June. Both workshop & tutorials! Please register asap!
  les.robertson@cern.ch – HEPiX Rome, 05 Apr 06

  10. How do we measure success?
  By measuring the service we deliver against the MoU targets:
  - Data transfer rates;
  - Service availability and time to resolve problems;
  - Resources provisioned across the sites, as well as measured usage…
  By the "challenge" established at CHEP 2004:
  - [ The service ] "should not limit ability of physicist to exploit performance of detectors nor LHC's physics potential"
  - "…whilst being stable, reliable and easy to use"
  - Preferably both…
  Equally important is our state of readiness for startup / commissioning, which we know will be anything but steady state.
  [ Oh yes, and that favourite metric I've been saving… ]
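  To make the availability target concrete, here is a minimal sketch of how such a check might look, assuming a simple per-site downtime log; the site names, the numbers and the 97% threshold are illustrative placeholders, not the actual MoU figures:

```python
from dataclasses import dataclass

@dataclass
class Downtime:
    site: str
    hours: float      # hours of unavailability in the reporting period
    scheduled: bool   # scheduled interventions may be treated separately

# Hypothetical one-month reporting period and downtime log
# (site names and numbers are invented for illustration).
PERIOD_HOURS = 30 * 24
downtimes = [
    Downtime("Tier1-A", 12.0, scheduled=True),
    Downtime("Tier1-A", 5.5, scheduled=False),
    Downtime("Tier1-B", 30.0, scheduled=False),
]

# Placeholder threshold -- the real MoU defines per-tier availability targets.
MOU_TARGET = 0.97

def availability(site: str, include_scheduled: bool = False) -> float:
    """Fraction of the reporting period the site was available."""
    lost = sum(d.hours for d in downtimes
               if d.site == site and (include_scheduled or not d.scheduled))
    return 1.0 - lost / PERIOD_HOURS

for site in sorted({d.site for d in downtimes}):
    a = availability(site)
    status = "meets target" if a >= MOU_TARGET else "BELOW TARGET"
    print(f"{site}: {100 * a:.2f}% available -> {status}")
```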

  11. The Requirements
  (And test extensively, with both 'dteam' and other VOs.)
  Resource requirements, e.g. ramp-up in Tier-N CPU, disk, tape and network:
  - Look at the Computing TDRs;
  - Look at the resources pledged by the sites (MoU etc.);
  - Look at the plans submitted by the sites regarding acquisition, installation and commissioning;
  - Measure what is currently (and historically) available; signal anomalies.
  Functional requirements, in terms of services and service levels, including operations, problem resolution and support:
  - Implicit / explicit requirements in the Computing Models;
  - Agreements from the Baseline Services Working Group and Task Forces;
  - Service Level definitions in the MoU;
  - Measure what is currently (and historically) delivered; signal anomalies.
  Data transfer rates – the Tier-X ↔ Tier-Y matrix:
  - Understand Use Cases;
  - Measure …
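  The "measure and signal anomalies" steps lend themselves to simple automation. A minimal sketch of a pledged-versus-measured check, with all site names, numbers and the 10% tolerance invented for illustration:

```python
# Hypothetical pledged (MoU) versus measured resources per site.
# All site names and numbers are invented for illustration.
RESOURCES = ("CPU (kSI2k)", "disk (TB)", "tape (TB)")

pledged = {
    "Tier1-A": (1000, 500, 800),
    "Tier1-B": (1500, 700, 1200),
}
measured = {
    "Tier1-A": (950, 480, 800),
    "Tier1-B": (900, 700, 600),
}

TOLERANCE = 0.10  # signal when a measurement falls >10% below the pledge

for site, pledge in pledged.items():
    for name, p, m in zip(RESOURCES, pledge, measured[site]):
        shortfall = (p - m) / p
        if shortfall > TOLERANCE:
            print(f"ANOMALY: {site} {name}: measured {m} vs pledged {p} "
                  f"({100 * shortfall:.0f}% below pledge)")
```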

  12. LCG Service Challenges
  Purpose:
  - Understand what it takes to operate a real grid service – run for weeks/months at a time (not just limited to experiment Data Challenges)
  - Trigger and verify Tier-1 & large Tier-2 planning and deployment – tested with realistic usage patterns
  - Get the essential grid services ramped up to target levels of reliability, availability, scalability, end-to-end performance
  Four progressive steps from October 2004 through September 2006:
  - End 2004 – SC1 – data transfer to a subset of Tier-1s
  - Spring 2005 – SC2 – include mass storage, all Tier-1s, some Tier-2s
  - 2nd half 2005 – SC3 – Tier-1s, >20 Tier-2s – first set of baseline services
  - Jun-Sep 2006 – SC4 – pilot service
  - Autumn 2006 – LHC service in continuous operation – ready for data taking in 2007
  les.robertson@cern.ch – HEPiX Rome, 05 Apr 06

  13. WLCG Service Deadlines
  [Timeline:]
  - 1 Jun 06 – Pilot Services: stable service (2006 – cosmics)
  - 1 Oct 06 – LHC Service in operation; over the following six months, ramp up to full operational capacity & performance
  - 1 Apr 07 – first LHC service commissioned (2007 – physics)
  - 2008 – full physics run

  14. SC4 – the Pilot LHC Service, from June 2006 (LCG)
  A stable service on which experiments can make a full demonstration of the experiment offline chain: DAQ → Tier-0 → Tier-1 – data recording, calibration, reconstruction.
  Offline analysis – Tier-1 ↔ Tier-2 data exchange: simulation, batch and end-user analysis.
  And sites can test their operational readiness:
  - Service metrics → MoU service levels
  - Grid services
  - Mass storage services, including magnetic tape
  Extension to most Tier-2 sites.
  An evolution of SC3 rather than lots of new functionality.
  In parallel:
  - Development and deployment of distributed database services (3D project)
  - Testing and deployment of new mass storage services (SRM 2.1)
  les.robertson@cern.ch – HEPiX Rome, 05 Apr 06

  15. LCG Production Services: Challenges
  - Why is it so hard to deploy reliable, production services?
  - What are the key issues remaining?
  - How are we going to address them?
  les.robertson@cern.ch – HEPiX Rome, 05 Apr 06

  16. LCG Production WLCG Services (a) – The building blocks
  les.robertson@cern.ch – HEPiX Rome, 05 Apr 06
