World Wide Computing and the ATLAS Experiment


  1. World Wide Computing and the ATLAS Experiment
     Taipei, 25th July 2004
     Roger Jones, ATLAS International Computing Board Chair

  2. Complexity of the Problem
     - Major challenges associated with:
       - Communication and collaboration at a distance (collaborative tools)
       - Distributed computing resources
       - Remote software development and physics analysis

  3. Context
     - There is a real challenge presented by the large data volumes (~10 PB/year) expected by 2007 and beyond (scale illustrated below)
     - Despite Moore's Law, we will need large-scale distributed computing → Grids
     - There are many Grid initiatives trying to make this possible
     - ATLAS is a worldwide collaboration, and so we span most Grid projects
       - This has strengths and weaknesses!
       - We benefit from all developments
       - We have problems maintaining coherence
       - We will ultimately be working with several Grids (with defined interfaces)
       - This may not be what funders like the EU want to hear!
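To put the ~10 PB/year figure into perspective, here is a minimal back-of-envelope sketch in Python. The per-disk and per-tape capacities are illustrative assumptions about circa-2004 media, not numbers from the talk.

    # Scale of ~10 PB/year in 2004-era storage hardware.
    # 250 GB/disk and 200 GB/tape cartridge are assumed, illustrative sizes.
    data_pb_per_year = 10
    disk_gb = 250
    tape_gb = 200

    disks = data_pb_per_year * 1e6 / disk_gb   # PB -> GB, then per disk
    tapes = data_pb_per_year * 1e6 / tape_gb

    print(f"~{disks:,.0f} disks or ~{tapes:,.0f} tape cartridges per year")
    # Hence the push towards distributed (Grid) storage and computing
    # rather than a single-site solution.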

  4. The ATLAS Data
     - ATLAS
       - Not one experiment!
       - A facility for many different measurements and physics topics
     - Event selection (rates checked below)
       - 1 GHz pp collision rate
       - 40 MHz bunch-crossing rate
       - 200 Hz event rate to mass storage
       - Real-time selection
         - Leptons
         - Jets
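A quick check of what these rates imply, using only the numbers on the slide plus two stated assumptions (raw event size and yearly live time) that are not given in the talk.

    # Online rejection factor and raw-data volume implied by the slide's rates.
    # The 1.6 MB event size and 1e7 s/year of data-taking are assumptions.
    bunch_crossing_hz = 40e6    # 40 MHz bunch-crossing rate
    storage_rate_hz = 200       # 200 Hz written to mass storage
    event_size_mb = 1.6         # assumed raw event size
    live_seconds = 1e7          # assumed data-taking time per year

    rejection = bunch_crossing_hz / storage_rate_hz            # ~2e5
    raw_pb_per_year = storage_rate_hz * event_size_mb * live_seconds / 1e9

    print(f"online rejection factor: ~{rejection:.0e}")
    print(f"raw data volume: ~{raw_pb_per_year:.1f} PB/year")
    # With ESD, AOD and simulated data on top, the total approaches the
    # ~10 PB/year quoted on the previous slide.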

  5. From Trigger to Reconstruction
     - Tier-0 reconstruction: 15 kSI2k-s per event; 1.3 kSI2k = one 3.2 GHz Pentium 4 (CPU scale estimated below)
     - [Slide diagram: trigger/DAQ data flow from LVL1 (2.5 µs latency) through the RODs/ROBs/ROS, the RoI builder and LVL2 (L2SV, L2P), event building (DFM, SFI, ~4 GB/s) and the Event Filter (EFP nodes, ~seconds per event) to the SFOs, with ~300 MB/s written to mass storage at CERN (raw + calibration data, ESD, etc.)]
     - Slide table (CERN Tier-0 storage and CPU, split into raw + calibration data vs. ESD etc.): automated tape 3216 / 1000 TB, disk 40 / 800 TB, CPU 4058 / 0 kSI2k
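A rough sanity check on the Tier-0 CPU scale, using only figures from this slide and slide 4.

    # Tier-0 reconstruction power implied by 200 Hz to storage,
    # 15 kSI2k-s/event and 1.3 kSI2k per 3.2 GHz Pentium 4.
    event_rate_hz = 200
    cost_ksi2k_s = 15
    box_ksi2k = 1.3

    needed_ksi2k = event_rate_hz * cost_ksi2k_s     # ~3000 kSI2k
    needed_boxes = needed_ksi2k / box_ksi2k         # ~2300 P4-class nodes

    print(f"Tier-0 reconstruction: ~{needed_ksi2k:.0f} kSI2k "
          f"(~{needed_boxes:.0f} P4-class CPUs)")
    # Broadly consistent with the ~4000 kSI2k CPU figure in the table,
    # which presumably also covers calibration and reprocessing headroom
    # (an interpretation, not stated on the slide).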

  6. The Global View
     - Distribution to ~6 Tier-1s; each of N Tier-1s holds 2/N of the reconstructed data
     - Doing research therefore requires a sophisticated software infrastructure for complete and convenient data access for the whole collaboration, and sufficient network bandwidth (2.5 Gb/s) to keep up with the data transfer from Tier-0 to the Tier-1s (bandwidth check below)
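How the 2/N replication and the 2.5 Gb/s link fit together, treating the ~300 MB/s Tier-0 output rate from the previous slide as an upper bound on what has to be exported; the exact exported data mix is an assumption, not spelled out on the slide.

    # Sustained Tier-0 -> Tier-1 transfer rate per site under the 2/N scheme.
    n_tier1 = 6                      # ~6 Tier-1 centres
    share_per_tier1 = 2 / n_tier1    # each holds 2/N -> 1/3 of the data
    t0_output_mb_s = 300             # Tier-0 output rate (upper bound)

    per_t1_gb_s = t0_output_mb_s * share_per_tier1 * 8 / 1000   # MB/s -> Gb/s

    print(f"each Tier-1 holds {share_per_tier1:.0%} of the reconstructed data")
    print(f"sustained transfer per Tier-1: ~{per_t1_gb_s:.1f} Gb/s "
          f"vs the 2.5 Gb/s quoted on the slide")
    # The 2.5 Gb/s figure therefore leaves roughly 3x headroom for peaks
    # and for catching up after downtime.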

  7. Digital Divide
     - There is an opportunity to move closer to a single scientific community: wealthy/developed/emerging/poor
     - This requires investment on all sides
     - For the 'haves'
       - International networks
       - Grid infrastructure and toolkits
       - An open approach
     - For the 'have less'
       - Investment in local networks and infrastructure
       - Regional Tier-2s
       - A negotiated relationship with a Tier-1

  8. Digital Divide
     - Excellent networking is assumed in the Grid model; this requires global high-bandwidth connectivity
     - Projects like GLORIAD (GLobal RIng network for Advanced applications Development) will provide this: a 10 Gb/s network planned to start in 2004
       - Allows countries to contribute to ATLAS computing through in-kind local capacity
       - High-bandwidth integration with the ATLAS infrastructure
       - MC data produced locally may stay there and be accessed by ATLAS collaborators (transfer times illustrated below)
       - Will give member countries a collaborative-tools connection to the collaboration
     - All local institutions in ATLAS must have significant bandwidth to the local Tier-2
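For a feel of what "significant bandwidth" means in practice, a minimal sketch of transfer times for shipping a locally produced MC sample; the 30 TB sample size and the 50% link efficiency are made-up illustration values, and only the link speeds echo figures in the talk.

    # Time to move an MC sample over links of different speeds.
    sample_tb = 30          # illustrative sample size
    efficiency = 0.5        # assume ~50% of nominal throughput is achieved

    for link_gb_s in (0.155, 2.5, 10.0):   # modest link, Tier-1 link, GLORIAD
        seconds = sample_tb * 8e3 / (link_gb_s * efficiency)   # TB -> Gb
        print(f"{link_gb_s:6.3f} Gb/s link: ~{seconds / 86400:.1f} days")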

  9. Digital Divide
     - Needed ATLAS computing capacity in countries with small numbers of users, e.g. China (10-15 end users, 88% for local usage, 12% shared with the whole collaboration):
       - 120 kSI2k CPU
       - 80 TB disk
       - 34 TB tape/slow media
       - Internal connection to the local Tier-2: 0.16-0.65 Gb/s
     - Variation 1 (sized below)
       - Copy of the 3 TB/year TAG at the local institution
       - Network to the Tier-2 on the low end
       - ~0.5 TB locally per user
       - ~1 kSI2k locally per user
     - Variation 2
       - Access the 3 TB TAG at the local Tier-2
       - Network to the Tier-2 on the high end, less disk space locally
       - Good PC as workstation for each researcher
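A minimal Variation-1 sizing for such a group, combining the per-user figures and the TAG copy from this slide with the 1.3 kSI2k-per-P4 conversion from slide 5; taking the upper end of the 10-15 user range is my choice, not the slide's.

    # Local resources for a 15-user group under Variation 1.
    users = 15
    disk_per_user_tb = 0.5
    cpu_per_user_ksi2k = 1.0
    tag_copy_tb = 3.0
    ksi2k_per_p4 = 1.3      # from the trigger/reconstruction slide

    local_disk_tb = users * disk_per_user_tb + tag_copy_tb   # ~10.5 TB
    local_cpu_ksi2k = users * cpu_per_user_ksi2k             # ~15 kSI2k

    print(f"local disk: ~{local_disk_tb:.1f} TB (including the TAG copy)")
    print(f"local CPU : ~{local_cpu_ksi2k:.0f} kSI2k "
          f"(~{local_cpu_ksi2k / ksi2k_per_p4:.0f} P4-class boxes)")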

  10. ATLAS Components
     - Grid Projects
       - Develop the middleware
       - Provide hardware resources
       - Provide some manpower resource
       - But also drain resources from our core activities
     - Computing Model
       - Dedicated group to develop the computing model
       - Revised resources and planning paper evolving: Sep 2004
       - Now examine from DAQ to end user
       - Must include university/local resources
       - Devise various scenarios with different distributions of data
     - Data Challenges
       - Test the computing model
       - Service other needs in ATLAS (but this must be secondary in DC2)

  11. Grid Projects
     - Until these projects provide interoperability, the experiments must provide it themselves
     - [Slide shows the Grid projects ATLAS spans, including EGEE]

  12. [Gartner hype-cycle sketch: hype vs. time, marking the technology trigger, peak of inflated expectations (annotated with a '?'), trough of disillusionment, slope of enlightenment and plateau of productivity]

  13. Test Bench: Data Challenges
     - DC1, Jul 2002 - May 2003
       - Showed the many resources available (hardware, willing people)
       - Made clear the need for an integrated system
       - Some tests of Grid software
       - Mainly driven by HLT and Physics Workshop needs
       - One external driver is sustainable, two is not!
     - DC2, June - Sept 04
       - Real test of the computing model for the computing TDR
       - Must use Grid systems
       - Analysis and calibration + reconstruction and simulation
       - Pre-production period (end June 04...), then 1-week intensive tests
     - DC3, 05/06
       - Physics readiness TDR
       - Big increase in scale

  14. DC2: June - Sept 2004
     - The goal includes:
       - Use the Grid middleware and tools widely
       - Large-scale physics analysis in the latter phase
       - Computing model studies (document end 2004)
       - Slice test of the computing activities in 2007
     - Also
       - Full use of Geant4, POOL and the LCG applications
       - Pile-up and digitization in Athena
       - Deployment of the complete Event Data Model and the Detector Description
       - Simulation of full ATLAS and the 2004 combined test beam
       - Test the calibration and alignment procedures
       - Run as much of the production as possible on LCG-2
       - Combined test-beam operation foreseen as concurrent with DC2 and using the same tools
     - Also need networking tests at full rate and above
       - Co-ordinate with the LCG service challenges, ESLEA and other light-path network tests

  15.
     - Preparation phase: worldwide exercise (June-July 04)
       - Event generation; simulation (>10^7 events); pile-up and digitization
       - All "byte-stream" data sent to CERN
     - Reconstruction at Tier-0 (throughput estimated below)
       - ~400 processors, short term, sets the scale
       - Several streams: express lines, calibration and alignment lines
       - Different output streams
       - ESD and AOD replicated to Tier-1 sites
     - Out of Tier-0
       - Re-calibration → new calibration and alignment parameters
       - Re-processing
       - Analysis using ATLAS Distributed Analysis in the late phase
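What the ~400-processor Tier-0 farm corresponds to in throughput, reusing the 15 kSI2k-s/event cost and the 1.3 kSI2k-per-node benchmark from slide 5; treating each DC2 processor as a 1.3 kSI2k node is an assumption, since the slide only gives the processor count.

    # DC2 Tier-0 reconstruction throughput with ~400 processors.
    processors = 400
    node_ksi2k = 1.3          # assumed per-node power (slide 5 benchmark)
    cost_ksi2k_s = 15         # reconstruction cost per event
    nominal_rate_hz = 200     # 2007 rate to mass storage

    farm_ksi2k = processors * node_ksi2k        # ~520 kSI2k
    dc2_rate = farm_ksi2k / cost_ksi2k_s        # ~35 events/s

    print(f"DC2 Tier-0 farm: ~{farm_ksi2k:.0f} kSI2k")
    print(f"throughput     : ~{dc2_rate:.0f} events/s "
          f"(~{dc2_rate / nominal_rate_hz:.0%} of the 2007 rate)")
    # Consistent with DC2 being described as a 'slice test' of the 2007
    # computing activities on the previous slide.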

  16. Test Beds & Grid Projects
     All pre-production for DC2 is to be done using Grid tools from:
     - LCG2
       - Common to all LHC experiments
       - LCG2 now rolling out
       - Much improved job success rate
     - Grid3 / US ATLAS Test Bed
       - Demonstrated the success of the grid computing model for HEP
       - Developing & deploying grid middleware and applications
       - Wrap layers around applications, simplify deployment
       - Very important tools for data management (MAGDA) and software installation (pacman)
       - Evolve into a fully functioning, scalable, distributed tiered grid
     - NorduGrid
       - A very successful regional test bed
       - Lightweight Grid user interface, middleware, working prototypes, etc.
       - Now to be part of the Northern European Grid in EGEE
