
Grid-3 and the Open Science Grid in the U.S.
Lothar A. T. Bauerdick, Fermilab, ISGC 2004, July 27, 2004


  1. Grid-3 and the Open Science Grid in the U.S. Lothar A. T. Bauerdick, Fermilab. International Symposium on Grid Computing ISGC 2004, Academia Sinica (中央研究院), Taipei, Taiwan, July 27, 2004

  2. U.S. Grids Science Drivers
     Science drivers for the U.S. Physics Grid Projects iVDGL, GriPhyN and PPDG ("Trillium"):
     - ATLAS & CMS experiments @ CERN LHC (2007 - ?): 100s of Petabytes
     - High Energy & Nuclear Physics experiments (1997 - present): ~1 Petabyte (1000 TB)
     - LIGO (gravity wave search) (2002 - present): 100s of Terabytes
     - Sloan Digital Sky Survey (2001 - present): 10s of Terabytes
     [Timeline chart, 2001-2009: data growth and community growth across these experiments]
     Future Grid resources: massive CPU (PetaOps), large distributed datasets (>100 PB), global communities (1000s)

  3. Globally Distributed Science Teams
     Sharing and federating vast Grid resources

  4. Gravitational Wave Observatory
     Grid-enabled GW pulsar search using the Pegasus system
     Goal: implement a production-level blind galactic-plane search for gravitational wave pulsar signals
     - Run 30 days on ~5-10x more resources than LIGO has, using the grid (e.g., 10,000 CPUs for 1 month); millions of individual jobs
     - Planning by GriPhyN Chimera/Pegasus, execution by Condor DAGMan, file cataloging by Globus RLS, metadata by Globus MCS
     Achieved: access to ~6000 CPUs for 1 week, ~5% utilization due to bottlenecks
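The slide above describes planning with Chimera/Pegasus and execution by Condor DAGMan. As a rough, hypothetical illustration of what the execution layer consumes, the Python sketch below writes a toy DAGMan input file with one node per frequency band and a final merge node; the job names, submit-file names, and band decomposition are invented and not taken from the actual LIGO workflow.

    # Illustrative sketch only: a toy generator for a Condor DAGMan input file of the
    # kind a Pegasus-planned search ultimately executes. Job names, submit-file names,
    # and the frequency-band decomposition are hypothetical.

    def write_pulsar_search_dag(path="pulsar_search.dag", n_bands=1000):
        """Write a DAG with one independent search job per frequency band,
        followed by a single merge job that collects the candidate lists."""
        with open(path, "w") as dag:
            for i in range(n_bands):
                # Each JOB line points at a Condor submit description for that band.
                dag.write(f"JOB search_{i} search_band.sub\n")
                dag.write(f'VARS search_{i} band="{i}"\n')
            dag.write("JOB merge merge_candidates.sub\n")
            # The merge job depends on every search job (PARENT ... CHILD syntax).
            parents = " ".join(f"search_{i}" for i in range(n_bands))
            dag.write(f"PARENT {parents} CHILD merge\n")

    if __name__ == "__main__":
        write_pulsar_search_dag()
        # The resulting file would be handed to DAGMan with: condor_submit_dag pulsar_search.dag

Handing one such file to condor_submit_dag is what lets a single operator drive "millions of individual jobs" rather than submitting them by hand.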

  5. Sloan Digital Sky Survey
     Galaxy cluster finding: red-shift analysis, weak lensing effects
     Using the GriPhyN Chimera and Pegasus systems
     - Coarse-grained DAGs work fine (batch system)
     - Fine-grained DAGs have scaling issues (virtual data system)
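Since the slide contrasts coarse-grained and fine-grained DAGs, here is a minimal, hypothetical Python sketch of the difference: the same set of sky fields can be planned as one job per field or as one job per batch of fields. The field counts and batch size are made up for the example.

    # Hypothetical illustration of workflow granularity; numbers are invented.

    def plan_jobs(n_fields, fields_per_job):
        """Return a list of jobs, each job being the list of field IDs it processes."""
        fields = list(range(n_fields))
        return [fields[i:i + fields_per_job] for i in range(0, n_fields, fields_per_job)]

    fine   = plan_jobs(n_fields=100_000, fields_per_job=1)    # 100,000 DAG nodes
    coarse = plan_jobs(n_fields=100_000, fields_per_job=500)  # 200 DAG nodes

    print(len(fine), "fine-grained nodes vs", len(coarse), "coarse-grained nodes")
    # A 100,000-node DAG stresses the planner and the virtual data catalogs,
    # while a few hundred batch jobs map comfortably onto ordinary batch queues.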

  6. Large Hadron Collider
     Energy-frontier, high-luminosity p-p collider at CERN
     An order-of-magnitude step in energy and luminosity for particle physics
     [Chart: constituent center-of-mass energy (GeV) vs. year of first physics, 1960-2020, for e+e-, e-p, and hadron colliders: SPEAR (Stanford; 1974: J/ψ, 1975: τ), PETRA (DESY; 1979: gluon), SppS (CERN; 1983: W, Z), LEP 1 (CERN; 1989: 3 families), HERA, LEP 2, Tevatron (Fermilab; 1995: top), LHC (CERN)]

  7. Emerging LHC Production Grids
     The LHC is the first to put "real", multi-organizational, global Grids to work
     Large resources become available to experiments "opportunistically"

  8. Grid2003 Project
     In 2003, U.S. science projects and Grid projects came together to build a multi-organizational infrastructure: Grid3
     [Diagram: contributing projects and applications, including the US LHC projects (testbeds, data challenges), Tevatron, RHIC, BaBar, BTeV, Korea CMS, U. Buffalo, end-to-end HENP applications, virtual data research, the VDT, and the virtual data grid laboratory]

  9. Grid3: Initial Multi-Organizational Grid Infrastructure
     Common Grid operating as a coherent, loosely-coupled infrastructure
     Applications running on Grid3 (Trillium, U.S. LHC), benefiting LHC (3), SDSS (2), LIGO (1), Biology (2), Computer Science (3)
     25 universities, 4 national labs, 2800 CPUs (as of July 26, 2004, 11:35pm CDT)

  10. Resource Sharing Works
     Example: U.S. CMS Data Challenge simulation production
     - Running on Grid3 since November 2003
     - Profited from non-CMS resources: at least 40% in the first quarter of 2004

  11. Important Role of Tier2 Centers
     Tier2 facilities are logically grouped around their Tier1 regional center
     - 20-40% of the Tier1?
     - "1-2 FTE support": commodity CPU & disk, no hierarchical storage
     - Essential university role in the extended computing infrastructure
     - Validated by 3 years of experience with proto-Tier2 sites
     Specific functions for science collaborations:
     - Physics analysis
     - Simulation
     - Experiment software
     - Support for smaller institutions
     Official role in the Grid hierarchy (U.S.):
     - Sanctioned by MOU (ATLAS, CMS, LIGO)
     - Local P.I. with reporting responsibilities
     - Selection by the collaboration via a careful process

  12. Grid3 Infrastructure Built upon the Virtual Data Toolkit
     Grid environment built from core Globus and Condor middleware, as delivered through the Virtual Data Toolkit (VDT): GRAM, GridFTP, MDS, RLS, VDS, VOMS, ...
     - VDT sponsored through GriPhyN and iVDGL, with contributions from LCG
     - ...equipped with VO and multi-VO security, monitoring, and operations services
     - ...allowing federation with other Grids where possible, e.g. the CERN LHC Computing Grid (LCG)
       - U.S. ATLAS: GriPhyN Virtual Data System execution on LCG sites
       - U.S. CMS: storage element interoperability (SRM/dCache)
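As a small, hedged example of the VDT-delivered tooling in action, the sketch below stages a file off a site's GridFTP server with the standard globus-url-copy client, driven from Python via subprocess. The host name and paths are placeholders, and a valid grid proxy is assumed.

    # Minimal sketch of a GridFTP transfer using the globus-url-copy client shipped
    # in the VDT. Hostnames and paths are placeholders; the basic invocation is
    # "globus-url-copy <source URL> <destination URL>", after obtaining a grid proxy.

    import subprocess

    src = "gsiftp://se.example-site.org/grid3/data/sample.root"   # hypothetical storage element
    dst = "file:///local/scratch/sample.root"                     # local destination

    subprocess.run(["globus-url-copy", src, dst], check=True)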

  13. Grid3 Principles
     Simple approach: sites consisting of
     - Computing element (CE)
     - Storage element (SE)
     - Information and monitoring services
     VO level, and multi-VO:
     - VO information services
     - Operations (iGOC)
     Minimal use of grid-wide systems:
     - No centralized resource broker, replica/metadata catalogs, or command-line interface; these are to be provided by individual VOs
     Application driven:
     - Adapt applications to work with Grid3 services
     - Prove applications on VO testbeds

  14. A "Loosely Coupled" Set of Services
     The Grid3 environment consists of a "loosely coupled" set of services:
     - Processing service: Globus GRAM bridged from Condor-G for central submission; four separate queueing systems are supported
     - Data transfer services: GridFTP interfaces on all sites through gateway systems; files are transferred into processing sites, and results are transferred directly into the MSS GridFTP door; CMS has moved to SRM-based storage element functionality
     - VO management services: a central service is needed for authentication (VOMS)
     - Monitoring services: system- and application-level monitoring allows status verification and diagnosis
     - Software distribution services: lightweight, based on Pacman
     - Information services: to help applications and monitoring, based on MDS
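To make the Condor-G bridge to Globus GRAM concrete, here is a rough sketch, not the project's actual submission machinery, of central submission: a script writes a classic Condor-G "globus" universe submit description aimed at a site gatekeeper and hands it to condor_submit. The gatekeeper host, jobmanager, and executable names are assumptions.

    # Sketch of central Condor-G submission to a remote Globus GRAM gatekeeper.
    # The submit-description keywords are the classic Condor-G "globus" universe
    # syntax of that era; site names and the job script are placeholders.

    import subprocess

    submit_lines = [
        "universe        = globus",
        "globusscheduler = gatekeeper.example-site.org/jobmanager-pbs",  # placeholder gatekeeper
        "executable      = run_simulation.sh",                           # placeholder job script
        "output          = sim.$(Cluster).out",
        "error           = sim.$(Cluster).err",
        "log             = sim.$(Cluster).log",
        "queue 1",
    ]

    with open("sim.sub", "w") as f:
        f.write("\n".join(submit_lines) + "\n")

    # Hand the description to the local Condor-G scheduler; a valid grid proxy is assumed.
    subprocess.run(["condor_submit", "sim.sub"], check=True)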

  15. Site Services and Installation
     Goal is to install and configure with minimal human intervention
     - Use the Pacman tool and distributed software "caches"
     - Registers the site with VO and Grid3 level services
     - Accounts, application install areas & working directories
     %pacman -get iVDGL:Grid3
     [Diagram: Grid3 site layout - VDT, $app and $tmp areas, VO service, GIIS registration, compute element with information providers (Grid3 schema), storage, log management]
     4 hours to install and validate
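The install step itself is the single Pacman command shown above; the sketch below merely wraps it in Python together with a trivial post-install check. The install prefix and the setup.sh check are assumptions about a typical VDT layout, not part of the documented procedure.

    # Sketch of scripting the documented install step plus a crude sanity check.
    # The cache name "iVDGL:Grid3" comes from the slide; the install prefix and
    # the setup-script name are assumptions.

    import os, subprocess

    install_dir = "/opt/grid3"                      # hypothetical install area
    os.makedirs(install_dir, exist_ok=True)

    # Fetch and install the Grid3 package cache with Pacman, as on the slide.
    subprocess.run(["pacman", "-get", "iVDGL:Grid3"], cwd=install_dir, check=True)

    # Crude validation: a VDT install typically drops a setup script into the install area.
    setup = os.path.join(install_dir, "setup.sh")
    print("install looks sane" if os.path.exists(setup) else "setup.sh missing, check the install")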

  16. VO-Centric Model
     "What are the services to enable application VOs?" "What do providers need to provide resources to their VOs?"
     Lightweight-ness at the cost of centrally provided functionality. Examples of this approach:
     - Flexible VO security infrastructure
       - DOEGrids Certificate Authority
       - PPDG and iVDGL Registration Authorities, with VO or site sponsorship
     - Automated multi-VO authorization, using the EDG-developed VOMS
       - Each VO manages a service and its members
       - Each Grid3 site is able to generate and locally adjust its gridmap file with an authenticated query to each VO service
     - VOs negotiate policies & priorities with providers directly
     - VOs can run their own storage services
       - U.S. CMS sites run SRM/dCache storage services on the Tier-1 and Tier-2s
     [Diagram: VOMS servers for the VOs (US CMS, US ATLAS, SDSS, LSC, BTeV, iVDGL) feeding gridmap files at the Grid3 sites]
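As a hand-wavy illustration of the per-site gridmap step described above, the sketch below merges member DN lists, as they might be returned by each VO's registration service, with local adjustments into a Globus grid-mapfile. In Grid3 this was driven by authenticated VOMS queries through dedicated tooling; the DNs, account mappings, and file names here are invented.

    # Illustration only: build a grid-mapfile from per-VO DN lists plus local policy.
    # DNs, VO-to-account mappings, and the ban list are placeholders.

    vo_members = {
        "uscms":   ["/DC=org/DC=doegrids/OU=People/CN=Example CMS User"],
        "usatlas": ["/DC=org/DC=doegrids/OU=People/CN=Example ATLAS User"],
    }
    vo_account = {"uscms": "uscms01", "usatlas": "usatlas1"}   # site-chosen group accounts
    locally_banned = set()                                      # site-local adjustments

    with open("grid-mapfile", "w") as out:
        for vo, dns in vo_members.items():
            for dn in dns:
                if dn not in locally_banned:
                    # grid-mapfile format: "<quoted DN> <local account>"
                    out.write(f'"{dn}" {vo_account[vo]}\n')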
