
SAM Data Management Services, Adam Lyon, SAMGrid Project Leader



  1. SAM Data Management Services
     Adam Lyon
     SAMGrid Project Leader – CD/REX/PS Leader – DØ
     SAM is a multi-level system for data management in use by DØ, CDF and MINOS.
     Adam Lyon / Fermilab CD DØ / Neutrino Computing Workshop

  2. SAM as a Data Catalog (all)
     - Store metadata about files: file type, run information, stream names, MC info, …
     - Create datasets (lists of files) based on metadata queries. Datasets are “live”, and the query language is simpler than SQL.
     - Replica Catalog: maintain a list of file locations (including pnfs & SRM locations)
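
To make the idea of a “live” dataset concrete, here is a minimal Python sketch. It is not SAM's API or query language: the `Catalog` and `Dataset` classes, the metadata keys, and the file names are all hypothetical, chosen only to illustrate that a dataset can store the query rather than a fixed file list, so that files declared later show up automatically.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy metadata catalog (hypothetical): maps file name -> metadata dict."""
    files: dict = field(default_factory=dict)

    def declare(self, name, **metadata):
        # Record (or overwrite) the metadata for one file.
        self.files[name] = metadata


@dataclass
class Dataset:
    """Stores the query constraints, not the file list, so it stays 'live'."""
    catalog: Catalog
    constraints: dict

    def files(self):
        # Re-evaluate the query against the catalog on every call.
        return sorted(
            name
            for name, meta in self.catalog.files.items()
            if all(meta.get(key) == value for key, value in self.constraints.items())
        )


catalog = Catalog()
catalog.declare("raw_001.dat", file_type="raw", run=1001, stream="physics")
catalog.declare("mc_001.dat", file_type="mc", run=1001, stream="physics")

raw_run_1001 = Dataset(catalog, {"file_type": "raw", "run": 1001})
before = raw_run_1001.files()

# A file declared after the dataset was defined still matches the query.
catalog.declare("raw_002.dat", file_type="raw", run=1001, stream="physics")
after = raw_run_1001.files()
```

In this sketch the second call to `files()` picks up `raw_002.dat` without the dataset being redefined, which is the behaviour the slide calls “live”.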

  3. SAM as a data “deliverer” (all, especially CDF, MINOS)
     - Deliver file URLs upon request (interoperates with dCache, BlueArc & SRM)
     - Throttle deliveries to protect underlying cache/storage systems
     - Move files from storage to cache and from cache to cache worldwide
     - Track file usage by projects and jobs
     - Easy creation of recovery jobs
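
Throttling deliveries can be pictured as capping the number of transfers in flight and queuing the rest. The Python sketch below is an illustration under that assumption only; the `ThrottledDeliverer` class, its method names, and the limit of two are hypothetical and do not describe how SAM implements throttling.

```python
from collections import deque


class ThrottledDeliverer:
    """Toy throttle (hypothetical): at most `max_active` deliveries at once."""

    def __init__(self, max_active):
        self.max_active = max_active
        self.active = set()      # deliveries currently in progress
        self.waiting = deque()   # requests queued in arrival order

    def request(self, file_name):
        """Return True if delivery starts now, False if it was queued."""
        if len(self.active) < self.max_active:
            self.active.add(file_name)
            return True
        self.waiting.append(file_name)
        return False

    def finished(self, file_name):
        """Mark a delivery done; start and return the oldest queued one, if any."""
        self.active.discard(file_name)
        if self.waiting and len(self.active) < self.max_active:
            next_file = self.waiting.popleft()
            self.active.add(next_file)
            return next_file
        return None


deliverer = ThrottledDeliverer(max_active=2)
started = [deliverer.request(name) for name in ("a.dat", "b.dat", "c.dat")]

# Finishing one delivery frees a slot for the queued request.
promoted = deliverer.finished("a.dat")
```

The storage system behind such a throttle never sees more than `max_active` simultaneous requests, however many jobs ask for files.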

  4. SAM as a cache management system (DØ)
     - Operate a system of multi-tiered caches (large cache nodes, small cache nodes)
     - Distributed cache for Grid jobs worldwide
     - Complex routing possible
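
One way to picture a multi-tiered cache is as an ordered list of tiers that is searched nearest-first, with files copied forward on a miss. The Python sketch below assumes exactly that simple policy; the tier names, the `locate` and `fetch` helpers, and the copy-forward rule are hypothetical and are not taken from SAM's actual routing logic.

```python
def locate(file_name, tiers):
    """Return the name of the first tier that holds the file, or None."""
    for tier_name, contents in tiers:
        if file_name in contents:
            return tier_name
    return None


def fetch(file_name, tiers):
    """Find the file, then copy it into every tier in front of its source."""
    source = locate(file_name, tiers)
    if source is None:
        raise FileNotFoundError(file_name)
    for tier_name, contents in tiers:
        if tier_name == source:
            break
        contents.add(file_name)  # populate the nearer caches
    return source


# Tiers are listed nearest (fastest) first; contents are sets of file names.
tiers = [
    ("small cache node", set()),
    ("large cache node", {"raw_001.dat"}),
    ("storage", {"raw_001.dat", "raw_002.dat"}),
]

first = fetch("raw_002.dat", tiers)   # miss in both caches, served from storage
second = fetch("raw_002.dat", tiers)  # now served from the nearest cache
```

A real system would also need eviction and a routing choice between sites; this sketch covers only the lookup order.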

  5. SAM has a “request system” (DØ for MC)
     - Store metadata for an MC production request
     - Track produced files (real and virtual) and apply metadata
     - Easy queries for MC files

  6. SAM is maintainable and scalable
     - Tiny operational load at MINOS
     - ~0.75 FTE for the entire DØ data management system
     - ~1.5 FTE for the CDF data management system (but this also includes dCache monitoring and operations)
     - Low ongoing development (still improving cache tuning and keeping the system up to date with the latest infrastructure)
     - SAM works with SL 4 and 5
     - SAM is easily scalable (including database access via DB servers)

  7. [Figure: plot with two data series; title and legend text not recoverable from extraction. The y-axis ticks run from 0 to 200 and the x-axis ticks are daily dates from 01/08/09 to 01/28/09.]

  8. SAM has client and developer features
     - Command line, Python, and C++ interfaces
     - Interface SAM with your framework

  9. SAM: you can take what you need
     - Data catalog
     - Data delivery (real or virtual)
     - Data caching

  10. Items to consider
     - How much of SAM do you need to satisfy your requirements?
     - Can your needs be consolidated? For example, can multiple experiments share a database, a SAM installation, or a caching system?
     - How much development is required (by you and us)?
