
A virtualized approach to Mass Storage System



  1. A virtualized approach to Mass Storage System
     Dorin Lobontu, Jos van Wezel and Martin Beitzinger
     Steinbuch Centre for Computing
     KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association, www.kit.edu

  2. Presentation Overview
     • GridKa Storage Overview
     • TSM as Management System for MSS
     • Tape Library Virtualization with ERMM
     • Tape Reports
     01.06.2011 Dorin Lobontu Steinbuch Centre for Computing

  3. GridKa Storage Overview
     GridKa Storage System:
     • 75 fileservers in 3 dCache installations
     • 3 tape libraries, 10 PB tape capacity
     • 8 PB disk capacity
     dCache is a storage management system that:
     • manages a large amount of data
     • stores data on distributed media (disk, tape)
     • provides hierarchical storage management
     • has automatic load balancing
     GridFTP is an extension of the standard FTP for Grid applications:
     • authentication over GSI (Grid Security Infrastructure)
     • encryption by SSL
     • partial file transfer
     • automatic TCP optimization
     • parallel and striped transfers
     Diagram labels: LHC-Centers (temporary storage, analysis, monte-carlo-simulation); GridFTP at 1 GB/s; dCache (NameSpace Operations, Access Controls, Storage Management, Pool Management); Write-Pools; Read-Pools (keep file copies for performance improvement); Stage-pools; MSS with optimized tape access; write on tape by TSM at 350 MegaByte per second.

  4. MSS Requirements
     • MSS has to have a scalable architecture
     • MSS has to uncouple tape resources and applications
     • MSS has to share the same resources for different applications
     • MSS has to provide security mechanisms to prevent/grant applications access to its resources
     Diagram labels (Components): Mass Storage System; Library Manager; dCache Arch. Manager, Xrootd Arch. Manager, LSDF Arch. Manager; dCache, xrootd and LSDF clients, each with TSS+STA+DM.

  5. Presentation Overview
     • GridKa Storage Overview
     • TSM as Management System for MSS
     • Tape Library Virtualization with ERMM
     • Tape Reports

  6. TSM as Library Manager
     • on the TSM server, one path must be defined for every agent and every tape drive (65 agents x 26 drives = 1690 paths)
     • these paths must be maintained manually
     Diagram labels: TSM Server & Library Manager; dCachePool, tss and StorageAgent (repeated for each pool); libraries IBM TS3500, Grau ITL-XL and STK SL-8500.
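The path count on this slide is a plain cross product, which is why manual maintenance scales badly. The following is a minimal sketch of that arithmetic; the agent and drive names are hypothetical placeholders, and only the counts (65 agents, 26 drives) come from the slide.

```python
from itertools import product


def tsm_paths(agents, drives):
    """Return one (agent, drive) pair per path that would have to be defined."""
    return list(product(agents, drives))


# Hypothetical names; the slide only gives the counts.
agents = [f"agent{i:02d}" for i in range(1, 66)]  # 65 storage agents
drives = [f"drive{i:02d}" for i in range(1, 27)]  # 26 tape drives

paths = tsm_paths(agents, drives)
print(len(paths))

# Adding a single drive requires one new path per agent.
extra = len(tsm_paths(agents, drives + ["drive27"])) - len(paths)
print(extra)
```

Under this model every added drive costs as many new path definitions as there are agents, and every added agent costs as many as there are drives.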

  7. Distributing Data over all Libraries
     • data is statically distributed by TSS (Tape Staging Server) over the libraries
     • drive load balancing is not possible
     • a library crash interrupts the processes assigned to this library
     Diagram labels: dCache StorageClass1, StorageClass2, StorageClassX, StorageClassN; TSS mappings StorageClass1 <-> TSM MGMTC1, StorageClass2 <-> TSM MGMTC2, StorageClassN <-> TSM MGMTCN; TSM MGMTC1, MGMTC2, MGMTCN with STGPOOL1, STGPOOL2, STGPOOLN; device classes DevC-Grau, DevC-IBM, DevC-STK; libraries IBM TS3500, Grau ITL-XL, STK SL-8500.
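To illustrate why a static distribution makes a library crash costly, here is a small sketch of the mapping chain named on the slide (storage class, management class, storage pool, device class, library). The identifiers are taken from the diagram labels, but which device class belongs to which storage pool is an assumption, since the flattened slide does not show the pairing.

```python
# Assumed pairing of the slide's labels; only the names themselves are from the slide.
STATIC_MAP = {
    "StorageClass1": {"mgmt_class": "MGMTC1", "stgpool": "STGPOOL1",
                      "devclass": "DevC-Grau", "library": "Grau ITL-XL"},
    "StorageClass2": {"mgmt_class": "MGMTC2", "stgpool": "STGPOOL2",
                      "devclass": "DevC-IBM", "library": "IBM TS3500"},
    "StorageClassN": {"mgmt_class": "MGMTCN", "stgpool": "STGPOOLN",
                      "devclass": "DevC-STK", "library": "STK SL-8500"},
}


def blocked_classes(mapping, failed_library):
    """Storage classes whose tape traffic stops when one library fails."""
    return sorted(sc for sc, target in mapping.items()
                  if target["library"] == failed_library)


print(blocked_classes(STATIC_MAP, "Grau ITL-XL"))
```

Because each storage class is bound to exactly one library, there is no alternative drive to fall back on, and idle drives in the other libraries cannot take over the work.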

  8. Presentation Overview
     • GridKa Storage Overview
     • TSM as Management System for MSS
     • Tape Libraries Virtualization with ERMM
     • Tape Reports

  9. ERMM as Library Manager
     ERMM:
     • takes over the entire management of the libraries
     • coordinates the access to drives and tapes
     • logs all activities in its own DB2 database
     • provides a single point of control of tape resources
     Diagram labels: TSM; ERMM; dCache-Pool, tss, StorageAgent and ERMM-Client (repeated for each pool); libraries IBM TS3500, Grau ITL-XL and STK SL-8500.

  10. Distributing Data over all Libraries
     • TSM has only one external library
     • TSM defines only one path for every storage agent to the external library
     • ERMM dynamically maintains all paths from the storage agents to all drives
     • ERMM spreads the data over all physical libraries
     • ERMM performs dynamic drive load balancing
     Diagram labels: dCache StorageClass1, StorageClass2, StorageClassX, StorageClassN; TSS mappings StorageClass1 <-> TSM MGMTC1, StorageClass2 <-> TSM MGMTC2, StorageClassN <-> TSM MGMTCN; TSM MGMTC1, MGMTC2, MGMTCN with STGPOOL1, STGPOOL2, STGPOOLN and a single device class DevC-LTO; ERMM with drives group and tapes group over the GRAU, IBM and STK libraries.
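The slide does not say how ERMM chooses a drive, so the following is only a sketch of one possible policy for dynamic drive load balancing over a single pooled drive group: pick a free drive from the library that currently has the fewest busy drives. The `Drive` and `DrivePool` classes and the drive names are hypothetical and are not part of ERMM.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    library: str
    busy: bool = False


class DrivePool:
    """One logical drive group spanning all physical libraries (hypothetical model)."""

    def __init__(self, drives):
        self.drives = list(drives)

    def allocate(self):
        """Return a free drive from the least-loaded library, or None if all are busy."""
        free = [d for d in self.drives if not d.busy]
        if not free:
            return None
        load = {}
        for d in self.drives:
            load[d.library] = load.get(d.library, 0) + (1 if d.busy else 0)
        chosen = min(free, key=lambda d: (load[d.library], d.name))
        chosen.busy = True
        return chosen

    def release(self, drive):
        drive.busy = False

    def fail_library(self, library):
        """Drop all drives of a crashed library; the remaining ones stay usable."""
        self.drives = [d for d in self.drives if d.library != library]


pool = DrivePool(Drive(f"{lib}-{i}", lib)
                 for lib in ("grau", "ibm", "stk") for i in (1, 2))
first_three = [pool.allocate().library for _ in range(3)]
print(first_three)

pool.fail_library("ibm")
after_crash = pool.allocate()
print(after_crash.name)
```

In this model a library crash only shrinks the pool, whereas in the static setup it stops every process bound to that library.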

  11. Presentation Overview
     • GridKa Storage Overview
     • TSM as Management System for MSS
     • Tape Libraries Virtualization with ERMM
     • Tape Reports

  12. Collecting Statistics Data
     ERMM:
     • library manager
     • all drives
     • all cartridges
     • temporary DCA (drive cartridge access) record for every operation
     • archive DB
     TSM:
     • one external library
     • no drive
     • no scratch
     ERMM event pipe:
     • drive information
     • library information
     • cartridge information
     Diagram labels: Sense data request; Sense data; collector; dCache; Mass Storage System; MySQL DB.

  13. Generate Tape Reports
     Complete history of Drive Cartridge Access:
     • amount of data written/read per mount
     • mount and unmount time
     • number of soft/hard errors per mount
     Statistics:
     • throughput reports per drive, cartridge, library and time unit
     • number of mounts per drive, cartridge, library and time unit
     • number of concurrent drives in use per library and time unit
     • error reports per drive, cartridge, library and time unit
     Diagram labels: MySQL DB; Statistics generator (perl program); graphics plot generator; Web.
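The statistics listed above can all be derived by grouping the per-mount access records. The original generator is a Perl program reading from MySQL; the sketch below is not that program but a self-contained Python illustration with a hypothetical `DcaRecord` layout built from the fields the slide names (data written/read, mount and unmount time, soft/hard errors).

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DcaRecord:
    """One drive cartridge access (mount); field names are assumptions."""
    drive: str
    cartridge: str
    library: str
    mounted: datetime
    unmounted: datetime
    bytes_written: int
    bytes_read: int
    soft_errors: int
    hard_errors: int


def summarize(records, key):
    """Aggregate mounts, bytes, mounted seconds and errors per drive, cartridge or library."""
    stats = defaultdict(lambda: {"mounts": 0, "bytes": 0, "seconds": 0.0, "errors": 0})
    for r in records:
        s = stats[getattr(r, key)]
        s["mounts"] += 1
        s["bytes"] += r.bytes_written + r.bytes_read
        s["seconds"] += (r.unmounted - r.mounted).total_seconds()
        s["errors"] += r.soft_errors + r.hard_errors
    # Throughput here is averaged over the whole mount, including positioning time.
    return {k: {**s, "mb_per_s": s["bytes"] / 1e6 / s["seconds"] if s["seconds"] else 0.0}
            for k, s in stats.items()}


records = [
    DcaRecord("d1", "c1", "lib1", datetime(2011, 6, 1, 10, 0, 0),
              datetime(2011, 6, 1, 10, 1, 40), 6_000_000_000, 0, 1, 0),
    DcaRecord("d1", "c2", "lib1", datetime(2011, 6, 1, 11, 0, 0),
              datetime(2011, 6, 1, 11, 1, 40), 0, 4_000_000_000, 0, 0),
    DcaRecord("d2", "c1", "lib2", datetime(2011, 6, 1, 12, 0, 0),
              datetime(2011, 6, 1, 12, 0, 50), 1_000_000_000, 0, 0, 1),
]

by_drive = summarize(records, "drive")
by_cartridge = summarize(records, "cartridge")
print(by_drive["d1"]["mounts"], by_drive["d1"]["mb_per_s"])
print(by_cartridge["c1"]["errors"])
```

Grouping by time unit would follow the same pattern, using a truncated mount timestamp as the key.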

  14. Activity Reports (screenshot; navigation: DriveActivity, LibraryActivity, VolumeInfo, Home)

  15. Activity Reports (screenshot; navigation: DriveActivity, LibraryActivity, VolumeInfo, Home)

  16. Activity Reports (screenshot; navigation: DriveActivity, LibraryActivity, VolumeInfo, Home)

  17. Error Reports - per Library per month (plots: iwr_grau1_lto3 with 16 drives, iwr_grau1_lto4 with 8 drives)

  18. Tape Errors
     Since November 2009:
     • about 100 cartridges removed due to increasing correctable errors (~25 LTO3 from a total of ~5000, ~75 LTO4 from a total of ~5000)
     • 4 drives (from ~64) replaced due to bad performance and increasing error rate
     • 4 cartridges lost with internal label destroyed (TSM: ANR8355E Error reading label for volume …)
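For a sense of scale, the approximate counts on this slide can be turned into removal rates. This is only a back-of-the-envelope sketch: the inputs are the rounded figures from the slide, so the results are rough as well (about 0.5 % of LTO3 cartridges, 1.5 % of LTO4 cartridges and 6 % of drives).

```python
def removal_rate(removed, total):
    """Fraction of units taken out of service."""
    return removed / total


# Approximate figures from the slide (all marked "~" there).
observations = {
    "LTO3 cartridges": (25, 5000),
    "LTO4 cartridges": (75, 5000),
    "drives": (4, 64),
}

rates = {name: removal_rate(removed, total)
         for name, (removed, total) in observations.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2%}")
```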
