DSS Data & Storage Services
CERN Lustre Evaluation and Storage Outlook
Tim Bell, Arne Wiebalck
HEPiX, Lisbon, 20th April 2010
CERN IT Department, CH-1211 Genève 23, Switzerland, www.cern.ch/it
Agenda
• Lustre Evaluation Summary
• Storage Outlook
  – Life cycle management
  – Large disk archive
• Conclusions
Lustre Evaluation Scope
• HSM system
  – CERN Advanced STORage Manager (CASTOR)
  – 23 PB, 120 million files, 1'352 servers
• Analysis space
  – Analysis of the experiments' data
  – 1 PB, access via XRootD
• Project space
  – >150 projects
  – Experiments' code (build infrastructure)
  – CVS/SVN, Indico, Twiki, …
• User home directories
  – 20'000 users on AFS
  – 50'000 volumes, 25 TB, 1.5 billion accesses/day, 50 servers
  – 400 million files
Evaluation Criteria
• Mandatory: support for …
  – Life cycle management
  – Backup
  – Strong authentication
  – Fault tolerance
  – Acceptable performance for small files and random I/O
  – HSM interface
• Desirable: support for …
  – Replication
  – Privilege delegation
  – WAN access
  – Strong administrative control
• Performance was explicitly excluded
  – See the results of the HEPiX FSWG
Compliance (1/3)
• Life cycle management
  – Not OK: no support for live data migration, Lustre or kernel upgrades, monitoring, version compatibility
• Backup
  – OK: LVM snapshots for the MDS plus TSM for the files worked without problems (see the sketch after this slide)
• Strong authentication
  – Almost OK: incomplete code in v2.0, full implementation expected Q4/2010 or Q1/2011
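One plausible reading of the backup scheme above, sketched in Python: freeze the metadata target with an LVM snapshot, mount it read-only, hand it to the TSM client, then drop the snapshot. The volume group, logical volume, snapshot size and mount point are illustrative assumptions, not the configuration used at CERN, and a complete MDT backup would also need to preserve Lustre's extended attributes, which this sketch omits.

```python
#!/usr/bin/env python
"""Hedged sketch: back up a Lustre MDS via an LVM snapshot swept by TSM.
Device names, sizes and paths are assumptions for illustration only."""
import subprocess

VG, LV, SNAP = "vgmds", "mdt", "mdt_snap"     # assumed volume group / LV names
MOUNT_POINT = "/mnt/mdt_snap"                 # assumed scratch mount point

def run(cmd):
    """Run a command, echo it, and stop if it fails."""
    print("+ " + " ".join(cmd))
    subprocess.check_call(cmd)

def backup_mds():
    # 1. Copy-on-write snapshot of the MDT logical volume (frozen view).
    run(["lvcreate", "--snapshot", "--size", "10G",
         "--name", SNAP, "/dev/%s/%s" % (VG, LV)])
    try:
        # 2. Mount the snapshot read-only; the MDT backing store is ldiskfs.
        run(["mount", "-t", "ldiskfs", "-o", "ro",
             "/dev/%s/%s" % (VG, SNAP), MOUNT_POINT])
        try:
            # 3. Incremental file backup of the frozen view with the TSM client.
            #    (A real MDT backup also needs the extended attributes saved.)
            run(["dsmc", "incremental", MOUNT_POINT + "/"])
        finally:
            run(["umount", MOUNT_POINT])
    finally:
        # 4. The snapshot is only needed for the duration of the backup.
        run(["lvremove", "-f", "/dev/%s/%s" % (VG, SNAP)])

if __name__ == "__main__":
    backup_mds()
```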
Compliance (2/3)
• Fault tolerance
  – OK: MDS and OSS failover (we used a fully redundant multipath iSCSI setup)
• Small files
  – Almost OK: problems when mixing small and big files (striping)
• HSM interface
  – Not OK: not supported yet, but under active development
Compliance (3/3)
• Replication
  – Not OK: not supported (would help with data migration and availability)
• Privilege delegation
  – Not OK: not supported
• WAN access
  – Not OK: may become possible once Kerberos is fully implemented (cross-realm setups)
• Strong administrative control
  – Not OK: pools are not mandatory, striping settings cannot be enforced
Additional thoughts
• Lustre comes with (too) strong client/server coupling
  – Recovery case
• Moving targets on the roadmap
  – Some of the requested features have been on the roadmap for years, others have simply been dropped
• Lustre aims at extreme HPC rather than at a general-purpose file system
  – Most of our requested features are not needed in the primary customers' environment
Lustre Evaluation Conclusion
• Operational deficiencies do not allow for a Lustre-based storage consolidation at CERN
• Lustre is still interesting for the analysis use case (but operational issues should be kept in mind here as well)
• Many interesting and desired features are (still) on the roadmap, so it is worthwhile to keep an eye on Lustre
• For details, see the write-up at https://twiki.cern.ch/twiki/pub/DSSGroup/LustreEvaluation/CERN_Lustre_Evaluation.pdf
Agenda
• Lustre Evaluation Summary
• Storage Outlook
  – Life cycle management
  – Large disk archive
• Conclusions
Life Cycle Management
• Archive sizes continue to grow
  – 28 PB of tape currently used at CERN
  – 20 PB/year expected
• Media refresh every 2-3 years
  – Warranty expiry on disk servers
  – Repacking tapes to new drive densities
• Time taken is related to (toy model after this slide)
  – TOTAL space, not the new data volume recorded
  – Interconnect between source and target
  – Metadata handling overheads per file
• Must be performed during online periods
  – Conflicts between user data serving and refresh
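To make concrete the point that refresh time scales with the total archive size (plus a per-file metadata cost), here is a toy estimator. The 28 PB, 120 million files and 25 MB/s average drive rate are figures from these slides; the drive count, per-file overhead and streaming efficiency are purely illustrative assumptions, not CERN planning numbers.

```python
# Toy model: a media refresh must move the TOTAL archive, plus pay a
# per-file metadata cost, regardless of how much new data arrives per year.

def refresh_days(total_bytes, n_files, n_drives,
                 drive_bytes_per_s=25e6,    # average tape drive rate quoted on a later slide
                 per_file_overhead_s=1.0,   # assumed metadata/positioning cost per file
                 efficiency=0.7):           # assumed fraction of wall-clock spent streaming
    data_s = total_bytes / (n_drives * drive_bytes_per_s * efficiency)
    meta_s = n_files * per_file_overhead_s / n_drives
    return (data_s + meta_s) / 86400.0

# Example: 28 PB and 120 million files refreshed with 50 dedicated drives.
print("%.0f days" % refresh_days(28e15, 120e6, n_drives=50))   # roughly 400 days
```

Doubling the archive at the same drive count roughly doubles the refresh time, which is why campaign duration tracks total space rather than the yearly recording rate.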
Repack Campaign
[Chart: total tapes repacked, Nov-08 to Sep-09, scale up to 35'000 tapes]
• The last repack campaign took 12 months to copy 15 PB of data
• When the next generation of drives is available, there will be around 35 PB of data
• To complete a repack in 1 year, the data refresh will require as many resources as LHC data recording (rough check after this slide)
• This I/O capacity needs to be reserved in the disk and tape planning for sites with large archives
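A rough check of the claim above, using only volumes quoted in these slides (15 PB and 35 PB to repack, 20 PB/year of new LHC data). It converts yearly volumes into sustained throughput and ignores that a repack must both read and write, so the repack figures are lower bounds.

```python
# Sustained throughput implied by moving a yearly data volume, in GB/s.
SECONDS_PER_YEAR = 365 * 86400

def sustained_gb_per_s(pb_per_year):
    return pb_per_year * 1e15 / SECONDS_PER_YEAR / 1e9

print("Last campaign, 15 PB in 12 months: %.2f GB/s" % sustained_gb_per_s(15))  # ~0.48
print("Next campaign, 35 PB in 12 months: %.2f GB/s" % sustained_gb_per_s(35))  # ~1.11
print("LHC recording, 20 PB per year    : %.2f GB/s" % sustained_gb_per_s(20))  # ~0.63
# Repacking 35 PB within a year therefore needs sustained I/O of the same order as
# (indeed more than) recording the new LHC data, before counting the extra reads.
```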
Disk Based Archive?
[Chart: Disk and Tape Capacity Projections, maximum capacity (TB) vs. year 2009-2015, with pessimistic and optimistic projections for disk and for tape]
• Can we build a disk-based archive at a reasonable cost compared to a tape-based solution?
Storage in a Rack
• Tape storage at CERN
  – 1 drive has 374 TB of storage
  – Average rate 25 MB/s
• Disk server equivalent
  – 2 head nodes
    • 2 x 4-port SAS cards each
  – 8 JBOD expansion units
    • 45 x 2 TB disks each
  – Capacities (arithmetic check after this slide)
    • 720 TB per rack
    • 540 TB with RAID-6 over 8 disks
    • 270 TB per head node
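A quick arithmetic check of the rack capacities quoted above, using only the disk counts and sizes from the slide; the one added assumption is that RAID-6 over 8 disks leaves 6 of every 8 disks for data.

```python
# Verify the 720 / 540 / 270 TB figures for the "storage in a rack" building block.
JBODS_PER_RACK = 8
DISKS_PER_JBOD = 45
DISK_TB = 2
HEAD_NODES = 2

raw_tb = JBODS_PER_RACK * DISKS_PER_JBOD * DISK_TB   # 8 * 45 * 2  = 720 TB raw
usable_tb = raw_tb * 6 // 8                          # RAID-6 (6+2) = 540 TB usable
per_head_tb = usable_tb // HEAD_NODES                # split over 2 = 270 TB each

print("%d %d %d" % (raw_tb, usable_tb, per_head_tb))  # 720 540 270
```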
High Availability
[Diagram: two head nodes connected to eight disk arrays in a fail-over configuration]
Simulation: 20 PB/yr, 2011-2015
[Chart: 5-year archive cost, normalised cost (current = 100), for four options: tape with just-in-time repack, storage in a box, storage in a rack, storage in a rack with backup]
• Costs are normalised to the tape HSM as 100
• Storage in a rack can be comparable with tape on cost/GB
Simulation for power
[Chart: annual power consumption in kW, 2011-2015; legend: Nearline Repack, Just in Time, Storage in a rack, Storage in a rack with backup]
• Additional power consumption of 100 kW
• The cost is included in the simulation
Areas to investigate
• Reliability
  – Corruptions
  – Scrubbing
• Availability
  – Fail-over testing
• Power conservation
  – Disk spin-down / spin-up
• Lifecycle management
  – 40 days to drain at gigabit Ethernet speeds (rough check after this slide)
• Manageability
  – Monitoring, repair, install
• Operations cost
  – How much effort is it to run?
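A rough check of the drain-time figure above, assuming a single gigabit Ethernet link running at wire speed (about 125 MB/s); real sustained rates would be lower, so these are best-case numbers. The 270 TB and 540 TB figures come from the storage-in-a-rack slide.

```python
# How long does it take to drain a given capacity over one gigabit link?
WIRE_BYTES_PER_S = 1e9 / 8          # 1 Gbit/s = ~125 MB/s at wire speed
SECONDS_PER_DAY = 86400

tb_per_day = WIRE_BYTES_PER_S * SECONDS_PER_DAY / 1e12
print("At wire speed: %.1f TB/day" % tb_per_day)       # ~10.8 TB/day

for capacity_tb in (270, 540):      # per head node / usable capacity of a full rack
    print("%d TB -> %.0f days" % (capacity_tb, capacity_tb / tb_per_day))
# 270 TB -> ~25 days, 540 TB -> ~50 days at best; with realistic sustained rates
# and ongoing user traffic, drain times of the order of 40 days are easily reached.
```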
Conclusions
• Lustre should continue to be watched, but it is currently not being considered for the Tier-0, analysis or AFS-replacement use cases
• Lifecycle management is a major concern for the future as the size of the archive grows
• Disk-based archiving may be an option
  – In-depth reliability study before production
  – Watch trends in disk/tape capacities and pricing
  – Adapt software for multiple hierarchies
Backup Slides
Use Cases