12th TF-Storage Meeting – Berlin
CESNET storage activities overview
Jiří Horký (jiri.horky@cesnet.cz)
6th of March, 2013
History
• A relatively new activity for CESNET
  • started in autumn 2010 with 2 FTEs
  • the preparation phase continued until the end of 2011, with no hardware and no services
• Some teething problems appeared afterwards
  • the department was restructured in April 2012
  • 6.5 FTEs in total right now
Financing and scale
• EU structural funds + projects of the Ministry of Education, Youth and Sports
• Research and Development for Innovations OP (“R&DI OP”) – project “eIGeR”
  • CZK 100 million ≈ EUR 4 million (infrastructure only)
  • May 2011 – October 2013 (end of pilot phase)
• Project “Large Infrastructure”
  • 2011 – 2015, operational costs
  • commitment to sustain operation at least until 2018
Data storage facility
• Three geographically separated storage locations (Pilsen, Jihlava, Brno) to support the research and science community
  • Large research projects
  • Public universities
  • Academy of Sciences
  • Public libraries
    • digitization of books
  • ...
• Total capacity 16.6 PB + another tender in preparation
Distributed storage locations
Technical assumptions
• Assumptions:
  • emphasis on economic aspects + behavior transparent to users -> HSM system
  • backup and archival demands foreseen -> requirement for a tape library
• Differences in usage patterns will be covered by migration policies (see the sketch below), e.g.:
  • archival – move everything from disk to tape almost immediately
  • input files for batch jobs – migrate only files last touched more than one month ago
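The batch-job policy above boils down to an access-time cut-off. The following is a minimal, purely illustrative Python sketch of that selection logic; the production policies are expressed in the HSM's own rule language (DMF or TSM/GPFS), and the directory path and the exact one-month threshold shown here are placeholders.

```python
import os
import time

# Illustrative only: pick files whose last access time is older than roughly
# one month, i.e. the candidates the HSM would migrate from disk to tape.
ONE_MONTH = 30 * 24 * 3600           # policy threshold from the slide (approx.)
FILESET = "/exports/batch-input"     # hypothetical path, not a real fileset name


def candidates_for_migration(root, max_idle=ONE_MONTH):
    """Yield files under `root` not touched for longer than `max_idle` seconds."""
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if now - os.stat(path).st_atime > max_idle:
                    yield path
            except OSError:
                continue  # file vanished or is unreadable; skip it


if __name__ == "__main__":
    for path in candidates_for_migration(FILESET):
        print(path)  # in reality the HSM would queue these for tape migration
```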
The Pilsen storage site
• Hosted by the University of Western Bohemia; delivered at the end of 2011, in pilot operation since May 2012
• Multi-tier, managed by DMF (Data Migration Facility), SLES OS, CXFS file system (SGI)
• Disk systems in two IS4600 arrays:
  • Tier 1: 50 TB FC drives
  • Tier 2: 450 TB SATA drives
• Tier 3: Spectra T-Finity tape library with dual robotics
  • 2200 slots with LTO-5 (3.3 PB in total), 2300 free slots
• 8 Gbit redundant FC SAN, 10 Gbit Ethernet
• 2x10 Gbit connection to the CESNET2 network
• 6 frontends managed by Pacemaker in an HA cluster
• 2 HSM servers in HA mode, one system for administration
The Pilsen storage site
The Jihlava storage site
• Tender finished at the beginning of 2013
• Hosted by the Vysočina Region Authority
• Multi-tier, GPFS file system, Tivoli Storage Manager (IBM)
• Disk systems in two DCS3700 arrays (+ JBODs):
  • Tier 1: 800 TB SATA drives
• Tier 2:
  • 2.5 PB MAID (Promise)
  • 3.8 PB tape – IBM TS3500 with TS1140 drives, dual robotics
• 16 Gbit redundant FC SAN, 10 Gbit Ethernet
• 2x10 Gbit connection to the CESNET2 network
• 5 frontends, 2 HSM servers in HA mode
The Jihlava storage site
• The hardware is being delivered just this week!
The Brno storage site
• Tender finished at the very beginning of 2013
• Hosted by the Brno University of Technology
• Multi-tier, GPFS file system, Tivoli Storage Manager (IBM)
• Disk systems in two DCS3700 arrays (+ JBODs):
  • Tier 1: 400 TB SATA drives
• Tier 2:
  • 1.8 PB MAID (Proware)
  • 3.5 PB tape – IBM TS3500 with TS1140 tape drives, dual robotics
• 16 Gbit redundant FC SAN, 10 Gbit Ethernet
• 2x10 Gbit connection to the CESNET2 network
• 5 frontends, 2 HSM servers in HA mode
The Brno storage site
• Server room still under construction!
Use cases
• Long-term storage with high redundancy (geographical replicas) – but no data curation
• Institutional backups and/or archive – individual backups supported as well
• Input/output data for batch jobs
• Exchange point for scientific data among collaborations, using standard file protocols
• Not suitable for:
  – on-line data processing
  – critical services
Access to storage services
• File access – access to the same namespace (a scripted SFTP example is sketched below)
  – our “core business” – takes advantage of file-level migration to lower tiers
  – NFSv4 with strong Kerberos authentication
  – SCP, SFTP (sshfs)
  – FTPS
  – rsync
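Of the protocols above, SFTP is the easiest to script. Below is a minimal sketch using the third-party paramiko library; the hostname, username and paths are made-up placeholders, and production access would normally ride on the user's Kerberos identity rather than the default key/password authentication shown here.

```python
import paramiko

# Hypothetical endpoint and account -- not the real CESNET frontend names.
HOST = "storage.example.cesnet.cz"
USER = "jdoe"

client = paramiko.SSHClient()
client.load_system_host_keys()
client.connect(HOST, username=USER)   # auth details depend on the site setup

sftp = client.open_sftp()
# upload into the HSM-backed namespace; the file may later migrate to tape
sftp.put("results.tar.gz", "archive/results.tar.gz")
# download; a file resident on tape is recalled transparently by the HSM
sftp.get("archive/inputs.tar.gz", "inputs.tar.gz")

sftp.close()
client.close()
```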
Access to storage services
• FileSender
  • our first storage service – but not heavily advertised yet
  • files up to 400 GB (images of whole systems)
  • still stable even at that size, and uploads can be resumed!
  • upload speed limited to ~100 Mbit/s
  • a command-line API?
Access to storage services
• Grid Storage Element (a transfer sketch follows below)
  – dCache for the ATLAS (LHC at CERN) and AUGER experiments
  – coupled with the HSM to enable efficient storage of infrequently used data
  – 30 TB on disks + 2x100 TB on tapes
  – data are actually read here, not only written
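As a rough illustration of the "written and read back" point, a GridFTP transfer to and from the SE might look like the sketch below. The endpoint and /pnfs paths are invented, a valid grid proxy (voms-proxy-init) is assumed to already exist, and the experiments in practice drive such transfers through their own data-management frameworks.

```python
import subprocess

# Hypothetical dCache GridFTP door and namespace path -- placeholders only.
SE = "gsiftp://se.example.cesnet.cz"
REMOTE = SE + "/pnfs/example.cz/data/atlas/run001.root"

# write a local file into the SE; dCache later pushes it to tape via the HSM
subprocess.run(["globus-url-copy", "file:///tmp/run001.root", REMOTE], check=True)

# read it back -- unlike a pure archive, this data really is read by jobs
subprocess.run(["globus-url-copy", REMOTE, "file:///tmp/run001.copy.root"], check=True)
```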
Access to storage services
• Block access (an attachment sketch follows below)
  – iSCSI served directly from the arrays
  – no HSM migration possible
  – speed (latency) issues
  – only in special, well-justified cases (7 TB so far)
  – but a very stable service
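For completeness, attaching one of these volumes on a client follows the standard open-iscsi workflow. The sketch below simply drives iscsiadm from Python; the portal address and target IQN are placeholders, and in practice the exported LUNs are restricted to the approved client hosts.

```python
import subprocess

# Placeholder portal and IQN -- not the real array addresses or target names.
PORTAL = "192.0.2.10"
TARGET = "iqn.2013-03.cz.cesnet.example:block01"

# discover the targets offered by the portal
subprocess.run(["iscsiadm", "--mode", "discovery",
                "--type", "sendtargets", "--portal", PORTAL], check=True)

# log in; the block device then appears as a local SCSI disk (e.g. /dev/sdX)
subprocess.run(["iscsiadm", "--mode", "node", "--targetname", TARGET,
                "--portal", PORTAL, "--login"], check=True)
```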
User management
• One user identity across the whole CESNET e-Infrastructure (data storage group, NGI_CZ, videoconferencing services, ...)
• Divide and conquer in practice:
  – users are arranged in groups, each with an admin drawn from the group
  – resource owners negotiate service quantity and quality (quotas, migration policies, ...) with the group admin
  – the group admin decides who can join the group and what quality of service each particular user receives
• Managed by the Perun v3 system, developed in collaboration with other projects
User management
• We require users to be well verified
  – “somebody we trust has seen the user’s ID”
  – managed by the eduID.cz federation
  – we then create a Kerberos principal & password, which forms the user's e-Infrastructure identity
  – decoupled from the federation after registration
• The Grid Storage Element and iSCSI are special cases
Open questions
• Inter-site replication
  – an FTP backend for each HSM in the beginning
    • each site will migrate data to another site as its lowest tier
• Long-term goal: one namespace across all three sites, accessible via standard file protocols
  – suggestions?
Conclusion
• 3.8 PB in pilot operation
  • mainly backups and archive data
• 16.6 PB available soon
• Broad range of user demands
  • broad range of technical challenges (a.k.a. problems :-))
  • surprisingly many bugs found in (and patches submitted for) commonly used tools and applications!
Thank you for your attention. Questions?
Jiří Horký (jiri.horky@cesnet.cz)