
R&D Activities on Storage in CERN-ITs FIO group Helge Meinhard / CERN-IT HEPiX Fall 2009 LBNL 27 October 2009


  1. R&D Activities on Storage in CERN-IT’s FIO group Helge Meinhard / CERN-IT HEPiX Fall 2009 LBNL 27 October 2009 ������������������ ����������������� ����������� ������������� �

2. Outline
Follow-up of two presentations at the Umeå meeting:
• iSCSI technology (Andras Horvath)
• Lustre evaluation project (Arne Wiebalck)

3. iSCSI - Motivation
• Three approaches
  – Possible replacement for rather expensive setups with Fibre Channel SANs (used e.g. for physics databases with Oracle RAC, and for backup infrastructure) or proprietary high-end NAS appliances
    • Potential cost saving
  – Possible replacement for bulk disk servers (Castor)
    • Potential gain in availability, reliability and flexibility
  – Possible use for applications for which small disk servers have been used in the past
    • Potential gain in flexibility, cost saving
• Focus is on functionality, robustness and large-scale deployment rather than ultimate performance

4. iSCSI terminology
• iSCSI is a set of protocols for block-level access to storage
  – Similar to FC
  – Unlike NAS (e.g. NFS)
• “Target”: storage unit listening to block-level requests
  – Appliances available on the market
  – Do-it-yourself: put a software stack on a storage node, e.g. our storage-in-a-box nodes
• “Initiator”: unit sending block-level requests (e.g. read, write) to the target
  – Most modern operating systems feature an iSCSI initiator stack: Linux RH4, RH5; Windows
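To make the target/initiator split concrete, here is a minimal sketch of the initiator side on a Linux client, assuming the standard open-iscsi iscsiadm tool is installed; the portal address and target IQN are hypothetical placeholders, not values from the slides.

```python
#!/usr/bin/env python
# Sketch of the iSCSI initiator side, assuming the open-iscsi tools
# (iscsiadm) on the client. Portal address and target IQN are
# hypothetical placeholders.
import subprocess

PORTAL = "192.0.2.10:3260"                      # hypothetical target portal
TARGET = "iqn.2009-10.ch.cern:storage.example"  # hypothetical target IQN

def run(*args):
    """Run a command and return its stdout, raising on failure."""
    return subprocess.check_output(args).decode()

# 1. Ask the portal which targets it exposes (SendTargets discovery).
print(run("iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", PORTAL))

# 2. Log in to one target; its LUNs then appear as ordinary /dev/sdX
#    block devices on the initiator, much like local disks or FC LUNs.
run("iscsiadm", "-m", "node", "-T", TARGET, "-p", PORTAL, "--login")

# 3. Show the active session for verification.
print(run("iscsiadm", "-m", "session"))
```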

5. Hardware used
• Initiators: a number of different servers, including
  – Dell M610 blades
  – Storage-in-a-box servers
  – All running SLC5
• Targets:
  – Dell EqualLogic PS5000E (12 drives, 2 controllers with 3 GigE each)
  – Dell EqualLogic PS6500E (48 drives, 2 controllers with 4 GigE each)
  – Infortrend A12E-G2121 (12 drives, 1 controller with 2 GigE)
  – Storage-in-a-box: various models with multiple GigE or 10GigE interfaces, running Linux
• Network (if required): private, HP ProCurve 3500 and 6600

6. Target stacks under Linux
• Red Hat Enterprise Linux 5 comes with tgtd (see the sketch below)
  – Single-threaded
  – Does not scale well
• Tests with IET
  – Multi-threaded
  – No performance limitation in our tests
  – Required a newer kernel to work out of the box (Fedora and Ubuntu Server worked for us)
• In the context of a collaboration between CERN and CASPUR, work is going on to understand the steps needed to backport IET to RHEL 5
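For illustration, a sketch of how a storage-in-a-box node could export a disk through tgtd, the target daemon shipped with RHEL 5, driven via its tgtadm control tool; the target IQN and backing device below are hypothetical, and tgtd is assumed to be already running.

```python
#!/usr/bin/env python
# Sketch of exporting a local block device through tgtd using tgtadm.
# Target IQN and backing device are hypothetical placeholders.
import subprocess

TID = "1"                                        # target id within tgtd
IQN = "iqn.2009-10.ch.cern:storage.box01.disk1"  # hypothetical IQN
BACKING_DEVICE = "/dev/sdb"                      # hypothetical backing store

def tgtadm(*args):
    subprocess.check_call(("tgtadm", "--lld", "iscsi") + args)

# 1. Create a new target with the chosen IQN.
tgtadm("--op", "new", "--mode", "target", "--tid", TID, "-T", IQN)

# 2. Attach the backing block device as LUN 1 of that target.
tgtadm("--op", "new", "--mode", "logicalunit",
       "--tid", TID, "--lun", "1", "-b", BACKING_DEVICE)

# 3. Allow all initiators to connect (a real setup would restrict by ACL).
tgtadm("--op", "bind", "--mode", "target", "--tid", TID, "-I", "ALL")
```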

7. Performance comparison
• 8 kB random-I/O test with the Oracle tool Orion

8. Performance measurement
• 1 server, 3 storage-in-a-box servers as targets
  – Each target exporting 14 JBOD disks over 10GigE

9. Almost production status…
• Two storage-in-a-box servers with hardware RAID 5, running SLC5 and tgtd over GigE
  – Initiator provides multipathing and software RAID 1 (see the sketch below)
  – Used for some grid services
  – No issues
• Two Infortrend boxes (JBOD configuration)
  – Again, the initiator provides multipathing and software RAID 1
  – Used as backend storage for the Lustre MDT (see next part)
• Tools for setup, configuration and monitoring are in place
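A sketch of the initiator-side layering described above: dm-multipath aggregates the paths to each iSCSI LUN, and md mirrors the two resulting devices. This assumes the two LUNs are already logged in; the multipath device names, filesystem and mount point are hypothetical.

```python
#!/usr/bin/env python
# Sketch of initiator-side redundancy: multipathing over the iSCSI
# paths, then software RAID 1 across the two multipathed LUNs.
# Device names, filesystem and mount point are hypothetical.
import subprocess

def run(*args):
    subprocess.check_call(args)

# 1. Let dm-multipath (re)scan and set up its maps over the iSCSI paths.
run("multipath", "-r")

# 2. Mirror the two multipathed LUNs with Linux software RAID 1, so
#    either backing box can disappear without losing data.
run("mdadm", "--create", "/dev/md0",
    "--level=1", "--raid-devices=2",
    "/dev/mapper/mpath0", "/dev/mapper/mpath1")

# 3. Put a filesystem on the mirror and mount it for the service.
run("mkfs.ext3", "/dev/md0")
run("mkdir", "-p", "/srv/gridservice")
run("mount", "/dev/md0", "/srv/gridservice")
```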

10. Being worked on
• Large deployment of EqualLogic ‘Sumos’ (48 drives of 1 TB each, dual controllers, 4 GigE per controller): 24 systems, 48 front-end nodes
• Experience encouraging, but there are issues
  – Controllers don’t support DHCP, manual configuration required
  – Buggy firmware
  – Problems with batteries on controllers
  – Support not yet fully integrated into Dell structures
  – Remarkable stability
    • We failed all network and server components that can fail; the boxes kept running
  – Remarkable performance

11. EqualLogic performance
• 16 servers, 8 Sumos, 1 GigE per server, iozone
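The exact iozone parameters are not given on the slide; purely for illustration, an invocation of the kind each client server might run against its mounted volume. Record size, file size, thread count and mount point are assumptions, not the settings behind the chart.

```python
#!/usr/bin/env python
# Illustrative iozone run on one client node; record size, file size,
# thread count and mount point are assumptions, not the parameters
# actually used for the measurements on this slide.
import subprocess

MOUNTPOINT = "/mnt/equallogic"   # hypothetical mount of an iSCSI volume

subprocess.check_call([
    "iozone",
    "-i", "0",            # test 0: sequential write/rewrite
    "-i", "1",            # test 1: sequential read/reread
    "-r", "1m",           # 1 MB record size
    "-s", "8g",           # 8 GB per thread, large enough to defeat caching
    "-t", "4",            # 4 threads
    "-F"] + ["%s/iozone.%d" % (MOUNTPOINT, i) for i in range(4)])
```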

12. Appliances vs. home-made
• Appliances
  – Stable
  – Performant
  – Highly functional (EqualLogic: snapshots, relocation without server involvement, automatic load balancing, …)
• Home-made with storage-in-a-box servers
  – Inexpensive
  – Complete control over configuration
  – Can run things other than the target software stack
  – Can select function at software install time (iSCSI target vs. classical disk server with rfiod or xrootd)

13. Ideas (partly started testing)
• Two storage-in-a-box servers as a highly redundant setup
  – Running target and initiator stacks at the same time
  – Mounting half the disks locally, half on the other machine
  – Some heartbeat detects failures and (e.g. by resetting an IP alias) moves functionality to one or the other box
• Several storage-in-a-box servers as targets (see the sketch below)
  – Exporting disks either as JBOD or as RAID
  – Front-end server creates a software RAID (e.g. RAID 6) over volumes from all storage-in-a-box servers
  – Any one (or two with SW RAID 6) storage-in-a-box server can fail entirely and the data remain available
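A sketch of the second idea: the front-end server builds a software RAID 6 across one volume from each storage-in-a-box target, so any two whole boxes can fail while the data stay available. The device names (one iSCSI-backed device per box, as seen after login on the front end) are hypothetical.

```python
#!/usr/bin/env python
# Sketch: software RAID 6 on the front-end server across one volume
# from each of six storage-in-a-box targets. Device names are
# hypothetical placeholders for the iSCSI-backed devices.
import subprocess

# One block device per storage-in-a-box target, as seen on the front end.
DEVICES = ["/dev/sdb", "/dev/sdc", "/dev/sdd",
           "/dev/sde", "/dev/sdf", "/dev/sdg"]

subprocess.check_call(
    ["mdadm", "--create", "/dev/md10",
     "--level=6",
     "--raid-devices=%d" % len(DEVICES)] + DEVICES)

# /dev/md10 can then carry a filesystem or serve as a disk-server volume;
# losing any two of the six boxes leaves the array degraded but readable.
```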

14. Lustre Evaluation Project
• Tasks and goals
  – Evaluate Lustre as a candidate for storage consolidation
    • Home directories
    • Project space
    • Analysis space
    • HSM
  – Reduce service catalogue
    • Increase overlap between service teams
    • Integrate with CERN fabric management tools

15. Areas of interest (1)
• Installation
  – Quattorized installation of Lustre instances
  – Client RPMs for SLC5
• Backup
  – LVM-based snapshots for metadata (see the sketch below)
  – Tested with TSM, set up for the PPS instance
  – Changelogs feature of v2.0 not yet usable
• Strong authentication
  – v2.0: early adaptation, full Kerberos in Q1/2011
  – Tested and used by other sites (not by us yet)
• Fault tolerance
  – Lustre comes with built-in failover
  – PPS MDS iSCSI setup
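A sketch of the LVM-snapshot approach to metadata backup: freeze a point-in-time copy of the metadata volume, mount it read-only, hand it to the backup client, then drop it. The volume group, logical volume, snapshot size, mount point and the use of the TSM client command are assumptions illustrating the pattern, not the actual PPS configuration.

```python
#!/usr/bin/env python
# Sketch of an LVM-snapshot based metadata backup. VG/LV names,
# snapshot size, mount point and backup command are assumptions.
import subprocess

def run(*args):
    subprocess.check_call(args)

MDT_LV   = "/dev/vg_mdt/mdt"        # hypothetical LV holding the metadata
SNAP_LV  = "/dev/vg_mdt/mdt_snap"
SNAP_DIR = "/mnt/mdt_snap"

# 1. Freeze a point-in-time copy of the metadata volume.
run("lvcreate", "--snapshot", "--size", "10G",
    "--name", "mdt_snap", MDT_LV)

# 2. Mount the snapshot read-only and back it up, e.g. with the TSM
#    backup-archive client as mentioned on the slide.
run("mkdir", "-p", SNAP_DIR)
run("mount", "-o", "ro", SNAP_LV, SNAP_DIR)
run("dsmc", "incremental", SNAP_DIR)

# 3. Clean up.
run("umount", SNAP_DIR)
run("lvremove", "-f", SNAP_LV)
```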

16. FT: MDS PPS Setup
[Diagram: MDS/MDT, OSS and client (CLT) nodes on Dell PowerEdge M600 blade servers (16 GB), connected over a private iSCSI network to Dell EqualLogic iSCSI arrays with 16x 500 GB SATA drives]
• Fully redundant against component failure
  – iSCSI for shared storage
  – Linux device mapper + md for mirroring
  – Quattorized
  – Needs testing

17. Areas of Interest (2/2)
• Special performance & optimization
  – Small files: “numbers dropped from slides”
  – Postmark benchmark (not done yet)
• HSM interface
  – Active development, driven by CEA
  – Access to Lustre HSM code (to be tested with TSM/CASTOR)
• Life cycle management (LCM) & tools
  – Support for day-to-day operations?
  – Limited support for setup, monitoring and management

18. Findings and Thoughts
• No strong authentication as of now
  – Foreseen for Q1/2011
• Strong client/server coupling
  – Recovery
• Very powerful users
  – Striping, pools
• Missing support for life cycle management
  – No user-transparent data migration
  – Lustre/kernel upgrades difficult
• Moving targets on the roadmap
  – v2.0 not yet stable enough for testing

19. Summary
• Some desirable features not there (yet)
  – Wish list communicated to Sun
  – Sun interested in the evaluation
• Some more tests to be done
  – Kerberos, small files, HSM
• Documentation
