Converged, Fault Tolerant, Distributed, Parallel iRODS


  1. Converged, Fault Tolerant, Distributed, Parallel iRODS. iRODS User Group Meeting 2017, Aaron Gardner, June 2017

  2. Introduction • BioTeam is focused on research computing consulting and products • Scientists with deep IT and scientific computing expertise • Infrastructure (HPC, Storage, Networking, Enterprise, Cloud), Informatics, Software Development, Cross-disciplinary Assessments • 15 years bridging the “gap” between science, IT, and HPC

  3. History with iRODS • BioTeam members have worked with iRODS since 2011 (thanks, Reagan) • A number of consulting engagements around iRODS • BioTeam sees data management as a critical mountain that must be scaled • We are actively engaged with the scientific community to solve data management issues collaboratively

  4. Motivation • Resource server vault storage exclusivity • OK for direct-attached storage and active archive • Not for distributed parallel storage at speed • Multiple copies on primary (fast) storage are a non-starter for iRODS


  5. Motivation • Resource server fails: data drops off the grid • Catalog fails: lose access to everything • Multiple copies of catalog data are not ideal • Avoid additional hardware • Performance and scalability. We want “all the things”; what to do?


  6. Can an iRODS catalog and resources have the same resiliency and scalability that today’s distributed storage systems have? How close can we get?

  7. New Reference Architecture [diagram: clients connect over a high-speed local network (10-100Gb Ethernet, IB, etc.) to eight VMs (vm0-vm7) running resource servers iRES.0-iRES.7, each with an /fs mount; two of the VMs also host an iCAT instance with its db; everything sits on distributed shared storage behind controller0 and controller1]

  8. Let it fail [diagram: one VM’s /fs is down (X); its resource server, iRES.0, is picked up by a surviving VM and the grid keeps serving from the distributed shared storage]

  9. Let it fail, let it fail [diagram: a second VM fails (X); iRES.0 and iRES.1 now both run on surviving VMs while the remaining resource servers and both iCATs stay online]

  10. Let it fail, let it fail, let it fail. [diagram: three VMs are down, taking one iCAT, one db, and three /fs mounts with them; the surviving iCAT, db, and resource servers absorb iRES.0, iRES.1, and iRES.3 and the grid remains available]

  11. New Reference Architecture. Converged • Deployed on the storage controller(s) • No additional hardware or server instances • Request latency minimized • Single replica kept on shared storage. Fault Tolerant • Resource servers see all available storage • “Physical” resources impersonate “virtual” ones (see the hosts_config.json sketch below) • Cluster monitoring and failure handling • Only one “physical” resource, catalog, and database is needed
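A minimal sketch of how a “physical” resource server might impersonate the “virtual” resource hostnames using /etc/irods/hosts_config.json (named on slide 14). The hostnames below are illustrative assumptions, not taken from the talk; the layout is the standard iRODS 4.1 hosts_config.json format, in which every name listed under a "local" entry is treated as an alias for this server:

      {
          "host_entries": [
              {
                  "address_type": "local",
                  "addresses": [
                      { "address": "vm0.cluster.example.org" },
                      { "address": "vr0.cluster.example.org" },
                      { "address": "vr1.cluster.example.org" }
                  ]
              }
          ]
      }

With an entry like this, vm0 answers for the virtual resource hosts vr0 and vr1, so a virtual resource reassigned to it by the cluster manager keeps resolving without any catalog changes.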


  12. New Reference Architecture. Distributed • Resource performance scales with the backing storage • iCAT is hosted on distributed storage and scales independently. Parallel • Clients can read and write to all resources at the same time (see the example below) • Minimize false “data island” lock-in • Clients can achieve higher aggregate bandwidth than a single resource • (Future) Multipart could provide true parallel object access
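A hedged illustration of the “parallel” point: because every virtual resource fronts the same shared vault, independent clients or shells can target different resources at once. The resource and file names are assumptions for the example; iput -R is the standard icommand for directing a put at a specific resource:

      # four concurrent uploads, each targeting a different virtual resource
      iput -R vR0 sample0.dat &
      iput -R vR1 sample1.dat &
      iput -R vR2 sample2.dat &
      iput -R vR3 sample3.dat &
      wait

Aggregate client bandwidth then scales with the number of resource servers rather than being capped by any single one.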

  13. • Unmodified codebase • Scale horizontally • Incorporate with other storage

  14. How was this accomplished? • iRODS 4.1.9 (refactoring for 4.2.1) • Ansible, Vagrant, VirtualBox, NFS for test • Spectrum Scale on cluster for production • Pacemaker/(CMAN | Corosync) • Custom irods and icat OCF resource agents (see the sketch below) • “Virtual” resource reference counting • /etc/irods/hosts_config.json • Galera Cluster for MySQL
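A hedged sketch of how the custom OCF agents might be wired into Pacemaker with pcs. The talk only states that custom irods and icat OCF resource agents exist; the provider name (custom), the IP addresses, and the agent parameters below are assumptions for illustration:

      # floating IP plus the iCAT service, grouped so they fail over together
      pcs resource create icat0_vip ocf:heartbeat:IPaddr2 ip=192.0.2.10 cidr_netmask=24 \
          op monitor interval=10s
      pcs resource create icat0_svc ocf:custom:icat \
          op monitor interval=30s
      pcs resource group add icat0_grp icat0_vip icat0_svc

      # one resource-server instance per virtual resource, free to run on any surviving node
      pcs resource create vr0_svc ocf:custom:irods resource_name=vR0 \
          op monitor interval=30s

When a node dies, Pacemaker restarts the affected group or resource on a surviving node, and hosts_config.json (previous sketch) plus the floating IP keep the virtual hostnames reachable.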


  15. Physical Resource (pR) Failures: 0 [diagram: clients go through a composite resource tree (random, etc.) of virtual resources vR0-vR7, each mapped to its own physical resource (pR); every pR writes into the same shared POSIX filesystem vault, /vault, which holds obj.0-obj.7]
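A minimal sketch of how a composite tree like this might be declared with iadmin, assuming a random coordinating resource over unixfilesystem leaves. The resource names, hostnames, and vault path mirror the diagram but are illustrative; each virtual resource gets its own hostname (served via hosts_config.json and floating IPs) while every vault points at the same shared filesystem:

      # coordinating resource that picks a child at random for each new replica
      iadmin mkresc sharedRandom random

      # one "virtual" leaf per host alias, all backed by the same shared /vault
      iadmin mkresc vR0 unixfilesystem vr0.cluster.example.org:/vault
      iadmin mkresc vR1 unixfilesystem vr1.cluster.example.org:/vault
      iadmin addchildtoresc sharedRandom vR0
      iadmin addchildtoresc sharedRandom vR1
      # ... repeat for vR2 through vR7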

  16. Physical Resource (pR) Failures: 1 [diagram: one pR has failed (X); its virtual resource, vR2, is re-homed onto a surviving pR and every object in /vault stays reachable]

  17. Physical Resource (pR) Failures: 2 [diagram: two pRs have failed (X); vR2 and vR5 are re-homed onto surviving pRs and /vault remains fully accessible]

  18. Physical Resource (pR) Failures: 3 [diagram: three pRs have failed (X); vR1, vR2, and vR5 are re-homed onto the remaining pRs, which continue to serve all of obj.0-obj.7 from /vault]

  19. HA Active-Active iCAT Cluster [diagram: clients reach the iCAT layer through load balancing (DNS round robin, etc.); each iCAT instance (icat.0 … icat.n) sits behind its own floating IP and talks to a SQL node (sql.0 … sql.n), also behind floating IPs; the SQL nodes form a MySQL Galera cluster with SST over fixed IPs]
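A hedged sketch of the Galera side of this diagram: a minimal my.cnf fragment for one SQL node. Only “Galera Cluster for MySQL” and SST over fixed IPs come from the talk; the addresses, cluster name, and rsync SST method below are assumptions:

      [mysqld]
      # Galera requires row-based replication on InnoDB
      binlog_format            = ROW
      default_storage_engine   = InnoDB
      innodb_autoinc_lock_mode = 2

      # wsrep / Galera settings (fixed IPs of the SQL nodes)
      wsrep_on                 = ON
      wsrep_provider           = /usr/lib64/galera/libgalera_smm.so
      wsrep_cluster_name       = icat_galera
      wsrep_cluster_address    = "gcomm://10.0.0.11,10.0.0.12,10.0.0.13"
      wsrep_node_address       = 10.0.0.11
      wsrep_sst_method         = rsync

Each iCAT’s database configuration then points at the floating SQL IP in front of its local Galera node, so a failed SQL node simply hands its floating IP to a survivor.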

  20. HA Active-Active iCAT Cluster: SQL Fail [diagram: one SQL node has failed (X); its floating IP moves to a surviving Galera node and every iCAT instance continues to reach a consistent catalog]

  21. HA Active-Active iCAT Cluster: iCAT Fail [diagram: an iCAT instance has failed (X) on top of the failed SQL node; its floating IP fails over to a surviving iCAT and load-balanced clients carry on uninterrupted]

  22. iRODS Distributed Database Experiences • Oracle RAC • MySQL Cluster • Postgres-XL • MySQL Galera


  23. iRODS Soapbox • Resource throughput and scalability • Catalog performance and scalability • Atomicity of transactions • Multipart • Multipath for resources • Fastpath

  24. Future Work • Benchmark and test • Postgres-XL • Apache Trafodion • Desirable replication • Additional architectures (HCI, etc.) • Microservice deployment in Kubernetes

  25. Thank You • bioteam.net • info@BioTeam.net • @BioTeam
