EOS Open Storage - the CERN storage ecosystem for scientific data repositories (PowerPoint PPT Presentation)
  1. EOS Open Storage: the CERN storage ecosystem for scientific data repositories. Dr. Andreas-Joachim Peters for the EOS project, CERN IT-ST

  2. Overview • Introduction • EOS at CERN and elsewhere • Tapes, Clouds & Lakes • Scientific Service Bundle • EOS as a filesystem • Vision, Summary & Outlook

  3. Everything about EOS: http://eos.cern.ch Disclaimer: this presentation skips many interesting aspects of the core development work and focuses on a few specific aspects.

  4. Introduction What is EOS? EOS is a storage software solution for • central data recording • user analysis • data processing

  5. Introduction: EOS and the Large Hadron Collider (LHC)

  6. Introduction EOS and CERNBox Sync & Share Platform with collaborative editing

  7. Architecture EOS is implemented in C++ using the XRootD framework. XRootD provides a client/server protocol which is tailored for data access: • third party transfer • WAN latency compensation using vectored read requests • pluggable authentication framework • … [Diagram: Storage Clients (Browser, Applications, Mounts), Meta Data Service / Namespace, Asynchronous Messaging Service, Data Storage]
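Slide 7 mentions WAN latency compensation using vectored read requests. A minimal client-side sketch with the public XrdCl API is shown below; the endpoint URL, file path and byte ranges are placeholders, and this illustrates the generic XRootD client interface rather than EOS-internal code.

```cpp
// Minimal sketch (placeholder endpoint and path): fetch several scattered
// byte ranges in a single vectored read, i.e. one network round trip.
#include <XrdCl/XrdClFile.hh>
#include <iostream>
#include <vector>

int main()
{
  XrdCl::File file;
  XrdCl::XRootDStatus st =
      file.Open("root://eos.example.ch//eos/demo/file.dat",
                XrdCl::OpenFlags::Read);
  if (!st.IsOK()) { std::cerr << st.ToString() << std::endl; return 1; }

  // Three scattered 4 KiB ranges, each with its own target buffer.
  std::vector<char> buf(3 * 4096);
  XrdCl::ChunkList chunks;
  chunks.push_back(XrdCl::ChunkInfo(0,       4096, &buf[0]));
  chunks.push_back(XrdCl::ChunkInfo(1 << 20, 4096, &buf[4096]));
  chunks.push_back(XrdCl::ChunkInfo(1 << 22, 4096, &buf[8192]));

  XrdCl::VectorReadInfo *info = 0;
  st = file.VectorRead(chunks, 0, info);
  if (st.IsOK())
    std::cout << "vector read returned " << info->GetSize() << " bytes\n";
  delete info;
  file.Close();
  return 0;
}
```

Shipping all requested ranges in one request is what makes sparse reads tolerable when the round-trip time is tens of milliseconds, as in the distributed deployments shown later.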

  8. Architecture Transition 2017/18 • during 2017 CERN services exceeded design limits - lower service availability • leading effort to commission the new architecture in 2018. EOS releases are named after gemstones: AQUAMARINE version (production <= 2017) - in-memory namespace, scalability limited by in-memory persistency; CITRINE version (production >= 2017) - namespace with in-memory cache & scale-out KV persistency in QuarkDB.
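CITRINE persists the namespace in QuarkDB, which speaks the Redis wire protocol. As an illustration only, a generic Redis client can therefore issue a simple health-check command against it; the hostname and port below are placeholders, and this is not an EOS administration interface.

```cpp
// Minimal sketch, assuming a QuarkDB instance reachable at a placeholder
// host/port. QuarkDB answers Redis-protocol commands, so hiredis can be
// used for a basic connectivity check.
#include <hiredis/hiredis.h>
#include <cstdio>

int main()
{
  redisContext *c = redisConnect("quarkdb.example.ch", 7777);  // placeholders
  if (!c || c->err) { std::fprintf(stderr, "connect failed\n"); return 1; }

  // PING is part of the Redis command subset understood by QuarkDB.
  redisReply *r = static_cast<redisReply*>(redisCommand(c, "PING"));
  if (r) { std::printf("reply: %s\n", r->str); freeReplyObject(r); }
  redisFree(c);
  return 0;
}
```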

  9. EOS at CERN 15 EOS instances • 4 LHC • 2 CERNBox (new home) • EOSMEDIA (photo, video) • EOSPUBLIC (non-LHC experiments) • EOSBACKUP (backup for CERNBox) • 6 for various test infrastructures [GRAFANA dashboard, 3/2018]

  10. Distributed EOS deployments operate across WAN latencies from 22 ms to 60 ms: EOS@CERN spanning the CERN & Wigner data centres (3 x 100 Gb links), AARNet CloudStor, and a prototype in the Russian Federation.

  11. EOS for OpenData

  12. CERN Open Source for Open Data

  13. Tapes …

  14. EOS + Tape = EOSCTA (CERN Tape Archive) • in 2017 tape storage passed 200 PB with the CERN CASTOR storage system • CTA modularises and splits the tape functionality from the disk cache implementation and can be adapted to the disk technology • tape copies are treated in EOS as offline disk replicas • EOS & CTA communicate via Google protocol buffer messages, which can be configured synchronous or asynchronous using the EOS workflow engine • first production CTA code available in 2018 - continuous testing & improvements currently under way

  15. participating in Extreme (Data) Clouds http://www.extreme-datacloud.eu/

  16. participating in WLCG (Worldwide LHC Computing Grid) http://wlcg.web.cern.ch/

  17. Datalakes: evolution of distributed storage • Datalakes are an extension of storage consolidation where geographically distributed storage centres are operated and accessed as a single entity. Goals • Optimise storage usage to lower the cost of stored data • technology requirements: geo-awareness, storage tiering and automated file workflows fostered by fa(s)t QOS

  18. HL-LHC Datalakes deal with 12x more data

  19. Enable Other Storage (EOS) • Scope of EOS in the XDC & WLCG Datalake projects • enable storage caches • enable hybrid storage • distributed deployments and storage QOS for cost savings • What does this really mean?

  20. Dynamic Caches Adding clustered storage caches as a dynamic resource. Files can have replicas in static (EULAKE) and dynamic resources (CACHE-FOO). [Diagram: distributed EOS setup EULAKE with MGM, hierarchical FST structure, centralised MD store (QuarkDB) & DDM, centralised access control, IO with credential tunnelling; dynamic site cache resource CACHE-FOO (xCache / FSTs) with write-through and read-through data paths, temporary replica creation and deletion by geotag]

  21. Hybrid Distributed Storage • Attach external storage into the datalake • external storage does not have to be accessed via the data lake - it can be operated as is: better scalability • an external storage connector uses a notification listener to publish creations and deletions and applies QOS (replication) policies to distribute data in the lake. Planned connectors: Amazon S3, CEPH S3, mounted external storage / shared filesystem (with limitations), ExOS Object Storage (RADOS), XRootD / WebDAV+REST. Basic constraints: file = object, flat external namespace structure, write-once data (PUT semantics). [Diagram: MGM with hierarchical FST structure, centralised MD store (QuarkDB) & DDM, centralised access control, data storage FSTs, distributed object store with external data namespace]
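The following toy sketches the role of the notification listener described on slide 21; every name is hypothetical and the "feed" is stubbed, so this only shows the event-driven shape of a connector, not the EOS connector API.

```cpp
// Illustrative sketch only: the external store publishes create/delete
// events; the connector injects them into the lake namespace and applies
// the QOS replication policy so extra copies are placed inside the lake.
#include <iostream>
#include <string>

struct Event { enum Kind { Create, Delete } kind; std::string key; };

// Stub: in a real connector this would come from the object store's
// notification feed (e.g. S3 bucket notifications).
bool nextEvent(Event &ev)
{
  static int i = 0;
  if (i++) return false;
  ev = {Event::Create, "raw/run2018/file-000.root"};
  return true;
}

// Stubs: register/unregister the object in the lake namespace and trigger
// the configured QOS policy (e.g. create extra replicas on lake FSTs).
void registerInNamespace(const std::string &key) { std::cout << "register "  << key << "\n"; }
void removeFromNamespace(const std::string &key) { std::cout << "remove "    << key << "\n"; }
void applyQosReplication(const std::string &key) { std::cout << "replicate " << key << "\n"; }

int main()
{
  Event ev;
  while (nextEvent(ev)) {
    if (ev.kind == Event::Create) {
      registerInNamespace(ev.key);   // file = object, write-once (PUT) semantics
      applyQosReplication(ev.key);   // distribute copies inside the lake
    } else {
      removeFromNamespace(ev.key);
    }
  }
  return 0;
}
```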

  22. Hybrid Distributed Storage - Example: AWS Integration • transparent S3 backup on tapes • the client interacts with the AWS API [Diagram: FST notification listener feeds the MGM / FSTs; the QOS policy triggers CTA replication to the CERN Tape Archive]

  23. Hybrid Distributed Storage - Example: High Performance DAQ with Object Storage • libExOS is a lock-free minimal implementation to store data in RADOS object stores, optimised for erasure encoding • leverages CERN IT-ST experience as author of the RADOS striping library & Intel EC [Diagram: DAQ farm writes via libExOS into a RADOS EC data pool and a RADOS replicated MD pool; MGM / FSTs with notification listener; QOS policy triggers CTA replication to the CERN Tape Archive]
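libExOS itself is not shown in the slides, so as a stand-in the sketch below stores one object with the public librados C++ API to show the kind of object-store interaction involved; cluster id, configuration path, pool and object names are placeholders.

```cpp
// Minimal sketch with the public librados C++ API (not libExOS itself):
// connect to a Ceph cluster, open an IO context on a data pool and store
// one object.
#include <rados/librados.hpp>
#include <iostream>

int main()
{
  librados::Rados cluster;
  if (cluster.init("admin") < 0) return 1;              // client.admin
  cluster.conf_read_file("/etc/ceph/ceph.conf");        // placeholder path
  if (cluster.connect() < 0) { std::cerr << "connect failed\n"; return 1; }

  librados::IoCtx io;
  if (cluster.ioctx_create("eos-ec-data", io) < 0) {    // hypothetical pool
    std::cerr << "no such pool\n"; return 1;
  }

  librados::bufferlist bl;
  bl.append("event payload");                           // DAQ data stand-in
  if (io.write_full("run2018/object-000", bl) == 0)     // file = object
    std::cout << "object stored\n";

  io.close();
  cluster.shutdown();
  return 0;
}
```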

  24. QOS in EOS - how do we save on cost metrics? EOS provides a workflow engine and QOS transformations • event (put, delete) and time (file age, last access) triggered workflows • file layout transformations [replica <=> EC encoding*] [e.g. save 70%] • policies are expressed as extended attributes and express structure and geographical placement [skipping a lot of details] - used for CTA (* can do erasure encoding over WAN resources/centres)
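Slide 24 says policies are expressed as extended attributes. The sketch below shows only the mechanism: setting and reading an attribute on a directory of a FUSE-mounted /eos tree via the standard Linux xattr calls. The directory path, attribute name and value are hypothetical; the real policy attribute names used by EOS are not given on the slide.

```cpp
// Minimal sketch, assuming an eosxd mount at /eos; attribute name and value
// are invented for illustration of the xattr-based policy mechanism.
#include <sys/xattr.h>
#include <cstdio>
#include <cstring>

int main()
{
  const char *dir  = "/eos/demo/archive";     // placeholder directory
  const char *attr = "user.qos.layout";       // hypothetical attribute name
  const char *val  = "ec:4+2";                // e.g. erasure-coded 4+2 layout

  if (setxattr(dir, attr, val, std::strlen(val), 0) != 0)
    std::perror("setxattr");

  char buf[64] = {0};
  if (getxattr(dir, attr, buf, sizeof(buf) - 1) > 0)
    std::printf("%s = %s\n", attr, buf);
  return 0;
}
```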

  25. CERN Scientific Services Bundle We have bundled a demonstration setup of four CERN-developed cloud and analysis platform services, called UBoxed. It encapsulates four components • EOS - scalable storage platform with data, metadata and messaging server components • CERNBox - dropbox-like add-on for sync-and-share services on top of EOS • SWAN - service for web-based interactive analysis with a Jupyter notebook interface • CVMFS - CernVM file system, a scalable software distribution service. Try the dockerized demo setup on CentOS7 or Ubuntu: eos-docs.web.cern.ch/eos-docs/quickstart/uboxed.html

  26. CERN Scientific Services Bundle - web service interface after UBoxed installation. Try the dockerized demo setup on CentOS7 or Ubuntu: eos-docs.web.cern.ch/eos-docs/quickstart/uboxed.html

  27. CERN Scientific Services Bundle

  28. EOS as a filesystem: /eos. Background to /eos • a filesystem mount is a standard API supported by every application - not always the most efficient for physics analysis • a filesystem mount is a very delicate interface - any failure translates into application failures, job inefficiencies etc. • FUSE is a simple (not always) but not the most efficient way to implement a filesystem • implementing a filesystem is challenging in general; the currently deployed implementation has many POSIX problems • we implemented the 3rd generation of a FUSE-based client for EOS
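To make the "standard API" point of slide 28 concrete, the sketch below reads a file from a hypothetical eosxd mount through nothing but ordinary POSIX calls; the path is a placeholder.

```cpp
// Minimal sketch, assuming an eosxd mount at /eos and a placeholder file.
// Once EOS is mounted as a filesystem, any application can use it through
// the plain POSIX API, with no EOS-specific client library required.
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main()
{
  int fd = open("/eos/demo/notes.txt", O_RDONLY);   // ordinary POSIX open
  if (fd < 0) { std::perror("open"); return 1; }

  char buf[4096];
  ssize_t n = read(fd, buf, sizeof(buf));           // ordinary POSIX read
  if (n > 0)
    std::fwrite(buf, 1, static_cast<size_t>(n), stdout);

  close(fd);
  return 0;
}
```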

  29. EOS as a filesystem: /eos features • more POSIX - better performance - cross-client metadata/data consistency • strong security: krb5 & certificate authentication - OAuth2 under consideration • distributed byte-range locking - small-file caching • hard links (starting with version 4.2.19) • rich ACL support on the way
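One of the listed features, distributed byte-range locking, is exercised from the application side through the usual fcntl() interface; a minimal sketch with a placeholder path under /eos follows.

```cpp
// Minimal sketch, assuming an eosxd mount at /eos and a placeholder file.
// The lock is requested with standard fcntl() record locking; eosxd is
// responsible for making it consistent across clients.
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main()
{
  int fd = open("/eos/demo/shared.db", O_RDWR);
  if (fd < 0) { std::perror("open"); return 1; }

  struct flock fl = {};
  fl.l_type   = F_WRLCK;      // exclusive write lock
  fl.l_whence = SEEK_SET;
  fl.l_start  = 0;            // lock the first 4 KiB only
  fl.l_len    = 4096;

  if (fcntl(fd, F_SETLKW, &fl) == 0) {   // wait until the range is free
    std::puts("byte-range lock acquired");
    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);             // release
  } else {
    std::perror("fcntl");
  }
  close(fd);
  return 0;
}
```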

  30. eosxd FUSE filesystem daemon - example performance metrics: 1000x mkdir = 870/s, 1000x rmdir = 2800/s, 1000x touch = 310/s, untar (1000 dirs) = 1.8 s, untar (1000 files) = 2.8 s. [Chart: dd bs=1M read/write bandwidth for 1 GB and 4 GB files, sync and async, up to ~480 MB/s] [Architecture diagram: kernel - libfuse low-level API - eosxd (meta data CAP store, data queue, meta data backend, XrdCl::Proxy, XrdCl::Filesystem, XrdCl::File) communicating with MGM (FuseServer) and FST (xrootd)]

  31. eosxd FUSE filesystem daemon - example performance metrics • aim to take over some AFS use cases - related to the AFS phaseout project at CERN (long term) • provide at least the POSIX features of AFS. [Chart: untar linux source (65k files/directories), compile xrootd, compile eos - EOS vs AFS WORK, AFS HOME and LOCAL] Commissioned to production at CERN during Q2/2018.

  32. EOS Vision • evolve from a CERN Open Source to a Community Open Source project - outcome of the 2nd EOS workshop • leverage the power of community storage Open Source • embedded technologies (object storage & filesystem hybrids) • slim down storage customisation layers
