Dynamic Federations
Seamless aggregation of standard-protocol-based storage endpoints

Fabrizio Furano, Patrick Fuhrmann, Paul Millar, Daniel Becker, Adrien Devresse, Oliver Keeble, Ricardo Brito da Rocha, Alejandro Alvarez
Credits to ShuTing Liao (ASGC)
WLCG Computing Model
[Diagram: worker nodes running jobs, getting application software via CernVM-FS and data from the storage infrastructure]
Storage Federations: Motivations
• Currently data lives on islands of storage
  – catalogues are the maps
  – FTS/gridFTP are the delivery companies
  – experiment frameworks populate the islands
• Jobs are directed to places where the needed data is
  – or should be...
• Almost all data lives on more than one island
• This model assumes:
  – perfect storage (unlikely to impossible)
  – perfect experiment workflows and catalogues (unlikely)
• Strict locality has some limitations
  – a single missing file can derail a whole job, or a series of jobs
  – failover to data on another island could help
• Replica catalogues impose limitations too
  – e.g. synchronization is difficult, and so is performance
• Quest for direct, Web-like forms of data access
• Great plus: other use cases may be fulfilled, e.g. site caching, sharing storage amongst sites
Storage federations
• What's the goal?
  – Make different storage clusters be seen as one
  – Make global file-based data access seamless
• How should this be done?
  – Dynamically
    • easy to set up and maintain
    • no complex metadata persistency
    • no DB babysitting (keep it for the experiment's metadata)
    • no replica catalogue inconsistencies, by design
  – Light config constraints on participating storage
  – Using standards
    • no strange APIs, everything looks familiar
• Global direct access to global data
The basic idea
[Diagram] Two storage/metadata endpoints are aggregated into the single name space the client sees:
  Storage/MD endpoint 1 holds: /dir1/file1, /dir1/file2
  Storage/MD endpoint 2 holds: /dir1/file2, /dir1/file3
  Aggregated view: /dir1 containing file1, file2 (with 2 hidden replicas), file3
All the metadata interactions are hidden. NO persistency is needed here, just efficiency and parallelism. A toy version of this merge is sketched below.
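A minimal sketch of that merge in C++ (endpoint names and contents are illustrative, not the demo setup): the listings reported by each endpoint are folded into one in-memory view that records which endpoints hold each entry, with nothing persisted anywhere.

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

int main() {
    // What each storage/metadata endpoint reports for /dir1.
    std::map<std::string, std::vector<std::string>> endpoints = {
        {"endpoint1", {"/dir1/file1", "/dir1/file2"}},
        {"endpoint2", {"/dir1/file2", "/dir1/file3"}},
    };

    // Aggregated view: path -> endpoints that have a replica.
    // Purely in memory; nothing is persisted.
    std::map<std::string, std::vector<std::string>> view;
    for (const auto& [ep, listing] : endpoints)
        for (const auto& path : listing)
            view[path].push_back(ep);

    for (const auto& [path, where] : view)
        std::cout << path << "  (" << where.size() << " replica(s))\n";
}
```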
Dynamic HTTP Federations
• Federation
  – Simplicity, redundancy, storage/network efficiency, elasticity, performance
  – Dynamic: does everything on the fly, no DB
• Focus on HTTP/DAV
  – Standard clients everywhere
  – One protocol for everything (WAN/LAN)
  – Transparent redirection
• Use cases
  – Easy, direct job/user data access, WAN friendly
  – Access to missing files after a job starts
  – Friend sites can share storage
  – Cache integration (future)
What is federated?
• We federate (meta)data repositories that are 'compatible':
  – HTTP interface
  – Name space (modulo simple prefixes)
    • including catalogues
  – Permissions (they don't contradict across sites)
  – Content (same key or filename means same file [modulo translations])
• Dynamically and transparently discovering metadata
  – looks like a unique, very fast file metadata system
  – properly presents the aggregated metadata views
  – redirects clients to the geographically closest endpoint
    • the local SE is preferred
    • the system can also load a "Geo" plugin, as sketched below
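A hedged sketch of what such a "Geo" selection step might do: rank the endpoints holding a replica by rough proximity to the client and redirect to the nearest. The URLs, coordinates, and distance metric are all illustrative assumptions, not the plugin's actual logic.

```cpp
#include <algorithm>
#include <cmath>
#include <iostream>
#include <string>
#include <vector>

struct Endpoint { std::string url; double lat, lon; };

const double kPi = 3.14159265358979;

// Squared equirectangular distance: crude, but enough to rank
// endpoints by rough proximity to the client.
double dist2(double lat1, double lon1, double lat2, double lon2) {
    double x = (lon2 - lon1) * std::cos((lat1 + lat2) * kPi / 360.0);
    double y = lat2 - lat1;
    return x * x + y * y;
}

int main() {
    // Illustrative replica locations, loosely matching the demo sites.
    std::vector<Endpoint> replicas = {
        {"https://se.cern.example/dir1/file2", 46.2, 6.1},    // CERN-like
        {"https://se.desy.example/dir1/file2", 53.6, 9.9},    // DESY-like
        {"https://se.asgc.example/dir1/file2", 25.0, 121.5},  // ASGC-like
    };
    double clientLat = 48.8, clientLon = 2.3;  // a client near Paris

    auto best = std::min_element(replicas.begin(), replicas.end(),
        [&](const Endpoint& a, const Endpoint& b) {
            return dist2(clientLat, clientLon, a.lat, a.lon)
                 < dist2(clientLat, clientLon, b.lat, b.lon);
        });
    std::cout << "Redirect the client to: " << best->url << "\n";
}
```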
What is federated?
• Technically TODAY we can aggregate:
  – SEs with DAV/HTTP interfaces
    • dCache, DPM
    • Future: Xrootd? EOS? StoRM?
  – Catalogues with DAV/HTTP interfaces
    • LFC supported
    • Future: experiment catalogues could be integrated
  – Cloud DAV/HTTP/S3 services
  – Anything else that happens to have an HTTP interface…
  – Caches
  – Native LFC and DPM databases
Why HTTP/DAV?
• It's everywhere
  – a very widely adopted technology
• It has the right features
  – redirection, WAN friendly
• Convergence
  – transfers and data access
  – no other protocols required
• We (humans) like browsers; they give an experience of simplicity
  – open to direct access and integrated web apps
DPM/HTTP
• DPM has invested significantly in HTTP as part of the EMI project
  – New HTTP/DAV interface
  – Parallel WAN transfers
  – Third-party copy
  – Solutions for replica fallback
    • "Global access" and Metalink
  – Performance evaluations
    • experiment analyses
    • HammerCloud
    • synthetic tests
    • ROOT tests
Demo
• We have set up a stable demo testbed, using HTTP/DAV
  – head node at DESY: http://federation.desy.de/myfed/
  – a DPM instance at CERN
  – a DPM instance at ASGC (Taiwan)
  – a dCache instance at DESY
  – a cloud storage account by Deutsche Telekom
• The feeling it gives is surprising
  – metadata performance is on average higher than contacting the endpoints directly
• We see the directories as merged, as if they were only one system
• There's one test file in 3 sites, i.e. 3 replicas:
  – /myfed/atlas/fabrizio/hand-shake.JPG
  – clients in Europe get the one from DESY/DT/CERN
  – clients in Asia get the one from ASGC
• There's a directory whose content is interleaved between CERN and DESY
  – http://federation.desy.de/myfed/dteam/ugrtest/interleaved/
• There's a directory where all the files are in two places
  – http://federation.desy.de/myfed/dteam/ugrtest/all/
A standard HTTP client is all that is needed, as the sketch below shows.
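Because the interface is plain HTTP, any stock client works against the federation. A minimal libcurl sketch fetching the demo file; CURLOPT_FOLLOWLOCATION makes it follow the frontend's redirect to whichever replica is chosen (build with `g++ fetch.cpp -lcurl`).

```cpp
#include <curl/curl.h>
#include <cstdio>

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* h = curl_easy_init();
    if (!h) return 1;
    curl_easy_setopt(h, CURLOPT_URL,
        "http://federation.desy.de/myfed/atlas/fabrizio/hand-shake.JPG");
    curl_easy_setopt(h, CURLOPT_FOLLOWLOCATION, 1L);  // follow the redirect
    // Note: the response body goes to stdout by default.
    CURLcode rc = curl_easy_perform(h);
    if (rc != CURLE_OK) {
        std::fprintf(stderr, "failed: %s\n", curl_easy_strerror(rc));
    } else {
        char* effective = nullptr;  // the replica URL we ended up at
        curl_easy_getinfo(h, CURLINFO_EFFECTIVE_URL, &effective);
        std::fprintf(stderr, "served from: %s\n", effective ? effective : "?");
    }
    curl_easy_cleanup(h);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}
```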
Example
[Diagram] A client contacts the frontend (Apache2 + DMLite), which drives the aggregator (UGR). UGR loads plugins: a DMLite plugin (towards an LFC or DB and its SEs), an LFC plugin, and DAV/HTTP plugins towards SEs speaking plain DAV/HTTP. The plugin shape is sketched below.
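To illustrate the structure only (the real UGR plugin API differs; this interface is hypothetical), each backend can be modelled as a "location" plugin behind one common C++ interface:

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical result of a remote stat; fields are illustrative.
struct StatInfo {
    long long size = -1;
    bool isDir = false;
    bool found = false;
};

// Hypothetical plugin interface: every backend (a DAV/HTTP SE, an LFC,
// a native DMLite DB) answers the same calls.
class LocationPlugin {
public:
    virtual ~LocationPlugin() = default;
    virtual StatInfo stat(const std::string& path) = 0;
    virtual std::vector<std::string> list(const std::string& dir) = 0;
};

// The aggregator core would hold one instance per federated endpoint
// and fan each client request out to all of them in parallel:
//   std::vector<std::unique_ptr<LocationPlugin>> endpoints;
```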
Design and performance
• Full parallelism
  – composes the aggregated metadata views on the fly by managing parallel information-location tasks
    • never stacks up latencies!
  – the endpoints are treated in a completely independent way
    • no global locks or serialisations!
  – thread pools and producer/consumer queues are used extensively (e.g. to stat N items in M endpoints while X clients wait for some items); see the sketch below
• Aggressive metadata caching
  – the metadata caching keeps the performance high
    • peak raw cache performance is ~500K to 1M hits/s per core
  – a relaxed, hash-based, in-memory partial name space
  – juggles info in order to always contain what's needed
    • kept in an LRU fashion, giving a fast first-level namespace cache
  – stalls clients for the minimum time necessary to juggle their information bits
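A minimal sketch of the "never stacks up latencies" point: stat one item on all endpoints concurrently with std::async and merge results as they arrive, so wall time tracks the slowest endpoint rather than the sum of all of them. Here queryEndpoint is a stand-in for a real remote call, and the endpoint names are made up.

```cpp
#include <chrono>
#include <future>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

struct Result { std::string endpoint; bool found; };

Result queryEndpoint(const std::string& ep, const std::string& path) {
    // Simulated network latency; a real plugin would issue an
    // HTTP HEAD/PROPFIND against the endpoint here.
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    return {ep, true};
}

int main() {
    std::vector<std::string> endpoints =
        {"se1.example.org", "se2.example.org", "se3.example.org"};
    std::string path = "/dir1/file2";

    // Fire all queries at once instead of one after another.
    std::vector<std::future<Result>> pending;
    for (const auto& ep : endpoints)
        pending.push_back(std::async(std::launch::async, queryEndpoint, ep, path));

    // Total wall time ~ the slowest endpoint, not the sum of all of them.
    for (auto& f : pending) {
        Result r = f.get();
        if (r.found) std::cout << path << " found at " << r.endpoint << "\n";
    }
}
```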
Server architecture
Clients come in and are distributed across:
• different machines (DNS alias)
• different processes (Apache config)
Clients are served by the UGR. They can browse/stat or be redirected for action. The architecture is multi/manycore friendly and uses a fast parallel caching scheme.
Name translation
• A sophisticated scheme of name translation is key to being able to federate almost any source of metadata
  – UGR implements algorithmic translations and can accommodate non-algorithmic ones as well (see the sketch below)
  – a plugin could also query an external service (e.g. an LFC or a private DB)
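A sketch of the simplest algorithmic case: a prefix translation between the federation name space and an endpoint-local one. The prefixes here are illustrative, not UGR's actual configuration.

```cpp
#include <iostream>
#include <optional>
#include <string>

// Map a federation path to an endpoint-local path by swapping prefixes;
// returns nothing if the path is outside the federated prefix.
std::optional<std::string> translate(const std::string& path,
                                     const std::string& fedPrefix,
                                     const std::string& localPrefix) {
    if (path.rfind(fedPrefix, 0) != 0) return std::nullopt;
    return localPrefix + path.substr(fedPrefix.size());
}

int main() {
    auto local = translate("/myfed/atlas/fabrizio/hand-shake.JPG",
                           "/myfed/atlas", "/dpm/cern.example/home/atlas");
    if (local) std::cout << *local << "\n";
    // Non-algorithmic cases would instead be answered by a plugin
    // querying an external service such as an LFC.
}
```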
Design and performance
• Horizontally scalable deployment
  – multithreaded
  – DNS-balanceable
• High-performance DAV client implementation
  – wraps DAV calls into a POSIX-like API, saving users the difficulty of composing requests and parsing responses (see the sketch below)
  – performance is privileged: uses libneon with session caching
  – compound list/stat operations are supported
  – loaded by the core as a "location" plugin
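The production client is built on libneon; this hedged sketch shows the same wrapping idea with libcurl instead, which is not what UGR uses: an HTTP HEAD folded into a POSIX-stat-like call (build with `g++ davstat.cpp -lcurl`).

```cpp
#include <curl/curl.h>
#include <cstdio>

// POSIX-stat-like result filled from HTTP response metadata.
struct DavStat { bool exists = false; double size = -1; };

DavStat dav_stat(const char* url) {
    DavStat st;
    CURL* h = curl_easy_init();
    if (!h) return st;
    curl_easy_setopt(h, CURLOPT_URL, url);
    curl_easy_setopt(h, CURLOPT_NOBODY, 1L);          // HEAD request only
    curl_easy_setopt(h, CURLOPT_FOLLOWLOCATION, 1L);  // chase redirects
    if (curl_easy_perform(h) == CURLE_OK) {
        long code = 0;
        curl_easy_getinfo(h, CURLINFO_RESPONSE_CODE, &code);
        st.exists = (code >= 200 && code < 300);
        curl_easy_getinfo(h, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &st.size);
    }
    curl_easy_cleanup(h);
    return st;
}

int main() {
    DavStat st = dav_stat("http://federation.desy.de/myfed/dteam/ugrtest/all/");
    std::printf("exists=%d size=%.0f\n", st.exists, st.size);
}
```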
A performance test
• Two endpoints: DESY and CERN (a poor VM)
• One UGR frontend at DESY
• Swarm of test clients at CERN
• 10K files in a 4-level-deep directory tree
  – files exist on both endpoints
• The test (written in C++) stats each file only once, using many parallel clients doing stat() at maximum pace from 3 machines; a driver of this kind is sketched below
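A hedged sketch of that kind of load driver, not the actual test code: worker threads pull paths from a shared counter and stat each file exactly once overall. The remote call is stubbed out and the paths are synthesized rather than the testbed's real tree.

```cpp
#include <atomic>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

int main() {
    // Synthesized stand-ins for the 10K files of the real test.
    std::vector<std::string> paths;
    for (int i = 0; i < 10000; ++i)
        paths.push_back("/myfed/test/d" + std::to_string(i % 10) +
                        "/f" + std::to_string(i));

    // Shared counter: each path is claimed (and thus statted) once.
    std::atomic<size_t> next{0};
    auto worker = [&] {
        for (size_t i; (i = next.fetch_add(1)) < paths.size(); ) {
            // Stand-in for a remote stat over HTTP/DAV against the frontend.
            volatile size_t touch = paths[i].size();
            (void)touch;
        }
    };

    std::vector<std::thread> pool;
    for (int t = 0; t < 64; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    std::cout << "statted " << paths.size() << " paths once each\n";
}
```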
The result, WAN access [performance plot]
Another test, LAN, cache impact [performance plot]
Another test, LAN, access patterns [performance plot]
Get started
• Get it here: https://svnweb.cern.ch/trac/lcgdm/wiki/Dynafeds
• What you can do with it:
  – easy, direct job/user data access, WAN friendly
  – access to missing files after a job starts
  – friend sites can share storage
  – diskless sites
  – federating catalogues
    • combining catalogue-based and catalogue-free data
Next steps
• Release our beta, as the nightlies are good
• More massive tests, with many endpoints, possibly distant ones
  – we are now looking for partners
• Precise performance measurements
• Refine the handling of the 'death' of endpoints
• Immediate sensing of changes in the endpoints' content, e.g. add, delete
  – SEMsg in EMI2 SYNCAT would be the right thing in the right place
• Some more practical experience (getting used to the idea, using Squids, CVMFS, EOS, clouds, ... <put your item here>)