
R.A.I.D.F.S: Randomized Aggregation Independent Distributed File System



  1. R.A.I.D.F.S: Randomized Aggregation Independent Distributed File System
     P2P distributed file system with an API for Map-Reduce integration
     Sven Reber, Jérémy Gotteland, David Froelicher, Alban Marguet, Pascal Cudré, Valérian Pittet

  2. Context
     ● Clusters are hard to configure and expensive to maintain
     ● Everyone has a computer
     ● End-user machines have lots of unused storage and computational resources
     ● Network connections are improving

  3. Goals
     A peer-to-peer DFS that is:
     ● designed to support Map-Reduce operations
       ○ chunking by line blocks
       ○ text files
     ● resilient
     ● easy to configure (dynamic configuration)
       ○ simply connect to the network and run your jobs
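The "chunking by line blocks" goal can be illustrated with a minimal sketch. The function name `chunk_by_line_blocks` and the `lines_per_chunk` parameter are hypothetical; the slides do not say how large a line block is or how the project implements the split. The point is only that a text file is cut at line boundaries, so a map function never sees a partial line.

```python
from typing import Iterable, Iterator, List


def chunk_by_line_blocks(lines: Iterable[str], lines_per_chunk: int) -> Iterator[List[str]]:
    """Group lines into consecutive blocks so that no line is split across chunks.

    `lines_per_chunk` is an assumed tuning knob, not a value taken from the slides.
    """
    if lines_per_chunk <= 0:
        raise ValueError("lines_per_chunk must be positive")
    block: List[str] = []
    for line in lines:
        block.append(line)
        if len(block) == lines_per_chunk:
            yield block
            block = []
    if block:
        # The last chunk may hold fewer lines than the others.
        yield block


text = "a b\nb c\nc d\nd e\ne f\n"
chunks = list(chunk_by_line_blocks(text.splitlines(), lines_per_chunk=2))
```

Here `chunks` holds three blocks: two full blocks of two lines and a final block with the remaining line.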

  4. Architecture

  5. DFS - Stabilization: a GlobalChunkField (GCF) value <= 3 (an arbitrary threshold) is an unstable state

  6. DFS - Stabilization: a peer looks at its neighbors' chunkfields

  7. DFS - Stabilization: the peer randomly picks one of the insufficiently replicated chunks

  8. DFS - Stabilization: the peer does not download a chunk if it finds enough replicas

  9. DFS - Stabilization: a file is “stable” when there are enough replicas
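The stabilization steps on slides 5 to 9 can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: a chunkfield is modeled as the set of chunk indices a peer holds, the GlobalChunkField as a per-chunk replica count over the peer and its neighbors, and all function names are hypothetical. Only the threshold of 3 and the random choice among under-replicated chunks come from the slides.

```python
import random
from typing import List, Optional, Set

# Per the slides, a count <= 3 is unstable; the value itself is arbitrary.
REPLICATION_THRESHOLD = 3


def global_chunk_field(own: Set[int], neighbors: List[Set[int]], n_chunks: int) -> List[int]:
    """Count, for each chunk index, how many known peers hold a copy (assumed model)."""
    counts = [0] * n_chunks
    for held in [own, *neighbors]:
        for index in held:
            counts[index] += 1
    return counts


def is_stable(counts: List[int]) -> bool:
    """A file is stable once every chunk has more replicas than the threshold."""
    return all(count > REPLICATION_THRESHOLD for count in counts)


def pick_chunk_to_fetch(own: Set[int], neighbors: List[Set[int]], n_chunks: int,
                        rng=random) -> Optional[int]:
    """Randomly pick one under-replicated chunk this peer lacks, or None."""
    counts = global_chunk_field(own, neighbors, n_chunks)
    candidates = [
        index for index, count in enumerate(counts)
        if 0 < count <= REPLICATION_THRESHOLD and index not in own
    ]
    if not candidates:
        # Enough replicas everywhere (or nothing to fetch): do not download.
        return None
    return rng.choice(candidates)


neighbors = [{0, 1}, {0, 1}, {0, 1}, {0, 1, 2}]
counts = global_chunk_field({0}, neighbors, n_chunks=3)
choice = pick_chunk_to_fetch({0}, neighbors, n_chunks=3)
```

In this example chunk 2 has a single replica, so it is the only candidate, and the peer skips chunk 1 because it already has four copies. The random choice presumably spreads download work across peers that observe the same unstable file, which is why the sketch does not pick the least-replicated chunk deterministically.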

  10. DFS - put: a new file is added with the “put” command

  11. DFS - put: the peer publishes an index update; neighbors discover it when they check, every 20 s

  12. DFS - put: neighbors try to stabilize the file (same process as before)

  13. DFS - put: neighbors randomly fetch missing chunks to complete their GCF

  14. DFS - other commands
      Available commands:
      ● ls
      ● put
      ● get
      ● rm

  15. Map operation
      ● A peer starts a job
      ● MapFiles (jobid, Resource, Initiator, MapFunction)
        ○ Each chunk is mapped to its result files (which can be created in advance): one folder for each mapped chunk
        ○ One key chunk for each key discovered in the original chunk
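A minimal sketch of the map step described above, assuming a word-count job as the example `MapFunction`. The names `map_chunk`, `map_file`, and `word_count_map` are hypothetical, and in-memory dictionaries stand in for the per-chunk result folders and key chunks that the real system would store on the DFS.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed shape of a map function: one input line yields (key, value) pairs.
MapFunction = Callable[[str], Iterable[Tuple[str, int]]]


def word_count_map(line: str) -> Iterable[Tuple[str, int]]:
    """Example map function (an assumption): emit (word, 1) for every word."""
    for word in line.split():
        yield word, 1


def map_chunk(chunk_lines: List[str], map_function: MapFunction) -> Dict[str, List[int]]:
    """Map one chunk; the result has one 'key chunk' per key found in it."""
    key_chunks: Dict[str, List[int]] = defaultdict(list)
    for line in chunk_lines:
        for key, value in map_function(line):
            key_chunks[key].append(value)
    return dict(key_chunks)


def map_file(chunks: List[List[str]], map_function: MapFunction) -> Dict[int, Dict[str, List[int]]]:
    """One result 'folder' per mapped chunk, indexed by chunk number."""
    return {index: map_chunk(lines, map_function) for index, lines in enumerate(chunks)}


mapped = map_file([["a b", "b c"], ["c d"]], word_count_map)
```

Each entry of `mapped` depends only on its own chunk, so different peers could map different chunks without coordinating.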

  16. MapFile

  17. Reduce operation
      ● Keys are discovered during the map
      ● Keys are sent to the initiator

  18. ReduceFile
      ● The initiator prepares the ReduceFile on the DFS

  19. ReduceFile
      ● A peer that wants to create a ReduceFile chunk downloads the needed keyChunks

  20. ReduceFile
      ● The initiator knows that a reduce is finished when the ReduceFile is stable on the DFS
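The reduce flow on slides 17 to 20 can be sketched in the same spirit. Everything here is an assumption made for illustration: map results are modeled as a dictionary from chunk index to key chunks, "downloading the needed keyChunks" becomes a lookup across those dictionaries, and the function names are hypothetical. The slides list implementing the Map and Reduce files as future work, so no real API is implied.

```python
from typing import Callable, Dict, List

# Assumed layout: chunk index -> key -> values emitted for that key in that chunk.
MappedFile = Dict[int, Dict[str, List[int]]]


def discovered_keys(mapped: MappedFile) -> List[str]:
    """Keys found during the map phase; in the slides these go to the initiator."""
    return sorted({key for key_chunks in mapped.values() for key in key_chunks})


def collect_key_chunks(mapped: MappedFile, key: str) -> List[int]:
    """Stand-in for downloading every key chunk needed for one key."""
    values: List[int] = []
    for chunk_index in sorted(mapped):
        values.extend(mapped[chunk_index].get(key, []))
    return values


def build_reduce_file(mapped: MappedFile, keys: List[str],
                      reduce_function: Callable[[List[int]], int]) -> Dict[str, int]:
    """Produce one reduced value per key, as entries of an assumed ReduceFile."""
    return {key: reduce_function(collect_key_chunks(mapped, key)) for key in keys}


mapped_example: MappedFile = {
    0: {"a": [1], "b": [1, 1]},
    1: {"b": [1], "c": [1]},
}
keys = discovered_keys(mapped_example)
reduce_file = build_reduce_file(mapped_example, keys, sum)
```

With `sum` as the reduce function this yields word counts per key. In the design described by the slides, the initiator would not call such a function directly; it would wait until the ReduceFile is stable on the DFS.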

  21. What’s Next
      ● Large-scale and stress tests of the DFS
      ● Implement the Map and Reduce files
      ● Include multi-master management (results from the MRp2p paper)
