
Data Location Optimization for a Self-Organized Storage System



  1. Data Location Optimization for a Self-Organized Storage System. Hannes Mühleisen, Tilman Walther and Robert Tolksdorf

  2. [A. Bockoven]

  3. [Thomas Schmickl]

  4. Brood Sorting - Algorithm

         item = null;
         while (true)
           if (item != null)
             if (similarity(item, nearbyItems()) > α)
               drop(item)
               item = null
           else
             item = min(similarity(nearbyItems()²))
             pickup(item)
           move()
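The loop on the slide can be sketched as a small simulation. This is only an illustration under stated assumptions, not the authors' implementation: items are floats in [0, 1], the world is a ring of cells, "nearby items" means the items in the ant's current cell, and `similarity`, `brood_sort`, `alpha`, `steps` and `seed` are hypothetical names chosen for this sketch.

```python
import random


def similarity(item, others):
    """Mean closeness of `item` to `others`, in [0, 1]; 0.0 if there are none."""
    if not others:
        return 0.0
    return sum(1.0 - abs(item - o) for o in others) / len(others)


def brood_sort(cells, alpha=0.8, steps=10_000, seed=0):
    """Run one ant over a ring of cells; returns (cells, carried_item)."""
    rng = random.Random(seed)
    cells = [list(c) for c in cells]  # work on a copy of the input
    pos, item = 0, None
    for _ in range(steps):
        here = cells[pos]
        if item is not None:
            # Carrying: drop only where the item fits its surroundings.
            if similarity(item, here) > alpha:
                here.append(item)
                item = None
        elif here:
            # Not carrying: pick up the item least similar to its neighbours.
            idx = min(
                range(len(here)),
                key=lambda i: similarity(here[i], here[:i] + here[i + 1:]),
            )
            item = here.pop(idx)
        # Random walk: move one cell left or right on the ring.
        pos = (pos + rng.choice((-1, 1))) % len(cells)
    return cells, item
```

Because an empty cell has similarity 0.0 in this sketch, a carried item is only ever dropped next to existing items, which is what drives clustering; the threshold `alpha` controls how strict a match must be.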

  5. Probabilistic Request Routing [Lindgren03]. [Figure: a request for bucket #B ("#B?") in a network of nodes S1 to S6; links are labeled with probabilities ranging from 10% to 95%.]
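A minimal sketch of how a request might follow such probability labels, assuming each node keeps a per-bucket table of neighbour probabilities and forwards greedily to the most promising unvisited neighbour. The topology, the table layout and the names `route`, `tables`, `stores` and `max_hops` are hypothetical; the slide's actual graph cannot be recovered from the transcript, and the routing scheme in [Lindgren03] may differ in detail.

```python
def route(tables, stores, start, key, max_hops=10):
    """Forward a request for `key` from `start`; returns (path, found)."""
    path = [start]
    visited = {start}
    node = start
    while key not in stores.get(node, set()):
        if len(path) - 1 >= max_hops:
            return path, False  # hop budget exhausted
        # Probabilities this node has learned for `key`, per neighbour.
        candidates = {
            n: p
            for n, p in tables.get(node, {}).get(key, {}).items()
            if n not in visited
        }
        if not candidates:
            return path, False  # dead end: nobody left to ask
        node = max(candidates, key=candidates.get)
        visited.add(node)
        path.append(node)
    return path, True


# Hypothetical example network (not the one drawn on the slide).
tables = {
    "S1": {"#B": {"S2": 0.50, "S4": 0.25}},
    "S2": {"#B": {"S3": 0.70}},
    "S3": {"#B": {"S5": 0.95}},
}
stores = {"S5": {"#B"}}
path, found = route(tables, stores, "S1", "#B")
```

Greedy forwarding is the simplest choice; sampling the next hop in proportion to the probabilities would be an equally plausible variant.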

  6. Research Question: Can brood sorting improve data placement in a large-scale distributed storage system based on probabilistic routing?

  7. Some Adaptations
     • Data is clustered into a limited number of "buckets"
     • Movement is split into two phases:
       • Search phase: every node periodically generates a "profile" of its locally stored data and sends it on its way
       • Response phase: nodes compare incoming profiles to their locally stored data, generating movement responses
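The two phases can be sketched as follows. The slide does not say what a profile contains or how a response is decided, so this sketch assumes a profile is a per-bucket item count and that a receiving node answers for every bucket of which it holds more items than the sender. The names `make_profile` and `movement_responses` and the `(bucket, payload)` item shape are hypothetical.

```python
from collections import Counter


def make_profile(node_id, items):
    """Search phase: summarise local data as item counts per bucket."""
    return {"origin": node_id, "counts": Counter(bucket for bucket, _ in items)}


def movement_responses(profile, local_items):
    """Response phase: buckets the sender should move to this node.

    Assumption: a bucket is requested when this node already holds more
    items of it than the profile's origin does.
    """
    local = Counter(bucket for bucket, _ in local_items)
    return sorted(b for b, n in profile["counts"].items() if local[b] > n)


# Items are (bucket, payload) pairs.
node_a = [("x", 1), ("y", 2), ("y", 3)]
node_b = [("x", 4), ("x", 5), ("x", 6), ("y", 7)]

profile = make_profile("A", node_a)
responses = movement_responses(profile, node_b)
```

Here node B holds three items of bucket `x` against node A's one, so it answers for `x` only; bucket `y` stays with A, which holds the larger share.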

  8. [Figure: step (1), nodes 1, 2 and 3 with a profile]

  9. [Figure: step (2), nodes 1, 2 and 3]

  10. [Figure: step (3), nodes 1, 2 and 3. ✓ Clean!]

  11. Evaluation
      • Cluster of 100 Linux nodes
      • Two datasets, random & synthetic
      • 1000 write operations, four phases
      • Recorded data:
        • Number of data items in the network
        • Number of successful movement operations
        • Bucket amount & size

  12. Data Items vs. Move Operations, synthetic dataset, 100 nodes. [Plot: data items and move operations per sample.]

  13. Bucket Amount vs. Average Size, synthetic dataset, 100 nodes. [Plot: total bucket amount and average bucket size per sample.]

  14. Data Items vs. Move Operations, random dataset, 100 nodes. [Plot: data items and move operations per sample.]

  15. Bucket Amount vs. Average Size, random dataset, 100 nodes. [Plot: total bucket amount and average bucket size per sample.]

  16. Conclusion: Brood sorting works!* (* YMMV)

  17. Thank You! Questions? Web Page: http://hannes.muehleisen.org
