

  1. Hurricane – Master semester project, IC School, Operating Systems Laboratory • Author: Diego Antognini • Supervisors: Prof. Willy Zwaenepoel, Laurent Bindschaedler

  2. Outline • Motivation • Hurricane • Experiments • Future work • Conclusion

  3. Motivation

  4. Original goal of the project • Implement Chaos on top of HDFS! • How? Replace the storage engine with HDFS • Why? • Industry is interested in systems running on Hadoop • Easy cluster handling • Distributed file system • Fault tolerance (but at what price?)

  5. Chaos • Scale-out graph processing from secondary storage • Maximizes sequential access • Stripes data across secondary storage devices in a cluster • Limited only by the aggregate bandwidth and capacity of all storage devices in the entire cluster

  6. Hadoop Distributed File System [architecture diagram: a client, the namenode, and the datanodes]

  7. Experiment: DFSIO • Measure aggregate bandwidth on a cluster when writing and reading 100 GB of data split into X files (1 × 100 GB, 2 × 50 GB, …, 4096 × 25 MB) • Use the DFSIO benchmark • Each task operates on a distinct block • Measure disk I/O

  8. Clusters • DCO: Ubuntu 14.04.1, 16 cores, 128 GB memory, HDD 140 MB/s / SSD 243 MB/s, 10 Gbit/s network

  9. Results: DFSIO on the DCO cluster [chart: disk I/O when writing 100 GB of data, 8 nodes, no replication; aggregate bandwidth (MB/s) vs. number of files (1–4096), read and write curves against a dd/hdparm read/write baseline]

  10. Observations: DFSIO • Somewhat lackluster performance • Hard to tune! • HDFS doesn’t fit the requirements

  11. Our solution • Create a standalone distributed storage system based on the Chaos storage engine • Give it an HDFS-like RPC interface • This is the actual project!

  12. Hurricane

  13. Hurricane • Scalable, decentralized storage system based on Chaos • Balances I/O load randomly across the available disks • Saturates the available storage bandwidth • Targets rack-scale deployments
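
As a rough illustration of the "balance I/O load randomly" point above, here is a minimal sketch of a random placement policy for block writes; StorageNode, RandomPlacement, and pick() are hypothetical names used for illustration, not Hurricane's actual code.

```cpp
// Sketch (assumed names, not Hurricane's actual code): each block write is
// sent to a storage node chosen uniformly at random, so the I/O load spreads
// evenly across all disks and no single device becomes a hot spot.
#include <cstddef>
#include <iostream>
#include <random>
#include <vector>

struct StorageNode {
    int id;  // stand-in for the connection state to one storage server
};

class RandomPlacement {
public:
    explicit RandomPlacement(std::vector<StorageNode> nodes)
        : nodes_(std::move(nodes)), rng_(std::random_device{}()) {}

    // Pick the target node for the next block uniformly at random.
    StorageNode& pick() {
        std::uniform_int_distribution<std::size_t> dist(0, nodes_.size() - 1);
        return nodes_[dist(rng_)];
    }

private:
    std::vector<StorageNode> nodes_;
    std::mt19937 rng_;
};

int main() {
    RandomPlacement placement({{0}, {1}, {2}, {3}});
    for (int block = 0; block < 8; ++block)
        std::cout << "block " << block << " -> node "
                  << placement.pick().id << "\n";
}
```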

  14. Real-life scenario • Chaos using Hurricane

  15. Real-life scenario • Measuring the emotions of countries during Euro 2016 [diagram: country data flowing through the system to produce emotion results for Switzerland, Belgium, and Romania] • And much more!

  16. Locality does not matter! • Remote storage bandwidth = local storage bandwidth • Clients can read/write to any storage device • Storage is slower than the network, so the network is not a bottleneck • Realistic for most clusters at rack scale, or even larger
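
A quick back-of-the-envelope check with the DCO numbers from the cluster slide shows why the network is not the bottleneck:

```latex
\[
  \underbrace{10\ \mathrm{Gbit/s}}_{\text{network, per node}}
  = \tfrac{10}{8}\ \mathrm{GB/s}
  \approx 1250\ \mathrm{MB/s}
  \;\gg\;
  \underbrace{243\ \mathrm{MB/s}}_{\text{SSD}}
  \;>\;
  \underbrace{140\ \mathrm{MB/s}}_{\text{HDD}}
\]
```

Each node's network link can carry roughly five times what its SSD can deliver, so reading from a remote disk costs about the same as reading from a local one.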

  17. Maximizing I/O bandwidth • Clients pull data records from the servers [diagram: three clients pulling from three servers] • Requests are batched to prevent idle servers (prefetching)
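
The pull-plus-prefetching idea can be sketched as follows; Server, Batch, async_pull, and the prefetch depth are assumed names and parameters for illustration, not Hurricane's actual interface.

```cpp
// Sketch of the pull model with prefetching: the client keeps `depth`
// batched requests in flight per server so servers never sit idle waiting
// for the next request. All names here are illustrative assumptions.
#include <cstddef>
#include <deque>
#include <future>
#include <vector>

// One batch of pulled records.
struct Batch { std::vector<char> records; };

// Stand-in for a Hurricane storage server: async_pull() would issue an RPC;
// here it returns an empty batch so the sketch is self-contained.
struct Server {
    std::future<Batch> async_pull(std::size_t /*batch_size*/) {
        return std::async(std::launch::deferred, [] { return Batch{}; });
    }
};

void consume(const Batch&) { /* process the records locally */ }

void pull_loop(std::vector<Server>& servers,
               std::size_t batch_size, std::size_t depth) {
    std::deque<std::future<Batch>> in_flight;
    std::size_t next = 0;

    // Prime the pipeline: issue `depth` requests per server up front.
    for (std::size_t i = 0; i < depth * servers.size(); ++i)
        in_flight.push_back(
            servers[next++ % servers.size()].async_pull(batch_size));

    while (!in_flight.empty()) {
        Batch b = in_flight.front().get();  // oldest outstanding request
        in_flight.pop_front();
        if (b.records.empty()) continue;    // server drained, stop refilling
        consume(b);
        in_flight.push_back(                // refill so the server stays busy
            servers[next++ % servers.size()].async_pull(batch_size));
    }
}

int main() {
    std::vector<Server> servers(3);
    pull_loop(servers, /*batch_size=*/1 << 20, /*depth=*/4);
}
```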

  18. Features • Global file handling (global_*): create, exists, delete, fill, drain, rewind, etc. • Local file handling (local_*): create, exists, delete, fill, drain, rewind, etc. • Add storage nodes dynamically
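
Based only on the operation names listed on this slide, the client-facing interface might look roughly like the sketch below; the exact signatures, the reading of fill as "write" and drain as "read", and add_storage_node are assumptions, not the project's actual API.

```cpp
// Hypothetical interface sketch derived from the operation names on the
// slide (global_* vs. local_*). Signatures are assumptions.
#include <cstddef>
#include <string>

class HurricaneClient {
public:
    virtual ~HurricaneClient() = default;

    // Global files: visible cluster-wide, striped across all storage nodes.
    virtual bool global_create(const std::string& name) = 0;
    virtual bool global_exists(const std::string& name) = 0;
    virtual bool global_delete(const std::string& name) = 0;
    virtual std::size_t global_fill(const std::string& name,
                                    const void* buf, std::size_t len) = 0;  // write/append
    virtual std::size_t global_drain(const std::string& name,
                                     void* buf, std::size_t len) = 0;       // read
    virtual void global_rewind(const std::string& name) = 0;                // reset cursor

    // Local files: the same operations, restricted to one storage node.
    virtual bool local_create(const std::string& name) = 0;
    virtual bool local_exists(const std::string& name) = 0;
    virtual bool local_delete(const std::string& name) = 0;
    virtual std::size_t local_fill(const std::string& name,
                                   const void* buf, std::size_t len) = 0;
    virtual std::size_t local_drain(const std::string& name,
                                    void* buf, std::size_t len) = 0;
    virtual void local_rewind(const std::string& name) = 0;

    // Storage nodes can be added to the cluster at runtime.
    virtual void add_storage_node(const std::string& host, int port) = 0;
};
```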

  19. How does it work? – Writing files [diagram: clients C1, C2, C3 writing a file f striped across servers S1 and S2]

  20. How does it work? – Reading files [diagram: clients C1, C2, C3 reading file f from servers S1 and S2]

  21. How does it work? – Join [diagram: a new server S3 joins; clients C1, C2, C3 and servers S1, S2, S3 with files f and g]

  22. Experiments

  23. Clusters • LABOS: Ubuntu 14.04.1, 32 cores, 32 GB memory, HDD 474 MB/s, 1 Gbit/s network • DCO: Ubuntu 14.04.1, 16 cores, 128 GB memory, HDD 140 MB/s / SSD 243 MB/s, 10 Gbit/s network • TREX: Ubuntu 14.04.1, 32 cores, 128 GB memory, HDD 414 MB/s / SSD 464 MB/s, 40 Gbit/s network

  24. List of experiments • Weak scaling • Scalability with 1 client • Strong scaling • Case studies: unbounded buffer, compression

  25. Weak scaling • Each node writes/reads 16 GB of data • Increasing number of nodes: N servers, N clients • Measure average bandwidth • Compare the Chaos storage engine, Hurricane, and DFSIO

  26. 16 GB per node – 40 Gbit/s network [charts: TREX SSD read and write, average bandwidth (MB/s) vs. number of machines (1–16) for Chaos storage, Hurricane, and DFSIO, with a dd/hdparm baseline]

  27. 16 GB per node – 10 Gbit/s network [charts: DCO SSD read and write, average bandwidth (MB/s) vs. number of machines (1–8) for Chaos storage, Hurricane, and DFSIO, with a dd/hdparm baseline]

  28. 16 GB per node – 1 Gbit/s network [charts: LABOS read and write, average bandwidth (MB/s) vs. number of machines (1–8) for Chaos storage, Hurricane, and DFSIO, with a dd/hdparm baseline]

  29. Weak scaling – Summary • Hurricane performs similarly to the Chaos storage engine • Scalable • Outperforms HDFS by roughly 1.5x • Maximizes I/O bandwidth

  30. 16 GB per node – 64 nodes [charts: DCO SSD read and write, average bandwidth (MB/s) vs. number of machines (1–64) for Chaos storage and Hurricane, with a dd/hdparm baseline] • Still scalable, with good I/O bandwidth

  31. Scalability with 1 client • The client writes/reads 16 GB of data per server node • Increasing number of server nodes: N servers, 1 client • Measure aggregate bandwidth • Only Hurricane is used

  32. 40 Gbit/s network [charts: TREX SSD read and write, aggregate bandwidth (MB/s) vs. number of machines (1–16); baseline = actual bandwidth of the network] • An unknown network problem was observed

  33. 10 Gbit/s network [charts: DCO SSD read and write, aggregate bandwidth (MB/s) vs. number of machines (1–8), with baseline] • Hurricane also scales with only 1 client, using the I/O bandwidth of all the server nodes

  34. Strong scaling • Read/write 128 GB of data in total • Increasing number of nodes: N servers, N clients • Measure aggregate bandwidth • Compare the Chaos storage engine, Hurricane, and DFSIO

  35. 40 Gbit/s network [charts: TREX SSD read and write, aggregate bandwidth (MB/s) vs. number of machines (1–16) for Chaos storage, Hurricane, and DFSIO, with baseline]

  36. 1 Gbit/s network [charts: LABOS read and write, aggregate bandwidth (MB/s) vs. number of machines (1–8) for Chaos storage, Hurricane, and DFSIO, with baseline]
