Hurricane
Master semester project – IC School Operating Systems Laboratory
Author: Diego Antognini
Supervisors: Prof. Willy Zwaenepoel, Laurent Bindschaedler
Outline
• Motivation
• Hurricane
• Experiments
• Future work
• Conclusion
Motivation
Original goal of the project
• Implement Chaos on top of HDFS!
• How? Replace the storage engine with HDFS
• Why? Industry is interested in systems running on Hadoop
• Easy cluster handling
• Distributed file systems
• Fault tolerance (but at what price?)
Introduction – Hurricane – Experiments – Future work – Conclusion
Chaos
• Scale-out graph processing from secondary storage
• Maximizes sequential access
• Stripes data across secondary storage devices in a cluster
• Limited only by the aggregate bandwidth and capacity of all storage devices in the entire cluster
Hadoop Distributed File System
[Architecture diagram: a Client, the Namenode, and the Datanodes]
Experiment: DFSIO
• Measure aggregate bandwidth on a cluster when writing & reading 100 GB of data in X files:

# Files | Size
1       | 100 GB
2       | 50 GB
…       | …
4096    | 25 MB

• Use the DFSIO benchmark
• Each task operates on a distinct block
• Measure disk I/O
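The splits in the table keep the total volume constant at 100 GB; a quick sketch of the arithmetic (assuming 1 GB = 1024 MB):

```python
# Each DFSIO run writes/reads 100 GB in total, split into a varying
# number of equally sized files (sizes below in MB).
TOTAL_MB = 100 * 1024

for n_files in (1, 2, 4096):
    print(f"{n_files} files of {TOTAL_MB // n_files} MB each")

# 4096 files of 25 MB each still sum to 100 GB.
```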
Clusters

        | DCO
OS      | Ubuntu 14.04.1
# Cores | 16
Memory  | 128 GB
Storage | HDD: 140 MB/s, SSD: 243 MB/s
Network | 10 Gbit/s
Results DFSIO – DCO cluster
Disk I/O while writing 100 GB of data, 8 nodes, no replication.
[Chart: aggregate bandwidth (MB/s) vs. number of files (1–4096), read and write, against a dd/hdparm baseline]
Observations: DFSIO
• Somewhat lackluster performance
• Hard to tune!
HDFS doesn’t fit the requirements
Our solution
• Create a standalone distributed storage system based on the Chaos storage engine
• Give it an HDFS-like RPC interface
The actual project!
Hurricane
Hurricane
• Scalable, decentralized storage system based on Chaos
• Balances I/O load randomly across available disks
• Saturates the available storage bandwidth
• Targets rack-scale deployments
Real-life scenario
• Chaos using Hurricane
Real-life scenario
• Measuring the emotions of countries during Euro 2016
[Diagram: per-country data (Switzerland, Belgium, Romania) feeding per-country emotion analysis]
• And much more!
Locality does not matter!
• Remote storage bandwidth = local storage bandwidth
• Clients can read/write to any storage device
• Storage is slower than the network, so the network is not a bottleneck!
• Realistic for most clusters at rack scale, or even beyond
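The claim follows from simple arithmetic; a back-of-the-envelope check using this deck's DCO numbers (10 Gbit/s network, 243 MB/s SSD):

```python
# A 10 Gbit/s link moves roughly 1250 MB/s (ignoring protocol overhead),
# while the DCO SSDs peak at ~243 MB/s: the disk, not the network, is the
# bottleneck, so a remote disk costs no more to reach than a local one.
link_mb_per_s = 10_000 / 8   # 10 Gbit/s ~= 1250 MB/s
ssd_mb_per_s = 243           # measured SSD bandwidth on DCO

print(link_mb_per_s / ssd_mb_per_s)  # about 5 disks fit in one link
```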
Maximizing I/O bandwidth
• Clients pull data records from servers
[Diagram: three servers, each serving a client]
• Clients batch requests to prevent idle servers (prefetching)
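A minimal sketch of the pull-with-prefetching idea, assuming a hypothetical client that keeps a fixed window of requests in flight so the server always has work queued (the class names and signatures here are illustrative, not Hurricane's actual API):

```python
from collections import deque

class Reply:
    """Stand-in for an asynchronous RPC reply (illustrative only)."""
    def __init__(self, rid):
        self.rid = rid
    def wait(self):
        return f"record-{self.rid}"

class FakeServer:
    """Stand-in for a storage server; a real client would go over the network."""
    def request(self, rid):
        return Reply(rid)

def pull_records(server, n_records, window=4):
    """Pull n_records from a server, keeping up to `window` requests
    outstanding so the server is never idle between replies."""
    in_flight = deque()
    results = []
    next_id = 0
    while len(results) < n_records:
        # Issue requests ahead of consumption, up to the window size.
        while next_id < n_records and len(in_flight) < window:
            in_flight.append(server.request(next_id))
            next_id += 1
        # Consume the oldest outstanding request.
        results.append(in_flight.popleft().wait())
    return results
```

With `window=1` this degenerates to plain request/reply; a larger window hides the round-trip latency behind useful work.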
Features
• Global file handling (global_*)
• create, exists, delete, fill, drain, rewind, etc.
• Local file handling (local_*)
• create, exists, delete, fill, drain, rewind, etc.
• Add storage nodes dynamically
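A hedged sketch of what a client session over this interface might look like; the operation names mirror the global_* list above, but the class, signatures, and in-memory behavior are assumptions for illustration only:

```python
class HurricaneClient:
    """Illustrative in-memory stand-in for the RPC client; the real
    system stripes file data across the storage servers."""
    def __init__(self, servers):
        self.servers = servers
        self.files = {}
        self.cursor = {}

    def global_create(self, name):
        self.files[name] = []
        self.cursor[name] = 0

    def global_exists(self, name):
        return name in self.files

    def global_fill(self, name, records):   # append records
        self.files[name].extend(records)

    def global_rewind(self, name):          # reset the read cursor
        self.cursor[name] = 0

    def global_drain(self, name):           # stream records back
        out = self.files[name][self.cursor[name]:]
        self.cursor[name] = len(self.files[name])
        return out

    def global_delete(self, name):
        del self.files[name]

client = HurricaneClient(servers=["s1:9000", "s2:9000"])
client.global_create("edges")
client.global_fill("edges", [("a", "b"), ("b", "c")])
print(client.global_drain("edges"))
```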
How does it work? – Writing files
[Diagram: clients C1, C2, C3 write chunks of file f to servers S1 and S2]
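The write path can be sketched from the stated balancing rule (I/O load spread randomly across available disks); the server stub and uniform per-chunk choice below are assumptions:

```python
import random

class Server:
    """Minimal stand-in for a storage server (illustrative only)."""
    def __init__(self, name):
        self.name = name
        self.chunks = []
    def store(self, chunk):
        self.chunks.append(chunk)

def write_file(chunks, servers, rng):
    """Send each chunk of a file to a randomly chosen server,
    spreading the I/O load across all available disks."""
    for chunk in chunks:
        rng.choice(servers).store(chunk)

servers = [Server("S1"), Server("S2")]
write_file(list(range(100)), servers, random.Random(0))
print([len(s.chunks) for s in servers])  # chunk counts per server
```

No central metadata node decides placement, which is what makes the design decentralized.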
How does it work? – Reading files
[Diagram: clients C1, C2, C3 pull chunks of file f from servers S1 and S2]
How does it work? – Join
[Diagram: a new server S3 joins S1 and S2; subsequent chunks of files f and g land on S3 as well, serving clients C1, C2, C3]
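Dynamic join can be sketched on top of the same random-placement idea: a joining server simply enters the candidate pool, so later chunks may land on it. This is a toy sketch, not the actual membership protocol:

```python
import random

rng = random.Random(1)
pool = ["S1", "S2"]

# Writes before the join can only ever target S1 or S2.
before = [rng.choice(pool) for _ in range(8)]

# S3 joins: it is added to the pool, no data needs to move.
pool.append("S3")
after = [rng.choice(pool) for _ in range(300)]

print("S3" in before, "S3" in after)
```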
Experiments
Clusters

        | LABOS          | DCO                          | TREX
OS      | Ubuntu 14.04.1 | Ubuntu 14.04.1               | Ubuntu 14.04.1
# Cores | 32             | 16                           | 32
Memory  | 32 GB          | 128 GB                       | 128 GB
Storage | HDD: 474 MB/s  | HDD: 140 MB/s, SSD: 243 MB/s | HDD: 414 MB/s, SSD: 464 MB/s
Network | 1 Gbit/s       | 10 Gbit/s                    | 40 Gbit/s
List of experiments
• Weak scaling
• Scalability with 1 client
• Strong scaling
• Case studies
• Unbounded buffer
• Compression
Weak scaling
• Each node writes/reads 16 GB of data
• Increasing number of nodes
• N servers, N clients
• Measure average bandwidth
• Compare the Chaos storage engine, Hurricane, and DFSIO
16 GB per node – 40 Gbit/s network
[Charts: TREX SSD read and write average bandwidth (MB/s) vs. number of machines (1–16) for Chaos storage, Hurricane, and DFSIO, against a dd/hdparm baseline]
16 GB per node – 10 Gbit/s network
[Charts: DCO SSD read and write average bandwidth (MB/s) vs. number of machines (1–8) for Chaos storage, Hurricane, and DFSIO, against a dd/hdparm baseline]
16 GB per node – 1 Gbit/s network
[Charts: LABOS read and write average bandwidth (MB/s) vs. number of machines (1–8) for Chaos storage, Hurricane, and DFSIO, against a dd/hdparm baseline]
Weak scaling – Summary
• Hurricane performs similarly to the Chaos storage engine
• Scalable
• Outperforms HDFS by roughly 1.5x
• Maximizes I/O bandwidth
16 GB per node – 64 nodes
[Charts: DCO SSD read and write average bandwidth (MB/s) vs. number of machines (1–64) for Chaos storage and Hurricane, against a dd/hdparm baseline]
STILL SCALABLE & GOOD I/O BANDWIDTH
Scalability with 1 client
• The client writes/reads 16 GB of data per server node
• Increasing number of server nodes
• N servers, 1 client
• Measure aggregate bandwidth
• Only Hurricane is used
40 Gbit/s network
[Charts: TREX SSD read and write aggregate bandwidth (MB/s) vs. number of machines (1–16), against a baseline and the actual bandwidth of the network]
Unknown network problem
10 Gbit/s network
[Charts: DCO SSD read and write aggregate bandwidth (MB/s) vs. number of machines (1–8), against a baseline]
• Also scales with only 1 client
• Uses the I/O bandwidth of all the server nodes
Strong scaling
• Read/write 128 GB of data in total
• Increasing number of nodes
• N servers, N clients
• Measure aggregate bandwidth
• Compare the Chaos storage engine, Hurricane, and DFSIO
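The difference from the weak-scaling setup is what stays fixed: here the total volume (128 GB) is constant and each node's share shrinks, whereas weak scaling fixed 16 GB per node and grew the total. A quick sketch of the two regimes:

```python
TOTAL_GB = 128         # strong scaling: total volume is fixed
PER_NODE_GB_WEAK = 16  # weak scaling: per-node volume is fixed

for n in (1, 2, 4, 8, 16):
    print(f"{n} nodes: strong {TOTAL_GB / n:g} GB/node, "
          f"weak {PER_NODE_GB_WEAK * n} GB total")

# Ideal strong scaling: N nodes finish the same job about N times faster.
```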
40 Gbit/s network
[Charts: TREX SSD read and write aggregate bandwidth (MB/s) vs. number of machines (1–16) for Chaos storage, Hurricane, and DFSIO, against a baseline]
1 Gbit/s network
[Charts: LABOS read and write aggregate bandwidth (MB/s) vs. number of machines (1–8) for Chaos storage, Hurricane, and DFSIO, against a baseline]