Warped Mirrors for Flash • Yiying Zhang, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
3 Flash-based SSDs in Storage Systems • Using commercial SSDs in storage layer ▫ Good performance ▫ Easy to use ▫ Relatively cheap • Usage ▫ MySpace, Facebook, Amazon, etc. ▫ All-flash storage, e.g., Pure Storage • What about reliability?
4 Flash-based SSD Reliability • Flash wears out with erases ▫ More writes => more erases ▫ FTL and wear leveling help • One way to improve SSD reliability ▫ Redundancy or RAID ▫ Assumes failure independence
5 What About Flash-based Array? [Diagram: timeline of two mirrored SSDs receiving the same writes; labels: Replace ($$$), Data Loss] • Correlated failure!
6 WaM - Warped Mirrors for Flash • Write more to one SSD to induce earlier failure [Diagram: timeline of two mirrored SSDs, one receiving more writes than the other; labels: Replace, No Data Loss] • Focus on mirrors (RAID1)
7 WaM Benefits • Reliability achieved by failure separation • Configurable ▫ Approximate model + correction method • Low monetary cost ▫ 1-2 cents per hour for mirrors using WaM ▫ 47-94% of the cost of fixed-time replacement every year • Small performance overhead ▫ 10% longer response time for a 52-hour to 159-day separation
8 Outline • Introduction • WaM design and model • Evaluation results • Conclusion
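9 Basic Solution - Adding Dummy Writes [Diagram: SSD_early receives the regular writes plus dummy writes, SSD_late receives the regular writes only; the gap between their failures is labeled FSI] • FSI: Failure-Separation Interval • Dummy writes come from the RAID controller ▫ Write the existing content ▫ From last write or a random page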
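The dummy-write idea on this slide can be sketched as a toy mirror controller. This is a minimal sketch and not the authors' implementation: the class name `WarpedMirror`, the fractional credit counter, and the 50/50 choice between the last-written page and a random page are assumptions made for illustration. Here `dummy_pct` is taken to mean dummy writes per 100 real writes.

```python
import random


class WarpedMirror:
    """Toy RAID1 controller that adds dummy writes to one device (illustrative)."""

    def __init__(self, num_pages, dummy_pct, seed=0):
        self.early = [b""] * num_pages  # SSD_early: real writes + dummy writes
        self.late = [b""] * num_pages   # SSD_late: real writes only
        self.dummy_pct = dummy_pct      # assumed: dummy writes per 100 real writes
        self.early_writes = 0
        self.late_writes = 0
        self._credit = 0.0              # fractional dummy writes still owed
        self._last_page = 0
        self._rng = random.Random(seed)

    def write(self, page, data):
        # A real write is mirrored to both devices.
        self.early[page] = data
        self.late[page] = data
        self.early_writes += 1
        self.late_writes += 1
        self._last_page = page
        # Accumulate dummy-write credit and issue whole dummy writes.
        self._credit += self.dummy_pct / 100.0
        while self._credit >= 1.0:
            self._dummy_write()
            self._credit -= 1.0

    def _dummy_write(self):
        # Rewrite existing content, taken from the last write or a random page.
        if self._rng.random() < 0.5:
            page = self._last_page
        else:
            page = self._rng.randrange(len(self.early))
        self.early[page] = self.early[page]  # same data, so the mirror stays consistent
        self.early_writes += 1


mirror = WarpedMirror(num_pages=16, dummy_pct=50)
for i in range(100):
    mirror.write(i % 16, str(i).encode())
```

Because a dummy write only rewrites content that is already on the device, the two copies stay identical while SSD_early accumulates more writes than SSD_late.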
10 Failure Separation Interval • FSI: window for detection and reconstruction ▫ Set by administrator at initialization time ▫ Can be adjusted • Choosing FSI ▫ Long enough for recovery ▫ Short enough to avoid high performance cost • How many dummy writes to add given an FSI?
11 Challenges • Subverting FTL ▫ No knowledge of underlying FTL • Achieving near-perfect FSI ▫ FSI cannot be shorter than target (reliability) ▫ Performance overhead should be minimized
12 WaM Model • Model based on ▫ Target FSI length ▫ SSD properties ▫ Workload properties • Goal ▫ Find dummy write percentage for a target FSI
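13 WaM Model – Dummy Write Percentage
• Ratio of erases between the two mirrored SSDs
\[ R_{\text{erase}} = \frac{N_{\text{erases}}^{\text{early}}}{N_{\text{erases}}^{\text{late}}} \]
▫ \(N_{\text{erases}}^{\text{early}}\): number of erases issued by SSD_early
▫ \(N_{\text{erases}}^{\text{late}}\): number of erases issued by SSD_late
• Dummy write percentage \(P_{\text{dummy}}\)
\[ R_{\text{erase}} = P_{\text{dummy}} + 1 \]
\[ P_{\text{dummy}} = R_{\text{erase}} - 1 \]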
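14 WaM Model – Num Erases Remaining
• Erases remaining on SSD_late
\[ N_{\text{remaining}}^{\text{late}} = N_{\text{worn}} - N_{\text{erases}}^{\text{late}} \]
▫ \(N_{\text{worn}}\): maximum number of erases of an SSD block
▫ \(N_{\text{erases}}^{\text{late}}\): number of erases with SSD_late when SSD_early dies
• When SSD_early dies
\[ N_{\text{erases}}^{\text{late}} = \frac{N_{\text{worn}}}{R_{\text{erase}}} \]
\[ N_{\text{remaining}}^{\text{late}} = N_{\text{worn}} - \frac{N_{\text{worn}}}{R_{\text{erase}}} \]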
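15 WaM Model – Num Erases during Time T
• Number of I/Os during T (workload dependent)
\[ N_{\text{IOs}} = \frac{T}{T_r + T_i} \]
▫ \(T_r\): avg response time
▫ \(T_i\): avg idle time
• Total erases during T (requires knowledge of SSD parameters)
\[ N_{\text{erases}}^{\text{total}}(T) = \frac{T}{T_r + T_i} \cdot P_{\text{writes}} \cdot \frac{S_{\text{page}}}{S_{\text{block}}} \]
▫ \(P_{\text{writes}}\): write percentage
▫ \(S_{\text{page}}\): flash page size
▫ \(S_{\text{block}}\): flash erase block size
• Erases per block during T (assumes perfect wear leveling)
\[ N_{\text{erases}}^{\text{perblock}}(T) = \frac{T}{T_r + T_i} \cdot P_{\text{writes}} \cdot \frac{S_{\text{page}}}{S_{\text{block}}} \cdot \frac{1}{N_{\text{ssd}}} \]
▫ \(N_{\text{ssd}}\): number of erase blocks in the SSD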
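The per-block erase count can be evaluated numerically. The following is a minimal sketch of the formula on this slide, assuming perfect wear leveling as the slide does. The function name and every parameter value in the example are hypothetical and chosen only to make the arithmetic easy to follow; they do not describe any real device.

```python
def erases_per_block(duration_s, resp_time_s, idle_time_s, write_fraction,
                     page_size, block_size, num_blocks):
    """Erases each block sees during duration_s, assuming perfect wear leveling."""
    num_ios = duration_s / (resp_time_s + idle_time_s)      # I/Os issued during T
    total_erases = num_ios * write_fraction * page_size / block_size
    return total_erases / num_blocks                        # spread evenly over blocks


# Hypothetical example: one hour of back-to-back 1 ms writes (no idle time),
# 4 KB pages, 256 KB erase blocks, 1000 erase blocks.
per_block = erases_per_block(
    duration_s=3600, resp_time_s=0.001, idle_time_s=0.0, write_fraction=1.0,
    page_size=4096, block_size=262144, num_blocks=1000)
print(f"{per_block:.2f} erases per block per hour")
```

With these made-up values, 3.6 million page writes fill 56,250 blocks' worth of pages, which perfect wear leveling spreads to about 56 erases per block.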
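16 WaM Model – Final Steps
• The erases remaining on SSD_late must cover one FSI
\[ N_{\text{remaining}}^{\text{late}} = N_{\text{erases}}^{\text{perblock}}(FSI) \]
\[ N_{\text{worn}} - \frac{N_{\text{worn}}}{R_{\text{erase}}} = \frac{FSI}{T_r + T_i} \cdot P_{\text{writes}} \cdot \frac{S_{\text{page}}}{S_{\text{block}}} \cdot \frac{1}{N_{\text{ssd}}} \]
• Solving for the erase ratio
\[ R_{\text{erase}} = \frac{N_{\text{worn}}}{N_{\text{worn}} - \frac{FSI}{T_r + T_i} \cdot P_{\text{writes}} \cdot \frac{S_{\text{page}}}{S_{\text{block}}} \cdot \frac{1}{N_{\text{ssd}}}} \]
• Dummy write percentage
\[ P_{\text{dummy}} = R_{\text{erase}} - 1 \]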
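Putting the final steps together, the dummy write percentage for a target FSI can be computed directly. This is a sketch under the model's own assumptions (perfect wear leveling, stable workload). The function name, the error check, and the example values are hypothetical, and the result is reported as dummy writes per 100 real writes.

```python
def dummy_write_percentage(fsi_s, n_worn, resp_time_s, idle_time_s,
                           write_fraction, page_size, block_size, num_blocks):
    """Dummy write percentage needed to separate failures by fsi_s seconds."""
    # Per-block erases that SSD_late must still be able to absorb during one FSI.
    erases_in_fsi = (fsi_s / (resp_time_s + idle_time_s)
                     * write_fraction * page_size / block_size / num_blocks)
    if erases_in_fsi >= n_worn:
        raise ValueError("target FSI is not reachable within the block lifetime")
    r_erase = n_worn / (n_worn - erases_in_fsi)   # ratio of early to late erases
    return (r_erase - 1.0) * 100.0                # P_dummy, in percent


# Hypothetical example: blocks rated for 10,000 erases, 1 ms writes with no
# idle time, 4 KB pages, 256 KB blocks, 1000 blocks, target FSI of 128,000 s.
p_dummy = dummy_write_percentage(
    fsi_s=128000, n_worn=10000, resp_time_s=0.001, idle_time_s=0.0,
    write_fraction=1.0, page_size=4096, block_size=262144, num_blocks=1000)
print(f"P_dummy = {p_dummy:.1f}%")
```

In this made-up case one FSI consumes 2,000 erases per block, so the erase ratio is 10,000 / 8,000 = 1.25 and the controller would add 25 dummy writes per 100 real writes.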
17 Assumptions and Limitations • Device parameters ▫ From device vendor or detect with tool • Workload changes ▫ Adjust model as workloads change • Imperfect or no wear leveling • Incorrect SSD lifetime • Violations make the FSI too short or too long
18 Achieving Target FSI
• If FSI too short
▫ Delay writes to the surviving SSD
\[ R_{\text{delay}} = \frac{N_{\text{remaining\_target}}^{\text{late}}}{N_{\text{remaining\_actual}}^{\text{late}}} \]
[Diagram: timeline of writes to SSD_early and SSD_late with the target FSI marked]
• If FSI too long
▫ Performance cost
▫ Adjust in future WaM modeling
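The delay ratio is straightforward to compute once the first SSD has failed. The slide does not say how the ratio is applied, so the sketch below makes an assumption: the interval between writes to the surviving SSD is stretched by the ratio, and no delay is applied when the FSI is already long enough. The function names are hypothetical.

```python
def delay_ratio(remaining_target, remaining_actual):
    """R_delay: target remaining erases over actual remaining erases on SSD_late."""
    if remaining_actual <= 0:
        raise ValueError("SSD_late has no erases left")
    # Assumption: never speed writes up; a ratio below 1 means no delay is needed.
    return max(1.0, remaining_target / remaining_actual)


def delayed_interval(write_interval_s, ratio):
    """Assumed policy: stretch the time between writes by the delay ratio."""
    return write_interval_s * ratio


# SSD_late should have had 2000 erases left per block but only has 1600.
ratio = delay_ratio(remaining_target=2000, remaining_actual=1600)
print(ratio, delayed_interval(0.004, ratio))
```

Slowing writes by a factor of 1.25 lets the 1,600 remaining erases last as long as 2,000 would have at the original rate, which is the intent of delivering the target FSI.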
19 Recovery • When the first SSD (SSD_early) fails ▫ Replace with a new SSD ▫ Reconstruct the data • Replacing the second SSD (SSD_late) ▫ At the same time as the first SSD fails (no reliability risk, slightly higher cost) ▫ When it fails (higher reliability risk, slightly lower cost)
20 Outline • Introduction • WaM design and model • Evaluation results • Conclusion
21 Evaluation Environment • Simulation based on DiskSim + SSD extension • A mirror pair of two 80GB SSDs • Workloads ▫ Microbenchmark ▫ Macrobenchmark ▫ Trace ▫ No idle time
22 Can Failures Be Separated with Dummy Writes? And How? [Plot: FSI (h, 0-8000) vs. Dummy Write Percentage (%, 0-100); series: Random Write, 66% Random Write, 33% Random Write, Sequential Write] • Failures can be separated with dummy writes • More dummy writes -> longer separation • Wear leveling homogenizes workloads
23 What Is the Performance Overhead? [Plot: Avg Response Time Increase (%, 0-100) vs. Dummy Write Percentage (%, 0-100); series: Random Write, 66% Random Write, 33% Random Write, Sequential Write] • More dummy writes -> worse performance
24 Can the Correct FSI Be Achieved? • Sequential workload [Plot: FSI Delivered (h, 0-60) vs. FSI Target (h, 0-50); series: Target, WaM Without Delay, WaM With Delay]
25 Can the Correct FSI Be Achieved? • Random workload [Plot: FSI Delivered (h, 0-60) vs. FSI Target (h, 0-50); series: Target, WaM Without Delay, WaM With Delay] • WaM model can be inaccurate • Target FSI can be delivered with delaying
26 How about Real Workloads? - FSI [Plots: FSI (h) vs. Dummy Write Percentage (%, 0-100), one panel each for TPC-C, Postmark, and WebSearch] • FSI and dummy write relationship as expected • Larger FSI with read-intensive workloads
27 How about Real Workloads? - Performance [Plot: Avg Response Time Increase (%, 0-100) vs. Dummy Write Percentage (%, 0-100); series: Postmark, TPC-C, WebSearch; annotation: 50-5000 hours of FSI] • Higher overhead with write-intensive workloads • Performance overhead is small for typical FSI
28 What is the Monetary Cost? • WaM: cost of SSD + sys-admin check once per FSI • Fixed replacement: replace SSD after one year [Plot: Cost (dollar/h, 0-0.03) vs. Dummy Write Percentage (%, 0-100), with the cost of fixed replacement shown for comparison] • 3 years total ownership cost ▫ Fixed replacement: $594 ▫ WaM: $275-$366 • WaM costs less than fixed-time replacement
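The three-year totals on this slide can be related to the dollar-per-hour axis with simple arithmetic. A small sketch, assuming 365-day years (the slide does not state the convention); the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # assumed convention: 8760 hours per year


def cost_per_hour(total_cost_dollars, years):
    """Convert a total ownership cost into dollars per hour of operation."""
    return total_cost_dollars / (years * HOURS_PER_YEAR)


fixed = cost_per_hour(594, 3)      # fixed-time replacement
wam_low = cost_per_hour(275, 3)    # WaM, low end of the reported range
wam_high = cost_per_hour(366, 3)   # WaM, high end of the reported range
print(f"fixed: ${fixed:.4f}/h, WaM: ${wam_low:.4f}/h to ${wam_high:.4f}/h")
```

Under this convention the WaM totals work out to roughly 1.0 to 1.4 cents per hour, against about 2.3 cents per hour for fixed replacement.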
29 Summary of Results • Failures are separated with the desired FSI • Model is approximate • Delaying writes achieves the desired FSI • Small performance overhead • Low monetary cost
30 Outline • Introduction • WaM design and model • Evaluation results • Conclusion
31 Conclusion • Correlated failure of flash-based RAID • Separate failures by carefully adding dummy writes and delaying writes • Other techniques for failure separation ▫ Wear out one SSD to some extent before using ▫ Stagger SSDs with different ages in a RAID ▫ Vendors control when SSDs in a RAID fail
32 Conclusion • Applying existing solutions directly to new devices may not work • WaM is a simple solution that guarantees failure separation and enables aggressive use of SSDs • Other techniques may work well • WaM model can be useful
33 Thank You Questions? http://wisdom.cs.wisc.edu/home http://research.cs.wisc.edu/adsl