Adjustable flat layouts for Two-Failure Tolerant Storage Systems Thomas Schwarz, SJ Marquette University
Motivation • Storage device batches fail at different rates • Example: Backblaze: • 1163 Seagate Barracuda 7200.14 disks • failed at a rate of 43% per year in 2014 • Storage devices (sometimes) fail at different rates over their lifetime • Bathtub curve seen in about 50% of all hard disks at NetApp • The unrecoverable read error rate of SSDs increases toward the end of their lifetime
Motivation • Large storage systems • Currently consist of disks or SSDs organized in racks • Individual devices are replaced • Erasure coding for files, not devices • My proposal • Organize a large number of devices in a storage pod • Level of failure tolerance in pod varies according to prediction of device vulnerability • Use a flat layout to increase failure tolerance
Adjustable RAID 6 Example • Group k devices into a reliability stripe • User data devices • Add two parity devices to each reliability stripe • If device failure rate appears to be high: • Rededicate a user data device as a parity device • Overall: • Trade capacity for additional failure tolerance when needed
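The capacity trade-off can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that the rededicated device stays in its stripe as a third parity device, so a stripe of k data and 2 parity devices becomes k - 1 data and 3 parity devices; the slides do not spell out this detail, and `storage_efficiency` is a hypothetical helper.

```python
def storage_efficiency(data_devices, parity_devices):
    """Fraction of the devices in one reliability stripe that hold user data."""
    return data_devices / (data_devices + parity_devices)

k = 8  # user data devices per stripe, as in the 8 + 2 example

# Normal operation: k data devices plus two parity devices.
before = storage_efficiency(k, 2)

# Assumption: one data device is emptied and reused as a third parity device.
after = storage_efficiency(k - 1, 3)

print(f"before: {before:.2f}, after: {after:.2f}")
```

Under this assumption the stripe gives up one tenth of its devices as user capacity in exchange for one more parity device.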
Adjustable RAID 6 Example
P0,1 P0,2 D0,0 D0,1 D0,2 D0,3 D0,4 D0,5 D0,6 D0,7
P1,2 D1,0 D1,1 D1,2 D1,3 D1,4 D1,5 D1,6 D1,7 P1,1
P2,1 P2,2 D2,0 D2,1 D2,2 D2,3 D2,4 D2,5 D2,6 D2,7
D3,7 P3,1 P3,2 D3,0 D3,1 D3,2 D3,3 D3,4 D3,6 D3,5
D4,6 D4,7 P4,1 P4,2 D4,0 D4,1 D4,2 D4,3 D4,4 D4,5
D5,5 D5,6 D5,7 P5,1 P5,2 D5,0 D5,1 D5,2 D5,3 D5,4
D6,5 D6,6 D6,7 P6,1 P6,2 D6,0 D6,1 D6,2 D6,3 D6,4
D7,3 D7,4 D7,6 D7,7 P7,1 P7,2 D7,0 D7,1 D7,2 D7,5
D8,6 D8,7 P8,2 D8,0 D8,1 D8,2 D8,3 D8,4 D8,5 P8,1
P9,2 D9,0 D9,1 D9,2 D9,3 D9,4 D9,5 D9,6 D9,7 P9,1
P10,1 P10,2 D10,0 D10,1 D10,2 D10,3 D10,4 D10,5 D10,6 D10,7
D11,2 D11,3 D11,4 D11,6 D11,7 P11,1 P11,2 D11,0 D11,1 D11,5
D12,2 D12,3 D12,5 D12,6 D12,7 P12,1 P12,2 D12,0 D12,1 D12,4
D13,5 D13,6 D13,7 P13,1 P13,2 D13,0 D13,1 D13,2 D13,3 D13,4
D14,4 D14,5 D14,6 D14,7 P14,1 P14,2 D14,0 D14,1 D14,2 D14,3
D15,4 D15,5 D15,6 D15,7 P15,1 P15,2 D15,0 D15,1 D15,2 D15,3
Adjustable RAID 6
Alternative to RAID Stripes • Use a flat layout: • Each user data device is in two or three reliability stripes with one additional parity • Does not use Galois field arithmetic • Reconstruction can be done using two or three alternatives • Can avoid a single hot spot
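The idea of alternative reconstruction paths can be illustrated with a minimal sketch. It assumes that each stripe's parity is the bytewise XOR of its data devices (the slides only state that no Galois field arithmetic is needed) and uses a hypothetical layout with four stripes in which every data device belongs to exactly two stripes. The names `xor_blocks` and `rebuild` and the sample contents are illustrative.

```python
from functools import reduce
from itertools import combinations

def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

# Hypothetical layout: stripes 0..3, one data device per pair of stripes.
devices = list(combinations(range(4), 2))

# Arbitrary 4-byte contents for each data device.
data = {dev: bytes([17 * i + j for j in range(4)]) for i, dev in enumerate(devices)}

# Assumption: the parity of a stripe is the XOR of all data devices in it.
parity = {s: xor_blocks([data[d] for d in devices if s in d]) for s in range(4)}

def rebuild(lost, via):
    """Recompute the lost data device from the surviving members of stripe `via`."""
    survivors = [data[d] for d in devices if via in d and d != lost]
    return xor_blocks(survivors + [parity[via]])

lost = (0, 1)  # this device belongs to stripes 0 and 1
assert rebuild(lost, via=0) == data[lost]
assert rebuild(lost, via=1) == data[lost]
```

Because either stripe suffices, a rebuild can read from whichever stripe is less loaded, which is one way to avoid a single hot spot.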
Results • Adjustable RAID 6 • Easy to find configurations • Adjustable flat layouts • Higher reliability • No need for Galois field arithmetic • Galois field accelerators need an extended instruction set • Flexibility in reconstruction of lost data
Layout Definition • Flat layouts: • Each user data device is part of two reliability stripes • Two reliability stripes have at most one data device in common • Each reliability stripe contains k user data devices • Therefore: • Each data device corresponds to an edge of an undirected graph • Each reliability stripe, and with it its parity device, corresponds to a vertex
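The correspondence between flat layouts and graphs can be sketched in a few lines. The example assumes a hypothetical 3-by-3 arrangement of nine data devices with row stripes A, B, C and column stripes D, E, F; `flat_layout` and `is_flat` are illustrative helper names, not part of the original material.

```python
from itertools import combinations

def flat_layout(edges):
    """Map each vertex (stripe / parity device) to the data devices (edge indices) in it."""
    stripes = {}
    for device, (u, v) in enumerate(edges):
        stripes.setdefault(u, []).append(device)
        stripes.setdefault(v, []).append(device)
    return stripes

def is_flat(stripes):
    """Check the two defining properties of a flat layout."""
    membership = {}
    for members in stripes.values():
        for device in members:
            membership[device] = membership.get(device, 0) + 1
    # Every data device is in exactly two stripes.
    in_two_stripes = all(count == 2 for count in membership.values())
    # Any two stripes share at most one data device.
    small_overlap = all(len(set(a) & set(b)) <= 1
                        for a, b in combinations(stripes.values(), 2))
    return in_two_stripes and small_overlap

# Hypothetical example: one edge per (row stripe, column stripe) pair.
edges = [(row, col) for row in "ABC" for col in "DEF"]
stripes = flat_layout(edges)
print(stripes["A"], stripes["D"], is_flat(stripes))
```

Here every stripe holds k = 3 data devices. A graph with two parallel edges between the same pair of vertices would fail the check, because the two stripes would then share two devices.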
Layout Definition • Use graph view: • Edges are user data devices • Vertices are parity devices
D1 D2 D3 | A
D4 D5 D6 | B
D7 D8 D9 | C
D  E  F
Layout and corresponding graph (row stripes A, B, C; column stripes D, E, F)
Layout Definition • Densest layouts correspond to a complete graph
A: 0, 1, 2, 3, 4
B: 4, 5, 6, 7, 8
C: 8, 3, 9, 10, 11
D: 11, 7, 2, 12, 13
E: 13, 10, 6, 1, 14
F: 14, 12, 9, 5, 0
Flat layout with 6 stripes (vertices A to F) and 15 user data devices (edges 0 to 14)
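The size of a densest layout follows directly from the complete graph: n stripes, n(n-1)/2 user data devices, and n - 1 data devices per stripe. The following sketch enumerates such a layout for n = 6; the function name `complete_layout` is illustrative, and its device numbering is simply the order in which `itertools.combinations` lists the edges, so it need not match the labels in the figure.

```python
from itertools import combinations

def complete_layout(n):
    """Flat layout of the complete graph K_n: stripe v holds every edge that touches v."""
    edges = list(combinations(range(n), 2))
    return {v: [d for d, edge in enumerate(edges) if v in edge] for v in range(n)}

layout = complete_layout(6)
devices = {d for members in layout.values() for d in members}

print(len(layout), "stripes,", len(devices), "user data devices")
for stripe, members in layout.items():
    print(stripe, members)

# In a complete graph any two stripes share exactly one data device.
assert all(len(set(a) & set(b)) == 1
           for a, b in combinations(layout.values(), 2))
```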
Layout Definition • If we want to create additional reliability stripes, we can use a graph factorization • Each user data device is in three reliability stripes • Any two stripes intersect in at most one user data device • This factorization was invented by Lawless (1974)
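To show how a factorization can supply a third stripe for every device, here is a minimal sketch. It deliberately swaps in the textbook round-robin 1-factorization of the complete graph on an even number of vertices, which is an assumption made for illustration and not necessarily the Lawless construction cited above. Each perfect matching becomes one additional stripe; `one_factorization` and `three_stripe_layout` are hypothetical names.

```python
from itertools import combinations

def one_factorization(n):
    """Round-robin split of the edges of K_n (n even) into n - 1 perfect matchings."""
    if n % 2:
        raise ValueError("n must be even")
    m = n - 1  # vertices 0..m-1 rotate, vertex m stays fixed
    factors = []
    for r in range(m):
        matching = [frozenset((r, m))]
        for i in range(1, n // 2):
            matching.append(frozenset(((r + i) % m, (r - i) % m)))
        factors.append(matching)
    return factors

def three_stripe_layout(n):
    """Vertex stripes plus one extra stripe per matching (illustrative only)."""
    edges = [frozenset(e) for e in combinations(range(n), 2)]
    device = {edge: d for d, edge in enumerate(edges)}
    stripes = [{device[e] for e in edges if v in e} for v in range(n)]
    stripes += [{device[e] for e in factor} for factor in one_factorization(n)]
    return stripes

stripes = three_stripe_layout(6)
membership = {}
for members in stripes:
    for d in members:
        membership[d] = membership.get(d, 0) + 1

# Every device is in three stripes, and two stripes never share more than one device.
assert all(count == 3 for count in membership.values())
assert all(len(a & b) <= 1 for a, b in combinations(stripes, 2))
```

With this particular choice the added stripes are smaller than the vertex stripes (three devices instead of five for n = 6), so it only demonstrates the two intersection properties, not equal stripe sizes.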
Layout Definition • Additional parity devices can be added to an ensemble when needed • How about switching some user data devices to parity? • Cannot be done instantaneously because those data devices need to be emptied first • But it can be done
Layout Definition • Punctured layouts: Remove the middle edge from each factor
Layout Definition • Available only for certain combinations of parity and data device numbers
Reliability Evaluation • We compare with an adjustable RAID Level 6 configuration • Robustness: Probability that f device failures have led to data loss [Figures: Comparison Degree 5; Comparison Degree 10]
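For small layouts, the robustness measure can be computed by exhaustive enumeration. The sketch below is not the evaluation behind the figures; it assumes XOR parity, a complete-graph flat layout with n stripes, and that failures can hit data and parity devices alike. Under these assumptions a set of failed devices loses data exactly when the corresponding columns of the parity-check matrix are linearly dependent over GF(2). All function names are illustrative.

```python
from itertools import combinations
from math import comb

def columns(n):
    """Parity-check columns as bitmasks: bit v is set if the device is in stripe v."""
    cols = [(1 << a) | (1 << b) for a, b in combinations(range(n), 2)]  # data devices
    cols += [1 << v for v in range(n)]                                  # parity devices
    return cols

def independent(vectors):
    """True if the given GF(2) vectors (Python ints) are linearly independent."""
    basis = {}  # highest set bit -> reduced vector
    for v in vectors:
        while v:
            high = v.bit_length() - 1
            if high not in basis:
                basis[high] = v
                break
            v ^= basis[high]
        else:
            return False  # v was reduced to zero, so it depends on earlier vectors
    return True

def robustness(n, f):
    """Fraction of all f-device failure patterns that lead to data loss."""
    cols = columns(n)
    lost = sum(1 for failed in combinations(cols, f) if not independent(failed))
    return lost / comb(len(cols), f)

for f in range(1, 5):
    print(f, robustness(4, f))
```

For n = 4 (six data and four parity devices) no pattern of one or two failures loses data, which matches the two-failure tolerance of the layout. The enumeration grows combinatorially, so larger pods would need sampling or a counting argument instead.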
Reliability Evaluation • Calculation of five- and six-year survival probabilities [Figures: Degree 10; Degree 5]
Results • Adjustable RAID 6 • Easy to find configurations • Adjustable flat layouts • Higher reliability • No need for Galois field arithmetic • Galois field accelerators need an extended instruction set • Flexibility in reconstruction of lost data