A Generic Framework for Testing Parallel File Systems
Jinrui Cao†, Simeng Wang†, Dong Dai‡, Mai Zheng†, and Yong Chen‡
† Computer Science Department, New Mexico State University
‡ Computer Science Department, Texas Tech University
Presented by Simeng Wang
PDSW-DISCS at SC16, November 14, 2016
Motivation
Jan. 2016 @ HPCC: a power outage led to unmeasurable data loss
Motivation
q Existing methods for testing storage systems are not good enough for large-scale parallel file systems (PFS)
Ø Model checking [e.g., EXPLODE @ OSDI'06]
v difficult to build a controllable model for a PFS
v state-explosion problem
Ø Formal methods [e.g., FSCQ @ SOSP'15]
v challenging to write correct specifications for a PFS
Ø Automatic testing [e.g., TorturingDB, CrashConsistency @ OSDI'14]
v closely tied to the local storage stack: intrusive for a PFS
v only works on a single node
Our Contributions
q A generic framework for testing the failure handling of parallel file systems
Ø minimal interference & high portability
v decouples the PFS from the testing framework through a remote storage protocol (iSCSI)
Ø systematically generates failure events with high fidelity
v fine-grained, controllable failure emulation
v emulates realistic failure modes
q An initial prototype for the Lustre file system
Ø uncovers internal I/O behaviors of Lustre under different workloads and failure conditions
Outline
q Introduction
q Design
Ø Virtual Device Manager
Ø Failure State Emulator
Ø Data-Intensive Workloads
Ø Post-Failure Checker
q Preliminary Experiments
q Conclusion and Future Work
Overview
[Architecture diagram: the Lustre nodes (MGS, MDS, and multiple OSSs with their MGT/MDT/OST targets) mount virtual devices exported by the testing framework, which consists of the Data-Intensive Workload, the Post-Failure Checker, the Failure State Emulator, and the Virtual Device Manager on top of device files.]
MGS: Management Server      MGT: Management Target
MDS: Metadata Server        MDT: Metadata Target
OSS: Object Storage Server  OST: Object Storage Target
Virtual Device Manager
q Creates and maintains device files for the storage devices
q The device files are mounted on Lustre nodes as virtual devices via iSCSI
q I/O operations are translated into disk I/O commands
q Commands are logged into a command history log
Ø includes node IDs, command details, and the actual data transferred
Ø used by the Failure State Emulator (one possible record format is sketched below)
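The slides do not show the log format itself; the following is a minimal Python sketch of what one command-history record could look like, assuming the fields named on this slide (node ID, command details, transferred data). All names are illustrative, not the framework's actual code.

```python
# Hypothetical sketch of a command-history record; field names are
# assumptions based on the slide, not the prototype's real format.
import time
from dataclasses import dataclass, field

@dataclass
class CommandRecord:
    node_id: str        # which Lustre node issued the I/O, e.g., "MDS" or "OSS#1"
    opcode: int         # SCSI opcode, e.g., 0x2A for WRITE(10)
    lba: int            # starting logical block address
    length: int         # number of blocks transferred
    data: bytes         # payload actually written (empty for reads)
    timestamp: float = field(default_factory=time.time)

command_history: list[CommandRecord] = []

def log_command(record: CommandRecord) -> None:
    """Append every intercepted disk command; the Failure State
    Emulator later inspects or manipulates this history."""
    command_history.append(record)
```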
Failure State Emulator
q Generates failure events in a systematic and controllable way
Ø manipulates I/O commands and emulates the failure state of each individual device
Ø emulates four realistic failure modes based on previous studies [e.g., FAST'13, OSDI'14, TOCS'16, FAST'16]
1. Whole Device Failure: the device becomes invisible to the host
2. Clean Termination of Writes: emulates the simplest power outage
3. Reordering of Writes: commits writes in an order different from the issuing order
4. Corruption of Device Blocks: changes the content of writes
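As a rough illustration of the four modes, the hedged sketch below applies each one to a list of logged write commands, reusing the hypothetical CommandRecord from the previous sketch. The function name and parameters are assumptions, not the emulator's real interface.

```python
# Hypothetical sketch: decide which logged writes "survive" an emulated
# failure of one device. Not the framework's actual implementation.
import random

def emulate_failure(pending_writes, mode, corrupt_ratio=0.01):
    """pending_writes: list of CommandRecord for one device.
    mode: "device", "terminate", "reorder", or "corrupt"."""
    if mode == "device":
        # 1. Whole device failure: the device disappears, nothing survives.
        return []
    if mode == "terminate":
        # 2. Clean termination: only a prefix of the issued writes commits.
        return pending_writes[: random.randint(0, len(pending_writes))]
    if mode == "reorder":
        # 3. Reordering: commit writes out of issue order, then cut off at
        #    a random point to model a crash mid-flush.
        shuffled = random.sample(pending_writes, len(pending_writes))
        return shuffled[: random.randint(0, len(shuffled))]
    if mode == "corrupt":
        # 4. Block corruption: flip the bytes of a fraction of the writes.
        for w in pending_writes:
            if random.random() < corrupt_ratio:
                w.data = bytes(b ^ 0xFF for b in w.data)
        return pending_writes
    raise ValueError(f"unknown failure mode: {mode}")
```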
Co-design of Workloads and Checkers
q Data-Intensive Workloads
Ø stress Lustre and generate I/O operations that age the system and bring it to a state that may be difficult to recover from
Ø may reuse existing data-intensive workloads
Ø may embed self-identification/verification information (see the sketch below)
q Post-Failure Checkers
Ø examine the post-failure behavior and check whether the system can recover without data loss
Ø may use existing checkers (e.g., LFSCK for Lustre)
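One way to realize the self-identification/verification idea is to stamp every block the workload writes with its own block number and a checksum, so the checker can detect lost, misplaced, or corrupted blocks after a failure. The encoding below is an assumed example, not the paper's format.

```python
# Hypothetical self-verifying block format: an 8-byte block number plus
# a 4-byte CRC32 header, followed by a deterministic payload.
import struct
import zlib

BLOCK_SIZE = 4096
HEADER = struct.Struct("<QI")  # block number + CRC32 of the payload

def make_block(block_no: int) -> bytes:
    payload = bytes((block_no + i) % 256 for i in range(BLOCK_SIZE - HEADER.size))
    return HEADER.pack(block_no, zlib.crc32(payload)) + payload

def check_block(block_no: int, raw: bytes) -> bool:
    """Post-failure check: is this block the one we wrote, intact?"""
    stored_no, stored_crc = HEADER.unpack_from(raw)
    payload = raw[HEADER.size:]
    return stored_no == block_no and zlib.crc32(payload) == stored_crc
```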
Preliminary Experiments
q Experimental setup
Ø a cluster of seven VMs running CentOS 7
Ø Lustre file system (version 2.8) on five VMs: one MGS/MGT node, one MDS/MDT node, and three OSS/OST nodes
Ø the sixth VM hosts the Virtual Device Manager and the Failure State Emulator
v the Virtual Device Manager is built on top of the Linux SCSI target framework (see the sketch below)
Ø the last VM serves as the client for launching the Data-Intensive Workloads and the Post-Failure Checker (LFSCK)
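For concreteness, the hedged sketch below shows how a device file might be exported as an iSCSI target with the Linux SCSI target framework's tgtadm tool (wrapped in Python here). The IQN, target ID, and backing-file path are made-up placeholders; the slides do not give the exact commands the prototype uses.

```python
# Hypothetical setup: export a file-backed virtual device over iSCSI
# using tgtadm from the Linux SCSI target framework (run as root).
import subprocess

def export_device_file(tid: int, iqn: str, backing_file: str) -> None:
    base = ["tgtadm", "--lld", "iscsi"]
    # Create the iSCSI target, attach the device file as LUN 1, and
    # allow any initiator (i.e., any Lustre node) to connect.
    subprocess.run(base + ["--op", "new", "--mode", "target",
                           "--tid", str(tid), "--targetname", iqn], check=True)
    subprocess.run(base + ["--op", "new", "--mode", "logicalunit",
                           "--tid", str(tid), "--lun", "1",
                           "--backing-store", backing_file], check=True)
    subprocess.run(base + ["--op", "bind", "--mode", "target",
                           "--tid", str(tid), "--initiator-address", "ALL"], check=True)

# Placeholder IQN and path for illustration only.
export_device_file(1, "iqn.2016-11.example:ost1", "/srv/pfs-test/ost1.img")
```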
Preliminary Experiments
q Workloads
Ø normal workloads run on Lustre:

Workload        Description
Montage/m101    astronomical image mosaic engine
cp              copy a file into Lustre
tar             decompress a file on Lustre
rm              delete a file from Lustre

Ø post-failure operations run on Lustre:

Operation       Description
lfs setstripe   set striping pattern
dd-nosync       create & extend a Lustre file
dd-sync         create & extend a Lustre file
LFSCK           check & repair Lustre
Preliminary Results
q Internal Pattern of Writes without Failure
Ø number of bytes (MB) written to the different Lustre nodes under the different workloads
Ø Montage/m101 is split into twelve steps (s1 to s12) to show the fine-grained write pattern

Lustre Nodes  cp   tar  rm   s1  s2   s3  s4   s5  s6   s7  s8   s9  s10  s11  s12
MGS/MGT       0    0    0    0   0    0   0    0   0    0   0    0   0    0    0
MDS/MDT       0.1  5    0.2  6   0.4  6   0.5  6   0.6  6   0.7  6   1    6    1
OSS/OST#1     0    14   0    14  28   14  66   14  66   18  66   18  94   56   94
OSS/OST#2     15   14   15   14  43   14  81   14  81   19  81   19  109  19   110
OSS/OST#3     0    16   0    16  24   16  24   17  24   21  24   21  49   58   49
Preliminary Results
q Internal Pattern of Writes without Failure
Ø [Figure: accumulated number of bytes (KB) written to the different nodes over the course of the workloads]
Preliminary Results
q Post-Failure Behavior
Ø emulate a whole-device failure on the MDS/MDT node
Ø run operations on Lustre after the emulated device failure
v dd-nosync uses dd to create and extend a Lustre file (buffered writes)
v dd-sync enforces synchronous writes on the dd command
v the last column shows whether the operation reported an error (see the sketch below)

Operation       Description                     Report Error?
lfs setstripe   set striping pattern            No
dd-nosync       create & extend a Lustre file   No
dd-sync         create & extend a Lustre file   Yes
LFSCK           check & repair Lustre           No
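The dd-nosync/dd-sync contrast likely comes down to buffered versus synchronous I/O: a buffered write can return successfully once the data is in the client's page cache, while a synchronous write must reach the failed device and therefore surfaces the error. The small Python sketch below illustrates the distinction; the function name and file path are hypothetical.

```python
# Hypothetical illustration of why only synchronous writes report the
# failure: without a flush, write() can succeed against the page cache
# even though the backing device is gone.
import os

def write_file(path: str, data: bytes, sync: bool) -> None:
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.write(fd, data)   # buffered: may return once data is cached
        if sync:
            os.fsync(fd)     # like dd conv=fsync: forces write-back to the
                             # device and raises OSError (EIO) if it failed
    finally:
        os.close(fd)
```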
Conclusion and Future Work
q Proposed and prototyped a generic framework for testing the failure handling of large-scale parallel file systems
q Uncovered internal I/O behaviors of Lustre under different workloads, in both normal and failure conditions
q Future work:
Ø more effective post-failure checking operations
Ø support for more file systems (e.g., PVFS, Ceph)
Ø novel mechanisms to enhance the resilience of large-scale parallel file systems