  1. Bringsel: A Tool for Measuring Storage System Reliability, Uniformity, Performance and Scalability
     John Kaitschuck, Cray Federal
     jkaitsch@cray.com
     CUG2007, May 2007

  2. Overview
     - Challenges in File Systems Testing and Technology
     - Points for Consideration
     - A Generalized Requirement Framework
     - Bringsel, Yet Another File System Benchmark?
     - Features
     - Examples
     - Sample Output
     - Testing/Taxonomy
     - Some Results
     - Possible Future Directions for Bringsel
     - Questions

  3. Challenges in File System Testing and Technology
     "If seven maids with seven mops
      Swept it for half a year,
      Do you suppose," the Walrus said,
      "That they could get it clear?" -- Lewis Carroll
     - Primary focus within the community, users and suppliers.
     - Reliability is rarely considered (implied/assumed).
     - Pace of hardware technology vs. system software.
     - Limits on testing, both temporal and hardware-wise.
     - Focus derived from RFP/SOW/facility breakdown.
     - Scaling; doing end-to-end testing.
     - Historical context: past vs. present.
     - Differing customer/user requirements.
     - Sometimes ideas ignore operational context.

  4. Points for Consideration (ranging from partial to full coverage)
     [1] Service specifics - APIs, documentation, security...
     [2] Reliability - given N bits, reflect N bits of content...
     [3] Uniformity - under load X, for period T...
     [4] Performance - provide high bandwidth, low latency...
     [5] Scalability - provide 1 -> 4 at the sizes required...

  5. A Generalized Requirement Framework
     ∑ Se_a + ∑ Re_b + ∑ Un_c + ∑ Pe_d + ∑ Sc_e
     - Where these elements take on a series of unique values, which are...
       - Defined by the facility.
       - Defined by the application(s).
       - Constrained by the technology/architecture (fs, dfs, pfs).
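
     Read literally, the five sums can be written out as below. The explicit index sets (one term per facility- or application-defined requirement) are an assumption added here for illustration; the slide only gives the five summed families of terms.

       % One possible explicit reading of the framework (index sets assumed)
       R_{\text{total}} =
           \sum_{a \in I_{Se}} Se_a
         + \sum_{b \in I_{Re}} Re_b
         + \sum_{c \in I_{Un}} Un_c
         + \sum_{d \in I_{Pe}} Pe_d
         + \sum_{e \in I_{Sc}} Sc_e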

  6. A Generalized Requirement Framework: Ideally
     ∑ Se_a + ∑ Re_b + ∑ Un_c + ∑ Pe_d + ∑ Sc_e
     [Slide diagram: benchmarks and technology mapped against the terms of the framework.]

  7. Bringsel, Yet Another File System Benchmark?
     - Plenty of existing benchmarks/utilities: bonnie++, iozone, filebench, perf, pdvt, ior, xdd, explode trace, etc.
     - Not all are "operationally inclusive" (mixed ops and blocks).
     - Most focus on separated metadata/data testing.
     - Need a known context: bringsel development started in ~1998, focused on HPTC, as a strictly part-time project.
     - Need code that is easy to modify, comment, extend and maintain, balancing simplicity and complexity.
     - Need code with a known utilization history (industry, NSF, other Federal sites).
     - Need to focus on a central point within user space for "nd" I/O.
     - Unique tools enable unique discoveries.
     - Diversification of available test programs.

  8. Features
     - Symmetric tree creation and population.
     - Multi-API support: POSIX, STREAM, MMAP, MPI_IO.
     - POSIX threads support (AD).
     - File checksums via HAVAL.
     - Directory walks across created structures.
     - Metadata loop measurements.
     - MSI support via MPI (MPP/clusters).
     - Mixed access types (RW, SR, etc.).
     - Mixed block sizes (16K, 1024K, etc.).
     - Remedial configuration file parsing.
     - Coordinated looping/iteration support.
     - Misc. functionality: truncation, async I/O, appending, etc.
     - Numerous reliability checks.
     - And, of course, bandwidth and IOPS performance measurement.

  9. Examples: Simple CLI Invocation
     General file operation:
       bringsel -T 4 -D /snarf/foo:1,2,2 -M -L -c -b 32 -S 100M alpha
     Directory walk:
       bringsel -T 4 -a sx -D /snarf/foo:1,2,2 -L

  10. Examples: Configuration File Utilization
      Sample configuration file:
        #
        # Comments begin with "#"
        #
        -T 4 -D /snarf/foo:1,2,2 -M -L -c -b 32 -S 100M alpha
        -T 4 -a sx -D /snarf/foo:1,2,2 -L
      Invocation:
        bringsel -C ./sample.cnf
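
      As a rough illustration of what "remedial configuration file parsing" might look like, the sketch below strips '#' comments and tokenizes each line into an argv-style vector that could feed the same option handling as the CLI. This is an assumed sketch, not bringsel's actual parser.

        /* Sketch (assumed, not bringsel source): turn each non-comment line
         * of a config file into an argv-style vector for option parsing. */
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        #define MAX_ARGS 64

        static int parse_line(char *line, char *args[])
        {
            int n = 0;
            char *tok = strtok(line, " \t\n");
            while (tok && n < MAX_ARGS) {
                args[n++] = tok;
                tok = strtok(NULL, " \t\n");
            }
            return n;
        }

        int main(int argc, char **argv)
        {
            FILE *fp = fopen(argc > 1 ? argv[1] : "sample.cnf", "r");
            char line[1024];
            if (!fp) { perror("fopen"); return 1; }

            while (fgets(line, sizeof line, fp)) {
                if (line[0] == '#' || line[0] == '\n')
                    continue;                      /* skip comments and blanks */
                char *args[MAX_ARGS];
                int n = parse_line(line, args);
                for (int i = 0; i < n; i++)        /* hand off to option handling */
                    printf("%s%c", args[i], i + 1 == n ? '\n' : ' ');
            }
            fclose(fp);
            return 0;
        }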

  11. Example: Parallel Directory Creation
      bringsel -T 4 -D /snarf/foo:1,2,2 -M -L -c -b 32 -S 100M alpha
      [Slide diagram: under /snarf/foo, thread T1 creates A0001 (mkdir + stat), a barrier is taken, threads T1/T2 create B0001/B0002 (mkdir + stat), another barrier, then threads T1-T4 create C0001/C0002 under each B directory, with tree limits MAXD = 20 and MAXB = 100.]
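
      A minimal sketch of the per-level pattern the diagram describes, assumed for illustration rather than taken from bringsel: each thread creates and stats its directory for the current level, then all threads synchronize at a barrier before descending (compile with -pthread).

        /* Sketch (assumed): one tree level of parallel mkdir/stat with a barrier.
         * Thread i creates <root>/C000<i>, stats it, then waits at the barrier. */
        #include <pthread.h>
        #include <stdio.h>
        #include <sys/stat.h>
        #include <sys/types.h>

        #define NTHREADS 4

        static pthread_barrier_t level_barrier;

        struct work { const char *root; int idx; };

        static void *make_level(void *arg)
        {
            struct work *w = arg;
            char path[4096];
            struct stat sb;

            snprintf(path, sizeof path, "%s/C%04d", w->root, w->idx);
            if (mkdir(path, 0755) != 0)
                perror("mkdir");                    /* reliability check */
            if (stat(path, &sb) != 0)
                perror("stat");

            pthread_barrier_wait(&level_barrier);   /* sync before next level */
            return NULL;
        }

        int main(void)
        {
            pthread_t tid[NTHREADS];
            struct work w[NTHREADS];

            pthread_barrier_init(&level_barrier, NULL, NTHREADS);
            for (int i = 0; i < NTHREADS; i++) {
                w[i] = (struct work){ "/snarf/foo/A0001", i + 1 };
                pthread_create(&tid[i], NULL, make_level, &w[i]);
            }
            for (int i = 0; i < NTHREADS; i++)
                pthread_join(tid[i], NULL);
            pthread_barrier_destroy(&level_barrier);
            return 0;
        }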

  12. Example: Metadata Loop Operations
      bringsel -T 4 -D /snarf/foo:1,2,2 -M -L -c -b 32 -S 100M alpha
      [Slide diagram: threads T1-T4 each loop over a temporary file (tmp_file1..tmp_file4) in directory A0001, cycling through open, close, stat, rename, mkdir, chmod and utime, with an error check on each operation.]
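
      A rough sketch of one iteration of such a metadata loop, assumed for illustration (the operation set is taken from the slide; error handling and cleanup are simplified):

        /* Sketch (assumed): one metadata-loop iteration on a per-thread temp
         * file, exercising open/close/stat/rename/mkdir/chmod/utime. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/stat.h>
        #include <sys/types.h>
        #include <unistd.h>
        #include <utime.h>

        static int md_iteration(const char *dir, int tid)
        {
            char f1[4096], f2[4096], d1[4096];
            struct stat sb;

            snprintf(f1, sizeof f1, "%s/tmp_file%d",   dir, tid);
            snprintf(f2, sizeof f2, "%s/tmp_file%d.r", dir, tid);
            snprintf(d1, sizeof d1, "%s/tmp_dir%d",    dir, tid);

            int fd = open(f1, O_CREAT | O_WRONLY, 0644);
            if (fd < 0 || close(fd) != 0)  return -1;   /* error check per op */
            if (stat(f1, &sb) != 0)        return -1;
            if (rename(f1, f2) != 0)       return -1;
            if (mkdir(d1, 0755) != 0)      return -1;
            if (chmod(f2, 0600) != 0)      return -1;
            if (utime(f2, NULL) != 0)      return -1;

            unlink(f2);                    /* clean up so the loop can repeat */
            rmdir(d1);
            return 0;
        }

        int main(void)
        {
            return md_iteration("/snarf/foo/A0001", 1) == 0 ? 0 : 1;
        }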

  13. Example: File Operations
      bringsel -T 4 -D /snarf/foo:1,2,2 -M -L -c -b 32 -S 100M alpha
      [Slide diagram: threads T1-T4 each write a 100 MB file (alpha_0001..alpha_0004) in directory A0001 in 32 KB blocks through the POSIX interface: open, write (with error check), close, checksum.]
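
      A minimal sketch of the write-then-verify pattern, assumed for illustration: write the file in fixed-size blocks, then re-read it and compute a checksum. Bringsel itself uses HAVAL digests; a trivial byte-sum stands in here to keep the sketch short.

        /* Sketch (assumed): write total_size bytes in block_size chunks,
         * then re-read and compute an illustrative checksum. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <unistd.h>

        static int write_and_check(const char *path, size_t bsize, size_t total)
        {
            char *buf = malloc(bsize);
            if (!buf) return -1;
            memset(buf, 'A', bsize);

            int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
            if (fd < 0) { free(buf); return -1; }
            for (size_t done = 0; done < total; done += bsize) {
                if (write(fd, buf, bsize) != (ssize_t)bsize) {   /* error check */
                    close(fd); free(buf); return -1;
                }
            }
            close(fd);

            unsigned long sum = 0;                  /* checksum pass */
            fd = open(path, O_RDONLY);
            if (fd < 0) { free(buf); return -1; }
            ssize_t n;
            while ((n = read(fd, buf, bsize)) > 0)
                for (ssize_t i = 0; i < n; i++)
                    sum += (unsigned char)buf[i];
            close(fd);
            free(buf);

            printf("%s: checksum %lu\n", path, sum);
            return 0;
        }

        int main(void)
        {
            /* 32 KB blocks, 100 MB file, mirroring "-b 32 -S 100M" */
            return write_and_check("alpha_0001", 32 * 1024, 100UL * 1024 * 1024);
        }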

  14. Example: Sequence of Operations
      bringsel -T 4 -D /snarf/foo:1,2,2 -M -L -c -b 32 -S 100M alpha
      [Slide diagram: starting at /snarf/foo, the run proceeds level by level (A0001, then B0001/B0002 behind a barrier, then the C0001/C0002 directories behind further barriers), writing 4x 100 MB files at each step, until the tree is complete and results are reported.]

  15. Example: Directory Walk
      bringsel -T 4 -a sx -D /snarf/foo:1,2,2 -L
      [Slide diagram: threads T1-T4 walk the previously created tree (A0001, B0001/B0002, C0001/C0002, each populated with 4x 100 MB files) using opendir, readdir, stat (with error check), rewinddir and closedir.]
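
      A minimal sketch of a single-directory pass using the operations named on the slide, assumed for illustration rather than taken from bringsel:

        /* Sketch (assumed): walk one directory with
         * opendir/readdir/stat/rewinddir/closedir. */
        #include <dirent.h>
        #include <stdio.h>
        #include <sys/stat.h>

        static int walk_dir(const char *dir)
        {
            DIR *dp = opendir(dir);
            if (!dp) { perror("opendir"); return -1; }

            struct dirent *de;
            int entries = 0;
            while ((de = readdir(dp)) != NULL) {
                char path[4096];
                struct stat sb;
                snprintf(path, sizeof path, "%s/%s", dir, de->d_name);
                if (stat(path, &sb) != 0)   /* reliability/error check */
                    perror("stat");
                entries++;
            }

            rewinddir(dp);                  /* reset for a repeat pass */
            closedir(dp);
            printf("%s: %d entries\n", dir, entries);
            return 0;
        }

        int main(void)
        {
            return walk_dir("/snarf/foo/A0001");
        }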

  16. Example: Hash Trees
      [Slide diagram: backup and restore through a repository between two copies of the tree, one under /snarf/foo and one under /snarf/widget, each with the A0001, B0001/B0002, C0001/C0002 structure.]

  17. Example: Hash Tree Formulation
      bringsel -T 4 -a ds -D /snarf/foo:1,2,2
      Per-directory digests are stored in hidden files (.bringsel_sd01) and combined bottom-up:
        B = H(f1, f2, f3, f4)
        C = H(f1, f2, f3, f4)
        A = H(f1, f2, f3, f4, B, C)
        V = H(f1 ... fn, d1 ... dn)
      where H() is SHA-256.
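
      The sketch below shows the spirit of that formulation: fold per-file digests and child-directory digests into a parent digest with SHA-256. It is an assumed illustration (using OpenSSL's SHA256(), link with -lcrypto), not bringsel's actual hash-tree code, and the leaf digests are zero-filled stand-ins.

        /* Sketch (assumed): A = H(f1..f4, B, C) with H = SHA-256. */
        #include <openssl/sha.h>
        #include <stdio.h>
        #include <string.h>

        /* Hash the concatenation of the given digests into 'out'. */
        static void combine_digests(const unsigned char in[][SHA256_DIGEST_LENGTH],
                                    int count,
                                    unsigned char out[SHA256_DIGEST_LENGTH])
        {
            unsigned char buf[16 * SHA256_DIGEST_LENGTH];
            size_t len = 0;
            for (int i = 0; i < count && i < 16; i++) {
                memcpy(buf + len, in[i], SHA256_DIGEST_LENGTH);
                len += SHA256_DIGEST_LENGTH;
            }
            SHA256(buf, len, out);
        }

        int main(void)
        {
            /* Stand-ins for four file digests plus child digests B and C. */
            unsigned char leaves[6][SHA256_DIGEST_LENGTH] = {{0}};
            unsigned char A[SHA256_DIGEST_LENGTH];

            combine_digests(leaves, 6, A);  /* A = H(f1..f4, B, C) */
            for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
                printf("%02x", A[i]);
            printf("\n");
            return 0;
        }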

  18. Sample Raw Output
      Standard file operations: Date/Time, MD Time, Etime, MBps, Op/Size, Thread/Iter, Opn Lat, IOPs, Error?
      Directory walk: Date/Time, MD Time, Etime, File Cnt, Dir Cnt, Op/Dir, Thread/Iter, Sym Cnt, Error?

  19. Testing/Taxonomy
      Test cases are labeled with two tuples, e.g. [ 24 : 1 : 1 : 1,0 : 0,0 ] POS : CR : 64K : 310M
      - [ Nodes : Threads per Node : Directory : Serial Access (# files, str/seg) : Parallel Access (# files, str/seg) ]
      - Interface : Operation : Block Size : File Size

  20. Sample Results: Reliability
      Of 25 tests:
      - ~350 TB of data written without corruption or access failures.
      - No major hardware failures in ~90 days of operation.
      - All checksums valid.
      - Early SLES9 NFS client problems under load, detected and corrected via patch (735130).
      - 1 FC DDU failure, without data loss.
      - Spatial utilization from 0% to 100%+ across the various test cases.
      - Test case durations from several minutes to several days.

  21. Sample Results: Uniformity
      ~10% variation across a 12.5-hour run.
      [ 24 : 1 : * : 1,0 : 0,0 ] POS:CR:64K:310M - SLES9 2.6.5-7.244 with 6x 802.3ad

  22. Sample Results: Scalability
      [ VAR : 1 : 1 : 1,0 : 0,0 ] POS:RW:VAR:500M - SLES10 2.6.16.21-0.8 with 6x Dedicated @ 0% Spatial Utilization
      [Surface plot: aggregate MBps (0-500) versus block size (16K to 1024K) and number of nodes (2 to 22).]

  23. Some Possible Future Directions for Bringsel
      - Code refinement and documentation.
      - Tree discovery / tree limits.
      - UPC support.
      - Adding and pruning directories in the configuration file.
      - Selectable horizontal/vertical barriers.
      - Fault injection.
      - Parser refinements.
      - Modules to support tracing output, at either the VFS or library level.
      - Better visualization methods (external).
      - Long term, an automated style driver (external).

  24. Questions?
