Evaluating selected cluster file systems with Parabench


1. Evaluating selected cluster file systems with Parabench
Internship report
Authors: Marcel Krause; Jens Schlager, B.A.
Tutors: Olga Mordvinova, Julian M. Kunkel
Winter term 2009–2010

Overview
- Test Scenarios (next)
- Test Patterns
- Cluster Hardware
- Benchmark Results
- Workflow Optimization
- Conclusion

2. Test Scenarios
- Original plan: test all combinations with up to 8 nodes (a shell sketch of this test matrix follows below):
  S = {2, 4 servers} × {1, 2, 4 clients} × {100, 1k, 10k, 100k iterations} × {4 patterns}
  |S| = 96 tests
- Reality: availability and stability problems
  - Usually no more than 7 nodes available
  - Test duration limits for slow file systems

Test Scenarios (continued)
- { server nodes } ∩ { client nodes } = Ø
- Configuration files:
  - OCFS: software RAID (NBD) as "servers"; not really servers, because all logic runs in the clients
  - GlusterFS: defaults + server/client count
  - Ceph: example config + server/client count; all servers held data and metadata
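A minimal shell sketch of the planned test matrix above; the loop variables come from the slide, while the pattern names are only shorthand for the four test patterns and the actual tests were generated by the utilities described under Workflow Optimization.

    # Enumerate the planned 2 x 3 x 4 x 4 = 96 combinations, one line per test
    # (illustrative only; pattern names are shorthand, not the real script names).
    for servers in 2 4; do
      for clients in 1 2 4; do
        for iterations in 100 1000 10000 100000; do
          for pattern in create-index delete-index index-index basic-ops; do
            echo "servers=$servers clients=$clients iterations=$iterations pattern=$pattern"
          done
        done
      done
    done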

3. Overview
- Test Scenarios
- Test Patterns (next)
- Cluster Hardware
- Benchmark Results
- Workflow Optimization
- Conclusion

Test Patterns
- 3 patterns reflecting an OLAP (Online Analytical Processing) engine
  - Business Intelligence: not actually HPC, but also data-intensive
- 1 synthetic pattern: Basic Operations Test (includes sequential read/write)
- All patterns written in the Parabench language

4. Test Patterns
- Create index:
  - Generates the initial index directory structure and builds the index configuration files
  - Operations: mkdir, write, rename, delete (a rough shell illustration follows below)
- Delete index:
  - Deletes data directories, updates metadata
  - Operations: delete, rmdir, write

Test Patterns (continued)
- Index index:
  - Fills the created index with data
  - Operations: read, write
- Solo part: repeat all formerly distributed operations on one single client
- Basic Operations Test:
  - write, read, append, rename, delete
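The pattern scripts themselves are written in the Parabench language and are not reproduced here; as a rough illustration only, one create-index step could be expressed with ordinary shell commands like the following (paths and file sizes are made up, only the operation sequence comes from the slide above).

    mkdir -p index/part-0                                                      # mkdir
    dd if=/dev/zero of=index/part-0/config.tmp bs=1K count=60 2> /dev/null     # write (~60 KB)
    mv index/part-0/config.tmp index/part-0/config.xml                         # rename
    rm -f index/part-0/config.old                                              # delete (ignored if absent)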

5. Overview
- Test Scenarios
- Test Patterns
- Cluster Hardware (next)
- Benchmark Results
- Workflow Optimization
- Conclusion

Cluster Hardware
Highlights:
- 2x Intel Xeon 2 GHz
- 1 GB DDR-RAM
- 80 GB IDE HDD
- 2x Gigabit Ethernet ports (one in use)
- Intel 82545EM Gigabit Ethernet controller

Special hardware on nodes 01–05:
- Promise FastTrack TX2300 RAID controller
- RAID0 (striping) of two 160 GB SATA-II HDDs

6. Theoretical Throughput

Create index: write ~60 KB
  clients   aggregate   each
  1         55.5        55.5
  2         111.0       55.5
  4         186.3       46.6

Index index: write ~4.8 KB
  clients   aggregate   each
  1         8.2         8.2
  2         16.4        8.2
  4         32.0        8.0

Index index: read ~5.4 KB
  clients   aggregate   each
  1         9.1         9.1
  2         18.3        9.1
  4         35.4        8.9

- All numbers in MiB/s
- Throughput reduction: switch limit (explained below)
- Index index: too little data to gain momentum

Theoretical Throughput (continued)
- Calculations based on "Performance Analysis of the PVFS2 Persistency Layer" by Julian M. Kunkel, because it is the same cluster
- Assuming optimal read-ahead and maximum write buffering
- Throughput reduced by the switch limit (more network traffic than the switch can handle)

7. Overview
- Test Scenarios
- Test Patterns
- Cluster Hardware
- Benchmark Results (next)
- Workflow Optimization
- Conclusion

Benchmark Results
- See the detailed report for:
  - Throughput comparison
  - Each operation's duration
  - Each test's theoretical duration
  - Comparison by block size (basic operations)
  - => lots of precise numbers
- In these slides: a plain and simple comparison by test duration with 4 servers and 1 client
- Basic Operations Test: partially estimated for 1k iterations

8. Benchmark Results

Create                100 it.     10k it.
  OCFS                2s          3m 19s
  Gluster             17s         28m 23s
  Ceph                6m 46s      (~11h)

Delete                100 it.     10k it.
  OCFS                2s          6m 55s
  Gluster             20s         32m 54s
  Ceph                9m 12s      (~15h)

Index                 100 it.     10k it.
  OCFS                4s          (~42m)
  Gluster             38s         80m 9s
  Ceph                32m 32s     (~2d 6h)

Index                 Real it.    1k it.
  OCFS                1 000       7m 21s
  Gluster             100         13m 50s
  Ceph                10          16m 40s

Overview
- Test Scenarios
- Test Patterns
- Cluster Hardware
- Benchmark Results
- Workflow Optimization (next)
- Conclusion

9. Workflow Optimization
- Many tests, few variables:
  - Number of servers
  - Number of clients
  - Number of iterations
- => Utilities to...
  - prepare and clean up the test environment
  - generate Parabench scripts from templates
  - run them, collect data, reformat for OpenOffice

File System Management Scripts
- Specific management scripts for each FS
- ./<f>.sh start <s> <c> (a sketch of the role-assignment step follows below)
  - Initializes the test environment for file system <f> with <s> servers and <c> clients
  - Node roles are appointed dynamically, based on the list in available_nodes.txt
  - Summarizes the hundreds of vacuous system messages into a simple status report
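The per-FS scripts themselves are not reproduced here; the following is a minimal sketch of what the start step could look like, assuming one node name per line in available_nodes.txt. The file names available_nodes.txt, server_nodes and client_nodes come from the slides; the daemon command (fs-daemon) and the ssh-based loop are assumptions.

    #!/bin/bash
    # Hypothetical sketch of "./<f>.sh start <s> <c>": assign node roles and
    # start the servers, reducing their output to one status line per node.
    servers=$1    # number of server nodes
    clients=$2    # number of client nodes

    # Disjoint role assignment: first <s> nodes become servers, next <c> clients.
    head -n "$servers" available_nodes.txt > server_nodes
    tail -n +"$((servers + 1))" available_nodes.txt | head -n "$clients" > client_nodes

    while read -r node; do
      if ssh "$node" "/etc/init.d/fs-daemon start" > /dev/null 2>&1; then
        echo "$node: server started"
      else
        echo "$node: start FAILED"
      fi
    done < server_nodes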

10. File System Management Scripts
- ./<f>.sh stop
  - Stops all servers and clients, based on the lists in the server_nodes and client_nodes files generated by "start"
- HTTP notification (a sketch follows below):
  - After completing their operations, the scripts can notify the tester's web server, which in our case then sent us an XMPP instant message

FS-specific Features
- OCFS: ocfs2.sh
  - Creates or assembles the RAID
  - Recreates or cleans the OCFS2 file system (automatically guesses the quicker method)
- GlusterFS: gluster-manager.sh
  - Generates and distributes the config files
  - Starts the servers in parallel, because the Gluster servers take ages to start
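A minimal sketch of the HTTP notification step, assuming the tester's web server accepts a plain GET request; the URL, its parameters and the helper name notify are made up, and the web-server-to-XMPP forwarding is not shown.

    # Hypothetical notification helper called at the end of a script run.
    notify() {
      wget -q -O /dev/null "http://tester.example.org/notify?host=$(hostname)&msg=$1" \
        || echo "notification failed" >&2
    }

    notify "benchmark+finished"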

11. FS-specific Features
- Ceph basics: ceph-helper.sh
  - Modified version of Dennis Runz's start script
- Ceph wrapper: manage-ceph.sh
  - Wrapper for ceph-helper.sh; output is filtered to prevent a message flood
  - Generates and distributes the config files
  - Simplifies init, start, mount, umount, stop, and clean to just start and stop

FS-independent Utilities
- paralog.pl (used in wiz.sh)
  - Copies Parabench's output to files, like tee
  - Stops Parabench when it reports errors, to prevent a message flood
- results/sumtimes.pl (used in wiz.sh)
  - Collects Parabench's time files
  - Calculates minimum, maximum, and average (a rough illustration follows below)
  - Reformats them for copy & paste into OO Calc
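sumtimes.pl itself is a Perl script and is not reproduced here; the min/max/average step it performs could be sketched with an awk one-liner like the following, assuming one measured time per line in the collected time files (the results/*.time glob is an assumption).

    # Illustrative only: aggregate all collected times into min, max and average.
    cat results/*.time | awk '
      NR == 1 { min = $1; max = $1 }
      { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
      END { printf "min=%s max=%s avg=%.3f\n", min, max, sum / NR }'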

12. FS-independent Utilities
- ./wiz.sh <t> <i> (an outline of a run follows below)
  - Prepares the test environment
  - Generates the Parabench script for test <t> with <i> iterations per client
  - Runs the test with Parabench and paralog.pl
  - Runs the solo part, if applicable
  - Notifies the tester's web server (again, XMPP)
  - Gathers results (sumtimes.pl)
  - Displays a wall-time summary

Overview
- Test Scenarios
- Test Patterns
- Cluster Hardware
- Benchmark Results
- Workflow Optimization
- Conclusion (next)
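The listing below is only an outline of the steps a wiz.sh run goes through, written in shell form; apart from paralog.pl and results/sumtimes.pl, every command name (including how Parabench itself is invoked) is an assumption, not the real script.

    #!/bin/bash
    # Hypothetical outline of "./wiz.sh <t> <i>".
    test=$1          # test pattern name
    iterations=$2    # iterations per client

    ./prepare-environment.sh                                 # assumed helper
    ./generate-script.sh "$test" "$iterations" > test.pb     # assumed template step
    ./parabench test.pb 2>&1 | ./paralog.pl                  # run, log, stop on errors
    ./results/sumtimes.pl results/                           # min/max/avg for OO Calc
    notify "test+$test+finished"                             # HTTP -> XMPP (see above)
    echo "wall time: $SECONDS s"                             # simple wall-time summary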

13. Conclusion
- Ceph seems to scale almost linearly
  - but it is very slow for our test patterns
- GlusterFS seems to scale almost linearly too, at acceptable speed, but
  - reports "no space left on device" when only a few % of the space are used
- OCFS was the fastest FS, but
  - seems to have a limit on the number of files
  - scales non-linearly
- => OCFS clearly wins these tests
  - but GlusterFS might win for larger data sets

Thank you for listening.
For sources, see the detailed report.
