What a Lustre Cluster (Improving and Tracing Lustre Metadata)
Team Saffron: Amanda Bonnie, Zach Fuerst, Thomas Stitt
Overview
● Motivation
● Configuration
● Tracing Metadata
● Improving Metadata Hardware
● Multiple Lustre Clients via Virtualization
● Conclusions & Future Work
Motivation
● Tracing Metadata
  ○ Can we get enough information without too much overhead?
● Improving Metadata Hardware
  ○ The MDS can be a performance bottleneck
  ○ Faster MDT ☞ better performance?
● Lustre Client Virtualization
  ○ A single Lustre client per node underutilizes the IB device
  ○ Higher throughput ☞ fewer transfer agents needed
  ○ Multi-VM nodes ☞ better throughput?
Lustre Configuration
● TAMIRS
  ○ MASTER (sa-master)
  ○ 4 X OSS (sa02-sa05)
    ■ Single disk RAID0
  ○ 1 X MGS/MDS (sa01)
    ■ hdd, nvme, KOVE
  ○ 5 X CLIENTS (sa06-sa10)
● PROBE
  ○ MASTER (n01)
  ○ 5 X OSS (n02-n05, n11)
    ■ 8 disk RAID0
  ○ 1 X MGS/MDS (n06)
  ○ 2 X CLIENTS (n07-n08)
  ○ 2 X VM CLIENTS (n09-n10)
[Diagram: MASTER, MGS/MDS (MDT), OSS (OST), and CLIENT roles]
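To make the roles above concrete, here is a minimal sketch of how an MGS/MDS, OSS, and client are typically formatted and mounted with mkfs.lustre and mount. It is not the team's actual setup script; the device paths, the fsname "saffron", and the o2ib NID are assumptions.

    # Hedged sketch of bringing up the Lustre roles listed above.
    # Device paths, fsname "saffron", and node NIDs are assumptions.
    import subprocess

    def run(cmd):
        """Echo a command, then run it, failing loudly on error."""
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # On the MGS/MDS node (e.g. sa01): format and mount the combined MGT/MDT.
    run(["mkfs.lustre", "--fsname=saffron", "--mgs", "--mdt", "--index=0", "/dev/sdb"])
    run(["mount", "-t", "lustre", "/dev/sdb", "/mnt/mdt"])

    # On each OSS node (e.g. sa02-sa05): format and mount one OST.
    run(["mkfs.lustre", "--fsname=saffron", "--ost", "--index=0",
         "--mgsnode=sa01@o2ib", "/dev/sdb"])
    run(["mount", "-t", "lustre", "/dev/sdb", "/mnt/ost0"])

    # On each client (e.g. sa06-sa10): mount the file system over InfiniBand.
    run(["mount", "-t", "lustre", "sa01@o2ib:/saffron", "/mnt/lustre"])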
MDS Tracing
Tracing Metadata
● Test tool: mdtest
● Tracers
  ○ Lustre Debug
  ○ ftrace (via debugfs)
● Mask
  ○ ftrace - create, open, link, unlink, readdir, getattr, setattr
  ○ Lustre Debug - no mask
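A minimal sketch of the two tracing paths follows. The ftrace filter globs, the debugfs mount point, and the mdtest arguments are assumptions, not the exact masks used in these experiments.

    # Hedged sketch: enable Lustre Debug or an ftrace function filter,
    # then drive metadata load from a client with mdtest.
    import subprocess
    from pathlib import Path

    TRACING = Path("/sys/kernel/debug/tracing")

    def lustre_debug_on():
        # Lustre's built-in debug log; no mask, so turn on all debug flags.
        subprocess.run(["lctl", "set_param", "debug=-1"], check=True)

    def ftrace_on(patterns):
        # Kernel function tracer, restricted to metadata-related functions.
        (TRACING / "current_tracer").write_text("function")
        (TRACING / "set_ftrace_filter").write_text("\n".join(patterns))
        (TRACING / "tracing_on").write_text("1")

    if __name__ == "__main__":
        ftrace_on(["*create*", "*open*", "*link*", "*unlink*",
                   "*readdir*", "*getattr*", "*setattr*"])   # assumed globs
        # mdtest: create/stat/remove many files and directories per process.
        subprocess.run(["mdtest", "-n", "10000", "-i", "3",
                        "-d", "/mnt/lustre/mdtest"], check=True)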
Tracing Metadata - Results
[Chart: metadata performance under tracing; annotations: "ideal", "not too bad", "quite an overhead"]
MDS Hardware
Improving Metadata Hardware
● HDD
  ○ meh. (96.7 MB/s write & 206 MB/s read)
● NVMe
  ○ Fast! (686 MB/s write & 1.3 GB/s read)
● KOVE Express Disk (XPD)
  ○ RAM storage appliance
  ○ FAAAST! (2.8 GB/s write & 3.5 GB/s read)
Improving Metadata Hardware - Testing
● mdtest
  ○ Concerned with node caching (dropped caches!)
  ○ Performance still "low"
● mds-survey
  ○ Runs on the MGS/MDS
  ○ Independent of CLIENT and OSS nodes
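A hedged sketch of the caching concern and the mds-survey run follows: drop the page, dentry, and inode caches between mdtest iterations, then exercise the MDT directly from the MGS/MDS with mds-survey. The mds-survey variable names and values shown are assumptions, not the exact parameters used here.

    # Hedged sketch: drop client caches, then run mds-survey on the MGS/MDS.
    import os
    import subprocess

    def drop_caches():
        # 3 = free pagecache + dentries + inodes; requires root.
        with open("/proc/sys/vm/drop_caches", "w") as f:
            f.write("3\n")

    def run_mds_survey():
        env = dict(os.environ,
                   thrlo="4", thrhi="32",        # thread counts to sweep (assumed)
                   file_count="100000",          # files per thread (assumed)
                   dir_count="32",               # test directories (assumed)
                   targets="saffron-MDT0000")    # MDT to exercise (assumed)
        subprocess.run(["mds-survey"], env=env, check=True)

    if __name__ == "__main__":
        drop_caches()
        run_mds_survey()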
Improving Metadata Hardware - Results

                 hdd to nvme (%)   hdd to kove (%)   nvme to kove (%)
  create               19.57            20.12             0.46
  lookup               -1.67             0.99             2.70
  md_getattr           -0.12             4.72             4.85
  setxattr            287.45           244.46           -11.09
  destroy              43.45            46.83             2.36

Percent increase from HDD to NVMe, HDD to KOVE, and NVMe to KOVE
Lustre Client Virtualization
SR-IOV (Single Root I/O Virtualization)
Multiple Lustre Clients via Virtualization
● Enable SR-IOV
● KVM hypervisor with CentOS 6.6 VMs on top
● Attach n Virtual Functions (VFs) to the Physical Function (the device)
  ■ Virtual Functions are just interfaces
  ■ n ∈ [1, 11]
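A minimal sketch of setting the VF count is below. With the mlx4-era Mellanox driver the VF count is a module parameter (which is why changing it later requires a reboot); drivers that support it also expose a generic PCI sysfs knob. The paths and the interface name "ib0" are assumptions.

    # Hedged sketch: two ways to set the number of SR-IOV Virtual Functions.
    from pathlib import Path

    def set_vfs_modparam(n: int):
        # Persistent option for the mlx4_core driver; takes effect only after
        # the driver is reloaded (or the node is rebooted).
        Path("/etc/modprobe.d/mlx4.conf").write_text(f"options mlx4_core num_vfs={n}\n")

    def set_vfs_sysfs(n: int, iface: str = "ib0"):
        # Runtime knob exposed by SR-IOV-capable drivers that support it.
        numvfs = Path(f"/sys/class/net/{iface}/device/sriov_numvfs")
        numvfs.write_text("0")        # must reset to 0 before changing the count
        numvfs.write_text(str(n))

    if __name__ == "__main__":
        set_vfs_modparam(8)           # n ∈ [1, 11] in the experiments above

Each VF then appears as its own device and can be handed to a KVM guest as a PCI device, giving every VM what looks like its own IB interface.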
Testing Client Performance
● IOR
● Trinity test from NERSC
  ○ POSIX only
● N-to-N writes/reads
  ○ 44.7 GiB file per client
● 10K, 100K, 1MB transfer sizes
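For reference, a hedged sketch of this kind of IOR run: POSIX API, file-per-process (N-to-N), roughly 44.7 GiB per client, swept over the three transfer sizes. The host list, the MPI launcher, process count, and the output path are assumptions.

    # Hedged sketch of the IOR sweep described above.
    import subprocess

    CLIENTS = "n07,n08,n09,n10"             # assumed client list (native + VM)

    for xfer in ["10k", "100k", "1m"]:
        subprocess.run([
            "mpirun", "-np", "4", "--host", CLIENTS,
            "ior",
            "-a", "POSIX",                  # POSIX-only, per the Trinity test
            "-F",                           # file per process (N-to-N)
            "-w", "-r",                     # write phase, then read phase
            "-b", "45g",                    # ~44.7 GiB block per client
            "-t", xfer,                     # transfer size under test
            "-o", "/mnt/lustre/ior_testfile",
        ], check=True)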
IOR Write Results
[Plot: IOR write results; dashed lines are native installs]
IOR Read Results
[Plot: IOR read results; dashed lines are native installs]
VM Problems
● Hardware restrictions
  ○ More than 2 GB RAM needed
  ○ Only 12 physical cores
● IB Subnet Manager needed on the host
● VMware's ESXi hypervisor
  ○ Mellanox drivers for ESXi didn't support SR-IOV, only pass-through
  ○ Not free
Conclusions
● MDS Tracing
  ○ Either large overhead or not extensive enough
● MDS Hardware
  ○ Improvements << cost
● Virtualization of Clients
  ○ Scalable!
  ○ Worth further exploration
Future Work
● More virtualization!
  ○ Put VMs in a VM so we can virtualize our virtualization, allowing us to virtualize while we virtualize (and manage SR-IOV better)
    ■ Changing the number of VFs requires a reboot, which is slow
  ○ Greater number of VMs (>11)
● Local subnet on each host
● SR-IOV with verbs on ESXi
Acknowledgements
Mentors: Brad Settlemyer, Christopher Mitchell, Michael Mason
Instructors: Matthew Broomfield, Jarrett Crews
Administration: Carolyn Connor, Andree Jacobson, Gary Grider, Josephine Olivas
Questions?