
BabuDB: Fast and Efficient File System Metadata Storage
Jan Stender, Björn Kolbeck, Felix Hupfeld, Mikael Högqvist
Zuse Institute Berlin · Google GmbH Zurich
SNAPI 2010 · Jan Stender

Motivation
– Modern parallel / distributed file systems:
  – Huge numbers of files and directories
  – Many storage servers but few metadata servers
  – Examples: Lustre, Panasas ActiveScale, Google File System
– Metadata access critical wrt. system performance
  – ~75% of all file system calls are metadata accesses
  – Metadata servers are bottlenecks

Motivation
– B-tree-like data structures used for metadata storage
  – ZFS, btrfs, Lustre, PVFS2
– Downsides:
  – Hard to implement and test, high code complexity
  – Multi-version B-trees even more complex
  – On-disk re-balancing expensive

BabuDB
– Key-value store
– FS metadata: key-value pairs stored in DB indices

BabuDB: Index

Example

Example: Insertions

Example: Lookups

Example: Deletions

Example: Range Lookups

Example: Checkpoints

On-disk Index
– Sorted by keys
– Block index in RAM, blocks mmap'ed
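
The layout on this slide can be illustrated with a minimal Python sketch. It assumes fixed-size blocks of sorted key-value pairs and an in-RAM list holding the first key of each block; plain lists stand in for the mmap'ed file, and the names `build_blocks` and `BlockIndex` are hypothetical, not part of BabuDB.

```python
from bisect import bisect_right

def build_blocks(sorted_items, block_size):
    """Split sorted (key, value) pairs into fixed-size blocks."""
    return [sorted_items[i:i + block_size]
            for i in range(0, len(sorted_items), block_size)]

class BlockIndex:
    """In-RAM index holding the first key of every block (hypothetical name)."""

    def __init__(self, blocks):
        self.blocks = blocks  # assumption: stand-in for the mmap'ed blocks
        self.first_keys = [block[0][0] for block in blocks]

    def lookup(self, key):
        # Step 1: binary search in RAM for the last block whose first key <= key.
        pos = bisect_right(self.first_keys, key) - 1
        if pos < 0:
            return None  # key sorts before the first block
        # Step 2: binary search inside that single block.
        block = self.blocks[pos]
        keys = [k for k, _ in block]
        i = bisect_right(keys, key) - 1
        if i >= 0 and keys[i] == key:
            return block[i][1]
        return None

items = [("a", 1), ("c", 3), ("e", 5), ("g", 7), ("i", 9)]
index = BlockIndex(build_blocks(items, block_size=2))
print(index.lookup("g"), index.lookup("d"))
```

Only the block index is searched in memory; at most one block has to be touched per lookup, which is the point of keeping the block index in RAM.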

BabuDB: Related Work
– Inspired by log-structured merge trees (LSM-trees)
  – Only one on-disk index
  – No "rolling merge"
– Made popular by Google Bigtable
  – Insert/lookup/merge similar to Bigtable's Tablets
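
A minimal Python sketch of an LSM-tree-style index with a single on-disk index, in the spirit of the bullets above. It assumes an in-memory overlay that absorbs writes, tombstones for deletions, and a checkpoint that merges the overlay into a new sorted index. The class and method names are hypothetical; this is not BabuDB's actual API or file format.

```python
from bisect import bisect_left

_TOMBSTONE = object()  # marks a deleted key in the in-memory overlay

class LsmIndex:
    """Toy index: one immutable sorted 'on-disk' index plus an in-memory overlay."""

    def __init__(self):
        self.disk_keys = []    # sorted keys; stand-in for the on-disk index
        self.disk_values = []  # values, parallel to disk_keys
        self.overlay = {}      # recent inserts and deletions

    def insert(self, key, value):
        self.overlay[key] = value

    def delete(self, key):
        self.overlay[key] = _TOMBSTONE

    def lookup(self, key):
        # The overlay holds the newest state, so it is checked first.
        if key in self.overlay:
            value = self.overlay[key]
            return None if value is _TOMBSTONE else value
        i = bisect_left(self.disk_keys, key)
        if i < len(self.disk_keys) and self.disk_keys[i] == key:
            return self.disk_values[i]
        return None

    def _merged(self):
        # Simplification: materializes the full merge; a real store would
        # merge two sorted iterators instead.
        merged = dict(zip(self.disk_keys, self.disk_values))
        merged.update(self.overlay)
        live = [(k, v) for k, v in merged.items() if v is not _TOMBSTONE]
        return sorted(live, key=lambda kv: kv[0])

    def range_lookup(self, start, end):
        """Return live (key, value) pairs with start <= key < end, in key order."""
        return [(k, v) for k, v in self._merged() if start <= k < end]

    def checkpoint(self):
        # Write a new sorted index and start with an empty overlay.
        live = self._merged()
        self.disk_keys = [k for k, _ in live]
        self.disk_values = [v for _, v in live]
        self.overlay = {}

db = LsmIndex()
db.insert("a", 1)
db.insert("b", 2)
db.checkpoint()
db.insert("c", 3)
db.delete("a")
db.insert("b", 20)
print(db.range_lookup("a", "z"))
```

Because there is only one on-disk index, a checkpoint replaces it wholesale rather than performing the incremental "rolling merge" of a classic LSM-tree.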

BabuDB: Metadata Mapping
– Mapping a hierarchical directory tree to a flat database index
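
One possible mapping, sketched in Python: each entry is keyed by the pair (parent directory ID, file name), so all entries of a directory are adjacent in key order and a directory listing becomes a single range lookup. The key layout, the numeric IDs and the class name `FlatMetadataIndex` are assumptions made for illustration, not necessarily the mapping used by BabuDB or XtreemFS.

```python
from bisect import bisect_left

class FlatMetadataIndex:
    """Sorted flat index mapping (parent_id, name) -> metadata (hypothetical)."""

    def __init__(self):
        self.keys = []    # sorted (parent_id, name) tuples
        self.values = []  # metadata dicts, parallel to self.keys

    def put(self, parent_id, name, metadata):
        key = (parent_id, name)
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = metadata       # overwrite existing entry
        else:
            self.keys.insert(i, key)
            self.values.insert(i, metadata)

    def stat(self, parent_id, name):
        key = (parent_id, name)
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None

    def readdir(self, parent_id):
        # All entries of one directory form a contiguous run of keys that
        # starts at (parent_id, ""), so one forward scan lists the directory.
        i = bisect_left(self.keys, (parent_id, ""))
        entries = []
        while i < len(self.keys) and self.keys[i][0] == parent_id:
            entries.append((self.keys[i][1], self.values[i]))
            i += 1
        return entries

fs = FlatMetadataIndex()
fs.put(1, "home", {"id": 2, "type": "dir"})
fs.put(1, "etc", {"id": 3, "type": "dir"})
fs.put(2, "notes.txt", {"id": 4, "type": "file", "size": 120})
fs.put(2, "a.txt", {"id": 5, "type": "file", "size": 7})
print([name for name, _ in fs.readdir(2)])
```

With this layout, `readdir` returns names together with their metadata from one contiguous key range, so a combined listing and stat needs no extra lookups.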

BabuDB: Advantages
– Why BabuDB for File System Metadata?
  – Short-lived files
    ▪ 50% of all files deleted within 5 minutes
  – Atomic file system operations w/o locking or transactions
    ▪ e.g. rename
  – Directory content in contiguous disk regions
    ▪ Efficient readdir + stat
  – Snapshots
    ▪ No need for multi-version data structures

BabuDB: Evaluation
– Linux kernel build
  – ~10M calls: 44% stat, 40% open, 15% readlink, 1% others
  – [Bar chart: kernel build, time in seconds, BabuDB vs. ext4]
– Dovecot mail server + imaptest
  – ~2M calls: 51% stat, 48% open, 1% others
  – [Bar chart: Dovecot test, time in seconds, BabuDB vs. ext4]

BabuDB: Evaluation
– Listing directory content

Summary
– BabuDB is ...
  – an efficient key-value store
  – optimized for file system metadata but also suitable for other purposes
  – suitable for large-scale databases
  – available for Java and C++ under BSD license
  – used in the XtreemFS metadata server
– http://babudb.googlecode.com
– http://www.xtreemfs.org

Thank you for your attention!

Background: XtreemFS
– XtreemFS: a distributed replicated Internet file system
  – part of the XtreemOS research project
  – developed since 2006 by partners from Germany, Spain and Italy
– Object-based architecture:
  – MRC stores metadata
  – OSDs store pure file content as objects
  – Clients provide POSIX file system interface
– www.xtreemfs.org

The XtreemOS Project
– Research project funded by the European Commission
– 19 partners from Europe and China
– XtreemFS is the data management component
  – developed by ZIB, NEC HPC Europe, Barcelona Supercomputing Center and ICAR-CNR Italy
  – ~3 years of development
  – first public release in August 2008

XtreemFS: Overview
– What is XtreemFS?
  – a distributed and replicated POSIX-compliant file system
  – off-the-shelf servers, no expensive hardware
  – servers in Java, run on Linux / OS X / Solaris
  – client in C, runs on Linux / OS X / Windows
  – secure (X.509 and SSL)
  – easy to install and maintain
  – open source (GPL)

File System Landscape
– PC: ext3, ZFS, NTFS
– Network FS / Centralized: NFS, SMB, AFS/Coda
– Cluster FS / Data Center: Lustre, Panasas, GPFS, CEPH...
– Internet: Grid File System, GDM, GFarm, "gridftp"
