
The Google File System, Joanna Świetlicka, October 13, 2010



  1. The Google File System. Joanna Świetlicka, October 13, 2010.

  2. Based on: S. Ghemawat, H. Gobioff, and S.-T. Leung: “The Google file system,” in Proceedings ACM SOSP 2003, Lake George, NY, USA, October 2003.

  3. Outline
     1. Design overview: Assumptions, Interface, Architecture, Single master, Chunk size, Metadata
     2. Interactions: Mutation mechanism, Additional operations
     3. Master operation
     4. Fault tolerance and diagnosis
     5. Measurements

  4. Assumptions: Frequent failures
     - Hundreds of machines built from inexpensive commodity parts
     - Component failures are the norm rather than the exception
     - Constant monitoring, error detection, fault tolerance, and prompt automatic recovery must be integral to the system

  5. Assumptions: Huge files
     - Modest number of large files
     - Multi-GB files are common
     - Small files are supported, but the system is not optimized for them
     - Design assumptions and parameters such as I/O operation and block sizes had to be revisited

  6. Assumptions: Writing
     - Mostly appending new data rather than overwriting existing data
     - Large, sequential writes
     - Once written, files are seldom modified again
     - Appending is the focus of performance optimization and atomicity guarantees

  7. Assumptions: Reading
     - Once written, files are only read, often only sequentially
     - Mostly large streaming reads and small random reads
     - Batching and sorting small reads to advance steadily through the file
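
The batching-and-sorting idea from the Reading slide can be illustrated with a minimal sketch. The request shape `(offset, length)` and the helper name `batch_reads` are hypothetical and chosen only for illustration; the slides do not describe how applications implement this.

```python
from typing import List, Tuple

# (byte offset, length): a hypothetical request shape, not a GFS type
ReadRequest = Tuple[int, int]

def batch_reads(requests: List[ReadRequest]) -> List[ReadRequest]:
    """Return a new list ordered by offset (ties broken by length)."""
    return sorted(requests)

# Small random reads collected by the application before issuing them
pending = [(900, 10), (40, 8), (512, 16)]
ordered = batch_reads(pending)
```

Issuing the reads in `ordered` sequence moves through the file front to back instead of seeking back and forth.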

  8. Assumptions: Concurrency
     - Files often used as producer-consumer queues or for many-way merging
     - Hundreds of producers concurrently append to a single file
     - The file may be read later, or a consumer may be reading through the file simultaneously
     - Atomicity with minimal synchronization overhead is essential

  9. Assumptions: Bandwidth vs. latency
     - High sustained bandwidth is more important than low latency
     - Most applications place a premium on processing data in bulk at a high rate
     - Few have stringent response time requirements for an individual read or write

  10. Interface
      - GFS doesn’t implement a standard API such as POSIX
      - Files are organized hierarchically in directories and identified by pathnames
      - Standard operations: create, delete, open, close, read, and write
      - Additional operations: snapshot and record append
        - Snapshot creates a copy of a file or a directory tree at low cost
        - Record append allows multiple clients to append data to the same file concurrently while guaranteeing the atomicity of each individual client’s append

  11. Architecture
      - A GFS cluster consists of a single master and multiple chunkservers and is accessed by multiple clients
      - Each of these is a commodity Linux machine running a user-level server process

  12. Files
      - Files are divided into fixed-size chunks
      - Each chunk is identified by a 64-bit chunk handle
      - Chunkservers store chunks on local disks as Linux files
      - Each chunk is replicated on multiple chunkservers (default: 3)
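
A minimal sketch of the chunk layout described on the Files slide, under stated assumptions: the 64 MB chunk size comes from the Chunk size slide, while the counter-based handle generator, the random replica placement, and all names (`new_chunk_handle`, `allocate`, `cs1` and so on) are hypothetical. The slides do not say how handles are generated or how replicas are placed.

```python
import itertools
import random
from typing import Dict, List

CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB, as given on the Chunk size slide
REPLICATION = 3                 # default replication factor from the slide

_handles = itertools.count(1)   # assumption: a simple counter stands in for handle assignment

def new_chunk_handle() -> int:
    """Return a unique handle that fits in 64 bits."""
    return next(_handles) & 0xFFFFFFFFFFFFFFFF

def chunks_needed(file_size: int) -> int:
    """Number of fixed-size chunks required to hold file_size bytes."""
    return (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE

def allocate(file_size: int, chunkservers: List[str],
             rng: random.Random) -> Dict[int, List[str]]:
    """Map each new chunk handle to REPLICATION distinct chunkservers.

    Random placement is an assumption made only for this sketch.
    """
    return {new_chunk_handle(): rng.sample(chunkservers, REPLICATION)
            for _ in range(chunks_needed(file_size))}

servers = ["cs1", "cs2", "cs3", "cs4", "cs5"]
layout = allocate(200 * 1024 * 1024, servers, random.Random(0))
```

A 200 MB file needs four chunks here, the last one only partly filled, and every chunk is assigned three distinct chunkservers.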

  13. Master
      - Maintains all file system metadata:
        - namespace
        - access control information
        - mapping from files to chunks
        - current locations of chunks
      - Controls system-wide activities:
        - chunk lease management
        - garbage collection of orphaned chunks
        - chunk migration between chunkservers
      - Periodically communicates with each chunkserver in HeartBeat messages to give it instructions and collect its state

  14. Communication
      - The GFS client communicates with the master and chunkservers to read or write data on behalf of the application
      - Clients interact with the master only for metadata operations
      - All data-bearing communication goes directly to the chunkservers

  15. Cache
      - Clients cache only metadata
      - Caching file data offers little benefit because most applications stream through huge files
      - Not having a data cache simplifies the client and the overall system
      - Chunkservers need not cache file data because chunks are stored as local files (Linux’s buffer cache already keeps frequently accessed data in memory)

  16. Single master
      - Having a single master simplifies the design
      - Minimizing its involvement in reads and writes ensures that it does not become a bottleneck
      - Clients only ask the master which chunkservers they should contact
      - They cache this information for a limited time and interact with the chunkservers directly for many subsequent operations

  17. Interactions
      1. Client translates the file name and byte offset into a chunk index within the file
      2. It sends the master a request
      3. The master replies with the corresponding chunk handle and locations of the replicas
      4. The client caches this information
      5. The client then sends a request to one of the replicas
      6. Further reads of the same chunk require no more client-master interaction
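
The six steps of the read interaction can be sketched roughly as follows. This is an in-process illustration, not the GFS client library: the classes `Master` and `Client`, the `locate` method, the lookup counter, and the choice of the first replica are all assumptions made for the example. The 64 MB chunk size is taken from the Chunk size slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB

ChunkKey = Tuple[str, int]         # (file name, chunk index)
ChunkInfo = Tuple[int, List[str]]  # (chunk handle, replica locations)

@dataclass
class Master:
    table: Dict[ChunkKey, ChunkInfo]
    lookups: int = 0  # counts client-master round trips, for illustration only

    def lookup(self, name: str, index: int) -> ChunkInfo:
        self.lookups += 1
        return self.table[(name, index)]

@dataclass
class Client:
    master: Master
    cache: Dict[ChunkKey, ChunkInfo] = field(default_factory=dict)

    def locate(self, name: str, offset: int) -> Tuple[int, str, int]:
        index = offset // CHUNK_SIZE                # step 1: byte offset -> chunk index
        key = (name, index)
        if key not in self.cache:                   # steps 2-4: ask the master, cache the reply
            self.cache[key] = self.master.lookup(name, index)
        handle, replicas = self.cache[key]
        # step 5: pick a replica (taking the first one is an assumption of this sketch)
        return handle, replicas[0], offset % CHUNK_SIZE

master = Master({
    ("/logs/a", 0): (0xA1, ["cs1", "cs2", "cs3"]),
    ("/logs/a", 1): (0xA2, ["cs2", "cs3", "cs4"]),
})
client = Client(master)
first = client.locate("/logs/a", 10)
second = client.locate("/logs/a", 20)               # step 6: same chunk, no master contact
third = client.locate("/logs/a", CHUNK_SIZE + 5)    # new chunk, one more lookup
```

After the three calls `master.lookups` is 2: the second read is served entirely from the client's cached metadata.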

  18. Interactions - scheme (figure not reproduced)

  19. Chunk size
      - 64 MB
      - Lazy space allocation avoids wasting space due to internal fragmentation
      - Advantages:
        - Reduction of clients’ need to interact with the master
        - Reduction of network overhead by keeping a persistent TCP connection to the chunkserver over an extended period of time
        - Reduction of the size of metadata
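
A back-of-the-envelope sketch of why a large chunk size shrinks metadata and master traffic. The 1 TB file, the comparison chunk sizes, and the 64-byte per-chunk record size (`RECORD_BYTES`) are illustrative assumptions, not figures from the slides.

```python
KB = 1024
MB = 1024 * KB
TB = 1024 * 1024 * MB

RECORD_BYTES = 64  # assumed per-chunk metadata size, for illustration only

def chunk_count(file_size: int, chunk_size: int) -> int:
    """Chunks needed for a file; also an upper bound on location lookups."""
    return (file_size + chunk_size - 1) // chunk_size

def metadata_bytes(file_size: int, chunk_size: int) -> int:
    """Master memory used for the file's chunk records under the assumption above."""
    return chunk_count(file_size, chunk_size) * RECORD_BYTES

for chunk_size in (4 * KB, 1 * MB, 64 * MB):
    n = chunk_count(TB, chunk_size)
    print(f"{chunk_size // KB:>6} KB chunks: {n:>11,} chunks, "
          f"{metadata_bytes(TB, chunk_size) / MB:>9,.1f} MB of metadata")
```

Under these assumptions a 1 TB file needs 16,384 chunks at 64 MB, compared with 268,435,456 units at a 4 KB block size, so both the number of records the master holds and the number of lookups a client may need drop by the same factor.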

  20. Metadata
      - Three types:
        - File and chunk namespaces
        - Mapping from files to chunks
        - Locations of each chunk’s replicas
      - All metadata is kept in the master’s memory
      - Namespaces and mapping are also kept in an operation log stored on the master’s local disk and replicated on remote machines
      - The master does not store chunk location information persistently; it asks each chunkserver about its chunks
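
The split between logged and unlogged metadata can be sketched as below. This is a simplified model under stated assumptions: a Python list stands in for the on-disk, replicated operation log, the JSON record format and the names `MasterMetadata`, `record`, and `report` are hypothetical, and remote replication of the log is not modeled.

```python
import json
from typing import Dict, List, Set

class MasterMetadata:
    def __init__(self, log: List[str]):
        self.log = log                             # stand-in for the persistent operation log
        self.files: Dict[str, List[int]] = {}      # namespace and file-to-chunk mapping (logged)
        self.locations: Dict[int, Set[str]] = {}   # chunk locations (memory only, never logged)
        for line in log:                           # recovery: replay the log into memory
            self._apply(json.loads(line))

    def _apply(self, op: dict) -> None:
        if op["op"] == "create":
            self.files[op["path"]] = []
        elif op["op"] == "add_chunk":
            self.files[op["path"]].append(op["handle"])

    def record(self, op: dict) -> None:
        """Append the mutation to the log first, then update the in-memory state."""
        self.log.append(json.dumps(op))
        self._apply(op)

    def report(self, server: str, handles: List[int]) -> None:
        """A chunkserver tells the master which chunks it currently holds."""
        for handle in handles:
            self.locations.setdefault(handle, set()).add(server)

log: List[str] = []
master = MasterMetadata(log)
master.record({"op": "create", "path": "/a"})
master.record({"op": "add_chunk", "path": "/a", "handle": 7})
master.report("cs1", [7])

restarted = MasterMetadata(log)   # a restart recovers files from the log
locations_after_restart = dict(restarted.locations)
restarted.report("cs2", [7])      # locations are relearned from chunkserver reports
```

After the restart the namespace and mapping are back immediately, while `locations` starts empty and fills in only as chunkservers report their chunks.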
