
Differentiated Storage Services, M. Mesnier, J.B. Akers, F. Chen, T. Luo



  1. Differentiated Storage Services M. Mesnier, J.B. Akers, F. Chen, T. Luo Presentation by Szymon Bachnij

  2. Introduction ● DSS is a proposed I/O classification architecture ● we want to define separate classes of I/O ● our goal is to assign a storage system policy to each of those classes in order to manage data and I/O requests efficiently

  3. Challenges: ● Computer system performance depends on the storage system ● Storage systems are becoming more and more complex ● Storage systems need some information to perform any optimization ● ... but too much information is not a good idea

  4. Requirements

  5. Operating system: ● a classifier is associated with every I/O request ● a new field must be added to each OS structure describing I/O, and it is always copied to the actual I/O command (SCSI, ATA) ● the OS scheduler needs to be changed

  6. Filesystem: ● must have its own classification scheme ● each class has its own policy ● I/O can change its classification class (e.g. when a file changes its size)

  7. Storage system: ● must extract the classifier, find the appropriate policy and enforce it ● does not need to remember the class of each data block ● has to inform about changes in the location of a block

  8. Application: ● the O_CLASSIFIED flag is needed when opening a file in order to use DSS ● POSIX scatter/gather operations are overloaded ● changes in VFS are essential in order to handle DSS features
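The extract, find and enforce steps on the storage side can be sketched as a plain table lookup. This is only an illustration: the names dss_policy, dss_action, dss_find_policy and dss_enforce are hypothetical, and reducing a policy to "cache or bypass" plus a priority is an assumption. The table size of 32 follows the number of class IDs mentioned later in the deck.

```c
#include <assert.h>
#include <stdint.h>

#define DSS_NUM_CLASSES 32  /* number of available class IDs */

/* Hypothetical actions a policy can request from the storage system. */
enum dss_action { DSS_BYPASS_CACHE, DSS_CACHE };

/* Hypothetical policy record, one per I/O class. */
struct dss_policy {
    enum dss_action action;  /* what to do with blocks of this class */
    int priority;            /* lower number = higher priority */
};

/* Find the policy for the class carried by the current request. */
static const struct dss_policy *
dss_find_policy(const struct dss_policy *table, uint8_t class_id)
{
    assert(class_id < DSS_NUM_CLASSES);
    return &table[class_id];
}

/* Enforce the policy: here this only reports which action to take. */
static enum dss_action
dss_enforce(const struct dss_policy *table, uint8_t class_id)
{
    return dss_find_policy(table, class_id)->action;
}
```

Because the class arrives with every request, the lookup needs no per-block state, which matches the point that the storage system does not have to remember the class of each block.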
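One way the overloaded scatter/gather call could carry a class is sketched below. The slide does not say how the overload works, so the layout used here (an extra leading iovec element pointing at a 1-byte class buffer) is an assumption, and build_classified_iov is a hypothetical helper. Only struct iovec from <sys/uio.h> is a real POSIX type.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: fill a two-element iovec array in which
 * element 0 carries the I/O class (assumed layout) and element 1
 * carries the actual data buffer.
 * Returns the number of elements to pass to readv()/writev().
 */
static int build_classified_iov(struct iovec out[2],
                                unsigned char *class_byte,
                                void *data, size_t len)
{
    assert(out != NULL && class_byte != NULL);
    out[0].iov_base = class_byte;  /* 1-byte buffer holding the class */
    out[0].iov_len  = 1;
    out[1].iov_base = data;        /* the payload itself */
    out[1].iov_len  = len;
    return 2;
}
```

With a file descriptor opened using O_CLASSIFIED, the application would then pass the array and the returned count to writev() or readv(). On a kernel without DSS support the extra element would simply be treated as one more data byte.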

  9. Implementation

  10. Operating system ● interface for classifying I/O requests

  11. Operating system ● the class is then copied from the BIO to the 5-bit vendor-specific Group Number field in byte 6 of the SCSI CDB: SCpnt->cmnd[6] = SCpnt->request->bio->bi_class; ● adding I/O classification is a matter of tracking an I/O from the filesystem to the device drivers through the block layers

  12. File system ● Goal: tell the storage system which blocks should be cached and in what order cached blocks should be evicted

  13. File system ● class ID and priority may change ● we use 19 of the 32 available IDs ● the lower the number, the higher the priority
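The copy into byte 6 can be tried out in user space with a minimal sketch. The helpers cdb_set_class and cdb_get_class are hypothetical names, and masking to the low 5 bits while preserving the upper 3 bits of the byte is an assumption about how a careful implementation would treat the rest of that byte; the kernel line on the slide assigns the whole byte.

```c
#include <assert.h>
#include <stdint.h>

#define GROUP_NUMBER_MASK 0x1Fu  /* low 5 bits of CDB byte 6 */

/* Store a class ID in the Group Number field, keeping the upper 3 bits. */
static void cdb_set_class(uint8_t *cdb, uint8_t class_id)
{
    assert(class_id <= GROUP_NUMBER_MASK);  /* must fit in 5 bits */
    cdb[6] = (uint8_t)((cdb[6] & ~GROUP_NUMBER_MASK) | class_id);
}

/* Read the class ID back, as the storage system would on receipt. */
static uint8_t cdb_get_class(const uint8_t *cdb)
{
    return cdb[6] & GROUP_NUMBER_MASK;
}
```

A 5-bit field gives 32 possible class IDs, so the classifier fits in an existing command field and no new command format is needed.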

  14. File system ● provided POSIX interface for user-level I/O

  15. File system ● example for PostgreSQL

  16. Storage system Baseline algorithm: ● at the beginning we have a ‘free list’ of allocations ● when a data block is cached, its allocation is moved to the ‘dirty list’ ● when the ‘free list’ drops below some level, the ‘syncer daemon’ begins to clean the ‘dirty list’

  17. Storage system Selective allocation: ● the decision about caching is not based on request size ● metadata and small files are always cached ● large files are cached conditionally (depending on the state of the ‘syncer daemon’)

  18. Storage system ● Selective eviction: ● is not an LRU algorithm ● entries with the lowest priority are evicted first ● if this is not enough, we evict the next-lowest entries ● metadata and small files rarely leave the cache ● large files are usually moved out because of their priority, but also because of their size
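The baseline algorithm can be sketched with two counters standing in for the lists. This is a simplification: a real cache would keep linked lists of allocations. The names cache_state, cache_block and syncer_step are hypothetical, and the high watermark at which the syncer stops is an assumption, since the slide only mentions the level at which it starts.

```c
#include <assert.h>
#include <stddef.h>

struct cache_state {
    size_t free_count;      /* allocations on the free list */
    size_t dirty_count;     /* allocations on the dirty list */
    size_t low_watermark;   /* syncer starts when free_count drops below this */
    size_t high_watermark;  /* assumed: syncer stops once free_count reaches this */
    int syncer_active;
};

/* Cache one block: move an allocation from the free list to the dirty list. */
static int cache_block(struct cache_state *c)
{
    assert(c != NULL);
    if (c->free_count == 0)
        return 0;  /* nothing left to allocate */
    c->free_count--;
    c->dirty_count++;
    if (c->free_count < c->low_watermark)
        c->syncer_active = 1;  /* wake the syncer daemon */
    return 1;
}

/* One syncer step: clean one dirty allocation and return it to the free list. */
static void syncer_step(struct cache_state *c)
{
    assert(c != NULL);
    if (!c->syncer_active)
        return;
    if (c->dirty_count > 0) {
        c->dirty_count--;
        c->free_count++;
    }
    if (c->free_count >= c->high_watermark || c->dirty_count == 0)
        c->syncer_active = 0;  /* enough free space again */
}
```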
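The allocation decision can be written as a small predicate. The enum io_kind and the function should_allocate are hypothetical names, and the exact condition for large files (cache them only while the syncer daemon is idle, meaning the cache is not under pressure) is an assumption; the slide says only that it depends on the syncer state.

```c
#include <assert.h>

/* Hypothetical coarse I/O kinds; a real system would use the class ID. */
enum io_kind { IO_METADATA, IO_SMALL_FILE, IO_LARGE_FILE };

/* Return 1 if the block should get a cache allocation, 0 otherwise. */
static int should_allocate(enum io_kind kind, int syncer_active)
{
    assert(kind >= IO_METADATA && kind <= IO_LARGE_FILE);
    switch (kind) {
    case IO_METADATA:
    case IO_SMALL_FILE:
        return 1;               /* always cached */
    case IO_LARGE_FILE:
        return !syncer_active;  /* assumed: only when the cache is not under pressure */
    }
    return 0;
}
```

Note that the request size never appears in the decision, only the kind of data and the state of the cache.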
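Selective eviction can be sketched as a walk over per-priority block counts, starting from the lowest priority. The function evict_blocks and the fixed number of priority levels are hypothetical, and tracking only counts instead of individual blocks is a simplification. Priority 0 is taken to be the highest, in line with the rule that a lower number means a higher priority.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PRIORITIES 4  /* hypothetical number of priority levels */

/*
 * cached[p] is the number of cached blocks at priority p (0 = highest).
 * Evict up to `needed` blocks, lowest priority first, moving on to the
 * next-lowest level only when the current one is exhausted.
 * Returns the number of blocks actually evicted.
 */
static size_t evict_blocks(size_t cached[NUM_PRIORITIES], size_t needed)
{
    size_t freed = 0;

    assert(cached != NULL);
    for (int p = NUM_PRIORITIES - 1; p >= 0 && freed < needed; p--) {
        size_t remaining = needed - freed;
        size_t take = cached[p] < remaining ? cached[p] : remaining;

        cached[p] -= take;
        freed += take;
    }
    return freed;
}
```

High-priority entries such as metadata are touched only after every lower level has been emptied, which is why they rarely leave the cache.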

  19. Evaluation

  20. Environment ● single Linux machine (Fedora 13) ● kernel version: 2.6.34 ● 8-core system with 8GB of RAM ● file system: Ext3 ● storage device: 5-disk LSI RAID-1E array ● cache: Intel 32GB X25-E SSD

  21. Test methodology ● a workload generator that takes as input: file size distribution, request file size, read/write ratio, number of subdirectories

  22. File server ● file server workload based on SPECsfs2008 ● over 262,000 files and 8,500 directories created ● over 262,000 transactions performed ● read/write ratio is 2:1 ● 184GB of memory used ● 18GB cache

  23. E-mail server ● e-mail server workload based on a study of e-mail server file sizes ● 1 million files, 1,000 directories ● 1 million transactions performed ● read/write ratio is 2:1 ● 204GB memory used ● 20GB cache

  24. Results

  25. Database ● used database: PostgreSQL ● highest priority for: metadata, user tables, log files and temporary tables (all in one class) ● index files have lower priority ● 8GB cache

  26. Database results

  27. The end
