Enabling System Transactions via Lightweight Kernel Extensions



  1. Enabling System Transactions via Lightweight Kernel Extensions
     R.P. Spillane, S. Gaikwad, M. Chinni, C.P. Wright, E. Zadok
     Stony Brook University
     http://www.fsl.cs.sunysb.edu/

  2. Summary
     - What is the design complexity of system transactions implemented in the VFS?
       - Low
       - 100 lines of code added to page writeback
       - 4000 lines of module code (log implementation)
     - What is the performance?
       - Valor: 35% overhead on top of theoretical best, compared to…
       - 104% overhead for an efficient user-level alternative
     2/28/2009 FAST 2009 - Enabling System Transactions

  3. System Transaction
     [Figure: Process 1 issues the system calls below against a directory / containing files f1 and f2. FS state labels: foo, foo’.]
     System calls:
       TID←sys_tbegin(...)
       write(TID,...)
       unlink(TID,...)
       sys_tabort(TID)
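
The begin/write/unlink/abort sequence on this slide can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `ToyFS` and its method names are hypothetical stand-ins that mirror the slide's call names, not Valor's kernel interface, and the undo list here lives in memory rather than in an on-disk log.

```python
class ToyFS:
    """Hypothetical in-memory stand-in for a file system (not Valor's API)."""

    def __init__(self, files):
        self.files = dict(files)   # file name -> contents
        self._undo = {}            # tid -> list of (name, previous contents or None)
        self._next_tid = 1

    def tbegin(self):
        # Hand out a fresh transaction ID and start an empty undo list.
        tid = self._next_tid
        self._next_tid += 1
        self._undo[tid] = []
        return tid

    def write(self, tid, name, data):
        # Remember the old contents (None if the file did not exist) first.
        self._undo[tid].append((name, self.files.get(name)))
        self.files[name] = data

    def unlink(self, tid, name):
        self._undo[tid].append((name, self.files.get(name)))
        self.files.pop(name, None)

    def tabort(self, tid):
        # Replay the undo list newest-first so the oldest saved value wins.
        for name, old in reversed(self._undo.pop(tid)):
            if old is None:
                self.files.pop(name, None)
            else:
                self.files[name] = old


fs = ToyFS({"f1": "one", "f2": "two"})
before = dict(fs.files)
tid = fs.tbegin()
fs.write(tid, "f1", "changed")
fs.unlink(tid, "f2")
during = dict(fs.files)
fs.tabort(tid)
after = dict(fs.files)
```

After the abort, `after` equals `before`: both files are back with their original contents, while `during` holds the intermediate state that only the transaction should observe.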

  4. The Design Spectrum
     - Valor side-steps the traditional trade-off by working with the kernel’s page cache in a general way.
     [Figure: systems placed on two axes, "Transparency & Performance" and "Design Feasibility". Labeled systems: Quicksilver, TxF; Valor; Amino; Berkeley DB, KBDB; Stasis.]

  5. Valor’s Process Txn Model
     - Transactional Model
     - Supported Operations:
       - dirtying a page
       - appending to a file, modifying an inode
       - modifying a directory
     - Locking:
       - directory locks, inode locks
       - page range locks for overwrites
       - intent locks for directory renames

  6. Asynchronous By Default
     - ACI (no D without tsync)
     - Similar to asynchronous write(2) with fsync(2)
     - Same purpose (performance increase)
     - Requires page cache for files updated transactionally
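
The write(2)/fsync(2) analogy on this slide can be shown from user space with Python's `os` module. This is a sketch of the analogy only: the helper names `write_async` and `write_durable` are hypothetical, and nothing here calls Valor's tsync.

```python
import os
import tempfile


def _write_all(fd, data):
    # os.write may write fewer bytes than requested, so loop until done.
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def write_async(path, data):
    """Return once the data is in the page cache (no durability promise)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        _write_all(fd, data)
    finally:
        os.close(fd)


def write_durable(path, data):
    """Same write, followed by fsync to ask the OS to flush it to disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        _write_all(fd, data)
        os.fsync(fd)
    finally:
        os.close(fd)


demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "f1")
write_async(demo_path, b"fast, not yet durable")
write_durable(demo_path, b"flushed on request")
```

The asynchronous path is the cheap default and the explicit flush is the opt-in durability step, which is the same split the slide describes for transactions without and with tsync.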

  7. Valor Design
     - Modify page writeback to support simple write ordering
     - Implement an ARIES-style undo/redo log module for FS operations

  8. Page Dirtying: No Txns
     [Figure: Process 1 calls write(TID,…) twice; annotation "Uh-oh…". Legend: OK, bad, Old Page, New Page.]

  9. Page Dirtying: With Txns
     [Figure: Process 1 calls log_append(TID,…) and write(TID,…), twice each. Legend: Old Page, U/R Page, New Page.]

  10. Current Kernel Design
      [Figure: Process 2 calls log_append(TID,…) and write(TID,…). Labels: Page Cache; Page Writeback; Ext3, Ext2, XFS, ZFS, …; annotation "Uh-oh…". Legend: Old Page, U/R Page, New Page.]

  11. What DBs Do
      [Figure labels: Page Cache II: The Wrath of Khan; Disk Cache Flush (fsync); Page Cache; Ext2, XFS, ZFS.]

  12. Simple Write Ordering
      [Figure labels: Page Cache; FS1, FS2, FS3, FS4; Valor. Legend: Old Page, U/R Page, New Page.]
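
One way to picture simple write ordering is a cache that refuses to write a dirty page back until the undo/redo record describing it has reached the log. The sketch below is a toy model under that assumption; `OrderedCache`, its fields, and the LSN bookkeeping are hypothetical and do not reflect how Valor hooks into the kernel's page writeback.

```python
class OrderedCache:
    """Toy page cache that flushes log records before the pages they cover."""

    def __init__(self):
        self.log_buffer = []    # (lsn, tid, page_no, undo, redo), not yet on disk
        self.dirty = {}         # page_no -> (data, lsn of its newest log record)
        self.disk_events = []   # order in which items reach the simulated disk
        self._lsn = 0

    def log_append(self, tid, page_no, undo, redo):
        self._lsn += 1
        self.log_buffer.append((self._lsn, tid, page_no, undo, redo))
        return self._lsn

    def write(self, tid, page_no, old, new):
        # Every write is preceded by a log append for the same page.
        lsn = self.log_append(tid, page_no, old, new)
        self.dirty[page_no] = (new, lsn)

    def flush_log(self, upto_lsn):
        # Push buffered records to disk, oldest first, up to the given LSN.
        while self.log_buffer and self.log_buffer[0][0] <= upto_lsn:
            record = self.log_buffer.pop(0)
            self.disk_events.append(("log", record[0]))

    def writeback(self, page_no):
        # The ordering rule: the page's log record goes out before the page.
        _data, lsn = self.dirty.pop(page_no)
        self.flush_log(lsn)
        self.disk_events.append(("page", page_no))


cache = OrderedCache()
cache.write(1, 7, b"old7", b"new7")
cache.write(1, 8, b"old8", b"new8")
cache.writeback(8)
events = list(cache.disk_events)
```

Writing back page 8 first forces log records 1 and 2 to disk, so `events` lists both log entries ahead of the page itself, while page 7 stays dirty in the cache.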

  13. Log Module
      [Figure: Process 2 issues the numbered steps below to the Valor module. Labels: Record Maps, State File, Log File, Disk, U/R Pages 1 through 5; log entries are marked U/R or C.]
      Steps:
        1. tbegin(TID,…)
        2. tlog(TID,…)
        3. write(TID,…)
        4. page writeback
        5. tlog(TID,…)
        6. write(TID,…)
        7. tresolve(TID,…)
        8. page writeback
        9. page writeback

  14. Atomicity Argument
      - The transition from pre-writeback to post-writeback disk state is atomic iff
        - all writes are preceded by sys_log_append
        - simple write ordering is implemented
        - writes to a single sector are atomic
      - Valor satisfies the first two constraints
      - A supported hard disk satisfies the third

  15. Performing Recovery
      - Two kinds of recovery are supported:
        - System Recovery
        - Application Recovery (per-process abort)
      - Standard recovery process:
        - Reconstruct RAM state from log
        - In reverse LSN order, commit/abort landed transactions
        - Perform a page writeback
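
The recovery steps above can be sketched as a simplified undo/redo pass over a log. This is a generic ARIES-flavored illustration, not Valor's actual procedure: the record layout (`"U/R"` and `"C"` tuples) and the `recover` function are assumptions made for the example, and it assumes locking keeps two unfinished transactions from touching the same page.

```python
def recover(log, disk):
    """Rebuild page contents from a log held in LSN order.

    Records are ("U/R", tid, page, undo_value, redo_value) or ("C", tid).
    """
    pages = dict(disk)
    committed = {record[1] for record in log if record[0] == "C"}

    # Redo pass: replay every logged change in LSN order.
    for record in log:
        if record[0] == "U/R":
            _kind, _tid, page, _undo, redo = record
            pages[page] = redo

    # Undo pass: walk the log in reverse LSN order and roll back
    # changes made by transactions that never committed.
    for record in reversed(log):
        if record[0] == "U/R" and record[1] not in committed:
            _kind, _tid, page, undo, _redo = record
            pages[page] = undo

    return pages


log = [
    ("U/R", 1, "p1", "a", "b"),   # txn 1 changes p1 from a to b
    ("U/R", 2, "p2", "x", "y"),   # txn 2 changes p2 from x to y
    ("C", 1),                     # only txn 1 commits before the crash
]
disk_at_crash = {"p1": "a", "p2": "y"}   # p2 was written back, p1 was not
recovered = recover(log, disk_at_crash)
```

Here `recovered` keeps the committed change to `p1` (value `b`) and restores `p2` to `x`, even though the uncommitted value of `p2` had already landed on disk.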

  16. Evaluation
      - We must compare against traditional asynchronous FSes
        - benchmark against asynchronous ext3
        - do serial transfer benchmarks for large files
      - We turn off synchronous transactions for two other controls (for fairness)
        - FS built on top of Stasis
        - FS built on top of Berkeley DB

  17. Mock ARIES Benchmark
      - Important lower bound (not tight)
      [Figure: three configurations, MT-ow-noread, MT-ow, and MT-ow-finite, each drawn with a Disk and a Log.]

  18. Mock ARIES Benchmark
      [Figure: bar chart of elapsed time (sec), split into Wait, User, and System time. Bar annotations: 104%, 66%, 35%, 16%, 2%, 2x.]

  19. Serial Overwrite
      - Transaction size: 16 pages
      [Figure: elapsed time (sec) versus size of serial transfer (256, 512, 1024, 2048 MiB) for BDB, Stasis, Valor, and Ext3. Annotations: 22.75x Ext3, 5.0x Ext3, 2.75x Ext3.]

  20. Transaction Throughput
      [Figure: elapsed time (sec) versus size of transaction (1, 4, 16, 64, 256 pages) for BDB, Stasis, Valor, and Ext3, with "BDB Heel", "Stasis Heel", and "Valor Heel" marked. Annotations: 23.0x Ext3, 4.2x Ext3, 2.9x Ext3.]

  21. Conclusions
      - System transactions are feasible
      - Valor achieves low overhead
      - Minimal changes to existing kernels

  22. Limitations/Future Work
      - Limitations
        - Locking slows interleaved writes to the same page
        - Some FSes/disks do not fsync() when asked to
      - Future Work
        - Explore use of the logging device as a coordinator in a transactional disk array

  23. Q&A
      Enabling System Transactions via Lightweight Kernel Extensions
      R.P. Spillane, S. Gaikwad, M. Chinni, C.P. Wright, E. Zadok
      Stony Brook University
      http://www.fsl.cs.sunysb.edu/

  24. TxF
      - TxF is Microsoft’s transactional file system
      - Motivation: program installation, system updates, website updates
      - Pros
        - Backed by Microsoft
      - Cons
        - Specific to NTFS

  25. Isolation
      - Extended mandatory locking
        - Allows locking of directories
        - Do not have to set group exec/setgid bits
      - Locking permissions
        - Let users decide if a file can be locked
      - All processes acquire locks
        - Regular processes hold only for the syscall
      - Lock inheritance
        - Allow multi-process transactions
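
The rule that every process acquires locks, with regular processes holding them only for the duration of a system call, can be modeled with a small lock table. This is an illustrative sketch: `LockTable`, `plain_write`, and the owner strings are hypothetical, and a real kernel would block or return an error code where this model returns `False`.

```python
class LockTable:
    """Toy exclusive-lock table keyed by path (hypothetical, not Valor's)."""

    def __init__(self):
        self.owner = {}   # path -> id of the current lock holder

    def acquire(self, who, path):
        # Succeeds if the path is free or already held by the same owner.
        holder = self.owner.setdefault(path, who)
        return holder == who

    def release(self, who, path):
        if self.owner.get(path) == who:
            del self.owner[path]

    def release_all(self, who):
        # Called when a transaction commits or aborts.
        for path in [p for p, o in self.owner.items() if o == who]:
            del self.owner[path]


def plain_write(locks, pid, path):
    """A non-transactional write: lock, do the write, unlock in one call."""
    if not locks.acquire(pid, path):
        return False      # conflicting transaction still holds the lock
    try:
        return True       # the write itself would happen here
    finally:
        locks.release(pid, path)


locks = LockTable()
txn_got_lock = locks.acquire("txn-1", "/d/f1")       # held until txn end
blocked = plain_write(locks, "pid-9", "/d/f1")       # conflicts with txn-1
locks.release_all("txn-1")                           # txn commits or aborts
allowed = plain_write(locks, "pid-9", "/d/f1")       # now succeeds
```

The regular process is turned away while the transaction holds the lock and succeeds once the transaction ends, and it leaves no lock behind after its own call returns.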

  26. Valor != Journaling
      - Journaling FSes are good at fast recovery
      - …but are too special-purpose:
        - No-Steal Caching
          - all state modified by a txn must remain in memory until commit/abort
        - Non-Modular Design
          - does not handle rollback of VFS and page caches, just disk state on boot
