Write Amplification: An Analysis of In-Memory Database Durability Techniques
Jaemyung Kim, Kenneth Salem, Khuzaima Daudjee
University of Waterloo
IMDM 2015
Durability Matters
OLTP / IMDB: orders of magnitude faster!
I/O is unavoidable! (ACID, Durability)
Write I/O efficiency of an in-memory DBMS is an important issue.
Clip art from openclipart.org
Write Amplification?
[Diagram: Application → (λ) → In-memory Database → (λ_P) → On-disk Database in Persistent Storage]
Goal of Write Amplification Model
- Quantify and compare the I/O efficiency of persistent storage management schemes
- Provide insight into the different natures of update-in-place and copy-on-write storage managers
- Lower the cost of operating a database management system (through improved I/O efficiency)
- Lead to better system performance in situations where I/O capacity is constrained (restart recovery)
The following are not our goals:
- Emulate a specific storage manager implementation
- Compare specific implementations: e.g., Hekaton is better than H-Store
Architectural Diversity in IMDB SM
Two broad classes: Update In-Place and Copy-On-Write
Update In-Place (UIP)
- UIP: conventional page-based (e.g., Shore-MT)
  - random writes for checkpointing
  - device sensitive: e.g., HDD vs. SSD
- UIP-S: snapshot checkpointing (e.g., H-Store, SiloR)
Copy-On-Write (COW)
- COW-D: logging only (log-structured) database (e.g., Hekaton)
- COW-M: log-structured memory and disk databases (e.g., RAMCloud)
UIP: Page-level Checkpoint Example
Space constraint (α) = 1.2 × DB size, page size = 2
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk: A B C D E F G H I J (DB) | 0 0 (LOG)
LOG I/O history: (none)
DB I/O history: (none)
I/O per update = (#DBIO + #LogIO) / #Updates
UIP: Page-level Checkpoint Example
Space constraint (α) = 1.2 × DB size, page size = 2
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk: A B C D E F G H I J (DB) | I 0 (LOG)
LOG I/O history: I
DB I/O history: (none)
I/O per update = (#DBIO + #LogIO) / #Updates = (0 + 1) / 1 = 1
UIP: Page-level Checkpoint Example
Space constraint (α) = 1.2 × DB size, page size = 2
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk: A B C D E F G H I J (DB) | I D (LOG)
LOG I/O history: I, D
DB I/O history: (none)
I/O per update = (#DBIO + #LogIO) / #Updates = (0 + 2) / 2 = 1
UIP: Page-level Checkpoint Example
Space constraint (α) = 1.2 × DB size, page size = 2
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk: A B C D E F G H I J (DB) | 0 0 (LOG)
LOG I/O history: I, D
DB I/O history: C, D, I, J
I/O per update = (#DBIO + #LogIO) / #Updates = (4 + 2) / 2 = 3
UIP: Page-level Checkpoint Example
Space constraint (α) = 1.2 × DB size, page size = 2
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk: A B C D E F G H I J (DB) | B 0 (LOG)
LOG I/O history: I, D, B
DB I/O history: C, D, I, J
I/O per update = (#DBIO + #LogIO) / #Updates = (4 + 3) / 3 ≈ 2.33
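The UIP walkthrough above can be replayed with a few lines of Python. This is a minimal sketch, not the authors' model: the name `simulate_uip`, its parameters, and the policy of checkpointing as soon as the log fills are assumptions made for illustration.

```python
def simulate_uip(db_keys, updates, log_slots, page_size):
    """Toy page-level checkpoint model (hypothetical helper, not from the slides).

    Returns the cumulative I/O per update after each update.
    """
    # Map each key to the page that holds it: (A,B) -> 0, (C,D) -> 1, ...
    page_of = {key: i // page_size for i, key in enumerate(db_keys)}
    log_used = 0
    dirty_pages = set()
    db_io = 0
    log_io = 0
    history = []
    for n, key in enumerate(updates, start=1):
        # Every update costs one log write and dirties one page in memory.
        log_used += 1
        log_io += 1
        dirty_pages.add(page_of[key])
        if log_used == log_slots:
            # Assumed policy: checkpoint as soon as the log is full,
            # writing every dirty page in full and truncating the log.
            db_io += len(dirty_pages) * page_size
            dirty_pages.clear()
            log_used = 0
        history.append((db_io + log_io) / n)
    return history


history = simulate_uip("ABCDEFGHIJ", "IDB", log_slots=2, page_size=2)
print([round(x, 2) for x in history])  # [1.0, 3.0, 2.33]
```

The frame with I/O per update = 1 after the second update is the state just before the checkpoint; because the sketch checkpoints eagerly, it reports 3 for that update.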
UIP-S: Snapshot Checkpoint Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk: A B C D E F G H I J (DB) | I D (LOG)
LOG I/O history: I, D
DB I/O history: (none)
I/O per update = (#DBIO + #LogIO) / #Updates = (0 + 2) / 2 = 1
UIP-S: Snapshot Checkpoint Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk: A B C D E F G H I J (DB) | 0 0 (LOG)
LOG I/O history: I, D
DB I/O history: A, B, C, D, E, F, G, H, I, J
I/O per update = (#DBIO + #LogIO) / #Updates = (10 + 2) / 2 = 6
UIP-S: Snapshot Checkpoint Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk: A B C D E F G H I J (DB) | B 0 (LOG)
LOG I/O history: I, D, B
DB I/O history: A, B, C, D, E, F, G, H, I, J
I/O per update = (#DBIO + #LogIO) / #Updates = (10 + 3) / 3 ≈ 4.33
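The snapshot variant differs from page-level checkpointing only in what a checkpoint writes: the whole database rather than the dirty pages. A minimal sketch under the same assumptions (the name `simulate_uip_s` is hypothetical, and the checkpoint is assumed to fire as soon as the log fills):

```python
def simulate_uip_s(db_size, num_updates, log_slots):
    """Toy snapshot-checkpoint model (hypothetical helper, not from the slides).

    Returns the cumulative I/O per update after each update.
    """
    log_used = 0
    db_io = 0
    log_io = 0
    history = []
    for n in range(1, num_updates + 1):
        # Every update costs one log write.
        log_used += 1
        log_io += 1
        if log_used == log_slots:
            # Assumed policy: a full log triggers a snapshot of the
            # entire database, regardless of which items changed.
            db_io += db_size
            log_used = 0
        history.append((db_io + log_io) / n)
    return history


history = simulate_uip_s(db_size=10, num_updates=3, log_slots=2)
print([round(x, 2) for x in history])  # [1.0, 6.0, 4.33]
```

Which items are updated does not matter here, so the sketch takes only a count of updates.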
COW-D: Log-structured Disk Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk (log-structured DB): H A E I F G D C B J 0 0
DB I/O history: Read: (none); Write: (none)
I/O per update = (#ReadIO + #WriteIO) / #Updates
COW-D: Log-structured Disk Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk (log-structured DB): H A E I F G D C B J I 0
DB I/O history: Read: (none); Write: I
I/O per update = (#ReadIO + #WriteIO) / #Updates = (0 + 1) / 1 = 1
COW-D: Log-structured Disk Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk (log-structured DB): H A E I F G D C B J I D
DB I/O history: Read: (none); Write: I, D
I/O per update = (#ReadIO + #WriteIO) / #Updates = (0 + 2) / 2 = 1
COW-D: Log-structured Disk Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk (log-structured DB): I F G D C B J I D H A E
DB I/O history: Read: H, A, E; Write: I, D, H, A, E
I/O per update = (#ReadIO + #WriteIO) / #Updates = (3 + 5) / 2 = 4
COW-D: Log-structured Disk Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk (log-structured DB): F G D C B J I D H A E 0
DB I/O history: Read: H, A, E, I; Write: I, D, H, A, E
I/O per update = (#ReadIO + #WriteIO) / #Updates = (4 + 5) / 2 = 4.5
COW-D: Log-structured Disk Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory: A B C D E F G H I J
Disk (log-structured DB): F G D C B J I D H A E B
DB I/O history: Read: H, A, E, I; Write: I, D, H, A, E, B
I/O per update = (#ReadIO + #WriteIO) / #Updates = (4 + 6) / 3 ≈ 3.33
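The cleaning behaviour in the COW-D frames can be sketched as a queue that is scanned from its head whenever the disk is full. This is an illustrative model only: `simulate_cow_d` and its parameters are hypothetical names, cleaning is assumed to run lazily (just before a write that needs space), and `capacity` is assumed to exceed the number of distinct items so that cleaning always terminates.

```python
from collections import deque


def simulate_cow_d(initial_layout, updates, capacity):
    """Toy log-structured disk model (hypothetical helper, not from the slides).

    Returns (reads, writes, final_layout, history), where history holds the
    cumulative I/O per update after each update.
    """
    # Each disk slot holds (key, version); the newest version of a key is live.
    log = deque((key, 0) for key in initial_layout)
    latest = {key: 0 for key in initial_layout}
    reads = 0
    writes = 0
    history = []
    for n, key in enumerate(updates, start=1):
        # Clean from the head of the log until one slot is free.
        while len(log) == capacity:
            k, version = log.popleft()
            reads += 1  # the cleaner must read the slot from disk
            if latest[k] == version:
                log.append((k, version))  # live item: rewrite at the tail
                writes += 1
            # stale item: simply dropped, which frees the slot
        latest[key] += 1
        log.append((key, latest[key]))
        writes += 1
        history.append((reads + writes) / n)
    layout = "".join(k for k, _ in log)
    return reads, writes, layout, history


reads, writes, layout, history = simulate_cow_d("HAEIFGDCBJ", "IDB", capacity=12)
print(reads, writes, layout)           # 4 6 FGDCBJIDHAEB
print([round(x, 2) for x in history])  # [1.0, 1.0, 3.33]
```

The intermediate values 4 and 4.5 in the frames are mid-cleaning states; the sketch only records the ratio once each update has completed.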
COW-M: Log-structured Memory Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory (log-structured IMDB): H A E I F G D C B J I
Disk (log-structured DB): H A E I F G D C B J 0 0
DB I/O history: Write: (none)
I/O per update = #WriteIO / #Updates
COW-M: Log-structured Memory Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory (log-structured IMDB): H A E F G D C B J I D
Disk (log-structured DB): H A E I F G D C B J I 0
DB I/O history: Write: I
I/O per update = #WriteIO / #Updates = 1 / 1 = 1
COW-M: Log-structured Memory Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory (log-structured IMDB): H A E F G C B J I D B
Disk (log-structured DB): H A E I F G D C B J I D
DB I/O history: Write: I, D
I/O per update = #WriteIO / #Updates = 2 / 2 = 1
COW-M: Log-structured Memory Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory (log-structured IMDB): F G C B J I D H A E B
Disk (log-structured DB): F G D C B J I D H A E 0
DB I/O history: Write: I, D, H, A, E
I/O per update = #WriteIO / #Updates = 5 / 2 = 2.5
COW-M: Log-structured Memory Example
Space constraint (α) = 1.2 × DB size
Example update sequence: I, D, B
Memory (log-structured IMDB): F G C J I D H A E B
Disk (log-structured DB): F G D C B J I D H A E B
DB I/O history: Write: I, D, H, A, E, B
I/O per update = #WriteIO / #Updates = 6 / 3 = 2
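In the COW-M frames only writes are counted. A reasonable reading is that the cleaner decides which items are live from the in-memory copy and so never reads the disk. The sketch below models that interpretation; `simulate_cow_m` is a hypothetical name, cleaning is assumed to run lazily, and `capacity` is assumed to exceed the number of distinct items.

```python
from collections import deque


def simulate_cow_m(initial_layout, updates, capacity):
    """Toy log-structured memory + disk model (hypothetical, not from the slides).

    Returns (writes, final_disk_layout, history), where history holds the
    cumulative I/O per update after each update.
    """
    # Disk slots hold (key, version); memory knows the live version of each key.
    disk = deque((key, 0) for key in initial_layout)
    latest = {key: 0 for key in initial_layout}
    writes = 0
    history = []
    for n, key in enumerate(updates, start=1):
        while len(disk) == capacity:
            k, version = disk.popleft()
            if latest[k] == version:
                # Live item: copied to the tail from memory, so the only
                # disk I/O is the write (assumed: no disk read is needed).
                disk.append((k, version))
                writes += 1
            # Stale item: its slot is reclaimed with no I/O.
        latest[key] += 1
        disk.append((key, latest[key]))
        writes += 1
        history.append(writes / n)
    layout = "".join(k for k, _ in disk)
    return writes, layout, history


writes, layout, history = simulate_cow_m("HAEIFGDCBJ", "IDB", capacity=12)
print(writes, layout)  # 6 FGDCBJIDHAEB
print(history)         # [1.0, 1.0, 2.0]
```

The final disk layout and write count are the same as in the COW-D frames; the per-update cost is lower only because the four cleaning reads are not charged.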
What We Can Do Using WAF Model
- Compare UIP and COW
- Analyze the effect of persistent storage utilization (α)
- Analyze the effect of update workload (uniform vs. skew)
- Analyze the effect of page size on UIP
  - Find the best page size for storage-specific performance characteristics
- Analyze the effect of SSD vs. HDD persistent storage
  - Weight random and sequential I/O differently, depending on the device type
[Diagram: Application → (λ) → In-memory Database → (λ_P) → On-disk Database in Persistent Storage]
I/O per update = λ_P / λ
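One way to read "weight random and sequential I/O differently" is as a weighted version of the I/O-per-update ratio. The sketch below is an assumption about how such weighting could be expressed, not the authors' formulation: `weighted_io_per_update` is a hypothetical name, and the weights used in the example are placeholders chosen for illustration, not measured device characteristics.

```python
def weighted_io_per_update(random_ios, sequential_ios, updates,
                           random_weight, sequential_weight=1.0):
    """Device-weighted I/O per update (hypothetical helper, not from the slides).

    Each random I/O is charged random_weight and each sequential I/O
    sequential_weight; with both weights at 1.0 this reduces to the
    plain ratio (total I/O) / (number of updates).
    """
    weighted_total = random_weight * random_ios + sequential_weight * sequential_ios
    return weighted_total / updates


# Counts taken from the UIP example: 4 checkpoint writes (treated as random)
# and 3 log writes (assumed sequential) over 3 updates.
# The weights 1.0 and 10.0 are illustrative placeholders only.
flat = weighted_io_per_update(4, 3, 3, random_weight=1.0)
penalised = weighted_io_per_update(4, 3, 3, random_weight=10.0)
print(round(flat, 2), round(penalised, 2))  # 2.33 14.33
```

With a random-write penalty the same I/O counts give a much larger weighted cost, which is why the comparison between schemes can depend on the device type.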