Reducing Journaling Harm on Virtualized I/O Systems
Eunji Lee, Hyokyung Bahn, Minseong Jeong, Sunghwan Kim, Jesung Yeon, Seunghoon Yoo, Sam H. Noh, Kang G. Shin
Chungbuk National University
ACM/USENIX SYSTOR '16, June 6-8
Virtualization in Computer Systems
- Widely used in modern computer systems, from personal computing devices to cloud servers
- Provides flexibility, scalability, and energy savings by separating the software platform from the underlying hardware
- But it is accompanied by inefficiencies from the additional software layers
- The "bare-metal" approach, para-virtualization, requires modifying the guest OS
- 80% of cloud servers rely on full-virtualization hypervisors: VMware, Hyper-V, and QEMU-KVM
  (http://www.infoq.com/news/2012/10/Survey-Virtualization-Cloud)
Our Work in Brief
- Challenge: improve the inefficiency of virtualization without compromising transparency
  - The layered SW stack is more painful on high-speed storage
  - The guest's journaling harms I/O performance in a virtualized environment
- Approach: analyze the effectiveness of journaling and reduce its overhead with a new caching strategy
- The proposed strategy is implemented in QEMU-KVM and improves I/O performance by 3-32% for file and key-value store benchmarks
I/O Stack in Full Virtualization
- Both guest and host have their own file systems and buffer caches
- The guest's I/O goes through the host cache
  (-) Redundant data caching
  (+) Large shared cache
  (+) Buffering and merging effects deliver better performance: locality and asynchronous writes
- What about high-speed storage? The additional memory copy is more painful on a high-speed solid state disk
[Figure: a read of block "A" is cached in both the guest's and the host's buffer cache]
Just Bypass the Host Cache!
- Performance comparison of using the host cache (Writeback) and bypassing it (Direct) on an SSD
Just Bypass the Host Cache - NO!
- Using the host cache delivers 2.7x and 1.7x better performance on HDD and SSD, respectively, on average
[Figure: benefit of the host cache]
Journaling Effects in Virtualization
- A bit on journaling
  - Used to ensure data consistency in file systems (ext4, JFS, ReiserFS, etc.)
  - Writes new data to a journal area, and updates the original data in its permanent file location only if logging succeeds
- Case of ext4: data in the page cache is committed to the journal area (every 5 s) and later checkpointed to the file system
[Figure: commit and checkpoint paths from the page cache to the journal area and the file system]
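The commit-then-checkpoint discipline above can be sketched as a minimal in-memory model (a generic write-ahead-journaling illustration, not ext4's actual implementation; all names are hypothetical):

```python
# Minimal write-ahead journaling sketch: a transaction's updates are first
# committed to the journal area; only after the commit succeeds are they
# checkpointed to their permanent file locations.

class JournalingFS:
    def __init__(self):
        self.journal = []   # journal area: committed transactions
        self.disk = {}      # permanent locations: block -> data

    def write(self, updates):
        # 1. Commit: log the whole transaction to the journal first.
        txn = dict(updates)
        self.journal.append(txn)   # a crash after this point is recoverable
        # 2. Checkpoint: apply the committed updates to their home locations.
        for block, data in txn.items():
            self.disk[block] = data
        # 3. The journal entry can now be reclaimed.
        self.journal.pop()

fs = JournalingFS()
fs.write({"inode_7": "metadata'", "block_42": "data'"})
print(fs.disk["block_42"])   # -> data'
```

Note that every block is written twice (journal, then home location), which is exactly the extra traffic the talk is concerned with.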
Harmful I/O Traffic by Journaling
- No locality: journal blocks are not accessed again unless the system crashes, so they never hit in the host cache
[Figure: journal block "J" passes through the guest's and host's buffer caches without being reused]
Harmful I/O Traffic by Journaling
- Synchronous writes: a FLUSH command comes right after journaling (ext4 in ordered mode also flushes the associated data blocks)
- No buffering effect due to immediate synchronization
[Figure: timeline of data (D), journal (J), and commit (C) writes, each followed by a FLUSH]
Harmful I/O Traffic by Journaling
- Large footprint: completely sequential writes in a large loop over the journal area
- Cache pollution: journal blocks crowd useful data out of the host cache
[Figure: looping journal writes evicting useful blocks from the host's buffer cache]
Analyzing Journal Accesses
- Journal traffic accounts for 19% of total traffic on average, and up to 47%
Analyzing Journal Accesses
- The footprint of journal accesses accounts for 45.2% of the total footprint on average, and up to 84.8%
Analyzing Journal Accesses
- 86% of total sync operations are associated with the journal on average
Pollution Defensive Caching (PDC)
- Filter journal traffic out of the host cache
- Two challenges
  1. How to distinguish journal traffic from other traffic
  2. How to convey this information to the host OS, so that it can decide whether or not to cache the data
- Both must be solved while still providing transparency
[Figure: journal writes (J) from the guest go directly to storage, while file data (F) passes through the host cache]
1. How to Identify Journal Traffic
- Implicit journal traffic detection
  - Maintain access flows with their first and last LBAs in a hash table
  - Monitor whether each upcoming request falls in a consecutive address range
  - Regard a range where consecutive writes form a large loop as the journal area
[Figure: timeline of an explicit-knowledge prediction period followed by implicit detection]
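The detection steps above can be sketched as follows (a simplified illustration of the idea, not the paper's exact algorithm; the class name and the `min_len` threshold are assumptions):

```python
# Implicit journal detection sketch: track sequential write flows by their
# first/last LBAs in a hash table; a long flow whose writer wraps back to
# the flow's start looks like the circular journal area.

class JournalDetector:
    def __init__(self, min_len=64):
        self.first_of = {}        # last LBA of a flow -> its first LBA
        self.min_len = min_len    # minimum flow length to count as a loop
        self.journal_range = None

    def on_write(self, lba):
        if lba - 1 in self.first_of:            # extends a sequential flow
            self.first_of[lba] = self.first_of.pop(lba - 1)
            return
        # Wrap-around check: does this write restart a long existing flow?
        for last, first in self.first_of.items():
            if first == lba and last - first >= self.min_len:
                self.journal_range = (first, last)   # large loop -> journal
                return
        self.first_of[lba] = lba                # otherwise start a new flow

    def is_journal(self, lba):
        r = self.journal_range
        return r is not None and r[0] <= lba <= r[1]

det = JournalDetector(min_len=64)
for lba in range(100, 201):       # a long run of consecutive writes
    det.on_write(lba)
det.on_write(100)                 # the writer wraps back to the start
print(det.journal_range)          # -> (100, 200)
```

Once the range is known, `is_journal(lba)` classifies each subsequent write so the hypervisor can route journal blocks around the host cache.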
2. How to Communicate with the Host OS
- The posix_fadvise system call enables user applications to provide explicit hints to the OS
- Implement the POSIX_FADV_NOREUSE flag to bypass the host cache
- The host operating system commences direct I/O when this flag is set, and switches back to buffered I/O upon a subsequent call with the POSIX_FADV_NORMAL flag
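The calling convention can be shown with Python's binding of posix_fadvise (Linux-only; note that the paper gives POSIX_FADV_NOREUSE its bypass semantics in a modified host kernel, so on a stock kernel the hint is accepted but may be a no-op):

```python
# Sketch of the hinting mechanism from user space.
import os
import tempfile

fd, path = tempfile.mkstemp()

if hasattr(os, "posix_fadvise"):                      # Linux-only binding
    # Journal-like traffic detected: data written here will not be reused,
    # so the (modified) host bypasses its page cache for this file.
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_NOREUSE)  # offset 0, len 0 = whole file

os.write(fd, b"journal block")

if hasattr(os, "posix_fadvise"):
    # Regular data again: resume normal buffered I/O.
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_NORMAL)

os.close(fd)
os.remove(path)
```

Because the hint travels through an existing POSIX interface, the guest OS and applications need no modification, which is how PDC preserves transparency.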
Performance Evaluation
- Experimental setup
[Figure: the I/O path from the guest's system calls through its file system, buffer cache, and block I/O layers into the hypervisor, where PDC's selective caching routine separates regular data from journal data and chooses per request: 1. no cache, 2. use cache with invalidation, or 3. use cache, before entering the host's file system, buffer cache, block I/O layer, and device driver]
Performance Evaluation
- PDC provides 8-32% higher IOPS than the original caching policy (WB) on SSD
Performance Evaluation
- PDC considerably improves synchronous writes and compaction operations in a key-value store
- The fillsync and compact operations improve performance by 33% and 18%, respectively
Cache Hit Ratio
- No significant difference in the hit ratio between the two policies, despite the high ratio of journal data in the footprint
- Journaling periodically writes small updates that are consecutive to previous accesses
- Hence it has little effect on evicting likely-to-be-accessed data from the host cache
Conclusion
- Analyzed the journaling effect in fully virtualized systems
- Uncovered that journal traffic, with its synchronous writes and lack of locality, deteriorates cache performance
- Proposed a new caching policy: pollution-defensive caching
- Implemented in Linux 4.14 and QEMU-KVM
- Improves I/O performance by 3-32% in file and key-value store benchmarks
Reducing Write Amplification of Flash Storage through Cooperative Data Management with NVM
32nd International Conference on Massive Storage Systems and Technology (MSST), May 2016
Eunji Lee, Chungbuk National University
Julie Kim, Ewha University
Hyokyung Bahn, Ewha University
Sam H. Noh, UNIST
Write Amplification in SSD
- An undesirable phenomenon associated with flash memory: the number of writes to storage is higher than the number of writes issued by the host
- A key aspect limiting the stable performance and endurance of SSDs
- Performance fluctuates! (Source: Radian Memory Systems)
Write Amplification in SSD
- Garbage collection is performed to recycle used blocks: valid pages in a victim block are copied out into a free block
- Example: the host writes 4 pages (B', F', G', H'), but GC must also copy the 4 valid pages (A, C, D, E) out of the victim block, so the flash performs 8 page writes for 4 host writes: 2x writes!
- Write amplification factor: 2.0
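The arithmetic in the example above reduces to the standard definition, WAF = total flash writes / host-issued writes (a generic formulation, not vendor-specific):

```python
# Write amplification factor for the GC example above.
host_writes = 4          # B', F', G', H' issued by the host
gc_copies   = 4          # valid pages A, C, D, E copied during GC
waf = (host_writes + gc_copies) / host_writes
print(waf)               # -> 2.0
```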