Fast Write Protection Xiao Guangrong - PowerPoint PPT Presentation

Fast Write Protection Xiao Guangrong <xiaoguangrong@tencent.com>

Agenda • Background • Challenges • Fast write protection • Dirty bitmap • Evaluation • Future plan

Background
• Live migration is a key feature for cloud providers, e.g., Tencent Cloud
  • Load balancing
  • Error recovery
  • Maintainability
  • Etc.

Background (Cont.)
• Write protection is a key performance dependency for live migration
[Figure: dirty page tracking flow. On a write access from the VM to write-protected guest memory (#PF/EPT-violation): 1. VM-Exit; 2. Make memory writable; 3. Set bit in the dirty bitmap; 4. VM-Entry. On every iteration of memory migration: 1. Copy and clear the dirty bitmap; 2. Write protect memory.]

Challenges
• Current write protection implementation
  • It is based on SPTE RMAP (Shadow Page Table Entry Reverse MAPping)
[Figure: rmap layout. There is an *rmap[ ] array per page size (4k pages 1..N, 2M pages 1..N, other huge pages). If a page has only 1 SPTE, its rmap entry is the SPTE pointer. If it has multiple SPTEs, the entry points to a struct pte_list_desc (rmap = pte_list_desc | 0x1) that holds SPTE pointers and a "more" link to the next descriptor; NULL indicates termination.]

Challenges (Cont.)
• It traverses the rmaps of all memslots and makes the SPTEs read-only one by one
• It is not scalable, as the cost depends on the size of the VM's memory
• Worse, it needs to hold mmu-lock
  • mmu-lock is a big, hot lock, as it is contended by all vCPUs to update the shadow page table
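The rmap layout and the one-by-one traversal described above can be modelled roughly as follows. This is a minimal user-space sketch, not the kernel code: `rmap_t`, `PTE_LIST_EXT`, `SPTE_WRITABLE`, `rmap_write_protect` and `write_protect_all` are hypothetical names chosen for illustration, and only `struct pte_list_desc` and the `| 0x1` tag come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_LIST_EXT  3                   /* hypothetical number of slots per descriptor */
#define SPTE_WRITABLE (UINT64_C(1) << 1)  /* hypothetical position of the writable bit */

/* Descriptor used when one page is mapped by several SPTEs. */
struct pte_list_desc {
    uint64_t *sptes[PTE_LIST_EXT];  /* a NULL slot ends the used entries */
    struct pte_list_desc *more;     /* next descriptor, NULL terminates the chain */
};

/* One rmap slot: 0 (unmapped), a plain SPTE pointer, or (descriptor | 0x1). */
typedef uintptr_t rmap_t;

static rmap_t rmap_from_desc(struct pte_list_desc *desc)
{
    assert(((uintptr_t)desc & 0x1) == 0);  /* low bit must be free for the tag */
    return (uintptr_t)desc | 0x1;
}

/* Clear the writable bit of every SPTE behind one rmap slot; return how many were visited. */
static size_t rmap_write_protect(rmap_t rmap)
{
    size_t visited = 0;

    if (rmap == 0)
        return 0;
    if (!(rmap & 0x1)) {                   /* single SPTE case */
        *(uint64_t *)rmap &= ~SPTE_WRITABLE;
        return 1;
    }
    for (struct pte_list_desc *desc = (struct pte_list_desc *)(rmap & ~(uintptr_t)0x1);
         desc; desc = desc->more) {
        for (int i = 0; i < PTE_LIST_EXT && desc->sptes[i]; i++) {
            *desc->sptes[i] &= ~SPTE_WRITABLE;
            visited++;
        }
    }
    return visited;
}

/* The cost is linear in the number of guest pages: every rmap slot is visited. */
static size_t write_protect_all(const rmap_t *rmaps, size_t npages)
{
    size_t visited = 0;

    for (size_t i = 0; i < npages; i++)
        visited += rmap_write_protect(rmaps[i]);
    return visited;
}
```

Calling `write_protect_all()` on an array with one slot per guest page touches every mapped SPTE, which illustrates why the cost grows with guest memory size. The real code additionally does this work under mmu-lock, which the sketch does not model.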

Fast write protection
• Overview
[Figure: original write protection vs. fast write protection. Both write protect all memory; with fast write protection, the write protection is moved on demand by #PF. Legend: page, write-protected entry, writable entry.]

Fast write protection (Cont.)
• The basic idea was raised by Avi Kivity in ~2011 during my vMMU development
• Extremely fast
  • An O(1) algorithm
  • Does not depend on the capacity of guest memory
• Lockless
  • Does not require mmu-lock
  • Does not hurt the parallelism of vCPUs

Fast write protection: Implementation
• A new API, KVM_WRITE_PROTECT_ALL_MEM, is introduced
• A global write-protect indicator is introduced
  • To make it lockless, the indicator is split into two parts within one 64-bit value (bit 63 to bit 0): an "enable write-protect all" flag and a generation number
• A write-protect-all generation number is added to the shadow page table (struct kvm_mmu_page)
  • It is synced with the global generation number and used to check whether write protection is needed
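A minimal sketch of how such a two-part indicator could be packed into a single atomic 64-bit word. The exact bit assignment (flag in bit 63, generation number in bits 62..0) is an assumption, and all names here (`wp_all_indicator`, `wp_all_start`, and so on) are hypothetical rather than taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: bit 63 = "enable write-protect all", bits 62..0 = generation number. */
#define WP_ALL_ENABLE   (UINT64_C(1) << 63)
#define WP_ALL_GEN_MASK (WP_ALL_ENABLE - 1)

static _Atomic uint64_t wp_all_indicator;  /* starts as 0: disabled, generation 0 */

/* Set the enable flag and bump the generation in one atomic update, without a lock. */
static uint64_t wp_all_start(void)
{
    uint64_t old = atomic_load(&wp_all_indicator);
    uint64_t new;

    do {
        new = WP_ALL_ENABLE | ((old + 1) & WP_ALL_GEN_MASK);
    } while (!atomic_compare_exchange_weak(&wp_all_indicator, &old, new));

    return new & WP_ALL_GEN_MASK;  /* the new generation number */
}

static bool wp_all_enabled(uint64_t indicator)
{
    return (indicator & WP_ALL_ENABLE) != 0;
}

static uint64_t wp_all_gen(uint64_t indicator)
{
    return indicator & WP_ALL_GEN_MASK;
}
```

Keeping both parts in one word means a reader gets a consistent (flag, generation) pair from a single atomic load, which is one plausible reason for the split described on the slide.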

Fast write protection: Implementation (Cont.)
Migration thread:
  1. ioctl(KVM_WRITE_PROTECT_ALL_MEM)
  2. global-gen-num++
  3. Kick all vCPUs and ask them to reload their root page tables
vCPU:
  1. VM-Exit
  2. Reload its root page table
  3. VM-Entry
Reload root page table:
  if (gen-number of shadow page != global-gen-num) {
      write protect all entries
      update shadow page's gen-num
  }
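The generation check in the reload step can be sketched as below. This is a simplified user-space model under stated assumptions: `struct shadow_page`, `ENTRIES`, `SPTE_WRITABLE` and `sync_write_protect` are hypothetical names, and the real shadow page structure is `struct kvm_mmu_page`.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES       512                 /* entries in one page table page */
#define SPTE_WRITABLE (UINT64_C(1) << 1)  /* hypothetical position of the writable bit */

struct shadow_page {
    uint64_t wp_all_gen;     /* generation this page was last synced to */
    uint64_t spte[ENTRIES];
};

/*
 * Bring one shadow page up to date with the global generation number.
 * Returns the number of entries that lost write permission.
 */
static int sync_write_protect(struct shadow_page *sp, uint64_t global_gen)
{
    int changed = 0;

    if (sp->wp_all_gen == global_gen)
        return 0;  /* already write protected for this generation */

    for (int i = 0; i < ENTRIES; i++) {
        if (sp->spte[i] & SPTE_WRITABLE) {
            sp->spte[i] &= ~SPTE_WRITABLE;
            changed++;
        }
    }
    sp->wp_all_gen = global_gen;
    return changed;
}
```

Only the root page is touched at reload time, so the work done per vCPU is bounded by one page table page regardless of how much memory the guest has.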

Fast write protection: Implementation (Cont.)
• For the page fault handler
  1. Fault on a write-protected entry
  2. Write protect all entries in the lower-level page table, based on its gen-num and global-gen-num
  3. Make the fault entry writable
  4. Repeat until all fault entries are writable
[Figure legend: write-protected entry, writable entry.]
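The fault-driven push-down of write protection can be illustrated with a toy multi-level table. This is a sketch under simplifying assumptions (a tiny fan-out, an `int` flag instead of SPTE bits, no locking); `struct sp`, `FANOUT`, `sync_gen` and `fix_write_fault` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FANOUT 4  /* tiny hypothetical fan-out; real page tables have 512 entries */

struct sp {
    uint64_t gen;              /* write-protect-all generation of this page */
    int writable[FANOUT];      /* 1 = entry is writable */
    struct sp *child[FANOUT];  /* lower-level page, NULL at the last level */
};

/* Write protect all entries of a page whose generation is stale. */
static void sync_gen(struct sp *sp, uint64_t global_gen)
{
    if (sp->gen == global_gen)
        return;
    for (int i = 0; i < FANOUT; i++)
        sp->writable[i] = 0;
    sp->gen = global_gen;
}

/*
 * Handle a write fault along the path idx[0..levels-1], starting at the root.
 * Before an upper-level entry is made writable, the lower-level page it points
 * to is write protected, so protection moves down one level at a time.
 * Returns the number of entries that were made writable.
 */
static int fix_write_fault(struct sp *root, const int *idx, int levels,
                           uint64_t global_gen)
{
    int fixed = 0;
    struct sp *sp = root;

    sync_gen(root, global_gen);  /* normally already done when the root is reloaded */
    for (int l = 0; l < levels && sp; l++) {
        struct sp *child = sp->child[idx[l]];

        if (!sp->writable[idx[l]]) {
            if (child)
                sync_gen(child, global_gen);
            sp->writable[idx[l]] = 1;
            fixed++;
        }
        sp = child;
    }
    return fixed;
}
```

Pages that are never written after the generation bump keep a stale generation and are never visited, since the protected upper-level entry already blocks writes through them.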

Fast write protection: Implementation (Cont.)
• For a newly created shadow page, we can simply set its write-protect generation number to the global generation number
• To speed up making all entries of a shadow page read-only, we add two new fields to the shadow page table
  • possible_writable_spte_bitmap, which indicates the writable sptes
  • possiable_writable_sptes, a counter of the number of writable sptes in the shadow page
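A rough sketch of how a bitmap plus a counter can shortcut the "make all entries read-only" step. It is a user-space model, not the patch itself: the struct, `mark_writable`, `write_protect_page` and `SPTE_WRITABLE` are hypothetical, and the counter is spelled `possible_writable_sptes` here purely for readability.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES       512
#define BITS_PER_WORD 64
#define SPTE_WRITABLE (UINT64_C(1) << 1)  /* hypothetical position of the writable bit */

struct shadow_page {
    uint64_t spte[ENTRIES];
    uint64_t possible_writable_spte_bitmap[ENTRIES / BITS_PER_WORD];
    unsigned possible_writable_sptes;  /* number of bits set in the bitmap */
};

/* Make entry i writable and record it in the bitmap and the counter. */
static void mark_writable(struct shadow_page *sp, unsigned i)
{
    assert(i < ENTRIES);
    uint64_t bit = UINT64_C(1) << (i % BITS_PER_WORD);

    sp->spte[i] |= SPTE_WRITABLE;
    if (!(sp->possible_writable_spte_bitmap[i / BITS_PER_WORD] & bit)) {
        sp->possible_writable_spte_bitmap[i / BITS_PER_WORD] |= bit;
        sp->possible_writable_sptes++;
    }
}

/* Write protect only the recorded entries; return how many sptes were visited. */
static unsigned write_protect_page(struct shadow_page *sp)
{
    unsigned visited = 0;

    if (sp->possible_writable_sptes == 0)
        return 0;  /* nothing writable: skip the page entirely */

    for (unsigned w = 0; w < ENTRIES / BITS_PER_WORD; w++) {
        uint64_t word = sp->possible_writable_spte_bitmap[w];

        for (unsigned b = 0; word; b++, word >>= 1) {
            if (word & 1) {
                sp->spte[w * BITS_PER_WORD + b] &= ~SPTE_WRITABLE;
                visited++;
            }
        }
        sp->possible_writable_spte_bitmap[w] = 0;
    }
    sp->possible_writable_sptes = 0;
    return visited;
}
```

With the counter at zero the page is skipped without reading any entry, and otherwise only the words of the bitmap with set bits lead to SPTE updates.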

Dirty bitmap
• One call of KVM_WRITE_PROTECT_ALL_MEM write protects all VM memory, so KVM_GET_DIRTY_LOG no longer needs to do write protection
• A new flag is introduced to KVM_GET_DIRTY_LOG to ask KVM to skip write protection
  • KVM_DIRTY_LOG_WITHOUT_WRITE_PROTECT
• In fact, that opens up opportunities to speed up KVM_GET_DIRTY_LOG
  • Now, it just copies the bitmap from kernel to userspace

Dirty bitmap: omit KVM_GET_DIRTY_LOG
• Share the bitmap between userspace and KVM
• Userspace and KVM operate on the bitmap asynchronously and atomically, i.e., move the operation in the current KVM_GET_DIRTY_LOG to userspace
Userspace, fetch bitmap:
  for (i = 0; i < n / sizeof(long); i++) {
      mask = xchg(&dirty_bitmap[i], 0);
      Saved_dirty_bitmap_buffer[i] = mask;
  }
KVM, mark_page_dirty:
  set_bit_le(gfn_index, memslot->dirty_bitmap);
• Avoiding xchg is also possible (by introducing double dirty bitmaps and switching them while fetching dirty bits?)
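The producer and consumer sides of such a shared bitmap can be approximated in portable C11, with `atomic_exchange` standing in for the kernel's `xchg` and `atomic_fetch_or` standing in for `set_bit_le`. This is a sketch only: it uses native bit order rather than little-endian bit numbering, and `fetch_dirty_bitmap` and `BITS_PER_LONG` as defined here are illustrative names.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Producer side (KVM in the slides): atomically set the bit for one guest page. */
static void mark_page_dirty(_Atomic unsigned long *bitmap, size_t gfn_index)
{
    atomic_fetch_or(&bitmap[gfn_index / BITS_PER_LONG],
                    1UL << (gfn_index % BITS_PER_LONG));
}

/*
 * Consumer side (userspace in the slides): atomically take and clear each word,
 * so a bit set concurrently is either seen now or left for the next fetch.
 * Returns the number of dirty pages collected.
 */
static size_t fetch_dirty_bitmap(_Atomic unsigned long *bitmap,
                                 unsigned long *saved, size_t nwords)
{
    size_t dirty = 0;

    for (size_t i = 0; i < nwords; i++) {
        unsigned long mask = atomic_exchange(&bitmap[i], 0UL);

        saved[i] = mask;
        for (; mask; mask &= mask - 1)  /* count set bits */
            dirty++;
    }
    return dirty;
}
```

Because each word is exchanged atomically, no dirty bit is lost between the read and the clear, which is the property the separate copy-and-clear in KVM_GET_DIRTY_LOG otherwise has to provide.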

Evaluation
• When we did the evaluation, the shared bitmap had not been implemented yet
• The following cases are based on a VM with 3G memory + 12 vCPUs
• Case 1: evaluate the time for KVM_GET_DIRTY_LOG

             Before      After     Result
  Time (ns)  64289121    137654    +46603%

Evaluation
• Case 2: evaluate the time to make all memory writable after write protection

             Before       After        Result
  Time (ns)  281735017    291150923    -3%

• Performance drops due to
  • a) fast page fault, which locklessly fixes #PF on the last level of the shadow page table: before our work it is completely lockless; after our work, it needs mmu-lock to make the upper levels writable
  • b) a little time is needed to move write protection from upper levels to lower levels
• We think it is acceptable, particularly as mmu-lock contention (caused by write protection) is not taken into account in this case

Evaluation (Cont.)
• The following cases are for a VM with 30G memory and 8 vCPUs; during live migration, a memory benchmark that repeatedly writes 3000M of memory is running in the VM
• Case 3: a newly booted VM, which means mmu-lock is required to map physical memory into the shadow page table

                                Before    After     Result
  Dirty page rate (pages)       333092    497266    +49%
  Total time of live migration  12532     18467     -47%

• As fast write protection reduces mmu-lock contention, the VM writes memory more efficiently than before
• Unsurprisingly, as more dirty pages are generated, more time is needed to migrate memory

Evaluation (Cont.)
• Case 4: a pre-written VM, which means all memory is mapped in, so fast page fault can directly make the page table writable on the last level without holding mmu-lock

                                Before    After     Result
  Dirty page rate (pages)       447435    449284    +0%
  Total time of live migration  31068     28310     +47%

• We also noticed that the first dirty log takes 156 ms before our work and only 6 ms after it

Future plan
• Currently, v2 of fast write protection has been posted
  • https://lkml.org/lkml/2017/6/20/274
• Ask Paolo, Marcelo, Radim and others to comment on it and push it upstream
• Enable it on the QEMU side
• Think through the shared dirty bitmap carefully and enable it
• Others…

Thanks!
