Harmonizing Performance and Isolation in Microkernels with Efficient Intra-kernel Isolation and Communication Jinyu Gu, Xinyue Wu, Wentai Li, Nian Liu, Zeyu Mi, Yubin Xia, Haibo Chen
Monolithic Kernel and Microkernel 2
Monolithic Kernel and Microkernel Microkernel’s philosophy: Moving most OS components into isolated user processes 3
Benefits and Usages of Microkernel • Achieves good extensibility, security, and fault isolation • Succeeds in safety-critical scenarios (Airplane, Car) • For more general-purpose applications (Google Zircon) 4
Expensive Communication Cost • Tradeoff: Performance and Isolation – Inter-process communication (IPC) overhead [Figure: an App, a File System server, and a Disk Driver server connected by IPC through the Microkernel] 5
IPC Overhead is Considerable • Direct cost: privilege switch, process switch, … • Indirect cost: pollution of CPU internal structures [Figure: stacked bars (0% to 100%) splitting time into IPC cost and real work in servers; workload labels: SQLite, xv6FS, Ramdisk; microkernels: Zircon and seL4, w/ KPTI and w/o KPTI] Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU 6
Goal: Both Ends • Harmonize the tension between Performance and Isolation in microkernels – Reducing the IPC overhead – Maintaining the isolation guarantee 7
New Hardware Brings Opportunities • PKU: Protection Keys for Userspace (a.k.a. MPK) – Assign each page one PKEY (i.e., memory domain ID) in [0:15] – A new register, PKRU, stores the read/write permissions for each PKEY 8
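The PKRU layout behind this slide can be sketched as follows. This is an illustrative model, not code from the talk: it assumes the architectural layout in which key i owns bit 2i (access-disable) and bit 2i+1 (write-disable) of the 32-bit PKRU; the helper names `pkru_bits`, `can_read`, `can_write`, and `make_pkru` are hypothetical.

```python
NUM_PKEYS = 16  # PKEYs 0..15, one per page


def pkru_bits(pkey):
    """Return (access_disable_mask, write_disable_mask) for one PKEY."""
    if not 0 <= pkey < NUM_PKEYS:
        raise ValueError("pkey must be in 0..15")
    return 1 << (2 * pkey), 1 << (2 * pkey + 1)


def can_read(pkru, pkey):
    """Reads are allowed unless the access-disable bit is set."""
    access_disable, _ = pkru_bits(pkey)
    return not (pkru & access_disable)


def can_write(pkru, pkey):
    """Writes need both the access-disable and write-disable bits clear."""
    access_disable, write_disable = pkru_bits(pkey)
    return not (pkru & (access_disable | write_disable))


def make_pkru(readable=(), writable=()):
    """Build a PKRU value that denies every key except the listed ones."""
    pkru = 0xFFFFFFFF  # start with all 16 keys fully disabled
    for key in set(readable) | set(writable):
        access_disable, _ = pkru_bits(key)
        pkru &= ~access_disable
    for key in writable:
        _, write_disable = pkru_bits(key)
        pkru &= ~write_disable
    return pkru & 0xFFFFFFFF
```

For example, `make_pkru(readable=[1], writable=[2])` yields a value under which key 1 is read-only, key 2 is read-write, and every other key is inaccessible.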
Efficient Intra-Process Isolation • ERIM [Security’19] & Hodor [ATC’19] – Based on Intel PKU – Build isolated domains in the same process efficiently – Domain switch only takes 28 cycles (modify PKRU) [Figure: one process split into an App part, Library-1, and Library-2] 9
Intra-Process Isolation + Microkernel [Figure: Apps and System Servers (Drv, FS, MM, Net, …) above a Microkernel (Process, IPC, Sched) on Hardware, with Intel PKU applied to the System Servers] 10
Design Choice #1 • Isolate different system servers in a single process [Figure: an App next to Server-1, Server-2, Server-3, … in isolated domains above the Microkernel, annotated “Just as traditional IPCs”] 11
Design Choice #2 Let’s get more aggressive! [Figure: App-1 and App-2, each with its own Server-1, Server-2, Server-3, … above the Microkernel] Drawbacks: 1. Updating server mappings is costly 2. IPC connection is also costly 3. Less flexibility for applications in their use of the address space and PKU 12
An Observation on Intel PKU • A misleading name – Protection Keys for Userspace • It still takes effect in kernel mode (ring 0) – “Userspace” means user-accessible memory – Determined by the U/S (user/supervisor) bit in the PTE 13
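The observation can be made concrete with a small model of a page-table entry. This is a simplified sketch under stated assumptions: it looks only at a leaf x86-64 PTE, uses the architectural positions of the present bit (bit 0), the user/supervisor bit (bit 2), and the protection key (bits 62:59), and ignores instruction fetches, which protection keys do not restrict. The helper names are hypothetical.

```python
PTE_PRESENT = 1 << 0
PTE_USER = 1 << 2   # user/supervisor bit: 1 = user-accessible page
PKEY_SHIFT = 59     # the protection key occupies PTE bits 62:59
PKEY_MASK = 0xF


def make_pte(frame, pkey, user):
    """Assemble a simplified leaf PTE from a frame number, key, and U/S flag."""
    pte = (frame << 12) | PTE_PRESENT | ((pkey & PKEY_MASK) << PKEY_SHIFT)
    if user:
        pte |= PTE_USER
    return pte


def pte_pkey(pte):
    """Extract the protection key stored in the PTE."""
    return (pte >> PKEY_SHIFT) & PKEY_MASK


def pku_applies(pte):
    """PKU checks data accesses to present, user-accessible pages.

    The current privilege level is deliberately not an input: the check
    depends on how the page is mapped, not on which ring touches it.
    """
    return bool(pte & PTE_PRESENT) and bool(pte & PTE_USER)
```

Under this model, a page mapped as user-accessible with key 5 stays subject to the PKRU check even when the access comes from ring 0, while a supervisor-only page is never checked against PKRU.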
UnderBridge: Sinking System Servers [Figure: Apps above; System Servers (Drv, FS, MM, Net, …) placed alongside the Microkernel under Intel PKU intra-kernel isolation, on Hardware] 14
Design Choice #3: UnderBridge • Build execution domains in the kernel page table [Figure: Apps in user space; in kernel space, the Microkernel in Dom-0 and Server-1, Server-2, Server-3 in Dom-1, Dom-2, Dom-3] 15
Execution Domain • Execution domain 0 is for the microkernel – Uses memory domain 0 – Can access all the memory • Others own a private memory domain – A private MPK memory domain ID • Shared memory – Allocate a free MPK memory domain ID [Figure: Microkernel in Dom-0; Server-1 in Dom-1 and Server-2 in Dom-2] 16
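One way to picture the domain assignment on this slide is the bookkeeping sketch below. It is an assumption-laden illustration, not ChCore's implementation: the `DomainAllocator` class, its methods, and the first-free allocation order are hypothetical, and PKRU values use the two-bits-per-key layout (access-disable, write-disable).

```python
NUM_KEYS = 16          # PKU provides 16 memory domains (keys 0..15)
KERNEL_KEY = 0         # memory domain 0 belongs to the microkernel
MICROKERNEL_PKRU = 0   # no disable bits set: domain 0 can access all memory


class DomainAllocator:
    """Hypothetical bookkeeping for MPK memory domain IDs."""

    def __init__(self):
        self.free = list(range(1, NUM_KEYS))  # key 0 is never handed out
        self.private = {}                     # server name -> private key
        self.shared = {}                      # frozenset of names -> shared key

    def _take(self):
        if not self.free:
            raise RuntimeError("no free memory domain ID")
        return self.free.pop(0)

    def add_server(self, name):
        """Give a server its private memory domain."""
        self.private[name] = self._take()
        return self.private[name]

    def share(self, a, b):
        """Allocate a free memory domain for memory shared by two servers."""
        key = self._take()
        self.shared[frozenset((a, b))] = key
        return key

    def pkru_for(self, name):
        """PKRU for a server: its private key plus its shared keys only."""
        allowed = {self.private[name]}
        allowed |= {k for pair, k in self.shared.items() if name in pair}
        pkru = 0
        for key in range(NUM_KEYS):
            if key not in allowed:
                pkru |= 0b11 << (2 * key)  # access-disable + write-disable
        return pkru


def allows(pkru, key):
    """True if neither disable bit is set for this key."""
    return ((pkru >> (2 * key)) & 0b11) == 0
```

With two servers and one shared region, each server's PKRU opens exactly two keys, and memory domain 0 stays closed to both.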
IPC Gate • Connect two servers – Generated by the microkernel – Resides in memory domain 0 (execute-only for servers) • Transfer control flow during IPC invocations – Context switch and domain switch • Connect the microkernel and servers – System calls [Figures: a gate between Server-1 (Dom-1) and Server-2 (Dom-2); a gate between Server-2 (Dom-2) and the Microkernel (Dom-0)] 17
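The control transfer through a gate can be modelled roughly as below. This is only a toy simulation of the domain switch: the `Cpu` class and `make_gate` function are hypothetical, the context switch (stack and register state) is omitted, and a real gate is a short piece of machine code generated by the microkernel rather than a Python closure.

```python
class Cpu:
    """Minimal stand-in for the per-core state that matters here."""

    def __init__(self, pkru=0):
        self.pkru = pkru
        self.trace = []  # record of domain switches, for inspection


def make_gate(cpu, callee_pkru, handler):
    """Build a gate that runs `handler` under the callee's PKRU value."""

    def gate(arg):
        caller_pkru = cpu.pkru          # remember the caller's domain
        cpu.pkru = callee_pkru          # domain switch into the callee
        cpu.trace.append(("enter", callee_pkru))
        try:
            return handler(arg)
        finally:
            cpu.pkru = caller_pkru      # switch back on return
            cpu.trace.append(("return", caller_pkru))

    return gate
```

Because protection keys do not restrict instruction fetches, gate code placed in memory domain 0 can be executed by a server whose PKRU denies data access to key 0, which matches the "execute-only for servers" property on the slide.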
Server Migration • The number of execution domains is limited – Hardware provides only 16 memory domains – Time-multiplexing is expensive • Move servers between user and kernel space – Disjoint virtual memory regions – Runtime migration 18
Privilege Deprivation • In-kernel servers have supervisor privilege – Can affect the whole system if compromised – CFI (with binary scanning) incurs runtime overhead – Binary rewriting alone is infeasible • Prevent servers from executing privileged instructions – Add a tiny secure monitor in hypervisor mode – For rarely executed instructions: VM exits – For frequently required instructions: rewriting 19
Other Designs and Implementations • IPC capability authentication • Seamless server migration • Privilege deprivation details 20
Cross-server IPC Round-Trip Latency [Figure: bar chart of round-trip latency in cycles for Monolithic, ChCore (UnderBridge), SkyBridge, seL4 (with and without KPTI), Fiasco.OC (with and without KPTI), and Zircon; bar values in the extracted text: 24, 109, 437, 1450, 2035, 3057, 4145, 8151] Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU 21
SQLite Throughput under YCSB-A [Figure: bar chart of throughput (axis 0 to 10) on Zircon, Fiasco.OC, and seL4; legend labels: Native w/ KPTI, UnderBridge, Native w/o KPTI, Monolithic, SkyBridge, Monolithic w/o KPTI; annotation: 1× ∼ 8×] Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU 22
Conclusion & Thanks! • UnderBridge – A redesign of the runtime structure of microkernel OSes for faster OS services – The efficient intra-kernel isolation mechanism may also be used to harden the isolation of monolithic kernels Q&A: gujinyu@sjtu.edu.cn 23