Dynamic and Adaptive Updates of Non-Quiescent Subsystems in Commodity OS Kernels

Kristis Makris <kristis.makris@asu.edu>, Arizona State University
Kyung Dong Ryu <kryu@us.ibm.com>, IBM T.J. Watson Research Center

March 23, 2007
DynAMOS -- EuroSys '07

Overview
  Motivation
  Dynamic Kernel Updates Categorization
  System Architecture
  Adaptive Function Cloning
  Synchronized Updates
  Applications
  Conclusion

Motivation
  Dynamic kernel updates are essential
  Existing updating methods are inadequate
  Two approaches
    – Build adaptable OS
        Specially crafted (K42, VINO, Synthetix)
        Require OS and application restructuring
    – Dynamic code instrumentation
        No kernel source modification (KernInst, GILK)
        Basic block code interposition
  Currently limited
    – No procedure replacement
    – No autonomous kernel adaptability
    – No safe, complete subsystem update guarantees

Dynamic Updates Categorization (1)
  Updating variable values
    – Update an entry in system call table
    – Update owner (uid) of an inode
        Needs synchronized update
    – Count number of system calls of a process
        Needs state tracking
  Updating datatypes
    – Add new fields in Linux PCB for process checkpointing
        Update all functions that use the old datatype, or
        Maintain new fields in separate data structure
    – Does not need state transfer

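The "maintain new fields in separate data structure" option can be pictured as a side table keyed by process ID, so that the original PCB layout stays untouched. The following is a minimal user-space sketch under that assumption, not DynAMOS code; `shadow_fields`, `shadow_get`, `checkpoint_count`, and the fixed table size are all hypothetical names chosen for illustration.

```c
#include <stddef.h>

#define SHADOW_SLOTS 64   /* hypothetical fixed capacity */

/* Extra per-process fields kept outside the original PCB (illustrative). */
struct shadow_fields {
    int in_use;
    int pid;
    unsigned long checkpoint_count;
};

static struct shadow_fields shadow_table[SHADOW_SLOTS];

/*
 * Look up the shadow entry for pid. If none exists and create is nonzero,
 * claim the first free slot. Returns NULL when the entry is absent and
 * was not (or could not be) created.
 */
struct shadow_fields *shadow_get(int pid, int create)
{
    struct shadow_fields *free_slot = NULL;

    for (size_t i = 0; i < SHADOW_SLOTS; i++) {
        if (shadow_table[i].in_use && shadow_table[i].pid == pid)
            return &shadow_table[i];
        if (!shadow_table[i].in_use && free_slot == NULL)
            free_slot = &shadow_table[i];
    }
    if (!create || free_slot == NULL)
        return NULL;

    free_slot->in_use = 1;
    free_slot->pid = pid;
    free_slot->checkpoint_count = 0;
    return free_slot;
}
```

Updated functions would call `shadow_get(pid, 1)` wherever they need the new fields, while functions that only know the old datatype keep working unchanged.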
Dynamic Updates Categorization (2)
  Updating single function
    – Correct a defect
  Updating kernel threads
    – Update memory paging subsystem
        Needs update during infinite loop
  Updating function groups
    – Update pipefs subsystem
        Needs synchronized update

Our Approach
  DynAMOS
    – Prototype for i386 Linux 2.2-2.6
  Dynamic code instrumentation
    – No kernel source modification or reboot
    – Procedure replacement
  Adaptive updates
    – Concurrent execution of multiple versions
    – State tracking
    – Autonomous kernel adaptability
  Safe updates of complete subsystems
    – Quiescence detection
    – Update synchronization (non-quiescent subsystems)
    – Datatype updates
    – State transfer

DynAMOS System Architecture
[Diagram shown in six builds; recoverable labels, by build:]
  1. Build: update source in the kernel source; make, gcc, and ld produce an object file; insert module places new function images beside the original function images (vmlinux) in the unmodified kernel in memory.
  2. Load DynAMOS: the DynAMOS kernel module is loaded.
  3. Initiate update: the update tool talks to the kernel module through /dev/dynamos; the kernel module contains a version manager.
  4. Prepare update: copy image and relocation produce cloned new function images from the new function images; the kernel module also contains a disassembler.
  5. Cloned new function images now sit in kernel memory with the new and original function images.
  6. Activate update: redirection is set up between the original function images and the cloned new function images.

Execution Flow Redirection (step 1)
  Apply Linger-Longer scheduler
    – Unobtrusive fine-grain cycle stealing
    – Implemented in schedule_LL as a scheduling policy
[Diagram: caller ( ... call schedule ... ) -> schedule]

Execution Flow Redirection (step 2)
  Trampoline installation
    – Disable processor interrupts
    – Flush I-cache
  Indirect jump
    – Don't modify page permissions
[Diagram: caller ( ... call schedule ... ) -> trampoline in schedule -> jmp * -> redirection handler]

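To make the indirect-jump trampoline concrete, here is a minimal user-space sketch (not DynAMOS code) that encodes the 6-byte i386 `jmp *addr` instruction into a byte buffer: opcode bytes FF 25 followed by the 32-bit absolute address of the memory word that holds the jump target. The names `encode_indirect_jmp` and `slot_addr` are hypothetical. An actual installer would write these bytes over the function entry with interrupts disabled and flush the I-cache afterwards, as the slide notes.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of the i386 "jmp *m32" instruction: FF 25 + 4-byte absolute address. */
#define TRAMPOLINE_SIZE 6

/*
 * Write "jmp *slot_addr" into buf (hypothetical helper, illustration only).
 * slot_addr is the 32-bit address of a memory word holding the jump target.
 * Returns the number of bytes written, or 0 if buf is missing or too small.
 */
size_t encode_indirect_jmp(uint8_t *buf, size_t buf_len, uint32_t slot_addr)
{
    if (buf == NULL || buf_len < TRAMPOLINE_SIZE)
        return 0;

    buf[0] = 0xFF;                                 /* opcode group 5 */
    buf[1] = 0x25;                                 /* ModRM: /4 (jmp), disp32 operand */
    buf[2] = (uint8_t)(slot_addr & 0xFF);          /* address, little-endian */
    buf[3] = (uint8_t)((slot_addr >> 8) & 0xFF);
    buf[4] = (uint8_t)((slot_addr >> 16) & 0xFF);
    buf[5] = (uint8_t)((slot_addr >> 24) & 0xFF);
    return TRAMPOLINE_SIZE;
}
```

Because the jump goes through a memory slot, retargeting the function later only requires rewriting that slot, not the instruction bytes at the function entry.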
Execution Flow Redirection (step 2)
  Bookkeeping
    – Maintain use counters
  User-defined adaptation handler
    – Execute if available
    – Select active version of function
[Diagram: the redirection handler preserves state, performs bookkeeping, calls the adaptation handler (which returns with ret), and restores state]

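The bookkeeping and version selection described on this slide can be sketched in portable C as a dispatch through a table of function versions. This is a single-threaded, user-space illustration only: the real handler works at the machine-code level and jumps rather than calls. All names here (`versioned_func`, `redirect_call`, the toy functions) are hypothetical.

```c
#include <stddef.h>

#define MAX_VERSIONS 4

typedef int (*func_t)(int);
/* User-defined adaptation handler: returns the index of the version to run. */
typedef size_t (*adapt_handler_t)(void);

struct versioned_func {
    func_t versions[MAX_VERSIONS];          /* versions[0] is the original */
    unsigned long use_count[MAX_VERSIONS];  /* callers currently inside each version */
    size_t n_versions;
    size_t active;                          /* default active version */
    adapt_handler_t adapt;                  /* optional; may be NULL */
};

/* Toy stand-ins for two versions of a function and an adaptation handler. */
int toy_original(int x) { return x + 1; }
int toy_updated(int x) { return x * 2; }
size_t pick_updated(void) { return 1; }

/* Select a version, maintain its use counter around the call, and run it. */
int redirect_call(struct versioned_func *vf, int arg)
{
    size_t v = vf->active;

    /* Execute the adaptation handler if one is available. */
    if (vf->adapt != NULL) {
        size_t choice = vf->adapt();
        if (choice < vf->n_versions)
            v = choice;
    }

    vf->use_count[v]++;                 /* bookkeeping on entry */
    int result = vf->versions[v](arg);
    vf->use_count[v]--;                 /* bookkeeping on exit */
    return result;
}
```

With `adapt` left as NULL the default `active` version runs; installing `pick_updated` switches callers to the second version without touching the call sites.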
Execution Flow Redirection (step 3)
[Diagram: the redirection handler jumps to the active function with jmp *; the versions shown are schedule_clone and schedule_LL_clone]

Execution Flow Redirection (step 4)
[Diagram: schedule_clone and schedule_LL_clone each jump back to the redirection handler]

Execution Flow Redirection (step 5)
[Diagram: the redirection handler preserves state, performs bookkeeping, restores state, and returns to the caller with ret]

Adaptive Function Cloning Benefits
  No processor state saved on stack
    – Function arguments accessed directly
  Autonomous kernel determination of update timeliness
    – Using adaptation handler
  Function-level updates
    – Basic blocks can be bypassed (no control-flow graph needed)
    – Function modifications developed in original source language

Function Relocation Issues
  Replace ret (1-byte) with jmp * (6-byte) back to handler
    – Adjust inbound (jmp) and outbound (call) relative offsets
  Safely detect
    – Backward branches: jmp to code overwritten by trampoline
    – Outbound branches: jmp to code outside function image
    – Indirect outbound branches: jmp * from indirection table
    – Data-in-code
  Need user verification
    – Multiple entry-points: e.g. produced by Intel C Compiler

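The offset adjustment for outbound calls follows from how i386 encodes relative branches: the stored 32-bit displacement is the target address minus the address of the next instruction. The sketch below (user-space, illustrative, with hypothetical helper names) recomputes the displacement of a relative call after its containing function has been copied to a new address while the call target stays where it was. It relies on the two's-complement wraparound of 32-bit arithmetic that i386 itself uses.

```c
#include <stdint.h>

/* Displacement stored in a rel32 branch: target minus address of next instruction. */
int32_t rel32_for(uint32_t insn_addr, uint32_t insn_len, uint32_t target)
{
    return (int32_t)(target - (insn_addr + insn_len));
}

/* Absolute target of a rel32 branch located at insn_addr. */
uint32_t rel32_target(uint32_t insn_addr, uint32_t insn_len, int32_t disp)
{
    return insn_addr + insn_len + (uint32_t)disp;
}

/*
 * Outbound branch: the target lies outside the relocated function and does
 * not move, so the displacement must be recomputed for the new address.
 */
int32_t relocate_outbound(uint32_t old_insn_addr, uint32_t new_insn_addr,
                          uint32_t insn_len, int32_t old_disp)
{
    uint32_t target = rel32_target(old_insn_addr, insn_len, old_disp);
    return rel32_for(new_insn_addr, insn_len, target);
}
```

Branches that stay inside the function need a different correction, because replacing each 1-byte ret with a 6-byte jmp * shifts the instructions that follow it; that case is not covered by this sketch.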
Performance
  Small memory footprint (42k)
  Indirect addressing (jmp *) hurts branch prediction
    – Can use direct addressing (jmp)
  Overhead not correlated to path length
    – Mostly 1-8%

Quiescence Detection
  Needed to
    – Atomically update function groups
        e.g. Count number of processes using a filesystem
    – Safely reverse updates
  Implemented by
    – Usage counters
        On entry and exit
    – Stack walk-through
        For non-returning calls (do_exit in Linux; no ret instruction)
        Examine stack and program counter of all processes
        Default kernel compilation (works without frame pointers)

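The usage-counter half of quiescence detection can be illustrated with a counter per function that is incremented on entry and decremented on exit; a function group is quiescent when every counter is zero. This is a minimal single-threaded sketch with hypothetical names (`tracked_func`, `func_enter`, `func_leave`, `group_is_quiescent`). A kernel implementation would need atomic updates, and the stack walk-through for non-returning calls is not modeled here.

```c
#include <stdbool.h>
#include <stddef.h>

struct tracked_func {
    const char *name;
    long active_calls;   /* incremented on entry, decremented on exit */
};

void func_enter(struct tracked_func *f) { f->active_calls++; }
void func_leave(struct tracked_func *f) { f->active_calls--; }

/* A group is quiescent only if no function in it has a caller inside. */
bool group_is_quiescent(const struct tracked_func *group, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (group[i].active_calls != 0)
            return false;
    }
    return true;
}
```

An updater would check `group_is_quiescent` before swapping all functions of a group at once, or before reversing an update.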
Non-quiescent Subsystems
  Reader and writer are synchronized with each other

    pipe_read() {
        ...
        acquire Sem
        while (buffer_empty) {        (wait for new data in buffer)
            ...
            release Sem
            L1: sleep
            acquire Sem
        }
        read from data buffer
        release Sem
        return
    }

    pipe_write() {
        ...
        acquire Sem
        while (buffer_full) {         (wait for more room in buffer)
            ...
            release Sem
            L2: sleep
            acquire Sem
        }
        write in data buffer
        release Sem
        return
    }

  Adaptively enlarge pipefs 4k copy buffer during large data transfers

Non-quiescent Subsystems
  Slide labels over the two functions, left to right: "non-quiescent; sleeping" and "quiescent"

    pipe_read() {
        ...
        acquire Sem
        while (buffer_empty) {
            ...
            release Sem
            L1: sleep
            acquire Sem
        }
        read from data buffer
        release Sem
        return
    }

    pipe_write() {
        ...
        acquire Sem
        while (buffer_full) {
            ...
            release Sem
            L2: sleep
            acquire Sem
        }
        write in data buffer
        release Sem
        return
    }

  Subsystem may never quiesce
  Cannot update atomically

Synchronized update of pipefs (Phase 1)

    pipe_read() {
        acquire Sem
        while (4k_buffer_empty) {
            release Sem
            L1: sleep
            acquire Sem
        }
        read data from 4k_buffer
        release Sem
        return
    }

    pipe_read_v3() {
        acquire Sem
        while (1mb_buffer_empty) {
            release Sem
            L1: sleep
            acquire Sem
        }
        read data from 1mb_buffer
        release Sem
        return
    }

Synchronized update of pipefs (Phase 2)
  Semantically equivalent version at sou
  Wait for pipe_read to become inactive

    pipe_read() {
        acquire Sem
        while (4k_buffer_empty) {
            release Sem
            L1: sleep
            acquire Sem
        }
        read data from 4k_buffer
        release Sem
        return
    }

    pipe_read_v2() {
        acquire Sem
        while (4k_buffer_empty) {
            release Sem
            L1: sleep
            acquire Sem
            if (must_update) {
                phase = 3
                STATE TRANSFER
                goto new
            }
        }
        read data from 4k_buffer
        release Sem
        return
    new:
    }

    pipe_read_v3() {
        acquire Sem
        while (1mb_buffer_empty) {
            release Sem
            L1: sleep
            acquire Sem
        }
        read data from 1mb_buffer
        release Sem
        return
    }

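The STATE TRANSFER step is not spelled out on the slide. One plausible reading, sketched here purely as an assumption, is that unread bytes in the small pipe buffer are moved into the larger buffer in order before readers switch to the new version. The `ring` structure and `transfer_state` function are hypothetical and do not reflect the actual pipefs data layout.

```c
#include <stddef.h>

/* Hypothetical circular byte buffer (not the real pipefs structure). */
struct ring {
    unsigned char *data;
    size_t cap;     /* capacity in bytes */
    size_t start;   /* index of the oldest unread byte */
    size_t len;     /* number of unread bytes */
};

/*
 * Move all unread bytes from src into an empty dst, preserving their order.
 * Returns 0 on success, -1 if dst is not empty or cannot hold the data.
 */
int transfer_state(struct ring *src, struct ring *dst)
{
    if (dst->len != 0 || src->len > dst->cap)
        return -1;

    for (size_t i = 0; i < src->len; i++)
        dst->data[i] = src->data[(src->start + i) % src->cap];

    dst->start = 0;
    dst->len = src->len;
    src->start = 0;
    src->len = 0;
    return 0;
}
```

In the slide's scheme this copy would run while the semaphore is held, so that no reader or writer observes the buffers mid-transfer.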