10/27/12

Read-Copy Update (RCU)
Don Porter, CSE 506

Logical Diagram
[Figure: the course's standard kernel map. User level (threads, binary formats) sits above the system-call boundary; the kernel contains the file system, networking, memory management (memory allocators), device drivers, scheduler, sync, and consistency components; hardware (CPU, interrupts, disk, net) sits below. Today's lecture: synchronization, specifically RCU.]

Motivation
- Think about data structures that are mostly read, occasionally written
  - Like the Linux dcache
- RW locks allow concurrent reads
  - Still require an atomic decrement of a lock counter
  - Atomic ops are expensive
- Idea: Only require locks for writers; carefully update the data structure so readers see consistent views of the data

RCU in a nutshell
[Figure, from Paul McKenney's thesis: hash table searches per microsecond (0-35) vs. number of CPUs (1-4), comparing "ideal" scaling, a global lock ("global"), and a global reader/writer lock ("globalrw"). The performance of the RW lock is only marginally better than the mutex.]

Principle (1/2)
- Locks have an acquire and release cost
  - Substantial, since atomic ops are expensive
- For short critical regions, this cost dominates performance

Principle (2/2)
- Reader/writer locks may allow critical regions to execute in parallel
- But they still serialize the increment and decrement of the read count with atomic instructions
- Atomic instruction performance decreases as more CPUs try to execute them at the same time
- The read lock itself becomes a scalability bottleneck, even if the data it protects is read 99% of the time
Lock-free data structures
- Some concurrent data structures have been proposed that don't require locks
- They are difficult to create if one doesn't already suit your needs; highly error prone

RCU: Split the difference
- One of the hardest parts of lock-free algorithms is concurrent changes to pointers
- So just use locks and make writers go one-at-a-time
  - Can eliminate these problems
- But make writers be a bit careful so readers see a consistent view of the data structures
- If 99% of accesses are readers, avoid the performance-killing read lock in the common case

Example: Linked lists
Insert(B) into the list A -> C -> E:
- Naive order: splice B into the list before initializing it. B's next pointer is uninitialized, so a reader that reaches B follows garbage and gets a page fault. This implementation needs a lock.
- Careful order: first set B's next pointer to C, then overwrite A's next pointer to point to B. A concurrent reader goes to either C or B; either is ok.

Example recap
- Notice that we first created node B, and set up all outgoing pointers
- Then we overwrite the pointer from A
- No atomic instruction or reader lock needed
  - In some cases, we may need a memory barrier
- Key idea: Carefully update the data structure so that a reader can never follow a bad pointer
- Writers still serialize using a lock

Example 2: Linked lists
Delete(C) from A -> C -> E:
- Overwrite A's next pointer to point to E; C becomes unreachable to future readers
- Either traversal (through C or directly to E) is safe
- But a reader may still be looking at C. When can we delete it?
Problem
- We logically remove a node by making it unreachable to future readers
  - No pointers to this node in the list
- We eventually need to free the node's memory
  - Leaks in a kernel are bad!
- When is this safe?
  - Note that we have to wait for readers to "move on" down the list

Worst-case scenario
- Reader follows pointer to node X (about to be freed)
- Another thread frees X
- X is reallocated and overwritten with other data
- Reader interprets the bytes in X->next as a pointer: segmentation fault

Quiescence
- Trick: Linux doesn't allow a process to sleep while traversing an RCU-protected data structure
  - Includes kernel preemption, I/O waiting, etc.
- Idea: If every CPU has called schedule() (quiesced), then it is safe to free the node
- Each CPU counts the number of times it has called schedule()
- Put a to-be-freed item on a list of pending frees
  - Record a timestamp on each CPU
- Once each CPU has called schedule(), do the free

Quiescence, cont.
- There are some optimizations that shrink the per-CPU counter to just a bit
- Intuition: All you really need to know is whether each CPU has called schedule() once since this list became non-empty
- Details left to the reader

Limitations
- No doubly-linked lists
- Can't immediately reuse embedded list nodes
  - Must wait for quiescence first
- So only useful for lists where an item's position doesn't change frequently
- Only a few RCU data structures in existence

Nonetheless
- Linked lists are the workhorse of the Linux kernel
- RCU lists are increasingly used where appropriate
- Improved performance!
Big Picture
[Diagram: carefully designed data structures (e.g., a hash list, pending signals) sit on top of an RCU "library" of low-level helper functions.]
- Carefully designed data structures
- Readers always see a consistent view
- Low-level "helper" functions encapsulate the complex issues:
  - Memory barriers
  - Quiescence

API
- Drop-in replacement for read_lock: rcu_read_lock()
- Wrappers such as rcu_assign_pointer() and rcu_dereference() include memory barriers
- Rather than immediately free an object, use call_rcu(object, delete_fn) to do a deferred deletion

Code Example
From fs/binfmt_elf.c:

    rcu_read_lock();
    prstatus->pr_ppid =
        task_pid_vnr(rcu_dereference(p->real_parent));
    rcu_read_unlock();

Simplified Code Example
From include/linux/rcupdate.h:

    #define rcu_dereference(p) ({ \
        typeof(p) ______p1 = (*(volatile typeof(p) *)&p); \
        read_barrier_depends();  /* defined by arch */    \
        ______p1;                /* "returns" this value */ \
    })

Code Example
From fs/dcache.c:

    static void d_free(struct dentry *dentry)
    {
        /* ... omitted code for simplicity ... */
        call_rcu(&dentry->d_rcu, d_callback);
    }

    /* After quiescence, call_rcu functions are called */
    static void d_callback(struct rcu_head *rcu)
    {
        struct dentry *dentry =
            container_of(rcu, struct dentry, d_rcu);
        __d_free(dentry);  /* real free */
    }

[Figure 2, from McKenney and Walpole, "Introducing Technology into the Linux Kernel: A Case Study": RCU API uses in the Linux kernel per year, growing from near zero in 2002 to roughly 2000 by 2009.]
Summary
- Understand the intuition of RCU
- Understand how to add/delete a list node in RCU
- Pros/cons of RCU