
LOOM: Bypassing Races in Live Applications with Execution Filters (Jingyue Wu, Heming Cui, Junfeng Yang)



  1. LOOM: Bypassing Races in Live Applications with Execution Filters. Jingyue Wu, Heming Cui, Junfeng Yang (Columbia University)

  2. Mozilla Bug #133773

         void js_DestroyContext(JSContext *cx) {
           JS_LOCK_GC(cx->runtime);
           MarkAtomState(cx);
           if (last) { // last thread?
             ...
             FreeAtomState(cx);
             ...
           }
           JS_UNLOCK_GC(cx->runtime);
         }

     A buggy interleaving:
     • Last thread: if (last) // return true, then FreeAtomState
     • Non-last thread: MarkAtomState afterwards (the bug)

  3. Complex Fix

         void js_DestroyContext() {
           if (last) {
             state = LANDING;
             if (requestDepth == 0)
               js_BeginRequest();
             while (gcLevel > 0)
               JS_AWAIT_GC_DONE();
             js_ForceGC(true);
             while (gcPoke)
               js_GC(true);
             FreeAtomState();
           } else {
             gcPoke = true;
             js_GC(false);
           }
         }

         void js_BeginRequest() {
           while (gcLevel > 0)
             JS_AWAIT_GC_DONE();
         }

         void js_ForceGC(bool last)
         {
           gcPoke = true;
           js_GC(last);
         }

         void js_GC(bool last) {
           if (state == LANDING && !last)
             return;
           gcLock.acquire();
           if (!gcPoke) {
             gcLock.release();
             return;
           }
           if (gcLevel > 0) {
             gcLevel++;
             while (gcLevel > 0)
               JS_AWAIT_GC_DONE();
             gcLock.release();
             return;
           }
           gcLevel = 1;
           gcLock.release();
         restart:
           MarkAtomState();
           gcLock.acquire();
           if (gcLevel > 1) {
             gcLevel = 1;
             gcLock.release();
             goto restart;
           }
           gcLevel = 0;
           gcPoke = false;
           gcLock.release();
         }

     • 4 functions; 3 integer flags
     • Nearly a month
     • Not the only example

  4. LOOM: Live-workaround Races
     • Execution filters: temporarily filter out buggy thread interleavings

         void js_DestroyContext(JSContext *cx) {
           MarkAtomState(cx);
           if (last thread) {
             ...
             FreeAtomState(cx);
             ...
           }
         }

       A mutual-exclusion execution filter to bypass the race in the code above:

         js_DestroyContext <> self

     • Declarative, easy to write
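One way to read the filter `js_DestroyContext <> self` is that no two threads may run `js_DestroyContext` at the same time. The following is a minimal sketch of that effect using a POSIX mutex, not LOOM's actual mechanism. The names `js_DestroyContext_filtered`, `live_contexts`, `self_exclusion`, `bug_count`, and `atoms_freed` are hypothetical, and the `last` test is modeled as a simple context counter, which is an assumption rather than the real SpiderMonkey logic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real SpiderMonkey state. */
static int  live_contexts    = 2;      /* assume two contexts, one per thread */
static bool atom_state_freed = false;
static int  marks_after_free = 0;      /* counts occurrences of the bug */

static void MarkAtomState(void) { if (atom_state_freed) marks_after_free++; }
static void FreeAtomState(void) { atom_state_freed = true; }

/* Lock standing in for the filter "js_DestroyContext <> self". */
static pthread_mutex_t self_exclusion = PTHREAD_MUTEX_INITIALIZER;

/* The whole function body runs under the lock, so a non-last thread can
 * never call MarkAtomState after the last thread has freed the state. */
void js_DestroyContext_filtered(void) {
    pthread_mutex_lock(&self_exclusion);
    assert(live_contexts > 0);
    bool last = (--live_contexts == 0);   /* assumed "last thread?" test */
    MarkAtomState();
    if (last)
        FreeAtomState();
    pthread_mutex_unlock(&self_exclusion);
}

int  bug_count(void)   { return marks_after_free; }
bool atoms_freed(void) { return atom_state_freed; }
```

Calling `js_DestroyContext_filtered()` once per context, from any mix of threads, should leave `bug_count()` at zero, because each call completes before the next one starts.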

  5. LOOM: Live-workaround Races
     • Execution filters: temporarily filter out buggy thread interleavings
     • Installs execution filters to live applications
       – Improve server availability
       – STUMP [PLDI '09], Ginseng [PLDI '06], KSplice [EuroSys '09]
     • Installs execution filters safely
       – Avoid introducing errors
     • Incurs little overhead during normal execution

  6. Summary of Results
     • We evaluated LOOM on nine real races.
       – Bypasses all the evaluated races safely
       – Applies execution filters immediately
       – Little performance overhead (< 5%)
       – Scales well with the number of application threads (< 10% with 32 threads)
       – Easy to use (< 5 lines)

  7. Outline
     • Architecture
       – Combines static preparation and live update
     • Safely updating live applications
     • Reducing performance overhead
     • Evaluation
     • Conclusion

  8. Architecture
     • Static preparation: the application source goes through the LLVM compiler ($ llvm-gcc), the LOOM compiler plugin ($ opt -load), $ llc, and $ gcc, producing an application binary that contains the LOOM update engine.
     • Live update: the LOOM controller installs an execution filter (for example, js_DestroyContext <> self) with $ loomctl add <pid> <filter file>; the LOOM update engine turns the buggy application into a patched application.

  9. Outline
     • Architecture
       – Combines static preparation and live update
     • Safely updating live applications
     • Reducing performance overhead
     • Evaluation
     • Conclusion

  10. Safety: Not Introducing New Errors
      [Figure: two panels, "Mutual Exclusion" (Lock, Unlock) and "Order Constraints" (Up, Down), each showing thread program counters (PC) relative to the inserted operations.]

  11. Evacuation Algorithm
      1. Identify the dangerous region using static analysis
      2. Evacuate threads that are in the dangerous region
      3. Install the execution filter
      [Figure labels: "Unsafe to update", "Safe to update", "Updated", PC, "Evacuate", "Install Filter", LOOM Update Engine.]
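The three evacuation steps can be illustrated with a deliberately simplified model. This is a toy sketch, not LOOM's static analysis: it assumes a thread's position is a single statement id and that the dangerous region is a half-open id range, whereas a real system would reason over control-flow graphs and call stacks. The type `region_t` and the functions `safe_to_install` and `evacuate_and_install` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed model: the dangerous region is the statement id range [begin, end). */
typedef struct { int begin, end; } region_t;

static bool in_region(region_t r, int pc) {
    return pc >= r.begin && pc < r.end;
}

/* True when no thread is currently inside the dangerous region. */
bool safe_to_install(region_t r, const int *pcs, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (in_region(r, pcs[i]))
            return false;
    return true;
}

/* Evacuate, then install: threads inside the region keep running until they
 * leave it, threads outside are held where they are. Returns the number of
 * simulated steps that were needed before the filter could be installed. */
int evacuate_and_install(region_t r, int *pcs, size_t n, bool *installed) {
    assert(r.begin <= r.end);
    int steps = 0;
    while (!safe_to_install(r, pcs, n)) {
        for (size_t i = 0; i < n; i++)
            if (in_region(r, pcs[i]))
                pcs[i]++;            /* the thread makes one step of progress */
        steps++;
    }
    *installed = true;               /* stand-in for installing the filter */
    return steps;
}
```

For example, with a region covering statements 7 to 9 and threads at statements 3, 8, and 5, only the thread at statement 8 is advanced, and the filter is installed once that thread reaches statement 10.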

  12. Control Application Threads

          1 : // database worker thread
          2 : void handle_client(int fd) {
          3 :   for(;;) {
          4 :     struct client_req req;
          5 :     int ret = recv(fd, &req, ...);
          6 :     if(ret <= 0) break;
          7 :     open_table(req.table_id);
          8 :     ... // do real work
          9 :     close_table(req.table_id);
          10:   }
          11: }

      Control-flow graph nodes (labeled by line number): 3: entry of handle_client; 6: ret<=0 (branches Y and N); 7: call open_table; … // do real work; 9: call close_table; 11: exit of handle_client.

  13. Control Application Threads (cont'd)

          // not the final version
          void cond_break() {
            read_unlock(&update);
            read_lock(&update);
          }

          // not the final version
          void loom_update() {
            write_lock(&update);
            install_filter();
            write_unlock(&update);
          }

      [Figure: the control-flow graph of handle_client before and after instrumentation; the instrumented graph adds a cond_break() call.]

  14. Pausing Threads at Safe Locations

          void cond_break() {
            if (wait[backedge_id]) {
              read_unlock(&update);
              while (wait[backedge_id]);
              read_lock(&update);
            }
          }

          void loom_update() {
            identify_safe_locations();
            for each safe backedge E
              wait[E] = true;
            write_lock(&update);
            install_filter();
            for each safe backedge E
              wait[E] = false;
            write_unlock(&update);
          }

      Instructions shown for the wait check in cond_break():

          cmpl 0x0, 0x845208c
          je 0x804b56d

      [Figure: the control-flow graph of handle_client with the cond_break() call.]

  15. Outline
      • Architecture
        – Combines static preparation and live update
      • Safely updating live applications
      • Reducing performance overhead
      • Evaluation
      • Conclusion

  16. Hybrid Instrumentation

          void slot(int stmt_id) {
            op_list = operations[stmt_id];
            foreach op in op_list
              do op;
          }

      [Figure: control-flow graphs of handle_client showing a "switch?" check that selects between a copy instrumented only with cond_break() and a copy with slot() calls inserted between statements.]
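The `slot` pseudocode can be made concrete as a per-statement table of function pointers. This is a minimal sketch under assumptions: the table sizes, the `add_op` helper, and the counter-based `demo_lock` and `demo_unlock` operations are hypothetical and only illustrate the dispatch pattern, not LOOM's real data structures.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STMTS 16     /* assumed number of instrumented statements */
#define MAX_OPS    4     /* assumed maximum operations per statement */

typedef void (*op_fn)(void);

typedef struct {
    op_fn  ops[MAX_OPS];
    size_t count;
} op_list_t;

/* One operation list per statement; all lists start out empty. */
static op_list_t operations[MAX_STMTS];

/* Run every operation currently attached to this statement. */
void slot(int stmt_id) {
    assert(stmt_id >= 0 && stmt_id < MAX_STMTS);
    const op_list_t *op_list = &operations[stmt_id];
    for (size_t i = 0; i < op_list->count; i++)
        op_list->ops[i]();
}

/* Hypothetical helper: attach an operation to a statement.
 * Returns 0 on success, -1 if the id is out of range or the list is full. */
int add_op(int stmt_id, op_fn op) {
    if (stmt_id < 0 || stmt_id >= MAX_STMTS)
        return -1;
    op_list_t *list = &operations[stmt_id];
    if (list->count == MAX_OPS)
        return -1;
    list->ops[list->count++] = op;
    return 0;
}

/* Stand-ins for filter operations such as acquiring and releasing a lock. */
static int lock_depth = 0;
void demo_lock(void)   { lock_depth++; }
void demo_unlock(void) { lock_depth--; }
int  current_lock_depth(void) { return lock_depth; }
```

Before any operation is attached, `slot(7)` does nothing. After `add_op(7, demo_lock)` and `add_op(9, demo_unlock)`, the calls placed at statements 7 and 9 would bracket the work in between, which is how a mutual-exclusion filter could take effect without recompiling the application.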

  17. Bare Instrumentation Overhead: performance overhead < 5%

  18. Bare Instrumentation Overhead: performance overhead < 5%

  19. Scalability
      • 48-core machine with 4 CPUs; each CPU has 12 cores.
      • Pin the server to CPUs 0, 1, and 2, and the client to CPU 3.
      [Figure: "Scalability on MySQL"; x-axis: number of threads (1, 2, 4, 8, 16, 32); y-axis: overhead (%), from -6% to 14%; series: RESP and TPUT.]
      Performance overhead does not increase

  20. Conclusion
      • LOOM: A live-workaround system designed to quickly and safely bypass races
        – Execution filters: easy to use and flexible (< 5 lines)
        – Evacuation algorithm: safe
        – Hybrid instrumentation: fast (overhead < 5%) and scalable (overhead < 10% with 32 threads)
      • Future work
        – Generic hybrid instrumentation framework
        – Extend the idea to other classes of errors

  21. Questions?
