

Dynamic Synthesis for Relaxed Memory Models. Feng Liu, Nayden Nedev, Nedyalko Prisadnikov, Martin Vechev, Eran Yahav (Princeton University, Sofia University, ETH Zurich, Technion). PLDI 2012, Beijing, 06/13/2012.


  1. Dynamic Synthesis for Relaxed Memory Models
     Feng Liu*, Nayden Nedev*, Nedyalko Prisadnikov†, Martin Vechev§, Eran Yahav‡
     *Princeton University, †Sofia University, §ETH Zurich, ‡Technion
     PLDI 2012, Beijing, 06/13/2012

  2. Relaxed memory models
     • No sequential consistency (SC) in chips today
     • Chip designers implement “relaxed memory” on different architectures:
       – total store order (TSO): Intel’s and AMD’s x86; SPARC
       – partial store order (PSO): SPARC
       – PPC model: IBM’s PowerPC; ARM
       – …

  3. Modeling TSO & PSO Programs
     • Store Buffering
       – FIFO queues (buffers) associated with threads
       – A store goes to a local buffer, not memory
       – Stores in buffers are flushed at non-deterministic times
     • Store Forwarding
       – Satisfy loads from local buffer if possible
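The buffering and forwarding rules above can be sketched as a small simulation. This is only an illustrative model, not Fender's implementation: the class `StoreBufferMemory` and its methods are hypothetical names, and it assumes that PSO keeps one FIFO per thread and variable while TSO keeps a single FIFO per thread.

```python
from collections import defaultdict, deque


class StoreBufferMemory:
    """Toy TSO/PSO store-buffer model (hypothetical; not Fender's code)."""

    def __init__(self, model, initial):
        self.model = model                 # "TSO" or "PSO"
        self.memory = dict(initial)        # main memory
        self.buffers = defaultdict(deque)  # FIFO store buffers

    def _key(self, thread, var):
        # Assumption: PSO keeps one FIFO per (thread, variable),
        # TSO keeps a single FIFO per thread.
        return (thread, var) if self.model == "PSO" else (thread, None)

    def store(self, thread, var, value):
        # A store goes to the local buffer, not to main memory.
        self.buffers[self._key(thread, var)].append((var, value))

    def load(self, thread, var):
        # Store forwarding: the newest buffered store to var by this thread wins.
        for buffered_var, value in reversed(self.buffers[self._key(thread, var)]):
            if buffered_var == var:
                return value
        return self.memory[var]

    def flush(self, thread, var=None):
        # Move the oldest entry of one buffer to main memory, if there is one.
        buffer = self.buffers[self._key(thread, var)]
        if buffer:
            flushed_var, value = buffer.popleft()
            self.memory[flushed_var] = value


mem = StoreBufferMemory("PSO", {"H": 0, "Done": 0})
mem.store("t1", "H", 1)
print(mem.load("t1", "H"))  # t1 reads its own buffered store
print(mem.load("t2", "H"))  # t2 still reads main memory
```

In this sketch the caller decides when `flush` runs, which stands in for the non-deterministic flush times on the slide.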

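  4. PSO Example
     Initially: H=0, Done=0

     thread1              thread2
     H=1;                 while (!Done) { }
     Done=1;              assert(H==1);

     Fails on PSO: Done=1 can be flushed to main memory while H=1 is still in thread1's store buffer, so thread2 leaves the loop and loads H=0.
     [Diagram: store buffers of t1 and t2 (entries for H and Done) and main memory, with store, flush and load arrows; main memory holds Done=1, H=0]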
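The failing execution can be replayed with a minimal sketch. The function `run_pso_example` and its `flush_order` parameter are hypothetical; the sketch assumes thread1 has already buffered both stores and that thread2 reads only after the listed flushes have happened.

```python
def run_pso_example(flush_order):
    """Replay the two-thread example under a toy PSO model.

    flush_order lists the variables of thread1's buffered stores in the
    order they reach main memory before thread2 reads.
    Returns None if thread2 would still be spinning, otherwise the
    result of assert(H == 1).
    """
    memory = {"H": 0, "Done": 0}
    buffers = {"H": [], "Done": []}   # thread1's per-variable FIFO buffers

    buffers["H"].append(1)            # thread1: H = 1    (buffered)
    buffers["Done"].append(1)         # thread1: Done = 1 (buffered)

    for var in flush_order:           # flush times are fixed by the caller here
        memory[var] = buffers[var].pop(0)

    if not memory["Done"]:
        return None                   # thread2 is still in the while loop
    return memory["H"] == 1           # thread2: assert(H == 1)


print(run_pso_example(["Done"]))       # Done overtakes H
print(run_pso_example(["H", "Done"]))  # flushes in program order
```

Under TSO the first call would not be a legal schedule, because a single FIFO per thread forces H=1 to be flushed before Done=1.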

  5. Memory Fences
     Initially: H=0, Done=0

     thread1              thread2
     H=1;                 while (!Done) { }
     Fence;               assert(H==1);
     Done=1;

     • A memory fence is very expensive (10-100 cycles)
     • Use only where necessary
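A small enumeration illustrates why the fence fixes the example. It is a sketch under one assumption: a fence drains the issuing thread's store buffers before the next instruction runs. The function name `assertion_outcomes` is hypothetical.

```python
from itertools import permutations


def assertion_outcomes(with_fence):
    """Collect every result assert(H == 1) can take in a toy PSO model."""
    outcomes = set()
    for order in permutations(["H", "Done"]):   # possible flush orders
        for flushed in range(3):                # flushes done before thread2 reads
            memory = {"H": 0, "Done": 0}
            pending = {}
            pending["H"] = 1                    # thread1: H = 1 (buffered)
            if with_fence:                      # Fence: drain thread1's buffers
                memory.update(pending)
                pending.clear()
            pending["Done"] = 1                 # thread1: Done = 1 (buffered)
            to_flush = [v for v in order if v in pending][:flushed]
            for var in to_flush:
                memory[var] = pending.pop(var)
            if memory["Done"]:                  # thread2 leaves the loop
                outcomes.add(memory["H"] == 1)  # thread2: assert(H == 1)
    return outcomes


print(assertion_outcomes(with_fence=False))  # both results are reachable
print(assertion_outcomes(with_fence=True))   # only the passing result remains
```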

  6. Our Approach
     [Diagram] Inputs: C/C++ program P, specification S, memory model M
     → FENDER (dynamic analysis & static fixing)
     → Output: program P’ with fences, such that P’ satisfies S under M

  7. Challenge: Handling real-world concurrent programs
     • A lock-free memory allocator [1]
       – 771 lines of C code
       – 2699 lines of IR code
     [1] M. Michael, “Scalable lock-free dynamic memory allocation,” PLDI’04.

  8. Real-World Programs?
     • Exposing violations under relaxed memory models
       – Violations occur rarely
     • Many possible fence placements
       – Large programs
     • Written in C/C++ language
       – Rather than program models

  9. Contributions
     • Demonic scheduler to expose violations
       – Delay flushes of values from store buffer to main memory
     • Avoiding bad executions by adding fences
       – Extracting ordering constraints from bad executions
       – Enforcing ordering constraints using fences
     • Parametric synthesis framework
       – Different memory models
     • Evaluating fences required under different memory models and correctness criteria
       – Found redundant and missing fences
       – Linearizability on relaxed memory models
       – Handled real C/C++ programs

  10. Fender Framework – Support for concurrency and RMM
      [Diagram] Concurrent client + C/C++ code → LLVM-GCC → .bc → LLVM Interpreter
      Interpreter components: Threading, Demonic Scheduler, Memory Model
      Legend: our extension, existing work

  11. Our work – Dynamic analysis
      [Diagram] Concurrent client + C/C++ code → LLVM-GCC → .bc → LLVM Interpreter (Threading, Demonic Scheduler, Memory Model)
      → trace → Trace Analysis (with the Specification) → order formula → SAT Solver → SAT assignment
      Legend: our extension, existing work

  12. Our work – Implement memory fences
      [Diagram] Concurrent client + C/C++ code → LLVM-GCC → .bc → LLVM Interpreter (Threading, Demonic Scheduler, Memory Model)
      → trace → Trace Analysis (with the Specification) → order formula → SAT Solver → SAT assignment
      → Fence Enforcement → modified .bc
      Output: fixed bytecode & fence location report
      Legend: our extension, existing work

  13. Example
      Initially: H=0, Done=0

      thread1              thread2
      L1: H=1;             L3: while (!Done) { }
      L2: Done=1;          L4: assert(H==1);

      [Diagram: empty store buffers of t1 and t2 (entries for H and Done) and main memory]

  14. Interpretation on PSO
      Program as on slide 13. Each instruction is interpreted as a memory event:
        L3: Load Done
        L1: Store H=1
        L2: Store Done=1
        L4: Load H
      [Diagram: t1's store buffer holds H=1 (from L1) and Done=1 (from L2). Legend: load, store, flush]

  15. Interpretation on PSO
      Program as on slide 13.
      [Diagram: trace so far: L3; store buffers are empty. Legend: load, store, flush]

  16. Flush with a probability
      Program as on slide 13.
      [Diagram: trace so far: L3, L1; t1's store buffer holds H=1. Legend: load, store, flush]
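Probabilistic flushing can be sketched on the running example as follows. This is a toy estimate, not the Fender scheduler: the function `violation_rate`, its round-based loop, and the fixed seed are assumptions made for illustration. Each buffered store is flushed with probability `flush_prob` per round, so a lower probability delays flushes for longer.

```python
import random


def violation_rate(flush_prob, runs=2000, seed=0):
    """Fraction of toy PSO runs of the example in which assert(H == 1) fails."""
    if not 0 < flush_prob <= 1:
        raise ValueError("flush_prob must be in (0, 1]")
    rng = random.Random(seed)              # fixed seed keeps the sketch repeatable
    failures = 0
    for _ in range(runs):
        memory = {"H": 0, "Done": 0}
        pending = {"H": 1, "Done": 1}      # thread1's buffered stores from L1, L2
        while not memory["Done"]:          # thread2 spins at L3
            for var in list(pending):      # flush each entry with probability p
                if rng.random() < flush_prob:
                    memory[var] = pending.pop(var)
        failures += memory["H"] != 1       # thread2 evaluates the assertion at L4
    return failures / runs


print(violation_rate(0.1))  # flushes are delayed for many rounds
print(violation_rate(0.9))  # flushes happen almost immediately
```

In this model, delaying flushes makes the rare violation from slide 8 show up in a much larger share of the runs.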

  17. Execution trace
      Program as on slide 13.
      [Diagram: the execution trace over L1 to L4, with t1's store buffer holding H=1. Legend: load, store, flush]

  18. Checking Specification
      [Diagram: the traces of different executions are checked against the specification. Legend: load, store, flush]

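  19. Repair one trace
      • Order predicate: [L1, L2]
      • Order formula for a single execution: [L1, L2] ∨ [C, D] ∨ …, written x1 ∨ x2 ∨ … with x1 = [L1, L2] and x2 = [C, D]
      [Diagram: a trace in which t1's store buffer holds H=1 (from L1) and Done=1 (from L2); C and D label another pair of instructions in the trace]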
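One simplified reading of how order predicates are collected from a trace: a pair [l, k] is reported when instruction k executes while the store at l is still in the buffer. The sketch below assumes this PSO-style rule and a hypothetical event format; the actual trace analysis also depends on the memory model and is not described on the slide.

```python
def order_predicates(trace):
    """Collect (store_label, later_label) pairs from one thread's events.

    trace: list of ("store", label), ("load", label) or ("flush", label)
    events of a single thread, in execution order.
    """
    pending = []       # labels of stores that are still buffered
    predicates = []
    for kind, label in trace:
        if kind == "flush":
            pending.remove(label)          # the store reached main memory
            continue
        for earlier in pending:            # label runs before earlier is flushed
            predicate = (earlier, label)
            if predicate not in predicates:
                predicates.append(predicate)
        if kind == "store":
            pending.append(label)
    return predicates


# thread1 of the example, with Done=1 (L2) flushed before H=1 (L1)
t1_trace = [("store", "L1"), ("store", "L2"), ("flush", "L2"), ("flush", "L1")]
print(order_predicates(t1_trace))
```

For this trace the only pair collected is (L1, L2), which corresponds to the predicate [L1, L2] on the slide.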

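  20. Repair all incorrect traces
      • Global formula to SAT solver: (x1 ∨ x2 ∨ …) ∧ (x1 ∨ x3 ∨ …) ∧ …, with one clause per incorrect trace (here trace1 and trace3)
      • One memory fence should be placed at the location shared by the incorrect traces
      [Diagram: traces of different executions (trace1, trace2, trace3) with the fence location marked]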
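The global formula contains only positive literals, so a smallest satisfying assignment is a smallest set of predicates that hits every clause. The sketch below finds such sets by brute-force enumeration instead of calling a SAT solver; the function `minimal_fence_sets` and the variable names are illustrative.

```python
from itertools import combinations


def minimal_fence_sets(clauses):
    """Return all smallest sets of predicates that satisfy every clause.

    clauses: one set of predicate names per incorrect trace; a clause is
    satisfied when at least one of its predicates is enforced.
    """
    variables = sorted(set().union(*clauses))
    for size in range(len(variables) + 1):       # try the smallest sets first
        hits = [set(choice) for choice in combinations(variables, size)
                if all(set(choice) & clause for clause in clauses)]
        if hits:
            return hits
    return []                                    # some clause is empty


# (x1 OR x2) AND (x1 OR x3), as on the slide
print(minimal_fence_sets([{"x1", "x2"}, {"x1", "x3"}]))
```

Enforcing x1 alone satisfies both clauses, which matches the single fence location on the slide. Enumeration is exponential in the number of predicates, so it only suits small formulas.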

  21. Fix the program
      Initially: H=0, Done=0

      thread1              thread2
      L1: H=1;             L3: while (!Done) { }
      Fence;               L4: assert(H==1);
      L2: Done=1;
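Enforcing a chosen predicate [l, k] can be sketched as inserting a fence right after the instruction labeled l. This is an assumption for straight-line code such as thread1; the function `insert_fences` and the list-of-pairs program format are hypothetical, and the slides describe the real enforcement as working on LLVM bytecode.

```python
def insert_fences(program, predicates):
    """Insert a fence after each instruction that starts a predicate.

    program: list of (label, instruction) pairs in program order.
    predicates: iterable of (l, k) label pairs to enforce.
    """
    fence_after = {first for first, _ in predicates}
    fixed = []
    for label, instruction in program:
        fixed.append((label, instruction))
        if label in fence_after:
            fixed.append((None, "Fence;"))   # unlabeled fence instruction
    return fixed


thread1 = [("L1", "H=1;"), ("L2", "Done=1;")]
for label, instruction in insert_fences(thread1, [("L1", "L2")]):
    print(label, instruction)
```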

  22. Evaluation - Benchmarks
      MS = memory safety, SC = operational sequential consistency, Lin = linearizability; TSO and PSO are the memory models.

      Program           MS/TSO  MS/PSO  SC/TSO  SC/PSO  Lin/TSO  Lin/PSO
      Work stealing queues
      Chase-Lev WSQ     0       0       1       2       2        3
      Cilk’s THE WSQ    0       0       1       3       -        -
      FIFO WSQ          0       0       0       2       1        2
      LIFO WSQ          0       0       0       1       0        1
      Anchor WSQ        0       0       0       1       0        1
      Idempotent work stealing queues
      FIFO iWSQ         0       3       -       -       -        -
      LIFO iWSQ         0       2       -       -       -        -
      Anchor iWSQ       0       2       -       -       -        -
      Concurrent data structures
      MS2 Queue         0       0       0       0       0        0
      MSN Queue         0       0       0       1       0        1
      LazyList Set      0       0       0       0       0        0
      Harris’s Set      0       0       0       1       0        1
      Lock-free memory allocator
      Memory allocator  0       3       0       4       0        4

  23. Evaluation - Specifications
      Results table as on slide 22; the column groups are the specifications: memory safety, operational sequential consistency, linearizability.

  24. Evaluation - Memory models
      Results table as on slide 22; each specification is evaluated under TSO and PSO.

  25. Evaluation - number of memory fences
      Results table as on slide 22; each entry is the number of memory fences.

  26. Conclusion
      • Demonic scheduler to expose violations
      • Avoiding bad executions by adding fences
      • Parametric synthesis framework
      • Evaluating fences required under different memory models and correctness criteria

  27. Thanks! Q & A
