Programming language memory models: Problems, Solutions, and Directions
Anton Podkopaev, anton@podkopaev.net
Researcher @ JetBrains Research, Postdoc @ MPI-SWS, Docent @ HSE
Programming languages, weak memory concurrency


  3. Requirements to (Weak) Memory Models
     Hardware MMs [x86, Power, ARM, RISC-V] should:
     1. describe real CPUs
     2. save room for future optimizations
     3. provide reasonable guarantees for PLs
     Programming languages’ MMs [C/C++, Java, JS, Wasm, OCaml] should:
     1. support compiler optimizations
     2. provide efficient compilation to hardware
     3. have an easy non-expert mode

  7. 1. Compiler optimizations
     Source:
       Thread 1: [x] := 1; a := [y];
       Thread 2: [y] := 1; b := [x];
     Optimized:
       Thread 1: a := [y]; [x] := 1;
       Thread 2: [y] := 1; b := [x];
     Required: behaviors of Optimized ⊆ behaviors of Source.
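
The subset requirement can be replayed mechanically. The following is a minimal sketch, not part of the talk: it enumerates every sequentially consistent interleaving of the two-thread program and collects the final (a, b) pairs. The names `Instr`, `sc_outcomes`, `refines`, and `reordering_is_sound_under_sc` are hypothetical, and using SC as the semantics of the source program is an assumption made only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// The four instructions of the slide's program; all locations start at 0.
enum class Instr { StoreX, StoreY, LoadYIntoA, LoadXIntoB };

using Outcome = std::pair<int, int>;  // final values of (a, b)

// Final (a, b) of every SC interleaving of two straight-line threads.
std::set<Outcome> sc_outcomes(const std::vector<Instr>& t1,
                              const std::vector<Instr>& t2) {
    std::set<Outcome> outcomes;
    // A schedule lists which thread moves next; its permutations are exactly
    // the interleavings that keep each thread's program order.
    std::vector<int> schedule(t1.size(), 0);
    schedule.insert(schedule.end(), t2.size(), 1);
    do {
        int x = 0, y = 0, a = 0, b = 0;
        std::size_t pc1 = 0, pc2 = 0;
        for (int tid : schedule) {
            Instr i = (tid == 0) ? t1[pc1++] : t2[pc2++];
            switch (i) {
                case Instr::StoreX: x = 1; break;
                case Instr::StoreY: y = 1; break;
                case Instr::LoadYIntoA: a = y; break;
                case Instr::LoadXIntoB: b = x; break;
            }
        }
        outcomes.insert(Outcome(a, b));
    } while (std::next_permutation(schedule.begin(), schedule.end()));
    return outcomes;
}

// True if every outcome of the optimized program is also a source outcome.
bool refines(const std::set<Outcome>& optimized, const std::set<Outcome>& source) {
    return std::includes(source.begin(), source.end(),
                         optimized.begin(), optimized.end());
}

// Replays the slide: swapping thread 1's store and load adds a = b = 0.
bool reordering_is_sound_under_sc() {
    const std::vector<Instr> t2 = {Instr::StoreY, Instr::LoadXIntoB};
    auto source = sc_outcomes({Instr::StoreX, Instr::LoadYIntoA}, t2);
    auto optimized = sc_outcomes({Instr::LoadYIntoA, Instr::StoreX}, t2);
    assert(source.count(Outcome(0, 0)) == 0);  // SC source never yields a = b = 0
    return refines(optimized, source);
}
```

Under SC the check fails, because the optimized program can end with a = b = 0 and the source cannot. A language model that wants to allow this reordering therefore has to admit the weak outcome for the source program as well.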

  9. 2. Efficient compilation to hardware
     Source MM (SC):
       Thread 1: [x] := 1; a := [y];
       Thread 2: [y] := 1; b := [x];
     Target MM (x86):
       Thread 1: [x] := 1; mfence; a := [y];
       Thread 2: [y] := 1; mfence; b := [x];
     No compilation scheme w/o fences.
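
Why the fences are needed can be illustrated with a small exhaustive simulation. This is a sketch under stated assumptions, not the talk's formal model: it uses a simplified store-buffer machine in the style of x86-TSO, where each thread has a FIFO buffer of pending stores and `mfence` can execute only once that buffer is empty. All names (`Machine`, `explore`, `store_buffering`) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <utility>
#include <vector>

enum class Op { Store, Load, Fence };
struct Instr { Op op; int loc; };  // loc: 0 = x, 1 = y; every store writes 1

struct Thread {
    std::vector<Instr> code;
    std::size_t pc = 0;
    std::deque<int> buffer;  // locations of pending stores, oldest first
    int reg = 0;             // value read by the thread's load
};

struct Machine {
    int mem[2] = {0, 0};
    Thread t[2];
};

using Outcome = std::pair<int, int>;  // (a, b)

// Exhaustively explore all runs of the store-buffer machine.
void explore(const Machine& m, std::set<Outcome>& out) {
    bool can_step = false;
    for (int i = 0; i < 2; ++i) {
        const Thread& cur = m.t[i];
        if (!cur.buffer.empty()) {  // flush the oldest buffered store to memory
            Machine n = m;
            n.mem[n.t[i].buffer.front()] = 1;
            n.t[i].buffer.pop_front();
            explore(n, out);
            can_step = true;
        }
        if (cur.pc == cur.code.size()) continue;
        const Instr in = cur.code[cur.pc];
        assert(in.loc == 0 || in.loc == 1);
        // mfence may only execute once the thread's buffer has drained.
        if (in.op == Op::Fence && !cur.buffer.empty()) continue;
        Machine n = m;
        Thread& th = n.t[i];
        ++th.pc;
        if (in.op == Op::Store) {
            th.buffer.push_back(in.loc);
        } else if (in.op == Op::Load) {
            // A load sees the thread's own pending store first, then memory.
            bool pending = std::find(th.buffer.begin(), th.buffer.end(), in.loc)
                           != th.buffer.end();
            th.reg = pending ? 1 : n.mem[in.loc];
        }
        explore(n, out);
        can_step = true;
    }
    if (!can_step) out.insert(Outcome(m.t[0].reg, m.t[1].reg));
}

// Thread 0: [x] := 1; (mfence;) a := [y]   Thread 1: [y] := 1; (mfence;) b := [x]
std::set<Outcome> store_buffering(bool with_mfence) {
    Machine m;
    for (int i = 0; i < 2; ++i) {
        m.t[i].code.push_back({Op::Store, i});
        if (with_mfence) m.t[i].code.push_back({Op::Fence, 0});
        m.t[i].code.push_back({Op::Load, 1 - i});
    }
    std::set<Outcome> out;
    explore(m, out);
    return out;
}
```

In this toy machine `store_buffering(false)` contains the non-SC outcome (0, 0), while `store_buffering(true)` does not, which matches the slide's claim that SC cannot be compiled to x86 without fences.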

  16. 3. Easy non-expert mode
      Nice program ⇒ nice behaviors.
      Data-Race-Freedom (DRF) guarantee: no data races in SC executions ⇒ only SC behaviors.
      Program:
        Thread 1: a := [x]; if a then [y] := 1
        Thread 2: b := [y]; if b then [x] := 1
      C/C++ MM allows a = b = 1.
      a = b = 1 is an Out-Of-Thin-Air (OOTA) outcome.
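
The DRF argument for this program can be checked by brute force. The sketch below is an illustration, not part of the talk: it enumerates all SC interleavings of the two threads and records whether any store ever executes. The names `State`, `step`, `Summary`, and `explore_sc` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>
#include <vector>

struct State {
    int x = 0, y = 0;     // shared locations, initially 0
    int a = 0, b = 0;     // thread-local registers
    bool stored = false;  // did any store execute in this run?
};

// Thread 0: a := [x]; if a then [y] := 1
// Thread 1: b := [y]; if b then [x] := 1
void step(State& s, int tid, int pc) {
    assert(pc == 0 || pc == 1);
    if (tid == 0) {
        if (pc == 0) s.a = s.x;
        else if (s.a != 0) { s.y = 1; s.stored = true; }
    } else {
        if (pc == 0) s.b = s.y;
        else if (s.b != 0) { s.x = 1; s.stored = true; }
    }
}

struct Summary {
    std::set<std::pair<int, int>> outcomes;  // final (a, b) pairs
    bool any_store = false;                  // true if some SC run performs a store
};

// Enumerate every SC interleaving (two steps per thread).
Summary explore_sc() {
    Summary sum;
    std::vector<int> schedule = {0, 0, 1, 1};
    do {
        State s;
        int pc[2] = {0, 0};
        for (int tid : schedule) step(s, tid, pc[tid]++);
        sum.outcomes.insert(std::make_pair(s.a, s.b));
        sum.any_store = sum.any_store || s.stored;
    } while (std::next_permutation(schedule.begin(), schedule.end()));
    return sum;
}
```

No SC execution performs a store, so no SC execution has a data race, and the only SC outcome is a = b = 0. A model with the DRF guarantee must therefore forbid a = b = 1 for this program.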

  17. Programming languages’ MM (overview)
      Criteria: Simplicity, No UB, DRF, Comp. Opt. (No OOTA), Eff. Comp. to Hardware
      Models:
      - SC [Lamport, 1979]
      - Java MM [Manson et al., 2005]
      - C/C++ MM [Batty et al., 2011]
      - RC11 [Lahav et al., 2017]
      - Promising [Kang et al., 2017, Lee et al., 2020]
      - Weakestmo [Chakraborty and Vafeiadis, 2019]
      - Modular Relaxed Dep. [Paviotti et al., 2020]
      - OCaml MM [Dolan et al., 2018]
      Thank you! http://podkopaev.net

  19. Validity of transformations [Ševčík and Aspinall, 2008]
      Valid under SC?
      - Trace-preserving transformations: ✓
      - Reordering normal memory accesses: ✗
      - Redundant read after read elimination: ✓
      - Redundant read after write elimination: ✓
      - Irrelevant read elimination: ✓
      - Irrelevant read introduction: ✓
      - Redundant write before write elimination: ✓
      - Redundant write after read elimination: ✓
      - External action reordering: ✗

  21. SC-preserving optimizations in LLVM [Marino et al., 2011]
      Average slowdown:
      ▶ 34% w/ only SC-preserving optimizations
      ▶ 5.5% w/ optimizations modified to preserve SC
      Drawbacks:
      ▶ Hardware still allows weak behaviors, i.e., no end-to-end SC
      ▶ Requires modifying existing compilers

  23. Validity of transformations [Ševčík and Aspinall, 2008]
      Columns: SC, JMM ∗
      - Trace-preserving transformations: ✓ ✓
      - Reordering normal memory accesses: ✓ ∗ ✗
      - Redundant read after read elimination: ✓ ✗
      - Redundant read after write elimination: ✓ ✓
      - Irrelevant read elimination: ✓ ✓
      - Irrelevant read introduction: ✓ ✗
      - Redundant write before write elimination: ✓ ✓
      - Redundant write after read elimination: ✓ ✗
      - External action reordering: ✗ ✗

  28. End-to-end SC via Volatile JVM [Liu et al., 2017, Liu et al., 2019]
      Java MM guarantees Data-Race-Freedom (DRF):
      shared locations are volatile (no data races) ⇒ SC semantics

  32. End-to-end SC via Volatile JVM [Liu et al., 2017, Liu et al., 2019]
      Slowdown, in %, per benchmark suite (DaCapo / spark-perf):
      - x86: average 28 / 79, max 81 / 164
      - ARM (1): average 57 / 85, max 157 / ∞
      - ARM (2): average 73 / 125, max 103 / ∞

  38. C/C++ MM allows a = b = 1 (OOTA) for:
      Thread 1: a := [x]; if a then [y] := 1
      Thread 2: b := [y]; if b then [x] := 1

  44. Executions in C/C++ MM
      Program:
        Thread 1: a := [x]; [y] := 1
        Thread 2: b := [y]; if b then [x] := 1
      Executions (R = read, W = write; po orders the events of each thread):
      - a = 0; b = 0: thread 1: R x 0, W y 1; thread 2: R y 0
      - a = 0; b = 1: thread 1: R x 0, W y 1; thread 2: R y 1, W x 1; rf: W y 1 → R y 1
      - a = 1; b = 1: thread 1: R x 1, W y 1; thread 2: R y 1, W x 1; rf: W y 1 → R y 1, W x 1 → R x 1
      Axioms:
      1. po ∪ rf_preserved is acyclic (rf_preserved ⊆ rf)
      2. …
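
The acyclicity axiom can be tried out on the a = 1, b = 1 execution. The following is a minimal sketch, not the talk's formalization: events are numbered by hand, `kPo` and `kRf` are the po and rf edges of that execution, and `is_acyclic` is a plain depth-first cycle check. Treating rf_preserved as empty when all accesses are relaxed is a simplifying assumption about the elided axioms; all identifiers are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;  // (from, to) between event ids

// Events of the a = 1, b = 1 execution:
//   0 = R x 1, 1 = W y 1 (thread 1); 2 = R y 1, 3 = W x 1 (thread 2)
const std::vector<Edge> kPo = {{0, 1}, {2, 3}};
const std::vector<Edge> kRf = {{1, 2}, {3, 0}};

static bool reaches_cycle(int v, const std::vector<std::vector<int>>& adj,
                          std::vector<int>& color) {
    color[v] = 1;  // on the current DFS path
    for (int w : adj[v]) {
        if (color[w] == 1) return true;  // back edge closes a cycle
        if (color[w] == 0 && reaches_cycle(w, adj, color)) return true;
    }
    color[v] = 2;  // fully explored
    return false;
}

// True if the directed graph over events 0..n-1 has no cycle.
bool is_acyclic(int n, const std::vector<Edge>& edges) {
    std::vector<std::vector<int>> adj(n);
    for (const Edge& e : edges) {
        assert(0 <= e.first && e.first < n && 0 <= e.second && e.second < n);
        adj[e.first].push_back(e.second);
    }
    std::vector<int> color(n, 0);  // 0 = unvisited
    for (int v = 0; v < n; ++v)
        if (color[v] == 0 && reaches_cycle(v, adj, color)) return false;
    return true;
}

// Union of two edge sets.
std::vector<Edge> unite(std::vector<Edge> a, const std::vector<Edge>& b) {
    a.insert(a.end(), b.begin(), b.end());
    return a;
}
```

With rf_preserved empty, `is_acyclic(4, kPo)` holds and the execution is accepted. With every rf edge preserved, `is_acyclic(4, unite(kPo, kRf))` fails because of the cycle R x 1 → W y 1 → R y 1 → W x 1 → R x 1.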

  49. Out-Of-Thin-Air in C/C++ MM
      Three programs share the execution with events R x 1, W y 1 (thread 1) and R y 1, W x 1 (thread 2), with rf edges W y 1 → R y 1 and W x 1 → R x 1:
      1. Thread 1: a := [x]; [y] := 1
         Thread 2: b := [y]; if b then [x] := 1
         ctrl dependency only in thread 2.
      2. Thread 1: a := [x]; if a then [y] := 1
         Thread 2: b := [y]; if b then [x] := 1
         ctrl dependencies in both threads.
      3. Thread 1: a := [x]; if a then [y] := 1 else [y] := 1
         Thread 2: b := [y]; if b then [x] := 1
         The ctrl dependency in thread 1 is fake: both branches perform the same write.

  51. RC11 [Lahav et al., 2017]: forbids all po ∪ rf cycles

  59. Forbidding po ∪ rf cycles
      Enough to respect [R]; po; [W], since hardware respects rf \ po.
      [Diagram: a po ∪ rf cycle through alternating R and W events, viewed as a cycle in (po ∪ rf \ po)∗ in which the rf \ po edges are linked by [R]; po; [W] segments.]
      How?
      1. Restrict compiler optimizations
      2. Put a fence between R and W
      Cheaper for C/C++ than for Java!
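
As a source-level analogue of option 2, the sketch below places a C++ fence between the read and the write in each thread of the load-buffering shape. This is an illustration only: RC11 asks the compiler, not the programmer, to preserve [R]; po; [W], and the function name `load_buffering_with_fences` is hypothetical. With `memory_order_acq_rel` fences in both threads, the C++ fence synchronization rules make the two fences synchronize with each other whenever both reads observe 1, which would form a happens-before cycle, so a = b = 1 is ruled out.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// One run of load buffering with a fence between each thread's read and write.
std::pair<int, int> load_buffering_with_fences() {
    std::atomic<int> x{0}, y{0};
    int a = 0, b = 0;
    std::thread t1([&] {
        a = x.load(std::memory_order_relaxed);                // R
        std::atomic_thread_fence(std::memory_order_acq_rel);  // fence between R and W
        y.store(1, std::memory_order_relaxed);                // W
    });
    std::thread t2([&] {
        b = y.load(std::memory_order_relaxed);                // R
        std::atomic_thread_fence(std::memory_order_acq_rel);  // fence between R and W
        x.store(1, std::memory_order_relaxed);                // W
    });
    t1.join();
    t2.join();
    assert(!(a == 1 && b == 1));  // the po ∪ rf cycle is forbidden
    return std::make_pair(a, b);
}
```

Each call may return (0, 0), (0, 1), or (1, 0) depending on scheduling, but never (1, 1).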

  60. C/C++ has undefined behavior

  67. Undefined Behavior and Memory Models
      int data = 0; atomic<int> f = 0;
      Thread 1: [data] := 42; [f]^rel := 1;
      Thread 2: while ([f]^acq == 0) {}; print([data]);
      With a plain int f instead of atomic<int> f:
      - Java: fine, but may print 0
      - C/C++: undefined behavior (race on a normal location)
      Java MM vs. C/C++ MM:
      - special locations: volatile int vs. atomic<int>
      - data race on int: weak guarantees vs. undefined behavior
      - subject to OOTA: access to int vs. relaxed (rlx) access to atomic<int>
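
The race-free version of this program maps directly onto C++ atomics. The following is a minimal sketch, with `message_passing` and `printed` as hypothetical names and the value read stored in a variable instead of being printed. The release store to `f` and the acquire load that observes it order the plain write to `data` before the plain read, so there is no data race and the reader sees 42.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the slide's message-passing program once and returns what the reader saw.
int message_passing() {
    int data = 0;           // normal (non-atomic) location
    std::atomic<int> f{0};  // special (atomic) location
    int printed = -1;
    std::thread writer([&] {
        data = 42;                              // [data] := 42
        f.store(1, std::memory_order_release);  // [f]^rel := 1
    });
    std::thread reader([&] {
        while (f.load(std::memory_order_acquire) == 0) {}  // while ([f]^acq == 0) {}
        printed = data;                         // stands in for print([data])
    });
    writer.join();
    reader.join();
    assert(printed == 42);
    return printed;
}
```

Declaring `f` as a plain `int` instead would turn the accesses to `f` into a data race, which is undefined behavior in C/C++.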

  70. To forbid po ∪ rf cycles in C/C++, it is enough to respect [R]; po; [W] on atomics

  76. Preserving [R]; po; [W] for atomics in LLVM [Ou and Demsky, 2018]
      1. Restrict compiler optimizations: no changes for LLVM
      2. Put a fence between R and W:
         ▶ x86: no fences
         ▶ ARMv8: bogus conditional branch for relaxed atomic reads
      Slowdown on ARMv8 is 0% on average and 6.3% max.
      Benchmarks: CDS from the CDS C++, Folly, Junction, and Rigtorp libs, and 6 benchmarks from CDSSpec.
