Concurrency and Memory Models Filip Sieczkowski
Why concurrency?
Moore’s law • Every two years, the number of transistors in a dense integrated circuit doubles • The keystone of the microprocessor industry
Physical limitations • Finite speed of light • Atomic nature of matter • Power consumption & heat dissipation
Resolution: concurrency • Instead of increasing speed, increase the number of processors • Not necessarily the only way, see Matt Might: http://matt.might.net/papers/might2009manycorefad-talk.pdf • Need to run things on multiple chips and communicate
Shared-memory Concurrency Primer
Modelling Concurrency
Interleaving Interpretation • At each step, any thread can execute one atomic command • The effect takes place directly in the memory • Every execution corresponds to some sequential interleaving of the threads' instructions • In the example, possible results are (0, 1), (1, 0) and (1, 1)
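The slide's example program is not reproduced in this text. As a minimal sketch, assume it has the classic store-buffering shape: one thread writes x and then reads y, the other writes y and then reads x, with both variables initially 0. The names THREAD_A, THREAD_B, interleavings and run are hypothetical, chosen for illustration. Enumerating every interleaving that preserves each thread's program order gives the three results listed above.

```python
from itertools import combinations

# Assumed example (not taken from the slides): the store-buffering shape.
# Each instruction is ("write", var, value) or ("read", var, register).
THREAD_A = [("write", "x", 1), ("read", "y", "r1")]
THREAD_B = [("write", "y", 1), ("read", "x", "r2")]

def interleavings(a, b):
    """Yield every merge of a and b that keeps each thread's program order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]

def run(schedule):
    """Execute one interleaving directly against a single shared memory."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for op, var, arg in schedule:
        if op == "write":
            memory[var] = arg
        else:
            regs[arg] = memory[var]
    return regs["r1"], regs["r2"]

outcomes = {run(s) for s in interleavings(THREAD_A, THREAD_B)}
print(sorted(outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

The result (0, 0) never appears: it would require each read to run before the other thread's write, which no single interleaving allows.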
Libraries for Concurrency • Writing correct concurrent code is very tricky • Most languages provide libraries, like java.util.concurrent • Even using libraries and locks requires care
Fine-grained Concurrency: a spin-lock • A lock grants ownership to just one thread at a time • We need an atomic communication primitive! • Compare-and-swap is costly • Even the simplest algorithms are very complicated
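A spin-lock built on compare-and-swap might look roughly like the sketch below. Python exposes no hardware compare-and-swap, so the hypothetical AtomicFlag class emulates one with an internal mutex; on a real machine that method would be a single atomic instruction. SpinLock and run_workers are likewise illustrative names, not code from the slides.

```python
import threading
import time

class AtomicFlag:
    """Stand-in for a memory word that supports compare-and-swap.

    Assumption: atomicity is emulated with a mutex, because Python has
    no user-level CAS instruction.
    """
    def __init__(self):
        self._value = False
        self._guard = threading.Lock()

    def compare_and_swap(self, expected, new):
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False

class SpinLock:
    def __init__(self):
        self._held = AtomicFlag()

    def acquire(self):
        # Spin until this thread is the one that flips the flag False -> True.
        while not self._held.compare_and_swap(False, True):
            time.sleep(0)  # yield so the current owner can make progress

    def release(self):
        self._held.compare_and_swap(True, False)

def run_workers(n_threads=4, n_increments=200):
    lock = SpinLock()
    counter = [0]

    def worker():
        for _ in range(n_increments):
            lock.acquire()
            try:
                counter[0] += 1  # critical section: a read-modify-write
            finally:
                lock.release()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]

total = run_workers()
print(total)  # 800
```

Because only the thread that wins the compare-and-swap enters the critical section, no increment is lost and the total equals n_threads * n_increments.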
Relaxed Memory pt. 1: Total Store Order
What about Performance? • The standard model for concurrency is slow • Memory is huge and located far away: a write costs at least 10–20 cycles! • Undermines the reason for adding concurrency
Relaxed Memory: TSO • Each thread is equipped with a FIFO buffer • A write action is queued in the buffer • The thread tries to read from its own store buffer before consulting the main memory • The buffered writes are nondeterministically flushed
Weak Behaviours on TSO
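The store-buffer model described above can be explored with a small simulator. This is a sketch under assumptions: the program is the classic store-buffering test (the slide's own example is not in this text), and replace, explore and tso_outcomes are hypothetical helper names. The search tries every choice between executing a thread's next instruction and flushing the oldest entry of a buffer.

```python
# Assumed example: each thread writes one variable, then reads the other.
THREAD_A = [("write", "x", 1), ("read", "y", "r1")]
THREAD_B = [("write", "y", 1), ("read", "x", "r2")]

def replace(tup, i, item):
    """Return a copy of tup with position i set to item."""
    return tup[:i] + (item,) + tup[i + 1:]

def explore(threads, pcs, buffers, memory, regs, results):
    finished = all(pc == len(t) for pc, t in zip(pcs, threads))
    if finished and not any(buffers):
        results.add((regs["r1"], regs["r2"]))
        return
    for tid, prog in enumerate(threads):
        buf = buffers[tid]
        if buf:
            # Nondeterministic flush: commit the oldest buffered write (FIFO).
            var, val = buf[0]
            explore(threads, pcs, replace(buffers, tid, buf[1:]),
                    {**memory, var: val}, regs, results)
        if pcs[tid] == len(prog):
            continue
        op, var, arg = prog[pcs[tid]]
        next_pcs = replace(pcs, tid, pcs[tid] + 1)
        if op == "write":
            # A write is queued in the thread's own buffer, not in memory.
            explore(threads, next_pcs,
                    replace(buffers, tid, buf + ((var, arg),)),
                    memory, regs, results)
        else:
            # A read takes the newest matching buffered write, else memory.
            pending = [v for name, v in buf if name == var]
            value = pending[-1] if pending else memory[var]
            explore(threads, next_pcs, buffers, memory,
                    {**regs, arg: value}, results)

def tso_outcomes(threads):
    results = set()
    n = len(threads)
    explore(threads, (0,) * n, ((),) * n, {"x": 0, "y": 0}, {}, results)
    return results

outcomes = tso_outcomes([THREAD_A, THREAD_B])
print(sorted(outcomes))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

The extra result (0, 0) is the weak behaviour: both writes sit in their buffers while each thread reads the other variable from main memory.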
Managing the Store-buffers • Fences ensure the buffered writes get committed to main memory • This restores sequentially consistent (SC) behaviours, but at a cost • Compare-and-swap also needs to flush the buffered writes to main memory • Presence of relaxed memory complicates reasoning even further!
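To illustrate the effect of fences, the sketch below models a fence as an instruction that can only complete once the issuing thread's store buffer is empty. That is a simplifying assumption for illustration, not a description of any particular processor's fence instruction. The program is again the assumed store-buffering test, and all function names are hypothetical.

```python
# Instructions: ("write", var, value), ("read", var, register),
# or ("fence", None, None).
FENCE = ("fence", None, None)
THREAD_A = [("write", "x", 1), FENCE, ("read", "y", "r1")]
THREAD_B = [("write", "y", 1), FENCE, ("read", "x", "r2")]

def replace(tup, i, item):
    return tup[:i] + (item,) + tup[i + 1:]

def explore(threads, pcs, buffers, memory, regs, results):
    finished = all(pc == len(t) for pc, t in zip(pcs, threads))
    if finished and not any(buffers):
        results.add((regs["r1"], regs["r2"]))
        return
    for tid, prog in enumerate(threads):
        buf = buffers[tid]
        if buf:  # flush the oldest buffered write to main memory
            var, val = buf[0]
            explore(threads, pcs, replace(buffers, tid, buf[1:]),
                    {**memory, var: val}, regs, results)
        if pcs[tid] == len(prog):
            continue
        op, var, arg = prog[pcs[tid]]
        next_pcs = replace(pcs, tid, pcs[tid] + 1)
        if op == "write":
            explore(threads, next_pcs,
                    replace(buffers, tid, buf + ((var, arg),)),
                    memory, regs, results)
        elif op == "fence":
            if not buf:  # the fence waits until the buffer has drained
                explore(threads, next_pcs, buffers, memory, regs, results)
        else:
            pending = [v for name, v in buf if name == var]
            value = pending[-1] if pending else memory[var]
            explore(threads, next_pcs, buffers, memory,
                    {**regs, arg: value}, results)

def tso_outcomes(threads):
    results = set()
    n = len(threads)
    explore(threads, (0,) * n, ((),) * n, {"x": 0, "y": 0}, {}, results)
    return results

def without_fences(prog):
    return [instr for instr in prog if instr[0] != "fence"]

fenced = tso_outcomes([THREAD_A, THREAD_B])
unfenced = tso_outcomes([without_fences(THREAD_A), without_fences(THREAD_B)])
print(sorted(fenced))    # [(0, 1), (1, 0), (1, 1)]
print(sorted(unfenced))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

With a fence between the write and the read of each thread, the weak result (0, 0) disappears and only the SC results remain. The cost is that each thread stalls until its buffer has been committed.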
Relaxed Memory pt. 2: Even More Relaxed
TSO as Code Reordering • We can think of the store buffer effects as executing a read before an earlier write to a different location • But why stop there?
Message Passing and Relaxed Memory • In sequentially consistent semantics r is always 42 • This is also the case on TSO! • What happens if we allow more code reordering? • Power and ARM processors actually allow that to happen!
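The slide's program is not reproduced in this text. A common form of the message-passing test, assumed here, has a writer store 42 to data and then set flag, while a reader checks flag and then loads data into r. The sketch below models "more code reordering" very crudely by also allowing the writer's two stores to run in swapped order. That is an illustrative assumption, not a model of the actual Power or ARM semantics, and all names are hypothetical.

```python
from itertools import combinations, permutations

# Assumed message-passing program: data and flag both start at 0.
WRITER = [("write", "data", 42), ("write", "flag", 1)]
READER = [("read", "flag", "f"), ("read", "data", "r")]

def interleavings(a, b):
    """Yield every merge of a and b that keeps the given order within each."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]

def run(schedule):
    memory = {"data": 0, "flag": 0}
    regs = {}
    for op, var, arg in schedule:
        if op == "write":
            memory[var] = arg
        else:
            regs[arg] = memory[var]
    return regs

def observed_r(writer_orders):
    """Values of r over all executions in which the reader saw flag == 1."""
    seen = set()
    for writer in writer_orders:
        for schedule in interleavings(list(writer), READER):
            regs = run(schedule)
            if regs["f"] == 1:
                seen.add(regs["r"])
    return seen

sc = observed_r([WRITER])                    # program order only
relaxed = observed_r(permutations(WRITER))   # the two stores may be swapped
print(sorted(sc))       # [42]
print(sorted(relaxed))  # [0, 42]
```

Once the flag store may overtake the data store, the reader can see the flag set and still read the stale value 0.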
Memory Models and Compilers • Need to produce efficient code • Necessary to exploit relaxed behaviours of machines, without sacrificing correctness • Some languages (Java, C/C++) also have nonstandard memory models at the source level!
Recap • Concurrency is here to stay • It’s hard to use properly • The actual machine architectures are even more complex • Compiler writers need to exploit these behaviours