  1. Looking Inside a Race Detector

  2. kavya @kavya719

  3. data race detection

  4. data races “when two+ threads concurrently access a shared memory location, at least one access is a write.”

      // Shared variable
      var count = 0

      func incrementCount() {
          if count == 0 {
              count++
          }
      }

      func main() {
          // Spawn two “threads”
          go incrementCount() // “g1”
          go incrementCount() // “g2”
      }

      The slide traces three interleavings of g1’s and g2’s reads (R) and writes (W): a fully serialized one (not concurrent, count = 1), and two in which both goroutines read count == 0 before either writes (concurrent accesses, count = 2). The latter two are data races.
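      For reference, a directly runnable version of this program. The sync.WaitGroup is my addition so that main waits for both goroutines; it synchronizes completion only, so the race on count remains and go run -race flags it:

      package main

      import "sync"

      // Shared variable
      var count = 0

      func incrementCount() {
          if count == 0 { // racy read
              count++ // racy write
          }
      }

      func main() {
          var wg sync.WaitGroup
          wg.Add(2)
          go func() { defer wg.Done(); incrementCount() }() // “g1”
          go func() { defer wg.Done(); incrementCount() }() // “g2”
          wg.Wait()
      }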

  5. data races “when two+ threads concurrently access a shared memory location, at least one access is a write.”

      The same increments are not a data race when a lock orders the accesses:

      Thread 1        Thread 2
      lock(l)         lock(l)
      count = 1       count = 2
      unlock(l)       unlock(l)

  6. data races are:
      • relevant
      • elusive
      • have undefined consequences
      • easy to introduce in languages like Go

      Panic messages from unexpected program crashes are often reported on the Go issue tracker. “An overwhelming number of these panics are caused by data races, and an overwhelming number of those reports centre around Go’s built in map type.” — Dave Cheney

  7. given that we want to write multithreaded programs, how do we protect our systems from the unknown consequences of difficult-to-track-down data race bugs… in a manner that is reliable and scalable?

  8. race detectors report races like: “read by goroutine 7 at incrementCount(), created at main()”

  9. …but how?

  10. go race detector
      • Go v1.1 (2013)
      • integrated with the Go toolchain:
        > go run -race counter.go
      • based on the C/C++ ThreadSanitizer dynamic race detection library
      • as of August 2015: 1200+ races found in Google’s codebase, ~100 in the Go stdlib, 100+ in Chromium, plus races in LLVM, GCC, OpenSSL, WebRTC, Firefox

  11. core concepts · internals · evaluation · wrap-up

  12. core concepts

  13. concurrency in go
      The unit of concurrent execution: goroutines, user-space threads; use them as you would threads:
      > go handle_request(r)

      The Go memory model is specified in terms of goroutines:
      within a goroutine, reads and writes are ordered;
      with multiple goroutines, shared data must be synchronized… else data races!

  14. The synchronization primitives:
      channels
      > ch <- value
      mutexes, condition variables, …
      > import “sync”
      > mu.Lock()
      atomics
      > import “sync/atomic”
      > atomic.AddUint64(&myInt, 1)
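      A small runnable sketch of each primitive removing the race from the counter example. The structure here is my own, and any of the three alone suffices:

      package main

      import (
          "sync"
          "sync/atomic"
      )

      var (
          mu    sync.Mutex
          count = 0
      )

      // mutexes: Lock/Unlock order the two goroutines’ accesses.
      func incrementWithMutex() {
          mu.Lock()
          defer mu.Unlock()
          if count == 0 {
              count++
          }
      }

      // atomics: the racy read-then-write collapses into a single
      // synchronized compare-and-swap.
      var atomicCount uint64

      func incrementWithCAS() {
          atomic.CompareAndSwapUint64(&atomicCount, 0, 1)
      }

      func main() {
          // channels: confine the counter to one goroutine and
          // communicate instead of sharing memory.
          requests := make(chan struct{})
          go func() {
              n := 0
              for range requests {
                  if n == 0 {
                      n++
                  }
              }
          }()

          var wg sync.WaitGroup
          for i := 0; i < 2; i++ {
              wg.Add(1)
              go func() {
                  defer wg.Done()
                  incrementWithMutex()
                  incrementWithCAS()
                  requests <- struct{}{}
              }()
          }
          wg.Wait()
          close(requests)
      }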

  15. concurrency? “…goroutines concurrently access a shared memory location, at least one access is a write.”

      var count = 0

      func incrementCount() {
          if count == 0 {
              count++
          }
      }

      func main() {
          go incrementCount() // “g1”
          go incrementCount() // “g2”
      }

      As before: the serialized interleaving (count = 1) is not concurrent; the interleavings that give count = 2 are concurrent accesses.

  16. how can we determine “concurrent” memory accesses?

  17. var count = 0

      func incrementCount() {
          if count == 0 {
              count++
          }
      }

      func main() {
          incrementCount()
          incrementCount()
      }

      not concurrent: same goroutine

  18. var count = 0

      func incrementCount() {
          mu.Lock()
          if count == 0 {
              count++
          }
          mu.Unlock()
      }

      func main() {
          go incrementCount()
          go incrementCount()
      }

      not concurrent: the lock draws a “dependency edge”

  19. happens-before
      orders events across goroutines. Events are memory accesses (reads and writes, e.g. a := b) and synchronization operations (via locks or lock-free sync, e.g. mu.Unlock(), ch <- a).

      X ≺ Y IF one of:
      • X and Y are in the same goroutine, and X occurs first
      • X and Y are a synchronization pair
      • X ≺ E ≺ Y (transitivity)

      IF X not ≺ Y and Y not ≺ X: concurrent!
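      A minimal Go sketch of the second and third rules, using the channel semantics from the Go memory model: the send/receive is a synchronization pair, and transitivity orders the write before the read:

      package main

      import "fmt"

      var a string

      func main() {
          done := make(chan struct{})
          go func() {
              a = "hello"        // X: write in the child goroutine
              done <- struct{}{} // X ≺ send (same goroutine)
          }()
          <-done         // send ≺ receive (synchronization pair)
          fmt.Println(a) // Y: X ≺ send ≺ receive ≺ Y, so no race
      }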

  20. [diagram: g1 runs Lock, A: read, B: write, Unlock; g2 then runs Lock, C: read, D: read, Unlock on the same mutex]
      A ≺ B: same goroutine
      B ≺ C: lock-unlock on the same object
      A ≺ D: transitivity

  21. var count = 0

      func incrementCount() {
          if count == 0 {
              count++
          }
      }

      func main() {
          go incrementCount()
          go incrementCount()
      }

      concurrent?

  22. [diagram: g1 runs A: read, B: write; g2 runs C: read, D: write; no synchronization between them]
      A ≺ B and C ≺ D: same goroutine
      but A ⊀ C and C ⊀ A
      concurrent

  23. how can we implement happens-before?

  24. vector clocks
      a means to establish happens-before edges. Each goroutine keeps a vector clock, one entry per goroutine, and increments its own entry on each of its events:

      g1: (0, 0) → (1, 0) read(count) → (2, 0) → (3, 0) → (4, 0) unlock(mu)
      g2: (0, 0) → (0, 1) → lock(mu): t1 = max(4, 0), t2 = max(0, 1) → (4, 1)

      On an unlock/lock synchronization pair, the locking goroutine merges in the unlocking goroutine’s clock, taking the element-wise max.

  25. [diagram: the slide-20 schedule with vector clocks; g1: L (1, 0), A: R, B: W, U (4, 0); g2: L (4, 1), C: R, D: R (4, 2)]
      A ≺ D? (3, 0) < (4, 2)? so yes.

  26. [diagram: g1: A read (1, 0), B write (2, 0); g2: C read (0, 1), D write (0, 2); no synchronization]
      B ≺ C? (2, 0) < (0, 1)? no.
      C ≺ B? (0, 1) < (2, 0)? no.
      so, concurrent

  27. pure happens-before detection
      This is what the Go race detector does! It determines whether the accesses to a memory location can be ordered by happens-before, using vector clocks.
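      A toy vector-clock sketch of that idea (the names and representation are mine, not TSan’s), replaying the slide-26 numbers:

      package main

      import "fmt"

      // VC is a toy vector clock: one logical clock per goroutine.
      type VC []uint64

      // tick advances goroutine g’s own entry on one of its events.
      func (v VC) tick(g int) { v[g]++ }

      // join merges another clock into v, element-wise max: this is
      // what an acquire (e.g. lock) does with the releaser’s clock.
      func (v VC) join(o VC) {
          for i := range v {
              if o[i] > v[i] {
                  v[i] = o[i]
              }
          }
      }

      // before reports v ≺ o: every entry ≤, and at least one <.
      func (v VC) before(o VC) bool {
          lt := false
          for i := range v {
              if v[i] > o[i] {
                  return false
              }
              if v[i] < o[i] {
                  lt = true
              }
          }
          return lt
      }

      func main() {
          b := VC{2, 0} // g1’s clock at write B
          c := VC{0, 1} // g2’s clock at read C
          // Neither B ≺ C nor C ≺ B: the accesses are concurrent.
          fmt.Println(b.before(c), c.before(b)) // false false
      }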

  28. internals

  29. go run -race
      to implement happens-before detection, the detector needs to:
      • create vector clocks for goroutines …at goroutine creation
      • update vector clocks based on memory-access and synchronization events …when these events occur
      • compare vector clocks to detect happens-before relations …when a memory access occurs

  30. [diagram: the race detector as a state machine: program events (spawn, lock, read) feed the race detector, which updates its state and reports races]

  31. do we have to modify our programs, then, to generate these events (memory accesses, synchronizations, goroutine creation)? nope.

  32. var count = 0

      func incrementCount() {
          if count == 0 {
              count++
          }
      }

      func main() {
          go incrementCount()
          go incrementCount()
      }

  33. with -race, the same program is compiled as:

      var count = 0

      func incrementCount() {
          raceread()
          if count == 0 {
              racewrite()
              count++
          }
          racefuncexit()
      }

      func main() {
          go incrementCount()
          go incrementCount()
      }
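      You can inspect this instrumentation yourself by dumping the compiler’s assembly listing (counter.go is the example file here; exact symbol names vary by Go version, and a file with imports may need extra setup):
      > go tool compile -race -S counter.go
      The listing shows the inserted calls, such as raceread and racewrite, around each memory access.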

  34. go tool compile -race
      the gc compiler instruments memory accesses by adding an instrumentation pass over the IR:

      func compile(fn *Node) {
          ...
          order(fn)
          walk(fn)
          if instrumenting {
              instrument(Curfn)
          }
          ...
      }

  35. This is awesome: we don’t have to modify our programs to track memory accesses. What about synchronization events and goroutine creation? The runtime and standard library are already instrumented:

      proc.go (package runtime):

      func newproc1() {
          if race.Enabled {
              newg.racectx = racegostart(…)
          }
          ...
      }

      mutex.go (package sync):

      import “internal/race”

      func (m *Mutex) Lock() {
          if race.Enabled {
              race.Acquire(…)
          }
          ...
      }

      race.Acquire in turn calls the runtime’s raceacquire(addr).

  36. runtime.raceread()
      calls into the ThreadSanitizer (TSan) library, a C++ race-detection library. (runtime.raceread is defined in a .asm file because it is calling into C++.)

      program → TSan

  37. threadsanitizer
      TSan implements the happens-before race detection:
      • creates and updates vector clocks for goroutines -> ThreadState
      • keeps track of memory-access and synchronization events -> Shadow State, Meta Map
      • compares vector clocks to detect data races

  38. go incrementCount()

      proc.go:

      func newproc1() {
          if race.Enabled {
              newg.racectx = racegostart(…)
          }
          ...
      }

      struct ThreadState {
          ThreadClock clock;
      }

      ThreadState contains a fixed-size vector clock (size == max(# threads)).

      count == 0
      raceread(…)   // inserted by compiler instrumentation

      On each access, TSan must:
      1. check for a data race with a previous access
      2. store information about this access for future detections

  39. shadow state
      stores information about memory accesses, as an 8-byte shadow word per access:

      TID | clock | pos | wr

      TID: accessor goroutine ID
      clock: scalar clock of the accessor (an optimized vector clock)
      pos: offset and size within the 8-byte word
      wr: IsWrite bit

      The shadow region is directly mapped from the application address space:
      application: 0x7f0000000000 – 0x7fffffffffff
      shadow:      0x180000000000 – 0x1fffffffffff
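      A toy encoding of such a shadow word. The field widths here are illustrative guesses, not TSan’s actual layout:

      package main

      import "fmt"

      // pack builds a toy 8-byte shadow word:
      //   tid   (16 bits): accessor goroutine ID
      //   clock (42 bits): accessor’s scalar clock
      //   pos   (5 bits) : offset/size of the access in the word
      //   wr    (1 bit)  : IsWrite
      func pack(tid, clock, pos uint64, wr bool) uint64 {
          w := tid<<48 | clock<<6 | pos<<1
          if wr {
              w |= 1
          }
          return w
      }

      func main() {
          // g1 writes bytes 0:8 of a word at scalar clock 1.
          fmt.Printf("%#016x\n", pack(1, 1, 0, true))
      }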

  40. Optimization 1: N shadow cells per 8-byte application word
      gx read  -> [gx, clock_1, 0:2, 0]
      gy write -> [gy, clock_2, 4:8, 1]
      When all shadow cells are filled, evict one at random.
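      A sketch of that eviction policy. N = 4 and the zero-word-means-empty convention are my assumptions; TSan’s cell count and representation differ:

      package main

      import (
          "fmt"
          "math/rand"
      )

      const n = 4 // shadow cells per 8-byte word; a tuning knob

      // granule holds the N shadow words for one application word.
      type granule [n]uint64

      // store writes a new shadow word, evicting a random occupant
      // when every cell is already filled.
      func (g *granule) store(w uint64) {
          for i, cell := range g {
              if cell == 0 { // empty cell
                  g[i] = w
                  return
              }
          }
          g[rand.Intn(n)] = w // full: evict one at random
      }

      func main() {
          var g granule
          for w := uint64(1); w <= 5; w++ { // 5 stores force one eviction
              g.store(w)
          }
          fmt.Println(g)
      }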

  41. Optimization 2: store a scalar clock in the shadow word, not the full vector clock
      gx access with vector clock (gx: 3, gy: 2) -> the shadow word records only 3

  42. g1: count == 0   raceread(…)    // by compiler instrumentation
          g1’s clock (0, 0) -> shadow word [g1, 0, 0:8, 0]
      g1: count++      racewrite(…)
          g1’s clock (1, 0) -> shadow word [g1, 1, 0:8, 1]
      g2: count == 0   raceread(…)    // and check for race
          g2’s clock (0, 0) -> shadow word [g2, 0, 0:8, 0]

  43. race detection
      compare <accessor’s vector clock, new shadow word> with each existing shadow word:
      new:      [g2, 0, 0:8, 0]
      existing: [g1, 1, 0:8, 1]
      “…when two+ threads concurrently access a shared memory location, at least one access is a write.”

  44. race detection
      compare <accessor’s vector clock, new shadow word> with each existing shadow word:
      ✓ do the access locations overlap?
      ✓ are any of the accesses a write?
      ✓ are the TIDs different?
      ✓ are they concurrent (no happens-before)?
      g2’s vector clock is (0, 0) but the existing shadow word’s clock is (1, ?): g2 has not seen g1’s write, so there is no happens-before edge.

  45. race detection
      all four checks pass for [g2, 0, 0:8, 0] vs [g1, 1, 0:8, 1]: RACE!
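      A toy version of this per-access comparison. The struct layout and names are illustrative; the happens-before test is the scalar-clock-vs-vector-clock check described above:

      package main

      import "fmt"

      type shadow struct {
          tid   int    // accessor goroutine ID
          clock uint64 // accessor’s scalar clock at access time
          off   int    // offset of the access in the 8-byte word
          size  int
          write bool   // IsWrite
      }

      // raceWith reports whether access cur races with older access
      // prev, given the full vector clock of cur’s accessor.
      func raceWith(cur, prev shadow, curVC []uint64) bool {
          overlap := cur.off < prev.off+prev.size && prev.off < cur.off+cur.size
          anyWrite := cur.write || prev.write
          diffTID := cur.tid != prev.tid
          // happens-before: prev ≺ cur iff prev’s scalar clock is
          // ≤ cur’s vector-clock entry for prev’s goroutine.
          ordered := prev.clock <= curVC[prev.tid]
          return overlap && anyWrite && diffTID && !ordered
      }

      func main() {
          // The slide’s example: g2’s read vs g1’s earlier write.
          g2VC := []uint64{0, 0} // g2 has seen nothing of g1
          cur := shadow{tid: 1, clock: 0, off: 0, size: 8}               // [g2, 0, 0:8, 0]
          prev := shadow{tid: 0, clock: 1, off: 0, size: 8, write: true} // [g1, 1, 0:8, 1]
          fmt.Println(raceWith(cur, prev, g2VC)) // true: RACE!
      }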

  46. synchronization events
      TSan must also track synchronization events:
      g1: (0, 0) → (1, 0) → (2, 0) → (3, 0) unlock(mu)
      g2: (0, 0) → (0, 1) lock(mu): g1-entry = max(3, 0), g2-entry = max(0, 1) → (3, 1)

  47. sync vars
      mu := sync.Mutex{}

      struct SyncVar {
          SyncClock clock;
      }

      SyncVars are stored in the meta map region; each contains a vector clock.
      g1: mu.Unlock() at (3, 0) -> SyncClock = (3, 0)
      g2: mu.Lock() -> clock = max(own clock (0, 1), SyncClock) = (3, 1)
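      A sketch of that clock handoff, replaying the slide’s numbers. The type and method names here are mine:

      package main

      import "fmt"

      type vclock []uint64

      // join max-merges another clock into v (same length assumed).
      func (v vclock) join(o vclock) {
          for i := range v {
              if o[i] > v[i] {
                  v[i] = o[i]
              }
          }
      }

      // syncVar mirrors the slide’s SyncVar: a vector clock kept in
      // the meta map for each synchronization object.
      type syncVar struct{ clock vclock }

      // release: on mu.Unlock(), the unlocker publishes its clock.
      func (s *syncVar) release(threadClock vclock) {
          s.clock = append(vclock(nil), threadClock...)
      }

      // acquire: on mu.Lock(), the locker max-merges the SyncClock in.
      func (s *syncVar) acquire(threadClock vclock) {
          threadClock.join(s.clock)
      }

      func main() {
          mu := &syncVar{clock: make(vclock, 2)}
          g1 := vclock{3, 0}
          g2 := vclock{0, 1}
          mu.release(g1)  // g1: mu.Unlock() at (3, 0)
          mu.acquire(g2)  // g2: mu.Lock()
          fmt.Println(g2) // [3 1]
      }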
