
What Came First? The Ordering of Events in Systems @kavya719



  1. What Came First? The Ordering of Events in Systems @kavya719

  2. kavya

  3. the design of concurrent systems

  4. Slack architecture on AWS

  5. systems with multiple independent actors: threads in a multithreaded program, nodes in a distributed system. concurrent actors

  6. threads user-space or system threads

  7. threads user-space or system threads

     var tasks []Task

     func main() {
         for {
             if len(tasks) > 0 {        // R
                 task := dequeue(tasks) // R, W
                 process(task)
             }
         }
     }

  8. multiple threads: g1 g2

     // Shared variable
     var tasks []Task

     func worker() {
         for len(tasks) > 0 {       // R
             task := dequeue(tasks) // R, W
             process(task)
         }
     }

     func main() {
         // Spawn fixed-pool of worker threads.
         startWorkers(3, worker)
         // Populate task queue.
         for _, t := range hellaTasks {
             tasks = append(tasks, t) // R, W
         }
     }

     data race: “when two+ threads concurrently access a shared memory location, and at least one access is a write.”

  9. …many threads provide concurrency, but may introduce data races.

  10. nodes
      processes, i.e. logical nodes (but the term can also refer to machines, i.e. physical nodes).
      communicate by message-passing, i.e. connected by an unreliable network; no shared memory.
      are sequential. no global clock.

  11. distributed key-value store.
      three nodes: a master (M) and two replicas (R).
      user X: ADD apple crepe. user Y: ADD blueberry crepe.
      cart: [ ] becomes cart: [ apple crepe, blueberry crepe ]

  12. distributed key-value store.
      three nodes with three equal replicas (N1, N2, N3). read_quorum = write_quorum = 1. eventually consistent.
      user X: ADD apple crepe. user Y: ADD blueberry crepe.
      cart: [ ] diverges: N1 has cart: [ apple crepe ], N3 has cart: [ blueberry crepe ]

  13. …multiple nodes accepting writes provide availability, but may introduce conflicts.

  14. given we want concurrent systems, we need to deal with data races and conflict resolution.

  15. riak: distributed key-value store
      channels: Go concurrency primitive
      stepping back: similarity, meta-lessons

  16. riak a distributed datastore

  17. riak
      • Distributed key-value database:
        // A data item = <key: blob>
        {“uuid1234”: {“name”: “ada”}}
      • v1.0 released in 2011. Based on Amazon’s Dynamo.
      • Eventually consistent: uses optimistic replication, i.e. replicas can temporarily diverge, but will eventually converge. An AP system (CAP theorem).
      • Highly available: data partitioned and replicated, decentralized, sloppy quorum.

  18. conflict resolution: starting from cart: [ ], ADD apple crepe lands on N1 (cart: [ apple crepe ]) while ADD blueberry crepe lands on N3 (cart: [ blueberry crepe ]).
      causal updates: starting from cart: [ apple crepe ] on N1, UPDATE to date crepe yields cart: [ date crepe ].

  19. how do we determine causal vs. concurrent updates?

  20. A: apple, B: blueberry, D: date.
      user X: A = ADD apple ({ cart: [ A ] }), then C = fetch ({ cart: [ A ] }), then D = UPDATE to date ({ cart: [ D ] }).
      user Y: B = ADD blueberry ({ cart: [ B ] }).
      events A, B, C, D across N1, N2, N3. concurrent events?

  21. A B C D N 1 N 2 N 3 concurrent events?

  22. A B C D N 1 N 2 N 3 A, C: not concurrent — same sequential actor

  23. A B C D N 1 N 2 N 3 A, C: not concurrent — same sequential actor C, D: not concurrent — fetch/ update pair

  24. happens-before orders events across actors (threads or nodes).
      X ≺ Y IF one of:
      — same actor
      — are a synchronization pair
      — X ≺ E ≺ Y (transitivity)
      IF X not ≺ Y and Y not ≺ X: concurrent!
      establishes causality and concurrency.
      Formulated in Lamport’s “Time, Clocks, and the Ordering of Events” paper, 1978.

  25. causality and concurrency A B C D N 1 N 2 N 3 A ≺ C (same actor) C ≺ D (synchronization pair) So, A ≺ D (transitivity)

  26. causality and concurrency A B C D N 1 N 2 N 3 …but neither B ≺ D nor D ≺ B. So, B, D concurrent!

  27. { cart: [ A ] } { cart: [ B ] } { cart: [ A ] } { cart: [ D ] } A B C D N 1 N 2 N 3
      A ≺ D, so D should update A.
      B, D concurrent, so B, D need resolution.

  28. how do we implement happens-before?

  29. vector clocks: a means to establish happens-before edges.
      [diagram: three nodes n1, n2, n3, each starting with clock (0, 0, 0); n1 ticks its own entry to (1, 0, 0), n3 to (0, 0, 1)]

  30. [diagram continues: n1 ticks again to (2, 0, 0)]

  31. [diagram continues: n2 ticks to (0, 1, 0)]

  32. [diagram continues: n2 receives n1’s message and merges clocks element-wise: max((2, 0, 0), (0, 1, 0)) = (2, 1, 0)]

  33. happens-before comparison: X ≺ Y iff VCx < VCy

  34. [diagram: events A, B, C, D on N1, N2, N3 with their vector clocks]
      VC at A: (1, 0, 0). VC at D: (2, 1, 0). (1, 0, 0) < (2, 1, 0), so A ≺ D.

  35. VC at B: (0, 0, 1). VC at D: (2, 1, 0). Neither clock is less than the other, so B, D concurrent.

  36. causality tracking in riak
      Riak stores a vector clock with each version of the data (a more precise form, the “dotted version vector”).
      GET, PUT operations on a key pass around a causal context object that contains the vector clocks.
      Therefore, able to detect conflicts.

  37. causality tracking in riak
      Riak stores a vector clock with each version of the data (a more precise form, the “dotted version vector”).
      GET, PUT operations on a key pass around a causal context object that contains the vector clocks.
      Therefore, able to detect conflicts. …what about resolving those conflicts?

  38. conflict resolution in riak
      Behavior is configurable. Assuming vector clock analysis is enabled:
      • last-write-wins, i.e. the version with the higher timestamp is picked.
      • merge, iff the underlying data type is a CRDT.
      • return conflicting versions to the application: riak stores “siblings”, or conflicting versions, returned to the application for resolution.

  39. return conflicting versions to application: Riak stores both versions:
      B: { cart: [ “blueberry crepe” ] } with clock (0, 0, 1), and D: { cart: [ “date crepe” ] } with clock (2, 1, 0).
      The next op returns both to the application; the application must resolve the conflict:
      { cart: [ “blueberry crepe”, “date crepe” ] }, which creates a causal update with clock (2, 1, 1).

  40. …what about resolving those conflicts? riak doesn’t (default behavior); instead, it exposes the happens-before graph to the application for conflict resolution.

  41. riak: uses vector clocks to track causality and conflicts. exposes happens-before graph to the user for conflict resolution.

  42. channels Go concurrency primitive

  43. multiple threads: g1 g2

      // Shared variable
      var tasks []Task

      func worker() {
          for len(tasks) > 0 {       // R
              task := dequeue(tasks) // R, W
              process(task)
          }
      }

      func main() {
          // Spawn fixed-pool of worker threads.
          startWorkers(3, worker)
          // Populate task queue.
          for _, t := range hellaTasks {
              tasks = append(tasks, t) // R, W
          }
      }

      data race: “when two+ threads concurrently access a shared memory location, and at least one access is a write.”

  44. memory model: specifies when an event happens before another.
      X: x = 1    Y: print(x)
      X ≺ Y IF one of:
      — same thread
      — are a synchronization pair: unlock/lock on a mutex, send/recv on a channel, spawn/first event of a thread, etc.
      — X ≺ E ≺ Y
      IF X not ≺ Y and Y not ≺ X: concurrent!

  45. goroutines
      The unit of concurrent execution: goroutines.
      user-space threads; use as you would threads:
      > go handle_request(r)
      Go memory model specified in terms of goroutines:
      within a goroutine: reads + writes are ordered.
      with multiple goroutines: shared data must be synchronized…else data races!

  46. synchronization
      The synchronization primitives are:
      mutexes, conditional vars, …
      > import “sync”
      > mu.Lock()
      atomics
      > import “sync/atomic”
      > atomic.AddUint64(&myInt, 1)
      channels

  47. channels
      “Do not communicate by sharing memory; instead, share memory by communicating.”
      standard type in Go — chan. safe for concurrent use.
      mechanism for goroutines to communicate, and synchronize.
      Conceptually similar to Unix pipes:
      > ch := make(chan int)       // Initialize
      > go func() { ch <- 1 }()    // Send
      > <-ch                       // Receive, blocks until sent.


  48. want:
      worker: * get a task. * process it. * repeat.
      main: * give tasks to workers.

      // Shared variable
      var tasks []Task

      func worker() {
          for len(tasks) > 0 {
              task := dequeue(tasks)
              process(task)
          }
      }

      func main() {
          // Spawn fixed-pool of workers.
          startWorkers(3, worker)
          // Populate task queue.
          for _, t := range hellaTasks {
              tasks = append(tasks, t)
          }
      }
