NUMA-aware Reader-Writer Locks Tom Herold, Marco Lamina 04.02.2015 NUMA Seminar
Agenda
1. Recap: Locking
2. Locks in NUMA Systems
3. NUMA-aware RW Locks
4. Implementations
5. Hands On
Why Locking?
■ Parallel tasks access shared resources
■ Locks: Synchronization mechanism in concurrent environments
■ Locks ensure mutual exclusion for critical sections
■ Preventing race conditions
NUMA-aware Reader-Writer Locks Tom Herold, Marco Lamina 28.01.2015 Chart 3
Example: Race Condition
■ Ideal execution leads to the correct result
■ No synchronization may lead to a wrong result
■ The critical section requires mutual exclusion
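The mutual-exclusion case can be sketched in a few lines of Python. This is only an illustration and not taken from the slides: the function name `count_with_lock` and its parameters are made up for this example. Several threads increment a shared counter, and a mutex guards the read-modify-write step so that no update is lost.

```python
import threading

def count_with_lock(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads under a mutex."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            with lock:          # critical section: read-modify-write
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Without the `with lock:` line, two threads could read the same old value and one increment might be lost, which is the race condition from the example.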
Locking Basics
Starvation: A thread never gets to execute its critical section
Fairness: Every waiting thread eventually gets the lock; with an unfair lock, some threads acquire it more often than others
Deadlock: Two threads wait for each other to release a lock
Livelock: Two threads activate each other in an infinite loop without making progress
Reader-Writer Locks
■ Mutual exclusion can be too much!
■ Allow concurrent read access
■ Require exclusive access for write operations
[Figure: read and write operations plotted as concurrent operations over time]
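A textbook reader-writer lock that follows these two rules could look like the sketch below. It is not NUMA-aware and makes no fairness guarantees. The class name `SimpleRWLock` and its method names are assumptions chosen for this illustration.

```python
import threading

class SimpleRWLock:
    """Textbook reader-writer lock: many readers or one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0        # number of active readers
        self._writer = False     # True while a writer is active

    def acquire_read(self):
        with self._cond:
            while self._writer:              # readers only wait for writers
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()      # wake a waiting writer

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()            # writers need exclusive access
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()          # wake readers and writers
```

All threads share one condition variable here, so every acquire and release touches the same memory. That central hot spot is what the NUMA-aware designs later in the talk try to avoid.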
Locks in NUMA Systems
Locks on NUMA Systems
■ Intra-node traffic is cheap
■ Communication on the interconnect is expensive!
■ Where to put the lock?
■ Problem: Cache coherency protocols lead to lock migrations
Cohort Lock Properties
Cohort Detection: The lock owner can determine whether there are additional threads waiting to acquire the lock
Thread Obliviousness: The lock can be acquired by one thread and released by any other thread
Avoiding Lock Migrations: Cohort Locking
■ Two locks:
□ G → global (thread-oblivious)
□ L → node-local
    acquireCohortLock() {
        L.acquire()
        if (!L.ownsG)                        // G was not handed over by the previous local owner
            G.acquire()
    }
    releaseCohortLock() {
        L.ownsG = L.hasWaitingThreads()      // keep G on this node while local threads wait
        if (!L.ownsG)
            G.release()
        L.release()
    }
[Figure: one global lock G and two node-local locks L]
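The two-level scheme can be sketched in Python as follows. This illustrates the logic only, not an implementation from the talk. The class `CohortLock`, the explicit `node` argument, and the waiter counters are assumptions for this example, and a real implementation would derive the node from the CPU the thread runs on. Python's `threading.Lock` may be released by a thread other than the one that acquired it, so it is thread-oblivious and can serve as G.

```python
import threading

class CohortLock:
    """Cohort lock sketch: one global lock G plus one local lock L per node."""

    def __init__(self, num_nodes):
        self._global = threading.Lock()                              # G
        self._local = [threading.Lock() for _ in range(num_nodes)]   # L per node
        self._meta = threading.Lock()              # protects the waiter counters
        self._waiting = [0] * num_nodes            # cohort detection
        self._owns_global = [False] * num_nodes    # touched only while holding L

    def acquire(self, node):
        with self._meta:
            self._waiting[node] += 1
        self._local[node].acquire()
        with self._meta:
            self._waiting[node] -= 1
        if not self._owns_global[node]:    # G was not handed over locally
            self._global.acquire()
            self._owns_global[node] = True

    def release(self, node):
        with self._meta:
            has_waiters = self._waiting[node] > 0
        if not has_waiters:                # nobody local is waiting: give up G
            self._owns_global[node] = False
            self._global.release()
        self._local[node].release()        # otherwise G stays on this node
```

As long as local threads keep arriving, G never leaves the node, so practical cohort locks usually bound the number of consecutive local handoffs. That bound is omitted here to keep the sketch short.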
Cohort Locks Example
[Figure: four NUMA nodes (Node 1 to Node 4), each with a node-local lock (L1 to L4), threads 1 to 7, and one global lock G]
Definition: Cohort Locks
Cohort Locks are a technique to compose NUMA-aware mutex locks from NUMA-oblivious mutex locks.
NUMA-aware Reader-Writer Locks
Lock Design Goal
■ Reduce lock migration frequency to achieve better node-local locality by:
□ Batching read and write operations
□ Increasing local cache hits
□ Trading the "fairness" principle for maximized read concurrency
Lock Design Goal II
[Figure not reproduced] Source: [1]
NUMA?
Classic NUMA
■ Placement of memory relative to thread location
Cohort Locks
■ About shared caches and their coherence state
■ Reduce write invalidations, coherence misses, and remote cache accesses
A RW-Lock Instance
[Figure: a NUMA RW lock consists of read indicators and a cohort lock; the cohort lock consists of a global lock and local locks]
■ One Read Indicator Counter per NUMA Node
■ One global lock
■ One local lock per NUMA Core
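A read indicator with one counter per NUMA node could be sketched as below. The class name `ReadIndicator`, the explicit `node` argument, and the per-node guard locks are assumptions for this illustration. A native implementation would more likely use one atomic counter per node, each on its own cache line.

```python
import threading

class ReadIndicator:
    """Per-node reader counters; readers touch only their own node's counter."""

    def __init__(self, num_nodes):
        self._counts = [0] * num_nodes
        self._locks = [threading.Lock() for _ in range(num_nodes)]

    def arrive(self, node):
        with self._locks[node]:     # node-local update only
            self._counts[node] += 1

    def depart(self, node):
        with self._locks[node]:
            self._counts[node] -= 1

    def is_empty(self):
        # A writer scans every node's counter; this is the only cross-node access.
        return all(count == 0 for count in self._counts)
```

Splitting the counter makes reader arrivals and departures cheap and node-local. The price is a more expensive `is_empty` check for writers.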
Benefit from Lock Generalization
■ The NUMA-aware lock is oblivious of the underlying read indicator and mutex lock implementation
□ Backoff Locks
□ (Partitioned) Ticket Locks
□ MCS Locks
□ Simple Spin Locks
■ Properties of the chosen locks influence and tune the NUMA-aware RW lock
□ Abortability
□ Locality
Performance of Different Implementations
[Figure not reproduced] Source: [4]
Lock Preferences - Neutral
    reader:
        CohortLock.acquire()
        ReadIndr.arrive()
        CohortLock.release()
        <read-critical-section>              // reader concurrency
        ReadIndr.depart()
    writer:
        CohortLock.acquire()
        while NOT(ReadIndr.isEmpty())        // wait for all readers to finish
            Pause
        <write-critical-section>
        CohortLock.release()
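The neutral-preference steps translate fairly directly into Python. In this sketch a plain `threading.Lock` stands in for the cohort lock and a single counter stands in for the read indicator. Both are simplifying assumptions, and the class name `NeutralRWLock` is made up for this example.

```python
import threading
import time

class NeutralRWLock:
    """Neutral-preference RW lock following the reader and writer steps."""

    def __init__(self):
        self._cohort = threading.Lock()      # stand-in for CohortLock (assumption)
        self._count_lock = threading.Lock()  # protects the read indicator
        self._readers = 0                    # single-counter read indicator

    def acquire_read(self):
        self._cohort.acquire()
        with self._count_lock:               # ReadIndr.arrive()
            self._readers += 1
        self._cohort.release()               # the lock is not held while reading

    def release_read(self):
        with self._count_lock:               # ReadIndr.depart()
            self._readers -= 1

    def acquire_write(self):
        self._cohort.acquire()               # blocks new readers and other writers
        while self._readers != 0:            # wait for all readers to finish
            time.sleep(0)                    # Pause: yield to other threads

    def release_write(self):
        self._cohort.release()
```

Readers and writers queue on the same lock, so neither side is preferred. The drawback is that every reader has to pass through that lock before it can start reading.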
Lock Preferences - Neutral
Problems with Neutral Preference
■ Reader-reader concurrency is restricted because every reader must acquire the cohort lock
■ High contention for the central node under pressure → high interconnect bandwidth usage
Lock Preferences - Reader
Assumptions
■ Most RW locks are read-dominated
■ Better throughput by batching read operations
■ Readers bypass waiting writers
Benefits
■ Increased scalability
■ High reader-reader concurrency
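The slides give no pseudocode for the reader-preference variant, so the following is only one possible sketch under stated assumptions. Readers never take the writer lock: they announce themselves first and back off only if a writer is already active. Writers step aside whenever readers are present. The class name `ReaderPreferenceRWLock` is made up, a plain `threading.Lock` stands in for the cohort lock, and a single counter stands in for the read indicator.

```python
import threading
import time

class ReaderPreferenceRWLock:
    """Reader-preference RW lock sketch; writers may starve under read load."""

    def __init__(self):
        self._writer_lock = threading.Lock()  # stand-in for the cohort lock
        self._count_lock = threading.Lock()   # protects the read indicator
        self._readers = 0

    def acquire_read(self):
        while True:
            with self._count_lock:
                self._readers += 1            # arrive first, then check for a writer
            if not self._writer_lock.locked():
                return                        # no active writer: start reading
            with self._count_lock:
                self._readers -= 1            # back off while a writer is active
            while self._writer_lock.locked():
                time.sleep(0)

    def release_read(self):
        with self._count_lock:
            self._readers -= 1

    def acquire_write(self):
        while True:
            self._writer_lock.acquire()
            if self._readers == 0:
                return                        # keep the lock for the write section
            self._writer_lock.release()       # readers present: step aside
            while self._readers != 0:
                time.sleep(0)

    def release_write(self):
        self._writer_lock.release()
```

Because readers only check the writer lock and never acquire it, they do not serialize against each other. The trade-off is fairness: a steady stream of readers can keep a writer waiting indefinitely.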