CPSC 213: Introduction to Computer Systems
Unit 2c: Synchronization

Readings for These Next Four Lectures
‣ Text
  • Shared Variables in Threaded Programs, Synchronizing Threads with Semaphores, Using Threads for Parallelism, Other Concurrency Issues
  • 2nd edition: 12.4-12.5, 12.6, parts of 12.7
  • 1st edition: 13.4-13.5 (no equivalent to 12.6), parts of 13.7

Synchronization
‣ We invented threads to
  • exploit parallelism: do things at the same time on different processors
  • manage asynchrony: do something else while waiting, e.g. a thread waits on a disk read while the disk controller does the I/O and then notifies it
‣ But we now have two problems
  • coordinating access to memory (variables) shared by multiple threads
  • control flow transfers among threads (wait until notified by another thread)
‣ Synchronization is the mechanism threads use to
  • ensure mutual exclusion of critical sections
  • wait for, and notify of, the occurrence of events

The Importance of Mutual Exclusion
‣ Shared data
  • a data structure that could be accessed by multiple threads (the CPUs/cores all reach memory through the shared memory bus)
  • typically, concurrent access to shared data is a bug
‣ Critical sections
  • sections of code that access shared data
‣ Race condition
  • simultaneous access to a critical section by multiple threads
  • conflicting operations on the shared data structure are arbitrarily interleaved
  • unpredictable (non-deterministic) program behaviour, usually a bug (a serious bug)
‣ Mutual exclusion
  • a mechanism implemented in software (with some special hardware support)
  • ensures critical sections are executed by one thread at a time
  • though reading and writing should be handled differently (more later)
‣ For example
  • consider the implementation of a shared stack by a linked list ...

‣ Stack implementation

    struct SE {
      struct SE* next;
    };
    struct SE* top = 0;

    void push_st (struct SE* e) {
      e->next = top;
      top = e;
    }

    struct SE* pop_st () {
      struct SE* e = top;
      top = (top) ? top->next : 0;
      return e;
    }

  • push and pop are critical sections on the shared stack
  • they run in parallel, so their operations are arbitrarily interleaved
  • sometimes this interleaving corrupts the data structure

‣ Sequential test works (push and pop stand for whichever implementation is under test, here push_st and pop_st)

    void push_driver (long int n) {
      struct SE* e;
      while (n--)
        push ((struct SE*) malloc (...));
    }

    void pop_driver (long int n) {
      struct SE* e;
      while (n--) {
        do {
          e = pop ();
        } while (!e);
        free (e);
      }
    }

    push_driver (n);
    pop_driver (n);
    assert (top == 0);

‣ Concurrent test doesn't always work

    et = uthread_create ((void* (*)(void*)) push_driver, (void*) n);
    dt = uthread_create ((void* (*)(void*)) pop_driver, (void*) n);
    uthread_join (et);
    uthread_join (dt);
    assert (top == 0);

    malloc: *** error for object 0x1022a8fa0: pointer being freed was not allocated

‣ What is wrong?
‣ The bug
  • one interleaving of push_st (thread A) and pop_st (thread B) that corrupts the stack:
    1. A: e->next = top
    2. B: e = top
    3. B: top = top->next
    4. B: return e
    5. B's caller: free e
    6. A: top = e
  • after step 6, the element A pushed links to the element freed in step 5, so a later pop returns freed memory and frees it a second time

Mutual Exclusion using locks
‣ Lock semantics
  • a lock is either held by a thread or available
  • at most one thread can hold a lock at a time
  • a thread attempting to acquire a lock that is already held is forced to wait
‣ Lock primitives
  • lock: acquire the lock, waiting if necessary
  • unlock: release the lock, allowing another thread to acquire it if it is waiting
‣ Using locks for the shared stack (a runnable sketch with a stand-in lock follows below)

    void push_cs (struct SE* e) {
      lock (&aLock);
      push_st (e);
      unlock (&aLock);
    }

    struct SE* pop_cs () {
      struct SE* e;
      lock (&aLock);
      e = pop_st ();
      unlock (&aLock);
      return e;
    }
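The lock and unlock primitives used by push_cs and pop_cs are left abstract at this point; the rest of the unit shows how to build them. As a minimal, runnable illustration of the same pattern, here is a sketch that assumes POSIX threads, with pthread_mutex_t standing in for the course's lock and pthread_create / pthread_join standing in for uthread_create / uthread_join (only aLock, push_cs, pop_cs, and the driver names come from the slides; everything else is illustrative):

    #include <pthread.h>
    #include <stdlib.h>
    #include <assert.h>

    struct SE { struct SE* next; };
    struct SE* top = 0;

    /* Stand-in for the course's lock: a POSIX mutex (illustrative only). */
    pthread_mutex_t aLock = PTHREAD_MUTEX_INITIALIZER;

    void push_cs (struct SE* e) {
      pthread_mutex_lock (&aLock);      /* lock (&aLock)   */
      e->next = top;                    /* push_st (e)     */
      top = e;
      pthread_mutex_unlock (&aLock);    /* unlock (&aLock) */
    }

    struct SE* pop_cs (void) {
      pthread_mutex_lock (&aLock);
      struct SE* e = top;               /* pop_st ()       */
      top = top ? top->next : 0;
      pthread_mutex_unlock (&aLock);
      return e;
    }

    void* push_driver (void* arg) {
      long n = (long) arg;
      while (n--)
        push_cs ((struct SE*) malloc (sizeof (struct SE)));
      return 0;
    }

    void* pop_driver (void* arg) {
      long n = (long) arg;
      struct SE* e;
      while (n--) {
        do { e = pop_cs (); } while (!e);
        free (e);
      }
      return 0;
    }

    int main (void) {
      long n = 100000;
      pthread_t et, dt;                 /* like uthread_create / uthread_join */
      pthread_create (&et, 0, push_driver, (void*) n);
      pthread_create (&dt, 0, pop_driver,  (void*) n);
      pthread_join (et, 0);
      pthread_join (dt, 0);
      assert (top == 0);                /* now holds every time */
      return 0;
    }

The only change from the failing concurrent test is that every access to top now happens while holding aLock, so the push and pop critical sections can no longer interleave, and the final assert holds.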
Implementing Simple Locks
‣ Here's a first cut
  • use a shared global variable for synchronization
  • lock loops until the variable is 0 and then sets it to 1
  • unlock sets the variable to 0

    int aLock = 0;                /* the shared lock variable: 0 = free, 1 = held */

    void lock (int* lock) {
      while (*lock == 1) {}       /* spin until the lock appears free */
      *lock = 1;                  /* then mark it held */
    }

    void unlock (int* lock) {
      *lock = 0;
    }

  • why doesn't this work?
‣ We now have a race in the lock code
  • acquiring the lock takes two instructions: one to read it free, one to set it held
  • but a read by another thread can interpose between these two
  • both threads then think they hold the lock:

    Thread A                              Thread B
    1. read *lock == 0, exit loop
                                          2. read *lock == 0, exit loop
    3. *lock = 1
    4. return with lock held
                                          5. *lock = 1, return
                                          6. return with lock held

‣ The race exists even at the machine-code level

    ld   $lock, r1
    ld   $1, r2
    loop: ld   (r1), r0           # read the lock variable
          beq  r0, free           # lock appears free
          br   loop
    free: st   r2, (r1)           # acquire lock

    Thread A                              Thread B
    ld (r1), r0
                                          ld (r1), r0      # another thread reads the lock
    st r2, (r1)
                                          st r2, (r1)      # ... and both threads store 1

Atomic Memory Exchange Instruction
‣ We need a new instruction
  • to atomically read and write a memory location
  • with no intervening access to that memory location from any other thread allowed
‣ Atomicity
  • is a general property in systems
  • where a group of operations are performed as a single, indivisible unit
‣ The atomic memory exchange
  • one type of atomic memory instruction (there are other types)
  • groups a load and a store together atomically
  • exchanges the value of a register and a memory location

    Name               Semantics                         Assembly
    atomic exchange    r[v] ← m[r[a]]; m[r[a]] ← r[v]    xchg (ra), rv

Implementing Atomic Exchange
‣ Cannot be implemented just by the CPU
  • it must synchronize across multiple CPUs accessing the same memory location at the same time
‣ Implemented by the memory bus
  • the memory bus synchronizes every CPU's access to memory
  • the two parts of the exchange (read + write) are coupled on the bus
  • the bus ensures that no other memory transaction can intervene
  • but there is a problem: atomic exchange is an expensive instruction, much slower and with higher overhead than a normal read or write

Spinlock
‣ A spinlock is
  • a lock where the waiter spins on looping memory reads until the lock is acquired
  • also called "busy waiting"
‣ Implementation using atomic exchange
  • spin on the atomic memory operation, which attempts to acquire the lock while atomically reading its old value

    ld   $lock, %r1
    ld   $1, %r0
    loop: xchg (%r1), %r0         # atomically swap r0 with the lock variable
          beq  %r0, held          # old value was 0: lock acquired
          br   loop
    held:

‣ Busy-waiting pros and cons
  • spinlocks are necessary, and okay if the spinner only waits a short time
  • but using a spinlock to wait for a long time wastes CPU cycles
‣ Spin first on normal read
  • normal reads are very fast and efficient compared to the exchange
  • use a normal read in a loop until the lock appears free
  • when the lock appears free, use the exchange to try to grab it
  • if the exchange fails, go back to the normal read
  • (a C sketch of this pattern follows below)

    ld   $lock, %r1
    loop: ld   (%r1), %r0         # spin on cheap normal reads
          beq  %r0, try           # lock appears free: try to grab it
          br   loop
    try:  ld   $1, %r0
          xchg (%r1), %r0         # atomic exchange
          beq  %r0, held          # old value was 0: lock acquired
          br   loop               # lost the race: back to reading
    held:
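The listings above are the deck's SM213 assembly. As a rough C analogue of the same spin-then-exchange pattern, here is a minimal sketch assuming a C11 compiler with <stdatomic.h>; atomic_exchange plays the role of the xchg instruction, and the spinlock_t / spinlock_lock / spinlock_unlock names are made up for this example rather than taken from the course code:

    #include <stdatomic.h>

    typedef atomic_int spinlock_t;    /* 0 = free, 1 = held */

    void spinlock_lock (spinlock_t* lock) {
      for (;;) {
        /* spin on cheap normal reads until the lock appears free */
        while (atomic_load_explicit (lock, memory_order_relaxed) == 1)
          ;                                       /* busy wait */
        /* lock appears free: try to grab it with an atomic exchange */
        if (atomic_exchange (lock, 1) == 0)
          return;                                 /* old value was 0: lock acquired */
        /* old value was 1: another thread beat us to it; spin again */
      }
    }

    void spinlock_unlock (spinlock_t* lock) {
      atomic_store (lock, 0);                     /* release: mark the lock free */
    }

With a spinlock_t aLock = 0, the lock (&aLock) and unlock (&aLock) calls in push_cs / pop_cs above could be replaced by spinlock_lock (&aLock) and spinlock_unlock (&aLock).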
Blocking Locks
‣ If a thread may wait a long time
  • it should block so that other threads can run
  • it will then unblock when it becomes runnable (lock available or event notification)
‣ Blocking locks for mutual exclusion
  • if the lock is held, the locker puts itself on the lock's waiter queue and blocks
  • when the lock is unlocked, the unlocker restarts one thread from the waiter queue
‣ Blocking locks for event notification
  • the waiting thread puts itself on a waiter queue and blocks
  • the notifying thread restarts one thread from the waiter queue (or perhaps all of them)
‣ Implementing blocking locks presents a problem
  • the lock data structure includes a waiter queue and a few other things
  • this data structure is shared by multiple threads, so the lock operations are themselves critical sections
  • mutual exclusion could be provided by blocking locks, but they aren't implemented yet
  • so we need to use spinlocks to implement blocking locks (this gets tricky; a sketch of the idea follows below)
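As a rough, deliberately incomplete sketch of that idea, here is one possible shape for a blocking lock built on the spinlock_t sketched above. The scheduler primitives (struct thread, struct thread_queue, current_thread, enqueue, dequeue, block_current_thread, make_runnable) are hypothetical stand-ins for whatever the thread library provides; none of these names come from the course code, and this is not runnable as-is:

    /* Hypothetical scheduler primitives (assumed, not course code):
       current_thread(), block_current_thread(), make_runnable(t), and a
       FIFO thread queue with enqueue()/dequeue() (dequeue returns NULL
       when the queue is empty). */

    struct blocking_lock {
      spinlock_t          spinlock;   /* guards held and waiters              */
      int                 held;       /* 0 = free, 1 = held by some thread    */
      struct thread_queue waiters;    /* threads blocked waiting for the lock */
    };

    void blocking_lock_acquire (struct blocking_lock* l) {
      spinlock_lock (&l->spinlock);             /* lock operations are critical sections */
      while (l->held) {
        /* lock is held: queue ourselves and block so other threads can run */
        enqueue (&l->waiters, current_thread ());
        spinlock_unlock (&l->spinlock);
        block_current_thread ();                /* the gap between the unlock and the
                                                   block can lose a wakeup; closing it
                                                   is the tricky part the slides allude
                                                   to, handled in later lectures */
        spinlock_lock (&l->spinlock);
      }
      l->held = 1;
      spinlock_unlock (&l->spinlock);
    }

    void blocking_lock_release (struct blocking_lock* l) {
      spinlock_lock (&l->spinlock);
      l->held = 0;
      struct thread* waiter = dequeue (&l->waiters);
      spinlock_unlock (&l->spinlock);
      if (waiter)
        make_runnable (waiter);                 /* restart one thread from the waiter queue */
    }

The spinlock is held only for a handful of instructions around the queue and flag updates, so busy-waiting on it is cheap; the potentially long wait for the lock itself is spent blocked rather than spinning.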
