  1. Synchronization • Presented by Radu Teodorescu • CS533

  2. Why do we need it? • Parallel programs share data! • Consistency of shared data structures • Access serialization • Coordination between processors • Allows queueing, ordering

  3. For today • How synchronization works • Synchronization operations • Hardware primitives • Implementations

  4. How it works • Components of a synchronization event: • ACQUIRE method: acquire the right to the synchronization • WAITING algorithm: wait for the synchronization to become available • RELEASE method: allow other processes to proceed past the synchronization

  5. Waiting algorithms: busy-waiting • The process spins in a loop, repeatedly testing for a status change • PROS: low latency • CONS: ties up the processor, higher traffic (network, bus, cache)
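
A minimal C sketch of busy-waiting (not from the slides; the ready flag and function names are illustrative):

      #include <stdatomic.h>

      atomic_int ready = 0;            /* status word the waiter spins on */

      void busy_wait(void)
      {
          /* Spin: repeatedly test until another processor changes the status.
             Latency is low once the change happens, but the processor and the
             memory system stay busy for the whole wait. */
          while (atomic_load(&ready) == 0)
              ;
      }

      void signal_ready(void)
      {
          atomic_store(&ready, 1);     /* the status change the spinner tests for */
      }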

  6. Waiting algorithms: blocking • The process suspends, releases the processor, and waits to be awakened • PROS: frees the processor for other jobs • CONS: higher overhead; pays off only when the scheduling overhead is less than the expected wait time
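
For contrast, a hedged sketch of blocking with POSIX threads: the waiter suspends in pthread_cond_wait and the scheduler wakes it later (again, names are illustrative, not from the slides):

      #include <pthread.h>

      pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
      pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
      int ready = 0;

      void blocking_wait(void)
      {
          pthread_mutex_lock(&m);
          while (!ready)
              pthread_cond_wait(&cv, &m);   /* suspend and release the processor */
          pthread_mutex_unlock(&m);
      }

      void wake_waiters(void)
      {
          pthread_mutex_lock(&m);
          ready = 1;
          pthread_cond_broadcast(&cv);      /* scheduling overhead is paid here */
          pthread_mutex_unlock(&m);
      }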

  7. Synch operations • Locks: grant access to one process at a time • Barriers: no process advances beyond the barrier until all have arrived • Semaphores • Monitors • … • Built from some combination of hardware primitives and software

  8. Hardware primitives • Synchronization requires some ATOMIC operation • Some flavor of atomic read-modify-write: • Atomic exchange • Test & set • Fetch & increment • These are used to implement locks, barriers, etc.
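
In C11 these read-modify-write flavors are exposed through <stdatomic.h>; a small sketch (variable names are made up for illustration):

      #include <stdatomic.h>
      #include <stdbool.h>

      atomic_int  word    = 0;
      atomic_int  counter = 0;
      atomic_flag flag    = ATOMIC_FLAG_INIT;

      void rmw_examples(void)
      {
          int  old     = atomic_exchange(&word, 1);          /* atomic exchange */
          bool was_set = atomic_flag_test_and_set(&flag);    /* test & set */
          int  ticket  = atomic_fetch_add(&counter, 1);      /* fetch & increment */
          (void)old; (void)was_set; (void)ticket;            /* each returns the old value */
      }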

  9. Test & set • Tests a value and sets it if the test passes • Also used to implement locks:
        lock: ADD  R2, R0, #1
              T&S  R2, (R1)
              BNEZ R2, lock
     • In cache-coherent machines: lots of invalidations
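
A C11 equivalent of this test-and-set lock, sketched with atomic_flag; the busy-wait here causes the same invalidation traffic the slide points out:

      #include <stdatomic.h>

      atomic_flag lock = ATOMIC_FLAG_INIT;

      void tas_lock(void)
      {
          /* test_and_set writes 1 and returns the old value;
             keep retrying while the old value was already 1 (lock held). */
          while (atomic_flag_test_and_set(&lock))
              ;
      }

      void tas_unlock(void)
      {
          atomic_flag_clear(&lock);
      }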

  10. Test and test & set • Take advantage of locality • Test before write (to avoid an invalidation):
        lock: LD   R2, (R1)
              BNEZ R2, lock
              ADD  R2, R0, #1
              T&S  R2, (R1)
              BNEZ R2, lock
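
The same lock with the read-only test first, sketched with an atomic_bool so the flag can be loaded without writing it; here atomic_exchange(&lock, true) plays the role of test & set:

      #include <stdatomic.h>
      #include <stdbool.h>

      atomic_bool lock = false;

      void ttas_lock(void)
      {
          for (;;) {
              /* test: spin on a cached copy, generating no invalidations */
              while (atomic_load(&lock))
                  ;
              /* test & set: attempt the write only once the lock looks free */
              if (!atomic_exchange(&lock, true))
                  return;                    /* old value was false: lock acquired */
          }
      }

      void ttas_unlock(void)
      {
          atomic_store(&lock, false);
      }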

  11. Atomic exchange • Exchange a value in a register with memory: EXCH Reg, Mem • Can be used to implement a spin lock:
        lock: LD   R2, (R1)
              BNEZ R2, lock
              ADD  R2, R0, #1
              EXCH R2, (R1)
              BNEZ R2, lock
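
In C, the exchange-based spin lock would look just like the test-and-test-and-set sketch above, with atomic_exchange as the write. Exchange is also a handy way to take sole ownership of a shared value; a hedged sketch with an illustrative pending-work pointer:

      #include <stdatomic.h>
      #include <stddef.h>

      typedef struct work { int payload; } work_t;

      _Atomic(work_t *) pending = NULL;

      /* Atomically swaps the shared pointer with NULL; exactly one of any
         number of concurrent callers receives the item, the rest get NULL. */
      work_t *claim_pending(void)
      {
          return atomic_exchange(&pending, NULL);
      }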

  12. Fetch and increment • Reads a memory value and increments it atomically • Useful for barrier implementation • Can be used to count how many processes have reached the barrier
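
A sketch of a centralized sense-reversing barrier built on fetch-and-increment (P and all names are illustrative; each thread keeps its own local_sense, initialized to false):

      #include <stdatomic.h>
      #include <stdbool.h>

      #define P 4                           /* number of participating processes */

      atomic_int  count = 0;                /* how many have arrived so far */
      atomic_bool sense = false;            /* global sense, flipped each episode */

      void barrier_wait(bool *local_sense)
      {
          *local_sense = !*local_sense;                  /* this episode's sense */
          if (atomic_fetch_add(&count, 1) == P - 1) {    /* last arrival */
              atomic_store(&count, 0);                   /* reset for the next episode */
              atomic_store(&sense, *local_sense);        /* release the others */
          } else {
              while (atomic_load(&sense) != *local_sense)
                  ;                                      /* busy-wait for release */
          }
      }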

  13. Performance • How expensive is locking with test & set? • N processors waiting • No hold time
        lock: ADD  R2, R0, #1
              T&S  R2, (R1)
              BNEZ R2, lock
     • ∑_{i=1}^{N} (2i + 1) = N² + 2N
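
The closed form is just an arithmetic series (assuming the sum runs over the N waiting processors, i = 1..N):

      \sum_{i=1}^{N} (2i + 1)
        = 2 \sum_{i=1}^{N} i + \sum_{i=1}^{N} 1
        = 2 \cdot \frac{N(N+1)}{2} + N
        = N^2 + 2N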

  14. Hardware complexity • A single, atomic memory operation is hard to implement in hardware: • it complicates the coherence protocol • it must be uninterruptible • it must avoid deadlocks

  15. Reducing complexity • Maintain the atomicity requirement, but split it across two linked instructions • The second indicates whether the pair executed atomically • MIPS/SGI: Load-Linked (LL), Store-Conditional (SC)

  16. LL/SC • LL returns the value of a memory location • SC attempts to store a new value to the same location • If the pair did not execute atomically: SC returns 0 and does not update memory • If the pair executed atomically: SC returns 1 and updates memory

  17. LL/SC
        lock: LL   R2, (R1)
              BNEZ R2, lock
              ADD  R2, R0, #1
              SC   R2, (R1)
              BEQZ R2, lock
     • SC also fails if the processor context-switches between LL and SC
     • Can implement other primitives: exchange, fetch-and-add, …
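
As a sketch of building other primitives this way, here is fetch-and-add written with C11 atomic_compare_exchange_weak; on LL/SC machines compilers typically lower this to the same retry-on-failure loop shown above (the function name is illustrative):

      #include <stdatomic.h>

      int fetch_and_add(atomic_int *p, int inc)
      {
          int old = atomic_load(p);
          /* Retry until the update goes through atomically, mirroring the
             branch-back-to-LL loop; a failed attempt refreshes old from *p. */
          while (!atomic_compare_exchange_weak(p, &old, old + inc))
              ;
          return old;                       /* value before the increment */
      }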

  18. Implementations • HEP Multiprocessor • NYU Ultracomputer • IBM RP3 • Illinois Cedar

  19. HEP multiprocessor • Each word in memory has a Full/Empty bit • The bit is tested in hardware before a RD/WR • The RD/WR blocks until the test succeeds: • RD waits until full • WR waits until empty • When the test succeeds, the bit is set to the opposite value
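
A rough software analogue of one Full/Empty word, assuming a single producer and a single consumer per word (the real HEP enforces the test in hardware for any number of processes and queues the blocked process instead of spinning):

      #include <stdatomic.h>

      typedef struct {
          atomic_int full;                  /* the F/E bit: 0 = empty, 1 = full */
          int        data;                  /* the word itself */
      } fe_word_t;

      /* WR: wait until empty, deposit the value, flip the bit to full. */
      void fe_write(fe_word_t *w, int v)
      {
          while (atomic_load(&w->full))
              ;
          w->data = v;
          atomic_store(&w->full, 1);
      }

      /* RD: wait until full, take the value, flip the bit to empty. */
      int fe_read(fe_word_t *w)
      {
          while (!atomic_load(&w->full))
              ;
          int v = w->data;
          atomic_store(&w->full, 0);
          return v;
      }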

  20. HEP multiprocessor • PROS: • Very efficient for low-level dependences (compared to locks) • CONS: • Requires complex hardware: • F/E bits • Support to queue a process when the test fails • Logic to implement indivisible operations

  21. NYU ultracomputer • Implements fetch-and-add • PROS: • Can use message combining, scales well • Efficient barrier implementation • CONS: • Very complex network • Adders in each memory module

  22. Message combining • [Diagram: two requests, F&A(X, 1) and F&A(X, 3), meet at a switch and are combined into a single F&A(X, 4); with X initially 5, memory returns 5 and updates X to 9; the switch, which remembered the held increment, then hands 5 and 8 back to the two requesters, as if the operations had run one after the other]

  23. IBM RP3 • Implements fetch-and-phi, where phi can be: • Add, And, Or • Min, Max, Store • Store if zero • PRO: generality • CON: hardware complexity
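
The generality can be sketched in software with a compare-and-exchange loop; here phi = min (purely illustrative, not the RP3 hardware path; the other ops fit the same pattern):

      #include <stdatomic.h>

      /* fetch-and-phi with phi = min: atomically replaces *p with
         min(*p, v) and returns the value that was there before. */
      int fetch_and_min(atomic_int *p, int v)
      {
          int old = atomic_load(p);
          while (old > v && !atomic_compare_exchange_weak(p, &old, v))
              ;                             /* old is refreshed on failure */
          return old;
      }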

  24. Illinois Cedar • A general atomic instruction that operates on synchronization variables • A synch variable has two words: Key and Value • Synch instruction: {addr; (cond); op on key; op on value} • F/E-bit test for a read: {X; (X.key == 1)*; decrement; fetch} • Complex hardware: a special processor for each memory module
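
A hedged sketch of the example read above, packing Key and Value into one 64-bit atomic so the condition test, key op, and value fetch happen together (the packing, names, and spin-wait are assumptions for illustration, not the Cedar hardware):

      #include <stdatomic.h>
      #include <stdint.h>

      typedef _Atomic uint64_t synch_var_t;    /* high 32 bits: key, low 32 bits: value */

      #define KEY(sv)    ((uint32_t)((sv) >> 32))
      #define VAL(sv)    ((uint32_t)(sv))
      #define PACK(k, v) (((uint64_t)(k) << 32) | (uint32_t)(v))

      /* {X; (X.key == 1)*; decrement; fetch}: wait until the key is 1,
         then atomically decrement the key and fetch the value. */
      uint32_t synch_read(synch_var_t *x)
      {
          uint64_t old = atomic_load(x);
          for (;;) {
              if (KEY(old) != 1) {               /* condition fails: keep waiting */
                  old = atomic_load(x);
                  continue;
              }
              uint64_t next = PACK(KEY(old) - 1, VAL(old));
              if (atomic_compare_exchange_weak(x, &old, next))
                  return VAL(old);               /* fetch the value */
          }
      }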

  25. That’s it for today!
