
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 27 November 2015

Lecture 8  Problems with locks  Atomic blocks and composition  Hardware transactional memory  Software transactional memory

Transactional Memory Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Our Vision for the Future
In this course, we covered …
• Best practices …
• New and clever ideas …
• And common-sense observations.
Nevertheless …
• Concurrent programming is still too hard …
• Here we explore why this is …
• And what we can do about it.

Locking

Coarse-Grained Locking
Easily made correct … But not scalable.

Fine-Grained Locking
Can be tricky …

Locks are not Robust
If a thread holding a lock is delayed … no one else can make progress.

Locking Relies on Conventions
• Relation between
  – Locks and objects
  – Exists only in programmer’s mind
Actual comment from Linux Kernel (hat tip: Bradley Kuszmaul):
    /*
     * When a locked buffer is visible to the I/O layer
     * BH_Launder is set. This means before unlocking
     * we must clear BH_Launder, mb() on alpha and then
     * clear BH_Lock, so no reader can see BH_Launder set
     * on an unlocked buffer and then risk to deadlock.
     */

Simple Problems are Hard
[Figure: a double-ended queue with enq(x) at one end and enq(y) at the other]
• No interference if ends are “far apart”
• Interference OK if queue is small
• Clean solution is a publishable result: [Michael & Scott PODC 97]

Locks Not Composable
Transfer item from one queue to another. Must be atomic: no duplicate or missing items.
[Figure: lock source, lock target, then unlock source & target]
• Methods cannot provide internal synchronization
• Objects must expose locking protocols to clients
• Clients must devise and follow protocols
• Abstraction broken!
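The point about exposed locking protocols can be made concrete with a minimal Java sketch. `LockedQueue`, its public `lock` field, and the `id`-based lock ordering are illustrative assumptions, not code from the slides; the sketch only shows that an atomic transfer forces the client to know about and acquire both queues' locks.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical queue that must expose its lock so clients can compose operations.
class LockedQueue<T> {
    private static int nextId = 0;
    final ReentrantLock lock = new ReentrantLock(); // exposed to clients: abstraction broken
    final int id;                                   // assumed global order, used to avoid deadlock
    private final ArrayDeque<T> items = new ArrayDeque<>();

    LockedQueue() { synchronized (LockedQueue.class) { id = nextId++; } }

    void enq(T x) { lock.lock(); try { items.addLast(x); } finally { lock.unlock(); } }

    T deq() { lock.lock(); try { return items.pollFirst(); } finally { lock.unlock(); } }

    int size() { lock.lock(); try { return items.size(); } finally { lock.unlock(); } }

    // Client-side protocol: lock both queues in id order, then move one item.
    static <T> boolean transfer(LockedQueue<T> src, LockedQueue<T> dst) {
        LockedQueue<T> first = src.id < dst.id ? src : dst;
        LockedQueue<T> second = first == src ? dst : src;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                T x = src.deq();              // reentrant: this thread already holds the lock
                if (x == null) return false;  // nothing to move
                dst.enq(x);
                return true;
            } finally { second.lock.unlock(); }
        } finally { first.lock.unlock(); }
    }
}

class TransferDemo {
    public static void main(String[] args) {
        LockedQueue<String> a = new LockedQueue<>();
        LockedQueue<String> b = new LockedQueue<>();
        a.enq("item");
        System.out.println(LockedQueue.transfer(a, b) + " " + a.size() + " " + b.size());
    }
}
```

Every client that wants a composite operation has to follow the same ordering convention, which is exactly the kind of unwritten protocol the slide warns about.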

Monitor Wait and Signal
[Figure: a thread sleeping (zzz) on an empty buffer]
If buffer is empty, wait for item to show up.

Wait and Signal do not Compose
[Figure: a thread sleeping (zzz…) in front of two empty buffers]
Wait for either?

The Transactional Manifesto
• Current practice inadequate
  – to meet the multicore challenge
• Research Agenda
  – Replace locking with a transactional API
  – Design languages or libraries
  – Implement efficient run-time systems

Transactions
Block of code …
• Atomic: appears to happen instantaneously
• Serializable: all appear to happen in one-at-a-time order
• Commit: takes effect (atomically)
• Abort: has no effect (typically restarted)

Atomic Blocks
    atomic {
        x.remove(3);
        y.add(3);
    }

    atomic {
        y = null;
    }
No data race.

A Double-Ended Queue
Write sequential code:
    public void LeftEnq(item x) {
        Qnode q = new Qnode(x);
        q.left = left;
        left.right = q;
        left = q;
    }
Enclose in atomic block:
    public void LeftEnq(item x) {
        atomic {
            Qnode q = new Qnode(x);
            q.left = left;
            left.right = q;
            left = q;
        }
    }

Warning
• Not always this simple
  – Conditional waits
  – Enhanced concurrency
  – Complex patterns
• But often it is…

Composition?
    public void Transfer(Queue<T> q1, Queue<T> q2) {
        atomic {
            T x = q1.deq();
            q2.enq(x);
        }
    }
Trivial or what?
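Java has no `atomic` keyword, so the following is only a sketch of the composition idea under a strong simplifying assumption: every atomic block is run under one global reentrant lock. `Atomic`, `TxQueue`, `call` and `run` are hypothetical names. This gives the nesting behaviour of atomic blocks, but none of the concurrency a real transactional memory aims for.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical stand-in for `atomic { ... }`: a single global reentrant lock.
final class Atomic {
    private static final ReentrantLock GLOBAL = new ReentrantLock();

    // Runs body atomically and returns its result; nested calls are allowed.
    static <R> R call(Supplier<R> body) {
        GLOBAL.lock();
        try { return body.get(); } finally { GLOBAL.unlock(); }
    }

    // Same as call, for bodies that return nothing.
    static void run(Runnable body) {
        GLOBAL.lock();
        try { body.run(); } finally { GLOBAL.unlock(); }
    }
}

// The queue synchronizes internally and never exposes a lock to clients.
class TxQueue<T> {
    private final ArrayDeque<T> items = new ArrayDeque<>();

    void enq(T x) { Atomic.run(() -> items.addLast(x)); }

    T deq() { return Atomic.call(() -> items.pollFirst()); }

    int size() { return Atomic.call(items::size); }

    // Composition: two atomic methods wrapped in a larger atomic block.
    static <T> void transfer(TxQueue<T> q1, TxQueue<T> q2) {
        Atomic.run(() -> {
            T x = q1.deq();
            if (x != null) q2.enq(x); // skip when the source is empty
        });
    }
}

class CompositionDemo {
    public static void main(String[] args) {
        TxQueue<String> q1 = new TxQueue<>();
        TxQueue<String> q2 = new TxQueue<>();
        q1.enq("a");
        TxQueue.transfer(q1, q2);
        System.out.println(q1.size() + " " + q2.size());
    }
}
```

Unlike the lock-based transfer, the client here needs no knowledge of how the queues synchronize; it only wraps the two calls in a larger atomic block.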

Conditional Waiting
    public T LeftDeq() {
        atomic {
            if (left == null)
                retry;
            …
        }
    }
Roll back transaction and restart when something changes.
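One way to picture `retry` is the rough emulation below, again assuming a single global lock rather than a real transactional memory. `BlockingAtomic`, `RetryException` and `WaitingQueue` are hypothetical names. The sketch also assumes that a body calls `retry()` before making any updates and that blocks are not nested, so there is nothing to roll back.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Thrown by retry() to abandon the current attempt.
final class RetryException extends RuntimeException {}

// Hypothetical emulation of atomic blocks with `retry` on one global lock.
final class BlockingAtomic {
    private static final ReentrantLock GLOBAL = new ReentrantLock();
    private static final Condition CHANGED = GLOBAL.newCondition();

    static void retry() { throw new RetryException(); }

    static <R> R call(Supplier<R> body) {
        GLOBAL.lock();
        try {
            while (true) {
                try {
                    R result = body.get();
                    CHANGED.signalAll();            // "something changed": wake waiting blocks
                    return result;
                } catch (RetryException e) {
                    // Assumption: body made no updates before retrying, so no rollback is needed.
                    CHANGED.awaitUninterruptibly(); // releases the lock while waiting
                }
            }
        } finally { GLOBAL.unlock(); }
    }
}

class WaitingQueue<T> {
    private final ArrayDeque<T> items = new ArrayDeque<>();

    void enq(T x) {
        BlockingAtomic.call(() -> { items.addLast(x); return null; });
    }

    // Blocks until an item is available, in the style of LeftDeq above.
    T deq() {
        return BlockingAtomic.call(() -> {
            if (items.isEmpty()) BlockingAtomic.retry();
            return items.pollFirst();
        });
    }
}

class RetryDemo {
    public static void main(String[] args) throws InterruptedException {
        WaitingQueue<Integer> q = new WaitingQueue<>();
        Thread consumer = new Thread(() -> System.out.println(q.deq()));
        consumer.start();
        q.enq(42);       // wakes the consumer if it is already waiting
        consumer.join();
    }
}
```

A real implementation would re-run the block only when a location it read has changed; waking every waiter on every commit is a deliberate simplification.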

Composable Conditional Waiting
    atomic {
        x = q1.deq();
    } orElse {
        x = q2.deq();
    }
• Run 1st method. If it retries …
• Run 2nd method. If it retries …
• Entire statement retries

Hardware Transactional Memory
• Exploit cache coherence
• Already almost does it
  – Invalidation
  – Consistency checking
• Speculative execution
  – Branch prediction = optimistic synch!

HW Transactional Memory
[Figure: caches connected by an interconnect to memory; an active transaction reads a location and its cache line is marked transactional (T)]

Transactional Memory
[Figure: two active transactions read; both cache lines are marked T]

Transactional Memory
[Figure: one transaction has committed, the other is still active]

Transactional Memory
[Figure: one transaction committed, one active; a write marks a cache line dirty (D) while another line is marked T]

Rewind
[Figure: a write conflicts with an active transaction, which is aborted and rewinds]

Transaction Commit
• At commit point
  – If no cache conflicts, we win.
• Mark transactional entries
  – Read-only: valid
  – Modified: dirty (eventually written back)
• That’s all, folks!
  – Except for a few details …

Not all Skittles and Beer
• Limits to
  – Transactional cache size
  – Scheduling quantum
• Transaction cannot commit if it is
  – Too big
  – Too slow
  – Actual limits platform-dependent

HTM Strengths & Weaknesses
• Ideal for lock-free data structures
• Practical proposals have limits on
  – Transaction size and length
  – Bounded HW resources
  – Guarantees vs best-effort
• On fail
  – Diagnostics essential
  – Try again in software?

Composition
• Locks don’t compose, transactions do.
• Composition necessary for Software Engineering.
• But practical HTM doesn’t really support composition!
Why we need STM

Transactional Consistency
• Memory transactions are collections of reads and writes executed atomically
• They should maintain consistency
  – External: with respect to the interleavings of other transactions (linearizability).
  – Internal: the transaction itself should operate on a consistent state.

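External Consistency
Invariant: x = 2y
[Figure: application memory, with X changing from 4 to 8 and Y changing from 2 to 4]
• Transaction A: Write x, Write y
• Transaction B: Read x, Read y, Compute z = 1/(x-y) = 1/2
If B reads x = 4 before A's writes and y = 4 after them, it sees x - y = 0 and divides by zero.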

A Simple Lock-Based STM
• STMs come in different forms
  – Lock-based
  – Lock-free
• Here: a simple lock-based STM
• Let's start by guaranteeing external consistency

Synchronization
• Transaction keeps
  – Read set: locations & values read
  – Write set: locations & values to be written
• Deferred update
  – Changes installed at commit
• Lazy conflict detection
  – Conflicts detected at commit

STM: Transactional Locking
[Figure: a map from application memory to an array of version #s & locks (V#)]

Reading an Object
[Figure: Mem and Locks columns, one V# per location]
Add version numbers & values to read set

To Write an Object
[Figure: Mem and Locks columns, one V# per location]
Add version numbers & new values to write set

To Commit
[Figure: locations X and Y are locked, updated, and their version numbers go from V# to V#+1]
• Acquire write locks
• Check version numbers unchanged
• Install new values
• Increment version numbers
• Unlock
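The read, write and commit steps above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the lecture's implementation: `TVar`, `Tx`, `AbortException` and `atomic` are hypothetical names, the lock and version number share one word (odd means locked), and the read set records only version numbers. As on the slides, only external consistency is targeted, so a running transaction may still observe an inconsistent snapshot before it aborts at commit.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

final class AbortException extends RuntimeException {}

// One transactional location: a value plus a versioned lock word (odd = locked).
final class TVar<T> {
    final AtomicLong meta = new AtomicLong(0);
    volatile T value;
    TVar(T initial) { value = initial; }
}

final class Tx {
    private final Map<TVar<?>, Long> readSet = new LinkedHashMap<>();    // location -> version seen
    private final Map<TVar<?>, Object> writeSet = new LinkedHashMap<>(); // location -> new value

    @SuppressWarnings("unchecked")
    <T> T read(TVar<T> v) {
        if (writeSet.containsKey(v)) return (T) writeSet.get(v); // read our own pending write
        long before = v.meta.get();
        T val = v.value;
        long after = v.meta.get();
        if ((before & 1) != 0 || before != after) throw new AbortException(); // locked or changed
        Long seen = readSet.putIfAbsent(v, before);
        if (seen != null && seen != before) throw new AbortException();       // changed since first read
        return val;
    }

    <T> void write(TVar<T> v, T val) { writeSet.put(v, val); } // deferred update

    @SuppressWarnings("unchecked")
    void commit() {
        Map<TVar<?>, Long> held = new LinkedHashMap<>(); // location -> version when we locked it
        boolean ok = false;
        try {
            for (TVar<?> v : writeSet.keySet()) {        // acquire write locks, never block
                long m = v.meta.get();
                if ((m & 1) != 0 || !v.meta.compareAndSet(m, m + 1)) throw new AbortException();
                held.put(v, m);
            }
            for (Map.Entry<TVar<?>, Long> e : readSet.entrySet()) { // check versions unchanged
                Long mine = held.get(e.getKey());
                long now = mine != null ? mine : e.getKey().meta.get();
                if (now != e.getValue()) throw new AbortException();
            }
            for (Map.Entry<TVar<?>, Object> e : writeSet.entrySet()) // install new values
                ((TVar<Object>) e.getKey()).value = e.getValue();
            ok = true;
        } finally {
            // Commit: next even version (unlocks). Abort: restore the old version (unlocks).
            for (Map.Entry<TVar<?>, Long> e : held.entrySet())
                e.getKey().meta.set(e.getValue() + (ok ? 2 : 0));
        }
    }

    // Runs body as a transaction, restarting from scratch after each abort.
    static <R> R atomic(Function<Tx, R> body) {
        while (true) {
            Tx tx = new Tx();
            try {
                R result = body.apply(tx);
                tx.commit();
                return result;
            } catch (AbortException e) { /* conflict detected: retry */ }
        }
    }
}

class StmDemo {
    public static void main(String[] args) {
        TVar<Integer> x = new TVar<>(4);
        TVar<Integer> y = new TVar<>(2);
        Tx.atomic(tx -> { tx.write(x, 8); tx.write(y, 4); return null; }); // keeps x = 2y
        System.out.println(Tx.atomic(tx -> tx.read(x) - tx.read(y)));
    }
}
```

Because locks are taken with a non-blocking compare-and-set and a failed attempt aborts, two committing transactions cannot deadlock, though they can force each other to retry.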

Encounter Order Locking (Undo Log)
[Figure: Mem and Locks columns; each lock word holds a version number (V#) and a lock bit (0 or 1), shown for locations X and Y at each step]
1. To Read: load lock + location
2. Check unlocked, add to Read-Set
3. To Write: lock location, store value
4. Add old value to undo-set
5. Validate read-set v#’s unchanged
6. Release each lock with v#+1
Quick read of values freshly written by the reading transaction
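The six steps can be sketched in Java as below. This is a rough illustration under assumptions, not the lecture's code: `Cell`, `EagerTx` and `EagerAbort` are hypothetical names, and the lock bit and version share one word (odd means locked). One detail goes beyond the slide: on abort the sketch also advances the version number, so a concurrent reader that saw a value later undone cannot pass its version check.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

final class EagerAbort extends RuntimeException {}

// One location: a value plus a versioned lock word (odd = locked).
final class Cell<T> {
    final AtomicLong meta = new AtomicLong(0);
    volatile T value;
    Cell(T initial) { value = initial; }
}

final class EagerTx {
    private final Map<Cell<?>, Long> readSet = new HashMap<>(); // cell -> version seen
    private final Map<Cell<?>, Long> owned = new HashMap<>();   // cell -> version when locked
    private final Deque<Runnable> undoLog = new ArrayDeque<>();

    <T> T read(Cell<T> c) {
        if (owned.containsKey(c)) return c.value;  // quick read of our own fresh write
        long before = c.meta.get();                // 1. load lock + location
        T val = c.value;
        if ((before & 1) != 0 || c.meta.get() != before) throw new EagerAbort(); // 2. check unlocked
        Long seen = readSet.putIfAbsent(c, before);
        if (seen != null && seen != before) throw new EagerAbort();
        return val;
    }

    <T> void write(Cell<T> c, T val) {
        if (!owned.containsKey(c)) {               // 3. lock the location on first write
            long m = c.meta.get();
            if ((m & 1) != 0 || !c.meta.compareAndSet(m, m + 1)) throw new EagerAbort();
            owned.put(c, m);
        }
        T old = c.value;
        undoLog.push(() -> c.value = old);         // 4. add old value to the undo log
        c.value = val;                             //    store the new value in place
    }

    void commit() {
        for (Map.Entry<Cell<?>, Long> e : readSet.entrySet()) { // 5. validate read set
            Long mine = owned.get(e.getKey());
            long current = mine != null ? mine : e.getKey().meta.get();
            if (current != e.getValue()) throw new EagerAbort();
        }
        release();                                 // 6. release each lock with the next version
    }

    void rollback() {
        while (!undoLog.isEmpty()) undoLog.pop().run(); // undo newest writes first
        release(); // assumption: also advance the version so readers of undone values abort
    }

    private void release() {
        for (Map.Entry<Cell<?>, Long> e : owned.entrySet())
            e.getKey().meta.set(e.getValue() + 2); // next even number: unlocked, new version
        owned.clear();
    }

    // Runs body as a transaction; only EagerAbort is handled in this sketch.
    static <R> R atomic(Function<EagerTx, R> body) {
        while (true) {
            EagerTx tx = new EagerTx();
            try {
                R result = body.apply(tx);
                tx.commit();
                return result;
            } catch (EagerAbort e) {
                tx.rollback();
            }
        }
    }
}

class EagerDemo {
    public static void main(String[] args) {
        Cell<Integer> a = new Cell<>(10);
        Cell<Integer> b = new Cell<>(0);
        EagerTx.atomic(tx -> { tx.write(a, tx.read(a) - 3); tx.write(b, tx.read(b) + 3); return null; });
        System.out.println(a.value + " " + b.value);
    }
}
```

Compared with deferred update, writes here are visible in memory immediately, which is why the undo log is needed and why readers must check the lock word both before and after loading a value.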
