Thread-Modular Reasoning for Lock-Free Data Structures, Roland Meyer

Thread-Modular Reasoning for Lock-Free Data Structures Roland Meyer based on joint work with Lukáš Holík, Tomáš Vojnar, and Sebastian Wolff.

Lock-Free Data Structures Key Take Aways: • efficient but complex • correctness = linearizability • checking linearizability reduces to reachability http://www.braunschweig-fotograf.de/mein-braunschweig/

Concept • avoid locks ➡ critical section cannot exist • single commands are atomic ➡ compare-and-swap (CAS)
    CAS(src, cmp, dst) := atomic {
        if (src != cmp) return false;
        src = dst;
        return true;
    }
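The CAS semantics above can be modelled in a few lines of Python. This is only a sketch: the class name `AtomicRef` and its methods are assumptions made for illustration, and a `threading.Lock` stands in for the hardware atomicity that a real CAS instruction provides.

```python
import threading

class AtomicRef:
    """Hypothetical shared cell; compare_and_swap mirrors CAS(src, cmp, dst)."""

    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()  # stands in for hardware atomicity

    def load(self):
        return self._value

    def compare_and_swap(self, cmp, dst):
        with self._lock:                # models the atomic { ... } block
            if self._value is not cmp:  # pointer comparison: src != cmp
                return False
            self._value = dst           # src = dst
            return True

a, b, c = object(), object(), object()
cell = AtomicRef(a)
first = cell.compare_and_swap(a, b)   # succeeds: the cell still holds a
second = cell.compare_and_swap(a, c)  # fails: the cell now holds b
```

The second call fails because the expected value is stale, which is exactly the situation a lock-free algorithm answers with a retry.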

Example: Treiber’s Stack
    push(val):
        node = new Node(val);
        while (true) {
            top = ToS;
            node.next = top;
            if (CAS(ToS, top, node))
                return;
        }

    pop():
        while (true) {
            top = ToS;
            if (top == NULL)
                return EMPTY;
            next = top.next;
            if (CAS(ToS, top, next))
                return top.data;
        }
[Figure: stack with pointers ToS, top, node; thread marker 1]

Example: Treiber’s Stack [Animation frame, same push/pop code. Figure: stack with pointers ToS, top, node, top, next; thread marker 1]

Example: Treiber’s Stack [Animation frame, same push/pop code. Figure: stack with pointers ToS, top, next for thread 1 and top 2, next 2 for thread 2; thread markers 1 and 2]
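A runnable rendering of the push/pop code from these slides, as a sketch only. The class names and the lock-based `_cas_tos` helper are assumptions; Python has no hardware CAS, so a lock models the single atomic step. Python is garbage collected, so this corresponds to the GC setting and not to explicit memory management.

```python
import threading

EMPTY = object()  # sentinel for pop() on an empty stack

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class TreiberStack:
    def __init__(self):
        self._tos = None               # ToS
        self._lock = threading.Lock()  # only used to model the atomic CAS

    def _cas_tos(self, cmp, dst):
        with self._lock:
            if self._tos is not cmp:
                return False
            self._tos = dst
            return True

    def push(self, val):
        node = Node(val)
        while True:
            top = self._tos
            node.next = top
            if self._cas_tos(top, node):
                return

    def pop(self):
        while True:
            top = self._tos
            if top is None:
                return EMPTY
            nxt = top.next
            if self._cas_tos(top, nxt):
                return top.data

stack = TreiberStack()

def worker(base):
    for k in range(100):
        stack.push(base * 100 + k)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

popped = []
while (value := stack.pop()) is not EMPTY:
    popped.append(value)
```

Four threads push 100 distinct values each; every value is popped exactly once afterwards, regardless of how the pushes interleave.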

Correctness and Concurrency • pre/post conditions meaningless ➡ other correctness criteria required • linearizability ➡ every concurrent run must coincide with a sequential run ➡ most common for lock-free data structures ➡ illusion of sequentiality [Filipović et al. ESOP’09]: linearizable ⟺ sequential and concurrent implementation are observationally equivalent

Checking Linearizability • check sequentiality illusion ➡ sufficient: sequence of linearization points is valid [Abdulla et al. TACAS’13] (intuitively: linearization point = change of data structure takes effect)
    $\mathit{concurrent}(DS) \models \mathit{sequential}(DS)$
    $\Longleftarrow\ \mathit{linp}(DS) \subseteq \mathit{sequential}(DS)$
    $\Longleftrightarrow\ \mathit{linp}(DS) \cap \overline{\mathit{sequential}(DS)} = \emptyset$
    $\Longleftrightarrow\ \mathit{linp}(DS) \cap \mathit{observer}(DS) = \emptyset$
➡ checking linearizability is a reachability problem
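To make the reduction concrete, here is a minimal sketch of a check on a sequence of linearization points against a sequential stack. It is a simplification: the talk uses observer automata, whereas this hypothetical function `is_valid_stack_history` simply replays the events on a Python list, and `None` models EMPTY. Returning `False` corresponds to the observer reaching its bad state.

```python
def is_valid_stack_history(events):
    """True if the linearization-point sequence is a run of a sequential stack.

    events: list of ("push", value) or ("pop", value); value None models EMPTY.
    """
    stack = []
    for op, value in events:
        if op == "push":
            stack.append(value)
        elif op == "pop":
            expected = stack.pop() if stack else None
            if value != expected:
                return False  # bad state reachable: not a sequential run
        else:
            raise ValueError(f"unknown operation: {op}")
    return True

good = [("push", 1), ("push", 2), ("pop", 2), ("pop", 1), ("pop", None)]
bad = [("push", 1), ("push", 2), ("pop", 1)]  # pops 1 while 2 is on top
```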

Overview 1. thread-modular reasoning 2. ownership 3. summaries

Thread-Modular Reasoning [Qadeer, Flanagan SPIN’03] Key Take Aways: • compute reachability • interference is key to scalability

Concept • view abstraction ➡ split states into set of views ➡ views capture perception of 1 thread (abstract from correlation) • state exploration ➡ fixed-point computation: X = X ∪ sequential(X) ∪ interference(X)
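The loss of correlation in the view abstraction can be seen on a toy encoding. The representation below is an assumption made for illustration: a state is a pair of a shared part and a tuple of per-thread local parts, and a view pairs the shared part with one local part.

```python
from itertools import product

def views(state):
    """Split one state into the views of its threads."""
    shared, locals_ = state
    return {(shared, local) for local in locals_}

def abstract(states):
    result = set()
    for state in states:
        result |= views(state)
    return result

def concretize(view_set, threads):
    """All states with `threads` threads whose views all lie in view_set."""
    result = set()
    for shared in {s for s, _ in view_set}:
        locals_ = [l for s, l in view_set if s == shared]
        for combo in product(locals_, repeat=threads):
            result.add((shared, combo))
    return result

states = {("ToS=a", ("pc1", "pc2")), ("ToS=a", ("pc3", "pc3"))}
view_set = abstract(states)
recombined = concretize(view_set, threads=2)
```

Two concrete states yield three views, and recombining the views yields nine states: the abstraction over-approximates because it forgets which local parts occurred together.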

Example: View Abstraction X = X ∪ sequential(X) ∪ interference(X) [Figure: two views of the list at ToS, one per thread, each at CAS(ToS, top, next) with its own top and next pointers (threads 1 and 2)] Note: both views are equal.

Example: Sequential Step X = X ∪ sequential(X) ∪ interference(X) [Figure: view of thread 1 at CAS(ToS, top, next) with pointers ToS, top 1, next 1] No concurrent behavior.

Example: Interference Step X = X ∪ sequential(X) ∪ interference(X) [Figure: view of thread 1 (top 1, next 1) and view of thread 2 (top 2, next 2), each at CAS(ToS, top, next)] 1. combine

Example: Interference Step X = X ∪ sequential(X) ∪ interference(X) [Figure: combined state over ToS with top 1, next 1, top 2, next 2; both threads at CAS(ToS, top, next)] 1. combine 2. step 3. project
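The fixed point with its combine, step, and project phases can be sketched on a toy program. Everything here is an assumption chosen to keep the state space finite: the shared state is a counter modulo `MOD` instead of a heap, each thread runs one copy-and-check increment, and matching is plain equality of the shared part.

```python
MOD = 3  # keeps the toy state space finite

def step(shared, local):
    """Successors of one thread step; local = (pc, copy)."""
    pc, copy = local
    if pc == 0:                        # copy = shared
        return [(shared, (1, shared))]
    if pc == 1:                        # CAS(shared, copy, copy + 1)
        if shared == copy:
            return [((shared + 1) % MOD, (2, copy))]
        return [(shared, (0, copy))]   # CAS failed: retry
    return []                          # pc == 2: done

def sequential(X):
    return {succ for (s, l) in X for succ in step(s, l)}

def interference(X):
    result = set()
    for (s1, l1) in X:                  # victim view
        for (s2, l2) in X:              # interfering view
            if s1 != s2:                # matching: cannot be combined
                continue
            # 1. combine (s1, l1, l2), 2. step the interferer,
            # 3. project the result back onto the victim
            for (s_new, _) in step(s2, l2):
                result.add((s_new, l1))
    return result

def fixed_point(init):
    X = set(init)
    while True:
        new = X | sequential(X) | interference(X)
        if new == X:
            return X
        X = new

reach = fixed_point({(0, (0, 0))})
```

The view `(1, (1, 0))` only arises through interference: another thread incremented the counter while this thread still holds the stale copy 0, so its CAS fails and it retries.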

Challenges with Interference • number of possible combinations is enormous ➡ not all combinations are reasonable • need pruning to make the approach practical ➡ precision ➡ performance • pruning must be sound

Pruning Interferences two types • matching ➡ Is it possible to combine at all? Skip if not. • correlation ➡ Which nodes should coincide?

Matching: Complication • matching gets harder due to finite abstraction • we use reachability predicates (shape analysis): • 0-step: = • 1-step: ↦ • n-step: ⤏ • unreach: ⋈ [Figure: list with pointers ToS and node, annotated with ToS ⤏ NULL and node ↦ ToS]

Matching: Example Subgraph isomorphism: NP-complete! [Figure: two views over the logical stack content, one with pointers ToS, top, next and one with pointers ToS, node]

Correlation: Example [Figure: a view with ToS and node 1 combined with a view with ToS, top 2, next 2; it is open (??) which nodes of the two views coincide] Exponentially many!

Practicality is about Interference • interference ➡ quadratic in size of state space • matching ➡ subgraph isomorphism (NP) • correlation ➡ exponential • callouts: poor scalability; fight imprecision (false-positives)

Ownership Key Take Aways: • ownership saves the day • even under explicit memory management

Concept partition allocated heap into • owned ➡ exclusive access for a single thread ➡ granted upon allocation • shared ➡ accessible by every thread ➡ by publishing (e.g. making accessible via shared variables)

Ownership in Thread-Modular Reasoning [Gotsman et al. PLDI’07] • track ownership ➡ small overhead • matching ➡ owned cells not contained • correlation ➡ owned cells not merged with other nodes
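A small sketch of how ownership prunes correlation. The encoding is an assumption: a view maps cell names to an "owned by this thread" flag, and a correlation candidate is a pair of cells from two views that might denote the same heap cell. An owned cell is accessible to its owner only, so it cannot coincide with any cell of the other view.

```python
def correlation_candidates(view1, view2, use_ownership=True):
    """Pairs of cells that may coincide when two views are combined."""
    pairs = []
    for cell1, owned1 in view1.items():
        for cell2, owned2 in view2.items():
            if use_ownership and (owned1 or owned2):
                continue  # owned cells are not merged with other nodes
            pairs.append((cell1, cell2))
    return pairs

# Hypothetical views: thread 1 is about to push node1, thread 2 is in pop.
view1 = {"node1": True, "top1": False}
view2 = {"top2": False, "next2": False}

without_ownership = correlation_candidates(view1, view2, use_ownership=False)
with_ownership = correlation_candidates(view1, view2)
```

Half of the candidate pairs disappear in this tiny case; since correlations multiply across cells, the saving grows quickly with the size of the views.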

Ownership and Correlation [Figure: the correlation example again, with node 1 marked as owned (own); labels: ToS, node 1, top 2, next 2; open (??) which nodes coincide]

Ownership in Thread-Modular Reasoning • helps a lot with ➡ matching ➡ correlation • makes thread-modular reasoning practical ➡ prunes false-positives • Only for garbage collection (GC)! What about explicit memory management (MM)?

Problem with MM Ownership does not exist under   explicit memory management.   — folklore • almost true • indeed no exclusivity ➡ dangling pointers • we introduced weak ownership in VMCAI’16

Weak Ownership [VMCAI’16] • write exclusivity ➡ only owners may write • no read exclusivity ➡ dangling readers allowed ➡ dangling reads unsafe ➡ only owner may rely on memory contents [Figure: a cell reached by an owned pointer and by dangling pointers]

Weak Ownership in Thread-Modular Reasoning [VMCAI’16] • track dangling pointers ➡ small overhead • matching: like normal ownership • correlation ➡ cells owned by thread 1 are referenced by thread 2 only via dangling pointers • dangling write accesses may be unsafe ➡ report as bug
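The rules on the two weak-ownership slides can be paraphrased as access checks. This is a loose sketch and not the VMCAI’16 formalisation: the class `WeakOwnershipHeap`, its methods, and the returned strings are all assumptions, and a pointer is treated as dangling when it targets a cell that another thread has since (re)allocated and owns.

```python
class WeakOwnershipHeap:
    def __init__(self):
        self.owner = {}  # cell -> owning thread, or None once published

    def malloc(self, cell, thread):
        # (Re)allocation grants ownership; older pointers to cell now dangle.
        self.owner[cell] = thread

    def publish(self, cell):
        self.owner[cell] = None  # shared: accessible by every thread

    def check_write(self, cell, thread):
        owner = self.owner[cell]
        if owner in (None, thread):
            return "ok"
        return "bug: dangling write"  # write exclusivity violated

    def check_read(self, cell, thread):
        owner = self.owner[cell]
        if owner in (None, thread):
            return "ok"
        return "allowed, contents unreliable"  # no read exclusivity

heap = WeakOwnershipHeap()
heap.malloc("x", thread=1)
heap.publish("x")          # thread 2 may now obtain a pointer to x
heap.malloc("x", thread=2) # x was freed and reallocated by thread 2
read_by_1 = heap.check_read("x", thread=1)
write_by_1 = heap.check_write("x", thread=1)
write_by_2 = heap.check_write("x", thread=2)
```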

Performance Impact [VMCAI’16]
                           MM with ownership    MM without ownership    ratio
    Treiber’s stack        25.5s                944s                    1:37
                           #3175                #116776                 1:36
    Michael&Scott’s queue  11700s               false positive
                           #19742               > #69000
    callout: impractical

Accomplishments • ownership helps with matching and correlation • low overhead tracking additional info • deeming unsafe accesses as bugs reflects programming practice • performance improvements for analysis • but: not practical yet ➡ interference still computationally complex

Summaries Key Take Aways: • copy-and-check blocks • statelessness • efficient interference

Observation • lock-freedom relies on copy-and-check blocks
    push(val):
        node = new Node(val);
        while (true) {
            top = ToS;                  // (1)
            node.next = top;            // (2)
            if (CAS(ToS, top, node))    // (3)
                return;
        }
    1. create local copy of shared data
    2. make changes locally
    3. publish changes if copy up-to-date or retry otherwise
➡ updates appear atomically
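The copy-and-check pattern is independent of the stack. Below is a sketch on a shared counter: the helper `copy_and_check` and the class `AtomicCell` are hypothetical names, and a lock again models the atomicity of CAS.

```python
import threading

class AtomicCell:
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # models the atomic CAS only

    def load(self):
        return self._value

    def compare_and_swap(self, cmp, dst):
        with self._lock:
            if self._value != cmp:
                return False
            self._value = dst
            return True

def copy_and_check(cell, change):
    while True:
        copy = cell.load()                        # 1. create local copy
        updated = change(copy)                    # 2. make changes locally
        if cell.compare_and_swap(copy, updated):  # 3. publish if up-to-date
            return updated
        # otherwise: the copy was stale, retry

counter = AtomicCell(0)

def worker():
    for _ in range(250):
        copy_and_check(counter, lambda n: n + 1)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

No increment is lost: a thread whose copy became stale fails its CAS and repeats the block, so every successful update appears as one atomic step.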

Insight Threads cannot observe the local behavior of other threads.   — SAS’17 So why do interference for all intermediate steps? ➡ instead: apply updates in one shot ➡ potentially unsound: stay tuned

Example: Summary for pop
1. make atomic
    atomic {
        while (true) {
            top = ToS;
            if (top == NULL) return EMPTY;
            next = top.next;
            if (CAS(ToS, top, next)) return top.data;
        }
    }

Example: Summary for pop
2. remove noise
    atomic {
        while (true) {
            top = ToS;
            if (top == NULL) return;
            next = top.next;
            if (CAS(ToS, top, next)) return;
        }
    }

Example: Summary for pop
3. copy propagation
    atomic {
        while (true) {
            if (ToS == NULL) return;
            if (CAS(ToS, ToS, ToS.next)) return;
        }
    }

Example: Summary for pop
4. remove noise
5. rewrite CAS
    atomic {
        if (ToS == NULL) return;
        ToS = ToS.next; return;
    }

Example: Summary for pop
6. rewrite guard
    atomic {
        assume(ToS != NULL);
        ToS = ToS.next;
    }
• easy to compute ➡ similar for push • compact form beneficial for analysis (and understandability)
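The final summary can be read as a partial function on the shared state, which is what makes interference cheap: other threads are represented by one-shot updates instead of all their intermediate steps. A minimal sketch, assuming views of the form `(ToS, local)`; the function names `summary_pop` and `interfere` are hypothetical.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def summary_pop(tos):
    """atomic { assume(ToS != NULL); ToS = ToS.next; } as a successor list."""
    if tos is None:     # assume(ToS != NULL) fails: summary not enabled
        return []
    return [tos.next]   # ToS = ToS.next

def interfere(view):
    """Apply the summary to the shared part; the victim's local part is kept."""
    tos, local = view
    return {(new_tos, local) for new_tos in summary_pop(tos)}

b = Node(2)
a = Node(1, b)          # stack: a -> b -> NULL
```

Applying `interfere` to a view whose `ToS` is `a` yields the same local state with `ToS` moved to `b`, without exploring the interfering thread's local steps.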
