Low-Overhead Software Transactional Memory with Progress Guarantees and Strong Semantics
Minjia Zhang, Jipeng Huang, Man Cao, Michael D. Bond
Do We Need Efficient STM?
Problem Solved! Blue Gene/Q
Problem Solved? HTM is limited…
Problem Solved?
Best-effort HTM: no completion guarantee [1]
Performance penalty: short transactions [2]
Language-level support for atomic blocks: STM fallback

Transaction:
atomic {
    from.balance -= amount;
    to.balance += amount;
}

[1] I. Calciu et al. Invyswell: A Hybrid Transactional Memory for Haswell’s Restricted Transactional Memory. In PACT, 2014.
[2] R. M. Yoo et al. Performance Evaluation of Intel Transactional Synchronization Extensions for High-Performance Computing. In SC, 2013.
Software Transactional Memory Is Slow
Existing STMs add high overhead [1, 2, 3]
Related challenges: scalability, progress guarantees, strong semantics

[1] C. Cascaval et al. Software Transactional Memory: Why Is It Only a Research Toy? In CACM, 2008.
[2] A. Dragojević et al. Why STM Can Be More than a Research Toy. In CACM, 2011.
[3] R. M. Yoo et al. Kicking the Tires of Software Transactional Memory: Why the Going Gets Tough. In SPAA, 2008.
Challenge: Expensive to detect conflicts

T1:
atomic {
    …
    … = o.f;
    … = p.g;
    …
    o.f = …;
    p.g = …;
    …
}

T2 (concurrently, any one of): o.f = …  |  p.g = …  |  t.k = …  |  ?

Because T2's access is unknown (?), T1's accesses inside the atomic block need instrumentation.
LarkTM Contributions
Adds very low overhead
Achieves good scalability by using a hybrid approach
Provides strong progress guarantees
Provides strong atomicity
Key Insight
Keep overhead low by minimizing instrumentation cost for non-conflicting accesses
LarkTM Design
Per-object biased reader-writer locks [1, 2]
Eager concurrency control
Piggybacking conflict detection and conflict resolution on lock transfers
• Minimal instrumentation and synchronization for both transactional and non-transactional non-conflicting accesses
• Does not release locks even if transactions commit

[1] M. D. Bond et al. Octet: Capturing and Controlling Cross-Thread Dependences Efficiently. In OOPSLA, 2013.
[2] B. Hindman and D. Grossman. Atomicity via Source-to-Source Translation. In MSPC, 2006.
Biased Locks
object o: lock state ∈ {WrEx T, RdEx T, RdSh}, field f
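The lock states above can be modeled in a few lines. This is only a rough sketch, not LarkTM's implementation: the names `Kind`, `LockState`, `read_is_fast_path`, and `write_is_fast_path` are hypothetical, threads are identified by plain strings, and "fast path" here simply means the access is already permitted by the current state, so no state change or cross-thread coordination is needed. Any extra bookkeeping a real biased lock needs is left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    WR_EX = "WrEx"  # write-exclusive: one owner thread may read and write
    RD_EX = "RdEx"  # read-exclusive: one owner thread may read
    RD_SH = "RdSh"  # read-shared: any thread may read


@dataclass(frozen=True)
class LockState:
    kind: Kind
    owner: Optional[str] = None  # hypothetical thread name; None for RdSh


def read_is_fast_path(state: LockState, thread: str) -> bool:
    """A read needs no state change if the object is read-shared
    or already owned (for reading or writing) by this thread."""
    if state.kind is Kind.RD_SH:
        return True
    return state.owner == thread


def write_is_fast_path(state: LockState, thread: str) -> bool:
    """A write needs no state change only if this thread already
    holds the object in the write-exclusive state."""
    return state.kind is Kind.WR_EX and state.owner == thread
```

In this model, T1 writing an object in state WrEx T1 stays on the fast path, while T2 writing the same object does not and would have to change the lock state first.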
Multi-thread Execution
object o: lock state WrEx T1, last txn, field f (time flows downward)
T1: transaction start (txn id: 42)
T1: o.f = 1
    update o's last txn to 42
    add o.f to the undo log
    update f to 1
T1: …
T2: o.f = 2
Problem! No synchronization on T1’s accesses to o
T2 starts coordination
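The write steps shown for T1 (record the transaction id on the object, log the old value, then update the field in place) can be sketched as follows. This is an illustrative model under stated assumptions, not the authors' code: `Obj`, `Txn`, `write`, and `abort` are hypothetical names, fields live in a dictionary, and the sketch covers only the case where the writing thread already owns the object, so no locking is shown.

```python
class Obj:
    """Hypothetical object with per-object metadata."""

    def __init__(self, **fields):
        self.fields = dict(fields)
        self.last_txn = None  # id of the last transaction that accessed the object


class Txn:
    """Hypothetical transaction using in-place updates plus an undo log."""

    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.undo_log = []  # entries: (object, field name, old value)

    def write(self, obj, field, value):
        obj.last_txn = self.txn_id                            # 1. update last txn
        self.undo_log.append((obj, field, obj.fields[field])) # 2. add to undo log
        obj.fields[field] = value                             # 3. update the field

    def abort(self):
        # Roll back in reverse order so the oldest logged value wins.
        while self.undo_log:
            obj, field, old = self.undo_log.pop()
            obj.fields[field] = old


o = Obj(f=0)
t1 = Txn(42)
t1.write(o, "f", 1)
```

After the write, `o.fields["f"]` is 1 and `o.last_txn` is 42. Calling `t1.abort()` would restore the field to its logged value.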
Coordination
T1: transaction start (txn id: 42), o.f = 1 (o's last txn is 42, f is 1)
T2: o.f = 2
    update o's lock state to Int T2
    send a request to T1
T1: reaches a safe point (before … = o.f)
Detecting conflicts: at the safe point

A Transactional Conflict
T1 is still in transaction 42 and o's last txn is 42: conflict detected at the safe point
Detecting conflicts: safe point
Resolving conflicts: contention management
Not A Transactional Conflict
object o: lock state Int T2, last txn 42, f is 1
T2: o.f = 2 sends a request to T1
T1: transaction start (txn id: 43), then reaches a safe point
Detecting conflicts at the safe point: o's last txn is 42 but T1's txn id is 43, so no conflict
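The difference between the conflict and no-conflict cases comes down to comparing the object's last txn with the responding thread's current transaction at the safe point. A minimal sketch of that check, assuming transaction ids are never reused and using a hypothetical function name and parameters:

```python
def is_transactional_conflict(obj_last_txn, responder_in_txn, responder_txn_id):
    """Hypothetical safe-point check run on behalf of the responding thread.

    The request conflicts with the responder's transaction only if the
    responder is currently inside a transaction and that same transaction
    was the last one to access the object.
    """
    return responder_in_txn and obj_last_txn == responder_txn_id


# T1 is still in transaction 42, which last accessed o: conflict.
conflict = is_transactional_conflict(42, True, 42)

# T1 has since started transaction 43; o was last accessed by 42: no conflict.
no_conflict = is_transactional_conflict(42, True, 43)
```

The check reads only metadata that already exists (the per-object last txn and the thread's own transaction id), which is why it can run when a lock transfer is requested rather than on every access.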
Coordination
T1: detects conflicts at the safe point, then sends a response
T2: waits from its request until T1's response

Strong Progress Guarantees
While T2 waits between request and response, either side may abort: T1 may abort, or T2 may abort
Starvation and livelock freedom
Strong Atomicity Semantics
Transactional vs. Transactional Conflict
T1: transaction start (txn id: 42), o.f = 1
T2: transaction start, then o.f = 2 (a transactional access) sends a request and waits
T1: detects the conflict at a safe point and responds
Resolution: abort, then retry

Transactional vs. Non-transactional Conflict
T1: transaction start (txn id: 42), o.f = 1
T2: o.f = 2 (a non-transactional access) sends a request and waits
T1: detects the conflict at a safe point and responds
Resolution: abort, then retry
Strong Atomicity Semantics
T1: transaction end
T2: o.f = 2 sends a request; T1 responds at a safe point
T1: non-transactional accesses … = o.f and o.f = …
Non-transactional accesses are handled like short transactions, with no setting up/tearing down cost
No Transactional Conflict
object o: lock state Int T2, last txn 42, f is 1
T2: transaction start (txn id: 51), then o.f = 2 sends a request and waits
T1: transaction end, then detects no conflict at the safe point and responds
T2: acquires the lock (lock state becomes WrEx T2)
T2: updates o's last txn to 51, adds o.f to its undo log, and updates f to 2
Two versions of coordination protocol
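The steps T2 takes once T1's response reports no conflict can be sketched as below. This is a single-threaded illustration under stated assumptions, not the actual protocol: the request/response exchange is taken as already done, lock states are plain strings, and `Obj`, `Txn`, and `finish_transfer_and_write` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    lock_state: str  # e.g. "WrEx T1", "Int T2"
    last_txn: int
    f: int


@dataclass
class Txn:
    txn_id: int
    undo_log: list = field(default_factory=list)


def finish_transfer_and_write(obj, requester, txn, new_value):
    """Complete a lock transfer after a no-conflict response, then write."""
    # Precondition: the requester already put the object in its Int state.
    assert obj.lock_state == f"Int {requester}"
    obj.lock_state = f"WrEx {requester}"  # acquire the lock
    obj.last_txn = txn.txn_id             # update last txn
    txn.undo_log.append(("f", obj.f))     # add the old value to the undo log
    obj.f = new_value                     # update the field in place


o = Obj(lock_state="Int T2", last_txn=42, f=1)
t2 = Txn(txn_id=51)
finish_transfer_and_write(o, "T2", t2, 2)
```

Afterwards the object is in state WrEx T2 with last txn 51 and f equal to 2, and the old value 1 sits in T2's undo log. The lock stays with T2 after this point, matching the design note that locks are not released when transactions commit.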
LarkTM-O
Adds very low overhead and scales well for low-contention cases