Hybrid STM/HTM for Nested Transactions in Java. Keith Chapman (Purdue U), Tony Hosking (ANU/Data61, Purdue U), Eliot Moss (UMass)
Motivation STM has been around for ages • But STM is slow Commodity hardware for transactions is available now • But HTM approaches are only best effort Our goal: accelerate STM with HTM when possible
Nested Transactions Allow composition of transactions Two flavors • Closed Nested Transactions • HTM inside STM not useful; must do all the same work • Open Nested Transactions • HTM inside STM avoids most TM work – HTM well-suited!
XJ: Transactional Java XJ Language • Supports flat and closed/open/boosted nested transactions The implementation supports hybrid transactions • HTM support via Intel TSX • HTM and STM can co-exist Uses an Optimistic-reads / Pessimistic-writes protocol
TM Metadata One metadata word for every object [Diagram: fields labeled Version # and Txn ID, alongside a Txn Log]
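A minimal sketch of a per-object metadata word. The slides do not give the bit layout, so this assumes the version number and the owning transaction ID are packed into one 64-bit word. The class name `MetaWord`, the 16-bit ID width, and the convention that ID 0 means "unlocked" are all hypothetical.

```java
// Hypothetical layout, not taken from the XJ implementation:
// low 16 bits = owning transaction ID (0 means unlocked),
// remaining high bits = version number.
final class MetaWord {
    static final int ID_BITS = 16;
    static final long ID_MASK = (1L << ID_BITS) - 1;

    // Combine a version number and an owner ID into one word.
    static long pack(long version, int txnId) {
        return (version << ID_BITS) | (txnId & ID_MASK);
    }

    static long version(long word) {
        return word >>> ID_BITS;
    }

    static int owner(long word) {
        return (int) (word & ID_MASK);
    }

    static boolean isLocked(long word) {
        return owner(word) != 0;
    }
}
```

Keeping both fields in one word means a single load tells a reader whether the object is locked and which version it is seeing.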
STM Transaction Protocol (Read) [Diagram: steps labeled Check Mode, T1 Read, T1 Validate, T1 Commit, with the T1 Log]
STM Transaction Protocol (Write) [Diagram: steps labeled Check Mode, T1 Write, T1 Commit, with the T1 Log]
STM Conflict Detection [Diagram: steps labeled T1 Read, T2 Write, T1 Validate, T1 Abort, with the T1 Log and T2 Log]
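The read, write, and conflict-detection slides can be approximated in plain Java. This is a sketch of a generic optimistic-reads / pessimistic-writes STM, not the XJ run-time: the class names `TObject` and `StmTxn`, the word encoding `(version << 16) | ownerId`, in-place updates with an undo list, and the version bump on abort are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

final class TObject {
    // Assumed encoding: (version << 16) | ownerTxnId; owner 0 means unlocked.
    final AtomicLong meta = new AtomicLong(0);
    volatile int value;
}

final class StmTxn {
    private static final long ID_MASK = 0xFFFFL;
    private final int id; // must be in 1..65535 under the assumed encoding
    private final List<TObject> readSet = new ArrayList<>();
    private final List<Long> readWords = new ArrayList<>();
    private final List<TObject> writeSet = new ArrayList<>();
    private final List<Integer> undoValues = new ArrayList<>();

    StmTxn(int id) { this.id = id; }

    // Optimistic read: log the metadata word seen, validate it at commit.
    int read(TObject o) {
        long w = o.meta.get();
        long owner = w & ID_MASK;
        if (owner != 0 && owner != id) {
            abort();
            throw new IllegalStateException("locked by txn " + owner);
        }
        if (owner == 0) { readSet.add(o); readWords.add(w); }
        return o.value;
    }

    // Pessimistic write: lock the object by installing our ID, then update in place.
    void write(TObject o, int v) {
        long w = o.meta.get();
        long owner = w & ID_MASK;
        if (owner != id) {
            if (owner != 0 || !o.meta.compareAndSet(w, w | id)) {
                abort();
                throw new IllegalStateException("write conflict");
            }
            writeSet.add(o);
            undoValues.add(o.value); // remember the old value for abort
        }
        o.value = v;
    }

    // Validate every logged read; abort if any word changed under us.
    boolean commit() {
        for (int i = 0; i < readSet.size(); i++) {
            long now = readSet.get(i).meta.get();
            long seen = readWords.get(i);
            if (now != seen && now != (seen | id)) { abort(); return false; }
        }
        release();
        return true;
    }

    void abort() {
        for (int i = writeSet.size() - 1; i >= 0; i--) {
            writeSet.get(i).value = undoValues.get(i);
        }
        release();
    }

    // Unlock written objects and bump their version so stale readers fail validation.
    private void release() {
        for (TObject o : writeSet) {
            long version = o.meta.get() >>> 16;
            o.meta.set((version + 1) << 16);
        }
        writeSet.clear(); undoValues.clear(); readSet.clear(); readWords.clear();
    }
}
```

Replaying the conflict slide: T1 reads an object, T2 writes it and commits, and T1's `commit()` then returns `false` because the logged word no longer matches.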
Hybrid Transaction Protocol STM – HTM conflicts detected by lock word accesses • Explicit XABORT if locked by another transaction • HTM reads – Read the metadata word • STM writes modify the metadata word • Causes HTM to abort • HTM writes – Increment version number • Causes STM read invalidation / HTM abort
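The HTM-side barriers described above might look roughly like this. Plain Java cannot start a hardware transaction, so this sketch only models the checks: `HtmAbortException` stands in for an explicit XABORT, and `HtmBarriers`, its method names, and the `(version << 16) | ownerId` word encoding are hypothetical. Inside a real hardware transaction the load and store would be ordinary memory accesses tracked by the hardware; outside one, the load-then-store in `beforeWrite` is not atomic.

```java
import java.util.concurrent.atomic.AtomicLong;

// Stand-in for a hardware abort; real HTM rolls back in hardware instead.
final class HtmAbortException extends RuntimeException {
    HtmAbortException(String why) { super(why); }
}

final class HtmBarriers {
    private static final long ID_MASK = 0xFFFFL; // assumed: low 16 bits hold the lock owner

    // HTM read: load the metadata word. In real HTM this load puts the word in
    // the hardware read set, so a later STM write to it aborts this transaction.
    static void beforeRead(AtomicLong meta) {
        if ((meta.get() & ID_MASK) != 0) {
            throw new HtmAbortException("object locked by an STM transaction");
        }
    }

    // HTM write: abort if locked, otherwise increment the version number so
    // that STM readers of the old version fail validation.
    static void beforeWrite(AtomicLong meta) {
        long w = meta.get();
        if ((w & ID_MASK) != 0) {
            throw new HtmAbortException("object locked by an STM transaction");
        }
        meta.set(w + (1L << 16));
    }
}
```

Note that the HTM path never installs an owner ID and keeps no log; it relies on the hardware for atomicity and on the metadata word only to interact with STM transactions.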
Abstract Locking & Undo Operations
Abstract Locks for STM [Diagram: two flows, one labeled "If top level transaction" and one labeled "If nested transaction". Both acquire abstract locks and then run the open / boosted atomic method body; afterwards one flow logs the abstract locks and undo operations and the other releases the abstract locks]
Abstract Locks for Hybrid TM [Diagram: two flows. One acquires abstract locks, runs the open / boosted atomic method body, then logs the abstract locks and undo operations. The other, labeled "If top level is HTM", validates abstract locks and runs the open / boosted atomic method body]
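A small sketch of abstract locking with an undo log, in the style of transactional boosting. It is an illustration under stated assumptions, not the XJ library: `BoostedIntSet`, `BoostingTxn`, the per-key lock table, and holding locks until commit or abort are all hypothetical choices. `acquire` writes to the lock table (the STM path), while `validate` only reads it (the check an HTM top-level transaction would make).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class BoostedIntSet {
    private final Set<Integer> elements = ConcurrentHashMap.newKeySet();
    // Abstract lock table: key -> ID of the transaction holding its lock.
    private final ConcurrentHashMap<Integer, Integer> lockOwner = new ConcurrentHashMap<>();

    // STM path: take the abstract lock for this key (a write to the table).
    boolean acquire(int key, int txnId) {
        Integer prev = lockOwner.putIfAbsent(key, txnId);
        return prev == null || prev == txnId;
    }

    // HTM path: only read the table and check that nobody holds the lock.
    boolean validate(int key) { return !lockOwner.containsKey(key); }

    void release(int key, int txnId) { lockOwner.remove(key, txnId); }

    boolean rawAdd(int key) { return elements.add(key); }
    boolean rawRemove(int key) { return elements.remove(key); }
    boolean contains(int key) { return elements.contains(key); }
}

final class BoostingTxn {
    private final int id;
    private final BoostedIntSet set;
    private final Set<Integer> heldLocks = new HashSet<>();       // logged abstract locks
    private final Deque<Runnable> undoLog = new ArrayDeque<>();   // logged undo operations

    BoostingTxn(int id, BoostedIntSet set) { this.id = id; this.set = set; }

    boolean add(int key) {
        if (!set.acquire(key, id)) {
            throw new IllegalStateException("abstract lock conflict on key " + key);
        }
        heldLocks.add(key);                 // log the abstract lock
        boolean added = set.rawAdd(key);    // the open / boosted method body
        if (added) undoLog.push(() -> set.rawRemove(key)); // log the inverse operation
        return added;
    }

    void commit() { undoLog.clear(); releaseLocks(); }

    // Run the undo operations newest-first, then drop the abstract locks.
    void abort() {
        while (!undoLog.isEmpty()) undoLog.pop().run();
        releaseLocks();
    }

    private void releaseLocks() {
        for (int k : heldLocks) set.release(k, id);
        heldLocks.clear();
    }
}
```

Because the lock is per key rather than per memory word, two transactions that add different keys do not conflict even if they touch the same internal nodes of the set.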
Why Validation Works • HTM vs STM • If abstract locks conflict, they must touch some of the same physical words in the abstract locking data structure; otherwise they could not detect the conflict • HTM vs HTM • No conflict in the locking data structure because all accesses to it are reads • Any real conflicts that exist will occur on the actual data structure [Diagram: Validate abstract locks, then Open / Boosted Atomic Method Body]
STM and HTM Methods STM needs logging HTM doesn't Different actions during read/write Different actions for abstract locks HTM should fall back to STM Maintain separate HTM and STM versions of methods
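One way to picture the fallback between the two method versions is a retry wrapper. This is a hedged sketch: `HybridRunner`, the retry limit of 3, and the use of an exception (`HtmAbortException`) to signal a hardware abort are assumptions for illustration; the slides do not say how many HTM attempts are made or how the abort is surfaced.

```java
import java.util.function.Supplier;

// Stand-in for a hardware transaction abort.
final class HtmAbortException extends RuntimeException {
    HtmAbortException(String why) { super(why); }
}

final class HybridRunner {
    static final int MAX_HTM_ATTEMPTS = 3; // assumed retry budget

    // Try the HTM version of a method a few times, then fall back to the STM version.
    static <T> T run(Supplier<T> htmVersion, Supplier<T> stmVersion) {
        for (int attempt = 0; attempt < MAX_HTM_ATTEMPTS; attempt++) {
            try {
                return htmVersion.get();   // no logging on this path
            } catch (HtmAbortException e) {
                // best-effort HTM failed; retry or fall through to STM
            }
        }
        return stmVersion.get();           // logging, validation, abstract lock acquisition
    }
}
```

Keeping the two versions as separate code paths lets the HTM version skip logging entirely while the STM version stays available as a fallback when HTM keeps aborting.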
XJ System Architecture [Diagram: XJ source code → XJ Compiler → standard Java bytecode → XJ Rewriter → bytecode + run-time calls → HTM-enabled JVM, with the XJ run-time library; stages labeled compile, load, run] HTM 4-5 times faster than STM
JVM Modifications Kept to a minimum Modifications done on OpenJDK: • Native methods to begin, end, and abort a HW transaction • Made them intrinsic to the HotSpot C1/C2 compilers Had to jump through several hoops to get HTM to work with HotSpot’s optimizing compilers
Results
Synchrobench Micro-benchmarks to evaluate synchronization performance on various data structures Added ability to run multiple operations within a single transaction (group size) Included XJ versions of the benchmarks • TransactionalFriendlyTreeSet 48-way, Intel Xeon E5-2690 v3 machine with 2 sockets of 12 hyperthreaded cores
[Figure: normalized throughput vs. number of threads (1 to 48) at 5% updates, for group sizes 1, 2, 4, 8, 16, and 32; two panels, closed nested and open nested]
[Figure: committed ops and aborted txns (×10^6) vs. number of threads (1 to 48), for group sizes 1, 2, and 4; series: open HTM commits, open STM commits, closed HTM commits, closed STM commits, HTM aborts]
Conclusions STM and HTM can co-exist for nested transactions in Java • Closed nesting: similar to previous schemes • Open nesting: novel validation mechanism • Implemented in OpenJDK on Intel TSX; artifact evaluated When it works, HTM is ~4-5× faster than STM Open nesting increases the envelope of effectiveness for HTM A production VM would need deeper modification
• Hybrid STM/HTM for Nested Transactions on OpenJDK. OOPSLA’16 http://dx.doi.org/10.1145/2983990.2984029 • Extending OpenJDK to Support Hybrid STM/HTM. VMIL’16 http://dx.doi.org/10.1145/2998415.2998417