Derivation and Evaluation of Concurrent Collectors
Martin T. Vechev, University of Cambridge
David F. Bacon, Perry Cheng and Dave Grove, IBM T.J. Watson Research Center
Outline
- Motivation and Benefits
- New Generalizations
- Abstract Algorithms
- Practical Algorithms
- Derivations
- Evaluation
Motivation
- Concurrent collectors:
  - Difficult to construct correctly (initial errors in the Dijkstra and Steele algorithms)
  - Difficult to understand
  - Difficult to implement
- No systematic comparisons; largely folklore
Contributions
- Generalization of existing mechanisms
- Abstract collectors based on the generalizations
  - Precise, but inefficient
- New algorithm
  - Derived from the power of the generalizations
- Experimental evaluation of 4 concurrent GCs
Benefits of Generalization
[Figure: the Steele, Dijkstra, Hybrid, and Yuasa collectors placed on two axes, completion time and memory overhead (floating garbage).]
Assumptions
- Single collector thread
- Multiple mutator threads
- Atomic write barrier
- Non-moving
Why Is It Hard?
Concurrent interleaving
[Figure sequence over time: objects A, B, C, and D with pointers R1 and R2.]
1. GC marks B
2. Mutator creates R2
3. Mutator removes R1
4. GC reclaims C and D, which are live: WRONG!
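The interleaving above can be replayed in a few lines. This is only an illustrative sketch: the heap layout (roots pointing to A and B, R1 as a pointer from A to C, R2 as a pointer from B to C, and C pointing to D) and the names `replay_lost_object` and `reachable` are assumptions, since the slides do not spell out the graph.

```python
def reachable(heap, roots):
    """Return the set of objects reachable from the roots."""
    seen, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(heap[obj])
    return seen


def replay_lost_object():
    # Hypothetical heap: each object maps to the objects it points to.
    heap = {"A": ["C"], "B": [], "C": ["D"], "D": []}  # A -> C is R1
    roots = ["B", "A"]
    marked = set()

    def scan(obj):
        # Collector step: mark obj and return its unmarked children.
        marked.add(obj)
        return [child for child in heap[obj] if child not in marked]

    worklist = list(roots)
    # Step 1: the collector marks B and scans its (empty) fields.
    worklist.remove("B")
    worklist.extend(scan("B"))
    # Step 2: the mutator creates R2, a pointer from the scanned B to C.
    heap["B"].append("C")
    # Step 3: the mutator removes R1, the only other pointer to C.
    heap["A"].remove("C")
    # Step 4: the collector finishes tracing; no write barrier is in place.
    while worklist:
        obj = worklist.pop()
        if obj not in marked:
            worklist.extend(scan(obj))

    reclaimed = sorted(set(heap) - marked)
    live = sorted(reachable(heap, roots))
    return reclaimed, live
```

In this replay the collector never sees C or D, so both end up in the reclaimed list even though they are still reachable through B.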
Wavefront
[Figure: object graph with objects A to G and pointers R1 and R2.]
Protection
- Installation: remember crossing pointer R1
- Deletion: remember R2
[Figure: two copies of the object graph (objects A to G, pointers R1 and R2), one per protection style.]
New Generalizations
- Precise wavefront: Shade
- Precise counting of cross pointers: Scanned Reference Count (S-RC)
Collector Progress (Shade)
[Figure sequence: object graph with objects A to G and pointer R2; as the collector progresses, the displayed shade advances through SHADE:0, SHADE:1, SHADE:2, SHADE:3.]
Shade Observations
- Computed by the collector
- Generalization of the tri-color abstraction
- Different granularities
- Different objects
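One way to picture shade as a generalization of the tri-color abstraction is as a per-object counter. The sketch below assumes that shade counts how many of an object's fields the collector has scanned so far, which matches the SHADE:0 to SHADE:3 progression on the slides but is not stated there explicitly. The names `Obj`, `scan_next_field`, and `tri_color` are hypothetical.

```python
class Obj:
    def __init__(self, name, num_fields):
        self.name = name
        self.fields = [None] * num_fields
        self.marked = False  # set once the collector has reached the object
        self.shade = 0       # assumed meaning: number of fields scanned so far


def scan_next_field(obj):
    """Collector step: scan one more field and return its target (or None)."""
    if obj.shade >= len(obj.fields):
        raise ValueError("object is already fully scanned")
    obj.marked = True
    target = obj.fields[obj.shade]
    obj.shade += 1
    return target


def tri_color(obj):
    """Collapse the precise shade into the classic three colors."""
    if not obj.marked:
        return "white"
    if obj.shade < len(obj.fields):
        return "grey"
    return "black"
```

Under this reading, the three colors discard how far the scan of a grey object has progressed, while the precise shade retains it.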
Scanned Reference Count (S-RC)
[Figure sequence: object graph with objects A to G; the displayed count goes S-RC:0, S-RC:1, S-RC:2, then S-RC:1 and S-RC:0 as pointers are removed (marked X), and ends at S-RC:0.]
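The counting shown above can be sketched as follows, assuming that the S-RC of an object is the number of already-scanned slots that point to it. Whether the collector itself updates the count when it scans a slot is not stated on the slides; this sketch does so to keep the invariant simple. The class `Heap` and its method names are hypothetical.

```python
from collections import defaultdict


class Heap:
    def __init__(self):
        self.slots = {}               # (obj, field) -> target name or None
        self.scanned = set()          # slots the collector has already scanned
        self.s_rc = defaultdict(int)  # target -> scanned reference count

    def collector_scan(self, obj, field):
        """Collector scans one slot; its current target gains a scanned reference."""
        slot = (obj, field)
        if slot in self.scanned:
            return
        self.scanned.add(slot)
        target = self.slots.get(slot)
        if target is not None:
            self.s_rc[target] += 1

    def mutator_write(self, obj, field, target):
        """Mutator store; counts change only if the slot was already scanned."""
        slot = (obj, field)
        if slot in self.scanned:
            old = self.slots.get(slot)
            if old is not None:
                self.s_rc[old] -= 1
            if target is not None:
                self.s_rc[target] += 1
        self.slots[slot] = target
```

Two stores into scanned slots followed by two overwrites reproduce the 0, 1, 2, 1, 0 sequence from the slides, while stores into unscanned slots leave the count unchanged.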
Abstract Algorithms
- Utilize Shade and S-RC
- Installation-based and deletion-based
- Mutator nominates candidates
  - Does not mark objects
Concurrent System Structure
COLLECT()
  do
    mark();
    processNominated();
  while (!finished);
MUTATE(obj, field, target)
  obj.field = target;
  nominate(target);
Mutator Nominates (Installation)
[Figure sequence: object graph with objects A to G and the nominated object buffer.]
1. A has S-RC:0; the buffer is empty
2. A has S-RC:1; the buffer holds A
3. C has S-RC:1 and A has S-RC:1; the buffer holds C and A
4. A pointer is removed (marked X): A drops to S-RC:0, C stays at S-RC:1; the buffer still holds C and A
After Mark (Installation)
[Figure: the COLLECT() loop (do mark(); processNominated(); while (!finished);) next to the object graph; C has S-RC:1, A has S-RC:0; the nominated object buffer holds C and A.]
After Find (Installation)
Collector marks object C
[Figure: same graph and COLLECT() loop; C has S-RC:1, A has S-RC:0; the nominated object buffer holds C and A.]
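The nominate-then-find cycle in the installation-based slides can be sketched as a single-threaded simulation. This is a rough sketch under assumptions, not the authors' implementation: scanning is done at whole-object granularity, S-RC counts pointers from scanned objects, and a nominated object is traced only if its S-RC is still positive. The class `InstallationCollector` and its method names are hypothetical.

```python
class InstallationCollector:
    def __init__(self, heap, roots):
        self.heap = heap              # name -> list of field targets (None = empty)
        self.marked = set()
        self.scanned = set()          # objects whose fields have been scanned
        self.s_rc = {name: 0 for name in heap}
        self.nominated = []           # the nominated object buffer
        self.worklist = list(roots)

    def mutate(self, obj, field, target):
        """Mutator store: keep S-RC precise and nominate; never mark here."""
        old = self.heap[obj][field]
        self.heap[obj][field] = target
        if obj in self.scanned:
            if old is not None:
                self.s_rc[old] -= 1
            if target is not None:
                self.s_rc[target] += 1
        if target is not None:
            self.nominated.append(target)

    def mark(self):
        while self.worklist:
            obj = self.worklist.pop()
            if obj in self.marked:
                continue
            self.marked.add(obj)
            for target in self.heap[obj]:
                if target is not None:
                    self.s_rc[target] += 1
                    self.worklist.append(target)
            self.scanned.add(obj)

    def process_nominated(self):
        """Trace only nominees that a scanned object still points to."""
        found = [o for o in self.nominated
                 if self.s_rc[o] > 0 and o not in self.marked]
        self.nominated.clear()
        self.worklist.extend(found)
        return bool(found)

    def collect(self):
        while True:
            self.mark()
            if not self.process_nominated():
                break
```

Replaying the slides' scenario (A and C are installed into an already-scanned B, then the pointer to A is removed) leaves C marked and A unmarked, which is the precision the abstract collector is after.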
Allocation
- In installation-based collectors: no difference
- In deletion-based collectors: remembered upon allocation
Allocation
[Figure sequence: object graph with objects A to G and the nominated object buffer; a new object N is allocated with S-RC:1 and placed in the buffer.]
Practical Algorithms
- Stacks
  - Non-barriered region
  - Scanned object: behind the wavefront
  - S-RC affected
  - Stack rescanning
- S-RC and Shade compression (tri-color)
  - Reachability effect
Compressing S-RC (sticky bit)
[Figure sequence: object graph with objects A to G; A goes from S-RC:0 to S-RC:1; a pointer is then removed (marked X) but the count stays at S-RC:1.]
Object A: unreachable but kept alive
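The effect of the sticky bit can be contrasted with a precise count in a short sketch. The class names `PreciseCount` and `StickyBit` are hypothetical; the behavior follows the slides, where the count stays at 1 after the pointer is removed.

```python
class PreciseCount:
    """Full S-RC: increments and decrements are both recorded."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1

    def decrement(self):
        self.count -= 1

    def is_positive(self):
        return self.count > 0


class StickyBit:
    """S-RC compressed to one bit: set on increment, never cleared."""

    def __init__(self):
        self.bit = False

    def increment(self):
        self.bit = True

    def decrement(self):
        # The bit cannot tell a count of 1 from a larger one,
        # so a decrement has to leave it set.
        pass

    def is_positive(self):
        return self.bit
```

After one increment and one decrement the precise count reports zero, while the sticky bit still reports a positive count, so the object is retained as floating garbage.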
Compressing Shade
[Figure sequence: object graph with objects A to G. The displayed shade goes from SHADE:0 to SHADE:3 while the precise value is labeled PRECISE: 1. C has S-RC:1 through pointer R1; when R1 is removed (marked X), the S-RC stays at 1 (NOT decremented).]
Object C: unreachable but kept alive
Deriving Dijkstra
1. Start: INSTALLATION-BASED GC
2. Stack regions -> RESCANNED STACKS
3. Compress S-RC to sticky bit for ALL objects -> INSTALLATION with 1-bit
4. Compress Shade to 1-bit for ALL objects -> DIJKSTRA
Deriving Yuasa
1. Start: DELETION-BASED GC
2. Stack regions -> RESCANNED STACKS
3. Compress S-RC to 0-bit for ALL objects -> DELETION with 0-bits
4. NO S-RC needed => NO rescanning -> deletion with NO rescanning
5. Compress Shade to 1-bit for ALL objects -> YUASA
Deriving a New Collector (Hybrid)
1. Start: DELETION-BASED GC
2. Stack regions -> RESCANNED STACKS
3. Compress S-RC to sticky bit for allocated objects and to 0-bit for existing objects -> MIXED DELETION
4. Compress Shade to 1-bit for ALL objects -> HYBRID
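The three derivations differ only in a handful of choices, which can be written down as data. The sketch below simply records the choices listed on the derivation slides; the `Derivation` type, its field names, and the `tracks_s_rc` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    base: str            # "installation" or "deletion"
    s_rc_allocated: str  # S-RC compression for newly allocated objects
    s_rc_existing: str   # S-RC compression for pre-existing objects
    shade_bits: int      # width of the compressed shade


DERIVATIONS = {
    "Dijkstra": Derivation("installation", "sticky", "sticky", 1),
    "Yuasa": Derivation("deletion", "0-bit", "0-bit", 1),
    "Hybrid": Derivation("deletion", "sticky", "0-bit", 1),
}


def tracks_s_rc(derivation):
    """True if any class of objects still carries S-RC information."""
    return "sticky" in (derivation.s_rc_allocated, derivation.s_rc_existing)
```

Seen this way, the Hybrid collector shares its base with Yuasa and its treatment of allocated objects with Dijkstra.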
Algorithms
[Figure: the Steele, Dijkstra, Hybrid, and Yuasa collectors placed on two axes, completion time and memory overhead (floating garbage).]
Evaluation
- First systematic comparison of concurrent collectors
- IBM J9 production virtual machine
  - J2ME profile, microJit
  - 512MB RAM, Pentium 4, 3GHz
- Comparison in terms of execution time and space overhead
  - Dijkstra, Steele, Yuasa, Hybrid
- Benchmarks: SpecJVM98 (-s100)
- Work-based incremental scheme
  - Collect 9K for every 6K allocated
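The work-based schedule can be illustrated with a small pacer. This sketch assumes that "collect 9K for every 6K allocated" means the collector performs one 9 KB increment of collection work each time the mutator has allocated another 6 KB; the class `WorkBasedPacer` and its method are hypothetical and not taken from J9.

```python
KB = 1024


class WorkBasedPacer:
    def __init__(self, alloc_unit=6 * KB, collect_unit=9 * KB):
        self.alloc_unit = alloc_unit
        self.collect_unit = collect_unit
        self.pending = 0  # bytes allocated since the last increment

    def on_allocate(self, nbytes):
        """Return how many bytes of collection work are due right now."""
        self.pending += nbytes
        increments, self.pending = divmod(self.pending, self.alloc_unit)
        return increments * self.collect_unit
```

With these defaults, allocating 4 KB twice triggers one 9 KB increment on the second call, and the leftover 2 KB carries over to the next allocation.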
Maximum Space Usage
[Figure: bar chart of maximum space (y-axis from 0 to 1.6) for YUASA, DIJKSTRA, STEELE, and HYBRID on the benchmarks javac, mtrt, jack, db, and jess, plus the geometric mean.]
Execution Time
[Figure: bar chart of end-to-end time (y-axis from 0 to 1.4) for YUASA, DIJKSTRA, STEELE, and HYBRID on the benchmarks javac, mtrt, jack, db, and jess, plus the geometric mean.]
Summary
- Generalization of existing mechanisms
  - S-RC, Shade
- Abstract collectors based on the generalizations
  - Precise, but inefficient (S-RC, Shade)
- New concrete algorithm
  - Combines good properties of Yuasa and Dijkstra
  - Suitable for real-time domains
- Experimental evaluation
On-Going Work
- More transformations
- Formal proof of correctness
  - Transformations
- Unified abstract collector
  - Formal relation between algorithms
IF TIME PERMITS SLIDES
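Abstract Object Layout
[Figure: object layout with the header fields Barrier Reference Count, Don't Sweep, Recorded, Marked, and Shade, followed by DATA; some fields are labeled as computed by the mutator and others as computed by the collector.]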
The Transitive Loss
[Figure sequence over time: objects A, B, C, and D with pointers P1 and P2.]
1. Collector working: marks B as live
2. Thread working: installs P2
3. Thread working: deletes P1; C becomes unreachable
4. Collector working: C was not seen; reclaiming C is OK; reclaims D: live!
Common Structure Example
DIJKSTRA:
  WriteBarrier(Obj, field, New)
  {
    if (Phase == Tracing)
    {
      Remember(New);
    }
    Obj[field] = New;
  }
YUASA:
  WriteBarrier(Obj, field, New)
  {
    if (Phase == Tracing)
    {
      Old = Obj[field];
      Remember(Old);
    }
    Obj[field] = New;
  }
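The two barriers can also be written as runnable code to make the shared structure visible. This is a minimal sketch in which objects are plain dictionaries and `BarrierState`, `dijkstra_write`, and `yuasa_write` are hypothetical names; the `None` checks are an addition that the slide's pseudocode does not show.

```python
TRACING = "tracing"
IDLE = "idle"


class BarrierState:
    def __init__(self, phase=IDLE):
        self.phase = phase
        self.remembered = []  # objects handed to the collector


def dijkstra_write(state, obj, field, new):
    """Installation barrier: remember the newly installed target."""
    if state.phase == TRACING and new is not None:
        state.remembered.append(new)
    obj[field] = new


def yuasa_write(state, obj, field, new):
    """Deletion barrier: remember the target that is about to be overwritten."""
    if state.phase == TRACING:
        old = obj.get(field)
        if old is not None:
            state.remembered.append(old)
    obj[field] = new
```

Both functions perform the same store and the same phase check; they differ only in whether the new or the old target is remembered.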