
Derivation And Evaluation of Concurrent Collectors Martin T. Vechev - PowerPoint PPT Presentation



  1. Derivation And Evaluation of Concurrent Collectors. Martin T. Vechev (University of Cambridge); David F. Bacon, Perry Cheng and Dave Grove (IBM T.J. Watson Research Center)

  2. Outline
     - Motivation and Benefits
     - New Generalizations
     - Abstract Algorithms
     - Practical Algorithms
     - Derivations
     - Evaluation

  3. Motivation
     Concurrent Collectors:
     - Difficult to Construct Correctly
     - Initial Errors in Dijkstra and Steele Algorithms
     - Difficult to Understand
     - Difficult to Implement
     - No systematic comparisons, largely folklore

  4. Contributions
     - Generalization Of Existing Mechanisms
     - Abstract Collectors Based on Generalizations
     - Precise, but inefficient
     - New Algorithm
     - Derived from the power of generalizations
     - Experimental Evaluation Of 4 Concurrent GC

  5. Benefits of Generalization [Plot: axes Completion Time and Memory Overhead (Floating Garbage); points for Steele, Dijkstra, Hybrid and Yuasa.]

  6. Assumptions
     - Single Collector Thread
     - Multiple Mutator Threads
     - Atomic Write Barrier
     - Non-Moving

  7. Why Is It Hard? Concurrent Interleaving. [Diagram: objects A, B, C, D and pointer R1, on a time axis.] Step shown: GC marks B.

  8. Why Is It Hard? [Diagram: pointer R2 added.] Steps shown: GC marks B; Mutator creates R2.

  9. Why Is It Hard? [Diagram: pointer R1 crossed out.] Steps shown: GC marks B; Mutator creates R2; Mutator removes R1.

  10. Why Is It Hard? [Diagram: final state.] Steps shown: GC marks B; Mutator creates R2; Mutator removes R1; GC reclaims C & D (live): WRONG!
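The interleaving on slides 7 to 10 can be replayed in a few lines. This is only a sketch: the diagram does not survive in the transcript, so the heap shape (A points to C through R1, C points to D, and R2 is installed from B to D) and the field names are assumptions made for illustration.

```python
# Hypothetical heap: object name -> {field name: target object}.
heap = {"A": {"r1": "C"}, "B": {}, "C": {"next": "D"}, "D": {}}
roots = ["A", "B"]

def trace(starts, seen):
    """Add everything reachable from `starts` right now to `seen`."""
    stack = list(starts)
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(heap[obj].values())
    return seen

marked = set()
trace(["B"], marked)       # GC marks B; B has no outgoing pointers yet
heap["B"]["r2"] = "D"      # mutator creates R2 (B -> D) behind the collector
del heap["A"]["r1"]        # mutator removes R1 (A -> C)
trace(["A"], marked)       # GC resumes and finishes with the other root

live = trace(roots, set())            # what is really reachable now
reclaimed = set(heap) - marked        # what a sweep would free
wrongly_reclaimed = reclaimed & live
print(sorted(reclaimed), sorted(wrongly_reclaimed))
```

Under these assumptions the sweep frees C and D, and D is still reachable through the pointer that was installed into the already-marked object B.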

  11. Wavefront [Diagram: objects A to G with pointers R1 and R2.]

  12. Protection. Installation: Remember Crossing Pointer R1. Deletion: Remember R2. [Two diagrams: objects A to G with pointers R1 and R2; one pointer is crossed out.]

  13. New generalizations
      - Precise Wavefront
      - Shade
      - Precise Counting Of Cross Pointers
      - Scanned Reference Count (S-RC)

  14. Collector Progress (Shade) [Diagram: objects A to G, pointer R2.] SHADE:0

  15. Collector Progress (Shade) [Same diagram.] SHADE:1

  16. Collector Progress (Shade) [Same diagram.] SHADE:2

  17. Collector Progress (Shade) [Same diagram.] SHADE:3

  18. Shade Observations
      - Computed by Collector
      - Generalization of the tri-color abstraction
      - Different Granularities
      - Different Objects

  19. Scanned Reference Count (S-RC) [Diagram: objects A to G.] S-RC:0

  20. Scanned Reference Count (S-RC) [Same diagram.] S-RC:1

  21. Scanned Reference Count (S-RC) [Same diagram.] S-RC:2

  22. Scanned Reference Count (S-RC) [Same diagram, one pointer crossed out.] S-RC:1

  23. Scanned Reference Count (S-RC) [Same diagram, another pointer crossed out.] S-RC:0

  24. Scanned Reference Count (S-RC) [Same diagram.] S-RC:0
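The counter walk on slides 19 to 24 (0, 1, 2, 1, 0) can be sketched as follows. The sketch assumes that S-RC counts the references an object receives from already-scanned objects, tracked per object rather than per field, and that both the collector's scan and the mutator's write barrier keep it up to date. The heap contents and the names `scan` and `write` are hypothetical.

```python
from collections import Counter

heap = {"A": {"f": "C"}, "B": {}, "C": {}}   # hypothetical objects
scanned = set()      # objects behind the wavefront
s_rc = Counter()     # object -> number of references from scanned objects

def scan(obj):
    """Collector scans obj: its pointers now originate in scanned memory."""
    scanned.add(obj)
    for target in heap[obj].values():
        s_rc[target] += 1

def write(obj, field, new):
    """Mutator store with an S-RC barrier; new=None clears the field."""
    old = heap[obj].pop(field, None)
    if obj in scanned:               # only scanned sources affect S-RC
        if old is not None:
            s_rc[old] -= 1
        if new is not None:
            s_rc[new] += 1
    if new is not None:
        heap[obj][field] = new

history = []
scan("A")
history.append(s_rc["C"])            # A.f -> C is now a scanned reference
write("A", "g", "C")
history.append(s_rc["C"])            # second reference from scanned A
write("B", "h", "C")
history.append(s_rc["C"])            # B is unscanned, so no change
write("A", "f", None)
history.append(s_rc["C"])            # one scanned reference removed
write("A", "g", None)
history.append(s_rc["C"])            # no scanned references remain
print(history)
```

At the end C is referenced only from the unscanned object B, so the collector will still find it when it scans B.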

  25. Outline
      - Motivation and Benefits
      - New Generalizations
      - Abstract Algorithms
      - Practical Algorithms
      - Derivations
      - Evaluation

  26. Abstract Algorithms
      - Utilize Shade and S-RC
      - Installation-Based and Deletion-Based
      - Mutator nominates candidates
      - Does not mark objects

  27. Concurrent System Structure

      COLLECT()
        do
          mark();
          processNominated();
        while (!finished);

      MUTATE(obj, field, target)
        obj.field = target;
        nominate(target);

  28. Mutator Nominates (Installation) [Diagram: objects A to G.] A: S-RC:0. Nominated object buffer: empty.

  29. Mutator Nominates (Installation) [Same diagram.] A: S-RC:1. Nominated object buffer: A.

  30. Mutator Nominates (Installation) [Same diagram.] C: S-RC:1, A: S-RC:1. Nominated object buffer: C, A.

  31. Mutator Nominates (Installation) [Same diagram, one pointer crossed out.] C: S-RC:1, A: S-RC:0. Nominated object buffer: C, A.

  32. After Mark (Installation) [Same diagram, next to the COLLECT() loop: do mark(); processNominated(); while (!finished);] C: S-RC:1, A: S-RC:0. Nominated object buffer: C, A.

  33. After Find (Installation) [Same diagram, next to the COLLECT() loop.] Collector Marks Object C. C: S-RC:1, A: S-RC:0. Nominated object buffer: C, A.
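The nomination sequence on slides 27 to 33 can be approximated with a small model. It is a sketch under assumptions, not the authors' implementation: S-RC is tracked per object, the collector runs in whole `mark()` passes, and the class name `AbstractInstallationGC`, the root object R and the extra object D are invented for the example.

```python
from collections import Counter, deque

class AbstractInstallationGC:
    def __init__(self, objects, roots):
        self.objects = objects            # name -> {field: target}
        self.marked = set()               # objects already scanned
        self.s_rc = Counter()             # references from scanned objects
        self.nominated = deque()          # nominated object buffer
        self.work = list(roots)

    def mutate(self, obj, field, target):
        """MUTATE: update S-RC, store, nominate. The mutator never marks."""
        old = self.objects[obj].pop(field, None)
        if obj in self.marked:
            if old is not None:
                self.s_rc[old] -= 1
            if target is not None:
                self.s_rc[target] += 1
        if target is not None:
            self.objects[obj][field] = target
            self.nominated.append(target)

    def mark(self):
        while self.work:
            obj = self.work.pop()
            if obj in self.marked:
                continue
            self.marked.add(obj)
            for target in self.objects[obj].values():
                self.s_rc[target] += 1
                self.work.append(target)

    def process_nominated(self):
        """Keep only nominees that a scanned object still references."""
        while self.nominated:
            obj = self.nominated.popleft()
            if self.s_rc[obj] > 0 and obj not in self.marked:
                self.work.append(obj)

    def collect(self):
        while True:                       # do ... while (!finished)
            self.mark()
            self.process_nominated()
            if not self.work and not self.nominated:
                break

gc = AbstractInstallationGC(
    {"R": {}, "A": {}, "C": {"n": "D"}, "D": {}}, roots=["R"])
gc.mark()                   # collector scans the root first
gc.mutate("R", "a", "A")    # A nominated, S-RC(A) = 1
gc.mutate("R", "c", "C")    # C nominated, S-RC(C) = 1
gc.mutate("R", "a", None)   # pointer to A removed, S-RC(A) = 0
gc.collect()
print(sorted(gc.marked))
```

In this run A was nominated but is dropped because its count fell back to zero, while C (and D behind it) is marked.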

  34. Allocation
      - In Installation-Based Collectors
        - No difference
      - In Deletion-Based Collectors
        - Remembered Upon Allocation

  35. Allocation [Diagram: objects A to G.] Nominated object buffer: empty.

  36. Allocation [Same diagram with new object N.] N: S-RC:1. Nominated object buffer: N.

  37. Outline
      - Motivation and Benefits
      - New Generalizations
      - Abstract Algorithms
      - Practical Algorithms
      - Derivations
      - Evaluation

  38. Practical Algorithms
      - Stacks
      - Non-Barriered Region
      - Scanned Object: behind wavefront
      - S-RC affected
      - Stack rescanning
      - S-RC and Shade compression (tri-color)
      - Reachability Effect

  39. Compressing S-RC (sticky bit) [Diagram: objects A to G.] S-RC:0

  40. Compressing S-RC (sticky bit) [Same diagram.] S-RC:1

  41. Compressing S-RC (sticky bit) [Same diagram, one pointer crossed out.] S-RC:1

  42. Compressing S-RC (sticky bit) [Same diagram.] S-RC:1. Object A: Unreachable but kept Alive
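The effect shown on slides 39 to 42 can be illustrated by comparing a full counter with a one-bit sticky version. This is a minimal sketch; the class names are invented, and it assumes the sticky bit is set on the first increment and ignores decrements.

```python
class PreciseCount:
    """Full S-RC: increments and decrements are both tracked."""
    def __init__(self):
        self.value = 0
    def inc(self):
        self.value += 1
    def dec(self):
        self.value -= 1
    def keeps_alive(self):
        return self.value > 0

class StickyBit:
    """S-RC compressed to one bit: once set, it stays set."""
    def __init__(self):
        self.bit = False
    def inc(self):
        self.bit = True
    def dec(self):
        pass  # one bit cannot tell whether other references remain
    def keeps_alive(self):
        return self.bit

precise, sticky = PreciseCount(), StickyBit()
for counter in (precise, sticky):
    counter.inc()   # pointer installed from a scanned object
    counter.dec()   # the same pointer is removed again

print(precise.keeps_alive(), sticky.keeps_alive())
```

The precise counter lets the object go, while the sticky bit keeps it alive for this cycle as floating garbage.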

  43. Compressing Shade [Diagram: objects A to G.] SHADE:0

  44. Collector Progress (Shade) [Same diagram.] SHADE:3

  45. Collector Progress (Shade) [Same diagram with pointer R1.] S-RC:1. SHADE:3, PRECISE:1

  46. Collector Progress (Shade) [Same diagram, R1 crossed out.] S-RC:1 (NOT decremented). SHADE:3, PRECISE:1

  47. Collector Progress (Shade) [Same diagram.] SHADE:3, PRECISE:1. Object C: Unreachable but kept Alive

  48. Outline
      - Motivation and Benefits
      - New Generalizations
      - Abstract Algorithms
      - Practical Algorithms
      - Derivations
      - Evaluation

  49. Deriving Dijkstra
      INSTALLATION-BASED GC
      -> (Stack Regions) -> RESCANNED STACKS
      -> (Compress S-RC to sticky bit for ALL objects) -> INSTALLATION with 1-bit
      -> (Compress Shade to 1-bit for ALL objects) -> DIJKSTRA

  50. Deriving Yuasa
      DELETION-BASED GC
      -> (Stack Regions) -> RESCANNED STACKS
      -> (Compress S-RC to 0-bit for ALL objects) -> DELETION with 0-bits
      -> (NO S-RC needed => NO rescanning) -> Deletion with NO rescanning
      -> (Compress Shade to 1-bit for ALL objects) -> YUASA

  51. Deriving a New Collector (Hybrid)
      DELETION-BASED GC
      -> (Stack Regions) -> RESCANNED STACKS
      -> (Compress S-RC to sticky bit for Allocated objects; Compress S-RC to 0-bit for Existing objects) -> MIXED DELETION
      -> (Compress Shade to 1-bit for ALL objects) -> HYBRID

  52. Algorithms [Plot: axes Completion Time and Memory Overhead (Floating Garbage); points for Steele, Dijkstra, Hybrid and Yuasa.]

  53. Evaluation
      - First Systematic Comparison Of Concurrent Collectors
      - IBM J9 Production Virtual Machine
      - J2ME Profile, microJit
      - 512MB RAM, Pentium 4, 3GHz
      - Comparison in terms of Execution time And Space Overhead
      - Dijkstra, Steele, Yuasa, Hybrid
      - Which benchmarks: SPECjvm98 (-s100)
      - Work-Based Incremental Scheme
      - Collect 9K for every 6K allocated.
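The work-based scheme in the last two bullets ties collector work to allocation. A minimal sketch of that pacing follows, assuming "K" means kilobytes and that collector work is measured in bytes traced; the `Pacer` class and its fields are hypothetical and stand in for a real tracing loop.

```python
ALLOC_QUANTUM = 6 * 1024   # mutator allocation that triggers collector work
TRACE_QUANTUM = 9 * 1024   # collector work owed per allocation quantum

class Pacer:
    def __init__(self):
        self.allocated_since_step = 0
        self.traced = 0    # stand-in for bytes actually traced

    def on_allocate(self, nbytes):
        """Called on every allocation; runs collector increments when due."""
        self.allocated_since_step += nbytes
        while self.allocated_since_step >= ALLOC_QUANTUM:
            self.allocated_since_step -= ALLOC_QUANTUM
            self.traced += TRACE_QUANTUM

pacer = Pacer()
for _ in range(3):
    pacer.on_allocate(4096)   # 12 KB allocated in total
print(pacer.traced)
```

With these numbers the collector traces 1.5 bytes for every byte allocated.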

  54. Maximum Space Usage [Bar chart: Maximum Space (y-axis, 0 to 1.6) per benchmark (javac, mtrt, jack, db, jess, geomean) for YUASA, DIJKSTRA, STEELE and HYBRID.]

  55. Execution Time [Bar chart: End-to-End Time (y-axis, 0 to 1.4) per benchmark (javac, mtrt, jack, db, jess, geomean) for YUASA, DIJKSTRA, STEELE and HYBRID.]

  56. Summary
      - Generalization Of Existing Mechanisms
      - S-RC, Shade
      - Abstract Collectors Based on Generalizations
      - Precise, but inefficient (S-RC, Shade)
      - New Concrete Algorithm
      - Combines good properties of Yuasa and Dijkstra
      - Suitable for Real-Time Domains
      - Experimental Evaluation

  57. On-Going Work
      - More Transformations
      - Formal Proof Of Correctness
      - Transformations
      - Unified Abstract Collector
      - Formal Relation Between Algorithms

  58. IF TIME PERMITS SLIDES

  59. Abstract Object Layout [Diagram of an object's fields, labeled in order: Barrier Reference Count, Computed By Mutator, Don't Sweep, Recorded, Marked, Computed By Collector, Shade, DATA.]

  60. The Transitive Loss [Diagram: objects A, B, C, D with pointers P1 and P2, on a time axis.] Steps in time order:
      - Collector Working: Marks B as live; C was not seen
      - Thread Working: Installs P2
      - Thread Working: Deletes P1; C becomes unreachable
      - Collector Working: Reclaims C is OK; Reclaims D: live!

  61. Common Structure Example

      DIJKSTRA:
      WriteBarrier(Obj, field, New)
      {
        if (Phase == Tracing)
        {
          Remember(New);
        }
        Obj[field] = New;
      }

      YUASA:
      WriteBarrier(Obj, field, New)
      {
        if (Phase == Tracing)
        {
          Old = Obj[field];
          Remember(Old);
        }
        Obj[field] = New;
      }
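The two barriers can be tried out on a small race in which a pointer is installed into an already-marked object and the original path is then deleted. This is a sketch under assumptions: the four-object heap, the field names and the `run` helper are invented, the `Phase == Tracing` check is omitted because the whole run is treated as the tracing phase, and remembered objects are simply traced at the end.

```python
def run(barrier):
    """Replay the race with barrier in {"none", "dijkstra", "yuasa"}."""
    heap = {"A": {"r1": "C"}, "B": {}, "C": {"next": "D"}, "D": {}}
    marked, remembered = set(), []

    def trace(starts):
        stack = list(starts)
        while stack:
            obj = stack.pop()
            if obj not in marked:
                marked.add(obj)
                stack.extend(heap[obj].values())

    def write(obj, field, new):
        old = heap[obj].get(field)
        if barrier == "dijkstra" and new is not None:
            remembered.append(new)      # installation: Remember(New)
        if barrier == "yuasa" and old is not None:
            remembered.append(old)      # deletion: Remember(Old)
        if new is None:
            heap[obj].pop(field, None)
        else:
            heap[obj][field] = new

    trace(["B"])                # collector marks B first
    write("B", "r2", "D")       # mutator installs B -> D
    write("A", "r1", None)      # mutator deletes A -> C
    trace(["A"])                # collector finishes the roots
    trace(remembered)           # then processes remembered objects
    return marked

for name in ("none", "dijkstra", "yuasa"):
    print(name, sorted(run(name)))
```

In this scenario the run without a barrier loses the live object D, the Dijkstra-style barrier keeps D, and the Yuasa-style barrier keeps D and also retains the unreachable C as floating garbage.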
