
Managing Hybrid Memories by Predicting Object Write Intensity

Shoaib Akram, Kathryn S. McKinley, Jennifer B. Sartor, Lieven Eeckhout
Ghent University, Belgium
Shoaib.Akram@UGent.be


  1. Managing Hybrid Memories by Predicting Object Write Intensity
     Shoaib Akram, Kathryn S. McKinley, Jennifer B. Sartor, Lieven Eeckhout
     Ghent University, Belgium
     Shoaib.Akram@UGent.be

  2. DRAM as main memory is facing multiple challenges
     - Cost: high when scaling to 100s of GB
     - Reliability: a concern because the stored charge is very small

  3. Opportunity for new memory technologies to replace DRAM
     Source: https://www.nextplatform.com/2015/07/29/scaling-the-growing-system-memory-hierarchy/

  4. PCM cells have limited write endurance, shortening PCM's lifetime
     [Figure: PCM programming pulses, temperature (current) vs. time; labels: reset to amorphous (610°C), set to crystalline (350°C), read]

  5. Hybrid memory is the best of DRAM and PCM
     Property    DRAM  PCM
     Speed       ✔
     Endurance   ✔
     Energy            ✔
     Density           ✔

  6. Future of main memory: limited DRAM, lots of PCM
     [Figure labels: DRAM, PCM]
     This work uses DRAM for frequently written data

  7. Garbage collection: key advantage of using a managed language
     - Memory is automatically reclaimed for reuse
     - More than just reclamation: objects are also better organized

  8. Use GC to keep frequently written objects in DRAM
     Reactive approach
     - Monitors writes to objects
     - More fine-grained than hardware and OS approaches
     - No page migrations
     Reference: Write-rationing garbage collection for hybrid memories, PLDI 2018

  9. Use GC to keep frequently written objects in DRAM
     Proactive approach
     - Use a profile-guided predictor (this work)

  10. Three offline steps in building a write intensity predictor
      Application → Profiling → <Size, Type, Site, #writes> → Feature Selection → <Site, #writes> → Classification → <Site, advice>

  11. Profiling methodology
      • Java Virtual Machine
        - Jikes RVM (version 3.1.2)
        - 4 MB nursery
        - 2 GB mark-sweep mature space
      • Java applications
        - 9 from DaCapo
        - PseudoJBB 2005
        - Default inputs

  12. The outcome of profiling is a write intensity trace
      For each unique object X:
      1. Size
      2. Type
      3. Allocation site <method-name, bytecode index>
      4. # Writes

  13. Measuring entropy of different features
      Object  Size   # Writes
      O1      12 B   1000
      O2      12 B   1000
      O3      64 KB  1000
      O4      32 B   0
      O5      32 B   0
      Each size has an entropy of 0

  14. Measuring entropy of different features
      Object  Size   # Writes
      O1      12 B   1000
      O2      12 B   1000
      O3      64 KB  1000
      O4      32 B   1000
      O5      32 B   0
      Size 32 B has an entropy of 1
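
The entropy values on the two slides above can be reproduced with a small sketch. It assumes, since the slides do not spell it out, that entropy is the binary Shannon entropy (in bits) of a "write-intensive or not" label among objects sharing a feature value, and that an object counts as write-intensive when its write count reaches the 1 K threshold. The function name `label_entropy` and the dictionary layout are hypothetical.

```python
import math
from collections import defaultdict

def label_entropy(objects, feature, threshold=1000):
    """Return {feature value: entropy in bits} of the write-intensive label.

    Assumption: an object is write-intensive when writes >= threshold.
    """
    groups = defaultdict(list)
    for obj in objects:
        groups[obj[feature]].append(obj["writes"] >= threshold)

    result = {}
    for value, labels in groups.items():
        p = sum(labels) / len(labels)  # fraction of write-intensive objects
        if p in (0.0, 1.0):
            result[value] = 0.0        # perfectly homogeneous group
        else:
            result[value] = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return result

# Trace from slide 14 (sizes in bytes; 64 KB = 65536 B).
trace = [
    {"name": "O1", "size": 12, "writes": 1000},
    {"name": "O2", "size": 12, "writes": 1000},
    {"name": "O3", "size": 65536, "writes": 1000},
    {"name": "O4", "size": 32, "writes": 1000},
    {"name": "O5", "size": 32, "writes": 0},
]
entropy_by_size = label_entropy(trace, "size")
```

Under these assumptions, the 32 B group mixes one write-intensive and one cold object and gets entropy 1, while the 12 B and 64 KB groups stay at 0.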

  15. Homogeneity curves compare size vs. type vs. allocation site
      [Figure: homogeneity curves for size, type, and site; x-axis: entropy (0 to 1); y-axis: % of heap volume (0% to 100%); write intensity threshold = 1 K]

  16. Heuristics to classify allocation sites as write-intensive or not
      • Goals
        1. Minimize DRAM utilization
        2. Minimize PCM writes
      • Parameters
        1. Criteria to determine write-intensive objects
        2. Homogeneity threshold

  17. Criterion #1: write frequency
      Write frequency threshold = 1 K
      Object  Site  Size   # Writes  Write-intensive
      O1      A     12     1000      ✔
      O2      A     12     1000      ✔
      O3      A     65536  1000      ✔
      O4      A     32     0         ✗
      O5      A     32     0         ✗

  18. Criterion #2: write density
      Write density threshold = 1
      Object  Site  Size   # Writes  Write-intensive
      O1      A     12     1000      ✔
      O2      A     12     1000      ✔
      O3      A     65536  1000      ✗
      O4      A     32     0         ✗
      O5      A     32     0         ✗

  19. Criterion #1: write frequency
      Write frequency threshold = 1 K
      Homogeneity threshold = 50%
      Object  Site  Size   # Writes  Write-intensive
      O1      A     12     1000      ✔
      O2      A     12     1000      ✔
      O3      A     65536  1000      ✔
      O4      A     32     0         ✗
      O5      A     32     0         ✗
      Site A is write-intensive

  20. Criterion #2: write density
      Write density threshold = 1
      Homogeneity threshold = 50%
      Object  Site  Size   # Writes  Write-intensive
      O1      A     12     1000      ✔
      O2      A     12     1000      ✔
      O3      A     65536  1000      ✗
      O4      A     32     0         ✗
      O5      A     32     0         ✗
      Site A is NOT write-intensive
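
The two criteria and the homogeneity threshold from the preceding slides can be sketched as follows. This is an illustration under stated assumptions, not the authors' classifier: write density is taken to be writes per byte, and a site is taken to be write-intensive when the fraction of its objects (by count, not by heap volume) that meet the criterion reaches the homogeneity threshold. The names `is_write_intensive` and `classify_site` are hypothetical.

```python
def is_write_intensive(obj, criterion, threshold):
    """Apply one per-object criterion.

    Assumptions: "frequency" compares the raw write count against the
    threshold; "density" compares writes per byte against the threshold.
    """
    if criterion == "frequency":
        return obj["writes"] >= threshold
    if criterion == "density":
        return obj["writes"] / obj["size"] >= threshold
    raise ValueError(f"unknown criterion: {criterion}")

def classify_site(objects, criterion, threshold, homogeneity=0.5):
    """Return True when enough of the site's objects are write-intensive.

    Assumption: homogeneity is measured as a fraction of object count.
    """
    hot = sum(is_write_intensive(o, criterion, threshold) for o in objects)
    return hot / len(objects) >= homogeneity

# Objects allocated at site A, taken from the slides (sizes in bytes).
site_a = [
    {"name": "O1", "size": 12, "writes": 1000},
    {"name": "O2", "size": 12, "writes": 1000},
    {"name": "O3", "size": 65536, "writes": 1000},
    {"name": "O4", "size": 32, "writes": 0},
    {"name": "O5", "size": 32, "writes": 0},
]
by_frequency = classify_site(site_a, "frequency", threshold=1000)
by_density = classify_site(site_a, "density", threshold=1)
```

With write frequency, three of five objects qualify (60%), so site A is classified as write-intensive; with write density, the large object O3 drops out (1000 / 65536 is far below 1), leaving two of five (40%), so site A is not.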

  21. Baseline generational heap organization
      [Figure: heap diagram with labels GC, mutator, nursery, mature, large, DRAM]

  22. Distribution of writes to objects
      Empirical observations
      1. Nursery is highly mutated
      2. 2% of mature objects get 80% of writes

  23. Generational heap organization in hybrid memory
      [Figure: heap diagram with labels GC and mutator; DRAM holds nursery, mature, and large spaces; PCM holds mature and large spaces]

  24. PCM Writes vs. DRAM Utilization
      [Figure: % heap in DRAM (y-axis, 0 to 50) vs. % writes to PCM (x-axis, 0 to 15); Write-Frequency points at w_f = 1, 30K, 50K; Write-Density points at d_cut = 1E-3, 0.2, 50; homogeneity threshold = 1%]
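
The two axes of this tradeoff can be computed from a write intensity trace once each site has advice. The sketch below assumes that objects from sites classified as write-intensive are placed in DRAM and all others in PCM, and that heap share is measured in bytes; the trace values and the name `placement_tradeoff` are hypothetical and are not data from the talk.

```python
def placement_tradeoff(trace, dram_sites):
    """Return (% of heap bytes in DRAM, % of writes going to PCM).

    Assumption: an object lives in DRAM exactly when its allocation
    site is in dram_sites; every other object lives in PCM.
    """
    total_bytes = sum(o["size"] for o in trace)
    total_writes = sum(o["writes"] for o in trace)
    dram_bytes = sum(o["size"] for o in trace if o["site"] in dram_sites)
    pcm_writes = sum(o["writes"] for o in trace if o["site"] not in dram_sites)

    pct_dram = 100 * dram_bytes / total_bytes if total_bytes else 0.0
    pct_pcm_writes = 100 * pcm_writes / total_writes if total_writes else 0.0
    return pct_dram, pct_pcm_writes

# Hypothetical trace: site A is small and hot, site B is large and cold.
example_trace = [
    {"site": "A", "size": 20, "writes": 450},
    {"site": "A", "size": 20, "writes": 450},
    {"site": "B", "size": 160, "writes": 100},
]
pct_dram, pct_pcm_writes = placement_tradeoff(example_trace, {"A"})
```

In this made-up trace, placing only site A in DRAM uses 20% of the heap bytes and leaves 10% of the writes in PCM; sweeping the classification thresholds would trace out a curve of such points.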

  25. Allocation site predictor yields better tradeoffs than size and type
      [Figure: bar chart of PCM writes and DRAM utilization (% of mature, 0 to 60) for the Size, Type, and Site predictors; homogeneity threshold = 1%, Write-Density (50)]

  26. Profile-guided predictor is more effective compared to existing work
      [Figure: normalized writes to PCM (0 to 0.6) for Kingsguard-Writers and Write-Density; benchmarks: Lusearch, Pjbb, Lu.Fix, Avrora, Luindex, Hsqldb, Xalan, Pmd, Jython, Pmd.S, Fop, Antlr, Bloat, Sunflow]

  27. What is missing in the workshop paper?
      • Implementation details
        - Compiler sets a bit in the object header
        - GC chooses the correct allocator
      • Big data benchmarks
      • Emulation on a real NUMA machine
      • Performance results

  28. Conclusions
      • Exploit GC for improving the lifetime of emerging memories
      • Allocation sites correctly predict write intensity
      • Use an allocation site predictor to eliminate a large number of writes to PCM

  29. Challenge: limit # writes to PCM
      Solution: use DRAM for frequently written data

  30. Online monitoring introduces mutator and GC overheads
      [Figure: heap diagram with mutator labels; DRAM holds nursery, observer, mature, and large spaces; PCM holds mature and large spaces]

