
Conditions for and effects of CARD cache implementations



  1. Conditions for and effects of CARD cache implementations
     Gustaf Räntilä and Mikael Wånggren
     {e99_gra,e99_mwa}@e.kth.se

  2. Agenda
     • Problem formulation
     • Hypothesis
     • Our approach and methods
     • Results (and their reliability)
     • Questions

  3. Problem formulation
     • Context switches degrade performance
       – Interactive systems (with short timeslices) are extra sensitive
       – Overhead: saving and loading registers and processor state, scheduling
       – Flushing caches, TLB, prediction buffers etc. → they must be rebuilt every new timeslice

  4. Hypothesis
     • We can decrease the negative effects of context switches by “caching the cache”
     • How? On a context switch, activate a CARD cache
       – Save process-specific data (cache, buffers etc.)
       – Load the same kind of data for the next process
     • CARD: Context switch Active – Run-time Drowsy
       – Sleeps while programs run
       – Awakens on context switch
       – Hardware implementation not discussed in this project
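
The save and load steps on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the authors' implementation: the class `CardCache`, the function `context_switch`, and the use of a plain dictionary as a stand-in for cache contents are all assumptions made for the example.

```python
from copy import deepcopy


class CardCache:
    """Hypothetical CARD store: one saved cache image per process ID."""

    def __init__(self):
        self._images = {}  # pid -> saved cache contents

    def save(self, pid, cache_state):
        # Snapshot the outgoing process's cache contents.
        self._images[pid] = deepcopy(cache_state)

    def load(self, pid):
        # Return the saved image, or an empty (cold) cache for an unseen process.
        return deepcopy(self._images.get(pid, {}))


def context_switch(card, live_cache, old_pid, new_pid):
    """Save the outgoing process's image, then return the incoming one."""
    card.save(old_pid, live_cache)
    return card.load(new_pid)


card = CardCache()
cache_a = {0x1000: "line A0", 0x1040: "line A1"}  # process 1 has warmed its cache
cache_b = context_switch(card, cache_a, old_pid=1, new_pid=2)  # process 2 starts cold
cache_b[0x2000] = "line B0"
restored = context_switch(card, cache_b, old_pid=2, new_pid=1)  # process 1 returns
```

After the second switch, `restored` holds the two lines process 1 had before it was descheduled, which is the effect the hypothesis relies on.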

  5. Issues not discussed in this project
     • Many processes
       – Huge CARD cache
       – Scheduler can prioritize the most suitable processes
     • Kernel–CPU interaction
       – New instructions required

  6. Our approach and methods
     • We only save and restore the cache (not registers etc.)
     • Simics 2.0 for full-system simulation
       – g-cache as cache model
     • x86 20 MHz hardware model
     • Red Hat Enterprise 7.3 with Linux 2.6 kernel

  7. Our approach and methods contd.
     • Cache setup (we mimic an XScale)
       – 32 kB L1 i-cache and 32 kB L1 d-cache
         ∗ 32-way, virtually indexed, physically tagged
         ∗ i-cache policy: LRU, d-cache policy: random
         ∗ 1 cycle penalty for hit
         ∗ 50 cycle penalty for miss
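
To give a feel for what this setup implies, the following Python sketch works out the cache geometry and the cost of refilling a flushed cache. The cache size, associativity, penalties, and the 20 MHz clock come from the slides; the 32-byte line size is an assumption (the slides do not state it), as is the simplification that a miss costs 50 cycles in total.

```python
CACHE_BYTES = 32 * 1024   # from the slide: 32 kB per L1 cache
WAYS = 32                 # from the slide: 32-way set associative
LINE_BYTES = 32           # ASSUMPTION: line size is not given on the slide
HIT_CYCLES = 1            # from the slide
MISS_CYCLES = 50          # from the slide, assumed to be the total miss cost
CLOCK_HZ = 20_000_000     # from the hardware model slide: 20 MHz


def num_sets(cache_bytes, ways, line_bytes):
    """Number of sets in a set-associative cache."""
    return cache_bytes // (ways * line_bytes)


def access_cycles(hits, misses):
    """Total cycles spent on cache accesses under the simple penalty model."""
    return hits * HIT_CYCLES + misses * MISS_CYCLES


def full_refill_cycles(cache_bytes, line_bytes):
    """Worst case: every line of a flushed cache is refilled by one miss."""
    return (cache_bytes // line_bytes) * MISS_CYCLES


sets = num_sets(CACHE_BYTES, WAYS, LINE_BYTES)           # 32 sets
refill = full_refill_cycles(CACHE_BYTES, LINE_BYTES)     # 1024 lines * 50 = 51200 cycles
refill_seconds = refill / CLOCK_HZ                       # about 2.56 ms at 20 MHz
mixed = access_cycles(hits=900, misses=100)              # 900 + 5000 = 5900 cycles
```

Under these assumptions, rebuilding one fully flushed L1 cache costs on the order of a few milliseconds at 20 MHz, which is comparable to a short timeslice.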

  8. Implementation
     • Requirements
       – Identifying context switches in Simics
         ∗ Break on execution of __switch_to
         ∗ Re-build kernel with magic instructions
       – Grab PID to use as key to the CARD
         ∗ Currently requires magic instructions

  9. Magic instructions in Linux
     • Magic instructions do no harm
     • Our procedure in Linux
       – Before context switch
         ∗ Set eax to 0 and call magic instruction
       – After context switch
         ∗ Copy PID to eax and call magic instruction

  10. Magic instructions in Simics (Python)
     • Simics has native support for magic instructions
     • Our procedure in Python
       – Break on the magic instruction (MI) and read eax
       – Load or save current cache to CARD
       – Start a temporal breakpoint chain
         ∗ For every temporal breakpoint, store statistics
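
The eax convention from the two magic-instruction slides (0 before the switch, the new PID after it) can be decoded roughly as follows. This is a plain-Python stand-in and deliberately uses no Simics API: the class `MagicHandler`, its method `on_magic`, and the dictionary used as the live cache are hypothetical names chosen for the sketch, and how the real script reads eax or accesses g-cache is not shown in the slides.

```python
class MagicHandler:
    """Hypothetical handler called once per magic instruction with the value of eax."""

    def __init__(self):
        self.card = {}           # pid -> saved cache image
        self.current_pid = None  # unknown until the first "after switch" magic
        self.live_cache = {}     # stand-in for the simulated cache contents
        self.log = []            # (event, pid) records, useful for statistics

    def on_magic(self, eax):
        if eax == 0:
            # Before the context switch: save the outgoing process's cache.
            if self.current_pid is not None:
                self.card[self.current_pid] = dict(self.live_cache)
                self.log.append(("save", self.current_pid))
        else:
            # After the context switch: eax carries the PID of the new process.
            self.current_pid = eax
            self.live_cache = dict(self.card.get(eax, {}))
            self.log.append(("load", eax))


handler = MagicHandler()
handler.on_magic(0)              # first switch: no known PID yet, nothing saved
handler.on_magic(7)              # process 7 starts with a cold cache
handler.live_cache[0x10] = "x"   # process 7 touches a line
handler.on_magic(0)              # switch away from 7: its image is saved
handler.on_magic(9)              # process 9 starts cold
handler.on_magic(0)
handler.on_magic(7)              # process 7 returns and gets its image back
```

One design point worth noting: because PID 0 doubles as the "before switch" marker in this convention, the sketch cannot distinguish a switch to PID 0 from a save request.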

  11. Experimentation
     • We simulate applications of different behaviour
     • From MiBench
       – lame, calculation heavy
       – dijkstra, both calculation and data heavy
       – crc32, a very common sequential application
     • Home-made
       – string search, data heavy

  12. Experimentation contd.
     • Simulations run on a pre-emptive kernel
       – But it’s not easy to force “pre-emption”
     • We want context switches!
       – We loop “ps >> file” in the background to force context switches (CS)
       – Thereby we also get the program’s PID


  30. Reliability in the results
     • Longer runs would eliminate start-up slowdown
     • Do we use a decent cache setup?
     • L2 cache?
     • Is our clock frequency fair?

  31. Questions?
