Graph Prefetching Using Data Structure Knowledge



  1. Graph Prefetching Using Data Structure Knowledge Sam Ainsworth and Timothy M. Jones Computer Laboratory

  2. Graph500 Search Performance

  3. Current Prefetching Techniques ● Stride ● Software

  4. Exploit Look-ahead! [Figure: example contents of the Work List, Vertex List, Edge List and Visited arrays]
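The look-ahead idea can be sketched in software. This is a minimal Python model, assuming a compressed sparse row (CSR) layout in which `vertex_list[v]` and `vertex_list[v + 1]` bound vertex `v`'s slice of `edge_list`. The function name `lookahead_targets` and the small example graph are hypothetical and are not taken from the slides.

```python
def lookahead_targets(work_list, vertex_list, edge_list, pos, distance):
    """Return the indices a look-ahead prefetcher would touch for the
    work-list entry `distance` items ahead of `pos` (hypothetical model)."""
    ahead = pos + distance
    if ahead >= len(work_list):
        return None  # nothing left to look ahead to
    v = work_list[ahead]                              # 1. future work-list entry
    start, end = vertex_list[v], vertex_list[v + 1]   # 2. bounds of its edge slice
    neighbours = edge_list[start:end]                 # 3. its outgoing edges
    return {
        "work_list": ahead,
        "vertex_list": [v, v + 1],
        "edge_list": list(range(start, end)),
        "visited": list(neighbours),                  # 4. visited flags to check
    }


# Hypothetical 4-vertex graph: 0->1, 0->2, 1->3, 2->3; vertex 3 has no out-edges.
vertex_list = [0, 2, 3, 4, 4]
edge_list = [1, 2, 3, 3]
work_list = [0, 1, 2]

targets = lookahead_targets(work_list, vertex_list, edge_list, pos=0, distance=1)
```

Each step depends on the value loaded by the previous one, which is why a stride prefetcher alone cannot follow the chain past the work list.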

  5. Problems
     ● Need address bounds of data structures
     ● Need to schedule prefetches
     ● Need to react to variable latency loads

  6. Problems
     ● Need address bounds of data structures
       ○ Configure them in software!
     ● Need to schedule prefetches
     ● Need to react to variable latency loads

  7. Problems
     ● Need address bounds of data structures
       ○ Configure them in software!
     ● Need to schedule prefetches
       ○ Use observation hardware – EWMAs.
     ● Need to react to variable latency loads

  8. Problems
     ● Need address bounds of data structures
       ○ Configure them in software!
     ● Need to schedule prefetches
       ○ Use observation hardware – EWMAs.
     ● Need to react to variable latency loads
       ○ React to arrival of prefetches, not loads!
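One way to picture the scheduling step is a pair of exponentially weighted moving averages (EWMAs) whose ratio sets how far ahead in the work list to prefetch. The sketch below is a minimal Python model under stated assumptions: the smoothing factor `alpha=0.125`, the class `Ewma`, the function `lookahead_distance`, and the plain "ratio, rounded up" formula are all illustrative choices, not the formula used by the authors' hardware.

```python
import math


class Ewma:
    """Exponentially weighted moving average of observed latencies."""

    def __init__(self, alpha=0.125):
        self.alpha = alpha      # smoothing factor (assumed value)
        self.value = None       # no samples seen yet

    def update(self, sample):
        if self.value is None:
            self.value = float(sample)
        else:
            self.value = self.alpha * sample + (1 - self.alpha) * self.value
        return self.value


def lookahead_distance(data_time, work_list_time):
    """Work-list items to run ahead so data arrives in time (assumed formula)."""
    if not work_list_time.value or data_time.value is None:
        return 1  # fall back to the next item until both averages exist
    return max(1, math.ceil(data_time.value / work_list_time.value))


work_list_time = Ewma()   # cycles between successive work-list accesses
data_time = Ewma()        # cycles for a chain of prefetches to complete
for cycles in (50, 50, 50):
    work_list_time.update(cycles)
for cycles in (400, 400):
    data_time.update(cycles)

distance = lookahead_distance(data_time, work_list_time)
```

With these made-up samples the averages settle at 50 and 400 cycles, so the model looks 8 work-list items ahead.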

  9. Graph Prefetcher [Block diagram. Labels: Core, DTLB, Dcache, L2 Cache, To / From Main Memory, Prefetcher Config, Snoops, Prefetched Data, Prefetch Reqs, Work List, Vertex List, Edge List, Visited List, EWMA Calculator, Address Generator, Request Queue]

  10. [Figure: the Work List, Vertex List, Edge List and Visited arrays from slide 4, with more entries filled in]

  11. Graph Prefetcher: Microarchitecture [Block diagram. Input: Snoops & Prefetched Data From L1 Cache. Address Bounds Registers: Work List Start/End, Vertex List Start/End, Edge List Start/End, Visited List Start/End. Other units: Address Filter, Prefetch Address Generator, Prefetch Request Queue (To DTLB & L1 Cache), EWMA Unit (Work List Time EWMA, Data Time EWMA, Ratio Register)]
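The role of the address bounds registers and the address filter can be illustrated with a small Python model. It assumes each register pair holds an inclusive start and an exclusive end address; the class `BoundsRegisters`, the function `address_filter`, and the addresses are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BoundsRegisters:
    """Start (inclusive) and end (exclusive) addresses set by software."""
    work_list: tuple
    vertex_list: tuple
    edge_list: tuple
    visited_list: tuple


def address_filter(regs, addr):
    """Return the name of the structure containing `addr`, or None."""
    for name in ("work_list", "vertex_list", "edge_list", "visited_list"):
        start, end = getattr(regs, name)
        if start <= addr < end:
            return name
    return None  # not a tracked structure, so the prefetcher ignores it


# Hypothetical addresses, chosen only for illustration.
regs = BoundsRegisters(
    work_list=(0x1000, 0x2000),
    vertex_list=(0x2000, 0x6000),
    edge_list=(0x6000, 0xE000),
    visited_list=(0xE000, 0xF000),
)

kind = address_filter(regs, 0x1008)
```

Here `kind` is `"work_list"`, so in this model a snooped load at `0x1008` would trigger look-ahead from the work list, while an address outside every range is dropped.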

  12. Results – Graph500

  13. Results – Boost Graph Library

  14. Results – Sequential Iteration

  15. Generalized Prefetching - Databases [Figure: a key list probing a bucketed hash table, e.g. Hash(43) = 3 selects the bucket holding (43, 2, ptr). Annotation: Lookahead by striding in the key list]
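The same look-ahead pattern applied to a hash join can be sketched as follows. This is a minimal Python model: the key list order, the table size of 8 buckets, and the identity hash (picked only because it reproduces Hash(43) = 3 with 8 buckets) are assumptions, and `bucket_to_prefetch` is a hypothetical name.

```python
def identity_hash(key):
    """Hypothetical hash; with 8 buckets it maps key 43 to bucket 3."""
    return key


def bucket_to_prefetch(keys, pos, distance, num_buckets, hash_fn):
    """Return (key, bucket index) for the key `distance` entries ahead of `pos`."""
    ahead = pos + distance
    if ahead >= len(keys):
        return None  # ran off the end of the key list
    key = keys[ahead]                      # stride ahead in the key list
    return key, hash_fn(key) % num_buckets  # bucket whose line to prefetch


keys = [12, 43, 62, 13, 87]  # assumed key order, for illustration only
result = bucket_to_prefetch(keys, pos=0, distance=1,
                            num_buckets=8, hash_fn=identity_hash)
```

While the core probes key 12, the model already knows that key 43 will need bucket 3, so that bucket can be fetched early.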

  16. Programmable Prefetcher [Block diagram. Input: Snoops & Prefetched Data From L1 Cache. Programmable Registers: Hash XOR, Shift Amount, Hash Table Start/End, Key List Start/End, Other Data. Other units: Address Filter, Programmable Units (six CPU blocks), Prefetch Request Queue (To DTLB & L1 Cache), EWMA Unit (Work List Time EWMA, Data Time EWMA, Ratio Register)]
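The slide names the programmable registers but not how they combine, so the following Python sketch is a guess at one plausible composition: XOR the key with the Hash XOR register, shift right by the Shift Amount, then index into the table between Hash Table Start and End. The function `bucket_address`, the `bucket_size` parameter, and all register values are hypothetical.

```python
def bucket_address(key, hash_xor, shift_amount, table_start, table_end, bucket_size):
    """Compute a bucket address from register values (assumed composition)."""
    hashed = (key ^ hash_xor) >> shift_amount          # configurable hash
    num_buckets = (table_end - table_start) // bucket_size
    return table_start + (hashed % num_buckets) * bucket_size


# Hypothetical register contents: 8 buckets of 16 bytes starting at 0x8000.
TABLE_START = 0x8000
TABLE_END = 0x8080
BUCKET_SIZE = 16

addr = bucket_address(43, hash_xor=0, shift_amount=0,
                      table_start=TABLE_START, table_end=TABLE_END,
                      bucket_size=BUCKET_SIZE)
```

With the XOR and shift registers set to zero this reduces to a modulo hash, so key 43 lands in bucket 3 at `0x8030`; other register values change the hash without changing the hardware.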

  17. Graph Prefetching Using Data Structure Knowledge Sam Ainsworth and Timothy M. Jones sam.ainsworth@cl.cam.ac.uk timothy.jones@cl.cam.ac.uk [Figure: graph prefetcher block diagram, repeated from slide 9] For more information, see our paper from ICS 2016!
