Tiny Directory: Making Coherence Tracking Feather-light



  1. Tiny Directory: Making Coherence Tracking Feather-light. Mainak Chaudhuri, Indian Institute of Technology Kanpur (joint work with Sudhanshu Shukla, IITK)

  2. Forty-year anniversary • Forty years of directory-based coherence • L. M. Censier and P. Feautrier. A New Solution to Coherence Problems in Multicache Systems. In IEEE Transactions on Computers, C-27(12):1112-1118, December 1978. – “A new solution is presented and discussed here: the presence flag solution.” • Lucien M. Censier, Paul Feautrier; CII-Honeywell Bull, Université Pierre et Marie Curie

  3. Sketch • Talk in one slide • Result highlights • Introduction • Tiny Directory – In-LLC coherence tracking – Tiny Directory design – Spilling into LLC space • Simulation infrastructure • Simulation results • Summary and extensions

  4. Sketch → Talk in one slide → Result highlights • Introduction • Tiny Directory – In-LLC coherence tracking – Tiny Directory design – Spilling into LLC space • Simulation infrastructure • Simulation results • Summary and extensions

  5. Talk in One Slide [Diagram: cores C0-C3, each with private cache(s) holding blocks B1-B3; an interconnection network; shared LLC banks, each paired with a sparse directory slice tracking those blocks; the sparse directory height is labeled.]

  6. Talk in One Slide • Sparse directory height is an important determinant of performance [Diagram as in slide 5.]

  7. Talk in One Slide • We show how to design very small sparse directories while delivering high performance [Diagram as in slide 5.]

  8-12. Talk in One Slide • Track privately owned blocks by borrowing bits from the LLC data way [Diagram as in slide 5, shown over five animation steps.]

  13. Talk in One Slide • Critical shared blocks with large-scale read sharing are tracked in a tiny directory [Diagram as in slide 5.]

  14-16. Talk in One Slide • Entries evicted from the tiny directory can be spilled into LLC space at a controlled rate [Diagram as in slide 5, shown over three animation steps.]

  17. Result highlights • 128-core chip-multiprocessor running scientific computing, general-purpose, and commercial multi-threaded workloads – Our Tiny Directory proposal using sparse directories with (1/32)x to (1/256)x entries performs within 1% of a 2x sparse directory • Tiny Directory capacity ranges from 187KB to 23.75KB – Our Tiny Directory proposal exercising (1/256)x entries saves 16% energy in the LLC and the sparse directory compared to the 2x baseline – Our proposal outperforms the state-of-the-art multi-grain directory by large margins

  18. Result highlights • 128-core chip-multiprocessor running scientific computing, general-purpose, and commercial multi-threaded workloads – Our Tiny Directory proposal using sparse directories with (1/32)x to (1/256)x entries performs within 1% of a 2x sparse directory – A significant leap forward in saving on-die SRAM investment for coherence tracking • Tiny Directory capacity ranges from 187KB to 23.75KB – Our Tiny Directory proposal exercising (1/256)x entries saves 16% energy in the LLC and the sparse directory compared to the 2x baseline – Our proposal outperforms the state-of-the-art multi-grain directory by large margins

  19. Sketch • Talk in one slide • Result highlights → Introduction • Tiny Directory – In-LLC coherence tracking – Tiny Directory design – Spilling into LLC space • Simulation infrastructure • Simulation results • Summary and extensions

  20. Introduction • A sparse directory is a set-associative tagged structure attached to each last-level cache (LLC) bank – Each sparse directory entry tracks the location(s) of an LLC block in the private cache hierarchy attached to each core – Sparse directory implementation needs to be space-efficient as the number of cores in the chip-multiprocessor increases – The number of sparse directory entries imposes an upper bound on the number of distinct blocks tracked at any point in time • This parameter plays an important role in determining the overall performance and the total space investment for coherence tracking

  21. Sparse directory height • Sparse directory height is an important determinant of performance – The number of sparse directory entries is expressed as a fraction of the number of blocks in the last-level private cache (L2 cache in our case)
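The sizing convention on slides 20 and 21 can be illustrated with a small model. The following is a minimal sketch, not the authors' simulator: the names `directory_entries` and `SparseDirectory`, the LRU replacement policy, the modulo set index, and the figure of 4096 L2 blocks per core are all assumptions made for illustration. It shows a set-associative tagged structure whose entries hold a sharer bit vector (the "presence flags" of slide 2), and how the entry count caps the number of distinct blocks tracked at once.

```python
from collections import OrderedDict
from fractions import Fraction


def directory_entries(num_cores, l2_blocks_per_core, ratio):
    """Total sparse directory entries for a size ratio such as 2 or 1/256.

    The ratio is relative to the total number of private L2 blocks.
    """
    return int(num_cores * l2_blocks_per_core * Fraction(ratio))


class SparseDirectory:
    """Set-associative tagged structure: block address -> sharer bit vector."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # One OrderedDict per set, kept in LRU order (assumed policy).
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def record_sharer(self, block_addr, core):
        """Mark `core` as holding `block_addr`.

        Returns the evicted (block, sharer_bits) pair if the set was full,
        otherwise None. An evicted entry's sharers would have to be
        invalidated, which is why a short directory hurts performance.
        """
        s = self.sets[block_addr % self.num_sets]  # assumed index function
        victim = None
        if block_addr not in s and len(s) == self.ways:
            victim = s.popitem(last=False)  # evict least recently used entry
        s[block_addr] = s.get(block_addr, 0) | (1 << core)
        s.move_to_end(block_addr)
        return victim

    def sharers(self, block_addr):
        """List the cores whose presence bit is set for `block_addr`."""
        bits = self.sets[block_addr % self.num_sets].get(block_addr, 0)
        return [c for c in range(bits.bit_length()) if (bits >> c) & 1]

    def tracked_blocks(self):
        """Number of distinct blocks tracked; never exceeds num_sets * ways."""
        return sum(len(s) for s in self.sets)


if __name__ == "__main__":
    # 128 cores as on slide 17; 4096 L2 blocks per core is hypothetical.
    print(directory_entries(128, 4096, 2))
    print(directory_entries(128, 4096, Fraction(1, 256)))

    d = SparseDirectory(num_sets=2, ways=2)
    d.record_sharer(0, core=0)
    d.record_sharer(0, core=3)
    d.record_sharer(2, core=1)
    print(d.record_sharer(4, core=2))  # set 0 is full, so block 0 is evicted
```

With these assumed parameters, moving from a 2x to a (1/256)x directory shrinks the entry count by a factor of 512, which is the gap the Tiny Directory proposal aims to close without losing performance.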
