[537] TLBs, Tyler Harter, 9/21/14

Overview 
 - Review Paging 
 - TLBs (Chapter 18) 
 - TLB measurement demo (if time) 

Review: Paging [figure: Virtual pages of P1 and P2 mapped onto Physical memory; labels in source order: 1 5 4, P1 pagetable, 6 2 3, P2 pagetable; physical memory marked 0 KB, 4 KB, 8 KB, 12 KB, 16 … and holding PT, P1, and P2 pages]


  1. Cache Types (more in CS 552) 
 Direct-Mapped: only one place to put entries 
 Four-Way Set Associative: 4 options 
 Fully-Associative: entries can go anywhere 
 - most common for TLBs 
 - must store whole key/value in cache 
 - search all in parallel

  2. Array Iterator (w/ TLB) 
 int sum = 0; 
 for (i=0; i<2048; i++) { 
     sum += a[i]; 
 }

  3-18. Array Iterator: address translation trace (animation frames combined) 
 [figure: PTBR points at the pagetable at physical 0 KB; P1 pagetable entries 1 5 4 …; physical memory marked 0 KB to 28 KB in 4 KB steps, holding PT, P1, and P2 pages; the CPU's TLB (Valid Virt Phys) starts with every entry invalid] 
 Virt          Phys 
 load 0x1000   load 0x0004 (pagetable read), load 0x5000 
 load 0x1004   (TLB) load 0x5004 
 load 0x1008   (TLB) load 0x5008 
 load 0x100C   (TLB) load 0x500C 
 … 
 load 0x2000   load 0x0008 (pagetable read), load 0x4000 
 load 0x2004   (TLB) load 0x4004 
 CPU's TLB after the trace (Valid Virt Phys): 1 1 5; 1 2 4; remaining entries invalid

  19-20. How many TLB lookups? (assume 1KB pages) 
 int sum = 0; 
 for (i=0; i<2048; i++) { 
     sum += a[i]; 
 } 
 Answer: 2048/sizeof(int) = 512 (this treats 2048 as the array's size in bytes; the loop as written performs one load of a[i] per iteration, i.e. 2048 loads)

  21-22. How many TLB “misses”? (assume 1KB pages) 
 int sum = 0; 
 for (i=0; i<2048; i++) { 
     sum += a[i]; 
 } 
 Answer: 2 if a is page-aligned, else 3. With 1KB pages, page-aligned means a%1024 is 0 (the slide writes a%4096, a 4KB alignment check); the counts cover the 2048 bytes behind the 512 lookups.

  23. Miss rate? (assume 1KB pages) 
 int sum = 0; 
 for (i=0; i<2048; i++) { 
     sum += a[i]; 
 } 
 Answer: 2/512 = 0.4% or 3/512 = 0.6%

  24. Hit rate? (assume 1KB pages) 
 int sum = 0; 
 for (i=0; i<2048; i++) { 
     sum += a[i]; 
 } 
 Answer: 510/512 = 99.6% or 509/512 = 99.4%

  25. Outline 
 - What work can we eliminate? 
 - Basic strategy. 
 - Workloads, systems, metrics. 
 - Context switching and security.

  26-27. Reasoning about TLB 
 Workload: series of loads/stores (memory accesses) 
 TLB: chooses entries to store in CPU 
 Metric: performance (i.e., hit rate) 
 TLB “algebra”: given 2 variables, find the 3rd: f(W, T) = M

  28-29. TLB Workloads 
 Sequential array accesses can almost always hit in the TLB, and so are very fast! 
 What pattern would be slow? 
 - highly random, with no repeat accesses

  30. Workload Characteristics 
 Workload A 
 int sum = 0; 
 for (i=0; i<2048; i++) { 
     sum += a[i]; 
 } 

 Workload B 
 int sum = 0; 
 srand(1234); 
 for (i=0; i<1000; i++) { 
     sum += a[rand() % N]; 
 } 
 srand(1234); 
 for (i=0; i<1000; i++) { 
     sum += a[rand() % N]; 
 }

  31-33. [figure: address vs. time plots for Workload A and Workload B; Workload A shows Spatial Locality, Workload B shows Temporal Locality]

  34-35. Workload Locality 
 Spatial Locality: future access will be to nearby addresses 
 Temporal Locality: future access will be repeats to the same data 
 What TLB characteristics are best for each type?

  36. A couple policies 
 LRU: evict the least-recently used entry when a TLB slot is needed 
 Random: randomly choose entries to evict 
 When is each better?

  37-50. LRU Troubles (animation frames combined) 
 A 4-entry TLB (Valid Virt Phys) starts empty; Phys is shown as ? throughout. 
 virtual addresses: 0 1 2 3 4 0 
 access 0: miss! TLB holds 0 
 access 1: miss! TLB holds 0 1 
 access 2: miss! TLB holds 0 1 2 
 access 3: miss! TLB holds 0 1 2 3 
 access 4: miss! replaces 0 (least recently used); TLB holds 4 1 2 3 
 access 0: miss! replaces 1 (least recently used); TLB holds 4 0 2 3
