

  1. Chapter 4 Cache Memory

  2. Contents • Computer memory system overview —Characteristics of memory systems —Memory hierarchy • Cache memory principles • Elements of cache design —Cache size —Mapping function —Replacement algorithms —Write policy —Line size —Number of caches • Pentium 4 and PowerPC cache organizations

  3. Key Points • Memory hierarchy —processor registers —cache —main memory —fixed hard disk —ZIP cartridges, optical disks, and tape • Going down the hierarchy —decreasing cost, increasing capacity, and slower access time • Principles of locality —during the execution of a program, memory references tend to cluster

  4. 4.1 Computer Memory System Overview • Characteristics of memory systems —Location —Capacity —Unit of transfer —Access method —Performance —Physical type —Physical characteristics – volatile/nonvolatile – erasable/nonerasable —Organization

  5. Location • CPU • Internal —main memory —cache • External (secondary) —peripheral storage devices —disk, tape

  6. Capacity • Word size —natural unit of organization —8, 16, 32, and 64 bits • Number of words —memory capacity

  7. Unit of Transfer • Internal memory —Usually governed by data bus width • External memory —Usually a block which is much larger than a word

  8. Access Methods (1) • Sequential —Start at the beginning and read through in order —Access time depends on location of data and previous location —e.g. tape • Direct —Individual blocks have unique address —Access is by jumping to vicinity plus sequential search —Access time depends on location of data and previous location —e.g. disk

  9. Access Methods (2) • Random —Each location has a unique address —Access time is independent of location or previous access —e.g. RAM • Associative —Data is retrieved based on a portion of its contents rather than its address —Access time is independent of location or previous access —e.g. cache

  10. Performance • Access time (latency) —For random-access memory – time between presenting the address and getting the valid data —For non-random-access memory – time to position the read-write head at the location • Memory cycle time (applies primarily to random-access memory) —Additional time may be required for the memory to “recover” before the next access – time for transients to die out on signal lines – time to regenerate data if they are read destructively —cycle time = access time + recovery time • Transfer rate —For random-access memory, equal to 1/(cycle time)

  11. Performance • For non-random-access memory, the following relationship holds: T_N = T_A + N/R where —T_N = average time to read or write N bits —T_A = average access time —N = number of bits —R = transfer rate, in bits per second (bps)
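Plugging illustrative numbers into the relationship (a Python sketch; the device parameters below are invented, not from the slides):

    # T_N = T_A + N / R for a non-random-access device
    T_A = 0.1          # average access (positioning) time, seconds -- assumed
    R = 1_000_000      # transfer rate, bits per second -- assumed
    N = 8 * 4096       # bits to transfer (a 4 KByte block)
    T_N = T_A + N / R  # 0.1 + 0.032768, about 0.1328 seconds
    print(f"T_N = {T_N:.4f} s")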

  12. Physical Types • Semiconductor —RAM, ROM • Magnetic —Disk, Tape • Optical —CD, CD-R, CD-RW, DVD

  13. Physical Characteristics • Volatile/Nonvolatile • Erasable/Nonerasable

  14. Questions on Memory Design • How much? —Capacity • How fast? —Time is money • How expensive?

  15. Hierarchy List • Registers • L1 Cache • L2 Cache • Main memory • Disk cache • Disk • Optical • Tape

  16. Memory Hierarchy - Diagram

  17. Going Down the Hierarchy • Decreasing cost per bit • Increasing capacity • Increasing access time • Decreasing frequency of access by the processor

  18. An Example • Suppose we have two levels of memory —L1 : 1,000 words, 0.01 us access time —L2 : 100,000 words, 0.1 us access time —H = fraction of all memory accesses found in L1 —T1 = access time to L1 —T2 = access time to L2 • Suppose H = 0.95 —average access time = H × T1 + (1 − H) × (T1 + T2) = (0.95)(0.01 us) + (0.05)(0.01 us + 0.1 us) = 0.0095 + 0.0055 = 0.015 us —the average is much closer to 0.01 us than to 0.1 us (see the sketch below)
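The same calculation as a short Python check, using the slide's numbers:

    # Two-level average access time: H*T1 + (1-H)*(T1 + T2)
    H, T1, T2 = 0.95, 0.01, 0.1            # hit ratio, L1 and L2 times (us)
    avg = H * T1 + (1 - H) * (T1 + T2)     # a miss pays T1 and then T2
    print(f"average access time = {avg:.3f} us")   # 0.015 us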

  19. Principle of Locality • Going down the hierarchy, the frequency of access by the processor decreases —this is possible because of the principle of locality • During the execution of a program, memory references tend to cluster —programs contain loops and procedures – there are repeated references to a small set of instructions —operations on arrays involve access to a clustered set of data – there are repeated references to a small set of data
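A minimal illustration of both kinds of locality (my example, not from the slides): the loop below re-executes the same few instructions on every iteration and touches array elements at consecutive addresses.

    # Temporal locality: the loop body is fetched over and over.
    # Spatial locality: iterating over data walks through adjacent memory.
    data = list(range(10_000))
    total = 0
    for x in data:
        total += x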

  20. 4.2 Cache Memory Principles • Cache —Small amount of fast memory local to processor —Sits between main memory and CPU

  21. Cache/Main Memory Structure

  22. Cache Read Operation • CPU requests contents of memory location • Check cache for this data • If present, get from cache (fast) • If not present, read required block from main memory to cache • Then deliver from cache to CPU • Cache includes tags to identify which block of main memory is in each cache slot
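The read flow above can be sketched as Python-style pseudocode; cache_lookup, cache_insert, read_block_from_memory, and offset_of are hypothetical helpers standing in for the hardware:

    # Control logic of a cache read: hit -> serve from cache,
    # miss -> fill the block from main memory first.
    def cache_read(cache, address):
        block = cache_lookup(cache, address)        # tag comparison (hypothetical)
        if block is None:                           # miss
            block = read_block_from_memory(address)
            cache_insert(cache, address, block)     # store block + its tag
        return block[offset_of(address)]            # deliver word to the CPU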

  23. Cache Read Operation

  24. 4.3 Elements of Cache Design • Design issues —Size —Mapping Function – direct, associative, set associative —Replacement Algorithm – LRU, FIFO, LFU, Random —Write Policy – Write through, write back —Line Size —Number of Caches – single or two level – unified or split

  25. Size Does Matter • Small enough to make it cost effective • Large enough for performance reasons —but larger caches tend to be slightly slower than small ones

  26. Mapping Function • Fewer cache lines than main memory blocks —mapping is needed —also need to know which memory block is in cache • Techniques —Direct —Associative —Set associative • Example case —Cache size : 64 KByte —Line size : 4 Bytes – cache is organized as 16 K lines —Main memory size : 16 Mbytes – each byte is directly addressable by a 24-bit address
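The field widths of this example case can be derived mechanically; a short Python sketch using the slide's parameters:

    # 64 KByte cache, 4-byte lines, 16 MByte byte-addressable memory
    cache_size = 64 * 1024
    line_size  = 4
    mem_size   = 16 * 1024 * 1024

    num_lines = cache_size // line_size            # 16384 lines (16K)
    w = (line_size - 1).bit_length()               # 2 word bits
    r = (num_lines - 1).bit_length()               # 14 line bits
    s = (mem_size // line_size - 1).bit_length()   # 22 block-identifier bits
    print(num_lines, w, r, s, s - r)               # 16384 2 14 22 8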

  27. Direct Mapping • Each main memory block maps into one and only one possible cache line • Mapping function: i = j modulo m where —i = cache line number —j = main memory block number —m = number of lines in the cache • Address is in three parts —Least significant w bits identify a unique word —Most significant s bits specify one memory block – these are split into a line (slot) field of r bits and a tag of s − r bits (most significant)

  28. Direct Mapping - Address Structure • Address length = (s + w) bits • Number of addressable units = 2^(s+w) words or bytes • Block size = line size = 2^w words or bytes • Number of blocks in main memory = 2^(s+w) / 2^w = 2^s • Number of lines in cache = m = 2^r • Size of tag = (s − r) bits

  29. Direct Mapping - Address Structure • Field layout, most significant to least: Tag = s − r = 8 bits | Line (slot) = r = 14 bits | Word = w = 2 bits • 24-bit address (22 + 2) • 2-bit word identifier (4 bytes in a block) • 22-bit block identifier —8-bit tag (= 22 − 14) —14-bit slot or line • No two blocks mapping into the same line have the same tag field
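A sketch of how a 24-bit address splits under this layout (decode_direct is a made-up name; the field widths are the ones from the example):

    # 8-bit tag | 14-bit line | 2-bit word
    def decode_direct(addr):
        word = addr & 0x3             # byte within the 4-byte block
        line = (addr >> 2) & 0x3FFF   # which of the 16384 cache lines
        tag  = addr >> 16             # distinguishes blocks sharing a line
        return tag, line, word

    print(decode_direct(0x000000))    # (0, 0, 0)
    print(decode_direct(0x010000))    # (1, 0, 0) -- same line, different tag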

  30. Direct Mapping - Cache Line Mapping • Cache line 0 holds main memory blocks 0, m, 2m, 3m, …, 2^s − m • Cache line 1 holds blocks 1, m+1, 2m+1, …, 2^s − m + 1 • … • Cache line m−1 holds blocks m−1, 2m−1, 3m−1, …, 2^s − 1

  31. Direct Mapping - Cache Line Mapping • Cache line 0 holds blocks starting at memory addresses 000000, 010000, …, FF0000 • Cache line 1 holds blocks starting at 000004, 010004, …, FF0004 • … • Cache line m−1 holds blocks starting at 00FFFC, 01FFFC, …, FFFFFC

  32. Direct Mapping - Cache Organization

  33. Direct Mapping Example

  34. Direct Mapping Pros & Cons • Simple and inexpensive to implement • Fixed cache location for any given block —If a program repeatedly accesses two blocks that map to the same line, they keep evicting each other and the miss rate is very high (thrashing)

  35. Associative Mapping • A main memory block can be loaded into any line of cache • Memory address is interpreted as a tag and a word field —Tag field uniquely identifies a block of memory • Every line’s tag is simultaneously examined for a match —Cache searching gets complex and expensive

  36. Associative Mapping - Address Structure • Address length = (s + w) bits • Number of addressable units = 2^(s+w) words or bytes • Block size = line size = 2^w words or bytes • Number of blocks in main memory = 2^(s+w) / 2^w = 2^s • Number of lines in cache = not determined by s or w • Size of tag = s bits

  37. Associative Mapping - Address Structure • Field layout: Tag = 22 bits | Word = 2 bits • 22-bit tag stored with each 32-bit block of data • Compare the address tag field with every tag entry in the cache to check for a hit • Least significant 2 bits of the address identify which byte is required from the 32-bit data block
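A sketch of the fully associative lookup (the dict plays the role of the parallel tag comparators; the names are made up):

    # 22-bit tag | 2-bit word -- no line field in a fully associative cache
    def decode_associative(addr):
        return addr >> 2, addr & 0x3          # (tag, word)

    cache = {}                                # tag -> 4-byte block
    tag, word = decode_associative(0x16339C)
    if tag in cache:                          # hardware checks all tags at once
        byte = cache[tag][word]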

  38. Fully Associative Cache Organization

  39. Associative Mapping - Example

  40. Associative Mapping Pros & Cons • Flexible as to which block to replace when a new block is read into the cache —need to select one which is not going to be used in the near future • Complex circuitry is required to examine the tags of all cache lines

  41. Set Associative Mapping • A compromise between the direct and associative methods • Cache is divided into a number of sets (v) • Each set contains a number of lines (k) • The relationships are: m = v × k and i = j modulo v where —i = cache set number —j = main memory block number —m = number of lines in the cache (a short sketch follows below)
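Set placement under these relationships, sketched in Python (the 2-way choice of k is illustrative; m carries over from the running example):

    m = 16384          # total cache lines (64 KByte cache, 4-byte lines)
    k = 2              # lines per set -- assumed 2-way for illustration
    v = m // k         # 8192 sets

    def set_of_block(j):
        return j % v   # block j may occupy any of the k lines of its set

    print(set_of_block(0), set_of_block(8192))   # both map to set 0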
