Cache Memory
Raul Queiroz Feitosa
Content
• Memory Hierarchy
• Principle of Locality
• Some Definitions
• Cache Architectures
  • Fully Associative
  • Direct Mapping
  • Set Associative
• Replacement Policy
• Main Memory Update Policy
Memory Hierarchy
• Tradeoff: cost vs. speed
• Memory is split into hierarchical levels; moving down the hierarchy, capacity and access time grow while access probability, speed, and cost/bit shrink.
• A request is sent to the next level below until it can be carried out.
[Figure: memory hierarchy pyramid labeled with capacity, access time, access probability, cost/bit, and speed]
Cache Operation – Overview
• CPU requests contents of memory location
• Check cache for this data
• If present, get from cache (fast)
• If not present, read required block from main memory into cache
• Then deliver from cache to CPU
• Cache includes tags to identify which block of main memory is in each cache slot
Cache Read Operation
[Figure: flow chart of a cache read operation]
Cache and Main Memory
[Figure: the cache/main memory organization assumed from now on]
Cache Addressing
Where does the cache sit?
• Between processor and virtual memory management unit (MMU)
• Between MMU and main memory
Logical (virtual) cache stores data using virtual addresses
• Processor accesses the cache directly, not through the MMU
• Cache access is faster: the cache can be accessed before MMU address translation
• Virtual addresses use the same address space for different applications
  • Must flush the cache on each context switch
Physical cache stores data using main memory physical addresses
Principle of Locality
• Spatial: the processor tends to access a few restricted areas of the address space.
• Temporal: the processor tends to access in the near future addresses accessed in the recent past.
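As a concrete illustration (the variable names are ours, not the slides'), a simple summation loop exhibits both kinds of locality:

```python
# A minimal illustration of both kinds of locality.

data = list(range(1024))

total = 0                    # 'total' is touched on every iteration: temporal locality
for i in range(len(data)):   # consecutive elements live at consecutive
    total += data[i]         # addresses: spatial locality
```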
Definitions
• Hit: access served by the cache
• Miss: access not served by the cache
• Hit ratio: proportion of accesses served by the cache
  h = (number of accesses served by the cache) / (total number of accesses)
• Miss ratio: proportion of accesses not served by the cache
  m = (number of accesses not served by the cache) / (total number of accesses)
• Clearly m + h = 1
Definitions
Example: let h be the hit ratio, t_hit the access time on a hit, and t_miss the access time on a miss. The average memory access time t will be:
  t = h · t_hit + (1 − h) · t_miss
[Figure: t falls linearly from t_miss at h = 0 to t_hit at h = 1]
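A quick worked computation of this formula (the numeric values are illustrative assumptions, not from the slides):

```python
# Average memory access time: t = h*t_hit + (1-h)*t_miss.

def avg_access_time(h, t_hit, t_miss):
    """Average memory access time for hit ratio h."""
    return h * t_hit + (1 - h) * t_miss

print(avg_access_time(h=0.95, t_hit=1.0, t_miss=50.0))  # 3.45 (time units)
```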
Definitions
Block: a set of 2^b bytes at consecutive addresses, starting at an address whose b least significant bits are zero. Note that the addresses of the bytes belonging to the same block coincide to the left of the b least significant bits.

address                content
00000000 … 00000111    block 0
00001000 … 00001111    block 1

The data exchange between the cache and the main memory is carried out block-by-block. Does it make sense?
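A minimal sketch of how an address splits into block number and offset (the function name and parameters are ours; the block size is 2^b bytes):

```python
def split_address(addr: int, b: int):
    block_number = addr >> b           # drop the b least significant bits
    offset = addr & ((1 << b) - 1)     # keep only the b least significant bits
    return block_number, offset

# With b = 3 (8-byte blocks), addresses 00001000-00001111 all fall in block 1:
print(split_address(0b00001001, 3))    # (1, 1)
```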
Fully Associative Cache Architecture
The cache consists of 2^L lines (0 … 2^L − 1). Each line holds:
• a valid bit: indicates whether the line contains a valid block copy
• a TAG field: contains the number of the block copied into that line
• a VALUE field: contains a copy of the memory block
Fully Associative Cache Operation
Address generated by the CPU (a bits):
• bits a−1 … b: block number of the addressed byte/word
• bits b−1 … 0: point to the byte/word within the block
The cache controller compares the block number with the TAG field of all lines simultaneously (associative search). If a TAG matches the block number and the valid bit is "on", it is a hit; otherwise it is a miss. The b least significant bits point to the byte/word within the block.
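The sketch below is a minimal software model of this lookup with an assumed line structure; real hardware compares all TAGs in parallel (one comparator per line), whereas this loop checks them sequentially:

```python
class FullyAssociativeCache:
    def __init__(self, num_lines: int, b: int):
        self.b = b                                   # block size is 2**b bytes
        self.lines = [{"valid": False, "tag": None, "value": None}
                      for _ in range(num_lines)]

    def lookup(self, addr: int):
        tag = addr >> self.b                         # block number
        offset = addr & ((1 << self.b) - 1)          # byte within the block
        for line in self.lines:                      # associative search
            if line["valid"] and line["tag"] == tag:
                return line["value"][offset]         # hit
        return None                                  # miss
```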
Fully Associative Cache
Problem: comparing the block number with the TAG fields of all cache lines simultaneously (associative search) requires lots of comparators.
Consequence: the fully associative design is only used for small-capacity caches.
Direct Mapped Caches
Basic idea: assign each main memory block to a single cache line.
[Figure: mapping function f from main memory blocks onto cache lines; many blocks map to the same line]
Direct Mapped Caches
Basic idea: each main memory block can only be loaded into the cache line it is mapped to. Thus it is no longer necessary to check all lines, just one.
Direct Mapped Caches
Operation — address generated by the CPU (a bits):
• bits a−1 … b+L: to be compared with the TAG field
• bits b+L−1 … b (L bits): point to a cache line
• bits b−1 … 0: point to a byte/word within the block
The cache controller compares the leftmost address field with the TAG field of the (single) cache line selected by the L bits.
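A minimal sketch of this lookup (parameter names follow the address fields above; the line structure is an assumption):

```python
def direct_mapped_lookup(lines, addr: int, b: int, L: int):
    offset = addr & ((1 << b) - 1)          # byte/word within the block
    index = (addr >> b) & ((1 << L) - 1)    # L bits select the single line
    tag = addr >> (b + L)                   # remaining bits form the TAG
    line = lines[index]
    if line["valid"] and line["tag"] == tag:
        return line["value"][offset]        # hit: only one comparison needed
    return None                             # miss
```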
Direct Mapped Caches
Problem: some lines may be requested often by different blocks, while other lines are rarely requested, which results in non-optimal use of the cache capacity.
Set Associative Caches
Basic idea: instead of assigning each main memory block to a single cache line, assign each block to a set (associative) of cache lines.
[Figure: mapping function f from main memory blocks onto associative sets of cache lines]
Set Associative Caches
Basic idea: a block may be loaded into any cache line of the associative set it is assigned to.
[Figure: same mapping as before, highlighting that any line of the target set may receive the block]
Set Associative Caches — Architecture
The cache consists of 2^S sets (0 … 2^S − 1). Each set contains 2^c lines (line 0 … line 2^c − 1), and each line holds a valid bit, a TAG field, and a VALUE field.
Set Associative Caches
Operation: assume there are 2^S sets. Address generated by the CPU (a bits):
• bits a−1 … b+S: to be compared with the TAG field
• bits b+S−1 … b (S bits): point to a set
• bits b−1 … 0: point to a byte/word within the block
The cache controller compares the leftmost address field with the TAG fields of all lines of the associative set selected by the S bits (associative search).
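A minimal sketch of this lookup (assumed structure: `sets` is a list of lists of line dictionaries):

```python
def set_associative_lookup(sets, addr: int, b: int, S: int):
    offset = addr & ((1 << b) - 1)             # byte/word within the block
    set_index = (addr >> b) & ((1 << S) - 1)   # S bits select the set
    tag = addr >> (b + S)                      # remaining bits form the TAG
    for line in sets[set_index]:               # one comparator per line of the set
        if line["valid"] and line["tag"] == tag:
            return line["value"][offset]       # hit
    return None                                # miss
```

Note that with a single set (S = 0) this degenerates into the fully associative lookup, and with a single line per set into the direct-mapped lookup — exactly the equivalences stated on the next slide.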
Set Associative Caches
Fully associative caches are set associative caches with a single associative set. Direct mapped caches are set associative caches whose "associative" sets each contain a single line.
Set Associative Caches
Set size: keep the overall cache capacity constant and change the number of lines per set: 2^0 (direct mapped), 2^1 (two-way), 2^2 (four-way), 2^3 (eight-way), …, fully associative. Above 4 lines/set the miss ratio does not change significantly.
[Figure: miss ratio vs. lines per set]
Set Associative Caches
[Figure: hit ratio vs. cache size (1k–1M bytes) for direct-mapped and 2-, 4-, 8-, and 16-way set associative caches]
Replacement Policy
Least Recently Used (LRU): the least recently used line is evicted from the cache to make room for a new main memory block.
Pseudo-LRU — example: a four-way set associative cache uses three bits arranged as a binary tree:
• bit I0 points to the least recently used half of the set
• bit I1 points to the least recently used line in one half
• bit I2 points to the least recently used line in the other half
The least recently used line of the least recently used half is elected to leave the cache.
Replacement Policy
• Example: a four-way set associative cache; the lines in the set are initially empty. Access sequence to the set: a b c d a e b e f.
• True LRU: the misses at e, b, and f evict b, c, and d respectively, leaving the set with a, e, b, f.
• Pseudo-LRU: the tree bits only approximate recency, so its decisions can differ; with the bit convention assumed in the sketch below, the miss at f evicts a instead of d.
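The following sketch traces both policies on this sequence; the fill-order and "bit points to the LRU side" conventions are assumptions:

```python
from collections import OrderedDict

def lru_trace(accesses, ways=4):
    """True LRU: the OrderedDict keeps lines ordered from LRU to MRU."""
    cache = OrderedDict()
    for block in accesses:
        if block in cache:
            cache.move_to_end(block)        # hit: becomes most recently used
        else:
            if len(cache) == ways:
                cache.popitem(last=False)   # miss: evict the LRU line
            cache[block] = True
    return list(cache)                      # LRU ... MRU

class PseudoLRU4:
    """Four-way pseudo-LRU: bits i0/i1/i2 form the binary tree of the
    previous slide; each bit points to the least recently used side."""
    def __init__(self):
        self.lines = [None] * 4
        self.i0 = self.i1 = self.i2 = 0

    def _touch(self, way):
        # Point every bit on the path away from the line just used.
        self.i0 = 0 if way >= 2 else 1      # used right half -> left is LRU
        if way < 2:
            self.i1 = 1 - way               # used line 0 -> line 1 is LRU
        else:
            self.i2 = 3 - way               # used line 2 -> line 3 is LRU

    def access(self, block):
        if block in self.lines:
            self._touch(self.lines.index(block))            # hit
            return
        if None in self.lines:
            way = self.lines.index(None)                    # fill empty line
        else:
            way = 2 * self.i0 + (self.i1 if self.i0 == 0 else self.i2)
        self.lines[way] = block                             # replace victim
        self._touch(way)

accesses = list("abcdaebef")
print("LRU :", lru_trace(accesses))   # ['a', 'b', 'e', 'f'] -- d was evicted
plru = PseudoLRU4()
for x in accesses:
    plru.access(x)
print("PLRU:", plru.lines)            # ['f', 'b', 'e', 'd'] -- a was evicted
```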
Main Memory Update Policy
• Write Through: all writes are carried out in the cache and in the main memory. The CPU is not stalled until the main memory update completes.
• Problem: lots of traffic, which is especially harmful in multiprocessors; roughly 15% of memory references are writes.
Main Memory Update Policy
• Write Back: each cache line has a dirty bit that, when set (=1), indicates that the block copy in the cache differs from the main memory.
  • When the block is brought from main memory into the cache, dirty = 0.
  • All writes are performed in the cache only, and in this case dirty = 1.
  • The main memory is updated when a block selected for replacement has dirty = 1.
• I/O must access main memory through the cache
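A minimal write-back sketch (the single-line structure and names are assumptions): writes go to the cache and set the dirty bit, and main memory is updated only when a dirty block is evicted:

```python
class WriteBackLine:
    def __init__(self):
        self.valid = False
        self.dirty = False
        self.tag = None      # block number of the cached block
        self.value = None    # the block's contents

def write_block(line, tag, value, memory):
    """Write 'value' to block 'tag' through a single-line write-back cache."""
    if not (line.valid and line.tag == tag):       # miss: replace the block
        if line.valid and line.dirty:
            memory[line.tag] = line.value          # write the dirty block back
        line.tag, line.valid = tag, True
        line.value = memory.get(tag)               # block brought in: dirty = 0
        line.dirty = False
    line.value = value                             # write hits the cache only...
    line.dirty = True                              # ...and marks the line dirty

memory = {0: "old"}
line = WriteBackLine()
write_block(line, 0, "new", memory)
print(memory[0], line.dirty)   # 'old' True -- main memory not updated yet
```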