  1. Cache Memory — Chapter 17, S. Dandamudi

     Outline
     • Introduction
     • How cache memory works
     • Why cache memory works
     • Cache design basics
     • Mapping function
       ∗ Direct mapping
       ∗ Associative mapping
       ∗ Set-associative mapping
     • Replacement policies
     • Write policies
     • Types of cache misses
     • Types of caches
     • Example implementations
       ∗ Pentium
       ∗ PowerPC
       ∗ MIPS
     • Cache operation summary
     • Design issues
       ∗ Cache capacity
       ∗ Cache line size
       ∗ Degree of associativity
     • Space overhead

     © 2003 S. Dandamudi. To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.

  2. Introduction

     • Memory hierarchy
       ∗ Registers
       ∗ Memory
       ∗ Disk
       ∗ …
     • Cache memory is a small amount of fast memory
       ∗ Placed between two levels of the memory hierarchy
         » To bridge the gap in access times
           – Between processor and main memory (our focus)
           – Between main memory and disk (disk cache)
       ∗ Expected to behave like a large amount of fast memory

  3. How Cache Memory Works

     • Prefetch data into the cache before the processor needs it
       ∗ Need to predict the processor's future access requirements
         » Not difficult owing to locality of reference
     • Important terms
       ∗ Miss penalty
       ∗ Hit ratio
       ∗ Miss ratio = (1 − hit ratio)
       ∗ Hit time

     [Figure: Cache read operation]

  4. How Cache Memory Works (cont’d)

     [Figure: Cache write operation]

     Why Cache Memory Works

     • Example
         for (i = 0; i < M; i++)
             for (j = 0; j < N; j++)
                 X[i][j] = X[i][j] + K;
       ∗ Each element of X is a double (eight bytes)
       ∗ The loop body is executed (M * N) times
         » Placing the code in cache avoids access to main memory
           – Repetitive use (one of the factors)
           – Temporal locality
         » Prefetching data
           – Spatial locality

  5. How Cache Memory Works (cont’d)

     [Figure: Execution time (ms) versus matrix size (500–1000) for row-order and column-order traversal; column-order is markedly slower]

     Cache Design Basics

     • On every read miss
       ∗ A fixed number of bytes is transferred
         » More than what the processor needs
           – Effective due to spatial locality
     • The cache is divided into blocks of B bytes
       ∗ b bits are needed as the offset into the block: b = log2 B
       ∗ Blocks are called cache lines
     • Main memory is also divided into blocks of the same size
       ∗ An address is divided into two parts

  6. Cache Design Basics (cont’d)

     [Figure: Address division with B = 4 bytes, b = 2 bits]

     Cache Design Basics (cont’d)

     • Transfer between main memory and cache
       ∗ In units of blocks
       ∗ Implements spatial locality
     • Transfer between cache and processor
       ∗ In units of words
     • Need policies for
       ∗ Block placement (mapping function)
       ∗ Block replacement
       ∗ Write policies

  7. Cache Design Basics (cont’d)

     [Figure: Read cycle operations]

     Mapping Function

     • Determines how memory blocks are mapped to cache lines
     • Three types
       ∗ Direct mapping
         » Specifies a single cache line for each memory block
       ∗ Set-associative mapping
         » Specifies a set of cache lines for each memory block
       ∗ Associative mapping
         » No restrictions
           – Any cache line can be used for any memory block

  8. Mapping Function (cont’d)

     [Figure: Direct mapping example]

     Mapping Function (cont’d)

     • Implementing direct mapping
       ∗ Easier than the other two
       ∗ Maintains three pieces of information
         » Cache data
           – Actual data
         » Cache tag
           – Problem: there are more memory blocks than cache lines, so several memory blocks map to each cache line
           – The tag stores the address of the memory block currently in the cache line
         » Valid bit
           – Indicates whether the cache line contains a valid block

  9. Mapping Function (cont’d)

     Mapping Function (cont’d)

     Direct mapping
     Reference pattern: 0, 4, 0, 8, 0, 8, 0, 4, 0, 4, 0, 4
     Hit ratio = 0%

  10. Mapping Function (cont’d)

     Direct mapping
     Reference pattern: 0, 7, 9, 10, 0, 7, 9, 10, 0, 7, 9, 10
     Hit ratio = 67%

     Mapping Function (cont’d)

     [Figure: Associative mapping]

  11. Mapping Function (cont’d)

     Associative mapping
     Reference pattern: 0, 4, 0, 8, 0, 8, 0, 4, 0, 4, 0, 4
     Hit ratio = 75%

     Mapping Function (cont’d)

     [Figure: Address match logic for associative mapping]

  12. Mapping Function (cont’d)

     [Figure: Associative cache with address match logic]

     Mapping Function (cont’d)

     [Figure: Set-associative mapping]

  13. Mapping Function (cont’d)

     [Figure: Address partition in set-associative mapping]

     Mapping Function (cont’d)

     Set-associative mapping
     Reference pattern: 0, 4, 0, 8, 0, 8, 0, 4, 0, 4, 0, 4
     Hit ratio = 67%

  14. Replacement Policies

     • We invoke the replacement policy
       ∗ When there is no room in the cache to load the memory block
     • Depends on the actual placement policy in effect
       ∗ Direct mapping does not need a special replacement policy
         » Replace the mapped cache line
       ∗ Several policies exist for the other two mapping functions
         » Popular: LRU (least recently used)
         » Random replacement
         » Of less interest: FIFO, LFU

     Replacement Policies (cont’d)

     • LRU
       ∗ Expensive to implement
         » Particularly for set sizes greater than four
     • Implementations resort to approximation
       ∗ Pseudo-LRU
         » Partitions the lines of a set into two groups
           – Tracks which group has been accessed more recently
           – Each such partition requires only one bit
         » In total requires only (W − 1) bits (W = degree of associativity)
           – The PowerPC is an example (details later)
