Lecture 24: Virtual Memory, Multiprocessors

• Today’s topics:
    Virtual memory
    Multiprocessors, cache coherence
Virtual Memory

• Processes deal with virtual memory – they have the illusion that a very large address space is available to them
• There is only a limited amount of physical memory that is shared by all processes – a process places part of its virtual memory in this physical memory and the rest is stored on disk (called swap space)
• Thanks to locality, disk access is likely to be uncommon
• The hardware ensures that one process cannot access the memory of a different process
Virtual Memory

[Figure illustrating virtual memory]
Address Translation

• The virtual and physical memory are broken up into pages

[Figure: with an 8KB page size, a virtual address splits into a virtual page number and a 13-bit page offset; the virtual page number is translated to a physical page number, which is combined with the unchanged offset to form the physical address]
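To make the split concrete, here is a minimal sketch in C of how an 8KB-page machine divides a virtual address into a virtual page number and a 13-bit offset (the address width and example value are illustrative assumptions):

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SIZE   8192u   /* 8KB pages, as above  */
    #define OFFSET_BITS 13      /* log2(8192) = 13 bits */

    int main(void) {
        uint64_t vaddr  = 0x2000ABCull;             /* example virtual address  */
        uint64_t vpn    = vaddr >> OFFSET_BITS;     /* virtual page number      */
        uint64_t offset = vaddr & (PAGE_SIZE - 1);  /* low 13 bits pass through */
        printf("vpn = 0x%llx, offset = 0x%llx\n",
               (unsigned long long)vpn, (unsigned long long)offset);
        return 0;
    }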
Memory Hierarchy Properties

• A virtual memory page can be placed anywhere in physical memory (fully-associative)
• Replacement is usually LRU (since the miss penalty is huge, we can invest some effort to minimize misses)
• A page table (indexed by virtual page number) is used for translating virtual to physical page number
• The page table is itself in memory
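As a sketch of the translation step itself – assuming a flat page table indexed by virtual page number, with a hypothetical pte_t entry type (real page tables are hierarchical, but the idea is the same):

    #include <stdint.h>
    #include <stdbool.h>

    #define OFFSET_BITS 13

    /* Hypothetical page table entry: a valid bit plus the physical page number. */
    typedef struct {
        bool     valid;   /* false => page is on disk (page fault) */
        uint64_t ppn;     /* physical page number                  */
    } pte_t;

    /* Translate a virtual address by indexing the page table with the
     * virtual page number; returns -1 to signal a page fault. */
    uint64_t translate(const pte_t *page_table, uint64_t vaddr) {
        uint64_t vpn    = vaddr >> OFFSET_BITS;
        uint64_t offset = vaddr & ((1ULL << OFFSET_BITS) - 1);
        if (!page_table[vpn].valid)
            return (uint64_t)-1;        /* OS must fetch the page from swap */
        return (page_table[vpn].ppn << OFFSET_BITS) | offset;
    }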
TLB

• Since the number of pages is very high, the page table capacity is too large to fit on chip
• A translation lookaside buffer (TLB) caches the virtual to physical page number translation for recent accesses
• A TLB miss requires us to access the page table, which may not even be found in the cache – two expensive memory look-ups to access one word of data!
• A large page size can increase the coverage of the TLB and reduce the capacity of the page table, but also increases memory waste
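For a sense of scale (numbers are illustrative): a 64-entry TLB with 8KB pages covers only 64 x 8KB = 512KB of the address space. A minimal sketch of such a TLB as a small direct-mapped cache of translations, with hypothetical names:

    #include <stdint.h>
    #include <stdbool.h>

    #define TLB_ENTRIES 64

    /* Hypothetical direct-mapped TLB entry. */
    typedef struct {
        bool     valid;
        uint64_t vpn;   /* tag: which virtual page this entry maps */
        uint64_t ppn;   /* the cached translation                  */
    } tlb_entry_t;

    static tlb_entry_t tlb[TLB_ENTRIES];

    /* Return true on a TLB hit; on a miss the page table must be walked
     * and the translation installed with tlb_refill(). */
    bool tlb_lookup(uint64_t vpn, uint64_t *ppn) {
        tlb_entry_t *e = &tlb[vpn % TLB_ENTRIES];  /* index with low vpn bits */
        if (e->valid && e->vpn == vpn) {
            *ppn = e->ppn;
            return true;
        }
        return false;
    }

    void tlb_refill(uint64_t vpn, uint64_t ppn) {
        tlb_entry_t *e = &tlb[vpn % TLB_ENTRIES];
        e->valid = true;
        e->vpn   = vpn;
        e->ppn   = ppn;
    }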
TLB and Cache

• Is the cache indexed with virtual or physical address?
    To index with a physical address, we will have to first look up the TLB, then the cache – longer access time
    Multiple virtual addresses can map to the same physical address – must ensure that these different virtual addresses will map to the same location in cache – else, there will be two different copies of the same physical memory word
• Does the tag array store virtual or physical addresses?
    Since multiple virtual addresses can map to the same physical address, a virtual tag comparison can flag a miss even if the correct physical memory word is present
Cache and TLB Pipeline

[Figure: the virtual page number indexes the TLB while the page offset supplies the cache index; the TLB produces the physical page number, which is compared against the physical tags read from the tag array in parallel with the data array access]

Virtually Indexed; Physically Tagged Cache
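As a worked example of why this organization avoids the problems above: with 8KB pages, the low 13 bits of the address are untranslated, so any cache whose index and block-offset bits fit within 13 bits can be indexed before the TLB responds. A 32KB 4-way set-associative cache with 64B blocks has 8KB per way – 6 block-offset bits plus 7 index bits, exactly 13 – so the index is available immediately while the TLB supplies the physical page number for the tag comparison.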
Bad Events

• Consider the longest latency possible for a load instruction:
    TLB miss: must look up page table to find translation for v.page P
    Calculate the virtual memory address for the page table entry that has the translation for page P – let’s say, this is v.page Q
    TLB miss for v.page Q: will require navigation of a hierarchical page table (let’s ignore this case for now and assume we have succeeded in finding the physical memory location (R) for page Q)
    Access memory location R (find this either in L1, L2, or memory)
    We now have the translation for v.page P – put this into the TLB
    We now have a TLB hit and know the physical page number – this allows us to do tag comparison and check the L1 cache for a hit
    If there’s a miss in L1, check L2 – if that misses, check in memory
    At any point, if the page table entry claims that the page is on disk, flag a page fault – the OS then copies the page from disk to memory and the hardware resumes what it was doing before the page fault
    … phew!
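Putting the common-case path together – a sketch of a load, reusing the hypothetical translate(), tlb_lookup(), and tlb_refill() helpers from the earlier sketches (the hierarchical walk and the v.page Q complication are elided, as in the slide):

    /* Sketch of the load path: returns the physical address used to
     * probe L1/L2/memory, or -1 if a page fault must be handled first. */
    uint64_t load_path(const pte_t *page_table, uint64_t vaddr) {
        uint64_t vpn = vaddr >> OFFSET_BITS;
        uint64_t ppn;
        if (!tlb_lookup(vpn, &ppn)) {                       /* TLB miss for v.page P */
            uint64_t paddr = translate(page_table, vaddr);  /* walk the page table   */
            if (paddr == (uint64_t)-1)
                return (uint64_t)-1;  /* page fault: OS copies page in, then retry   */
            ppn = paddr >> OFFSET_BITS;
            tlb_refill(vpn, ppn);                           /* install translation   */
        }
        /* TLB hit (possibly after refill): form the physical address and
         * check L1, then L2, then memory. */
        return (ppn << OFFSET_BITS) | (vaddr & ((1ULL << OFFSET_BITS) - 1));
    }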
Multiprocessor Taxonomy

• SISD: single instruction and single data stream: uniprocessor
• MISD: no commercial multiprocessor: imagine data going through a pipeline of execution engines
• SIMD: vector architectures: lower flexibility
• MIMD: most multiprocessors today: easy to construct with off-the-shelf computers, most flexibility
Memory Organization - I

• Centralized shared-memory multiprocessor or Symmetric shared-memory multiprocessor (SMP)
• Multiple processors connected to a single centralized memory – since all processors see the same memory organization, this is uniform memory access (UMA)
• Shared-memory because all processors can access the entire memory address space
• Can centralized memory emerge as a bandwidth bottleneck? – not if you have large caches and employ fewer than a dozen processors
SMPs or Centralized Shared-Memory

[Figure: four processors, each with its own caches, connected by a shared bus to main memory and the I/O system]
Memory Organization - II

• For higher scalability, memory is distributed among processors – distributed memory multiprocessors
• If one processor can directly address the memory local to another processor, the address space is shared – distributed shared-memory (DSM) multiprocessor
• If memories are strictly local, we need messages to communicate data – cluster of computers or multicomputers
• Non-uniform memory architecture (NUMA) since local memory has lower latency than remote memory
Distributed Memory Multiprocessors

[Figure: four nodes, each with a processor and caches plus local memory and I/O, connected by an interconnection network]
SMPs

• Centralized main memory and many caches – many copies of the same data
• A system is cache coherent if a read returns the most recently written value for that word

    Time  Event                 Value of X in
                                Cache-A  Cache-B  Memory
    0     -                     -        -        1
    1     CPU-A reads X         1        -        1
    2     CPU-B reads X         1        1        1
    3     CPU-A stores 0 in X   0        1        0

After time 3, Cache-B still holds the stale value 1 – this system is not coherent.
Cache Coherence

A memory system is coherent if:
• P writes to X; no other processor writes to X; P reads X and receives the value previously written by P
• P1 writes to X; no other processor writes to X; sufficient time elapses; P2 reads X and receives value written by P1
• Two writes to the same location by two processors are seen in the same order by all processors – write serialization
• The memory consistency model defines how much “time must elapse” before a write by one processor is seen by the others
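For example, if P1 writes 1 to X and P2 then writes 2 to X, write serialization guarantees that no processor ever reads the value 2 and later reads the value 1 – every processor observes the two writes in the same order.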
Cache Coherence Protocols

• Directory-based: A single location (directory) keeps track of the sharing status of a block of memory
• Snooping: Every cache block is accompanied by the sharing status of that block – all cache controllers monitor the shared bus so they can update the sharing status of the block, if necessary
    Write-invalidate: a processor gains exclusive access of a block before writing by invalidating all other copies
    Write-update: when a processor writes, it updates other shared copies of that block
Snooping-Based Protocols

• Three states for a block: invalid, shared, modified
• A write is placed on the bus and sharers invalidate themselves
• The protocols are referred to as MSI, MESI, etc.

[Figure: four processors with caches on a shared bus to main memory and I/O, as in the earlier SMP diagram]
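A minimal sketch of the MSI transitions in C, assuming hypothetical bus helper functions (bus_read_request() etc. stand in for the shared-bus transactions; a real controller would also carry the data):

    typedef enum { INVALID, SHARED, MODIFIED } msi_state_t;

    /* Hypothetical bus transactions broadcast to all snooping caches. */
    void bus_read_request(void);
    void bus_invalidate_others(void);
    void bus_supply_data_and_writeback(void);

    /* Processor-side transitions for one block. */
    msi_state_t on_cpu_read(msi_state_t s) {
        if (s == INVALID) {
            bus_read_request();        /* memory or the owner responds */
            return SHARED;
        }
        return s;                      /* SHARED/MODIFIED: read hit    */
    }

    msi_state_t on_cpu_write(msi_state_t s) {
        if (s != MODIFIED)
            bus_invalidate_others();   /* gain exclusive access first  */
        return MODIFIED;
    }

    /* Snoop-side transitions when another cache's request is observed. */
    msi_state_t on_bus_read(msi_state_t s) {
        if (s == MODIFIED) {
            bus_supply_data_and_writeback();  /* downgrade, share data */
            return SHARED;
        }
        return s;
    }

    msi_state_t on_bus_write(msi_state_t s) {
        return INVALID;                /* another processor wants to write */
    }

The examples on the next two slides trace exactly these transitions.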
Example

• P1 reads X: not found in cache-1, request sent on bus, memory responds, X is placed in cache-1 in shared state
• P2 reads X: not found in cache-2, request sent on bus, everyone snoops this request, cache-1 does nothing because this is just a read request, memory responds, X is placed in cache-2 in shared state
• P1 writes X: cache-1 has data in shared state (shared only provides read perms), request sent on bus, cache-2 snoops and then invalidates its copy of X, cache-1 moves its state to modified
• P2 reads X: cache-2 has data in invalid state, request sent on bus, cache-1 snoops and realizes it has the only valid copy, so it downgrades itself to shared state and responds with data, X is placed in cache-2 in shared state, memory is also updated

[Figure: P1 and P2, each with a cache (Cache-1, Cache-2), connected by a bus to main memory]
Example

    Request    Cache     Request     Who responds              State in  State in  State in  State in
               Hit/Miss  on the bus                            Cache 1   Cache 2   Cache 3   Cache 4
    (initial)  -         -           -                         Inv       Inv       Inv       Inv
    P1: Rd X   Miss      Rd X        Memory                    S         Inv       Inv       Inv
    P2: Rd X   Miss      Rd X        Memory                    S         S         Inv       Inv
    P2: Wr X   Perms     Upgrade X   No response; other        Inv       M         Inv       Inv
               Miss                  caches invalidate
    P3: Wr X   Write     Wr X        P2 responds               Inv       Inv       M         Inv
               Miss
    P3: Rd X   Read Hit  -           -                         Inv       Inv       M         Inv
    P4: Rd X   Read      Rd X        P3 responds;              Inv       Inv       S         S
               Miss                  memory writeback