virtual memory




  1. 4/27/17

     Virtual Memory
     Samira Khan
     Apr 27, 2017

     Virtual Memory
     • Idea: Give the programmer the illusion of a large address space while having a small physical memory
       • So that the programmer does not worry about managing physical memory
     • Programmer can assume he/she has “infinite” amount of physical memory
     • Hardware and software cooperatively and automatically manage the physical memory space to provide the illusion
       • Illusion is maintained for each independent process

     Overview of Paging
     [Figure: Process 1 and Process 2 each have a 4GB virtual address space divided into virtual pages; both map onto the physical page frames of one 16MB physical memory.]

     Basic Mechanism
     • Indirection (in addressing)
     • Address generated by each instruction in a program is a “virtual address”
       • i.e., it is not the physical address used to address main memory
     • An “address translation” mechanism maps this address to a “physical address”
       • Address translation mechanism can be implemented in hardware and software together

     “At the heart [...] is the notion that ‘address’ is a concept distinct from ‘physical location.’” Peter Denning

  2. Review: Virtual Memory & Physical Memory
     [Figure: A memory-resident page table (DRAM) with entries PTE 0 through PTE 7. Each PTE holds a valid bit and either a physical page number or a disk address. Valid entries point to pages cached in physical memory (DRAM, PP 0 to PP 3, holding VP 1, VP 2, VP 7, and VP 4); invalid entries are null or point to pages in virtual memory on disk (VP 1, VP 2, VP 3, VP 4, VP 6, VP 7).]
     • A page table contains page table entries (PTEs) that map virtual pages to physical pages.

     Translation
     • Assume: Virtual Page 7 is mapped to Physical Page 32
     • For an access to Virtual Page 7 …

         Virtual Address:    bits 31..12  VPN = 0000000111    bits 11..0  Offset = 011001
                             (translated)
         Physical Address:   bits 27..12  PPN = 0000100000    bits 11..0  Offset = 011001

     Address Translation: Page Hit
     [Figure: Inside the CPU chip, the CPU sends the VA to the MMU (1); the MMU sends the PTEA to cache/memory (2) and receives the PTE (3); the MMU sends the PA to cache/memory (4); cache/memory returns the data to the CPU (5).]
     1) Processor sends virtual address to MMU
     2-3) MMU fetches PTE from page table in memory
     4) MMU sends physical address to cache/memory
     5) Cache/memory sends data word to processor

     Address Translation With a Page Table
     [Figure: The virtual address splits into a virtual page number (VPN, bits n-1..p) and a virtual page offset (VPO, bits p-1..0). The page table base register (PTBR; CR3 in x86) holds the physical page table address for the current process. The VPN indexes the page table, whose entries hold a valid bit and a physical page number (PPN). The physical address is the PPN (bits m-1..p) followed by the physical page offset (PPO, bits p-1..0).]
     • Valid bit = 0: Page not in memory (page fault)
     • Valid bit = 1: The PTE supplies the PPN
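
The "Translation" slide above can be sketched in C as a single-level lookup. This is a minimal illustration, not the slides' own code: the `pte_t` layout, the 16-entry table size, and the function name `translate` are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT       12u                          /* 4KB pages: 12 offset bits */
#define PAGE_OFFSET_MASK ((1u << PAGE_SHIFT) - 1u)
#define NUM_VPAGES       16u                          /* tiny table, illustration only */

/* Hypothetical PTE layout: a valid bit plus a physical page number. */
typedef struct {
    bool     valid;
    uint32_t ppn;
} pte_t;

/* Translate va through page_table. Returns true and stores the physical
 * address in *pa on a page hit; returns false where real hardware would
 * raise a page fault. */
static bool translate(const pte_t *page_table, uint32_t va, uint32_t *pa)
{
    uint32_t vpn    = va >> PAGE_SHIFT;        /* upper bits select the PTE */
    uint32_t offset = va & PAGE_OFFSET_MASK;   /* lower bits pass through unchanged */

    assert(vpn < NUM_VPAGES);
    if (!page_table[vpn].valid)
        return false;                          /* valid bit = 0: page fault */

    *pa = (page_table[vpn].ppn << PAGE_SHIFT) | offset;
    return true;
}
```

With PTE 7 set to PPN 32, as on the slide, virtual address 0x7019 translates to physical address 0x20019: only the page number changes, the offset is copied.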

  3. Address Translation: Page Fault
     [Figure: Inside the CPU chip, the CPU sends the VA to the MMU (1); the MMU sends the PTEA to cache/memory (2) and receives the PTE (3); the MMU raises an exception to the page fault handler (4); the handler writes the victim page to disk (5) and brings the new page into memory (6); control returns to the CPU (7).]
     1) Processor sends virtual address to MMU
     2-3) MMU fetches PTE from page table in memory
     4) Valid bit is zero, so MMU triggers page fault exception
     5) Handler identifies victim (and, if dirty, pages it out to disk)
     6) Handler pages in new page and updates PTE in memory
     7) Handler returns to original process, restarting faulting instruction

     Integrating VM and Cache
     [Figure: The CPU sends the VA to the MMU. The MMU sends the PTEA to the L1 cache; on a PTEA hit the PTE comes from the cache, on a miss it is fetched from memory. The MMU then sends the PA to the L1 cache; on a PA hit the data comes from the cache, on a miss it is fetched from memory.]
     VA: virtual address, PA: physical address, PTE: page table entry, PTEA = PTE address

     Two Problems
     • Two problems with page tables
     • Problem #1: Page table is too large
     • Problem #2: Page table is stored in memory
       • Before every memory access, always fetch the PTE from the slow memory? → Large performance penalty

     Multi-Level Page Tables
     • Suppose:
       • 4KB (2^12) page size, 48-bit address space, 8-byte PTE
     • Problem:
       • Would need a 512 GB page table!
         • 2^48 * 2^-12 * 2^3 = 2^39 bytes
     • Common solution: Multi-level page table
     • Example: 2-level page table
       • Level 1 table: each PTE points to a page table (always memory resident)
       • Level 2 table: each PTE points to a page (paged in and out like any other data)
     [Figure: One Level 1 table whose entries point to several Level 2 tables.]
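
The 512 GB figure on the "Multi-Level Page Tables" slide is just exponent arithmetic, and a short C helper makes it easy to try other configurations. The function name `flat_page_table_bytes` is an assumption for this sketch; all three arguments are base-2 logarithms.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed by a flat (single-level) page table: one PTE per virtual page.
 * va_bits   = log2 of the virtual address space size
 * page_bits = log2 of the page size
 * pte_bits  = log2 of the PTE size in bytes
 * Result: 2^(va_bits - page_bits + pte_bits) bytes. */
static uint64_t flat_page_table_bytes(unsigned va_bits, unsigned page_bits,
                                      unsigned pte_bits)
{
    assert(va_bits >= page_bits);
    unsigned total_bits = va_bits - page_bits + pte_bits;
    assert(total_bits < 64);                   /* result must fit in 64 bits */
    return (uint64_t)1 << total_bits;
}
```

Calling `flat_page_table_bytes(48, 12, 3)` gives 2^39 bytes (512 GB), matching the slide; `flat_page_table_bytes(32, 12, 2)` gives 4 MB for a 32-bit address space with 4-byte PTEs.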

  4. Translating with a k-level Page Table
     [Figure: The virtual address (bits n-1..0) is split into VPN 1, VPN 2, ..., VPN k and the VPO (bits p-1..0). The page table base register (PTBR) points to the Level 1 page table; VPN 1 selects an entry pointing to a Level 2 page table, and so on, until VPN k selects an entry in a Level k page table that holds the PPN. The physical address (bits m-1..0) is the PPN followed by the PPO (bits p-1..0), which equals the VPO.]

     A Two-Level Page Table Hierarchy
     32 bit addresses, 4KB pages, 4-byte PTEs
     [Figure: The Level 1 page table holds PTE 0, PTE 1, PTE 2 (null) through PTE 7 (null), PTE 8, and (1K - 9) null PTEs.
      • PTE 0 points to a Level 2 page table (PTE 0 ... PTE 1023) mapping VP 0 ... VP 1023.
      • PTE 1 points to a Level 2 page table (PTE 0 ... PTE 1023) mapping VP 1024 ... VP 2047.
        Together these are the 2K allocated VM pages for code and data.
      • PTE 2 through PTE 7 are null: a gap of 6K unallocated VM pages.
      • PTE 8 points to a Level 2 page table with 1023 null PTEs (1023 unallocated pages) followed by PTE 1023, which maps VP 9215, the 1 allocated VM page for the stack.]

     Translation: “Flat” Page Table
         pte_t PAGE_TABLE[1<<20];  // 32-bit VA, 28-bit PA, 4KB page
         PAGE_TABLE[7]=2;
     [Figure: Virtual address: bits 31..12 VPN = 000000111, bits 11..0 offset = XXX. PAGE_TABLE holds 16-bit entries PTE 0 through PTE (1<<20)-1; PTE 7 = 000000010 and the other entries shown are NULL. Physical address: bits 27..12 PPN = 000000010, bits 11..0 offset = XXX.]

     Translation: Two-Level Page Table
         pte_t *PAGE_DIRECTORY[1<<10];
         PAGE_DIRECTORY[0]=malloc((1<<10)*sizeof(pte_t));
         PAGE_DIRECTORY[0][7]=2;
     [Figure: VPN[31:12] = 0000000000_0000000111, split into a directory index and a table index. PAGE_DIR holds PDE 0 through PDE 1023; PDE 0 = &PT 0 and the other entries shown are NULL. PAGE_TABLE 0 holds 16-bit entries PTE 0 through PTE 1023; PTE 7 = 000000010 and the other entries shown are NULL.]
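
The two-level snippet above only shows how a mapping is stored. A possible sketch of the matching lookup follows, reusing the slide's `PAGE_DIRECTORY` name and 16-bit `pte_t`. The helper names `map_page` and `translate`, and the convention that a PTE value of 0 means "unmapped", are assumptions made for this example (a real PTE would carry an explicit valid bit).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint16_t pte_t;                  /* 16-bit PTE, as drawn on the slide */

#define DIR_ENTRIES   (1u << 10)         /* indexed by the top 10 bits of the VPN */
#define TABLE_ENTRIES (1u << 10)         /* indexed by the low 10 bits of the VPN */
#define INVALID_PTE   0u                 /* assumption: 0 marks an unmapped page */

static pte_t *PAGE_DIRECTORY[DIR_ENTRIES];   /* static storage: all NULL at start */

/* Map vpn -> ppn, allocating the second-level table on first use.
 * Returns 0 on success, -1 if the allocation fails. */
static int map_page(uint32_t vpn, pte_t ppn)
{
    uint32_t dir = vpn >> 10;
    uint32_t idx = vpn & (TABLE_ENTRIES - 1u);

    assert(dir < DIR_ENTRIES && ppn != INVALID_PTE);
    if (PAGE_DIRECTORY[dir] == NULL) {
        PAGE_DIRECTORY[dir] = calloc(TABLE_ENTRIES, sizeof(pte_t));
        if (PAGE_DIRECTORY[dir] == NULL)
            return -1;
    }
    PAGE_DIRECTORY[dir][idx] = ppn;
    return 0;
}

/* Walk both levels for a 32-bit virtual address.
 * Returns 0 and sets *pa on success, -1 if the page is unmapped. */
static int translate(uint32_t va, uint32_t *pa)
{
    uint32_t dir = va >> 22;                            /* directory index */
    uint32_t idx = (va >> 12) & (TABLE_ENTRIES - 1u);   /* table index */
    uint32_t off = va & 0xFFFu;                         /* 12-bit page offset */
    const pte_t *table = PAGE_DIRECTORY[dir];

    if (table == NULL || table[idx] == INVALID_PTE)
        return -1;
    *pa = ((uint32_t)table[idx] << 12) | off;
    return 0;
}
```

After `map_page(7, 2)`, the address 0x00007ABC translates to 0x00002ABC. Only one second-level table (2KB here) has been allocated; the other 1023 directory entries stay NULL, which is where the space saving over the flat table comes from.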

  5. Multi-Level Page Table (x86-64)
     [Figure: x86-64 multi-level page table; no text recoverable.]

     Two-Level Page Table (x86)
     • CR3: Control Register 3 (or Page Directory Base Register)
       • Stores the physical address of the page directory
       • Q: Why not the virtual address?

     Per-Process Virtual Address Space
     • Each process has its own virtual address space
       • Process X: text editor
       • Process Y: video player
       • X writing to its virtual address 0 does not affect the data stored in Y’s virtual address 0 (or any other address)
         • This was the entire purpose of virtual memory
     • Each process has its own page directory and page tables
       • On a context switch, the CR3’s value must be updated
     [Figure: CR3 points to either X’s PAGE_DIR or Y’s PAGE_DIR.]

     Two Problems
     • Two problems with page tables
     • Problem #1: Page table is too large
       • Page table has 1M entries
       • Each entry is 4B (because 4B ≈ 20-bit PPN)
       • Page table = 4MB (!!)
         • very expensive in the 80s
       • Solution: Hierarchical page table
     • Problem #2: Page table is in memory
       • Before every memory access, always fetch the PTE from the slow memory? → Large performance penalty
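
The "Per-Process Virtual Address Space" slide can be illustrated with a small simulation in which a context switch only changes which translation structure the hardware consults. Everything here is a simplification: `address_space_t`, `context_switch`, and `lookup_ppn` are hypothetical names, the "page directory" is collapsed into one small array, and the simulated `cr3` is an ordinary C pointer rather than a physical address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_VPAGES 8u                    /* tiny address space, illustration only */

/* Stand-in for one process's page directory and page tables. */
typedef struct {
    uint32_t ppn[NUM_VPAGES];
} address_space_t;

/* Simulated CR3: points at the running process's translation structures. */
static const address_space_t *cr3 = NULL;

/* On a context switch, only CR3 changes; the tables themselves stay put. */
static void context_switch(const address_space_t *next)
{
    cr3 = next;
}

/* Translate a VPN for whichever process is currently running. */
static uint32_t lookup_ppn(uint32_t vpn)
{
    assert(cr3 != NULL && vpn < NUM_VPAGES);
    return cr3->ppn[vpn];
}
```

If X maps its virtual page 0 to PPN 10 and Y maps its virtual page 0 to PPN 20, then `lookup_ppn(0)` returns 10 while X runs and 20 after `context_switch(&y)`: the same virtual address reaches different physical pages.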

  6. Speeding up Translation with a TLB
     • Page table entries (PTEs) are cached in L1 like any other memory word
       • PTEs may be evicted by other data references
       • PTE hit still requires a small L1 delay
     • Solution: Translation Lookaside Buffer (TLB)
       • Small set-associative hardware cache in MMU
       • Maps virtual page numbers to physical page numbers
       • Contains complete page table entries for small number of pages

     Accessing the TLB
     • MMU uses the VPN portion of the virtual address to access the TLB:

         Virtual address:   bits n-1..p+t  TLB tag (TLBT)   |   bits p+t-1..p  TLB index (TLBI)   |   bits p-1..0  VPO
                            (TLBT and TLBI together form the VPN)

       • T = 2^t sets
       • TLBI selects the set
       • TLBT matches tag of line within set
     [Figure: Sets 0 through T-1, each holding lines of the form (v, tag, PTE).]

     TLB Hit
     [Figure: Inside the CPU chip, the CPU sends the VA to the MMU (1); the MMU sends the VPN to the TLB (2) and receives the PTE (3); the MMU sends the PA to cache/memory (4); cache/memory returns the data to the CPU (5).]
     A TLB hit eliminates a memory access

     TLB Miss
     [Figure: Inside the CPU chip, the CPU sends the VA to the MMU (1); the MMU sends the VPN to the TLB (2), which misses; the MMU sends the PTEA to cache/memory (3) and the PTE is returned and placed in the TLB (4); the MMU sends the PA to cache/memory (5); cache/memory returns the data to the CPU (6).]
     A TLB miss incurs an additional memory access (the PTE)
     Fortunately, TLB misses are rare. Why?
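
The hit and miss paths above can be sketched as a small set-associative cache sitting in front of a page table. This is an illustrative model under several assumptions: the sizes, the round-robin replacement policy, the `pte_fetches` counter, and the name `vpn_to_ppn` are all invented for the example, and every page is assumed to be resident (no page faults).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TLB_SETS   4u                    /* T = 2^t sets, here t = 2 */
#define TLB_WAYS   2u
#define NUM_VPAGES 64u

typedef struct {
    bool     valid;
    uint32_t tag;
    uint32_t ppn;
} tlb_line_t;

static tlb_line_t tlb[TLB_SETS][TLB_WAYS];   /* zero-initialized: all lines invalid */
static uint32_t   page_table[NUM_VPAGES];    /* stand-in for the in-memory page table */
static unsigned   pte_fetches;               /* page-table reads, i.e. TLB misses */
static unsigned   next_victim[TLB_SETS];     /* round-robin replacement (arbitrary choice) */

static uint32_t vpn_to_ppn(uint32_t vpn)
{
    assert(vpn < NUM_VPAGES);
    uint32_t set = vpn % TLB_SETS;           /* TLBI: low t bits of the VPN */
    uint32_t tag = vpn / TLB_SETS;           /* TLBT: remaining high bits */

    for (unsigned way = 0; way < TLB_WAYS; way++) {
        if (tlb[set][way].valid && tlb[set][way].tag == tag)
            return tlb[set][way].ppn;        /* TLB hit: no page-table access */
    }

    /* TLB miss: one extra memory access to fetch the PTE, then cache it. */
    pte_fetches++;
    uint32_t ppn = page_table[vpn];
    tlb_line_t *victim = &tlb[set][next_victim[set]];
    next_victim[set] = (next_victim[set] + 1u) % TLB_WAYS;
    victim->valid = true;
    victim->tag   = tag;
    victim->ppn   = ppn;
    return ppn;
}
```

Looking up the same VPN twice increments `pte_fetches` only once: the first access misses and fills a TLB line, the second hits. Programs tend to touch the same few pages repeatedly, which is one way to approach the slide's closing question.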

  7. Simple Memory System Example
     • Addressing
       • 14-bit virtual addresses
       • 12-bit physical address
       • Page size = 64 bytes

         Virtual address:    bits 13..6  VPN (Virtual Page Number)    |   bits 5..0  VPO (Virtual Page Offset)
         Physical address:   bits 11..6  PPN (Physical Page Number)   |   bits 5..0  PPO (Physical Page Offset)

     Simple Memory System TLB
     • 16 entries
     • 4-way associative

         Virtual address:    bits 13..8  TLBT   |   bits 7..6  TLBI   |   bits 5..0  VPO
         Example VPN bits (13..6): 0 0 0 0 1 1 0 1, so VPN = 0b1101; PPN = ?

     Translation Lookaside Buffer (TLB)

         Set   Tag PPN Valid   Tag PPN Valid   Tag PPN Valid   Tag PPN Valid
         0     03  –   0       09  0D  1       00  –   0       07  02  1
         1     03  2D  1       02  –   0       04  –   0       0A  –   0
         2     02  –   0       08  –   0       06  –   0       03  –   0
         3     07  –   0       03  0D  1       0A  34  1       02  –   0

     Simple Memory System Page Table
     Only showing the first 16 entries (out of 256)

         VPN PPN Valid       VPN PPN Valid
         00  28  1           08  13  1
         01  –   0           09  17  1
         02  33  1           0A  09  1
         03  02  1           0B  –   0
         04  –   0           0C  –   0
         05  16  1           0D  2D  1
         06  –   0           0E  11  1
         07  –   0           0F  0D  1

     VPN = 0b1101, PPN = ?
     0x0D → 0x2D

     Context Switches
     • Assume that Process X is running
       • Process X’s VPN 5 is mapped to PPN 100
       • The TLB caches this mapping
         • VPN 5 → PPN 100
     • Now assume a context switch to Process Y
       • Process Y’s VPN 5 is mapped to PPN 200
       • When Process Y tries to access VPN 5, it searches the TLB
         • Process Y finds an entry whose tag is 5
         • Hurray! It’s a TLB hit!
         • The PPN must be 100!
         • … Are you sure?
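
The VPN 0x0D lookup in the simple memory system can be worked through in C using the TLB contents from the slide. The slide does not give a page offset, so the example address 0x0354 (VPN 0x0D, VPO 0x14) is an assumption, as are the type and function names. Invalid lines store PPN 0 as a placeholder where the slide shows "–".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VPO_BITS  6u                     /* 64-byte pages */
#define TLBI_BITS 2u                     /* 16 entries / 4 ways = 4 sets */
#define TLB_SETS  4u
#define TLB_WAYS  4u

typedef struct {
    uint8_t tag;
    uint8_t ppn;
    bool    valid;
} tlb_line_t;

/* TLB contents copied from the slide. */
static const tlb_line_t TLB[TLB_SETS][TLB_WAYS] = {
    { {0x03, 0x00, false}, {0x09, 0x0D, true },
      {0x00, 0x00, false}, {0x07, 0x02, true } },
    { {0x03, 0x2D, true }, {0x02, 0x00, false},
      {0x04, 0x00, false}, {0x0A, 0x00, false} },
    { {0x02, 0x00, false}, {0x08, 0x00, false},
      {0x06, 0x00, false}, {0x03, 0x00, false} },
    { {0x07, 0x00, false}, {0x03, 0x0D, true },
      {0x0A, 0x34, true }, {0x02, 0x00, false} },
};

/* Returns true on a TLB hit and stores the 12-bit physical address in *pa. */
static bool tlb_translate(uint16_t va, uint16_t *pa)
{
    assert(va < (1u << 14));                          /* 14-bit virtual addresses */
    uint16_t vpo  = va & ((1u << VPO_BITS) - 1u);     /* bits 5..0 */
    uint16_t vpn  = va >> VPO_BITS;                   /* bits 13..6 */
    uint16_t tlbi = vpn & (TLB_SETS - 1u);            /* bits 7..6: set index */
    uint16_t tlbt = vpn >> TLBI_BITS;                 /* bits 13..8: tag */

    for (unsigned way = 0; way < TLB_WAYS; way++) {
        const tlb_line_t *line = &TLB[tlbi][way];
        if (line->valid && line->tag == tlbt) {
            *pa = (uint16_t)((line->ppn << VPO_BITS) | vpo);
            return true;
        }
    }
    return false;                        /* miss: fall back to the page table */
}
```

For VPN 0x0D the index is 1 and the tag is 0x03, which matches the first line of set 1 (PPN 0x2D, valid). The address 0x0354 therefore translates to 0x0B54, in agreement with the page table entry 0x0D → 0x2D.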
