lecture 19 virtual memory

  1. Lecture 19: Virtual Memory
     Virtual memory concept, virtual-physical translation, page table, TLB, Alpha 21264 memory hierarchy
     Adapted from UC Berkeley CS252 S01

  2. Virtual Memory
     Virtual memory (VM) gives programs the illusion of a very large memory that is not limited by physical memory size.
     - Main memory (DRAM) acts like a cache for secondary storage (magnetic disk).
     - Otherwise, application programmers have to move data in and out of main memory themselves.
     - That is how virtual memory was first proposed.
     Virtual memory also provides the following functions:
     - Allowing multiple processes to share physical memory in a multiprogramming environment
     - Providing protection for processes (compare the Intel 8086: without VM, applications can overwrite the OS kernel)
     - Facilitating program relocation in the physical memory space

  3. VM Example

  4. Virtual Memory and Cache
     VM address translation provides a mapping from the virtual address used by the processor to the physical address in main memory and secondary storage.
     Cache terms vs. VM terms:
     - Cache block => page
     - Cache miss => page fault
     Tasks of hardware and OS:
     - The TLB does fast address translations
     - The OS handles less frequent events:
       - page fault
       - TLB miss (when the software approach is used)

  5. Virtual Memory and Cache
     Parameter           L1 cache                     Main memory
     Block (page) size   16-128 bytes                 4 KB to 64 KB
     Hit time            1-3 cycles                   50-150 cycles
     Miss penalty        8-300 cycles                 1M to 10M cycles
     Miss rate           0.1-10%                      0.00001-0.001%
     Address mapping     25-45 bits => 13-21 bits     32-64 bits => 25-45 bits

  6. 4 Qs for Virtual Memory
     Q1: Where can a block be placed in the upper level?
     - The miss penalty for virtual memory is very high => full associativity is desirable (so blocks may be placed anywhere in memory)
     - Software determines the location while the disk is being accessed (10M cycles is enough time for sophisticated replacement)
     Q2: How is a block found if it is in the upper level?
     - The address is divided into a page number and a page offset
     - A page table and a translation buffer are used for address translation
     - Q: why does full associativity not affect hit time?

  7. 4 Qs for Virtual Memory
     Q3: Which block should be replaced on a miss?
     - We want to reduce the miss rate and can handle replacement in software
     - Least Recently Used (LRU) is typically used
     - A typical approximation of LRU:
       - Hardware sets reference bits
       - The OS records the reference bits and clears them periodically
       - The OS selects a page among the least recently referenced for replacement
     Q4: What happens on a write?
     - Writing to disk is very expensive
     - Use a write-back strategy
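
The reference-bit approximation of LRU on slide 7 can be sketched as follows. This is a minimal illustration of one common variant (often called "aging"), not necessarily the scheme the lecture had in mind; the `Frame` class, the 8-bit history width, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass

HISTORY_BITS = 8  # assumed width of the OS-side history kept per page


@dataclass
class Frame:
    page: int
    referenced: bool = False  # the bit that hardware sets on an access
    history: int = 0          # OS record of recent reference bits


def touch(frame):
    """Model the hardware setting the reference bit when the page is accessed."""
    frame.referenced = True


def os_tick(frames):
    """Periodic OS scan: record each reference bit in the history, then clear it."""
    for f in frames:
        f.history = (f.history >> 1) | (int(f.referenced) << (HISTORY_BITS - 1))
        f.referenced = False


def pick_victim(frames):
    """Pick a page with the smallest history, i.e. among the least recently referenced."""
    return min(frames, key=lambda f: f.history)
```

A page touched in the most recent interval has its top history bit set, so its history value is larger than that of any page that has been idle longer.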

  8. Virtual and Physical Addresses
     A virtual address consists of a virtual page number and a page offset. The virtual page number is translated to a physical page number; the page offset is not changed.
     Virtual address:  virtual page number (36 bits) | page offset (12 bits)
     Physical address: physical page number (33 bits) | page offset (12 bits)
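
As a small illustration of the address split on slide 8, the sketch below separates a virtual address into page number and offset and rebuilds a physical address from a translated page number. The 12-bit offset comes from the slide; the function names and the sample values are hypothetical.

```python
PAGE_OFFSET_BITS = 12  # from the slide: 12-bit page offset (4 KB pages)
OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1


def split_virtual_address(va):
    """Return (virtual page number, page offset) for a virtual address."""
    return va >> PAGE_OFFSET_BITS, va & OFFSET_MASK


def make_physical_address(ppn, offset):
    """Concatenate a physical page number with the unchanged page offset."""
    return (ppn << PAGE_OFFSET_BITS) | offset


vpn, offset = split_virtual_address(0x123456789ABC)
print(hex(vpn), hex(offset))                    # 0x123456789 0xabc
print(hex(make_physical_address(0x42, offset)))  # 0x42abc
```

Only the page number goes through translation; the offset bits are copied straight across.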

  9. Address Translation Via Page Table
     Assume the access hits in main memory.

  10. Address Translation with Page Tables
      A page table translates a virtual page number into a physical page number. A page table register indicates the start of the page table. The virtual page number is used as an index into the page table, whose entries contain:
      - The physical page number
      - A valid bit that indicates whether the page is present in main memory
      - A dirty bit that indicates whether the page has been written
      - Protection information about the page (read only, read/write, etc.)
      Since the page table contains a mapping for every virtual page, no tags are required (how does this compare with a cache?).
      Page table access is slow; we will see the solution.
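
The lookup described on slide 10 might look roughly like the sketch below. It assumes 4 KB pages, a single flat table held in a Python list, and a single `writable` flag standing in for the protection information; the class and exception names are invented for the example.

```python
from dataclasses import dataclass

PAGE_OFFSET_BITS = 12  # assumption for this sketch: 4 KB pages


class PageFault(Exception):
    """Raised when the valid bit is clear (page not in main memory)."""


class ProtectionFault(Exception):
    """Raised when a write targets a read-only page."""


@dataclass
class PTE:
    ppn: int = 0
    valid: bool = False
    dirty: bool = False
    writable: bool = False  # simplified stand-in for the protection field


def translate(page_table, va, write=False):
    """Translate a virtual address using a flat page table indexed by VPN."""
    vpn = va >> PAGE_OFFSET_BITS
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    pte = page_table[vpn]  # direct index: no tag comparison is needed
    if not pte.valid:
        raise PageFault(vpn)
    if write:
        if not pte.writable:
            raise ProtectionFault(vpn)
        pte.dirty = True  # remember that the page must be written back
    return (pte.ppn << PAGE_OFFSET_BITS) | offset
```

Because the table has one entry per virtual page, the virtual page number alone selects the entry, which is why no tag is stored.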

  11. Page Table Diagram

  12. Accessing Main Memory or Disk
      A valid bit of zero means the page is not in main memory. A page fault then occurs, and the missing page is read in from disk.

  13. How Large Is Page Table?
      Suppose:
      - 48-bit virtual address
      - 41-bit physical address
      - 8 KB pages => 13-bit page offset
      - Each page table entry is 8 bytes
      How large is the page table?
      - Virtual page number = 48 - 13 = 35 bits
      - Number of entries = number of pages = 2^35 = 32G
      - Total size = number of entries x bytes/entry = 32G x 8 B = 256 GB
      - Each process needs its own page table
      Page tables can be very large, so they must be stored in main memory or even paged themselves, resulting in slow access.
      We need techniques to reduce page table size.
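
The arithmetic on slide 13 can be checked directly from the stated parameters (48-bit virtual address, 8 KB pages, 8-byte entries): 48 - 13 leaves 35 bits of virtual page number. The helper below is a hypothetical name and assumes a single-level (flat) page table with a power-of-two page size.

```python
def flat_page_table_size(va_bits, page_bytes, pte_bytes):
    """Return (VPN bits, number of entries, total bytes) for a flat page table."""
    offset_bits = page_bytes.bit_length() - 1  # log2 of a power-of-two page size
    vpn_bits = va_bits - offset_bits
    entries = 1 << vpn_bits
    return vpn_bits, entries, entries * pte_bytes


vpn_bits, entries, total = flat_page_table_size(48, 8 * 1024, 8)
print(vpn_bits, entries // 2**30, total // 2**30)  # 35 32 256
```

That is 35 VPN bits, 32G entries, and 256 GB per process, which is why a flat table is impractical at this address width.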

  14. TLB: Improving Page Table Access
      We cannot afford to access the page table on every memory access, including cache hits (otherwise the cache itself would make no sense).
      Again, use a cache to speed up accesses to the page table (a cache for a cache?).
      The TLB (translation lookaside buffer) stores frequently accessed page table entries.
      A TLB entry is like a cache entry:
      - The tag holds portions of the virtual address
      - The data portion holds the physical page number, protection field, valid bit, use bit, and dirty bit (as in a page table entry)
      - Usually fully associative or highly set associative
      - Usually 64 or 128 entries
      Access the page table only on TLB misses.
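
To make the "cache in front of the page table" idea on slide 14 concrete, here is a minimal sketch of a fully associative TLB. The LRU replacement policy, the dictionary standing in for the page table, and the class layout are assumptions for illustration; protection, valid, use, and dirty bits are omitted.

```python
from collections import OrderedDict


class TLB:
    """Fully associative TLB sketch with LRU replacement (policy is an assumption)."""

    def __init__(self, entries=128):
        self.entries = entries
        self.map = OrderedDict()  # tag (VPN) -> data (PPN), oldest first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn, page_table):
        if vpn in self.map:
            self.hits += 1
            self.map.move_to_end(vpn)  # mark as most recently used
            return self.map[vpn]
        self.misses += 1
        ppn = page_table[vpn]  # slow path: consult the page table
        if len(self.map) >= self.entries:
            self.map.popitem(last=False)  # evict the least recently used entry
        self.map[vpn] = ppn
        return ppn
```

Only the miss path touches `page_table`, which matches the slide's rule of accessing the page table only on TLB misses.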

  15. TLB Characteristics
      The following are characteristics of TLBs:
      - TLB size: 32 to 4,096 entries
      - Block size: 1 or 2 page table entries (4 or 8 bytes each)
      - Hit time: 0.5 to 1 clock cycle
      - Miss penalty: 10 to 30 clock cycles (go to page table)
      - Miss rate: 0.01% to 0.1%
      - Associativity: fully associative or set associative
      - Write policy: write back (replace infrequently)
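
One way to read the numbers on slide 15 is through the usual average-access-time formula (hit time + miss rate x miss penalty). The sketch below applies it to translation cost; the function name is hypothetical and the inputs are simply the upper ends of the ranges quoted on the slide.

```python
def average_translation_cycles(hit_time, miss_rate, miss_penalty):
    """Average cycles per translation: hit time plus expected miss cost."""
    return hit_time + miss_rate * miss_penalty


# Upper ends of the slide's ranges: 1-cycle hit, 0.1% miss rate, 30-cycle penalty.
print(average_translation_cycles(1.0, 0.001, 30))
```

Even at the pessimistic end of each range, the average stays very close to the hit time, which is the point of keeping the miss rate so low.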

  16. Alpha 21264 Data TLB
      - 128 entries, fully associative
      - ASN (address space number, like a PID) to avoid flushing
      - Also checks protection

  17. Determine Page Size
      Factor               Favors     Comment
      Page table size      Larger     Inversely proportional to page size
      Fast L1 cache hit    Larger     L1 cache can be larger
      I/O utilization      Larger     Longer burst transfer
      TLB hit rate         Larger     Increasing TLB coverage
      Storage efficiency   Smaller    Reducing fragmentation
      I/O efficiency       Smaller    Avoids unnecessary data transfer
      Process start-up     Smaller    Small processes are popular
      Most commonly used size: 4 KB or 8 KB
      - Hardware may support a range of page sizes
      - OS selects the best one(s) for its purpose
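
The "TLB coverage" row on slide 17 is easy to quantify: coverage is the number of TLB entries times the page size. The sketch below uses a 128-entry TLB as an example size; the function name is hypothetical.

```python
def tlb_coverage_bytes(entries, page_bytes):
    """Amount of memory that can be mapped by the TLB at one time."""
    return entries * page_bytes


for page_kb in (4, 8, 64):
    coverage_kb = tlb_coverage_bytes(128, page_kb * 1024) // 1024
    print(f"{page_kb} KB pages -> {coverage_kb} KB covered")
```

With 128 entries, going from 4 KB to 64 KB pages raises coverage from 512 KB to 8 MB without adding any TLB entries.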

  18. Alpha 21264 TLB Access
      [Figure labels: virtually indexed, physically tagged; physically indexed, physically tagged]

  19. Alpha 21264 Virtual Memory
      Combining segmentation and paging:
      - Segmentation: a variable-size range of memory space, usually defined by a base register and a limit field
      - Segmentation assigns meanings to address spaces and reduces the address space that needs paging (reducing page table size)
      - Paging is used on the address space of each segment
      Three segments in Alpha:
      - kseg: reserved for the OS kernel, not under VM management
      - seg0: virtual addresses accessible to the user process
      - seg1: virtual addresses accessible to the OS kernel

  20. Two Viewpoints of Virtual Memory
      Application programs:
      - See a large, flat memory space
      - Assume fast access to every location
      - Hardware and OS hide the complexity
      OS kernel:
      - Manages multiple process spaces
      - Reserves direct access to some portions of physical memory
      - May access physical memory, its own virtual memory, and the virtual memory of the current process
      - Hardware facilitates fast VM accesses, and the OS manages slow, less frequent events

  21. Alpha 21264 Page Table
      Page table access on a TLB miss is managed by software.
      [Figure labels: 10-bit and 13-bit virtual address fields; 1024 8-byte PTEs; 28-bit and 13-bit physical address fields]
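
The 10-bit index, 13-bit offset, and 1024-entry labels on slide 21 fit a multi-level page table in which each level is indexed by 10 bits of the virtual address. The sketch below shows such a walk in software. The choice of three levels is an assumption based on the usual description of Alpha paging, not something the slide text states; nested dictionaries stand in for page-table pages, and the function names are hypothetical.

```python
LEVEL_BITS = 10   # from the slide: 10-bit index, 1024 PTEs per table
OFFSET_BITS = 13  # from the slide: 13-bit page offset (8 KB pages)
LEVELS = 3        # assumption: three levels, not stated on the slide


def split(va):
    """Return ([top-level index, ..., leaf index], page offset)."""
    offset = va & ((1 << OFFSET_BITS) - 1)
    vpn = va >> OFFSET_BITS
    indexes = []
    for _ in range(LEVELS):
        indexes.append(vpn & ((1 << LEVEL_BITS) - 1))
        vpn >>= LEVEL_BITS
    return indexes[::-1], offset


def walk(root, va):
    """Walk nested tables; return the physical address, or None if unmapped."""
    indexes, offset = split(va)
    node = root
    for i in indexes:
        node = node.get(i)
        if node is None:
            return None  # a real handler would raise a page fault here
    return (node << OFFSET_BITS) | offset  # leaf value is the physical page number
```

Only the tables along paths that are actually mapped need to exist, which is how a multi-level layout avoids allocating one entry per virtual page.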

  22. Memory Protection
      Memory protection: preventing unauthorized accesses to process and kernel memory.
      Memory protection implementation:
      - User programs can only access memory through virtual memory
      - Each PTE contains protection bits to allow shared but protected accesses
      Protection fields in Alpha:
      - Valid, user read enable, kernel read enable, user write enable, and kernel write enable
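
The five protection fields listed on slide 22 could be checked along the lines of the sketch below. The field names follow the slide; the `Protection` class, the `"user"`/`"kernel"` mode strings, and the function name are assumptions made for the example, not the Alpha PTE layout.

```python
from dataclasses import dataclass


@dataclass
class Protection:
    valid: bool
    user_read: bool
    kernel_read: bool
    user_write: bool
    kernel_write: bool


def access_allowed(prot, mode, write):
    """Return True if an access in the given mode is permitted by the PTE bits."""
    if not prot.valid:
        return False
    if mode == "user":
        return prot.user_write if write else prot.user_read
    if mode == "kernel":
        return prot.kernel_write if write else prot.kernel_read
    raise ValueError(f"unknown mode: {mode}")


# A page shared read-only with user code but writable by the kernel.
shared = Protection(valid=True, user_read=True, kernel_read=True,
                    user_write=False, kernel_write=True)
print(access_allowed(shared, "user", write=True))    # False
print(access_allowed(shared, "kernel", write=True))  # True
```

Separate enable bits per mode are what allow a page to be shared between processes or with the kernel while still restricting who may write it.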

  23. Memory Hierarchy Example: Alpha 21264 in AlphaServer ES40
      - L1 instruction cache: 2-way, 64 KB, 64-byte block, virtually indexed and tagged
        - Uses way prediction and line prediction to allow instruction fetching
      - Instruction prefetcher: stores four prefetched instructions, accessed before the L2 cache
      - L1 data cache: 2-way, 64 KB, 64-byte block, virtually indexed, physically tagged, write-through
      - Victim buffer: 8-entry, checked before L2 access
      - L2 unified cache: 1-way (direct-mapped), 1 MB to 16 MB, off-chip, write-back
        - Allows critical-word transfer to the L1 cache; transfers 16 B per 2.25 ns
      - TLB: 128-entry fully associative for instructions and data (each)
      - ES40: L1 miss penalty 22 ns, L2 130 ns; up to 32 GB memory; 256-bit memory buses (64-bit into processor)
      Read 5.13 for more details.
