virtual memory and paging



Operating Systems Principles
Virtual Memory and Paging
Mark Kampe (markk@cs.ucla.edu)
4/12/2016

  6A. Introduction to Swapping and Paging
  6B. Paging MMUs and Demand Paging
  6C. Replacement Algorithms
  6D. Thrashing and Working Sets
  6E. Other optimizations

Memory Management
  1. allocate/assign physical memory to processes
     – explicit requests: malloc (sbrk)
     – implicit: program loading, stack extension
  2. manage the virtual address space
     – instantiate virtual address space on context switch
     – extend or reduce it on demand
  3. manage migration to/from secondary storage
     – optimize use of main storage
     – minimize overhead (waste, migrations)

Memory Management Goals
  1. transparency
     – process sees only its own virtual address space
     – process is unaware memory is being shared
  2. efficiency
     – high effective memory utilization
     – low run-time cost for allocation/relocation
  3. protection and isolation
     – private data will not be corrupted
     – private data cannot be seen by other processes

Primary and Secondary Storage
  • primary = main (executable) memory
    – primary storage is expensive and very limited
    – only processes in primary storage can be run
  • secondary = non-executable (e.g. Disk/SSD)
    – blocked processes can be moved to secondary storage
    – swap out code, data, stack and non-resident context
    – make room in primary for other "ready" processes
  • returning to primary memory
    – process is copied back when it becomes unblocked

Why we swap
  • make best use of a limited amount of memory
    – process can only execute if it is in memory
    – can't keep all processes in memory all the time
    – if it isn't READY, it doesn't need to be in memory
    – swap it out and make room for other processes
  • improve CPU utilization
    – when there are no READY processes, CPU is idle
    – CPU idle time means reduced system throughput
    – more READY processes means better utilization

scheduling states with swapping
  [Figure: process state diagram. Labels include create, exit, running, ready, blocked, request, allocate, swap out, swap in, swapped out, swap wait.]

Pure Swapping
  • each segment is contiguous
    – in memory, and on secondary storage
    – all in memory, or all on swap device
  • swapping takes a great deal of time
    – transferring entire data (and text) segments
  • swapping wastes a great deal of memory
    – processes seldom need the entire segment
  • variable length memory/disk allocation
    – complex, expensive, external fragmentation

paged address translation
  [Figure: the CODE, DATA and STACK segments of the process virtual address space mapped, page by page, onto physical memory.]

Paging and Fragmentation
  • a segment is implemented as a set of virtual pages
  • internal fragmentation
    – averages only ½ page (half of the last one)
  • external fragmentation
    – completely non-existent (we never carve up pages)

Paging Memory Management Unit
  [Figure: a virtual address (page #, offset) passing through the page table to become a physical address (page #, offset).]
  • virtual page # is used as an index into the page table
  • valid bit is checked to ensure that this virtual page # is legal
  • selected entry contains physical page number
  • offset within page remains the same

Paging Relocation Examples
  virtual address        physical address
  page #  offset         page #  offset
  0004    1C08     ->    041F    1C08
  0000    0100     ->    0C20    0100
  0005    3E28     ->    page fault

  page table
  index   V   page #
  0       V   0C20
  1       V   0105
  2       V   00A1
  3       0
  4       V   041F
  5       0
  6       V   0D10
  7       V   0AC3
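The page-table lookup described on the two MMU slides can be sketched in a few lines of Python. This is only an illustration: the table contents are the ones shown in the relocation example, while the names `PAGE_TABLE`, `PageFault` and `translate` are hypothetical and a real MMU does this in hardware.

```python
# Page table from the relocation example: the list index is the virtual
# page number, None stands for an entry whose valid bit is 0.
PAGE_TABLE = [0x0C20, 0x0105, 0x00A1, None, 0x041F, None, 0x0D10, 0x0AC3]


class PageFault(Exception):
    """Raised when the referenced virtual page has no valid entry (hypothetical name)."""


def translate(virtual_page, offset):
    """Return (physical_page, offset) for a virtual address, or raise PageFault."""
    # The virtual page number is used as an index into the page table.
    if virtual_page >= len(PAGE_TABLE) or PAGE_TABLE[virtual_page] is None:
        # Valid bit is clear: this reference faults.
        raise PageFault(virtual_page)
    # The selected entry supplies the physical page; the offset is unchanged.
    return PAGE_TABLE[virtual_page], offset


if __name__ == "__main__":
    for vpage, off in [(0x0004, 0x1C08), (0x0000, 0x0100), (0x0005, 0x3E28)]:
        try:
            ppage, poff = translate(vpage, off)
            print(f"{vpage:04X} {off:04X} -> {ppage:04X} {poff:04X}")
        except PageFault:
            print(f"{vpage:04X} {off:04X} -> page fault")
```

Only the page number is looked up; the offset passes through untouched, which is why pages and page frames must be the same size.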

Demand Paging
  • paging MMU supports not present pages
    – generates a fault/trap when they are referenced
    – OS can bring in page, retry the faulted reference
  • entire process needn't be in memory to run
    – start each process with a subset of its pages
    – load additional pages as program demands them
  • they don't need all the pages all the time
    – code execution exhibits reference locality
    – data references exhibit reference locality

Demand Paging – advantages
  • improved system performance
    – fewer in-memory pages per process
    – more processes in primary memory
      • more parallelism, better throughput
      • better response time for processes already in memory
    – less time required to page processes in and out
  • fewer limitations on process size
    – process can be larger than physical memory
    – process can have huge (sparse) virtual space

Page Fault Handling
  • initialize page table entries to not present
  • CPU faults when invalid page is referenced
    1. trap forwarded to page fault handler
    2. determine which page, where it resides
    3. find and allocate a free page frame
    4. block process, schedule I/O to read page in
    5. update page table to point at newly read-in page
    6. back up user-mode PC to retry failed instruction
    7. unblock process, return to user-mode

Demand Paging and Performance
  • page faults hurt performance
    – increased overhead
      • additional context switches and I/O operations
    – reduced throughput
      • processes are delayed waiting for needed pages
  • key is having the "right" pages in memory
    – right pages -> few faults, little overhead/delay
    – wrong pages -> many faults, much overhead/delay
  • we have little control over what we bring in
    – we read the pages the process demands
  • key to performance is which pages we evict

Belady's Optimal Algorithm
  • Q: which page should we replace?
    A: the one we won't need for the longest time
  • Why is this the right page?
    – it delays the next page fault as long as possible
    – minimum number of page faults per unit time
  • How can we predict future references?
    – Belady cannot be implemented in a real system
    – but we can implement it for test data streams
    – we can compare other algorithms against it

Approximating Optimal Replacement
  • note which pages have recently been used
    – use this data to predict future behavior
  • Possible replacement algorithms
    – random, FIFO: straw-men ... forget them
  • Least Recently Used
    – assert near future will be like recent past
      • programs do exhibit temporal and spatial locality
      • if we haven't used it recently, we probably won't soon
    – we don't have to be right 100% of the time
      • the more right we are, the more page faults we save
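The page fault handling steps can be mimicked with a toy user-level model. This is a sketch under stated assumptions, not kernel code: `DemandPager` and its fields are hypothetical names, the "I/O" is a dictionary lookup, and blocking, rescheduling and backing up the PC (steps 4, 6 and 7) are collapsed into an ordinary function return followed by a retry.

```python
class DemandPager:
    """Toy demand pager: pages start not present and are loaded on first touch."""

    def __init__(self, num_frames, backing_store):
        self.backing_store = backing_store           # page -> contents on "disk"
        self.free_frames = list(range(num_frames))   # unallocated page frames
        self.memory = {}                             # frame -> contents
        # Every page table entry starts as "not present" (None).
        self.page_table = {page: None for page in backing_store}
        self.faults = 0

    def access(self, page):
        """Return the contents of a page, faulting it in if it is not present."""
        if self.page_table[page] is None:   # invalid entry: the reference faults
            self._handle_fault(page)        # step 1: trap goes to the handler
        # Retry of the faulted reference (or a plain hit) succeeds here.
        return self.memory[self.page_table[page]]

    def _handle_fault(self, page):
        self.faults += 1
        contents = self.backing_store[page]  # step 2: which page, where it resides
        if not self.free_frames:
            # A real system would pick a victim with a replacement algorithm.
            raise MemoryError("no free page frame in this sketch")
        frame = self.free_frames.pop(0)      # step 3: allocate a free frame
        self.memory[frame] = contents        # step 4: stand-in for the disk read
        self.page_table[page] = frame        # step 5: entry now points at the frame


if __name__ == "__main__":
    pager = DemandPager(2, {0: "code", 1: "data", 2: "stack"})
    print(pager.access(0), pager.access(0), pager.access(2), pager.faults)
```

Only the first touch of each page faults; the second access to page 0 is served from memory, which is the behavior that lets a process start with a subset of its pages.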

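Belady's algorithm needs the whole future reference string, so it is only usable offline as a yardstick. A minimal sketch of such an offline run follows; `belady_faults` is a hypothetical helper name, and ties between pages that are never referenced again are broken arbitrarily, which does not change the fault count.

```python
def belady_faults(refs, num_frames):
    """Count page faults for Belady's optimal replacement on a known reference list."""
    frames = []   # pages currently resident
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue                      # hit: nothing to do
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)           # a free frame is still available
            continue

        def next_use(resident):
            # Position of the next reference to this page, or infinity if none.
            try:
                return refs.index(resident, i + 1)
            except ValueError:
                return float("inf")

        # Evict the page we will not need for the longest time.
        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults


if __name__ == "__main__":
    refs = list("abcdabdefabcdaed")
    print(belady_faults(refs, 4))
```

Running the same reference string through another policy and comparing its fault count against this number shows how far that policy is from optimal.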
Why Programs Exhibit Locality
  • Code locality
    – code in same routine is in same/adjacent page
    – loops iterate over the same code
    – a few routines are called repeatedly
    – intra-module calls are common
  • Stack locality
    – activity focuses on this and adjacent call frames
  • Data reference locality
    – this is common, but not assured

True LRU is hard to implement
  • maintain this information in the MMU?
    – MMU notes the time, every time a page is referenced
    – maybe we can get a per-page read/written bit
  • maintain this information in software?
    – mark all pages invalid, even if they are in memory
    – take a fault the first time each page is referenced
    – then mark this page valid for the rest of the time slice
  • finding oldest page is prohibitively expensive
    – 16GB memory / 4K page = 4M pages to scan

Practical LRU surrogates
  • must be cheap
    – can't cause additional page faults
    – avoid scanning the whole page table (it is big)
  • clock algorithms … a surrogate for LRU
    – organize all pages in a circular list
    – position around the list is a surrogate for age
    – progressive scan whenever we need another page
      • for each page, ask MMU if page has been referenced
      • if so, reset the reference bit in the MMU; skip page
      • if not, consider this page to be the least recently used

True Global LRU Replacement
  True LRU: loads 4, replacements 7

  reference  a b c d a b d e f a b c d a e d
  frame 0    a       !       f       d     !
  frame 1      b       !       a       !
  frame 2        c         e       c
  frame 3          d     !       b       e

Working Sets – per process LRU
  • Global LRU is probably a blunder
    – bad interaction with round-robin scheduling
    – better to give each process its own page pool
    – do LRU replacement within that pool
  • fixed # of pages per process is also bad
    – different processes exhibit different locality
      • which pages are needed changes over time
      • number of pages needed changes over time
    – much like different natural scheduling intervals
  • we clearly want dynamic working sets

LRU Clock Algorithm
  LRU clock: loads 4, replacements 7

  reference  a b c d a b d e f a b c d a e d
  frame 0    a       ! ! ! ! f       d     !
  frame 1      b       ! ! !   a       ! !
  frame 2        c         e       b     e
  frame 3          d     ! ! !   c
  clock pos  0 1 2 3 0 0 0 1 2 0 3 0 1 2 3 0 1 2 1 3

  True LRU: loads 4, replacements 7

  reference  a b c d a b d e f a b c d a e d
  frame 0    a       a       f       d     d
  frame 1      b       b       a       a
  frame 2        c         e       c
  frame 3          d     d       b       e
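The True LRU trace can be reproduced with a short simulation. This is a sketch, assuming an ordered dictionary as the recency list (the hypothetical helper `lru_simulate` is not from the slides); it is exactly the kind of per-reference bookkeeping that is too expensive to do in a real kernel.

```python
from collections import OrderedDict


def lru_simulate(refs, num_frames):
    """Simulate true LRU; return (loads, replacements, final frame contents)."""
    recency = OrderedDict()   # page -> frame index, least recently used first
    loads = replacements = 0
    for page in refs:
        if page in recency:
            recency.move_to_end(page)               # hit: now most recently used
        elif len(recency) < num_frames:
            recency[page] = len(recency)            # load into the next empty frame
            loads += 1
        else:
            _, frame = recency.popitem(last=False)  # evict least recently used
            recency[page] = frame                   # new page reuses that frame
            replacements += 1
    frames = [None] * num_frames
    for page, frame in recency.items():
        frames[frame] = page
    return loads, replacements, frames


if __name__ == "__main__":
    print(lru_simulate("abcdabdefabcdaed", 4))
```

For the reference string on the slide this gives 4 loads and 7 replacements, matching the slide's totals.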

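The clock scan described under "Practical LRU surrogates" can be sketched the same way. Two details here are assumptions chosen because they reproduce the frame contents in the LRU Clock Algorithm trace: the reference bit is set only when a resident page is referenced again (not when it is first loaded), and the hand stays at frame 0 while the initial empty frames fill. The name `clock_simulate` is hypothetical.

```python
def clock_simulate(refs, num_frames):
    """Simulate the clock algorithm; return (loads, replacements, final frames)."""
    frames = [None] * num_frames        # page held by each frame
    referenced = [False] * num_frames   # per-frame reference bit
    hand = 0                            # current clock position
    loads = replacements = 0
    for page in refs:
        if page in frames:
            referenced[frames.index(page)] = True   # hit: MMU sets the bit
            continue
        if None in frames:
            frames[frames.index(None)] = page       # fill an empty frame
            loads += 1
            continue
        # Progressive scan: skip (and reset) pages that have been referenced.
        while referenced[hand]:
            referenced[hand] = False
            hand = (hand + 1) % num_frames
        # First page found with a clear bit is treated as least recently used.
        frames[hand] = page
        hand = (hand + 1) % num_frames
        replacements += 1
    return loads, replacements, frames


if __name__ == "__main__":
    print(clock_simulate("abcdabdefabcdaed", 4))
```

On the slide's reference string the sketch performs 4 loads and 7 replacements, the same totals as true LRU, while only ever inspecting the frames between the old and new hand positions.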