CSE 120: Memory (Day 5)
July 18, 2006
Instructor: Neil Rhodes


Translation Lookaside Buffer (TLB)

Implemented in hardware: a cache that maps virtual page numbers to page frames
• Associative memory: hardware looks up all cache entries simultaneously
  – Usually not big: 64-128 entries
• TLB entry:
  – Page number
  – Valid
  – Modified
  – Protection
  – Page frame
• If the page is not present, do the ordinary lookup, then evict an entry from the TLB and add the new one
  – Evict which entry?
• Serial/parallel lookup
  – Serial: first look in the TLB. If not found, then look in the page table
  – Parallel: look in the TLB and in the page table in parallel. If not found in the TLB, the page table lookup is already in progress


Software TLB Management

MMU doesn't handle page tables; software does
• On a TLB miss, generate a TLB fault and let the OS deal with it
• Search a larger in-memory cache
  – The page containing the cache must be in the TLB for speed
• If not in the cache, search the page table
• Once the page frame, etc. are found, update the TLB

Why not use hardware?
• Logic to search the page table takes space on the die
• That die area could be spent alternatively:
  – Increase the memory cache
  – Reduce cost/power consumption

Note that the TLB must be flushed on a context switch
• Unless TLB entries include a process ID


TLB Summary

Cost example
• Direct memory access: 100ns
• Without TLB: 200ns (lookup in the page table first, then the access itself)
• With TLB
  – Assume the cost of a TLB lookup is 10ns
  – Assume the TLB hit rate is 90%
  – Serial lookup: average cost = .9*110ns + .1*210ns = 120ns
  – Parallel lookup: average cost = .9*110ns + .1*(210ns-10ns) = 119ns

Caches are very sensitive to:
• Hit rate
• Cost of a cache miss
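The averages above are a two-case expected-value calculation. Here is a minimal sketch of that arithmetic in C, using the slide's assumptions (100ns memory access, 10ns TLB lookup, 90% hit rate); the variable names are illustrative.

```c
/* Effective access time for the TLB cost example above.
   Constants mirror the slide's assumptions; names are illustrative. */
#include <stdio.h>

int main(void) {
    double mem = 100.0;   /* one memory access, ns */
    double tlb = 10.0;    /* one TLB lookup, ns */
    double hit = 0.90;    /* TLB hit rate */

    double hit_cost      = tlb + mem;        /* 110ns: TLB lookup, then data access */
    double serial_miss   = tlb + mem + mem;  /* 210ns: TLB, page table, then data */
    double parallel_miss = mem + mem;        /* 200ns: page-table walk overlaps the TLB lookup */

    printf("serial:   %.0fns\n", hit * hit_cost + (1 - hit) * serial_miss);   /* 120ns */
    printf("parallel: %.0fns\n", hit * hit_cost + (1 - hit) * parallel_miss); /* 119ns */
    return 0;
}
```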
Inverted Page Tables

Traditional page tables: 1 entry per virtual page
Inverted page tables: 1 entry per physical frame of memory

Why? Size
• With 64-bit virtual addresses, a traditional per-page table is far too large
• With 4KB pages and 256MB of RAM per process, an inverted page table needs only 65536 entries

Page table entry:
• Process ID
• Virtual page number
• Additional PTE info

It is slow to search linearly through a table with 65536 entries
• Solution: a hash table. The key is the virtual page number. Each entry contains the virtual page, process ID, and page frame


Inverted Page Tables (continued)

Hash table
• Space is proportional to the number of allocated memory frames
  – 1 entry in the hash table for each allocated page

[Figure: a virtual address (pid, p, offset) is looked up via hash(p); the matching entry yields frame f, and the physical address is (f, offset)]

Advantage:
• Page table memory is proportional to physical memory
  – Not to the logical address space
  – Not to the number of processes

Disadvantage:
• Hard to share memory between processes


Segmentation vs. Paging

Questions to ask when comparing segmentation with paging:
• Does the programmer need to be aware that the technique is being used?
• How many linear address spaces are there?
• Can the total address space exceed the size of physical memory?
• Can procedures and data be distinguished and separately protected?
• Can tables whose size fluctuates be accommodated easily?
• Is sharing of procedures between users facilitated?


Page Fault Handling for Paging

MMU generates a page fault (protection violation or page not present). The page fault handler must:
• Save registers
• Figure out the virtual address that caused the fault
  – Often available in a hardware register
• If it is a protection problem, signal or kill the process
• If it is a write to the page just beyond the currently-allocated stack:
  – Allocate a free page to extend the stack
  – Update the page table
  – Restart the instruction for the faulting process
    - Must undo any partial effects
• Else, signal or kill the process
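To make the hash-table lookup concrete, here is a minimal sketch in C. The layout and names (ipt_entry, ipt_lookup, the multiplicative hash constant) are assumptions for illustration, not from the slides or any real kernel.

```c
/* Sketch of an inverted-page-table lookup keyed by virtual page number. */
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

#define HASH_BUCKETS 65536   /* e.g. one per 4KB frame of 256MB RAM */

struct ipt_entry {
    uint32_t pid;            /* owning process */
    uint64_t vpn;            /* virtual page number */
    uint32_t frame;          /* physical page frame */
    struct ipt_entry *next;  /* chain for hash collisions */
};

static struct ipt_entry *buckets[HASH_BUCKETS];

/* Hash on the virtual page number, as in the figure's hash(p). */
static size_t hash_vpn(uint64_t vpn) {
    return (size_t)((vpn * 2654435761u) % HASH_BUCKETS);
}

/* Return the frame holding (pid, vpn), or -1 if not resident (page fault). */
int ipt_lookup(uint32_t pid, uint64_t vpn) {
    for (struct ipt_entry *e = buckets[hash_vpn(vpn)]; e; e = e->next)
        if (e->pid == pid && e->vpn == vpn)
            return (int)e->frame;
    return -1;
}

int main(void) {
    struct ipt_entry e = { .pid = 1, .vpn = 0x2a, .frame = 7, .next = NULL };
    buckets[hash_vpn(e.vpn)] = &e;
    printf("frame for (pid 1, vpn 0x2a): %d\n", ipt_lookup(1, 0x2a)); /* 7 */
    return 0;
}
```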
Virtual Memory

Idea: use fast (small, expensive) memory as a cache for slow (large, cheap) disk
• 90/10 rule: processes spend 90% of their time in 10% of the code
• Not all of a process's address space need be in memory at one time
• Illusion of near-infinite memory
• More processes fit in memory (higher degree of multiprogramming)

Locality
• Spatial: the likelihood of accessing a resource is higher if a resource close to it was just referenced
• Temporal: the likelihood of accessing a resource is higher if it was recently accessed


Page Fault Handling for Virtual Memory

MMU generates a page fault (protection violation or page not present)
• Save registers
• Figure out the virtual address that caused the fault
  – Often available in a hardware register
• If it is a protection problem, signal or kill the process
• If there is no free frame, evict a page from memory (which one?)
  – If modified, write it to the backing store (dedicated paging space or a normal file)
  – Keep the disk location of this page (not in the page table, but in some other data structure)
    - The MMU doesn't need to know the disk location
  – Suspend the faulting process (resume when the write is complete)
• Read the data for the faulting page
  – From the backing store, or from the application's code, or fill with zeros
  – Suspend the faulting process (resume when the read is complete)
• Update the page table
• Restart the instruction for the faulting process
  – Must undo any partial effects


Paging and Translation Lookaside Buffer

[Flowchart] The CPU checks the TLB. If the PTE is in the TLB, the CPU generates the physical address directly. Otherwise it accesses the page table. If the page is in main memory, the TLB and page table are updated and control returns to the failed instruction. If the page is not in main memory and no page frame is free, the OS instructs the CPU to write a victim page to disk: the CPU activates the I/O hardware, the page is transferred from main memory to disk, and the page table is updated. The OS then instructs the CPU to read the faulting page from disk; once the page is transferred from disk to main memory, the page table is updated and control returns to the failed instruction.


Page Replacement Policy

Resident set management
• How many page frames are allocated to each active process?
  – Fixed
  – Variable
• Which existing pages can be considered for replacement?
  – Local: only those of the process that caused the page fault
  – Global: those of all processes

Cleaning policy
• Pre-cleaning: write dirty pages out prospectively
• Demand-cleaning: write dirty pages out only as needed

Fetch policy
• Demand paging
• Prepaging: load extra pages speculatively while you're loading others
• Copy-on-write
  – Lazy duplication of pages. For example, on fork, don't copy a data page until a write occurs (see the demo below)

Replacement policy
• Which page, among those eligible, should be replaced?
  – All policies want to replace pages that won't be needed for a long time
  – Since most processes exhibit locality, recent behavior helps predict future behavior
• Eligibility may be limited by locked frames
  – Kernel pages
  – I/O buffers in kernel space
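The copy-on-write bullet can be observed from user space. This small demo, assuming a POSIX system with fork, touches a large buffer and then forks; the kernel typically shares the underlying frames read-only and copies a page only when one side writes, so the child's write never reaches the parent. The program shows the private-copy semantics that copy-on-write preserves.

```c
/* Copy-on-write demo: fork logically copies the buffer, but the
   kernel usually shares the frames and copies a page only on write. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define N (4 * 1024 * 1024)

int main(void) {
    char *buf = malloc(N);
    if (!buf) return 1;
    memset(buf, 'A', N);          /* touch every page so they're resident */

    pid_t pid = fork();           /* child shares the frames, marked read-only */
    if (pid == 0) {
        buf[0] = 'B';             /* first write faults; kernel copies just that page */
        printf("child sees:  %c\n", buf[0]);   /* B */
        _exit(0);
    }
    wait(NULL);
    printf("parent sees: %c\n", buf[0]);       /* still A: the child's copy was private */
    free(buf);
    return 0;
}
```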
Page References

Assumption: the sequence of page references exhibits locality

A reference string is the list of page numbers used by a program
• For example, <0 1 2 3 0 1 4 0 1 2 3 4>
• Consecutive references to the same page are removed
  – That page had better still be in memory!
• Reference means read or write


Opt: the Optimal Page Replacement Policy

Swap out the page that will be used farthest in the future
• Difficult to implement :)

Example reference string: <0 1 2 3 0 1 4 0 1 2 3 4>
• Three page frames

Example to work through by hand:
  2 3 2 1 5 2 4 5 3 2 5 2


FIFO: First-In First-Out

Swap out the page that has been in memory the longest
• Works well for swapping out initialization code
• Not so good for often-used code

Example to work through by hand (three page frames):
  2 3 2 1 5 2 4 5 3 2 5 2


FIFO: Belady's Anomaly

For FIFO, adding extra page frames can cause more page faults

Example reference string: <0 1 2 3 0 1 4 0 1 2 3 4>
• Work it with three page frames
• Then with four page frames (the simulation below counts the faults)
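Belady's anomaly is easy to check mechanically. Below is a self-contained FIFO simulator in C for the reference string above; with three frames it counts 9 page faults, with four frames 10. The helper name fifo_faults is mine, not from the slides.

```c
/* FIFO page-replacement simulator for the reference string
   <0 1 2 3 0 1 4 0 1 2 3 4>, demonstrating Belady's anomaly. */
#include <stdio.h>
#include <string.h>

static int fifo_faults(const int *refs, int n, int nframes) {
    int frames[16];
    int used = 0, next = 0, faults = 0;   /* next = oldest slot (FIFO hand) */
    memset(frames, -1, sizeof frames);

    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (frames[j] == refs[i]) { hit = 1; break; }
        if (!hit) {
            faults++;
            if (used < nframes) frames[used++] = refs[i];                 /* fill a free frame */
            else { frames[next] = refs[i]; next = (next + 1) % nframes; } /* evict the oldest */
        }
    }
    return faults;
}

int main(void) {
    int refs[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};
    int n = sizeof refs / sizeof refs[0];
    printf("3 frames: %d faults\n", fifo_faults(refs, n, 3));  /* 9 */
    printf("4 frames: %d faults\n", fifo_faults(refs, n, 4));  /* 10: more frames, more faults */
    return 0;
}
```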
Least Recently Used (LRU)

Remove the page that has been unused the longest
• Implementation:
  – Keep a counter in the PTE. Increment it on use. To evict, find the PTE with the lowest counter
  – Or, keep a linked list ordered by usage

Example reference string: <0 1 2 3 0 1 4 0 1 2 3 4>

Example to work through by hand (three page frames):
  2 3 2 1 5 2 4 5 3 2 5 2


Clock (or Second-Chance)

Choose the oldest page that hasn't been referenced
• Hardware support:
  – Pages are kept in a circular list
  – The R (referenced) bit is maintained by hardware in the PTE
    - HW: whenever a PTE is accessed (read or write for that page), the R bit is set to 1
    - SW: can set the R bit to 0 or 1
  – When a page is loaded, its R bit is set to 1
  – A hand points to a particular page. When a page is needed, check the R bit of the page under the hand
    - If set, clear it and move to the next page
    - If not set, this is the page to free

Example to work through by hand (three page frames):
  2 3 2 1 5 2 4 5 3 2 5 2


Clock

Two levels of pages:
• Old pages (those not referenced in the last clock cycle)
• New pages (those referenced in the last clock cycle)

The algorithm picks one of the old pages
• Not necessarily the oldest (that would be LRU)

Another way to look at it:
• FIFO with a second chance (if the page at the front of the list has been referenced, clear its reference bit and put it at the back of the list)

Can it loop infinitely? (See the sketch below.)


Nth Chance

Clock gives a second chance, so it can distinguish only 2 ages

Give n chances instead
• Don't evict a page unless the hand has swept by it n times
• Needs a counter in the PTE

The higher we make n, the closer it approximates LRU
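As for whether the clock hand can loop forever: no, because each pass clears R bits, so within one full revolution some page must have R = 0. Here is a minimal sketch of the sweep in C, with the R bits in an ordinary array (in a real system the MMU sets them on each access); names like clock_evict and page_in_frame are illustrative.

```c
/* Clock (second-chance) sweep over a small frame table. */
#include <stdio.h>

#define NFRAMES 3

static int page_in_frame[NFRAMES] = {2, 3, 1};  /* resident pages */
static int r_bit[NFRAMES]         = {1, 0, 1};  /* referenced since last sweep? */
static int hand = 0;                            /* the clock hand */

/* Sweep until a frame with R == 0 is found, clearing R bits on the way.
   Terminates: after one full revolution every R bit has been cleared. */
int clock_evict(void) {
    for (;;) {
        if (r_bit[hand] == 0) {
            int victim = hand;
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
        r_bit[hand] = 0;                 /* give the page its second chance */
        hand = (hand + 1) % NFRAMES;
    }
}

int main(void) {
    int victim = clock_evict();
    printf("evict page %d from frame %d\n", page_in_frame[victim], victim);
    return 0;
}
```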