Slides for Lecture 12
ENCM 501: Principles of Computer Architecture
Winter 2014 Term
Steve Norman, PhD, PEng
Electrical & Computer Engineering
Schulich School of Engineering
University of Calgary
25 February, 2014
slide 2/19
Previous Lecture
◮ more about multi-level caches
◮ classifying cache misses: the 3 C’s
◮ introduction to virtual memory
slide 3/19
Today’s Lecture
◮ Continued explanation of virtual memory.
Related reading in Hennessy & Patterson: Sections B.4–B.5
slide 4/19
Quick review of address translation

virtual address:   [ virtual page number  | page offset ]
                     translation            straight copy (no translation!)
physical address:  [ physical page number | page offset ]

The master list of VPN-to-PPN translations for a single process is maintained by the O/S kernel in a data structure called a page table.
TLBs are circuits capable of doing some of these translations very quickly.
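The split-and-recombine step in the picture above can be sketched in a few lines of Python. This is only an illustration, not how hardware or a kernel does it: it assumes 4 KB pages (12 offset bits), and the names `split_virtual_address`, `translate` and the dictionary standing in for a page table are hypothetical.

```python
PAGE_OFFSET_BITS = 12  # assumption: 4 KB pages, so 12 bits of page offset


def split_virtual_address(va):
    """Return (VPN, page offset) for a virtual address."""
    vpn = va >> PAGE_OFFSET_BITS
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    return vpn, offset


def translate(va, page_table):
    """Translate the VPN through page_table; copy the offset unchanged."""
    vpn, offset = split_virtual_address(va)
    ppn = page_table[vpn]  # raises KeyError if there is no translation
    return (ppn << PAGE_OFFSET_BITS) | offset


# Hypothetical one-entry page table: VPN 0x7fffff567 maps to PPN 0x13579bd.
example_table = {0x7fffff567: 0x13579bd}
example_pa = translate(0x7fffff567abc, example_table)
```

Only the page number changes; the low 12 bits (`0xabc` here) pass straight through to the physical address.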
slide 5/19
A couple of questions about address translation (1)
Processes 98 and 99 are running at the same time.
Suppose that 0x7fffff567 is the VPN for a page for process 98’s stack, and the corresponding PPN is 0x13579bd.
Suppose that 0x7fffff567 is also the VPN for a page for process 99’s stack.
What can we conclude about the VPN-to-PPN translation for VPN 0x7fffff567 in process 99?
slide 6/19
A couple of questions about address translation (2)
As on the previous slide, processes 98 and 99 are running at the same time.
Suppose that 0x000000400 is the VPN for a page for process 98’s instructions, and the corresponding PPN is 0x1234567.
Suppose that 0x000000400 is also the VPN for a page for process 99’s instructions.
What can we conclude about the VPN-to-PPN translation for VPN 0x000000400 in process 99?
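One way to think about the two questions is to model each process's page table as its own dictionary. The sketch below shows one plausible outcome, not the only one: it assumes stack pages are private to a process (so the same VPN must map to a different physical page), and that the two processes happen to run the same program, so a read-only instruction page can be shared. The PPN `0x2468ace` and the helper `vpns_sharing_a_physical_page` are made up for illustration.

```python
# Each process has its own page table (modelled here as VPN -> PPN dicts).
page_table_98 = {
    0x7fffff567: 0x13579bd,  # stack page (from the slide)
    0x000000400: 0x1234567,  # instruction page (from the slide)
}
page_table_99 = {
    0x7fffff567: 0x2468ace,  # hypothetical PPN: a different, private stack page
    0x000000400: 0x1234567,  # assumption: same program, so the page is shared
}


def vpns_sharing_a_physical_page(pt_a, pt_b):
    """Return the sorted VPNs that map to the same PPN in both page tables."""
    return sorted(vpn for vpn in pt_a if vpn in pt_b and pt_a[vpn] == pt_b[vpn])


shared = vpns_sharing_a_physical_page(page_table_98, page_table_99)
```

If the two processes were running different programs, the instruction pages would map to different PPNs as well, so the second question does not have a single forced answer.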
slide 7/19
Linux / Mac OS X virtual address spaces on x86-64
Pointers are 64 bits wide, but only the least significant 48 bits are used in a virtual address.

byte address               region
0xffff ffff ffff ffff
0xffff ffff ffff fffe      virtual address space
  . . .                    for O/S kernel
0xffff 8000 0000 0000

                           HUGE range of invalid addresses

0x0000 7fff ffff ffff
0x0000 7fff ffff fffe      virtual address space
  . . .                    for user processes
0x0000 0000 0000 0000

(For 64-bit Microsoft Windows, the picture is either identical, or not quite the same but very similar.)
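The three regions in the address map can be told apart by looking at bits 63 to 47 of an address: all zeros in the user range, all ones in the kernel range, and anything else in the invalid gap. The function below is a small sketch of that check; the name `classify_address` and the string labels are illustrative choices, not part of any real API.

```python
def classify_address(addr):
    """Classify a 64-bit virtual address using the ranges shown on the slide."""
    if not 0 <= addr < (1 << 64):
        raise ValueError("address must fit in 64 bits")
    top_bits = addr >> 47  # bits 63..47, seventeen bits in total
    if top_bits == 0:
        return "user"      # 0x0000 0000 0000 0000 .. 0x0000 7fff ffff ffff
    if top_bits == 0x1FFFF:
        return "kernel"    # 0xffff 8000 0000 0000 .. 0xffff ffff ffff ffff
    return "invalid"       # everything in between
```

For example, `classify_address(0x0000_8000_0000_0000)` falls just past the top of the user range and lands in the invalid gap.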
slide 8/19
A page table for an x86-64 Linux process
The normal page size is 4 KB. So bits 11–0 of an address are page offset, and bits 46–12 of a virtual address are VPN (virtual page number).
Conceptually, a page table is just an array of PTEs (page table entries), where the indexes are VPNs:

VPN              contents
0x7fff ffff f    64-bit PTE
0x7fff ffff e    64-bit PTE
  . . .            . . .
0x0000 0000 1    64-bit PTE
0x0000 0000 0    64-bit PTE
slide 9/19
Suppose that a page table really is just a big array, as shown on the previous slide. How much space would such a page table occupy?
The answer to the above question is a totally unreasonable number, so we’ll need to use more complex and much more space-efficient data structures for page tables.
Let’s worry about the data structures later, and continue for a while with the simple model that a page table is just a big array of PTEs.
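Taking the simple-array model at face value, the size follows directly from the numbers already given: a VPN made of bits 46–12 and 64-bit PTEs. The arithmetic is sketched below; the variable names are illustrative only.

```python
VPN_BITS = 46 - 12 + 1       # bits 46..12 of a virtual address: 35 bits
PTE_BYTES = 64 // 8          # each PTE is 64 bits wide

entries = 1 << VPN_BITS              # one PTE per possible VPN
table_bytes = entries * PTE_BYTES    # 2**35 * 2**3 = 2**38 bytes
table_gib = table_bytes // 2**30     # expressed in GiB
```

That works out to 2^38 bytes, or 256 GiB, for each process, which is why the flat array cannot be the real data structure.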
slide 10/19
What information is in a PTE?
A PTE answers several different questions about a virtual page. Here is an incomplete list:
◮ First, does the virtual page even exist? (For a typical x86-64 Linux process, the vast majority of VPNs in the range from 0x0000 0000 0 to 0x7fff ffff f correspond to non-existent virtual pages.)
◮ If the page exists, is it present in physical memory?
◮ If the page is present, what is the PPN (physical page number)?
◮ What are the permissions for the page: can the process write to the page, and can it fetch instructions from the page?
slide 11/19
PTE formats in x86-64 Linux (1)
First, let’s look at a PTE for a page that does not exist. I haven’t found documentation to confirm this, but I’m pretty sure that 64 zeros indicate that there is no page corresponding to a VPN:

bit numbers within PTE
  bits 63–0    0 0 0 0 · · · (all zeros)
slide 12/19
PTE formats in x86-64 Linux (2)
Now let’s look at a PTE for a page that does exist, and is present in physical memory.
How can a page exist but NOT be present in physical memory?
Okay, back to the PTE format for a page that is present . . .

bit numbers within PTE
  bit 63        XD
  bits 51–12    up to 40 bits for PPN
  bits 8–2      more page status bits
  bit 1         R/W
  bit 0         P = 1
  (the remaining bits are unused)

Let’s make some notes about the P, R/W and XD bits.
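The bit positions above translate into a handful of masks and shifts. The sketch below decodes a present PTE under two assumptions that follow the usual x86-64 convention rather than anything stated on the slide: R/W = 1 means the page is writable, and XD = 1 ("execute disable") means instructions cannot be fetched from it. The function name and the dictionary keys are hypothetical.

```python
P_BIT = 1 << 0            # bit 0: present
RW_BIT = 1 << 1           # bit 1: R/W (assumed: 1 means writable)
XD_BIT = 1 << 63          # bit 63: XD (assumed: 1 means no instruction fetch)
PPN_SHIFT = 12            # PPN occupies bits 51..12
PPN_MASK = (1 << 40) - 1  # up to 40 bits of PPN


def decode_present_pte(pte):
    """Extract the PPN and permission flags from a PTE whose P bit is 1."""
    if not pte & P_BIT:
        raise ValueError("P bit is 0: this format applies only to present pages")
    return {
        "ppn": (pte >> PPN_SHIFT) & PPN_MASK,
        "writable": bool(pte & RW_BIT),
        "executable": not (pte & XD_BIT),
    }


# Hypothetical PTE: present, writable, not executable, PPN 0x13579bd.
example_pte = XD_BIT | (0x13579bd << PPN_SHIFT) | RW_BIT | P_BIT
example_fields = decode_present_pte(example_pte)
```

A stack page would typically look like `example_pte`: writable but not executable.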
slide 13/19
PTE formats in x86-64 Linux (3)
And here is a PTE for a page that exists, but is not present in physical memory.

bit numbers within PTE
  bits 63–1    page location on disk, other info about page
  bit 0        P = 0

We won’t go into detail about bits 63–1, but if the assumption on slide 11 is correct, they must not all be zero.
Source for information on this slide and slide 12: Bryant, R. E. and O’Hallaron, D. R., Computer Systems: A Programmer’s Perspective, 2nd ed., published by Prentice Hall.
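Putting the three PTE formats side by side, a PTE can be sorted into one of three cases with two tests. This sketch relies on the unconfirmed assumption from slide 11 that 64 zero bits mean "no such page"; the function name and the returned labels are illustrative.

```python
def classify_pte(pte):
    """Return which of the three PTE formats a 64-bit PTE value matches."""
    if pte == 0:
        return "page does not exist"        # assumption from slide 11: all 64 bits zero
    if pte & 1:
        return "present in physical memory"  # P bit (bit 0) is 1
    return "exists but not present"          # P = 0, bits 63..1 not all zero
```

The order of the tests matters: an all-zero PTE also has P = 0, so it has to be checked before the "not present" case.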
slide 14/19
Review of P3/P4 memory system structure
[Figure: block diagram with these components: core, I-TLB, L1 I-cache, D-TLB, L1 D-cache, unified L2 cache, DRAM controller, DRAM modules.]
On every instruction fetch, the I-TLB must attempt to translate a virtual instruction address into a physical instruction address.
On every data read or write, the D-TLB must attempt to translate a virtual data address into a physical data address.
slide 15/19
TLB structure
A TLB is essentially a cache for page table information.
A page table is a complete list of the statuses of all of the virtual pages belonging to a process.
A TLB contains some of the most recently accessed information in a page table.
slide 16/19
TLB hits
Let’s outline:
◮ how a TLB hit is detected;
◮ what happens as a result of a TLB hit.
slide 17/19
Simple TLB misses
The simplest form of a TLB miss occurs when there is a valid VPN-to-PPN translation, which is in the page table, but not in the TLB.
Let’s describe how such a TLB miss is handled.
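The idea of a TLB as a cache for page table information, with hits and simple misses, can be mimicked in software. The toy model below is a sketch under several assumptions that the slides do not state: the TLB is fully associative, it replaces the least recently used entry, and a simple miss is handled by copying the translation from the page table into the TLB. Real TLBs are hardware, and their organization and miss handling vary by processor. The class name `TinyTLB` and its fields are hypothetical.

```python
from collections import OrderedDict


class TinyTLB:
    """Toy fully associative TLB with LRU replacement (assumed policy)."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table   # dict: VPN -> PPN, stands in for the page table
        self.entries = OrderedDict()   # cached translations, least recently used first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        if vpn in self.entries:
            # TLB hit: the VPN matches a stored entry, so the PPN is available at once.
            self.hits += 1
            self.entries.move_to_end(vpn)
            return self.entries[vpn]
        # Simple TLB miss: the translation is valid and is in the page table.
        self.misses += 1
        ppn = self.page_table[vpn]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[vpn] = ppn
        return ppn


# Small demonstration with made-up page numbers.
tlb = TinyTLB(capacity=2, page_table={1: 10, 2: 20, 3: 30})
results = [tlb.lookup(vpn) for vpn in (1, 2, 1, 3, 2)]
```

Every lookup returns the same PPN that the page table would give; the only thing the TLB changes is how many lookups have to go to the page table.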
slide 18/19
DRAM, disk storage and flash memory
Here’s a story that is simple, easy to understand, but not actually true . . .
◮ Instructions and data belonging to the kernel and to processes are in DRAM.
◮ I-caches and D-caches allow processor cores to access instructions and data much faster than if all such accesses really had to go to DRAM.
◮ Non-volatile storage, such as magnetic disks and flash memory arrays, is used for file storage.
That’s actually a good model to start with, but it’s wrong! What is a more accurate model?
slide 19/19
Upcoming Topics
Short-term:
◮ Completion of material on virtual memory.
◮ Simple pipelining.
Related reading in Hennessy & Patterson: Sections B.4–B.5, Appendix C.
Big topics for the second half of the course:
◮ Instruction-level parallelism.
◮ Thread-level parallelism.
Related reading in Hennessy & Patterson: Appendix C, Chapters 3 and 5.