8/31/12

x86 Memory Protection and Translation
Don Porter
CSE 506

Logical Diagram
[Figure: course map of OS components. User: Binary Formats, Memory Allocators, Threads. Kernel (entered via System Calls): RCU, File System, Networking, Sync, Memory Management, Device Drivers, CPU Scheduler. Hardware: Interrupts, Disk, Net, Consistency. Callout: Today's Lecture (focus on hardware ABI).]

Lecture Goal
- Understand the hardware tools available on a modern x86 processor for manipulating and protecting memory
- Lab 2: You will program this hardware
- Apologies: Material can be a bit dry, but important
  - Plus, slides will be good reference
- But, cool tech tricks:
  - How does thread-local storage (TLS) work?
  - An actual (and tough) Microsoft interview question

Undergrad Review
- What is:
  - Virtual memory?
  - Segmentation?
  - Paging?

Memory Mapping
[Figure: Process 1 and Process 2 each have their own Virtual Memory containing address 0x1000. Physical Memory has only one physical address 0x1000!!]

    // Program expects (*x)
    // to always be at
    // address 0x1000
    int *x = 0x1000;

Two System Goals
1) Provide an abstraction of contiguous, isolated virtual memory to a program
2) Prevent illegal operations
   - Prevent access to other application or OS memory
   - Detect failures early (e.g., segfault on address 0)
   - More recently, prevent exploits that try to execute program data

Outline
- x86 processor modes
- x86 segmentation
- x86 page tables
- Software vs. Hardware mechanisms
- Advanced Features
- Interesting applications/problems

x86 Processor Modes
- Real mode: walks and talks like a really old x86 chip
  - State at boot
  - 20-bit address space, direct physical memory access
  - 1 MB of usable memory
  - Segmentation available (no paging)
- Protected mode: Standard 32-bit x86 mode
  - Segmentation and paging
  - Privilege levels (separate user and kernel)

x86 Processor Modes, cont.
- Long mode: 64-bit mode (aka amd64, x86_64, etc.)
  - Very similar to 32-bit mode (protected mode), but bigger
  - Restrict segmentation use
  - Garbage collect deprecated instructions
  - Chips can still run in protected mode with old instructions

Translation Overview
[Figure: 0xdeadbeef (Virtual Address) -> Segmentation -> 0x0eadbeef (Linear Address) -> Paging -> 0x6eadbeef (Physical Address). The Paging stage is labeled "Protected/Long mode only".]
- Segmentation cannot be disabled!
  - But can be a no-op (aka flat mode)

x86 Segmentation
- A segment has:
  - Base address (linear address)
  - Length
  - Type (code, data, etc.)

Programming model
- Segments for: code, data, stack, "extra"
  - A program can have up to 6 total segments
  - Segments identified by registers: cs, ds, ss, es, fs, gs
- Prefix all memory accesses with desired segment:
  - mov eax, ds:0x80 (load offset 0x80 from data into eax)
  - jmp cs:0xab8 (jump execution to code offset 0xab8)
  - mov ss:0x40, ecx (move ecx to stack offset 0x40)
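
The base-plus-offset step in the Translation Overview slide can be sketched in a few lines of C. This is a minimal model, not how the hardware or JOS represents descriptors: `struct segment` and `seg_translate` are hypothetical names, and the "length" is modeled as the highest valid offset. The base value 0x30000000 used in the usage note is an assumption, chosen only because it reproduces the slide's 0xdeadbeef to 0x0eadbeef example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified segment: only the fields the slides list. */
struct segment {
    uint32_t base;   /* linear address where the segment starts */
    uint32_t limit;  /* highest valid offset within the segment */
};

/* Translate a segment-relative offset into a linear address.
 * Returns false (a fault, in hardware terms) when the offset
 * lies beyond the segment limit. */
static bool seg_translate(const struct segment *seg, uint32_t offset,
                          uint32_t *linear)
{
    assert(seg != NULL && linear != NULL);
    if (offset > seg->limit)
        return false;
    *linear = seg->base + offset;  /* unsigned add wraps modulo 2^32 */
    return true;
}
```

With a segment of base 0x30000000 and limit 0xffffffff, offset 0xdeadbeef translates to linear address 0x0eadbeef, because the 32-bit addition wraps around.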

Programming, cont.
- This is cumbersome, so infer code, data and stack segments by instruction type:
  - Control-flow instructions use code segment (jump, call)
  - Stack management (push/pop) uses stack
  - Most loads/stores use data segment
- Extra segments (es, fs, gs) must be used explicitly

Segmented Programming Pseudo-example
C source:

    // global
    int x = 1;
    int y; // stack
    if (x) {
        y = 1;
        printf("Boo");
    } else
        y = 0;

Same program with the segments written out:

    ds:x = 1; // data
    ss:y; // stack
    if (ds:x) {
        ss:y = 1;
        cs:printf(ds:"Boo");
    } else
        ss:y = 0;

Segments would be used in assembly, not C

Segment management
- For safety (without paging), only the OS should define segments. Why?
- Two segment tables the OS creates in memory:
  - Global: any process can use these segments
  - Local: segment definitions for a specific process
- How does the hardware know where they are?
  - Dedicated registers: gdtr and ldtr
  - Privileged instructions: lgdt, lldt

Segment registers
    | Table Index (13 bits) | Global or Local Table? (1 bit) | Ring (2 bits) |
- Set by the OS on fork, context switch, etc.

Sample Problem: (Old) JOS Bootloader
- Suppose my kernel is compiled to be in upper 256 MB of a 32-bit address space (i.e., 0xf0100000)
  - Common to put OS kernel at top of address space
- Bootloader starts in real mode (only 1 MB of addressable physical memory)
- Bootloader loads kernel at 0x00100000
  - Can't address 0xf0100000

Booting problem
- Kernel needs to set up and manage its own page tables
  - Paging can translate 0xf0100000 to 0x00100000
- But what to do between the bootloader and kernel code that sets up paging?
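
The segment register layout above (13-bit table index, 1-bit global/local flag, 2-bit ring) can be unpacked with shifts and masks. The sketch below assumes the standard x86 bit order, with the index in the high bits and the ring in the low two bits. The struct and function names are illustrative and do not come from JOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of a 16-bit segment selector (field names are illustrative). */
struct selector_fields {
    uint16_t index;  /* bits 15..3: entry number in the descriptor table */
    bool     local;  /* bit 2: false = global table, true = local table */
    uint8_t  ring;   /* bits 1..0: privilege level, 0 (kernel) to 3 (user) */
};

/* Split a raw selector value into its three fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = (uint16_t)(sel >> 3);
    f.local = ((sel >> 2) & 0x1u) != 0;
    f.ring  = (uint8_t)(sel & 0x3u);
    return f;
}

/* Build a selector from its fields; the inverse of decode_selector. */
static uint16_t make_selector(uint16_t index, bool local, uint8_t ring)
{
    assert(index < (1u << 13) && ring < 4);
    return (uint16_t)((index << 3) | ((local ? 1u : 0u) << 2) | ring);
}
```

For example, the selector value 0x08 decodes to index 1 in the global table at ring 0, and 0x1b decodes to index 3 in the global table at ring 3.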

Segmentation to the Rescue!
- kern/entry.S:

    mygdt:
      SEG_NULL                                  # null seg
      SEG(STA_X|STA_R, -KERNBASE, 0xffffffff)   # code seg
      SEG(STA_W, -KERNBASE, 0xffffffff)         # data seg

- What is this code doing?

JOS ex 1, cont.

    SEG(STA_X|STA_R, -KERNBASE, 0xffffffff)   # code seg

- STA_X|STA_R: Execute and Read permission
- -KERNBASE: Offset -0xf0000000
- 0xffffffff: Segment Length (4 GB)

    jmp 0xf0100db8                    # virtual addr. (implicit cs seg)
    jmp (0xf0100db8 + -0xf0000000)
    jmp 0x00100db8                    # linear addr.

Flat segmentation
- The above trick is used for booting. We eventually want to use paging.
- How can we make segmentation a no-op?
- From kern/pmap.c:

    // 0x8 - kernel code segment
    [GD_KT >> 3] = SEG(STA_X | STA_R, 0x0, 0xffffffff, 0),

- STA_X | STA_R: Execute and Read permission
- 0x0: Offset 0x00000000
- 0xffffffff: Segment Length (4 GB)
- 0: Ring 0

Outline
- x86 processor modes
- x86 segmentation
- x86 page tables
- Software vs. Hardware mechanisms
- Advanced Features
- Interesting applications/problems

Paging Model
- 32 (or 64) bit address space.
- Arbitrary mapping of linear to physical pages
- Pages are most commonly 4 KB
  - Newer processors also support page sizes of 2 and 4 MB and 1 GB

How it works
- OS creates a page table
  - Any old page with entries formatted properly
  - Hardware interprets entries
- cr3 register points to the current page table
  - Only ring0 can change cr3
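
The boot-time trick relies on 32-bit wrap-around: a segment base of -KERNBASE is the same bit pattern as 2^32 - KERNBASE, so adding it to a high kernel virtual address lands in low physical memory. The sketch below works through that arithmetic. KERNBASE is taken from the slide's offset of -0xf0000000, while the function names are hypothetical and the sanity check in `boot_linear` is an assumption of this sketch, not something the hardware enforces.

```c
#include <assert.h>
#include <stdint.h>

#define KERNBASE 0xf0000000u  /* from the slide's "-0xf0000000" segment offset */

/* Base of the boot-time segments: -KERNBASE taken modulo 2^32. */
static uint32_t boot_segment_base(void)
{
    return (uint32_t)0 - KERNBASE;
}

/* Linear address produced by the boot-time segments for a kernel virtual
 * address. The assert is a sketch-level sanity check that the address was
 * linked at or above KERNBASE. */
static uint32_t boot_linear(uint32_t vaddr)
{
    assert(vaddr >= KERNBASE);
    return vaddr + boot_segment_base();  /* wraps modulo 2^32 */
}

/* Linear address under flat segmentation: base 0 makes it a no-op. */
static uint32_t flat_linear(uint32_t vaddr)
{
    return vaddr + 0u;
}
```

Here `boot_segment_base()` evaluates to 0x10000000, so the kernel's link address 0xf0100000 becomes linear address 0x00100000, which is where the bootloader placed the kernel. Under the flat segments the same address passes through unchanged and paging has to do the translation instead.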

Translation Overview
[Figure: two-level page table walk starting from cr3. From Intel 80386 Programmer's Reference Manual.]

Example
Linear address 0xf1084150 splits into:
- Page Dir Offset: 0x3c4 (top 10 addr bits: 0xf10 >> 2)
- Page Table Offset: 0x84 (next 10 addr bits)
- Physical Page Offset: 0x150 (low 12 addr bits)
Walk, starting from cr3:
- Page directory: entry at cr3 + 0x3c4 * sizeof(PTE)
- Page table: entry at 0x84 * sizeof(PTE)
- Physical page: data we want at offset 0x150

Page Table Entries
- Top 20 bits are the physical address of the mapped page
  - Why 20 bits?
  - 4k page size == 12 bits of offset
- Lower 12 bits for flags

Page Table Entries
[Figure: cr3 points to a table of entries]

    Physical Address Upper (20 bits) | Flags (12 bits)
    0x00384                          | PTE_W|PTE_P|PTE_U
    0                                | 0
    0x28370                          | PTE_W|PTE_P
    0                                | 0
    0                                | 0
    ...                              | ...

Page flags
- 3 for OS to use however it likes
- 4 reserved by Intel, just in case
- 3 for OS to CPU metadata
  - User vs. kernel page,
  - Write permission,
  - Present bit (so we can swap out pages)
- 2 for CPU to OS metadata
  - Dirty (page was written), Accessed (page was read or written)

Page Table Entries
[Figure: the same table, annotated]

    Physical Address Upper (20 bits) | Flags (12 bits)             | Meaning
    0x00384                          | PTE_W|PTE_P|PTE_U           | User, writable, present
    0                                | 0                           | No mapping
    0x28370                          | PTE_W|PTE_P|PTE_DIRTY       | Writeable, kernel-only, present, and dirty (Dirty set by CPU on write)
    ...                              | ...                         |
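
The 10/10/12 split of a linear address and the 20/12 split of a page table entry both reduce to shifts and masks. The sketch below uses hypothetical helper names (JOS has its own macros for this, which are not reproduced here). The flag values are the standard x86 bit positions for present, writable and user.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_P 0x001u  /* present */
#define PTE_W 0x002u  /* writable */
#define PTE_U 0x004u  /* user-accessible */

/* Top 10 bits: index into the page directory. */
static uint32_t pd_index(uint32_t la)    { return (la >> 22) & 0x3ffu; }
/* Next 10 bits: index into the page table. */
static uint32_t pt_index(uint32_t la)    { return (la >> 12) & 0x3ffu; }
/* Low 12 bits: byte offset within the 4 KB page. */
static uint32_t page_offset(uint32_t la) { return la & 0xfffu; }

/* Pack a 20-bit physical frame number and 12 bits of flags into one entry. */
static uint32_t make_pte(uint32_t frame, uint32_t flags)
{
    assert(frame < (1u << 20) && flags < (1u << 12));
    return (frame << 12) | flags;
}

/* Recover the 20-bit physical frame number from an entry. */
static uint32_t pte_frame(uint32_t pte) { return pte >> 12; }

/* True only if the entry is present, writable, and user-accessible. */
static bool pte_allows_user_write(uint32_t pte)
{
    uint32_t need = PTE_P | PTE_W | PTE_U;
    return (pte & need) == need;
}
```

For the example address 0xf1084150, `pd_index` gives 0x3c4, `pt_index` gives 0x84, and `page_offset` gives 0x150. An entry built from frame 0x00384 with all three flags set is 0x00384007, and it permits user writes. The kernel-only entry for frame 0x28370 does not.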

Back of the envelope
- If a page is 4K and an entry is 4 bytes, how many entries per page?
  - 1k
- How large of an address space can 1 page represent?
  - 1k entries * 1 page/entry * 4K/page = 4 MB
- How large can we get with a second level of translation?
  - 1k tables/dir * 1k entries/table * 4k/page = 4 GB
  - Nice that it works out that way!

Challenge questions
- What is the space overhead of paging?
  - I.e., how much memory goes to page tables for a 4 GB address space?
- What is the optimal number of levels for a 64 bit page table?
- When would you use a 2 MB or 1 GB page size?

TLB Entries
- The CPU caches address translations in the TLB
  - Translation Lookaside Buffer
[Figure: a page walk from cr3 ("Page Traversal is Slow") next to the TLB ("Table Lookup is Fast")]

    Virt       | Phys
    0xf0231000 | 0x1000
    0x00b31000 | 0x1f000
    0xb0002000 | 0xc1000
    -          | -

TLB Entries
- The TLB is not coherent with memory, meaning:
  - If you change a PTE, you need to manually invalidate cached values
  - See the tlb_invalidate() function in JOS
[Figure: the page table reached from cr3 is modified while the TLB above keeps its old contents. Annotations: "Same Virt Addr." and "No Change!!!"]

Outline
- x86 processor modes
- x86 segmentation
- x86 page tables
- Software vs. Hardware mechanisms
- Advanced Features
- Interesting applications/problems
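
The stale-translation problem in the TLB slides can be reproduced with a toy model. The sketch below is a software simulation under loud assumptions: a single-level table of 16 pages and a 4-slot direct-mapped cache, neither of which matches real x86 hardware. All names are hypothetical, and `mmu_invalidate` only stands in for the role that tlb_invalidate() plays in JOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGES 16  /* size of the hypothetical one-level page table */
#define TLB_SLOTS 4   /* tiny direct-mapped TLB */

struct tlb_entry { bool valid; uint32_t vpage; uint32_t ppage; };

struct toy_mmu {
    uint32_t page_table[TOY_PAGES];  /* vpage -> ppage; stands in for PTEs in memory */
    struct tlb_entry tlb[TLB_SLOTS]; /* cached translations */
};

/* Start with an empty page table and an empty TLB. */
static void mmu_init(struct toy_mmu *m) { memset(m, 0, sizeof *m); }

/* Translate a virtual page number, consulting the TLB before the table. */
static uint32_t mmu_translate(struct toy_mmu *m, uint32_t vpage)
{
    assert(vpage < TOY_PAGES);
    struct tlb_entry *e = &m->tlb[vpage % TLB_SLOTS];
    if (e->valid && e->vpage == vpage)
        return e->ppage;              /* hit: the page table is not consulted */
    e->valid = true;                  /* miss: walk the table, cache the result */
    e->vpage = vpage;
    e->ppage = m->page_table[vpage];
    return e->ppage;
}

/* Drop any cached translation for vpage. */
static void mmu_invalidate(struct toy_mmu *m, uint32_t vpage)
{
    struct tlb_entry *e = &m->tlb[vpage % TLB_SLOTS];
    if (e->valid && e->vpage == vpage)
        e->valid = false;
}
```

If page 3 is mapped to 0x10, translated once, and then remapped to 0x20 by writing `page_table[3]` directly, the next `mmu_translate` still returns 0x10. Only after `mmu_invalidate(&m, 3)` does the lookup return 0x20, which is the "Same Virt Addr., No Change!!!" situation from the slide.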