CS 5460: Operating Systems
Lecture 17: Intro to File Systems (Ch. 10)
Important From Last Time
Page replacement algorithms
– Optimal page replacement strategy evicts the page whose next use is farthest in the future
– LRU is a decent approximation of optimal
– Clock / second chance algorithm is a single-bit approximation of LRU; it works well for most workloads
Thrashing happens when working sets do not fit into RAM
– Response: Swap out entire processes
– Last resort: Start killing processes
Copy-on-write optimizations
Memory-mapped file optimizations
Filesystem Layers
User’s viewpoint:
– Objects: Files, directories, bytes
– Operations: Create, read, write, delete, rename, move, seek, set attributes
Physical viewpoint:
– Objects: Sectors, tracks, disks
– Operations: Seek, read block, write block
User ↔ OS layer: Open() | Close() | Read() | Write()
– User library hides many details
– OS can directly read/write user data
OS ↔ Hardware layer: Seek() | ReadBlk() | WriteBlk()
– I/O registers
– Interrupts
– DMA
[Figure: layer stack. User apps and library → (trap) → OS → (I/O registers, interrupts, DMA) → disk hardware]
Typical Disk Organization
Coated with magnetic material that encodes bits
– Capacity increases come from improvements in bit density
Logically divided into:
– Spindles: individual disks
– Tracks: rings on a disk
– Sectors: portions of a track
– Cylinders: stacks of tracks
Read/write data (overview):
– Position disk head over track
– Wait for sector to rotate under head
– Read/write data from/to sector
[Figure: disk diagram labeling cylinder/track, disk arm, rotation, R/W head, block/sector, and spindle]
Disk Organization (cont’d)
Disk physics:
– Modern disks spin at 5400, 7200, 10000, and 15000 rpm
– Outside edge of a 3.5” disk spins at over 150 mph (at 15000 rpm)
– Disk head “floats” on a very thin cushion of air above the platter
» Bernoulli effect used to “fly” as close as possible
» Head crash is exactly that → disk head contacts the surface
Disks organized as stacks of platters:
– Disk heads mounted on “combs” → often heads on both sides
– Separate disk heads moved independently
Disk controller
– Manages all the independent head movements
– Contains RAM to cache data read from or written to disk
– Accepts commands from CPU → responds using DMA/interrupts
Disk Hardware Trends
Year  Model        Size     Interface  Seek    RPM     Price
2001  ST320011A    20 GB    ATA/IDE    9.0 ms  7,200   $92
2001  ST318437LC   18 GB    U2-SCSI    3.6 ms  15,000  $329
2005  ST3120814A   120 GB   ATA-100    8.5 ms  7,200   $80
2005  ST373453LC   73 GB    U-SCSI     3.6 ms  15,000  $263
2007  ST3320820A   320 GB   PATA       4 ms    7,200   $99
2007  ST3146855LW  147 GB   Ultra320   2 ms    15,000  $313
2012  ST2000DM001  2000 GB  SATA       4.1 ms  7,200   $130
2012  ST3600057SS  600 GB   SAS        2 ms    15,000  $500
Eliminating seeks is critical to performance!
Data: Seagate, NewEgg, dirtcheapdrives.com
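A back-of-the-envelope calculation helps show why seeks matter so much. The sketch below is only an estimate under stated assumptions: the target sector is on average half a revolution away, and transfer time and controller overhead are ignored. The function names are hypothetical and chosen for illustration.

```c
#include <assert.h>

/* Average rotational latency in milliseconds.
 * Assumption: on average the target sector is half a revolution away.
 * One revolution takes 60,000 ms / rpm. */
double avg_rotational_latency_ms(double rpm)
{
    assert(rpm > 0.0);
    return 0.5 * (60000.0 / rpm);
}

/* Rough cost of one random block access: seek plus rotational latency.
 * Transfer time and controller overhead are deliberately ignored. */
double random_access_ms(double seek_ms, double rpm)
{
    return seek_ms + avg_rotational_latency_ms(rpm);
}

/* Random accesses per second that this estimate allows. */
double random_ios_per_sec(double seek_ms, double rpm)
{
    return 1000.0 / random_access_ms(seek_ms, rpm);
}
```

Plugging in the 2012 SAS drive from the table (2 ms seek, 15,000 rpm) gives about 4 ms per random access, or roughly 250 random accesses per second, which is why file systems work hard to avoid seeks.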
How to avoid seeks?
Design file system carefully
Use RAM as a cache for disk
– Once a block is read, cache it as long as possible
– When a block is written to, delay the actual write
Combine hard disk with solid state disk (SSD)
Replace hard disk with SSD
RAM cloud
What Do File System Users Need?
Persistence: Data persists beyond jobs, crashes, …
– Disk provides basic non-volatile storage
– OS can enhance persistence via redundancy
Speed: Fast access to data
– Random access handled efficiently
– OS can enhance performance via file caching
Size: Can store lots of data
Sharing/protection:
– Users can control who/what has access to their data
Ease of use:
– Basic file abstraction (names, offsets, byte streams, …)
– Directories simplify naming and lookup
File System Abstractions
File: Basic container of persistent data
– Unix: flat byte stream
– IBM mainframes: series of records or objects
Directory system: Hierarchical naming relationships
– Directories are special “files” that index other files
– OS exports operations to manage directories indirectly
Common file access patterns:
– Sequential: data processed in order, a byte/record at a time
» Example: Compiler reading a source file
– Random access: address blocks of data based on file offset
» Example: Demand paging reads, database searches
– Keyed access: address blocks based on “key” values
» Typically implemented using pairs of a key file (hash) and a data file
Common File System Operations
Naming operations:
– HardLink()
– SoftLink()
– Rename()
Data operations:
– Create()
– Delete()
– Open()
– Close()
– Read()
– Write()
– Seek()
Attribute operations:
– SetAttribute()
– GetAttribute()
Attributes include owner, protection, last accessed
File System Data Structures
Kernel (in-memory) structures
– Global open file table
– Per-process open file table
– Free (disk) block list
– Free inode list
– File buffer cache: Cached disk blocks
– Inode cache
– Name cache
On-disk structures
– Superblock: File system format info
– File: Collection of blocks/bytes
– File descriptor (inode): File metadata
– Directory: Special kind of file
– Free block/inode maps
Key: Provide the mapping File inode → File contents → Disk contents efficiently and safely.
Key In-Memory Data Structures
Open file table: shared by all processes with the file open
– Open count and “deleted” flag
– Copy of (or pointer to) file’s inode
– Location of file blocks in file buffer cache (see below)
Per-process file table: private for each process
– Pointer to entry in global open file table
– Current position in the file (“seek” pointer)
– Access mode (read, write, read-write)
File buffer cache: cache of file data blocks
– Indexed by file-blocknum pairs (hash structure)
– Used to reduce effective access time of disk operations
– Can hold blocks from user files, directories, file system metadata
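The relationship between the two tables can be sketched in C. This is a minimal illustration that follows the layout on the slide (seek pointer and access mode in the per-process entry), not the code of any particular kernel. All type names, function names, and table sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GLOBAL 16   /* hypothetical table sizes */
#define MAX_FDS     8

enum access_mode { MODE_READ, MODE_WRITE, MODE_RDWR };

/* Global open file table entry: one per open file, shared by all
 * processes that currently have the file open. */
struct open_file {
    unsigned inode_num;  /* stands in for a cached copy of the inode */
    int open_count;      /* number of per-process entries pointing here */
    int deleted;         /* set if the file is deleted while still open */
};

/* Per-process file table entry: private to one process. */
struct fd_entry {
    struct open_file *file;  /* NULL when the slot is unused */
    long offset;             /* current "seek" pointer */
    enum access_mode mode;
};

struct process {
    struct fd_entry fds[MAX_FDS];
};

static struct open_file global_table[MAX_GLOBAL];

/* Find the global entry for an inode, or claim a free slot for it. */
static struct open_file *global_get(unsigned inode_num)
{
    struct open_file *free_slot = NULL;
    for (int i = 0; i < MAX_GLOBAL; i++) {
        struct open_file *f = &global_table[i];
        if (f->open_count > 0 && f->inode_num == inode_num)
            return f;
        if (f->open_count == 0 && free_slot == NULL)
            free_slot = f;
    }
    if (free_slot != NULL) {
        free_slot->inode_num = inode_num;
        free_slot->deleted = 0;
    }
    return free_slot;
}

/* Returns a small integer descriptor, or -1 if a table is full. */
int proc_open(struct process *p, unsigned inode_num, enum access_mode mode)
{
    for (int fd = 0; fd < MAX_FDS; fd++) {
        if (p->fds[fd].file == NULL) {
            struct open_file *f = global_get(inode_num);
            if (f == NULL)
                return -1;
            f->open_count++;
            p->fds[fd].file = f;
            p->fds[fd].offset = 0;
            p->fds[fd].mode = mode;
            return fd;
        }
    }
    return -1;
}

/* Drops this process's reference; returns -1 for an invalid descriptor. */
int proc_close(struct process *p, int fd)
{
    if (fd < 0 || fd >= MAX_FDS || p->fds[fd].file == NULL)
        return -1;
    p->fds[fd].file->open_count--;
    p->fds[fd].file = NULL;
    return 0;
}
```

Two processes that open the same inode end up pointing at the same global entry (open count 2) while keeping independent offsets and access modes.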
Key In-Memory Data Structures (cont’d)
Name cache: cache of recent name lookup results
– Indexed by full filename (hash structure)
– Used to eliminate directory traversals (disk ops) for name lookups
Free space “bitmap”:
– Used to track which blocks on disk are available
Free inode “bitmap”:
– Used to track which file index nodes on disk are available
Superblock: holds key metadata that describes disk
– Physical characteristics: size of disk, size of blocks, …
– Location of free space and free inode “bitmaps”
– Location of inodes
– Multiple copies stored in known location → redundancy
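A free-space bitmap can be sketched in a few lines of C. This is an illustrative first-fit allocator over a tiny, hypothetical 64-block disk, with one bit per block (1 means in use). Real file systems typically add locality heuristics so that a file's blocks land near each other. The names below are assumptions, not taken from any real file system.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BLOCKS 64   /* hypothetical tiny disk */

/* One bit per disk block: 1 = in use, 0 = free. */
static uint8_t block_bitmap[NUM_BLOCKS / 8];

static int bit_is_set(unsigned n)
{
    return (block_bitmap[n / 8] >> (n % 8)) & 1;
}

/* First-fit allocation: returns a block number, or -1 if the disk is full. */
int alloc_block(void)
{
    for (unsigned n = 0; n < NUM_BLOCKS; n++) {
        if (!bit_is_set(n)) {
            block_bitmap[n / 8] |= (uint8_t)(1u << (n % 8));
            return (int)n;
        }
    }
    return -1;
}

/* Marks a block free again; the assert catches out-of-range and double frees. */
void free_block(unsigned n)
{
    assert(n < NUM_BLOCKS && bit_is_set(n));
    block_bitmap[n / 8] &= (uint8_t)~(1u << (n % 8));
}
```

The same structure works for the free inode bitmap, with one bit per inode slot instead of one per block.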
Key On-Disk Data Structures
File descriptor (aka “inode”):
– Link count
– Security attributes: UID, GID, …
– Size
– Access/modified times
– “Pointers” to blocks
– …
File descriptor fields, as declared on the slide:
    ulong links;
    uid_t uid;
    gid_t gid;
    ulong size;
    time_t access_time;
    time_t modified_time;
    addr_t blocklist…;
Directory file: array of …
– File name (fixed/variable size)
– Inode number
– Length of directory entry
Example directory file contents:
    Filename            inode#
    Filename            inode#
    REALLYLONGFILENAME  inode#
    Filename            inode#
    Short               inode#
Free block bitmap
Free inode bitmap
Superblock
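To make the variable-size directory entry concrete, here is a minimal sketch that packs entries into a byte buffer and scans them. The record layout (4-byte inode number, 2-byte record length, 1-byte name length, then the name) is an assumption made for illustration, not the format of any specific file system. Inode number 0 is treated as "not found".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout:
 *   bytes 0-3: inode number
 *   bytes 4-5: length of this directory entry
 *   byte  6  : name length
 *   bytes 7..: file name (not NUL-terminated) */
#define DIRENT_HEADER 7

/* Appends an entry to the directory buffer.
 * Returns the new number of used bytes, or 0 if the entry does not fit. */
size_t dir_append(uint8_t *dir, size_t used, size_t cap,
                  const char *name, uint32_t inode)
{
    size_t name_len = strlen(name);
    size_t rec_len = DIRENT_HEADER + name_len;
    if (name_len == 0 || name_len > 255 || used + rec_len > cap)
        return 0;

    uint16_t len16 = (uint16_t)rec_len;
    uint8_t *rec = dir + used;
    memcpy(rec, &inode, 4);
    memcpy(rec + 4, &len16, 2);
    rec[6] = (uint8_t)name_len;
    memcpy(rec + DIRENT_HEADER, name, name_len);
    return used + rec_len;
}

/* Linear scan of the directory; returns the inode number or 0 if absent. */
uint32_t dir_lookup(const uint8_t *dir, size_t used, const char *name)
{
    size_t name_len = strlen(name);
    size_t pos = 0;
    while (pos + DIRENT_HEADER <= used) {
        uint32_t inode;
        uint16_t rec_len;
        memcpy(&inode, dir + pos, 4);
        memcpy(&rec_len, dir + pos + 4, 2);
        uint8_t len = dir[pos + 6];
        if (rec_len < DIRENT_HEADER)
            break;  /* corrupt entry: stop rather than loop forever */
        if (len == name_len &&
            memcmp(dir + pos + DIRENT_HEADER, name, len) == 0)
            return inode;
        pos += rec_len;  /* the stored entry length lets us skip ahead */
    }
    return 0;
}
```

Storing the entry length in each record is what allows short and long names to share one directory file without wasting a fixed-size slot on every entry.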
Naming and Directories
Need a method to “name” files on disk:
– OS wants to use numbers or indices
– Users prefer textual/visual names and hierarchical organization
– Solution: Directories
Naming schemes:
– Simple: One name space for entire disk with unique names
– User-based: Each user has a single separate directory (TOPS-10)
– Hierarchical: Tree-structured name space (modern OSes)
» Store directories as special files flagged as “directory file”
» User programs can read directory like normal files
» Only special system calls can modify directory files
» Directory files contain <name, filedesc> pairs
» Special “root” directory
Traversing Directories (Simplified)
How do we locate the file descriptor for “/foo/bar”?
– Divide file name into components (e.g., “/”, “foo”, and “bar”).
– Recursively descend directory hierarchy, at each step:
» Load file descriptor of “next” directory file
» Use file descriptor info to locate and load directory file contents
» Scan directory file for matching filename of next component
» If match found → extract file descriptor number from the <name, filedesc> pair
» If no match → lookup failure
How can we speed up this process?
– Name cache
» Probe name cache for longest prefix contained in cache (e.g., “/foo”)
» Start recursive descent using longest prefix as starting point
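The descent described above can be sketched over a toy in-memory "disk". This is an iterative version of the recursive descent, without the name cache, and everything in it is hypothetical: the struct names, the fixed-size entry arrays, and the three-inode sample file system `demo_fs` (root directory, directory foo, regular file bar).

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 4
#define ROOT_INODE  0

/* Toy <name, filedesc> pair. */
struct toy_dirent {
    const char *name;
    int inode;
};

/* Toy inode: directories carry their entries inline. */
struct toy_inode {
    int is_dir;
    int num_entries;
    struct toy_dirent entries[MAX_ENTRIES];
};

/* Sample file system: inode 0 = "/", inode 1 = "/foo", inode 2 = "/foo/bar". */
static const struct toy_inode demo_fs[3] = {
    { 1, 1, { { "foo", 1 } } },
    { 1, 1, { { "bar", 2 } } },
    { 0, 0, { { NULL, 0 } } },
};

/* Scan one directory for a component of the given length.
 * Returns the matching inode number, or -1 if there is no match. */
static int dir_find(const struct toy_inode *dir, const char *comp, size_t len)
{
    for (int i = 0; i < dir->num_entries; i++) {
        const char *n = dir->entries[i].name;
        if (strlen(n) == len && strncmp(n, comp, len) == 0)
            return dir->entries[i].inode;
    }
    return -1;
}

/* Resolve an absolute path to an inode number, or -1 on lookup failure. */
int path_lookup(const struct toy_inode *inodes, const char *path)
{
    if (path[0] != '/')
        return -1;

    int cur = ROOT_INODE;
    const char *p = path;
    while (*p != '\0') {
        while (*p == '/')
            p++;                        /* skip separators */
        if (*p == '\0')
            break;
        size_t len = strcspn(p, "/");   /* length of the next component */
        if (!inodes[cur].is_dir)
            return -1;                  /* cannot descend through a file */
        cur = dir_find(&inodes[cur], p, len);
        if (cur < 0)
            return -1;                  /* no match: lookup failure */
        p += len;
    }
    return cur;
}
```

With this sample data, `path_lookup(demo_fs, "/foo/bar")` walks root, then foo, then finds bar. A name cache would let the walk start from the longest cached prefix instead of from the root.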