File Systems: Fundamentals




  1. Slides 1-6

     File Systems: Fundamentals

     Files
       What is a file?
         - A named collection of related information recorded on secondary storage (e.g., disks)
       File attributes
         - Name, type, location, size, protection, creator, creation time, last-modified time, …
       File operations
         - Create, Open, Read, Write, Seek, Delete, …
       How does the OS allow users to use files?
         - "Open" a file before use
         - The OS maintains an open file table per process; a file descriptor is an index into this table
         - Allow sharing by maintaining a system-wide open file table

     Fundamental Ontology of File Systems
       Metadata
         - The index node (inode) is the fundamental data structure
         - The superblock also has important file system metadata, like block size
       Data
         - The contents that users actually care about
       Files
         - Contain data and have metadata like creation time, length, etc.
       Directories
         - Map file names to inode numbers

     Basic data structures
       Disk
         - An array of blocks, where a block is a fixed-size data array
       File
         - Sequence of blocks (fixed-length data array)
       Directory
         - Creates the namespace of files
           * Hierarchical: traditional file names and GUI folders
           * Flat: like the "all songs" list on an iPod
       Design issues: representing files, finding file data, finding free blocks

     Block vs. Sector
       The operating system may choose to use a larger block size than the sector size of the physical disk. Each block consists of consecutive sectors. Why?
         - A larger block size increases the transfer efficiency (why?)
         - It can be convenient to have the block size match (a multiple of) the machine's page size (why?)
       Some systems allow transferring many sectors between interrupts.
       Some systems interrupt after each sector operation (rare these days)
         - "Consecutive" sectors may mean "every other physical sector", to allow time for the CPU to start the next transfer before the head moves over the desired sector

     File System Functionality and Implementation
       File system functionality:
         - Pick the blocks that constitute a file.
           * Must balance locality with expandability.
           * Must manage free space.
         - Provide file naming organization, such as a hierarchical name space.
       File system implementation:
         - File header (descriptor, inode): owner id, size, last modified time, and location of all data blocks.
           * The OS should be able to find metadata block number N without a disk access (e.g., by using math or a cached data structure).
         - Data blocks.
           * Directory data blocks (human-readable names)
           * File data blocks (data).
         - Superblocks, group descriptors, other metadata …
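The relationship between file descriptors, the per-process open file table, and the system-wide open file table can be sketched in a few lines of Python. This is only an illustrative model, not how any particular kernel implements it: the class names (`OpenFile`, `SystemOpenFileTable`, `Process`), the `share_with` helper, and the choice to share the file offset between descriptors (as Unix does for descriptors inherited across `fork`) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class OpenFile:
    """One entry in the (hypothetical) system-wide open file table."""
    inode_number: int
    offset: int = 0
    ref_count: int = 0


class SystemOpenFileTable:
    """System-wide table: one entry per open() call, shared by reference."""

    def __init__(self):
        self.entries = []

    def add(self, inode_number):
        self.entries.append(OpenFile(inode_number))
        return len(self.entries) - 1  # index into the system-wide table


class Process:
    """Each process has its own table; a file descriptor is an index into it."""

    def __init__(self, system_table):
        self.system_table = system_table
        self.fd_table = []  # fd -> index into the system-wide table

    def open(self, inode_number):
        index = self.system_table.add(inode_number)
        self.system_table.entries[index].ref_count += 1
        self.fd_table.append(index)
        return len(self.fd_table) - 1  # the new file descriptor

    def share_with(self, other, fd):
        """Give `other` a descriptor that refers to the same open-file entry."""
        index = self.fd_table[fd]
        self.system_table.entries[index].ref_count += 1
        other.fd_table.append(index)
        return len(other.fd_table) - 1

    def seek(self, fd, offset):
        self.system_table.entries[self.fd_table[fd]].offset = offset

    def tell(self, fd):
        return self.system_table.entries[self.fd_table[fd]].offset


table = SystemOpenFileTable()
parent, child = Process(table), Process(table)
fd_parent = parent.open(inode_number=42)
fd_child = parent.share_with(child, fd_parent)
parent.seek(fd_parent, 100)
print(child.tell(fd_child))  # both descriptors see the same shared offset
```

In this model the two processes hold different descriptor tables, but both descriptors resolve to one shared entry, which is what makes sharing through the system-wide table possible.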

  2. Slides 7-12

     How do we find and organize files on the disk?
       The information that we need: the file header points to data blocks
         fileID 0, Block 0 --> Disk block 19
         fileID 0, Block 1 --> Disk block 4,528
         …
       Key performance issues:
         1. We need to support sequential and random access.
         2. What is the right data structure in which to maintain file location information?
         3. How do we lay out the files on the physical disk?

     File System Properties
       Most files are small.
         - Need strong support for small files.
         - Block size can't be too big.
       Some files are very large.
         - Must allow large files (64-bit file offsets).
         - Large file access should be reasonably efficient.
       Most systems fit the following profile:
         1. Most files are small.
         2. Most disk space is taken up by large files.
         3. I/O operations target both small and large files.
       --> The per-file cost must be low, but large files must also have good performance.

     If my file system only has lots of big video files, what block size do I want?
         1. Large
         2. Small

     File Allocation Methods: Contiguous allocation
       File header specifies starting block & length
       Placement/allocation policies
         - First-fit, best-fit, ...
       Pluses
         - Best file read performance
         - Efficient sequential & random access
       Minuses
         - Fragmentation!
         - Problems with file growth
           * Pre-allocation?
           * On-demand allocation?

     File Allocation Methods: Linked allocation
       Files stored as a linked list of blocks
       File header contains a pointer to the first and last file blocks
       Pluses
         - Easy to create, grow & shrink files
         - No external fragmentation
       Minuses
         - Impossible to do true random access
         - Reliability
           * Break one link in the chain and...

     File Allocation Methods: Linked allocation with a File Allocation Table (FAT) (Win9x, OS2)
       Create a table with an entry for each block
         - Overlay the table with a linked list
         - Each entry serves as a link in the list
         - Each table entry in a file has a pointer to the next entry in that file (with a special "eof" marker)
         - A "0" in the table entry --> free block
       Comparison with linked allocation
         - If the FAT is cached --> better sequential and random access performance
           * How much memory is needed to cache the entire FAT?
             + 400GB disk, 4KB/block --> 100M entries in FAT --> 400MB (at 4 bytes per entry)
           * Solution approaches
             + Allocate larger clusters of storage space
             + Allocate different parts of the file near each other --> better locality for FAT
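To make the FAT idea concrete, here is a minimal Python sketch of a table overlaid with linked lists, plus the memory arithmetic from the slide. The `EOF` value, the function names, and the 4-byte entry size are assumptions chosen for the example; the only conventions taken from the slides are that a `0` entry marks a free block and that each entry points to the next block of the same file.

```python
FREE = 0   # per the slide, a 0 entry marks a free block
EOF = -1   # hypothetical end-of-file marker (real FATs use a reserved value)


def file_blocks(fat, start):
    """Follow the chain from `start` and return every block of the file."""
    blocks = []
    block = start
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, start, n):
    """Random access still walks n links, but in the table, not on disk."""
    block = start
    for _ in range(n):
        block = fat[block]
        if block == EOF:
            raise IndexError("file has fewer blocks than requested")
    return block


def fat_size_bytes(disk_bytes, block_bytes, entry_bytes=4):
    """Memory needed to cache the whole FAT: one entry per disk block."""
    return (disk_bytes // block_bytes) * entry_bytes


# A tiny 8-block disk holding one file stored in blocks 2 -> 5 -> 3.
fat = [FREE, FREE, 5, EOF, FREE, 3, FREE, FREE]
print(file_blocks(fat, 2))
print(nth_block(fat, 2, 2))
print(fat_size_bytes(400 * 2**30, 4 * 2**10) // 2**20, "MiB")
```

Because the links live in one table rather than inside the data blocks, caching that table turns every link traversal into a memory lookup, which is why the cached FAT improves both sequential and random access over plain linked allocation.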

  3. Slides 13-18

     File Allocation Methods: Direct allocation
       File header points to each data block
       Pluses
         - Easy to create, grow & shrink files
         - Little fragmentation
         - Supports direct access
       Minuses
         - Inode is big or variable size
         - How to handle large files?

     File Allocation Methods: Indexed allocation
       Create a non-data block for each file called the index block
         - A list of pointers to file blocks
       File header contains the index block
       Pluses
         - Easy to create, grow & shrink files
         - Little fragmentation
         - Supports direct access
       Minuses
         - Overhead of storing the index when files are small
         - How to handle large files?

     Indexed Allocation: Handling large files
       Linked index blocks (IB+IB+…)
       Multilevel index blocks (IB*IB*…)

     Indexed Allocation in UNIX
       Multilevel, indirection, index blocks
       File header contains 13 pointers
         - 10 pointers to data blocks; 11th pointer --> indirect block; 12th pointer --> doubly-indirect block; and 13th pointer --> triply-indirect block
       Implications
         - Upper limit on file size (~2 TB)
         - Blocks are allocated dynamically (allocate indirect blocks only for large files)
       Features
         - Pros
           * Simple
           * Files can easily expand
           * Small files are cheap
         - Cons
           * Large files require a lot of seeks to access indirect blocks

     Multi-level Indirection in Unix
       [Diagram: the inode points directly to 10 data blocks; a 1st-level indirection block reaches n data blocks; a 2nd-level indirection block reaches n^2 data blocks; a 3rd-level indirection block reaches n^3 data blocks.]

     Why bother with index blocks?
         - A. Allows greater file size.
         - B. Faster to create files.
         - C. Simpler to grow files.
         - D. Simpler to prepend and append to files.
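The 13-pointer UNIX scheme can be sketched as a function that decides which level of indirection a given logical block needs. The function names and the value n = 1024 pointers per index block (which would correspond to 4 KB blocks with 4-byte pointers) are assumptions for illustration; the slides do not state the block or pointer size, and the exact file size limit depends on both.

```python
NUM_DIRECT = 10  # from the slide: 10 direct pointers in the file header


def indirection_level(logical_block, n):
    """Return (level, index within that level's region) for a logical block.

    Level 0 is a direct pointer; levels 1, 2, 3 are the single, double, and
    triple indirect blocks. `n` is the number of pointers per index block.
    """
    if logical_block < 0:
        raise ValueError("logical block must be non-negative")
    if logical_block < NUM_DIRECT:
        return 0, logical_block
    remaining = logical_block - NUM_DIRECT
    for level in (1, 2, 3):
        capacity = n ** level  # blocks reachable through this level
        if remaining < capacity:
            return level, remaining
        remaining -= capacity
    raise ValueError("block is beyond the maximum file size")


def max_file_blocks(n):
    """Total data blocks addressable: 10 + n + n^2 + n^3."""
    return NUM_DIRECT + n + n**2 + n**3


n = 1024  # assumed pointers per index block
for block in (0, 9, 10, 10 + n, 10 + n + n**2):
    print(block, indirection_level(block, n))
print(max_file_blocks(n), "addressable data blocks")
```

The level returned is also the number of extra index-block reads needed before the data block itself can be read, which is the cost behind the "large files require a lot of seeks" drawback, while small files stay at level 0 and pay nothing.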

  4. Slides 19-24

     Allocate from a free list
       Need a data block
         - Consult list of free data blocks
       Need an inode
         - Consult a list of free inodes
       Why do inodes have their own free list?
         - A. Because they are fixed size
         - B. Because they exist at fixed locations
         - C. Because there are a fixed number of them

     Free list representation
       Represent the list of free blocks as a bit vector:
         111111111111111001110101011101111...
         - If bit i = 0 then block i is free, if i = 1 then it is allocated
       Simple to use and vector is compact:
         1TB disk with 4KB blocks is 2^28 bits or 32 MB
       If free sectors are uniformly distributed across the disk then the expected number of bits that must be scanned before finding a "0" is
         n / r
       where n = total number of blocks on the disk, r = number of free blocks
       If a disk is 90% full, then the average number of bits to be scanned is 10, independent of the size of the disk

     How big is an inode?
         - A. 1 byte
         - B. 16 bytes
         - C. 128 bytes
         - D. 1 KB
         - E. 16 KB

     Deleting a file is a lot of work
       Data blocks back to free list
         - Coalescing free space
       Indirect blocks back to free list
         - Expensive for large files, an ext3 problem
       Inodes cleared (makes data blocks "dead")
       Inode free list written
       Directory updated
       The order of updates matters!
         - Can put block on free list only after no inode points to it

     Naming and Directories
       Files are organized in directories
         - Directories are themselves files
         - Contain <name, pointer to file header> table
       Only OS can modify a directory
         - Ensure integrity of the mapping
         - Application programs can read directory (e.g., ls)
       Directory operations:
         - List contents of a directory
         - Search (find a file)
           * Linear search
           * Binary search
           * Hash table
         - Create a file
         - Delete a file

     Every directory has an inode
         - A. True
         - B. False
     Given only the inode number (inumber) the OS can find the inode on disk
         - A. True
         - B. False
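The bit-vector free list and its two size estimates can be checked with a short Python sketch. The function names are hypothetical and a plain list of 0/1 integers stands in for the packed bit vector; the conventions that 0 means free and 1 means allocated, and the n / r estimate, come from the slides.

```python
def find_free_block(bitmap):
    """Scan from the start and return the index of the first 0 bit, if any."""
    for i, bit in enumerate(bitmap):
        if bit == 0:
            return i
    return None


def allocate(bitmap):
    """Claim the first free block by setting its bit to 1."""
    i = find_free_block(bitmap)
    if i is None:
        raise RuntimeError("no free blocks")
    bitmap[i] = 1
    return i


def bitmap_size_bytes(disk_bytes, block_bytes):
    """One bit per block, eight bits per byte."""
    return disk_bytes // block_bytes // 8


def expected_scan(n, r):
    """Expected bits scanned before a 0, assuming free blocks are uniform."""
    return n / r


bitmap = [1, 1, 0, 1, 0]
print(allocate(bitmap), bitmap)
print(bitmap_size_bytes(2**40, 4 * 2**10) // 2**20, "MiB for a 1 TiB disk")
print(expected_scan(n=1000, r=100), "bits on a 90% full disk")
```

Since the estimate depends only on the ratio n / r, a disk ten times larger with the same fullness gives the same expected scan length, which is the "independent of the size of the disk" point on the slide.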
