Implementation: key ideas

Directories: file name -> file number
Index structures: file number -> block
Free space maps: find a free block (actually, find a free block nearby)
Locality heuristics: policies enabled by the above mechanisms
  group together directory files
  make writes sequential
  defragment

Directory

A file that contains a collection of mappings from file name to file number.

  /Users/lorenzo
  file name     file number
  Documents     394
  Music         416
  griso.jpg     864

To look up a file, find the directory that contains the mapping to the file number.
To find that directory, find the parent directory that contains the mapping to that directory's file number...
Good news: the root directory has a well-known number (2).

Looking up a file

Find file /Users/lorenzo/griso.jpg
  file 2 "/": bin 438, usr 782, Users 256
  file 256 "/Users": chiara 1197, maria 294, lorenzo 1061
  file 1061 "/Users/lorenzo": Documents 394, Music 416, griso.jpg 864
  file 864 "/Users/lorenzo/griso.jpg"

Directory Layout

Directory stored as a file.
Linear search to find filename (small directories).

  File 1061 (/Users/lorenzo)
  .  1061 | ..  256 | Music 416 | Documents 394 | griso.jpg 864 | Free Space | End of File

Larger directories use B trees, searched by hash of file name.
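The lookup walk described above can be sketched in a few lines of Python. This is only an illustration, assuming directories are held in an in-memory table keyed by file number; the names `DIRECTORIES` and `lookup` are hypothetical, and a real file system would read each directory file from disk instead.

```python
ROOT_FILE_NUMBER = 2  # the root directory has a well-known file number

# Hypothetical stand-in for directory files: file number -> {name: file number}.
# Only the entries needed for the example path are listed.
DIRECTORIES = {
    2: {"bin": 438, "usr": 782, "Users": 256},                  # "/"
    256: {"lorenzo": 1061},                                     # "/Users"
    1061: {"Documents": 394, "Music": 416, "griso.jpg": 864},   # "/Users/lorenzo"
}


def lookup(path):
    """Resolve an absolute path to a file number, one directory at a time."""
    file_number = ROOT_FILE_NUMBER
    for name in path.strip("/").split("/"):
        if not name:
            continue  # the path "/" has no components
        directory = DIRECTORIES.get(file_number)
        if directory is None:
            raise NotADirectoryError(f"file {file_number} is not a directory")
        if name not in directory:
            raise FileNotFoundError(path)
        file_number = directory[name]
    return file_number


print(lookup("/Users/lorenzo/griso.jpg"))
```

Each path component costs one directory search, which is why the well-known root number matters: it is the only file number the lookup does not have to find.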
Finding data

Index structure provides a way to locate each of the file's blocks
  usually implemented as a tree for scalability
Free space map provides a way to allocate free blocks
  often implemented as a bitmap
Locality heuristics group data to maximize access performance

Case studies

FAT (late 70s; Microsoft)
  Key idea: linked list
  Today: flash sticks
Unix FFS (mid 80s)
  Key idea: tree-based multi-level index
  Today: Linux ext2 and ext3
NTFS (early 1990s; Microsoft)
  Key idea: variable size extents instead of fixed size blocks
  Today: Windows 7, Linux ext4, Apple HFS
ZFS (early 2000s; open source)
  Key idea: copy on write (COW)

FAT File system

Microsoft, late 70s
File Allocation Table (FAT)
  started with MSDOS
  in FAT-32, supports 2^28 blocks and files of 2^32 - 1 bytes

Index structures
  File Allocation Table (FAT): array of 32-bit entries
  file represented as a linked list of FAT entries
  file # = index of first FAT entry
Free space map
  if data block i is free, then FAT[i] = 0
  find free blocks by scanning the FAT
Locality heuristics
  as simple as next fit: scan sequentially from last allocated entry and return next free entry
  can be improved through defragmentation

[Figure: FAT and data blocks, entries 0 to 20. File 9 occupies entries 9, 10, 11, 3, 18 (blocks 0 to 4, in chain order); file 12 occupies entries 12 and 16 (blocks 0 and 1).]

Advantages
  simple!
  used in many USB flash keys
  used even within MS Word!

Disadvantages
  Poor locality
    next fit? seriously?
  Poor random access
    needs sequential traversal
  Limited access control
    no file owner or group ID metadata
    any user can read/write any file
  No support for hard links
    metadata stored in directory entry
  Volume and file size are limited
    FAT entry is 32 bits, but top 4 are reserved
    no more than 2^28 blocks
    with 4KB blocks, at most 1TB volume
    file no bigger than 4GB
  No support for transactional updates
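The FAT layout from the figure can be modelled as a small Python list to show both the linked-list traversal and the next-fit scan. This is a sketch under stated assumptions: the end-of-chain marker `EOF = -1` and the function names are hypothetical, and entry 0 is never handed out because a pointer to it would be indistinguishable from the "free" value 0.

```python
FREE = 0   # FAT[i] == 0 means data block i is free
EOF = -1   # hypothetical end-of-chain marker for this sketch

# Rebuild the 21-entry table from the figure.
fat = [FREE] * 21
for current, following in [(9, 10), (10, 11), (11, 3), (3, 18), (18, EOF)]:
    fat[current] = following            # file 9: blocks 0..4
for current, following in [(12, 16), (16, EOF)]:
    fat[current] = following            # file 12: blocks 0..1


def file_blocks(fat, file_number):
    """Follow the chain starting at the file number (index of first entry)."""
    blocks = []
    entry = file_number
    while entry != EOF:
        blocks.append(entry)
        entry = fat[entry]
    return blocks


def next_fit(fat, last_allocated):
    """Scan forward (wrapping around) from the last allocated entry."""
    size = len(fat)
    for step in range(1, size + 1):
        candidate = (last_allocated + step) % size
        if candidate != 0 and fat[candidate] == FREE:
            return candidate
    return None  # no free block left


print(file_blocks(fat, 9))   # data blocks of file 9, in file order
print(next_fit(fat, 18))     # next free entry after entry 18
```

Reaching block k of a file takes k hops through the table, which is the "needs sequential traversal" cost listed under poor random access.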
FFS: Fast File System

Unix, 80s
Smart index structure
  multilevel index allows locating all blocks of a file
  efficient for both large and small files
Smart locality heuristics
  block group placement
    optimizes placement for when a file's data and metadata, and other files within the same directory, are accessed together
  reserved space
    gives up about 10% of storage to allow the flexibility needed to achieve locality

File structure

Each file is a fixed, asymmetric tree, with fixed size data blocks (e.g. 4KB) as its leaves
The root of the tree is the file's inode
  contains file's metadata
    owner, permissions (rwx for owner, group, other), type, creation time, etc.
    setuid: run with temporarily elevated privileges
      file is executed with the permissions of the owner, not the caller
      adds flexibility but can introduce security risks
    setgid: like setuid for groups
  contains a set of pointers
    typically 15
    first 12 point to data blocks
    last three point to intermediate blocks, themselves containing pointers
      13: indirect pointer
      14: double indirect pointer
      15: triple indirect pointer

Multilevel index

Inode array at known location on disk
  file number = inode number = index in the array
Inode: file metadata plus 15 pointers (4-byte entries, so a 4KB block holds 1K pointers)
  12 direct pointers to data blocks: 12 x 4KB = 48KB
  indirect block (contains pointers to data blocks): 1K x 4KB = 4MB
  double indirect block (contains pointers to indirect blocks): 1K x 1K x 4KB = 4GB
  triple indirect block (contains pointers to double indirect blocks): 1K x 1K x 1K x 4KB = 4TB

Multilevel index: key ideas

Tree structure
  efficient in finding blocks
High degree
  efficient in sequential reads
  once an indirect block is read, can read 100s of data blocks
Fixed structure
  simple to implement
Asymmetric
  supports efficiently files big and small
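The capacity figures for the multilevel index follow from the block size and pointer size, and the same arithmetic tells which pointer a given block of the file hangs off. The Python sketch below uses the slide's parameters (4KB blocks, 4-byte pointers, 12 direct pointers, three indirect levels); the function names `max_file_size` and `locate` are hypothetical.

```python
BLOCK_SIZE = 4096
POINTER_SIZE = 4
NUM_DIRECT = 12


def max_file_size(block_size=BLOCK_SIZE, pointer_size=POINTER_SIZE,
                  num_direct=NUM_DIRECT, levels=3):
    """Bytes addressable through the direct pointers plus each indirect level."""
    pointers_per_block = block_size // pointer_size
    blocks = num_direct + sum(pointers_per_block ** level
                              for level in range(1, levels + 1))
    return blocks * block_size


def locate(block_index, block_size=BLOCK_SIZE, pointer_size=POINTER_SIZE,
           num_direct=NUM_DIRECT, levels=3):
    """Return (level, offsets) for the block_index-th block of a file.

    level 0 means a direct pointer; level n means the n-times-indirect
    pointer. offsets lists the slot to follow in the inode (level 0) or in
    each intermediate block on the way down (levels 1 and up).
    """
    if block_index < num_direct:
        return 0, [block_index]
    pointers_per_block = block_size // pointer_size
    remaining = block_index - num_direct
    for level in range(1, levels + 1):
        span = pointers_per_block ** level   # data blocks under this pointer
        if remaining < span:
            offsets = [(remaining // (pointers_per_block ** depth)) % pointers_per_block
                       for depth in range(level - 1, -1, -1)]
            return level, offsets
        remaining -= span
    raise ValueError("block index beyond maximum file size")


print(max_file_size())        # 48KB + 4MB + 4GB + 4TB, in bytes
print(locate(5))              # small files stay on direct pointers
print(locate(12 + 1024 + 1025))
```

Changing `pointer_size` and `levels` models variants of the scheme; the asymmetry shows up in `locate`, where small block indexes need no intermediate reads at all.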
Example: variations on the FFS theme

In BigFS an inode stores
  4KB blocks, 8 byte pointers
  12 direct pointers
  1 indirect pointer
  1 double indirect
  1 triple indirect
  1 quadruple indirect
What is the maximum size of a file?
  (a 4KB block holds 4KB / 8B = 512 pointers)
  Through direct pointers: 12 x 4KB = 48KB
  Indirect pointer: 512 x 4KB = 2MB
  Double indirect pointer: 512^2 x 4KB = 1GB
  Triple indirect pointer: 512^3 x 4KB = 512GB
  Quadruple indirect pointer: 512^4 x 4KB = 256TB
  Total ≈ (256 + 0.5 + 10^-3 + 2 x 10^-6 + 4.5 x 10^-8) TB ≈ 256.5 TB

Free space management

Easy
  a bitmap with one bit per storage block
  bitmap location fixed at formatting time
  i-th bit indicates whether i-th block is used or free

Locality heuristics: block group placement

Divide disk in block groups
  sets of nearby tracks
Distribute metadata
  old design: free space bitmap and inode map in a single contiguous region
    lots of seeks when going from reading metadata to reading data
  FFS: distribute free space bitmap and inode array among block groups. Keep a superblock copy in each block group
Place file in block group
  when a new file is created, FFS looks for inodes in the same block group as the file's directory
  when a new directory is created, FFS places it in a different block group from its parent directory
Place data blocks
  first free heuristics
  trade short term for long term locality

[Figure: disk divided into block groups 0, 1 and 2; each group holds a superblock copy (SB), a free space bitmap, inodes, and data blocks for the files of a few directories (e.g. /a, /b, /d, /a/p, /a/g, /b/c, /z).]
[Figure: blocks of one block group marked free or in use, laid out from the start of the block group.]
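The free space bitmap and the first-free heuristic can be illustrated with a short Python sketch. It assumes a flat list of booleans as the bitmap and equal-sized block groups; the function name `allocate_block` and the fallback to a whole-disk scan when the preferred group is full are assumptions made for the example, not the actual FFS policy.

```python
def allocate_block(bitmap, group_size, group):
    """Allocate the first free block in the given block group.

    bitmap[i] is True when block i is in use. If the preferred group is
    full, this sketch falls back to the first free block anywhere on the
    disk (an assumed policy). Returns the block number, or None if the
    disk is full.
    """
    start = group * group_size
    end = min(start + group_size, len(bitmap))
    candidates = list(range(start, end)) + list(range(len(bitmap)))
    for block in candidates:
        if not bitmap[block]:
            bitmap[block] = True   # mark the block as used
            return block
    return None


# Two block groups of four blocks each; True = in use.
bitmap = [True, True, False, True,    # block group 0
          False, False, True, True]   # block group 1

print(allocate_block(bitmap, group_size=4, group=1))
```

Taking the first free block rather than the one closest to the previous write may fragment a file's first few blocks, but it fills small holes near the start of the group and leaves long free runs for later, which is the short-term versus long-term trade noted above.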