5/10/2016

Operating Systems Principles
File Systems: Semantics & Structure
Mark Kampe (markk@cs.ucla.edu)

File Systems: Semantics & Structure
11A. File Semantics
11B. Namespace Semantics
11C. File Representation
11D. Free Space Representation
11E. Namespace Representation
11F. File System Integration

Sequential Byte Stream Access

    char buf[BUFSIZE];    /* BUFSIZE: any convenient buffer size */
    int infd = open("abc", O_RDONLY);
    int outfd = open("xyz", O_WRONLY | O_CREAT, 0666);
    if (infd >= 0 && outfd >= 0) {
        /* each read continues where the previous one stopped */
        int count = read(infd, buf, sizeof buf);
        while (count > 0) {
            write(outfd, buf, count);
            count = read(infd, buf, sizeof buf);
        }
        close(infd);
        close(outfd);
    }

Random Access

    void *readSection(int fd, struct hdr *index, int section) {
        struct hdr *head = &index[section];
        off_t offset = head->section_offset;
        size_t len = head->section_length;
        void *buf = malloc(len);
        if (buf != NULL) {
            /* move to the start of the section before reading it */
            lseek(fd, offset, SEEK_SET);
            if (read(fd, buf, len) <= 0) {
                free(buf);
                buf = NULL;
            }
        }
        return(buf);
    }

Consistency Model
• When do new readers see results of a write?
– read-after-write
  • as soon as possible, database semantics
  • this is commonly called "POSIX consistency"
– read-after-close (or sync/commit)
  • only after writes are committed to storage
– open-after-close (or sync/commit)
  • each open sees a consistent snapshot
– explicitly versioned files
  • each open sees a named, consistent snapshot

File Attributes – basic properties
• thus far we have focused on a simple model
– a file is a "named collection of data blocks"
• in most OSes, files have more state than this
– file type (regular file, directory, device, IPC port, ...)
– file length (may be excess space at end of last block)
– ownership and protection information
– system attributes (e.g. hidden, archive)
– creation time, modification time, last accessed time
• typically stored in file descriptor structure
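
The basic attributes listed above can be read back on a POSIX system with lstat(). The following is a minimal sketch, not part of the original slides; the function names file_type_name and print_attributes are hypothetical. Note that POSIX does not expose a creation time: st_ctime is the time of the last status (I-node) change.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Map the type bits of st_mode to a readable file type name. */
const char *file_type_name(mode_t mode) {
    if (S_ISREG(mode))  return "regular file";
    if (S_ISDIR(mode))  return "directory";
    if (S_ISCHR(mode))  return "character device";
    if (S_ISBLK(mode))  return "block device";
    if (S_ISFIFO(mode)) return "FIFO";
    if (S_ISLNK(mode))  return "symbolic link";
    if (S_ISSOCK(mode)) return "socket";
    return "unknown";
}

/* Print the basic attributes of the file at path.
 * Returns 0 on success, -1 if the attributes cannot be read. */
int print_attributes(const char *path) {
    struct stat st;

    assert(path != NULL);
    /* lstat (not stat) so that a symbolic link reports its own type */
    if (lstat(path, &st) != 0) {
        perror(path);
        return -1;
    }
    printf("type:          %s\n", file_type_name(st.st_mode));
    printf("length:        %lld bytes\n", (long long)st.st_size);
    printf("owner/group:   %ld/%ld\n", (long)st.st_uid, (long)st.st_gid);
    printf("protection:    %03o\n", (unsigned)(st.st_mode & 0777));
    printf("links:         %lu\n", (unsigned long)st.st_nlink);
    printf("last access:   %lld\n", (long long)st.st_atime);
    printf("last write:    %lld\n", (long long)st.st_mtime);
    printf("status change: %lld\n", (long long)st.st_ctime);
    return 0;
}
```

Calling print_attributes("/") on a typical Unix system should report a directory; the times are printed as seconds since the epoch to keep the sketch short.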
Extended File Types and Attributes
• extended protection information
– e.g. access control lists
• resource forks
– e.g. configuration data, fonts, related objects
• application defined types
– e.g. load modules, HTML, e-mail, MPEG, ...
• application defined properties
– e.g. compression scheme, encryption algorithm, ...

File Names and Name Binding
• file system knows files by their descriptors
• users know files by names
– names more easily remembered than disk addresses
– names can be structured to organize millions of files
• file system responsible for name-to-file mapping
– associating names with new files
– changing names associated with existing files
– allowing users to search the name space
• there are many ways to structure a name space

What is in a Name?
[Figure: the name /home/mark/TODO.txt, labeled with its parts: directory, separator, base name, suffix]
• suffixes and file types
– file-to-application binding often based on suffix
  • defined by system configuration registry
  • configured per user, or per directory
– suffix may define the file type (e.g. Windows)
– suffix may only be a hint (magic # defines type)

Flat Name Spaces
• there is one naming context per file system
– all file names must be unique within that context
• all files have exactly one true name
– these names are probably very long
• file names may have some structure
– e.g. CAC101.CS111.SECTION1.SLIDES.LECTURE_13
– this structure may be used to optimize searches
– the structure is very useful to users
– the structure has no meaning to the file system

A rooted directory tree
[Figure: directory tree, with the fully qualified name of each node]
root
  user_1
    file_a (/user_1/file_a)
    dir_a (/user_1/dir_a)
      file_a (/user_1/dir_a/file_a)
  user_2
    file_b (/user_2/file_b)
  user_3
    file_c (/user_3/file_c)
    dir_a (/user_3/dir_a)
      file_b (/user_3/dir_a/file_b)

Hierarchical Namespaces
• directory
– a file containing references to other files
– it can be used as a naming context
  • each process has a current working directory
  • names are interpreted relative to directory
• nested directories can form a tree
– file name is a path through that tree
– directory tree expands from a root node
  • fully qualified names begin from the root
– may actually form a directed graph
Hard Links: example
[Figure: directory tree. root contains user_1 and user_3; user_1 contains file_a; user_3 contains dir_a and file_c; dir_a contains file_b.]
Note that we now associate names with links rather than with files.
/user_1/file_a and /user_3/dir_a/file_b are both links to the same I-node.

Unix-style Hard Links
• all protection information is stored in the file
– file owner sets file protection (e.g. read-only)
– all links provide the same access to the file
– anyone with read access to file can create new link
– but directories are protected files too
  • not everyone has read or search access to every directory
• all links are equal
– there is nothing special about the owner's link
– file is not deleted until no links remain to file
– reference count keeps track of references

Symbolic Links: example
[Figure: the same directory tree as in the hard link example, except that /user_3/dir_a/file_b is now a symbolic link whose contents are the path name /user_1/file_a.]

Symbolic Links
• another type of special file
– an indirect reference to some other file
– contents is a path name to another file
• Operating System recognizes symbolic links
– automatically opens associated file instead
– if file is inaccessible or non-existent, the open fails
• symbolic link is not a reference to the I-node
– symbolic links will not prevent deletion
– do not guarantee ability to follow the specified path
– Internet URLs are similar to symbolic links

Databases
• a tool managing business critical data
• table is equivalent of a file system
• data organized in rows and columns
– row indexed by unique key
– columns are named fields within each row
• support a rich set of operations
– multi-object, read/modify/write transactions
– SQL searches return consistent snapshots
– insert/delete row/column operations

Object Stores
• simplified file systems, cloud storage
– optimized for large but infrequent transfers
• bucket is equivalent of a file system
– a bucket contains named, versioned objects
• objects have long names in a flat name space
– object names are unique within a bucket
• an object is a blob of immutable bytes
– get … all or part of the object
– put … new version, there is no append/update
– delete
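
The difference between hard and symbolic links described above can be observed directly with the POSIX calls link(), symlink(), unlink() and stat(). The sketch below is an assumption-laden illustration rather than anything from the slides: link_count, create_empty and link_demo are hypothetical helpers, and the demo assumes /tmp is writable and sits on a file system that supports both kinds of link.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return the reference (link) count kept in the file's I-node, or -1 on error. */
long link_count(const char *path) {
    struct stat st;
    assert(path != NULL);
    return (stat(path, &st) == 0) ? (long)st.st_nlink : -1;
}

/* Create a new empty file; returns 0 on success, -1 on failure. */
int create_empty(const char *path) {
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    return (fd < 0) ? -1 : close(fd);
}

/* Returns 0 if links behaved as the slides describe, -1 otherwise. */
int link_demo(void) {
    char dir[] = "/tmp/linkdemoXXXXXX";     /* assumed scratch location */
    char file_a[64], file_b[64], sym[64];
    int ok = 0;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(file_a, sizeof file_a, "%s/file_a", dir);
    snprintf(file_b, sizeof file_b, "%s/file_b", dir);
    snprintf(sym, sizeof sym, "%s/sym", dir);

    if (create_empty(file_a) == 0 &&
        link(file_a, file_b) == 0 &&        /* second name for the same I-node */
        symlink(file_a, sym) == 0) {        /* special file holding a path name */
        long both = link_count(file_a);     /* symbolic link is not counted */
        unlink(file_a);                     /* remove one name only */
        long one = link_count(file_b);      /* file survives via the other link */
        int dangling = (access(sym, F_OK) != 0);  /* path no longer resolves */
        ok = (both == 2 && one == 1 && dangling);
    }
    /* clean up; failures here are ignored in this sketch */
    unlink(file_a);
    unlink(file_b);
    unlink(sym);
    rmdir(dir);
    return ok ? 0 : -1;
}
```

After file_a is unlinked the data is still reachable through file_b because the reference count has only dropped to one, while the symbolic link, which stores just a path name, no longer leads anywhere.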
Key-Value Stores
• smaller and faster than an SQL database
– optimized for frequent small transfers
• table is equivalent of a file system
– a table is a collection of key/value pairs
• keys have long names in a flat name space
– key names are unique within a table
• value is a (typically 64-64MB) string
– get/put (entire value)
– delete

File System Goals
• ensure the privacy and integrity of all files
• efficiently implement name-to-file binding
– find file associated with this name
– list the file names in this part of the name space
• efficiently manage data associated w/each file
– return data at offset X in file Y
– write data Z at offset X in file Y
• manage attributes associated w/each file
– what is the length of file Y
– change owner/protection of file Y to be X

Unix System 5 – Volume Structure
[Figure: layout of a volume, in block order]
block 0: boot block
block 1: super block (block size and number of I-nodes are specified in super block)
block 2: I-nodes (I-node #1 traditionally describes the root directory)
available blocks (data blocks begin immediately after the end of the I-nodes)

File System Structure
• disk volumes are divided into fixed-sized blocks
– many sizes are used: 512, 1024, 2048, 4096, 8192 ...
• most of them will store user data
• some will store organizing "meta-data"
– description of the file system (e.g. layout and state)
– file control blocks to describe individual files
– lists of free blocks (not yet allocated to any file)
• all operating systems have such data structures
– different OS and FS often have very different goals
– these result in very different implementations

Unix I-nodes and block pointers
[Figure: a UNIX I-node and the blocks it points to]
I-node fields: type, protection, owner, group, # links, file size, last access time, last written time, last I-node update time, data block pointers
block pointers (in I-node): the 1st through 10th point directly to the 1st through 10th data blocks; the 11th points to an indirect block (data blocks 11th through 1034th); the 12th points to a double-indirect block (data blocks 1035th through 2058th, 2059th, ...); the 13th points to a triple-indirect block

File Descriptor Structures
• all file systems have file descriptor structures
• contain all info about file
– type (e.g. file, directory, pipe)
– ownership and protection
– size (in bytes)
– other attributes
– location of data blocks
• descriptor location/# is file's true name
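
The block pointer scheme in the I-node figure can be turned into a small lookup: given a logical block number within a file, decide which pointer in the I-node to start from and which slot to follow at each level of indirection. The sketch below assumes 10 direct pointers and 1024 pointers per indirect block, which is inferred from the figure's numbering (data blocks 11th through 1034th behind the single indirect block) and is not stated explicitly in the slides. All names are hypothetical, and block numbers here are zero-based, so the figure's "11th" block is number 10.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_DIRECT = 10, PTRS_PER_BLOCK = 1024 };   /* assumed from the figure */

enum level { DIRECT, SINGLE_INDIRECT, DOUBLE_INDIRECT, TRIPLE_INDIRECT, TOO_LARGE };

struct block_path {
    enum level level;     /* which kind of pointer in the I-node to start from */
    uint32_t index[3];    /* slot to follow at each level, outermost first */
};

/* Build a path, checking that every slot fits in one pointer block. */
static struct block_path make_path(enum level lv, uint64_t a, uint64_t b, uint64_t c) {
    struct block_path bp;
    assert(a < PTRS_PER_BLOCK && b < PTRS_PER_BLOCK && c < PTRS_PER_BLOCK);
    bp.level = lv;
    bp.index[0] = (uint32_t)a;
    bp.index[1] = (uint32_t)b;
    bp.index[2] = (uint32_t)c;
    return bp;
}

/* Locate zero-based logical block n of a file. */
struct block_path locate_block(uint64_t n) {
    const uint64_t P = PTRS_PER_BLOCK;

    if (n < NUM_DIRECT)                 /* pointer n in the I-node itself */
        return make_path(DIRECT, n, 0, 0);
    n -= NUM_DIRECT;
    if (n < P)                          /* slot n of the single indirect block */
        return make_path(SINGLE_INDIRECT, n, 0, 0);
    n -= P;
    if (n < P * P)                      /* indirect block n/P, then slot n%P */
        return make_path(DOUBLE_INDIRECT, n / P, n % P, 0);
    n -= P * P;
    if (n < P * P * P)                  /* one more level of the same division */
        return make_path(TRIPLE_INDIRECT, n / (P * P), (n / P) % P, n % P);
    return make_path(TOO_LARGE, 0, 0, 0);   /* beyond what 13 pointers can map */
}
```

Under these assumptions, block number 1033 (the figure's 1034th) is the last slot of the single indirect block, and block number 2058 (the figure's 2059th) is the first slot of the second indirect block reached through the double-indirect pointer.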