file management


  1. CPSC 410 / 611 : Operating Systems File Management

     File Management
     • What is a file?
     • Elements of file management
     • File organization
     • Directories
     • File allocation

     What is a File?
     • A file is a collection of data elements, grouped together for the purpose of access control, retrieval, and modification.
     • Persistence: Often, files are mapped onto physical storage devices, usually nonvolatile.
     • Some modern systems define a file simply as a sequence, or stream, of data units.
     • A file system is the software responsible for
       – creating, destroying, reading, writing, modifying, moving files
       – controlling access to files
       – management of resources used by files.

  2. The Logical View of File Management
     [Figure: layered view from the user down to the disk]
     • Layers: user, file structure, records, physical blocks in memory, physical blocks on disk
     • Functions between the layers: directory management, access control, access method, blocking, disk scheduling, file allocation

     File Management
     • What is a file?
     • Elements of file management
     • File organization
     • Directories
     • File allocation
     • UNIX file system
     [Figure: the logical view is repeated beside this outline]

  3. Logical Organization of a File
     • A file is perceived as an ordered collection of records, R_0, R_1, ..., R_n.
     • A record is a contiguous block of information transferred during a logical read/write operation.
     • Records can be of fixed or variable length.
     • Organizations:
       – Pile
       – Sequential File
       – Indexed Sequential File
       – Indexed File
       – Direct/Hashed File

     Pile
     • Variable-length records
     • Chronological order
     • Random access to record by search of whole file.
     • What about modifying records?
     [Figure: Pile File]

  4. Sequential File
     • Fixed-format records
     • Records often stored in order of key field.
     • Good for applications that process all records.
     • No adequate support for random access.
     • Q: What about adding a new record?
     • A: New records go into a separate pile file (log file or transaction file).
     [Figure: Sequential File, with the key field marked]

     Indexed Sequential File
     • Similar to sequential file, with two additions.
       – Index to file supports random access.
       – Overflow file, indexed from main file.
     • Record is added by appending it to overflow file and providing link from predecessor.
     [Figure: Indexed Sequential File, showing index, main file, and overflow file]

  5. Indexed File
     • Variable-length records
     • Multiple indices
     • Exhaustive index vs. partial index
     [Figure: Indexed File, with index and partial index]

     File Representation to User (Unix)
     UNIX File Descriptors:
         int myfd;
         myfd = open("myfile.txt", O_RDONLY);
     [Figure: myfd (value 3) in user space indexes the file descriptor table in kernel space, which leads to the system file table and the in-memory inode table]

     ISO C File Pointers:
         FILE *myfp;
         myfp = fopen("myfile.txt", "w");
     [Figure: myfp in user space points to a file structure holding descriptor 3, which indexes the file descriptor table in kernel space]

  6. File Descriptors and fork()
     • With fork(), child inherits content of parent's address space, including most of parent's state:
       – scheduling parameters
       – file descriptor table
       – signal state
       – environment
       – etc.
     [Figure: entries [0] to [3] of both the parent's and the child's file desc table point to the same system file table (SFT) entries A, B, C, D; D is the entry for "myf.txt"]

     File Descriptors and fork() (II)
         #include <fcntl.h>
         #include <stdio.h>
         #include <unistd.h>

         int main(void) {
             char c = '!';
             int myfd;

             myfd = open("myf.txt", O_RDONLY);
             fork();
             read(myfd, &c, 1);
             printf("Process %ld got %c\n", (long)getpid(), c);
             return 0;
         }
     [Figure: the file is opened before fork(), so entry [3] in both the parent's and the child's table points to the same SFT entry D ("myf.txt")]

  7. File Descriptors and fork() (III)
         #include <fcntl.h>
         #include <stdio.h>
         #include <unistd.h>

         int main(void) {
             char c = '!';
             int myfd;

             fork();
             myfd = open("myf.txt", O_RDONLY);
             read(myfd, &c, 1);
             printf("Process %ld got %c\n", (long)getpid(), c);
             return 0;
         }
     [Figure: the file is opened after fork(), so the parent's entry [3] points to SFT entry D ("myf.txt") and the child's entry [3] points to a separate SFT entry E ("myf.txt")]

     Duplicating File Descriptors: dup2()
     • Want to redirect I/O from well-known file descriptor to descriptor associated with some other file?
       – e.g. stdout to file?

         #include <unistd.h>
         int dup2(int fildes, int fildes2);

     Errors:
       – EBADF: fildes or fildes2 is not valid
       – EINTR: dup2 interrupted by signal

     Example: redirect standard output to file.
         #include <fcntl.h>
         #include <unistd.h>

         int main(void) {
             int fd = open("my.file", <some_flags>, <some_mode>);
             dup2(fd, STDOUT_FILENO);
             close(fd);
             write(STDOUT_FILENO, "OK", 2);
         }

  8. Duplicating File Descriptors: dup2() (II)
     File descriptor table at each step of the redirection example:

         entry   after open          after dup2          after close
         [0]     standard input      standard input      standard input
         [1]     standard output     write to file.txt   write to file.txt
         [2]     standard error      standard error      standard error
         [3]     write to file.txt   write to file.txt

     File Management
     • What is a file?
     • Elements of file management
     • File organization
     • Directories
     • File allocation
     • UNIX file system
     [Figure: the logical view is repeated beside this outline]

  9. Allocation Methods
     • File systems manage disk resources
     • Must allocate space so that
       – space on disk utilized effectively
       – file can be accessed quickly
     • Typical allocation methods:
       – contiguous
       – linked
       – indexed
     • Suitability of particular method depends on
       – storage device technology
       – access/usage patterns

     Contiguous Allocation
     Logical file mapped onto a sequence of adjacent physical blocks.
     Pros:
     • minimizes head movements
     • simplicity of both sequential and direct access.
     • Particularly applicable to applications where entire files are scanned.
     Cons:
     • Inserting/Deleting records, or changing length of records difficult.
     • Size of file must be known a priori. (Solution: copy file to larger hole if exceeds allocated size.)
     • External fragmentation
     • Pre-allocation causes internal fragmentation
     [Figure: disk with blocks 0 to 27 and the directory below]

         file    start   length
         file1   0       5
         file2   10      2
         file3   16      10

  10. Linked Allocation
      • Scatter logical blocks throughout secondary storage.
      • Link each block to next one by forward pointer.
      • May need a backward pointer for backspacing.
      Pros:
      • blocks can be easily inserted or deleted
      • no upper limit on file size necessary a priori
      • size of individual records can easily change over time.
      Cons:
      • direct access difficult and expensive
      • overhead required for pointers in blocks
      • reliability
      [Figure: disk with blocks 0 to 27 and the directory below]

          file    start   end
          file1   9       23
          ...     ...     ...

      Variations of Linked Allocation
      Maintain all pointers as a separate linked list, preferably in main memory.
      Example: File-Allocation Tables (FAT)
      [Figure: disk with blocks 0 to 27, the directory (file1: start 9, end 23), and the pointer table below]

          block   next
          0       16
          9       0
          10      23
          16      24
          23      -1
          24      26
          26      10

  11. Indexed Allocation
      Keep all pointers to blocks in one location: index block (one index block per file)
      • Pros:
        – supports direct access
        – no external fragmentation
        – therefore: combines best of contiguous and linked allocation.
      • Cons:
        – internal fragmentation in index blocks
      • Trade-off:
        – what is a good size for index block?
        – fragmentation vs. file length
      [Figure: disk with blocks 0 to 27; the directory maps file1 to index block 7]

          index block contents: 9, 0, 16, 24, 26, 10, 23, -1, -1, -1

      Solutions for the Index-Block-Size Dilemma
      • Linked index blocks
      • Multilevel index scheme
      [Figures: linked index blocks; multilevel index scheme]

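  12. Index Block Scheme in UNIX
      • entries 0 to 9: direct
      • entry 10: single indirect
      • entry 11: double indirect
      • entry 12: triple indirect

      UNIX (System V) Allocation Scheme
      Example: block size: 1 kB
      • access byte offset 9000: 9000 = 8 × 1024 + 808, so direct entry 8 (block 367), byte 808 in that block
      • access byte offset 350000: 350000 = 341 × 1024 + 816, reached through entry 11 (double indirect block 9156), then entry 0 (block 331), then entry 75 (block 3333), byte 816 in that block
      [Figure: block entries with the block numbers 367, 9156, 331, and 3333 used in the two lookups]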
