

File Structures: An Introduction


  1. File Structures: An Introduction. Somchai Prasitjutrakul (สมชาย ประสิทธิ์จูตระกูล)

  2. Outline
     - Introduction
     - Basic Concepts
     - Secondary Storage
     - Sequential Files
     - Direct Files
     - Indexed Files
     - Tree-Based Files
     - Multilist & Inverted Files
     Footer: Prasitjutrakul

  3. Managing Large Quantities of Data
     - Accessed by multiple people and programs
     - Kept on external storage devices
     - Always reliably available for processing
     - Rapidly accessible when information is needed

  4. Speed & Capacity
     - Disks are slow.
       - RAM ≈ 100 ns
       - Disk ≈ 10 ms
     - Disks provide enormous capacity.
       - RAM ≈ 10 MB (volatile)
       - Disk ≈ 1000 MB (nonvolatile)
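
Taking the slide's approximate figures at face value, the gap works out as follows (a worked ratio only; the symbols $t_{\text{disk}}$ and $t_{\text{RAM}}$ are introduced here for illustration):

```latex
\frac{t_{\text{disk}}}{t_{\text{RAM}}}
  \approx \frac{10\ \text{ms}}{100\ \text{ns}}
  = \frac{10^{-2}\ \text{s}}{10^{-7}\ \text{s}}
  = 10^{5},
\qquad
\frac{\text{disk capacity}}{\text{RAM capacity}}
  \approx \frac{1000\ \text{MB}}{10\ \text{MB}}
  = 10^{2}
```

With these numbers, one disk access costs about as much as 100,000 memory accesses, while the disk holds about 100 times more data.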

  5. Design Goal: Minimizing disk accesses for files that keep changing in content and size.

  6. A Short History
     - 1950s: Sequential access + indexes
     - 1960s: Tree structures
     - 1970s: B-tree
     - 1980s: Extendible hashing

  7. Basic Concepts: Outline
     - Files
     - Records, Fields
     - Keys
     - Users
     - File Processing
     - File Design

  8. Filing System
     - Size
     - Persistence
     - Sharability

  9. Files
     - Savings Account File
     - Loan Applications File
     - Checking Accounts File
     - Employee File

  10. Records (Checking Accounts File)

          Account     Name      Address                    Balance
          018-745-96  Thongdee  36 Sathon, 10600         25,250.93
          108-964-09  Dundee    488 Rama 4, 10330         2,252.00
          116-057-43  Yudee     56 Chareonkrung, 10210   99,768.25
          248-922-88  Wangdee   102 Bantadthong, 10330  125,899.29
          741-673-76  Dundee    77 Saphanluang, 10330       232.48

  11. Fields: Account, Name, Address, Balance (the columns of the Checking Accounts File shown in slide 10)

  12. Files & Records
      - A file is a collection of records of the same type.
      - A record is a collection of related fields.

  13. Keys. Find the Balance of [ Account = 116-057-43 ]:
      - Locate the Checking Accounts file.
      - Access the record whose Account field contains 116-057-43.
      - Retrieve the record from the file.
      - Examine the contents of the Balance field.
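
The four steps above can be sketched as a plain sequential scan. This is a minimal illustration, not the deck's own code: the `Record` type, the in-memory array `checking_accounts` (standing in for the file on disk), and the function name `retrieve_one` are all assumptions made for the example; the data comes from slide 10.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout; the account number is kept as text
   ("ddd-ddd-dd" plus the terminating NUL). */
typedef struct {
    char   account[11];
    char   name[20];
    char   address[50];
    double balance;
} Record;

/* Hypothetical in-memory stand-in for the Checking Accounts File. */
static const Record checking_accounts[] = {
    {"018-745-96", "Thongdee", "36 Sathon, 10600",        25250.93},
    {"108-964-09", "Dundee",   "488 Rama 4, 10330",        2252.00},
    {"116-057-43", "Yudee",    "56 Chareonkrung, 10210",  99768.25},
    {"248-922-88", "Wangdee",  "102 Bantadthong, 10330", 125899.29},
    {"741-673-76", "Dundee",   "77 Saphanluang, 10330",     232.48},
};

/* Scan the file for the record whose Account field equals `key`.
   Returns a pointer to that record, or NULL if no record matches. */
const Record *retrieve_one(const Record *file, size_t n, const char *key)
{
    assert(file != NULL && key != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(file[i].account, key) == 0)
            return &file[i];   /* record located: caller reads .balance */
    }
    return NULL;
}
```

Calling `retrieve_one(checking_accounts, 5, "116-057-43")` and reading the `balance` member of the result corresponds to the last step, examining the Balance field.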

  14. Keys. Find the Balance of [ Account = 116-057-43 ]. A key is a field of a record whose contents identify the record.

  15. Primary Keys

          Account     Name      Address                    Balance
          018-745-96  Thongdee  36 Sathon, 10600         25,250.93
          108-964-09  Dundee    488 Rama 4, 10330         2,252.00
          116-057-43  Yudee     56 Chareonkrung, 10210   99,768.25
          248-922-88  Wangdee   102 Bantadthong, 10330  125,899.29
          741-673-76  Rakdee    77 Saphanluang, 10330       232.48

      A primary key is a field that uniquely identifies the record. Here the primary key is Account.

  16. Secondary Keys

          Name      Account     Address                    Balance
          Thongdee  018-745-96  36 Sathon, 10600         25,250.93
          Dundee    108-964-09  488 Rama 4, 10330         2,252.00
          Yudee     116-057-43  56 Chareonkrung, 10210   99,768.25
          Wangdee   248-922-88  102 Bantadthong, 10330  125,899.29
          Rakdee    741-673-76  77 Saphanluang, 10330       232.48

      A secondary key is a field that identifies records, but not uniquely. Here the secondary key is Name.
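
Because a secondary key is not unique, a lookup on it may return several records. The sketch below illustrates this with the slide 10 data, where the name Dundee appears twice. The `Record` type, the array `checking_accounts`, and the function `retrieve_by_name` are hypothetical names chosen for the example, and the file is modeled as an in-memory array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout and in-memory stand-in for the file. */
typedef struct {
    char   account[11];   /* primary key, "ddd-ddd-dd" plus NUL */
    char   name[20];      /* secondary key */
    char   address[50];
    double balance;
} Record;

static const Record checking_accounts[] = {
    {"018-745-96", "Thongdee", "36 Sathon, 10600",        25250.93},
    {"108-964-09", "Dundee",   "488 Rama 4, 10330",        2252.00},
    {"116-057-43", "Yudee",    "56 Chareonkrung, 10210",  99768.25},
    {"248-922-88", "Wangdee",  "102 Bantadthong, 10330", 125899.29},
    {"741-673-76", "Dundee",   "77 Saphanluang, 10330",     232.48},
};

/* Collect every record whose Name field equals `name`.
   Stores at most `max_out` pointers in `out` and returns the total
   number of matches found (which may exceed `max_out`). */
size_t retrieve_by_name(const Record *file, size_t n, const char *name,
                        const Record **out, size_t max_out)
{
    assert(file != NULL && name != NULL && out != NULL);
    size_t found = 0;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(file[i].name, name) == 0) {
            if (found < max_out)
                out[found] = &file[i];
            found++;
        }
    }
    return found;
}
```

A lookup on the primary key stops at the first match; a lookup on the secondary key has to keep scanning, which is why the function returns a count rather than a single record.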

  17. Users
      - End-users
      - Application programmers
      - System programmers

  18. File Processing Systems (diagram): end-users send the request "Retrieve Balance of Account = 116-057-43" to the Checking Accounts File Processing System (application programmers) and receive the response 99,768.25; the processing system sits on top of the File System (system programmers), which accesses the Checking Accounts File.

  19. Users' Concerns
      - End-users: receive accurate information.
      - Application programmers: are aware of the file organization, record structure, and access mechanisms.
      - System programmers: are aware of the available tools and resources to enhance the file system's efficiency.

  20. Data Transfer (diagram)
      - Application Program: Logical READ of a Logical Record (the application programmers' view of the records).
      - File System: Physical READ of a Physical Block (the system programmers' view of the records).

  21. Logical Records

          typedef struct customerTag {
              int   iAccount;        /* account number */
              char  szName[20];      /* customer name */
              char  szAddress[50];   /* customer address */
              float fBalance;        /* current balance */
          } recCustomer;

          recCustomer CustomerRecord;

      Record layout: iAccount | szName | szAddress | fBalance
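
A logical READ or WRITE of such a record can be sketched with the standard C stream functions. This is only an illustration under assumptions: the helper names `write_record` and `read_record` are hypothetical, records are fixed-length, and record numbers are 0-based.

```c
#include <assert.h>
#include <stdio.h>

typedef struct customerTag {
    int   iAccount;
    char  szName[20];
    char  szAddress[50];
    float fBalance;
} recCustomer;

/* Write `rec` as logical record number `recno` (0-based).
   Returns 0 on success, -1 on failure. */
int write_record(FILE *fp, long recno, const recCustomer *rec)
{
    assert(fp != NULL && rec != NULL && recno >= 0);
    if (fseek(fp, recno * (long)sizeof *rec, SEEK_SET) != 0)
        return -1;
    return fwrite(rec, sizeof *rec, 1, fp) == 1 ? 0 : -1;
}

/* Read logical record number `recno` (0-based) into `rec`.
   Returns 0 on success, -1 if the record does not exist or on error. */
int read_record(FILE *fp, long recno, recCustomer *rec)
{
    assert(fp != NULL && rec != NULL && recno >= 0);
    if (fseek(fp, recno * (long)sizeof *rec, SEEK_SET) != 0)
        return -1;
    return fread(rec, sizeof *rec, 1, fp) == 1 ? 0 : -1;
}
```

The program only names a record number; how the bytes are grouped into physical blocks is left to the C library and the file system. Dumping a struct byte-for-byte like this is not portable across compilers or machines (padding and byte order may differ), so it should be read as a teaching sketch rather than a storage format.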

  22. Physical Blocks (diagram): a physical block holds system data plus a logical block made up of logical records #1, #2, and #3. The number of logical records per block is the blocking factor.

  23. Blocking & Deblocking (diagram)
      - Deblocking: a physical read brings a physical block into the input buffer, from which the logical records are extracted.
      - Blocking: logical records are collected in the output buffer, which a physical write stores as a physical block.
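
Blocking and deblocking of fixed-length records can be sketched as plain memory copies between records and a block buffer. The sizes and names below (`RECORD_SIZE`, `BLOCKING_FACTOR`, `block_records`, `deblock_record`) are assumptions for illustration; the blocking factor of 3 mirrors the three logical records drawn in slide 22, and the system data area of the block is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RECORD_SIZE     16   /* hypothetical fixed logical record size, bytes */
#define BLOCKING_FACTOR  3   /* logical records per physical block */
#define BLOCK_SIZE      (RECORD_SIZE * BLOCKING_FACTOR)

/* Blocking: pack up to BLOCKING_FACTOR records into one block buffer.
   `records` points to `n` consecutive records of RECORD_SIZE bytes.
   Unused slots are zero-filled. Returns the number of records packed. */
size_t block_records(const char *records, size_t n, char *block)
{
    assert(records != NULL && block != NULL);
    size_t packed = n < BLOCKING_FACTOR ? n : BLOCKING_FACTOR;
    memset(block, 0, BLOCK_SIZE);
    for (size_t i = 0; i < packed; i++)
        memcpy(block + i * RECORD_SIZE, records + i * RECORD_SIZE, RECORD_SIZE);
    return packed;
}

/* Deblocking: copy the logical record in slot `slot` out of the block. */
void deblock_record(const char *block, size_t slot, char *record)
{
    assert(block != NULL && record != NULL && slot < BLOCKING_FACTOR);
    memcpy(record, block + slot * RECORD_SIZE, RECORD_SIZE);
}
```

In a real file system the block buffer filled by `block_records` would then be handed to a physical write, and `deblock_record` would run on a buffer filled by a physical read.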

  24. Disk Caching (diagram): records in user space; blocks in the block buffer and the disk cache; blocks on disk.

  25. Blocking Factor
      - Blocking factor vs. number of block transfers
      - Blocking factor vs. buffer size
      - Optimal blocking factor
      If the blocking factor were equal to the number of logical records, then one could successfully argue that only one data transfer would be needed!
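
The trade-off can be made concrete with two small formulas: reading all records once takes ceil(records / blocking factor) block transfers, while the buffer must hold blocking factor × record size bytes. The function names below are hypothetical, and the model assumes fixed-length records, a full sequential pass, and no caching.

```c
#include <assert.h>
#include <stddef.h>

/* Block transfers needed to read `n_records` once, sequentially:
   ceil(n_records / blocking_factor). */
size_t block_transfers(size_t n_records, size_t blocking_factor)
{
    assert(blocking_factor > 0);
    return (n_records + blocking_factor - 1) / blocking_factor;
}

/* Bytes of buffer needed to hold one block of fixed-length records. */
size_t buffer_bytes(size_t record_size, size_t blocking_factor)
{
    assert(blocking_factor > 0);
    return record_size * blocking_factor;
}
```

For a 50-record file, a blocking factor of 1 gives 50 transfers, 3 gives 17, and 50 gives a single transfer, but the buffer then has to be as large as the whole file, which is why the single-transfer argument does not settle the choice.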

  26. Logical & Physical File Structure
      - Logical file structure: the organization of all logical records in the file.
      - Physical file structure: the organization of all the physical blocks stored in secondary storage.

  27. Logical & Physical File Structure (diagram): a sequential file of records 1 to 50 (keys 1 to 50) shown next to a physical linked sequential file, whose blocks hold the record pairs (3, 4), (47, 48), (49, 50), and (1, 2).
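
One way to read the physical linked sequential file is that blocks may sit anywhere on disk and each block records which block follows it logically. The sketch below assumes exactly that; the `Block` layout, the `next` link, the two-records-per-block size, and the function name are hypothetical, and the 8-record example data is made up for brevity rather than taken from the slide.

```c
#include <assert.h>
#include <stddef.h>

#define RECORDS_PER_BLOCK 2
#define NO_BLOCK (-1)

/* Hypothetical physical block: two record keys plus a link to the
   block that follows it in logical (key) order. */
typedef struct {
    int keys[RECORDS_PER_BLOCK];
    int next;   /* index of the next block, or NO_BLOCK at the end */
} Block;

/* Follow the links starting at block `head` and copy the keys into
   `out` in logical order. Returns the number of keys copied. */
size_t read_in_logical_order(const Block *blocks, size_t n_blocks,
                             int head, int *out, size_t max_out)
{
    assert(blocks != NULL && out != NULL);
    size_t count = 0;
    size_t visited = 0;   /* guards against a corrupted, cyclic chain */
    for (int b = head; b != NO_BLOCK && visited < n_blocks;
         b = blocks[b].next, visited++) {
        assert(b >= 0 && (size_t)b < n_blocks);
        for (int i = 0; i < RECORDS_PER_BLOCK && count < max_out; i++)
            out[count++] = blocks[b].keys[i];
    }
    return count;
}
```

With blocks stored physically as (3, 4), (7, 8), (5, 6), (1, 2) and links 3 → 0 → 2 → 1, starting at block 3 yields keys 1 through 8: the logical order is sequential even though the physical order is not.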

  28. Access Path (diagram): index entries (5, 15, 31, 70, 90, 130, 162, 200, 250 and 9, 19, 23, 27, 31, 39, 42, 49, 53, 60, 65, 70) leading to the data records 2 Somchai P., 3 Somboon T., 5 Chukiat V., 7 Samruay W., 8 Supat R., 9 Chatchart S., 12 Kukiat R., 14 Wiwat W., 15 Boonchai S., 34 Yingyong E., 35 Rangsan S., 39 Kriengkai F.

  29. Access Path (diagram, continued): the same index entries and data records as slide 28.

  30. Access Methods (diagram): the Access Method connects the Target Record and the Physical File Structure.

  31. Classification of Access Methods
      - Primary access methods (sequential access methods and random access methods): Sequential, Direct, Hash, Indexed sequential, Binary search, AVL-tree, Paged tree, B-tree, B+-tree, Trie
      - Secondary access methods: Inverted file, Cellular inverted, Multilist, Cellular multilist

  32. File Design
      - Logical file design
        - select one of the available file organizations
        - design a new file organization
      - Physical file design
        - design the physical file

  33. File Design
      - Selection of blocking factor
      - Allocation of the I/O buffers
      - Size of the physical file
      - Organization of the physical blocks
      - Design or selection of the access method
      - Selection of the primary key
      - File growth
      - Reorganization point

  34. File Operations
      - RetrieveAll
      - Batch
      - RetrieveOne
      - RetrieveNext
      - RetrievePrevious
      - InsertOne
      - DeleteOne
      - UpdateOne
      - RetrieveFew

  35. Performance
      - Response time
        - The type of allowable operations.
        - The frequency of each type of operation.
      - Ex.: 95% RetrieveOne, 5% Batch. Random or sequential?
      - Search length
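
To make the example mix concrete, the sketch below estimates the cost of a sequential organization under a deliberately simple model. Everything in the model is an assumption made here, not something the slide states: search length is taken to be the number of records examined, every record is equally likely to be requested, a RetrieveOne scans from the start of the file, and a Batch operation examines every record once. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Average number of records examined by a successful sequential
   search over `n_records` equally likely targets: (1 + 2 + ... + n) / n,
   which equals (n + 1) / 2. */
double average_search_length(size_t n_records)
{
    assert(n_records > 0);
    double total = 0.0;
    for (size_t pos = 1; pos <= n_records; pos++)
        total += (double)pos;
    return total / (double)n_records;
}

/* Expected records examined per operation when a fraction
   `p_retrieve_one` of operations are RetrieveOne and the rest are
   Batch operations that touch all `n_records` (assumed model). */
double expected_records_examined(size_t n_records, double p_retrieve_one)
{
    assert(p_retrieve_one >= 0.0 && p_retrieve_one <= 1.0);
    return p_retrieve_one * average_search_length(n_records)
         + (1.0 - p_retrieve_one) * (double)n_records;
}
```

For a 50-record file the average search length is 25.5 records. With a 95% RetrieveOne workload the cost is dominated by single-record lookups, which is the situation where a random access method would be expected to pay off over a sequential one.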
