Indexed Files: Outline



  1. Indexed Files: Outline
     - Introduction
     - Indexed Files
     - Full Index Organization
     - Indexed Sequential Files
     - Multilevel Indexes
     - Overflow Management
     - Performance Analysis
     rasitjutrakul

  2. Indexed Files

     File structure   Ordered sequential processing   Random access
     Sequential       fast                            slow
     Direct           slow                            fast
     Indexed          fast                            fast

  3. Indexed file
     - Look up the key in the index to get a block address, then read that block.

     Data file:
       block #   key   name
       0         675   Somchai
       1         693   Somwang
       2         270   Somnuek
       3         105   Somsamorn
       4         987   Somroo

     The index holds one (key, block address) entry per record. Sorted by key:
       key   block #
       105   3
       270   2
       675   0
       693   1
       987   4

  4. Full Index Organization

     Index file (one entry per record, key -> data block #):
       index block #0:  4 -> 0,   7 -> 0,   10 -> 1
       index block #1:  16 -> 1,  18 -> 2,  19 -> 2
       index block #2:  20 -> 3,  21 -> 3,  22 -> 4
       index block #3:  23 -> 4,  24 -> 5,  25 -> 5
       index block #4:  26 -> 6,  28 -> 6,  33 -> 7
       index block #5:  35 -> 7,  37 -> 8,  39 -> 8
       index block #6:  41 -> 9,  44 -> 9,  48 -> 10
       index block #7:  78 -> 10, 81 -> 11, 92 -> 11

     Data file (two records per block, blocks #0 to #11):
       4 7 | 10 16 | 18 19 | 20 21 | 22 23 | 24 25 | 26 28 | 33 35 | 37 39 | 41 44 | 48 78 | 81 92

  5. Indexed File Structure
     - Binary search for the target key in the index
     - RetrieveOne: SL[BinarySearch] + 1 rba
     - RetrieveAll: 1 rba + m2 sba
     - DeleteOne: SL[RetrieveOne] + 2 sba
     - InsertOne: SL[RetrieveOne] + 2 sba
     - File layout: m1 index blocks, m2 data blocks
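
As a rough illustration of RetrieveOne on a full index, the sketch below binary-searches a sorted list of (key, block number) entries and then reads the single data block the entry points to. The keys and names are the ones from the slide 3 example. The in-memory dictionary standing in for the data file and the function name `retrieve_one` are assumptions made for the example, not part of the slides.

```python
from bisect import bisect_left

# Data file from the slide 3 example: block number -> (key, name).
# A dict stands in for the blocks on disk (assumption for illustration).
data_blocks = {
    0: (675, "Somchai"),
    1: (693, "Somwang"),
    2: (270, "Somnuek"),
    3: (105, "Somsamorn"),
    4: (987, "Somroo"),
}

# Full index: one (key, block number) entry per record, sorted by key.
index = sorted((key, blk) for blk, (key, _name) in data_blocks.items())
index_keys = [key for key, _blk in index]


def retrieve_one(target):
    """Binary-search the index, then read the one data block it points to."""
    pos = bisect_left(index_keys, target)
    if pos == len(index_keys) or index_keys[pos] != target:
        return None  # key is not in the index, so no data block is read
    _key, blk = index[pos]
    return data_blocks[blk]  # the single data-block access after the search
```

In a real file the binary search would touch index blocks rather than list elements, which is where the SL[BinarySearch] term in the cost comes from.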

  6. Indexed Sequential Files
     - If the records in the data file are ordered:
       - ordered sequential processing is fast;
       - the file does not have to be fully indexed (keep only the maximum key of each data block);
       - the number of index entries and index blocks decreases, so the search length decreases and performance improves.

     Example:
       Index entries: 18932, 38211
       Data block with maximum key 18932: 16345, 17324, 17543, 18932
       Data block with maximum key 38211: 19823, 20221, 23847, 38211
       Names shown in the records: Siripol, Sirirak, Siri, Siriroj, Toy, Tao, Ting, Took

  7. Indexed Sequential File

     Index file (largest key of each data block -> data block #; the last key is shown as "x"):
       index block #0:  7 -> 0,   16 -> 1,  19 -> 2
       index block #1:  21 -> 3,  23 -> 4,  25 -> 5
       index block #2:  28 -> 6,  35 -> 7,  39 -> 8
       index block #3:  44 -> 9,  78 -> 10, x -> 11

     Data file (two records per block, blocks #0 to #11):
       4 7 | 10 16 | 18 19 | 20 21 | 22 23 | 24 25 | 26 28 | 33 35 | 37 39 | 41 44 | 48 78 | 81 92
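
A minimal sketch of a lookup through this kind of sparse index, using the data blocks from the slide. It assumes the "x" in the last index entry means "no upper bound" and models it as positive infinity; the function names are hypothetical.

```python
from bisect import bisect_left
import math

# Data file from the slide: two records per block, keys in ascending order.
data_blocks = [
    [4, 7], [10, 16], [18, 19], [20, 21], [22, 23], [24, 25],
    [26, 28], [33, 35], [37, 39], [41, 44], [48, 78], [81, 92],
]

# Sparse index: the largest key of each block. The slide shows the last
# entry as "x"; it is modelled here as +infinity (assumption).
index_keys = [blk[-1] for blk in data_blocks[:-1]] + [math.inf]


def find_block(target):
    """Return the number of the only data block that could hold target."""
    # First index entry whose key is >= target.
    return bisect_left(index_keys, target)


def retrieve_one(target):
    """Return the block number holding target, or None if it is absent."""
    blk = find_block(target)
    return blk if target in data_blocks[blk] else None
```

For example, `find_block(20)` selects block 3 because 21 is the first index key that is not smaller than 20. The index has 12 entries instead of the 24 a full index would need for the same file.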

  8. Indexed Sequential Files
     - 100,000 records, each of size 500 bytes
     - Index record size = 20 bytes
     - Block size = 2000 bytes
       - 1 block = 4 data records, or 100 index records
       - 25,000 data blocks
     - Full index: index file of 100,000 records = 1000 index blocks
     - Indexed sequential: index file of 25,000 records = 250 index blocks
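
The arithmetic on this slide can be checked with a few lines. The variable names are chosen for the example; the numbers are the slide's.

```python
import math

num_records = 100_000
record_size = 500        # bytes per data record
index_entry_size = 20    # bytes per index record
block_size = 2000        # bytes per block

data_bf = block_size // record_size          # data records per block
index_bf = block_size // index_entry_size    # index records per block
data_blocks = math.ceil(num_records / data_bf)

# Full index: one index record per data record.
full_index_blocks = math.ceil(num_records / index_bf)

# Indexed sequential: one index record per data block.
sparse_index_blocks = math.ceil(data_blocks / index_bf)

print(data_bf, index_bf, data_blocks, full_index_blocks, sparse_index_blocks)
```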

  9. Multilevel Indexed Sequential
     - Trimming the search length gives better performance
     - Modify the logical structure of the index file
     [Figure: index trees with one level, two levels, and three levels]

  10. Indexed Sequential File: 2 levels

      Level 0 (largest key of each level 1 index block): 19, 25, 39, x
      Level 1 (largest key of each data block -> data block #):
        7 -> 0,   16 -> 1,  19 -> 2
        21 -> 3,  23 -> 4,  25 -> 5
        28 -> 6,  35 -> 7,  39 -> 8
        44 -> 9,  78 -> 10, x -> 11
      Data file (two records per block, blocks #0 to #11):
        4 7 | 10 16 | 18 19 | 20 21 | 22 23 | 24 25 | 26 28 | 33 35 | 37 39 | 41 44 | 48 78 | 81 92

  11. Indexed Sequential File: 3 levels

      Level 0: 25, x
      Level 1: 19, 25, 39, x
      Level 2 (largest key of each data block -> data block #):
        7 -> 0,   16 -> 1,  19 -> 2
        21 -> 3,  23 -> 4,  25 -> 5
        28 -> 6,  35 -> 7,  39 -> 8
        44 -> 9,  78 -> 10, x -> 11
      Data file (two records per block, blocks #0 to #11):
        4 7 | 10 16 | 18 19 | 20 21 | 22 23 | 24 25 | 26 28 | 33 35 | 37 39 | 41 44 | 48 78 | 81 92
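
The three-level lookup can be sketched as a walk from level 0 down to the data file, reading one index block per level. The grouping of entries into index blocks below is read off the figure and is an assumption, as is modelling "x" as positive infinity; the explicit child numbers stored next to each key are added for the example.

```python
from bisect import bisect_left
import math

X = math.inf  # stands in for the "x" entries on the slide (assumption)

# levels[i] is a list of index blocks; each block is a list of
# (largest key, number of the child block in the next level down).
# Children of the last level are data block numbers.
levels = [
    [[(25, 0), (X, 1)]],                                  # level 0
    [[(19, 0), (25, 1)], [(39, 2), (X, 3)]],              # level 1
    [[(7, 0), (16, 1), (19, 2)], [(21, 3), (23, 4), (25, 5)],
     [(28, 6), (35, 7), (39, 8)], [(44, 9), (78, 10), (X, 11)]],  # level 2
]


def find_data_block(target):
    """Follow one index block per level; return the data block number."""
    blk = 0  # level 0 has a single block
    for level in levels:
        entries = level[blk]
        keys = [key for key, _child in entries]
        pos = bisect_left(keys, target)  # first entry with key >= target
        blk = entries[pos][1]
    return blk
```

A lookup of key 33 reads level 0 (entry x), level 1 (entry 39), level 2 (entry 35), and then data block 7: three index blocks plus one data block.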

  12. Overflow Records
      - Insertion generates overflow records
      - Allocate empty slots in each block
      - Reorganize the file if needed
      - Allocate extra overflow blocks (overflow area)
        - overflow records are linked in a logical, ordered chain to the primary block to which they belong
        - overflow records are not blocked

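  13. Overflow Records
      [Figure: primary area and overflow area, with pointers from primary blocks to overflow records; "x" marks an entry without a pointer]
      Primary area blocks: 2 4 | 10 13 | 18 19 | 20 21 | 22 23 | 24 25 | 26 28 | 33 35 | ... | 37 38 | 41 44 | 48 78 | 81 92
      Overflow area records: 7, 16, 15, 6, 5, 39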
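
A minimal sketch of overflow chaining for one primary block, under stated assumptions: the block holds two records, an insert into a full block pushes the largest key out into the overflow chain, and the chain is kept in key order. A sorted Python list stands in for the linked chain, and the insertion order (2, 6, 5) is a guess that happens to reproduce the first block of the figure.

```python
from bisect import insort


class PrimaryBlock:
    """One primary block plus its ordered overflow chain."""

    def __init__(self, keys, capacity=2):
        self.capacity = capacity
        self.keys = sorted(keys)   # records held in the primary block
        self.overflow = []         # overflow records, kept in key order

    def insert(self, key):
        insort(self.keys, key)
        # While the block is over capacity, move its largest key to the chain.
        while len(self.keys) > self.capacity:
            insort(self.overflow, self.keys.pop())

    def scan(self):
        """Return all records of this block in key order."""
        return self.keys + self.overflow


# First primary block of the earlier slides: keys 4 and 7.
block0 = PrimaryBlock([4, 7])
for key in (2, 6, 5):  # assumed insertion order
    block0.insert(key)

print(block0.keys, block0.overflow)
```

Because every key in the chain is at least as large as every key left in the primary block, a sequential scan only has to follow the chain after reading the block.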

  14. Performance Analysis
      - The number of rba's needed to retrieve a target depends on the height of the index tree.
      - The height depends on the NBLK and BF of the index.
      - Let k be the average number of index entries per index block.
      - Let the index tree be an h-level tree.

      $$NBLK_{index} = 1 + k + k^2 + \cdots + k^{h-1} = \frac{k^h - 1}{k - 1}$$
      $$NBLK_{data} = k^h$$
      $$h = \log_k (NBLK_{data})$$

  15. Performance: RetrieveOne
      - 100,000 records, each of size 500 bytes
      - Index record size = 20 bytes
      - Block size = 2000 bytes
      - Full index: 1000 blocks: $\log_2 1000 \approx 10$ rba
      - Indexed sequential (1 level): 250 blocks: $1 + \log_2 250 \approx 9$ rba
      - Indexed sequential (multilevel): BF = 2000/20 = 100
        - $h = \lceil \log_{100} 100000 \rceil = 3$
        - $1 + h = 4$ rba
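
These three estimates can be reproduced with a short calculation. The helper `index_height` is a hypothetical name; it finds the smallest h with k^h at least the number of data blocks, using integer arithmetic to avoid rounding trouble in floating-point logarithms. The slide takes the logarithm of 100,000; the sketch uses the 25,000 data blocks instead, following h = log_k(NBLK_data), and both give h = 3.

```python
import math


def index_height(data_blocks, fanout):
    """Smallest h such that fanout**h >= data_blocks."""
    h, reach = 0, 1
    while reach < data_blocks:
        reach *= fanout
        h += 1
    return h


# Full index: binary search over 1000 index blocks (the slide's count).
full_index_rba = math.ceil(math.log2(1000))

# One-level indexed sequential: binary search over 250 index blocks,
# then one data block.
one_level_rba = 1 + math.ceil(math.log2(250))

# Multilevel: one block per index level, then one data block.
multilevel_rba = 1 + index_height(25_000, 100)

print(full_index_rba, one_level_rba, multilevel_rba)
```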

  16. Initial Loading
      Keys to load, in order: 4, 7, 9, 34, 63, 66, 70, 71
      [Figure: the data and index blocks built up step by step as the first keys 4, 7, 9, and 34 are loaded]

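  17. Initial Loading
      Keys loaded, in order: 4, 7, 9, 34, 63, 66, 70, 71
      [Figure: the file after loading, with "x" as the last key at each index level]
      Data blocks: 4 7 | 9 34 | 63 66 | 70 71
      Index blocks above the data: 7 34 | 66 x
      Top index block: 34 x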
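
Initial loading from sorted keys can be sketched bottom-up: fill the data blocks, take the largest key of each block as the lowest index level, and repeat upwards until a single index block remains. The blocking factors of 2, the use of positive infinity for "x", and the function names are assumptions made for this example.

```python
import math


def chunk(items, size):
    """Split a list into consecutive blocks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def initial_load(sorted_keys, data_bf=2, index_bf=2):
    """Return (index levels, data blocks); levels[0] is the top level."""
    data_blocks = chunk(sorted_keys, data_bf)
    # Lowest index level: largest key of each data block, with the last
    # entry left open-ended ("x" on the slide, +infinity here).
    keys = [blk[-1] for blk in data_blocks[:-1]] + [math.inf]
    levels = []
    while True:
        blocks = chunk(keys, index_bf)
        levels.insert(0, blocks)
        if len(blocks) == 1:
            break
        # Next level up: largest key of each index block just built.
        keys = [blk[-1] for blk in blocks]
    return levels, data_blocks


levels, data_blocks = initial_load([4, 7, 9, 34, 63, 66, 70, 71])
print(levels)
print(data_blocks)
```

With the slide's eight keys this gives a top block holding 34 and x above two index blocks holding 7, 34 and 66, x.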

  18. Reorganization Point
      - Reorganize when performance has deteriorated by 50% from the performance just after initial loading.
      - Let $n_1$ be the number of RetrieveOne operations per unit time.
      - Let $n_2$ be the number of RetrieveAll operations per unit time.
      - Let L be the average length of the overflow record chains.

      $$NBLK_{index} = 1 + k + k^2 + \cdots + k^{h-1} = \frac{k^h - 1}{k - 1}$$
      $$NBLK_{data} = k^h$$
      $$h = \log_k (NBLK_{data})$$

  19. Physical Structure
      [Figure: index hierarchy. The master index points to cylinder indexes, and each cylinder index points to track indexes.]

  20. Physical Structure
      [Figure: layout of two cylinders, tracks 0 to 19. In the first cylinder, track 0 holds the master index, the cylinder index, a track index, and data blocks. In the second cylinder, track 0 holds the cylinder index, a track index, and data blocks. Every other track holds a track index followed by data blocks.]

  21. Physical Structure
      [Figure: two layouts. First: the level 0 index, then repeated groups of a level 1 index and its data blocks. Second: the level 0 index, then repeated groups of level 1 index blocks, data blocks, and an overflow area.]
      - Faster access
        - Mingle the data and index blocks: locality
        - Keep the master index (level 0 index) in RAM

  22. Example
      - 10,000 records, 160 bytes/record; the key is 16 bytes, a pointer is 4 bytes
      - HP7925: 256 bytes/sector, 64 sectors/track, 9 tracks/cylinder, 815 cylinders
      - Choose BF = 6: utilization = (160 x 6)/(256 x 4) = 93.8%
      - 1 block = 1024 bytes: 1024/(16+4) = 51 index entries
      - 1 track = 64/4 = 16 blocks
      - 10,000 records: 10000/6 = 1667 blocks
      - Number of cylinders = 1667/(16 x 9 - 10) = 13 cylinders
        (16 blocks/track and 9 tracks/cylinder, less 9 track index blocks + 1 cylinder index block)
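
The sizing steps in this example can be replayed as a short calculation. The variable names are chosen for the example; the figures are the slide's, including the choice of 4 sectors per block.

```python
import math

records, record_size = 10_000, 160          # bytes per record
key_size, pointer_size = 16, 4              # bytes
sector_size, sectors_per_track, tracks_per_cylinder = 256, 64, 9

bf = 6                                      # records per block (chosen)
sectors_per_block = 4
block_size = sector_size * sectors_per_block
utilization = record_size * bf / block_size

index_entries_per_block = block_size // (key_size + pointer_size)
blocks_per_track = sectors_per_track // sectors_per_block
data_blocks = math.ceil(records / bf)

# Each cylinder gives up 9 track index blocks and 1 cylinder index block.
index_blocks_per_cylinder = tracks_per_cylinder + 1
data_blocks_per_cylinder = (blocks_per_track * tracks_per_cylinder
                            - index_blocks_per_cylinder)
cylinders = math.ceil(data_blocks / data_blocks_per_cylinder)

print(f"{utilization:.1%}", index_entries_per_block, data_blocks, cylinders)
```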
