
Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency


  1. Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency
     Gerth Stølting Brodal, Allan Grønlund Jørgensen, Thomas Mølhave (University of Aarhus)
     ADS 2007, 3rd Bertinoro Workshop on Algorithms and Data Structures, University Residential Centre of Bertinoro, Italy, September 30-October 5, 2007

  2. Outline: Binary Searching; Fault tolerance; I/O Efficiency; This talk; Future work
     (Slide footer: Dictionaries: Fault Tolerance versus I/O Efficiency, Brodal, Jørgensen, Mølhave)

  3. Search(17)
     - Sorted array: 4 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38
     - Search key: 17
     - $O(\log N)$ comparisons
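
A minimal sketch of the search on this slide, assuming a plain textbook binary search over the 21 values shown. The function name `binary_search` and the probe counter are illustrative and do not come from the slides.

```python
def binary_search(sorted_values, key):
    """Return (index or None, number of probes) for key in sorted_values."""
    lo, hi = 0, len(sorted_values) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_values[mid] == key:
            return mid, probes
        if sorted_values[mid] < key:
            lo = mid + 1   # key can only be to the right of mid
        else:
            hi = mid - 1   # key can only be to the left of mid
    return None, probes


# The array from the slide; 17 is not one of the listed values.
values = [4, 7, 10, 13, 14, 15, 16, 18, 19, 23, 25,
          26, 27, 29, 30, 31, 32, 33, 34, 36, 38]

index, probes = binary_search(values, 17)
print(index, probes)
```

With 21 elements, the search for 17 ends after 5 probes, in line with the $O(\log N)$ bound.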

  4. Search(17) with a soft memory error
     - Array: 4 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38
     - Corrupted value: 9
     - Search key: 17?
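
To illustrate how a single soft error can derail a standard binary search, here is a small sketch. The array contents, the presence of 17 in the array, and the choice of corrupted cell are all assumptions made for illustration; the slide text does not show which cell is corrupted.

```python
def binary_search(cells, key):
    """Textbook binary search; returns an index or None."""
    lo, hi = 0, len(cells) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if cells[mid] == key:
            return mid
        if cells[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


# Hypothetical sorted memory that does contain the key 17.
memory = [4, 7, 10, 13, 14, 15, 16, 17, 18, 19, 23,
          25, 26, 27, 29, 30, 31, 32, 33, 34, 36, 38]

# Assumed fault: the first probed cell (index 10) flips from 23 to 9.
corrupted = list(memory)
corrupted[10] = 9

print(binary_search(memory, 17))     # key is found in the clean array
print(binary_search(corrupted, 17))  # the corrupted probe sends the search right
```

The key 17 is itself uncorrupted, yet the search on the corrupted copy misses it because one wrong comparison discards the half that contains it.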

  5. Faulty-Memory RAM Model (Finocchi and Italiano, STOC'04)
     - Content of memory cells can get corrupted
     - Corrupted and uncorrupted content cannot be distinguished
     - $O(1)$ safe registers
     - Assumption: At most $\delta$ corruptions
     - Example: Sorting requires time $\Theta(N \log N + \delta^2)$ (Finocchi, Grandoni, Italiano, ICALP'06)

  6. Faulty-Memory RAM: Searching
     - $\Theta(\log N + \delta)$ comparisons
     - Lower bound: Finocchi, Grandoni, Italiano, ICALP'06
     - Upper bound: Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA'07

  7. Faulty-Memory RAM: Searching
     - Array: 4 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38; corrupted value 9; search key 17?
     - [Figure labels: Low confidence, High confidence, Problem?]
     - Requirement: If there exists an uncorrupted element equal to the search key, we should find such an element

  8. Faulty-Memory RAM: Searching
     - Contradiction, i.e. at least one fault
     - When are we done ($\delta = 3$)?
     - If the range contains at least $\delta + 1$ elements smaller than $x$ and $\delta + 1$ elements larger than $x$, then at least one element of each kind is uncorrupted, i.e. $x$ must be contained in the range
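
The stopping rule on this slide is a pigeonhole argument: with at most $\delta$ corruptions, any $\delta + 1$ cells include an uncorrupted one. The sketch below assumes the two symbols lost from the slide text stand for elements smaller than and larger than the search key $x$. The function name `range_must_contain_key` and the sample values are hypothetical.

```python
def range_must_contain_key(window, key, delta):
    """Check the stopping rule: delta+1 smaller and delta+1 larger cells.

    With at most delta corrupted cells, delta+1 cells that compare smaller
    than key must include a genuinely smaller element, and likewise for
    larger, so a sorted array can only hold key between those two.
    """
    smaller = sum(1 for value in window if value < key)
    larger = sum(1 for value in window if value > key)
    return smaller >= delta + 1 and larger >= delta + 1


# Hypothetical range around the search key 17.
window = [10, 13, 14, 16, 18, 19, 23, 25]

print(range_must_contain_key(window, 17, delta=3))
print(range_must_contain_key(window, 17, delta=4))
```

For $\delta = 3$ the window has four cells on each side of 17, so the rule holds; for $\delta = 4$ it does not, and more evidence would be needed.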

  9. Faulty-Memory RAM: $\Theta(\log N + \delta)$ Searching (Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA'07)
     - [Figure: search path with numbered comparisons]
     - If verification fails → contradiction, i.e. ≥ 1 memory fault → ignore the 4 last comparisons → backtrack one level of the search

  10. Faulty-Memory RAM: $\Theta(\log N + \delta)$ Searching (Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA'07)
      - [Figure: search path with numbered comparisons]
      - Standard binary search + verification steps
      - At most $\delta$ verification steps can fail and cause backtracking
      - Detail: Avoid repeated comparison with the same (wrong) element by grouping elements into blocks of size $O(\delta)$

  11. Faulty-Memory RAM: Reliable Values
      - Store $2\delta + 1$ copies of value $x$; at most $\delta$ copies can be corrupted
      - $x$ = majority
      - Time $O(\delta)$ using two safe registers (candidate and count), Boyer and Moore '91
      - Example ($\delta = 5$):
        Copies:    y y y x x y x x x y x
        Candidate: y y y y y y y - x - x
        Count:     1 2 3 2 1 2 1 0 1 0 1
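
The majority computation on this slide can be sketched with the Boyer-Moore voting scheme, which needs only the two registers named there. The function name `majority_with_trace` and the returned trace are illustrative additions; the input is the $\delta = 5$ example from the slide.

```python
def majority_with_trace(copies):
    """Boyer-Moore majority vote using only a candidate and a counter.

    Returns the final candidate and the counter value after each copy.
    The candidate is the true majority only if a majority exists, which
    holds here because at least delta+1 of the 2*delta+1 copies are intact.
    """
    candidate, count = None, 0
    counts = []
    for value in copies:
        if count == 0:
            candidate, count = value, 1   # adopt a new candidate
        elif value == candidate:
            count += 1                    # one more vote for the candidate
        else:
            count -= 1                    # cancel one vote
        counts.append(count)
    return candidate, counts


copies = list("yyyxxyxxxyx")  # 2*delta + 1 = 11 copies, delta = 5
value, counts = majority_with_trace(copies)
print(value)   # -> x
print(counts)  # -> [1, 2, 3, 2, 1, 2, 1, 0, 1, 0, 1]
```

The counter sequence matches the Count row of the slide, and the scan touches each copy once, giving the $O(\delta)$ time bound.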

  12. Faulty-Memory RAM: Dynamic Dictionaries (Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA'07)
      - Packed array (Itai, Konheim, Rodeh, 1981)
        - Reliable pointers and keys
        - Updates $O(\delta \log^2 N)$
        - Searches = fault tolerant $O(\log N + \delta)$
      - 2-level buckets of size $O(\delta \log N)$
        - Root: Reliable pointers and keys
        - Bucket search/update amortized $O(\log N + \delta)$
      - [Figure labels: $\Theta(\delta \log N)$ elements; $\Theta(\delta)$ elements]
      - Search and update amortized $O(\log N + \delta)$

  13. I/O Model

  14. I/O Model (Aggarwal and Vitter 1988)
      - [Figure: CPU, Memory, I/O, External Memory]
      - $N$ = problem size
      - $M$ = memory size
      - $B$ = I/O block size
      - One I/O moves $B$ consecutive records to/from disk
      - Complexity = number of I/Os
      - Example: Sorting requires $\Theta\left(\frac{N}{B} \log_{M/B} \frac{N}{B}\right)$ I/Os
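
As a worked example of the sorting bound, the sketch below evaluates $\frac{N}{B} \log_{M/B} \frac{N}{B}$ with constant factors ignored. The function name `sorting_io_bound` and the sample parameter values are assumptions chosen for illustration.

```python
import math


def sorting_io_bound(n, m, b):
    """Evaluate (N/B) * log_{M/B}(N/B), ignoring constant factors."""
    blocks = n / b                                   # N/B blocks of input
    passes = math.log2(blocks) / math.log2(m / b)    # log base M/B of N/B
    return blocks * passes


# Hypothetical sizes in records: N = 2^30, M = 2^20, B = 2^10.
n, m, b = 2**30, 2**20, 2**10
print(sorting_io_bound(n, m, b))
```

Here $N/B = 2^{20}$ and $M/B = 2^{10}$, so the logarithm is 2 and the bound evaluates to $2^{21}$ I/Os, compared with $N \log_2 N = 30 \cdot 2^{30}$ comparisons in internal memory.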

  15. B-trees
      - [Figure: tree of height $O(\log_B N)$ with node degree $\Omega(B)$; search path marked]
      - Search and update $O(\log_B N)$

  16. Fault-Tolerance versus I/O Efficiency

  17. Lower Bound for Fault-Tolerant External Searching
      - Adversary argument [Figure label: Possible values]
      - If $B^{\varepsilon}$ slabs per I/O → factor $B^{\varepsilon}$ reduction and $B^{1-\varepsilon}$ faults
      - After $k$ I/Os, $N/(B^{\varepsilon})^k - k \cdot B^{1-\varepsilon}$ elements remain
      - $\Omega\left(\frac{1}{\varepsilon} \log_B N + \frac{\delta}{B^{1-\varepsilon}}\right)$ I/Os required [minimized wrt $\varepsilon$]
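
To see the trade-off in $\varepsilon$, the sketch below evaluates the expression $\frac{1}{\varepsilon} \log_B N + \frac{\delta}{B^{1-\varepsilon}}$ on a grid and picks the minimizer. The function names, the grid search, the range $0 < \varepsilon \le 1$, and the sample parameters are all assumptions for illustration; constant factors are ignored.

```python
import math


def search_io_bound(n, b, delta, eps):
    """Evaluate (1/eps) * log_B(N) + delta / B^(1 - eps)."""
    return (1 / eps) * math.log(n, b) + delta / b ** (1 - eps)


def best_epsilon(n, b, delta, steps=1000):
    """Grid search over eps in (0, 1] for the smallest bound (assumed range)."""
    candidates = [i / steps for i in range(1, steps + 1)]
    return min(candidates, key=lambda eps: search_io_bound(n, b, delta, eps))


# Hypothetical parameters: N = 2^30 records, B = 2^10, delta = 10^6 faults.
n, b, delta = 2**30, 2**10, 10**6
eps = best_epsilon(n, b, delta)
print(eps, search_io_bound(n, b, delta, eps))
```

With no faults the first term dominates and $\varepsilon = 1$ is best, recovering the usual $\log_B N$; with many faults a smaller $\varepsilon$ pays more search levels to divide $\delta$ by a larger power of $B$.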

  18. Randomized Upper Bound for Fault-Tolerant External Searching
      - Sorted array + $2\delta$ identical B-trees (over $N/(2\delta)$ elements, stored in BFS layout)
      - Search: Select random tree for each node on search path + verification
      - Probability no faults on path: $\prod_{i=1}^{\log_B N} \left(1 - \frac{\beta_i}{2\delta}\right) \ge \frac{1}{2}$, where $\sum \beta_i \le \delta$
      - Search $O(\log_B N + \delta/B)$ expected
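
The probability bound on this slide can be checked numerically. The sketch below assumes $\beta_i$ counts how many of the $2\delta$ tree copies have a corrupted node at level $i$ of the search path, so a random copy is clean at that level with probability $1 - \beta_i/(2\delta)$. That reading, the function name, and the sample values are assumptions.

```python
import math


def prob_no_fault_on_path(betas, delta):
    """Product over levels of (1 - beta_i / (2 * delta))."""
    return math.prod(1 - beta / (2 * delta) for beta in betas)


delta = 5

# Faults spread over several levels; sum(betas) = 5 <= delta.
spread = [1, 2, 0, 2]
print(prob_no_fault_on_path(spread, delta))

# Worst case for the bound: all delta faults on a single level.
concentrated = [5, 0, 0, 0]
print(prob_no_fault_on_path(concentrated, delta))
```

Since $\prod (1 - a_i) \ge 1 - \sum a_i$ for $a_i \in [0, 1]$ and $\sum \beta_i/(2\delta) \le 1/2$, the product never drops below $1/2$; the concentrated case attains exactly $1/2$.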

  19. Deterministic Upper Bound for Fault-Tolerant External Searching
      - Sorted array + $2\delta/B^{1-\varepsilon}$ identical B-trees of degree $B^{\varepsilon}$, with $B^{1-\varepsilon}$ copies of each key + min/max
      - Search: Verify against min/max in each step; if the check fails, backtrack one level and advance to the next copy
      - Search I/Os: $O\left(\frac{1}{\varepsilon} \log_B N + \frac{\delta}{B^{1-\varepsilon}}\right)$

  20. Dynamic Fault-Tolerant External Dictionaries
      - Static structure + packed arrays + buckets of size $O(\delta \log^3 N)$
      - Deterministic: $O\left(\frac{1}{\varepsilon} \log_B N + \frac{\delta}{B^{1-\varepsilon}}\right)$ I/Os for searches and updates
      - Randomized: expected $O(\log_B N + \delta/B)$ I/Os for searches and updates

  21. Conclusion
      - Fault-tolerant external memory searching: $\Theta\left(\frac{1}{\varepsilon} \log_B N + \frac{\delta}{B^{1-\varepsilon}}\right)$ I/Os worst-case [minimized wrt $\varepsilon$]
      - Randomized: $O(\log_B N + \delta/B)$ I/Os

  22. Future Work: Fault Tolerance versus I/O Efficiency
      - Randomized algorithms: Memory faults in internal memory?
      - Sorting: $\frac{N}{B} \log_{M/B} \frac{N}{B} + \frac{\delta^2}{B}$ I/Os?
      - ...

  23. THANKS
