Lower Bounds for External Memory Dictionaries
Gerth Stølting Brodal, Rolf Fagerberg
BRICS, University of Aarhus
Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, Baltimore, MD, USA, January 13, 2003
Dictionary
• Queries
  – membership
  – predecessor / successor
  – range queries
  . . .
• Updates
  – insertions
  – deletions
This talk: comparison based, membership, insertions
Brodal, Fagerberg: Lower Bounds for External Memory Dictionaries 2
Dictionaries – Comparison Based

                         Insert        Search
Balanced search trees    O(log N)      O(log N)
Adversary                ∞         ⇒   Ω(log N)
Borodin et al. 1982      O(t)      ⇒   N / 2^O(t)

[Plot: search cost against insert cost, both axes marked at log N.]
External Memory Model
Aggarwal and Vitter 1988
[Figure: CPU and internal memory of size M, connected by I/O to external memory.]
N = problem size
M = memory size
B = I/O block size
• One I/O moves B consecutive records from/to disk
• Cost: number of I/Os
• Elements can be copied and compared in internal memory
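A minimal Python sketch of how cost is counted in this model, assuming records are laid out consecutively on disk. The function names are illustrative and do not come from the talk.

```python
def scan_ios(n_records: int, block_size: int) -> int:
    """I/Os to read n_records stored consecutively: one I/O per block of B records."""
    return -(-n_records // block_size)  # integer ceiling of n_records / block_size


def scattered_ios(n_records: int) -> int:
    """Worst case when every requested record lies in a different block."""
    return n_records


if __name__ == "__main__":
    N, B = 1_000_000, 1_000
    print(scan_ios(N, B))    # sequential scan of all N records
    print(scattered_ios(N))  # one I/O per record
```

The gap between the two counts is the factor B that external memory data structures try to exploit.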
B-trees – An External Memory Dictionary
Bayer and McCreight 1972
[Figure: a B-tree with node degree O(B). The top O(log_B M) levels reside in internal memory and the remaining O(log_B (N/M)) levels in external memory; a search/update path runs from the root to a leaf.]
Insert, Membership: O(log_B (N/M)) I/Os
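The O(log_B (N/M)) bound can be illustrated by counting the tree levels that do not fit in internal memory. This is a rough sketch that assumes every node has degree exactly B and that the top of the tree is cached; `btree_external_levels` is a hypothetical helper, not part of the talk.

```python
def btree_external_levels(n: int, m: int, b: int) -> int:
    """Number of levels of a degree-b tree lying outside internal memory.

    Counts how often the capacity m must be multiplied by b to reach n,
    which equals ceil(log_b(n / m)) for n > m. Integer arithmetic avoids
    floating-point rounding at exact powers of b.
    """
    if b < 2 or m < 1:
        raise ValueError("need degree b >= 2 and memory size m >= 1")
    levels, capacity = 0, m
    while capacity < n:
        capacity *= b
        levels += 1
    return levels


if __name__ == "__main__":
    # N = 2^30 records, M = 2^20 records of memory, B = 2^5 records per block
    print(btree_external_levels(2**30, 2**20, 2**5))
```

Each level outside internal memory costs one I/O on a search or update path, so the returned value approximates the I/Os per operation.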
Dictionaries – External Memory

             Insert             Search
B-trees      O(log_B (N/M))     O(log_B (N/M))
Adversary    ∞              ⇒   Ω(log_B (N/M))

[Plot: search cost against insert cost, both axes marked at log_B (N/M); the trade-off curve is marked "?".]
Comparisons vs. I/Os
[Plots: search cost against insert cost in the two models. Comparison model: both axes marked at log N, with the sorting threshold at insert cost log N. I/O model: both axes marked at log_B (N/M), with the sorting threshold at insert cost (1/B) log_{M/B} (N/M); the unknown trade-off is marked "?".]
Sorting
  Comparisons: Θ(N log N)
  I/Os: Θ((N/B) log_{M/B} (N/M))   Aggarwal and Vitter 1988
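To get a feel for how far the sorting threshold lies below the B-tree insertion cost, the following sketch evaluates both expressions with all constant factors dropped. The function names and the sample parameters are assumptions chosen for illustration.

```python
import math


def log_base(x: float, base: float) -> float:
    """Logarithm of x to the given base, computed via log2."""
    return math.log2(x) / math.log2(base)


def sorting_ios(n: int, m: int, b: int) -> float:
    """(N/B) * log_{M/B}(N/M): I/Os to sort N records, constants dropped."""
    return (n / b) * log_base(n / m, m / b)


def sorting_threshold_per_insert(n: int, m: int, b: int) -> float:
    """Sorting bound divided by N: (1/B) * log_{M/B}(N/M) I/Os per insertion."""
    return sorting_ios(n, m, b) / n


def btree_ios_per_insert(n: int, m: int, b: int) -> float:
    """log_B(N/M): I/Os per B-tree insertion, constants dropped."""
    return log_base(n / m, b)


if __name__ == "__main__":
    N, M, B = 2**40, 2**20, 2**10
    print(sorting_threshold_per_insert(N, M, B))
    print(btree_ios_per_insert(N, M, B))
```

For these sample parameters the two costs differ by roughly a factor B, which is the range the trade-off question is about.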
Results
δ = number of I/Os for B insertions
[Plot: search cost against insert cost δ. Marks on the insert axis include the sorting threshold Θ(log_{M/B} (N/M)), log^{1+ε} N, B / log^3 N, and B log_B (N/M).]
Search cost as a function of δ:
  Below the sorting threshold: N / (M · (M/B)^Θ(δ))
  Above the sorting threshold: Θ(log_δ (N/M))
Data structures on the curve:
  Buffered B-trees: search (1/ε) log_B (N/M), δ = (B^ε/ε) log_B (N/M)
  B-trees: search log_B (N/M), δ = B log_B (N/M)
Buffered B-trees – how to speed up B-tree updates by a factor B^{1−ε}
[Figure: a B-tree with node degree O(B^ε) and a labelled buffer. The top O((1/ε) log_B M) levels reside in internal memory and the remaining O((1/ε) log_B (N/M)) levels in external memory; a search path runs from the root to a leaf.]
• B-tree with degree Θ(B^ε)
• Buffers of O(B) delayed insertions
• On buffer overflow move O(B^{1−ε}) elements to a child with one I/O
Searches: O((1/ε) log_B (N/M))
B insertions: O((B^ε/ε) log_B (N/M))
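The search and insertion bounds for buffered B-trees can be tabulated for a few values of ε. The sketch below simply evaluates the two expressions with constants dropped; `buffered_btree_costs` is a hypothetical name, and setting ε = 1 is used here as a stand-in for the plain B-tree.

```python
import math


def buffered_btree_costs(n: int, m: int, b: int, eps: float) -> tuple[float, float]:
    """Return (I/Os per search, I/Os per B insertions), constants dropped.

    search = (1/eps) * log_B(N/M)
    delta  = (B^eps / eps) * log_B(N/M)
    """
    if not 0 < eps <= 1:
        raise ValueError("eps must lie in (0, 1]")
    height = math.log2(n / m) / math.log2(b)  # log_B(N/M)
    search = height / eps
    delta = (b ** eps / eps) * height
    return search, delta


if __name__ == "__main__":
    N, M, B = 2**40, 2**20, 2**10
    for eps in (1.0, 0.5, 0.25):
        search, delta = buffered_btree_costs(N, M, B, eps)
        print(f"eps={eps}: search={search:.1f} I/Os, B insertions={delta:.1f} I/Os")
```

Lowering ε makes searches slower by a factor 1/ε while the cost of B insertions drops by roughly B^{1−ε}, up to the same 1/ε factor.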
Lower Bound – optimality of buffered B-trees
[Figure: the sets S_1, S_2, S_3, ..., S_K along the insert-time axis and the positions 1, 2, 3, ..., i, ..., N/2 along the ordering axis, with x_ij ≈ i for each S_j.]
• Adversary constructs S_1, ..., S_K online
• Constructs i such that x_ij has not been in internal memory since S_j was inserted, for all j = 1, ..., K
• x_i1, ..., x_iK form an antichain, i.e. search requires ≥ K I/Os
Lower Bound (Cont.)
Construct S_{j+1}
[Figure: S_j followed by S_{j+1} along the insert-time axis.]
• Let Î be the indexes i where
  – x_ij ∈ S_j but is not in internal memory after inserting S_j
  – x_i1, ..., x_i(j−1) have not been read into internal memory by the δ|S_j|/B I/Os during the insertion of S_j
• Construct I ⊂ Î such that all blocks in external memory contain O(B/δ) elements x_ij where i ∈ I
  – Existence by randomized sampling with probability O(1/δ) and Chernoff bounds, provided B/δ = Ω(log N)
• Let x_i(j+1) ∈ S_{j+1} iff i ∈ I
K = Θ(log_δ (N/M))
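The sampling step can be illustrated numerically: if each element of a block is kept independently with probability 1/δ, the number kept per block concentrates around B/δ. The sketch below is only an empirical illustration of that concentration, not the proof from the talk. It assumes every element of every block is a candidate, and `sample_per_block` is a hypothetical helper.

```python
import random


def sample_per_block(num_blocks: int, block_size: int, delta: int, seed: int = 0) -> list[int]:
    """Keep each element independently with probability 1/delta.

    Returns, for every block, how many of its block_size elements were kept.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    p = 1.0 / delta
    return [
        sum(rng.random() < p for _ in range(block_size))
        for _ in range(num_blocks)
    ]


if __name__ == "__main__":
    B, delta = 1024, 4
    counts = sample_per_block(num_blocks=200, block_size=B, delta=delta)
    print("expected per block:", B // delta)
    print("largest block count:", max(counts))
```

When B/δ is large compared with log N, the Chernoff bound makes a large deviation in any single block so unlikely that a union bound over all blocks still leaves a valid sample.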
Lower Bound – Below Sorting
Insert δ/B ⇒ Search N / (M · (M/B)^Θ(δ))
• W.l.o.g. memory and each block totally ordered after each I/O
  N log M + (δN/B) · log (M choose B) comparisons
  – N log M: insert in internal memory
  – (δN/B) · log (M choose B): merging a block with internal memory
• Antichain of size (Borodin et al. 1982 / Dilworth's lemma)
  N / 2^(log M + δ log (M/B)) = N / (M · (M/B)^δ)
  of which all elements except one are in distinct blocks
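The step from the comparison count to the antichain size can be written out as follows. This is a sketch that reads the merge term as the logarithm of a binomial coefficient and uses the standard estimate (M choose B) ≤ (eM/B)^B; it assumes M ≥ 2B so that the factor e is absorbed into the Θ(δ) exponent.

```latex
\begin{align*}
\text{comparisons}
  &\le N \log M + \frac{\delta N}{B} \log \binom{M}{B} \\
  &\le N \log M + \frac{\delta N}{B} \cdot B \log \frac{eM}{B}
   = N \Bigl( \log M + \delta \log \frac{eM}{B} \Bigr), \\[4pt]
\frac{N}{2^{\log M + \delta \log (M/B)}}
  &= \frac{N}{2^{\log M} \cdot \bigl( 2^{\log (M/B)} \bigr)^{\delta}}
   = \frac{N}{M \cdot (M/B)^{\delta}} .
\end{align*}
```

With t = log M + Θ(δ log (M/B)) comparisons per element on average, the comparison-model trade-off N / 2^O(t) gives the antichain size stated on the slide.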