Batched Dynamic Geometric Problems
Jeff Vitter
Duke University
Center for Geometric and Biological Computing and Department of Computer Science
Center for Geometric & Biological Computing
http://www.cs.duke.edu/CGBC/
July 2002

Outline
✫ Fundamental techniques for batched problems.
  - Merge sort, distribution sort.
⇒ Techniques for solving batched geometric problems. ⇐
  - Distribution sweeping, batched filtering, randomized incremental construction, parallel simulation.
  - Red-blue orthogonal rectangle intersection, convex hull, range search, nearest neighbors.
  - Empirical results (via TPIE programming environment).
✫ Fundamental lower bounds.
  - Sorting, permuting, FFT, matrix transposition, bundle sort.
  - Dynamic memory allocation.
  - Hierarchical memory.
✫ Parallel disks.
  - Load balancing among disks is the key issue.
  - Duality: reading (prefetching) ↔ writing, merging ↔ distribution.

Review of Parallel Disk Model
[Aggarwal & Vitter 88], [Vitter & Shriver 90, 94], . . .
$N$ = problem data size
$M$ = size of internal memory
$B$ = size of disk block
$D$ = number of independent disks
$P$ = number of CPUs
$Q$ = number of queries
$Z$ = problem output size
Notational convenience (in units of blocks): $n = N/B$, $m = M/B$, $q = Q/B$, $z = Z/B$.
[Figure: the parallel disk model, with labels Disk, Block I/O, Mem, CPU.]

Fundamental I/O Bounds (with $D = 1$ disk)
✫ Batched problems [AV88], [VS90], [VS94]:
  - Scanning (touch problem): $\Theta(N/B) = \Theta(n)$
  - Sorting: $\Theta\left(\frac{N}{B} \cdot \frac{\log (N/B)}{\log (M/B)}\right) = \Theta\left(\frac{N}{B} \log_{M/B} \frac{N}{B}\right) = \Theta(n \log_m n)$
  - Permuting: $\Theta(\min\{N,\ n \log_m n\})$
✫ For other problems [CGGTVV95], [AKL95], . . .
  - Graph problems
  - Permutation
  - Computational Geometry
  - Sorting
✫ Online problems:
  - Searching and Querying: $\Theta(\log_B N + Z/B) = \Theta(\log_B N + z)$

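As a rough illustration of how the batched bounds compare, the following sketch evaluates their leading terms with all constant factors dropped. The function name `io_bounds` and the parameter values are assumptions chosen for this example, not figures from the talk.

```python
import math

def io_bounds(N, M, B):
    """Leading terms of the one-disk batched bounds (constants dropped)."""
    n = N / B                      # problem size in blocks
    m = M / B                      # internal memory size in blocks
    scan = n                       # Theta(n)
    sort = n * math.log(n, m)      # Theta(n log_m n)
    permute = min(N, sort)         # Theta(min{N, n log_m n})
    return {"scan": scan, "sort": sort, "permute": permute}

# Hypothetical machine: 2^30 items, 2^20 items of memory, 2^10 items per block.
bounds = io_bounds(N=2**30, M=2**20, B=2**10)
for name, value in bounds.items():
    print(f"{name:8s} ~ {value:,.0f} block transfers")
```

With these illustrative parameters, $n = 2^{20}$ and $m = 2^{10}$, so $\log_m n = 2$ and sorting costs only about twice as many block transfers as scanning, far fewer than the $N$ transfers of moving one item at a time.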
Batched Problems in Geometry
[GTVV93], [AVV95], [APRSV98a], [APRSV98b], [CFMMR98]
✫ Orthogonal rectangle intersection.
✫ Red-blue line segment intersection.
✫ General line segment intersection.
✫ All nearest neighbors.
✫ 2-D and 3-D convex hulls.
✫ Batched range queries.
✫ Trapezoid decomposition.
✫ Batched planar point location.
✫ Triangulation.
Use of virtual memory $\Rightarrow$ $\Omega(N \log_B N + Z)$ I/Os. Bad!!!
We can improve this to $O(n \log_m n + z)$ I/Os using
✫ Distribution sweep.
✫ Persistent B-trees and batched filtering.
✫ Random incremental construction.
✫ Parallel simulation.

Orthogonal Line Segment Intersection
[Figure: example set of horizontal and vertical segments $s_1, \dots, s_9$.]
Problem: Find all intersections of vertical segments with horizontal segments.

Internal Memory Approach
✫ Presort the endpoints in $y$-order.
✫ Sweep the plane from top to bottom with a horizontal line.
✫ When reaching a vertical segment, store its $x$ value in a balanced tree. When leaving a vertical segment, delete its $x$ value from the tree.
✫ At any given time, the balanced tree stores the vertical segments hit by the sweep line.
✫ When reaching a horizontal segment, do a 1-d range query in the tree to find intersections with vertical segments. Time is $O(\lg N + Z')$, where $Z'$ is the number of intersections reported.
✫ Total running time is $O(N \lg N + Z)$.

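The sweep on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions: the tuple layouts and the name `sweep_intersections` are made up for the example, segments are treated as closed, and a sorted Python list maintained with `bisect` stands in for the balanced tree (so an insertion costs $O(N)$ here rather than $O(\lg N)$).

```python
import bisect

def sweep_intersections(verticals, horizontals):
    """verticals: list of (x, y_lo, y_hi); horizontals: list of (y, x_lo, x_hi).
    Returns (vertical_index, horizontal_index) pairs that intersect."""
    INSERT, QUERY, DELETE = 0, 1, 2    # tie order at equal y (closed segments)
    events = []
    for i, (x, y_lo, y_hi) in enumerate(verticals):
        events.append((-y_hi, INSERT, i))   # sweep reaches the top endpoint
        events.append((-y_lo, DELETE, i))   # sweep leaves the bottom endpoint
    for j, (y, x_lo, x_hi) in enumerate(horizontals):
        events.append((-y, QUERY, j))
    events.sort()                           # negated y gives top-to-bottom order

    active = []   # sorted (x, vertical_index) pairs; stand-in for the tree
    out = []
    for _, kind, idx in events:
        if kind == INSERT:
            bisect.insort(active, (verticals[idx][0], idx))
        elif kind == DELETE:
            pos = bisect.bisect_left(active, (verticals[idx][0], idx))
            del active[pos]
        else:
            _, x_lo, x_hi = horizontals[idx]
            # 1-d range query: all active x values in [x_lo, x_hi].
            lo = bisect.bisect_left(active, (x_lo, -1))
            hi = bisect.bisect_right(active, (x_hi, len(verticals)))
            out.extend((v, idx) for _, v in active[lo:hi])
    return out

pairs = sweep_intersections(
    verticals=[(1, 0, 4), (3, 2, 5), (6, 0, 1)],
    horizontals=[(3, 0, 4), (0.5, 2, 7), (4.5, 0, 2)],
)
```

Replacing the sorted list with a genuine balanced search tree would recover the $O(\lg N + Z')$ cost per query stated above.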
External Solution?
[Figure: the example segments $s_1, \dots, s_9$.]
✫ Internal plane-sweep solution runs in $O(N \log N + Z)$ time.
✫ Using a B-tree gives an $O(N \log_B n + z)$ I/O solution.
✫ We want an $O(n \log_m n + z)$ I/O solution that takes advantage of batching!

Distribution Sweeping
[Goodrich, Tsay, Vengroff & Vitter 93]
[Figure: the example segments $s_1, \dots, s_9$ divided into vertical Slabs 1 through 5, with the sweep line and the horizontal segment being processed marked.]

Distribution Sweeping
✫ Presort endpoints by $x$ and $y$ coordinates.
✫ Divide the $x$-range into $\Theta(m)$ slabs, so that each slab contains the same number of $x$ values of vertical segments.
✫ Sweep all slabs simultaneously from top to bottom, keeping the vertical segments of a slab in a stack.
✫ For each slab spanned by a horizontal segment, output all "living" vertical segments in the slab's stack and delete all "dead" vertical segments from the stack.
✫ For the left and right "endpieces" of a horizontal segment, which stick out into a slab but don't completely span it, handle those intersections recursively for each slab.

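The steps above can be sketched as an in-memory simulation. This is an illustration under stated assumptions, not the authors' implementation: the name `distribution_sweep`, the tuple layouts, and the `fanout` parameter (standing in for the $\Theta(m)$ slabs, and assumed to be at least 2) are made up for the example, ids are assumed unique, segments are treated as closed, and plain Python lists play the role of the external stacks.

```python
def distribution_sweep(verticals, horizontals, fanout=4):
    """verticals: (x, y_lo, y_hi, vid); horizontals: (y, x_lo, x_hi, hid).
    Returns the set of (vid, hid) pairs that intersect."""
    if not verticals or not horizontals:
        return set()
    if len(verticals) <= fanout:
        # Base case: the subproblem "fits in memory"; check pairs directly.
        return {(vid, hid)
                for (x, y_lo, y_hi, vid) in verticals
                for (y, x_lo, x_hi, hid) in horizontals
                if x_lo <= x <= x_hi and y_lo <= y <= y_hi}

    # Split the verticals, sorted by x, into slabs of (nearly) equal size.
    by_x = sorted(verticals)
    size = -(-len(by_x) // fanout)                    # ceiling division
    slabs = [by_x[i:i + size] for i in range(0, len(by_x), size)]
    bounds = [(s[0][0], s[-1][0]) for s in slabs]     # (min x, max x) per slab
    slab_of = {v[3]: k for k, s in enumerate(slabs) for v in s}

    # Sweep top to bottom; at equal y, vertical tops precede horizontals.
    events = [(-v[2], 0, v) for v in verticals]
    events += [(-h[0], 1, h) for h in horizontals]
    events.sort(key=lambda e: (e[0], e[1]))

    stacks = [[] for _ in slabs]
    endpieces = [[] for _ in slabs]
    out = set()
    for _, kind, item in events:
        if kind == 0:
            stacks[slab_of[item[3]]].append(item)     # push onto slab's stack
            continue
        y, x_lo, x_hi, hid = item
        for k, (lo, hi) in enumerate(bounds):
            if x_lo <= lo and hi <= x_hi:
                # Slab completely spanned: drop dead segments, report living.
                stacks[k] = [v for v in stacks[k] if v[1] <= y]
                out.update((v[3], hid) for v in stacks[k])
            elif x_lo <= hi and lo <= x_hi:
                endpieces[k].append(item)             # endpiece: recurse later

    for slab, pieces in zip(slabs, endpieces):
        out |= distribution_sweep(slab, pieces, fanout)
    return out

verticals = [(1, 0, 4, 0), (3, 2, 5, 1), (6, 0, 1, 2), (2, 1, 3, 3), (5, 0, 6, 4)]
horizontals = [(3, 0, 4, 0), (0.5, 2, 7, 1), (4.5, 0, 2, 2)]
pairs = distribution_sweep(verticals, horizontals, fanout=2)
```

Note that dead segments are removed lazily, only when a spanning horizontal segment scans the stack, so each removal can be charged to the push that inserted the segment. A horizontal segment is passed unclipped to a slab's recursive call; since every vertical segment in that call lies inside the slab, clipping would not change the answer.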
Implementing a Stack
Various stack operations:
1. Push element onto top.
2. Read top entry.
3. Pop entry from top.
Variants: We can read the top $k$ entries from the stack by iterating operation 3 $k$ times and then operation 1 $k$ times.
[Figure: a stack with entries $i_1, i_2, \dots, i_k, i_{k+1}, i_{k+2}, \dots$]
Keep the current block and one other in internal memory (using LRU).
It takes $\Omega(B)$ pushes or pops to require one I/O.
$\Rightarrow$ # I/Os per operation $= O(1/B)$ amortized.

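A small simulation makes the amortized argument concrete. This is a sketch under stated assumptions: the class name `ExternalStack` is hypothetical, the "disk" is an ordinary Python list, and a single in-memory buffer of up to $2B$ items stands in for the two LRU-managed blocks on the slide.

```python
class ExternalStack:
    """Stack that buffers its top two blocks in memory and counts block I/Os."""

    def __init__(self, block_size):
        self.B = block_size
        self.disk = []      # simulated disk: a list of full blocks
        self.buffer = []    # at most 2*B of the most recently pushed items
        self.ios = 0        # number of block reads and writes so far

    def push(self, item):
        if len(self.buffer) == 2 * self.B:
            # Buffer full: write the older block out (one I/O).
            self.disk.append(self.buffer[:self.B])
            self.buffer = self.buffer[self.B:]
            self.ios += 1
        self.buffer.append(item)

    def pop(self):
        if not self.buffer:
            if not self.disk:
                raise IndexError("pop from empty stack")
            # Buffer empty: read the most recent block back in (one I/O).
            self.buffer = self.disk.pop()
            self.ios += 1
        return self.buffer.pop()

    def top(self):
        item = self.pop()   # read = pop followed by push
        self.push(item)
        return item

stack = ExternalStack(block_size=4)
for i in range(100):
    stack.push(i)
popped = [stack.pop() for _ in range(100)]
```

After any I/O the buffer holds about $B$ items, so roughly $B$ further pushes or pops are needed before the next I/O. Keeping two blocks rather than one is what prevents a sequence that alternates pushes and pops at a block boundary from causing an I/O on every operation.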
Analysis of External Distribution Sweeping
✫ Each of the $\Theta(m)$ stacks can use $O(1)$ blocks in internal memory.
✫ Therefore, each push, pop, or read uses $O(1/B)$ I/Os amortized.
✫ In each pass, the $O(N)$ vertical segments are inserted into the stacks in $O(n)$ I/Os.
✫ For each of the $O(N)$ horizontal segments, we report intersections in the slabs it completely spans. If the total number of intersections reported in this pass is $Z'$, the number of I/Os is $O(n)$ plus the cost of $Z'$ stack push, pop, or read operations, which is $O(n + Z'/B)$.

Analysis of External Distribution Sweeping
✫ We recurse on each of the $\Theta(m)$ slabs to handle the left endpieces and right endpieces of the horizontal segments.
✫ Note that the total number of endpieces at every level of recursion is at most $2 \times$ # horizontal segments. It doesn't double at each level.
✫ The number of levels of recursion is $O(\log_m n)$.
✫ Final result: $O(n \log_m n + z)$ I/Os.

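One way to combine the per-pass cost with the recursion depth is to sum over levels. The symbol $Z_i$, the number of intersections reported at recursion level $i$, is notation introduced here for illustration; each intersection is reported at exactly one level, so the $Z_i$ sum to $Z$.

```latex
\sum_{i=1}^{O(\log_m n)} O\!\left(n + \frac{Z_i}{B}\right)
  = O\!\left(n \log_m n + \frac{1}{B}\sum_i Z_i\right)
  = O\!\left(n \log_m n + \frac{Z}{B}\right)
  = O(n \log_m n + z)
```

The $O(n)$ term per level relies on the observation above that the number of endpieces stays at most twice the number of horizontal segments at every level, so the total input across all slabs of one level remains $O(N)$ items.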
Class Quiz
What about batched range searching?
We want to be able to do $Q$ range queries on $N$ points in $O((n + q) \log_m n + z)$ I/Os.
Ideas???

Distribution Sweeping to the Rescue
[Figure: the input divided into vertical Slabs 1 through 5, with the sweep line marked.]