Geometric Algorithms: Range & windowing queries (2 lectures)

Database queries. Example record: G. Ometer, born Aug 16, 1954, salary $3,500. A database query may ask for all employees with age between a 1


  1. Result. Theorem: A set of $n$ points in the plane can be preprocessed in $O(n \log n)$ time into a data structure of $O(n \log n)$ size so that any 2D range query can be answered in $O(\log^2 n + k)$ time, where $k$ is the number of answers reported. In contrast, a kd-tree has $O(n)$ size and answers queries in $O(\sqrt{n} + k)$ time (Chapter 5.2).

  2. Higher dimensional range trees. A $d$-dimensional range tree has a main tree that is a one-dimensional balanced binary search tree on the first coordinate; every node has a pointer to an associated structure that is a $(d-1)$-dimensional range tree on the other coordinates.

  3. Storage. The size $S_d(n)$ of a $d$-dimensional range tree satisfies:
     - $S_1(n) = O(n)$ for all $n$
     - $S_d(1) = O(1)$ for all $d$
     - $S_d(n) \le 2 \cdot S_d(n/2) + S_{d-1}(n)$ for $d \ge 2$

     This solves to $S_d(n) = O(n \log^{d-1} n)$.
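
One way to see why the storage recurrence gives $O(n \log^{d-1} n)$ is to unroll it by induction on $d$. The sketch below assumes that $n$ is a power of two, that $d$ is a constant, and that the bound $S_{d-1}(n) \le c\, n \log^{d-2} n$ already holds for some constant $c$ (the induction hypothesis; for $d = 2$ this is $S_1(n) = O(n)$).

```latex
\begin{align*}
S_d(n) &\le 2\,S_d(n/2) + c\,n\log^{d-2} n \\
       &\le 4\,S_d(n/4) + 2\,c\,\frac{n}{2}\log^{d-2}\frac{n}{2} + c\,n\log^{d-2} n \\
       &\le \dots \\
       &\le n\,S_d(1) + \sum_{i=0}^{\log n - 1} 2^i \cdot c\,\frac{n}{2^i}\,\log^{d-2}\frac{n}{2^i} \\
       &\le O(n) + \log n \cdot c\,n\log^{d-2} n \;=\; O(n\log^{d-1} n).
\end{align*}
```

Each of the $\log n$ levels of the main tree contributes at most $c\, n \log^{d-2} n$, which is where the extra logarithmic factor per dimension comes from.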

  4. Query time. The number of grey nodes $G_d(n)$ satisfies:
     - $G_1(n) = O(\log n)$ for all $n$
     - $G_d(1) = O(1)$ for all $d$
     - $G_d(n) \le 2 \cdot \log n + 2 \cdot \log n \cdot G_{d-1}(n)$ for $d \ge 2$

     This solves to $G_d(n) = O(\log^d n)$.

  5. Result. Theorem: A set of $n$ points in $d$-dimensional space can be preprocessed in $O(n \log^{d-1} n)$ time into a data structure of $O(n \log^{d-1} n)$ size so that any $d$-dimensional range query can be answered in $O(\log^d n + k)$ time, where $k$ is the number of answers reported.

  6. Improving the query time. We can improve the query time of a 2D range tree from $O(\log^2 n)$ to $O(\log n)$ by a technique called fractional cascading. This automatically lowers the query time in $d$ dimensions to $O(\log^{d-1} n)$.

  7. Improving the query time. The idea is best illustrated by a different query problem: Suppose that we have a collection of sets $S_1, \ldots, S_m$, where $|S_1| = n$ and $S_{i+1} \subseteq S_i$. We want a data structure that can report, for a query number $x$, the smallest value $\ge x$ in each of the sets $S_1, \ldots, S_m$.

  8. Improving the query time. Example sets:
     - $S_1$: 1 2 3 5 8 13 21 34 55
     - $S_2$: 1 3 5 8 13 21 34 55
     - $S_3$: 1 3 13 21 34 55
     - $S_4$: 3 34 55

  11. Improving the query time. Suppose that we have a collection of sets $S_1, \ldots, S_m$, where $|S_1| = n$ and $S_{i+1} \subseteq S_i$. We want a data structure that can report, for a query number $x$, the smallest value $\ge x$ in each of the sets $S_1, \ldots, S_m$. This query problem can be solved in $O(\log n + m)$ time instead of $O(m \cdot \log n)$ time.
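
The $O(\log n + m)$ bound can be illustrated with a minimal Python sketch. It assumes each set is given as a sorted list and that every element stores the index of its successor in the next set; the names `build_down_pointers` and `successors` are hypothetical and not from the slides. Because $S_{i+1} \subseteq S_i$, no element of $S_{i+1}$ lies strictly between $x$ and its successor in $S_i$, so one pointer hop per set is enough.

```python
from bisect import bisect_left

def build_down_pointers(sets):
    """sets[i] are sorted lists with sets[i+1] a subset of sets[i].

    down[i][j] is the index in sets[i+1] of the smallest value >= sets[i][j];
    one extra entry at the end handles "past the end" positions.
    """
    down = []
    for upper, lower in zip(sets, sets[1:]):
        ptrs, k = [], 0
        for value in upper:
            while k < len(lower) and lower[k] < value:
                k += 1
            ptrs.append(k)
        ptrs.append(len(lower))  # sentinel: no value >= x exists
        down.append(ptrs)
    return down

def successors(sets, down, x):
    """Return the smallest value >= x in every set (None where none exists)."""
    pos = bisect_left(sets[0], x)  # the only binary search: O(log n)
    answers = []
    for i, current in enumerate(sets):
        answers.append(current[pos] if pos < len(current) else None)
        if i < len(down):
            pos = down[i][pos]  # O(1) hop to the next set
    return answers

# The example sets S_1, ..., S_4 from the slides.
SETS = [
    [1, 2, 3, 5, 8, 13, 21, 34, 55],
    [1, 3, 5, 8, 13, 21, 34, 55],
    [1, 3, 13, 21, 34, 55],
    [3, 34, 55],
]
DOWN = build_down_pointers(SETS)
print(successors(SETS, DOWN, 6))  # [8, 8, 13, 34]
```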

  12. Improving the query time. Can we do something similar for $m$ one-dimensional range queries on $m$ sets $S_1, \ldots, S_m$? We hope to get a query time of $O(\log n + m + k)$, with $k$ the total number of points reported.

  15. Improving the query time. Range query with $[6, 35]$ on the example sets:
     - $S_1$: 1 2 3 5 8 13 21 34 55
     - $S_2$: 1 3 5 8 13 21 34 55
     - $S_3$: 1 3 13 21 34 55
     - $S_4$: 3 34 55
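
A range query on the nested sets can be sketched in the same spirit: locate the left end of the range once in $S_1$, then in every set walk right from the pointed-to position until the value exceeds the right end. The code below is an illustrative assumption, not taken from the slides; `report_ranges` is a hypothetical name, and the pointers are built with `bisect` for brevity, which affects only the preprocessing time.

```python
from bisect import bisect_left

def report_ranges(sets, down, lo, hi):
    """Report, for every set, all values v with lo <= v <= hi."""
    pos = bisect_left(sets[0], lo)  # single O(log n) search in S_1
    reported = []
    for i, current in enumerate(sets):
        hits, j = [], pos
        while j < len(current) and current[j] <= hi:  # O(1 + answers) walk
            hits.append(current[j])
            j += 1
        reported.append(hits)
        if i < len(down):
            pos = down[i][pos]  # follow the pointer into the next set
    return reported

SETS = [
    [1, 2, 3, 5, 8, 13, 21, 34, 55],
    [1, 3, 5, 8, 13, 21, 34, 55],
    [1, 3, 13, 21, 34, 55],
    [3, 34, 55],
]
# DOWN[i][j]: index in SETS[i+1] of the first value >= SETS[i][j],
# plus one sentinel entry for the "past the end" position.
DOWN = [
    [bisect_left(lower, v) for v in upper] + [len(lower)]
    for upper, lower in zip(SETS, SETS[1:])
]
print(report_ranges(SETS, DOWN, 6, 35))
# [[8, 13, 21, 34], [8, 13, 21, 34], [13, 21, 34], [34]]
```

The walk in each set costs $O(1)$ plus the number of values reported there, which gives the hoped-for $O(\log n + m + k)$ in total.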

  16. Fractional cascading. Now we do “the same” on the associated structures of a 2-dimensional range tree. Note that in every associated structure, we search with the same values $y$ and $y'$.
     - Replace every associated structure except that of the root by a linked list
     - For every list element (and every leaf of the associated structure of the root), store two pointers to the appropriate list elements in the lists of the left child and of the right child

  19. Fractional cascading. [Figure: main tree of a 2D range tree, built on the x-coordinates with root 17, for the 14 points (2, 19), (5, 80), (7, 10), (8, 37), (12, 3), (15, 99), (17, 62), (21, 49), (33, 30), (41, 95), (52, 23), (58, 59), (67, 89), (93, 70)]

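  20. Fractional cascading. [Figure: associated structures of the range tree, each a list of y-coordinates in sorted order, one row per level of the main tree]
     - root: 3 10 19 23 30 37 49 59 62 70 80 89 95 99
     - depth 1: (3 10 19 37 62 80 99) (23 30 49 59 70 89 95)
     - depth 2: (10 19 37 80) (3 62 99) (23 30 49 95) (59 70 89)
     - depth 3: (19 80) (10 37) (3 99) (62) (30 49) (23 95) (59) (70 89)
     - depth 4: (19) (80) (10) (37) (3) (99) (49) (30) (95) (23) (89) (70)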

  21. Fractional cascading. [Figure: the same range tree on the 14 points, queried with the range $[4, 58] \times [19, 65]$]

  24. Fractional cascading. Instead of doing a 1D range query on the associated structure of some node $\nu$, we find the leaf where the search for $y$ would end in $O(1)$ time, via the direct pointer stored in the associated structure of the parent of $\nu$. The number of grey nodes reduces to $O(\log n)$.
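
The pointer idea can be sketched as a small layered range tree in Python. This is an illustration under several assumptions rather than the construction from the lecture: the point set is non-empty, associated structures are y-sorted arrays instead of linked lists, and each node is built with `sorted` and `bisect`, so preprocessing here takes $O(n \log^2 n)$ rather than $O(n \log n)$. The names `Node`, `to_left`, `to_right`, and `range_query` are hypothetical.

```python
from bisect import bisect_left

class Node:
    """Range-tree node whose associated structure is a y-sorted array."""

    def __init__(self, pts):  # pts: non-empty list of (x, y), sorted by x
        self.xmin, self.xmax = pts[0][0], pts[-1][0]
        self.by_y = sorted(pts, key=lambda p: p[1])
        self.ys = [p[1] for p in self.by_y]
        self.left = self.right = None
        if len(pts) > 1:
            mid = len(pts) // 2
            self.left, self.right = Node(pts[:mid]), Node(pts[mid:])
            self.to_left = self._pointers(self.left)
            self.to_right = self._pointers(self.right)

    def _pointers(self, child):
        # For each position i here (plus one past the end): the index of the
        # first entry in the child's array with y >= self.ys[i].
        return [bisect_left(child.ys, y) for y in self.ys] + [len(child.ys)]

def range_query(root, x1, x2, y1, y2):
    """Report all points with x1 <= x <= x2 and y1 <= y <= y2."""
    out = []

    def visit(node, i):  # i: first position in node.ys with y >= y1
        if node.xmax < x1 or node.xmin > x2:
            return  # x-range disjoint from the query
        if x1 <= node.xmin and node.xmax <= x2:
            while i < len(node.by_y) and node.ys[i] <= y2:
                out.append(node.by_y[i])
                i += 1
        else:  # partial overlap; only internal nodes reach this branch
            visit(node.left, node.to_left[i])
            visit(node.right, node.to_right[i])

    visit(root, bisect_left(root.ys, y1))  # the only binary search
    return out

# The 14 points and the query from the slides.
POINTS = [(2, 19), (7, 10), (12, 3), (17, 62), (21, 49), (41, 95), (58, 59),
          (93, 70), (5, 80), (8, 37), (15, 99), (33, 30), (52, 23), (67, 89)]
ROOT = Node(sorted(POINTS))
print(sorted(range_query(ROOT, 4, 58, 19, 65)))
# [(8, 37), (17, 62), (21, 49), (33, 30), (52, 23), (58, 59)]
```

Only the root is searched with a binary search; every other node receives its starting position from its parent in constant time.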

  25. Result. Theorem: A set of $n$ points in $d$-dimensional space with $d \ge 2$ can be preprocessed in $O(n \log^{d-1} n)$ time into a data structure of $O(n \log^{d-1} n)$ size so that any $d$-dimensional range query can be answered in $O(\log^{d-1} n + k)$ time, where $k$ is the number of answers reported. Multiple points with the same $x$- or $y$-coordinate need to be handled with care.

  26. Windowing. Zoom in; re-center and zoom in; select by outlining.

  28. Windowing. Given a set of $n$ axis-parallel line segments, preprocess them into a data structure so that the ones that intersect a query rectangle can be reported efficiently.

  29. Windowing. How can a rectangle and an axis-parallel line segment intersect?

  30. Windowing. Essentially two types:
     - Segments with one or both endpoints in the rectangle
     - Segments with both endpoints outside the rectangle

     Intersecting segments of the latter type always cross the boundary of the rectangle (in fact, the left and/or bottom side).

  31. Windowing. Instead of storing axis-parallel segments and searching with a rectangle, we will:
     - store the segment endpoints and query with the rectangle
     - store the segments and query with the left side and the bottom side of the rectangle

     Note that the query problem is at least as hard as rectangular range searching in point sets.

  32. Windowing. Question: With these two kinds of queries, how often might we report the same segment?

  33. Windowing. The two kinds of queries need different structures:
     - store the segment endpoints and query with the rectangle (use a range tree)
     - store the segments and query with the left side and the bottom side of the rectangle (need to develop a data structure)
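
The decomposition into an endpoint query plus two side queries can be checked with a brute-force Python sketch. Everything here is an assumption for illustration: segments are tuples `(x1, y1, x2, y2)` with `x1 <= x2`, `y1 <= y2`, and either `x1 == x2` or `y1 == y2`; rectangles are tuples `(xlo, xhi, ylo, yhi)`; and the linear scans stand in for the range tree and the still-to-be-developed segment structure. A set collects the answers, so a segment found by more than one subquery is reported once.

```python
def in_rect(x, y, rect):
    xlo, xhi, ylo, yhi = rect
    return xlo <= x <= xhi and ylo <= y <= yhi

def overlaps(seg, rect):
    """Does a closed axis-parallel segment intersect a closed rectangle?"""
    x1, y1, x2, y2 = seg
    xlo, xhi, ylo, yhi = rect
    return x1 <= xhi and x2 >= xlo and y1 <= yhi and y2 >= ylo

def window_query(segments, rect):
    xlo, xhi, ylo, yhi = rect
    left_side = (xlo, xlo, ylo, yhi)    # the two sides as degenerate rectangles
    bottom_side = (xlo, xhi, ylo, ylo)
    found = set()  # removes duplicates across the subqueries
    for seg in segments:
        x1, y1, x2, y2 = seg
        # Subquery 1: an endpoint lies in the rectangle (range tree in the lecture).
        if in_rect(x1, y1, rect) or in_rect(x2, y2, rect):
            found.add(seg)
        # Subquery 2: the segment crosses the left or the bottom side.
        if overlaps(seg, left_side) or overlaps(seg, bottom_side):
            found.add(seg)
    return found

RECT = (2, 4, 2, 5)
SEGMENTS = [(0, 3, 6, 3), (3, 3, 9, 3), (3, 0, 3, 9), (5, 0, 5, 1)]
print(sorted(window_query(SEGMENTS, RECT)))
# [(0, 3, 6, 3), (3, 0, 3, 9), (3, 3, 9, 3)]
```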

  34. Windowing. Current problem of interest: Given a set of horizontal (vertical) line segments, preprocess them into a data structure so that the ones intersecting a vertical (horizontal) query segment can be reported efficiently.

  35. Windowing. Simpler query problem: What if the vertical query segment is a full line? Then the problem is essentially 1-dimensional.

  36. Interval querying. Given a set $I$ of $n$ intervals on the real line, preprocess them into a data structure so that the ones containing a query point (value) can be reported efficiently.

  37. Splitting a set of intervals. The median $x$ of the $2n$ endpoints partitions the intervals into three subsets:
     - Intervals $I_{\text{left}}$ fully left of $x$
     - Intervals $I_{\text{mid}}$ that contain (intersect) $x$
     - Intervals $I_{\text{right}}$ fully right of $x$

  38. Interval tree: recursive definition. The interval tree for $I$ has a root node $\nu$ that contains $x$ and
     - the intervals $I_{\text{left}}$ are stored in the left subtree of $\nu$
     - the intervals $I_{\text{mid}}$ are stored with $\nu$
     - the intervals $I_{\text{right}}$ are stored in the right subtree of $\nu$

     The left and right subtrees are proper interval trees for $I_{\text{left}}$ and $I_{\text{right}}$. How many intervals can be in $I_{\text{mid}}$? How should we store $I_{\text{mid}}$?

  39. Interval tree: left and right lists. How is $I_{\text{mid}}$ stored? Observe: If the query point is left of $x$, then only the left endpoint determines if an interval is an answer. Symmetrically: If the query point is right of $x$, then only the right endpoint determines if an interval is an answer.

  40. Interval tree: left and right lists. Make a list $L_{\text{left}}$ using the left-to-right order of the left endpoints of $I_{\text{mid}}$. Make a list $L_{\text{right}}$ using the right-to-left order of the right endpoints of $I_{\text{mid}}$. Store both lists as associated structures with $\nu$.

  41. Interval tree: example. [Figure: interval tree for twelve intervals $s_1, \ldots, s_{12}$. One node is labeled with $L_{\text{left}} = s_5, s_6, s_7$ and $L_{\text{right}} = s_7, s_5, s_6$; the remaining nodes store $\{s_2, s_3, s_4\}$, $\{s_9, s_{10}\}$, $\{s_1\}$, $\{s_8\}$, and $\{s_{11}, s_{12}\}$]

  42. Interval tree: storage. The main tree has $O(n)$ nodes. The total length of all lists is $2n$, because each interval is stored exactly twice, in $L_{\text{left}}$ and in $L_{\text{right}}$, and at only one node. Consequently, the interval tree uses $O(n)$ storage.

  43. Interval querying. Algorithm QUERYINTERVALTREE($\nu$, $q_x$)
       if $\nu$ is not a leaf
         then if $q_x < x_{\text{mid}}(\nu)$
           then Traverse list $L_{\text{left}}(\nu)$, starting at the interval with the leftmost endpoint, reporting all the intervals that contain $q_x$. Stop as soon as an interval does not contain $q_x$.
                QUERYINTERVALTREE($lc(\nu)$, $q_x$)
           else Traverse list $L_{\text{right}}(\nu)$, starting at the interval with the rightmost endpoint, reporting all the intervals that contain $q_x$. Stop as soon as an interval does not contain $q_x$.
                QUERYINTERVALTREE($rc(\nu)$, $q_x$)

  44. Interval tree: query example. [Figure: a query on the example interval tree for the intervals $s_1, \ldots, s_{12}$, animated over several slides]

  55. Interval tree: query time. The query follows only one path in the tree, and that path has length $O(\log n)$. The query traverses $O(\log n)$ lists. Traversing a list with $k'$ answers takes $O(1 + k')$ time. The total time for list traversal is therefore $O(\log n + k)$, with $k$ the total number of answers reported (no answer is found more than once). The query time is $O(\log n) + O(\log n + k) = O(\log n + k)$.

  56. Interval tree: construction. Algorithm CONSTRUCTINTERVALTREE($I$)
     Input. A set $I$ of intervals on the real line
     Output. The root of an interval tree for $I$
       if $I = \emptyset$
         then return an empty leaf
         else Create a node $\nu$. Compute $x_{\text{mid}}$, the median of the set of interval endpoints, and store $x_{\text{mid}}$ with $\nu$
              Compute $I_{\text{mid}}$ and construct two sorted lists for $I_{\text{mid}}$: a list $L_{\text{left}}(\nu)$ sorted on left endpoint and a list $L_{\text{right}}(\nu)$ sorted on right endpoint. Store these two lists at $\nu$
              $lc(\nu) \leftarrow$ CONSTRUCTINTERVALTREE($I_{\text{left}}$)
              $rc(\nu) \leftarrow$ CONSTRUCTINTERVALTREE($I_{\text{right}}$)
              return $\nu$
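
The construction and the query can be sketched together in Python. This is a minimal illustration under stated assumptions: intervals are closed and given as `(lo, hi)` tuples, `None` plays the role of the empty leaf, the lower median of the endpoints is used as $x_{\text{mid}}$, and the endpoints are re-sorted at every node, which is simpler but slower than the $O(n \log n)$ construction. The names `IntervalNode`, `build`, and `query` are hypothetical.

```python
class IntervalNode:
    def __init__(self, x_mid, mid):
        self.x_mid = x_mid
        # L_left: intervals of I_mid by increasing left endpoint.
        self.by_left = sorted(mid, key=lambda iv: iv[0])
        # L_right: intervals of I_mid by decreasing right endpoint.
        self.by_right = sorted(mid, key=lambda iv: iv[1], reverse=True)
        self.left = self.right = None

def build(intervals):
    """Return the root of an interval tree (None for an empty set)."""
    if not intervals:
        return None
    endpoints = sorted(e for iv in intervals for e in iv)
    x_mid = endpoints[(len(endpoints) - 1) // 2]  # lower median of 2n endpoints
    node = IntervalNode(x_mid, [iv for iv in intervals if iv[0] <= x_mid <= iv[1]])
    node.left = build([iv for iv in intervals if iv[1] < x_mid])   # I_left
    node.right = build([iv for iv in intervals if iv[0] > x_mid])  # I_right
    return node

def query(node, q):
    """Report all stored intervals that contain the value q."""
    out = []
    while node is not None:
        if q < node.x_mid:
            for iv in node.by_left:   # stop at the first interval missing q
                if iv[0] > q:
                    break
                out.append(iv)
            node = node.left
        else:
            for iv in node.by_right:  # symmetric scan from the rightmost endpoint
                if iv[1] < q:
                    break
                out.append(iv)
            node = node.right
    return out

INTERVALS = [(1, 4), (2, 9), (3, 5), (6, 8), (7, 12),
             (10, 11), (13, 15), (14, 20), (16, 17)]
ROOT = build(INTERVALS)
print(sorted(query(ROOT, 7)))  # [(2, 9), (6, 8), (7, 12)]
```

Since $x_{\text{mid}}$ is itself an endpoint, $I_{\text{mid}}$ is never empty, so the recursion in `build` always terminates.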

  57. Interval tree: result. Theorem: An interval tree for a set $I$ of $n$ intervals uses $O(n)$ storage and can be built in $O(n \log n)$ time. All intervals that contain a query point can be reported in $O(\log n + k)$ time, where $k$ is the number of reported intervals.

  59. Back to the plane. Suppose we use an interval tree on the $x$-intervals of the horizontal line segments. Then the lists $L_{\text{left}}$ and $L_{\text{right}}$ are no longer suitable for solving the query problem for the segments corresponding to $I_{\text{mid}}$.
