r trees

R-Trees, Albert-Jan Yzelman, October 22, 2007

R-Trees > Introduction Outline: 1 Introduction, 2 Basics, 3 Tree Construction, 4 Conclusions
R-Trees > Introduction Background


  1. R-Trees > Basics Demonstration of a line query (figure)

  2. R-Trees > Basics Demonstration of a line query (figure)

  3. R-Trees > Basics Demonstration of a line query (figure)

  4. R-Trees > Basics Demonstration of a line query (figure)

  5. R-Trees > Basics Asymptotic query time: $t_{\text{query}} \le c \cdot \log_m n = O(\log n)$ as $n \to \infty$. Because: the tree is tallest when each internal node has precisely $m$ children, and the tree is balanced.
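As a rough numeric illustration of the bound on slide 5, the sketch below computes the largest height a balanced tree over n leaf entries can reach when every internal node has at least m children, i.e. floor(log_m n). The function name `max_height` is illustrative and does not come from the slides.

```python
def max_height(n: int, m: int) -> int:
    """Return floor(log_m n): the largest h such that m**h <= n.

    Integer arithmetic is used to avoid floating-point rounding in log().
    """
    if n < 1 or m < 2:
        raise ValueError("need n >= 1 and m >= 2")
    h = 0
    capacity = m  # number of leaves a tree of height h + 1 needs at minimum
    while capacity <= n:
        h += 1
        capacity *= m
    return h


if __name__ == "__main__":
    for n in (8, 1000, 10**6):
        print(n, max_height(n, 2))
```

A query that follows a single root-to-leaf path visits at most this many levels, which is where the logarithmic bound comes from.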

  6. R-Trees > Tree Construction Outline: 1 Introduction, 2 Basics, 3 Tree Construction, 4 Conclusions

  7. R-Trees > Tree Construction Actively researched R-tree variations: the original R-tree, introduced in 1984; Top-down Greedy Split (TGS); Hilbert R-tree; Hilbert TGS.

  8. R-Trees > Tree Construction Grouping criteria Figure: Four to-be grouped MBRs

  9. R-Trees > Tree Construction Grouping criteria Figure: Minimum overlap criteria

  10. R-Trees > Tree Construction Grouping criteria Figure: Minimum total volume criteria
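The two grouping criteria named on slides 9 and 10 can be made concrete with a small sketch. It assumes 2D MBRs stored as `(xmin, ymin, xmax, ymax)` tuples and enumerates the three ways of splitting four MBRs into two pairs; the rectangles and all function names are made up for illustration, since the original figures are not recoverable.

```python
Rect = tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), an assumed layout


def bounding_box(rects):
    """Minimum bounding rectangle of a group of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def overlap_area(a, b):
    """Area of the intersection of two rectangles (0 if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


def pairings(rects):
    """Yield the three ways to split four rectangles into two pairs."""
    first, others = rects[0], rects[1:]
    for k, partner in enumerate(others):
        rest = others[:k] + others[k + 1:]
        yield (first, partner), tuple(rest)


def score(grouping):
    """Return (overlap of the two group MBRs, total area of the two group MBRs)."""
    box_a, box_b = (bounding_box(g) for g in grouping)
    return overlap_area(box_a, box_b), area(box_a) + area(box_b)


rects = [(0, 0, 1, 1), (3, 0, 4, 1), (0, 2, 1, 3), (3, 2, 4, 3)]
by_overlap = min(pairings(rects), key=lambda g: score(g)[0])  # minimum overlap
by_volume = min(pairings(rects), key=lambda g: score(g)[1])   # minimum total volume
```

In this example two pairings have zero overlap, so the overlap criterion alone does not distinguish them, while the total-volume criterion prefers the pairing with the smaller group MBRs.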

  11. R-Trees > Tree Construction Original dynamic R-tree We start out with an empty R-tree and insert new objects one by one. For this we need an insertion algorithm that works on arbitrary R-trees. Consider the following example with m = 2 and M = 3.

  12. R-Trees > Tree Construction Original dynamic R-tree (figure)

  13. R-Trees > Tree Construction Original dynamic R-tree (figure)

  14. R-Trees > Tree Construction Original dynamic R-tree (figure)

  15. R-Trees > Tree Construction Original dynamic R-tree (figure)

  16. R-Trees > Tree Construction Original dynamic R-tree Differences in overflow handling yield the linear, quadratic and polynomial R-tree variants. Consider the following example with m = 2 and M = 3.

  17. R-Trees > Tree Construction Overflow handling: linear splitting (figure)

  18. R-Trees > Tree Construction Overflow handling: linear splitting (figure)

  19. R-Trees > Tree Construction Overflow handling: linear splitting (figure)

  20. R-Trees > Tree Construction Overflow handling: linear splitting (figure)

  21. R-Trees > Tree Construction Overflow handling: linear splitting (figure)

  22. R-Trees > Tree Construction Overflow handling: linear splitting (figure)

  23. R-Trees > Tree Construction Overflow handling: linear splitting (figure)

  24. R-Trees > Tree Construction Overflow handling: linear splitting (figure)
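The figures for the linear splitting example did not survive extraction. As a stand-in, here is a minimal sketch of a linear-cost split in the spirit of the 1984 R-tree paper, as far as I understand it: pick two seed entries that are farthest apart along some axis (normalised by the extent of the whole set), then assign the remaining entries one at a time to the group whose MBR grows least, while making sure both groups reach the minimum fill m. The 2D tuple layout and all names are assumptions, and details may differ from what the slides showed.

```python
Rect = tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), an assumed layout


def union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def pick_seeds(rects):
    """Pick the pair with the greatest normalised separation along any axis."""
    best = None
    for dim in (0, 1):
        lo, hi = dim, dim + 2
        highest_low = max(range(len(rects)), key=lambda i: rects[i][lo])
        lowest_high = min(range(len(rects)), key=lambda i: rects[i][hi])
        if highest_low == lowest_high:
            continue
        width = max(r[hi] for r in rects) - min(r[lo] for r in rects)
        sep = (rects[highest_low][lo] - rects[lowest_high][hi]) / width if width > 0 else 0.0
        if best is None or sep > best[0]:
            best = (sep, lowest_high, highest_low)
    return (0, 1) if best is None else (best[1], best[2])


def linear_split(rects, m):
    """Split an overflowing node's entries into two groups of at least m entries."""
    s1, s2 = pick_seeds(rects)
    groups = ([rects[s1]], [rects[s2]])
    boxes = [rects[s1], rects[s2]]
    rest = [r for i, r in enumerate(rects) if i not in (s1, s2)]
    for k, r in enumerate(rest):
        remaining = len(rest) - k  # entries still unassigned, including r
        if len(groups[0]) + remaining <= m:
            g = 0  # group 0 needs every remaining entry to reach m
        elif len(groups[1]) + remaining <= m:
            g = 1  # group 1 needs every remaining entry to reach m
        else:
            grow = [area(union(boxes[i], r)) - area(boxes[i]) for i in (0, 1)]
            g = 0 if grow[0] <= grow[1] else 1  # least enlargement wins
        groups[g].append(r)
        boxes[g] = union(boxes[g], r)
    return groups


entries = [(0, 0, 1, 1), (0.5, 0, 1.5, 1), (5, 0, 6, 1), (5.5, 0, 6.5, 1)]
left, right = linear_split(entries, m=2)
```

With m = 2 and M = 3, as in the slides' example, an overflowing node holds four entries and the split yields two groups of two.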

  25. R-Trees > Tree Construction Top-down Greedy Split (TGS) Observation: query efficiency is determined top-down by the shape of the bounding boxes. Why not build the R-tree top-down?

  26. R-Trees > Tree Construction Top-down Greedy Split (TGS) Uses a collection of orderings $S$. Subdivide the input set into a maximum of $M$ subsets, each containing no more than $\tilde{n}$ elements: (1) with respect to each ordering in $S$, subdivide the input set into groups of $\tilde{n}$ elements; (2) find the best binary split with respect to all $s \in S$; (3) recursively split both groups until all groups contain fewer than $m$ elements. So either we sort $|S|$ times to find this best binary split, or we duplicate the input set $|S|$ times.

  27. R-Trees > Tree Construction Top-down Greedy Split (TGS) n = 28, M = 3, and so: h = 4.

  28. R-Trees > Tree Construction Top-down Greedy Split (TGS) n = 26, M = 3, and so: h = 3. At the first tree level, each subtree of height h − 1 = 2 may contain 3^2 = 9 elements. (figure)

  29. R-Trees > Tree Construction Top-down Greedy Split (TGS) n = 26, M = 3, and so: h = 3. At the first tree level, each subtree of height h − 1 = 2 may contain 3^2 = 9 elements. (figure)

  30. R-Trees > Tree Construction Top-down Greedy Split (TGS) n = 26, M = 3, and so: h = 3. At the first tree level, each subtree of height h − 1 = 2 may contain 3^2 = 9 elements. At the second tree level, each subtree of height h − 2 = 1 may contain 3 elements.
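The arithmetic on slides 27 to 30 can be checked with a short sketch. It assumes the tree height is the smallest h with M^h ≥ n and that a subtree rooted at a given level may hold M^(h − level) elements, which matches the numbers quoted on the slides; the function names are illustrative.

```python
def tgs_height(n: int, M: int) -> int:
    """Smallest h such that M**h >= n, i.e. ceil(log_M n) in integer arithmetic."""
    h, capacity = 0, 1
    while capacity < n:
        h += 1
        capacity *= M
    return h


def subtree_capacity(n: int, M: int, level: int) -> int:
    """Maximum number of elements in a subtree rooted at the given tree level."""
    return M ** (tgs_height(n, M) - level)


if __name__ == "__main__":
    print(tgs_height(28, 3))           # height for n = 28, M = 3
    print(tgs_height(26, 3))           # height for n = 26, M = 3
    print(subtree_capacity(26, 3, 1))  # capacity at the first level
    print(subtree_capacity(26, 3, 2))  # capacity at the second level
```

Note how n = 28 just exceeds 3^3 = 27 and therefore needs one more level than n = 26.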

  31. R-Trees > Tree Construction Random TGS For each of the $K$ times we divide some set in two, we sort $|S|$ times: $\begin{pmatrix} c_{11} & c_{21} & \cdots & c_{|S|1} \\ c_{12} & c_{22} & \cdots & c_{|S|2} \\ \vdots & \vdots & \ddots & \vdots \\ c_{1M} & c_{2M} & \cdots & c_{|S|M} \end{pmatrix}$

  32. R-Trees > Tree Construction Ordering on MBRs It would be useful to have an ordering on MBRs where $X < Y < Z$ (1) would imply that $X$ is closer to $Y$ than to $Z$.

  33. R-Trees > Tree Construction Ordering on MBRs Figure: tree with root G = max(E, F); internal nodes E = max(A, B) < F = max(C, D); leaves A < B < C < D.

  34. R-Trees > Tree Construction Ordering on MBRs Let us use the centre coordinate of MBRs for ordering. This is trivial in one dimension: $x < y$, with $x, y \in \mathbb{R}$, is well-defined.

  35. R-Trees > Tree Construction Ordering on MBRs Let us use the centre coordinate of MBRs for ordering. But for a higher number of dimensions $d \in \mathbb{N}$, $d > 1$: $\vec{x} < \vec{y}$, with $\vec{x}, \vec{y} \in \mathbb{R}^d$, is not well-defined.

  36. R-Trees > Tree Construction Ordering on MBRs To solve this, we find a mapping $h : \mathbb{R}^d \to \mathbb{R}$ by using the Hilbert curve.

  37. R-Trees > Tree Construction Ordering on MBRs Figure: First-order Hilbert curve

  38. R-Trees > Tree Construction Ordering on MBRs Figure: Recursion of the Hilbert curve

  39. R-Trees > Tree Construction Ordering on MBRs Figure: Second-order Hilbert curve

  40. R-Trees > Tree Construction Ordering on MBRs Figure: Third-order Hilbert curve

  41. R-Trees > Tree Construction Ordering on MBRs Map points in $\mathbb{R}^d$ to a point $y \in \mathbb{R}^d$ on the $n$th-order Hilbert curve. Calculate the distance $d = i / 2^{dn}$ of $y$ on the $n$th-order Hilbert curve. Write $h_n(x) = d = i / 2^{dn}$ for the $n$th-order Hilbert coordinate transform of $x$.

  42. R-Trees > Tree Construction Ordering on MBRs Map points in $\mathbb{R}^d$ to a point $y \in \mathbb{R}^d$ on the $n$th-order Hilbert curve. Calculate the distance $d = i / 2^{dn}$ of $y$ on the $n$th-order Hilbert curve. Write $h_n(x) = d = i / 2^{dn}$ for the $n$th-order Hilbert coordinate transform of $x$. The true Hilbert coordinate transform $h$ is then defined by: $h(x) = \lim_{n \to \infty} h_n(x), \quad x \in \mathbb{R}^d$.

  43. R-Trees > Tree Construction Ordering on MBRs In software calculation we use instead: $\tilde{h}(x) = c / 2^{dn}$, where $c$ is the cell number containing $x$.
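To make the cell-number formula on slide 43 concrete, here is a minimal sketch for d = 2 on the unit square. It uses the widely known iterative bit-manipulation conversion from grid cell to position along the Hilbert curve, which is not necessarily the method used in the author's library; the function names and the restriction to the unit square are assumptions.

```python
def hilbert_cell(order: int, x: int, y: int) -> int:
    """Index along the order-n Hilbert curve of grid cell (x, y), 0 <= x, y < 2**order."""
    side = 1 << order
    c = 0
    s = side >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        c += s * s * ((3 * rx) ^ ry)
        # Rotate or reflect the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = side - 1 - x
                y = side - 1 - y
            x, y = y, x
        s >>= 1
    return c


def approx_hilbert_coordinate(order: int, px: float, py: float) -> float:
    """Approximate Hilbert coordinate c / 2**(d*n) for a point in [0, 1]^2 (d = 2)."""
    side = 1 << order
    x = min(int(px * side), side - 1)  # clamp so that px == 1.0 stays inside the grid
    y = min(int(py * side), side - 1)
    return hilbert_cell(order, x, y) / 4 ** order


if __name__ == "__main__":
    print([hilbert_cell(1, x, y) for x, y in [(0, 0), (0, 1), (1, 1), (1, 0)]])
    print(approx_hilbert_coordinate(3, 0.4, 0.6))
```

Raising `order` refines the grid, so the value approaches the limit described on slide 42; points that fall in the same cell share one coordinate.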

  44. R-Trees > Tree Construction Hilbert R-tree Invariants: each leaf node stores the Hilbert coordinate of the centre coordinate of the MBR of the object stored there; each internal node stores the maximum Hilbert coordinate value $h_{\max}$ found at its children.

  45. R-Trees > Tree Construction Hilbert R-tree Invariants: each leaf node stores the Hilbert coordinate of the centre coordinate of the MBR of the object stored there; each internal node stores the maximum Hilbert coordinate value $h_{\max}$ found at its children. Insertion: (1) get the Hilbert coordinate $h$ of the MBR of the new object to be inserted; (2) find the deepest internal node $v$ with the smallest $h_{\max}$ larger than $h$ and insert the new object there; (3) update the $h_{\max}$ value at $v$ and all its parent nodes; (4) check if $v$ overflows.
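The node-selection step of the insertion procedure on slide 45 can be sketched on a flattened, single-level view of the tree. This is a simplification under stated assumptions: the deepest internal nodes are modelled as sorted lists of Hilbert coordinates kept in order of their maxima, the fallback to the last node when no $h_{\max}$ exceeds $h$ is my assumption, and overflow handling is left out. Names are illustrative.

```python
import bisect


def choose_node(h_maxes, h):
    """Index of the node with the smallest h_max larger than h.

    h_maxes must be sorted ascending. If no h_max exceeds h, fall back to the
    last node (an assumption, not stated on the slides).
    """
    i = bisect.bisect_right(h_maxes, h)
    return min(i, len(h_maxes) - 1)


def insert(nodes, h):
    """Insert Hilbert coordinate h into the chosen node and return its index.

    Each node is a sorted list, so its h_max is simply its last element and is
    updated implicitly by the sorted insertion.
    """
    h_maxes = [node[-1] for node in nodes]
    i = choose_node(h_maxes, h)
    bisect.insort(nodes[i], h)
    return i


nodes = [[0.1, 0.3], [0.4, 0.5], [0.7, 0.8]]
first = insert(nodes, 0.45)   # lands in the node whose h_max is 0.5
second = insert(nodes, 0.9)   # exceeds every h_max, so it goes to the last node
```

In a full tree the same comparison is repeated at every level on the way down, and the $h_{\max}$ values of the ancestors are refreshed on the way back up.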

  46. R-Trees > Tree Construction Hilbert R-tree: overflow handling When a single internal node v overflows: Figure: Overflow handling when there are no neighbour nodes

  47. R-Trees > Tree Construction Hilbert R-tree: overflow handling When a single internal node v overflows: Figure: Overflow handling when there is a non-full neighbour

  48. R-Trees > Tree Construction Hilbert Top-down Greedy Split Like normal TGS, but with $S$ containing only the ordering based on the Hilbert coordinate.

  49. R-Trees > Conclusions Outline: 1 Introduction, 2 Basics, 3 Tree Construction, 4 Conclusions

  50. R-Trees > Conclusions Experiments Some variations have been implemented in C++. The resulting library recently went public as an open-source project: http://www.sourceforge.net/projects/rtree-lib. Applied to datasets supplied by Shell, we obtained the following experimental results.

  51. R-Trees > Conclusions Experiments: construction time Figure: Grid size vs. building time. x-axis: number of grid elements; y-axis: building time (in processor ticks). Series: Bisection tree; 2/4 RandomTGS bulk-loaded basic R-tree; 2/4 HilbertTGS bulk-loaded basic R-tree.

  52. R-Trees > Conclusions Experiments: point query time Figure: Grid size vs. query time, point query. x-axis: number of grid elements; y-axis: average required time per point query (in seconds). Series: Bisection tree; 2/4 RandomTGS bulk-loaded basic R-tree; 2/4 HilbertTGS bulk-loaded basic R-tree.

  53. R-Trees > Conclusions Experiments: line query time Figure: Grid size vs. query time, line query. x-axis: number of grid elements; y-axis: average required time per line query (in seconds). Series: Bisection tree; 2/4 RandomTGS bulk-loaded basic R-tree; 2/4 HilbertTGS bulk-loaded basic R-tree.

  54. R-Trees > Conclusions Experiments: box query time Figure: Grid size vs. query time, box query. x-axis: number of grid elements; y-axis: average required time per box query (in seconds). Series: Bisection tree; 2/4 RandomTGS bulk-loaded basic R-tree; 2/4 HilbertTGS bulk-loaded basic R-tree.

  55. R-Trees > Conclusions Other query types: k-nn query (implemented); hyperplane query (not implemented). Figure: k-nn query time (grid size vs. query time). x-axis: number of grid elements; y-axis: average required time per k-nn query (in seconds). Series: Bisection tree; 2/4 RandomTGS bulk-loaded basic R-tree; 2/4 HilbertTGS bulk-loaded basic R-tree.
