
Improving Spatial Data Processing by Clipping Minimum Bounding Boxes



  1. Improving Spatial Data Processing by Clipping Minimum Bounding Boxes. Darius Sidlauskas (EPFL), Sean Chester (NTNU), Eleni Tzirita Zacharatou (EPFL), Anastasia Ailamaki (EPFL)

  2. Brain model (axons): 97% of the Minimum Bounding Box is empty.

  3. Empty space ⇒ unnecessary I/Os. [Figure: optimal/actual leaf accesses (#leafAcc, %) versus query selectivity (high, medium, low).] Up to 64% of the accessed leaf nodes are false hits.

  4. Tighter structure (convex hull): empty space drops from 97% to 37%, but the hull requires 49+ points.

  5. How to reduce dead space with only a few extra points?

  6. “Light cuts” using only a few extra points

  7. “Light cuts” using only a few extra points: 45% reduction in empty space with just 3 extra points.

  8. Clip point
     • Defined relative to a corner of the Minimum Bounding Box.
     • The rectangular area between the clip point and the corner is dead.
     [Figure: MBB with corners R^00 and R^11, objects o1 to o5, and clip points <p1,11>, <p2,00>, <p3,00>.]
     Low representation overhead for clipped areas.
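
To make the clip-point definition concrete, here is a minimal Python sketch. It assumes a box is stored as a pair of points `(lo, hi)` and that the corner code (the `11` or `00` in `<p1,11>`) is a tuple of bits, with 0 meaning the low side and 1 the high side of each dimension. The function names are hypothetical and do not come from the slides.

```python
from typing import Tuple

Point = Tuple[float, ...]
Mask = Tuple[int, ...]  # one bit per dimension: 0 = low side, 1 = high side


def corner(lo: Point, hi: Point, mask: Mask) -> Point:
    """Return the corner R^mask of the box [lo, hi]."""
    return tuple(h if bit else l for l, h, bit in zip(lo, hi, mask))


def clipped_region(lo: Point, hi: Point, p: Point, mask: Mask):
    """Rectangle spanned by clip point p and corner R^mask, as (lo, hi)."""
    c = corner(lo, hi, mask)
    return (tuple(min(a, b) for a, b in zip(p, c)),
            tuple(max(a, b) for a, b in zip(p, c)))


def clipped_volume(lo: Point, hi: Point, p: Point, mask: Mask) -> float:
    """Volume of the dead rectangle between p and corner R^mask."""
    c = corner(lo, hi, mask)
    vol = 1.0
    for a, b in zip(p, c):
        vol *= abs(a - b)
    return vol
```

For an MBB from (0, 0) to (10, 10), the clip point <(7, 6), 11> marks the rectangle from (7, 6) to (10, 10) as dead, an area of 12. Each clip point costs only one point plus a few corner bits, which is the low representation overhead the slide refers to.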

  9. Clipped Bounding Box (CBB)
     • Augments the Minimum Bounding Box with a set of clip points.
     • The smaller the retained volume, the better the approximation.
     [Figure: MBB with corners R^00 and R^11, objects o1 to o5, and clip points <p,11>, <q,11>, <t,00>.]
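
One way a range query could use a CBB is sketched below. This is a conservative test under stated assumptions, not necessarily the authors' exact procedure: a query is pruned when it misses the MBB, or when the part of it inside the MBB lies entirely within the dead rectangle of a single clip point. Objects may touch a clip point, so the comparisons are strict. The names `intersects_mbb` and `query_hits_cbb` are hypothetical.

```python
def intersects_mbb(lo, hi, q_lo, q_hi):
    """Standard box-box intersection test (closed intervals)."""
    return all(ql <= h and qh >= l
               for l, h, ql, qh in zip(lo, hi, q_lo, q_hi))


def query_hits_cbb(lo, hi, clip_points, q_lo, q_hi):
    """Return False if the query can be pruned, True otherwise.

    clip_points is a list of (point, mask) pairs; mask bit 1 means the
    clip point is relative to the high side of that dimension.
    """
    if not intersects_mbb(lo, hi, q_lo, q_hi):
        return False
    for p, mask in clip_points:
        # The query falls inside the dead rectangle of <p, mask> if, in
        # every dimension, its near side lies strictly beyond p toward
        # the corner.
        if all((ql > pd) if bit else (qh < pd)
               for pd, bit, ql, qh in zip(p, mask, q_lo, q_hi)):
            return False
    return True
```

The extra cost per node is one pass over its clip points, so the intersection test stays cheap compared with a test against a convex hull.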

  10. Challenge: choice of clip points. [Figure: objects o1 to o5.] Choose ≤ k clip points that maximize the eliminated volume.
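
The selection problem can be illustrated with a simple greedy heuristic. This is an assumption made for illustration, not the method from the slides: it works in 2-D only, takes the dead rectangles of the candidate clip points as input, and repeatedly picks the rectangle that adds the most area to the union of those already chosen. The names `union_area` and `greedy_select` are hypothetical.

```python
def union_area(rects):
    """Exact area of a union of axis-aligned 2-D rectangles.

    Each rectangle is ((x0, y0), (x1, y1)). Uses coordinate compression:
    every grid cell is either fully covered or not covered at all.
    """
    xs = sorted({x for (x0, _), (x1, _) in rects for x in (x0, x1)})
    ys = sorted({y for (_, y0), (_, y1) in rects for y in (y0, y1)})
    area = 0.0
    for xa, xb in zip(xs, xs[1:]):
        for ya, yb in zip(ys, ys[1:]):
            if any(x0 <= xa and xb <= x1 and y0 <= ya and yb <= y1
                   for (x0, y0), (x1, y1) in rects):
                area += (xb - xa) * (yb - ya)
    return area


def greedy_select(candidates, k):
    """Pick up to k rectangles, each time taking the largest marginal gain."""
    chosen, remaining, covered = [], list(candidates), 0.0
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda r: union_area(chosen + [r]))
        gain = union_area(chosen + [best]) - covered
        if gain <= 0:
            break  # nothing left that eliminates additional volume
        chosen.append(best)
        remaining.remove(best)
        covered += gain
    return chosen, covered
```

Measuring marginal gain rather than individual volume matters because dead rectangles anchored at the same corner overlap, so two large candidates can eliminate little more than one of them alone.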

  11. Candidate clip points
     • For a given corner R^b:
       • Consider only points on the outer surface of the objects o_i.
       • For each object, consider only its corner closest to R^b, denoted o_i^b.
     [Figure: objects o1 to o5 and corner R^00.]

  13. Skyline clip points
     • For a given corner R^b:
       • Consider only points on the outer surface of the objects o_i.
       • For each object, consider only its corner closest to R^b, denoted o_i^b.
       • Only the points in the skyline of {o_i^b} are valid clip points!
     [Figure: objects o1 to o5 and corner R^00.]
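
The skyline step can be sketched as follows. The sketch assumes the usual dominance relation, oriented toward the chosen corner: a point dominates another if it is at least as close to R^b in every dimension and differs in at least one. Objects are given as `(lo, hi)` boxes and the corner code as a tuple of bits (0 = low side, 1 = high side); the function names are hypothetical.

```python
def dominates(a, b, mask):
    """True if a is at least as close to corner R^mask as b in every
    dimension and differs from b in at least one."""
    closer_or_equal = all((x >= y) if bit else (x <= y)
                          for x, y, bit in zip(a, b, mask))
    return closer_or_equal and a != b


def skyline_clip_points(objects, mask):
    """Skyline of the object corners o_i^b with respect to corner R^mask."""
    # o_i^b: the corner of each object box that is closest to R^mask.
    corners = {tuple(h if bit else l for l, h, bit in zip(lo, hi, mask))
               for lo, hi in objects}
    return sorted(p for p in corners
                  if not any(dominates(q, p, mask) for q in corners))
```

A dominated corner cannot serve as a clip point, because the rectangle between it and R^b would cut into (or be subsumed by the clip of) the object that dominates it. The quadratic scan is fine for the handful of entries in one R-tree node.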

  14. Skyline-based CBB
     • Get the skyline points with respect to each corner R^b.
     • Choose up to k points.
     [Figure: objects o1 to o5 and corners R^00, R^01, R^10, R^11.]

  15. Skyline-based CBB (k = 1). [Figure: objects o1 to o5.]

  16. Skyline-based CBB (k = 2). [Figure: objects o1 to o5.]

  17. Skyline-based CBB (k = 3). [Figure: objects o1 to o5.]

  18. Stairline clip points
     • “Between” two skyline points.
     • Retain the “best” value in each dimension.
     • Clip away significantly more dead space.
     • Require more expensive pre-processing.
     [Figure: objects o1 to o5, corner R^11, and skyline points p and q.]
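
A possible reading of the stairline construction is sketched below. The slide does not define “best”, so the sketch assumes it means the value that clips the most, that is, the coordinate farther from the corner in each dimension. Because a point spliced this way can cut into an object, each candidate is checked for validity: no object corner may be strictly closer to R^b in every dimension. All names are hypothetical.

```python
from itertools import combinations


def strictly_closer(a, b, mask):
    """True if a is strictly closer to corner R^mask than b in every dimension."""
    return all((x > y) if bit else (x < y) for x, y, bit in zip(a, b, mask))


def splice(p, q, mask):
    """Per dimension, keep the value farther from corner R^mask (assumed 'best')."""
    return tuple(min(x, y) if bit else max(x, y)
                 for x, y, bit in zip(p, q, mask))


def stairline_clip_points(corners, skyline, mask):
    """Splice every pair of skyline points and keep the valid results.

    corners: all object corners o_i^b; skyline: their skyline w.r.t. R^mask.
    """
    stairs = set()
    for p, q in combinations(skyline, 2):
        s = splice(p, q, mask)
        # Valid only if no object reaches into the rectangle between s and R^mask.
        if not any(strictly_closer(c, s, mask) for c in corners):
            stairs.add(s)
    return sorted(stairs)
```

With skyline points (3, 9) and (7, 6) relative to R^11, the spliced point (3, 6) clips a rectangle that contains both of their individual clipped rectangles, which is why stairline points remove more dead space. Testing all pairs against all corners is also where the extra pre-processing cost comes from.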

  19. Stairline-based CBB
     • Get the stairline points that are valid clip points with respect to each corner R^b.
     • Choose up to k points.
     [Figure: objects o1 to o5 and corners R^00, R^01, R^10, R^11.]

  20. Stairline-based CBB (k = 1). [Figure: objects o1 to o5.]

  21. Stairline-based CBB (k = 2). [Figure: objects o1 to o5.]

  22. Stairline-based CBB (k = 3). [Figure: objects o1 to o5.]

  23. Stairline-based CBB (k = 4). [Figure: objects o1 to o5.]

  24. Stairline-based CBB (k = 5). [Figure: objects o1 to o5.]

  25. Experimental Setup
     • R-tree variants: Quadratic [QR-tree], Hilbert [HR-tree], R*-tree, Revised R*-tree [RR*-tree]
     • Range queries, by selectivity:
       • High: ≈ 1 object per query
       • Medium: ≈ 10 objects per query
       • Low: ≈ 100 objects per query
     • Spatial join
     • Hardware: quad-core Intel Core i7-3770 CPU @ 3.4 GHz, 16 GB RAM, 500 GB HDD (7200 RPM)
     • Datasets: rea02 and axo03 (~2M and ~2.5M elements), par02/par03 (2^30 elements)

  26. Dead space elimination. [Figure: dead space / node volume (%) with stairline clipping for the QR-tree, HR-tree, R*-tree, and RR*-tree.] CBBs remove 27% to 60% of dead space.

  27. Range query performance. [Figure: average #leafAcc relative to the original (%) with stairline clipping for the QR-tree, HR-tree, R*-tree, and RR*-tree at high, medium, and low selectivity.] ≈26% I/O reduction across all R-trees and workloads.

  28. Querying 1B spatial objects. [Figure: average query time (sec) for the HR-tree and RR*-tree: original, skyline-clipped, and stairline-clipped; par02: 71 GB, par03: 91 GB; medium selectivity.] Enabling interactive times for 1B objects.

  29. Take-home message
     • The Minimum Bounding Box (MBB) is ubiquitous
       • Compact
       • Cheap intersection tests
       • Poor approximation of real data: can be > 90% empty ⇒ up to 64% unnecessary I/Os!
     • The Clipped Bounding Box
       • Augments the MBB with a few additional clip points
       • Retains the simplicity of the MBB
       • Eliminates up to 60% of dead space
       • Enables interactive exploration of 1B objects
     Thank you!
