Improving Spatial Data Processing by Clipping Minimum Bounding Boxes
Darius Sidlauskas (EPFL), Sean Chester (NTNU), Eleni Tzirita Zacharatou (EPFL), Anastasia Ailamaki (EPFL)
Brain model (axons): 97% of the Minimum Bounding Box is empty
Empty space → unnecessary I/Os
[Figure: Optimal/Actual #leafAcc (%) by query selectivity (high, medium, low)]
Up to 64% of the accessed leaf nodes are false hits
Tighter structure (convex hull): empty space drops from 97% to 37%, but requires 49+ points
How to reduce dead space with only a few extra points
“Light cuts” using only a few extra points: 45% reduction in empty space with just 3 extra points
Clip point
• Relevant to a corner of the Minimum Bounding Box.
• The rectangular area between the clip point and the corner is dead.
[Figure: objects o1-o5 in an MBB with corners R^00 and R^11, clipped by <p1,11>, <p2,00>, and <p3,00>]
Low representation overhead for clipped areas
Clipped Bounding Box (CBB)
• Augments the Minimum Bounding Box with a set of clip points.
• The smaller the retained volume, the better the approximation.
[Figure: objects o1-o5 in an MBB with corners R^00 and R^11, clipped by <p,11>, <q,11>, and <t,00>]
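The two slides above describe a CBB as an MBB plus a few corner-anchored clip points. A minimal sketch of how such a structure might prune a range query follows. The class names, the integer `mask` encoding of the corner (bit i set means the corner lies on the maximum side of dimension i), and the strict comparisons at the clip boundary are assumptions made for illustration, not details taken from the slides.

```python
from dataclasses import dataclass, field

@dataclass
class ClipPoint:
    point: tuple   # coordinates of the clip point
    mask: int      # assumed encoding: bit i set -> corner is on the max side of dim i

@dataclass
class CBB:
    lo: tuple      # minimum corner of the MBB
    hi: tuple      # maximum corner of the MBB
    clips: list = field(default_factory=list)

def overlaps_mbb(box, q_lo, q_hi):
    """Standard closed-rectangle intersection test against the plain MBB."""
    return all(ql <= h and qh >= l
               for l, h, ql, qh in zip(box.lo, box.hi, q_lo, q_hi))

def inside_clipped_region(clip, q_lo, q_hi):
    """True if, in every dimension, the query stays strictly on the corner
    side of the clip point, so it can only touch dead space."""
    for i, p in enumerate(clip.point):
        if clip.mask >> i & 1:
            if q_lo[i] <= p:      # query reaches back past the clip point
                return False
        elif q_hi[i] >= p:
            return False
    return True

def may_contain_results(box, q_lo, q_hi):
    """MBB test first, then one cheap comparison pass per clip point."""
    if not overlaps_mbb(box, q_lo, q_hi):
        return False
    return not any(inside_clipped_region(c, q_lo, q_hi) for c in box.clips)

box = CBB((0, 0), (10, 10), [ClipPoint((6, 7), 0b11), ClipPoint((2, 3), 0b00)])
print(may_contain_results(box, (8, 8), (12, 12)))  # query lies in the clipped top-right area
print(may_contain_results(box, (5, 8), (9, 9)))    # query reaches past the clip point
```

Each clip point costs one point plus a few bits, and the extra test is a handful of comparisons, which is one way to read "low representation overhead".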
Challenge: choice of clip points
Choose ≤ k clip points that maximize the eliminated volume
Candidate clip points
• For a given corner R^b:
  • Consider only points on the outer surface of the objects o_i.
  • Consider only the closest corner o_i^b.
Skyline clip points
• For a given corner R^b:
  • Consider only points on the outer surface of the objects o_i.
  • Consider only the closest corner o_i^b.
  • Only the clip points in the skyline of {o_i^b} are valid clip points!
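The candidate filtering above can be sketched as a small skyline computation. This is an illustrative quadratic-time version, assuming axis-aligned rectangular objects given as `(lo, hi)` pairs and a hypothetical integer `mask` that identifies the corner R^b (bit i set means the maximum side of dimension i); the slides do not prescribe a particular skyline algorithm.

```python
def closest_corner(obj_lo, obj_hi, mask):
    """Corner o_i^b of an object that is nearest to MBB corner R^b."""
    return tuple(h if mask >> i & 1 else l
                 for i, (l, h) in enumerate(zip(obj_lo, obj_hi)))

def toward_corner(p, mask):
    """Flip axes so that 'closer to the corner' always means 'smaller'."""
    return tuple(-x if mask >> i & 1 else x for i, x in enumerate(p))

def dominates(a, b, mask):
    """a is at least as close to the corner as b in every dimension
    and strictly closer in at least one."""
    ka, kb = toward_corner(a, mask), toward_corner(b, mask)
    return (all(x <= y for x, y in zip(ka, kb))
            and any(x < y for x, y in zip(ka, kb)))

def skyline_clip_points(objects, mask):
    """Closest corners that no other closest corner dominates."""
    corners = [closest_corner(lo, hi, mask) for lo, hi in objects]
    return [c for c in corners
            if not any(dominates(o, c, mask) for o in corners)]

objects = [((1, 6), (3, 9)), ((2, 2), (5, 4)), ((6, 1), (9, 3)), ((4, 5), (7, 8))]
print(skyline_clip_points(objects, 0b00))  # candidates for the lower-left corner
print(skyline_clip_points(objects, 0b11))  # candidates for the upper-right corner
```

A dominated corner cannot be a clip point because the rectangle between it and R^b would cut into the object that dominates it.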
Skyline-based CBB
• Get skyline points with respect to each corner R^b.
• Choose up to k points.
[Figure: objects o1-o5 in an MBB with corners R^00, R^01, R^10, and R^11]
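The slides do not say how the "up to k" points are picked. As one possible heuristic, the sketch below scores each candidate by the volume of its own clipped rectangle and keeps the k largest. It ignores overlap between clipped regions, so it only approximates the stated goal of maximizing the eliminated volume. The function names and the `(point, mask)` candidate format are hypothetical.

```python
from math import prod

def clipped_volume(point, mask, lo, hi):
    """Volume of the rectangle between a clip point and its MBB corner."""
    return prod((hi[i] - x) if mask >> i & 1 else (x - lo[i])
                for i, x in enumerate(point))

def choose_clip_points(candidates, lo, hi, k):
    """Keep the k candidates with the largest individual clipped volume.
    Assumption: overlaps between clipped regions are not accounted for."""
    scored = [(clipped_volume(p, m, lo, hi), p, m) for p, m in candidates]
    scored = [s for s in scored if s[0] > 0]          # drop points that clip nothing
    scored.sort(key=lambda s: s[0], reverse=True)
    return [(p, m) for _, p, m in scored[:k]]

lo, hi = (0, 0), (10, 10)
candidates = [((2, 6), 0b00), ((7, 8), 0b11), ((0, 5), 0b00)]
print(choose_clip_points(candidates, lo, hi, k=1))
```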
[Figures: skyline-based CBB of objects o1-o5 for k = 1, 2, 3]
Stairline clip points
• “Between” two skyline points.
• Retain the “best” value in each dimension.
• Clip away significantly more dead space.
• Require more expensive pre-processing.
[Figure: objects o1-o5 with points p and q near corner R^11]
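One way to read the bullets above: combine two skyline points by taking, in each dimension, the coordinate farthest from the corner, then keep the result only if the clipped rectangle still contains no object. The sketch below follows that reading. The pairwise enumeration, the validity check, and the `mask` corner encoding (bit i set means the maximum side of dimension i) are assumptions for illustration.

```python
from itertools import combinations

def toward_corner(p, mask):
    """Flip axes so that 'closer to the corner' always means 'smaller'."""
    return tuple(-x if mask >> i & 1 else x for i, x in enumerate(p))

def is_valid_clip(s, corners, mask):
    """Valid if no object's closest corner lies strictly inside the
    rectangle between s and the MBB corner."""
    ks = toward_corner(s, mask)
    return not any(all(c < v for c, v in zip(toward_corner(o, mask), ks))
                   for o in corners)

def stairline_clip_points(skyline_pts, corners, mask):
    """Combine skyline pairs, keeping the coordinate farthest from the corner."""
    found = set()
    for p, q in combinations(skyline_pts, 2):
        s = tuple(min(a, b) if mask >> i & 1 else max(a, b)
                  for i, (a, b) in enumerate(zip(p, q)))
        if is_valid_clip(s, corners, mask):
            found.add(s)
    return sorted(found)

corners = [(1, 6), (2, 2), (6, 1), (4, 5)]   # closest corners for the lower-left corner
skyline = [(1, 6), (2, 2), (6, 1)]
print(stairline_clip_points(skyline, corners, 0b00))
```

With the MBB corner at the origin, the stairline point (2, 6) clips an area of 12, against 6 for the skyline point (1, 6). The pairwise enumeration with a validity scan over all objects also hints at why the pre-processing is more expensive.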
Stairline-based CBB
• Get stairline points that are valid clip points with respect to each corner R^b.
• Choose up to k points.
[Figure: objects o1-o5 in an MBB with corners R^00, R^01, R^10, and R^11]
[Figures: stairline-based CBB of objects o1-o5 for k = 1, 2, 3, 4, 5]
Experimental Setup
• R-tree variants: Quadratic [QR-tree], Hilbert [HR-tree], R*-tree, Revised R*-tree [RR*-tree]
• Range queries, by selectivity:
  • High: ≈ 1 object per query
  • Medium: ≈ 10 objects per query
  • Low: ≈ 100 objects per query
• Spatial join
• Hardware: quad-core Intel Core i7-3770 CPU @ 3.4 GHz, 16 GB RAM, 500 GB HDD (7200 RPM)
• Datasets: rea02, axo03, par02/par03, with sizes given as 2^30, ~2M, and ~2.5M elements
Dead space elimination (stairline clipping)
[Figure: dead space / node volume (%) for QR-tree, HR-tree, R*-tree, and RR*-tree]
Dead space: CBBs remove 27%-60% of dead space
Range query performance (stairline clipping)
[Figure: avg. #leafAcc w.r.t. original (%) for QR-tree, HR-tree, R*-tree, and RR*-tree at high, medium, and low selectivity]
≈26% I/O reduction across all R-trees/workloads
Querying 1B spatial objects
[Figure: avg. query time (sec) for HR-tree and RR*-tree: original, skyline-clipped, and stairline-clipped; par02: 71 GB, par03: 91 GB; selectivity: medium]
Enabling interactive times for 1B objects
Take-home message
• The Minimum Bounding Box (MBB) is ubiquitous
  • Compact
  • Cheap intersection tests
  • Poor approximation of real data: can be > 90% empty → up to 64% unnecessary I/Os!
• The Clipped Bounding Box
  • Augments the MBB with a few additional clip points
  • Retains the simplicity of the MBB
  • Eliminates up to 60% of dead space
  • Enables interactive exploration of 1B objects
Thank you!