The V*-Diagram: A Query-Dependent Approach to Moving KNN Queries Sarana Nutanong, Rui Zhang, Egemen Tanin, Lars Kulik Dept. of Computer Science and Software Engineering University of Melbourne – p.1/25
Motivation Consider two scenarios: • a driver in a GPS-equipped car finding the nearest gas station along the route of a trip; • a tourist walking in the city looking for the nearest ATM. These scenarios are examples of moving k nearest neighbor (MkNN) queries. – p.2/25
Simple Approach The Voronoi Diagram Figure 1: Voronoi diagrams Drawbacks: 1. Expensive precomputations 2. Inefficient update operations 3. No support for dynamically changing k values – p.3/25
Best Existing Approach Influence-set Retrieval [Zhang et al., 2003] Figure 2: Computing a Voronoi cell locally. (a) Bisector B_ad is discovered as a boundary. (b) All boundaries are discovered. – p.4/25
Our Approach: V*-Diagram Objectives: 1. Requires no precomputation 2. Supports dynamic insertions / deletions of objects 3. Handles dynamically changing k Result: Outperforms the best existing approach [Zhang et al., 2003] by two orders of magnitude – p.5/25
The V*-Diagram Known Region If the known NNs of q are {d, f, j}, the known region W(q, j) is {v : dist(q, v) ≤ dist(q, j)}. – p.6/25
The V*-Diagram Safe region wrt a data point We retrieve (k + x) objects. In this example, k and x are both 1, so we retrieve p and z. If q′ ∈ S(q_b, z, p), then ∀ p′ ∉ W(q_b, z): dist(q′, p) < dist(q′, p′), where S(q_b, z, p) = {q′ : dist(p, q′) ≤ dist(q_b, z) − dist(q_b, q′)}. – p.7/25
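The two definitions above translate directly into membership tests. The following is a minimal sketch, assuming 2D points as `(x, y)` tuples and Euclidean distance; the function names `in_known_region` and `in_safe_region` are illustrative and do not come from the slides.

```python
import math


def dist(a, b):
    """Euclidean distance between two 2D points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def in_known_region(q_b, z, v):
    """True if v lies in W(q_b, z) = {v : dist(q_b, v) <= dist(q_b, z)}."""
    return dist(q_b, v) <= dist(q_b, z)


def in_safe_region(q_b, z, p, q_new):
    """True if q_new lies in S(q_b, z, p), i.e.
    dist(p, q_new) <= dist(q_b, z) - dist(q_b, q_new).

    q_b is the query position at which the (k + x) objects were retrieved,
    z is the farthest retrieved object, p is the data point being protected.
    """
    return dist(p, q_new) <= dist(q_b, z) - dist(q_b, q_new)


# Hypothetical configuration with k = x = 1: p is the NN, z the extra object.
q_b, p, z = (0.0, 0.0), (1.0, 0.0), (4.0, 0.0)
print(in_safe_region(q_b, z, p, (1.0, 0.0)))  # query moved onto p
print(in_safe_region(q_b, z, p, (3.0, 0.0)))  # query moved close to z
```

While the test holds, any object outside the known region is farther from the new query position than p, by the triangle inequality: dist(q′, p′) ≥ dist(q_b, p′) − dist(q_b, q′) > dist(q_b, z) − dist(q_b, q′) ≥ dist(p, q′).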
The V*-Diagram The Fixed-rank Region (FRR) [Kulik and Tanin, 2006] Figure 3: Incremental rank update. (a) ⟨a, c, b, f, e, d⟩ (b) ⟨a, c, b, e, f, d⟩ – p.8/25
The V*-Diagram Integrated Safe Region (ISR) and V*-kNN The ISR is the intersection of 1. the safe region wrt the kth NN, S(q_b, z, p_k); 2. the FRR of the (k + x) NNs of q_b. Figure 4: V*-kNN example (k = 2, x = 2) – p.9/25
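The intersection above can be checked point-wise. The sketch below is an assumption-laden illustration, not the authors' implementation: it takes the FRR to be the set of positions at which the distance ranking of the (k + x) retrieved points is unchanged, uses Euclidean distance, and the names `rank_order` and `in_isr` are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two 2D points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def rank_order(q, points):
    """Indices of points sorted by distance from q (stable, so ties keep input order)."""
    return sorted(range(len(points)), key=lambda i: dist(q, points[i]))


def in_isr(q_b, q_new, known, k):
    """Point-wise ISR test.

    known: the (k + x) nearest neighbours of q_b, already sorted by
           distance from q_b (assumption of this sketch).
    """
    z = known[-1]       # farthest retrieved object, defines the known region
    p_k = known[k - 1]  # kth nearest neighbour
    # Condition 1: q_new is in the safe region S(q_b, z, p_k).
    safe = dist(p_k, q_new) <= dist(q_b, z) - dist(q_b, q_new)
    # Condition 2: the ranking of the known points is the same as at q_b.
    same_rank = rank_order(q_new, known) == list(range(len(known)))
    return safe and same_rank


# Hypothetical data with k = 2, x = 2.
known = [(1.0, 0.0), (2.0, 0.0), (0.0, 3.0), (5.0, 0.0)]
print(in_isr((0.0, 0.0), (0.5, 0.0), known, k=2))  # small move
print(in_isr((0.0, 0.0), (0.0, 2.0), known, k=2))  # ranking changes
```

While `in_isr` holds, the first k entries of `known` remain the kNN answer in the same order, so no new retrieval is needed.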
V*- k NN Algorithm http://www.csse.unimelb.edu.au/~sarana/demo.html – p.10/25
Experiments • Data structure: R*-trees (1-kB block size). • Comparative method: RIS-kNN [Zhang et al., 2003] • Datasets: • (U) 25,000 data points, uniformly distributed • (Z) 25,000 data points, Zipfian distribution • (C) 65,743 postal addresses from California • (N) 119,897 postal addresses from North-Eastern USA – p.11/25
Experiments Trajectories [Figure 5: Trajectory types. Panels: (a) Directional (D), (b) Random (R).] – p.12/25
Experiments total cost wrt x [Figure 6: Effect of x. Panels: (a) Total cost (D), (b) Page access (D). Horizontal axis: x; vertical axis: time (sec); one curve per dataset (U, Z, C, N).] – p.13/25
Experiments total cost wrt k [Figure 7: Effect of k. Panels: (a) Total cost (California), (b) Total cost (North-Eastern USA). Horizontal axis: k; vertical axis: time (sec); curves: V* (D), V* (R), RIS (D), RIS (R).] – p.14/25
Experiments total cost wrt n [Figure 8: Effect of dataset size. Panels: (a) Total cost (Uniform), (b) Total cost (Zipfian). Horizontal axis: n (x1000); vertical axis: time (sec); curves: V* (D), V* (R), RIS (D), RIS (R).] – p.15/25
Cost model RIS-kNN The number of kVD cells in 2D space is approximated as $2kn$ [Okabe et al., 1992]. For a given trajectory length $l$, the number $n_v$ of kVD cells crossed by the trajectory is given by $n_v = l\sqrt{2kn}$. – p.16/25
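As a quick numeric illustration of the RIS-kNN estimate, the sketch below evaluates n_v = l·√(2kn). It assumes that l is measured in units where the data space has unit area (the slide does not state the normalisation), and the function name is hypothetical.

```python
import math


def estimated_kvd_cells_crossed(l, k, n):
    """Estimate n_v = l * sqrt(2 * k * n).

    l: trajectory length (assumed normalised to a unit-area data space)
    k: number of nearest neighbours requested
    n: number of data points
    """
    return l * math.sqrt(2 * k * n)


# Hypothetical inputs: 2kn = 400 cells, so about 20 cells per unit length.
print(estimated_kvd_cells_crossed(l=0.5, k=2, n=100))
```

The estimate grows with the square root of both k and n, since the approximated cell count is 2kn and a trajectory crosses a number of cells proportional to the square root of the cell density.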
Cost model V*-kNN Directional: $n_b = l/d_e$. Random: $n_b = ls/d_e^2$, where $s$ is the step size. – p.17/25
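The two V*-kNN estimates can be evaluated the same way. This is a minimal sketch under stated assumptions: the slide does not define d_e, so it is treated here as a given positive parameter in the same length unit as l and s, and both function names are hypothetical.

```python
def n_b_directional(l, d_e):
    """Directional trajectories: n_b = l / d_e."""
    return l / d_e


def n_b_random(l, s, d_e):
    """Random trajectories: n_b = l * s / d_e**2, where s is the step size."""
    return l * s / d_e ** 2


# Hypothetical inputs: trajectory length 10, d_e = 2, step size 0.5.
print(n_b_directional(10.0, 2.0))
print(n_b_random(10.0, 0.5, 2.0))
```

When the step size s equals d_e the two expressions coincide, and for s smaller than d_e the random estimate is the smaller of the two.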
Experiments Cost Model [Figure 9: Cost model validation. Panels: (a) Effect of n, (b) Effect of k. Horizontal axes: n (x1000) and k; vertical axis: #accesses; curves: V* (D), V* (R), RIS (D), RIS (R), and the estimates (Est.).] – p.18/25
The V*-Diagram in a spatial network Figure 10: Safe region. Figure 11: Fixed-rank region. Figure 12: ISR is S(q_1, u, s) ∩ F⟨s, t, u⟩. – p.19/25
Experiments The V*-Diagram in a spatial network Figure 13: Road network in North America (175,813 nodes and 179,179 edges) – p.20/25
Experiments The V*-Diagram in a spatial network [Figure 14: Spatial network: effect of x. Panels: (a) Total response time, vertical axis time (sec); (b) Access cost, vertical axis #accesses. Horizontal axis: x; one curve per k (k = 2, 4, 6, 8, 10).] – p.21/25
Experiments The V*-Diagram in a spatial network [Figure 15: Spatial network: effect of l. Panels: (a) Total response time, vertical axis time (sec); (b) Access cost, vertical axis #accesses. Horizontal axis: l; one curve per k (k = 2, 4, 6, 8, 10).] – p.22/25
Conclusions • The V*-Diagram constructs a safe region using: 1. the location of the query point, 2. kNN-search coverage (known region), 3. known data points. • V*-kNN is local, incremental and dynamic. • V*-kNN outperforms the best existing technique by two orders of magnitude. • The V*-Diagram is a general philosophy, which can be applied to most safe-region-based techniques. – p.23/25
Related Publications • S. Nutanong, R. Zhang, E. Tanin, L. Kulik: Analysis and Evaluation of V*-kNN: An Efficient Algorithm for Moving k Nearest Neighbor Queries. To appear in VLDB Journal. • S. Nutanong, R. Zhang, E. Tanin, L. Kulik: V*-kNN: An Efficient Algorithm for Moving k Nearest Neighbor Queries (Demo). ICDE 2009: 1519-1522. • S. Nutanong, R. Zhang, E. Tanin, L. Kulik: The V*-Diagram: a query-dependent approach to moving KNN queries. PVLDB 1(1): 1095-1106 (2008). – p.24/25
Key References • Lars Kulik, Egemen Tanin: Incremental Rank Updates for Moving Query Points. GIScience 2006:251-268. • Atsuyuki Okabe, Berry Boots, Kokichi Sugihara, Sung Nok Chiu: Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. John Wiley & Sons, Inc., 1992. • Jun Zhang, Manli Zhu, Dimitris Papadias, Yufei Tao, Dik Lun Lee: Location-based Spatial Queries. SIGMOD 2003:443-454. – p.25/25