Hierarchical Graph Traversal for Aggregate k Nearest Neighbors Search in Road Networks
ICAPS 2020 Seminar
Presenter: Tenindra Abeywickrama
Co-authors: Muhammad Aamir Cheema, Sabine Storandt
Background: Road Network Graph
• Input: Road network graph G = (V, E)
• Vertex set V: road intersections
• Edge set E: road segments
• Each edge has a weight, e.g. travel time
Image source: https://magazine.impactscool.com/en/speciali/google-maps-e-la-teoria-dei-grafi/
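The weighted graph above can be sketched as a plain adjacency list; the vertex IDs and travel times below are illustrative, not from the talk.

```python
# A minimal sketch of a road network graph G = (V, E) as a weighted
# adjacency list: vertex -> list of (neighbour, edge weight) pairs.
# Weights are hypothetical travel times in minutes.
road_graph = {
    0: [(1, 4.0), (2, 2.5)],  # intersection 0 connects to 1 and 2
    1: [(0, 4.0), (3, 1.5)],
    2: [(0, 2.5), (3, 3.0)],
    3: [(1, 1.5), (2, 3.0)],
}
```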
Background: k Nearest Neighbour (kNN) Queries
• Input: Object set O ⊆ V (e.g. all restaurants)
• Input: Agent location q ∈ V (e.g. a diner)
• kNN query: what is the nearest object to q?
• By Euclidean distance: o2
• By network distance: o1
• Network distance is more accurate and versatile
[Figure: agent q and objects o1, o2; the Euclidean and network nearest neighbours differ]
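A network-distance kNN query can be answered with a plain Dijkstra search from q that stops after k object vertices are settled. This is a minimal sketch, assuming the adjacency-list graph layout from the previous slide; it is not the optimised method the talk compares against.

```python
import heapq

def knn_network(graph, q, objects, k):
    """Network-distance kNN from agent q via Dijkstra.

    Vertices are settled in increasing network distance, so object
    vertices are discovered nearest-first and the search can stop
    after k of them. `graph` maps vertex -> [(neighbour, weight)].
    """
    dist = {q: 0.0}
    settled = set()
    heap = [(0.0, q)]
    results = []
    while heap and len(results) < k:
        d, v = heapq.heappop(heap)
        if v in settled:
            continue  # stale heap entry
        settled.add(v)
        if v in objects:
            results.append((v, d))
        for u, w in graph[v]:
            nd = d + w
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                heapq.heappush(heap, (nd, u))
    return results
```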
Our Problem: Aggregate k Nearest Neighbours (AkNN)
• AkNN: find the object nearest to multiple agents
• Example: three friends (agents) want to meet at a McDonald's (objects). Which object should they meet at?
[Figure: agents q1, q2, q3 and candidate objects o1, o2, o3 on a map]
Sources: Google Maps, McDonald's, Flaticon.com
Our Problem: AkNN
• Input: aggregate function (e.g. SUM), agent set Q ⊆ V
• Aggregate the individual distances from each agent
• Rank objects by their aggregate score:
  Agg_Score(o2) = Agg_Function(d(q1, o2), d(q2, o2), d(q3, o2))
[Figure: agents q1, q2, q3 with their distances d(q1, o2), d(q2, o2), d(q3, o2) to object o2]
Sources: Google Maps, McDonald's, Flaticon.com
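The scoring step above can be sketched directly. This assumes a precomputed table of network distances; the agent/object names and travel times are hypothetical.

```python
def agg_score(dist_table, agents, o, agg=sum):
    """Aggregate score of candidate object o over all agents.

    dist_table[(q, o)] holds the network distance d(q, o). SUM and
    MAX are the two aggregate functions used in the talk.
    """
    return agg(dist_table[(q, o)] for q in agents)

# Hypothetical travel times (minutes) for three diners, two branches.
d = {("q1", "o1"): 4, ("q2", "o1"): 6, ("q3", "o1"): 5,
     ("q1", "o2"): 3, ("q2", "o2"): 7, ("q3", "o2"): 9}
agents = ["q1", "q2", "q3"]

# Rank objects by aggregate score; the best meeting point minimises it.
best = min(["o1", "o2"], key=lambda o: agg_score(d, agents, o))
```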
Our Problem: AkNN
• Still using network distance for accuracy/versatility
• Example: which McDonald's minimises the SUM of travel times over all diners?
[Figure: agents q1, q2, q3 and objects o1, o2, o3 on a road network]
Sources: Google Maps, McDonald's, Flaticon.com
Motivation
• It is inefficient to compute the distance to every object
• Typical solution: heuristically retrieve likely candidates until all results are found
• But existing heuristics are either:
  • (a) borrowed from kNN, and hence not suitable for AkNN, or
  • (b) not accurate enough for network distance
Expansion Heuristics
• Borrowed from kNN search heuristics: expand outward from each query vertex
• But the best AkNN candidates are unlikely to be near any one query vertex
[Figure: agents q1, q2, q3 expanding outward; objects o1–o4 lie between them]
Hierarchical Search Heuristic
• Divide the space recursively to group objects
• Search "promising" regions top-down (recursively)
• Pinpoint the best candidate anywhere in the space
[Figure: quadtree over the space, from the root down to the Q4 level with children Q4A–Q4D; objects o1–o6 and agents q1–q3]
Hierarchical Search
• How do we decide which regions are "promising"?
• Use a lower-bound score for all objects in a region
• Past work: R-tree + Euclidean-distance lower bound
• Not accurate for road network distance
• A data structure is needed for accurate hierarchical lower-bound search in graphs
[Figure: agent q and objects o1, o2 in a road network]
Landmark Lower-Bounds
• Pre-compute distances from landmark vertices
• Use the triangle inequality to compute a lower bound:
  d(q, o) ≥ |d(l, q) − d(l, o)|
• Choose the tightest lower bound over a set of multiple landmarks
• Only a small number of landmarks is feasible (space cost)
• Not suitable for hierarchical search
[Figure: landmark l with known distances d(l, q) and d(l, o) bounding the unknown d(q, o)]
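The landmark lower bound above is a one-liner over precomputed distances. A minimal sketch; the landmark names and distances are illustrative.

```python
def landmark_lb(landmarks, d_lq, d_lo):
    """Tightest landmark lower bound on the unknown distance d(q, o).

    By the triangle inequality, d(q, o) >= |d(l, q) - d(l, o)| holds
    for every landmark l, so the maximum over all landmarks is still
    a valid (and tighter) lower bound.
    """
    return max(abs(d_lq[l] - d_lo[l]) for l in landmarks)

# Two landmarks: l1 gives |10 - 3| = 7, l2 gives |4 - 9| = 5.
lb = landmark_lb(["l1", "l2"], {"l1": 10, "l2": 4}, {"l1": 3, "l2": 9})
```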
Compacted-Object Landmark Tree (COLT) Index
• Partition the graph recursively into a subgraph tree
• Choose localised landmarks in every subgraph
• Compact the index based on the object set O
[Figure: subgraph tree with root S0 and children S1, S2; S2 has children S2A, S2B; landmark l1 stores object distances (o1, 1), (o2, 3) and landmark l2 stores (o2, 1), (o1, 4)]
COLT
• Non-leaf and leaf nodes store, for each landmark:
  • d⁻: the minimum distance from the landmark to any object in the subgraph
  • d⁺: the maximum distance from the landmark to any object in the subgraph
• This enables an accurate lower bound for any tree node
Hierarchical Traversal in COLT
• Top-down search from the root node
• Compute a lower bound for each child using the landmark equation
• Recursively evaluate the child with the best score
[Figure: traversal of the subgraph tree from S0 into S2, then S2A and S2B]
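The traversal above can be sketched as a best-first search over the subgraph tree. The node layout (`children`, `obj`, `score` fields) and the two callbacks are assumptions for illustration, not the paper's exact API.

```python
import heapq
from itertools import count

def colt_search(root, lower_bound, leaf_best):
    """Best-first traversal of a COLT-style subgraph tree.

    Repeatedly expands the tree node with the smallest aggregate
    lower bound; at a leaf, scores its objects exactly. Nodes whose
    lower bound cannot beat the current best object are pruned.
    """
    tie = count()  # tie-breaker so the heap never compares dict nodes
    heap = [(lower_bound(root), next(tie), root)]
    best_obj, best_score = None, float("inf")
    while heap:
        lb, _, node = heapq.heappop(heap)
        if lb >= best_score:
            break  # no remaining region can contain a better object
        if node["children"]:
            for child in node["children"]:
                heapq.heappush(heap, (lower_bound(child), next(tie), child))
        else:
            obj, score = leaf_best(node)
            if score < best_score:
                best_obj, best_score = obj, score
    return best_obj, best_score
```

Note how leaf B below is never scored: its lower bound of 5 already exceeds the exact score of 4 found in leaf A.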
Hierarchical Traversal in COLT
• Leaf nodes store an Object Distance List
• Find the object with the minimum aggregate lower bound
• Interestingly, common aggregate functions preserve convexity!
• The minimum is easily found using a modified binary search
[Figure: convex aggregate g(x) over the sorted Object Distance List (distances 2, 5, 7, 8, 12), with objects o1–o5]
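The convexity observation is what makes the binary search work: a convex sequence decreases, reaches its minimum, then increases, so comparing adjacent entries tells you which side the minimum is on. A minimal sketch of that search over precomputed aggregate values:

```python
def argmin_convex(values):
    """Index of the minimum of a convex sequence via binary search.

    Because the sequence is convex (decreasing then increasing),
    comparing values[mid] with values[mid + 1] reveals the slope:
    the minimum lies on the downhill side. O(log n) comparisons
    instead of a linear scan over the leaf's Object Distance List.
    """
    lo, hi = 0, len(values) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if values[mid] < values[mid + 1]:
            hi = mid          # still decreasing to the right? no: min is at mid or left
        else:
            lo = mid + 1      # slope is downhill: min is strictly to the right
    return lo
```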
Experimental Setup
• Dataset: US road network graph from DIMACS
• |V| = 23,947,347 vertices, |E| = 57,708,624 edges
• Real-world POIs from OpenStreetMap for the US
• Comparison against IER and NVD:
  • IER: hierarchical search using the Euclidean heuristic
  • NVD: state-of-the-art expansion heuristic
Query Time: Real-World POIs
• COLT is up to an order of magnitude faster!
• COLT performs better on dense POI sets
• The heuristic matters less on sparse POI sets
Sensitivity Analysis
• COLT maintains its improvement for:
  • varying parameters (k, number of agents)
  • varying aggregate functions (MAX, SUM)
  • heuristic efficiency metrics
• This comes at a lightweight pre-processing cost
Thank You! Questions?