Mining and Exploration of Multiple Intersecting Axis-aligned Objects. Master’s Thesis. Tilemachos Pechlivanoglou. Supervisor: Manos Papagelis
Axis-aligned objects (regions, multidimensional): 1-D line segments/intervals; 2-D rectangles; 3-D boxes/cuboids
Object intersection problem
Object intersection problem
Input:
- a set of axis-aligned geometric objects
Output:
- which pairs of objects intersect
- how much they overlap
Sweep-line algorithm (1-D) [Figure: sweep line L moving across intervals labeled 0 to 5; pairs listed on the slide: (0, 2), (0, 1), (1, 2), (0, 3), (1, 3), (0, 4), (1, 4), (3, 4), (0, 5)]
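The 1-D sweep can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the function name `sweep_line_pairs`, the dictionary input format, and the choice to treat touching endpoints as intersecting are all assumptions made for the example.

```python
def sweep_line_pairs(intervals):
    """Return all intersecting pairs among closed 1-D intervals.

    `intervals` maps a label to (start, end). Touching endpoints count as
    intersecting, which is an assumption of this sketch.
    """
    events = []
    for label, (start, end) in intervals.items():
        events.append((start, 0, label))  # 0 = start; sorts before an end at the same point
        events.append((end, 1, label))    # 1 = end
    events.sort()

    active = set()  # intervals currently crossed by the sweep line
    pairs = []
    for _, kind, label in events:
        if kind == 0:
            # A new interval intersects everything that is still open.
            for other in active:
                pairs.append(tuple(sorted((other, label))))
            active.add(label)
        else:
            active.discard(label)
    return sorted(pairs)


example = {"A": (0, 4), "B": (2, 6), "C": (5, 8), "D": (9, 10)}
pairs = sweep_line_pairs(example)
```

Sorting the endpoints dominates the cost, so the sweep runs in O(n log n + p) time for n intervals and p reported pairs.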
Sweep-line algorithm (2-D) [Figure: sweep line L crossing a set of rectangles, shown alongside the interval tree]
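A rough 2-D variant is sketched below. To keep it short, the interval tree named on the slide is swapped for a plain dictionary of active rectangles that is scanned linearly, so the sketch shows the sweep logic but not the tree's query speed. The function name and the `(x0, y0, x1, y1)` tuple layout are assumptions.

```python
def sweep_line_pairs_2d(rects):
    """Return intersecting pairs of axis-aligned rectangles.

    `rects` maps a label to (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1.
    The sweep runs along x; y-overlap is checked against the active set.
    """
    events = []
    for label, (x0, _, x1, _) in rects.items():
        events.append((x0, 0, label))  # rectangle enters the sweep line
        events.append((x1, 1, label))  # rectangle leaves the sweep line
    events.sort()

    active = {}  # label -> (y0, y1); stand-in for the interval tree
    pairs = []
    for _, kind, label in events:
        if kind == 0:
            _, y0, _, y1 = rects[label]
            for other, (oy0, oy1) in active.items():
                if y0 <= oy1 and oy0 <= y1:  # y-intervals overlap
                    pairs.append(tuple(sorted((other, label))))
            active[label] = (y0, y1)
        else:
            del active[label]
    return sorted(pairs)


rects = {"A": (0, 0, 4, 4), "B": (2, 2, 6, 6), "C": (3, 5, 5, 7), "D": (10, 0, 11, 1)}
pairs = sweep_line_pairs_2d(rects)
```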
Divide-and-conquer algorithm: interval tree. Computationally equivalent to sweep-line.
Multiple Intersecting Objects
Research questions: How to detect multiple intersecting objects? What is the size of their overlap (common region)? Where is that common region located?
Applications: circuit design, spatial databases, task scheduling, simulations
The problem
Input:
- a set of regions in R^d
Output:
- enumeration of all intersecting sets of regions
- size and position of each common region
Sets: {A,B}, {A,B,C}, {A,C}, {A,B,D}, {B,C,D}, {A,D}, {B,C}, {B,D}, {A,B,C,D}, {C,D}, {D,E}
Multiple Intersection Calculation
Common region: a set of 3 or more axis-aligned objects that all intersect pairwise with each other has a non-empty common region (Helly’s theorem, convex sets).
Common region size: 1-D
For a fully intersecting set I, the common region length |Z| is:
$|Z_I| = \min(\text{end points}) - \max(\text{start points})$
$|Z_{ABC}| = \min(a_1, b_1, c_1) - \max(a_0, b_0, c_0) = a_1 - c_0$
Common region size: 2-D, 3-D, ...
For more dimensions, $|Z|$ is the product of the common region lengths $|Z_d|$ in each dimension $d$: $|Z| = \prod_d |Z_d|$
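The common region and its size can be computed per dimension, as in the following minimal sketch. The function name `common_region` and the box representation (one `(start, end)` pair per dimension) are assumptions chosen for illustration.

```python
def common_region(boxes):
    """Return (region, size) for a set of axis-aligned boxes, or None.

    Each box is a list of (start, end) pairs, one per dimension. In every
    dimension the common extent runs from the largest start point to the
    smallest end point; the size is the product of those lengths.
    """
    dims = len(boxes[0])
    region = []
    size = 1.0
    for d in range(dims):
        lo = max(box[d][0] for box in boxes)  # latest start
        hi = min(box[d][1] for box in boxes)  # earliest end
        if hi < lo:
            return None  # no overlap in this dimension, so no common region
        region.append((lo, hi))
        size *= hi - lo
    return region, size


# 1-D example with a = (0, 5), b = (2, 8), c = (4, 9): length is a_1 - c_0 = 1
one_d = common_region([[(0, 5)], [(2, 8)], [(4, 9)]])
# 2-D example: lengths 2 and 2, so the area is 4
two_d = common_region([[(0, 4), (0, 4)], [(2, 6), (1, 3)]])
```

Boxes that only touch yield a region of size 0 rather than `None` in this sketch; whether that counts as an intersection is a modeling choice.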
Intersection cardinality (k): the number of simultaneously overlapping objects in a set. Examples: $k_{ABCD} = 4$, $k_{AE} = 0$, $k_{DE} = 2$, $k_{ABCDE} = 4$
Sensible baseline algorithms
Naive approach
1. Compare each object with every other
2. If any 2 intersect, compare the pair’s common region with every other object
3. If any 3 intersect, compare the triplet’s common region with every other object
4. Repeat until no intersections are found or no objects are left
⦁ many nested loops
⦁ very high computational cost
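The steps above can be sketched for 1-D intervals as shown here. This is an illustrative reading of the naive baseline, not the code used in the thesis; the function name and the convention of growing sets in sorted label order (to avoid reporting the same set twice) are assumptions.

```python
from itertools import combinations


def naive_intersecting_sets(intervals):
    """Enumerate every fully intersecting set of 1-D intervals, level by level.

    `intervals` maps a label to (start, end). Returns a dict mapping each
    intersecting set (a sorted tuple of labels) to its common region.
    """
    def overlap(r, s):
        lo, hi = max(r[0], s[0]), min(r[1], s[1])
        return (lo, hi) if lo <= hi else None

    labels = sorted(intervals)

    # Step 1: compare each object with every other.
    current = {}
    for a, b in combinations(labels, 2):
        region = overlap(intervals[a], intervals[b])
        if region is not None:
            current[(a, b)] = region
    result = dict(current)

    # Steps 2-4: compare each common region with the remaining objects.
    while current:
        larger = {}
        for members, region in current.items():
            for label in labels:
                if label <= members[-1]:
                    continue  # only extend in sorted order to avoid duplicates
                new_region = overlap(region, intervals[label])
                if new_region is not None:
                    larger[members + (label,)] = new_region
        result.update(larger)
        current = larger
    return result


sets = naive_intersecting_sets({"A": (0, 5), "B": (2, 8), "C": (4, 9), "D": (20, 21)})
```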
Modified sweep-line approach
1. Execute sweep-line algorithm to find intersecting pairs
2. Get the common regions of all resulting pairs
3. Execute sweep-line on them to find triplets, quadruplets
4. Repeat until no intersections are found
⦁ better performance than naive
Limitations
⦁ High computational cost
⦁ Difficult to implement
⦁ Lack of versatility
− different implementations needed for different problems
− hard to process/explore a specific part of the dataset
Our approach (SLIG)
Region intersection graph
A graph data structure where:
⦁ Each vertex corresponds to an object
⦁ An edge exists between two vertices if the corresponding objects intersect
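One simple way to hold such a graph is an adjacency dictionary, sketched below. The function name `build_intersection_graph` and the choice of a dict of sets are assumptions; any graph library would serve equally well.

```python
def build_intersection_graph(labels, pairs):
    """Build an undirected graph: one vertex per object, one edge per intersecting pair."""
    graph = {label: set() for label in labels}  # isolated objects stay as vertices
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


graph = build_intersection_graph(
    ["A", "B", "C", "D"],
    [("A", "B"), ("B", "C")],
)
```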
Clique: a subset of vertices where every two are connected (i.e. a fully connected subgraph). Size-3 cliques: ABC, ABD, ACD, BCD. Size-4 clique: ABCD (maximal clique).
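The definition can be checked directly by testing every pair inside a candidate subset. The brute-force sketch below is for illustration only (it is exponential in general); the function name and the example graph, four mutually connected vertices A to D plus a vertex E attached to D, are assumptions.

```python
from itertools import combinations


def cliques_of_size(graph, k):
    """Return all size-k vertex subsets in which every two vertices are connected."""
    found = []
    for group in combinations(sorted(graph), k):
        if all(b in graph[a] for a, b in combinations(group, 2)):
            found.append(group)
    return found


graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D"},
}
triples = cliques_of_size(graph, 3)
quads = cliques_of_size(graph, 4)
```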
Observation: on an intersection graph, a clique corresponds to a fully intersecting set with a common region
Sweep-Line with Intersection Graph (SLIG)
1. Execute sweep-line algorithm to find intersecting pairs
2. Use pairs to construct the intersection graph
3. Execute a clique enumeration algorithm on the graph
⦁ best performance
⦁ uses established, efficient clique enumeration methods
⦁ much easier to implement
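The three steps can be put together for 1-D intervals as in the sketch below. It is a simplified illustration, not the thesis code: the function name `slig` is hypothetical, and step 3 uses a small recursive enumeration of all cliques of size 2 or more as a stand-in for the established clique enumeration methods the slide refers to.

```python
def slig(intervals):
    """Sketch of SLIG on 1-D intervals given as label -> (start, end)."""
    # Step 1: sweep line over sorted endpoints (starts sort before ends at ties).
    events = sorted(
        [(start, 0, name) for name, (start, _) in intervals.items()]
        + [(end, 1, name) for name, (_, end) in intervals.items()]
    )

    # Step 2: add an edge for every intersecting pair the sweep discovers.
    graph = {name: set() for name in intervals}
    active = set()
    for _, kind, name in events:
        if kind == 0:
            for other in active:
                graph[name].add(other)
                graph[other].add(name)
            active.add(name)
        else:
            active.discard(name)

    # Step 3: enumerate cliques; each clique is a fully intersecting set.
    cliques = []

    def extend(clique, candidates):
        if len(clique) >= 2:
            cliques.append(tuple(clique))
        for v in sorted(candidates):
            # Keep only later-ordered vertices adjacent to v, so each clique appears once.
            extend(clique + [v], {w for w in candidates & graph[v] if w > v})

    extend([], set(graph))
    return graph, cliques


graph, cliques = slig({"A": (0, 5), "B": (2, 8), "C": (4, 9), "D": (7, 12), "E": (20, 21)})
```

Keeping the graph as a separate result is what allows later exploration, since queries can run on it without repeating the sweep.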
Extensions: Querying capability
The intersection graph provides additional mining options, such as exploration using queries:
⦁ Single Region Query: given an object, find all other objects intersecting with it
⦁ Multiple Region Query: given a set of objects, find all intersections occurring in the set
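On an adjacency-dictionary graph, both queries reduce to small lookups, as sketched below. The function names and the reading of a multiple region query as "all fully intersecting subsets of the given objects" are assumptions; the brute-force subset check is only meant for small query sets.

```python
from itertools import combinations


def single_region_query(graph, obj):
    """All objects intersecting `obj`: simply its neighbors in the graph."""
    return sorted(graph[obj])


def multiple_region_query(graph, objects):
    """All fully intersecting subsets (size >= 2) of the given objects."""
    objects = sorted(objects)
    hits = []
    for k in range(2, len(objects) + 1):
        for group in combinations(objects, k):
            if all(b in graph[a] for a, b in combinations(group, 2)):
                hits.append(group)
    return hits


graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
    "E": set(),
}
neighbors_of_b = single_region_query(graph, "B")
within_bcd = multiple_region_query(graph, {"B", "C", "D"})
```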
Multiple Intersections Evaluation
Randomly generated objects [Figure panels: 1-D intervals; 1-D intersection graph; 2-D rectangles; 2-D intersection graph]
Intersection graph size
Performance of SLIG: SLIG scales much better than the baseline
Effect of graph topology: smaller/sparser objects -> sparser graphs -> faster execution
SLIG query performance
Real-world data: overlapping areas of extreme weather in CA & NV, USA
Node Importance in Trajectory Networks
Trajectories of moving objects
Trajectory Mining: trajectory similarity, trajectory clustering, trajectory anomaly detection, trajectory pattern mining, trajectory classification, ... more
Node Importance
Node importance (or centrality): degree centrality, betweenness centrality, closeness centrality, eigenvector centrality
Over time: node degree over time, triangles over time, connected components over time (connectedness)
Applications: infection spreading, wireless signal security, rich dynamic network analytics
Proximity networks [Figure: distance threshold θ]
Distance can represent: line of sight, Wi-Fi signal range, travel distance in a day
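A proximity network for a single time instant might be built as in the sketch below. It assumes that two objects are connected when their Euclidean distance is at most θ; the slides do not spell out the exact rule, and the function name `proximity_edges` is hypothetical.

```python
from itertools import combinations
from math import dist


def proximity_edges(positions, theta):
    """Return edges between objects whose Euclidean distance is at most theta.

    `positions` maps a node label to its coordinates at one time instant.
    """
    return [
        (a, b)
        for a, b in combinations(sorted(positions), 2)
        if dist(positions[a], positions[b]) <= theta
    ]


edges = proximity_edges({"n1": (0, 0), "n2": (3, 4), "n3": (10, 0)}, theta=5)
```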
Trajectory networks
Problem difficulty: node importance algorithms exist for static networks; a trajectory network is a sequence of static networks (snapshots); one large network per discrete time unit!
Node Importance in Trajectory Networks
Naive approach
Naive approach
For every discrete time unit:
1. get static snapshot of network
2. run static node importance algorithms on snapshot
Aggregate results at the end
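The snapshot-based baseline can be sketched for node degree as follows. The edge format `(u, v, start, end)` with an edge present during the half-open range `[start, end)`, the function name, and the use of summed degree as the aggregate are all assumptions made for the example.

```python
from collections import defaultdict


def naive_degree_over_time(edges, horizon):
    """Compute node degree on a fresh snapshot at every time unit, then aggregate.

    `edges` is a list of (u, v, start, end); an edge exists for start <= t < end.
    Returns (per-step degree dicts, total degree summed over time per node).
    """
    per_step = []
    for t in range(horizon):
        # 1. Build the static snapshot at time t and 2. measure degree on it.
        degree = defaultdict(int)
        for u, v, start, end in edges:
            if start <= t < end:
                degree[u] += 1
                degree[v] += 1
        per_step.append(dict(degree))

    # Aggregate results at the end.
    total = defaultdict(int)
    for snapshot in per_step:
        for node, deg in snapshot.items():
            total[node] += deg
    return per_step, dict(total)


per_step, total = naive_degree_over_time([("a", "b", 0, 3), ("b", "c", 1, 2)], horizon=4)
```

Every time unit rescans all edges, which is the repeated work the later slides aim to avoid.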
Streaming approach
Similar to naive, but:
− no final aggregation
− results calculated iteratively at every step
Still every time unit
Every discrete time unit [Figure: timeline with ticks 0, 1, 2, 3, 4, ..., T; axis: time]
Sweep Line Over Trajectories (SLOT) (algorithm sketch): represent TN edges as time intervals; apply a variation of the sweep line algorithm; simultaneously compute node degree, triangle membership, connected components in one pass
Edges as time intervals... [Figure: edges e_1:(n_1, n_2), ..., e_n drawn as intervals with endpoints t_1, t_2, ... on a time axis from 0 to T, crossed by sweep line L]
Sweep Line Over Trajectories (SLOT)
At every edge start, e:(u, v)
⦁ Degree
− nodes u, v now connected
− increment u, v degree
⦁ Triangles
− did a triangle just form?
− look for u, v common neighbors
− increment triangle (u, v, common)
⦁ Components
− did two previously unconnected components connect?
− compare old components of u, v
− if not same, merge them
At every edge stop, e:(u, v)
⦁ Degree
− nodes u, v now disconnected
− decrement u, v degree
⦁ Triangles
− did a triangle just break?
− look for u, v common neighbors
− decrement triangle (u, v, common)
⦁ Components
− did a component separate?
− BFS to see if u, v still connected
− if not, split component in two
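The start and stop handlers can be combined into one pass over sorted edge endpoints, as in the sketch below. It is a simplified illustration rather than the thesis implementation: the function name `slot` and the `(u, v, start, end)` edge format are assumptions, each pair of nodes is assumed to have at most one active edge at a time, and component membership is tracked only as a count, using a BFS at both start and stop instead of merging stored component labels.

```python
from collections import deque


def slot(edges):
    """Sweep over edge start/stop events, tracking degree, triangles, components.

    `edges` is a list of (u, v, start, end); an edge is active during [start, end).
    Returns one log entry per event:
    (time, "start" | "stop", (u, v), degrees, triangle_count, component_count).
    """
    events = []
    for u, v, start, end in edges:
        events.append((start, 1, u, v))  # 1 = edge start
        events.append((end, 0, u, v))    # 0 = edge stop; handled first at equal times
    events.sort()

    nodes = {n for u, v, _, _ in edges for n in (u, v)}
    neighbors = {n: set() for n in nodes}
    triangles = 0
    components = len(nodes)  # every node starts isolated
    log = []

    def connected(src, dst):
        """Breadth-first search over the currently active edges."""
        seen, queue = {src}, deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                return True
            for nxt in neighbors[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        return False

    for time, kind, u, v in events:
        if kind == 1:
            if not connected(u, v):
                components -= 1  # two components merge
            triangles += len(neighbors[u] & neighbors[v])  # one per common neighbor
            neighbors[u].add(v)
            neighbors[v].add(u)
        else:
            neighbors[u].discard(v)
            neighbors[v].discard(u)
            triangles -= len(neighbors[u] & neighbors[v])
            if not connected(u, v):
                components += 1  # the component split in two
        degrees = {n: len(neighbors[n]) for n in sorted(nodes)}
        log.append((time, "start" if kind == 1 else "stop", (u, v), degrees, triangles, components))
    return log


log = slot([("a", "b", 0, 10), ("b", "c", 1, 8), ("a", "c", 2, 6), ("d", "e", 3, 4)])
```

Work happens only at the 2·|edges| events, not at every discrete time unit, and the values between two consecutive events stay constant.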
SLOT: At the end of the algorithm...
Rich analytics
⦁ node degrees: start/end time, duration
⦁ triangles: start/end time, duration
⦁ connected components: start/end time, duration
Exact results (not approximations)
Evaluation of SLOT
Simulating trajectories [Figure panels: constant velocity; random velocity]
Degree
SLOT performance (triangles, connectedness)
with max=0.15, min=0
Seagull migration trajectories
Summary
Multiple Intersections: axis-aligned object intersections; sweep-line algorithm; region intersection graph. SLIG properties: - Fast & efficient - Exact - Query capabilities
Node importance in TNs: trajectory networks; network importance over time; SLOT algorithm. SLOT properties: - Fast - Exact - Scalable