Mining and Exploration of Multiple Intersecting Axis-aligned Objects



  1. Mining and Exploration of Multiple Intersecting Axis-aligned Objects. Master's Thesis. Tilemachos Pechlivanoglou. Supervisor: Manos Papagelis

  2. Axis-aligned objects: 1-D line segments/intervals; 2-D rectangles; 3-D boxes/cuboids; multidimensional regions

  3. Object intersection problem

  4. Object intersection problem. Input: a set of axis-aligned geometric objects. Output: which pairs of objects intersect, and by how much

  5. Sweep-line algorithm (1-D) [Figure: sweep line L moving over intervals 0 to 5; pairs shown: (0, 2), (0, 1), (1, 2), (0, 3), (1, 3), (0, 4), (1, 4), (3, 4), (0, 5)]
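
The 1-D sweep-line idea can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name `sweep_line_pairs`, the `(start, end)` tuple representation, and the choice to treat touching intervals as intersecting are all assumptions made for the example.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end), with start <= end (assumed layout)


def sweep_line_pairs(intervals: List[Interval]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), i < j, of intervals that overlap."""
    # One start event and one end event per interval. kind 0 = start and
    # kind 1 = end, so at equal coordinates starts sort first and touching
    # intervals count as intersecting (an assumption of this sketch).
    events = []
    for idx, (start, end) in enumerate(intervals):
        events.append((start, 0, idx))
        events.append((end, 1, idx))
    events.sort()

    active = set()  # intervals currently crossed by the sweep line
    pairs = []
    for _, kind, idx in events:
        if kind == 0:
            # A newly opened interval intersects every interval still active.
            pairs.extend((min(idx, other), max(idx, other)) for other in active)
            active.add(idx)
        else:
            active.discard(idx)
    return sorted(pairs)
```

For example, `sweep_line_pairs([(0, 5), (1, 3), (2, 4), (6, 7)])` reports the pairs among the first three intervals and nothing for the last one.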

  6. Sweep-line algorithm (2-D) [Figure: sweep line L moving over rectangles; interval tree]

  7. Divide-and-conquer algorithm: interval tree; computationally equivalent to sweep-line

  8. Multiple Intersecting Objects

  9. Research questions: How to detect multiple intersecting objects? What is the size of their overlap (common region)? Where is that common region located?

  10. Applications: circuit design, spatial databases, task scheduling, simulations

  11. The problem. Input: a set of regions in $\mathbb{R}^d$. Output: enumeration of all intersecting sets of regions; size and position of each common region. Sets: {A,B}, {A,B,C}, {A,C}, {A,B,D}, {B,C,D}, {A,D}, {B,C}, {B,D}, {A,B,C,D}, {C,D}, {D,E}

  12. Multiple Intersection Calculation

  13. Common region: a set of 3 or more axis-aligned objects that all intersect pairwise with each other has a non-empty common region (Helly's theorem, convex sets)

  14. Common region size: 1-D. For a fully intersecting set $I$, the common region length $|Z_I|$ is $|Z_I| = \min(\text{end points}) - \max(\text{start points})$. Example: $|Z_{ABC}| = \min(a_1, b_1, c_1) - \max(a_0, b_0, c_0) = a_1 - c_0$

  15. Common region size: 2-D, 3-D, ... For more dimensions, $|Z|$ is the product of the common region lengths $|Z_d|$ in each dimension: $|Z| = \prod_d |Z_d|$
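
The per-dimension rule above can be illustrated with a short sketch. The names `common_region` and `region_size`, and the representation of a box as one `(start, end)` pair per dimension, are assumptions chosen for the example rather than anything defined in the slides.

```python
from math import prod
from typing import List, Optional, Sequence, Tuple

Box = Sequence[Tuple[float, float]]  # one (start, end) pair per dimension (assumed layout)


def common_region(boxes: List[Box]) -> Optional[List[Tuple[float, float]]]:
    """Return the common region of all boxes, or None if it is empty."""
    region = []
    for dim in range(len(boxes[0])):
        lo = max(box[dim][0] for box in boxes)  # latest start point
        hi = min(box[dim][1] for box in boxes)  # earliest end point
        if hi < lo:
            return None  # the boxes do not all overlap in this dimension
        region.append((lo, hi))
    return region


def region_size(region: Sequence[Tuple[float, float]]) -> float:
    """Product of the common region lengths over all dimensions."""
    return prod(hi - lo for lo, hi in region)
```

With three 2-D rectangles `[(0, 4), (0, 3)]`, `[(1, 5), (1, 4)]` and `[(2, 6), (2, 5)]`, the common region is `[(2, 4), (2, 3)]`, so its size is 2 × 1 = 2.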

  16. Intersection cardinality ($k$): the number of simultaneously overlapping objects in a set. Examples: $k_{ABCD} = 4$, $k_{AE} = 0$, $k_{DE} = 2$, $k_{ABCDE} = 4$

  17. Sensible baseline algorithms

  18. Naive approach: (1) compare each object with every other; (2) if any 2 intersect, compare the pair's common region with every other object; (3) if any 3 intersect, compare the triplet's common region with every other object; (4) repeat until no intersections are found or no objects are left. Drawbacks: many nested loops; very high computational cost
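
The steps of the naive approach can be sketched for 1-D intervals as below. This is an illustrative reading of the slide, with assumed names (`naive_intersecting_sets`) and an assumed convention that touching intervals intersect; it grows each intersecting set by one object per round, only trying higher indices so that each set is produced once.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start, end), assumed layout


def naive_intersecting_sets(intervals: List[Interval]) -> Dict[Tuple[int, ...], Interval]:
    """Map every intersecting set of 2 or more intervals to its common region."""
    found: Dict[Tuple[int, ...], Interval] = {}
    # Round 0 holds single objects; every later round holds sets one larger.
    current = {(i,): iv for i, iv in enumerate(intervals)}
    while current:
        grown = {}
        for members, (lo, hi) in current.items():
            # Compare the set's common region with every later object.
            for j in range(members[-1] + 1, len(intervals)):
                new_lo = max(lo, intervals[j][0])
                new_hi = min(hi, intervals[j][1])
                if new_lo <= new_hi:
                    grown[members + (j,)] = (new_lo, new_hi)
        found.update(grown)
        current = grown  # stop once a round finds no new intersections
    return found
```

Even in this compact form the cost is visible: every intersecting set found in one round is compared against the remaining objects in the next.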

  19. Modified sweep-line approach: (1) execute the sweep-line algorithm to find intersecting pairs; (2) get the common regions of all resulting pairs; (3) execute sweep-line on them to find triplets, quadruplets; (4) repeat until no intersections are found. Better performance than naive

  20. Limitations: high computational cost; difficult to implement; lack of versatility (different implementations needed for different problems; hard to process/explore a specific part of the dataset)

  21. Our approach (SLIG)

  22. Region intersection graph: a graph data structure where each vertex corresponds to an object, and an edge exists between two vertices if the corresponding objects intersect

  23. Clique: a subset of vertices where every two are connected (i.e. a fully connected subgraph). Size-3 cliques: ABC, ABD, ACD, BCD. Size-4 clique: ABCD (maximal clique)

  24. Observation: on an intersection graph, a clique corresponds to a fully intersecting set with a common region

  25. Sweep-Line with Intersection Graph (SLIG): (1) execute the sweep-line algorithm to find intersecting pairs; (2) use the pairs to construct the intersection graph; (3) execute a clique enumeration algorithm on the graph. Advantages: best performance; uses established, efficient clique enumeration methods; much easier to implement
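
Steps 2 and 3 can be sketched as follows, taking the intersecting pairs from step 1 as input. The function names are hypothetical, and the simple recursive enumeration shown here is only a stand-in: the slides do not say which clique enumeration method the thesis uses.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Set, Tuple


def build_intersection_graph(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Adjacency sets: one vertex per object, one edge per intersecting pair."""
    graph = defaultdict(set)
    for u, v in pairs:
        graph[u].add(v)
        graph[v].add(u)
    return dict(graph)


def enumerate_cliques(graph: Dict[str, Set[str]]) -> Iterator[Tuple[str, ...]]:
    """Yield every clique with at least two vertices, each exactly once."""

    def extend(clique: List[str], candidates: List[str]) -> Iterator[Tuple[str, ...]]:
        for i, v in enumerate(candidates):
            new_clique = clique + [v]
            if len(new_clique) >= 2:
                yield tuple(new_clique)
            # Later candidates adjacent to v are adjacent to the whole clique;
            # only looking at later ones avoids reporting a set twice.
            later = [w for w in candidates[i + 1:] if w in graph[v]]
            yield from extend(new_clique, later)

    yield from extend([], sorted(graph))
```

Each clique yielded corresponds to one fully intersecting set; its common region can then be computed from the member objects.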

  26. Extensions: querying capability. The intersection graph provides additional mining options, such as exploration using queries. Single Region Query: given an object, find all other objects intersecting with it. Multiple Region Query: given a set of objects, find all intersections occurring in the set
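
Both query types can be sketched on an adjacency-set graph. The function names, the `EXAMPLE_GRAPH` data, and the brute-force subset check in the multiple region query are assumptions for illustration; the brute-force check is only reasonable for small query sets.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Graph = Dict[str, Set[str]]

# Hypothetical intersection graph: A, B, C, D all intersect pairwise; E only meets D.
EXAMPLE_GRAPH: Graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D"},
}


def single_region_query(graph: Graph, obj: str) -> Set[str]:
    """All objects intersecting obj: simply its neighbors in the graph."""
    return set(graph.get(obj, set()))


def multiple_region_query(graph: Graph, objs: Set[str]) -> List[Tuple[str, ...]]:
    """Every fully intersecting subset (size >= 2) of the queried objects."""
    result = []
    members = sorted(objs)
    for size in range(2, len(members) + 1):
        for subset in combinations(members, size):
            # A subset is fully intersecting when all its pairs are edges.
            if all(v in graph.get(u, set()) for u, v in combinations(subset, 2)):
                result.append(subset)
    return result
```

For instance, querying `{"C", "D", "E"}` on the example graph returns the pairs C,D and D,E but no triple, because C and E do not intersect.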

  27. Multiple Intersections Evaluation

  28. Randomly generated objects [Figure panels: 1-D intervals; 1-D intersection graph; 2-D rectangles; 2-D intersection graph]

  29. Intersection graph size

  30. Performance of SLIG: SLIG scales much better than the baseline

  31. Effect of graph topology: smaller/sparser objects -> sparser graphs -> faster execution

  32. SLIG query performance

  33. Real-world data: overlapping areas of extreme weather in CA & NV, USA

  34. Node Importance in Trajectory Networks

  35. Trajectories of moving objects

  36. Trajectory mining: trajectory similarity, trajectory clustering, trajectory anomaly detection, trajectory pattern mining, trajectory classification, and more

  37. Node Importance

  38. Node importance (or centrality): degree centrality, betweenness centrality, closeness centrality, eigenvector centrality

  39. Over time: node degree over time; triangles over time; connected components over time (connectedness)

  40. Applications: infection spreading, wireless signal security, rich dynamic network analytics

  41. Proximity networks [Figure: distance θ]

  42. Distance can represent: line of sight; Wi-Fi signal range; travel distance in a day

  43. Trajectory networks

  44. Problem difficulty: node importance algorithms for static networks; sequence of static networks (snapshots); one large network per discrete time unit!

  45. Node Importance in Trajectory Networks

  46. Naive approach

  47. Naive approach: for every discrete time unit, (1) get a static snapshot of the network; (2) run static node importance algorithms on the snapshot. Aggregate results at the end
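
As an illustration of the snapshot-by-snapshot procedure, the sketch below uses node degree as the static importance measure and a simple sum as the final aggregation. Both choices, along with the function names and the edge-list representation of a snapshot, are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def degrees_per_snapshot(snapshots: List[List[Edge]]) -> List[Dict[str, int]]:
    """Run a static degree computation on each snapshot independently."""
    results = []
    for edges in snapshots:  # one static network per discrete time unit
        degree: Dict[str, int] = defaultdict(int)
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
        results.append(dict(degree))
    return results


def aggregate_degrees(per_snapshot: List[Dict[str, int]]) -> Dict[str, int]:
    """Aggregate at the end: total degree of each node over all time units."""
    total: Dict[str, int] = defaultdict(int)
    for degree in per_snapshot:
        for node, value in degree.items():
            total[node] += value
    return dict(total)
```

The work is repeated in full for every time unit, even when consecutive snapshots barely differ.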

  48. Streaming approach: similar to naive, but with no final aggregation; results are calculated iteratively at every step. Still processes every time unit

  49. Every discrete time unit [Figure: time axis with discrete time units 0, 1, 2, 3, 4, ..., T]

  50. Sweep Line Over Trajectories (SLOT) (algorithm sketch): represent TN edges as time intervals; apply a variation of the sweep-line algorithm; simultaneously compute node degree, triangle membership, and connected components in one pass

  51. Edges as time intervals [Figure: edges e_1:(n_1, n_2), ..., e_n drawn as intervals over a time axis from 0 to T, with sweep line L and event times t_1, ..., t_13]

  52. Sweep Line Over Trajectories (SLOT)

  53. At every edge start, for edge e:(u, v). Degree: nodes u, v are now connected; increment the degree of u and v. Triangles: did a triangle just form? Look for common neighbors of u, v; increment triangle (u, v, common). Components: did two previously unconnected components connect? Compare the old components of u, v; if not the same, merge them

  54. At every edge stop, for edge e:(u, v). Degree: nodes u, v are now disconnected; decrement the degree of u and v. Triangles: did a triangle just break? Look for common neighbors of u, v; decrement triangle (u, v, common). Components: did a component separate? Run BFS to see if u, v are still connected; if not, split the component in two
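
The edge start and edge stop rules for degree and triangles can be sketched in a single sweep over edge intervals. This is a simplified illustration, not the SLOT implementation: it omits connected components, and it assumes hypothetical names, half-open intervals `[start, stop)`, and at most one active edge per node pair at any time.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

EdgeInterval = Tuple[str, str, int, int]  # (u, v, start, stop), assumed layout


def sweep_degree_and_triangles(edges: List[EdgeInterval]):
    """Return (degree changes per node, changes in the active triangle count)."""
    # kind 0 = stop, kind 1 = start: at equal times stops are handled first,
    # which matches the half-open interval assumption.
    events = []
    for u, v, start, stop in edges:
        events.append((start, 1, u, v))
        events.append((stop, 0, u, v))
    events.sort()

    neighbors: Dict[str, Set[str]] = defaultdict(set)
    degree_log: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    triangle_log: List[Tuple[int, int]] = []
    triangles = 0

    for time, kind, u, v in events:
        if kind == 1:  # edge start: common neighbors of u, v close triangles
            common = neighbors[u] & neighbors[v]
            neighbors[u].add(v)
            neighbors[v].add(u)
            triangles += len(common)
        else:  # edge stop: triangles through (u, v) break
            neighbors[u].discard(v)
            neighbors[v].discard(u)
            common = neighbors[u] & neighbors[v]
            triangles -= len(common)
        for node in (u, v):
            degree_log[node].append((time, len(neighbors[node])))
        if common:
            triangle_log.append((time, triangles))
    return dict(degree_log), triangle_log
```

Each log entry is a `(time, value)` pair recorded only when something changes, so start time, end time and duration of every degree or triangle state can be read off consecutive entries.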

  55. SLOT: at the end of the algorithm... Rich analytics: node degrees (start/end time, duration); triangles (start/end time, duration); connected components (start/end time, duration). Exact results (not approximations)

  56. Evaluation of SLOT

  57. Simulating trajectories [Figure panels: constant velocity; random velocity]

  58. Degree

  59. SLOT performance (triangles, connectedness)

  60. with max=0.15, min=0

  61. Seagull migration trajectories

  62. Summary

  63. Multiple intersections: axis-aligned object intersections; sweep-line algorithm; region intersection graph. SLIG properties: fast and efficient; exact; query capabilities

  64. Node importance in TNs: trajectory networks; network importance over time; SLOT algorithm. SLOT properties: fast; exact; scalable
