Trajectory Clustering: Visual Analytics Approaches
Gennady Andrienko & Natalia Andrienko
http://geoanalytics.net

Outline
Similarity measures for trajectories
Density-based clustering of trajectories
- Using Space-Time Cube for interpreting clusters
- Progressive clustering
- Using projection for assessing clustering results
Sammon’s projection for trajectory clustering
Clustering of large trajectory data sets
Analysis of trajectory attributes in clusters

Projection
[Figure: a set of objects mapped to points in a projection space]
Basic idea: similar objects map to close points; dissimilar objects map to distant points
This requires a numeric measure of (dis)similarity

Measuring (dis)similarity
Approach 1: use feature vectors
- Describe objects by values of N numeric attributes (features) chosen according to the analysis goals
- Feature vector == list of N attribute values
- Dissimilarity == distance between the vectors in the N-dimensional abstract space of the possible combinations of the attribute values (e.g. Euclidean distance)
[Figure: example attributes: Purpose, Material, Colour, Size, Shape, Weight, Producer, …]
However, objects may have complex properties that cannot be adequately represented by numeric features
- Approach 2: devise an ad-hoc distance function, i.e. an algorithm to measure dissimilarity

Using feature vectors
Clustering of table data: standard clustering methods
Which trajectory attributes to use?
- N points
- Duration
- Travelled distance
- Displacement
- Direction
- Sinuosity
- Tortuosity
- Average / Median / Max speed
- N intermediate stops
- Total / Max duration of intermediate stops
- …
How to avoid redundancy?
How to normalize the selected attributes?
Which clustering method to apply?
How many clusters are expected?
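
The feature-vector approach can be sketched as follows. This is a minimal illustration, not the authors' tool: it assumes a trajectory is a list of (t, x, y) tuples with planar coordinates, computes a few of the listed attributes, and uses z-score scaling as one possible answer to the normalization question. The function names and the input layout are assumptions made for this example.

```python
import math
from statistics import mean, pstdev


def trajectory_features(traj):
    """Compute a few numeric attributes of one trajectory.

    Assumption: traj is a list of (t, x, y) tuples, t in seconds,
    x and y in planar units such as metres.
    """
    duration = traj[-1][0] - traj[0][0]
    # Path length: sum of distances between consecutive positions.
    travelled = sum(math.dist(a[1:], b[1:]) for a, b in zip(traj, traj[1:]))
    # Displacement: straight-line distance from first to last position.
    displacement = math.dist(traj[0][1:], traj[-1][1:])
    return {
        "n_points": len(traj),
        "duration": duration,
        "travelled_distance": travelled,
        "displacement": displacement,
        "average_speed": travelled / duration if duration > 0 else 0.0,
    }


def zscore_columns(rows):
    """Scale every attribute to zero mean and unit variance across trajectories."""
    scaled = [dict() for _ in rows]
    for key in rows[0]:
        column = [row[key] for row in rows]
        mu, sigma = mean(column), pstdev(column)
        for out, value in zip(scaled, column):
            out[key] = (value - mu) / sigma if sigma > 0 else 0.0
    return scaled
```

The scaled rows can then be passed to any standard table-clustering method. Whether z-scores are the right normalization, and which attributes are redundant, remain the analyst's decisions.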

Trajectories are complex geographical objects
Trajectory: sequence (t_1, s_1), (t_2, s_2), … (t_i – time moment, s_i – position in space)
Similarity: routes (ignoring time), dynamics (relative time), coincidence (absolute time) – no adequate representation by feature vectors
Complexities: different sequence lengths, different time spacing, measurement errors and gaps, …

Distance functions for trajectories
A library of easily understandable distance functions oriented to different properties
More sophisticated distance functions are available through loose integration with HERMES TDW
- N.Pelekis, G.Andrienko, N.Andrienko, I.Kopanakis, G.Marketos, Y.Theodoridis. Visually Exploring Movement Data via Similarity-based Analysis. Journal of Intelligent Information Systems, 2012, v.38 (2), pp.343-391. http://dx.doi.org/10.1007/s10844-011-0159-2

Example of specific distance function: route similarity
Finds corresponding points in two trajectories
Computes the average distance between the corresponding points
Accumulates the length of the corresponding parts
Accumulates the deviations of non-corresponding points
Penalty factor = (cumulative deviation) / (corresponding length)
Penalty distance = (cumulative deviation) * (penalty factor)
Final distance = average distance + penalty distance
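
The arithmetic of the route similarity function can be sketched in a simplified form. The slide does not say how corresponding points are found, so the matching rule here is an assumption: a point of the first trajectory corresponds to the second trajectory when its nearest point there lies within a hypothetical `threshold`. The function name, the threshold parameter and the handling of degenerate cases are also assumptions.

```python
import math


def route_distance(a, b, threshold):
    """Simplified route-similarity distance from point list a to point list b.

    Assumed matching rule (not from the source): a point of `a` has a
    corresponding point in `b` if its nearest point in `b` is within `threshold`.
    """
    matched = []              # distances between corresponding points
    corresponding_len = 0.0   # length of the parts of `a` with correspondences
    deviation = 0.0           # accumulated deviations of non-corresponding points
    previous = None           # previous point of `a`, if it was matched
    for p in a:
        d = min(math.dist(p, q) for q in b)
        if d <= threshold:
            matched.append(d)
            if previous is not None:
                corresponding_len += math.dist(previous, p)
            previous = p
        else:
            deviation += d
            previous = None
    if not matched:
        return math.inf       # nothing corresponds at all
    average = sum(matched) / len(matched)
    if deviation == 0.0:
        return average        # no penalty to add
    if corresponding_len == 0.0:
        return math.inf       # deviations exist but no corresponding part
    penalty_factor = deviation / corresponding_len
    penalty_distance = deviation * penalty_factor
    return average + penalty_distance
```

As written, the sketch is asymmetric (it walks only along `a`); a symmetric variant could combine the results of both directions.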

Density-based Clustering
Algorithm OPTICS
Ankerst, M., Breunig, M., Kriegel, H.-P., & Sander, J. (1999). OPTICS: Ordering Points to Identify the Clustering Structure. Proceedings of ACM SIGMOD International Conference on Management of Data (SIGMOD'99). Philadelphia: ACM, 49-60
The algorithm assigns a reachability distance to each object. The first object is chosen randomly. At each following step, the object with the smallest reachability distance to the previously chosen objects is selected.
Eps: a distance threshold
MinPts: minimum number of neighbours for a core object of a cluster
o: a core object (has at least MinPts neighbours within distance Eps)
r(p_1), r(p_2): reachability distances of p_1 and p_2
[Figure: reachability plot; axis: the order of choosing the objects; valleys labelled Cluster 1 and Cluster 2]
Property of the algorithm: separates clusters from “noise”, typical from peculiar
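
The ordering step can be sketched over a precomputed distance matrix, which is the form needed when distances come from an ad-hoc trajectory distance function. This is a compact O(n²) illustration rather than an optimized implementation. Two simplifications are assumptions of the sketch: the first object is taken by index instead of randomly (for reproducibility), and MinPts counts neighbours excluding the object itself, following the slide's wording. The cluster extraction by a single reachability cut-off is one simple option.

```python
import math


def optics(dist, eps, min_pts):
    """Return (order, reachability, core_distance) for a distance matrix."""
    n = len(dist)
    reach = [math.inf] * n
    core = [None] * n
    processed = [False] * n
    order = []
    while len(order) < n:
        pending = [i for i in range(n) if not processed[i]]
        # Seeds: unprocessed objects already reached from some core object.
        seeds = [i for i in pending if reach[i] < math.inf]
        p = min(seeds, key=lambda i: reach[i]) if seeds else pending[0]
        processed[p] = True
        order.append(p)
        neighbours = sorted(dist[p][q] for q in range(n)
                            if q != p and dist[p][q] <= eps)
        if len(neighbours) >= min_pts:
            # Core distance: distance to the MinPts-th nearest neighbour.
            core[p] = neighbours[min_pts - 1]
            for q in pending:
                if q != p and dist[p][q] <= eps:
                    reach[q] = min(reach[q], max(core[p], dist[p][q]))
    return order, reach, core


def extract_clusters(order, reach, core, eps_cut):
    """Cut the reachability plot at eps_cut; label -1 marks noise."""
    labels = [-1] * len(order)
    current = -1
    for p in order:
        if reach[p] > eps_cut:
            if core[p] is not None and core[p] <= eps_cut:
                current += 1          # a new valley starts here
                labels[p] = current
        else:
            labels[p] = current
    return labels
```

Plotting `reach[p]` for `p` in `order` gives the reachability plot: valleys are clusters, and objects that are neither reachable nor core at the chosen cut-off are noise.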

Clustering of trajectories

Spatial summarization of trajectory clusters
Details: N.Andrienko, G.Andrienko. Spatial Generalization and Aggregation of Massive Movement Data. IEEE Transactions on Visualization and Computer Graphics (TVCG), 2011, v.17 (2), pp.205-219. http://doi.ieeecomputersociety.org/10.1109/TVCG.2010.44

Space-time cube for cluster interpretation?

Time transformation in space-time cube
Transformations with respect to temporal cycles, which include
- bringing the times of the trajectories to the same year or season,
- the same month,
- week,
- day,
- hour
Transformations with respect to the individual lifelines of the trajectories, which include
- bringing the trajectories to a common start moment,
- a common end moment,
- common start and end moments
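
Some of these transformations can be illustrated with Python's standard `datetime` module. This is a sketch under the assumption that each trajectory carries a chronologically sorted list of `datetime` timestamps; the function names are hypothetical. The transformed values would replace absolute time on the vertical axis of the space-time cube.

```python
from datetime import datetime, timedelta


def to_daily_cycle(t):
    """Temporal cycle 'day': time elapsed since midnight of the same day."""
    return t - t.replace(hour=0, minute=0, second=0, microsecond=0)


def to_weekly_cycle(t):
    """Temporal cycle 'week': time elapsed since Monday 00:00 of the same week."""
    return timedelta(days=t.weekday()) + to_daily_cycle(t)


def to_common_start(times):
    """Individual lifeline: offsets from the trajectory's first moment."""
    return [t - times[0] for t in times]


def to_common_end(times):
    """Individual lifeline: offsets (zero or negative) from the last moment."""
    return [t - times[-1] for t in times]


def to_common_start_and_end(times):
    """Individual lifeline: rescale times to the interval [0, 1]."""
    span = (times[-1] - times[0]).total_seconds()
    return [(t - times[0]).total_seconds() / span if span else 0.0
            for t in times]
```

For example, two commutes recorded on different Wednesdays get the same daily-cycle and weekly-cycle times, so they overlap in the cube instead of lying on separate levels.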

Transformations with respect to temporal cycles: days

Transformations with respect to temporal cycles: weeks

Transformations with respect to individual lifelines

Progressive clustering
Combine functions by applying one to results of another
- For example, cluster trajectories by common starts and ends, continue with route similarity {& dynamics} for clusters of interest
The approach reduces computational cost by applying complex functions only to the results of cheap functions
Each step brings easily interpretable results; following steps refine the results and our knowledge
Details:
- S.Rinzivillo, D.Pedreschi, M.Nanni, F.Giannotti, N.Andrienko, G.Andrienko. Visually-driven analysis of movement data by progressive clustering. Information Visualization, 2008, v.7 (3/4), pp.225-239. http://dx.doi.org/10.1057/palgrave.ivs.9500183
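
The two-step idea can be sketched with deliberately simple stand-ins. In the source, each step is a clustering run with a different distance function; here, as labelled assumptions, step 1 buckets trajectories by the grid cells of their start and end points, and step 2 links members of one selected bucket whose pairwise distance is below a threshold. All names, the grid bucketing and the linking rule are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def group_by_start_end(trajs, cell):
    """Step 1 (cheap): bucket trajectories by grid cells of first and last points."""
    groups = defaultdict(list)
    for idx, tr in enumerate(trajs):
        (x0, y0), (x1, y1) = tr[0], tr[-1]
        key = (math.floor(x0 / cell), math.floor(y0 / cell),
               math.floor(x1 / cell), math.floor(y1 / cell))
        groups[key].append(idx)
    return dict(groups)


def refine_group(trajs, members, distance, threshold):
    """Step 2 (costly): split one group into sub-clusters.

    Members whose distance is <= threshold are linked; the connected
    components are returned. `distance` is evaluated only inside the group.
    """
    parent = {m: m for m in members}

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]   # path halving
            m = parent[m]
        return m

    for a, b in combinations(members, 2):
        if distance(trajs[a], trajs[b]) <= threshold:
            parent[find(a)] = find(b)
    clusters = defaultdict(list)
    for m in members:
        clusters[find(m)].append(m)
    return sorted(clusters.values())


def mean_point_distance(a, b):
    """Hypothetical stand-in for a costly function such as route similarity."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / min(len(a), len(b))
```

The saving comes from the number of costly evaluations: a group of k members needs k(k-1)/2 distance computations instead of n(n-1)/2 for the whole data set.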

Projection techniques
Principal components analysis (PCA); Self-organizing map (SOM) – projection onto a discrete space (regular grid)
- Require input in form of feature vectors
Multi-dimensional scaling (MDS); Sammon’s projection (a.k.a. Sammon’s mapping)
- Can be applied to a pre-defined distance matrix, which can be computed using an arbitrary distance function
- Suitable for complex objects requiring specific distance functions
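
A projection from a distance matrix can be sketched by minimizing Sammon's stress, E = (1/c) Σ_{i<j} (d*_ij − d_ij)² / d*_ij with c = Σ_{i<j} d*_ij, where d* are the given distances and d the distances in the 2-D projection. The optimizer below is plain gradient descent with step halving, swapped in for simplicity; it is not Sammon's original update rule. The sketch assumes all off-diagonal distances are positive.

```python
import math
import random


def sammon_stress(dist, pts):
    """Sammon's stress of a 2-D configuration against a distance matrix."""
    n = len(pts)
    total = 0.0
    error = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(pts[i], pts[j])
            total += dist[i][j]
            error += (dist[i][j] - d) ** 2 / dist[i][j]
    return error / total


def sammon_projection(dist, n_iter=200, seed=0):
    """Return (points, stress) after gradient descent from a random start."""
    rng = random.Random(seed)
    n = len(dist)
    pts = [[rng.random(), rng.random()] for _ in range(n)]
    c = sum(dist[i][j] for i in range(n) for j in range(i + 1, n))
    stress = sammon_stress(dist, pts)
    step = 1.0
    for _ in range(n_iter):
        grad = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = max(math.dist(pts[i], pts[j]), 1e-12)
                coef = -2.0 / c * (dist[i][j] - d) / (dist[i][j] * d)
                grad[i][0] += coef * (pts[i][0] - pts[j][0])
                grad[i][1] += coef * (pts[i][1] - pts[j][1])
        # Accept a step only if it lowers the stress; otherwise halve it.
        while step > 1e-12:
            trial = [[p[0] - step * g[0], p[1] - step * g[1]]
                     for p, g in zip(pts, grad)]
            trial_stress = sammon_stress(dist, trial)
            if trial_stress < stress:
                pts, stress = trial, trial_stress
                step *= 2.0
                break
            step *= 0.5
        else:
            break   # no improving step found: stop at a local minimum
    return pts, stress
```

Because only `dist` is consumed, the matrix may come from any trajectory distance function, such as route similarity. The result can depend on the random start, so several seeds may be worth comparing.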

Use of Sammon’s projection to explore clustering results

Density-based clusters of trajectories by route similarity

Exploring clustering results
Are the clusters well-separated?
Are they compact?
How different are the clusters?
How far are the clusters from the noise?
How sensitive are the results to the parameters?

Exploring selected clusters

Use of projection to define clusters

Tessellation of the projection area
Each polygon defines a cluster

Choosing cluster sizes
The user may interactively change the sizes (radii) of the clusters.
Further possible extension: interactive refinement or joining of selected clusters by direct manipulation in the projection display.
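
One way to picture clusters defined by a tessellation with an adjustable radius is sketched below. The slides do not state how the polygons are built, so everything here is an assumption: seeds are picked greedily so that no two seeds are closer than `max_radius`, and each projected point is labelled with its nearest seed, which is the same as membership in that seed's Voronoi cell. Function names are hypothetical.

```python
import math


def pick_seeds(points, max_radius):
    """Greedy seed selection: a point farther than max_radius from every
    existing seed becomes a new seed."""
    seeds = []
    for p in points:
        if all(math.dist(p, s) > max_radius for s in seeds):
            seeds.append(p)
    return seeds


def assign_to_cells(points, seeds):
    """Label each point with the index of its nearest seed (its Voronoi cell)."""
    return [min(range(len(seeds)), key=lambda k: math.dist(p, seeds[k]))
            for p in points]
```

Increasing `max_radius` yields fewer, larger cells and decreasing it yields more, smaller ones, which mimics the interactive change of cluster sizes.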

Assigning colours to clusters

Clusters of trajectories defined by means of projection

Progressive clustering with the use of projection
[Figure: panels Step 1 and Step 2]

Results of step 2