SeTraStream: Semantic-Aware Trajectory Construction Over Streaming Movement Data
Zhixian Yan *, Nikos Giatrakos †, Vangelis Katsikaros †, Nikos Pelekis †, Yannis Theodoridis †
* Distributed Information Systems Lab, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
† Information Management Lab, University of Piraeus, Piraeus, Greece
12th International Symposium on Spatial and Temporal Databases, Minneapolis, MN, USA, 26 August 2011
Outline
• Introduction: semantic trajectories… …over streaming movement data?
• Related Work
• SeTraStream Framework
– Big Picture
– Details of each module: Data Cleaning, Data Compression, Segmentation – Episode Identification
• Experimental Evaluation
• Conclusions
What is a semantic trajectory?
• Raw mobility data: a sequence of (x, y, t) points, e.g., GPS feeds
• Meaningful mobility tuples: <place, time_in, time_out, tags>
• Example: Home (breakfast) [~, 8am] → Sideway (walk) [8am, 9am] → Office (work) [9am, 6pm] → Road (bus) [6pm, 6:30pm] → Market (shopping) [6:30pm, 7:30pm] → Train (metro) [7:30pm, 8pm] → Home (relax) [8pm, ~]
• Semantic Trajectory: T = {e_first, …, e_last}
• Episode: e_i = (t_from, t_to, place, tag)
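The episode and trajectory definitions above can be written down as a small data structure. This is only an illustrative sketch: the class name `Episode`, the helper `is_contiguous`, and the use of `"HH:MM"` strings (with `None` for the open-ended `~` bounds) are assumptions made for this example, not part of the slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Episode:
    """One episode e_i = (t_from, t_to, place, tag)."""
    t_from: Optional[str]  # "HH:MM", or None for an open start ("~")
    t_to: Optional[str]    # "HH:MM", or None for an open end ("~")
    place: str
    tag: str


# The example day from the slide, as a semantic trajectory T = {e_first, ..., e_last}.
day: List[Episode] = [
    Episode(None, "08:00", "Home", "breakfast"),
    Episode("08:00", "09:00", "Sideway", "walk"),
    Episode("09:00", "18:00", "Office", "work"),
    Episode("18:00", "18:30", "Road", "bus"),
    Episode("18:30", "19:30", "Market", "shopping"),
    Episode("19:30", "20:00", "Train", "metro"),
    Episode("20:00", None, "Home", "relax"),
]


def is_contiguous(trajectory: List[Episode]) -> bool:
    """True if every episode starts exactly when the previous one ends."""
    return all(a.t_to == b.t_from for a, b in zip(trajectory, trajectory[1:]))
```

Here `is_contiguous(day)` holds, which is the property that lets a day of raw GPS points be summarized by seven tuples without gaps.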
Why semantic trajectories?
• Detection of homogeneous fractions of movement
• A trajectory is recreated as a sequence of episodes (stops/moves), e.g., home, shopping, move by bus, in train …
• Semantic data abstraction & compression (efficiency/effectiveness)
• Better mobility understanding & LBS
• Home-office trajectory example (road name, start time, in slide order): Ch. veilloud 08:50:26 | Rt. du Boi 08:54:46 | Walk | Rt. de Villar 08:57:24 | Tir Fédéra 08:58:41 | Metro M1 08:59:24 | Rt. de la Sorg 09:03:57 | Walk | Ch. du Barrag 09:04:42 | La Diagonal 09:05:24
• Figure: Raw GPS Points → Trajectory Notion of Segments → Semantic-Aware Trajectory; panels (a) Home-Office via Bike, (b) Home-Office via Bus
Why on streaming mobility data?
• Offline vs. real-time
– Offline: past trajectories
– Mobility streams: ongoing trajectories, efficient computation
• Real-life scenarios
– Traffic control scenarios: real-time placement & rearrangement of traffic wardens
– Modern navigation & social networking services, e.g. www.waze.com …
• Distributed setting (server side vs. client side)
– Antennas: local site vs. coordinator
– Moving objects: client vs. server side
– Status updates: batches
Offline Construction of Semantic Trajectories (ESWC ’10, EDBT ’11)
Layered pipeline, from original input to output:
• Original input: GPS feeds
• Data Preprocess Layer (outlier removal, kernel smoothing, compression) → cleansed GPS feeds
• Trajectory Identification Layer (raw GPS gap, time interval, spatial extent) → spatio-temporal trajectories
• Trajectory Structure Layer (velocity-based, density-based, orientation) → structured trajectory (segments S1 … S9)
• Semantic Annotation Layer (spatial join for regions, map-matching for lines, HMM for points) → output: semantic trajectory (e.g., HOME, SCHOOL, OFFICE, MARKET, FACTORY, CUSTOMER)
Related Work & Motivation
• Semantic trajectories (DKE ’08, ESWC ’10, EDBT ’11)
– High-level trajectory concepts like episodes (e.g., stops/moves), trajectory ontologies
– Offline training & tuning of parameters (particularly on raw movement features like velocity/direction/density)
– Parameter tuning is not efficient in real-time settings
• Streaming data processing
– Online mobility data compression (e.g., Honle @GIS ’10)
– Time series online segmentation (e.g., Keogh @ICDM ’01)
– Tilted time window specification (Giannotti ’02)
• Motivation: Semantic Trajectories + Online Algorithms
SeTraStream - Server Side
• Buffer of incoming batches of objects O_1 … O_N (arriving every τ) along the time axis T
• Per-batch steps: 1. Filter noisy data; 2. Compress batch; 3. Extract Movement Feature Vectors
• Segmentation around a candidate division point: left windows W_l^1, W_l^2, W_l^3, … vs. right window W_r (short-term change? long-term change?)
• Example episodes: e_1: walk, e_2: shopping
• Input: location stream instances plus complementary feature instances
<x,y,t> | Position in Lane | Distance to Headway Vehicle | Steering Wheel Activity
123.34, 121.21, 18:35:43 | 0.1m | 1m | π/36
… | … | … | …
120.34, 125.21, 18:36:59 | 0.05m | 3m | π/16
Online Cleaning (1)
• Two types of GPS errors
– systematic errors (outliers): removing
– random errors (e.g. ±15 meters): smoothing
• ONE LOOP
– build the kernel smoother
– calculate the residual
– calculate the outlier bound & the smooth bound
– filter the outlier or smooth the error
• Decision by residual, from 0 to ∞: keep, smooth, remove
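The keep / smooth / remove decision can be sketched as below. Several things here are assumptions rather than the slides' method: a Gaussian kernel over time, fixed bounds passed in as parameters (the slide computes them), and two passes over the batch instead of one loop, so that a removed outlier does not distort the smoothing of its neighbours. All function and parameter names are hypothetical.

```python
import math


def kernel_estimate(points, t, bandwidth):
    """Gaussian-kernel weighted mean position at time t over (x, y, t) points."""
    wsum = xsum = ysum = 0.0
    for x, y, ti in points:
        w = math.exp(-((t - ti) ** 2) / (2.0 * bandwidth ** 2))
        wsum += w
        xsum += w * x
        ysum += w * y
    return xsum / wsum, ysum / wsum


def residual(point, points, bandwidth):
    """Distance between a point and its kernel estimate, plus the estimate."""
    x, y, t = point
    sx, sy = kernel_estimate(points, t, bandwidth)
    return math.hypot(x - sx, y - sy), (sx, sy)


def clean_batch(points, bandwidth, smooth_bound, outlier_bound):
    # Pass 1: remove points whose residual exceeds the outlier bound.
    kept = [p for p in points
            if residual(p, points, bandwidth)[0] <= outlier_bound]
    # Pass 2: smooth points above the smooth bound, keep the rest unchanged.
    cleaned = []
    for p in kept:
        r, (sx, sy) = residual(p, kept, bandwidth)
        cleaned.append((sx, sy, p[2]) if r > smooth_bound else p)
    return cleaned


# Toy batch: motion along y = 0, one noisy point (t = 2), one outlier (t = 5).
batch = [(float(t), 0.0, t) for t in range(10)]
batch[2] = (2.0, 8.0, 2)
batch[5] = (5.0, 100.0, 5)
cleaned = clean_batch(batch, bandwidth=2.0, smooth_bound=5.0, outlier_bound=50.0)
```

In this toy batch the point at t = 5 is removed, the point at t = 2 is pulled back toward the line, and the remaining points are kept as they are.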
Online Cleaning (2)
SeTraStream - Compression
• Buffer of incoming batches of objects O_1 … O_N (arriving every τ); episodes e_1, e_2, …
• Per-batch steps: 1. Filter noisy data; 2. Compress batch; 3. Extract Movement Feature Vectors
Online Compression (1)
• Why compression?
– Data continuously growing
– Remove “redundant” data points
– Reduce transmission cost (local?)
– Fast computation, application performance
• SED (Synchronized Euclidean Distance): the distance sed between a point Q_p^ls = (x_p, y_p, t_p) and its time-synchronized position Q'_p^ls on the segment joining Q_{p-1}^ls and Q_{p+1}^ls
• Figure: points Q_1 … Q_7, vectors u_1, u_2, u_3, threshold ε
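A minimal sketch of the Synchronized Euclidean Distance follows, using its usual definition: interpolate where the object would be at time t_p if it moved at constant velocity between the two neighbouring points, then measure the distance to where it actually was. The greedy `compress` filter is a simple stand-in chosen for illustration, not the compression algorithm of the slides; the function names are hypothetical.

```python
import math


def sed(prev, point, nxt):
    """SED of `point` relative to the segment prev -> nxt; points are (x, y, t)."""
    (x1, y1, t1), (xp, yp, tp), (x2, y2, t2) = prev, point, nxt
    ratio = (tp - t1) / (t2 - t1)   # fraction of the time interval elapsed
    xs = x1 + ratio * (x2 - x1)     # time-synchronized position on the segment
    ys = y1 + ratio * (y2 - y1)
    return math.hypot(xp - xs, yp - ys)


def compress(points, epsilon):
    """Greedy filter: drop interior points whose SED is at most epsilon."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        # Compare against the last kept point and the next raw point.
        if sed(kept[-1], points[i], points[i + 1]) > epsilon:
            kept.append(points[i])
    kept.append(points[-1])
    return kept
```

For example, `sed((0, 0, 0), (1, 0, 3), (4, 0, 4))` is 2.0 even though the middle point lies exactly on the line: at t = 3 a constant-speed object would already be at x = 3. This temporal sensitivity is what separates SED from a plain perpendicular distance.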
Online Compression (2)
• SED (Synchronized Euclidean Distance): relative spatio-temporal significance
• SCC (Synchronized Correlation Coefficient): relative significance of the complementary features
• Normalization: (formula not preserved)
• Simple combination: (formula not preserved)
SeTraStream - Feature Extraction
• Buffer of incoming batches of objects O_1 … O_N (arriving every τ); episodes e_1, e_2, …
• Per-batch steps: 1. Filter noisy data; 2. Compress batch; 3. Extract Movement Feature Vectors
Movement Feature Vectors (MFVs)
• Input instances:
<x,y,t> | Position in Lane | Distance to Headway Vehicle | Steering Wheel Activity
123.34, 121.21, 18:35:43 | 0.1m | 1m | π/36
… | … | … | …
120.34, 125.21, 18:36:59 | 0.05m | 3m | π/16
• Movement features derived per instance:
speed | direction | acceleration
35 m/s | 76° | 40 m/s²
… | … | …
60 m/s | 85° | 55 m/s²
• MFVs in a batch make up a matrix (one row per feature, one column per instance):
35 … 60
76 … 85
40 … 55
0.1 … 0.05
1 … 3
π/36 … π/16
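One way to derive speed, direction and acceleration from consecutive (x, y, t) points and stack them into the d × n batch matrix is sketched below. The finite-difference formulas, the choice of 0 for the first acceleration value, and the function names are assumptions for illustration; the slides do not specify how the features are computed. Complementary features (lane position, headway distance, steering activity) would simply be appended as further rows.

```python
import math


def movement_features(points):
    """Per-step (speed, direction in degrees, acceleration) from (x, y, t) points."""
    features = []
    prev_speed = None
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        speed = math.hypot(x1 - x0, y1 - y0) / dt
        direction = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360.0
        # No previous speed is known for the first step, so use 0 there.
        accel = 0.0 if prev_speed is None else (speed - prev_speed) / dt
        features.append((speed, direction, accel))
        prev_speed = speed
    return features


def to_matrix(feature_vectors):
    """Stack MFVs into a d x n matrix: one row per feature, one column per step."""
    return [list(row) for row in zip(*feature_vectors)]


batch_matrix = to_matrix(movement_features([(0, 0, 0), (3, 4, 1), (3, 14, 2)]))
```

With these three points the speed row is `[5.0, 10.0]` and the acceleration row is `[0.0, 5.0]`; the second step heads due north, i.e. about 90°.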
SeTraStream - Segmentation
• Buffer of incoming batches of objects O_1 … O_N (arriving every τ); episodes e_1, e_2, …
• Candidate division point between the left window W_l^1 and the right window W_r
• Test: similar movement pattern, judged against the threshold σ_thres? (figure branch: If YES)
• Which types of similarity measurement?
Movement Similarity
• Existing trajectory computing:
– Offline, thresholds on movement features like velocity/direction/density
• Online solution:
– Similarity on movement patterns (not individual attributes)
– Threshold on movement pattern alteration
• RV-coefficient:
– A multivariate correlation coefficient, focusing on “trend” similarity, NOT on absolute differences
– Measures the relative resemblance of two sequences of vectors
– Dimension independent, since W_l W_l' and W_r W_r' both have dimension d × d, where d is the number of features
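The RV-coefficient can be sketched in plain Python using its standard definition, RV = tr(S_l S_r) / sqrt(tr(S_l²) · tr(S_r²)) with S = W W'. The assumptions here are that each window is a d × n matrix (one row per feature) and that every row is mean-centered first; the guard for a zero denominator is also a choice made for this example. Because S is d × d regardless of n, the two windows may hold different numbers of instances.

```python
import math


def center_rows(W):
    """Subtract each feature's mean so that only the trend remains."""
    centered = []
    for row in W:
        mean = sum(row) / len(row)
        centered.append([v - mean for v in row])
    return centered


def gram(W):
    """W W' for a d x n matrix W: a d x d matrix of feature cross-products."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in W] for r1 in W]


def trace_product(A, B):
    """tr(A B) for two d x d matrices."""
    d = len(A)
    return sum(A[i][j] * B[j][i] for i in range(d) for j in range(d))


def rv_coefficient(W_l, W_r):
    S_l = gram(center_rows(W_l))
    S_r = gram(center_rows(W_r))
    denom = math.sqrt(trace_product(S_l, S_l) * trace_product(S_r, S_r))
    # A window with no variation carries no trend; treat it as dissimilar.
    return trace_product(S_l, S_r) / denom if denom > 0 else 0.0
```

Scaling a window leaves the value at 1 (`rv_coefficient(W, 10 * W)` in matrix terms), which illustrates the "trend, not absolute differences" point. Two windows whose features move together in one and in opposite directions in the other, such as `[[1, 2, 3], [1, 2, 3]]` and `[[1, 2, 3], [3, 2, 1]]`, score 0.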
Short-term Movement Change
• Buffer of incoming batches of objects O_1 … O_N (arriving every τ); windows W_l^1 and W_r on either side of the division point
• As soon as we find an episode, we tag it
• Division point: end of e_3, start of e_4 (after e_1, e_2, e_3)
• Tagging episodes: training offline, tagging online
Long-term Movement Change
• Buffer of incoming batches of objects O_1 … O_N (arriving every τ); episodes e_1, e_2, …
• Candidate division point between the left windows W_l^1, W_l^2, W_l^3, … and the right window W_r
• Test: similar pattern, judged against the threshold σ_thres? (figure branch: If NO)
• Similarity(W1, W2), e.g. RV-coefficient(W1, W2)
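The short-term and long-term checks can be read as one loop over growing left windows. The sketch below is an interpretation of the figure, not the authors' algorithm: it assumes W_l^k is the concatenation of the k most recent batches before the candidate point, and that a change is declared when the similarity to W_r falls below σ. The names `detect_change`, `max_lookback`, and the toy similarity are hypothetical; in practice an RV-coefficient would be passed in as `similarity`.

```python
def concat_batches(batches):
    """Join d x n batch matrices column-wise into one d x (sum of n) window."""
    d = len(batches[0])
    return [[v for b in batches for v in b[i]] for i in range(d)]


def detect_change(left_batches, right_batch, similarity, sigma, max_lookback=3):
    """Return the look-back k at which a change is detected, or None.

    k == 1 corresponds to a short-term change (W_l^1 vs. W_r);
    k > 1 corresponds to a long-term, gradual change (W_l^k vs. W_r).
    """
    for k in range(1, min(max_lookback, len(left_batches)) + 1):
        window = concat_batches(left_batches[-k:])  # assumed meaning of W_l^k
        if similarity(window, right_batch) < sigma:
            return k
    return None


def toy_similarity(a, b):
    """Stand-in for the RV-coefficient: compares the means of the first feature."""
    mean_a = sum(a[0]) / len(a[0])
    mean_b = sum(b[0]) / len(b[0])
    return 1.0 - abs(mean_a - mean_b) / 10.0


# A slow drift: each batch resembles its neighbour, but not the distant past.
drift = [[[1.0, 1.0]], [[2.0, 2.0]], [[3.0, 3.0]]]
```

With `sigma=0.82`, `detect_change(drift, [[4.0, 4.0]], toy_similarity, 0.82)` finds no short-term change but reports one at look-back 3, which is the kind of gradual alteration a comparison of adjacent batches alone would miss.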
Experiment - Dataset
• GPS data from Nokia Research Center @ Lausanne
• User tags: home_cook, office_work, stand, jog, walk, bus, …
Experiment - Compression
Experiment - Segmentation (plots: Different batch sizes; Different RV threshold)
Experiment - Latency