Learning Urban Community Structures: A Collective Embedding Perspective with Periodic Spatial-temporal Mobility Graphs
Pengyang Wang, Yanjie Fu, Jiawei Zhang, Xiaolin Li, Dan Lin
Outline
- Background and Motivation
- Definition and Problem Statement
- Methodology
- Application
- Evaluation
- Conclusion
Background and Motivation
- Urban life is becoming more diverse and vibrant. (Figure: an urban community)
Why do we study urban communities?
- Spatial imbalance: vibrancy differences between communities.
Challenges & Insights
- Challenge I – Graph construction: how to unify and represent the POIs and human periodic mobility records as a set of mobility graphs?
- Insight I: a set of periodic spatial-temporal mobility graphs.
Challenges & Insights
- Challenge II – Collective embedding: how to collectively learn the embeddings of POIs from multiple periodic mobility graphs?
- Insight II: a collective deep autoencoder.
Challenges & Insights
- Challenge III – Embedding aggregation: how to align and aggregate POI embeddings for community structure representation learning?
- Insight III: an unsupervised graph-based weighting method.
Outline
- Background and Motivation
- Definition and Problem Statement
- Methodology
- Application
- Evaluation
- Conclusion
Definition I
- Urban community: a residential complex together with its surrounding neighborhood area (radius = 1 km).
Definition II
- Mobility Graph
Definition III
- Periodic Mobility Graphs
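A periodic mobility graph set can be pictured as one POI-to-POI transition graph per day of the week. A minimal sketch of that construction, assuming trips have already been mapped to origin/destination POI ids and a weekday index (the function name and trip format are illustrative, not from the paper):

```python
import numpy as np

def build_periodic_graphs(trips, num_pois):
    """Build one POI-to-POI transition matrix per weekday (Mon=0 .. Sun=6).

    `trips` is a list of (origin_poi, dest_poi, weekday) tuples; each trip
    increments the corresponding edge weight in that weekday's graph.
    """
    graphs = np.zeros((7, num_pois, num_pois))
    for origin_poi, dest_poi, weekday in trips:
        graphs[weekday][origin_poi][dest_poi] += 1.0
    return graphs

# toy trips: two Monday trips POI 0 -> POI 1, one Saturday trip POI 1 -> POI 2
trips = [(0, 1, 0), (0, 1, 0), (1, 2, 5)]
g = build_periodic_graphs(trips, num_pois=3)
```

Grouping by day of week (rather than by calendar day) is what makes the graphs periodic: the same weekday across many weeks accumulates into one graph.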
Problem Statement
- Given:
  - Residential communities (locations, POIs)
  - Human mobility (e.g., taxi GPS traces)
- Objective:
  - Learn representations of the static spatial configurations
  - Learn representations of the dynamic human mobility connectivity of POIs in the community
- Core tasks:
  - Construct the periodic mobility graph set for a community
  - Collectively embed the POIs
  - Align and aggregate the POI embeddings into a community embedding
Framework Overview
Outline
- Background and Motivation
- Problem Statement
- Methodology
- Application
- Evaluation
- Conclusion
Methodology
- Periodic Mobility Graph Construction
- Collective POI Embedding
- Aligning and Aggregating POI Embeddings to Community Embeddings
Periodic Mobility Graph Construction
- Propagate visit probability: the closer a POI is to the trip destination, the more likely it is to be visited.
[Figure: visit probability vs. distance to destination (m); probability decays from about 0.8 near 0 m toward 0 at 700 m]
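Since a taxi drop-off point rarely coincides with a single POI, the visit probability of one trip can be spread over nearby candidate POIs with a distance decay, matching the decreasing curve on the slide. A minimal sketch, assuming an exponential decay with a hypothetical 200 m scale (the slide does not specify the decay function):

```python
import math

def propagate_visit_probability(dest_xy, poi_xys, scale=200.0):
    """Distribute one trip's visit probability over candidate POIs.

    Closer POIs receive higher probability: each POI gets weight
    exp(-distance / scale), and the weights are normalized to sum to 1.
    `scale` (meters) is an assumed decay length, not a value from the paper.
    """
    weights = []
    for (x, y) in poi_xys:
        d = math.hypot(x - dest_xy[0], y - dest_xy[1])
        weights.append(math.exp(-d / scale))
    total = sum(weights)
    return [w / total for w in weights]

# a POI 50 m away should absorb more probability than one 400 m away
probs = propagate_visit_probability((0.0, 0.0), [(50.0, 0.0), (400.0, 0.0)])
```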
Collective POI Embedding
Collective POI Embedding

Encoder:
$$
\begin{aligned}
y^{(k),1}_{i,t} &= \sigma\big(W^{(k),1} p^{(k)}_{i,t} + b^{(k),1}\big), &&\forall t \in \{1,2,\cdots,7\},\\
y^{(k),r}_{i,t} &= \sigma\big(W^{(k),r} y^{(k),r-1}_{i,t} + b^{(k),r}\big), &&\forall r \in \{2,3,\cdots,o\},\\
y^{(k),o+1}_{i} &= \sigma\Big(\textstyle\sum_{t} W^{(k),o+1} y^{(k),o}_{i,t} + b^{(k),o+1}\Big),\\
z^{(k)}_{i} &= \sigma\big(W^{(k),o+2} y^{(k),o+1}_{i} + b^{(k),o+2}\big),
\end{aligned}
$$

Decoder:
$$
\begin{aligned}
\hat{y}^{(k),o+1}_{i} &= \sigma\big(\hat{W}^{(k),o+2} z^{(k)}_{i} + \hat{b}^{(k),o+2}\big),\\
\hat{y}^{(k),o}_{i,t} &= \sigma\big(\hat{W}^{(k),o+1} \hat{y}^{(k),o+1}_{i} + \hat{b}^{(k),o+1}\big),\\
\hat{y}^{(k),r-1}_{i,t} &= \sigma\big(\hat{W}^{(k),r} \hat{y}^{(k),r}_{i,t} + \hat{b}^{(k),r}\big), &&\forall r \in \{2,3,\cdots,o\},\\
\hat{p}^{(k)}_{i,t} &= \sigma\big(\hat{W}^{(k),1} \hat{y}^{(k),1}_{i,t} + \hat{b}^{(k),1}\big),
\end{aligned}
$$

Loss function:
$$
L^{(k)} = \sum_{t \in \{1,2,\ldots,7\}} \sum_{i} \Big\| \big(p^{(k)}_{i,t} - \hat{p}^{(k)}_{i,t}\big) \odot v^{(k)}_{i,t} \Big\|_2^2
$$
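The equations above describe seven per-day encoder branches fused by summation into one shared latent embedding, with a mirrored decoder. A minimal NumPy forward-pass sketch of that structure, simplified to one encoder layer per branch and with weights shared across the seven daily branches (an assumption of this sketch, not necessarily the paper's parameterization):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

d_in, d_hid, d_z, days = 8, 4, 2, 7

# parameters (toy sizes; shared across the 7 daily branches for simplicity)
W1, b1 = rng.normal(size=(d_hid, d_in)), np.zeros(d_hid)
Wf, bf = rng.normal(size=(d_hid, d_hid)), np.zeros(d_hid)   # fusion layer
Wz, bz = rng.normal(size=(d_z, d_hid)), np.zeros(d_z)       # bottleneck
Wz_, bz_ = rng.normal(size=(d_hid, d_z)), np.zeros(d_hid)   # decoder mirrors
Wf_, bf_ = rng.normal(size=(d_hid, d_hid)), np.zeros(d_hid)
W1_, b1_ = rng.normal(size=(d_in, d_hid)), np.zeros(d_in)

def collective_autoencoder(p):
    """p: (7, d_in) -- one POI feature vector per day of the week."""
    y = sigmoid(p @ W1.T + b1)                   # per-day encoding
    fused = sigmoid(y.sum(axis=0) @ Wf.T + bf)   # sum the 7 branches, fuse
    z = sigmoid(fused @ Wz.T + bz)               # shared latent embedding
    f_hat = sigmoid(z @ Wz_.T + bz_)             # decode back through mirror
    y_hat = sigmoid(f_hat @ Wf_.T + bf_)
    p_hat = sigmoid(y_hat @ W1_.T + b1_)         # reconstruction per day
    loss = np.sum((p - p_hat) ** 2)              # squared reconstruction loss
    return z, p_hat, loss

z, p_hat, loss = collective_autoencoder(rng.random((days, d_in)))
```

The key design choice mirrored here is the fusion layer: the seven daily branches are summed before the bottleneck, so the latent vector z encodes all periodic graphs of one POI collectively rather than one embedding per day.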
Aligning and Aggregating POI Embeddings to Community Embeddings
- Graph-based weighting method
[Figure: POI similarity graph -- five POIs connected by pairwise similarity edges sim_{i,j}]
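The POI similarity graph in the figure can be built from the learned POI embeddings. A minimal sketch, assuming cosine similarity as the pairwise measure (the slide does not name the similarity function):

```python
import numpy as np

def poi_similarity_graph(G):
    """Pairwise cosine similarity between POI embedding rows.

    G: (n_pois, n_dims) embedding matrix. Returns an (n_pois, n_pois)
    symmetric similarity matrix with self-loops zeroed out.
    """
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    X = G / np.clip(norms, 1e-12, None)   # row-normalize, avoid divide-by-zero
    sim = X @ X.T
    np.fill_diagonal(sim, 0.0)            # drop self-similarity edges
    return sim

G = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sim = poi_similarity_graph(G)
```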
Graph-based Weighting Method
- Weight calculation:
$$
w^{(k)}_l = \frac{\sum_{i \in c_k} \sum_{j \in c_k} sim_{i,j} \times \big|\tilde{G}^{(k)}[i,l] - \tilde{G}^{(k)}[j,l]\big|}{M}
$$
- Aggregation:
$$
\hat{G}^{(k)}[s,l] = \frac{\sum_{i \in \Phi_s} \tilde{G}^{(k)}[i,l] \times w^{(k)}_l}{p}
$$
- Intuition: if the l-th dimension of the latent feature is meaningful, then when POIs $p_i$ and $p_j$ are very similar, their difference $|\tilde{G}^{(k)}[i,l] - \tilde{G}^{(k)}[j,l]|$ on that dimension should be very small. Conversely, if the l-th dimension is not meaningful, $|\tilde{G}^{(k)}[i,l] - \tilde{G}^{(k)}[j,l]|$ increases, and when $p_i$ and $p_j$ are very similar, $sim_{i,j}$ further penalizes that difference.
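The weighting and aggregation steps can be sketched directly from the two formulas above. A minimal version, assuming M is the number of POI pairs in the community and that aggregation averages the weighted POI embeddings (both are reasonable readings of the slide, not confirmed details):

```python
import numpy as np

def dimension_weights(G, sim):
    """w_l = sum_{i,j} sim[i,j] * |G[i,l] - G[j,l]| / M.

    G: (n, d) POI embeddings for one community; sim: (n, n) similarity
    matrix. M is taken here as the number of (i, j) pairs, n * n.
    """
    n, d = G.shape
    w = np.zeros(d)
    for l in range(d):
        diff = np.abs(G[:, l][:, None] - G[:, l][None, :])
        w[l] = (sim * diff).sum() / (n * n)
    return w

def aggregate(G, w):
    """Community embedding: average of the dimension-weighted POI embeddings."""
    return (G * w).mean(axis=0)

# dimension 1 varies much more across similar POIs than dimension 0,
# so it receives the larger weight under this formula
G = np.array([[0.1, 0.9], [0.2, 0.1], [0.15, 0.5]])
sim = np.ones((3, 3))
w = dimension_weights(G, sim)
community = aggregate(G, w)
```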
Outline
- Background and Motivation
- Definition and Problem Statement
- Methodology
- Application
- Evaluation
- Conclusion
Application I
- Predicting Willingness to Pay (WTP):
$$
r = \frac{P_f - P_i}{P_i}
$$
where $P_f$ is the final price and $P_i$ is the initial price.
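The WTP target is simply the relative price change, which can be computed as:

```python
def wtp_ratio(initial_price, final_price):
    """Relative price change r = (P_f - P_i) / P_i."""
    return (final_price - initial_price) / initial_price

# a property listed at 100 that sells at 125 has r = 0.25
r = wtp_ratio(100.0, 125.0)  # -> 0.25
```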
Application II
- Spotting vibrant urban communities:
$$
u_k = \frac{2 \times freq(k) \times div(k)}{freq(k) + div(k)}
$$
where $freq(k)$ is the density of consumer activities, $div(k)$ is the diversity of consumer activities, and $u_k$ is the urban vibrancy value (their harmonic mean).
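The vibrancy score is the harmonic mean of activity density and diversity, F-measure style, so a community must score well on both to be vibrant (assuming the reconstructed denominator freq(k) + div(k), which the extraction garbled):

```python
def vibrancy(freq, div):
    """Harmonic mean of consumer-activity density and diversity."""
    return 2.0 * freq * div / (freq + div)

# high density but low diversity is pulled down toward the weaker term
u = vibrancy(0.8, 0.4)  # -> 0.5333...
```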
Outline
- Background and Motivation
- Definition and Problem Statement
- Methodology
- Application
- Evaluation
- Conclusion and Future Work
Evaluation
- Data description: data collected from Beijing.
The Application of WTP Prediction
- Baselines:
  - Explicit Features (EF): (i) number of POIs per category; (ii) average commute distance; (iii) average commute speed; (iv) average commute time; (v) number of mobility records; (vi) average distance between POIs.
  - Latent Features (LF): latent features learned by the proposed collective embedding method.
  - ELF: the combination of EF and LF.
  - Variation of Step 1 (V-1): distance-based matching of the mobility records.
  - Variation of Step 2 (V-2): computing each POI embedding as an average of the daily embeddings.
  - Variation of Step 3 (V-3): averaging over the POI embeddings.
- Evaluation metric: Root-Mean-Square Error (RMSE).
The Application of WTP Prediction
- Results
Spotting Vibrant Urban Communities
- Baselines (learning to rank):
  1. MART: a boosted tree model; a linear combination of the outputs of a set of regression trees.
  2. RankBoost (RB): a boosted pairwise ranking method that trains multiple weak rankers and combines their outputs into a final ranking.
  3. LambdaMART (LM): the boosted-tree version of LambdaRank.
  4. ListNet (LN): a listwise ranking model with the permutation top-k ranking likelihood as its objective function.
  5. RankNet (RN): uses a neural network to model the underlying probabilistic cost function.
- Feature sets: (1) explicit features; (2) latent features; (3) explicit & latent features.
Evaluation
- Evaluation metrics:
  - Root-Mean-Square Error (RMSE).
  - Normalized Discounted Cumulative Gain (NDCG@N): evaluates ranking performance at top N.
  - Kendall's Tau coefficient (Tau): measures the overall ranking accuracy.
  - F-measure@N: measures ranking precision and recall at top N, where "high-vibrancy" means rating > 3 and "low-vibrancy" means rating < 3.
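Of the metrics listed, NDCG@N is the least standard to compute by hand. A minimal sketch of the usual definition (graded relevance with a log2 position discount, normalized by the ideal ranking):

```python
import math

def ndcg_at_n(relevances, n):
    """NDCG@N for a ranked list of graded relevance scores.

    `relevances` is ordered by the system's ranking; the ideal DCG uses
    the same scores sorted in descending order.
    """
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    ideal = dcg(sorted(relevances, reverse=True)[:n])
    return dcg(relevances[:n]) / ideal if ideal > 0 else 0.0

# a perfectly ordered list scores 1.0; swaps near the top cost the most
score = ndcg_at_n([3, 2, 3, 0, 1], n=3)
```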
Overall Performance
[Figure: NDCG@N and F-measure@N (N = 5, 10, 15, 20) and Kendall's Tau for each feature set (ELF, LF, EF, V-1, V-2, V-3) combined with MART, RankNet (RN), and RankBoost (RB)]
Comparison with Representation Learning Algorithms
[Figure: NDCG@N (N = 5, 10, 15, 20) of our model versus RBM, NMF, and Skip-gram]
Investigation of Community Structure Properties
- Community connectivities.
Investigation of Community Structure Properties
- The learned representation of the community structure.
[Figure: visualization of the learned structure representations of two similar communities (Community 1 and Community 2)]
Outline
- Background and Motivation
- Definition and Problem Statement
- Methodology
- Application
- Evaluation
- Conclusion