Toward Automated Pattern Discovery: Deep Representation Learning with Spatial-Temporal-Networked Data — Collective, Dynamic, and Structured Analysis
Yanjie Fu
Outline
- Background and Motivation
- Collective Representation Learning
- Dynamic Representation Learning
- Structured Representation Learning
- Conclusions and Future Work
Human-Social-Technological Systems
The physical world and the cyber world are connected through IoT, GPS, wireless sensors, and mobile apps.
Human Activities in Human-Social-Technological Systems
- Spatial, Temporal, and Networked (STN) data can be:
  - Spatial: Points of Interest (POIs), blocks, zones, regions
  - Spatiotemporal: taxi trajectories, bus trips, bike traces
  - Spatiotemporal-networked: geo-tagged Twitter posts, power grid netload
- collected from a variety of sources:
  - Devices: phones, Wi-Fi access points, network stations, RFID
  - Vehicles: bikes, taxicabs, buses, subways, light rail
  - Location-based services: geo-tweets (Facebook, Twitter), geo-tagged photos (Flickr), check-ins (Foursquare, Yelp)
(Examples: bus traces, taxicab GPS traces, mobile check-ins, phone traces.)
Together, these data represent the spatial, temporal, social, and semantic contexts of dynamic human/system behaviors within and across regions.
Important Applications
- Solar analytics for user profiling and intelligent energy saving
- Recommendation systems
- Transportation systems
- Personalized and intelligent education
- Smart health care
- City governance and emergency management
Unprecedented and Unique Complexity
- Spatiotemporally non-i.i.d.:
  - Spatial autocorrelation
  - Spatial heterogeneity
  - Sequential asymmetric patterns
  - Temporal periodicity and dependency
Unprecedented and Unique Complexity
- Networked over time
  - Collectively related
- Heterogeneous
  - Multi-source
  - Multi-view
  - Multi-modality
- Semantically rich
  - Trajectory semantics
  - User semantics
  - Event semantics
  - Region semantics
Technical Pains in Pattern Discovery (1)
(Classic pipeline: input → pattern/feature extraction → classic machine learning → classification/clustering → output, e.g., car vs. not car.)
- Feature identification and quantification
  - Traditional method: find domain experts to hand-craft features
  - Can we automate feature/pattern extraction?
Technical Pains in Pattern Discovery (2)
(Same classic pipeline as above.)
- Multi-source unbalanced data fusion
  - Traditional method: extract features, weigh features, and combine them with a weighted sum
  - Can we automatically extract features from multi-source unbalanced data?
Technical Pains in Pattern Discovery (3)
(Same classic pipeline as above.)
- Field data and real-world systems usually lack benchmark labels (i.e., y, responses, targets)
  - Example: netload in power grids, where behind-the-meter gas-generated and solar-generated electricity are unknown
  - Can we learn features without labels (unsupervised)?
Deep Learning Can Help
- Task-specific (end-to-end) deep learning: feature extraction + classification/clustering in one model. It automates feature learning from multi-source data, but still suffers from a lack of labels.
- Generic deep learning: unsupervised pattern (feature/representation) learning whose output feeds downstream classification/clustering. It works even when labels are lacking.
Technical Pains in Pattern Discovery (4)
(Same classic pipeline as above.)
- Classic algorithms are not directly applicable to spatiotemporal networked data
  - Traditional method: revise classic algorithms to incorporate the regularities of spatiotemporal networked data
    - Regression + spatial properties = spatial autoregression methods
    - Clustering + spatial properties = spatial co-location methods
  - Can we learn features while maintaining the regularities of spatiotemporal networked data?
Data Regularity-aware Unsupervised Representation Learning
- Motivation: human and system behaviors have spatiotemporal and social regularities, so representation learning should be made aware of the regularities of spatiotemporal networked data.
- Generic deep learning addresses: lack of labels (unsupervised), automated feature learning, and feature learning from multi-source data.
- Data regularities to preserve:
  - Multi-source, multi-view, multi-modality
  - Spatial autocorrelation (peer effects)
  - Spatial heterogeneity (clustering)
  - Temporal dependencies (current vs. past)
  - Periodical patterns
  - Sequential asymmetric transitions
  - Spatial hierarchy (hierarchical clustering)
  - Hidden semantics
  - Spatial locality
  - Global and sub-structural patterns in behavioral graphs
The Overview of the Talk
Automated feature learning from spatial-temporal-networked data:
- Collective learning: collective representation learning with multi-view data
- Dynamic learning: dynamic representation learning with stream data
- Structured learning: structured representation learning with global and sub-structure preservation
Outline
- Background and Motivation
- Deep Collective Representation Learning
- Deep Dynamic Representation Learning
- Deep Structured Representation Learning
- Conclusion and Future Work
The Rise of Vibrant Communities
- Consumer City theory: Edward L. Glaeser (2001), Harvard University
- More by Nathan Schiff (2014), University of British Columbia; Victor Couture (2014), UC Berkeley; Yan Song (2014), UNC Chapel Hill
- Spatial characteristics: walkable, dense, compact, diverse, accessible, connected, mixed-use, etc.
- Socio-economic characteristics: willingness to pay, intensive social interactions, attracting talented workers and cutting-edge firms, etc.
Supported by NSF CISE pre-CAREER award (III-1755946).
What are the underlying driving forces of a vibrant community?
Measuring Community Vibrancy
- Mobile check-in data: urban vibrancy is reflected by the frequency and diversity of user activities (e.g., shopping, transport, dining, travel, lodging).
- Frequency and diversity of mobile check-ins:
  - Frequency: $fre = \#(checkin)$, the total number of check-ins in a community
  - Diversity: $div = -\sum_{type} \frac{\#(checkin,\,type)}{\#(checkin)} \log \frac{\#(checkin,\,type)}{\#(checkin)}$, where $type$ denotes the activity type of mobile users
- Fused vibrancy score:
  - $Vibrancy = (1+\beta^2)\,\dfrac{fre \cdot div}{\beta^2 \cdot fre + div}$
  - $\beta$ controls the relative weights of $fre$ and $div$
  - The resulting scores are power-law distributed: a few communities are highly vibrant while most are only somewhat vibrant
(Figure: community rankings by vibrancy score.)
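To make the scoring concrete, here is a minimal Python sketch of the frequency, diversity (entropy), and fused vibrancy computations defined above. The function name `vibrancy_score` and the input format are illustrative assumptions, not part of the original work.

```python
import math
from collections import Counter

def vibrancy_score(checkin_types, beta=1.0):
    """Fused vibrancy score of one community from its mobile check-ins.

    checkin_types: list of activity-type labels, one per check-in
                   (e.g., ["shopping", "dining", "dining", ...]).
    beta: controls the relative weights of frequency and diversity.
    """
    fre = len(checkin_types)  # frequency = total number of check-ins
    if fre == 0:
        return 0.0
    counts = Counter(checkin_types)
    # diversity = entropy of the activity-type distribution
    div = -sum((c / fre) * math.log(c / fre) for c in counts.values())
    if div == 0:
        return 0.0  # all check-ins are of a single activity type
    # F-measure-style fusion of frequency and diversity
    return (1 + beta ** 2) * fre * div / (beta ** 2 * fre + div)

# Frequent but homogeneous check-ins score 0; mixed activities score higher.
print(vibrancy_score(["shopping"] * 100))                      # 0.0
print(vibrancy_score(["shopping", "dining", "travel"] * 20))   # ~2.16
```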
Spatial Imbalance of Urban Community Vibrancy
Motivating Application: How to Quantify Spatial Configurations and Social Interactions
Urban Community = Spatial Configuration (static element) + Social Interactions (dynamic element)
From Regions to Graphs: Spatial Regions as Human Mobility Graphs
- POIs → nodes
- Human mobility connectivity between two POIs → edge weights
- Edge weights are asymmetric (the graph is directed)
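A minimal sketch of this construction in Python is shown below; the function name `build_mobility_graph` and the trip-record format are illustrative assumptions, not part of the original work.

```python
from collections import defaultdict

def build_mobility_graph(trips):
    """Build a directed, weighted POI-to-POI mobility graph for one region.

    trips: iterable of (origin_poi, destination_poi) pairs extracted from
           taxi/bus/bike traces or check-in sequences.
    Returns a dict-of-dicts adjacency structure: weights[u][v] = trip count u -> v.
    Weights are asymmetric: weights[u][v] need not equal weights[v][u].
    """
    weights = defaultdict(lambda: defaultdict(int))
    for origin, destination in trips:
        weights[origin][destination] += 1
    return weights

# Example: three trips among POIs within a region
graph = build_mobility_graph([("mall", "station"), ("station", "mall"), ("mall", "station")])
print(graph["mall"]["station"])   # 2
print(graph["station"]["mall"])   # 1  (asymmetric)
```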
Periodicity of Human Mobility
- Different days and hours → different periodic mobility patterns → different graph structures
Collective Representation Learning with Multi-view Graphs
- Input: spatial objects (e.g., regions), each described by multiple mobility graphs (one per view)
- Output: a feature vector representation per region, i.e., f(region, multi-view graphs) = embedding vector
- Constraint: the multi-view graphs are collaboratively related
Solving Single-Graph Input
- The encoding-decoding representation learning paradigm
  - Encoder: compress a graph into a latent feature vector
  - Decoder: reconstruct the graph from the latent feature vector
  - Objective: minimize the difference between the original and reconstructed graphs
(Figure: an autoencoder mapping an input matrix D through hidden layers y to a latent code z and back to a reconstruction.)
- Unsupervised (label-free): does not require labels
- Generic: not specific to a single application
- Intuitive: a good representation can be used to reconstruct the original signals
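As a concrete illustration of the encoding-decoding paradigm on a single graph, here is a minimal autoencoder sketch in PyTorch. It treats a region's mobility graph as a flattened adjacency matrix; the class name, layer sizes, and toy data are illustrative assumptions rather than the exact architecture used in the talk.

```python
import torch
import torch.nn as nn

class GraphAutoencoder(nn.Module):
    """Generic autoencoder: compress a flattened mobility graph (adjacency matrix)
    into a latent vector and reconstruct it."""
    def __init__(self, n_pois, latent_dim=32):
        super().__init__()
        d = n_pois * n_pois                      # flattened adjacency matrix size
        self.encoder = nn.Sequential(nn.Linear(d, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, d), nn.Sigmoid())

    def forward(self, x):
        z = self.encoder(x)                      # latent region representation
        return self.decoder(z), z

# Training loop: minimize reconstruction error (no labels needed)
n_pois, model = 50, GraphAutoencoder(50)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
adjacency = torch.rand(8, n_pois * n_pois)       # batch of 8 regions (toy data)
for _ in range(100):
    reconstruction, z = model(adjacency)
    loss = ((reconstruction - adjacency) ** 2).mean()
    optimizer.zero_grad(); loss.backward(); optimizer.step()
```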
Solving Multi-graph Inputs: An Ensemble-Encoding, Dissemble-Decoding Method
- NN as an input unit of the encoder: signal ensemble (multi-perceptron summation) merges the multiple graphs
- NN as an output unit of the decoder: signal dissemble (multi-perceptron filtering) splits the code back into the multiple graphs
- Objective: minimize the reconstruction loss
Solving the Optimization Problem
1. Multi-graph ensemble encoding (ensemble the multiple graphs into one latent code):
$$
\begin{cases}
y_{i,t}^{(k),1} = \sigma\big(W_{t}^{(k),1}\, p_{i,t}^{(k)} + b_{t}^{(k),1}\big), & \forall t \in \{1,2,\cdots,7\},\\[2pt]
y_{i,t}^{(k),r} = \sigma\big(W_{t}^{(k),r}\, y_{i,t}^{(k),r-1} + b_{t}^{(k),r}\big), & \forall r \in \{2,3,\cdots,o\},\\[2pt]
y_{i}^{(k),o+1} = \sigma\big(\textstyle\sum_{t} W_{t}^{(k),o+1}\, y_{i,t}^{(k),o} + b^{(k),o+1}\big), &\\[2pt]
z_{i}^{(k)} = \sigma\big(W^{(k),o+2}\, y_{i}^{(k),o+1} + b^{(k),o+2}\big), &
\end{cases}
$$
2. Multi-graph dissemble decoding (dissemble the latent code back into the multiple graphs):
$$
\begin{cases}
\hat{y}_{i}^{(k),o+1} = \sigma\big(\hat{W}^{(k),o+2}\, z_{i}^{(k)} + \hat{b}^{(k),o+2}\big), &\\[2pt]
\hat{y}_{i,t}^{(k),o} = \sigma\big(\hat{W}_{t}^{(k),o+1}\, \hat{y}_{i}^{(k),o+1} + \hat{b}_{t}^{(k),o+1}\big), &\\[2pt]
\hat{y}_{i,t}^{(k),r-1} = \sigma\big(\hat{W}_{t}^{(k),r}\, \hat{y}_{i,t}^{(k),r} + \hat{b}_{t}^{(k),r}\big), & \forall r \in \{2,3,\cdots,o\},\\[2pt]
\hat{p}_{i,t}^{(k)} = \sigma\big(\hat{W}_{t}^{(k),1}\, \hat{y}_{i,t}^{(k),1} + \hat{b}_{t}^{(k),1}\big), &
\end{cases}
$$
3. Objective function (reconstruction loss):
$$
L^{(k)} = \sum_{i}\ \sum_{t \in \{1,2,\ldots,7\}} \Big\| \big(p_{i,t}^{(k)} - \hat{p}_{i,t}^{(k)}\big) \odot v_{i,t}^{(k)} \Big\|_2^2
$$
Sparsity regularization via the weight vector $v_{i,t}^{(k)}$: entries equal 1 where mobility connectivity is 0 and are greater than 1 where mobility connectivity is positive, so reconstruction errors on observed (nonzero) connections are penalized more heavily.
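For concreteness, below is a minimal PyTorch sketch of this ensemble-encoding / dissemble-decoding scheme with the sparsity-weighted reconstruction loss. It is an illustrative reimplementation under simplifying assumptions (7 daily mobility graphs per region, one hidden layer per graph, flattened adjacency matrices, an arbitrary penalty weight of 5.0); the names `EnsembleDissembleAE` and `weighted_loss` are mine, not from the original work.

```python
import torch
import torch.nn as nn

class EnsembleDissembleAE(nn.Module):
    """Sketch of ensemble encoding / dissemble decoding for T=7 periodic graphs.

    Each of the T per-graph perceptrons encodes one flattened mobility graph;
    their outputs are summed (ensemble) into a shared latent code z, which is
    then filtered back (dissemble) into T per-graph reconstructions.
    """
    def __init__(self, graph_dim, hidden=128, latent=32, T=7):
        super().__init__()
        self.T = T
        self.per_graph_enc = nn.ModuleList(
            [nn.Sequential(nn.Linear(graph_dim, hidden), nn.ReLU()) for _ in range(T)])
        self.to_latent = nn.Linear(hidden, latent)     # ensemble -> latent code z
        self.from_latent = nn.Linear(latent, hidden)   # latent code z -> ensemble
        self.per_graph_dec = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden, graph_dim), nn.Sigmoid()) for _ in range(T)])

    def forward(self, graphs):                         # graphs: (batch, T, graph_dim)
        ensemble = sum(self.per_graph_enc[t](graphs[:, t]) for t in range(self.T))
        z = torch.sigmoid(self.to_latent(ensemble))
        shared = torch.relu(self.from_latent(z))
        recons = torch.stack([self.per_graph_dec[t](shared) for t in range(self.T)], dim=1)
        return recons, z

def weighted_loss(graphs, recons, penalty=5.0):
    """Reconstruction loss with sparsity weighting: nonzero connectivities get weight > 1."""
    v = torch.where(graphs > 0, torch.full_like(graphs, penalty), torch.ones_like(graphs))
    return (((graphs - recons) * v) ** 2).sum()

# Toy usage: 4 regions, 7 daily graphs over 20 POIs (flattened adjacency = 400 dims)
model = EnsembleDissembleAE(graph_dim=400)
graphs = torch.rand(4, 7, 400)
recons, z = model(graphs)
loss = weighted_loss(graphs, recons)
loss.backward()
```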