Huan Liu Joint Work with Huiji Gao and Jiliang Tang Data Mining and - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Data Mining and Machine Learning Lab

Toward Mobile Cloud Computing: Data Analysis with Location-Based Social Network

Huan Liu

Joint Work with Huiji Gao and Jiliang Tang

slide-2
SLIDE 2

Location-Based Social Networks (LBSNs)

l Location-Based Social Networking Sites Foursquare, Facebook Places, Yelp

slide-3
SLIDE 3

A Location-Based Social Network Framework

Social Computing Traditional Mobile Computing

slide-4
SLIDE 4

Essential Data from LBSN

Ø Check-in history with time stamps Ø Social networks derived from check- in locations Ø User generated contents Ø Interdependency of social networks and locations

slide-5
SLIDE 5

Distinct Properties of LBSN Data

Ø Large-Scale Mobile Data Ø Accurate Location Descriptions Ø Explicit Social Friendships Ø Significant Sparsity of Data

slide-6
SLIDE 6

Research Opportunities

Ø Study a user’s mobile behavior through both real and virtual worlds in spatial, temporal and social dimensions. Ø Understand the role of social networks and geographical properties with large amounts of heterogeneous data Ø Improve the development of location- based services such as mobile marketing, disaster relief, traffic forecasting, and etc. Ø Mobile cloud computing

slide-7
SLIDE 7

Some Challenges

Ø How to study human mobile behavior from high dimensional data from heterogeneous sources Ø How to deduce human movement through sparse check-in data Ø How to design location-based services to improve user’s experience without sacrificing one’s privacy

slide-8
SLIDE 8

Potential Applications

Ø Disaster Relief/Crisis Response Ø Mobile Search/Recommendation Ø Location Prediction Ø Recommendation Systems Ø Mobile Community Detection Ø Location Privacy Protection Ø Mobile Marketing

slide-9
SLIDE 9

Some of Our Recent Findings

  • Social-Historical Ties on Location-Based Social Networks (ICWSM’2012)
    – Are two types of ties equally important?
  • Geo-Social Correlation (CIKM’2012)
    – Handling the Cold Start Problem
  • Mobile Location Prediction in Spatio-Temporal Context (Next Location Prediction, 2012 Nokia Mobile Data Challenge Workshop, 3rd Prize)
    – Together is better

slide-10
SLIDE 10

Exploring Social-Historical Ties on Location-Based Social Networks

slide-11
SLIDE 11

Social-Historical Effect of Online Check-ins

Historical Ties Social Ties

slide-12
SLIDE 12

Why is the prediction hard?

  • Power-law distribution

Whole Dataset Individual

slide-13
SLIDE 13

Analyzing User’s Historical Ties

  • Short Term Effect

Ø The historical ties of the previous check-ins at airport, shuttle stop, hotel and restaurant have different strengths to the latest check-in of drinking coffee. Ø The historical tie strength decreases

  • ver time.
slide-14
SLIDE 14

Modeling User’s Historical Ties

  • Power-law distribution
  • Short Term Effect
  • Correspondences between language and LBSN modeling

HPY (Hierarchical Pitman-Yor) Language Model

slide-15
SLIDE 15

Modeling User’s Social Ties

  • Friend Similarity
  • Friends’ Check-in Sequence
  • HPY

v Social Ties Ø Common Check-ins Ø Check-in Similarities Users with friendship have higher check-in similarity than those without. Null hypothesis ​𝐼↓0 :​𝑇↓𝐺 ≤​𝑇↓𝑆 , rejected at significant level α = 0.001 with p-value of 2.6e-6. Social Model

) ( ) 1 ( ) ( ) (

1 1 1

l c P l c P l c p

n i S n i H n i SH

= − + = = =

+ + +

η η
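
The weighted combination of the historical and social models on this slide can be illustrated with a minimal sketch. It assumes each model is already available as a dictionary mapping candidate locations to probabilities; the function name `combine_social_historical` and the toy numbers are hypothetical and not taken from the slides.

```python
def combine_social_historical(p_hist, p_social, eta=0.7):
    """Return eta * P_H + (1 - eta) * P_S for every candidate location.

    p_hist and p_social are dicts {location: probability}; a location
    missing from one model is treated as having probability 0 there.
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    locations = set(p_hist) | set(p_social)
    return {
        loc: eta * p_hist.get(loc, 0.0) + (1.0 - eta) * p_social.get(loc, 0.0)
        for loc in locations
    }


# Toy distributions (made up for illustration only).
p_h = {"cafe": 0.6, "gym": 0.4}
p_s = {"cafe": 0.2, "bar": 0.8}
p_sh = combine_social_historical(p_h, p_s, eta=0.7)
best = max(p_sh, key=p_sh.get)
```

Because both inputs sum to one, the mixture also sums to one, so no renormalization is needed in this sketch.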

slide-16
SLIDE 16

Experiment Results for Location Prediction

§ Experiment Results

Ø MFC: Most Frequent Check-in Model
Ø MFT: Most Frequent Time Model
Ø Order-1: Order-1 Markov Model
Ø Order-2: Order-2 Markov Model
Ø HM: Historical Model
Ø SHM: Social-Historical Model

slide-17
SLIDE 17

Social-historical Tie Effect w.r.t. η

Ø When no historical information is considered, the prediction performs worst, suggesting that considering social information only is not enough to capture the check-in behavior. Ø By gradually adding the historical information, the performance shows the following pattern: first increasing, reaching its peak value and then decreasing. Most

  • f the time, the best performance is achieved at around η = 0.7. A big weight is given

to historical ties, indicating that historical ties are more important than social ties.

slide-18
SLIDE 18

Predicting New Check-Ins

Ø Limited contribution to improve location prediction performance
Ø Impossible to predict relying on personal history

slide-19
SLIDE 19

Motivation

F: Local Friends
Local Non-friends
D: Distant Friends
Distant Non-friends

slide-20
SLIDE 20

Geo-Social Correlations

Local Correlation Distant Correlation Confounding Unknown Effect

slide-21
SLIDE 21

Modeling Geo-Social Correlations

Ø : the probability of a user u checking-in at a new location l at time t

) (l Pt

u

slide-22
SLIDE 22

Ø : the probability of a user u checking-in at a new location l at time t

) (l Pt

u

Modeling Geo-Social Correlations

  • 1. Sim-Location Frequency (S.Lf)
  • 2. Sim-User Frequency (S.Uf)
  • 3. Sim-Location Frequency & User Frequency (S.Lf.Uf)

Ø Geo-Social Correlation Probability Measures:

slide-23
SLIDE 23

Dataset

Ø Foursquare Dataset

Duration Jan 1, 2011-July 31, 2011

  • No. of user

11,326

  • No. of check-ins

1,385,223

  • No. of unique locations

182,968

  • No. of links

47,164

Table 2: Statistical information of the dataset

Social Circle

  • No. of SCCs

Ratio 34,523 44.50% 5,636 7.26% 3,588 4.62% 39,423 50.82% Others 1,672 2.2% 35,277 45.47% 35,784 46.12% 8,235 10.61% 36,486 47.03%

Table 3: Statistical information of the July data

slide-24
SLIDE 24

Experiments

Ø Location Prediction Evaluation Metrics

                   Single Measure   Various Measures
Equal Strength     EsSm             EsVm
Random Strength    RsSm             RsVm
Various Strength   VsSm             gSCorr

Ø Effect of Geo-Social Correlation Strength and Probability Measures

Methods   Top-1    Top-2    Top-3
EsVm      17.88%   24.06%   27.86%
EsSm      16.20%   21.92%   25.43%
VsSm      16.49%   22.28%   25.92%
RsSm      14.93%   20.30%   23.70%
RsVm      15.23%   20.85%   24.50%
gSCorr    19.21%   25.19%   28.69%
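
The Top-1, Top-2, and Top-3 columns on this slide are presumably top-k accuracies: the fraction of test check-ins whose true location appears among the k highest-ranked predictions. The slide does not define the metric, so the sketch below is an assumption about it, and the function name `top_k_accuracy` and the toy data are hypothetical.

```python
def top_k_accuracy(ranked_predictions, truths, k):
    """Fraction of cases whose true location is within the first k predictions.

    ranked_predictions: list of lists, each ordered from most to least likely.
    truths: list of the true locations, aligned with ranked_predictions.
    """
    if len(ranked_predictions) != len(truths) or not truths:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, truths)
        if truth in ranked[:k]
    )
    return hits / len(truths)


# Toy rankings for three test check-ins (made up for illustration only).
preds = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
truths = ["a", "a", "a"]
top1 = top_k_accuracy(preds, truths, 1)
top2 = top_k_accuracy(preds, truths, 2)
top3 = top_k_accuracy(preds, truths, 3)
```

By construction the metric is non-decreasing in k, which matches the pattern in the reported columns.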

slide-25
SLIDE 25

Experiments

Ø Effect of Different Geo-Social Circles

Methods   Top-1    Top-2    Top-3
          6.51%    8.31%    9.32%
          3.65%    4.75%    5.34%
          18.37%   24.10%   27.34%
          18.62%   24.44%   27.79%
          19.01%   24.95%   28.35%
          8.33%    10.79%   12.23%
          19.21%   25.19%   28.69%

slide-26
SLIDE 26

Mobile Location Prediction in Spatio-Temporal Context

slide-27
SLIDE 27

Problem Statement

$$p(v_i = l \mid t_i = t, v_{i-1} = l_k) = p(v_i = l \mid v_{i-1} = l_k)\; p(t_i = t \mid v_i = l)$$

Ø $p(v_i = l \mid t_i = t, v_{i-1} = l_k)$: the probability of checking in at location l given the check-in time t and the latest check-in at $l_k$.
Ø Spatial Prior (Historical Model), $p(v_i = l \mid v_{i-1} = l_k)$: the probability of the next visit at location l given the current visit at $l_k$.
Ø Temporal Constraint, $p(t_i = t \mid v_i = l)$: the probability of the i-th visit happening at time t, observing that the i-th visit location is l.

slide-28
SLIDE 28

Temporal Constraint

h: Hour of the day, e.g., 10:00am, 3:00pm
d: Day of the week, e.g., Monday, Sunday

Temporal Constraint:

$$p(t_i = t \mid v_i = l) = p(h_i = h, d_i = d \mid v_i = l) = p(h_i = h \mid v_i = l)\; p(d_i = d \mid v_i = l)$$

Ø Hourly Constraint: $p(h_i = h \mid v_i = l)$
Ø Daily Constraint: $p(d_i = d \mid v_i = l)$

slide-29
SLIDE 29

Temporal Constraint

Ø Distribution of a user’s visits at a specific location in 24 hours. (user id: 013; place id: 3 ) Compute and

) | ( l v h h p

i i

= =

) | ( l v d d p

i i

= =

) , | ( ) | (

2 h h l i i

h N l v h h p σ µ = = =

=

= =

l

N i h h i l i

h N l v H p

1 2)

, | ( ) | ( σ µ Maximizing Likelihood

⎩ ⎨ ⎧

2 h h

σ µ

) | | , (

l i

N H H h = ∈
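
For a Gaussian, maximizing the likelihood over the observed visit hours has a closed form: the sample mean and the variance with divisor N rather than N - 1. A minimal sketch follows, assuming the hours of one user's visits to one location are given as plain numbers; the helper names and sample hours are hypothetical, and the sketch ignores the wrap-around between hour 23 and hour 0 as well as the degenerate case of zero variance.

```python
import math


def fit_gaussian_mle(values):
    """Closed-form maximum-likelihood estimates (mean, variance) of a Gaussian."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n  # MLE divides by n, not n - 1
    return mu, var


def gaussian_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x; requires var > 0."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


# Hours of day at which a user visited one place (made up for illustration).
hours = [8, 9, 9, 10, 9]
mu_h, var_h = fit_gaussian_mle(hours)
p_at_9 = gaussian_pdf(9, mu_h, var_h)
p_at_12 = gaussian_pdf(12, mu_h, var_h)
```

The same two helpers would apply to the day-of-week constraint if days were encoded as numbers, which is itself an assumption about the encoding.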

slide-30
SLIDE 30

Temporal Constraint

Curve Fitting: [user id: 013; place id: 3]

slide-31
SLIDE 31

Location Prediction

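Probability of visiting location l at time t with the latest visit at $l_k$:

$$p(v_i = l \mid t_i = t, v_{i-1} = l_k) = p(v_i = l \mid v_{i-1} = l_k)\; p(h_i = h \mid v_i = l)\; p(d_i = d \mid v_i = l)$$

$$= p(v_i = l \mid v_{i-1} = l_k)\; \mathcal{N}_l(h \mid \mu_h, \sigma_h^2)\; \mathcal{N}_l(d \mid \mu_d, \sigma_d^2)$$

Ø HPY Prior: $p(v_i = l \mid v_{i-1} = l_k)$
Ø Gaussian: $\mathcal{N}_l(h \mid \mu_h, \sigma_h^2)$
Ø Gaussian: $\mathcal{N}_l(d \mid \mu_d, \sigma_d^2)$

HPY Prior Hour-Day Model (HPHD)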
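
The product of a spatial prior with hourly and daily Gaussians can be sketched as a scoring function over candidate locations. This is not an HPY implementation: the prior is assumed to be supplied as a dictionary by some upstream model, days are assumed to be encoded as numbers 0 to 6, and every name and parameter value below is hypothetical. The scores are renormalized so that they can be compared as probabilities.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x; requires var > 0."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def hphd_scores(prior, hour_params, day_params, hour, day):
    """Score each candidate as prior * N(hour) * N(day), then renormalize.

    prior: {location: p(v_i = l | v_{i-1} = l_k)} from any spatial model.
    hour_params, day_params: {location: (mean, variance)} per location.
    """
    scores = {}
    for loc, p in prior.items():
        mu_h, var_h = hour_params[loc]
        mu_d, var_d = day_params[loc]
        scores[loc] = p * gaussian_pdf(hour, mu_h, var_h) * gaussian_pdf(day, mu_d, var_d)
    total = sum(scores.values())
    if total == 0.0:
        return scores
    return {loc: s / total for loc, s in scores.items()}


# Toy parameters (made up for illustration only).
prior = {"office": 0.5, "bar": 0.5}
hour_params = {"office": (10.0, 4.0), "bar": (21.0, 4.0)}
day_params = {"office": (2.0, 2.0), "bar": (5.0, 2.0)}
scores = hphd_scores(prior, hour_params, day_params, hour=9, day=1)
prediction = max(scores, key=scores.get)
```

With equal priors, the temporal terms decide the ranking: a query at 9 o'clock early in the week favors the office over the bar.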

slide-32
SLIDE 32

Experiments – Together is Better

v Results Rank 3rd among 21 participated teams in Nokia Mobile Competition

slide-33
SLIDE 33

Some of Our Recent Findings

  • Social-Historical Ties on Location-Based Social Networks (ICWSM’2012)
    – Are two types of ties equally important?
  • Geo-Social Correlation (CIKM’2012)
    – Handling the Cold Start Problem
  • Mobile Location Prediction in Spatio-Temporal Context (Next Location Prediction, 2012 Nokia Mobile Data Challenge Workshop, 3rd Prize)
    – Together is better

slide-34
SLIDE 34

Acknowledgments: The projects are, in part, sponsored by ONR grants.

THANK YOU