Cross-Domain Recommendation via Clustering on Multi-Layer Graphs

Aleksandr Farseev, Ivan Samborskii, Andrey Filchenkov, Tat-Seng Chua. http://farseev.com. Aug 8th, 2017

Venue Category Recommendation
Collaborative venue category recommendation: recommendation of venue categories (e.g. restaurant, cinema) to a user using information about his/her profile (e.g. past visits) and/or information about users from the same domain.
Venue categories: Clothing Store, Hotel, Ice Cream Shop
Total: 764 different categories

Idea 1: Utilization of Individual And Group Knowledge for Better Recommendation

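User Community-Based Collaborative Recommendation
We perform venue category recommendation based on both individual and group knowledge, which naturally models the impact of society on an individual's behavior when selecting a new place to go:
$$rec(u) = \mathrm{sort}\left(\gamma \cdot vec_u + \theta \cdot \frac{\sum_{v \in C_u} vec_v}{|C_u|}\right)$$
Here $vec_u$ is the venue category vector of user $u$, $C_u$ is the user's community, and $\gamma$, $\theta$ weight individual and group knowledge.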
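The scoring rule on this slide can be sketched in a few lines of Python. This is only an illustration, reading the formula as "sort categories by gamma times the user's own category vector plus theta times the mean vector of the user's community". The function name `recommend`, the default weights, and the choice to pass the community members' vectors explicitly are assumptions, not part of the slides.

```python
def recommend(user_vec, community_vecs, gamma=0.5, theta=0.5):
    """Rank category indices by gamma * vec_u + theta * mean(vec_v for v in C_u).

    user_vec:       list of per-category scores for user u (assumed representation)
    community_vecs: list of such vectors, one per member of u's community C_u
    """
    n = len(user_vec)
    if community_vecs:
        # group knowledge: average category vector over the community
        group = [sum(v[c] for v in community_vecs) / len(community_vecs)
                 for c in range(n)]
    else:
        group = [0.0] * n
    scores = [gamma * user_vec[c] + theta * group[c] for c in range(n)]
    # category indices sorted by descending score (ties keep the lower index first)
    return sorted(range(n), key=lambda c: -scores[c])
```

With `theta=0` the ranking falls back to the user's own history; with `gamma=0` it uses only the community's preferences.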

What do we need user communities for? + Users from the same community (extracted from multi-source data) may have similar location preferences + Search within user community significantly reduces search space during the recommendation process

Example of User Communities (1) Community 1: Gingers Community K: Darker Hair

User Relation and Community Representations One way to find user communities is to model users' relationships in the form of a graph so that dense subgraphs are considered to be user communities.

Community Detection based on a single data source
One of the common formulations is the MinCut problem. For a given number $k$ of subsets, MinCut involves choosing a partition $C_1, \dots, C_k$ that minimizes the expression:
$$cut(C_1, \dots, C_k) = \sum_{i=1}^{k} W(C_i, \bar{C}_i)$$
* $W(C_i, \bar{C}_i)$ is the sum of weights of edges that connect vertices in $C_i$ to vertices outside $C_i$.
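To make the cut objective concrete, the sketch below evaluates it for a given partition of a small weighted graph. It assumes a symmetric weight matrix stored as nested lists and one integer cluster label per vertex; the name `cut_value` is illustrative.

```python
def cut_value(weights, labels):
    """Sum over clusters C_i of W(C_i, complement of C_i).

    weights: symmetric n x n matrix (nested lists) of edge weights
    labels:  labels[a] is the cluster index of vertex a
    """
    n = len(weights)
    total = 0.0
    for a in range(n):
        for b in range(n):
            # every edge crossing a cluster boundary is counted once
            # for the cluster of a and once for the cluster of b
            if labels[a] != labels[b]:
                total += weights[a][b]
    return total
```

For a path 0-1-2-3 with a weak middle edge, cutting at the weak edge gives a much smaller value than a partition that separates the strongly connected pairs.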

How to solve the MinCut problem?
Approximation of MinCut as a standard trace minimization problem:
$$\min_{U \in \mathbb{R}^{n \times k}} \mathrm{tr}(U^T L U), \quad \text{s.t. } U^T U = I$$
which can be solved by Spectral Clustering:
1. Calculates the Laplacian matrix $L \in \mathbb{R}^{n \times n}$
2. Builds the matrix $U \in \mathbb{R}^{n \times k}$ of the first $k$ eigenvectors, corresponding to the smallest eigenvalues of $L$
3. Clusters the data in the new space $U$ using, e.g., the $k$-means algorithm
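The three steps can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the unnormalized Laplacian L = D - W is used (the slides do not say which variant), and the k-means step is a small hand-written Lloyd loop with a deterministic farthest-point initialisation rather than a library call. All function names are illustrative.

```python
import numpy as np

def laplacian(W):
    # unnormalized graph Laplacian L = D - W (assumed variant)
    return np.diag(W.sum(axis=1)) - W

def smallest_eigenvectors(L, k):
    # eigh returns eigenvalues of a symmetric matrix in ascending order
    _, vecs = np.linalg.eigh(L)
    return vecs[:, :k]

def kmeans(X, k, iters=50):
    # deterministic farthest-point initialisation, then Lloyd iterations
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def spectral_clustering(W, k):
    U = smallest_eigenvectors(laplacian(W), k)  # steps 1 and 2
    return kmeans(U, k)                         # step 3
```

On a graph made of two triangles joined by one weak edge, the rows of U for each triangle collapse to nearly the same point, so k-means recovers the two triangles.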

Idea 2: Utilization of Multi-Source Data

Most users actively use ≈ 3 social networks
Accounts: ~6 registered social network accounts per person*
Active usage: people actively use ~3 social platforms simultaneously*
* GlobalWebIndex. 2016. GWI Social report. http://www.globalwebindex.net/blog/internet-users-have-average-of-5-social-media-accounts

Multi-source data describe a user from multiple views

Cross-Domain Venue Category Recommendation
Cross-domain venue category recommendation: recommendation of venue categories (e.g. restaurant, cinema) to a user using information about his/her profile (e.g. past visits) and/or information about users from other sources (e.g. images, texts, location types).
Venue categories: Clothing Store, Hotel, Ice Cream Shop

Community Detection must be performed in a Cross-Source Manner… Problems: • Data source integration • Community detection

How to represent multi-source data?
Multi-layer graph: a graph $G$, where $G = \{G_i\}$, $G_i = (V, E_i)$

Extending the definition of spectral clustering
$$\min_{U \in \mathbb{R}^{n \times k}} \sum_{i=1}^{M} \mathrm{tr}(U^T L_i U), \quad \text{s.t. } U^T U = I$$
$$\min_{U \in \mathbb{R}^{n \times k}} \mathrm{tr}(U^T L_{sum} U), \quad \text{where } L_{sum} = \sum_{i=1}^{M} L_i$$
Such an approximation could suffer from poor generalization ability.
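A small NumPy sketch of this naive extension: each layer is stored as a weight matrix over the same vertex set, the per-layer Laplacians are summed, and the joint embedding is taken from the summed matrix. The unnormalized Laplacian and the function names are assumptions made for illustration.

```python
import numpy as np

def laplacian(W):
    # unnormalized graph Laplacian L = D - W (assumed variant)
    return np.diag(W.sum(axis=1)) - W

def summed_laplacian(layers):
    # layers: list of n x n weight matrices, one per layer, sharing the vertex set
    return sum(laplacian(W) for W in layers)

def joint_embedding(layers, k):
    # eigenvectors for the k smallest eigenvalues of the summed Laplacian
    _, vecs = np.linalg.eigh(summed_laplacian(layers))
    return vecs[:, :k]
```

Because the trace is linear, minimizing the trace with the summed Laplacian is the same as minimizing the sum of the per-layer traces, which is why every layer contributes equally here regardless of how informative it is.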

Regularized Clustering on Multi-layer Graph -1
Use Grassmann manifolds to keep the final latent representation “close” to all layers of the multi-layer graph*.
The projection distance between two subspaces $Y_1$ and $Y_2$ is
$$d_{proj}^2(Y_1, Y_2) = \frac{1}{2} \lVert Y_1 Y_1^T - Y_2 Y_2^T \rVert_F^2,$$
where $\lVert A \rVert_F$ is the Frobenius norm. For the final subspace $U$ and the layer subspaces $U_i$:
$$d_{proj}^2(U, \{U_i\}_{i=1}^{M}) = \sum_{i=1}^{M} d_{proj}^2(U, U_i) = kM - \sum_{i=1}^{M} \mathrm{tr}(U U^T U_i U_i^T)$$
* X. Dong, P. Frossard, P. Vandergheynst, and N. Nefedov. Clustering on multi-layer graphs via subspace analysis on Grassmann manifolds. IEEE Transactions on Signal Processing, 2014.
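The projection distance can be checked numerically. The sketch below computes it in two ways for subspaces given by matrices with k orthonormal columns: directly from the Frobenius norm of the difference of projectors, and through the trace form k - tr(Y1 Y1^T Y2 Y2^T). Function names are illustrative.

```python
import numpy as np

def proj_dist_sq(Y1, Y2):
    """Squared projection distance via the Frobenius norm of the projector difference."""
    P1, P2 = Y1 @ Y1.T, Y2 @ Y2.T
    return 0.5 * np.linalg.norm(P1 - P2, "fro") ** 2

def proj_dist_sq_trace(Y1, Y2):
    """Same quantity via the trace form; valid when both have k orthonormal columns."""
    k = Y1.shape[1]
    return k - np.trace(Y1 @ Y1.T @ Y2 @ Y2.T)
```

The two forms agree because the squared Frobenius norm expands to tr(P1) + tr(P2) - 2 tr(P1 P2), and each projector has trace k.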

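Regularized Clustering on Multi-layer Graph -2
Extends the objective function to introduce the subspace analysis regularization:
$$\min_{U \in \mathbb{R}^{n \times k}} \sum_{i=1}^{M} \mathrm{tr}(U^T L_i U) + \alpha \Big(kM - \sum_{i=1}^{M} \mathrm{tr}(U U^T U_i U_i^T)\Big), \quad \text{s.t. } U^T U = I$$
$$\min_{U \in \mathbb{R}^{n \times k}} \mathrm{tr}(U^T L_{mod} U), \quad L_{mod} = \sum_{i=1}^{M} (L_i - \alpha U_i U_i^T)$$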
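A hedged NumPy sketch of the modified Laplacian: each layer's spectral embedding U_i is computed first, and then alpha times its projector is subtracted from that layer's Laplacian before summing. The unnormalized Laplacian, the default value of `alpha`, and the function names are assumptions for illustration only.

```python
import numpy as np

def laplacian(W):
    # unnormalized graph Laplacian L = D - W (assumed variant)
    return np.diag(W.sum(axis=1)) - W

def smallest_eigenvectors(L, k):
    # eigh returns eigenvalues of a symmetric matrix in ascending order
    _, vecs = np.linalg.eigh(L)
    return vecs[:, :k]

def modified_laplacian(layers, k, alpha=0.5):
    """L_mod = sum_i (L_i - alpha * U_i U_i^T) over all layers."""
    L_mod = np.zeros_like(layers[0], dtype=float)
    for W in layers:
        L_i = laplacian(W)
        U_i = smallest_eigenvectors(L_i, k)  # per-layer spectral embedding
        L_mod += L_i - alpha * U_i @ U_i.T
    return L_mod
```

The joint embedding is then `smallest_eigenvectors(modified_laplacian(layers, k), k)`. The two objectives differ only by the constant alpha * k * M, so they share the same minimizer.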

Idea 4: Making use of Inter-Layer (Inter-Source) Relations

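Incorporating inter-layer relationship (1)
By using the distance on Grassmann manifolds, we present the new objective function for the $i$-th layer:
$$\min_{\hat{U}_i \in \mathbb{R}^{n \times k}} \mathrm{tr}(\hat{U}_i^T L_i \hat{U}_i) + \beta_i \Big(kM - \sum_{j=1, j \neq i}^{M} w_{i,j}\, \mathrm{tr}(\hat{U}_i \hat{U}_i^T U_j U_j^T)\Big)$$
$$\min_{\hat{U}_i \in \mathbb{R}^{n \times k}} \mathrm{tr}(\hat{U}_i^T \hat{L}_i \hat{U}_i), \quad \hat{L}_i = L_i - \beta_i \sum_{j=1, j \neq i}^{M} w_{i,j}\, U_j U_j^T$$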

But how can we determine $w_{i,j}$ when computing the $i$-th layer?
$$\min_{\hat{U}_i \in \mathbb{R}^{n \times k}} \mathrm{tr}(\hat{U}_i^T \hat{L}_i \hat{U}_i), \quad \hat{L}_i = L_i - \beta_i \sum_{j=1, j \neq i}^{M} w_{i,j}\, U_j U_j^T$$
Inter-layer relationship graph $R(V, E)$: a weighted graph which represents the similarity between layers.
$$\forall i, j \in E, \quad w_{i,j} = \frac{\sum_{k=2}^{K} \Big(1 - \frac{\lVert M_{i,k} - M_{j,k} \rVert}{N(N-1)}\Big)}{K - 1}$$
where $M_{i,k}$ is the clustering co-occurrence matrix of layer $i$, with $m_{a,b} = 1$ if users $a$ and $b$ are assigned to the same cluster, and 0 otherwise.
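The layer-similarity weight can be sketched as follows. Several details are assumptions, since the slide does not spell them out: the norm is taken to be the entrywise L1 norm (the number of ordered user pairs on which the two clusterings disagree), each layer is assumed to have been clustered once for every k from 2 to K, and the function names are illustrative.

```python
import numpy as np

def cooccurrence(labels):
    """Co-occurrence matrix: entry (a, b) is 1 if users a and b share a cluster."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def layer_similarity(clusterings_i, clusterings_j):
    """Average agreement of two layers over clusterings for k = 2..K.

    clusterings_x: list of label arrays for layer x, one per value of k
    """
    n = len(clusterings_i[0])
    sims = []
    for labels_i, labels_j in zip(clusterings_i, clusterings_j):
        # assumed entrywise L1 norm of the co-occurrence difference
        diff = np.abs(cooccurrence(labels_i) - cooccurrence(labels_j)).sum()
        sims.append(1.0 - diff / (n * (n - 1)))
    return sum(sims) / len(sims)
```

Under this reading the weight is 1 when two layers always cluster users identically and decreases as their clusterings disagree on more user pairs.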

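Final objective function
Let’s combine the equations from the previous slides to define the final objective function:
$$\min_{U \in \mathbb{R}^{n \times k}} \sum_{i=1}^{M} \mathrm{tr}(U^T \hat{L}_i U) + \alpha \Big(kM - \sum_{i=1}^{M} \mathrm{tr}(U U^T \hat{U}_i \hat{U}_i^T)\Big)$$
which, up to the additive constant $\alpha kM$, is
$$\min_{U \in \mathbb{R}^{n \times k}} \mathrm{tr}\Big(U^T \sum_{i=1}^{M} (\hat{L}_i - \alpha \hat{U}_i \hat{U}_i^T)\, U\Big)$$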
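Putting the pieces together, a possible end-to-end sketch of the final embedding looks like this. It is an illustration under several assumptions: unnormalized Laplacians, a single `beta` shared by all layers (the slides use a per-layer value), inter-layer weights `w` supplied by the caller, and illustrative function names.

```python
import numpy as np

def laplacian(W):
    # unnormalized graph Laplacian L = D - W (assumed variant)
    return np.diag(W.sum(axis=1)) - W

def smallest_eigenvectors(L, k):
    # eigh returns eigenvalues of a symmetric matrix in ascending order
    _, vecs = np.linalg.eigh(L)
    return vecs[:, :k]

def final_embedding(layers, w, k, alpha=0.5, beta=0.5):
    """layers: list of n x n weight matrices; w[i][j]: similarity of layers i and j."""
    M = len(layers)
    L = [laplacian(W) for W in layers]
    U = [smallest_eigenvectors(L_i, k) for L_i in L]  # per-layer embeddings U_j
    L_final = np.zeros_like(L[0], dtype=float)
    for i in range(M):
        # L_hat_i = L_i - beta * sum_{j != i} w[i][j] * U_j U_j^T
        L_hat = L[i].astype(float)
        for j in range(M):
            if j != i:
                L_hat = L_hat - beta * w[i][j] * (U[j] @ U[j].T)
        U_hat = smallest_eigenvectors(L_hat, k)        # inter-layer-aware embedding
        L_final += L_hat - alpha * (U_hat @ U_hat.T)
    return smallest_eigenvectors(L_final, k)
```

The rows of the returned matrix are the users' latent representations; clustering them (for example with k-means) would give the user communities used by the recommendation step.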

Problems • Community detection • Data source integration

Recall: Community-Based Cross-Domain Recommendation
We perform venue category recommendation based on both individual and group knowledge, where group knowledge is obtained from multiple sources:
$$rec(u) = \mathrm{sort}\left(\gamma \cdot vec_u + \theta \cdot \frac{\sum_{v \in C_u} vec_v}{|C_u|}\right)$$

NUS-MSS Dataset
The dataset* is presented as a set of features extracted from user-generated data in three social networks:
- text-based, from Twitter (LDA, LIWC, text features)
- image-based, from Instagram (concepts)
- location-based, from Foursquare (LDA, categories, mobility features)
Foursquare categories are split into two parts: 3 months of data (train) and 2 months (test).
* A. Farseev, L. Nie, M. Akbari, and T.-S. Chua. Harvesting multiple sources for user profile learning: a big data study. ACM International Conference on Multimedia Retrieval (ICMR). China. June 23-26, 2015.

Data Sources
Text features:
- Linguistic features: LIWC
- Latent topics: LDA
- Heuristic features: writing behavior
Location features:
- Location semantics: venue category distribution (location type preferences)
- Mobility features: Areas of Interest (AOI)
Image features:
- Image concept distribution: GoogLeNet concepts (ImageNet)

Evaluation Baselines
Recommender systems:
• Popular (POP): recommendation based on the user’s past experience
• Popular All (POP All): recommendation based on the experience of all users
• Multi-Source Re-Ranking (MSRR): linearly combines recommendation results from all data modalities
• Nearest Neighbor Collaborative Filtering (CF): recommendation based on the top k most similar Foursquare users
• Early Fusion (EF): fuses multi-source data into a single feature vector
• SVD++: makes use of the “implicit feedback” information
• FM: brings together the advantages of different factorization-based models via regularization
Community detection approaches:
• C³R − $L_i$: C³R recommendation without inter-layer regularization
• C³R − $L_i$ − $L_{mod}$: C³R recommendation without inter-layer regularization and sub-space regularization
• C³R − Comm: C³R recommendation without user community extraction
• C³R (DBScan): C³R recommendation, where user communities are detected by density-based clustering (DBScan)
• C³R (x-means): C³R recommendation, where user communities are detected by x-means clustering
• C³R (Hierarchical): C³R recommendation, where user communities are detected by hierarchical clustering
• C³R: our approach

Evaluation against other recommender systems

Evaluation against other community detection approaches + Incorporation of group knowledge is important + Multi-modal clustering performs better than single-source clustering + Incorporation of inter-source relationships is crucial.

Evaluation against source combinations + In different geo regions, different data sources are of different importance + Location data is more powerful than other data modalities
