BIS: Bidirectional Item Similarity for Next-Item Recommendation

Zijie Zeng, Weike Pan*, Zhong Ming*
National Engineering Laboratory for Big Data System Computing Technology
College of Computer Science and Software Engineering
Shenzhen University
zengzijie1991@gmail.com, {panweike,mingz}@szu.edu.cn

Zeng, Pan and Ming (SZU) BIS SCF ICWS 2018
Introduction: Problem Definition

Figure: Illustration of next-item recommendation.

Formally, we have $n$ users and their observed action lists, i.e., $\mathcal{I} = \{\mathcal{I}'_1, \mathcal{I}'_2, \ldots, \mathcal{I}'_n\}$, and each list consists of items that are sorted by the user-item interaction timestamps, i.e., $\mathcal{I}'_u = [i_u^1, i_u^2, \ldots, i_u^{|\mathcal{I}'_u|}]$. For each user, our task is to build a model capable of predicting the next item that the user is most likely to interact with in the near future.
Introduction: Motivation

FPMC [Steffen Rendle and Schmidt-Thieme, 2010] applies Markov chains to the process of factorizing the user-item interaction matrix.
Fossil [He and Mcauley, 2016] uses Markov chains in a way similar to FPMC. However, Fossil is based on FISM [Kabbur et al., 2013] and factorizes the item-item matrix instead.
......

Most sequential CF methods are model-based, whereas memory-based CF methods for next-item recommendation are rarely seen. This motivates us to develop memory-based CF methods for sequential recommendation.
Introduction: Notations

Table: Some notations and explanations.

Symbol                              Meaning
$n \in \mathbb{N}^+$                user number
$m \in \mathbb{N}^+$                item number
$\mathcal{U}$                       the whole set of all users
$s_{ki}$                            the similarity between item $k$ and item $i$
$\mathcal{I}_u$ / $\mathcal{I}'_u$  the set/list of user $u$'s interacted items
$\mathcal{I}$                       the set of all users' lists of interacted items
$\mathcal{N}_i$                     the nearest neighbors of item $i$
$\mathcal{I}^k_{u,latest}$          the set of user $u$'s latest interacted items
$s^{(\ell)}_{j \to i}$              the sequence-oriented directional item similarity (BIS($\rho = 0$)) from $j$ to $i$
$s^{(\ell,\rho)}_{j \to i}$         the sequence-oriented bidirectional item similarity (BIS) from $j$ to $i$
$\delta(x)$                         indicator function that returns 1 if $x$ is true and 0 otherwise
$T_{max}$                           the maximum timestamp in the training set
$T_{min}$                           the minimum timestamp in the training set
$t_{uk} \in \mathbb{N}^+$           the timestamp of the record $(u, k)$
Introduction: Overview of Our Solution

The proposed method is based on the framework of time-aware item-based CF (ICF):

$$\hat{r}_{ui} = \sum_{k \in \mathcal{I}_u \cap \mathcal{N}_i} w(t_{uk}) \cdot s_{ki}$$

We propose a novel similarity measurement called sequence-oriented bidirectional item similarity (BIS).
We develop a compound weighting function based on the user's active session window [Mobasher et al., 2002] and an exponential function [Ding and Li, 2005].
We apply BIS and the compound weighting function to the time-aware ICF framework and propose a novel collaborative filtering method.
Method: Sequence-oriented Bidirectional Item Similarity

Mathematically, the sequence-oriented bidirectional item similarity from item $j$ to item $i$ is as follows:

$$s^{(\ell,\rho)}_{j \to i} = \frac{\sum_{u \in \mathcal{U}} \delta(j \in \mathcal{I}_u) \cdot \delta(i \in \mathcal{I}_u) \cdot \delta(-\rho \cdot \ell \le p_u(i) - p_u(j) \le \ell)}{|\mathcal{U}_j \cup \mathcal{U}_i|}, \quad (1)$$

where $\mathcal{U}_j$ denotes the set of users who have interacted with item $j$.

Item position: we use $p_u(j)$ to denote the position of item $j$ in user $u$'s action list. Note that the items in the list are sorted in ascending order by the timestamps of the corresponding user-item interactions.
Maximum gap: we use $\ell$ as a threshold to decide whether two items are associated in a specific user's action list.
Reverse factor: we introduce a reverse factor $\rho$ into BIS in order to better adapt it to real-world data. An occurrence of $i$ before $j$ still counts, provided the reverse gap does not exceed $\rho \cdot \ell$.
Method: A Toy Example of BIS (1/2)

We have 4 users and their action lists. Note that items are ordered by the timestamps in each list.

For comparison, we first use the Jaccard index to measure the similarity between item $b$ and item $c$:

$$s_{bc} = \frac{|\mathcal{U}_b \cap \mathcal{U}_c|}{|\mathcal{U}_b \cup \mathcal{U}_c|} = \frac{4}{4} = 1.$$
Method: A Toy Example of BIS (2/2)

Table: The calculation of the BIS from item $b$ to item $c$.

User    $p_u(c) - p_u(b)$    $\delta(-1 \le p_u(c) - p_u(b) \le 2)$
$u_1$   3 - 1 = 2            1
$u_2$   1 - 0 = 1            1
$u_3$   1 - 0 = 1            1
$u_4$   0 - 2 = -2           0

Then we use BIS to measure the similarity from item $b$ to item $c$:

$$s^{(2,0.5)}_{b \to c} = \frac{\sum_{u \in \mathcal{U}} \delta(b \in \mathcal{I}_u) \cdot \delta(c \in \mathcal{I}_u) \cdot \delta(-1 \le p_u(c) - p_u(b) \le 2)}{|\mathcal{U}_b \cup \mathcal{U}_c|} = \frac{3}{4},$$

where we set the maximum gap $\ell$ and the reverse factor $\rho$ to 2 and 0.5, respectively.
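The toy calculation can be reproduced with a short Python sketch. The action lists below are hypothetical: the placeholder items x and y are invented only so that b and c land at the 0-based positions shown in the table, and the function name bis_similarity is an assumption rather than the authors' code. The sketch also assumes each item appears at most once in a user's list.

```python
from fractions import Fraction


def bis_similarity(action_lists, j, i, max_gap, rho):
    """Sequence-oriented bidirectional item similarity from item j to item i (Eq. 1)."""
    hits = 0         # users whose lists hold j and i within the allowed gap
    union_users = 0  # |U_j ∪ U_i|: users who interacted with j or i
    for items in action_lists.values():
        has_j, has_i = j in items, i in items
        if has_j or has_i:
            union_users += 1
        if has_j and has_i:
            gap = items.index(i) - items.index(j)  # p_u(i) - p_u(j)
            if -rho * max_gap <= gap <= max_gap:
                hits += 1
    return Fraction(hits, union_users) if union_users else Fraction(0)


# Hypothetical lists; only the positions of b and c follow the table.
toy_lists = {
    "u1": ["x", "b", "y", "c"],  # p(b) = 1, p(c) = 3
    "u2": ["b", "c", "x"],       # p(b) = 0, p(c) = 1
    "u3": ["b", "c", "y"],       # p(b) = 0, p(c) = 1
    "u4": ["c", "x", "b"],       # p(b) = 2, p(c) = 0
}

print(bis_similarity(toy_lists, "b", "c", max_gap=2, rho=0.5))  # 3/4
```

Raising rho to 1 in this sketch would admit the reverse gap of u4 as well, so the value would coincide with the Jaccard index of 1.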
Method: The Proposed CF Method

With a compound weighting function and the proposed BIS, we reach our proposed collaborative filtering method:

$$\hat{r}_{ui} = \sum_{j \in \mathcal{I}_u \cap \mathcal{N}_i} w(u, j) \cdot s^{(\ell,\rho)}_{j \to i} = \sum_{j \in \mathcal{I}_u \cap \mathcal{N}_i} w_{active}(u, j) \cdot w_e(t_{uj}) \cdot s^{(\ell,\rho)}_{j \to i}, \quad (2)$$

where the compound weighting function consists of two functions:

Exponential function [Ding and Li, 2005]:

$$w_e(t) = e^{-\frac{T_{max} - t + 1}{p \cdot T_{max}}}. \quad (3)$$

User's active session window [Mobasher et al., 2002]:

$$w_{active}(u, j) = \begin{cases} 1, & j \in \mathcal{I}^k_{u,latest} \\ 0, & \text{otherwise} \end{cases}. \quad (4)$$
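As a rough illustration of how Eqs. (2) to (4) fit together, the following Python sketch scores one candidate item for one user. It is a sketch under assumptions, not the authors' implementation: the function names, the dictionary-based similarity and neighbor stores, and the reading of the active session window as the k most recent items in the user's list are all hypothetical choices made for the example.

```python
import math


def exp_weight(t, t_max, p):
    """Exponential decay weight of Eq. (3)."""
    return math.exp(-(t_max - t + 1) / (p * t_max))


def predict_score(user_history, candidate, sim, neighbors, k, t_max, p):
    """Score candidate item i for one user following Eq. (2).

    user_history: list of (item, timestamp), sorted by timestamp ascending.
    sim: dict mapping (j, i) to the similarity from j to i.
    neighbors: dict mapping item i to its neighbor set N_i.
    k: assumed size of the active session window (number of latest items).
    """
    # Items outside the window get w_active = 0, so they are simply skipped.
    latest = user_history[-k:] if k > 0 else []
    candidate_neighbors = neighbors.get(candidate, set())
    score = 0.0
    for j, t_uj in latest:
        if j in candidate_neighbors:
            score += exp_weight(t_uj, t_max, p) * sim.get((j, candidate), 0.0)
    return score


# Hypothetical data: three past interactions and one candidate item "d".
history = [("a", 10), ("b", 20), ("c", 30)]
sim = {("a", "d"): 1.0, ("b", "d"): 0.5, ("c", "d"): 0.75}
neighbors = {"d": {"a", "b", "c"}}
print(predict_score(history, "d", sim, neighbors, k=2, t_max=30, p=0.1))
```

With k = 2, item a falls outside the window and contributes nothing, even though its stored similarity to d is the largest.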
Experiments: Datasets (1/3)

We conduct our studies on two public datasets, i.e., MovieLens 10M and Netflix.
MovieLens 10M contains 10 million ratings ranging from 0.5 to 5 with a step size of 0.5. The ratings are assigned by 71567 users to 10681 movies.
Netflix contains about 100 million ratings in the range of {1, 2, 3, 4, 5} assigned by 480189 users to 17770 movies.

We preprocess the rating records of each dataset as follows:
We remove the records whose rating value is smaller than 5 from the raw data.
We then remove the records of the users who have fewer than 10 remaining records.
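The two preprocessing steps can be sketched in a few lines of Python. The tuple layout (user, item, rating, timestamp) and the function name preprocess are assumptions made for illustration; the slides do not specify a data format.

```python
from collections import Counter


def preprocess(records, min_rating=5, min_records=10):
    """Apply the two filtering steps to (user, item, rating, timestamp) tuples."""
    # Step 1: drop records whose rating is smaller than min_rating.
    kept = [r for r in records if r[2] >= min_rating]
    # Step 2: drop users with fewer than min_records remaining records.
    counts = Counter(r[0] for r in kept)
    return [r for r in kept if counts[r[0]] >= min_records]
```

The user filter runs on the output of the rating filter, so a user with many low ratings but only a few 5-star ratings is removed entirely.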
Experiments: Datasets (2/3)

We construct the training set, the validation set and the test set from the preprocessed data as follows:
We sort the records of each user by the timestamps in ascending order.
We then split the records of each user into two parts, i.e., the first $m_u - 1$ records and the last record ($m_u$ denotes the number of items in user $u$'s action list).
The first $m_u - 1$ records are used for model training, while the last record is assigned to the test set with a probability of 50% or to the validation set with a probability of 50%.
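A minimal sketch of this leave-last-out split follows, assuming records are (user, item, timestamp) tuples. The function name split_last_record and the fixed random seed are hypothetical; the slides do not say how the 50% draw was implemented.

```python
import random
from collections import defaultdict


def split_last_record(records, seed=0):
    """Split (user, item, timestamp) tuples into train, validation and test lists."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, item, ts in records:
        by_user[user].append((ts, item))
    train, valid, test = [], [], []
    for user, events in by_user.items():
        events.sort()  # ascending by timestamp
        *head, (last_ts, last_item) = events
        # The first m_u - 1 records go to training.
        train.extend((user, item, ts) for ts, item in head)
        # The last record goes to test or validation with equal probability.
        target = test if rng.random() < 0.5 else valid
        target.append((user, last_item, last_ts))
    return train, valid, test
```

Each user contributes exactly one held-out record, which matches the statistics table where the test and validation counts sum to the number of users.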
Experiments: Datasets (3/3)

Table: Statistics of the processed data used in the experiments.

Dataset         User #    Item #   Training record #   Test record #   Validation record #
MovieLens 10M   40,600    8,625    1,370,625           20,228          20,372
Netflix         329,549   17,747   21,856,804          164,741         164,808
Experiments: Baselines (1/7)

We conduct empirical studies in order to verify the following three hypotheses:
1. We believe that our proposed bidirectional item similarity can capture the item correlations better than existing ones such as the Jaccard index and cosine similarity.
2. We believe that the proposed compound weighting function can better weigh the importance of the corresponding similarity in the prediction rule.
3. We believe that the proposed collaborative filtering method with BIS and the compound weighting function can recommend the next item more accurately.

Hence, we include several collaborative filtering baselines with different similarity measurements and weighting functions in our empirical studies.
Experiments: Baselines (2/7)

JI: item-based CF with the Jaccard index as the similarity measurement.
CS: item-based CF with cosine similarity as the similarity measurement.
JI-uWIN: time-aware item-based CF using the user's active session window in Eq. (4) and the Jaccard index.
Experiments: Baselines (3/7)

JI-WIN: time-aware item-based CF using WIN:

$$w_w(t_{uk}) = \begin{cases} 1, & t_{uk} \ge T_w \\ 0, & t_{uk} < T_w \end{cases} \quad (5)$$

as the weighting function and the Jaccard index as the similarity measurement.

In order to make parameter tuning easier, we introduce $lr$:

$$T_w = lr \cdot (T_{max} - T_{min}) + T_{min}, \quad (6)$$

where $T_{max}$ and $T_{min}$ are the maximum timestamp and the minimum timestamp in the training set, respectively. In this way, we can change the value of $T_w$ by varying $lr$. We select $lr$ from {0.1, 0.3, 0.5, 0.7, 0.9} in our experiments.
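Eqs. (5) and (6) translate directly into code. The sketch below uses hypothetical function names and made-up timestamps purely to show how lr positions the cut-off T_w between the earliest and latest training timestamps.

```python
def window_threshold(lr, t_min, t_max):
    """Cut-off timestamp T_w of Eq. (6)."""
    return lr * (t_max - t_min) + t_min


def win_weight(t_uk, t_w):
    """Global window weight of Eq. (5): 1 for records at or after T_w, else 0."""
    return 1 if t_uk >= t_w else 0


# Hypothetical timestamps: lr = 0.5 places the cut-off at the midpoint.
t_w = window_threshold(0.5, t_min=0, t_max=1000)
print(t_w, win_weight(500, t_w), win_weight(499, t_w))
```

Unlike the per-user active session window of Eq. (4), this cut-off is the same for every user, so a user whose records all precede T_w receives zero weight on every item.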
Experiments: Baselines (4/7)

JI-EXP: time-aware item-based CF using the Jaccard index as the similarity measurement and an exponential function as the decay function [Ding and Li, 2005]:

$$w_e(t) = e^{-\frac{T_{max} - t + 1}{p \cdot T_{max}}}, \quad (7)$$

where the parameter $p$ is chosen from {0.001, 0.01, 0.1}.