
Conditional Restricted Boltzmann Machine for Item Recommendation



  1. Conditional Restricted Boltzmann Machine for Item Recommendation
     Zixiang Chen a,b,c, Wanqi Ma a,b,c, Wei Dai a,b,c, Weike Pan a,b,c,∗, Zhong Ming a,b,c,∗
     {chenzixiang2016, mawanqi2019, daiwei20171}@email.szu.edu.cn, {panweike, mingz}@szu.edu.cn
     a National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, China
     b Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Shenzhen, China
     c College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
     Chen et al. (SZU), CRBM-IR, Neurocomputing, 1 / 31

  2. Introduction: Problem Definition
     Top-$k$ Recommendation with Users' Explicit Rating Feedback
     Input: A set of rating triples $(u, i, r_{ui})$ from $n$ users and $m$ items, where $r_{ui}$ denotes the numerical rating, i.e., explicit feedback, assigned by user $u$ to item $i$.
     Goal: Provide a personalized ranked list of unrated items from $\mathcal{I} \setminus \mathcal{I}_u$ for each user $u$.
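
The goal stated on this slide can be illustrated with a minimal sketch. The function name `top_k_unrated` and the toy scores are hypothetical and not taken from the slides; the sketch only assumes that some model has already produced one preference score per item for a user.

```python
def top_k_unrated(scores, rated_items, k):
    """Return the k highest-scoring item IDs that the user has not rated.

    scores: dict mapping item ID -> predicted preference score (assumed given).
    rated_items: set of item IDs already rated by the user (I_u).
    """
    # Candidate items are those in I \ I_u.
    candidates = [i for i in scores if i not in rated_items]
    # Sort by descending score; break ties by item ID for determinism.
    candidates.sort(key=lambda i: (-scores[i], i))
    return candidates[:k]


# Toy data (hypothetical): five items, the user has rated items 1 and 3.
scores = {1: 0.9, 2: 0.4, 3: 0.7, 4: 0.8, 5: 0.1}
rated = {1, 3}
print(top_k_unrated(scores, rated, 2))  # [4, 2]
```

Rated items are filtered out before ranking, so a high score on an already-rated item never takes a slot in the list.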

  3. Introduction: Motivation
     1. In terms of modeling users' rating data, existing methods are mainly neighborhood-based and factorization-based, and most of them are rating-oriented.
     2. Among network-based methods, the restricted Boltzmann machine (RBM) model has also been applied to rating prediction tasks. However, item recommendation tasks play a more important role in the real world, due to the large item sets as well as users' limited attention.

  4. Introduction: Our Contributions
     1. To the best of our knowledge, this is the first study to apply the CRBM model to the problem of top-$k$ recommendation with users' explicit rating feedback.
     2. To better model users' preferences, we treat the original rating matrix from a new perspective, i.e., three different views of users' behaviors.
     3. We conduct empirical studies on four publicly available datasets, and the experimental results show that our proposed CRBM-IR is effective in generating personalized top-$k$ items for each user.

  5. Introduction: Notations (1/3)
     Table: Some notations and their explanations.
     Notation | Explanation
     $n$ | number of users
     $m$ | number of items
     $\mathcal{U} = \{1, 2, \ldots, n\}$ | the whole set of users
     $\mathcal{I} = \{1, 2, \ldots, m\}$ | the whole set of items
     $u \in \mathcal{U}$ | user ID
     $i, j \in \mathcal{I}$ | item ID
     $r_{ui}$ | the rating assigned by user $u$ to item $i$
     $\mathcal{I}_u^+$ | a set of positive items w.r.t. user $u$, i.e., the set of items rated by $u$ with high ratings
     $\mathcal{I}_u^-$ | a sampled set of negative items w.r.t. user $u$, $\mathcal{I}_u^- \subseteq \mathcal{I} \setminus \mathcal{I}_u$
     $\mathcal{I}_u$ | a set of items rated by user $u$
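
As a rough illustration of the three item sets in the table, the following sketch builds $\mathcal{I}_u$, $\mathcal{I}_u^+$ and $\mathcal{I}_u^-$ for a single user. Several details are assumptions, not statements from the slides: the rating threshold that counts as "high" (here 4.0), the rule that the number of sampled negatives is the sampling ratio times the number of positives, uniform sampling, and the function name `build_item_sets`.

```python
import random


def build_item_sets(ratings, m, threshold=4.0, rho=3, seed=0):
    """Build (I_u, I_u^+, I_u^-) for one user.

    ratings: dict mapping item ID -> rating given by the user; items are 1..m.
    threshold: ratings >= threshold count as "high" (assumed value).
    rho: sampling ratio; |I_u^-| = rho * |I_u^+| is an assumed rule.
    """
    rated = set(ratings)                                           # I_u
    positive = {i for i, r in ratings.items() if r >= threshold}   # I_u^+
    unrated = sorted(set(range(1, m + 1)) - rated)                 # I \ I_u
    # Never request more negatives than there are unrated items.
    n_neg = min(len(unrated), rho * len(positive))
    # Uniform sampling without replacement (an assumption).
    negative = set(random.Random(seed).sample(unrated, n_neg))     # I_u^-
    return rated, positive, negative


rated, positive, negative = build_item_sets({1: 5, 2: 2, 3: 4}, m=10)
print(sorted(rated), sorted(positive), len(negative))
```

Whatever sampling rule is used, the constraint from the table, $\mathcal{I}_u^- \subseteq \mathcal{I} \setminus \mathcal{I}_u$, holds by construction because negatives are drawn only from the unrated items.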

  6. Introduction: Notations (2/3)
     Table: Some notations and their explanations (cont.).
     Notation | Explanation
     $p^+ = [p_i^+]_{1 \times m} \in [0, 1]^{1 \times m}$ | probabilities of the nodes in the (positive) visible layer
     $b^+ = [b_i^+]_{1 \times m} \in \mathbb{R}^{1 \times m}$ | biases of the nodes in the (positive) visible layer
     $p^- = [p_i^-]_{1 \times m} \in [0, 1]^{1 \times m}$ | probabilities of the nodes in the (negative) visible layer
     $b^- = [b_i^-]_{1 \times m} \in \mathbb{R}^{1 \times m}$ | biases of the nodes in the (negative) visible layer
     $p^h = [p_i^h]_{1 \times d} \in [0, 1]^{1 \times d}$ | probabilities of the nodes in the hidden layer
     $b^h = [b_i^h]_{1 \times d} \in \mathbb{R}^{1 \times d}$ | biases of the nodes in the hidden layer
     $W_{i\cdot}^+, W_{i\cdot}^- \in \mathbb{R}^{1 \times d}$ | weight between the $i$th node in the visible layer and the nodes in the hidden layer
     $\mathcal{C}_u$ | a set of items w.r.t. user $u$ in the condition layer (e.g., $\mathcal{C}_u = \mathcal{I}_u$ in this paper)
     $c_i$ | the value of the item $i$ in the condition layer (e.g., $c_i = 1$ in this paper)
     $C_{i\cdot} \in \mathbb{R}^{1 \times d}$ | weight between the $i$th node in the condition layer and the nodes in the hidden layer

  7. Introduction: Notations (3/3)
     Table: Some notations and their explanations (cont.).
     Notation | Explanation
     $d$ | number of nodes in the hidden layer
     $\rho$ | sampling ratio
     $L$ | number of steps in the contrastive divergence (CD) algorithm
     $T$ | iteration number

  8. Related Work: Top-$k$ Recommendation
     There are mainly three branches of recommendation methods:
     - Neighborhood-based methods such as user-oriented methods [Resnick et al., 1994] and item-oriented methods [Sarwar et al., 2001].
     - Model-based methods such as PMF [Mnih and Salakhutdinov, 2008] and BPR [Rendle et al., 2009].
     - Network-based methods such as CDAE [Wu et al., 2016] and RBM [Nguyen and Lauw, 2016].
     In this paper, we focus on a network-based approach, i.e., the restricted Boltzmann machine (RBM).

  9. Related Work: Restricted Boltzmann Machine
     - RBM and its extension, conditional RBM (CRBM), were first applied to recommendation problems based on users' explicit feedback [Salakhutdinov et al., 2007].
     - The Boltzmann machine (BM) was proposed for the task of rating prediction by exploiting the ordinal property of ratings, but it requires a longer training time.
     - Auto-Rec applies an autoencoder (AE) to model explicit feedback [Phung et al., 2009], where the form of the input is similar to that of RBM.
     In this paper, we propose to apply CRBM to the task of item recommendation with explicit feedback.

  10. Method: CRBM-IR: Illustration
      [Figure: illustration of the CRBM-IR model; image not included in this transcript.]

  11. Method: Probabilities (1/2)
      The probabilities of the nodes in the hidden layer are as follows,
      $$p^h = \sigma\Big(\sum_{i \in \mathcal{C}_u} c_i C_{i\cdot} + \sum_{i \in \mathcal{I}_u^+} p_i^+ W_{i\cdot}^+ + \sum_{i \in \mathcal{I}_u^-} p_i^- W_{i\cdot}^- + b^h\Big), \tag{1}$$
      where $\sigma(x) = \frac{1}{1 + \exp(-x)}$ is the sigmoid function, $c_i = 1$ is a constant, and $p_i^+$, $p_i^-$, $C_{i\cdot}$, $W_{i\cdot}^+$, $W_{i\cdot}^-$, $b^+$, $b^-$ and $b^h$ are to be learned from the data.
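
Eq. (1) can be sketched in a few lines of plain Python. The data layout (dictionaries keyed by item ID, each holding a length-$d$ list) and the names `hidden_probs`, `W_pos`, `W_neg` are illustrative assumptions; the slides do not prescribe an implementation, and the toy numbers below are made up.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(C, W_pos, W_neg, b_h, cond_items, p_pos, p_neg, c=1.0):
    """Hidden-layer probabilities p^h following Eq. (1).

    C, W_pos, W_neg: dict item ID -> list of d weights (rows C_i., W+_i., W-_i.).
    cond_items: items in the condition layer (C_u).
    p_pos, p_neg: dict item ID -> probability, keyed by I_u^+ and I_u^-.
    c: value of every condition node (c_i = 1 on the slide).
    """
    d = len(b_h)
    act = list(b_h)                       # start from the hidden biases b^h
    for i in cond_items:                  # sum over i in C_u of c_i * C_i.
        for k in range(d):
            act[k] += c * C[i][k]
    for i, p in p_pos.items():            # sum over i in I_u^+ of p+_i * W+_i.
        for k in range(d):
            act[k] += p * W_pos[i][k]
    for i, p in p_neg.items():            # sum over i in I_u^- of p-_i * W-_i.
        for k in range(d):
            act[k] += p * W_neg[i][k]
    return [sigmoid(a) for a in act]


# Toy example with d = 2 hidden nodes (all numbers hypothetical).
C = {1: [0.1, 0.2], 2: [0.3, -0.2]}
W_pos = {1: [1.0, 0.0]}
W_neg = {3: [0.0, 1.0]}
b_h = [-1.4, -0.5]
p_h = hidden_probs(C, W_pos, W_neg, b_h, [1, 2], {1: 1.0}, {3: 0.5})
print(p_h)
```

In this toy case the contributions cancel the biases in both hidden units, so both entries of `p_h` are approximately 0.5.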

  12. Method: Probabilities (2/2)
      The probability of the node $i \in \mathcal{I}_u^+$ in the positive layer is as follows,
      $$p_i^+ = \frac{\exp(\langle W_{i\cdot}^+, p^h \rangle + b_i^+)}{\exp(\langle W_{i\cdot}^+, p^h \rangle + b_i^+) + \exp(\langle W_{i\cdot}^-, p^h \rangle + b_i^-)}, \tag{2}$$
      which is usually called the prediction rule in recommender systems, since $p_i^+$ can be used for item ranking and recommendation.
      The probability of the node $i \in \mathcal{I}_u^-$ in the negative layer is as follows,
      $$p_i^- = \frac{\exp(\langle W_{i\cdot}^-, p^h \rangle + b_i^-)}{\exp(\langle W_{i\cdot}^+, p^h \rangle + b_i^+) + \exp(\langle W_{i\cdot}^-, p^h \rangle + b_i^-)}. \tag{3}$$
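
Eqs. (2) and (3) form a two-way softmax over the positive and negative activations of the same item, so the two probabilities sum to one. A minimal sketch follows; the function name `visible_probs` and the toy inputs are hypothetical, and subtracting the larger activation before exponentiating is a standard numerical-stability step added here, not something stated on the slide.

```python
import math


def visible_probs(w_pos_i, w_neg_i, b_pos_i, b_neg_i, p_h):
    """Return (p+_i, p-_i) for one item i following Eqs. (2) and (3).

    w_pos_i, w_neg_i: rows W+_i. and W-_i. as lists of length d.
    b_pos_i, b_neg_i: scalar biases b+_i and b-_i.
    p_h: hidden-layer probabilities, list of length d.
    """
    a_pos = sum(w * h for w, h in zip(w_pos_i, p_h)) + b_pos_i
    a_neg = sum(w * h for w, h in zip(w_neg_i, p_h)) + b_neg_i
    # Shifting both activations by the same constant leaves the ratio
    # unchanged and avoids overflow in exp().
    shift = max(a_pos, a_neg)
    e_pos = math.exp(a_pos - shift)
    e_neg = math.exp(a_neg - shift)
    z = e_pos + e_neg
    return e_pos / z, e_neg / z


# Toy example: the positive activation is larger, so p+_i exceeds 0.5.
p_pos_i, p_neg_i = visible_probs([1.0, 0.5], [0.2, 0.1], 0.0, 0.0, [0.6, 0.4])
print(p_pos_i, p_neg_i)
```

For recommendation, $p_i^+$ would be computed for every candidate item and the items sorted by that value, which is the sense in which Eq. (2) serves as the prediction rule.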

  13. Method: Energy Function (1/2)
      Similar to other RBM-based models, the negative energy of our CRBM-IR is as follows,
      $$-E(p^+, p^-, p^h \mid \Theta) = \sum_{i \in \mathcal{I}_u^+} b_i^+ p_i^+ + \sum_{i \in \mathcal{I}_u^-} b_i^- p_i^- + \Big\langle \sum_{i \in \mathcal{I}_u^+} p_i^+ W_{i\cdot}^+, p^h \Big\rangle + \Big\langle \sum_{i \in \mathcal{I}_u^-} p_i^- W_{i\cdot}^-, p^h \Big\rangle + \langle b^h, p^h \rangle + \Big\langle \sum_{i \in \mathcal{C}_u} C_{i\cdot} c_i, p^h \Big\rangle, \tag{4}$$
      from which we can derive the gradient with respect to each model parameter $W_{i\cdot}^+$, $W_{i\cdot}^-$, $C_{i\cdot}$, $b_i^+$, $b_i^-$ and $b^h$.

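  14. Method: Energy Function (2/2)
      $$\frac{\partial\, {-E}(p^+, p^-, p^h \mid \Theta)}{\partial W_{i\cdot}^+} = p_i^+ p^h \in \mathbb{R}^{1 \times d}, \tag{5}$$
      $$\frac{\partial\, {-E}(p^+, p^-, p^h \mid \Theta)}{\partial W_{i\cdot}^-} = p_i^- p^h \in \mathbb{R}^{1 \times d}, \tag{6}$$
      $$\frac{\partial\, {-E}(p^+, p^-, p^h \mid \Theta)}{\partial C_{i\cdot}} = c_i p^h \in \mathbb{R}^{1 \times d}, \tag{7}$$
      $$\frac{\partial\, {-E}(p^+, p^-, p^h \mid \Theta)}{\partial b_i^+} = p_i^+ \in \mathbb{R}, \tag{8}$$
      $$\frac{\partial\, {-E}(p^+, p^-, p^h \mid \Theta)}{\partial b_i^-} = p_i^- \in \mathbb{R}, \tag{9}$$
      $$\frac{\partial\, {-E}(p^+, p^-, p^h \mid \Theta)}{\partial b^h} = p^h \in \mathbb{R}^{1 \times d}. \tag{10}$$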
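
Because the negative energy in Eq. (4) is linear in every parameter, the gradients in Eqs. (5)-(10) can be checked numerically. The sketch below implements Eq. (4) and the six gradients, then compares Eq. (5) against a finite difference. The dictionary layout of `theta`, the function names, and all toy values are assumptions made for illustration only.

```python
def neg_energy(theta, cond_items, p_pos, p_neg, p_h, c=1.0):
    """Negative energy -E(p+, p-, p^h | Theta) following Eq. (4)."""
    val = sum(theta["b_pos"][i] * p for i, p in p_pos.items())
    val += sum(theta["b_neg"][i] * p for i, p in p_neg.items())
    for k, h in enumerate(p_h):
        s = theta["b_h"][k]
        s += sum(p * theta["W_pos"][i][k] for i, p in p_pos.items())
        s += sum(p * theta["W_neg"][i][k] for i, p in p_neg.items())
        s += sum(c * theta["C"][i][k] for i in cond_items)
        val += s * h                      # inner product with p^h
    return val


def gradients(cond_items, p_pos, p_neg, p_h, c=1.0):
    """Partial derivatives of -E with respect to each parameter."""
    return {
        "W_pos": {i: [p * h for h in p_h] for i, p in p_pos.items()},  # Eq. (5)
        "W_neg": {i: [p * h for h in p_h] for i, p in p_neg.items()},  # Eq. (6)
        "C": {i: [c * h for h in p_h] for i in cond_items},            # Eq. (7)
        "b_pos": dict(p_pos),                                          # Eq. (8)
        "b_neg": dict(p_neg),                                          # Eq. (9)
        "b_h": list(p_h),                                              # Eq. (10)
    }


# Toy instance with d = 2 (all numbers hypothetical).
theta = {
    "W_pos": {1: [0.2, -0.1]},
    "W_neg": {3: [0.05, 0.3]},
    "C": {1: [0.1, 0.1], 2: [-0.2, 0.4]},
    "b_pos": {1: 0.0},
    "b_neg": {3: 0.1},
    "b_h": [0.0, -0.3],
}
cond_items, p_pos, p_neg, p_h = [1, 2], {1: 0.8}, {3: 0.3}, [0.6, 0.4]
grads = gradients(cond_items, p_pos, p_neg, p_h)

# Finite-difference check of Eq. (5) on the single entry W+_{1,0}.
eps = 1e-6
base = neg_energy(theta, cond_items, p_pos, p_neg, p_h)
theta["W_pos"][1][0] += eps
bumped = neg_energy(theta, cond_items, p_pos, p_neg, p_h)
theta["W_pos"][1][0] -= eps
numeric = (bumped - base) / eps
print(numeric, grads["W_pos"][1][0])
```

The two printed values should agree closely, which confirms that Eq. (5) is the derivative of Eq. (4) with respect to $W_{i\cdot}^+$; the same check works for the other five parameters.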

  15. Method: Joint Probability
      With the energy function in Eq. (4), we have the joint probability w.r.t. the visible positive layer, the visible negative layer and the hidden layer in our CRBM-IR as follows,
      $$\mathrm{Prob}(p^+, p^-, p^h \mid \Theta) = \frac{\exp\big({-E}(p^+, p^-, p^h \mid \Theta)\big)}{\sum_{p^{+\prime}, p^{-\prime}, p^{h\prime}} \exp\big({-E}(p^{+\prime}, p^{-\prime}, p^{h\prime} \mid \Theta)\big)}, \tag{11}$$
      from which we have the marginal distribution w.r.t. the visible layers,
      $$\mathrm{Prob}(p^+, p^- \mid \Theta) = \frac{\sum_{p^{h\prime}} \exp\big({-E}(p^+, p^-, p^{h\prime} \mid \Theta)\big)}{\sum_{p^{+\prime}, p^{-\prime}, p^{h\prime}} \exp\big({-E}(p^{+\prime}, p^{-\prime}, p^{h\prime} \mid \Theta)\big)}. \tag{12}$$
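
The normalization in Eq. (11) and the marginalization in Eq. (12) can be made concrete by brute-force enumeration on a very small model. This sketch assumes binary unit states, merges the positive and negative visible layers into one visible vector, and uses a generic toy energy rather than Eq. (4); the names `joint_and_marginal` and `toy_neg_energy` and all numbers are hypothetical. Enumeration grows exponentially with the number of units, so it is only feasible for toy sizes.

```python
import itertools
import math


def joint_and_marginal(neg_energy, n_visible, n_hidden):
    """Enumerate all binary states to get the joint and visible marginal.

    neg_energy: callable (v, h) -> value of -E for that configuration.
    """
    weights = {}
    for state in itertools.product([0, 1], repeat=n_visible + n_hidden):
        v, h = state[:n_visible], state[n_visible:]
        weights[(v, h)] = math.exp(neg_energy(v, h))   # numerator of Eq. (11)
    z = sum(weights.values())                          # shared denominator
    joint = {s: w / z for s, w in weights.items()}     # Eq. (11)
    marginal = {}
    for (v, h), p in joint.items():                    # sum over h: Eq. (12)
        marginal[v] = marginal.get(v, 0.0) + p
    return joint, marginal


# Toy model: 2 visible units, 2 hidden units (numbers hypothetical).
W = [[0.5, -0.2], [0.1, 0.3]]
b_v = [0.0, -0.1]
b_h = [0.2, 0.0]


def toy_neg_energy(v, h):
    val = sum(b_v[i] * v[i] for i in range(2))
    val += sum(b_h[k] * h[k] for k in range(2))
    val += sum(v[i] * W[i][k] * h[k] for i in range(2) for k in range(2))
    return val


joint, marginal = joint_and_marginal(toy_neg_energy, 2, 2)
print(sum(joint.values()), sum(marginal.values()))
```

Both sums come out as 1 up to floating-point rounding, since the joint and the marginal share the same denominator.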
