EBCR: Empirical Bayes Concordance Rate to weight similarity measurement in collaborative filtering recommendations
Y. Du, LGI2P, IMT Mines Alès
S. Ranwez, LGI2P, IMT Mines Alès
V. Ranwez, AGAP, Montpellier SupAgro
N. Sutton-Charani, LGI2P, IMT Mines Alès
Collaborative filtering recommender systems
[Figure: users Alice, Rose and Bob; legend: "like movie"]
Memory-based collaborative filtering algorithm
Input: a user-item rating matrix $R$ of size $m \times n$, with users $U = \{u_1, \dots, u_m\}$ and items $I = \{i_1, \dots, i_n\}$ ("?" marks an unknown rating):
           i_1  i_2  …  i   …  i_n
  u_1     ( 5,   ?,  …,  1,  …,  2 )
  u_2     ( ?,   1,  …,  ?,  …,  ? )
  …
  u       ( 5,   2,  …,  ?,  …,  ? )
  …
  u_{m-1} ( ?,   ?,  …,  1,  …,  2 )
  u_m     ( 5,   2,  …,  2,  …,  4 )
Output: $\{\hat{r}_{ui} \mid u \in U,\ i \in I \text{ and } r_{ui} \text{ is unknown}\}$
Algorithm: weighted average of the ratings of $u$'s neighbors:
$$\hat{r}_{ui} = \frac{\sum_{v=1}^{k} r_{vi} \cdot sim(u, v)}{\sum_{v=1}^{k} sim(u, v)}$$
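The weighted average on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `predict_rating`, the toy data, the choice of the k most similar users who rated the item as "neighbors", and non-negative similarity values are all hypothetical and not taken from the slides.

```python
def predict_rating(ratings, similarity, user, item, k=2):
    """Weighted average of the ratings given to `item` by `user`'s neighbors.

    Assumption: the neighbors are the k most similar users who rated `item`,
    and similarities are non-negative.
    """
    # (similarity, rating) pairs for every other user who rated the item
    candidates = [
        (similarity(user, other), user_ratings[item])
        for other, user_ratings in ratings.items()
        if other != user and item in user_ratings
    ]
    neighbors = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    denominator = sum(sim for sim, _ in neighbors)
    if denominator == 0:
        return None  # no usable neighbor, so no prediction
    return sum(sim * rating for sim, rating in neighbors) / denominator


# Toy data (hypothetical): ratings keyed by user, then by item.
ratings = {
    "u": {"i1": 5},
    "v1": {"i1": 5, "i2": 4},
    "v2": {"i2": 2},
    "v3": {"i2": 1},
}
toy_sims = {("u", "v1"): 0.9, ("u", "v2"): 0.3, ("u", "v3"): 0.1}


def toy_similarity(a, b):
    # Stand-in for PCC, COS or MSD; just a lookup in the toy table.
    return toy_sims[(a, b)]


prediction = predict_rating(ratings, toy_similarity, "u", "i2", k=2)
```

With k = 2 the prediction uses v1 and v2 only, giving (0.9 * 4 + 0.3 * 2) / (0.9 + 0.3).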
Most commonly used similarity measures in CF approaches
PCC: Pearson Correlation Coefficient
- Linear correlation between two rating vectors
COS: Cosine similarity
- Cosine of the angle between the rating vectors u and v
MSD: Mean Square Distance
- Normalized distance between two vectors in a Euclidean space
What is the problem here?
PCC, COS, MSD: consider only the rating distributions of u and v restricted to their co-rated items, i.e. $I_{u,v}$.
         i1  i2  i3  i4  i5
  Alice   1   ∅   ∅   5   ∅
  Rose    1   2   3   ∅   1
  Bob     1   2   2   ∅   1
$I_{Rose,Alice} = \{i1\}$
$I_{Rose,Bob} = \{i1, i2, i3, i5\}$
Here is the problem!
PCC, COS, MSD: consider only the rating distributions of u and v restricted to their co-rated items, which ignores the number of co-rated items.
Why do we have to consider the number of co-rated items, i.e. $|I_{u,v}|$?
         i1  i2  i3  i4  i5
  Alice   1   ∅   ∅   5   ∅
  Rose    1   2   3   ∅   1
  Bob     1   2   2   ∅   1
PCC(R, A) = 1 > PCC(R, B) = 0.905
COS(R, A) = 1 > COS(R, B) = 0.9798
MSD(R, A) = 1 > MSD(R, B) = 0.8
These values are NOT reliable when $|I_{u,v}|$ is small, so they need to be adjusted.
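The figures on this slide can be checked with a short sketch. It rests on several assumptions that are not stated in the slides: the rating table is taken to be Alice = {i1: 1, i4: 5}, Rose = {i1: 1, i2: 2, i3: 3, i5: 1}, Bob = {i1: 1, i2: 2, i3: 2, i5: 1} (the arrangement that agrees with the co-rated sets and the reported values), PCC uses means computed over co-rated items only, and the MSD similarity is taken as 1 / (1 + mean squared difference). All function names are hypothetical.

```python
import math

# Assumed rating table (see lead-in); a missing key means "not rated".
ratings = {
    "Alice": {"i1": 1, "i4": 5},
    "Rose": {"i1": 1, "i2": 2, "i3": 3, "i5": 1},
    "Bob": {"i1": 1, "i2": 2, "i3": 2, "i5": 1},
}


def co_rated(u, v):
    """Items rated by both users, i.e. I_{u,v}."""
    return sorted(set(ratings[u]) & set(ratings[v]))


def vectors(u, v):
    """Rating vectors of u and v restricted to their co-rated items."""
    items = co_rated(u, v)
    return [ratings[u][i] for i in items], [ratings[v][i] for i in items]


def cos_sim(u, v):
    x, y = vectors(u, v)
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def pcc_sim(u, v):
    x, y = vectors(u, v)
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)) * math.sqrt(
        sum((b - mean_y) ** 2 for b in y)
    )
    # With a single co-rated item the correlation is 0/0; return NaN
    # rather than pick a convention.
    return num / den if den else float("nan")


def msd_sim(u, v):
    x, y = vectors(u, v)
    msd = sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
    return 1 / (1 + msd)  # assumed normalization


rose_bob = (pcc_sim("Rose", "Bob"), cos_sim("Rose", "Bob"), msd_sim("Rose", "Bob"))
rose_alice = (cos_sim("Rose", "Alice"), msd_sim("Rose", "Alice"))
```

Under these assumptions the Rose/Bob values come out at about 0.905, 0.9798 and 0.8, while Rose/Alice reaches the maximum score from a single co-rated item, which is exactly the reliability issue the slide points at.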
Proposed method: EBCR (Empirical Bayes Concordance Rate)
Discretize user ratings into three categories of taste, $T(u, i)$
Proposed method: EBCR (Empirical Bayes Concordance Rate)
CR: concordance rate of a given user pair
• Set $C_{u,v}$ of u and v's concordantly co-rated items: $i \in C_{u,v}$ if $T(u, i) = T(v, i)$
• CR of u and v: $CR_{u,v} = \frac{|C_{u,v}|}{|I_{u,v}|}$
  u: (1, 1, ?, 5, 4, 5, ?, ?, ..., 2)
  v: (?, 2, 5, ?, ?, 1, 1, ?, ..., 5)
  Co-rated items: u: (1, 5, 2), v: (2, 1, 5)
  Tastes: u: (dislike, like, dislike), v: (dislike, dislike, like)
  $|I_{u,v}| = 3$, $|C_{u,v}| = 1$, $CR_{u,v} = 1/3$
• Interpretation of CR: probability of two users having the same taste on an item
BUT, what if $I_{u,v}$ is small? A CR of 1 obtained from 1/1 is not equivalent to a CR of 1 obtained from 2000/2000.
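The concordance rate of the example pair can be reproduced with a minimal sketch. The discretization thresholds are an assumption: the slides only show that ratings 1 and 2 map to "dislike" and 5 maps to "like", so the cut-offs below and the "neutral" label for the third category are hypothetical, as are the function names. The elided middle of the two vectors ("...") is simply left out.

```python
def taste(rating):
    """Map a 1-5 rating to a taste category (assumed thresholds)."""
    if rating <= 2:
        return "dislike"
    if rating >= 4:
        return "like"
    return "neutral"  # hypothetical name for the third category


def concordance(u, v):
    """Return (|C_uv|, |I_uv|) for two rating vectors; None means unrated."""
    co_rated = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    concordant = sum(1 for a, b in co_rated if taste(a) == taste(b))
    return concordant, len(co_rated)


# The explicit entries of the slide's example vectors.
u = [1, 1, None, 5, 4, 5, None, None, 2]
v = [None, 2, 5, None, None, 1, 1, None, 5]

c_uv, i_uv = concordance(u, v)
cr_uv = c_uv / i_uv
```

Here `c_uv` is 1 and `i_uv` is 3, matching the CR of 1/3 on the slide.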
Here comes Empirical Bayes
1. Model the set of all CR values with a Beta prior distribution
2. Find $\alpha_0$ and $\beta_0$ that best fit the data, i.e. the set of CR values
3. Use the prior to adjust each CR value (expectation of the Beta distribution):
   $$CR_{u,v} = \frac{|C_{u,v}|}{|I_{u,v}|} \quad\longrightarrow\quad EBCR_{u,v} = \frac{|C_{u,v}| + \alpha_0}{|I_{u,v}| + \alpha_0 + \beta_0}$$
4. Use EBCR to weight the similarity measurement: $sim'(u, v) = sim(u, v) \cdot EBCR_{u,v}$
[Figure: Beta($\alpha_0$, $\beta_0$) distribution; taken from Google Image]
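The four steps can be sketched as follows. The slides do not say how $\alpha_0$ and $\beta_0$ are fitted, so the method-of-moments estimate used here is an assumption (a maximum-likelihood fit would be another option), and the function names and sample values are hypothetical.

```python
def fit_beta_moments(rates):
    """Estimate (alpha0, beta0) of a Beta prior by the method of moments.

    Assumption: the sample variance is positive and smaller than
    mean * (1 - mean); otherwise the estimates are not valid.
    """
    n = len(rates)
    mean = sum(rates) / n
    var = sum((r - mean) ** 2 for r in rates) / (n - 1)
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


def ebcr(concordant, co_rated, alpha0, beta0):
    """Shrink CR = concordant / co_rated toward the prior mean."""
    return (concordant + alpha0) / (co_rated + alpha0 + beta0)


def weighted_similarity(sim, concordant, co_rated, alpha0, beta0):
    """sim'(u, v) = sim(u, v) * EBCR_{u,v}."""
    return sim * ebcr(concordant, co_rated, alpha0, beta0)


# Hypothetical CR values observed over many user pairs.
alpha0, beta0 = fit_beta_moments([0.2, 0.4, 0.6, 0.8])

# A CR of 1 from one co-rated item is pulled strongly toward the prior mean,
# while a CR of 1 from 2000 co-rated items barely moves.
small_support = ebcr(1, 1, 2.0, 2.0)
large_support = ebcr(2000, 2000, 2.0, 2.0)
```

With a prior of Beta(2, 2), the pair with a single co-rated item gets an EBCR of 0.6, whereas the pair with 2000 co-rated items stays close to 1, so the similarity of the first pair is discounted much more.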
Evaluation and results
• Dataset: MovieLens-1M → 1 million movie ratings of 6,040 users on 3,900 items
• Evaluation metric: MAE (Mean Absolute Error)
• Evaluation protocol: 10-fold cross-validation
[Figure: results plot, annotated with a "better" direction marker]
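For reference, the metric and the fold split named above can be sketched as below. This is not the evaluation code behind the reported results; the helper names, the shuffling seed and the round-robin fold assignment are assumptions.

```python
import random


def mae(predicted, actual):
    """Mean Absolute Error between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def k_fold_indices(n, k=10, seed=0):
    """Split the indices 0..n-1 into k disjoint folds after shuffling."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[fold::k] for fold in range(k)]


example_error = mae([3, 4, 2], [4, 4, 1])
folds = k_fold_indices(25, k=10)
```

Each fold serves once as the test set while the remaining nine are used to compute similarities and predictions; the MAE is then averaged over the ten runs.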
Progress and perspectives
Timeline: Oct. 2018, Apr. 2019, June 2019, Nov. 2019, Oct. 2021 (1st year, 2nd year, 3rd year)
• 1. Literature on RS
• 2. Literature on knowledge-based RS
• 1st submission: paper for the IC 2019 conference, accepted in May
• 2nd submission: for the LFA 2019 conference
• Plan to submit an English version of EBCR to ECAI2019
• Proposition: model-based RS + semantic
• Combine ontology, knowledge graph, knowledge base and RS for recommendation diversity and explanation
Thank you for your attention
Formulas