Content-based recommendation systems
(based on Chapter 9 of Mining of Massive Datasets, by Rajaraman, Leskovec, and Ullman)

Fernando Lobo
Data Mining
Content-based Recommendation Systems

◮ Focus on properties of items.
◮ Similarity of items is determined by measuring the similarity in their properties.
Item profiles

◮ Need to construct a profile for each item.
◮ A profile is a collection of important characteristics of the item.
◮ Example for item = movie. The profile can be:
  ◮ set of actors
  ◮ director
  ◮ year the movie was made
  ◮ genre
Discovering features

◮ Features can be obvious and immediately available (as in the movie example).
◮ But many times they are not. Examples:
  ◮ document collections
  ◮ images
Discovering features of documents

◮ Documents can be news articles, blog posts, webpages, research papers, etc.
◮ Identify a set of words that characterize the topic of a document.
◮ Need a way to find the importance of a word in a document.
◮ We can pick the n most important words of a document as the set of words that characterize it.
Finding the importance of a word in a document

Common approach:

◮ Remove stop words — the most common words of a language, which tend to say nothing about the topic of a document (examples from English: the, and, of, but, ...)
◮ For the remaining words, compute their TF.IDF score
◮ TF.IDF stands for Term Frequency times Inverse Document Frequency
TF.IDF score

First compute the Term Frequency (TF):

◮ Given a collection of N documents.
◮ Let f_ij = number of times word i appears in document j.
◮ Then the term (word) frequency is TF_ij = f_ij / max_k f_kj
◮ That is, f_ij is normalized by dividing it by the maximum number of occurrences of any term in the same document (excluding stop words).
TF.IDF score

Then compute the Inverse Document Frequency (IDF):

◮ The IDF of a term (word) is defined as follows. Suppose word i appears in n_i of the N documents.
◮ Then IDF_i = lg(N / n_i)
◮ TF.IDF for term i in document j = TF_ij × IDF_i
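A minimal Python sketch of these definitions may help. The toy corpus, the whitespace tokenizer, and the tiny stop-word list are assumptions made for illustration, not taken from the slides:

```python
# TF.IDF sketch following the slides' formulas. Corpus, tokenizer, and
# stop-word list are illustrative assumptions.
import math
from collections import Counter

STOP_WORDS = {"the", "and", "of", "but", "a", "on", "in"}

def tokenize(text):
    return [w for w in text.lower().split() if w not in STOP_WORDS]

docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stock markets fell sharply in early trading",
]
tokenized = [tokenize(d) for d in docs]
N = len(tokenized)

# n_i: number of documents in which word i appears.
doc_freq = Counter()
for words in tokenized:
    doc_freq.update(set(words))

def tf_idf(j):
    """TF.IDF score of every word in document j."""
    counts = Counter(tokenized[j])              # f_ij
    max_f = max(counts.values())                # max_k f_kj
    return {w: (f / max_f) * math.log2(N / doc_freq[w])  # TF_ij * IDF_i
            for w, f in counts.items()}

print(tf_idf(0))  # e.g. 'cat' scores log2(3/2) ~ 0.58, since it is in 2 of 3 docs
```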
TF.IDF score example

◮ Suppose we have 2^20 = 1048576 documents, and word w appears in 2^10 = 1024 of them.
◮ Then IDF_w = lg(2^20 / 2^10) = 10
◮ Suppose that in document k, word w appears once and the maximum number of occurrences of any word in this document is 20. Then:
  ◮ TF_wk = 1/20
  ◮ TF.IDF for word w in document k is 1/20 × 10 = 1/2
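The slide's arithmetic, checked with the same formulas:

```python
# Verifying the worked example from the slide.
import math

N, n_w = 2**20, 2**10
idf_w = math.log2(N / n_w)   # lg(2^20 / 2^10) = 10
tf_wk = 1 / 20               # one occurrence, max count in doc k is 20
print(tf_wk * idf_w)         # 0.5
```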
Finding similar items

◮ Find similar items by using a distance measure.
◮ For documents, two popular distance measures are:
  ◮ Jaccard distance between sets of words
  ◮ cosine distance between sets, treated as vectors
Jaccard Similarity and Jaccard Distance of Sets

◮ The Jaccard similarity (SIM) of sets S and T is |S ∩ T| / |S ∪ T|
◮ Example: if S and T have 3 elements in common and 8 elements in total, then SIM(S, T) = 3/8
◮ The Jaccard distance of S and T is 1 − SIM(S, T)
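A short sketch of both measures; the two word sets are hypothetical, chosen so the numbers reproduce the 3/8 example above:

```python
# Jaccard similarity and distance over word sets (hypothetical sets).
def jaccard_similarity(s, t):
    return len(s & t) / len(s | t)

S = {"cat", "dog", "fish", "bird", "mouse"}
T = {"cat", "dog", "fish", "horse", "cow", "pig"}

sim = jaccard_similarity(S, T)   # intersection 3, union 8 -> 3/8
print(sim, 1 - sim)              # similarity 0.375, Jaccard distance 0.625
```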
Cosine Distance of Sets

◮ Compute the dot product of the sets (treated as vectors) and divide by the product of their Euclidean distances from the origin.
◮ Example: x = [1, 2, −1], y = [2, 1, 1]
  ◮ Dot product: x · y = 1·2 + 2·1 + (−1)·1 = 3
  ◮ Euclidean distance of x from the origin: √(1^2 + 2^2 + (−1)^2) = √6 (same for y)
  ◮ Cosine of the angle between x and y: 3 / (√6 · √6) = 3/6 = 1/2 (an angle of 60°)
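The same example, computed step by step in the sketch style used above:

```python
# The slide's cosine example, computed by hand.
import math

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))   # distance of x from the origin
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

x = [1, 2, -1]
y = [2, 1, 1]
print(cosine(x, y))   # 3 / (sqrt(6) * sqrt(6)) = 0.5, i.e. an angle of 60 degrees
```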
Sets of words as bit vectors

◮ Think of a set of words as a bit vector, with one bit position for each possible word.
◮ A position has 1 if the word is in the set, and 0 if not.
◮ For the dot product, we only need to take care of words that exist in both documents (0's don't affect the calculations); see the sketch below.
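With 0/1 vectors, the dot product reduces to the size of the word-set intersection and each norm to the square root of the set's size, so sparse sets are all we need. A small sketch of that shortcut (the example sets are made up):

```python
# Cosine for bit-vector sets: no explicit bit vector is ever built.
import math

def cosine_of_sets(s, t):
    # dot product = |s & t|; norm of a 0/1 vector = sqrt(set size)
    return len(s & t) / (math.sqrt(len(s)) * math.sqrt(len(t)))

print(cosine_of_sets({"cat", "dog", "fish"}, {"cat", "dog", "bird"}))   # 2/3
```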
User profiles

◮ A user profile is a weighted average of the profiles of the items the user has rated.
◮ Example: items = movies, represented by boolean profiles. The utility matrix has a 1 if the user has seen a movie and is blank otherwise.
◮ If 20% of the movies that user U likes have Julia Roberts as one of the actors, then the user profile for U will have 0.2 in the component for Julia Roberts (see the sketch below).
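In the boolean case the weighted average is a plain average. The movie data below is hypothetical, set up so Julia Roberts appears in 1 of 5 seen movies (20%):

```python
# Boolean utility case: the user profile is the average of the profiles
# of the movies the user has seen (hypothetical data).
from collections import Counter

seen = [                       # item profiles: set of actors per seen movie
    {"Julia Roberts", "Actor A"},
    {"Actor A", "Actor B"},
    {"Actor B"},
    {"Actor C"},
    {"Actor A", "Actor C"},
]
counts = Counter()
for actors in seen:
    counts.update(actors)

profile = {actor: c / len(seen) for actor, c in counts.items()}
print(profile["Julia Roberts"])   # 0.2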
User profiles

◮ If the utility matrix is not boolean, e.g., ratings 1–5, then weight the vectors by the utility value and normalize by subtracting the user's average rating.
◮ This way we get negative weights for items with below-average ratings, and positive weights for items with above-average ratings.
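A sketch of the ratings case, with made-up numbers: subtract the user's mean rating, then average the item profiles weighted by the centered ratings:

```python
# Ratings case: mean-center the ratings, then take a weighted average
# of the item profiles (hypothetical ratings and profiles).
ratings = {"m1": 5, "m2": 1, "m3": 3}        # this user's 1-5 star ratings
profiles = {                                  # boolean item profiles
    "m1": {"Julia Roberts": 1},
    "m2": {"Actor B": 1},
    "m3": {"Julia Roberts": 1, "Actor B": 1},
}

mean = sum(ratings.values()) / len(ratings)   # 3.0
user_profile = {}
for movie, r in ratings.items():
    for actor, v in profiles[movie].items():
        user_profile[actor] = user_profile.get(actor, 0) + (r - mean) * v
user_profile = {a: w / len(ratings) for a, w in user_profile.items()}

print(user_profile)   # Julia Roberts ~ 0.67 (liked), Actor B ~ -0.67 (disliked)
```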
Recommending items to users based on content

◮ Compute the cosine distance between the user's vector and each item's vector.
◮ Movie example:
  ◮ the highest recommendations (lowest cosine distance, i.e., highest cosine similarity) go to movies with many actors that appear in many of the movies the user likes.
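Putting the pieces together, a sketch that ranks unseen movies by cosine similarity to the user profile; the vectors are over the hypothetical (Julia Roberts, Actor B) components from the previous sketches:

```python
# Rank candidate movies by cosine similarity to the user profile.
import math

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

user = [0.67, -0.67]                        # weights from the ratings sketch
candidates = {"m4": [1, 0], "m5": [0, 1]}   # boolean item profiles, unseen movies
ranked = sorted(candidates, key=lambda m: cosine(user, candidates[m]), reverse=True)
print(ranked)   # ['m4', 'm5'] -- the Julia Roberts movie is recommended first
```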