Combining Content with User Preferences for TED Lecture Recommendations
CBMI 2013
Nikolaos Pappas, Andrei Popescu-Belis
inEvent project (https://www.inevent-project.eu)
Idiap Research Institute, Martigny, Switzerland
Wednesday 19th June, 2013
Motivation

Recommender systems are information filtering systems that seek to predict ratings (preferences) for items that might be of interest to a user.
- divided into content-based (CB), collaborative filtering (CF) and hybrid approaches
- plenty of data is available in certain domains (movies, music, etc.)
- far less is available for multimedia lecture content (e.g. VideoLectures)

Questions on multimedia recommendations:
→ How to perform quantitative experiments with 'objective' measures?
→ Which data to use for evaluation?
→ How important is content vs. collaborative information?
Summary

Recommendation methods for scientific talks:
1. studying the merits of CB and CF methods over TED talks
2. evaluating in two different scenarios: cold-start and non-cold-start (absence or presence of collaborative information)

Main contributions:
→ Introduction of the TED dataset for multimedia recommendations
→ Definition of evaluation tasks over TED
→ Combination of content features with user preferences
→ First benchmark scores on this promising dataset
1. The TED collection
2. Recommendation algorithms
3. Experiments
4. Conclusions
The TED collection

TED is an online repository of lectures (ted.com) which contains:
- audiovisual recordings of talks with extended metadata
- user-contributed material (comments, favorites)

Attribute         Total      Per Talk (Avg / Std)    Per Active User (Avg / Std)
Talks             1,149      -                       -
Speakers          961        -                       -
Users             69,023     -                       -
Active Users      10,962     -                       -
Tags              300        5.83 / 2.11             -
Themes            48         2.88 / 1.06             -
Related Videos    3,002      2.62 / 0.74             -
Transcripts       1,102      0.95 / 0.19             -
Favorites         108,476    94.82 / 114.54          9.89 / 20.52
Comments          201,934    176.36 / 383.87         4.87 / 23.42

We crawled (April 2012), formatted and distributed the TED metadata, in agreement with TED:
https://www.idiap.ch/dataset/ted/
Ground truth

Typical problem: given a rating matrix R of size |U| × |I|, where R_ui is user u's explicit rating of item i, the goal is to predict the missing ratings in R.
- Categorical ratings (e.g. good, bad)
- Numerical ratings (e.g. 1 to 5 stars)
- Unary or binary ratings (e.g. favorites, or like/dislike)

On the TED dataset we deal with unary ratings derived from user favorites:

R_{u,i} = \begin{pmatrix} r_{1,1} & r_{1,2} & \cdots & r_{1,n} \\ r_{2,1} & r_{2,2} & \cdots & r_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m,1} & r_{m,2} & \cdots & r_{m,n} \end{pmatrix}, \quad \text{e.g.} \quad \begin{pmatrix} 1 & 1 & ? & ? \\ ? & ? & ? & 1 \\ 1 & 1 & ? & ? \\ 1 & ? & 1 & ? \end{pmatrix}

→ uncertainty about the negative class (one-class problem)
→ related/similar talks are available (assigned by the TED editorial staff)
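A minimal sketch of building such a unary rating matrix from favorites, assuming they come as (user, talk) pairs; the variable names and toy data below are illustrative, not taken from the TED release:

```python
# Build the unary rating matrix R from user favorites (1 = favorited).
# Missing entries stay absent: they are the ambiguous "?" of the one-class setting.
import numpy as np
from scipy.sparse import csr_matrix

favorites = [("u1", "t1"), ("u1", "t2"), ("u2", "t4"),
             ("u3", "t1"), ("u3", "t2"), ("u4", "t1"), ("u4", "t3")]

users = sorted({u for u, _ in favorites})
talks = sorted({t for _, t in favorites})
u_idx = {u: i for i, u in enumerate(users)}
t_idx = {t: j for j, t in enumerate(talks)}

rows = [u_idx[u] for u, _ in favorites]
cols = [t_idx[t] for _, t in favorites]
data = np.ones(len(favorites))

# R is |U| x |I|, stored sparsely since almost all entries are unknown
R = csr_matrix((data, (rows, cols)), shape=(len(users), len(talks)))
print(R.toarray())
```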
Recommendation tasks

1. Personalized recommendation task
   Ground truth: user favorites (binary values), namely "1" for action and "0" or "?" for inaction (not seen, or seen but not liked).
   → Predict the N most interesting items for each user (top-N)

2. Generic recommendation task
   Ground truth: related talks per talk, assigned by the TED editorial staff.
   → Predict the N items most similar to a given one (top-N)

How to evaluate? As a top-N ranking problem: train a recommender (ranker) on fragments of each user's history and evaluate performance on the held-out fragments.
→ for each user, all items have to be ordered by a scoring function
→ information retrieval metrics capture the performance (P, R, F1)
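A small sketch of the top-N metrics used here (P@N, R@N, F1@N); the ranked list and held-out favorites in the example are illustrative inputs, not tied to the TED data format:

```python
# Precision, recall and F1 at cut-off N for one user's ranked recommendations.
def precision_recall_f1_at_n(ranked, relevant, n=5):
    top_n = ranked[:n]
    hits = len(set(top_n) & set(relevant))      # held-out favorites found in the top N
    precision = hits / n
    recall = hits / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1

# Example: items ordered by the scoring function vs. this user's held-out favorites
print(precision_recall_f1_at_n(["t3", "t7", "t1", "t9", "t2"], ["t1", "t2", "t5"]))
```

Per-user scores are then averaged over all users (and over cross-validation folds) to obtain the reported P@5, R@5 and F@5.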
Comparison with other collections

Features considered: Basic (title, description), Sp. (speaker), Trans. (transcript), Tags (categories as keywords), Impl. (implicit feedback, e.g. comments or views), Expl. (explicit feedback, e.g. ratings, favorites or bookmarks), CC (Creative Commons Non-Commercial license).

Collection       Features covered (out of the 7 above)
VideoLectures    4
KhanAcademy      3
YouTube EDU      4
DailyMotion      3
TED              all 7 (Basic, Sp., Trans., Tags, Impl., Expl., CC)
1. The TED collection
2. Recommendation algorithms
3. Experiments
4. Conclusions
Representations of TED talks

Each talk t_j ∈ I is represented as a feature vector t_j = (w_1, w_2, ..., w_{|V|}), where each position i corresponds to a word w_i of the vocabulary V.

Pre-processing: I → tokenization → stop-word removal → stemming → V

Semantic vector space models
Dimensionality reduction (LSI and RP), topic modeling (LDA) and concept spaces built with external knowledge (ESA), compared against a TF-IDF baseline:
- diminish the curse-of-dimensionality effect
- proximity is interpreted as semantic relatedness
We compare their effectiveness on the recommendation task.
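A sketch of a keyword-based (TF-IDF) vs. reduced (LSI) talk representation, assuming scikit-learn; the toy talk descriptions and the LSI dimensionality below are illustrative stand-ins, the actual features and dimensionality are tuned on the TED metadata:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

talks = [
    "How schools kill creativity, a talk on education and creativity",
    "The surprising science of motivation in the workplace",
    "Do schools prepare students for creativity and innovation?",
    "Underwater astonishments: creatures of the deep ocean",
]

# TF-IDF baseline: stop-word removal is built in; stemming would be a separate step
tfidf = TfidfVectorizer(stop_words="english")
X_tfidf = tfidf.fit_transform(talks)

# LSI: project the TF-IDF space onto a low-dimensional semantic space
lsi = TruncatedSVD(n_components=2, random_state=0)
X_lsi = lsi.fit_transform(X_tfidf)

# Content similarity s_ij between talks, in either space
print(cosine_similarity(X_tfidf)[0])   # keyword-based proximity
print(cosine_similarity(X_lsi)[0])     # semantic proximity
```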
Recommendation algorithms

Three types of nearest-neighbor (NN) models for a given user u and talk i:

Content-based:
  \hat{r}_{ui} = \sum_{j \in D^k(u;i)} s_{ij}                                    (1)

Collaborative filtering:
  \hat{r}_{ui} = b_{ui} + \sum_{j \in D^k(u;i)} d_{ij} (r_{uj} - b_{uj})          (2)
  b_{ui} = \mu + b_u + b_i                                                        (3)

Combined:
  \hat{r}_{ui} = b_{ui} + \sum_{j \in D^k(u;i)} s_{ij} (r_{uj} - b_{uj})          (4)

d_{ij}: collaborative similarity of two items, computed on the co-rating matrix.
s_{ij}: content similarity of two items in the given vector space.
D^k(u;i): the k items rated by user u that are most similar to item i.
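A sketch of the combined prediction (eq. 4) under the definitions above: a baseline estimate plus content-weighted deviations over the k most content-similar items the user has rated. The dense toy matrices and the crude baseline estimates are illustrative, not the paper's exact training procedure:

```python
import numpy as np

def predict_combined(R, S, mu, b_user, b_item, u, i, k=3):
    """Predict r_hat(u, i) = b_ui + sum over D^k(u;i) of s_ij * (r_uj - b_uj)."""
    rated = np.flatnonzero(R[u])                  # items rated (favorited) by user u
    rated = rated[rated != i]
    if rated.size == 0:
        return mu + b_user[u] + b_item[i]
    # D^k(u; i): the k rated items most content-similar to item i
    neighbors = rated[np.argsort(S[i, rated])[::-1][:k]]
    b_ui = mu + b_user[u] + b_item[i]
    b_uj = mu + b_user[u] + b_item[neighbors]
    return b_ui + np.dot(S[i, neighbors], R[u, neighbors] - b_uj)

# Toy example: 3 users x 4 talks, symmetric random content similarities
rng = np.random.default_rng(0)
R = np.array([[1, 1, 0, 0], [0, 1, 0, 1], [1, 0, 1, 0]], dtype=float)
S = rng.random((4, 4))
S = (S + S.T) / 2
np.fill_diagonal(S, 1.0)
mu = R.mean()
b_user = R.mean(axis=1) - mu
b_item = R.mean(axis=0) - mu
print(predict_combined(R, S, mu, b_user, b_item, u=0, i=2))
```

Swapping S for the collaborative similarity matrix d recovers the CF model (eq. 2), and dropping the baseline and deviations recovers the content-based model (eq. 1).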
1. The TED collection
2. Recommendation algorithms
3. Experiments
4. Conclusions
Parameter and feature selection

→ Parameters fixed for all NN models (k = 3, λ = 100)
→ Parameters of the VSMs optimized (dimensionality t for LSI, RP and LDA, and priors α, β for LDA)
→ Features are the words extracted from the metadata

Method          Optimal features                                P@5    R@5    F@5
LDA (t=200)     Title, desc., TED event, speaker (tide.tesp)    1.63   1.96   1.78
TF-IDF          Title (ti)                                      1.70   2.00   1.83
RP (t=5000)     Description (de)                                1.83   2.25   2.01
LSI (t=3000)    Title (ti)                                      1.86   2.27   2.04
ESA             Title, description (tide)                       2.79   3.46   3.08

Table: CB performance (%) with 5-fold cross-validation on the training set.
Feature ranking

Figure: ranking of the metadata features by average F@5 over all methods, with cross-validation.
Experiments on held-out data

1. Semantic spaces outperform keyword-based ones within CB methods
2. Combined methods achieve reasonable performance compared to CF ones, and are applicable in both settings with good performance
Conclusions

- New dataset for lecture recommendation evaluation (ground truth and rich content)
- Two recommendation benchmarks
- First experiments on personalized TED lecture recommendations
- We proposed combining semantic spaces with CF methods:
  → they perform well in cold-start settings and can be used reasonably well in non-cold-start settings
  → they are applicable to multimedia datasets, where new items are inserted frequently (cold-start)
End of presentation

Thank you! Any questions?