POLAR: Attention-based CNN for One-shot Personalized Article Recommendation

Zhengxiao Du, Jie Tang, Yuhui Ding
Tsinghua University
{duzx16, dingyh15}@mails.tsinghua.edu.cn, jietang@tsinghua.edu.cn

September 13, 2018
Motivation

The publication output is growing every year (data source: DBLP).

Figure: Number of publications per year, 1997-2016, by type (books and theses, conference and workshop papers, editorship, informal publications, journal articles, parts in books or collections, reference works).
Related-Article Recommendation

Figure: An example of related-article recommendation from AMiner.org.
Challenges

- How to provide both personalized and non-personalized recommendations?
- How to overcome the sparsity of user feedback?
- How to utilize the representative texts of articles effectively?
Problem Definition

Definition (One-shot Personalized Article Recommendation Problem)

Input:
- query article $d_q$
- candidate set $D = \{d_1, d_2, \cdots, d_N\}$
- support set $S = \{(\hat{d}_i, \hat{y}_i)\}_{i=1}^{T}$ related to user $u$

Output: a totally ordered set $R(d_q, S) \subset D$ with $|R| = k$
One-shot Learning

Image Classification [1]:

$$\hat{y} = \sum_{i=1}^{k} a(\hat{x}, x_i)\, y_i$$

Article Recommendation: given the query article $d_q$ and the support set $\{(\hat{d}_i, \hat{y}_i)\}_{i=1}^{T}$, the score of candidate $d_i$ is

$$s_i = \underbrace{c(d_q, d_i)}_{\text{matching to the query article}} + \underbrace{\frac{1}{T}\sum_{j=1}^{T} c(\hat{d}_j, d_i)\, \hat{y}_j}_{\text{matching to the user preference (maybe missing)}}$$

[1] Vinyals et al., Matching Networks for One Shot Learning.
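A minimal sketch of this one-shot scoring rule in Python, assuming a pairwise matching function $c(\cdot,\cdot)$ is already available; the function and variable names below are illustrative, not from the paper:

import numpy as np

def one_shot_score(c_query, c_support, y_support):
    # c_query:   matching score c(d_q, d_i) between the query and one candidate
    # c_support: matching scores c(d_hat_j, d_i) against the T support articles
    # y_support: user feedback y_hat_j for the T support articles
    # Score = matching to the query + average feedback-weighted matching
    # to the user's support set.
    return c_query + np.mean(np.asarray(c_support) * np.asarray(y_support))

def recommend(c_query_all, c_support_all, y_support, k):
    # Illustrative usage: rank all candidates and keep the indices of the top k.
    scores = [one_shot_score(cq, cs, y_support)
              for cq, cs in zip(c_query_all, c_support_all)]
    return np.argsort(scores)[::-1][:k]

Ranking all candidates by $s_i$ and keeping the top $k$ yields the output set $R(d_q, S)$; when the user feedback is missing, the second term is dropped and the score reduces to the non-personalized match $c(d_q, d_i)$.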
Architecture

Figure: Overview of the POLAR architecture.
- Embedding layer: maps the query words $w_{q1}, w_{q2}, \cdots, w_{ql_q}$ and the candidate words $w_{d1}, w_{d2}, \cdots, w_{dl_d}$ to $k$-dimensional vectors.
- A matching matrix and an attention matrix are built from the two word sequences and combined as the conv input of a CNN (convolution and max-pooling layers, feature maps, hidden state, fully-connected layer), which outputs the matching score.
- One-shot matching against the support set $\{(\hat{d}_i, \hat{y}_i)\}_{i=1}^{T}$ turns the matching score into the personalized (final) score.
Matching Matrix and Attention Matrix

Matching Matrix: $(d_m, d_n) \to \mathbb{R}^{l_m \times l_n}$, the similarity between the words of the two articles.

$$M^{(m,n)}_{i,j} = \frac{\vec{w}_{mi}^{\top} \cdot \vec{w}_{nj}}{\|\vec{w}_{mi}\| \cdot \|\vec{w}_{nj}\|}$$

Attention Matrix: $(d_m, d_n) \to \mathbb{R}^{l_m \times l_n}$, the importance of the matching signals.

$$A^{(m,n)}_{i,j} = r_{mi} \cdot r_{nj}$$
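Both matrices follow directly from the word embeddings and word weights. A minimal sketch under these definitions (array names and shapes are illustrative assumptions):

import numpy as np

def matching_matrix(W_m, W_n):
    # W_m: (l_m, k) word embeddings of d_m; W_n: (l_n, k) word embeddings of d_n.
    # Entry (i, j) is the cosine similarity between word i of d_m and word j of d_n.
    W_m = W_m / np.linalg.norm(W_m, axis=1, keepdims=True)
    W_n = W_n / np.linalg.norm(W_n, axis=1, keepdims=True)
    return W_m @ W_n.T                      # shape (l_m, l_n)

def attention_matrix(r_m, r_n):
    # r_m: (l_m,) word weights of d_m; r_n: (l_n,) word weights of d_n.
    # Entry (i, j) is the product r_mi * r_nj.
    return np.outer(r_m, r_n)               # shape (l_m, l_n)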
Local Weight and Global Weight

The word weight $r_t$ is the product of its local weight and its global weight.

Global Weight: the importance of a word in the corpus (shared among different articles),

$$\upsilon_{ij} = [\mathrm{IDF}(t_{ij})]^{\beta}$$

The local weight is a little more complicated...
Local Weight

Local Weight: the importance of a word in the article. A neural network is employed to compute the local weight from the feature vector

$$\vec{x}_{ij} = \vec{w}_{ij} - \bar{\vec{w}}_i$$

Figure: The triangular points denote the vectors of the words in two texts, the circular points denote the mean vectors of the texts, and the lines with arrows denote the feature vectors.
Local Weight Network

The feature vector $\vec{x}_{ij}$ represents the semantic difference between the article and the term. Let $\vec{u}^{(L)}_{ij}$ be the output of the last linear layer; the output of the local weight network is

$$\mu_{ij} = \sigma(W^{(L)} \cdot \vec{u}^{(L)}_{ij} + b^{(L)}) + \alpha$$

where $\alpha$ sets a lower bound for the local weights.
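A minimal sketch of the word-weight computation under these definitions; the hidden size, depth, and the default values of alpha and beta below are assumptions, not the paper's configuration:

import torch
import torch.nn as nn

class LocalWeightNet(nn.Module):
    # Sketch of the local weight network; layer sizes are illustrative.
    def __init__(self, emb_dim, hidden_dim=64, alpha=0.1):
        super().__init__()
        self.alpha = alpha                                   # lower bound for local weights
        self.hidden = nn.Sequential(nn.Linear(emb_dim, hidden_dim), nn.ReLU())
        self.out = nn.Linear(hidden_dim, 1)                  # W^(L), b^(L)

    def forward(self, word_vecs):                            # word_vecs: (l_i, emb_dim)
        # Feature vectors: word embeddings minus the article's mean embedding.
        x = word_vecs - word_vecs.mean(dim=0, keepdim=True)
        u = self.hidden(x)
        return torch.sigmoid(self.out(u)).squeeze(-1) + self.alpha

def word_weights(local_net, word_vecs, idf, beta=1.0):
    # Final weight r = local weight * global weight (IDF raised to the power beta).
    return local_net(word_vecs) * (idf ** beta)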
CNN & Training

The matching matrix and the attention matrix are combined by element-wise multiplication and fed to a CNN.

Figure: The combined matrix is the conv input of convolution and max-pooling layers; the resulting feature maps form a hidden state that is passed through a fully-connected layer to produce the matching score.

The entire model, including the local weight network, is trained on the target task.
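A minimal sketch of such a matching CNN; the channel count, kernel size, and fixed pooling size are assumptions rather than the paper's exact settings:

import torch
import torch.nn as nn

class MatchingCNN(nn.Module):
    # Sketch of the CNN that turns the combined matrix into a matching score.
    def __init__(self, channels=16, pooled=8):
        super().__init__()
        self.conv = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool2d(pooled)      # handles variable l_m x l_n sizes
        self.fc = nn.Linear(channels * pooled * pooled, 1)

    def forward(self, matching, attention):           # both: (l_m, l_n)
        # Combine the matching and attention matrices by element-wise multiplication.
        x = (matching * attention).unsqueeze(0).unsqueeze(0)   # (1, 1, l_m, l_n)
        h = torch.relu(self.conv(x))                  # convolution
        h = self.pool(h).flatten(1)                   # max-pooling, flatten to hidden state
        return self.fc(h).squeeze()                   # matching score c(d_m, d_n)

Because the local weight network feeds into the attention matrix, training the model end to end on the target task updates it jointly with the CNN.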
Dataset

- AMiner: papers from ArnetMiner [1]
- Patent: patent documents from the USPTO
- RARD (Related-Article Recommendation Dataset) [2]: from Sowiport, a digital library service provider

[1] Tang et al. ArnetMiner: Extraction and Mining of Academic Social Networks. In SIGKDD'2008.
[2] Beel et al. RARD: The Related-Article Recommendation Dataset. 2017.