Bayesian Personalized Feature Interaction Selection for Factorization Machines

Yifan Chen (1,2), Pengjie Ren (1), Yang Wang (3), Maarten de Rijke (1)
(1) University of Amsterdam; (2) National University of Defense Technology; (3) Hefei University of Technology

SIGIR 2019
Introduction
  Factorization Machines
  Feature Interaction Selection
Factorization Machines

What is a Factorization Machine?
◮ A generic supervised learning method
◮ Accounts for feature interactions with factored parameters
◮ A feature interaction is a combination of features

#Hashtag      Feature combinations
"comics"      ("comics", "marvel")
"marvel"      ("comics", "avengers")
"avengers"    ("marvel", "avengers")
Factorization Machines

◮ Linear regression, $O(d)$:
\[ \hat{r}(x) = b_0 + \sum_{i=1}^{d} w_i x_i \]
◮ Degree-2 polynomial regression, $O(d^2)$:
\[ \hat{r}(x) = b_0 + \sum_{i=1}^{d} w_i x_i + \sum_{i=1}^{d} \sum_{j=i+1}^{d} w_{ij} \cdot x_i x_j \]
◮ Factorization machine, $O(dk)$:
\[ \hat{r}(x) = b_0 + \sum_{i=1}^{d} w_i x_i + \sum_{i=1}^{d} \sum_{j=i+1}^{d} \langle v_i, v_j \rangle \cdot x_i x_j \]
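The $O(dk)$ cost of the factorization machine can be illustrated with a small sketch. It evaluates the FM equation twice: once directly over all feature pairs, and once using the well-known identity $\sum_{i<j} \langle v_i, v_j \rangle x_i x_j = \frac{1}{2} \sum_{f=1}^{k} \big[ (\sum_i v_{if} x_i)^2 - \sum_i (v_{if} x_i)^2 \big]$. The function names and toy values below are hypothetical and not taken from the slides.

```python
from itertools import combinations


def fm_predict_naive(x, b0, w, V):
    """Evaluate the FM equation by looping over all pairs i < j: O(d^2 k)."""
    d = len(x)
    linear = sum(w[i] * x[i] for i in range(d))
    pairwise = sum(
        sum(a * b for a, b in zip(V[i], V[j])) * x[i] * x[j]
        for i, j in combinations(range(d), 2)
    )
    return b0 + linear + pairwise


def fm_predict_fast(x, b0, w, V):
    """Same value in O(dk): per factor f, use
    0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)."""
    d, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(d))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(d))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(d))
        pairwise += 0.5 * (s * s - s_sq)
    return b0 + linear + pairwise


# Toy values, chosen only for illustration (d = 4 features, k = 2 factors).
x = [1.0, 0.0, 1.0, 1.0]
b0 = 0.1
w = [0.2, -0.3, 0.5, 0.0]
V = [[0.1, 0.4], [0.3, -0.2], [-0.5, 0.6], [0.7, 0.1]]

naive = fm_predict_naive(x, b0, w, V)
fast = fm_predict_fast(x, b0, w, V)
```

Both functions return the same prediction up to floating-point rounding; only the second avoids the explicit double sum over feature pairs.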
Factorization Machines

Example
\[
\hat{r}(\text{spider-man}) = b_0 + w_{\text{comics}} + w_{\text{marvel}} + w_{\text{avengers}}
  + \langle v_{\text{comics}}, v_{\text{marvel}} \rangle
  + \langle v_{\text{comics}}, v_{\text{avengers}} \rangle
  + \langle v_{\text{marvel}}, v_{\text{avengers}} \rangle
\]
The hashtags "comics", "marvel", and "avengers" give the feature combinations ("comics", "marvel"), ("comics", "avengers"), and ("marvel", "avengers").
Factorization Machines for Recommendation

◮ Effective use of historical interactions between users and items
◮ Incorporate additional information associated with users or items
◮ High-dimensional feature space
  ◮ #features = #users + #items + #additional
  ◮ Not all features or feature interactions are helpful
Feature Interaction Selection (FIS)

Filter out useless feature interactions
◮ FIS: select a common set of interactions
◮ P-FIS: select feature interactions for users personally

[Figure: FIS for u1 and u2. From the features x1, ..., x4 and their six pairwise interactions, a single subset (x1·x3, x1·x4, x2·x4) is selected for both users and passed to the FM.]
Feature Interaction Selection (FIS), continued

[Figure: P-FIS for u1 and P-FIS for u2. From the features x1, ..., x4 and their six pairwise interactions, a separate subset is selected for each user and passed to the FM.]
Introduction
  Factorization Machines
  Feature Interaction Selection
Model description
  Bayesian personalized feature interaction selection
  Efficient optimization
Personalized Factorization Machines (PFM)

FM
\[ \hat{r}(x) = b_0 + \sum_{i=1}^{d} w_i x_i + \sum_{i=1}^{d} \sum_{j=i+1}^{d} w_{ij} \cdot x_i x_j \]
PFM
\[ \hat{r}(x) = b_u + \sum_{i=1}^{d} w_{ui} x_i + \sum_{i=1}^{d} \sum_{j=i+1}^{d} w_{uij} \cdot x_i x_j \]
Select 1st-order interactions $\{x_i\}$ and 2nd-order interactions $\{x_i x_j\}$ by $\{w_{ui}\}$ and $\{w_{uij}\}$
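A minimal sketch of how the PFM equation could be evaluated for one user is given here. The storage layout (a list for $w_{ui}$, a dictionary keyed by feature pairs for $w_{uij}$) and all numbers are assumptions made for illustration; a weight of zero, or a missing pair, means that interaction is not selected for that user.

```python
def pfm_predict(x, b_u, w_u, W_u):
    """Evaluate the PFM equation for a single user.

    w_u[i]      -- first-order weight w_ui
    W_u[(i, j)] -- second-order weight w_uij for i < j;
                   pairs absent from the dict contribute nothing
    """
    first_order = sum(w_u[i] * x[i] for i in range(len(x)))
    second_order = sum(w * x[i] * x[j] for (i, j), w in W_u.items())
    return b_u + first_order + second_order


# Hypothetical users with different selected interactions.
x = [1.0, 1.0, 1.0]
user_1 = dict(b_u=0.0, w_u=[0.5, 0.0, 0.2], W_u={(0, 2): 0.3})
user_2 = dict(b_u=0.1, w_u=[0.0, 0.4, 0.2], W_u={(1, 2): -0.1})

r_1 = pfm_predict(x, **user_1)
r_2 = pfm_predict(x, **user_2)
```

The same feature vector yields different predictions for the two users because each user keeps a different set of nonzero weights.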
Bayesian Variable Selection (BVS)

◮ Apply BVS to select feature interactions
  ◮ avoid expensive cross-validation
◮ Priors for BVS
  ◮ sparsity priors
  ◮ spike-and-slab
Bayesian Variable Selection

Spike-and-slab
◮ Spike (black arrow): $p(w = 0) = 0.5$
◮ Slab (blue line)

Sparsity priors
◮ $f(w) = \frac{1}{2b} \exp\left( -\frac{|w - \mu|}{b} \right)$
◮ $p(w = 0) = 0$
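The contrast between the two priors can be checked empirically with a small sampling sketch. It assumes a standard normal slab (the slide does not specify the slab on this page) and a Laplace sparsity prior with $\mu = 0$, $b = 1$; the function names are hypothetical.

```python
import random


def sample_spike_and_slab(rng, pi=0.5):
    """Draw w = s * w_tilde with s ~ Bernoulli(pi); the slab is assumed N(0, 1)."""
    s = 1 if rng.random() < pi else 0
    return s * rng.gauss(0.0, 1.0)


def sample_laplace(rng, mu=0.0, b=1.0):
    """Draw from Laplace(mu, b) as the difference of two Exp(1/b) variables."""
    return mu + rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)


rng = random.Random(0)
n = 10_000
spike_slab_draws = [sample_spike_and_slab(rng) for _ in range(n)]
laplace_draws = [sample_laplace(rng) for _ in range(n)]

# Fraction of draws that are exactly zero under each prior.
frac_zero_spike_slab = sum(w == 0.0 for w in spike_slab_draws) / n
frac_zero_laplace = sum(w == 0.0 for w in laplace_draws) / n
```

Roughly half of the spike-and-slab draws are exactly zero, whereas the continuous Laplace prior concentrates mass near zero but essentially never produces an exact zero.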
Hereditary Spike-and-Slab Priors

◮ Spike-and-slab
\[ s \sim \mathrm{Bernoulli}(\pi), \quad \tilde{w} \sim \mathcal{N}(0, 1), \quad w = \tilde{w} \cdot s \]
◮ Hereditary spike-and-slab
  ◮ captures the relations between 1st-order and 2nd-order feature interactions
\[ s_{ui}, s_{uj} \sim \mathrm{Bernoulli}(\pi_1) \]
\[ p(s_{uij} = 1 \mid s_{ui} s_{uj} = 1) = 1 \quad \text{(strong heredity)} \]
\[ p(s_{uij} = 1 \mid s_{ui} + s_{uj} = 1) = \pi_2 \quad \text{(weak heredity)} \]
\[ p(s_{uij} = 1 \mid s_{ui} + s_{uj} = 0) = 0 \]
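As a worked consequence of the hereditary prior, the marginal prior probability that a second-order interaction is selected follows by summing over the three parent configurations. This derivation assumes that $s_{ui}$ and $s_{uj}$ are drawn independently, which the slide suggests but does not state explicitly.

```latex
\begin{aligned}
p(s_{uij} = 1)
  &= \underbrace{\pi_1^2}_{\text{both selected}} \cdot 1
   + \underbrace{2\pi_1(1 - \pi_1)}_{\text{exactly one selected}} \cdot \pi_2
   + \underbrace{(1 - \pi_1)^2}_{\text{neither selected}} \cdot 0 \\
  &= \pi_1^2 + 2\pi_1(1 - \pi_1)\,\pi_2 .
\end{aligned}
```

For instance, with the illustrative values $\pi_1 = 0.5$ and $\pi_2 = 0.2$, this gives $0.25 + 2 \cdot 0.25 \cdot 0.2 = 0.35$.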
Generative Procedure of BP-FIS

Algorithm: Generation procedure
for each user $u \in U$ do
    for each feature $i \in F$ do
        draw first-order interaction selection variable $s_{ui} \sim \mathrm{Bernoulli}(\pi_1)$
        draw first-order interaction weight $\tilde{w}_i \sim \mathcal{N}(0, 1)$
        $w_{ui} = s_{ui} \cdot \tilde{w}_i$
    for each feature pair $i, j \in F$ do
        draw second-order interaction selection variable $s_{uij} \sim p(s_{uij} \mid s_{ui}, s_{uj})$
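The steps listed above can be sketched in a few lines of Python. This is an illustrative reading, not the authors' implementation: it covers only the steps shown in the listing, the function names are hypothetical, and it draws $\tilde{w}_i$ once per feature and shares it across users, which is one interpretation of the subscript (the listing places the draw inside the user loop).

```python
import random
from itertools import combinations


def sample_pair_selection(rng, s_i, s_j, pi2):
    """Draw s_uij from p(s_uij | s_ui, s_uj) under the hereditary prior."""
    if s_i + s_j == 2:  # strong heredity: both first-order terms selected
        return 1
    if s_i + s_j == 1:  # weak heredity: exactly one first-order term selected
        return 1 if rng.random() < pi2 else 0
    return 0            # neither first-order term selected


def generate(users, features, pi1, pi2, seed=0):
    rng = random.Random(seed)
    # Assumption: one weight per feature, shared by all users.
    w_tilde = {i: rng.gauss(0.0, 1.0) for i in features}
    s1, w1, s2 = {}, {}, {}
    for u in users:
        for i in features:
            s1[u, i] = 1 if rng.random() < pi1 else 0  # s_ui ~ Bernoulli(pi1)
            w1[u, i] = s1[u, i] * w_tilde[i]           # w_ui = s_ui * w_tilde_i
        for i, j in combinations(features, 2):
            s2[u, i, j] = sample_pair_selection(rng, s1[u, i], s1[u, j], pi2)
    return s1, w1, s2


# Toy configuration, chosen only for illustration.
s1, w1, s2 = generate(users=["u1", "u2"], features=[0, 1, 2, 3], pi1=0.5, pi2=0.3)
```

By construction, every sampled $s_{uij}$ respects the heredity constraints: it is 1 whenever both first-order terms are selected and 0 whenever neither is.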