
Relational learning with many relations - Guillaume Obozinski

  1. Relational learning with many relations
     Guillaume Obozinski, Laboratoire d'Informatique Gaspard Monge, École des Ponts - ParisTech
     Joint work with Rodolphe Jenatton, Nicolas Le Roux and Antoine Bordes.
     Labex Bézout - Huawei Seminar - April 3rd, 2015

  4. Modelling relations between pairs of entities
     Triplets: Term 1 - Relation - Term 2
     Single relation:
       - Collaborative filtering
       - Link prediction
       - Modelling of social networks
     Multiple relations:
       - Collective classification
       - Modelling in relational knowledge databases
       - Protein-protein and protein-ligand interactions
       - Natural language semantics (and semantic role labelling)

  6. Our motivation: learning the semantic value of verbs
     Model triplets (Subject, Verb, Object), written $(S_i, R_j, O_k)$.
     View this as the relation: $R_j(S_i, O_k) = 1$

  10. Different kinds of relational learning
       - Learn to predict relations from object attributes:
         binary classification from pairs of feature vectors.
       - Exploit logical properties of relations: transitivity, implication, mutual exclusion, etc.
         Markov Logic Networks (Kok and Domingos, 2007).
       - Predict relations from some observed relations.
         Idea: relations derive from unobserved latent attributes.
         This is relational learning from intrinsic latent attributes.

  14. Stochastic Block Model
      Wang and Wong (1987); Nowicki and Snijders (2001)
      [Figure: graphical model with nodes $C_i$, $C'_k$ and $Z_{ik}$]
      $P(Z_{ik} = 1) = \sum_{c, c'} P(Z_{ik} = 1 \mid C_i = c, C'_k = c') \, P(C_i = c) \, P(C'_k = c')$
      $P_{ik} = \sum_{c, c'} R_{cc'} S_{ci} O_{c'k} = (s_i)^\top R \, o_k$
      $P = S^\top R O$
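
As a small illustration of the marginalization on this slide, the sketch below evaluates $P_{ik} = (s_i)^\top R \, o_k$ for two latent classes on each side. The function name `link_probability` and all numerical values are hypothetical and chosen only for the example; they do not come from the talk.

```python
def link_probability(s_i, R, o_k):
    """Return s_i^T R o_k, i.e. P(Z_ik = 1) after summing out both latent classes."""
    return sum(
        s_i[c] * R[c][c2] * o_k[c2]
        for c in range(len(s_i))
        for c2 in range(len(o_k))
    )

# Hypothetical toy values (not from the slides).
R = [[0.9, 0.1],
     [0.2, 0.7]]      # R[c][c'] = P(Z_ik = 1 | C_i = c, C'_k = c')
s_i = [0.5, 0.5]      # P(C_i = c), a point of the simplex
o_k = [1.0, 0.0]      # P(C'_k = c'), a point of the simplex

p_ik = link_probability(s_i, R, o_k)
print(p_ik)
```

Because `o_k` puts all its mass on the first class, only the first column of `R` contributes, and the result is the average of 0.9 and 0.2 under `s_i`.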

  15. A matrix factorization problem
      $P = S^\top R O$
      $0 \le R_{cc'} \le 1$
      $o_k \in \triangle$, $s_i \in \triangle$, with $\triangle = \{ x \in \mathbb{R}^p_+ \mid \|x\|_1 = 1 \}$

  19. Stochastic Block Model for several relation types
      [Figure: graphical model with nodes $C_i$, $C'_k$ and $Z^{(j)}_{ik}$]
      $P(Z^{(j)}_{ik} = 1) = \sum_{c, c'} P(Z^{(j)}_{ik} = 1 \mid C_i = c, C'_k = c') \, P(C_i = c) \, P(C'_k = c')$
      $P^{(j)}_{ik} = \sum_{c, c'} [R_j]_{cc'} S_{ci} O_{c'k} = (s_i)^\top R_j \, o_k$
      $P_j = S^\top R_j O$

  21. Collective matrix factorization
      $P_j = S^\top R_j O$
      $0 \le [R_j]_{cc'} \le 1$
      $o_k \in \triangle$, $s_i \in \triangle$, with $\triangle = \{ x \in \mathbb{R}^p_+ \mid \|x\|_1 = 1 \}$
      Corresponds to the approach used in RESCAL (Nickel et al., 2012):
      $\min_{S = O, \, R_j} \| Z_j - P_j \|_F^2$
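
To make the squared Frobenius objective concrete, here is a minimal sketch that forms $P_j = S^\top R_j O$ and evaluates $\| Z_j - P_j \|_F^2$ for one relation. It assumes $S$ and $O$ are stored as $p \times n$ lists of rows (so entity $i$ is column $i$); the helper names and the toy matrices are hypothetical, and no optimization is performed.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def frobenius_sq(A, B):
    """Squared Frobenius norm of A - B."""
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def relation_loss(Z_j, S, R_j, O):
    """|| Z_j - S^T R_j O ||_F^2 for a single relation j."""
    P_j = matmul(matmul(transpose(S), R_j), O)
    return frobenius_sq(Z_j, P_j)

# Hypothetical toy setting with S = O, as in the RESCAL-style tying on the slide.
S = [[1, 0],
     [0, 1]]
O = S
R_j = [[1, 0],
       [0, 0]]
Z_j = [[1, 0],
       [0, 1]]      # observed relation matrix

loss = relation_loss(Z_j, S, R_j, O)
print(loss)
```

With the identity embeddings, $P_j$ equals $R_j$, so the loss counts the single entry where `R_j` and `Z_j` disagree.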

  24. A bilinear logistic model
      [Figure: graphical model with nodes $s_i$, $o_k$ and $Z_{ijk} = R_j(S_i, O_k)$]
      $P(R_j(S_i, O_k) = 1) = P^{(j)}_{ik} = \left( 1 + \exp(-\eta^{(j)}_{ik}) \right)^{-1}$
      with an "energy" $E(s_i, R_j, o_k) = \eta^{(j)}_{ik} = \langle s_i, R_j o_k \rangle$
      So that with $H^{(j)} = (\eta^{(j)}_{ik})_{1 \le i, k \le n}$ we have $H^{(j)} = S^\top R_j O$
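
The bilinear logistic model can be sketched in a few lines: compute the energy $\langle s_i, R_j o_k \rangle$ and pass it through the logistic function. The function names and the toy vectors below are hypothetical; only the two formulas come from the slide.

```python
import math

def energy(s_i, R_j, o_k):
    """Bilinear energy eta = <s_i, R_j o_k>."""
    return sum(
        s_i[a] * R_j[a][b] * o_k[b]
        for a in range(len(s_i))
        for b in range(len(o_k))
    )

def triplet_probability(s_i, R_j, o_k):
    """P(R_j(S_i, O_k) = 1) = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-energy(s_i, R_j, o_k)))

# Hypothetical embeddings of dimension p = 2.
s_i = [1.0, 0.0]
o_k = [0.0, 1.0]
R_j = [[2.0, -1.0],
       [0.0, 3.0]]

eta = energy(s_i, R_j, o_k)
prob = triplet_probability(s_i, R_j, o_k)
print(eta, prob)
```

Unlike the stochastic block model, the embeddings here are not restricted to the simplex, so the energy can take any real value and the logistic function maps it to a probability.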

  28. Dealing with the number of parameters: related work
      Clustering of entities and relations:
       - Miller et al. (2009); Zhu (2012)
       - Bayesian non-parametric clustering: Kemp et al. (2006); Sutskever et al. (2009)
       - Clustering in the context of Markov Logic Networks: Kok and Domingos (2007)
      Embeddings:
       - Collective matrix factorization (RESCAL) by Nickel et al. (2012)
       - Semantic Matching Energy (SME) model of Bordes et al. (2012): encodes relations as vectors for scalability
      Tensor factorization:
       - CANDECOMP/PARAFAC: Tucker (1966); Harshman and Lundy (1994)
       - Probabilistic formulation of Chu and Ghahramani (2009)

  32. Our solution: Latent relational factors
      Idea: modelling the relations between the relations...
      $R_j = \sum_{r=1}^{d} \alpha^j_r \Theta_r$, with $\Theta_r = u_r v_r^\top$,
      for some sparse vector $\alpha^j \in \mathbb{R}^d$.
      Given:
       - $n_r$: number of relations
       - $p$: embedding dimension, $R_j \in \mathbb{R}^{p \times p}$
       - $d$: number of latent relational factors
       - $\bar{s} \le \lambda d$: average number of non-zero $\alpha$ coefficients
      ⇒ we reduce the number of parameters from $n_r p^2$ to $2pd + \bar{s} n_r$
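
The following sketch assembles one relation matrix $R_j = \sum_r \alpha^j_r u_r v_r^\top$ from a sparse coefficient vector and evaluates the two parameter counts $n_r p^2$ and $2pd + \bar{s} n_r$. Storing $\alpha^j$ as a dictionary of its non-zero entries is an illustrative choice, and every numerical value (including $n_r = 1000$, $p = 50$, $d = 100$, $\bar{s} = 10$) is hypothetical rather than taken from the talk.

```python
def relation_matrix(alpha_j, U, V):
    """Build R_j = sum_r alpha_j[r] * outer(U[r], V[r]).

    alpha_j maps factor index r to its non-zero coefficient (sparse storage);
    U[r] and V[r] are the vectors u_r and v_r of dimension p.
    """
    p = len(U[0])
    R_j = [[0.0] * p for _ in range(p)]
    for r, coeff in alpha_j.items():
        for a in range(p):
            for b in range(p):
                R_j[a][b] += coeff * U[r][a] * V[r][b]
    return R_j

def parameter_counts(n_r, p, d, s_bar):
    """Return (unfactored count n_r * p^2, factored count 2*p*d + s_bar*n_r)."""
    return n_r * p * p, 2 * p * d + s_bar * n_r

# Hypothetical factors: d = 3 rank-one matrices in dimension p = 2.
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
alpha_j = {0: 2.0, 2: 0.5}   # only two of the three factors are active

R_j = relation_matrix(alpha_j, U, V)
full, factored = parameter_counts(n_r=1000, p=50, d=100, s_bar=10)
print(R_j, full, factored)
```

In this made-up configuration the unfactored model would need 2,500,000 relation parameters, against 20,000 for the factored one.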

  34. Algorithmic approach
      Large scale: $|P| = 10^6$
      Stochastic projected block-coordinate gradient descent algorithm
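
The slide names the algorithm without giving its details, so the sketch below is only one possible reading, under explicit assumptions: the loss is the logistic loss of the bilinear model, each relation keeps a full matrix $R_j$ rather than the factored form, the blocks are "subject", "object" and "relation", and the projection is onto the unit L2 ball. None of these choices is confirmed by the source, and all names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def energy(s, R, o):
    """Bilinear energy <s, R o>."""
    return sum(s[a] * R[a][b] * o[b] for a in range(len(s)) for b in range(len(o)))

def project_l2_ball(v, radius=1.0):
    """Euclidean projection onto the L2 ball (an assumed constraint set)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm <= radius else [x * radius / norm for x in v]

def block_step(S, O, Rs, triplet, block, step=0.1):
    """One projected gradient step on a single block for one sampled triplet."""
    i, j, k, z = triplet                  # z is the 0/1 label of the triplet
    s, R, o = S[i], Rs[j], O[k]
    p = len(s)
    g = sigmoid(energy(s, R, o)) - z      # derivative of the logistic loss w.r.t. the energy
    if block == "subject":
        grad = [g * sum(R[a][b] * o[b] for b in range(p)) for a in range(p)]
        S[i] = project_l2_ball([s[a] - step * grad[a] for a in range(p)])
    elif block == "object":
        grad = [g * sum(s[a] * R[a][b] for a in range(p)) for b in range(p)]
        O[k] = project_l2_ball([o[b] - step * grad[b] for b in range(p)])
    else:                                 # "relation" block, left unconstrained in this sketch
        Rs[j] = [[R[a][b] - step * g * s[a] * o[b] for b in range(p)] for a in range(p)]

def train(S, O, Rs, triplets, n_steps, seed=0):
    """Sample a triplet and a block uniformly at random at every step."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        triplet = rng.choice(triplets)
        block = rng.choice(["subject", "object", "relation"])
        block_step(S, O, Rs, triplet, block)

# Hypothetical toy problem: one subject, one object, one relation, one positive triplet.
S = [[0.5, 0.0]]
O = [[0.0, 0.5]]
Rs = [[[0.0, 1.0], [0.0, 0.0]]]
triplet = (0, 0, 0, 1)

energy_before = energy(S[0], Rs[0], O[0])
block_step(S, O, Rs, triplet, "subject")
energy_after = energy(S[0], Rs[0], O[0])
print(energy_before, energy_after)
```

Each step touches only the parameters of one block for one sampled triplet, which is what keeps the per-iteration cost small when the number of observed triplets is large. For a positive triplet, a step on the subject block raises the energy and therefore the predicted probability.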
