Bayesian nonparametric models for bipartite graphs
François Caron
Department of Statistics, Oxford
Statistics Colloquium, Harvard University
November 11, 2013
Bipartite networks
[Figure: bipartite graph linking readers/customers (A1, A2, A3) to books (B1, B2, B3, B4); some links are marked "?"]
◮ Scientists authoring papers
◮ Readers reading books
◮ Internet users posting messages on forums
◮ Customers buying items
◮ Objects sharing a set of features
Book-crossing community network
5,000 readers, 36,000 books, 50,000 edges
Book-crossing community network
Degree distributions on log-log scale
[Figure: degree distribution (y-axis) against degree (x-axis), both on log scales; panels (a) Readers and (b) Books]
Statistical network models
◮ Statistics literature
  ◮ Exponential random graph, stochastic block-models, Rasch models, etc.
  ◮ Do not capture power-law behavior
  ◮ Inference does not scale well with the number of nodes
◮ Physics literature
  ◮ Preferential attachment
  ◮ Lacks interpretable parameters; not exchangeable
Bayesian nonparametrics
◮ Parameter of interest is infinite-dimensional
◮ Allows the complexity of the model to adapt to the data
◮ Dirichlet Process Mixtures: Clustering/density estimation with unknown number of modes
◮ Attractive power-law properties
◮ Language modeling, image segmentation
[Teh, 2006; Sudderth and Jordan, 2008; Blunsom and Cohn, 2011]
BNP for networks
◮ Models with some latent structure (e.g. infinite relational model)
◮ Number of nodes is fixed and dimension of the latent structure unknown
◮ Here: Infinite number of nodes
◮ (stable) Beta-Bernoulli/Indian Buffet Process
◮ Can capture power-law degree distributions for books
◮ Poisson degree distribution for readers
[Griffiths and Ghahramani, 2005; Teh and Görür, 2009]
Bipartite networks
Aims
◮ Bayesian nonparametric model for bipartite networks with a potentially infinite number of nodes of each type
◮ Each node is modelled using a positive rating parameter that represents its ability to connect to other nodes
◮ Captures power-law behavior
◮ Simple generative model for network growth
◮ Develop an efficient computational procedure for posterior simulation
Hierarchical model
◮ Represent a bipartite network by a collection of atomic measures $Z_i$, $i = 1, 2, \ldots$, such that
\[ Z_i = \sum_{j=1}^{\infty} z_{ij}\, \delta_{\theta_j} \]
◮ $z_{ij} = 1$ if reader $i$ has read book $j$, 0 otherwise
◮ $\{\theta_j\}$ is the set of books
◮ Each book $j$ is assigned a positive "popularity" parameter $w_j$
◮ Each reader $i$ is assigned a positive "interest in reading" parameter $\gamma_i$
◮ The probability that reader $i$ reads book $j$ is
\[ P(z_{ij} = 1 \mid \gamma_i, w_j) = 1 - \exp(-w_j \gamma_i) \]
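The link probability above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model has infinitely many books, whereas the sketch truncates to a short finite list, and the parameter values and the helper names `link_probability` and `sample_network` are hypothetical.

```python
import math
import random

def link_probability(gamma_i, w_j):
    """P(z_ij = 1 | gamma_i, w_j) = 1 - exp(-w_j * gamma_i)."""
    return 1.0 - math.exp(-w_j * gamma_i)

def sample_network(gammas, ws, rng):
    """Draw a binary reader-by-book matrix with independent links."""
    return [[int(rng.random() < link_probability(g, w)) for w in ws]
            for g in gammas]

rng = random.Random(0)
gammas = [0.5, 2.0]       # hypothetical reader "interest in reading" parameters
ws = [0.1, 1.0, 3.0]      # hypothetical book popularities (finite truncation)
z = sample_network(gammas, ws, rng)   # z[i][j] = 1 if reader i reads book j
```

A larger $\gamma_i$ or a larger $w_j$ both raise the link probability, which is what lets a single positive parameter per node drive its degree.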
Data Augmentation
◮ Latent variable formulation
◮ Latent scores $s_{ij} \sim \mathrm{Gumbel}(\log(w_j), 1)$
◮ All books with a score above $-\log(\gamma_i)$ are retained, others are discarded
[Figure: books plotted against popularity (left) and against latent score (right), with the threshold $-\log(\gamma_i)$ marked]
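A quick Monte Carlo check can make the latent-score formulation concrete: thresholding a Gumbel$(\log w_j, 1)$ score at $-\log(\gamma_i)$ should retain a book with probability $1 - \exp(-w_j \gamma_i)$. The sketch below assumes illustrative parameter values, and the helper names `sample_gumbel` and `retained_fraction` are hypothetical.

```python
import math
import random

def sample_gumbel(loc, rng):
    """Draw from Gumbel(loc, 1) by inverting its CDF exp(-exp(-(s - loc)))."""
    u = rng.random()
    while u == 0.0:          # random() may return 0.0; avoid log(0)
        u = rng.random()
    return loc - math.log(-math.log(u))

def retained_fraction(gamma_i, w_j, n_draws, rng):
    """Fraction of latent scores s_ij above the threshold -log(gamma_i)."""
    threshold = -math.log(gamma_i)
    hits = sum(sample_gumbel(math.log(w_j), rng) > threshold
               for _ in range(n_draws))
    return hits / n_draws

rng = random.Random(1)
gamma_i, w_j = 0.8, 1.5      # hypothetical parameter values
estimate = retained_fraction(gamma_i, w_j, 200_000, rng)
exact = 1.0 - math.exp(-w_j * gamma_i)   # link probability from the model
```

The two quantities should agree up to Monte Carlo error, since $P(s_{ij} > -\log \gamma_i) = 1 - \exp(-\exp(\log w_j + \log \gamma_i))$.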
Model for the book popularity parameters
◮ Random atomic measure
\[ G = \sum_{j=1}^{\infty} w_j\, \delta_{\theta_j} \]
◮ Construction: two-dimensional Poisson process $N = \{w_j, \theta_j\}_{j=1,\ldots}$
◮ Completely Random Measure $G \sim \mathrm{CRM}(\lambda, h)$ characterized by a Lévy measure $\lambda(w)\, h(\theta)\, dw\, d\theta$
\[ \int_0^{\infty} (1 - e^{-w})\, \lambda(w)\, dw < \infty \;\Rightarrow\; \text{finite total } \sum_{j=1}^{\infty} z_{ij} \]
[Kingman, 1967; Regazzini et al., 2003; Lijoi and Prünster, 2010]
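As a worked illustration of the integrability condition, consider the gamma-process intensity $\lambda(w) = \alpha w^{-1} e^{-\tau w}$ with $\alpha, \tau > 0$. This particular choice is an assumption made here for the example; the slide leaves $\lambda$ unspecified. The intensity has infinite total mass, so $G$ has infinitely many atoms, yet the condition still holds:

```latex
% Assumed example: gamma-process intensity, not stated on the slide.
\[
  \int_0^{\infty} \lambda(w)\, dw
  = \alpha \int_0^{\infty} w^{-1} e^{-\tau w}\, dw = \infty ,
\]
\[
  \int_0^{\infty} (1 - e^{-w})\, \lambda(w)\, dw
  = \alpha \int_0^{\infty} \frac{e^{-\tau w} - e^{-(\tau + 1) w}}{w}\, dw
  = \alpha \log\!\left(1 + \frac{1}{\tau}\right) < \infty .
\]
```

The second integral is a Frullani integral. Under this assumed intensity there are infinitely many books, but the condition for a finite total $\sum_{j} z_{ij}$ is met.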
Posterior characterization
◮ Observed bipartite network $Z_1, \ldots, Z_n$
◮ $n$ readers and $K$ books with degree at least one
◮ Cannot derive directly the conditional of $G$ given $Z_1, \ldots, Z_n$ nor the predictive of $Z_{n+1}$ given $Z_1, \ldots, Z_n$
◮ Let
\[ X_i = \sum_{j=1}^{\infty} x_{ij}\, \delta_{\theta_j} \]
where $x_{ij} = \max(0, s_{ij} + \log(\gamma_i)) \geq 0$ are latent positive scores.
[Figure: books plotted against score (left, threshold $-\log(\gamma_i)$ marked) and against censored score (right)]
Posterior Characterization
The conditional distribution of $G$ given $X_1, \ldots, X_n$ can be expressed as
\[ G = G^* + \sum_{j=1}^{K} w_j\, \delta_{\theta_j} \]
where $G^*$ and $(w_j)$ are mutually independent with
\[ G^* \sim \mathrm{CRM}(\lambda^*, h), \qquad \lambda^*(w) = \lambda(w) \exp\Big(-w \sum_{i=1}^{n} \gamma_i\Big) \]
and the masses are
\[ P(w_j \mid \text{other}) \propto \lambda(w_j)\, w_j^{m_j} \exp\Big(-w_j \sum_{i=1}^{n} \gamma_i e^{-x_{ij}}\Big) \]
Characterization related to that for normalized random measures
[Prünster, 2002; James, 2002; James et al., 2009]
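To see what the conditional for the masses looks like in a concrete case, suppose $\lambda(w) = \alpha w^{-1} e^{-\tau w}$ (a gamma-process intensity, assumed here purely for illustration) and read $m_j$ as the number of readers with $x_{ij} > 0$, i.e. the degree of book $j$. The conditional then reduces to a Gamma distribution with shape $m_j$ and rate $\tau + \sum_i \gamma_i e^{-x_{ij}}$. The sketch below draws from it; the function name `sample_book_weight` and all numeric values are hypothetical.

```python
import math
import random

def sample_book_weight(x_col, gammas, tau, rng):
    """Draw w_j from Gamma(shape=m_j, rate=tau + sum_i gamma_i * exp(-x_ij)).

    Assumes the gamma-process intensity lambda(w) = alpha * w**-1 * exp(-tau * w),
    an illustrative choice that is not stated in the slides.
    """
    m_j = sum(1 for x in x_col if x > 0)      # readers with a positive censored score
    if m_j == 0:
        raise ValueError("book must have degree at least one")
    rate = tau + sum(g * math.exp(-x) for g, x in zip(gammas, x_col))
    return rng.gammavariate(m_j, 1.0 / rate)  # gammavariate takes (shape, scale)

rng = random.Random(2)
gammas = [0.5, 2.0, 1.0]   # hypothetical reader parameters gamma_i
x_col = [0.0, 1.3, 0.4]    # hypothetical censored scores x_ij for one book j
tau = 1.0                  # hypothetical gamma-process parameter
draws = [sample_book_weight(x_col, gammas, tau, rng) for _ in range(50_000)]
rate = tau + sum(g * math.exp(-x) for g, x in zip(gammas, x_col))
conditional_mean = 2 / rate   # shape / rate, with m_j = 2 for this x_col
```

Other choices of $\lambda$ would generally not give a standard distribution, so this closed form should be read as a property of the assumed intensity rather than of the model in general.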
Generative Process for network growth
Predictive distribution of $Z_{n+1}$ given the latent process $X_1, \ldots, X_n$
[Figure: network growth illustration. Reader 1 (A1) is shown with books B1, B2, B3 and values 18, 4, 14; Reader 2 (A2) is shown with books B1 to B5 and values 12, 0, 8, 13, 4]