Random Function Priors for Correlation Modeling
Aonan Zhang, John Paisley
Columbia University
Setup
Model exchangeable data $X = [X_1, \dots, X_N]$ with a collection of features $\theta = (\theta_k)_{k \in K}$.
[Figure: graphical model with nodes $\alpha$, $Z_n$, $X_n$, $\theta_k$ and plates $N$, $K$]
$Z_n = [Z_{n1}, \dots, Z_{nk}, \dots, Z_{nK}] \in \mathbb{R}_+^K$, where $Z_{nk}$ is the extent to which $\theta_k$ is used to express $X_n$.
E.g.
Sparse factor models: $Z_n \in \{0,1\}^K$
Topic models: $Z_n \in \Delta^{K-1}$
Problem: model flexible correlations among $Z_{n1}, \dots, Z_{nK}$.
Complexity: $2^{O(K)}$. Exponential family?
Solution: random function priors
Model the matrix Z
[Figure: the matrix $Z$ with $N$ rows and $K$ columns]
Joint distribution
$$p(X, Z, \theta) = p(Z) \cdot \prod_{k=1}^{K} p(\theta_k) \cdot \prod_{n=1}^{N} p(X_n \mid Z_n, \theta)$$
$p(Z)$: ???
$p(\theta_k)$: i.i.d.
$p(X_n \mid Z_n, \theta)$: application dependent
Workflow to derive $p(Z)$
Exchangeability assumptions on $p(Z)$ $\xrightarrow{\text{representation theorems}}$ $p(Z)$ is a random function model
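The factorization of the joint distribution can be sketched as a log-density that sums three groups of terms. This is a minimal illustration only: the standard normal prior on scalar features, the Gaussian likelihood with mean $\sum_k Z_{nk} \theta_k$, and the names `log_normal`, `log_joint` and `log_p_Z` are all assumptions made for the example, and $\log p(Z)$ is left as a pluggable placeholder because the slides derive it separately.

```python
import math

def log_normal(x, mean, std=1.0):
    """Log-density of a univariate normal distribution."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - 0.5 * ((x - mean) / std) ** 2

def log_joint(X, Z, theta, log_p_Z):
    """log p(X, Z, theta) = log p(Z) + sum_k log p(theta_k) + sum_n log p(X_n | Z_n, theta).

    Assumed for illustration (not from the slides):
      * theta_k are scalars with i.i.d. standard normal priors,
      * X_n | Z_n, theta ~ Normal(sum_k Z_nk * theta_k, 1).
    """
    total = log_p_Z(Z)                                   # prior on the matrix Z
    total += sum(log_normal(t, 0.0) for t in theta)      # i.i.d. feature priors
    for x_n, z_n in zip(X, Z):                           # one likelihood term per row
        mean_n = sum(z * t for z, t in zip(z_n, theta))
        total += log_normal(x_n, mean_n)
    return total

# Toy data: two observations, two features, and a placeholder (flat) log p(Z).
X = [0.5, -1.0]
Z = [[1, 0], [0, 1]]
theta = [0.5, -1.0]
value = log_joint(X, Z, theta, log_p_Z=lambda Z: 0.0)
```

Passing `log_p_Z` as an argument keeps the application-dependent likelihood separate from the prior on $Z$, which is the term the rest of the talk is about.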
Representation theorem
Trick: Transform $Z$ (random matrix) to $\xi$ (random measure) on $S$.
$$\xi = \sum_{n,k} Z_{nk} \, \delta_{\tau_n, \sigma_k}$$
Assumption: $\xi$ is separately exchangeable.
Proposition. A discrete random measure $\xi$ on $S$ is separately exchangeable if and only if, almost surely,
$$\xi = \sum_{n,k} f_n(\vartheta_k) \, \delta_{\tau_n, \sigma_k} + \text{trivial terms}$$
Hence $Z_{nk} = f_n(\vartheta_k)$.
$(\vartheta_k, \sigma_k)$: Poisson process on $\mathbb{R}_+^2$
$f_n$: random functions on $\mathbb{R}_+$
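The construction $Z_{nk} = f_n(\vartheta_k)$ can be simulated with a toy example. The sketch below makes several assumptions that are not from the slides: the Poisson process is truncated to a finite interval and only its $\vartheta$ coordinate is simulated, and each random function $f_n$ is a rectified cosine with random amplitude and phase, chosen only so that values are nonnegative. All function names are hypothetical.

```python
import math
import random

def sample_poisson_points(rate, horizon, rng):
    """Points of a homogeneous Poisson process on [0, horizon],
    generated from exponential inter-arrival times."""
    points = []
    t = rng.expovariate(rate)
    while t <= horizon:
        points.append(t)
        t += rng.expovariate(rate)
    return points

def sample_row_function(rng):
    """Draw one random function f_n on R_+ (toy family, an assumption)."""
    amplitude = rng.uniform(0.5, 1.5)
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return lambda v: max(0.0, amplitude * math.cos(v + phase))

def sample_Z(num_rows, rate=1.0, horizon=10.0, seed=0):
    """Build Z with Z[n][k] = f_n(vartheta_k)."""
    rng = random.Random(seed)
    varthetas = sample_poisson_points(rate, horizon, rng)     # shared across rows
    functions = [sample_row_function(rng) for _ in range(num_rows)]
    Z = [[f(v) for v in varthetas] for f in functions]
    return Z, varthetas

Z, varthetas = sample_Z(num_rows=5)
```

Every row evaluates its own function at the same column locations `varthetas`, so entries within a row are tied together through $f_n$ rather than drawn independently.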
The power of random function priors
Prototype to applicable models
1. $f_n(\vartheta_k) \to f(h_n, \vartheta_k)$
$h_n = g(X_n)$: learned via inference networks
$f$: decoder network
2. $f(h_n, \vartheta_k) \to f(h_n, \vartheta_k, \ell_k)$
Augment the 2d Poisson process $(\vartheta_k, \sigma_k)$ to higher dimension $(\vartheta_k, \sigma_k, \ell_k)$.
Model correlations through arbitrary moments
Assume $Z_{nk} = f(h_n^\top \ell_k)$. Then
$$\mathbb{E}[Z_{nk_1} Z_{nk_2} \cdots Z_{nk_j}] = \mathbb{E}[f(h_n^\top \ell_{k_1}) \cdots f(h_n^\top \ell_{k_j})]$$
[Figure: vectors $\ell_1$, $\ell_2$, $\ell_3$, $\ell_4$ and $h_n$]
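The role of the vectors $\ell_k$ in shaping moments can be checked numerically for the second-order case. The following Monte Carlo sketch rests on assumptions that are not from the slides: $f$ is taken to be a sigmoid, the expectation is taken over $h_n$ drawn from a standard normal, and the function names are hypothetical. It estimates the covariance between $Z_{nk_1}$ and $Z_{nk_2}$ for three relative orientations of $\ell_{k_1}$ and $\ell_{k_2}$.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mc_covariance(l1, l2, num_samples=20000, seed=0):
    """Monte Carlo estimate of Cov(f(h^T l1), f(h^T l2)).

    Assumptions for illustration: f is a sigmoid and h ~ Normal(0, I).
    """
    rng = random.Random(seed)
    sum1 = sum2 = sum12 = 0.0
    for _ in range(num_samples):
        h = [rng.gauss(0.0, 1.0) for _ in l1]
        z1 = sigmoid(dot(h, l1))
        z2 = sigmoid(dot(h, l2))
        sum1 += z1
        sum2 += z2
        sum12 += z1 * z2
    mean1 = sum1 / num_samples
    mean2 = sum2 / num_samples
    return sum12 / num_samples - mean1 * mean2

cov_aligned = mc_covariance([1.0, 0.0], [1.0, 0.0])      # same direction
cov_orthogonal = mc_covariance([1.0, 0.0], [0.0, 1.0])   # orthogonal directions
cov_opposite = mc_covariance([1.0, 0.0], [-1.0, 0.0])    # opposite directions
```

Under these assumptions, aligned vectors give positively correlated features, opposite vectors give negatively correlated ones, and orthogonal vectors give features that are uncorrelated up to Monte Carlo error.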
Visualize correlations via paintboxes Each paintbox is a heatmap.
Visualize correlations via paintboxes
[Figure: paintbox heatmaps over $h_n$, in two panels titled "Correlated Topics" and "Un-correlated Topics"]
Paintboxes shown:
$Z_{n,20} = f(h_n, \ell_{20})$
$Z_{n,47} = f(h_n, \ell_{47})$
$Z_{n,54} = f(h_n, \ell_{54})$
$Z_{n,64} = f(h_n, \ell_{64})$
$Z_{n,67} = f(h_n, \ell_{67})$
$Z_{n,71} = f(h_n, \ell_{71})$
Model performance Our model: PRME
Summary
A representation theorem for correlation modeling.
A deeper understanding of IBP beyond the Beta-Bernoulli process.
A generalization of Kingman's and Broderick's paintbox models.
Connections to random graphs.
…
More details
Poster: #222
Code: https://github.com/zan12/prme