An Instability in Variational Inference for Topic Models

Behrooz Ghorbani
Joint work with Hamid Javadi and Andrea Montanari
Stanford University, Department of Electrical Engineering
June 2019
Problem Statement

Statistical model:

$$X = \frac{\sqrt{\beta}}{d}\, W H^{\mathsf{T}} + Z$$

where $W \in \mathbb{R}^{n \times r}$, $H \in \mathbb{R}^{d \times r}$, and $Z$ is i.i.d. Gaussian noise.

$n, d \gg 1$ with $n/d = \delta > 0$, where $\delta, r = O(1)$.

The rows are drawn as $W_i \overset{\text{i.i.d.}}{\sim} \mathrm{Dir}(\nu \mathbf{1})$ and $H_j \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, I_r)$.
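To make the model concrete, the following is a minimal sketch of drawing one data set from it with NumPy. The function name `sample_topic_model` and all parameter values are illustrative and do not come from the slides. The slides do not state the noise variance; the default of $1/d$ per entry used here is an assumption, exposed as the `noise_var` argument.

```python
import numpy as np

def sample_topic_model(n, d, r, beta, nu, noise_var=None, seed=0):
    """Draw (X, W, H) from X = (sqrt(beta) / d) * W @ H.T + Z."""
    rng = np.random.default_rng(seed)
    if noise_var is None:
        # Assumption: the slides only say "i.i.d. Gaussian noise".
        noise_var = 1.0 / d
    # Rows W_i ~ Dir(nu * 1_r): each row lies on the probability simplex.
    W = rng.dirichlet(nu * np.ones(r), size=n)   # shape (n, r)
    # Rows H_j ~ N(0, I_r).
    H = rng.standard_normal((d, r))              # shape (d, r)
    Z = np.sqrt(noise_var) * rng.standard_normal((n, d))
    X = (np.sqrt(beta) / d) * (W @ H.T) + Z      # shape (n, d)
    return X, W, H

# Illustrative sizes with aspect ratio delta = n / d = 2.
X, W, H = sample_topic_model(n=200, d=100, r=3, beta=4.0, nu=1.0)
```

Because each row of `W` is a Dirichlet draw, its entries are non-negative and sum to one, which is what lets them be read as topic weights.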
(Naive Mean Field) Variational Inference

Goal: use the posterior distribution, $p_{H,W \mid X}(\cdot \mid X)$, to estimate $W$ and $H$.

Variational inference: approximate the posterior with a simpler distribution $\hat{q}$ such that

$$\hat{q}(H, W) = q(H)\,\tilde{q}(W) = \prod_{a=1}^{d} q_a(H_a) \prod_{i=1}^{n} \tilde{q}_i(W_i)$$

Is the output of variational inference reliable?
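The slide does not write out how $\hat{q}$ is chosen. In the standard formulation of naive mean-field variational inference, which is assumed here, $\hat{q}$ is the member of the factorized family that is closest to the posterior in Kullback-Leibler divergence. The symbol $\mathcal{Q}$ for that family is introduced for illustration only.

```latex
\hat{q} \in \arg\min_{q \in \mathcal{Q}}
  \mathrm{KL}\!\left( q \,\middle\|\, p_{H,W \mid X}(\cdot \mid X) \right),
\qquad
\mathcal{Q} = \left\{ q : q(H, W) = \prod_{a=1}^{d} q_a(H_a)
                                   \prod_{i=1}^{n} \tilde{q}_i(W_i) \right\}
```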
Comparison of Two Thresholds

$\beta_{\mathrm{Bayes}}$: information-theoretic threshold.
If $\beta < \beta_{\mathrm{Bayes}}$, then any estimator is asymptotically uncorrelated with the truth.

$\beta_{\mathrm{inst}}$: threshold for variational inference to return a non-trivial estimate.
If $\beta < \beta_{\mathrm{inst}}$, then $\hat{W}_i = \frac{1}{r}\mathbf{1}_r$: no signal is found in the data.
If $\beta > \beta_{\mathrm{inst}}$, then $\hat{W}_i \neq \frac{1}{r}\mathbf{1}_r$: the variational algorithm declares that it has found a signal.
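The distinction between the two regimes of $\beta_{\mathrm{inst}}$ can be checked numerically on any weight estimate. The sketch below tests whether every row of an estimate equals the uniform vector $\frac{1}{r}\mathbf{1}_r$ up to a tolerance. The helper name `is_uninformative` and the tolerance value are hypothetical choices, not part of the talk.

```python
import numpy as np

def is_uninformative(W_hat, tol=1e-3):
    """Return True if every row of W_hat is (numerically) the uniform vector 1/r."""
    W_hat = np.asarray(W_hat, dtype=float)
    r = W_hat.shape[1]
    # Largest entrywise deviation from the trivial estimate (1/r) * 1_r.
    return bool(np.max(np.abs(W_hat - 1.0 / r)) < tol)

# A trivial estimate: every row is uniform over r = 3 topics.
trivial = np.full((5, 3), 1.0 / 3.0)
# A non-trivial estimate: the first row puts most weight on topic 0.
nontrivial = trivial.copy()
nontrivial[0] = [0.8, 0.1, 0.1]

print(is_uninformative(trivial), is_uninformative(nontrivial))
```

A `False` result only says that the algorithm reports a signal. As the slide notes, for $\beta$ between the two thresholds that report need not reflect the truth.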
Comparison of Two Thresholds

We want $\beta_{\mathrm{Bayes}} \approx \beta_{\mathrm{inst}}$.

[Figure: Comparison of $\beta_{\mathrm{Bayes}}$ and $\beta_{\mathrm{inst}}$. Vertical axis: signal strength $\beta$ (2 to 14). Horizontal axis: aspect ratio $\delta = n/d$ (1.0 to 3.0). Curves: $\beta_{\mathrm{Bayes}}$ and $\beta_{\mathrm{inst}}$.]
Credible intervals: nominal coverage 90%

[Figure: three panels plotting $W_{i,1}$ (vertical axis, 0.0 to 1.0) against weight index $i$ (horizontal axis, 0 to 100), with legend entries $\hat{W}_{i,1}$ and $W_{i,1}$. Panel titles: $\beta = 2.0 < \beta_{\mathrm{inst}} = 2.2$; $\beta = 4.1$; $\beta = \beta_{\mathrm{Bayes}} = 6.0$.]

Empirical coverage:
$\beta = 2 < \beta_{\mathrm{inst}}$: 0.87
$\beta = 4.1 \in (\beta_{\mathrm{inst}}, \beta_{\mathrm{Bayes}})$: 0.65
$\beta = 6 = \beta_{\mathrm{Bayes}}$: 0.51
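Empirical coverage is the fraction of true weights that fall inside their credible intervals. It should be close to the nominal 90% if the intervals are calibrated. The following sketch shows that computation on made-up numbers. The function name `empirical_coverage` and the toy arrays are illustrative and are not the data behind the slide.

```python
import numpy as np

def empirical_coverage(truth, lower, upper):
    """Fraction of true values lying inside [lower, upper], elementwise."""
    truth, lower, upper = (np.asarray(a, dtype=float) for a in (truth, lower, upper))
    covered = (lower <= truth) & (truth <= upper)
    return float(np.mean(covered))

# Toy example: four weights with their interval endpoints (made-up values).
truth = [0.20, 0.50, 0.90, 0.10]
lower = [0.10, 0.60, 0.80, 0.00]
upper = [0.30, 0.70, 1.00, 0.05]

coverage = empirical_coverage(truth, lower, upper)
print(coverage)  # 0.5: the first and third intervals contain the truth
```

Reading the slide's numbers this way, coverage of 0.65 or 0.51 against a nominal 0.90 means the reported intervals are far too narrow or misplaced once $\beta$ exceeds $\beta_{\mathrm{inst}}$.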