Latent LSTM Allocation
Manzil Zaheer, Amr Ahmed and Alexander J Smola
Presented by Akshay Budhkar & Krishnapriya Vishnubhotla
March 3, 2018
Outline
1. Introduction
   - Latent Dirichlet Allocation
   - LSTMs
2. Latent LSTM Allocation
   - Algorithm
   - Inference
   - Different Models
3. Results
4. Conclusion
Latent Dirichlet Allocation
- Probabilistic graphical model
- Not sequential, but easily interpretable
LSTMs
- Good for modeling sequential data; they preserve the temporal aspect
- Too many parameters
- Hard to interpret
Latent LSTM Allocation (LLA) - Algorithm
Graphical model for LLA
Marginal probability of observing a document:

$$
p(w_d \mid \mathrm{LSTM}, \phi) = \sum_{z_d} p(w_d, z_d \mid \mathrm{LSTM}, \phi) = \sum_{z_d} \prod_t p(w_{d,t} \mid z_{d,t}; \phi)\, p(z_{d,t} \mid z_{d,1:t-1}; \mathrm{LSTM}) \tag{1}
$$

LLA uses a K × H dense matrix and a V × K sparse matrix.
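To make Eq. (1) concrete, here is a minimal Python sketch that evaluates the marginal by brute-force enumeration of every topic sequence. It is only feasible for tiny documents. The function `toy_transition` is a hypothetical stand-in for the LSTM (it simply favors repeating the previous topic), and the matrix `phi` and the document `doc` are made-up illustrative values, not taken from the paper.

```python
import itertools
import math

import numpy as np


def toy_transition(history, num_topics):
    """Hypothetical stand-in for the LSTM: returns p(z_t | z_{1:t-1}).

    It favors repeating the previous topic. A real LLA model would run an
    LSTM over the topic history instead.
    """
    probs = np.full(num_topics, 1.0)
    if history:
        probs[history[-1]] += 2.0
    return probs / probs.sum()


def joint_log_prob(words, topics, phi, transition=toy_transition):
    """log p(w_d, z_d): sum over t of log p(w_t | z_t) + log p(z_t | z_{1:t-1})."""
    num_topics = phi.shape[1]
    total = 0.0
    for t, (w, z) in enumerate(zip(words, topics)):
        prior = transition(list(topics[:t]), num_topics)
        total += math.log(phi[w, z]) + math.log(prior[z])
    return total


def marginal_log_prob(words, phi, transition=toy_transition):
    """log p(w_d) by summing the joint over all K^T topic sequences."""
    num_topics = phi.shape[1]
    log_terms = [
        joint_log_prob(words, z, phi, transition)
        for z in itertools.product(range(num_topics), repeat=len(words))
    ]
    peak = max(log_terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(v - peak) for v in log_terms))


# phi[w, k] = p(word w | topic k); each column sums to 1 (V = 3, K = 2).
phi = np.array([[0.7, 0.1],
                [0.2, 0.3],
                [0.1, 0.6]])
doc = [0, 2, 2]  # word ids of a toy document
print(marginal_log_prob(doc, phi))
```

The enumeration costs K^T joint evaluations, which is why an approximate inference scheme is needed in practice.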
Inference
Stochastic Expectation Maximization is used to approximate the posterior. The Evidence Lower Bound (ELBO) can be written as:

$$
\sum_d \log p(w_d \mid \mathrm{LSTM}, \phi) \ge \sum_d \sum_{z_d} q(z_d) \log \frac{p(z_d; \mathrm{LSTM}) \prod_t p(w_{d,t} \mid z_{d,t}; \phi)}{q(z_d)} \tag{2}
$$

The conditional probability of the topic at time step t is:

$$
p(z_{d,t} = k \mid w_{d,t}, z_{d,1:t-1}; \mathrm{LSTM}, \phi) \propto p(z_{d,t} = k \mid z_{d,1:t-1}; \mathrm{LSTM})\, p(w_{d,t} \mid z_{d,t} = k; \phi) \tag{3}
$$

And

$$
p(w_{d,t} \mid z_{d,t} = k; \phi) = \phi_{w,k} = \frac{n_{w,k} + \beta}{n_k + V\beta} \tag{4}
$$
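The topic update in Eqs. (3) and (4) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The names `topic_posterior`, `sample_topic` and `lstm_prior` are hypothetical, the LSTM's predictive distribution is replaced by a fixed vector, and the counts are made-up values. Updating the counts and retraining the LSTM between sweeps is not shown.

```python
import numpy as np


def topic_posterior(word, lstm_prior, n_wk, n_k, beta):
    """Normalized version of Eq. (3) for one token.

    lstm_prior[k] stands for p(z_t = k | z_{1:t-1}; LSTM); the word
    likelihood follows Eq. (4): (n_wk[w, k] + beta) / (n_k[k] + V * beta).
    """
    vocab_size = n_wk.shape[0]
    phi_w = (n_wk[word] + beta) / (n_k + vocab_size * beta)
    unnormalized = lstm_prior * phi_w
    return unnormalized / unnormalized.sum()


def sample_topic(word, lstm_prior, n_wk, n_k, beta, rng):
    """Draw one topic assignment from the conditional distribution."""
    probs = topic_posterior(word, lstm_prior, n_wk, n_k, beta)
    return int(rng.choice(len(probs), p=probs))


# Toy word-topic counts (V = 3 words, K = 2 topics); illustrative only.
n_wk = np.array([[8.0, 0.0],
                 [1.0, 1.0],
                 [0.0, 9.0]])
n_k = n_wk.sum(axis=0)              # tokens assigned to each topic
lstm_prior = np.array([0.5, 0.5])   # assumed LSTM output for this step
rng = np.random.default_rng(0)

probs = topic_posterior(0, lstm_prior, n_wk, n_k, beta=0.1)
z = sample_topic(0, lstm_prior, n_wk, n_k, beta=0.1, rng=rng)
print(probs, z)
```

With these counts, word 0 has almost all of its mass under topic 0, so the conditional strongly favors that topic even though the assumed LSTM prior is uniform.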
Mathematical Intuition

LDA

$$
\log p(w) = \sum_t \log p(w_t \mid \text{model}) = \sum_t \log \sum_{z_t} p(w_t \mid z_t)\, p(z_t \mid \text{doc}) \tag{5}
$$

LSTM

$$
\log p(w) = \sum_t \log p(w_t \mid w_{t-1}, w_{t-2}, \ldots, w_1) \tag{6}
$$

LLA

$$
\log p(w) = \log \sum_{z_{1:T}} \prod_t p(w_t \mid z_t)\, p(z_t \mid z_{t-1}, z_{t-2}, \ldots, z_1) \tag{7}
$$
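One way to relate the LLA and LDA expressions is a short worked step. Assume, purely for illustration, that the topic transition ignores its history, so that p(z_t | z_{t-1}, ..., z_1) equals a fixed per-document weight, written here with the hypothetical symbol \theta_{z_t}. The sum over whole topic sequences then factorizes into a product of per-position sums, and the LLA expression takes the same form as the LDA one:

```latex
\begin{align*}
\log p(w)
  &= \log \sum_{z_{1:T}} \prod_{t=1}^{T} p(w_t \mid z_t)\, \theta_{z_t} \\
  &= \log \prod_{t=1}^{T} \sum_{z_t} p(w_t \mid z_t)\, \theta_{z_t} \\
  &= \sum_{t=1}^{T} \log \sum_{z_t} p(w_t \mid z_t)\, \theta_{z_t}
\end{align*}
```

When the transition does depend on the history, this factorization no longer holds and the sum over z_{1:T} stays outside the product, which is what lets LLA capture order at the topic level.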
Different Models
Perplexity vs. Number of topics (Wikipedia)
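The slides do not spell out how perplexity is computed. Assuming the standard definition (the exponential of the negative mean log-probability per token), a minimal sketch looks like this. The function name `perplexity` and the input values are illustrative only.

```python
import math


def perplexity(token_log_probs):
    """exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# A model that assigns probability 1/4 to every token is as uncertain
# as a uniform choice among 4 words, so its perplexity is about 4.
uniform_log_probs = [math.log(0.25)] * 10
print(perplexity(uniform_log_probs))
```

Lower values mean the model assigns higher probability to the held-out tokens.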
Perplexity vs. Number of topics (User Search)
Char LLA cannot be used here, since URLs lack morphological structure.
LDA Ablation Study
Interpreting Cleaner Topics
Interpreting Factored Topics
LSTM Topic Embedding (Wikipedia)
Convergence Speed
Effect of Joint vs. Independent Training
Final Thoughts
Pros
- Provides a knob for trading off interpretability against accuracy
- Fewer parameters for a reasonable perplexity
- Cleaner factored topics
Cons
- Did not compare to something like hierarchical LDA
- Char LLA can't be used for every problem
- Perplexity is not a good measure of text generation accuracy