Recommendation Engines; Collaborative Filtering; Thematic clustering of large text corpora; Infinite Markov model for statistical NLP
Russell W. Hanson, Dec. 8, 2008
Outline
• Several problems in applied mathematics and approaches to their solutions:
– Recommendation Engines
– Collaborative Filtering
– Thematic clustering of large text corpora
– Infinite Markov model for statistical NLP
LobeLink.com – social bookmarking, social web annotation, and recommendation engine
Recommendation Engines, $$$: Amazon.com, NetFlix.com
A Recommendation Engine
Attributized Bayesian Choice Modeling
Attributized content items, i, are stored as vectors in the choice-set database such that:
Summary of tastes, T:
• Collaborative Filtering for text and “news”:
– Cold Start Problem (it isn’t collaborative until it’s collaborative)
– Past Experience: Some people want the most popular (“Dodgers make offer to Manny Ramirez - Boston.com”); some don’t (“Non-Abelian Anyons and Topological Quantum Computation”)
– By weight in whole network; by weight in user’s network; by weight in thematic cluster
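The three weightings in the last bullet can be sketched as a simple linear blend. This is only an illustration, not the LobeLink implementation: the function names, the per-item weight tuples, and the mixing coefficients are all hypothetical, and the numeric values are made up. A popularity-only mix is one plausible fallback for the cold-start case, when a new user has no network or cluster history yet.

```python
# Hypothetical sketch: blend three popularity weights into one ranking score.
# All weight values and mixing coefficients below are made up for illustration.

def blended_score(weights, mix):
    """Linear blend of (whole_network, user_network, thematic_cluster) weights."""
    return sum(m * w for m, w in zip(mix, weights))

def rank(items, mix):
    """Return item titles sorted from highest to lowest blended score."""
    return sorted(items, key=lambda title: blended_score(items[title], mix), reverse=True)

# Each item maps to (weight in whole network, weight in user's network,
# weight in thematic cluster); the numbers are invented toy values.
ITEMS = {
    "Dodgers make offer to Manny Ramirez - Boston.com": (0.9, 0.2, 0.1),
    "Non-Abelian Anyons and Topological Quantum Computation": (0.1, 0.6, 0.9),
}

POPULARITY_FIRST = (1.0, 0.0, 0.0)  # e.g. a cold-start fallback (assumption)
CLUSTER_FIRST = (0.2, 0.3, 0.5)     # favors the user's thematic cluster (assumption)

if __name__ == "__main__":
    print(rank(ITEMS, POPULARITY_FIRST))
    print(rank(ITEMS, CLUSTER_FIRST))
```

With the popularity-only mix the most widely read item ranks first; shifting the mix toward the thematic cluster moves the specialist item to the top.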
Thematic Clustering
• Want more fine-grained recommendations than connectivity in the user network: weight in a given thematic cluster.
Latent Dirichlet Allocation/Analysis
Latent Dirichlet Allocation/Analysis (p3)
Latent Dirichlet Allocation/Analysis (p2)
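For reference, the per-document joint distribution of the standard LDA model can be written as below, in the notation of the Blei, Ng, and Jordan paper listed under Selected References. The symbols are taken from that paper rather than from these slides: \alpha is the Dirichlet prior, \beta holds the per-topic word probabilities, \theta is the document's topic mixture, and z_n is the topic assigned to word w_n in a document of N words.

```latex
p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta)
  = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)
```

Read as a clustering tool, \theta gives each document's weight in each theme, which is the quantity the thematic-cluster weighting needs.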
Infinite Markov Models
• Language models and parsers
• N-gram (bigram, trigram) vs. ∞-gram
• The supercalifragilisticexpialidocious problem
• Hierarchical Pitman-Yor language model (HPYLM)
• Variable-order hierarchical Pitman-Yor language model (VPYLM)
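To make the N-gram versus ∞-gram contrast concrete, here is a minimal character-level sketch. It is a plain maximum-likelihood count model, not the HPYLM or VPYLM; the function names `train_ngram` and `prob` are hypothetical. It only shows how a fixed, short context leaves the next symbol ambiguous while a longer context pins it down.

```python
from collections import Counter, defaultdict

def train_ngram(text, n):
    """Count next-character frequencies for every context of length n - 1."""
    counts = defaultdict(Counter)
    for i in range(len(text) - n + 1):
        context, nxt = text[i:i + n - 1], text[i + n - 1]
        counts[context][nxt] += 1
    return counts

def prob(counts, context, nxt):
    """Maximum-likelihood estimate of P(nxt | context); 0.0 for unseen contexts."""
    seen = counts.get(context, Counter())
    total = sum(seen.values())
    return seen[nxt] / total if total else 0.0

WORD = "supercalifragilisticexpialidocious"

if __name__ == "__main__":
    # Probability that "f" follows progressively longer contexts ending in "i".
    for context in ("i", "li", "ali", "cali"):
        model = train_ngram(WORD, len(context) + 1)
        print(repr(context), prob(model, context, "f"))
```

Each extra character of context sharpens the estimate, but no single fixed order suits every position in the string, which is the motivation for letting the model order vary.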
Selected References
• Richard Forster, "Document Clustering in Large German Corpora Using Natural Language Processing," University of Zurich (2006)
• Blei, Ng, and Jordan, "Latent Dirichlet Allocation," Journal of Machine Learning Research 3 (2003) 993-1022
• Daichi Mochihashi and Eiichiro Sumita, "The Infinite Markov Model," NIPS, 2007
• LobeLink.com