
Recommendation Engines; Collaborative Filtering; Thematic clustering of large text corpora; Infinite Markov model for statistical NLP

Russell W. Hanson, Dec. 8, 2008


  1. Recommendation Engines; Collaborative Filtering; Thematic clustering of large text corpora; Infinite Markov model for statistical NLP
 Russell W. Hanson
 Dec. 8, 2008


  2. Outline
 • Several problems in applied mathematics and approaches to their solutions:
 – Recommendation Engines
 – Collaborative Filtering
 – Thematic clustering of large text corpora
 – Infinite Markov model for statistical NLP


  3. LobeLink.com – social bookmarking; social web annotation; and recommendation engine


  4. Recommendation Engines, $$$
 Amazon.com
 NetFlix.com


  5. A Recommendation Engine


  6. Attributized Bayesian Choice Modeling
 Attributized content items, i, are stored as vectors in the choice-set database such that:
 Summary of tastes, T:
 • Collaborative Filtering for text and “news”:
 – Cold Start Problem (it isn’t collaborative until it’s collaborative)
 – Past Experience: Some people want the most popular (“Dodgers make offer to Manny Ramirez - Boston.com”); some don’t (“Non-Abelian Anyons and Topological Quantum Computation”)
 – By weight in whole network; by weight in user’s network; by weight in thematic cluster
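
The three weighting scopes in the last bullet can be sketched as a bookmark count restricted to different subsets. This is a minimal illustration, not the LobeLink.com implementation: the bookmark log, the follow graph, the item and user names, and the functions `item_weights` and `top_item` are all hypothetical, and "weight" is assumed here to mean a raw bookmark count.

```python
from collections import Counter

# Hypothetical bookmark log: (user, item, thematic_cluster).
BOOKMARKS = [
    ("ann", "dodgers-offer", "sports"),
    ("bob", "dodgers-offer", "sports"),
    ("cat", "dodgers-offer", "sports"),
    ("cat", "anyons-review", "physics"),
    ("dan", "anyons-review", "physics"),
    ("dan", "lda-paper", "ml"),
]

# Hypothetical social graph: user -> set of users they follow.
FOLLOWS = {"eve": {"cat", "dan"}}


def item_weights(bookmarks, users=None, cluster=None):
    """Count bookmarks per item, optionally restricted to a set of users
    and/or a single thematic cluster (assumed definition of "weight")."""
    counts = Counter()
    for user, item, item_cluster in bookmarks:
        if users is not None and user not in users:
            continue
        if cluster is not None and item_cluster != cluster:
            continue
        counts[item] += 1
    return counts


def top_item(weights):
    """Return the highest-weight item, breaking ties alphabetically."""
    if not weights:
        return None
    return min(weights, key=lambda item: (-weights[item], item))


whole_network = item_weights(BOOKMARKS)                       # everyone
user_network = item_weights(BOOKMARKS, users=FOLLOWS["eve"])  # eve's network
physics_cluster = item_weights(BOOKMARKS, cluster="physics")  # one cluster

print(top_item(whole_network), top_item(user_network), top_item(physics_cluster))
```

In this toy data the whole network favors the most popular item, while the user's own network and the thematic cluster surface a different one, which is the contrast the "Past Experience" bullet describes.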


  7. Thematic Clustering
 • Want to have more fine-grained recommendations than connectivity in user network: weight in a given thematic cluster.


  8. Latent Dirichlet Allocation/Analysis


  9. Latent Dirichlet Allocation/Analysis (p3)


  10. Latent Dirichlet Allocation/Analysis (p2)
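
The slide bodies for Latent Dirichlet Allocation did not survive, so the following is only a rough sketch of the standard generative process from Blei, Ng, and Jordan (2003): draw a per-document topic mixture from a Dirichlet, then for each word position draw a topic and then a word from that topic. The vocabulary, the two hand-written topics, the hyperparameter values, and all function names are assumptions made for illustration; no inference is performed.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(probs, rng):
    """Draw an index from a discrete distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(alpha, topics, vocab, length, rng):
    """Generate one document following the LDA generative process."""
    theta = sample_dirichlet(alpha, rng)   # per-document topic mixture
    words = []
    for _ in range(length):
        z = sample_index(theta, rng)       # topic for this word position
        w = sample_index(topics[z], rng)   # word drawn from that topic
        words.append(vocab[w])
    return theta, words


# Toy vocabulary and two hand-written topics (illustrative values only).
VOCAB = ["dodgers", "ramirez", "anyons", "quantum"]
TOPICS = [
    [0.5, 0.5, 0.0, 0.0],  # a "sports"-like topic
    [0.0, 0.0, 0.5, 0.5],  # a "physics"-like topic
]

rng = random.Random(0)
theta, doc = generate_document([0.5, 0.5], TOPICS, VOCAB, length=8, rng=rng)
print(theta, doc)
```

Thematic clustering runs this process in reverse: given only the documents, it estimates the topics and each document's mixture, and the mixture can then serve as the document's cluster membership.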


  11. Infinite Markov Models
 • Language models and parsers
 • N-gram (bigram, trigram) vs. ∞-gram
 • The supercalifragilisticexpialidocious-problem
 • Hierarchical Pitman-Yor language model (HPYLM)
 • Variable order hierarchical Pitman-Yor language model (VPYLM)
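
To make the fixed-order side of the N-gram vs. ∞-gram contrast concrete, here is a minimal sketch of a maximum-likelihood n-gram model over characters, using the long word from the slide as toy data. It is not HPYLM or VPYLM: there is no smoothing and no Pitman-Yor prior, and the names `train_ngram` and `prob` are hypothetical. It only shows how the chosen order n fixes how much context the model can see.

```python
from collections import Counter, defaultdict


def train_ngram(tokens, n):
    """Count next-token frequencies for every context of n-1 tokens."""
    model = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        model[context][tokens[i + n - 1]] += 1
    return model


def prob(model, context, token):
    """Maximum-likelihood estimate of P(token | context); 0.0 if unseen."""
    counts = model.get(tuple(context))
    if not counts:
        return 0.0
    return counts[token] / sum(counts.values())


chars = list("supercalifragilisticexpialidocious")
bigram = train_ngram(chars, 2)    # one character of context
fourgram = train_ngram(chars, 4)  # three characters of context

# With one character of context, "i" is followed by seven different letters.
print(prob(bigram, ["i"], "f"))
# With three characters of context, "ali" is followed by "f" or "d".
print(prob(fourgram, ["a", "l", "i"], "f"))
# An unseen context gets probability zero without smoothing.
print(prob(bigram, ["z"], "a"))
```

A fixed n applies the same context length everywhere, even where a shorter or longer one would fit the data better; variable-order models such as the VPYLM listed above are aimed at letting that length adapt.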

  12. Selected References
 • Richard Forster, Document Clustering in Large German Corpora Using Natural Language Processing, University of Zurich (2006)
 • Blei, Ng, and Jordan, Latent Dirichlet Allocation, Journal of Machine Learning Research 3 (2003) 993-1022
 • Daichi Mochihashi and Eiichiro Sumita, The Infinite Markov Model, NIPS, 2007
 • LobeLink.com
