Introduction To Machine Learning
David Sontag
New York University
Lecture 21, April 14, 2016
David Sontag (NYU) Introduction To Machine Learning Lecture 21, April 14, 2016 1 / 14
Expectation maximization

The algorithm is as follows:
1 Write down the complete log-likelihood log p(x, z; θ) in such a way that it is linear in z
2 Initialize θ^0, e.g. at random or using a good first guess
3 Repeat until convergence:
  θ^(t+1) = argmax_θ Σ_{d=1}^{D} E_{z | x_d; θ^t} [ log p(x_d, z; θ) ]
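The three steps above can be sketched on a toy model, a mixture of two unit-variance Gaussians, where both the E-step expectations and the M-step maximization have closed forms (a minimal illustration, not from the lecture; all names and data are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: 200 points near -2 and 300 near +3 (true mixing weight 0.6)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 300)])

# Step 2: initialize theta^0 = (pi, mu) with a rough first guess
pi, mu = 0.5, np.array([-1.0, 1.0])

# Step 3: repeat until convergence
for _ in range(50):
    # E-step: the complete log-likelihood is linear in z, so we only need
    # the posterior responsibilities q(z = 1 | x) under the current theta
    w1 = pi * np.exp(-0.5 * (x - mu[1]) ** 2)
    w0 = (1 - pi) * np.exp(-0.5 * (x - mu[0]) ** 2)
    q = w1 / (w0 + w1)
    # M-step: closed-form argmax of the expected complete log-likelihood
    pi = q.mean()
    mu = np.array([np.sum((1 - q) * x) / np.sum(1 - q),
                   np.sum(q * x) / np.sum(q)])

print(pi, mu)   # recovers roughly 0.6 and means near -2 and +3
```

Each iteration alternates between inferring the hidden assignments given the parameters and re-fitting the parameters given those soft assignments.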
[Graphical model for the mixture of multinomials: a prior distribution θ generates the topic Z_d of doc d, which generates each word w_id via the topic-word distributions β; plates over i = 1 to N and d = 1 to D]
The joint distribution factorizes over documents as p(w, Z; θ, β) = Π_{d=1}^{D} p(w_d, Z_d; θ, β), where

p(w_d, Z_d = k; θ, β) = θ_k Π_{i=1}^{N} β_{k, w_id}

so the marginal likelihood of a document is

p(w_d; θ, β) = Σ_{k=1}^{K} θ_k Π_{i=1}^{N} β_{k, w_id}
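The marginal likelihood p(w_d; θ, β) = Σ_k θ_k Π_i β_{k, w_id} can be computed directly for a small case (a hypothetical sketch; the numbers are invented, not from the slides):

```python
import numpy as np

def doc_likelihood(w_d, theta, beta):
    """p(w_d; theta, beta) = sum_k theta_k * prod_i beta[k, w_d[i]]"""
    per_topic = beta[:, w_d].prod(axis=1)   # prod_i beta[k, w_id], shape (K,)
    return float(theta @ per_topic)

# Invented toy numbers: K = 2 topics, vocabulary of W = 3 words
theta = np.array([0.7, 0.3])
beta = np.array([[0.8, 0.1, 0.1],
                 [0.1, 0.1, 0.8]])
# 3-word document with word indices [0, 0, 2]
print(doc_likelihood([0, 0, 2], theta, beta))  # 0.7*0.064 + 0.3*0.008 = 0.0472
```

For realistic document lengths this should be done in log space to avoid underflow.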
EM for the mixture of multinomials:

E-step: for each document d = 1 to D and topic k = 1 to K, compute the posterior

p(Z_d = k | w_d; θ^t, β^t) = θ^t_k Π_{i=1}^{N} β^t_{k, w_id} / Σ_{k̂=1}^{K} θ^t_{k̂} Π_{i=1}^{N} β^t_{k̂, w_id}

M-step: re-estimate the parameters from the expected counts:

θ^(t+1)_k = (1/D) Σ_{d=1}^{D} p(Z_d = k | w_d; θ^t, β^t)

β^(t+1)_{k, w} ∝ Σ_{d=1}^{D} p(Z_d = k | w_d; θ^t, β^t) n_d(w), normalized over the vocabulary w = 1 to W, where n_d(w) is the number of times word w appears in document d.
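A minimal numpy sketch of EM for the mixture of multinomials: posterior over topics in the E-step, expected-count re-estimation of θ and β in the M-step (variable names and toy data are my own, not from the slides):

```python
import numpy as np

def em_step(docs, theta, beta):
    """One EM iteration. docs: list of word-index arrays;
    theta: (K,) topic prior; beta: (K, W) topic-word distributions."""
    D, W = len(docs), beta.shape[1]
    K = theta.shape[0]
    post = np.zeros((D, K))      # E-step: p(Z_d = k | w_d; theta, beta)
    counts = np.zeros((D, W))    # n_d(w): count of word w in document d
    for d, w_d in enumerate(docs):
        np.add.at(counts[d], w_d, 1)
        # log theta_k + sum_i log beta[k, w_id], normalized over k
        log_p = np.log(theta) + np.log(beta) @ counts[d]
        p = np.exp(log_p - log_p.max())
        post[d] = p / p.sum()
    # M-step: theta_k = (1/D) sum_d post[d, k]; beta_k ∝ expected word counts
    theta_new = post.mean(axis=0)
    beta_new = post.T @ counts
    beta_new /= beta_new.sum(axis=1, keepdims=True)
    return theta_new, beta_new

# Toy corpus over a 3-word vocabulary (invented data)
docs = [np.array([0, 0, 1]), np.array([2, 2, 1]), np.array([0, 0, 0])]
theta = np.array([0.5, 0.5])
beta = np.array([[0.6, 0.2, 0.2],
                 [0.2, 0.2, 0.6]])
for _ in range(5):
    theta, beta = em_step(docs, theta, beta)
print(theta)   # docs 0 and 2 gravitate to one topic: roughly [2/3, 1/3]
```

In practice one would add a small smoothing constant to the expected counts so no β entry collapses to exactly zero.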
Example topic word lists:
- politics: politics, president, obama, washington, religion
- religion: hindu, judaism, ethics, buddhism
- sports: sports, baseball, soccer, basketball, football
Inferred topic mixture for an example document: weather .50, finance .49, sports .01
1 Sample the document’s topic distribution θ (aka topic vector)
2 For i = 1 to N, sample the topic z_i of the i’th word
3 ... and then sample the actual word w_i from the z_i’th topic
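The generative process just listed can be sketched directly with numpy's Dirichlet and categorical samplers (a toy sketch; K, W, N, α, and β are arbitrary assumed values):

```python
import numpy as np

rng = np.random.default_rng(0)
K, W, N = 3, 5, 8                          # topics, vocab size, words per doc
alpha = np.ones(K)                         # Dirichlet hyperparameters (assumed)
beta = rng.dirichlet(np.ones(W), size=K)   # topic-word distributions, shape (K, W)

# 1. Sample the document's topic distribution theta (the topic vector)
theta = rng.dirichlet(alpha)
# 2. For i = 1 to N, sample the topic z_i of the i'th word ...
z = rng.choice(K, size=N, p=theta)
# 3. ... then sample the actual word w_i from the z_i'th topic
w = np.array([rng.choice(W, p=beta[zi]) for zi in z])
print(z, w)
```

Repeating this per document yields a corpus in which each document has its own mixture over the shared topics.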
The Dirichlet distribution: the α_t > 0, t = 1 to T, are hyperparameters. The Dirichlet density, defined over the simplex {θ : θ_t ≥ 0, Σ_{t=1}^{T} θ_t = 1}, is:

p(θ_1, ..., θ_T; α) ∝ Π_{t=1}^{T} θ_t^(α_t − 1)

[Plots of log Pr(θ) over (θ_1, θ_2) for different settings of α_1 = α_2 = α_3]
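A quick way to see what the hyperparameters α do (an illustration with assumed values, not from the slides): α_t < 1 pushes samples toward sparse corners of the simplex, while α_t > 1 concentrates them near uniform.

```python
import numpy as np

rng = np.random.default_rng(0)
sparse = rng.dirichlet([0.1, 0.1, 0.1], size=1000)    # alpha_t < 1
smooth = rng.dirichlet([10.0, 10.0, 10.0], size=1000)  # alpha_t > 1

# Sparse draws put most mass on one coordinate; smooth draws stay near 1/3
print(sparse.max(axis=1).mean(), smooth.max(axis=1).mean())
```

For topic models this is why a small α is common: most documents are about only a few topics.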
Example topic-word distributions (word, probability):
- politics: politics .0100, president .0095, washington .0085, religion .0060
- religion: religion .0500, hindu .0092, judaism .0080, ethics .0075, buddhism .0016
- sports: sports .0105, baseball .0100, soccer .0055, basketball .0050, football .0045
[Figure: Topics, Documents, Topic proportions and assignments. Example topics: gene .04, dna .02, genetic .01, ...; life .02, evolve .01, ...; brain .04, neuron .02, nerve .01, ...; data .02, number .02, computer .01, ...]
(Blei, Introduction to Probabilistic Topic Models, 2011)
[Plate diagrams contrasting the two models. Mixture of multinomials: a prior distribution θ generates the topic Z_d of doc d, which generates each word w_id via the topic-word distributions β. LDA: Dirichlet hyperparameters α generate a topic distribution θ_d for each document, which generates the topic z_id of word i of doc d, which generates the word w_id via the topic-word distributions β; plates over i = 1 to N and d = 1 to D]