Gibbs Sampling for LDA


  1. Gibbs Sampling for LDA. Lei Tang, Department of CSE, Arizona State University. January 7, 2008.

  2. Graphical Representation. α and β are fixed hyper-parameters. We need to estimate the parameters θ for each document and φ for each topic. Z are latent variables. This is different from the original LDA work.

  3. Property of the Dirichlet. The expectation of a Dirichlet distribution is E(μ_k) = α_k / α_0, where α_0 = Σ_k α_k.
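This property can be checked numerically. A minimal sketch (not from the slides) using numpy, with an arbitrary α chosen only for illustration:

    import numpy as np

    # Hypothetical hyper-parameter vector, chosen only for illustration.
    alpha = np.array([2.0, 5.0, 3.0])

    # Empirical mean of many Dirichlet draws.
    samples = np.random.default_rng(0).dirichlet(alpha, size=100_000)
    print(samples.mean(axis=0))   # close to [0.2, 0.5, 0.3]

    # Analytical expectation E(mu_k) = alpha_k / alpha_0.
    print(alpha / alpha.sum())    # exactly [0.2, 0.5, 0.3]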

  4. Gibbs Variants.
     (1) Gibbs Sampling: draw a conditioned on b, c; draw b conditioned on a, c; draw c conditioned on a, b.
     (2) Block Gibbs Sampling: draw a, b jointly conditioned on c; draw c conditioned on a, b.
     (3) Collapsed Gibbs Sampling: draw a conditioned on c; draw c conditioned on a. Here b is collapsed out during the sampling process.

  5. Collapsed Sampling for LDA. In the original paper "Finding Scientific Topics", the authors are more interested in text modelling (finding Z), hence the Gibbs sampling procedure boils down to estimating P(z_i = j | z_{−i}, w). Here θ and φ are integrated out. Actually, if we knew the exact Z for each document, it would be trivial to estimate θ and φ.

     P(z_i = j | z_{−i}, w) ∝ P(z_i = j, z_{−i}, w)
                            = P(w_i | z_i = j, z_{−i}, w_{−i}) P(z_i = j | z_{−i}, w_{−i})
                            = P(w_i | z_i = j, z_{−i}, w_{−i}) P(z_i = j | z_{−i})

     The last step uses the fact that z_i is independent of w_{−i} given z_{−i}. The first term is the likelihood and the second term acts like a prior.

  6. The first term:

     P(w_i | z_i = j, z_{−i}, w_{−i}) = ∫ P(w_i | z_i = j, φ^(j)) P(φ^(j) | z_{−i}, w_{−i}) dφ^(j)
                                      = ∫ φ^(j)_{w_i} P(φ^(j) | z_{−i}, w_{−i}) dφ^(j)

     with P(φ^(j) | z_{−i}, w_{−i}) ∝ P(w_{−i} | φ^(j), z_{−i}) P(φ^(j)) ∝ Dirichlet(β + n^(w)_{−i,j}). Here n^(w)_{−i,j} is the number of instances of word w assigned to topic j, excluding the current one. Using the property of the expectation of the Dirichlet distribution, we have

     P(w_i | z_i = j, z_{−i}, w_{−i}) = (n^(w_i)_{−i,j} + β) / (n^(·)_{−i,j} + W β)

     where n^(·)_{−i,j} is the total number of words assigned to topic j and W is the vocabulary size.
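As a concrete (hypothetical) illustration of this term, a small sketch with toy topic-term counts; the array name n_wt and the numbers are assumptions, not from the slides:

    import numpy as np

    W, K, beta = 5, 2, 0.1            # vocabulary size, topics, hyper-parameter
    # n_wt[w, j]: count of word w assigned to topic j, current token excluded.
    n_wt = np.array([[3, 0],
                     [1, 2],
                     [0, 4],
                     [2, 1],
                     [0, 0]])

    w_i = 1                           # word type of the current token
    # (n^(w_i)_{-i,j} + beta) / (n^(.)_{-i,j} + W*beta), evaluated for every topic j.
    likelihood = (n_wt[w_i] + beta) / (n_wt.sum(axis=0) + W * beta)
    print(likelihood)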

  7. Similarly, for the second term, we have

     P(z_i = j | z_{−i}) = ∫ P(z_i = j | θ^(d)) P(θ^(d) | z_{−i}) dθ^(d)

     with P(θ^(d) | z_{−i}) ∝ P(z_{−i} | θ^(d)) P(θ^(d)) ∝ Dirichlet(n^(d)_{−i} + α), where n^(d)_{−i,j} is the number of words in document d assigned to topic j, excluding the current one. Again using the expectation of the Dirichlet,

     P(z_i = j | z_{−i}) = (n^(d)_{−i,j} + α) / (n^(d)_{−i,·} + K α)

     where n^(d)_{−i,·} is the total number of words in document d carrying a topic assignment, excluding the current one, and K is the number of topics.
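The same kind of toy computation for this term, assuming hypothetical document-topic counts n_dt for the current document (names and numbers are illustrative only):

    import numpy as np

    K, alpha = 2, 0.5
    # n_dt[j]: words in the current document assigned to topic j, current token excluded.
    n_dt = np.array([6, 3])

    # (n^(d)_{-i,j} + alpha) / (n^(d)_{-i,.} + K*alpha), evaluated for every topic j.
    prior = (n_dt + alpha) / (n_dt.sum() + K * alpha)
    print(prior)   # [0.65, 0.35]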

  8. Algorithm. Combining the two terms:

     P(z_i = j | z_{−i}, w) ∝ (n^(w_i)_{−i,j} + β) / (n^(·)_{−i,j} + W β) × (n^(d)_{−i,j} + α) / (n^(d)_{−i,·} + K α)

     We need to record four count variables:
       document-topic count n^(d)_{−i,j}
       document-topic sum n^(d)_{−i,·} (actually a constant)
       topic-term count n^(w_i)_{−i,j}
       topic-term sum n^(·)_{−i,j}
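A minimal sketch of one collapsed Gibbs sweep built around these counts. All names (gibbs_sweep, docs, n_dt, n_wt, n_t) and the toy data are assumptions for illustration, not code from the slides; the document-topic sum is just the document length, so it is not stored explicitly:

    import numpy as np

    def gibbs_sweep(docs, z, n_dt, n_wt, n_t, alpha, beta, rng):
        """One sweep: resample the topic of every token from its full conditional.
        docs[d]: list of word ids; z[d]: current topic ids (same shape);
        n_dt[d, j]: document-topic count; n_wt[w, j]: topic-term count;
        n_t[j]: topic-term sum."""
        W, K = n_wt.shape
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                j = z[d][i]
                # Remove the current token to get the "-i" counts.
                n_dt[d, j] -= 1; n_wt[w, j] -= 1; n_t[j] -= 1
                # Likelihood term times prior term; the prior denominator
                # (document length - 1 + K*alpha) is constant over j, so it cancels.
                p = (n_wt[w] + beta) / (n_t + W * beta) * (n_dt[d] + alpha)
                j = rng.choice(K, p=p / p.sum())
                # Put the token back with its newly sampled topic.
                z[d][i] = j
                n_dt[d, j] += 1; n_wt[w, j] += 1; n_t[j] += 1

    # Toy usage: 2 documents over a 5-word vocabulary, K = 2 topics.
    rng = np.random.default_rng(0)
    docs = [[0, 1, 1, 3], [2, 2, 4, 0]]
    K, W = 2, 5
    alpha, beta = 0.5, 0.1
    z = [[int(rng.integers(K)) for _ in doc] for doc in docs]
    n_dt = np.zeros((len(docs), K), int)
    n_wt = np.zeros((W, K), int)
    n_t = np.zeros(K, int)
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            n_dt[d, z[d][i]] += 1; n_wt[w, z[d][i]] += 1; n_t[z[d][i]] += 1
    for _ in range(200):
        gibbs_sweep(docs, z, n_dt, n_wt, n_t, alpha, beta, rng)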

  9. Parameter Estimation. To obtain φ and θ there are two ways: draw one sample of z, or draw multiple samples of z and average the estimates.

     φ_{j,w} = (n^(j)_w + β) / (Σ_{w=1..V} n^(j)_w + V β)

     θ^(d)_j = (n^(d)_j + α) / (Σ_{z=1..K} n^(d)_z + K α)

     where n^(j)_w is the frequency of word w assigned to topic j, and n^(d)_z is the number of words in document d assigned to topic z.
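Continuing the hypothetical sketch above (same assumed arrays n_wt, n_dt and scalars W, K, alpha, beta), the point estimates from one sample of z could be computed as:

    # W here plays the role of V on the slide (vocabulary size).
    # phi[j, w]: probability of word w under topic j.
    phi = (n_wt.T + beta) / (n_wt.sum(axis=0)[:, None] + W * beta)

    # theta[d, j]: topic proportions of document d.
    theta = (n_dt + alpha) / (n_dt.sum(axis=1, keepdims=True) + K * alpha)

    print(phi.sum(axis=1), theta.sum(axis=1))   # each row sums to 1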

  10. Comment. Compared with variational Bayes (VB), Gibbs sampling is easier to implement and to extend, more efficient, and faster at obtaining a good approximation.
