Lecture 6: (Probabilistic) Latent Semantic Analysis

  1. CS598JHM: Advanced NLP (Spring 2013)
     http://courses.engr.illinois.edu/cs598jhm/
     Lecture 6: (Probabilistic) Latent Semantic Analysis
     Julia Hockenmaier, juliahmr@illinois.edu, 3324 Siebel Center
     Office hours: by appointment

  2. Indexing by Latent Semantic Analysis (Deerwester et al., 1990)

  3. Latent Semantic Analysis
     The task: return relevant documents for text queries.
     The problem: relevance is conceptual/semantic
     - The index of relevant documents may not contain all query terms (synonymy and missing information)
     - The query terms may be ambiguous (polysemy)
     Indexing by Latent Semantic Analysis:
     - Map queries and documents into a new vector space whose k dimensions correspond to independent concepts
     - In this space, queries will be near semantically close documents

  4. [Figure: documents, terms, and a query plotted along Dimension 1 and Dimension 2 of the concept space; the highlighted region is the set of documents closest to the query (e.g. cosine > .9).]

  5. Latent Semantic Analysis
     Low-rank approximation via Singular Value Decomposition (SVD):
     [Diagram: X (terms × documents) ≈ X̂ = T0 (terms × concepts) × S0 × D0′ (concepts × documents)]
     X: term-document matrix (= data): X_ij = freq of w_i in d_j
     X̂ = T0 S0 D0′ (k-rank approximation of X)
     T0: columns are orthogonal and unit-length, T0′T0 = I
     S0: diagonal matrix of the k largest singular values
     D0: columns are orthogonal and unit-length, D0′D0 = I
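
As a concrete sketch of this decomposition (not from the slides): a truncated SVD of a small made-up term-document matrix in numpy, with k = 2 concepts. The matrix and its counts are invented purely for illustration.

    import numpy as np

    # Toy term-document matrix X: rows = terms, columns = documents
    # (made-up counts, for illustration only).
    X = np.array([[2, 0, 1, 0],
                  [1, 3, 0, 0],
                  [0, 1, 0, 2],
                  [0, 0, 2, 1]], dtype=float)

    k = 2  # number of latent concepts

    # Full SVD: X = U diag(s) Vt, with orthonormal U and Vt.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    # Keep the k largest singular values: T0, S0, D0 in the slides' notation.
    T0 = U[:, :k]         # terms x concepts
    S0 = np.diag(s[:k])   # concepts x concepts, diagonal
    D0 = Vt[:k, :].T      # documents x concepts

    # k-rank approximation X_hat = T0 S0 D0'
    X_hat = T0 @ S0 @ D0.T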

  6. LSA: term similarity
     X̂ X̂′ = T0 S0 S0 T0′
     (D0 cancels out because S0 is diagonal and D0 is orthonormal)
     Similarity of terms w_i, w_j in the new space: (X̂ X̂′)_ij, the dot product of the rows for w_i and w_j.
     Terms are therefore represented by the rows of T0 S0.

  7. LSA: document similarity
     X̂′ X̂ = D0 S0 S0 D0′
     (T0 cancels out because S0 is diagonal and T0 is orthonormal)
     Similarity of documents d_i, d_j in the new space: (X̂′ X̂)_ij, the dot product of the rows for d_i and d_j.
     Documents are therefore represented by the rows of D0 S0.
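
Continuing the sketch above (same made-up matrix and variables), the term and document similarities of slides 6 and 7 can be computed either from X̂ directly or from the reduced representations T0 S0 and D0 S0:

    # Terms live in the rows of T0 S0; their dot products equal (X_hat X_hat')_ij.
    term_vecs = T0 @ S0                  # terms x concepts
    term_sim = term_vecs @ term_vecs.T   # = X_hat @ X_hat.T

    # Documents live in the rows of D0 S0; their dot products equal (X_hat' X_hat)_ij.
    doc_vecs = D0 @ S0                   # documents x concepts
    doc_sim = doc_vecs @ doc_vecs.T      # = X_hat.T @ X_hat

    # Sanity check against the explicit low-rank matrix.
    assert np.allclose(term_sim, X_hat @ X_hat.T)
    assert np.allclose(doc_sim, X_hat.T @ X_hat)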

  8. LSA: term-document similarity
     The elements of X̂ give the similarity of terms and documents.
     Now, terms are projected to T S^(1/2) and documents to D S^(1/2).
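
And the term-document comparison via the S^(1/2) projections, again reusing the variables of the SVD sketch:

    # Project terms to T0 S0^(1/2) and documents to D0 S0^(1/2); the dot product of
    # a term row with a document row is the corresponding entry of X_hat.
    S0_half = np.diag(np.sqrt(s[:k]))
    term_proj = T0 @ S0_half             # terms x concepts
    doc_proj = D0 @ S0_half              # documents x concepts

    assert np.allclose(term_proj @ doc_proj.T, X_hat)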

  9. LSA: query-document similarity
     Queries q are 'pseudo-documents': they do not appear in X.
     Construct their term vector X_q.
     Define their document vector D_q = X_q′ T S^(-1).
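
A sketch of folding a query into the concept space, still reusing the SVD sketch's variables. The query counts are invented, and ranking documents by cosine against the rows of D0 is one reasonable choice; the slide only defines the projection D_q itself:

    # Query as a 'pseudo-document': a term-count vector X_q that was not a column of X.
    X_q = np.array([1, 0, 0, 1], dtype=float)   # made-up counts over the 4 toy terms

    # D_q = X_q' T0 S0^(-1): the query's coordinates in the k-dimensional document space.
    D_q = X_q @ T0 @ np.linalg.inv(S0)

    # Rank documents by cosine similarity between D_q and the document coordinates in D0.
    cos = (D0 @ D_q) / (np.linalg.norm(D0, axis=1) * np.linalg.norm(D_q))
    ranking = np.argsort(-cos)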

  10. Probabilistic Latent Semantic Indexing (Hofmann, 1999)

  11. The aspect model
      Observations are document-word pairs (d, w).
      Assume there are k aspects z_1 ... z_k.
      Each observation is associated with a hidden aspect z.
      P(d, w) = P(d) P(w | d)   with   P(w | d) = ∑_{z ∈ Z} P(w | z) P(z | d)
      Or, equivalently:
      P(d, w) = ∑_{z ∈ Z} P(z) P(d | z) P(w | z)
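
To make the two factorizations concrete, here is a tiny made-up parameterization (2 aspects, 2 documents, 3 words) checked in numpy; all probabilities are invented for illustration:

    import numpy as np

    P_z = np.array([0.6, 0.4])          # P(z), 2 aspects
    P_d_z = np.array([[0.7, 0.2],       # P(d | z): rows = documents, columns = aspects
                      [0.3, 0.8]])
    P_w_z = np.array([[0.5, 0.1],       # P(w | z): rows = words, columns = aspects
                      [0.3, 0.3],
                      [0.2, 0.6]])

    # P(d, w) = sum_z P(z) P(d | z) P(w | z)
    P_dw = np.einsum('z,dz,wz->dw', P_z, P_d_z, P_w_z)

    # Equivalently: P(d, w) = P(d) P(w | d), with P(w | d) = sum_z P(w | z) P(z | d)
    P_d = P_dw.sum(axis=1)                       # marginal P(d)
    P_z_d = (P_z * P_d_z) / P_d[:, None]         # P(z | d) by Bayes' rule, rows = documents
    P_w_d = P_z_d @ P_w_z.T                      # P(w | d), rows = documents, columns = words
    assert np.allclose(P_dw, P_d[:, None] * P_w_d)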

  12. A geometric interpretation
      [Figure: the word simplex over w1, w2, w3; documents P(w | d) and topics P(w | z) are points inside it, and the topics span a (sub)simplex.]
      Any point in the word simplex defines a multinomial over words.
      Each document corresponds to one multinomial over words, P(w | d).
      Each topic is a multinomial over words, P(w | z).
      The topics define the corners of a (sub)simplex; all training documents lie inside this topic simplex:
      P(w | d) = λ1 P(w | z1) + λ2 P(w | z2) + λ3 P(w | z3)
               = P(z1 | d) P(w | z1) + P(z2 | d) P(w | z2) + P(z3 | d) P(w | z3)

  13. PLSA is a mixture model
      Mixture models:
      - K mixture components and N observations x_1 ... x_N
      - Mixing weights (θ_1 ... θ_K): P(k) = θ_k
      - Each observation x_n is generated by mixture component z_n: P(x_n) = P(z_n) P(x_n | z_n)
      PLSI:
      - Mixture components = topics
      - Mixing weights are specific to each document: θ_d = (θ_d1 ... θ_dK)
      - Each observation (word) w_d,n is a sample from the document-specific mixture model. It is drawn from one of the components z_d,n: P(w_d,n) = P(z_d,n | θ_d) P(w_d,n | z_d,n)

  14. Estimation: EM algorithm
      E-step: recompute P(z | d, w) = P(z, d, w) / ∑_{z′} P(z′, d, w), with P(z, d, w) = P(z) P(d | z) P(w | z)
      M-step: recompute
      P(w | z) ∝ ∑_d freq(d, w) P(z | d, w)
      P(d | z) ∝ ∑_w freq(d, w) P(z | d, w)
      P(z) ∝ ∑_d ∑_w freq(d, w) P(z | d, w)
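
A minimal numpy sketch of these updates on a made-up document-word count matrix, with random initialization and a fixed number of iterations (no tempering, smoothing, or held-out stopping, so purely illustrative):

    import numpy as np

    rng = np.random.default_rng(0)

    # freq(d, w): toy count matrix, rows = documents, columns = words (invented numbers).
    freq = np.array([[3, 0, 1, 0],
                     [0, 2, 0, 1],
                     [1, 1, 2, 0]], dtype=float)
    D, W = freq.shape
    K = 2  # number of aspects z

    # Random initial parameters, normalized to proper distributions.
    P_z = np.full(K, 1.0 / K)                                 # P(z)
    P_d_z = rng.random((D, K)); P_d_z /= P_d_z.sum(axis=0)    # P(d | z), columns sum to 1
    P_w_z = rng.random((W, K)); P_w_z /= P_w_z.sum(axis=0)    # P(w | z), columns sum to 1

    for _ in range(50):
        # E-step: P(z | d, w) ∝ P(z) P(d | z) P(w | z)
        joint = P_z[None, None, :] * P_d_z[:, None, :] * P_w_z[None, :, :]   # (D, W, K)
        P_z_dw = joint / joint.sum(axis=2, keepdims=True)

        # M-step: accumulate expected counts freq(d, w) P(z | d, w), then renormalize.
        weighted = freq[:, :, None] * P_z_dw          # (D, W, K)
        P_w_z = weighted.sum(axis=0)                  # (W, K)
        P_w_z /= P_w_z.sum(axis=0)
        P_d_z = weighted.sum(axis=1)                  # (D, K)
        P_d_z /= P_d_z.sum(axis=0)
        P_z = weighted.sum(axis=(0, 1))
        P_z /= P_z.sum()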
