Retrieval Models: Probability Ranking Principle
Web Search
Slides based on the books:
Retrieval models
• Geometric/linear spaces
  • Vector space model
• Probability ranking principle
  • Language models approach to IR (an important emphasis in recent work)
  • Probabilistic retrieval model
    • Binary independence model
    • Okapi BM25
Recall a few probability basics
• For events A and B, Bayes' Rule is:
  p(A, B) = p(A|B) p(B) = p(B|A) p(A)
  p(A|B) = p(A, B) / p(B) = p(A) p(B|A) / p(B)
• Interpretation:
  posterior = prior · likelihood / evidence  ⟺  p(A|B) = p(A) p(B|A) / p(B)
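To make the identities above concrete, here is a minimal sketch; the event names and all numbers are hypothetical, chosen only for illustration.

```python
# Hypothetical numbers, for illustration only.
p_A = 0.3            # prior p(A)
p_B_given_A = 0.8    # likelihood p(B|A)
p_B_given_notA = 0.2 # likelihood p(B|~A)

# Evidence via total probability: p(B) = p(B|A) p(A) + p(B|~A) p(~A)
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

# Bayes' Rule: posterior = prior * likelihood / evidence
p_A_given_B = p_A * p_B_given_A / p_B
print(p_A_given_B)  # 0.24 / 0.38 ≈ 0.632
```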
Recall a few probability basics
• Independence assumption: p(B|A) = ∏ᵢ p(bᵢ|A) and p(B) = ∏ᵢ p(bᵢ), so
  p(A|B) = p(A) p(B|A) / p(B) = p(A) ∏ᵢ p(bᵢ|A) / ∏ᵢ p(bᵢ)
• Odds:
  O(A) = p(A) / p(Ā) = p(A) / (1 − p(A))
  O(A|B) = p(A|B) / p(Ā|B) = [p(A) p(B|A) / p(B)] / [p(Ā) p(B|Ā) / p(B)] = O(A) · p(B|A) / p(B|Ā)
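A small sketch of the odds formulation under the independence assumption (all probabilities are made-up illustrative values): each observed feature bᵢ multiplies the prior odds by its likelihood ratio.

```python
# Illustrative values only.
p_A = 0.3
prior_odds = p_A / (1 - p_A)                      # O(A) = p(A) / p(~A)

p_b_given_A    = [0.9, 0.6, 0.7]                  # p(b_i | A)
p_b_given_notA = [0.4, 0.5, 0.2]                  # p(b_i | ~A)

# Independence assumption: p(B|A) = prod_i p(b_i|A), giving the odds update
# O(A|B) = O(A) * prod_i p(b_i|A) / p(b_i|~A)
posterior_odds = prior_odds
for pa, pn in zip(p_b_given_A, p_b_given_notA):
    posterior_odds *= pa / pn

p_A_given_B = posterior_odds / (1 + posterior_odds)  # convert odds back to a probability
print(posterior_odds, p_A_given_B)
```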
Recall a few probability basics
  p(A | data) = p(A) p(data | A) / p(data)
  p(SLB = champion | data) = p(SLB = champion) p(data | SLB = champion) / p(data)
  posterior = prior · likelihood / evidence
Why probabilities in IR?
• In traditional IR systems, matching between each document and query is attempted in a semantically imprecise space of index terms.
• [Figure] User information need → query representation: our understanding of the user's need is uncertain. Documents → document representation: an uncertain guess of whether a document has relevant content. How do we match the two representations?
• Probabilities provide a principled foundation for uncertain reasoning. Can we use probabilities to quantify our uncertainties?
The document ranking problem
• We have a collection of documents
• A user issues a query
• A list of documents needs to be returned
• The ranking method is the core of an IR system:
  • In what order do we present documents to the user?
  • We want the "best" document to be first, the second best second, and so on.
• Idea: rank by the probability of relevance of the document w.r.t. the information need
Modeling relevance: P(R=1 | document, query)
• Let d represent a document in the collection.
• Let R represent the relevance of a document w.r.t. a query q.
• Let R=1 represent relevant and R=0 not relevant.
• Our goal is to estimate:
  p(R=1|q,d) = p(d,q|R=1) p(R=1) / p(d,q)
  p(R=0|q,d) = p(d,q|R=0) p(R=0) / p(d,q)
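As a toy illustration of what estimating p(R=1|q,d) could look like in practice (the relevance judgments below are invented, and each document is reduced to a single binary feature), one can estimate the prior and the likelihood from judged documents and apply Bayes' Rule:

```python
# Invented relevance judgments for one query: 1 = relevant, 0 = not relevant.
# Each document is reduced to one binary feature: does it contain the query term?
judged = [
    (1, 1), (1, 1), (1, 0), (0, 1), (0, 0), (0, 0), (0, 0), (1, 0),
]  # (contains_term, relevant)

n = len(judged)
p_rel = sum(r for _, r in judged) / n                     # p(R=1)
p_term_given_rel = (
    sum(1 for t, r in judged if t and r) / sum(r for _, r in judged)
)                                                         # p(term | R=1)
p_term = sum(t for t, _ in judged) / n                    # p(term)

# Bayes: p(R=1 | document contains term) = p(term | R=1) p(R=1) / p(term)
p_rel_given_term = p_term_given_rel * p_rel / p_term
print(p_rel_given_term)  # 0.5 on these toy counts
```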
Probability Ranking Principle (PRP)
• PRP in action: rank all documents by p(R=1|q,d)
• Theorem: using the PRP is optimal, in that it minimizes the loss (Bayes risk) under 1/0 loss
  • Provable if all probabilities are correct, etc. [e.g., Ripley 1996]
  p(R|q,d) = p(d,q|R) p(R) / p(d,q)
• Using odds, we reach a more convenient ranking formulation:
  O(R|q,d) = p(R=1|q,d) / p(R=0|q,d)
Probabilistic retrieval models interpretation
• Expanding the odds of relevance with Bayes' Rule, the factors that do not depend on the document cancel, leaving a ratio of document likelihoods:
  O(R|q,d) = p(R=1|q,d) / p(R=0|q,d) ∝ p(d|q,R=1) / p(d|q,R=0)
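A sketch of ranking under this formulation, in the spirit of the Binary Independence Model mentioned later: the per-term probabilities p(t|R=1) and p(t|R=0) below are invented placeholders that a real system would have to estimate.

```python
# Invented per-term probabilities under the relevant and non-relevant models.
p_term_rel    = {"web": 0.6, "search": 0.5, "cat": 0.1}   # p(t | R=1)
p_term_nonrel = {"web": 0.2, "search": 0.2, "cat": 0.3}   # p(t | R=0)
vocab = list(p_term_rel)

def likelihood_ratio(doc_terms):
    """p(d | q, R=1) / p(d | q, R=0) under term independence:
    each vocabulary term contributes p(t|R) if present and 1 - p(t|R) if absent."""
    ratio = 1.0
    for t in vocab:
        if t in doc_terms:
            ratio *= p_term_rel[t] / p_term_nonrel[t]
        else:
            ratio *= (1 - p_term_rel[t]) / (1 - p_term_nonrel[t])
    return ratio

docs = {"d1": {"web", "search"}, "d2": {"cat", "web"}, "d3": {"cat"}}
ranking = sorted(docs, key=lambda d: likelihood_ratio(docs[d]), reverse=True)
print(ranking)  # ['d1', 'd2', 'd3']: d1 matches the terms that favour relevance
```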
Language models interpretation
• In language models, the odds of relevance are instead rewritten in terms of the query posterior, treating the document as the model:
  O(R|q,d) = p(R=1|q,d) / p(R=0|q,d) ∝ log [ p(q|d,R=1) p(R=1|d) ] / [ p(q|d,R=0) p(R=0|d) ]
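In practice this is commonly reduced to the query-likelihood score log p(q|d), treating the remaining factors as constant across documents. A minimal sketch with Jelinek-Mercer smoothing (one of the smoothing schemes named on the next slide); the documents and the λ value are toy choices for illustration:

```python
import math

# Toy collection; the interpolation weight lambda is illustrative.
docs = {
    "d1": "web search ranking with probabilities".split(),
    "d2": "cats and dogs and more cats".split(),
}
collection = [t for d in docs.values() for t in d]
lam = 0.5  # Jelinek-Mercer interpolation weight

def p_term(term, doc):
    """Smoothed p(term | d): interpolate document and collection frequencies."""
    p_doc = doc.count(term) / len(doc)
    p_col = collection.count(term) / len(collection)
    return lam * p_doc + (1 - lam) * p_col

def query_log_likelihood(query, doc):
    # log p(q | d) = sum over query terms of log p(term | d)
    return sum(math.log(p_term(t, doc)) for t in query)

query = "web search".split()
scores = {name: query_log_likelihood(query, d) for name, d in docs.items()}
print(max(scores, key=scores.get))  # d1
```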
The two families of Retrieval Models
• Probability Ranking Principle:
  O(R|q,d) = p(R=1|q,d) / p(R=0|q,d)
• Probabilistic Retrieval Models: O(R|q,d) ∝ p(d|q,R=1) / p(d|q,R=0)
  • Vector Space Model
  • Binary Independence Model
  • BM25
• Language Models: O(R|q,d) ∝ log [ p(q|d,R=1) p(R=1|d) ] / [ p(q|d,R=0) p(R=0|d) ]
  • LM Dirichlet
  • LM Jelinek-Mercer
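Since BM25 closes the probabilistic-retrieval column, here is a compact sketch of its standard scoring function; the toy documents are invented, k1 and b are the commonly used defaults, and the idf variant with +1 inside the log is one common choice.

```python
import math

# Toy collection; k1 and b are the usual default parameters.
docs = {
    "d1": "web search ranking with probabilities".split(),
    "d2": "cats and dogs and more cats and cats".split(),
}
N = len(docs)
avgdl = sum(len(d) for d in docs.values()) / N
k1, b = 1.2, 0.75

def idf(term):
    # A common smoothed idf: log((N - df + 0.5) / (df + 0.5) + 1)
    df = sum(1 for d in docs.values() if term in d)
    return math.log((N - df + 0.5) / (df + 0.5) + 1)

def bm25(query, doc):
    score = 0.0
    for t in query:
        tf = doc.count(t)
        denom = tf + k1 * (1 - b + b * len(doc) / avgdl)
        score += idf(t) * tf * (k1 + 1) / denom
    return score

query = "web search".split()
print(sorted(docs, key=lambda d: bm25(query, docs[d]), reverse=True))  # ['d1', 'd2']
```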