Simulation of Within-Session Query Variations using a Text Segmentation Approach
Debasis Ganguly, Johannes Leveling, Gareth J.F. Jones
CNGL, School of Computing, Dublin City University, Ireland

Outline
• Introduction to query reformulation
• Automatic generation of query reformulations
• Characteristics of the reformulations in terms of retrieval results
• Evaluation
• Conclusions and future work

Query Reformulation Types in IR
• Specialization: a more particular information need than in the previous query
  – Example: “Mahatma Gandhi” → “Mahatma Gandhi non-violence movement”
• Generalization: a more general information need than in the previous query
  – Example: “Mahatma Gandhi assassination” → “Mahatma Gandhi life and works”
• Query drift: a move toward a related but different information need
  – Example: “Mahatma Gandhi assassination” → “Gandhi film”

Motivation
• Hypothesis: automatic query reformulations can be used to simulate user query sessions
• Objective: simulate query sessions in large quantities
  – Less time-consuming
  – Less expensive
  – No privacy issues
  – Independent from real data
• Simulated query sessions can help in
  – Session IR tasks: the goal is to improve IR effectiveness over an entire query session for a user
  – Collaborative IR tasks: the goal is to improve IR effectiveness for a new user by utilizing user responses to related queries

Specialization
• Initial query: “wildlife poaching”
• Very general search; no restrictions on particular animal species or locations
• After reading two documents, the user now knows that poaching is frequent for “African lions” and “Indian tigers”
• Adding these words makes the query more specific

Generalization
• Initial query: “osteoporosis”
• Specific information request (by using a technical term); the user may not be sure what it actually means
• Document about bone diseases in general, with a dedicated section on osteoporosis
• After reading the document, the user knows that “osteoporosis” is a type of “bone disease”
• Substituting “osteoporosis” with the words “bone” and “disease” now means the user is interested in “bone diseases” in general instead of one particular bone disease

Text Segmentation
• Documents are composed of a series of densely discussed subtopics
• Text segmentation draws boundaries at topic shifts
• Example subtopics (from M. Hearst, CL, 1997):
  – Introduction – The search for life in space
  – The moon's chemical composition
  – How the moon helped life evolve on earth

Term Distribution
• Perform text segmentation to get blocks of coherent text passages
• Terms densely distributed in a sub-topic are useful for specific reformulations
• Terms uniformly distributed throughout a document are useful for general reformulations
[Figure: a document segmented into topic 1 and topic 2. Term 1: dominant in topic 1; Term 2: dominant in topic 2; Term 3: general term]

Algorithm for Automatic Query Reformulation
• Use the top ranked documents from an initial retrieval step as external information for reformulations
• Categorize terms into two classes, specific and general, by computing their distribution across the segments of the top ranked documents
• Generate candidate query reformulations
  – Add the most specific terms from the most/least similar segments of documents to the original query to get a more specific/drifting query
  – Substitute original query terms with more general terms obtained from the pseudo-relevant set of documents
• Rank by score and select the best N variants

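The candidate-generation step can be illustrated with a small sketch. It is not the authors' implementation: it assumes the term scores have already been computed and are passed in as a plain dictionary, and the function names `specialize` and `generalize` and their parameters are hypothetical.

```python
def _rank_terms(scored_terms):
    """Return terms sorted by descending score (ties broken alphabetically)."""
    return [t for t, _ in sorted(scored_terms.items(), key=lambda kv: (-kv[1], kv[0]))]


def specialize(query_terms, scored_terms, max_added=3):
    """Append the highest-scoring terms that are not already in the query.

    Assumption: scored_terms maps candidate term -> specificity score.
    """
    added = [t for t in _rank_terms(scored_terms) if t not in query_terms][:max_added]
    return list(query_terms) + added


def generalize(scored_terms, max_terms=2):
    """Replace the query by the highest-scoring general terms.

    Assumption: scored_terms maps candidate term -> generality score.
    """
    return _rank_terms(scored_terms)[:max_terms]


if __name__ == "__main__":
    print(specialize(["wildlife", "poaching"],
                     {"lions": 2.0, "poaching": 3.0, "tigers": 1.5, "law": 0.2},
                     max_added=2))
    print(generalize({"bone": 2.0, "disease": 1.8, "calcium": 0.5}))
```

With drift reformulations, the same `specialize` call would be fed scores from the least similar segments instead of the most similar ones; the toy scores above are made up for illustration.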
Term Scores
• Specialization/drift score: combine
  – term frequency in segment,
  – inverse segment frequency, and
  – idf
• Generalization score: combine
  – term frequency in document,
  – segment frequency, and
  – idf
• Combination in mixture model (see paper for details)

Result Set Characteristics
• Specialization:
  – Smaller set of relevant documents (queries are typically longer)
  – Top ranked documents for the original query become more general with respect to the specific reformulated query but are still relevant (overlap in top ranked documents)
• Generalization:
  – Larger set of relevant documents (queries are typically shorter)
  – Low overlap and high shift of the top ranked documents retrieved in response to the original query

Evaluation Measures
• Two measures:
  – Overlap of retrieved documents at cut-off N = 10, 20, 50 and 500: O(N)
  – Net perturbation of the top m documents: $p(m) = \frac{1}{m} \sum_{k=1}^{m} \left( \mathrm{new\_rank}(d_k) - k \right)$
• Expected observations:
  – High overlap and low perturbation for specialized queries
  – Low overlap and high perturbation for general queries

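A minimal sketch of the two measures, under stated assumptions: rankings are lists of document ids in rank order, overlap is reported as a percentage of the cut-off N, and a document that no longer appears in the reformulated ranking is assigned the rank just past the end of that list. The slide does not specify the last two choices, and the function names are hypothetical.

```python
def overlap_at(original, reformulated, n):
    """O(N): percentage of documents shared by the two top-n lists."""
    shared = set(original[:n]) & set(reformulated[:n])
    return 100.0 * len(shared) / n


def net_perturbation(original, reformulated, m, missing_rank=None):
    """p(m): mean of new_rank(d_k) - k over the top-m documents of the original ranking.

    Assumption: documents absent from `reformulated` get rank len(reformulated) + 1
    unless `missing_rank` is given.
    """
    new_rank = {doc: r for r, doc in enumerate(reformulated, start=1)}
    if missing_rank is None:
        missing_rank = len(reformulated) + 1
    total = 0
    for k, doc in enumerate(original[:m], start=1):
        total += new_rank.get(doc, missing_rank) - k
    return total / m


if __name__ == "__main__":
    original = ["d1", "d2", "d3", "d4"]
    reformulated = ["d2", "d1", "d5", "d3"]
    print(overlap_at(original, reformulated, 2))
    print(net_perturbation(original, reformulated, 4))
```

In this toy example the top-2 sets coincide, and only d3 and d4 move down the list, so the perturbation stays small; a drifting query would push the original top documents far down and inflate p(m).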
Experiments
• TREC disk 4+5 documents
• TREC-8 topics:
  – Topic titles as initial queries for specific and drift reformulations
  – Topic descriptions as initial queries for general reformulations
• Top 5 documents retrieved by LM (lambda=0.4)
• C99 algorithm for text segmentation
• Added at most 3 specific terms for specific/drift reformulations
• Retained at most 2 terms for the general reformulations
• Generated query variants
• Judged query variants manually (by two assessors)

Results

            Manual Assessment         Result Set Measures
Type        Assessor-1   Assessor-2   O(10)   O(20)   O(50)   O(500)   p(5)
Specific    39 (78%)     26 (52%)     39.0    38.1    42.7    44.7     367.9
General     39 (78%)     43 (86%)     22.4    22.5    24.5    32.2     2208.6
Drift       34 (68%)     35 (70%)     12.0    10.2    8.6     5.9      3853.3

• Highest inter-assessor agreement for drift, since a drift in information need is not subject to personal judgements
• Lowest inter-assessor agreement for specific reformulations, since the semantic specificity of added words can depend on personal judgement
• For specific and general reformulations, the overlap percentage increases with increasing cut-off rank, which indicates that we get more “seen” documents further down the ranked list

Sample Output of Specific Reformulation

Specific reformulations   Assessor 1 agrees                             Assessor 1 disagrees
Assessor 2 agrees         behavioural genetics chromosomes DNA genome   cosmic events magnitude proton ion
Assessor 2 disagrees      N/A                                           salvaging, shipwreck, treasure found aircraft Rotterdam

• Specific reformulations involve adding new words which ought to be semantically related to the original keywords, and the degree of semantic closeness is often subject to personal judgement
• One of the assessors does not agree that adding the words “magnitude”, “proton” and “ion” makes the initial query “cosmic events” more specific

An Irish Perspective on Query Reformulation … Wildlife poaching → Elephants → Tigers → Beer ? Images from Flickr
… no, just a typo! Wildlife poaching → Elephants → Tigers → Beer → Bear Images from Flickr
Conclusions and Further Work
• Our proposed method can be used to produce query reformulations with an average accuracy of 65%, 82% and 69% for the specialization, generalization and drift reformulations, respectively
• We introduced metrics such as the average percentage overlap at a fixed number of documents and the average net perturbation to quantify the changes in the retrieval result set
• Further work: investigate the relation between relevant documents for the original query and relevant documents for the query reformulations

Any General or Specific Queries?
Specialization term scores
• tf(t,s): term frequency of term t in a segment s
• |S|/sf(t): how dominant term t is in segment s compared to other segments of the same document
• idf(t): how rare term t is in the collection
$\phi(t,s) = a \cdot tf(t,s) \cdot \frac{|S|}{sf(t)} + (1-a) \cdot \log(idf(t))$
• Add the n_s terms with top φ(t,s) scores for a more specific query

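As an illustration, the specialization score can be computed for one segmented document as follows. This is a sketch under assumptions, not the authors' code: segments are given as token lists, sf(t) is counted within that one document, idf values are supplied by the caller and must be positive, and the name `specific_terms` and the default a = 0.5 are hypothetical.

```python
import math
from collections import Counter


def specific_terms(segments, target, idf, a=0.5, n_s=3):
    """Rank the terms of segment `target` by
    phi(t, s) = a * tf(t, s) * |S| / sf(t) + (1 - a) * log(idf(t)).

    segments: list of token lists, one per segment of a single document.
    idf: dict term -> positive idf value (assumed to be precomputed).
    """
    num_segments = len(segments)
    # sf(t): number of segments of this document that contain t
    sf = Counter(t for seg in segments for t in set(seg))
    # tf(t, s): frequency of t inside the target segment only
    tf = Counter(segments[target])
    scores = {
        t: a * tf[t] * (num_segments / sf[t]) + (1 - a) * math.log(idf[t])
        for t in tf
    }
    return sorted(scores, key=lambda t: (-scores[t], t))[:n_s]


if __name__ == "__main__":
    segments = [
        ["lion", "lion", "poaching", "africa"],
        ["tiger", "tiger", "poaching", "india"],
        ["poaching", "law"],
    ]
    idf = {"lion": 100, "tiger": 100, "poaching": 10,
           "africa": 50, "india": 50, "law": 20}
    print(specific_terms(segments, 0, idf, n_s=2))
```

The term "poaching" occurs in every segment, so its |S|/sf(t) factor is 1 and it ranks below "lion" and "africa", which are confined to the first segment.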
Generalization term scores
• tf(t,d): term frequency of term t in a document d (instead of frequency in individual segments)
• sf(t)/|S|: segment frequency (instead of inverse segment frequency)
• idf(t): inverse document frequency
$\psi(t) = a \cdot tf(t,d) \cdot \frac{sf(t)}{|S|} + (1-a) \cdot \log(idf(t))$
• Select the n_g terms with top ψ(t) scores for a more general query

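The generalization score differs only in using the document-level term frequency and the segment frequency sf(t)/|S|. A sketch under the same kind of assumptions (token-list segments of one document, caller-supplied positive idf values, hypothetical function name `general_terms` and default a = 0.5):

```python
import math
from collections import Counter


def general_terms(segments, idf, a=0.5, n_g=2):
    """Rank the terms of a document by
    psi(t) = a * tf(t, d) * sf(t) / |S| + (1 - a) * log(idf(t)).

    segments: list of token lists, one per segment of a single document d.
    idf: dict term -> positive idf value (assumed to be precomputed).
    """
    num_segments = len(segments)
    # sf(t): number of segments of this document that contain t
    sf = Counter(t for seg in segments for t in set(seg))
    # tf(t, d): frequency of t over the whole document
    tf = Counter(t for seg in segments for t in seg)
    scores = {
        t: a * tf[t] * (sf[t] / num_segments) + (1 - a) * math.log(idf[t])
        for t in tf
    }
    return sorted(scores, key=lambda t: (-scores[t], t))[:n_g]


if __name__ == "__main__":
    segments = [
        ["lion", "lion", "poaching", "africa"],
        ["tiger", "tiger", "poaching", "india"],
        ["poaching", "law"],
    ]
    idf = {"lion": 20, "tiger": 20, "poaching": 10,
           "africa": 30, "india": 30, "law": 20}
    print(general_terms(segments, idf, n_g=1))
```

Here "poaching" is spread over all three segments, so its sf(t)/|S| factor is 1 and it comes out on top despite having the lowest idf in the toy data.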