
Flatlands 08 - Deirdre Lungley (dmlung@essex.ac.uk)



  1. Adaptively Modelling the Context of an Intranet Query
     Deirdre Lungley (dmlung@essex.ac.uk), University of Essex
     Flatlands ’08, June 6, 2008
     Outline: Motivation, Research Plan, FCA, Machine Learning

  2. Motivation (Starting Point, Users Like Help, Query Table, Research Focus, Intranet Search)
     [Figure: University of Essex Intranet Search]

  3. Motivation: Users do like some help!
     - Kruschwitz and Al-Bakour, 2005
     - White and Ruthven, 2006
       - known-item search: query suggestions
       - exploratory search: query destinations
     Web examples
     - Clusty (Vivisimo Ltd.)
     - CREDO (FUB, Italy)
     Intranet example
     - Aquabrowser (Medialab Solutions, The Netherlands)
     Analysis of UoE intranet search modifications
     - Dominated by single-term queries
     - Many of these queries are met by documents in the top 5 results
     But what about:
     - Multi-context terms: sport, parking, printing
     - Ambiguous terms: CES

  4. Table: Prominent Modified Queries

     Query Term(s)
      1. library
      2. accomodation
      3. exam timetable
      4. timetable
      5. courses
      6. accommodation
      7. fees
      8. moodle
      9. mba
     10. graduation

  5. Research Focus
     - All require a domain model
       - Non-trivial task
       - Relying on appropriate document annotation
     - Our answer! Automatically adapt our domain model: let it learn from implicit user feedback (clickthrough data)
     - Current uses of clickthrough data
       - Re-ranking of results
       - Query refinement

  6. Why Intranet Search?
     - Controlled environments
       - Often imposed annotation standards
       - Less spam, making inlinks and metadata more reliable
     - Relatively cohesive community of users
       - Similar search needs aid the viability of harnessing user population feedback

  7. Research Plan: Components
     - Underlying Search Engine: Lucene’s Nutch
     - Natural Language Processing
       - QTag
       - Collocations (Justeson and Katz): AN, NN, AAN, ANN, NAN, NNN, NPN
     - Context Model: Formal Concept Analysis (FCA)
     - Machine Learning: SVM-Light (Joachims)
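The part-of-speech patterns listed for collocation extraction can be applied as a simple filter over tagged text. The sketch below assumes the tokens have already been tagged (for example by QTag) and mapped to coarse classes: A for adjective, N for noun, P for preposition, O for anything else. That mapping, the function name, and the sample input are illustrative assumptions, not part of the presented system.

```python
# Candidate-collocation filter using the part-of-speech patterns from the slide
# (Justeson and Katz). Input is assumed to be pre-tagged with coarse classes:
# A = adjective, N = noun, P = preposition, O = other (hypothetical mapping).
from collections import Counter

PATTERNS = {"AN", "NN", "AAN", "ANN", "NAN", "NNN", "NPN"}


def candidate_collocations(tagged_tokens):
    """Count every 2- and 3-token window whose tag sequence matches a pattern."""
    counts = Counter()
    for size in (2, 3):
        for start in range(len(tagged_tokens) - size + 1):
            window = tagged_tokens[start:start + size]
            tag_sequence = "".join(tag for _, tag in window)
            if tag_sequence in PATTERNS:
                phrase = " ".join(word.lower() for word, _ in window)
                counts[phrase] += 1
    return counts


# Hypothetical tagged input, for illustration only.
sample = [
    ("the", "O"), ("exam", "N"), ("timetable", "N"), ("for", "P"),
    ("postgraduate", "A"), ("students", "N"),
]
collocations = candidate_collocations(sample)
```

Here `collocations` holds the phrases that pass the filter together with their counts; in practice a frequency cut-off would likely be applied afterwards.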

  8. [Figure: System Architecture. Labelled components: NL Query; Underlying Search Engine; Document Indices; Crawler / Indexer; Document Collection; NL Processor; URL : Terms; Adaptive Element; Predictions (URL : weighted terms); URL : Terms (Adapted); Logged Relevance Data; Machine Learning Module; FCA Processor; Lattice/Document Representation; Lattice Exploration]

  9. FCA (Hasse Table, Related Concept Lattice, Automade, Adapted Lattice)
     Figure: Classical Lattice Example - Hasse Table

                horse  male  female  adult  young
     horse        X
     stallion     X     X              X
     mare         X            X       X
     foal         X                           X
     filly        X            X              X
     colt         X     X                     X

  10. [Figure: Classical Lattice Example - Concept Lattice. Node labels: horse, adult, young, male, female, foal, colt, filly, stallion, mare]
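The step from the cross-table to the concept lattice can be sketched in a few lines. A formal concept is a pair (extent, intent): a set of objects together with exactly the attributes they all share. The brute-force enumeration below is only practical for tiny contexts like this one. The object/attribute assignments are an assumption based on the usual meanings of the words, and the function names are hypothetical.

```python
# Brute-force enumeration of formal concepts for the classical horse example.
# The object -> attribute assignments are assumed from the usual word meanings.
from itertools import combinations

CONTEXT = {
    "horse": {"horse"},
    "stallion": {"horse", "male", "adult"},
    "mare": {"horse", "female", "adult"},
    "foal": {"horse", "young"},
    "filly": {"horse", "female", "young"},
    "colt": {"horse", "male", "young"},
}
ATTRIBUTES = {"horse", "male", "female", "adult", "young"}


def common_attributes(objects):
    """Intent: attributes shared by every object (all attributes if none given)."""
    shared = set(ATTRIBUTES)
    for obj in objects:
        shared &= CONTEXT[obj]
    return frozenset(shared)


def objects_having(attributes):
    """Extent: objects that carry every attribute in the given set."""
    return frozenset(o for o, attrs in CONTEXT.items() if attributes <= attrs)


def formal_concepts():
    """Close every subset of objects and collect the distinct (extent, intent) pairs."""
    concepts = set()
    names = sorted(CONTEXT)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset)
            extent = objects_having(intent)
            concepts.add((extent, intent))
    return concepts


concepts = formal_concepts()
```

Ordering the resulting concepts by inclusion of their extents gives the lattice: the concept whose intent is just `horse` sits at the top, and the concept with an empty extent sits at the bottom.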

  11. [Figure: Automade Screenshot]

  12. [Figure: Example Adapted Lattice]

  13. Machine Learning (SVM-Light, Clickthrough Data, Adaptation Steps): SVM-Light
     - Machine learning tool developed by Thorsten Joachims
     - Particularly suitable for Information Retrieval: developed to surmount the problem of sparsity in document/term matrices
     - Default linear kernel
     - Lattice-based kernel to optimize the lattice structure?
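Sparsity shows up directly in how SVM-Light reads its training data: each example is one text line of the form `<target> <feature>:<value> ...`, listing only the non-zero features in increasing feature-number order. The sketch below writes one document/term vector in that format. The vocabulary, weights, and function name are hypothetical and only illustrate the encoding.

```python
# Serialise a sparse document/term vector as one SVM-Light example line.
# Vocabulary numbering and weights below are made up for illustration.

def to_svmlight_line(label, term_weights, vocabulary):
    """Format one example as '<target> <feature>:<value> ...'.

    `vocabulary` maps each term to a positive integer feature number.
    Zero weights are omitted, and pairs are sorted by feature number.
    """
    pairs = sorted(
        (vocabulary[term], weight)
        for term, weight in term_weights.items()
        if weight != 0
    )
    features = " ".join(f"{index}:{weight:g}" for index, weight in pairs)
    return f"{label:+d} {features}".rstrip()


vocabulary = {"library": 1, "opening": 2, "hours": 3, "parking": 4, "permit": 5}
line = to_svmlight_line(
    +1, {"hours": 0.5, "library": 2.0, "parking": 0.0}, vocabulary
)
```

Only two of the five vocabulary entries appear in `line`, which is the point: the size of a training file grows with the number of non-zero entries rather than with the size of the vocabulary.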

  14. Clickthrough Data
     - Questions have been raised regarding the accuracy of using clickthrough data as an indicator of relevance
     - Radlinski and Joachims promote relative relevance rather than absolute relevance
     - A document clicked on for a query is deemed more relevant to that particular query than the documents ranked above and below it
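Relative relevance can be turned into training data as pairwise preferences. The sketch below reads "above and below" as follows: a clicked document is preferred over the unclicked documents ranked above it, and over the next document below it if that one was not clicked. The slide does not spell out the exact rule, so this reading, the function name, and the sample URLs are assumptions for illustration.

```python
# Derive pairwise preferences from one result list and its clicks.
# The rule for "above and below" is an assumed interpretation, not the
# presenter's stated method.

def preference_pairs(ranked_urls, clicked_urls):
    """Return (preferred, less_preferred) pairs derived from one result list."""
    clicked = set(clicked_urls)
    pairs = []
    for position, url in enumerate(ranked_urls):
        if url not in clicked:
            continue
        # Unclicked documents ranked above the clicked one were skipped.
        for skipped in ranked_urls[:position]:
            if skipped not in clicked:
                pairs.append((url, skipped))
        # The next result below, if the user did not click it either.
        if position + 1 < len(ranked_urls):
            following = ranked_urls[position + 1]
            if following not in clicked:
                pairs.append((url, following))
    return pairs


ranking = ["/a", "/b", "/c", "/d"]  # hypothetical result list
pairs = preference_pairs(ranking, clicked_urls=["/c"])
```

Each pair states only that one document beat another for this query; no absolute relevance score is assigned to either.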

  15. Adaptation Steps
     Record Log Data
     - Log initial query term
     - Log subsequent query terms, either entered in the textbox or chosen by clicking on a lattice node
     - Log clicked URL plus subsequent browser-clicked URLs (possibly not within the result list)
     Adaptive Element. Before creation of the query lattice, apply the SVM-Light model. This should:
     - Associate query terms positively with the clicked URLs and negatively with skipped URLs (i.e., increase/decrease document/term weight)
     - Decrease the weight of document terms not in the query terms
     - If a query term does not exist in the document terms, add it with positive weight
     - Apply a threshold to delete terms within documents, and delete entire documents whose terms have all been deleted
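The intended effect of the adaptive element on the URL : Terms representation can be illustrated with a simple additive update. This is a stand-in, not the SVM-Light model: in the plan above the weight changes would come from the learned model. The `step`, `decay`, and `threshold` parameters are hypothetical, and applying the decay only to clicked documents is an assumption, since the slide does not say which documents it covers.

```python
# Additive stand-in for the adaptive element: adjusts {url: {term: weight}}
# from one logged interaction. Not SVM-Light; all parameters are hypothetical.

def adapt(doc_terms, query_terms, clicked, skipped,
          step=0.5, decay=0.1, threshold=0.05):
    """Return an adapted copy of doc_terms; the input is left unchanged."""
    query = set(query_terms)
    adapted = {url: dict(terms) for url, terms in doc_terms.items()}

    for url in clicked:
        terms = adapted.setdefault(url, {})
        for term in list(terms):
            if term not in query:
                terms[term] -= decay  # weaken terms outside the query
        for term in query:
            # Adds the term with positive weight when it was absent.
            terms[term] = terms.get(term, 0.0) + step

    for url in skipped:
        terms = adapted.get(url, {})
        for term in query.intersection(terms):
            terms[term] -= step  # negative association with skipped URLs

    # Threshold: drop weak terms, then documents with no terms left.
    pruned = {}
    for url, terms in adapted.items():
        kept = {t: w for t, w in terms.items() if w >= threshold}
        if kept:
            pruned[url] = kept
    return pruned


# Hypothetical representation and interaction, for illustration only.
doc_terms = {
    "/library": {"library": 1.0, "books": 0.12},
    "/sport": {"library": 0.5, "sport": 1.0},
    "/old": {"library": 0.3},
}
result = adapt(doc_terms, ["library", "hours"],
               clicked=["/library"], skipped=["/sport", "/old"])
```

In this run the clicked page gains the query terms, the skipped pages lose their association with `library`, and `/old` disappears because no term survives the threshold. The adapted representation is what would then be passed on to build the query lattice.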

  16. Thank You!
