personalized interactive faceted search


  1. Personalized Interactive Faceted Search • Jonathan Koren*, Yi Zhang*, and Xue Liu† • *University of California, Santa Cruz • †McGill University

  2. Outline • Introduce Faceted Search • Identify Problems with Current FS Tech • Propose a Solution • Novel Evaluation Methodology • Experiments • Conclusions

  3. Faceted Search is Everywhere

  4. Formal Definition • Interactive Structured Search Using Key-Value Metadata • Parallel Hierarchies of Documents • Point and Click Structured Query Generation

  5. Problems • Too Many Facets and Values • Existing Approach: Ad Hoc Value Presentation • Proposed Solution: Personalized and Collaborative Faceted Search for Interactive System Utility Optimization

  6. Statistical Modeling Framework • Document Model • User Relevance Model

  7. Document Model • Docs are Unique Facet-Value Pairs • Facets Come in Different Types • Facet-Type Suggests Statistical Model • Docs Modeled as a Combination of Statistical Models

  8. User Relevance Model • $\theta_u = \{\, P(\mathrm{rel} \mid u),\; P(x_k \mid \mathrm{rel}, u),\; P(x_k \mid \mathrm{non}, u) \,\}$
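
One way to read the three components of θ_u is as the parameters of a naive Bayes relevance classifier. The sketch below assumes that facet-value pairs are conditionally independent given relevance and that only the pairs present in a document contribute; the slide states neither. The function name and the toy probabilities are hypothetical.

```python
import math

def relevance_posterior(p_rel, p_x_rel, p_x_non, doc_fvps):
    """Return P(rel | doc, u) by Bayes' rule.

    Assumption (not from the slides): facet-value pairs are conditionally
    independent given relevance, and only pairs present in the document count.
    """
    log_rel = math.log(p_rel)
    log_non = math.log(1.0 - p_rel)
    for fvp in doc_fvps:
        log_rel += math.log(p_x_rel[fvp])
        log_non += math.log(p_x_non[fvp])
    # Normalize in log space for numerical stability.
    m = max(log_rel, log_non)
    num = math.exp(log_rel - m)
    return num / (num + math.exp(log_non - m))

# Hypothetical parameters for one user u.
p_rel = 0.5
p_x_rel = {"genre=Comedy": 0.8}
p_x_non = {"genre=Comedy": 0.2}

posterior = relevance_posterior(p_rel, p_x_rel, p_x_non, ["genre=Comedy"])
print(round(posterior, 3))
```

With equal priors, the posterior here is driven entirely by the likelihood ratio of the single facet-value pair.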

  9. User Collaboration • [Diagram: prior Φ linked to the per-user models θ_1, θ_2, ..., θ_{u-1}, θ_u, ..., θ_U] • Φ is the Conjugate Prior to θ_u • Φ Fills in Gaps in Individual User Models
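
The gap-filling role of a conjugate prior can be illustrated with a Beta-Bernoulli pair. The slides do not say which distributions are used, so the Beta prior, the function name, and all numbers below are illustrative assumptions rather than the authors' model.

```python
def posterior_mean(prior_alpha, prior_beta, successes, trials):
    """Posterior mean of a Bernoulli parameter under a Beta(alpha, beta) prior."""
    return (prior_alpha + successes) / (prior_alpha + prior_beta + trials)

# Hypothetical shared prior (playing the role of Phi), with mean 0.2.
phi_alpha, phi_beta = 2.0, 8.0

# A user with no observations falls back to the prior mean.
new_user = posterior_mean(phi_alpha, phi_beta, successes=0, trials=0)

# A user with many observations is dominated by their own data.
heavy_user = posterior_mean(phi_alpha, phi_beta, successes=90, trials=100)

print(new_user, round(heavy_user, 3))
```

The design point is that conjugacy makes the update a simple count addition, so sparse users inherit the population estimate while active users override it.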

  10. Interface Evaluation • User Studies are Expensive • New Complementary Approach • Expected User Interface Utility • Simulated Interaction with Pseudousers

  11. User Interface Utility • Identify Types of Actions • Assign Costs to Actions • Reward for Relevant Docs Retrieved • Calculate Utility for Entire Search Session
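
The four steps on this slide can be sketched as a small scoring function. The action names, the cost values, and the reward value are hypothetical placeholders; the slides do not give the actual numbers.

```python
# Hypothetical action types and costs (assumptions, not from the slides).
ACTION_COST = {"select_fvp": 1.0, "deselect_fvp": 1.0, "mark_doc": 1.0}

# Hypothetical reward for each relevant document retrieved.
REWARD_PER_RELEVANT_DOC = 5.0

def session_utility(actions, relevant_docs_retrieved):
    """Utility of one search session: total reward minus total action cost."""
    cost = sum(ACTION_COST[a] for a in actions)
    return REWARD_PER_RELEVANT_DOC * relevant_docs_retrieved - cost

score = session_utility(["select_fvp", "select_fvp", "mark_doc"], 1)
print(score)
```

Under this accounting, an interface that lets the user reach the same documents with fewer actions receives a higher score.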

  12. Expected User Interface Utility
      $$E[U] = \sum_{u \in \mathcal{U}} \sum_{D \in \mathcal{D}} E[U(u, D)]\, P(D \mid u)\, P(u)$$
      $$E[U(u, D)] = \sum_{t=0} \sum_{a \in A_t} R(q_{t+1}, a, q_t)\, P(q_{t+1} \mid a, q_t, u)\, P(a \mid q_t, u, D)\, P(q_t \mid q_{t-1}, u, D)$$

  13. Assumptions 1. Users Need to Satisfy a Need with a Set of Documents 2. Users Can Recognize Relevant Documents and Facet-Value Pairs 3. Users Continue to Perform Actions Until Their Need is Met

  14. Pseudousers • Stochastic Users • First-Match Users • Myopic Users • Optimal Users

  15. Stochastic Users • Picks Relevant FVP at Random • [Example list] A Nonrelevant (14 matches); B Relevant (17 matches); C Relevant (11 matches); D Nonrelevant (12 matches); E Nonrelevant (12 matches); F Relevant (15 matches); G Relevant (13 matches); H Nonrelevant (4 matches); I Relevant (13 matches); J Nonrelevant (16 matches)

  16. First-Match Users • Scans List for Relevant FVPs from Top to Bottom, Picking the First • [Example list] A Nonrelevant (14 matches); B Relevant (17 matches); C Relevant (11 matches); D Nonrelevant (12 matches); E Nonrelevant (12 matches); F Relevant (15 matches); G Relevant (13 matches); H Nonrelevant (4 matches); I Relevant (13 matches); J Nonrelevant (16 matches)

  17. Myopic Users • Picks Relevant FVP that is Contained in the Least Number of Documents • [Example list] A Nonrelevant (14 matches); B Relevant (17 matches); C Relevant (11 matches); D Nonrelevant (12 matches); E Nonrelevant (12 matches); F Relevant (15 matches); G Relevant (13 matches); H Nonrelevant (4 matches); I Relevant (13 matches); J Nonrelevant (16 matches)
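
The three simpler pseudouser policies can be sketched against the A to J example list shown on slides 15 through 17. The tuple layout and function names are hypothetical, and the sketch assumes at least one relevant facet-value pair is on screen.

```python
import random

# (label, is_relevant, matching_docs) for the example list on the slides.
FVPS = [
    ("A", False, 14), ("B", True, 17), ("C", True, 11), ("D", False, 12),
    ("E", False, 12), ("F", True, 15), ("G", True, 13), ("H", False, 4),
    ("I", True, 13), ("J", False, 16),
]

def relevant(fvps):
    """Keep only the facet-value pairs the pseudouser recognizes as relevant."""
    return [f for f in fvps if f[1]]

def stochastic_pick(fvps, rng):
    """Stochastic user: pick a relevant FVP uniformly at random."""
    return rng.choice(relevant(fvps))

def first_match_pick(fvps):
    """First-match user: scan top to bottom and take the first relevant FVP."""
    return next(f for f in fvps if f[1])

def myopic_pick(fvps):
    """Myopic user: take the relevant FVP contained in the fewest documents."""
    return min(relevant(fvps), key=lambda f: f[2])

rng = random.Random(0)
print(stochastic_pick(FVPS, rng)[0])  # one of B, C, F, G, I
print(first_match_pick(FVPS)[0])      # B
print(myopic_pick(FVPS)[0])           # C
```

Note that the myopic user ignores H even though it has the fewest matches overall, because H is not relevant.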

  18. Optimal Users • Examines the Complete Interface • Executes the Action that Maximizes the Utility • [Example list] A Nonrelevant (14 matches); B Relevant (17 matches); C Relevant (11 matches); D Nonrelevant (12 matches); E Nonrelevant (12 matches); F Relevant (15 matches); G Relevant (13 matches); H Nonrelevant (4 matches); I Relevant (13 matches); J Nonrelevant (16 matches)

  19. Evaluation Review • Each Pseudouser Logs into the Search Interface • Pseudouser Interacts with Interface to Retrieve a Set of Documents • Interface Receives a Score for the Session • Expected Utility = Average Score for All Sessions
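
The evaluation loop summarized on this slide amounts to averaging per-session scores over pseudousers. The sketch below uses a hypothetical table of precomputed session outcomes in place of a real simulated interface, and a hypothetical scoring rule (reward of 5 per relevant document, cost of 1 per action).

```python
from statistics import mean

# Hypothetical session outcomes: pseudouser -> (actions taken, relevant docs found).
SESSIONS = {"u1": (3, 1), "u2": (4, 2), "u3": (9, 2)}

def run_session(user):
    """Stand-in for a simulated session; returns the interface's score."""
    actions, relevant_docs = SESSIONS[user]
    return 5.0 * relevant_docs - 1.0 * actions

def expected_utility(pseudousers, run):
    """Estimate expected utility as the average score over all sessions."""
    return mean(run(u) for u in pseudousers)

estimate = expected_utility(SESSIONS, run_session)
print(estimate)
```

Because the pseudousers are simulated, the same population can be replayed against each candidate interface, which is what makes the comparison repeatable.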

  20. Personalization Experiments • Facet-Value Pair Suggestion: Most Frequent; Most Probable (Collaborative); Most Probable (Personalized); Mutual Information • Start Page Personalization: Empty Page; Collaborative Page; Personalized Page
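
The slides do not define the mutual-information suggestion method, but the results charts label it PMI. As a rough illustration only, the sketch below ranks facet-value pairs by pointwise mutual information with relevance, PMI(x, rel) = log(P(x | rel) / P(x)). The scoring formula, function names, and probabilities are assumptions, not the authors' stated method.

```python
import math

def pmi(p_x_given_rel, p_x):
    """Pointwise mutual information between an FVP and relevance."""
    return math.log(p_x_given_rel / p_x)

def rank_fvps(p_x_rel, p_x):
    """Order FVPs from most to least informative about relevance."""
    return sorted(p_x_rel, key=lambda x: pmi(p_x_rel[x], p_x[x]), reverse=True)

# Hypothetical probabilities for one user.
p_x_rel = {"genre=Comedy": 0.6, "country=USA": 0.8, "year=1989": 0.1}
p_x = {"genre=Comedy": 0.3, "country=USA": 0.8, "year=1989": 0.02}

ranking = rank_fvps(p_x_rel, p_x)
print(ranking)
```

In this toy data a frequent value such as country=USA scores zero because it is equally common among relevant and all documents, while a rare but discriminating value ranks first.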

  21. Document Corpora • 8000 Documents from IMDB • 19 Facets and 367k Facet-Value Pairs • 5000 Users Each from Netflix and MovieLens • 633k Ratings for Netflix • 742k Ratings for MovieLens

  22. Results (Netflix) • [Bar chart: Ave Num Actions (y-axis) by FVP Suggestion Method (x-axis: Frequency, Collab Prob, Personal Prob, PMI); series: First-Match and Myopic pseudousers, each with Null Start, Collab Start, and Personal Start]

  23. Results (MovieLens) • [Bar chart: Ave Num Actions (y-axis) by FVP Suggestion Method (x-axis: Frequency, Collab Prob, Personal Prob, PMI); series: First-Match and Myopic pseudousers, each with Null Start, Collab Start, and Personal Start]

  24. Conclusions • Many Facets and Values are a Problem • Personalized Interfaces Can Help • Proposed Statistical Modeling Framework for Faceted-Search • Proposed Inexpensive Repeatable Evaluation Technique for Faceted-Search Interfaces • Personalized Start Pages are Helpful

  25. fin

  26. Example: Two Myopic Users Search for “The ‘Burbs”
      User 302: certificate=PG, soundmix=Dolby, genre=Comedy, country=USA, colorinfo=Color, productiondesigner=SpencerJamesH
      User 1329: certificate=PG, soundmix=Dolby, genre=Comedy, language=English, year=1989, productiondesigner=SpencerJamesH
