
Intelligent Information Retrieval: some research trends



  1. Intelligent Information Retrieval: some research trends
     Gabriella Pasi
     Istituto per le Tecnologie della Costruzione, Sezione Tecnologie Informatiche Multimediali
     Consiglio Nazionale delle Ricerche
     via Ampère, 56, 20131 Milano
     e-mail: gabriella.pasi@itim.mi.cnr.it

  2. The problem of Information Access
     • Development of the WWW
     • Increasing amount of available information
     Need for systems which support fast and effective access to information.
     • Distinct nature of information needs
     • Distinct ways to provide automatic support to information access

  3. The problem of Information Access
     There are distinct ways to locate information, depending both on the way in which the information is represented and on the users’ needs:
     – Navigating via links on web sites (point and click paradigm) requires the identification of a meaningful starting point
     – Explicit specification of users’ needs (Information Retrieval Systems, Search Engines) requires an explicit query formulation
     – Recommendations as a decision support aid (Recommender Systems) require learning from “similar” preferences
     – Preference elicitation through “guided” dialogues (Decision Support Systems) requires user knowledge elicitation

  4. The problem of Information Access
     – Systems which support information access: the definition of systems which help users to access information relevant to their needs is based on the solution of a decision-making problem: how to select and rank information items which reflect the user’s preferences?
     – Notion of relevance: what the user wants is relevant information. Relevance is a subjective property of information items.
     The notion of preference is in this context related to that of relevance.

  5. Information Retrieval
     Information Retrieval (IR) aims at defining systems able to find documents which satisfy someone’s information need.
     Information can be of any kind (textual, visual, or auditory), although most current IR systems store and enable the retrieval of only textual information organized in documents.
     The problem of identifying the documents relevant to specific needs is a decision-making problem, based on the assessment of the subjective notion of relevance.
     It is a very complex task, pervaded with imprecision and uncertainty.

  6. Information Retrieval System: a basic scheme
     [Diagram components: DOCUMENTS (usually unstructured or semi-structured text), INDEXING MECHANISM, FORMAL REPRESENTATION OF DOCUMENTS; USER, QUERY FORMULATION, QUERY; MATCHING MECHANISM; ITEMS ESTIMATED RELEVANT]
     Ultimate aim of the system: to estimate the relevance of documents on the basis of a comparison of the formal representation of documents and queries.
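
The basic scheme can be illustrated with a minimal Python sketch. It is not the system described in the slides: the tf-idf weighting, the cosine similarity, the sample documents, and all function names (`tokenize`, `build_index`, `cosine`, `search`) are assumptions chosen only to make the indexing and matching steps concrete.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic keywords."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(documents):
    """Indexing mechanism: map each document to a tf-idf weighted keyword vector."""
    term_counts = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for counts in term_counts for term in counts)
    # Smoothed inverse document frequency (an assumed, common variant).
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in counts.items()} for counts in term_counts]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query, vectors, idf):
    """Matching mechanism: rank documents by estimated relevance to the query."""
    q_counts = Counter(tokenize(query))
    # Query terms that never occur in the collection are ignored.
    q_vec = {t: tf * idf[t] for t, tf in q_counts.items() if t in idf}
    scored = [(cosine(q_vec, d_vec), i) for i, d_vec in enumerate(vectors)]
    # Keep only documents with a positive score, best first.
    return sorted(((s, i) for s, i in scored if s > 0.0), reverse=True)


# Hypothetical three-document collection.
documents = [
    "fuzzy models for information retrieval",
    "recommender systems learn user preferences",
    "document clustering and text categorization",
]
vectors, idf = build_index(documents)
ranking = search("fuzzy information retrieval", vectors, idf)
for score, doc_id in ranking:
    print(f"{score:.3f}  {documents[doc_id]}")
```

Here the score plays the role of the estimated relevance: it is computed only from the formal representations of the query and the documents, so it can differ from the relevance the user would actually assign.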

  7. Techniques that improve the basic scheme of an IRS
     Some techniques which improve the retrieval capabilities are:
     • Relevance feedback
     • Text categorization
     • Use of thesauri
     • Document clustering
     • Cross-lingual Information Retrieval

  8. Information Retrieval: main issues
     • Text (or other media) formal representation: the text representation is usually based on keyword extraction and weighting
       – how to improve document representations?
     • Queries: usually based on selection criteria specified by terms
       – how to define query languages that better express users’ needs?
     • The matching mechanism: it compares the document and query representations
       – what is a “good” model of retrieval? How to account for imprecision and uncertainty?
     • Produced results: ranked lists of documents, with degrees of relevance or probability of relevance

  9. Information Retrieval Systems
     The relevance estimate strongly depends on the adopted IR model. How to improve the relevance estimate?
     Definition of “intelligent retrieval systems” by better interpreting and learning users’ preferences.
     Flexible systems vs. intelligent systems:
     • tolerance to uncertainty and imprecision (intrinsic in subjective evaluations)
     • learning capabilities
     Application of soft computing techniques:
     • to simplify the user-system interaction (tolerance to an approximate expression of users’ needs)
     • to improve the formal representation of the documents’ content
     • to learn the user’s notion of relevance
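
One way to picture tolerance to imprecision is a fuzzy-set retrieval model, where a document satisfies a query to a degree in [0, 1] instead of simply matching or not. The sketch below is an illustration under assumptions, not the model proposed in the slides: the min/max/complement operators, the tuple-based query syntax, the `evaluate` function, and the index weights are all hypothetical.

```python
def evaluate(query, doc_weights):
    """Return the degree in [0, 1] to which a document satisfies a query.

    A query is either a term (str) or a tuple: ("and", q1, q2, ...),
    ("or", q1, q2, ...), or ("not", q).
    """
    if isinstance(query, str):
        # Index term weight; 0.0 means the term does not describe the document.
        return doc_weights.get(query, 0.0)
    op, *args = query
    degrees = [evaluate(arg, doc_weights) for arg in args]
    if op == "and":
        return min(degrees)  # fuzzy intersection (assumed operator)
    if op == "or":
        return max(degrees)  # fuzzy union (assumed operator)
    if op == "not":
        return 1.0 - degrees[0]  # fuzzy complement (assumed operator)
    raise ValueError(f"unknown operator: {op}")


# Hypothetical index: each document is a fuzzy set of weighted index terms.
index = {
    "d1": {"fuzzy": 0.9, "retrieval": 0.7},
    "d2": {"fuzzy": 0.2, "retrieval": 0.8, "clustering": 0.6},
    "d3": {"clustering": 0.9},
}
query = ("and", "retrieval", ("or", "fuzzy", "clustering"))
ranked = sorted(
    ((evaluate(query, weights), doc_id) for doc_id, weights in index.items()),
    reverse=True,
)
for degree, doc_id in ranked:
    print(doc_id, degree)
```

Unlike a crisp Boolean query, which would only split the collection into matching and non-matching documents, the graded evaluation yields a ranked list.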

  10. “Intelligent” IR: some research directions
     • IR models that manage uncertainty and vagueness
       They model the uncertainty and/or imprecision intrinsic in the retrieval activity
     • Relevance feedback
       To learn users’ preferences by refinement of queries
     • Automated text categorization
     • Vocabulary expansion and intelligent user interfaces
     • Personalized indexing
       To improve the formal representation of documents
     • Flexible query languages
       To improve the expression of users’ needs
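
Query refinement through relevance feedback can be sketched with a Rocchio-style update, one classical technique. The slides do not say which feedback method is intended, so the choice of Rocchio, the parameter values `alpha`, `beta`, and `gamma` (conventional defaults), the function names, and the sample vectors are all assumptions.

```python
from collections import defaultdict


def centroid(vectors):
    """Average a list of sparse term-weight vectors (dicts)."""
    if not vectors:
        return {}
    total = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward relevant documents and away from non-relevant ones."""
    pos = centroid(relevant)
    neg = centroid(non_relevant)
    updated = {}
    for term in set(query) | set(pos) | set(neg):
        weight = (alpha * query.get(term, 0.0)
                  + beta * pos.get(term, 0.0)
                  - gamma * neg.get(term, 0.0))
        if weight > 0.0:  # terms with non-positive weight are dropped
            updated[term] = weight
    return updated


# Hypothetical feedback round: the user judged two documents relevant, one not.
query = {"retrieval": 1.0}
relevant = [{"retrieval": 0.8, "fuzzy": 0.6}, {"retrieval": 0.4, "fuzzy": 0.2}]
non_relevant = [{"retrieval": 0.2, "database": 1.0}]
new_query = rocchio(query, relevant, non_relevant)
print(sorted(new_query.items()))
```

The refined query gains the term "fuzzy", which the user never typed but which occurs in the documents judged relevant; this is the sense in which the system learns the user's preferences from feedback.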
