
Information Needs (IR, session 2), CS6200: Information Retrieval



  1. Information Needs
     • IR, session 2
     • CS6200: Information Retrieval
     • Slides by: Jesse Anderton

  2. Information Retrieval
     • Information Retrieval is the field of Computer Science concerned with finding the information from a collection that is relevant to a user’s information need, as expressed by a query.
     • This general task has many possible concrete formalizations, which define collection, information need, relevance, and query more precisely.
     • Let’s look at several examples to get a sense of the field’s scope.
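
The four terms in the definition above can be made concrete with a small sketch. This is only one possible formalization, not the one used in the course: the class name `RetrievalTask`, its fields, and the toy data are all assumptions made for illustration. Note that the information need itself never appears in the code; only the query that expresses it does.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar

Doc = TypeVar("Doc")      # whatever the collection holds (pages, fares, files, ...)
Query = TypeVar("Query")  # however the user expresses the information need


@dataclass
class RetrievalTask(Generic[Doc, Query]):
    """Hypothetical container for one concrete formalization of the IR task."""

    collection: List[Doc]
    # Relevance is task-specific, so it is passed in as a function.
    is_relevant: Callable[[Doc, Query], bool]

    def search(self, query: Query) -> List[Doc]:
        # Return every item of the collection judged relevant to the query.
        return [doc for doc in self.collection if self.is_relevant(doc, query)]


# Toy instance (made-up data): documents are strings, the query is one keyword,
# and a document counts as relevant if it contains that keyword as a word.
task = RetrievalTask(
    collection=["iceland flight deals", "music news", "flight safety"],
    is_relevant=lambda doc, query: query in doc.split(),
)
results = task.search("flight")
print(results)
```

Swapping in a different `collection` and `is_relevant` function gives a different search task, which is the sense in which the examples on the following slides are formalizations of the same general problem.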

  3. Ad-Hoc Search
     • Currently, the most important search task is ad-hoc search on the Internet.
     • The collection is the set of web pages indexed by the search engine.
     • The information need is the web content the user is looking for.
     • The query is an ordered list of keywords.
     • A document is relevant if it contains text on the same topic as the query.
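
As a rough illustration of the ad-hoc setting, the sketch below ranks pages by how many query keywords they contain. Keyword overlap is only a crude stand-in for "text on the same topic as the query", and it ignores keyword order; real search engines use far richer signals. The function name `rank_by_keyword_overlap`, the URLs, and the page texts are made up for this example.

```python
from typing import Dict, List, Tuple


def rank_by_keyword_overlap(
    collection: Dict[str, str], query: List[str]
) -> List[Tuple[str, int]]:
    """Rank pages by the number of distinct query keywords they contain."""
    query_terms = {term.lower() for term in query}
    scored = []
    for url, text in collection.items():
        doc_terms = set(text.lower().split())
        score = len(query_terms & doc_terms)
        if score > 0:  # pages sharing no keyword are treated as non-relevant
            scored.append((url, score))
    # Highest score first; ties are broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Made-up collection: URL -> page text.
pages = {
    "a.example/iceland": "cheap flights to iceland this winter",
    "b.example/music": "new music releases this week",
    "c.example/travel": "iceland travel guide",
}
ranking = rank_by_keyword_overlap(pages, ["flights", "Iceland"])
print(ranking)
```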

  4. Vertical Search
     • Vertical Search focuses on information from a particular domain: flights, music, news, sports, etc.
     • The collection might be the set of all airline fares, research papers, or blog posts.
     • An information need can be very specific: “the cost of a flight to Iceland tomorrow”.
     • The query may be structured using a web form, providing specific property values to search for.
     • Document relevance is sometimes less ambiguous: matching the search fields.
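
A structured query of this kind can be sketched as a set of field/value pairs, with relevance defined as matching every field. The record layout, field names, airport codes, dates, and prices below are invented for illustration and do not come from any real fare database.

```python
from typing import Any, Dict, List


def match_fields(
    collection: List[Dict[str, Any]], query: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Return the records whose fields equal every value given in the query."""
    return [
        record
        for record in collection
        if all(record.get(field) == value for field, value in query.items())
    ]


# Made-up fare records standing in for the collection.
fares = [
    {"origin": "BOS", "destination": "KEF", "date": "2024-05-02", "price_usd": 420},
    {"origin": "BOS", "destination": "KEF", "date": "2024-05-03", "price_usd": 380},
    {"origin": "BOS", "destination": "LHR", "date": "2024-05-02", "price_usd": 510},
]

# What a web form might submit for "a flight to Iceland on 2 May".
form_query = {"destination": "KEF", "date": "2024-05-02"}
matches = match_fields(fares, form_query)
print(matches)
```

Because relevance here is an exact match on the submitted fields, there is no ranking step: a record either satisfies the form or it does not.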

  5. Enterprise Search
     • Enterprise Search is vertical search run against a company’s internal content.
     • The collection is the set of documents, e-mails, forum threads, wiki pages, etc. in the company’s internal network.
     • Information needs, queries, and relevance are typically defined as for ad-hoc search.

  6. Desktop Search
     • Desktop Search focuses on searching the contents of your computer.
     • The collection is the set of files (and contacts, messages, events, etc.) stored on your computer.
     • An information need is generally either a file or information stored in a file.
     • A query can be a list of keywords, as in ad-hoc search, or a list of property values in a custom query language.
     • Relevance is defined as in ad-hoc search.

  7. Peer-to-Peer Search
     • Peer-to-Peer Search focuses on finding content shared on peer-to-peer networks.
     • The collection is the set of all files currently shared by any peer on the network.
     • An information need is a particular file, e.g. a music video.
     • A query is often a keyword list, but may use an extended query language.
     • A document is relevant only if it’s the file the user wanted.

  8. Question Answering
     • Question Answering tries to answer questions posed as normal dialog.
     • Information needs are usually restricted to concisely stated answers.
     • Queries are posed as a single sentence of natural language text.
     • A response is relevant if it answers the question correctly, and if it is expressed clearly (e.g. fluently).

  9. Wrapping Up
     • Information Retrieval is the field of Computer Science concerned with finding the information from a collection that is relevant to a user’s information need, as expressed by a query.
     • A query is an expression of an information need, and not the need itself.
     • Next, we’ll take a look at some of the distinct types of information needs users have.
