Language and Computers (Ling 384)
Topic 2: Searching

Adriane Boyd ∗
Department of Linguistics, OSU
Autumn 2005

∗ The course was created by Markus Dickinson, Detmar Meurers and Chris Brew.

Outline

Introduction
Searching in a Library Catalog
Searching the web
Advanced searches with regular expressions

Searching

◮ A breathtaking number of information resources are available: books, databases, the web, newspapers, ...
◮ To locate relevant information, we need to be able to search these resources, which often are written texts:
    ◮ Searching in a library catalog (e.g., using OSCAR)
    ◮ Searching the web (e.g., using Google)
    ◮ Advanced searching in text corpora (using regular expressions) (e.g., using Opus)

Searching in speech

◮ One might also want to search for speech, e.g., to find a particular sentence spoken in an interview one only has a recording (audio file) of.
◮ With current technology, this is only possible if the interview is transcribed, using the IPA or another writing system.
◮ It is, however, already possible to
    ◮ detect the language of a spoken conversation, e.g., when listening in to a telephone conversation
    ◮ detect a new topic being started in a conversation
◮ In the following, we focus on searching in text.

Searching in a library catalog

◮ To find articles, books, and other library holdings, a library generally provides a database containing information on its holdings.
◮ OSCAR is the database frontend providing access to the library database at OSU.
◮ OSCAR makes it possible to search for the occurrence of literal strings occurring in the author, title, call number, etc. associated with an item held by the library.

Basic searching in OSCAR

◮ Literal strings are composed of characters which naturally must be in the same character encoding system (e.g. ASCII, ISO8859-1, UTF-8) as the strings encoded in the database.
◮ For literal strings, OSCAR does not distinguish between upper and lower-case letters (i.e. they aren’t so literal after all ;-)
◮ Adjacent words are searched as a phrase.
    ◮ art therapy
    ◮ vitamin c

Keyword searching in OSCAR

◮ In addition to querying literal strings, the keyword search query language of OSCAR also supports the use of
    ◮ special characters to abbreviate multiple options
    ◮ special operators for combining two query strings (boolean operators) or modifying the meaning of a single string (unary operators)

OSCAR: Special characters

◮ Use * for 1–5 characters at end or within a word.
    ◮ art* finds arts, artists, artistic
    ◮ gentle*n
◮ Use ** for any number of characters at end of word.
    ◮ art** finds artificial, artillery
◮ Use ? for a single character at end or within a word.
    ◮ gentlem?n
◮ The special * and ? characters must have at least 2 characters to their left. (→ for efficiency reasons)

OSCAR: Literal Strings and Operators (I)

◮ Use "and" or "or" to specify multiple words in any field, any order.
    ◮ art and therapy
    ◮ art or therapy
    ◮ c+ or c++
◮ Use "and not" to exclude words.
    ◮ art and not therapy
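
The OSCAR special characters (*, ** and ?) can be read as shorthand for simple regular expressions. The following Python sketch illustrates that reading; it is not how OSCAR itself is implemented. The function names `oscar_to_regex` and `matches` are hypothetical, and the sketch assumes that "character" means a letter, digit or underscore (`\w`), that ** may also match zero characters, and that the "2 characters to the left" rule applies to the first special character in the pattern.

```python
import re


def oscar_to_regex(pattern):
    """Translate an OSCAR-style wildcard pattern into a compiled regex.

    Assumed reading of the slide (not OSCAR's actual implementation):
      *   stands for 1 to 5 characters
      **  stands for any number of characters (here: zero or more)
      ?   stands for exactly one character
    """
    # Enforce the rule that * and ? need at least 2 characters to their left.
    first = min((pattern.find(c) for c in "*?" if c in pattern), default=None)
    if first is not None and first < 2:
        raise ValueError("special characters need at least 2 characters to their left")

    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(r"\w*")       # any number of characters
            i += 2
        elif pattern[i] == "*":
            parts.append(r"\w{1,5}")   # 1 to 5 characters
            i += 1
        elif pattern[i] == "?":
            parts.append(r"\w")        # exactly one character
            i += 1
        else:
            parts.append(re.escape(pattern[i]))  # literal character
            i += 1
    # OSCAR does not distinguish upper and lower case, so ignore case here too.
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, word):
    """Return True if the whole word matches the OSCAR-style pattern."""
    return oscar_to_regex(pattern).fullmatch(word) is not None


if __name__ == "__main__":
    for pat, word in [("art*", "artists"), ("art*", "artificial"),
                      ("art**", "artificial"), ("gentlem?n", "gentlemen")]:
        print(pat, word, matches(pat, word))
```

Under these assumptions, `art*` matches arts, artists and artistic but not artificial, because "ificial" is longer than 5 characters, while `art**` matches artificial and artillery.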