CS490W
Web Search (I)
Luo Si
Department of Computer Science
Purdue University
Slides from Manning, C., Raghavan, P. and Schütze, H.

Without search engines the web wouldn’t scale
No incentive in creating content unless it can be easily found
  – Other finding methods haven’t kept pace (taxonomies, bookmarks, etc.)
The web is both a technology artifact and a social environment
  – “The Web has become the “new normal” in the American way of life; those who don’t go online constitute an ever-shrinking minority.” [Pew Foundation report, January 2005]
Search engines make aggregation of interest possible:
  – Create incentives for very specialized niche players
    • Economical: specialized stores, providers, etc.
    • Social: narrow interests, specialized communities, etc.

Without search engines the web wouldn’t scale
The acceptance of search interaction makes “unlimited selection” stores possible:
  – Amazon, Netflix, etc.
Search turned out to be the best mechanism for advertising on the web, a $15+ B industry.
  – Growing very fast, but the entire US advertising industry is $250B: huge room to grow
  – Sponsored search marketing is about $10B

Usage of Web Search
[Figure: usage of web search]

Search engines market share
[Figure: search engine market share] (iProspect Survey, 4/04, http://www.iprospect.com/premiumPDFs/iProspectSurveyComplete.pdf)

Classical IR vs. Web IR
Basic assumptions of Classical Information Retrieval
Corpus: Fixed document collection
Goal: Retrieve documents with information content that is relevant to user’s information need

Classic IR Goal
Classic relevance
  – For each query Q and stored document D in a given corpus, assume there exists relevance Score(Q, D)
    • Score is average over users U and contexts C
  – Optimize Score(Q, D) as opposed to Score(Q, D, U, C)
  – That is, usually:
    • Context ignored
    • Individuals ignored
    • Corpus predetermined
Bad assumptions in the web context

Web IR

The coarse-level dynamics
[Diagram: Content creators, Content aggregators, and Content consumers, connected by flows labeled Subscription, Editorial, Feeds, Crawls, Transaction, and Advertisement]

Brief (non-technical) history
Early keyword-based engines
  – Altavista, Excite, Infoseek, Inktomi, ca. 1995-1997
Paid placement ranking: Goto.com (morphed into Overture.com → Yahoo!)
  – Your search ranking depended on how much you paid
  – Auction for keywords: casino was expensive!

Brief (non-technical) history
1998+: Link-based ranking pioneered by Google
  – Blew away all early engines
  – Great user experience in search of a business model
  – Meanwhile Goto/Overture’s annual revenues were nearing $1 billion
Result: Google added paid-placement “ads” to the side, independent of search results
  – Yahoo follows suit, acquiring Overture (for paid placement) and Inktomi (for search)
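
Returning to the Classic IR Goal slide: the assumption that a single Score(Q, D) exists can be read as averaging a finer-grained Score(Q, D, U, C) over users and contexts. The sketch below is only an illustration of that reading. The judgment values, user names, and context names are all made up, and the function names are hypothetical, not from the slides.

```python
from statistics import mean

# Hypothetical relevance judgments Score(Q, D, U, C) for ONE fixed query Q
# and ONE fixed document D, keyed by (user, context). Values are invented.
judgments = {
    ("alice", "home"): 1.0,
    ("alice", "work"): 0.0,
    ("bob", "home"): 0.5,
    ("bob", "work"): 0.5,
}


def classic_score(judgments):
    """Score(Q, D): average of Score(Q, D, U, C) over all users and contexts."""
    return mean(judgments.values())


def personalized_score(judgments, user, context):
    """Score(Q, D, U, C): the judgment for one specific user and context."""
    return judgments[(user, context)]


avg = classic_score(judgments)
best_case = personalized_score(judgments, "alice", "home")
worst_case = personalized_score(judgments, "alice", "work")

print(avg, best_case, worst_case)
```

In this toy data the averaged score is the same middling value for everyone, while the same user sees the document as fully relevant in one context and irrelevant in another. That gap is what the slide means by context and individuals being ignored.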
Web search basics
[Diagram: components labeled User, Search, Web spider, Indexer, The Web, Indexes, and Ad indexes. The accompanying screenshot shows a results page for the query “miele” (results 1 - 10 of about 7,310,000, 0.12 seconds), with Sponsored Links (ads) shown next to the algorithmic results.]

Ads vs. search results
Google has maintained that ads (based on vendors bidding for keywords) do not affect vendors’ rankings in search results
[Screenshot: Search = miele, with Sponsored Links shown next to the Web Results]

User Needs
Need [Brod02, RL04]
  – Informational: want to learn about something (~40% / 65%), e.g. P53 Cancer
  – Navigational: want to go to that page (~25% / 15%), e.g. United Airlines
  – Transactional: want to do something (web-mediated) (~35% / 20%)
    • Access a service, e.g. Seattle weather
    • Downloads, e.g. Mars surface images
    • Shop, e.g. Canon S410
  – Gray areas
    • Find a good hub, e.g. Car rental Brasil
    • Exploratory search “see what’s there”

Ads vs. search results
Other vendors (Yahoo, MSN) have made similar statements from time to time
  – Any of them can change anytime
We will focus primarily on search results independent of paid placement ads
  – Although the latter is a fascinating technical subject in itself

Web search users
Make ill defined queries
  – Short
    • AV 2001: 2.54 terms avg, 80% < 3 words
    • AV 1998: 2.35 terms avg, 88% < 3 words [Silv98]
  – Imprecise terms
  – Sub-optimal syntax (most queries without operator)
  – Low effort
Wide variance in
  – Needs
  – Expectations
  – Knowledge
  – Bandwidth
Specific behavior
  – 85% look over one result screen only
  – 78% of queries are not modified (one query/session)
  – Follow links: “the scent of information” ...
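
The query-length figures on the Web search users slide (average terms per query, share of queries under three words) are simple statistics over a query log. A minimal sketch of how they could be computed is below. The five-query log is a toy example assembled from query strings that appear on these slides, not real log data, and the function name is hypothetical.

```python
def query_length_stats(queries):
    """Return (average terms per query, share of queries with < 3 terms).

    Terms are split on whitespace, which is an assumption; a real log
    analysis would need a proper tokenizer.
    """
    lengths = [len(q.split()) for q in queries]
    avg_terms = sum(lengths) / len(lengths)
    share_short = sum(1 for n in lengths if n < 3) / len(lengths)
    return avg_terms, share_short


# Toy log (illustrative only).
toy_log = [
    "miele",
    "united airlines",
    "seattle weather",
    "mars surface images",
    "canon s410",
]

avg_terms, share_short = query_length_stats(toy_log)
print(f"{avg_terms:.2f} terms avg, {share_short:.0%} < 3 words")
```

On this toy log the script reports 2.00 terms on average with 80% of queries under three words, the same kind of summary as the AV 1998 and AV 2001 lines.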
Query Distribution
Power law: few popular broad queries, many rare specific queries
[Figure: query frequency distribution]

How far do people look for results?
[Figure] (Source: iprospect.com WhitePaper_2006_SearchEngineUserBehavior.pdf)

Users’ empirical evaluation of results
Quality of pages varies widely
  – Relevance is not enough
  – Other desirable qualities (non IR!!)
    • Content: trustworthy, new info, non-duplicates, well maintained
    • Web readability: display correctly & fast
    • No annoyances: pop-ups, etc.
Precision vs. recall
  – On the web, recall seldom matters

Users’ empirical evaluation of engines
Relevance and validity of results
UI: simple, no clutter, error tolerant
Trust: results are objective
Coverage of topics for polysemic queries
Pre/Post process tools provided
  – Mitigate user errors (auto spell check, syntax errors, …)
  – Explicit: search within results, more like this, refine ...
  – Anticipative: related searches

Loyalty to a given search engine
[Figure] (iProspect Survey, 4/04)

Example*
[Diagram: TASK → Info Need → Verbal form → Query → SEARCH ENGINE (over Corpus) → Results, with a Query Refinement loop. Failure points labeled Mis-conception, Mis-translation, Mis-formulation, Polysemy, and Synonymy.]
* To Google or to GOTO, Business Week Online, September 28, 2001
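
The “Precision vs. recall” point above can be made concrete with a small calculation. The sketch below uses the standard definitions of precision and recall, cut off at the top k results. The document IDs and the relevant set are invented for illustration, and the function names are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


# Invented example: a ranked result list and a set of 8 relevant documents.
ranked = ["d1", "d7", "d3", "d9", "d2"]
relevant = {"d1", "d3", "d4", "d5", "d6", "d8", "d10", "d11"}

p = precision_at_k(ranked, relevant, 3)  # d1 and d3 are relevant: 2/3
r = recall_at_k(ranked, relevant, 3)     # 2 of 8 relevant found: 0.25

print(f"precision@3 = {p:.2f}, recall@3 = {r:.2f}")
```

A user who only looks at the first result screen experiences the precision figure. The low recall goes unnoticed as long as the few results shown are good, which is one way to read the slide’s claim that recall seldom matters on the web.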