
Voice Search on Mobile Devices, Geoffrey Zweig



  1. Microsoft Research ---- Lang Tech 2008 VOICE SEARCH ON MOBILE DEVICES Geoffrey Zweig

  2. Outline
     - What is Mobile Voice Search?
     - An example: Live Search for Windows Mobile
     - Why is it important?
     - The Competitive Landscape
     - Basic Technology
     - Advancing the State-of-the-Art
     - Next-generation Applications

  3. What is Mobile Voice Search?
     - Getting information when you are on the go
     - Business information
       - Phone numbers
       - Addresses
       - Ratings
       - Hours
       - Maps & Directions
     - Entertainment
       - Movie showtimes
       - Restaurant recommendations

  4. Live Search for Windows Mobile

  5. Asking for Seattle

  6. Confirming the Location

  7. Now we’re in Seattle

  8. Asking for Vietnamese Restaurants

  9. Finding a Vietnamese Restaurant

  10. The Details

  11. Let’s Get Directions

  12. Starting from 8350 159th PL NE, Redmond, WA

  13. Specifying a Starting Point

  14. And Now we can Go!

  15. You can even check the traffic

  16. What People Ask For, By Type (chart of request types: Business, City-State-Zip, Address, Compound)

  17. Frequent Requests

      | Businesses | Cities |
      |---|---|
      | Pizza (1.5%) | Dallas TX (0.80%) |
      | Best Buy | Seattle WA |
      | Starbucks | Chicago IL |
      | Movies | Redmond WA |
      | McDonald’s | Los Angeles CA |
      | Wal-Mart | Orlando FL |
      | Mexican Restaurant | Miami FL |
      | Pizza Hut | Bellevue WA |
      | Target | San Diego CA |
      | Restaurants (0.73%) | New York, NY (0.47%) |
      | Perplexity = 8514 | Perplexity = 4741 |
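The slide reports a perplexity for each request type but does not say how it was computed. A minimal sketch, assuming the figure is the perplexity of the empirical distribution over whole requests (the exponential of its entropy); the function name and the toy counts are hypothetical and not from the talk:

```python
import math
from collections import Counter

def perplexity(counts):
    """Perplexity of the empirical distribution over whole requests.

    Assumption: perplexity = exp(entropy), with entropy measured in nats.
    """
    total = sum(counts.values())
    entropy = -sum(
        (c / total) * math.log(c / total) for c in counts.values() if c > 0
    )
    return math.exp(entropy)

# Hypothetical toy request log (not data from the talk).
requests = Counter({"pizza": 4, "best buy": 2, "starbucks": 1, "movies": 1})
uniform = Counter({"a": 1, "b": 1, "c": 1, "d": 1})

print(perplexity(requests))  # skewed distribution: below the number of distinct requests
print(perplexity(uniform))   # uniform distribution: equals the number of distinct requests
```

Under this reading, a perplexity in the thousands means that even though a few requests such as "Pizza" are frequent, the distribution has a very long tail.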

  18. Outline
      - What is Mobile Voice Search?
      - An example: Live Search for Windows Mobile
      - Why is it important?
      - The Competitive Landscape
      - Basic Technology
      - Advancing the State-of-the-Art
      - Next-generation Applications

  19. Skyrocketing Cellphone Use (chart, 1990 to 2010, per 1000 people: PCs in use, Internet users, cellphone users; source: Computer Industry Almanac)

  20. It’s a Global Market
      - Number of cellphones: ~2.2B in 2005
      - Chart by country or region: EU, China, US, Russia, Japan, Brazil, India, UK, Pakistan, Mexico, Indonesia, Turkey, Rest of World (source: Infoplease.com)

  21. Potentially Big Revenues: Will mobile search be like internet search?

  22. Monetization
      - Free 411 services create modest revenue streams
      - But multimodal has advantages:
        - You are looking at a screen
        - You can be sent an SMS, and that sticks around
        - Voice provides demographic clues not present in web search: gender, race, age, education
      - Many possibilities
        - Standard search-specific advertising
          - You say “Zales Jewelers”, system suggests “Tiffany’s”
        - Demographically targeted ads
          - Men get different results from women
        - Batched ads sent to the email account provided at registration

  23. Outline
      - What is Mobile Voice Search?
      - An example: Live Search for Windows Mobile
      - Why is it important?
      - The Competitive Landscape
      - Basic Technology
      - State-of-the-Art
      - Next-generation Applications

  24. Competitive Landscape: Basic Search
      - Live Search for Windows Mobile
        - http://wls.live.com from your phone
        - Businesses, directions, maps, traffic, movies, gas
        - Windows Mobile phones
      - Tellme by Mobile
        - http://www.tellme.com/products/TellmeByMobile
        - Businesses, directions, maps
        - Java phones
      - V-enable
        - http://www.v-enable.com/directory_assistance.html
        - Businesses, directions, maps, weather
        - Demo only, not currently available

  25. Competitive Landscape: Beyond Search
      - Vlingo
        - http://vlingo.com/
        - Businesses, directions, maps, music downloads
        - SMS by voice
        - Java phones
      - Nuance Voice Control
        - http://www.nuance.com/voicecontrol/
        - Businesses, directions, maps, weather, stocks, sports, movies, web search
        - Send emails, update calendar, go to web pages
        - Blackberry, Treo, Windows Mobile phones

  26. Outline
      - What is Mobile Voice Search?
      - An example: Live Search for Windows Mobile
      - Why is it important? (Trends in cellphone use)
      - The Competitive Landscape
      - Basic Technology
      - Advancing the State-of-the-Art
      - Next-generation Applications

  27. Client-Server Architecture

  28. Typical Grammar Setup

      | Request type | Grammar type | Models shown |
      |---|---|---|
      | Business | n-gram LMs | National; Local 1, Local 2, through Local 600 |
      | Address | n-gram LMs | National; Local 1, Local 2, through Local 600 |
      | City-State-Zip | Enumerated grammar | |

  29. Sample Performance Levels

      | | 1-best | N-best | N-best depth | Inter-annotator agreement |
      |---|---|---|---|---|
      | Overall | 42% | 47 | 3.6 | 67% |

  30. Outline
      - What is Mobile Voice Search?
      - An example: Live Search for Windows Mobile
      - Why is it important? (Trends in cellphone use)
      - The Competitive Landscape
      - Basic Technology
      - Advancing the State-of-the-Art
      - Next-generation Applications

  31. Click-Driven Automated Feedback (diagram components: Language Model, Acoustic Model, Stupid Detector)

  32. Automated Feedback Methods
      - Data addition
        - What people click on & associated audio
        - Text searches from web
      - Discriminative LM training
        - Adjust LM to maximize posterior probability of correct words
        - Need to know competitors, taken from n-best lists
      - Translation-based data generalization
      - Maximum likelihood database cleaning
        - Learn error model of the mistakes people make when entering data
        - Recover the likeliest intended entries
      - Adaptive N-best postprocessing
        - Remove what history shows is obviously stupid
        - Reorder and augment the rest based on further analysis
      - Personalization
        - Per-person / user-profile grammars
        - Per-person speaker-adaptive transforms

  33. Sample Click Data: entries that frequently co-occur

      | Clicked | Competitor |
      |---|---|
      | McDonald’s | Mc Donald |
      | Coffee | Coffey |
      | Mexican Restaurant | Mexican Restrant |
      | Coffee | Copy |
      | Mexican Food | Mexican Foods |
      | Starbucks | Star Box |
      | Starbucks | Starbuck’s |
      | Sex | 6 |
      | Burger King | 13 |
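Click data of this kind is what the "Adaptive N-best postprocessing" item on the earlier slide relies on. The talk does not describe the algorithm, so the following is only a sketch under assumptions: entries that were shown often but almost never clicked are dropped, and the rest are reordered by historical click rate. The function names, thresholds, and toy log are all hypothetical.

```python
from collections import defaultdict

def build_click_stats(log):
    """log: iterable of (nbest_list, clicked_item) pairs.

    Returns two dicts: how often each entry was shown and how often it was clicked.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for nbest, click in log:
        for entry in nbest:
            shown[entry] += 1
        clicked[click] += 1
    return shown, clicked

def postprocess(nbest, shown, clicked, min_shown=3, min_rate=0.05):
    """Drop entries with enough history and a near-zero click rate,
    then stable-sort the survivors by click rate (highest first).

    min_shown and min_rate are illustrative thresholds, not values from the talk.
    """
    def rate(entry):
        return clicked[entry] / shown[entry] if shown[entry] else 0.0

    kept = [e for e in nbest if shown[e] < min_shown or rate(e) >= min_rate]
    # sorted() is stable, so ties keep the recognizer's original order.
    return sorted(kept, key=rate, reverse=True)

# Hypothetical toy log built from pairs like those on the slide.
log = [(["Star Box", "Starbucks"], "Starbucks")] * 5 \
    + [(["Coffey", "Coffee"], "Coffee")] * 4
shown, clicked = build_click_stats(log)

result = postprocess(["Star Box", "Starbucks", "Star Bar"], shown, clicked)
print(result)
```

In this toy run "Star Box" is removed because history shows it was never clicked, while the unseen "Star Bar" is kept because there is not yet enough evidence against it.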

  34. Discriminative LM Training (Xiao Li)
      - Idea
        - Increase n-gram probabilities of the true hypothesis
        - Decrease n-gram probabilities of confusable competitors
        - The LM is estimated to maximize p(W|O)
      - Leveraging click data
        - View clicked item as “truth”
        - View n-best alternatives as “competitors”
      - N-best alternatives (example)
        1. Maine Home
        2. Maine School
        3. Maine Car
        4. Maine
        5. Maine Heart
        6. Maine Mall
        7. Maine Homes
        8. Mayo
        9. Maine Golf
        10. Maine Home Care
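The slide states the idea (raise n-gram scores of the clicked hypothesis, lower those of confusable competitors) but not the estimation procedure. As an illustration only, the sketch below swaps in a structured-perceptron update over bigram features, which is a simpler stand-in and not the method used to maximize p(W|O) in the talk. The baseline scores, function names, and learning rate are hypothetical.

```python
from collections import Counter

def bigrams(text):
    """Bigram counts of a hypothesis, with sentence boundary markers."""
    words = ["<s>"] + text.lower().split() + ["</s>"]
    return Counter(zip(words, words[1:]))

def score(text, weights, base_score=0.0):
    """Baseline recognizer score plus learned bigram adjustments."""
    return base_score + sum(weights[g] * n for g, n in bigrams(text).items())

def train(click_log, epochs=5, lr=1.0):
    """Perceptron-style training on (nbest, clicked) pairs.

    nbest is a list of (text, baseline_score); clicked is treated as the truth
    and is assumed to appear in the n-best list.
    """
    weights = Counter()
    for _ in range(epochs):
        for nbest, clicked in click_log:
            top = max(nbest, key=lambda h: score(h[0], weights, h[1]))[0]
            if top != clicked:
                for g, n in bigrams(clicked).items():
                    weights[g] += lr * n   # promote the clicked item's n-grams
                for g, n in bigrams(top).items():
                    weights[g] -= lr * n   # demote the wrongly preferred competitor
    return weights

# Hypothetical baseline scores for three of the slide's alternatives.
nbest = [("Maine Home", -1.0), ("Maine Mall", -1.5), ("Maine Homes", -2.0)]
weights = train([(nbest, "Maine Mall")])
rescored_top = max(nbest, key=lambda h: score(h[0], weights, h[1]))[0]
print(rescored_top)
```

Shared n-grams such as the sentence-initial "Maine" cancel out in the update, so only the n-grams that distinguish the clicked item from its competitor change weight.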

  35. Rescoring Results
      - Experiments:
        - Rescore n-best alternatives using the baseline LM and the discriminatively trained LM
        - Check whether the rescored one-best is the item the user clicked

      | One-best accuracy | Train set | Dev set | Test set |
      |---|---|---|---|
      | # utterances | 150K | 1.3K | 1.4K |
      | Baseline | 71.1% | 71.5% | 70.5% |
      | Discriminative training | - | 74.8% | 72.7% |

      Accuracy is the fraction of the time the clicked item is at the top of the n-best list.
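The metric in this table can be sketched as follows: rescore each n-best list, take the top item, and count how often it matches the click. The scoring dictionaries and the log below are hypothetical toy values, not the experimental data.

```python
def one_best_accuracy(click_log, score_fn):
    """Fraction of utterances whose top-scoring n-best entry is the clicked item.

    click_log: list of (nbest_list, clicked_item); score_fn maps an entry to a score.
    """
    hits = 0
    for nbest, clicked in click_log:
        top = max(nbest, key=score_fn)
        hits += int(top == clicked)
    return hits / len(click_log)

# Hypothetical scores standing in for a baseline LM and a retrained LM.
baseline = {"Maine Home": -1.0, "Maine Mall": -1.5, "Coffey": -0.5, "Coffee": -0.7}
retrained = {"Maine Home": -1.2, "Maine Mall": -1.1, "Coffey": -0.9, "Coffee": -0.4}

log = [
    (["Maine Home", "Maine Mall"], "Maine Mall"),
    (["Coffey", "Coffee"], "Coffee"),
    (["Maine Home", "Maine Mall"], "Maine Home"),
    (["Coffey", "Coffee"], "Coffee"),
]

acc_base = one_best_accuracy(log, baseline.get)
acc_new = one_best_accuracy(log, retrained.get)
print(acc_base, acc_new)
```

On the slide's own numbers, the discriminatively trained LM gains 3.3 points over the baseline on the dev set (71.5% to 74.8%) and 2.2 points on the test set (70.5% to 72.7%).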

  36. Translation LM (Xiao Li, ICASSP-08)
      - Goal:
        - “Translate” listing forms to query forms
        - Use translated query forms to augment the training data for LM estimation
      - Example: the listing “Kung Ho Cuisine Of China” can have the query forms
        - “Kung Ho Chinese Restaurant”
        - “Kung Ho Restaurant”
        - “Kung Ho”

  37. Recognition Results
      - Experiments
        - Test set: 3K directory-assistance utterances
        - Different LM training sets:

      | Sentence accuracy | One-best | N-best |
      |---|---|---|
      | Listings | 38.6% | 48.3% |
      | Listings + transcription | 41.5% | 51.4% |
      | Listings + transcription + translation | 43.1% | 52.5% |

  38. Maximum Likelihood Database Recovery
      - $w_i$: intended words (unknown, e.g. “Starbucks” or “Al’s Quick Mart”)
      - $w_c$: corrupted words in data (observed, e.g. “Starbuck’s” or “Al’s Kwik Mart”)
      - Want to find the likeliest intended word sequence:

      $$\hat{w}_i = \arg\max_{w_i} P(w_i \mid w_c) = \arg\max_{w_i} \frac{P(w_i)\,P(w_c \mid w_i)}{P(w_c)} = \arg\max_{w_i} P(w_i)\,P(w_c \mid w_i)$$

      - $P(w_i)$: LM built on clean data
      - $P(w_c \mid w_i)$: error model

      | $w_i$ | $w_c$ | $P(w_c \mid w_i)$ |
      |---|---|---|
      | Starbucks | Starbuck’s | 0.5 |
      | Starbucks | Starbucks | 0.5 |
      | Quick | Quick | 0.3 |
      | Quick | Kwik | 0.3 |
      | Quick | Quik | 0.3 |

      - Transductive apparatus used to recover the likeliest words
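The noisy-channel decision rule on this slide can be sketched in a few lines. The error-model probabilities are the ones shown on the slide; the prior `LM_PRIOR` is a hypothetical stand-in for the LM built on clean data, and decoding word by word with a unigram prior is a simplification of the sequence-level search (the slide's "transductive apparatus") used in the talk.

```python
# P(w_c | w_i): error model values taken from the slide's table.
ERROR_MODEL = {
    "Starbucks": {"Starbuck's": 0.5, "Starbucks": 0.5},
    "Quick": {"Quick": 0.3, "Kwik": 0.3, "Quik": 0.3},
}

# P(w_i): hypothetical unigram prior standing in for the clean-data LM.
LM_PRIOR = {"Starbucks": 0.6, "Quick": 0.4}

def recover_word(corrupted):
    """Return argmax over w_i of P(w_i) * P(w_c | w_i).

    Words the error model cannot explain are returned unchanged.
    """
    best, best_p = corrupted, 0.0
    for intended, channel in ERROR_MODEL.items():
        p = LM_PRIOR.get(intended, 0.0) * channel.get(corrupted, 0.0)
        if p > best_p:
            best, best_p = intended, p
    return best

def recover(entry):
    """Word-by-word recovery of a database entry (a simplifying assumption)."""
    return " ".join(recover_word(w) for w in entry.split())

print(recover("Al's Kwik Mart"))
print(recover("Starbuck's"))
```

Because $P(w_c)$ does not depend on $w_i$, the code never needs it: only the product of prior and error-model probability is compared across candidates.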
