
Designing Interactive Systems that Embrace Uncertainty (Keith Vertanen)



  1. Designing Interactive Systems that Embrace Uncertainty
     Keith Vertanen, http://keithv.com, vertanen@mtu.edu

  2. Keith’s Thesis-o-meter
     [Figure: number of words (axis from 0 to 80,000) by year/month, April 2007 to April 2009, with "Submission" and "BGS limit" marked.]

  3. My research
     Design, build, and evaluate intelligent interactive systems that leverage uncertain input technologies, with a focus on enhancing the capabilities of users with permanent or situationally-induced disabilities.
     • Research areas
       – Speech interfaces
       – Mobile interfaces
       – Assistive technologies
       – Text entry
       – Crowdsourcing

  4. Speech interfaces
     Example dictation result: "I must go down to the see again, to the lonely sea and the sky, and all I ask is ah a tall ship and star to steer her by."
     • Confidence visualization for error detection: highlight likely errors using word confidence scores.
     • One-step error correction: infer the location and content of a correction using the utterance + optional approximate location info.
     • Spelling-based error avoidance: spell difficult words during corrections or preemptively during initial dictation, e.g. "I must go down to the seas S-E-A-S again..."
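The confidence visualization idea can be illustrated with a minimal sketch. It assumes the recognizer returns one confidence score per word; the function name, the 0.5 threshold, the bracket markup, and the scores in the usage note are all invented for illustration and do not come from the slides.

```python
def highlight_low_confidence(words, threshold=0.5):
    """Wrap words whose confidence falls below `threshold` in [brackets].

    `words` is a list of (word, confidence) pairs. The threshold and the
    bracket markup are illustrative assumptions, not the talk's design.
    """
    return " ".join(f"[{w}]" if conf < threshold else w for w, conf in words)
```

For instance, `highlight_low_confidence([("down", 0.9), ("to", 0.95), ("the", 0.9), ("see", 0.3)])` marks only the last word as a likely error.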

  5. Error correction interfaces
     • Parakeet: touchscreen error correction. Uses a correction interface built from the word confusion network of the speech recognition result.
     • Speech Dasher: eye-tracking error correction. Zoom through the speech recognition hypothesis space to confirm and correct the result.
     • Gesture keyboard error correction / avoidance. Speak a sentence, then provide gestures for: all words, only recognition errors, or words deemed difficult.

  6. Assistive technologies
     Augmentative and Alternative Communication (AAC)
     [Figures: "Predictive AAC iPad app" and "Voice output AAC device"; on-screen text: "Hello my name is Keith", "I want ju".]
     • Speaking users: ~150 wpm (words per minute); AAC users: often < 10 wpm
     • No corpora of actual AAC user communications
       – Models trained on telephone transcripts or newswire data
       → Shared research resources for conversational AAC

  7. Some invented communications:
     "Is the dog friendly?"
     "I need to start making a shopping list soon."
     "What I would really like right now is a plate of fruit."
     "Who will drive me to the doctor's office tomorrow?"
     • Trained in-domain language model on invented communications
     • Selected data from larger data sets: Twitter, blog, Usenet
     • Results: perplexity ↓82%, keystroke savings ↑11%
     iSCAN: Predictive phoneme-based AAC
     • 16 participants: entry rate ↑108%, error rate ↓79%
     • AAC user: better than device used for 4 years
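The slide reports perplexity and keystroke savings without defining them. As a rough sketch, assuming the commonly used definitions (perplexity as the exponentiated average negative log probability per token, keystroke savings as the percentage of keystrokes avoided relative to typing every character), the two metrics could be computed as follows. The function and parameter names are hypothetical.

```python
import math

def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities under a language model."""
    return math.exp(-sum(log_probs) / len(log_probs))

def keystroke_savings(chars_in_text, keystrokes_used):
    """Percentage of keystrokes saved relative to typing every character."""
    return 100.0 * (chars_in_text - keystrokes_used) / chars_in_text
```

Under these definitions, a model that assigns probability 0.25 to every token has perplexity 4, and entering a 20-character message with 10 keystrokes gives 50% keystroke savings.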

  8. AAC user input: often low bandwidth and noisy
     → Maximize output for each input bit
     • Dwell-clicking on a keyboard: 6 wpm (words per minute)
     • Writing via Dasher navigation: 14 wpm
     • Dwell-free via eye gestures: performance potential of 46 wpm

  9. VelociTap: Project goals
     • Encourage users to go fast
       – Avoid delays due to monitoring intermediate results
       – Avoid overly precise tapping
       → Sentence-at-a-time entry
     • Test on many users
       – No training of user-specific model
       – No learning of new keyboard or entry technique
       → Tapping on familiar QWERTY layout
     • One-handed or small keyboard use
       → Tapping with a single finger

  10. From taps to text
      Given a noisy tap sequence, guess the user's intended text:
      have a good day   0.06
      have a food day   0.01
      have a fod day    0.004
      have a god day    0.0006
      ...
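One common way to frame this ranking is the noisy-channel decomposition below. It is offered as an assumption about how the scores could be combined, not as the formulation the talk states: here t_1, ..., t_n are the observed taps, w is a candidate text, the first factor would come from a touch model, and the second from a language model.

```latex
\hat{w} = \arg\max_{w} P(w \mid t_1, \dots, t_n)
        = \arg\max_{w} P(t_1, \dots, t_n \mid w)\, P(w)
```

The second equality follows from Bayes' rule, since the probability of the tap sequence itself does not depend on w.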

  11. VelociTap: Touch modeling
      2D Gaussians centered at each key, with separate variances in the x- and y-dimensions.
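A minimal sketch of such a touch model follows, assuming an axis-aligned Gaussian per key (no x-y correlation). The key coordinates and the variance values are made up for illustration; the slides do not give them.

```python
import math

# Hypothetical key centers in key-width units (not taken from the slides).
KEY_CENTERS = {"t": (4.0, 0.0), "f": (3.5, 1.0), "g": (4.5, 1.0), "h": (5.5, 1.0)}

def tap_log_likelihood(tap, key, sigma_x=0.4, sigma_y=0.5):
    """Log density of tap (x, y) under a 2D Gaussian centered on `key`.

    sigma_x and sigma_y are separate standard deviations for the two
    dimensions; their values here are illustrative assumptions.
    """
    cx, cy = KEY_CENTERS[key]
    zx = (tap[0] - cx) / sigma_x
    zy = (tap[1] - cy) / sigma_y
    return -0.5 * (zx * zx + zy * zy) - math.log(2 * math.pi * sigma_x * sigma_y)

def most_likely_key(tap):
    """Key whose Gaussian assigns the tap the highest likelihood."""
    return max(KEY_CENTERS, key=lambda k: tap_log_likelihood(tap, k))
```

A tap slightly off the center of "g", such as `(4.4, 1.1)`, still scores highest for "g", while neighboring keys keep a nonzero likelihood that a decoder can weigh against language model evidence.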

  12. VelociTap: Language modeling
      • Language models:
        – 12-gram letter model
        – 4-gram word model with unknown word
        – Trained on billions of words of data
          ▪ Twitter, blog, social media, Usenet, and web data
        – Optimized for short email-like messages
        – Letter and word model: ~4 GB memory

  13. VelociTap: Decoder
      [Figure: search lattice over Observation 1, Observation 2, and Observation 3, with candidate letters on the arcs (including ϵ and X arcs) and hypotheses such as "go", "god", and "good".]
      • Prune unlikely paths
      • Tokens track: probability, LM context
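The token-passing idea on this slide can be sketched as a small beam search. This is a toy illustration under stated assumptions, not the VelociTap decoder: each observation is given as a dictionary of per-letter touch log-likelihoods, the language model is an invented letter bigram table, and the only edit operation besides matching a tap to a letter is skipping a tap as spurious. All names and numbers are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    text: str        # letters hypothesized so far; the tail is the LM context
    log_prob: float  # accumulated touch + language model log probability

# Toy letter-bigram probabilities, invented for illustration only.
BIGRAMS = {("", "g"): 0.6, ("", "f"): 0.4, ("g", "o"): 0.8, ("g", "i"): 0.2,
           ("f", "o"): 0.5, ("f", "i"): 0.5, ("o", "d"): 0.7, ("o", "s"): 0.3,
           ("i", "d"): 0.5, ("i", "s"): 0.5}

def lm_log_prob(context, ch):
    """Bigram log probability of `ch` given the last letter of `context`."""
    return math.log(BIGRAMS.get((context[-1:], ch), 1e-4))

def decode(observations, beam=5.0, skip_log_prob=math.log(0.01)):
    """Return surviving tokens, best first, after consuming all observations."""
    tokens = {"": Token("", 0.0)}
    for obs in observations:
        expanded = {}

        def propose(text, log_prob):
            # Keep only the best-scoring token for each distinct text.
            best = expanded.get(text)
            if best is None or log_prob > best.log_prob:
                expanded[text] = Token(text, log_prob)

        for tok in tokens.values():
            # Option 1: treat the tap as spurious and emit no letter.
            propose(tok.text, tok.log_prob + skip_log_prob)
            # Option 2: the tap produced one of the candidate letters.
            for ch, touch_lp in obs.items():
                propose(tok.text + ch,
                        tok.log_prob + touch_lp + lm_log_prob(tok.text, ch))

        # Prune tokens that fall more than `beam` below the current best.
        top = max(t.log_prob for t in expanded.values())
        tokens = {t.text: t for t in expanded.values() if t.log_prob >= top - beam}
    return sorted(tokens.values(), key=lambda t: t.log_prob, reverse=True)
```

Each token carries exactly the two things the slide names: a probability and the language model context. A fuller decoder would presumably also allow inserted letters and use much longer contexts, which this sketch leaves out.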

  14. The Invisible Keyboard

  15. First, find a transform that produces the best probability using a greedy character-at-a-time decoding scheme. Then run a full search considering many possible character sequences.
      Tap sequence scaled horizontally and slightly translated and rotated.
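The first stage described above can be sketched as a search over candidate transforms, each scored by a cheap greedy pass. The sketch below is an assumption-laden illustration: it uses a plain grid search, an isotropic Gaussian per key, and no language model in the greedy score, none of which is confirmed by the slides. Key positions and all names are hypothetical.

```python
import math
from itertools import product

# Hypothetical key centers in key-width units (not taken from the slides).
KEYS = {"h": (5.5, 1.0), "a": (0.5, 1.0), "v": (4.0, 2.0), "e": (2.0, 0.0)}

def transform(taps, scale_x=1.0, angle=0.0, dx=0.0, dy=0.0):
    """Scale taps horizontally, rotate by `angle` radians, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y in taps:
        x *= scale_x
        out.append((c * x - s * y + dx, s * x + c * y + dy))
    return out

def greedy_score(taps, sigma=0.5):
    """Sum over taps of the best single-key log-likelihood (constants dropped)."""
    total = 0.0
    for x, y in taps:
        total += max(-((x - kx) ** 2 + (y - ky) ** 2) / (2 * sigma ** 2)
                     for kx, ky in KEYS.values())
    return total

def best_transform(taps, scales, angles, shifts):
    """Grid-search (scale_x, angle, dx, dy) for the highest greedy score."""
    candidates = product(scales, angles, shifts, shifts)
    return max(candidates, key=lambda p: greedy_score(transform(taps, *p)))
```

If someone types "have" on an imagined keyboard half as wide as the real one, `best_transform` with candidate scales `[1.0, 2.0]` selects the scale of 2.0; the transformed taps would then be handed to the full search.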

  16. Future work: text entry
      • Improve models: more data
        – Text Blaster: multi-player texting game
        – Eyes-free text adventure game
      • Correction interfaces
      • Improve user signal
        – Audio / tactile feedback
        – Real-time uncertainty feedback
        – Error avoidance for difficult words
      • Other use scenarios
        – Searching links on a web page
        – Input via mid-air gestures
        – Input on an actual smartwatch

  17. Future work: speech interfaces
      • One-step voice correction
        – Detecting hyperarticulate speech
        – Evaluate complete system, for:
          • Improved desktop dictation
          • Eyes-free mobile dictation
      • Other domains
        – Command-and-control
          • Hands-busy or no-device use (instrumented environments)
        – Ambient speech recognition
          • Inform future searches, push relevant content
          • Inform predictive AAC

  18. Future work: assistive technologies
      • Improve AAC language models: more data
        – Develop chat-like game, played by AAC and non-AAC users
        – Validate on AAC users / transcripts
      • Dwell-free eye-writing
        – Evaluate with a recognition-based approach
      • Context in predictive AAC
        – Gleaned via sensors
        – Explicit partner suggestions

  19. Conclusions
      • Want to know more?
        – keithv.com -> Papers, videos
        – Or stop by my office
      • Opportunities for undergrads
        – Good programming skills
          • Java + Android development
          • Socket programming
          • Web development
        – Good people skills:
          • Recruiting participants
          • Running studies
      Justin Emge, Haythem Memmi, Google, Microsoft
