Speech Processing 15-492/18-492: Spoken Dialog Systems, Beyond Basic Dialogs


  1. Speech Processing 15-492/18-492: Spoken Dialog Systems
     - Beyond basic dialogs
     - Building your own dialogs

  2. Back-channeling
     - Human response to speech
       - Robots don’t really do this
       - Uhms, errs: filler words
       - Yeah, uh-huh, hm, right, okay
       - Typically words *not* in the lexicon
       - Prosody delivery is important
       - Timing is important

  3. Back-channel Example
     H: It is like a party, like, “rave” type party or like
     C: well, it’s someone’s house
     H: yeah
     C: there’s going to be, I mean there’s like, they’re going to be spinning. So, in that sense, maybe, but it’s just at someone’s house, like
     H: yah-yeah
     C: It’s in the middle of the night, that, too, but
     (from Nigel Ward, UTEP)

  4. Timing
     - Replies happen before question ends
     - Humans can guess when turn is ending
       - Combination of semantics, prosody (and arrogance)
     - Human-machine dialogs more restricted

  5. Gesture and Gaze
     - What you look at when talking
     - What the machine should look at
     - Talking to the machine vs talking to your friend

  6. Laughter
     - Most common non-verbal vocal production
     - Should machines laugh?
       - Yes, to fit in with the other participants
     - Laughing takes different forms
       - Near verbal (ha ha)
       - Vocal but unlike speech
       - Subvocal
       - Overlaid on speech

  7. Participant in Meeting
     - Machine participants in meetings
       - At least follow the speaker
       - Know when to agree/laugh etc.
       - Know when it can speak
         - Needs to watch how people interact

  8. Machine assistant
     - Needs to watch what you do
       - When are you busy
       - When are you interruptible
       - What is the importance of the information
       - (Cell phone just rings, no matter where you are)
     - Look at human brain state
       - Find when you are thinking
       - Busy, thinking, dreaming

  9. How do humans interact with machines
     - Look at human-human calls
     - “Pretend” they are talking to a machine
       - “Wizard of Oz” (WOZ)
       - Have a human play a machine
       - Need to constrain the human
         - Give them “robotic” voice
         - Constrain their options

  10. Building a New Dialog System
      - What will it do?
        - Write down a typical dialog
        - No, *really* write down a typical dialog
        - Write a second (simpler) one
      - Look at human-human dialogs
        - What information is being passed
        - Can you avoid the hard ASR parts
          - (Avoid large numbers of names)

  11. Breaking down the task
      - What is the ontology
        - What entity types must you deal with
          - e.g. buses, times, bus stops
        - How will people say them
          - List *many* yourself and ask others
        - How should your system say them
          - Consistently, and in a way that’s easy to recognize
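
One way to make the ontology questions on this slide concrete is to record, for each entity type, the ways people might say a value and the single way the system will say it back. The sketch below is only an illustration under assumptions: the `EntityType` class, its field names, and the sample route "61C" with its spoken variants are hypothetical and do not come from the lecture.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EntityType:
    """One entity type in the task ontology (hypothetical structure)."""
    name: str
    # canonical value -> ways a user might say it (list *many*, ask others)
    spoken_forms: Dict[str, List[str]] = field(default_factory=dict)
    # canonical value -> the one consistent way the system says it
    system_form: Dict[str, str] = field(default_factory=dict)

    def normalize(self, utterance: str) -> Optional[str]:
        """Map a user phrase to its canonical value, or None if unknown."""
        text = utterance.lower().strip()
        for canonical, variants in self.spoken_forms.items():
            if text == canonical.lower() or text in (v.lower() for v in variants):
                return canonical
        return None

    def say(self, canonical: str) -> str:
        """Return the consistent system rendering for a canonical value."""
        return self.system_form.get(canonical, canonical)


# Illustrative data only; collect real variants from real users.
bus = EntityType(
    name="bus",
    spoken_forms={"61C": ["sixty one c", "the 61c", "six one c"]},
    system_form={"61C": "the sixty-one C"},
)

print(bus.normalize("Sixty One C"), "|", bus.say("61C"))
```

Keeping user variants and the system rendering in separate tables reflects the slide's two separate questions: users will say a value many ways, while the system should pick one and stick to it.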

  12. Breaking down the task
      - What is the flow of the dialog
        - How should you order the questions
        - Should you allow multiple orders
        - Is this ordering reasonable for your users
          - Ask others, you are too close to the task
        - Test with your written down dialogs
          - (You did write them down, didn’t you?)
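
The ordering questions on this slide can be explored with a small slot-filling sketch: the system asks in one default order, but accepts answers in whatever order the user gives them. This is a minimal sketch under assumptions, not the course's dialog manager; the slot names, prompts, and the `next_prompt` and `update` helpers are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical default question order and prompts for a bus-information task.
QUESTION_ORDER = ["origin", "destination", "time"]
PROMPTS = {
    "origin": "Where are you leaving from?",
    "destination": "Where are you going?",
    "time": "When do you want to travel?",
}


def next_prompt(filled: Dict[str, str]) -> Optional[str]:
    """Return the prompt for the first unfilled slot, or None when all are filled."""
    for slot in QUESTION_ORDER:
        if slot not in filled:
            return PROMPTS[slot]
    return None


def update(filled: Dict[str, str], parsed: Dict[str, str]) -> Dict[str, str]:
    """Merge whichever known slots the user supplied, in any order."""
    merged = dict(filled)
    merged.update({k: v for k, v in parsed.items() if k in PROMPTS})
    return merged


state: Dict[str, str] = {}
# The user volunteers destination and time before being asked for them.
state = update(state, {"destination": "airport", "time": "now"})
print(next_prompt(state))  # the origin prompt is still outstanding
```

Changing `QUESTION_ORDER` is then a one-line experiment, which makes it cheap to try the alternative orderings other people suggest.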

  13. Writing grammars
      - Write grammars for the responses you expect
        - Test them with multiple examples
        - (Get others too if you can)
      - Test it with text
        - ASR will have errors
        - Test by typing first, easier to debug
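
Testing a grammar with typed text, before any ASR is involved, might look like the sketch below. It is an assumption-laden illustration: a Python regular expression stands in for whatever grammar format the real system uses, and the destination names, carrier phrases, and `parse_destination` helper are hypothetical.

```python
import re
from typing import Dict, Optional

# Hypothetical grammar for answers to "Where are you going?".
DESTINATION_GRAMMAR = re.compile(
    r"^(?:i(?: would|'d) like to go to |i want to go to |i'm going to |to )?"
    r"(?:the )?(?P<destination>airport|downtown|university)(?: please)?$"
)


def parse_destination(text: str) -> Optional[Dict[str, str]]:
    """Return the matched slot, or None if the text falls outside the grammar."""
    match = DESTINATION_GRAMMAR.match(text.lower().strip())
    return {"destination": match.group("destination")} if match else None


# Typed test utterances: several that should parse and one that should not.
EXAMPLES = {
    "airport": "airport",
    "to the airport please": "airport",
    "I'd like to go to downtown": "downtown",
    "I want to go to the university": "university",
    "somewhere nice": None,
}

for utterance, expected in EXAMPLES.items():
    result = parse_destination(utterance)
    got = result["destination"] if result else None
    status = "ok" if got == expected else "MISMATCH"
    print(f"{status}: {utterance!r} -> {got!r}")
```

Because the input is typed, any mismatch here is a grammar problem rather than a recognition error, which is the debugging advantage the slide points to.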

  14. Testing the dialog
      - Check for one dialog you know works
      - Test it in the system
        - Modify your grammar/dialog accordingly
      - Then try the variations
      - Get others to test it
      - Does it do the task you expect
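
The written-down dialog can double as a repeatable test: replay each typed user turn and compare the system's reply with what was written down. The sketch below assumes the system can be called as a function from typed text to reply text; `replay`, `toy_system`, and the sample turns are hypothetical stand-ins, not part of any real toolkit.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (typed user input, system reply you expect)


def replay(respond: Callable[[str], str], script: List[Turn]) -> List[str]:
    """Feed each typed user turn to the system and collect any mismatches."""
    failures = []
    for user_text, expected in script:
        actual = respond(user_text)
        if actual != expected:
            failures.append(f"{user_text!r}: expected {expected!r}, got {actual!r}")
    return failures


def toy_system(user_text: str) -> str:
    """Stand-in for a real dialog system; purely illustrative."""
    if "airport" in user_text.lower():
        return "Going to the airport. When do you want to travel?"
    return "Sorry, where are you going?"


# The one dialog you know works, written down turn by turn.
WRITTEN_DIALOG = [
    ("to the airport", "Going to the airport. When do you want to travel?"),
    ("um, the mall", "Sorry, where are you going?"),
]

print(replay(toy_system, WRITTEN_DIALOG) or "written-down dialog passes")
```

Once the known-good dialog passes, variations can be added to the script as further turns, and the same replay can be rerun after every grammar or dialog change.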

  15. Help
      - Try to be consistent and concise
        - Give good examples of what to say
        - Give multiple levels of help
        - Nobody will listen …
      - Test your help advice
        - Is it really useful?
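
Multiple levels of help could be implemented as an escalation: a short example first, then more detail each time the user needs help again. This is a minimal sketch under assumptions; the prompt wording and the `help_prompt` function are hypothetical, and real prompts should be tested with users as the slide advises.

```python
# Hypothetical help prompts, from shortest to most detailed.
HELP_LEVELS = [
    "Say a destination, for example: airport.",
    "You can say a destination such as airport, downtown, or university.",
    "I can give bus times. Tell me where you are going, for example: "
    "'to the airport'. Say 'start over' at any time.",
]


def help_prompt(times_asked: int) -> str:
    """Return a help message that grows more detailed with repeated requests."""
    # Clamp so negative counts get the first level and large counts the last.
    level = min(max(times_asked, 0), len(HELP_LEVELS) - 1)
    return HELP_LEVELS[level]


for n in range(4):
    print(n, help_prompt(n))
```

Starting with the shortest prompt fits the slide's warning that nobody will listen: the concise example is what most users hear, and the longer text is reserved for those who keep asking.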

  16. SDS Architecture
