SLIDE 1

Internet Engineering: VoiceXML

Ali Kamandi Sharif University of Technology Fall 2007 kamandi@ce.sharif.edu

SLIDE 2

What Is VoiceXML?

VoiceXML, or VXML, is a markup language like HTML.

The difference:

  • HTML is rendered by your Web browser to format content and user-input forms;
  • VoiceXML is rendered by a voice browser.

SLIDE 3

User Interaction

Your application can speak to the user via

  • synthesized speech, or
  • prerecorded audio files.

Your software can receive input from the user via

  • speech, or
  • the tones from their telephone keypad.
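As a minimal sketch of both input channels, the VoiceXML 2.0 document below asks one question and accepts either a spoken word or a keypad tone. The form id, field name, and menu choices are hypothetical, chosen only for illustration; the <option> element with its dtmf attribute is part of VoiceXML 2.0, though how well a particular gateway recognizes each choice is not something this sketch can promise.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main_menu">
    <field name="choice">
      <!-- Output: this text is read to the caller as synthesized speech -->
      <prompt>
        For weather, say weather or press 1.
        For news, say news or press 2.
      </prompt>
      <!-- Input: each option can be matched by speech or by a keypad tone -->
      <option dtmf="1" value="weather">weather</option>
      <option dtmf="2" value="news">news</option>
      <filled>
        <prompt>You chose <value expr="choice"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```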

SLIDE 4

How to Make Your Content Telephone-Accessible

The older approach: rent a telephone line and run commercial voice recognition software and text-to-speech (TTS) conversion software.

The VoiceXML revolution: there are free VoiceXML gateways, such as:

  • BeVocal (http://www.bevocal.com)
  • Voxeo (http://www.voxeo.com)
  • VoiceGenie (http://www.voicegenie.com)

SLIDE 5

VoiceXML

These gateways take VoiceXML pages from your Web server and read them to your user.

If your application needs input from the user, the gateway will interpret the incoming response and pass that response to your server in a way that your software can understand.
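One way a gateway hands a response back to your server is the <submit> element, which sends the listed field values as an ordinary HTTP request. The sketch below is illustrative only: the URL is a hypothetical placeholder, and the server is assumed to answer with another VoiceXML page for the gateway to read.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="survey">
    <field name="favorite_animal">
      <prompt>Which do you like better, dogs or cats?</prompt>
      <option value="dogs">dogs</option>
      <option value="cats">cats</option>
    </field>
    <!-- Runs once the field above has been filled -->
    <block>
      <!-- Hypothetical URL: the gateway posts favorite_animal=dogs (or cats)
           and expects a VoiceXML document in the response -->
      <submit next="http://www.example.com/record_answer"
              namelist="favorite_animal" method="post"/>
    </block>
  </form>
</vxml>
```

On the server side this looks like any other form submission, so the same software that handles HTML forms can, in principle, handle voice input.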

SLIDE 6

VoiceXML

SLIDE 7

VoiceXML

SLIDE 8

VoiceXML Basics

<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <block>
      <audio>Hello, World</audio>
    </block>
  </form>
</vxml>

SLIDE 9

VoiceXML Basics (2)

Within the <vxml> element is a <form>, which can be either interactive (requesting input from the user) or informational.

You can have as many forms as you want within a VoiceXML document.

A <block> is a container for executable content: the tags that make your application do something, such as <audio>, <goto>, and so on.
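The following sketch shows two informational forms in one document, with a <block> in the first form using <goto> to jump to the second. The form ids and prompt texts are hypothetical. Note that, under VoiceXML 2.0, a form that finishes without an explicit transition ends the session rather than falling through to the next form, which is why the <goto> is there.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="welcome">
    <block>
      <prompt>Welcome to the animal survey.</prompt>
      <!-- Jump to the form whose id is "goodbye" in this same document -->
      <goto next="#goodbye"/>
    </block>
  </form>

  <form id="goodbye">
    <block>
      <prompt>Thanks for calling. Goodbye.</prompt>
    </block>
  </form>
</vxml>
```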

SLIDE 10

VoiceXML Basics (3)

<audio>text</audio> will read the text with a TTS converter, whereas <audio src="wav_file_URL"/> will play a prerecorded .wav audio file.
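The two styles can also be combined. In the sketch below (the file name welcome.wav is a hypothetical placeholder), the text inside <audio src="..."> serves as fallback content: per the VoiceXML 2.0 specification it is spoken only if the audio file cannot be played. Text placed directly inside <prompt> is always synthesized.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        <!-- Plays welcome.wav; the enclosed text is spoken only as a fallback -->
        <audio src="welcome.wav">Welcome to the animal survey.</audio>
        <!-- Plain text in a prompt is read by the TTS converter -->
        This sentence is always synthesized.
      </prompt>
    </block>
  </form>
</vxml>
```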

SLIDE 11

More VoiceXML

SLIDE 12

More VoiceXML

<?xml version="1.0"?>
<vxml version="2.0">
  <form id="animal_questionnaire">
    <field name="favorite_animal">
      <prompt>
        <audio>Which do you like better, dogs or cats?</audio>
      </prompt>
      <grammar>
        <![CDATA[
          [
            [dog dogs] {<option "dogs">}
            [cat cats] {<option "cats">}
          ]
        ]]>
      </grammar>

SLIDE 13

More VoiceXML (2)

      <!-- if the user gave a valid response, the filled block is executed. -->
      <filled>
        <if cond="favorite_animal == 'dogs'">
          <goto next="#popular_dog_facts"/>
        <else/>
          <goto expr="'psychological_evaluation.cgi?affliction=' + favorite_animal"/>
        </if>
      </filled>

SLIDE 14

More VoiceXML (3)

      <!-- if the user responded but it didn’t match the grammar, the nomatch block is executed -->
      <nomatch>
        I’m sorry, I didn’t understand what you said.
        <reprompt/>
      </nomatch>

SLIDE 15

More VoiceXML (4)

      <!-- if there is no response for a few seconds, the noinput block is executed -->
      <noinput>
        I’m sorry, I didn’t hear you.
        <reprompt/>
      </noinput>
    </field>
  </form>
  <!-- additional forms can go here -->
</vxml>
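The handlers in the questionnaire repeat the same message every time. As a hedged variation, VoiceXML 2.0 lets event handlers carry a count attribute so the response can escalate on repeated failures. The sketch below is not part of the original example: the wording of the messages and the decision to hang up after three failures are assumptions made for illustration, and <option> elements stand in for the grammar.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="animal_questionnaire">
    <field name="favorite_animal">
      <prompt>Which do you like better, dogs or cats?</prompt>
      <option value="dogs">dogs</option>
      <option value="cats">cats</option>

      <!-- First unrecognized response -->
      <nomatch count="1">
        I'm sorry, I didn't understand what you said.
        <reprompt/>
      </nomatch>
      <!-- Second unrecognized response: give more explicit guidance -->
      <nomatch count="2">
        Please say just the word dogs, or the word cats.
        <reprompt/>
      </nomatch>
      <!-- Third and later: give up politely -->
      <nomatch count="3">
        I'm still having trouble understanding. Goodbye.
        <exit/>
      </nomatch>

      <noinput>
        I'm sorry, I didn't hear you.
        <reprompt/>
      </noinput>
    </field>
  </form>
</vxml>
```

The interpreter picks the handler with the highest count that does not exceed the number of times the event has occurred, so the count="3" handler also covers any later failures.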

SLIDE 16

Barge-in

Normally the user does not have to wait for the prompt to finish before speaking. Instead, the user can “barge in” and speak a response at any time. To make the caller listen to an entire prompt, set bargein="false" on it:

<prompt bargein="false">
  <audio src="advertisement.wav"/>
</prompt>
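A sketch of barge-in control in a complete document: the advertisement must be heard in full, while the question that follows can be interrupted. The form id, field name, and the file name advertisement.wav are placeholders. Since barge-in is described above as the normal behavior, the bargein="true" on the second prompt is written out only for contrast.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="ad_then_question">
    <block>
      <!-- The caller cannot interrupt this prompt -->
      <prompt bargein="false">
        <audio src="advertisement.wav"/>
      </prompt>
    </block>
    <field name="favorite_animal">
      <!-- The caller may answer before this prompt finishes -->
      <prompt bargein="true">Which do you like better, dogs or cats?</prompt>
      <option value="dogs">dogs</option>
      <option value="cats">cats</option>
    </field>
  </form>
</vxml>
```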

SLIDE 17

Speech Timeouts

If a user does not speak after hearing a prompt, the interpreter will generate a timeout and execute the <noinput> event handler, if there is one.

The timeout property sets how long the interpreter waits. The value needs a time unit (s or ms):

<property name="timeout" value="10s"/>
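The sketch below shows where such a setting can live. It assumes a document-wide default of ten seconds and a shorter, five-second wait for one particular prompt, set with the timeout attribute of <prompt>; both durations and all names are illustrative choices, not recommendations.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document-wide default: wait up to 10 seconds for the caller to speak -->
  <property name="timeout" value="10s"/>

  <form id="animal_questionnaire">
    <field name="favorite_animal">
      <!-- This prompt overrides the default and waits only 5 seconds -->
      <prompt timeout="5s">Which do you like better, dogs or cats?</prompt>
      <option value="dogs">dogs</option>
      <option value="cats">cats</option>
      <!-- Executed when the wait expires without any input -->
      <noinput>
        I'm sorry, I didn't hear you.
        <reprompt/>
      </noinput>
    </field>
  </form>
</vxml>
```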

SLIDE 18

Grammar Format

In VoiceXML 1.0, the W3C did not specify the grammar format, allowing each VoiceXML platform to implement grammars as it chose.

In VoiceXML 2.0, each platform is required to implement the XML format of the W3C’s Speech Recognition Grammar Specification (SRGS).
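For comparison, here is a sketch of how the dogs-or-cats grammar might look in the XML form of the W3C grammar format. The bracketed grammar in the questionnaire example appears to use a platform-specific notation, so this is a rewrite, not the original author's code. The rule id is hypothetical, and the <tag> elements assume the gateway supports literal semantic tags (tag-format="semantics/1.0-literals"), which is worth checking against a given platform's documentation.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="animal_questionnaire">
    <field name="favorite_animal">
      <prompt>Which do you like better, dogs or cats?</prompt>
      <grammar type="application/srgs+xml" version="1.0" mode="voice"
               root="animal" tag-format="semantics/1.0-literals">
        <rule id="animal">
          <one-of>
            <!-- "dog" or "dogs" both yield the value dogs -->
            <item>
              <one-of><item>dog</item><item>dogs</item></one-of>
              <tag>dogs</tag>
            </item>
            <!-- "cat" or "cats" both yield the value cats -->
            <item>
              <one-of><item>cat</item><item>cats</item></one-of>
              <tag>cats</tag>
            </item>
          </one-of>
        </rule>
      </grammar>
    </field>
  </form>
</vxml>
```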

SLIDE 19

Mobile versus Voice Applications

VoiceXML                          | Mobile Browser
Hard to use in noisy environment  | Works well in noisy environment
Speech or keypad input            | User input with uncomfortable keypads
Can be used with any phone        | Requires browser-enabled telephones

SLIDE 20

Mobile versus Voice Applications

VoiceXML                                                    | Mobile Browser
You only need to develop one version of your software       | You need to develop versions of your software for a variety of mobile gateways
Users can only say predefined phrases                       | Users can enter arbitrary information
Works poorly for giving the user long lists of information  | Works well for displaying long lists of information

SLIDE 21

Syntax in HTML & VoiceXML

Compared to HTML, VoiceXML is much stricter about using correct syntax.

In HTML, browsers tolerate shortcuts such as:

  • writing attribute values without quotes
  • omitting the ending tag of a container

In VoiceXML, you must use proper syntax in all documents.
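As a small illustration, the sketch below shows a sloppy form inside a comment and its well-formed equivalent as live markup. The prompt text and the file name hello.wav are placeholders.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <!-- HTML-style shortcuts that an XML parser rejects:
             <prompt bargein=false>Hello, World
           The attribute value is unquoted and the element is never closed. -->
      <prompt bargein="false">Hello, World</prompt>

      <!-- Elements without content must still be closed, here with a
           trailing slash -->
      <audio src="hello.wav"/>
    </block>
  </form>
</vxml>
```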

SLIDE 22

Beyond VoiceXML: Conversational Speech

You: Will it rain tomorrow in Boston?
JUPITER: To my knowledge, the forecast calls for no rain tomorrow in Boston.

You: What about Detroit?
JUPITER: To my knowledge, the forecast calls for no rain tomorrow in Detroit.

  • JUPITER assumed that you were still interested in rain when asking about Detroit; the context carried over from the Boston question.

SLIDE 23

References

Eve Andersson, Philip Greenspun, and Andrew Grumet. Software Engineering for Internet Applications, Chapter 10. The MIT Press, Cambridge, Massachusetts; London, England, 2006.