Speech Synthesis Markup Language (SSML) Version 1.0





Synthesized speech (text-to-speech, or TTS) is useful as a placeholder during application development, or when the data to be spoken is "unbounded" (not known in advance) and therefore cannot be prerecorded.

When deploying your applications, however, you should plan to use professionally recorded prompts whenever possible. Users expect commercial systems to use high-quality recorded speech, and only recorded speech can guarantee highly natural pronunciation and prosody.

Although recorded prompts are best for many applications, it is important to keep in mind that it is easier to maintain and modify an application that uses TTS prompts. For this reason, you should typically use TTS prompts during development.
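One way to keep the flexibility of TTS while planning for recorded prompts is to give each recording a text fallback. The fragment below is a minimal sketch that belongs inside a form's block or a field. The file name welcome.wav and the prompt wording are assumptions made for illustration.

```xml
<prompt>
  <!-- welcome.wav is a hypothetical recording. If it cannot be fetched
       or played, the enclosed text is spoken by the TTS engine instead. -->
  <audio src="welcome.wav">Welcome to the information line.</audio>
</prompt>
```

During development the recording can simply be absent, so the text is synthesized. Once the recording is deployed, it takes over without any change to the dialog.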

Handling unbounded data: If the information that the application needs to speak is unbounded, you will need to use TTS. Examples of unbounded information include:

- Telephone directories
- E-mail messages
- Frequently updated lists of employee or customer names, movie titles, etc.
- Up-to-the-minute news stories
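For data of this kind, the text to be spoken is usually only known at run time. The sketch below assumes a hypothetical variable named movieTitle. Here it is set to a fixed string, but in a real application it would be filled from a database or web service.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <!-- Hypothetical value; normally supplied by a back-end lookup. -->
  <var name="movieTitle" expr="'The Example Movie'"/>
  <form id="now_showing">
    <block>
      <!-- The title is not known in advance, so the TTS engine speaks it. -->
      <prompt>Now showing: <value expr="movieTitle"/>.</prompt>
    </block>
  </form>
</vxml>
```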


To control how the TTS engine pronounces and styles the text, you can use Speech Synthesis Markup Language (SSML), a markup language that is embedded directly in the VXML code.

If your development environment incorporates a speech engine, SSML markup can be tested in the development environment. If the test IVR system has access to a speech server, SSML markup can also be tested in the runtime environment.

Below is a working example of SSML, which you should be able to run if you have access to a speech server. It is given as a complete VXML document:
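A minimal sketch of such a document, assuming a VoiceXML 2.0 interpreter with an SSML 1.0 capable TTS engine, could look like the following. The form id, the prompt wording, and the pause length are illustrative assumptions and not the site's original listing.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="ssml_demo">
    <block>
      <prompt>
        <!-- A paragraph made up of two sentences. -->
        <p>
          <s>Welcome to the <emphasis>speech synthesis</emphasis> demonstration.</s>
          <s>The next announcement follows a short pause.</s>
        </p>
        <!-- Half-second pause before the final announcement. -->
        <break time="500ms"/>
        <!-- Requests a male voice; the result depends on the voices installed. -->
        <voice gender="male">This text requests a male voice.</voice>
      </prompt>
    </block>
  </form>
</vxml>
```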



Useful SSML elements are:

<s>
This element represents a sentence.

<break>
This element inserts a pause. It is typically placed before an important piece of information, so that the listener pays special attention to what follows.
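As a short illustration, the fragment below pauses before reading out a number. It assumes a platform that follows SSML 1.0, where the pause length is given with the time attribute. The wording and the 500 ms value are arbitrary choices.

```xml
<prompt>
  Your confirmation number is
  <!-- Pause for half a second before the important detail. -->
  <break time="500ms"/>
  four seven one nine.
</prompt>
```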

<p>
This element represents a paragraph.
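Sentences and paragraphs are usually used together, with each paragraph containing one or more sentences. The fragment below is a small sketch with made-up prompt text. It would sit inside a form's block or a field.

```xml
<prompt>
  <p>
    <s>Thank you for calling.</s>
    <s>Our offices are open from eight until five.</s>
  </p>
  <p>
    <s>Please hold for the next available agent.</s>
  </p>
</prompt>
```

Marking the structure explicitly lets the TTS engine choose sentence and paragraph pauses itself, instead of guessing them from punctuation.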

<voice>
This element requests a change in the TTS voice used for speaking.
The gender can be set to "male" or "female".
The name of the voice can also be set.
In all these cases, the developer needs to know which TTS voices are available on the speech server.
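The fragment below sketches both ways of selecting a voice. The name "Thandi" is purely hypothetical, so substitute a voice that is actually installed on your speech server.

```xml
<prompt>
  <!-- Select a voice by gender. -->
  <voice gender="female">Your order has been shipped.</voice>
  <!-- Select a voice by name; "Thandi" is a hypothetical voice name. -->
  <voice name="Thandi">It should arrive within three working days.</voice>
</prompt>
```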

<emphasis>
This element causes the contained text to be spoken with emphasis. How the emphasis is rendered can differ according to the language and voice used.
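For example, a single word can be stressed as follows. The optional level attribute is defined by SSML 1.0, and the prompt wording is only an illustration.

```xml
<prompt>
  Please enter your PIN <emphasis level="strong">after</emphasis> the tone.
</prompt>
```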

When you use the <enumerate> element to play menu choices with the TTS engine, you do not need to add punctuation to control the length of the pauses between <choice> elements. The VoiceXML browser automatically adds the appropriate pauses and intonation when speaking the prompts.
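A minimal menu sketch along these lines is shown below. The menu id and the target dialogs (#sales, #support, #accounts) are hypothetical and would have to exist elsewhere in the same document.

```xml
<menu id="main_menu">
  <!-- The choices are listed by the enumerate element; no commas or
       full stops are added by hand. -->
  <prompt>Please say one of the following: <enumerate/></prompt>
  <choice next="#sales">sales</choice>
  <choice next="#support">support</choice>
  <choice next="#accounts">accounts</choice>
</menu>
```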










Copyright 2009 - Computer Assisted Telephony Systems (Pty) Ltd