<www.vxml.co.za>





<training>





<voicexml tutorial>















Voice Extensible Markup Language (VoiceXML) Version 2.0




The authoritative reference for the available elements and their use is the World Wide Web Consortium (W3C) specification, Voice Extensible Markup Language (VoiceXML) Version 2.0. The complete specification is available from the W3C and serves as a full guide to the language. At this stage, there are no proposed or pending errata.

The basic architecture of a VoiceXML environment consists of three main components:
- The IVR with the VXML browser
- The Speech Server
- The Application Server






The IVR (Interactive Voice Response) system containing the VXML browser can be reached through several telephony routes. The most common is an ordinary circuit-switched telephone call over the PSTN (Public Switched Telephone Network), which the IVR also receives as a PSTN call by means of a telephony card.

The second is a PSTN call from the user side, which passes through a VoIP (Voice over Internet Protocol) gateway and is received by the IVR as a VoIP call.

The third is a direct VoIP call from the user to the IVR; this route is widely used in development environments. In each case, the IVR answers the call and starts the speech browser, which interprets the VoiceXML document in response to the call and the caller's input.
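
As an illustration of what the browser interprets once the call is answered, the following is a minimal sketch of a VoiceXML 2.0 document. The form name and the prompt wording are illustrative assumptions, not taken from any particular deployment.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal VoiceXML 2.0 document: greet the caller, then hang up. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <!-- "welcome" is a hypothetical form name chosen for this sketch. -->
  <form id="welcome">
    <block>
      <!-- The prompt text is rendered to the caller by text-to-speech. -->
      <prompt>Welcome. Your call has been answered.</prompt>
      <!-- End the call once the prompt has been queued and played. -->
      <disconnect/>
    </block>
  </form>
</vxml>
```

The same document applies regardless of whether the call arrived over the PSTN, through a VoIP gateway, or as a direct VoIP call; the telephony route is handled by the IVR, not by the VoiceXML.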

The Speech Server is a separate system that handles text-to-speech (TTS) conversion and ASR (Automatic Speech Recognition). The configuration and internal workings of the server are outside the scope of this document. However, SSML (Speech Synthesis Markup Language) will be addressed in this document. When the speech browser encounters VXML elements that require Speech Server interaction, the IVR facilitates the interaction with the Speech Server. The protocol used to communicate with the Speech Server is MRCP (Media Resource Control Protocol); however, this is transparent to the VXML programmer.
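
The sketch below shows the kind of VXML elements that typically involve the Speech Server: a prompt containing SSML markup (text-to-speech) and a field with an inline grammar (speech recognition). The form, field, and rule names and the two department words are assumptions made for this example only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <!-- "choose_department" and "department" are hypothetical names. -->
  <form id="choose_department">
    <field name="department">
      <!-- SSML elements (emphasis, break) shape how the text is spoken. -->
      <prompt>
        Please say <emphasis>sales</emphasis>
        <break time="300ms"/>
        or <emphasis>support</emphasis>.
      </prompt>
      <!-- Inline grammar: the only utterances the recognizer will accept. -->
      <grammar mode="voice" version="1.0" root="dept"
               type="application/srgs+xml">
        <rule id="dept" scope="public">
          <one-of>
            <item>sales</item>
            <item>support</item>
          </one-of>
        </rule>
      </grammar>
      <!-- Played when the caller says nothing or is not understood. -->
      <noinput>I did not hear anything. <reprompt/></noinput>
      <nomatch>Sorry, I did not understand. <reprompt/></nomatch>
      <!-- Runs once the field has been filled by a recognized utterance. -->
      <filled>
        <prompt>You chose <value expr="department"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Nothing in the document mentions MRCP or the Speech Server address; the exchange with the Speech Server is arranged by the IVR platform.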

The application (web) server is the domain of the developer or the client and therefore falls outside the responsibility of the telco. On the application server, the developer can host server-side applications that are invoked by the VXML application. These server-side applications can be used for saving recorded voice, entering call data into a database, or retrieving data relevant to the call. Communication with the Application Server takes place via HTTP, as with any web page. As the diagram illustrates, there may also be web applications that make use of the same application server.
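
A minimal sketch of how a VXML application might invoke a server-side application over HTTP is given below: it records a message and posts it, together with the caller's address, to the application server. The URL http://appserver.example.com/save_message and the names log_call, caller_id, and message are hypothetical placeholders; the actual endpoint depends on what the developer hosts.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="log_call">
    <!-- Standard session variable holding the caller's address. -->
    <var name="caller_id" expr="session.connection.remote.uri"/>
    <!-- Record up to 30 seconds of audio after a beep. -->
    <record name="message" beep="true" maxtime="30s">
      <prompt>Please leave your message after the beep.</prompt>
    </record>
    <block>
      <!-- HTTP POST to a hypothetical server-side application.
           multipart/form-data is needed to upload the recorded audio. -->
      <submit next="http://appserver.example.com/save_message"
              method="post"
              enctype="multipart/form-data"
              namelist="caller_id message"/>
    </block>
  </form>
</vxml>
```

The server-side application is expected to reply with another VoiceXML document, which the browser then interprets to continue the call.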





For more information go to:
www.w3.org/TR/voicexml20








Copyright 2009 - Computer Assisted Telephony Systems (Pty) Ltd