Vocalocity Voice Browser
VoiceXML Implementation Reference Guide
Version 2.4.1
Vocalocity VoiceXML Implementation Reference Guide, Version 2.4.1
Copyright © 2003–2005. Vocalocity, Inc. All rights reserved. An unpublished work under US Copyright Laws.
Published June 2005
This document is protected by copyright. No part of this document may be used or reproduced in any form by any means without prior written authorization of Vocalocity, Inc. (“Vocalocity”) and its licensors, if any. This document contains information that may be protected by one or more US patents, foreign patents, or pending applications. This document is subject to the terms of the Vocalocity Evaluation Agreement and/or the Vocalocity Master Software License Agreement.
THIS DOCUMENT IS PROVIDED “AS IS” AND WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
THIS DOCUMENT MAY CONTAIN TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS, AND VOCALOCITY MAKES NO REPRESENTATION OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE INFORMATION CONTAINED IN THIS DOCUMENT. CHANGES MAY BE ADDED PERIODICALLY TO THE INFORMATION CONTAINED HEREIN; THESE CHANGES WILL BE INCORPORATED IN NEW EDITIONS, VERSIONS OR RELEASES OF THIS DOCUMENT. VOCALOCITY MAY MAKE IMPROVEMENTS AND/OR CHANGES IN THE PRODUCT(S) AND/OR PROGRAM(S) DESCRIBED IN THIS DOCUMENT AT ANY TIME.
Vocalocity, the Vocalocity logo, and combinations thereof are trademarks of Vocalocity, Inc. in the United States and other countries. Other product names and brands used in this document are for identification purposes only, and are the trademarks and/or property of their respective owners. This notice does not evidence any actual or intended publication of this document.
For more information, contact us at [email protected].
Vocalocity, Inc.
730 Peachtree Street
Suite 1100
Atlanta, GA 30308 USA
+1.404.487.1200
http://www.vocalocity.com
Contents
Preface: About This Guide
  Introduction
    Intended Audience
    Version Information
  Using Documentation
    Contents of This Guide
    Related Documentation
  Conventions
  Contacting Vocalocity Technical Support
Chapter 1: Introduction
  About Voice Browsers and VoiceXML
    Voice Browsers
    Supported Specifications
  Implementation of the VoiceXML Specifications
  ASR Vendor Support of SRGS
    ASR Vendor Support of SISR
Chapter 2: VoiceXML Element Summary
  VoiceXML Summary
  SSML Summary
  SRGS Summary
  Detailed Implementation Notes
    <audio> Element
    <data> Element
    <log> Element
    <meta> Element
Chapter 3: Standard Types and Defaults
  Introduction
    Setting Vocalocity Browser Properties
      Setting Properties
      Setting Java System Properties
  Vocalocity Session Variables
  Vocalocity Property Defaults
    Generic Speech Recognition Properties
    Generic DTMF Recognition Properties
    Prompt and Collection Properties
    Fetching Properties
    Miscellaneous Properties
  Custom Browser Properties
    Specifying the ASR or TTS Engine to Use
      ASR Engines
      TTS Engines
      Defaults Used when No Engine Is Specified
  Audio and Initial Page Fetching
  MIME Type Mapping
    Overriding a MIME Type
  SAX Parsers
  ECMAScript
    Accessing the log4j Logger
    Accessing Web Services
  Implementation Notes
    Bargein
    Default Encoding
    DTMF-Only Applications
    Infinite Loop Detection
    Strict Content Type Processing
    Time Unit Designations
    Using a File-Based URL in Applications
Chapter 4: SpeechWorks OSR Notes
  Introduction
  Application Name Used for OSR Logging
  SpeechWorks Recognizer Properties
  Endpointer Tuning
  Licensing Modes
Preface
About This Guide
The Vocalocity Voice Browser 2.4.1 fully conforms to the VoiceXML 2.0 and 2.1 specifications, and supports the Speech Recognition Grammar Specification (SRGS) and other related open standards. This reference guide provides additional detail for how the Vocalocity Voice Browser implements the standards.
The topics discussed in this guide include:
Voice browsers and VoiceXML
Supported specifications
Implementation of VoiceXML elements
Vocalocity defaults
Introduction
The Vocalocity VoiceXML Implementation Reference Guide describes how the Vocalocity Voice Browser implements VoiceXML as described in:
The W3C Recommendation 16 March 2004, Voice Extensible Markup Language (VoiceXML) 2.0
The W3C Working Draft 28 July 2004, Voice Extensible Markup Language (VoiceXML) 2.1
This guide is not a programming guide. It clarifies how Vocalocity has implemented the standards where the requirements were ambiguous or where Vocalocity has chosen a slightly different implementation.
This guide is intended to provide an explanation of VoiceXML support in the Vocalocity Voice Browser. Use this guide along with the W3C VoiceXML specifications when developing applications.
Intended Audience
This guide should be used by:
Application developers who are creating VoiceXML applications for the Vocalocity Voice Browser
Technical personnel who are responsible for troubleshooting deployed applications
Version Information
The information in this guide is accurate for Version 2.4.1 of the Vocalocity Voice Browser.
It discusses Vocalocity Voice Browser’s implementation of VoiceXML 2.0 and VoiceXML 2.1.
Using Documentation
This section outlines the structure of the VoiceXML Implementation Reference Guide and explains other guides in the documentation set and their intended audiences.
Contents of This Guide
This guide consists of four chapters. The following table describes each chapter.
Chapter or Appendix | Description
Preface | Introduces the structure of this guide and explains how information is presented.
Chapter 1, Introduction | Provides some background on voice browsers, voice standards, and the Vocalocity VoiceXML Interpreter. Lists the specifications to which the VoiceXML Interpreter conforms.
Chapter 2, VoiceXML Element Summary | For each VoiceXML element, provides additional detail for how the Vocalocity Voice Browser implements the VoiceXML standard.
Chapter 3, Standard Types and Defaults | Lists the standard event types, session variables, and application variables in the Vocalocity Voice Browser VoiceXML implementation, and how to configure them.
Chapter 4, SpeechWorks OSR Notes | Contains implementation suggestions or usage notes for using SpeechWorks OSR with the Vocalocity Voice Browser.
Related Documentation
There are several different guides to help you understand, implement and run the Vocalocity Voice Browser. The documentation set consists of the following guides.
Guide | Description | Intended Audiences
Vocalocity Voice Browser Installation Guide | Contains hardware and software requirements for the Vocalocity Voice Browser, describes deployment options, and contains procedures for installing and configuring Vocalocity software, third-party software, and hardware. Note: Operations information is included in the Control Center User’s Guide. | Anyone planning an implementation or installing Vocalocity Voice Browser, Voice Browser components, and Vocalocity tools
Vocalocity App Center User’s Guide | Describes how to build and deploy Vocalocity Voice Browser solutions using Vocalocity App Center and the Vocalocity Voice Browser | Voice application developers who are creating and publishing VoiceXML applications for their own use or for their customers
Vocalocity Control Center User’s Guide | Describes how to monitor Vocalocity Voice Browser solutions | Operations personnel performing ongoing maintenance of Vocalocity Voice Browser solutions
Vocalocity Info Center User’s Guide | Describes how to gather call information and use that information to support Vocalocity Voice Browser solutions | Support personnel responsible for troubleshooting and supporting voice applications
VoiceXML Implementation Reference Guide | Describes how the Vocalocity Voice Browser implements VoiceXML 2.0 and 2.1. This guide should be used along with the W3C VoiceXML specifications when developing applications. | Application developers who are creating VoiceXML applications for the Vocalocity Voice Browser; technical personnel who are responsible for troubleshooting deployed applications
Conventions
The following table describes the typographical conventions used in this guide.
Convention | Meaning
Monospace | Indicates text that should be entered exactly as shown (including punctuation) or examples of code. Here is an example of a command line: # mkdir /somedir
Bold Type | Indicates a path or the name of a program, process, procedure, routine, script, or table, such as ASSIGN
Italic Type | Indicates a variable entry, such as <server>, or a term being defined for the first time
Contacting Vocalocity Technical Support
There are several ways to contact Vocalocity Customer Support.
Contact us... | At...
On the Web | http://support.vocalocity.com. The Vocalocity support website is available 24 hours a day. The website has integrated issue tracking functionality that allows customers to enter and track defects and enhancements to our software.
Via email | [email protected]. Your email goes directly to our technical support staff.
By telephone | +1 404.487.1200
By mail | Our corporate offices are located at: Vocalocity, Inc., 730 Peachtree Street, Suite 1100, Atlanta, GA 30308 USA
Chapter 1: Introduction
The Vocalocity Voice Browser is a voice browser: a packaged solution that integrates all the components necessary for a voice application system. The Vocalocity Voice Browser includes a VoiceXML Interpreter that reads and plays VoiceXML applications. This chapter provides some background on voice browsers, voice standards, and the Vocalocity VoiceXML Interpreter.
This chapter contains the following topics:
About Voice Browsers and VoiceXML
Implementation of the VoiceXML Specifications
ASR Vendor Support of SRGS
About Voice Browsers and VoiceXML
Vocalocity is an active member of the W3C Voice Browser Working Group. The Working Group has defined a suite of markup languages covering dialog, speech synthesis, speech recognition, call control and other aspects of interactive voice response applications.
Vocalocity is one of the W3C Editors of VoiceXML 2.0, VoiceXML 2.1, CCXML 1.0 and SSML. Additionally, Vocalocity is a Board Member of the VoiceXML Forum and our Chief Architect, Ken Rehor, serves as the organization's Vice Chair.
For more information about the:
W3C Voice Browser Working Group, go to www.w3c.org/Voice
VoiceXML Forum, go to www.voicexml.org
Specifications such as the Speech Synthesis Markup Language (SSML), Speech Recognition Grammar Specification (SRGS), and Call Control XML (CCXML) are core technologies for describing speech synthesis (text-to-speech), recognition grammars (automatic speech recognition), and call control constructs respectively.
VoiceXML, or Voice eXtensible Markup Language, is a dialog markup language that leverages the other specifications for creating dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key (touch tone) input, recording of spoken input, telephony, and mixed initiative conversations.
VoiceXML is the HTML of the voice web, the open standard markup language for voice applications. Where HTML assumes a graphical web browser with display, keyboard, and mouse, VoiceXML assumes a voice browser with audio output (recorded messages and TTS synthesis), audio input (ASR), and keypad input (DTMF).
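To make the HTML comparison concrete, the following is a minimal sketch of a VoiceXML 2.0 document that plays a single prompt. The form id and the prompt wording are illustrative assumptions and do not come from the Vocalocity documentation; the root element attributes follow the W3C VoiceXML 2.0 Recommendation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal VoiceXML 2.0 document: one form with one non-interactive block. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- The id "greeting" is a placeholder name chosen for this sketch. -->
  <form id="greeting">
    <block>
      <!-- Plain text in a prompt is rendered by the TTS engine. -->
      <prompt>Hello, and welcome.</prompt>
    </block>
  </form>
</vxml>
```

Just as a web browser renders an HTML page, a voice browser fetches a document like this one and renders it as audio output to the caller.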
Voice Browsers
A voice browser is a collection of software that works together to integrate and manage telephony, automatic speech recognition (ASR), text-to-speech (TTS), DTMF (touchtone), third-party or custom services, media, and other resources required to run VoiceXML applications.
The Vocalocity Voice Browser is a packaged solution that integrates all the components necessary for a voice application system. The Vocalocity Voice Browser includes a VoiceXML Interpreter that enables it to execute voice applications written in VoiceXML.
The Vocalocity VoiceXML Interpreter conforms to the VoiceXML 2.0 and VoiceXML 2.1 specifications and related specifications.
Supported Specifications
The Vocalocity Voice Browser supports the following W3C specifications.
Standard | Description | Specification
VoiceXML 2.0 | Voice Extensible Markup Language. Markup language used to create dialogs (voice applications). The Vocalocity Voice Browser includes a VoiceXML interpreter that can render VoiceXML applications. | W3C Recommendation 16 March 2004, www.w3.org/TR/voicexml20/
VoiceXML 2.1 | Voice Extensible Markup Language | W3C Working Draft 28 July 2004, www.w3.org/TR/voicexml21/
SSML 1.0 | Speech Synthesis Markup Language. SSML tags are used for TTS capabilities. They are noted in the Implementation Notes in the following table. | W3C Recommendation 7 September 2004, www.w3.org/TR/speech-synthesis/
SRGS 1.0 | Speech Recognition Grammar Specification. SRGS tags are used for ASR capabilities, for example, to specify a grammar. | W3C Recommendation 16 March 2004, www.w3.org/TR/speech-grammar/
SISR 1.0 | Semantic Interpretation for Speech Recognition. The SRGS element <tag> provides a placeholder for instructions to a semantic processor. | W3C Working Draft 8 November 2004, www.w3.org/TR/semantic-interpretation/
Implementation of the VoiceXML Specifications
The Vocalocity VoiceXML Interpreter conforms to all required elements in the VoiceXML 2.0 and 2.1 specifications. However, there are elements where the implementation of attributes has been left to the interpreter. This guide describes how Vocalocity has implemented the standard. It should be used along with the W3C VoiceXML specifications when developing applications.
Support of built-in VoiceXML grammars is dependent on the ASR vendor implementation. For a list of vendors and supported SRGS versions, see ASR Vendor Support of SRGS on page 1-5.
Support of semantic interpretation is dependent on the ASR vendor. For a list of vendors and supported SISR versions, see ASR Vendor Support of SISR on page 1-5.
Support for SSML is dependent on the TTS vendor.
ASR Vendor Support of SRGS
The version of the SRGS supported also depends on the speech recognition vendor your implementation uses.
ASR Vendor | Supported Specification
ScanSoft SpeechWorks OSR 3.0 | SRGS 1.0, W3C Proposed Recommendation 18 December 2003, http://www.w3.org/TR/2003/PR-speech-grammar-20031218/
ScanSoft SpeechWorks OSR 2.0 | SRGS 1.0, W3C Candidate Recommendation 26 June 2002, http://www.w3.org/TR/2002/CR-speech-grammar-20020626/
LumenVox SRE 5.5 | SRGS 1.0, W3C Recommendation 16 March 2004, http://www.w3.org/TR/speech-grammar/
Nuance Speech Recognition System 8.0.0 | SRGS 1.0, W3C Working Draft 20 August 2001, www.w3.org/TR/2001/WD-speech-grammar-20010820/
ASR Vendor Support of SISR
The version of the SISR specification supported also depends on the speech recognition vendor your implementation uses.
ASR Vendor | Supported Specification
ScanSoft SpeechWorks OSR 3.0 | SISR 1.0, W3C Working Draft 8 November 2004, http://www.w3.org/TR/semantic-interpretation/
ScanSoft SpeechWorks OSR 2.0 | SISR 1.0
LumenVox SRE 5.5 | SISR 1.0
Nuance Speech Recognition System 8.0.0 | SISR 1.0
Chapter 2: VoiceXML Element Summary
This chapter explains how the Vocalocity Voice Browser implements the VoiceXML 2.0 and 2.1 standards. It identifies areas of clarification in cases where the specifications had ambiguous requirements or where the implementation has been left to the vendor.
This chapter contains the following topics:
VoiceXML Summary
SSML Summary
SRGS Summary
Detailed Implementation Notes
VoiceXML Summary
The following table is a summary of the current VoiceXML elements supported in this release of the Vocalocity Voice Browser.
Element | Purpose | Implementation Notes
<assign> | Assign a variable a value | Implemented as defined in VoiceXML 2.0.
<audio> | Play an audio clip within a prompt | The VoiceXML interpreter processes audio content so that only text is sent to the TTS engine; pre-recorded audio is played by the telephony hardware. For more information, see <audio> Element on page 2-10.
<block> | A container of (non-interactive) executable code | Implemented as defined in VoiceXML 2.0.
<break> | Control the pausing or other prosodic boundaries between words | Implemented as defined in VoiceXML 2.0.
<catch> | Catch an event | Implemented as defined in VoiceXML 2.0.
<choice> | Define a menu item or specify a speech or DTMF grammar | Up to 100 <choice> tags are supported for each menu. Exactly one of “next”, “expr”, “event” or “eventexpr” must be specified; otherwise, an error.badfetch event is thrown. Exactly one of “message” or “messageexpr” may be specified; otherwise, an error.badfetch event is thrown.
<clear> | Clear one or more form item variables | Implemented as defined in VoiceXML 2.0.
<data> | Allows a VoiceXML application to fetch XML data from a document server without transitioning to a new VoiceXML document | New in VoiceXML 2.1. A Java system property, vocalos.vxml.data.access_control.allow, can be set to configure the default behavior if the returned XML content does not contain the access-control XML processing instruction. See <data> Element on page 2-10.
<disconnect> | Disconnect a session | The namelist values are passed to the TEP implementation for further processing. Upon processing of <disconnect/>, the prompt queue will be flushed before sending the hangup command to the TEP.
<else> | Used in <if> elements | Implemented as defined in VoiceXML 2.0.
<elseif> | Used in <if> elements | Implemented as defined in VoiceXML 2.0.
<enumerate> | Shorthand for enumerating the choices in a menu: an automatically generated description of the choices available to the user | Implemented as defined in VoiceXML 2.0. <enumerate> specifies a template that is applied to each choice in the order they appear in the menu. If it is used with no content, the VoiceXML interpreter uses the following template: Enter “dtmf-digit” for “text”
<error> | Catch an error event | Implemented as defined in VoiceXML 2.0. An abbreviation for the <catch event="error"> element.
<exit> | Exits a session and returns control to the interpreter context, which determines what to do next | The namelist values are passed to the TEP implementation for further processing. Upon processing of <exit/>, the prompt queue will be flushed before sending the hangup command to the TEP.
<field> | Declares an input field in a form: an input to be gathered from the user | Implemented as defined in VoiceXML 2.0.
<filled> | An action executed when fields are filled | Implemented as defined in VoiceXML 2.0.
<foreach> | Allows a VoiceXML application to iterate through an ECMAScript array and to execute the content contained within the <foreach> element for each item in the array | New in VoiceXML 2.1. Implemented as defined in VoiceXML 2.1.
<form> | A dialog for presenting information and collecting data | Implemented as defined in VoiceXML 2.0.
<goto> | Go to another dialog in the same or different document | Implemented as defined in VoiceXML 2.0.
<grammar> | Specify a speech recognition or DTMF grammar | Implemented as defined in VoiceXML 2.0.
<help> | Catch a help event | Implemented as defined in VoiceXML 2.0. An abbreviation for the <catch event="help"> element. Grammars for all system-level prompts such as “help” and “exit” are generated in US English only.
<if> | Simple conditional logic | Implemented as defined in VoiceXML 2.0.
<initial> | Declares initial logic upon entry into a (mixed initiative) form | Implemented as defined in VoiceXML 2.0.
<link> | Specify a transition common to all dialogs in the link's scope | Implemented as defined in VoiceXML 2.0.
<log> | Generate a debug message | Implemented as defined in VoiceXML 2.0. An application can log the content of the log tag to a Log4J category. For more information, see <log> Element on page 2-11. For SpeechWorks OSR, any log tag starting with SWI will be logged to the SPWK log.
<menu> | A dialog for choosing amongst alternative destinations | Implemented as defined in VoiceXML 2.0.
<meta> | Define a metadata item as a name/value pair | The following meta properties are supported: Expires (http-equiv), Pragma (http-equiv), Cache-Control (http-equiv). For more information, see <meta> Element on page 2-11.
<metadata> | Define metadata information using a metadata schema | This element is supported, but not used by the Vocalocity Voice Browser.
<noinput> | Catch a noinput event | Implemented as defined in VoiceXML 2.0. An abbreviation for the <catch event="noinput"> element.
<nomatch> | Catch a nomatch event | Implemented as defined in VoiceXML 2.0. An abbreviation for the <catch event="nomatch"> element.
<object> | Interact with a custom extension | Use the Vocalocity Object API to register a custom object implementation. If you use this tag without registering the object, an error.object.notsupported event will be thrown.
<option> | Specify an option in a <field> | Implemented as defined in VoiceXML 2.0.
<param> | Parameter in <object> or <subdialog> | Implemented as defined in VoiceXML 2.0.
<prompt> | Queue speech synthesis and audio output to the user | The attribute bargeintype is not supported. Recognition-based bargein is not currently supported. Regardless of the value of the bargeintype attribute, the browser will always use energy. For information about bargein support, see Bargein on page 3-14.
<property> | Control implementation platform settings (such as recognition process, timeouts, caching policy, etc.) | The Vocalocity Voice Browser sets default properties, if not specified by <property>. For more information, see Vocalocity Property Defaults on page 3-4.
<record> | Record an audio sample | The timeout attribute is ignored (the VoiceXML interpreter will not throw a noinput event if the timeout is exceeded before recording begins). Use the finalsilence attribute to set the interval of silence that indicates the end of the speech to record.
<reprompt> | Play a field prompt when a field is re-visited after an event | Implemented as defined in VoiceXML 2.0.
<return> | End execution of a subdialog and return control and data to the calling dialog | Implemented as defined in VoiceXML 2.0.
<script> | Specify a block of ECMAScript client-side scripting logic | Implemented as defined in VoiceXML 2.0. If not specified, the default encoding is UTF-8. For additional information about ECMAScript usage, see ECMAScript on page 3-13.
<subdialog> | Invoke another dialog as a subdialog of the current one | Implemented as defined in VoiceXML 2.0.
<submit> | Submit values to a document server | When submitting a variable that contains recorded content, the browser will use the value of “multipart/form-data” for the enctype attribute, regardless of what is supplied by the application.
<throw> | Throw an event | Implemented as defined in VoiceXML 2.0.
<transfer> | Transfer the caller to another destination | The Vocalocity Voice Browser supports bridge transfer (bridge="true") and blind transfer (bridge="false"). The default is blind. Hot word transfer cancellation is not supported. The transferaudio attribute is not supported.
<value> | Insert the value of an expression in a prompt | Implemented as defined in VoiceXML 2.0.
<var> | Declare a variable | Implemented as defined in VoiceXML 2.0.
<vxml> | Top-level element in each VoiceXML document | Implemented as defined in VoiceXML 2.0. Required attributes: version="2.0" (for VoiceXML 2.0) and xmlns="http://www.w3.org/2001/vxml".
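Several of the implementation notes in the table above can be seen together in one short document. The following is a sketch only, written against the W3C VoiceXML 2.0 element definitions: the dialog ids, prompt wording, submit URL, and telephone number are all placeholders invented for illustration, and the behavior described in the comments restates the table rather than anything verified against a running browser.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- Each <choice> specifies exactly one of next, expr, event, or eventexpr. -->
  <menu id="main" dtmf="true">
    <!-- Empty <enumerate/> asks the interpreter to apply its default template. -->
    <prompt>Main menu. <enumerate/></prompt>
    <choice next="#leave_message">leave a message</choice>
    <choice next="#call_agent">speak to an agent</choice>
  </menu>

  <form id="leave_message">
    <!-- finalsilence sets the silence interval that ends the recording;
         per the table, the timeout attribute is ignored, so it is omitted. -->
    <record name="msg" beep="true" maxtime="30s" finalsilence="3s">
      <prompt>Please record your message after the beep.</prompt>
      <filled>
        <!-- Placeholder URL. Recorded content is sent as multipart/form-data. -->
        <submit next="http://example.com/save-message" method="post"
                enctype="multipart/form-data" namelist="msg"/>
      </filled>
    </record>
  </form>

  <form id="call_agent">
    <!-- bridge="true" requests a bridge transfer; the number is a placeholder.
         transferaudio is left out because the table lists it as unsupported. -->
    <transfer name="agent" dest="tel:+15555550100" bridge="true">
      <filled>
        <if cond="agent == 'busy'">
          <prompt>The line is busy. Please try again later.</prompt>
        </if>
      </filled>
    </transfer>
  </form>
</vxml>
```

Setting bridge explicitly is a deliberate choice in this sketch: the table states that the default is a blind transfer, so an application that needs the call outcome returned to the dialog would have to ask for a bridge transfer.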
SSML Summary
These speech markup elements are defined in SSML and are available in VoiceXML 2.0. The following table is a summary of the current SSML elements supported in this release of the Vocalocity Voice Browser.
Important: Support is largely driven by the text-to-speech engine. Consult the documentation provided by your TTS vendor for specific information about how SSML elements are supported.
Element | Purpose | Implementation Notes
<audio> | Play an audio clip within a prompt | Implemented as defined in SSML 1.0. The VoiceXML interpreter processes audio content so that only text is sent to the TTS engine; pre-recorded audio is played by the telephony hardware. For more information, see <audio> Element on page 2-10.
<break> | Specifies a pause in the speech output | Implemented as defined in SSML 1.0.
<desc> | Provides a description of a non-speech audio source in <audio> | Implemented as defined in SSML 1.0.
<emphasis> | Specifies that the enclosed text should be spoken with emphasis | Implemented as defined in VoiceXML 2.0. Ignored by Speechify 2.1.6.
<lexicon> | Specifies a pronunciation lexicon for the prompt | Implemented as defined in SSML 1.0.
<mark> | Place a marker into the text or tag sequence so that it can be referenced | Not supported.
<meta> | Define a metadata item as a name/value pair | The following meta properties are supported: Expires (http-equiv), Pragma (http-equiv), Cache-Control (http-equiv). For more information, see <meta> Element on page 2-11.
<metadata> | Define metadata information using a metadata schema | This element is supported, but not used by the Vocalocity Voice Browser.
<p> | Identifies the enclosed text as a paragraph, containing zero or more sentences | Implemented as defined in SSML 1.0.
f<phoneme> Specifies a phonetic pronunciation for the contained text
Implemented as defined in SSML 1.0
ph attribute is a required attribute that specifies the phoneme/phone string
Speechify 2.1.6 only supports ph attribute with SPR format. See Speechify Users Guide for more information.
<prosody> Specifies prosodic information – control of the pitch, speaking rate, and volume of the speech output – for the enclosed text
Implemented as defined in SSML 1.0
Speechify 2.1.6 only supports volume and rate attributes.
<say-as> Specifies the type of text construct contained within the element and helps specify the level of detail for rendering the contained text
Implemented as defined in SSML 1.0
Most types are supported by Speechify. See the Speechify Users Guide for a complete listing.
<s> Identifies the enclosed text as a sentence Implemented as defined in SSML 1.0
<sub> Specifies replacement spoken text for the contained text
Implemented as defined in SSML 1.0
<voice> Specifies voice characteristics for the spoken text.
Implemented as defined in SSML 1.0
Ignored by Speechify 2.1.6
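To illustrate the SSML elements summarized above, the following prompt is a minimal sketch. The wording, the alias text, and the prosody values are invented for the example. It uses only <p>, <s>, <sub>, and the rate and volume attributes of <prosody>, because the table notes that Speechify 2.1.6 ignores <emphasis> and <voice> and supports only volume and rate on <prosody>.

```xml
<prompt>
  <p>
    <!-- sub: the alias text is spoken in place of the enclosed text -->
    <s>Welcome to the <sub alias="World Wide Web Consortium">W3C</sub> demo line.</s>
    <!-- prosody: only rate and volume are used here (illustrative values) -->
    <s><prosody rate="slow" volume="loud">Please listen carefully.</prosody></s>
  </p>
</prompt>
```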
SRGS Summary
The following table summarizes the SRGS elements supported in this release of the Vocalocity Voice Browser.
Important: Support is largely driven by the speech recognition engine. Consult the documentation provided by your ASR vendor for specific information about how SRGS elements are supported.
Element | Purpose | Implementation Notes
<item> | Defines expected spoken responses from the caller | Implemented as defined in SRGS 1.0
<rule> | Defines a rule for speech interpretation and processing | Implemented as defined in SRGS 1.0
<ruleref> | Specifies the type of rule processing to use | Implemented as defined in SRGS 1.0. Several rulenames are defined to have specific interpretation and processing by a speech recognizer. A grammar must not redefine these rulenames. SpeechWorks OSR: the special attribute is not supported by OSR 2.0.
<tag> | A legal rule expansion. A tag is an arbitrary string that may be included inline within any legal rule expansion. Any number of tags may be included inline within a rule expansion. | Implemented as defined in SRGS 1.0. Used to pass semantic interpretation information to speech recognition systems. Content of the <tag> is based on the SISR specification.
<token> | The part of a grammar that defines words or other entities that may be spoken. | The attributes lexicon and xml:lang are not supported.
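As a sketch of how the elements in this table fit together, the following SRGS 1.0 XML grammar accepts a yes/no answer. The rule names and phrases are hypothetical, and <one-of> is a standard SRGS element that is not listed in the table above, so confirm with your ASR vendor that it is supported. The <ruleref> uses a local rule reference rather than the special attribute, which the table notes is not supported by OSR 2.0.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="answer">
  <!-- Root rule: either a "yes" variant (referenced rule) or "no" -->
  <rule id="answer" scope="public">
    <one-of>
      <item><ruleref uri="#yes"/></item>
      <item>no</item>
    </one-of>
  </rule>
  <!-- Local rule listing the accepted ways of saying yes -->
  <rule id="yes">
    <one-of>
      <item>yes</item>
      <item>yeah</item>
    </one-of>
  </rule>
</grammar>
```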
Detailed Implementation Notes
This section includes additional detail for the items contained in the VoiceXML Summary.
<audio> Element
The VoiceXML interpreter processes audio content so that only text is sent to the TTS engine; pre-recorded audio is played by the telephony hardware. The VoiceXML interpreter:
Extracts the audio content from an SSML container
Sends content before and after <audio> content to the TTS processor
Processes audio content (for example, a referenced WAV file) using the associated telephony extension point for the call
In the following example, the VoiceXML interpreter extracts the <audio expr="greeting"/> element so that it can be played by the telephony hardware. The remaining content is processed through the TTS engine.
<prompt>
  Your recorded greeting is <audio expr="greeting"/>
  To rerecord, press 1.
  To use this greeting, press pound. To return to the main menu press star M.
  To exit press star X.
</prompt>
<data> Element
By default, the VoiceXML interpreter allows a voice application to access returned XML content when an access-control processing instruction is not specified.
A Java system property, vocalos.vxml.data.access_control.allow, can be set to “false” to override this default behavior. This system property is not included in the delivered configuration (the VoiceXML interpreter behaves as if it were set to “true”).
To add the parameter, edit the vocalos.conf (Windows) file and add the -D system parameter as shown in the following example.
[JVM]...
wrapper.java.additional.6="-Dvocalos.vxml.data.access_control.allow=false"
<log> Element
To log information using the Apache Software Foundation log4j facility, insert the optional label attribute using the following syntax:
<log label="log4j:priority:categoryname">Log entry text</log>
Note: Check the log4j.xml file to make sure the logging level you specify with the label attribute is defined for the category.
Valid logging levels, set with “priority” in the above example, are:
DEBUG (default)
INFO
WARN
ERROR
FATAL
For example, the following element logs to the log4j category “sample” using the debug level.
<log label="log4j:debug:sample">DNIS is <value expr="session.connection.local.uri"/></log>
<meta> Element
The following meta properties are supported.
Property | Description
Expires (http-equiv) | Overrides (or provides for) the HTTP Expires header. Set the value to 0 to make the current document expire immediately; it will not be cached. All times must be entered in GMT format (EEE, dd MMM yyyy HH:mm:ss z), for example: Fri, 07 Sep 2001 17:54:32 EST
Pragma (http-equiv) | Controls caching of the current document. Set this property to no-cache to make the current document expire immediately; it will not be cached.
Cache-Control (http-equiv) | Controls caching of the current document. Set this property to no-cache to make the current document expire immediately; it will not be cached.
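The supported meta properties could be used in a document as sketched below. The form content is a placeholder; the http-equiv and content attributes are the standard VoiceXML 2.0 <meta> attributes, and the values shown are the ones described in the table.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Expire this document immediately so that it is not cached -->
  <meta http-equiv="Expires" content="0"/>
  <meta http-equiv="Cache-Control" content="no-cache"/>
  <form id="welcome">
    <block>
      <prompt>Welcome.</prompt>
    </block>
  </form>
</vxml>
```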
3
Standard Types and Defaults
This chapter introduces the standard event types, session variables and Java system properties available to customize the Vocalocity Voice Browser VoiceXML implementation.
This chapter contains the following topics:
Introduction
Vocalocity Session Variables
Vocalocity Property Defaults
Custom Browser Properties
Audio and Initial Page Fetching
MIME Type Mapping
SAX Parsers
ECMAScript
Implementation Notes
Introduction
You can customize the behavior of the Vocalocity Voice Browser by changing Vocalocity configuration properties (in vocalos-node.xml) or Java system properties in vocalos.conf (Windows).
Setting Vocalocity Browser Properties
The vocalos-node.xml file contains all default browser properties. You should use Vocalocity Control Center to access these settings.
You do not have to restart the VocalOS server after changing this file. The file is refreshed every minute, so a change applies to inbound calls that arrive after the next refresh (up to one minute after the change).
Properties are defined in two sections: BrowserProperties and vxml.
Setting Properties
To customize the global defaults for the Vocalocity Voice Browser, edit the vocalos-node.xml file and search for the node starting with:
<node name="BrowserProperties">
The format of a property entry is:
<property name="property-name" value="value"/>
Setting Java System Properties
Java system properties are defined in the vocalos.conf (Windows) file. The format is:
wrapper.java.additional.#="-Dproperty=value"
Vocalocity Session Variables
The standard session variables defined in VoiceXML 2.0 and 2.1 are supported.
The Vocalocity Browser defines the following additional session variables to assist in application development.
Variable | Description
session.connection.callid | The unique call identifier
session.connection.sessionid | The unique session identifier
session.connection.local.port | The logical port identifier for the call
session.connection.local.channelid | The channel identifier for the call
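As a sketch of how these variables might be used, the following document writes the call identifier and logical port to the log at the start of a call. The log4j label follows the syntax described for the <log> element in this guide; the category name "sample" and the message text are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="trace">
    <block>
      <!-- Record which call and port this session belongs to -->
      <log label="log4j:info:sample">Call <value expr="session.connection.callid"/> on port <value expr="session.connection.local.port"/></log>
    </block>
  </form>
</vxml>
```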
Vocalocity Property Defaults
Properties are used to set values that affect platform behavior, such as the recognition process, timeouts, and caching policy. You should specify properties in the top-level VoiceXML document in your application. This section contains the default browser properties the VoiceXML interpreter will use, if you do not provide them in the application.
The VoiceXML specifications group these properties into the following categories:
Generic speech recognition
Generic DTMF recognition
Prompting and collection
Fetching
Miscellaneous
Generic Speech Recognition Properties
The following properties control speech recognition.
Property | Vocalocity Default | VoiceXML 2.x Default
completetimeout | 0s | platform-specific
confidencelevel | 0.75 | 0.5
incompletetimeout | 1000ms | platform-specific
maxspeechtimeout | 30s | platform-specific
sensitivity | 0.5 | 0.5
speedvsaccuracy | 0.5 | 0.5
Generic DTMF Recognition Properties
The following properties control DTMF recognition.
Property | Vocalocity Default | VoiceXML 2.x Default
interdigittimeout | 3000ms | platform-specific
termchar | # | #
termtimeout | 0s | 0s
Prompt and Collection Properties
The following properties control prompting and collection.
Property | Vocalocity Default | VoiceXML 2.x Default
bargein | true | true
bargeintype | speech | platform-specific
timeout | 5000ms | platform-specific
Fetching Properties
The following properties control fetching.
Property | Vocalocity Default | VoiceXML 2.x Default
audiofetchhint | prefetch | prefetch
audiomaxage | 300000s | platform-specific
audiomaxstale | 30s | platform-specific
documentfetchhint | safe | safe
documentmaxage | 0s | platform-specific
documentmaxstale | 0s | platform-specific
fetchaudio | N/A (the value parameter is empty) | N/A
fetchaudiodelay | 500ms | platform-specific
fetchaudiominimum | 30000ms | platform-specific
fetchhint | safe | platform-specific
fetchtimeout | 30s | platform-specific
grammarfetchhint | prefetch | prefetch
grammarmaxage | 300000s | platform-specific
grammarmaxstale | 30s | platform-specific
maxage | 60s | platform-specific (defaults to documentmaxage)
maxspeechtimeout | 30s | platform-specific
maxstale | 0s | platform-specific (defaults to documentmaxstale)
objectfetchhint | prefetch | prefetch
objectmaxage | 300000s | platform-specific
objectmaxstale | 30s | platform-specific
recordutterance | false (new in VoiceXML 2.1) | false
recordutterancetype | audio/basic (new in VoiceXML 2.1) | platform-specific
scriptfetchhint | prefetch | prefetch
scriptmaxage | 300000s | platform-specific
scriptmaxstale | 30s | platform-specific
Miscellaneous Properties
The following are miscellaneous properties.
Property | Vocalocity Default | VoiceXML 2.x Default
inputmodes | dtmf voice | dtmf voice (on platforms that support both modes)
maxnbest | 1 | 1
universals | none | none
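The defaults above apply only when the application does not set a property itself. The following top-level document is a minimal sketch of overriding a few of them; the chosen values and the form content are illustrative, not recommendations.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Application-level overrides of the browser defaults (example values) -->
  <property name="confidencelevel" value="0.5"/>
  <property name="timeout" value="7s"/>
  <property name="interdigittimeout" value="2s"/>
  <property name="bargein" value="false"/>
  <form id="main">
    <block>
      <prompt>Thank you for calling.</prompt>
    </block>
  </form>
</vxml>
```

Properties set at this level are inherited by the dialogs in the document and can be overridden again in a narrower scope, such as a single form or field.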
Custom Browser Properties
In addition to the standard VoiceXML properties, Vocalocity provides custom properties that can be used by application developers. These are included in the vocalos-node.xml file, under <node name="BrowserProperties">.
Entry | Default | Description
vocalos.event.maxcount | 10 | The maximum number of times a single event is called from within the same execution context. For more information, see Infinite Loop Detection.
vocalos.url.maxlength | 2000 bytes | The maximum length, in bytes, of a URL retrieved using an HTTP GET. When the URL is longer than the maximum set here, the HTTP GET is converted to an HTTP POST before it is submitted.
Specifying the ASR or TTS Engine to Use
If you have ASR or TTS engines from different vendors connected to one VocalOS Server, you should specify which ASR or TTS engine to use for each VoiceXML application.
Important: This is required only if you use multiple ASR or TTS vendors.
To specify the engine, add the asrengine and ttsengine properties to your VoiceXML pages:
<property name="asrengine" value="vendor-name"/>
<property name="ttsengine" value="vendor-name"/>
For example, if you have SpeechWorks OSR and LumenVox SRE installed in your environment, and you want SpeechWorks to handle certain pages, add the following property to those VoiceXML pages:
<property name="asrengine" value="speechworks"/>
ASR Engines
The following table lists the supported engines and the value to enter for each.
Engine | Value
SpeechWorks OSR | Any of these values: scansoft, speechworks, osr
LumenVox SRE | lumenvox
Nuance | nuance
Vocalocity’s DTMF-only | vocalocity
TTS Engines
The following table lists the supported engines and the value to enter for each.
Engine | Value
SpeechWorks Speechify 2.1 and 3.0 | speechify
ScanSoft RealSpeak 4.0 | scansoft
VoiceWare | Any of these values: neospeech, voicetext, voiceware
Defaults Used when No Engine Is Specified
If you do not specify the engine and:
You are using only one vendor’s TTS or ASR engine, the Vocalocity Voice Browser will default to that engine.
You installed TTS or ASR engines from different vendors, the Vocalocity Voice Browser runs through options in this order until it gets to one that is installed. For:
ASR: SpeechWorks, Nuance, LumenVox, Vocalocity DTMF
TTS: SpeechWorks Speechify, ScanSoft RealSpeak, VoiceWare
If the selected engine is not available, an error.noresource.asr or error.noresource.tts will be raised in the VoiceXML application.
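Putting the engine values and the error events together, a page might select its engines and handle a missing engine as sketched below. The engine choices, the log label category, and the decision to exit are assumptions for the example; the event names and property names are the ones given in this section.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Select the engines for this page (values from the engine tables) -->
  <property name="asrengine" value="lumenvox"/>
  <property name="ttsengine" value="speechify"/>
  <!-- Raised when the selected engine is not available -->
  <catch event="error.noresource.asr error.noresource.tts">
    <log label="log4j:error:sample">Selected speech engine is not available</log>
    <exit/>
  </catch>
  <form id="main">
    <block>
      <prompt>Welcome.</prompt>
    </block>
  </form>
</vxml>
```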
Audio and Initial Page Fetching
The default system audio is configured in the vocalos-node.xml file. You do not have to restart the VocalOS server after changing a property.
Search for:
<node name="vxml">
  <map>
    <entry key="initialFetchTimeout" value="60s"/>
You can configure the following entries.
Entry | Default | Description
error.default.audio.url | file://${jboss.server.data.dir}/error.ulaw | The error audio prompt to play if not provided by the application
help.default.audio.url | file://${jboss.server.data.dir}/help.ulaw | The help audio prompt to play if not provided by the application
initialFetchTimeout | 60s | The timeout for fetching the initial page
initialMaxAge | 0s | The maxage setting for the initial page
initialMaxStale | 0s | The maxstale setting for the initial page
maxspeechtimeout.default.audio.url | file://${jboss.server.data.dir}/maxspeechtimeout.ulaw | The maxspeech audio prompt to play if not provided by the application
nomatch.default.audio.url | file://${jboss.server.data.dir}/nomatch.ulaw | The nomatch audio prompt to play if not provided by the application
MIME Type Mapping
The following file extension to MIME type mappings are used if the Content-Type HTTP header is not provided.
MIME Type | Format | File Extension
application/srgs | | gram
application/srgs+xml | | grxml
application/ssml+xml | | ssml
application/voicexml+xml | | vxml
audio/basic | Headerless 8 kHz 8-bit mu-law | mulaw or ulaw
audio/l16 | Headerless 8 kHz 8-bit L16 |
audio/vox | Vox 8 kHz 4-bit OKI_ADPCM | vox
audio/wav | WAV 8 kHz 8-bit mono mu-law | wav
audio/x-alaw-basic | Headerless 8 kHz 8-bit A-law | alaw
audio/x-fft | WAV 6 kHz 8-bit FFT |
audio/x-g726 | Vox 8 kHz 4-bit G726 |
audio/x-g729A | Vox 6 kHz 8-bit G729A |
audio/x-gsm610 | WAV 8 kHz GSM610 |
audio/x-vox | Vox 8 kHz 4-bit OKI_ADPCM | vox
audio/x-vox-11khz | Vox 11 kHz 8-bit OKI_ADPCM | vox11
audio/x-vox-6khz | Vox 6 kHz 8-bit OKI_ADPCM | vox6
audio/x-wav | WAV 8 kHz 8-bit mono mu-law | xwav
audio/x-wav-11khz | WAV 11 kHz 8-bit mono mu-law | wav11
audio/x-wav-6khz | WAV 6 kHz 8-bit mono mu-law | wav6
text/xml | | xml
Overriding a MIME Type
You can re-define any audio MIME type without having to define a new MIME type. For each MIME type, you can set rate (sampling rate), bitrate, and codec. Valid values are:
rate = 6000, 8000, 11000, 16000
bitrate = any number
codec = alaw, mulaw, pcm
For example, the MIME type audio/x-wav is defined as 8 kHz 8-bit mu-law. To change this to 16 kHz 8-bit pcm, provide the following MIME type in your application:
audio/x-wav;codec=pcm;rate=16000
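This guide does not state where in the application the overridden MIME type is supplied. The sketch below assumes one plausible place, the type attribute of the standard VoiceXML <record> element; the form, variable name, and prompts are hypothetical, and you should verify the behavior on your platform.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="greeting">
    <!-- Assumption: the overridden MIME type is accepted in the type attribute -->
    <record name="msg" type="audio/x-wav;codec=pcm;rate=16000"
            beep="true" maxtime="30s">
      <prompt>Record your greeting after the beep.</prompt>
      <filled>
        <prompt>Your recorded greeting is <audio expr="msg"/></prompt>
      </filled>
    </record>
  </form>
</vxml>
```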
SAX Parsers
SAX (Simple API for XML) is a common front end for XML parsers (much as JDBC is for database access). By default, the VocalOS XML Parser uses the Piccolo implementation of the SAX parser. You can also select the Xerces or Crimson SAX parsers.
To specify a different SAX parser, set the Java system property javax.xml.parsers.SAXParserFactory to one of the following values.
Java system properties are defined in the vocalos.conf (Windows) file.
Parser | Property
Crimson | org.apache.crimson.jaxp.SAXParserFactoryImpl
Piccolo (default) | com.bluecast.xml.JAXPSAXParserFactory
Xerces | org.apache.xerces.jaxp.SAXParserFactoryImpl
ECMAScript
ECMAScript is the required scripting language for VoiceXML (using the <script> element). This section contains additional information about using ECMAScript to access logging and web service facilities.
Note: Java system properties are defined in the vocalos.conf (Windows) file.
Accessing the log4j Logger
You can access the log4j logger from within the ECMAScript of your application (executed in the browser). To do this, set the Java System property vocalos.data.webservice to true (the default is false).
The following is an example of how you would access the log4j logger from within an ECMAScript script.
// myVar is a placeholder for a variable defined by your application
if (Log.isDebugEnabled())
{
    Log.debug("Variable value is: " + myVar);
}
Accessing Web Services
You can access Web Services from within the ECMAScript of your application (executed in the browser). To do this, set the Java System property vocalos.data.webservice to “true” (the default is false).
When activated, the browser inserts a helper variable named “WebService”, which exposes one method named “create” with the following signature:
org.apache.axis.client.Call create (String url) throws MalformedURLException;
You can create an AXIS call object from within your script by calling:
var call = WebService.create("http://home/index");
call.invoke("myservice", null);
For more information about using AXIS, go to http://ws.apache.org/axis/.
Implementation Notes
This section contains additional details about how the Vocalocity Voice Browser handles the following areas:
Bargein
Default encoding
DTMF-only applications
Infinite loop detection
Strict content type detection
Time unit designations
Using a file-based URL
For vendor-specific notes, see Chapter 4, Telephony and Speech Notes.
Bargein
Speech-based bargein is supported. Hotword or recognition-based bargein is not currently supported.
Default Encoding
The default encoding for all documents, if not specified, is UTF-8.
DTMF-Only Applications
By default, the Vocalocity Voice Browser assumes the input modes of “dtmf” and “voice.” If your application is DTMF-only, you must specify the inputmodes browser property in your application:
<property name="inputmodes" value="dtmf"/>
If you do not specify the inputmodes property, the Vocalocity Voice Browser will attempt to perform speech recognition and will not be able to run your application.
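For context, a complete DTMF-only document might look like the following sketch. The menu wording, the field name, and the inline grammar are invented for the example; only the inputmodes property comes from this guide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Declare the application DTMF-only so no speech recognition is attempted -->
  <property name="inputmodes" value="dtmf"/>
  <form id="menu">
    <field name="choice">
      <prompt>For sales, press 1. For support, press 2.</prompt>
      <!-- Inline DTMF grammar accepting a single key: 1 or 2 -->
      <grammar mode="dtmf" version="1.0" root="key">
        <rule id="key">
          <one-of>
            <item>1</item>
            <item>2</item>
          </one-of>
        </rule>
      </grammar>
      <filled>
        <prompt>You pressed <value expr="choice"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```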
Infinite Loop Detection
Infinite loops can be caused when an application or the platform throws an event which is caught and re-thrown recursively.
The Vocalocity Voice Browser will attempt to check event recursion depth and prevent an infinite loop in the application.
In the vocalos-node.xml file, set the VoiceXML property vocalos.event.maxcount to control the maximum number of times a single event is called from within the same execution context. The default count is 10. Upon reaching the max count, the application will be forced to exit. You can control the max count within your application by setting this property.
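Setting the limit from within an application could look like the sketch below. The value 5 and the form content are illustrative; the property name is the one described above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Allow one event at most 5 times in the same execution context (example value) -->
  <property name="vocalos.event.maxcount" value="5"/>
  <form id="main">
    <block>
      <prompt>Welcome.</prompt>
    </block>
  </form>
</vxml>
```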
Strict Content Type Processing
You can turn on strict content type processing by setting the Java system property vocalos.vxml.strict.contenttype.mode to “true.” (The default value is false.)
If you enable strict content type processing, the Vocalocity Voice Browser requires the Content-Type HTTP header in all documents returned from the web server to be application/voicexml+xml. Documents without this content-type will be rejected as invalid VoiceXML documents.
Time Unit Designations
Time values for default browser properties are specified with a time unit designator, for example in milliseconds (ms). (For a list of these properties, see Vocalocity Property Defaults.)
In Vocalocity Version 2.3, the VoiceXML interpreter did not require a designator in the value. This was corrected in Version 2.4.1. You should specify the correct designator in your application to ensure conformance with the VoiceXML 2.0 standard.
Using a File-Based URL in Applications
To specify a file-based URL instead of HTTP, use the URL format:
file:///<path>
The following example shows how you would set this in Windows:
file:///c:/vocalos/server/server/node/data/fetchaudio.ulaw
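A file-based URL can appear wherever the application would otherwise give an HTTP URL. The sketch below reuses the Windows path from the example above in the fetchaudio property and in an <audio> element; using the same file for both, and the fallback text, are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Audio played while documents are being fetched, loaded from local disk -->
  <property name="fetchaudio"
            value="file:///c:/vocalos/server/server/node/data/fetchaudio.ulaw"/>
  <form id="main">
    <block>
      <prompt>
        <!-- The enclosed text is spoken if the file cannot be played -->
        <audio src="file:///c:/vocalos/server/server/node/data/fetchaudio.ulaw">Please hold.</audio>
      </prompt>
    </block>
  </form>
</vxml>
```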
4
SpeechWorks OSR Notes
This chapter contains implementation suggestions and usage notes for using SpeechWorks OSR with the Vocalocity Voice Browser.
This chapter contains the following topics:
Introduction
Application Name Used for OSR Logging
SpeechWorks Recognizer Properties
Endpointer Tuning
Licensing Modes
Introduction
The procedures in this chapter require changing Vocalocity configuration properties in the vocalos-node.xml file. This file is located in:
/opt/VocalOS/Server/server/nodeserver/conf/
You do not have to restart the VocalOS server after changing this file. The file is refreshed every minute, so a change applies to inbound calls that arrive after the next refresh (up to one minute after the change).
You will be setting parameters in the RecoManager node. Edit the vocalos-node.xml file and search for the node starting with:
<node name="RecoManager">
<map/>
<node name="vocalos:type=RecoManager,vendor=Vocalocity">
<map>
<entry key="extensionPoint" value="net.vocalocity.dtmfonly.DTMFOnlyExtensionPoint"/>
</map>
</node>
<node name="vocalos:type=RecoManager,vendor=SpeechWorks">
<map>
<entry key="extensionPoint"
value="net.vocalocity.speechworks.osr.extension.OSRExtensionPoint"/>
Application Name Used for OSR Logging
You can add the applicationName entry key to set the name of the application that is used when logging to the SpeechWorks OSR log.
Note: By default, the application name is vocalos.
Edit the vocalos-node.xml file and scroll to the OSRSession node. Add an entry key for applicationName and specify the name you want to use. In the following example, the name is “excel.”
<node name="OSRSession">
  <map>
    <entry key="autoStartServer" value="true"/>
    <entry key="autoSelfTestCheck" value="false"/>
    <entry key="useEndpointerTuning" value="true"/>
    <entry key="applicationName" value="excel"/>
  </map>
SpeechWorks Recognizer Properties
You can set SpeechWorks recognizer properties by defining them as VoiceXML property attributes. VoiceXML properties are contained in the BrowserProperties node.
The following example shows how to set the property for the audio environment.
<property name="swirec_audio_environment" value="cellular"/>
For more information, see the SpeechWorks OSR Reference Manual.
Endpointer Tuning
You can control the endpointer tuning parameters that are configured for certain telephony environments when using an external first-pass detector, that is, the telephony hardware’s energy detector. You can allow the telephony hardware to detect energy levels that indicate, for example, silence or bargein; the telephony hardware will pass the appropriate audio to the speech recognizer.
By default, the Vocalocity Voice Browser uses an external firstpass detector. Edit the vocalos-node.xml file and set the useEndpointerTuning parameter to:
False – to turn off external firstpass detection and use OSR for energy detection. This sets the OSR Endpointer runtime property swiep_external_firstpass to “0”.
True (the default) – to turn on external firstpass detection and use the CSP tuning parameters for energy detection. This sets the OSR Endpointer runtime property swiep_external_firstpass to “1”.
The following example shows how to turn off CSP tuning (set to “false”).
<node name="vocalos:type=RecoManager,vendor=SpeechWorks">
  <map>
    <entry key="extensionPoint" value="net.vocalocity.speechworks.osr.extension.OSRExtensionPoint"/>
  </map>
  <node name="OSRSession">
    <map>
      <entry key="autoStartServer" value="true"/>
      <entry key="autoSelfTestCheck" value="false"/>
      <entry key="useEndpointerTuning" value="false"/>
    </map>
  </node>
Licensing Modes
By default, the SpeechWorks OSR extension point uses implicit licensing mode. In implicit licensing mode, a license is checked out at the beginning of a call and remains checked out for the duration of the call.
In explicit licensing mode, a license is checked out on an as-needed basis – each time the recognizer is listening for speech – and checked back in when the Vocalocity VoiceXML interpreter is transitioning.
To turn on explicit licensing mode, edit the SpeechWorks Baseline.xml file (located in the config directory of the OSR install directory) and set the swirec_licensing_mode parameter.
<!-- specifies the mode for controlling license allocation -->
<param name="swirec_licensing_mode">
  <value>explicit</value>
</param>