
CIC Text to Speech Engines Technical Reference · 2015. 10. 9. · application/ssml+xml Specifies the content type of the text that ITTS will synthesize i3tts.voice.name The ITTS


PureConnect®

2018 R5

Generated: 12-November-2018

Content last updated: 08-January-2018

See Change Log for summary of changes.

CIC Text to Speech Engines

Technical Reference

Abstract

This document describes the Text-to-Speech engines supported in CIC and provides installation and configuration information.

For the latest version of this document, see the PureConnect Documentation Library at: http://help.genesys.com/cic.

For copyright and trademark information, see https://help.genesys.com/cic/desktop/copyright_and_trademark_information.htm.


Table of Contents

Introduction
Supported TTS engines
Supported languages
TTS SAPI engines
  Microsoft SAPI engine
  Other SAPI engines
  SAPI architecture
  Configure the SAPI TTS voice on the CIC server
TTS MRCP engines
Interaction Text to Speech
  Benefits of Interaction Text to Speech
  Supported languages for Interaction Text to Speech
  Licensing for Interaction Text to Speech
  Usage with Interaction Designer
    Use Interaction Text to Speech with SAPI or MRCP TTS as default
  Partially supported SSML objects
  Supported say-as text normalization
Configure TTS
  Configure the TTS engine in Interaction Administrator
  Add voices and languages for SAPI
  Set the volume level of a voice for SAPI
Change log


Introduction

The PureConnect platform uses a Text-to-Speech (TTS) engine to read text to callers over the telephone. For example, a user can take advantage of this system to retrieve an email message over the phone. The TTS engine then employs a speech synthesizer to read the sender, subject, and body of the message.

Genesys offers Interaction Text to Speech as a native TTS engine for Customer Interaction Center. Incorporated into Interaction Media Server, Interaction Text to Speech does not require a separate installation or separate hardware.

Apart from Interaction Text to Speech, CIC supports various TTS engines that comply with Speech Application Programming Interface (SAPI) and Media Resource Control Protocol (MRCP). The quality of the speech produced by these TTS engines varies from vendor to vendor.

You can use TTS through CIC handlers that you can create or modify through Interaction Designer, VoiceXML, and through Interaction Attendant nodes.


Supported TTS engines

You can find a complete list of the third-party TTS engines that CIC supports on the following Genesys website:

http://testlab.genesys.com

Additionally, you can purchase and use Interaction Text-to-Speech, which is integrated in Interaction Media Server, for basic TTS functionality.


Supported languages

For Interaction Text to Speech, see Supported languages for Interaction Text to Speech.

To view the list of languages supported by a specific third-party TTS engine, see the website of the vendor of the third-party TTS engine.


TTS SAPI engines

In this section:

Microsoft SAPI engine

Other SAPI engines

SAPI architecture

Configure the SAPI TTS voice on the CIC server

Microsoft SAPI engine

The Microsoft SAPI-compliant TTS engine is available with the Windows Server 2008 R2 and 2012 R2 operating systems, along with one or more TTS voices.

Customer Interaction Center supports the SAPI version 5 standard. Microsoft also offers software for your SAPI solution. For information on Microsoft Speech Server SDK, Microsoft Speech Platform Runtime, and adding voices for SAPI, visit the Speech Platforms page at the following website:

http://msdn.microsoft.com/en-us/library/hh361571(v=office.14).aspx

Note:

For version compatibility information on Microsoft SAPI software, see http://testlab.genesys.com.

Other SAPI engines

Any third-party TTS engine that supports these same standards should integrate with Customer Interaction Center. Nuance Vocalizer is the only SAPI TTS engine that you can purchase from Genesys. For TTS installation instructions for third-party products, see the vendor product installation documentation.

Note:

A third-party TTS license key is required.


SAPI architecture

The following diagram depicts the protocol flow between servers when using SAPI for TTS plays. All audio is streamed from the TTS server to the CIC server using the vendor's proprietary method. The CIC server streams the audio using Real-time Transport Protocol (RTP) to Interaction Media Server, which then streams that audio using RTP to the IP device. For more information, see Interaction Administrator help.

Configure the SAPI TTS voice on the CIC server

On Windows, SAPI uses a selected voice. As such, the CIC server uses this voice by default for all TTS operations, unless you configure other voices in Interaction Administrator.

1. Log on to the Windows Server hosting Customer Interaction Center with the user account that the Interaction Center service runs under.

   Note: If you log on to the Windows Server with a different user account than the one under which the Interaction Center service runs, the selected TTS voice applies only to that account and will not affect the voice used for TTS operations by the CIC server.

2. Run the speech applet, sapi.cpl, which is located in the following folder: C:\Windows\SysWOW64\Speech\SpeechUX\

   Important! You must use the sapi.cpl program file in the specified directory path as it is the 32-bit version. Using the 64-bit version of sapi.cpl from other directory paths or from the Speech applet in the Control Panel does not configure SAPI TTS operations for the CIC server.

3. The Speech Properties dialog box appears.


4. In the Voice selection list, click the voice you want to use as the default voice.

   Note: Some Windows Server versions offer only one voice, by default.

   Tip: If you want to preview the selected voice, click Preview Voice. If you want to adjust the rate of speech for the voice playback, move the Voice speed slider to the right to increase the speed or to the left to decrease the speed.

5. Click OK.

These changes take effect immediately on the CIC server for any SAPI TTS operations.
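Because the procedure depends on opening the 32-bit applet rather than the 64-bit one, a small path check can help when scripting or documenting the step. The following is a minimal sketch, not part of the product: the function names are hypothetical, and the default Windows directory of C:\Windows is an assumption taken from the path shown in step 2.

```python
import ntpath


def sapi_cpl_32bit_path(windows_dir=r"C:\Windows"):
    """Build the path of the 32-bit Speech applet named in step 2."""
    return ntpath.join(windows_dir, "SysWOW64", "Speech", "SpeechUX", "sapi.cpl")


def is_32bit_applet(path, windows_dir=r"C:\Windows"):
    """Return True if path points at the 32-bit sapi.cpl (case-insensitive)."""
    wanted = ntpath.normcase(sapi_cpl_32bit_path(windows_dir))
    return ntpath.normcase(ntpath.normpath(path)) == wanted


print(sapi_cpl_32bit_path())
```

The sketch uses ntpath so that the comparison follows Windows path rules even when the script is run on another operating system.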


TTS MRCP engines

Media Resource Control Protocol (MRCP) enables speech servers to provide various speech services to clients. PureConnect supports the MRCP v2.0 protocol for connecting to speech servers that provide text-to-speech (speech synthesis) services. Third-party TTS engines that support MRCP v2.0 can integrate with PureConnect, but Genesys resells only the Nuance TTS product line.

For more information about these engines, see the MRCP Technical Reference in the CIC Documentation Library at the following website: https://help.genesys.com/cic/mergedprojects/wh_tr/mergedProjects/wh_tr_mrcp/desktop/mrcp_technical_reference.htm.

Also, see the vendor product documentation.

PureConnect is compliant with the Media Resource Control Protocol Version 2 (MRCPv2), RFC 6787:

http://tools.ietf.org/html/rfc6787


Interaction Text to Speech

Interaction Text to Speech is a native TTS engine that is incorporated within Interaction Media Server. Interaction Text to Speech is continuously being developed to comply with the following standards:

Speech Synthesis Markup Language (SSML) v1.1
Pronunciation Lexicon Specification (PLS) Version 1.0

Benefits of Interaction Text to Speech

Simpler deployment than a third-party TTS solution through SAPI or MRCP

No additional hardware or software requirements for the CIC server or Interaction Media Server

Simplification of selection rules - Interaction Text to Speech uses only Media Server selection rules. TTS solutions based on MRCP require both Media Server selection rules and MRCP selection rules in Interaction Administrator. For more information about Media Server selection rules, see Interaction Media Server Technical Reference and Interaction Administrator Help.

Less audio and signaling traffic on the network than a third-party TTS solution


Supported languages for Interaction Text to Speech

Interaction Text-to-Speech supports the following languages:

Dutch, Netherlands (nl-NL)
English, United States (en-US)
English, Australia (en-AU)
English, Great Britain (en-GB)
French, Canada (fr-CA)
French, France (fr-FR)
German, Germany (de-DE)
Italian, Italy (it-IT)
Japanese, Japan (ja-JP)
Mandarin Chinese, China (zh-CN) (beta)

  Important! As a beta language model, ITTS provides few text normalizations for Mandarin Chinese at this time. For zh-CN, only Latin characters are supported in say-as "alphanumeric", not Chinese characters.

Portuguese, Brazil (pt-BR)
Spanish, United States (es-US)

Note: To ensure that Interaction Media Server does not exceed memory resources, customers should test the performance and memory usage of Media Servers when using more than 4 TTS languages. Older Media Servers may not be able to handle more than 4 languages. Overuse of Interaction Media Server resources can result in defects or failures in audio processing.

Licensing for Interaction Text to Speech

Interaction Text to Speech requires a license for the feature, a license for the number of sessions to allow, and a license for each language that you want to support. The following table provides the license names for Interaction Text to Speech:

License                                      Name
Interaction Text to Speech (ITTS)            I3_FEATURE_MEDIA_SERVER_TTS
ITTS Sessions (total across all languages)   I3_SESSION_MEDIA_SERVER_TTS
ITTS Language Feature - Dutch (NL)           I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_NL
ITTS Language Feature - English (US)         I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_EN
ITTS Language Feature - English (AU)         I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_EN_AU
ITTS Language Feature - English (GB)         I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_EN_GB
ITTS Language Feature - French (CA)          I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_FR_CA
ITTS Language Feature - French (FR)          I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_FR
ITTS Language Feature - German (DE)          I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_DE
ITTS Language Feature - Italian (IT)         I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_IT
ITTS Language Feature - Japanese (JP)        I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_JA
ITTS Language Feature - Mandarin (CN)        I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_ZH_CN
ITTS Language Feature - Portuguese (BR)      I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_PT_BR
ITTS Language Feature - Spanish (US)         I3_FEATURE_MEDIA_SERVER_TTS_LANGUAGE_ES
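When planning a deployment, it can be convenient to list the license names needed for a chosen set of languages. The sketch below is illustrative only: the function name is hypothetical, and pairing each table row with a language code (for example, "English (US)" with en-US) is an assumption based on the supported-languages list.

```python
FEATURE_LICENSE = "I3_FEATURE_MEDIA_SERVER_TTS"
SESSION_LICENSE = "I3_SESSION_MEDIA_SERVER_TTS"

# Language-specific suffixes exactly as they appear in the license table.
LANGUAGE_SUFFIX = {
    "nl-NL": "NL", "en-US": "EN", "en-AU": "EN_AU", "en-GB": "EN_GB",
    "fr-CA": "FR_CA", "fr-FR": "FR", "de-DE": "DE", "it-IT": "IT",
    "ja-JP": "JA", "zh-CN": "ZH_CN", "pt-BR": "PT_BR", "es-US": "ES",
}


def required_licenses(languages):
    """Return the feature, session, and per-language license names."""
    names = [FEATURE_LICENSE, SESSION_LICENSE]
    for language in languages:
        suffix = LANGUAGE_SUFFIX[language]  # KeyError for unlisted languages
        names.append(f"{FEATURE_LICENSE}_LANGUAGE_{suffix}")
    return names


for name in required_licenses(["en-US", "fr-CA"]):
    print(name)
```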


Usage with Interaction Designer

You can use the following Interaction Designer tools with Interaction Text to Speech when Interaction Media Server is set as the default Text-to-Speech provider:

Play tools: Play String, Play String Extended, Play Text File, Play Text File Extended
Record tools: Record String, Record String Extended, Record Text File, Record Text File Extended
Play prompt tools: Play Prompt Phrase

If you use SAPI or MRCP, you can still configure these tools to use Interaction Text-to-Speech by specifying an optional parameter, "I3TTS", in the properties for that tool step.

In this manner, you can also specify additional parameters to control various characteristics of the synthesized speech of Interaction Text to Speech.

Note: To use multiple parameters, use a space between each parameter in the Optional Parameters box. Separate parameters from values with a colon (:). You must use double quotation marks around the entire string of characters.

Example: "I3TTS i3tts.content.type:text/plain"
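The formatting rules in the note (I3TTS first, a space between parameters, a colon between a parameter and its value, and double quotation marks around the whole string) can be sketched as a small helper. This is an illustration under stated assumptions, not a product API: the function name is hypothetical, and values that contain spaces are not handled because this document does not describe how they should be written.

```python
def build_optional_parameters(params=None, use_itts=True):
    """Assemble the text for the Optional Parameters box of a tool step."""
    # The I3TTS marker has to come before any other parameter.
    parts = ["I3TTS"] if use_itts else []
    for name, value in (params or {}).items():
        parts.append(f"{name}:{value}")  # colon separates parameter and value
    # Double quotation marks surround the entire string.
    return '"' + " ".join(parts) + '"'


print(build_optional_parameters({
    "i3tts.content.type": "application/ssml+xml",
    "i3tts.voice.rate": "slow",
}))
```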

Use Interaction Text to Speech with SAPI or MRCP TTS as default


Parameter: I3TTS
Values: N/A
Description: Specifies that Interaction Text to Speech is used for this tool step if it is not selected as the default TTS provider.
Note: You must specify this parameter before any other parameters.

Parameter: i3tts.content.language
Values: Language values as specified in RFC 3066
Description: Specifies that ITTS uses the specified language to synthesize the text

Parameter: i3tts.content.type
Values: text/plain (default), application/ssml+xml
Description: Specifies the content type of the text that ITTS will synthesize

Parameter: i3tts.voice.name
Values: The ITTS voice to use for the specified language.
Description: Specifies the i3tts voice that ITTS will use to synthesize the text. At this time, ITTS uses the following voices for the supported languages:
  Dutch (NL) - Marina
  English (US) - Jill
  English (AU) - Kandyce
  English (GB) - Ellene
  French (CA) - Hilorie
  French (FR) - Manon
  German (DE) - Arabella
  Italian (IT) - Luisa
  Japanese (JP) - Miki
  Mandarin (CN) - Mei-Ling
  Portuguese (BR) - Viviane
  Spanish (US) - Isabel

Parameter: i3tts.voice.rate
Values: A non-negative percentage, default, x-fast, fast, medium, slow, x-slow
Description: Specifies the SSML prosody rate of the selected ITTS voice

Parameter: i3tts.voice.volume
Values: A positive or negative value in decibels (dB), default, x-loud, loud, medium, soft, x-soft, silent
Description: Specifies the SSML prosody volume of the selected ITTS voice

Parameter: i3tts.voice.pitch
Values: A value in hertz (Hz), default, x-high, high, medium, low, x-low
Description: Specifies the SSML prosody pitch of the selected ITTS voice
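As an illustration of the value sets above, the following sketch checks a value for the three prosody parameters before it is placed in the Optional Parameters box. The function name is hypothetical, and the exact numeric spellings (such as 120%, -6dB, or 200Hz) are an assumption based on SSML conventions; this document states only the units.

```python
import re

# Named values copied from the parameter table.
NAMED_VALUES = {
    "i3tts.voice.rate": {"default", "x-fast", "fast", "medium", "slow", "x-slow"},
    "i3tts.voice.volume": {"default", "x-loud", "loud", "medium", "soft", "x-soft", "silent"},
    "i3tts.voice.pitch": {"default", "x-high", "high", "medium", "low", "x-low"},
}

# Assumed numeric spellings: non-negative percentage, signed dB, value in Hz.
NUMERIC_PATTERNS = {
    "i3tts.voice.rate": re.compile(r"^\d+(\.\d+)?%$"),
    "i3tts.voice.volume": re.compile(r"^[+-]\d+(\.\d+)?dB$"),
    "i3tts.voice.pitch": re.compile(r"^\d+(\.\d+)?Hz$"),
}


def is_valid_prosody_value(parameter, value):
    """Return True if value is a named or numeric value for the parameter."""
    if value in NAMED_VALUES[parameter]:
        return True
    return bool(NUMERIC_PATTERNS[parameter].match(value))


print(is_valid_prosody_value("i3tts.voice.rate", "120%"))
print(is_valid_prosody_value("i3tts.voice.pitch", "x-loud"))
```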


Partially supported SSML objects

At this time, Interaction Text to Speech partially supports the following SSML objects:

Element: token
Attribute: N/A
Notes: ITTS does not support the token element.

Element: emphasis
Attribute: level
Notes: ITTS does not support a value of none for the level attribute.

Element: phoneme
Attribute: alphabet
Notes: ITTS supports only the ipa alphabet and PureConnect's own Arpabet-style phoneme alphabet.

Element: prosody
Attribute: duration, range, contour, pitch semitones (st)
Notes: ITTS supports the following attributes for the prosody element: pitch, rate, volume

Element: sub
Attribute: alias
Notes: ITTS does not support the sub element.

Element: lookup
Attribute: N/A
Notes: ITTS does not support the lookup element.
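The restrictions in this table can be turned into a quick pre-flight check for SSML content. The sketch below is illustrative only and is not how ITTS itself validates input: the function names are hypothetical, and the phoneme alphabet is not checked because this document does not give the name of the PureConnect alphabet.

```python
import xml.etree.ElementTree as ET

UNSUPPORTED_ELEMENTS = {"token", "sub", "lookup"}
UNSUPPORTED_PROSODY_ATTRS = {"duration", "range", "contour"}


def local_name(tag):
    """Strip an XML namespace such as {http://www.w3.org/2001/10/synthesis}."""
    return tag.rsplit("}", 1)[-1]


def find_unsupported(ssml_text):
    """List the constructs in ssml_text that the table marks as unsupported."""
    problems = []
    for elem in ET.fromstring(ssml_text).iter():
        name = local_name(elem.tag)
        if name in UNSUPPORTED_ELEMENTS:
            problems.append(f"<{name}> element")
        elif name == "emphasis" and elem.get("level") == "none":
            problems.append('emphasis level="none"')
        elif name == "prosody":
            for attr in sorted(UNSUPPORTED_PROSODY_ATTRS & set(elem.attrib)):
                problems.append(f"prosody {attr} attribute")
            # Semitone pitch values end in "st", for example "+2st".
            if elem.get("pitch", "").strip().endswith("st"):
                problems.append("prosody pitch in semitones")
    return problems


sample = '<speak><prosody pitch="+2st" rate="fast">Hello</prosody></speak>'
print(find_unsupported(sample))
```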

Supported say-as text normalization

In verbal conversations, certain categories of speech, such as currency and time, use a specific method to convey information. For example, when people read $12,345, it is usually spoken as "twelve-thousand-three-hundred-forty-five dollars" as opposed to "dollar-symbol one-two (pause) three-four-five", which is how a computer might interpret it.

In TTS, say-as text normalization directs the speech synthesizer to speak text in a specific manner so that it is understood by the listener. Without the say-as functionality, a time of 10:30 AM could be spoken by the synthesizer as "one-zero-three-zero am". With say-as, the synthesizer can say the time as "ten-thirty a-m", which is more easily understood by the listener.
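To make the idea concrete, the following sketch builds a say-as element for the 10:30 AM example using Python's standard XML library. The helper name is hypothetical; the element and attribute names (say-as, interpret-as, format) come from SSML, and the markup would normally sit inside a full SSML document sent with the application/ssml+xml content type.

```python
import xml.etree.ElementTree as ET


def say_as(text, interpret_as, fmt=None):
    """Return a say-as element as a string, escaping the text as needed."""
    attrs = {"interpret-as": interpret_as}
    if fmt is not None:
        attrs["format"] = fmt  # optional, used by types that take a format
    elem = ET.Element("say-as", attrs)
    elem.text = text
    return ET.tostring(elem, encoding="unicode")


print(say_as("10:30 AM", "time"))
```

Building the element through an XML library rather than string concatenation keeps characters such as & and < in the spoken text correctly escaped.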

The following table lists the say-as normalization types that are available and their support within Interaction Text to Speech:

Text normalization type: address
Usage: Processes mailing addresses
Supported: Yes
Notes: Processes US addresses, including military addresses, Post Office boxes, and rural routes. Abbreviations are expanded based on context.

  Abbreviation examples:
    123 Main St.      "St." spoken as "street"
    Springfield, IN   "IN" spoken as "Indiana"
    PO Box            "PO" spoken as "post office"
    RR 2              "RR" spoken as "rural route"

  Tips:
    Do not include the name of the addressee inside the <say-as interpret-as="address"> element. Instead, use regular text processing for all names.
    Use official United States Postal Service abbreviations for states, territories, and thoroughfare types.
    If you encounter the incorrect audio of "street" when "saint" is required, use the full word "Saint" instead of the "St." abbreviation, as seen in "St. Paul, MN".
    The say-as element for an address may contain newline characters.

  Supported languages: en-US

Text normalization type: alphanumeric
Usage: Spells letters and numbers
Supported: Yes
Notes: Supported languages:


  de-DE, en-AU, en-GB, en-US, es-US, fr-CA, fr-FR, it-IT, ja-JP, nl-NL, pt-BR

Text normalization type: boolean
Usage: Understands yes and no
Supported: Yes
Notes: This type uses the VoiceXML 2.0 defined type for Boolean. ITTS supports usage of T, True, F, and False, and local variants for this normalization type.

  Supported languages: de-DE, en-AU, en-GB, en-US, es-US, fr-CA, fr-FR, it-IT, ja-JP, nl-NL, pt-BR

Text normalization type: currency
Usage: Processes currency amounts
Supported: Yes
Notes: In specifying currency values, you can use either the monetary symbol, such as $, or the associated abbreviation defined by ISO 4217, such as USD.

  ITTS processes values to the right of a decimal point for only four digits. ITTS ignores any additional digits.

  Supported languages:
    de-DE (€ or EUR)
    en-AU ($ or AUD)
    en-GB (£ or GBP)
    en-US ($ or USD)
    es-ES (€ or EUR)
    es-US ($ or USD)
    fr-CA ($ or CAD). You can use a comma or a period as the decimal mark for fr-CA.
    fr-FR (€ or EUR)
    it-IT (€ or EUR)
    ja-JP (¥ or JPY)
    nl-NL (€ or EUR)
    pt-BR (R$ or BRL)
    zh-CN (¥, CNY, or RMB)


Text normalization type: date
Usage: Processes chronological dates
Supported: Yes
Notes: ITTS supports the following date formats:
    mdy - specify as mm/dd/yyyy in the text
    dmy - specify as dd/mm/yyyy in the text
    ymd - specify as yyyy/mm/dd in the text
    md - specify as mm/dd in the text
    dm - specify as dd/mm in the text
    ym - specify as yyyy/mm in the text
    my - specify as mm/yyyy in the text
    y - specify as yyyy in the text
    m - specify as mm in the text
    d - specify as dd in the text

  Example (en-US):
    <say-as interpret-as="date" format="mdy">01/01/1984</say-as>

  Output (en-US):
    "January first, nineteen eighty-four"

  Tip: You can use the following delimiters when specifying dates:
    / (slash)
    - (hyphen)
    . (period)
  You can also enter single digits for months and days, and two-digit years.

  Supported languages: de-DE, en-AU, en-GB, en-US, es-ES, es-US, fr-CA, fr-FR, it-IT, ja-JP, nl-NL, pt-BR, zh-CN

Text normalization type: digits
Usage: Reads strings digit-by-digit
Supported: Yes
Notes: Supported languages: de-DE, en-AU, en-GB, en-US, es-US, es-ES, fr-CA, fr-FR, it-IT, ja-JP, nl-NL, pt-BR, zh-CN

Text normalization type: number
Usage: Reads strings as a value (not digit-by-digit)
Supported: Yes
Notes: Supported languages: de-DE, en-AU, en-GB, en-US, es-US, es-ES, fr-CA, fr-FR, it-IT, ja-JP, nl-NL, pt-BR, zh-CN


Text normalization type: ordinal
Usage: Processes ordinal numbers, such as "first", "second", and so on.
Supported: Yes
Notes: You can specify only the digits or the digits and an ordinal ending for the supported language.

  Note: If the language uses gendered forms for ordinals, and the gender is not specified in the text, ITTS will hypothesize the most-likely gender for the ordinal.

  Supported languages: de-DE, en-AU, en-GB, en-US, es-ES, es-US, it-IT, ja-JP, nl-NL, pt-BR

  fr-CA and fr-FR:
    1 results in "premier" or "première" with the ordinal.
    1re results in "première"
    1 femme results in "première femme"
    1 cœur results in "premier"

Text normalization type: spell
Usage: Processes characters within a string, such as a word
Supported: Yes
Notes: The synthesizer speaks each character individually. Any punctuation characters in the string are read and named, such as ampersand (&), space ( ), and pound sign (#).

  Supported languages: de-DE, en-AU, en-GB, en-US, es-US, es-ES, fr-CA, fr-FR, it-IT, ja-JP, nl-NL, pt-BR, zh-CN

Text normalization type: telephone
Usage: Processes telephone numbers
Supported: Yes
Notes: The synthesizer reads telephone numbers, including SIP, according to local conventions in supported languages.

  Supported languages: de-DE, en-AU, en-GB, en-US, es-US, es-ES, fr-CA, fr-FR, it-IT, ja-JP, nl-NL, pt-BR, zh-CN

Text normalization type: time
Usage: Processes time statements, such as "12:45 AM"
Supported: Yes
Notes: Supports hours, minutes, seconds, and 12 or 24-hour clock.

  This normalization type does not support any format options.

  This normalization type does not support durations, such as "60 minutes".

  For all languages, the separator for the hours and minutes of the time is a colon (:). For top-of-the-hour times, such as "2 o'clock", you can omit the colon and minutes. Example: 2 pm

  Supported languages:


  de-DE, en-AU, en-GB, en-US, es-ES, es-US
  fr-CA and fr-FR (these languages support usage of h as a separator between the hours and minutes, such as 1h30)
  it-IT, ja-JP, nl-NL, pt-BR, zh-CN
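The per-type language lists in the say-as table can be captured in a lookup so that a handler or script can check support before emitting a say-as element. This is a sketch under stated assumptions: the names are hypothetical, the data is transcribed from the table in this section, and the language codes are written in their standard forms ja-JP and en-GB.

```python
# Languages listed for alphanumeric and boolean.
BASE = {"de-DE", "en-AU", "en-GB", "en-US", "es-US", "fr-CA",
        "fr-FR", "it-IT", "ja-JP", "nl-NL", "pt-BR"}
# Languages listed for most other types.
FULL = BASE | {"es-ES", "zh-CN"}

SAY_AS_LANGUAGES = {
    "address": {"en-US"},
    "alphanumeric": BASE,
    "boolean": BASE,
    "currency": FULL,
    "date": FULL,
    "digits": FULL,
    "number": FULL,
    "ordinal": FULL - {"zh-CN"},  # zh-CN is not listed for ordinal
    "spell": FULL,
    "telephone": FULL,
    "time": FULL,
}


def is_supported(interpret_as, language):
    """Return True if the table lists language for the say-as type."""
    return language in SAY_AS_LANGUAGES.get(interpret_as, set())


print(is_supported("address", "en-GB"))
print(is_supported("currency", "zh-CN"))
```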


Configure TTS

In this section:

Configure the TTS engine in Interaction Administrator

Add voices and languages for SAPI

Set the volume level of a voice for SAPI

Configure the TTS engine in Interaction Administrator

Use Interaction Administrator to configure TTS features.

1. Open Interaction Administrator and log on with administrator credentials.

2. In the left pane of the Interaction Administrator main window, select the System Configuration container.

3. In the right pane of the Interaction Administrator main window, double-click Configuration.
   The System Configuration dialog box appears.


4. In the System Configuration dialog box, click the Text To Speech tab.
   The Text To Speech tab appears.


5. In the Default TTS Provider box, select the TTS engine that you want to use:
   SAPI (default) - Uses the Microsoft Speech API (SAPI) of the Windows Server operating system on the CIC server.
   MRCP - Uses a third-party TTS engine, such as Nuance or Loquendo.
   Media Server - Uses the Interaction Text to Speech engine of Interaction Media Server.

   Note: The Media Server item appears only if you applied the Interaction Text to Speech feature license to your CIC server.

6. If you are using SAPI for your TTS solution, you can configure the controls in the SAPI Configuration group. For more information on the SAPI Configuration controls, see Interaction Administrator Help.

   a. In the Concurrent Session Limit box, enter the maximum number of concurrent sessions allowed.
      The limit is either a license-enforced limit or a load-enforced limit. For example, if you have a 20-port license, the system cannot connect to more than 20 sessions.

   b. In the Concurrent Session Warning Level box, you can enter the minimum number of concurrent sessions that can be active before a warning message appears.
      The warning message indicates that you are close to exceeding the concurrent session limit.

   c. In the Volume Control box, you can adjust the loudness level for SAPI voices.

   d. Click OK.
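The relationship between the Concurrent Session Limit and the Concurrent Session Warning Level can be pictured with a small sketch. It is an illustration only: the function name is hypothetical, and the exact comparison that CIC applies (for example, whether the warning starts at or above the configured level) is an assumption.

```python
def session_status(active, warning_level, limit):
    """Classify the number of active SAPI TTS sessions against the settings."""
    if active >= limit:
        return "limit reached"  # no further sessions can connect
    if active >= warning_level:
        return "warning"        # close to exceeding the session limit
    return "ok"


# Example with a 20-port license and a warning level of 18.
for active in (10, 18, 20):
    print(active, session_status(active, warning_level=18, limit=20))
```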

Add voices and languages for SAPI

You can choose to write custom applications for multiple voices and languages by creating a voice name parameter for each voice and then making the necessary handler modifications to use these SAPI voice name parameters.

If you downloaded and installed Microsoft Speech Runtime Platform on the CIC server and want to add additional voices that it provides, you must define the voices in Interaction Administrator and reference the Registry location where the tokens are located. The base Registry path for the Microsoft Speech Runtime Platform voices is as follows:


HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Speech Server\v11.0\Voices\Tokens
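To find the value to type into the Registry box for a voice, you can list the token keys under this base path. The following is a minimal sketch, assuming Python is available on the CIC server; the function names and the token name used in the example are hypothetical, and list_voice_tokens runs only on Windows because it relies on the standard winreg module.

```python
BASE_VOICE_TOKENS = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft"
    r"\Speech Server\v11.0\Voices\Tokens"
)


def voice_token_path(token_name):
    """Return the full registry path for one voice token."""
    return BASE_VOICE_TOKENS + "\\" + token_name


def list_voice_tokens():
    """Windows only: return the token key names found under the base path."""
    import winreg  # standard library, available only on Windows

    subkey = BASE_VOICE_TOKENS.split("\\", 1)[1]  # drop the hive name
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
        count = winreg.QueryInfoKey(key)[0]  # number of subkeys
        return [winreg.EnumKey(key, index) for index in range(count)]


# "ExampleVoiceToken" is a placeholder, not a real token name.
print(voice_token_path("ExampleVoiceToken"))
```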

Using the Text To Speech tab of the System Configuration dialog box in Interaction Administrator, you can add multiple voices and languages. You can add an unlimited number of voices; however, you can associate each language to only one voice. Voice configuration settings on this page override the voice configuration settings in the Windows Speech applet.

After defining the voice, you can pass the voice name parameter (for example, “Jane English”) to the TTS-defined tool.

To add a voice for a language, do the following steps:

1. Open Interaction Administrator and log on with administrator credentials.

2. In the left pane of the Interaction Administrator window, select the System Configuration container.

3. In the right pane of the Interaction Administrator window, double-click Configuration.
   The System Configuration dialog box appears.

4. Select the Text To Speech tab of the System Configuration dialog box.

5. On the Text to Speech tab, select the Add button.
   The Add Voice dialog box appears.

6. In the Name box, enter the name that you want to assign to the voice.

7. In the Registry box, enter the registry path to the voice token.

8. In the Language list, select the language in which the voice is spoken.

9. Click OK.
   The voice now appears in the Voices panel on the Text to Speech tab.


Set the volume level of a voice for SAPI

After you have added a voice for SAPI, you can adjust the volume of that voice.

1. Open Interaction Administrator and log on with administrator credentials.

2. In the left pane of the Interaction Administrator window, select the System Configuration container.

3. In the right pane of the Interaction Administrator window, double-click Configuration.
   The System Configuration dialog box appears.

4. In the Volume Control box, enter or select the volume level for the voice.
   100 is the default value and the maximum value.

If you have more than one voice, repeat steps 1 through 4 for each voice as necessary.

Note: For more information about the options on the Text to Speech page, see Interaction Administrator Help.


Change log

The following table summarizes the changes made to this document.

Date: 12-October-2012
Changes: Updated for CIC 4.0 SU3; removed references to HMP.

Date: 29-April-2013
Changes:
  Updated title page and copyright notice.
  Updated reference to Microsoft text-to-speech website (now Tellme).
  Added reference to MRCP Technical Reference.
  Updated reference to IETF document RFC 6787.

Date: 13-February-2014
Changes:
  Updated Copyright notice.
  Updated registry path for the Jane English voice.

Date: 30-July-2014
Changes: Updated documentation to reflect changes required in the transition from version 4.0 SU# to CIC 2015 R1, such as updates to product version numbers, system requirements, installation procedures, references to Interactive Intelligence Product Information site URLs, and copyright and trademark information.

Date: 01-July-2015
Changes:
  Updated cover page to reflect new color scheme and logo.
  Updated copyright and trademark information.

Date: 09-October-2015
Changes: Updated the document to reflect the CIC 2016 R1 version.

Date: 09-February-2016
Changes:
  Updated copyright and trademark information
  Added content for Interaction Text-to-Speech (ITTS)
  Updated content to conform to latest TTS offerings and compatibilities
  Applied general edits for clarity and conformity

Date: 13-June-2016
Changes:
  IC-135402 - Made various edits, clarifications, and improvements throughout the document for Interaction Text to Speech (ITTS)
  IONMEDIA-2412 - Added French - France (fr-FR) as a supported language model for ITTS
  IONMEDIA-2631 - Added Mandarin Chinese (zh-CN) as a beta version of a language model for ITTS
  IONMEDIA-2630 - Added Dutch (nl-NL) as a supported language model for ITTS
  IONMEDIA-2483, 2472 - Added Brazilian Portuguese (pt-BR) as a supported language model for ITTS
  IONMEDIA-2400, 2401, 2424: Say-as support improvements - Added support for more language models in text normalizations

Date: 22-September-2016
Changes: IONMEDIA-2754 - Added support for address say-as text normalization for en-US in Supported say-as text normalization

Date: 12-December-2016
Changes: Updated Supported say-as text normalization with the following changes:
  IONMEDIA-2832 - Added zh-CN support for currency, date, digits, number, telephone, and time say-as text normalizations
  IONMEDIA-2826 - Added pt-BR support for alphanumeric say-as text normalization
  IONMEDIA-2827 - Added pt-BR support for Boolean say-as text normalization
  IONMEDIA-2828 - Added pt-BR support for digits say-as text normalization
  IONMEDIA-2829 - Added pt-BR support for number say-as text normalization
  IONMEDIA-2830 - Added pt-BR support for spell say-as text normalization

Date: 24-May-2017
Changes: Added language support for Italian.

Date: 01-June-2017
Changes: Added language support for Dutch.


Date: 24-October-2017
Changes:
  Rebranded this document to apply Genesys styles and terminology.

  Updated Supported languages for Interaction Text to Speech, to note, for zh-CN, only Latin characters are supported in say-as "alphanumeric", not Chinese characters.

  Improved TTS reading of dates for English languages: Text to Speech (TTS) better interprets text to read back a date instead of a fraction, where applicable. For example, "12/15" is ambiguous, since that could represent a fraction, a date (month/year or month/day), or a number sequence. Previously, "12/15" would have been read back as a fraction. Following this update, outside of a say-as environment, TTS reads back a date unless it finds additional context to indicate that a fraction or number is intended. The format used to read back dates is locale-specific. For example, "1/2" in an en-US date context will be read "January 2". In an en-GB/AU context, "February 1" is read back instead. In addition, TTS better interprets years. It reads back "nineteen eighty five" instead of "one thousand nine hundred eighty five". Plural years or decade references such as "1930s" are read back as "nineteen thirties", but not "one thousand nine hundred thirty S".

  Say-as-alphanumeric TTS improvements: To improve the customer experience, say-as-alphanumeric input is retained, even if some characters are not included in a language's character set. For example, Mandarin doesn't include the Polish letters ł or ę. If someone inputs "Lech Wałęsa" in zh-CN as say-as input, TTS will input "Lech Wałęsa" into the character normalizer. Previously, the character normalizer stripped out any non-Mandarin characters, resulting in "Lech Wasa". Starting with this release, TTS will output "Lech Walesa" in the case of zh-CN. By retaining more of the input, better pronunciation can be attained.

Date: 08-January-2018
Changes: Updated a note which said, "To ensure that Interaction Media Server does not use too many CPU resources, PureConnect recommends that you use no more than four languages with Interaction Text to Speech. Overuse of Interaction Media Server resources can result in defects or failures in audio processing." The note now says:

  "To ensure that Interaction Media Server does not exceed memory resources, customers should test the performance and memory usage of Media Servers when using more than 4 TTS languages. Older Media Servers may not be able to handle more than 4 languages. Overuse of Interaction Media Server resources can result in defects or failures in audio processing."
