At the Voxeo Customer Summit 2010, Dir. of Speech Technologies Dan Burnett provided an update on the evolving VoiceXML 3 standard.
More information at:
http://www.voxeo.com/
http://www.voxeo.com/summit2010
http://blogs.voxeo.com/speakingofstandards/
Standards Update: VoiceXML 3
Dan Burnett, Ph.D.
Dir. of Speech Technologies, Voxeo (Dir. of Standards, Voxeo)
© Voxeo Corporation
Voxeo on Standards
Develop ahead of standards
Make it Open Source
Lead in standards creation
Lead in standards adoption
Past Leadership
W3C
• VoiceXML 2.0/2.1, SRGS 1.0, SISR 1.0, SSML 1.0
• CCXML 1.0, SCXML 1.0, EMMA 1.0
IETF
• MRCPv1 extensions, MRCPv2, P-charge-info, SIP security
Where we are now
W3C
• VoiceXML 3, SSML 1.1, Pronunciation Alphabet Registry, Speech in HTML 5
• CCXML 1.0, SCXML 1.0, EMMA next, MMI architecture
IETF, 3GPP
• MRCPv2, XMPP (incl. multi-party Jingle and multiple chat), Media Control, SIP Overload, SIPREC, CODEC (Speex)
JCP
• JSR 289, 309 – SIP servlets, media control
• JSR 154, 254 – Java servlets and servlet pages
• XMPP SIP servlet – submitting to JCP
VoiceXML
2000 2004 2007 2010
VoiceXML 1.0
VoiceXML 2.0
VoiceXML 2.1
VoiceXML 3
V3 Motivations
FIA flexibility
New features
Extensibility
Better integration with other W3C languages
V3 is . . .
a restructured core
some new features
convenience elements to mimic VoiceXML 2.1
V3 Architecture
Core functionality defined in modules
Modules combined with convenience syntax into profiles
Core functionality defined in modules
Module behavior defined precisely as state machines
Modules + Conv. Syntax = Profiles
Modules grouped into profiles
Legacy (V2.1), Basic, Maximal
Convenience syntax simplifies authoring
Convenience Syntax
New elements and attributes, but no new functionality
Behavior defined in terms of core functionality
For example, <menu> defined in terms of <form> with grammars and prompts
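As a rough illustration of that idea, the sketch below puts a <menu> next to a hand-written <form> that behaves approximately the same way. It uses VoiceXML 2.1 syntax, since the V3 syntax is still evolving; the dialog ids, the field name "dept", and the prompt wording are all hypothetical, and the exact mapping V3 will specify is not given in these slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">

  <!-- Convenience syntax: prompt, grammar, and transitions in one element -->
  <menu id="main">
    <prompt>Say sales or support.</prompt>
    <choice next="#sales">sales</choice>
    <choice next="#support">support</choice>
  </menu>

  <!-- Approximately the same behavior spelled out with a form:
       one field, an inline grammar, and explicit transitions.
       Shown for comparison only; it is not reached from the menu above. -->
  <form id="main_as_form">
    <field name="dept">
      <prompt>Say sales or support.</prompt>
      <grammar mode="voice" version="1.0" root="dept">
        <rule id="dept">
          <one-of>
            <item>sales</item>
            <item>support</item>
          </one-of>
        </rule>
      </grammar>
      <filled>
        <!-- Assumes the field is filled with the recognized words,
             since the grammar carries no semantic tags -->
        <if cond="dept == 'sales'">
          <goto next="#sales"/>
        <else/>
          <goto next="#support"/>
        </if>
      </filled>
    </field>
  </form>

  <!-- Stub targets so both variants have somewhere to go -->
  <form id="sales">
    <block><prompt>Connecting you to sales.</prompt></block>
  </form>
  <form id="support">
    <block><prompt>Connecting you to support.</prompt></block>
  </form>
</vxml>
```

The point of the comparison is that the menu adds no capability the form lacks; it only saves the author from writing the grammar and the transition logic by hand.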
Convenience Syntax
Definite candidates are
• menu/choice/enumerate/option
• error/help/noinput/nomatch shortcuts
• link
Possible (but different) candidates might be
• if/else/elseif (using SCXML)
• transfer (using CCXML)
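For the event shortcuts in the first list, VoiceXML 2.x already treats <noinput>, <nomatch>, <help>, and <error> as shorthand for <catch> with the matching event name. The sketch below, in VoiceXML 2.1 syntax, shows one handler in each style; the form id, field name, and prompt wording are hypothetical, and how V3 will express the same shortcuts is not stated in these slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="ask_pin">
    <field name="pin" type="digits">
      <prompt>Please say your PIN.</prompt>

      <!-- Shortcut: handles the noinput event -->
      <noinput>
        <prompt>Sorry, I did not hear anything.</prompt>
        <reprompt/>
      </noinput>

      <!-- Long form: this is what the <nomatch> shortcut abbreviates -->
      <catch event="nomatch">
        <prompt>Sorry, I did not understand.</prompt>
        <reprompt/>
      </catch>
    </field>
  </form>
</vxml>
```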
New Stuff
New media, SIV functions
Session root documents
Real-time controls
Author-specifiable transition controllers
V2 eventing model now async & compatible with DOM Level 3
New Functionality – Video
Video -- <audio> replaced by <media>, which allows both audio and video
<media type="audio/x-wav" src="http://www.example.com/resource.wav"/>
<media type="video/3gpp" src="http://www.example.com/resource.3gp"/>
<media>
  <!-- inline SSML with audio media fallback -->
  <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">
    Ich bin ein Berliner.
  </speak>
  <media type="audio/x-wav" src="ichbineinberliner.wav"/>
</media>
New Functionality – Media Control
Media control -- media clipping, speed, and volume control now possible without resorting to SSML
<media type="audio/x-wav"
       soundLevel="+6.0dB" speed="50%" repeatcount="2"
       src="http://www.example.com/resource.wav"/>
<media type="video/3gpp"
       clipBegin="2s" clipEnd="5s" repeatDur="25s"
       src="http://www.example.com/resource.3gp"/>
New Functionality – SIV
SIV – speaker authentication capabilities available as core functionality
• Enrollment – creates voice model, associates it with id in speaker database
• Identification – which voice model in speaker database is a match for the speech?
• Verification – for the claimed id, does the speech match the voice model in the speaker database?
New Control – Session Root
Just like application root
Well, not exactly
• If not specified, no session root
• Session root change is ignored or causes error
First, let’s review application roots
<vxml session="blahblah.vxml" ...>
Application Root Review
Document               Application root
A: <vxml>              AppRoot A
B: <vxml>              AppRoot B
C: <vxml root="B">     AppRoot B
D: <vxml root="E">     AppRoot E
F: <vxml root="E">     AppRoot E
G: <vxml>              AppRoot G
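For readers who have not used application roots, the following is a minimal two-document sketch in VoiceXML 2.1 syntax. The slides abbreviate the reference as root="B"; in VoiceXML 2.x the attribute is named application. The file names, the variable, and the prompt wording are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- app-root.vxml (hypothetical file name): the application root document.
     Its variables live in application scope for every document that names it. -->
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <var name="company" expr="'Example Corp'"/>
</vxml>
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- leaf.vxml (hypothetical file name): names app-root.vxml as its root -->
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US"
      application="app-root.vxml">
  <form id="greeting">
    <block>
      <!-- Reads a variable declared in the root document -->
      <prompt>Welcome to <value expr="application.company"/>.</prompt>
    </block>
  </form>
</vxml>
```

A document with no application attribute acts as its own root, which is why documents A, B, and G above are listed with themselves as the application root.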
Session Root
Document                                        Session root
A: <vxml>                                       No Session Root
B: <vxml session="C">                           Session Root C
D: <vxml>                                       Session Root C
E: <vxml session="F">                           Session Root C
G: <vxml session="H" requiresession="true">     error.badfetch
Real-time Controls
Special grammars that are always active (not just in the wait state)
• Allows arbitrary speech/DTMF
• Immediate: volume, speed, skip
• At next event processing: cancel, goto
Acts as pre-filter on input stream, replacing matches with silence
<form>
  <rtc grammar="digit3.grxml" action="volume" params="+5"/>
  <field name="a"> ... </field>
  <field name="b">
    <cancelrtc grammar="digit3.grxml"/>
    ...
  </field>
</form>
Transition Controllers
Inter-element transitions now under author control
Controllers at form, document, application, and perhaps session levels
• e.g. form controller specifies which form item to execute next
Controllers can be in SCXML or another flow control language
Default controllers will give FIA behavior in Legacy Profile
Transition Controllers Example 1
<!-- document-level transition controller controls inter-form transitions -->
<vxml ...>
  <controller ...>
    <scxml:scxml version="1.0" ...>
      <!-- SCXML code determining which form to go to next -->
    </scxml:scxml>
  </controller>

  <form id="form_a">
    ...
    <goto next="form_b"/> <!-- goto is only a suggestion now -->
  </form>

  <form id="form_b">
    ...
  </form>
  ...
</vxml>
Transition Controllers Example 2
<!-- form-level transition controller controls inter-field transitions -->
<vxml ...>
  <form>
    <controller src="myformbehavior.scxml"/>
    <field name="field_a"> ... </field>
    <field name="field_b"> ... </field>
    <field name="field_c"> ... </field>
    <field name="field_d"> ... </field>
  </form>
  ...
</vxml>
For More V3 Info
Follow the work
• http://www.w3.org/Voice
Check out our recent Developer Jam Session
• http://developers.voiceobjects.com/tech-topics/monthly-jam-sessions/
Contact me
• dburnett at voxeo dot com
Dan Burnett, Ph.D. Dir. of Speech Technologies, Voxeo