Standards Update: VoiceXML 3 Dan Burnett, Ph.D. Dir. of Speech Technologies, Voxeo (Dir. of Standards, Voxeo)

Voxeo Summit 2010: Standards Update: VoiceXML3

DESCRIPTION

At the Voxeo Customer Summit 2010, Dir. of Speech Technologies Dan Burnett provided an update on the evolving VoiceXML 3 standard. More information at:
http://www.voxeo.com/
http://www.voxeo.com/summit2010
http://blogs.voxeo.com/speakingofstandards/

Page 1: Voxeo Summit 2010: Standards Update: VoiceXML3

Standards Update: VoiceXML 3

Dan Burnett, Ph.D., Dir. of Speech Technologies, Voxeo (Dir. of Standards, Voxeo)

Page 2: Voxeo Summit 2010: Standards Update: VoiceXML3

© Voxeo Corporation

Voxeo on Standards

  Develop ahead of standards

  Make it Open Source

  Lead in standards creation

  Lead in standards adoption

Page 3: Voxeo Summit 2010: Standards Update: VoiceXML3

Past Leadership

  W3C
•  VoiceXML 2.0/2.1, SRGS 1.0, SISR 1.0, SSML 1.0
•  CCXML 1.0, SCXML 1.0, EMMA 1.0

  IETF
•  MRCPv1 extensions, MRCPv2, P-charge-info, SIP security

Page 4: Voxeo Summit 2010: Standards Update: VoiceXML3

Where we are now

  W3C
•  VoiceXML 3, SSML 1.1, Pronunciation Alphabet Registry, Speech in HTML 5
•  CCXML 1.0, SCXML 1.0, EMMA next, MMI architecture

  IETF, 3GPP
•  MRCPv2, XMPP (incl. multi-party Jingle and multiple chat), Media Control, SIP Overload, SIPREC, CODEC (Speex)

  JCP
•  JSR 289, 309 – SIP servlets, media control
•  JSR 154, 254 – Java servlets and servlet pages
•  XMPP SIP servlet – submitting to JCP

Page 5: Voxeo Summit 2010: Standards Update: VoiceXML3

VoiceXML

  2000: VoiceXML 1.0

  2004: VoiceXML 2.0

  2007: VoiceXML 2.1

  2010: VoiceXML 3

Page 7: Voxeo Summit 2010: Standards Update: VoiceXML3

V3 Motivations

  FIA flexibility

  New features

  Extensibility

  Better integration with other W3C languages

Page 8: Voxeo Summit 2010: Standards Update: VoiceXML3

V3 is . . .

  a restructured core

  some new features

  convenience elements to mimic VoiceXML 2.1

Page 9: Voxeo Summit 2010: Standards Update: VoiceXML3

V3 Architecture

  Core functionality defined in modules

  Modules combined with convenience syntax into profiles

Page 10: Voxeo Summit 2010: Standards Update: VoiceXML3

Core functionality defined in modules

  Module behavior defined precisely as state machines

Page 11: Voxeo Summit 2010: Standards Update: VoiceXML3

Modules + Conv. Syntax = Profiles

  Modules grouped into profiles

  Legacy (V2.1), Basic, Maximal

  Convenience syntax simplifies authoring

Page 12: Voxeo Summit 2010: Standards Update: VoiceXML3

Convenience Syntax

  New elements and attributes, but no new functionality

  Behavior defined in terms of core functionality

  For example, <menu> defined in terms of <form> with grammars and prompts
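
A rough sketch, in Python, of what "defined in terms of core functionality" could look like in practice: it rewrites a VoiceXML 2.x-style <menu> into a <form> with one field, an inline grammar, and a <filled> block. The function name menu_to_form, the field name menu_choice, and the shape of the generated <form> are assumptions made for illustration; they are not the mapping given in the VoiceXML 3 drafts.

```python
import copy
import xml.etree.ElementTree as ET


def menu_to_form(menu: ET.Element) -> ET.Element:
    """Expand a <menu> element into a roughly equivalent <form> (illustrative only)."""
    form = ET.Element("form")
    if menu.get("id") is not None:
        form.set("id", menu.get("id"))

    # Hypothetical field name; a real expansion would pick its own.
    field = ET.SubElement(form, "field", name="menu_choice")

    # Prompts carry over unchanged.
    for prompt in menu.findall("prompt"):
        field.append(copy.deepcopy(prompt))

    # Each <choice> phrase becomes one alternative in an inline grammar ...
    grammar = ET.SubElement(field, "grammar", version="1.0", root="choices")
    rule = ET.SubElement(grammar, "rule", id="choices")
    one_of = ET.SubElement(rule, "one-of")

    # ... and one branch in <filled> that jumps to the choice's target.
    # Assumes phrases contain no quote characters.
    filled = ET.SubElement(field, "filled")
    for choice in menu.findall("choice"):
        phrase = (choice.text or "").strip()
        ET.SubElement(one_of, "item").text = phrase
        branch = ET.SubElement(filled, "if", cond=f"menu_choice == '{phrase}'")
        ET.SubElement(branch, "goto", next=choice.get("next", ""))
    return form


if __name__ == "__main__":
    sample = ET.fromstring(
        '<menu id="main"><prompt>Say sports or news.</prompt>'
        '<choice next="sports.vxml">sports</choice>'
        '<choice next="news.vxml">news</choice></menu>'
    )
    print(ET.tostring(menu_to_form(sample), encoding="unicode"))
```

The point of the sketch is only that the expansion adds no new behavior: everything the <menu> does is expressed with <form>, <field>, <grammar>, <prompt>, and <filled>.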

Page 13: Voxeo Summit 2010: Standards Update: VoiceXML3

Convenience Syntax

  Definite candidates are
•  menu/choice/enumerate/option
•  error/help/noinput/nomatch shortcuts
•  link

  Possible (but different) candidates might be
•  if/else/elseif (using SCXML)
•  transfer (using CCXML)

Page 14: Voxeo Summit 2010: Standards Update: VoiceXML3

New Stuff

  New media, SIV functions

  Session root documents

  Real-time controls

  Author-specifiable transition controllers

  V2 eventing model now async & compatible with DOM Level 3

Page 15: Voxeo Summit 2010: Standards Update: VoiceXML3

New Functionality – Video

  Video -- <audio> replaced by <media>, which allows both audio and video

<media type="audio/x-wav" src="http://www.example.com/resource.wav"/>

<media type="video/3gpp" src="http://www.example.com/resource.3gp"/>

<media>
  <!-- inline SSML with audio media fallback -->
  <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">
    Ich bin ein Berliner.
  </speak>
  <media type="audio/x-wav" src="ichbineinberliner.wav"/>
</media>

Page 16: Voxeo Summit 2010: Standards Update: VoiceXML3

New Functionality – Media Control

  Media control -- media clipping, speed, and volume control now possible without resorting to SSML

<media type="audio/x-wav" soundLevel="+6.0dB" speed="50%" repeatCount="2"
       src="http://www.example.com/resource.wav"/>

<media type="video/3gpp" clipBegin="2s" clipEnd="5s" repeatDur="25s"
       src="http://www.example.com/resource.3gp"/>

Page 17: Voxeo Summit 2010: Standards Update: VoiceXML3

New Functionality – SIV

  SIV – speaker authentication capabilities available as core functionality

•  Enrollment – creates voice model, associates it with id in speaker database

•  Identification – which voice model in speaker database is a match for the speech?

•  Verification – for the claimed id, does the speech match the voice model in the speaker database?
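
To make the difference between the three SIV functions concrete, here is a toy Python model. It is only a sketch: real voice models are not averaged feature vectors, and the class name SpeakerDatabase, the cosine-similarity scoring, and the 0.9 threshold are all assumptions made for illustration, not anything defined by VoiceXML 3.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class SpeakerDatabase:
    """Toy speaker database: one averaged feature vector per speaker id."""

    def __init__(self, threshold=0.9):  # hypothetical acceptance threshold
        self.models = {}
        self.threshold = threshold

    def enroll(self, speaker_id, samples):
        """Enrollment: create a voice model and associate it with an id."""
        count = len(samples)
        self.models[speaker_id] = [sum(col) / count for col in zip(*samples)]

    def identify(self, sample):
        """Identification: which enrolled model, if any, matches the speech?"""
        best = max(
            self.models,
            key=lambda sid: cosine(self.models[sid], sample),
            default=None,
        )
        if best is None or cosine(self.models[best], sample) < self.threshold:
            return None
        return best

    def verify(self, claimed_id, sample):
        """Verification: does the speech match the model for the claimed id?"""
        model = self.models.get(claimed_id)
        return model is not None and cosine(model, sample) >= self.threshold
```

Identification searches the whole database (one-to-many), while verification checks a single claimed identity (one-to-one); enrollment is what populates the database in the first place.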

Page 18: Voxeo Summit 2010: Standards Update: VoiceXML3

New Control – Session Root

  Just like application root

  Well, not exactly
•  If not specified, no session root
•  Session root change is ignored or causes error

  First, let’s review application roots

<vxml session="blahblah.vxml" ...>

Page 19: Voxeo Summit 2010: Standards Update: VoiceXML3

Application Root Review

Document              Application root
A: <vxml>             AppRoot A
B: <vxml>             AppRoot B
C: <vxml root="B">    AppRoot B
D: <vxml root="E">    AppRoot E
F: <vxml root="E">    AppRoot E
G: <vxml>             AppRoot G
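
The application root rule on this slide can be written as a few lines of Python. This is a sketch under assumptions: the function name trace_application_roots is made up, the root attribute is the slide's shorthand, and the "reloaded" flag reflects one reading of VoiceXML 2 behavior, namely that the root context is kept only while consecutive documents share the same root.

```python
def trace_application_roots(visits):
    """Trace application roots for documents visited in order.

    visits: list of (document, root attribute or None).
    Returns (document, application root, root_reloaded) triples.
    """
    trace = []
    previous_root = None
    for document, root_attr in visits:
        # A document without a root attribute acts as its own application root.
        root = root_attr if root_attr is not None else document
        # Assumption: the root context is reinitialized whenever the root changes.
        trace.append((document, root, root != previous_root))
        previous_root = root
    return trace


if __name__ == "__main__":
    slide = [("A", None), ("B", None), ("C", "B"), ("D", "E"), ("F", "E"), ("G", None)]
    for document, root, reloaded in trace_application_roots(slide):
        print(document, "AppRoot", root, "(reloaded)" if reloaded else "(kept)")
```

The key property is that the application root can change from one document to the next, which is exactly what the session root does not allow.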

Page 20: Voxeo Summit 2010: Standards Update: VoiceXML3

Session Root

Document                                       Session root
A: <vxml>                                      No Session Root
B: <vxml session="C">                          Session Root C
D: <vxml>                                      Session Root C
E: <vxml session="F">                          Session Root C
G: <vxml session="H" requiresession="true">    error.badfetch
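
The session root sequence on this slide can be modeled the same way. This Python sketch assumes the rules exactly as the slide shows them: the first session attribute seen fixes the session root, later different values are ignored, and a different value combined with requiresession="true" yields error.badfetch. The function name trace_session_roots and the tuple layout are hypothetical.

```python
BADFETCH = "error.badfetch"


def trace_session_roots(visits):
    """Trace session roots for documents visited in order.

    visits: list of (document, session attribute or None, requiresession flag).
    Returns (document, session root or None or "error.badfetch") pairs.
    """
    current = None  # no session root until a document names one
    trace = []
    for document, session_attr, require in visits:
        if session_attr is not None:
            if current is None:
                current = session_attr  # first declaration wins
            elif session_attr != current and require:
                # The document insists on a session root that cannot be honored.
                trace.append((document, BADFETCH))
                continue
            # Otherwise a conflicting value is silently ignored.
        trace.append((document, current))
    return trace


if __name__ == "__main__":
    slide = [
        ("A", None, False),
        ("B", "C", False),
        ("D", None, False),
        ("E", "F", False),
        ("G", "H", True),
    ]
    for document, result in trace_session_roots(slide):
        print(document, result)
```

Unlike the application root, the session root is sticky: once document B sets it to C, documents D and E keep it regardless of what they declare.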

Page 21: Voxeo Summit 2010: Standards Update: VoiceXML3

Real-time Controls

  Special grammars that are always active (not just in the wait state)
•  Allows arbitrary speech/DTMF
•  Immediate: volume, speed, skip
•  At next event processing: cancel, goto

  Acts as pre-filter on input stream, replacing matches with silence

<form>
  <rtc grammar="digit3.grxml" action="volume" params="+5"/>
  <field name="a"> ... </field>
  <field name="b">
    <cancelrtc grammar="digit3.grxml"/>
    ...
  </field>
</form>
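
The pre-filter idea can be sketched in Python by treating the input stream as a list of tokens. Everything here is a simplification and an assumption: real controls match grammars rather than single tokens, and the class name RealTimeControls, the SILENCE marker, and the player dictionary are invented for the example.

```python
SILENCE = None  # stand-in for the silence that replaces matched input


def volume_up(player):
    """Immediate action, mirroring action="volume" params="+5" in the example."""
    player["volume"] += 5


class RealTimeControls:
    """Toy model of always-active controls sitting in front of the recognizer."""

    def __init__(self):
        self.active = {}

    def rtc(self, token, action):
        """Activate a control for one input token (stands in for a grammar)."""
        self.active[token] = action

    def cancelrtc(self, token):
        """Deactivate a control, as <cancelrtc> does inside field b."""
        self.active.pop(token, None)

    def prefilter(self, stream, player):
        """Run matching actions and hide the matched input from later grammars."""
        filtered = []
        for token in stream:
            action = self.active.get(token)
            if action is None:
                filtered.append(token)  # passes through to the field grammar
            else:
                action(player)
                filtered.append(SILENCE)
        return filtered


if __name__ == "__main__":
    controls = RealTimeControls()
    player = {"volume": 0}
    controls.rtc("3", volume_up)
    print(controls.prefilter(["1", "3", "2"], player), player)
    controls.cancelrtc("3")
    print(controls.prefilter(["3"], player), player)
```

While the control is active the field never sees the digit 3; after cancelrtc the same digit reaches the field as ordinary input.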

Page 22: Voxeo Summit 2010: Standards Update: VoiceXML3

Transition Controllers

  Inter-element transitions now under author control

  Controllers at form, document, application, and perhaps session levels
•  e.g. form controller specifies which form item to execute next

  Controllers can be in SCXML or another flow control language

  Default controllers will give FIA behavior in Legacy Profile
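
A small Python sketch of the idea that the form controller decides which form item runs next. The default controller below imitates only the core of the FIA select phase (pick the first item without a value); guard conditions and explicit jumps are left out. The function names and the reverse-order controller are hypothetical illustrations, not part of any specification.

```python
def fia_controller(items, values):
    """Default-style controller: pick the first form item that has no value yet."""
    for name in items:
        if values.get(name) is None:
            return name
    return None  # nothing left to visit


def reverse_controller(items, values):
    """Hypothetical author-supplied controller: visit unfilled items last to first."""
    return fia_controller(list(reversed(items)), values)


def run_form(items, controller, answers):
    """Drive a form with the given controller; returns the order items were visited."""
    values, order = {}, []
    while True:
        name = controller(items, values)
        if name is None:
            return order
        order.append(name)
        values[name] = answers[name]  # assumes every answer is non-None


if __name__ == "__main__":
    fields = ["field_a", "field_b", "field_c", "field_d"]
    answers = {name: "yes" for name in fields}
    print(run_form(fields, fia_controller, answers))
    print(run_form(fields, reverse_controller, answers))
```

Swapping the controller changes the dialog flow without touching the fields themselves, which is the separation the slide describes.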

Page 23: Voxeo Summit 2010: Standards Update: VoiceXML3

Transition Controllers Example 1

<!-- document-level transition controller controls inter-form transitions -->
<vxml ...>
  <controller ...>
    <scxml:scxml version="1.0" ...>
      <!-- SCXML code determining which form to go to next -->
    </scxml:scxml>
  </controller>

  <form id="form_a">
    ...
    <goto next="form_b"/> <!-- goto is only a suggestion now -->
  </form>

  <form id="form_b">
    ...
  </form>
  ...
</vxml>

Page 24: Voxeo Summit 2010: Standards Update: VoiceXML3

Transition Controllers Example 2

<!-- form-level transition controller controls inter-field transitions -->
<vxml ...>
  <form>
    <controller src="myformbehavior.scxml"/>

    <field name="field_a"> ... </field>
    <field name="field_b"> ... </field>
    <field name="field_c"> ... </field>
    <field name="field_d"> ... </field>
  </form>
  ...
</vxml>

Page 25: Voxeo Summit 2010: Standards Update: VoiceXML3

For More V3 Info

  Follow the work
•  http://www.w3.org/Voice

  Check out our recent Developer Jam Session
•  http://developers.voiceobjects.com/tech-topics/monthly-jam-sessions/

  Contact me
•  dburnett at voxeo dot com

Dan Burnett, Ph.D. Dir. of Speech Technologies, Voxeo