DARPA Communicator: The Development of Advanced Dialog Systems Using Open Source Software Bryan George, Samuel Bayer Presented at July 27, 2001


DARPA Communicator: The Development of Advanced Dialog Systems Using Open Source Software

Bryan George, Samuel Bayer

Presented at

July 27, 2001

2

DARPA Communicator Program Vision

W: I need an early flight to send new computers to Bosnia
C: Where from?
W: Washington DC
C: OK, there’s a Tuesday evening flight out of Andrews arriving 8:38 AM on Wednesday in Frankfurt Germany
W: No, I prefer [a flight from Andrews into] Ramstein Germany.
C: How about MAC Flight #1296 arriving Ramstein at 10:45AM on Wednesday?
W: Is that a C-141 aircraft?
C: No, it’s a C-5.
W: OK, arrange for transportation on that flight

Remote access to information via spoken mixed-initiative dialogue with context tracking, clarifications and confirmations

Technology focus: dialogue, presentation

Application focus: mobile, military

3

DARPA Communicator and the Galaxy Communicator Software Infrastructure (GCSI)

[Figure: the Hub at the center, connected to servers for Context Tracking, Application Backend, Dialogue Management, Language Generation, Frame Construction, Speech Recognition, Audio, and Text-to-Speech]

The GCSI, originally implemented by MIT and now maintained, extended and distributed by MITRE, underlies the dialogue systems being developed by Communicator participants

4

GCSI Design Requirements

Flexibility:

the infrastructure should be flexible enough to encompass the range of interaction strategies that the various Communicator sites might experiment with

Obtainability:

the infrastructure should be easy to get and to install

Learnability:

the infrastructure should be easy to learn to use

Embeddability:

the infrastructure should be easy to embed into other software programs

Maintenance:

the infrastructure should be supported and maintained for the Communicator program

Leverage:

the infrastructure should support longer-term program and research goals for distributed dialogue systems

5

GCSI Flexibility: Background

Hub and spoke infrastructure

Hub supports scripting, logging

Distributed

Message-based

[Figure: an Audio server (audio.mitre.org) and a Speech Recognition server (rec.mitre.org) connected through the Hub; the message “rec: audio available” is shown on both sides of the Hub]
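The hub-and-spoke, message-based arrangement on this slide can be illustrated with a small sketch. This is not the GCSI API: `ToyHub`, `register`, `send`, and the frame contents are hypothetical names chosen for illustration, and everything runs in one process rather than across hosts. The only detail taken from the slide is the message text "rec: audio available", which is assumed here to mean "deliver the operation `audio available` to the server named `rec`".

```python
from typing import Callable, Dict, List

Frame = Dict[str, object]  # a message payload: simple key/value pairs
Handler = Callable[[str, Frame], None]


class ToyHub:
    """Hypothetical stand-in for a hub. Not the GCSI API."""

    def __init__(self) -> None:
        self._servers: Dict[str, Handler] = {}
        self.log: List[str] = []  # the hub also records what it routes

    def register(self, server_name: str, handler: Handler) -> None:
        """Attach a spoke server under a name the hub can route to."""
        self._servers[server_name] = handler

    def send(self, message: str, frame: Frame) -> None:
        """Route '<server>: <operation>' to the named server."""
        server_name, _, operation = message.partition(":")
        server_name = server_name.strip()
        operation = operation.strip()
        self.log.append(f"{server_name} <- {operation}")
        self._servers[server_name](operation, frame)


# A recognizer spoke that only remembers what it was told.
received = []


def recognizer(operation: str, frame: Frame) -> None:
    received.append((operation, frame["host"]))


hub = ToyHub()
hub.register("rec", recognizer)

# The audio side never calls the recognizer directly; it only talks to the hub.
hub.send("rec: audio available", {"host": "audio.mitre.org"})
```

Because the sender addresses the hub by message name only, neither spoke needs to know the other's location or interface, which is the property the slide attributes to the hub-and-spoke design.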

6

GCSI Flexibility: Design Benefits

Message-passing means that the hub doesn’t need compile-time knowledge of server APIs (vs. CORBA, e.g.)

Hub scripting allows the programmer to dictate the flow of control of messages

- So programs can integrate synchronous and asynchronous servers without modifying the servers themselves

- So programmers can insert simple tools and filters to convert data among formats without modifying the servers themselves

Hub script behavior is controlled by the hub state

- So programs can easily modify the message flow of control in real time
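The benefits above can be sketched as a list of rules that the hub evaluates against its current state. This is a minimal illustration under stated assumptions, not the Hub scripting language: `Rule`, `run_script`, the `parser_enabled` state key, and the two toy operations are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Frame = Dict[str, object]
State = Dict[str, object]


@dataclass
class Rule:
    """One hypothetical script rule: run `operation` when `condition` holds."""
    name: str
    condition: Callable[[State, Frame], bool]
    operation: Callable[[Frame], Frame]


def run_script(rules: List[Rule], state: State, frame: Frame) -> Tuple[Frame, List[str]]:
    """Apply each matching rule in order and report which rules fired."""
    fired = []
    for rule in rules:
        if rule.condition(state, frame):
            frame = rule.operation(frame)
            fired.append(rule.name)
    return frame, fired


def normalize(frame: Frame) -> Frame:
    # A simple filter inserted between servers; no server code changes.
    return {**frame, "text": str(frame["text"]).lower()}


def parse(frame: Frame) -> Frame:
    # Stand-in for a parsing server: split the text into words.
    return {**frame, "words": str(frame["text"]).split()}


rules = [
    Rule("normalize", lambda state, frame: "text" in frame, normalize),
    Rule("parse", lambda state, frame: bool(state.get("parser_enabled")) and "text" in frame, parse),
]

state: State = {"parser_enabled": False}
out1, fired1 = run_script(rules, state, {"text": "Flights TO Ramstein"})

# Changing the hub state at run time changes the message flow.
state["parser_enabled"] = True
out2, fired2 = run_script(rules, state, {"text": "Flights TO Ramstein"})
```

In the first call only the filter runs; after the state change the same script also routes the frame to the parsing step, without either operation being modified.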

7

GCSI Obtainability: Open Source

Open source licensing simplifies software distribution

- Puts source code in the hands of researchers while preserving the intellectual property rights of the developers

Open source can simplify commercialization

- Joint MIT/MITRE GCSI open source license is MIT X Consortium* (no use restrictions)

Open source infrastructure is a platform for open source components

- Contributions from MITRE, MIT, CMU, Colorado...

*Plus US Gov’t use rights - see http://communicator.sourceforge.net/download/opensourcelicense.html

8

GCSI Obtainability: Installation

Resource restrictions impose a focus on the most common platforms in the Communicator program

- Intel Linux

- Sparc Solaris

- Windows NT

Also known to work or have worked on other configurations (e.g., HP-UX, SGI IRIX, PPC Linux), but these configurations are not supported

Open source supports community action

- Programmers have source if they want to enable a new OS (and, we hope, contribute their modifications to the code base)

9

GCSI Learnability: Training

Communicator program participants have had the option of attending a two-or-three-day introduction to the GCSI at MITRE-Bedford

- Building servers

- Scripting the Hub

- Logging

- Building an end-to-end system

Course materials available to program participants for download as a self-guided tutorial

10

GCSI Learnability: Support Materials

Documentation, in HTML and PDF (400 pages)

Extensive examples

- Basic server development

- Backchannel audio connections (“brokering”)

- GUI embedding

Toy end-to-end dialogue system

At least two sites have succeeded in creating dialogue systems using the GCSI in a short period of time without attending our training course

11

GCSI Embeddability

Embeddability means

- Compatibility with other software packages

- Compatibility with external main loops (CORBA, Java Swing, etc.)

[Figure: the Hub connected to two GCSI components, one embedded with CORBA and one with Swing]

GCSI addresses these concerns

- Thread-safe server library with well-defined API, with distinguished symbol prefixes

- Event-based programming model implements the default Communicator server loop in C, Python and Allegro Common Lisp
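Compatibility with an external main loop can be sketched as follows. This is an assumption-laden illustration, not the GCSI server library: `EmbeddableServer`, `on_readable`, and `host_main_loop` are hypothetical names, and the standard `selectors` module stands in for whatever loop the host application (a GUI toolkit, an ORB, and so on) already owns. The point is that the server object only exposes a callback and never blocks in a loop of its own.

```python
import selectors
import socket


class EmbeddableServer:
    """Hypothetical server object that never runs its own loop."""

    def __init__(self, conn: socket.socket) -> None:
        self.conn = conn
        self.messages = []

    def on_readable(self) -> None:
        """Callback for the host loop: read whatever has arrived."""
        data = self.conn.recv(4096)
        if data:
            self.messages.append(data.decode("utf-8"))


def host_main_loop(selector: selectors.BaseSelector, iterations: int) -> None:
    """Stands in for an external main loop owned by the host application."""
    for _ in range(iterations):
        for key, _mask in selector.select(timeout=0.1):
            key.data()  # invoke the callback registered for this socket


# A connected socket pair plays the role of the hub-to-server connection.
hub_side, server_side = socket.socketpair()
server = EmbeddableServer(server_side)

selector = selectors.DefaultSelector()
selector.register(server_side, selectors.EVENT_READ, data=server.on_readable)

hub_side.sendall(b"rec: audio available")
host_main_loop(selector, iterations=2)

selector.close()
hub_side.close()
server_side.close()
```

The host loop decides when the callback runs, so the same server object could be driven by a different loop without changes to its message handling.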

12

GCSI Maintenance

Requirement for prompt support favors in-house development over third-party tools

Bug queue ([email protected]), feature enhancement surveys for major releases

Enhancements in Galaxy Communicator 3.0 release

- Better simultaneous session management

- Message continuations

- Improved configuration management support

- Memory management improvements

- Hub scripting improvements

- New XDR-based communications protocol

- Major brokering enhancements

13

Leveraging the GCSI

Exploration of service standards for dialogue components

Delivery platform for readily consumable open-source dialogue components (e.g., audio servers, recognizers, parsers, synthesizers)

- MITRE, CMU, Colorado among the Communicator sites planning such releases

Exploration of domain portability issues

Exploration and definition of "best practice" in dialogue system development

14

Getting Started

The GCSI download consists of a scriptable central hub, libraries for constructing compliant spoke servers in C, Java, Allegro Common Lisp, and Python, extensive examples, documentation and sample servers

The GCSI is hosted by the MITRE Corporation. It is available directly from MITRE (http://fofoca.mitre.org/download) or via SourceForge (http://communicator.sourceforge.net)

For more information on the DARPA Communicator program, visit the DARPA Communicator home page (http://www.darpa.mil/ito/research/com/index.html)