U1, Speech in the interface: 1. Introduction

Module u1: Speech in the Interface

1: Introduction

Jacques Terken

HG room 2:40, tel. (247) 5254

j.m.b.terken@tue.nl



contents

1. Aims and overview of course

2. Speech interfaces

3. Usability issues: introduction

4. Project


Aims

Acquire insight into usability issues and obtain an overview of state of the art for speech in the interface

Obtain hands-on experience with design of speech-centric interface

Exercise project skills (organisation, collaboration, report, presentation)


Overview of Module

Introduction

Dialog management

Speech input technologies

Speech output technologies

Multimodal interaction

Evaluation

Human Communication

Exercises and project


contents

1. Aims and overview of course

2. Speech interfaces

3. Usability issues: introduction

4. Project


Speech in the interface

                 Non-interactive                              Interactive

Online           Monitoring speech communications,            Dialogue systems
                 live speech processing

Offline          Speech data-mining                           X


Markets and applications

R. Moore 2005


Speech interfaces

Conversational interfaces:

natural language interaction with machines (Star Trek syndrome)

Command & Control applications:

voice-based equivalent of command-line interfaces and button interfaces (utterances need to adhere to strict grammar)
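
What "adhering to a strict grammar" means in practice can be sketched in a few lines of Python. This is only an illustration: the stereo commands, the `COMMANDS` table and the `parse_command` helper are all hypothetical and not part of the course material. An utterance is accepted only when it matches one of the fixed patterns; anything phrased more freely is rejected.

```python
import re

# Hypothetical command grammar for a stereo (illustrative only).
COMMANDS = {
    "play": re.compile(r"^play (?:the )?(?P<source>radio|cd)$"),
    "stop": re.compile(r"^stop$"),
    "volume": re.compile(r"^volume (?P<direction>up|down)$"),
}


def parse_command(utterance):
    """Return (command, slots) if the utterance fits the grammar, else None."""
    # Normalise case and whitespace, as a recogniser's text output might need.
    text = " ".join(utterance.lower().split())
    for name, pattern in COMMANDS.items():
        match = pattern.match(text)
        if match:
            return name, match.groupdict()
    return None  # outside the grammar: the system cannot act on it
```

A conversational request such as "could you put some music on" returns `None` here, which is exactly the limitation that separates command & control from conversational interfaces.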


Components of conversational interfaces

Speech recognition

Natural Language Analysis

Dialogue Manager

Speech Synthesis

Language Generation

Application
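
One way to read the component list is as a processing chain from user speech to system speech, with the dialogue manager consulting the application. The Python sketch below is a toy under loud assumptions: every function is a stand-in (the "audio" is already text, analysis is keyword spotting), and the train timetable domain, city names and departure times are invented for illustration.

```python
def recognise(audio):
    # Stand-in for speech recognition: the "audio" is already a string here.
    return audio.lower().strip()


def analyse(words):
    # Stand-in for natural language analysis: keyword spotting into a frame.
    frame = {}
    for city in ("amsterdam", "eindhoven"):
        if city in words:
            frame["destination"] = city
    return frame


def application(frame):
    # Stand-in for the back-end application: a tiny, invented timetable.
    timetable = {"amsterdam": "10:02", "eindhoven": "10:32"}
    return timetable.get(frame.get("destination"))


def manage_dialogue(frame):
    # Dialogue manager: answer if possible, otherwise ask for what is missing.
    if "destination" not in frame:
        return {"act": "ask", "slot": "destination"}
    return {"act": "inform", "destination": frame["destination"],
            "time": application(frame)}


def generate(act):
    # Language generation: turn the dialogue act into a sentence.
    if act["act"] == "ask":
        return "Where do you want to go?"
    return f"The next train to {act['destination'].title()} leaves at {act['time']}."


def synthesise(text):
    # Stand-in for speech synthesis: return the text that would be spoken.
    return text


def respond(audio):
    """Run one user turn through the whole chain."""
    return synthesise(generate(manage_dialogue(analyse(recognise(audio)))))
```

Calling `respond("A ticket to Eindhoven please")` passes through all six components; an utterance without a known city makes the dialogue manager ask a follow-up question instead.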


Spin-offs

1. Dictation systems: what you say

Speech recognition

Natural Language Analysis

Dialogue Manager

Speech Synthesis

Language Generation

Application (e.g. MS-Word)


2. Command-control: what you mean

Speech recognition

(Natural Language) Analysis

Dialogue Manager

Speech Synthesis

Language Generation

Application (e.g. stereo)


3. Text-to-speech conversion

Speech recognition

Natural Language Analysis

Dialogue Manager

Speech Synthesis

Language Generation: prosody

Application (e.g. E-mail)


contents

1. Aims and overview of course

2. Speech interfaces

3. Usability issues: introduction

4. Project


Speech in HCI: “yes please”

Among others Zue (MIT):

Speech will be key technology of the 21st century


Background (Zue et al.):

– Aim: developing the conversational interface

– Motivation: natural language interaction is the most natural form of communication (learned at a very early age); among other things, it offers very efficient error handling


Advantages of speech

direct access to functionality

supports mobility

suited for hands busy/dirty - eyes busy situations

no special motor abilities needed, optimal compatibility with communicative abilities of users

compatible with trend towards miniaturisation of equipment


Maturity hypothesis

Speech interfaces not yet mature because of complexity of technology:

– R.K. Moore:

“Spoken language interaction is the most sophisticated behaviour of the most complex organism in the known universe”


Phylogenetic argumentation

First: direct manipulation (“you do what I want”)

Later: symbolic manipulation (cf. management, commercials)

Physical manipulation and violence considered primitive


Ontogenetic argumentation

Russian educational psychology (Galperin):

– knowledge acquisition starts with direct manipulation

– later on: symbolic manipulation

“stay off” warning to children: “look with your eyes not with your hands”


Therefore

Direct manipulation phylogenetically and ontogenetically more primitive and less complex

Maturity hypothesis: same trajectory for HCI:

first direct manipulation

then symbolic manipulation (speech)


However: UI design principles (Shneiderman ’86):

– transparency: continuous representation of objects and actions

– fast, incremental and reversible operations with immediate effect

– physical actions or labelled buttons, avoid complex syntax/natural language as much as possible

Design principles difficult to realise in speech interfaces


In addition, language and speech technology is not (yet) very robust, and development costs are high

Getting towards the application semantics is more complicated for (natural) language than for direct manipulation

Finally: HCI is a domain in its own right, so there is no a priori reason to model HCI after HHI (human-human interaction)

SO: avoid natural language


Speech interfaces: yes or no

Speech not suited for all kinds of information or situations (e.g. “a picture is worth a thousand words”)

Nevertheless, speech is useful under certain conditions, e.g.

– hands busy - eyes busy

– mobility, miniaturisation

– disabilities (CTS/RSI!)


Use interface design guidelines for design of speech interfaces, e.g. http://www.larson-tech.com/MMGuide.html

And in return: offer human communication theory as model for HCI


Speech interfaces (SI) and direct-manipulation interfaces

Main problems with speech interfaces:

– no external support for functionality

– unreliability of input technology


Dealing with unreliability

Constrain domain

– restricted vocabulary

– restricted application / task domain

– restricted number of users: speaker-dependent speech recognition

Extensive verification (in connection with error cost)
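
Tying verification to error cost can be made concrete with a small decision rule. The Python sketch below is one possible reading, not a method from the course: the `choose_verification` function, the 0-to-1 scales for confidence and cost, and the two thresholds are assumptions chosen for illustration.

```python
def choose_verification(confidence, error_cost):
    """Pick a verification strategy for one recognised utterance.

    confidence: recogniser confidence between 0 and 1 (assumed scale).
    error_cost: how bad a misrecognition would be, between 0 and 1 (assumed scale).
    """
    # Expected damage of acting on a wrong hypothesis.
    risk = (1.0 - confidence) * error_cost
    if risk < 0.05:        # hypothetical threshold
        return "accept"    # act immediately, no verification
    if risk < 0.25:        # hypothetical threshold
        return "implicit"  # e.g. "Tuning to Radio 1." (user can object)
    return "explicit"      # e.g. "Did you say: delete all messages?"
```

With these numbers, a well-recognised, harmless command is simply executed, while the same confidence on a costly action (say, deleting data) triggers at least an implicit confirmation.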


Dealing with functionality problem

Quick reference card

Training

System-driven dialogue

With experience: need for adaptive systems (e.g. barge-in)
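
A system-driven dialogue addresses the functionality problem by telling the user, at every step, what can be said. The Python sketch below illustrates the idea for a pizza order; the slot names, prompts, vocabularies and the `run_dialogue` helper are hypothetical, and user input is simulated as a list of strings rather than recognised speech.

```python
# Hypothetical slots: (name, opening prompt, allowed answers).
SLOTS = [
    ("size", "What size pizza would you like?", {"small", "medium", "large"}),
    ("topping", "Which topping would you like?", {"cheese", "ham", "mushroom"}),
]


def run_dialogue(answers):
    """Drive the dialogue over simulated user answers; return (order, prompts)."""
    answers = iter(answers)
    order, prompts = {}, []
    for name, prompt, vocabulary in SLOTS:
        while name not in order:
            prompts.append(prompt)  # the system always takes the initiative
            reply = next(answers, None)
            if reply is None:
                return order, prompts  # user stopped answering
            reply = reply.lower().strip()
            if reply in vocabulary:
                order[name] = reply
            else:
                # Re-prompt by listing the restricted vocabulary explicitly.
                prompt = "Please choose one of: " + ", ".join(sorted(vocabulary)) + "."
    return order, prompts
```

The re-prompt plays the role of a spoken quick reference card: novices hear the options, whereas experienced users would want to skip them, which is where adaptive behaviour such as barge-in comes in.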


contents

1. Aims and overview of course

2. Speech interfaces

3. Usability issues: introduction

4. Project


Aim

Provide hands-on experience with design and implementation of a speech-centric interface, involving (at least) voice-based control and speech output.

The topic: speech/multimodal interface for in-car information and entertainment systems.


Tools

Download CSLU toolkit from

http://www.cslu.ogi.edu/toolkit (requires registering)


Project stages

Task analysis (requirements gathering)

Design on paper (V0.1)

Wizard of Oz

Redesign, implementation of V1.0

Validation

Evaluation

Report


Exercise for today

CSLU Exercises: McTear ch. 7, pizza application

Extend the pizza application:

– Go to http://www.dominos.nl/

– Click “online bestellen” (order online)

– Extend the dialogue system to include all the topping options, the side dishes and the drinks (see “menukaart”, the menu)

– Test the system and discuss your experiences


Composition of project teams