WP 3 Presentation: Dialogue Manager Jürgen Geiger

ALIAS WP3 Results


Page 1: ALIAS WP3 Results

WP 3 Presentation: Dialogue Manager

Jürgen Geiger

Page 2: ALIAS WP3 Results

Overview

• Goals
• Achievements
• Open Questions
• List of Publications

04.06.2013

Page 3: ALIAS WP3 Results

Goals

• Dialogue Manager
  – Back-end for HMI
  – Control all other modules
• Applications: Games, Reading service, …
• Physiological Monitoring

Page 4: ALIAS WP3 Results

Tasks

T3.1 User identification via speech or face recognition

T3.2 Knowledge representation

T3.3 Development of a dialogue system

T3.4 Development and Integration of a game collection

T3.5 Web 2.0 wrapper for web services

T3.6 Integration of further software modules

T3.7 Adaptable behaviour of the robot platform

T3.8 Integration of natural language understanding

T3.9 Physiological monitoring

T3.10 Integration of the physiological monitoring into the dialogue manager

Page 5: ALIAS WP3 Results

Deliverables

Name                                            Due
D3.1  Report on the dialogue manager concept    09/2010
D3.2  Knowledge databases                       04/2011
D3.3  Identification System (face & voice)      01/2011
D3.4  Prototype of dialogue manager             04/2011
D3.5  Physiological Monitoring (PM)             02/2011
D3.6  Dialogue system with integrated PM        06/2011
D3.7  Dialogue system updated to user's needs   05/2012
D3.8  Final dialogue system with integrated PM  02/2013

Page 6: ALIAS WP3 Results

Achievements

• Dialogue Manager
  – Control of all other modules
  – Natural language understanding
• Software modules
  – Physiological Monitoring
  – User Identification
• Adaptable behaviour
  – Emotions
• Physiological Monitoring

Page 7: ALIAS WP3 Results

Dialogue Manager: Overview (T3.3, D3.1, D3.4, D3.7, D3.8)

• Central component of the ALIAS robot ("brain")
  – Reproduces the basic mechanisms of human thinking
  – Decides on the behavior of the robot
  – Communicates with all other modules

[Figure: architecture diagram with the labels DM Core, Input, CES Understanding, Situation Model, Action, TTS, Face Detect, ASR, Robot Control, GUI, Touch Screen and Physio Monitor, and the example utterance "Hello Robot!"]

Page 8: ALIAS WP3 Results

Dialogue Manager: Overview

• Components
  – DM-Core ("Brain")
    • NLU-Engine understands human verbal messages
    • Decision-Engine decides on the behavior
    • Based on conceptual event representations (human thinking)
  – DM-Communicator
    • Communicates with sensing and acting modules
    • Translates between modules and DM-Core
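
The split between DM-Core and DM-Communicator can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the class names, the `Event` format, the module names used as dictionary keys and the toy decision rule are hypothetical and do not come from the ALIAS code base or from the Cognesys CES technology.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Event:
    """Module-independent event handed to the core (hypothetical format)."""
    kind: str      # e.g. "utterance" or "touch"
    payload: str


class DMCore:
    """Stand-in for the DM-Core: maps an event to an abstract action."""

    def decide(self, event: Event) -> Tuple[str, str]:
        # Toy rule only; the real core uses NLU and a decision engine.
        if event.kind == "utterance" and "hello" in event.payload.lower():
            return ("say", "Hello! How can I help you?")
        return ("say", "Sorry, I did not understand that.")


class DMCommunicator:
    """Translates between module-specific messages and the core."""

    def __init__(self, core: DMCore) -> None:
        self.core = core
        self.outputs: Dict[str, Callable[[str], None]] = {}

    def register_output(self, action: str, send: Callable[[str], None]) -> None:
        # Associate an abstract action with the module call that executes it.
        self.outputs[action] = send

    def on_module_message(self, module: str, raw: str) -> None:
        # Translate the raw module message into a core event.
        kind = {"ASR": "utterance", "GUI": "touch"}.get(module, "unknown")
        action, argument = self.core.decide(Event(kind, raw))
        # Translate the abstract action back into a module call.
        self.outputs[action](argument)


spoken: List[str] = []
communicator = DMCommunicator(DMCore())
communicator.register_output("say", spoken.append)  # stands in for the TTS module
communicator.on_module_message("ASR", "Hello Robot!")
print(spoken)
```

In this sketch the core never sees a module-specific message, so a sensing or acting module could be replaced by changing only the translation in the communicator.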

Page 9: ALIAS WP3 Results

Natural Language Understanding

• NLU-Engine (T3.8, D3.2)
  – Based on Cognesys CES technology
  – Extracts and processes the conceptual meaning of verbal messages
  – Resistant to syntactically or grammatically degraded information
  – Uses knowledge and the current situation to identify statements and check their practicability
• NLU-Knowledge Database (T3.2, D3.2)
  – World knowledge: understands the world in general, simulates human memory
  – Expert knowledge: understands the world of elderly people and depends on the robot's functionality

Page 10: ALIAS WP3 Results

Acting and Behavior (T3.8, D3.2)

• Decision-Engine
  – Based on Cognesys CES technology
  – Processes conceptual event representations like humans do
  – Uses a situation model like human memory
  – Situation model
    • Represents the currently relevant objects and their states and modalities
    • Represents the history of events that constitutes the current situation
  – Proactive behavior
    • Example: informs the user about new mails, invites the user to stay in contact with their relatives
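
As a rough illustration of a situation model that holds object states plus an event history and drives a proactive action, consider the sketch below. The data layout, the object names `mailbox` and `user`, and the single rule are hypothetical; the slides do not describe how the CES-based Decision-Engine represents any of this internally.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SituationModel:
    # Currently relevant objects and their states, e.g. {"mailbox": {"unread": 2}}.
    objects: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    # History of events that led to the current situation.
    history: List[str] = field(default_factory=list)

    def update(self, obj: str, attribute: str, value: Any, event: str) -> None:
        """Set one object state and log the event that caused the change."""
        self.objects.setdefault(obj, {})[attribute] = value
        self.history.append(event)


def proactive_action(model: SituationModel) -> Optional[str]:
    """Hypothetical rule: announce unread mail once while the user is present."""
    unread = model.objects.get("mailbox", {}).get("unread", 0)
    present = model.objects.get("user", {}).get("present", False)
    if present and unread > 0 and "announced_mail" not in model.history:
        model.history.append("announced_mail")
        return f"You have {unread} new mail(s)."
    return None


model = SituationModel()
model.update("user", "present", True, "user_detected")
model.update("mailbox", "unread", 2, "mail_arrived")
first = proactive_action(model)   # announcement text
second = proactive_action(model)  # None, because the history records the announcement
print(first, second)
```

Keeping the history next to the object states is what lets the rule avoid repeating itself: the second call finds the earlier announcement in the event log.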

Page 11: ALIAS WP3 Results

Dialogue Management (T3.6)

• ASR Adapter
  – Receives spoken user input as text
  – The NLU-Engine processes the text
• GUI Adapter
  – Controls the GUI, processes user input
    • Menus
    • Games, TV, audio books, email …
    • Skype call and alarm call control flow
  – Synchronizes the GUI menus with BCI masks
• BCI Adapter
  – Controls the Brain Computer Interface masks
  – Processes user inputs

Page 12: ALIAS WP3 Results

Dialogue Management

• TTS Adapter
  – Sends text to be spoken to the Text-To-Speech module
• RD Adapter
  – Interface to the robot's low-level controllers
  – Controls navigation and movement behavior
  – Controls the robot's head emotions
  – Receives speaker identification information

Page 13: ALIAS WP3 Results

User identification: speech (T3.1, D3.3)

• Research aspects
  – Speaker diarization
  – Overlap detection
  – Speech activity detection
• Implementation for the robot

Page 14: ALIAS WP3 Results

Research aspects

• Speaker diarization
  – "Who speaks when?"
  – Utilise the output of a speech transcription system to suppress linguistic variation
• Overlap detection
  – Overlapping speech degrades performance
  – Detect & handle overlap
• Voice activity detection
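
The slides do not say which voice activity detection method was studied. Purely to illustrate what the task involves, here is a minimal energy-threshold detector, a textbook baseline rather than the approach used in the project. The frame length, the threshold and the toy signal are arbitrary assumptions.

```python
from typing import List


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def detect_speech(samples: List[float], frame_len: int = 4,
                  threshold: float = 0.01) -> List[bool]:
    """Mark a frame as active when its energy exceeds the threshold."""
    return [energy > threshold for energy in frame_energies(samples, frame_len)]


# Toy signal: one silent frame, one loud frame, one near-silent frame.
signal = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.01, -0.01, 0.0, 0.0]
flags = detect_speech(signal)
print(flags)  # [False, True, False]
```

A fixed threshold like this breaks down in noisy rooms, which is one reason voice activity detection is listed as a research aspect rather than a solved step.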

Page 15: ALIAS WP3 Results

Speaker Recognition: Implementation

• Integrated with DM
• Running permanently
• DM receives name of speaker
• Used during TTS output
  – To call the user by name

Page 16: ALIAS WP3 Results

User Identification: Face (T3.1, D3.3)

• Omnidirectional camera
• Viola & Jones algorithm for face detection
• Fusion with laser-based leg pair detection
• Face identification using Eigenfaces
• Keep eye contact with user

Page 17: ALIAS WP3 Results

Gaming with Speech Control (T3.4, D3.8)

• Control game via ASR
• Noughts and crosses
• AI to control computer player
• Touchscreen control also possible
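
The slides do not state which algorithm the computer player uses. A common choice for noughts and crosses is exhaustive minimax search, sketched below under that assumption. The board encoding (a list of nine cells holding "X", "O" or " ") and all function names are hypothetical.

```python
from typing import List, Optional

# The eight winning lines, as index triples into a row-major 3x3 board.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board: List[str]) -> Optional[str]:
    """Return "X" or "O" if a line is complete, otherwise None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def minimax(board: List[str], player: str, me: str) -> int:
    """Game value from `me`'s point of view: +1 win, 0 draw, -1 loss."""
    won = winner(board)
    if won is not None:
        return 1 if won == me else -1
    if " " not in board:
        return 0
    other = "O" if player == "X" else "X"
    scores = []
    for i in range(9):
        if board[i] == " ":
            board[i] = player          # try the move
            scores.append(minimax(board, other, me))
            board[i] = " "             # undo it
    return max(scores) if player == me else min(scores)


def best_move(board: List[str], me: str) -> int:
    """Index of the first empty cell with the best minimax value for `me`."""
    other = "O" if me == "X" else "X"
    best_score, best_index = -2, -1
    for i in range(9):
        if board[i] == " ":
            board[i] = me
            score = minimax(board, other, me)
            board[i] = " "
            if score > best_score:
                best_score, best_index = score, i
    return best_index


# X holds cells 0 and 1, so cell 2 completes the top row.
print(best_move(list("XX OO    "), "X"))  # 2
```

The game tree is small enough that no pruning or depth limit is needed. A speech or touchscreen front end would only have to map a recognised command to a cell index before calling `best_move` for the computer's reply.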

Page 18: ALIAS WP3 Results

Reading Service (T3.5, D3.8)

• Customised GUI
• Based on open-source software
• Functionality:
  – Read out e-books
  – Recognition from camera

Page 19: ALIAS WP3 Results

Display of Emotions (T3.7, D3.8)

• Can ALIAS display emotions?
• 5 basic emotions (Disgust, Fear, Joy, Sadness, Surprise)
• Integrated into Dialogue System

[Figure: example expressions labelled Disgust, Neutral and Sadness]

Page 20: ALIAS WP3 Results

Physiological Monitoring (T3.9, T3.10, D3.5, D3.6)

• Vital function monitoring system
• Recording, saving, display of vital function data
  – Manual data input
  – Data input directly by sensors
• Alarm function for suspicious data values
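
One simple way to realise an alarm for suspicious values is a range check applied to every reading as it is stored, whether it was typed in or came from a sensor. The sketch below assumes exactly that. The quantity names, the numeric ranges and the class names are hypothetical placeholders, not values or interfaces from the ALIAS system, and not medical guidance.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical plausibility ranges, for illustration only.
RANGES: Dict[str, Tuple[float, float]] = {
    "heart_rate_bpm": (50.0, 110.0),
    "body_temp_celsius": (35.5, 38.0),
}


@dataclass
class Reading:
    quantity: str
    value: float
    source: str  # "manual" or "sensor"


class VitalLog:
    """Stores readings and flags values outside the configured range."""

    def __init__(self) -> None:
        self.readings: List[Reading] = []

    def record(self, reading: Reading) -> bool:
        """Save the reading; return True if it should raise an alarm."""
        self.readings.append(reading)
        low, high = RANGES[reading.quantity]
        return not (low <= reading.value <= high)


log = VitalLog()
normal = log.record(Reading("heart_rate_bpm", 72.0, "manual"))
alarm = log.record(Reading("heart_rate_bpm", 140.0, "sensor"))
print(normal, alarm)  # False True
```

Every reading is saved regardless of the check, so the recorded data stays complete for later display.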

Page 21: ALIAS WP3 Results

Open questions

• Personal data: storage and usage
  – Person ID, physiological monitoring
  – Who gets access?
• Learning how to use the robot
  – Self-explanatory system
  – System adapts to the user
• Tablet PC?

Page 22: ALIAS WP3 Results

Selected Publications

• J. Geiger, M. Hofmann, B. Schuller and G. Rigoll: "Gait-based Person Identification by Spectral, Cepstral and Energy-related Audio Features," ICASSP 2013
• J. Geiger, T. Leykauf, T. Rehrl, F. Wallhoff, G. Rigoll: "The Robot ALIAS as a Gaming Platform for Elderly Persons," AAL-Kongress 2013
• J. Geiger, I. Yenin, T. Rehrl, F. Wallhoff, G. Rigoll: "Display of Emotions with the Robotic Platform ALIAS," AAL-Kongress 2013
• T. Rehrl, J. Geiger, M. Golcar, S. Gentsch, J. Knobloch, G. Rigoll: "The Robot ALIAS as a Database for Health Monitoring for Elderly People," AAL-Kongress 2013
• T. Rehrl, R. Troncy, A. Bley, S. Ihsen, K. Scheibl, W. Schneider, S. Glende, S. Goetze, J. Kessler, C. Hintermueller, and F. Wallhoff: "The Ambient Adaptable Living Assistant is Meeting its Users," AAL-Forum 2012
• T. Rehrl, J. Blume, A. Bannat, G. Rigoll, and F. Wallhoff: "On-line Learning of Dynamic Gestures for Human-Robot Interaction," KI 2012
• J. Geiger, R. Vipperla, S. Bozonnet, N. Evans, B. Schuller, G. Rigoll: "Convolutive Non-Negative Sparse Coding and New Features for Speech Overlap Handling in Speaker Diarization," INTERSPEECH 2012
• R. Vipperla, J. Geiger, S. Bozonnet, D. Wang, N. Evans, B. Schuller, G. Rigoll: "Speech Overlap Detection and Attribution Using Convolutive Non-Negative Sparse Coding," ICASSP 2012
• J. Geiger, M. Lakhal, B. Schuller, and G. Rigoll: "Learning new acoustic events in an HMM-based system using MAP adaptation," INTERSPEECH 2011
• T. Rehrl, J. Blume, J. Geiger, A. Bannat, F. Wallhoff, S. Ihsen, Y. Jeanrenaud, M. Merten, B. Schönebeck, S. Glende, and C. Nedopil: "ALIAS: Der anpassungsfähige Ambient Living Assistent," AAL-Kongress 2011