Bruxelles, October 3-4, 2000 10036 Interface Multimodal Analysis/Synthesis System for Human Interaction to Virtual and Augmented Environments IST Concertation Meeting Bruxelles - October 3- 4, 2000


10036 Interface

Multimodal Analysis/Synthesis System for Human Interaction to Virtual and Augmented Environments

IST Concertation Meeting

Bruxelles - October 3-4, 2000


The Consortium

DIST – University of Genoa  I  C
Lernout & Hauspie  B  P
Linköpings Universitet  S  P
Universitat Politecnica de Catalunya  E  P
Ecole Polytechnique Fédérale de Lausanne  CH  P
University of Geneva  CH  P
Informatics and Telematics Institute  EL  P
Tecnologia Automazione Uomo scrl.  I  P
ELAN Informatique  F  P
University of Maribor  SI  P
Curtin University of Technology  AU  P
Umea University  S  P
Centre National de la Recherche Scientifique  F  P
W Interactive SA  F  P


A “man in the computer”

Natural Speech

Natural Expressions

Emotional Speech Synthesis

Emotional Face & Body Animation

?


Man-to-machine action

• coherent analysis of audio-video channels

• high level interpretation and data fusion

• speech emotion understanding

• facial expression classification


Machine-to-man feedback

• human-like audio-video feedback simulating a “person in the machine”

• MPEG-4 face and body animation

• text-to-speech with lip synchronization

• emotional synthesis of speech and facial expressions


WORK DONE SO FAR

• Specifications of the common software platform

• Bimodal multi-language, multi-speaker corpus recording

• Development of preliminary tools

• Implementation of the intermediate common software platform

• Finalization of the call for papers (CfP) for ICAV3D

• Participation in IBC2000

• Preparation for IST2000

• Preliminary market study


The INTERFACE platform

• Network Platform (a large set of distributed, independent tools)

• Integrated Platform (a reduced set of strongly integrated tools)

• Demonstration Platform (customization of the Integrated Platform for a specific application-dependent context)


Tools under development

• Low/high level video analysis

• Low/high level speech analysis

• High level facial animation

• Natural dialog manager

• 3D human and cartoon characters

• Phoneme markers to FAP conversion

• Emotional speech synthesis

• High level body animation


The Network CSP

http://www.ist-interface.org

[Diagram: on the Client side and on the Server side, components written in C, C++, and Java each connect through an IDL interface to the ORB - Object Request Broker.]
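As a rough illustration of the role an object request broker plays between clients and servers, here is a minimal in-process sketch. It is not the project's platform and not a real ORB: the `Broker` class, the `SpeechAnalysisTool` class, and the `count_frames` operation are all hypothetical names chosen for this example, and no network transport or IDL compiler is involved.

```python
from typing import Any, Dict, Sequence


class Broker:
    """Toy stand-in for an object request broker (hypothetical, in-process only)."""

    def __init__(self) -> None:
        self._servants: Dict[str, Any] = {}

    def register(self, name: str, servant: Any) -> None:
        # A server publishes an object under a name the clients agree on.
        self._servants[name] = servant

    def invoke(self, name: str, operation: str, *args: Any) -> Any:
        # A client names the object and the operation; the broker locates
        # the implementation and forwards the call.
        servant = self._servants[name]
        return getattr(servant, operation)(*args)


class SpeechAnalysisTool:
    """Hypothetical server-side tool exposing one operation."""

    def count_frames(self, samples: Sequence[int], frame_len: int) -> int:
        # Number of complete frames of frame_len samples.
        return len(samples) // frame_len


broker = Broker()
broker.register("speech_analysis", SpeechAnalysisTool())
frames = broker.invoke("speech_analysis", "count_frames", [0] * 1000, 160)
```

The client only knows the registered name and the operation signature, which is the property that lets tools written in different languages sit behind a common interface description.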


Mapping for the 14 MPEG-4 visemes

1   p, b, m
2   f, v
3   T, D
4   t, d
5   k, g
6   tS, dZ, S
7   s, z
8   n, l
9   r
10  A
11  e
12  I
13  O
14  U


Facial emotion modeling and synthesis

Happiness, Sadness, Anger, Fear, Disgust, Surprise


Bimodal emotional analysis: training

[Block diagram, blocks as labeled on the slide: Audio Frames, Audio Analysis, Video Frames, Video Analysis, Video Features, Features Selection, Classifier (GMM); a “Frames #” label is also shown.]
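The slide names a GMM classifier but gives no details, so the following is only a minimal sketch of how one Gaussian mixture model per emotion could be trained and used. Everything specific is an assumption: diagonal covariances, two components, a fixed number of EM iterations, and the toy two-dimensional feature vectors standing in for selected audio and video features.

```python
import math


def log_gauss(x, mean, var):
    # Log density of a diagonal-covariance Gaussian.
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_gmm(data, k=2, iters=25, floor=1e-3):
    """Fit a k-component diagonal GMM with EM (deterministic initialization)."""
    n, dim = len(data), len(data[0])
    step = max(1, n // k)
    means = [list(data[min(j * step, n - 1)]) for j in range(k)]
    varis = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], varis[j])
                    for j in range(k)]
            norm = logsumexp(logs)
            resp.append([math.exp(value - norm) for value in logs])
        # M-step: re-estimate weights, means, and floored variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                        for d in range(dim)]
            varis[j] = [max(floor, sum(r[j] * (x[d] - means[j][d]) ** 2
                                       for r, x in zip(resp, data)) / nj)
                        for d in range(dim)]
    return weights, means, varis


def log_likelihood(x, gmm):
    weights, means, varis = gmm
    return logsumexp([math.log(w) + log_gauss(x, m, v)
                      for w, m, v in zip(weights, means, varis)])


def train_models(labelled_frames, k=2):
    # One GMM per emotion label.
    return {label: fit_gmm(frames, k) for label, frames in labelled_frames.items()}


def classify(x, models):
    # Pick the emotion whose model gives the highest log-likelihood.
    return max(models, key=lambda label: log_likelihood(x, models[label]))


# Toy, made-up feature vectors (not project data).
TOY_FEATURES = {
    "happiness": [[2.0, 2.1], [2.2, 1.9], [1.8, 2.0], [2.1, 2.2]],
    "sadness": [[-2.0, -1.9], [-2.2, -2.1], [-1.8, -2.0], [-2.1, -2.2]],
}
models = train_models(TOY_FEATURES)
```

Calling `classify(frame_features, models)` then returns the label of the best-scoring model for a new frame. The variance floor is there to keep a component from collapsing onto a single training point.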


Bimodal emotion synthesis: runtime

[Block diagram, blocks as labeled on the slide: Audio Frames, Audio Analysis, Audio Features, Classifier (GMM), Predefined Movements or Expressions, “Expression Blender”.]
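The slide does not say how the “Expression Blender” combines predefined expressions, so the sketch below shows one plausible reading only: a weighted average of preset parameter vectors, with the weights taken from per-emotion classifier scores. The function name `blend_expressions`, the preset values, and the two-element vectors are all hypothetical and are not real MPEG-4 FAP values.

```python
def blend_expressions(weights, presets):
    """Blend preset expression vectors by normalized per-emotion weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    size = len(next(iter(presets.values())))
    blended = [0.0] * size
    for label, weight in weights.items():
        for i, value in enumerate(presets[label]):
            # Each preset contributes in proportion to its normalized weight.
            blended[i] += (weight / total) * value
    return blended


# Made-up animation parameters for two predefined expressions.
PRESETS = {
    "happiness": [100.0, 0.0],
    "sadness": [0.0, -40.0],
}

# Example: classifier output favouring happiness three to one.
blended = blend_expressions({"happiness": 0.75, "sadness": 0.25}, PRESETS)
```

A linear blend is the simplest choice and gives smooth transitions as the classifier scores change from frame to frame; other blending rules are equally compatible with what the slide shows.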


Feature tracking tool: Track2FAP


Applications

• Virtual speaker for web news announcing

• Virtual operator for web call centers

• Virtual teacher for remote teaching

• Virtual salesman in e-commerce

• Virtual friend for web chatting

• Virtual guide for Internet navigation