
Speech Data Corpus for Verbal Intelligence Estimation

Information Technology – Dialogue Systems
Ulm University (Germany)
http://www.dialogue-systems.de

Kseniya Zablotskaya, Steffen Walter, Wolfgang Minker

www.dialogue-systems.de | LREC 2010 | May 2010

Outline
- Introduction
- Improvement of Spoken Language Dialogue Systems
- Verbal Intelligence Estimation
- Monologues Collection
- Hamburg Wechsler Intelligence Test
- Dialogues Collection
- Transcription Standards
- Result Table for Each Candidate
- Participants
- Conclusions and Future Work

Introduction

Analysis of speech: what our words can say about us:
- Emotion
- Gender
- Age
- Verbal intelligence
- Social class
- Personality

Different ways to describe the same event:
- “Excuse me, could you tell me the way to the railway station?”
- “Hey you, show me where the railway station is.”

Improvement of Spoken Language Dialogue Systems

Spoken language dialogue system:
- Acoustic front-end, speech recognition
- Linguistic analysis
- Dialogue management
- Text generation, speech synthesis
- Application

- estimation of cognitive processes of the user
- adaptation to the user

Verbal Intelligence Estimation

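Processing chain (diagram):
- Speakers with different VI scores (e.g. VI=80, VI=105, VI=120) record monologues and dialogues
- Transcribed speech
- Feature extraction
- Model / classifier, which outputs VIMod
- VI test, which gives the reference VI
- Evaluation (%)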
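
The chain in this diagram (transcribed speech, feature extraction, model / classifier, evaluation in %) can be illustrated with a small sketch. This is not the authors' method: the slides name neither the features nor the classifier. The two lexical features, the 1-nearest-neighbour model, the class labels and the toy transcripts below are all assumptions made up for illustration.

```python
def extract_features(transcript):
    """Toy lexical features (assumed, not from the slides):
    mean word length and type-token ratio."""
    words = [w.strip("?.,;").lower() for w in transcript.split()]
    words = [w for w in words if w]
    if not words:
        return (0.0, 0.0)
    mean_len = sum(len(w) for w in words) / len(words)
    type_token_ratio = len(set(words)) / len(words)
    return (mean_len, type_token_ratio)


def predict(train, transcript):
    """1-nearest-neighbour over the toy features.
    `train` is a list of (transcript, label) pairs; the returned
    label plays the role of VIMod in the diagram."""
    x = extract_features(transcript)

    def distance(item):
        f = extract_features(item[0])
        return (f[0] - x[0]) ** 2 + (f[1] - x[1]) ** 2

    return min(train, key=distance)[1]


def evaluate(predicted, reference):
    """Percentage of speakers whose predicted class matches
    the class derived from the VI test."""
    hits = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * hits / len(reference)


# Hypothetical training data: (transcript, VI class from the test).
train = [
    ("no no. or yes?", "low"),
    ("the educational system requires considerable reform,", "high"),
]
print(predict(train, "no no."))
print(evaluate(["low", "high", "mid", "low"], ["low", "high", "mid", "high"]))
```

A real study would replace the toy features with linguistically motivated ones and evaluate on speakers held out from training.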

Monologues Collection

Two short films (Galileo):

Craziest hotels in the world:
- necessary to memorize the names
- necessary to memorize the order

Experiment on how long people could stay awake:
- possible to describe the film without certain details
- descriptions are informative

Dialogues Collection

Duration: at least 10 minutes

Topic: the education and the school system in Germany
- interesting
- participants know a lot about it
- participants have different opinions

Dialogues:
- Discussions
- Contra-Discussions
- Answering Questions

Hamburg Wechsler Intelligence Test

Information (25 questions)
- measures general knowledge
- questions from a particular culture

For example: Who is the president of Russia?

Comprehension (10 questions)
- social awareness
- common sense

For example: What would you do if you lost your way in a forest?

Digit Span
- forward
- backward

Measures:
- auditory short-term memory
- concentration
- attention

For example: Please listen to the following digits and repeat them: 5 7 2 4 6
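
The forward and backward variants of this task can be made concrete with a short sketch. It only checks whether a response is correct; how the HAWIE awards points for the sub-test is not given on the slides, so no scoring is attempted. The function name `check_digit_span` is hypothetical.

```python
def check_digit_span(presented, response, backward=False):
    """Return True if the response repeats the presented digits
    in the same order (forward) or in reverse order (backward)."""
    expected = list(reversed(presented)) if backward else list(presented)
    return list(response) == expected


digits = [5, 7, 2, 4, 6]  # the example sequence from the slide
print(check_digit_span(digits, [5, 7, 2, 4, 6]))                 # forward
print(check_digit_span(digits, [6, 4, 2, 7, 5], backward=True))  # backward
```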

Arithmetic (10 questions)
- mental alertness
- attention and concentration while solving mathematical problems mentally

For example: Seven envelopes cost twenty-five cents. How many envelopes can you buy if you have one dollar?

Similarities in Dissimilar Objects (12 questions)
- abstract reasoning
- power of conceptualization

For example: Please find a similarity between a dog and a lion.

Vocabulary (42 questions)
- comprehension of meanings
- relation between the expressive words

For example: What does the word “zebra” mean?

Transcription Standards

All monologues and dialogues were transcribed according to the standards by Mergenthaler.

The punctuation marks in transcripts are used to show rhythmical and syntactical speech interruptions:
- “?” – interrogative word intonation and rising tone
- “.” – completed thoughts and falling tone
- “,” – short pauses in the speech, but with a continuation of the main idea
- “;” – interrupted thoughts

Example: “no no. or yes? you say; understand,”
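
Because the four marks on this slide encode prosodic events rather than ordinary punctuation, they can be counted directly as simple features. The sketch below does this for the example transcript from the slide. It assumes that these four characters occur in a transcript only as markers; the helper name `count_markers` and the shortened descriptions are illustrative, not part of the Mergenthaler standard.

```python
from collections import Counter

# Marker meanings, paraphrased from the slide.
MARKERS = {
    "?": "interrogative intonation, rising tone",
    ".": "completed thought, falling tone",
    ",": "short pause, main idea continues",
    ";": "interrupted thought",
}


def count_markers(transcript):
    """Count how often each transcription marker occurs."""
    counts = Counter(ch for ch in transcript if ch in MARKERS)
    return {marker: counts.get(marker, 0) for marker in MARKERS}


example = "no no. or yes? you say; understand,"
for marker, n in count_markers(example).items():
    print(f"{marker!r}: {n} ({MARKERS[marker]})")
```

In the example transcript each of the four marks occurs exactly once.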

Result Table for Each Candidate

Sub-Test                            Points
Information                         18 out of 25
Comprehension                       19 out of 20
Digit Span                          15 out of 17
Arithmetic                          13 out of 14
Similarities in Dissimilar Objects  24 out of 24
Vocabulary                          73 out of 84
Verbal IQ                           122

- the table lists the candidate’s points for each verbal task and the verbal IQ
- the verbal IQ is determined from the special tables of the HAWIE
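
A per-candidate result like the one above can be held in a small data structure. The sketch below stores the points from the example table and computes raw totals and percentages per sub-test. It deliberately does not compute the verbal IQ: that value (122 in the example) comes from the HAWIE norm tables, which are not reproduced on the slides. The function names are hypothetical.

```python
# (achieved points, maximum points) per sub-test, from the example table.
results = {
    "Information": (18, 25),
    "Comprehension": (19, 20),
    "Digit Span": (15, 17),
    "Arithmetic": (13, 14),
    "Similarities in Dissimilar Objects": (24, 24),
    "Vocabulary": (73, 84),
}


def raw_totals(table):
    """Sum achieved and maximum points over all sub-tests."""
    achieved = sum(a for a, _ in table.values())
    maximum = sum(m for _, m in table.values())
    return achieved, maximum


def percentages(table):
    """Share of the maximum reached in each sub-test, in percent."""
    return {name: round(100 * a / m, 1) for name, (a, m) in table.items()}


achieved, maximum = raw_totals(results)
print(f"raw total: {achieved} out of {maximum}")
for name, pct in percentages(results).items():
    print(f"{name}: {pct}%")
```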

Participants

- 56 candidates: men - 27, women - 29
- Age: 16 – 75
- Language: German

- 71 monologues (3 hours 30 minutes)
- 30 dialogues (6 hours 30 minutes)

Conclusions and future directions

Speech data corpus:
- 56 candidates;
- 10 hours of audio data;

Approaches which can be applied to the collected data:
- Word usage, abstracts, emotion words;
- Analysis at different linguistic levels: morphology, lexicology, syntax, semantics, and discourse;
- Linguistic styles;
- Content words;
- Degree of speakers’ immersion in monologues and dialogues;
- “Good story” criteria;
- Status in a conversation;
- Levels of agreement;
- Kelly’s repertory grids;

We plan to recruit more candidates and to continue the recordings.

Thanks for your attention!