
Sentence Intonation for Polish Language
Prozodia wypowiedzi w języku polskim

Bożena Piorkówska, Janusz Rafałko, Wojciech Lesiński, Edward Szpilewski

Institute of Computer Sciences, University of Bialystok, Bialystok, Poland

ABSTRACT

The article presents the results of an examination of sentence intonation in Polish. The tests were performed for the project “Development of multi-voice and multi-language Text-to-Speech (TTS) and Speech-to-Text (STT) conversations system (language: Belarussian, Polish, Russian)”. The article gives a short introduction to prosody with particular stress on syntax (simple and compound sentence structure). The examined material consisted of recordings of a text read aloud by four different people (two women and two men). The article also presents the process of analysis and future plans.

STRESZCZENIE

The article presents the results of a study of sentence intonation in Polish, carried out for the project „Syntezer mowy polskiej na podstawie tekstu” (a Polish text-to-speech synthesiser). It gives a short introduction to prosody, with particular attention to syntax, that is, the structure of simple and compound sentences. The research material consisted of recordings of a text read by four different people (two women and two men). The article also describes how the analysis was carried out and the directions of further work.

1. Introduction

The research aims to fill the gap in introducing and promoting computerised speech technology for the Polish language. The decisive factor in achieving high-quality speech synthesis is the completeness of the resources and databases used. The research objective is to develop the linguistic resources, vocabulary, grammar and acoustical databases. The synthesis of the phonemic characteristics of speech is based on the Allophones Natural Waves method. The basic principle of synthesising the prosodic features of speech is the division of an utterance into accent groups and the formation, on their basis, of the entire tonal, rhythmical and dynamic contours of a syntagm and of the utterance as a whole. By using a data-driven approach, the speech synthesiser will draw on prosodic feature databases for the synthesis of speech sounds and intonation. The two modules are expected to achieve a high quality of synthesised speech.

In order for the synthesised speech to sound natural it needs to have rhythm and, more importantly, proper intonation. The way people speak differs depending on their ability to produce utterances in a particular way. The voice signal is described using numerous physical parameters which vary as the speech goes on. The acoustic parameter that is particularly important in prosodic analysis is the basic (fundamental) frequency F0 and its curve (the intonation contour). That is why special emphasis is placed on determining the intonation contour and the F0 maximum and minimum values for particular types of utterances.

Speech and Language Technology. Volume 9/10

2. Kinds of utterances

Syntax (the sentence structure and message) is crucial to the intonation of certain utterances. One of the divisions of utterances is into clauses and phrases. Phrases are groups of words that have either no subject or no predicate, e.g. Przechodzić tylko na zielonym świetle (Cross only on the green light). A sentence is a group of grammatically interrelated words containing a subject and a predicate, e.g. Proszę przechodzić tylko na zielonym świetle (Please cross only on the green light). Sentences with modifiers are called long simple sentences, whereas the ones without are called short simple sentences. A simple sentence can be as short as one word. Longer utterances with more than one predicate and/or subject are compound sentences. Based on the relations between these elements, compound and complex sentences are distinguished.

There are several types of compound sentences:

1. Conjunction: the sentences are joined. The co-ordinating conjunctions may be: i, oraz, a, jak również, ani, etc. (and, as well as, etc.); e.g. Marek był w górach a Ania nad morzem (Mark was in the mountains and Ann was by the seaside).
2. Negation: one sentence negates the other. The co-ordinating conjunctions may be: ale, lecz, a, jednak, zaś, natomiast, etc. (but, however, etc.); e.g. Kasia była spóźniona jednak się nie śpieszyła (Cathy was late but she was not in a hurry).
3. Disjunction: one sentence excludes the other. The co-ordinating conjunctions may be: albo, czy, lub, bądź, etc. (or, either, etc.); e.g. Przeczytam książkę albo pójdę do kina (I will read a book or go to the cinema).
4. Implication: the second sentence is a consequence of the first. The co-ordinating conjunctions may be: więc, toteż, dlatego, zatem, etc. (so, that is why, consequently, etc.); e.g. Marek jest zdolny więc ma wysoką średnią (Mark is intelligent so his average is high).

In complex sentences the components are not equal. In such sentences there is a main clause (antecedent) and a subordinate clause (consequent). The subordinate clause replaces or complements one of the parts of the main clause.
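The conjunction lists above lend themselves to a simple lookup. The following is a minimal sketch, assuming whole-word matching is enough to spot a conjunction; the names `COMPOUND_TYPES` and `candidate_types` are hypothetical and do not come from the project. Because "a" is listed under both conjunction and negation, the function returns every type that matches rather than a single answer.

```python
# Conjunction lists copied from the four compound-sentence types in the text.
# "a" appears under both "conjunction" and "negation", so one word can
# point to more than one type.
COMPOUND_TYPES = {
    "conjunction": ["i", "oraz", "a", "jak również", "ani"],
    "negation": ["ale", "lecz", "a", "jednak", "zaś", "natomiast"],
    "disjunction": ["albo", "czy", "lub", "bądź"],
    "implication": ["więc", "toteż", "dlatego", "zatem"],
}


def candidate_types(sentence):
    """Return the compound-sentence types whose conjunctions occur in the sentence."""
    # Normalise whitespace and pad with spaces so that only whole words
    # (or whole multi-word conjunctions such as "jak również") match.
    words = sentence.lower().replace(",", " ").split()
    padded = " " + " ".join(words) + " "
    found = []
    for type_name, conjunctions in COMPOUND_TYPES.items():
        if any(" " + c + " " in padded for c in conjunctions):
            found.append(type_name)
    return found


print(candidate_types("Przeczytam książkę albo pójdę do kina"))
print(candidate_types("Marek jest zdolny więc ma wysoką średnią"))
print(candidate_types("Marek był w górach a Ania nad morzem"))
```

The last example yields two candidates, which illustrates why the conjunction alone does not always determine the sentence type; the message has to be taken into account as well.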

Depending on their purpose and emotional undertone, sentences are divided into declarative, interrogative, imperative and exclamatory ones. Emotions are usually expressed in writing with punctuation marks such as the dash, question mark, exclamation mark or ellipsis.

3. Research method

The very first step was to create a suitable text to be recorded. It had to contain the examined types of sentences, and it consisted of several sentences of each examined type. In addition, the sentences differed from each other in their conjunctions and message.


Picture 1. Spectrogram of a sentence read by a man.

Picture 2. Spectrogram of a sentence read by a woman.


Next, the text was recorded using SoundForge as the four people read it aloud. Each person was recorded separately so as not to suggest intonation to the other participants of the research. For every sentence, a spectrogram (a spectro-temporal representation of the sound) and an intonation contour were generated using Praat, a computer program with which phoneticians can analyse, synthesize, and manipulate speech. The computer analysis gave important data on F0 fluctuation and its mean value. Tone contour examination also brought interesting results. Spectrograms and tone contours of the sentence “Maciek był w górach a Ania nad morzem” (Mark was in the mountains and Ann was by the seaside) read by a man and a woman are presented in Pictures 1 and 2. The man clearly has a low-pitched voice: his F0 maximum is about 140 Hz, whereas the woman's is 243 Hz. The woman's intonation line is similar in shape to the man's. They differ only in pitch, which is normal, as women have higher voices than men.
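To make the notions of intonation contour, F0 minimum, maximum and mean concrete, here is a minimal sketch that tracks F0 frame by frame and summarises the result. It is not Praat's algorithm: a plain autocorrelation peak picker is swapped in for illustration, and it is far cruder than what Praat does (it can, for instance, make octave errors on real voices). The sample rate, frame length, hop size and the 75 to 400 Hz search range are all assumptions, and a synthetic 140 Hz tone stands in for a recording.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed for this sketch


def estimate_f0(frame, f_low=75.0, f_high=400.0):
    """Estimate F0 of one frame from the highest autocorrelation peak.

    Returns 0.0 when no positive peak is found (treated as unvoiced).
    """
    lag_min = int(SAMPLE_RATE / f_high)
    lag_max = int(SAMPLE_RATE / f_low)
    window = len(frame) - lag_max  # same number of products for every lag
    best_lag, best_value = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        value = sum(frame[i] * frame[i + lag] for i in range(window))
        if value > best_value:
            best_lag, best_value = lag, value
    return SAMPLE_RATE / best_lag if best_lag else 0.0


def f0_contour(samples, frame_len=320, hop=80):
    """Slide a 40 ms frame over the signal in 10 ms steps (at 8 kHz)."""
    starts = range(0, len(samples) - frame_len + 1, hop)
    return [estimate_f0(samples[s:s + frame_len]) for s in starts]


def summarise(contour):
    """Return (minimum, maximum, mean) F0 over the voiced frames only."""
    voiced = [f for f in contour if f > 0]
    return min(voiced), max(voiced), sum(voiced) / len(voiced)


# Synthetic stand-in for a recording: 0.2 s of a steady 140 Hz tone.
tone = [math.sin(2 * math.pi * 140.0 * n / SAMPLE_RATE) for n in range(1600)]
contour = f0_contour(tone)
lowest, highest, mean = summarise(contour)
print(f"Fmin={lowest:.1f} Hz  Fmax={highest:.1f} Hz  Favr={mean:.1f} Hz")
```

Because the lag is an integer number of samples, the estimate is quantised; at 8 kHz the values around 140 Hz are a couple of hertz apart, so small deviations from the true frequency are expected.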

4. Test results

4.1. Acoustic parameter fluctuation

The basic frequency F0 is strongly influenced by the pitch (high or low) of the speaker's voice. Generally, the higher the speaker's basic tone, the greater the range between Fmin and Fmax. This is shown in Picture 4, which presents the minimum and maximum frequencies of sample compound negation sentences. The same utterances were chosen for every person.

The message of the utterance is of lesser importance. Table 1 presents the minimum, maximum and mean frequencies in different utterances. The type of utterance needs special attention. Exclamatory and interrogative sentences have a broader frequency range. Generally, the highest values of Fmax occur in exclamatory and imperative sentences.

Picture 3. Tone contour of a sentence read by women (two above, female) and men (two below, male).

Tone contour observations showed that the contours of the sentences produced by the participants do not differ significantly in shape. This is clearly seen in fig. 1.4, which presents the intonation lines of the same compound sentence produced by each of the participants. The upper two are the women's, the lower two the men's.

There were differences in the accent intensity of particular words in a sentence. The message and the speaker's interpretation were crucial here. Picture 6 presents the tone contours for the sentence “Jutro pojadę na wycieczkę, albo zostanę w domu” (I will go on a trip tomorrow or stay at home). Three of the speakers decided that when they would go was more important, and one that the very fact of going on a trip was significant.

Picture 4. Minimum and maximum frequencies of sample negation sentences.

Table 1. Minimum, maximum and mean frequencies (Hz) in different utterances

                            Woman_1          Woman_2           Man_1            Man_2
                         Fmin Fmax Favr   Fmin Fmax Favr   Fmin Fmax Favr   Fmin Fmax Favr
conjunction sentences     176  265  231    133  236  206    127  212  154     93  124  110
                          191  275  237    157  235  201    126  213  162     93  123  110
disjunction sentences     180  292  245    160  244  206    119  204  156     89  135  112
                          170  288  243    161  253  213    136  200  170     93  126  111
negation sentences        224  277  244    136  288  211    125  233  167    103  136  112
                          208  285  244    174  252  211    133  197  177    132  197  107
complex sentences         179  273  236    139  256  207    132  182  161     92  123  110
                          166  287  242    156  262  214    142  197  171    100  136  113
implication sentences     204  285  241    110  267  157    129  227  168     99  138  111
                          177  276  238    166  253  202    134  219  181    105  177  113
exclamatory sentences     199  298  275    195  292  261    146  273  214    104  194  162
                          137  297  263    171  247  221    126  225  176    102  200  166
interrogative sentences   175  282  175    110  266  223    103  209  103    106  173  106
                          233  277  233    182  256  223    146  200  146    130  146  130
declarative sentences     126  266  222    175  271  234    127  223  171    102  144  122
                          183  269  223    158  274  211    113  225  160    108  155  124
imperative sentences      190  286  247    167  284  238    136  219  181    111  161  142
                          194  282  235    182  290  251    125  206  159    103  174  150
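The range between Fmin and Fmax discussed in this section can be computed directly from Table 1. The sketch below uses the first "negation sentences" row of the table as input. The conversion to semitones, 12·log2(Fmax/Fmin), is the standard musical interval formula and is added here as an assumption about how one might compare speakers with different basic tones; it is not part of the original analysis. The names `NEGATION_ROW` and `f0_range` are hypothetical.

```python
import math

# (Fmin, Fmax, Favr) in Hz, copied from the first "negation sentences"
# row of Table 1.
NEGATION_ROW = {
    "Woman_1": (224, 277, 244),
    "Woman_2": (136, 288, 211),
    "Man_1": (125, 233, 167),
    "Man_2": (103, 136, 112),
}


def f0_range(f_min, f_max):
    """Return the F0 range in Hz and, as a relative measure, in semitones."""
    return f_max - f_min, 12 * math.log2(f_max / f_min)


for speaker, (f_min, f_max, f_avr) in NEGATION_ROW.items():
    hz, semitones = f0_range(f_min, f_max)
    print(f"{speaker}: Favr={f_avr} Hz, range={hz} Hz ({semitones:.1f} semitones)")
```

A range in hertz naturally grows with the speaker's basic tone, whereas the semitone figure is relative to it, so printing both helps separate the two effects.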

Picture 5. Tone contour of a compound sentence read by women (two above, female) and men (two below, male).

Picture 6. Tone contour of the sentence “Jutro pojadę na wycieczkę, albo zostanę w domu” (I will go on a trip tomorrow or stay at home) read by women (two above, female) and men (two below, male).


4.2. Syntax analysis

A comparison of the intonation lines of the various compound sentences failed to show any significant differences. There were utterances whose tone contour did not follow the general model but, as previously mentioned, this was due to the speaker's interpretation.

The contour of such an utterance consists of two rise-and-fall parts. Usually the accented words belong to the subject, and here the tone contour is high. When the conjunction appears, there is again a rise.

Complex sentences have a different F0 graph. Regardless of the sentence construction (main clause before the subordinate one or vice versa), the tone contour falls.

Interrogative sentences have a rising intonation line, with the strongest rise occurring in the last words. In imperative and exclamatory sentences the intonation line first rises sharply, then falls. In declarative sentences the tone contour also rises and falls, but not as abruptly as in imperative and exclamatory sentences. The figure below shows the intonation lines of interrogative, exclamatory, imperative and declarative sentences.
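The contour shapes described above (a rise towards the end, a sharp rise followed by a fall, a gentler rise and fall, a fall) can be told apart by two crude features: where the F0 maximum sits in the utterance and how steep the steepest rise is. The sketch below is only an illustration under that assumption. The function name `describe_contour`, the 20% edge threshold and the sample contours are all made up for the example; none of them are measured data from the study.

```python
def describe_contour(contour, edge=0.2):
    """Label a voiced F0 contour (Hz per frame) by the position of its peak.

    Returns (shape, steepest_rise), where steepest_rise is the largest
    frame-to-frame increase in Hz.
    """
    peak_index = contour.index(max(contour))
    peak_pos = peak_index / (len(contour) - 1)  # 0.0 = start, 1.0 = end
    if peak_pos >= 1 - edge:
        shape = "rising"
    elif peak_pos <= edge:
        shape = "falling"
    else:
        shape = "rise-fall"
    steepest_rise = max(b - a for a, b in zip(contour, contour[1:]))
    return shape, steepest_rise


# Invented contours, for illustration only.
question = [180, 185, 190, 200, 230, 270]            # strongest rise at the end
exclamation = [200, 250, 295, 260, 220, 190, 170]    # sharp rise, then fall
statement = [190, 200, 210, 215, 205, 195, 185]      # gentle rise and fall
complex_sentence = [220, 210, 200, 190, 180]         # contour falls

for name, c in [("question", question), ("exclamation", exclamation),
                ("statement", statement), ("complex", complex_sentence)]:
    print(name, describe_contour(c))
```

In this sketch the exclamation and the statement share the "rise-fall" label and differ only in the steepness value, which mirrors the distinction drawn in the text between abrupt and gradual changes.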

5. Conclusion

The correct and natural use of intonation is very difficult to accomplish. A person pronouncing a sentence knows exactly what he or she is trying to say and knows the meaning of the words used. A lot of information is communicated through the accurate prosody of the spoken text. To get the best results, not only the sentence construction should be analysed, but also the meaning and layout of its words. For this reason such tests are crucial for obtaining the best possible quality of synthesised speech. Knowledge of the intonation and eurhythmics of particular types of utterances will make natural speech generation possible. In the future, such a synthesiser will surely be commonly used.

The expected results can be applied in further research in applied linguistics, especially in the study of the phonetics and prosody of the Polish language and in expanding the theoretical framework for multilingual speech communication systems. The project has great relevance for economic and social fields. The obtained results will facilitate the development of new areas of business activity and services in Poland connected with the creation of a speech synthesiser. The speech synthesiser can be used in audio servers to provide information to users of telephone banking and of cultural and tourist information telephone services. It makes possible round-the-clock telephone transmission of required information by means of speech, including on-line telephone information services. Another possible application of the synthesiser is a socially oriented system, such as computer-based transmission of textual information by means of voice to the sick, the socially disabled and the blind.

This work will be extended by a project on the opposite process: recognising speech and writing it down in the form of text, that is, Speech-to-Text (STT) conversion. Recognition and synthesis methods for audio-visual patterns will be developed.


Acknowledgement

This paper was supported by the European Commission under grant INTAS Ref. number 04-77-7404. The authors wish to express their thanks for the support.

