Ioan-Iovitz Popescu, Mihaiela Lupea, Doina Tatar and Gabriel Altmann Quantitative Analysis of Poetic Texts



  • Quantitative Linguistics

    Editors: Reinhard Köhler, Gabriel Altmann, Peter Grzybek

    Advisory Editor: Relja Vulanović

    Volume 67

  • Ioan-Iovitz Popescu, Mihaiela Lupea, Doina Tatar and Gabriel Altmann

    Quantitative Analysis of Poetic Texts

    De Gruyter Mouton

  • ISBN 978-3-11-033605-4
    e-ISBN (PDF) 978-3-11-036379-1
    e-ISBN (EPUB) 978-3-11-039479-5
    ISSN 0179-3616

    Library of Congress Cataloging-in-Publication Data
    A CIP catalog record for this book has been applied for at the Library of Congress.

    Bibliographic information published by the Deutsche Nationalbibliothek
    The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

    © 2015 Walter de Gruyter GmbH, Berlin/Boston
    Printing and binding: CPI books GmbH, Leck
    Printed on acid-free paper
    Printed in Germany

    www.degruyter.com

  • Foreword

    The study of language is a journey without end, not only because there are many languages but also because there is an unlimited number of texts. Everyone produces several texts every day, and the only way to learn about languages (and not only about languages) is the Sisyphean analysis of this infinite number of texts. Usually, a given problem whose solution is sought presupposes some definitions, some conventions and some hypotheses. The definitions concern concepts which are created by the researcher and enable her/him to describe and classify. This is the mandatory starting point of any analysis. If we want to proceed to the next level, we must try to test some hypotheses about the properties and the behavior of the given classes. But even if we succeed in doing so and capture the results in the form of a model, we see that each of the classes can be further scrutinized and split into new classes according to another property. For example, having classified the words of a text into parts of speech (level 1), we state that the nouns have different grammatical functions (level 2); in turn we may state that every grammatical function contains elements differing in their polysemy (level 3), etc. This procedure does not differ from that in physics or astronomy. The only difference is that language is a cultural product; its analysis necessitates rather complex methods.

    The next complication, shared also with biology, is the variability of texts. Texts are written by persons of different age, education, gender, mother tongue and social status; they are written or uttered for different aims, in parts or as a whole (text sorts); they describe different (existing or imaginary) matters and sometimes adhere to quite different restrictions (e.g. meter, rhythm, rhyme). And each time we decide to analyze some of these aspects, find a mathematical model, test it and subsume the discovered regularity under a system, we discover a new aspect. Somewhere on this endless wandering we shall meet psychologists, biologists and physicists and will be forced to take into account their view of things.

    In the present volume we took a very restricted path: we analyzed the poetic work of the Romanian author Mihai Eminescu and tried to show at least two aspects of texts, viz. the phonetic aspect (Ch. 2) and the vocabulary (Ch. 3). The control cycle is presented in Ch. 4. We hope that the methods will be used for analyzing many different texts.

    We want to express our gratitude to several colleagues who helped, advised, corrected and improved the book. In the first place, it was Reinhard Köhler, who spent more time on corrections to this book than on writing his own. Other colleagues, Relja Vulanović, Ján Mačutek, Gabriela Pană Dindelegan, Claudiu Vasilescu, Sorin Vizireanu, and Dan Zotta, helped us with the English language, mathematics, computing, graphics, and Romanian, and we are glad that they did not end up with shattered nerves.

    Ioan-Iovitz Popescu Mihaiela Lupea

    Doina Tatar Gabriel Altmann

  • Contents

    Foreword

    1 Introduction

    2 Phonic phenomena
      2.1 Occurrence without pattern
        2.1.1 Phoneme frequencies
        2.1.2 Euphony in general
      2.2 Assonance
        2.2.1 The diagonal
        2.2.2 Symmetry
        2.2.3 Poem length and significant sequences
      2.3 Alliteration
      2.4 Aggregation
      2.5 Rhyme
        2.5.1 Word length
        2.5.2 Open and closed rhymes
        2.5.3 Masculine and feminine rhyme
        2.5.4 Parts of speech in rhyme words

    3 The word
      3.1 Introduction
      3.2 Frequency distribution
        3.2.1 Stratification
        3.2.2 Ord's criterion
        3.2.3 The lambda indicator
        3.2.4 Entropy and repeat rate
        3.2.5 Gini's coefficient
        3.2.6 Geometric properties
          3.2.6.1 The triangle
          3.2.6.2 Writer's view and the golden section
      3.3 Vocabulary richness
      3.4 Word length
        3.4.1 Ord's scheme
        3.4.2 Word-length distribution
      3.5 Word classes (parts of speech)
        3.5.1 Frequencies
        3.5.2 Descriptiveness vs. activity
        3.5.3 Runs
          3.5.3.1 Sequential dependence
          3.5.3.2 Run length
          3.5.3.3 Placing tendency

    4 The control cycle

    References

    Index

  • 1 Introduction

    Poetic texts can be analyzed from an infinite number of viewpoints, just like any text and the whole of human behaviour. Every viewpoint is interesting for some scientific discipline, and the number of viewpoints increases with the advancement of science. Our aim is very restricted but, nevertheless, it opens up an infinite domain of new problems. And every problem can be solved in different ways. Hence, there is a path without end, wherever one begins and in whatever direction one goes.

    In the present volume, we shall concentrate on a small number of methods used in the study of poetic texts and apply them to some already quantified textual properties. Our textual examples are poems; they are often short, and each result can be checked even without the use of a computer. Besides, the study of the phonic structure of poems is reasonable because, according to R. Jakobson, in poetry the form stays in the foreground. In prose, the phonic structure is not as prominent as in poetry, and the rhythmic structure of prose depends also on the character of the given language; it is seldom a conspicuous property of a single text. Nevertheless, there is a discipline engaged in the study of prose rhythm.

    The methods presented in this study are applied to a corpus of 150 Romanian poems (including also a few outliers) written by Mihai Eminescu, as they can be found in many editions of his works, in texts analysing his works, or on the Internet: http://ro.wikisource.org/wiki/Autor:Mihai_Eminescu.

    In the present investigation, quantitative methods proven and tested in studies of prose texts, including methods for text comparison, are applied to poetic texts. Inter-sort or inter-language comparisons are frequently somewhat futile because each genre and each language has its own characteristic ways of text creation; hence most of the properties are significantly different. A statistical test simply emphasizes this expectation.

    We shall study phonic features, word-form frequencies, word length, word classes, and the semantic structure of the poems, revealing some parts of the author's world of associations. Each of them has many facets, but we concentrate rather on methods and methodology.

    An obvious question at the beginning of any book on text studies is: What is a text? However, in contemporary science, such essentialist questions are rather outdated. They require determinations of a kind of Kantian noumenon, the essence of a thing, which does not exist or, expressed in a weaker form, would not explain anything, because explanations form an infinite hierarchy whereas the essence would be a final (and therefore not acceptable) station on this way. Hence the only rational question is: What do we consider as a text? For the purpose of the present study, a text is a linear sequence of meaningful entities, organized also hierarchically (e.g. in the hierarchy sentence, clause, phrase, word, morpheme, syllable, phoneme). In linguistics, we restrict ourselves to spoken or written material, but even within this restricted field we find exceptions. Hypertexts, e.g. on the Internet, which are full of pictures and links, or texts in comics, etc., belong to the domain of intertextuality. Of course, one can study them, too, from various points of view, but they are not standard texts of the kind we are interested in. The texts of our interest are written in some script, and their entities have not only a purpose (like the kitchen in a house) but also a meaning, i.e., they refer to objects outside of the text. Nevertheless, even under this restriction, they have many properties in common with other linear sequences, and consequently many methods used in non-linguistic disciplines can be applied also in linguistics.

    In quantitative linguistics, the explication of a text is not one of the aims or results of the research activities, nor is the description of the content or its evaluation (whether aesthetic or stylistic). Quantitative-linguistic research aims at finding regularities which arise due to the effect of possibly still unknown background laws. These regularities should not be confused with grammatical rules, which can be learnt or changed or even violated, and which appear, in a manner of speaking, on the surface of the texts. We rather search for textual phenomena which are evoked by, and are evidence of, certain background mechanisms. We shall never know all of them, but approaching the matter stepwise allows us to penetrate deeper and deeper.

    There are five main approaches to text analysis (cf. Altmann 2007, 2009):

    (1) The static approach is concerned with the text as a whole, comprising the computation of all known properties, stylistic studies, evaluation of frequencies of different phenomena, lengths, polysemy values, word associations, measurement of grammatical structures, rankings, diversifications, classifications, denotative structures, measurement of differences, entropies, etc. This means that the text is dissected into well-defined units whose properties are studied. For this approach, at least elementary statistical methods are indispensable. Among the obvious tools, mathematical graphs and their properties provide easy ways to describe and display phenomena and relations.

    (2) The sequential approach considers text as a linear sequence of entities forming time series, runs, Markov chains, reference chains, etc. These entities comprise degrees of properties, frequencies, metrical feet, distances between elements of the series, etc., and the position of certain degrees of a property in a higher unit, e.g. word length positioning in the given sentence. This approach is more complex and frequently requires more complex methods. Corresponding mathematical models may be based on differential and difference equations.

    (3) A systemic approach can be started when some of the problems in the first two domains have been solved. This approach focuses on relations between entities, properties and structures which form control cycles and display the self-regulation mechanisms of text. Though we know that texts are produced by authors, who consciously obey only grammatical rules and maybe rules of text structure, there are also latent, subconscious forces which compel the speaker/writer to form the text in a special way, e.g. reducing the decoding effort, reducing the memory effort, reducing sentence difficulty, increasing originality, etc. The writer is free with respect to the content but not free with respect to the external form of the text: s/he must abide by some laws if s/he wants to be understood. The axiom concerning the non-existence of isolated entities in language and text is a sufficient motivation for the systemic approach. Investigations of this kind are known from so-called synergetic linguistics (cf. Köhler 2005) and comprise both language and text.

    (4) The typological approach consists of comparing all the above-mentioned properties as they occur in texts of different languages, placing the languages and texts on different scales, building fuzzy classes and studying the variability of various phenomena. Though text analysis has played a secondary role in this research, its importance is receiving new impetus (cf. e.g. Kelih 2009; Popescu, Mačutek, Altmann 2009). However, the notorious classifications based on categorical concepts do not yield anything but new, more general concepts. We need them, but they seldom lead to theoretical progress.

    (5) The chaos-theoretical approach. All aspects mentioned above contain some elements of chaos, which lies in a deeper layer of all text phenomena. Some phenomena, e.g. fractals, dimensions and attractors, are identifiable, but because of their indirect relevance for the text sciences and also because of the computational effort they require, they have not yet been sufficiently discussed (cf. Hřebíček 1997, 2000; Andres 2010; Andres, Benešová 2011).

    Ideally, a quantitative text analysis engages three different specialists. This is because at the beginning of the research, it is always the task of a linguist/text scientist to set up a hypothesis with linguistic relevance. No hypothesis, no quantitative text research! The linguist states what kind of data would be relevant for testing the hypothesis and the programmer tries to elicit them from texts. As opposed to facts and phenomena, data are not just given; they are the result of a scientific activity, they are constructed. To a text scientist, text is the matter from which data are conceptually constructed. In the meantime, the mathematician translates the verbal hypothesis into the language of mathematics, i.e. formulates it as a statistical hypothesis. At the same time s/he tries, together with the linguist, to find the mechanism that can lead to the rise of the given phenomenon. In other words, the mathematician tries to set up a model of the phenomenon and to subsume it under an existing theory, to embed it in a system of similar hypotheses. The programmer tests the hypothesis on her/his data and the mathematician interprets the outcome statistically. The results of the test are translated into the everyday language of linguistics, and the linguist interprets the result linguistically. Hence, the succession of persons in text analysis is: linguist > mathematician > programmer > mathematician > linguist. The linguist is placed at the beginning and the end of this procedure and warrants the linguistic relevance of the problem at the beginning and the relevance of the results at the end. Needless to say, mathematicians and programmers frequently propose excellent ideas; a sound cooperation yields the most reasonable results.

    Texts are sources also for other disciplines such as psycholinguistics, sociolinguistics, dialectology, language teaching, etc., in which the respective experts determine the course of research.

    Another obvious question is: What can be considered as poetry? The first answer is: Poetry is a kind of literary art where evocative and aesthetic effects are based on form, in addition to (sometimes: instead of) meaning. This volume aims at investigating the universal laws and interrelations of aspects connected with consciously formed texts under consciously imposed form restrictions.

    There are many commonalities in these texts, but none of the properties can be assumed to be a necessary condition. Rhyme, rhythm, meter, the existence of verse lines, strophes, a fixed number of lines (as in sonnets), meaning, etc., can be found in many but not in all poems. We must rely on the judgement of literary historians, making allowance for the existence of outliers which may even undermine our theories. They can often be rendered harmless by introducing boundary or subsidiary conditions.

    A large part of quantitative characterisation is performed by means of indicators. Many of them tell the same story, but their interpretation may be different. But if they tell the same story, then there is a clear link between them, even when their method of computation is different.

    The indicators should have at least the following properties (cf. Galtung 1967; Grotjahn, Altmann 1988; Wimmer et al. 2003: 25ff):

    (1) Meaning. This seems to be quite natural, but many indicators arise in the form of a proportion which does not have a clear interpretation. The indicator must tell us what it describes.

    (2) Simplicity, especially at the beginning of research, because it eases computation and the mathematical treatment. It is advantageous to express different properties with different indicators.

    (3) Variation interval. If an indicator varies in the interval [0, ∞), a given value of this indicator cannot be interpreted. Every number can be considered large (with respect to the lower limit 0) or small (with respect to the upper limit ∞). It is therefore reasonable to restrict the value to a finite interval by means of normalization.

    (4) Sampling distribution. This property of an indicator is indispensable for a reliable evaluation of the measured values. It gives information about the frequency or probability of the individual values of the indicator, information which is fundamental for any statistical assessment. Unfortunately, this requirement is still ignored in the humanities in many cases. In order to apply an indicator, e.g. for comparisons, one should know at least its variance, which is needed for asymptotic tests. Exact probabilities can be computed only when the distribution of the indicator is known. The application of non-parametric statistics, a well-established technique, is an alternative.

    (5) Reliability is the measure of exactness and stability. The indicator should be stable and express the same property in all cases.

    (6) Validity means that the indicator truly expresses the studied property. An illustrative example in this respect is the large number of available measures of vocabulary richness, whose validity is an open question.

    But all this cannot be achieved in an elementary, preliminary investigation. Research always begins with a first step and improves its argumentation step by step: it sets up more complex hypotheses, extends the investigations to other languages and, based on the surface phenomena expressed by indicators, takes further steps towards a theory. A theory is a system of interrelated hypotheses, some of which can be considered laws, i.e. general statements derived from axioms or other laws (in other words, anchored in antecedent knowledge) and empirically well corroborated (cf. Bunge 1967).

    In mainstream linguistics, the term "theory" is misused. It stands, as a rule, for concepts, isolated phenomena, descriptive approaches, sets of facts, classifications and sets of rules. All that, and even strict definitions (which are no more than conventions) and a preceding formalization, do not have the status of a theory. The mentioned definitions and formalisations are merely necessary but not sufficient conditions for the construction of a theory. A theory begins to arise when we derive hypotheses from antecedent knowledge, test them empirically and join them with a system of universal, corroborated statements. This is, of course, not a simple task, because language is not a deterministic system with clear-cut units and relations. Though it is always in a steady state, it varies with every speaker, it changes incessantly, and communication is possible only because of its self-regulation. A speaker can and does change elements, but if s/he aims at communicative success, the change must not surpass a certain limit. With every change, the limit is shifted by a tiny quantity. Since this shift is always advantageous from the point of view of the speaker (s/he is the actor in this play), the phenomena in language are never distributed according to the normal (Gaussian) distribution. Every distribution in language is skewed. Nevertheless, values of whatever property taken from many texts may display normality in a statistical sense (a situation that can be tested), and a comparison with other text groups is possible by means of an asymptotic test based on normality.

    The greatest advancements in every empirical science are achieved by introducing mathematical methods. Mathematics is a warrant of exactness, testability, deducibility, and systematisation, and it gives us the chance to predict phenomena which are not visible on the surface of texts. In spite of this, there are still objections against the application of mathematical instruments in literary studies. Such objections can be heard from hard-core poeticists relying only on intelligent verbal descriptions. The corresponding arguments have been analysed (cf. Altmann 1999; Wimmer et al. 2003: 14ff) and will be reproduced here for the sake of clarity. The objections are:

    1. Our objects cannot be quantified/mathematised.
    2. Even if it were possible, we are not interested in numbers but in qualities and properties.
    3. We are not interested in laws but in the uniqueness, the idiosyncrasy of texts.
    4. Our problems are so complex that no mathematics can capture and express them.

    Evidently, all these objections arise either from misunderstanding and can easily be appeased, or from a negative attitude towards mathematics, and in that case they cannot be removed.

    Objection 1 is rooted in false epistemology. We do not mathematise real objects; we quantify/mathematise only our concepts and ideas about them. Objects do not contain numbers which could be observed. Properties are first conceptually constructed, then quantified and at last measured. These measures are ascribed to objects. Properties are always gradual (cf. Bunge 1983: 187f; 1995), hence quantification is the best way to perform exact research. If the concepts of objects or properties formed by a researcher cannot be quantified, then we have to conclude that these concepts are too poor or too unclear. In qualitative research only inexact expressions like "very warm", "many", "frequently", etc. occur; in extreme cases the property is dichotomised (a relic of structuralism) and loses the major part of the information. Qualitative concepts of properties are the ontogenetic heritage of our language, in which numbers appear later on, as can be observed also in children's development. But if we admit that qualities and quantities do not exist in reality but only in our concepts, objection 1 becomes irrelevant.

    Objection 2, just like objection 1, confuses epistemology with ontology. No scientist is interested in numbers as such, but numbers are the best way to capture our conceptual entities exactly. Reality is neither qualitative nor quantitative. It simply exists. With the help of our concepts, we simply try to capture it in order to improve orientation and knowledge, and to survive. The information we obtain from reality consists merely of weak electrical impulses entering our brain via our senses, and the brain has to construct a (partial) map of reality on this basis. Reality is (re-)constructed by means of concepts, which are primordially qualitative, and natural language helps us by codifying them. Science, however, requires more exact concepts, viz. quantitative ones (cf. Bunge 1967, 1983; Stegmüller 1970; Essler 1971). Disciplines working with quantitative concepts develop more rapidly than others.

    Objection 3 is an evident error. Idiosyncrasy can be stated only as a contrast to a general background or as a difference from other texts. In any case, comparison is necessary. But if a text is said to represent an idiosyncrasy, the significance of the difference must be shown. This can be done only by means of a statistical test; the indication of the difference, even if it is given in an exact form, would not suffice. In literary studies, in corpus linguistics, and in computational linguistics, texts and methods are frequently compared by indicating a certain numerical difference, sometimes in the form of percentages. On this basis alone, conclusions are drawn such as "Method X is superior by 3 %" or "Text 1 possesses more of property X than text 2: Text 1 has 70 scores and text 2 only 68". These are proto-scientific statements, no more than opinions; they ignore that the difference may be a random result or due to an inexact measurement.
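    To illustrate why a bare difference such as "70 versus 68" is not yet a result, the following minimal sketch applies an asymptotic test to these two scores. It rests on assumptions that are not part of the statement above: the scores are treated as counts of occurrences of property X, both texts are assumed to contain 1000 units, and a pooled two-proportion z-test is chosen as one possible test. The function name two_proportion_z is hypothetical.

```python
import math


def two_proportion_z(k1: int, n1: int, k2: int, n2: int) -> float:
    """Asymptotic z statistic for the difference of two proportions k1/n1 and k2/n2."""
    p1, p2 = k1 / n1, k2 / n2
    # Pooled proportion under the null hypothesis of no difference
    pooled = (k1 + k2) / (n1 + n2)
    # Standard error of the difference under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Scores 70 and 68 come from the example above; the text size of
# 1000 units per text is an assumption made only for illustration.
z = two_proportion_z(70, 1000, 68, 1000)
# Two-sided test at the 5 % level: critical value 1.96
significant = abs(z) > 1.96
print(f"z = {z:.3f}, significant: {significant}")
```

    Under these assumptions the statistic stays far below the critical value, so the difference of two scores could easily be a random result.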

    Objection 4 is again a misunderstanding. Every statement, including scientific ones, is a simplification, whether it is given verbally or by means of mathematics. An object cannot be captured in its entirety, especially because we do not even know what its entirety is. Only a small number of aspects of an object can be brought into focus. In some theories, the inevitable separation into "relevant" and other aspects is made explicit in the ceteris paribus condition, which can be presented in the form of a disturbance constant or a special function which weakly contributes to the main independent variable. Furthermore, mathematical concepts and quantitative methods are obviously the only imaginable way to describe and analyse complex structures and processes, including fuzzy ones.

    There are two ways to perform text analyses: (a) the comparison of texts and text sorts written by different authors in one or more languages, together with the study of the outcomes of text laws; or (b) the study of an individual author, one text sort, and one language, with a description of the properties of the given set of texts and the theoretical search for the latent mechanisms which brought about the given phenomenon. In this volume, we focus on the poetic work of the famous Romanian poet Mihai Eminescu and try to characterise it, show some relations and realisations of text laws, and indicate perspectives for future research.

  • 2 Phonic phenomena

    2.1 Occurrence without pattern

    2.1.1 Phoneme frequencies

    The usual way to capture phonic phenomena in texts is to consider sound/phoneme frequencies, either absolutely, relatively or associated with a position. While in prose mostly the first view is practised, in poetry patterns of sounds occur whose existence or positioning displays a kind of statistical trend. The most common one is rhyme, which is created consciously, whereas in other kinds of poetry and in various languages phenomena such as alliteration, assonance, spontaneous aggregation, etc. are also observed.

    Before scrutinising these specific phenomena, we will focus on the study of phoneme frequencies. We suppose that even in poetry (unless there is a special aim, as in Dadaistic poetry) the phonemes abide by the stratification law, a general hypothesis which was proposed as an alternative to Zipf's formula (cf. Popescu, Altmann, Köhler 2010). Nevertheless, Zipf's power law or the Zipf-Alekseev function can also be used where the data are less complex. The stratification approach aims at finding the number of strata formed by the given entities. In short texts, there is usually only one layer; in longer texts, stratification becomes more obvious. This regularity holds for any kind of entities. In order to demonstrate this regularity on the phonic level, we first present the phonic analysis of Romanian and its phonemic interpretation as well as the transcription of letters into phonemes.
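    As a minimal sketch of the first step of such a study, the ranked phoneme frequencies of a transcribed line can be obtained as follows. The input is the first line of the poem Lacul in the slash-delimited transcription used later in this chapter; the function name phoneme_ranks and the alphabetical tie-breaking are assumptions made for illustration, and the sketch does not fit the stratification or Zipf models themselves.

```python
from collections import Counter


def phoneme_ranks(transcription: str) -> list[tuple[str, int]]:
    """Return (phoneme, frequency) pairs ordered by decreasing frequency."""
    # Phonemes are delimited by slashes; blanks separate words.
    phonemes = [p for p in transcription.replace(" ", "/").split("/") if p]
    counts = Counter(phonemes)
    # Rank by descending frequency; ties are broken alphabetically (an assumption).
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# First line of "Lacul": Lacul codrilor albastru
line = "/l/a/k/u/l/ /k/o/d/r/i/l/o/r/ /a/l/b/a/s/t/r/u/"
ranked = phoneme_ranks(line)
for rank, (phoneme, frequency) in enumerate(ranked, start=1):
    print(rank, phoneme, frequency)
```

    The resulting rank-frequency sequence is the kind of data to which a ranking function would subsequently be fitted.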

    In Romanian phonology, the phoneme inventory consists of seven vowels (strong vowels, syllabic vowels), one voiceless (non-syllabic) vowel, two or four semivowels (different views exist and we will work with the four semivowel version) and twenty-two consonants. The vowel i can occur at the end of a syllable which already contains a syllabic vowel. In this case, i is a non-syllabic (voiceless) vowel.

    A semivowel (weak vowel) is phonetically similar to a vowel (strong vowel) but functions as a syllable boundary rather than as the nucleus of a syllable and is shorter than the corresponding vowel. Out of the total number of seven vowels, only four can behave as semivowels, which are involved in special groups of phonemes called diphthongs and triphthongs. A diphthong consists of two adjacent vowels occurring within the same syllable: one vowel (strong vowel) and one semivowel (weak vowel). A triphthong is the uninterrupted combination of three vowels in the same syllable: a strong vowel and two semivowels (the strong vowel is usually in between the semivowels). The list of phonemic transcriptions of graphemes is presented in Tables 2.1.1.1 to 2.1.1.4.

    Table 2.1.1.1: The phoneme-grapheme relation for vowels and semivowels in Romanian

    No.  Phoneme (IPA)  Grapheme  Type                            Position                                              Example in Romanian
    1    /a/            a         strong vowel                                                                          apa
    2    /ə/            ă         strong vowel                                                                          părinte
    3    /ɨ/            â, î      strong vowel                                                                          cânta; coborî, înainte
    4    /e/            e         strong vowel                                                                          erou
    5    /e̯/            e         weak vowel                      in diphthongs and triphthongs                         stea, /e̯/a/ (diphthong); doreai, /e̯/a/j/ (triphthong)
    6    /i/            i         strong vowel                                                                          inel
    7    /i/            i         non-syllabic (voiceless) vowel  at the end of a syllable containing a syllabic vowel  flori, i, orice, galbeni
    8    /j/            i         weak vowel                      in diphthongs and triphthongs                         mai, /a/j/ (diphthong); doreai, /e̯/a/j/ (triphthong)
    9    /o/            o         strong vowel                                                                          ora
    10   /o̯/            o         weak vowel                      in diphthongs and triphthongs                         coasă, /o̯/a/ (diphthong); pleoape, /e̯/o̯/a/ (triphthong)
    11   /u/            u         strong vowel                                                                          durere
    12   /w/            u         weak vowel                      in diphthongs and triphthongs                         nou, /o/w/ (diphthong); vreau, /e̯/a/w/ (triphthong)

    Table 2.1.1.2: The phoneme-grapheme relation for consonants in Romanian

    No.  Phoneme (IPA)     Grapheme  Example in Romanian        Example in English
    1    /b/               b         bine                       book
    2    /k/               c, k, q   curaj, karate, quasar      close
    3    /tʃ/              c         cer, cireșe                chest
    4    /kʲ/              ch, k     chemare, chipeș, kilogram  kept
    5    /d/               d         dar                        day
    6    /f/               f         foc                        face
    7    /g/               g         greu                       gold
    8    /dʒ/              g         ger, gingaș                gist
    9    /gʲ/              gh        ghem, ghiocel              get
    10   /h/               h         harnic                     hat
    11   /ʒ/               j         joc                        pleasure
    12   /l/               l         lac                        lake
    13   /m/               m         mac                        moon
    14   /n/               n         nor                        name
    15   /p/               p         parc                       pan
    16   /r/               r         rac                        rain
    17   /s/               s         soare                      sun
    18   /ʃ/               ș         șarpe                      shape
    19   /t/               t         tare                       time
    20   /ts/              ț         ar                         its
    21   /v/               v, w      val, watt                  voice
    22   /z/               z         zori                       zone
    23   /k/+/s/, /g/+/z/  x         excursie, examen           exception

    Table 2.1.1.3: Romanian diphthongs

    No.  Grapheme  Phonemic transcription  Example in Romanian
    1    ai        /a/j/                   mai
    2    au        /a/w/                   dau
    3    ea        /e̯/a/                   stea
    4    ei        /e/j/                   trei
    5    eo        /e̯/o/                   vreo
    6    eu        /e/w/                   leu
    7    ia        /j/a/                   băiat
    8    ie        /j/e/                   miere
    9    ii        /i/j/                   fii
    10   io        /j/o/                   iobag
    11   iu        /i/w/, /j/u/            auriu, iubire
    12   oa        /o̯/a/                   soare
    13   oi        /o/j/                   foi
    14   ou        /o/w/                   nou
    15   ua        /w/a/                   ziua
    16   ue        /w/e/                   înșeuez
    17   ui        /u/j/                   pui
    18   uu        /u/w/                   continuu
    19   uă        /w/ə/                   două
    20   uâ        /w/ɨ/                   plouând
    21   ăi        /ə/j/                   răi
    22   ău        /ə/w/                   rău
    23   âi, îi    /ɨ/j/                   câine, îi dau
    24   âu        /ɨ/w/                   râu

    Table 2.1.1.4: Romanian triphthongs

    No.  Grapheme  Phonemic transcription  Example in Romanian
    1    eai       /e̯/a/j/                 doreai
    2    eau       /e̯/a/w/                 mergeau
    3    eoa       /e̯/o̯/a/                 pleoape
    4    iai       /j/a/j/                 voiai
    5    iau       /j/a/w/                 tiau
    6    iei       /j/e/j/                 piei
    7    eu        /j/e/w/                 eu
    8    ioa       /j/o̯/a/                 creioane
    9    ioi       /j/o/j/                 i-oi da
    10   iou       /j/o/w/                 maiou
    11   oai       /o̯/a/j/                 orzoaică
    12   uai       /w/a/j/                 înșeuai
    13   uau       /w/a/w/                 înșeuau
    14   uăi       /w/ə/j/                 rouăi

    Syllabification is very important for the identification of diphthongs and triphthongs and, finally, for the phonemic transcription. Some special cases of phonemic transcription with syllabification are presented below.

    1. The grapheme e at the beginning of personal pronouns is transcribed as

    follows:

    eu (I) /j/e/w/ ea (she) /j/a/ el (he) /j/e/l/ ele (they - feminine) /j/e/ - /l/e/ ei (they - masculine) /j/e/j/

    2. The grapheme e at the beginning of the forms (different tenses) of the

    verb a fi (to be ) is transcribed as /j/e/:

    e /j/e/ este /j/e/s/ - /t/e/ eram /j/e/ - /r/a/m/

  • 14 Phonic phenomena

    3. The graphemes e and a at the beginning of a syllable, following a sylla-ble which ends with i are transcribed as /j/e/ and /j/a/ respectively.

    urgie /u/r/ - /d/i/ - /j/e/ prietenie /p/r/i/ - /j/e/ - /t/e/ - /n/i/ - /j/e/ Romnia /r/o/ - /m// - /n/i/ - /j/a/ mantia /m/a/n/ - /t/i/ - /j/a/

    4. Exceptions: loan words (neologisms)

    cordial /k/o/r/ - /d/i/ - /a/l/ Eliade /e/ - /l/i/- /a/ - /d/e/ diamant /d/i/ - /a/ - /m/a/n/t/

    For more details related to the rules for phonemic transcription in Romanian see Dindelegan (2013: 717). Examples of phonemic transcriptions with syllabification:

    chemare /k/e/ - /m/a/ - /r/e/ cheam /k/a/ - /m// ochi /o/k/ ochii /o/ - /k/i/ copii /k/o/ - /p/i/ copiii /k/o/ - /p/i/- /j/i/ veciniciei /v/e/t/ - /n/i/ - /t/i/ - /j/e/j/ creioane /k/r/e/- /j/o /a/ - /n/e/ oarece //o /a/ - /r/e/ - /t/e/ fecioar /f/e/ - /t/o /a/ - /r// valuri /v/a/ - /l/u/r/i/ s-mi /s//m/i/ ghiozdan /g/o/z/ - /d/a/n/ ghiocel /g/i/ - /o/ - /t/e/l/ gean /d/a/ - /n// ginga /d/i/n/ - /g/a// examen /e/ - /g/z/a/ - /m/e/n/ excursie /e/k/s/ - /k/u/r/ - /s/i/ - /j/e/

    The phonemic transcription of the poem Lacul is presented below.

    Lacul                             Phonemic transcription

    Lacul codrilor albastru           /l/a/k/u/l/ /k/o/d/r/i/l/o/r/ /a/l/b/a/s/t/r/u/
    Nuferi galbeni îl încarcă         /n/u/f/e/r/ⁱ/ /g/a/l/b/e/n/ⁱ/ /ɨ/l/ /ɨ/n/k/a/r/k/ə/
    Tresărind în cercuri albe         /t/r/e/s/ə/r/i/n/d/ /ɨ/n/ /tʃ/e/r/k/u/r/ⁱ/ /a/l/b/e/
    El cutremură o barcă              /j/e/l/ /k/u/t/r/e/m/u/r/ə/ /o/ /b/a/r/k/ə/

    Și eu trec de-a lung de maluri    /ʃ/i/ /j/e/w/ /t/r/e/k/ /d/e̯/a/ /l/u/n/g/ /d/e/ /m/a/l/u/r/ⁱ/
    Parc-ascult și parc-aștept        /p/a/r/k/a/s/k/u/l/t/ /ʃ/i/ /p/a/r/k/a/ʃ/t/e/p/t/
    Ea din trestii să răsară          /j/a/ /d/i/n/ /t/r/e/s/t/i/j/ /s/ə/ /r/ə/s/a/r/ə/
    Și să-mi cadă lin pe piept        /ʃ/i/ /s/ə/m/ⁱ/ /k/a/d/ə/ /l/i/n/ /p/e/ /p/j/e/p/t/

    Să sărim în luntrea mică          /s/ə/ /s/ə/r/i/m/ /ɨ/n/ /l/u/n/t/r/e̯/a/ /m/i/k/ə/
    Îngânați de glas de ape,          /ɨ/n/g/ɨ/n/a/ts/ⁱ/ /d/e/ /g/l/a/s/ /d/e/ /a/p/e/
    Și să scap din mână cârma,        /ʃ/i/ /s/ə/ /s/k/a/p/ /d/i/n/ /m/ɨ/n/ə/ /k/ɨ/r/m/a/
    Și lopețile să-mi scape;          /ʃ/i/ /l/o/p/e/ts/i/l/e/ /s/ə/m/ⁱ/ /s/k/a/p/e/

    Să plutim cuprinși de farmec      /s/ə/ /p/l/u/t/i/m/ /k/u/p/r/i/n/ʃ/ⁱ/ /d/e/ /f/a/r/m/e/k/
    Sub lumina blândei lune           /s/u/b/ /l/u/m/i/n/a/ /b/l/ɨ/n/d/e/j/ /l/u/n/e/
    Vântu-n trestii lin foșnească,    /v/ɨ/n/t/u/n/ /t/r/e/s/t/i/j/ /l/i/n/ /f/o/ʃ/n/e̯/a/s/k/ə/
    Unduioasa apă sune!               /u/n/d/u/j/o̯/a/s/a/ /a/p/ə/ /s/u/n/e/

    Dar nu vine... Singuratic         /d/a/r/ /n/u/ /v/i/n/e/ /s/i/n/g/u/r/a/t/i/k/
    În zadar suspin și sufăr          /ɨ/n/ /z/a/d/a/r/ /s/u/s/p/i/n/ /ʃ/i/ /s/u/f/ə/r/
    Lângă lacul cel albastru          /l/ɨ/n/g/ə/ /l/a/k/u/l/ /tʃ/e/l/ /a/l/b/a/s/t/r/u/
    Încărcat cu flori de nufăr        /ɨ/n/k/ə/r/k/a/t/ /k/u/ /f/l/o/r/ⁱ/ /d/e/ /n/u/f/ə/r/

    The frequencies of phonemes, ranked in the usual way, are presented in Table 2.1.1.5. The poem Lacul has 20 lines altogether; the total number of phonemes is 414, composed of 180 vowels (strong + voiceless + weak) and 234 consonants. The size of the phoneme inventory is 29.

    Table 2.1.1.5: Rank-frequencies of phonemes in individual strophes in Lacul

          Strophe 1    Strophe 2    Strophe 3    Strophe 4    Strophe 5
    rank  freq phon.   freq phon.   freq phon.   freq phon.   freq phon.
    1     12   /r/     9    /a/     7    /a/     10   /n/     8    /a/
    2     8    /l/     7    /e/     7    /s/     9    /u/     8    /u/
    3     7    /a/     7    /i/     6    /ə/     6    /a/     8    /n/
    4     7    /e/     7    /r/     6    /e/     6    /e/     8    /r/
    5     7    /k/     7    /t/     6    /i/     6    /s/     6    /l/
    6     6    /u/     6    /p/     6    /n/     5    /i/     5    /i/
    7     5    /n/     5    /ə/     5    /ɨ/     5    /l/     5    /k/
    8     4    /ə/     5    /k/     5    /m/     4    /t/     5    /s/
    9     4    /b/     5    /s/     4    /k/     3    /ə/     4    /ə/
    10    3    /ɨ/     4    /d/     4    /l/     3    /j/     3    /ɨ/
    11    3    /ⁱ/     4    /l/     4    /p/     3    /k/     3    /e/
    12    3    /o/     4    /ʃ/     3    /d/     3    /d/     3    /d/
    13    3    /t/     3    /j/     3    /r/     3    /m/     3    /f/
    14    2    /i/     3    /u/     2    /ⁱ/     3    /p/     3    /t/
    15    2    /d/     3    /n/     2    /g/     3    /r/     2    /g/
    16    2    /s/     2    /ⁱ/     2    /ʃ/     2    /ɨ/     1    /ⁱ/
    17    1    /j/     2    /m/     2    /ts/    2    /b/     1    /o/
    18    1    /tʃ/    1    /e̯/     1    /e̯/     2    /f/     1    /b/
    19    1    /f/     1    /w/     1    /o/     2    /ʃ/     1    /tʃ/
    20    1    /g/     1    /g/     1    /u/     1    /e̯/     1    /p/
    21    1    /m/     0    /ɨ/     1    /t/     1    /ⁱ/     1    /ʃ/
    22    0    /e̯/     0    /o/     0    /j/     1    /o/     1    /v/
    23    0    /o̯/     0    /o̯/     0    /o̯/     1    /o̯/     1    /z/
    24    0    /w/     0    /b/     0    /w/     1    /v/     0    /e̯/
    25    0    /p/     0    /tʃ/    0    /b/     0    /w/     0    /j/
    26    0    /ʃ/     0    /f/     0    /tʃ/    0    /tʃ/    0    /o̯/
    27    0    /ts/    0    /ts/    0    /f/     0    /g/     0    /w/
    28    0    /v/     0    /v/     0    /v/     0    /ts/    0    /m/
    29    0    /z/     0    /z/     0    /z/     0    /z/     0    /ts/

    The rank-frequency distribution of phonemes in all strophes can be captured by the one- or two-component stratification formula

    (2.1.1.1)  $f_r = 1 + a\,e^{-r/b} + c\,e^{-r/d}$,

    testifying to the phonic stratification of individual strophes. The individual fitting parameters, computed iteratively, are presented in Table 2.1.1.6. The first strophe is shown in Figure 2.1.1.1 and the fourth strophe in Figure 2.1.1.2 ($R^2$ is the usual determination coefficient).

    Figure 2.1.1.1. Rank-frequencies of phonemes in the first strophe of Lacul

    Table 2.1.1.6: Parameters of fitting (2.1.1.1) to individual strophes

    Strophe  a        b       c       d       R2
    1        11.5941  6.1295  -       -       0.95
    2        8.8879   9.0145  -       -       0.93
    3        7.6789   9.0035  -       -       0.91
    4        4.1970   1.0692  8.5249  7.7596  0.96
    5        9.5960   7.2343  -       -       0.93

    As can be seen in Table 2.1.1.6, the parameters do not vary excessively; the phoneme representation is quite uniform. Nevertheless, different local phenomena may appear, and these will be analyzed in the subsequent sections.
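    The fit reported for the first strophe can be checked with a minimal sketch. The function names are hypothetical, and restricting the determination coefficient to the 21 ranks with non-zero frequency is an assumption; the text does not state over which ranks the fit was computed.

```python
import math

def stratified_frequency(rank, a, b, c=0.0, d=1.0):
    """Evaluate f_r = 1 + a*exp(-r/b) + c*exp(-r/d) for one rank r."""
    return 1.0 + a * math.exp(-rank / b) + c * math.exp(-rank / d)

def determination_coefficient(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Non-zero frequencies of strophe 1 (Table 2.1.1.5), ranks 1..21 (assumed range).
strophe1 = [12, 8, 7, 7, 7, 6, 5, 4, 4, 3, 3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1]
a1, b1 = 11.5941, 6.1295  # one-component parameters from Table 2.1.1.6

fitted = [stratified_frequency(r, a1, b1) for r in range(1, len(strophe1) + 1)]
r2 = determination_coefficient(strophe1, fitted)
print(round(fitted[0], 2), round(r2, 2))
```

    With c left at zero the second component vanishes, so the same function covers both the one-component strophes and the two-component fit of strophe 4.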

    Of course, the differences between individual parameters could be tested, but they are too small to allow general hypothesis building. The homogeneity of the distributions in individual strophes cannot be tested by means of the chi-square test because the frequencies are too small; a test based on ranks can be used instead.

    To this end, we reorder Table 2.1.1.5 according to phonemes and ascribe to each phoneme its rank in the given strophe. The result can be seen in Table 2.1.1.7. When two or more frequencies are identical, the corresponding phonemes are assigned the mean of the ranks, e.g. if ranks 5, 6, 7 have the same frequency, then all three items receive rank 6. If a frequency is unique, its rank remains as it is. Further, if a phoneme does not occur in a strophe, it obtains the highest rank (highest mean rank).
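    The rank assignment just described can be sketched as follows. The function name `mean_ranks` is hypothetical; the input is the frequency column of one strophe from Table 2.1.1.5, with the non-occurring phonemes entered as zeros so that they share the highest mean rank.

```python
def mean_ranks(frequencies):
    """Return 1-based ranks by descending frequency; tied items get the mean rank."""
    order = sorted(range(len(frequencies)), key=lambda i: -frequencies[i])
    ranks = [0.0] * len(frequencies)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next item has the same frequency.
        while (end + 1 < len(order)
               and frequencies[order[end + 1]] == frequencies[order[pos]]):
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

# Strophe 1 of Lacul, including the 8 phonemes of the inventory that do not occur.
strophe1 = [12, 8, 7, 7, 7, 6, 5, 4, 4, 3, 3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1] + [0] * 8
ranks = mean_ranks(strophe1)
print(ranks[2], ranks[7], ranks[-1], sum(ranks))
```

    Because ties are resolved by averaging, the ranks of every strophe sum to 29(30)/2 = 435, which is the column sum shown in Table 2.1.1.7.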

    Figure 2.1.1.2. Rank-frequencies of phonemes in the fourth strophe of Lacul

    The S column contains the sums of the given rows; $V_i$ is a function of ties (in strophe i). A tie with $t_i$ occurrences contributes $t_i^3 - t_i$; e.g. in the first column, rank 4 occurs three times (the phonemes /a/, /e/, /k/), hence $V_{/a/} = 3^3 - 3 = 24$. If there are several ties in the column, the sum of the above function has to be calculated. Column $S^2$ contains the squares of the values in the S column. The squared sum of empirical rank sums is given as

    (2.1.1.2)  $SSR = \sum_{i=1}^{N_p} S_i^2 - \frac{1}{N_p}\Big(\sum_{i=1}^{N_p} S_i\Big)^2$,

    yielding in our case $SSR = 198356 - 2175^2/29 = 35231$ (see Table 2.1.1.7). $N_p$ is the number of distinct phonemes in the studied poem, e.g. $N_p = 29$ for Lacul (see Table 2.1.1.5).

    Since we have ties, whose sums are given in the last row, we compute Kendall's concordance coefficient in the form

    (2.1.1.3)  $W = \frac{12\,SSR}{m^2 (N_p^3 - N_p) - m \sum_{i=1}^{m} V_i}$,

    where m is the number of strophes of the studied poem. Lacul has m = 5 strophes of 4 verses each. The value of W need not be calculated because one can directly compute the test criterion (see below).

    The computation of the function $V_i$ can be illustrated using the example of the first strophe. Rank 8.5 occurs twice, rank 4 three times, rank 15 three times, rank 11.5 four times, rank 19 five times, and rank 25.5 eight times, hence the function $V_i$ is computed for the first strophe as follows:

    $V_1 = 1(2^3 - 2) + 2(3^3 - 3) + 1(4^3 - 4) + 1(5^3 - 5) + 1(8^3 - 8) = 6 + 48 + 60 + 120 + 504 = 738$
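    The tie function can be reproduced with a few lines. This is a sketch under the assumption that ties are detected directly on the frequencies (equal frequencies receive equal mean ranks, so the groups are the same); the name `tie_correction` is hypothetical.

```python
from collections import Counter

def tie_correction(frequencies):
    """V = sum of (t^3 - t) over groups of t equal values; untied values add 0."""
    group_sizes = Counter(frequencies).values()
    return sum(t ** 3 - t for t in group_sizes)

# Strophe 1 of Lacul (Table 2.1.1.5), zeros for the non-occurring phonemes.
strophe1 = [12, 8, 7, 7, 7, 6, 5, 4, 4, 3, 3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1] + [0] * 8
v1 = tie_correction(strophe1)
print(v1)
```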

    Now we want to find a criterion enabling us to decide whether the strophes are phonically independent. This can be done by means of the chi-square criterion as defined by Kendall (1962: 100):

    (2.1.1.4)  $X^2 = \frac{12\,SSR}{m N_p (N_p + 1) - \frac{1}{N_p - 1} \sum_{j=1}^{m} V_j}$,

    yielding a chi-square statistic with $N_p - 1$ degrees of freedom. Inserting the computed values in this formula we obtain

    $X^2 = 12(35231)/[5(29)(30) - (1/28)(3930)] = 422772/(4350 - 140.357) = 100.43$
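    The two steps, (2.1.1.2) and (2.1.1.4), can be checked against the S column and the $V_i$ row of Table 2.1.1.7 with a minimal sketch; the function name is hypothetical and only the standard library is used.

```python
def kendall_chi_square(rank_sums, tie_corrections):
    """Return (SSR, X^2) following formulas (2.1.1.2) and (2.1.1.4)."""
    n_p = len(rank_sums)        # number of distinct phonemes, N_p
    m = len(tie_corrections)    # number of strophes
    total = sum(rank_sums)
    ssr = sum(s * s for s in rank_sums) - total ** 2 / n_p
    denominator = m * n_p * (n_p + 1) - sum(tie_corrections) / (n_p - 1)
    return ssr, 12 * ssr / denominator

# S column of Table 2.1.1.7 (row sums of ranks over the five strophes).
S = [13, 42, 73.5, 28, 112.5, 36.5, 85, 97, 97.5, 124.5, 44, 123.5, 96, 41,
     116, 62.5, 99, 95.5, 34.5, 81.5, 29, 73, 31.5, 35.5, 89, 54.5, 119.5,
     117.5, 122.5]
V = [738, 882, 726, 666, 918]   # tie functions of the five strophes

ssr, x2 = kendall_chi_square(S, V)
print(ssr, round(x2, 2))
```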

    Since we have DF = 28, our result is significant (e.g. for α = 0.0005 we have $X^2$ = 50.5), i.e. the use of the phonemes is divergent among the individual strophes. This fact allows us to draw several conclusions. We cannot, however, ask the author any more whether our interpretations are correct. Possible inferences are, e.g., that the poem was not written in one go, or that it was corrected subsequently, or that it was not written spontaneously in the form of an improvisation, etc.

    Table 2.1.1.7: Ranks of phonemes in individual strophes in Lacul

              strophe
    phoneme   1     2     3     4     5     S      S2
    /a/       4     1     1.5   4     2.5   13     169
    /ə/       8.5   8     4.5   12    9     42     1764
    /ɨ/       11.5  25    7.5   17.5  12    73.5   5402.25
    /e/       4     3.5   4.5   4     12    28     784
    /e̯/       25.5  19    19.5  22    26.5  112.5  12656.25
    /i/       15    3.5   4.5   6.5   7     36.5   1332.25
    /ⁱ/       11.5  16.5  15.5  22    19.5  85     7225
    /j/       19    14    25.5  12    26.5  97     9409
    /o/       11.5  25    19.5  22    19.5  97.5   9506.25
    /o̯/       25.5  25    25.5  22    26.5  124.5  15500.25
    /u/       6     14    19.5  2     2.5   44     1936
    /w/       25.5  19    25.5  27    26.5  123.5  15252.25
    /b/       8.5   25    25.5  17.5  19.5  96     9216
    /k/       4     8     10    12    7     41     1681
    /tʃ/      19    25    25.5  27    19.5  116    13456
    /d/       15    11    12.5  12    12    62.5   3906.25
    /f/       19    25    25.5  17.5  12    99     9801
    /g/       19    19    15.5  27    15    95.5   9120.25
    /l/       2     11    10    6.5   5     34.5   1190.25
    /m/       19    16.5  7.5   12    26.5  81.5   6642.25
    /n/       7     14    4.5   1     2.5   29     841
    /p/       25.5  6     10    12    19.5  73     5329
    /r/       1     3.5   12.5  12    2.5   31.5   992.25
    /s/       15    8     1.5   4     7     35.5   1260.25
    /ʃ/       25.5  11    15.5  17.5  19.5  89     7921
    /t/       11.5  3.5   19.5  8     12    54.5   2970.25
    /ts/      25.5  25    15.5  27    26.5  119.5  14280.25
    /v/       25.5  25    25.5  22    19.5  117.5  13806.25
    /z/       25.5  25    25.5  27    19.5  122.5  15006.25

    Sum S     435   435   435   435   435   2175   198356
    Vi        738   882   726   666   918   3930


    2.1.2 Euphony in general

    In literary studies, euphony is a fuzzy concept originating from an individual perception of a text and the intuitive aesthetic evaluation of this perception. Peculiarly enough, in music, which is strongly (though not always) based on euphony, the concept does not even exist; instead, various kinds of aesthetics are discussed. In textology, beauty is associated rather with the choice of words, the association of ideas, etc. Euphony is some background noise ascribed to the phonic composition of the poem, expressed above all by rhyme. Since rhyme is conscious, the concept of euphony becomes fuzzy if we add to rhyme some subconscious phenomena and make dichotomic decisions about their presence or absence. Definitions that can be found en masse in dictionaries or on the Internet say that euphony is "a pleasing or sweet sound" or "a harmonious succession of words with a pleasing sound", which is simply a tautology, not an operational definition.

    In order to avoid passionate discussions about whether the succession of words is more harmonious and pleasing in Sheffield English than in Italian, we try to give the concept of euphony a more objective correspondence with perceived reality and to grant it a computable existence.

    In the presented approach, euphony is understood as a regular or non-random occurrence of sounds/phonemes or their sequential patterns in a text. We prefer to apply the concept of phoneme because that of sound is rather fuzzy and sound realisation always depends on the automatisms acquired during childhood.

    In poetry, for which (as opposed to prose) the concept of euphony is considered relevant, the best-known euphonic phenomenon is rhyme. If, say, we find an /a/ in each position in a verse where a vowel occurs, then we are inclined to consider it a euphonic pattern. This event need not be a conscious act of the poet; it may simply be the outcome of the Skinner effect or a phenomenon of self-organisation. Nevertheless, there are cases, e.g. in Old Javanese, where a specific vowel sequence is required for each verse. Whatever the cause may be, if we want to consider a pattern euphonic, we must show that the given pattern is significant, i.e. not a random effect. There are several methods to determine this fact: (i) To ask the author, who perhaps remembers her/his motives and writing method (as long as s/he lives); however, even then a writer will not be able to state exactly the degree of realised euphony. (ii) To ask informants for their subjective impression; but this method furnishes information only in the form of subjective statements and depends strongly on age, education, gender, social status, dialect, etc. Nevertheless, at least a kind of scaling can be obtained in this way. (iii) The only objective method is a statistical test of the occurrence of individual phonemes or phonemic patterns for significance. This procedure has several advantages: (a) it is objective; every researcher will obtain the same results; (b) it involves quantification of a very fuzzy concept and can be used for comparisons, classifications, studying the evolution of a writer, etc.; (c) it allows us to determine the entities which evoked euphony; and (d) last but not least, we can even compute the probability of a false evaluation.

    Up to now, there are no general hypotheses about euphony. Nor are the boundary conditions known under which it must, may or cannot occur. The number of statistical studies concerning euphony is very small (cf. Skinner 1939, 1941; Sebeok, Zeps 1959; Meyer-Eppler 1959; Altmann 1963, 1966a,b, 1968; Wimmer et al. 2003). The investigations are local, restricted to a text sort or a writer. Anyway, they show that objective quantification of different phonic phenomena in poetry is possible.

    In the present chapter, we shall study the general euphony of verses in the sense of non-random occurrence, i.e. a significantly frequent occurrence of some phonemes in the line, and set up an indicator of the general euphony of a poem. Since we study only one writer, the starting point is the table of letters, their phonemic correspondences, and their occurrences in the works considered. The titles and sizes of the 46 analysed poems are listed in Table 2.1.2.1 and the corresponding phoneme occurrences are presented in Table 2.1.2.2. It is to be noted that we work with the concept of phoneme, i.e. with a higher abstraction, because of its uniformity as opposed to the variation observed in sounds.

    We want to present a realistic measurement of euphony; therefore we proceed as follows. Every line is considered a sample of its own. We distinguish the number of vowels V and the number of consonants C in the line. Since in a vocalic position a certain vowel can occur or not, its distribution is binomial. Now, if a vowel i occurs two or more times in the line, in general $x_i$ times, we compute the probability of the given or a more extreme number of occurrences by means of formula (2.1.2.1).

    Table 2.1.2.1: Titles and sizes of the 46 analysed poems by Eminescu

    No.  Poem title                           No of words (text size)
    1    Lebăda                               41
    2    Peste vârfuri                        47
    3    Dintre sute de catarge               50
    4    Și dacă...                           53
    5    La mijloc de codru...                55
    6    Somnoroase păsărele...               55
    7    La steaua                            71
    8    Adânca mare                          75
    9    Trecut-au anii                       88
    10   Lacul                                90
    11   Ce te legeni...                      102
    12   Odă în metru antic                   103
    13   De ce nu-mi vii                      123
    14   Mai am un singur dor                 125
    15   Criticilor mei                       130
    16   O, mamă                              140
    17   Cu mâne zilele-ți adaogi...          141
    18   Revedere                             141
    19   Sara pe deal                         156
    20   Atât de fragedă                      176
    21   Freamăt de codru                     179
    22   Ce-ți doresc eu ție, dulce Românie   183
    23   Pe lângă plopii fără soț             199
    24   Povestea codrului                    220
    25   Floare-albastră                      247
    26   Sonete                               265
    27   Despărțire                           304
    28   Ghazel                               331
    29   La moartea lui Heliade               332
    30   O, adevăr sublime...                 334
    31   Iubită dulce, o, mă lasă             337
    32   O călărire în zori                   346
    33   Dacă treci râul Selenei              356
    34   Rugăciunea unui dac                  357
    35   Copii eram noi amândoi               375
    36   Glossă                               380
    37   Povestea teiului                     390
    38   Venere și Madonă                     393
    39   Făt-Frumos din tei                   415
    40   Dumnezeu și om                       443
    41   Junii corupți                        458
    42   Mortua est!                          491
    43   Epigonii                             921
    44   Împărat și proletar                  1510
    45   Luceafărul                           1737
    46   Scrisoarea III                       2278

    (2.1.2.1)  $P(X_i \ge x_i) = \sum_{x = x_i}^{V} \binom{V}{x} p_i^x q_i^{V-x}$,

    where $X_i$ is a vocalic phoneme, $p_i$ is its relative frequency among the vowels and $q_i = 1 - p_i$; the procedure is analogous for consonants, replacing V by C. The first line of the poem Adânca mare

    Adânca mare sub a lunei față;

    is transcribed as

    /a/d/ɨ/n/k/a/ /m/a/r/e/ /s/u/b/ /a/ /l/u/n/e/j/ /f/a/ts/ə/

    In this verse, we have V = 12, C = 11. Considering the vocalic phoneme /a/, we see that it occurs 5 times and its relative frequency with respect to the inventory of vowels is 0.181583 (Table 2.1.2.2); we compute accordingly

    Table 2.1.2.2: Phonemes in Eminescu's 46 poems

    No.  phoneme  No of occurrences  relative frequency  relative frequency vowels/consonants

    Vowels (strong, weak, voiceless)
    1    /a/      6172               0.086823            0.181583
    2    /ə/      3005               0.042272            0.088408
    3    /ɨ/      1733               0.024379            0.050986
    4    /e/      7118               0.100131            0.209415
    5    /e̯/      859                0.012084            0.025272
    6    /i/      4252               0.059814            0.125096
    7    /ⁱ/      1142               0.016065            0.033598
    8    /j/      2434               0.034240            0.071609
    9    /o/      2159               0.030371            0.063519
    10   /o̯/      546                0.007681            0.016064
    11   /u/      4261               0.059941            0.125360
    12   /w/      309                0.004347            0.009091

    Consonants
    13   /b/      767                0.010790            0.020676
    14   /k/      2389               0.033607            0.064399
    15   /tʃ/     1145               0.016107            0.030865
    16   /kʲ/     223                0.003137            0.006011
    17   /d/      2627               0.036955            0.070814
    18   /f/      795                0.011183            0.021430
    19   /g/      505                0.007104            0.013613
    20   /dʒ/     277                0.003897            0.007467
    21   /gʲ/     19                 0.000267            0.000512
    22   /h/      67                 0.000943            0.001806
    23   /ʒ/      114                0.001604            0.003073
    24   /l/      3173               0.044635            0.085533
    25   /m/      2539               0.035717            0.068442
    26   /n/      4867               0.068465            0.131197
    27   /p/      2046               0.028782            0.055153
    28   /r/      5220               0.073431            0.140712
    29   /s/      2766               0.038910            0.074561
    30   /ʃ/      1222               0.017190            0.032941
    31   /t/      3944               0.055481            0.106316
    32   /ts/     760                0.010691            0.020487
    33   /v/      1116               0.015699            0.030083
    34   /z/      514                0.007231            0.013856

    For the phoneme /a/ in the first line of Adânca mare, formula (2.1.2.1) yields

    $P(X_{/a/} \ge 5) = \sum_{x=5}^{12} \binom{12}{x}\, 0.181583^x (1 - 0.181583)^{12-x}$
    $= 0.038452 + 0.009953 + 0.001893 + 0.000262 + 0.0000259 + 0.00000172 + 0.0000000695 + 0.000000001285 = 0.05059.$

    The phoneme /n/ occurs twice in the given line and its relative frequency with respect to the inventory of consonants is 0.131197 (Table 2.1.2.2), hence we obtain

    $P(X_{/n/} \ge 2) = \sum_{x=2}^{11} \binom{11}{x}\, 0.131197^x (1 - 0.131197)^{11-x} = 0.433506.$

    If the computed probability is smaller than α, which can be set conventionally at e.g. 0.05, then we may speak of a euphonic tendency. The extent of euphony contributed by the given phoneme will be measured by the indicator (cf. Wimmer et al. 2003: 60)

    (2.1.2.2)  $CE_{phoneme} = \begin{cases} 100\,[\alpha - P(X \ge x)], & \text{if } \alpha > P(X \ge x) \\ 0, & \text{otherwise,} \end{cases}$

    where x is the number of occurrences of the phoneme.

    Here, α is the significance level (α = 0.05); therefore the Coefficient of Euphony $CE_{phoneme}$ expressed by (2.1.2.2) is never negative and may attain values in the interval [0; 5.00].

    In our example, we obtained in the first case the sum 0.05059, which is greater than 0.05; hence the five occurrences of /a/ do not display a euphonic effect. The same holds for the two occurrences of /n/, from which it follows that the first line is not constructed euphonically.

    Performing the above computation for all k phonemes occurring at least twice in the line, we obtain the mean euphony indicator for the line as

    (2.1.2.3)  $CE_{line} = \frac{100}{k} \sum_{i \in E} [\alpha - P(X_i \ge x_i)]$,

    where the i are the phonemes belonging to the euphonic set E fulfilling the condition $E = \{phoneme \mid CE_{phoneme} > 0\}$. Having performed this computation for all lines of a poem, we may define the euphonic value of the whole poem with K lines as

    (2.1.2.4)  $CE_{poem} = \frac{1}{K} \sum_{j=1}^{K} CE_{line\,j}$.

    For the given poem we may compute the variance of $CE_{poem}$ empirically as

    (2.1.2.5)  $Var(CE_{poem}) = \frac{1}{K} \sum_{j=1}^{K} (CE_{line\,j} - CE_{poem})^2$,

    a simple expression that can be used for comparing two poems by means of a normal test. For the sake of illustration let us consider the poem Lebăda:

    Lebăda                                 Phonemic transcription

    Când pintre valuri ce saltă            /k/ɨ/n/d/ /p/i/n/t/r/e/ /v/a/l/u/r/ⁱ/ /tʃ/e/ /s/a/l/t/ə/
    Pe baltă                               /p/e/ /b/a/l/t/ə/
    În ritm ușor,                          /ɨ/n/ /r/i/t/m/ /u/ʃ/o/r/
    Lebăda albă cu-aripele-n vânturi       /l/e/b/ə/d/a/ /a/l/b/ə/ /k/w/a/r/i/p/e/l/e/n/ /v/ɨ/n/t/u/r/ⁱ/
    În cânturi                             /ɨ/n/ /k/ɨ/n/t/u/r/ⁱ/
    Se leagănă-n dor;                      /s/e/ /l/e̯/a/g/ə/n/ə/n/ /d/o/r/
    Aripele-i albe în raza cea caldă       /a/r/i/p/e/l/e/j/ /a/l/b/e/ /ɨ/n/ /r/a/z/a/ /tʃ/a/ /k/a/l/d/ə/
    Le scaldă,                             /l/e/ /s/k/a/l/d/ə/
    Din ele bătând,                        /d/i/n/ /j/e/l/e/ /b/ə/t/ɨ/n/d/
    Și-apoi pe luciu, pe unda d-oglinde    /ʃ/j/a/p/o/j/ /p/e/ /l/u/tʃ/u/ /p/e/ /u/n/d/a/ /d/o/g/l/i/n/d/e/
    Le-ntinde                              /l/e/n/t/i/n/d/e/
    O barcă de vânt.                       /o/ /b/a/r/k/ə/ /d/e/ /v/ɨ/n/t/

    We obtain the results presented in Table 2.1.2.3.

    Table 2.1.2.3: Analysis of the poem Lebăda

    line no.  phoneme  CEphoneme  CEline
    4         /b/      1.701282   0.24304
    5         /ɨ/      3.544263   1.772131
    7         /a/      3.088126   0.772031
    10        /p/      1.837048   0.204116

    Adding the numbers in the last column and dividing the sum by the number of lines in the poem (K = 12) we obtain $CE_{Lebăda}$ = 2.991318/12 = 0.249277. Using this mean and the values in the last column, and taking into account the eight lines that have euphony zero, we obtain the variance $Var(CE_{Lebăda})$ = 0.257629.
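    The chain from the binomial tail (2.1.2.1) to the poem-level mean and variance (2.1.2.4, 2.1.2.5) can be retraced with a short sketch. The function names are hypothetical; the numerical inputs are those quoted above for Adânca mare and Lebăda, and the variance divides by K, as the reported value 0.257629 implies.

```python
from math import comb

ALPHA = 0.05  # significance level used in the text

def tail_probability(n_positions, occurrences, p):
    """P(X >= occurrences) for a binomial with n_positions trials, cf. (2.1.2.1)."""
    return sum(comb(n_positions, x) * p ** x * (1 - p) ** (n_positions - x)
               for x in range(occurrences, n_positions + 1))

def ce_phoneme(n_positions, occurrences, p, alpha=ALPHA):
    """Coefficient of euphony of one phoneme, cf. (2.1.2.2)."""
    prob = tail_probability(n_positions, occurrences, p)
    return 100 * (alpha - prob) if prob < alpha else 0.0

def ce_line(ce_values, k):
    """Line indicator, cf. (2.1.2.3); k = phonemes occurring at least twice."""
    return sum(ce_values) / k if k else 0.0

def ce_poem(line_values):
    """Mean (2.1.2.4) and empirical variance (2.1.2.5) over all K lines."""
    K = len(line_values)
    mean = sum(line_values) / K
    variance = sum((v - mean) ** 2 for v in line_values) / K
    return mean, variance

# First line of Adânca mare: V = 12 vowel positions, C = 11 consonant positions.
p_a = tail_probability(12, 5, 0.181583)   # /a/ occurs 5 times
p_n = tail_probability(11, 2, 0.131197)   # /n/ occurs twice

# Lebăda, line 5: /ɨ/ occurs twice among 4 vowels; k = 2 phonemes occur twice.
ce_line5 = ce_line([ce_phoneme(4, 2, 0.050986)], k=2)

# Lebăda: four non-zero line values (Table 2.1.2.3) and eight zero lines.
lebada = [0.24304, 1.772131, 0.772031, 0.204116] + [0.0] * 8
mean, variance = ce_poem(lebada)
print(round(p_a, 5), round(p_n, 6), round(mean, 6), round(variance, 6))
```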

    In this way, the euphonic tendency of all poems can be computed. Here we shall simply order the poems according to increasing euphony, as presented in Table 2.1.2.4. The euphonic weight of individual phonemes is language-dependent, even if it may have an iconic background, and cannot contribute to answering general questions. We can simply state that individual poems have different euphony values ranging from 0.110634 up to 0.453381. Comparing the first poem (smallest euphony) with the last (highest euphony) by means of a normal test we obtain

    $u = \frac{|0.110634 - 0.453381|}{\sqrt{\frac{0.047272}{56} + \frac{0.442159}{80}}} = \frac{0.342747}{0.0798193} = 4.29$,

    which is a highly significant value. Hence euphony played a certain role in Eminescu's work.

    We may ask two questions: (1) Did Eminescu develop in this respect or did he maintain the same level from the first to the last analysed poem? (2) Does the extent of euphony depend on the length of the poem?

    The first question can be answered by scrutinising the relation of the euphony of the poem to the year of its origin. Looking at Table 2.1.2.4 and plotting the euphony values according to years in a diagram (cf. Figure 2.1.2.1), we see that in a certain epoch of his creativity Eminescu began to develop euphony, but always interrupted this evolution and fell back to a lower state, where he began anew. His pathological year 1883¹ was, regarding euphony, particularly expanded. This is reminiscent of renewal processes, but the scrutiny of this mechanism must be postponed to the happy time when the works of more writers are at our disposal and we know not only the year of origin but also the dates of first appearance. In any case, one sees a very characteristic historical movement of euphony. Simply taking the means of the years concerned leads to a slight linear increase.

    ¹ In June 1883, Eminescu fell seriously ill and finally died in 1889.
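    The normal test that compares the least and the most euphonic poem (Dumnezeu și om, 56 verses; Glossă, 80 verses) can be sketched as follows. The function name is hypothetical; the variance of each poem mean is taken as the empirical variance divided by the number of verses, as in the computation of u.

```python
from math import sqrt

def normal_test(mean1, var1, k1, mean2, var2, k2):
    """u = |mean1 - mean2| / sqrt(var1/k1 + var2/k2) for two poems."""
    return abs(mean1 - mean2) / sqrt(var1 / k1 + var2 / k2)

# Values from Table 2.1.2.4: (euphony, variance, number of verses).
u = normal_test(0.110634, 0.047272, 56, 0.453381, 0.442159, 80)
print(round(u, 2))
```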

    Table 2.1.2.4: Euphony in Eminescu's poems (ordered according to increasing values)

    Year  Poem title                           No of verses  Euphony poem  Variance euphony
    1873  Dumnezeu și om                       56            0.110634      0.047272
    1867  Dacă treci râul Selenei              41            0.126545      0.042433
    1867  La moartea lui Heliade               48            0.128128      0.047784
    1870  Epigonii                             114           0.137836      0.037605
    1883  Peste vârfuri                        12            0.14854       0.066133
    1883  Somnoroase păsărele                  16            0.150188      0.116695
    1879  Atât de fragedă...                   36            0.153134      0.102216
    1883  Odă în metru antic                   20            0.153366      0.072062
    1873  Ghazel                               40            0.165371      0.039335
    1879  Despărțire                           38            0.165682      0.063814
    1887  Venere și Madonă                     48            0.183075      0.069181
    1874  Împărat și proletar                  210           0.183341      0.076543
    1879  Rugăciunea unui dac                  46            0.184598      0.067198
    1881  Scrisoarea III                       285           0.190571      0.079298
    1869  Junii corupți                        78            0.195154      0.141009
    1873  Adânca mare...                       14            0.195572      0.064063
    1883  Cu mâne zilele-ți adaogi...          32            0.208622      0.139903
    1866  O călărire în zori                   86            0.2208        0.145174
    1871  Iubită dulce, o, mă lasă             56            0.224681      0.096907
    1871  Mortua est!                          70            0.225005      0.092254
    1874  O, adevăr sublime...                 44            0.228707      0.090329
    1867  Ce-ți doresc eu ție, dulce Românie   32            0.229817      0.146835
    1887  De ce nu-mi vii                      24            0.233317      0.113969
    1879  Freamăt de codru                     48            0.245238      0.117649
    1869  Lebăda                               12            0.249277      0.257629
    1883  Și dacă...                           12            0.262959      0.34563
    1883  Criticilor mei                       28            0.264219      0.147977
    1879  Sonete                               42            0.264639      0.132969
    1885  Sara pe deal                         24            0.265213      0.088441
    1887  Povestea teiului                     88            0.26974       0.217131
    1876  Lacul                                20            0.274255      0.152951
    1883  Luceafărul                           392           0.27721       0.239471
    1883  Mai am un singur dor                 36            0.28178       0.340581
    1873  Floare albastră                      56            0.293544      0.24609
    1883  Trecut-au anii                       14            0.296258      0.258758
    1880  O, mamă...                           18            0.296358      0.12687
    1875  Făt-Frumos din tei                   92            0.300824      0.201322
    1883  La mijloc de codru                   13            0.329564      0.127905
    1871  Copii eram noi amândoi               92            0.336609      0.325554
    1883  Pe lângă plopii fără soț             44            0.342626      0.228156
    1879  Revedere                             36            0.353938      0.336106
    1886  La steaua                            16            0.365743      0.625435
    1880  Dintre sute de catarge               16            0.373583      0.163423
    1878  Povestea codrului                    52            0.406063      0.316016
    1883  Ce te legeni codrule                 25            0.425026      0.486791
    1883  Glossă                               80            0.453381      0.442159

    Figure 2.1.2.1. Plot of


    The second question can be answered if we scrutinise the relation as given in Figure 2.1.2.2.

    Figure 2.1.2.2. Plot

    As can easily be seen, there is no dependence of euphony on poem length. On the contrary, some poems of the same length seem to be written under quite different euphonic regimes. However, as soon as more authors have been analyzed, even this should be shown empirically by a statistical test.

    In our present study, euphony is a local phenomenon concerning a given poem, but there is neither development nor length dependence.

    2.2 Assonance

    Assonance is reminiscent of an echo: a repetition of a sound sequence in another position of the poem. It must consist of at least two sounds (vowels) in the same linear order; the sequence may be discontinuous.

    While in prose assonance is not always relevant, it may play a certain euphonic role in poetry. Assonance may give rise to parallelism, i.e. repetition of the same sound sequence in parallel positions in the strophe. This phenomenon can be observed e.g. in Malay folk quatrains called pantuns (cf. Altmann 1963). In modern poetry, one cannot expect ordered vocalic structures outside of rhyme, and if rhyme does not exist (e.g. in hexameter), vocalic patterns are rather rare.

    One way of finding vocalic assonance patterns in modern poetry is to study the transitions from one vowel to the next one, so to say, to search for Markov dependencies of the first order. But the computation would be complex and not very lucid (cf. Brainerd 1976; Altmann 1988) since we have several different states (= number of different vowels) in a text. Instead, we simply observe the transitions between vowels, omitting the transition from one verse to the next, and register them in a contingency table. For Romanian, we use the vocalic phonemes {/a/, /ə/, /ɨ/, /e/, /e̯/, /i/, /ⁱ/, /j/, /o/, /o̯/, /u/, /w/}.

    Registering all transitions we obtain a 12 × 12 contingency table, in which we can find the individual tendencies. We test each individual cell using the criterion

    (2.2.1)  $u = \frac{n_{ij} - \frac{n_{i.}\, n_{.j}}{n}}{\sqrt{\frac{n_{i.}\, n_{.j}\, (n - n_{i.})(n - n_{.j})}{n^2 (n - 1)}}}$,

    where u is the quantile of the standard normal distribution N(0,1), $n_{ij}$ is the frequency in cell (i,j), $n_{i.}$ is the sum of row i, $n_{.j}$ is the sum of column j, and n is the total sum. The expression $n_{i.} n_{.j}/n$ is the expectation for the cell (i,j), and the expression in the denominator is the standard deviation in the cell (i,j). If u ≥ 1.96, we have a significant vowel pattern; otherwise the pattern can be considered random. Here, we are not interested in different strengths of the patterning, hence we decide dichotomically: if u ≥ 1.96, we have an existing, positive pattern (P); otherwise the sequence is not significant (N). This is why we use the normal distribution rather than the chi-square criterion, which yields only positive results (being the square of (2.2.1)).
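    The registration of transitions described above (vowel to next vowel within a verse, never across verses) can be sketched as follows. The function name and the representation of a verse as a list of phoneme strings are assumptions made for illustration; the two sample verses are the second and third lines of Lebăda.

```python
from collections import Counter

# The 12 vocalic phonemes used for Romanian (symbols as restored in this text).
VOWELS = {"a", "ə", "ɨ", "e", "e̯", "i", "ⁱ", "j", "o", "o̯", "u", "w"}

def vowel_transitions(verses):
    """Count (vowel, next vowel) pairs inside each verse; no pairs across verses."""
    counts = Counter()
    for phonemes in verses:
        vowels = [p for p in phonemes if p in VOWELS]
        counts.update(zip(vowels, vowels[1:]))
    return counts

verses = [
    ["p", "e", "b", "a", "l", "t", "ə"],                 # Pe baltă
    ["ɨ", "n", "r", "i", "t", "m", "u", "ʃ", "o", "r"],  # În ritm ușor
]
transitions = vowel_transitions(verses)
print(sorted(transitions.items()))
```

    The resulting counter can be laid out as the 12 × 12 contingency table whose row and column sums enter (2.2.1).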

    For the sake of illustration let us present the results from the poem Lacul in Table 2.2.1.

    Table 2.2.1: Frequency of vowel sequences in the poem Lacul

          /a/  /ə/  /ɨ/  /e/  /e̯/  /i/  /ⁱ/  /j/  /o/  /o̯/  /u/  /w/  ni.
    /a/   7    6    1    7    0    4    1    0    0    0    10   0    36
    /ə/   4    2    1    0    0    3    2    0    1    0    2    0    15
    /ɨ/   4    3    2    2    0    0    0    0    0    0    2    0    13
    /e/   5    2    0    0    1    5    2    1    0    0    4    1    21
    /e̯/   3    0    0    0    0    0    0    0    0    0    0    0    3
    /i/   2    3    3    5    0    1    1    3    3    0    3    0    24
    /ⁱ/   4    0    1    3    0    0    0    0    0    0    0    0    8
    /j/   1    1    0    2    0    1    0    0    0    1    1    0    7
    /o/   2    0    0    1    1    1    1    0    0    0    0    0    6
    /o̯/   1    0    0    0    0    0    0    0    0    0    0    0    1
    /u/   1    3    0    7    1    6    2    1    2    0    2    0    25
    /w/   0    0    0    1    0    0    0    0    0    0    0    0    1
    n.j   34   20   8    28   3    21   9    5    6    1    24   1    160

    Consider the sequence of two a-s, /aa/, yielding $n_{/aa/} = 7$, and the sequence /au/, $n_{/au/} = 10$. Inserting the other numbers from Table 2.2.1 we obtain:

    $u_{/aa/} = \frac{7 - \frac{36(34)}{160}}{\sqrt{\frac{36(34)(160 - 36)(160 - 34)}{160^2 (160 - 1)}}} = -0.2999$

    $u_{/au/} = \frac{10 - \frac{36(24)}{160}}{\sqrt{\frac{36(24)(160 - 36)(160 - 24)}{160^2 (160 - 1)}}} = 2.43$

    Since $-1.96 < u_{/aa/} = -0.2999 < 1.96$, the sequence /aa/ does not represent a significant association. The sequence /au/ represents a significant association because $u_{/au/} = 2.43 \ge 1.96$. Performing the same test for all cells of Table 2.2.1 we obtain the results presented in Table 2.2.2. Here we took u = 1.96 as a boundary; however, one can use other quantiles. In a two-sided test, this corresponds to α = 0.05. One could, of course, assign the vowel sequences to different classes as is usual in phonemics, e.g. those with u < -1.96 to the dissociative class (D) and those within [-1.96; 1.96] to the neutral class (N), but our aim here is to find only existing preferred associations (P).
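    Criterion (2.2.1) can be applied to every cell of Table 2.2.1 with a minimal sketch. The names `LABELS`, `TABLE` and `u_statistic` are hypothetical; the matrix is the one printed in Table 2.2.1, with rows as the first vowel and columns as the following vowel.

```python
from math import sqrt

LABELS = ["a", "ə", "ɨ", "e", "e̯", "i", "ⁱ", "j", "o", "o̯", "u", "w"]
TABLE = [
    [7, 6, 1, 7, 0, 4, 1, 0, 0, 0, 10, 0],
    [4, 2, 1, 0, 0, 3, 2, 0, 1, 0, 2, 0],
    [4, 3, 2, 2, 0, 0, 0, 0, 0, 0, 2, 0],
    [5, 2, 0, 0, 1, 5, 2, 1, 0, 0, 4, 1],
    [3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [2, 3, 3, 5, 0, 1, 1, 3, 3, 0, 3, 0],
    [4, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 2, 0, 1, 0, 0, 0, 1, 1, 0],
    [2, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 3, 0, 7, 1, 6, 2, 1, 2, 0, 2, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
]

def u_statistic(table, i, j):
    """Normal criterion (2.2.1) for cell (i, j) of a contingency table."""
    n = sum(map(sum, table))
    row = sum(table[i])
    col = sum(r[j] for r in table)
    expected = row * col / n
    sd = sqrt(row * col * (n - row) * (n - col) / (n ** 2 * (n - 1)))
    return (table[i][j] - expected) / sd

u_aa = u_statistic(TABLE, 0, 0)
u_au = u_statistic(TABLE, 0, 10)
# Cells classified as preferred associations (P) at the boundary u >= 1.96.
preferred = [(LABELS[i], LABELS[j])
             for i in range(12) for j in range(12)
             if u_statistic(TABLE, i, j) >= 1.96]
print(round(u_aa, 4), round(u_au, 2), len(preferred))
```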

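    Table 2.2.2: The u-test for individual cells of Table 2.2.1

          /a/  /ə/  /ɨ/  /e/  /e̯/  /i/  /ⁱ/  /j/  /o/  /o̯/  /u/  /w/
    /a/   -    -    -    -    -    -    -    -    -    -    P    -
    /ə/   -    -    -    -    -    -    -    -    -    -    -    -
    /ɨ/   -    -    -    -    -    -    -    -    -    -    -    -
    /e/   -    -    -    -    -    -    -    -    -    -    -    P
    /e̯/   P    -    -    -    -    -    -    -    -    -    -    -
    /i/   -    -    -    -    -    -    -    P    P    -    -    -
    /ⁱ/   P    -    -    -    -    -    -    -    -    -    -    -
    /j/   -    -    -    -    -    -    -    -    -    P    -    -
    /o/   -    -    -    -    P    -    -    -    -    -    -    -
    /o̯/   -    -    -    -    -    -    -    -    -    -    -    -
    /u/   -    -    -    -    -    -    -    -    -    -    -    -
    /w/   -    -    -    P    -    -    -    -    -    -    -    -

    Evidently, there are associative tendencies in constructing vowel chains. Nine out of one hundred and forty-four sequences are preferred by the poet. In order to see whether the same tendencies exist in other works by Eminescu, we analyzed 46 poems and present the results in Table 2.2.3.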

    Table 2.2.3: Associative two-member chains in Eminescu's poems

    Poem title | No. of verses | Significant chains of phonemes | No.
    Lebăda | 12 | /a,/, /,w/, /,i/, /i,e/, /i,e/, /o,j/, /u, i/ | 7
    Peste vârfuri | 12 | /a,j/, /,o/, /e,a/, /e,o/, /i,e/, /j,/, /o,u/, /o ,a/, /u, i/ | 9
    Și dacă... | 12 | /a,/, /a,/, /,u/, /e ,a/, /i,j/, /j,e/, /o,e /, /o,i/, /u,i/ | 9
    La mijloc de codru... | 13 | /,i/, /,u/, /e ,a/, /i,e/, /j,e/, /o ,a/, /u,e / | 7
    Adânca mare | 14 | /a,/, /,e /, /,i/, /e,j/, /e,o /, /e ,a/, /i,/, /i,o/, /j,e/, /o,i/, /o ,a/ | 11
    Trecut-au anii | 14 | /a,/, /a,e /, /,a/, /e,u/, /e ,a/, /i,j/, /i,o/, /i,/, /j,a/, /j,e/, /o,w /, /u, i/, /w,a/ | 13
    Somnoroase păsărele... | 16 | /a,e/, /e ,a/, /i,j/, /i,/, /o,i/, /o ,a/, /u,/ | 7
    La steaua | 16 | /a,e /, /a,w/, /,i/, /,i/, /e ,a/, /i,/, /i,j/, /i,o/, /j,e/, /o ,a/, /u,i/ | 11
    Dintre sute de catarge | 16 | /a,u/, /,/, /,u/, /e ,o/, /i,e/, /j,o/, /o,o/, /o ,a/, /u,i/ | 9

  • Assonance 35

    O, mamă | 18 | /a,/, /,u/, /e,w/, /e ,a/, /i,e/, /i,/, /j,o/, /o,j/, /o ,a/, /u,/ | 10
    Lacul | 20 | /a,u/, /e,w/, /e ,a/, /i,j/, /i,o/, /i,a/, /j,o /, /o,e /, /w,e/ | 9
    Odă în metru antic | 20 | /a,e/, /,/, /e,i/, /e,w/, /e ,a/, /i,j/, /i,/, /j,e/, /o,i/, /o ,a/, /w,o/ | 11
    De ce nu-mi vii | 24 | /,w/, /e,u/, /e,w/, /e ,a/, /i,j/, /i,i/, /j,e/, /o,i/, /o,i/, /o ,a/, /u,i/ | 11
    Sara pe deal | 24 | /a,e /, /,i/, /,/, /e,i/, /e,i/, /e,w/, /e ,a/, /i,/, /i,j/, /j,o /, /o,i/, /o ,a/ | 12
    Ce te legeni... | 25 | /a,e /, /a,j/, /,u/, /e,/, /e,e/, /e ,a/, /i,o/, /o,u/, /o ,a/, /u,i/, /u,i/ | 11
    Criticilor mei | 28 | /, e /, /,i/, /e,o /, /e,u/, /e ,a/, /i,j/, /j,e/, /o,i/, /o ,a/, /u,e/, /u,i/, /w,/ | 12
    Cu mâne zilele-ți adaogi... | 32 | /a,o/, /,o /, /e,i/, /e ,a/, /i,e/, /i,w/, /j,e/, /o,/, /o ,a/, /w,a/, /w,e / | 11
    Ce-ți doresc eu ție, dulce Românie | 32 | /,i/, /,w/, /,i/, /e,o/, /e,w/, /e ,a/, /i,e /, /i,j/, /i,o/, /j,e/, /o,/, /o,o /, /o ,a/, /u,i/, /w,i/ | 15
    Revedere | 36 | /,/, /,u/, /e,e /, /e,i/, /e ,a/, /i,e/, /j,e/, /j,o/, /o,j/, /o,u/, /o ,a/, /u,i/ | 12
    Atât de fragedă... | 36 | /,o/, /e,i/, /e,u/, /e ,a/, /i,j/, /i,o/, /i,w/, /j,e/, /o,i/, /o ,a/, /w,a/ | 11
    Mai am un singur dor | 36 | /a,o /, /a,u/, /,i/, /e,e/, /e ,a/, /i,e /, /i,j/, /i,/, /j,e/, /o,j/, /o,o/, /o ,a/, /u,i/, /w,o/ | 14
    Despărțire | 38 | /,w/, /,o /, /e ,a/, /e ,o /, /i,j/, /j,e/, /o,j/, /o ,a/, /w,/ | 9
    Ghazel | 40 | /,u/, /,w/, /,o/, /e,i/, /e ,a/, /i,/, /i,/, /i,o /, /j,e/, /o,i/, /o,j/, /o ,a/, /u,e/, /u,i/ | 14
    Dacă treci râul Selenei | 41 | /a,/, /a,e/, /,a/, /,i/, /e ,a/, /i,j/, /i,o/, /i,w/, /i,/, /j,e/, /o,i/, /o,i/, /o,o /, /o ,a/, /u,e /, /w,/ | 16
    Sonete | 42 | /a,/, /,u/, /e,e/, /e,j/, /e ,a/, /i,j/, /i,i/, /j,a/, /o,i/, /o ,a/, /u,i/ | 11
    Pe lângă plopii fără soț | 44 | /,w/, /,i/, /e ,a/, /i,j/, /o ,a/, /w,e/ | 6
    O, adevăr sublime... | 44 | /a,e /, /,o/, /,w/, /e,i/, /e ,a/, /i,/, /i,u/, /j,e/, /j,u/, /o,i/, /o ,a/, /w,u/ | 12
    Rugăciunea unui dac | 46 | /a,e /, /a,w/, /,u/, /e,/, /e ,a/, /e ,o/, /i,o/, /j,e/, /o ,a/, /u,j/, /w,o / | 11
    Venere și Madonă | 48 | /a,/, /,u/, /,o /, /e,j/, /e,w/, /e ,a/, /j,e/, /o,i/, /o,w/, /o ,a/, /u,/ | 11
    Freamăt de codru | 48 | /a,/, /,o /, /e,w/, /e ,a/, /i,i/, /i,o/, /j,e/, /o,j/, /o ,a/, /u,i/ | 10
    La moartea lui Heliade | 48 | /a,/, /,/, /,i/, /e ,a/, /i,j/, /i,o/, /i,w/, /j,e/, /o,o/, /o ,a/, /u,i/, /u,j/ | 12
    Povestea codrului | 52 | /a,/, /,i/, /,o/, /e ,a/, /i,e /, /i,j/, /i,o/, /j,/, /j,e/, /o,w/, /o ,a/, /u,i/, /u,o /, /w,o/ | 14
    Iubită dulce, o, mă lasă | 56 | /a,e /, /a,o/, /,j/, /,u/, /,w/, /e ,a/, /j,e/, /o,i/, /o,u/, /o ,a/, /u,/, /w,o/ | 12
    Floare-albastră | 56 | /a,/, /a,/, /,i/, /,u/, /,o/, /,u/, /e,i/, /e,w/, /e ,a/, /i,e /, /i,j/, /i,w/, /j,e/, /o,i/, /o,j/, /o ,a/, /u,/, /u,i/, /w,a/ | 19
    Dumnezeu și om | 56 | /a,/, /a,e/, /a,u/, /a,w/, /,/, /,i/, /,u/, /,/, /e ,a/, /i,j/, /i,o /, /j,e/, /o ,a/ | 13
    Mortua est! | 70 | /a,/, /a,e /, /a,j/, /,i/, /,o/, /e,u/, /e ,a/, /i,j/, /i,o /, /i,w/, /j,e/, /o,i/, /o ,a/, /u,i/, /w,/ | 15
    Junii corupți | 78 | /a,/, /,o /, /,e /, /,i/, /e ,a/, /i,/, /i,j/, /i,/, /j,e/, /o,w/, /o ,a/, /u,i/ | 12
    Glossă | 80 | /a,/, /,/, /,i/, /,w/, /e,e/, /e ,a/, /i,j/, /i,o/, /i,/, /j,e/, /o,o /, /o,w/, /o ,a/, /u, i/, /w,/ | 15
    O călărire în zori | 86 | /a,/, /a,j/, /,i/, /,w/, /,u/, /e,/, /e,o/, /e ,a/, /i,j/, /i,/, /j,a/, /o,i/, /o,w/, /o ,a/, /u,j/ | 15
    Povestea teiului | 88 | /a,/, /a,j/, /,/, /,u/, /e,e /, /e,w/, /e ,a/, /e ,o /, /j,a/, /o,i/, /o,u/, /o ,a/, /u, i/, /w,a/ | 14
    Copii eram noi amândoi | 92 | /a,e/, /,/, /e,w/, /e ,a/, /i,j/, /i,o/, /j,e/, /o,i/, /o ,a/, /u, i/, /u,o /, /w,a/ | 12
    Făt-Frumos din tei | 92 | /a,/, /,/, /e,/, /e ,a/, /j,a/, /j,e/, /o,i/, /o,i/, /o,u/, /o,w/, /o ,a/, /u,i/, /w,a/ | 13
    Epigonii | 114 | /a,/, /a,u/, /a,w/, /,/, /e,i/, /e ,a/, /i,e /, /i,j/, /i,o /, /i,/, /j,e/, /o,j/, /o ,a/, /u,e/ | 14
    Împărat și proletar | 210 | /a,/, /a,u/, /,/, /,i/, /e,i/, /e ,a/, /i,e /, /i,j/, /i,o /, /j,a/, /j,e/, /j,o/, /o,i/, /o,o/, /o ,a/, /u,i/, /w,/ | 17
    Scrisoarea III | 285 | /a,/, /a,u/, /a,w/, /,/, /,/, /,u/, /e,o /, /e ,a/, /i,e/, /i,j/, /i,o/, /i,/, /j,a/, /j,e/, /o,e /, /o,i/, /o,i/, /o ,a/, /u,/, /u,i/, /u,i/, /w,a/, /w,/ | 23
    Luceafărul | 392 | /a,/, /a,e /, /a,w/, /,/, /,i/, /e,o /, /e,w/, /e ,a/, /i,e /, /i,i/, /i,j/, /i,o/, /i,/, /i,o /, /j,a/, /j,e/, /o,i/, /o,i/, /o,u/, /o ,a/, /u,i/ | 21


    2.2.1 The diagonal

    Looking at Table 2.2.2, we see that no sequence on the diagonal has a significant frequency. Likewise, Table 2.2.3 contains only a very small number of identical (= diagonal) pairs. This may indicate that Romanian avoids such pairs, or it may be a property of Eminescu's poems. Whatever the cause, we may test the behaviour of the diagonal by applying a simple statistical test for the existence of vowel harmony in languages (cf. Altmann 1987; Schulz, Altmann 1988; Altmann, Altmann 2008). In our case, we go beyond word boundaries and test the existence or nonexistence of a tendency. We set up a function of the diagonal cells in the form of their sum

    \[
    S = n_{11} + n_{22} + \dots + n_{kk},
    \]

    with k as the number of cells on the diagonal, and compare it with the expected sum given as

    \[
    E(S) = \frac{1}{n} \sum_{i=1}^{k} n_{i.} n_{.i}
    \]

    using the variance defined as

    \[
    (2.2.2) \quad Var(S) = \frac{\sum_{i} n_{i.} n_{.i} (n - n_{i.})(n - n_{.i}) + 2 \sum_{i<i'} n_{i.} n_{.i} n_{i'.} n_{.i'}}{n^{2}(n-1)}.
    \]

    The normal criterion

    \[
    (2.2.3) \quad u = \frac{S - E(S)}{\sqrt{Var(S)}}
    \]

    then shows a significantly positive (u > 1.96), a significantly negative (u < -1.96) or a neutral tendency (u ∈ [-1.96; 1.96]). The corresponding chi-square test becomes simpler when only the preference of the diagonal is of interest:


    \[
    (2.2.4) \quad X^{2} = \frac{n \left( nS - \sum_{i=1}^{k} n_{i.} n_{.i} \right)^{2}}{\sum_{i=1}^{k} n_{i.} n_{.i} \left( n^{2} - \sum_{i=1}^{k} n_{i.} n_{.i} \right)},
    \]

    which is approximately equal to u², i.e. X² ≈ u². For the sake of illustration, we compute the tendency of the diagonal for the data in Table 2.2.1. We designate the two sums in (2.2.2), each divided by n²(n - 1), as A and B respectively and obtain

    S = 7 + 2 + 2 + 0 + 0 + 1 + 0 + 0 + 0 + 0 + 2 + 0 = 14
    E(S) = [36(34) + 15(20) + 13(8) + 21(28) + 3(3) + 24(21) + 8(9) + 7(5) + 6(6) + 1(1) + 25(24) + 1(1)]/160 = 3474/160 = 21.7125
    A = [36(34)(160-36)(160-34) + 15(20)(160-15)(160-20) + ... + 1(1)(160-1)(160-1)]/160²/159 = 15.34948
    B = 2[36(34)15(20) + 36(34)13(8) + ... + 36(34)1(1) + ... + 25(24)1(1)]/160²/159 = 2.33445
    Var(S) = A + B = 17.6839

    Inserting the needed values in (2.2.3) we obtain

    \[
    u = \frac{14 - 21.7125}{\sqrt{17.6839}} = -1.834,
    \]

    showing that the diagonal is neutral, although the negative sign points to a tendency to avoid sequences of equal vowels.

    Using the chi-square criterion (2.2.4) we obtain

    \[
    X^{2} = \frac{160\,[160(14) - 3474]^{2}}{3474\,(160^{2} - 3474)} = 3.1697,
    \]

    which is approximately u² = (-1.834)² = 3.3635 and not significant; unlike u, it does not show the direction of the tendency.
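The worked example for the diagonal can be retraced with a short Python sketch. It follows the expected sum, the variance (2.2.2), the criterion (2.2.3) and the chi-square (2.2.4) as they are used in the computation above; the function name `diagonal_test` and the dictionary keys are illustrative choices, not taken from the book.

```python
import math

# Table 2.2.1: rows = first vowel, columns = second vowel (positional order).
TABLE = [
    [7, 6, 1, 7, 0, 4, 1, 0, 0, 0, 10, 0],
    [4, 2, 1, 0, 0, 3, 2, 0, 1, 0, 2, 0],
    [4, 3, 2, 2, 0, 0, 0, 0, 0, 0, 2, 0],
    [5, 2, 0, 0, 1, 5, 2, 1, 0, 0, 4, 1],
    [3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [2, 3, 3, 5, 0, 1, 1, 3, 3, 0, 3, 0],
    [4, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 2, 0, 1, 0, 0, 0, 1, 1, 0],
    [2, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 3, 0, 7, 1, 6, 2, 1, 2, 0, 2, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
]


def diagonal_test(table):
    """Return S, E(S), Var(S), u and X^2 for the diagonal of a square table."""
    k = len(table)
    n = sum(map(sum, table))
    rows = [sum(r) for r in table]                          # n_i.
    cols = [sum(r[j] for r in table) for j in range(k)]     # n_.i
    s = sum(table[i][i] for i in range(k))
    prod = [rows[i] * cols[i] for i in range(k)]            # n_i. * n_.i
    total = sum(prod)
    expected = total / n
    # First and second sum of (2.2.2) before division by n^2 (n - 1).
    a = sum(prod[i] * (n - rows[i]) * (n - cols[i]) for i in range(k))
    b = 2 * sum(prod[i] * prod[j] for i in range(k) for j in range(i + 1, k))
    variance = (a + b) / (n ** 2 * (n - 1))
    u = (s - expected) / math.sqrt(variance)
    x2 = n * (n * s - total) ** 2 / (total * (n ** 2 - total))
    return {"S": s, "E": expected, "Var": variance, "u": u, "X2": x2}


result = diagonal_test(TABLE)
print({name: round(value, 4) for name, value in result.items()})
```

The sign of `u` carries the direction of the tendency, which the chi-square value alone does not.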


    Instead of analysing each of the 46 poems separately, we show the numbers of positive associations in all poems in Table 2.2.4, obtained by counting the significant sequences in Table 2.2.3.

    Table 2.2.4: Numbers of associations of subsequent vowels in 46 poems

           /a/   //   //  /e/ /e /  /i/  /i/  /j/  /o/ /o /  /u/  /w/
    /a/      0   23    2    5    8    0    0    5    2    1    7    6
    //       1    0    8    0    1   10    5    1    2    3    7   10
    //       1    6   *2    0    2    6    1    0    5    2   10    0
    /e/      1    0    4   *4    2    6    5    3    3    4    5   12
    /e /    43    0    0    0    0    0    0    0    2    2    0    0
    /i/      0    4    1    7    7   *2    0   28   12    4    0    4
    /i/      1    0   14    2    0    2    0    1    4    2    1    2
    /j/      8    0    2   34    0    0    0    0    4    2    1    0
    /o/      0    1    1    1    3   19    9    9   *4    3    7    7
    /o /    42    0    0    0    0    0    0    0    0    0    0    0
    /u/      0    5    1    3    2    5   24    3    0    2    0    0
    /w/      8    3    4    2    1    1    0    0    4    1    1    0

    Since on the diagonal there are only 12 cases (marked with an asterisk) out of 563 which indicate an association of identical phonemes (/,/, /e,e/, /i,i/, /o,o/), we may state that there is a strong dissociative tendency, i.e. sequences consisting of identical vowels are avoided. The most frequent pairs of subsequent vowels are /e ,a/ (43), /o ,a/ (42), /j,e/ (34) and /i,j/ (28); they correspond to the diphthongs ea, oa, ie, ii.

    2.2.2 Symmetry

    If assonance is symmetric, then sequences are to be considered random; the given sequence and the same sounds in reverse order are to be expected with the same frequency. If /e ,a/ is significantly frequent, then we expect /a,e / to have the same quality. If the situation is different, we may speak of a significant asymmetry of assonance. In Table 2.2.4 we observe the asymmetry of assonance in the studied poems. This can be caused both by the given language and by the style of the author. If the given language excludes specific sequences (e.g. in Indonesian there are sequences [,a] but no sequences [a,]), asymmetry is not necessarily given. In general, as far as vowel sequences are concerned, the more vowels there are in the language, the smaller is the probability of the existence of all reverse sequences in a poem.

    Whatever the situation in the given language, asymmetry can be measured and expressed by an indicator. For this purpose, we use the well-known Bowker test, which is based on the comparison of all cells (i,j) with their symmetric cells (j,i), where i ≠ j, i.e. the diagonal of the contingency table is left out. We set up the criterion

    \[
    (2.2.5) \quad X^{2} = \sum_{i=1}^{k-1} \sum_{\substack{j=i+1 \\ n_{ij}+n_{ji} \neq 0}}^{k} \frac{(n_{ij} - n_{ji})^{2}}{n_{ij} + n_{ji}}
    \]

    which is distributed like a chi-square with k(k-1)/2 degrees of freedom (DF), k being the number of classes, here vowels (k = 12). Using Table 2.2.1 we obtain

    \[
    X^{2} = \frac{(6-4)^{2}}{6+4} + \frac{(4-1)^{2}}{4+1} + \dots + \frac{(1-1)^{2}}{1+1} + \frac{(2-0)^{2}}{2+0} = 49.9151,
    \]

    which with DF = 12(12-1)/2 = 66 yields P = 0.93, showing that the sequences are quite symmetric. In this way, all poems could be scrutinised. But if we perform the same test on the overall Table 2.2.4, we obtain X² = 297.802, which with 66 degrees of freedom is highly significant, i.e. it testifies to strong asymmetry. All in all, there is a strong tendency to avoid reverse/symmetric vocalic assonances either in Eminescu's work or in Romanian in general. In any case, the extent of asymmetry can be considered a property of his poems.

    This is not an automatic result but a case of poem structure. However, only a comparison with other poets, also in other languages, would help to unveil the strength of this structure. Simple functions applied to the sequence do not yield a better result than R² = 0.80, but this is surely due to the fact that each poem is an individual creation; a better result would follow only if we had many poems of the same length. For the time being, we leave this problem as a future task.
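The Bowker criterion (2.2.5) applied above to the overall Table 2.2.4 can be sketched as follows. This is a minimal illustration, assuming that cell pairs with n_ij + n_ji = 0 are simply skipped and that the degrees of freedom are taken as k(k-1)/2, as in the text; the names `TABLE_224` and `bowker` are hypothetical.

```python
# Table 2.2.4: rows = first vowel, columns = second vowel (positional order).
TABLE_224 = [
    [0, 23, 2, 5, 8, 0, 0, 5, 2, 1, 7, 6],
    [1, 0, 8, 0, 1, 10, 5, 1, 2, 3, 7, 10],
    [1, 6, 2, 0, 2, 6, 1, 0, 5, 2, 10, 0],
    [1, 0, 4, 4, 2, 6, 5, 3, 3, 4, 5, 12],
    [43, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0],
    [0, 4, 1, 7, 7, 2, 0, 28, 12, 4, 0, 4],
    [1, 0, 14, 2, 0, 2, 0, 1, 4, 2, 1, 2],
    [8, 0, 2, 34, 0, 0, 0, 0, 4, 2, 1, 0],
    [0, 1, 1, 1, 3, 19, 9, 9, 4, 3, 7, 7],
    [42, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 5, 1, 3, 2, 5, 24, 3, 0, 2, 0, 0],
    [8, 3, 4, 2, 1, 1, 0, 0, 4, 1, 1, 0],
]


def bowker(table):
    """Return the Bowker statistic and k(k-1)/2 degrees of freedom."""
    k = len(table)
    stat = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            pair = table[i][j] + table[j][i]
            if pair:  # pairs that never occur in either order are skipped
                stat += (table[i][j] - table[j][i]) ** 2 / pair
    return stat, k * (k - 1) // 2


stat, df = bowker(TABLE_224)
print(round(stat, 3), df)
```

The probability P would follow from the upper tail of the chi-square distribution with `df` degrees of freedom, which is not computed in this sketch.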


    2.2.3 Poem length and significant sequences

    Out of the 144 possible vowel sequences, only 95 occur as significant ones in the given poems (the non-empty cells of Table 2.2.4). One can expect that as the number of lines increases, the number of significant sequences will increase, too. However, the increase is not linear; it can be captured more adequately by a power function. We apply such a function to the data in Table 2.2.5, which results from counting the significant sequences for each poem in Table 2.2.3. As can be seen in Figure 2.2.1, the number of significant sequences increases with increasing poem length.

    Figure 2.2.1. Increase of significant vowel sequences with increasing poem length

    Table 2.2.5: Dependence of significant sequences on verse numbers

    No. of verses | No. of significant vowel sequences | No. of verses | No. of significant vowel sequences
    12 | 7 | 41 | 16
    12 | 9 | 42 | 11
    12 | 9 | 44 | 6
    13 | 7 | 44 | 12
    14 | 11 | 46 | 11
    14 | 13 | 48 | 11
    16 | 7 | 48 | 10
    16 | 9 | 48 | 12
    16 | 11 | 52 | 14
    18 | 10 | 56 | 12
    20 | 9 | 56 | 19
    20 | 11 | 56 | 13
    24 | 11 | 70 | 15
    24 | 12 | 78 | 12
    25 | 11 | 80 | 15
    28 | 12 | 86 | 15
    32 | 11 | 88 | 14
    32 | 15 | 92 | 12
    36 | 12 | 92 | 13
    36 | 11 | 114 | 14
    36 | 14 | 210 | 17
    38 | 9 | 285 | 23
    40 | 14 | 392 | 21
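A power function y = a·x^b can be fitted to the data of Table 2.2.5 in several ways. The sketch below uses ordinary least squares on the logarithms of both variables, which is an assumption made here for simplicity; the text does not state which fitting procedure or software was used, so the resulting parameters need not coincide with those behind Figure 2.2.1. The names `DATA` and `fit_power` are illustrative.

```python
import math

# (number of verses, number of significant vowel sequences) from Table 2.2.5
DATA = [
    (12, 7), (12, 9), (12, 9), (13, 7), (14, 11), (14, 13), (16, 7), (16, 9),
    (16, 11), (18, 10), (20, 9), (20, 11), (24, 11), (24, 12), (25, 11),
    (28, 12), (32, 11), (32, 15), (36, 12), (36, 11), (36, 14), (38, 9),
    (40, 14), (41, 16), (42, 11), (44, 6), (44, 12), (46, 11), (48, 11),
    (48, 10), (48, 12), (52, 14), (56, 12), (56, 19), (56, 13), (70, 15),
    (78, 12), (80, 15), (86, 15), (88, 14), (92, 12), (92, 13), (114, 14),
    (210, 17), (285, 23), (392, 21),
]


def fit_power(data):
    """Fit log y = log a + b log x by least squares.

    Returns (a, b, r2), where r2 is the determination coefficient
    of the straight line in log-log space.
    """
    xs = [math.log(x) for x, _ in data]
    ys = [math.log(y) for _, y in data]
    m = len(data)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    r2 = sxy ** 2 / (sxx * syy)
    return a, b, r2


a, b, r2 = fit_power(DATA)
print(f"y = {a:.3f} * x^{b:.3f}, R^2 (log-log) = {r2:.3f}")
```

An exponent between 0 and 1 expresses the non-linear, slowing increase described in the text; a nonlinear least-squares fit on the original scale would weight the long poems more strongly.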

    There is no significant trend; the sequences are likely to correspond to those usual in Romanian word structure. It will be reasonable to characterise Eminescu only after several Romanian authors have been investigated.

    Alternatively, poem size can be measured by the number of words. This is done in Table 2.2.6.

    The result of the regression analysis is displayed in Figure 2.2.2. As can be seen, the regression is too weak here to be considered real.

    Table 2.2.6: Title and size of analysed 46 poems by Eminescu

    No. | Poem title | N words (text size) | No. of significant chains of phonemes
    1 | Lebăda | 41 | 7
    2 | Peste vârfuri | 47 | 9
    3 | Dintre sute de catarge | 50 | 9
    4 | Și dacă... | 53 | 9
    5 | La mijloc de codru... | 55 | 7
    6 | Somnoroase păsărele... | 55 | 7
    7 | La steaua | 71 | 11
    8 | Adânca mare | 75 | 11
    9 | Trecut-au anii | 88 | 13
    10 | Lacul | 90 | 9
    11 | Ce te legeni... | 102 | 11
    12 | Odă în metru antic | 103 | 11
    13 | De ce nu-mi vii | 123 | 11
    14 | Mai am un singur dor | 125 | 14
    15 | Criticilor mei | 130 | 12
    16 | O, mamă | 140 | 10
    17 | Cu mâne zilele-ți adaogi... | 141 | 11
    18 | Revedere | 141 | 12
    19 | Sara pe deal | 156 | 12
    20 | Atât de fragedă | 176 | 11
    21 | Freamăt de codru | 179 | 10
    22 | Ce-ți doresc eu ție, dulce Românie | 183 | 15
    23 | Pe lângă plopii fără soț | 199 | 6
    24 | Povestea codrului | 220 | 14
    25 | Floare-albastră | 247 | 19
    26 | Sonete | 265 | 11
    27 | Despărțire | 304 | 9
    28 | Ghazel | 331 | 14
    29 | La moartea lui Heliade | 332 | 12
    30 | O, adevăr sublime... | 334 | 12
    31 | Iubită dulce, o, mă lasă | 337 | 12
    32 | O călărire în zori | 346 | 15
    33 | Dacă treci râul Selenei | 356 | 16
    34 | Rugăciunea unui dac | 357 | 11
    35 | Copii eram noi amândoi | 375 | 12
    36 | Glossă | 380 | 15
    37 | Povestea teiului | 390 | 14
    38 | Venere și Madonă | 393 | 11
    39 | Făt-Frumos din tei | 415 | 13
    40 | Dumnezeu și om | 443 | 13
    41 | Junii corupți | 458 | 12
    42 | Mortua est! | 491 | 15
    43 | Epigonii | 921 | 14
    44 | Împărat și proletar | 1510 | 17
    45 | Luceafărul | 1737 | 21
    46 | Scrisoarea III | 2278 | 23

    Figure 2.2.2. Number of significant phoneme chains vs. poem lengths in words

    2.3 Alliteration

    In poetry, there are two kinds of alliteration (= repetition of the same sound at the beginning of linguistic entities): (1) in the poem, at the beginning of verses, which we will call Skinner alliteration, and (2) in the line, at the beginning of words, which can be called Beowulf alliteration. Both kinds may be evaluated using the same method. However, there are different starting approaches:

    (a) stating the relative frequencies of all phonemes (sounds) in the poetic work of the author;
    (b) stating the relative frequencies only at the beginning of words (or verses) in all his poems;
    (c) considering only the given poem and taking all phoneme/sound frequencies into account;
    (d) considering only the given poem and taking only the initial phonemes/sounds and their frequencies into account;
    (e) starting from the relative frequencies of phonemes/sounds in the given language.

    Needless to say, the outcomes of the tests may turn out to be very different, but none of these universes of discourse represents a population in the statistical sense. They are samples, but there is no population which could be called "phoneme/sound frequency with Eminescu" or even "phoneme/sound frequency in Romanian". Every language changes every day, has many dialects

  • Alliteration 45

    and idiolects, and a population should also represent all spoken utterances. Hence, in language there is no population (cf. Orlov, Boroda, Nadarejšvili 1982). Nevertheless, one may always start from a conventionally stated background and obtain a restricted result.

    Furthermore, two distance approaches can be differentiated: (f) taking into account the mutual distance of the given alliterated lines or words, because the repetition of the same sound e.g. at the beginning of the 1st and of the 100th line surely does not have an alliterative effect, or (g) ignoring distance. Computation according to approach (f) is much more difficult and requires some intuitive, subjective decisions about the distance within which alliteration can still be perceived.
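The difference between approaches (f) and (g) can be made concrete with a small Python sketch. It only collects pairs of units (lines or words) that begin with the same sound; the parameter `max_distance`, the function name `alliterating_pairs` and the sample sequence are hypothetical and serve purely as illustration. The choice of a concrete distance remains the subjective decision mentioned above.

```python
def alliterating_pairs(initials, max_distance=None):
    """Return index pairs (i, j), i < j, whose units start with the same sound.

    max_distance=None ignores distance (approach g); an integer keeps only
    pairs that are at most that many units apart (approach f).
    """
    pairs = []
    for i, first in enumerate(initials):
        for j in range(i + 1, len(initials)):
            if max_distance is not None and j - i > max_distance:
                break  # all later units are even farther away
            if initials[j] == first:
                pairs.append((i, j))
    return pairs


# Invented line-initial sounds of an 8-line poem (not taken from the corpus).
initials = ["k", "d", "k", "s", "p", "d", "s", "k"]

print(alliterating_pairs(initials))        # approach (g): distance ignored
print(alliterating_pairs(initials, 2))     # approach (f): at most 2 lines apart
```

With distance ignored, every repetition counts, however far apart the lines are; with a small window only neighbouring repetitions survive, so the two approaches can lead to quite different counts for the same poem.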

    Here we shall rely on the sound frequencies in 146 poems by Eminescu. A study based on the phoneme frequencies of modern Romanian would mean judging samples on the basis of a population which arose circa 140 years later; considering the complete language of Eminescu would mean considering only his written and preserved texts, not his spoken ones. Every decision about the choice of a population in language is simply a condition under which the analysis will be performed. The situation resembles mathematical theorems that begin with "Let ... be given".

    Computing the Skinner alliteration does not differ from that of general euphony but here we do not differentiate vowels and consonants because both can occur at the same position. All formulas have been presented in previous chapters.

    The results of the computation of Skinner alliteration are presented in Table 2.3.1.

    Table 2.3.1: Skinner alliteration in 46 poems by M. Eminescu

    Year | Poem | No. of verses | Phoneme: alliterative euphony | Mean Skinner alliteration of the poem
    1869 | Lebăda | 12 | //: 1.66509; /l/: 3.55578 | 0.43507
    1883 | Peste vârfuri | 12 | /m/: 4.93605; /p/: 0.48573 | 0.45181
    1883 | Și dacă... | 12 | /j/: 4.30004; //: 4.99989 | 0.77499
    1883 | La mijloc de codru... | 13 | /l/: 4.79498; //: 4.99983 | 0.75345
    1873 | Adânca mare... | 14 | /a/: 2.20067; //: 2.65572 | 0.34688
    1883 | Trecut-au anii | 14 | /k/: 3.95403; //: 2.65572 | 0.47213
    1883 | Somnoroase păsărele | 16 | /d/: 3.03033; /p/: 3.99200; /s/: 4.71373 | 0.73350
    1886 | La steaua | 16 | /j/: 3.39116; /l/: 4.53123 | 0.49515
    1880 | Dintre sute de catarge | 16 | /k/: 3.46930; /t/: 2.31987; /d/: 3.03033; /v/: 4.99050 | 0.86312
    1880 | O, mamă... | 18 | /d/: 2.28268; /m/: 4.66702; /s/: 4.54768 | 0.63874
    1876 | Lacul | 20 | //: 3.78911; /s/: 0.90644; //: 4.96607 | 0.48308
    1883 | Odă în metru antic | 20 | /p/: 4.99844 | 0.24992
    1887 | De ce nu-mi vii | 24 | /k/: 4.20917; /d/: 4.83748; /s/: 4.79619 | 0.57679
    1885 | Sara pe deal | 24 | /s/: 4.97454; /v/: 4.38810 | 0.39011
    1883 | Ce te legeni codrule | 25 | /b/: 2.03667; /d/: 4.80299; //: 4.99998; /z/: 3.59533 | 0.61740
    1883 | Criticilor mei | 28 | /k/: 4.971341; /t/: 3.98666 | 0.319929
    1883 | Cu mâne zilele-ți adaogi... | 32 | /k/: 4.99917; /t/: 3.53697; /d/: 2.06036; //: 4.78625 | 0.48071
    1867 | Ce-ți doresc eu ție, dulce Românie | 32 | /k/: 2.83420; /t/: 3.53697; /d/: 4.39714; /f/: 0.02892; /v/: 4.84627 | 0.48886
    1879 | Revedere | 36 | /k/: 4.98289; /t/: 2.99134; /m/: 4.12971; //: 4.66840; /v/: 3.12156 | 0.55261
    1879 | Atât de fragedă... | 36 | /k/: 4.32218; /m/: 1.13610; //: 4.66840 | 0.28130
    1883 | Mai am un singur dor | 36 | /d/: 4.00055; /p/: 3.05850 | 0.19608
    1879 | Despărțire | 38 | /k/: 4.99999; /t/: 2.68212; /s/: 4.67215 | 0.32511
    1873 | Ghazel | 40 | /d/: 3.45432; /p/: 4.90593; //: 4.99999 | 0.33401
    1873 | Dacă treci râul Selenei | 41 | /j/: 3.73529; /k/: 3.82598; /d/: 3.29193; /m/: 4.99121; /p/: 2.01644; //: 4.93282 | 0.55594
    1879 | Sonete | 42 | /k/: 4.95362; /d/: 3.11865; /p/: 4.30731; /s/: 2.70689; //: 4.41912 | 0.46442
    1883 | Pe lângă plopii fără soț | 44 | /o/: 4.99518; /k/: 4.93787; //: 4.31429 | 0.32380
    1874 | O, adevăr sublime... | 44 | /o/: 4.96600; /k/: 4.66043; //: 4.31429; /t/: 1.63301 | 0.35395
    1879 | Rugăciunea unui dac | 46 | /j/: 4.53414; /k/: 4.91815; /p/: 4.80177; /s/: 4.99392; //: 4.99989 | 0.52713
    1887 | Venere și Madonă | 48 | /k/: 2.78525; /d/: 1.83675; /f/: 3.33612; //: 4.99984 | 0.26996
    1879 | Freamăt de codru | 48 | /j/: 2.62252; /k/: 4.89379; /t/: 4.25465; /p/: 0.09509; /s/: 1.17964; //: 4.06876 | 0.35655
    1867 | La moartea lui Heliade | 48 | /a/: 2.90305; /o/: 3.50287; /k/: 4.89379; /p/: 0.09509; /v/: 4.31774 | 0.32734
    1878 | Povestea codrului | 52 | /k/: 1.98510; /t/: 4.01405; //: 4.80065 | 0.20769
    1871 | Iubită dulce, o, mă lasă | 56 | /k/: 3.88829; /t/: 4.79065; /s/: 4.97055; //: 4.99948 | 0.33302
    1873 | Floare-albastră | 56 | //: 4.99999; /v/: 3.83283 | 0.15773
    1873 | Dumnezeu și om | 56 | //: 3.82567; /d/: 3.29543; /f/: 2.50228; //: 4.99508 | 0.26104
    1871 | Mortua est! | 70 | /k/: 4.99999; /d/: 3.52076; /p/: 3.44782; /s/: 3.09290; //: 4.97938 | 0.28630
    1869 | Junii corupți | 78 | //: 4.99795; /k/: 4.99999; /t/: 1.24682; /d/: 4.18110; /s/: 1.77316; //: 4.99387; /v/: 1.53272 | 0.30417
    1883 | Glossă | 80 | /t/: 4.99993; /d/: 4.72079; //: 3.75505; /t/: 4.99999; /v/: 4.13341 | 0.28261
    1866 | O călărire în zori | 86 | //: 3.11600; /k/: 4.93680; /d/: 4.54513; /p/: 4.98123; //: 4.99817 | 0.26253
    1887 | Povestea teiului | 88 | /j/: 4.91265; /k/: 4.03543; /d/: 3.37119; /p/: 3.63219; /s/: 4.26721; //: 4.98556 | 0.28641
    1871 | Copii eram noi amândoi | 92 | //: 2.46870; /k/: 4.99382; /d/: 4.98496; /m/: 3.26839; /p/: 4.48592; //: 4.99999 | 0.273932
    1875 | Făt-Frumos din tei | 92 | //: 2.46870; /j/: 4.56485; /k/: 3.75689; /p/: 4.48592; //: 4.99999 | 0.22040
    1870 | Epigonii | 114 | /k/: 4.95225; /t/: 4.99996; /p/: 0.28475; /s/: 4.49828; //: 4.99698; /v/: 4.99860 | 0.21694
    1874 | Împărat și proletar | 210 | //: 4.78623; /k/: 4.99999; /t/: 4.76277; /p/: 4.99999; /s/: 3.18147; //: 4.99998; /v/: 4.35967; /z/: 3.09749 | 0.16756
    1881 | Scrisoarea III | 285 | //: 4.80557; /k/: 4.99999; /t/: 3.16111; /d/: 4.99732; /p/: 4.58681; //: 4.99999; /v/: 4.99849 | 0.114208
    1883 | Luceafărul | 392 | //: 4.59363; /j/: 4.99538; /k/: 4.99999; /d/: 4.99999; /h/: 4.94273; /p/: 4.99950; /s/: 3.81825; //: 4.99999 | 0.09783
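Judging from the tabulated values, the last column of Table 2.3.1 appears to equal the sum of a poem's alliterative euphonies divided by its number of verses. This is a reading of the data, not a formula quoted from the text (the formulas themselves are given in previous chapters), and the names `mean_skinner` and `ROWS` below are illustrative. A minimal check on three rows of the table:

```python
def mean_skinner(euphonies, n_verses):
    """Sum of the alliterative euphonies divided by the number of verses."""
    return sum(euphonies) / n_verses


# (euphonies, number of verses, tabulated mean) for three rows of Table 2.3.1
ROWS = {
    "Lebăda": ([1.66509, 3.55578], 12, 0.43507),
    "Somnoroase păsărele": ([3.03033, 3.99200, 4.71373], 16, 0.73350),
    "Odă în metru antic": ([4.99844], 20, 0.24992),
}

for title, (euphonies, n_verses, tabulated) in ROWS.items():
    print(title, round(mean_skinner(euphonies, n_verses), 5), tabulated)
```

If this reading is right, a long poem needs many significantly alliterating line-initial phonemes to reach a high mean, which should be kept in mind when poems of very different lengths are compared.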

    Figure 2.3.1. Skinner alliteration in 46 poems by Eminescu Comparing the mean alliteration of po