
Entropy and Complexity in Music: Some examples

by Barbra Gregory

A thesis submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Master of Science in the Department of Mathematics.

Chapel Hill 2005

Approved by

Advisor: Professor Karl Petersen

Reader: Professor Sue Goodman

Reader: Professor Jane Hawkins

ABSTRACT

BARBRA GREGORY: Entropy and Complexity in Music: Some examples

(Under the direction of Professor Karl Petersen)

We examine the concepts of complexity and entropy and their application to musical structures

and explore the difficulty of obtaining a conclusive measure of entropy for a musical system. A set

of tools is applied to examples from modern composers, classical Western music, and traditional

tribal customs. Ergodic theory techniques are used to find the entropy of musical performance

structures left up to chance or choice. Techniques set forth by Lempel and Ziv, Pressing, and

Lerdahl and Jackendoff to explore the complexity of rhythms are studied and applied. Finally, we

consider the complexity of linguistic structure on a series of notes, first by counting the patterns

that are allowed within that structure and then by defining a new measure of complexity inspired

by the work of Pressing.


ACKNOWLEDGMENTS

I would like to thank Dr. Thomas H. Brylawski of the University of North Carolina at Chapel

Hill for his help with Proposition 4.1.5. Dr. Thomas Warburton and Dr. Sarah Weiss of the

Department of Music at the University of North Carolina at Chapel Hill also provided useful

support throughout this work.

Thank you also to my advisor Karl Petersen who was adventurous enough to explore music and

mathematics with me.

And to Josh who kept telling me that this thesis could and would get done.


CONTENTS

LIST OF FIGURES
LIST OF TABLES

Chapter
1. Introduction
   1.1. General outline
   1.2. Background mathematics
2. ...explosante-fixe... by Pierre Boulez
   2.1. Entropy and ...explosante-fixe...
   2.2. The effect of symmetry
   2.3. Choice and chance in music
3. African, Brazilian, and Cuban Clave Rhythms
   3.1. The Clave patterns
   3.2. Lempel-Ziv complexity
   3.3. Cognitive complexity
   3.4. Metricity and metric complexity
4. The traditional Nzakara harpists
   4.1. Counting the harp patterns
      4.1.1. No duplicated letters
      4.1.2. Non-factorable words
      4.1.3. Translation factorability
      4.1.4. Examples with different alphabet sizes and different word lengths
   4.2. The complexity of the harp patterns
5. Conclusion

BIBLIOGRAPHY


LIST OF FIGURES

Figure 2.1. ...explosante-fixe...
Figure 2.2. Relabelling the graph of ...explosante-fixe...
Figure 2.3. Composition with radial symmetry
Figure 2.4. Probability vectors for Musikalisches Wurfelspiel
Figure 4.1. The five allowable bichords of the Nzakara harpists
Figure 4.2. Example of a limanza canon
Figure 4.3. The five allowable bichords of the Nzakara harpists


LIST OF TABLES

Table 3.1. The Clave rhythms
Table 3.2. The Lempel-Ziv complexity measures of the six Clave rhythms
Table 3.3. Types of syncopation in Pressing's cognitive complexity measure
Table 3.4. Calculation of the cognitive complexity of four-pulse rhythms
Table 3.5. Cognitive complexities of the six Clave rhythms
Table 3.6. Complexity measures of the six Clave rhythms
Table 3.7. Cognitive and metric complexities of the four-pulse rhythms
Table 4.1. Notation and Terminology
Table 4.2. The effect of Nzakara harp grammar in Z3
Table 4.3. The effect of Nzakara harp grammar in Z5
Table 4.4. The effect of Nzakara harp grammar in Z7
Table 4.5. The computed complexities of selected bichord patterns


CHAPTER 1

Introduction

From the early days of formal language study, researchers have striven to gain insight into the

mysterious communication power of music by fitting it into a linguistic framework. In the en-

thusiasm following Chomsky’s popularization of linguistics, authors such as Harold Powers, Robin

Cooper, and Judith and Alton Becker have sought Chomsky-inspired hierarchies in Javanese game-

lan and Indian Raga music [23, 11, 3]. Researchers such as William Bright, Steven Feld, and

Fred Lerdahl and Ray Jackendoff took a more general approach, seeking ways in which music and

linguistics can cooperate across musical styles [6, 13, 18]. Most attempts at relating linguistics

and music, however, have been criticized repeatedly for such fallacies as failing fully to appreciate

musicality, for attempting to squeeze music into a box where it simply doesn’t fit, for stretching the

definitions to a point where the meaning and usefulness disappear, and even for being Eurocentric.

In 1975, Ruwet pointed out one flaw contributing to the continued difficulty of quantizing and

linguisticizing music: “All human languages are apparently of the same order of complexity, but

that is not the case for all musical systems.” [26]

The multi-dimensionality of music requires a framework different from that of a single-layered

grammar. This idea was expanded by Powers in 1980:

If this be true – and I cannot imagine anyone would think otherwise once it is called to his attention – it highlights a fundamental deficiency in the general analogy of musical structuring with the structuring of languages. Put barbarously in terms of the analogy itself, the 'linguisticity' of languages is the same from language to language, but the 'linguisticity' of musics is not the same from music to music.

To Ruwet's telling observation I would add only that musical systems are much more varied than languages not only as to order of complexity but also as to kind of complexity. For instance, no two natural languages of speech could differ from one another in any fashion comparable to the way in which a complex monophony like Indian classical music or the music of the Gregorian antiphoner differs from a complex polphony (sic) like that of the Javanese gamelan klenengan or of 16th-century European motets and masses. In monophonic musical languages we sing or play melodic lines more or less alone, just as we talk more or less one at a time in conversation, and our hearers follow accordingly. We do not all talk at once, saying different things, and expect coherently to be understood. Yet in ensemble musics with varied polyphonic lines we can (so to speak) easily make beautiful music together, which can be as easily followed by those who know how. [24]

While these observations imply a great difficulty in examining the complexity of a piece or style

of music, even a simple accompanied melodic line, we can nevertheless examine the complexities

of individual layers of music, such as harmonic progressions, rhythms, or structural progressions.

We can easily adopt models from mathematical linguistics to study these aspects on an individual

basis.

Music can be boring in two ways: either too repetitive and structured with no anticipation or

novelty, or too random with no structure in which to interpret what is heard. While we cannot

expect to find, for example, a range of complexity within which music is universally interesting

and inviting, it may be of interest to researchers in various fields to investigate the relationship

between complexity and its influence upon the audience. The multidimensionality of music allows,

for example, one layer (perhaps the harmony) to be relatively complex while other layers (perhaps

rhythm and melody) are simple and straightforward. But what if all layers have the same level of

complexity? Will they conflict and leave the listener feeling lost? These questions are beyond the

scope of this paper. However, study of the nature of complexity of the interaction among layers can

only follow understanding of the individual layers by themselves. This is the task we undertake.

Throughout our work, the term complexity is used generally as a measure of the predictability

of a sequence of notes or strikes (as in rhythm). Lempel and Ziv defined complexity as the measure

of randomness of a sequence [16]. Jeffrey Pressing [25] describes three incarnations of the concept

of complexity. First he describes hierarchical complexity, or the existence of structure on several

different levels at the same time, such as one might find in the polyrhythmic and polyphonic

music of Africa [2]. Pressing also describes the adaptive or dynamic complexity of systems that

adapt and change over time as new information is gained. Finally, he describes information-based

complexity, or the minimal cost of computing a solution to a given problem. While information-

based complexity is a common consideration among computer scientists, Pressing adapts the idea

to define cognitive complexity. Cognitive complexity measures the cognitive cost of learning a task,

such as the execution of a rhythmic pattern.

Pressing’s notion of information-based complexity is closely related to the algorithmic com-

plexity of Kolmogorov and Chaitin [15, 8]. In this context, we might call a finite string complex

if its shortest description is about as long as the string itself [29]. Researchers such as Chaitin

and Bennett have examined this interpretation and attempted to answer such questions as what

exactly a description might be and how one might reproduce the string from such a description.

One definition that has arisen from this work is that the complexity for a finite string s is the

length of the shortest string p of 0’s and 1’s such that a fixed universal Turing machine will output

s when given p as input [29]. In probability theory, Uspenskii, Semenov, and Shen examined the


idea of randomness [?]. In doing so, they discuss some of the relations among randomness, com-

plexity, stochasticity, and entropy. Other notions of complexity arise in such fields as cellular automata,

information theory, statistical inference, ergodic theory, and dynamical systems [29].

Entropy, as used in information theory, measures the complexity of a sequence by measuring

the amount of unexpectedness of each symbol (or pitch or pulse), or, in other words, the amount

of information conveyed with each new element of the sequence [22]. Alternatively, one might

consider the entropy of a message to be the theoretical smallest number of bytes which can convey

the entire meaning of the message [22]. According to [7], topological entropy is “the exponential

growth rate of the number of essentially different orbit segments of length n” and measures “the

complexity of the orbit structure of a dynamical system.” More simply, [12] defines entropy as the

“rate at which information can be gleaned about a dynamical system from a sequence of repeated

observations” and [20] describes the topological entropy of a symbolic dynamical system as the

amount of freedom to concatenate words and remain within the system. We will consider many

incarnations of complexity and entropy: the richness of a language as a function of the constraints

put upon it and the consequent size of the vocabulary as in the case of the Nzakara harpists; the

unpredictability of a sequence as in the study of ...explosante-fixe..., or the Lempel-Ziv complexity

of the Clave rhythm patterns; or the amount of mental processing required to perform or hear

sequences as in the cognitive complexity measures of the rhythms and the complexity measure of

the harp patterns proposed by the author.

1.1. General outline

We consider a few methods of examining the complexity and entropy of selected layers of four

musical samples: ...explosante-fixe... by Pierre Boulez; Musikalisches Wurfelspiel, attributed to

Wolfgang Amadeus Mozart; the Clave rhythms from African, Brazilian, and Cuban traditions;

and the traditional ritualistic harp and xylophone music of the Nzakara in Central Africa. We

build upon work by Marc Chemillier of the Universite de Caen and Godfried Toussaint of McGill

University, among others.

Chapter Two considers musical pieces which involve either an element of choice given to the

performer or an element of chance in their composition. Specifically, it uses the work of Block,

Guckenheimer, Misiurewicz, and Young to measure the amount of choice allowed to the performer

in ...explosante-fixe... by Pierre Boulez. It continues by considering the effect of limiting some

choices which Boulez had allowed. We consider a piece as being generated by a directed graph,

with the vertices representing the elements among which the performer chooses and the edges rep-

resenting the order in which the performer may choose these elements. This allows us to calculate

and compare the entropies of the two graphs: one with the choices Boulez allowed and one with


two of those choices removed. We find that, as expected, decreasing the number of choices in-

creases the predictability of the performance. The chapter concludes with a discussion of Mozart’s

Musikalisches Wurfelspiel, in which the player “composes” a piece of music by rolling a pair of

dice to determine which of eleven prescribed choices to play at each of 16 measures. It is unclear

whether Mozart understood that different outcomes on the dice have different probabilities (for

example, rolling a seven is more common than rolling any other number). In this section, we find

that rolling dice gives a slightly more predictable result than if we were to draw one of the eleven

measures out of a hat with equal probability.

Chapter Three is based upon the work of Godfried Toussaint of McGill University in Montreal.

It continues his 2002 examination of the Clave rhythms of Cuba, Brazil, and Africa. Toussaint used

the work of Lempel and Ziv, Lerdahl and Jackendoff, and Jeffrey Pressing to quantify the relative

difficulty of each of six basic Clave rhythms. Here we review the work of Toussaint, consider

differences of interpretation of the original authors, and present some possible modifications of

Toussaint’s methods. We expand upon Toussaint’s use of the Lempel-Ziv algorithm to include an

alternative common parsing algorithm. We discuss the use of each of these parsings in calculations

of entropy and their most common modern usage in data compression. While the Lempel-Ziv

algorithms do not provide much information on the complexity of these relatively short rhythm

patterns, we find that the entropy of these patterns is zero, as should be true of any infinite cyclic

sequence. Further, we readapt the complexities defined by Lerdahl and Jackendoff and by Pressing

to compare the two in their ranking of the difficulty of the six rhythms.

Chapter Four examines the traditional harp patterns of the Nzakara in Central Africa. The

chapter is divided into two main sections, both based upon the work of Marc Chemillier, who

observed an ascending ladder property in the patterns of chords used by the harpists. The first

section considers the extent to which the observed rules and patterns limit the music allowed. Here

we measure the richness of the language by counting the number of words allowed when each of the

rules is applied. We find that, though the grammatical limits severely restrict the allowed patterns,

the number of allowed patterns still grows exponentially as the length of the core repeated unit

increases. The second section, inspired by Pressing’s cognitive complexity measure from Chapter

Three, proposes a measure of the complexity of the harp patterns based upon the required movement

of the harpists’ hands from string to string. The algorithm we propose is successful in measuring

what we intended to measure. However, it points out once again the difficulty in constructing a

single measure to define the complexity of a multi-faceted musical structure.


1.2. Background mathematics

Each of the examples presented in this work involves the mathematical structure known as a

subshift. Adopting the notation of [20], let A be a finite set, and call A the alphabet. A block or word

is a string of finite length whose letters are elements of A. Then A∗ is defined to be the collection

of all finite strings of letters (or all words) from the alphabet A. A language on the alphabet A is

any subset of A∗.

The shift space, Σ(A), is defined as

(1.1) Σ(A) = {x = (x_i)_{i=-∞}^{∞} : x_i ∈ A for each i}.

The one-sided shift space is defined similarly by

(1.2) Σ(A)^+ = {x = (x_i)_{i=0}^{∞} : x_i ∈ A for each i}.

The shift transformation σ : Σ(A) → Σ(A) is defined by

(1.3) (σx)_i = x_{i+1}, for all i.

In other words, the result of the shift transformation on an element (a sequence) in the shift space is

the same sequence “picked up and shifted one unit to the left.” Throughout this work, we imagine

the music heard as part of a potentially infinite sequence of sounds. Thus the shift transformation

will correspond to the transition between movements as in the case of ...explosante-fixe..., or to

changing chords in the Nzakara harp music, or moving on to each new pulse of the Clave rhythms.

A subset X of Σ(A) is called a subshift of Σ(A) if it is a nonempty, closed, shift-invariant (i.e.,

σX = X) set. The language of a subshift X is the set of all finite-length strings found in X. The

cylinder set determined by a block B of length r at position j ∈ Z is the set of elements of Σ(A)

which contain the block B beginning at the j’th position. A subshift X is called ergodic if and only

if there is an element x ∈ X which contains all words in the language of X.

For a subshift X of Σ(A), let N_n(X) be the number of distinct blocks of length n in the language of X, for n = 1, 2, .... Then the topological entropy of X is defined as the limit of the average of the logarithm of N_n as n → ∞:

h_top(X) = lim_{n→∞} (1/n) log N_n(X).

Thus h_top(X) measures the exponential growth rate of the number of words in the language of X

as we consider longer and longer words.

Given a graph Γ with any number of directed, labelled edges among n vertices, let mij be the

number of edges from vertex i to vertex j. The incidence matrix M of Γ is defined to be the n by

n matrix (mij). The shift of finite type XM corresponding to Γ is the set of all infinite sequences


whose letters are the labels on the edges of Γ and whose spellings are determined by following the

edges in order around the graph. It is straightforward to show that M^2 is the matrix whose ij'th entry gives the number of two-step paths between vertex i and vertex j, and in general M^k gives

the number of k-step paths. The topological entropy of the system is given by

h_top(X_M) = lim_{n→∞} (1/n) log_2 (number of n-blocks in X_M) = lim_{n→∞} (1/n) log_2 Σ_{i,j} (M^n)_{ij} = log_2 λ_M,

where λ_M is the largest positive eigenvalue of M [19, 20].
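
As a quick illustration of this formula (a sketch added here, not part of the original thesis), the following Python fragment computes log_2 λ_M for a small hypothetical incidence matrix, the two-vertex "golden mean" graph; the matrix is an illustrative stand-in and does not come from the thesis.

    import numpy as np

    # Incidence matrix of a small illustrative graph (the "golden mean" shift):
    # vertex 0 has edges to vertices 0 and 1, vertex 1 has an edge back to 0 only.
    M = np.array([[1, 1],
                  [1, 0]])

    # Topological entropy of the shift of finite type X_M:
    # log2 of the largest positive eigenvalue of M.
    lam = max(abs(np.linalg.eigvals(M)))
    print(np.log2(lam))   # about 0.694, i.e. log2 of the golden ratio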


CHAPTER 2

...explosante-fixe... by Pierre Boulez

In the 1970s, mathematician-turned-composer Pierre Boulez wrote ...explosante-fixe... in honor

of the late Igor Stravinsky. ...explosante-fixe..., in both its original form and its current form

following major revisions in the 1990s, challenges traditional ideas of chamber music and makes

extensive use of electronics [5]. In its original form, the score consists of rough sketches of six

Transitoires and an Originel, among which the performers choose to create the finished performance

[14]. Each performance must begin or end with the Originel material, but it may otherwise walk

among the seven units according to Figure 2.1, and the performance need not use all six of the

transitoires.

2.1. Entropy and ...explosante-fixe...

In 1979, Block, Guckenheimer, Misiurewicz, and Young proposed a method of finding the

characteristic polynomial of a graph [4]. For a graph such as the one we have, which is shown

relabelled in Figure 2.2, they define a path p as a sequence of vertices, each connected to the next

by at least one edge. For example, in Figure 2.2, (1,2,3,4) is a path, while (1,5) is not. The width

[Figure 2.1 is a directed graph showing the allowed transitions among the Originel and Transitoires II through VII; the drawing itself is not reproduced here.]

Figure 2.1. ...explosante-fixe...[14]

[Figure 2.2 shows the same graph with its seven units relabelled 0 through 6; the drawing is not reproduced here.]

Figure 2.2. Relabelling the graph of ...explosante-fixe...

of a path, w(p), is the product of the number of edges between each pair of successive vertices. In

Figure 2.2, every path has width equal to one, since there is a maximum of one edge between any

two vertices. For a path p = (p0, p1, ..., pk), they define l(p) = k, the length of the path. A subset

R of vertices of the graph is called a rome if there is no loop outside of R, that is, there is no path

p = (p0, p1, ..., pk), l(p) > 0, such that pk = p0 and {p0, p1, ..., pk} is disjoint from R. Finally, given

a rome R, a path p = (p0, p1, ..., pk) is called simple with respect to R (or R-simple) if {p0, pk} ⊆ R

and {p1, p2, ..., pk−1} is disjoint from R.

Let M = (m_ij) be the incidence matrix for our graph. Let R = {r_1, ..., r_k} be a rome with r_i ≠ r_j for i ≠ j. Define A_R(x) = (a_ij(x))_{i,j=1}^{k}, where a_ij(x) = Σ_p w(p) x^{-l(p)}, the sum taken over R-simple paths p starting at r_i and ending at r_j. Then [4] proves Theorem 2.1.1:

Theorem 2.1.1. The characteristic polynomial of M is equal to

(2.1) (−1)^{n−k} x^n det(A_R(x) − I),

where I is the k × k identity matrix.

Proof. If R = {1, 2, ..., n}, the proof is straightforward. In this case, the only simple paths are those directly connecting the beginning and end vertices, i.e. paths of length one. So a_ij(x) = m_ij x^{-1} for each i and j. Thus det(A_R(x) − I) = det(x^{-1}M − I) = det(x^{-1}(M − xI)) = x^{-n} det(M − xI), so that

(2.2) x^n det(A_R(x) − I) = det(M − xI),

which is the characteristic polynomial of M.

If a subset T contains a rome S, then T must also be a rome. Consider a rome S = {s_1, ..., s_q} (where s_i ≠ s_j for i ≠ j), a vertex s_0 ∉ S, and T = S ∪ {s_0}. If det(A_S − I) = −det(A_T − I), then x^n det(A_S − I) = −x^n det(A_T − I), and the principle of induction can be used to prove the theorem by starting with the rome R = {1, ..., n} and reducing the size of the rome by one at each step.

If A_S − I = (b_ij)_{i,j=1}^{q} and A_T − I = (c_ij)_{i,j=0}^{q}, then by the definition of A_S and A_T we have b_ij = c_ij + c_i0 c_0j, since every S-simple path from s_i to s_j is either a T-simple path or a combination of two T-simple paths, one from s_i to s_0 and the other from s_0 to s_j. Since S is a rome and s_0 is not in S, there can be no T-simple path from s_0 to itself (such a path would be a loop outside S), so a_00(x) = 0 and c_00 = −1. By a series of column operations, we transform the matrix A_T − I into a matrix D = (d_ij)_{i,j=0}^{q} with d_ij = b_ij for i, j > 0, d_00 = −1, and d_0j = 0 for j > 0. Namely, we multiply the 0'th column of A_T − I by c_0j and add it to the j'th column of A_T − I for j = 1, ..., q. Basic linear algebra then tells us that det D = det(A_T − I). Since d_ij = b_ij for i, j > 0 and the 0'th row of D is (−1, 0, ..., 0), expanding det D along that row gives det(A_S − I) = −det D = −det(A_T − I), as desired. ♦

Consider, in Figure 2.2, the rome R = {1, 2, 3, 4, 5}. Then A_R(x) is as in Equation (2.3):

(2.3) A_R(x) =
[ x^{-2}          x^{-1}+x^{-2}   x^{-1}+x^{-2}   x^{-1}          0             ]
[ x^{-1}+x^{-2}   x^{-2}          x^{-1}+x^{-2}   0               x^{-1}        ]
[ x^{-1}+x^{-2}   x^{-1}+x^{-2}   2x^{-2}         x^{-1}+x^{-2}   x^{-1}+x^{-2} ]
[ x^{-1}          0               x^{-1}+x^{-2}   x^{-2}          x^{-1}+x^{-2} ]
[ 0               x^{-1}          x^{-1}+x^{-2}   x^{-1}+x^{-2}   x^{-2}        ]

Using Mathematica combined with Theorem 2.1.1, the characteristic polynomial of this graph is given by −1 − 16x − 40x^2 − 16x^3 + 20x^4 + 14x^5. Again using Mathematica, the largest eigenvalue of this system is approximately 4.15633, and by [19] and [4], the entropy of this system is given by h = log_2 4.15633 ≈ 2.0553.

To compute the entropy directly, let M = (m_ij) be the incidence matrix for the graph in Figure 2.2. Then the ij'th entry of the matrix M^n is the number of paths of length n from vertex i − 1 to vertex j − 1 in the graph. If the sum of the entries of M^n is denoted m_n, the entropy of the graph is h = lim_{n→∞} (1/n) log_2 m_n [20]. Again we find using Mathematica that the entropy of the graph is approximately equal to 2.0553.
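
Since the drawing of Figure 2.2 is not reproduced here, the incidence matrix in the following Python sketch is only a placeholder; the sketch (added for illustration, not from the thesis) simply shows the two computations agreeing, the direct path-count limit (1/n) log_2 m_n and the eigenvalue formula log_2 λ_M, for whatever incidence matrix one supplies.

    import numpy as np

    # Placeholder incidence matrix for illustration only; the actual 7x7 matrix
    # of Figure 2.2 is not reconstructed here.
    M = np.array([[0, 1, 1],
                  [1, 0, 1],
                  [1, 1, 0]])

    # Direct computation: h = lim (1/n) log2 (sum of entries of M^n).
    for n in (5, 10, 20, 40):
        m_n = np.linalg.matrix_power(M, n).sum()
        print(n, np.log2(float(m_n)) / n)

    # Eigenvalue computation: h = log2 of the largest positive eigenvalue of M.
    print(np.log2(max(abs(np.linalg.eigvals(M)))))   # equals 1 for this placeholder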

2.2. The effect of symmetry

Now consider what would happen if the transitions between Transitoire III and Transitoire IV

and between Transitoire V and Transitoire II were no longer allowed, as shown in Figure 2.3. Notice

that eliminating these transitions yields radial and other symmetries which the original structure

did not have. It stands to reason that this might yield a less interesting performance, as the possible

transitions from any of these four transitoires are fewer and thus more predictable. How does this

manifest itself mathematically?


[Figure 2.3 shows the relabelled graph with the transitions between Transitoire III and Transitoire IV and between Transitoire V and Transitoire II removed; the drawing is not reproduced here.]

Figure 2.3. Composition with radial symmetry

Using the same techniques as in Section 2.1, we find that we can use the same rome, but this

time,

(2.4) A_R(x) =
[ x^{-2}          x^{-2}          x^{-1}+x^{-2}   x^{-1}          0             ]
[ x^{-2}          x^{-2}          x^{-1}+x^{-2}   0               x^{-1}        ]
[ x^{-1}+x^{-2}   x^{-1}+x^{-2}   2x^{-2}         x^{-1}+x^{-2}   x^{-1}+x^{-2} ]
[ x^{-1}          0               x^{-1}+x^{-2}   x^{-2}          x^{-2}        ]
[ 0               x^{-1}          x^{-1}+x^{-2}   x^{-2}          x^{-2}        ].

The calculated entropy without these transitions is approximately 1.8662, again calculated two

ways, demonstrating mathematically that there is less freedom of choice for the performer and less

“surprise” for the listener. This supports the idea that a performance following the modified model

might be less exciting and interesting than the original.

2.3. Choice and chance in music

Composers were playing with the idea of chance in music as early as the 1700s [21]. At the

time, the techniques were intended to allow any person, with little or no knowledge of music, to

“compose” pleasing pieces. Among the most famous of these pieces is Musikalisches Wurfelspiel,

or “Musical Dice Game,” attributed to Wolfgang Amadeus Mozart. In this piece, the goal is to

compose a 16-measure waltz by rolling dice to select each consecutive measure. Mozart wrote 176

measures, arranged in 16 columns and 11 rows. Before each measure, a pair of six-sided dice is

rolled and the result determines which of the eleven options from the corresponding column is

played in that measure. Since some of the 176 measures are duplicates of each other, there are

somewhat fewer than 11^16 waltzes to be played. Further, since the dice rolled are two six-sided dice, the eleven options within each column have slightly different probabilities of occurrence.


A_1 = [1/36, 1/18, 1/12, 1/9, 5/36, 1/6, 5/36, 1/9, 1/12, 1/18, 1/36]   (the first through seventh and ninth through fifteenth tosses of the dice)

A_2 = [1]   (the eighth toss of the dice)

A_3 = [17/18, 1/18]   (the sixteenth and final toss of the dice)

Figure 2.4. Probability vectors for Musikalisches Wurfelspiel

To measure the entropy of this system, we consider each transition between measures as an

independent probabilistic event. The generation of the piece of music then becomes a series of

independent experiments, and [19] tells us how to compute the entropy. For each measure except

the 8th and 16th measures (the two cadence measures), Mozart wrote eleven distinct musical

patterns among which the roll of the dice will choose. For the 8th roll of the dice (i.e., the eighth

measure), Mozart wrote the same pattern eleven times. Thus, no matter what is rolled on the dice

for that musical measure, there is only one musical pattern which the musician could play and the

listener hear. For the final measure, Mozart wrote two musical patterns, duplicating one of them

ten times so that there is a 17/18 probability of hearing one pattern and a 1/18 probability of

hearing the other (taking into account the probability of throwing each possible total on the dice).

This makes a total of 11^14 · 2 = 759,499,667,166,482 possible waltzes.

Each dice roll is independent of the others, so we may find the entropy of each independent

experiment and add them together to find the total entropy of the system [19]. First, we construct

the probability vectors describing the outcomes of each throw of the dice. The j’th entry of each

vector is the probability of choosing the j’th musical option for the next measure. The vectors are

as in Figure 2.4.

The entropy of an individual roll is given by taking the sum of each entry of the probability

vector times its logarithm base 2:

h_k = −Σ_j p_j log_2 p_j.

Thus the entropy of this system is given by

h = Σ_k h_k ≈ 46.1512.

We might wonder whether Mozart was aware that some numbers are more likely to be rolled

on a pair of dice than other numbers. Perhaps he meant each of the musical patterns to be equally

likely within a measure. We could simulate such a system using a random number generator or a


single eleven-sided die. In such a case, the entropy of the system would be

h = log_2(11^14 · 2) ≈ 49.431 [19].

The entropy of this completely random system is 1.07 times the entropy of the original system.

Thus Mozart limited by a small amount the unexpectedness of each measure. Had Mozart wanted

to maximize the amount of unexpectedness at each measure, he should have instructed the musician

to pull a number 1 through 11 out of a hat at each measure rather than rolling two dice.
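
The arithmetic of this section can be checked with a short Python sketch (an illustration added here, not the author's computation), using the probability vectors of Figure 2.4.

    from math import log2

    # Probability vector for one toss of two dice, grouped into the 11 options
    # (totals 2 through 12), as in Figure 2.4.
    two_dice = [1/36, 1/18, 1/12, 1/9, 5/36, 1/6, 5/36, 1/9, 1/12, 1/18, 1/36]
    eighth_toss = [1.0]          # the eighth measure: only one possible pattern
    final_toss = [17/18, 1/18]   # the sixteenth measure

    def entropy(p):
        return -sum(q * log2(q) for q in p if q > 0)

    # Fourteen "free" tosses, one fixed toss, one nearly fixed toss.
    h = 14 * entropy(two_dice) + entropy(eighth_toss) + entropy(final_toss)
    print(round(h, 4))                      # about 46.1512

    # Uniform alternative: fourteen equally likely 11-way choices
    # plus a fair two-way choice for the final measure.
    print(round(14 * log2(11) + log2(2), 2))   # about 49.43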


CHAPTER 3

African, Brazilian, and Cuban Clave Rhythms

In the summer of 2002, Godfried Toussaint presented an analysis of the complexity of six Clave

rhythms [28]. He chose three methods of analysis proposed by earlier authors and compared the

results of his experiments to the difficulty students report experiencing in learning each of these

rhythms. This section presents his methods and results along with some proposed modifications to

the procedure.

3.1. The Clave patterns

Each of the six Clave rhythms studied consists of 16 pulses, with a strike on five of the pulses

while resting on the remaining 11 pulses. The rhythms are analyzed in the context of Western

music, where there are strong pulses and weak pulses. Rhythms with strikes primarily on strong

pulses and rests on the weak pulses are more predictable and thus considered to be less complex

and easier to learn. We identify the sixteen pulses of the rhythm with the sixteenth notes in a

4/4 musical measure. Indeed, the rhythms are often notated in this way for musicians learning the

patterns [28]. In such a measure, each set of four pulses is considered a beat. So, the first beat,

which is also felt as the strongest, occurs on the first pulse. The third beat, the second strongest,

occurs on the ninth pulse. Alternatively, in order to study the patterns using ideas from information

theory, each rhythm can be denoted by a block of sixteen zeros and ones, where a ‘1’ represents a

strike and a ‘0’ represents a rest. In this notation, the six rhythms are as in Table 3.1.

Students and performers have found that these rhythms are of varying difficulty: Shiko has

been called easy while Bossa-Nova and Gahu are considered the most difficult to master [28]. A

“good” measure of complexity should mimic the experience of so many learners: Shiko should have

a low complexity relative to the other rhythms, while the complexity of Bossa-Nova and Gahu

Table 3.1. The Clave rhythms

Shiko       1000101000101000
Son         1001001000101000
Soukous     1001001000110000
Rumba       1001000100101000
Bossa-Nova  1001001000100100
Gahu        1001001000100010

should be the highest among the six. This chapter explores three kinds of complexity studied

by Toussaint: Lempel-Ziv complexity, Jeffrey Pressing’s cognitive complexity, and Lerdahl and

Jackendoff’s metric complexity. Each section describes the results of Toussaint’s analysis and

continues with an expansion of the ideas presented. In particular, we expand Toussaint’s discussion

of the Lempel-Ziv complexity to consider alternate parsings which have developed through the years

of use of the Lempel-Ziv algorithms and explain the connection between entropy and Lempel-Ziv

compression techniques. We reconsider Pressing’s cognitive complexity in the context of the entire

block of sixteen pulses rather than the concatenation of four blocks of four pulses each. Finally, we

present a new comparison between Pressing’s complexity and the metric complexity by calculating

the metric complexities of the blocks of length four studied directly by Pressing.

3.2. Lempel-Ziv complexity

In 1977 Abraham Lempel and Jacob Ziv proposed a method for evaluating the complexity of a

finite sequence [16]. This original Lempel-Ziv algorithm has been studied in detail since, and several

variations on the algorithm have arisen in the development of information theory and computer

science [27]. Every incarnation of the Lempel-Ziv parsing algorithm can be described by

The next word is the shortest new word.

In the original Lempel-Ziv (LZ) parsing algorithm, a sequence is parsed so that each new

codeword is the shortest word that has not appeared previously anywhere in the sequence. For

example, the Shiko rhythm is parsed using this method as follows:

1000101000101000 . . . → 1, 0, 001, 0100, 0101000 . . .

When the rhythm is repeated over and over again, if we begin the new word with the next letter we

create the codeword 1000101 . . .. We never reach the end of this codeword since each time we add

a letter the string has been seen before. In fact, the cyclic nature of the rhythm implies that any

finite sequence we choose from the remainder will have been seen before. So, the infinite periodic

sequence formed by repeating this block again and again is wholly described by the five components

above. The Lempel-Ziv complexity of the rhythm, used by Toussaint, is then defined to be the

number of components thus derived from the rhythmic cycle. We will designate this measure of

complexity by LZ1.

In contrast, a more recent and common interpretation of the Lempel-Ziv algorithm will yield

infinitely many components for any infinite word, whether cyclic or not. In this algorithm, the next

new codeword is the shortest word which has not yet been designated a codeword. In other words,

the algorithm places each defined parsing component into a dictionary which is consulted each time


the next component is sought. For example, the new parsing of the Shiko rhythm is

10001010001010001000101000101000 . . . → 1, 0, 00, 10, 100, 01, 010, 001, 000, 101, 0001, 0100, . . . .

Table 3.2. The Lempel-Ziv complexity measures of the six Clave rhythms

             LZ1   LZ2
Shiko         5     8
Son           6     8
Soukous       6     8
Rumba         6     7
Bossa-Nova    5     8
Gahu          5     8

Analogously to the LZ1 complexity, we could define the LZ2 complexity of the rhythms as the

number of codewords into which the rhythm might be parsed in this way. Table 3.2 gives both

the LZ1 complexity calculated by Toussaint and the LZ2 complexity. According to both of the

measures, the six rhythms are all of approximately the same complexity.
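
A minimal Python sketch of this dictionary parsing follows (an illustration, not code from the thesis). Counting the final, possibly already-seen, phrase as a codeword is one plausible reading of the count, and with that reading the sketch reproduces the LZ2 column of Table 3.2.

    def lz2_parse(s):
        # Dictionary ("LZ2") parsing: each new phrase is the shortest prefix of
        # the remaining text that is not yet in the list of phrases.
        phrases, i = [], 0
        while i < len(s):
            j = i + 1
            while j <= len(s) and s[i:j] in phrases:
                j += 1
            phrases.append(s[i:j])   # the final phrase may repeat an earlier one
            i = j
        return phrases

    clave = {
        "Shiko":      "1000101000101000",
        "Son":        "1001001000101000",
        "Soukous":    "1001001000110000",
        "Rumba":      "1001000100101000",
        "Bossa-Nova": "1001001000100100",
        "Gahu":       "1001001000100010",
    }
    for name, pattern in clave.items():
        print(name, len(lz2_parse(pattern)))   # LZ2 column of Table 3.2: 8, 8, 8, 7, 8, 8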

The LZ2 parsing is common in the hunt for better and faster electrical communication and

computer coding. The usefulness of the technique lies in the connection between the LZ2 parsing

of a typical sequence output by an ergodic stationary process and the ergodic-theoretic entropy

of that process. (In fact, all of the results we give in this section apply also to the LZ1 parsing

technique. In the discussion following, we will use the term ”Lempel-Ziv” to refer jointly to LZ1

and LZ2.) Either Lempel-Ziv parsing acting on an ergodic process compresses the infinite process

to its entropy in the limit.

First, notice that each codeword is either a single element of the alphabet {0, 1} or is of the form

ω(j)a, where ω(j) represents a prefix seen earlier in the sequence and a represents a single element

of the alphabet {0, 1}. The Lempel-Ziv n-code Dn then encodes the location of the prefix ω(j) and

its last element a. In the LZ1 parsing, the location of ω(j) is the location where it is first observed

in the sequence so far. In the LZ2 parsing, the location of ω(j) is its position in the dictionary of

codewords being created. In the notation of [?], the LZ2 parsing of 011000101001111010. . . would

be as follows:

0, 1, 10, 00, 101, 001, 11, 1010, . . .

and the LZ2 compression is

(000, 0)(000, 1)(010, 0)(001, 0)(011, 1)(100, 1)(010, 1)(101, 0)

The first codeword, namely 0, has no prefix. So the location of the prefix (000) is encoded base

2 followed by the digit 0 which is added to the end. The prefix of the fifth codeword, 101, is 10

– the third codeword. So the location of the prefix is recorded base 2 (11) and the digit added


to the end (1) is also recorded. In this small example, the length of the “compression” is actually

greater than the length of the sequence. However, as the length of the original sequence increases,

the compression becomes increasingly efficient. In fact, the compression tends toward the entropy

as the length tends toward infinity, as we will see below.

If C_N is the number of codewords in the parsing of a sequence of length N, then we need log_2 C_N bits to describe the location of the prefix and 1 bit to describe the last bit. So for each ω(i) we need log_2 C_N + 1 bits to encode ω(i) in the Lempel-Ziv code. In other words, D_N(x_1^N) has length C_N(log_2 C_N + 1) bits. Theorem 3.2.1, in its original form due to Lempel and Ziv and later proved by Ornstein and Weiss, states that for very large N the per-symbol Lempel-Ziv code length approaches the entropy of the system. The proof and discussion which follow can all be found in [27].

Theorem 3.2.1. If µ is a stationary ergodic process with entropy h, then

(3.1) lim_{N→∞} (C_N(log_2 C_N + 1))/N = h a.e.

One argument for the truth of Theorem 3.2.1 is based on the entropy equipartition prop-

erty, a consequence of the Shannon-McMillan-Breiman Theorem of ergodic theory. Denote α_m^n = ⋁_{k=m}^{n} T^{-k}α, where ⋁ denotes the least common refinement as found in [19].

Definition 3.2.1. The gain in information when we learn to which element of a partition α

the point x belongs is defined to be

I_α(x) = − log_2 µ(α(x)) = − Σ_{A∈α} log_2 µ(A) χ_A(x).

Then I_{α_0^{n−1}}(x) = I_{α ∨ T^{-1}α ∨ ... ∨ T^{-n+1}α}(x) is the amount of information gained when we learn to which elements of the partition α the points x, T^{-1}x, T^{-2}x, etc. belong.

Theorem 3.2.2. (Shannon-McMillan-Breiman) Let T : X → X be an ergodic measure-preserving

transformation on (X,B, µ) and α a finite or countable measurable partition of X with H(α) =

−Σ_{i=1}^{∞} µ(A_i) log_2 µ(A_i) < ∞. Then

I_{α_0^{n−1}}(x) / n → h a.e. as n → ∞.

The entropy equipartition property, also called the asymptotic equipartition property (AEP), in

its simplest form [1] tells us that “almost everything is almost equally probable.” In its precise

form, it states:

Theorem 3.2.3. (Entropy equipartition property) Let T : X → X be ergodic and α = {A_1, A_2, ...} a finite or countable measurable partition of X with H(α) = −Σ_{i=1}^{∞} µ(A_i) log_2 µ(A_i) < ∞. Given ε > 0 there is an n_0 such that for n ≥ n_0 the elements of α_0^{n−1} can be divided into "good" and "bad" classes (γ and β, respectively) such that:

(i) µ(⋃_{B∈β} B) < ε, and

(ii) for each G ∈ γ, 2^{−n(h+ε)} < µ(G) < 2^{−n(h−ε)}.

Proof. Since I_{α_0^{n−1}}/n → h a.e., I_{α_0^{n−1}}/n → h in measure: for each ε > 0 there is an n_0 such that if n ≥ n_0 then

µ{x : |(I_{α_0^{n−1}}(x)/n) − h| ≥ ε} < ε.

But I_{α_0^{n−1}} must be constant on each cell of α_0^{n−1} by definition. So, we let γ be those elements of α_0^{n−1} for which |I_{α_0^{n−1}}/n − h| < ε and let β be everything else. ♦

The summary “almost everything is almost equally probable” is justified as follows. Choosing

a sequence ε_m → 0, it is clear that for large n, µ(γ) ≈ 1 and µ(β) ≈ 0. Then on average each G ∈ γ (namely, each element of α_0^{n−1} with positive measure) will have measure approximately equal to 2^{−nh}. So, there will be approximately 2^{nh} such elements G.

Now let us apply this to the Lempel-Ziv code. The average length of the C_N codewords is n ≈ N/C_N. The entropy equipartition property then implies that for very large N there are approximately 2^{(N/C_N)h} "typical" blocks of length n ≈ N/C_N. So,

C_N ≈ 2^{(N/C_N)h}.

Taking logarithms of both sides, we get

(C_N log_2 C_N)/N ≈ h

for very large N. This outline of a proof gives us an idea why the Lempel-Ziv theorem holds with

convergence in measure.

Another argument for the Lempel-Ziv convergence theorem uses the first return time formula

studied first by Wyner and Ziv and later by Ornstein and Weiss. If x = x_1 x_2 x_3 ... ∈ A_1^∞, we define the first return time of x_1^n as

R_n(x) = min{m ≥ 1 : x_{m+1}^{m+n} = x_1^n}.

Theorem 3.2.4. For any ergodic process µ with entropy h,

(3.2) lim_{n→∞} (log_2 R_n(x))/n = h a.e.


The proof of the Ornstein-Weiss first-return time theorem is quite long and involved. It can be

done in three parts. Initially, the upper and lower limits are defined:

r̄(x) = lim sup_{n→∞} (log_2 R_n(x))/n   and   r̲(x) = lim inf_{n→∞} (log_2 R_n(x))/n.

These limits are shown to be a.e.-equal to constants r̄ and r̲, respectively, with r̲ ≤ r̄, clearly. Wyner and Ziv proved that r̄ ≤ h by showing that for ε > 0, µ{x : R_n(x) > 2^{n(h+ε)}} → 0 as n → ∞, implying that for every ε and for most x, there exists an n_0 such that for n > n_0,

R_n(x) ≤ 2^{n(h+ε)}, and hence (log_2 R_n(x))/n − ε ≤ h.

The other inequality, namely r̲ ≥ h, as proved by Ornstein and Weiss, is a proof by contradiction:

if it is not true one can build a code that compresses to less than the entropy.

Since the shift transformation is ergodic, we can consider one of the C_N "typical" codewords as discussed in the last argument. Such an average codeword will have length n ≈ N/C_N. By Kac's Theorem, in an ergodic system the expected recurrence time of a subset A ⊆ X is 1/µ(A). Because each cylinder corresponding to a "typical" codeword has about the same measure, R_n ≈ 1/µ(cylinder) ≈ C_N. Thus by Theorem 3.2.4, almost surely

(3.3) (C_N log_2 C_N)/N = (log_2 C_N)/(N/C_N) ≈ (log_2 R_n)/n → h.

We demonstrate an application of these theorems by verifying that any periodic process, such

as one arising from repeating a single Clave pattern over and over, has entropy 0. If we consider

the LZ1 parsing of a periodic sequence having period p, then C_N ≤ p + 1 for every N. So,

(C_N(log_2 C_N + 1))/N ≤ (p + 1)(log_2(p + 1) + 1)/N → 0 as N → ∞.

Now consider a parsing of a cyclic sequence of period p using the second method. While we could use Theorem 3.2.1 to show again that the entropy is zero, the argument is somewhat simpler from Theorem 3.2.4. Again for a cyclic sequence of period p, for any n, R_n(x) = min{m ≥ 1 : x_{m+1}^{m+n} = x_1^n} ≤ p. So,

(3.4) (log_2 R_n(x))/n ≤ (log_2 p)/n → 0 as n → ∞.


These results agree with the intuitive definition of entropy given by Pierce: after we pass through

only one cycle, no other information is gained from any subsequent letter. Thus we confirm that

the entropy of a cyclic sequence such as one generated by one of the rhythms is zero.
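
A short numerical illustration (added here as a sketch, using the same dictionary parsing as above) shows the code-length ratio C_N(log_2 C_N + 1)/N shrinking as the Shiko pattern is repeated, consistent with entropy zero.

    from math import log2

    def lz_phrase_count(s):
        # Dictionary parsing as in Section 3.2: shortest not-yet-seen phrase each time.
        seen, count, i = set(), 0, 0
        while i < len(s):
            j = i + 1
            while j <= len(s) and s[i:j] in seen:
                j += 1
            seen.add(s[i:j])
            count += 1
            i = j
        return count

    shiko = "1000101000101000"
    for repeats in (1, 10, 100, 1000):
        s = shiko * repeats
        c = lz_phrase_count(s)
        print(len(s), c, round(c * (log2(c) + 1) / len(s), 4))
    # The ratio decreases toward 0 as the sequence length grows.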

3.3. Cognitive complexity

Jeffrey Pressing proposed a method of computing the complexity of a rhythm which measures

the amount of “unexpectedness” in it, the amount by which the strikes occur off the beat [25].

Pressing refers to his measure as a “syncopation-generated cognitive cost” of learning the rhythm.

Pressing defines five types of syncopation, each adding a cognitive cost to the overall complexity of

the rhythm, as shown in Table 3.3.

Table 3.3. Types of syncopation in Pressing's cognitive complexity measure

Type         Description                                          Cognitive Cost
filled       strike at each pulse                                 1
run          strike at first pulse followed by a run of others    2
upbeat       strike at first pulse plus strike at last pulse      3
syncopated   starting and ending on off-beats                     5
null         none of the above                                    0

Pressing mentions a fifth type of syncopation, called subbeat syncopation, with cognitive cost

equal to 4. However, he does not define this type since it is not applicable to the work he presents.

We suspect, based on his choice of name, that this syncopation applies only to rhythms whose

number of pulses is not a power of 2 – for example waltzes, which have one strong beat and two

equal weak beats. If this suspicion is true, this syncopation type will also not apply to our rhythmic

patterns with 16 pulses.

According to Pressing’s algorithm, the rhythm is divided into subparts of equal length, which

are then themselves divided, until we reach subdivisions of the smallest prime length. Then the

rhythm as a whole and each subdivision are examined for the properties in Table 3.3. The cognitive

complexity is calculated by summing each of the cognitive costs found by comparison with Table 3.3,

with each cognitive cost weighted by the proportion of the total pattern that its subdivision occupies. For

example, the cognitive costs of the ten four-pulse patterns, as calculated in [25] and [28], are as in

Table 3.4. Notice that there is some confusion as to the character of the 01 pattern. In calculating

the cognitive complexity of 1001 and 0110, Pressing interpreted 01 as an upbeat pattern, whereas

in 0101, 0100, and 0001 it was considered a fully syncopated rhythm with cognitive cost equal to 5.

According to the strict definition in Table 3.3, each of these should be regarded as fully syncopated

and having cognitive cost equal to 5. The corrected cognitive complexities are listed in Table 3.7

at the end of this chapter.


Table 3.4. Calculation of the cognitive complexity of four-pulse rhythms

Pattern   Sync. Type   Subbeats   Sync. (Weight=.5)     Cog. Complexity
1100      run(2)       11, 00     filled(1), null(0)    2 + .5 + 0 = 2.5
1010      filled(1)    10, 10     null(0), null(0)      1 + 0 + 0 = 1
1001      upbeat(3)    10, 01     null(0), upbeat(3)    3 + 0 + 1.5 = 4.5
0110      sync(5)      01, 10     upbeat(3), null(0)    5 + 1.5 + 0 = 6.5
0101      sync(5)      01, 01     sync(5), sync(5)      5 + 2.5 + 2.5 = 10
0011      sync(5)      00, 11     null(0), filled(1)    5 + 0 + .5 = 5.5
1000      null(0)      10, 00     null(0), null(0)      0 + 0 + 0 = 0
0100      sync(5)      01, 00     sync(5), null(0)      5 + 2.5 + 0 = 7.5
0010      sync(5)      00, 10     null(0), null(0)      5 + 0 + 0 = 5
0001      sync(5)      00, 01     null(0), sync(5)      5 + 0 + 2.5 = 7.5

In [28], Toussaint calculated the complexity of each of the Clave rhythms by dividing each

of the rhythms into four beats and then adding together the cognitive complexities of the beats

according to Table 3.4. Here we adapt Pressing’s measure to the full sixteen-pulse rhythm. For

example, the adjusted cognitive complexity of the Clave Son rhythm is described as follows:

(1) Begin with the Clave Son pattern: 1001001000101000.

(2) The overall pattern fits none of the syncopation types, so it is type null(0).

(3) So 0 is added to the complexity.

(4) Consider the half notes: 10010010, 00101000.

(5) The first is of type null(0), and the second of type sync(5), each with weight=.5.

(6) So 2.5 is added to the complexity.

(7) Consider the quarter notes: 1001, 0010, 0010, 1000.

(8) They are of types upbeat(3), sync(5), sync(5), null(0), each with weight=.25.

(9) This adds .75 + 1.25 + 1.25 + 0 = 3.25 to the complexity.

(10) Finally, consider the eighth notes: 10, 01, 00, 10, 00, 10, 10, 00.

(11) These are null(0), sync(5), null(0), null(0), null(0), null(0), null(0), null(0), with weight=.125.

(12) This adds .625 to the complexity.

(13) So the complexity is 0 + 2.5 + 3.25 + .625 = 6.375
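
The subdivision procedure can be sketched in Python as below. The syncopation classifier is an assumption on our part: Table 3.3 leaves several cases open, and the rules coded here (a window starting off the downbeat with at least one strike is fully syncopated; a window whose strikes sit exactly on its two beat positions counts as filled) are just one reading, chosen because it reproduces the Clave Son walk-through above. It is an illustration, not the author's code, and other readings of the rules give the other entries of Table 3.5.

    def syncopation_cost(w):
        # One assumed reading of the categories in Table 3.3 (see the note above).
        strikes = [i for i, c in enumerate(w) if c == '1']
        if not strikes:
            return 0                                   # null: no strikes at all
        if w[0] == '0':
            return 5                                   # fully syncopated
        if strikes == list(range(len(w))) or strikes == [0, len(w) // 2]:
            return 1                                   # filled
        if w[1] == '1':
            return 2                                   # run
        if w[-1] == '1':
            return 3                                   # upbeat
        return 0                                       # null

    def adjusted_cognitive_complexity(pattern, weight=1.0):
        # Recursive halving: each level's cost is weighted by the fraction of
        # the whole pattern that the window occupies.
        total = weight * syncopation_cost(pattern)
        if len(pattern) > 2:
            half = len(pattern) // 2
            total += adjusted_cognitive_complexity(pattern[:half], weight / 2)
            total += adjusted_cognitive_complexity(pattern[half:], weight / 2)
        return total

    print(adjusted_cognitive_complexity("1001001000101000"))   # Clave Son: 6.375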

The measures of complexity derived in this way are given in Table 3.5. According to this

procedure, the Gahu rhythm has less cognitive cost than the Rumba rhythm, contrary to Toussaint’s

calculations. However, in both methods, Shiko has the least cognitive cost and Bossa-Nova the

greatest.

3.4. Metricity and metric complexity

The third measure of complexity Toussaint examines is inspired by Lerdahl and Jackendoff’s

1983 construct of the metrical structure of a rhythm [28, 17]. Toussaint describes the process as

follows:


Table 3.5. Cognitive complexities of the six Clave rhythms

Rhythm       Toussaint   Gregory
Shiko         6          4.5
Son           14.5       6.375
Soukous       15         6.625
Rumba         17         7.75
Bossa-Nova    22         8
Gahu          19.5       7.375

Table 3.6. Complexity measures of the six Clave rhythms

             LZ1   LZ2   Cognitive   Adjusted Cognitive   Metric
Shiko         5     8     6           4.5                  2
Son           6     8     14.5        6.375                4
Soukous       6     8     15          6.625                6
Rumba         6     7     17          7.75                 5
Bossa-Nova    5     8     22          8                    6
Gahu          5     8     19.5        7.375                5

At the first level of metrical accent a beat is added at every time-unit starting at unit 0. At the second level a beat is added at every second time unit starting at unit 0. At the third level a beat is added at every fourth time unit starting at unit 0. We continue in this fashion doubling the time interval between time units at every level. In this way time [for a 16-time unit interval] unit 8 receives 4 beats and finally time unit 0 receives 5 beats. Thus time units 2, 6, 10, and 14 are weak beats, time units 4 and 12 are medium strength beats, unit 8 is stronger and unit 0 is the strongest beat.

The total metric strength of a rhythm, called its metricity or its metrical simplicity, is the sum

of the metrical accents of the pulses where a strike occurs in the rhythm. The metric complexity of

a rhythm is then defined to be the largest metricity attainable by any rhythm with the same number of strikes, minus the metricity of the rhythm itself.

Toussaint proceeds to measure the metric complexity of the Clave rhythms with this method

and compares the results to those of the cognitive complexity described above. The results of the

various measures of complexity are summarized in Table 3.6.
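
A Python sketch of the metricity calculation follows (an illustration, not Toussaint's code). With the reading that metric complexity is the best attainable metricity for five strikes minus the metricity of the rhythm, it reproduces the metric column of Table 3.6.

    # Metrical accents for a 16-unit cycle, built level by level as in the
    # quotation above: every unit gets one beat, then every 2nd, 4th, 8th,
    # and 16th unit (i.e. unit 0) gets an additional beat.
    accents = [0] * 16
    for step in (1, 2, 4, 8, 16):
        for i in range(0, 16, step):
            accents[i] += 1
    # accents == [5,1,2,1,3,1,2,1,4,1,2,1,3,1,2,1]

    def metricity(pattern):
        return sum(a for a, c in zip(accents, pattern) if c == '1')

    def metric_complexity(pattern):
        # Highest metricity attainable with the same number of strikes,
        # minus the metricity of the pattern itself.
        k = pattern.count('1')
        best = sum(sorted(accents, reverse=True)[:k])
        return best - metricity(pattern)

    clave = {
        "Shiko":      "1000101000101000",
        "Son":        "1001001000101000",
        "Soukous":    "1001001000110000",
        "Rumba":      "1001000100101000",
        "Bossa-Nova": "1001001000100100",
        "Gahu":       "1001001000100010",
    }
    for name, p in clave.items():
        print(name, metric_complexity(p))   # metric column of Table 3.6: 2, 4, 6, 5, 6, 5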

The metric complexity we calculated for each of the four-pulse patterns from Section 3.3

is found in Table 3.7. In comparing the measures on the four-pulse patterns, the table shows that

the metric complexity and Pressing’s cognitive complexity rank the patterns in the same order from

least to most complex, though Pressing’s measure provides finer gradations.


Table 3.7. Cognitive and metric complexities of the four-pulse rhythms

Rhythm   Cognitive Complexity   Metric Complexity
1010     1.0                    2
1100     2.5                    3
1001     5.5                    3
0011     5.5                    4
0110     7.5                    4
0101     10.0                   5


CHAPTER 4

The traditional Nzakara harpists

There are few musicians left who can play the traditional harp patterns of the Nzakara people

of Central Africa. Building upon the musical research of Eric de Dampierre and others, Marc

Chemillier has recently provided mathematical analysis of the few harp patterns still played. The

harp tunes are made up of two parallel melodic lines, the lower line different from the upper by

only a pitch translation and a time delay. Chemillier, however, found that not only do the two

melodic lines share this canon-like relationship, but within the harpists’ pieces is a cyclical ladder

property as well [10, 9]. It is this property which is studied here.

The traditional harps have 5 strings. The strings are played in pairs, which Chemillier terms

“bichords” and which can be viewed as the letters in the alphabet of the Nzakara harpists. There

are five allowable bichords: those consisting of neighboring strings are disallowed, as is the pair

of the lowest and highest strings. Each harp tune is a succession of bichords repeated over and

over again to accompany chant and dance. The smallest repeated set of bichords is the “core”

of the harp tune. Within each core, the same letter never appears twice in succession, and the

core does not begin and end with the same letter (to avoid repeating the letter upon repeat, or

concatenation). Further, the core will not “factor” into repeated smaller words since it is defined

to be the smallest repeated word. Chemillier encoded the five allowable bichords into Z5, ordering

them from lowest-pitched to highest by the lower note of the bichord and using the upper note to

break ties as shown in Figure 4.1. He found that each allowable core is made up of a generator

followed by four successive translations of the generator modulo 5. Since for any translation in Z5

the pattern returns to the beginning of the core at the fifth translation, each harp tune is a never-

ending cyclical “ladder” created by perpetually translating a word in Z5. An example is shown

in Figure 4.2. Harpists often improvise their own variations on the base theme defined by these

rules, and it is conceivable that there are rules governing the possibilities that might be susceptible

to mathematical or grammatical analysis. However, these possible variations are not recorded or

documented at this time. Section 4.1 measures the restrictions which the documented structure

imposes upon the vocabulary by examining the number of possible words which are eliminated

from the theoretical 5^l cores of length l. Section 4.2 proposes a measure of complexity for these

Figure 4.1. The five allowable bichords of the Nzakara harpists, labelled 0, 1, 2, 3, 4

0 2 3 0 1 0 1 3 4 1 2 1 2 4 0 2 3 2 3 0 1 3 4 3 4 1 2 4 0 4

Figure 4.2. Example of a limanza canon

harp patterns, both hypothetical and allowed, inspired by Pressing’s measure described in Section

3.3.
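To make the ladder property concrete, here is a minimal sketch (in Python, written for this discussion; the function name is ours, not Chemillier's) that builds a core by following a generator with its successive translations modulo 5. With the generator 0 2 3 0 1 0 and translation step 1, read off from Figure 4.2, it reproduces the limanza core shown there.

```python
def ladder_core(generator, t, d=5):
    """Concatenate the generator with its d - 1 successive translations by t (mod d)."""
    core = []
    for k in range(d):
        core.extend((x + k * t) % d for x in generator)
    return core

core = ladder_core([0, 2, 3, 0, 1, 0], t=1)
print(core)
# [0, 2, 3, 0, 1, 0, 1, 3, 4, 1, 2, 1, 2, 4, 0, 2, 3, 2,
#  3, 0, 1, 3, 4, 3, 4, 1, 2, 4, 0, 4]   -- the sequence of Figure 4.2

# No bichord is repeated in succession, even across the repeat of the core.
assert all(core[i] != core[(i + 1) % len(core)] for i in range(len(core)))
```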

4.1. Counting the harp patterns

Just as not all combinations of letters make actual English words, not all d^L possible words of length L fit the patterns of the harpists' lexicon. T_L is defined to be the number of valid cores

of length L which fit the pattern. In counting these words, we address the constraints one at a

time. In Section 4.1.1, words with the same bichord appearing twice in a row are eliminated. In

Section 4.1.2, duplicates in the counts are eliminated by considering only the smallest cycle in

each infinite word. Section 4.1.3 keeps only the words with the required cyclical “ladder” property.

Finally, Section 4.1.4 lists some results of the formulas in the first three sections for a variety of

alphabets (i.e., a variety of theoretical harp sizes) and a variety of cycle lengths. The notation and

terminology for this discussion are summarized in Table 4.1.

Table 4.1. Notation and Terminology

Z_d: Z_d = {0, 1, ..., d − 1}, d ≥ 2, is our alphabet of symbols.
W: A word of length L ≥ 1.
N_L: The set of words of length L containing no ...aa... and not equal to a...a for any a ∈ Z_d. N_L also denotes the number of elements of this set.
M_L: The set of words of length L containing no ...aa... for any a but equal to b...b for some b in Z_d (that is, the set of words beginning and ending with the same letter and having no pair of repeated letters in between). M_L also denotes the number of elements of this set.
factorable: W can be factored if W = BB...B for some subword B of length less than L.
translation: B + t is a translation of B by t modulo d: every letter in B is translated by t modulo d in the ordered alphabet Z_d.
translation-factorable: W can be translation-factored if W = B(B + t)(B + 2t)...(B + kt) for some subword B of length less than L, and for some t and k.
F_L: The subset of words in N_L which cannot be factored. F_L also denotes the number of elements of this set.
T_L: The subset of F_L consisting of words which are "evenly" translation-factorable, that is, W = B(B + t)(B + 2t)...(B + kt) with B + (k + 1)t = B. T_L also denotes the number of elements of this set.

Note that N_1 = M_1 = F_1 = T_1 = 0 no matter what d we choose. Further, N_2 = F_2 = d(d − 1), and T_2 = 0 if d is odd, T_2 = d if d is even.

It will be shown that N_L and F_L both grow exponentially as L increases. T_L also grows exponentially, but at a substantially slower rate.

4.1.1. No duplicated letters.

Proposition 4.1.1. For d ≥ 3 and L ≥ 3, N_L = (d − 1)^L + (−1)^L (d − 1).

Proof. First we note that M_L = N_{L−1}, since any element counted in M_L can be uniquely created by taking an element counted in N_{L−1} and appending its first letter to the end. Then

N_L = (d − 2) N_{L−1} + (d − 1) M_{L−1} = (d − 2) N_{L−1} + (d − 1) N_{L−2},

since any element counted in N_L can be created either by appending to an element of N_{L−1} any letter of the alphabet except the first or the last letter of that element, or by appending to an element of M_{L−1} any letter of the alphabet except its first letter (which in this case equals its last letter). Solving this recursion, we find that N_L = (d − 1)^L + (−1)^L (d − 1). ♦
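A brute-force check of Proposition 4.1.1 is straightforward. The sketch below (Python, ours; the helper names are chosen for this discussion) enumerates the words counted by N_L directly and compares the count with the closed form for a few small values of d and L.

```python
from itertools import product

def N_brute(d, L):
    """Count words of length L over Z_d with no two equal adjacent letters
    and with first letter different from last letter."""
    return sum(
        1
        for w in product(range(d), repeat=L)
        if all(w[i] != w[i + 1] for i in range(L - 1)) and w[0] != w[-1]
    )

def N_closed(d, L):
    return (d - 1) ** L + (-1) ** L * (d - 1)

for d in (3, 5):
    for L in (3, 4, 5, 6):
        assert N_brute(d, L) == N_closed(d, L)
print(N_closed(3, 6), N_closed(5, 5))   # 66 and 1020, as in Tables 4.2 and 4.3
```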

4.1.2. Non-factorable words.

Proposition 4.1.2. For d ≥ 3 and L ≥ 3, F_L = N_L − ∑_{m|L; m ≠ 1, L} F_m.

Proof. Assume W = a_1 a_2 ... a_L is factorable; that is, there exists m | L such that

W = a_1 a_2 ... a_L = (a_1 a_2 ... a_m)(a_1 a_2 ... a_m) ... (a_1 a_2 ... a_m).

Clearly, then, W is removed from the count, since there must be a smallest such m for which a_1 a_2 ... a_m is counted in F_m. Thus it remains to show that W is removed only once and that no other words are removed in this formula.

Lemma 4.1.3. If W is factored into words of length m_1 and of length m_2, then W is also factored into a word of length gcd(m_1, m_2).


Proof. Step 1: Assume gcd(m_1, m_2) = 1. (Without loss of generality, we can assume m_1 > m_2.) Then a_1 = a_{m_1+1} = a_{2m_1+1} = ... and a_1 = a_{m_2+1} = a_{2m_2+1} = .... Further, whenever km_2 ≡ i_k (mod m_1), we have a_{km_2} = a_{i_k}. We claim that there exists k such that a_{km_2} = a_1. Proof of this claim is a simple application of the basic number-theoretic fact that if gcd(a, b) = 1, then there exists x such that ax ≡ 1 (mod b). Here, since gcd(m_1, m_2) = 1, there exists k such that km_2 ≡ 1 (mod m_1). So a_{km_2} = a_1.

Now, notice that concatenating W onto itself indefinitely does not change its characteristics; that is, WWW...W is still factorable by m_1 and m_2, still contains no ...aa..., and still begins and ends with different letters. Thus we can consider such an "extension" of W, so that the large indices above make sense, and claim that km_2 ≡ 1 (mod m_1). Further, multiplying both sides of the equivalence, we get 2km_2 ≡ 2 (mod m_1), 3km_2 ≡ 3 (mod m_1), and so on up to m_1 km_2 ≡ m_1 (mod m_1). Thus a_{2km_2} = a_2, a_{3km_2} = a_3, and so on up to a_{m_1 km_2} = a_{m_1}. But for all x = 1, 2, 3, ..., a_{xm_2} = a_{m_2}, so we get

a_{m_2} = a_1 = a_2 = a_3 = ... = a_{m_1}.

So W = a_1 a_1 a_1 ... a_1, and W is factorable into a subword of length 1 = gcd(m_1, m_2).

Step 2: Assume gcd(m_1, m_2) = g ≠ 1. Then we can code W into substrings of length g:

W = W_1 W_2 ... W_k,   where   W_i = a_{(i−1)g+1} a_{(i−1)g+2} ... a_{ig}.

Then both W_1 W_2 ... W_{m_1/g} and W_1 W_2 ... W_{m_2/g} are factors of W. So W is factorable into (recoded) substrings of length m_1/g and m_2/g respectively. Moreover, gcd(m_1/g, m_2/g) = 1. So, referring to Step 1, we see that W_1 = W_2 = ... = W_k, so W is factorable into a word of length g = gcd(m_1, m_2). This completes the proof.

We use this lemma to justify that W is subtracted out exactly once. Let l_1, l_2, ..., l_p be the lengths of all possible factors of W. Then l_min = gcd(l_1, l_2, ..., l_p) is the length of the smallest factor of W. The factor of W of length l_min cannot itself be factored, so it is counted in F_{l_min}; thus W is removed from the count of F_L when F_{l_min} is subtracted off. But l_min divides each of l_1, l_2, ..., l_p, so the factor of length l_min generates factorable words of each of the p − 1 lengths larger than l_min. Thus the only factor of W counted in any F_{l_i} is the one with l_i = l_min, so W is subtracted off only once.


Further, ∑_{m|L; m ≠ 1, L} F_m is the total number of non-further-factorable factor-candidates for a word W of length L. Thus each of the words in this count generates a unique factorable word of length L, so only factorable words are removed from the count N_L. ♦

Corollary 4.1.4. For every prime p and every k ≥ 1, F_{p^k} = N_{p^k} − N_{p^{k−1}}.

Proof.

F_{p^k} = N_{p^k} − (F_{p^{k−1}} + F_{p^{k−2}} + ... + F_p)                                    (4.1)
        = N_{p^k} − (N_{p^{k−1}} − (F_{p^{k−2}} + ... + F_p) + F_{p^{k−2}} + ... + F_p)        (4.2)
        = N_{p^k} − N_{p^{k−1}}.                                                               (4.3)
♦

Proposition 4.1.5. For d ≥ 3 and L ≥ 3,

F_L = ∑_{m|L} ((d − 1)^m + (−1)^m (d − 1)) μ(L/m),

where μ(n) = (−1)^i if n = p_1 p_2 ... p_i with the p_j prime and distinct, and μ(n) = 0 otherwise. Further, if L > 2, then F_L = ∑_{m|L} (d − 1)^m μ(L/m).

Proof. Consider N_L = ∑_{m|L} F_m. By the Möbius inversion formula,

F_L = ∑_{m|L} N_m μ(L/m) = ∑_{m|L} ((d − 1)^m + (−1)^m (d − 1)) μ(L/m).

Now we claim that for L > 2, (d − 1) ∑_{m|L} (−1)^m μ(L/m) = 0. Define f(L) = ∑_{m|L} (−1)^m μ(L/m). By Möbius inversion again, we see that (−1)^L = ∑_{m|L} f(m). Observe:

L = 1: −1 = ∑_{m|1} f(m) = f(1), so f(1) = −1.

L = 2: 1 = ∑_{m|2} f(m) = f(1) + f(2) = −1 + f(2), so f(2) = 2.

Then consider the following inductive argument.

Base case, L = 3: −1 = ∑_{m|3} f(m) = f(1) + f(3) = −1 + f(3), and it follows that f(3) = 0.

Induction step: Let L > 3 and assume that f(k) = 0 for all k ∈ {3, 4, ..., L − 1}. We have two cases.

L even: 1 = ∑_{m|L} f(m) = f(1) + f(2) + ∑_{m|L, 3 ≤ m < L} f(m) + f(L) = −1 + 2 + 0 + f(L) = 1 + f(L), so f(L) = 0.

L odd: −1 = ∑_{m|L} f(m) = f(1) + ∑_{m|L, 3 ≤ m < L} f(m) + f(L) = −1 + 0 + f(L), so f(L) = 0. ♦
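Both formulas for F_L are easy to program and can be checked against one another. Below is a minimal sketch (Python, ours; the helper names are chosen for this discussion) of the recursion of Proposition 4.1.2 and the Möbius-inversion formula of Proposition 4.1.5; the two agree, and for d = 5 they reproduce F_5 = 1,020 and F_10 = 1,047,540 from Table 4.3.

```python
def divisors(n):
    return [m for m in range(1, n + 1) if n % m == 0]

def mobius(n):
    """(-1)^i if n is a product of i distinct primes, 0 if n has a square factor."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:          # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                       # one prime factor left over
        result = -result
    return result

def N(d, L):
    return (d - 1) ** L + (-1) ** L * (d - 1)

def F_recursion(d, L):
    """Proposition 4.1.2: F_L = N_L - sum of F_m over divisors m of L, m != 1, L."""
    if L == 1:
        return 0
    return N(d, L) - sum(F_recursion(d, m) for m in divisors(L) if m not in (1, L))

def F_mobius(d, L):
    """Proposition 4.1.5: F_L = sum over m | L of N_m * mu(L/m)."""
    return sum(N(d, m) * mobius(L // m) for m in divisors(L))

assert all(F_recursion(5, L) == F_mobius(5, L) for L in range(2, 21))
print(F_mobius(5, 5), F_mobius(5, 10))   # 1020 and 1047540, as in Table 4.3
```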

4.1.3. Translation factorability. The formula for T_L is of the same genre as that for F_L, with a notable class of exceptions. Clearly, since we require that any element of T_L be evenly translation-factorable, T_L = 0 if gcd(d, L) = 1. For general d with gcd(d, L) ≠ 1, a formula for T_L is elusive. However, for prime d we derive the following formula:


Theorem 4.1.6. For prime d and L a multiple of d,

T_L = (d − 2) N_{L/d} + (d − 1) M_{L/d} − ∑ T_{k·d},

where the sum is over all k with 1 ≤ k < L/d, k | (L/d), and d·k neither a divisor of, nor equal to, nor a multiple of L/d.

Proof. The proof is in the derivation. Assume W is an evenly translation-factorable word of length L whose letters are from Z_d and which is not factorable. So we can write

W = a_1 a_2 ... a_n (a_1 + t)(a_2 + t)...(a_n + t) ... (a_1 + mt)(a_2 + mt)...(a_n + mt)

for some n and m. Then we know a_1 = a_1 + (m + 1)t, since W is evenly translation-factorable. Further, since W is not factorable, a_1 ≠ a_1 + pt for each 1 ≤ p ≤ m. So, since d is prime, a_1, a_1 + t, ..., a_1 + mt must be in one-to-one correspondence with the elements of Z_d. So n = L/d and m = d − 1. Hence any element of T_L can be generated by an element of either N_{L/d} or M_{L/d} together with some translation t. When we consider a generating element that comes from N_{L/d}, t can equal neither 0 nor a_{L/d} − a_1, since the former would lead to a factorable W and the latter would lead to the appearance of a repeated letter in W. So there are (d − 2) N_{L/d} possible candidates for T_L generated by elements of N_{L/d}. Moreover, it is easy to see that no two of these candidates are the same, since they will differ either in the first L/d letters or in the (L/d + 1)-st letter. When we consider a generating element of M_{L/d}, t = 0 would lead to both factorability and a repeated letter in the word W; so in this case t ≠ 0 is the only restriction on t. Thus there are (d − 1) M_{L/d} possible candidates for T_L generated by elements of M_{L/d}. Again, it is easy to see that no two of these candidates are the same, for the same reason as above. Further, each candidate generated by M_{L/d} is different from each candidate generated by N_{L/d}: consider the first and (L/d)-th letters of two such candidates. If the first letters are the same, then the (L/d)-th letters must be different, by the definitions of N_{L/d} and M_{L/d}.

Each of these candidates satisfies the translation-factorability characteristic of elements of T_L by construction. Each candidate also satisfies the criterion that no letter is found twice in succession. Further, candidates constructed in this manner do not factor into subwords of length L/d (by construction), nor of length equal to any multiple of L/d (since this would imply a_1 = a_1 + pt for some p < d), nor of length which divides L/d (since this would imply factorability into words of length L/d). However, some of these candidates might factor into subwords of some other length which divides L.

In order to count the number of candidates which factor into subwords of some other length, we introduce the following lemma.

Lemma 4.1.7. If W of length L is translation-factorable by subwords of length m_1 and of length m_2, with m_1 < m_2 < L, then W is translation-factorable by a subword of length m_3 = m_2 − m_1.


Proof. Let s denote the translation associated with the subword of length m_1 and let t be that associated with m_2.

Then, by definition, if we start at any a_k for k = 1, 2, ..., L − m_1, shifting m_1 units to the right is equivalent to adding s to a_k: a_{k+m_1} = a_k + s. Similarly, for k = 1, 2, ..., L − m_2, shifting m_2 units to the right is equivalent to adding t to a_k. Further, observe that for k = m_1 + 1, ..., L, shifting m_1 units to the left is equivalent to subtracting s from a_k; and for k = m_2 + 1, ..., L, shifting m_2 units to the left is equivalent to subtracting t from a_k.

Let k = 1, 2, ..., L − m_2. (Note that m_2 < L; moreover m_2 ≤ L/2, since otherwise W would not be evenly translation-factorable. So L/2 ≤ L − m_2 < L − m_1, the latter inequality because m_1 < m_2.) Consider a_{k+(m_2−m_1)}. This is a_k shifted right m_2 units and then back left m_1 units, so we can write a_{k+(m_2−m_1)} = a_k + t − s, so long as k + m_2 > m_1, which is clearly true since m_2 > m_1.

So for all k = 1, 2, ..., L − m_2 we have a_{k+(m_2−m_1)} = a_k + (t − s); that is, the translation by t − s over a step of m_2 − m_1 holds for positions up through (L − m_2) + (m_2 − m_1) = L − m_1. We need now only examine the remaining m_1 letters at the end of the word.

Consider, for k = L − m_1 + 1, ..., L, the letter a_{k−(m_2−m_1)} = a_{k−m_2+m_1}. This is the equivalent of shifting a_k m_2 units to the left and then m_1 units to the right. So a_{k−(m_2−m_1)} = a_{k−m_2+m_1} = a_k − t + s. So we again have a translation of a word of length m_2 − m_1 with translation by t − s at the end of the word W. So W is translation-factorable by a word of length m_2 − m_1 with translation by t − s.

Assume that a candidate W constructed as above is a concatenation of copies of a word W_0 of length m for some m: W = (W_0)^{L/m}. Then we know m is not equal to, does not divide, and is not a multiple of L/d. Consider the factorability of W into words of length m as a translation-factorability of W into words of length m with translation by zero. Then by the lemma, W must be translation-factorable into words of length |m − L/d| < L/d. We can apply the lemma again to the smallest two of |m − L/d|, m, and L/d. If we continue applying the lemma, we find eventually that W must be translation-factorable into words of length gcd(m, L/d). (The repeated application of the lemma is a form of the Euclidean algorithm.) Let us write gcd(m, L/d) = m_min and denote the initial m_min letters of W by W_min. Then there is some t_min such that W_0 = W_min(W_min + t_min)...(W_min + (d − 1)t_min). Then each factorable word W counted in the paragraphs above is associated with one and only one of the T_{d·m_min}. Thus we arrive at the summation term, which subtracts out those candidates found in the complement of F_L. This completes the derivation and the proof. ♦


So we can concatenate infinitely many copies of any element of T_L to create a harp pattern fitting Chemillier's bichord-translation requirements. Further, we can consider the shift action on a word of infinite length to be the mathematical equivalent of playing a succession of bichords on a Nzakara harp. Under such considerations, observe that any periodic harp pattern will ultimately sound the same no matter where we begin along its orbit under the shift action. So, ultimately, the number of Nzakara harp pieces of a given length with a given alphabet is equal to T_L / L, since any element of T_L will have a period of length L.

We can see that N_L, F_L, and T_L grow exponentially as L increases (assuming we restrict ourselves to L where T_L ≠ 0, which requires that d divide L). In the formula N_L = (d − 1)^L + (−1)^L (d − 1) this is obvious. To see it for F_L, observe that the dominating term in the calculation will be the leading N_L in the formula, and the exponent on any term in the summation will be no higher than one-half of L: F_L behaves like N_L. Finally, T_L grows much more slowly than the others; its dominating term will have exponent L/d. By construction, every other term contributing to the formula will have exponent less than L/d. In particular, the term contributed by M_{L/d} will have exponent (L/d) − 1, and terms in the summation will have exponent at most (L/d)/2. So T_L behaves like N_{L/d}. The following tables give values for these counts for the first eight L such that T_L ≠ 0, for d = 3, 5, 7.

4.1.4. Examples with different alphabet sizes and different word lengths. Tables 4.2,

4.3, and 4.4 show the numbers of elements of each of the sets above for various alphabets.

Table 4.2. Z3

          All              N_L         F_L         T_L    Orbits
L = 3     27               6           6           6      2
L = 6     729              66          54          0      0
L = 9     19,683           510         498         18     2
L = 12    531,441          4,098       4,020       24     2
L = 15    14,348,907       32,766      32,730      60     4
L = 18    387,420,489      262,146     261,522     108    6
L = 21    10,460,353,203   2,097,150   2,097,018   252    12
L = 24    282,429,536,481  16,777,218  16,772,880  480    20
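The smallest rows of Table 4.2 can be verified by exhaustive enumeration. The sketch below (Python, ours; the helper names are chosen for this discussion) applies the definitions of Table 4.1 directly to every word over Z_3 of lengths 3 and 6 and reproduces the corresponding rows of the table.

```python
from itertools import product

def proper_divisors(L):
    return [m for m in range(1, L) if L % m == 0]

def in_N(w):
    """No letter repeated in succession, first letter different from last."""
    L = len(w)
    return all(w[i] != w[(i + 1) % L] for i in range(L))

def factorable(w):
    """w is a concatenation of copies of a strictly shorter subword."""
    return any(w == w[:m] * (len(w) // m) for m in proper_divisors(len(w)))

def evenly_translation_factorable(w, d):
    """w = B (B+t) (B+2t) ... for some block B shorter than w, with the next
    translation returning to B, i.e. (len(w)/len(B)) * t = 0 (mod d)."""
    L = len(w)
    for m in proper_divisors(L):
        t = (w[m] - w[0]) % d
        fits = all(w[i] == (w[i % m] + (i // m) * t) % d for i in range(L))
        if fits and (L // m) * t % d == 0:
            return True
    return False

d = 3
for L in (3, 6):
    words = list(product(range(d), repeat=L))
    N = [w for w in words if in_N(w)]
    F = [w for w in N if not factorable(w)]
    T = [w for w in F if evenly_translation_factorable(w, d)]
    print(L, len(words), len(N), len(F), len(T), len(T) // L)
# 3 27 6 6 6 2
# 6 729 66 54 0 0
```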


Table 4.3. Z5 (The harpists’ alphabet)

          All            N_L           F_L           T_L      Orbits
L = 5     3,125          1,020         1,020         20       4
L = 10    9,765,625      1,048,580     1,047,540     40       4
L = 15    3.05 × 10^10   1.07 × 10^9   1.07 × 10^9   240      16
L = 20    9.54 × 10^13   1.10 × 10^12  1.10 × 10^12  960      48
L = 25    2.98 × 10^17   1.13 × 10^15  1.13 × 10^15  4,100    164
L = 30    9.31 × 10^20   1.15 × 10^18  1.15 × 10^18  16,080   536
L = 35    2.91 × 10^24   1.18 × 10^21  1.18 × 10^21  65,520   1,872
L = 40    9.09 × 10^27   1.21 × 10^24  1.21 × 10^24  261,120  6,528

Table 4.4. Z7

          All            N_L           F_L           T_L         Orbits
L = 7     823,543        279,930       279,930       42          6
L = 14    6.78 × 10^11   7.84 × 10^10  7.84 × 10^10  168         12
L = 21    5.59 × 10^17   2.19 × 10^16  2.19 × 10^16  1,260       60
L = 28    4.60 × 10^23   6.14 × 10^21  6.14 × 10^21  7,560       270
L = 35    3.79 × 10^29   1.72 × 10^27  1.72 × 10^27  46,620      1,332
L = 42    3.12 × 10^35   4.81 × 10^32  4.81 × 10^32  278,460     6,630
L = 49    2.57 × 10^41   1.35 × 10^38  1.35 × 10^38  1,679,622   34,278
L = 56    2.12 × 10^47   3.77 × 10^43  3.77 × 10^43  10,069,920  179,820


Figure 4.3. The five allowable bichords of the Nzakara harpists, labelled 0, 1, 2, 3, 4

4.2. The complexity of the harp patterns

Inspired by the work of Jeffrey Pressing (Section 3.3), we formulate a definition of complexity

for these tunes which takes into account how difficult each might be to play. Consider the five

allowed bichords and their labelling in Z5, shown again in Figure 4.3. Ignoring the element of

fatigue, the simplest string of notes to play is a single repeated note or bichord. The complexity of

such a string (e.g., ...000000... or ...222222...) is defined to be zero.

Notice that to move from bichord 0 to bichord 1, the player must move one finger one string. So bichords 0 and 1 are considered to have distance equal to 1. To move from bichord 3 to bichord 0, the player must move one finger down one string and the other down two strings. So we say bichords 0 and 3 have a distance of 3. Thus we define the distance between bichords to be equal to the total number of strings which the fingers must move to change from one bichord to the

to the total number of strings which the fingers must move to change from one bichord to the

other. Because of the way the labelling has been defined, we can see that the distance between two

bichords is the absolute value of the difference between the numbers which represent them in Z5:

dist(B1, B2) = |B1 −B2|.

Similarly to Pressing’s definition of cognitive complexity, we will define the complexity of each

harp core by means of a comparison to more basic patterns. In this case, the more basic patterns

will be the shorter cores. This definition applies to both allowable harp patterns and those which are

disallowed, thus enabling comparison between the complexities of cores resulting from the observed

rules and possible cores which were excluded from these types of harp songs. We will define three

measures of complexity, denoted C1, C2, and CA. C1 and C2 will be variations on the same theme

of comparing the core in question with a single “closest” core of a smaller size. CA will use an

“average” over the smaller cores.

Each complexity measure relies upon a definition of the distance between two words. We define

a distance closely related to the Hamming distance or d-distance of information theory. Let l and

d be positive integers. Let u = u1u2...ul and v = v1v2...vl be two finite strings of elements of Zd,

each of length l. If dist(ui, vi) = |ui − vi| is the distance between ui and vi as defined above, then

the distance between u and v is defined to be

(4.4)    dist(u, v) = (1/l) ∑_{i=1}^{l} dist(u_i, v_i).


In short, the distance between two finite words of the same length is the average distance between

their corresponding letters.

Since the cores used in the harp patterns are repeated over and over in a cyclical pattern,

there is a straightforward definition of the distance between two finite cores of different lengths.

Consider u and v of positive lengths s and t, respectively. Define u′ to be u concatenated upon

itself lcm(s, t)/s times to create a word of length lcm(s, t), and define v′ to be v concatenated to

itself lcm(s, t)/t times to create another word of length lcm(s, t). Then the distance between u and

v is defined to be the distance between u′ and v′.

We define the nearest-neighbor complexity of a core of length 2 as the minimum of the distances

to cores of length 1. Both the C1 and C2 complexities of cores of length 2 are defined this way.

For example, the complexity of the core 01 is 0.5, since its distance from 0 and 1 is equal to 0.5:

C1(01) = C2(01) = 0.5. Similarly, C1(03) = C2(03) = 1.5, demonstrating the more difficult task of

moving both fingers between the 0 bichord and the 3 bichord than between the nearby 0 and 1.
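These definitions translate directly into code. The following minimal sketch (Python, ours; math.lcm is assumed to be available, i.e. Python 3.9 or later) implements the word distance of equation (4.4), extended to cores of different lengths by repetition up to the least common multiple, and the nearest-neighbor complexity of length-2 cores; it reproduces C(01) = 0.5 and C(03) = 1.5 as quoted above.

```python
from math import lcm

def word_dist(u, v):
    """Equation (4.4): average letter distance |u_i - v_i| after repeating both
    cores up to length lcm(len(u), len(v))."""
    n = lcm(len(u), len(v))
    uu = u * (n // len(u))
    vv = v * (n // len(v))
    return sum(abs(a - b) for a, b in zip(uu, vv)) / n

def complexity_length2(core, d=5):
    """Nearest-neighbor complexity of a length-2 core: the minimum distance to
    the d constant cores of length 1 (C1 and C2 coincide here)."""
    return min(word_dist(core, (b,)) for b in range(d))

print(complexity_length2((0, 1)))   # 0.5
print(complexity_length2((0, 3)))   # 1.5
```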

The nearest-neighbor complexity of a core of length longer than 2 takes into account all shorter

cores which are “close to” the core under consideration as well as their complexities. There are

many reasonable variations on this theme, two of which are discussed here. In the first version,

C1, finding a nearby neighbor is considered of more importance than the complexity of that nearby

neighbor: the set of words of minimum distance is found and then complexity is minimized within

that set. In the second version, C2, complexity and distance are minimized together by finding the

minimum of their product.

More precisely, given a word u of length k > 2, for each i < k let V_i be the set of words of length i which have the minimum distance to u:

V_i = {v | v is of length i, dist(u, v) ≤ dist(u, w) for all words w of length i}.

If i = 1, then s_1(u) = dist(u, v) for any v in V_1. (Note that s_1(u) is well-defined since every element of V_1 has the same distance to u.) For i > 1 we define

s_i(u) = min{dist(u, v) · C_1(v) | v ∈ V_i}.

Then the C_1 complexity of u is defined to be

C_1(u) = ∑_{i=1}^{k−1} s_i(u) / ω_i,

where ω_i is a weighting term.

Some possible weighting variables ω_i are i, 5^i, 5^{i−1}, or i!. We have chosen to use ω_i = 5^{i−1}. Since we are working with an alphabet of size 5, the number of possible words of length i is given by 5^i. Thus with this choice of ω_i, the weight on each sum term approximates the number of


choices among which we minimized. Furthermore, because of computing resource limitations, the

summations resulting in Table 4.5 are truncated at 5. (Note that the distance function is bounded

by 4, so as i increases, the weight variable we chose will quickly dominate and the contribution

from the i’th term becomes inconsequential.)

Our second version of the nearest-neighbor complexity of a core u of length k minimizes the distance and the complexity of the comparison words together, with equal importance:

s_1(u) = min{dist(u, v) | v is a word of length 1},
s_i(u) = min{dist(u, v) · C_2(v) | v is a word of length i}   for i > 1.

Then C_2 is the weighted sum of the s_i's, as before:

C_2(u) = ∑_{i=1}^{k−1} s_i(u) / ω_i.

The calculations in Table 4.5 use the same weighting variable as the first version and truncate the

summation term in the same location.

The final complexity measure in the table, the complexity over averages, is calculated using the average of distance times complexity over all the smaller words v. In other words,

(4.5)    C_A(u) = ∑_{i=1}^{k−1} [ mean_v( dist(u, v) · C_A(v) ) ] / ω_i,

where for each i the arithmetic mean is taken over all words v of length i.

As we see in Table 4.5, the C2 complexity measure appears to do the best job of finding the

distance a player’s hands must travel to play any pattern of bichords. For example, a harpist’s

hands can never move more than 4 strings with each succeeding note, yielding a “tune” alternating

between bichords 0 and 4, or . . . 040404040 . . .. This tune does indeed have the largest C2 complexity

in the table, with a measure of 2. However, it has among the lowest complexities with averages.

In contrast, a harpist playing the “tune” . . . 00000000010000000001 . . . moves his hands very little,

and this pattern has the lowest C2 complexity in the table (0.15), while it has the

highest complexities with averages (2.74).

One reason that CA seems to be a poor measure for hand movement is demonstrated by a

simple example. Consider a harpist playing a sequence of bichord 0 alternating with bichord 1.

This harpist’s hands move the same amount between each bichord no matter the number of total

strings on the harp (i.e., the size of the alphabet). Consequently, the complexity of 01 should

remain constant across alphabet sizes. However, as the size of the harp grows, CA(01) will also

grow as it is compared to words farther and farther away. In general, a distant word of high complexity has at least as much influence on CA as a nearby word of low complexity. One method of counteracting this problem might be to divide by the distance instead

of multiplying. This would have the immediate effect that words further away would have less

influence over the complexity of the word in question.

The two nearest-neighbor complexities both give similar results, and it is not until we examine

the sequences with period of length 10 that we see any difference at all. The values given by C1 are

naturally higher, since the minimum is being taken over a smaller set. Within each set of period

lengths, C1 and C2 demonstrate similar ratios among the patterns. However, we can see a telling

difference between the two when we examine 00001 and 0000000001. We see that C1(00001) is

actually smaller than C1(0000000001), indicating that 00001 is less complex than the string with

more zeros. It may indeed be mentally less complex in that the musician has to keep track of a

smaller number of 0’s before he plays a 1. However, in terms of the complexity we are trying to

measure, namely the movement of the hands, we find that C2 gives a more satisfying conclusion

with C2(0000000001) < C2(00001).

Our goal was to find a complexity measure that would evaluate the total amount of movement

of the harpists’ hands as a measure of the difficulty of the piece. The movement of the player’s

hands is not the only factor that plays into the difficulty of learning and playing a piece of music.

Other factors include the speed at which the piece is played, the positioning of the hands (for

example, whether the piece is played with one hand or two – a choice which varies by harpist – or

whether fingers or hands must cross over each other), the length of the period of the piece (which

determines how much the player has to remember), the care with which the player has to count

rhythms, finger fatigue, and many other factors. Here we find again that music is multidimensional

and we are able to calculate only one aspect of its complexity at a time, namely the movement

of the hands. This particular dimension seems to be measured well by the two nearest-neighbor

complexities. Variations on this measure may capture well the complexity due to some other

aspect. With enough such complexity measures and an understanding of what each one means, we

might be able to combine them to gain a reasonable profile of the overall complexity of a piece of

music.


Table 4.5. The computed complexities of selected bichord patterns

Bichord pattern C1 C2 CA

. . . 01010101010101. . . 0.5 0.5 1.7

. . . 02020202020202. . . 1 1 1.6

. . . 03030303030303. . . 1.5 1.5 1.7

. . . 04040404040404. . . 2 2 2

. . . 010010010010010. . . 0.38 0.38 2.38

. . . 012012012012012. . . 0.75 0.75 2.03

. . . 013013013013013. . . 1.12 1.12 2.12

. . . 024024024024024. . . 1.48 1.48 2.29

. . . 000010000100001. . . 0.26 0.26 2.67

. . . 000020000200002. . . 0.48 0.48 2.62

. . . 012340123401234. . . ∗ 1.35 1.35 2.29

. . . 024130241302413. . . ∗ 1.35 1.35 2.29

. . . 031420314203142. . . ∗ 1.35 1.35 2.29

. . . 043210432104321. . . ∗ 1.35 1.35 2.29

. . . 021340213402134. . . ∗∗ 1.35 1.35 2.29

. . . 012310123101231. . . ∗∗ 0.91 0.91 2.12

. . . 023040230402304. . . ∗∗ 1.58 1.58 2.45

. . . 113231132311323. . . ∗∗ 0.91 0.91 1.95

. . . 034140341403414. . . ∗∗ 1.58 1.58 2.51

. . . 00000000010000000001. . . 0.54 0.15 2.74

. . . 01000100000100010000. . . 0.68 0.24 2.64

. . . 01341240230134124023. . . ∗ 3.00 1.35 2.06

. . . 02132430410213243041. . . ∗ 3.05 1.35 2.06

. . . 01403423120140342312. . . ∗ 3.00 1.35 2.06

. . . 01340243140134024314. . . ∗∗ 3.27 1.55 2.07

. . . 04241414420424141442. . . ∗∗ 3.25 1.53 1.85

. . . 01120440210112044021. . . ∗∗ 2.70 1.24 2.27

∗ These patterns follow the grammatical rules of the harp tunes.

∗∗ These patterns were generated randomly for comparison.


CHAPTER 5

Conclusion

Every piece of music is an intricate balance of form, pitch, harmony, rhythm, and many intangible aspects. The intricacy of these relationships and the impact they have on each other and on

the listener is impossible to measure completely. In this paper we have tried to understand some of

these individually by using ideas from information theory and mathematical linguistics. We have

set forth examples of how some of these tools may be used to examine the complexity (or richness

or difficulty or unexpectedness) of some of these aspects. Perhaps such techniques may allow a

musicologist to create a “complexity profile” of a type of music.

The concepts of entropy and complexity from information theory were used repeatedly throughout this work. We have primarily used three general conceptions of complexity and entropy: as a measure of the difficulty of learning or playing a sequence, as a measure of the amount of freedom and richness remaining within a defined structure, and as a degree of unexpectedness in a

sequence. The first definition was realized in rhythmic sequences in the cognitive complexity of the

Clave rhythms and in the various complexities we defined for the sequences of bichords played by

the Nzakara harpists. The second was examined in the discussion of Boulez’ ...explosante-fixe...,

Mozart’s Musikalisches Wurfelspiel, and the size of the Nzakara harpists’ potential vocabulary.

Finally, the third definition, that of the degree of unexpectedness in a sequence, was explored in

both the ...explosante-fixe... and Musikalisches Würfelspiel discussions and in the sections on the

Lempel-Ziv and metric complexities of the Clave rhythms.

While these definitions of complexity and entropy are integrally related, there is yet no common

structure in which to interpret the results. Still some amount of comparison can be made. In the

case of the Clave rhythms, we found that both the metric complexity and the cognitive complexity

measures determined that the Bossa-Nova rhythm is the most difficult of the six while the Shiko is

the easiest, a result matching the experience of many students and musicians learning and playing

these rhythms. Further, given additional computing power, we would be able to compare the

complexities of the allowed bichord sequences of the Nzakara harp tunes to those of disallowed sequences of the same length, to seek some reason why the allowed tunes might have been preferred – are they easier to play, for example, or are they more interesting than the others?

We have gathered here a set of tools to study mathematically the vast complexity of music.

Many attempts have been made in the past to analyze music mathematically. We believe that the

techniques and approach found in this paper are valuable in being at once mathematically sound and respectful of music's tendency to resist containment and restriction. The approaches we take apply

across time and space, as demonstrated by our mix of examples from Africa, Europe and America,

and from tribal traditions of unknown age to classical music of the 1700’s to modern compositions.

Our examples provide novel extensions of previous research, enabling us to understand further

several musical examples without unnecessarily restricting their form and musical identity.


BIBLIOGRAPHY

1. Asymptotic equipartition property, http://en2.wikipedia.org/wiki/Asymptotic_equipartition_property, Accessed on November 6, 2004.

2. Simha Arom, African polyphony and polyrhythm: Musical structure and methodology, Cambridge University Press, Cambridge, 1991, Translated from French by Martin Thom, Barbara Tuckett, and Raymond Boyd.

3. Judith Becker and Alton Becker, A grammar of the musical genre srepegan, Journal of Music Theory 23 (1979), no. 1, 1–43.

4. Louis Block, John Guckenheimer, Michal Misiurewicz, and Lai Sang Young, Periodic points and topological entropy of one dimensional maps, Global Theory of Dynamical Systems (Zbigniew Nitecki and Clark Robinson, eds.), Springer-Verlag, Heidelberg, June 1979, pp. 18–34.

5. Pierre Boulez, Pierre Boulez: Conversations with Célestin Deliège, Eulenburg Books, London, 1976, First published in French under the title Par Volonté et par Hasard: Entretiens avec Célestin Deliège, 1975.

6. William Bright, Language and music: Areas for cooperation, Ethnomusicology 7 (1963), no. 1, 26–32.

7. Michael Brin and Garrett Stuck, Introduction to Dynamical Systems, Cambridge University Press, Cambridge, 2002.

8. Gregory J. Chaitin, On the length of programs for computing finite binary sequences, Jour. ACM 13 (1966), 547–569.

9. Marc Chemillier, La musique de la harpe, Une esthétique perdue (Éric de Dampierre, ed.), Presses de l'École Normale Supérieure, Paris, 1995, pp. 99–208.

10. Marc Chemillier, Ethnomusicology, ethnomathematics. The logic underlying orally transmitted artistic practices, Mathematics and Music: a Diderot Mathematical Forum (G. Assayag, H. G. Feichtinger, and J. F. Rodrigues, eds.), Springer, Berlin, 2002, pp. 161–183.

11. Robin Cooper, Abstract structure and the Indian Raga system, Ethnomusicology 21 (1977), no. 1, 1–32.

12. Kenneth Falconer, Techniques in Fractal Geometry, John Wiley and Sons, Ltd., West Sussex, England, 1997.

13. Steven Feld, Linguistic models in ethnomusicology, Ethnomusicology 18 (1974), no. 2, 197–217.

14. Paul Griffiths, Boulez: Exploding-Fixed, included with the compact disc Boulez Conducts Boulez, 1995.

15. A. N. Kolmogorov, Three approaches to the definition of the concept 'quantity of information', Problems of Information Transmission 1 (1965), 3–11.

16. Abraham Lempel and Jacob Ziv, On the complexity of finite sequences, IEEE Transactions on Information Theory IT-22 (1976), no. 1, 75–81.

17. F. Lerdahl and R. Jackendoff, A Generative Theory of Tonal Music, MIT Press, Cambridge, Massachusetts, 1983.

18. Fred Lerdahl and Ray Jackendoff, Toward a formal theory of tonal music, Journal of Music Theory 21 (1977), no. 1, 111–171.

19. Karl Petersen, Ergodic Theory, Cambridge University Press, Cambridge, 1983.

20. Karl Petersen, Symbolic Dynamics, lecture notes from Mathematics 261, Spring 1998, University of North Carolina at Chapel Hill, 1998.

21. Ivars Peterson, Mozart's Melody Machine, http://www.sciencenews.org/articles/20010901/mathtrek.asp, September 2001.

22. John R. Pierce, An Introduction to Information Theory: Symbols, Signals, and Noise, second ed., Dover Publications, Inc., New York, 1980.

23. Harold S. Powers, The structure of musical meaning: A view from Banaras (a metamodal model for Milton), Perspectives of New Music 14 (1976), no. 2, 307–334.

24. Harold S. Powers, Language models and musical analysis, Ethnomusicology 24 (1980), no. 1, 1–60.

25. Jeffrey Pressing, Cognitive complexity and the structure of musical patterns, internet article, accessed on August 16, 2004.

26. Nicolas Ruwet, Théorie et méthodes dans les études musicales: quelques remarques rétrospectives et préliminaires, Musique en jeu 17 (1975), 11–36.

27. Paul C. Shields, The ergodic theory of discrete sample paths, Graduate Studies in Mathematics, vol. 13, American Mathematical Society, Providence, RI, 1996.

28. Godfried Toussaint, A Mathematical Analysis of African, Brazilian and Cuban Clave Rhythms, Proceedings of Bridges: Mathematical Connections in Art, Music, and Science (Wichita, Kansas) (Reza Sarhangi, ed.), Central Plain Book Manufacturing, 2002, pp. 157–168.

29. Homer Spence White, On the Algorithmic Complexity of the Trajectories of Points in Dynamical Systems, Ph.D. thesis, University of North Carolina at Chapel Hill, 1991.
