ISSN 1744-1986

Technical Report No 2010/09

Pattern matching in music and its use for automated composition

Tom Collins

13 May, 2010

Department of Computing
Faculty of Mathematics, Computing and Technology
The Open University
Walton Hall, Milton Keynes, MK7 6AA
United Kingdom

http://computing.open.ac.uk





Probation report:

pattern matching in music and its use for automated composition

Tom Collins

Friday 1 May 2009

Abstract

It is something of a truism to say that self-reference abounds in ‘classical’ music. That is, given a single piece of music, it will most likely contain instances of verbatim repetition as well as more subtle variation. Why then, when programming a computer to generate music, is little attention paid to ensuring the program’s output contains self-reference? If the program is intended to test a model of musical style then its inattention to self-reference may result in failure of the test. This report gives further exposition of the above issue and the research question thus motivated. Current computational methods for pattern matching in music are exemplified and criticised. An introduction to Markov models for music is also provided, which is intended to be suggestive of more intricate computational models of musical style. The research proposal emphasises the need to integrate pattern matchers and music generators, and makes feasible plans to pursue this objective.


Contents

1 Research question 1

2 Review of existing research 4

2.1 Representing music 4

2.2 Pattern matching in music 7

Motivations 7

Methods 8

Evaluation 12

2.3 Automated composition 14

Boundaries 14

Critique of a Frequentist Statistical Approach 16

Markov Models for Music: an Example 18

3 Research approach 22

3.1 Review of pattern matching in music 22

3.2 Theoretical extension 22

3.3 Implementation, with an adapted measure for pattern interest 22

3.4 Evaluation of the pattern matcher 23

3.5 Formation of a generative model of musical style 24

3.6 Integration of pattern matcher into model of musical style 24

3.7 Evaluation of models of musical style 25

4 Work schedule 26

5 Glossary of music-theoretic terms 28

6 Bibliography 30

7 Appendix 34

7.1 Some theory of datasets and patterns 34

7.2 Draft: a musically robust Markov model for analysis and composition 41

7.3 Draft: using discovered, polyphonic patterns to filter computer-generated music 46


1 Research question

Two fields provide the context for my research question (p. 2): pattern matching in music and automated composition. Figure 1 shows the relation of these fields to one another, and to their antecedents, music analysis and music composition respectively. Music analysis is ‘generally motivated by a desire to encounter a piece of music more closely, to submit to it at length, and to be deeply engaged by it, in the hope of thereby understanding more fully how it makes its effect’ [39, p. 127]. In the seventies a new facet of music analysis began to emerge, prompted by the realisation that some (but certainly not all) of the activities of the music analyst might be performed or at least assisted by a computer [23]. This facet is referred to as pattern matching in music, so as to distinguish it from pattern matching in image or text processing, although of course there has been cross-pollination [5]. ‘Pattern matching in music’ is still a rather broad term, admitting, for example, anything from labelling segments of music with chord names [34] to locating the structural boundaries of a piece [41]. So to be more specific, my research is concerned with identifying self-reference: a common initial step in making an analysis of a piece is to identify instances of repetition, ranging from verbatim restatements of material to more subtle variations. This initial step has been called ‘identifying self-reference’ [5] (and should not be confused with the computer-science connotations of the term ‘reference’).


Figure 1: A diagram to show the fields of pattern matching in music and automated composition, their relation to one another and their respective antecedent fields, music analysis and music composition.

In Figure 1 music analysis and pattern matching are shown in dialogue. When a music-analytical activity is addressed from the point of view of pattern matching this gives rise to a new method for music analysis. At the same time it may raise questions about aspects of music analysis that heretofore were taken for granted. An analogous dialogue is shown in Figure 1 between music composition and automated composition [36]. Since at least as far back as 1959 computers have been programmed to generate music [20]. Music composition is evolving new semi- and fully-automated methods [4]. My research focuses on Markov models for music, which are introduced properly in section 2.3 (Markov models for music: an example). The rationale behind a Markov model for music is as follows: suppose you wish to generate music in the style of an existing collection of pieces, a ‘corpus’. The corpus and the musical transitions therein are analysed to provide knowledge such as ‘starting from state X, the corpus contains:

• m instances of the transition X → Y;

• n instances of the transition X → Z’.

Often the analysis results are informative in their own right, but importantly they can be put to use in a compositional scenario. Finding your composition in state X, you would choose to write state Y next with probability m/(m+n), and state Z next with probability n/(m+n). If a computer is programmed to make this choice then the compositional process becomes automated in some sense. Later on in this report I discuss the issue of evaluating the output of such processes as ‘stylistically successful’ (or not). Also indicated in Figure 1 is the complex relationship between composition and analysis. These are reciprocal processes to some extent, for in ‘putting together’ (i.e. composing) music, one may benefit from analysing the efforts of other composers. The complexity of the relationship is due to the historical and geographical richness of compositional and analytic processes. There is a parallel relationship between pattern matching in music and automated composition, one which I argue is underdeveloped.1
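The transition-counting and sampling procedure just described can be sketched in a few lines of Python. This is an illustrative sketch only: the corpus of state sequences and the state labels are invented, and in practice the states would be chords or melodic fragments drawn from actual pieces.

```python
import random
from collections import defaultdict

def train(corpus):
    """Count the transitions X -> Y observed in a corpus of state sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for piece in corpus:
        for x, y in zip(piece, piece[1:]):
            counts[x][y] += 1
    return counts

def next_state(counts, x):
    """Choose a successor of state x with probability proportional to its
    count, e.g. Y with probability m/(m+n) and Z with probability n/(m+n)."""
    successors = counts[x]
    states = list(successors)
    weights = [successors[s] for s in states]
    return random.choices(states, weights=weights)[0]

# A toy corpus in which X -> Y occurs twice (m = 2) and X -> Z once (n = 1),
# so from X the model writes Y with probability 2/3 and Z with probability 1/3.
corpus = [["X", "Y", "X", "Z"], ["X", "Y"]]
counts = train(corpus)
```

Repeatedly calling next_state from some initial state yields a generated sequence, which is the sense in which the compositional process becomes automated.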

Situated in this context, my research question is

How can current methods for pattern matching in music be improved and integrated into an automated composition system?

At present my envisaged answer to the research question has the following form:

1. review of pattern matching in music, especially the SIA algorithms [30], related and subsequent work;

2. theoretical extension, built on findings in 1 (this is well underway, see section 7.1);

3. implementation of 2, incorporating an adapted measure for pattern interest [9];

4. evaluation of the pattern matcher emerging from 2-3;

5. formation of a generative model of musical style for a specific corpus of music, and drawing on replicable aspects of existing research [13, 37];

1One author, for instance, claims to have incorporated pattern matching in a program for automated composition [13, pp. 139-152], yet a reviewer suggests that the author’s implementation falls short of the claims: ‘As far as I can tell from the undocumented code, gaps in allusions are not allowed, so that under this definition, variations on a theme would not count as an allusion’ [48, pp. 113-114].


6. integration of the pattern matcher from 3 into the model of musical style from 5;

7. evaluation of models from 5-6.2

As stated, step 2 is well underway, so section 7.1 is indicative of what an answer will look like (to this part of the research question). Further, evaluative steps 4 and 7 will indicate when the question has been answered (more details about these steps are included in section 3 on research approach). Both steps 4 and 7 will involve trials with subjects (a single music-analytic expert or multiple subjects). The results of these trials will be subject to standard statistical analyses, making it possible to quantify findings such as ‘pattern matcher A constitutes an improvement on pattern matcher B’, or ‘subjects were able to discriminate between computer-generated passages and original human-composed excerpts’. Herein lies part of my contribution to research. If, say, the statistical analysis in step 4 shows that my proposed pattern matcher performs better than an existing pattern matcher, then this is evidence that steps 1-3 represent a contribution to research. Similarly, suppose that the statistical analysis in step 7 shows that subjects cannot discriminate original human-composed excerpts from computer-generated passages based on my proposed model of a specific musical style. Then this is evidence of a well-fit model of musical style, and of steps 4-5 constituting a contribution to research. The proper use of statistical analysis necessitates a clear statement of my research aims, and consequently the results of the analysis will provide a clear indication of the extent to which these research aims have been met. As both pattern matching in music and automated composition are interdisciplinary and relatively immature fields, the methodology itself will constitute a contribution to research. There are other smaller contributions to be made (such as trying to replicate programs that are outlined only [11]), and these ‘research gaps’ will be highlighted at appropriate points within the review of existing literature.

As the work schedule (section 4) indicates, the above steps do not need to be accomplished in chronological order. Steps 2-3 are well underway and step 1 is an ongoing process. Initial attempts at steps 5-6 have been made in order to assess their overall feasibility. Having been encouraged by the results of these attempts, work on steps 5-6 continues in earnest. Being aware of the time demands and pitfalls of trials with multiple subjects, I have included a mock trial on my homepage (‘Whodunnit?’, at http://users.mct.open.ac.uk/tec69). This has enabled me to gain useful feedback that will be incorporated into the design of a more formal trial (see the research approach in section 3 for more details). In Computing, one of the probationary requirements is evidence of an implementation. This can also be found at the above address, appended by ‘/probation-examples.html’.

2Terms in this list whose meanings are unclear will be introduced properly in section 2.


2 Review of existing research

The overall aims of this section are to give further context for this research, and to situate my research question in this context. Before embarking on this course, readers unfamiliar with musicology may find section 2.1 helpful.

2.1 Representing music

For the majority of this research, the term ‘music’ is taken to refer to one or more music scores in staff notation (see the Glossary of music-theoretic terms, p. 28). It is helpful to keep the wider picture in mind also (Figure 2), where ‘the musical event’ is depicted as a psychological construct, manifest in artefacts (e.g. a music score or recording) and experiences (e.g. a live music performance).


Figure 2: A diagram similar to this was presented by Geraint Wiggins in a talk entitled ‘Computational modelling of music cognition’ at the Institute of Music Research, London on 16 February 2009. The terms are Babbitt’s [2, p. 76]. Similar models have been proposed since, for example see Leman [21].

Figure 3 contains an excerpt of music in staff notation. It is written for piano, with the top staff (five lines) for the right hand and the bottom staff for the left. The first three pieces of information on each staff are the clef, key signature and time signature.3 To the right of this information is a series of dots (noteheads) with tails (stems) referred to as notes. Each note has:

• an ontime (the time at which the note begins, increasing from left to right);

• a note height (relative height on the staff, defined elsewhere as ‘morphetic pitch’ [29]);

3It is not essential to know about these but they are defined in the glossary.


• a duration (indicated by a combination of filled or empty noteheads and stem types),

among other attributes. For example, Figure 4 overleaf is a reproduction of the first bar from Figure 3, minus the performance instruction con espressione and some phrase marks. Just above the reproduced bar in Figure 4 are some possible note heights and durations, and below the reproduced bar are the ontimes. Figures 5 and 6 maintain this format. The first right-hand note is converted to a vector as shown in Figure 5. The arrows and lines emerging from this first right-hand note indicate that it has ontime 0, note-height 67 and duration 4. The units (beats, seconds etc.) are arbitrary: it is the relation of one note to another—not an absolute quantity—that is of interest. Similarly in Figure 6 it can be seen that the first left-hand note has ontime 1, note-height 60 and duration 1. Continuing in this fashion, each of the notes in Figure 3 can be converted to a point in multidimensional space, so that the whole excerpt can be expressed as a dataset

D = {(0, 67, 4), (1, 60, 1), (2, 62, 1), (3, 58, 1), (4, 60, 1), (4, 66, 1), . . . , (48, 60, 1)}. (1)

The formal representation of music as points in multidimensional space can be traced back at least two decades [22], and it underpins a large amount of research in the emerging field of ‘empirical musicology’ [6], an early example being ‘viewpoints’ [8]. Mathematicians might infer from (1) that a dataset is a non-empty, finite subset of R^k, with the dimension k = 3 in this instance. What is gained by representing Figure 3 in this way is the potential to apply relevant concepts from linear algebra, probability and statistics: lost in this representation are certain aspects of the music score (e.g. voicing) that do not convert readily to real numbers or integers.
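A minimal sketch of this representation in Python follows; the dataset reuses the opening points of (1), and the function name project is mine. Projections of this kind become relevant later, when not every dimension matters for a given pattern.

```python
# Each note is an (ontime, note_height, duration) triple, as in dataset (1).
D = {(0, 67, 4), (1, 60, 1), (2, 62, 1), (3, 58, 1), (4, 60, 1), (4, 66, 1)}

def project(dataset, dims):
    """Orthogonal projection of a dataset onto the given dimension indices."""
    return {tuple(point[i] for i in dims) for point in dataset}

# Keeping ontime and note height but discarding duration:
onset_pitch = project(D, (0, 1))  # e.g. (0, 67, 4) becomes (0, 67)
```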


Figure 3: Bars 1-6 of Variation V from Seven Variations on ‘God save the King’ in C major, by Ludwig van Beethoven.

Both Figure 3 and the dataset defined in (1) are symbolic representations of music [21]. In order to hear the music to which they refer, one must either locate a recording of the piece, or convert Figure 3 or (1) into a recording. For instance, the reader may wish to listen to Track 1 at

http://users.mct.open.ac.uk/tec69/probation-examples.html

from a recording by the pianist Emil Gilels. Track 2, on the other hand, is the conversion of Figure 3, via a synthesizer, to a sound recording. Recordings are an example of an acoustic representation of music [21]. The purpose of these given recordings is to demonstrate the relative expressiveness of a human performance compared with the synthesizer conversion.



Figure 4: The first bar from Figure 3, with possible ontimes, note heights and durations.


Figure 5: Focus on the first right-hand note.


Figure 6: Focus on the first left-hand note.


They represent two extremes, and by including more information (e.g. dynamic levels) in the synthesizer conversion it is possible to achieve a ‘median’, like Track 3, sounding less mechanical than Track 2 but still lacking the richness of Track 1. Whilst acknowledging the existence of a ‘nonlinear relationship between the notes in the score and what people hear when they listen to a performance of it’ [10, p. 79], for the reasons above the majority of my research makes use of symbolic representations such as (1).

2.2 Pattern matching in music

Motivations

The field known as ‘music information retrieval’ is motivated primarily by the following scenario:

‘Imagine a world where you walk up to a computer and sing the song fragment that has been plaguing you since breakfast. The computer accepts your off-key singing, corrects your request, and promptly suggests to you that “Camptown Races” is the cause of your irritation. You confirm the computer’s suggestion by listening to one of the many MP3 files it has found. Satisfied, you kindly decline the offer to retrieve all extant versions of the song, including a recently released Italian rap rendition and an orchestral score featuring a bagpipe duet’ [18, p. 295].

A computer program that operates as described above is yet to be fully realised, but significant progress has been made in recent years [42]. This type of music analysis is known as ‘query-based analysis’. In query-based analysis, a fragment of music is input by the user, the computer program searches pieces of music for the fragment and orders results by a relevance measure. Query-based analysis is not one of my research topics, but its introduction helps to distinguish ‘intra-opus analysis’ [9], which is one of my chosen topics. In intra-opus analysis, a single piece or excerpt thereof is analysed with the aim of identifying repetition or more subtle instances of self-reference. In both query-based and intra-opus analysis the program has access to a database containing one or more pieces of music: the key difference is that with query-based analysis the program is given an extra fragment of music by the user, whereas with intra-opus analysis nothing apart from the database is provided.

There are financial incentives driving research in query-based analysis, but why is intra-opus analysis a worthy pursuit? It is helpful to recall the definition of music analysis from (p. 1): ‘generally motivated by a desire to encounter a piece of music more closely, to submit to it at length, and to be deeply engaged by it, in the hope of thereby understanding more fully how it makes its effect’ [39, p. 127]. For the human analyst, identifying instances of repetition and self-reference in a piece of music is something of a prerequisite to being ‘deeply engaged by it’, an important step no less since it is arguable that music ‘becomes intelligible to a great extent through self-reference’ [5, p. 249]. At present, no computer program performs this pattern matching task adequately. Evidence in support of this claim is that ‘SIATEC [an algorithm] typically discovers tens of thousands of TECs [patterns] even in relatively short pieces … and usually only a very small proportion of these TECs are perceptually significant or analytically interesting’ [30, pp. 342-343]. The preceding survey of existing algorithms [30, pp. 324-328] indicates that this inadequacy is symptomatic of the field in general, not just the SIATEC algorithm. As an assistant to the human analyst, and to the detection of plagiarism in music, such a program would constitute a worthy contribution. Further it will be argued in section 2.3 (Markov models for music: an example) that intra-opus analysis has a role to play in computational modelling of musical styles.

Methods

Having touched on the motivations for pattern matching in music, let us review some of the methods. Commonly methods are geared towards monophonic (melody-only) music [5, 8, 9, 26].4 The reason for the bias is twofold. First, notes in monophony can be treated much like words in text, and so algorithms for pattern matching in text can be adapted relatively easily to monophonic scenarios. For instance, Cambouropoulos [5] adapts an algorithm of Crochemore [14].5 Second, in music-theoretic terms monophony is more straightforward than polyphony, hence the preference for investigating monophonic before polyphonic pattern matchers. Superficially then there are four areas (= two motivations × two categories of music) of pattern matching in music:

• query-based analysis for monophonic music [26];

• query-based analysis for polyphonic music [17];

• intra-opus analysis for monophonic music [9];

• intra-opus analysis for polyphonic music [30].

While all four areas use methods that are interesting and occasionally relevant to my work, my research focuses on the latter area. As the bias outlined above indicates, this area is the least developed and therefore the potential for making a contribution to research is greatest.

4The other category is known as polyphonic music, with more than one note sounding simultaneously.

5There is evidence to suggest that text-search algorithms are adaptable to polyphony as well [17].

Meredith et al. [30] give an elegant method for intra-opus analysis of polyphonic music, a method that has been referenced and supplemented by much subsequent work [42]. The paper ‘Algorithms for discovering repeated patterns in multidimensional representations of polyphonic music’ by Meredith et al. [30] will be abbreviated hereafter by DRPPM. A proper mathematical exposition and theoretical extension of the method in DRPPM is deferred to an appendix (section 7.1), preferring here to give some intuition of DRPPM and its shortcomings. This description still covers some of the technical aspects, and makes it clear what distinguishes my work. The essential concept of DRPPM is that of the maximal translatable pattern (MTP). An MTP is a set of vectors, contingent on a dataset D and a vector v. The dataset

D = {a, b, c, d, e, f, g} (2)

will serve as an example and is depicted in Figure 7. It can be seen that

a = (1, 1), b = (2, 1), c = (2, 2), d = (3, 1), e = (3, 2), f = (4, 2), g = (4, 3). (3)

Suppose the vector v = (2, 1), without worrying too much about how v is chosen.6


Figure 7: Depiction of the members of the dataset D from (2) as points in the plane.

Definition. The MTP of the vector v in the dataset D is the set of all vectors in D that are ‘translatable by v in D’ [30, p. 331].

For example, a is a member of the MTP of v in D because

a + v = (1, 1) + (2, 1) = (3, 2) = e, (4)

and e is a member of D. Similarly b is a member of the MTP of v in D because

b + v = (2, 1) + (2, 1) = (4, 2) = f, (5)

and f is a member of D. Looking at d = (3, 1), however,

d + v = (3, 1) + (2, 1) = (5, 2), (6)

6This is addressed in section 7.1. It is just a coincidence that v = b.


and the vector (5, 2) is not a member of D, so d is not a member of the MTP of v in D. Checking each of the vectors in D, we find the MTP of v in D is {a, b, c}, as shown in Figure 8. While this example has focused on a particular vector v and two-dimensional vectors, the definition applies to k-dimensional vectors and datasets.
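The definition translates directly into code. The following sketch (the function names are mine, not DRPPM's) computes the MTP of v in D for the worked example above:

```python
def translate(p, v):
    """Translate point p by vector v, componentwise."""
    return tuple(pi + vi for pi, vi in zip(p, v))

def mtp(v, dataset):
    """Maximal translatable pattern of v in dataset: all points p
    such that p + v is also a member of the dataset."""
    return {p for p in dataset if translate(p, v) in dataset}

# The dataset D from (2)-(3):
D = {(1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (4, 2), (4, 3)}

pattern = mtp((2, 1), D)  # equals {a, b, c} = {(1, 1), (2, 1), (2, 2)}
```

The same function works unchanged for k-dimensional points, since translate and set membership are dimension-agnostic.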


Figure 8: In the left-hand plot, the members of the MTP of v in D are the enlarged points. Recalling v = (2, 1), the enlarged points in the right-hand plot indicate how this MTP is ‘translatable in D’.

What does this have to do with pattern matching in music? DRPPM asserts that ‘in music, MTPs often correspond to the patterns involved in perceptually significant repetitions’ [30, p. 331]. The authors support this assertion with two musical examples and it appears plausible, at least initially. To begin finding MTPs in a piece of music, it may be necessary to focus on a relevant ‘orthogonal projection’ [30, p. 329]. For instance, the excerpt in Figure 9 can be converted to a set of vectors, just as Figure 3 was converted to D in (1) on p. 5 (recall also Figures 4-6). However, not all of the dimensions may be relevant in terms of finding patterns. My (and arguably most musicians’) minimum expectation of a pattern matcher applied to the excerpt in Figure 9 is the identification of the transposed repetition of A at B and again at C. This type of patterning is known in music theory as a ‘sequence’ and on this occasion the pattern involves the dimensions of ontime and note height, but not duration. This is what is meant by focusing on a relevant orthogonal projection.


Figure 9: A sequence indicated by the letters A, B, C.

Figure 10 overleaf shows how the opening lower note from Figure 9 is converted to the vector (0, 63). Notice the focus on ontime and note height but not duration. Figure 10 also indicates that (0, 63) is a member of the MTP of (1, −1), because (0, 63) + (1, −1) = (1, 62), which corresponds to another vector in the dataset.


Figure 10: A review of how notes are converted to vectors, focusing here on the dimensions of ontime and note height. Also shown is the translation (0, 63) + (1, −1) = (1, 62).

I now show why the pattern matcher in DRPPM does not fulfill the stated minimum expectation for the excerpt in Figure 9. The MTP for the vector (1, −1) is indicated by the darker notes in Figure 11. For instance, take the note labelled 1, having ontime 0 and note-height 63. The note labelled 4 has ontime 1 and note-height 62. In other words, note 4 occurs 1 beat later and 1 note-height lower than note 1. Hence note 1 is a member of the MTP of (1, −1), as there is a note (note 4) occurring 1 beat later and 1 note-height lower. This relationship was discussed already with regard to Figure 10. Abbreviating this relationship between notes 1 and 4 by ‘1 → 4’, it can also be seen from Figure 11 that 2 → 5, 3 → 6, 4 → 7, 5 → 8, 6 → 9, and 13 → 18. Therefore the darker notes 1, 2, 3, 4, 5, 6, 13 in Figure 11 indicate the MTP for the vector (1, −1).


Figure 11: The darker notes indicate the MTP for the vector (1, −1). Note indices are for the purposes of discussion.

Similarly the darker notes 1, 2, 3, 9 in Figure 12 indicate the MTP of (2, −2). In order to identify the sequence A, B, C, the method in DRPPM is reliant on either the MTP for (1, −1) or for (2, −2) being equal to A. Then by translating the discovered pattern, the two further instances, B and C, would be identified. Unfortunately this does not happen:

• the MTP of (1, −1) contains both A and B, as well as note 13 in bar 3;

• the MTP of (2, −2) contains A and note 9 in bar 2.

Hence neither of these MTPs is equal to A, and so the pattern matcher does not fulfill the stated minimum expectation of identifying the transposed repetition of A at B and again at C.



Figure 12: The darker notes indicate the MTP for the vector (2, −2).

We have covered some of the technical aspects of DRPPM and its shortcomings. Now for what distinguishes my work. The MTP of (1, −1) consists of notes 1, 2, 3, 4, 5, 6, 13, and the MTP of (2, −2) consists of notes 1, 2, 3, 9. Taking the intersection of these lists gives notes 1, 2, 3, indicated by the darker notes in Figure 13. Comparing Figures 9 and 13, it can be seen that the result is equal to A. Hence my method (of taking the intersection of MTPs) would lead to fulfillment of the minimum expectation. The method in DRPPM is implemented in two steps: find all MTPs; look for translations of each MTP in the dataset. The methodological shortcoming is the assumption that instances of all musically interesting patterns are identified in step one. As Figures 11 and 12 show, this is not the case. I have suggested that taking intersections of MTPs may lead to an improved pattern matcher, and shown (in section 7.1) that a pattern is an intersection of maximal translatable patterns if and only if it is maximal in another sense (with respect to its translational equivalence class). Efforts are also being made to try to balance taking intersections of MTPs from the point of view of computational complexity. This is what distinguishes the current research from DRPPM.
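The intersection idea can be illustrated on the toy dataset D from (2) rather than the excerpt of Figure 9, which is not reproduced here; the two vectors below are chosen purely for illustration, and the helper name mtp is mine:

```python
def mtp(v, dataset):
    """All points p in the dataset such that p + v is also in the dataset."""
    return {p for p in dataset
            if tuple(pi + vi for pi, vi in zip(p, v)) in dataset}

D = {(1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (4, 2), (4, 3)}

# Each MTP on its own may contain spurious members; intersecting the MTPs
# of two vectors keeps only the points that repeat under both translations.
intersection = mtp((1, 1), D) & mtp((2, 1), D)
# mtp((1, 1), D) is {(1, 1), (2, 1), (3, 1), (3, 2)} and
# mtp((2, 1), D) is {(1, 1), (2, 1), (2, 2)}, so the intersection
# is the strictly smaller pattern {(1, 1), (2, 1)}.
```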


Figure 13: The darker notes indicate the intersection of the MTP of (2, −2) and the MTP of (1, −1). These darker notes give A as defined in Figure 9.

Evaluation

In this section the criteria for ‘improving’ music pattern matchers are addressed. Suppose that a music-analytic expert [41] or group of musically trained subjects [38] has analysed a piece of music and identified certain patterns and their recurrence. The responses could be taken as a ‘target set’ against which to compare the performance of a computational pattern matcher. If the pattern matcher identifies m of the n target patterns then its recall can be quantified as m/n. Its precision can be quantified as m/M, where M is the total number of patterns found by the algorithm. The latter measure will punish pattern matchers that are exhaustive to the point of being impractical to use. Recall and precision are standard measures for evaluating natural language processing systems [28, pp. 267-271], and have been applied to evaluate query-based analytic systems for polyphonic music [17]. If the study described above were conducted and the resulting discussion made use of recall and precision measures, then this would constitute a new application to evaluating intra-opus analytic systems. This method could be criticised for assuming that the target set is a legitimate benchmark. Its use implies that the aim is emulation (not surpassing) of human performance on a certain task. While these issues will not be dwelt upon, it is conceivable that the target set alluded to above does not contain a pattern—perhaps it was hidden by the composer deliberately—but nonetheless it is identified by the computational pattern matcher.
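As a minimal sketch (the function and variable names are mine, not from the cited literature), the two measures can be computed as:

```python
def recall_and_precision(target_patterns, found_patterns):
    """Recall m/n and precision m/M: m target patterns identified, out of a
    target set of size n, with M patterns returned by the algorithm in total."""
    m = len(set(target_patterns) & set(found_patterns))
    return m / len(set(target_patterns)), m / len(set(found_patterns))

# A matcher that returns many spurious patterns keeps its recall but loses precision.
target = {"A", "B", "C", "D"}
found = {"A", "B"} | {f"spurious{i}" for i in range(8)}
recall, precision = recall_and_precision(target, found)
print(recall, precision)  # 0.5 0.2
```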

In designing and reporting on monophonic pattern matchers, researchers have found it helpful to define measures such as pattern prominence [5] and interest [9], expressed as real numbers on an arbitrary scale. In this way it is possible to order the output of a pattern matcher, and in so doing its lack of precision may be ameliorated. Returning to the proposed study involving a music-analytic expert or group of musically trained subjects: as a secondary task to identifying patterns in a piece of music, the participant(s) could be asked to assign values to their answers in terms of the interest of each pattern. Any formulation for pattern interest could then be tested for correlation with the human responses (and in the group scenario responses could also be tested for pairwise correlations). While interest measures for monophonic music abound, to my knowledge no such measure has been formulated or adapted for polyphonic music, let alone tested as outlined above. This is a consequence of the bias towards pattern matching for monophonic music, but is also a symptom of the comparative difficulty of quantifying the likelihood of observing a polyphonic pattern in some dataset. I have made an initial attempt to adapt Conklin and Bergeron’s measure for pattern interest [9] to polyphonic scenarios, but I will not give the mathematical details here. Armed with the notions of recall, precision, pattern interest and correlation, it is possible to compare one or more pattern matchers against a target set. In this sense, it will be possible to say that ‘pattern matcher A constitutes an improvement on pattern matcher B’. To conclude, this section has discussed the disposition and shortcomings of the existing literature on pattern matching in music, and two areas for potential contributions to this literature have been highlighted:

• improvement (in terms of recall, precision etc.) of Meredith et al.’s method for pattern matching;

• formulation and correlative testing of a measure for the interest of a polyphonic pattern, or adaptation and testing of such a measure from the monophonic literature.

In both cases it would appear that evaluation by comparison with existing methods and/or a target set is a prerequisite for a piece of work to be considered a contribution to research.


Domain                 Activity                       Motivation

Composition            Algorithmic composition        Expansion of compositional
                                                      repertoire
Software engineering   Design of compositional        Development of tools
                       tools                          for composers
Musicology             Computational modelling        Proposal and evaluation of
                       of musical styles              theories of musical styles
Cognitive science      Computational modelling        Proposal and evaluation of
                       of music cognition             cognitive theories of
                                                      musical composition

Table 1: ‘Motivations for developing computer programs which compose music’, reproduced from [36, p. 141].

2.3 Automated composition

Boundaries

The field of automated composition is diverse and relatively immature, an oft-cited early work being Hiller and Isaacson’s Illiac Suite from 1959 [20, pp. 182-197]. The field’s very existence raises questions to do with music-historical forces [4] and the nature of creativity itself [3], but I choose not to dwell on these here. It also seems inappropriate to attempt a review of the field in all its diversity, ranging for instance from genetic algorithms [47] to artificial neural networks [31]. It will be more fruitful to consider the changes to the research methodology of automated composition brought about by two seminal papers: one, ‘Motivations and methodologies for automation of the compositional process’ [36], abbreviated hereafter as MMACP, that (re)draws the boundaries of the field as shown in Table 1; and another, ‘Towards a framework for the evaluation of machine compositions’ [35], abbreviated hereafter as FEMC, that proposes a framework for testing the output of an automated compositional process. This section begins by discussing whether the boundaries in Table 1 are acceptable, and what the ramifications are of drawing these boundaries. The discussion will be followed by a critique of the approach employed in FEMC and subsequent work [38], and lastly an example Markov model for music is described.

Cope’s work ([13] most recently) is cited in MMACP as an example of algorithmic composition, as his collection of programs (known under the umbrella term EMI, for experiments in musical intelligence) was intended originally to cure a case of writer’s block. One of the tenets of Cope’s approach is to form new musical ideas by recombining existing excerpts of music [13, p. 88]. Initially he set about this task using his own oeuvre as ‘data’, and then attention turned to other composers.7 No methodology is stipulated in MMACP to accompany

7 For examples of EMI output, Track 4 at http://users.mct.open.ac.uk/tec69/probation-examples.html is based on Frederic Chopin’s mazurkas, and Track 5 on Ludwig van Beethoven’s piano sonatas.


the motivation of ‘expanding one’s compositional repertoire’. It would be dogmatic for the authors to stipulate so, having stated earlier that such ‘computer programs are written by the composer as an idiosyncratic extension to. . . compositional processes’ [36, p. 121]. A precise methodology [35] is offered to pursue the computational modelling of musical styles, however, which will be explored in due course. I would argue that when Cope’s attention turns from his oeuvre to those of other composers, a boundary in Table 1 is crossed from algorithmic composition to computational modelling of musical styles. However, Cope’s methods remain unchanged: descriptions of the concepts behind the model, supplemented by examples of music and code. Since much of the work [11] predates MMACP and FEMC, it is perhaps unreasonable to expect any change in method with the boundary crossing. Apparently the authors of MMACP disagree, stating with reference to Cope that ‘the relevance of his work for cognitive science or musicology is significantly compromised by his failure to clearly state the cognitive or stylistic hypotheses embodied in EMI and adapt an appropriate methodology for refuting these hypotheses’ [36, p. 121].8 From this point of view, Cope’s work is in abeyance and consequently neglected by apparently related research (while it may be a mistake, related work [27] lacks a citation, for example). I prefer a more constructive point of view, asking:

• which aspects of Cope’s work are replicable?

• how can these aspects be recast in an appropriate methodology (acceptable in terms of MMACP)?

Rather than neglecting Cope’s work—easily done having read MMACP or subsequent criticism [48]—there is considerable potential for contributing to existing research by addressing the questions posed above. The draft paper in section 7.2 represents an initial attempt to address such questions.

For each of the remaining entries in Table 1, MMACP discusses examples of relevant research and appropriate methodology. The methodology for computational modelling of music cognition is apparently compoundable with that for musical styles, so it shall be covered below. As design of compositional tools is not so relevant to my research question, appropriate methods need not be discussed. The boundaries in Table 1 are acceptable in the sense that they help to frame an immature and diverse field. The authors of MMACP duly acknowledge the shortcomings of any attempt at pigeonholing, and that the boundaries should not be construed as discouraging interdisciplinary research, between musicology and computing, say. A wealth of citations supports the argument that researchers in the field of automated composition often fail to state their aims clearly, resulting from muddled motivations and resulting in equally confused methodologies. Since the publication of MMACP and FEMC, it is no longer the case that ‘simply describing a computer program that composes music counts as a useful contribution to research’ [36, p. 121].

8 An even more vehement criticism exists [48].


Critique of a Frequentist Statistical Approach

To every computational model of musical style (or cognition) there is a process and a product. In FEMC a frequentist statistical approach is espoused for evaluating the product of such a model.9 This section is devoted to an account and critique of the approach taken by FEMC, primarily because I am likely to adopt a similar methodology in evaluating my automated compositional system. Therefore it is important to understand the approach, its strengths and weaknesses. While care has been taken to make the account below comprehensible, if the statistics become too turgid, a bulleted list summarising my critique is given in the last paragraph of this section (p. 17). Remarks on the underpinning philosophy and general suitability of a frequentist approach are deferred until after the account.

A trial is described in FEMC where ten computer-generated excerpts and ten excerpts of human-composed music are presented in a random order to nineteen human subjects. For each excerpt, subjects state whether they think it human-composed or computer-generated. The null hypothesis is that subjects are unable to discriminate and do so therefore at random. Consequently it is expected that

H0 : the mean proportion of human-composed excerpts correctly classified is µ = 0.5,

and similarly for the computer-generated excerpts [35, p. 6]. Even if subjects were guessing, they were not told to aim for a 50:50 split [35, p. 9], so perhaps a more appropriate null hypothesis is: the mean proportion of human-composed excerpts classified as such equals the mean proportion of computer-generated excerpts classified as such.

The authors proceed to test their null hypothesis H0 with a t-test [15]. It is not necessary to describe a t-test in detail, but suppose xi is the observed proportion of human-composed excerpts classified correctly by subject i, where i ranges from 1 to n = 19. The frequentist assumption is that x1, x2, . . . , xn are realisations of independent and identically distributed random variables X1, X2, . . . , Xn. In order to use the t-test an asymptotic assumption is also made that the sample mean X̄ is normally distributed with mean µ and variance σ²/n. This assumption relies on n being large (which is why it is called an asymptotic assumption). In FEMC, n = 19 only.
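As a sketch of the mechanics (the data below are invented for illustration, not the FEMC responses), the t statistic for H0: µ = 0.5 can be computed with the standard library alone:

```python
import math
import statistics

def one_sample_t(xs, mu0):
    """One-sample t statistic: t = (mean - mu0) / (s / sqrt(n)),
    where s is the sample standard deviation (divisor n - 1)."""
    n = len(xs)
    return (statistics.mean(xs) - mu0) / (statistics.stdev(xs) / math.sqrt(n))

# Hypothetical proportions of human-composed excerpts correctly classified
# by n = 19 subjects, each judging 10 such excerpts.
xs = [0.5, 0.6, 0.4, 0.7, 0.5, 0.6, 0.5, 0.4, 0.6, 0.5,
      0.7, 0.5, 0.6, 0.4, 0.5, 0.6, 0.5, 0.7, 0.6]
t = one_sample_t(xs, 0.5)
# Compare |t| with the critical value of the t-distribution on n - 1 = 18
# degrees of freedom (about 2.10 at the 5% level, two-sided).
print(round(t, 3))
```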

A problem with the approach detailed above is that it presents difficulties for non-statisticians in selecting and conducting the appropriate test. It should also be noted that the research implications of the hypotheses are switched in FEMC: typically, rejecting the null hypothesis H0 in favour of an alternative H1 constitutes a contribution to research, whereas failing to reject H0 does not. As a general example, we might look at the mean weight gain µ of anorexia patients after a new two-month drug treatment. The relevant hypotheses are

H0 : µ = 0, H1 : µ > 0.

Rejecting the null hypothesis H0 in favour of the alternative H1 is typically where a contribution to research is made. In the above scenario, rejecting H0 in favour of H1 means there is

9 Less is said about how or whether to evaluate the process, but subsequent work [38] has addressed the matter and will be mentioned in due course.


evidence that the new drug treatment is worth developing further. Drug companies would be less interested in developing the treatment if H0 could not be rejected. In FEMC the research implications of the hypotheses are switched: if H0 can be rejected in favour of H1 then subjects appear to be discriminating human-composed from computer-generated excerpts. This might not be seen as a contribution to research, as the model of musical style/cognition (on which computer-generated excerpts are based) is evidently lacking. On the other hand, if H0 cannot be rejected then, statistically speaking, subjects cannot discriminate between the excerpts. This would be grounds for claiming that the model did constitute a contribution to research. This is what is meant by a switch in the research implications of the hypotheses. While not necessarily a problem or weakness, from a statistical point of view FEMC adopts an atypical methodology. Another area where the research implications of the hypotheses are switched in this manner is ‘goodness-of-fit’ testing [15, pp. 355-359], so the switch is not without precedent.

‘The success of a piece of machine composed music on this test would mean that there are absolutely no perceivable features present or absent in the music which allow experts to identify it as being composed by a machine rather than a human composer’ [35, p. 5]. Given the criticisms outlined above, is this too strong a claim to make? It is notable that ‘experts’ are mentioned in the quotation while general subjects (trained and untrained alike) are used in the trial. At the beginning of this section it was emphasised that a model of musical style (or cognition) consists of a process and a product. I would argue that the methodology detailed above is a test of product but only an implicit test of process. An attempt to address this imbalance has appeared since [38, p. 79], using multiple regression on predictor variables that describe objective musical features. In this way it is possible to infer which musical features cause subjects to class excerpts as more or less ‘stylistic’. Finally, one should note the parallels between the test described and Turing’s well-known ‘imitation game’ [46], although there are also several differences.

The philosophy underpinning the overall approach in FEMC is that of Popper [40]. This philosophy of falsification leads naturally to hypothesis testing based on frequentist assumptions: like the one stated on p. 16, that x1, x2, . . . , xn are realisations of independent and identically distributed random variables. It should be made clear, however, that this method of inference is not an absolute, with another approach existing based on Bayes’ theorem. ‘Many statisticians make the stronger claim that this theorem provides the only entirely consistent basis for inference, and insist on its use’ [16, p. 565]. To conclude, this critique has shown that the approach in FEMC:

• presents difficulties to non-statisticians;

• is atypical in terms of the research implications of its hypotheses;

• focuses on product rather than process.

As a final point, while the methodology in FEMC and subsequent work [36, 38] is upheld for the standardisation it affords to automated composition, one could question why, half a decade on, its proponents are still mainly analysing [41] and generating [38] monophony. There is a ‘research gap’ in models for polyphonic music, and the existing monophonic models [38] are less rich than the polyphonic models of Cope, say.

Markov Models for Music: an Example

The Markov models introduced here are intended to be suggestive of more intricate computational models of musical style (see section 7.2). My research involves the modelling of musical styles via computer, focusing on fifty or so pieces called mazurkas by the Polish-born composer Fryderyk Chopin (1810-1849) [33]. The mazurka has its origins in a Polish folk dance, but as an art-music genre became popular in its own right throughout nineteenth-century Europe. The existence of such a large and musically homogeneous corpus makes it possible to model the music data therein using discrete-time Markov chains (which will be introduced below). Intuitively this is because for any given musical event (e.g. a chord spacing or fragment of melody) in one of Chopin’s mazurkas, there are most likely several other instances of this chord spacing or melody fragment, either in the same mazurka or some other. For smaller, less stylistically homogeneous corpora, this is unlikely to be the case. Other ‘classical’ corpora that might be modelled in this fashion are Antonio Vivaldi’s concerti, Joseph Haydn’s string quartets, Gabriel Fauré’s melodies, etc. Chopin’s mazurkas have been chosen because the data already exist in a relatively usable and reliable form.10

Furthermore, Cope has published six computer-generated mazurkas in the style of Chopin [12], offering a potential benchmark. Having introduced the general reader to Markov models for music, this section will proceed to discuss how pattern matching for music can be integrated into an automated compositional process.

Let I be a countable set called the state space, with members i ∈ I called states. For example, the set of pitch-classes

I = {F, G, A, B♭, B, C, D, E}    (7)

forms a plausible state space for the material shown in Figure 14. For each i, j ∈ I we count the number of transitions from i to j in some data and record this information, divided by the total number of transitions from state i, in a so-called transition matrix. Table 2 is the result of this counting process for the pitch-classes F, G, A, G, F, G, A, B, . . . from Figure 14.

This is known as a Markovian analysis. Putting this analysis to use in a compositional scenario requires the generation of an initial state. Formally, the random variable X0 takes values according to the probabilities contained in a vector π = (π1, π2, . . . , π|I|), the initial distribution. For instance

π = (1/2, 0, 1/2, 0, 0, 0, 0, 0)    (8)

means that the initial pitch-class X0 of a generated melody will be F with probability 1/2, and A with probability 1/2. (The probabilities contained in π do not have to be drawn empirically

10 http://kern.ccarh.org.


[Score omitted. The melody’s successive note names, as labelled in the figure, are: F, G, A, G, F, G, A, B, G, A, B, C, D, E, B, D, C, A, C, B♭, A, G, B♭, B♭, A, G, F, G, A, F, A, G.]

Figure 14: ‘Lydia’, Op. 4/2 by Gabriel Fauré, bars 3-10 of the melody. The lyrics have been replaced with note names to aid understanding.

       F     G     A     B♭    B     C     D     E
F      0     3/4   1/4   0     0     0     0     0
G      2/7   0     4/7   1/7   0     0     0     0
A      1/8   1/2   0     0     1/4   1/8   0     0
B♭     0     0     2/3   1/3   0     0     0     0
B      0     1/3   0     0     0     1/3   1/3   0
C      0     0     1/3   1/3   0     0     1/3   0
D      0     0     0     0     0     1/2   0     1/2
E      0     0     0     0     1     0     0     0

Table 2: The transition matrix for the pitch classes occurring in Figure 14.

from the data, but often they are.) Suppose X0 = A; then we look along the third row of Table 2 (as A is the third element of the state space) and randomly choose between X1 = F, X1 = G, X1 = B, X1 = C, with respective probabilities 1/8, 1/2, 1/4, 1/8. Continuing in this fashion, suppose X1 = B. Looking along the fifth row of Table 2, a random, equiprobable choice is made between X2 = G, X2 = C, X2 = D. This is known as Markovian composition.
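The two steps just described, Markovian analysis and Markovian composition, can be sketched as follows. Only the opening fragment of the Figure 14 melody is used, so the resulting probabilities differ from those in Table 2:

```python
import random
from collections import defaultdict

def markov_analysis(sequence):
    """Markovian analysis: count transitions between consecutive states and
    normalise each row of counts into probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, j in zip(sequence, sequence[1:]):
        counts[i][j] += 1
    return {i: {j: c / sum(row.values()) for j, c in row.items()}
            for i, row in counts.items()}

def markov_composition(matrix, initial_distribution, length, rng=random):
    """Markovian composition: draw X0 from the initial distribution, then
    each subsequent state from the row of the current state."""
    states, probs = zip(*initial_distribution.items())
    out = [rng.choices(states, probs)[0]]
    while len(out) < length:
        row = matrix.get(out[-1])
        if not row:  # no observed transitions out of this state
            break
        out.append(rng.choices(list(row), list(row.values()))[0])
    return out

# Opening pitch-classes of the Figure 14 melody (a fragment only).
melody = ["F", "G", "A", "G", "F", "G", "A", "B"]
matrix = markov_analysis(melody)
print(matrix["G"])  # from G: to A with probability 2/3, to F with 1/3
print(markov_composition(matrix, {"F": 0.5, "A": 0.5}, 8))
```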

A Markov model for some music data (possibly from many pieces of music) consists of:

1. a state space I such that each music datum can be associated with precisely one state i ∈ I;

2. a transition matrix, borne of Markovian analysis of the data;

3. an initial distribution for Markovian composition.

The state space in (7), the transition matrix in Table 2 and the initial distribution in (8) constitute a Markov model for pitch-class. This is a model for monophonic music (melodies): two Markov models for polyphonic music are available for perusal in section 7.2, but the underlying music theory is more advanced. The above definitions and examples may be supplemented by a mathematical perspective [32], by an accessible introduction to nth-order Markov chains in the context of monophonic music [24], or by a historical survey with more idiosyncratic examples [1].


It may already be apparent to the reader that there is no guarantee a Markovian composition will be a stylistic success in terms of the trial described in the previous section. Due to the nature of Markovian analysis, the composition will exhibit stylistic traits from Xn to Xn+1, and from Xn+1 to Xn+2, etc., but perhaps {Xn, Xn+1, Xn+2} as an entity will contain features that enable subjects to discriminate between it and human-composed works. A combination of short-term (such as the pitch-class model given) and long-term models has been proposed to try to take account of this issue [8]. There is still no guarantee, however, that the output composition will be self-referential in any way, and recall from p. 7 that this feature is held in high regard, as music ‘becomes intelligible to a great extent through self-reference’ [5, p. 249]. In section 7.3 a process is described in which discovered, self-referential patterns from an excerpt of a mazurka are used as a ‘template’ to filter the output of a Markov music generator. A summary schematic of the process will suffice here, given in Figure 15 overleaf. The filtering process requires that any generated patterns are of the same type as in the template, occurring in the same number and similar locations, relative to the duration of the generated passage as a whole. Therefore it is guaranteed that successful passages of generated music (i.e. those that pass filtering) will inherit to some extent the self-referential aspects of the original Chopin excerpt.
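The generate-and-filter loop can be caricatured as rejection sampling. In this sketch every name and the pattern statistic are hypothetical simplifications of the section 7.3 process: a passage is accepted only if it contains the same number of repeated contiguous patterns as the template (the real filter also constrains pattern type and location):

```python
import random

def count_repeated_patterns(passage, size=2):
    """Crude self-reference statistic: the number of distinct contiguous
    patterns of the given size that occur more than once in the passage."""
    grams = [tuple(passage[i:i + size]) for i in range(len(passage) - size + 1)]
    return sum(1 for g in set(grams) if grams.count(g) > 1)

def generate_until_fit(generate, template_statistic, max_iterations=1000):
    """Rejection sampling: call the generator until a passage matches the
    template's statistic, or give up after max_iterations attempts."""
    for iteration in range(1, max_iterations + 1):
        passage = generate()
        if count_repeated_patterns(passage) == template_statistic:
            return passage, iteration
    return None, max_iterations

# Hypothetical generator: random pitch-classes, a stand-in for the Markov model.
rng = random.Random(0)
generate = lambda: [rng.choice("FGAB") for _ in range(12)]

template = ["F", "G", "A", "F", "G", "A", "B", "C"]  # contains repeated patterns
passage, iterations = generate_until_fit(generate, count_repeated_patterns(template))
print(iterations, passage)
```

The iteration count returned by such a loop is the analogue of the mean number of generations reported in section 7.3.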

The initial results of this process are encouraging, although there are potential pitfalls:

• the process works for short excerpts with a low number of generations (section 7.3 quotes a mean of 12.7, standard deviation of 13.2 over fifty iterations) but may not scale linearly to longer excerpts or whole pieces;

• the successful output contains self-referential patterns that are supplementary to the intended inheritance;

• the successful output can contain many consecutive states from the same mazurka source.

It will be necessary to address these pitfalls in order to improve the current process. I am also investigating how the filtering can be made part of the generation: as can be seen from Figure 15, at present the process involves generation followed by filtering. Instead it may be possible to use a graph-theoretic approach [49] to perform filtering as the generation proceeds. It will also be necessary to consider how other musical Markov models could be made to interact with the chord-spacing model described in section 7.3. This could be done by employing a Markov random field model [16], which is a theoretical step beyond a Markov chain model. These investigations are in their initial stages.



Figure 15: A schematic for the process described in section 7.3. Two independent processes (creation of a Markov model and pattern matching) are preliminary steps to a third process in which passages of music are generated iteratively, until one such passage fits the template.


3 Research approach

This section clarifies how the research question will be addressed. Each step of the envisaged answer to the research question (from p. 2) is restated, followed by a description of what methods have been or will be used to accomplish the step. There will also be due consideration given to the underpinning philosophy and justification of the chosen method.

3.1 Review of pattern matching in music

The review of pattern matching in music is intended to be focused on DRPPM and related and subsequent work. Initially a narrative, non-technical approach will be taken to reviewing pattern matching in music, thus introducing a wealth of existing work with differing motivations and outcomes. There is no need to be over-technical at this stage, as the aim is to contextualise DRPPM and to demonstrate various strands of pattern matching in music. When it comes to reviewing DRPPM and subsequent work technically, I will use the ‘theory-example’ paradigm of mathematical exposition. It is common in mathematical textbooks, for example, to give theory (definitions, lemmas, results etc.) followed by worked examples. In [15, p. 285] a theoretical result is stated that relates to the t-distribution. Then on pp. 314-317 this result is exemplified in several ‘real-world’ situations. The philosophy underpinning this well-established approach is that it engages both abstractly- and practically-minded readers. Those who do not appreciate abstract equations can glean intuition from the examples. And vice versa, those who prefer abstraction and the power of generality can pay less attention to the examples. An alternative approach to adopt when explaining mathematical concepts applied to music is to avoid technical description and equations at all costs. For instance, [41] does not even give the mathematical definitions of entropy or relative information, despite making extensive use of the concepts. The advantage is that the non-technical reader might not be fazed, but arguably the sacrifice—the research being unreplicable—is too great.

3.2 Theoretical extension

In this section my research will build on the review of pattern matching in music. A ‘theory-example’ approach as in section 3.1 will be adopted. The details of the theoretical extension have been proposed already (section 7.1), but this appendix was written originally for my own benefit: to record definitions, lemmas etc. in a distilled and highly general form. In the thesis proper, the definitions and lemmas will be interpolated with examples that show how the theory applies to excerpts of pieces of music.

3.3 Implementation, with an adapted measure for pattern interest

By this step a focus on certain types of musical patterns will have emerged, and it will have been shown, theoretically, how these patterns ought to be found. A separate issue is addressed in this section: the methodology used in implementing the theory, and in explaining the implementation. When supplied with a piece of music (in a suitable symbolic representation), the implemented algorithm should actually find patterns therein. The Lisp programming language [44] will be used to write the algorithm. There are other options (e.g. Matlab) but Lisp is well-suited to my research aims. Since it is used by other researchers in the field, there is the potential for interested readers to check the Lisp implementation and even to exchange code. Had my research question involved the design of user interfaces for music analysts, say, then a more design-oriented environment would need to be considered (e.g. Cocoa [19]). The implementation will be explained in the thesis with written description, flow diagrams and perhaps pseudo-code. Some theses, though overtly computational, omit implementation details and rely on references to external code [7]. Arguably this affects the contribution of the research, because the reader can only be convinced by an outline of a process, rather than the process itself. My aims are: first, that the written description be sufficient for an interested reader to re-implement my algorithm in the language of their choice; second, to avoid presenting the reader with paragraphs of Lisp code; third, to make the algorithm available via my homepage.

In the review of existing research it was discussed how computational pattern matchers often miss musically interesting patterns, yet return many thousands of results, even for relatively short excerpts of music. Also discussed was the intention to adapt a measure for pattern interest [9] from monophonic to polyphonic scenarios. This measure could be used to order the output of a pattern-matching algorithm, and to reduce the algorithm’s computational complexity by limiting its search space. A ‘theory-example’ approach as in section 3.1 will be used to adapt the measure to polyphonic scenarios. To heighten the contribution to knowledge of this aspect of the research, an experimental methodology might be adopted, in which the measure is validated by a trial. For instance, twenty trained musicians are asked to give already-discovered patterns a numerical rating, according to whether the patterns are more or less musically interesting. The patterns are shown in the context of the piece of music in which they appear, and the rating scale should be suitably fine-grained to reduce the possibility of two different patterns receiving exactly the same rating. Subject responses are tested pairwise for correlations, in order to form a justified aggregate response. The measure for pattern interest is applied to the same discovered patterns, and its output tested for correlation with the aggregate subject response. A high level of correlation between the measure and the aggregate response validates the measure for pattern interest, and vice versa. This type of approach is well established [5], but an alternative approach—that is yet to be ruled out—is to consult a single music-analytic expert [41], rather than multiple (and probably less expert) subjects [45].
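The validation step amounts to a Pearson correlation between the measure’s output and the aggregate subject rating. The data below are invented for illustration:

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented data: an interest measure's scores for six discovered patterns,
# and the aggregate subject ratings for the same six patterns.
measure_scores = [0.1, 0.4, 0.35, 0.8, 0.6, 0.9]
aggregate_ratings = [2.0, 3.5, 3.0, 8.0, 6.5, 9.0]
print(round(pearson(measure_scores, aggregate_ratings), 3))  # close to 1
```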

3.4 Evaluation of the pattern matcher

A trial was outlined in the above paragraph involving twenty trained musicians. A second part to this trial is designed with the evaluation of the pattern matcher (not just the interest measure) in mind. Subjects are now asked to find and rate patterns in pieces of music (different pieces to those used in the first part of the trial). This is a more demanding task, so it is appropriate to place it in the second half of the trial. Furthermore, the first part of the trial will give subjects an implicit indication of the types of musical pattern on which to focus. Treating the results in much the same way as those from the first part (pairwise correlation, aggregation), one or more computational pattern matchers can be tested for recall and precision (definitions on p. 12), and for correlation with the aggregated response. Again, an alternative approach is to consult a single music-analytic expert. Another alternative is to underplay the role of experimental evaluation in this research and concentrate instead on reducing the computational complexity of the algorithm. For instance, in DRPPM algorithms are evaluated by running them on large datasets and reporting performance statistics. Experimental evaluation is perfunctory: recall the quotation from p. 8 that ‘SIATEC [the algorithm] typically discovers tens of thousands of TECs [patterns] even in relatively short pieces ... and usually only a very small proportion of these TECs are perceptually significant or analytically interesting’ [30, pp. 342-343]. Criticism analogous to that quoted on p. 15 could be levelled at this kind of approach: ‘does simply describing a computer program that analyses music count as a useful contribution to research?’ [36, adapted from p. 121]. Hence, it is important, not merely justifiable, to make an experimental methodology part of evaluating the pattern matcher.
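Against the aggregated subject response, recall and precision reduce to simple set arithmetic. In the sketch below, the representation of each pattern as a set of (onset, pitch) points is an assumption for illustration:

```python
def precision_recall(found, target):
    """Precision and recall (definitions on p. 12) for a pattern matcher's
    output against a ground truth of subject-identified patterns. Each
    pattern is represented here, hypothetically, as a frozenset of
    (onset, MIDI pitch) points."""
    hits = len(found & target)                       # correctly reported patterns
    precision = hits / len(found) if found else 0.0  # how much output is right
    recall = hits / len(target) if target else 0.0   # how much truth is found
    return precision, recall

target = {frozenset({(0, 60), (1, 62)}), frozenset({(4, 67), (5, 65)})}
found = {frozenset({(0, 60), (1, 62)}), frozenset({(2, 64), (3, 62)})}
print(precision_recall(found, target))   # → (0.5, 0.5)
```

In practice a looser notion of agreement than exact set equality (e.g. substantial overlap between a reported and a ground-truth pattern) would probably be needed.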

3.5 Formation of a generative model of musical style

Frederic Chopin's mazurkas have been chosen as a focus for the formation of a model of musical style. This research approach was justified on p. 18 in terms of the existence of data in a relatively usable and reliable form, and in terms of the existence of a potential benchmark [12]. It is intended that the formation of the model be supported by a review of existing research, paying particular attention to replicable aspects [13, 37]. As in section 3.1, the model of musical style will be developed using the ‘theory-example’ paradigm of mathematical exposition. Section 2.3 (Markov models in music: an example) is a good indication of the balance to be struck between mathematical theory and music examples.
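As a reminder of the flavour of section 2.3, a minimal first-order Markov model over pitch numbers can be trained and sampled as follows. The toy corpus and pitch-only state space are illustrative assumptions; the eventual model of the mazurkas will need considerably richer states (e.g. chords and rhythmic features):

```python
import random
from collections import defaultdict

def train(sequences):
    """Estimate first-order transition probabilities from pitch sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(nexts.values()) for b, n in nexts.items()}
            for a, nexts in counts.items()}

def generate(model, start, length, rng):
    """Random walk through the transition matrix (every state in the toy
    corpus has at least one successor, so no dead-end handling is needed)."""
    state, out = start, [start]
    for _ in range(length - 1):
        nexts = model[state]
        state = rng.choices(list(nexts), weights=list(nexts.values()))[0]
        out.append(state)
    return out

corpus = [[60, 62, 64, 62, 60], [60, 64, 62, 60, 62]]
model = train(corpus)
print(model[62])                                  # transition distribution from D
print(generate(model, 60, 8, random.Random(0)))   # an 8-note generated passage
```

Note that nothing in such a model enforces self-reference in the generated passage, which is precisely the gap this research addresses.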

As for alternative approaches, that taken by Cope [11] has its roots in the humanities rather than the sciences. It begins by citing historical precedents for the model of musical style to be proposed. Descriptions of the concepts behind the model are supplemented by examples of music and code. Extra coding of model components (not an implementation of the overall model) is made available on an accompanying CD. The advantage of this ‘descriptive’ research approach is that it provides a strong intuition of the model's construction. In my thesis I hope to devote a greater amount of effort and explanation to the codification of underpinning concepts, to the point that any implementation is replicable.

3.6 Integration of pattern matcher into model of musical style

The ‘theory-example’ paradigm of mathematical exposition will be employed again here, schematics such as Figure 15 being helpful in illustrating the interaction of pattern matchers, ‘templates’, models of musical style, generated passages, etc. As pointed out in section 7.3, an alternative approach is taken in music constraint programming. This shows that there are established methods for interacting with computer music generators, rather than just filtering their output. However, as acknowledged on p. 20, attempts at integrating the pattern matcher with the model of musical style are at an early stage, so it is not possible to be more precise about a research approach at present.
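The baseline that the integration aims to improve upon, filtering a generator's output rather than interacting with it, can at least be sketched as a generate-and-test loop. The n-gram repetition test below is a hypothetical stand-in for a full pattern matcher:

```python
import random

def has_repeat(seq, n):
    """True if some length-n subsequence occurs at least twice: a crude
    proxy for the self-reference a pattern matcher should certify."""
    grams = [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]
    return len(grams) != len(set(grams))

def generate_with_pattern(model_generate, accept, attempts=1000):
    """Filtering integration: repeatedly sample the generator and keep
    only output that passes the pattern-based test. Returns None if no
    candidate passes within the attempt budget."""
    for _ in range(attempts):
        candidate = model_generate()
        if accept(candidate):
            return candidate
    return None

rng = random.Random(1)
gen = lambda: [rng.choice([60, 62, 64, 65, 67]) for _ in range(12)]
passage = generate_with_pattern(gen, lambda s: has_repeat(s, 3))
print(passage)
```

The weakness of this baseline is apparent: rejection sampling becomes expensive as the constraints grow, which is one motivation for the constraint-programming alternative mentioned above.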

3.7 Evaluation of models of musical style

Considerable attention has been devoted already to criticising the prevailing methodology for evaluation of computational models of musical style (section 2.3, Critique of a frequentist statistical approach). To reiterate, the prevailing methodology:

• presents difficulties to non-statisticians;

• is atypical in terms of the research implications of its hypotheses;

• focuses on product rather than process.

Despite these criticisms, I intend to adopt an approach similar to the prevailing, experimental methodology. Its shortcomings are outweighed by the standardisation it affords, and hence by adhering to this approach there is an improved chance of making a contribution to knowledge. It appears that the application of the methodology to polyphonic music will be a first.
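The core of the prevailing methodology is a null-hypothesis test on listener judgements. For instance, a discrimination trial can be analysed with an exact binomial test; the counts below are hypothetical:

```python
from math import comb

def binomial_p_value(correct, trials, p=0.5):
    """One-sided exact binomial test: probability of at least `correct`
    successes in `trials` judgements if listeners were merely guessing
    (p = 0.5), i.e. if generated and human-composed passages were
    indistinguishable."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(correct, trials + 1))

# Hypothetical trial: 20 listeners classify a passage as human- or
# computer-composed, and 15 classify it correctly.
print(binomial_p_value(15, 20))   # small p-value: listeners can tell them apart
```

A small p-value here would count against the model of musical style, since listeners reliably distinguish its output from the human-composed benchmark.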


4 Work schedule

This section gives a work schedule for finishing the PhD thesis in good time. At various points throughout this report it has been indicated that certain tasks have been completed already, some partially. The last section addressed intentions for carrying out future tasks. So the emphasis here is not on what has been done, nor on the details of what is to be done, but on the scheduling of that work. The work schedule makes reference to the steps of the envisaged answer to the research question (from p. 2):

1. review of pattern matching in music, especially the SIA algorithms [30], related and subsequent work;

2. theoretic extension, built on findings in 1 (this is well underway, see section 7.1);

3. implementation of 2, incorporating an adapted measure for pattern interest [9];

4. evaluation of the pattern matcher emerging from 2-3;

5. formation of a generative model of musical style for a specific corpus of music, drawing on replicable aspects of existing research [13, 37];

6. integration of the pattern matcher from 3 into the model of musical style from 5;

7. evaluation of models from 5-6.

[The original shows a two-strand Gantt chart running from June 2009 to September 2011; its layout has not survived extraction. The recoverable content is summarised below.]

Timeline: contingency time is allowed in October 2009, January 2010, August 2010, October 2010 and January 2011; ISMIR is attended in October 2010 and ICMC in August 2011; step 6 write-up is scheduled for April 2011; June 2009 includes 4 presentations to give, and write-up for ECIR.

Task boxes:

Step 1: write-up review of pattern matching in music.

Step 2: write-up. Design & recruit for evaluation.

Step 3: write-up & implement adaption of interest measure. Design validation trial.

Step 4: run evaluation of steps 2-3. Analysis & write-up for ISMIR, JNMR.

Step 5: complete & write-up review of replicable aspects. Build Markov model with rhythmic features. Model selection.

Step 6: explore & draft integration methods. Generate music. Finalise method of integration.

Step 7: design & recruit. Evaluation of model emerging from steps 5-6. Analysis & write-up for ICMC, CMJ.

Other boxes: steps 1-4, amendments to write-up; steps 5, 7, amendments to write-up; contingency for write-up; contingency for anticipated presentations, workshops.

Key: CMJ - Computer Music Journal; ECIR - European Conference on Information Retrieval; ICMC - International Computer Music Conference; ISMIR - International Symposium for Music Information Retrieval; JMM - Journal of Mathematics and Music; JNMR - Journal of New Music Research.


The main task dependencies are indicated by arrows, and I have tried to allow sufficient contingency time. This time will enable me to receive supervisor feedback and revise thesis chapters as the project progresses. Steps 1 and 5 in particular are ongoing processes; boxes for these steps denote that extra attention is being devoted to their completion. There are two strands to the work schedule, for clarity and also because I often work on two tasks at once.


5 Glossary of music-theoretic terms

Definitions make use of concise [25] and extended [43] dictionaries.

classical. Used with a small ‘c’ as a generic term to describe music, typically of Western origin, dating from the beginning of the Baroque period (c. 1600) to the end of the Romantic period (c. 1900). An even broader definition is ‘the opposite of light or popular music’ [25, p. 148]. With a large ‘C’, Classical refers to ‘music composed roughly between 1750 and 1830’ [25, p. 148].

clef. ‘Symbol normally placed at the beginning of every line of music to indicate the exact location of a particular note on the staff’ [25, p. 149].

form. ‘The structure and design of a composition’ [25, p. 261], e.g. binary form AB, ternary form ABA, rondo form ABACADA, etc.

harmony. ‘The simultaneous sounding (i.e. combination) of notes, giving what is known as vertical music, contrasted with horizontal music (counterpoint)... Even when the main process in the composer's mind is weaving together of melodic strands he has to keep before him this combinational element, both as regards the notes thus sounded together and the suitability of one combination to follow and precede the adjacent combination’ [25, p. 321].

key-signature. ‘The sign, or number of signs, written at the beginning of each staff, to indicate the key of the composition’ [25, p. 388].

Lydian. Of a scale consisting of the intervals ‘tone, tone, tone, semitone, tone, tone, semitone’. Beginning on pitch-class F, these intervals would give the scale F, G, A, B, C, D, E, F. It is one of the modes, the ‘scales which dominated European music for 1,100 years (approximately AD 400 to AD 1500)’ [25, p. 483].

monophony, monophonic. ‘One sound. Music which has a single melodic line of notes without harmonies or melodies in counterpoint, as opposed to polyphony and homophony’ [25, p. 486].

note. Three meanings: (1) ‘a single sound of given musical pitch and duration’ [25, p. 520]; (2) a written representation of (1); (3) ‘a finger-key of the pianoforte, organ, accordion etc. to produce a sound of particular pitch’ [25, p. 520].

notehead. Typically a note as in definition (2) consists of a filled or empty ellipse (informally a dot) from which there emerges a stem. The ellipse is sometimes referred to as a notehead.

phrase mark. A curved line very similar in appearance to a tie (but having a different meaning) that encloses a ‘short section of a composition into which the music, whether vocal or instrumental, seems naturally to fall’ [25, p. 559].

pitch class. ‘The type of a pitch. Pitches belong to the same class if they have some relation - for example, the octave relation - of compositional or analytical interest... Thus letter-names, possibly modified by accidentals, such as C, C♯, D♭, D and E♭♭, all denote different classes of pitch. 12-note equal temperament, commonly used to model highly chromatic music, induces another equivalence relation, the enharmonic’ [43, vol. 19, p. 804].
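In computational settings, the equivalence relation induced by 12-note equal temperament is often realised by reducing MIDI note numbers modulo 12; a minimal illustration:

```python
# Under 12-note equal temperament, pitch classes reduce to residues mod 12:
# MIDI notes an octave apart, and enharmonic spellings such as C-sharp and
# D-flat (both MIDI note 61), fall into the same class.
def pitch_class(midi_note):
    return midi_note % 12

print(pitch_class(60), pitch_class(72), pitch_class(61), pitch_class(49))  # 0 0 1 1
```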


polyphony, polyphonic. ‘Many sounds. Music in which several simultaneous vocal or instrumental parts are combined contrapuntally, as opposed to monophonic music’ [25, p. 570]. There is a historical connotation referring to music composed roughly from the 13th to the 16th centuries, but in music computing especially, ‘polyphonic’ is used to describe any music that is not monophonic.

score. ‘A form of manuscript or printed music in which the staves, linked by bar-lines, are written above one another, in order to represent the musical coordination visually’ [43, vol. 22, p. 894].

sequence. ‘In music construction, the more or less exact repetition of a passage at a higher or lower level of pitch. If the repetition is of only the melody it is called a melodic sequence; if it is of a series of chords it is called a harmonic sequence. If the intervals between the notes of the melody are to some extent altered (a major interval becoming a minor one and so forth, as is practically inevitable if the key is unchanged) it is called a tonal sequence; if there is no variation in the intervals (usually achieved by altering not merely the pitch of the notes but also the key) it is called a real sequence’ [25, p. 664].

staff (stave, plural staves). ‘The system of parallel lines on and between which the notes are written, from which music is played, the pitch being determined by the clef written at the beginning of the staff. Normally [consisting] of 5 lines’ [25, p. 695].

staff notation. If notation refers to ‘the methods of writing down music, so that it can be performed’ [25, p. 519], then staff notation is that approach to notation which makes use of staves (as opposed to historical variants or the graphical scores of twentieth-century avant-garde composers).

stem. A vertical stroke emerging from a notehead.

theme. ‘Succession of notes which play important part in construction of a composition’ [25, p. 734].

tie (or bind). A ‘curved line very similar in appearance to a phrase mark (but having a different meaning) placed over a note and its repetition to indicate that the 2 shall be performed as one unbroken note of their combined time-value’ [25, p. 737].

time signature. ‘Sign placed after the clef and key-signature at the beginning of a piece of music, or during the course of it, to indicate the time or metre of the music. Normally it comprises two numbers, one above the other, the lower defining the unit of measurement in relation to the whole-note, the upper indicating the number of those units in each measure (bar)’ [25, p. 738].


6 Bibliography

1. Ames, C., ‘The Markov process as a compositional model: a survey and tutorial’, in Leonardo 22(2) (1989), 175-187.

2. Babbitt, Milton, ‘The use of computers in musicological research’, in Perspectives of New Music 3(2) (1965), 74-83.

3. Boden, Margaret A., The creative mind: myths and mechanisms, 2nd ed. (London: Routledge, 2004).

4. Born, Georgina, Rationalizing culture: IRCAM, Boulez, and the institutionalization of the avant-garde (Berkeley, California: University of California Press, 1995).

5. Cambouropoulos, Emilios, ‘Musical parallelism and melodic segmentation: a computational approach’, in Music Perception 23(3) (2006), 249-267.

6. Clarke, Eric, ‘Empirical methods in the study of performance’, in Empirical musicology: aims, methods, prospects, eds. Eric Clarke and Nicholas Cook (Oxford: Oxford University Press, 2004), 77-102.

7. Collins, Nick, Towards autonomous agents for live computer music: realtime machine listening and interactive music systems. PhD thesis, Faculty of Music, University of Cambridge (2006), downloaded from http://www.informatics.sussex.ac.uk/users/nc81/research.php on 3 November 2008.

8. Conklin, Darrell, and Ian H. Witten, ‘Multiple viewpoint systems for music prediction’, in Journal of New Music Research 24(1) (1995), 51-73.

9. Conklin, Darrell, and Mathieu Bergeron, ‘Feature set patterns in music’, in Computer Music Journal 32(1) (2008), 60-70.

10. Cook, Nicholas, ‘Perception: a perspective from music theory’, in Musical Perceptions, eds. Rita Aiello and John A. Sloboda (Oxford: Oxford University Press, 1994), 64-95.

11. Cope, David, Experiments in musical intelligence, in The Computer Music and Digital Audio Series (Madison, Wisconsin: A-R Editions, 1996).

12. Cope, David, Mazurkas, after Chopin (Pasadena, California: Spectrum Press, 1994).

13. Cope, David, Computational models of musical creativity (Cambridge, Massachusetts: MIT Press, 2005).

14. Crochemore, Max, ‘An optimal algorithm for computing the repetitions in a word’, in Information Processing Letters 12(5) (1981), 244-250.

15. Daly, F., D.J. Hand, M.C. Jones, A.D. Lunn and K.J. McConway, Elements of statistics (Wokingham, England: Addison-Wesley, 1995).


16. Davison, A.C., Statistical models (Cambridge: Cambridge University Press, 2003).

17. Doraisamy, Shyamala, and Stefan Rüger, ‘Robust polyphonic music retrieval with N-grams’, in Journal of Intelligent Information Systems 21(1) (2003), 53-70.

18. Downie, J. Stephen, ‘Music information retrieval’, in Annual Review of Information Science and Technology 37 (2003), 295-340.

19. Hillegass, Aaron, Cocoa programming for Mac OS X, 4th ed. (Upper Saddle River, New Jersey: Addison-Wesley, 2008).

20. Hiller, L., and L. Isaacson, Experimental music (New York: McGraw-Hill, 1959).

21. Leman, Marc, ‘Symbolic and subsymbolic description of music’, in Music processing, ed. Goffredo Haus (Oxford: Oxford University Press, 1993), 119-164.

22. Lewin, David, Generalized interval systems and transformations (Oxford: Oxford University Press, 2007). Originally published by Yale University Press, New Haven, 1987.

23. Lincoln, Harry B. (ed.), The computer and music (Ithaca, New York: Cornell University Press, 1970).

24. Loy, Gareth, Musimathics: the mathematical foundations of music, vol. 1 (Cambridge, Massachusetts: MIT Press, 2005).

25. Kennedy, Michael, The concise Oxford dictionary of music, 4th ed. (Oxford: Oxford University Press, 1996).

26. Kornstadt, A., ‘Themefinder: a web-based melodic search tool’, in Computing in Musicology 11 (1998), 231-236.

27. Manaris, Bill, Patrick Roos, Penousal Machado, Dwight Krehbiel, Luca Pellicoro and Juan Romero, ‘A corpus-based hybrid approach to music analysis and composition’, in Proceedings of the Twenty-Second Conference on Artificial Intelligence, Vancouver, Canada (2007), 8 pages, downloaded from http://www.cs.cofc.edu/%manaris/?n=Main.SelectedPublications on 18 December 2008.

28. Manning, Christopher D., and Hinrich Schütze, Foundations of statistical natural language processing (Cambridge, Massachusetts: MIT Press, 1999).

29. Meredith, D., ‘The computational representation of octave equivalence in the western staff notation system’, in Cambridge Music Processing Colloquium (September 1999), downloaded from http://www.titanmusic.com/papers/public/meredith.pdf on 24 April 2009.

30. Meredith, D., K. Lemström, and G.A. Wiggins, ‘Algorithms for discovering repeated patterns in multidimensional representations of polyphonic music’, in Journal of New Music Research 31(4) (2002), 321-345.


31. Mozer, Michael C., ‘Neural network music composition by prediction: exploring the benefits of psychoacoustic constraints and multi-scale processing’, in Connection Science 6(2-3) (1994), 247-280.

32. Norris, J.R., Markov chains (Cambridge: Cambridge University Press, 1997).

33. Paderewski, Ignacy J. (ed.), Fryderyk Chopin: complete works, vol. 10, mazurkas for piano (Warsaw: Instytut Fryderyka Chopina, 1953).

34. Pardo, Bryan, and William P. Birmingham, ‘Algorithms for chordal analysis’, in Computer Music Journal 26(2) (2002), 27-49.

35. Pearce, Marcus, and Geraint Wiggins, ‘Towards a framework for the evaluation of machine compositions’, in Proceedings of the Artificial Intelligence and the Simulation of Behaviour Symposium on Artificial Intelligence and Creativity in Arts and Sciences (2001), 12 pages, downloaded from http://citeseerx.ist.psu.edu on 24 April 2009.

36. Pearce, Marcus, David Meredith, and Geraint Wiggins, ‘Motivations and methodologies for automation of the compositional process’, in Musicae Scientiae 6(2) (2002), 119-147.

37. Pearce, M.T., The construction and evaluation of statistical models of melodic structure in music perception and composition. PhD thesis, Department of Computing, City University (2005), downloaded from http://www.vislab.ucl.ac.uk/ mpearce on 3 May 2009.

38. Pearce, M.T., and G.A. Wiggins, ‘Evaluating cognitive models of musical composition’, in Proceedings of the Fourth International Joint Workshop on Computational Creativity, eds. A. Cardoso and G.A. Wiggins (Goldsmiths, University of London, 2007), 73-80.

39. Pople, Anthony, ‘Modeling musical structure’, in Empirical musicology: aims, methods, prospects, eds. Eric Clarke and Nicholas Cook (Oxford: Oxford University Press, 2004), 127-156.

40. Popper, K., The logic of scientific discovery (London: Hutchinson and Co., 1959).

41. Potter, K., G.A. Wiggins, and M.T. Pearce, ‘Towards greater objectivity in music theory: information-dynamic analysis of minimalist music’, in Musicae Scientiae 11(2) (2007), 295-324.

42. Romming, Christian Andre, and Eleanor Selfridge-Field, ‘Algorithms for polyphonic music retrieval: the Hausdorff metric and geometric hashing’, in Proceedings of the International Symposium on Music Information Retrieval, Vienna (2007), 6 pages.

43. Sadie, S., and J. Tyrrell (eds.), The new Grove dictionary of music and musicians (London: Macmillan, 2001), 29 volumes.


44. Seibel, Peter, Practical Common Lisp (Berkeley, California: Apress, 2005).

45. Sloboda, J.A., and S.A. O'Neill, ‘Emotions in everyday listening to music’, in Music and emotion: theory and research, eds. P.N. Juslin and J.A. Sloboda (Oxford: Oxford University Press, 2001), 415-430.

46. Turing, Alan, ‘Computing machinery and intelligence’, in Mind 59(236) (1950), 433-460.

47. Wiggins, Geraint, George Papadopoulos, Somnuk Phon-Amnuaisuk, and Andrew Tuson, ‘Evolutionary methods for musical composition’, in Proceedings of the Second International Conference on Computing Anticipatory Systems, Liege, Belgium (1998), 14 pages.

48. Wiggins, Geraint A., ‘Computer models of musical creativity: a review of Computational models of musical creativity by David Cope’, in Literary and Linguistic Computing 23(1) (2008), 109-115.

49. Wilson, Robin J., Introduction to graph theory, 4th ed. (Harlow: Longman, 1996).


7. Appendices have been removed.