CHOI, Andrew. 2011. Jazz Harmonic Analysis as Optimal Tonality Segmentation

Embed Size (px)

DESCRIPTION

Computer Music Journal, 35:2, pp. 49–66, 2011

Citation preview

  • Jazz Harmonic Analysisas Optimal TonalitySegmentation

    Andrew Choi131 Shawnee Place SouthwestCalgary, Alberta, Canada T2Y [email protected]

    When asked to improvise over the chord changesof a tune, jazz musicians, either by an intuitiveor formal process, perform an analysis to obtain aharmonic road map that guides them through thepossibilities of what notes and scales to play. Thepurpose of this harmonic analysis is to discoverand understand underlying structures in the chordchanges. A notation often used in jazz theory texts(e.g., Nettles and Graf 1997; Jaffe 2009) to representthis harmonic structure identifies a segmentationof the chord changes, the key center (or tonality) ofeach segment, and the harmonic function of eachchord with respect to its key center and other chords.Performing such an analysis is a fundamental stepif jazz improvisation is to be simulated by software.It is also an interesting and important problem tobe considered on its own for the implementationof jazz compositional and teaching tools. Thisarticle presents a formulation of and an algorithmfor the harmonic analysis of jazz chord sequences.Some familiarity with jazz harmony or traditionalharmony is assumed.

    Here is an example of the intended kind ofanalysis. Consider the chord changes for MilesDaviss Solar in Figure 1.

    These chord changes will be the input given tothe harmonic analysis algorithm. The output of thealgorithm is an annotated chord chart, as shown inFigure 2.

    Key centers are shown below the bars: The keycenter of bars 1 and 2 is C minor, that of bars 36is F major, that of bars 79 is E major, and so on.Arrows and brackets represent dominant resolutionsand related minor seventh chords (i.e., the relatedIIm7s), respectively (Nettles and Graf 1997). (Thesewill be explained further in the section StructuralAnalysis.) Roman numeral chord symbols abovethe chords indicate their harmonic functions withrespect to their key centers. For example, the Dm75and G79 chords in bar 12 function as IIm75 and V7chords, respectively, resolving to the root of the key

    Computer Music Journal, 35:2, pp. 4966, 2011c 2011 Massachusetts Institute of Technology.

    center C minor. Once this analysis is performed, theinformation it provides can be used by a musicianto determine what notes to play over the chordchanges. This is the subject of numerous jazz theorytexts and method books (e.g., Levine 1995; Pass1996).

    This representation of the result of harmonicanalysis has sound theoretical basis and is wellunderstood by jazz musicians. It facilitates directcomparison to analyses performed manually, andallows analysis algorithms to be evaluated objec-tively and compared with one another, since corporaof analyses of jazz standards are available in theliterature (Mehegan 1959, 1962, 1964, 1965; Coker1987).

    The main innovation of the harmonic analysisalgorithm presented in this article results fromthe observation that chord functions and harmonicstructures (such as dominant resolutions and re-lated IIm7s) are completely determined by a givensegmentation and choice of key centers. Harmonicanalysis can therefore be formulated and solvedalgorithmically as a tonality segmentation problem.This formulation and a solution for it in the form ofa dynamic programming algorithm are the subjectof this article.

    Related Work

    Previous studies related to the subject of thisarticle can be categorized into the following areas:grammar-based and rule-based harmonic analysisof jazz chord sequences, harmonic analysis of tonalmusic, jazz improvisation systems, and jazz theory.

    Grammar-Based and Rule-Based HarmonicAnalysis of Jazz Chord Sequences

    Formal grammar has been used in the study of jazzchord sequences for some time. Steedman (1984)proposes a context-sensitive grammar as a generative

    Choi 49

  • Figure 1. Chord changesfor Solar.

    Figure 1.

    Figure 2. A harmonicanalysis of Solar.

    Figure 2.

    model of variations of twelve-bar blues changes bychord substitution. The use of formal grammar forharmonic analysis, however, is problematic becauseno grammar that encompasses all possible chordchanges will likely be found. It is also unclearhow modulations can be described by grammarrules because there seem to be no constraints onmodulation: a theme can modulate to any new key(Johnson-Laird 2002, p. 429).

    Pachet (1991) proposes a production rule systemfor harmonic analysis of jazz chord sequences. Arule-based system has the same weakness as agrammar-based system in that it cannot derive asuitable analysis when its set of rules does notinclude those required to analyze a given tune. Forinstance, the system is used to determine whetherthe chord changes of Solar are in the form of a blues.Lacking a rule for the general form of these changes,it produces an analysis which is not what a humanwould do (Pachet 2000). Mouton and Pachet (1995)suggest that a symbolic, rule-based method can

    benefit from a softer decision model by integratingnumerical techniques, but offer no concrete proposalas to how this can be applied to harmonic analysis.The algorithm presented in this article incorporatessymbolic knowledge in jazz theory and treatstonality segmentation as an optimization problem,and not one of parsing the chord sequences. Thesample analysis of Solar generated by this algorithm(see Figure 2) demonstrates that a chord sequencecan indeed be analyzed fully with such a technique.Therefore, with respect to harmonic analysis in thesense taught by jazz theory texts, it is unnecessaryto determine whether a tune such as Solar is ablues, although it is crucial to find a good tonalitysegmentation for it.

    Scholz, Dantas, and Ramalho (2005) extendPachets method by additional processing on gaps:segments of the chord sequence for which nopattern rules apply. After patterns are identified,gaps are merged with neighboring segments whencertain conditions are met. Both their and Pachets

    50 Computer Music Journal

  • methods determine tonality segmentation for achord sequence by the breaks among adjacentpatterns detected by their sets of rules. If tonalitysegmentation is viewed as an optimization problem,such methods represent greedy algorithms (Corman,Leiserson, and Rivest 2009) in the sense that theymake locally optimal choices in hopes that thesewill lead to globally optimal segmentations. Thetonality segmentation algorithm in this articleexplicitly models the conditions under whichmodulations may occur and solves the optimizationproblem directly, using dynamic programming. Inthe formulation given herein, the quality of thetonality segmentation completely determines thatof a harmonic analysis.

    Harmonic Analysis of Tonal Music

    Much work has also been done in harmonic analysisof tonal music, where the objective is to determinethe harmonic structure in a sequence of musicalnotes. Algorithms proposed for this type of analysismake decisions based on numerical as well assymbolic information. Numerical algorithms forharmonic analysis typically introduce cost anddistance functions and formulate the analysisproblems as optimization problems. Thus, thealgorithm of this article has more in common withthem than rule- and grammar-based algorithmsfor chord sequences. Temperley and Sleator (1999)propose an analysis algorithm based on well-formedness rules and preference rules, adaptedfrom A Generative Theory of Tonal Music (Lerdahland Jackendoff 1983). The well-formedness rulesdefine a solution space while the preference rulesdefine a scoring system for the solutions. Theirharmonic analysis problem is then formulated as anoptimization problemone of finding the shortestpath in a directed graph, which is solved usingdynamic programming.

    Pardo and Birmingham (2002) extend Temperleyand Sleators method by also taking chord qualitiesof segments into consideration. They present resultsof experiments that evaluate the performance oftheir algorithm and consider heuristic versions, aswell as an optimal version, of the search algorithm.

    Illescas, Rizo, and Inesta (2007) show how chordfunctions and cadences can be incorporated intopreference rules. Stronger cadences are given higherscores, causing analyses that contain them to bepreferred. Their technique shows how relationshipsamong consecutive chords and their functionscan be reflected in the preference rules. Theirharmonic analysis problem is then also formulatedas a shortest path problem and solved by dynamicprogramming.

    Jazz Improvisation Systems

    Numerous research papers and theses have beenwritten about computer-music systems that gen-erate jazz improvisations in various forms (e.g.,Ramalho, Rolland, Ganascia 1999; Klein 2005).However, in these studies the problem of harmonicanalysis only receives limited attention, and tech-niques are primarily borrowed from already-existingefforts. Interactivemusic systems for jazz improvisa-tion apply machine learning techniques to generateimprovisation from training data input by users(Pachet 2003; Thom 2003). Without a harmonicanalysis component, these systems can only playtunes on which they have been trained, and can onlyapply or adapt learned lines. Keller and colleagues(2006) design and implement a GUI learning toolthat allows its user to compose an improvisationby choosing lines to play against each chord (orgroup of chords) in a chart from a library. It alsodoes not perform harmonic analysis and leaves thedecision of which scales to use to its users. All ofthese systems will benefit from a more completesolution for the problem of harmonic analysis ofchord sequences, such as the one presented in thisarticle.

    Jazz Theory

    The notation for representing harmonic analysisused in the introduction has been taught at least asfar back as the 1980s at the Berklee College of Music(Nettles and Ulanowsky 1987) and elsewhere (Jaffe1983). Not all jazz musicians and teachers agree on

    Choi 51

  • Figure 3. Dominantresolution and deceptiveresolution.

    the importance of formal harmonic analysis, andthere is, of course, an immense body of knowledgeon different approaches to jazz improvisation. Mostwill agree, however, that the study of harmonicanalysis is necessary for understanding how andwhy jazz harmony works. It is also essential forcomputer simulation of jazz improvisation withexplainable and reproducible results.

    Jazz theory texts generally describe elementsof harmonic analysis by example and define theminformally. Specifically, there is no concrete modelfor modulations, nor is there a systematic procedurefor constructing an analysis from a given set ofchord changes. A contribution of this article is theformalization of these concepts and descriptionsinto a mathematical model that can be operated onby computer.

    Structural Analysis

    The harmonic analysis algorithm operates in twosteps: a structural analysis step which convertsa chord sequence into a list of analysis elements(defined subsequently) and a tonality segmentationstep which partitions this list into segments ofdifferent key centers.

    In this section the structural analysis algorithmand the data representation for the resulting har-monic structure are described. This descriptionbegins with a review of the following elementsof jazz theory: dominant resolutions, harmonicrhythm, substitute dominants, related IIm7s, ex-tended dominants, turnarounds, and interpolateddominants (Nettles and Graf 1997; Jaffe 2009).

    Review of Jazz Harmony

    The primary dominant is the V7 chord of a givenkey. In a major key it is expected to resolve to

    the IMaj7 chord. (Four-note diatonic chords areprevalent in jazz-related contexts, so in a major keythe I chord is usually IMaj7, and in a minor keythe Im chord is usually Im7 or ImMaj7. The I6 andIm6 chords can also be used as the I and Im chords,respectively.) For example, in F major, the primarydominant is C7, which resolves to FMaj7. In a majorkey, the secondary dominants are the VI7, VII7,I7, II7, and III7 chords, which are also denoted byV7/II, V7/III, V7/IV, V7/V, and V7/VI, respectively.They are expected to resolve to the diatonic chordsIIm7, IIIm7, IVMaj7, V7, and VIm7, respectively. In Fmajor, the secondary dominants are D7 (resolves toGm7), E7 (resolves to Am7), F7 (resolves to BMaj7),G7 (resolves to C7), and A7 (resolves to Dm7).

    A primary or secondary dominant resolving tothe expected diatonic chord a perfect fifth belowcreates a dominant resolution, which is annotatedin a harmonic analysis by a solid arrow (bars 2 and 3in Figure 3).

    A dominant chord that does not resolve to theexpected diatonic chord often represents a deceptiveresolution. A deceptive resolution is not representedby an arrow in a harmonic analysis, but by aparenthesized roman numeral chord (bars 1 and 4in Figure 3). Note that deceptive resolutions aredifficult to represent and analyze using grammarrules.

    For simplicity, tunes to be analyzed are assumedto be composed of sections whose lengths in barsare multiples of four, with four beats to a bar. Themetrical structure (Lerdahl and Jackendoff 1983)of every four bars of a chord sequence is given bythe grid in Figure 4. The majority of jazz standardssatisfy this assumption. For tunes that do not, othermeans of deducing the metrical structure must beused, which will not be covered here.

    Stronger beats are ones with higher numbers.Har-monic rhythm is the pattern of accents created bythe chord changes. Certain harmonic elements are

    52 Computer Music Journal

  • Figure 4. Four-bar metricalstructure.

    Figure 4.

    Figure 5. Substitutedominant resolutions andrelated IIm7s.

    Figure 5.

    identified by interaction between harmonic rhythmand metrical structure. A dominant resolution isheard only when a dominant chord on a weaker beatresolves to its target chord on a stronger beat. InFigure 3, the dominant chords C7 and A7 occur onweaker beats compared to the FMaj7 and BMaj7chords into which they resolve.

    The tritone substitution for a dominant chordis the dominant chord whose root is a IV intervalbelow (or equivalently, a V interval above) thefirst chords root. These substitute dominants canbe used in place of the corresponding primaryand secondary dominant chords. In a major key, thesubstitute dominants are the II7, III7, V7, and VI7chords, which are also denoted by subV7, subV7/II,subV7/IV, and subV7/V, respectively. The chords IV7(subV7/III) and VII7 (subV7/VI) will only functionas substitute dominants in rare situations (Nettlesand Graf 1997). Substitute dominant resolutions areannotated in a harmonic analysis by dotted arrows(G7 resolving to FMaj7 in Figure 5).

    The related IIm7 of a dominant chord is theminor seventh chord (or a minor seventh flattedfifth chord for minor tonalities) whose root is aperfect fourth below the dominant chords root.Any dominant chord may be preceded by its relatedIIm7, or its tritone substitutions related IIm7. Ina harmonic analysis, this is annotated by a solidbracket (or dotted bracket, respectively) under thetwo chords (see Figure 5). Harmonic rhythm mustalso be taken into consideration in the detection ofrelated IIm7s: the related IIm7 and the dominantchord must be on a stronger beat and a weaker beat,respectively. (This applies regardless of whether atritone substitution is used for the dominant chordand whether the IIm7 chord is the related IIm7 ofthe dominant chord or its tritone substitution.)

    An extended dominant (also called sequentialdominant and substitute sequential dominant)is a series of dominant chords each resolving(deceptively) to the next one. It is represented by aseries of arrows in a harmonic analysis (bars 14 in

    Choi 53

  • Figure 6. Examples ofextended dominants.

    Figure 6). Only the first and last dominant chords arelabeled with roman numeral chords. An extendeddominant may also be composed of a combinationof dominant and substitute dominant resolutions(bars 58 in Figure 6). It may also contain relatedIIm7s (bars 912 in Figure 6).

    A turnaround is an idiomatic progression of(typically) four chords occurring at the end of asection, replacing an extended duration of a tonicchord. Many turnarounds cannot be analyzed bythe usual rules. Because their chord progressionsare so distinctive, however, the structural analysisalgorithm can recognize them by looking up aninternal library of turnarounds. An example of aturnaround in F major is the two bars | Am7 Am7| Gm7 G7 |. The library of turnarounds used in thecurrent implementation of the structural analysisalgorithm contains 13 turnarounds extracted fromthe collections of tunes in Appendix D of Coker(1987).

    An interpolated dominant is a substitute domi-nant chord that is inserted into a larger structure. Italways resolves to the chord that follows it, whoseroot is a II interval below its own root. That targetchord must be part of a larger structure. It may bethe target of a dominant resolution (the FMaj7 chordin bar 3 in Figure 7). The interpolated dominant isdepicted with a straight dotted arrow connectingit to its target. It may be the dominant chord of

    a related IIm7-dominant chord pair (the C7 chordin bar 6 in Figure 7). Or, it may be the second orsubsequent dominant chord in, or the final targetof, an extended dominant (the chords G7, C7, andFMaj7 in bars 11, 12, and 1 in Figure 7).

    Structural Analysis Algorithm

    The structural analysis algorithm and the datastructure it generates are now described. Considerthe variation of the blues progression in Figure 8.

    Figure 9 shows its harmonic analysis. Note thatthe entire chord sequence is composed of a singlesegment with a F major key center.

    The structural information that needs to beextracted from the chord sequence to generate theharmonic analysis above is represented by boxes inFigure 10.

    Each chord, turnaround, interpolated dominant,related IIm7, or extended dominant is represented byan analysis element (AE). (AEs are representationsof basic elements of jazz theory used in harmonicanalysis; contrast them with analysis objects inPachet [2000] which represent such elements aswell as high-level concepts such as shapes andsong forms.) The output of the structural analysisalgorithm is simply a list of AEs, that is, an AElist. Each chord is represented by a chord AE. A

    54 Computer Music Journal

  • Figure 7. Examples ofinterpolated dominants.

    Figure 7.

    Figure 8. A bluesprogression.

    Figure 8.

    turnaround AE contains an AE list, the elements ofwhich are chord AEs. An interpolated dominant AEcontains AEs for the substitute dominant chord andits target chord. A related IIm7 AE contains AEs forthe IIm7 chord and the V7; the latter may be thedominant chord to which the IIm7 chord is relatedor that dominant chords tritone substitution, oran interpolated dominant whose target chord is asuitable dominant chord. An extended dominantAE contains an AE list; each of its elements maybe a chord AE that represents a dominant chord, aninterpolated dominant AE, or a related IIm7 AE.

    Note that dominant resolutions, substitutedominant resolutions, and deceptive resolutionsare not represented explicitly in the data structure.AEs that represent dominant chords, related IIm7s,extended dominants, and turnarounds ending indominant chords resolve (normally or deceptively)to the AEs that follow them. For example, theAEs corresponding to bars 2 (Em75 A7), 3 (Dm7G7), and 8 (Am7 D7) in Figure 9 all function as

    dominants that resolve to the AEs that follow them,respectively. The AE corresponding to bar 4 (Cm7C7) is a substitute dominant that resolves to theBMaj7 chord in bar 5. The related IIm7 (whichcontains an interpolated dominant) in bars 9 and 10(Gm7 D7 C7) resolves deceptively to Am7, the firstchord of the turnaround in bars 11 and 12.

    Here is the structural analysis algorithm.

    Input: an AE list of input chords.Output: an AE list representing the result of theanalysis.1. Scan the input AE list for turnarounds;for each one found, replace the chord AEs in itwith a turnaround AE.2. Scan the AE list for related IIm7s (*); foreach one found, replace the two AEs formingthe related IIm7 with a related IIm7 AE.3. Scan the AE list for extended dominants(*); for each one found, replace the AEs in itwith an extended dominant AE.

    Choi 55

  • Figure 9. A harmonicanalysis of the bluesprogression.

    Figure 9.

    Figure 10. Structuralinformation in the bluesprogression.

    Figure 10.

    4. Scan the AE list for interpolated domi-nants at its top level; for each one found, replacethe two AEs in it with an interpolated dominantAE.

    (*) When looking for a dominant chord instep 2 and 3, take into account that it may bepreceded by an interpolated dominant chordand construct an interpolated.

    The four passes of the algorithm detectturnarounds, related IIm7s, extended dominants,and top-level interpolated dominants, respectively.Note that the algorithm cannot scan for interpolateddominants all in one step because they can only be

    identified when they are part of a larger structure.Clearly the running time of the structure analysisalgorithm is a linear function of the number of inputchords.

    Note also that the structural analysis algorithmdoes not detect dominant chords in blues pro-gressions and special situations where they donot create a dominant or deceptive resolution.These will be handled by the tonality segmentationalgorithm.

    The input to the structural analysis algorithm isa list of chord AEs constructed from the input chordsequence. The duration of each chord, measured inbeats, is also stored in the chord AE. The following

    56 Computer Music Journal

  • list represents the chord sequence of the sampleblues progression.

    From this representation the starting beat of eachchord can be deduced, which enables us to checkharmonic rhythm requirements when locatingrelated IIm7s. Here is the output generated for thisinput by the structural analysis algorithm.

    Chord AEs are listed by just the names of thechords. In actual implementation each chordsstarting beat and duration are stored with thechord to make it easy to check harmonic rhythmrequirements for dominant resolutions and generatethe harmonic analysis output.

    An important observation is that the structuralanalysis algorithm does not require prior knowledgeof the key center to work. Interpolated dominants,related IIm7s, and extended dominants can beidentified without knowing the underlying tonality.Also in practice, chord progressions in turnaroundsare so distinctive that they can be identified bydeducing the key centers from the matched chords.For example the turnaround library contains thepattern IIIm7 IIIm7 IIm7 II7, which will match

    Am7 Am7 Gm7 G7 in the chord sequence.The presence of the turnaround Am7 Am7 Gm7G7 also implies that the current key centeris either F major or F minor. This informationwill be used to define the cost of a turnaroundin the subsequent Cost Function for Segmentssection.

    Because of this, the structural analysis algorithmhere can also be applied as it is to chord sequencescontaining tonality changes. The output generatedby the structural analysis algorithm for the tuneSolar, for example, whose analysis is given in Figure2, is as follows.

    The key centers are needed to generate the romannumeral chords in the analysis output. However,this operation is only performed when the tonalitysegmentation algorithm is complete, at which timethe key centers are known.

    Tonality Segmentation

    Conceptually, the tonality segmentation algorithmis quite simple. It divides the AE list generated bythe structural analysis algorithm into segments andassigns a key center to each segment. Its goal is todivide the AE list at positions where modulationsoccur in the represented chord sequence, and toassign key centers to the segments that bestexplain each chords harmonic function withrespect to the key center of its segment (to bequantified by a cost function below). These twoaspects of the algorithm will be discussed inValidity Conditions for Segments and CostFunction for Segments, respectively. First, the useof dynamic programming for tonality segmentationis described.

    Choi 57

  • A Dynamic Programming Algorithm forTonality Segmentation

    Let a0,a1, . . . ,an1 be the AE list output of thestructural analysis algorithm. Each ai, 0 i < n, is atop-level AE on the AE list. For example, for the AElist for Solar, n= 9, a0 is the chord AE for Cm6, a1 isthe related IIm7 AE for Gm7 and C7, a2 is the chordAE for FMaj7, and so on.

    A segmentation of the AE list into msegments isrepresented by indexes p0, p1, . . . , pm such that 0 =p0 < p1 < < pm = n. The i-th segment containsthe AEs api ,api+1, . . . ,api+11, for 0 i < m. Theobjective of the algorithm is to find a segmentationwith minimal cost, where the cost of a segmentationis defined as follows. Let dki, j be the cost (describedlater) of assigning the key center k to the segmentai,ai+1, . . . ,aj, where 0 i j < n, k K, and K isthe set of all possible major and minor keys. Let di, jbe the minimal cost of dki, j among all keys, i.e.,

    di, j = minkK dki, j

    Then the cost of the segmentation p0, p1, . . . , pmis given by

    M(m 1)+m1

    i=0 dpi ,pi+11

    where M is a constant that represents the costof a modulation. The choice of its value will bediscussed subsequently.

    A minimal-cost segmentation can be found usingdynamic programming. Let ci be the minimal costfor segmenting a0,a1, . . . ,ai, for 0 i < n. Then,

    ci ={d0,0 if i = 0min

    (d0,i,min

    ij=1(cj1 + dj,i + M

    ))if i > 0

    That is, apart from the initial condition, theminimalcost of analyzing the first i AEs is given by eitherthe cost of analyzing all of them in one key center,or the minimum of the sum of the minimal costof analyzing the first j AEs, that of analyzingthe remaining i j AEs in one key center, andthe cost of a modulation, over all possible valuesof jwhichever is smaller. Given the values ofdki, j, for 0 i j < n and k K, the values of

    di, j for 0 i j < n can be determined in n2 |K|operations. Because |K| is constant (24, if 12 majorkeys and 12 minor keys are considered), this steptakes O(n2) time. The values of ci for 0 i < n canalso be computed in O(n2) time in the order ofincreasing index i. A standard technique in dynamicprogramming is used to record the index j thatresults in the minimal cost in each step so that theminimal segmentation can be recovered after cn1has been computed.

    This formulation of tonality segmentation makesthe assumptions that changes in tonality do notoccur within the AEs and that the tonality of eachsegment is independent of those of past and futuresegments. In practice these assumptions do nothinder the discovery of the correct segmentation.The structural analysis algorithm can thus be viewedas a normalization step performed on the inputchord sequences so that the tonality segmentationalgorithm can process them more easily.

    Cost Function for Segments

    To complete the description of the tonality seg-mentation algorithm, dki, j, the cost for choosing kas the key center for the segment ai,ai+1, . . . ,ajneeds to be defined. For example, one would expectdF major1,2 to have a small value for the AE list in thegiven example, because the segment a1,a2 (Gm7 C7,FMaj7) corresponds to the most common cadence inF major. Let

    dki, j ={tki, j if s

    ki, j is true

    otherwise

    for all 0 i j < n and k K. The quantities ski, jand tki, j play the roles that well-formedness rules andpreference rules do in Temperley and Sleator (1999),respectively. That is, ski, j defines a space of all validsolutions, and tki, j assigns costs to these solutionsto reflect their respective quality; the optimizationproblem is then one of finding a valid solution withminimal cost. The cost measure tki, j is defined tobe the sum of costs associated with the harmonic

    58 Computer Music Journal

  • Table 1. Chord Categories and Costs

    Category Major Key Chords Minor Key Chords Cost

    Root IMaj7 Im7 6Diatonic excluding root IIm7, IIIm7, IVMaj7, IIm7, IIIMaj7, IVm7,and primary dominant VIm7, VIIm75 Vm7, VIMaj7, VII7 5Primary dominant V7 V7 4Substitute primary dominant II7 II7 3Secondary dominant VI7, VII7, I7, II7, III7 VI7, VII7, I7, II7, III7, IV7 2Substitute secondary dominant III7, V7, VI7, IV7, VII7 II7, II7, V7, VI7, VII7 1Blues IV7 0Modal interchange Im7, IIm75, IIIMaj7, IVm7, Vm7, VIMaj7, VII7 0Diminished Any diminished chord Any diminished chord 0Unknown Chords not listed above Chords not listed above

    functions of each of the AEs ai,ai+1, . . . ,aj withrespect to the key center k according to Table 1.

    For example, a1 in the given example, a relatedIIm7 with chords Gm7 and C7, functions as aprimary dominant in F major (as explained subse-quently), and contributes a cost of 4 to segmentsthat contain it. The cost assigned to each categoryindicates how much the presence of an AE in thatcategory supports the choice of the key center. Thatis, the presence of a root is better evidence in supportof the key center than a diatonic chord (as definedin Table 1, which excludes the root and the primarydominant), which is in turn better evidence thana primary dominant, and so on. Note that minorkeys do not have blues or modal interchange chords,indicated by empty entries in Table 1.

    These cost values are chosen by trial and errorthrough experiments conducted on the collection oftunes in Appendix D of Coker (1987). They followour intuition on the degrees to which chords in thesedifferent categories reflect the presence of a givenkey center. The tonality segmentation algorithmappears to be robust with respect to variations in thecost values as long as relative rankings among thecategories are preserved.

    To determine the category of a chord AE, itsroman numeral chord with respect to k is computedand looked up in the appropriate column of Table 1.The category of an interpolated dominant AE is thatof its target chord. The category of a related IIm7 AEis that of its V7 AE. The category of an extendeddominant AE is that of its last AE. The cost of a

    turnaround depends on the key centers it implies,which are determined when it is detected by thestructural analysis algorithm. Its cost is 0 if k is oneof those key centers and otherwise. For example,the cost of the turnaround Am7 Am7 Gm7 G7is 0 if k is F major or F minor, and otherwise.

    The values of tki, j, 0 i j < n, and k K can becomputed in O(n2) time:

    tki, j ={category cost of ai from table 1 if i = jtki, j1 + tkj, j otherwise

    The cost of a modulation, M, is assigned avalue larger than that of any segment. That is, letM > tk0,n1 for all k K. Because of this, the tonalitysegmentation algorithm will always minimize thenumber of modulations (under the constraint ofthe validity conditions to be discussed) as well asfind an assignment of key centers to the segmentsthat correspond to the best analysis of harmonicfunctions of all the AEs.

    Validity Conditions for Segments

    To complete the definition of dki, j, define ski, j to be

    the validity of analyzing the segment ai,ai+1, . . . ,ajin key center k. It is defined as a logical conjunctionof four conditions:

    ski, j = rki, j f ki, j uki, j eki, j

    Choi 59

  • Figure 11. Bridge of tune31a.

    A valid segment is assumed to always contain aroot AE of key center k. Let rki, j be true if and onlyif the segment ai,ai+1, . . . ,aj contains a root AE inkey k.

    Also, a valid segment is assumed not to end ina dominant chord. Let f ki, j be true if and only if ajis not an AE of a category in DOM in key k, whereDOM = {Primary dominant, Secondary dominant,Substitute primary dominant, Substitute secondarydominant}.

    Each AE in the segment must have a knownharmonic function in key k. Let uki, j be true if andonly if all of ai,ai+1, . . . ,aj have known categories inTable 1 with respect to key k.

    It is easy to verify that each of rki, j, fki, j, and u

    ki, j

    can be computed in O(n2) time for all 0 i j < nand k K.

    Validity of a Segment Due to Subsegments

    The final condition eki, j causes the validity ofanalyzing the segment ai,ai+1, . . . ,aj in key k to beaffected by whether subsegments embedded in itcan be analyzed as modulations to keys related to k.In other words, eki, j is true if and only if the segmentai,ai+1, . . . ,aj contains no modulation to a relatedkey that prevents it from being analyzed completelyin key k. A key is related to key k if the formerstonic chord has one of the harmonic functions ink listed in Table 1. A related key is specified by aroman numeral interval optionally followed by theletter m (for minor keys). For example, the relatedkeys VI and IIIm of F major are D major and Aminor, respectively. Using this notation, the related

    keys of a major key are IIm, IIIm, IV, VIm, Im, III,IVm, Vm, and VI and those of a minor key are IIm,III, IVm, Vm, and VI.

    Modulations to keys unrelated to k are alreadydetected by the use of uki, j. For example, in theanalysis of Solar in Figure 2, analysis of the segmentcontaining the AEs a1 (Gm7 C7), a2 (FMaj7), and a3(Fm7 B7) in F major cannot extend into a4 (EMaj7)because EMaj7 has no valid harmonic function in Fmajor. Thus, the value of uF major1,4 is false. Note alsothat because a3 functions as a substitute secondarydominant in F major and as a primary dominantin E major, the cost assignments in Table 1 willassociate it with the latter key center after themodulation from F major to E major is detected.

    As an example of a tune with a modulationto a related key, consider tune 31a in AppendixD of Coker (1987). This tune begins with an Asection with a key center of B major, played twice.It then modulates to E major in the bridge, asshown in Figure 11. This is followed by a repeat ofthe A section (in B major). The bridge contains aturnaround E6 Cm7 Fm7 B13 that implies anE major tonality. Because this turnaround does nothave a valid harmonic function in B major, themodulation to E major in the bridge is detected inthe same way a modulation to an unrelated key isdetected, as described above, using uki, j.

    Now consider tune 31b in Appendix D of Coker(1987). It begins with an A section in F major,played twice. It then modulates to B major inthe bridge, shown in Figure 12. This is followedby a variation of the A section (also in F major).The bridge is composed of only two types of AEs,

    60 Computer Music Journal

  • Figure 12. Bridge of tune31b.

    chord AE B6 and related IIm7 AE Cm7 F7. Becausethese AEs have valid harmonic functions in Fmajor, using only the mechanisms described sofar, the bridge will also be analyzed in F major,leaving the modulation to B major undetected.The use of eki, j enables the tonality segmentationalgorithm to detect modulations within segmentseven when these modulations do not contain AEsthat distinguish them from the tonal centers of theenclosing segments.

    The design of eki, j presents a challenge becauseif the condition is too selective, the algorithm willmiss subsegments that are in fact modulationsto related keys; if it is too indiscriminate, thealgorithm will detect superfluous modulations.Therefore, eki, j is defined in such a way that allowsexperimentation and fine tuning. The chord chartswith tonality segmentation in Appendix D of Coker(1987) are then used in experiments to determine itsfinal definition, which is presented herein. A futureextension to the tonality segmentation algorithmcan include a statistical model for modulations andselect parameters for defining eki, j automaticallyusing a set of training data.

    Let gki, j(rk, sc, st, sp), for 0 i j < n and k K,be the validity of analyzing segment ai,ai+1, . . . ,ajin the key k with respect to whether it containssubsegments that can be identified as modula-tions according to a criterion specified by the tuple(rk, sc, st, sp). That is, gki, j(rk, sc, st, sp) is true if andonly if no subsegment of a certain type (specifiedby (rk, sc, st, sp)) occurs within it, which wouldinvalidate the analysis of the entire segment in k.The first parameter rk is a related key of k. The

    precise definitions of sc, st, and sp will be givenherein. Intuitively, the parameters sc and st providea test for determining whether a subsegment hasthe characteristics of a modulation (e.g., being longenough, or containing AEs of the right categories).The parameter sp specifies the allowable positionsof a subsegment in ai,ai+1, . . . ,aj. The completelist of values for (rk, sc, st, sp) that the tonality seg-mentation algorithm considers is given in Table 2.Then eki, j is simply the logical conjunction of thevalues of gki, j(rk, sc, st, sp) over all these sets ofvalues.

    Certain related keys such as IV and VIm aremore related to the original major key thanothers. Subsegments in these related keys need tobe more prominent before they are identified asmodulations. Under certain conditions (see typeC, subsequently), these subsegments must beeight bars or longer to be identified as modulations.Subsegments in less related keys, such as IIIand VI, are identified as modulations more easily.For example, in a segment in the key of F major,a short two-bar subsegment of, say, an E7 chordfollowed by an AMaj7 chord, is already identifiedas a modulation.

    According to how easily subsegments analyzed inthem are identified as modulations, related keys aredivided into three types as follows.

    Type A III and VI of a major or minor keyType B Im, IIm, IIIm, IVm, and Vm of a major

    key; IIm and Vm of a minor keyType C IV, and VIm of a major key; IVm of a

    minor key

    Choi 61

  • Table 2. List of All Related Keys, Subsegment Categories, Subsegment Tests, and SubsegmentPositions Considered by the Tonality Segmentation Algorithm

    For major key kRelated Key (rk) Subsegment Categories(sc) Subsegment Test (st) Subsegment Position(sp) Type

    III scsimple stsimple ANY A VI scsimple stsimple ANY AIm scsimple stsimple E P BIIm scsimple stsimple E P BIIIm scsimple stsimple E P BIVm scsimple stsimple E P BVm scsimple stsimple E P BIV sclong stlong ANY CIV scshort stshort E P CVIm sclong stlong ANY CVIm scshort stshort E P C

    For minor key kRelated Key (rk) Subsegment Categories(sc) Subsegment Test (st) Subsegment Position (sp) Type III scsimple stsimple ANY A VI scsimple stsimple ANY AIIm scsimple stsimple E P BVm scsimple stsimple E P BIVm sclong stlong ANY CIVm scshort stshort E P C

    These types have increasingly stringent require-ments for subsegments analyzed in their relatedkeys to be identified as modulations. Each typehas its own criterion (or criteria) for identifyingmodulations.

    A subsegment in a related key of type A isdetected by setting the parameters sc, st, and sp to

    sc = scsimple = {Root, Primary dominant},st = stsimple = dur (h, l) 8, andsp = ANY

    respectively, where dur (h, l) denotes the total du-ration (in number of quarter notes) of the AEs inthe subsegment ah,ah+1, . . . ,al. In other words, apassage in a related key of type A must be at leasttwo bars long to be considered a modulation. Thesubsegment ah,ah+1, . . . ,al is identified as a modu-lation if, when analyzed in the related key rk, eachAE in ah,ah+1, . . . ,al is of a category in sc, the testst evaluates to true, and (1) at least one AE of the

    subsegment is in the Root category and (2) the sub-segment does not end in anAE of a category in DOM.Conditions (1) and (2) are always added regardlessof the values of rk, sc, st, and sp. For example, inthe key of F major, the subsegment |E7|AMaj7| isidentified as a modulation in the related key III. Soare |AMaj7|AMaj7| and |F7|B7|E7|AMaj7|, butnot |Bm7|E7| (violates condition [1]), |AMaj7|E7|(violates condition [2]), or |B7|AMaj7| (category ofB7 not in sc).

    The value ANY for the parameter sp means thatthe subsegment ah,ah+1, . . . ,al may appear anywherewithin the segment in question, and thereforerepresents a null test.

    Related keys of type B are more related to theoriginal key k than those of type A. Subsegments inthem are identified as modulations when they areimmediately preceded or followed by a modulationto yet another key center. They are not identifiedas modulations when they appear in the middle of asegment in key k, however. As an example of modu-lations to related keys of type B, consider the analysis

    62 Computer Music Journal

  • Figure 13. Bridge ofYardbird Suite.

    of the bridge of tune 73 (Yardbird Suite) in AppendixD of Coker (1987), shown in Figure 13. Its first threebars form a segment with a key center of E minor.Its last three bars (and the A section that follows,only the first bar of which is shown) form a segmentwith a key center of C major. Although bars 4 and 5can be analyzed in Cmajor, they satisfy the criterionspecified below (including being preceded by a seg-ment with another key center, E minor). They aretherefore detected as a separate segment with a keycenter of D minor, a related key of C major of type B.Note also that if the Em6 chords in bars 1 and 3 (noharmonic function in C major) are changed to Em7chords (diatonic chords in Cmajor), the entire bridgewill be analyzed in C major because bars 4 and 5occur in themiddle of a segment with a key center ofC major.

    Modulations to related keys of type B are detectedby setting the parameters sc, st, and sp to

    sc = scsimple,st = stsimple, andsp = EP

    The value of EP for parameter sp specifies thatthe subsegment ah,ah+1, . . . ,al must appear at anendpoint of the segment ai,ai+1, . . . ,aj. That is, anadditional test is performed which evaluates to trueif and only if h= i l = j.

    Chords in the related keys of type C are most re-lated to k. Two sets of criteria are needed to correctlyhandle the different ways in which modulations inthese related keys may appear. Longer subsegmentsin these related keys are identified as modulationsregardless of where they are in the original segment.

    Such subsegments will contain AEs belonging tomore categories. The settings for sc, st, and sp toidentify these modulations are

    sc = sclong = {Root, Diatonic} DOMst = stlong = dur (h, l) 32 4 rdur (h, l)

    odur (h, l), andsp = ANY

    where rdur (h, l) and odur (h, l) denote the totaldurations (in number of quarter notes) of root AEsand other AEs in the subsegment ah,ah+1, . . . ,al,respectively. The modulation from F major in theA section to B major in the bridge in tune 31bin Appendix D of Coker (1987; see Figure 12) isdetected by these parameter settings.

    Shorter subsegments in related keys of type Care also detected as modulations when they areimmediately preceded or followed by a modulationto another key center. The corresponding settingsfor sc, st, and sp are

    sc = scshort = {Root, Primary dominant},st = stshort = dur (h, l) 8 2 rdur (h, l)

    odur (h, l) odur (h, l) > 0, andsp = EP

    As an example, tune 33 (Jeepers Creepers) inAppendix D of Coker (1987) begins with two repeatsof anA section in the key of Emajor, followed by thebridge shown in Figure 14, then followed by anothertime through the A section. Bar 5 and the first half ofbar 6 of the bridge are analyzed in B major becauseBMaj7 has no valid harmonic function in E major.The first four bars can be analyzed in E major as

    Choi 63

  • Figure 14. Bridge of JeepersCreepers.

    Figure 14.

    Figure 15. Chord changesfor Solar represented bykey centers and romannumeral chords.

    Figure 15.

    a continuation of the segment containing the Asection. However, they are not, because they aredetected by the settings for sc, st, and sp above andanalyzed as a separate segment with a key center ofA major.

    It can be shown that the computation of eki, j for all0 i j < n and k K requires O(n2) time. Given k,for each combination of rk, sc, st, and sp, categorizethe AEs a0,a1, . . . ,an1 with respect to the relatedkey rk. Identify all maximal spans in it for which:each AE is of a category in sc, st is true, at leastone AE in the Root category is present, and thecategory of the last AE is not in DOM. Intuitively,a segment ai,ai+1, . . . ,aj can only be valid forsp = ANY if i and j are in the same maximal spanor gap between maximal spans, since any propersubsegment it contains that overlaps a maximalspan will invalidate it. (A proper subsegment of asegment is one that is [strictly] shorter than thesegment.) Additionally, the segment ai,ai+1, . . . ,ajcan only be valid for sp = EP if i and j are in thesame maximal span or gap between maximal spans,or each of them is in a gap between maximal spans.This ensures that no proper subsegment at its twoendpoints overlaps a maximal span, which willinvalidate it. Therefore let z0, z1, . . . , zn1 be integerlabels assigned to the AEs such that two AEs have

    the same odd (or even) label if and only if they belongto the same span (respectively, gaps between spans).Then it can be shown that

    gki, j(rk, sc, st, sp)

    ={zi=zj if sp=ANYzi=zj (zi mod 2=0 zj mod 2=0) if sp=EP

    This completes the description of the tonality seg-mentation algorithm.Note that the time complexityof the harmonic analysis algorithm is O(n2).

    Evaluation of the Tonality Segmentation Algorithm

    The performance of the tonality segmentation al-gorithm is evaluated using the entire collection oftunes in Appendix D of Coker (1987). These tunescontain a wide variety of types of modulations andcompositional and harmonic devices and demon-strate the general applicability of the harmonicanalysis algorithm. The chord changes of the tunesare given by key centers and roman numeral chords.For example, the chord changes for Solar (see Figure1) are represented as in Figure 15. The key centersbelow the roman numeral chords represent how a

    64 Computer Music Journal

  • Figure 16. Example ofdifferent segmentation dueto handling of dominants.

    musician might perform tonality segmentation forthe tune. They provide a benchmark against whichthe output of the tonality segmentation algorithmcan be compared.

    This representation is converted into an ordinarychord chart (such as the one in Figure 1), and thenused as input for the structural analysis algorithmand tonality segmentation algorithm. The resultingsegmentation and original segmentation are thencompared.

    Among the 78 tunes in that collection, 57 receiveexactly the same segmentation by both Coker andthe tonality segmentation algorithm. An additional13 (tunes 7, 12, 27, 28, 29, 33, 53, 63, 69, 70, 73,78, and 82) have identical sequences of key centers,but different assignment of dominant chords toconsecutive segments. An example of this type ofdiscrepancy is shown in Figure 16, which resultsfrom the requirement of the tonality segmentationalgorithm to end segments with AEs that do notfunction as dominant chords. Dominant chords atthe boundary of two segments often play dual rolesin the two key centers and it is simpler and moreconsistent to associate them with the chords intowhich they resolve.

    Among the eight remaining tunes, in five (20, 23,36, 52, and 62) the tonality segmentation algorithmdetects more segments than those given by Coker(1987). For example, the tonality segmentationalgorithm determines that tune 23 (Charlie ParkersBlues for Alice) contains one bar in F major, fourbars in B major, two bars in A major, and five barsin F major. Coker considers the entire tune to be inF major. Both analyses are in some sense correct.

    The tonality segmentation algorithm omits somesegments in the other three tunes (3, 25, and 54).These segments are too short or fail to satisfy the

    criteria (they do not contain a root AE of the keycenter, for example) to be detected by the algorithm.

    An OCaml (Leroy 2008) implementation ofthe structural analysis and tonality segmentationalgorithms and test data can be downloaded fromtheWeb pagewww.sixthhappiness.ca/jazz-harmonic-analysis.

    Summary

    A new algorithm for harmonic analysis of jazz chordsequences has been described. It views harmonicanalysis as a problem of segmenting the input chordsequences and determining the key centers of thesegments. This representation is natural and com-monly used by jazz musicians. More importantly,it allows modulations in the chord sequences tobe modeled explicitly. The harmonic analysis prob-lem can then be formulated mathematically andsolved by dynamic programming as an optimizationproblem. Jazz theory knowledge is incorporatedinto the algorithm to specify and solve the tonalitysegmentation problem. Once the segments andkey centers are determined, structural informationcan be recovered from straightforward detection ofwell-understood elements of jazz theory such asdominant resolutions, harmonic rhythm, substitutedominants, related IIm7s, extended dominants,turnarounds, and interpolated dominants. The algo-rithm can be used in software that simulates jazzimprovisation and for implementation of composi-tional and teaching tools. An example of the latter isa GUI tool called T2G, which was used to generatethe analyses in this article (see Figures 2, 3, 57, 9,and 1114). T2G is also available for download atthe URL given at the end of the previous section.

    Choi 65

  • Future work on the tonality segmentation problemcan focus on algorithm evaluation and comparisonusing more comprehensive test data sets, and im-proved models of modulations (including the use ofstatistical models).

    References

    Coker, J. 1987. Improvising Jazz. New York: Simon &Schuster.

    Corman,T. H., C. E. Leiserson, and R. L. Rivest. 2009.Introduction to Algorithms. 3rd ed. Cambridge, Mas-sachusetts: MIT Press.

    Illescas, P. R., D. Rizo, and J. M. Inesta. 2007. Harmonic,Melodic, and Functional Automatic Analysis. In Pro-ceedings of International Computer Music Conference,pp. 165168.

    Jaffe, A. 1983. Jazz Theory. Dubuque, Iowa: Wm. C. BrownCompany Publishers.

    Jaffe, A. 2009. Jazz Harmony. 3rd ed. Rottenburg: AdvanceMusic.

    Johnson-Laird, P. N. 2002. How Jazz Musicians Impro-vise. Music Perception 19(3):415442.

    Keller, R. M., et al. 2006. A Computational Frame-work Enhancing Jazz Creativity. In Proceedings ofEuropean Conference on Artificial Intelligence (pagesunnumbered).

    Klein, J. 2005. A Pattern-based Framework for Computer-aided Jazz Improvisation. Diploma Thesis, Me-dia Computing Group, Computer Science Depart-ment, Rheinisch-Westfalische Technische HochschuleAachen University.

    Lerdahl, F., and R. Jackendoff. 1983. A Generative Theoryof Tonal Music. Cambridge, Massachusetts: MIT Press.

    Leroy, X. 2008. The Objective Caml System Release3.11 Documentation and Users manual. Paris: In-stitut National de Recherche en Informatique et enAutomatique.

    Levine, M. 1995. The Jazz Theory Book. Petaluma,California: Sher Music Company.

    Mehegan, J. 1959. Jazz Improvisation 1. New York:AMSCO Music Publishing.

    Mehegan, J. 1962. Jazz Improvisation 2. New York:AMSCO Music Publishing.

    Mehegan, J. 1964. Jazz Improvisation 3. New York:AMSCO Music Publishing.

    Mehegan, J. 1965. Jazz Improvisation 4. New York:AMSCO Music Publishing.

    Mouton, R., and F. Pachet. 1995. The Symbolic vs. Nu-meric Controversy in Automatic Analysis of Music. InProceedings of the Workshop on Artificial Intelligenceand Music, International Joint Conference on ArtificialIntelligence, pp. 3239.

    Nettles, B., and R. Graf. 1997. The Chord Scale Theoryand Jazz Harmony. Rottenburg: Advance Music.

    Nettles, B., and A. Ulanowsky. 1987. Harmony 14.Boston, Massachusetts: Berklee College of Music.

    Pachet, F. 1991. A Meta-Level Architecture Applied tothe Analysis of Jazz Chord Sequences. In Proceedingsof International Computer Music Conference, pp.266269.

    Pachet, F. 2000. Computer Analysis of Jazz ChordSequences: Is Solar a Blues? In E.Miranda, ed.Readingsin Music and Artificial Intelligence. Amsterdam:Harwood Academic Publishers, pp. 85113.

    Pachet, F. 2003. The Continuator: Musical InteractionWith Style. Journal of New Music Research 32(3):333341.

    Pardo, B., and W. P. Birmingham. 2002. Algorithms forChordal Analysis. Computer Music Journal 26(2):2749.

    Pass, J. 1996. Joe Pass on Guitar. Miami, Florida: WarnerBrothers Publications.

    Ramalho, G. L., P. Y. Rolland, and J. G. Ganascia. 1999.An Artificially Intelligent Jazz Performer. Journal ofNew Music Research 28(2):105129.

    Scholz, R., V. Dantas, and G. Ramalho. 2005. AutomatingFunctional Harmonic Analysis: The Funchal System.In Proceedings of the Seventh IEEE InternationalSymposium on Multimedia, pp. 759764.

    Steedman, M. 1984. A Generative Grammar for JazzChord Sequences. Music Perception 2:5277.

    Temperley, D., and D. Sleator. 1999. Modeling Meter andHarmony: A Preference-Rule Approach. ComputerMusic Journal 23(1):1027.

    Thom, B. 2003. Interactive Improvisational MusicCompanionship: A User-Modeling Approach. UserModeling and User-Adapted Interaction 13(12):133177.

    66 Computer Music Journal