
Modeling Tonal Tension
Fred Lerdahl, Carol L. Krumhansl
Music Perception: An Interdisciplinary Journal, Vol. 24, No. 4 (April 2007), pp. 329-366
Published by: University of California Press
Stable URL: http://www.jstor.org/stable/10.1525/mp.2007.24.4.329


FRED LERDAHL
Columbia University

CAROL L. KRUMHANSL
Cornell University

THIS STUDY PRESENTS AND TESTS a theory of tonal tension (Lerdahl, 2001). The model has four components: prolongational structure, a pitch-space model, a surface-tension model, and an attraction model. These components combine to predict the rise and fall in tension in the course of listening to a tonal passage or piece. We first apply the theory to predict tension patterns in Classical diatonic music and then extend the theory to chromatic tonal music. In the experimental tasks, listeners record their experience of tension for the excerpts. Comparisons between predictions and data point to alternative analyses within the constraints of the theory. We conclude with a discussion of the underlying perceptual and cognitive principles engaged by the theory’s components.

Received March 16, 2006, accepted October 4, 2006.

Key words: tonal tension, attraction, pitch space, prolongational structure, multiple regression

THE EBB AND FLOW OF tension is basic to the musical experience and has long been of interest in music theory and criticism (Berry, 1976; Hindemith, 1937; Kurth, 1920; Rothfarb, 2002; Schenker, 1935; Zuckerkandl, 1956). It appears to have a direct link to musical affect (Krumhansl, 1996, 1997), and it shapes not only the listening experience but also aspects of musical performance (Palmer, 1996).

Building on the prolongational component in Lerdahl and Jackendoff (1983; hereafter GTTM), Lerdahl (2001; hereafter TPS) developed a formal model of tonal tension and the related concept of tonal attraction. The model generates quantitative predictions of tension and attraction for the sequence of events in any passage of tonal music. Earlier empirical studies have shown promising connections between the model’s predictions and participants’ responses (Bigand, Parncutt, & Lerdahl, 1996; Cuddy & Smith, 2000; Krumhansl, 1996; Lerdahl & Krumhansl, 2004; Palmer, 1996; Smith & Cuddy, 2003). Our purpose here is to provide a comparatively comprehensive empirical treatment and analysis of the model’s predictions over a range of musical styles.

By “tonal tension” we mean not an inclusive definition of musical tension, which can be induced by many factors, such as rhythm, tempo, dynamics, gesture, and textural density, but the specific sense created by melodic and harmonic motion: a tonic is relaxed and motion to a distant pitch or chord is tense; the reversal of these motions causes relative relaxation. Because tonal tension is a uniquely musical phenomenon (unlike such factors as fluctuations in loudness, speed, or contour), it is perhaps the most crucial respect in which music tenses and relaxes. This study sets aside other kinds of musical tension and focuses on tonal tension.

The sense of tonal tension and relaxation can also be expressed as “stability and instability” or even “consonance and dissonance.” These pairs of terms have somewhat different shades of meaning. “Dissonance” refers first of all to a sensory property that is studied in the psychoacoustic literature. In a traditional music-theoretic context, it refers to intervallic combinations that require particular syntactic treatment, such as the passing tone and the suspension. Intervals that are musically dissonant usually correspond to intervals that are psychoacoustically dissonant. “Instability” has cognitive or conceptual meaning beyond psychoacoustic effects. Theorists such as Riemann (1893), Schenker (1935), and Schoenberg (1911) extend musical dissonance from a surface characteristic to abstract levels. One may speak of a composed-out passing tone that is harmonized at the surface, or of a subsidiary tonal region that is conceptually dissonant in relation to the tonic (Rosen, 1972). Schoenberg (1975) asserts that the goal of a tonal composition, after its initial destabilization, is to reestablish stability.

The term “tension,” as employed here, refers both to sensory dissonance and to cognitive dissonance or instability; similarly, “relaxation” refers to sensory consonance and to cognitive consonance or stability. The expression “tension and relaxation” also has the advantage of invoking physical motion and exertion beyond a specifically musical function. Everyone experiences physical tension and relaxation, and it is common to extend the terms to mental and emotional terrains as well. Consequently, it is relatively straightforward to ask experimental participants to respond to degrees of tension and relaxation and thereby elicit consistent interpersonal responses (see Krumhansl, 1996).

Music Perception VOLUME 24, ISSUE 4, PP. 329–366, ISSN 0730-7829, ELECTRONIC ISSN 1533-8312 © 2007 BY THE REGENTS OF THE UNIVERSITY OF CALIFORNIA. ALL RIGHTS RESERVED. PLEASE DIRECT ALL REQUESTS FOR PERMISSION TO PHOTOCOPY OR REPRODUCE ARTICLE CONTENT THROUGH THE UNIVERSITY OF CALIFORNIA PRESS’S RIGHTS AND PERMISSIONS WEBSITE, HTTP://WWW.UCPRESSJOURNALS.COM/REPRINTINFO.ASP. DOI:10.1525/MP.2007.24.4.329

The TPS model also develops an attraction component. The term “attraction” refers to the intuition that melodic or voice-leading pitches tend toward other pitches in greater or lesser degrees. Bharucha (1984) refers to melodic anchoring; Larson (2004; Larson & VanHandel, 2005) speaks of musical forces; Margulis (2005), Meyer (1956), and Narmour (1990) couch attraction in terms of melodic expectation or implication. Attraction can also be seen as a kind of tension: the more attracted a pitch is to another pitch, such as the leading tone to the tonic, the more the listener experiences the tension of anticipation. This kind of tension contrasts with the tension of instability. The leading tone is less stable than the tonic, but its expectancy-tension (to use Margulis’s expression) is much greater than that of the tonic. That is, the leading tone strongly “wants” to resolve to the tonic; but the tonic pitch, being the point of maximal stability, expresses comparatively little urge to move to the leading tone or to any other pitch.

To summarize, our concern is with three kinds of tonal tension: the sensory dissonance of certain intervallic combinations, harmonic and regional stability/instability in relation to a governing tonic, and melodic attraction as a projection of expectancy-tension.

Overview of the Tension Model

The four components listed in Figure 1 are required for a quantitative theory of tonal tension. First, there must be a representation of the hierarchical event structure in a musical passage. Adapting a traditional music-theoretic term, we call this component prolongational structure. Second, there must be a model of tonal pitch space and all distances within it. Tonal pitch space is the cognitive schema whereby listeners have tacit long-term knowledge, beyond the patterns within any particular piece, of the distances of pitches, chords, and tonal regions from one another. Third, there must be a treatment of surface or sensory dissonance. This measure is largely psychoacoustic: the interval of a seventh is more dissonant than a sixth, and so on. Fourth, there must be a model of melodic or voice-leading attractions. Listeners experience the relative pull of pitches toward other pitches in a tonal context.

Let us review these four components, starting with prolongational structure. (This exposition summarizes material in TPS.) GTTM addresses prolongational organization not as an aesthetic ideal, as in Schenkerian analysis, but as a psychological phenomenon describable by nested patterns of tension and relaxation. Tension depends on hierarchical position: a tonic chord in root position is relaxed; another chord or region is relatively tense in relation to the tonic; a nonharmonic tone is tense in relation to its harmonic context. This component assigns prolongational structure by a cognitively motivated rule system that proceeds from grouping and meter through time-span segmentation and reduction. These steps are necessary because prolongational connections depend not only on degrees of pitch similarity and stability but also on the rhythmic position of events.

To represent an event hierarchy, the prolongational component employs a tree notation. Here it will suffice to refer to branchings stripped of the node types employed in GTTM and TPS. Right branches stand for a tensing motion (or departure), left branches for a relaxing motion (or return). The degree of tension or relaxation between two events depends on the degree of continuity between them. If two events that connect are the same or similar, there is little change in tension. If they are different, there is more change in tension. Figure 2 shows an abstract tension pattern: at a local level, Event 1 tenses into Event 2, Event 3 relaxes into Event 4, and Event 5 relaxes into Event 6; at larger levels, Event 1 tenses into Event 4, Event 6 relaxes into Event 7, and Event 1 relaxes into Event 7. Notice that this representation says nothing about the tension relationship between Events 2 and 3 or Events 4 and 5. More seriously, it does not quantify the amount of tension or relaxation. It merely says that if two events are connected, one is relatively tense or relaxed in relation to the other.

1. A representation of hierarchical (prolongational) event structure.

2. A model of tonal pitch space and all distances within it.

3. A treatment of surface (largely psychoacoustic) dissonance.

4. A model of voice-leading (melodic) attractions.

FIGURE 1. Four components necessary for a quantitative theory of tonal tension.

FIGURE 2. Tension (t) and relaxation (r) represented by a tree structure.
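The connections described for Figure 2 can be written down as data. The following is a minimal sketch, not the notation of GTTM or TPS: the dictionary layout and the function name `relation` are illustrative assumptions, and only the six connections named in the prose are encoded.

```python
from typing import Optional

# (earlier event, later event) -> "t" (tensing) or "r" (relaxing),
# transcribed from the prose description of the abstract pattern.
CONNECTIONS = {
    (1, 2): "t",  # local level
    (3, 4): "r",
    (5, 6): "r",
    (1, 4): "t",  # larger levels
    (6, 7): "r",
    (1, 7): "r",
}


def relation(a: int, b: int) -> Optional[str]:
    """Return 't' or 'r' if events a and b are directly connected, else None."""
    key = (min(a, b), max(a, b))
    return CONNECTIONS.get(key)


for pair in [(1, 2), (2, 3), (4, 5), (1, 7)]:
    print(pair, relation(*pair))
```

Events 2 and 3, and Events 4 and 5, come back as `None`: the tree records only the direction of change between connected events, with no quantity attached.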

Further progress in the evaluation of tension depends on the second component listed in Figure 1, a model of tonal pitch space. A well-known finding in music psychology is that listeners’ judgments about the distances of pitches, chords, and regions (or keys) from a given tonic form consistent patterns (Bharucha & Krumhansl, 1983; Krumhansl, 1990, hereafter CFMP; Krumhansl & Kessler, 1982). These results have been replicated in several ways, using different input materials, participants with varied training, and different task instructions. When submitted to multidimensional scaling, the empirical data are represented as geometrical structures in which spatial distance corresponds to cognitive distance. The regular geometry found for regions (Krumhansl & Kessler, 1982) corresponds to musical spaces proposed earlier by music theorists (Schoenberg, 1954; Weber, 1817-21).

It is striking that listeners share a complex mental schema of the mutual distances of pitches, chords, and regions. But how is this empirical result to be accounted for? Several researchers have proposed explanatory frameworks: CFMP through sensitivity to statistical frequency of tone onsets or durations; Bharucha (1987) through neural-net modeling; Parncutt (1989) through psychoacoustic modeling. A fourth approach, which is complementary to the others, has been to develop a music-theoretic formal model of tonal pitch space that correlates with the empirical data and that unifies the treatment of pitches, chords, and keys within a single framework (Lerdahl, 1988; TPS). The model begins with the basic space in Figure 3, set to I/C. Regions are designated in boldface type, with upper-case letters for major keys and lower-case letters for minor keys. The numbers in familiar pitch-class notation signify either pitches or pitch classes, depending on context. The basic space is hierarchical in that if a pitch class is stable at one level, it repeats at the immediately superordinate level. The diatonic scale is built from members of the chromatic scale and the triad from members of the diatonic scale. The triad itself has an internal hierarchy, with the fifth more stable than the third and the root as the most stable element. The shape of this structure corresponds to the major-key tone profile in CFMP and can be viewed as an idealized form of it.

Transformations of the basic space measure the distance from any chord in any region to any other chord within the region or to any chord in any other region. The space shifts by means of a diatonic chord distance rule in which the distance from chord x to chord y equals the sum of three variables, as shown in the abbreviated statement of the rule in Figure 4. Computational details aside, the factors involved are the degree of recurrence of common tones and the number of moves along two cycles of fifths, one for triads over the diatonic collection and the other for the diatonic collection over the chromatic collection.

Figure 5 illustrates some basic-space configurations with their distance calculations from I/C (see Figure 3). The underlined numbers signify new pitch classes in the new configuration (variable k in the rule). The distance from I/C to V/C in Figure 5a is accomplished by staying in the same diatonic collection (i), moving the chord once up the diatonic cycle of fifths (j), and counting the resultant noncommon tones (k). The distance from I/C to I/G in Figure 5b is two units greater, even though the two chords are the same. In the latter, there is also a cycle-of-fifths move at the scale level (i), causing an extra noncommon tone at that level (k). Motion between major and minor chords arises not by a transformation but is a by-product of moves along a cycle of fifths.


(a) octave (root) level: 0 (0)

(b) fifths level: 0 7 (0)

(c) triadic level: 0 4 7 (0)

(d) diatonic level: 0 2 4 5 7 9 11 (0)

(e) chromatic level: 0 1 2 3 4 5 6 7 8 9 10 11 (0)

FIGURE 3. Diatonic basic space, set to I/C (C = 0, C# = 1, . . . B = 11).
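The hierarchical property of the basic space can be checked directly from the five levels listed in Figure 3. This is a small sketch under stated assumptions: the names `BASIC_SPACE_I_C`, `is_nested`, and `embedding_depth` are illustrative, and "depth" here simply counts how many levels contain a given pitch class.

```python
# Levels (a)-(e) of the diatonic basic space set to I/C, as pitch-class sets.
BASIC_SPACE_I_C = [
    {0},                       # (a) octave (root) level
    {0, 7},                    # (b) fifths level
    {0, 4, 7},                 # (c) triadic level
    {0, 2, 4, 5, 7, 9, 11},    # (d) diatonic level
    set(range(12)),            # (e) chromatic level
]


def is_nested(levels):
    """True if every level is contained in the level beneath it."""
    return all(upper <= lower for upper, lower in zip(levels, levels[1:]))


def embedding_depth(pc, levels):
    """Number of levels at which pitch class pc appears."""
    return sum(pc in level for level in levels)


print(is_nested(BASIC_SPACE_I_C))
print([embedding_depth(pc, BASIC_SPACE_I_C) for pc in range(12)])
```

The depth list is highest for the root, then the fifth, then the third, then the remaining diatonic tones, then the chromatic tones, which is the stepped shape the text compares to the major-key tone profile.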

FIGURE 4. The rule for calculating the distance between triads in diatonic space.



Thus the distance from I/C to i/a in Figure 5c is reached by staying in the same diatonic collection (i) and moving the chord three times up the diatonic cycle of fifths (j), producing four noncommon tones (k). The distance from I/C to i/c in Figure 5d, in contrast, involves moving the scale three times down the chromatic cycle of fifths (i). With no change of chord root (j), the third of the chord becomes minor. Again there are four noncommon tones (k).
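The four cases just described can be reproduced with a short sketch of the sum d = i + j + k. It rests on assumptions that are not spelled out in this text: the cycle-of-fifths counts i and j are passed in by hand, and k is taken to be the number of pitch classes, counted level by level from (a) to (d), that appear in the target basic space but not in the source. The helper names are illustrative.

```python
C_MAJOR = [0, 2, 4, 5, 7, 9, 11]
G_MAJOR = [7, 9, 11, 0, 2, 4, 6]
C_MINOR = [0, 2, 3, 5, 7, 8, 10]  # natural minor, assumed for region c


def basic_space(root, third, fifth, scale):
    """Levels (a)-(d) of the basic space for one chord in one region."""
    return [{root}, {root, fifth}, {root, third, fifth}, set(scale)]


def noncommon_tones(source, target):
    """k: pitch classes new in the target, summed over levels (a)-(d)."""
    return sum(len(t - s) for s, t in zip(source, target))


def chord_distance(i, j, source, target):
    """d = i + j + k, with i and j supplied by the analyst."""
    return i + j + noncommon_tones(source, target)


I_C = basic_space(0, 4, 7, C_MAJOR)
V_C = basic_space(7, 11, 2, C_MAJOR)
I_G = basic_space(7, 11, 2, G_MAJOR)
i_a = basic_space(9, 0, 4, C_MAJOR)   # region a shares C's diatonic collection
i_c = basic_space(0, 3, 7, C_MINOR)

print(chord_distance(0, 1, I_C, V_C))  # 5
print(chord_distance(1, 1, I_C, I_G))  # 7, two units greater
print(chord_distance(0, 3, I_C, i_a))  # 7, with k = 4
print(chord_distance(3, 0, I_C, i_c))  # 7, with k = 4
```

Under these assumptions the results agree with the prose: I/G is two units farther than V/C, and both minor-chord cases have four noncommon tones.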

When mapped geometrically, the distances (d) from triad to triad within a key exhibit a regular pattern, with the diatonic cycle of fifths arrayed on the vertical axis and the diatonic cycle of minor thirds on the horizontal axis. Figure 6a displays this pattern along with distances from the tonic triad to the other triads within the key. Regional space (that is, distances from a given tonic triad to other tonic triads) shows a similar pattern, with the chromatic cycle of fifths on the vertical axis and the minor-third cycle on the horizontal axis. Figure 6b gives a portion of regional space along with the distance values. If these chordal and regional patterns are extended, both Figures 6a and 6b form toroidal structures. (Figure 6b corresponds to a multidimensional solution developed from empirical data in Krumhansl and Kessler, 1982; also see CFMP.)

Pitch-space distances are input to prolongational structure via the principle of the shortest path. The idea is that listeners construe their understanding of melodies and chords in the most efficient way; in other words, they interpret events in as stable and compact a space as possible. For example, if one hears only the melodic progression C → E, the most stable interpretation is as 1̂ → 3̂ in C, for C and E are then in an optimally stable location in diatonic basic space. A slightly less preferred alternative is as 3̂ → 5̂ in a; still less preferred would be 4̂ → 6̂ in G; and so forth. Similarly, a G major chord heard in a C context is likely to be heard by the shortest path as V/C rather than, say, by longer paths to I/G or iii/e.

Figure 7 illustrates the use of the principle of the shortest path in a derivation of the prolongational structure for the final phrase of the Bach chorale, “Christus, der ist mein Leben.” (Later on we discuss the entire chorale; also see the extensive analysis in TPS, chapter 1.) Let us assume that the phrasal boundaries and metrical grid have already been assigned. As a first step, automatic segmentation rules carve the music into nested rhythmic units so that each event is assigned to a time-span segment. Second, at the quarter-note time-span level at the bottom of the graph, nonharmonic tones are reduced out, the cadence (marked c) is designated, and tonic orientation is established by shortest-path measurement. The opening F major triad is taken to be the tonic because the distance to itself is 0 (d[F → F] = 0), whereas distances to other possible tonics would be greater. All the subsequent events take place within F. Third, events at the quarter-note level come up for comparison at the half-note level. In each case, the most stable event is selected for comparison in the next larger span, where stability is defined in terms of the distance to another available event. Thus the opening I is compared within span a to vii°6 and I6, and ii6/5 is compared within span b to the V. In span a, the opening I is selected over vii°6 because, unlike vii°6, the distance of I to the tonic is 0; I wins over I6 because its root is in the bass. In span b, the V is chosen because it is part of the cadence. In span c, the only choice is the final I. Thus the half-note time-span level yields I-V-I.

FIGURE 5. Illustrations of d.

(a)
(iii)    V     vii°
 vi      I     iii
 ii      IV    (vi)
Distances from I: V = 5, IV = 5, vi = 7, iii = 7, ii = 8, vii° = 8.

(b)
 e       G     (g)
 a       C      c
 d       F     (f)
Distances from C: G = 7, F = 7, a = 7, c = 7, e = 9, d = 10.

FIGURE 6. Portions of (a) chordal space within a region; and (b) regional space, along with values calculated by d.

The time-span hierarchy then forms the input to the prolongational tree, moving from global to local levels. The distances between available global events are d(opening I → final I) = 0 and d(V → I) = 5. The first option wins because its path is shorter: the opening I branches directly to the closing I, and within that context V branches to the final I. At a more local level, in the first part of the phrase d(I → I6) = 0 and d(I → vii°6) = 5 (counting, as is customary, vii°6 as an incomplete dominant), so I6 attaches to I; within the context of I − I6, vii°6 branches to I6. Finally, ii6/5 lies between I6 and V. d(ii6/5 → I6) = 9 and d(ii6/5 → V) = 7, so ii6/5 attaches to the more proximate V. As a visual aid, the slurs between events in the music duplicate the relationships described in the tree.
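The attachment decisions in this derivation all follow one pattern: among the available events, pick the one at the smallest distance. The sketch below illustrates only that selection step, using the distance values quoted in the paragraph; the function name `attach` and the dictionary layout are assumptions, and the rhythmic and good-form considerations that also shape the tree are not modeled.

```python
def attach(candidates):
    """Return the available event with the smallest pitch-space distance.

    `candidates` maps each available event to its distance from the event
    being attached. Ties go to the first candidate listed.
    """
    return min(candidates, key=candidates.get)


# ii6/5 lies between I6 and V; distances as quoted in the text.
print(attach({"I6": 9, "V": 7}))              # V

# Global level: the opening I can connect to the final I (0) or to V (5).
print(attach({"final I": 0, "V": 5}))         # final I

# First part of the phrase: the opening I relative to I6 (0) and vii°6 (5).
print(attach({"I6": 0, "vii°6": 5}))          # I6
```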

Supplementary to the principle of the shortest path is a second factor in the derivation of prolongational reductions, the principle of good form, which encourages optimal patterns of tension and relaxation. This second principle breaks down into three conditions. First is the recursion constraint, in which successive right or left branches are preferable to unconnected right or left branches. Thus there is pressure to assign the first instance of Figure 8a rather than the second. Second is the balance constraint, in which the number of right and left branches approaches equality. Thus the first instance in Figure 8b is preferred over the second. Third is normative structure, in which there is a preference for at least one right branch leading off the structural beginning of the phrase and for at least one left branch (a pre-dominant) leading into the phrase’s cadence. Finally, there is a third overarching principle, that of parallelism: parallel passages preferably have parallel structures. GTTM uses this principle in all of its theoretical components.

The principles of the shortest path, prolongational good form, and parallelism reinforce one another in the Bach phrase, but in other passages they might conflict. Although the procedures involving the shortest path are algorithmic, their interaction with prolongational good form is not fully specified; and the principle of parallelism is notoriously difficult to quantify. Hence there is a degree of flexibility in the assignment of prolongational structure.


FIGURE 7. Derivation of the prolongational structure of the final phrase of the Bach chorale, “Christus, der ist mein Leben.”



The chord distance rule calculates not only the distance between two chords x and y but also the tonal tension between them. Tension can be computed both sequentially and hierarchically. Sequential tension is measured simply from one event to the next, as if the listener had no memory or expectation. Hierarchical tension proceeds through the prolongational analysis from global to local levels in the tree structure. It is an empirical question how much listeners hear tension sequentially and how much hierarchically. No doubt they hear from one event to the next, but if listening were only sequential there would be little larger-scale coherence to the musical experience.

We turn now to the third component of tonal tension, surface dissonance. Nonharmonic tones (tones not belonging to a sounding triad) are less stable, hence tenser, than harmonic tones. Even when all the sounding tones are harmonic, the triad is more stable if it is in root position than if it is in inversion; and, to a lesser extent, it is more stable if its melodic note is on the root of the triad than if it is on the third or fifth scale degree. These factors are registered, categorically and approximately, in the surface tension rule in Figure 9. They are only approximate because tones within a category in fact differ in their degree of perceived dissonance, depending on intervallic structure, metrical position, duration, loudness, timbre, and textural location. An alternative method would be to quantify surface tension according to an established measure of sensory dissonance in the psychoacoustic literature (for instance, Hutchinson & Knopoff, 1978). This method would give rise to a continuous measure of surface tension. However, surface tension is perceived categorically to a considerable extent. For example, in a diatonic 7-6 suspension chain, all the sevenths, major or minor, sound more or less equally dissonant. Here we take the categorical approach.

The chord distance rule and the surface tension rule combine in two possible ways to yield an overall tension value for a given event. The simpler way, stated in Figure 10a, is sequential: calculate the pitch-space distance from one event to the next and add the value for surface tension. The more complex way in Figure 10b is hierarchical: calculate the pitch-space distance from the immediately dominating event and add the value for surface tension; then add hierarchical values as inherited down the prolongational tree.

As illustration, consider Figure 11, the Grail theme from Wagner’s Parsifal. (This is also known as the “Dresden Amen” and is familiar as such in some Protestant services. Here the theme is transposed from A♭, its characteristic key, to E♭ so that it can be directly compared later on to its chromatic version in E♭.) The theoretically preferred analysis, following the recursion constraint and parallelism for the first four events, says that the music tenses away from the opening I until the pre-dominant ii (Event 5) in bar 2. After an elaboration of ii, the progression relaxes, in observance of normative structure, into the closing I, which repeats the opening I an octave higher. The dashed branch to Event 5 signifies an alternative branching that continues to follow the parallelism of the harmonic sequence but that removes the pre-dominant left branch required by normative structure. We shall return to this point.


FIGURE 8. Prolongational good form: (a) recursion constraint; (b) balance constraint; (c) normative structure.

FIGURE 9. The rule for calculating surface tension.



Included in Figure 11 are numerical values from the application of the rules in Figures 9 and 10. The first row of numbers between the staves lists surface dissonance values. The second row lists sequential tension values, obtained by calculating d from one chord to the next and adding surface dissonance values. For example, the sequential distance from Event 2 to Event 3 is 7, and the surface dissonance value for Event 3 is 1; so the sequential tension associated with Event 3 is 7 + 1 = 8. The third row similarly lists hierarchical tension values, obtained globally by adding the distance numbers next to the branches of the tree and then adding the surface dissonance values. Thus the hierarchical distance from Event 2 to Event 4 is 0 + 7 + 7 = 14, and the surface dissonance value for Event 4 is 1; hence the hierarchical tension associated with Event 4 is 14 + 1 = 15.
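The two worked sums above can be restated as a short sketch. It assumes that the inherited value is the sum of the branch distances lying above an event's immediately dominating event, so that hierarchical tension is the total of all branch distances on the path to the root plus the event's own surface dissonance. The tree below is hypothetical (node names `root`, `x`, `y`, `z` are placeholders); only the branch distances 0, 7, 7 and the surface value 1 come from the text.

```python
def sequential_tension(distance_from_previous, surface_dissonance):
    """Distance from the preceding event plus the event's surface dissonance."""
    return distance_from_previous + surface_dissonance


def hierarchical_tension(event, parent, branch_distance, surface_dissonance):
    """Sum branch distances from `event` up to the root, then add its surface value."""
    total = surface_dissonance[event]
    node = event
    while node in parent:                       # the root has no parent entry
        total += branch_distance[(parent[node], node)]
        node = parent[node]
    return total


# Sequential example from the text: d = 7, surface dissonance = 1.
print(sequential_tension(7, 1))                 # 8

# Hypothetical chain of branches carrying the distances 0, 7, and 7.
parent = {"x": "root", "y": "x", "z": "y"}
branch_distance = {("root", "x"): 0, ("x", "y"): 7, ("y", "z"): 7}
surface = {"root": 0, "x": 0, "y": 0, "z": 1}
print(hierarchical_tension("z", parent, branch_distance, surface))  # 15
```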

The same calculations appear in the tabular format in Figure 12. The events for T_seq in Figure 12a are listed in sequential order. The table decomposes the surface-dissonance and pitch-space factors into their component parts. The values in each row are summed to reach the total sequential tension for each event. In T_hier in Figure 12b, the target chords (those to the right of the arrows) are still listed in sequential order, but the source chords (those to the left of the arrows) are now listed by the immediately dominating events in the tree. For example, the notation T_hier(4 → 3) indicates that Event 3, because it branches from Event 4, derives its tension value from Event 4. In accordance with the hierarchical tension rule, Figure 12b includes the additional columns of “local total” and “inherited value.” The hierarchical tension for each event, given in the “global total” column, equals the local total plus the inherited value.

FIGURE 10. Tension rules: (a) sequential tension plus surface dissonance; (b) hierarchical tension plus surface dissonance.

FIGURE 11. Grail theme (diatonic version) from Wagner’s Parsifal, together with its theoretically preferred prolongational analysis, surface dissonance values, sequential tension values, and hierarchical tension values.

The fourth component of the tension model is the factor of attraction. That pitches tend strongly or weakly toward other pitches has long been recognized in music theory (see TPS, pp. 166–167 and 188–192). Bharucha (1984, 1996) provides a psychological account of this phenomenon through the notion of anchoring, which is the urge for a less stable pitch to resolve on a subsequent, proximate, and more stable pitch. This corresponds to the account offered by Krumhansl (1979) for the effect of temporal order on tone similarity judgments. Bharucha and Larson (1994, 2004) also equate the attractive urge with melodic expectancy (Meyer, 1956; Narmour, 1990). The TPS attraction model extends Bharucha’s approach to include the attraction of any pitch to any other pitch and to harmonic progression. It also quantifies the relevant variables and places them within a larger cognitive theory.

Figure 13a repeats the basic space with the fifths level (level b in Figure 3) omitted, in order to make attractions to the third and fifth scale degrees equal. Each level of the space is assigned an anchoring strength in inverse relation to its depth of embedding. Figure 13b gives the melodic attraction rule. The two factors in the equation, combined by multiplication, are the ratio of anchoring strengths of two pitches and the inverse square of the semitone distance between them. The distance factor is estimated to behave as in Newton’s classical gravitational equation. The inverse-square factor renders minuscule attractions between pitches that are more than a major second apart. To convey the behavior of the rule, Figure 13c lists a few attractions to diatonic neighbors in the context I/C. The pitch B is highly attracted to C because the two pitches are a semitone apart and C is more stable. D is less attracted to C because it is two semitones away. F is more attracted to E than E is to F because of their inverse anchoring strengths.
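The two-factor rule can be sketched as follows. The anchoring strengths used here (4 for the root level, 3 for the triadic level, 2 for the diatonic level, 1 for the chromatic level) are assumed values chosen to fall in inverse relation to depth of embedding; the figure that gives the actual strengths is not reproduced in this text. The ratio is taken as target strength over source strength, which is the reading that makes F → E stronger than E → F. Function names are illustrative.

```python
TRIAD_I_C = {4, 7}                       # third and fifth of I/C (fifths level omitted)
DIATONIC_C = {0, 2, 4, 5, 7, 9, 11}


def anchoring_strength(pitch):
    """Assumed strengths in the context I/C: root 4, triad 3, diatonic 2, chromatic 1."""
    pc = pitch % 12
    if pc == 0:
        return 4
    if pc in TRIAD_I_C:
        return 3
    if pc in DIATONIC_C:
        return 2
    return 1


def melodic_attraction(source, target):
    """(target strength / source strength) * 1 / (semitone distance squared)."""
    n = abs(target - source)
    if n == 0:
        raise ValueError("the rule does not apply to repeated pitches")
    return (anchoring_strength(target) / anchoring_strength(source)) / n ** 2


# Pitches as semitone numbers with C = 0 (12 is the C an octave higher).
print(melodic_attraction(11, 12))   # B to C
print(melodic_attraction(2, 0))     # D to C
print(melodic_attraction(5, 4))     # F to E
print(melodic_attraction(4, 5))     # E to F
```

With these assumed strengths, B → C comes out largest, D → C is a quarter of it, and F → E exceeds E → F, matching the ordering described in the paragraph. A leap of a major third or more is divided by at least 16, which is why such attractions become very small.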

The attraction rule applies not only to individual lines but also to each voice in a progression. As stated in the harmonic attraction rule in Figure 14, these values are summed and then divided by the value for the chord distance rule to obtain the overall attraction value from one chord to the next.

FIGURE 12. Tension tables for Figure 11: (a) sequential tension; (b) hierarchical tension.

Figure 15 applies the harmonic attraction rule to the first and last progressions in Wagner’s Grail theme. Where a pitch repeats, “null” is designated because the attraction rule does not apply to repeated pitches. The values of a are summed to the combined realized voice-leading value (a_rvl), which is divided by the d value to give the final realized harmonic attraction value (a_rh). Notice the extreme differences between the a_rh values for Prog(1 → 2) and Prog(8 → 9). In the former, the progression I → vi is only moderately strong and includes repeated notes; in the latter, the progression V7 → I is very strong and resolves by half step in two voices. Indeed, the strongest harmonic attraction is from a dominant seventh chord to its tonic, because of the powerful attractions of the leading tone to the tonic and the fourth to the third scale degree and because of the short distance from the dominant to the tonic chord. This is why (aside from statistical frequency) the expectancy for a tonic chord is so high after a dominant-seventh chord.
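The sum-then-divide step can be sketched in a few lines. The input numbers below are illustrative placeholders, not the values in Figure 15; `None` stands for the "null" entry of a repeated pitch, and the function name is an assumption.

```python
from typing import Optional, Sequence


def harmonic_attraction(voice_attractions: Sequence[Optional[float]],
                        chord_distance: float) -> float:
    """Sum the per-voice attractions (skipping nulls) and divide by the chord distance."""
    if chord_distance == 0:
        raise ValueError("chord distance must be nonzero")
    a_rvl = sum(a for a in voice_attractions if a is not None)
    return a_rvl / chord_distance


# Hypothetical four-voice progression: one repeated pitch, three moving voices.
print(harmonic_attraction([None, 2.0, 1.5, 0.5], 5))
```

Because the chord distance is in the denominator, a progression between close chords with strong half-step resolutions scores high, while a progression with many repeated pitches or a large chord distance scores low.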

Attractions in TPS are computed not only from event to event at the musical surface but also from event to event at immediately underlying levels of prolongational reduction. The resulting sets of numbers, however, are not integrated into a single attraction measure across reductional levels. Depending in part on tempo, underlying levels presumably contribute to the overall result in increasingly smaller amounts as the analysis abstracts away from the surface. (Margulis, 2005, proposes a mechanism for this step.) In this study we dispense with underlying levels of attraction.

Figure 16 shows the surface attraction values for the Grail theme. The numbers appear between events because they apply to relations between events. Where the harmony does not change, as in events 3-4 and 5-7, a single attraction value obtains.

FIGURE 13. Melodic attractions: (a) The basic space minus the fifth level and with anchoring strengths indicated by level; (b) the melodic attraction rule; (c) some computed attractions between scalar adjacencies in the context I/C.

FIGURE 14. Harmonic attraction rule.

There is a complementary relationship between tension and attraction numbers. Where the music tenses away from the tonic, attractions are realized on less stable pitches and chords. Hence where tension numbers rise, attraction values tend to be small. But where the music relaxes toward the tonic, attractions are realized on more stable pitches and chords; tension numbers decline and attraction values rise. A high attraction value in effect constitutes a second kind of tension: not the tension of motion away from stability but the tension of expectation that the attractor pitch or chord will arrive.

A further general point about tension and attraction concerns numerical quantification. As Klumpenhouwer (2005) points out, the theory’s numbers measure different entities in the different components: in the dissonance component, chord inversions and nonharmonic tones; in the distance model, steps on cycles of fifths and noncommon tones; in the attraction component, pitch stabilities and distances. As numerical values, then, these might be considered incommensurate (for example, a 2 for inversion in the dissonance component is not exactly the same as a 2 for the k distance between chords). One approach to this issue would be to find coefficients for the different variables to express the relative strength of their units of measurement. We have found, however, that coefficients are not needed for the tension rules; that is, the numbers already express the relative strength of the variables in question. However, the attraction rules yield incommensurable output numbers compared to those of the tension rules. Empirical data suggest that coefficients are needed when tension combines with attraction. For this, we take a practical rather than theoretical solution through the mathematical technique of multiple regression, which weights the two sets of numbers to find the best fit between the tension and attraction curves.

Before proceeding, it should be noted that the melodic attraction rule (Figure 13b) stands on weaker empirical grounds than does the chord distance rule (Figure 4). Experimental results guided the development of the distance rule. (However, the output of the elaborated form of d, the chord/region distance rule Δ [TPS, p. 70], which employs the pivot-region concept, proves to be empirically less successful, and we shall not invoke it.) The attraction rule, in contrast, was developed by a blend of theoretical and intuitive considerations without much supporting empirical data. Several aspects of the rule can be criticized. First, it is unclear that a multiplicative rather than additive relationship should obtain between the stability (s₂/s₁) and proximity (1/n²) parts of the equation. Second, as Larson (2004) and Samplaski (2005) observe, there is arbitrariness in the reduction of five levels of the basic space (Figure 3) to four when calculating attractions (Figure 13a). Third, the inverse factor for proximity eliminates the attraction of a pitch to itself because of the impossibility of a zero denominator. Pitch repetition may indeed be a case where intuitions of attraction and expectation diverge. One may expect a pitch to repeat, but it seems more natural to think of a pitch as being attracted only to other pitches. Nevertheless, the exclusion of pitch repetition from the calculations leaves a gap in the theory. Fourth, the specific form of the inverse factor, 1/n², appears to create too steep a curve; that is, the drop from great attraction at the half-step distance to less attraction at the whole-step distance to very little attraction at the minor-third distance seems too extreme (Margulis, 2005). The obvious alternative, 1/n, yields too flat a curve. An intermediate curve is possible, but the theoretical and empirical bases for such a solution are unclear. Fifth, and perhaps most importantly from a theoretical perspective, the measurement of proximity only by semitones may be too simple a metric. Larson (2004) cites evidence in Povel (1996) that stepwise arpeggiated intervals (that is, between adjacent members of a triad or between the fifth and the tonic) yield greater attractions than predicted by 1/n² or 1/n. Krumhansl (1979) and CFMP (Table 5.1) also find high relatedness ratings for triad members. This evidence fits the discussion in Chapter 2 of TPS about pitch proximity, step motion, and linear completion. It appears that that discussion, in which stepwise motion is seen as pertaining to the alphabet in question at a given level of the basic space, should have informed the formulation of pitch attractions in Chapter 4.

FIGURE 15. Two applications of the harmonic attraction rule.

FIGURE 16. Attraction values for the Grail theme.

Despite these reservations, the principles behind the attraction rule, stability and proximity, remain the central factors in a treatment of melodic attraction. We have tried the alternatives of five instead of four stability levels and of proximity by 1/n instead of 1/n², but the resulting values do not lead to improvements over those of the original rule with respect to the empirical data. Nor are there enough instances of voice-leading arpeggiation in our examples to force a stratified treatment of melodic proximity. Our project is to test the success or failure of the TPS theory of tension and attraction, and we leave theoretical refinements of the attraction rule for future research.

Experimental Approach

The participants in the experiments under discussion were musically trained students at Cornell University with relatively little training in music theory compared to the extent of their instruction on musical instruments or voice. (More details of music backgrounds and other details of the experimental method can be found in Appendix A.) They were tested for tension responses for Wagner’s Grail theme from Parsifal in its diatonic and chromatic versions, a Bach chorale, a chromatic Chopin prelude, and a passage from Messiaen’s Quartet for the End of Time. The data were compared to the model’s predictions. (A Mozart sonata movement that received a similar treatment is not discussed in this paper; see Krumhansl, 1996, and Lerdahl, 1996.)

The tests for the Wagner and Bach excerpts were conducted in two ways, the stop-tension task and the continuous-tension task. In the stop-tension task, the first event was sounded, at which point the participants rated its degree of tension; then the first and second events were sounded and the participants rated the tension of the second event; then the first, second, and third events were sounded, and so on, until the tension associated with each successive event was recorded. In the continuous-tension task, which was done for all excerpts, the participants interacted with a graphic interface that enabled them to move a slider right and left on the computer screen using a mouse, in correspondence with their ongoing experience of increasing and decreasing tension. The advantage of the stop-tension task is that it records the response precisely for the event that is evaluated. Its disadvantage is that it is rather artificial and prohibitively time-consuming for long excerpts. The advantage of the continuous-tension task is that it encourages a spontaneous response to intuitions of tension in real time. Its disadvantage is that there is a lag time, for which an approximated correction must be made, between the sounded events and the physical response of moving the mouse. Perhaps surprisingly, the two tasks yielded almost the same results for the short passages where both tasks were used. For the longer Chopin and Messiaen selections, however, it was practical to employ only the continuous-tension task. The participants in the study by Krumhansl (1996) using this method varied in the extent of their musical training, but training had little effect on the tension judgments.

As mentioned, the analyses combine tension and attraction values to achieve an overall measure of tension. We follow three conventions in this respect. First, even though an attraction number does not adhere to a single pitch but represents a relation between two successive pitches x and y, we assign the number to x, in effect claiming that it is at x that the experience of attraction most saliently takes place. In this way, each event has two numbers associated with it, one for d and the other for a. Second, the harmonic attraction rule (Figure 14) has d in the denominator and hence requires that d ≠ 0. This creates a problem when the voice leading moves but the harmony does not progress. In such cases, we repeat the value for d from the point at which the harmony last changed (as in Figure 16, Events 3-4 and 5-7). Third, a virtual attraction can be computed from Event x to any possible Event y, and, in particular, to the y with the highest attraction value. Instead we calculate only the realized attraction, that is, from x to the y that actually follows. It might be argued, especially for the stop-tension task, that the strongest virtual attraction, when it is not the same as the realized attraction, should be calculated, on the view that the strongest attraction corresponds to the strongest expectation. Expectations, however, depend not only on strongest attractions but also on schematic patterns that lie beyond current formalization. To calculate to an event that does not occur would be somewhat speculative in this context. It suffices as a first approximation to rely on the definiteness of realized attractions.
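The second convention, repeating the last nonzero d when the harmony does not progress, amounts to a carry-forward over the sequence of chord distances. The sketch below illustrates that bookkeeping under stated assumptions: the function names are hypothetical, each list position stands for the progression leaving event x (the first convention), and the numbers in the usage example are invented for illustration and are not taken from Figure 16.

```python
def effective_d(d_values):
    """Carry the most recent nonzero chord distance forward over zeros.

    A zero means the voice leading moved while the harmony stayed put.
    Positions before any nonzero value are returned as None.
    """
    carried, last = [], None
    for d in d_values:
        if d != 0:
            last = d
        carried.append(last)
    return carried


def realized_harmonic_attractions(a_rvl_values, d_values):
    """Divide each summed voice-leading attraction by its effective d."""
    return [
        None if d is None else a / d
        for a, d in zip(a_rvl_values, effective_d(d_values))
    ]


# Invented values: the third and fourth progressions keep the same harmony.
print(effective_d([7, 5, 0, 0, 8]))
print(realized_harmonic_attractions([1.0, 2.5, 0.5, 1.5, 4.0], [7, 5, 0, 0, 8]))
```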

A larger methodological point concerns the interaction between prediction and data. It is sometimes thought that an experiment simply tests a preexisting theory. Yet experimental data can give rise to a theory; this in fact was the case for the construction of the pitch-space model. In a healthy science, it often happens that a fruitful exchange develops between theory and experiment. Such is the case here. If the data suggest that the predictions are faulty, principled ways are sought within the model to reach predictions that achieve a better empirical result. These reevaluations are principled in the sense that they are constrained by the general assumptions and specific formalisms of the theory. This process can go back and forth a number of times. One must of course be careful not to adjust the theory simply in order to fit the data. Rather, the data can illuminate how listeners construe tension, suggesting interpretations within the model that are both theoretically acceptable and more predictive. In this way the theory can be improved. Furthermore, in our view it is not enough to achieve a statistically significant overall correlation. What is wanted, in addition, is an explanatory account of why the model succeeds or fails at any given point in the analysis.

In this back-and-forth process there are two kinds of flexibility within the theory. First, sequential or hierarchical tension can be computed, each with or without attractions. Second, unlike the tension and attraction rules (all those that incorporate d and a), which are algorithmic, the derivation of prolongational structure involves gradient preference rules, which interact with one another in search of an optimal solution (see the discussions in GTTM and TPS; also Temperley, 2001).

Preferential conditions arise in three ways. First is the interaction between the principles of the shortest path, prolongational good form, and parallelism. Second, when there is a shift from a right- to a left-branching pattern, the event where the shift takes place can attach either way, depending on the shortest path and good form. Third, it is not always clear where to locate an event in pitch space; that is, there can be ambiguity about the identity of a chord or the exact moment of a modulation. As a result of these factors, a passage of music yields not a single prolongational analysis but a limited range of preferred analyses. The data can point in any given case toward which theoretically viable prolongational analysis conforms best to listeners’ responses.

Analyses in a Diatonic Framework

Wagner Theme, Diatonic Version

We begin with the diatonic Grail theme from Wagner’s Parsifal, shown in Figure 11. Figure 17 records the nine events of the excerpt on the x-axis and tension responses from the stop-tension task on the y-axis. The dashed line represents the sequential tension values from Figure 12a, without the inclusion of attraction values, and the solid line shows the data from the averaged listeners’ responses. The fit is quite poor: R²(1,7) = .08, p = .46, R²adj = −.049.

FIGURE 17. Sequential tension analysis of the Grail theme from Parsifal.

Some words of explanation may be helpful. For each correlation, we present the following information about the statistical test. The first number, R², is the proportion of variation in the data that is accounted for by the model. It is associated with two numbers, the degrees of freedom. The first degree of freedom indicates the number of predictor variables in the model. In this case there is one variable, sequential tension. The second degree of freedom is the number of data points (in this case 9, one for each event in the music) minus two. The subtraction of two results from the regression “using up” two degrees of freedom for the parameters it determines. The regression finds the best-fitting linear model predicting the data from the variable(s). The model determines the optimal values for the slope of the line (one parameter) and the intercept of the line (the second parameter). Hence the number two is subtracted from the number of data points going into the regression to give the second degree of freedom. If the model could find a perfect fit between the data and the variable(s), R² would equal 1.0. In general, the value is less than this, and its significance is measured by the probability, p, which is the next number given in the statistical report. By convention, a statistic, such as R², is considered significant when the p value is less than .05. The probability depends on both the size of R² and the degrees of freedom. The last number given is the adjusted R², R²adj. The R²adj is the R² value adjusted to make it more comparable with other models for the same data that have different numbers of degrees of freedom.
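The bookkeeping behind these reports can be illustrated with a short Python sketch. It assumes the standard textbook formula for adjusted R², which the text does not spell out, and the function names are hypothetical. The degrees of freedom are the number of predictors and the number of data points minus the number of fitted parameters (one slope per predictor plus the intercept).

```python
def degrees_of_freedom(n_points, n_predictors):
    """Return (first, second) degrees of freedom for a linear regression."""
    return n_predictors, n_points - n_predictors - 1


def adjusted_r2(r2, n_points, n_predictors):
    """Standard adjusted R-squared: penalizes R-squared for extra predictors."""
    return 1.0 - (1.0 - r2) * (n_points - 1) / (n_points - n_predictors - 1)


# Nine events, one predictor: the (1,7) reported for the Grail theme.
print(degrees_of_freedom(9, 1))
# Reported R-squared values for the Grail theme, rounded to two decimals.
print(round(adjusted_r2(0.43, 9, 1), 2))
print(round(adjusted_r2(0.35, 9, 2), 2))
```

Because the reported R² values are themselves rounded, the recomputed adjusted values can differ from the published ones in the last decimal place.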

Methods such as time-series analysis or functional data analysis are not appropriate here. Our objective is to determine whether the judgments fit the quantitative predictions of the model for each musical event. For this we need a single number for judged tension for each event.

The conclusion from this statistical test for Figure 17 is that listeners do not hear this passage in a simple sequential manner. The R² value is only .08, which means that the sequential tension variable accounts for only 8% of the variability in the tension judgments, and the p value of .46 tells us that this is an unimpressive result. Graphically, this is apparent in Figure 17, where the two lines do not follow each other closely.

The second analysis is another single-variable model, using the attraction values displayed in Figure 18. These are the attraction values from Figure 16, without the inclusion of tension values, against the listeners’ responses. The fit is improved but still not good: R²(1,7) = .35, p = .09, R²adj = .26.

Figure 19 combines Figures 17 and 18 by adding the attraction values to the sequential tension values. Multiple regression weights the two sets of numbers to achieve a best-fit solution, and assigns a probability to each of the predictor variables. These will be denoted p(attraction) for the probability of attraction and p(tension) for the probability of the total tension predictor. Each of these is shown with a standardized beta value, b. The b weights are the coefficients in the linear model predicting the data from the predictor variables (after they have been standardized to have the same mean and standard deviation). The picture is better than in Figure 17 but no better than in Figure 18: R²(2,6) = .35, p = .28, R²adj = .13; p(attraction) = .17, b = .58; p(tension) = .96, b = .02. The higher p value for R² is the penalty for using two predictor variables rather than one, thus increasing the first degree of freedom. Or, to put it another way, when attraction is included in the model, adding the sequential tension values does not improve the fit (the p value for sequential tension in the multiple regression is .96, which means that adding it has virtually no effect). This analysis confirms the conclusion that the strict sequential treatment of tension does not contribute to the fit of the data.
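For two predictors, the standardized weights and R² can be obtained directly from the three pairwise correlations, which makes the logic of the multiple regression easy to inspect. The following Python sketch uses that closed form; it is not the software used in the study, the function names are hypothetical, the numbers in the usage example are invented, and it does not compute the p values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def standardized_fit(tension, attraction, judged):
    """Standardized weights and R-squared for judged ~ tension + attraction."""
    r_yt = pearson(judged, tension)
    r_ya = pearson(judged, attraction)
    r_ta = pearson(tension, attraction)
    denom = 1.0 - r_ta ** 2  # zero only if the predictors are collinear
    b_tension = (r_yt - r_ya * r_ta) / denom
    b_attraction = (r_ya - r_yt * r_ta) / denom
    r2 = b_tension * r_yt + b_attraction * r_ya
    return b_tension, b_attraction, r2


# Invented predictor values and judgments, for illustration only.
tension = [1, 2, 3, 4, 5, 6]
attraction = [2, 1, 4, 3, 6, 5]
judged = [3, 2, 6, 4, 9, 7]
print(standardized_fit(tension, attraction, judged))
```

A weight near zero for one predictor, as reported for sequential tension above, means that the predictor adds almost nothing once the other is in the model.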

FIGURE 18. Attraction analysis of the Grail theme.

FIGURE 19. Combined sequential + attraction analysis of the Grail theme.

Let us abandon the sequential-tension approach and consider instead the theoretically preferred prolongational analysis in Figure 11 together with its derived hierarchical tension analysis in Figure 12b. At first we ignore attraction values. The resultant graph in Figure 20 achieves a better correlation than the previous analyses: R²(1,7) = .43, p = .056, R²adj = .35. However, the predicted values are too high for Events 1-4 and too low for Events 5-8.

Figure 21 adds the attraction values in Figure 16 to the tension values in Figure 12b. Now the correlation is quite good and statistically significant: R²(2,6) = .75, p = .016, R²adj = .66; p(attraction) = .03, b = .56; p(tension) = .02, b = .63. The most notable change in Figure 21 compared to Figure 20 is the raising of the predicted curve at Event 8 (V7). In Figure 20 the tension model correctly assigns relaxation into the cadence, but participants experience greater tension at the V7 chord than shown there. This happens because the V7 is highly attracted to the following tonic resolution, an effect realized in Figure 21 by the inclusion of attraction values. Discrepancies remain, however. The predictions for Events 3-4 are still too high and those for Events 5-7 are too low.

These shortcomings can be overcome through a revision of the prolongational analysis. In the original analysis in Figure 11, there is equilibrium between right and left branching (following the balance constraint), with Event 5 (the ii chord) interpreted as a pre-dominant to the cadence (following normative structure). The analysis in effect claims that, beginning at Event 5, the listener already expects the resolution on Event 9. But it is harder to anticipate prospectively than it is to remember retrospectively. Besides, Event 5 continues from the previous events the harmonic sequence of descending thirds with a rising melodic second. It is easier to hear instead the analysis in Figure 22, in which the principle of parallelism wins over those of branching balance and pre-dominant function. The only difference is that Event 5 is now a right instead of left branch; Events 6 and 7 attach to Event 5 as before. This single change leads to alterations in tension values for Events 5-7, as listed between the staves. In this interpretation, the tension of the harmonic sequence continues through the elaboration of ii in Events 6-7 and is released only at the cadence in Events 8-9. Attractions remain as before. The result is the almost perfectly matching curves in Figure 23: R²(2,6) = .97, p < .0001, R²adj = .97; p(attraction) < .0001, b = .58; p(tension) < .0001, b = .79.

Three broad conclusions can be drawn from this analysis of the Grail theme. First, attractions must be incorporated into the predictions. Second, listeners hear tension hierarchically more than sequentially. Third, unless schematic intuitions are strong, listeners tend to construe events in a right-branching manner, that is, in terms of previous rather than following events.

FIGURE 20. Tension graph for the theoretically preferred hierarchical analysis of the Grail theme.

FIGURE 21. Combined hierarchical (theoretically preferred) + attraction analysis of the Grail theme.

Are the stop-tension data related to the continuous-tension data? Krumhansl (1996) found that the discrete predictions of the TPS model could provide a good fit to the continuous tension judgments by assuming an integration time of 2.5 seconds. In the present case, this approach is adapted to ask whether the continuous-tension data could be predicted by the stop-tension data, assuming the same integration time. The calculation assumes that the values of past events are degraded as an inverse exponential function with a half-life of 0.5 seconds. The continuous data are plotted as a solid line in Figure 24 together with the values calculated from the stop-tension data. A high degree of agreement is reached: R²(1,104) = .95, p < .0001, R²adj = .95. This is of interest because the participants performed the stop-tension task before the continuous-tension task. This means that when they performed the stop-tension task, they had not heard the music beyond the chord that they were judging. The extent to which the two tasks converge suggests that listeners were responding to the sounded events rather than to events they anticipated because of memory from previous listening. Although the analyses will not be presented here, the stop-tension and continuous-tension data are similarly related for the other two excerpts for which they are available (the chromatic version of the Wagner Grail theme and the Bach chorale).
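One way to picture the integration step is as a weighted average of recent event ratings, with each weight halving for every 0.5 seconds of elapsed time and events older than 2.5 seconds dropped. The Python sketch below implements that reading. The exact weighting scheme used in the study is not given in the text, so the normalization, the hard cutoff at the window edge, the function name, and the sample numbers are all assumptions made for illustration.

```python
def integrate_stop_ratings(onsets, ratings, t, window=2.5, half_life=0.5):
    """Predict a continuous value at time t from discrete event ratings.

    Each event that began within `window` seconds before t contributes its
    rating, weighted by 0.5 ** (age / half_life). Returns None if no event
    falls inside the window.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for onset, rating in zip(onsets, ratings):
        age = t - onset
        if 0.0 <= age <= window:
            weight = 0.5 ** (age / half_life)
            weighted_sum += weight * rating
            weight_total += weight
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total


# Invented onsets (seconds) and ratings for three events.
onsets = [0.0, 0.5, 1.0]
ratings = [0.0, 3.0, 6.0]
print([integrate_stop_ratings(onsets, ratings, t / 4) for t in range(0, 9)])
```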

Bach Chorale

On the basis of the discussion of the Wagner excerpt, the remaining analyses follow the hierarchical rather than sequential tension model and incorporate attractions as part of the overall prediction of tension. We first consider the Bach chorale “Christus, der ist mein Leben.” Its prolongational analysis is divided, for reasons of space, between Figure 25 and Figure 26. The top branches, all of which represent the tonic I/F (hence d = 0 in the tree), should be understood as connecting together. Event 2 in Figure 25 attaches to Event 41 in Figure 26, and the designation for Event 19 in Figure 26 refers to Event 19 in Figure 25. The predicted values of surface dissonance, hierarchical tension, and attraction appear between the staves. (Incidentally, Event 34 branches differently than does the equivalent first event in Figure 7. Here it connects not to the final cadence [Event 41] but back to Event 19, showing the return to I3̂/F. This happens because a prolongational analysis always makes the most global connection possible. In Figure 7 the context was a single phrase; here it is the entire chorale.)

FIGURE 22. Prolongational analysis of the Grail theme, with Event 5 reinterpreted as right branching.

FIGURE 23. Combined hierarchical (right-branching) + attraction analysis of the Grail theme.

FIGURE 24. Comparison of the continuous-tension data (solid line) with predictions from the stop-tension data for the Grail theme, after the latter are integrated over 2.5 seconds with an exponential decay with half-life 0.5 sec.

Figure 27 shows the fit of the empirical data with the predictions in Figures 25-26: R²(2,38) = .79, p < .0001, R²adj = .78; p(attraction) < .0001, b = .47; p(tension) < .0001, b = .67. The high correlation is all the more impressive given that a correlation tends to decrease as the number of events increases (because there are more possible points of deviation, as shown in the second degree of freedom). Attraction and tension are both individually significant in the multiple regression.

The analysis in Figures 25-26 departs from the TPS analysis of the chorale in two places. The first concerns the interpretation of Event 4 in Figure 25. In TPS it is conventionally treated as a secondary dominant, IV2/IV, and by the shortest path attaches to the following IV. But this solution, shown by the dashed branch, gives a high tension value of 23 because of the double inheritance from IV (8 + 5). The right-branching alternative, the solid branch coming from the previous V6, takes a longer local path but achieves a better balance between right and left branching in the phrase as a whole, and it gives a moderate tension value of 15. Olli Väisälä (personal communication, October 26, 2004) points out, however, that the Roman numeral analysis of IV2/IV itself violates the principle of the shortest path. The most efficient interpretation of Event 4 is instead as I/F with a flatted seventh in the bass, yielding a low tension value of 5. This option is shown in parentheses in Figure 25 and by the dotted branch in the tree. In this view, Event 4 is at an immediately underlying level, transformed at the musical surface by the chromatic descent in the bass. (Imagine Event 4 with F3 instead of E♭3 in the bass; the progression makes perfect sense.) Of these solutions, the best match with the data is the intermediate one with the tension value of 15, and this is what we have followed here.

The second departure from the TPS analysis concerns the point at which the third phrase shifts from F to C. In TPS, the reorientation is taken to occur on the downbeat of bar 5 with V65/C, as illustrated in Figure 28a. This interpretation treats the melodic F5 on the third beat of bar 5 as a neighboring 4̂ between a prolonged 3̂ in C. The resulting tension values, however, are too high at the F5. Väisälä suggests instead the analysis in Figure 28b, in which the shift to C takes place later in the phrase. In this interpretation, which we have taken here, the F5 is not a mere neighbor in C but is the goal, 8̂ in F, of a linear progression from the C5 that begins the phrase. This reading leads to a different prolongational tree and fits better with the data.

FIGURE 25. Analysis of the Bach chorale, phrases 1-2.

These alternative interpretations of the first and third phrases illustrate the gradient nature of the prolongational derivational process and how empirical data can illuminate which “preferred” interpretation may best conform to listeners’ intuitions. It is noteworthy that both instances involve choices in Roman-numeral analysis. From the present perspective, Roman-numeral analysis is not just a pedagogical labeling device but is a means of establishing location in pitch space. Different spatial locations yield different distances, hierarchical relationships, and degrees of tension.

There are a few places where the model cannot find a good fit with the data. Event 12 has too low a tension value because, as I6 prolonged from I, it inherits no tension. Yet it also acts as a passing chord in a progression of outer-voice parallel 10ths. The theory does not yet have a way of addressing this voice-leading pattern. There is also a poor fit at Events 24-25 (this would be the case also under the TPS interpretation in Figure 28a). These events are embellishing 16th notes of little importance to the experience of tension. However, the stop-tension task brings attention to them. The model does not yet take into account the effect of relative duration, so that these fleeting events have more weight in the statistical analysis than they ought to have. Both deviations suggest directions in which the theory might be improved.

FIGURE 26. Analysis of the Bach chorale, phrases 3-4.

FIGURE 27. Tension graph for the Bach chorale.

Chopin Prelude

Chopin’s E major Prelude (analyzed in TPS, Chapter 3) is an exceptionally concentrated example of nineteenth-century chromaticism. We assume a prior reduction of the Prelude’s surface to block four-part harmony. Figures 29-31 display the TPS prolongational analysis of the Prelude’s three phrases. Each phrase begins with the same chord (I5̂/E), so that, at a global level not shown, Event 17 branches off Event 1 and Event 33 off Event 17; finally, Event 1 attaches to Event 47. As prolongations of the tonic, all these events inherit 0, and the patterns of tension and relaxation take place within the phrases.

A number of details in the figures require comment. In Figure 29, Events 6 and 8 could be regarded as separate chords (viio6 and iii6, respectively), but it is equally valid to treat them as voice-leading anticipations to the ensuing chords (the D♯ in Event 6, the G♯ in Event 8). The latter interpretation better fits the data and is taken here. In the tree, the indication “1[0]” means that Event 16 inherits 0 from Event 12 (since both are V chords) but that the seventh (A3) in Event 16 adds 1 to its local d value. The bracketed numbers in Figure 30 to Events 23 and 30 are to be understood similarly. The dashed branch to Event 28 is an alternative interpretation that gives T_hier = 36; this result better fits the data and is adopted. Event 44 in Figure 31 also offers contrasting options. Here T_hier = 6 is too low and T_hier = 45 too high compared to the data. We take the first option since it takes a shorter path.

FIGURE 28. Alternative analyses of the third phrase of the Bach: (a) as in TPS; (b) with a delayed tonicization of C. (Only the soprano and bass lines are shown.)

FIGURE 29. Analysis of the first phrase of Chopin’s E major Prelude.

FIGURE 30. Analysis of the second phrase of Chopin’s E major Prelude.

FIGURE 31. Analysis of the third phrase of Chopin’s E major Prelude.

Before considering the statistical fit to the data, let us note how these continuous-tension data were prepared for the analysis. The discrete values shown as “judged” in Figure 32 are the average of the listeners’ tension judgments from the onset of each event to the onset of the next event. The slow tempo (two seconds per chord) suggested that this would give a representative value for the tension of each event. The motivation for finding these discrete values was that it was desirable to work with a single number for each event as various theoretical analyses were considered. This approach makes fewer assumptions than the exponential decay model used in prior treatments of the continuous response method (Krumhansl, 1996).
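The reduction of the continuous record to one number per event can be sketched as a simple windowed mean. In this Python sketch the function name, the sampling rate implied by the example, and the sample values are hypothetical; the text does not describe the sampling details or how the response lag was handled for this excerpt, so no lag correction is applied.

```python
def mean_per_event(sample_times, sample_values, onsets, end_time):
    """Average the slider samples between each onset and the next.

    The last event runs from its onset to `end_time`. An event with no
    samples in its window yields None.
    """
    bounds = list(onsets) + [end_time]
    means = []
    for start, stop in zip(bounds, bounds[1:]):
        window = [
            value
            for time, value in zip(sample_times, sample_values)
            if start <= time < stop
        ]
        means.append(sum(window) / len(window) if window else None)
    return means


# Invented slider samples every 0.5 s over two events lasting 2 s each.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
values = [0, 2, 4, 6, 10, 10, 20, 30]
print(mean_per_event(times, values, onsets=[0.0, 2.0], end_time=4.0))
```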

Figure 32 shows the fit of the predictions in Figures 29-31 to the data: R²(2,44) = .42, p < .0001, R²adj = .40; p(attraction) = .34, b = .11; p(tension) < .0001, b = .62. The correlation is not strong. Although the overall probability is low, the contribution of attraction does not approach statistical significance. We shall find a better solution, but before doing so let us review the main trouble spots. First, the discrepancy for Events 1-5 is an artifact of the continuous-tension task and can be discounted: the position of the slider was initially set at 0 and participants needed to hear a few events before they were able to position the slider near an appropriate level of overall tension. The predicted tension is too low, however, for part of the rest of the first phrase (especially Events 14-16) and much of the second phrase (Events 23-29). The fit in the third phrase is particularly poor, with the predicted values too high (Events 38-43) and then too low (Events 44-46).

The difficulty with Events 14-16 seems to be that in the prolongational analysis Event 12 inherits no tension from Event 7 (the two are identical), yet listeners pay attention instead to the slow descent of the melody. The situation is comparable to that of the second phrase of the Bach (Event 12 in Figure 25): in both cases, the linear melodic progression maintains tension that the theory does not account for. The model’s predictions for Events 23-29, in contrast, could be increased by a different analysis within the theory. The TPS analysis in Figure 30 follows Aldwell and Schachter (1979) by interpreting Events 24-28 as a prolongation of an enharmonically shifting diminished seventh chord; thus the A major and B minor chords (Events 25 and 27) are assigned passing status. In another plausible analysis, Events 24-27 would recursively branch to the right, on the rationale that the listener is sufficiently baffled by the intense chromaticism that the only recourse is to hear each event in terms of the immediately preceding one. This tree structure would increase the predicted tension to correspond rather well to the data. We refrain from presenting this alternative only because of another option to be discussed shortly.

The third phrase presents the greatest problems, beginning with the large distance value assigned to the move to F at Event 37. As mentioned in TPS (p. 78), d may obtain too great a distance between I and II; Events 38-43 then inherit this value. In addition, listeners tend to lag in their responses when presented with distant modulations; they need time to adjust to the new context (see Krumhansl & Kessler, 1982, for related evidence). The data shows this in the descending curves between Events 37-39, 41-43, and 45-47. In each case, the local I-V-I progression gradually establishes the new tonic for the listener, even though in the prolongational analysis the second I is a repetition of the first. There is a clash between final-state analysis and real-time listening. The conflict is most severe at the return to E at the final cadence (Events 44-45). The listener expects a repeat of the sequential pattern in the previous bars, I-V-I-IV,


FIGURE 32. Tension graph for the TPS analysis of the Chopin prelude.



but the second I is followed instead by V/E. This startling modulation takes two more beats to process. The model has no way to register these factors.

Nevertheless, the predictions improve markedly by the injection of a new factor: melodic contour. Observe that in the first two phrases the judged tension rises and falls in waves that correspond to the rise and fall of the melody. The shape of the melody in this piece is indeed as simple as the harmony is complex. In the spirit of the diatonic underpinnings of this chromatic music, let us assign pitch height in imitation of scale degrees (non-modulo 8), as shown in Figure 33 for the second phrase. A diatonic and a lowered sixth degree both receive “6,” for example. (One could also measure pitch height by semitones from the bottom melodic pitch B3, which we have done with comparable results.)
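The two ways of measuring pitch height described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used to produce Figure 33: the pitch-name format, the function names, and the choice of E4 as the reference for degree 1 are all hypothetical. Only the reference pitch B3 for the semitone measure and the rule that a diatonic and a lowered sixth degree both receive 6 come from the text.

```python
import re

LETTER_STEP = {"C": 0, "D": 1, "E": 2, "F": 3, "G": 4, "A": 5, "B": 6}
LETTER_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def parse_pitch(name):
    """Split a name such as 'C#5' or 'Bb3' into (letter, alteration, octave)."""
    match = re.fullmatch(r"([A-G])([#b]*)(-?\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized pitch name: {name}")
    letter, accidentals, octave = match.groups()
    alteration = accidentals.count("#") - accidentals.count("b")
    return letter, alteration, int(octave)


def diatonic_height(name, reference="E4"):
    """Scale-degree-like height: letter steps above the reference, counted from 1.

    Accidentals are ignored, so chromatic inflections share a value, and the
    result is not reduced to a single octave. The reference E4 is an assumption.
    """
    letter, _, octave = parse_pitch(name)
    ref_letter, _, ref_octave = parse_pitch(reference)
    steps = (octave * 7 + LETTER_STEP[letter]) - (ref_octave * 7 + LETTER_STEP[ref_letter])
    return steps + 1


def semitone_height(name, reference="B3"):
    """Height in semitones above the reference pitch (B3 in the text)."""
    def midi(pitch):
        letter, alteration, octave = parse_pitch(pitch)
        return 12 * (octave + 1) + LETTER_SEMITONE[letter] + alteration
    return midi(name) - midi(reference)


# A diatonic sixth degree (C#5) and a lowered one (C5) receive the same value.
print(diatonic_height("C#5"), diatonic_height("C5"))
print(semitone_height("C#5"), semitone_height("C5"))
```

The first measure discards chromatic inflection while the second preserves it. Pitches below the reference are not treated specially in this sketch.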

The correlation of melodic contour alone with the data is surprisingly robust: R²(1,45) = .66, p < .0001, R²adj = .65. Figure 34 shows the result when contour is combined by multiple regression with the other factors that predict tension: R²(3,43) = .67, p < .0001, R²adj = .65; p(attraction) = .02, b = .22; p(tension) = .003, b = .33; p(contour) < .0001, b = .57. Now the correlation is healthy, and all the variables, even attraction, are statistically significant. Yet it should be noted that for the best-fitting solutions for all the diatonic passages analyzed, attraction appears weaker than tension, as shown by a b value that is consistently less than that for tension. This suggests that the numerical formulation of attraction might be improved.
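For readers who wish to try this kind of comparison on their own data, the following is a minimal sketch of a multiple regression with standardized coefficients, assuming that the b values above are coefficients on z-scored variables (an assumption on our part). The data are synthetic stand-ins generated at random, not the judgments or predictions from the study, and the function name is hypothetical. Per-predictor p values are not computed here.

```python
import numpy as np


def standardized_regression(predictors, response):
    """Ordinary least squares on z-scored variables.

    Returns (betas, r2, adjusted_r2). No intercept is needed because every
    variable has mean zero after standardization.
    """
    X = np.asarray(predictors, dtype=float)
    y = np.asarray(response, dtype=float)
    n, p = X.shape
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    residuals = yz - Xz @ betas
    r2 = 1.0 - (residuals @ residuals) / (yz @ yz)
    adjusted = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return betas, r2, adjusted


# Synthetic stand-ins for the three predictors and the judged tension.
rng = np.random.default_rng(0)
n_events = 47
attraction, tension, contour = rng.normal(size=(3, n_events))
judged = 0.2 * attraction + 0.3 * tension + 0.6 * contour
judged = judged + rng.normal(scale=0.5, size=n_events)

betas, r2, adjusted = standardized_regression(
    np.column_stack([attraction, tension, contour]), judged
)
print(betas.round(2), round(r2, 2), round(adjusted, 2))
```

Comparing the sizes of the standardized coefficients is what allows the relative weight of attraction, tension, and contour to be read directly from one fit.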

We have computed the melodic-contour factor for all the other music under consideration and found that it is significant only in the Chopin. Why is this so? The melodic contours in the other pieces move up and down in more intricate patterns than in the Prelude, and their harmonic structures are less complicated. Generally speaking, listeners gravitate toward structures that are more easily processed. When faced with the simple melody and convoluted harmonies of the Chopin, they apparently give the former greater weight than they would under more usual circumstances.

Analyses in Nondiatonic Spaces

Despite its intense chromaticism, the Chopin prelude remains diatonic in the sense that its harmonic progressions refer to diatonic scale degrees, albeit often chromatically inflected degrees. Later composers began to explore progressions that, while still having a tonal orientation, refer to nondiatonic structures, such as the whole-tone scale (all whole steps), the octatonic scale (alternating half and whole steps), and the hexatonic


FIGURE 33. Contour values (imitating scale degrees) for the second phrase of the Chopin.

FIGURE 34. Tension graph for the TPS + contour analysis of the Chopin prelude.



scale (alternating half steps and minor thirds). Chapter 6 of TPS treats diatonic and nondiatonic scales and chords within a single model through modifications in diatonic basic space and the corresponding distance and attraction rules. As the details are rather involved, we shall review just a few features of the approach. Figure 35a displays octatonic basic space oriented toward a C major triad. The space is the same as that of I/C in diatonic space (Figure 3) except that at the scale level an octatonic collection replaces the diatonic collection. There are three possible octatonic scales, hence regions; this one is labeled oct0. As in the diatonic case, triadic progressions can take place within oct0 or can modulate to oct1 or oct2. The distance between any two triads in octatonic space is computed by an adjusted chord distance rule, doct(x → y) = i + j + k; j proceeds not by the cycle of fifths but by minor thirds and parallel triads. Attractions (a) are computed as in the diatonic case.

Triads in hexatonic space receive analogous treatment. There are four possible hexatonic regions. Computing dhex requires minor adjustments comparable to those for doct; j proceeds by major thirds and parallel triads. Figure 36a shows a C major triad in hex0, and Figure 36b computes the distance to the farthest triad within hex0. Again, a is calculated as in the diatonic case. Finally, there is a rule dis that calculates interspatial distances, as for instance when a phrase begins in octatonic space and “hypermodulates” to diatonic space. (See TPS, pp. 280-285, for discussion of dis. Because of the intricacy of the rule, we shall not show how it is computed here.)
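The counts of three octatonic and four hexatonic regions follow from the interval patterns of the scales and can be verified by enumeration. The sketch below builds a pitch-class collection from a repeating pattern of semitone steps and counts the distinct collections over all twelve starting points. The function names are ours, and the collections are not matched to the labels oct0 or hex0, whose numbering is defined in TPS.

```python
def collection(start, pattern):
    """Pitch-class set built upward from `start` by a repeating pattern of
    semitone steps until one octave (12 semitones) has been covered."""
    pcs = []
    pc, total, i = start % 12, 0, 0
    while total < 12:
        pcs.append(pc)
        step = pattern[i % len(pattern)]
        pc = (pc + step) % 12
        total += step
        i += 1
    return frozenset(pcs)


def distinct_collections(pattern):
    """All different pitch-class sets obtained from the 12 possible starts."""
    return {collection(start, pattern) for start in range(12)}


OCTATONIC = (1, 2)   # alternating half and whole steps
HEXATONIC = (1, 3)   # alternating half steps and minor thirds
WHOLE_TONE = (2,)    # all whole steps

for name, pattern in [("octatonic", OCTATONIC),
                      ("hexatonic", HEXATONIC),
                      ("whole-tone", WHOLE_TONE)]:
    print(name, len(distinct_collections(pattern)))

# A C major triad (pitch classes 0, 4, 7) lies inside the octatonic
# collection that starts on C with a half step.
print({0, 4, 7} <= collection(0, OCTATONIC))
```

The small number of distinct collections reflects the transpositional symmetry of these scales, which is why modulation among regions involves far fewer possibilities here than among the twelve diatonic collections.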

In sum, TPS’s methods for deriving prolongational structure and computing tonal tension in nondiatonic tonal music are the same as for diatonic tonal music. Only the basic spaces and details of d are different. A goal of the present study is to investigate the perceptual relevance of these nondiatonic structures.

Wagner Theme, Chromatic Version

Figure 37 presents two analyses of a chromatic statement of the Grail theme as it appears in Act III of Parsifal. The phrase modulates from E♭ to D♭ by twice flattening the diatonic version by a half step, at Events 2 and 5. We shall consider E♭ and D♭ as equally stable in this context, with both set to d = 0. Events 1-5 belong entirely within a hexatonic collection; Events 5-9 resume the theme’s previous diatonic course. How do listeners hear the flow of tension in this passage?

In keeping with the method in Chapter 6 of TPS for finding preferred spaces, both analyses in Figure 37 interpret Events 1-4 within the hexatonic region hex3 with Event 5 pivoting as ii/D♭ into diatonic space. Attractions are computed with reference to hex3 for Events 1-5 and with reference to diatonic D♭ for Events 5-9. Figure 37a corresponds to Figure 22, the empirically strongest analysis of the diatonic version. The only prolongational change is that Event 5 branches not


FIGURE 35. Octatonic space: (a) the basic space oriented to a C major triad in the region oct0; (b) an illustration of doct.

FIGURE 36. Hexatonic space: (a) the basic space oriented to a C major triad in the region hex0; (b) an illustration of dhex.



from Event 4 but directly from Event 1, a consequence of the hypermodulation at that point from hexatonic to diatonic space, with dis(Event 1 → Event 5) = 7. Figure 38 reveals a poor data fit for this analysis: R²(2,6) = .34, p = .29, R²adj = .11; p(attraction) = .94, b = −.03.22; p(tension) = .16, b = .59. The predictions for Events 3-4 are too high because of the inherited value from Event 2. The prediction for Event 5 is conversely too low; the chromatic progression between the E♭ major and minor triads of Events 1 and 5 apparently dilutes the connection between them. The discrepancy at Event 8 does not result from a local calculation and may be an artifact of the statistical analysis.

Figure 37b repeats the analysis in Chapter 7 of TPS. In keeping with its pivot function, Event 5 takes a double branch, from the right in hexatonic space and from the left in diatonic space. Hence Event 5 has two tension values, 1 and 10; the latter number fits the data better and is taken here. The improved result, shown in Figure 39, gives R²(2,6) = .66, p = .04, R²adj = .55; p(attraction) = .44, b = .19; p(tension) = .02, b = .78.

In both analyses, attraction probabilities are high, indicating that this factor makes little contribution to the correlation. Moreover, the more successful hexatonic analysis (Figure 37b) bears little resemblance to the best-fit analysis of the diatonic version of the


FIGURE 37. Hexatonic-to-diatonic analyses of the chromatic version of the Grail motive: (a) right-branching interpretation; (b) TPS interpretation.

FIGURE 38. Tension graph of the analysis in Figure 37a.

FIGURE 39. Tension graph of the analysis in Figure 37b.



theme (Figure 22). It is hard to justify such contrasting analyses of the two versions.

The predicted curves diverge from the data curves at the same places in Figures 38 and 39: increase instead of decrease between Events 2 and 3-4, and decrease instead of increase between Events 4 and 5. This pattern suggests a sequential rather than hierarchical approach to these events, for a sequential analysis eliminates inherited values at Events 3-4 and explicitly includes the relatively large distance that exists between Events 4 and 5. A larger rationale behind this step is that, when faced with puzzling chromatic progressions, listeners seem to find a hierarchical construal more difficult, for common schemas are not triggered. Instead the tendency may be to make sense of each event merely in terms of the previous one. Hierarchical hearing becomes robust once the music becomes diatonic at Event 5.
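The contrast between the two approaches can be made concrete with a small sketch. It reduces the tension rule to pitch-space distance alone, leaving out surface dissonance and attraction: in the hierarchical version an event receives the distance from the event it branches from plus that event's inherited value, while in the sequential version it receives only the distance from the immediately preceding event. The branching pattern and all distance values below are invented for illustration and are not those of Figures 37 or 40; the function names are hypothetical.

```python
def hierarchical_tension(parents, distance):
    """Tension inherited down a tree.

    parents[k] is the index of the event that event k branches from, or None
    for the root. Each event gets distance(parent, k) plus the parent's value.
    """
    cache = {}

    def value(k):
        if k not in cache:
            parent = parents[k]
            cache[k] = 0 if parent is None else distance(parent, k) + value(parent)
        return cache[k]

    return [value(k) for k in range(len(parents))]


def sequential_tension(n_events, distance):
    """Each event is measured only from the immediately preceding event."""
    return [0] + [distance(k - 1, k) for k in range(1, n_events)]


# Invented distances between five events (indices 0-4), for illustration only.
TABLE = {(0, 1): 7, (1, 2): 5, (1, 3): 5, (2, 3): 0, (3, 4): 9, (0, 4): 7}


def distance(a, b):
    return TABLE[(min(a, b), max(a, b))]


parents = [None, 0, 1, 1, 0]  # hypothetical branching, not taken from the figures
print(hierarchical_tension(parents, distance))
print(sequential_tension(len(parents), distance))
```

With these invented numbers, the hierarchical values stay high for the third and fourth events because they inherit from the second, whereas the sequential values drop there and rise at the last event, which is the qualitative pattern the paragraph above describes.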

Figure 40 carries out this approach. The arrows for Events 1-5 represent sequential rather than hierarchical distances. (A slight exception is that Event 3 remains a left-branching anticipation of Event 4.) At Event 5, fragments of a prolongational tree emerge, along the lines of the right-branching treatment of Events 5-7 in Figure 22. A left-branching treatment of Event 5, producing an integrated tree for Events 5-9, as in Figure 11, would yield almost the same tension numbers.

This interpretation yields Figure 41. The predicted curve at the beginning of the phrase comes closer to that of the data because the distance from Event 1 to Event 2 is greater than the distance from Event 2 to Event 4. Similarly, the predicted curve from Event 4 to Event 5 improves because of the relatively large distance between them. The overall fit is strong: R²(2,6) = .83, p = .005, R²adj = .77; p(attraction) = .42, b = .15; p(tension) = .002, b = .88. Attraction does not reach significance, however.

Although Figure 41 is satisfactory, it is possible that a good fit can also be obtained by treating the chromatic Grail theme entirely within diatonic space. Perhaps listeners are so unaccustomed to hexatonic space that they intuitively stick with familiar diatonic space, even if it has to be adjusted to accommodate the chromatic input. Figure 42a gives a hierarchical analysis in which the Roman-numeral designations descend by half step, from E♭ to D to D♭, in parallel with the flattening of Events 2 and 5. The modulation is sufficiently inexplicit that we decided to suppress the i variable in d, leaving distance to be measured by chord root distance (j) and noncommon tones (k). As shown in Figure 43, this analysis fails empirically: R²(2,6) = .30, p = .34, R²adj = .07; p(attraction) = .45, b = .28; p(tension) = .25, b = .44. The outcome bears comparison to the hexatonic-to-diatonic analyses in Figure 37 and the data correlation in Figure 38. Related prolongational variants would yield similar results.


FIGURE 40. Sequential-to-hierarchical, hexatonic-to-diatonic analysis of the chromatic Grail theme.

FIGURE 41. Tension graph of the analysis in Figure 40.



Figure 42b takes a sequential-to-hierarchical interpretation akin to Figure 40, except that this time the analysis takes place not partly in hexatonic space but entirely within modulating diatonic space. Again variable i is inactivated for Events 1-5. The distance calculations are from I to vi in E♭, and vi to IV to ii in D; the latter pivots as ii/D♭ in diatonic space. The strong correlation, shown in Figure 44, is R²(2,6) = .79, p = .0025, R²adj = .72; p(attraction) = .04, b = .28; p(tension) = .0012.

To summarize this rather complicated discussion of the chromatic version of the Grail theme, the analyses that work best with the data are those that treat the chromatic part of the progression sequentially and the diatonic part hierarchically. The correlations for the hexatonic-to-diatonic interpretation (Figures 40 and 41) and the shifting-diatonic interpretation (Figures 42b and 44) are too close to decide between them. The force of the latter, however, is weakened by two theoretical factors: the TPS space-finding method picks hexatonic space for Events 1-5, and the shifting-diatonic interpretation relies on a doubtful suppression of the i variable in d. Even so, the evidence is uncertain whether listeners hear Events 1-5 in hexatonic space or in shifting-diatonic space. (An alternative approach to this passage within a neo-Riemannian framework is presented in Appendix B.)


FIGURE 42. Modulating diatonic analyses of the chromatic Grail theme: (a) hierarchical interpretation; (b) sequential to hierarchical interpretation.

FIGURE 43. Tension graph for Figure 42a.

FIGURE 44. Tension graph for Figure 42b.



Messiaen Quartet

Figures 45-46 give the opening parallel phrases of the fifth movement (“Louange à l’Eternité de Jésus”) from Messiaen’s Quartet for the End of Time. Events are numbered according to melodic and/or harmonic changes. The melody in the original is played on the cello and the repeated chords on the piano, but for the experiment both parts were played on the piano. To shorten the experiment slightly, the lengths of Events 12, 18, 20, 32, 38, and 40 were reduced from half to quarter notes. At a more global level than is shown, Event 31 attaches to Event 11. In the original, Events 39-40 continue into a consequent phrase beginning on IV 11/E; however, the


FIGURE 45. Analysis of the first phrase of Messiaen’s Quartet for the End of Time, V.

FIGURE 46. Analysis of the second phrase of the Messiaen.



subjects heard the music only up to Event 40. It is therefore assumed in the
