RENI:
Real Time Beat Tracking and
Metrical Analysis
Donal Mulvihill
(s0789005)
Master of Science
Artificial Intelligence
School of Informatics
University of Edinburgh
2008
Abstract
This report describes the development of RENI, a performance accompaniment application which
performs Beat Tracking and Metrical Analysis on MIDI signals in real time to provide simple
percussive accompaniment to musical performances. RENI has been developed as a model of real
time human Beat Tracking and as an effort to investigate if note onset information is sufficient to infer
the beat and metre of a piece of drum-less music in real time without any prior knowledge of its beat
or metre. It implements a rule based algorithm which infers Metrical Hypotheses by searching for and
combining series of regularly spaced note onset events in MIDI signals as they are received. These
Metrical Hypotheses are scored according to plausibility metrics and one is selected to form the basis
of percussive accompaniment, which is performed in time with the piece being played. RENI has
been evaluated by comparing its Beat Tracking performance on a test set of MIDI recordings to the
performance of human beat trackers. It has also been trialled and rated by musicians. Although
deficient in certain areas, RENI can successfully infer beat and metre and provide simple performance
accompaniment for rhythmically simple musical pieces and demonstrates potential in a practical sense
as a performance accompaniment tool.
Acknowledgements
I would like to thank my supervisor Alan Smaill, for his guidance and help throughout this project.
Thanks also to David Murray-Rust in the School of Informatics for his help and advice, and to those
who participated in the evaluation of the application.
Declaration
I declare that this thesis was composed by myself, that the work contained herein is my own except
where explicitly stated otherwise in the text, and that this work has not been submitted for any other
degree or professional qualification except as specified.
(Donal Mulvihill)
Table of Contents

Glossary
1.0 Introduction
  1.1 Beat Tracking and Metrical Analysis
  1.2 Computational Performance Accompaniment
  1.3 Hypothesis
  1.4 Aims and objectives
  1.5 Motivation
    1.5.1 Academic motivation
    1.5.2 Practical motivation – the usefulness of an artificially intelligent Beat Tracking application
  1.6 What was achieved during this project
  1.7 Outline of this document
  1.8 A brief digression – Naming the application
2.0 Background
  2.1 Beat
  2.2 Metre
  2.3 Literature Review – Approaches to Beat Tracking
  2.4 How this project fits into the research context
3.0 Design considerations and decisions
  3.1 Requirements
  3.2 Constraints
    3.2.1 Time available
    3.2.2 Hardware and equipment available
    3.2.3 Real-time operation
  3.3 Form of musical input – MIDI
  3.4 Conceptual design decisions
    3.4.1 Beat Tracking cues
    3.4.2 Assumptions made about musical input
    3.4.3 Beat Tracking algorithm
  3.5 Practical and architectural design decisions
    3.5.1 Development language
    3.5.2 Parametrisation of the application
    3.5.3 Capabilities of the application
    3.5.4 Output
    3.5.5 Complementary applications
4.0 Beat Tracking and Metrical Analysis
  4.1 Beat Tracking – The problem and solution
  4.2 Beat Levels
  4.3 Metrical Hypotheses
  4.4 Steps in RENI's Beat Tracking and Metrical Analysis algorithm
  4.5 Architecture and components of RENI
    4.5.1 Timer
    4.5.2 Instrument
    4.5.3 RENI (Main application component)
    4.5.4 Beat Levels
    4.5.5 Interpreter
    4.5.6 Judges
    4.5.7 Drummer
    4.5.8 Parameters
5.0 RENI's Beat Tracking algorithm
  5.1 Accepting and processing input
    5.1.1 Creating RENI Events
    5.1.2 Detecting Chords
  5.2 Setting the search space – Spawning Beat Levels
  5.3 Extending Beat Levels
    5.3.1 Extending a Beat Level – Adding a new event
    5.3.2 Choosing between events
    5.3.3 Ghost events
  5.4 Hypothesising
    5.4.1 Consolidation
    5.4.2 Hypothesising
    5.4.3 Deciding
  5.5 Ranking and selecting Metrical Hypotheses
  5.6 Producing output
  5.7 Re-hypothesising
  5.8 Parameters
6.0 Evaluation
  6.1 Aim of evaluation
  6.2 Difficulties in evaluating Beat Tracking applications
  6.3 Quantitative functional evaluation
    6.3.1 Test Data
    6.3.2 Data from RENI
    6.3.3 Comparing RENI's output to the annotations
  6.4 Subjective evaluation
  6.5 Results
    6.5.1 Quantitative functional evaluation results
    6.5.2 Subjective evaluation results
  6.6 Observations and analysis of the results
  6.7 Comparison with other Beat Tracking applications
7.0 Discussion
  7.1 Analysis
    7.1.1 Capabilities of RENI
    7.1.2 Aims and objectives
    7.1.3 Hypothesis
  7.2 Further work on RENI
  7.3 Directions for future research
  7.4 Conclusion
BIBLIOGRAPHY
APPENDIX
  A.1 Evaluation corpus
  A.2 Quantitative functional evaluation results
  A.3 Qualitative evaluation questionnaire
  A.4 Timeline
Glossary
This report discusses the development of a musical accompaniment system and as such, uses a
number of musical terms. It is assumed that the reader has some basic knowledge of musical
terminology (e.g. what a note, volume or pitch is). This section describes the meaning of terms
which may be unfamiliar to some readers or whose meaning in this report differs from the meaning of
the term as more commonly encountered.
Bar – in music or musical notation, is equivalent to a Measure (see below). It is a duration in a
musical piece defined by a given number of beats of a given duration.
Beat – a regularly spaced pulse or unit of time in a piece of music. It is usually indicated by tapping
along to a piece of music. See section 2.1.
Event – an Event (or Musical Event) in this report refers to the playing of a musical note. The terms
note and event are used interchangeably in this report.
Measure – is equivalent to a Bar (see above). This term is used more frequently in this report. May
also be used in the context of the Measure Beat Level which indicates the beats which denote the start
of a Bar/Measure.
Metre – determined by the number of Tactus level beats of a given duration which make up a
Measure. It is indicated by a time signature.
RENI – the Beat Tracking application described in this report. See section 1.8 for an explanation of
the name.
Salience – In a musical context, when describing a note, Salience refers to the extent to which a note
is distinguishable or stands out relative to other notes. This may be due to its volume, duration or
pitch.
Strong Beat – also known as a Down Beat, refers to those beats which occur at, or indicate the start
of a musical Bar or Measure.
Syncopation - includes a variety of rhythms which are in some way unexpected in that they deviate
from the strict succession of regularly spaced strong and weak beats in a metre.
Tactus – the rate at which one taps along to a musical performance. The number of Tactus beats in a
Measure determines the time signature of a musical piece.
Tatum – a subdivision of the Tactus Beat Interval. Not of direct concern in this project.
Time Signature – indicates (or may refer to) the metre of a piece of music by expressing the number
of beats that constitute a Measure and the note value/duration which constitutes a beat. Time signatures
are written as a fraction. For example, the time signature 4/4 for a musical piece indicates that 4 beats
of quarter note duration make up a Measure.
Weak Beat – the beats in a musical performance which are not strong beats and which do not indicate
the start of a Measure.
1.0 Introduction
This report describes the development of RENI, a Beat Tracking and Metrical Analysis application
and percussive performance accompaniment tool which attempts to model the ability of a human
musician (or drummer) to provide simple percussive accompaniment to music performed in real time.
1.1 Beat Tracking and Metrical Analysis
Beat Tracking is the task of identifying and synchronising with the basic rhythmic pulse of a piece of
music. It is analogous to a person tapping their feet or clapping their hands in time with music.
Metrical Analysis is the task of inferring the Metre of a piece of music. This task can be viewed as an
extension of Beat Tracking or as Beat Tracking at multiple levels. It involves organising beats into
groups or organising beats hierarchically, and is also concerned with identifying and distinguishing
between strong and weak beats. Inferring the metre of a musical piece is analogous to identifying its
time signature; an attribute of a musical piece which informs us of its temporal structure and
organisation. In terms of simple performance accompaniment, it influences how the piece is tapped
along to.
1.2 Computational Performance Accompaniment
Computational Performance Accompaniment is the participation by computers (or computer
applications) in musical performance alongside human performers.
Musical accompaniment systems attempt to emulate the task that a human musical accompanist
performs: supplying a missing musical part, generated in real time, in response to the sound input
from a live musician (Raphael, 2003). RENI is a performance accompaniment system in that it
attempts to supply the percussive part of a musical performance in response to the playing of live
music.
1.3 Hypothesis
The hypothesis under consideration in this project is:
“Knowledge of note onset information is sufficient for computationally inferring the
beat and metre of a piece of improvised drumless music in real time, using a rule
based approach without any prior knowledge of metre or style, for the purposes of
providing simple percussive performance accompaniment”
Essentially, this project investigated the feasibility of developing an application based on a rule based
algorithm that performs Beat Tracking and Metrical Analysis on a piece of musical performance as it
is being played, by only analysing information on the time that each note in the performance was
played.
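The reduction described above can be sketched in a few lines. This is an illustrative fragment, not RENI's actual code: the tuple-based message format (time, status, pitch, velocity) is an assumption standing in for a real MIDI library type, but the filtering logic follows the standard MIDI convention that a note-on with velocity 0 acts as a note-off.

```python
# Reduce incoming MIDI messages to the only information the hypothesis
# requires: the onset time of each note actually played.
NOTE_ON = 0x90

def onset_times(messages):
    """Keep only the times of note-on events with non-zero velocity."""
    return [t for (t, status, pitch, vel) in messages
            if status & 0xF0 == NOTE_ON and vel > 0]

messages = [
    (0, 0x90, 60, 100),    # note on, C4
    (480, 0x80, 60, 0),    # note off
    (500, 0x90, 64, 90),   # note on, E4
    (1000, 0x90, 67, 0),   # "note on" with velocity 0 acts as a note off
]
print(onset_times(messages))  # [0, 500]
```

Everything else carried by the signal (pitch, velocity, duration) is discarded; the hypothesis is that this onset stream alone suffices for beat and metre inference.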
1.4 Aims and objectives
Motivated by the intention to investigate the hypothesis stated in section 1.3, this project had a
number of aims and objectives.
The primary aim of the project was to build an application to track beats and infer the metre of an
improvised musical performance in real time using a rule based approach, for the purpose of
providing percussive accompaniment. The musical performance would be improvised from the
perspective of the application in that it would have no prior knowledge of the tempo, metre or style of
the performance.
In essence, this project aimed to develop a drum machine that would emulate the ability of a human
percussionist to provide simple accompaniment to an improvised piece of music in real time. This
drum machine would accept (in effect, listen to) musical signals played in real time by a musician and
produce appropriately synchronised percussive accompaniment.
Through investigating the hypothesis and developing the drum machine, this project also sought to
determine additional musical heuristics, approaches and techniques that use note onset information,
which could be used to successfully track the beats and infer the metre of musical signals in real time.
The emphasis of the project on performance accompaniment inspired further objectives.
● Given that the application would be interacting with real life performers whose playing may
be imperfectly timed, this project aimed to investigate how a Beat Tracking algorithm could
cope with these imperfections and still correctly infer beat and metre.
● Assuming that the application views the performance as improvised, it is conceivable that the
timing of the performance could vary or change over time. This project therefore sought to
develop a Beat Tracking algorithm that could cope with variations in timing and respond to
them appropriately.
● This project also aimed to gauge the experience and views of musicians on being
accompanied by a Beat Tracking application, particularly when the application attempted to
react to timing variations in their performance.
1.5 Motivation
This project is motivated by interest in a long-standing Artificial Intelligence problem, solutions (or
attempted solutions) to which may be useful in a practical context.
1.5.1 Academic motivation
The tasks of Beat Tracking and Metrical Analysis are interesting from a psychological perspective.
Perception of rhythm and beat is one of the most basic activities of musical cognition. Large (1995)
describes the performance of these tasks as “musical common sense”.
While tracking beats and inferring metre in a musical piece is easy and natural for a human listener to
perform, even one without any musical training, it is a computationally difficult task to perform and
remains a formidable AI problem. Many of the models of computational Beat Tracking suggested and
described in the literature still fall short of human Beat Tracking ability (McKinney, Moelants, Davies
and Klapuri, 2007).
Large (1995) suggests two sources of difficulty in attempting to model human Beat Tracking ability:
● Systematic Timing variations in musical performance – the tendency for musicians to use
timing variation to communicate musical intentions.
● Rhythmic complexity in musical performance – the presence of syncopation, lack of salient
periodicity, or human performance errors.
The open ended nature of the problem may also be viewed as a source of difficulty. Different sets of
assumptions may be adopted in approaching the problem and the output produced by a Beat Tracking
application for a particular performance cannot be viewed as absolutely correct. Two listeners may
track beats in the same performance differently. Beat Tracking is not solely a product of rhythmical
pattern but rather of pattern and listener together (Eck, 2001).
1.5.2 Practical motivation – the usefulness of an artificially intelligent Beat Tracking application
According to Rosenthal (1992), a computational beat tracker would be useful as it would greatly
enhance the ability of computers to participate intelligently in, and transcribe, live musical
performance.
A fully general, automatic real time beat tracker would be of great value in many applications such as
music-synchronized Computer Generated animation, music transcription, music editing and
synchronization, and musicological studies (Allen and Dannenberg, 1990).
In a performance accompaniment context, a real time beat tracker such as that envisaged in this
project would open up new possibilities for computers in music as it would allow a computer to
synchronize with music external to it without the use of explicit synchronization information. Some
musicians contend that it is much harder to play in an ensemble and follow a collective tempo than it
is to set their own tempo and require other musicians to follow. A real time beat tracker would solve
this problem and allow an outside agent (e.g. a human performer) instead of the computer to control
the performance tempo (Allen and Dannenberg, 1990); thus transforming the computer from a tempo
setter to a tempo follower.
A real time system would also have some commercial potential. An example of a similar commercial
system is Circular Logic (http://www.circular-logic.com/). Circular Logic synchronises with the
tempo of musical input for performance accompaniment but does not perform metrical analysis.
1.6 What was achieved during this project
During this project, RENI, a Beat Tracking and Metrical Analysis application that provides percussive
performance accompaniment to improvised musical performances in real time, was developed. RENI
is based on a rule based algorithm that analyses note onset times in musical signals as they are
performed. RENI can also be described as a model of real time human Beat Tracking and Metrical
Analysis.
RENI's ability to track beats was evaluated by comparing the output it produced when tracking beats
in musical recordings to annotations produced by musicians, indicating their interpretation of beat
locations in the same recordings. RENI was also trialled by a number of musicians who evaluated its
ability to track beats and provide percussive performance accompaniment.
Although the application performs its primary function and produces output that can be perceived and
assessed, it runs in a development environment. Further work is required to make RENI a fully
fledged and user friendly application that could be distributed and used by a wide audience.
The source code of RENI has been submitted in conjunction with this report and is available upon
request.
1.7 Outline of this document
Following this introduction to the report and project, the remainder of this document is structured as
follows:
● Section 2 - explains the concepts of beat and metre, reviews previous research on Beat
Tracking and Metrical Analysis paying particular attention to the variety of approaches put
forward in literature and discusses how this project fits into the research context.
● Section 3 – outlines the scope of the project by describing project requirements, constraints
and important conceptual and practical design decisions.
● Section 4 – discusses Beat Tracking and Metrical Analysis from a computational perspective
in greater detail. It describes RENI's view of the Beat Tracking problem and the principles
underlying its approach to solving it. The steps in RENI's approach to Beat Tracking and
Metrical Analysis are discussed and the architecture of RENI as an application is described.
● Section 5 – RENI's Beat Tracking and Metrical Analysis algorithm is described and illustrated
in detail.
● Section 6 – The Evaluation of RENI is described in detail and the results of the evaluation are
presented and assessed.
● Section 7 – A discussion and critical analysis of what was achieved in this project. Further
work and directions for future research are also outlined.
1.8 A brief digression – Naming the application
The Beat Tracking application described in this report is provisionally named RENI. RENI is not an
acronym. It is the nickname of one of the author's favourite drummers, Alan Wren of the now-defunct
Manchester-based band, the Stone Roses.
Despite the best efforts of this project, Alan Wren (aka Reni) is much better at providing real time
percussive performance accompaniment than RENI.
2.0 Background
This section offers some background to the project. The central concepts of beat and metre in music
are defined. The relevant literature is also reviewed and some of the important approaches to Beat
Tracking and Metrical Analysis suggested in the literature are discussed. How the work in this
project fits into the research context and relates to previous approaches is also considered.
2.1 Beat
The term beat refers to a regularly spaced pulse in a musical performance which can be
perceived and indicated by a human tapping along to it. For most musical performances, beat can be
defined as sounds that are perceived as being equally spaced in time. This defines a tempo for the
music. The beat for a particular musical piece can be described according to two attributes:
● Period – the time duration between successive beats
● Phase – the time when a beat (or beats) occurs relative to the start of a musical performance.
For musicians, the beat is a central issue in time keeping in musical performance. For non-experts too,
the process seems fundamental to the perception of tempo and to the processing, coding and
appreciation of temporal patterns. Furthermore, it determines the relative importance of notes in, for
example, the melodic and harmonic structure (Desain and Honing, 1999).
As alluded to in the introduction, perception of beat by a human listener can be indicated by tapping
along. This indicates that the listener has abstracted information about the music and is able to predict
when the next beat will occur (Klapuri et al, 2006).
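The period and phase attributes above make beat prediction concrete. The following is a minimal illustrative sketch (not RENI's implementation): given a beat's period and phase, it predicts the time of the next beat at or after a given moment, which is essentially what a listener tapping along is doing.

```python
import math

def next_beat_time(now, period, phase):
    """Predict the time of the next beat at or after `now`.

    period: seconds between successive beats
    phase:  time of the first beat relative to the start of the performance
    """
    if now < phase:
        return phase
    # Number of whole beat intervals elapsed since the first beat.
    elapsed = math.ceil((now - phase) / period)
    return phase + elapsed * period

# With a 0.5 s period (120 BPM) and a first beat at t = 0.1 s,
# the beat following t = 1.30 s is predicted at roughly t = 1.60 s.
print(next_beat_time(1.30, 0.5, 0.1))
```

Knowing only these two numbers, a tracker can extrapolate the beat indefinitely; the hard part, discussed in the following sections, is inferring them from a live signal.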
2.2 Metre
Metre involves grouping, hierarchy and a strong/weak distinction in terms of beats or pulses in a piece
of music (Scheirer, 1998).
Musical metre is a hierarchical structure consisting of beats at different levels as determined by their
period. The periods of beats at larger levels in the hierarchy are integer multiples of the periods of
beats at smaller levels. Each beat at a larger level must coincide with a beat at all the smaller levels.
The most prominent of these levels is the Tactus (Quarter Note Level in the diagram, FIG 1.1 below)
which indicates the tapping rate for a musical piece. The Measure level's period is a multiple of the
Tactus level, and is typically related to the length of a rhythmic pattern in a piece of music. The Tactus
beats which coincide with the start of a Measure are known as strong beats.
FIG 1.1 – Metrical Hierarchy – taken from Goto (2001)
The size of the Tactus period relative to the Measure period determines how many Tactus beats make
up a Measure in a musical piece. This is the simplest way to think of metre and is the basis upon
which it is expressed in a time signature. A time signature is expressed as a fraction and indicates the
number of beats which make up a measure and the length of these beats. For example, a 4/4 metre for
a musical piece indicates that 4 quarter note beats make up a Measure. A 2/4 metre indicates that 2
quarter note beats make up a Measure.
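The relationship between the Tactus and Measure levels can be made concrete with a short sketch. This is an illustrative Python fragment under the simplifying assumptions of a fixed tempo and zero phase; the function name and parameters are hypothetical, not part of RENI.

```python
def metrical_grid(tactus_period, beats_per_measure, phase=0.0, n_measures=2):
    """Generate tactus- and measure-level beat times for a simple metre.

    The measure level's period is an integer multiple (beats_per_measure)
    of the tactus period, so every measure beat coincides with a tactus beat.
    """
    n_tactus = beats_per_measure * n_measures
    tactus = [phase + i * tactus_period for i in range(n_tactus)]
    # Strong beats: every beats_per_measure-th tactus beat starts a measure.
    measures = tactus[::beats_per_measure]
    return tactus, measures

# A 4/4 metre at 120 BPM (tactus period 0.5 s): strong beats every 2.0 s.
tactus, measures = metrical_grid(0.5, 4)
print(tactus)    # [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
print(measures)  # [0.0, 2.0]
```

Changing `beats_per_measure` from 4 to 2 would model a 2/4 metre: the tactus grid is unchanged, but strong beats fall twice as often, which is exactly the distinction a metrical analyser must recover.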
2.3 Literature Review – Approaches to Beat Tracking
Many approaches and computational models of Beat Tracking have been proposed in the literature. These
approaches vary in a number of respects:
● Rule based or alternative approaches (neural nets, oscillators etc.) used
● Format of musical input used – Audio or MIDI, a protocol that allows electronic instruments and computers to communicate with each other (see section 3.3)
● Real time or off-line operation
1 MIDI is a protocol that allows electronic instruments and computers to communicate with each other – see section 3.3
● Whether or not Beat Tracking is performed at multiple levels (equivalent to Metrical
Analysis).
● The assumptions underlying the Beat Tracking approach (metre/style known, etc.)
● The musical cues and information used for Beat Tracking.
Many of the early models were based on a set of rules which examined note onset times and inter-
onset intervals to infer a beat structure. In addition to rule-based and symbolic search models,
optimization, neural nets, and coupled oscillator systems have been used extensively (see Desain and
Honing, 1994 for an overview of these models).
These models implicitly address different aspects of the beat-induction process. For instance, some
models explain the formation of a beat concept in the first moments of hearing a rhythmical pattern
(initial beat induction), some model the tracking of the tempo once a beat is given, and others cover
beat induction for cyclic patterns only (Desain and Honing, 1999).
Although many models have been proposed and discussed in the literature, I will give only a brief
outline in this section of the approaches I deem most relevant to the aims of this study.
One of the earliest and most frequently cited approaches to metrical analysis is that of Rosenthal
(1992), who proposed a system called Machine Rhythm to emulate human rhythm perception for
piano performances presented as MIDI files. This system parses MIDI data into events representing
note onset times, then searches these events for series of regularly spaced onset times, each of which
represents a potential rhythmic level. After finding all potential rhythmic levels, the program looks for
sets of levels that may be organised into families, each representing a possible rhythmic parsing. The
program then ranks the hypotheses according to criteria corresponding to the ways in which human
listeners choose rhythmic representations.
Large (1995) also describes a system which analyses note onset times. He proposes a mechanism of
Beat Tracking for complex, metrically structured rhythms which involves the entrainment of a non-linear
oscillator to an incoming signal in the form of impulses corresponding to note events. This signal
serves as a driver, perturbing both the phase and period of the oscillator. The oscillator adjusts its
phase and period only at certain times during its cycle, thereby isolating and tracking a periodicity in
the incoming rhythm. The perception of beat is modelled by the generation of an event at a particular
phase of the oscillator's cycle.
Eck (2002) proposes a model of beat induction that uses a Spiking Neural Network (SNN) to
synchronize with music. Input is presented to the network as voltage spikes obtained from a MIDI
representation of music, either from a MIDI file or in real time from a MIDI musical instrument.
Neurons in the SNN are initialized with a range of frequencies suitable for rhythm. When exposed to a musical
signal, clusters of neurons begin to fire in synchrony with periodic events in the signal. In many cases
these clusters gravitate to metrically important events, including downbeats. These spike onsets can
then be transformed into musical events.
While many approaches process MIDI signals, much of the most significant recent work has
concentrated on Beat Tracking for audio signals. Goto's (2001) proposal is significant as it achieves
reasonable metre analysis accuracy for audio signals in real time. The system can recognize the
hierarchical beat structure comprising the quarter-note level (almost regularly spaced beat times), the
half-note level, and the measure level (bar-lines). However, it assumes that the time signature of an
input song is 4/4 and that the tempo is roughly constant, corresponding to the most common structure
of Western-style music.
Goto (2001) identifies the main issues in recognizing the beat structure in real-world musical acoustic
signals as being:
1. the detection of beat-tracking cues
2. interpreting the cues to infer the beat structure, and
3. dealing with the ambiguity of interpretation.
The system uses note onsets, chord changes and drum patterns as Beat Tracking cues to infer the
beat structure of a piece of music. It is based on a multi-agent architecture in which multiple agents
track competing metre hypotheses.
Another approach which involves metrical analysis of audio signals (although not in real time) is
proposed by Klapuri et al. (2006). Their method, which is not limited to any particular music style,
analyses musical metre jointly at three time scales: the temporally atomic tatum pulse level, the
beat (or Tactus) level, and the musical measure level. The algorithm uses time-frequency analysis
to calculate a driving function at four different frequency ranges. This is followed by a bank of comb
filter resonators for periodicity analysis, and a probabilistic model that represents primitive musical
knowledge and uses the low-level observations to perform joint estimation of the tatum, Tactus, and
measure pulses. Both causal and non-causal versions of the method are described in Klapuri et al.
(2006). The causal version generates beat estimates based on past samples, whereas the non-causal
version performs (Viterbi) backtracking to find the globally optimal beat track after hearing the entire
excerpt, thus improving tracking accuracy.
The final approach of interest in this study is BeatRoot, as described by Dixon (2007). Like
Klapuri et al.'s (2006) model, BeatRoot was evaluated as part of a Beat Tracking contest at MIREX
2006 (McKinney et al., 2007). The driving function of BeatRoot is a pulse train representing event
onsets derived from a spectral flux difference function. Periodicities in the driving function are
extracted through an all-order inter-onset interval analysis and are then used as input to a multiple-agent
system to determine optimal sequences of beat times. This is described in Dixon (2007), and a
full implementation with source code has been made available.
2.4 How this project fits into the research context
In terms of the defining attributes of Beat Tracking models listed in the previous section, this
project may be characterised as:
● adopting a rule-based approach
● based on MIDI input (see section 3.3 for an explanation of MIDI)
● operating in real time
● performing Beat Tracking at multiple levels; Metrical Analysis is therefore performed
● making no assumptions about the nature of the musical input. The model assumes no prior
knowledge of metre or style and views the musical performance that it is accompanying as
improvised
● using only note onset information for Beat Tracking. No other musical knowledge or
representation of musical knowledge is used
20
2.0 Background
This project represents a return to the early rule-based, symbolic search approaches based on MIDI
input. It is notable in that it aims to create a rule-based model of real-time Beat Tracking and Metrical
Analysis, whereas its most similar equivalent and main point of reference (see section 3.4.3), Machine
Rhythm (Rosenthal, 1992), operates off-line.
This project is also characterised by the very general and open-ended specification of the problem (it
is possibly approached here in its most open-ended form). Whereas previous models make assumptions
about the nature of the musical input (such as Goto's (2001) assumption that all musical input is of a 4/4
metre), this project approaches the problem with no prior assumptions about the musical input in
terms of style or metre, treating it as essentially improvised.
Also of significance is the emphasis on performance accompaniment and the consideration of the
attributes that typify performance accompaniment scenarios. Important in this regard is the aim of
gauging the experience of musicians when accompanied by the final Beat Tracking application.
3.0 Design considerations and decisions
This section discusses the factors influencing the scope of the project and their impact on the design
of the application being developed. The requirements for the project are discussed and constraints on
the project are specified. The conceptual and practical design decisions are then discussed and
justified.
3.1 Requirements
Arising from a consideration of the hypothesis under investigation and the aims and objectives of the
project as specified in section 1.4, the requirements for RENI were specified:
● RENI must be capable of accepting musical performance data from a file or an external
musical device/instrument in real time.
● The application should infer note onset information from this real time musical input.
● RENI must implement a rule based algorithm which uses note onset information to track the
beat and infer the metre of the piece of music being performed.
● RENI must produce audible percussive accompaniment to the musical input as it is being
played.
● RENI must also produce textual output for use in its evaluation.
● RENI must be fully operable on a standard personal computer or laptop so as to be usable by a
wide audience.
3.2 Constraints
The scope of the project was bound by a number of constraints, some of a practical nature and others
arising from the requirements specified.
3.2.1 Time available
The time available to complete this project was short: 15 weeks (May 12th – August 22nd). The full
time line of the project is included as a Gantt chart in the Appendix to this report.
3.2.2 Hardware and equipment available
No funds or equipment were made available for this project. Development and evaluation of the
application were carried out on a MacBook laptop. This was a determining factor in the requirement
that the application be operable on a standard machine.
A two-octave MIDI keyboard, the M-Audio Oxygen 8 V2 USB Keyboard, was also purchased and
used in development and evaluation.
3.2.3 Real-time operation
The development hardware available, and the requirement for the application to track beats in real
time on a standard laptop or personal computer, constrained the complexity and sophistication of the
underlying Beat Tracking algorithm. A data- and processing-intensive Beat Tracking algorithm that
was not operable on a standard machine would make RENI of little practical use to a wide
audience.
3.3 Form of musical input – MIDI
The choice of musical signal format that RENI would accept was the most significant design decision
from both a conceptual and a practical perspective. Previous approaches to Beat Tracking and Metrical
Analysis operate on musical signals received in either MIDI or audio format. It was decided at a very
early stage in the project that RENI would accept musical input in MIDI format.
MIDI (Musical Instrument Digital Interface) is a protocol that allows electronic instruments and
computers to communicate with each other. It transmits digital messages to devices instructing them
to play notes or to change parameters such as volume and tempo. It represents the playing of musical
notes symbolically as events (such as Note On events and Note Off events), encoding values such as
the pitch of the note played and its volume (or velocity in MIDI terminology). MIDI data can either
be received as a stream as it is played on a MIDI instrument, or can be stored in a file and read by a
MIDI sequencer. A full specification for the MIDI protocol can be found at http://www.midi.org/
Audio encodes musical data as wave signals, typically stored in a file format such as WAV or
MP3. Approaches to Beat Tracking using audio involve a considerable amount of signal processing to
infer note onset information. Although the initial preference in this project was to develop an audio-based
application, as this would allow RENI to work with a greater variety of instruments, the MIDI
format was opted for. The amount of signal processing required for audio would greatly lengthen the
implementation time and would distract from the primary aim of developing an artificially intelligent
accompanist.
The manner in which the MIDI protocol symbolically represents notes makes it a more appropriate
format for a symbolic, rule-based approach to Beat Tracking. It makes information about the music
explicitly available and easily extractable, thereby lending itself well to being manipulated
computationally.
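As an illustration of how readily note onset information can be extracted from MIDI, the following Java Sound API sketch records onset times from incoming Note On messages. The class and method names are hypothetical, not RENI's actual code.

```java
import javax.sound.midi.MidiMessage;
import javax.sound.midi.Receiver;
import javax.sound.midi.ShortMessage;
import java.util.ArrayList;
import java.util.List;

public class OnsetReceiver implements Receiver {
    private final List<Long> onsetTimesMs = new ArrayList<>();
    private final long startTimeMs = System.currentTimeMillis();

    /**
     * Pure helper so the onset test can be checked without MIDI hardware.
     * A Note On with velocity 0 is conventionally a Note Off, so only
     * velocities greater than 0 count as onsets.
     */
    public static boolean isOnset(int command, int velocity) {
        return command == ShortMessage.NOTE_ON && velocity > 0;
    }

    @Override
    public void send(MidiMessage message, long timeStamp) {
        if (message instanceof ShortMessage) {
            ShortMessage sm = (ShortMessage) message;
            if (isOnset(sm.getCommand(), sm.getData2())) {
                // Record the onset as time elapsed since listening began.
                onsetTimesMs.add(System.currentTimeMillis() - startTimeMs);
            }
        }
    }

    @Override
    public void close() { }

    public List<Long> getOnsetTimesMs() { return onsetTimesMs; }
}
```

A `Receiver` of this kind can be attached to a `Transmitter` obtained from a MIDI input device, at which point onset times accumulate as the performance is played.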
Nonetheless, it should be noted that the choice of MIDI over Audio as the format of musical input is
not entirely without its drawbacks:
● MIDI files for use in the evaluation of the application (see section 6) are not as widely
available as audio files (encoded in either WAV or MP3 formats).
● The external instruments that RENI can accompany are limited to MIDI devices.
● The sound quality of MIDI playback is inferior to that of audio.
● It could be argued that a real time MIDI beat tracker is of less practical use than an audio
based equivalent.
3.4 Conceptual design decisions
A number of design decisions pertained to the operation of RENI's Beat Tracking algorithm and the
manner in which it models human Beat Tracking.
3.4.1 Beat Tracking cues
Goto (2001) identifies the first issue in Beat Tracking as the inference of Beat Tracking cues. In
Goto's (2001) model, these cues were note onset information, chord change information, and drum
sounds.
As the hypothesis under investigation suggests, note onset information was the primary Beat Tracking
cue to be used. The choice of MIDI as the format of musical input greatly simplified the inference of
note onset information. MIDI messages also encode information on the pitch and volume of the notes
being played. Due to the focus on note onsets in the hypothesis under investigation and the time
constraints, it was decided not to place any emphasis on chord change cues, which may be inferred
from pitch data. However, it was decided that information on the volume of notes would be used to a
limited extent, given that it is encoded directly in MIDI messages and is important in assessing the
salience of notes in Metrical Analysis.
Another significant decision in this respect was the decision to ignore drum sounds as a Beat Tracking
cue and to exclude from the evaluation of RENI any potential performance input which included drum
sounds. RENI is a musical accompaniment system and, according to the definition of
accompaniment offered in section 1.2, it should provide a missing musical piece. It was therefore
logical to adopt the assumption that RENI would be expected to provide accompaniment to music
without drum sounds.
3.4.2 Assumptions made about musical input
As was outlined in section 2.3, models of Beat Tracking vary with respect to the assumptions
they make about musical input. A Beat Tracking model may make assumptions about the following
attributes of the musical performance:
● The metre
● The interval of the beat
● The style of the music being played
As the emphasis of the project was on accompaniment for improvised music, it was decided that
RENI's model of Beat Tracking and Metrical Analysis would make no assumptions about the nature
of musical input.
However, during development it was determined that adding a heuristic indicating the metre of
the music being analysed would be very simple, and that comparing the performance of the
application with and without such a heuristic would be interesting. The heuristic was therefore added
to the application, but its use is optional and it is disabled by default.
3.4.3 Beat Tracking algorithm
It was decided that the most effective and expedient way of developing RENI's Beat Tracking
algorithm would be to base it on an established approach described in the literature. The chosen
algorithm would serve as an inspiration and point of reference for RENI's algorithm and would be
extended and amended to fit RENI's requirements.
A number of the approaches and algorithms described in Section 2.3 were considered. The two most
compelling candidates were those described by Goto (2001) and Rosenthal (1992). Goto (2001) was
considered because this approach performs Metrical Analysis in real time. However, Goto's (2001)
approach was ultimately ruled out as it is based on audio and, in approaching the Beat Tracking
problem, assumes a metre of 4/4.
Rosenthal's algorithm for Machine Rhythm (1992) was ultimately chosen as the base algorithm.
Although it operates off-line, it is a rule based approach based on MIDI input. It is also very clearly
described and adopting it as a base algorithm was an opportunity to assess Rosenthal's (1992) claim
that Machine Rhythm could be easily adapted for real time use.
3.5 Practical and architectural design decisions
A number of design decisions pertained to the development of the application itself and its
capabilities.
3.5.1 Development language
Java was chosen as the development language due to personal familiarity. The Java Sound API offers
excellent MIDI functionality, which made developing the application straightforward.
Developing RENI in Java on Mac OS X did, however, cause some difficulties. The Java Sound API
running on a Mac does not recognise external MIDI devices connected via USB, so RENI
was initially unable to accept music from an external device in real time. Fortunately, this was
remedied using the Mandolane package (see http://www.mandolane.co.uk/), which allows Java on a
Mac to recognise and accept input from MIDI devices connected via USB.
3.5.2 Parametrisation of the application
The operation of the Beat Tracking algorithm (as described in section 5) can vary depending on the
values of certain parameters. These parameters determine, amongst other things, how much leeway the
algorithm allows for timing inconsistencies in performances and how long the application listens to a
performance before attempting to accompany it.
In order that RENI be easily usable under a variety of different settings, it was decided to make these
values adjustable. The relevant parameters may therefore be changed in a Parameters module (see
section 4.5.8).
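A hypothetical sketch of such a Parameters module follows; all names and default values here are invented for illustration and are not RENI's actual settings.

```java
public class Parameters {
    // How much timing deviation (in ms) is tolerated when matching a note
    // onset against a Beat Level's expected interval.
    public static final long TIMING_TOLERANCE_MS = 50;

    // How long (in ms) the application collects inter-onset intervals to
    // create potential Beat Levels.
    public static final long SEARCH_SPACE_WINDOW_MS = 4000;

    // How long (in ms) the application listens overall, extending the
    // postulated Beat Levels, before attempting accompaniment.
    public static final long EXTENSION_WINDOW_MS = 8000;
}
```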
3.5.3 Capabilities of the application
As RENI was being developed, a number of decisions were made about its capabilities as an
application.
Due to time constraints, it was decided that RENI would not resemble a fully fledged application at
the end of the project. It was decided that the time available would be better spent on optimising the
Beat Tracking algorithm instead of developing user interfaces.
As currently constituted, RENI performs its intended tasks as listed in the requirements. However, it
runs in a development environment: it does not provide a user interface and can only be configured by
changing the Parameters module in the source code. Further work will be carried out in the future to
convert RENI into a fully fledged application (see section 7.2).
3.5.4 Output
As the emphasis of this project is on percussive performance accompaniment, RENI indicates the
beats it locates aurally. Different beat sounds are used for strong and weak beats. Some previous Beat
Tracking approaches and experiments (Desain and Honing, 1999) used implements such as tapping
shoes to indicate output. Given the time constraints on this project, this approach was avoided.
In addition to aural output, it was also decided that RENI should provide some textual output. This
textual output records the details of a particular trial (RENI accompanying a musical performance),
including the values of various application parameters and the times at which RENI indicates a beat
(taps). This textual output was used in the evaluation of the project (see section 6).
3.5.5 Complementary applications
In addition to the main RENI application, two complementary applications were developed to
assist in the evaluation of RENI:
● BEAT-REC – records the positions at which a human performer taps along to a piece of music
on a keyboard.
● BEAT-COMP – compares the textual output of BEAT-REC and RENI, determining how
many beats were recognised by both the human performer and RENI on the same piece of
music and producing appropriate statistics.
4.0 Beat Tracking and Metrical Analysis
This section describes the problems central to the tasks of Beat Tracking and Metrical Analysis in
more computational terms and gives a general description of the principles and concepts underlying
RENI's approach to solving them (as implemented in its Beat Tracking algorithm). The steps in this
approach are summarised, and the various conceptual/architectural components of RENI which
perform them are listed and described. This precedes a more detailed description of RENI's Beat
Tracking algorithm in section 5.
4.1 Beat Tracking – The problem and solution
Taking the definition of a beat as a regularly spaced pulse in a musical performance, RENI must
identify the interval of this pulse and the locations of the beats it defines in the performance.
Extending this task into the sphere of Metrical Analysis, RENI must determine how the
regularly spaced pulses it identifies relate to each other in order to infer the metre of the piece being
performed. This also involves the identification of strong beats and the intervals between them.
Strong beats (which should be regularly spaced) denote the beginnings of measures.
When performed off-line this task involves identifying evenly spaced beat locations in a recording of
a musical performance in its entirety. When performing Beat Tracking in real time in order to provide
accompaniment, the locations of beats in the musical performance must be predicted before they
occur. The Beat Tracker must therefore identify the location of beats in an elapsed segment of the
performance (while it is ongoing) and on the basis of these locations and the intervals between them,
infer the location of future beats.
Goto (2001) identifies two of the main issues in recognizing the beats in music as being:
1. detecting Beat Tracking cues
2. interpreting these cues to infer the beat structure
Consistent with the hypothesis under investigation (see section 1.3), RENI treats note onset times as
Beat Tracking cues and interprets note onset information as it is received in order to infer the beat and
metre of a musical performance.
Like previous models of Beat Tracking (Rosenthal, 1992; Desain and Honing, 1999), RENI's
model is based on a set of rules which examine the inter-onset intervals between the events in a musical
performance. Events in this context refer to notes (either on an instrument or from a file) being
played. Musical events occur at a particular onset, or point in time, relative to each other. The principle
underlying approaches based on the analysis of inter-onset intervals is that beats are indicated by, and
their occurrence coincides with, regularly spaced musical events.
In accordance with this principle, RENI searches for regularly spaced note onsets within a
segment of a musical performance as the notes in the segment are being played. Each series of
regularly spaced note onsets is represented by RENI as a Beat Level. Once potential Beat Levels are
found, RENI organises them into Metrical Hypotheses. RENI assumes that the metrical structure
of the performance in the segment that has been searched is indicative of the metrical structure of the
performance in the future. Once Beat Levels have been found and a Metrical Hypothesis has been
inferred, RENI projects the Metrical Hypothesis into the future in order to predict the future locations
of strong and weak beats. It then produces accompaniment to coincide with the beat locations it has
predicted.
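The projection step described above can be sketched in Java as follows; the class and method names are hypothetical and not taken from RENI's implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class BeatProjector {
    /**
     * Given the onset time of the last observed beat and the Beat Level's
     * average interval, predicts the next `count` beat times by projecting
     * the interval forward from the last beat.
     */
    public static List<Long> projectBeats(long lastBeatMs, long intervalMs, int count) {
        List<Long> predicted = new ArrayList<>();
        for (int k = 1; k <= count; k++) {
            predicted.add(lastBeatMs + k * intervalMs);
        }
        return predicted;
    }
}
```

For example, a last beat at 1000 ms with a 500 ms interval projects to beats at 1500, 2000, 2500 ms, and so on; the accompaniment is then scheduled to coincide with these predicted times.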
Before discussing the steps in RENI's Beat Tracking algorithm, it is necessary to describe the concept
of Beat Levels and Metrical Hypotheses in more detail.
4.2 Beat Levels
RENI represents a series of regularly spaced note onsets as a Beat Level. There may be several
identifiable Beat Levels in a musical performance. A Beat Level consists of musical events whose
onsets occur at a regular interval within a musical performance; it represents a way in which someone
could tap along to the performance.
A Beat Level is therefore defined by the regular interval (measured in milliseconds) between the
onsets of the events that the Beat Level contains. It is also defined by the onsets of the events
themselves, as these indicate the locations in time at which one would tap.
In Fig 4.1 below, the Beat Level is represented as arches between musical events (notes in a
performance, represented as vertical lines).
Fig 4.1 – A Beat Level consists of a number of regularly spaced musical events (notes)
Theoretically and ideally, all the notes in a Beat Level are regularly spaced. It is impractical, however,
to expect all the inter-onset intervals between consecutive notes in a Beat Level to be exactly
equal, as inconsistencies may occur due to performer error. Therefore, for practical purposes, the
intervals between consecutive notes in a Beat Level in RENI need only be approximately equal. The
Beat Level is then defined by the locations of the events in it and the average interval between
consecutive notes.
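A minimal Java sketch of this notion of a Beat Level follows, assuming an invented tolerance-based extension rule; this is an illustration, not RENI's actual code.

```java
import java.util.ArrayList;
import java.util.List;

public class BeatLevel {
    private final List<Long> onsetsMs = new ArrayList<>();
    private final long toleranceMs;

    public BeatLevel(long firstOnsetMs, long secondOnsetMs, long toleranceMs) {
        onsetsMs.add(firstOnsetMs);
        onsetsMs.add(secondOnsetMs);
        this.toleranceMs = toleranceMs;
    }

    /** Average interval between consecutive onsets; this defines the level. */
    public long averageIntervalMs() {
        long span = onsetsMs.get(onsetsMs.size() - 1) - onsetsMs.get(0);
        return span / (onsetsMs.size() - 1);
    }

    /**
     * Extends the level if the new onset falls roughly one interval after
     * the last onset (within the tolerance); returns true if it was added.
     */
    public boolean tryExtend(long onsetMs) {
        long expected = onsetsMs.get(onsetsMs.size() - 1) + averageIntervalMs();
        if (Math.abs(onsetMs - expected) <= toleranceMs) {
            onsetsMs.add(onsetMs);
            return true;
        }
        return false;
    }

    public int size() { return onsetsMs.size(); }
}
```

Each accepted onset updates the average interval, so small performer errors are absorbed rather than accumulated.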
4.3 Metrical Hypotheses
A Metrical Hypothesis indicates a hypothesised metre and the constituents of the musical performance
which imply it. A Metrical Hypothesis may be thought of as being composed of two or more Beat
Levels and is represented in RENI as a collection of Beat Levels. For two Beat Levels to be
legitimately combined to form a Metrical Hypothesis, the interval of the Beat Level with the larger
interval must be an integer multiple of the interval of the smaller Beat Level. The diagram below,
FIG 4.2, illustrates how two Beat Levels, BL1 and BL2, combine to form a Metrical Hypothesis, MH1.
Fig 4.2 – The interval of BL1 is an integer multiple of the interval of BL2. The two Beat Levels can
therefore be combined to form the Metrical Hypothesis MH1.
One of these levels is the Tactus, which denotes the locations at which one taps along. The other level
is the Measure, which indicates the locations of strong beats. These strong beats coincide with events
which are relatively more salient.
The interval of the Measure is an integer multiple of the interval of the Tactus. The relative sizes of the
intervals of the Tactus and Measure Beat Levels in the Metrical Hypothesis determine the number of
Tactus beats in a measure, which is the basis of the performance's metre and time signature. Different
multiple relationships between the Tactus and Measure levels imply different time signatures. This is
illustrated in the series of diagrams (FIG 4.3, 4.4 and 4.5) below.
Fig 4.3 – Metrical Hypothesis and constituent Beat Levels, indicating a 4/4 metre
Fig 4.4 – Metrical Hypothesis and constituent Beat Levels, indicating a 2/4 metre
Fig 4.5 – Metrical Hypothesis and constituent Beat Levels, indicating a 3/4 metre
As there may be several Beat Levels identifiable in a musical performance, there may be several
combinations of Beat Levels, leading to the inference of several Metrical Hypotheses.
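The combination rule can be sketched as follows; the tolerance-based integer-multiple test is an assumption for illustration (in practice average intervals will rarely be exact multiples) and is not taken from RENI's implementation.

```java
public class HypothesisUtil {
    /**
     * Returns the multiple relating a candidate Measure interval to a
     * candidate Tactus interval (e.g. 4 for a 2000 ms Measure over a
     * 500 ms Tactus), or -1 if no integer multiple of at least 2 holds
     * within the given tolerance.
     */
    public static int findMultiple(long measureIntervalMs, long tactusIntervalMs,
                                   long toleranceMs) {
        if (tactusIntervalMs <= 0 || measureIntervalMs <= tactusIntervalMs) {
            return -1;
        }
        int multiple = Math.round((float) measureIntervalMs / tactusIntervalMs);
        long deviation = Math.abs(measureIntervalMs - multiple * tactusIntervalMs);
        return (multiple >= 2 && deviation <= toleranceMs) ? multiple : -1;
    }
}
```

Two Beat Levels for which `findMultiple` returns 2, 3 or 4 would suggest hypotheses resembling 2/4, 3/4 and 4/4 metres respectively.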
4.4 Steps in RENI's Beat Tracking and Metrical Analysis algorithm
RENI searches for plausible Beat Levels and combines these to form Metrical Hypotheses. The steps
in this process, as implemented by RENI's Beat Tracking and Metrical Analysis algorithm, are:
1. Accept and process musical performance data in real time – MIDI messages are received as
they are played from a MIDI file or a MIDI instrument and converted into an appropriate
representation which exposes note onset times and other important Beat Tracking cues.
2. Set the search space (Search for potential Beat Levels) - a number of potential Beat Levels are
created corresponding to every possible inter onset interval that occurs within a particular
time window (the Search Space Window) at the start of the performance.
3. Extend Beat Levels (Search for plausible Beat Levels) – notes received within an Extension
Window (which includes the Search Space Window) are assessed to determine whether they are
compatible with the postulated Beat Levels. If the interval between the last note in a Beat
Level and a new note is approximately the same as the interval that defines that Beat Level,
then the new note can be added to the Beat Level, thereby extending it into the performance
and increasing its plausibility.
4. Hypothesise – plausible Beat Levels are combined to form Metrical Hypotheses. Beat Levels
whose intervals are an integer multiple of each other and which can be organised into a Tactus-
Measure relationship are combined in a Metrical Hypothesis.
5. Rank Hypotheses – a number of Metrical Hypotheses will be inferred during the Hypothesise
step. RENI scores them according to certain criteria and selects the Metrical Hypothesis with
the highest ranking to form the basis of the percussive accompaniment produced.
6. Produce Output – percussive accompaniment is produced that corresponds to the selected
Metrical Hypothesis and that is appropriately synchronised with the performance.
Depending on its settings, RENI may also perform the following step:
7. Re-Hypothesise – RENI can complete the above steps (1-6) at subsequent points in the
performance and change the accompaniment to account for changes in tempo or metre.
These steps are described in greater detail in Section 5.
4.5 Architecture and components of RENI
Conceptually, RENI is composed of a number of components. Each of these components contributes to
the implementation of RENI's Beat Tracking algorithm and the performance of the tasks summarised
in the previous section.
4.5.1 Timer
The Timer maintains timing information for RENI and is used to record note onset times. The Timer
begins when RENI starts listening to a musical performance. Note onset times are recorded as the
time elapsed between the point at which RENI's Timer starts and the point at which a note is
received.
The Timer also plays an important role in RENI's production of percussive accompaniment. RENI's
Drummer uses timing information to ensure that its accompaniment begins at the right time so as to
be appropriately synchronised with the musical performance.
4.5.2 Instrument
The Instrument component serves as the interface between RENI and the source of musical
performance data (in MIDI format). It also serves as the point where everything MIDI-related
(synthesisers, sequencers, settings for channels) is dealt with, thereby isolating the rest of the
application from MIDI processing and settings.
The Instrument component may read and play MIDI files into RENI or else set up a connection with
an external MIDI instrument and pass input from this instrument to RENI.
4.5.3 RENI (Main application component)
The Main component of RENI co-ordinates the performance of all Beat Tracking and Metrical
Analysis tasks.
The Main component receives MIDI messages from the Instrument component and converts them
into an appropriate representation of note events. It expands the Beat Level search space and then
passes events to these Beat Levels in real time. Once all plausible Beat Levels have been found, the
Main component consults the Interpreter. Once notified by the Interpreter of its selected Hypothesis,
RENI starts the Drummer.
4.5.4 Beat Levels
Beat Levels have already been defined conceptually in section 4.2. As a component in RENI, a Beat
Level is represented as a collection of note events and is defined by the regular interval (or the
average value of this interval) between these events.
Beat Levels are significant as a processing component in RENI as they are responsible for
determining if a note extends them. As note events are created by the Main component within the
Extension Window, they are passed for inspection to each Beat Level. The Beat Level determines
whether the event occurs within a time window following the last event in the Level, at an offset
determined by the defining interval of the Beat Level. If it does, the event may be added to the Beat
Level, thereby extending it.
4.5.5 Interpreter
The Interpreter is responsible for creating Metrical Hypotheses from potential Beat Levels. Once the
Beat Level Extension Window has expired, the Interpreter receives Beat Levels from RENI. It
identifies and discards duplicate Beat Levels and then combines compatible Beat Levels to form
Metrical Hypotheses. The Interpreter gets the Judges to score the Metrical Hypotheses and then
passes the highest ranking Metrical Hypothesis back to the Main RENI module.
4.5.6 Judges
The Judges assess and assign plausibility scores to the inferred Metrical Hypotheses. These scores are
used by the Interpreter to rank Metrical Hypotheses by plausibility. The highest ranking Metrical
Hypothesis is selected to form the basis of the percussive accompaniment produced. RENI currently
uses three Judges. The Timing Judge assigns scores based on the regularity of timing and equality of
intervals in the Beat Levels of a Metrical Hypothesis. The Salience Judge assigns scores based on the
salience of events at the Measure level. The Statistical Judge assigns scores to a Hypothesis based on
the frequency with which Beat Levels with the same interval as the Hypothesis's Tactus Beat Level, or
multiples of that interval, occur.
4.5.7 Drummer
The Drummer takes the selected Metrical Hypothesis from RENI and produces performance
accompaniment consistent with it. Once it determines the soonest appropriate time to start playing so
that its performance will be synchronised with the musical piece being performed, it produces simple
percussive accompaniment that denotes strong and weak beats.
4.5.8 Parameters
The Parameters component stores a collection of values which parametrise the various components
and operations of RENI. These parameters specify the length of the Search Space and Extension
windows, influence how strictly RENI treats imperfect performance timing, and assign weightings to
the scores calculated by Judges when calculating an overall score for a Metrical Hypothesis. They
also determine whether or not RENI uses heuristic information which indicates the metre of a
performance and if RENI continually listens to a performance to infer updated Metrical Hypotheses.
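The parameters described above might be collected in a simple configuration object. The sketch below is purely illustrative: every field name and default value is an assumption, not taken from RENI's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical sketch of the Parameters component; all identifiers and
# default values here are assumptions for illustration only.
@dataclass
class Parameters:
    search_space_window_ms: int = 5000    # length of the Search Space Window
    extension_window_ms: int = 15000      # length of the Extension Window
    int_window_fraction: float = 0.1      # Ideal Next Time Window width as a
                                          # fraction of a Beat Level's interval
    max_hypothesis_levels: int = 3        # depth limit on Metrical Hypotheses
    judge_weights: Dict[str, float] = field(default_factory=lambda: {
        "timing": 0.4, "salience": 0.3, "statistical": 0.3})
    metre_hint: Optional[Tuple[int, ...]] = None  # e.g. (1, 4) to bias to 4/4
    continuous_listening: bool = False    # keep inferring updated Hypotheses
```

Grouping the values in one object mirrors the role the Parameters component plays: every other component reads its tunables from a single place.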
5.0 RENI's Beat Tracking algorithm
This section describes the operation of RENI's Beat Tracking and Metrical Analysis algorithm. The
steps in this process were outlined in section 4.4. In this section, each of these steps is described and
illustrated in greater detail.
5.1 Accepting and processing input
RENI receives performance information in real time as MIDI messages. These correspond to musical
events such as a note being played and are received as notes are played. They include Note On events
which indicate the onset of a note (a key being pressed) and Note Off indicating the offset of a note
(the same key being released). The MIDI messages received by RENI encode information on the type
of event, the pitch of the note (the key pressed) and volume of the note.
These MIDI messages must be parsed and converted into a representation of a musical event that will
allow RENI to search for regularly spaced onset intervals. The Beat Tracking cues used by RENI
must therefore be inferred as these messages are received. RENI must also distinguish between
different types of events.
5.1.1 Creating RENI Events
Once RENI receives a MIDI message indicating a Note On event, it converts and stores it as a RENI
Event. This representation incorporates the pitch and volume information of the corresponding MIDI
message and also a timestamp which indicates the point in time at which the MIDI message was
received.
Once the Message is received, RENI references the Timer to determine the time at which the Message
was received and stores this in the RENI Event. This time stamp indicates the time that has elapsed in
milliseconds between the time RENI started listening for input and the time the event occurred. The
absolute value of this time stamp is unimportant. RENI is interested in the timing of events relative to
each other.
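The conversion of a Note On message into a timestamped RENI Event can be sketched as follows. The class and function names are assumptions, and `time.monotonic` stands in for whichever clock the Timer actually uses; only timing relative to other events matters.

```python
import time
from dataclasses import dataclass

# Illustrative sketch of RENI Event creation (identifiers are assumptions).
# The timestamp is milliseconds elapsed since the Timer started.
@dataclass
class ReniEvent:
    onset_ms: int      # elapsed time when the Note On arrived
    pitch: int         # MIDI key number
    velocity: int      # MIDI velocity (volume)
    is_chord: bool = False

class Timer:
    def start(self):
        self._t0 = time.monotonic()

    def elapsed_ms(self) -> int:
        return int((time.monotonic() - self._t0) * 1000)

def on_note_on(timer, pitch, velocity):
    """Convert an incoming MIDI Note On into a timestamped RENI Event."""
    return ReniEvent(onset_ms=timer.elapsed_ms(), pitch=pitch, velocity=velocity)
```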
As messages are received and parsed, RENI must detect and distinguish between two types of RENI
Events: Monophonic Notes (regular events) and Chords.
5.1.2 Detecting Chords
Chords may be defined as consisting of two or more Note On events which are played simultaneously
and interpreted by a listener as being one musical event. Chords are more salient than normal events
consisting of one note. These events are important from a Metrical Analysis perspective as more
salient events are likely to denote the location of a strong beat and the start of a measure. Detecting
chords is essential in the detection of strong beats and the inference of metre.
RENI treats note events whose onsets occur within a particular window of each other, the Chord
Detection Window, as constituting a chord. Once a MIDI message is received, the difference between
the time that message was received and the onset time of the last RENI Event is calculated. If this
difference is less than the duration of the Chord Detection Window, then the previous RENI Event is
flagged as a chord.
Fig 5.1 – Where two events occur within a certain offset of each other (the chord detection
window) they are perceived as a chord and treated as one event.
In FIG 5.1 above, Chords are indicated by the notes which occur within a Chord Detection Window
and are shaded green. A Chord Detection Window is 100 milliseconds long and starts at the onset of
the earliest note in the chord. Combinations of note events which occur within 100 milliseconds of
each other are assumed to be perceived as simultaneous. They are detected as a chord and stored as
one event with the onset indicated by the onset of the earliest note in the chord.
It should be noted that chords are a complex product of simultaneous playing of notes and the tonal
relationship between them. However, due to time constraints and the emphasis on note onset
information in the hypothesis under investigation as stated in section 1.3, it was decided to ignore
tonal relationships and to only view chords as the product of simultaneous playing.
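The chord detection rule above can be sketched as follows, assuming a 100 millisecond window and a simple sorted-onsets representation (both illustrative):

```python
# Sketch of the Chord Detection rule: an onset falling within the Chord
# Detection Window of the previous event is merged into it, flagging the
# earlier event as a chord and keeping its (earliest) onset.
CHORD_WINDOW_MS = 100  # window length stated above

def group_chords(onsets):
    """onsets: sorted note-onset times in ms.
    Returns a list of (onset_ms, is_chord) events."""
    events = []
    for t in onsets:
        if events and t - events[-1][0] < CHORD_WINDOW_MS:
            # simultaneous for perceptual purposes: fold into previous event
            events[-1] = (events[-1][0], True)
        else:
            events.append((t, False))
    return events
```

Because the merged event keeps the earliest onset, a three-note chord spread over 80 ms still collapses into one event at the first note's time, as described above.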
5.2 Setting the search space – Spawning Beat Levels
As MIDI messages are received and converted into RENI Events which encode note onset times,
RENI searches for plausible Beat Levels. In order to do this it must set out a search space of potential
Beat Levels. As was discussed in section 4.2, Beat Levels consist of a collection of regularly spaced
events and are defined by the regular interval between the onsets of consecutive events within this
collection.
The search space is created by creating (or “spawning”) Beat Levels for the inter onset interval
between every combination of events received within a particular window of time at the start of the
performance. This window is called the Search Space Window. The length of this window is
adjustable and set in the Parameters module.
Each note received within the Search Space Window after the first causes new potential Beat Levels
to be spawned. This process is illustrated in the diagrams (FIG 5.2, 5.3 and 5.4) below.
● The first note, N1, in the performance is received at TS1 (FIG 5.2). As there is only one note
so far in the performance, no Beat Levels are formed.
Fig 5.2 – First Note received in search space window
● Another note, N2, is received at TS2 (FIG 5.3). A Beat Level, BL1, is created consisting of the
onsets of N1 and N2 and defined by the interval between them.
Fig 5.3 – Second Note received in search space window. Beat Level BL1 spawned.
● Another note, N3, is received at TS3 (FIG 5.4). Further potential Beat Levels are created: BL2,
defined by the interval TS3 – TS2, and another Beat Level, BL3, defined by the
interval TS3 – TS1.
Fig 5.4 – Third Note received in search space window. Beat Levels BL2 and BL3 are spawned
As additional Notes are received within the Search Space Window, more potential Beat Levels are
created, one for each combination of the new note and all the previous notes. Once the Search Space
Window has elapsed, potential Beat Levels have been created for every combination of notes within
the Window, each defined by the interval between the notes.
The Beat Levels created during this Search Space Window constitute a search space of potential Beat
Levels within which RENI searches for plausible Beat Levels. RENI does this by establishing if the
potential Beat Levels created in the spawning process can be extended into the performance. This
occurs as part of the Extension process.
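The spawning process can be sketched as follows; the dictionary representation of a Beat Level is an assumption for illustration:

```python
# Sketch of the spawning step: one potential Beat Level per pair of events
# received within the Search Space Window, defined by their inter onset
# interval. The {"interval", "events"} representation is an assumption.
def spawn_beat_levels(onsets, search_window_ms):
    in_window = [t for t in onsets if t <= search_window_ms]
    levels = []
    for j in range(1, len(in_window)):      # each newly received note...
        for i in range(j):                  # ...pairs with every earlier one
            levels.append({"interval": in_window[j] - in_window[i],
                           "events": [in_window[i], in_window[j]]})
    return levels
```

For n notes in the window this yields n(n-1)/2 potential Beat Levels, which is why the subsequent Extension and Consolidation steps are needed to prune the search space.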
5.3 Extending Beat Levels
Once a potential Beat Level has been identified and spawned within the Search Space Window, RENI
must determine if it is a plausible Beat Level. Given that beats occur at a regular interval and that
RENI assumes that beats coincide with event onsets, a Beat Level is plausible if it can be extended to
include further events which maintain the same time interval between consecutive events. RENI
therefore determines if a Beat Level is plausible by attempting to extend it into the musical
performance.
A Beat Level in RENI is extended by adding a RENI Event to it. A RENI Event can legitimately be
added to a Beat Level if the interval between the candidate RENI Event and the last event in the Beat
Level is approximately the same as the interval which defines the Beat Level.
RENI attempts to extend the potential Beat Levels that have been created within a window of time
called the Extension Window. The Extension Window includes the Search Space Window. Whilst in
the Search Space Window, the spawning of new potential Beat Levels and the extension of existing
potential Beat Levels occurs simultaneously.
5.3.1 Extending a Beat Level – Adding a new event
Once a Beat Level is spawned from the interval between two events, all subsequent events which
occur before the close of the Extension Window are analysed to determine if they can legitimately
extend the Beat Level. (This occurs for all Beat Levels).
Two values are important in the extension of a Beat Level:
● The interval of the Beat Level – The time between the two events that define the Beat Level
on creation.
● The Ideal Next Time - The Ideal Next Time is the sum of the onset time of the most recent
note in the Beat Level and the Interval of the Beat Level. Theoretically, it is the time at which
an event must occur if it is to extend and be added to the Beat Level.
For an event to extend a Beat Level, it should ideally occur at the Ideal Next Time. However, an event
which should occur at the Ideal Next Time may not occur precisely at that time due to slight deviations
in the timing of the performance (due perhaps to performer error). If RENI required that an event
occur at precisely the Ideal Next Time in order for a Beat Level to be extended, then it might not find
any plausible Beat Levels. Therefore, in order to accommodate timing deviations, an Event may
extend a Beat Level if it occurs within a window of time surrounding the Ideal Next Time of the Beat
Level. This window is called the Ideal Next Time Window. The width of this window is set in the
Parameters module of RENI and expressed as a percentage of the Beat Level's interval. The width of
this window determines how well timed RENI expects the performance to be.
The diagrams below (FIG 5.5 and 5.6) illustrate the Beat Level extension process. In them, RENI has
a potential Beat Level which since its creation has been extended to include two further events which
occur at approximately regular intervals of I, giving it a total of four events. The Ideal Next Time, TI,
is calculated by adding I to the onset of EV4. Also illustrated is the Ideal Next Time Window around
TI, in which any Event extending the Beat Level must occur. Every note received by RENI is passed
to the Beat Level to determine if it can legitimately extend the Level.
● N1 is received (FIG 5.5). It does not occur within the Ideal Next Time Window and is not
added to the Beat Level.
Fig 5.5 – Note received but not added to Beat Level
● N2 is received (FIG 5.6). It occurs within the Ideal Next Time Window and is added to the
Beat Level.
Fig 5.6 – Note received and added to Beat Level
Once a note is added to the Beat Level, the following occurs:
● The average interval of the Beat Level is re-calculated. This average interval value is
important as it is used by the Interpreter in subsequent tasks.
● A new Ideal Next Time is calculated.
● A new Ideal Next Time Window is set into the future.
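A single extension attempt, including the Ideal Next Time Window check and the re-averaging of the interval, can be sketched as follows; the dictionary representation and the 10% tolerance are assumptions:

```python
# Sketch of one extension attempt: a note extends the Level only if it falls
# within the Ideal Next Time Window (a tolerance band around the Ideal Next
# Time, expressed as a fraction of the Level's interval), after which the
# defining interval is re-averaged over all consecutive gaps.
def try_extend(level, onset_ms, tolerance=0.1):
    ideal_next = level["events"][-1] + level["interval"]
    half_window = level["interval"] * tolerance
    if abs(onset_ms - ideal_next) <= half_window:
        level["events"].append(onset_ms)
        # re-calculate the average interval over all consecutive gaps
        gaps = [b - a for a, b in zip(level["events"], level["events"][1:])]
        level["interval"] = sum(gaps) / len(gaps)
        return True
    return False
```

Re-averaging lets the Level track gradual tempo drift: each accepted note nudges both the defining interval and the next Ideal Next Time.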
5.3.2 Choosing between events
In the Extension process outlined in the previous section, the first event which occurs in the Ideal
Next Time Window is added to the Beat Level. This causes a new Ideal Next Time and Ideal Next
Time Window to be defined. It is very likely, however, that more than one note will occur in the same
Ideal Next Time Window and that an event subsequent to the one added to the Beat Level may be
closer to the Ideal Next Time than the event just added.
This can be seen in FIG 5.7 below. N1 is the first event to occur in the Ideal Next Time Window and
is added to the Beat Level. However, N2, which occurs after N1, is closer to the Ideal Next Time.
Fig 5.7 – Two notes occur within the Ideal Next Time Window
Addressing this problem would be easier if RENI ran off-line. The obvious approach would be to
select the best event, the one closest to the Ideal Next Time from all the candidate events. However in
real time operation RENI doesn't know all the candidate events until the Ideal Next Time Window has
elapsed. Another approach to this problem would be to spawn an additional Beat Level, having
one Beat Level with N1 and another with N2. This, however, would lead to an explosion in the number
of Beat Levels and would greatly compromise RENI's ability to run efficiently in real time. Therefore,
in order to address this problem, RENI replaces the event added to the Beat Level within a particular
Ideal Next Time Window with a better event, if one is subsequently found. This means that RENI
must track two Ideal Next Time Windows at the same time: the Active Ideal Next Time Window and
the Previous Ideal Next Time Window.
The replacement process is illustrated in the series of diagrams (FIG 5.8, 5.9 and 5.10) below.
● N1 occurs within the Active Ideal Next Time Window and is therefore added to the Beat Level
(FIG 5.8). The Beat Level calculates a new Interval, Ideal Next Time and Active Ideal Next
Time Window. However it also maintains the Previous Ideal Next Time Window it has just
filled and the corresponding Previous Ideal Next Time.
Fig 5.8 – Beat Level Extended and new Ideal Next Time and Ideal Next Time window calculated.
● N2 occurs within the Previous Ideal Next Time Window (FIG 5.9). RENI inspects it and sees
that it is closer to the Previous Ideal Next Time than N1.
Fig 5.9– A better note for the previous Ideal Next Time Window occurs
● RENI decides to replace N1 with N2 (FIG 5.10). A new Active Ideal Next Time and Active
Ideal Next Time Window are created. RENI maintains the Previous Ideal Next Time Window
whose event it has just replaced in case a better event occurs in the future.
Fig 5.10– Previous note replaced. New Active Ideal Next Time is set.
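The replacement rule can be sketched as follows; how the Previous Ideal Next Time is stored (a `prev_ideal` field here) and the tolerance value are assumptions about how this state might be kept:

```python
# Sketch of the replacement rule: a note in the Previous Ideal Next Time
# Window evicts the previously added event if it lies closer to that
# window's Ideal Next Time.
def maybe_replace(level, onset_ms, tolerance=0.1):
    prev_ideal = level.get("prev_ideal")
    if prev_ideal is None:
        return False
    half_window = level["interval"] * tolerance
    if abs(onset_ms - prev_ideal) > half_window:
        return False                     # outside the Previous window
    last = level["events"][-1]           # event that filled the Previous window
    if abs(onset_ms - prev_ideal) < abs(last - prev_ideal):
        level["events"][-1] = onset_ms   # better event: replace
        return True
    return False
```

Keeping only this one extra window bounds the bookkeeping per Beat Level, which is the point of the replacement strategy over spawning additional Levels.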
5.3.3 Ghost events
An Ideal Next Time Window may be bypassed and unfilled. This happens when no events occur
within it. That would suggest that the Beat Level in question does not extend into the performance
and is therefore implausible. To discard the Beat Level after one failed extension however would be
pre-mature. The absence of an event at the point in time necessary to extend the Beat Level may be
due to the characteristics of the performance (Beats don't always necessarily coincide with an event).
It does not immediately mean that the Beat Level cannot be perceived (by a human listener or
otherwise), especially if appropriate events were to occur in logical future Ideal Next Time Windows.
Therefore, when RENI detects that the Active Ideal Next Time Window of a Beat Level has been
bypassed, it does not immediately discard it. Instead it artificially extends the Beat Level by adding a
Ghost Event at the Ideal Next Time within the window that has been bypassed. A Ghost Event is an
event which does not occur in the performance and is therefore not perceived. However it is placed at
a point where one would expect it to occur if tapping along with the Beat Level it extends. Although
the event itself is not perceived, the beat that it indicates may be.
The process for adding a Ghost Event is illustrated below (FIG 5.11 and 5.12):
● RENI detects that the Active Ideal Next Time Window has been bypassed (FIG 5.11).
Fig 5.11– Ideal Next Time Window is passed without a note being added to it.
● It adds a Ghost Event, G, to the bypassed window and infers a new Ideal Next Time Window
(FIG 5.12).
Fig 5.12– Ghost Event added to bypassed Ideal Next Time Window. New Active Ideal Next Time and
Window calculated.
However, RENI's tolerance for bypassed Ideal Next Time Windows is limited. If it were to extend a
Beat Level with a series of Ghost Events, RENI would be wasting time and resources on an
implausible Beat Level. Therefore, if more than two Ghost Events in a row are required to extend the
Beat Level, the Beat Level is deemed implausible and discarded.
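Ghost Event handling, including the limit of two consecutive ghosts, might look like the following sketch (identifiers are assumptions):

```python
# Sketch of Ghost Event handling: a bypassed Ideal Next Time Window is
# filled with a ghost placed exactly at the Ideal Next Time, and more than
# two consecutive ghosts marks the Level implausible.
MAX_CONSECUTIVE_GHOSTS = 2

def handle_bypassed_window(level):
    """Returns False when the Beat Level should be discarded."""
    level["ghost_run"] = level.get("ghost_run", 0) + 1
    if level["ghost_run"] > MAX_CONSECUTIVE_GHOSTS:
        return False
    # place the ghost exactly where a listener tapping along would expect it
    level["events"].append(level["events"][-1] + level["interval"])
    return True
```

A real extension would reset the `ghost_run` counter to zero, so only uninterrupted runs of ghosts count against the Level.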
5.3.4 Completion of Extension process
Once the Extension Window has elapsed, extension of Beat Levels ceases. The output of the
Extension process is a collection of plausible Beat Levels each consisting of regularly spaced events
which occur within the Extension Window. These Beat Levels are passed to the Interpreter which
forms Metrical Hypotheses.
5.4 Hypothesising
Once the Extension Window has elapsed and the Extension process has been completed, the
Interpreter takes the collection of Beat Levels and forms Metrical Hypotheses from them. The
Interpreter performs three main tasks:
● Consolidation of Beat Levels
● Forming Metrical Hypotheses from Beat Levels
● Deciding on the Metrical Hypothesis which will form the basis for percussive performance
accompaniment.
5.4.1 Consolidation
Some of the Beat Levels created in the Extension process may be a subset of another Beat Level. A
Beat Level is a subset of another Beat Level if it has a similar defining interval to the other Beat Level
and all its Events can be found in the other Beat Level. In the diagram below, FIG 5.13, BL1 has a
similar interval to BL2 and the events in BL2 are a subset of the events in BL1. BL2 and BL1
essentially represent the same Beat Level. BL2 contains one fewer Event as it was created later
in the Search Space Window.
FIG 5.13 – BL1 and BL2 are essentially duplicates as they have similar average intervals and all the
events in BL2 are in BL1
Before forming Metrical Hypotheses with the Beat Levels remaining after Extension, RENI
consolidates its collection of plausible Beat Levels by detecting and discarding duplicate Beat Levels.
This ensures that all the Beat Levels examined in the subsequent Hypothesising process are unique.
Discarding duplicate Beat Levels prevents redundant processing during the next phase of
Hypothesising and prevents the inference of duplicate Metrical Hypotheses.
RENI detects duplicate Beat Levels by comparing the average interval of every combination of Beat
Levels. Where two Beat Levels have average intervals of similar value and share a large number
of common notes, they are deemed to be duplicates. Where RENI determines
that two Beat Levels are duplicates, it discards the Beat Level with fewer events.
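The consolidation step can be sketched as follows; the interval-similarity and event-overlap thresholds are illustrative assumptions:

```python
# Sketch of Consolidation: Levels with near-equal average intervals that
# share most of their events are duplicates; the one with fewer events is
# dropped (thresholds are assumptions).
def consolidate(levels, interval_tol=0.05, overlap=0.8):
    kept = []
    # examine the richest Levels first so duplicates lose to them
    for lv in sorted(levels, key=lambda l: len(l["events"]), reverse=True):
        def duplicates(k):
            similar = abs(lv["interval"] - k["interval"]) <= \
                interval_tol * max(lv["interval"], k["interval"])
            shared = len(set(lv["events"]) & set(k["events"]))
            return similar and shared >= overlap * len(lv["events"])
        if not any(duplicates(k) for k in kept):
            kept.append(lv)
    return kept
```

Sorting by event count first guarantees that when a duplicate pair is found, the Level with more events is the one already kept.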
5.4.2 Hypothesising
Once Consolidation has been completed, RENI begins searching for Metrical Hypotheses. As was
described in section 4.3, two (or more) Beat Levels can be combined to form a Metrical Hypothesis if
they meet the following criteria:
● The average interval of the Beat Level with the larger interval is an integer multiple of the
average interval of the Beat Level with the smaller interval.
● The two Beat Levels are aligned and share common events. An event in a larger Beat Level in
a hypothesis should also be present in every smaller Beat Level.
The diagram below, FIG 5.14, illustrates how Beat Levels can be legitimately combined to form a
Metrical Hypothesis.
Fig 5.14 – The interval of BL1 is an integer multiple of BL2's interval. All the notes in BL1 are in
BL2. The two Beat Levels can therefore be combined to form the Metrical Hypothesis MH1.
The Interpreter infers Metrical Hypotheses by searching for Beat Levels which are compatible
according to criteria listed above and combining them. It does this using a recursive search procedure.
It starts with a smaller Beat Level and combines it with larger compatible Beat Levels to form
Metrical Hypotheses. Conceptually, Metrical Hypotheses are formed from the bottom Beat Level
(Tactus) up. This procedure terminates when the number of Beat Levels added to a Metrical
Hypothesis exceeds a limit defined in the Parameters module.
Before searching for Hypotheses, the Interpreter arranges the Beat Levels in order of the size of
their average interval (smallest first). RENI then loops through a number of Beat Levels, trying to
find Metrical Hypotheses for which particular Beat Levels are the Tactus (the bottom Beat Level in a
Metrical Hypothesis).
In order to find the Hypotheses for which a particular Beat Level, B1, is the Tactus, the Interpreter:
1. Takes B1 and treats it as the base Beat Level or Tactus.
2. Searches for every Beat Level which has a larger average interval and is compatible with
B1 (B2, B3 ... BN etc.).
3. For each of the compatible Beat Levels found, RENI finds all the Hypotheses (Partial
Hypotheses) which have the compatible Beat Level as their Tactus. This is where
recursion is used: for each compatible Beat Level found in step 2, this entire procedure
(steps 1-4) is run to find Partial Hypotheses based on it.
4. As the Partial Hypotheses based on each Beat Level compatible with B1 are returned, the
Interpreter adds B1 to the bottom of each of these Hypotheses and adds them to the
collection of inferred Metrical Hypotheses. Although conceptually we are starting from
the bottom when building Metrical Hypotheses, in practical terms, owing to the recursive
nature of the search procedure, Metrical Hypotheses are actually built from the top down.
The search procedure operates subject to the following constraints and rules:
● Metrical Hypotheses are only allowed to have a limited number of levels, usually 2-3. This
limit is enforced as a constraint on the depth of the recursive procedure and can be adjusted
in the Parameters Module.
● With many potential Beat Levels this search strategy could be very intensive. As it is
implausible that some of the larger Beat Levels could be the correct Tactus, the Interpreter
limits the number of Beat Levels which can be Tactuses in the final collection of inferred
Metrical Hypotheses. The search is constrained according to this limit.
● Ideally RENI should know nothing about the time signature of the performance when
inferring Metrical Hypotheses. However as stated in section 3.4.2, the application does
provide the facility to indicate this to the Interpreter by setting a value in the Parameters
module. This parameter is a vector which indicates how much of a multiple each level in the
Metrical Hypothesis should be of the Level below (the bottom Beat Level is assigned a
multiple value of 1). If we want the final Hypotheses to imply 4/4, we would set this
parameter to the value {1,4}.
● For a particular base Beat Level, there may be no compatible Beat Levels. In such instances,
RENI creates a one level Metrical Hypothesis. Due to the absence of Beat Levels which are
compatible with its Tactus, a one level Metrical Hypothesis is unlikely to be correct in the
context of the musical performance. One level Metrical Hypotheses are therefore separated
from multi level Metrical Hypotheses and will only form the basis of the percussive
accompaniment produced if no multi level Metrical Hypotheses are found.
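The recursive search above can be sketched as follows. The compatibility test combines the integer-multiple and alignment criteria from section 4.3; the tolerance value and representation are assumptions:

```python
# Sketch of the recursive Hypothesis search. "compatible" checks that the
# larger Level's interval is (near) an integer multiple of the smaller's
# and that the larger Level's events all appear in the smaller.
def compatible(small, large, tol=0.05):
    ratio = large["interval"] / small["interval"]
    near_integer = round(ratio) >= 2 and abs(ratio - round(ratio)) <= tol
    aligned = set(large["events"]) <= set(small["events"])
    return near_integer and aligned

def hypotheses_with_tactus(tactus, levels, max_levels=3):
    """All Hypotheses (lists of Levels, Tactus first) rooted at tactus."""
    results = [[tactus]]                      # the one level Hypothesis
    if max_levels > 1:
        for lv in levels:
            if lv is not tactus and compatible(tactus, lv):
                # Partial Hypotheses rooted at the compatible Level
                for partial in hypotheses_with_tactus(lv, levels,
                                                      max_levels - 1):
                    results.append([tactus] + partial)
    return results
```

The `max_levels` argument plays the role of the depth limit from the Parameters module, and the lone `[tactus]` entry corresponds to the one level Hypothesis case discussed above.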
As Hypotheses are formed, the following is determined for each Hypothesis and encoded in RENI's
representation of a Metrical Hypothesis:
● The number of levels in the Metrical Hypothesis
● Designations of the Beat Levels in the Hypothesis that represent the Tactus and the Measure.
● The multiples or relative sizes of the Beat Level Intervals. For a two level Hypothesis with
Beat Levels of average Intervals 500 and 1000 respectively the value for multiples is {1, 2}.
This value indicates the time signature implied by the Hypothesis.
These attributes are used by the Judges when calculating plausibility scores for the Metrical
Hypotheses (see section 5.7).
5.4.3 Deciding
Once the Hypothesising process has been completed, the Interpreter is left with a collection of
Metrical Hypotheses. Only one of these can form the basis of the percussive accompaniment
ultimately generated by RENI. RENI must therefore identify the most plausible Metrical Hypothesis.
In order to identify the most plausible Metrical Hypothesis, the Interpreter gets the Judges to score
each Hypothesis. The Judges score each Metrical Hypothesis on separate criteria and collectively give
the Hypothesis an overall plausibility score. Once each Metrical Hypothesis has been scored, the
Interpreter selects the Hypothesis with the highest score and indicates this to the Main RENI module.
The selected Hypothesis forms the basis of the percussive accompaniment generated by RENI.
5.5 Ranking and selecting Metrical Hypotheses
The Judges are used by the Interpreter in deciding on the best Metrical Hypothesis. Each Judge
scores each Metrical Hypothesis according to its own set of criteria. These scores are then combined
in a weighted manner to assign an overall score to the Hypothesis. The Judges are independent of one
another and there is the potential for RENI to implement further Judges.
As currently constituted, RENI uses three Judges: the Timing Judge, the Salience Judge
and the Statistical Judge. RENI assigns a different weighting to the scores of each Judge in calculating
the final score, based on the importance of the criteria they assess.
Developing a scoring system to rate Metrical Hypotheses is not clear cut. Determining the metrics to
use in calculating the final score and the weights that should be assigned to each of these metrics is a
challenging task. Some of the metrics described in the forthcoming sections are based on or adapted
from metrics used in Rosenthal's Machine Rhythm (1992), while others are original to RENI. The
weightings assigned to these metrics were determined though repeated trials and adjusted on the basis
of analysis of RENI's output.
5.5.1 Timing Judge
The Timing Judge assesses the timing consistency of a Hypothesis; in particular, the equality of
the inter onset intervals in the Beat Levels of the Hypothesis. It assumes that greater consistency and
equality in these intervals contribute to a more plausible Metrical Hypothesis.
It calculates the following for each Metrical Hypothesis:
● The standard deviation of the interval between consecutive events in the Tactus Beat Level -
Once the Tactus Beat Level has been extended, its defining value is the average interval
calculated over all the intervals between consecutive notes that occur within the Beat Level.
The Timing Judge calculates the standard deviation of this interval. A smaller standard
deviation indicates a more consistent and unvarying interval between successive events and
therefore a more plausible Beat Level and Metrical Hypothesis.
● Similarly, the standard deviations of the other Beat Levels in the Hypothesis are calculated.
● How close to an integer multiple the Measure Level interval is of the Tactus Level interval. A
Metrical Hypothesis whose Measure interval is equal to 4.0 * The Tactus interval is assigned a
higher score than a Hypothesis whose Measure Level is 3.9 * the Tactus interval. For a
Metrical Hypothesis, the closer the value of its Measure Level interval divided by its Tactus
Level interval is to an integer value, the more plausible it is assumed to be.
The first two metrics were used in Rosenthal (1992), while the Measure/Tactus multiple metric is an
original metric incorporated into RENI. Smaller values for each of these three metrics correspond to
higher scores for a Metrical Hypothesis. The Timing Judge calculates a weighted sum of these values
and converts the result into a score to be used in the calculation of the overall Metrical
Hypothesis plausibility score.
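The Timing Judge's scoring might be sketched as follows. The weights, and the convention of negating the summed penalty so that higher scores are better (with a perfectly timed Hypothesis scoring zero), are assumptions:

```python
import statistics

# Illustrative sketch of the Timing Judge: interval standard deviations for
# each Beat Level, plus the distance of the Measure/Tactus ratio from an
# integer, are accumulated as a penalty and negated.
def timing_score(hypothesis, w_sd=1.0, w_mult=100.0):
    """hypothesis: list of Beat Levels, Tactus first, Measure last."""
    penalty = 0.0
    for level in hypothesis:
        gaps = [b - a for a, b in zip(level["events"], level["events"][1:])]
        if len(gaps) > 1:
            penalty += w_sd * statistics.stdev(gaps)
    # how close the Measure interval is to an integer multiple of the Tactus
    ratio = hypothesis[-1]["interval"] / hypothesis[0]["interval"]
    penalty += w_mult * abs(ratio - round(ratio))
    return -penalty
```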
5.5.2 Statistical Judge
The Statistical Judge accumulates statistics on the musical performance being listened to and on the
Metrical Hypotheses inferred by the Interpreter. These statistics are used to calculate an overall
statistical score for individual Hypotheses. These statistics and the metrics they are used to calculate
are:
● The Overall Average interval between consecutive notes within the Extension Window of the
Performance. For each particular Metrical Hypothesis, the absolute difference between the
Hypothesis's Tactus interval and the Overall Average interval is subtracted from the overall
statistical score.
● A Frequency Histogram of the Tactus Level intervals of the inferred Hypotheses is created
and used to score individual Hypotheses. For example, if there are 3 Hypotheses with a Tactus
interval of 500 milliseconds, then a Hypothesis with a 500 millisecond interval will be assigned
a score proportional to 3 (3 multiplied by a multiplier parameter).
● A Multiples Histogram is created from the Tactus Level intervals of the inferred Hypotheses
and the Tactus Level intervals which are integer multiples of these, and is used to score
individual Hypotheses. For example, if there are 2 Hypotheses with Tactus Levels of 500
milliseconds and 2 Hypotheses with a Tactus Level of 1000 milliseconds (500*2), then a
Hypothesis with a 500 millisecond interval will be assigned a score proportional to 4.
The overall statistical score is determined by calculating a weighted sum of the Histogram metrics and
subtracting the score calculated using the Overall Average interval. The principles underlying these
metrics are discussed below.
Both of the Histogram based metrics are based on the premise that the perception of a particular Beat
Level will be reinforced by events spaced by multiples of that Beat Level's interval. As used here,
these metrics are an adaptation of a similar method used by Rosenthal (1992), who uses a histogram
construct to select a Tactus at the start of the Beat Tracking process in Machine Rhythm and bases the
search for Beat Levels on this Tactus. In RENI, the histogram construct is instead used to score
inferred Metrical Hypotheses.
In the case of the Overall Average interval, this value will be inversely proportional to the number of
events that occur in the Extension Window. A larger number of events in the Extension Window
implies a faster tempo and suggests that the correct Metrical Hypothesis will have a smaller Tactus
interval. Conversely, a smaller number of events in the Extension Window implies a slower tempo
and that the correct Metrical Hypothesis will have a larger Tactus interval. Subtracting the absolute
difference between the Overall Average interval and the Tactus interval of a particular Hypothesis
from the statistical score therefore favours smaller Tactus intervals when the Overall Average is smaller
and larger Tactus intervals when the average is larger. This metric, which was developed originally for
RENI, helps prevent the Judges from selecting a Metrical Hypothesis whose Tactus interval is too
large simply because that Hypothesis is calculated as being more consistently timed than a Hypothesis
with a more appropriate, smaller Tactus interval.
5.5.3 Salience Judge
Whereas the first two Judges are primarily focused on the Tactus Level, the Salience Judge focuses on
the Measure Level and is particularly important in the selection of a Metrical Hypothesis which
implies the correct metre. The Salience Judge works on the premise that events on the Measure Level
coincide with strong beats and should therefore be more salient. Accordingly, the Salience Judge
assesses the salience of events in the Measure Level of the Hypothesis. The greater the average
salience of the events in the Measure Level, the greater the score assigned to the Metrical Hypothesis.
The salience of an event is determined by:
● Absolute Duration – the greater the duration of the event, the greater the salience of the event.
● Relative Duration – if an event is preceded by events which are noticeably shorter in duration
than it, then the event will be more salient.
● Loudness – the louder the note, the more salient it is.
● Chords – Chords are more salient than single-note events.
● Ghosts – Ghost events are not perceived and therefore possess no salience.
Corresponding with these salience attributes, the following metrics are computed by the Salience
Judge in order to calculate the overall salience score for a Metrical Hypothesis.
● Absolute Duration – the average absolute duration of the events in the Measure Level.
● Relative Duration – for each event in the Measure Level, the durations of the four events in
the overall performance that precede it are compared to the duration of the Measure Level
event. The Salience Judge counts how many of these four events have a duration less than
66% of the duration of the particular Measure Level event. The average of this count across
the events in the Measure Level contributes to the calculation of the salience score.
● Loudness – The average volume (or velocity in MIDI terminology) of Measure Level events
is calculated.
● Chords – The percentage of the events in the Measure Level which are Chords is calculated.
● Ghost Events – The percentage of the events in the Measure Level which are Ghosts is
calculated.
The salience score is calculated as a weighted sum of the first four metrics listed above, minus a
weighted value of the Ghost metric.
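As a sketch of how these five metrics might combine (the event representation and weight values here are assumptions for illustration, not RENI's actual data structures):

```python
def salience_score(measure_events, weights=(1.0, 1.0, 1.0, 1.0),
                   ghost_weight=1.0):
    """Illustrative salience score for one Metrical Hypothesis.

    Each event is modelled as a dict with keys: duration (ms),
    prev_durations (durations of the four preceding events), velocity
    (MIDI 0-127), and is_chord / is_ghost flags.
    """
    n = len(measure_events)
    avg_duration = sum(e["duration"] for e in measure_events) / n

    # Relative duration: of the four preceding events, how many are
    # shorter than 66% of this event's duration, averaged over the Measure.
    rel = sum(sum(1 for d in e["prev_durations"]
                  if d < 0.66 * e["duration"])
              for e in measure_events) / n

    avg_velocity = sum(e["velocity"] for e in measure_events) / n
    chord_pct = 100.0 * sum(e["is_chord"] for e in measure_events) / n
    ghost_pct = 100.0 * sum(e["is_ghost"] for e in measure_events) / n

    w_dur, w_rel, w_vel, w_chord = weights
    return (w_dur * avg_duration + w_rel * rel + w_vel * avg_velocity
            + w_chord * chord_pct - ghost_weight * ghost_pct)
```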
5.5.4 Calculating the overall plausibility score
The overall score assigned to a Hypothesis is a weighted sum of the scores assigned to the
Hypothesis by all three Judges. The weightings applied to each Judge's score are set as parameters.
5.6 Producing output
Once the Interpreter has selected the highest ranking Metrical Hypothesis on the basis of the scores
assigned by the Judges, RENI's Drummer module uses this Hypothesis to produce percussive
accompaniment.
A Metrical Hypothesis will be presented to the Drummer soon after the completion of the Extension
process. This Metrical Hypothesis tracks the Beats that occurred during the Extension Window. The
combination of Beat Levels in the Metrical Hypothesis can be projected into the future, on the basis
of their average intervals, to predict the location of beats in the remainder of the performance. The
Drummer aims to indicate these future beats (strong and weak) aurally, as they occur, by producing
simple percussive accompaniment.
The Drummer must use the information contained in the Hypothesis to determine the best time to start
accompanying, and the form of accompaniment to be generated. It does this by determining the
location of the next beat to occur at the Measure level and beginning its accompaniment at this point.
This location is calculated by adding the position of the last beat in the Measure Level and the
Interval of the Measure Level. The Drummer then determines the current point in time and calculates
how long it will have to wait before beginning its accompaniment. When accompaniment begins, the
Drummer taps along at the Tactus Level.
The Drummer uses the interval multiples information in the Metrical Hypothesis to determine the
combinations of strong and weak beats it should indicate in its output. For a {1, 4} multiple,
indicating a 4/4 time signature, the Drummer plays one strong beat followed by three weak beats. For
a {1, 3} multiple, the Drummer taps one strong beat followed by two weak beats. The type of beat is
distinguished using different tapping sounds.
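The Drummer's scheduling and pattern selection can be sketched as follows. This is an illustration only; the function names are invented here, and times are in milliseconds.

```python
def next_measure_beat(last_measure_beat_ms, measure_interval_ms):
    """Time of the next Measure-level beat: the position of the last beat
    in the Measure Level plus the Interval of the Measure Level."""
    return last_measure_beat_ms + measure_interval_ms

def tap_pattern(multiple):
    """Strong/weak pattern for one Measure, given the interval multiple.

    A {1, 4} multiple (4/4) yields one strong beat and three weak beats;
    a {1, 3} multiple (3/4) yields one strong beat and two weak beats.
    """
    beats_per_measure = multiple[1] // multiple[0]
    return ["strong"] + ["weak"] * (beats_per_measure - 1)
```

The wait before the first tap is then simply the difference between the next Measure-level beat time and the current time.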
As indicated in section 3.5.5, the Drummer also produces textual output, writing each time at which it
taps a beat to a text file.
Once the Drummer has started, RENI's Beat Tracking algorithm is complete.
5.7 Re-hypothesising
The sections above describe the full operation of RENI's Beat Tracking algorithm. This algorithm
infers a Metrical Hypothesis and produces accompaniment on the basis of the analysis of the portion
of a musical performance that occurs between the start of the performance and the expiration of the
Extension Window. Implicit in this is an assumption that the contents of the Extension Window and
the beat and metre implied by it are indicative of the beat and metre of the remainder of the
performance.
This assumption may be erroneous, especially if we assume that RENI views musical input as
essentially improvised. The performer may change the tempo and metre of the piece being performed
at any point in the performance. A real-life percussive accompanist is likely to react to such changes
and adjust their accompaniment appropriately. As a model of such accompanists, RENI can also be
set to accommodate such changes by re-hypothesising.
When RENI is set to re-hypothesise and react to changes in tempo, it runs its Beat Tracking
algorithm repeatedly, over consecutive windows of time in the performance, and produces percussive
accompaniment on the basis of each new Metrical Hypothesis inferred and selected by its Beat
Tracking algorithm.
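This repeated application can be sketched as a loop over independent windows. This is a structural sketch only; infer_hypothesis stands in for one full pass of the algorithm described above, and the event representation is assumed.

```python
def track_with_rehypothesis(events, extension_window_ms, total_ms,
                            infer_hypothesis):
    """Run the Beat Tracking algorithm over consecutive windows.

    events: note events, each carrying an onset time in milliseconds.
    infer_hypothesis: stand-in for one pass of the Beat Tracking
        algorithm; it receives the events inside one Extension Window and
        returns the selected Metrical Hypothesis for that window.
    """
    hypotheses = []
    start = 0
    while start < total_ms:
        end = start + extension_window_ms
        window = [e for e in events if start <= e["onset_ms"] < end]
        if window:  # skip silent windows
            hypotheses.append(infer_hypothesis(window))
        start = end  # each window is treated as independent of the others
    return hypotheses
```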
5.8 Parameters
The operation and precise behaviour of RENI's Beat Tracking algorithm may be influenced by the
value of the adjustable parameters in the Parameters module. These parameters have already been
alluded to in section 4.5.8. They are summarised below.
● File or Instrument flag – indicates how RENI accepts input.
● File Name – if accepting input from a file, this indicates the location of the file.
● Search Space Window duration – indicates in milliseconds the length of the Search Space
Window. A longer Search Space Window results in a larger search space of potential Beat
Levels.
● Extension Window Duration – indicates in milliseconds the length of the Extension Window.
This should ideally be at least twice the duration of the Search Space Window. A longer
Extension Window should result in RENI finding more plausible Beat Levels but also
lengthens the time taken to produce accompaniment.
● Ideal Next Time Window Width – the value used to calculate the Ideal Next Time Window,
expressed as a percentage of the Interval of the Beat Level being extended. The larger this
value, the wider the Ideal Next Time Window, thereby making the Extension process less
precise and more forgiving of performer error.
● Tactus Threshold – indicates the number of Beat Levels that should be treated as a Tactus
when the Interpreter searches for Hypotheses. The higher this number, the greater the number
of Metrical Hypotheses inferred.
● Metre Heuristic – indicates if the Metre Heuristic should be used. If it is set to be used, an
associated parameter is set which constrains the Interpreter when searching for compatible
Beat Levels to form Hypotheses by indicating the multiple relationship that should exist
between the Beat Levels in all the inferred Hypotheses. Although this parameter runs contrary
to the hypothesis under investigation (as it gives RENI information about the musical
performance), it was deemed to be an interesting addition to the application.
● Compatibility Metric – indicates how close to an integer the ratio of one Beat Level's interval
to another Beat Level's interval must be for the two to be considered compatible. For larger
values of this metric, the ratio has to be closer to an integer for the two Beat Levels to be
considered compatible.
● Judging Weights – indicates the weights that should be assigned to the score calculated by
each Judge when calculating the overall score for a Hypothesis.
● Re-hypothesise Flag – if set, then RENI will continually apply its Beat Tracking algorithm and
re-hypothesise over consecutive Extension Windows in the performance.
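Collected together, the parameter set might look like the following sketch. Names and values here are illustrative assumptions; the real Parameters module may organise them differently.

```python
# Illustrative parameter set for RENI; all values are examples only.
reni_parameters = {
    "input_source": "file",           # File or Instrument flag
    "file_name": "performance.mid",   # used when input_source == "file"
    "search_space_window_ms": 3000,   # Search Space Window duration
    "extension_window_ms": 6000,      # ideally >= 2x the Search Space Window
    "ideal_next_time_width_pct": 20,  # % of the Beat Level interval
    "tactus_threshold": 4,            # Beat Levels treated as a Tactus
    "use_metre_heuristic": False,     # Metre Heuristic on/off
    "metre_multiple": None,           # set when the heuristic is enabled
    "compatibility_metric": 0.9,      # closeness-to-integer requirement
    "judge_weights": {"timing": 1.0, "statistical": 1.0, "salience": 1.0},
    "re_hypothesise": False,          # Re-hypothesise Flag
}
```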
6.0 Evaluation
This section describes the methods used to evaluate RENI. In particular it discusses the challenge of
evaluating Beat Tracking applications and algorithms and offers justification for the methods used in
this project. The results of the evaluation are then presented; a fuller discussion of the results and
their implications follows in the final section.
6.1 Aim of evaluation
This project aimed to assess RENI's proficiency in the tasks of Beat Tracking and Metrical Analysis,
and also RENI's value as a performance accompaniment tool. It was therefore decided to conduct two
types of evaluation:
● Quantitative Functional Evaluation – evaluating how accurate RENI is at tracking beats and
inferring metre in musical performances in real time.
● Subjective Evaluation – assessing RENI from the perspective of the musical performer it
accompanies; in terms of its performance of core functionality and its value and potential as a
performance accompaniment tool.
6.2 Difficulties in evaluating Beat Tracking applications
The common approach used for evaluating Beat Tracking applications is to compare the beat
locations identified by the application in a musical performance to an annotation indicating the
location of beats in the same performance. This annotation is treated as the ground truth data and may
be created manually by human listeners or inferred from the score of the piece being performed.
This approach, however, is not ideal. The difficulties of evaluating Beat Tracking applications in this
manner, and in general, are well established in the literature.
Dixon (2007) identifies three such issues:
● The task of Beat Tracking is not uniquely defined, but depends on the application. Ambiguity
exists both in the choice of metrical level and the precise placement of beats. Human listeners
may disagree on the precise location of beats, hence it cannot be said that the annotation of
musical input is absolutely correct.
● The availability of test data may also be a major constraint. Manual annotation in order to
create ground truth data is labour-intensive and time-consuming, so it is difficult to create test
sets large enough to cover a wide range of musical styles and give statistically significant
results.
● Comparison against other Beat Tracking applications is also difficult, as some systems are
designed for a limited set of musical styles, which leads to the question of whether such
systems can be compared with other systems at all.
6.3 Quantitative functional evaluation
Despite the difficulties identified in the previous section, it was decided to use the established method
of comparing the application's output to annotations in order to evaluate the core functionality of
RENI. The beat locations identified by RENI in a musical piece were compared to the beat locations
indicated in an annotation of the same performance. A series of statistics were produced from these
comparisons to indicate RENI's performance.
6.3.1 Test Data
A corpus of about thirty MIDI recordings of named musical performances and accompanying scores
was collected during the development of the application. A subset of these MIDI files was used
during the development of the application; the rest were set aside for use in evaluation.
This corpus consists mostly of recordings of well known western style music. It contains recordings
of both popular and classical musical pieces in a variety of metres. A listing of these recordings,
including the names of the musical pieces and their respective composers, is included in the
Appendix. These files are available to download at http://donalmulvihill.wordpress.com/reni
As there were no known annotations indicating Beat Locations for any of the files collected, manual
annotation of the files was necessary. A number of volunteers agreed to annotate the files. They were
directed to listen to a musical performance and tap along to it on a MIDI keyboard, indicating strong
and weak beats using different keys. A complementary application, BEAT-REC, was developed to
record in a text file the points in time at which the participants tapped. The annotators were given no
information on the musical pieces prior to performing the annotation. They were required to infer the
beat and metre themselves, based on their own interpretation in real time.
The drawback of manual annotation as described above is that the annotations produced cannot be
deemed to be 100% accurate and objective. An annotator may make an error and two annotators may
infer different metres or tap along at a different tempo for the same musical piece.
However, the manner in which the annotations were created exemplifies the task RENI is trying to
emulate: a human performer inferring, in real time, the beat and metre of a piece of which they have
no prior knowledge. Therefore, when we compare the annotation produced to the output of RENI for
the same piece, we are comparing RENI's performance to that of a human performing the same task
under the same conditions.
6.3.2 Data from RENI
As has already been described, RENI produces textual output for the purposes of evaluation. For this
evaluation, such data was generated in a series of trials where RENI attempted to track the beats and
infer the metre of the test MIDI recordings. As the output produced by RENI varied between trials
on the same performance, due to timing issues on the computer used and to changing Extension
Window durations, it was decided to average these differences out over multiple trials. Therefore, for
each MIDI performance in the test set, RENI was run multiple times with different values of the
Search Space and Extension Windows.
As was explained earlier, it was decided to build a heuristic into RENI which, when turned on, gives
RENI an indication of the correct metre. A number of trials were run with this heuristic off and a
number with it turned on, the intention being to compare RENI's Beat Tracking performance with
and without the heuristic.
6.3.3 Comparing RENI's output to the annotations.
A comparison application, BEAT-COMP, was developed in conjunction with RENI. This application
compares the output of RENI to an annotation of the same recording and produces a number of
statistics. It calculates:
● The percentage of beat locations which RENI locates accurately, independent of whether they
are strong or weak – as the beat locations recorded may be affected by latencies and
inconsistencies in the timing of the computer that RENI and BEAT-REC were running on,
beat locations which are within a certain offset of each other are considered to be indicative of
the same beat.
● The percentage of beat locations and types of beat that RENI identifies accurately – this
percentage indirectly measures the degree to which the correct metre was inferred.
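The tolerance-based matching behind the first metric can be sketched as follows. The 100 ms tolerance is an assumption for illustration; the thesis does not state the exact offset used by BEAT-COMP.

```python
def matched_beats(reni_beats, annotation_beats, tolerance_ms=100):
    """Percentage of annotated beats that RENI located within a tolerance.

    Beat times (in ms) within tolerance_ms of each other are treated as
    the same beat, absorbing latency and timing jitter on the host machine.
    """
    matched = 0
    used = set()
    for ann in annotation_beats:
        for i, beat in enumerate(reni_beats):
            if i not in used and abs(beat - ann) <= tolerance_ms:
                matched += 1
                used.add(i)
                break  # each RENI beat can match at most one annotation
    return 100.0 * matched / len(annotation_beats)
```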
However, basing an evaluation of RENI on only these two metrics would be insufficient and
potentially misleading. RENI is rarely 100% correct or 100% wrong. For a particular musical
performance, RENI may select a Metrical Hypothesis different from that implied by the human
annotation; however, this does not mean that it is entirely incorrect. An incorrect hypothesis and its
corresponding accompaniment may still be perceived as sounding appropriate. There can be varying
degrees of correctness in RENI's output.
For example, for a particular performance, the human annotator may hypothesise a 4/4 metre with a
Tactus interval of 500 milliseconds while RENI may hypothesise a 2/4 metre with a 1 second Tactus
interval. Although RENI has hypothesised incorrectly, the percussive output produced would not be
perceived by a human listener as being completely incorrect. This is because the Tactus interval
hypothesised by RENI is only twice the correct Tactus interval, and the Measure intervals of RENI's
and the annotator's hypotheses would be the same. Therefore, if correctly aligned, every beat
indicated by RENI in the output produced would have an equivalent in the human annotation. It
would therefore be erroneous to treat this Hypothesis as incorrect in the same way as a hypothesis of
3/4 with a Tactus interval of 0.66 seconds is incorrect.
Therefore, a number of additional attributes of RENI's selected Metrical Hypothesis and output must
be considered in determining the congruence between the interpretations of RENI and a human
annotator for a particular musical piece. These include:
● The Tactus interval – Ideally, RENI's Tactus interval should be the same as that indicated by
the human annotators. However, Tactuses which are integer multiples of the correct Tactus
can still sound correct (or not incorrect). For example, with a correct Tactus interval of 500
milliseconds, a hypothesised interval of 1 second or 250 milliseconds (related multiples of 500
milliseconds) is less incorrect than a hypothesised interval of 666 milliseconds or 300
milliseconds.
● Metre – similarly to the way in which there can be incorrect intervals which are related to the
correct interval and therefore not entirely incorrect, there can also be “incorrect but not
entirely incorrect” metres inferred. For example, this applies to the relationship between a 2/4
metre and a 4/4 metre.
These two attributes cannot be treated entirely in isolation either. RENI may hypothesise the same
Tactus interval and metre as a human, but if the beat locations identified by RENI are not the same as
the human annotator's, then RENI cannot be said to be correct.
Therefore, assessing the four items described above in combination (the two percentage metrics, the
Tactus interval and the metre) allows for an assessment of how correct RENI's output is for a
particular musical performance.
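One way to operationalise the notion of a "related multiple" Tactus is sketched below. The 5% relative tolerance is an assumption, not a value taken from the thesis.

```python
def tactus_relation(hypothesised_ms, annotated_ms, tolerance=0.05):
    """Classify a hypothesised Tactus interval against the annotated one.

    Returns "correct" when the intervals match within a relative
    tolerance, "related" when one is an integer multiple or divisor of
    the other, and "incorrect" otherwise.
    """
    ratio = hypothesised_ms / annotated_ms
    if abs(ratio - 1.0) <= tolerance:
        return "correct"
    for m in (2, 3, 4):
        if abs(ratio - m) <= tolerance * m:        # integer multiple
            return "related"
        if abs(ratio - 1.0 / m) <= tolerance / m:  # integer divisor
            return "related"
    return "incorrect"
```

Against an annotated Tactus of 500 ms, intervals of 1000 ms and 250 ms classify as related, while 666 ms classifies as incorrect.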
6.4 Subjective evaluation
The subjective evaluation was conducted to obtain the views of musicians on RENI's value and
potential as a performance accompaniment tool.
Trials were carried out with a number of musicians. These musicians participated in a number of 60
second trials in which they were asked to play musical pieces of their choice on a MIDI keyboard and
have RENI accompany them. Two different types of trial were carried out, corresponding to RENI's
two different settings:
● RENI analysing and basing its accompaniment on the start of the performance only.
Accompaniment was then kept constant throughout the remainder of the performance.
● RENI continually re-hypothesising and updating its accompaniment by analysing consecutive
windows of time
Participants were then asked to rate RENI's performance in a short questionnaire. They were first
asked about their musical background and performance ability, and directed to rate how well timed
the pieces they performed were. They were then asked the following about their opinion of RENI's
performance:
● How good was RENI at tapping along in time with the pieces they performed? (rating out of
10)
● How good was RENI at inferring the metre (or locating the strong beats) in the pieces they
performed? (rating out of 10)
● How well did RENI perform when set to continually hypothesise? (rating out of 10)
● Did they prefer having RENI accompany them when set to continually hypothesise?
● How useful is RENI as a performance accompaniment tool? (rating out of 10)
● Did they have any further comments, observations or recommendations?
The questionnaire presented to participants in the subjective evaluation of RENI is included in the
Appendix.
6.5 Results
6.5.1 Quantitative functional evaluation results
The following results were accumulated over more than 200 trials, comparing RENI's output with
human annotations of the same files.
For trials where RENI was not using the Metre heuristic:
● RENI located an average of 60% of beat locations correctly, independent of type. It should
be noted that the percentage of beats located was also affected by the size of the interval
hypothesised relative to the correct Hypothesis. For example, where RENI inferred a beat
interval twice that of the correct interval, it could only identify 50% of the beats
correctly.
● RENI identified the location and type (strong or weak) of 40% of beats correctly.
● For 80% of trials, RENI hypothesised either the correct Tactus interval or a related multiple of
the correct Tactus interval. This can be broken down as follows:
○ RENI hypothesised the correct Tactus interval in 60% of trials.
○ RENI hypothesised a Tactus interval which is an integer or related multiple of the
correct Tactus interval in 20% of trials.
● In 55% of trials, RENI identified either the correct metre or a metre similar or related to the
correct metre. In 27% of trials, RENI identified the correct metre.
For trials where RENI was using the Metre heuristic:
● RENI located an average of 60% of beat locations correctly, independent of type.
● RENI identified the location and type of 38% of beats correctly.
● For 72% of trials, RENI hypothesised either the correct Tactus interval or a related multiple of
the correct Tactus interval. This can be broken down as follows:
○ RENI hypothesised the correct Tactus interval in 54% of trials.
○ RENI hypothesised a Tactus interval which is an integer or related multiple of the
correct Tactus interval in 18% of trials.
6.5.2 Subjective evaluation results
The following results were gathered from the questionnaires filled out by the participants in the
subjective evaluation trials.
In terms of the musical background of the participants and the pieces they performed in the trials:
● All participants had received some formal musical training and rated their performance ability
between 3 and 8 (with a rating of 10 being concert level). The average rating of performers'
ability was 6.5.
● All participants performed improvised pieces with discernible beats and metres.
● The average rating given by performers for the timing consistency of their performances was
5.
● The average rating given by performers for the rhythmical complexity of the pieces they
performed was 3.75. A higher rating for rhythmical complexity indicated pieces which were
harder to tap along to.
For their evaluation of the performance of RENI, participants assigned an average score, out of 10, of:
● 5.4 for RENI's ability to tap along with what they were playing.
● 5.25 for RENI's ability to identify the location of strong beats.
● 4.0 for RENI's ability to keep tempo with performances when set to re-hypothesise.
● 5.75 for RENI's value and potential as a performance accompaniment tool.
A narrow majority of participants preferred having RENI accompany them when it hypothesised
based on the opening segment (a 6 second extension window) and then kept the same accompaniment
throughout the remainder of the performance. One participant felt that although the application
worked better when it hypothesised once, they would prefer the application's re-hypothesising
functionality if it performed better.
Participants were also given the opportunity to offer any additional comments and observations. The
most notable of these concerned the application's performance when it was set to re-hypothesise.
Participants were split in their preference for this setting. Some liked the idea of the performer,
rather than the percussive accompanist (in this case the application), being the tempo setter. Others
found the changes distracting, as they synchronised their playing with the tapping of the application
once it started, treating it as the tempo setter in the same way they would synchronise with the
playing of a metronome or a human percussionist. They disliked occasions where RENI subsequently
changed the rate of its tapping, perhaps because it misread an error by the performer as a change in
timing.
However, most complained of technical problems with RENI's performance in the re-hypothesising
mode. They complained that:
● the application would frequently change the Tactus interval of its accompaniment despite them
not having changed the tempo. For example, the application might change from tapping along
at an interval of 500 milliseconds to an interval of 1 second and then back to a 500 millisecond
interval.
● having re-hypothesised, the application would briefly stop providing accompaniment during
the performance before resuming its accompaniment according to a new Metrical
Hypothesis.
6.6 Observations and analysis of the results
These results show that RENI is reasonably good at tracking beats but not as good at inferring metre.
These contentions are borne out in the results of both types of evaluation and also correspond with
observations of the application in operation. Also notable from the subjective evaluation were the
problems and complaints participants had with the re-hypothesising mode. It is therefore worth
considering the reasons for RENI's shortcomings and looking beyond the statistics to observations
noted during the evaluations and while developing the application.
With respect to Beat Tracking, RENI failed to identify either the correct Tactus or a related multiple of
the Tactus in 20% of trials where no Metre heuristic was used. This could be attributable to timing
issues on the laptop on which RENI was evaluated. Closer analysis of the timing scores attributed to
Hypotheses by RENI on the same performances over multiple trials supports this contention, as does
the fact that RENI does not always produce the same output for the same performance with the same
settings over repeated trials.
It was also noticeable during evaluations that RENI had greater difficulty tracking the beats in
rhythmically complex pieces. For example, it does not cope well with syncopated pieces; syncopation
is not something that RENI attempts to address or cope with directly.
In terms of metre, RENI identified the correct metre of performances in only 27% of trials and
identified a related metre in a further 28% of trials. Therefore, in just under half of the trials it failed
to identify a correct, or close to correct, metre.
The inference of metre is likely to have been adversely affected by the timing and rhythmic
complexity issues described above. In terms of rhythmic complexity, RENI is only really suitable for
the inference of simple metres such as 2/4, 4/4 and 3/4.
Another reason for the relatively poorer performance in inferring metre is the lesser attention given to
metrical indicators in the scoring of Metrical Hypotheses. The only metrical indicator examined by
RENI is salience. Detection and analysis of repeating patterns of relative note positions and durations
in the performance would likely have contributed to a better Metrical Analysis performance. The
analysis of salience itself would have been enhanced if tonal relationships between notes were
examined in the detection of Chords.
The problems with re-hypothesising are also partly owing to imperfect timing on the MacBook on
which RENI ran during the evaluation. This, in combination with timing inconsistencies by a performer,
can cause a correct Hypothesis with an interval of 500 milliseconds to score highest the first time
RENI hypothesises and another hypothesis with an interval of 1 second to score highest on
subsequent occasions. RENI allows these changes because it treats the consecutive Extension
Windows within which it hypothesises as independent of each other. This allows RENI to be reactive to
dramatic changes in tempo and metre. However it also makes it overly sensitive to unintentional and
momentary changes owing to an error on the part of a performer who is attempting to keep the tempo
constant.
6.7 Comparison with other Beat Tracking applications
RENI's performance compares favourably with the performance of the Beat Tracking algorithms
evaluated in the 2006 Music Information Retrieval Exchange (MIREX) (McKinney et al 2006). Five
state of the art Beat Tracking and Tempo Extraction algorithms including those described in Dixon
(2007) and Klapuri et al (2006) were evaluated at MIREX 2006 in a manner similar to that used in the
evaluation of RENI. A set of 140 musical excerpts was used, each annotated by 40 different listeners.
On the basis of these, performance metrics were calculated to measure the algorithms' abilities to
locate beats.
On average, the five Beat Tracking algorithms evaluated, all of which operated off-line, located
between 45.3% and 57.5% of beats with a mean performance across the five algorithms of 54%.
Dixon (2007) scored highest with an average of 57.5% and Klapuri et al (2006) was third highest with
an average of 56.4%. RENI, in its evaluation, identified 60% of beat locations correctly while
operating in real time on a standard laptop.
However, it would be incorrect to conclude on the basis of this brief comparison that RENI is superior
to the algorithms evaluated at MIREX 2006, as the comparison is not like for like and therefore not
entirely valid.
● The MIREX evaluation calculated the percentage score differently. The manner in which the
error window for locating the same beats was calculated, and the manner in which overall
percentages were normalised, differed from the approaches used in the RENI evaluation.
● The corpus of musical excerpts used in the MIREX 2006 evaluation was larger, more
comprehensive and covered a greater variety of musical styles than that used in the RENI
evaluation. The number of annotators used was also considerably greater.
● The algorithms evaluated in MIREX 2006 are all audio based. They must therefore perform a
considerable amount of signal processing to infer Beat Tracking cues such as note onsets,
something they may not be able to do with 100% precision. RENI works on MIDI input,
which encodes Beat Tracking cues symbolically and with precision. RENI should be expected
to perform better on this basis.
However, it is still interesting to look at a comparison (however flawed) between RENI and those Beat
Tracking algorithms which are considered state of the art. The MIREX 2006 results, when compared
to RENI's results, demonstrate that relatively simple approaches to Beat Tracking, such as that
implemented in RENI, can be remarkably effective. In and of themselves, the MIREX 2006 results
demonstrate that computational Beat Tracking models still fall short of human Beat Tracking
capabilities.
7.0 Discussion
This section assesses what was achieved in this project: the capabilities of RENI, the extent to which
it met the requirements specified at the start of the project, and the degree to which the aims and
objectives described at the outset were achieved. The hypothesis under investigation is also reflected
upon in light of the outcome of the project, and further work to be carried out on RENI and
suggestions for future research are specified.
7.1 Analysis
7.1.1 Capabilities of RENI
RENI's capabilities as currently constituted fully meet all the requirements set out in section 3.1.
● RENI is capable of accepting musical performance data from a file or an external music
device/instrument in real time.
● RENI infers note onset information from this real time musical input.
● RENI implements a rule based algorithm which uses note onset information to track the beat
and infer the metre of the piece of music being performed.
● RENI produces audible percussive accompaniment to the musical input as it is being played.
● RENI also produces textual output for use in its evaluation.
● RENI is fully operable on a standard personal computer or laptop, so as to be usable by a wide
audience.
In total, RENI contains approximately 3500 lines of code.
7.1.2 Aims and objectives
The results of the evaluation demonstrate that the project achieved its primary aim. RENI and the
algorithm developed for it can track beats and infer the metre of an improvised musical performance
in real time using a rule based approach. In developing RENI the project achieved its related aim of
developing an application that can accept musical signals played in real time by a musician and
produce appropriately synchronised percussive accompaniment.
However, the extent to which these aims have been achieved is limited. As the evaluation results
demonstrate, RENI does not always successfully infer the beat or the metre of improvised musical
performances and does not fully emulate or match a human performing the same task. The application
has particular difficulty in correctly identifying the metre of musical performances.
As already discussed, these limitations are due to the following reasons:
● Inconsistent and inaccurate timing on the laptop RENI ran on during evaluation.
● The difficulty the application has in coping with rhythmically complex pieces. This affects
both the tracking of beats and the inference of metre.
● Salience is the only metrical indicator considered in the scoring of Metrical Hypotheses. The
lack of attention paid to other indicators of metre, such as repeating patterns of note onsets
and durations, together with the imprecise means of detecting chords, contributes to the
relatively poor record of the application in identifying metre.
This project also aimed to investigate and determine additional approaches and techniques that use
note onset information to successfully track the beats of musical signals in real time. Although the
algorithm implemented was based on one described by Rosenthal, there were significant differences.
These differences are reflected in the real time operation of RENI's algorithm as well as the
incorporation of several new techniques and heuristics used to guide the search process and to score
the inferred Metrical Hypotheses.
The operation of RENI's algorithm also contributed to the achievement of the aims and objectives
arising from the project's emphasis on performance accompaniment. The algorithm is implemented in
such a way as to accommodate imperfect timing, and its tolerance for imperfections is adjustable.
RENI can also be set to continually re-hypothesise, allowing it to cope with and respond
appropriately to variations in timing. The operation of this re-hypothesising mode is not entirely
satisfactory, however.
This project also used real musicians in the evaluation of the Beat Tracking application developed and
gauged their experience of being accompanied by it. This revealed differing preferences on the part of
musicians for playing with RENI in the re-hypothesising mode. While some liked the idea of a
computational accompanist acting as a tempo setter, others found it an unnatural means of interacting
with a percussive accompanist. They were more accustomed to synchronising with the output of a
percussive accompanist (such as a metronome or drummer) rather than having the percussive
accompanist continually synchronising and reacting to them.
7.1.3 Hypothesis
The hypothesis under consideration in this project was
“Knowledge of note onset information is sufficient for computationally inferring the
beat and metre of a piece of improvised drumless music in real time using a rule based
approach without any prior knowledge of metre or style for the purposes of providing
simple percussive performance accompaniment”
This hypothesis describes an approach to a Beat Tracking and Metrical Analysis problem under certain
assumed conditions. RENI has been developed in accordance with this hypothesis. It is based on a
rule based algorithm that uses note onset information to track the beats and infer the metre of a
musical performance in real time. Furthermore, it does so without any prior knowledge of style or
metre, and produces simple percussive accompaniment corresponding to the beat and metre inferred.
On the basis of the findings of this project I would conclude that knowledge of note onset times is
sufficient for inferring the beat and metre using a rule based algorithm under the conditions described
in the hypothesis. However, the results also show that this approach (combining note onset
information with a rule based algorithm) is not sufficient in all cases (it fails for some musical
performances) and may not fully emulate a human performing Beat Tracking and Metrical Analysis in
the manner outlined in the hypothesis.
Insofar as the approach described in the hypothesis and exemplified by RENI is insufficient to
correctly infer the beat and metre under the conditions specified, a crucial question arises. In the
cases where RENI is not able to correctly infer the beat and metre of a musical performance in real
time, under the conditions specified, is this due to
● the shortcomings of the application developed in this project, as outlined in section 6 and
repeated in section 7,
or
● the fact that, RENI's shortcomings aside, knowledge of note onset times is not sufficient to
correctly infer beat and metre in all cases under the conditions specified, and more information
is required to successfully perform this task in all cases?
In this project, the shortcomings of RENI undoubtedly contributed to the failure to correctly infer the
beat and metre for some musical performances, for the reasons outlined in section 6.6. If the time
were available to address these shortcomings, it is highly likely that the results observed in the
evaluation would improve.
It could also be argued that RENI's use of note onset information is limited and could be extended and
enhanced. RENI's operation consists mostly of a search for repeated inter-onset intervals and is driven
by the assumption that beats always coincide with note onsets. While this assumption is not generally
incorrect, it is not true of every musical piece: although rare, beats which do not coincide with a note
onset may occur in a performance. Note onset information could also be used in a rule based context
to identify repeating rhythmic patterns. This could be given greater emphasis in the overall algorithm
and could lead to superior inference of metre.
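The kind of search for repeated inter-onset intervals described above can be sketched as follows. This is a simplified, hypothetical reconstruction: the class and method names, the all-pairs comparison, and the tolerance parameter are illustrative assumptions, not RENI's actual implementation.

```java
import java.util.Map;
import java.util.TreeMap;

// Minimal sketch of a repeated inter-onset-interval (IOI) search,
// assuming note onset times in milliseconds.
public class IoiSearch {

    // Count how often each IOI recurs across all pairs of onsets, merging
    // intervals that fall within toleranceMs of an existing cluster centre,
    // so imperfectly timed performances still produce stable clusters.
    public static Map<Long, Integer> clusterIois(long[] onsets, long toleranceMs) {
        Map<Long, Integer> clusters = new TreeMap<>();
        for (int i = 0; i < onsets.length; i++) {
            for (int j = i + 1; j < onsets.length; j++) {
                long ioi = onsets[j] - onsets[i];
                Long match = null;
                for (Long centre : clusters.keySet()) {
                    if (Math.abs(centre - ioi) <= toleranceMs) {
                        match = centre;
                        break;
                    }
                }
                clusters.merge(match != null ? match : ioi, 1, Integer::sum);
            }
        }
        return clusters;
    }

    // The most frequently recurring interval is one plausible candidate
    // for the beat (Tactus) interval.
    public static long candidateBeatInterval(long[] onsets, long toleranceMs) {
        long best = 0;
        int bestCount = -1;
        for (Map.Entry<Long, Integer> e : clusterIois(onsets, toleranceMs).entrySet()) {
            if (e.getValue() > bestCount) {
                bestCount = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }
}
```

An all-pairs search is quadratic in the number of onsets, so a real time implementation would restrict it to a sliding window of recent onsets.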
However, even with these shortcomings addressed, it is still debatable whether RENI and the
approach it implements would be able to infer the beat and metre in real time for all musical
performances with no prior knowledge of style or metre. This is because the hypothesis under
investigation, which guided the development of RENI, describes the application of a very bounded
approach to a relatively unbounded problem.
The problem described isn't bounded, as it is assumed that the metre and style of the performance are
unknown and no other limiting assumptions are made about these attributes. With the addition of the
real time requirement, the hypothesis describes possibly the most challenging case of Beat Tracking
and Metrical Analysis. In contrast, the approach specified (analysing note onset information using a
rule based algorithm) is bounded and restricted. This combination not only makes the task facing
RENI more challenging, it also may not be a realistic model of the task RENI tries to emulate: a
human percussionist performing Beat Tracking. More information may be needed by the beat tracker
in order to be successful in all cases, and even then, the conditions of the problem may not reflect
those encountered in reality.
Unlike RENI, humans almost certainly use musical knowledge and memory, in addition to the timing
information coming into their ears, when performing Beat Tracking. This knowledge, according to
Allen and Dannenberg (1990), includes
● memory of specific performances and pieces
● memory of musical forms and styles
● knowledge of performer's style
This knowledge is particularly relevant to tracking beats in rhythmically complex pieces, which RENI
has difficulty with. In complex music there are competing rhythmic forces, and higher level
knowledge of the musical structure makes the correct interpretation clear to the human listener (Dixon
2007). Therefore, in order to disambiguate more difficult rhythmic patterns, some musical knowledge
is necessary (Dixon 2007). If we take this contention to be true, then since RENI's approach and the
hypothesis that guides it do not allow for the use of musical knowledge, it is very unlikely that the
approach will be sufficient in all cases.
The conditions specified for the problem may also be argued to be unrealistic and overly ambitious if
the view is taken that we are trying to emulate human Beat Tracking capabilities. The hypothesis
under investigation directs that beat and metre be inferred without any knowledge of style. This
frames RENI as a general or universal model of Beat Tracking and raises expectations that it be able
to track beats in all styles of music. According to Collins (2006), such a general Beat Tracking
solution is unrealistic. When one considers the variety of styles of music and the corresponding
multiplicity of metrical constructs, Collins's (2006) contention seems reasonable. Even if we assume,
like Allen and Dannenberg (1990), that a human beat tracker brings knowledge of style to the task,
this knowledge is not going to be exhaustive, and even if exhaustive knowledge were available,
encoding it all in a computational model would be too inefficient for real time Beat Tracking.
Therefore Collins's (2006) view, that we must model the training that encultured listeners undergo in
recognising and synchronising with contexts (or styles) when performing Beat Tracking and Metrical
Analysis, looks a more convincing and realistic proposition.
Perhaps then, the hypothesis under investigation would be better stated with some caveats limiting the
applicability of the rule based approach based on note onset information to performances of particular
styles and of limited rhythmical complexity.
However, despite the limitations of the approach RENI is based on, rule based algorithms based on
note onset information are still an interesting way to investigate and model the process of human Beat
Tracking. And even with its shortcomings, RENI is still an effective Beat Tracker and has the
potential to be of practical use in performance accompaniment and other contexts.
7.2 Further work on RENI
Further development of RENI will be carried out in order to make it a fully fledged application. This
may include the addition of the following features.
● A conventional user interface to make the application more usable.
● For a particular performance, allowing the user to change the Metrical Hypothesis providing
the accompaniment from amongst the Hypotheses that RENI infers.
● The use of drum loops in the provision of percussive output. Based on the Metrical
Hypothesis inferred, RENI could select an appropriate drum loop to play. This would allow
RENI to provide more advanced percussive accompaniment.
● The incorporation of additional heuristics, informing the application of the likely beat interval
and the style of the piece being performed.
● The specification of separate re-hypothesising windows of a different duration from the
extension window. This would make the application less sensitive to performer errors when
re-hypothesising.
The Beat Tracking algorithm can also be improved. These potential enhancements would, however,
form the basis for future research and are discussed in the next section.
7.3 Directions for future research
There is scope for further investigation into real time Beat Tracking and Metrical Analysis and the
modelling of a percussive accompanist performing in real time with an improvised performance,
within the bounds set in this project (rule based and note onset times used as Beat Tracking cues).
Using RENI as a basis, further research could be carried out into building a more effective rule based
model of Beat Tracking and Metrical Analysis. There is also much scope for further investigation into
the Beat Tracking problem generally; beyond the bounds set in this project.
Further investigation could be carried out into the methodology for scoring Hypotheses in RENI to
determine additional and more effective metrics for judging the plausibility of Metrical Hypotheses.
Superior metrics for judging the inference of metre in RENI would be of particular interest. As stated
previously, research could be carried out into a rule based approach for inferring metre by analysing
patterns of timing and duration in note onset information. Implementing such capabilities for a real
time application like RENI would be especially interesting.
The examination of the Hypothesis in section 7.1.3 suggested that some use of representations of
musical knowledge and learning would be necessary for inferring the beat and metre of rhythmically
complex pieces. Nonetheless, research could be carried out into rule based techniques which attempt
to directly address and recognise the sources of rhythmic complexity (syncopation etc.) and how these
could be incorporated into RENI. Such techniques may not make RENI successful in all cases, but
they may improve its performance in evaluations. Generating a set of rules for recognising and
reacting to rhythmic complexity cues would also be interesting from a musical cognition perspective.
Further research could also be carried out into creating a real time Beat Tracker, outside the
restrictions on approach specified in the project's hypothesis. As Allen and Dannenberg (1990) point
out, humans almost certainly use musical knowledge and memory when Beat Tracking. Research
could be carried out into how such musical knowledge and memory could be represented
computationally and used in a real time, improvised Beat Tracking scenario such as that assumed by
RENI. The use of machine learning techniques for Beat Tracking could also be investigated. For
example, a beat tracker such as RENI could plausibly learn and become attuned to the performance
style of a particular performer in the same way that a human percussionist may become attuned to the
style of a performer that he/she regularly accompanies. Such techniques could also be used to
disambiguate between musical styles in order to improve the recognition of metre.
The problems experienced by performers with RENI's re-hypothesising mode present the most
interesting opportunity for further research. Research inspired by these issues could be carried out as
an effort to model the interaction between a performer and a percussive accompanist where the
percussive accompanist bases the initial accompaniment on the performance (as happens with RENI).
In terms of the problems experienced with RENI inappropriately changing its accompaniment,
research could be carried out into how these problems could best be addressed when the performance
acts as the tempo setter by looking at:
● How does the initial hypothesis inferred influence the inference of subsequent hypotheses in
the same performance? This would differ from RENI, which currently treats each consecutive
hypothesis as independent.
● How could a beat tracker like RENI, when re-hypothesising, distinguish between performance
errors on the part of the performer and genuine changes in tempo and metre? For RENI, this
could mean only changing the tempo of accompaniment if the most recent hypothesis is
sufficiently different from the initial hypothesis.
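The guard suggested in the second point could be sketched along the following lines. The relative-difference test, the class name, and the threshold value are illustrative assumptions, not a description of RENI's actual logic.

```java
// Hypothetical guard for re-hypothesising: only adopt a new tempo when the
// latest hypothesis differs sufficiently from the one currently driving
// the accompaniment.
public class RehypothesisGuard {

    // currentTactusMs: Tactus interval of the hypothesis driving the
    // accompaniment; newTactusMs: interval of the most recent hypothesis;
    // threshold: relative change (e.g. 0.10 for 10%) above which the
    // accompaniment tempo should change.
    public static boolean shouldChangeTempo(long currentTactusMs,
                                            long newTactusMs,
                                            double threshold) {
        double relativeChange =
                Math.abs(newTactusMs - currentTactusMs) / (double) currentTactusMs;
        return relativeChange > threshold;
    }
}
```

A small timing wobble (say 500 ms to 505 ms) would then be absorbed as performer imprecision, while a large shift (500 ms to 600 ms) would be treated as a genuine tempo change.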
More interesting, however, are the problems experienced by performers in having the percussionist
treat them as a tempo setter rather than the other way around. In this respect, an investigation could
be carried out into how the performer reacts to percussive accompaniment. Once the percussive
accompaniment starts and synchronises with the performer, does the performer in turn attempt to stay
synchronised with the accompaniment? Does the percussionist go from tempo follower to tempo
setter, when does this occur, and how should a beat tracker such as RENI best behave in this context?
In a more general sense, the development of an audio based solution to the real time, improvised Beat
Tracking problem addressed in this project would also be an interesting research project.
7.4 Conclusion
This project investigated if knowledge of note onset information in musical performance is sufficient
for computationally inferring the beat and metre of a piece of improvised drum-less music in real time
using a rule based approach without any prior knowledge of metre or style for the purposes of
providing simple percussive performance accompaniment.
During the lifetime of the project, the application RENI was developed using Java for use on a
standard laptop. RENI is an attempt to model human Beat Tracking and is also a performance
accompaniment tool intended for practical use.
RENI accepts musical performance data in MIDI format and implements a rule based algorithm
which infers Metrical Hypotheses by searching for regular intervals between note onsets in real time.
These regularly spaced onsets are represented as Beat Levels which are then combined to form
Metrical Hypotheses. RENI selects the most plausible Metrical Hypothesis inferred for a particular
musical performance and produces aural percussive accompaniment to indicate strong and weak beats
in the performance.
RENI was evaluated quantitatively and subjectively. It demonstrated proficiency in identifying beat
locations but experienced difficulties in correctly inferring metre, particularly in rhythmically
complex pieces. Performers also experienced difficulty playing with RENI's re-hypothesising mode.
These findings demonstrate that RENI's approach (note onset information in a rule based algorithm) is
sufficient for real time Beat Tracking and Metrical Analysis, but not in all cases. This is because
RENI's capabilities fall short of those of human beat trackers. Also, its aim to infer the metre of all
pieces without prior knowledge of metre or style, thereby making it a universal Beat Tracker, may be
an unrealistically ambitious one. RENI would be a more realistic model of human Beat Tracking if it
used representations of musical knowledge and attempted to be more style specific.
Nonetheless, RENI has the potential to be a practical and useful performance accompaniment
application. Further work on RENI, and on the future research directions specified in this project,
should improve RENI both as an application and as a model of human Beat Tracking.
BIBLIOGRAPHY
Allen, P. E. & Dannenberg, R. B. (1990). Tracking Musical Beats in Real Time, Proceedings of the
1990 International Computer Music Conference, 140–143. Glasgow: ICMA.
Collins, N. (2006). Towards a Style-Specific Basis for Computational Beat Tracking. Proceedings of
the 9th International Conference on Music Perception & Cognition. ICMPC and ESCOM, Bologna,
Italy, pp. 461-467
Desain, P. & Honing, H. (1994). A Brief Introduction to Beat Induction, Proceedings of the 1994
International Computer Music Conference, 78–79. San Francisco: International Computer Music
Association.
Desain, P. & Honing, H. (1999). Computational Models of Beat Induction: The Rule-based Approach,
Journal of New Music Research, 28(1), 29–42.
Dixon, S. (2007). Evaluation of the Audio Beat Tracking System BeatRoot, Journal of New Music
Research, 36(1), 39–50.
Eck, D. (2001). A Positive Evidence Model for Rhythmical Beat Induction, Journal of New Music
Research, 30(2), 187–200.
Eck, D. (2002). Real-time Musical Beat Induction with Spiking Neural Networks, Technical Report
IDSIA-22-02, IDSIA, Manno, Switzerland.
Goto, M. (2001). An Audio-based Real-time Beat Tracking System for Music With or Without
Drum-Sounds, Journal of New Music Research, 30(2), 159–171.
Klapuri, A., Eronen, A. & Astola, J. (2006). Analysis of the Meter of Acoustic Musical Signals, IEEE
Transactions on Audio, Speech, and Language Processing, 14(1), 342–355.
Large, E.W. (1995). Beat Tracking with a Non-linear Oscillator, Working Notes of the IJCAI-95
Workshop on Artificial Intelligence and Music, 24–31.
McKinney, M., Moelants, D., Davies, M. & Klapuri, A. (2007). Evaluation of Audio Beat Tracking
and Music Tempo Extraction Algorithms, Journal of New Music Research, 36(1), 1–16.
Raphael, C. (2003). Orchestra in a Box: A System for Real-time Musical Accompaniment, Working
Notes of the IJCAI-03 Rencon Workshop.
Rosenthal, D. (1992). Emulation of Human Rhythm Perception, Computer Music Journal, 16(1),
64–76.
Rosenthal, D. (1992). Machine Rhythm: Computer Emulation of Human Rhythm Perception, PhD
thesis.
Scheirer, E.D. (1998). Tempo and Beat Analysis of Acoustical Musical Signals, Journal of the
Acoustical Society of America, 103, 588–601.
APPENDIX
A.1 Evaluation corpus
MIDI recordings and scores for the following musical pieces were used in the quantitative functional
evaluation. These files are available to download at http://donalmulvihill.wordpress.com/reni
Name                                    Composer/Artist
A Breeze from Alabama                   Joplin
Aamulla varhain
Abstract 1                              Doonan
Aria                                    AM Bach
Flash Dance                             Moroder
Fugue 6 – BWV 851                       JS Bach
Fur Elise                               Beethoven
Giselle                                 Adam
God Save the Queen
Horn Trio                               Brahms
Killing Me Softly                       Fox/Gimbel
Losing My Religion                      REM
Love and Marriage                       Van Heusen
Minuet in F                             L Mozart
Minuet in G                             JS Bach
Moonlight Sonata                        Beethoven
Nightswimming                           REM
Prelude from Carmen                     Bizet
Pyramid Song                            Radiohead
Rondino                                 Rameau
Rondo                                   CP Bach
Russian Folk Tune                       Beethoven
Sing Ivy
The Entertainer                         Joplin
The Washington Post                     Sousa
Toccatina                               Brown
Traditioner af Swenska Folk-Dansar
With or Without You                     U2
Your Song                               John/Taupin
A.2 Quantitative functional evaluation results
The following table lists the results from individual trials in the quantitative functional evaluation as
described in section 6. The table is separated into the following sections:
● Details on the trial
● Attributes describing the interpretation of the annotator
● Attributes describing the settings and interpretation of RENI.
● Statistical comparison of the two interpretations
Explanations for some of the fields are as follows:
● Tactus – the interval of the Tactus expressed in milliseconds
● Window – the duration of the window used by RENI, expressed in milliseconds
● Heuristic – indicates whether the Metre Heuristic was on for the trial
● Beat % – percentage of beat locations identified successfully by RENI
● Type % – percentage of beat locations, together with the type of beat, identified successfully
by RENI
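The Beat % metric can be illustrated with a sketch along the following lines: the fraction of annotated beat times for which RENI produced a beat within a tolerance window. The matching rule and the tolerance value shown are assumptions for illustration; the exact procedure is the one defined in section 6.

```java
// Minimal sketch of a "Beat %" style comparison between an annotator's
// beat times and a tracker's beat times, both in milliseconds.
public class BeatScore {

    public static double beatPercentage(long[] annotated, long[] tracked, long toleranceMs) {
        if (annotated.length == 0) {
            return 0.0;
        }
        int matched = 0;
        for (long a : annotated) {
            for (long t : tracked) {
                if (Math.abs(t - a) <= toleranceMs) {
                    matched++;
                    break; // each annotated beat counts at most once
                }
            }
        }
        return (double) matched / annotated.length;
    }
}
```

Type % would extend this by additionally requiring that the matched beat carry the same type (strong or weak) as the annotated beat.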
Trial | Song | Metre (annotator) | Tactus (annotator) | Window | Heuristic | Tactus (RENI) | Metre (RENI) | Beat % | Type %
1 a-breeze-from-alabama.mid2-4 357 6000 N 359 3-4 0.64 0.342 a-breeze-from-alabama.mid2-4 357 4000 N 365 4-4 0.87 0.233 a-breeze-from-alabama.mid2-4 357 4000 N 365 4-4 0.98 0.254 a-breeze-from-alabama.mid2-4 357 5000 N 366 3-4 0.36 0.225 AAMULLAVARHAIN.mid 4-4 964 5000 N 999 2-4 0.63 0.486 AAMULLAVARHAIN.mid 4-4 964 4000 N 500 3-4 0.3 0.197 AAMULLAVARHAIN.mid 4-4 964 5000 N 1000 2-4 0.07 0.078 AAMULLAVARHAIN.mid 4-4 964 5000 N 1000 2-4 0.48 0.379 AAMULLAVARHAIN.mid 4-4 964 6000 N 7423 1 0.04 0
10 AAMULLAVARHAIN.mid 4-4 964 4000 N 499 3-4 0.7 0.3311 AAMULLAVARHAIN.mid 4-4 964 6000 N 999 2-4 0.19 0.1912 AAMULLAVARHAIN.mid 4-4 964 6000 N 999 2-4 0.19 0.1913 AAMULLAVARHAIN.mid 4-4 964 6000 N 7407 1 0.04 014 Alabama 2-4 357 6000 N 549 1,5 0.33 0.1415 Alabama 2-4 357 6000 N 365 4-4 0.01 0.0116 Aria.mid 3-4 592 6000 N 600 2-4 0.94 0.4717 Aria.mid 3-4 592 4000 N 599 3-4 1 0.3418 Aria.mid 3-4 592 5000 N 599 2-4 1 0.5119 Aria.mid 3-4 592 4000 N 600 3-4 0.98 0.3320 Aria.mid 3-4 592 6000 N 600 2-4 0.98 0.4821 Aria.mid 3-4 592 6000 N 599 2-4 0.98 0.4922 BachMinu.mid 3-4 587 4000 N 598 3-4 0.74 0.7423 BachMinu.mid 3-4 587 5000 N 598 2-4 0.97 0.4924 BachMinu.mid 3-4 587 6000 N 598 4-4 0.41 0.2125 BachMinu.mid 3-4 587 4000 N 900 2-4 0.13 0.1326 BachMinu.mid 3-4 587 5000 N 599 2-4 0.87 0.4427 BachMinu.mid 3-4 587 6000 N 601 4-4 0.38 0.1828 bwv851_fugue06.mid 4-4 355 4000 N 365 3-4 0.93 0.5529 bwv851_fugue06.mid 4-4 360 6000 N 731 2-4 0.41 0.4130 bwv851_fugue06.mid 4-4 355 6000 N 1005 2-4 0.21 0.1331 bwv851_fugue06.mid 4-4 355 5000 N 550 3-4 0.38 0.2232 bwv851_fugue06.mid 4-4 355 5000 N 731 2-4 0.01 0.0133 bwv851_fugue06.mid 4-4 355 4000 N 366 2-4 0.69 0.5334 entertainer.mid 2-4 248 6000 N 252 3-4 0.82 0.4335 entertainer.mid 2-4 248 4000 N 375 2-4 0.51 0.2736 entertainer.mid 2-4 248 6000 N 375 3-4 0.46 0.2437 entertainer.mid 2-4 248 6000 N 374 1 5 0.47 0.2538 entertainer.mid 2-4 248 4000 N 375 4-4 0.47 0.1939 entertainer.mid 2-4 248 4000 N 251 1 7 0.83 0.4240 entertainer.mid 2-4 248 5000 N 375 3-4 0.48 0.2541 entertainer.mid 2-4 248 5000 N 374 3-4 0.49 0.2442 flashdance2.mid 4-4 425 4000 N 429 4-4 0.35 0.2143 flashdance2.mid 4-4 425 6000 N 639 2-4 0.33 0.0944 flashdance2.mid 4-4 425 4000 N 425 4-4 0.98 0.9845 flashdance2.mid 4-4 425 5000 N 425 3-4 0.87 0.5146 flashdance2.mid 4-4 425 6000 N 851 3-4 0.48 0.2447 flashdance2.mid 4-4 425 6000 N 844 2-4 0.2 0.1348 flashdance2.mid 4-4 425 5000 N 638 2-4 0.31 0.12
49 FurElise.mid 3-4 476 5000 N 480 4-4 0.77 0.4650 FurElise.mid 3-4 476 6000 N 239 3-4 0.87 0.8651 FurElise.mid 3-4 476 5000 N 240 3-4 0.85 0.8552 FurElise.mid 3-4 476 6000 N 240 3-4 0.96 0.7353 FurElise.mid 3-4 476 4000 N 239 3-4 0.79 0.6954 FurElise.mid 3-4 476 6000 N 239 3-4 0.81 0.7155 FurElise.mid 3-4 476 4000 N 240 1 7 0.96 0.5856 FurElise.mid 3-4 476 4000 N 239 3-4 0.72 0.5857 giselle.mid 2-4 740 5000 N 375 4-4 0.92 0.9258 giselle.mid 2-4 740 4000 N 374 4-4 0.9 0.959 giselle.mid 2-4 740 6000 N 749 3-4 0.18 0.160 giselle.mid 2-4 740 4000 N 750 2-4 0.93 0.9361 giselle.mid 2-4 740 4000 N 750 2-4 0.92 0.9262 God Save the Queen 3-4 778 4000 N 799 2-4 0.83 0.3763 God Save the Queen 3-4 778 4000 N 800 2-4 0.69 0.2964 God Save the Queen 3-4 778 5000 N 800 3-4 0.34 0.3465 God Save the Queen 3-4 778 5000 N 800 3-4 0.63 0.6366 God Save the Queen 3-4 778 6000 N 800 2-4 0.65 0.3767 God Save the Queen 3-4 778 6000 N 800 2-4 0.54 0.2668 LeoMin.mid 3-4 394 5000 N 399 2-4 1 0.5169 LeoMin.mid 3-4 394 4000 N 399 3-4 1 170 LeoMin.mid 3-4 394 6000 N 400 3-4 0.95 0.4871 LeoMin.mid 3-4 394 6000 N 400 3-4 0.95 0.4872 LeoMin.mid 3-4 394 4000 N 400 3-4 1 173 loveandmarriage.mid 4-4 495 6000 N 500 1 5 0.97 0.6274 loveandmarriage.mid 4-4 495 6000 N 500 4-4 0.96 0.9575 loveandmarriage.mid 4-4 495 4000 N 499 3-4 1 0.5776 loveandmarriage.mid 4-4 495 6000 N 502 4-4 0.44 0.2677 loveandmarriage.mid 4-4 495 4000 N 496 3-4 0.52 0.3178 loveandmarriage.mid 4-4 495 5000 N 499 4-4 0.34 0.3379 Moonlight 2-4 1176 4000 N 799 2-4 0.01 0.0180 Moonlight 2-4 1176 5000 N 800 3-4 0.44 0.4481 Moonlight 2-4 1176 4000 N 400 4-4 0.58 0.4982 Moonlight 2-4 1176 4000 N 801 2-4 0.01 0.0183 Moonlight 2-4 1173 6000 N 400 1,6 0.01 084 prelude.mid 2-4 247 5000 N 248 1 8 0.75 0.4685 prelude.mid 2-4 247 6000 N 251 1 7 0.76 0.486 prelude.mid 2-4 247 6000 N 250 1 10 0.73 0.3787 prelude.mid 2-4 247 4000 N 252 3-4 0.82 0.488 prelude.mid 2-4 247 4000 N 369 4-4 0.56 0.2889 prelude.mid 2-4 247 5000 N 376 3-4 0.52 0.2490 
Rameau.mid 3-4 592 6000 N 605 4-4 0.18 0.1191 Rameau.mid 3-4 592 5000 N 605 3-4 0.45 0.1592 Rameau.mid 3-4 592 6000 N 604 4-4 0.16 0.193 Rameau.mid 3-4 592 5000 N 605 3-4 0.36 0.1194 RussianFolk.mid 2-4 590 4000 N 900 2-4 0.3 0.1795 RussianFolk.mid 2-4 590 4000 N 300 3-4 0.91 0.596 RussianFolk.mid 2-4 590 6000 N 599 4-4 0.96 0.7297 RussianFolk.mid 2-4 590 6000 N 899 3-4 0.3 0.1598 RussianFolk.mid 2-4 590 5000 N 600 3-4 0.94 0.46
99 RussianFolk.mid 2-4 590 6000 N 600 4-4 0.89 0.67100 RussianFolk.mid 2-4 590 5000 N 300 4-4 0.89 0101 SingIvy.mid 6-8 722 6000 N 750 3-4 0.79 0.42102 SingIvy.mid 6-8 722 4000 N 750 2-4 0.82 0.82103 SingIvy.mid 6-8 722 5000 N 750 3-4 0.74 0.39104 SingIvy.mid 6-8 722 6000 N 749 3-4 0.86 0.47105 SingIvy.mid 6-8 722 4000 N 250 4-4 0.77 0.51106 sousa_washington_post.mid6-8 486 5000 N 500 4-4 0.5 0.13107 sousa_washington_post.mid6-8 486 4000 N 500 2-4 0.45 0.45108 sousa_washington_post.mid6-8 486 5000 N 499 4-4 0.88 0.23109 sousa_washington_post.mid6-8 486 6000 N 500 1 5 0.67 0.34110 sousa_washington_post.mid6-8 486 4000 N 500 2-4 0.34 0.34111 sousa_washington_post.mid6-8 486 6000 N 499 4-4 0.89 0.65112 toccatina.mid 3-4 545 5000 N 277 3-4 0.98 0.33113 toccatina.mid 3-4 545 6000 N 555 2-4 0.98 0.49114 toccatina.mid 3-4 545 6000 N 278 1 9 0.96 0.73115 toccatina.mid 3-4 545 6000 N 278 2-4 0.81 0.29116 toccatina.mid 3-4 545 4000 N 555 3-4 1 1117 toccatina.mid 3-4 545 4000 N 277 3-4 0.9 0.9118 traditioner_af_swenska_folk_dansar.1.1.mid3-4 492 5000 N 750 3-4 0.32 0.1119 traditioner_af_swenska_folk_dansar.1.1.mid3-4 492 5000 N 750 3-4 0.32 0.1120 traditioner_af_swenska_folk_dansar.1.1.mid3-4 492 4000 N 750 2-4 0.27 0.27121 traditioner_af_swenska_folk_dansar.1.1.mid3-4 492 6000 N 750 3-4 0.23 0.1122 with.mid 4-4 496 5000 N 500 4-4 0.99 0.99123 with.mid 4-4 496 5000 N 500 4-4 0.46 0.22124 with.mid 4-4 496 4000 N 500 2-4 0.86 0.66125 a-breeze-from-alabama.mid2-4 357 4000 Y 617 2-4 0.29 0.16126 a-breeze-from-alabama.mid2-4 357 5000 Y 367 2-4 0.36 0.35127 a-breeze-from-alabama.mid2-4 357 5000 Y 363 2-4 0.74 0128 a-breeze-from-alabama.mid2-4 357 4000 Y 617 2-4 0.3 0.16129 a-breeze-from-alabama.mid2-4 357 6000 Y 549 2-4 0.35 0.17130 AAMULLAVARHAIN.mid 4-4 964 5000 Y 998 2-4 0.67 0.52131 AAMULLAVARHAIN.mid 4-4 964 5000 Y 999 2-4 0.74 0.56132 AAMULLAVARHAIN.mid 4-4 964 6000 Y 1000 2-4 0.44 0.37133 Aria.mid 3-4 592 6000 Y 600 3-4 0.97 0.33134 Aria.mid 3-4 592 4000 Y 600 3-4 0.98 
0.33135 Aria.mid 3-4 592 4000 Y 600 3-4 0.98 0.33136 Aria.mid 3-4 592 5000 Y 600 3-4 0.93 0.31137 BachMinu.mid 3-4 587 5000 Y 600 3-4 0.77 0.26138 BachMinu.mid 3-4 587 4000 Y 599 3-4 0.56 0.56139 BachMinu.mid 3-4 587 6000 Y 599 3-4 0.87 0.31140 BachMinu.mid 3-4 587 5000 Y 600 3-4 0.77 0.26141 BachMinu.mid 3-4 587 6000 Y 600 3-4 0.51 0.18142 BachMinu.mid 3-4 587 4000 Y 599 3-4 0.56 0.56143 bwv851_fugue06.mid 4-4 355 6000 Y 746 2-4 0.31 0.16144 bwv851_fugue06.mid 4-4 355 6000 Y 732 2-4 0.47 0.47145 bwv851_fugue06.mid 4-4 355 4000 Y 732 2-4 0.11 0.05146 bwv851_fugue06.mid 4-4 355 4000 Y 366 2-4 0.47 0.13147 bwv851_fugue06.mid 4-4 355 6000 Y 732 2-4 0.42 0.41148 entertainer.mid 2-4 248 4000 Y 373 2-4 0.53 0.27
149 entertainer.mid 2-4 248 5000 Y 253 2-4 0.69 0.33
150 entertainer.mid 2-4 248 4000 Y 376 2-4 0.53 0.26
151 entertainer.mid 2-4 248 5000 Y 375 2-4 0.55 0.29
152 entertainer.mid 2-4 248 6000 Y 371 2-4 0.55 0.29
153 entertainer.mid 2-4 248 5000 Y 252 2-4 0.76 0.4
154 entertainer.mid 2-4 248 6000 Y 369 2-4 0.52 0.26
155 flashdance2.mid 4-4 425 6000 Y 425 4-4 0.79 0.42
156 flashdance2.mid 4-4 425 6000 Y 429 2-4 0.42 0.27
157 flashdance2.mid 4-4 425 4000 Y 425 4-4 0.96 0.96
158 flashdance2.mid 4-4 425 5000 Y 847 2-4 0.21 0.1
159 FurElise.mid 3-4 476 5000 Y 240 3-4 0.7 0.67
160 FurElise.mid 3-4 476 4000 Y 240 3-4 0.87 0.44
161 FurElise.mid 3-4 476 6000 Y 724 3-4 0.24 0.14
162 FurElise.mid 3-4 476 6000 Y 714 3-4 0.28 0.13
163 giselle.mid 2-4 740 6000 Y 1124 2-4 0.33 0.17
164 giselle.mid 2-4 740 6000 Y 750 2-4 0.88 0.88
165 giselle.mid 2-4 740 5000 Y 374 2-4 0.92 0.44
166 giselle.mid 2-4 740 5000 Y 374 2-4 0.93 0.44
167 giselle.mid 2-4 740 4000 Y 750 2-4 0.88 0.88
168 giselle.mid 2-4 740 4000 Y 375 2-4 0.82 0.4
169 God Save the Queen 3-4 778 6000 Y 800 3-4 0.46 0.46
170 God Save the Queen 3-4 778 6000 Y 800 3-4 0.63 0.63
171 God Save the Queen 3-4 778 5000 Y 799 3-4 0.71 0.71
172 God Save the Queen 3-4 778 5000 Y 799 3-4 0.71 0.71
173 LeoMin.mid 3-4 394 5000 Y 400 3-4 1 0.33
174 LeoMin.mid 3-4 394 6000 Y 800 3-4 0.48 0.16
175 LeoMin.mid 3-4 394 6000 Y 799 3-4 0.48 0.16
176 LeoMin.mid 3-4 394 5000 Y 399 3-4 1 0.33
177 LeoMin.mid 3-4 394 4000 Y 399 3-4 1 1
178 loveandmarriage.mid 4-4 495 4000 Y 500 2-4 1 0.26
179 loveandmarriage.mid 4-4 495 6000 Y 667 2-4 0.29 0.13
180 loveandmarriage.mid 4-4 495 5000 Y 500 4-4 1 0.99
181 loveandmarriage.mid 4-4 495 5000 Y 494 4-4 0.49 0.46
182 loveandmarriage.mid 4-4 495 6000 Y 499 2-4 0.17 0.06
183 Moonlight 2-4 1176 6000 Y 1200 2-4 0.87 0.87
184 Moonlight 2-4 1176 6000 Y 399 2-4 0.93 0.93
185 Moonlight 2-4 1176 4000 Y 799 2-4 0.44 0.24
186 prelude.mid 2-4 247 4000 Y 375 2-4 0.58 0.29
187 prelude.mid 2-4 247 5000 Y 375 2-4 0.52 0.24
188 prelude.mid 2-4 247 5000 Y 375 2-4 0.48 0.24
189 prelude.mid 2-4 247 4000 Y 374 2-4 0.51 0.25
190 prelude.mid 2-4 247 6000 Y 377 2-4 0.51 0.24
191 Rameau.mid 3-4 592 6000 Y 910 3-4 0.25 0.14
192 Rameau.mid 3-4 592 5000 Y 602 3-4 0.43 0.14
193 Rameau.mid 3-4 592 4000 Y 601 3-4 0.25 0.25
194 Rameau.mid 3-4 592 4000 Y 602 3-4 0.18 0.18
195 Rameau.mid 3-4 592 5000 Y 605 3-4 0.47 0.16
196 Rameau.mid 3-4 592 6000 Y 909 3-4 0.25 0.14
197 RussianFolk.mid 2-4 590 5000 Y 300 2-4 0.94 0.46
198 RussianFolk.mid 2-4 590 4000 Y 300 2-4 0.94 0.46
199 RussianFolk.mid 2-4 590 6000 Y 600 2-4 0.91 0.91
200 RussianFolk.mid 2-4 590 5000 Y 300 2-4 0.94 0.46
201 RussianFolk.mid 2-4 590 4000 Y 900 2-4 0.31 0.17
202 SingIvy.mid 6-8 722 4000 Y 749 2-4 0.63 0.63
203 SingIvy.mid 6-8 722 6000 Y 969 2-4 0.23 0.11
204 SingIvy.mid 6-8 722 4000 Y 750 2-4 0.79 0.79
205 SingIvy.mid 6-8 722 6000 Y 969 2-4 0.21 0.11
206 sousa_washington_post.mid 6-8 486 5000 Y 502 2-4 0.58 0.58
207 sousa_washington_post.mid 6-8 486 5000 Y 500 2-4 0.91 0.91
208 sousa_washington_post.mid 6-8 486 4000 Y 500 2-4 0.05 0.05
209 sousa_washington_post.mid 6-8 486 6000 Y 499 2-4 0.12 0.12
210 sousa_washington_post.mid 6-8 486 6000 Y 500 2-4 0.62 0.62
211 toccatina.mid 3-4 545 4000 Y 555 3-4 0.99 0.99
212 toccatina.mid 3-4 545 6000 Y 277 3-4 0.79 0.23
213 toccatina.mid 3-4 545 4000 Y 555 3-4 0.99 0.99
214 toccatina.mid 3-4 545 6000 Y 278 3-4 0.86 0.86
215 toccatina.mid 3-4 545 5000 Y 555 3-4 1 0.34
216 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 4000 Y 500 3-4 0.97 0.97
217 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 5000 Y 750 3-4 0.32 0.1
218 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 6000 Y 750 3-4 0.32 0.1
219 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 5000 Y 750 3-4 0.23 0.1
220 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 4000 Y 501 3-4 0.17 0.17
221 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 6000 Y 749 3-4 0.3 0.1
222 with.mid 4-4 496 6000 Y 501 4-4 0.86 0.44
223 with.mid 4-4 496 4000 Y 750 2-4 0.33 0.08
224 with.mid 4-4 496 4000 Y 250 4-4 0.8 0.32
225 with.mid 4-4 496 6000 Y 503 4-4 0.31 0.17
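A pattern visible in the raw results above is that the inferred beat period is often a simple multiple or fraction of the actual one (e.g. row 165: giselle.mid, actual 740, inferred 374). The sketch below illustrates one way such rows might be checked, assuming (this is not stated in the table itself) that the number before the listening window is the actual beat period in milliseconds and the value after the Y flag is the inferred period. The function name, tolerance, and ratio set are illustrative assumptions, not part of RENI's evaluation code.

```python
def beat_period_matches(actual_ms, inferred_ms, tolerance=0.05):
    """Return the metrical ratio at which the inferred beat period matches
    the actual one (within a relative tolerance), or None if no simple
    ratio fits. Beat trackers commonly lock on at half or double the true
    beat, so those ratios are treated as near-misses worth identifying."""
    for ratio in (1.0, 2.0, 0.5, 3.0, 1.0 / 3.0, 1.5):
        target = actual_ms * ratio
        if abs(inferred_ms - target) <= tolerance * target:
            return ratio
    return None

# Row 165: giselle.mid, actual 740 ms, inferred 374 ms -> half-beat lock-on.
print(beat_period_matches(740, 374))   # 0.5
# Row 157: flashdance2.mid, actual 425 ms, inferred 425 ms -> exact match.
print(beat_period_matches(425, 425))   # 1.0
```

Under this reading, many of the low-scoring rows are not random failures but tempo-octave errors, which is consistent with the half/double periods recurring throughout the table.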
APPENDIX
A.3 Qualitative evaluation questionnaire
SECTION A – Your Musical Background
1. Have you received any formal musical training/instruction? (circle answer)
YES NO
2. If yes, to what level? (Indicate grade attained if applicable. Otherwise describe nature of training)
__________________________________________________________________
__________________________________________________________________
3. What musical instruments can you play? (List instruments you can play)
__________________________________________________________________
__________________________________________________________________
4. How would you rate your musical performance ability?
(Rate on a scale of 1-10, 10 being professional/concert level, 1 being no musical ability)
1 2 3 4 5 6 7 8 9 10
SECTION B – Your Performance in the Experiment
5. During the experiment, what musical pieces did you play? (If you know some of the pieces you
played, list name of piece, composer and style if possible)
______________________________________________________________
______________________________________________________________
______________________________________________________________
6. How would you rate the timing of your performance? (Rate on a scale of 1-10, Select 10 if you
feel you kept near perfect timing, Select 1 if your performance was devoid of any rhythm or sense of
timing)
1 2 3 4 5 6 7 8 9 10
7. How rhythmically complex were the pieces that you played? A rhythmically complex piece
would not have a very clear metre or time signature and would be difficult to tap along to. (Rate
on a scale of 1-10; 10 indicating a piece that is very difficult to tap along to, 1 being a piece that
effectively taps along to itself – a series of evenly spaced notes)
1 2 3 4 5 6 7 8 9 10
SECTION C – The performance of the Application
8. In your judgement, how good was the application at tapping along in time with what you
were playing? (Rate on a scale of 1-10, Select 10 if you feel the application tapped along perfectly
with all pieces, Select 1 if you felt the application completely failed to tap along correctly in all
pieces)
1 2 3 4 5 6 7 8 9 10
9. In your judgement, how good was the application at accurately locating the strong beats (the
start of a measure) and identifying the metre in the pieces you were playing? (Rate on a scale of
1-10, Select 10 if you feel the application performed this task perfectly for all pieces, Select 1 if you
felt the application completely failed at this task for all pieces)
1 2 3 4 5 6 7 8 9 10
10. When the application was set to change the beat tempo in reaction to your playing, how well
do you feel the application kept in time with your playing? (Rate on a scale of 1-10, Select 10 if
you feel the application performed this task perfectly for all pieces, Select 1 if you felt the application
completely failed at this task for all pieces)
1 2 3 4 5 6 7 8 9 10
11. When performing with the application, which of the following did you prefer, find
easier or find more natural? (select one option)
- Playing when the application continually updated the tempo of the beats it was playing – The
application treats your performance as the tempo setter.
- Playing when the application set the tempo and style of the beat based on what was initially
played and then stuck to it – The application takes the initial part of your performance as the
tempo setter and then keeps this tempo.
SECTION D – Overall
12. (Bearing in mind that it is still a work in progress) How would you rate the application as a
performance accompaniment tool? Did the accompaniment it provided sound good and
complement or otherwise enhance your performance? (Rate on a scale of 1-10, Select 10 if you
feel the application is an excellent and potentially very useful performance accompaniment tool,
Select 1 if you feel the application is of no use as a performance accompaniment tool)
1 2 3 4 5 6 7 8 9 10
13. Any further comments/observations/recommendations/advice on the experiment or the
application.
______________________________________________________________
______________________________________________________________
______________________________________________________________
______________________________________________________________
A.4 Timeline
See the next page for a Gantt chart specifying the timeline of activities in the project.