RENI:
Real Time Beat Tracking and
Metrical Analysis
Donal Mulvihill
(s0789005)
Master of Science
Artificial Intelligence
School of Informatics
University of Edinburgh
2008
Abstract
This report describes the development of RENI, a performance accompaniment application which
performs Beat Tracking and Metrical Analysis on MIDI signals in real time to provide simple
percussive accompaniment to musical performances. RENI has been developed as a model of real
time human Beat Tracking and as an effort to investigate if note onset information is sufficient to infer
the beat and metre of a piece of drum-less music in real time without any prior knowledge of its beat
or metre. It implements a rule based algorithm which infers Metrical Hypotheses by searching for and
combining series of regularly spaced note onset events in MIDI signals as they are received. These
Metrical Hypotheses are scored according to plausibility metrics and one is selected to form the basis
of percussive accompaniment, which is performed in time with the piece being played. RENI has
been evaluated by comparing its Beat Tracking performance on a test set of MIDI recordings to the
performance of human beat trackers. It has also been trialled and rated by musicians. Although
deficient in certain areas, RENI can successfully infer beat and metre and provide simple performance
accompaniment for rhythmically simple musical pieces and demonstrates potential in a practical sense
as a performance accompaniment tool.
Acknowledgements
I would like to thank my supervisor Alan Smaill, for his guidance and help throughout this project.
Thanks also to David Murray-Rust in the School of Informatics for his help and advice, and to those
who participated in the evaluation of the application.
Declaration
I declare that this thesis was composed by myself, that the work contained herein is my own except
where explicitly stated otherwise in the text, and that this work has not been submitted for any other
degree or professional qualification except as specified.
(Donal Mulvihill)
Table of Contents

Glossary
1.0 Introduction
  1.1 Beat Tracking and Metrical Analysis
  1.2 Computational Performance Accompaniment
  1.3 Hypothesis
  1.4 Aims and objectives
  1.5 Motivation
    1.5.1 Academic motivation
    1.5.2 Practical motivation – the usefulness of an artificially intelligent Beat Tracking application
  1.6 What was achieved during this project
  1.7 Outline of this document
  1.8 A brief digression – Naming the application
2.0 Background
  2.1 Beat
  2.2 Metre
  2.3 Literature Review – Approaches to Beat Tracking
  2.4 How this project fits into the research context
3.0 Design considerations and decisions
  3.1 Requirements
  3.2 Constraints
    3.2.1 Time available
    3.2.2 Hardware and equipment available
    3.2.3 Real-time operation
  3.3 Form of musical input – MIDI
  3.4 Conceptual design decisions
    3.4.1 Beat Tracking cues
    3.4.2 Assumptions made about musical input
    3.4.3 Beat Tracking algorithm
  3.5 Practical and architectural design decisions
    3.5.1 Development language
    3.5.2 Parametrisation of the application
    3.5.3 Capabilities of the application
    3.5.4 Output
    3.5.5 Complementary applications
4.0 Beat Tracking and Metrical Analysis
  4.1 Beat Tracking – The problem and solution
  4.2 Beat Levels
  4.3 Metrical Hypotheses
  4.4 Steps in RENI's Beat Tracking and Metrical Analysis algorithm
  4.5 Architecture and components of RENI
    4.5.1 Timer
    4.5.2 Instrument
    4.5.3 RENI (Main application component)
    4.5.4 Beat Levels
    4.5.5 Interpreter
    4.5.6 Judges
    4.5.7 Drummer
    4.5.8 Parameters
5.0 RENI's Beat Tracking algorithm
  5.1 Accepting and processing input
    5.1.1 Creating RENI Events
    5.1.2 Detecting Chords
  5.2 Setting the search space – Spawning Beat Levels
  5.3 Extending Beat Levels
    5.3.1 Extending a Beat Level – Adding a new event
    5.3.2 Choosing between events
    5.3.3 Ghost events
  5.4 Hypothesising
    5.4.1 Consolidation
    5.4.2 Hypothesising
    5.4.3 Deciding
  5.5 Ranking and selecting Metrical Hypotheses
  5.6 Producing output
  5.7 Re-hypothesising
  5.8 Parameters
6.0 Evaluation
  6.1 Aim of evaluation
  6.2 Difficulties in evaluating Beat Tracking applications
  6.3 Quantitative functional evaluation
    6.3.1 Test Data
    6.3.2 Data from RENI
    6.3.3 Comparing RENI's output to the annotations
  6.4 Subjective evaluation
  6.5 Results
    6.5.1 Quantitative functional evaluation results
    6.5.2 Subjective evaluation results
  6.6 Observations and analysis of the results
  6.7 Comparison with other Beat Tracking applications
7.0 Discussion
  7.1 Analysis
    7.1.1 Capabilities of RENI
    7.1.2 Aims and objectives
    7.1.3 Hypothesis
  7.2 Further work on RENI
  7.3 Directions for future research
  7.4 Conclusion
BIBLIOGRAPHY
APPENDIX
  A.1 Evaluation corpus
  A.2 Quantitative functional evaluation results
  A.3 Qualitative evaluation questionnaire
  A.4 Timeline
Glossary
This report discusses the development of a musical accompaniment system and as such, uses a
number of musical terms. It is assumed that the reader has some basic knowledge of musical
terminology (e.g. what a note, volume or pitch is). This section describes the meaning of terms
which may be unfamiliar to some readers or whose meaning in this report differs from the meaning of
the term as more commonly encountered.
Bar – in music or musical notation, is equivalent to a Measure (see below). It is a duration in a
musical piece defined by a given number of beats of a given duration.
Beat – a regularly spaced pulse or unit of time in a piece of music. It is usually indicated by tapping
along to a piece of music. See section 2.1.
Event – an Event (or Musical Event) in this report refers to the playing of a musical note. The terms
note and event are used interchangeably in this report.
Measure – is equivalent to a Bar (see above). This term is used more frequently in this report. May
also be used in the context of the Measure Beat Level which indicates the beats which denote the start
of a Bar/Measure.
Metre – determined by the number of Tactus level beats of a given duration which make up a
Measure. It is indicated by a time signature.
RENI – the Beat Tracking application described in this report. See section 1.8 for an explanation of
the name.
Salience – In a musical context, when describing a note, Salience refers to the extent to which a note
is distinguishable or stands out relative to other notes. This may be due to its volume, duration or
pitch.
Strong Beat – also known as a Down Beat, refers to those beats which occur at, or indicate the start
of a musical Bar or Measure.
Syncopation - includes a variety of rhythms which are in some way unexpected in that they deviate
from the strict succession of regularly spaced strong and weak beats in a metre.
Tactus – the rate at which one taps along to a musical performance. The number of Tactus beats in a
Measure determines the time signature of a musical piece.
Tatum – a subdivision of the Tactus Beat Interval. Not of direct concern in this project.
Time Signature – indicates (or may refer to) the metre of a piece of music by expressing the number
of beats that constitute a Measure and the note value/duration which constitutes a beat. Time signatures
are written as a fraction. For example, the time signature 4/4 for a musical piece indicates that 4 beats
of quarter note duration make up a Measure.
Weak Beat – the beats in a musical performance which are not strong beats and which do not indicate
the start of a Measure.
1.0 Introduction
This report describes the development of RENI, a Beat Tracking and Metrical Analysis application
and percussive performance accompaniment tool which attempts to model the ability of a human
musician (or drummer) to provide simple percussive accompaniment to music performed in real time.
1.1 Beat Tracking and Metrical Analysis
Beat Tracking is the task of identifying and synchronising with the basic rhythmic pulse of a piece of
music. It is analogous to a person tapping their feet or clapping their hands in time with music.
Metrical Analysis is the task of inferring the Metre of a piece of music. This task can be viewed as an
extension of Beat Tracking or as Beat Tracking at multiple levels. It involves organising beats into
groups or organising beats hierarchically, and is also concerned with identifying and distinguishing
between strong and weak beats. Inferring the metre of a musical piece is analogous to identifying its
time signature; an attribute of a musical piece which informs us of its temporal structure and
organisation. In terms of simple performance accompaniment, it influences how the piece is tapped
along to.
1.2 Computational Performance Accompaniment
Computational Performance Accompaniment is the participation by computers (or computer
applications) in musical performance alongside human performers.
Musical accompaniment systems attempt to emulate the task that a human musical accompanist
performs: supplying a missing musical part, generated in real time, in response to the sound input
from a live musician (Raphael, 2003). RENI is a performance accompaniment system in that it
attempts to supply the percussive part of a musical performance in response to the playing of live
music.
1.3 Hypothesis
The hypothesis under consideration in this project is:
“Knowledge of note onset information is sufficient for computationally inferring the
beat and metre of a piece of improvised drumless music in real time, using a rule
based approach without any prior knowledge of metre or style, for the purposes of
providing simple percussive performance accompaniment”
Essentially, this project investigated the feasibility of developing an application based on a rule based
algorithm that performs Beat Tracking and Metrical Analysis on a piece of musical performance as it
is being played, by only analysing information on the time that each note in the performance was
played.
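The reduction described above can be sketched in a few lines. This is an illustrative fragment, not RENI's actual code: the tuple-based message format (time, status, pitch, velocity) is an assumption standing in for a real MIDI library type, but the filtering logic follows the standard MIDI convention that a note-on with velocity 0 acts as a note-off.

```python
# Reduce incoming MIDI messages to the only information the hypothesis
# requires: the onset time of each note actually played.
NOTE_ON = 0x90

def onset_times(messages):
    """Keep only the times of note-on events with non-zero velocity."""
    return [t for (t, status, pitch, vel) in messages
            if status & 0xF0 == NOTE_ON and vel > 0]

messages = [
    (0, 0x90, 60, 100),    # note on, C4
    (480, 0x80, 60, 0),    # note off
    (500, 0x90, 64, 90),   # note on, E4
    (1000, 0x90, 67, 0),   # "note on" with velocity 0 acts as a note off
]
print(onset_times(messages))  # [0, 500]
```

Everything else carried by the signal (pitch, velocity, duration) is discarded; the hypothesis is that this onset stream alone suffices for beat and metre inference.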
1.4 Aims and objectives
Motivated by the intention to investigate the hypothesis stated in section 1.3, this project had a
number of aims and objectives.
The primary aim of the project was to build an application to track beats and infer the metre of an
improvised musical performance in real time using a rule based approach, for the purpose of
providing percussive accompaniment. The musical performance would be improvised from the
perspective of the application in that it would have no prior knowledge of the tempo, metre or style of
the performance.
In essence, this project aimed to develop a drum machine that would emulate the ability of a human
percussionist to provide simple accompaniment to an improvised piece of music in real time. This
drum machine would accept (in effect, listen to) musical signals played in real time by a musician and
produce appropriately synchronised percussive accompaniment.
Through investigating the hypothesis and developing the drum machine, this project also sought to
determine additional musical heuristics, approaches and techniques that use note onset information,
which could be used to successfully track the beats and infer the metre of musical signals in real time.
The emphasis of the project on performance accompaniment inspired further objectives.
● Given that the application would be interacting with real life performers whose playing may
be imperfectly timed, this project aimed to investigate how a Beat Tracking algorithm could
cope with these imperfections and still correctly infer beat and metre.
● Assuming that the application views the performance as improvised, it is conceivable that the
timing of the performance could vary or change over time. This project therefore sought to
develop a Beat Tracking algorithm that could cope with variations in timing and respond to
them appropriately.
● This project also aimed to gauge the experience and views of musicians on being
accompanied by a Beat Tracking application, particularly when the application attempted to
react to timing variations in their performance.
1.5 Motivation
This project is motivated by interest in a long-standing Artificial Intelligence problem, solutions (or
attempted solutions) to which may be useful in a practical context.
1.5.1 Academic motivation
The tasks of Beat Tracking and Metrical Analysis are interesting from a psychological perspective.
Perception of rhythm and beat is one of the most basic activities of musical cognition. Large (1995)
describes the performance of these tasks as “musical common sense”.
While tracking beats and inferring metre in a musical piece is easy and natural for a human listener to
perform, even one without any musical training, it is a computationally difficult task to perform and
remains a formidable AI problem. Many of the models of computational Beat Tracking suggested and
described in the literature still fall short of human Beat Tracking ability (McKinney, Moelants, Davies
and Klapuri, 2007).
Large (1995) suggests two sources of difficulty in attempting to model human Beat Tracking ability:
● Systematic Timing variations in musical performance – the tendency for musicians to use
timing variation to communicate musical intentions.
● Rhythmic complexity in musical performance – the presence of syncopation, lack of salient
periodicity, or human performance errors.
The open ended nature of the problem may also be viewed as a source of difficulty. Different sets of
assumptions may be adopted in approaching the problem and the output produced by a Beat Tracking
application for a particular performance cannot be viewed as absolutely correct. Two listeners may
track beats in the same performance differently. Beat Tracking is not solely a product of rhythmical
pattern but rather of pattern and listener together (Eck, 2001).
1.5.2 Practical motivation – the usefulness of an artificially intelligent Beat Tracking application
According to Rosenthal (1992), a computational beat tracker would be useful as it would greatly
enhance the ability of computers to participate intelligently in, and transcribe, live musical
performance.
A fully general, automatic real time beat tracker would be of great value in many applications such as
music-synchronized Computer Generated animation, music transcription, music editing and
synchronization, and musicological studies (Allen and Dannenberg, 1990).
In a performance accompaniment context, a real time beat tracker such as that envisaged in this
project would open up new possibilities for computers in music as it would allow a computer to
synchronize with music external to it without the use of explicit synchronization information. Some
musicians contend that it is much harder to play in an ensemble and follow a collective tempo than it
is to set their own tempo and require other musicians to follow. A real time beat tracker would solve
this problem and allow an outside agent (e.g. a human performer) instead of the computer to control
the performance tempo (Allen and Dannenberg, 1990); thus transforming the computer from a tempo
setter to a tempo follower.
A real time system would also have some commercial potential. An example of a similar commercial
system is Circular Logic (http://www.circular-logic.com/). Circular Logic synchronises with the
tempo of musical input for performance accompaniment but does not perform metrical analysis.
1.6 What was achieved during this project
During this project, RENI, a Beat Tracking and Metrical Analysis application that provides percussive
performance accompaniment to improvised musical performances in real time, was developed. RENI
is based on a rule based algorithm that analyses note onset times in musical signals as they are
performed. RENI can also be described as a model of real time human Beat Tracking and Metrical
Analysis.
RENI's ability to track beats was evaluated by comparing the output it produced when tracking beats
in musical recordings to annotations produced by musicians, indicating their interpretation of beat
locations in the same recordings. RENI was also trialled by a number of musicians who evaluated its
ability to track beats and provide percussive performance accompaniment.
Although the application performs its primary function and produces output that can be perceived and
assessed, it runs in a development environment. Further work is required to make RENI a fully
fledged and user friendly application that could be distributed and used by a wide audience.
The source code of RENI has been submitted in conjunction with this report and is available upon
request.
1.7 Outline of this document
Following this introduction to the report and project, the remainder of this document is structured as
follows:
● Section 2 - explains the concepts of beat and metre, reviews previous research on Beat
Tracking and Metrical Analysis paying particular attention to the variety of approaches put
forward in literature and discusses how this project fits into the research context.
● Section 3 – outlines the scope of the project by describing project requirements, constraints
and important conceptual and practical design decisions.
● Section 4 – discusses Beat Tracking and Metrical Analysis from a computational perspective
in greater detail. It describes RENI's view of the Beat Tracking problem and the principles
underlying its approach to solving it. The steps in RENI's approach to Beat Tracking and
Metrical Analysis are discussed and the architecture of RENI as an application is described.
● Section 5 – RENI's Beat Tracking and Metrical Analysis algorithm is described and illustrated
in detail.
● Section 6 – The Evaluation of RENI is described in detail and the results of the evaluation are
presented and assessed.
● Section 7 – A discussion and critical analysis of what was achieved in this project. Further
work and directions for future research are also outlined.
1.8 A brief digression – Naming the application
The Beat Tracking application described in this report is provisionally named RENI. RENI is not an
acronym. It is the nickname of one of the author's favourite drummers, Alan Wren of the now-defunct
Manchester-based band, the Stone Roses.
Despite the best efforts of this project, Alan Wren (aka Reni) is much better at providing real time
percussive performance accompaniment than RENI.
2.0 Background
This section offers some background to the project. The central concepts of beat and metre in music
are defined. The relevant literature is also reviewed and some of the important approaches to Beat
Tracking and Metrical Analysis suggested in the literature are discussed. How the work in this
project fits into the research context and relates to previous approaches is also considered.
2.1 Beat
The term beat refers to a regularly spaced pulse in a musical performance which can be
perceived and indicated by a human tapping along to it. For most musical performances, beat can be
defined as sounds that are perceived as being equally spaced in time. This defines a tempo for the
music. The beat for a particular musical piece can be described according to two attributes:
● Period – the time duration between successive beats
● Phase – the time when a beat (or beats) occurs relative to the start of a musical performance.
For musicians, the beat is a central issue in time keeping in musical performance. For non-experts too,
the process seems fundamental to the perception of tempo and to the processing, coding and
appreciation of temporal patterns. Furthermore, it determines the relative importance of notes in, for
example, the melodic and harmonic structure (Desain and Honing, 1999).
As alluded to in the introduction, perception of beat by a human listener can be indicated by tapping
along. This indicates that the listener has abstracted information about the music and is able to predict
when the next beat will occur (Klapuri et al, 2006).
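The period and phase attributes above make beat prediction concrete. The following is a minimal illustrative sketch (not RENI's implementation): given a beat's period and phase, it predicts the time of the next beat at or after a given moment, which is essentially what a listener tapping along is doing.

```python
import math

def next_beat_time(now, period, phase):
    """Predict the time of the next beat at or after `now`.

    period: seconds between successive beats
    phase:  time of the first beat relative to the start of the performance
    """
    if now < phase:
        return phase
    # Number of whole beat intervals elapsed since the first beat.
    elapsed = math.ceil((now - phase) / period)
    return phase + elapsed * period

# With a 0.5 s period (120 BPM) and a first beat at t = 0.1 s,
# the beat following t = 1.30 s is predicted at roughly t = 1.60 s.
print(next_beat_time(1.30, 0.5, 0.1))
```

Knowing only these two numbers, a tracker can extrapolate the beat indefinitely; the hard part, discussed in the following sections, is inferring them from a live signal.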
2.2 Metre
Metre involves grouping, hierarchy and a strong/weak distinction in terms of beats or pulses in a piece
of music (Scheirer, 1998).
Musical metre is a hierarchical structure consisting of beats at different levels as determined by their
period. The periods of beats at larger levels in the hierarchy are integer multiples of the periods of
beats at smaller levels. Each beat at a larger level must coincide with a beat at all the smaller levels.
The most prominent of these levels is the Tactus (Quarter Note Level in the diagram, FIG 1.1 below)
which indicates the tapping rate for a musical piece. The Measure level's period is a multiple of the
Tactus level, and is typically related to the length of a rhythmic pattern in a piece of music. The Tactus
beats which coincide with the start of a Measure are known as strong beats.
FIG 1.1 – Metrical Hierarchy – taken from Goto (2001)
The size of the Tactus period relative to the Measure period determines how many Tactus beats make
up a Measure in a musical piece. This is the simplest way to think of metre and is the basis upon
which it is expressed in a time signature. A time signature is expressed as a fraction and indicates the
number of beats which make up a measure and the length of these beats. For example, a 4/4 metre for
a musical piece indicates that 4 quarter note beats make up a Measure. A 2/4 metre indicates that 2
quarter note beats make up a Measure.
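The relationship between the Tactus and Measure levels can be made concrete with a short sketch. This is an illustrative Python fragment under the simplifying assumptions of a fixed tempo and zero phase; the function name and parameters are hypothetical, not part of RENI.

```python
def metrical_grid(tactus_period, beats_per_measure, phase=0.0, n_measures=2):
    """Generate tactus- and measure-level beat times for a simple metre.

    The measure level's period is an integer multiple (beats_per_measure)
    of the tactus period, so every measure beat coincides with a tactus beat.
    """
    n_tactus = beats_per_measure * n_measures
    tactus = [phase + i * tactus_period for i in range(n_tactus)]
    # Strong beats: every beats_per_measure-th tactus beat starts a measure.
    measures = tactus[::beats_per_measure]
    return tactus, measures

# A 4/4 metre at 120 BPM (tactus period 0.5 s): strong beats every 2.0 s.
tactus, measures = metrical_grid(0.5, 4)
print(tactus)    # [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
print(measures)  # [0.0, 2.0]
```

Changing `beats_per_measure` from 4 to 2 would model a 2/4 metre: the tactus grid is unchanged, but strong beats fall twice as often, which is exactly the distinction a metrical analyser must recover.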
2.3 Literature Review – Approaches to Beat Tracking
Many approaches and computational models of Beat Tracking have been proposed in the literature. These
approaches vary in a number of respects:
● Rule based or alternative approaches (neural nets, oscillators etc.) used
● Format of musical input used – Audio or MIDI, a protocol that allows electronic instruments and computers to communicate with each other (see section 3.3)
● Real time or off-line operation
1 MIDI is a protocol that allows electronic instruments and computers to communicate with each other – see section 3.3
● Whether or not Beat Tracking is performed at multiple levels (equivalent to Metrical
Analysis).
● The assumptions underlying the Beat Tracking approach (metre/style known, etc.)
● The musical cues and information used for Beat Tracking.
Many of the early models were based on a set of rules which examined note onset times and inter-
onset intervals to infer a beat structure. In addition to rule-based and symbolic search models,
optimization, neural nets, and coupled oscillator systems have been used extensively (see Desain and
Honing, 1994 for an overview of these models).
These models implicitly address different aspects of the beat-induction process. For instance, some
models explain the formation of a beat concept in the first moments of hearing a rhythmical pattern
(initial beat induction), some model the tracking of the tempo once a beat is given, and others cover
beat induction for cyclic patterns only (Desain and Honing, 1999).
Although many models have been proposed and discussed in the literature, I will give only a brief
outline in this section of the approaches I deem most relevant to the aims of this study.
One of the earliest and most frequently cited approaches to metrical analysis is that of Rosenthal
(1992), who proposed a system called Machine Rhythm to emulate human rhythm perception for
piano performances presented as MIDI files. This system parses MIDI data into events representing
note onset times, then searches these events for series of regularly spaced onset times, each of which
represents a potential rhythmic level. After finding all potential rhythmic levels, the program looks for
sets of levels that may be organised into families, each representing a possible rhythmic parsing. The
program then ranks the hypotheses according to criteria corresponding to the ways in which human
listeners choose rhythmic representations.
Large (1995) also describes a system which analyses note onset times. He proposes a mechanism of
Beat Tracking for complex, metrically structured rhythms which involves the entrainment of a non-linear
oscillator to an incoming signal in the form of impulses corresponding to note events. This signal
serves as a driver, perturbing both the phase and period of the oscillator. The oscillator adjusts its
phase and period only at certain times during its cycle, thereby isolating and tracking a periodicity in
the incoming rhythm. The perception of beat is modelled by the generation of an event at a particular
phase of the oscillator's cycle.
Eck (2002) proposes a model of beat induction that uses a Spiking Neural Network (SNN) to
synchronize with music. Input is presented to the network as voltage spikes obtained from a MIDI
representation of music, either from a MIDI file or in real time from a MIDI musical instrument.
Neurons in the SNN are initialized with a range of frequencies suitable for rhythm. When exposed to a musical
signal, clusters of neurons begin to fire in synchrony with periodic events in the signal. In many cases
these clusters gravitate to metrically important events, including downbeats. These spike onsets can
then be transformed into musical events.
While many approaches process MIDI signals, much of the most significant recent work has
concentrated on Beat Tracking for audio signals. Goto's (2001) proposal is significant as it achieves
reasonable metre analysis accuracy for audio signals in real time. The system can recognize the
hierarchical beat structure comprising the quarter-note level (almost regularly spaced beat times), the
half-note level, and the measure level (bar-lines). However, it assumes that the time signature of an
input song is 4/4 and that the tempo is roughly constant, corresponding to the most common structure
of Western-style music.
Goto (2001) identifies the main issues in recognizing the beat structure in real-world musical acoustic
signals as being:
1. the detection of beat-tracking cues
2. interpreting the cues to infer the beat structure, and
3. dealing with the ambiguity of interpretation.
The system uses note onsets, chord changes and drum patterns as Beat Tracking cues to infer the
beat structure of a piece of music. It is based on a multi-agent architecture in which multiple agents
track competing metre hypotheses.
Another approach which involves metrical analysis of audio signals (although not in real time) is
proposed by Klapuri et al. (2006). Their method, which is not limited to any particular music style,
analyses musical metre jointly at three time scales: the temporally atomic tatum pulse level, the
beat (or Tactus) level, and the musical measure level. The algorithm uses time-frequency analysis
to calculate a driving function at four different frequency ranges. This is followed by a bank of comb
filter resonators for periodicity analysis, and a probabilistic model that represents primitive musical
knowledge and uses the low-level observations to perform joint estimation of the tatum, Tactus, and
measure pulses. Both causal and non-causal versions of the method are described in Klapuri et al.
(2006). The causal version generates beat estimates based on past samples, whereas the non-causal
version performs (Viterbi) backtracking to find the globally optimal beat track after hearing the entire
excerpt, thus improving tracking accuracy.
The final approach of interest in this study is BeatRoot, as described by Dixon (2007). Like
Klapuri et al.'s (2006) model, BeatRoot was evaluated as part of a Beat Tracking contest at MIREX
2006 (McKinney et al., 2007). The driving function of BeatRoot is a pulse train representing event
onsets derived from a spectral flux difference function. Periodicities in the driving function are
extracted through an all-order inter-onset interval analysis and are then used as input to a multiple-agent
system to determine optimal sequences of beat times. This is described in Dixon (2007), and a
full implementation with source code has been made available.
2.4 How this project fits into the research context
In terms of the defining attributes of Beat Tracking models listed in the previous section, this
project may be characterised as:
● adopting a rule-based approach
● based on MIDI input (see section 3.3 for an explanation of MIDI)
● operating in real time
● performing Beat Tracking at multiple levels; Metrical Analysis is therefore performed
● making no assumptions about the nature of the musical input. The model assumes no prior
knowledge of metre or style and views the musical performance that it is accompanying as
improvised
● using only note onset information for Beat Tracking. No other musical knowledge or
representation of musical knowledge is used
20
2.0 Background
This project represents a return to the early rule-based, symbolic search approaches based on MIDI
input. It is notable in that it aims to create a rule-based model of real-time Beat Tracking and Metrical
Analysis, whereas its most similar equivalent and main point of reference (see section 3.4.3), Machine
Rhythm (Rosenthal, 1992), operates off-line.
This project is also characterised by the very general and open-ended specification of the problem (it
is possibly approached here in its most open-ended form). Whereas previous models make assumptions
about the nature of the musical input (such as Goto's (2001) assumption that all musical input is of a 4/4
metre), this project approaches the problem with no prior assumptions about the musical input in
terms of style or metre, treating it as essentially improvised.
Also of significance is the emphasis on performance accompaniment and the consideration of the
attributes that typify performance accompaniment scenarios. Important in this regard is the aim of
gauging the experience of musicians when accompanied by the final Beat Tracking application.
3.0 Design considerations and decisions
This section discusses the factors influencing the scope of the project and their impact on the design
of the application being developed. The requirements for the project are discussed and constraints on
the project are specified. The conceptual and practical design decisions are then discussed and
justified.
3.1 Requirements
Arising from a consideration of the hypothesis under investigation and the aims and objectives of the
project as specified in section 1.4, the requirements for RENI were specified:
● RENI must be capable of accepting musical performance data from a file or an external
musical device/instrument in real time.
● The application should infer note onset information from this real time musical input.
● RENI must implement a rule based algorithm which uses note onset information to track the
beat and infer the metre of the piece of music being performed.
● RENI must produce audible percussive accompaniment to the musical input as it is being
played.
● RENI must also produce textual output for use in its evaluation.
● RENI must be fully operable on a standard personal computer or laptop so as to be usable by a
wide audience.
3.2 Constraints
The scope of the project was bound by a number of constraints, some of a practical nature and others
arising from the requirements specified.
3.2.1 Time available
The time available to complete this project was short: 15 weeks (May 12th – August 22nd). The full
time line of the project is included as a Gantt chart in the Appendix to this report.
3.2.2 Hardware and equipment available
No funds or equipment were made available for this project. Development and evaluation of the
application were carried out on a MacBook laptop. This was a determining factor in the requirement
that the application be operable on a standard machine.
A two-octave MIDI keyboard, the M-Audio Oxygen 8 V2 USB Keyboard, was also purchased and
used in development and evaluation.
3.2.3 Real-time operation
The development hardware available, and the requirement for the application to track beats in real
time on a standard laptop or personal computer, constrained the complexity and sophistication of the
underlying Beat Tracking algorithm. A data- and processing-intensive Beat Tracking algorithm that
was not operable on a standard machine would make RENI of little practical use to a wide
audience.
3.3 Form of musical input – MIDI
The choice of musical signal format that RENI would accept was the most significant design decision
from both a conceptual and a practical perspective. Previous approaches to Beat Tracking and Metrical
Analysis operate on musical signals received in either MIDI or audio format. It was decided at a very
early stage in the project that RENI would accept musical input in MIDI format.
MIDI (Musical Instrument Digital Interface) is a protocol that allows electronic instruments and
computers to communicate with each other. It transmits digital messages to devices instructing them
to play notes or to change parameters such as volume and tempo. It represents the playing of musical
notes symbolically as events (such as Note On events and Note Off events), encoding values such as
the pitch of the note played and its volume (or velocity in MIDI terminology). MIDI data can either
be received as a stream as it is played on a MIDI instrument, or can be stored in a file and read by a
MIDI sequencer. A full specification for the MIDI protocol can be found at http://www.midi.org/
Audio encodes musical data as wave signals, typically stored in a file format such as WAV or
MP3. Approaches to Beat Tracking using audio involve a considerable amount of signal processing to
infer note onset information. Although the initial preference in this project was to develop an audio-based
application, as this would allow RENI to work with a greater variety of instruments, the MIDI
format was opted for. The amount of signal processing required for audio would greatly lengthen the
implementation time and would distract from the primary aim of developing an artificially intelligent
accompanist.
The manner in which the MIDI protocol symbolically represents notes makes it a more appropriate
format for a symbolic, rule-based approach to Beat Tracking. It makes information about the music
explicitly available and easily extractable, thereby lending itself well to being manipulated
computationally.
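As an illustration of how readily note onset information can be extracted from MIDI, the following Java Sound API sketch records onset times from incoming Note On messages. The class and method names are hypothetical, not RENI's actual code.

```java
import javax.sound.midi.MidiMessage;
import javax.sound.midi.Receiver;
import javax.sound.midi.ShortMessage;
import java.util.ArrayList;
import java.util.List;

public class OnsetReceiver implements Receiver {
    private final List<Long> onsetTimesMs = new ArrayList<>();
    private final long startTimeMs = System.currentTimeMillis();

    /**
     * Pure helper so the onset test can be checked without MIDI hardware.
     * A Note On with velocity 0 is conventionally a Note Off, so only
     * velocities greater than 0 count as onsets.
     */
    public static boolean isOnset(int command, int velocity) {
        return command == ShortMessage.NOTE_ON && velocity > 0;
    }

    @Override
    public void send(MidiMessage message, long timeStamp) {
        if (message instanceof ShortMessage) {
            ShortMessage sm = (ShortMessage) message;
            if (isOnset(sm.getCommand(), sm.getData2())) {
                // Record the onset as time elapsed since listening began.
                onsetTimesMs.add(System.currentTimeMillis() - startTimeMs);
            }
        }
    }

    @Override
    public void close() { }

    public List<Long> getOnsetTimesMs() { return onsetTimesMs; }
}
```

A `Receiver` of this kind can be attached to a `Transmitter` obtained from a MIDI input device, at which point onset times accumulate as the performance is played.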
Nonetheless, it should be noted that the choice of MIDI over Audio as the format of musical input is
not entirely without its drawbacks:
● MIDI files for use in the evaluation of the application (see section 6) are not as widely
available as audio files (encoded in either WAV or MP3 formats).
● The external instruments that RENI can accompany are limited to MIDI devices.
● The sound quality of MIDI playback is inferior to that of audio.
● It could be argued that a real time MIDI beat tracker is of less practical use than an audio
based equivalent.
3.4 Conceptual design decisions
A number of design decisions pertained to the operation of RENI's Beat Tracking algorithm and the
manner in which it models human Beat Tracking.
3.4.1 Beat Tracking cues
Goto (2001) identifies the first issue in Beat Tracking as the inference of Beat Tracking cues. In
Goto's (2001) model, these cues were note onset information, chord change information, and drum
sounds.
As the hypothesis under investigation suggests, note onset information was the primary Beat Tracking
cue to be used. The choice of MIDI as the format of musical input greatly simplified the inference of
note onset information. MIDI messages also encode information on the pitch and volume of the notes
being played. Due to the focus on note onsets in the hypothesis under investigation and the time
constraints, it was decided not to place any emphasis on chord change cues, which may be inferred
from pitch data. However, it was decided that information on the volume of notes would be used to a
limited extent, given that it is encoded directly in MIDI messages and is important in assessing the
salience of notes in Metrical Analysis.
Another significant decision in this respect was the decision to ignore drum sounds as a Beat Tracking
cue and to exclude from the evaluation of RENI any potential performance input which included drum
sounds. RENI is a musical accompaniment system and, according to the definition of
accompaniment offered in section 1.2, it should provide a missing musical piece. It was therefore
logical to adopt the assumption that RENI would be expected to provide accompaniment to music
without drum sounds.
3.4.2 Assumptions made about musical input
As was outlined in section 2.3, models of Beat Tracking vary with respect to the assumptions
they make about musical input. A Beat Tracking model may make assumptions about the following
attributes of the musical performance:
● The metre
● The interval of the beat
● The style of the music being played
As the emphasis of the project was on accompaniment for improvised music, it was decided that
RENI's model of Beat Tracking and Metrical Analysis would make no assumptions about the nature
of musical input.
However, during development it was determined that adding a heuristic indicating the metre of
the music being analysed would be very simple, and that comparing the performance of the
application with and without such a heuristic would be interesting. The heuristic was therefore added
to the application, but its use is optional and it is disabled by default.
3.4.3 Beat Tracking algorithm
It was decided that the most effective and expedient way of developing RENI's Beat Tracking
algorithm would be to base it on an established approach described in the literature. The chosen
algorithm would serve as an inspiration and point of reference for RENI's algorithm and would be
extended and amended to fit RENI's requirements.
A number of the approaches and algorithms described in Section 2.3 were considered. The two most
compelling candidates were those described by Goto (2001) and Rosenthal (1992). Goto (2001) was
considered because this approach performs Metrical Analysis in real time. However, Goto's (2001)
approach was ultimately ruled out as it is based on audio and, in approaching the Beat Tracking
problem, assumes a metre of 4/4.
Rosenthal's algorithm for Machine Rhythm (1992) was ultimately chosen as the base algorithm.
Although it operates off-line, it is a rule based approach based on MIDI input. It is also very clearly
described and adopting it as a base algorithm was an opportunity to assess Rosenthal's (1992) claim
that Machine Rhythm could be easily adapted for real time use.
3.5 Practical and architectural design decisions
A number of design decisions pertained to the development of the application itself and its
capabilities.
3.5.1 Development language
Java was chosen as the development language due to personal familiarity. The Java Sound API offers
excellent MIDI functionality, which made developing the application straightforward.
Developing RENI in Java on Mac OS X did, however, cause some difficulties. The Java Sound API
running on a Mac does not recognise external MIDI devices connected via USB, so RENI
was initially unable to accept music from an external device in real time. Fortunately, this was
remedied using the Mandolane package (see http://www.mandolane.co.uk/), which allows Java on a
Mac to recognise and accept input from MIDI devices connected via USB.
3.5.2 Parametrisation of the application
The operation of the Beat Tracking algorithm (as described in section 5) can vary depending on the
values of certain parameters. These parameters determine, amongst other things, how much leeway the
algorithm allows for timing inconsistencies in performances and how long the application listens to a
performance before attempting to accompany it.
In order that RENI be easily usable under a variety of different settings, it was decided to make these
values adjustable. The relevant parameters may therefore be changed in a Parameters module (see
section 4.5.8).
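A hypothetical sketch of such a Parameters module follows; all names and default values here are invented for illustration and are not RENI's actual settings.

```java
public class Parameters {
    // How much timing deviation (in ms) is tolerated when matching a note
    // onset against a Beat Level's expected interval.
    public static final long TIMING_TOLERANCE_MS = 50;

    // How long (in ms) the application collects inter-onset intervals to
    // create potential Beat Levels.
    public static final long SEARCH_SPACE_WINDOW_MS = 4000;

    // How long (in ms) the application listens overall, extending the
    // postulated Beat Levels, before attempting accompaniment.
    public static final long EXTENSION_WINDOW_MS = 8000;
}
```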
3.5.3 Capabilities of the application
As RENI was being developed, a number of decisions were made about its capabilities as an
application.
Due to time constraints, it was decided that RENI would not resemble a fully fledged application at
the end of the project. It was decided that the time available would be better spent on optimising the
Beat Tracking algorithm instead of developing user interfaces.
As currently constituted, RENI performs its intended tasks as listed in the requirements. However, it
runs in a development environment: it does not provide a user interface and can only be configured by
changing the Parameters module in the source code. Further work will be carried out in the future to
convert RENI into a fully fledged application (see section 7.2).
3.5.4 Output
As the emphasis of this project is on percussive performance accompaniment, RENI indicates the
beats it locates aurally. Different beat sounds are used for strong and weak beats. Some previous Beat
Tracking approaches and experiments (Desain and Honing, 1999) used implements such as tapping
shoes to indicate output. Given the time constraints on this project, this approach was avoided.
In addition to aural output, it was also decided that RENI should provide some textual output. This
textual output records the details of a particular trial (RENI accompanying a musical performance),
including the values of various application parameters and the times at which RENI indicates a beat
(taps). This textual output was used in the evaluation of the project (see section 6).
3.5.5 Complementary applications
In addition to the main RENI application, two complementary applications were developed to
assist in the evaluation of RENI:
● BEAT-REC – records the positions at which a human performer taps along to a piece of music
on a keyboard.
● BEAT-COMP – compares the textual output of BEAT-REC and RENI, determining how
many beats were recognised by both the human performer and RENI on the same piece of
music and producing appropriate statistics.
4.0 Beat Tracking and Metrical Analysis
This section describes the problems central to the tasks of Beat Tracking and Metrical Analysis in
more computational terms and gives a general description of the principles and concepts underlying
RENI's approach to solving them (as implemented in its Beat Tracking algorithm). The steps in this
approach are summarised, and the various conceptual/architectural components of RENI which
perform them are listed and described. This precedes a more detailed description of RENI's Beat
Tracking algorithm in section 5.
4.1 Beat Tracking – The problem and solution
Taking the definition of a beat as a regularly spaced pulse in a musical performance, RENI must
identify the interval of this pulse and the locations of the beats it defines in the performance.
Extending this task into the sphere of Metrical Analysis, RENI must determine how the
regularly spaced pulses it identifies relate to each other in order to infer the metre of the piece being
performed. This also involves the identification of strong beats and the intervals between them.
Strong beats (which should be regularly spaced) denote the beginnings of measures.
When performed off-line this task involves identifying evenly spaced beat locations in a recording of
a musical performance in its entirety. When performing Beat Tracking in real time in order to provide
accompaniment, the locations of beats in the musical performance must be predicted before they
occur. The Beat Tracker must therefore identify the location of beats in an elapsed segment of the
performance (while it is ongoing) and on the basis of these locations and the intervals between them,
infer the location of future beats.
Goto (2001) identifies two of the main issues in recognizing the beats in music as being:
1. detecting Beat Tracking cues
2. interpreting these cues to infer the beat structure
Consistent with the hypothesis under investigation (see section 1.3), RENI treats note onset times as
Beat Tracking cues and interprets note onset information as it is received in order to infer the beat and
metre of a musical performance.
Like previous models of Beat Tracking (Rosenthal, 1992; Desain and Honing, 1999), RENI's
model is based on a set of rules which examine the inter-onset intervals between the events in a musical
performance. Events in this context refer to notes (either on an instrument or from a file) being
played. Musical events occur at a particular onset, or point in time, relative to each other. The principle
underlying approaches based on the analysis of inter-onset intervals is that beats are indicated by, and
their occurrence coincides with, regularly spaced musical events.
In accordance with this principle, RENI searches for regularly spaced note onsets within a
segment of a musical performance as the notes in the segment are being played. Each series of
regularly spaced note onsets is represented by RENI as a Beat Level. Once potential Beat Levels are
found, RENI organises them into Metrical Hypotheses. RENI assumes that the metrical structure
of the performance in the segment that has been searched is indicative of the metrical structure of the
performance in the future. Once Beat Levels have been found and a Metrical Hypothesis has been
inferred, RENI projects the Metrical Hypothesis into the future in order to predict the future locations
of strong and weak beats. It then produces accompaniment to coincide with the beat locations it has
predicted.
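The projection step described above can be sketched in Java as follows; the class and method names are hypothetical and not taken from RENI's implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class BeatProjector {
    /**
     * Given the onset time of the last observed beat and the Beat Level's
     * average interval, predicts the next `count` beat times by projecting
     * the interval forward from the last beat.
     */
    public static List<Long> projectBeats(long lastBeatMs, long intervalMs, int count) {
        List<Long> predicted = new ArrayList<>();
        for (int k = 1; k <= count; k++) {
            predicted.add(lastBeatMs + k * intervalMs);
        }
        return predicted;
    }
}
```

For example, a last beat at 1000 ms with a 500 ms interval projects to beats at 1500, 2000, 2500 ms, and so on; the accompaniment is then scheduled to coincide with these predicted times.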
Before discussing the steps in RENI's Beat Tracking algorithm, it is necessary to describe the concept
of Beat Levels and Metrical Hypotheses in more detail.
4.2 Beat Levels
RENI represents a series of regularly spaced note onsets as a Beat Level. There may be several
identifiable Beat Levels in a musical performance. A Beat Level consists of musical events whose
onsets occur at a regular interval within a musical performance; it represents a way in which someone
could tap along to the performance.
A Beat Level is therefore defined by the regular interval (measured in milliseconds) between the
onsets of the events that the Beat Level contains. It is also defined by the onsets of the events
themselves, as these indicate the locations in time at which one would tap.
In Fig 4.1 below, the Beat Level is represented as arches between musical events (notes in a
performance, represented as vertical lines).
Fig 4.1 – A Beat Level consists of a number of regularly spaced musical events (notes)
Theoretically and ideally, all the notes in a Beat Level are regularly spaced. It is impractical, however,
to expect all the inter-onset intervals between consecutive notes in a Beat Level to be exactly
equal, as inconsistencies may occur due to performer error. Therefore, for practical purposes, the
intervals between consecutive notes in a Beat Level in RENI need only be approximately equal. The
Beat Level is then defined by the locations of the events in it and the average interval between
consecutive notes.
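A minimal Java sketch of this notion of a Beat Level follows, assuming an invented tolerance-based extension rule; this is an illustration, not RENI's actual code.

```java
import java.util.ArrayList;
import java.util.List;

public class BeatLevel {
    private final List<Long> onsetsMs = new ArrayList<>();
    private final long toleranceMs;

    public BeatLevel(long firstOnsetMs, long secondOnsetMs, long toleranceMs) {
        onsetsMs.add(firstOnsetMs);
        onsetsMs.add(secondOnsetMs);
        this.toleranceMs = toleranceMs;
    }

    /** Average interval between consecutive onsets; this defines the level. */
    public long averageIntervalMs() {
        long span = onsetsMs.get(onsetsMs.size() - 1) - onsetsMs.get(0);
        return span / (onsetsMs.size() - 1);
    }

    /**
     * Extends the level if the new onset falls roughly one interval after
     * the last onset (within the tolerance); returns true if it was added.
     */
    public boolean tryExtend(long onsetMs) {
        long expected = onsetsMs.get(onsetsMs.size() - 1) + averageIntervalMs();
        if (Math.abs(onsetMs - expected) <= toleranceMs) {
            onsetsMs.add(onsetMs);
            return true;
        }
        return false;
    }

    public int size() { return onsetsMs.size(); }
}
```

Each accepted onset updates the average interval, so small performer errors are absorbed rather than accumulated.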
4.3 Metrical Hypotheses
A Metrical Hypothesis indicates a hypothesised metre and the constituents of the musical performance
which imply it. A Metrical Hypothesis may be thought of as being composed of two or more Beat
Levels and is represented in RENI as a collection of Beat Levels. For two Beat Levels to be
legitimately combined to form a Metrical Hypothesis, the interval of the Beat Level with the larger
interval must be an integer multiple of the interval of the smaller Beat Level. The diagram below,
FIG 4.2, illustrates how two Beat Levels, BL1 and BL2, combine to form a Metrical Hypothesis, MH1.
Fig 4.2 – The interval of BL1 is an integer multiple of the interval of BL2. The two Beat Levels can
therefore be combined to form the Metrical Hypothesis MH1.
One of these levels is the Tactus, which denotes the locations at which one taps along. The other level
is the Measure, which indicates the locations of strong beats. These strong beats coincide with events
which are relatively more salient.
The interval of the Measure is an integer multiple of the interval of the Tactus. The relative sizes of the
intervals of the Tactus and Measure Beat Levels in the Metrical Hypothesis determine the number of
Tactus beats in a measure, which is the basis of the performance's metre and time signature. Different
multiple relationships between the Tactus and Measure levels imply different time signatures. This is
illustrated in the series of diagrams (FIG 4.3, 4.4 and 4.5) below.
Fig 4.3 – Metrical Hypothesis and constituent Beat Levels, indicating a 4/4 metre
Fig 4.4 – Metrical Hypothesis and constituent Beat Levels, indicating a 2/4 metre
Fig 4.5 – Metrical Hypothesis and constituent Beat Levels, indicating a 3/4 metre
As there may be several Beat Levels identifiable in a musical performance, there may be several
combinations of Beat Levels, leading to the inference of several Metrical Hypotheses.
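The combination rule can be sketched as follows; the tolerance-based integer-multiple test is an assumption for illustration (in practice average intervals will rarely be exact multiples) and is not taken from RENI's implementation.

```java
public class HypothesisUtil {
    /**
     * Returns the multiple relating a candidate Measure interval to a
     * candidate Tactus interval (e.g. 4 for a 2000 ms Measure over a
     * 500 ms Tactus), or -1 if no integer multiple of at least 2 holds
     * within the given tolerance.
     */
    public static int findMultiple(long measureIntervalMs, long tactusIntervalMs,
                                   long toleranceMs) {
        if (tactusIntervalMs <= 0 || measureIntervalMs <= tactusIntervalMs) {
            return -1;
        }
        int multiple = Math.round((float) measureIntervalMs / tactusIntervalMs);
        long deviation = Math.abs(measureIntervalMs - multiple * tactusIntervalMs);
        return (multiple >= 2 && deviation <= toleranceMs) ? multiple : -1;
    }
}
```

Two Beat Levels for which `findMultiple` returns 2, 3 or 4 would suggest hypotheses resembling 2/4, 3/4 and 4/4 metres respectively.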
4.4 Steps in RENI's Beat Tracking and Metrical Analysis algorithm
RENI searches for plausible Beat Levels and combines these to form Metrical Hypotheses. The steps
in this process, as implemented by RENI's Beat Tracking and Metrical Analysis algorithm, are:
1. Accept and process musical performance data in real time – MIDI messages are received as
they are played from a MIDI file or a MIDI instrument and converted into an appropriate
representation which exposes note onset times and other important Beat Tracking cues.
2. Set the search space (Search for potential Beat Levels) - a number of potential Beat Levels are
created corresponding to every possible inter onset interval that occurs within a particular
time window (the Search Space Window) at the start of the performance.
3. Extend Beat Levels (Search for plausible Beat Levels) – notes received within an Extension
Window (which includes the Search Space Window) are assessed to determine whether they are
compatible with the postulated Beat Levels. If the interval between the last note in a Beat
Level and a new note is approximately the same as the interval that defines that Beat Level,
then the new note can be added to the Beat Level, thereby extending it into the performance
and increasing its plausibility.
4. Hypothesise – plausible Beat Levels are combined to form Metrical Hypotheses. Beat Levels
whose intervals are an integer multiple of each other and which can be organised into a Tactus-
Measure relationship are combined in a Metrical Hypothesis.
5. Rank Hypotheses – a number of Metrical Hypotheses will be inferred during the Hypothesise
step. RENI scores them according to certain criteria and selects the Metrical Hypothesis with
the highest ranking to form the basis of the percussive accompaniment produced.
6. Produce Output – percussive accompaniment is produced that corresponds to the selected
Metrical Hypothesis and that is appropriately synchronised with the performance.
Depending on its settings, RENI may also perform the following step:
7. Re-Hypothesise – RENI can complete the above steps (1-6) at subsequent points in the
performance and change the accompaniment to account for changes in tempo or metre.
These steps are described in greater detail in Section 5.
4.5 Architecture and components of RENI
Conceptually, RENI is composed of a number of components. Each of these components contributes to
the implementation of RENI's Beat Tracking algorithm and the performance of the tasks summarised
in the previous section.
4.5.1 Timer
The Timer maintains timing information for RENI and is used to record note onset times. The Timer
begins when RENI starts listening to a musical performance. Note onset times are recorded as the
time elapsed between the point at which RENI's Timer starts and the point at which a note is
received.
The Timer also plays an important role in RENI's production of percussive accompaniment. RENI's
Drummer uses timing information to ensure that its accompaniment begins at the right time so as to
be appropriately synchronised with the musical performance.
4.5.2 Instrument
The Instrument component serves as the interface between RENI and the source of musical
performance data (in MIDI format). It also serves as the point where everything MIDI-related
(synthesisers, sequencers, settings for channels) is dealt with, thereby isolating the rest of the
application from MIDI processing and settings.
The Instrument component may read and play MIDI files into RENI or else set up a connection with
an external MIDI instrument and pass input from this instrument to RENI.
4.5.3 RENI (Main application component)
The Main component of RENI co-ordinates the performance of all Beat Tracking and Metrical
Analysis tasks.
The Main component receives MIDI messages from the Instrument component and converts them
into an appropriate representation of note events. It expands the Beat Level search space and then
passes events to these Beat Levels in real time. Once all plausible Beat Levels have been found, the
Main component consults the Interpreter. Once notified by the Interpreter of its selected Hypothesis,
RENI starts the Drummer.
4.5.4 Beat Levels
Beat Levels have already been defined conceptually in section 4.2. As a component in RENI, a Beat
Level is represented as a collection of note events and is defined by the regular interval (or the
average value of this interval) between these events.
Beat Levels are significant as a processing component in RENI as they are responsible for
determining if a note extends them. As note events are created by the Main component within the
Extension Window, they are passed for inspection to each Beat Level. The Beat Level determines
whether the event occurs within a time window following the last event in the Level, at an offset
determined by the defining interval of the Beat Level. If it does, the event may be added to the Beat
Level, thereby extending it.
4.5.5 Interpreter
The Interpreter is responsible for creating Metrical Hypotheses from potential Beat Levels. Once the
Beat Level Extension Window has expired, the Interpreter receives Beat Levels from RENI. It
identifies and discards duplicate Beat Levels and then combines compatible Beat Levels to form
Metrical Hypotheses. The Interpreter gets the Judges to score the Metrical Hypotheses and then
passes the highest ranking Metrical Hypothesis back to the Main RENI module.
4.5.6 Judges
The Judges assess and assign plausibility scores to the inferred Metrical Hypotheses. These scores are
used by the Interpreter to rank Metrical Hypotheses by plausibility. The highest ranking Metrical
Hypothesis is selected to form the basis of the percussive accompaniment produced. RENI currently
uses three Judges. The Timing Judge assigns scores based on the regularity of timing and equality of
intervals in the Beat Levels of a Metrical Hypothesis. The Salience Judge assigns scores based on the
salience of events at the Measure level. The Statistical Judge assigns scores to a Hypothesis based on
the frequency with which Beat Levels with the same interval as the Hypothesis's Tactus Beat Level, or
multiples of that interval, occur.
4.5.7 Drummer
The Drummer takes the selected Metrical Hypothesis from RENI and produces performance
accompaniment consistent with it. Once it determines the soonest appropriate time to start playing so
that its performance will be synchronised with the musical piece being performed, it produces simple
percussive accompaniment that denotes strong and weak beats.
4.5.8 Parameters
The Parameters component stores a collection of values which parametrise the various components
and operations of RENI. These parameters specify the length of the Search Space and Extension
windows, influence how strictly RENI treats imperfect performance timing, and assign weightings to
the scores calculated by Judges when calculating an overall score for a Metrical Hypothesis. They
also determine whether or not RENI uses heuristic information which indicates the metre of a
performance and if RENI continually listens to a performance to infer updated Metrical Hypotheses.
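The parameters described above might be collected in a simple configuration object. The sketch below is purely illustrative: every field name and default value is an assumption, not taken from RENI's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical sketch of the Parameters component; all identifiers and
# default values here are assumptions for illustration only.
@dataclass
class Parameters:
    search_space_window_ms: int = 5000    # length of the Search Space Window
    extension_window_ms: int = 15000      # length of the Extension Window
    int_window_fraction: float = 0.1      # Ideal Next Time Window width as a
                                          # fraction of a Beat Level's interval
    max_hypothesis_levels: int = 3        # depth limit on Metrical Hypotheses
    judge_weights: Dict[str, float] = field(default_factory=lambda: {
        "timing": 0.4, "salience": 0.3, "statistical": 0.3})
    metre_hint: Optional[Tuple[int, ...]] = None  # e.g. (1, 4) to bias to 4/4
    continuous_listening: bool = False    # keep inferring updated Hypotheses
```

Grouping the values in one object mirrors the role the Parameters component plays: every other component reads its tunables from a single place.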
5.0 RENI's Beat Tracking algorithm
This section describes the operation of RENI's Beat Tracking and Metrical Analysis algorithm. The
steps in this process were outlined in section 4.4. In this section, each of these steps is described and
illustrated in greater detail.
5.1 Accepting and processing input
RENI receives performance information in real time as MIDI messages. These correspond to musical
events such as a note being played and are received as notes are played. They include Note On events
which indicate the onset of a note (a key being pressed) and Note Off indicating the offset of a note
(the same key being released). The MIDI messages received by RENI encode information on the type
of event, the pitch of the note (the key pressed) and volume of the note.
These MIDI messages must be parsed and converted into a representation of a musical event that will
allow RENI to search for regularly spaced onset intervals. The Beat Tracking cues used by RENI
must therefore be inferred as these messages are received. RENI must also distinguish between
different types of events.
5.1.1 Creating RENI Events
Once RENI receives a MIDI message indicating a Note On event, it converts and stores it as a RENI
Event. This representation incorporates the pitch and volume information of the corresponding MIDI
message and also a timestamp which indicates the point in time at which the MIDI message was
received.
Once the Message is received, RENI references the Timer to determine the time at which the Message
was received and stores this in the RENI Event. This time stamp indicates the time that has elapsed in
milliseconds between the time RENI started listening for input and the time the event occurred. The
absolute value of this time stamp is unimportant. RENI is interested in the timing of events relative to
each other.
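The conversion of a Note On message into a timestamped RENI Event can be sketched as follows. The class and function names are assumptions, and `time.monotonic` stands in for whichever clock the Timer actually uses; only timing relative to other events matters.

```python
import time
from dataclasses import dataclass

# Illustrative sketch of RENI Event creation (identifiers are assumptions).
# The timestamp is milliseconds elapsed since the Timer started.
@dataclass
class ReniEvent:
    onset_ms: int      # elapsed time when the Note On arrived
    pitch: int         # MIDI key number
    velocity: int      # MIDI velocity (volume)
    is_chord: bool = False

class Timer:
    def start(self):
        self._t0 = time.monotonic()

    def elapsed_ms(self) -> int:
        return int((time.monotonic() - self._t0) * 1000)

def on_note_on(timer, pitch, velocity):
    """Convert an incoming MIDI Note On into a timestamped RENI Event."""
    return ReniEvent(onset_ms=timer.elapsed_ms(), pitch=pitch, velocity=velocity)
```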
As messages are received and parsed, RENI must detect and distinguish between two types of RENI
Events: Monophonic Notes (regular events) and Chords.
5.1.2 Detecting Chords
Chords may be defined as consisting of two or more Note On events which are played simultaneously
and interpreted by a listener as being one musical event. Chords are more salient than normal events
consisting of one note. These events are important from a Metrical Analysis perspective as more
salient events are likely to denote the location of a strong beat and the start of a measure. Detecting
chords is essential in the detection of strong beats and the inference of metre.
RENI treats note events whose onsets occur within a particular window of each other, the Chord
Detection Window, as constituting a chord. Once a MIDI message is received, the difference between
the time that message was received and the onset time of the last RENI Event is calculated. If this
difference is less than the duration of the Chord Detection Window, then the previous RENI Event is
flagged as a chord.
Fig 5.1 – Where two events occur within a certain offset of each other (the chord detection
window) they are perceived as a chord and treated as one event.
In FIG 5.1 above, Chords are indicated by the notes which occur within a Chord Detection Window
and are shaded green. A Chord Detection Window is 100 milliseconds long and starts at the onset of
the earliest note in the chord. Combinations of note events which occur within 100 milliseconds of
each other are assumed to be perceived as simultaneous. They are detected as a chord and stored as
one event with the onset indicated by the onset of the earliest note in the chord.
It should be noted that chords are a complex product of simultaneous playing of notes and the tonal
relationship between them. However, due to time constraints and the emphasis on note onset
information in the hypothesis under investigation as stated in section 1.3, it was decided to ignore
tonal relationships and to only view chords as the product of simultaneous playing.
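The chord detection rule above can be sketched as follows, assuming a 100 millisecond window and a simple sorted-onsets representation (both illustrative):

```python
# Sketch of the Chord Detection rule: an onset falling within the Chord
# Detection Window of the previous event is merged into it, flagging the
# earlier event as a chord and keeping its (earliest) onset.
CHORD_WINDOW_MS = 100  # window length stated above

def group_chords(onsets):
    """onsets: sorted note-onset times in ms.
    Returns a list of (onset_ms, is_chord) events."""
    events = []
    for t in onsets:
        if events and t - events[-1][0] < CHORD_WINDOW_MS:
            # simultaneous for perceptual purposes: fold into previous event
            events[-1] = (events[-1][0], True)
        else:
            events.append((t, False))
    return events
```

Because the merged event keeps the earliest onset, a three-note chord spread over 80 ms still collapses into one event at the first note's time, as described above.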
5.2 Setting the search space – Spawning Beat Levels
As MIDI messages are received and converted into RENI Events which encode note onset times,
RENI searches for plausible Beat Levels. In order to do this it must set out a search space of potential
Beat Levels. As was discussed in section 4.2, Beat Levels consist of a collection of regularly spaced
events and are defined by the regular interval between the onsets of consecutive events within this
collection.
The search space is created by creating (or “spawning”) Beat Levels for the inter onset interval
between every combination of events received within a particular window of time at the start of the
performance. This window is called the Search Space Window. The length of this window is
adjustable and set in the Parameters module.
Each note received within the Search Space Window after the first causes new potential Beat Levels
to be spawned. This process is illustrated in the diagrams (FIG 5.2, 5.3 and 5.4) below.
● The first note, N1, in the performance is received at TS1 (FIG 5.2). As there is only one note
so far in the performance, no Beat Levels are formed.
Fig 5.2 – First Note received in search space window
● Another note, N2, is received at TS2 (FIG 5.3). A Beat Level, BL1, is created consisting of the
onsets of N1 and N2 and defined by the interval between them.
Fig 5.3 – Second Note received in search space window. Beat Level BL1 spawned.
● Another note, N3, is received at TS3 (FIG 5.4). Further potential Beat Levels are created: BL2,
defined by the interval TS3 – TS2, and another Beat Level, BL3, defined by the
interval TS3 – TS1.
Fig 5.4 – Third Note received in search space window. Beat Levels BL2 and BL3 are spawned
As additional Notes are received within the Search Space Window, more potential Beat Levels are
created, one for each combination of the new note and all the previous notes. Once the Search Space
Window has elapsed, potential Beat Levels have been created for every combination of notes within
the Window, each defined by the interval between the notes.
The Beat Levels created during this Search Space Window constitute a search space of potential Beat
Levels within which RENI searches for plausible Beat Levels. RENI does this by establishing if the
potential Beat Levels created in the spawning process can be extended into the performance. This
occurs as part of the Extension process.
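The spawning process can be sketched as follows; the dictionary representation of a Beat Level is an assumption for illustration:

```python
# Sketch of the spawning step: one potential Beat Level per pair of events
# received within the Search Space Window, defined by their inter onset
# interval. The {"interval", "events"} representation is an assumption.
def spawn_beat_levels(onsets, search_window_ms):
    in_window = [t for t in onsets if t <= search_window_ms]
    levels = []
    for j in range(1, len(in_window)):      # each newly received note...
        for i in range(j):                  # ...pairs with every earlier one
            levels.append({"interval": in_window[j] - in_window[i],
                           "events": [in_window[i], in_window[j]]})
    return levels
```

For n notes in the window this yields n(n-1)/2 potential Beat Levels, which is why the subsequent Extension and Consolidation steps are needed to prune the search space.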
5.3 Extending Beat Levels
Once a potential Beat Level has been identified and spawned within the Search Space Window, RENI
must determine if it is a plausible Beat Level. Given that beats occur at a regular interval and that
RENI assumes that beats coincide with event onsets, a Beat Level is plausible if it can be extended to
include further events which maintain the same time interval between consecutive events. RENI
therefore determines if a Beat Level is plausible by attempting to extend it into the musical
performance.
A Beat Level in RENI is extended by adding a RENI Event to it. A RENI Event can legitimately be
added to a Beat Level if the interval between the candidate RENI Event and the last event in the Beat
Level is approximately the same as the interval which defines the Beat Level.
RENI attempts to extend the potential Beat Levels that have been created within a window of time
called the Extension Window. The Extension Window includes the Search Space Window. Whilst in
the Search Space Window, the spawning of new potential Beat Levels and the extension of existing
potential Beat Levels occurs simultaneously.
5.3.1 Extending a Beat Level – Adding a new event
Once a Beat Level is spawned from the interval between two events, all subsequent events which
occur before the close of the Extension Window are analysed to determine if they can legitimately
extend the Beat Level. (This occurs for all Beat Levels).
Two values are important in the extension of a Beat Level:
● The interval of the Beat Level – The time between the two events that define the Beat Level
on creation.
● The Ideal Next Time - The Ideal Next Time is the sum of the onset time of the most recent
note in the Beat Level and the Interval of the Beat Level. Theoretically, it is the time at which
an event must occur if it is to extend and be added to the Beat Level.
For an event to extend a Beat Level, it should ideally occur at the Ideal Next Time. However, an event
which should occur at the Ideal Next Time may not occur precisely at that time due to slight deviations
in the timing of the performance (due perhaps to performer error). If RENI required that an event
occur at precisely the Ideal Next Time in order for a Beat Level to be extended, then it might not find
any plausible Beat Levels. Therefore, in order to accommodate timing deviations, an Event may
extend a Beat Level if it occurs within a window of time surrounding the Ideal Next Time of the Beat
Level. This window is called the Ideal Next Time Window. The width of this window is set in the
Parameters module of RENI and expressed as a percentage of the Beat Level's interval. The width of
this window determines how well timed RENI expects the performance to be.
The diagrams below (FIG 5.5 and 5.6) illustrate the Beat Level extension process. In them, RENI has
a potential Beat Level which since its creation has been extended to include two further events which
occur at approximately regular intervals of I, giving it a total of four events. The Ideal Next Time, TI,
is calculated by adding I to the onset of EV4. Also illustrated is the Ideal Next Time Window around
TI, in which any Event extending the Beat Level must occur. Every note received by RENI is passed
to the Beat Level to determine if it can legitimately extend the Level.
● N1 is received (FIG 5.5). It does not occur within the Ideal Next Time Window and is not
added to the Beat Level.
Fig 5.5 – Note received but not added to Beat Level
● N2 is received (FIG 5.6). It occurs within the Ideal Next Time Window and is added to the
Beat Level.
Fig 5.6 – Note received and added to Beat Level
Once a note is added to the Beat Level, the following occurs:
● The average interval of the Beat Level is re-calculated. This average interval value is
important as it is used by the Interpreter in subsequent tasks.
● A new Ideal Next Time is calculated.
● A new Ideal Next Time Window is set into the future.
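A single extension attempt, including the Ideal Next Time Window check and the re-averaging of the interval, can be sketched as follows; the dictionary representation and the 10% tolerance are assumptions:

```python
# Sketch of one extension attempt: a note extends the Level only if it falls
# within the Ideal Next Time Window (a tolerance band around the Ideal Next
# Time, expressed as a fraction of the Level's interval), after which the
# defining interval is re-averaged over all consecutive gaps.
def try_extend(level, onset_ms, tolerance=0.1):
    ideal_next = level["events"][-1] + level["interval"]
    half_window = level["interval"] * tolerance
    if abs(onset_ms - ideal_next) <= half_window:
        level["events"].append(onset_ms)
        # re-calculate the average interval over all consecutive gaps
        gaps = [b - a for a, b in zip(level["events"], level["events"][1:])]
        level["interval"] = sum(gaps) / len(gaps)
        return True
    return False
```

Re-averaging lets the Level track gradual tempo drift: each accepted note nudges both the defining interval and the next Ideal Next Time.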
5.3.2 Choosing between events
In the Extension process outlined in the previous section, the first event which occurs in the Ideal
Next Time Window is added to the Beat Level. This causes a new Ideal Next Time and Ideal Next
Time Window to be defined. It is very likely, however, that more than one note will occur in the same
Ideal Next Time Window and that an event subsequent to the one added to the Beat Level may be
closer to the Ideal Next Time than the event just added.
This can be seen in FIG 5.7 below. N1 is the first event to occur in the Ideal Next Time Window and
is added to the Beat Level. However, N2, which occurs after N1, is closer to the Ideal Next Time.
Fig 5.7 – Two notes occur within the Ideal Next Time Window
Addressing this problem would be easier if RENI ran off-line. The obvious approach would be to
select the best event, the one closest to the Ideal Next Time from all the candidate events. However in
real time operation RENI doesn't know all the candidate events until the Ideal Next Time Window has
elapsed. Another approach to this problem would be to spawn an additional Beat Level, having
one Beat Level with N1 and another with N2. This, however, would lead to an explosion in the number
of Beat Levels and would greatly compromise RENI's ability to run efficiently in real time. Therefore,
in order to address this problem, RENI replaces the event added to the Beat Level within a particular
Ideal Next Time Window with a better event, if one is subsequently found. This means that RENI
must track two Ideal Next Time Windows at the same time: the Active Ideal Next Time Window and
the Previous Ideal Next Time Window.
The replacement process is illustrated in the series of diagrams (FIG 5.8, 5.9 and 5.10) below.
● N1 occurs within the Active Ideal Next Time Window and is therefore added to the Beat Level
(FIG 5.8). The Beat Level calculates a new Interval, Ideal Next Time and Active Ideal Next
Time Window. However it also maintains the Previous Ideal Next Time Window it has just
filled and the corresponding Previous Ideal Next Time.
Fig 5.8 – Beat Level Extended and new Ideal Next Time and Ideal Next Time window calculated.
● N2 occurs within the Previous Ideal Next Time Window (FIG 5.9). RENI inspects it and sees
that it is closer to the Previous Ideal Next Time than N1.
Fig 5.9– A better note for the previous Ideal Next Time Window occurs
● RENI decides to replace N1 with N2 (FIG 5.10). A new Active Ideal Next Time and Active
Ideal Next Time Window are created. RENI maintains the Previous Ideal Next Time Window
whose event it has just replaced in case a better event occurs in the future.
Fig 5.10– Previous note replaced. New Active Ideal Next Time is set.
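The replacement rule can be sketched as follows; how the Previous Ideal Next Time is stored (a `prev_ideal` field here) and the tolerance value are assumptions about how this state might be kept:

```python
# Sketch of the replacement rule: a note in the Previous Ideal Next Time
# Window evicts the previously added event if it lies closer to that
# window's Ideal Next Time.
def maybe_replace(level, onset_ms, tolerance=0.1):
    prev_ideal = level.get("prev_ideal")
    if prev_ideal is None:
        return False
    half_window = level["interval"] * tolerance
    if abs(onset_ms - prev_ideal) > half_window:
        return False                     # outside the Previous window
    last = level["events"][-1]           # event that filled the Previous window
    if abs(onset_ms - prev_ideal) < abs(last - prev_ideal):
        level["events"][-1] = onset_ms   # better event: replace
        return True
    return False
```

Keeping only this one extra window bounds the bookkeeping per Beat Level, which is the point of the replacement strategy over spawning additional Levels.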
5.3.3 Ghost events
An Ideal Next Time Window may be bypassed and unfilled. This happens when no events occur
within it. That would suggest that the Beat Level in question does not extend into the performance
and is therefore implausible. To discard the Beat Level after one failed extension however would be
pre-mature. The absence of an event at the point in time necessary to extend the Beat Level may be
due to the characteristics of the performance (Beats don't always necessarily coincide with an event).
It does not immediately mean that the Beat Level cannot be perceived (by a human listener or
otherwise), especially if appropriate events were to occur in logical future Ideal Next Time Windows.
Therefore, when RENI detects that the Active Ideal Next Time Window of a Beat Level has been
bypassed, it does not immediately discard it. Instead it artificially extends the Beat Level by adding a
Ghost Event at the Ideal Next Time within the window that has been bypassed. A Ghost Event is an
event which does not occur in the performance and is therefore not perceived. However it is placed at
a point where one would expect it to occur if tapping along with the Beat Level it extends. Although
the event itself is not perceived, the beat that it indicates may be.
The process for adding a Ghost Event is illustrated below (FIG 5.11 and 5.12):
● RENI detects that the Active Ideal Next Time Window has been bypassed (FIG 5.11).
Fig 5.11– Ideal Next Time Window is passed without a note being added to it.
● It adds a Ghost Event, G, to the bypassed window and infers a new Ideal Next Time Window
(FIG 5.12).
Fig 5.12– Ghost Event added to bypassed Ideal Next Time Window. New Active Ideal Next Time and
Window calculated.
However, RENI's tolerance for bypassed Ideal Next Time Windows is limited. If it were to extend a
Beat Level with a series of Ghost Events, RENI would be wasting time and resources on an
implausible Beat Level. Therefore, if more than two Ghost Events in a row are required to extend the
Beat Level, the Beat Level is deemed implausible and discarded.
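Ghost Event handling, including the limit of two consecutive ghosts, might look like the following sketch (identifiers are assumptions):

```python
# Sketch of Ghost Event handling: a bypassed Ideal Next Time Window is
# filled with a ghost placed exactly at the Ideal Next Time, and more than
# two consecutive ghosts marks the Level implausible.
MAX_CONSECUTIVE_GHOSTS = 2

def handle_bypassed_window(level):
    """Returns False when the Beat Level should be discarded."""
    level["ghost_run"] = level.get("ghost_run", 0) + 1
    if level["ghost_run"] > MAX_CONSECUTIVE_GHOSTS:
        return False
    # place the ghost exactly where a listener tapping along would expect it
    level["events"].append(level["events"][-1] + level["interval"])
    return True
```

A real extension would reset the `ghost_run` counter to zero, so only uninterrupted runs of ghosts count against the Level.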
5.3.4 Completion of Extension process
Once the Extension Window has elapsed, extension of Beat Levels ceases. The output of the
Extension process is a collection of plausible Beat Levels each consisting of regularly spaced events
which occur within the Extension Window. These Beat Levels are passed to the Interpreter which
forms Metrical Hypotheses.
5.4 Hypothesising
Once the Extension Window has elapsed and the Extension process has been completed, the
Interpreter takes the collection of Beat Levels and forms Metrical Hypotheses from them. The
Interpreter performs three main tasks:
● Consolidation of Beat Levels
● Forming Metrical Hypotheses from Beat Levels
● Deciding on the Metrical Hypothesis which will form the basis for percussive performance
accompaniment.
5.4.1 Consolidation
Some of the Beat Levels created in the Extension process may be a subset of another Beat Level. A
Beat Level is a subset of another Beat Level if it has a similar defining interval to the other Beat Level
and all its Events can be found in the other Beat Level. In the diagram below, FIG 5.13, BL1 has a
similar interval to BL2 and the events in BL2 are a subset of the events in BL1. BL2 and BL1
essentially represent the same Beat Level. BL2 contains one fewer Event as it was created later
in the Search Space Window.
FIG 5.13 – BL1 and BL2 are essentially duplicates as they have similar average intervals and all the
events in BL2 are in BL1
Before forming Metrical Hypotheses with the Beat Levels remaining after Extension, RENI
consolidates its collection of plausible Beat Levels by detecting and discarding duplicate Beat Levels.
This ensures that all the Beat Levels examined in the subsequent Hypothesising process are unique.
Discarding duplicate Beat Levels prevents redundant processing during the next phase of
Hypothesising and prevents the inference of duplicate Metrical Hypotheses.
RENI detects duplicate Beat Levels by comparing the average interval of every combination of Beat
Levels. Where two Beat Levels have average intervals of similar value and share a large number
of common notes, they are deemed to be duplicates. Where RENI determines
that two Beat Levels are duplicates, it discards the Beat Level with fewer events.
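The consolidation step can be sketched as follows; the interval-similarity and event-overlap thresholds are illustrative assumptions:

```python
# Sketch of Consolidation: Levels with near-equal average intervals that
# share most of their events are duplicates; the one with fewer events is
# dropped (thresholds are assumptions).
def consolidate(levels, interval_tol=0.05, overlap=0.8):
    kept = []
    # examine the richest Levels first so duplicates lose to them
    for lv in sorted(levels, key=lambda l: len(l["events"]), reverse=True):
        def duplicates(k):
            similar = abs(lv["interval"] - k["interval"]) <= \
                interval_tol * max(lv["interval"], k["interval"])
            shared = len(set(lv["events"]) & set(k["events"]))
            return similar and shared >= overlap * len(lv["events"])
        if not any(duplicates(k) for k in kept):
            kept.append(lv)
    return kept
```

Sorting by event count first guarantees that when a duplicate pair is found, the Level with more events is the one already kept.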
5.4.2 Hypothesising
Once Consolidation has been completed, RENI begins searching for Metrical Hypotheses. As was
described in section 4.3, two (or more) Beat Levels can be combined to form a Metrical Hypothesis if
they meet the following criteria:
● The average interval of the Beat Level with the larger interval is an integer multiple of the
average interval of the Beat Level with the smaller interval.
● The two Beat Levels are aligned and share common events. An event in a larger Beat Level in
a hypothesis should also be present in every smaller Beat Level.
The diagram below, FIG 5.14, illustrates how Beat Levels can be legitimately combined to form a
Metrical Hypothesis.
Fig 5.14 – The interval of BL1 is an integer multiple of BL2's interval. All the notes in BL1 are in
BL2. The two Beat Levels can therefore be combined to form the Metrical Hypothesis MH1.
The Interpreter infers Metrical Hypotheses by searching for Beat Levels which are compatible
according to criteria listed above and combining them. It does this using a recursive search procedure.
It starts with a smaller Beat Level and combines it with larger compatible Beat Levels to form
Metrical Hypotheses. Conceptually, Metrical Hypotheses are formed from the bottom Beat Level
(Tactus) up. This procedure terminates when the number of Beat Levels added to a Metrical
Hypothesis exceeds a limit defined in the Parameters module.
Before searching for Hypotheses, the Interpreter arranges the Beat Levels in order of the size of
their average interval (smallest first). RENI then loops through a number of Beat Levels, trying to
find Metrical Hypotheses for which particular Beat Levels are the Tactus (the bottom Beat Level in a
Metrical Hypothesis).
In order to find the Hypotheses for which a particular Beat Level, B1, is the Tactus, the Interpreter:
1. Takes B1 and treats it as the base Beat Level or Tactus.
2. Searches for every Beat Level which has a larger average interval and is compatible with
B1 (B2, B3 ... BN etc.).
3. For each of the compatible Beat Levels found, RENI finds all the Hypotheses (Partial
Hypotheses) which have the compatible Beat Level as their Tactus. This is where
recursion is used: for each compatible Beat Level found in step 2, this entire procedure
(steps 1-4) is run to find Partial Hypotheses based on it.
4. As the Partial Hypotheses based on each Beat Level compatible with B1 are returned, the
Interpreter adds B1 to the bottom of each of these Hypotheses and adds them to the
collection of inferred Metrical Hypotheses. Although conceptually we are starting from
the bottom when building Metrical Hypotheses, in practical terms, owing to the recursive
nature of the search procedure, Metrical Hypotheses are actually built from the top down.
The search procedure operates subject to the following constraints and rules:
● Metrical Hypotheses are only allowed to have a limited number of levels, usually 2-3. This
limit is enforced as a constraint on the depth of the recursive procedure and can be adjusted
in the Parameters Module.
● With many potential Beat Levels this search strategy could be very intensive. As it is
implausible that some of the larger Beat Levels could be the correct Tactus, the Interpreter
limits the number of Beat Levels which can be Tactuses in the final collection of inferred
Metrical Hypotheses. The search is constrained according to this limit.
● Ideally RENI should know nothing about the time signature of the performance when
inferring Metrical Hypotheses. However as stated in section 3.4.2, the application does
provide the facility to indicate this to the Interpreter by setting a value in the Parameters
module. This parameter is a vector which indicates how much of a multiple each level in the
Metrical Hypothesis should be of the Level below (the bottom Beat Level is assigned a
multiple value of 1). If we want the final Hypotheses to imply 4/4, we would set this
parameter to the value {1,4}.
● For a particular base Beat Level, there may be no compatible Beat Levels. In such instances,
RENI creates a one level Metrical Hypothesis. Due to the absence of Beat Levels which are
compatible with its Tactus, a one level Metrical Hypothesis is unlikely to be correct in the
context of the musical performance. One level Metrical Hypotheses are therefore separated
from multi level Metrical Hypotheses and will only form the basis of the percussive
accompaniment produced if no multi level Metrical Hypotheses are found.
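The recursive search above can be sketched as follows. The compatibility test combines the integer-multiple and alignment criteria from section 4.3; the tolerance value and representation are assumptions:

```python
# Sketch of the recursive Hypothesis search. "compatible" checks that the
# larger Level's interval is (near) an integer multiple of the smaller's
# and that the larger Level's events all appear in the smaller.
def compatible(small, large, tol=0.05):
    ratio = large["interval"] / small["interval"]
    near_integer = round(ratio) >= 2 and abs(ratio - round(ratio)) <= tol
    aligned = set(large["events"]) <= set(small["events"])
    return near_integer and aligned

def hypotheses_with_tactus(tactus, levels, max_levels=3):
    """All Hypotheses (lists of Levels, Tactus first) rooted at tactus."""
    results = [[tactus]]                      # the one level Hypothesis
    if max_levels > 1:
        for lv in levels:
            if lv is not tactus and compatible(tactus, lv):
                # Partial Hypotheses rooted at the compatible Level
                for partial in hypotheses_with_tactus(lv, levels,
                                                      max_levels - 1):
                    results.append([tactus] + partial)
    return results
```

The `max_levels` argument plays the role of the depth limit from the Parameters module, and the lone `[tactus]` entry corresponds to the one level Hypothesis case discussed above.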
As Hypotheses are formed, the following is determined for each Hypothesis and encoded in RENI's
representation of a Metrical Hypothesis:
● The number of levels in the Metrical Hypothesis
● Designations of the Beat Levels in the Hypothesis that represent the Tactus and the Measure.
● The multiples or relative sizes of the Beat Level Intervals. For a two level Hypothesis with
Beat Levels of average Intervals 500 and 1000 respectively the value for multiples is {1, 2}.
This value indicates the time signature implied by the Hypothesis.
These attributes are used by the Judges when calculating plausibility scores for the Metrical
Hypotheses (see section 5.7).
5.4.3 Deciding
Once the Hypothesising process has been completed, the Interpreter is left with a collection of
Metrical Hypotheses. Only one of these can form the basis of the percussive accompaniment
ultimately generated by RENI. RENI must therefore identify the most plausible Metrical Hypothesis.
In order to identify the most plausible Metrical Hypothesis, the Interpreter gets the Judges to score
each Hypothesis. The Judges score each Metrical Hypothesis on separate criteria and collectively give
the Hypothesis an overall plausibility score. Once each Metrical Hypothesis has been scored, the
Interpreter selects the Hypothesis with the highest score and indicates this to the Main RENI module.
The selected Hypothesis forms the basis of the percussive accompaniment generated by RENI.
5.5 Ranking and selecting Metrical Hypotheses
The Judges are used by the Interpreter in deciding on the best Metrical Hypothesis. Each Judge
scores each Metrical Hypothesis according to its own set of criteria. These scores are then combined
in a weighted manner to assign an overall score to the Hypothesis. The Judges are independent of one
another and there is the potential for RENI to implement further Judges.
As currently constituted, RENI uses three Judges: the Timing Judge, the Salience Judge
and the Statistical Judge. RENI assigns a different weighting to the scores of each Judge in calculating
the final score, based on the importance of the criteria they assess.
Developing a scoring system to rate Metrical Hypotheses is not clear cut. Determining the metrics to
use in calculating the final score and the weights that should be assigned to each of these metrics is a
challenging task. Some of the metrics described in the forthcoming sections are based on or adapted
from metrics used in Rosenthal's Machine Rhythm (1992), while others are original to RENI. The
weightings assigned to these metrics were determined though repeated trials and adjusted on the basis
of analysis of RENI's output.
5.5.1 Timing Judge
The Timing Judge assesses the timing consistency of a Hypothesis; in particular, the equality of
the inter onset intervals in the Beat Levels of the Hypothesis. It assumes that greater consistency and
equality in these intervals contribute to a more plausible Metrical Hypothesis.
It calculates the following for each Metrical Hypothesis:
● The standard deviation of the interval between consecutive events in the Tactus Beat Level -
Once the Tactus Beat Level has been extended, its defining value is the average interval
calculated over all the intervals between consecutive notes that occur within the Beat Level.
The Timing Judge calculates the standard deviation of this interval. A smaller standard
deviation indicates a more consistent and unvarying interval between successive events and
therefore a more plausible Beat Level and Metrical Hypothesis.
● Similarly, the standard deviations of the other Beat Levels in the Hypothesis are calculated.
● How close to an integer multiple the Measure Level interval is of the Tactus Level interval. A
Metrical Hypothesis whose Measure interval is equal to 4.0 * The Tactus interval is assigned a
higher score than a Hypothesis whose Measure Level is 3.9 * the Tactus interval. For a
Metrical Hypothesis, the closer the value of its Measure Level interval divided by its Tactus
Level interval is to an integer value, the more plausible it is assumed to be.
The first two metrics were used in Rosenthal (1992), while the Measure/Tactus multiple metric is an
original metric incorporated into RENI. Smaller values for each of these three metrics correspond to
higher scores for a Metrical Hypothesis. The Timing Judge calculates a weighted sum of these values
and converts the result into a score to be used in the calculation of the overall Metrical
Hypothesis plausibility score.
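The Timing Judge's scoring might be sketched as follows. The weights, and the convention of negating the summed penalty so that higher scores are better (with a perfectly timed Hypothesis scoring zero), are assumptions:

```python
import statistics

# Illustrative sketch of the Timing Judge: interval standard deviations for
# each Beat Level, plus the distance of the Measure/Tactus ratio from an
# integer, are accumulated as a penalty and negated.
def timing_score(hypothesis, w_sd=1.0, w_mult=100.0):
    """hypothesis: list of Beat Levels, Tactus first, Measure last."""
    penalty = 0.0
    for level in hypothesis:
        gaps = [b - a for a, b in zip(level["events"], level["events"][1:])]
        if len(gaps) > 1:
            penalty += w_sd * statistics.stdev(gaps)
    # how close the Measure interval is to an integer multiple of the Tactus
    ratio = hypothesis[-1]["interval"] / hypothesis[0]["interval"]
    penalty += w_mult * abs(ratio - round(ratio))
    return -penalty
```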
5.5.2 Statistical Judge
The Statistical Judge accumulates statistics on the musical performance being listened to and on the
Metrical Hypotheses inferred by the Interpreter. These statistics are used to calculate an overall
statistical score for individual Hypotheses. These statistics and the metrics they are used to calculate
are:
● The Overall Average interval between consecutive notes within the Extension Window of the
Performance. For each particular Metrical Hypothesis, the absolute difference between the
Hypothesis's Tactus interval and the Overall Average interval is subtracted from the overall
statistical score.
● A Frequency Histogram of the Tactus Level intervals of the inferred Hypotheses is created
and used to score individual Hypotheses. For example, if there are 3 Hypotheses with a Tactus
interval of 500 milliseconds, then a Hypothesis with a 500 millisecond interval will be assigned
a score proportional to 3 (3 multiplied by a multiplier parameter).
● A Multiples Histogram is created from the Tactus Level intervals of the inferred Hypotheses
and the Tactus Level intervals which are integer multiples of these, and is used to score
individual Hypotheses. For example, if there are 2 Hypotheses with Tactus Levels of 500
milliseconds and 2 Hypotheses with a Tactus Level of 1000 milliseconds (500*2), then a
Hypothesis with a 500 millisecond interval will be assigned a score proportional to 4.
The overall statistical score is determined by calculating a weighted sum of the Histogram metrics and
subtracting the score calculated using the Overall Average interval. The principles underlying these
metrics are discussed below.
Both of the Histogram based metrics are based on the premise that the perception of a particular Beat
Level will be reinforced by events spaced by multiples of that Beat Level's interval. As used here,
these metrics are an adaptation of a similar method used by Rosenthal (1992), who uses a histogram
construct to select a Tactus at the start of the Beat Tracking process in Machine Rhythm and bases the
search for Beat Levels on this Tactus. In RENI, the histogram construct is instead used to score
inferred Metrical Hypotheses.
In the case of the Overall Average interval, this value will be inversely proportional to the number of
events that occur in the Extension Window. A larger number of events in the Extension Window
implies a faster tempo and suggests that the correct Metrical Hypothesis will have a smaller Tactus
interval. Conversely, a smaller number of events in the Extension Window implies a slower tempo
and that the correct Metrical Hypothesis will have a larger Tactus interval. Subtracting the absolute
difference between the Overall Average interval and the Tactus interval of a particular Hypothesis
from the statistical score therefore favours smaller Tactus intervals when the Overall Average is smaller
and larger Tactus intervals when the average is larger. This metric, which was developed originally for
RENI, helps prevent the Judges from selecting a Metrical Hypothesis whose Tactus interval is too
large simply because that Hypothesis is calculated as being more consistently timed than a Hypothesis
with a more appropriate, smaller Tactus interval.
5.5.3 Salience Judge
Whereas the first two Judges are primarily focused on the Tactus Level, the Salience Judge focuses on
the Measure Level and is particularly important in the selection of a Metrical Hypothesis which
implies the correct metre. The Salience Judge works on the premise that events on the Measure Level
coincide with strong beats and should therefore be more salient. Accordingly, the Salience Judge
assesses the salience of events in the Measure Level of the Hypothesis. The greater the average
salience of the events in the Measure Level, the greater the score assigned to the Metrical Hypothesis.
The salience of an event is determined by:
● Absolute Duration – the greater the duration of the event, the greater the salience of the event.
● Relative Duration – if an event is preceded by events which are noticeably shorter in duration
than it, then the event will be more salient.
● Loudness – the louder the note, the more salient it is.
● Chords – Chords are more salient than single-note events.
● Ghosts – Ghost events are not perceived and therefore possess no salience.
Corresponding with these salience attributes, the following metrics are computed by the Salience
Judge in order to calculate the overall salience score for a Metrical Hypothesis.
● Absolute Duration – the average absolute duration of the events in the Measure Level.
● Relative Duration – for each event in the Measure Level, the durations of the four events in
the overall performance that precede it are compared to the duration of the Measure Level
event. The Salience Judge counts how many of these four events have a duration less than
66% of the duration of the particular Measure Level event. The average of this count across
the events in the Measure Level contributes to the calculation of the salience score.
● Loudness – The average volume (or velocity in MIDI terminology) of Measure Level events
is calculated.
● Chords – The percentage of the events in the Measure Level which are Chords is calculated.
● Ghost Events – The percentage of the events in the Measure Level which are Ghosts is
calculated.
The salience score is calculated as a weighted sum of the first four metrics listed above, minus a
weighted value of the Ghost metric.
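As a sketch of how these five metrics might combine (the event representation and weight values here are assumptions for illustration, not RENI's actual data structures):

```python
def salience_score(measure_events, weights=(1.0, 1.0, 1.0, 1.0),
                   ghost_weight=1.0):
    """Illustrative salience score for one Metrical Hypothesis.

    Each event is modelled as a dict with keys: duration (ms),
    prev_durations (durations of the four preceding events), velocity
    (MIDI 0-127), and is_chord / is_ghost flags.
    """
    n = len(measure_events)
    avg_duration = sum(e["duration"] for e in measure_events) / n

    # Relative duration: of the four preceding events, how many are
    # shorter than 66% of this event's duration, averaged over the Measure.
    rel = sum(sum(1 for d in e["prev_durations"]
                  if d < 0.66 * e["duration"])
              for e in measure_events) / n

    avg_velocity = sum(e["velocity"] for e in measure_events) / n
    chord_pct = 100.0 * sum(e["is_chord"] for e in measure_events) / n
    ghost_pct = 100.0 * sum(e["is_ghost"] for e in measure_events) / n

    w_dur, w_rel, w_vel, w_chord = weights
    return (w_dur * avg_duration + w_rel * rel + w_vel * avg_velocity
            + w_chord * chord_pct - ghost_weight * ghost_pct)
```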
5.5.4 Calculating the overall plausibility score
The overall score assigned to a Hypothesis is a weighted sum of the scores assigned to the
Hypothesis by all three Judges. The weightings applied to each Judge's score are set as parameters.
5.6 Producing output
Once the Interpreter has selected the highest ranking Metrical Hypothesis on the basis of the scores
assigned by the Judges, RENI's Drummer module uses this Hypothesis to produce percussive
accompaniment.
A Metrical Hypothesis will be presented to the Drummer soon after the completion of the Extension
process. This Metrical Hypothesis tracks the Beats that occurred during the Extension Window. The
combination of Beat Levels in the Metrical Hypothesis can be projected into the future, on the basis
of their average intervals, to predict the location of beats in the remainder of the performance. The
Drummer aims to indicate these future beats (strong and weak) aurally, as they occur, by producing
simple percussive accompaniment.
The Drummer must use the information contained in the Hypothesis to determine the best time to start
accompanying, and the form of accompaniment to be generated. It does this by determining the
location of the next beat to occur at the Measure level and beginning its accompaniment at this point.
This location is calculated by adding the position of the last beat in the Measure Level and the
Interval of the Measure Level. The Drummer then determines the current point in time and calculates
how long it will have to wait before beginning its accompaniment. When accompaniment begins, the
Drummer taps along at the Tactus Level.
The Drummer uses the interval multiples information in the Metrical Hypothesis to determine the
combinations of strong and weak beats it should indicate in its output. For a {1, 4} multiple,
indicating a 4/4 time signature, the Drummer plays one strong beat followed by three weak beats. For
a {1, 3} multiple, the Drummer taps one strong beat followed by two weak beats. The type of beat is
distinguished using different tapping sounds.
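The Drummer's scheduling and pattern selection can be sketched as follows. This is an illustration only; the function names are invented here, and times are in milliseconds.

```python
def next_measure_beat(last_measure_beat_ms, measure_interval_ms):
    """Time of the next Measure-level beat: the position of the last beat
    in the Measure Level plus the Interval of the Measure Level."""
    return last_measure_beat_ms + measure_interval_ms

def tap_pattern(multiple):
    """Strong/weak pattern for one Measure, given the interval multiple.

    A {1, 4} multiple (4/4) yields one strong beat and three weak beats;
    a {1, 3} multiple (3/4) yields one strong beat and two weak beats.
    """
    beats_per_measure = multiple[1] // multiple[0]
    return ["strong"] + ["weak"] * (beats_per_measure - 1)
```

The wait before the first tap is then simply the difference between the next Measure-level beat time and the current time.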
As indicated in section 3.5.5, the Drummer also produces textual output, writing each time at which it
taps a beat to a text file.
Once the Drummer has started, RENI's Beat Tracking algorithm is complete.
5.7 Re-hypothesising
The sections above describe the full operation of RENI's Beat Tracking algorithm. This algorithm
infers a Metrical Hypothesis and produces accompaniment on the basis of the analysis of the portion
of a musical performance that occurs between the start of the performance and the expiration of the
Extension Window. Implicit in this is an assumption that the contents of the Extension Window and
the beat and metre implied by it are indicative of the beat and metre of the remainder of the
performance.
This assumption may be erroneous, especially if we assume that RENI views musical input as
essentially improvised. The performer may change the tempo and metre of the piece being performed
at any point in the performance. A real-life percussive accompanist is likely to react to such changes
and adjust their accompaniment appropriately. As a model of such accompanists, RENI can also be
set to accommodate such changes by re-hypothesising.
When RENI is set to re-hypothesise and react to changes in tempo, it runs its Beat Tracking
algorithm repeatedly, over consecutive windows of time in the performance, and produces percussive
accompaniment on the basis of each new Metrical Hypothesis inferred and selected by its Beat
Tracking algorithm.
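This repeated application can be sketched as a loop over independent windows. This is a structural sketch only; infer_hypothesis stands in for one full pass of the algorithm described above, and the event representation is assumed.

```python
def track_with_rehypothesis(events, extension_window_ms, total_ms,
                            infer_hypothesis):
    """Run the Beat Tracking algorithm over consecutive windows.

    events: note events, each carrying an onset time in milliseconds.
    infer_hypothesis: stand-in for one pass of the Beat Tracking
        algorithm; it receives the events inside one Extension Window and
        returns the selected Metrical Hypothesis for that window.
    """
    hypotheses = []
    start = 0
    while start < total_ms:
        end = start + extension_window_ms
        window = [e for e in events if start <= e["onset_ms"] < end]
        if window:  # skip silent windows
            hypotheses.append(infer_hypothesis(window))
        start = end  # each window is treated as independent of the others
    return hypotheses
```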
5.8 Parameters
The operation and precise behaviour of RENI's Beat Tracking algorithm may be influenced by the
value of the adjustable parameters in the Parameters module. These parameters have already been
alluded to in section 4.5.8. They are summarised below.
● File or Instrument flag – indicates how RENI accepts input.
● File Name – if accepting input from a file, this indicates the location of the file.
● Search Space Window duration – indicates in milliseconds the length of the Search Space
Window. A longer Search Space Window results in a larger search space of potential Beat
Levels.
● Extension Window Duration – indicates in milliseconds the length of the Extension Window.
This should ideally be at least twice the duration of the Search Space Window. A longer
Extension Window should result in RENI finding more plausible Beat Levels but also
lengthens the time taken to produce accompaniment.
● Ideal Next Time Window Width – the value used to calculate the Ideal Next Time Window,
expressed as a percentage of the Interval of the Beat Level being extended. The larger this
value, the wider the Ideal Next Time Window, thereby making the Extension process less
precise and more forgiving of performer error.
● Tactus Threshold – indicates the number of Beat Levels that should be treated as a Tactus
when the Interpreter searches for Hypotheses. The higher this number, the greater the number
of Metrical Hypotheses inferred.
● Metre Heuristic – indicates if the Metre Heuristic should be used. If it is set to be used, an
associated parameter is set which constrains the Interpreter when searching for compatible
Beat Levels to form Hypotheses by indicating the multiple relationship that should exist
between the Beat Levels in all the inferred Hypotheses. Although this parameter runs contrary
to the hypothesis under investigation (as it gives RENI information about the musical
performance), it was deemed to be an interesting addition to the application.
● Compatibility Metric – indicates how close to an integer the ratio of one Beat Level's interval
to another Beat Level's interval must be for the two to be considered compatible. For larger
values of this metric, the ratio has to be closer to an integer for the two Beat Levels to be
considered compatible.
● Judging Weights – indicates the weights that should be assigned to the score calculated by
each Judge when calculating the overall score for a Hypothesis.
● Re-hypothesise Flag – if set, then RENI will continually apply its Beat Tracking algorithm and
re-hypothesise over consecutive Extension Windows in the performance.
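Collected together, the parameter set might look like the following sketch. Names and values here are illustrative assumptions; the real Parameters module may organise them differently.

```python
# Illustrative parameter set for RENI; all values are examples only.
reni_parameters = {
    "input_source": "file",           # File or Instrument flag
    "file_name": "performance.mid",   # used when input_source == "file"
    "search_space_window_ms": 3000,   # Search Space Window duration
    "extension_window_ms": 6000,      # ideally >= 2x the Search Space Window
    "ideal_next_time_width_pct": 20,  # % of the Beat Level interval
    "tactus_threshold": 4,            # Beat Levels treated as a Tactus
    "use_metre_heuristic": False,     # Metre Heuristic on/off
    "metre_multiple": None,           # set when the heuristic is enabled
    "compatibility_metric": 0.9,      # closeness-to-integer requirement
    "judge_weights": {"timing": 1.0, "statistical": 1.0, "salience": 1.0},
    "re_hypothesise": False,          # Re-hypothesise Flag
}
```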
6.0 Evaluation
This section describes the methods used to evaluate RENI. In particular it discusses the challenge of
evaluating Beat Tracking applications and algorithms and offers justification for the methods used in
this project. The results of the evaluation are then presented; a fuller discussion of the results and
their implications follows in the final section.
6.1 Aim of evaluation
This project aimed to assess RENI's proficiency in the tasks of Beat Tracking and Metrical Analysis,
and also RENI's value as a performance accompaniment tool. It was therefore decided to conduct two
types of evaluation:
● Quantitative Functional Evaluation – evaluating how accurate RENI is at tracking beats and
inferring metre in musical performances in real time.
● Subjective Evaluation – assessing RENI from the perspective of the musical performer it
accompanies; in terms of its performance of core functionality and its value and potential as a
performance accompaniment tool.
6.2 Difficulties in evaluating Beat Tracking applications
The common approach used for evaluating Beat Tracking applications is to compare the beat
locations identified by the application in a musical performance to an annotation indicating the
location of beats in the same performance. This annotation is treated as the ground truth data and may
be created manually by human listeners or inferred from the score of the piece being performed.
This approach, however, is not ideal. The difficulties of evaluating Beat Tracking applications in this
manner, and in general, are well established in the literature.
Dixon (2007) identifies three such issues:
● The task of Beat Tracking is not uniquely defined, but depends on the application. Ambiguity
exists both in the choice of metrical level and the precise placement of beats. Human listeners
may disagree on the precise location of beats, hence it cannot be said that the annotation of
musical input is absolutely correct.
● The availability of test data may also be a major constraint. Manual annotation in order to
create ground truth data is labour-intensive and time-consuming, so it is difficult to create test
sets large enough to cover a wide range of musical styles and give statistically significant
results.
● Comparison against other Beat Tracking applications is also difficult, as some systems are
designed for a limited set of musical styles, which leads to the question of whether such
systems can be compared with other systems at all.
6.3 Quantitative functional evaluation
Despite the difficulties identified in the previous section, it was decided to use the established method
of comparing the application's output to annotations in order to evaluate the core functionality of
RENI. The beat locations identified by RENI in a musical piece were compared to the beat locations
indicated in an annotation of the same performance. A series of statistics were produced from these
comparisons to indicate RENI's performance.
6.3.1 Test Data
A corpus of about thirty MIDI recordings of named musical performances and accompanying scores
was collected during the development of the application. A subset of these MIDI files was used
during the development of the application; the rest were set aside for use in evaluation.
This corpus consists mostly of recordings of well known western style music. It contains recordings
of both popular and classical musical pieces in a variety of metres. A listing of these recordings,
including the names of the musical pieces and their respective composers, is included in the
Appendix. These files are available to download at http://donalmulvihill.wordpress.com/reni
As there were no known annotations indicating Beat Locations for any of the files collected, manual
annotation of the files was necessary. A number of volunteers agreed to annotate the files. They were
directed to listen to a musical performance and tap along to it on a MIDI keyboard, indicating strong
and weak beats using different keys. A complementary application, BEAT-REC, was developed to
record in a text file the points in time at which the participants tapped. The annotators were given no
information on the musical pieces prior to performing the annotation. They were required to infer the
beat and metre themselves, based on their own interpretation in real time.
The drawback of manual annotation as described above is that the annotations produced cannot be
deemed to be 100% accurate and objective. An annotator may make an error and two annotators may
infer different metres or tap along at a different tempo for the same musical piece.
However, the manner in which the annotations were created exemplifies the task RENI is trying to
emulate: a human performer inferring, in real time, the beat and metre of a piece of which they have
no prior knowledge. Therefore, when we compare the annotation produced to the output of RENI for
the same piece, we are comparing RENI's performance to that of a human performing the same task
under the same conditions.
6.3.2 Data from RENI
As has already been described, RENI produces textual output for the purposes of evaluation. For this
evaluation, such data was generated in a series of trials where RENI attempted to track the beats and
infer the metre of the test MIDI recordings. As the output produced by RENI varied between trials
on the same performance, due to timing issues on the computer used and to changing Extension
Window durations, it was decided to average these differences out over multiple trials. Therefore, for
each MIDI performance in the test set, RENI was run multiple times with different values of the
Search Space and Extension Windows.
As was explained earlier, it was decided to build a heuristic into RENI which, when turned on, gives
RENI an indication of the correct metre. A number of trials were run with this heuristic off and a
number with it turned on, the intention being to compare RENI's Beat Tracking performance with
and without the heuristic.
6.3.3 Comparing RENI's output to the annotations.
A comparison application, BEAT-COMP, was developed in conjunction with RENI. This application
compares the output of RENI to an annotation of the same recording and produces a number of
statistics. It calculates:
● The percentage of beat locations which RENI locates accurately, independent of whether they
are strong or weak – as the beat locations recorded may be affected by latencies and
inconsistencies in the timing of the computer that RENI and BEAT-REC were running on,
beat locations which are within a certain offset of each other are considered to be indicative of
the same beat.
● The percentage of beat locations and types of beat that RENI identifies accurately – this
percentage indirectly measures the degree to which the correct metre was inferred.
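The tolerance-based matching behind the first metric can be sketched as follows. The 100 ms tolerance is an assumption for illustration; the thesis does not state the exact offset used by BEAT-COMP.

```python
def matched_beats(reni_beats, annotation_beats, tolerance_ms=100):
    """Percentage of annotated beats that RENI located within a tolerance.

    Beat times (in ms) within tolerance_ms of each other are treated as
    the same beat, absorbing latency and timing jitter on the host machine.
    """
    matched = 0
    used = set()
    for ann in annotation_beats:
        for i, beat in enumerate(reni_beats):
            if i not in used and abs(beat - ann) <= tolerance_ms:
                matched += 1
                used.add(i)
                break  # each RENI beat can match at most one annotation
    return 100.0 * matched / len(annotation_beats)
```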
However, basing an evaluation of RENI on only these two metrics would be insufficient and
potentially misleading. RENI is rarely 100% correct or 100% wrong. For a particular musical
performance, RENI may select a Metrical Hypothesis different from that implied by the human
annotation; however, this does not mean that it is entirely incorrect. An incorrect hypothesis and its
corresponding accompaniment may still be perceived as sounding appropriate. There can be varying
degrees of correctness in RENI's output.
For example, for a particular performance, the human annotator may hypothesise a 4/4 metre with a
Tactus interval of 500 milliseconds while RENI may hypothesise a 2/4 metre with a 1 second Tactus
interval. Although RENI has hypothesised incorrectly, the percussive output produced would not be
perceived by a human listener as being completely incorrect. This is because the Tactus interval
hypothesised by RENI is only twice the correct Tactus interval, and the Measure intervals of RENI's
and the annotator's hypotheses would be the same. Therefore, if correctly aligned, every beat
indicated by RENI in the output produced would have an equivalent in the human annotation. It
would therefore be erroneous to treat this Hypothesis as incorrect in the same way as a hypothesis of
3/4 with a Tactus interval of 0.66 seconds is incorrect.
Therefore, a number of additional attributes of RENI's selected Metrical Hypothesis and output must
be considered in determining the congruence between the interpretations of RENI and a human
annotator for a particular musical piece. These include:
● The Tactus interval – Ideally, RENI's Tactus interval should be the same as that indicated by
the human annotators. However, Tactuses which are integer multiples of the correct Tactus
can still sound correct (or not incorrect). For example, with a correct Tactus interval of 500
milliseconds, a hypothesised interval of 1 second or 250 milliseconds (related multiples of 500
milliseconds) is less incorrect than a hypothesised interval of 666 milliseconds or 300
milliseconds.
● Metre – similarly to the way in which there can be incorrect intervals which are related to the
correct interval and therefore not entirely incorrect, there can also be “incorrect but not
entirely incorrect” metres inferred. For example, this applies to the relationship between a 2/4
metre and a 4/4 metre.
These two attributes cannot be treated entirely in isolation either. RENI may hypothesise the same
Tactus interval and metre as a human, but if the beat locations identified by RENI are not the same as
the human annotator's, then RENI cannot be said to be correct.
Therefore, assessing the four items described above in combination (the two percentage metrics, the
Tactus interval and the metre) allows for an assessment of how correct RENI's output is for a
particular musical performance.
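One way to operationalise the notion of a "related multiple" Tactus is sketched below. The 5% relative tolerance is an assumption, not a value taken from the thesis.

```python
def tactus_relation(hypothesised_ms, annotated_ms, tolerance=0.05):
    """Classify a hypothesised Tactus interval against the annotated one.

    Returns "correct" when the intervals match within a relative
    tolerance, "related" when one is an integer multiple or divisor of
    the other, and "incorrect" otherwise.
    """
    ratio = hypothesised_ms / annotated_ms
    if abs(ratio - 1.0) <= tolerance:
        return "correct"
    for m in (2, 3, 4):
        if abs(ratio - m) <= tolerance * m:        # integer multiple
            return "related"
        if abs(ratio - 1.0 / m) <= tolerance / m:  # integer divisor
            return "related"
    return "incorrect"
```

Against an annotated Tactus of 500 ms, intervals of 1000 ms and 250 ms classify as related, while 666 ms classifies as incorrect.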
6.4 Subjective evaluation
The subjective evaluation was conducted to obtain the views of musicians on RENI's value and
potential as a performance accompaniment tool.
Trials were carried out with a number of musicians. These musicians participated in a number of 60
second trials in which they were asked to play musical pieces of their choice on a MIDI keyboard and
have RENI accompany them. Two different types of trial were carried out, corresponding to RENI's
two different settings:
● RENI analysing and basing its accompaniment on the start of the performance only.
Accompaniment was then kept constant throughout the remainder of the performance.
● RENI continually re-hypothesising and updating its accompaniment by analysing consecutive
windows of time
Participants were then asked to rate RENI's performance in a short questionnaire. They were first
asked about their musical background and performance ability, and directed to rate how well timed
the pieces they performed were. They were then asked the following about their opinion of RENI's
performance:
● How good was RENI at tapping along in time with the pieces they performed? (rating out of
10)
● How good was RENI at inferring the metre (or locating the strong beats) in the pieces they
performed? (rating out of 10)
● How well did RENI perform when set to continually hypothesise? (rating out of 10)
● Did they prefer having RENI accompany them when set to continually hypothesise?
● How useful is RENI as a performance accompaniment tool? (rating out of 10)
● Did they have any further comments, observations or recommendations?
The questionnaire presented to participants in the subjective evaluation of RENI is included in the
Appendix.
6.5 Results
6.5.1 Quantitative functional evaluation results
The following results were accumulated over more than 200 trials, comparing RENI's output with
human annotations of the same files.
For trials where RENI was not using the Metre heuristic:
● RENI located an average of 60% of beat locations correctly, independent of type. It should
be noted that the percentage of beats located was also affected by the size of the interval
hypothesised relative to the correct Hypothesis. For example, where RENI inferred a beat
interval twice that of the correct interval, it could only identify 50% of the beats
correctly.
● RENI identified the location and type (strong or weak) of 40% of beats correctly.
● For 80% of trials, RENI hypothesised either the correct Tactus interval or a related multiple of
the correct Tactus interval. This can be broken down as follows:
○ RENI hypothesised the correct Tactus interval in 60% of trials.
○ RENI hypothesised a Tactus interval which is an integer or related multiple of the
correct Tactus interval in 20% of trials.
● In 55% of trials, RENI identified either the correct metre or a metre similar or related to the
correct metre. In 27% of trials, RENI identified the correct metre.
For trials where RENI was using the Metre heuristic:
● RENI located an average of 60% of beat locations correctly, independent of type.
● RENI identified the location and type of 38% of beats correctly.
● For 72% of trials, RENI hypothesised either the correct Tactus interval or a related multiple of
the correct Tactus interval. This can be broken down as follows:
○ RENI hypothesised the correct Tactus interval in 54% of trials.
○ RENI hypothesised a Tactus interval which is an integer or related multiple of the
correct Tactus interval in 18% of trials.
6.5.2 Subjective evaluation results
The following results were gathered from the questionnaires filled out by the participants in the
subjective evaluation trials.
In terms of the musical background of the participants and the pieces they performed in the trials:
● All participants had received some formal musical training and rated their performance ability
between 3 and 8 (with a rating of 10 being concert level). The average rating of performers'
ability was 6.5.
● All participants performed improvised pieces with discernible beats and metres.
● The average rating given by performers for the timing consistency of their performances was
5.
● The average rating given by performers for the rhythmical complexity of the pieces they
performed was 3.75. A higher rating for rhythmical complexity indicated pieces which were
harder to tap along to.
For their evaluation of the performance of RENI, participants assigned an average score, out of 10, of:
● 5.4 for RENI's ability to tap along with what they were playing.
● 5.25 for RENI's ability to identify the location of strong beats.
● 4.0 for RENI's ability to keep tempo with performances when set to re-hypothesise.
● 5.75 for RENI's value and potential as a performance accompaniment tool.
A narrow majority of participants preferred having RENI accompany them when it hypothesised
based on the opening segment (a 6 second extension window) and then kept the same accompaniment
throughout the remainder of the performance. One participant felt that although the application
worked better when it hypothesised once, they would prefer the application's re-hypothesising
functionality if it performed better.
Participants were also given the opportunity to offer any additional comments and observations. The
most notable of these concerned the application's performance when it was set to re-hypothesise.
Participants were split in their preference for this setting. Some liked the idea of the performer,
rather than the percussive accompanist (in this case the application), being the tempo setter. Others
found the changes distracting, as they synchronised their playing with the tapping of the application
once it started, treating it as the tempo setter in the same way they would synchronise with the
playing of a metronome or a human percussionist. They disliked occasions where RENI subsequently
changed the rate of its tapping, perhaps because it misread an error by the performer as a change in
timing.
However, most complained of technical problems with RENI's performance in the re-hypothesising
mode. They complained that:
● the application would frequently change the Tactus interval of its accompaniment despite them
not having changed the tempo. For example, the application might change from tapping along
at an interval of 500 milliseconds to an interval of 1 second and then back to a 500 millisecond
interval.
● having re-hypothesised, the application would briefly stop providing accompaniment during
the performance before resuming its accompaniment according to a new Metrical
Hypothesis.
6.6 Observations and analysis of the results
These results show that RENI is reasonably good at tracking beats but not as good at inferring metre.
These contentions are borne out in the results of both types of evaluation and also correspond with
observations of the application in operation. Also notable from the subjective evaluation were the
problems and complaints participants had with the re-hypothesising mode. It is therefore worth
considering the reasons for RENI's shortcomings and looking beyond the statistics to observations
noted during the evaluations and while developing the application.
With respect to Beat Tracking, RENI failed to identify either the correct Tactus or a related multiple of
the Tactus in 20% of trials where no Metre heuristic was used. This could be attributable to timing
issues on the laptop on which RENI was evaluated. Closer analysis of the timing scores attributed to
Hypotheses by RENI on the same performances over multiple trials supports this contention, as does
the fact that RENI does not always produce the same output for the same performance with the same
settings over repeated trials.
It was also noticeable during evaluations that RENI had greater difficulty tracking the beats in
rhythmically complex pieces. For example, it does not cope well with syncopated pieces; syncopation
is not something that RENI attempts to address or cope with directly.
In terms of metre, RENI identified the correct metre of performances in only 27% of trials and
identified a related metre in a further 28% of trials. Therefore, in just under half of the trials it failed
to identify a correct, or close to correct, metre.
The inference of metre is likely to have been adversely affected by the timing and rhythmic
complexity issues described above. In terms of rhythmic complexity, RENI is only really suitable for
the inference of simple metres such as 2/4, 4/4 and 3/4.
Another reason for the relatively poorer performance in inferring metre is the lesser attention given to
metrical indicators in the scoring of Metrical Hypotheses. The only metrical indicator examined by
RENI is salience. Detection and analysis of repeating patterns of relative note positions and durations
in the performance would likely have contributed to a better Metrical Analysis performance. The
analysis of salience itself would have been enhanced if tonal relationships between notes were
examined in the detection of Chords.
The problems with re-hypothesising are also partly owing to imperfect timing on the MacBook on
which RENI ran during the evaluation. This, in combination with timing inconsistencies by a performer,
can cause a correct Hypothesis with an interval of 500 milliseconds to score highest the first time
RENI hypothesises and another hypothesis with an interval of 1 second to score highest on
subsequent occasions. RENI allows these changes because it treats the consecutive Extension
Windows within which it hypothesises as independent of each other. This allows RENI to be reactive to
dramatic changes in tempo and metre. However it also makes it overly sensitive to unintentional and
momentary changes owing to an error on the part of a performer who is attempting to keep the tempo
constant.
6.7 Comparison with other Beat Tracking applications
RENI's performance compares favourably with the performance of the Beat Tracking algorithms
evaluated in the 2006 Music Information Retrieval Exchange (MIREX) (McKinney et al 2006). Five
state of the art Beat Tracking and Tempo Extraction algorithms including those described in Dixon
(2007) and Klapuri et al (2006) were evaluated at MIREX 2006 in a manner similar to that used in the
evaluation of RENI. A set of 140 musical excerpts was used, each annotated by 40 different listeners.
On the basis of these, performance metrics were calculated to measure the algorithms' abilities to
locate beats.
On average, the five Beat Tracking algorithms evaluated, all of which operated off-line, located
between 45.3% and 57.5% of beats with a mean performance across the five algorithms of 54%.
Dixon (2007) scored highest with an average of 57.5% and Klapuri et al (2006) was third highest with
an average of 56.4%. RENI, in its evaluation, identified 60% of beat locations correctly while
operating in real time on a standard laptop.
However, it would be incorrect to conclude on the basis of this brief comparison that RENI is superior
to the algorithms evaluated at MIREX 2006, as the comparison is not like for like and therefore not
entirely valid.
● The MIREX evaluation calculated the percentage score differently. The manner in which the
error window for locating the same beats was calculated, and the manner in which overall
percentages were normalised, differed from the approaches used in the RENI evaluation.
● The corpus of musical excerpts used in the MIREX 2006 evaluation was larger, more
comprehensive and covered a greater variety of musical styles than that used in the RENI
evaluation. The number of annotators used was also considerably greater.
● The algorithms evaluated in MIREX 2006 are all audio based. They must therefore perform a
considerable amount of signal processing to infer Beat Tracking cues such as note onsets,
something they may not be able to do with 100% precision. RENI works on MIDI input,
which encodes Beat Tracking cues symbolically and with precision. RENI should be expected
to perform better on this basis.
However, it is still interesting to look at a comparison (however flawed) between RENI and those Beat
Tracking algorithms which are considered state of the art. The MIREX 2006 results, when compared
to RENI's results, demonstrate that relatively simple approaches to Beat Tracking, such as that
implemented in RENI, can be remarkably effective. In and of themselves, the MIREX 2006 results
demonstrate that computational Beat Tracking models still fall short of human Beat Tracking
capabilities.
7.0 Discussion
This section assesses what was achieved in this project: the capabilities of RENI, the extent to which
it met the requirements specified at the start of the project, and the degree to which the aims and
objectives described at the outset were achieved. The hypothesis under investigation is also reflected
upon in light of the outcome of the project, and further work to be carried out on RENI and
suggestions for future research are specified.
7.1 Analysis
7.1.1 Capabilities of RENI
RENI's capabilities as currently constituted fully meet all the requirements set out in section 3.1.
● RENI is capable of accepting musical performance data from a file or an external music
device/instrument in real time.
● RENI infers note onset information from this real time musical input.
● RENI implements a rule based algorithm which uses note onset information to track the beat
and infer the metre of the piece of music being performed.
● RENI produces audible percussive accompaniment to the musical input as it is being played.
● RENI also produces textual output for use in its evaluation.
● RENI is fully operable on a standard personal computer or laptop, so as to be usable by a wide
audience.
In total, RENI contains approximately 3500 lines of code.
7.1.2 Aims and objectives
The results of the evaluation demonstrate that the project achieved its primary aim. RENI and the
algorithm developed for it can track beats and infer the metre of an improvised musical performance
in real time using a rule based approach. In developing RENI the project achieved its related aim of
developing an application that can accept musical signals played in real time by a musician and
produce appropriately synchronised percussive accompaniment.
However, the extent to which these aims have been achieved is limited. As the evaluation results
demonstrate, RENI does not always successfully infer the beat or the metre of improvised musical
performances and does not fully emulate or match a human performing the same task. The application
has particular difficulty in correctly identifying the metre of musical performances.
As already discussed, these limitations are due to the following reasons:
● Inconsistent and inaccurate timing on the laptop RENI ran on during evaluation.
● The difficulty the application has in coping with rhythmically complex pieces. This affects
both the tracking of beats and the inference of metre.
● Salience is the only metrical indicator considered in the scoring of Metrical Hypotheses. The
lack of attention paid to other indicators of metre, such as repeating patterns of note onsets
and durations, together with the imprecise means of detecting chords, contributes to the
relatively poor record of the application in identifying metre.
This project also aimed to investigate and determine additional approaches and techniques that use
note onset information to successfully track the beats of musical signals in real time. Although the
algorithm implemented was based on one described by Rosenthal, there were significant differences.
These differences are reflected in the real time operation of RENI's algorithm as well as the
incorporation of several new techniques and heuristics used to guide the search process and to score
the inferred Metrical Hypotheses.
The operation of RENI's algorithm also contributed to the achievement of the aims and objectives
arising from the project's emphasis on performance accompaniment. The algorithm is implemented in
such a way as to accommodate imperfect timing, and its tolerance for imperfections is adjustable.
RENI can also be set to continually re-hypothesise, allowing it to cope with and respond
appropriately to variations in timing. The operation of this re-hypothesising mode is not entirely
satisfactory, however.
This project also used real musicians in the evaluation of the Beat Tracking application developed and
gauged their experience of being accompanied by it. This revealed differing preferences on the part of
musicians for playing with RENI in the re-hypothesising mode. While some liked the idea of a
computational accompanist acting as a tempo setter, others found it an unnatural means of interacting
with a percussive accompanist. They were more accustomed to synchronising with the output of a
percussive accompanist (such as a metronome or drummer) rather than having the percussive
accompanist continually synchronising and reacting to them.
7.1.3 Hypothesis
The hypothesis under consideration in this project was
“Knowledge of note onset information is sufficient for computationally inferring the
beat and metre of a piece of improvised drumless music in real time using a rule based
approach without any prior knowledge of metre or style for the purposes of providing
simple percussive performance accompaniment”
This hypothesis describes an approach to a Beat Tracking and Metrical Analysis problem under certain
assumed conditions. RENI has been developed in accordance with this hypothesis. It is based on a
rule based algorithm that uses note onset information to track the beats and infer the metre of a
musical performance in real time. Furthermore, it does so without any prior knowledge of style or
metre, and produces simple percussive accompaniment corresponding to the beat and metre inferred.
On the basis of the findings of this project I would conclude that knowledge of note onset times is
sufficient for inferring the beat and metre using a rule based algorithm under the conditions described
in the hypothesis. However, the results also show that this approach (combining note onset
information with a rule based algorithm) is not sufficient in all cases (it fails for some musical
performances) and may not fully emulate a human performing Beat Tracking and Metrical Analysis in
the manner outlined in the hypothesis.
Insofar as the approach described in the hypothesis and exemplified by RENI is insufficient to
correctly infer the beat and metre under the conditions specified, a crucial question arises. In the
cases where RENI is not able to correctly infer the beat and metre of a musical performance in real
time, under the conditions specified, is this due to
● the shortcomings of the application developed in this project, as outlined in section 6 and
repeated in section 7,
or
● the fact that, RENI's shortcomings aside, knowledge of note onset times is not sufficient to
correctly infer beat and metre in all cases under the conditions specified, and more information
is required to successfully perform this task in all cases?
In this project, the shortcomings of RENI undoubtedly contributed to the failure to correctly infer the
beat and metre for some musical performances, for the reasons outlined in section 6.6. If the time
were available to address these shortcomings, it is highly likely that the results observed in the
evaluation would improve.
It could also be argued that RENI's use of note onset information is limited and could be extended and
enhanced. RENI's operation consists mostly of a search for repeated inter-onset intervals and is driven
by the assumption that beats always coincide with note onsets. While this assumption is not generally
incorrect, it is not true of every musical piece: although rare, beats which do not coincide with a note
onset may occur in a performance. Note onset information could also be used in a rule based context
to identify repeating rhythmic patterns. This could be given greater emphasis in the overall algorithm
and could lead to superior inference of metre.
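The kind of search for repeated inter-onset intervals described above can be sketched as follows. This is a simplified, hypothetical reconstruction: the class and method names, the all-pairs comparison, and the tolerance parameter are illustrative assumptions, not RENI's actual implementation.

```java
import java.util.Map;
import java.util.TreeMap;

// Minimal sketch of a repeated inter-onset-interval (IOI) search,
// assuming note onset times in milliseconds.
public class IoiSearch {

    // Count how often each IOI recurs across all pairs of onsets, merging
    // intervals that fall within toleranceMs of an existing cluster centre,
    // so imperfectly timed performances still produce stable clusters.
    public static Map<Long, Integer> clusterIois(long[] onsets, long toleranceMs) {
        Map<Long, Integer> clusters = new TreeMap<>();
        for (int i = 0; i < onsets.length; i++) {
            for (int j = i + 1; j < onsets.length; j++) {
                long ioi = onsets[j] - onsets[i];
                Long match = null;
                for (Long centre : clusters.keySet()) {
                    if (Math.abs(centre - ioi) <= toleranceMs) {
                        match = centre;
                        break;
                    }
                }
                clusters.merge(match != null ? match : ioi, 1, Integer::sum);
            }
        }
        return clusters;
    }

    // The most frequently recurring interval is one plausible candidate
    // for the beat (Tactus) interval.
    public static long candidateBeatInterval(long[] onsets, long toleranceMs) {
        long best = 0;
        int bestCount = -1;
        for (Map.Entry<Long, Integer> e : clusterIois(onsets, toleranceMs).entrySet()) {
            if (e.getValue() > bestCount) {
                bestCount = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }
}
```

An all-pairs search is quadratic in the number of onsets, so a real time implementation would restrict it to a sliding window of recent onsets.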
However, even with these shortcomings addressed, it is still debatable whether RENI and the
approach it implements would be able to infer the beat and metre in real time for all musical
performances with no prior knowledge of style or metre. This is because the hypothesis under
investigation, which guided the development of RENI, describes the application of a very bounded
approach to a relatively unbounded problem.
The problem described isn't bounded, as it is assumed that the metre and style of the performance are
unknown and no other limiting assumptions are made about these attributes. With the addition of the
real time requirement, the hypothesis describes possibly the most challenging case of Beat Tracking
and Metrical Analysis. In contrast, the approach specified (analysing note onset information using a
rule based algorithm) is bounded and restricted. This combination not only makes the task facing
RENI more challenging, it also may not be a realistic model of the task RENI tries to emulate: a
human percussionist performing Beat Tracking. More information may be needed by the beat tracker
in order to be successful in all cases, and even then, the conditions of the problem may not reflect
those encountered in reality.
Unlike RENI, humans almost certainly use musical knowledge and memory, in addition to the timing
information coming into their ears, when performing Beat Tracking. This knowledge, according to
Allen and Dannenberg (1990), includes
● memory of specific performances and pieces
● memory of musical forms and styles
● knowledge of performer's style
This knowledge is particularly relevant to tracking beats in rhythmically complex pieces, which RENI
has difficulty with. In complex music there are competing rhythmic forces, and higher level
knowledge of the musical structure makes the correct interpretation clear to the human listener (Dixon
2007). Therefore, in order to disambiguate more difficult rhythmic patterns, some musical knowledge
is necessary (Dixon 2007). If we take this contention to be true, then since RENI's approach and the
hypothesis that guides it do not allow for the use of musical knowledge, it is very unlikely that the
approach will be sufficient in all cases.
The conditions specified for the problem may also be argued to be unrealistic and overly ambitious if
the view is taken that we are trying to emulate human Beat Tracking capabilities. The hypothesis
under investigation directs that beat and metre be inferred without any knowledge of style. This
frames RENI as a general or universal model of Beat Tracking and raises expectations that it be able
to track beats in all styles of music. According to Collins (2006), such a general Beat Tracking
solution is unrealistic. When one considers the variety of styles of music and the corresponding
multiplicity of metrical constructs, Collins's (2006) contention seems reasonable. Even if we assume,
like Allen and Dannenberg (1990), that a human beat tracker brings knowledge of style to the task,
this knowledge is not going to be exhaustive, and even if exhaustive knowledge were available,
encoding it all in a computational model would be too inefficient for real time Beat Tracking.
Therefore Collins's (2006) view, that we must model the training that encultured listeners undergo in
recognising and synchronising with contexts (or styles) when performing Beat Tracking and Metrical
Analysis, looks a more convincing and realistic proposition.
Perhaps then, the hypothesis under investigation would be better stated with some caveats limiting the
applicability of the rule based approach based on note onset information to performances of particular
styles and of limited rhythmical complexity.
However, despite the limitations of the approach RENI is based on, rule based algorithms based on
note onset information are still an interesting way to investigate and model the process of human Beat
Tracking. And even with its shortcomings, RENI is still an effective Beat Tracker and has the
potential to be of practical use in performance accompaniment and other contexts.
7.2 Further work on RENI
Further development of RENI will be carried out in order to make it a fully fledged application. This
may include the addition of the following features.
● A conventional user interface to make the application more usable.
● For a particular performance, allowing the user to change the Metrical Hypothesis providing
the accompaniment from amongst the Hypotheses that RENI infers.
● The use of drum loops in the provision of percussive output. Based on the Metrical
Hypothesis inferred, RENI could select an appropriate drum loop to play. This would allow
RENI to provide more advanced percussive accompaniment.
● The incorporation of additional heuristics, informing the application of the likely beat interval
and the style of the piece being performed.
● The specification of separate re-hypothesising windows of a different duration from the
extension window. This would make the application less sensitive to performer errors when
re-hypothesising.
The Beat Tracking algorithm can also be improved. These potential enhancements would, however,
form the basis for future research and are discussed in the next section.
7.3 Directions for future research
There is scope for further investigation into real time Beat Tracking and Metrical Analysis and the
modelling of a percussive accompanist performing in real time with an improvised performance,
within the bounds set in this project (rule based and note onset times used as Beat Tracking cues).
Using RENI as a basis, further research could be carried out into building a more effective rule based
model of Beat Tracking and Metrical Analysis. There is also much scope for further investigation into
the Beat Tracking problem generally; beyond the bounds set in this project.
Further investigation could be carried out into the methodology for scoring Hypotheses in RENI to
determine additional and more effective metrics for judging the plausibility of Metrical Hypotheses.
Superior metrics for judging the inference of metre in RENI would be of particular interest. As stated
previously, research could be carried out into a rule based approach for inferring metre by analysing
patterns of timing and duration in note onset information. Implementing such capabilities for a real
time application like RENI would be especially interesting.
The examination of the Hypothesis in section 7.1.3 suggested that some use of representations of
musical knowledge and learning would be necessary for inferring the beat and metre of rhythmically
complex pieces. Nonetheless, research could be carried out into rule based techniques which attempt
to directly address and recognise the sources of rhythmic complexity (syncopation etc.) and how these
could be incorporated into RENI. Such techniques may not make RENI successful in all cases, but
they may improve its performance in evaluations. Generating a set of rules for recognising and
reacting to rhythmic complexity cues would also be interesting from a musical cognition perspective.
Further research could also be carried out into creating a real time Beat Tracker, outside the
restrictions on approach specified in the project's hypothesis. As Allen and Dannenberg (1990) point
out, humans almost certainly use musical knowledge and memory when Beat Tracking. Research
could be carried out into how such musical knowledge and memory could be represented
computationally and used in a real time, improvised Beat Tracking scenario such as that assumed by
RENI. The use of machine learning techniques for Beat Tracking could also be investigated. For
example, a beat tracker such as RENI could plausibly learn and become attuned to the performance
style of a particular performer in the same way that a human percussionist may become attuned to the
style of a performer that he/she regularly accompanies. Such techniques could also be used to
disambiguate between musical styles in order to improve the recognition of metre.
The problems experienced by performers with RENI's re-hypothesising mode present the most
interesting opportunity for further research. Research inspired by these issues could be carried out as
an effort to model the interaction between a performer and a percussive accompanist where the
percussive accompanist bases the initial accompaniment on the performance (as happens with RENI).
In terms of the problems experienced with RENI inappropriately changing its accompaniment,
research could be carried out into how these problems could best be addressed when the performance
acts as the tempo setter by looking at:
● How does the initial hypothesis inferred influence the inference of subsequent hypotheses in
the same performance? This would differ from RENI, which currently treats each consecutive
hypothesis as independent.
● How could a beat tracker like RENI, when re-hypothesising, distinguish between performance
errors on the part of the performer and genuine changes in tempo and metre? For RENI, this
could mean only changing the tempo of accompaniment if the most recent hypothesis is
sufficiently different from the initial hypothesis.
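The guard suggested in the second point could be sketched along the following lines. The relative-difference test, the class name, and the threshold value are illustrative assumptions, not a description of RENI's actual logic.

```java
// Hypothetical guard for re-hypothesising: only adopt a new tempo when the
// latest hypothesis differs sufficiently from the one currently driving
// the accompaniment.
public class RehypothesisGuard {

    // currentTactusMs: Tactus interval of the hypothesis driving the
    // accompaniment; newTactusMs: interval of the most recent hypothesis;
    // threshold: relative change (e.g. 0.10 for 10%) above which the
    // accompaniment tempo should change.
    public static boolean shouldChangeTempo(long currentTactusMs,
                                            long newTactusMs,
                                            double threshold) {
        double relativeChange =
                Math.abs(newTactusMs - currentTactusMs) / (double) currentTactusMs;
        return relativeChange > threshold;
    }
}
```

A small timing wobble (say 500 ms to 505 ms) would then be absorbed as performer imprecision, while a large shift (500 ms to 600 ms) would be treated as a genuine tempo change.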
More interesting, however, are the problems experienced by performers in having the percussionist
treat them as a tempo setter rather than the other way around. In this respect, an investigation could
be carried out into how the performer reacts to percussive accompaniment. Once the percussive
accompaniment starts and synchronises with the performer, does the performer in turn attempt to stay
synchronised with the accompaniment? Does the percussionist go from tempo follower to tempo
setter, when does this occur, and how should a beat tracker such as RENI best behave in this context?
In a more general sense, the development of an audio based solution to the real time, improvised Beat
Tracking problem addressed in this project would also be an interesting research project.
7.4 Conclusion
This project investigated if knowledge of note onset information in musical performance is sufficient
for computationally inferring the beat and metre of a piece of improvised drum-less music in real time
using a rule based approach without any prior knowledge of metre or style for the purposes of
providing simple percussive performance accompaniment.
During the lifetime of the project, the application RENI was developed using Java for use on a
standard laptop. RENI is an attempt to model human Beat Tracking and is also a performance
accompaniment tool intended for practical use.
RENI accepts musical performance data in MIDI format and implements a rule based algorithm
which infers Metrical Hypotheses by searching for regular intervals between note onsets in real time.
These regularly spaced onsets are represented as Beat Levels which are then combined to form
Metrical Hypotheses. RENI selects the most plausible Metrical Hypothesis inferred for a particular
musical performance and produces aural percussive accompaniment to indicate strong and weak beats
in the performance.
RENI was evaluated quantitatively and subjectively. It demonstrated proficiency in identifying beat
locations but experienced difficulties in correctly inferring metre, particularly in rhythmically
complex pieces. Performers also experienced difficulty playing with RENI's re-hypothesising mode.
These findings demonstrate that RENI's approach (note onset information in a rule based algorithm) is
sufficient for real time Beat Tracking and Metrical Analysis, but not in all cases. This is because
RENI's capabilities fall short of those of human beat trackers. Also, its aim to infer the metre of all
pieces without prior knowledge of metre or style, thereby making it a universal Beat Tracker, may be
an unrealistically ambitious one. RENI would be a more realistic model of human Beat Tracking if it
used representations of musical knowledge and attempted to be more style specific.
Nonetheless, RENI has the potential to be a practical and useful performance accompaniment
application. Further work on RENI, and on the future research directions specified in this project,
should improve RENI both as an application and as a model of human Beat Tracking.
BIBLIOGRAPHY
Allen, P. E. & Dannenberg, R. B. (1990). Tracking Musical Beats in Real Time, Proceedings of the
1990 International Computer Music Conference, 140–143. Glasgow: ICMA.
Collins, N. (2006). Towards a Style-Specific Basis for Computational Beat Tracking. Proceedings of
the 9th International Conference on Music Perception & Cognition. ICMPC and ESCOM, Bologna,
Italy, pp. 461-467
Desain, P. & Honing, H. (1994). A Brief Introduction to Beat Induction, Proceedings of the 1994
International Computer Music Conference, 78–79. San Francisco: International Computer Music
Association.
Desain, P. & Honing, H. (1999). Computational Models of Beat Induction: The Rule-based Approach,
Journal of New Music Research, 28(1), 29–42.
Dixon, S. (2007). Evaluation of the Audio Beat Tracking System BeatRoot, Journal of New Music
Research, 36(1), 39–50.
Eck, D. (2001). A Positive Evidence Model for Rhythmical Beat Induction, Journal of New Music
Research, 30(2), 187–200.
Eck, D. (2002). Real-time Musical Beat Induction with Spiking Neural Networks, Technical Report
IDSIA-22-02, IDSIA, Manno, Switzerland.
Goto, M. (2001). An Audio-based Real-time Beat Tracking System for Music With or Without
Drum-Sounds, Journal of New Music Research, 30(2), 159–171.
Klapuri, A., Eronen, A. & Astola, J. (2006). Analysis of the Meter of Acoustic Musical Signals, IEEE
Transactions on Audio, Speech, and Language Processing, 14(1), 342–355.
Large, E.W. (1995). Beat Tracking with a Non-linear Oscillator, Working Notes of the IJCAI-95
Workshop on Artificial Intelligence and Music, 24–31.
McKinney, M., Moelants, D., Davies, M. & Klapuri, A. (2007). Evaluation of Audio Beat Tracking
and Music Tempo Extraction Algorithms, Journal of New Music Research, 36(1), 1–16.
Raphael, C. (2003). Orchestra in a Box: A System for Real-time Musical Accompaniment, Working
Notes of the IJCAI-03 Rencon Workshop.
Rosenthal, D. (1992). Emulation of Human Rhythm Perception, Computer Music Journal, 16(1),
64–76.
Rosenthal, D. (1992). Machine Rhythm: Computer Emulation of Human Rhythm Perception, PhD
thesis.
Scheirer, E.D. (1998). Tempo and Beat Analysis of Acoustical Musical Signals, Journal of the
Acoustical Society of America, 103, 588–601.
APPENDIX
A.1 Evaluation corpus
MIDI recordings and scores for the following musical pieces were used in the quantitative functional
evaluation. These files are available to download at http://donalmulvihill.wordpress.com/reni
Name                                    Composer/Artist
A Breeze from Alabama                   Joplin
Aamulla varhain
Abstract 1                              Doonan
Aria                                    AM Bach
Flash Dance                             Moroder
Fugue 6 – BWV 851                       JS Bach
Fur Elise                               Beethoven
Giselle                                 Adam
God Save the Queen
Horn Trio                               Brahms
Killing Me Softly                       Fox/Gimbel
Losing My Religion                      REM
Love and Marriage                       Van Heusen
Minuet in F                             L Mozart
Minuet in G                             JS Bach
Moonlight Sonata                        Beethoven
Nightswimming                           REM
Prelude from Carmen                     Bizet
Pyramid Song                            Radiohead
Rondino                                 Rameau
Rondo                                   CP Bach
Russian Folk Tune                       Beethoven
Sing Ivy
The Entertainer                         Joplin
The Washington Post                     Sousa
Toccatina                               Brown
Traditioner af Swenska Folk-Dansar
With or Without You                     U2
Your Song                               John/Taupin
A.2 Quantitative functional evaluation results
The following table lists the results from individual trials in the quantitative functional evaluation as
described in section 6. The table is separated into the following sections:
● Details on the trial
● Attributes describing the interpretation of the annotator
● Attributes describing the settings and interpretation of RENI.
● Statistical comparison of the two interpretations
Explanations for some of the fields are as follows:
● Tactus – the interval of the Tactus expressed in milliseconds
● Window – the duration of the window used by RENI, expressed in milliseconds
● Heuristic – indicates whether the Metre Heuristic was on for the trial
● Beat % – percentage of beat locations identified successfully by RENI
● Type % – percentage of beat locations, together with the type of beat, identified successfully
by RENI
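The Beat % metric can be illustrated with a sketch along the following lines: the fraction of annotated beat times for which RENI produced a beat within a tolerance window. The matching rule and the tolerance value shown are assumptions for illustration; the exact procedure is the one defined in section 6.

```java
// Minimal sketch of a "Beat %" style comparison between an annotator's
// beat times and a tracker's beat times, both in milliseconds.
public class BeatScore {

    public static double beatPercentage(long[] annotated, long[] tracked, long toleranceMs) {
        if (annotated.length == 0) {
            return 0.0;
        }
        int matched = 0;
        for (long a : annotated) {
            for (long t : tracked) {
                if (Math.abs(t - a) <= toleranceMs) {
                    matched++;
                    break; // each annotated beat counts at most once
                }
            }
        }
        return (double) matched / annotated.length;
    }
}
```

Type % would extend this by additionally requiring that the matched beat carry the same type (strong or weak) as the annotated beat.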
Trial | Song | Metre (annotator) | Tactus (annotator) | Window | Heuristic | Tactus (RENI) | Metre (RENI) | Beat % | Type %
1 a-breeze-from-alabama.mid2-4 357 6000 N 359 3-4 0.64 0.342 a-breeze-from-alabama.mid2-4 357 4000 N 365 4-4 0.87 0.233 a-breeze-from-alabama.mid2-4 357 4000 N 365 4-4 0.98 0.254 a-breeze-from-alabama.mid2-4 357 5000 N 366 3-4 0.36 0.225 AAMULLAVARHAIN.mid 4-4 964 5000 N 999 2-4 0.63 0.486 AAMULLAVARHAIN.mid 4-4 964 4000 N 500 3-4 0.3 0.197 AAMULLAVARHAIN.mid 4-4 964 5000 N 1000 2-4 0.07 0.078 AAMULLAVARHAIN.mid 4-4 964 5000 N 1000 2-4 0.48 0.379 AAMULLAVARHAIN.mid 4-4 964 6000 N 7423 1 0.04 0
10 AAMULLAVARHAIN.mid 4-4 964 4000 N 499 3-4 0.7 0.3311 AAMULLAVARHAIN.mid 4-4 964 6000 N 999 2-4 0.19 0.1912 AAMULLAVARHAIN.mid 4-4 964 6000 N 999 2-4 0.19 0.1913 AAMULLAVARHAIN.mid 4-4 964 6000 N 7407 1 0.04 014 Alabama 2-4 357 6000 N 549 1,5 0.33 0.1415 Alabama 2-4 357 6000 N 365 4-4 0.01 0.0116 Aria.mid 3-4 592 6000 N 600 2-4 0.94 0.4717 Aria.mid 3-4 592 4000 N 599 3-4 1 0.3418 Aria.mid 3-4 592 5000 N 599 2-4 1 0.5119 Aria.mid 3-4 592 4000 N 600 3-4 0.98 0.3320 Aria.mid 3-4 592 6000 N 600 2-4 0.98 0.4821 Aria.mid 3-4 592 6000 N 599 2-4 0.98 0.4922 BachMinu.mid 3-4 587 4000 N 598 3-4 0.74 0.7423 BachMinu.mid 3-4 587 5000 N 598 2-4 0.97 0.4924 BachMinu.mid 3-4 587 6000 N 598 4-4 0.41 0.2125 BachMinu.mid 3-4 587 4000 N 900 2-4 0.13 0.1326 BachMinu.mid 3-4 587 5000 N 599 2-4 0.87 0.4427 BachMinu.mid 3-4 587 6000 N 601 4-4 0.38 0.1828 bwv851_fugue06.mid 4-4 355 4000 N 365 3-4 0.93 0.5529 bwv851_fugue06.mid 4-4 360 6000 N 731 2-4 0.41 0.4130 bwv851_fugue06.mid 4-4 355 6000 N 1005 2-4 0.21 0.1331 bwv851_fugue06.mid 4-4 355 5000 N 550 3-4 0.38 0.2232 bwv851_fugue06.mid 4-4 355 5000 N 731 2-4 0.01 0.0133 bwv851_fugue06.mid 4-4 355 4000 N 366 2-4 0.69 0.5334 entertainer.mid 2-4 248 6000 N 252 3-4 0.82 0.4335 entertainer.mid 2-4 248 4000 N 375 2-4 0.51 0.2736 entertainer.mid 2-4 248 6000 N 375 3-4 0.46 0.2437 entertainer.mid 2-4 248 6000 N 374 1 5 0.47 0.2538 entertainer.mid 2-4 248 4000 N 375 4-4 0.47 0.1939 entertainer.mid 2-4 248 4000 N 251 1 7 0.83 0.4240 entertainer.mid 2-4 248 5000 N 375 3-4 0.48 0.2541 entertainer.mid 2-4 248 5000 N 374 3-4 0.49 0.2442 flashdance2.mid 4-4 425 4000 N 429 4-4 0.35 0.2143 flashdance2.mid 4-4 425 6000 N 639 2-4 0.33 0.0944 flashdance2.mid 4-4 425 4000 N 425 4-4 0.98 0.9845 flashdance2.mid 4-4 425 5000 N 425 3-4 0.87 0.5146 flashdance2.mid 4-4 425 6000 N 851 3-4 0.48 0.2447 flashdance2.mid 4-4 425 6000 N 844 2-4 0.2 0.1348 flashdance2.mid 4-4 425 5000 N 638 2-4 0.31 0.12
49 FurElise.mid 3-4 476 5000 N 480 4-4 0.77 0.4650 FurElise.mid 3-4 476 6000 N 239 3-4 0.87 0.8651 FurElise.mid 3-4 476 5000 N 240 3-4 0.85 0.8552 FurElise.mid 3-4 476 6000 N 240 3-4 0.96 0.7353 FurElise.mid 3-4 476 4000 N 239 3-4 0.79 0.6954 FurElise.mid 3-4 476 6000 N 239 3-4 0.81 0.7155 FurElise.mid 3-4 476 4000 N 240 1 7 0.96 0.5856 FurElise.mid 3-4 476 4000 N 239 3-4 0.72 0.5857 giselle.mid 2-4 740 5000 N 375 4-4 0.92 0.9258 giselle.mid 2-4 740 4000 N 374 4-4 0.9 0.959 giselle.mid 2-4 740 6000 N 749 3-4 0.18 0.160 giselle.mid 2-4 740 4000 N 750 2-4 0.93 0.9361 giselle.mid 2-4 740 4000 N 750 2-4 0.92 0.9262 God Save the Queen 3-4 778 4000 N 799 2-4 0.83 0.3763 God Save the Queen 3-4 778 4000 N 800 2-4 0.69 0.2964 God Save the Queen 3-4 778 5000 N 800 3-4 0.34 0.3465 God Save the Queen 3-4 778 5000 N 800 3-4 0.63 0.6366 God Save the Queen 3-4 778 6000 N 800 2-4 0.65 0.3767 God Save the Queen 3-4 778 6000 N 800 2-4 0.54 0.2668 LeoMin.mid 3-4 394 5000 N 399 2-4 1 0.5169 LeoMin.mid 3-4 394 4000 N 399 3-4 1 170 LeoMin.mid 3-4 394 6000 N 400 3-4 0.95 0.4871 LeoMin.mid 3-4 394 6000 N 400 3-4 0.95 0.4872 LeoMin.mid 3-4 394 4000 N 400 3-4 1 173 loveandmarriage.mid 4-4 495 6000 N 500 1 5 0.97 0.6274 loveandmarriage.mid 4-4 495 6000 N 500 4-4 0.96 0.9575 loveandmarriage.mid 4-4 495 4000 N 499 3-4 1 0.5776 loveandmarriage.mid 4-4 495 6000 N 502 4-4 0.44 0.2677 loveandmarriage.mid 4-4 495 4000 N 496 3-4 0.52 0.3178 loveandmarriage.mid 4-4 495 5000 N 499 4-4 0.34 0.3379 Moonlight 2-4 1176 4000 N 799 2-4 0.01 0.0180 Moonlight 2-4 1176 5000 N 800 3-4 0.44 0.4481 Moonlight 2-4 1176 4000 N 400 4-4 0.58 0.4982 Moonlight 2-4 1176 4000 N 801 2-4 0.01 0.0183 Moonlight 2-4 1173 6000 N 400 1,6 0.01 084 prelude.mid 2-4 247 5000 N 248 1 8 0.75 0.4685 prelude.mid 2-4 247 6000 N 251 1 7 0.76 0.486 prelude.mid 2-4 247 6000 N 250 1 10 0.73 0.3787 prelude.mid 2-4 247 4000 N 252 3-4 0.82 0.488 prelude.mid 2-4 247 4000 N 369 4-4 0.56 0.2889 prelude.mid 2-4 247 5000 N 376 3-4 0.52 0.2490 
Rameau.mid 3-4 592 6000 N 605 4-4 0.18 0.1191 Rameau.mid 3-4 592 5000 N 605 3-4 0.45 0.1592 Rameau.mid 3-4 592 6000 N 604 4-4 0.16 0.193 Rameau.mid 3-4 592 5000 N 605 3-4 0.36 0.1194 RussianFolk.mid 2-4 590 4000 N 900 2-4 0.3 0.1795 RussianFolk.mid 2-4 590 4000 N 300 3-4 0.91 0.596 RussianFolk.mid 2-4 590 6000 N 599 4-4 0.96 0.7297 RussianFolk.mid 2-4 590 6000 N 899 3-4 0.3 0.1598 RussianFolk.mid 2-4 590 5000 N 600 3-4 0.94 0.46
99 RussianFolk.mid 2-4 590 6000 N 600 4-4 0.89 0.67100 RussianFolk.mid 2-4 590 5000 N 300 4-4 0.89 0101 SingIvy.mid 6-8 722 6000 N 750 3-4 0.79 0.42102 SingIvy.mid 6-8 722 4000 N 750 2-4 0.82 0.82103 SingIvy.mid 6-8 722 5000 N 750 3-4 0.74 0.39104 SingIvy.mid 6-8 722 6000 N 749 3-4 0.86 0.47105 SingIvy.mid 6-8 722 4000 N 250 4-4 0.77 0.51106 sousa_washington_post.mid6-8 486 5000 N 500 4-4 0.5 0.13107 sousa_washington_post.mid6-8 486 4000 N 500 2-4 0.45 0.45108 sousa_washington_post.mid6-8 486 5000 N 499 4-4 0.88 0.23109 sousa_washington_post.mid6-8 486 6000 N 500 1 5 0.67 0.34110 sousa_washington_post.mid6-8 486 4000 N 500 2-4 0.34 0.34111 sousa_washington_post.mid6-8 486 6000 N 499 4-4 0.89 0.65112 toccatina.mid 3-4 545 5000 N 277 3-4 0.98 0.33113 toccatina.mid 3-4 545 6000 N 555 2-4 0.98 0.49114 toccatina.mid 3-4 545 6000 N 278 1 9 0.96 0.73115 toccatina.mid 3-4 545 6000 N 278 2-4 0.81 0.29116 toccatina.mid 3-4 545 4000 N 555 3-4 1 1117 toccatina.mid 3-4 545 4000 N 277 3-4 0.9 0.9118 traditioner_af_swenska_folk_dansar.1.1.mid3-4 492 5000 N 750 3-4 0.32 0.1119 traditioner_af_swenska_folk_dansar.1.1.mid3-4 492 5000 N 750 3-4 0.32 0.1120 traditioner_af_swenska_folk_dansar.1.1.mid3-4 492 4000 N 750 2-4 0.27 0.27121 traditioner_af_swenska_folk_dansar.1.1.mid3-4 492 6000 N 750 3-4 0.23 0.1122 with.mid 4-4 496 5000 N 500 4-4 0.99 0.99123 with.mid 4-4 496 5000 N 500 4-4 0.46 0.22124 with.mid 4-4 496 4000 N 500 2-4 0.86 0.66125 a-breeze-from-alabama.mid2-4 357 4000 Y 617 2-4 0.29 0.16126 a-breeze-from-alabama.mid2-4 357 5000 Y 367 2-4 0.36 0.35127 a-breeze-from-alabama.mid2-4 357 5000 Y 363 2-4 0.74 0128 a-breeze-from-alabama.mid2-4 357 4000 Y 617 2-4 0.3 0.16129 a-breeze-from-alabama.mid2-4 357 6000 Y 549 2-4 0.35 0.17130 AAMULLAVARHAIN.mid 4-4 964 5000 Y 998 2-4 0.67 0.52131 AAMULLAVARHAIN.mid 4-4 964 5000 Y 999 2-4 0.74 0.56132 AAMULLAVARHAIN.mid 4-4 964 6000 Y 1000 2-4 0.44 0.37133 Aria.mid 3-4 592 6000 Y 600 3-4 0.97 0.33134 Aria.mid 3-4 592 4000 Y 600 3-4 0.98 
0.33135 Aria.mid 3-4 592 4000 Y 600 3-4 0.98 0.33136 Aria.mid 3-4 592 5000 Y 600 3-4 0.93 0.31137 BachMinu.mid 3-4 587 5000 Y 600 3-4 0.77 0.26138 BachMinu.mid 3-4 587 4000 Y 599 3-4 0.56 0.56139 BachMinu.mid 3-4 587 6000 Y 599 3-4 0.87 0.31140 BachMinu.mid 3-4 587 5000 Y 600 3-4 0.77 0.26141 BachMinu.mid 3-4 587 6000 Y 600 3-4 0.51 0.18142 BachMinu.mid 3-4 587 4000 Y 599 3-4 0.56 0.56143 bwv851_fugue06.mid 4-4 355 6000 Y 746 2-4 0.31 0.16144 bwv851_fugue06.mid 4-4 355 6000 Y 732 2-4 0.47 0.47145 bwv851_fugue06.mid 4-4 355 4000 Y 732 2-4 0.11 0.05146 bwv851_fugue06.mid 4-4 355 4000 Y 366 2-4 0.47 0.13147 bwv851_fugue06.mid 4-4 355 6000 Y 732 2-4 0.42 0.41148 entertainer.mid 2-4 248 4000 Y 373 2-4 0.53 0.27
149 entertainer.mid 2-4 248 5000 Y 253 2-4 0.69 0.33
150 entertainer.mid 2-4 248 4000 Y 376 2-4 0.53 0.26
151 entertainer.mid 2-4 248 5000 Y 375 2-4 0.55 0.29
152 entertainer.mid 2-4 248 6000 Y 371 2-4 0.55 0.29
153 entertainer.mid 2-4 248 5000 Y 252 2-4 0.76 0.4
154 entertainer.mid 2-4 248 6000 Y 369 2-4 0.52 0.26
155 flashdance2.mid 4-4 425 6000 Y 425 4-4 0.79 0.42
156 flashdance2.mid 4-4 425 6000 Y 429 2-4 0.42 0.27
157 flashdance2.mid 4-4 425 4000 Y 425 4-4 0.96 0.96
158 flashdance2.mid 4-4 425 5000 Y 847 2-4 0.21 0.1
159 FurElise.mid 3-4 476 5000 Y 240 3-4 0.7 0.67
160 FurElise.mid 3-4 476 4000 Y 240 3-4 0.87 0.44
161 FurElise.mid 3-4 476 6000 Y 724 3-4 0.24 0.14
162 FurElise.mid 3-4 476 6000 Y 714 3-4 0.28 0.13
163 giselle.mid 2-4 740 6000 Y 1124 2-4 0.33 0.17
164 giselle.mid 2-4 740 6000 Y 750 2-4 0.88 0.88
165 giselle.mid 2-4 740 5000 Y 374 2-4 0.92 0.44
166 giselle.mid 2-4 740 5000 Y 374 2-4 0.93 0.44
167 giselle.mid 2-4 740 4000 Y 750 2-4 0.88 0.88
168 giselle.mid 2-4 740 4000 Y 375 2-4 0.82 0.4
169 God Save the Queen 3-4 778 6000 Y 800 3-4 0.46 0.46
170 God Save the Queen 3-4 778 6000 Y 800 3-4 0.63 0.63
171 God Save the Queen 3-4 778 5000 Y 799 3-4 0.71 0.71
172 God Save the Queen 3-4 778 5000 Y 799 3-4 0.71 0.71
173 LeoMin.mid 3-4 394 5000 Y 400 3-4 1 0.33
174 LeoMin.mid 3-4 394 6000 Y 800 3-4 0.48 0.16
175 LeoMin.mid 3-4 394 6000 Y 799 3-4 0.48 0.16
176 LeoMin.mid 3-4 394 5000 Y 399 3-4 1 0.33
177 LeoMin.mid 3-4 394 4000 Y 399 3-4 1 1
178 loveandmarriage.mid 4-4 495 4000 Y 500 2-4 1 0.26
179 loveandmarriage.mid 4-4 495 6000 Y 667 2-4 0.29 0.13
180 loveandmarriage.mid 4-4 495 5000 Y 500 4-4 1 0.99
181 loveandmarriage.mid 4-4 495 5000 Y 494 4-4 0.49 0.46
182 loveandmarriage.mid 4-4 495 6000 Y 499 2-4 0.17 0.06
183 Moonlight 2-4 1176 6000 Y 1200 2-4 0.87 0.87
184 Moonlight 2-4 1176 6000 Y 399 2-4 0.93 0.93
185 Moonlight 2-4 1176 4000 Y 799 2-4 0.44 0.24
186 prelude.mid 2-4 247 4000 Y 375 2-4 0.58 0.29
187 prelude.mid 2-4 247 5000 Y 375 2-4 0.52 0.24
188 prelude.mid 2-4 247 5000 Y 375 2-4 0.48 0.24
189 prelude.mid 2-4 247 4000 Y 374 2-4 0.51 0.25
190 prelude.mid 2-4 247 6000 Y 377 2-4 0.51 0.24
191 Rameau.mid 3-4 592 6000 Y 910 3-4 0.25 0.14
192 Rameau.mid 3-4 592 5000 Y 602 3-4 0.43 0.14
193 Rameau.mid 3-4 592 4000 Y 601 3-4 0.25 0.25
194 Rameau.mid 3-4 592 4000 Y 602 3-4 0.18 0.18
195 Rameau.mid 3-4 592 5000 Y 605 3-4 0.47 0.16
196 Rameau.mid 3-4 592 6000 Y 909 3-4 0.25 0.14
197 RussianFolk.mid 2-4 590 5000 Y 300 2-4 0.94 0.46
198 RussianFolk.mid 2-4 590 4000 Y 300 2-4 0.94 0.46
199 RussianFolk.mid 2-4 590 6000 Y 600 2-4 0.91 0.91
200 RussianFolk.mid 2-4 590 5000 Y 300 2-4 0.94 0.46
201 RussianFolk.mid 2-4 590 4000 Y 900 2-4 0.31 0.17
202 SingIvy.mid 6-8 722 4000 Y 749 2-4 0.63 0.63
203 SingIvy.mid 6-8 722 6000 Y 969 2-4 0.23 0.11
204 SingIvy.mid 6-8 722 4000 Y 750 2-4 0.79 0.79
205 SingIvy.mid 6-8 722 6000 Y 969 2-4 0.21 0.11
206 sousa_washington_post.mid 6-8 486 5000 Y 502 2-4 0.58 0.58
207 sousa_washington_post.mid 6-8 486 5000 Y 500 2-4 0.91 0.91
208 sousa_washington_post.mid 6-8 486 4000 Y 500 2-4 0.05 0.05
209 sousa_washington_post.mid 6-8 486 6000 Y 499 2-4 0.12 0.12
210 sousa_washington_post.mid 6-8 486 6000 Y 500 2-4 0.62 0.62
211 toccatina.mid 3-4 545 4000 Y 555 3-4 0.99 0.99
212 toccatina.mid 3-4 545 6000 Y 277 3-4 0.79 0.23
213 toccatina.mid 3-4 545 4000 Y 555 3-4 0.99 0.99
214 toccatina.mid 3-4 545 6000 Y 278 3-4 0.86 0.86
215 toccatina.mid 3-4 545 5000 Y 555 3-4 1 0.34
216 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 4000 Y 500 3-4 0.97 0.97
217 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 5000 Y 750 3-4 0.32 0.1
218 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 6000 Y 750 3-4 0.32 0.1
219 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 5000 Y 750 3-4 0.23 0.1
220 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 4000 Y 501 3-4 0.17 0.17
221 traditioner_af_swenska_folk_dansar.1.1.mid 3-4 492 6000 Y 749 3-4 0.3 0.1
222 with.mid 4-4 496 6000 Y 501 4-4 0.86 0.44
223 with.mid 4-4 496 4000 Y 750 2-4 0.33 0.08
224 with.mid 4-4 496 4000 Y 250 4-4 0.8 0.32
225 with.mid 4-4 496 6000 Y 503 4-4 0.31 0.17
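A pattern visible in the raw results above is that the inferred beat period is often a simple multiple or fraction of the actual one (e.g. row 165: giselle.mid, actual 740, inferred 374). The sketch below illustrates one way such rows might be checked, assuming (this is not stated in the table itself) that the number before the listening window is the actual beat period in milliseconds and the value after the Y flag is the inferred period. The function name, tolerance, and ratio set are illustrative assumptions, not part of RENI's evaluation code.

```python
def beat_period_matches(actual_ms, inferred_ms, tolerance=0.05):
    """Return the metrical ratio at which the inferred beat period matches
    the actual one (within a relative tolerance), or None if no simple
    ratio fits. Beat trackers commonly lock on at half or double the true
    beat, so those ratios are treated as near-misses worth identifying."""
    for ratio in (1.0, 2.0, 0.5, 3.0, 1.0 / 3.0, 1.5):
        target = actual_ms * ratio
        if abs(inferred_ms - target) <= tolerance * target:
            return ratio
    return None

# Row 165: giselle.mid, actual 740 ms, inferred 374 ms -> half-beat lock-on.
print(beat_period_matches(740, 374))   # 0.5
# Row 157: flashdance2.mid, actual 425 ms, inferred 425 ms -> exact match.
print(beat_period_matches(425, 425))   # 1.0
```

Under this reading, many of the low-scoring rows are not random failures but tempo-octave errors, which is consistent with the half/double periods recurring throughout the table.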
APPENDIX
A.3 Qualitative evaluation questionnaire
SECTION A – Your Musical Background
1. Have you received any formal musical training/instruction? (circle answer)
YES NO
2. If yes, to what level? (Indicate grade attained if applicable. Otherwise describe nature of training)
__________________________________________________________________
__________________________________________________________________
3. What musical instruments can you play? (List instruments you can play)
__________________________________________________________________
__________________________________________________________________
4. How would you rate your musical performance ability?
(Rate on a scale of 1-10, 10 being professional/concert level, 1 being no musical ability)
1 2 3 4 5 6 7 8 9 10
SECTION B – Your Performance in the Experiment
5. During the experiment, what musical pieces did you play? (If you know some of the pieces you
played, list name of piece, composer and style if possible)
______________________________________________________________
______________________________________________________________
______________________________________________________________
6. How would you rate the timing of your performance? (Rate on a scale of 1-10, Select 10 if you
feel you kept near perfect timing, Select 1 if your performance was devoid of any rhythm or sense of
timing)
1 2 3 4 5 6 7 8 9 10
7. How rhythmically complex were the pieces that you played? A rhythmically complex piece
would not have a very clear metre or time signature and would be difficult to tap along to. (Rate
on a scale of 1-10; 10 indicating a piece that is very difficult to tap along to, 1 being a piece that
effectively taps along to itself – a series of evenly spaced notes)
1 2 3 4 5 6 7 8 9 10
SECTION C – The performance of the Application
8. In your judgement, how good was the application at tapping along in time with what you
were playing? (Rate on a scale of 1-10, Select 10 if you feel the application tapped along perfectly
with all pieces, Select 1 if you felt the application completely failed to tap along correctly in all
pieces)
1 2 3 4 5 6 7 8 9 10
9. In your judgement, how good was the application at accurately locating the strong beats (the
start of a measure) and identifying the metre in the pieces you were playing? (Rate on a scale of
1-10, Select 10 if you feel the application performed this task perfectly for all pieces, Select 1 if you
felt the application completely failed at this task for all pieces)
1 2 3 4 5 6 7 8 9 10
10. When the application was set to change the beat tempo in reaction to your playing, how well
do you feel the application kept in time with your playing? (Rate on a scale of 1-10, Select 10 if
you feel the application performed this task perfectly for all pieces, Select 1 if you felt the application
completely failed at this task for all pieces)
1 2 3 4 5 6 7 8 9 10
11. When performing with the application, which of the following did you prefer, find
easier or find more natural? (select one option)
- Playing when the application continually updated the tempo of the beats it was playing – The
application treats your performance as the tempo setter.
- Playing when the application set the tempo and style of the beat based on what was initially
played and then stuck to it – The application takes the initial part of your performance as the
tempo setter and then keeps this tempo.
SECTION D – Overall
12. (Bearing in mind that it is still a work in progress) How would you rate the application as a
performance accompaniment tool? Did the accompaniment it provided sound good and
complement or otherwise enhance your performance? (Rate on a scale of 1-10, Select 10 if you
feel the application is an excellent and potentially very useful performance accompaniment tool,
Select 1 if you feel the application is of no use as a performance accompaniment tool)
1 2 3 4 5 6 7 8 9 10
13. Any further comments/observations/recommendations/advice on the experiment or the
application.
______________________________________________________________
______________________________________________________________
______________________________________________________________
______________________________________________________________
A.4 Timeline
See the next page for a Gantt chart specifying the timeline of activities in the project.