
Computational analysis of sound and music

Olivier Lartillot Department of Musicology

UIO

MIRtoolbox

MiningSuite

Outline

• MIRtoolbox

• Audio analysis

• Metrical analysis

• MiningSuite:

• MIRtoolbox 2.0

• “Symbolic” analysis (of scores, MIDI)

• A large range of audio and music descriptors

• Highly modular framework: building blocks can be reused, reordered

• Simple, adaptive syntax: users can focus on design, can ignore technical details

• Free software, open source

• MIRtoolbox: audio analysis. Tens of thousands of downloads, 600+ citations, a reference tool in Music Information Retrieval

[Figure: diagram of the MIRtoolbox operators and their interconnections: miraudio, mirframe, mirfilterbank, mirsegment, mirenvelope, mirspectrum, mircepstrum, mirautocor, mirsum, mirpeaks, mirrms, mirlowenergy, mirattackslope, mirattacktime, mirrolloff, mirbrightness, mirinharmonicity, mirroughness, mirregularity, mirmfcc, mirflux, mirfluctuation, mirzerocross, mirpitch, mironsets, mirtempo, mirpulseclarity, mireventdensity, mirbeatspectrum, mirchromagram, mirkeystrength, mirkey, mirmode, mirkeysom, mirtonalcentroid, mirhcdf, mirsimatrix, mirnovelty, mirstat, mirhisto, mircentroid, mirspread, mirskewness, mirkurtosis, mirflatness, mirfeatures, mircluster, mirclassify, mirgetdata, mirexport, mirplay, mirsave, mirlength, mirdist, mirquery]


Music content / emotion

• Feature extraction
• Emotion ratings by listeners
• Linear modeling
• Movie soundtrack excerpts

BrainTuning FP6-2004-NEST-PATH-028570


mirtempo tempo

mirtempo('mysong')

129.6333 bpm

[Figure: steps of the tempo estimation. Panels: "Onset curve (Envelope)" (o), amplitude vs. time (s); "Onset curve (Differentiated envelope)" (do), amplitude vs. time (s); "Envelope autocorrelation" (ac), coefficients vs. lag (s), with the selected peak (pa) giving t = 129.6333 bpm]

Roughly:

• o = mironsets('mysong', 'Detect', 'No')
• do = mironsets(o, 'Diff')
• ac = mirautocor(do)
• pa = mirpeaks(ac, 'Total', 1)

In short:

• [t, pa] = mirtempo('mysong')
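The chain above (onset curve, differentiation, autocorrelation, single peak) can be sketched outside the toolbox. The following is a minimal illustration in plain Python, not MIRtoolbox's implementation: the function names, the half-wave rectification and the 40 to 200 bpm search range are all assumptions of this sketch.

```python
def differentiate(envelope):
    """Half-wave rectified first difference of an amplitude envelope (assumed variant)."""
    return [max(b - a, 0.0) for a, b in zip(envelope, envelope[1:])]

def autocorrelation(x, max_lag):
    """Unnormalised autocorrelation for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def estimate_tempo(envelope, rate, min_bpm=40.0, max_bpm=200.0):
    """Tempo (bpm) whose beat period has the strongest autocorrelation."""
    onset_curve = differentiate(envelope)
    min_lag = max(1, int(round(rate * 60.0 / max_bpm)))
    max_lag = int(round(rate * 60.0 / min_bpm))
    ac = autocorrelation(onset_curve, max_lag)
    # Keep a single peak, in the spirit of mirpeaks(ac, 'Total', 1).
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: ac[lag])
    return 60.0 * rate / best_lag

# Synthetic envelope sampled at 100 Hz with a short pulse every 0.5 s (120 bpm).
rate = 100
envelope = [1.0 if i % 50 < 5 else 0.0 for i in range(1000)]
tempo = estimate_tempo(envelope, rate)
print(tempo)
```

On real recordings the strongest peak may sit at a multiple or a fraction of the perceived beat period, which is the metrical-level ambiguity addressed later in the slides.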

mirtempo('mysong', 'Frame')

[Figure: "Tempo", coefficient value (in bpm) vs. temporal location of events (in s.)]

mirtempo tempo

[Figure: "Onset curve (Differentiated envelope)", amplitude vs. time (s), and "Tempo", coefficient value (in bpm) vs. temporal location of events (in s.); labels: f, pa, t]

• o = mironsets('mysong', 'Detect', 'No')
• do = mironsets(o, 'Diff')
• f = mirframe(do)
• ac = mirautocor(f)
• pa = mirpeaks(ac, 'Total', 1)

In short:

• [t, pa] = mirtempo('mysong', 'Frame')
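The frame-based variant repeats the same estimation on successive windows, which yields a tempo curve instead of a single value. A minimal sketch in plain Python, assuming 3 s frames with a 1.5 s hop and a 100 to 200 bpm search range (these values and the helper names are illustrative, not toolbox defaults):

```python
def frames(signal, length, hop):
    """Split a signal into overlapping frames of `length` samples, `hop` apart."""
    return [signal[start:start + length]
            for start in range(0, len(signal) - length + 1, hop)]

def dominant_lag(frame, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation within [min_lag, max_lag]."""
    def ac(lag):
        return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
    return max(range(min_lag, max_lag + 1), key=ac)

# Onset curve at 100 Hz: impulses every 0.5 s for 6 s, then every 0.4 s for 6 s.
rate = 100
onset_curve = [1.0 if i % 50 == 0 else 0.0 for i in range(600)]
onset_curve += [1.0 if i % 40 == 0 else 0.0 for i in range(600)]

# Lags 30..60 samples correspond to 200..100 bpm at this sampling rate.
tempo_curve = [60.0 * rate / dominant_lag(f, 30, 60)
               for f in frames(onset_curve, 300, 150)]
print(tempo_curve)
```

The first frames report the slower pulse and the last frames the faster one; frames that straddle the change depend on how ties between competing lags are resolved.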

mirtempo tempo

[Figure: "Onset curve (Envelope)", amplitude vs. time (s), 0 to 30 s]

Switch from one metrical level to another!

mirmetre metrical hierarchy

• Constructing and tracking all metrical levels over time.

[Figure: metrical levels tracked over time; the level labels shown include 0.5, 1, 2, 2.5, 3, 4, 5, 6, 7 and 8]

Beethoven, 9th Symphony, Scherzo

Lartillot et al. Estimating tempo and metrical features by tracking the whole metrical hierarchy. International Conference on Music & Emotion (2013)

C.P.E. Bach, Concerto for cello in A major, WQ 172, 3rd mvt

• ‘Envelope’, ‘Filter’: changes in dynamics
• ‘SpectralFlux’: global spectral changes
• ‘Emerge’: local changes in particular frequency regions

[Figure: three panels titled "Onset curve (Envelope)", amplitude vs. time (s), 0.5 to 4 s]

mironsets onset detection curve

J.S. Bach, Orchestral Suite No.3 in D major, BWV 1068, Aria

[Figure: three panels titled "Onset curve (Envelope)" and one panel titled "Envelope", amplitude vs. time (s)]

mironsets onset detection curve

• ‘Envelope’, ‘Filter’: changes in dynamics

• ‘SpectralFlux’: global spectral changes

• ‘Emerge’: local changes in particular frequency regions
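To make the difference between the first two families concrete, here is a minimal spectral-flux sketch in plain Python. It assumes flux is the Euclidean distance between the magnitude spectra of successive frames and uses a naive DFT; the function names are hypothetical and this is not the mironsets implementation. The test signal keeps a constant amplitude while its frequency jumps, so an envelope-based curve would stay flat whereas the flux peaks at the change.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of the DFT bins 0..n/2 of one frame (naive O(n^2) transform)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

def spectral_flux(signal, length, hop):
    """Euclidean distance between the spectra of successive frames."""
    spectra = [magnitude_spectrum(signal[s:s + length])
               for s in range(0, len(signal) - length + 1, hop)]
    return [math.dist(a, b) for a, b in zip(spectra, spectra[1:])]

# Four frames of a tone in bin 4, then four frames of a tone in bin 12.
n = 64
tone_a = [math.sin(2 * math.pi * 4 * t / n) for t in range(4 * n)]
tone_b = [math.sin(2 * math.pi * 12 * t / n) for t in range(4 * n)]
flux = spectral_flux(tone_a + tone_b, n, n)
peak_frame = max(range(len(flux)), key=lambda i: flux[i])
print(peak_frame, flux[peak_frame])
```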

mirmetre tracking all metrical levels

Influence of the onset detection method

M. Bruch, Violin Concerto No.1 in G minor, op.26, Finale (Allegro energico)

[Figure: metrical analysis obtained with ‘Emerge’ and with ‘SpectralFlux’]

Audio / symbolic

[Diagram: audio input analysed with mirenvelope, mirautocor, mirpeaks and mirtempo]

Outline

• MIRtoolbox

• Audio analysis

• Metrical analysis

• MiningSuite:

• MIRtoolbox 2.0

• “Symbolic” analysis (of scores, MIDI)

• Both audio and symbolic representations

• Complete redesign of software: optimized, clearer

• Memory management mechanisms (easier to use)

• “Really” open source: clear code, anybody can contribute via GitHub

• Decomposed into packages


MiningSuite

• SIGMINR: signal processing

• AUDMINR: audio, auditory modelling

• MUSMINR: music analysis

• SEQMINR: sequence processing

• PATMINR: pattern mining


MiningSuite

Signal domain

• SIGMINR
    • sig.input, sig.spectrum, …
• AUDMINR
    • aud.spectrum, …
    • aud.mfcc, aud.brightness, …
• MUSMINR
    • mus.spectrum, …
    • mus.tempo, mus.key, …

• Sets of operators related to signal processing operations, audio and musical features

• Versions specific to particular domains

• Each operator can be tuned with a set of options


SIGMINR signal processing

[Diagram of operators: sig.input, sig.frame, sig.spectrum, sig.rms, sig.cepstrum, sig.autocor, sig.flux, sig.envelope, sig.filterbank, sig.peaks, sig.segment, sig.zerocross, sig.rolloff, sig.simatrix, sig.cluster, sig.stat, …]

AUDMINR audio, auditory modeling

[Diagram of operators: aud.spectrum, aud.envelope, aud.filterbank, aud.attacktime, aud.attackslope, aud.brightness, aud.mfcc, aud.roughness, aud.novelty, aud.segment, aud.score, aud.eventdensity, aud.pitch, aud.fluctuation]

MUSMINR music theory

[Diagram of operators: mus.spectrum, mus.pitch, mus.tempo, mus.pulseclarity, mus.chromagram, mus.keystrength, mus.key, mus.mode, mus.keysom, mus.metre, mus.score]

Limitations of data flow in MIRtoolbox

[Diagram: long audio file or batch of files → miraudio → mirframe → mirspectrum → mircentroid]

• a = miraudio('myfile')
• f = mirframe(a)
• s = mirspectrum(f)
• mircentroid(s)
• mircentroid('myfile', 'Frame')

Data flow graph design & evaluation

• a = miraudio('Design', …)
• s = mirspectrum(a, 'Frame', …)
• c = mircentroid(s)
• mireval(c, 'myfile')

[Diagram: long audio file or batch of files → miraudio → mirframe → mirspectrum → mircentroid]

Data flow graph in MiningSuite

[Diagram: long audio file or batch of files → sig.input → sig.spectrum (‘Frame’) → sig.centroid]

• a = sig.input('myfile', …);
• s = sig.spectrum(a, 'Frame', …);
• c = sig.centroid(s)

With the trailing “;”, no operation is performed: the data flow graph is constructed without actual computation.
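The idea of building a data flow graph first and computing later can be illustrated with a small deferred-evaluation sketch in plain Python. The Node class, its show and eval methods and the three toy operators are hypothetical names chosen for this illustration; they do not reproduce MiningSuite's internal design.

```python
import math

class Node:
    """One operator in a data flow graph; nothing runs until eval() is called."""
    def __init__(self, name, func, source=None, **options):
        self.name, self.func = name, func
        self.source, self.options = source, options

    def show(self):
        """Describe the chain of operators, from the input to this node."""
        chain = self.source.show() if self.source else []
        return chain + [f"{self.name} {self.options}"]

    def eval(self, data):
        """Run the whole chain on the given data."""
        if self.source is not None:
            data = self.source.eval(data)
        return self.func(data, **self.options)

calls = []  # records which operators have actually been executed

def load(samples):
    calls.append("input")
    return list(samples)

def frame(signal, length):
    calls.append("frame")
    return [signal[i:i + length]
            for i in range(0, len(signal) - length + 1, length)]

def rms(frame_list):
    calls.append("rms")
    return [math.sqrt(sum(x * x for x in f) / len(f)) for f in frame_list]

# Design stage: the graph is built, but no operator runs yet.
a = Node("input", load)
f = Node("frame", frame, a, length=4)
r = Node("rms", rms, f)
designed_without_running = (calls == [])

# Evaluation stage: the same design could be applied to each file of a batch.
result = r.eval([1, -1, 1, -1, 2, -2, 2, -2])
print(r.show(), result)
```

Separating design from evaluation is what lets one design be reused on long files (processed chunk by chunk) or on batches, without the user managing memory explicitly.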


sig.design.show data flow graph display

• a = sig.input(…);

• s = sig.spectrum(a);

• c = sig.centroid(s)

• c.show

> sig.spectrum ( ... ) win: 'hamming' min: 0 max: Inf mr: 0 res: NaN length: NaN zp: 0 wr: 0 octave: 0 constq: 0 alongbands: 0 ni: 0 collapsed: 0 rapid: 0 phase: 1 nl: 0 norm: 0 mprod: [] msum: [] log: 0 db: 0 pow: 0 collapsed: 0 aver: 0 gauss: 0 timesmooth: 0

> sig.input ( 'ragtime' ) frameconfig: 0 mix: 'Pre' sampling: 0 center: 0 sampling: 0 extract: [] trim: 0 trimwhere: 'BothEnds' trimthreshold: 0.0600 halfwave: 0

> sig.centroid ( ... )

sig.signal.design design stored in the results

• a = sig.input(…);

• s = sig.spectrum(a);

• c = sig.centroid(s);

• d = c.eval

• d.design

• d.design.show

• save result.mat d

1 year later:

• load result.mat

• d

• d.design

• d.design.show


Outline

• MIRtoolbox

• Metrical analysis

• MiningSuite:

• MIRtoolbox 2.0

• “Symbolic” analysis (of scores, MIDI)

MIDI Toolbox (Eerola & Toiviainen, U. Jyväskylä, Finland, 2004–6)

• nmat = readmidi('laksin.mid')
• pianoroll(nmat)

nmat =

         0    0.9000         0   64.0000   82.0000         0    0.5510
    1.0000    0.9000         0   71.0000   89.0000    0.6122    0.5510
    2.0000    0.4500         0   71.0000   82.0000    1.2245    0.2755
    2.5000    0.4500         0   69.0000   70.0000    1.5306    0.2755
    3.0000    0.4528         0   67.0000   72.0000    1.8367    0.2772
    3.5000    0.4528         0   66.0000   72.0000    2.1429    0.2772
    4.0000    0.9000         0   64.0000   70.0000    2.4490    0.5510
    5.0000    0.9000         0   66.0000   79.0000    3.0612    0.5510
    6.0000    0.9000         0   67.0000   85.0000    3.6735    0.5510
    7.0000    1.7500         0   66.0000   72.0000    4.2857    1.0714
    9.0000    0.4528         0   64.0000   74.0000    5.5102    0.2772
    9.5000    0.4528         0   67.0000   81.0000    5.8163    0.2772
   10.0000    0.9000         0   71.0000   83.0000    6.1224    0.5510
   11.0000    0.4528         0   71.0000   78.0000    6.7347    0.2772
   11.5000    0.4528         0   69.0000   73.0000    7.0408    0.2772
   12.0000    0.4528         0   67.0000   71.0000    7.3469    0.2772
   12.5000    0.4528         0   66.0000   69.0000    7.6531    0.2772
   13.0000    0.4528         0   67.0000   83.0000    7.9592    0.2772
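In the MIDI Toolbox note matrix, the seven columns are onset (beats), duration (beats), MIDI channel, MIDI pitch, velocity, onset (seconds) and duration (seconds). The sketch below, in plain Python, names those columns for the first four rows above and derives the tempo from the two onset columns; the Note type and the note_name helper are illustrative, not part of the toolbox.

```python
from typing import NamedTuple

class Note(NamedTuple):
    onset_beats: float
    duration_beats: float
    channel: int
    pitch: int
    velocity: int
    onset_sec: float
    duration_sec: float

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_name(pitch):
    """MIDI pitch number to note name, with middle C (60) written as C4."""
    return f"{NAMES[pitch % 12]}{pitch // 12 - 1}"

# First four rows of the note matrix shown above.
rows = [
    (0.0, 0.90, 0, 64, 82, 0.0000, 0.5510),
    (1.0, 0.90, 0, 71, 89, 0.6122, 0.5510),
    (2.0, 0.45, 0, 71, 82, 1.2245, 0.2755),
    (2.5, 0.45, 0, 69, 70, 1.5306, 0.2755),
]
notes = [Note(*row) for row in rows]

# One beat lasts as long as the time between onsets that are one beat apart.
seconds_per_beat = ((notes[1].onset_sec - notes[0].onset_sec)
                    / (notes[1].onset_beats - notes[0].onset_beats))
tempo_bpm = 60.0 / seconds_per_beat
names = [note_name(n.pitch) for n in notes]
print(names, round(tempo_bpm))
```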

[Figure: pianoroll(nmat) display, pitch (C4# to C5#) vs. time in beats (0 to 16)]

MIDI data

mus.score score excerpt selection

• mus.score(…, 'Notes', 10:20)
• mus.score(…, 'StartTime', 30, 'EndTime', 60)
• mus.score(…, 'Channel', 1)
• mus.score(…, 'Trim')
• mus.score(…, 'TrimStart', 'TrimEnd')
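The selection options above amount to filtering a note list by position, time window or channel. A minimal sketch in plain Python follows; the select function, the note layout and the inclusive time boundaries are assumptions of this illustration, and the trimming options are not covered because the slides do not spell out their behaviour.

```python
def select(notes, indices=None, start_time=None, end_time=None, channel=None):
    """Return the notes that satisfy every criterion that is not None.

    Each note is a tuple (onset_sec, channel, pitch). `indices` holds 1-based
    positions, mirroring a MATLAB range such as 10:20.
    """
    kept = []
    for position, note in enumerate(notes, start=1):
        onset, chan, _pitch = note
        if indices is not None and position not in indices:
            continue
        if start_time is not None and onset < start_time:
            continue
        if end_time is not None and onset > end_time:
            continue
        if channel is not None and chan != channel:
            continue
        kept.append(note)
    return kept

notes = [(0.0, 1, 64), (0.6, 1, 71), (1.2, 2, 71), (1.5, 1, 69), (1.8, 2, 67)]
by_time = select(notes, start_time=0.5, end_time=1.6)
by_channel = select(notes, channel=2)
by_index = select(notes, indices=range(2, 4))  # notes 2 and 3
print(by_time, by_channel, by_index)
```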

[Figure: "Histogram" over the pitch classes D#, E, F#, G, A, B (h), and "Pitch" contour, pitch 63 to 71 vs. time 0 to 9 (s)]

mus.pitch pitch contour

• m = mus.score('myfile')
• m is of class mus.sequence
• s = mus.pitch(m)
• s is of class sig.signal
• h = mus.hist(s, 'Class')
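A pitch-class histogram simply counts notes after folding every pitch into one octave. As a minimal sketch in plain Python, using the 18 pitches of the note matrix shown earlier (the variable names are illustrative, and this says nothing about how the toolbox computes its histogram):

```python
from collections import Counter

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# MIDI pitches of the 18 notes listed in the note matrix above.
pitches = [64, 71, 71, 69, 67, 66, 64, 66, 67,
           66, 64, 67, 71, 71, 69, 67, 66, 67]

# Pitch class = pitch modulo 12, so E4 (64) and E5 (76) would share a bin.
histogram = Counter(NAMES[p % 12] for p in pitches)
print(sorted(histogram.items()))
```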

[Figure: "Pitch" interval contour (i), interval -2 to 7 vs. time 0 to 9]

mus.pitch(‘Inter’) pitch interval contour

• m = mus.score('myfile')
• i = mus.pitch(m, 'Inter')
• h = mus.histo(i)
• mus.histo(i, 'Sign', 0)
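The interval contour is the sequence of differences between successive pitches, and its histogram can be taken with or without the direction of each interval. A minimal sketch in plain Python on the same 18 pitches, assuming that the 'Sign', 0 option corresponds to ignoring the sign (an interpretation of the slide, not a documented fact):

```python
from collections import Counter

pitches = [64, 71, 71, 69, 67, 66, 64, 66, 67,
           66, 64, 67, 71, 71, 69, 67, 66, 67]

# Signed intervals in semitones between successive notes.
intervals = [b - a for a, b in zip(pitches, pitches[1:])]

signed = Counter(intervals)                      # direction kept
unsigned = Counter(abs(i) for i in intervals)    # direction ignored (assumed 'Sign', 0)
print(intervals)
print(sorted(signed.items()), sorted(unsigned.items()))
```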

[Figure: "Histogram" of the intervals (h), counts 0 to 7 for intervals -2 to 7]

[Figure: "Autocor", coefficient -0.2 to 1 vs. time 0.5 to 5, and "Pitch" contour, pitch 63 to 71 vs. time 0 to 9]

mus.pitch pitch contour

• m = mus.score('myfile')
• c = mus.pitch(m, 'Sampling', .25)
• mus.autocor(c)
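Sampling the pitch contour on a regular grid turns a note list into an ordinary signal, whose autocorrelation then reveals melodic periodicities. The sketch below, in plain Python, is one way this could work: the sample-and-hold rule, the grid step expressed in beats and the helper names are assumptions, not the toolbox's definitions.

```python
def sample_contour(notes, step):
    """Pitch of the most recently started note at times 0, step, 2*step, ...

    `notes` is a list of (onset, duration, pitch) sorted by onset, starting at 0.
    """
    end = max(onset + duration for onset, duration, _ in notes)
    contour = []
    for k in range(int(round(end / step))):
        t = k * step
        contour.append([pitch for onset, _, pitch in notes if onset <= t][-1])
    return contour

def autocorrelation(x):
    """Autocorrelation of the mean-removed sequence, scaled so that lag 0 is 1."""
    mean = sum(x) / len(x)
    d = [v - mean for v in x]
    energy = sum(v * v for v in d)
    return [sum(d[i] * d[i + lag] for i in range(len(d) - lag)) / energy
            for lag in range(len(d))]

# A four-note motif (one note per beat) repeated four times.
motif = [60, 64, 67, 64]
notes = [(float(beat), 1.0, motif[beat % 4]) for beat in range(16)]

contour = sample_contour(notes, 0.25)
ac = autocorrelation(contour)
# Search from one beat upwards; shorter lags are trivially high for held notes.
best_lag = max(range(4, 33), key=lambda lag: ac[lag])
period = best_lag * 0.25
print(period)
```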

[Figure: two "Pitch" contours, pitch vs. time 0 to 4]

mus.pitch pitch contour

• m1 = mus.score('myfile', 'EndTime', 5)
• m2 = mus.score('myfile', 'StartTime', 5)
• p1 = mus.pitch(m1, 'Sampling', .25)
• p2 = mus.pitch(m2, 'Sampling', .25)
• sig.dist(p1, p2, 'Cosine')
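Once two excerpts are sampled on the same grid, they can be compared as vectors. A minimal sketch of a cosine distance in plain Python, assuming the usual definition (one minus the cosine of the angle between the vectors) and truncation to the shorter contour; how sig.dist handles unequal lengths is not stated in the slides.

```python
import math

def cosine_distance(a, b):
    """1 - cos(angle) between two sequences, truncated to their common length."""
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

p1 = [64, 71, 71, 69]
p2 = [66, 73, 73, 71]          # the same contour transposed up two semitones
same = cosine_distance(p1, p1)
transposed = cosine_distance(p1, p2)
orthogonal = cosine_distance([1, 0], [0, 1])
print(same, transposed, orthogonal)
```

Because raw MIDI pitch values are all large and positive, distances between such contours are very small; removing the mean of each contour before comparing would make the measure more discriminating.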

mus.score From MIDI to score representation

• metrical grid: hierarchical construction of pulsations over multiple metrical levels

• modal and tonal spaces: mapping pitch values on scales (on delimited temporal regions)

• syntagmatic chains: successive notes forming voices, which makes it possible to express the relative distance between successive notes (rhythmic values)

mus.score score information

mus.score('laksin.mid')

mus.save mus.play

[Figure: display of the score, pitch (Eb4 to B4) vs. time (0 to 10)]

MIDI Toolbox, CHAPTER 4 – EXAMPLES

Example 1: Visualizing MIDI Data

The pianoroll function displays conventional pianoroll notation as it is available in sequencers. The function has the following syntax:

pianoroll(nmat,<varargin>);

The first argument refers to the notematrix and other arguments are optional. Possible arguments refer to axis labels (either MIDI note numbers or note names for Y-axis and either beats or seconds for the X-axis), colors or other options. For example, the following command outputs the pitch and velocity information:

» pianoroll(laksin,'name','sec','vel');

Figure 1: Pianoroll notation of the two first phrases of Läksin minä kesäyönä. The lower panel shows the velocity information.

Figure 2. Notation of first two verses of the Finnish Folk tune "Läksin minä kesäyönä".

[Figure 1 panels: pitch (C4# to C5#) vs. time in seconds (0 to 10); velocity (0 to 120) vs. time in seconds (0 to 10)]

[Diagram: Sound at the audio level (Dynamics, Pitch, Timbre); Notes at the symbolic level; structural levels: Segments, Mode, Tonality, Meter]

mus.score(…, 'Group') hierarchical grouping

Mozart, Variation XI on “Ah, vous dirai-je maman”, K.265/300e

[Figure: score excerpt (engraved with LilyPond) and display of the notes, pitch (Eb4 to B4) vs. time (0 to 10)]

[Diagram: Sound at the audio level (Dynamics, Pitch, Timbre); Notes at the symbolic level; structural levels: Segments, Mode, Tonality, Meter, Ornamentation reduction]

mus.score(…, 'Reduce') ornamentation reduction

Mozart, Variation XI on “Ah, vous dirai-je maman”, K.265/300e

[Figure: score excerpt (engraved with LilyPond) with a "head" annotation (Lerdahl & Jackendoff), and display of the notes, pitch (Eb4 to B4) vs. time (0 to 10)]

mus.minr('laksin.mid', 'Group', 'Reduce')

Construction of a syntagmatic network

[Diagram: Sound at the audio level (Dynamics, Pitch, Timbre); Notes at the symbolic level; structural levels: Segments, Mode, Tonality, Motifs, Meter, Ornamentation reduction, Pattern]

mus.score(…, 'Motif')

Geisslerlied

[Figure: display of the notes, pitch (62 to 72) vs. time (0 to 20)]

Audio / symbolic

[Diagram: metrical analysis from audio (aud.envelope, mus.autocor, sig.peaks, mus.tempo) and from MIDI or score (mus.score); transcription leads from audio to MIDI or score]


mus.score incremental approach

[Diagram: Notes feeding the analyses of Segments, Mode, Tonality, Motifs, Meter, Ornamentation reduction and Pattern]

Each successive note is progressively integrated into all musical analyses, driving their interdependencies.

• All releases, GitHub repository, integrated with code review environment

• User’s Manual and documentation in wiki environment

• Mailing lists: news, discussion list related to ongoing development, commits, issues registered and modified, discussion list for users

• Tickets to issue bug reports

Open-source project http://bit.ly/miningsuite


• MIRtoolbox and the initial version of MiningSuite are mainly the work of one person. The goal is a transition to a tool controlled by a community following standard open-source protocols.

• The whole code should be clearly readable, and open to correction, modification and enrichment by an open community, after open discussions.

• Further development of the toolbox core (architecture, new perspectives) is also subject to open discussion and community-based collaboration.
