NYAI #4: Unsupervised Learning (Soumith Chintala) & Music Through ML (Brian McFee)

NYAI - Understanding Music Through Machine Learning by Brian McFee

Page 2: NYAI - Understanding Music Through Machine Learning by Brian McFee

Understanding music through machine learning

Brian McFee

@functiontelechy

Page 3: NYAI - Understanding Music Through Machine Learning by Brian McFee

Why analyze music with computers?

Recommender systems

Musicology

Music education

Visualization, interactive display

Creative applications

Page 4: NYAI - Understanding Music Through Machine Learning by Brian McFee

What do we mean by music?

Score, MIDI

Audio recordings

Page 5: NYAI - Understanding Music Through Machine Learning by Brian McFee

Time-varying prediction tasks

Structural segmentation

Chord recognition

Beat / meter tracking

Instrument detection

Transcription

Page 6: NYAI - Understanding Music Through Machine Learning by Brian McFee

Structure analysis

with Dan Ellis @Columbia.edu, @Google

[M., Ellis, ISMIR2014]

Page 7: NYAI - Understanding Music Through Machine Learning by Brian McFee

Structure analysis

1. Similarity vs. dissimilarity

2. Break a song into pieces (detect boundaries)

3. Re-connect the pieces (detect similar segments)

Page 8: NYAI - Understanding Music Through Machine Learning by Brian McFee

Our strategy: a song is a graph

1. Construct a graph over beats

2. Partition the graph to recover structure

3. Vary the partition size to expose multi-level structure

Page 9: NYAI - Understanding Music Through Machine Learning by Brian McFee

https://commons.wikimedia.org/wiki/File:Random_walk_25000_not_animated.svg

Page 10: NYAI - Understanding Music Through Machine Learning by Brian McFee

Building the graph [1/3]: The local graph

1. Add a vertex for each beat

2. Add local edges between adjacent beats (t, t ± 1)

3. Weight edges by MFCC similarity (Gaussian kernel)
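The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `local_graph`, the input layout (one feature row per beat), and the default bandwidth (median adjacent-beat distance) are assumptions made for the example.

```python
import numpy as np

def local_graph(features, sigma=None):
    """Chain graph over beats with Gaussian-kernel edge weights.

    features: array of shape (n_beats, n_dims), e.g. beat-synchronous MFCCs.
    sigma: kernel bandwidth; if None, the median adjacent-beat distance
           is used (an assumption made for this sketch).
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    # Squared distance between each beat t and its successor t + 1.
    sq_dist = np.sum((features[1:] - features[:-1]) ** 2, axis=1)
    if sigma is None:
        sigma = np.sqrt(np.median(sq_dist))
        if sigma == 0:
            sigma = 1.0
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    local = np.zeros((n, n))
    idx = np.arange(n - 1)
    local[idx, idx + 1] = weights  # edge (t, t + 1)
    local[idx + 1, idx] = weights  # edge (t, t - 1); keeps the graph symmetric
    return local

# Toy input: four beats with a one-dimensional "feature" each.
demo_features = np.array([[0.0], [1.0], [3.0], [3.0]])
demo_local = local_graph(demo_features)
```

Only the first off-diagonals are nonzero, so a random walk on this graph can only step to the previous or next beat.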


Page 12: NYAI - Understanding Music Through Machine Learning by Brian McFee

Building the graph [2/3]: The repetition graph

1. Link k-nearest neighbors (in CQT space)

2. Weight edges by feature similarity (Gaussian kernel)
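A rough sketch of the repetition graph, assuming beat-synchronous features are already available as an array. The function name `repetition_graph`, the symmetrization rule (keep an edge if either beat selects the other), and the bandwidth default are assumptions for illustration; the slides do not specify them.

```python
import numpy as np

def repetition_graph(features, k=3, sigma=None):
    """Symmetric k-nearest-neighbor graph with Gaussian-kernel weights.

    features: array of shape (n_beats, n_dims), e.g. beat-synchronous CQT.
    k: number of neighbors per beat (must be smaller than n_beats).
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    # All pairwise squared Euclidean distances between beats.
    diff = features[:, None, :] - features[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(sq_dist, np.inf)  # a beat is not its own neighbor
    # Indices of the k nearest neighbors of each beat.
    neighbors = np.argsort(sq_dist, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), neighbors.ravel()] = True
    mask |= mask.T  # assumption: keep an edge if either endpoint chose it
    if sigma is None:
        sigma = np.sqrt(np.median(sq_dist[mask]))
        if sigma == 0:
            sigma = 1.0
    rep = np.zeros((n, n))
    rep[mask] = np.exp(-sq_dist[mask] / (2.0 * sigma ** 2))
    return rep

# Toy input: a four-beat pattern played twice, so beat t matches beat t + 4.
demo_features = np.array([0, 1, 2, 3, 0, 1, 2, 3], dtype=float)[:, None]
demo_rep = repetition_graph(demo_features, k=1)
```

In the toy input each beat links to its repetition four beats away, which is the kind of long-range edge the local graph cannot provide.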

Page 15: NYAI - Understanding Music Through Machine Learning by Brian McFee

Building the graph [3/3]: The combination

1. Take a weighted combination of local and repetition

A = μ * Local + (1 - μ) * Repetition

2. Optimize μ for a balanced random walk: P[Local move] ≅ P[Repetition move]

∀ i: μ ∑_j Local[i,j] ≅ (1 - μ) ∑_j Repetition[i,j]

3. μ has a closed-form (least-squares) optimum
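One way to read the least-squares statement: write d_loc[i] = ∑_j Local[i,j] and d_rep[i] = ∑_j Repetition[i,j], then minimize ∑_i (μ d_loc[i] - (1 - μ) d_rep[i])² over μ. Setting the derivative to zero gives μ = ⟨d_rep, d_loc + d_rep⟩ / ‖d_loc + d_rep‖². The sketch below implements that reading; the function names are hypothetical.

```python
import numpy as np

def optimal_mu(local, repetition):
    """Least-squares mu balancing local and repetition row sums."""
    d_loc = np.asarray(local, dtype=float).sum(axis=1)
    d_rep = np.asarray(repetition, dtype=float).sum(axis=1)
    total = d_loc + d_rep
    # mu * d_loc - (1 - mu) * d_rep == mu * total - d_rep, so the
    # minimizer of sum_i (mu * total[i] - d_rep[i])**2 is a projection.
    return float(d_rep @ total / (total @ total))

def combine(local, repetition):
    """Return A = mu * Local + (1 - mu) * Repetition and the mu used."""
    mu = optimal_mu(local, repetition)
    return mu * np.asarray(local) + (1.0 - mu) * np.asarray(repetition), mu

# Toy graphs: every local row sums to 1, every repetition row sums to 3.
demo_local = np.array([[0.0, 1.0], [1.0, 0.0]])
demo_rep = np.array([[0.0, 3.0], [3.0, 0.0]])
demo_A, demo_mu = combine(demo_local, demo_rep)
```

In the toy case μ comes out as 0.75, so both terms contribute 0.75 to each row of A and a random walker is equally likely to take a local or a repetition step.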

Page 16: NYAI - Understanding Music Through Machine Learning by Brian McFee

Example: The Beatles - Come Together

[Figure: plot with Time → on both axes]

Page 17: NYAI - Understanding Music Through Machine Learning by Brian McFee

Partitioning via spectral clustering

Affinity matrix A

Degree matrix D_ii = ∑_j A_ij

Normalized Laplacian L = I - D^{-1} A

Bottom eigenvectors encode component membership for each beat

L y = λ y

Cluster the eigenvectors Y of L to reveal structure

[Figure: eigenvectors Y0 through Y9 plotted against Time →]
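A minimal sketch of the eigenvector computation, assuming A is symmetric and every beat has positive degree. Because L = I - D^{-1} A is not symmetric, the example solves the equivalent symmetric problem and maps the eigenvectors back; that route is a choice made here, not something the slides prescribe.

```python
import numpy as np

def laplacian_eigenvectors(A):
    """Eigenpairs of L = I - D^{-1} A, sorted by ascending eigenvalue."""
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)  # assumes every degree is positive
    # L is similar to the symmetric matrix I - D^{-1/2} A D^{-1/2},
    # so solve the symmetric problem with eigh (ascending eigenvalues).
    L_sym = np.eye(len(A)) - inv_sqrt[:, None] * A * inv_sqrt[None, :]
    evals, U = np.linalg.eigh(L_sym)
    # If L_sym u = lam * u, then y = D^{-1/2} u satisfies L y = lam * y.
    Y = inv_sqrt[:, None] * U
    return evals, Y

# Toy graph: two disconnected triangles (beats 0-2 and beats 3-5).
demo_A = np.kron(np.eye(2), np.ones((3, 3)) - np.eye(3))
demo_evals, demo_Y = laplacian_eigenvectors(demo_A)
demo_L = np.eye(6) - demo_A / demo_A.sum(axis=1, keepdims=True)
```

With two disconnected components the two smallest eigenvalues are zero, and the corresponding rows of `demo_Y[:, :2]` are identical for beats in the same component, which is the sense in which the bottom eigenvectors encode membership.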

Page 18: NYAI - Understanding Music Through Machine Learning by Brian McFee

Example: The Beatles - Come Together

Low-rank reconstructions expose structure

L ≅ Y[:, :m] · Y[:, :m]^T
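The product on the right-hand side is easy to inspect directly. The sketch below only forms Y[:, :m] · Y[:, :m]^T as written on the slide; it assumes orthonormal eigenvectors of the symmetric normalized Laplacian sorted by ascending eigenvalue, which is one possible choice of Y and is not confirmed by the slides.

```python
import numpy as np

def low_rank_reconstruction(Y, m):
    """Return Y[:, :m] @ Y[:, :m].T, an n-by-n matrix of rank at most m."""
    Ym = np.asarray(Y, dtype=float)[:, :m]
    return Ym @ Ym.T

# Toy graph: two disconnected triangles (beats 0-2 and beats 3-5).
demo_A = np.kron(np.eye(2), np.ones((3, 3)) - np.eye(3))
inv_sqrt = 1.0 / np.sqrt(demo_A.sum(axis=1))
demo_L_sym = np.eye(6) - inv_sqrt[:, None] * demo_A * inv_sqrt[None, :]
_, demo_Y = np.linalg.eigh(demo_L_sym)  # orthonormal columns, ascending
demo_R = low_rank_reconstruction(demo_Y, m=2)
```

For this toy graph the rank-2 product is block-diagonal: entries are 1/3 for two beats in the same triangle and 0 otherwise, so the blocks line up with the two segments.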

Page 19: NYAI - Understanding Music Through Machine Learning by Brian McFee

Multi-level segmentation

1. Construct the n-by-n graph A

2. Compute Laplacian eigenvectors Y

3. for m in [2, 3, …]

    Partitions[m] := kmeans(Y[:, :m], k=m)

4. Return Partitions
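The loop above can be sketched end to end in NumPy. Everything named here is an assumption for illustration: the tiny `kmeans` helper (Lloyd's algorithm with farthest-point initialization), the `max_m` cutoff, and the eigenvector routine are stand-ins, not the implementation shipped in MSAF.

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Small Lloyd's k-means with deterministic farthest-point init."""
    X = np.asarray(X, dtype=float)
    centers = [X[0]]
    for _ in range(1, k):
        # Distance from every point to its closest chosen center.
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def multi_level_segmentation(A, max_m=4):
    """Return {m: beat labels} for m = 2 .. max_m."""
    A = np.asarray(A, dtype=float)
    inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))  # assumes positive degrees
    L_sym = np.eye(len(A)) - inv_sqrt[:, None] * A * inv_sqrt[None, :]
    _, U = np.linalg.eigh(L_sym)   # ascending eigenvalues
    Y = inv_sqrt[:, None] * U      # eigenvectors of I - D^{-1} A
    partitions = {}
    for m in range(2, max_m + 1):
        partitions[m] = kmeans(Y[:, :m], k=m)
    return partitions

# Toy graph: three disconnected triangles, i.e. three ideal segments.
demo_A = np.kron(np.eye(3), np.ones((3, 3)) - np.eye(3))
demo_partitions = multi_level_segmentation(demo_A, max_m=3)
demo_labels = demo_partitions[3]
```

Because the eigenvectors are computed once and only the number of columns changes, every level of the hierarchy costs just one extra k-means run.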

Page 20: NYAI - Understanding Music Through Machine Learning by Brian McFee

Interactive visualization demo

Page 21: NYAI - Understanding Music Through Machine Learning by Brian McFee

Quantitative evaluation

Data sets:

Beatles_TUT (174 tracks)

SALAMI small (735 tracks) and functions

Choosing the number of components

Maximum label entropy (with duration constraints)

Oracle: Best m per track, per metric (simulates interactive display)

Baseline: [Serrà, Müller, Grosche & Arcos, 2012]

Boundary detection

Pairwise frame labeling
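For readers unfamiliar with the pairwise frame labeling score, here is a small sketch of its usual definition: take every pair of frames, ask whether the two frames share a label, and compare the reference and estimated answers with precision, recall, and F-measure. The function name and the toy labels are assumptions, and frame sampling details used in the actual evaluation are not reproduced.

```python
import numpy as np

def pairwise_f_measure(ref_labels, est_labels):
    """Precision, recall, and F-measure over same-label frame pairs."""
    ref = np.asarray(ref_labels)
    est = np.asarray(est_labels)
    # Boolean matrices marking frame pairs (i, j) that share a label.
    same_ref = ref[:, None] == ref[None, :]
    same_est = est[:, None] == est[None, :]
    # Count each unordered pair i < j exactly once.
    upper = np.triu(np.ones((len(ref), len(ref)), dtype=bool), k=1)
    n_ref = np.sum(same_ref & upper)
    n_est = np.sum(same_est & upper)
    n_both = np.sum(same_ref & same_est & upper)
    precision = n_both / n_est if n_est else 0.0
    recall = n_both / n_ref if n_ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Toy example: four frames, reference has two segments of two frames each.
demo_p, demo_r, demo_f = pairwise_f_measure(
    ["A", "A", "B", "B"], ["x", "x", "x", "y"]
)
```

In the toy example the estimate proposes three same-label pairs, one of which is correct, against two in the reference, giving precision 1/3, recall 1/2, and F-measure 0.4. The score depends only on which frames are grouped together, not on the label names.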

Page 22: NYAI - Understanding Music Through Machine Learning by Brian McFee

Results: Beatles

Method         F 0.5s          F 3.0s          F-Pairwise
Automatic m    0.312 ± 0.15    0.579 ± 0.16    0.628 ± 0.13
Oracle         0.414 ± 0.14    0.684 ± 0.13    0.694 ± 0.12
SMGA*          0.293 ± 0.13    0.699 ± 0.16    0.715 ± 0.15

Page 23: NYAI - Understanding Music Through Machine Learning by Brian McFee

Results: SALAMI small

Method         F 0.5s          F 3.0s          F-Pairwise
Automatic m    0.192 ± 0.11    0.344 ± 0.15    0.448 ± 0.16
Oracle         0.292 ± 0.15    0.525 ± 0.19    0.561 ± 0.16
SMGA           0.173 ± 0.08    0.518 ± 0.12    0.493 ± 0.16

Page 24: NYAI - Understanding Music Through Machine Learning by Brian McFee

Results: SALAMI functions

Method         F 0.5s          F 3.0s          F-Pairwise
Automatic m    0.304 ± 0.13    0.455 ± 0.16    0.546 ± 0.14
Oracle         0.406 ± 0.13    0.579 ± 0.15    0.652 ± 0.13
SMGA           0.224 ± 0.11    0.550 ± 0.18    0.553 ± 0.15

Page 25: NYAI - Understanding Music Through Machine Learning by Brian McFee

Part 1: wrap-up

The graph representation provides a simple framework for segmentation

Algorithm is available in MSAF by Oriol Nieto @NYU @Pandora

pip install msaf

Follow-up work: hierarchical structure evaluation

Boundaries [M., Nieto, Bello, ISMIR2015]

Labelings [M., Nieto, Bello, in preparation]

Page 26: NYAI - Understanding Music Through Machine Learning by Brian McFee

Links to software

Music structure analysis framework

pip install msaf

JAMS

pip install jams

Librosa

pip install librosa

conda install -c conda-forge librosa

Page 27: NYAI - Understanding Music Through Machine Learning by Brian McFee

[email protected]

@functiontelechy

https://bmcfee.github.io/