Improving perceptual tempo estimation with crowd-sourced annotations Mark Levy, 26 October 2011

Crowd sourcing for tempo estimation


Slides for the presentation at ISMIR 2011 of the paper "Improving perceptual tempo estimation with crowd-sourced annotations".


Page 1: Crowd sourcing for tempo estimation

Improving perceptual tempo estimation with crowd-sourced annotations

Mark Levy, 26 October 2011

Page 2: Crowd sourcing for tempo estimation

Tempo Estimation

Terminology: tempo = beats per minute = bpm

Page 3: Crowd sourcing for tempo estimation

Tempo Estimation

Use crowd-sourcing to:
• quantify the influence of metrical ambiguity on tempo perception
• improve evaluation
• improve algorithms

Page 4: Crowd sourcing for tempo estimation

Perceived Tempo

Metrical ambiguity:
• listeners don't agree about bpm
• typically in two camps
• perceived values differ by a factor of 2 or 3

McKinney and Moelants: 24-40 subjects; released their experimental data

Page 5: Crowd sourcing for tempo estimation

Perceived Tempo

Metrical ambiguity:

[figure: two histograms of listeners vs bpm, one per song]

McKinney and Moelants, 2004

Page 6: Crowd sourcing for tempo estimation

Machine-Estimated Tempo

Machine estimates are also affected by metrical ambiguity:
• makes estimation difficult
• natural to see multiple bpm values
• estimated values often out by a factor of 2 or 3 ("octave error")

Page 7: Crowd sourcing for tempo estimation

Crowd Sourcing

Web-based questionnaire:
• capture label choices
• capture bpm from the mean tapping interval (see the sketch below)
• capture comparative judgements
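A minimal sketch of deriving bpm from the mean tapping interval, assuming the questionnaire records tap timestamps in seconds; the function name and data layout are illustrative, not from the paper:

```python
def bpm_from_taps(tap_times):
    """Estimate bpm from a list of tap timestamps (in seconds).

    bpm = 60 / mean inter-tap interval, as described on the slide.
    """
    if len(tap_times) < 2:
        raise ValueError("need at least two taps to measure an interval")
    intervals = [t2 - t1 for t1, t2 in zip(tap_times, tap_times[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval

# Example: taps roughly every 0.5 s -> about 120 bpm
print(bpm_from_taps([0.0, 0.51, 1.02, 1.49, 2.01]))
```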

Page 9: Crowd sourcing for tempo estimation

Crowd Sourcing

Music:
• over 4000 songs, 30-second clips
• rock, country, pop, soul, funk and rnb, jazz, latin, reggae, disco, rap, punk, electronic, trance, industrial, house, folk, ...
• recent releases back to the 60s

Page 10: Crowd sourcing for tempo estimation

Response

First week (as reported in the paper):
• 4k tracks annotated by 2k listeners
• 20k labels and bpm estimates

To date:
• 6k tracks annotated by 27k listeners
• 200k labels and bpm estimates

Page 11: Crowd sourcing for tempo estimation

Analysis: ambiguity

When people tap to a song at different bpm, do they really disagree about whether it's slow or fast?

Investigation:
• inspect labels from people who tap differently
• quantify disagreement for ambiguous songs

Page 12: Crowd sourcing for tempo estimation

Analysis: ambiguity

Subset of slow/fast songs:
• labelled by at least five listeners
• majority label "slow" or "fast"

Page 13: Crowd sourcing for tempo estimation

Analysis: ambiguity

[plot: bpm vs speed label, all estimates for slow/fast songs]

Page 14: Crowd sourcing for tempo estimation

Analysis: ambiguity

[plot: bpm vs speed label, all estimates for slow/fast songs; people can tap slowly to fast songs]

Page 15: Crowd sourcing for tempo estimation

Analysis: ambiguity

[chart: labels for fast songs from slow-tappers]

Page 16: Crowd sourcing for tempo estimation

Analysis: ambiguity

Quantify disagreement over labels:
• model both conflict and the extremity of tempo
• conflict coefficient:

Ls, Lf, L: number of slow, fast and all labels for a song

C = (min(Ls, Lf) / max(Ls, Lf)) × ((Ls + Lf) / L)
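A small sketch of this coefficient as reconstructed above, with Ls, Lf and L as on the slide; the function name is illustrative, not from the paper:

```python
def conflict_coefficient(n_slow, n_fast, n_total):
    """Conflict coefficient C = min(Ls, Lf)/max(Ls, Lf) * (Ls + Lf)/L.

    The first factor measures conflict between the slow and fast camps;
    the second weights it by how many of the labels were slow/fast at all.
    """
    if n_total == 0 or max(n_slow, n_fast) == 0:
        return 0.0
    balance = min(n_slow, n_fast) / max(n_slow, n_fast)
    coverage = (n_slow + n_fast) / n_total
    return balance * coverage

# Example: 4 slow, 4 fast, 10 labels in total -> C = 1.0 * 0.8 = 0.8
print(conflict_coefficient(4, 4, 10))
```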

Page 17: Crowd sourcing for tempo estimation

Distribution of the conflict coefficient C

[histogram: distribution of C over all songs with at least five labels]

C > 0 means a song received both slow and fast labels

Page 18: Crowd sourcing for tempo estimation

Analysis: ambiguity

Subset of metrically ambiguous songs: at least 30% of listeners tap at half/twice the majority estimate

Compared to the rest: no significant difference in C
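One plausible reading of the 30% criterion in code, assuming per-listener bpm estimates are available; the tolerance value and function name are assumptions, not from the paper:

```python
def is_metrically_ambiguous(estimates, majority_bpm, tol=0.04):
    """Flag a song as ambiguous if at least 30% of listeners tap at
    half or twice the majority estimate (relative tolerance assumed)."""
    def near(bpm, target):
        return abs(bpm - target) <= tol * target

    octave_taps = sum(
        1 for bpm in estimates
        if near(bpm, majority_bpm / 2) or near(bpm, majority_bpm * 2)
    )
    return octave_taps >= 0.3 * len(estimates)
```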

Page 19: Crowd sourcing for tempo estimation

Evaluation metrics

MIREX metrics:
• capture metrical ambiguity
• replicate human disagreement

But ambiguity is considered unhelpful in applications such as:
• automatic playlisting
• DJ tools, production tools
• jogging

Page 20: Crowd sourcing for tempo estimation

Evaluation metrics

Application-oriented metric:
• compare with the majority* human estimate
  (*median in the most popular bin)
• categorise machine estimates: same as humans, twice as fast, twice as slow, three times as fast, and so on, or unrelated to humans (see the sketch below)
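A sketch of the categorisation step, assuming a small relative tolerance when matching ratios; the tolerance and category labels are illustrative:

```python
def categorise(machine_bpm, human_bpm, tol=0.04):
    """Categorise a machine estimate against the majority human estimate:
    'same', 'x2', '/2', 'x3', '/3', or 'unrelated'."""
    ratios = {
        "same": 1.0,
        "x2": 2.0, "/2": 0.5,
        "x3": 3.0, "/3": 1.0 / 3.0,
    }
    for label, r in ratios.items():
        if abs(machine_bpm - r * human_bpm) <= tol * r * human_bpm:
            return label
    return "unrelated"

print(categorise(178.0, 90.0))  # close to twice as fast -> 'x2'
```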

Page 21: Crowd sourcing for tempo estimation

Analysis: evaluation

Sources:
• BPM List (DJ kit, human-moderated): Donny Brusca, 7th edition, 2011
• EchoNest/MSD (closed-source algorithm): maybe Jehan et al.?
• VAMP (open-source algorithm): Davies and Landone, 2007-

Page 22: Crowd sourcing for tempo estimation

Analysis: machine vs human

[bar chart: how machine estimates compare to the majority human estimate (x2, same, /2, unrelated, other; y-axis 0-80%) for BPM List, VAMP and EchoNest]

Page 23: Crowd sourcing for tempo estimation

Analysis: controlled test

Controlled comparison:
• exploit experience from website A/B testing
• use this to improve an algorithm iteratively

The result is independent of any quality metric.

Page 24: Crowd sourcing for tempo estimation

Analysis: controlled test

When a visitor arrives at the page:
• choose a source S at random
• choose a bpm value at random
• choose two songs given that value by S
• display them together

Then ask which sounds faster! (a sketch of the selection step follows below)
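A sketch of the pair-selection logic, assuming the annotations are held as a mapping from source to {bpm value: list of song ids}; this data layout and the function name are assumptions for illustration:

```python
import random

def pick_pair(annotations):
    """Choose a source, then a bpm value, then two songs that the chosen
    source labelled with that bpm, as described on the slide."""
    source = random.choice(list(annotations))
    # only bpm values with at least two songs can yield a pair
    candidates = {bpm: songs for bpm, songs in annotations[source].items()
                  if len(songs) >= 2}
    if not candidates:
        raise ValueError("source has no bpm value with two or more songs")
    bpm = random.choice(list(candidates))
    song_a, song_b = random.sample(candidates[bpm], 2)
    return source, bpm, song_a, song_b
```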

Page 25: Crowd sourcing for tempo estimation

Analysis: controlled test

Null hypothesis:
• there will be presentation effects
• listeners will attend to subtle differences
• but these effects are independent of the source of bpm estimates, if the quality of the sources is the same

Page 26: Crowd sourcing for tempo estimation

Analysis: controlled test

[bar chart: proportion of pairs judged the same vs different in speed (y-axis 0-100%) for BPM List, VAMP and EchoNest]

Page 27: Crowd sourcing for tempo estimation

Analysis: improving estimates

Adjust bpm based on class:
• imagine an accurate slow/fast classifier (Hockman and Fujinaga, 2010)
• adjust as follows:
  bpm := bpm/2 if slow and bpm > 100
  bpm := bpm*2 if fast and bpm < 100
  otherwise don't adjust
• simulation: accept the majority human label as the classifier output
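The adjustment rule from the slide written out directly; the 100 bpm threshold is from the slide, while the function name is illustrative:

```python
def adjust_bpm(bpm, label):
    """Fold a machine estimate toward the perceived octave: halve
    fast-looking estimates of slow songs, double slow-looking estimates
    of fast songs, otherwise leave the estimate alone."""
    if label == "slow" and bpm > 100:
        return bpm / 2
    if label == "fast" and bpm < 100:
        return bpm * 2
    return bpm

print(adjust_bpm(160.0, "slow"))  # -> 80.0
print(adjust_bpm(70.0, "fast"))   # -> 140.0
```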

Page 28: Crowd sourcing for tempo estimation

Analysis: adjusted vs human

[bar chart: how adjusted machine estimates compare to the majority human estimate (x2, same, /2, unrelated, other; y-axis 0-80%) for BPM List, VAMP and EchoNest]

Page 29: Crowd sourcing for tempo estimation

Conclusions

Crowd sourcing:
• gather thousands of data points in a few days, half a million over time
• humans agree over slow/fast labels, even when they tap at different bpm

Improving machine estimates:
• use controlled testing
• exploit a slow/fast classifier

Page 30: Crowd sourcing for tempo estimation

Thanks!

[email protected]
@gamboviol

http://mir-in-action.blogspot.com
http://playground.last.fm/demo/speedo
http://users.last.fm/~mark/speedo.tgz

We are looking for interns/research fellows!