Identifying “Cover Songs” with Beat-Synchronous Chroma Features
Dan Ellis and Graham Poliner
Laboratory for Recognition and Organization of Speech and Audio
Dept. Electrical Eng., Columbia Univ., NY USA
{dpwe,graham}@ee.columbia.edu
http://labrosa.ee.columbia.edu/
Identifying Cover Songs - Ellis & Poliner 2007-04-20

1. Cover Songs
2. Chroma Features
3. Beat Tracking
4. Matching Cover Songs

Cover Songs
• “Cover Songs” = reinterpretation of a piece
    different instrumentation, character
    no match with “timbral” features
• Need a different representation!
    beat-synchronous chroma features
[Figure: beat-sync chroma features of “Let It Be” by The Beatles and by Nick Cave; chroma axis A to G, time / sec]
Chroma Features
• Chroma features fold spectral energy into one canonical octave
    i.e. 12 semitone bins
• Can resynthesize as “Shepard Tones”
    all octaves at once
[Figure: piano scale, shown against time / sec]
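The folding into 12 semitone bins can be sketched as follows. This is a minimal illustration, not the authors' code: the A4 = 440 Hz reference, the bin order starting at A, and the names `chroma_bin` and `fold_to_chroma` are assumptions made for the example.

```python
import math

# Assumed bin order: bin 0 is A, following the A..G labels on the chroma axis.
NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def chroma_bin(freq_hz, ref_hz=440.0):
    """Return the semitone bin (0 = A) of a frequency, ignoring its octave."""
    semitones_from_ref = 12.0 * math.log2(freq_hz / ref_hz)
    return int(round(semitones_from_ref)) % 12

def fold_to_chroma(freqs_hz, energies):
    """Sum spectral energy into 12 chroma bins, one per semitone class."""
    chroma = [0.0] * 12
    for f, e in zip(freqs_hz, energies):
        if f > 0:  # skip the DC bin, which has no pitch
            chroma[chroma_bin(f)] += e
    return chroma
```

For instance, 220 Hz and 440 Hz both land in the A bin, while 261.63 Hz (middle C) lands in the C bin.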
Calculating Chroma Features
• Method 1: Map every STFT bin
    blurs non-tonal energy
• Method 2: Map only STFT peaks
    still blurry at low frequencies
• Method 3: Instantaneous Frequency ∂φ/∂t
    escapes frequency resolution limit
[Figure: chroma (A to G) from the three methods; axes fft bin, time / sec, time / frame, freq]
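Method 3 can be illustrated with a single DFT bin. The sketch below estimates frequency from the phase advance between two frames one sample apart, which is one common way to approximate ∂φ/∂t; it is not taken from the authors' code. The Hann window, the one-sample hop, and the names `windowed_bin` and `instantaneous_frequency` are assumptions.

```python
import cmath
import math

def windowed_bin(signal, start, n_fft, k):
    """Hann-windowed DFT coefficient of bin k for the frame starting at `start`."""
    total = 0j
    for n in range(n_fft):
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft)
        total += hann * signal[start + n] * cmath.exp(-2j * math.pi * k * n / n_fft)
    return total

def instantaneous_frequency(signal, sr, n_fft, k, hop=1):
    """Estimate the frequency (Hz) near bin k from the phase advance over `hop` samples."""
    phase_a = cmath.phase(windowed_bin(signal, 0, n_fft, k))
    phase_b = cmath.phase(windowed_bin(signal, hop, n_fft, k))
    expected = 2 * math.pi * k * hop / n_fft        # advance of a tone at the bin centre
    deviation = phase_b - phase_a - expected
    deviation = (deviation + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
    return (expected + deviation) * sr / (2 * math.pi * hop)
```

With an 8 kHz sample rate and a 256-point frame, bins are 31.25 Hz apart, yet a 443 Hz tone analysed in bin 14 (centre 437.5 Hz) is recovered to well within 1 Hz.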
Beat Tracking (1)
• Goal: One feature vector per ‘beat’ (tatum)
    for tempo normalization, efficiency
• Autocorr. + window → global tempo estimate
[Figure: axes time / sec and lag / 4 ms samples]
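The autocorrelation-plus-window step can be sketched as below. The 4 ms frame step follows the lag axis of the figure; the log-Gaussian shape of the window, its 0.5 s centre, its one-octave width, and the name `estimate_period` are assumptions for illustration rather than the authors' settings.

```python
import math

def estimate_period(onset_env, frame_sec=0.004, center_sec=0.5, width_octaves=1.0):
    """Return the beat period (seconds) with the largest windowed autocorrelation."""
    n = len(onset_env)
    best_lag, best_score = None, -math.inf
    for lag in range(1, n // 2):
        # raw autocorrelation of the onset strength envelope at this lag
        ac = sum(onset_env[t] * onset_env[t - lag] for t in range(lag, n))
        # assumed weighting: Gaussian on a log-time axis, favouring periods near center_sec
        octaves_off = math.log2(lag * frame_sec / center_sec)
        weight = math.exp(-0.5 * (octaves_off / width_octaves) ** 2)
        if ac * weight > best_score:
            best_lag, best_score = lag, ac * weight
    return best_lag * frame_sec
```

The global tempo in BPM is then `60 / estimate_period(onset_env)`. The window is what keeps the estimate from jumping to a multiple of the true period, since the raw autocorrelation also has strong values there.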
Beat Tracking (2)
• Dynamic Programming finds beat times {t_i}
    optimizes $\sum_i O(t_i) + \sum_i W((t_{i+1} - t_i - \tau_p)/\beta)$
    where O(t) is onset strength envelope (local score)
    W(t) is a log-Gaussian window (transition cost)
    $\tau_p$ is the default beat period per measured tempo
    incrementally find best predecessor at every time
    backtrace from largest final score to get beats
    $P(t) = \arg\max_{\tau} \{ W((\tau - \tau_p)/\beta) \, C^*(\tau) \}$
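The dynamic programming search can be sketched as below, under stated assumptions. The sketch scores a beat sequence as onset strength plus an additive transition term, and takes that term to be a penalty on the squared log ratio between the beat gap and the default period. The weight `alpha`, the predecessor search range (half to twice the period), and the name `track_beats` are illustrative choices, not the authors' implementation.

```python
import math

def track_beats(onset, period, alpha=100.0):
    """Pick beat frames maximising onset strength plus a tempo-consistency score."""
    n = len(onset)
    score = list(onset)   # best cumulative score of a beat sequence ending at t
    pred = [-1] * n       # best predecessor of t, or -1 if t starts the sequence
    for t in range(n):
        lo, hi = t - 2 * period, t - period // 2
        for prev in range(max(lo, 0), max(hi, 0)):
            # assumed transition cost: zero when the gap equals the default period
            penalty = -alpha * math.log((t - prev) / period) ** 2
            cand = onset[t] + score[prev] + penalty
            if cand > score[t]:
                score[t], pred[t] = cand, prev
    # backtrace from the largest final score to get the beats
    t = max(range(n), key=lambda i: score[i])
    beats = []
    while t >= 0:
        beats.append(t)
        t = pred[t]
    return beats[::-1]
```

Given onsets at frames 10, 30, 70 and 90 with a period of 20 frames, the search also places a beat at frame 50, because one evenly spaced path scores higher than a path with a double-length gap.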
Beat Tracking Results
• DP will bridge gaps (non-causal)
    there is always a best path ...
• 2nd place in MIREX 2006 Beat Tracking
    compared to McKinney & Moelants human data
[Figure: beat tracking example, 182 to 192 on the time / sec axis]
Matching (1): Little Fragments
• Cover versions may change song structure
    multiple local matches at different alignments
• Match query and target as many small pieces?
    how do we combine individual scores?
    do we have all day?
[Figure: two panels with time axis in beats]
Matching (2): Global Correlation
• Cross-correlate entire beat-chroma matrices
    ... at all possible transpositions
    implicit combination of match quality and duration
• One good matching fragment is sufficient...?
[Figure: beat-chroma matrix (chroma A to G, beats @281 BPM) and its cross-correlation, -5 to +5 against skew / beats]
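A minimal sketch of cross-correlating two beat-chroma matrices at every skew and every transposition is given below. Each matrix is assumed to be a list of 12-element chroma vectors, one per beat. The correlation is left unnormalised for brevity, and the names `transpose`, `xcorr_at_skew` and `best_match` are hypothetical.

```python
def transpose(chroma, shift):
    """Rotate every 12-bin chroma vector up by `shift` semitones."""
    return [[frame[(b - shift) % 12] for b in range(12)] for frame in chroma]

def xcorr_at_skew(a, b, skew):
    """Sum of dot products between a[t] and b[t - skew] over the overlapping beats."""
    total = 0.0
    for t in range(len(a)):
        u = t - skew
        if 0 <= u < len(b):
            total += sum(x * y for x, y in zip(a[t], b[u]))
    return total

def best_match(a, b):
    """Search all 12 transpositions and all skews; return (score, shift, skew)."""
    best = (float("-inf"), 0, 0)
    for shift in range(12):
        b_shifted = transpose(b, shift)
        for skew in range(-(len(b) - 1), len(a)):
            score = xcorr_at_skew(a, b_shifted, skew)
            if score > best[0]:
                best = (score, shift, skew)
    return best
```

Because the score is a sum over overlapping beats, a long matching region scores higher than a short one of equal quality, which is the implicit combination of match quality and duration noted above.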
Filtered Cross-Correlation
• Raw correlation not as important as precise local match
    looking for large contrast at ±1 beat skew
    i.e. high-pass filter
[Figure: two cross-correlation panels against skew / beats]
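One simple way to emphasise contrast at ±1 beat skew is to subtract the mean of the two neighbouring values from each cross-correlation value. The kernel below is an illustrative high-pass filter chosen for brevity, not the filter used by the authors, and `highpass_skew` is a hypothetical name.

```python
def highpass_skew(xcorr):
    """Subtract the mean of the +/-1-beat neighbours from each cross-correlation value."""
    out = []
    for k in range(len(xcorr)):
        # at the edges, reuse the value itself in place of the missing neighbour
        left = xcorr[k - 1] if k > 0 else xcorr[k]
        right = xcorr[k + 1] if k + 1 < len(xcorr) else xcorr[k]
        out.append(xcorr[k] - 0.5 * (left + right))
    return out
```

A broad hump such as `[1, 2, 3, 4, 5, 4, 3, 2, 1]` and an isolated spike `[0, 0, 0, 0, 5, 0, 0, 0, 0]` have the same raw peak value of 5, but after filtering the hump peaks at 1 and the spike at 5, so the precise local match wins.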
Results (1): Ellis 23 set
• 23 pairs of cover songs from uspop2002 +...
    one correct match per query
[Figure: Query (rows) against Test (columns) match matrix; columns labelled Ab Ad Al Am Be Be Bl Ca Ce Cl Co Co Da En Fa Go Go Gr Hu I_ I_ Le Ta; rows labelled as follows]
Abracadabra/sugar_ray
Addicted_To_Love/tina_turner
All_Along_The_Watchtower/dave_matthews_band
America/simon_and_garfunkel
Before_You_Accuse_Me/eric_clapton
Between_The_Bars/glen_phillips
Blue_Collar_Man/styx
Caroline_No/brian_wilson
Cecilia/simon_and_garfunkel
Claudette/roy_orbison
Cocaine/nazareth
Come_Together/beatles
Day_Tripper/cheap_trick
Enjoy_The_Silence/tori_amos
Faith/limp_bizkit
God_Only_Knows/brian_wilson
Gold_Dust_Woman/sheryl_crow
Grand_Illusion/styx
Hush/milli_vanilli
I_Can_t_Get_No_Satisfaction/rolling_stones
I_Love_You/faith_hill
Let_It_Be/nick_cave
Take_Me_To_The_River/annie_lennox
Results (2): MIREX 06
• Cover song contest
    30 songs x 11 versions of each (!)
    (data has not been disclosed)
    # true covers in top 10
    8 systems compared (4 cover song + 4 similarity)
• Found 761/3300 = 23% recall
    next best: 11%
    guess: 3%
[Figure: MIREX 06 Cover Song Results: # Covers retrieved per song per system; systems CS, DE, KL1, KL2, KWL, KWT, LR, TP (cover song systems, similarity systems); other axis: song-set]
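The recall figures can be checked with a few lines of arithmetic. The sketch assumes that each of the 330 tracks is used as a query against the other 329, and that the guessing baseline is a random ranking; both are readings of the slide, not details it confirms.

```python
songs, versions, top_n = 30, 11, 10

queries = songs * versions                   # 330 tracks, each assumed to be a query
covers_per_query = versions - 1              # 10 other versions of the same song
possible_hits = queries * covers_per_query   # 3300 true covers to retrieve in total

recall = 761 / possible_hits
# chance level, assuming the top 10 are drawn at random from the other 329 tracks
chance = top_n / (queries - 1)

print(f"recall = {recall:.1%}, chance = {chance:.1%}")
# prints "recall = 23.1%, chance = 3.0%"
```

Under these assumptions the 3300 denominator and the 3% guessing level both follow from the contest layout.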
Where are the matches?
• Look inside global cross-correlation to find matching fragments...
    $\mathrm{xcorr} = \sum_t \sum_f C_1(t, f) \cdot C_2(t, f)$
    - view along time
[Figure: chroma panels and match contribution against time / beats]
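Viewing the cross-correlation along time amounts to keeping the sum over chroma bins but not the sum over beats. A minimal sketch follows, assuming the two matrices are lists of chroma vectors and that the best skew is already known; `match_profile` is a hypothetical name.

```python
def match_profile(c1, c2, skew=0):
    """Per-beat contribution sum_f C1(t, f) * C2(t - skew, f); 0.0 where there is no overlap."""
    profile = []
    for t in range(len(c1)):
        u = t - skew
        if 0 <= u < len(c2):
            profile.append(sum(x * y for x, y in zip(c1[t], c2[u])))
        else:
            profile.append(0.0)
    return profile
```

Summing the profile gives back the cross-correlation value at that skew, and runs of large values mark the beats where the two versions actually match.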
What are the mistakes?
• False reject - missed true match
    cover version is too different, beat tracking wrong ...
• False alarm - invalid match
    “Cocaine” (Clapton) vs. “Satisfaction” (Stones)
[Figure: Rolling Stones - Satisfaction - beats 1:1011; chroma panels (A to G) over 0 to 1000 beats]
• Beat-synchronous chroma features are successful for matching cover songs
    captures melody-harmony, not instruments
• Further uses: Beat-chroma fragments as musical building blocks
    e.g. VQ over large body of music
    find recurrent motifs
    artist identification?
• Code available!
    Google “matlab cover song”