Introduction to Onset Detection Functions
HAO-HSUN LI


Onset Detection

Onset
◦ The beginning of a musical note or other sound
◦ The amplitude rises from zero to an initial peak

Onset Detection Function
◦ Function peaks coincide with onsets
◦ Various methods exist

Short-Time Fourier Transform

$$X(n,k) = \sum_{m=-N/2}^{N/2-1} x(hn+m)\, w(m)\, e^{-\frac{2j\pi mk}{N}}$$

◦ $x$: signal
◦ $w$: Hamming window
◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size
◦ $h$: hop size
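The STFT definition above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `stft`, the default frame and hop sizes, and the zero-padding at the signal edges are assumptions made here. Bins are returned in FFT order ($k = 0, \dots, N-1$), which covers the same set of bins as $k = -N/2, \dots, N/2-1$ by periodicity.

```python
import numpy as np

def stft(x, frame_size=1024, hop_size=512):
    """Return X[n, k] with frames centred on sample h*n (hypothetical helper)."""
    N, h = frame_size, hop_size
    w = np.hamming(N)  # w(m) for m = -N/2 .. N/2-1
    # Assumption: zero-pad so that x(hn + m) exists for the first and last frames.
    padded = np.concatenate([np.zeros(N // 2),
                             np.asarray(x, dtype=float),
                             np.zeros(N // 2)])
    n_frames = 1 + len(x) // h
    X = np.empty((n_frames, N), dtype=complex)
    for n in range(n_frames):
        frame = padded[n * h : n * h + N] * w
        # ifftshift moves m = 0 to index 0, so the FFT matches the sum over
        # m = -N/2 .. N/2-1 without an extra (-1)^k phase factor.
        X[n] = np.fft.fft(np.fft.ifftshift(frame))
    return X
```

The functions in later sketches only need an array shaped like this output (frames along the first axis, bins along the second).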

Onset Detection Function

Energy
◦ Occurrence of an onset ↔ increase in the signal's amplitude

$$E(n) = \frac{1}{N} \sum_{m=-N/2}^{N/2-1} [x(n+m)]^2\, w(m)$$

◦ $n$: frame index
◦ $N$: frame size
◦ $w$: $N$-point window (smoothing kernel)
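A short sketch of the local energy $E(n)$, evaluated at every sample position $n$. The slide does not say which smoothing kernel is used, so the Hann window here is an assumption, as are the function name `local_energy` and the zero-padding at the edges.

```python
import numpy as np

def local_energy(x, frame_size=1024):
    """E(n) = (1/N) * sum_m x(n+m)^2 w(m), for m = -N/2 .. N/2-1."""
    N = frame_size
    w = np.hanning(N)  # assumption: the smoothing kernel is not specified in the source
    sq = np.concatenate([np.zeros(N // 2),
                         np.asarray(x, dtype=float) ** 2,
                         np.zeros(N // 2)])
    # sq[n : n + N] holds x(n+m)^2 for m = -N/2 .. N/2-1 (zero outside the signal).
    return np.array([np.dot(sq[n : n + N], w) for n in range(len(x))]) / N
```

For a step from silence to a constant amplitude, `np.diff(local_energy(x))` peaks at the step, which is why the derivative of the energy is often more useful for detection than the energy itself.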

Onset Detection Function

Energy
◦ Spectral-domain energy, computed from the STFT $X(n,k)$

$$\tilde{E}(n) = \frac{1}{N} \sum_{k=-N/2}^{N/2-1} W(k)\, |X(n,k)|^2$$

◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size
◦ $W(k)$: frequency-dependent weighting

High-Frequency Content
◦ Weights each bin's contribution in proportion to its frequency
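The weighted spectral energy and its high-frequency-content special case might look like this. The sketch assumes `X` is a complex array of shape (frames, $N$) in FFT bin order, and it takes $W(k) = |k|$ as the linear weighting; the slide only states that the weight is proportional to frequency, so the exact form is an assumption, as are the function names.

```python
import numpy as np

def weighted_spectral_energy(X, W=None):
    """E~(n) = (1/N) * sum_k W(k) |X(n,k)|^2; W defaults to all ones."""
    N = X.shape[1]
    if W is None:
        W = np.ones(N)
    return (np.abs(X) ** 2 @ W) / N

def high_frequency_content(X):
    """Weighted energy with W(k) = |k| (assumed linear weighting)."""
    N = X.shape[1]
    k = np.fft.fftfreq(N, d=1.0 / N)  # signed bin indices 0, 1, ..., -2, -1
    return weighted_spectral_energy(X, np.abs(k))
```

Two frames with the same total energy give the same unweighted value, but the frame whose energy sits in higher bins scores higher under the $|k|$ weighting.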

Onset Detection Function

Spectral Flux
◦ Measures the change in magnitude in each frequency bin

$$SF(n) = \sum_{k=-N/2}^{N/2-1} H\big(|X(n,k)| - |X(n-1,k)|\big)$$

◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size
◦ $H(x) = \frac{x + |x|}{2}$: half-wave rectifier function
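A minimal sketch of spectral flux, assuming `X` is a complex array of shape (frames, bins). The value for the first frame has no predecessor and is set to 0 here by assumption; the function names are illustrative.

```python
import numpy as np

def half_wave_rectify(x):
    """H(x) = (x + |x|) / 2: keeps positive values, zeroes negative ones."""
    return (x + np.abs(x)) / 2

def spectral_flux(X):
    """SF(n) = sum_k H(|X(n,k)| - |X(n-1,k)|)."""
    mag = np.abs(X)
    diff = np.diff(mag, axis=0)              # |X(n,k)| - |X(n-1,k)|
    sf = half_wave_rectify(diff).sum(axis=1)
    return np.concatenate([[0.0], sf])       # assumption: SF(0) = 0
```

Because of the rectifier, bins whose magnitude falls contribute nothing, so only increases in energy are counted.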

Onset Detection Function

Spectral Difference
◦ Distance between successive short-term Fourier spectra

$$SD(n) = \sum_{k=-N/2}^{N/2-1} \Big\{ H\big(|X(n,k)| - |X(n-1,k)|\big) \Big\}^2$$

◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size
◦ $H$: half-wave rectifier function
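The spectral difference differs from a plain sum of rectified differences only in squaring each rectified term. A small sketch, again assuming a complex `X` of shape (frames, bins) and setting the first frame to 0 by assumption:

```python
import numpy as np

def spectral_difference(X):
    """SD(n) = sum_k { H(|X(n,k)| - |X(n-1,k)|) }^2."""
    mag = np.abs(X)
    diff = np.diff(mag, axis=0)
    rectified = (diff + np.abs(diff)) / 2    # half-wave rectifier H
    sd = (rectified ** 2).sum(axis=1)
    return np.concatenate([[0.0], sd])       # assumption: SD(0) = 0
```

Squaring emphasises a large increase concentrated in a few bins over the same total increase spread across many bins.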

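Onset Detection Function

Phase
◦ Phase: $\psi(n,k)$, where $X(n,k) = |X(n,k)|\, e^{j\psi(n,k)}$
◦ Instantaneous frequency: $\psi'(n,k) = \psi(n,k) - \psi(n-1,k)$
◦ Change in instantaneous frequency: $\psi''(n,k) = \psi'(n,k) - \psi'(n-1,k)$

Phase Deviation

$$PD(n) = \frac{1}{N} \sum_{k=-N/2}^{N/2-1} |\psi''(n,k)|$$

◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size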
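A sketch of phase deviation, taking $\psi''$ as the second difference of the phase across frames. Wrapping the result to $[-\pi, \pi)$ is an assumption added here so that phase wrap-around in `np.angle` does not create spurious jumps; the first two frames, which lack enough history, are set to 0. The helper names are hypothetical.

```python
import numpy as np

def princarg(phi):
    """Wrap phase values to the interval [-pi, pi)."""
    return (phi + np.pi) % (2 * np.pi) - np.pi

def phase_second_difference(X):
    """psi''(n,k) = psi(n,k) - 2 psi(n-1,k) + psi(n-2,k), wrapped."""
    psi = np.angle(X)
    d2 = np.zeros_like(psi)
    d2[2:] = princarg(psi[2:] - 2 * psi[1:-1] + psi[:-2])
    return d2

def phase_deviation(X):
    """PD(n) = (1/N) * sum_k |psi''(n,k)|."""
    return np.abs(phase_second_difference(X)).mean(axis=1)
```

For a steady sinusoid the phase advances linearly from frame to frame, so $\psi''$ is close to zero; a sudden phase jump produces a burst of non-zero values in the frames that follow.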

Onset Detection Function

Weighted Phase Deviation
◦ Considers magnitude and phase jointly
◦ Significant improvement over unweighted phase deviation

$$WPD(n) = \frac{1}{N} \sum_{k=-N/2}^{N/2-1} |X(n,k)\, \psi''(n,k)|$$

Normalized Weighted Phase Deviation

$$NWPD(n) = \frac{\sum_{k=-N/2}^{N/2-1} |X(n,k)\, \psi''(n,k)|}{\sum_{k=-N/2}^{N/2-1} |X(n,k)|}$$
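Both weighted variants can be sketched in one function. As assumptions made for this illustration: $\psi''$ is computed as a wrapped second difference of the phase, the first two frames are set to 0, and a frame with zero total magnitude returns 0 in the normalized case rather than dividing by zero.

```python
import numpy as np

def weighted_phase_deviation(X, normalize=False):
    """WPD(n), or NWPD(n) when normalize=True, for complex X of shape (frames, bins)."""
    psi = np.angle(X)
    d2 = np.zeros_like(psi)
    raw = psi[2:] - 2 * psi[1:-1] + psi[:-2]
    d2[2:] = (raw + np.pi) % (2 * np.pi) - np.pi   # wrap to [-pi, pi)
    mag = np.abs(X)
    num = (mag * np.abs(d2)).sum(axis=1)           # sum_k |X(n,k) psi''(n,k)|
    if not normalize:
        return num / X.shape[1]
    den = mag.sum(axis=1)
    # Assumption: silent frames (den == 0) yield 0.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)
```

The magnitude weighting means that phase noise in bins with almost no energy no longer dominates the sum.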

Onset Detection Function

Complex Domain
◦ Considers amplitude and phase jointly
◦ Target value, assuming constant amplitude and constant rate of phase change:

$$X_T(n,k) = |X(n-1,k)|\, e^{j\left(\psi(n-1,k) + \psi'(n-1,k)\right)}$$

◦ Sum of absolute deviations from the target:

$$CD(n) = \sum_{k=-N/2}^{N/2-1} |X(n,k) - X_T(n,k)|$$

◦ $n$: frame index; $k$: frequency bin index
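A sketch of the complex-domain function. It assumes $\psi'(n-1,k) = \psi(n-1,k) - \psi(n-2,k)$, so the predicted phase is $2\psi(n-1,k) - \psi(n-2,k)$, and it sets the first two frames to 0 because no prediction is available there. The function name is illustrative.

```python
import numpy as np

def complex_domain(X):
    """CD(n) = sum_k |X(n,k) - X_T(n,k)| for complex X of shape (frames, bins)."""
    mag, psi = np.abs(X), np.angle(X)
    cd = np.zeros(X.shape[0])
    # Target: previous magnitude, phase extrapolated at the previous rate of change.
    target = mag[1:-1] * np.exp(1j * (2 * psi[1:-1] - psi[:-2]))
    cd[2:] = np.abs(X[2:] - target).sum(axis=1)
    return cd
```

A steady tone matches its prediction and gives values near zero; any change in amplitude or in phase advance, in either direction, moves the observed bin away from the target.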

Onset Detection Function

Rectified Complex Domain
◦ CD does not distinguish between increases and decreases in amplitude
◦ Onsets versus offsets

$$RCD(n) = \sum_{k=-N/2}^{N/2-1} RCD(n,k)$$

$$RCD(n,k) = \begin{cases} |X(n,k) - X_T(n,k)|, & \text{if } |X(n,k)| \geq |X(n-1,k)| \\ 0, & \text{otherwise} \end{cases}$$
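The rectified version keeps a bin's deviation only when its magnitude has not decreased. In this sketch the target $X_T$ is built from the previous magnitude and a phase extrapolated as $2\psi(n-1,k) - \psi(n-2,k)$, which is an assumption about how $\psi'$ is computed; the first two frames are set to 0.

```python
import numpy as np

def rectified_complex_domain(X):
    """RCD(n): complex-domain deviation summed over bins with rising magnitude."""
    mag, psi = np.abs(X), np.angle(X)
    rcd = np.zeros(X.shape[0])
    target = mag[1:-1] * np.exp(1j * (2 * psi[1:-1] - psi[:-2]))
    deviation = np.abs(X[2:] - target)
    rising = mag[2:] >= mag[1:-1]            # |X(n,k)| >= |X(n-1,k)|
    rcd[2:] = np.where(rising, deviation, 0.0).sum(axis=1)
    return rcd
```

With a tone that doubles in amplitude and then drops, only the frame with the increase produces a peak; the drop is suppressed.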

Onset Detection Function

Kullback-Leibler Divergence
◦ A measure of the information lost when $Q$ is used to approximate $P$

$$D_{KL}(P \parallel Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}$$

◦ $P$: true distribution
◦ $Q$: model or approximation of $P$

Kullback-Leibler
◦ Highlights positive amplitude changes

$$D_{KL}(n) = \sum_{k=-N/2}^{N/2-1} |X(n,k)| \log \frac{|X(n,k)|}{|X(n-1,k)|}$$

◦ $n$: frame index; $k$: frequency bin index
◦ $N$: frame size
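Both forms can be sketched briefly. The small constant `eps` added to the magnitudes is an assumption introduced here to avoid taking the logarithm of zero; it is not part of the formulas above. The first frame of the onset function is set to 0, and the function names are illustrative.

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_x P(x) log(P(x) / Q(x)), skipping terms with P(x) = 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def kl_onset(X, eps=1e-10):
    """D_KL(n) = sum_k |X(n,k)| log(|X(n,k)| / |X(n-1,k)|)."""
    mag = np.abs(X) + eps                    # assumption: eps guards against log(0)
    d = np.zeros(X.shape[0])
    d[1:] = (mag[1:] * np.log(mag[1:] / mag[:-1])).sum(axis=1)
    return d
```

Note that the unrectified form can go negative when magnitudes fall, as the second frame transition in a rise-then-fall sequence shows.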

Onset Detection Function

Modified Kullback-Leibler
◦ Removes the $|X(n,k)|$ weighting

$$D_{MKL}(n) = \sum_{k=-N/2}^{N/2-1} \log \frac{|X(n,k)|}{|X(n-1,k)|}$$

Rectified MKL (1)

$$D_{MKL}(n) = \sum_{k=-N/2}^{N/2-1} d(n,k)$$

$$d(n,k) = \begin{cases} \log \dfrac{|X(n,k)|}{|X(n-1,k)|}, & \text{if } |X(n,k)| \geq |X(n-1,k)| \\ 0, & \text{otherwise} \end{cases}$$

Rectified MKL (2)
◦ Avoids negative values
◦ Remains defined when a series of small values is encountered
◦ Prevents peaks at offsets

$$D_{MKL}(n) = \sum_{k=-N/2}^{N/2-1} \log \left( 1 + \frac{|X(n,k)|}{|X(n-1,k)| + \epsilon} \right)$$
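The three variants can be compared side by side with a sketch like the one below. The function names are hypothetical, the first frame is set to 0 in each case, and the `eps` used in the first two functions is an assumption added to avoid division by zero; only the third formula includes $\epsilon$ in the source, and its value is not given there.

```python
import numpy as np

def mkl(X, eps=1e-10):
    """Modified KL: sum_k log(|X(n,k)| / |X(n-1,k)|)."""
    mag = np.abs(X) + eps                    # assumption: guard against log(0)
    d = np.zeros(X.shape[0])
    d[1:] = np.log(mag[1:] / mag[:-1]).sum(axis=1)
    return d

def rectified_mkl_1(X, eps=1e-10):
    """Rectified MKL (1): keep the log ratio only where the magnitude did not fall."""
    mag = np.abs(X) + eps
    ratio = np.log(mag[1:] / mag[:-1])
    d = np.zeros(X.shape[0])
    d[1:] = np.where(mag[1:] >= mag[:-1], ratio, 0.0).sum(axis=1)
    return d

def rectified_mkl_2(X, eps=1e-6):
    """Rectified MKL (2): sum_k log(1 + |X(n,k)| / (|X(n-1,k)| + eps))."""
    mag = np.abs(X)
    d = np.zeros(X.shape[0])
    d[1:] = np.log(1 + mag[1:] / (mag[:-1] + eps)).sum(axis=1)
    return d
```

On a rise followed by a fall, `mkl` returns a negative value at the fall, `rectified_mkl_1` returns zero there, and `rectified_mkl_2` stays non-negative throughout while still being larger at the rise.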

References

Bello, J. P., Daudet, L., Abdallah, S., Duxbury, C., Davies, M., & Sandler, M. B. (2005). A tutorial on onset detection in music signals. IEEE Transactions on Speech and Audio Processing, 13(5), 1035-1047.

Dixon, S. (2006, September). Onset detection revisited. In Proceedings of the 9th International Conference on Digital Audio Effects (Vol. 120, pp. 133-137).

Hainsworth, S., & Macleod, M. (2003, September). Onset detection in musical audio signals. In Proceedings of the International Computer Music Conference (pp. 163-166).

Brossier, P. M. (2006). Automatic annotation of musical audio for interactive applications (Doctoral dissertation).