33
4/29/17 1 Computer Science CS 591 S1 – Computational Audio Lecture 18: Music Information Retrieval & Rhythm Overview of Music Information Retrieval Rhythm: Basic Notions Onset Detection Beat Detection Tempo Estimation Higher-level Rhythmic Patterns Wayne Snyder Computer Science Department Boston University Computer Science 2 Music Information Retrieval: Overview Music Information Retrieval is an interdisciplinary science which attempts to extract interesting and useful information from musical signals using computational tools. Researchers from Electrical Engineering, Computer Science, Musicology, Psychology, and Mathematics apply a variety of techniques in their scientific study of music. Some Important Areas of Current Interest: Automatic Feature Extraction (pitch, melody, harmony, rhythm, mood, genre, ...) Feature Tagging (automatic or manual “ground truth”) Beat Tracking and Rhythm Analysis Transcription (melody, chords, score) Multimodal Synchronization/Alignment Database Retrieval and search Fingerprinting and classification Similarity Structure Analysis Performance Analysis Applications include many software tools both for the professional community and for the consumer market.

CS 591 S1 – Computational Audio Detection.pdf · If I Had You (Benny Goodman) Shakuhachi Flute Liszt: Sonetto No. 104 Del Petrarca Where is the beat? Can you tap your foot to it?

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

  • 4/29/17

    1

    Computer Science

    CS 591 S1 – Computational Audio

    Lecture 18: Music Information Retrieval & Rhythm

    Overview of Music Information Retrieval

    Rhythm: Basic Notions

    Onset Detection

    Beat Detection

    Tempo Estimation

    Higher-level Rhythmic Patterns

    Wayne Snyder Computer Science Department

    Boston University

    Computer Science

    2

    Music Information Retrieval: Overview

    Music Information Retrieval is an interdisciplinary science which attempts to extract interesting and useful information from musical signals using computational tools. Researchers from Electrical Engineering, Computer Science, Musicology, Psychology, and Mathematics apply a variety of techniques in their scientific study of music. Some Important Areas of Current Interest:

    Automatic Feature Extraction (pitch, melody, harmony, rhythm, mood, genre, ...) Feature Tagging (automatic or manual “ground truth”) Beat Tracking and Rhythm Analysis Transcription (melody, chords, score) Multimodal Synchronization/Alignment Database Retrieval and search Fingerprinting and classification Similarity Structure Analysis Performance Analysis

    Applications include many software tools both for the professional community and for the consumer market.

  • 4/29/17

    2

    Computer Science

    Music Information Retrieval: Rhythm

    A good place to start with with Rhythm:

    Computer Science

    We could usefully divide the subject of rhythm into a hierarchy of levels, from the fastest to the slowest divisions of time. The basic beat is called the Tactus this is what most people would tap their foot to.

    Music Information Retrieval: Rhythm

  • 4/29/17

    3

    Computer Science

    Further Examples:

    If I Had You (Benny Goodman) Shakuhachi Flute Liszt: Sonetto No. 104 Del Petrarca

    Where is the beat? Can you tap your foot to it? What is the meter? How to find the underlying regular beat which is being varied by the composer and/or performer for expressive effect?

    Rhythm Analysis: Introduction

    Computer Science

    Further Examples:

    If I Had You (Benny Goodman) Shakuhachi Flute Liszt: Sonetto No. 104 Del Petrarca

    Where is the beat? Can you tap your foot to it? What is the meter? How to find the underlying regular beat which is being varied by the composer and/or performer for expressive effect?

    Rhythm Analysis: Introduction

  • 4/29/17

    4

    Computer Science

    Rhythm Analysis: Introduction

    Even when rhythm is regular, there is a complicated semantic

    problem: rhythm is hierarchical, consisting of many

    interrelated groupings:

    Pulse level: Measure

    Computer Science

    Rhythm Analysis: Introduction

    Pulse level: Tactus (beat)

  • 4/29/17

    5

    Computer Science

    Rhythm Analysis: Introduction

    Example: Happy Birthday to you

    Pulse level: Tatum (fastest unit of division)

    Note: “Tatum” was named after Art Tatum, one of the greatest of all jazz pianists, who played a lot of fast notes!

    Computer Science

    In a sophisticated piece of music, these various levels are exploited

    by the composer in complicated ways. How should it be notated

    and described precisely? What is the time signature? Example:

    Bach, WTC, Fugue #1 in C Major

    Rhythm Analysis: Introduction

  • 4/29/17

    6

    Computer Science

    Rhythm Analysis: Introduction

    §  Hierarchical levels often unclear

    §  Global/slow tempo changes (all musicians do this!)

    §  Local/sudden tempo changes (e.g. rubato)

    §  Vague information

    (e.g., soft onsets, false positives )

    §  Sparse information: not all beats occur!

    (often only note onsets are used)

    Challenges in beat tracking

    Computer Science

    §  Onset detection §  Beat tracking§  Tempo estimation

    Tasks

    Introduction

  • 4/29/17

    7

    Computer Science

    §  Onset detection §  Beat tracking§  Tempo estimation

    Tasks

    Tasks in Rhythm Analysis

    Computer Science

    period phase

    §  Onset detection §  Beat tracking§  Tempo estimation

    Tasks

    Tasks in Rhythm Analysis

  • 4/29/17

    8

    Computer Science

    Tempo := 60 / period Beats per minute (BPM)

    §  Onset detection §  Beat tracking§  Tempo estimation

    Tasks

    period

    Tasks in Rhythm Analysis

    Computer Science

    Onset Detection

    §  Finding start times of perceptually relevant acoustic events in music signal

    §  Onset is the time position where a note is played

    §  Onset typically goes along

    with a change of the signal s properties: §  energy or loudness §  pitch or harmony §  timbre

  • 4/29/17

    9

    Computer Science

    Onset Detection

    [Bello et al., IEEE-TASLP 2005]

    §  Finding start times of perceptually relevant acoustic events in music signal

    §  Onset is the time position where a note is played

    §  Onset typically goes along

    with a change of the signal s properties: –  energy or loudness –  pitch or harmony –  timbre

    Computer Science

    Steps

    Time (seconds)

    Waveform

    Onset Detection (Amplitude or Energy-Based)

  • 4/29/17

    10

    Computer Science

    Time (seconds)

    Squared waveform

    Steps 1.  Amplitude squaring (full-wave rectification of power signal)

    Onset Detection (Amplitude or Energy-Based)

    Computer Science Onset Detection (Amplitude or Energy-Based)

    Time (seconds)

    Steps 1.  Amplitude squaring (full-wave rectification of power signal) 2.  Windowing (taking mean or max in each window): “energy

    envelope”

  • 4/29/17

    11

    Computer Science Onset Detection (Energy-Based)

    Time (seconds)

    Steps 1.  Amplitude squaring (full-wave rectification of power signal) 2.  Windowing (taking mean or max in each window) : “energy

    envelope” 3.  Difference Function (using appropriate Distance Function):

    captures changes in signal energy: “novelty curve.”

    Computer Science Onset Detection (Energy-Based)

    Time (seconds)

    Steps 1.  Amplitude squaring (full-wave rectification of power signal) 2.  Windowing (taking mean or max in each window) : “energy

    envelope” 3.  Difference Function (using appropriate Distance Function):

    captures changes in signal energy: “novelty curve.” 4.  Half-wave Rectification (negative samples => 0.0): note onsets are

    indicated by increases in energy only.

  • 4/29/17

    12

    Computer Science Onset Detection (Energy-Based)

    Time (seconds)

    Steps 1.  Amplitude squaring 2.  Windowing 3.  Differentiation 4.  Half wave rectification 5.  Peak picking

    Peak positions indicate note onset candidates

    Computer Science

    Energy based methods work well for percussive instruments, including piano:

    Example: Bach Well-Tempered Clavier, Book 1,

    Fugue #1 in C major (Glenn Gould)

  • 4/29/17

    13

    Computer Science

    Onset Detection

    §  Energy curves often only work for percussive music

    §  Many instruments have weak note onsets: wind, strings, voice. §  Example: Shakuhachi Flute

    §  Biggest problem: pitch or timbre changes (corresponding to note onset) may not correlate with energy changes, e.g., a singer may change the loudness without changing pitch/note, or change pitch/note without appreciable change in loudness.

    §  More refined methods needed that capture changes in energy spread over the spectrum [Bello et al., IEEE-TASLP 2005]

    Computer Science

    1.  Spectrogram Magnitude spectrogram

    Freq

    uenc

    y (H

    z)

    Time (seconds)

    || X Steps: Onset Detection (Spectral-Based)

    §  Aspects concerning pitch, harmony, or timbre are captured by spectrogram

    §  Allows for detecting local energy changes in certain frequency ranges

  • 4/29/17

    14

    Computer Science

    Compressed spectrogram Y

    |)|1log( XCY ⋅+=

    Onset Detection (Spectral-Based)

    1.  Spectrogram 2.  Logarithmic compression

    Steps:

    §  Accounts for the human logarithmic sensation of sound intensity

    §  Dynamic range compression §  Enhancement of low-intensity

    values §  Often leading to enhancement

    of high-frequency spectrum Time (seconds)

    Freq

    uenc

    y (H

    z)

    Computer Science

    Spectral difference

    Onset Detection (Spectral-Based)

    1.  Spectrogram 2.  Logarithmic compression 3.  Differentiation

    Steps:

    §  First-order temporal difference

    §  Captures changes of the spectral content

    §  Only positive intensity changes considered

    Time (seconds)

    Freq

    uenc

    y (H

    z)

  • 4/29/17

    15

    Computer Science

    Spectral difference

    t Novelty curve

    Onset Detection (Spectral-Based)

    1.  Spectrogram 2.  Logarithmic compression 3.  Differentiation 4.  Accumulation: spectral

    differences summarized by a number.

    Steps:

    §  Frame-wise accumulation of all positive intensity changes

    §  Encodes changes of the spectral content

    Freq

    uenc

    y (H

    z)

    Computer Science

    30

    Digression: Difference/Distance Metrics

    One of the most important issues in analyzing data, especially, multi-dimension and/or time-series data, is understand how similar two pieces of data are (represented typically by a vector or multi-dimensional array). There are two principle methods for such comparisons: Distance Metrics: Similar data vectors are regarded as closer in a geometrical sense; the range is [0 .. ∞), where distance = 0 means the vectors are identical: Dependence Metrics: Similar data vectors exhibit dependence: they “move together” in similar ways; the range of the coefficients is [-1 .. 1]:

    D( a, b ) = “distance” between a and b

    b

    a

    -1 0 1 Inverse No Strong Dependence

  • 4/29/17

    16

    Computer Science

    31

    Distance Metrics

    A Distance Metric obeys typical geometric laws: A set with an associated Distance Metric is called a Metric Space.

    Computer Science

    32

    Distance Metrics

    A variety of metrics have been developed, from fields as diverse as game playing to pattern recognition, and the most important of these is as follows: Sum of Absolute Difference (Manhattan Distance): Sum of Squared Difference: Mean Absolute Error: Mean Squared Error: Euclidean Distance:

  • 4/29/17

    17

    Computer Science

    33

    Distance Metrics

    These measures extend our common understanding of the notion of distance to complex mathematical domains (such as vector spaces) and give us tools to understand how similar or dissimilar two objects are.

    Computer Science

    34

    Dependence Metrics

    Two common dependence metrics are as follows: Correlation (Pearson’s Product-Moment Correlation Coefficient): Correlation measures the linear dependence of two vectors or random variables X and Y. Cosine Similarity: Cosine similarity measures the cosine of the angle between two vectors of length N in N-dimensional space. NOTE that these are similar calculations, except that correlation subtracts the mean from each point. For musical signals of any length, the mean will be very close to 0, and so these are effectively the same.

  • 4/29/17

    18

    Computer Science

    35

    Distance Metrics

    Dependence metrics can be converted (almost) into distance metrics by the simple expediency of subtracting them from 1.0: Cosine Distance = 1.0 - Cosine Similarity Pearson’s Distance = 1.0 - Correlation Coefficient Now these are in the range [0..2], with 0 indicating the strongest possible dependence; these are not actually distance metrics, since a metric of 0 does not indicate identity, but just the strongest possible linear depedence; and the cosine distance does not satisfy the triangle inequality; however, this does not prevent them from being extremely useful!!

    Computer Science Onset Detection (Spectral-Based)

    1.  Spectrogram 2.  Logarithmic compression 3.  Differentiation 4.  Accumulation

    Steps:

    Novelty curve

  • 4/29/17

    19

    Computer Science

    Subtraction of local average

    Onset Detection (Spectral-Based)

    1.  Spectrogram 2.  Logarithmic compression 3.  Differentiation 4.  Accumulation 5.  Normalization

    Steps:

    Novelty curve

    Computer Science Onset Detection (Spectral-Based)

    1.  Spectrogram 2.  Logarithmic compression 3.  Differentiation 4.  Accumulation 5.  Normalization

    Steps:

    Normalized novelty curve

  • 4/29/17

    20

    Computer Science Onset Detection (Spectral-Based)

    1.  Spectrogram 2.  Logarithmic compression 3.  Differentiation 4.  Accumulation 5.  Normalization 6.  Peak picking

    Steps:

    Normalized novelty curve

    Computer Science Examples of Onset Detection:

    WTC Fugue #1 (Bach)A Smooth One (Benny Goodman)Doc‘s Guitar (Doc Watson)WTC Prelude #5 (Bach)Poulenc, Valse No.114Faure, Op.15, No.1

  • 4/29/17

    21

    Computer Science

    Beat and Tempo

    §  Steady pulse that drives music forward and provides the temporal framework of a piece of music

    §  Sequence of perceived pulses that are equally spaced in time

    §  The pulse a human taps along when listening to the music

    [Parncutt 1994]

    [Sethares 2007]

    [Large/Palmer 2002]

    [Lerdahl/ Jackendoff 1983]

    [Fitch/ Rosenfeld 2007]

    What is a beat?

    The term tempo then refers to the speed of the pulse.

    Computer Science

    Beat and Tempo

    §  Analyze the novelty curve with respect to reoccurring or quasi-periodic patterns

    §  Avoid the explicit determination of note onsets (no peak picking)

    Strategy

  • 4/29/17

    22

    Computer Science

    Beat and Tempo

    Strategy

    §  Autocorrelation §  Fourier transfrom

    Methods

    §  Analyze the novelty curve with respect to reoccurring or quasi-periodic patterns—as if it were a musical signal and you are trying to find the component pitches (= periodic patterns of the novelty curve)

    §  Avoid the explicit determination of note onsets (no peak picking)

    Computer Science

    Definition: A tempogram is a time-tempo representation that encodes the local tempo of a music signal over time (= spectrograph of novelty curve!).

    Tem

    po (B

    PM

    )

    Time (seconds)

    Inte

    nsity

    Tempogram

  • 4/29/17

    23

    Computer Science

    Definition: A tempogram is a time-tempo represenation that encodes the local tempo of a music signal over time (= spectrograph of novelty curve!).

    §  Compute a spectrogram (STFT) of the novelty curve §  Convert frequency axis (given in Hertz) into

    tempo axis (given in BPM) §  Magnitude spectrogram indicates local tempo

    Fourier-based method

    Tempogram (Fourier)

    Computer Science

    Tem

    po (B

    PM

    )

    Time (seconds)

    Tempogram (Fourier)

    Novelty curve

  • 4/29/17

    24

    Computer Science

    Tem

    po (B

    PM

    ) Tempogram (Fourier)

    Novelty curve (local window)

    Time (seconds)

    Computer Science

    Tem

    po (B

    PM

    )

    Hann-windowed sinusoidal

    Tempogram (Fourier)

    Time (seconds)

  • 4/29/17

    25

    Computer Science

    Tem

    po (B

    PM

    )

    Hann-windowed sinusoidal

    Tempogram (Fourier)

    Time (seconds)

    Computer Science

    Tem

    po (B

    PM

    )

    Tempogram (Fourier)

    Hann-windowed sinusoidal

    Time (seconds)

  • 4/29/17

    26

    Computer Science

    Definition: A tempogram is a time-tempo represenation that encodes the local tempo of a music signal over time (= spectrograph of novelty curve!).

    §  Compare novelty curve with time-lagged local sections of itself

    §  Convert lag-axis (given in seconds) into tempo axis (given in BPM)

    §  Autocorrelogram indicates local tempo

    Autocorrelation-based method (cf. pitch determination algorithm).

    Tempogram (Autocorrelation)

    Computer Science Tempogram (Autocorrelation)

    Novelty curve (local window)

    Lag

    (sec

    onds

    )

    Time (seconds)

  • 4/29/17

    27

    Computer Science Tempogram (Autocorrelation)

    Windowed autocorrelation

    Lag

    (sec

    onds

    )

    Computer Science Tempogram (Autocorrelation)

    Lag = 0 (seconds)

    Lag

    (sec

    onds

    )

  • 4/29/17

    28

    Computer Science Tempogram (Autocorrelation)

    Lag = 0.26 (seconds)

    Lag

    (sec

    onds

    )

    Computer Science Tempogram (Autocorrelation)

    Lag = 0.52 (seconds)

    Lag

    (sec

    onds

    )

  • 4/29/17

    29

    Computer Science Tempogram (Autocorrelation)

    Lag = 0.78 (seconds)

    Lag

    (sec

    onds

    )

    Computer Science Tempogram (Autocorrelation)

    Lag = 1.56 (seconds)

    Lag

    (sec

    onds

    )

  • 4/29/17

    30

    Computer Science Tempogram (Autocorrelation)

    Time (seconds)

    Time (seconds)

    Lag

    (sec

    onds

    )

    Computer Science

    300

    60

    80

    40

    30

    120

    Tempogram (Autocorrelation)

    Tem

    po (B

    PM

    )

    Time (seconds)

    Time (seconds)

  • 4/29/17

    31

    Computer Science

    600

    500

    400

    300

    200

    100

    Tempogram (Autocorrelation) Te

    mpo

    (BP

    M)

    Time (seconds)

    Time (seconds)

    Computer Science

    Time (seconds)

    Tempogram Fourier Autocorrelation

    Time (seconds)

    Tem

    po (B

    PM

    )

  • 4/29/17

    32

    Computer Science Tempogram Fourier Autocorrelation

    210

    70

    Tem

    po (B

    PM

    )

    Tempo@Tatum = 210 BPM Tempo@Measure = 70 BPM Time (seconds) Time (seconds)

    Computer Science Tempogram

    Fourier Autocorrelation

    Time (seconds) Time (seconds)

    Tem

    po (B

    PM

    )

    Time (seconds)

    Emphasis of tempo harmonics (integer multiples)

    Emphasis of tempo subharmonics (integer fractions)

    [Grosche et al., ICASSP 2010] [Peeters, JASP 2007]

  • 4/29/17

    33

    Computer Science Tempogram (Summary)

    Fourier Autocorrelation

    Novelty curve is compared with sinusoidal kernels each representing a specific tempo

    Novelty curve is compared with time-lagged local (windowed) sections of itself

    Convert frequency (Hertz) into tempo (BPM)

    Convert time-lag (seconds) into tempo (BPM)

    Reveals novelty periodicities Reveals novelty self-similarities

    Emphasizes harmonics Emphasizes subharmonics

    Granularity increases as tempo increases; Suitable to analyze tempo on tatum and tactus level

    Granularity increases as tempo decreases; Suitable to analyze tempo on tatum and measure level