CSC475 Music Information Retrieval Sinusoids and DSP notation George Tzanetakis University of Victoria 2014 G. Tzanetakis 1 / 38

Source: marsyas.cs.uvic.ca/mirBook/csc475_sinusoids.pdf


CSC475 Music Information Retrieval

Sinusoids and DSP notation

George Tzanetakis

University of Victoria

2014


Table of Contents I

1 Time and Frequency

2 Sinusoids and Phasors


Motivation

Frequently the computer science students who take this course have no background in Digital Signal Processing (DSP), so I always try to do a few lectures introducing some DSP fundamentals. An introduction to DSP typically requires an entire course, and learning DSP is a lifelong pursuit, so what one can do in a few lectures is rather limited. My goal is to stress intuition and attempt to demystify the basics of the mathematical notation used. In addition, DSP contains some beautiful mathematical ideas that connect the continuous mathematics of the physical world with the discrete mathematics needed by computers. I hope this material will motivate you to learn more about DSP.


Digital Audio Recordings

Recordings in analog media (like vinyl or magnetic tape) degrade over time

Digital audio representations can, in theory, remain accurate without any loss of information through copying of patterns of bits.

MIR requires distilling information from an extremely large amount of data

Digitally storing 3 minutes of audio requires approximately 16 million numbers (for example, stereo audio at 44,100 samples per second per channel). A tempo extraction program must somehow convert these to a single numerical estimate of the tempo.
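The "16 million numbers" figure can be checked with a line of arithmetic. The sketch below assumes CD-quality stereo audio (44,100 samples per second per channel, two channels); the slide does not state which format it has in mind, so these parameters are an assumption.

```python
# Assumed CD-quality parameters (not stated on the slide).
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
CHANNELS = 2              # stereo
DURATION_S = 3 * 60       # three minutes, in seconds

num_samples = SAMPLE_RATE_HZ * CHANNELS * DURATION_S
print(num_samples)  # 15876000, i.e. roughly 16 million numbers
```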


Production and Perception of Periodic Sounds

Animal sound generation and perception

The sound generation and perception systems of animals have evolved to help them survive in their environment. From an evolutionary perspective the intentional sounds generated by animals should be distinct from the random sounds of the environment.

Repetition

Repetition is a key property of sounds that can make them more identifiable as coming from other animals (predators, prey, potential mates) and therefore animal hearing systems have evolved to be good at detecting periodic sounds.


Pitch Perception

Pitch

When the same sound is repeated more than 10-20 times per second, instead of being perceived as a sequence of individual sound events it is fused into a single sonic event with a property we call pitch, which is related to the underlying period of repetition. Note that this fusion is something that our perception does; it does not reflect any underlying change in the signal other than the decrease of the repetition period.


Time-Frequency Representations

Music Notation

When listening to mixtures of sounds (including music) we are interested in when specific sounds take place (time) and what is their source of origin (pitch, timbre). This is also reflected in music notation, which fundamentally represents time from left to right and pitch from bottom to top.


Spectrum

Informal definition of Spectrum

A fundamental concept in DSP is the notion of a spectrum. Informally, complex sounds such as the ones produced by musical instruments and their combinations can be modeled as linear combinations of simple elementary sinusoidal signals with different frequencies. A spectrum shows how “much” each such basis sinusoidal component contributes to the overall mixture. It can be used to extract information about the sound such as its perceived pitch or what instrument(s) are playing. A spectrum corresponds to a short snapshot of the sound in time.
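As a rough illustration of "how much each sinusoid contributes", the sketch below builds a mixture of two sinusoids and measures it with a direct discrete Fourier transform. The function name `dft_magnitudes`, the window length and the chosen frequencies are assumptions made for this example; practical code would use an FFT routine rather than this slow direct sum.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Return |X[k]| for k = 0 .. N/2 using the direct O(N^2) DFT sum."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(signal[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        mags.append(abs(acc))
    return mags

N = 64  # window length in samples (assumed)
# Mixture: a sinusoid with 4 cycles per window plus a weaker one with 10 cycles.
mixture = [math.sin(2 * math.pi * 4 * i / N) + 0.5 * math.sin(2 * math.pi * 10 * i / N)
           for i in range(N)]

spectrum = dft_magnitudes(mixture)
# Indices of the two strongest components, strongest first.
peaks = sorted(range(len(spectrum)), key=lambda k: spectrum[k], reverse=True)[:2]
print(peaks)  # [4, 10]
```

The magnitude at bin 4 is twice the magnitude at bin 10, matching the 1.0 and 0.5 weights used to build the mixture.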


Spectrum example

Spectrum of a tenor saxophone note


Spectrograms

Spectrograms

Music and sound change over time. A spectrum does not provide any information about the time evolution of different frequencies. It just shows the relative contribution of each frequency to the mixture signal over the duration analyzed. In order to capture the time evolution of sound and music, the standard approach is to segment the audio signal into small chunks (called windows or frames) and calculate the spectrum for each of these windows. The assumption is that during the relatively short period of analysis (typically less than a second) there is not much change and therefore the calculated short-time spectrum is an accurate representation of the underlying signal. The resulting sequence of spectra over time is called a spectrogram.
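The frame-by-frame procedure can be sketched in a few lines. This is a simplified illustration, not a production spectrogram: it assumes non-overlapping frames and applies no window function, and the names `magnitude_spectrum` and `spectrogram` as well as the test signal are made up for the example.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0 .. N/2 of one frame (direct sum, no FFT)."""
    n = len(frame)
    return [abs(sum(frame[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n)))
            for k in range(n // 2 + 1)]

def spectrogram(signal, frame_size):
    """One spectrum per non-overlapping frame (simplification: no overlap, no windowing)."""
    return [magnitude_spectrum(signal[start:start + frame_size])
            for start in range(0, len(signal) - frame_size + 1, frame_size)]

FRAME = 32
# A "note change": four frames at 2 cycles per frame, then four at 6 cycles per frame.
low = [math.sin(2 * math.pi * 2 * i / FRAME) for i in range(4 * FRAME)]
high = [math.sin(2 * math.pi * 6 * i / FRAME) for i in range(4 * FRAME)]

spec = spectrogram(low + high, FRAME)
dominant_bins = [max(range(len(s)), key=lambda k: s[k]) for s in spec]
print(dominant_bins)  # [2, 2, 2, 2, 6, 6, 6, 6]
```

A single spectrum of the whole signal would show both frequencies but not the order in which they occur; the sequence of short-time spectra recovers that ordering.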


Examples of spectrograms

Spectrogram of a few tenor saxophone notes


Waterfall spectrogram view

Waterfall display using sndpeek


Table of Contents I

1 Time and Frequency

2 Sinusoids and Phasors


Why is DSP important for MIR?

A large amount of MIR research deals with audio signals.

Audio signals are represented digitally as very long sequences of numbers.

Digital Signal Processing techniques are essential in extracting information from audio signals.

The mathematical ideas behind DSP are amazing. For example, it is through DSP that you can understand how any sound that you can hear can be expressed as a sum of sine waves or represented as a long sequence of 1’s and 0’s.


DSP for MIR

Digital Signal Processing is a large field and therefore impossible to cover adequately in this course. The main goal of the lectures focusing on DSP will be to provide you with some intuition behind the main concepts and techniques that form the foundation of many MIR algorithms. I hope that they serve as a seed for growing a long-term passion and interest for DSP, and the textbook provides some pointers for further reading.


Sinusoids

We start our exposition by discussing sinusoids, which are elementary signals that are crucial in understanding both DSP concepts and the mathematical notation used to understand them. The ultimate goal of the DSP lectures is to make equations such as the following less intimidating and more meaningful:

\[ X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2 \pi f t}\, dt \tag{1} \]


What is a sinusoid?

Sinusoids are a family of elementary signals that have a particular shape/pattern of repetition. sin(ωt) and cos(ωt) are particular examples of sinusoids that can be described by the more general equation:

\[ x(t) = \sin(\omega t + \phi) \tag{2} \]

where ω is the frequency and φ is the phase. There is an infinite number of continuous periodic signals that belong to the sinusoid family. Each is characterized by three numbers: the amplitude, the frequency and the phase. Equation (2) shows the unit-amplitude case; scaling it by an amplitude A gives A sin(ωt + φ).
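A small sketch of the three-parameter family may help. The function name `sinusoid` and the parameter values are assumptions chosen for illustration; the checks show that cos(ωt) is the member with phase π/2 and that every member repeats with period 2π/ω.

```python
import math

def sinusoid(amplitude, omega, phase, t):
    """Value at time t of amplitude * sin(omega * t + phase)."""
    return amplitude * math.sin(omega * t + phase)

omega = 2 * math.pi * 5     # 5 cycles per second, in radians per second (assumed)
t = 0.3

# cos(ωt) is the sinusoid with unit amplitude and phase π/2.
cos_gap = abs(sinusoid(1.0, omega, math.pi / 2, t) - math.cos(omega * t))

# Any member of the family repeats after one period T = 2π/ω.
period = 2 * math.pi / omega
period_gap = abs(sinusoid(2.0, omega, 0.7, t) - sinusoid(2.0, omega, 0.7, t + period))

print(cos_gap < 1e-9, period_gap < 1e-9)  # True True
```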


Figure: Simple sinusoids


4 motivating viewpoints for sinusoids

Solutions to the differential equations that describe simple systems of vibration

Family of signals that pass “unchanged” through LTI systems

Phasors (rotating vectors) providing geometric intuition about DSP concepts and notation

Basis functions of the Fourier Transform


Simple vibration I

Consider striking the tine of a tuning fork. The tine will deform, then be restored to its original position; inertia will make it overshoot and deform in the other direction, and the pattern will repeat. At any particular displacement x the restoring force is proportional to the displacement, so Newton’s second law gives:

\[ F = ma = -kx \tag{3} \]

The acceleration a is the second derivative of the displacement x with respect to time t, so the equation can be rewritten:

\[ \frac{d^2 x}{dt^2} = -(k/m)\,x \tag{4} \]


Sinusoids satisfy the equation

We are looking for a signal x(t) that satisfies the equation describing simple vibrations, i.e., a signal that is proportional to its own second derivative.

\[ \frac{d}{dt}\sin(\omega t) = \omega\cos(\omega t) \]

\[ \frac{d^2}{dt^2}\sin(\omega t) = -\omega^2\sin(\omega t) \tag{5} \]

Comparing (5) with (4), sin(ωt) satisfies the vibration equation when ω² = k/m. So it turns out that sinusoidal signals arise as the solutions to the physics equations that describe simple systems of vibration that can potentially generate sound.
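One way to see this without calculus is to simulate the vibration equation step by step and compare the result with a sinusoid. The sketch below is an illustration under assumed values: k/m is chosen so that ω = 2π rad/s, the mass is released from rest at unit displacement, and the integration uses the velocity Verlet scheme (a standard numerical method, not something taken from the slides).

```python
import math

k_over_m = (2 * math.pi) ** 2     # assumed so that ω = sqrt(k/m) = 2π rad/s
omega = math.sqrt(k_over_m)
dt = 1e-4                         # time step in seconds (assumed)

x, v = 1.0, 0.0                   # released from rest at unit displacement
max_error = 0.0
for step in range(1, 10001):      # simulate one second, i.e. one full period
    a = -k_over_m * x             # acceleration from d²x/dt² = -(k/m) x
    x_new = x + v * dt + 0.5 * a * dt * dt
    a_new = -k_over_m * x_new
    v = v + 0.5 * (a + a_new) * dt
    x = x_new
    # With these initial conditions the analytic solution is cos(ωt).
    max_error = max(max_error, abs(x - math.cos(omega * step * dt)))

print(max_error < 1e-4)  # True: the simulated motion tracks the sinusoid
```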


Linear Time Invariant Systems

Definition

Systems are transformations of signals. They take as input a signal x(t) and produce a corresponding output signal y(t). Example: y(t) = [x(t)]^2 + 5.

LTI Systems

Linearity means that one can calculate the output of the system to the sum of two input signals by summing the system outputs for each input signal individually. Formally, if y1(t) = S{x1(t)} and y2(t) = S{x2(t)} then S{x1(t) + x2(t)} = ysum(t) = y1(t) + y2(t). Time invariance means that a shift in the input results in the same shift in the output: if y(t) = S{x(t)} then S{x(t − τ)} = y(t − τ).
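The additivity condition can be tried out numerically on discrete signals. In the sketch below, `square_plus_five` is the example system from the definition above, while `two_point_average` is a hypothetical LTI system added for contrast; `is_additive` only tests one pair of inputs, so a True result is evidence rather than proof.

```python
def square_plus_five(x):
    """The example system y = x^2 + 5, applied sample by sample."""
    return [v * v + 5 for v in x]

def two_point_average(x):
    """Hypothetical LTI example: y[n] = (x[n] + x[n-1]) / 2, with x[-1] taken as 0."""
    return [(x[n] + (x[n - 1] if n > 0 else 0.0)) / 2 for n in range(len(x))]

def is_additive(system, x1, x2, tol=1e-12):
    """Check S{x1 + x2} == S{x1} + S{x2} for this particular pair of inputs."""
    out_of_sum = system([a + b for a, b in zip(x1, x2)])
    sum_of_outs = [a + b for a, b in zip(system(x1), system(x2))]
    return all(abs(a - b) <= tol for a, b in zip(out_of_sum, sum_of_outs))

x1 = [1.0, 2.0, -1.0, 0.5]
x2 = [0.0, -3.0, 4.0, 1.0]
print(is_additive(two_point_average, x1, x2))  # True
print(is_additive(square_plus_five, x1, x2))   # False
```

The second result shows that the squaring example is a system in the general sense but not a linear one.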


Sinusoids and LTI Systems

When a sinusoid of frequency ω goes through an LTI system it “stays” in the family of sinusoids of frequency ω, i.e., only the amplitude and the phase are changed by the system. Because of linearity this implies that if a complex signal is a sum of sinusoids of different frequencies then the system output will not contain any new frequencies. The behavior of the system can be completely understood by simply analyzing how it responds to elementary sinusoids. Examples of LTI systems in music: guitar body, vocal tract, outer ear, concert hall.
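A minimal numerical check of this property, using a two-point averaging filter as an assumed stand-in for an LTI system: for the input cos(ωn), the trigonometric sum identity predicts the output cos(ω/2) · cos(ωn − ω/2), which is a sinusoid of the same frequency with a smaller amplitude and a shifted phase. The test frequency is an arbitrary choice.

```python
import math

omega = 2 * math.pi / 16   # radians per sample (assumed test frequency)

# Input samples x[-1], x[0], ..., x[63]; starting at n = -1 avoids a start-up transient.
x = [math.cos(omega * n) for n in range(-1, 64)]

# Two-point average y[n] = (x[n] + x[n-1]) / 2 for n = 0 .. 63.
y = [(x[i] + x[i - 1]) / 2 for i in range(1, len(x))]

# Prediction: same frequency, amplitude scaled by cos(ω/2), phase shifted by -ω/2.
gain = math.cos(omega / 2)
phase_shift = -omega / 2
predicted = [gain * math.cos(omega * n + phase_shift) for n in range(64)]

max_diff = max(abs(a - b) for a, b in zip(y, predicted))
print(max_diff < 1e-12)  # True
```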


Thinking in circles

Key insight

Think of a sinusoidal signal as a vector rotating at a constant speed in the plane (a phasor) rather than a single-valued signal that goes up and down.

Amplitude = Length

Frequency = Speed

Phase = Angle at time t


Projecting a phasor

The projection of the rotating vector or phasor on the x-axis is a cosine wave and on the y-axis a sine wave.


Notating a phasor

Complex numbers

An elegant notation system for describing and manipulating rotating vectors.

x + jy

where x is called the real part and y is called the imaginary part. If we represent a sinusoid as a rotating vector then using complex number notation we can simply write:

cos(ωt) + j sin(ωt)
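Python's built-in complex type makes this notation concrete (Python writes the imaginary unit as `1j`). The helper name `phasor` and the numbers below are assumptions for illustration: the real part of the rotating vector traces the cosine wave, the imaginary part traces the sine wave, and the length stays equal to the amplitude.

```python
import cmath
import math

def phasor(amplitude, omega, phase, t):
    """Tip of the rotating vector at time t, as a complex number."""
    return cmath.rect(amplitude, omega * t + phase)

omega = 2 * math.pi                  # one rotation per second (assumed)
p = phasor(2.0, omega, 0.0, 0.125)   # after one eighth of a rotation

x_projection = p.real   # 2 cos(π/4): the cosine wave sampled at t = 0.125
y_projection = p.imag   # 2 sin(π/4): the sine wave sampled at t = 0.125
length = abs(p)         # stays equal to the amplitude as the vector rotates

print(round(x_projection, 6), round(y_projection, 6), round(length, 6))
```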


Multiplication by j

Multiplication by j is an operation of rotation in the plane. You can think of it as a rotation by +90 degrees (counter-clockwise). Two successive rotations by +90 degrees bring us to the negative real axis, hence j² = −1. This geometric viewpoint shows that there is nothing imaginary or strange about complex numbers.
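The rotation can be observed directly with Python complex numbers. The starting vector 3 + j is an arbitrary choice for this sketch.

```python
import cmath
import math

z = complex(3, 1)        # arbitrary starting vector (3, 1)
rotated = 1j * z         # multiply by j

print(rotated)           # (-1+3j): the vector (3, 1) turned into (-1, 3)
print(1j * 1j)           # (-1+0j): two quarter turns land on the negative real axis

# The angle grows by a quarter turn while the length is unchanged.
angle_change = cmath.phase(rotated) - cmath.phase(z)
print(math.isclose(angle_change, math.pi / 2), math.isclose(abs(rotated), abs(z)))
```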


Complex number multiplication

Complex number addition is the same as vector addition, i.e., we add the x-coordinates (real parts) and the y-coordinates (imaginary parts). Where complex numbers draw their power is when they are multiplied. Complex number multiplication can be done by following the rules of algebra blindly, and replacing j² with −1 when needed. However, complex number multiplication makes more sense when we represent the complex numbers as vectors in polar form. When represented in polar form, complex number multiplication has the property that the magnitude of the product is the product of the magnitudes and the angle of the product is the sum of the angles. This is the underlying reason why complex numbers are a great notation for dealing with rotations.
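A quick numerical check of the polar-form rule, with magnitudes and angles picked arbitrarily for the example:

```python
import cmath
import math

a = cmath.rect(2.0, math.radians(30))   # magnitude 2, angle 30 degrees
b = cmath.rect(3.0, math.radians(45))   # magnitude 3, angle 45 degrees

product = a * b                          # ordinary complex multiplication
magnitude, angle = cmath.polar(product)

# Magnitudes multiply (2 * 3) and angles add (30 + 45 degrees).
print(round(magnitude, 6), round(math.degrees(angle), 6))  # 6.0 75.0
```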


Euler’s formula

Key insight

The rotating vector that represents a sinusoid is just a single complex number raised to progressively higher and higher powers.

Consider a rotating vector of unit magnitude. Let E(θ) be the function that represents the vector at some arbitrary angle θ. Then from simple geometry:

\[ E(\theta) = \cos(\theta) + j\sin(\theta) \]

and

\[ \frac{dE(\theta)}{d\theta} = -\sin(\theta) + j\cos(\theta) = jE(\theta) \]


As can be seen, this is a function whose derivative is proportional to the function itself, and from calculus we know that only the exponential function has this property. Since E(0) = 1, we can write our function E(θ) as:

\[ E(\theta) = e^{j\theta} \tag{6} \]

So now we can express the fact that a rotating vector arising from simple harmonic motion can be notated as a complex number raised to higher and higher powers, using the famous Euler formula, named after the Swiss mathematician Leonhard Euler (1707-1783):

\[ e^{j\theta} = \cos(\theta) + j\sin(\theta) \tag{7} \]
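Both the formula and the "higher and higher powers" picture can be checked numerically. The angle 0.7 and the choice of eight steps per rotation are arbitrary values for this sketch.

```python
import cmath
import math

theta = 0.7                                        # arbitrary angle in radians
lhs = cmath.exp(1j * theta)                        # e^{jθ}
rhs = complex(math.cos(theta), math.sin(theta))    # cos θ + j sin θ
print(abs(lhs - rhs) < 1e-12)                      # True

# A single complex number: one eighth of a turn around the unit circle.
step = cmath.exp(1j * 2 * math.pi / 8)

# Raising it to successive powers walks the phasor around the circle.
positions = [step ** n for n in range(9)]
print(abs(positions[2] - 1j) < 1e-12)   # True: two steps reach j (a quarter turn)
print(abs(positions[8] - 1) < 1e-12)    # True: eight steps complete the circle
```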


Complex Conjugate

A bit of notation that will be used later. Given a complex number z = R e^{jθ}, its complex conjugate is defined as z* = R e^{−jθ}. Geometrically, z* is the reflection of z in the real axis.
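In Python the conjugate is available as the `conjugate()` method of complex numbers. The sketch below, with arbitrary values for R and θ, checks the definition and also shows one reason the notation is useful later: averaging a unit phasor with its conjugate cancels the imaginary parts and leaves cos(θ).

```python
import cmath
import math

R, theta = 2.0, 0.9                      # arbitrary magnitude and angle
z = R * cmath.exp(1j * theta)            # z  = R e^{jθ}
z_conj = R * cmath.exp(-1j * theta)      # z* = R e^{-jθ}

# The built-in conjugate flips the sign of the imaginary part (reflection in the real axis).
print(abs(z.conjugate() - z_conj) < 1e-12)   # True

# Averaging a unit phasor with its conjugate leaves a real cosine.
unit = cmath.exp(1j * theta)
average = (unit + unit.conjugate()) / 2
print(abs(average - math.cos(theta)) < 1e-12)  # True
```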


Adding sinusoids of the same frequency I


Adding sinusoids of the same frequency II

Geometric view of the property that sinusoids (phasors) of a particular frequency ω are closed under addition.
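The closure property can be verified numerically: adding the two phasors as complex numbers (vector addition) gives the amplitude and phase of the single sinusoid that equals the sum. The amplitudes, phases and frequency below are arbitrary choices for this sketch.

```python
import cmath
import math

omega = 2 * math.pi * 3            # shared frequency in radians per second (assumed)
a1, phi1 = 1.0, 0.3                # first sinusoid: amplitude and phase
a2, phi2 = 0.5, 1.9                # second sinusoid: amplitude and phase

# Add the two phasors at t = 0 as complex numbers.
total = cmath.rect(a1, phi1) + cmath.rect(a2, phi2)
amp, phase = cmath.polar(total)

# Compare the sample-by-sample sum with the single predicted sinusoid.
times = [i / 100 for i in range(100)]
max_diff = max(
    abs(a1 * math.cos(omega * t + phi1) + a2 * math.cos(omega * t + phi2)
        - amp * math.cos(omega * t + phase))
    for t in times
)
print(max_diff < 1e-12)  # True
```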


Negative frequencies and phasors


Measuring the amplitude of a sinusoid


Other DSP concepts with phasors

Many DSP concepts can be visualized and understood nicely using phasors. It is fun to create animations similar to the ones I showed in this lecture to illustrate concepts such as:

Sampling, Nyquist frequency and aliasing (taking snapshots of the phasor as it goes around the circle)

Filtering (effect of simple low-pass filter)

Beating (phasors that are close in frequency)
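As a starting point for the first item, the sketch below takes "snapshots" of two phasors at a fixed sampling rate. The sampling rate of 8 Hz and the frequencies 1 Hz and 9 Hz are assumed values: because the two frequencies differ by exactly the sampling rate, the faster phasor makes one extra full rotation between snapshots and the samples come out identical, which is aliasing.

```python
import cmath
import math

fs = 8.0             # sampling rate in Hz (assumed)
f = 1.0              # phasor frequency in Hz (assumed)
alias = f + fs       # 9 Hz: one extra full rotation between consecutive snapshots

# Snapshots of each phasor at sample times n / fs.
samples_f = [cmath.exp(2j * math.pi * f * n / fs) for n in range(16)]
samples_alias = [cmath.exp(2j * math.pi * alias * n / fs) for n in range(16)]

max_diff = max(abs(a - b) for a, b in zip(samples_f, samples_alias))
print(max_diff < 1e-9)  # True: the two phasors are indistinguishable from their samples
```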


Book that inspired this DSP exposition

A Digital Signal Processing Primer by Ken Steiglitz


Summary

Sinusoidal signals are fundamental in understanding DSP

Representing them as phasors (i.e., vectors rotating at a constant speed) can help understand intuitively several concepts in DSP

Complex numbers are an elegant system for expressing rotations and can be used to notate phasors in a way that leverages our knowledge of algebra

Thinking this way makes e^{jωt} more intuitive.
