Properties of Musical Sound Subjective Objective Pitch Frequency Volume Amplitude/power/ intensity Timbre Overtone content/spectrum Duration in beats Duration in time


Properties of Musical Sound

Subjective | Objective
Pitch | Frequency
Volume | Amplitude/power/intensity
Timbre | Overtone content/spectrum
Duration in beats | Duration in time


Direct Sound

Sound waves that travel directly from the source to the listener.

Direct sound intensity attenuates with distance according to the inverse square law:

$$I = \frac{K}{\mathrm{Dist}^2}$$

For example, doubling the distance attenuates the intensity by a factor of 4, or about 6 dB.
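The inverse square law can be checked numerically. The sketch below is only an illustration: the function names are made up for this example, and the constant K is dropped because only intensity ratios matter.

```python
import math

def relative_intensity(distance_ratio: float) -> float:
    """Intensity relative to the reference distance, from I = K / Dist^2."""
    return 1.0 / distance_ratio ** 2

def attenuation_db(distance_ratio: float) -> float:
    """Attenuation in dB when the distance is multiplied by distance_ratio."""
    return -10.0 * math.log10(relative_intensity(distance_ratio))

for ratio in (2, 4, 10):
    print(f"distance x{ratio}: {attenuation_db(ratio):.2f} dB attenuation")
```

Each doubling of the distance adds roughly another 6 dB of attenuation.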


Early (first order) Reflection

Sound waves that travel to the listener after reflecting “once” from the environment (mainly walls).

Early reflections that arrive within 35 ms of the direct sound reinforce it.

According to Beranek, who studied 54 concert halls, an “intimate” effect is felt when the early reflections arrive less than 20 ms after the direct sound.

In large halls, suspended reflectors are employed to provide early reflections to the center seats.


Reverberation (second to higher order reflection)

Sound waves that reach the listener after further reflection of the first-order reflections.

Reverberation decays with time as sound energy is absorbed by the environment.

Reverberation time is the time taken for the sound pressure level to drop by 60 dB from its initial level, generally quoted for the frequency range of 500-1000 Hz:

$$T_R \propto \frac{\text{Room volume (m}^3\text{)}}{\text{Total room surface (m}^2\text{)}}$$

High-frequency sound is absorbed more quickly by air than low-frequency sound, so its reverberation time is shorter.
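One common concrete form of the volume-to-surface relation is Sabine's formula, T = 0.161 V / A, where A is the sum of each surface area times its absorption coefficient. This formula is not given on the slide and is used here as an assumption. The room dimensions and the absorption coefficient below are hypothetical.

```python
def sabine_rt60(volume_m3, surfaces):
    """Reverberation time in seconds from Sabine's formula (an assumption here).

    surfaces: list of (area_m2, absorption_coefficient) pairs.
    """
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / total_absorption

# Hypothetical 10 m x 8 m x 4 m room with a uniform absorption coefficient of 0.2.
length, width, height = 10.0, 8.0, 4.0
volume = length * width * height
surfaces = [
    (length * width, 0.2),                   # floor
    (length * width, 0.2),                   # ceiling
    (2 * (length + width) * height, 0.2),    # four walls
]
rt60 = sabine_rt60(volume, surfaces)
print(f"RT60 = {rt60:.2f} s")
```

For a fixed absorption coefficient, the result scales with volume divided by surface area, which is the proportionality stated above.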

Microphone

Dynamic: magnetic induction

[Figure: a classical ribbon microphone, with labels: magnet, coil, ribbon diaphragm]

Simple, economical, robust

Microphone

Condenser: capacitor transducer

[Figure: a condenser microphone, with label: current]

Complicated, expensive, sharper transient response, needs phantom power

Magnitude of a microphone’s response to pressure changes arriving from different directions.

[Figure: polar plot; angle from 0° to 330° in 30° steps, radial scale 0.25, 0.50, 0.75, 1.0]

Omnidirectional: $1$

[Figure: polar plot of the omnidirectional pattern; angle from 0° to 330° in 30° steps, radial scale 0.25 to 1.0]

Bidirectional (figure-eight): $\cos\theta$

[Figure: polar plot of the bidirectional pattern; angle from 0° to 330° in 30° steps, radial scale 0.25 to 1.0]

Standard cardioid: $0.5 + 0.5\cos\theta$

[Figure: polar plot of the standard cardioid pattern; angle from 0° to 330° in 30° steps, radial scale 0.25 to 1.0]

Supercardioid: $0.37 + 0.63\cos\theta$

[Figure: polar plot of the supercardioid pattern; angle from 0° to 330° in 30° steps, radial scale 0.25 to 1.0]

Subcardioid: $0.75 + 0.25\cos\theta$

[Figure: polar plot of the subcardioid pattern; angle from 0° to 330° in 30° steps, radial scale 0.25 to 1.0]
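The five polar patterns all have the form a + b cos θ, with a + b = 1. A minimal sketch that evaluates them is given below. The dictionary and function names are made up for this example, and θ is assumed to be the angle from the microphone's front axis.

```python
import math

# (a, b) coefficients of the response a + b*cos(theta) for each pattern.
PATTERNS = {
    "omnidirectional": (1.0, 0.0),
    "bidirectional": (0.0, 1.0),
    "cardioid": (0.5, 0.5),
    "supercardioid": (0.37, 0.63),
    "subcardioid": (0.75, 0.25),
}

def response(pattern: str, angle_deg: float) -> float:
    """Signed response of the named pattern at angle_deg from the front axis."""
    a, b = PATTERNS[pattern]
    return a + b * math.cos(math.radians(angle_deg))

for name in PATTERNS:
    values = ", ".join(f"{response(name, angle):+.2f}" for angle in (0, 90, 180))
    print(f"{name:16s} at 0/90/180 degrees: {values}")
```

A negative value means the rear lobe is picked up with inverted polarity; a polar plot shows its magnitude.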


XY (coincident pair) Microphone Recording

[Figure: top view and front view of the microphone pair, angled 90° to 135° apart]

Two identical cardioids aimed across each other at 90° to 135°, 12 inches or less apart.

Extremely mono-compatible, moderate stereo effect.

Localization of sound source based on difference in amplitude.

e.g., if L > R, the source seems to be closer to the left side.


Blumlein coincident Microphone Recording

[Figure: top view and front view of the two microphones, 90° apart, feeding the L and R channels]

Two identical “figure 8” microphones placed at 90° to each other, one directly on top of the other.

Created by Alan Blumlein; provides precise stereo imaging of sound sources at the front and picks up reverberation from the rear.


Near coincident Microphone Recording

[Figure: top view of the microphone pair, angled 90° to 135° apart]

ORTF (Office de Radiodiffusion Télévision Française): 2 cardioids spaced 17 cm apart, angled 110° apart.

NOS (Nederlandse Omroep Stichting): 2 cardioids spaced 30 cm apart, angled 90° apart.

MS Microphone Recording: recording and playback configurations can be different

M (main/mono) is a microphone of any polar pattern (e.g. a cardioid); S (side) is a bidirectional microphone.

$$L = M + S, \qquad R = M - S, \qquad L + R = 2M$$

Preserves monophonic compatibility.

Flexible stereoscopic perspectives.

Simulates an equivalent microphone pair at playback.
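MS decoding amounts to a sum and a difference per sample: L = M + S and R = M - S, so L + R = 2M. The sketch below illustrates this with hypothetical sample values. The `width` parameter is not from the slide; it is an assumption added to show how the stereo perspective can be changed at playback.

```python
def ms_decode(mid, side, width=1.0):
    """Return (left, right) from mid/side samples: L = M + w*S, R = M - w*S."""
    left = [m + width * s for m, s in zip(mid, side)]
    right = [m - width * s for m, s in zip(mid, side)]
    return left, right

# Hypothetical sample values, chosen to be exactly representable as floats.
mid = [0.5, -0.25, 1.0]
side = [0.25, 0.5, 0.0]

left, right = ms_decode(mid, side)
mono = [l + r for l, r in zip(left, right)]  # S cancels, leaving 2*M
print(left, right, mono)
```

Because the side signal cancels in L + R, the mono sum is the same for any `width`, which is why the technique is mono-compatible.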


Optimized Cardioid Triangle (OCT)

[Figure: OCT layout with C (Center), LF (Left Front) and RF (Right Front) microphones; labeled dimensions: 8cm and 4-100cm]

INA 5

Ideale Nierenanordnung (ideal cardioid arrangement)

Five cardioid microphones oriented in five directions to supply the five channels.

[Figure: INA 5 layout with C (Center), L, R, LS and RS microphones; labeled dimensions: 17.5cm (three times), 60cm (twice) and 60°]


Fukada Tree

Developed by NHK

Uses INA 5 as the basis, plus two omnidirectional microphones to expand the spatial impression.

[Figure: Fukada Tree layout with C (Center), L, R, LS, RS, LL and RR microphones]

Pair-wise pan-pot permits positioning of a sound source

Non-zero gain is applied only to the two speakers adjacent to the phantom image location. Even if there are more than two speakers, only the pair that encloses the phantom image is considered.

[Figure: speakers L and R at angles $\theta_1$ and $\theta_2$, with the phantom image P at angle $\theta_P$, $\theta_1 \le \theta_P \le \theta_2$]

Assuming the gain decreases linearly in one channel and increases linearly in the other, we have

$$g_1 = \frac{\theta_2 - \theta_P}{\theta_2 - \theta_1}, \qquad g_2 = \frac{\theta_P - \theta_1}{\theta_2 - \theta_1}$$

Constant Gain Optimization

Let $\theta_1 = -45^\circ$ (i.e. $315^\circ$) and $\theta_2 = 45^\circ$.

$$\text{Total gain} = \sum_{i=1}^{2} g_i, \qquad \text{Total power} = \sum_{i=1}^{2} g_i^2$$

Ideal case: independent of the image position.

[Figure: “Linear Panning: Total Gain”: channel one, channel two and total gain versus image position from -45° to 45°]

[Figure: “Linear Panning: Total Power”: channel one, channel two and total power versus image position from -45° to 45°]

Loudness is proportional to power rather than gain.

Constant Power Optimization

Let

$$\theta_m = 90^\circ \cdot \frac{\theta_P - \theta_1}{\theta_2 - \theta_1}, \qquad g_1 = \cos\theta_m, \qquad g_2 = \sin\theta_m$$

[Figure: “Constant Power Panning: Total Gain”: channel one, channel two and total gain versus image position from -45° to 45°]

[Figure: “Constant Power Panning: Total Power”: channel one, channel two and total power versus image position from -45° to 45°]
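The two panning laws can be compared numerically. The sketch below assumes speakers at -45° and 45°, a linear law in which the two gains sum to 1, and a constant-power law with gains cos θm and sin θm, where θm maps the image position onto 0° to 90°. The function names are made up for this example.

```python
import math

def linear_pan(theta_p, theta_1=-45.0, theta_2=45.0):
    """Linear law: gains change linearly with image position and sum to 1."""
    t = (theta_p - theta_1) / (theta_2 - theta_1)
    return 1.0 - t, t

def constant_power_pan(theta_p, theta_1=-45.0, theta_2=45.0):
    """Constant-power law: g1 = cos(theta_m), g2 = sin(theta_m)."""
    theta_m = math.radians(90.0 * (theta_p - theta_1) / (theta_2 - theta_1))
    return math.cos(theta_m), math.sin(theta_m)

def total_gain(gains):
    return sum(gains)

def total_power(gains):
    return sum(g * g for g in gains)

for angle in (-45, -25, 0, 25, 45):
    lin = linear_pan(angle)
    cp = constant_power_pan(angle)
    print(f"{angle:+4d} deg  linear: gain={total_gain(lin):.2f} power={total_power(lin):.2f}"
          f"  constant power: gain={total_gain(cp):.2f} power={total_power(cp):.2f}")
```

With the linear law the total power dips to 0.5 at the center, which is heard as a drop in loudness. The constant-power law keeps the total power at 1, at the cost of a total gain that rises to about 1.41 at the center.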


Time domain Digitization (e.g. CD)

x(t) → Sampling → x(n) → Quantization → y(n) → …01001010...

Bit-rate = Sampling rate (f) × Bits per sample × Number of channels

Example: the bit-rate of a 16-bit, 44.1 kHz stereo signal =

44,100 × 16 × 2 = 1,411,200 bits per second = 176,400 bytes per second
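The bit-rate arithmetic is a product of three factors. A minimal sketch, with a function name chosen only for this example:

```python
def bit_rate(sampling_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed PCM bit-rate in bits per second."""
    return sampling_rate_hz * bits_per_sample * channels

cd_bits_per_second = bit_rate(44_100, 16, 2)
cd_bytes_per_second = cd_bits_per_second // 8
print(cd_bits_per_second, "bits per second =", cd_bytes_per_second, "bytes per second")
```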

After sampling, the maximum frequency of the signal is restricted to half the sampling frequency (why?).

The highest-frequency repetitive pattern that can be obtained with a sampling interval of T is shown below:

[Figure: samples spaced T apart, alternating between two levels so that the pattern repeats every 2T]

$$f_s = \frac{1}{T}, \qquad \text{Minimum period} = 2T, \qquad f_{max} = \frac{1}{2T} = \frac{f_s}{2}$$

A common convention: normalize the digital frequencies to the range $[0, 2\pi]$.

f | ω
0 | 0
fs/8 | π/4
fs/6 | π/3
fs/4 | π/2
fs/2 (fmax) | π
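The table entries follow from mapping the sampling frequency fs onto 2π. A short sketch of that mapping is given below; the function name is made up for this example, and the 44.1 kHz sampling rate is only an illustrative choice.

```python
import math

def normalized_frequency(f_hz: float, fs_hz: float) -> float:
    """Map a frequency in Hz to radians per sample: omega = 2*pi*f/fs."""
    return 2.0 * math.pi * f_hz / fs_hz

fs = 44_100.0
for divisor in (8, 6, 4, 2):
    omega = normalized_frequency(fs / divisor, fs)
    print(f"fs/{divisor} -> {omega / math.pi:.4f} * pi")
```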

Frequency spectrum of a digitized audio signal

[Figure: spectrum of the digitized signal, extending up to fs/2]

Increasing the sampling rate by two times

[Figure: spectrum after the increase, with frequency axis marks fs/4, fs/2 and fs]

Increasing the sampling rate by N times

[Figure: spectrum after the increase, with frequency axis marks fs/2N and fs]

[Figure: spectrum with frequency axis marks fs/2N and fs/2, with the quantization noise indicated]

Relocate the quantization errors to the high-frequency end to reduce their effect on the signal.

[Figure: adjacent quantization levels pk and pk+1, with q and q/2 marked]

If the signal is random (white), the probability distribution of the quantization noise is uniform, and the noise power (mean square quantization error) is

$$N_Q = \frac{1}{q}\int_{-q/2}^{q/2} x^2\,dx = \frac{q^2}{12}$$

Whenever q is halved, the noise power is reduced by a factor of 4, i.e. about 6 dB.
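The q²/12 result can be checked with a small simulation. The sketch below assumes a rounding quantizer and a test signal drawn uniformly from [-1, 1], so that the error is close to uniformly distributed. The function names, sample count and seed are arbitrary choices for this example.

```python
import math
import random

def quantize(x: float, q: float) -> float:
    """Round x to the nearest multiple of the step size q."""
    return q * math.floor(x / q + 0.5)

def measured_noise_power(q: float, n: int = 50_000, seed: int = 0) -> float:
    """Mean square quantization error for a uniformly distributed test signal."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        error = quantize(x, q) - x
        total += error * error
    return total / n

q = 1.0 / 128.0
power_q = measured_noise_power(q)
power_half_q = measured_noise_power(q / 2.0)
reduction_db = 10.0 * math.log10(power_q / power_half_q)
print(f"measured {power_q:.3e}, predicted {q * q / 12.0:.3e}")
print(f"halving q reduces the noise by {reduction_db:.2f} dB")
```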

[Block diagram: noise shaper. The fed-back error, filtered by h(n), is subtracted from the input x(n) to give u(n); d(n) is added to u(n) and the sum is quantized by Q to give y(n); the error y(n) - u(n) is fed to h(n)]

$$u(n) = x(n) - [y(n) - u(n)] * h(n) \qquad (1)$$

$$y(n) = Q[u(n) + d(n)] \qquad (2)$$

The combined noise addition and quantization can be represented by an overall noise term e(n), as

$$y(n) = u(n) + e(n) \qquad (3)$$

with, as before,

$$u(n) = x(n) - [y(n) - u(n)] * h(n)$$

[Block diagram: the same loop, with the quantizer and d(n) replaced by a single additive noise source e(n)]

Applying the Fourier transform gives

$$U(\omega) = X(\omega) - [Y(\omega) - U(\omega)]\,H(\omega) \qquad (4)$$

$$Y(\omega) = U(\omega) + E(\omega) \qquad (5)$$

$$Y(\omega) = X(\omega) + [1 - H(\omega)]\,E(\omega) \qquad (6)$$

[Block diagram: frequency-domain version of the loop, with X(ω), U(ω), Y(ω), H(ω) and the additive noise E(ω)]

$$Y(\omega) = X(\omega) + [1 - H(\omega)]\,E(\omega) = X(\omega) + N(\omega) \qquad (6)$$

If $H(\omega) = 1$, the quantization error is eliminated.

However, this kind of filter cannot be implemented in practice. Instead, a different transfer function can be selected so that the noise is attenuated more at the low-frequency end:

$$H(\omega) = e^{-j\omega} \qquad (7)$$

$$Y(\omega) = X(\omega) + (1 - e^{-j\omega})\,E(\omega) = X(\omega) + N(\omega) \qquad (8)$$

Noise power $|N(\omega)|^2$ for $|E(\omega)|^2 = 1$:

f | ω | |N(ω)|² | dB
0 | 0 | 0 | -infinity
fs/8 | π/4 | 0.59 | -2.3
fs/6 | π/3 | 1 | 0
fs/4 | π/2 | 2 | 3
fs/2 | π | 4 | 6

Note that the noise is attenuated more at the low-frequency end than at the high-frequency end.
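The shaped noise spectrum can be evaluated directly from the noise transfer function 1 - e^{-jω}, whose squared magnitude is 2 - 2 cos ω. The sketch below assumes unit noise power before shaping; the function names are made up for this example.

```python
import cmath
import math

def noise_gain(omega: float) -> float:
    """Squared magnitude of the noise transfer function 1 - exp(-j*omega)."""
    return abs(1.0 - cmath.exp(-1j * omega)) ** 2

def to_db(power: float) -> float:
    return 10.0 * math.log10(power)

for label, omega in (("fs/8", math.pi / 4), ("fs/6", math.pi / 3),
                     ("fs/4", math.pi / 2), ("fs/2", math.pi)):
    gain = noise_gain(omega)
    print(f"{label}: |N|^2 = {gain:.3f} ({to_db(gain):+.2f} dB)")
```

The gain is below 1 for frequencies under fs/6 and above 1 beyond it, so the noise is pushed towards the high-frequency end.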

Noise power gain

$$NPG = \frac{1}{2\pi}\int_0^{2\pi} |1 - H(\omega)|^2\,d\omega = 2 \qquad (9)$$

Hence the noise shaper has increased the total noise power by 3 dB.
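The noise power gain can be confirmed by numerical integration. The sketch below averages |1 - H(ω)|² over one period with the midpoint rule, taking H(ω) = e^{-jω}. The function names and the number of integration steps are arbitrary choices for this example.

```python
import cmath
import math

def noise_power_gain(h, steps: int = 4096) -> float:
    """Midpoint-rule estimate of (1/2pi) * integral over [0, 2pi] of |1 - H|^2."""
    total = 0.0
    for n in range(steps):
        omega = 2.0 * math.pi * (n + 0.5) / steps
        total += abs(1.0 - h(omega)) ** 2
    return total / steps

def first_order_filter(omega: float) -> complex:
    """H(omega) = exp(-j*omega), a one-sample delay."""
    return cmath.exp(-1j * omega)

npg = noise_power_gain(first_order_filter)
npg_db = 10.0 * math.log10(npg)
print(f"NPG = {npg:.4f} ({npg_db:.2f} dB)")
```

The total noise power doubles, but most of it now lies at high frequencies, where it can be removed or is less objectionable.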

Time-Frequency domain Digitization (e.g. MD)

[Block diagram: x(t) → Sampling → x(n) → split into Block 1, Block 2, ..., Block N; each Block M (1 ≤ M ≤ N) is split into Band 1, Band 2, Band 3, ..., Band K; each band has its own quantizer (Quantizer 1, Quantizer 2, ..., Quantizer K) → Freq. to Time Converter → y(n)]


1. Time signal is chopped into segments or blocks

2. Each block is transformed into its frequency spectrum

3. Frequency spectrum is partitioned into bands

4. Each band is digitized and quantized

5. In the player, each digitized band is converted back to analogue form

6. The frequency bands are combined to reconstruct the frequency spectrum

7. The frequency spectrum is transformed back to the time domain to reproduce the time segment

If each frequency band is quantized with the same number of levels, no compression is achieved, and the extra, complicated effort is wasted.

Compression is attained if certain bands can be quantized with fewer levels.

However, those bands will be subject to more distortion.

The distortion is in the form of “Quantization Noise”.

Is there any solution that meets both needs?


Quantization Noise is less audible at some frequencies than at others

Key researchers in the study of the human auditory system (HAS)

1. G. von Bekesy

2. J.B. Allen

3. H. Fletcher

4. B. Scharf

5. D.D. Greenwood

Important Findings: Hearing Sensitivity, Tone-Masking-Noise, Critical Bands


“The brain interprets signal received via the auditory system rather than its objective representation.”

Author: Diana Deutsch Source: http://psy.ucsd.edu/~ddeutsch/psychology/figures/fig3.jpg Copyright: Diana Deutsch

Listeners grouped tones by frequency proximity rather than by their actual presentation.

[Figure: tone sequences labeled L and R]


“When two identical but delayed audio sources are heard, the first one will inhibit the other if the delay is within 25 to 35 ms.”

This is true even if the second sound is 10 dB above the first one.

As a result, the sound seems to originate from the first source only, and the loudness is increased.

The ear operates like a spectrum analyser

Analyses sound in frequency (critical) bands

Bandwidth: 100 Hz below 500 Hz; 1/6 to 1/3 of an octave above 500 Hz

High energy in one band may inhibit neighboring bands

Masking also occurs shortly before the masking tone starts and after it ends: backward and forward masking

Frequency response of human ears is non-uniform

• Place a listener in a quiet room

• Raise a 1 kHz tone until it is just audible and record the amplitude

• Repeat with other frequencies

[Figure: just-audible amplitude in dB (axis marks 10 and 20) versus frequency in kHz (axis marks 2 to 10)]

Hiding of one signal at a given frequency by another signal at or near that frequency.

Masking involves two signals: a Masker (M) and a Probe (P).

[Diagram: M and P enter the HAS; only M is perceived]

P is masked by M.

The level at which P is just audible is known as the “just noticeable difference” (JND).

Masking by 1kHz tone

[Figure: masking curve of a 1 kHz tone; level in dB (axis marks 20, 40, 60) versus frequency in kHz (axis marks 1 to 10)]

Note: Two types of masking

Masking of multiple tones

[Figure: masking curves of tones at 0.25, 1, 4 and 8 kHz; level in dB (axis marks 20, 40, 60) versus frequency in kHz (axis marks 2 to 10)]

Note: Two types of masking

Determine masking envelope

Divide signal into spectral bands

Determine masked noise region

[Figure labels: masking tone; noise that can be masked]

Quantization is a kind of noise:

$$S \rightarrow Q \rightarrow S_R = S + \text{Noise}$$

The coarser the quantization, the smaller the bit-rate. The effect, however, is negligible if the noise can be masked.

[Figure: two panels, each showing a masking tone and the noise that can be masked]

The narrower the bandwidth of each band, the better the noise masking effect.

Time resolution is best at higher frequencies:

Easier to locate the instant at which a particular tone occurs

Frequency resolution is best at low frequencies:

Easier to discriminate different frequencies

This suggests non-uniform partitioning of the audio frequency spectrum.


Standard: The Bark Scale (after Barkhausen)

Partitioning of frequency spectrum into Critical Bands according to the Psychoacoustic model

$$\text{Bark}(f) = \begin{cases} \dfrac{f}{100} & \text{for } f \le 500\ \text{Hz} \\[2ex] 9 + 4\log_2\dfrac{f}{1000} & \text{otherwise} \end{cases}$$
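The Bark mapping is piecewise: f/100 up to 500 Hz, and 9 + 4 log2(f/1000) above it. A minimal sketch is given below. The function name is made up for this example, and the spot checks use the tone frequencies that appear on the Bark-scale masking figure (0.25, 0.5, 1, 2 and 4 kHz).

```python
import math

def hz_to_bark(f_hz: float) -> float:
    """Piecewise Bark approximation: linear up to 500 Hz, logarithmic above."""
    if f_hz <= 500.0:
        return f_hz / 100.0
    return 9.0 + 4.0 * math.log2(f_hz / 1000.0)

for f in (250, 500, 1000, 2000, 4000):
    print(f"{f} Hz -> {hz_to_bark(f):.1f} Bark")
```

The two branches agree at 500 Hz (both give 5 Bark), so the mapping is continuous.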

Masking of multiple tones

[Figure: masking curves on the Bark scale for tones at 0.25 kHz (2.5 Bark), 0.5 kHz (5 Bark), 1 kHz (9 Bark), 2 kHz (13 Bark) and 4 kHz (17 Bark); level in dB (axis marks 20, 40, 60) versus Bark (axis marks 5 to 20)]

A test tone shortly after the mask tone is not audible.

[Figure: mask tone followed by a test tone; level in dB (axis marks 20, 40, 60) versus time in ms (axis marks 0, 5, 10, 20)]


Quantization Noise is less audible at some frequencies than at others

Sensitivity of the ear varies with frequency

Most sensitive: around 4kHz

Less sensitive: at higher frequencies

Simultaneous masking: a softer sound is less audible in the presence of a louder sound

Quantization noise is less audible at frequencies on, or close to, loud tones.

[Figure: segments of the input signal x(n); each segment (at times t0, t1, ...) is transformed into spectral components Y0, Y1, ..., YN-1]

Analyzing time windows

[Figure: each time window of x(n) passes through the DCT to give the spectral component Yi]

A single spectral component is shown for each time slot. Others are computed in the same way.


The MDCT blends one frame into the next to avoid inter-frame block boundary artifacts. The MDCT output of one frame is windowed according to MDCT requirements, overlapped 50% with the output of the previous frame and added.

Case 1: equal-sized windows

Case 2: unequal-sized windows

A single window

[Figure: a single window of N samples of x(n) passes through the DCT block to give the coefficients Yi]

$$Y_i = \sum_{k=0}^{N-1} x(k)\cos\left[\frac{\pi}{2N}\left(2k+1+\frac{N}{2}\right)(2i+1)\right]$$

Overlapping window w(n)

$$x'(n) = x(n)\,w(n)$$

[Figure: a window of N samples of x'(n) passes through the DCT block to give the coefficients Yi]

$$Y_i = \sum_{k=0}^{N-1} x'(k)\cos\left[\frac{\pi}{2N}\left(2k+1+\frac{N}{2}\right)(2i+1)\right]$$

$$x'(k) = \sum_{i=0}^{N/2-1} Y_i\cos\left[\frac{\pi}{2N}\left(2k+1+\frac{N}{2}\right)(2i+1)\right], \qquad k = 0, 1, \ldots, N-1$$

[Figure: the coefficients Yi pass through the IDCT block to give x'(n), which is combined with the neighbouring overlapping windows]
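The windowed transform, inverse transform and 50% overlap-add can be sketched end to end. The code below makes several assumptions that are not stated on the slides: each window of N samples yields N/2 coefficients, a sine window is applied both before the forward transform and after the inverse, and the inverse is scaled by 4/N so that the overlap-add reproduces the input. The function names and the test signal are made up for this example.

```python
import math

def mdct(block):
    """Forward MDCT: N windowed samples -> N/2 coefficients."""
    n = len(block)
    return [sum(block[k] * math.cos(math.pi / (2 * n) * (2 * k + 1 + n / 2) * (2 * i + 1))
                for k in range(n))
            for i in range(n // 2)]

def imdct(coeffs):
    """Inverse MDCT: N/2 coefficients -> N time-aliased samples (4/N scaling assumed)."""
    n = 2 * len(coeffs)
    return [(4.0 / n) * sum(coeffs[i] * math.cos(math.pi / (2 * n) * (2 * k + 1 + n / 2) * (2 * i + 1))
                            for i in range(n // 2))
            for k in range(n)]

def sine_window(n):
    """Sine window; satisfies w(k)^2 + w(k + n/2)^2 = 1."""
    return [math.sin(math.pi * (k + 0.5) / n) for k in range(n)]

def analyse_synthesise(x, n):
    """Window, MDCT, IMDCT, window again, then overlap-add with a hop of n/2."""
    hop = n // 2
    w = sine_window(n)
    out = [0.0] * len(x)
    for start in range(0, len(x) - n + 1, hop):
        frame = [x[start + k] * w[k] for k in range(n)]
        y = imdct(mdct(frame))
        for k in range(n):
            out[start + k] += y[k] * w[k]
    return out

N = 16
signal = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(64)]
rebuilt = analyse_synthesise(signal, N)
# Samples covered by two overlapping windows are reconstructed; the first and
# last half-windows are covered only once and are not.
max_error = max(abs(a - b) for a, b in zip(signal[N // 2:-N // 2], rebuilt[N // 2:-N // 2]))
print(f"maximum reconstruction error in the overlapped region: {max_error:.2e}")
```

A single inverse-transformed window contains time-domain aliasing; it cancels only when neighbouring windows are overlapped and added, which is why the frames are blended rather than simply concatenated.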

Discard frequency bands that are less essential to the HAS

x(n) = [x(0), x(1), ....., x(N-1)], where N is the number of data samples

Decompose x(n) into N MDCT coefficients

Select the coefficients to which the HAS is sensitive and discard the rest (e.g. select K bands where K < N)

Disadvantage: noticeable distortion in the discarded bands

A better approach: assign a different quantization step-size to each coefficient according to its tolerance to quantization noise, based on the HAS

x(n) = [x(0), x(1), ....., x(N-1)], where N is the number of data samples

Decompose x(n) into N MDCT coefficients

Quantize each coefficient j with its own step-size $q_j$, where $0 \le j \le N-1$, so that the noise is below the masking threshold (1 bit = 6 dB)
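One way to pick a step size per coefficient is to invert the q²/12 noise-power relation, so that the quantization noise just reaches the allowed level. The sketch below is an illustration only: the coefficient values and allowed noise powers are hypothetical, and a real coder would derive the thresholds from a psychoacoustic model.

```python
import math

def step_for_noise_power(allowed_noise_power: float) -> float:
    """Largest step size whose noise power q^2/12 equals the allowed level."""
    return math.sqrt(12.0 * allowed_noise_power)

def quantize(value: float, q: float) -> float:
    """Round value to the nearest multiple of the step size q."""
    return q * math.floor(value / q + 0.5)

# Hypothetical coefficients and hypothetical allowed noise power per coefficient.
coeffs = [0.80, -0.31, 0.12, 0.05]
allowed_noise = [1e-6, 1e-4, 1e-3, 1e-2]

steps = [step_for_noise_power(p) for p in allowed_noise]
quantized = [quantize(c, q) for c, q in zip(coeffs, steps)]

for c, q, y in zip(coeffs, steps, quantized):
    print(f"coefficient {c:+.2f}: step {q:.4f} -> {y:+.4f}")
```

Coefficients that tolerate more noise get a larger step and therefore need fewer bits; each doubling of the step saves about one bit and raises the noise by about 6 dB.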

A total of 52 critical bands:

20 bands for lower frequencies

16 bands for middle frequencies

16 bands for higher frequencies

Number of time windows: 8

Smallest time window: 1.45 ms

Longest time window: 11.60 ms

Source bit-rate: 1.4 Mb/s

Target bit-rate: 292 kb/s

Different noise masking responses in the bands can be exploited to quantize the frequency components adaptively.

[Figure: masking level in dB versus frequency in Bark (axis marks 5 to 20), with regions labeled “Fine quantization is required” and “Coarse quantization allowed”]

No inter-band masking

[Block diagram: 1.4 Mbps input → Frequency Range Analyser → three bands, H (11-22k), M (5.5-11k) and L (0.5-5.5k), each followed by an MDCT; Block size decision and Bit Allocation stages → 292 kbps output]