Digital Audio

What do we mean by “digital”?

How do we produce, process, and play back?

Why is physics important?

What are the limitations and possibilities?

Digital vs. Analog

Digital:

Discrete data

Reproducible with 100% fidelity

Can be stored using any digital medium

Frequency and amplitude ranges limited by digitization

Analog:

Continuous data

Reproduction introduces new noise

Storage limited by physical size

Virtually unlimited frequency and amplitude ranges

Physics of Digitization

Sound (pressure wave) is transduced into an electrical signal (usually voltage)

Signal “read” by A-D converter to discrete values

Time sequence of signal values encoded in a computer

Sampling Basics

Sample Rate: Number of values encoded per second in the time sequence (the reciprocal of the sample period)

Sample Depth (or Bit Depth): Number of bits used to encode each value

Bit Rate = (Sample Rate) x (Bit Depth)

For example, “CD Quality” audio is 44.1 kHz at 16 bits = 7.056E5 bps per channel, or about 1411 kbps total for two (stereo) channels
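The bit-rate arithmetic can be checked with a few lines of Python. This is a minimal sketch; the function name `bit_rate_bps` and the `channels` parameter are illustrative choices, not part of the slides.

```python
def bit_rate_bps(sample_rate_hz, bit_depth, channels=1):
    """Uncompressed bit rate in bits per second."""
    return sample_rate_hz * bit_depth * channels

# "CD Quality": 44.1 kHz sample rate, 16 bits per sample
per_channel = bit_rate_bps(44_100, 16)         # one channel
stereo = bit_rate_bps(44_100, 16, channels=2)  # two channels

print(per_channel)    # 705600 bps per channel
print(stereo / 1000)  # 1411.2 kbps total
```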

Sample Rate (Sample Frequency)

Sample Period = 0.5s

Sample Rate = 1/0.5s = 2Hz

Sample Rate Matters! (Mathematica Demo 1)

What do the samples actually represent?

Nyquist-Shannon Sampling Theorem

“If a function x(t) contains no frequencies higher than B, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart.”

A necessary condition for digitizing a signal so that it can be faithfully reconstructed is that the sample rate is at least twice as high as the highest frequency present in the signal.

What can go wrong?

Aliasing: High frequencies contribute signal components that are perceived as lower frequencies (Mathematica Demo 2)
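Aliasing can be illustrated numerically. In the sketch below (the rates and the helper name `alias_frequency` are assumptions chosen for illustration), a 7 Hz cosine sampled at 10 Hz yields exactly the same samples as a 3 Hz cosine, so after sampling the two cannot be told apart.

```python
import math

def alias_frequency(f_signal_hz, sample_rate_hz):
    """Apparent frequency of a sinusoid after sampling (folded into 0..sample_rate/2)."""
    nearest_multiple = sample_rate_hz * round(f_signal_hz / sample_rate_hz)
    return abs(f_signal_hz - nearest_multiple)

sample_rate = 10.0  # Hz, so the Nyquist limit is 5 Hz
f_high = 7.0        # Hz, above the Nyquist limit
f_alias = alias_frequency(f_high, sample_rate)

# Sample both cosines at the same instants n / sample_rate
high = [math.cos(2 * math.pi * f_high * n / sample_rate) for n in range(20)]
low = [math.cos(2 * math.pi * f_alias * n / sample_rate) for n in range(20)]
max_diff = max(abs(a - b) for a, b in zip(high, low))

print(f_alias)          # 3.0
print(max_diff < 1e-9)  # True: the sample sequences coincide
```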

Bit Depth

Number of bits used to represent each sampled value

Available discrete values n = 2^b

Here there are only 5 discrete values, so 3 bits per sample
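A short sketch of how bit depth relates to the available levels, and of what rounding a sample to those levels looks like. The helper names and the choice of evenly spaced levels over [-1, 1] are assumptions made for illustration; real converters differ in detail.

```python
def bits_needed(n_levels):
    """Smallest bit depth b such that 2**b >= n_levels."""
    return max(1, (n_levels - 1).bit_length())

def quantize(x, bits):
    """Round x in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels."""
    n = 2 ** bits
    step = 2.0 / (n - 1)               # spacing between adjacent levels
    index = round((x + 1.0) / step)    # nearest level index
    index = min(max(index, 0), n - 1)  # keep out-of-range input on the end levels
    return -1.0 + index * step

print(bits_needed(5))                # 3, since 2 bits give only 4 values
print(round(quantize(0.1, 3), 4))    # 0.1429
```

The rounding error of any in-range sample is at most half a step, which is why more bits mean less quantization noise.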

Dynamic Range: Ability to represent small and large amplitude signals in the same scheme

Clipping: Large signals are cut off, introducing high harmonics

Masking: Small signals are “drowned out”
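The claim that clipping introduces high harmonics can be checked with a small direct DFT. This is a sketch under assumed values (64 samples, one cycle of a sine, clip level 0.5); `dft_magnitude` is a hypothetical helper, not a library function.

```python
import cmath
import math

def dft_magnitude(samples, k):
    """Magnitude of DFT bin k, normalized by the number of samples."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
              for i, x in enumerate(samples))
    return abs(acc) / n

N = 64
sine = [math.sin(2 * math.pi * i / N) for i in range(N)]  # fundamental in bin 1
clipped = [max(-0.5, min(0.5, x)) for x in sine]          # cut off at +/-0.5

third_clean = dft_magnitude(sine, 3)        # essentially zero for a pure sine
third_clipped = dft_magnitude(clipped, 3)   # clearly nonzero after clipping

print(third_clean < 1e-9, third_clipped > 0.01)
```

Symmetric clipping like this adds odd harmonics (3rd, 5th, and so on) that were absent from the original sine.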

Signal-to-Noise Ratio (S/N)

Ratio of meaningful signal power to unwanted signal power

In sound, perceived loudness scales roughly logarithmically with the actual power, which is why levels are quoted in decibels

Best case scenario: noise is in the first bit:

S/N (dB) = 20 log10(2^b) ≈ 6.02b (per channel), since the 2^b levels form an amplitude ratio and power goes as amplitude squared

Human ear sensitivity covers a range of more than 120 dB! (~20 bits)
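A decibel value depends on whether the ratio being converted is a power ratio (10 log10) or an amplitude ratio (20 log10). The sketch below, with hypothetical helper names, tabulates both readings of a ratio of 2^b for a few common bit depths.

```python
import math

def power_ratio_db(ratio):
    """Decibels for a ratio of two powers."""
    return 10 * math.log10(ratio)

def amplitude_ratio_db(ratio):
    """Decibels for a ratio of two amplitudes (power goes as amplitude squared)."""
    return 20 * math.log10(ratio)

for bits in (8, 16, 24):
    ratio = 2 ** bits
    print(bits,
          round(power_ratio_db(ratio), 1),
          round(amplitude_ratio_db(ratio), 1))
```

For 16 bits this gives about 48.2 dB under the power-ratio reading and about 96.3 dB under the amplitude-ratio reading.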

Digital Audio Compression

Analog signals are practically incompressible

Raw audio signals are similarly hard to reduce using standard (lossless) file compression (Shannon Information Theory)

Psycho-acoustic models may be helpful! (lossy)

MP3 Codec

Divide the file into packets and find the Fourier power spectrum via DFT

Throw out easily masked frequencies to reach desired bit rate

Dither regions with different dynamic ranges or where the bit depth must be lowered to match desired bit rate

Perform traditional redundancy compression

(ratatat samples)

Discrete Fourier Transform

Frequency Limit = ½ × Sample Frequency (Nyquist)

Frequency Resolution = 1/Signal Period, where the period is the total duration of the analyzed signal (Mathematica 3)

Usually frequency resolution is much sharper than the ear can detect
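The two DFT limits follow directly from the sample rate and the number of samples analyzed. A minimal sketch, with the function name `dft_limits` chosen for illustration:

```python
def dft_limits(sample_rate_hz, n_samples):
    """Return (highest representable frequency, spacing between DFT bins) in Hz."""
    nyquist_hz = sample_rate_hz / 2
    # 1 / duration, where duration = n_samples / sample_rate_hz
    resolution_hz = sample_rate_hz / n_samples
    return nyquist_hz, resolution_hz

# One second of audio at the CD sample rate
print(dft_limits(44_100, 44_100))  # (22050.0, 1.0)

# A tenth of a second at the same rate: same limit, coarser resolution
print(dft_limits(44_100, 4_410))   # (22050.0, 10.0)
```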

Dithering

Digital Signal Processing (DSP)

Non-linear (i.e., atemporal)

Real-time effects subject to latency and buffering memory

Filters and envelopes extremely difficult/expensive to achieve with analog techniques

Easier non-destructive editing

Perfect fidelity in copying

Some Common DSP Effects

Vocoder vs Autotune (Daft Punk)

Delay/Echo (U2, David Gray)

Filter/Flange (Foster the People, Dizzy Gillespie)
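Of these effects, a single-tap delay/echo is the simplest to sketch: each output sample is the input plus a scaled copy of the input from a fixed time earlier. The function name and parameters below are assumptions for illustration, not how any particular product implements the effect.

```python
def echo(samples, sample_rate_hz, delay_s, gain):
    """Mix the input with one copy of itself delayed by delay_s and scaled by gain."""
    d = int(round(delay_s * sample_rate_hz))  # delay expressed in samples
    out = []
    for n, x in enumerate(samples):
        delayed = samples[n - d] if n >= d else 0.0  # silence before the input starts
        out.append(x + gain * delayed)
    return out

# A single click, sampled at 4 Hz, echoed 0.5 s later at half amplitude
impulse = [1.0] + [0.0] * 7
print(echo(impulse, sample_rate_hz=4, delay_s=0.5, gain=0.5))
# [1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
```

In a real-time setting the delayed samples come from a buffer, which is where the latency and memory costs of DSP effects arise.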

Digital Synthesis (If you can write an equation, you can hear it!)

Sound engineering for movies/TV

Arbitrary mathematical functions can be generated (Mathematica 4)

Sounds not identifiable by the ear/brain

(Chem Bros and Skrillex samples)
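“If you can write an equation, you can hear it” can be tried with only the Python standard library. The sketch below samples an arbitrary function of time and packs it as 16-bit mono WAV data; the function names, the example equation (a decaying 440 Hz tone with vibrato), and the CD-rate default are all illustrative assumptions.

```python
import io
import math
import struct
import wave

def synthesize(func, duration_s, sample_rate_hz=44_100):
    """Sample func(t) and convert to 16-bit integers, clipping to [-1, 1]."""
    n = int(duration_s * sample_rate_hz)
    samples = []
    for i in range(n):
        x = max(-1.0, min(1.0, func(i / sample_rate_hz)))
        samples.append(int(round(x * 32767)))
    return samples

def to_wav_bytes(samples, sample_rate_hz=44_100):
    """Pack 16-bit mono samples into an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 2 bytes = 16 bits per sample
        w.setframerate(sample_rate_hz)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()

def tone(t):
    """Example equation: 440 Hz with 6 Hz vibrato and exponential decay."""
    return math.exp(-3 * t) * math.sin(2 * math.pi * 440 * t
                                       + 5 * math.sin(2 * math.pi * 6 * t))

data = to_wav_bytes(synthesize(tone, 0.5))
print(len(data))  # header plus 2 bytes per sample
```

Writing `data` to a file opened in binary mode (any name ending in .wav) should give something a standard audio player can open.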
