Outline
Background
– Motivation
– Perception of sound in space
Principle of MPEG Surround
– Downmixing to one channel
– Estimation of spatial cues
– Synthesis of spatial cues
Conclusions & Reference
Motivation
The vast majority of audio playback equipment uses a traditional two-channel (stereo) presentation.
Playback over more reproduction channels (“multi-channel audio” or “surround sound”) is now quite visible in the marketplace.
A non-disruptive transition from stereo to multi-channel audio requires media formats that can serve both those using conventional stereo equipment and those using next-generation multi-channel equipment.
Perception of sound in space
The HRTF (head-related transfer function) models the path of sound from a source to the left and right ear entrances.
Perception of sound in space (cont.)
Three parameters (cues) describe how humans localize sound in the horizontal plane:
– Interaural level difference (ILD)
– Interaural time difference (ITD)
– Interaural coherence (IC)
ITD (Interaural time difference) & ILD (Interaural level difference)
$$\mathrm{ILD} = 20 \log_{10}\!\left(\frac{a_2}{a_1}\right)\ \text{(dB)}$$
$$\mathrm{ITD} = d_2 - d_1$$
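As a small worked example of these two definitions, the sketch below computes the ILD and ITD for a pair of headphone signals. It assumes that a1, a2 are the gains and d1, d2 the delays applied to the first and second signal, and that both cues are measured for signal 2 relative to signal 1; the function names are illustrative only.

```python
import math

def ild_db(a1: float, a2: float) -> float:
    """Level difference in dB of gain a2 relative to gain a1 (assumed convention)."""
    return 20.0 * math.log10(a2 / a1)

def itd(d1: float, d2: float) -> float:
    """Time difference of delay d2 relative to delay d1, in the unit of the inputs."""
    return d2 - d1

# Doubling the amplitude of the second signal gives roughly 6 dB of ILD.
print(round(ild_db(1.0, 2.0), 2))  # 6.02
print(itd(0.0, 0.5))               # 0.5
```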
ITD (Interaural time difference) & ILD (Interaural level difference) (cont.)
ITD and ILD between a pair of headphone signals determine the location of the auditory event which appears in the frontal section of the upper head.
Two sound sources: summing localization
– Inter-channel time difference (ICTD)
– Inter-channel level difference (ICLD)
– Inter-channel coherence (ICC)
MPEG Surround
MPEG Surround exploits inter-channel differences in level, phase and coherence, equivalent to the ILD, ITD and IC cues, to capture the spatial image of a multi-channel audio signal.
It downmixes the signal and encodes these cues in a very compact form, so that the cues and the transmitted downmix can be decoded to synthesize a high-quality multi-channel representation.
It provides backward compatibility with stereo/mono audio systems.
Downmixing to one channel (1/2)
The sum signal is generated by adding the input channels in a subband domain.
The sum is multiplied by a factor $e(k)$ in order to preserve the signal power.
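$$\tilde{s}(k) = e(k) \sum_{c=1}^{C} \tilde{x}_c(k)$$
$$\sum_{c=1}^{C} p_{\tilde{x}_c}(k) = e^2(k)\, p_{\tilde{x}}(k) \quad\Rightarrow\quad e(k) = \sqrt{\frac{\sum_{c=1}^{C} p_{\tilde{x}_c}(k)}{p_{\tilde{x}}(k)}}$$
where $p_{\tilde{x}_c}(k)$ is a short-time estimate of the power of $\tilde{x}_c(k)$ and $p_{\tilde{x}}(k)$ is a short-time estimate of the power of $\sum_{c=1}^{C} \tilde{x}_c(k)$.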
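The power-preserving downmix can be sketched for one subband as follows. This is a minimal illustration, not the codec's implementation: it assumes the short-time power is estimated as the mean squared magnitude over one block of samples, and the function name `downmix` is hypothetical.

```python
import math

def downmix(channels):
    """Sum C equal-length subband blocks and rescale to preserve total power.

    channels: list of C lists holding the samples of one subband block.
    Returns (downmixed block, gain e).
    """
    n = len(channels[0])
    total = [sum(ch[k] for ch in channels) for k in range(n)]
    # Block-wise power estimates (an assumption standing in for a short-time estimate).
    p_channels = sum(sum(abs(v) ** 2 for v in ch) / n for ch in channels)
    p_sum = sum(abs(v) ** 2 for v in total) / n
    # e^2 * p_sum must equal the summed channel powers.
    e = math.sqrt(p_channels / p_sum) if p_sum > 0 else 0.0
    return [e * v for v in total], e

s, e = downmix([[1.0, 2.0, -1.0], [0.5, -1.0, 1.0]])
print(e, s)
```

Without the factor e, partially out-of-phase channels would lose power in the sum and in-phase channels would gain it; the gain compensates for both cases.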
Estimation of spatial cues (1/4)
The spatial cues (ICTD, ICLD, and ICC) are estimated in a subband domain. The estimation is applied independently to each subband.
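Estimation of spatial cues (2/4)
ICTD (samples):
$$\tau_{12}(k) = \arg\max_{d} \{\Phi_{12}(d,k)\}$$
with a short-time estimate of the normalized cross-correlation function
$$\Phi_{12}(d,k) = \frac{p_{\tilde{x}_1 \tilde{x}_2}(d,k)}{\sqrt{p_{\tilde{x}_1}(k-d_1)\, p_{\tilde{x}_2}(k-d_2)}}$$
where
$$d_1 = \max\{-d, 0\}, \qquad d_2 = \max\{d, 0\}$$
and $p_{\tilde{x}_1 \tilde{x}_2}(d,k)$ is a short-time estimate of the mean of $\tilde{x}_1(k-d_1)\,\tilde{x}_2(k-d_2)$.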
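The ICTD search can be illustrated with a short sketch. It assumes real-valued subband samples and replaces the short-time estimates with sums over one block; `normalized_xcorr`, `estimate_ictd` and `max_lag` are hypothetical names chosen for this example.

```python
import math
import random

def normalized_xcorr(x1, x2, d):
    """Block estimate of the normalized cross-correlation at lag d."""
    d1, d2 = max(-d, 0), max(d, 0)
    ks = range(max(d1, d2), len(x1))  # keep both shifted indices inside the block
    p12 = sum(x1[k - d1] * x2[k - d2] for k in ks)
    p1 = sum(x1[k - d1] ** 2 for k in ks)
    p2 = sum(x2[k - d2] ** 2 for k in ks)
    denom = math.sqrt(p1 * p2)
    return p12 / denom if denom > 0 else 0.0

def estimate_ictd(x1, x2, max_lag):
    """Lag d in [-max_lag, max_lag] that maximizes the normalized cross-correlation."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda d: normalized_xcorr(x1, x2, d))

# Noise-like test signal: x1 is x2 shifted later by three samples.
rng = random.Random(0)
x2 = [rng.gauss(0.0, 1.0) for _ in range(64)]
x1 = [0.0, 0.0, 0.0] + x2[:-3]
print(estimate_ictd(x1, x2, max_lag=8))
```

With d1 and d2 defined as above, a lag d > 0 pairs x1(k) with x2(k - d), so in this sketch the peak lands on the lag that lines up the two shifted copies.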
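Estimation of spatial cues (3/4)
ICLD (dB):
$$\Delta L_{12}(k) = 10 \log_{10}\!\left(\frac{p_{\tilde{x}_2}(k)}{p_{\tilde{x}_1}(k)}\right)$$
ICC:
$$c_{12}(k) = \max_{d} |\Phi_{12}(d,k)|$$
where $\Phi_{12}(d,k)$ is the normalized cross-correlation function.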
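A minimal sketch of the ICLD and ICC estimates for one subband block is given below. It assumes real-valued samples and block-wise power estimates in place of short-time ones; the helper names and the `max_lag` search range are illustrative, not part of the source.

```python
import math

def block_power(x):
    """Mean squared value of a block (stand-in for a short-time power estimate)."""
    return sum(v * v for v in x) / len(x)

def icld_db(x1, x2):
    """Level difference in dB of channel 2 relative to channel 1."""
    return 10.0 * math.log10(block_power(x2) / block_power(x1))

def icc(x1, x2, max_lag):
    """Largest magnitude of the normalized cross-correlation over the lag range."""
    best = 0.0
    for d in range(-max_lag, max_lag + 1):
        d1, d2 = max(-d, 0), max(d, 0)
        ks = range(max(d1, d2), len(x1))
        num = sum(x1[k - d1] * x2[k - d2] for k in ks)
        den = math.sqrt(sum(x1[k - d1] ** 2 for k in ks)
                        * sum(x2[k - d2] ** 2 for k in ks))
        if den > 0:
            best = max(best, abs(num / den))
    return best

x1 = [1.0, -2.0, 0.5, 3.0, -1.0, 0.25]
x2 = [2.0 * v for v in x1]          # same waveform, twice the amplitude
print(round(icld_db(x1, x2), 2))    # 6.02
print(round(icc(x1, x2, max_lag=2), 6))  # 1.0
```

A scaled copy is fully coherent, so the ICC is 1 while the ICLD reports the level offset; independent signals would push the ICC toward 0.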
Estimation of spatial cues (4/4)
For multi-channel audio signals, ICTD and ICLD are defined between a reference channel and each of the other C−1 channels.
Synthesis of spatial cues (1/3)
ICTDs are synthesized by imposing delays, ICLDs by scaling, and ICCs by applying de-correlation filters.
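Synthesis of spatial cues (2/3)
The delays are determined by the ICTDs:
$$d_c = \begin{cases} -\dfrac{1}{2}\left(\max_{2 \le l \le C} \tau_{1l}(k) + \min_{2 \le l \le C} \tau_{1l}(k)\right), & c = 1 \\ \tau_{1c}(k) + d_1, & 2 \le c \le C \end{cases}$$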
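The delay rule can be sketched in a few lines. The example assumes channel 1 is the reference and that the ICTDs of channels 2..C relative to it are given as a list; `synthesis_delays` is a hypothetical name.

```python
def synthesis_delays(ictd):
    """Delays d_1..d_C from the ICTDs tau_12..tau_1C (relative to channel 1).

    The reference delay d_1 is minus the midpoint of the largest and smallest
    ICTD; every other channel gets its ICTD plus d_1.
    """
    d1 = -0.5 * (max(ictd) + min(ictd))
    return [d1] + [tau + d1 for tau in ictd]

# Three channels: channel 2 is 4 samples later, channel 3 is 2 samples earlier.
print(synthesis_delays([4, -2]))  # [-1.0, 3.0, -3.0]
```

This choice of d_1 places the largest and smallest non-reference delays symmetrically around zero while every difference d_c − d_1 still equals the transmitted ICTD.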
Synthesis of spatial cues (3/3)
The scale factors are determined by the ICLDs, satisfying:
$$\frac{a_c}{a_1} = 10^{\Delta L_{1c}(k)/20}$$
After delays and scaling, the correlation between the subbands still has to be reduced. This is achieved by designing the filters $h_c$ as a function of the ICC.
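The scale-factor relation fixes only the ratios a_c / a_1, so one extra condition is needed to obtain absolute values. The sketch below assumes, purely for illustration, that the squared factors sum to one so that the total power is preserved; that normalization and the name `scale_factors` are assumptions, not taken from the slides.

```python
import math

def scale_factors(icld_db):
    """Scale factors a_1..a_C from the ICLDs Delta L_12..Delta L_1C (in dB).

    Ratios follow a_c / a_1 = 10 ** (Delta L_1c / 20); the absolute level is
    set by the assumed normalization sum(a_c ** 2) == 1.
    """
    ratios = [1.0] + [10.0 ** (level / 20.0) for level in icld_db]
    norm = math.sqrt(sum(r * r for r in ratios))
    return [r / norm for r in ratios]

a = scale_factors([20.0, 0.0])  # channel 2 is 20 dB above channel 1, channel 3 equal
print(a)
```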
Conclusions (1/2)
Well-known perceptual audio coders, such as MP3, primarily exploit a single channel’s ability to mask its own quantization noise.
In contrast, spatial perception is primarily attributed to three parameters: ILD, ITD, and IC.
Conclusions (2/2)
MPEG Surround provides an extremely efficient method for coding of multi-channel sound via the transmission of a compressed stereo (or even mono) audio program plus a low-rate side-information channel.
MPEG Surround is the latest technology for bitrate-efficient and backward-compatible presentation of multi-channel audio.