31
SIP - Sound and Image Processing Lab, EE, KTH Stockholm 1 FlexCode The Sensitivity Matrix Integrating Perception into the Flexcode Project Jan Plasberg Flexcode Seminar Lannion June 6, 2007

Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 1

FlexCode

The Sensitivity Matrix

Integrating Perception into the Flexcode Project

Jan PlasbergFlexcode Seminar Lannion

June 6, 2007

Page 2: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 2

FlexCode Outline

• Flexcode in a Nutshell• The Sensitivity Matrix• Coding with the Sensitivity Matrix• Open Issues

Page 3: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 3

FlexCode Who?

France Telecom

RWTH Aachen

KTH

NokiaEricsson

Page 4: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 4

FlexCode The Problem

• Heterogeneity of networks increasing• Networks inherently variable (mobile users)

• But:– Coders not designed for specific environment– Coders inflexible (codebooks and FEC)– Feedback channel underutilized

Page 5: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 5

FlexCode Adaptation and Coding

set of models, fixed quantizer

flexible

rigid

CELP with FEC

fixed model

fixed quantizer

adaptive coding

AMR

FlexCode

PCM

Page 6: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 6

FlexCode The Tools

• Tools include– Models of source, channel, receiver– High-rate quantization theory– Multiple description coding (MDC)– Iterative source-channel decoding– Distortion measures using the sensitivity matrix

Page 7: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 7

FlexCode Outline

• Flexcode in a Nutshell• The Sensitivity Matrix

– Why, what and how?– What can it tell us?

• Coding with the Sensitivity Matrix• Open Issues

Page 8: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 8

FlexCode

• Simultaneous masking (frequency masking)

Human Auditory Masking

Page 9: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 9

FlexCode

• Non-simultaneous masking (time masking)

• Not accounted for by simple auditory models• Coding standards use (heuristic) work-arounds

Human Auditory Masking

Page 10: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 10

FlexCode

• Merge the worlds ofauditory modeling and audio coding

• Advanced auditory models too complex for coding

AM

AM+

x

x̂⋅ )ˆ,( xxd

perceptual distortion criterion

Motivation

Page 11: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 11

FlexCode The Sensitivity Matrix

• Sensitivity Matrix M from Taylor expansion

Complexity problem solved!

Analysis of model properties using linear algebra• Masking curve• Subspace-based analysis

0)ˆ,(),ˆ()()ˆ(21)ˆ,( →−−≈ xxxxxMxxxx dd x

H

xx

xxxM=

∂∂∂

2

, ˆˆ)ˆ,()]([

jijix xx

d

Page 12: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 12

FlexCode Application to an Auditory Model

• A typical auditory model (here: the Dau model)

• Distortion is sum of per-channel distortions

Filte

rban

k

Stage 1

Stage 2

Stage KL yx

channel m

Page 13: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 13

FlexCode Application to an Auditory Model

• Find sensitivity matrix for model output y per channel m

Filte

rban

kStage

1Stage

2Stage

KL yx

)ˆ()()ˆ(21)ˆ,( )()( yyyMyyyy −−= m

yHmd

IyM 2)()( =myhere

Page 14: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 14

FlexCode Application to an Auditory Model

• Each model stage linearized by respective Jacobian

Filte

rban

kStage

1Stage

2Stage

KL yx

z∂ y∂)(zJ z

Page 15: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 15

FlexCode Application to an Auditory Model

• Get sensitivity matrix for channel m in input signal domain from chain rule;

Filte

rban

kStage

1Stage

2Stage

KL yx

∏∏⎥⎥⎦

⎢⎢⎣

⎡=

k

mk

H

k

mk

mx x )()()( 2)( JJM

Page 16: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 16

FlexCode Application to an Auditory Model

• Sensitivity Matrix is sum of per-channel matrices

Filte

rban

kStage

1Stage

2Stage

KL yx

∑ ∏∏∑⎥⎥⎦

⎢⎢⎣

⎡==

m k

mk

H

k

mk

m

mxx

)()()( 2)()( JJxMxM

Page 17: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 17

FlexCode Validity of the Approximation

• Narrow-band music + white noise at different SNR (10-60 dB), correlation > 0.9

(dB)Daud

(dB)SMd coding possible

Page 18: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 18

FlexCode Analysis: Fourier Transform

• Sensitivity matrix for frequency domain vectors, with = DFT in matrix notation

)(XM X

HxX FxMFXM )()( =

F

)ˆ()ˆ( XXFxx −=− H

Page 19: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 19

FlexCode Analysis: Masking Curve

• Assume masking threshold is at • Masking curve at frequency i:

gain for unit distortion in DFT bin i

to approach masking threshold

iiXHi guXMu )(

211 =

[ ]Hi 010 LL=u

[ ] iiXig

,)(2XM

=⇔

ig

1)ˆ( =XX,d

pos. i

Page 20: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 20

FlexCode

0 500 1000 1500 2000 2500 3000 3500 40000

10

20

30

40

50

60

F

MA

SK

Example 1: Simultaneous Masking

• Masking curve for 50 dB sinusoid at 1000 Hz

dB

f (Hz)

- sensitivity matrix for Dau et al.- spectral model (v.d.Par, 2002)O human listeners (Zwicker, 1982)

Page 21: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 21

FlexCodeExample 2: Non-Simultaneous Masking

• Subspace analysis of M: low sensitivity space

signal

noise

projection

t (ms)

Page 22: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 22

FlexCodeF

200

700

F

200

700

F

T 50 100

200

700

A Spectro-Temporal Experiment

• Spectrograms for two sinusoids (200 Hz and 700 Hz) coded with an APCM coder

signal

WMSE-coder

Sens.-coder

f (H

z)

t (ms)

f (H

z)f (

Hz)

Page 23: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 23

FlexCode Outline

• Flexcode in a Nutshell• The Sensitivity Matrix• Coding with the Sensitivity Matrix• Open Issues

Page 24: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 24

FlexCode Problem

• Sensitivity Matrix generally not diagonal in typical coding domains (e.g., time, frequency, etc.)

– distortion not single-letter:

– no analytical solution known for optimal coding– so far only trained VQ available

(undoable for large dimensions)!

)ˆ()()ˆ(21)ˆ,( xxxMxxxx −−≈ x

Hd

Page 25: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 25

FlexCode Solution Outline

• Sensitivity Matrix diagonal in perceptual domain– transform signal vectors into perceptual domain– (W)MSE-optimal coding in perceptual domain results in

minimum perceptual distortion • Optimal Transform:

)()()( xMxFxF xoptT

opt c=′′

Page 26: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 26

FlexCode Making it Practical

• Problem: optimal transform depends on x– not known at decoder!

• Solution: Parameterize transform as DCT + diagonal weights

– combines DCT coding gain with perceptual transform!– Model-driven approach to temporal noise shaping (TNS)

tDCTf WTWxF =′ )(

Page 27: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 27

FlexCode Examples

• Original

• Freq. weights

• Time-freq. weights

DCTf TWxF =′ )(

tDCTf WTWxF =′ )(

Page 28: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 28

FlexCode Outline

• Flexcode in a Nutshell• The Sensitivity Matrix• Coding with the Sensitivity Matrix• Open Issues

Page 29: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 29

FlexCode Complexity

• Main complexity lies here:

• Can we make multiplications ‘sparse’?• Can we reduce the size of the matrices?

∑ ∏∏⎥⎥⎦

⎢⎢⎣

⎡=

m k

mk

H

k

mk

)()(2)( JJxM

Page 30: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 30

FlexCode Conclusion

• Using auditory models in coding is possible• General method• Direct applicability in AbS structures (CELP)• Solution outline for transform coding

Page 31: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 31

FlexCode The End

Thank you!