Thierry Dutoit • Ferran Marques

Applied Signal Processing

A MATLAB™-Based Proof of Concept

Springer

Contents

Chapter 1 1

1.1 Background - Linear predictive processing of speech 1
  1.1.1 The LP model of speech 2
  1.1.2 The LP estimation algorithm 4
  1.1.3 LP processing in practice 5
  1.1.4 Linear predictive coders 7
1.2 MATLAB proof of concept: ASP_cell_phone.m 11
  1.2.1 Examining a speech file 12
  1.2.2 Linear prediction synthesis of 30 ms of voiced speech 13
  1.2.3 Linear prediction synthesis of 30 ms of unvoiced speech 18
  1.2.4 Linear prediction synthesis of a speech file, with fixed F0 19
  1.2.5 Unvoiced linear prediction synthesis of a speech file 21
  1.2.6 Linear prediction synthesis of speech, with original F0 21
  1.2.7 CELP analysis-synthesis of a speech file 23
1.3 Going further 29
1.4 Conclusion 29
References 30

Chapter 2 33

2.1 Background - Delta-sigma modulation 34
  2.1.1 Uniform quantization: Bits vs. SNR 34
  2.1.2 Conventional DACs 36
  2.1.3 Oversampling DACs 36
  2.1.4 Oversampling DACs - Noise shaping 40
  2.1.5 Delta-sigma DACs 42
2.2 MATLAB proof of concept: ASP_audio_cd.m 45
  2.2.1 Uniform quantization 45
  2.2.2 Dithering 47
  2.2.3 Conventional DAC 49
  2.2.4 Oversampling DAC 53
  2.2.5 Oversampling and noise-shaping DAC 57
  2.2.6 Delta-sigma DAC 59
2.3 Going further 62
2.4 Conclusion 62
References 63

Chapter 3 65

3.1 Background - Sub-band and transform coding 65
  3.1.1 Perfect reconstruction filters 67
  3.1.2 Filter banks and lapped transforms 73
  3.1.3 Using the masking properties of the human ear 76
  3.1.4 Audio coders 77
3.2 MATLAB proof of concept: ASP_mp3.m 80
  3.2.1 Two-channel filter bank 81
  3.2.2 Two-channel QMF filter bank 84
  3.2.3 32-channel pseudo-QMF filter bank 86
  3.2.4 Filter banks and lapped transforms 89
  3.2.5 Perceptual audio coding 92
3.3 Going further 100
3.4 Conclusion 100
References 101

Chapter 4 103

4.1 Background - Statistical pattern recognition 104
  4.1.1 The statistical formalism of ASR 105
  4.1.2 Markov models 108
  4.1.3 Hidden Markov models 111
  4.1.4 Training HMMs 115
4.2 MATLAB proof of concept: ASP_dictation_machine.m 118
  4.2.1 Gaussian modeling and Bayesian classification of vowels 118
  4.2.2 Gaussian Mixture Models (GMM) 123
  4.2.3 Hidden Markov models (HMM) 134
  4.2.4 N-grams 139
  4.2.5 Word-based continuous speech recognition 144
4.3 Going further 147
4.4 Conclusion 147
References 147

Chapter 5 149

5.1 Background - The phase vocoder 149
  5.1.1 DFT-based signal processing 150
  5.1.2 STFT-based signal processing 152
  5.1.3 Perfect reconstruction 156
  5.1.4 Time scale modification with the phase vocoder 157
  5.1.5 Pitch shifting with the phase vocoder 163
5.2 MATLAB proof of concept: ASP_audio_effects.m 166
  5.2.1 STFT-based audio signal processing 166
  5.2.2 Time-scale modification 172
  5.2.3 Pitch modification 179
5.3 Going further 182
5.4 Conclusion 183
References 184

Chapter 6 187

6.1 Background - Source localization 188
  6.1.1 Sperm whale sounds 188
  6.1.2 The Teager-Kaiser energy operator 190
  6.1.3 TDOA estimation based on the generalized cross-correlation 192
  6.1.4 Adaptive TDOA estimation 195
  6.1.5 Multilateration 198
6.2 MATLAB proof of concept: ASP_audio_effects.m 199
  6.2.1 Sperm whale sounds 199
  6.2.2 Teager-Kaiser filtering 203
  6.2.3 TDOA estimation using generalized cross-correlation 210
  6.2.4 TDOA estimation using least-mean squares 215
  6.2.5 Multilateration 218
6.3 Going further 220
6.4 Conclusion 220
References 221

Chapter 7 223

7.1 Background - Audio watermarking seen as a digital communication problem 225
  7.1.1 Spread spectrum signals 226
  7.1.2 Communication channel design 228
  7.1.3 Informed watermarking 233
7.2 MATLAB proof of concept: ASP_watermarking.m 238
  7.2.1 Audio watermarking seen as a digital communication problem 239
  7.2.2 Informed watermarking with error-free detection 244
  7.2.3 Informed watermarking made inaudible 247
  7.2.4 Informed watermarking robust to MPEG compression 259
7.3 Going further 261
7.4 Conclusion 262
References 262

Chapter 8 265

8.1 Background - JPEG 266
  8.1.1 Color transform 268
  8.1.2 Frequency transform: The discrete cosine transform 269
  8.1.3 Entropy coding 279
  8.1.4 A few specificities of the JPEG standard 282
  8.1.5 Quality measures 284
8.2 MATLAB proof of concept 285
  8.2.1 Block image transformation 286
  8.2.2 Complete image block coding 293
  8.2.3 DCT quantization 295
  8.2.4 Spatial decorrelation between blocks 298
  8.2.5 Entropy coding 302
  8.2.6 Still image coding 306
8.3 Going further 308
8.4 Conclusions 309
References 309

Chapter 9 311

9.1 Background - Motion estimation 312
  9.1.1 Motion estimation: The block matching algorithm 316
  9.1.2 A few specificities of video coding standards 321
9.2 MATLAB proof of concept 325
  9.2.1 Macroblock processing 325
  9.2.2 Block matching motion estimation 326
  9.2.3 Motion compensation 339
  9.2.4 Selection of search area 341
  9.2.5 Selection of reference image 343
  9.2.6 Backward motion estimation 345
  9.2.7 Coding of the compensation error 349
  9.2.8 Entropy coding 350
  9.2.9 Video coding 352
9.3 Going further 358
9.4 Conclusion 359
References 359

Chapter 10 361

10.1 Background - Introduction to wavelet and multi-resolution transforms 365
  10.1.1 Think globally, act locally 366
  10.1.2 Approximate... but details matter 367
  10.1.3 Wavelet transform: Definition and computation 370
  10.1.4 WT and discrete signals: DWT 374
  10.1.5 WT and DWT for Images: 1+1=2 375
10.2 Background - Context-based modeling of wavelet coefficients bit planes 376
  10.2.1 Spatial and bit-depth scalability 376
  10.2.2 Efficient entropy coding 377
10.3 Background - Rate-distortion optimal bit allocation across wavelet codeblocks 379
  10.3.1 Problem definition 380
  10.3.2 Lagrangian formulation and approximated solution 381
  10.3.3 Lagrangian optimization: A non-image based example 384
10.4 MATLAB proof of concept 386
  10.4.1 Experiments with the wavelet transform 387
  10.4.2 A simplified JPEG2000 scheme 393
10.5 Going further: From concepts to compliant JPEG2000 codestreams 406
10.6 Conclusion 408
References 408

Chapter 11 411

11.1 Background - Statistical pattern recognition for image classification 414
  11.1.1 Statistical framework 415
  11.1.2 Gaussian mixture models (GMM) 417
  11.1.3 The Expectation-Maximization algorithm (EM) 419
  11.1.4 Markov random fields (MRF) 420
  11.1.5 Hidden Markov random fields (HMRF) 423
  11.1.6 Gaussian hidden Markov random field model 424
11.2 MATLAB proof of concept 426
  11.2.1 3D data visualization 426
  11.2.2 Image histogram 429
  11.2.3 Gaussian mixture model (GMM) 431
  11.2.4 Hidden Gaussian mixture model 437
  11.2.5 Influence of the spatial parameter 439
  11.2.6 Localization and quantification of brain degeneration 441
11.3 Going further 445
  11.3.1 Nature and domain of the transformation 447
  11.3.2 Features and cost function 447
  11.3.3 Optimization 447
11.4 Conclusions 448
11.5 Acknowledgments 448
References 449

Index 451