ADVANCED AUDIO CODING [AAC] Presented By Sirhan Shafahath 00606002 S7 EC

DESCRIPTION

Hi, I'm Sirhan Shafahath. This is my presentation on Advanced Audio Coding, the finest audio coding algorithm today and the successor of MP3. I prepared it as my seminar topic in partial fulfillment of my B.Tech degree. I hope it is useful to you. For detailed information and sources of references, contact me at "[email protected]"; I will always be there to help. Some details on the presentation: it starts with an introduction to CD audio and the need for audio compression, then covers the technology used for compression, i.e. psychoacoustics, then AAC coding with block diagrams, then the SBR and PS technologies that give AAC its quality, followed by a comparison with MP3, the applications, and a short conclusion.

INTRODUCTION

• Advanced Audio Coding (AAC) is a standardized lossy compression and encoding scheme for digital audio.

• It is standardized (defined) in ISO/IEC 13818-7 [MPEG-2] and ISO/IEC 14496-3 [MPEG-4].

• Developed with the cooperation and contribution of companies including Fraunhofer IIS, AT&T Bell Laboratories, Dolby, Sony Corporation, and Nokia.

• Designed to be the successor of the well-known audio compression format MP3.

• Filename extensions: .m4a, .m4b, .m4p, .m4v, .m4r, .3gp, .mp4, .aac

• It is currently the most powerful multichannel audio coding algorithm in the MPEG family.

INTRODUCTION TO DIGITAL AUDIO

• Before the introduction of digital audio, audio signals were represented in analog form.

• Main disadvantages of analog audio: compression, rendering, and quality enhancement are difficult to achieve.

• Representing audio signals in digital form allows us to achieve the above goals more easily.

• The idea behind digital audio is to use numbers to represent the physical sound via an analog-to-digital (A/D) conversion process.

• The A/D conversion process involves sampling and quantization.

Continued…

• Sampling: Each sample's amplitude is recorded as a function of a discrete index. The rate at which samples are extracted is the sampling frequency or sampling rate, expressed in samples per second or hertz (Hz).

• Quantization: Sample resolution, or bit depth, determines how precisely each sample's amplitude is recorded or stored. An n-bit sample resolution allows 2^n different possible amplitude values.
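As a rough illustration of these two steps, the sketch below samples a sine tone at discrete indices and quantizes each sample uniformly. The function names, the 1 kHz test tone, and the mapping of the range [-1, 1] onto integer codes are assumptions made for this example; they are not taken from the slides.

```python
import math

def sample_tone(freq_hz, rate_hz, count):
    """Sampling: evaluate a sine tone at discrete indices n = 0..count-1."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

def quantize(x, bits):
    """Quantization: map an amplitude in [-1, 1] to one of 2**bits integer codes."""
    levels = 2 ** bits
    code = round((x + 1.0) / 2.0 * (levels - 1))
    # Clamp so that out-of-range inputs still yield a valid code.
    return max(0, min(levels - 1, code))

# Eight samples of a 1 kHz tone at the CD sampling rate, stored with 16 bits each.
samples = sample_tone(1000.0, 44100.0, 8)
codes = [quantize(s, 16) for s in samples]
```

With 16 bits, the codes range over 2^16 = 65536 values; lowering `bits` makes the stored amplitudes coarser.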

Continued…

• Encoding: The sampled and quantized signals are encoded using error correction codes and stored on a medium.

• CD AUDIO: It is the most commonly used medium for storing and transporting digital audio.

Sampling rate: 44100 Hz (Nyquist criterion satisfied for 20 kHz)
Sample resolution: 16-bit (ADC)
Size (1 min, stereo): 60 s x 2 channels x 44100 samples/s x 16 bits = 84,672,000 bits = 10.584 MB/min
Filename extensions: .cda, .cdda

• Generally, CD audio is stored as uncompressed PCM data.

• The large amount of data makes it unsuitable for internet streaming and digital broadcasting, because it requires a large bandwidth.

HERE ARISES THE NEED FOR COMPRESSION
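The CD audio size quoted above can be checked with a few lines of arithmetic. This is only a worked calculation; the constant names are chosen for this example, and one megabyte is taken as 1,000,000 bytes.

```python
SAMPLE_RATE_HZ = 44100    # samples per second, per channel
BITS_PER_SAMPLE = 16      # sample resolution
CHANNELS = 2              # stereo
SECONDS = 60              # one minute

bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
bits_per_minute = bits_per_second * SECONDS
megabytes_per_minute = bits_per_minute / 8 / 1_000_000

print(bits_per_second)        # 1411200
print(megabytes_per_minute)   # 10.584
```

The resulting 1411.2 kbit/s is the uncompressed rate that a perceptual coder has to reduce.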

Compression Techniques

• Any compression technique is either lossy or lossless.

• Lossless compression
– If data is losslessly compressed, the original data can be recovered exactly from the compressed data.
– As the name implies, it involves no loss of information.

• Lossy compression
– Involves some loss of information.
– Data that has been lossily compressed generally cannot be recovered exactly.
– By accepting this loss, we can achieve higher compression ratios than with lossless compression.
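The difference between the two families can be seen in a toy example. The sketch below uses Python's standard `zlib` module for the lossless case; the lossy case simply discards the four least significant bits of every byte, which is an illustrative stand-in and not how AAC works.

```python
import zlib

original = bytes(range(256)) * 4   # 1024 bytes of example data

# Lossless: compress, decompress, and get back exactly the same bytes.
restored = zlib.decompress(zlib.compress(original))

# Lossy (toy example): drop the 4 least significant bits of every byte.
# Only 16 distinct values remain, so the original cannot be recovered.
lossy = bytes(b & 0xF0 for b in original)

print(restored == original)   # True
print(lossy == original)      # False
```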

Perceptual Audio Coding

• One of the key elements in the development of reduced bit rate audio is the understanding and application of psychoacoustics.

• All current perceptual audio coders achieve high compression rates by exploiting the fact that signal information that cannot be detected, even by a well-trained listener, can be discarded.

• Human hearing is insensitive to quiet frequency components that accompany other, stronger frequency components.

• Stereo audio streams contain largely redundant information.

• Irrelevant signal information is identified during signal analysis by incorporating several psychoacoustic principles into the coder.

Principles of Psychoacoustics

1. Absolute Threshold of Hearing

The absolute threshold of hearing characterizes the amount of energy needed in a pure tone such that it can be detected by a listener in a noiseless environment.

It can be expressed with a non-linear function of the frequency f (in Hz):

$$T_q(f) = 3.64\,(f/1000)^{-0.8} - 6.5\,e^{-0.6\,(f/1000 - 3.3)^2} + 10^{-3}\,(f/1000)^4 \quad \text{(dB SPL)}$$

Figure: Equal loudness contours for pure tones

Continued…

• When applied to signal compression, the threshold can be interpreted as a maximum allowable energy level for coding distortions introduced in the frequency domain.

• Using this information, the encoder tries to keep the noise introduced during quantization below this threshold.

• As a result, the quantization noise does not become audible.
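The threshold formula for the absolute threshold of hearing can be evaluated directly. The sketch below assumes the commonly quoted coefficients (3.64, 0.8, 6.5, 0.6, 3.3 and 10^-3) with f in Hz; the function name is chosen for this example.

```python
import math

def threshold_in_quiet_db(f_hz):
    """Approximate absolute threshold of hearing in dB SPL for a tone at f_hz."""
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

# The ear is most sensitive in the region of a few kHz, so the threshold is
# lowest there and rises towards low and very high frequencies.
for f in (100.0, 1000.0, 3300.0, 10000.0):
    print(f, round(threshold_in_quiet_db(f), 2))
```

An encoder following the idea on this slide would compare the quantization noise in each frequency region against such a curve and accept only noise that stays below it.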

2. Critical Band

• The human ear can be viewed as a discrete set of band-pass filters that covers the entire 20 kHz frequency range.

• The inner ear, called the "cochlea", contains frequency-sensitive positions. Whenever a tone enters the cochlea, it travels until it reaches the position where it resonates.

• The "critical bandwidth" is a function of frequency that quantifies the cochlear filter pass bands (the width of one critical band is called one Bark).

• As the center frequency increases, the critical bandwidth also increases.

• Spectral analysis of audio content is performed using critical bands.

The critical bandwidth at center frequency f (in Hz) is given by:

$$BW_c(f) = 25 + 75\,\left[1 + 1.4\,(f/1000)^2\right]^{0.69} \quad \text{Hz}$$

Figure: Idealized critical band filter bank
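The growth of the critical bandwidth with center frequency can be checked numerically. The sketch below assumes the commonly quoted form of the approximation, with f in Hz normalised by 1000 and coefficients 25, 75, 1.4 and 0.69; the function name is chosen for this example.

```python
def critical_bandwidth_hz(f_hz):
    """Approximate critical bandwidth in Hz around the center frequency f_hz."""
    khz = f_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * khz ** 2) ** 0.69

# The bandwidth is about 100 Hz at low frequencies and grows steadily
# as the center frequency increases.
for f in (100.0, 1000.0, 5000.0, 10000.0):
    print(f, round(critical_bandwidth_hz(f), 1))
```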

3. Masking

• Masking refers to a process where one sound is rendered inaudible because of the presence of another sound.

Advanced Audio Coding

Modular encoding: AAC takes a modular approach to encoding. Depending on the complexity of the bitstream to be encoded, the desired performance, and the acceptable output, implementers may create profiles to define which of a specific set of tools they want to use for a particular application. The standard offers four default profiles:

• Low Complexity (LC) - the simplest and most widely used and supported

• Main Profile (MAIN) - like the LC profile, with the addition of backward prediction

• Sample-Rate Scalable (SRS) - also known as Scalable Sample Rate (MPEG-4 AAC-SSR)

• Long Term Prediction (LTP) - added in the MPEG-4 standard; an improvement of the MAIN profile using a forward predictor with lower computational complexity

MPEG-2 AAC BLOCK DIAGRAMS

MPEG AAC FAMILY

MPEG-4 AAC LC

Perceptual Noise Substitution [PNS]

• Instead of trying to reproduce a waveform that is similar to the input signal, model-based coding tries to generate a perceptually similar sound as output.

• The encoding of PNS includes two steps:

(1) Noise detection: For the input signal in each frame, the encoder performs some analysis and determines whether the spectral data in a scale-factor band belongs to a noise component.

(2) Noise compression: All spectral samples in the noise-like scale-factor bands are excluded from the subsequent quantization and entropy coding module. Instead, only a PNS flag and the energy of these samples are included in the bitstream.
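The second step can be sketched in a few lines. The code below is a simplified illustration, not the normative procedure: it assumes the noise detection of step (1) has already marked the band as noise-like, and the function names, the dictionary layout, and the use of Gaussian noise in the decoder are all assumptions made for this example.

```python
import math
import random

def pns_encode_band(coeffs):
    """Encoder side: keep only a flag and the band energy, not the samples."""
    energy = sum(c * c for c in coeffs)
    return {"pns_flag": True, "energy": energy, "width": len(coeffs)}

def pns_decode_band(band, rng):
    """Decoder side: generate random noise and scale it to the stored energy."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(band["width"])]
    noise_energy = sum(n * n for n in noise)
    gain = math.sqrt(band["energy"] / noise_energy) if noise_energy > 0 else 0.0
    return [n * gain for n in noise]

# A hypothetical noise-like scale-factor band with four spectral samples.
band = pns_encode_band([0.3, -0.1, 0.25, -0.4])
decoded = pns_decode_band(band, random.Random(0))
```

The decoded samples differ from the originals, but the band carries the same energy, which is the property the slide relies on for a perceptually similar result.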

MPEG-4 HE-AAC

Spectral Band Replication [SBR]

• Developed by the German-based company "Coding Technologies".

• SBR is a bandwidth extension tool.

• The main effect used is the high correlation between the low- and high-frequency content in an audio signal.

• In an SBR-based coding system, waveform audio coding is only used to code the lower frequencies of an audio signal. This low-frequency content is used to recreate the high-frequency content at the decoding side.

• This is done by a state-of-the-art transposition method.

• The reconstruction of the high band is guided by transmitted information such as the spectral envelope of the original input signal, or additional information to compensate for potentially missing high-frequency components.

• This guiding information is referred to as SBR data.

• The recreated high-frequency content undergoes some frequency and time domain adjustment before it is combined with the low-frequency part of the audio signal.

• HE-AAC is also known as aacPlus v1.
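The decoder-side idea can be sketched very roughly as "copy the low band upward, then scale it to the transmitted envelope". The code below is a heavily simplified illustration under stated assumptions: the spectrum is a plain list of coefficients, the high band has the same number of bins as the low band, and the envelope is given as one target energy per equal-width group of bins. The function name and data layout are hypothetical, and a real SBR decoder is considerably more involved.

```python
import math

def sbr_decode_sketch(low_band, envelope_energy):
    """Rebuild a full spectrum from coded low-band bins plus envelope data."""
    n = len(low_band)
    groups = len(envelope_energy)
    if groups == 0 or n % groups != 0:
        raise ValueError("low band must split evenly into envelope groups")
    width = n // groups
    high_band = []
    for i, target in enumerate(envelope_energy):
        # Transposition: reuse a chunk of the low band as high-band content.
        chunk = low_band[i * width:(i + 1) * width]
        chunk_energy = sum(c * c for c in chunk)
        # Envelope adjustment: scale the chunk to the transmitted energy.
        gain = math.sqrt(target / chunk_energy) if chunk_energy > 0 else 0.0
        high_band.extend(c * gain for c in chunk)
    return list(low_band) + high_band

spectrum = sbr_decode_sketch([1.0, 1.0, 2.0, 2.0], [0.5, 2.0])
print(spectrum)   # [1.0, 1.0, 2.0, 2.0, 0.5, 0.5, 1.0, 1.0]
```

Only the low band and two envelope values are "transmitted" here, which is the source of the bit rate saving described on this slide.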

MPEG-4 HE-AAC v2

Parametric Stereo

• It is also a contribution from "Coding Technologies".

• In the encoder, only a monaural downmix of the original stereo signal is coded, after extraction of the Parametric Stereo data.

• Just like SBR data, these parameters are then embedded as PS side information in the ancillary part of the bitstream.

• In the decoder, the monaural signal is decoded first. After that, the stereo signal is reconstructed based on the stereo parameters embedded by the encoder.

• Three types of parameters can be employed in a Parametric Stereo system to describe the stereo image:

– Inter-channel Intensity Difference (IID), describing the intensity difference between the channels.

– Inter-channel Cross-Correlation (ICC), describing the cross-correlation or coherence between the channels. The coherence is measured as the maximum of the cross-correlation as a function of time or phase.

– Inter-channel Phase Difference (IPD), describing the phase difference between the channels.

• HE-AAC v2 is also known as aacPlus v2.
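The encoder-side extraction can be sketched for two of the three parameters. The code below is a simplified illustration: it works on one block of time-domain samples, uses a plain average as the mono downmix, and measures ICC as the normalized cross-correlation at zero lag rather than its maximum over time or phase. IPD is left out because it needs a complex spectral representation. The function name and return layout are assumptions made for this example.

```python
import math

def ps_analyze(left, right):
    """Return (mono downmix, IID in dB, ICC) for one block of stereo samples."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    if e_left == 0 or e_right == 0:
        raise ValueError("both channels need non-zero energy in this sketch")
    cross = sum(l * r for l, r in zip(left, right))
    iid_db = 10.0 * math.log10(e_left / e_right)   # intensity difference
    icc = cross / math.sqrt(e_left * e_right)      # coherence at zero lag
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    return mono, iid_db, icc

# The right channel is the left channel at half amplitude: fully coherent,
# with the left channel about 6 dB stronger.
mono, iid_db, icc = ps_analyze([1.0, 2.0, 3.0], [0.5, 1.0, 1.5])
```

Only `mono` would go through the waveform coder; `iid_db` and `icc` stand in for the PS side information that lets the decoder rebuild an approximate stereo image.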

Advantages Over MP3

AAC

1. Multichannel audio – up to 48 audio channels

2. Sampling frequencies from 8 kHz to 96 kHz

3. Simpler filter bank (pure MDCT used)

4. Better stationary and transient response due to block sizes of 1024 and 128 samples

5. Excellent handling of high-frequency signals

6. CD-quality audio at 64 kbit/s

7. Much better audio quality at lower bit rates (down to 32 kbit/s)

MP3

1. Stereo signal – maximum of only 2 channels

2. Sampling frequencies from 16 kHz to 48 kHz

3. Hybrid filter bank (needs more computational power)

4. Poorer stationary and transient response due to block sizes of 576 and 192 samples

5. Signal handling up to 15.5/15.8 kHz

6. CD-quality audio at 128 kbit/s

7. Audio quality is poorer at low bit rates and may present coding artifacts
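To put the two "CD-quality" bit rates in this comparison in perspective, they can be set against the uncompressed CD audio rate (44100 Hz, 16 bits, stereo). This is only a worked calculation; the variable names are chosen for this example.

```python
# Uncompressed CD audio rate in kbit/s: 44100 Hz x 16 bits x 2 channels.
CD_KBITS_PER_SECOND = 44100 * 16 * 2 / 1000   # 1411.2

# Bit rates at which the comparison claims CD quality for each codec.
claimed_rates = {"AAC": 64, "MP3": 128}

ratios = {name: CD_KBITS_PER_SECOND / kbps for name, kbps in claimed_rates.items()}

for name, ratio in ratios.items():
    print(f"{name}: about {ratio:.0f}:1 compression")
```

Under these figures AAC needs roughly half the bit rate of MP3 for the quality level claimed on the slide.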

Disadvantages

• Transparency is lost at very low bit rates when SBR is used.

• Small loss of stereo image when PS is used.

APPLICATIONS

• HE-AAC was chosen as the coding used in DAB (Digital Audio Broadcasting).

• HE-AAC is the coding used in DRM (Digital Radio Mondiale).

• It is the default format in Apple's iPod.

• Used in mobile phones to store songs.

• It is the audio coding used in the 3gp and 3gpp formats.

• It is the audio coding used in DTH services [MPEG-4].

• For Internet streaming.

• Audio format in Bluetooth stereo/mono headsets [A2DP – Advanced Audio Distribution Profile] (optional).

Conclusion

AAC – the perceptual audio coding scheme the world is going to adopt completely

THANK YOU

Page 2: Advanced Audio Coding [Aac]

INTRODUCTIONbull Advanced Audio Coding (AAC) is a standardized lossy compression and

encoding scheme for digital audio

bull Its standardized (defined) in ISOIEC 13818-7 [MPEG-2] ISOIEC 14496-3 [MPEG-4]

bull Developed with the cooperation and contribution of companies including Fraunhofer IIS ATampT Bell Laboratories Dolby Sony Co and Nokia

bull Designed to be the successor of the well-known audio compression format MP3

bull Filename extension m4a m4b m4p m4v m4r 3gp mp4 aac

bull It is currently the most powerful multichannel audio coding algorithm in MPEG family

INTRODUCTION TO DIGITAL AUDIO

bull Before the introduction of digital audio audio signals have been represented in analog form

bull Main disadvantages of analog audio Compression Rendering Quality Enhancement

bull Representing audio signals in digital form allows us to achieve the above goals more easily

bull The idea behind digital audio is to use numbers to represent the physical sound via an analog-to-digital (AD) conversion process

bull The AD conversion process involves sampling and quantization

Continuehellip

bull Sampling Each samplersquos amplitude as a function of a discrete index the rate at which each sample is extracted the sampling frequency or the sampling rate which is described in terms of number of samples per second or Hertz (Hz)

bull Quantization Sample resolution or bit depth determines how precisely the samplersquos amplitude is recorded or stored An n-bit sample resolution allows 2^n different possible amplitude values

Continuehellipbull Encoding The sampled and quantized signals are encoded using

some error correction codes and are stored in a media

bull CD AUDIO Itrsquos the most commonly used media for storing and transporting of digital audio

Sampling Rate 44100Hz (Nyquist Criteria satisfied for 20KHz) Sample Resolution 16-bit (ADC) Size (1minStereo) 60 x 2 x 44100 x 16 = 10584 MBmin Filename cda cdda

bull Generally they are uncompressed PCM data

bull The large amount of data makes them not suitable for internet streaming and digital broadcasting because of large bandwidth

HERE ARISE THE NEED FOR COMPRESSION

Compression Techniques

bull Any compression technique belongs to either lossy compression or lossless compression

bull Lossless Compression ndash If data is losslessly compressed the original data can be recovered

exactly from the compressed datandash As name implies involve no loss of information

bull Lossy compression ndash Involves some loss of informationndash Data that have been lossy compressed generally cannot be

recovered exactlyndash By accepting the above we can achieve higher compression ratios

than lossless compression

Perceptual Audio Coding

bull One of the key elements in the development of reduced bit rate audio is the understanding and application of psychoacoustics

bull All of the current perceptual audio coders achieve high compression rates by exploiting the fact that signal information that cannot be detected by even a well-trained listener can be discarded

bull Human hearing is insensitive to quiet frequency components to sound accompanying other stronger frequency components

bull Stereo audio streams contain largely redundant information

bull Irrelevant signal information is identified during signal analysis by incorporating into the coder several psychoacoustic principles

Principles of Psychoacoustics

1 Absolute Threshold of Hearing

The absolute threshold of hearing characterizes the amount of energy needed in a pure tone such that it can be detected by a listener in a noiseless environment

It can be expressed with a non-linear function

Tq(f) = 364(f1000)-08 - 65e-06(f1000-33)2 + 10-3(f1000)4 (dB SPL)

Equal loudness contours for pure tones

Continuehellip

bull When applied to signal compression it could be interpreted as a maximum allowable energy level for coding distortions introduced in the frequency domain

bull So using this information the noise levels during quantization are tried to fit below this threshold

bull Due to this quantization noise does not become audible

2 Critical Band

bull Human ear can be viewed as a discrete set of band pass filters which covers the entire 20kHz frequency range

bull The inner ear called as rdquoCochleardquo contains frequency sensitive positions Whenever any tone enters the cochlea it moves until it reaches the position where it resonates

bull The ldquocritical bandwidthrdquo is a function of frequency that quantifies the cochlear filter pass bands (unit ndash Bark)

bull As the center frequency goes on increasing the bark-width also goes on increasing

bull Spectral analysis of audio content is performed using critical bands

Bark-width with center frequency lsquofrsquo is gives as hellip BWc(f) = 25 + 75(1 + 14(f100)2)069 Hz

Idealized critical band filter bank

3 Masking

bull Masking refers to a process where one sound is rendered inaudible because of the presence of another sound

Advanced Audio CodingModular encoding AAC takes a modular approach to encoding Depending on the

complexity of the bitstream to be encoded the desired performance and the acceptable output implementers may create profiles to define which of a specific set of tools they want use for a particular application The standard offers four default profiles

bull Low Complexity (LC) - the simplest and most widely used and supported

bull Main Profile (MAIN) - like the LC profile with the addition of backwards prediction

bull Sample-Rate Scalable (SRS) - aka Scalable Sample Rate (MPEG-4 AAC-SSR)

bull Long Term Prediction (LTP) - added in the MPEG-4 standard - an improvement of the MAIN profile using a forward predictor with lower computational complexity

MPEG-2 AAC BLOCK DIAGRAMS

MPEG AAC FAMILY

MPEG-4 AAC LCPerceptual Noise Substitution [PNS ]

bull Instead of trying to reproduce a waveform that is similar as input signals the model-based coding tries to generate a perceptually

similar sound as output

bull The encoding of PNS includes two steps (1) Noise detection For input signals in each frame the encoder

performs some analysis and determines if the spectral data in a scale-factor band belongs to noise component

(2) Noise compression All spectral samples in the noise-like scale-factor bands are excluded from the following quantization and entropy coding module Instead only a PNS flag and the energy of these samples are included in the bitstream

MPEG-4 HE-AAC

Spectral Band Replication [ SBR ]bull Developed by a German based company ldquoCoding Technologiesrdquo

bull SBR is a bandwidth extension tool

bull The main effect used is the high correlation between the low- and high-frequency content in an audio signal

bull In an SBR-based coding system waveform audio coding is only used to code the lower frequencies of an audio signal This low frequency content is used to recreate the high frequency content at the decoding side

bull This is done by state-of-the-art transposition method

bull The reconstruction of the high band is conducted by transmitting guiding information such as the spectral envelope of the original input signal or additional information to compensate for potentially missing high-frequency components

bull This guiding information is referred to as SBR data

bull The recreated high-frequency content undergoes some frequency and time domain adjustment before it is combined with the low-frequency part of the audio signal

bull HE-AAC aka aacPlus v1

Continuehellip

Continuehellip

Continuehellip

MPEG-4 HE-AAC v2

Parametric Stereo

bull Its also a contribution from ldquoCoding Technologiesrdquo

bull In the encoder only a monaural downmix of the original stereo signal is coded after extraction of the Parametric Stereo data

bull Just like SBR data these parameters are then embedded as PS side information in the ancillary part of the bit-stream

bull In the decoder the monaural signal is decoded first After that the stereo signal is reconstructed based on the stereo parameters embedded by the encoder

bull Three types of parameters can be employed in a Parametric Stereo system to describe the stereo image

1048705bull Inter-channel Intensity Difference (IID) describing the intensity

difference between the channels

bull Inter-channel Cross-Correlation (ICC) describing the cross correlation or coherence between the channels The coherence is measured as the maximum of the cross-correlation as a function of time or phase

bull Inter-channel Phase Difference (IPD) describing the phase difference between the channels

bull HE-AACv2 aka aacPlus v2

Continuehellip

Continuehellip

Advantages Over MP3 AAC

1 Multi Channel Audio ndash up to 48 audio channels

2 Sample frequencies from 8KHz ~ 96KHz

3 Simpler filter bank (pure MDCT used)

4 Better stationary and transient response due to block sizes of 1024 and 128 samples

5 Excellent handling of high frequency signals

6 CD quality audio at 64Kbitssec

7 Much better quality of audio at lower bit rates (down to 32Kbps)

MP3

1 Stereo signal ndash maximum of only 2 channels

2 Sampling frequencies from 16KHz ~ 48KHz

3 Hybrid filter bank ( more computational power)

4 Poorer stationary and transient response due to block sizes of 576 and 192 samples

5 Signal handling up to 155158 KHz

6 CD quality audio at 128Kbitssec

7 Audio quality is poorer at low bit rates and may present coding artifacts

Disadvantages

bull Transparency is lost at very low bit rates when SBR is used

bull Small loss of stereo image when PS is used

APPLICATIONS

bull HE-AAC was chosen as the coding used in DAB (Digital Audio Broadcasting)

bull HE-AAC is the coding used in DRM (Digital Radio Mondiale)bull Itrsquos the default format in Apples i-PODbull Used in mobile phone to store songsbull Itrsquos the audio coding used in 3gp and 3gpp formatbull Itrsquos the audio coding used in DTH services [MPEG-4]bull For Internet Streamingbull Audio format in Bluetooth StereoMono headsets

[ A2DP ndash Advanced Audio Distribution profile ] (Optional)

Conclusion

AAC ndash The perceptual audio coding the world is going to adapt completely

ReferencesSitesbull wwwwikipediaorgbull wwwhydrogenaudioorgbull wwwcodingtechnologiescombull wwwmp3-techorgaachtml

Booksbull High-Fidelity Multichannel Audio Coding - Dai Tracy Yang Chris Kyriakakis and

C-C Jay Kuobull Introduction To Data Compression - Khalid Sayood

Papersbull ISOIEC Standards [13818-7 14496-3]bull MP3 and AAC Explained Karlheinz Brandenburg [Father of MP3]bull CT-aacPlus - a state-of-the-art audio coding scheme Martin Dietz and Stefan

Meltzerbull MPEG-4 HE-AAC v2 - audio coding for todayrsquos media world Stefan Meltzer and

Gerald Moserbull helliphelliphellip

THANK YOU

  • ADVANCED AUDIO CODING [AAC]
  • INTRODUCTION
  • INTRODUCTION TO DIGITAL AUDIO
  • Continuehellip
  • Slide 5
  • Compression Techniques
  • Perceptual Audio Coding
  • Principles of Psychoacoustics
  • Slide 9
  • Slide 10
  • 2 Critical Band
  • Slide 12
  • 3 Masking
  • Advanced Audio Coding
  • MPEG-2 AAC BLOCK DIAGRAMS
  • Slide 16
  • Slide 17
  • MPEG AAC FAMILY
  • MPEG-4 AAC LC
  • MPEG-4 HE-AAC
  • Slide 21
  • Slide 22
  • Slide 23
  • MPEG-4 HE-AAC v2
  • Slide 25
  • Slide 26
  • Slide 27
  • Advantages Over MP3
  • Disadvantages
  • APPLICATIONS
  • Conclusion
  • References
  • Slide 33
Page 3: Advanced Audio Coding [Aac]

INTRODUCTION TO DIGITAL AUDIO

bull Before the introduction of digital audio audio signals have been represented in analog form

bull Main disadvantages of analog audio Compression Rendering Quality Enhancement

bull Representing audio signals in digital form allows us to achieve the above goals more easily

bull The idea behind digital audio is to use numbers to represent the physical sound via an analog-to-digital (AD) conversion process

bull The AD conversion process involves sampling and quantization

Continuehellip

bull Sampling Each samplersquos amplitude as a function of a discrete index the rate at which each sample is extracted the sampling frequency or the sampling rate which is described in terms of number of samples per second or Hertz (Hz)

bull Quantization Sample resolution or bit depth determines how precisely the samplersquos amplitude is recorded or stored An n-bit sample resolution allows 2^n different possible amplitude values

Continuehellipbull Encoding The sampled and quantized signals are encoded using

some error correction codes and are stored in a media

bull CD AUDIO Itrsquos the most commonly used media for storing and transporting of digital audio

Sampling Rate 44100Hz (Nyquist Criteria satisfied for 20KHz) Sample Resolution 16-bit (ADC) Size (1minStereo) 60 x 2 x 44100 x 16 = 10584 MBmin Filename cda cdda

bull Generally they are uncompressed PCM data

bull The large amount of data makes them not suitable for internet streaming and digital broadcasting because of large bandwidth

HERE ARISE THE NEED FOR COMPRESSION

Compression Techniques

bull Any compression technique belongs to either lossy compression or lossless compression

bull Lossless Compression ndash If data is losslessly compressed the original data can be recovered

exactly from the compressed datandash As name implies involve no loss of information

bull Lossy compression ndash Involves some loss of informationndash Data that have been lossy compressed generally cannot be

recovered exactlyndash By accepting the above we can achieve higher compression ratios

than lossless compression

Perceptual Audio Coding

bull One of the key elements in the development of reduced bit rate audio is the understanding and application of psychoacoustics

bull All of the current perceptual audio coders achieve high compression rates by exploiting the fact that signal information that cannot be detected by even a well-trained listener can be discarded

bull Human hearing is insensitive to quiet frequency components to sound accompanying other stronger frequency components

bull Stereo audio streams contain largely redundant information

bull Irrelevant signal information is identified during signal analysis by incorporating into the coder several psychoacoustic principles

Principles of Psychoacoustics

1 Absolute Threshold of Hearing

The absolute threshold of hearing characterizes the amount of energy needed in a pure tone such that it can be detected by a listener in a noiseless environment

It can be expressed with a non-linear function

Tq(f) = 364(f1000)-08 - 65e-06(f1000-33)2 + 10-3(f1000)4 (dB SPL)

Equal loudness contours for pure tones

Continuehellip

bull When applied to signal compression it could be interpreted as a maximum allowable energy level for coding distortions introduced in the frequency domain

bull So using this information the noise levels during quantization are tried to fit below this threshold

bull Due to this quantization noise does not become audible

2 Critical Band

bull Human ear can be viewed as a discrete set of band pass filters which covers the entire 20kHz frequency range

bull The inner ear called as rdquoCochleardquo contains frequency sensitive positions Whenever any tone enters the cochlea it moves until it reaches the position where it resonates

bull The ldquocritical bandwidthrdquo is a function of frequency that quantifies the cochlear filter pass bands (unit ndash Bark)

bull As the center frequency goes on increasing the bark-width also goes on increasing

bull Spectral analysis of audio content is performed using critical bands

Bark-width with center frequency lsquofrsquo is gives as hellip BWc(f) = 25 + 75(1 + 14(f100)2)069 Hz

Idealized critical band filter bank

3 Masking

bull Masking refers to a process where one sound is rendered inaudible because of the presence of another sound

Advanced Audio CodingModular encoding AAC takes a modular approach to encoding Depending on the

complexity of the bitstream to be encoded the desired performance and the acceptable output implementers may create profiles to define which of a specific set of tools they want use for a particular application The standard offers four default profiles

bull Low Complexity (LC) - the simplest and most widely used and supported

bull Main Profile (MAIN) - like the LC profile with the addition of backwards prediction

bull Sample-Rate Scalable (SRS) - aka Scalable Sample Rate (MPEG-4 AAC-SSR)

bull Long Term Prediction (LTP) - added in the MPEG-4 standard - an improvement of the MAIN profile using a forward predictor with lower computational complexity

MPEG-2 AAC BLOCK DIAGRAMS

MPEG AAC FAMILY

MPEG-4 AAC LCPerceptual Noise Substitution [PNS ]

bull Instead of trying to reproduce a waveform that is similar as input signals the model-based coding tries to generate a perceptually

similar sound as output

bull The encoding of PNS includes two steps (1) Noise detection For input signals in each frame the encoder

performs some analysis and determines if the spectral data in a scale-factor band belongs to noise component

(2) Noise compression All spectral samples in the noise-like scale-factor bands are excluded from the following quantization and entropy coding module Instead only a PNS flag and the energy of these samples are included in the bitstream

MPEG-4 HE-AAC

Spectral Band Replication [ SBR ]bull Developed by a German based company ldquoCoding Technologiesrdquo

bull SBR is a bandwidth extension tool

bull The main effect used is the high correlation between the low- and high-frequency content in an audio signal

bull In an SBR-based coding system waveform audio coding is only used to code the lower frequencies of an audio signal This low frequency content is used to recreate the high frequency content at the decoding side

bull This is done by state-of-the-art transposition method

bull The reconstruction of the high band is conducted by transmitting guiding information such as the spectral envelope of the original input signal or additional information to compensate for potentially missing high-frequency components

bull This guiding information is referred to as SBR data

bull The recreated high-frequency content undergoes some frequency and time domain adjustment before it is combined with the low-frequency part of the audio signal

bull HE-AAC aka aacPlus v1

Continuehellip

Continuehellip

Continuehellip

MPEG-4 HE-AAC v2

Parametric Stereo

bull Its also a contribution from ldquoCoding Technologiesrdquo

bull In the encoder only a monaural downmix of the original stereo signal is coded after extraction of the Parametric Stereo data

bull Just like SBR data these parameters are then embedded as PS side information in the ancillary part of the bit-stream

bull In the decoder the monaural signal is decoded first After that the stereo signal is reconstructed based on the stereo parameters embedded by the encoder

bull Three types of parameters can be employed in a Parametric Stereo system to describe the stereo image

1048705bull Inter-channel Intensity Difference (IID) describing the intensity

difference between the channels

bull Inter-channel Cross-Correlation (ICC) describing the cross correlation or coherence between the channels The coherence is measured as the maximum of the cross-correlation as a function of time or phase

bull Inter-channel Phase Difference (IPD) describing the phase difference between the channels

bull HE-AACv2 aka aacPlus v2

Continuehellip

Continuehellip

Advantages Over MP3

AAC

1. Multichannel audio: up to 48 audio channels

2. Sample frequencies from 8 kHz to 96 kHz

3. Simpler filter bank (pure MDCT)

4. Better stationary and transient response due to block sizes of 1024 and 128 samples

5. Excellent handling of high-frequency signals

6. CD-quality audio at 64 kbit/s

7. Much better audio quality at lower bit rates (down to 32 kbit/s)

MP3

1. Stereo signal: maximum of only 2 channels

2. Sampling frequencies from 16 kHz to 48 kHz

3. Hybrid filter bank (more computational power)

4. Poorer stationary and transient response due to block sizes of 576 and 192 samples

5. Signal handling up to 15.5/15.8 kHz

6. CD-quality audio at 128 kbit/s

7. Audio quality is poorer at low bit rates and may present coding artifacts
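The block sizes in item 4 can be turned into numbers with a little arithmetic. The sketch below assumes a 44.1 kHz sampling rate and treats each block size as the number of spectral lines per block; the helper names are illustrative only.

```python
def block_duration_ms(block_size, sample_rate_hz=44100):
    """Time span covered by one block of samples, in milliseconds."""
    return 1000.0 * block_size / sample_rate_hz

def bin_spacing_hz(spectral_lines, sample_rate_hz=44100):
    """Spacing between spectral lines when the band 0..fs/2 is split evenly."""
    return (sample_rate_hz / 2.0) / spectral_lines

if __name__ == "__main__":
    for name, long_block, short_block in (("AAC", 1024, 128), ("MP3", 576, 192)):
        print(name,
              "long block spacing: %.1f Hz" % bin_spacing_hz(long_block),
              "short block duration: %.2f ms" % block_duration_ms(short_block))
```

Under these assumptions the AAC long block gives finer frequency spacing (better for stationary signals) and the AAC short block covers a shorter time span (better for transients) than the corresponding MP3 blocks.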

Disadvantages

• Transparency is lost at very low bit rates when SBR is used

• Small loss of stereo image when PS is used

APPLICATIONS

• HE-AAC v2 is the audio coding used in DAB+, the upgrade to DAB (Digital Audio Broadcasting)

• HE-AAC is the coding used in DRM (Digital Radio Mondiale)

• It is the default format on Apple's iPod

• Used in mobile phones to store songs

• It is the audio coding used in the 3gp and 3gpp formats

• It is the audio coding used in DTH services [MPEG-4]

• Internet streaming

• Audio format in Bluetooth stereo/mono headsets [A2DP - Advanced Audio Distribution Profile] (optional)

Conclusion

AAC: the perceptual audio coding scheme the world is going to adopt completely

THANK YOU

INTRODUCTION TO DIGITAL AUDIO (continued)

• Sampling: The signal is represented by sample amplitudes as a function of a discrete index. The rate at which samples are extracted is the sampling frequency or sampling rate, expressed in samples per second or hertz (Hz)

• Quantization: The sample resolution, or bit depth, determines how precisely each sample's amplitude is recorded or stored. An n-bit sample resolution allows 2^n different possible amplitude values
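Sampling and quantization as described above can be sketched as follows. The uniform quantizer over the range [-1, 1] and the function names are assumptions chosen for illustration; real converters differ in detail.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Sample a unit-amplitude sine at a fixed sampling rate."""
    return [math.sin(2.0 * math.pi * freq_hz * k / sample_rate_hz)
            for k in range(n_samples)]

def quantize(x, bits):
    """Map an amplitude in [-1, 1] to one of 2**bits integer codes."""
    levels = 2 ** bits
    code = int(round((x + 1.0) / 2.0 * (levels - 1)))
    return max(0, min(levels - 1, code))  # clamp out-of-range input

def dequantize(code, bits):
    """Map an integer code back to an amplitude in [-1, 1]."""
    levels = 2 ** bits
    return code / (levels - 1) * 2.0 - 1.0

if __name__ == "__main__":
    samples = sample_sine(1000, 8000, 8)
    codes = [quantize(s, 8) for s in samples]
    print(codes)
    print([round(dequantize(c, 8), 3) for c in codes])
```

With 8 bits the reconstructed amplitudes differ from the originals by at most half a quantization step; raising the bit depth shrinks that step.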

• Encoding: The sampled and quantized signals are encoded using error-correction codes and stored on a medium

• CD AUDIO: It is the most commonly used medium for storing and transporting digital audio

Sampling rate: 44,100 Hz (Nyquist criterion satisfied for 20 kHz)
Sample resolution: 16-bit (ADC)
Size (1 min, stereo): 60 x 2 x 44100 x 16 bits = 84,672,000 bits = 10.584 MB/min
Filename extensions: .cda, .cdda

• Generally, CD audio is uncompressed PCM data

• The large amount of data makes it unsuitable for internet streaming and digital broadcasting, because of the large bandwidth required

HERE ARISES THE NEED FOR COMPRESSION
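The CD audio figures above follow from simple arithmetic, sketched here. The function names are illustrative, and a megabyte is taken as 1,000,000 bytes, which is an assumption that matches the 10.584 MB/min figure.

```python
def pcm_bits_per_second(sample_rate_hz, bits_per_sample, channels):
    """Raw data rate of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels

def pcm_megabytes_per_minute(sample_rate_hz, bits_per_sample, channels):
    """Storage needed for one minute of PCM audio, in MB (10**6 bytes)."""
    bits = pcm_bits_per_second(sample_rate_hz, bits_per_sample, channels) * 60
    return bits / 8 / 1_000_000

if __name__ == "__main__":
    print(pcm_bits_per_second(44100, 16, 2))       # bit/s for CD audio
    print(pcm_megabytes_per_minute(44100, 16, 2))  # MB per minute
```

The resulting rate of roughly 1.4 Mbit/s is the bandwidth an uncompressed stereo stream would need.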

Compression Techniques

• Any compression technique is either lossy or lossless

• Lossless compression
– If data is losslessly compressed, the original data can be recovered exactly from the compressed data
– As the name implies, it involves no loss of information

• Lossy compression
– Involves some loss of information
– Data that has been lossily compressed generally cannot be recovered exactly
– By accepting this loss, we can achieve higher compression ratios than with lossless compression
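The difference between the two families can be shown with a small sketch. The lossless half uses Python's standard `zlib` module; the lossy half simply drops low-order bits of non-negative integer samples, which is a deliberately crude stand-in and not how a perceptual coder such as AAC discards information.

```python
import zlib

def lossless_roundtrip(data: bytes) -> bytes:
    """Compress and decompress with zlib; the output equals the input."""
    return zlib.decompress(zlib.compress(data))

def lossy_roundtrip(samples, keep_bits, total_bits=16):
    """Keep only the top keep_bits of each non-negative integer sample."""
    shift = total_bits - keep_bits
    return [(s >> shift) << shift for s in samples]

if __name__ == "__main__":
    data = bytes(range(256)) * 4
    print(lossless_roundtrip(data) == data)          # exact recovery
    print(lossy_roundtrip([1000, 1001, 1023], 8))    # detail is gone
```

After the lossy round trip the three distinct samples collapse to the same value, so the original cannot be recovered, but fewer bits are needed to store them.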

Perceptual Audio Coding

• One of the key elements in the development of reduced-bit-rate audio is the understanding and application of psychoacoustics

• All current perceptual audio coders achieve high compression ratios by exploiting the fact that signal information that cannot be detected, even by a well-trained listener, can be discarded

• Human hearing is insensitive to quiet frequency components that accompany other, stronger frequency components

• Stereo audio streams contain largely redundant information

• Irrelevant signal information is identified during signal analysis by incorporating several psychoacoustic principles into the coder

Principles of Psychoacoustics

1. Absolute Threshold of Hearing

The absolute threshold of hearing characterizes the amount of energy needed in a pure tone for it to be detected by a listener in a noiseless environment

It can be expressed with a non-linear function of the frequency f (in Hz):

$$T_q(f) = 3.64\,(f/1000)^{-0.8} - 6.5\,e^{-0.6\,(f/1000 - 3.3)^2} + 10^{-3}\,(f/1000)^4 \quad \text{(dB SPL)}$$
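The threshold function can be evaluated directly; the sketch below is a straightforward transcription of the formula, with `absolute_threshold_db_spl` as an illustrative name and the input frequency assumed to be positive and given in Hz.

```python
import math

def absolute_threshold_db_spl(f_hz):
    """Approximate threshold in quiet (dB SPL) for a pure tone at f_hz > 0."""
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

if __name__ == "__main__":
    for f in (100, 1000, 3300, 10000, 16000):
        print(f, round(absolute_threshold_db_spl(f), 2))
```

The values dip to a minimum in the region around 3 to 4 kHz, where hearing is most sensitive, and rise steeply towards both ends of the audible range.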

Equal loudness contours for pure tones

• When applied to signal compression, the threshold can be interpreted as a maximum allowable energy level for coding distortions introduced in the frequency domain

• Using this information, the encoder tries to keep the quantization noise level below this threshold

• As a result, the quantization noise does not become audible

2. Critical Band

• The human ear can be viewed as a discrete set of band-pass filters that covers the entire 20 kHz frequency range

• The cochlea, in the inner ear, contains frequency-sensitive positions. Whenever a tone enters the cochlea, it travels until it reaches the position where it resonates

• The "critical bandwidth" is a function of frequency that quantifies the cochlear filter passbands (unit: Bark)

• As the center frequency increases, the bark-width also increases

• Spectral analysis of audio content is performed using critical bands

The bark-width at center frequency f (in Hz) is given by:

$$BW_c(f) = 25 + 75\,\left[1 + 1.4\,(f/1000)^2\right]^{0.69} \quad \text{(Hz)}$$
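A direct transcription of the bark-width formula makes the growth with center frequency easy to check. The function name is illustrative and the center frequency is assumed to be given in Hz.

```python
def critical_bandwidth_hz(f_hz):
    """Approximate critical bandwidth (Hz) at center frequency f_hz."""
    khz = f_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * khz ** 2) ** 0.69

if __name__ == "__main__":
    for f in (100, 500, 1000, 5000, 10000):
        print(f, round(critical_bandwidth_hz(f), 1))
```

The bandwidth stays close to 100 Hz at low center frequencies and widens rapidly above about 1 kHz, which is why a perceptual coder can use coarser frequency resolution for the upper bands.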

Idealized critical band filter bank

3. Masking

• Masking refers to a process where one sound is rendered inaudible because of the presence of another sound

Advanced Audio Coding

Modular encoding: AAC takes a modular approach to encoding. Depending on the complexity of the bitstream to be encoded, the desired performance, and the acceptable output, implementers may create profiles that define which of a specific set of tools they want to use for a particular application. The standard offers four default profiles:

• Low Complexity (LC): the simplest and the most widely used and supported

• Main Profile (MAIN): like the LC profile, with the addition of backward prediction

• Scalable Sampling Rate (SSR): also called Sample-Rate Scalable (SRS); MPEG-4 AAC-SSR

• Long Term Prediction (LTP): added in the MPEG-4 standard; an improvement of the MAIN profile using a forward predictor with lower computational complexity

MPEG-2 AAC BLOCK DIAGRAMS

MPEG AAC FAMILY

MPEG-4 AAC LC

Perceptual Noise Substitution [PNS]

• Instead of trying to reproduce a waveform that is similar to the input signal, this model-based coding tries to generate a perceptually similar sound as output

• The encoding of PNS includes two steps:

(1) Noise detection: For the input signal in each frame, the encoder performs some analysis and determines whether the spectral data in a scale-factor band belongs to a noise component

(2) Noise compression: All spectral samples in the noise-like scale-factor bands are excluded from the subsequent quantization and entropy coding modules. Instead, only a PNS flag and the energy of these samples are included in the bitstream
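The two PNS steps can be sketched as below. The slides do not say how noise is detected, so the spectral-flatness test and its 0.5 threshold are purely hypothetical stand-ins, as are the function names and the dictionary used in place of a real bitstream.

```python
import math
import random

def band_energy(band):
    """Total energy of the spectral samples in one scale-factor band."""
    return sum(x * x for x in band)

def spectral_flatness(band):
    """Geometric mean over arithmetic mean of the power values (0..1)."""
    powers = [x * x + 1e-12 for x in band]  # small offset avoids log(0)
    geo = math.exp(sum(math.log(p) for p in powers) / len(powers))
    return geo / (sum(powers) / len(powers))

def pns_encode(band, threshold=0.5):
    """Step 1: detect noise. Step 2: send only a flag and the energy."""
    if spectral_flatness(band) >= threshold:  # hypothetical detector
        return {"pns": True, "energy": band_energy(band)}
    return {"pns": False, "coeffs": list(band)}  # normal coding path

def pns_decode(coded, band_len, rng=None):
    """Regenerate a noise-like band with the transmitted energy."""
    if not coded["pns"]:
        return list(coded["coeffs"])
    rng = rng or random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(band_len)]
    scale = math.sqrt(coded["energy"] / band_energy(noise))
    return [scale * x for x in noise]

if __name__ == "__main__":
    coded = pns_encode([1.0, -1.0, 1.0, -1.0])
    print(coded, pns_decode(coded, 4))
```

The decoded band has the same energy as the original but a different waveform, which is the sense in which the output is only perceptually similar.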

MPEG-4 HE-AAC

Spectral Band Replication [SBR]

• Developed by the German-based company "Coding Technologies"

• SBR is a bandwidth extension tool

• The main effect exploited is the high correlation between the low- and high-frequency content of an audio signal

Page 7: Advanced Audio Coding [Aac]

Perceptual Audio Coding

• A key element in the development of reduced-bit-rate audio is the understanding and application of psychoacoustics

• Current perceptual audio coders achieve high compression ratios by discarding signal information that even a well-trained listener cannot detect

• Human hearing is insensitive to quiet frequency components that accompany stronger frequency components

• Stereo audio streams contain largely redundant information

• Irrelevant signal information is identified during signal analysis by building several psychoacoustic principles into the coder

Principles of Psychoacoustics

1. Absolute Threshold of Hearing

The absolute threshold of hearing is the amount of energy a pure tone needs in order to be detected by a listener in a noiseless environment

It can be approximated by a non-linear function of the frequency f (in Hz)

$T_q(f) = 3.64\,(f/1000)^{-0.8} - 6.5\,e^{-0.6\,(f/1000 - 3.3)^2} + 10^{-3}\,(f/1000)^4$ (dB SPL)

Figure: Equal-loudness contours for pure tones

• Applied to signal compression, the threshold can be interpreted as the maximum allowable energy level for coding distortions introduced in the frequency domain

• Using this information, the encoder tries to keep the quantization noise below this threshold

• As a result, the quantization noise does not become audible
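The threshold curve and the "keep the noise below it" check can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and a real encoder compares noise against a signal-dependent masking threshold, not just the threshold in quiet.

```python
import math

def absolute_threshold_db(f_hz):
    """Approximate absolute threshold of hearing in dB SPL for a tone at f_hz (Hz)."""
    k = f_hz / 1000.0  # frequency in kHz, as used by the formula
    return 3.64 * k ** -0.8 - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2) + 1e-3 * k ** 4

def noise_is_inaudible(f_hz, noise_level_db):
    """True if noise at f_hz lies below the threshold in quiet (illustrative check)."""
    return noise_level_db < absolute_threshold_db(f_hz)
```

With this curve the ear is most sensitive around 3 to 4 kHz, where the threshold dips below 0 dB SPL, so quantization noise has to be kept lowest in that region.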

2. Critical Band

• The human ear can be viewed as a discrete set of band-pass filters that covers the entire audible range up to 20 kHz

• The cochlea, in the inner ear, contains frequency-sensitive positions. When a tone enters the cochlea, it travels along it until it reaches the position where it resonates

• The "critical bandwidth" is a function of frequency that quantifies the cochlear filter passbands; position on the critical-band scale is measured in Bark

• As the center frequency increases, the critical bandwidth also increases

• Spectral analysis of audio content is performed using critical bands

The critical bandwidth at center frequency f (in Hz) is given by

$BW_c(f) = 25 + 75\,[1 + 1.4\,(f/1000)^2]^{0.69}$ Hz

Figure: Idealized critical band filter bank
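To get a feel for how the critical bandwidth grows with center frequency, the commonly quoted approximation 25 + 75 [1 + 1.4 (f/1000)^2]^0.69 Hz can be evaluated directly. The function name below is hypothetical and the sketch only evaluates the formula.

```python
def critical_bandwidth_hz(f_hz):
    """Approximate critical bandwidth in Hz at center frequency f_hz (Hz)."""
    k = f_hz / 1000.0  # center frequency in kHz
    return 25.0 + 75.0 * (1.0 + 1.4 * k * k) ** 0.69

# Bandwidth at a few center frequencies, low to high.
for f in (100, 1000, 10000):
    print(f, round(critical_bandwidth_hz(f), 1))
```

The bandwidth stays close to 100 Hz at low frequencies and widens to a couple of kilohertz near the top of the audible range, which is why coders group spectral lines into progressively wider bands.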

3. Masking

• Masking is the process by which one sound is rendered inaudible by the presence of another sound

Advanced Audio Coding

Modular encoding

AAC takes a modular approach to encoding. Depending on the complexity of the bitstream to be encoded, the desired performance and the acceptable output, implementers may create profiles that define which of a specific set of tools they want to use for a particular application. The standard offers four default profiles

• Low Complexity (LC) - the simplest, and the most widely used and supported

• Main Profile (MAIN) - like the LC profile, with the addition of backward prediction

• Scalable Sample Rate (SSR) - also known as Sample-Rate Scalable (SRS); MPEG-4 AAC-SSR

• Long Term Prediction (LTP) - added in the MPEG-4 standard; an improvement of the MAIN profile using a forward predictor with lower computational complexity
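The "profile = selection of tools" idea can be pictured as a small lookup table. The sketch below is an assumption-laden illustration, not part of the slides: the numeric values are the MPEG-4 Audio Object Type numbers for these four coders, the gain-control entry for SSR comes from the MPEG-2 AAC tool set and is not stated in the text above, and the names `AacObjectType`, `EXTRA_TOOL` and `describe` are hypothetical.

```python
from enum import IntEnum

class AacObjectType(IntEnum):
    """MPEG-4 Audio Object Type numbers for the four profiles listed above."""
    MAIN = 1
    LC = 2
    SSR = 3
    LTP = 4

# Tool each profile adds on top of the LC tool set (None = nothing extra).
EXTRA_TOOL = {
    AacObjectType.MAIN: "backward prediction",
    AacObjectType.LC: None,
    AacObjectType.SSR: "gain control",  # assumption: not stated in the text above
    AacObjectType.LTP: "long-term (forward) prediction",
}

def describe(profile):
    """Return a one-line, human-readable summary of a profile."""
    extra = EXTRA_TOOL[profile]
    suffix = f" + {extra}" if extra else ""
    return f"AAC {profile.name} (object type {int(profile)}): LC tools{suffix}"
```

A decoder that only implements the LC tools can therefore play LC streams but not streams that rely on the extra tool of another profile.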

MPEG-2 AAC BLOCK DIAGRAMS

MPEG AAC FAMILY

MPEG-4 AAC LC

Perceptual Noise Substitution [PNS]

• Instead of trying to reproduce a waveform that resembles the input signal, this model-based coding tries to generate a perceptually similar sound at the output

• The encoding of PNS includes two steps

(1) Noise detection: for the input signal in each frame, the encoder analyses the spectral data in each scale-factor band and determines whether it is a noise-like component

(2) Noise compression: all spectral samples in the noise-like scale-factor bands are excluded from the subsequent quantization and entropy coding. Instead, only a PNS flag and the energy of these samples are included in the bitstream
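The two PNS steps can be sketched as follows. This is a simplified illustration, not the detector used by any real encoder: spectral flatness is assumed here as the noise-likeness measure, the 0.2 threshold is arbitrary, and all function names are hypothetical.

```python
import math
import random

def spectral_flatness(band):
    """Geometric mean / arithmetic mean of the band's power (1.0 = perfectly flat)."""
    power = [c * c + 1e-12 for c in band]  # small offset avoids log(0)
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))

def pns_encode_band(band, flatness_threshold=0.2):
    """Step 1 and 2: return ('pns', energy) for a noise-like band, else ('coded', band)."""
    if spectral_flatness(band) > flatness_threshold:
        return ("pns", sum(c * c for c in band))
    return ("coded", list(band))

def pns_decode_band(kind, payload, width, rng):
    """Decoder side: regenerate random noise carrying the transmitted energy."""
    if kind != "pns":
        return list(payload)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(width)]
    scale = math.sqrt(payload / sum(n * n for n in noise))
    return [n * scale for n in noise]
```

The decoded band has a different waveform from the original but the same energy, which is the sense in which the output is "perceptually similar" rather than waveform-accurate.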

MPEG-4 HE-AAC

Spectral Band Replication [SBR]

• Developed by the Swedish-German company "Coding Technologies"

• SBR is a bandwidth extension tool

• It exploits the high correlation between the low- and high-frequency content of an audio signal

• In an SBR-based coding system, waveform audio coding is used only for the lower frequencies of the audio signal. This low-frequency content is then used to recreate the high-frequency content at the decoder

• The recreation is done by transposing the low band up into the high band

• The reconstruction of the high band is guided by transmitted side information, such as the spectral envelope of the original input signal and additional information to compensate for potentially missing high-frequency components

• This guiding information is referred to as SBR data

• The recreated high-frequency content undergoes frequency- and time-domain adjustment before it is combined with the low-frequency part of the audio signal

• HE-AAC is also known as aacPlus v1
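The SBR idea of "code the low band, transmit only an envelope for the high band" can be sketched on a toy spectrum. Everything here is a simplifying assumption: real SBR works on a QMF filter bank with time-varying envelopes, while this sketch uses a plain list of spectral coefficients, a single frame, and hypothetical function names.

```python
import math

def sbr_encode(spectrum, crossover, bands):
    """Keep the low band; describe the high band by `bands` energy values.

    Assumes the high-band length is divisible by `bands`.
    """
    low = list(spectrum[:crossover])
    high = spectrum[crossover:]
    width = len(high) // bands
    envelope = [sum(c * c for c in high[i * width:(i + 1) * width])
                for i in range(bands)]
    return low, envelope

def sbr_decode(low, envelope, high_len):
    """Rebuild a full spectrum from the low band plus the envelope (SBR data)."""
    width = high_len // len(envelope)
    # Transposition: copy (and repeat, if needed) the low band into the high band.
    patch = [low[i % len(low)] for i in range(high_len)]
    out = []
    for b, target in enumerate(envelope):
        chunk = patch[b * width:(b + 1) * width]
        energy = sum(c * c for c in chunk)
        # Envelope adjustment: scale the patch so its energy matches the original.
        gain = math.sqrt(target / energy) if energy > 0 else 0.0
        out.extend(c * gain for c in chunk)
    return list(low) + out
```

Only `low` needs waveform coding; `envelope` is a handful of numbers, which is why the high band costs very few bits.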

MPEG-4 HE-AAC v2

Parametric Stereo [PS]

• Also a contribution from "Coding Technologies"

• In the encoder, the Parametric Stereo data is extracted first, and then only a monaural downmix of the original stereo signal is coded

• Like the SBR data, these parameters are embedded as PS side information in the ancillary part of the bitstream

• In the decoder, the monaural signal is decoded first. The stereo signal is then reconstructed from the stereo parameters embedded by the encoder

• Three types of parameters can be employed in a Parametric Stereo system to describe the stereo image

  • Inter-channel Intensity Difference (IID), describing the intensity difference between the channels

  • Inter-channel Cross-Correlation (ICC), describing the cross-correlation or coherence between the channels. The coherence is measured as the maximum of the cross-correlation as a function of time or phase

  • Inter-channel Phase Difference (IPD), describing the phase difference between the channels

• HE-AAC v2 is also known as aacPlus v2
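The three stereo parameters can be computed from one band of complex subband samples as sketched below. This is an illustrative sketch under assumptions: real PS encoders estimate these values per frequency band and per time segment on filter-bank output and then quantize them, and the function names are hypothetical.

```python
import cmath
import math

def ps_parameters(left, right):
    """Return (IID in dB, ICC in 0..1, IPD in radians) for one band.

    `left` and `right` are equal-length sequences of complex subband samples;
    both channels are assumed to carry non-zero energy.
    """
    e_l = sum(abs(x) ** 2 for x in left)
    e_r = sum(abs(x) ** 2 for x in right)
    cross = sum(l * r.conjugate() for l, r in zip(left, right))
    iid_db = 10.0 * math.log10(e_l / e_r)    # intensity difference
    icc = abs(cross) / math.sqrt(e_l * e_r)  # coherence, maximized over phase
    ipd = cmath.phase(cross)                 # phase difference
    return iid_db, icc, ipd

def mono_downmix(left, right):
    """The single channel that is actually waveform-coded."""
    return [(l + r) / 2 for l, r in zip(left, right)]
```

For example, if the right channel is the left channel at half amplitude and delayed in phase by 45 degrees, the sketch reports an IID of about 6 dB, an ICC of 1 and an IPD of pi/4.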

Advantages Over MP3

AAC

1. Multichannel audio - up to 48 audio channels

2. Sampling frequencies from 8 kHz to 96 kHz

3. Simpler filter bank (pure MDCT)

4. Better stationary and transient response due to block sizes of 1024 and 128 samples

5. Excellent handling of high-frequency signals

6. CD-quality audio at 64 kbit/s

7. Much better audio quality at low bit rates (down to 32 kbit/s)

MP3

1. Stereo signal - a maximum of only 2 channels

2. Sampling frequencies from 16 kHz to 48 kHz

3. Hybrid filter bank (needs more computational power)

4. Poorer stationary and transient response due to block sizes of 576 and 192 samples

5. Signal handling only up to 15.5/15.8 kHz

6. CD-quality audio at 128 kbit/s

7. Poorer audio quality at low bit rates, with possible coding artifacts
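The block-size and bit-rate figures in this comparison can be turned into concrete numbers with a little arithmetic. The sketch below assumes standard CD audio (44.1 kHz, 16 bits, 2 channels) and treats each block size as the number of spectral lines spread over 0 to half the sampling rate; the helper names are hypothetical.

```python
CD_SAMPLE_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2

def cd_bit_rate_kbps():
    """Uncompressed CD audio bit rate in kbit/s."""
    return CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * CD_CHANNELS / 1000.0

def compression_ratio(coded_kbps):
    """How many times smaller the coded stream is than CD audio."""
    return cd_bit_rate_kbps() / coded_kbps

def line_spacing_hz(block_size, sample_rate_hz=CD_SAMPLE_RATE_HZ):
    """Frequency spacing of spectral lines for a block of `block_size` lines."""
    return (sample_rate_hz / 2.0) / block_size

def block_duration_ms(block_size, sample_rate_hz=CD_SAMPLE_RATE_HZ):
    """Time covered by `block_size` new samples, in milliseconds."""
    return 1000.0 * block_size / sample_rate_hz

print(cd_bit_rate_kbps(), compression_ratio(64), compression_ratio(128))
print(line_spacing_hz(1024), line_spacing_hz(576))
print(block_duration_ms(128), block_duration_ms(192))
```

Under these assumptions, 64 kbit/s corresponds to roughly 22:1 compression against about 11:1 at 128 kbit/s. The long AAC block resolves frequency more finely than the long MP3 block (about 21.5 Hz against 38.3 Hz per line), and the short AAC block is shorter in time than the short MP3 block (about 2.9 ms against 4.4 ms), which is the basis of item 4 in both lists.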

Disadvantages

• Transparency is lost at very low bit rates when SBR is used

• Small loss of stereo image when PS is used

APPLICATIONS

• HE-AAC v2 was chosen as the audio coding for DAB+ (the upgraded Digital Audio Broadcasting standard)

• HE-AAC is an audio coding used in DRM (Digital Radio Mondiale)

• The default format on Apple's iPod

• Used on mobile phones to store songs

• The audio coding used in the 3GP/3GPP formats

• The audio coding used in DTH services [MPEG-4]

• Internet streaming

• Audio format for Bluetooth stereo/mono headsets [A2DP - Advanced Audio Distribution Profile] (optional)

Conclusion

AAC - the perceptual audio coding the world is going to adopt completely

THANK YOU

Page 11: Advanced Audio Coding [Aac]

2. Critical Band

• The human ear can be viewed as a discrete set of band-pass filters that covers the entire 20 kHz frequency range.

• The inner ear, called the “cochlea”, contains frequency-sensitive positions. Whenever a tone enters the cochlea, it travels until it reaches the position where it resonates.

• The “critical bandwidth” is a function of frequency that quantifies the cochlear filter passbands (unit: Bark).

• As the center frequency increases, the Bark width also increases.

• Spectral analysis of audio content is performed using critical bands.

The Bark width at center frequency f (in Hz) is given by: $BW_c(f) = 25 + 75\,[1 + 1.4\,(f/1000)^2]^{0.69}$ Hz
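
As a quick numeric check of the critical-bandwidth relation, the sketch below evaluates the commonly quoted form 25 + 75[1 + 1.4(f/1000)^2]^0.69 with f in Hz. The function name and the sample frequencies are illustrative choices, not part of the slides.

```python
def critical_bandwidth_hz(f_hz: float) -> float:
    """Approximate critical bandwidth (Hz) at center frequency f_hz (Hz)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69


if __name__ == "__main__":
    # The bandwidth stays near 100 Hz at low frequencies and widens as f grows.
    for f in (100, 500, 1000, 4000, 10000):
        print(f"{f:>6} Hz -> {critical_bandwidth_hz(f):8.1f} Hz")
```

The loop shows the trend stated above: the band gets wider as the center frequency goes up.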

Idealized critical band filter bank

3. Masking

• Masking refers to a process where one sound is rendered inaudible because of the presence of another sound.

Advanced Audio Coding

Modular encoding

AAC takes a modular approach to encoding. Depending on the complexity of the bitstream to be encoded, the desired performance, and the acceptable output, implementers may create profiles that define which of a specific set of tools they want to use for a particular application. The standard offers four default profiles:

• Low Complexity (LC) - the simplest, and the most widely used and supported

• Main Profile (MAIN) - like the LC profile, with the addition of backward prediction

• Sample-Rate Scalable (SRS) - aka Scalable Sample Rate (MPEG-4 AAC-SSR)

• Long Term Prediction (LTP) - added in the MPEG-4 standard; an improvement of the MAIN profile using a forward predictor with lower computational complexity
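
The profile list can be written down as a small lookup table. This is only an illustrative sketch: the class and field names are hypothetical, the numeric identifiers are the MPEG-4 Audio Object Type values for these four coders, and the "gain control" entry for SSR is an addition not stated on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AacProfile:
    name: str
    audio_object_type: int  # MPEG-4 Audio Object Type identifier
    extra_tools: tuple      # tools named beyond the LC baseline


PROFILES = {
    "LC": AacProfile("Low Complexity", 2, ()),
    "MAIN": AacProfile("Main", 1, ("backward prediction",)),
    "SSR": AacProfile("Scalable Sample Rate", 3, ("gain control",)),  # not on the slide
    "LTP": AacProfile("Long Term Prediction", 4, ("long-term (forward) prediction",)),
}


def describe(key: str) -> str:
    """One-line summary of a profile entry."""
    p = PROFILES[key]
    tools = ", ".join(p.extra_tools) if p.extra_tools else "no extra tools"
    return f"{p.name} (AOT {p.audio_object_type}): {tools}"


if __name__ == "__main__":
    for key in PROFILES:
        print(describe(key))
```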

MPEG-2 AAC BLOCK DIAGRAMS

MPEG AAC FAMILY

MPEG-4 AAC LC

Perceptual Noise Substitution [PNS]

• Instead of trying to reproduce a waveform that is similar to the input signal, model-based coding tries to generate a perceptually similar sound as output.

• The encoding of PNS includes two steps:

(1) Noise detection: For the input signal in each frame, the encoder performs an analysis and determines whether the spectral data in a scale-factor band belongs to a noise component.

(2) Noise compression: All spectral samples in the noise-like scale-factor bands are excluded from the subsequent quantization and entropy coding module. Instead, only a PNS flag and the energy of these samples are included in the bitstream.
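
The two PNS steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoder defined by the standard: the spectral-flatness test, its threshold, and all function names are hypothetical stand-ins for the "some analysis" step, and real encoders use more elaborate tonality measures.

```python
import math
import random


def spectral_flatness(band):
    """Geometric mean over arithmetic mean of the power values (1.0 = perfectly flat)."""
    power = [c * c + 1e-12 for c in band]
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def pns_encode_band(band, flatness_threshold=0.5):
    """Step 1: detect a noise-like band. Step 2: keep only a flag and the band energy."""
    if spectral_flatness(band) >= flatness_threshold:  # threshold is an arbitrary example value
        return ("pns", sum(c * c for c in band))
    return ("coded", list(band))  # would go on to quantization and entropy coding


def pns_decode_band(kind, payload, width, rng):
    """Regenerate a noise band with the transmitted energy, or pass coded data through."""
    if kind == "coded":
        return list(payload)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(width)]
    noise_energy = sum(n * n for n in noise) or 1.0
    gain = math.sqrt(payload / noise_energy)
    return [gain * n for n in noise]


if __name__ == "__main__":
    noisy = [0.5, -0.4, 0.6, -0.5, 0.45, -0.55, 0.5, -0.6]
    kind, payload = pns_encode_band(noisy)
    rebuilt = pns_decode_band(kind, payload, len(noisy), random.Random(0))
    print(kind, round(payload, 4), round(sum(x * x for x in rebuilt), 4))
```

The decoded band has a different waveform from the input but the same energy, which is the perceptual equivalence PNS relies on.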

MPEG-4 HE-AAC

Spectral Band Replication [SBR]

• Developed by the Swedish-German company “Coding Technologies”

• SBR is a bandwidth extension tool.

• The main effect used is the high correlation between the low- and high-frequency content in an audio signal.

• In an SBR-based coding system, waveform audio coding is used only to code the lower frequencies of an audio signal. This low-frequency content is used to recreate the high-frequency content at the decoding side.

• This is done using a state-of-the-art transposition method.

• The reconstruction of the high band is guided by transmitted information such as the spectral envelope of the original input signal, or additional information to compensate for potentially missing high-frequency components.

• This guiding information is referred to as SBR data.

• The recreated high-frequency content undergoes some frequency- and time-domain adjustment before it is combined with the low-frequency part of the audio signal.

• HE-AAC is also known as aacPlus v1.
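
The SBR idea (code only the low band, send an envelope, rebuild the high band by transposition) can be sketched on a list of spectral magnitudes. This is a heavily simplified illustration: actual SBR operates on a QMF subband representation with extra noise and tonal adjustments, and every name here is hypothetical.

```python
import math


def band_energies(bins, num_bands):
    """Energy of each of num_bands equal-width groups of bins."""
    width = len(bins) // num_bands
    return [sum(c * c for c in bins[i * width:(i + 1) * width]) for i in range(num_bands)]


def sbr_encode(spectrum, crossover, num_env_bands):
    """Keep the low band for waveform coding; describe the high band by its envelope only."""
    low = list(spectrum[:crossover])
    envelope = band_energies(spectrum[crossover:], num_env_bands)  # the "SBR data"
    return low, envelope


def sbr_decode(low, envelope, total_bins):
    """Transpose the low band upward, then scale each envelope band to the sent energy."""
    high_len = total_bins - len(low)
    if high_len % len(envelope):
        raise ValueError("high band must split evenly into envelope bands in this sketch")
    patch = [low[i % len(low)] for i in range(high_len)]  # simple copy-up transposition
    width = high_len // len(envelope)
    high = []
    for b, target in enumerate(envelope):
        segment = patch[b * width:(b + 1) * width]
        energy = sum(c * c for c in segment)
        gain = math.sqrt(target / energy) if energy > 0 else 0.0
        high.extend(gain * c for c in segment)
    return list(low) + high


if __name__ == "__main__":
    spectrum = [1.0, -0.8, 0.6, -0.5, 0.4, -0.3, 0.25, -0.2,
                0.1, -0.05, 0.08, -0.02, 0.03, -0.01, 0.02, -0.01]
    low, env = sbr_encode(spectrum, crossover=8, num_env_bands=2)
    rebuilt = sbr_decode(low, env, len(spectrum))
    print(env, band_energies(rebuilt[8:], 2))
```

The rebuilt high band does not match the original waveform; only its band energies match the transmitted envelope.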


MPEG-4 HE-AAC v2

Parametric Stereo

• It is also a contribution from “Coding Technologies”.

• In the encoder, only a monaural downmix of the original stereo signal is coded, after extraction of the Parametric Stereo data.

• Just like SBR data, these parameters are then embedded as PS side information in the ancillary part of the bitstream.

• In the decoder, the monaural signal is decoded first. After that, the stereo signal is reconstructed based on the stereo parameters embedded by the encoder.

• Three types of parameters can be employed in a Parametric Stereo system to describe the stereo image:

• Inter-channel Intensity Difference (IID), describing the intensity difference between the channels

• Inter-channel Cross-Correlation (ICC), describing the cross-correlation or coherence between the channels. The coherence is measured as the maximum of the cross-correlation as a function of time or phase.

• Inter-channel Phase Difference (IPD), describing the phase difference between the channels

• HE-AAC v2 is also known as aacPlus v2.
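
The three stereo parameters can be computed for a single frequency band as shown below. This is a sketch under assumptions: the inputs are taken to be complex subband samples of one band, the function names are hypothetical, and parameter quantization and bitstream packing are left out.

```python
import cmath
import math


def ps_parameters(left, right):
    """Return (IID in dB, ICC, IPD in radians) for one band of complex subband samples."""
    e_left = sum(abs(x) ** 2 for x in left)
    e_right = sum(abs(x) ** 2 for x in right)
    cross = sum(l * r.conjugate() for l, r in zip(left, right))
    iid_db = 10.0 * math.log10(e_left / e_right)    # intensity difference
    icc = abs(cross) / math.sqrt(e_left * e_right)  # cross-correlation maximized over phase
    ipd = cmath.phase(cross)                        # phase difference
    return iid_db, icc, ipd


def mono_downmix(left, right):
    """The monaural signal that is actually waveform-coded."""
    return [(l + r) / 2 for l, r in zip(left, right)]


if __name__ == "__main__":
    left = [complex(1, 0), complex(0, 1), complex(-1, 0), complex(0, -1)]
    # Right channel: half the amplitude, delayed in phase by 45 degrees.
    right = [0.5 * x * cmath.exp(-1j * math.pi / 4) for x in left]
    print(ps_parameters(left, right))
    print(mono_downmix(left, right))
```

In the example, the right channel is a scaled and phase-shifted copy of the left, so the channels are fully coherent and differ only in level and phase.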


Advantages Over MP3

AAC

1. Multichannel audio: up to 48 audio channels
2. Sample frequencies from 8 kHz to 96 kHz
3. Simpler filter bank (pure MDCT used)
4. Better stationary and transient response due to block sizes of 1024 and 128 samples
5. Excellent handling of high-frequency signals
6. CD-quality audio at 64 kbit/s
7. Much better audio quality at lower bit rates (down to 32 kbit/s)

MP3

1. Stereo signal: maximum of only 2 channels
2. Sampling frequencies from 16 kHz to 48 kHz
3. Hybrid filter bank (more computational power)
4. Poorer stationary and transient response due to block sizes of 576 and 192 samples
5. Signal handling up to 15.5/15.8 kHz
6. CD-quality audio at 128 kbit/s
7. Audio quality is poorer at low bit rates and may present coding artifacts
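
The block sizes and bit rates in the comparison can be turned into concrete numbers. The sketch below assumes CD audio parameters (44.1 kHz, 16 bits, 2 channels), which are not stated on this slide; the function names are illustrative.

```python
def block_duration_ms(block_size, sample_rate_hz=44100):
    """Time span covered by one transform block, in milliseconds."""
    return 1000.0 * block_size / sample_rate_hz


def cd_bitrate_kbps(sample_rate_hz=44100, bits_per_sample=16, channels=2):
    """Uncompressed PCM bit rate in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def compression_ratio(target_kbps):
    """How many times smaller the coded stream is than CD PCM."""
    return cd_bitrate_kbps() / target_kbps


if __name__ == "__main__":
    for codec, long_block, short_block in (("AAC", 1024, 128), ("MP3", 576, 192)):
        print(f"{codec}: long {block_duration_ms(long_block):.2f} ms, "
              f"short {block_duration_ms(short_block):.2f} ms")
    for kbps in (64, 128):
        print(f"{kbps} kbit/s is about {compression_ratio(kbps):.1f}:1 versus CD PCM")
```

A longer long block gives finer frequency resolution for stationary signals, and a shorter short block gives finer time resolution for transients, which is the reasoning behind item 4 in both lists.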

Disadvantages

• Transparency is lost at very low bit rates when SBR is used.

• Small loss of stereo image when PS is used.

APPLICATIONS

• HE-AAC was chosen as the coding used in DAB (Digital Audio Broadcasting)
• HE-AAC is the coding used in DRM (Digital Radio Mondiale)
• It's the default format in Apple's iPod
• Used in mobile phones to store songs
• It's the audio coding used in the 3gp and 3gpp formats
• It's the audio coding used in DTH services [MPEG-4]
• For Internet streaming
• Audio format in Bluetooth stereo/mono headsets [A2DP: Advanced Audio Distribution Profile] (optional)

Conclusion

AAC: the perceptual audio coding the world is going to adopt completely


THANK YOU

Page 16: Advanced Audio Coding [Aac]

MPEG AAC FAMILY

MPEG-4 AAC LC

Perceptual Noise Substitution [PNS]

• Instead of trying to reproduce a waveform that resembles the input signal, this model-based coding tool tries to generate a perceptually similar sound at the output.

• PNS encoding involves two steps:

(1) Noise detection: For the input signal in each frame, the encoder analyses the spectrum and determines whether the spectral data in a scale-factor band is noise-like.

(2) Noise compression: All spectral samples in the noise-like scale-factor bands are excluded from the subsequent quantization and entropy coding modules. Instead, only a PNS flag and the energy of these samples are included in the bitstream.
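The two PNS steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how noise is detected, so a spectral-flatness test with an arbitrary threshold is used here, and the function names (`spectral_flatness`, `pns_encode_band`, `pns_decode_band`) are hypothetical, not part of any AAC library.

```python
import math
import random


def spectral_flatness(band):
    """Geometric mean / arithmetic mean of the band's power (near 1.0 = flat, noise-like)."""
    power = [c * c + 1e-12 for c in band]  # small offset avoids log(0)
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def pns_encode_band(band, flatness_threshold=0.1):
    """Step 1 and 2: flag noise-like bands and keep only their energy.

    The threshold is an assumed value for this sketch, not taken from the standard.
    """
    if spectral_flatness(band) >= flatness_threshold:
        return ("pns", sum(c * c for c in band))   # PNS flag + energy only
    return ("coded", list(band))                   # would go on to quantization


def pns_decode_band(kind, payload, width, rng):
    """Rebuild a band; a flagged band is replaced by random noise of the same energy."""
    if kind == "coded":
        return list(payload)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(width)]
    gain = math.sqrt(payload / sum(n * n for n in noise))
    return [gain * n for n in noise]
```

The decoded band does not match the original waveform sample by sample; only its energy is preserved, which is the point of the technique.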

MPEG-4 HE-AAC

Spectral Band Replication [SBR]

• Developed by the Swedish-German company “Coding Technologies”

• SBR is a bandwidth extension tool

• The main effect it exploits is the high correlation between the low- and high-frequency content of an audio signal

• In an SBR-based coding system, waveform audio coding is used only for the lower frequencies of an audio signal. This low-frequency content is then used to recreate the high-frequency content at the decoder

• This is done with a transposition method

• The reconstruction of the high band is guided by transmitted information such as the spectral envelope of the original input signal, plus additional information to compensate for potentially missing high-frequency components

• This guiding information is referred to as SBR data

• The recreated high-frequency content undergoes some frequency- and time-domain adjustment before it is combined with the low-frequency part of the audio signal

• HE-AAC is also known as aacPlus v1
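A rough sketch of the SBR idea, assuming a plain list of spectral coefficients and a simple copy-up transposition. Real SBR operates on a filter-bank representation and sends more side information than this; the function names and the per-band energy envelope used here are illustrative assumptions only.

```python
import math


def sbr_encode(spectrum, crossover, bands=4):
    """Keep the low band; describe the high band only by per-band energies ("SBR data").

    Assumes the high band length is divisible by `bands`.
    """
    low = list(spectrum[:crossover])
    high = spectrum[crossover:]
    width = len(high) // bands
    envelope = [
        sum(c * c for c in high[b * width:(b + 1) * width]) for b in range(bands)
    ]
    return low, envelope


def sbr_decode(low, envelope, high_len):
    """Transpose (copy) low-band content upward, then rescale it to the sent envelope."""
    patch = [low[i % len(low)] for i in range(high_len)]  # copy-up transposition
    width = high_len // len(envelope)
    high = []
    for b, target in enumerate(envelope):
        seg = patch[b * width:(b + 1) * width]
        seg_energy = sum(c * c for c in seg)
        gain = math.sqrt(target / seg_energy) if seg_energy > 0 else 0.0
        high.extend(gain * c for c in seg)  # envelope adjustment
    return list(low) + high
```

Only the low band and a handful of envelope values are transmitted, which is where the bit-rate saving comes from in this simplified picture.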


MPEG-4 HE-AAC v2

Parametric Stereo

• It is also a contribution from “Coding Technologies”

• In the encoder, only a monaural downmix of the original stereo signal is coded, after the Parametric Stereo data has been extracted

• Just like SBR data, these parameters are then embedded as PS side information in the ancillary part of the bitstream

• In the decoder, the monaural signal is decoded first. The stereo signal is then reconstructed from the stereo parameters embedded by the encoder

• Three types of parameters can be employed in a Parametric Stereo system to describe the stereo image:

  • Inter-channel Intensity Difference (IID), describing the intensity difference between the channels

  • Inter-channel Cross-Correlation (ICC), describing the cross-correlation or coherence between the channels. The coherence is measured as the maximum of the cross-correlation as a function of time or phase

  • Inter-channel Phase Difference (IPD), describing the phase difference between the channels

• HE-AAC v2 is also known as aacPlus v2
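The encoder and decoder roles described above can be illustrated with a minimal time-domain sketch. It assumes a single set of parameters for the whole block (a real system works per frequency band and per frame), computes ICC at zero lag only, and rebuilds the stereo image from IID alone, ignoring ICC and IPD. The names `ps_encode` and `ps_decode` are hypothetical.

```python
import math


def ps_encode(left, right):
    """Return a mono downmix plus two stereo parameters (IID in dB, ICC at zero lag).

    Assumes neither channel is silent, so both energies are non-zero.
    """
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    e_left = sum(l * l for l in left)
    e_right = sum(r * r for r in right)
    iid_db = 10.0 * math.log10(e_left / e_right)
    icc = sum(l * r for l, r in zip(left, right)) / math.sqrt(e_left * e_right)
    return mono, iid_db, icc


def ps_decode(mono, iid_db):
    """Rebuild left/right from the mono signal so that their level difference is iid_db."""
    ratio = 10.0 ** (iid_db / 20.0)             # amplitude ratio left/right
    g_right = math.sqrt(2.0 / (1.0 + ratio * ratio))
    g_left = ratio * g_right                    # g_left^2 + g_right^2 == 2
    return [g_left * m for m in mono], [g_right * m for m in mono]
```

Because only level is restored here, the two output channels are fully correlated; restoring the original width is what the ICC and IPD parameters are for.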


Advantages Over MP3

AAC

1. Multichannel audio – up to 48 audio channels

2. Sampling frequencies from 8 kHz to 96 kHz

3. Simpler filter bank (pure MDCT used)

4. Better stationary and transient response due to block sizes of 1024 and 128 samples

5. Excellent handling of high-frequency signals

6. CD-quality audio at 64 kbit/s

7. Much better audio quality at lower bit rates (down to 32 kbit/s)

MP3

1. Stereo signal – maximum of only 2 channels

2. Sampling frequencies from 16 kHz to 48 kHz

3. Hybrid filter bank (more computational power)

4. Poorer stationary and transient response due to block sizes of 576 and 192 samples

5. Signal handling up to 15.5/15.8 kHz

6. CD-quality audio at 128 kbit/s

7. Audio quality is poorer at low bit rates and may show coding artifacts
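The block sizes in item 4 of each list can be turned into rough numbers. The sketch below assumes a 44.1 kHz sampling rate (not stated on the slide) and treats the block size as the number of spectral coefficients spanning 0 to half the sampling rate; `block_resolution` is a hypothetical helper for illustration.

```python
def block_resolution(block_size, sample_rate_hz):
    """Return (time span in ms, frequency spacing in Hz) for one transform block."""
    duration_ms = 1000.0 * block_size / sample_rate_hz
    bin_width_hz = (sample_rate_hz / 2.0) / block_size
    return duration_ms, bin_width_hz


if __name__ == "__main__":
    for name, size in [("AAC long", 1024), ("AAC short", 128),
                       ("MP3 long", 576), ("MP3 short", 192)]:
        ms, hz = block_resolution(size, 44100)
        print(f"{name:9s} {size:5d} samples: {ms:6.2f} ms, {hz:7.2f} Hz per coefficient")
```

Under these assumptions the long AAC block gives finer frequency spacing than the long MP3 block (better for stationary signals), while the short AAC block covers a shorter time span than the short MP3 block (better for transients).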








• Three types of parameters can be employed in a Parametric Stereo system to describe the stereo image:

  • Inter-channel Intensity Difference (IID), describing the intensity difference between the channels

  • Inter-channel Cross-Correlation (ICC), describing the cross-correlation or coherence between the channels. The coherence is measured as the maximum of the cross-correlation as a function of time or phase.

  • Inter-channel Phase Difference (IPD), describing the phase difference between the channels

• HE-AAC v2 is also known as aacPlus v2
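The three parameters above can be illustrated with a minimal sketch. It is not the HE-AAC v2 encoder: a real Parametric Stereo system estimates these values per frequency band on subband signals, whereas this example works on one full-band block of samples and measures the phase at a single DFT bin. The function name `stereo_parameters` and the arguments `max_lag` and `bin_index` are assumptions made for illustration.

```python
import cmath
import math


def stereo_parameters(left, right, max_lag=8, bin_index=1):
    """Estimate (IID in dB, ICC, IPD in radians) for one block of samples.

    Simplified, full-band illustration; both channels must be non-silent.
    """
    n = len(left)
    energy_l = sum(x * x for x in left)
    energy_r = sum(x * x for x in right)

    # IID: ratio of the channel energies, expressed in dB.
    iid_db = 10.0 * math.log10(energy_l / energy_r)

    # ICC: maximum of the normalized cross-correlation over a range of lags.
    def cross_correlation(lag):
        start, stop = max(0, -lag), min(n, n - lag)
        return sum(left[i] * right[i + lag] for i in range(start, stop))

    norm = math.sqrt(energy_l * energy_r)
    icc = max(cross_correlation(lag) for lag in range(-max_lag, max_lag + 1)) / norm

    # IPD: phase of left relative to right, measured at one DFT bin.
    def dft_bin(signal):
        return sum(
            s * cmath.exp(-2j * math.pi * bin_index * i / n)
            for i, s in enumerate(signal)
        )

    ipd = cmath.phase(dft_bin(left) * dft_bin(right).conjugate())
    return iid_db, icc, ipd


if __name__ == "__main__":
    n, k = 64, 4
    # Right channel: half the amplitude, lagging the left by a quarter period.
    left = [math.sin(2 * math.pi * k * i / n) for i in range(n)]
    right = [0.5 * math.sin(2 * math.pi * k * i / n - math.pi / 2) for i in range(n)]
    print(stereo_parameters(left, right, bin_index=k))
```

For this test signal the IID comes out at about 6.02 dB (a 4:1 energy ratio) and the IPD at about π/2. The ICC is slightly below 1 because the lagged sum covers fewer samples than the full block.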

Continue…

Page 24: Advanced Audio Coding [Aac]

Continue…

Page 25: Advanced Audio Coding [Aac]

Advantages Over MP3

AAC

1. Multichannel audio: up to 48 audio channels

2. Sampling frequencies from 8 kHz to 96 kHz

3. Simpler filter bank (pure MDCT used)

4. Better stationary and transient response due to block sizes of 1024 and 128 samples

5. Excellent handling of high-frequency signals

6. CD-quality audio at 64 kbit/s

7. Much better audio quality at low bit rates (down to 32 kbit/s)

MP3

1. Stereo signal: maximum of only 2 channels

2. Sampling frequencies from 16 kHz to 48 kHz

3. Hybrid filter bank (needs more computational power)

4. Poorer stationary and transient response due to block sizes of 576 and 192 samples

5. Signal handling up to 15.5/15.8 kHz

6. CD-quality audio at 128 kbit/s

7. Audio quality is poorer at low bit rates and may show coding artifacts
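The bit rates and block sizes in the comparison can be put into numbers with a short calculation. This is a rough sketch under stated assumptions: the source is CD audio (44.1 kHz, 16 bits, 2 channels), each block size is taken as the number of spectral coefficients covering 0 Hz up to half the sampling rate, and each block advances by that many samples. The function names are made up for illustration.

```python
CD_SAMPLE_RATE_HZ = 44_100   # assumption: CD-audio sampling rate
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2


def cd_bit_rate():
    """Uncompressed CD bit rate in bit/s."""
    return CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * CD_CHANNELS


def compression_ratio(coded_bit_rate):
    """How many times smaller the coded stream is than raw CD audio."""
    return cd_bit_rate() / coded_bit_rate


def bin_spacing_hz(coefficients_per_block, sample_rate=CD_SAMPLE_RATE_HZ):
    """Frequency spacing when the coefficients cover 0 Hz to sample_rate / 2."""
    return (sample_rate / 2) / coefficients_per_block


def block_duration_ms(samples_per_block, sample_rate=CD_SAMPLE_RATE_HZ):
    """Time span advanced by one block, in milliseconds."""
    return 1000.0 * samples_per_block / sample_rate


if __name__ == "__main__":
    print("CD bit rate (kbit/s):", cd_bit_rate() / 1000)
    for name, rate in (("MP3 at 128 kbit/s", 128_000), ("AAC at 64 kbit/s", 64_000)):
        print(name, "compression ratio:", compression_ratio(rate))
    for name, long_block, short_block in (("AAC", 1024, 128), ("MP3", 576, 192)):
        print(name, "long-block spacing (Hz):", bin_spacing_hz(long_block))
        print(name, "short-block duration (ms):", block_duration_ms(short_block))
```

Under these assumptions the long AAC block gives finer frequency spacing than the long MP3 block (better for stationary signals), and the short AAC block spans less time than the short MP3 block (better for transients).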

Page 26: Advanced Audio Coding [Aac]

Disadvantages

• Transparency is lost at very low bit rates when SBR is used

• Small loss of stereo image when PS is used

Page 27: Advanced Audio Coding [Aac]

APPLICATIONS

• HE-AAC was chosen as the audio coding for DAB+ (Digital Audio Broadcasting)

• HE-AAC is the audio coding used in DRM (Digital Radio Mondiale)

• It's the default format on Apple's iPod

• Used in mobile phones to store songs

• It's the audio coding used in the 3gp and 3gpp formats

• It's the audio coding used in DTH services [MPEG-4]

• For Internet streaming

• Audio format in Bluetooth stereo/mono headsets [A2DP: Advanced Audio Distribution Profile] (optional)

Page 28: Advanced Audio Coding [Aac]

Conclusion

AAC: the perceptual audio coding the world is going to adopt completely

Page 29: Advanced Audio Coding [Aac]

References

Sites

• www.wikipedia.org

• www.hydrogenaudio.org

• www.codingtechnologies.com

• www.mp3-tech.org/aac.html

Books

• High-Fidelity Multichannel Audio Coding, Dai Tracy Yang, Chris Kyriakakis, and C.-C. Jay Kuo

• Introduction to Data Compression, Khalid Sayood

Papers

• ISO/IEC Standards [13818-7, 14496-3]

• MP3 and AAC Explained, Karlheinz Brandenburg [Father of MP3]

• CT-aacPlus: a state-of-the-art audio coding scheme, Martin Dietz and Stefan Meltzer

• MPEG-4 HE-AAC v2: audio coding for today's media world, Stefan Meltzer and Gerald Moser

• ………

Page 30: Advanced Audio Coding [Aac]

THANK YOU
