Ch8 - Choosing Speech Codecs for Mobile Communication

  • 8/10/2019 Ch8 - Choosing Speech Codecs for Mobile Communication

    CHAPTER 8

    SPEECH CODING

    Choosing Speech Codecs for Mobile Communication

    An important step in the design of a digital mobile communication system is choosing the right speech codec.

    Available bandwidth is limited, so speech must be compressed to maximize the number of users on the system.

    The choice must also take into account the end-to-end encoding delay, the algorithmic complexity of the coder, the DC power requirements, compatibility with existing standards, and the robustness of the encoded speech to transmission errors.

    The choice of speech coder also depends on the cell size used. When the cell size is small enough that high spectral efficiency is achieved through frequency reuse, a simple high-rate speech codec may be sufficient.

    The GSM Codec

    The original speech coder used in the pan-European digital cellular standard GSM goes by the rather grandiose name of regular pulse excited long-term prediction (RPE-LTP) codec. This codec has a net bit rate of 13 kbps.

    It combines the advantages of the earlier French-proposed baseband RELP codec with those of the multipulse excited long-term prediction (MPE-LTP) codec proposed by Germany.

    The advantage of the RELP codec is that it provides good-quality speech at low complexity.

    The MPE-LTP technique produces excellent speech quality at high complexity and is not much affected by bit errors in the channel.

    By modifying the RELP codec to incorporate the features of MPE-LTP, the net bit rate was reduced from 14.77 kbps to 13 kbps without loss of quality.

    The most important modification was the addition of a long-term prediction loop.

    The GSM codec is complex.

    Fig 8.10 shows a block diagram of the speech encoder.

    The encoder consists of four major processing blocks.

    The speech sequence is first pre-emphasized, divided into segments of 20 ms duration, and then Hamming windowed.

    This is followed by short-term prediction (STP) filtering analysis, in which the logarithmic area ratios (LARs) of the reflection coefficients r_n(k) (eight in number) are computed.

    The eight LARs have different dynamic ranges and probability distribution functions, so they are not all encoded with the same number of bits for transmission.
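
The front end of the encoder described above can be sketched as follows. This is only an illustration, not the GSM reference implementation: the 8 kHz sampling rate, the pre-emphasis coefficient `beta`, and all function names are assumptions, and the LAR is taken in its usual form LAR(k) = log10((1 + r(k)) / (1 - r(k))).

```python
import math

def preemphasize(samples, beta=0.86):
    """First-order pre-emphasis y[n] = x[n] - beta * x[n-1] (beta is an assumed value)."""
    out, prev = [], 0.0
    for x in samples:
        out.append(x - beta * prev)
        prev = x
    return out

def split_into_frames(samples, fs_hz=8000, frame_ms=20):
    """Cut the signal into non-overlapping 20 ms segments (160 samples at 8 kHz)."""
    n = fs_hz * frame_ms // 1000
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]

def hamming_window(frame):
    """Apply a Hamming window w[i] = 0.54 - 0.46 * cos(2*pi*i / (N - 1))."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]

def log_area_ratio(r):
    """Map one reflection coefficient (|r| < 1) to its logarithmic area ratio."""
    if not -1.0 < r < 1.0:
        raise ValueError("reflection coefficient must lie strictly inside (-1, 1)")
    return math.log10((1.0 + r) / (1.0 - r))
```

The LAR mapping stretches the region near |r| = 1, where small changes in a reflection coefficient matter most, which is one reason the eight LARs end up with different ranges and bit allocations.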

    The LAR parameters are also decoded by the LPC inverse filter so as to minimize the error e_n.

    Long-term prediction (LTP) analysis is then carried out. It involves finding the pitch period p_n and gain factor g_n such that the LTP residual r_n is minimized.

    Pitch extraction is done by determining the value of delay D that maximizes the cross-correlation between the current STP error sample e_n and a previous error sample e_(n-D).

    The extracted pitch p_n and gain g_n are encoded and transmitted at a rate of 3.6 kbps.

    The LTP residual r_n is weighted and decomposed into three sequences.
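
A minimal sketch of the pitch search, assuming the delay D is chosen to maximize the cross-correlation between the current block of the STP error signal and the same signal delayed by D samples. The block length of 40 samples, the lag range of 40 to 120, and the function names are assumptions for illustration, not values taken from the slides.

```python
def ltp_search(e, start, length=40, min_lag=40, max_lag=120):
    """Return (lag, gain) for the block e[start:start+length].

    Requires start >= max_lag so that every delayed sample exists.
    """
    best_lag, best_corr = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        corr = sum(e[start + i] * e[start + i - lag] for i in range(length))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    energy = sum(e[start + i - best_lag] ** 2 for i in range(length))
    gain = best_corr / energy if energy > 0 else 0.0
    return best_lag, gain

def ltp_residual(e, start, lag, gain, length=40):
    """Residual left after subtracting the gain-scaled, delayed signal."""
    return [e[start + i] - gain * e[start + i - lag] for i in range(length)]

# Usage: an impulse train with a period of 50 samples.
e = [1 if n % 50 == 0 else 0 for n in range(400)]
lag, gain = ltp_search(e, start=150)
residual = ltp_residual(e, 150, lag, gain)
```

For this perfectly periodic input the search picks the 50-sample period and the residual is zero everywhere, which is the sense in which the pitch and gain are chosen to minimize the LTP residual.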

    Fig 8.11 shows a block diagram of the GSM speech decoder.

    The decoder consists of four processing blocks.

    The received excitation parameters are RPE decoded and passed to the LTP synthesis filter, which uses the pitch and gain parameters to synthesize the long-term signal.

    Short-term synthesis is then carried out using the received reflection coefficients to recreate the original speech signal.

    Every 260 bits of the coder output (i.e., each 20 ms block of speech) are ordered, depending on their importance, into groups of 50, 132, and 78 bits.

    The bits in the first group are the most important bits and are called type Ia bits. The next 132 bits are type Ib bits, and the last 78 bits are type II bits.

    Since type Ia bits are the ones that most affect speech quality, they have error-detection CRC bits added.

    Both type Ia and type Ib bits are convolutionally encoded for forward error correction.

    The least significant type II bits have no error correction or detection.
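
The bit accounting can be checked with a few lines of arithmetic. The class sizes and the 20 ms frame come from the text; the 3 CRC bits, 4 tail bits, and rate-1/2 convolutional code are not stated in the slides and are assumed here from the GSM full-rate channel coding scheme.

```python
FRAME_MS = 20
CLASS_BITS = {"Ia": 50, "Ib": 132, "II": 78}

def bit_rate_bps(bits_per_frame, frame_ms=FRAME_MS):
    """Bit rate produced by sending bits_per_frame every frame_ms milliseconds."""
    return bits_per_frame * 1000 // frame_ms

def protected_frame_bits(crc_bits=3, tail_bits=4, inverse_code_rate=2):
    """Bits per frame after channel coding (CRC, tail, and code rate are assumed values)."""
    # Class I bits (Ia + CRC + Ib + tail) go through the convolutional coder.
    class_one = CLASS_BITS["Ia"] + crc_bits + CLASS_BITS["Ib"] + tail_bits
    # Class II bits are appended without any protection.
    return class_one * inverse_code_rate + CLASS_BITS["II"]

speech_bits = sum(CLASS_BITS.values())   # 260 bits per 20 ms frame
net_rate = bit_rate_bps(speech_bits)     # 13000 bps, the codec's net bit rate
gross_rate = bit_rate_bps(protected_frame_bits())
```

Under these assumptions each 260-bit speech frame grows to 456 coded bits, so the 13 kbps net rate becomes 22.8 kbps on the channel.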

    The USDC Codec

    The US digital cellular (USDC) system (IS-136) uses a vector sum excited linear predictive (VSELP) coder.

    It operates at a raw data rate of 7950 bps and a total rate of 13 kbps after channel coding.

    It is a variant of the CELP type of vocoder.

    This coder was designed to accomplish three goals: highest speech quality, modest computational complexity, and robustness to channel errors.

    The codebooks in the VSELP encoder are organized with a predefined structure so that a brute-force search is avoided. This significantly reduces the time required for the optimum codeword search.

    Fig 8.12 shows a block diagram of the VSELP encoder.

    The 8 kbps VSELP codec utilizes three excitation sources:

    One from the long-term (pitch) predictor state, or adaptive codebook.

    The second and third from the two VSELP excitation codebooks.

    Each of these VSELP codebooks contains the equivalent of 128 vectors.

    These three excitation sequences are multiplied by their corresponding gain terms and summed to give the combined excitation sequence.

    After each subframe, the combined excitation sequence is used to update the long-term filter state.
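
The structure described above can be sketched as follows. The sketch assumes that each VSELP codebook is generated from seven basis vectors combined with +1 or -1 signs, which gives 2^7 = 128 codevectors without storing them individually; that reading of "the equivalent of 128 vectors", the function names, and the toy values are assumptions for illustration.

```python
from itertools import product

def vselp_codebook(basis):
    """Every +/-1 signed sum of the basis vectors: 2**len(basis) codevectors."""
    n = len(basis[0])
    return [
        [sum(sign * vec[i] for sign, vec in zip(signs, basis)) for i in range(n)]
        for signs in product((-1, 1), repeat=len(basis))
    ]

def combined_excitation(adaptive, code1, code2, g_adaptive, g1, g2):
    """Gain-weighted sum of the adaptive-codebook vector and the two VSELP codevectors."""
    return [g_adaptive * a + g1 * c1 + g2 * c2
            for a, c1, c2 in zip(adaptive, code1, code2)]

# Usage with toy data: seven unit basis vectors of length 7.
basis = [[1 if i == k else 0 for i in range(7)] for k in range(7)]
codebook = vselp_codebook(basis)   # 128 distinct codevectors
excitation = combined_excitation([1, 2], [1, -1], [0, 1], 0.5, 2, 3)
```

Because every codevector is a signed sum of the same few basis vectors, a search can work with the filtered basis vectors rather than filtering 128 candidates one by one, which is one way a predefined structure avoids a brute-force search.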

    The synthesis filter is a direct-form 10th-order LPC all-pole filter. The LPC coefficients are coded once per 20 ms frame and updated in each 5 ms subframe. The number of samples in a subframe is 40 at an 8 kHz sampling rate.
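
The frame arithmetic and the all-pole synthesis step can be sketched as follows. The sign convention s[n] = x[n] + sum over k of a_k * s[n-k] and the function names are assumptions chosen for illustration; a real decoder would also interpolate the coefficients between subframes.

```python
def samples_per(duration_ms, fs_hz=8000):
    """Number of samples in duration_ms milliseconds at sampling rate fs_hz."""
    return fs_hz * duration_ms // 1000

def all_pole_synthesis(excitation, a, state=None):
    """Direct-form all-pole filter: s[n] = x[n] + sum_k a[k-1] * s[n-k].

    Returns (output, state) so the filter memory can carry over to the next subframe.
    """
    history = list(state) if state is not None else [0.0] * len(a)  # history[0] is s[n-1]
    out = []
    for x in excitation:
        s = x + sum(ak * h for ak, h in zip(a, history))
        out.append(s)
        history = [s] + history[:-1]
    return out, history

frame_len = samples_per(20)                  # 160 samples per 20 ms frame
subframe_len = samples_per(5)                # 40 samples per 5 ms subframe
subframes_per_frame = frame_len // subframe_len

# Usage: a 10th-order filter with a single non-zero coefficient, driven by an impulse.
coeffs = [0.5] + [0.0] * 9
speech, memory = all_pole_synthesis([1.0, 0.0, 0.0, 0.0], coeffs)
```

At 8 kHz a 5 ms subframe holds 40 samples, so each 20 ms frame contains four subframes, and the returned filter state is what lets successive subframes join without discontinuities.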

    The decoder is shown in Fig 8.13

    Performance Evaluation of Speech Coders

    There are two approaches to evaluating the performance of a speech coder in terms of its ability to preserve signal quality: objective measures and subjective measures.

    Objective measures have the general nature of a signal-to-noise ratio (SNR) and provide a quantitative value of how well the reconstructed speech approximates the original speech.

    Mean square error (MSE) distortion, frequency-weighted MSE, segmented SNR, and the articulation index are examples of objective measures.
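
Two of these objective measures can be sketched in a few lines. The segment length of 160 samples (20 ms at 8 kHz) and the choice to skip silent or error-free segments are assumptions of this sketch, not part of any definition given in the slides.

```python
import math

def mse(original, decoded):
    """Mean square error between the original and reconstructed samples."""
    return sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)

def segmented_snr_db(original, decoded, seg_len=160):
    """Average of the per-segment SNR values, in dB."""
    snrs = []
    for start in range(0, len(original) - seg_len + 1, seg_len):
        o = original[start:start + seg_len]
        d = decoded[start:start + seg_len]
        signal = sum(x * x for x in o)
        noise = sum((x - y) ** 2 for x, y in zip(o, d))
        # Skip segments where the ratio is undefined (silence or a perfect match).
        if signal > 0 and noise > 0:
            snrs.append(10 * math.log10(signal / noise))
    if not snrs:
        raise ValueError("no segment with non-zero signal and non-zero error")
    return sum(snrs) / len(snrs)

# Usage: a constant signal reconstructed with a 10 percent amplitude error.
original = [1.0] * 320
decoded = [0.9] * 320
error = mse(original, decoded)
snr = segmented_snr_db(original, decoded)
```

Averaging the SNR per segment rather than over the whole utterance keeps loud passages from masking poor reconstruction of quiet ones.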

    Speech coders are highly speaker dependent, in that the quality varies with the age and gender of the speaker, the speed at which the speaker speaks, and other factors.

    The diagnostic acceptability measure (DAM) is a test that evaluates the acceptability of speech coding systems.

    The most popular ranking system is known as the mean opinion score, or MOS, ranking.
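
A MOS is simply the average of listener ratings. The sketch below assumes the common five-point scale (1 = bad, 5 = excellent), which the slides do not spell out; the function name and the sample ratings are hypothetical.

```python
def mean_opinion_score(ratings):
    """Average of listener ratings on an assumed 1-to-5 integer scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(ratings) / len(ratings)

# Usage: four hypothetical listeners rating the same decoded sentence.
score = mean_opinion_score([4, 4, 3, 5])
```

Because the score is an average over listeners and utterances, it is only meaningful when the panel and test material are varied enough to cover the speaker dependence noted above.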

    One of the most difficult conditions for a speech coder to perform well in is the case where a digital speech-coded signal is transmitted from the mobile to the base station, demodulated into an analog signal, and then speech coded again for retransmission as a digital signal over a landline or wireless link. This situation is called tandem signaling.

    In general, the MOS rating of a speech codec decreases with decreasing bit rate.

    Table 8.3 gives the performance of some popular speech coders on the MOS scale.

    Table 8.3 Performance of Coders