IPTEL'2001, New York, USA

Lingfen Sun

Graham Wade, Benn Lines

Emmanuel Ifeachor

University of Plymouth, U.K.

Impact of Packet Loss Location on Perceived Speech Quality


Outline

• Introduction

• Codec's internal concealment and convergence time

• Perceptual speech quality measurement

• Simulation system

• Loss location with perceived quality

• Loss location with convergence time

• Conclusions and future work


Introduction

• End-to-end speech transmission quality

– IP network performance (e.g. packet loss and jitter)

– Gateway/terminal (codec + loss/jitter compensation)

• Impact of packet loss on perceived speech quality

– Loss pattern (e.g. burst/random)

– Loss location (codec's concealment)

[Figure: end-to-end path: SCN, gateway, IP network, gateway, SCN]


Introduction (cont.)

• Previous research on loss location

– Concealment performance is related to speech content (e.g. voiced/unvoiced)

– Analysis based on MSE or SNR for a limited set of codecs

– Perceptual objective methods used only to assess overall quality under stochastic loss simulations

• Questions:

– How does the location of a packet loss affect perceived speech quality?

– How does the location of a packet loss affect the codec's convergence time (for the loss constraint)?


Codec's internal concealment

• What is codec's concealment?

– When a loss occurs, the decoder interpolates the parameters for the lost frame from the parameters of previous frames.

• Which codecs have a concealment algorithm?

– G.729/G.723.1/AMR (main VoIP codecs)

– CELP analysis-by-synthesis

• What are the limitations of concealment algorithms?

– During unvoiced (u) or voiced (v)

– During u/v
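
The concealment idea above can be illustrated with a toy sketch. It is not the G.729, G.723.1 or AMR algorithm: the `FrameParams` fields, the `conceal` function and the `attenuation` factor are all hypothetical, and the sketch simply repeats the last good frame's parameters with a decaying gain rather than performing the codec's real parameter interpolation.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class FrameParams:
    """Hypothetical per-frame decoder parameters (not a real codec layout)."""
    pitch_delay: int
    gain: float


def conceal(frames: List[Optional[FrameParams]],
            attenuation: float = 0.9) -> List[FrameParams]:
    """Replace lost frames (None) with parameters derived from the previous frame.

    The gain is scaled by `attenuation` for each consecutive lost frame, so a
    long burst fades out instead of repeating at full level.
    """
    out: List[FrameParams] = []
    # Assumed decoder state before any frame has arrived.
    last = FrameParams(pitch_delay=0, gain=0.0)
    for frame in frames:
        if frame is None:
            # Lost frame: reuse the previous parameters with a reduced gain.
            frame = replace(last, gain=last.gain * attenuation)
        out.append(frame)
        last = frame
    return out


# Frames 1 and 2 are lost; both are rebuilt from frame 0.
received = [FrameParams(40, 1.0), None, None, FrameParams(42, 0.8)]
concealed = conceal(received)
```

Because the substitute parameters come only from the past, a sketch like this has nothing useful to repeat when the loss falls where the speech content is changing, which is the kind of limitation the last bullet points at.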


Codec's convergence time

• What is convergence time?

– The time taken by the decoder to resynchronize its state with the encoder after a loss occurs. It is also called resynchronization time.

– Used to set the loss constraint distance between two consecutive losses for new packet loss metrics

• What is the relationship between convergence time and loss location, codec type and packet size?
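
One way to turn this definition into a measurement is sketched below, assuming an MSE criterion: compare the decoder output with the loss against the decoder output without it, frame by frame, and count how many frames after the loss the difference stays above a threshold. The function names and the `threshold` value are assumptions made for illustration, not values taken from the slides.

```python
from typing import List, Sequence


def frame_mse(a: Sequence[float], b: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared error between two signals, computed per frame."""
    n_frames = min(len(a), len(b)) // frame_len
    mse = []
    for k in range(n_frames):
        idx = range(k * frame_len, (k + 1) * frame_len)
        mse.append(sum((a[i] - b[i]) ** 2 for i in idx) / frame_len)
    return mse


def convergence_time(with_loss: Sequence[float],
                     without_loss: Sequence[float],
                     frame_len: int,
                     loss_frame: int,
                     n_lost: int = 1,
                     threshold: float = 1e-4) -> int:
    """Number of frames after the loss until the per-frame MSE stays below threshold."""
    mse = frame_mse(with_loss, without_loss, frame_len)
    first_received = loss_frame + n_lost  # first frame that arrives after the loss
    last_bad = first_received - 1
    for k in range(first_received, len(mse)):
        if mse[k] >= threshold:
            last_bad = k  # decoder output still differs at frame k
    return last_bad - first_received + 1


# Toy signals, 2 samples per frame: frame 1 is lost, frames 2 and 3 are
# still distorted, and frames 4 and 5 match the loss-free output again.
clean = [1.0] * 12
lossy = [1.0, 1.0, 0.0, 0.0, 0.5, 0.5, 0.9, 0.9, 1.0, 1.0, 1.0, 1.0]
frames_to_converge = convergence_time(lossy, clean, frame_len=2, loss_frame=1)
```

In this toy case the two distorted frames after the loss give a convergence time of 2 frames. The result depends on the chosen threshold, so it should be fixed before comparing loss locations or codecs.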


Perceptual quality measurement

• Transform the reference and degraded signals into a psychophysical representation that approximates human perception

• Calculate the perceptual difference between them

• Mapping to objective MOS (Mean Opinion Score)

• Algorithms: PSQM/PSQM+/MNB/EMBSD/PESQ

[Figure: the reference signal passes through the system/network under test to produce the degraded signal; the objective perceptual quality test takes both signals and outputs the objective MOS]


Simulation System

[Figure: block diagram with reference speech, encoder, bitstream, loss simulation, decoders producing degraded speech with loss and degraded speech without loss, perceptual quality measure, and convergence time analysis]

• Perceptual speech quality analysis with loss location

• Convergence time analysis with loss location


Speech test sentence

• The speech test sentence is about 6 seconds long.

• The first talkspurt (about 1.34 seconds, shown in the waveform) is used for loss location analysis.

• It contains four voiced segments, V(1) to V(4), which can be identified from the pitch delay in the G.729 codec.
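
The slides do not say how the voiced segments are derived from the pitch delay. One plausible heuristic, sketched here purely as an assumption, is to treat runs of frames whose pitch delay changes smoothly as voiced and frames with erratic jumps as unvoiced. The function name and the `max_jump` and `min_len` parameters are hypothetical.

```python
from typing import List, Tuple


def voiced_segments(pitch_delay: List[int],
                    max_jump: int = 5,
                    min_len: int = 4) -> List[Tuple[int, int]]:
    """Return (first_frame, last_frame) runs in which the pitch delay varies smoothly."""
    segments = []
    start = 0
    for k in range(1, len(pitch_delay) + 1):
        # A run ends at the end of the data or when the delay jumps too far.
        end_of_run = (k == len(pitch_delay)
                      or abs(pitch_delay[k] - pitch_delay[k - 1]) > max_jump)
        if end_of_run:
            if k - start >= min_len:
                segments.append((start, k - 1))
            start = k
    return segments


# Made-up pitch delays: erratic values with two smooth runs in between.
delays = [20, 90, 35, 120, 50, 51, 52, 53, 54, 110, 30, 95, 60, 61, 60, 62, 20]
segments = voiced_segments(delays)
```

With these made-up values the two smooth runs, frames 4 to 8 and frames 12 to 15, are reported as voiced. Real pitch-delay tracks would need the thresholds tuned against the waveform.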


[Figure: Pitch delay from G.729 codec. x-axis: frame location (10 ms/frame), 1 to 131; y-axis: pitch delay, 0 to 140; voiced segments V(1) to V(4) marked]


Loss location with perceived quality

• Only one packet loss is created in each test

• The loss position moves from left to right, one frame at a time

• Overall perceptual quality is measured with PSQM/PSQM+, MNB and EMBSD

• Packet size: 1 to 4 frames/packet

• Codec: G.729/G.723.1/AMR

• How does a loss location affect perceived speech quality?
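
The sweep described above can be sketched as a generator of loss masks: one lost packet per test, with the packet's start moved one frame at a time, for each packet size. The function name is hypothetical, and the frame count of 134 is an assumption derived from the 1.34-second talkspurt at 10 ms/frame; the encoding, decoding and quality measurement for each mask are left out.

```python
from typing import Iterator, List, Tuple


def single_loss_masks(n_frames: int,
                      frames_per_packet: int) -> Iterator[Tuple[int, List[bool]]]:
    """Yield (loss_position, mask) pairs; mask[k] is True when frame k is lost."""
    for start in range(n_frames - frames_per_packet + 1):
        mask = [start <= k < start + frames_per_packet for k in range(n_frames)]
        yield start, mask


# One sweep per packet size (1 to 4 frames/packet), one lost packet per test.
sweep = {size: list(single_loss_masks(n_frames=134, frames_per_packet=size))
         for size in (1, 2, 3, 4)}
```

Each mask would then drive the loss simulation for one test, and the resulting degraded speech would be scored against the reference to give one point on the quality-versus-location curve.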


Loss position with quality (1)

[Figures: four examples, each showing the reference speech, the degraded speech, the loss position, and the PSQM and PSQM+ results]


[Figure: Overall PSQM+ vs loss location (G.729). x-axis: loss location (in frames, 10 ms/frame), 1 to 131; y-axis: PSQM+, 1 to 2.6; curves for 1-frame, 2-frame, 3-frame and 4-frame packets]


[Figure: Overall MNB vs loss location (G.729). x-axis: loss location, 1 to 131; y-axis: MNB, 2.5 to 4]


[Figure: Overall EMBSD vs loss location (G.729). x-axis: loss location, 1 to 131; y-axis: EMBSD, 0 to 8]


[Figure: Overall PSQM+ vs loss location (G.723.1). x-axis: loss location, 1 to 41; y-axis: PSQM+, 1 to 4.5; curves for 1-frame, 2-frame, 3-frame and 4-frame loss]


Loss location with perceived quality

• Loss location affects perceived quality.

• A loss in an unvoiced speech segment has no obvious impact on perceived quality.

• A loss at the beginning of a voiced segment has the most severe impact on perceived quality.

• PSQM+ yields more detailed results than MNB/EMBSD.


[Figure: Convergence time based on MSE (G.729). x-axis: loss location, 1 to 131; y-axis: convergence time (frames), 0 to 50; curves for 1-frame, 2-frame, 3-frame and 4-frame loss]


[Figure: Convergence time based on PSQM+. x-axis: frame position, 1 to 15; y-axis: PSQM+ (on frame), 0 to 80; curves for loss locations 1 to 5]


[Figure: Convergence time based on PSQM+. x-axis: frame position, 1 to 41; y-axis: PSQM+ (on frame), 0 to 30; twelve curves labeled 1 to 12]


Loss location with convergence time

• Convergence time is almost the same for different packet sizes

• Convergence time for a loss in an unvoiced segment appears stable

• Convergence time shows a good linear relationship with loss position in voiced segments

– maximum at the beginning

– decreasing linearly

– upper-bounded by the end of the voiced segment


Conclusions and future work

• Investigated the impact of loss locations on perceived speech quality

• Investigated the impact of loss locations on convergence time

• The results will help in developing a perceptually relevant packet loss metric.

• Future work will focus on more extensive analysis of the impact of packet loss on speech content
