New Models for Perceived Voice Quality Prediction and their Applications in Playout Buffer Optimization for VoIP Networks
Dr. Lingfen Sun, Prof. Emmanuel Ifeachor
University of Plymouth, United Kingdom
{L.Sun; E.Ifeachor}@plymouth.ac.uk
ICC 2004, Paris, France, 20-24 June 2004
Outline
- Background
  o Speech quality for VoIP networks
  o Current status
  o Aims of the project
- Main Contributions
  o Novel non-intrusive voice quality prediction models
  o Novel perceptual-based speech quality optimization (e.g. jitter buffer optimization) mechanism
- Conclusions and Future Work
Background – Speech Quality for VoIP Networks
- VoIP speech quality: end-user perceived quality (MOS), an important metric.
- Affected by IP network impairments and other impairments.
- Voice quality measurement: subjective (MOS) or objective (intrusive or non-intrusive)
[Figure: SCN, Gateway, IP Network, Gateway, SCN chain showing end-to-end perceived speech quality. Intrusive measurement takes reference speech and degraded speech and outputs MOS; non-intrusive measurement outputs MOS. SCN: Switched Comm. Networks (PSTN, ISDN, GSM …)]
Current Status and Problems
- Lack of an efficient non-intrusive speech quality measurement method
  o E-model (a complicated computational model)
  o Based on subjective tests to derive models/parameters, time-consuming and expensive.
  o Only limited models exist
- Lack of perceptual optimization control methods
  o only based on individual network parameters for buffer optimization and QoS control purposes
  o not perceptual-based optimization control
Aims of the Project
- To develop novel and efficient method/models for non-intrusive quality prediction,
- To apply the models for perceptual-based optimization control (e.g. buffer optimization or adaptive sender-bit-rate QoS control).
[Figure: Voice source, Encoder and Packetizer (Sender), IP Network, De-packetizer, Jitter buffer and Decoder (Receiver), Voice receiver. Non-intrusive measurement outputs MOS: end-to-end perceived voice quality (MOS).]
Novel Non-intrusive Voice Quality Prediction
- Uses intrusive quality measurement (e.g. PESQ) to build models that predict voice quality non-intrusively, which avoids subjective tests.
- A generic method which can be applied to audio, image and video.
[Figure: Intrusive method: reference speech and degraded speech from the VoIP network go to PESQ, giving MOS(PESQ), which is combined with delay in the E-model to give Measured MOSc. Non-intrusive method: network parameters (packet loss, delay, codec …) go to the new model (regression or ANN models), giving Predicted MOSc.]
New Structure to Obtain MOSc
- PESQ can only predict one-way listening speech quality (expressed as MOS).
- With a new combined PESQ/E-model structure, conversational speech quality (MOSc) can be obtained as Measured MOSc.
[Figure: Reference speech and degraded speech go to PESQ, giving MOS (PESQ), which is converted MOS → R → Ie. End-to-end delay goes to the delay model, giving Id. Ie and Id go to the E-model, giving MOSc.]
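A minimal sketch of how the combined PESQ/E-model structure might be computed. It assumes the standard ITU-T G.107 mapping between the rating factor R and MOS, a default base rating of 93.2, Ie taken as 93.2 minus the R value equivalent to the PESQ score, and the simplified delay impairment Id that appears in the impairment function later in these slides. The function names and the bisection-based inversion are illustrative and not taken from the slides.

```python
R0 = 93.2  # assumed default E-model base rating (all other factors at defaults)


def r_to_mos(r):
    """ITU-T G.107 mapping from rating factor R to MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def mos_to_r(mos):
    """Invert r_to_mos by bisection on [6.5, 100], where the mapping is increasing."""
    mos = min(max(mos, 1.0), 4.5)
    lo, hi = 6.5, 100.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if r_to_mos(mid) < mos:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def delay_impairment(d_ms):
    """Simplified Id: 0.024*d + 0.11*(d - 177.3)*H(d - 177.3), d in ms."""
    extra = 0.11 * (d_ms - 177.3) if d_ms > 177.3 else 0.0
    return 0.024 * d_ms + extra


def measured_mosc(mos_pesq, delay_ms):
    """Conversational MOS from a PESQ listening score and end-to-end delay."""
    ie = R0 - mos_to_r(mos_pesq)          # MOS -> R -> Ie (assumed conversion)
    r = R0 - ie - delay_impairment(delay_ms)
    return r_to_mos(r)


if __name__ == "__main__":
    for delay in (0, 100, 200, 300):
        print(delay, round(measured_mosc(3.8, delay), 2))
```

With zero delay the sketch returns the PESQ score unchanged, and the conversational score falls as delay grows, which is the behaviour the combined structure is meant to capture.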
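Regression based Models (1)
- Nonlinear regression models are derived for Ie based on PESQ/PESQ-LQ
- Ie is then combined with Id to obtain MOSc.
[Figure, two panels (a) and (b). One panel: codec and packet loss go to the Ie model, giving Ie; delay (d) goes to the Id model, giving Id; Ie and Id go to the E-model, giving MOSc. The other panel: reference speech from a speech database passes through encoder, loss model and decoder to give degraded speech; PESQ/PESQ-LQ gives MOS (PESQ), converted MOS → R → Ie to give Measured Ie, which is used to fit the nonlinear regression model (Ie model) that outputs Predicted Ie.]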
Regression based Models (2)
Ie can be modelled by a logarithmic fitting function of the form

$$I_e = a \ln(1 + b\rho) + c$$

where ρ is the packet loss rate (%).

Parameters for different codecs (PESQ)

Parameters  AMR(H)  AMR(L)  G.729  G.723.1  iLBC
a           16.68   30.86   21.14  20.06    12.59
b*100       30.11   4.26    12.73  10.24    9.45
c           14.96   31.66   22.45  25.63    20.42
Regression Models for AMR (12.2 kb/s)
e.g. for AMR (12.2 kb/s),

$$I_e = 16.68 \ln(1 + 0.3011\rho) + 14.96$$

The goodness of fit is: SSE = 2.83 and R² = 0.998
[Figure: MOS vs. packet loss and delay]
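The regression model can be evaluated directly from the codec parameter table in "Regression based Models (2)". The sketch below assumes the packet loss rate is given in percent and divides the tabulated b*100 values by 100; the dictionary and function names are illustrative.

```python
import math

# (a, b, c) per codec; b is the tabulated b*100 value divided by 100.
CODEC_PARAMS = {
    "AMR(H)": (16.68, 0.3011, 14.96),
    "AMR(L)": (30.86, 0.0426, 31.66),
    "G.729": (21.14, 0.1273, 22.45),
    "G.723.1": (20.06, 0.1024, 25.63),
    "iLBC": (12.59, 0.0945, 20.42),
}


def equipment_impairment(loss_percent, codec):
    """Ie = a * ln(1 + b * loss) + c, with loss assumed to be in percent."""
    a, b, c = CODEC_PARAMS[codec]
    return a * math.log(1.0 + b * loss_percent) + c


if __name__ == "__main__":
    for codec in CODEC_PARAMS:
        print(codec, round(equipment_impairment(5.0, codec), 2))
```

At zero loss the model reduces to the constant c, so c can be read as the impairment attributable to the codec alone.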
Perceptual-based Buffer Optimization
- Motivation
  o Existing buffer algorithms are based only on individual network parameters (e.g. delay or loss), targeting only minimum average delay or minimum late arrival loss, not maximum MOS.
  o There is a need to design a buffer algorithm that achieves optimum perceived speech quality.
- Contribution: a perceptual-based optimization jitter buffer algorithm
  o Use regression based models for buffer optimization
  o Use a minimum impairment criterion instead of the traditional maximum MOS score
  o A Weibull delay distribution based on trace analysis
  o A perceptual-based optimization of playout buffer algorithm
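Impairment Function Im
Define: impairment function Im

$$I_m = f(d, \rho) = I_d + I_e = 0.024\,d + 0.11\,(d - 177.3)\,H(d - 177.3) + a \ln(1 + b\rho)$$

where $H(x) = 0$ if $x < 0$, $H(x) = 1$ if $x \ge 0$, and $a$, $b$ are codec-related parameters.

$$\rho = \rho_n + (100 - \rho_n)\,P(X > d) = \rho_n + (100 - \rho_n)\,e^{-((d - \mu)/\alpha)^{\gamma}}$$

where $\rho_n$ is the network loss (%) and $(\mu, \alpha, \gamma)$ are the Weibull distribution parameters.

[Figure: Weibull distribution of delay; the tail beyond the playout delay d is the buffer loss ρb.]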
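As a worked example of the impairment function, take the AMR (12.2 kb/s) parameters a = 16.68 and b = 0.3011 from the regression slides, and assume a playout delay of d = 200 ms with a total loss of ρ = 2%. These operating values are chosen for illustration only and do not come from the traces.

$$
\begin{aligned}
I_d &= 0.024 \times 200 + 0.11 \times (200 - 177.3) = 4.8 + 2.497 = 7.297 \\
a \ln(1 + b\rho) &= 16.68 \ln(1 + 0.3011 \times 2) = 16.68 \ln(1.6022) \approx 7.863 \\
I_m &\approx 7.297 + 7.863 = 15.16
\end{aligned}
$$

Because d exceeds 177.3 ms, the step function is 1 and the second delay term contributes. Below that threshold only the 0.024 d term remains.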
Minimum Impairment Criterion
Define: minimum impairment criterion
- Given: network delay dn, network loss ρn and codec type
- Estimate: an optimized playout delay dopt
- Such that: the minimum Im is reached.
[Figure: candidate playout delays d1, d2, d3 and d4, with the minimum Im marked.]
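One way the minimum impairment criterion could be applied is a simple grid search over candidate playout delays. The sketch below assumes a three-parameter Weibull delay distribution with hypothetical parameter names mu (location), alpha (scale) and gamma (shape), loss expressed in percent, a 1 ms search step and an upper bound d_max. None of these choices is specified on the slides.

```python
import math


def weibull_tail(d, mu, alpha, gamma):
    """P(X > d): fraction of packets arriving after playout delay d."""
    if d <= mu:
        return 1.0
    return math.exp(-(((d - mu) / alpha) ** gamma))


def impairment(d, loss_net, a, b, mu, alpha, gamma):
    """Im = Id + a*ln(1 + b*rho), with rho the network plus buffer loss (%)."""
    rho = loss_net + (100.0 - loss_net) * weibull_tail(d, mu, alpha, gamma)
    i_d = 0.024 * d + (0.11 * (d - 177.3) if d > 177.3 else 0.0)
    return i_d + a * math.log(1.0 + b * rho)


def optimal_playout_delay(loss_net, a, b, mu, alpha, gamma, d_max=400, step=1):
    """Grid search for the playout delay (ms) that minimises Im."""
    candidates = range(int(math.ceil(mu)), d_max + 1, step)
    return min(
        candidates,
        key=lambda d: impairment(d, loss_net, a, b, mu, alpha, gamma),
    )


if __name__ == "__main__":
    # AMR (12.2 kb/s) parameters; the Weibull values are made up for illustration.
    d_opt = optimal_playout_delay(1.0, 16.68, 0.3011, mu=50.0, alpha=10.0, gamma=1.5)
    print("d_opt =", d_opt, "ms")
```

A short playout delay lowers Id but raises late-arrival loss, while a long one does the opposite, so the minimum of Im sits where the two effects balance.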
Perceptual-based Optimization Buffer Algorithm

For every packet i received, calculate network delay ni
  if mode == SPIKE then
    if ni <= tail*old_d then
      mode = NORMAL
    endif
  elseif ni > head*di then
    mode = SPIKE; old_d = di
  else
    - update delay records for the past W packets
  endif

At the beginning of a talkspurt
  if mode == SPIKE then
    di = ni
  else
    - obtain (μ, α, γ) for the Weibull distribution for the past W packets
    - search for the playout delay d which meets the minimum Im criterion
  endif
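The spike-detection logic above can be sketched as a small state machine. In this illustration the Weibull fit and the minimum-Im search are hidden behind a hypothetical choose_delay callback that maps the recent delay records to a playout delay. The head, tail and window values are placeholders supplied by the caller, not values from the slides.

```python
from collections import deque


class PlayoutBuffer:
    """Spike-aware playout delay selection (illustrative sketch)."""

    def __init__(self, choose_delay, head, tail, window, initial_delay):
        self.choose_delay = choose_delay      # hypothetical: delays -> playout delay
        self.head = head                      # spike entry threshold factor
        self.tail = tail                      # spike exit threshold factor
        self.history = deque(maxlen=window)   # delay records for the past W packets
        self.mode = "NORMAL"
        self.d = initial_delay                # current playout delay di
        self.old_d = initial_delay            # playout delay when the spike began

    def on_packet(self, n_i, talkspurt_start=False):
        """Process one packet with network delay n_i; return the playout delay."""
        if self.mode == "SPIKE":
            if n_i <= self.tail * self.old_d:
                self.mode = "NORMAL"
        elif n_i > self.head * self.d:
            self.mode = "SPIKE"
            self.old_d = self.d
        else:
            self.history.append(n_i)

        # The playout delay is only adjusted at the beginning of a talkspurt.
        if talkspurt_start:
            if self.mode == "SPIKE":
                self.d = n_i
            elif self.history:  # guard added here: nothing to fit without records
                self.d = self.choose_delay(list(self.history))
        return self.d


if __name__ == "__main__":
    # max() is a stand-in for the Weibull fit plus minimum-Im search.
    buf = PlayoutBuffer(max, head=4.0, tail=2.0, window=100, initial_delay=50.0)
    for delay, start in [(60.0, True), (500.0, False), (300.0, True), (70.0, False)]:
        print(delay, buf.on_packet(delay, talkspurt_start=start), buf.mode)
```

Keeping the delay-selection step as a callback means the same state machine can be driven by any playout estimator, which makes it easier to compare against simpler schemes.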
Performance Analysis and Comparison (1)
- Selected five traces from UoP to CU (USA), DUT (Germany), BUPT (China), and NC (China).
- Traces 1 and 3 have high delay variation; traces 2, 4 and 5 have low delay variation.

Trace  Delay (ms)  Jitter (ms)  Loss (%)
1      153         16.2         1.1
2      46          0.8          0.3
3      186         19.5         14.3
4      16          0.7          4.4
5      150         0.2          0.2
Performance Analysis and Comparison (2)
- The "p-optimum" algorithm achieves the optimum voice quality for all traces.
- The "adaptive" algorithm achieves sub-optimum quality with low complexity.
[Figure: Performance comparison for buffer algorithms. MOS (axis from 0.5 to 4) for traces 1 to 5, comparing the exp-avg, fast-exp, min-delay, spk-delay, adaptive and p-optimum algorithms.]
Conclusions and Future Work
- Conclusions
  o The development of a new methodology and regression models to predict voice quality non-intrusively.
  o Demonstrated the application of new non-intrusive voice quality prediction models to perceptual-based optimization of playout buffer algorithms.
- Future Work
  o To consider buffer adaptation during a talkspurt in order to achieve the best trade-off between delay, loss and end-to-end jitter.
  o To extend the work to improve the performance of multimedia services (e.g. audio/image/video) over IP networks
Contact Details
http://www.tech.plymouth.ac.uk/spmc
Dr. Lingfen Sun: L.Sun@plymouth.ac.uk
Prof. Emmanuel Ifeachor: E.Ifeachor@plymouth.ac.uk
Any questions?
Thank you!