Major Challenges for VoIP End-point Design
Quality
Cost
VoIP Design Considerations
Speech Quality
Time to Market
Flexibility
Ease of Use
Network Impairments
Power Consumption
Cost
Signaling
Features
Infrastructure
Device Considerations
VoIP Design Challenges
Coping with Network Degradation
Power Consumption
Hardware Issues (Processor, OS, Acoustics, etc.)
Echo Cancellation
Additional Voice Processing Components
Environment – Background Noise, Room Acoustics, etc.
Speech Codec
Both Sides of the Call Need to be Considered
Network
Codec
Hardware
Echo
Power
Voice Environment
Impact of IP Networks
Delay
– Major effect is “stepping on each other’s talk”
– Usage scenario affects annoyance factor: higher delay can be tolerated for mobile devices
– Long delays make echo more annoying
Packet Loss
– Smooth concealment necessary
Network Jitter
– Jitter buffer necessary to ensure continuous playout
– Trade-off between delay and quality
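A receiver needs an estimate of network jitter before it can size a jitter buffer. The sketch below uses the running interarrival jitter estimate defined for RTP in RFC 3550 (section 6.4.1). The function name and the millisecond timestamp lists are assumptions made for illustration and are not taken from the slides.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Running jitter estimate in the style of RFC 3550, section 6.4.1.

    Both arguments are hypothetical lists of per-packet timestamps in
    milliseconds, in arrival order and of equal length.
    """
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        # Difference in transit time between two consecutive packets.
        d = (recv_times_ms[i] - recv_times_ms[i - 1]) - (
            send_times_ms[i] - send_times_ms[i - 1]
        )
        # First-order smoothing with gain 1/16, as in RFC 3550.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


# Packets sent every 20 ms; the second one arrives 10 ms late.
print(interarrival_jitter([0, 20, 40], [50, 80, 90]))
```

A larger estimate suggests a deeper jitter buffer, which is exactly the delay versus quality trade-off listed above.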
Sources of Latency
– Codec
– Capture
– Playout
– Network delay
– Jitter buffer
– OS interaction
– Transcoding
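The sources above add up to the one-way delay, so a simple budget shows where the milliseconds go. The sketch below is illustrative only: every number in it is a made-up assumption, not a measurement and not a figure from the slides.

```python
# Hypothetical one-way delay contributions in milliseconds (illustrative only).
budget_ms = {
    "capture": 10,
    "codec": 25,
    "network delay": 40,
    "jitter buffer": 45,
    "OS interaction": 10,
    "playout": 10,
    "transcoding": 0,
}


def one_way_delay_ms(components):
    """Total one-way delay as the sum of all contributions."""
    return sum(components.values())


def largest_contributor(components):
    """Name of the component that adds the most delay."""
    return max(components, key=components.get)


print(one_way_delay_ms(budget_ms), largest_contributor(budget_ms))
```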
Impact of Delay on Voice Quality
[Figure: end-to-end signal path between two endpoints over an IP network. Send side: A/D -> Pre-processing -> Speech encoding -> IP interface. Receive side: IP interface -> Jitter buffer -> Speech decoding -> Post-processing -> D/A. Both endpoints contain the same chain.]
ITU-T (G.114) recommends:
– Less than 150 ms one-way delay for most applications (up to 400 ms acceptable in special cases)
Users have got used to longer delays
– Still, low delay very important for high quality
[Figure: Mean Opinion Score (axis ticks 1 to 4) versus one-way transmission time (0 to 750 ms). Data from ITU-T G.114.]
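The two G.114 thresholds quoted above can be turned into a quick check on a measured one-way delay. This is a minimal sketch: the function name and the category labels are assumptions chosen for illustration, and only the 150 ms and 400 ms limits come from the slide.

```python
def g114_category(one_way_delay_ms):
    """Classify a one-way delay against the limits quoted from ITU-T G.114."""
    if one_way_delay_ms < 150:
        return "most applications"  # below the 150 ms recommendation
    if one_way_delay_ms <= 400:
        return "special cases"      # up to 400 ms acceptable in special cases
    return "beyond quoted limits"   # label is an assumption, not G.114 wording


for delay in (80, 250, 600):
    print(delay, g114_category(delay))
```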
Speech Codec
Many conflicting parameters affect choice of codec
Determines upper limit of quality
Support of several codecs necessary
– Interoperability
– Usage scenario
IPR issues a significant concern
[Figure: parameters affecting the choice of speech codec: packet-loss robustness, memory, input signal robustness, sampling rate, complexity, delay, bit-rate, quality.]
Audio Spectrum
Better than PSTN quality is achievable in VoIP
– Utilizing full 0 – 4 kHz band in narrowband
– Wideband coding offers more natural and crisper voice
Telephony band
Audio Spectrum vs. Speech Quality
[Figure: speech quality versus audio bandwidth. Frequency axis marked at 4, 8, 10, 16 and 22.1 kHz; labels: Narrowband Speech (PSTN), Wideband Speech, Super Wideband Speech, CD.]
Speech Codec Design for VoIP
Many standard codecs designed for bit errors, not packet loss
– Error propagation issue for CELP codecs
Variable bit rate attractive for IP networks
Packet overhead significant (5 – 32 kb/s)
– Makes low bit rate codecs less attractive
Packet loss concealment a must
Jitter buffer design has significant impact on quality
Alternatives to standards
– De-facto standards like iSAC
– Open source like Speex
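The packet overhead figure can be checked with simple arithmetic. The sketch below assumes uncompressed IPv4, UDP and RTP headers (20 + 8 + 12 = 40 bytes) and ignores link-layer framing; under that assumption the quoted 5 – 32 kb/s range corresponds to packets carrying between 60 ms and 10 ms of speech. The function names and the 8 kb/s codec rate in the usage line are hypothetical.

```python
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no header compression


def header_overhead_kbps(packet_ms):
    """Header bit rate for one packet every packet_ms milliseconds.

    Bits per millisecond is numerically equal to kb/s.
    """
    return HEADER_BYTES * 8 / packet_ms


def total_kbps(codec_kbps, packet_ms):
    """Codec payload rate plus header overhead."""
    return codec_kbps + header_overhead_kbps(packet_ms)


for packet_ms in (10, 20, 60):
    print(packet_ms, round(header_overhead_kbps(packet_ms), 1))

# A hypothetical 8 kb/s codec with 20 ms packets: headers double its rate twice over.
print(total_kbps(8, 20))
```

With short packets the headers can outweigh the payload of a low bit rate codec, which is why such codecs lose much of their advantage on IP networks.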
Echo Cancellation
High delay in VoIP makes echo problem more prominent
Network/Line echo cancellation for gateways
Acoustic echo cancellation
– Hands-free/speakerphone
– Small devices
Biggest challenge is AEC for PC
– Acoustic setup unknown and changing
– Wideband speech
– Very few solutions on the market
Effects of Transcoding
Transcoding occurs when the endpoints are using different codecs
– Every transcoding introduces distortion
– Low bit-rate codecs very sensitive to transcoding
Transcoding between networks
– VoIP to PSTN: limited quality degradation since G.711 used on the PSTN side
– VoIP to Cellular: severe quality degradation common since low bit-rate codecs typically used on both sides
– VoIP to VoIP: usually occurs in Session Border Controllers; can normally be avoided
Transcoding in conferencing
– Mixing done in decoded domain results in transcoding
How to Make the VoIP Software Robust?
Spot Jitter Patterns – Increase Delay to Keep Good Quality when Unavoidable
Packet Loss Concealment – Capable of Handling Several Lost Packets in a Row
Very Quick Jitter Buffer Adaptation – Conditions Change Very Rapidly (on a millisecond basis)
Minimize Delay Everywhere – every millisecond counts
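To make "several lost packets in a row" concrete, the sketch below shows a deliberately naive concealment baseline: repeat the last good frame and fade it further with every consecutive loss. It is not the method used by any particular product; production concealment typically extrapolates the signal far more carefully. The function name, the `fade` parameter and the use of `None` to mark a lost frame are all assumptions.

```python
def conceal(frames, frame_len, fade=0.5):
    """Naive packet loss concealment by attenuated frame repetition.

    frames: list of decoded frames (lists of samples); None marks a lost frame.
    """
    out = []
    last = [0.0] * frame_len  # silence until the first good frame arrives
    gain = 1.0
    for frame in frames:
        if frame is None:
            gain *= fade  # fade further for each consecutive loss
            out.append([s * gain for s in last])
        else:
            last = list(frame)
            gain = 1.0    # reset once a good frame is received
            out.append(list(frame))
    return out


print(conceal([[1.0, 1.0], None, None, [2.0, 2.0]], frame_len=2))
```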
Measuring Voice Quality
Subjective Methods
Test the “right thing”, i.e. subjective quality
Takes all types of degradation into account
Time consuming and costly
Lack of repeatability
Objective Methods
Simple and affordable
Inaccurate but repeatable results
Sensitive to any processing (non-linear filtering, echo cancellation, time warping etc.)
– Time synchronization major challenge not yet solved
Sensitive to background and equipment impairments
One step behind development of codecs and error concealment
Next generation algorithm in standardization process (P.OLQA)
Audio Conferencing
Design includes a trade-off between quality and scalability
Client based or server based
– Server based offers better scalability than client based
– Can be combined
Transcoding often unavoidable
Two strategies:
– Mix incoming signals to form one output signal
– Only relay packets and mix at client side
Multi-codec support
– In relay mode all endpoints need to support all codecs
Narrowband and wideband
– Both can be present in a conference
– Narrowband participant will hear everything in narrowband
– Wideband participant hears others in narrowband or wideband
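[Figure: five-party conference (A to E). Each participant receives the mix of all the others: A receives B+C+D+E, B receives A+C+D+E, C receives A+B+D+E, D receives A+B+C+E, E receives A+B+C+D.]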
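The mixing strategy, where each participant hears everyone except themselves, can be sketched by summing all decoded signals once and subtracting each participant's own contribution. The function name and the dictionary layout are assumptions, and clipping, gain control and codec handling are left out.

```python
def mix_minus_self(signals):
    """Return, per participant, the sum of all other participants' samples.

    signals: dict mapping a participant name to a list of decoded samples;
    all lists are assumed to have the same length.
    """
    length = len(next(iter(signals.values())))
    total = [sum(s[i] for s in signals.values()) for i in range(length)]
    return {
        name: [t - own for t, own in zip(total, samples)]
        for name, samples in signals.items()
    }


mixes = mix_minus_self({"A": [1, 0], "B": [2, 0], "C": [3, 1]})
print(mixes["A"])  # the mix of B and C only
```

Because the mix is built from decoded samples and then re-encoded for each listener, this approach implies transcoding, while the relay strategy avoids it at the cost of requiring every endpoint to support every codec.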
Conclusions
Latency has a significant impact on the perceived quality in VoIP
– Low latency, high quality (e.g. NetEQ) jitter buffer necessary
Choose the right codec for the usage scenario
– Or a codec that can adapt like iSAC
Transcoding should be avoided, if possible
Significantly better quality than PSTN possible
– Wideband coding
No good objective measure for speech quality exists
– Always combine with subjective evaluation