Leveraging Wideband Codecs for VoIP Development Laurent Amar President, VoiceAge Corporation

Page 1

Leveraging Wideband Codecs for VoIP Development

Laurent Amar, President, VoiceAge Corporation

Page 2

Contents

Speech Communication/Coding Basics

Wideband Speech Description and Applications

Wideband Speech Codec Standards

Real-World Wideband VoIP Deployment

Wideband Momentum

What’s Next & Wrap Up

Page 3

Speech Signal Basics

Page 4

Understanding Speech Communication

Human Physiology and Perception are Key

Encode and exchange primarily the speech signal information that is important for human perception

Use human speech production and comprehension parameters to reduce bit rate and enhance communication quality

Page 5

Speech Coding Attributes

Bit rate

• As low as possible

Delay

• As little as possible

Quality

• As high as possible

Complexity

• As algorithmically simple as possible to constrain platform processing and memory requirements

Robustness

• Effective operation under background noise and channel impairment conditions

Standards compliance

• Open, tested and interoperable solutions

As required by specific applications

Difficult to attain all of these often divergent objectives at the same time

Page 6

Speech Synthesis Model

Used in CELP (Code Excited Linear Prediction) Speech Coding

[Figure: block diagram of the CELP synthesis model. Blocks: innovative excitation (1), long-term prediction (2), short-term prediction (3), producing the synthesized speech. Signal labels: c(n), v(n), ŝ(n).]

1 = air from lungs

2 = vocal cords (periodicity)

3 = vocal tract (mouth + lips)

A very successful speech compression algorithm is based on Algebraic CELP: ACELP®
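The three-stage model above (excitation, long-term prediction, short-term prediction) can be sketched in a few lines of Python. This is only an illustrative toy, not the ACELP algorithm: the function names, the single-tap pitch filter and the tiny filter order are assumptions made for this example, and the way `c`, `v` and `s` map onto the signal labels of the lost diagram is also an assumption.

```python
def long_term_synthesis(excitation, pitch_lag, pitch_gain):
    """Add periodicity (stage 2): v(n) = c(n) + pitch_gain * v(n - pitch_lag).

    pitch_lag is assumed to be an integer >= 1 (hypothetical parameter).
    """
    v = []
    for n, c in enumerate(excitation):
        past = v[n - pitch_lag] if n >= pitch_lag else 0.0
        v.append(c + pitch_gain * past)
    return v


def short_term_synthesis(v, lpc):
    """Shape the spectrum (stage 3): s(n) = v(n) - sum_k lpc[k-1] * s(n - k)."""
    s = []
    for n, sample in enumerate(v):
        acc = sample
        for k, a in enumerate(lpc, start=1):
            if n >= k:
                acc -= a * s[n - k]
        s.append(acc)
    return s


def celp_synthesize(excitation, pitch_lag, pitch_gain, lpc):
    """Chain the two filters: excitation -> long-term -> short-term."""
    v = long_term_synthesis(excitation, pitch_lag, pitch_gain)
    return short_term_synthesis(v, lpc)


if __name__ == "__main__":
    # A single pulse as a stand-in for the innovative excitation (stage 1).
    pulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(celp_synthesize(pulse, pitch_lag=2, pitch_gain=0.5, lpc=[-0.5]))
```

A real codec updates the pitch lag, gains and prediction coefficients every frame or subframe and uses a higher-order short-term filter; the sketch keeps them fixed to show only the signal flow.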

Page 7

ACELP at the heart (overview)

CELP Decoder Principles

Page 8

More on ACELP implementation

CELP Encoder Principles

• The excitation parameters (codebook indices and gains) are determined by minimizing the perceptually weighted error between original and synthesized speech.

• Analysis-by-synthesis: a ‘local decoder’ (the orange part of the slide's block diagram) exists inside the encoder.
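The analysis-by-synthesis idea in the two bullets above can be illustrated with a brute-force codebook search. This is a simplified sketch under stated assumptions: the function names and the toy codebook are hypothetical, the error is a plain squared error (the perceptual weighting filter is omitted), and real ACELP uses a structured algebraic codebook with a fast search rather than exhaustive trial.

```python
def synthesize(excitation, lpc):
    """Local decoder: s(n) = e(n) - sum_k lpc[k-1] * s(n - k)."""
    s = []
    for n, sample in enumerate(excitation):
        acc = sample
        for k, a in enumerate(lpc, start=1):
            if n >= k:
                acc -= a * s[n - k]
        s.append(acc)
    return s


def search_codebook(target, codebook, lpc):
    """Return (index, gain, error) for the candidate that best matches target.

    Each candidate is run through the local decoder, the optimal gain is
    computed in closed form, and the candidate with the smallest squared
    error wins. Perceptual weighting is left out of this sketch.
    """
    best = None
    for index, code in enumerate(codebook):
        candidate = synthesize(code, lpc)
        energy = sum(y * y for y in candidate)
        if energy == 0.0:
            continue  # an all-zero candidate cannot match anything
        gain = sum(t * y for t, y in zip(target, candidate)) / energy
        error = sum((t - gain * y) ** 2 for t, y in zip(target, candidate))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


if __name__ == "__main__":
    lpc = [-0.5]
    codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 0.0, -1.0, 0.0]]
    # Build a target from entry 1 scaled by 2, then check the search finds it.
    target = [2.0 * y for y in synthesize(codebook[1], lpc)]
    print(search_codebook(target, codebook, lpc))
```

Only the winning index and gain would be transmitted; the decoder repeats the same `synthesize` step to rebuild the signal.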

Page 9

International Standards Using ACELP

Page 10

AMR Standard Codec Family at a Glance

Built on a solid, market-proven ACELP® technology foundation

G.722.2

A Complete Suite of Low Bit Rate Speech and Audio Coding Solutions


Page 12

What is Wideband Telephony?

An Emerging Opportunity to Deliver Vastly Improved Speech Quality

• Substantially increases transmitted speech information

• Double the bandwidth

• Enables digital end-to-end packet-based telephony services to deliver much better speech quality than traditional PSTN circuit-switched telephony

• VoIP quality differentiator

Hearing is believing! Visit VoiceAge at booth #305 for a demo

Also visit the listening room at www.voiceage.com to hear samples

Page 13

Why Wideband VoIP Telephony Now?

Enabling Technologies and Consumer Perceptions are Converging

Wideband Telephony Benefits:

Improved presence, naturalness and intelligibility

• Reduces listener fatigue

• Improved hands-free/speakerphone sound quality

Improves speaker and speech recognition

High-quality low bit rate wideband codecs

• e.g., G.722.2/AMR-WB & VMR-WB at rates ranging from 7–24 kbps

Rising user awareness of enhanced sound quality

• Wideband teleconferencing

• Wideband enterprise IP telephony

• Wireless/VoIP multimedia services

Driving up expectations!

Interoperable wideband codec solutions over end-to-end digital networks help pave the way for fixed/mobile convergence

Page 14

Typical Speech Signal Acoustics

Wideband telephony covers much more speech signal information

Improved voice quality and intelligibility (e.g., s & f differentiation)

Improved speech naturalness, presence and comfort

[Figure: spectrogram of a speech signal. Horizontal axis: Time [s], 0 to 3.0. Vertical axis: Frequency [Hz], 0 to 7000. Marked bands: 200-3400 Hz (narrowband) and 50-7000 Hz (wideband).]

“Everyone looked extremely confused about the news”
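The two band labels on this slide, 200-3400 Hz and 50-7000 Hz, make the "double the bandwidth" claim easy to check with a little arithmetic. The sketch below is only a worked calculation; the constant and function names are made up for this example, and the Nyquist helper simply applies the rule that the sampling rate must be at least twice the highest frequency carried.

```python
NARROWBAND_HZ = (200, 3400)   # band label from the slide
WIDEBAND_HZ = (50, 7000)      # band label from the slide


def bandwidth(band):
    """Width of a (low, high) frequency band in Hz."""
    low, high = band
    return high - low


def bandwidth_ratio(wide, narrow):
    """How many times wider the first band is than the second."""
    return bandwidth(wide) / bandwidth(narrow)


def nyquist_rate(band):
    """Minimum sampling rate in Hz needed to carry the band's top frequency."""
    _, high = band
    return 2 * high


if __name__ == "__main__":
    print(bandwidth(NARROWBAND_HZ), bandwidth(WIDEBAND_HZ))
    print(bandwidth_ratio(WIDEBAND_HZ, NARROWBAND_HZ))
    print(nyquist_rate(NARROWBAND_HZ), nyquist_rate(WIDEBAND_HZ))
```

The Nyquist figures sit just below the 8 kHz and 16 kHz sampling rates conventionally used for narrowband and wideband telephony.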

Page 15

Wideband Telephony Applications

Scope is much wider than VoIP Telephony

VoIP hi-fi telephony (G.722.2)

Cellular wireless hi-fi telephony (AMR-WB & VMR-WB)

Wi-Fi VoIP telephony

Converged wireless/wire-line telephony

Multi-point audio and video teleconferencing

Video telephony audio coding

Call center conversation recording and archiving

Speech and speaker recognition-based systems

Digital radio broadcasting and field reporting

Hi-fi ringtones


Page 17

The Standard Solution Advantage

Interoperable, Open and Fully Tested

Open, collaborative and competitive process

Requirements specifically address target applications

Published algorithms and source code

• Permits wider and more effective scrutiny

Rigorous comparative testing under diverse conditions

• Background noise types and levels

• Spoken languages

• Speaker types

• Various network impairments

Ensures that the best technologies are chosen

Page 18

Evolution of Wideband Standards

A steady progression of high-quality speech coding technologies

[Figure: timeline of speech codec standards, grouped on the slide under the labels 3GPP2, 3GPP, ITU-T, Narrowband, Wideband and Interoperable Wideband. The entries recoverable from the figure are listed below.]

3GPP, narrowband:

• 1987: FR, 13 kb/s

• 1994: HR, 5.6 kb/s

• 1995: EFR, 12.2 kb/s

• 1999: AMR-NB, 4.75-12.2 kb/s

ITU-T, narrowband:

• 1972: G.711, 64 kb/s

• 1984: G.726, 32 kb/s

• 1992: G.728, 16 kb/s

• 1995: G.729, 6.4, 8, 11.8 kb/s

ITU-T, wideband:

• 1988: G.722, 48, 56, 64 kb/s

• 1999: G.722.1, 24, 32 kb/s

3GPP/ITU-T, wideband:

• 2001-2002: AMR-WB/G.722.2, 6.6-23.85 kb/s

3GPP2:

• 1993: IS-96A, Rate-Set I

• 1995: IS-96A, Rate-Set II

• 1997: EVRC, Rate-Set I

• 2000: SMV, Rate-Set I

• 2004: VMR-WB (Source Controlled), Rate-Set I & II

Page 19

G.722.2/AMR-WB and VMR-WB Standards

• 3GPP 1999 TS 26.111 recommends AMR-WB for (3G-324H) multimedia telephone handsets

• 3GPP 2001 TS 26.190 defines the AMR-WB codec

• ITU-T 2002 G.722.2 recommended for wideband speech

• 3GPP2 (2004) C.S0052, “Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB), Service Options 62 and 63 for Spread Spectrum Systems,” specifies the VMR-WB codec for cdma2000® systems.

• 3GPP 2005 TS.235 requires packet-switched multimedia terminals at 16 kHz and PoC terminals to support AMR-WB

• OMA 2005 Push-to-Talk User Plane states the PoC server must support AMR and AMR-WB media parameters

Widespread success in international standards competitions

Page 20

G.722.2 Subjective Testing Results

[Figure: bar chart, "AMR-WB Characterization Test: Clean Condition Test (English Language)". Vertical axis: MOS, 1.0 to 5.0. Conditions: No Tandem at -26 dBov and Self-Tandem at -26 dBov. Codecs compared: G.722 @ 64 kbps, G.722 @ 48 kbps, G.722.2 @ 8.85 kbps, G.722.2 @ 12.65 kbps, G.722.2 @ 18.25 kbps, G.722.2 @ 23.05 kbps.]

G.722.2/AMR-WB Delivers Excellent Wideband Speech Quality Even at Low Bit Rates (e.g. MOS at 8.85 kbps exceeds G.722 at 48 kbps)
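To put the bit rates in this comparison in concrete terms, the following sketch converts each rate into bits per 20 ms of speech and computes how much less data G.722.2 at 8.85 kbps sends than G.722 at 48 kbps. It is a worked calculation only: the names are invented for this example, and the 20 ms interval is an assumption matching the AMR-WB frame length (G.722 is not frame-based, so for it the figure is simply the bits produced in 20 ms).

```python
FRAME_MS = 20  # assumption: 20 ms of speech per frame

# Rates named in the listening test, in kbps.
G722_RATES_KBPS = [64, 48]
G7222_RATES_KBPS = [8.85, 12.65, 18.25, 23.05]


def bits_per_frame(rate_kbps, frame_ms=FRAME_MS):
    """kbps times milliseconds gives bits; round away float noise."""
    return round(rate_kbps * frame_ms)


def rate_reduction(reference_kbps, codec_kbps):
    """How many times lower codec_kbps is than reference_kbps."""
    return reference_kbps / codec_kbps


if __name__ == "__main__":
    for rate in G722_RATES_KBPS + G7222_RATES_KBPS:
        print(rate, "kbps ->", bits_per_frame(rate), "bits per 20 ms")
    print("48 kbps vs 8.85 kbps:", round(rate_reduction(48, 8.85), 2), "x")
```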


Page 22

Enabling Wideband VoIP Telephony

The Key Underpinnings

Wideband speech coding technology is ready – what else is needed for mass adoption?

Wideband capable terminal device speakers and microphones

More and more network elements and end-devices equipped with compatible wideband codecs

• Standard wideband codecs ensure smooth interoperability

• Software-driven terminals enable downloading of the latest enhancements to standard wideband codecs

• Relevant application servers and network infrastructure gear need to support the necessary wideband standard codecs

Fully digital packet-based VoIP networks that are readily configurable to support wideband telephony

Page 23

Implementation Considerations

Interoperability

• Important to eliminate or reduce transcoding

  • Transcoding adds cost, delay and jitter

  • Degrades speech quality

Complexity

• Tradeoff between bit rate and complexity/memory

• An important design consideration for handheld devices

• Miniaturization trends, Moore’s law and other innovations are still going strong though

Quality of Service

• Robust real-world performance, need to consider:

  • Packet loss – Counter with concealment and FEC methods

  • Background noise – Mitigate with noise suppression

  • Delay and jitter – Minimize delay and manage jitter

Total bit rate available

• Codec & system/channel coding both contribute
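The last point, that the codec and the system overhead both contribute to the total bit rate, can be made concrete for VoIP with a rough per-packet calculation. This is a back-of-the-envelope sketch, not a statement of any product's behavior: the function name is hypothetical, it assumes 20 ms frames, uncompressed IPv4 + UDP + RTP headers (20 + 8 + 12 bytes), and it ignores RTP payload-format header bits and link-layer framing.

```python
import math

# Assumption: plain IPv4 (20) + UDP (8) + RTP (12) headers, no header compression.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def packet_bitrate_kbps(codec_kbps, frame_ms=20, frames_per_packet=1,
                        header_bytes=IP_UDP_RTP_HEADER_BYTES):
    """Approximate on-the-wire rate in kbps for one direction of a call."""
    # kbps * ms = bits of codec payload carried in one packet.
    payload_bits = round(codec_kbps * frame_ms * frames_per_packet)
    payload_bytes = math.ceil(payload_bits / 8)  # pad to whole bytes
    packet_bits = (payload_bytes + header_bytes) * 8
    # bits per millisecond is numerically equal to kbps.
    return packet_bits / (frame_ms * frames_per_packet)


if __name__ == "__main__":
    for frames in (1, 2):
        rate = packet_bitrate_kbps(12.65, frames_per_packet=frames)
        print(frames, "frame(s) per packet:", rate, "kbps")
```

Packing more frames into each packet lowers the overhead share but adds delay, which is the tradeoff against the delay and jitter item in the same list.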

Page 24

Enabling Transcoder Free Interoperability

Enabling Seamless Communication across Wireline, Wireless and Wi-Fi Networks
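Transcoder-free operation depends on both endpoints sharing at least one codec. The sketch below shows that idea in its simplest form; it is a hypothetical illustration, not an implementation of SIP/SDP offer/answer, and the function names and codec name strings are assumptions chosen for the example.

```python
def pick_common_codec(caller_codecs, callee_codecs):
    """Return the first codec in the caller's preference order that the
    callee also supports, or None when there is no codec in common."""
    supported = set(callee_codecs)
    for codec in caller_codecs:
        if codec in supported:
            return codec
    return None


def needs_transcoding(caller_codecs, callee_codecs):
    """True when no shared codec exists, so a transcoder would be required."""
    return pick_common_codec(caller_codecs, callee_codecs) is None


if __name__ == "__main__":
    softphone = ["AMR-WB", "G.711"]       # hypothetical VoIP client
    handset = ["AMR-WB", "AMR"]           # hypothetical 3G handset
    legacy_phone = ["G.729"]              # hypothetical narrowband device
    print(pick_common_codec(softphone, handset))
    print(needs_transcoding(softphone, legacy_phone))
```

When the shared codec is a wideband one, the call stays wideband end to end with no transcoding step in the path.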

Page 25

Growing Real-World Wideband Deployment Momentum

Teleconferencing system vendors

• Wideband telephony deployment pioneers – have a very compelling wideband speech application

Hi-fi ringtones (True Tones)

• Increasing deployment in newer mobile phones from major vendors

Enterprise IP phone systems

• Campus LAN environments provide an ideal platform for rolling out wideband telephony

Emerging wideband VoIP services for the masses

• Provide an opportunity for service providers to differentiate VoIP offering to the mass market

• Broadband Internet access is quickly becoming the norm, helping VoIP become mainstream

• Increasing availability of wideband speech capable devices

• Softphone clients like Xten™’s eyeBeam™ are integrating wideband codecs (G.722.2)

Page 26

Wideband in Enterprise VoIP

Enterprises are deploying wideband VoIP telephony

• Intra-site GbE/10 GbE LANs widely deployed

  • Facilitate converged IT corporate data and VoIP voice communications over a common network infrastructure

  • Intra-site networks primed for VoIP with wideband

• Improves communications effectiveness and productivity within a corporate network

  • Little or no additional cost needed

  • Improves mission critical communications (e.g. hospitals)

• Compression and robustness are important for cost-effective communications between sites over a WAN

  • Also significant when reaching out to mobile employees (either over cellular at remote sites or over a WLAN connection within a site or campus)

Page 27

Wideband VoIP over Xten Softphones

Xten™ eyeBeam™ has readily implemented and demonstrated G.722.2 VoIP

Enabling a higher quality conversation with the same/similar bandwidth as narrowband codecs

Service providers can provide a higher value service for the same cost

Supports interoperability between SIP and 3G cellular network devices without audio signal transcoding

G.722.2 readily integrated and demonstrated on the eyeBeam

Reduces the need for operators to purchase, operate and maintain additional equipment such as transcoders and wideband capable hard-phones

Enables service providers to roll out VoIP services with superior voice quality

[Figure: Xten™ eyeBeam™]


Page 29

Beyond Wideband Speech, what’s next?

Teleconferencing solution pioneers are introducing new audio enhancements:

• Ultra-wideband, which typically increases the transmitted speech bandwidth to 14–16 kHz

  • Increases further the richness of conversational voice quality

• Stereo sound and spatial sound

  • Gives a better sense of speaker directionality for remote meetings

Audio improvements also driven by multimedia services, such as:

• On-line gaming, audiovisual telephony and rich messaging

Emerging hybrid speech and stereo audio codecs effectively meet these emerging needs with efficient use of channel capacity, e.g.:

• The AMR-WB+ hi-fi audio compression codec (selected by the 3GPP for mobile multimedia services) encompasses essentially the full human audio spectrum with parametric stereo, even at low bit rates.

Page 30

Summary

Wideband speech is beginning to gain real-world momentum

• The key enablers are widely available (end-user devices, end-to-end digital networks, interoperable standard WB codecs, …)

User expectations for improved audio quality are rising

• Video telephony, audiovisual conferencing, remote collaboration and other multimedia services are expected to be extremely popular for both business and residential use

Once wideband speech communication is widely deployed and available it will increasingly become expected by users as the norm

The stage is set for widespread wideband VoIP – it is time for the main players (you the developers) to make it happen

What are you waiting for?

Go make it happen!

Page 31

Abbreviations/Glossary

3GPP: Third Generation Partnership Project (Standards body defining GSM evolution to 3G networks)

3GPP2: Third Generation Partnership Project 2 (Standards body defining CDMA evolution to 3G)

AMR: Adaptive Multi-Rate (standard narrowband speech codec for GSM and WCDMA networks)

AMR-WB/G.722.2: Adaptive Multi-Rate Wideband (standard wideband speech codec for GSM and WCDMA networks, and for ITU-T as G.722.2)

AMR-WB+: Extended Adaptive Multi-Rate Wideband (standard wideband speech and hi-fi audio codec)

CDMA: Code Division Multiple Access (Technology behind the second most popular cellular networks)

BTS: Base Transceiver Station

BSS: Base Station System

CNG: Comfort Noise Generation (decoder feature that generates comfort noise to avoid listener annoyance when the encoder at the far end is not transmitting due to silence)

GSM: Global System for Mobile (most widely deployed cellular mobile technology)

ITU-T: International Telecommunication Union – Telecommunication Standardization Sector

MOS: Mean Opinion Score (a subjective test methodology for evaluating speech quality)

OMA: Open Mobile Alliance (an organization formed to facilitate the global user adoption of mobile data services)

PoC: Push-to-talk over Cellular (walkie-talkie like service over cellular networks)

VAD: Voice Activity Detection (an encoder feature that detects when the user is speaking)

VMR-WB: Variable Rate Multi-mode Wideband (standard wideband speech codec for CDMA2000®)

WCDMA: Wideband CDMA (Technology adopted by GSM networks for their evolution to 3G)

wMOPS: weighted Million Operations Per Second (measure of codec complexity)
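As an illustration of the VAD entry in the glossary, the sketch below flags frames whose average energy exceeds a fixed threshold. It is a deliberately naive example and not the detector specified for AMR-WB or VMR-WB, which are considerably more elaborate; the function names, frame length and threshold are all assumptions made for this example.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def simple_vad(samples, frame_len, threshold):
    """Return one boolean per complete frame: True means 'speech present'.

    Trailing samples that do not fill a whole frame are ignored.
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_energy(frame) > threshold)
    return flags


if __name__ == "__main__":
    silence = [0.0] * 4
    speech = [1.0, -1.0, 1.0, -1.0]
    hiss = [0.01] * 4
    print(simple_vad(silence + speech + hiss, frame_len=4, threshold=0.1))
```

In a codec, frames flagged as inactive are the ones for which the far-end decoder would fall back on comfort noise generation (CNG).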