
Enhancing Conversational Speech Quality of VoIP in a Wired/Wireless Environment


Batu Sat and Benjamin W. Wah

Illinois Center for Wireless Systems

Background

Proposed Solutions

Results

VoIP
• Providing interactive speech among multiple users
• Utilizing public and private wired/wireless IP networks
• Independent of locations of users and devices used

IP Networks
• Long-haul, WAN, LAN wired/wireless networks
• Non-stationary real-time packet arrivals and losses
• Large disparity in delay and loss behavior among clients
• Complex QoS with multiple IP providers and without cost model
• Quality measured and maintained at end-points
• Better scalability with end-to-end strategies

Conversational Dynamics
• Different network delays among clients
• Multiple realities in VoIP in contrast to face-to-face conversation
• Perception of delays and efficiency affected by conversational switching (turn-taking) frequency
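
The "multiple realities" effect can be illustrated with a small sketch. It assumes a simple two-party model with a fixed MED in each direction; the function and parameter names are illustrative and do not come from the poster. Under that assumption, the silence A hears before B's reply arrives is B's thinking time plus both one-way delays, whereas B experiences only the thinking time.

```python
def perceived_silences(think_time_b, med_ab, med_ba):
    """Return (silence heard by A, silence experienced by B) in seconds.

    Assumed model: A stops speaking at t = 0; B hears the end of A's
    speech at med_ab, thinks for think_time_b, then replies; A hears
    the start of the reply med_ba later.
    """
    silence_at_a = med_ab + think_time_b + med_ba
    silence_at_b = think_time_b
    return silence_at_a, silence_at_b


# Hypothetical values: 0.5 s thinking time, 0.2 s and 0.3 s one-way delays.
a, b = perceived_silences(think_time_b=0.5, med_ab=0.2, med_ba=0.3)
print(f"A hears {a:.1f} s of silence, B experiences {b:.1f} s")
```

With these made-up numbers the two parties perceive different gaps for the same turn switch, which is the asymmetry that face-to-face conversation does not have.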

Conversational Dynamics & Quality

Goals

Challenges

Design of VoIP End Clients
• Achieving high and consistent perceptual conversational quality
• Enabling natural and efficient conversation among users
• Real-time adaptation to changing network delay & loss conditions
• Suitable for any communication device using any IP network

Quality Metrics
• No objective metrics for quantifying conversational speech quality
• Costly non-repeatable subjective tests with full implementation

Design of Play-out Scheduling and Loss Concealment
• Under dynamic packet delays and losses
• Trade-offs among mouth-to-ear delay (MED), redundancy, and amount of packets not received in time for play-out (UCFLR)
• Difficulty under dynamic delay spikes and bursty losses
• With longer MED
  • Improved one-way speech quality
  • Degraded symmetry and efficiency of interactive conversation
• Trade-off between minimizing pair-wise MED and maintaining a balance among MEDs perceived by users in a conversation
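
The trade-off between delay and packets missing at play-out time can be sketched as follows. This is a simplified illustration, not the scheme on the poster: it uses a fixed play-out delay as a stand-in for MED, ignores redundancy and loss concealment, and runs on a made-up trace. The function name and trace values are assumptions.

```python
def late_loss_rate(network_delays_ms, playout_delay_ms):
    """Fraction of packets unavailable at a fixed play-out delay.

    network_delays_ms holds the one-way delay of each packet in ms,
    or None if the packet was lost in the network.
    """
    if not network_delays_ms:
        raise ValueError("empty trace")
    missed = sum(
        1 for d in network_delays_ms
        if d is None or d > playout_delay_ms
    )
    return missed / len(network_delays_ms)


# Hypothetical trace with a delay spike and one network loss.
trace = [60, 62, 65, 61, 180, 110, 64, None, 63, 66]
for playout in (70, 120, 200):
    print(playout, late_loss_rate(trace, playout))
```

A larger play-out delay absorbs more of the delay spike and so leaves fewer packets missing, at the cost of a longer delay in every turn of the conversation.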

Conversational Speech Quality
• Multiple dimensions in user perception of quality
  • Quality of one-way speech segments
  • Naturalness and rhythm of conversation, mutual-silence durations

Trade-Offs

Collection of Traces on Delays and Losses
• Using Planet-Lab nodes for collecting end-to-end traces
• With packet periods and payloads typical of VoIP applications
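
As a rough illustration of how a collected trace might be summarized, the sketch below computes a loss rate and a simple delay-variation figure from send and receive timestamps. The trace format, the function name, and the choice of mean absolute difference between consecutive delays as the jitter measure are all assumptions; the poster does not say how jitter was quantified.

```python
def trace_stats(send_ms, recv_ms):
    """Return (loss rate, mean delay variation in ms) for one trace.

    send_ms[i] is the send time of packet i; recv_ms[i] is its
    arrival time, or None if it never arrived. Delay variation is
    taken between consecutive received packets, so a lost packet
    makes its two neighbours adjacent.
    """
    delays = [r - s for s, r in zip(send_ms, recv_ms) if r is not None]
    lost = len(send_ms) - len(delays)
    loss_rate = lost / len(send_ms)
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    jitter = sum(diffs) / len(diffs) if diffs else 0.0
    return loss_rate, jitter


# Hypothetical trace: one packet every 20 ms, third packet lost.
send = [0, 20, 40, 60, 80]
recv = [50, 72, None, 115, 131]
print(trace_stats(send, recv))
```

Because only differences between delays enter the jitter figure, a constant clock offset between sender and receiver does not affect it.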

Modeling of Two-Party and Multi-Party Conversations
• Utilizing human psychological models when possible
• Subjective tests to obtain parameters for simulating dynamics

Evaluation of Conversational Speech Quality (CSQ)
• Identification of human-observable and system-measurable metrics
• Modeling CSQ as a function of these metrics
• Designing human subjective tests

Designing Play-out Scheduling/Loss Concealment Schemes
• Trade-offs on system-measurable and human-observable metrics
• Schemes for real-time collection and relay of network statistics
• Schemes for real-time adaptive play-out scheduling (POS) and loss concealment (LCS)

[Figure: Conversation timelines. Face-to-face setting (A & B's common perspective): turns A, B, A, B on a single time axis. VoIP setting: separate time axes for A's perspective and B's perspective, showing turns A, A', B, B' with the delays MED(AB) and MED(BA) marked. Legend: A speaks, B speaks, A thinks, B thinks.]

[Figure: Network diagram with labels Public Internet, Wireless, and Private IP Network.]

Trace #   Jitter   Loss
2         Low      1.7%
5         -        17%
9         High     0.1%
10        -        33%

None of the previous algorithms provides consistent balance between one-way speech quality and conversational interactivity

Our scheme
• Hugging delay curve closely
• Minimizing delay degradations
• Providing good one-way quality
• Maximizing human quality perception