

FRAUNHOFER INSTITUTE FOR DIGITAL MEDIA TECHNOLOGY IDMT

SIP-TOOLBOX – MODELING SPEECH INTELLIGIBILITY

Speech plays a central role in human communication. The intelligibility of a speech signal depends strongly on the acoustic conditions (e.g. background noise or reverberation), on the capacity of the listener's hearing system, and on the transmission channel over which the speech is conveyed (such as a telecommunication system). Several factors influence the perceived speech quality, in particular intelligibility, loudness and the overall sound quality.

A number of applications require a reliable prediction of speech quality. For example, developers of public address systems and designers of office areas benefit from models that predict speech intelligibility, since such models can replace costly and time-consuming subjective measurements. Software engineers also use these models to compare different algorithms or parameter settings. Our SIP-Toolbox (Speech Intelligibility Prediction Toolbox) has been developed as a solution for such applications.

1 Many applications, e.g. the development of public address systems, require reliable models for the prediction of speech intelligibility.

Fraunhofer Institute for Digital Media Technology IDMT
Ehrenbergstr. 31
98693 Ilmenau

Project Group Hearing, Speech and Audio Technology
Branch Lab Oldenburg
Haus des Hörens
Marie-Curie-Straße 2
26129 Oldenburg

Contact
Jan Rennies
Phone +49 441 2172-433
[email protected]
www.idmt.fraunhofer.de

Evaluation of Speech Quality

Several models are currently available for predicting speech intelligibility, loudness or quality. For a given acoustic situation, however, these models may yield different predictions or may not be applicable at all. It is therefore up to the user to decide which model to use in the desired application context.

photo acknowledgement: istockphoto.com


General view of the Toolbox


2 Speech intelligibility depends very much on specific situations. The SIP-Toolbox provides quick and easy modeling of speech intelligibility for different situations.

SIP-Toolbox

The SIP-Toolbox, developed by the project group Hearing, Speech and Audio Technology at Fraunhofer IDMT, offers quick and easy prediction of the main factors affecting speech quality in different situations. It makes it easy to compare the different models and thereby helps users select the most suitable models for their application. The SIP-Toolbox is implemented in MATLAB® and offers versatile utilities for importing, processing and representing data. For users without access to MATLAB®, we also offer a stand-alone version of the SIP-Toolbox.

Technical Features

Speech intelligibility models

The SIP-Toolbox contains standardized models of speech intelligibility, e.g. the Articulation Index (AI), the Speech Intelligibility Index (SII) and the Speech Transmission Index (STI). In addition, extended models are integrated that can also predict the influence of hearing impairment and binaural hearing.

Technical and perceptual models

Further models included in the SIP-Toolbox range from simple energy-based measures (e.g. signal-to-noise ratios, reverberation time) to complex speech quality measures based on auditory models (e.g. PEMO-Q). In addition, loudness (e.g. according to DIN 45631/A1) and other psychoacoustic quantities can be calculated.
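To give an impression of what a simple energy-based measure looks like, the following minimal Python sketch computes a broadband signal-to-noise ratio from the RMS amplitudes of a speech and a noise signal, and rescales the noise to reach a desired SNR. It is an illustration only and not the SIP-Toolbox implementation; the function names `rms`, `snr_db` and `scale_noise_to_snr` and the toy signals are assumptions made for this example.

```python
import math


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def snr_db(speech, noise):
    """Broadband SNR in dB, computed as 20*log10(rms(speech) / rms(noise))."""
    return 20.0 * math.log10(rms(speech) / rms(noise))


def scale_noise_to_snr(speech, noise, target_snr_db):
    """Return a scaled copy of `noise` so that the SNR equals `target_snr_db`."""
    gain = rms(speech) / (rms(noise) * 10.0 ** (target_snr_db / 20.0))
    return [gain * x for x in noise]


# Toy signals (hypothetical stand-ins for recorded speech and noise).
speech = [math.sin(2.0 * math.pi * 5.0 * n / 200.0) for n in range(200)]
noise = [0.3 * math.cos(2.0 * math.pi * 37.0 * n / 200.0) for n in range(200)]

# Rescale the noise so that the mixture has an SNR of 5 dB.
noise_5db = scale_noise_to_snr(speech, noise, 5.0)
print(round(snr_db(speech, noise_5db), 6))
```

A broadband SNR of this kind ignores the spectral and temporal structure of speech and noise, which is one reason why more elaborate intelligibility models exist alongside it.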

Graphical user interface

For arbitrary speech and noise signals, all models can be compared quickly in a concise graphical interface. The user can easily control all important parameters, such as level or signal-to-noise ratio. A batch-processing option allows large amounts of data to be processed conveniently.

Representation of signals

The SIP-Toolbox offers a visual and acoustical representation of the signals, providing, for example, a direct impression of the effect of reverberation. Arbitrary acoustic situations can be integrated into the SIP-Toolbox by means of their impulse responses, which allows the influence of a transmission system or of reverberation on speech quality to be simulated.
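The general idea of representing an acoustic situation by its impulse response can be sketched as a convolution: the signal at the listener is the dry signal convolved with the impulse response of the room or transmission system. The Python example below is a minimal illustration under that assumption and does not reflect how the SIP-Toolbox is implemented; the function name `convolve` and the three-sample impulse response are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Hypothetical impulse response: direct sound plus one echo
# arriving two samples later at half the amplitude.
impulse_response = [1.0, 0.0, 0.5]
dry_signal = [1.0, 2.0, 3.0]

wet_signal = convolve(dry_signal, impulse_response)
print(wet_signal)  # [1.0, 2.0, 3.5, 1.0, 1.5]
```

Measured room impulse responses are far longer than this toy example, so practical implementations typically use FFT-based convolution instead of the direct double loop shown here.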

Customization

Current research of the Fraunhofer project group Hearing, Speech and Audio Technology focuses on the development of improved and more generally applicable speech intelligibility models. These models are developed to account for the effects of binaural hearing, temporal fluctuations and hearing impairment, which current standard models cover insufficiently or not at all. The new methods can be integrated into the SIP-Toolbox in a modular way.

Furthermore, we offer to evaluate your own signals objectively using the models of the SIP-Toolbox. If desired, we can also perform subjective listening tests based on your custom conditions with normal-hearing and hearing-impaired listeners in our specially equipped measurement facilities.

Demo-Version

If you are interested in our SIP-Toolbox, please contact us by e-mail or phone. We will provide you with a free demo version of the software. For further information and pictures of the SIP-Toolbox, please visit our website: http://www.idmt.fraunhofer.de/de/research_topics/sip_toolbox_eng.html


Speech Intelligibility