Overview of CSSML Yan Jun, Department Manager Anhui USTC iFLYTEK Co., Ltd University of Science & Tech of China

Overview of CSSML

Yan Jun, Department Manager, Anhui USTC iFLYTEK Co., Ltd

University of Science & Tech of China

Presentation Outline

• Motivation and solutions

• Standardization

• Application

CSSML

• Chinese Speech Synthesis Markup Language

• CSSML is an extension of SSML for Chinese

• Objective
– To meet Chinese speech synthesis requirements
– To provide more flexible and convenient methods to adjust parameters and optimize the speech synthesis effect

Motivation

• Special problems of Chinese speech synthesis
– Pronunciation of Chinese characters
– Handling of words composed of English letters
– Segmentation of Chinese words

• Requirements of the Chinese speech market
– Using background music

Pronunciation of Chinese characters

• Syllables: each Chinese character is pronounced as one syllable

• Chinese characters have one of four tones, or no tone to express unstressed syllables

• Chinese Romanization (PinYin) is widely used in China as a formal notation of Chinese character pronunciation.

广 ɡuǎnɡ guang3

光 ɡuānɡ guang1
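The two notations on this slide (tone mark versus trailing tone digit) can be converted mechanically. The following is a minimal sketch, not part of CSSML: the function name `numbered_to_marked` is hypothetical, and the placement rule used here (mark "a" or "e" if present, the "o" of "ou", otherwise the last vowel) is the usual PinYin convention, assumed rather than taken from the slides.

```python
# Tone-marked forms of each vowel, indexed by tone number 1..4.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def numbered_to_marked(syllable: str) -> str:
    """Convert one numbered PinYin syllable (e.g. 'guang3') to tone-mark form."""
    syllable = syllable.replace("v", "ü")  # 'v' is a common keyboard stand-in for 'ü'
    if not syllable or not syllable[-1].isdigit():
        return syllable  # no tone digit: leave unchanged
    base, tone = syllable[:-1], int(syllable[-1])
    if tone not in (1, 2, 3, 4):
        return base  # any other digit is treated as the neutral tone
    # Pick the vowel that carries the mark.
    if "a" in base:
        idx = base.index("a")
    elif "e" in base:
        idx = base.index("e")
    elif "ou" in base:
        idx = base.index("o")
    else:
        vowels = [i for i, ch in enumerate(base) if ch in TONE_MARKS]
        if not vowels:
            return base
        idx = vowels[-1]
    return base[:idx] + TONE_MARKS[base[idx]][tone - 1] + base[idx + 1:]


print(numbered_to_marked("guang3"))  # guǎng
print(numbered_to_marked("guang1"))  # guāng
```

The numbered form is the one that appears in the ph attribute values elsewhere in this deck, which is why the conversion runs from digits to marks.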

Words composed of English letters

• Words composed of English letters
– English words: James, New York
– PinYin words: Anhui, Hefei, Jiang Zemin

• PinYin words pronounced as if they were English words
– Do not follow Chinese pronunciation conventions
– Are difficult to understand

phoneme

• Attributes supported by the phoneme element are extended
– The alphabet attribute can take 'py', and the ph attribute can then be PinYin notation
– A new lang attribute is added to indicate the language or dialect of the content

他姓 <phoneme alphabet="py" ph="zeng1">曾</phoneme>

国家主席 <phoneme lang="cn">Jiang Zemin</phoneme>
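Markup like the two samples above can be generated programmatically. Here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element and attribute names (`phoneme`, `alphabet`, `ph`, `lang`) come from the slide; the helper name `phoneme` and the enclosing `speak` root (borrowed from SSML) are assumptions, since the slides do not show a complete CSSML document.

```python
import xml.etree.ElementTree as ET


def phoneme(text, **attrs):
    """Build a <phoneme> element with the given attributes (hypothetical helper)."""
    el = ET.Element("phoneme", attrs)
    el.text = text
    return el


# Assumption: a <speak> root as in SSML; the slides do not show the root element.
speak = ET.Element("speak")
speak.text = "他姓"
# Force the reading zeng1 for the polyphonic surname character.
speak.append(phoneme("曾", alphabet="py", ph="zeng1"))

markup = ET.tostring(speak, encoding="unicode")
print(markup)

# A PinYin name that should be read as Chinese rather than as English.
name = phoneme("Jiang Zemin", lang="cn")
print(ET.tostring(name, encoding="unicode"))
```

Building the tree rather than concatenating strings keeps attribute values escaped and the tags balanced.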

Segmentation of Chinese words

• Basic grammatical unit of Chinese: Chinese character

• No spaces or punctuation to separate words

• Thus, one sentence may have several plausible word segmentations

南京市长江大桥

南京市ˇ长江大桥 The Bridge of the Yangtse River in Nanking city

南京市长ˇ江大桥 Jiang Daqiao, the mayor of Nanking city
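The ambiguity above can be reproduced with a toy dictionary-based segmenter. This is an illustrative sketch only: the function name `segmentations` and the four-entry lexicon are hypothetical, chosen so that exactly the two readings on the slide appear. It does not describe how any iFLYTEK product segments text.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into consecutive words from `lexicon`."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            for rest in segmentations(text[end:], lexicon):
                results.append([head] + rest)
    return results


# Hypothetical toy lexicon covering only the words in the slide's example.
LEXICON = {"南京市", "长江大桥", "南京市长", "江大桥"}

for words in segmentations("南京市长江大桥", LEXICON):
    print("ˇ".join(words))
# 南京市ˇ长江大桥
# 南京市长ˇ江大桥
```

Both splits are valid against the lexicon, so a segmenter needs context, or explicit markup from the author, to choose between them.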

Segmentation of Chinese words

• Different word segmentations
– Greatly affect the meaning of the sentence
– May change the pronunciation of Chinese characters (polyphonic characters)
– Thus degrade or even ruin the effect of speech synthesis

南京市ˇ长江大桥 nan2 jing1 shi4 chang2 jiang1 da4 qiao2

南京市长ˇ江大桥 nan2 jing1 shi4 zhang3 jiang1 da4 qiao2
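The dependence of pronunciation on segmentation can be sketched as a per-word lookup: once the word boundaries are fixed, the reading of 长 follows from the word it belongs to. The table `WORD_PINYIN` and the function `transcribe` are hypothetical names, and the entries contain only the readings shown on the slide.

```python
# Hypothetical word-to-PinYin table holding only the slide's example words.
WORD_PINYIN = {
    "南京市": "nan2 jing1 shi4",
    "长江大桥": "chang2 jiang1 da4 qiao2",
    "南京市长": "nan2 jing1 shi4 zhang3",
    "江大桥": "jiang1 da4 qiao2",
}


def transcribe(words):
    """Join the PinYin of each already-segmented word."""
    return " ".join(WORD_PINYIN[w] for w in words)


print(transcribe(["南京市", "长江大桥"]))
# nan2 jing1 shi4 chang2 jiang1 da4 qiao2
print(transcribe(["南京市长", "江大桥"]))
# nan2 jing1 shi4 zhang3 jiang1 da4 qiao2
```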

word and phrase

• The word element is used to define the boundary between Chinese words

• The phrase element defines the boundary between phrases at different levels

<word>南京市</word><word>长江大桥</word>

<phrase><word>我们的</word><word>最高目标</word></phrase>

<phrase>是</phrase>

<phrase>得到高自然的语音</phrase>
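Given text that is already segmented, the word and phrase markup can be emitted mechanically. The sketch below uses the standard `xml.etree.ElementTree`. The element names `word` and `phrase` come from the slide, while the helper names `phrase` and `to_markup` are hypothetical. One simplification to note: this sketch wraps every word in a word element, whereas the slide also shows phrase elements that contain bare text.

```python
import xml.etree.ElementTree as ET


def phrase(words):
    """Wrap a list of words in <phrase>, each word in its own <word> element."""
    el = ET.Element("phrase")
    for w in words:
        ET.SubElement(el, "word").text = w
    return el


def to_markup(phrases):
    """Serialize a list of phrases (each a list of words) to a markup string."""
    return "".join(ET.tostring(phrase(ws), encoding="unicode") for ws in phrases)


print(to_markup([["我们的", "最高目标"]]))
# <phrase><word>我们的</word><word>最高目标</word></phrase>
```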

Using background music

• Synthesized speech can be played together with background music

• To improve the user experience

• Background music may be added in a given position

• Background sound may be switched during the synthesis process

environment

• The environment element is introduced to describe the sound environment (background sound) in which the speech is synthesized
– src attribute
– repeat attribute

<environment repeat="yes" src="1.wav">

有三千余年建城史的北京,经过改革开放的洗礼,将以崭新的、多姿多彩的面貌进入新世纪,她将以饱满的热情欢迎全世界的体育健儿和各界朋友,共同参与奥运盛会。

</environment>
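On the consuming side, a player would need to read the background-sound settings from the environment element before mixing. The following is a minimal sketch with the standard `xml.etree.ElementTree`. The function name `read_environment` is hypothetical, and treating repeat="yes" as "loop the sound" is an assumption, since the slides name the attribute but do not define its values.

```python
import xml.etree.ElementTree as ET


def read_environment(fragment):
    """Extract background-sound settings and text from an <environment> fragment."""
    el = ET.fromstring(fragment)
    if el.tag != "environment":
        raise ValueError("expected an <environment> element")
    return {
        "src": el.get("src"),
        # Assumption: "yes" means the background sound loops.
        "repeat": el.get("repeat") == "yes",
        "text": (el.text or "").strip(),
    }


sample = '<environment repeat="yes" src="1.wav">有三千余年建城史的北京</environment>'
settings = read_environment(sample)
print(settings["src"], settings["repeat"], settings["text"])
```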

CSSML: enterprise standard

• In 2002, iFLYTEK established CSSML as an enterprise standard to define the markup language used in speech synthesis products

• Since 2003, the standard has been supported by iFLYTEK's InterPhonic product series

CSSML: candidate national standard

• Human-machine speech interaction standard workgroup of China's Ministry of Information Industry

• CSSML was proposed to the workgroup in 2003 and was widely debated

• CSSML was approved by a workgroup vote on Oct 24, 2005, and will be submitted to the Ministry of Information Industry as a candidate national standard

Application

• Speech synthesis products that support CSSML are widely used in telecom, banking, insurance, securities, education, and other sectors.
– telecom: 168 and 114 information inquiry services
– securities: stock commentary, company introductions
– enterprise: customer telephone service
– education: teaching the pronunciation of Chinese characters and words

Questions?

Thank you and goodbye!