Toshiba (China) R&D Center
LOU Xiaoyan, LI Jian
Research and Development Center, Toshiba China
Suggestions on Tone and Word Boundary of Mandarin for SSML

Outline

Tone
Word boundary

Tone (cont…)

Importance
Tone is as important as phonemes in a tonal language.
The same syllable with different tones takes different meanings: 妈 (mā), 麻 (má), 马 (mǎ), 骂 (mà)
Sandhi phenomenon in a tonal language: 你好 ni3 hao3 → ni2 hao3
Synthesis with the correct tone helps the listener catch the meaning of speech.

Non-markup behavior
Tone can be obtained by looking up a dictionary or applying rules.
Errors may occur, especially in dealing with sandhi.
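The rule-based behavior mentioned above can be illustrated with a minimal sketch of third-tone sandhi. It assumes the simplest form of the rule (a tone-3 syllable directly followed by another tone-3 syllable becomes tone 2) and ignores word and phrase structure, which is exactly where rule-based systems tend to make errors. The function name and the (pinyin, tone) data layout are illustrative, not part of the proposal.

```python
def apply_third_tone_sandhi(syllables):
    """Return a copy of `syllables` with simplified third-tone sandhi applied.

    `syllables` is a list of (pinyin, tone) pairs with tones 1..5.
    Simplifying assumption: every tone-3 syllable that is directly followed
    by an (original) tone-3 syllable becomes tone 2. Real systems also need
    word and phrase structure to handle longer tone-3 runs correctly.
    """
    result = []
    for i, (pinyin, tone) in enumerate(syllables):
        next_tone = syllables[i + 1][1] if i + 1 < len(syllables) else None
        if tone == 3 and next_tone == 3:
            tone = 2
        result.append((pinyin, tone))
    return result


# 你好: ni3 hao3 is pronounced ni2 hao3
print(apply_third_tone_sandhi([("ni", 3), ("hao", 3)]))
```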

Suggestion on Tone (cont…)

Our suggestions
Use a Pinyin sequence as the value of the phoneme element.
Use the numbers 1, 2, 3, 4 and 5 to stand for the tones “yin ping”, “yang ping”, “shang sheng”, “qu sheng” and the neutral tone in Mandarin:

Text: 大都 (dàdōu)
Pinyin sequence + tone: /da 4/dou 1/
Solution 1: a new tone element (optional), with a required attribute detail:
<tone detail="4 1">大都</tone>
Solution 2: new values “t” and “pt” for the alphabet attribute of the phoneme element:
<phoneme alphabet="t" ph="4 1">大都</phoneme>
<phoneme alphabet="pt" ph="da 4/dou 1">大都</phoneme>
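As a rough illustration of how the two proposed notations relate, the following sketch parses a Pinyin-plus-tone string such as /da 4/dou 1/ and emits the markup for Solution 1 and for the "pt" variant of Solution 2. The helper names are hypothetical, and the exact string syntax (slash-separated chunks, one space between Pinyin and tone number) is an assumption read off the slide's example rather than a defined grammar.

```python
def parse_pt(ph):
    """Split a string like '/da 4/dou 1/' into [('da', 4), ('dou', 1)].

    Assumed syntax: chunks separated by '/', each chunk 'pinyin tone'.
    """
    pairs = []
    for chunk in ph.strip("/").split("/"):
        pinyin, tone = chunk.split()
        pairs.append((pinyin, int(tone)))
    return pairs


def tone_markup(text, pairs):
    """Solution 1: proposed <tone> element carrying only the tone numbers."""
    detail = " ".join(str(tone) for _, tone in pairs)
    return f'<tone detail="{detail}">{text}</tone>'


def phoneme_pt_markup(text, pairs):
    """Solution 2: <phoneme> with the proposed alphabet value "pt"."""
    ph = "/".join(f"{pinyin} {tone}" for pinyin, tone in pairs)
    return f'<phoneme alphabet="pt" ph="{ph}">{text}</phoneme>'


pairs = parse_pt("/da 4/dou 1/")
print(tone_markup("大都", pairs))
print(phoneme_pt_markup("大都", pairs))
```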

Note on Tone Markup

Possible influence on SSML 1.0
Solution 1: the tone element cannot be followed by other elements, and can be enclosed by the p, s and w (if defined) elements.
Solution 2: the phoneme element is modified; its relation to other elements should not change.

The tone strings given by markup cannot be changed in the text normalization step or by the result of looking up the lexicon.

Tone markup should be neglected in case of:
A value error in the tone
An unmatched length of the tone sequence
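The two "neglect" conditions above could be checked along the following lines. This is a sketch under assumptions: the tone sequence is taken to need exactly one tone per character of the enclosed text, the length is measured with Python's len() on a string containing only Chinese characters, and returning None stands in for "fall back to non-markup behavior". The function name is hypothetical.

```python
VALID_TONES = {"1", "2", "3", "4", "5"}


def tones_or_none(text, detail):
    """Return the tone list for `text`, or None if the markup should be neglected.

    Neglect conditions (from the slide):
      - value error: a tone outside 1..5
      - unmatched length: tone count differs from the character count
    Assumption: one tone per character of `text`.
    """
    tones = detail.split()
    if any(tone not in VALID_TONES for tone in tones):
        return None  # value error of tone
    if len(tones) != len(text):
        return None  # unmatched length of tone sequence
    return [int(tone) for tone in tones]


print(tones_or_none("大都", "4 1"))
print(tones_or_none("大都", "4 6"))
print(tones_or_none("大都", "4"))
```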

Outline

Tone
Word boundary

Word Boundary (cont…)

The word is the basic unit for sentence parsing and understanding.

Chinese sentences are composed of sequences of Chinese characters, without blanks or spaces to mark word boundaries.

Difficulties:
Complex words, such as reduplications and derived words, e.g. “简简单单” (very easily), “非物质” (immateriality)
Proper nouns, such as location names and person names
Ambiguous word segmentations:
A: 上海 是 个 大都会。 (Shanghai is a metropolis.)
B: 上海人 大都 会 那么 说。 (Most Shanghainese will say that.)

Non-markup behavior
Determine the boundary using language-specific knowledge.
Errors may occur.
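To make the ambiguity concrete, here is a small sketch of forward maximum matching, one common dictionary-based segmentation heuristic. It is not claimed to be the method the authors have in mind; it is swapped in only to show how a plausible automatic segmenter gets sentence A right and sentence B wrong. The toy lexicon is an assumption made up for this example.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation: always take the longest lexicon word.

    Single characters are accepted as a fallback when nothing longer matches.
    """
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Toy lexicon (assumption for illustration only).
LEXICON = {"上海", "上海人", "大都", "大都会", "都会", "那么"}

# Sentence A: matches the intended segmentation 上海 / 是 / 个 / 大都会
print(forward_max_match("上海是个大都会", LEXICON))

# Sentence B: greedy matching picks 大都会, but the intended words are 大都 / 会
print(forward_max_match("上海人大都会那么说", LEXICON))
```

In sentence B the greedy choice of 大都会 is the kind of error that explicit boundary markup is meant to prevent.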

Suggestions on Word Boundary (cont…)

A new element w is suggested:
<w>都会</w>

An optional attribute detail is also recommended to mark phrases:

<w detail="3 2 1">上海人大都会</w>

Here, the phrase is split into three words, whose lengths in Chinese characters are 3, 2 and 1.
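Going the other way, a segmenter's output can be turned into the proposed markup by writing each word's character count into detail. The sketch below assumes the words contain only Chinese characters, so that Python's len() equals the character count the slide refers to; the function name is hypothetical.

```python
def w_markup(words):
    """Build the proposed <w detail="..."> element from a list of words."""
    detail = " ".join(str(len(word)) for word in words)
    text = "".join(words)
    return f'<w detail="{detail}">{text}</w>'


print(w_markup(["上海人", "大都", "会"]))
```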

Suggestion on Word Boundary (cont…)

Legal values of the optional attribute detail
Each value is not bigger than the length of the contained text:
<w detail="3"><w detail="3">上海上海</w></w>
The default value is the length of the contained text:
<w>上海</w>
When the sum of the values is smaller than the length of the contained text, the remaining part is regarded as a word:

<w detail="3">上海人大都会</w>

The first 3 Chinese characters “上海人” are regarded as one word and the remaining “大都会” is regarded as another word.

When the sum of the values is bigger than the length of the contained text, this markup should be neglected.
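The rules for detail can be summarized in a short sketch. It assumes that the values are positive integers separated by spaces and that the contained text consists only of Chinese characters; returning None stands in for "this markup should be neglected". The function name is hypothetical.

```python
def split_by_detail(text, detail=None):
    """Split `text` into words according to the proposed detail attribute.

    - no detail: the whole text is one word (default value = text length)
    - sum of values < len(text): the remaining part forms one more word
    - sum of values > len(text): return None (markup is neglected)
    Assumption: values are positive integers.
    """
    if detail is None:
        return [text]
    lengths = [int(value) for value in detail.split()]
    if sum(lengths) > len(text):
        return None
    words = []
    pos = 0
    for length in lengths:
        words.append(text[pos:pos + length])
        pos += length
    if pos < len(text):
        words.append(text[pos:])  # remaining part is regarded as a word
    return words


print(split_by_detail("上海人大都会", "3 2 1"))
print(split_by_detail("上海人大都会", "3"))
print(split_by_detail("上海"))
print(split_by_detail("上海", "3"))
```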

Possible Influence on SSML 1.0

Influence on speech synthesis steps
Word segmentation is suggested to be done before the text parsing and structure analysis steps.

Relation between SSML 1.0 markups and the word segmentation markup w (needs more discussion)
The p and s elements can be followed by the w element; the w element can be followed by audio, emphasis, phoneme, prosody, say-as, sub, voice and t (if defined):
<p><w detail="2">上海</w></p>
<w detail="2"><prosody rate="-10%">上海</prosody></w>大都会
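If "can be followed by" is read as "can contain" (an assumption, suggested by the nesting in the examples above but not stated on the slide), the proposed relation could be checked with a small sketch like the one below. It uses Python's standard xml.etree.ElementTree; the allowed-children set copies the slide's list, including t as written, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Elements the slide lists as allowed after (here read as: inside) w.
ALLOWED_IN_W = {"audio", "emphasis", "phoneme", "prosody",
                "say-as", "sub", "voice", "t"}


def disallowed_children_of_w(fragment):
    """Return the tags of direct children of any <w> that are not in the list."""
    root = ET.fromstring(fragment)
    bad = []
    for w in root.iter("w"):
        for child in w:
            if child.tag not in ALLOWED_IN_W:
                bad.append(child.tag)
    return bad


ok = '<p><w detail="2"><prosody rate="-10%">上海</prosody></w>大都会</p>'
not_ok = '<p><w><p>上海</p></w></p>'
print(disallowed_children_of_w(ok))
print(disallowed_children_of_w(not_ok))
```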

Thank you!