23/04/10 MASTAR Project, Universal Communication Research Institute, E-mail: [email protected]
Chiori Hori, Ph.D.
Spoken Language Communication Laboratory
National Institute of Information and Communications Technology (NICT)
Telecommunications Relay Services in Speech-to-Speech translation system in accordance with Recommendations F.745 and H.625
ITU-T Workshop on "Telecommunications relay services for persons with disabilities" (Geneva, 25 November 2011)
Telecommunications Relay Services in Speech-to-Speech translation in accordance with ITU-T Recommendations F.745 and H.625
Speech-to-Speech Translation
Speech-to-Speech Translation (S2ST) technologies are an effective means of breaking through language barriers between people who do not speak the same language.
Communication across more languages can be realized with S2ST technology by connecting distributed S2ST servers (i.e., ASR, MT, and TTS servers) all over the world.
[Figure: S2ST pipeline from Japanese speech 「私は学校に行く」 to English speech "I go to school"]
Automatic Speech Recognition (ASR): converts from phoneme (w a t a sh i w a g a xtu k o o n i…..) to word (私は学校に行く); trained on Japanese speech and text corpora.
Machine Translation (MT): converts from Japanese text to English text (I go to school); trained on Japanese-to-English parallel corpora.
Speech Synthesis (TTS): converts from text to waveform; trained on English speech corpora.
All three components use a large amount of training data for machine learning.
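The three-stage flow on this slide can be sketched in a few lines of Python. This is only a toy illustration of how the stages chain together: the lookup tables stand in for trained models, and every function and table name (`recognize`, `translate`, `synthesize`, `ASR_LEXICON`, `MT_TABLE`) is an assumption made for this example, not part of any S2ST system described in the slides.

```python
from typing import Dict

# Hypothetical lookup tables standing in for trained ASR and MT models.
# The phoneme key is the truncated fragment shown on the slide, used as-is.
ASR_LEXICON: Dict[str, str] = {
    "w a t a sh i w a g a xtu k o o n i": "私は学校に行く",
}
MT_TABLE: Dict[str, str] = {
    "私は学校に行く": "I go to school",
}


def recognize(phonemes: str) -> str:
    """ASR stage: convert a phoneme sequence to Japanese text."""
    return ASR_LEXICON[phonemes]


def translate(japanese_text: str) -> str:
    """MT stage: convert Japanese text to English text."""
    return MT_TABLE[japanese_text]


def synthesize(english_text: str) -> bytes:
    """TTS stage: return bytes as a stand-in for a synthesized waveform."""
    return english_text.encode("utf-8")


def speech_to_speech(phonemes: str) -> bytes:
    """Chain the three stages in the order shown on the slide."""
    return synthesize(translate(recognize(phonemes)))


waveform = speech_to_speech("w a t a sh i w a g a xtu k o o n i")
```

A real system would replace each lookup with a statistical model trained on the corpora listed above; only the ASR, MT, TTS ordering is taken from the slide.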
Network-based S2ST systems
Communication over a network between users who speak different languages.
[Figure: two translation paths between the speaker of Language A and the speaker of Language B, each relayed from an MC client through three MC servers]
From the speaker of Language A to the speaker of Language B:
MC client: digitalization of speech signals
ASR server (MC server): conversion from speech signal to text in Language A
MT server (MC server): conversion from text in A to text in B
TTS server (MC server): conversion from text in B to speech signal
From the speaker of Language B to the speaker of Language A:
MC client: digitalization of speech signals
ASR server (MC server): conversion from speech signal to text in Language B
MT server (MC server): conversion from text in B to text in A
TTS server (MC server): conversion from text in A to speech signal
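As a rough sketch of the relay path in the figure, a message can be passed through an ordered chain of servers, each changing its modality or language. The `Message` type, the server functions, and the `relay` helper below are all hypothetical names chosen for illustration; real servers would be reached over the network rather than called in-process, and the payload strings are placeholders for actual speech and text data.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Message:
    modality: str  # "speech" or "text"
    language: str  # "A" or "B", as labelled in the figure
    payload: str   # placeholder for the actual signal or text


Server = Callable[[Message], Message]


def asr_server(msg: Message) -> Message:
    """Speech signal to text in the same language."""
    assert msg.modality == "speech"
    return replace(msg, modality="text", payload=f"text({msg.payload})")


def make_mt_server(target: str) -> Server:
    """Build an MT server that converts text into the target language."""
    def mt_server(msg: Message) -> Message:
        assert msg.modality == "text"
        return replace(
            msg,
            language=target,
            payload=f"{msg.language}->{target}({msg.payload})",
        )
    return mt_server


def tts_server(msg: Message) -> Message:
    """Text to speech signal in the same language."""
    assert msg.modality == "text"
    return replace(msg, modality="speech", payload=f"speech({msg.payload})")


def relay(msg: Message, chain: List[Server]) -> Message:
    """Pass the message through each server in order."""
    for server in chain:
        msg = server(msg)
    return msg


# Path from the speaker of Language A to the speaker of Language B.
utterance = Message(modality="speech", language="A", payload="hello")
result = relay(utterance, [asr_server, make_mt_server("B"), tts_server])
```

The reverse path uses the same chain with `make_mt_server("A")`, which mirrors the symmetry of the two paths in the figure.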
Network-based Speech Translation System in accordance with ITU-T Recommendations F.745 and H.625
Network-based S2ST application via multilateral translation on smartphone/tablet/PC/TV
[Figure: Japanese, Chinese, and English speakers' devices exchanging translated messages]
On-site communication:
Japanese speaker's device: 飲み水は13:00から市役所前で配給します。
English speaker's device: Drinking water will be provided in front of the city hall from 13:00.
Chinese speaker's device: 从下午一点开始,在市政府门前供应饮用水。
Remote communication:
Papa, maman, comment vas-tu? (French: "Dad, Mom, how are you?")
お父さん,お母さんお元気ですか? (the same message in Japanese)
Modality Conversion Markup Language (MCML)
MCML is defined as an XML schema in the ITU-T namespace (http://www.itu.int/xml-namespace/itu-t/H.645/MCML.xsd).
MCML carries the information needed for communication between multiple persons who use different modalities, e.g., speech, text, image, and video data input by users or output by MCML servers such as ASR, MT, TTS, and sign recognition systems.
http://www.itu.int/rec/T-REC-F.745-201010-I/en
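To give a feel for how an XML message can carry modality and language information between clients and servers, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element and attribute names (`ExampleEnvelope`, `ExampleRequest`, `ExampleData`, `service`, `modality`, `language`) are placeholders invented for this example; they are not taken from the MCML schema, which should be consulted for the actual structure.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def build_envelope(service: str, modality: str, language: str, text: str) -> str:
    """Serialize a request as XML. All tag names are hypothetical, not MCML."""
    root = ET.Element("ExampleEnvelope")
    request = ET.SubElement(root, "ExampleRequest", {"service": service})
    data = ET.SubElement(
        request, "ExampleData", {"modality": modality, "language": language}
    )
    data.text = text
    return ET.tostring(root, encoding="unicode")


def read_envelope(xml_text: str) -> Dict[str, Optional[str]]:
    """Parse the XML produced by build_envelope back into a dictionary."""
    root = ET.fromstring(xml_text)
    request = root.find("ExampleRequest")
    data = request.find("ExampleData")
    return {
        "service": request.get("service"),
        "modality": data.get("modality"),
        "language": data.get("language"),
        "text": data.text,
    }


xml_text = build_envelope("MT", "text", "ja", "私は学校に行く")
fields = read_envelope(xml_text)
```

Declaring the modality and language explicitly on each data element is what would let a server such as ASR, MT, or TTS decide whether it can handle the message, which is the role the slide assigns to MCML.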
U-STAR Consortium
The Universal Speech Translation Advanced Research (U-STAR) Consortium has been established as an international research collaboration entity with the goal of developing a worldwide network-based speech-to-speech translation system. The consortium's objective is to create a basic infrastructure for spoken language communication that overcomes the language barriers that exist around the world. Currently, there are participating members from 14 countries (15 institutes).
Plan for field experiment
Period: one year from April 2012, including the 2012 London Olympics
Application: multiparty conversation via a network-based S2ST system on iPhones and Android phones (free)
MCML servers: ASR, MT, and TTS servers will be provided by U-STAR members
Potential languages: Chinese, Dzongkha, English, Filipino, Hindi, Indonesian, Japanese, Korean, Mongolian, Malay, Nepali, Sinhala, Thai, Urdu, Vietnamese, and some European languages
The U-STAR members
Institute   Country       Language
DITT        Bhutan        Dzongkha
UPD         Philippines   Filipino
CDAC        India         Hindi
BPPT        Indonesia     Indonesian
NICT        Japan         Japanese
ETRI        Korea         Korean
MUST        Mongolia      Mongolian
NUM         Mongolia      Mongolian
I2R         Singapore     Malay
LTK         Nepal         Nepali
UCSC        Sri Lanka     Sinhala
NECTEC      Thailand      Thai
KICS-UET    Pakistan      Urdu
IOIT        Vietnam       Vietnamese
CASIA       China         Chinese
Potential European languages: French, German, Italian, Portuguese, Spanish, Turkish, British English