20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

Embed Size (px)

Citation preview

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    1/56

    RealTimeSpeechTranslatorXavier GarciaCabrera

    25/06/2008

    Author XavierGarciaCabrera

    Projectdirector JanRudinsky

    Date 25/06/2008

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    2/56

    Summary1 Overview...............................................................................................................................1

    2 Introduction:ACCProject......................................................................................................3

    2.1 Platformdescription......................................................................................................3

    2.1.1 Software................................................................................................................3

    2.1.2 Hardware...............................................................................................................5

    2.1.3 Architecture...........................................................................................................5

    2.2 ACCTeamProjects........................................................................................................7

    2.2.1 RealTimeVoiceTranslator....................................................................................7

    2.2.2 VXMLIDE...............................................................................................................8

    2.2.3 ACCsecurity...........................................................................................................8

    2.2.4 Others....................................................................................................................9

    3 RealTimeVoiceTranslator..................................................................................................11

    3.1 Objectives....................................................................................................................11

    3.2 Requirements..............................................................................................................11

    3.3 MarketAnalysis...........................................................................................................12

    3.3.1 SkypePersonalInterpreterService.....................................................................12

    3.3.2 GoogletalkbotsforrealtimetranslationsinIM(InstantMessaging)...............13

    3.3.3 NECAutomaticTranslatorformobilephones....................................................13

    3.4 ChoosingaTranslationEngine....................................................................................14

    3.4.1 WebTranslationservices....................................................................................14

    3.4.2 TranslationWebServicesAccuracybenchmark.................................................15

    3.4.3 Conclusion...........................................................................................................26

    3.5 SoftwareTools.............................................................................................................26

    3.5.1 IBMWebSphereApplicationServer....................................................................26

    3.5.2 JavaEE.................................................................................................................26

    3.5.3 ApacheTomcatHTTPServer...............................................................................27

    3.5.4 EclipseIDE...........................................................................................................27

    3.5.5 IBMRationalWebDeveloper&VoiceToolkit....................................................28

    3.6 HowIconfigurethesoftwaretools.............................................................................29

    3.6.1 Tomcat6InstallationandconfigurationonFedoraLinux..................................29

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    3/56

    3.6.2 IBMRWDVoiceLanguageandVoiceEnginesConfiguration..............................32

    3.7 CreatingandrunningadynamicVoiceXMLapplication.............................................33

    3.7.1 CreateanewJavaProjectinEclipseIDE.............................................................34

    3.7.2 AddingJEElibrariesinEclipseIDE.......................................................................34

    3.7.3 CreatethemainprocedureintheServletclass..................................................35

    3.7.4 Copythebinaryfilestotheserver......................................................................35

    3.7.5 Tomcatapplicationconfiguration.......................................................................36

    3.7.6 ConfiguringVEtousetheVXMLApplicationgeneratedbytheservlet..............37

    3.8 Applicationschema.....................................................................................................38

    3.9 TextToSpeechandSpeechRecognition.....................................................................39

    3.10 Achievedgoals.............................................................................................................39

    3.10.1 SetupWebSphereServertorunVoiceXMLApplications....................................40

    3.10.2 UseIBMRationalWebDevelopertoimplementVoiceXMLApplications..........40

    3.10.3 ConfigureandusedifferentlanguageswithinourVoiceEngine........................40

    3.10.4 GeneratedynamicVXMLcode............................................................................41

    3.10.5 Makequeriestowebsitesandprocesstheresponse.........................................41

    3.10.6 MakequeriestoGoogleTranslator.....................................................................42

    3.10.7 GetdynamicdatafromaVoiceXMLApplication................................................43

    4 Conclusions.........................................................................................................................45

    5 APPENDIXA:Definitions.....................................................................................................47

    6 APPENDIXB:Acronyms.......................................................................................................51

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    4/56

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    5/56

    10BOverview

    1 OverviewThisdocumentiswrittentoreportonmyworkontheRealTimeVoiceTranslatorProject,the

    projectIcarriedoutasmyfinalthesisprojectduringtheacademicyear20072008.Duringthis

    period I have been working in the Research and Development Center (RDC) for Mobile

    Applications,adepartmentoftheCzechTechnicalUniversity(CTU)inPrague.IntheRDCIwas

    a member of the Automatic Call Center Project (ACC Project) team, and within it, I was

    assignedtocarryouttheRealTimeVoiceTranslatorProject.

    The Automatic Call Center Project (ACC Project), now renamed to Voice2Web Project, is a

    projectcarriedoutbytheResearchandDevelopmentCenter.TheRDCisadepartmentinside

    theElectroTechnicalFacultyoftheCTUthatcarriesoutResearchandDevelopmentprojects

    regarding the Information Technologies (IT). Some of its partners are IBM, Vodafone and

    Ericson,whotheRDCisdoingprojectsfor.

    TheACCProjectbeganon2007and itsaim istodevelopVoiceApplications,within the IBM

    and RDC agreement, using IBM Voice Technologies and whatever open standards or open

    sourcesoftware. IBM isanACCProjectpartnerandprovidesfinancingfor it. Italsoprovides

    hardwareandsoftwarelicensestotheACCProjectandgivesussupport.Themembersofthe

    ACCProjectaredevelopingseveralVoiceApplicationsatthesametime,allthemfollowingthe

    ACCProjectpurposes.

    Although this document is focused on the Real Time Voice Translator Project, it will also

    explain in the introduction some aspects of the ACC Project. This is because the Real Time

    VoiceTranslatorProjecthasalotofpointsincommonwithitanditisworth,tounderstandit

    well,understandsomepointsoftheACCProjectaswell.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    6/56

    2 RealTimeSpeechTranslator

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    7/56

    31BIntroduction:ACCProject

    2 Introduction:ACCProject2.1PlatformdescriptionTorunVoiceApplicationsandachievethegoalsofourteam,itisneededtohaveaplatformto

    build on it the all the projects we are going to carry on. This platform is composed of

    hardware,softwareandnetworkarchitecturethatworktogethertoachieveitsproposal.

    Thefollowingsectionsdescribethedifferentpartsofourplatform,andexplainhowtheyare

    configuredtoworktogether.

    2.1.1 Software

    To get all the features we need from our platform we use some software installed on our

    servers that provides the services we require. We have been adding the different software

    applicationsduringthedevelopingoftheprojectaswehaveneededthem,notallatthesame

    timeatthebeginningof it.Therefore,thenumberofthesoftwareapplicationsusedmaybe

    increased in the future if we require some extra functionality the software we have now

    installedisnotabletoprovide.

    Eachtimewehavefoundthenecessityofaddingsomefunctionality,wehavefoundoutthe

    applicationsthatareabletoprovideusourrequirementsandwehavechosenwhichfitsbetter

    to our present and future necessities. One of the most important criteria we have used to

    choosethesoftwareapplicationstouse isthat itshouldbeopensource.Theonlyexception

    wehavedone, iswith the IBMWebSpheresoftware,due IBM issupporting ourprojectand

    providedusthisforfree.

    IntheTable21youcanseethecurrentsoftware installed inourserversandthefeatures it

    provides

    to

    our

    platform

    and

    the

    applications

    in

    which

    they

    are

    used.

    Software Server FeaturesandUses

    FedoraLinux Adela,Was,Bolek

    ThisistheOSusedonallourservers.Asall

    theLinuxdistributions,itprovidesallthe

    basicfunctionalitiesforaserverlike:user

    accounts,sshserver,ftpserver,network

    protocols

    VoiceEnabler(VE) Adela It

    is

    a

    part

    of

    the

    IBM

    WebSphere

    Server

    framework.Itprovidesthebasiccapabilities

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    8/56

    4 RealTimeSpeechTranslator

    toenableVoiceApplications.Itunderstands

    theSIPandCCXMLprotocolsandmanages

    andprocessestheVXMLfile.Itreceivesthe

    SIPcallsforwardedbyAsteriskserver,assigns

    themaVoiceXMLApplication,andexecutes

    andinterpretstheVXMLfilesofthat

    application.

    VoiceServer(VS) Was

    ThisisapartoftheIBMWebSphereServer

    framework.ItisinstalledintheWasserver

    andimplementsthebothVoiceEngines:ASR

    (AutomaticSpeechRecognition)andTTS

    (TextToSpeech).Itreceivestherequestfrom

    VoiceEnablerthroughMRCPProtocoland

    returnstheresultofprocessingthembythe

    VoiceEngines.Theenginesrequiremore

    amountofCPUpowerthanother

    applicationsandthisisthereasonwhyitis

    installedinaseparatemachine.

    IBMHTTPServer Adela

    ItissuppliedinsidetheIBMWebSphere

    Serverframework.ItisanHTTPserver,and

    providesdebasicfeaturesofthiskindof

    servers.Itisusedtohostwebpagesandthe

    VXMLfilesofstaticVoiceXMLapplications.

    ApacheTomcat Adela

    ItisanHTTPServerenabledtorunJava

    Servlets.ItisusedtoruntheServletswe

    needtogenerateDynamicVoiceXML

    Applications.Italsohostsomewebpages

    andsomeconfigurationfilesusedbyServlets.

    MySQL Adela

    MySQLisaverypopularandopensource

    database.Itprovidesthesamefunctionalities

    ofthemostpowerfulproprietarydatabases.

    Weuseitasadatabaseinthedevelopment

    ofVXMLIDEProject.

    JavaFramework Adela,Was

    ItprovidesdeJVM(JavaVirtualMachine)and

    extraresourcesneededtoexecuteJava

    applications.Itisrequiredtobeinstalledin

    allthemachinesthatrunIBMWebSphere

    framework.WealsorequireitinAdelatorun

    theJavaServlets.

    Asterisk Bolek

    Itisthemostworldwideusedopensource

    SIPserver.Weuseittomanagetheincoming

    SIPcallsandredirectthemtoourVoice

    Applications.Italsomakepossibletoreceive

    SIPcallsbehindaNAT/Firewall.

    Table21.SoftwareusedinACCProjectPlatform.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    9/56

    51BIntroduction:ACCProject

    2.1.2 Hardware

    Wehavethreephysicalmachines.Thesemachinesaretheserversthatrunthesoftwarethat

    providesthefunctionalitiesweneed.Twooftheseservers,thosethatrunIBMsoftware,were

    providedbyIBM.Theother,whereistheAsteriskserverinstalled,belongstoRDCdepartment.

    Our entire infrastructure is behind the Firewall/ NAT of the CVUT University, but the server

    that runs the Firewall/NAT server belongs to the university and is administrated by the

    Universitysadministrator,soitwillbeconsideredoutofourplatform.

    Each of the machines is identified by a name (Adela, Was and Bolek) and run different

    software applications. Following our principle of using always open source software in the

    wholeproject,weuseFedoraLinuxoperationsystemonalltheseservers.InTable22youcan

    seethesoftwareinstalledoneachmachine.

    Name InstalledSoftware

    Adela

    VoiceEnabler(IBMWebSphere)

    IBMHTTPServer(IBMWebSphere)

    ApacheTomcatHTTPServer

    MySQLDatabase

    Was VoiceEngine(IBMWebSphere)

    Bolek AsteriskSIPServer

    Table22.Servernamesandinstalledsoftware.

    Inthefollowingtableyoucanseethehardwaretechnicalfeaturesofeachofourservers:

    Adela Bolek Was

    Model DELL PCABACUSARCH

    8700N

    DELL PCABACUSARCH

    8700N

    SUPERMICRO

    SuperServer

    CPU IntelPentium4

    3.6GHz

    IntelPentium4

    3.6GHz

    IntelXeon5130 @

    2.00GHz

    RAM 2GB 2GB 2GB

    HD 230GB 230GB 230GB

    Graphicscard GeForce550 GeForce550

    Table23.Servershardwarefeatures.

    2.1.3 Architecture

    Ourplatformisconstitutedforseveralserverswhichareinterconnectedwithinanetwork.We

    arenotusingadedicatednetworkonlyforourproject.Weareusingtheuniversitynetwork,

    sharedwithallotheruniversityservers.OurserversareBolek,WasandAdela.Theyconnect

    andreceiveconnectionsfrominternetthroughtheuniversityFirewall/NATserver.Thatisthe

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    10/56

    6 RealTimeSpeechTranslator

    reason why we need to use asterisk to be able to manage SIP connections behind a NAT

    server.InFigure21youcanseetheACCProjectnetworkarchitecture.

    Figure21.ACCProjectnetworkarchitecture.

    Whenauserwantstouseourservices,hehastoestablishaconnectionwithourSIPserver

    (Asterisk server installed on Bolek) which is behind the university firewall. Once this

    connection is established, our SIP server redirects the call to the Voice Enabler (VE) where

    therearetheVoiceXMLapplications.VoiceEnablertakesthecontrolofthecallandexecutes

    the Voice Application that user is requesting. To be able to interact with the user, Voice

    Applications need to use a Voice Server (VS). When it is needed, VE establish an MRCP

    connectionwiththeVS(Bolekserver)tomakethequeriesandreceivetheanswersfromthe

    VoiceEngine.InFigure22youcanseethecalldiagramandtheprotocolsusedtoestablishthe

    connectionsbetweenservers.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    11/56

    71BIntroduction:ACCProject

    Figure22.Calldiagram.

    2.2ACCTeamProjectsACCProjectisabigProjectwhichpurposeisdevelopingallkindofVoiceApplicationsandnot

    only those focused on call center applications. Therefore, ACC Project is made up of other

    smallerprojectscarriedoutforitsmembersthatmatchthemainprojectaims.

    Atthebeginning,ACCProjectteammadeabrainstorminganditsmembersproposedseveral

    Voice Projects to realize. After that, the proposed projects were sorted by priority and the

    teambegantoworkonthosewhichweremoreimportantaccordingonitscriteria.Afterthat,

    some other projects were coming up during the development of the Project and they were

    addedtothelist.

    Inthissection,Iwillexplainsomeoftheseprojects.Somearecurrentlyondevelopmentand

    othersareplannedtobedoneinthefuturebytheACCTeam.

    2.2.1 Real Time Voice Translator

    This project was set the first in the priority list and it was the first Voice Application to be

    started. We decided it so because many of the steps done to realize this project will be

    requiredinanyotherVoiceProjectwepurposetocarryoutinthefuture.

    ThemainpurposeofthisprojectistodevelopaVoiceApplicationwhichwasabletodoareal

    timetranslationofaphoneconversationbetweentwocallerswhospeakdifferentlanguages.

    This is theprojectthatwasassigned tome,XaviGarcia,andwhich Ihavebeencarryingout

    duringallthesemonths.Thisprojectwillbeexplaineddeeplyinsection4.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    12/56

    8 RealTimeSpeechTranslator

    2.2.2 VXML IDE

    Despite it is not a Voice Application properly speaking, it is within the purposes of the ACC

    Project. ItwasinthesecondpositionintheACCTeamprioritylist,soitmeansthisprojectis

    veryimportantforus.ThefinalaimofthisprojectisdevelopingaWebbasedIDE(Integrated

    Development Environment) for VoiceXML Applications. Its purpose is making available to a

    wide range of programmers VoiceXML technology and supply them an easy way to develop

    their own Voice Applications. The most important requirement is that it has to be a Web

    Application;therefore itwillbeavailableforeverybodywithoutanykindoflocalinstallation,

    whattheuserwillonlyneed is awebbrowser.Thisproject iscurrentlyunderdevelopment

    andiscarriedoutbyTomasMikula.

    Figure23.VXMLIDEinterface.

    2.2.3 ACC security

    ThisisanothernonVoiceApplication,butitisextremelyimportantforourProject.Theaimof

    this project is provide security to all our project applications and protect our infrastructure

    against attacks. This task is being accomplished by Michal Vanek and he is focusing on

    providingthefirewallrulesanddoingthepropersoftwareconfiguration(Asteriskserver)to

    avoidDoSattacks.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    13/56

    91BIntroduction:ACCProject

    2.2.4 Others

    ACCTeamhasplannedtocarryoutsomeotherprojectsinthefuture.Theyhavenotstarted

    yetbecauseeitherwehavenotaccomplishedsometheirprerequisitesorwehavenotenough

    team members to do all the projects at the same time. Some of these future projects are:

    PhoneWikipediaProject,andthirdpartyapplications.

    PhoneWikipediaProjectpretends tomakeavailableall thecontentsofWikipedia througha

    mobilephonecall.Thatis,everybodywhohaveamobilephonewouldbeabletoread(inthis

    casewouldbeabletolisten)anyWikipediaarticleavailableininternet.

    ThereareotherProjectsthatcameupduringthedevelopmentofothertasks.Thisisthecase

    ofWeatherApplication,whichwasdevelopedtotesttheVoiceXMLDATAtag.Thisapplication

    providesthecurrentweatherofanyEuropeancapital.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    14/56

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    15/56

    112BRealTimeVoiceTranslator

    3 RealTimeVoiceTranslatorTheRealTimeVoiceTranslatorisoneoftheProjectscarriedoutfortheACCProjectTeam.This

    documentwillfocusonitmoredeeplythanotherACCprojectsbecauseit istheprojectthat

    was assigned to be carried out by me. In the following sections I will describe the different

    partsofitandhowithasbeenfulfilled.

    3.1ObjectivesThepurposeofthisproject istodevelopaVoiceApplicationthatwasabletodoarealtime

    translationfromaspokensentenceinone languagetoaspokensentence inother language.

    The basic idea is to enable that two people, who are talking through a voice call in two

    differentcountries,wereable tounderstandeachotherspeakingdifferent languagesdue to

    anautomaticrealtimetranslationdonebyamachine.

    3.2RequirementsIn this project, like in every project; before carrying out it, it is needed to establish some

    requirements thatshouldbeaccomplished.Oncealltheserequirementshadbeenachieved,

    wealsowillbeabletosaythattheaimsoftheRealTimeVoiceTranslatorhavebeenachieved.

    Theserequirementsarethefollowings:

    The application must be autonomous. It should work without the interaction of any

    operator.

    It isneededtogeneratedynamicVXMLfilesdynamically. It isnotpossibletostorea

    list of possible results. All the answers will be generated after the user said his

    sentence.

    Theapplicationshouldbeabletounderstandeverythingtheusersaid.Usershouldbe

    abletotalkaboutwhateverhewantsandtheapplicationwouldbeabletotranslate

    whathehasspoken.

    Itshouldbeabletotranslatesimplesentencesto/fromatleasttwolanguages.

    It is needed to use a Translator Engine which was able to do the translation at real

    time.

    The delay between the moment the caller ends his sentence and the moment the

    translatedsentence isdeliveredtothecalleeshouldbeasshortaspossibletomake

    possibleasmoothconversation.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    16/56

    12 RealTimeSpeechTranslator

    3.3MarketAnalysisBeforebeginningthisproject,wedidaninternetresearchtofindoutotherpossibleservices,

    either commercial or free services, which were currently offering the same features we

    attempttoofferwiththeRealTimeVoiceApplication.

    This Market Analysis shows that, although there are some attempts to give a real time

    translation, they do not approach the problem in the same way we want to do in our

    application.Some of them try tosolve theproblem usinga mobilemachine where the user

    talks and it replies with the translated sentence. Others offer translation services for some

    written Instant Messaging (IM)services.And Skypehas introducedanewservice to provide

    InstantTranslationinphonecallsdonethroughahumaninterpreter.

    InthefollowingsectionsIwillexplainthefeaturesofeachofthemostimportantservicesthat

    trytoapproachthetranslationproblemincommunications.Theseservicesare:SkypePersonal

    InterpreterService,GoogleTalkBots,NECAutomaticTranslatorformobiles.

    3.3.1 Skype Personal Interpreter Service

    Skype is now offering realtime voice translation services for Skype calls. Language Line

    Personal Interpreter offers SkypeOut customers nearinstantaneous access to professional

    voice translation. This is an access to live interpreters, not machine translation. The cost is

    $2.99perminute,whichisrelativelyhigh,butwiththistypeofserviceyouarepayingforthe

    conveniencefirstandforemost.Theuseofthisservice isonthefly,withnoscheduling.You

    cangetaninterpreteronaveragein45secondsafteraninitialrequest.

    Features:

    Price $2.99/min

    Scope PhonecallsusingSkype

    Keypoints Liveinterpreters,notmachinetranslation

    150supportedlanguages Noscheduling

    Moreinfo www.skype.com

    Table31.SkypePersonalInterpreterServicefeatures

    http://www.skype.com/http://www.skype.com/
  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    17/56

    132BRealTimeVoiceTranslator

    3.3.2 Google talk bots for realtime translations in IM (Instant

    Messaging)

    Recently, on December2007,Googleannounced the availabilityofrealtime translations for

    their IM (Instant Messaging) service Google Talk. The translations are made by bots that

    translatethetextmessagesintheGoogleTalkconversations.Foreachlanguagetherearetwo

    bots,oneforeachtranslationdirection.Forexample,forSpanishtoEnglishtranslationthere

    are the boots [email protected] and [email protected]. This service

    supports23translationdirections(bots).

    Features:

    Price Free

    Scope InstantMessagingconversationsusingGoogleTalk

    Keypoints Texttranslationonly

    23translationdirectionssupported

    Moreinfohttp://googletalk.blogspot.com/2007/12/merry

    christmasgodjuland.html

    Table32.GoogleTalkBotsfeatures.

    3.3.3 NEC Automatic Translator for mobile phones

    Japanese electronics company NEC has developed the first automatic translation software

    tailoredformobilephones. ItonlytranslatesfromspeechJapanesetoEnglish,butthis isthe

    firsttimethetranslationtechnologyisusedinmobilephoneswithoutanyexternalhelp.The

    softwarehasadictionaryof50,000wordsanditisstillunderdevelopment.

    Features:

    Price unknown

    Scope Mobilephoneusers

    Keypoints JapanesetoEnglishtranslationonly(notEnglishtoJapanese)

    Underdevelopment

    Moreinfo http://www.akihabaranews.com/en/news_details.php?id=15185

    Table33.NECAutomaticTranslatorfeatures.

    http://googletalk.blogspot.com/2007/12/merry-christmas-god-jul-and.htmlhttp://googletalk.blogspot.com/2007/12/merry-christmas-god-jul-and.htmlhttp://googletalk.blogspot.com/2007/12/merry-christmas-god-jul-and.htmlhttp://googletalk.blogspot.com/2007/12/merry-christmas-god-jul-and.htmlhttp://googletalk.blogspot.com/2007/12/merry-christmas-god-jul-and.htmlhttp://googletalk.blogspot.com/2007/12/merry-christmas-god-jul-and.htmlhttp://googletalk.blogspot.com/2007/12/merry-christmas-god-jul-and.htmlhttp://googletalk.blogspot.com/2007/12/merry-christmas-god-jul-and.htmlhttp://googletalk.blogspot.com/2007/12/merry-christmas-god-jul-and.htmlhttp://www.akihabaranews.com/en/news_details.php?id=15185http://www.akihabaranews.com/en/news_details.php?id=15185http://googletalk.blogspot.com/2007/12/merry-christmas-god-jul-and.htmlhttp://googletalk.blogspot.com/2007/12/merry-christmas-god-jul-and.html
  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    18/56

    14 RealTimeSpeechTranslator

    3.4ChoosingaTranslationEngineInordertocarryouttheRealTimeVoiceTranslatorProject,itismandatorytohavearealtime

    translator which can provide us the translated sentences. As it is a very hard work to

    implement a translator engine, we decided to use one of the free translators available inInternet.

    Hence,firstofallwedidaresearchtofindoutallthefreetranslatorenginesavailable.After

    this research we had to decide which one we choose for our purposes. Like it is known for

    everybody,allthetranslationenginescurrentlyavailablehavenotagoodaccuracy.Therefore

    wedecidetochoosethatonewhichhadthebesttranslationaccuracy.Toknowwhichoneof

    all the preselected translation services has the best accuracy, I did a test that give a

    punctuationtoeachoneandfinallyIcomparetheresults.

    3.4.1 Web Translation services

    After a deep research, we find out the most important free translation services currently

    available in internet. We have preselected all them to do our test and choose the best

    translationenginetouseinourapplication.InTable44youcanseealistofalltheseservices

    andtheirURLs.

    Service Website

    Altavista Babel Fish http://babelfish.altavista.com/

    Yahoo!Babelfish http://babelfish.yahoo.com/

    Google Translator http://translate.google.com/translate_t?hl=en

    InterTran http://www.tranexp.com:2000/InterTran

    FreeTranslation http://www.freetranslation.com/

    WorldLingohttp://www.worldlingo.com/en/products_services/worldlingo_translat

    or.html

    Dictionary.Com http://translator.dictionary.com/text.html

    PROMT http://www.epromt.com/

    TraduceGratis.com

    (Google engine)

    http://www.traducegratis.com/

    Imtranslator

    (PROMTengine)

    http://translation.paralink.com/

    Table34.Webtranslationserviceslist.

    http://babelfish.altavista.com/http://babelfish.altavista.com/http://babelfish.yahoo.com/http://babelfish.yahoo.com/http://translate.google.com/translate_t?hl=enhttp://www.tranexp.com:2000/InterTranhttp://www.freetranslation.com/http://www.freetranslation.com/http://www.worldlingo.com/en/products_services/worldlingo_translator.htmlhttp://www.worldlingo.com/en/products_services/worldlingo_translator.htmlhttp://translator.dictionary.com/text.htmlhttp://www.e-promt.com/http://www.e-promt.com/http://www.e-promt.com/http://www.e-promt.com/http://www.traducegratis.com/http://translation.paralink.com/http://translation.paralink.com/http://translation.paralink.com/http://www.traducegratis.com/http://www.e-promt.com/http://translator.dictionary.com/text.htmlhttp://www.worldlingo.com/en/products_services/worldlingo_translator.htmlhttp://www.worldlingo.com/en/products_services/worldlingo_translator.htmlhttp://www.freetranslation.com/http://www.tranexp.com:2000/InterTranhttp://translate.google.com/translate_t?hl=enhttp://babelfish.yahoo.com/http://babelfish.altavista.com/
  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    19/56

    152BRealTimeVoiceTranslator

    3.4.2 Translation Web Services Accuracy benchmark

    We decided to do an accuracy benchmark to each one of the preselected web translation

    servicestobeabletocompareeachotherinabetterway.Thisbenchmarkconsistintranslate

    severalbasicsentencesfromSpanishtoEnglish.Ihaveselectedsomecommonsentencesthat

    areusedinabasicconversationwhensomeoneistryingtocontractsomeservice.Ispecifically

    have selected the basic dialog for booking a taxi and the basic dialog for ordering a pizza. I

    havetonoticethattheaccuracymarkisanabsolutelysubjectivemarkanditisonlybasedon

    mypersonalopinion.

    3.4.2.1 SpanishsentencesHerearelistedtheSpanishsentencesofthedialogswhichourbenchmarkismadeof:

    1. BuenosdasleatiendeJuanGonzlez,enqupuedoayudarle?

    2. Desearapediruntaxiparalas22horasenladireccin:calleSanRamnnmero2.

    3. Deacuerdo,sureservahasidorealizada.

    4. Desearapedirunapizzatropicaltamaofamiliar.

    5. Deseaaadiralgningredientems?

    6. S.Extradequeso,jamnypeperoni.

    7. Muybien,mepuededecirsunmerodetelfonoysudireccin?

    8. Minmerodetelfonoesel777.777.777ymidireccines:calleFrankKafkanmero

    132.

    9. Perfecto,en30minutosrecibirsupedido.Elpreciofinalesde180coronas.

    In the following tables there are the result of the benchmark we passed on each translator

    engineandthemarktheyobtainedon the translationofeachsentence.Attheendyoucan

    findasummarytablewiththeglobalmarkofalltheengines.

    AltavistaBabelfish(Babelfishengine)

    Translationresult Accuracy

    1 GoodmorningittakescareofJuantohimGonzlez,inwhatIcanhelphim? 45

    2 It would wish to request a taxi for the 22 hours in the direction: street San Ramon

    number2

    90

    3 Inagreement,itsreservehasbeenmade. 80

    4 Pizzawouldwishtorequestonetropicalfamiliarsize. 50

    5 Itwishestoaddsomeingredientmore? 95

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    20/56

    16 RealTimeSpeechTranslator

    6 Yes.Cheeseextra,jamnandpeperoni. 75

    7 Verywell,itcansaytoitstelephonenumberanditsdirectiontome? 70

    8 My telephone number is the 777.777.777 and my direction is: street Frank Kafka

    number132.

    85

    9 Perfect,in30minutesitwillreceiveitsorder.Thefinalpriceisof180crowns. 85

    Table35.AltavistaBabelfishSpanishEnglishbenchmarkresults.

    Yahoo!Babelfish(Babelfishengine)

    Translationresult Accuracy

    1 GoodmorningittakescareofJuantohimGonzlez,inwhatIcanhelphim? 45

    2 It would wish to request a taxi for the 22 hours in the direction: street San Ramon

    number2.

    90

    3 Inagreement,itsreservehasbeenmade. 80

    4 Pizzawouldwishtorequestonetropicalfamiliarsize. 50

    5 Itwishestoaddsomeingredientmore? 95

    6 Yes.Cheeseextra,jamnandpeperoni. 75

    7 Verywell,itcansaytoitstelephonenumberanditsdirectiontome? 70

    8 My telephone number is the 777.777.777 and my direction is: street Frank Kafka

    number132.

    85

    9 Perfect,in30minutesitwillreceiveitsorder.Thefinalpriceisof180crowns. 85

    Table36.Yahoo!BabelfishSpanishEnglishbenchmarkresults.

    GoogleTranslator

    Translationresult Accuracy

    1 GoodmorningheattendsJuanGonzalez,howcanIhelp? 80

    2 Iwouldliketoaskataxifor22hoursatthefollowingaddress:CalleSanRamonnumber

    2.

    100

    3 Okay,yourreservationhasbeenmade. 100

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    21/56

    172BRealTimeVoiceTranslator

    4 Iwouldliketoaskapizzatropicalfamilysize. 90

    5 Wanttoaddonemoreingredient? 90

    6 Yes.Extracheese,hamandpeperoni. 100

    7 Well,whatIcansayyourphonenumberandyouraddress? 65

    8 My telephone number is 777,777,777 and my address is: Frank Kafka street number

    132.

    100

    9 Perfect,in30minutesyoushouldreceiveyourorder.Thefinalpricewas180kronor. 95

    Table37.GoogleTranslatorSpanishEnglishbenchmarkresults.

    InterTran

    Translationresult Accuracy

    1 GoodmorninghimatiendeJohnGonzlez,whereinitcanlendahelpinghand? 20

    2 Hunger for order one taxicab in order to the 22 hours on the steerage : street San

    Ramnnumeral2.

    35

    3 Inagreement,herreservationhasbeenachieved. 65

    4 Hungerfororderonepizzatropicalsizehomelike. 40

    5 Deseaappendanyingredientfurther? 35

    6 yesExtraofcheese,hamandpeperoni. 85

    7 Welldonemecanyoutellmehernumeraloftelephoneandhersteerage? 30

    8 Enumeraloftelephone isthe777.777.777andEsteerage is:streetOutspokenKafka

    numeral132.

    20

    9 Perfect,at30minutesshe'llreceiveherrequest.Thepriceeventualisof180halos. 60

    Table38.InterTranSpanishEnglishbenchmarkresults.

    FreeTranslation

    Translationresult Accuracy

    1 GoodmorningJuanattendshimGonzlez,inwhatcanIhelphim? 45

    2 Itwoulddesiretoaskataxiforthe22hoursinthedirection:streetSanRamnnumber 80

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    22/56

    18 RealTimeSpeechTranslator

    2.

    3 Inagreement,itsreservehasbeencarriedout. 80

    4 Itwoulddesiretoaskapizzatropicalfamilysize. 80

    5 Itdesirestoaddsomeingredientmore? 85

    6 Yes. Extraofcheese,hamandpeperoni. 100

    7 Verywell,cantellmeitsphonenumberanditsdirection? 85

    8 Myphonenumber isthe777.777.777andmydirection is:streetFrankKafkanumber

    132.

    85

    9 Perfect,in30minuteswillreceiveitsorder. Thefinalpriceisof180crowns. 85

    Table39.FreeTranslationSpanishEnglishbenchmarkresults.

    WorldLingo

    Translationresult Accuracy

    1 GoodmorningittakescareofJuantohimGonzlez, inwhatIcanhelphim? 60

    2 It would wish to request a taxi for the 22 hours in the direction: street San Ramon

    number2.

    80

    3 Inagreement,itsreservehasbeenmade. 90

    4 Pizzawouldwishtorequestonetropicalfamiliarsize. 70

    5 Deseatoaddsomeingredientmore? 65

    6 Yes.Cheeseextra,jamnandpeperoni. 60

    7 Verywell, cansaytoitstelephonenumberanditsdirectiontome? 80

    8 My telephone number is the 777.777.777 and my direction is: street Frank Kafka

    number132.

    90

    9 Perfect,in30minutesitwillreceiveitsorder.Thefinalpriceisof180crowns. 90

    Table310.WorldLingoSpanishEnglishbenchmarkresults.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    23/56

    192BRealTimeVoiceTranslator

    Dictionary.Com

    Translationresult Accuracy

    1 GoodmorningittakescareofJuantohimGonzlez,inwhatIcanhelphim? 60

    2 It would wish to request a taxi for the 22 hours in the direction: street San Ramon

    number2.

    80

    3 Inagreement,itsreservehasbeenrealized. 75

    4 Solargerelativewouldwishtorequestatropicalpizza. 50

    5 Itwishestoaddsomeingredientmore? 80

    6 Yes.Cheeseextra,jamnandpeperoni. 50

    7 Verywell,itcansaytoitstelephonenumberanditsdirectiontome? 75

    8 My telephone number is the 777.777.777 and my direction is: street Frank Kafka

    number132.

    80

    9 Perfect,in30minutesitwillreceiveitsorder.Thefinalpriceisof180crowns. 85

    Table311.Dictionary.comSpanishEnglishbenchmarkresults.

    PROMT

    Translationresult Accuracy

    1 GoodmorningJuanGonzlezattendstohim:inwhatcanIhelphim? 75

    2 Hewouldwanttoaskforataxifor22hoursinthedirection:thereisquietSanRamn

    number2.

    50

    3 Inagreement,hisreservationhasbeenrealized. 85

    4 Familiarsizewouldwanttoaskforatropicalpizza. 25

    5 Doeshewanttoaddanyingredientmore? 75

    6 Yes.Extraofcheese,hamandpepperoni. 100

    7 Verywell:canhesaytomehistelephonenumberandhisdirection? 80

    8 My telephone number is 777.777.777 and my direction is: there is quiet Frank Kafka

    number132.

    70

    9 Perfect,in30minutesitwillreceivehisorder.Thefinalpriceis180crowns. 85

    Table312.PROMTSpanishEnglishbenchmarkresults.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    24/56

    20 RealTimeSpeechTranslator

    Summary

    Altavista Yahoo! Google InterTranFreeTrans

    lation

    WorldLing

    o

    Dictionary

    .ComPROMT

    1 45 45 80 20 45 60 60 70

    2 90 90 100 35 80 80 80 50

    3 80 80 100 65 80 90 75 85

    4 50 50 90 40 80 70 50 25

    5 95 95 90 35 85 65 80 70

    6 75 75 100 85 100 60 50 100

    7 70 70 65 30 85 80 75 80

    8 85 85 100 20 85 90 80 70

    9 85 85 95 60 85 90 85 85

    Avg 75 75 91 43 81 76 71 70

    Table313.Benchmarkssummary.

    In order to differentiate better the engines scores and have a clearer idea about them, it is

    requiredtohavesomestatisticscharts.Thefollowinggraphscomparestheaverage,maximum

    andminimumaccuracyofeachengine.Itishighlightedwithredcolorthebestscoredengine.

    73

    73

    90

    39

    79

    75

    69

    67

    0 20 40 60 80 1

    AltavistaBabelfish

    Yahoo!Babelfish

    Google

    InterTran

    FreeTranslation

    WorldLingo

    Dictionary.com

    PROMT

    AverageAccuracy

    00

    Figure31.AverageaccuracygraphinSpanishEnglishBenchmark.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    25/56

    212BRealTimeVoiceTranslator

    95

    95

    100

    85

    100

    90

    85

    100

    75 80 85 90 95 100 105

    AltavistaBabelfish

    Yahoo!Babelfish

    Google

    InterTran

    FreeTranslation

    WorldLingo

    Dictionary.com

    PROMT

    MaxAccuracy

    Figure32.MaximumaccuracygraphinSpanishEnglishBenchmark.

    45

    45

    65

    20

    45

    6050

    25

    0 10 20 30 40 50 60 70

    AltavistaBabelfish

    Yahoo!Babelfish

    Google

    InterTran

    FreeTranslation

    WorldLingoDictionary.com

    PROMT

    MinAccuracy

    Figure33.MinimumaccuracygraphinSpanishEnglishBenchmark.

    In Figure 34 you can see the range of marks obtained for each engine and where it is the

    averageaccuracymarkwithinit.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    26/56

    22 RealTimeSpeechTranslator

    0

    20

    40

    60

    80

    100

    120

    Accuracyrange

    Average

    Figure34.AccuracyrangechartinSpanishEnglishBenchmark.

    3.4.2.2 EnglishsentencesHerearelistedtheEnglishsentencesofthedialogswhichourbenchmarkismadeof:

    1. Goodmorning,itsJuanJohnSmith,whatcanIhelpyou?

    2. Iwouldliketoaskforataxiat22hoursattheaddress:FifthAvenuenumber2.

    3. Ok,yourreservationhasbeenmade.

    4. IwouldliketoorderaTropicalpizzafamilysize.

    5. Doyouwanttoaddsomeotheringredient?

    6. Yes.Extracheese,hamandpepperoni.

    7. Allright,canyoutellmeyourtelephonenumberandyouraddress?

    8. My telephonenumber is777.777.777andmyaddress is:FrankKafkastreetnumber

    132.

    9. Allright,youwillreceiveyourorderin30minutes.Thefinalpriceis180crowns.

    LookingattheresultsoftheSpanishEnglishbenchmark,wecannoticethatthereareseveral

    engines which have a very low accuracy. Therefore, we decided to do the EnglishSpanish

    benchmarkonlyonthethreeengineswhichhavethehighestmarksintheprevioustest.

    In the following tables there are the benchmark results of each translator engine and the

    markstheyobtainedonthetranslationofeachsentence.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    27/56

    232BRealTimeVoiceTranslator

    GoogleTranslator

    Translationresult Accuracy

    1 Buenosdas,esJuandeJohnSmith,qupuedohacerparaayudarle? 80

    2 Megustarapediruntaxialas22horasenladireccin:QuintaAvenidanmero2. 100

    3 Ok,sureservaseharealizado. 100

    4 MegustarapedirunpizzaTropicaltamaodelafamilia. 95

    5 Deseasaadiralgnotroingrediente? 100

    6 S.Extraqueso,jamnypepperoni. 100

    7 Estbien,puedeustedmedicesunmerodetelfonoysudireccin? 85

    8 Minmerodetelfonoesel777.777.777ymidireccines:calleFrankKafkanmero

    132.

    100

    9 Muybien,ustedrecibirsupedidoen30minutos.Elpreciofinalesde180coronas. 100

    Table314.GoogleTranslatorEnglishSpanishbenchmarkresults.

    Altavista&Yahoo!(BabelFishEngine)

    Translationresult Accuracy

    1 Buenamaana,lesJuanJohnSmith,culpuedeyoayudarle? 40

    2 Quisierapediruntaxien22horasenladireccin:QuintaAvenidanmero2. 50

    3 Sehahecholaautorizacin,sureservacin. 40

    4 Quisierapediruntamaodelafamiliatropicaldelapizza. 30

    5

    Usted

    quiere

    agregar

    un

    poco

    de

    otro

    ingrediente?

    75

    6 S.Queso,jamnysalchichonesadicionales. 80

    7 Todoaladerecha,puedeusteddecirmesunmerodetelfonoysudireccin? 70

    8 Minmerode telfonoes777.777.777ymidireccines: Callenmero132 deFrank

    Kafka.

    100

    9 Todoaladerecha,ustedrecibirsuordenen30minutos.Elpreciofinales180coronas. 80

    Table315.BabelFishengineEnglishSpanishbenchmarkresults.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    28/56

    24 RealTimeSpeechTranslator

    FreeTranslation

    Translationresult Accuracy

    1 Buenosdas,esJuanJohnSmith,lequpuedoayudaryo? 70

    2 Querrapediruntaxien22horasenladireccin:QuintonmerodelaAvenida2. 60

    3 Bueno,sureservacinhasidohecha. 80

    4 QuerraordenarunapizzaTropicaleltamaofamiliar. 85

    5 Quiereustedagregaralgnotroingrediente? 100

    6 S.Elquesoextra,eljamnylasalchicha. 100

    7 Bueno,puededecirmeustedsunmerodetelfonoysudireccin? 95

    8 Minmerodetelfonoes777.777.777ymidireccines:Elnmerofrancode lacalle

    deKafka132.

    60

    9 9.Bueno,ustedrecibirsuordenen30minutos.Elpreciofinales180coronas. 80

    Table316.FreeTranslationEnglishSpanishbenchmarkresults.

    Inthefollowinggraphs,youcanseethestatisticscomparisonbetween thedifferentscored

    engines.Ineachgraph,itishighlightedwithredcolorthebestscoredengine.Intheaccuracy

    range chart, you can see the range of marks obtained for each engine and where it is the

    averageaccuracymarkwithinit.

    95

    59

    80

    0 20 40 60 80 100 120

    Google

    Babelfish

    FreeTranslation

    AverageAccuracy

    Figure35.AverageaccuracygraphinEnglishSpanishBenchmark.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    29/56

    252BRealTimeVoiceTranslator

    100

    100

    100

    0 20 40 60 80 100 120

    Google

    Babelfish

    FreeTranslation

    MaxAccuracy

    Figure36.MaximumaccuracygraphinEnglishSpanishBenchmark.

    80

    30

    60

    0 20 40 60 80 100

    Google

    Babelfish

    FreeTranslation

    MinAccuracy

    Figure37.MinimumaccuracygraphinEnglishSpanishBenchmark.

    0

    20

    40

    60

    80

    100

    120

    Google Babelfish FreeTranslation

    Accuracyrange

    Average

    Figure38.AccuracyrangechartinEnglishSpanishBenchmark.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    30/56

    26 RealTimeSpeechTranslator

    3.4.3 Conclusion

    Looking at the both benchmark results, we can see that Google Translate stands out as the

    bestwebtranslatorservice.Thetranslationsitprovidesarequiteunderstandableinallcases,

    whilsttheresultsprovidedforotherenginesarenotinmanycases.Forthisreason,wechose

    GoogleTranslateEngineforusinginourproject.

    3.5SoftwareToolsFor thedevelopmentof theRealTime VoiceTranslator ithasbeennecessarytouseseveral

    softwaretools.Thistoolscoverawiderangeofsoftwareapplications,fromHTTPserverstoa

    programming IDEs.Eachtoolhasbeenusedtoaccomplishataskwithinthelargenumberof

    parts our project is made of. In the following sections I am going to explain each of these

    softwareapplications,theirfunctioninsidetheprojectarchitectureandhowwehaveusedand

    reconfigu dthem.

    3.5.1 IBM WebSphere Application Server

    IBM WebSphere Application Server (WAS) is the flagship product within IBM's WebSphere

    brand. WAS is built using open standards such as Java EE, XML, and Web Services. It works

    with a number of Web servers including Apache HTTP Server, Netscape Enterprise Server,

    MicrosoftInternetInformationServices(IIS),IBMHTTPServerfori5/OS,IBMHTTPServerfor

    z/OS, and IBM HTTP Server for AIX/Linux/Microsoft Windows/Solaris. WAS V6.1 is the

    foundationoftheIBMWebSpheresoftwareplatform.Itdeliversthesecure,scalable,resilient

    applicationinfrastructurethatisneededforaServiceOrientedArchitecture(SOA).

    WeinstalledWebSphereApplicationServerV5.1onaLinuxsystemforusingtheVoiceEnabler

    andVoiceEnginesserversitprovidesinourproject.Asitisexplainedinsection2.1Software,

    wehavedeployedWebSphereApplicationServerintwoservers:adelaandwas.

    3.5.2 Java EE

    JavaEEistheversionoftheJavaPlatformusedtodevelopEnterpriseApplications,focusingon

    a distributed server environment. The platform was known as Java 2 Platform, Enterprise

    Edition or J2EE until the name was changed to Java EE in version 5. The current version is

    calledJavaEE5whilstthepreviousversioniscalledJ2EE1.4.

    JavaEEbuildsonthesolidfoundationofJavaPlatform,StandardEdition(JavaSE)and isthe

    industry standard for implementing enterpriseclass serviceoriented architecture (SOA) and

    nextgeneration web applications. The SDKs contain Sun GlassFish Enterprise Server,

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    31/56

    272BRealTimeVoiceTranslator

    previously named Sun Java System Application Server, and provide support for Java EE 5

    specifications.

    The Java EE Platform differs from the Standard Edition (SE) of Java in that it adds libraries

    which provide functionality to deploy faulttolerant, distributed, multitier Java software,

    basedlargelyonmodularcomponentsrunningonanapplicationserver.ItincludesseveralAPI

    specifications, such as JDBC, RMI, email, JMS, web services, XML, etc, and defines how to

    coordinatethem.JavaEEalsofeaturessomespecificationsuniquetoJavaEEforcomponents.

    TheseincludeEnterpriseJavaBeans,servlets,portlets(followingtheJavaPortletspecification),

    JavaServerPagesandseveralwebservicetechnologies.

    WechoseJavaEEplatformtodevelopourapplication.OurdynamicVoiceXMLapplicationsuse

    thepowerofJavaServletstointeractwithotherwebservicesavailableininternetandmanage

    data from different sources to create new voice services. We are currently using Java EE 5

    version.

    3.5.3 Apache Tomcat HTTP Server

    Apache Tomcat is a Servlet container developed at the Apache Software Foundation (ASF).

    Tomcat implements the Java Servlet and the JavaServer Pages (JSP) specifications from Sun

    Microsystems,andprovidesa"pureJava"HTTPwebserverenvironmentforJavacodetorun.

    TomcatshouldnotbeconfusedwiththeApachewebserver,whichisaCimplementationofa

    HTTPwebserver;thesetwoHTTPwebserversarenotbundledtogether.

    ApacheTomcatincludestoolsforconfigurationandmanagement,butcanalsobeconfigured

    byeditingconfigurationfilesthatarenormallyXMLformatted.

    As I mentioned in previous sections, we use Java Servlets for implementing most of the

    applicationsinourproject.Forthisreason,weneedtouseaServletcontainerandwechose

    theApacheTomcatone.WearecurrentlyrunningApacheTomcat6forLinuxonadelaserver.

    ThisversionisdesignedtouseJavaEE5.

    3.5.4 Eclipse IDE

    Eclipse is an integrated development environment (IDE) written primarily in Java. The initial

    codebase originated from VisualAge, a family of computer integrated development

    environments from IBM, which included support for a few popular (and not so popular)

    computer programming languages. In its default form it is meant for Java developers,

    consistingoftheJavaDevelopmentTools(JDT).Userscanextend itscapabilitiesby installing

    plugins written for the Eclipse software framework, such as development toolkits for other

    programming languages,and canwrite and contribute theirown plugin modules. Language

    packsprovidetranslationsintooveradozennaturallanguages.Itisopensourcesoftwareand

    itisavailableforseveraloperationsystems.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    32/56

    28 RealTimeSpeechTranslator

    The Eclipse Project is supported by the most important industry leaders like Borland, IBM,

    MERANT, QNX Software Systems, Rational Software, Red Hat, SuSE, TogetherSoft and

    Webgain.

    IhavebeenusingEclipse3.3IDEfordevelopingallthejavaservletsforourProject.Insection

    4.5.6 Creating and running a dynamic VoiceXML application I will explain how I use it to

    developVoiceXMLapplications.

    Figure39.EclipseIDEscreenshot.

    3.5.5 IBM Rational Web Developer & Voice Toolkit

    IBM Rational Application Developer for WebSphere Software extends Eclipse with visual

    construction development. It helps Java developers rapidly design, develop, assemble, test,

    profileanddeployhighqualityJava/J2EE,Portal,Web,WebservicesandSOAapplications.

    We use IBM Rational Web Developer (RWD) in ACC Project to develop static Voice

    Applications. It is really easy and intuitive to create a whole application using the visual

    constructenvironment that RWD provides. For this purpose it isneeded to install theVoice

    ToolkitwhichintegrateswithRWDenvironmentandprovidestheCommunicationFlowBuilder

    tool.UsingtheCommunicationFlowBuilder it isveryeasytocreatetheflowdiagramofthe

    voiceapplicationonlydragginganddroppingthevisualelementsandconnectingthem.Once

    we have created the Communication Flow, the RWD creates the VXML application file

    automatically.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    33/56

    292BRealTimeVoiceTranslator

    Figure310.IBMRationalWebDeveloperandCommunicationFlowBuilderscreenshot.

    3.6 wIconfigurethesoftwaretools3.6.1 Tomcat 6 Installation and configuration on Fedora Linux

    Ho

    ThissectionwilldescribestepbystepwhatitisneededtodotoinstallandconfigureApache

    Tomcat6onourFedoraLinuxserver.

    3.6.1.1 DownloadbinariesandprerequisitesJDK(JavaDevelopmentKit)

    Tomcat 6.0 was designed to run on J2SE 5.0. You can download Java EE SDK from

    http://java.sun.com.

    Tomcat

    Binary downloads of the Tomcat server are available from

    http://tomcat.apache.org/download60.cgi.

    SetanenvironmentvariableCATALINA_HOMEthatcontainsthepathnametothedirectoryin

    whichTomcat6hasbeeninstalled.

    Ant

    Binary downloads of the Ant build tool are available from

    http://ant.apache.org/bindownload.cgi. Download and install Ant from the distribution

    http://tomcat.apache.org/download-60.cgihttp://tomcat.apache.org/download-60.cgihttp://tomcat.apache.org/download-60.cgihttp://ant.apache.org/bindownload.cgihttp://ant.apache.org/bindownload.cgihttp://tomcat.apache.org/download-60.cgi
  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    34/56

    30 RealTimeSpeechTranslator

    directorymentionedabove.Then,addthebindirectoryoftheAntdistributiontoyourPATH

    environment variable, following the standard practices for your operating system platform.

    Once you have done this, you will be able to execute theant shell command directly. You

    should use a command like it is shown below and modify your .bashrc file in your home

    directory:

    PATH=$PATH:/opt/SDK/lib/ant/bin

    3.6.1.2 EnvironmentvariablesrequiredCATALINA_HOME

    ItmaypointatyourCatalina"build"directory.

    CATALINA_BASE (Optional)

    ItisthebasedirectoryforresolvingdynamicportionsofaCatalinainstallation. Ifnotpresent,

    resolvestothesamedirectorythatCATALINA_HOMEpointsto.

    JAVA_HOME

    ItmustpointatyourJavaDevelopmentKitinstallation.Requiredtorunthewiththe"debug"

    or"javac"argument.TosettheJAVA_HOMEenvironmentvariable it isnecessarytoexecute

    thefollowingcommand:

    export JAVA_HOME=/opt/SDK/jdk

    Youcanaddthislineinthe.bashrcfileofyourhomedirectorytodoitautomaticallyeachtime

    thesystemruns.

    JRE_HOME

    ItmustpointatyourJavaDevelopmentKitinstallation. DefaultstoJAVA_HOMEifempty.

    3.6.1.3 ConfigurationfilesServer.xml

    ThisistheTomcatservermainconfigurationfile.Bydefaultitisatconf/server.xml.Wewillnot

    modifyitandwillleaveitwiththedefaultvalues.

    Build.xml

    Itisnecessarytomodifythedefaultbuild.xmlfileprovidedbytomcatatthefollowingline:

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    35/56

    312BRealTimeVoiceTranslator

    stepsrequired.Thisfileisstoredinthetopleveldirectoryofyoursourcecodehierarchy,and

    shouldbecheckedintoyoursourcecodecontrolsystem.

    Web.xml

    Thisisthewebapplicationconfigurationfileandcontrolsalltheparametersrelatedwitheach

    web application. This file will be discussed in section4.5.6Creatingand runningadynamic

    VoiceXMLapplication.Thereisoneweb.xmlfileforeachapplicationinthedirectorypath:

    $CATALINA_HOME/webapps/{my_web_application}/WEB-INF

    3.6.1.4 StandarddirectorylayoutAwebapplicationisdefinedasahierarchyofdirectoriesandfilesinastandardlayout.Sucha

    hierarchycanbeaccessed in its"unpacked"form,whereeachdirectoryandfileexists inthe

    filesystemseparately,orina"packed"formknownasaWebARchive,orWARfile.Theformer

    format ismoreusefulduringdevelopment,whilethe latter isusedwhenyoudistributeyour

    applicationtobeinstalled.

    To run a web application on Tomcat you will need to arrange your application files in the

    followinghierarchyinyourapplication's"documentroot"directory:

    *.html,*.jsp,etc. TheHTMLandJSPpages,alongwithotherfilesthatmustbevisible

    to the client browser (such as JavaScript, stylesheet files, and images) for your

    application. In larger applications you may choose to divide these files into a

    subdirectoryhierarchy,butforsmallerapps, it isgenerallymuchsimplertomaintain

    onlyasingledirectoryforthesefiles.

    /WEBINF/web.xml TheWebApplicationDeploymentDescriptorforyourapplication.

    This isanXML file describing theservlets andothercomponents thatmakeupyour

    application, along with any initializationparameters and containermanaged security

    constraintsthatyouwanttheservertoenforceforyou.Thisfileisdiscussedinmore

    detailinthefollowingsections.

    /WEBINF/classes/ This directory contains any Java class files (and associated

    resources)requiredforyourapplication,includingbothservletandnonservletclasses,

    thatarenotcombinedintoJARfiles.IfyourclassesareorganizedintoJavapackages,

    youmustreflectthisinthedirectoryhierarchyunder/WEBINF/classes/.Forexample,

    aJavaclassnamedcom.mycompany.mypackage.MyServletwouldneedtobestoredin

    afilenamed/WEBINF/classes/com/mycompany/mypackage/MyServlet.class.

    /WEBINF/lib/ This directory contains JAR files that contain Java class files (and

    associatedresources)requiredforyourapplication,suchasthirdpartyclass libraries

    orJDBCdrivers.

    Likemostservletcontainers,Tomcat6alsosupportsmechanismstoinstalllibraryJARfiles(or

    unpacked classes) once, and make them visible to all installed web applications (without

    havingtobeincludedinsidethewebapplicationitself.

    $CATALINA_HOME/common/lib JAR files placed here are visible both to web

    applicationsandinternalTomcatcode.ThisisagoodplacetoputJDBCdriversthatare

    requiredforbothyourapplicationandinternalTomcatuse(suchasforaJDBCRealm).

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    36/56

    32 RealTimeSpeechTranslator

    $CATALINA_BASE/shared/lib JARfilesplacedherearevisibletoallwebapplications,

    but not to internal Tomcat code. This is the right place for shared libraries that are

    specifictoyourapplication.

    3.6.2 IBM RWD Voice Language and Voice Engines Configuration

    OncetheCommunicationflowisdoneandbeforethegenerationoftheVXMLfile,itisneeded

    toconfiguretheVoiceEnginesandtheSpeechLanguageourapplication isgoing touse.For

    doingthis,youneedtodothefollowingsteps:

    MenuWindow>Preferences

    Attheleftpanelofthewindowthatappearsselect:

    VoiceTools>SpeechEngines >WebSphereVoiceServer

    VoiceLanguage:USEnglish/LatinAmericaSpanish

    Compressionformat:mulaw

    RecognizerService

    Mediaurl:rtsp://was.feld.cvut.cz/media/recognizer

    Encoding:UTF8

    SynthesizerService

    Mediaurl:rtsp://was.feld.cvut.cz/media/synthesizer

    Encoding:UTF8

    TTSVoice:Default

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    37/56

    332BRealTimeVoiceTranslator

    Figure311.IBMRWDVoiceLanguageandVoiceEnginesconfiguration.

    3.7Creating and running a dynamic VoiceXMLapplication

    In thissection Iamgoing toexplainhow touseall thesoftware tools explained in previous

    sectionstocreateaVoiceXMLapplicationandrun itonourplatform.Thissection iswritten

    like a step by step manual in order it were easier to follow all the procedure and to show

    clearlyhowtouseeachtool.

    Brieflytheprocedureis:writetheservletcodeinEclipseIDE,copythegeneratedservletbinary

    filesintotheserver(insidetheTomcatdirectory)andupdateTomcatandVEtorecognizethe

    newapplication.Seethefollowingsubsectionswherethereareexplainedmoredetailedeach

    step.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    38/56

    34 RealTimeSpeechTranslator

    3.7.1 Create a new Java Project in Eclipse IDE

    AlthoughEclipsegivesyouthechancetocreateaDynamicWebApplicationProjectwhenyou

    trytocreatenewproject,wewillnotusethisoption.Ifyouchoosethis,Eclipseassistyouinall

    thetasksinvolvedinthecreationofthiskindofapplications,butalltheprocedureswithinthis

    kind of applications are very complex and take much time to understand how to use them.

    Therefore,likeforourpurposeswedonotneedmostofthisassistance,wearegoingtocreate

    asimplyJavaProject,whichwillprovideuswhatweneedanditismucheasiertounderstand.

    Tocreatethenewproject,yousimplyhavetogotothemenuFile>New>JavaProjectand

    followthewizard.Afterthat,youcanbeginaddingnewclassesandwritingtheservletscode.

    3.7.2 Adding JEE libraries in Eclipse IDE

    Inthepackageexplorer,rightclickon theprojectnamewhichyouaregoing todevelop the

    servletin.SelectPropertiesonthepopupmenuthatappears.Ontheleftsideoftheproperties

    windowsselectJavaBuildPathandthenontherightsideofthepropertieswindowschoose

    theLibrariestab.ClickthebuttoncalledAddExternalJARsandthenselectthejavaee.jar

    file(orj2ee.jarinsomeJDKversions)locatedin:

    $JAVA_EE_SDK/lib/javaee.jar

    Figure312.Addinglibrariesstep1.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    39/56

    352BRealTimeVoiceTranslator

    Figure313.Addinglibrariesstep2.

    3.7.3 Create the main procedure in the Servlet class

    ItisneededtocreateanemptypublicvoidmainprocedureintheHTTPServletclass,dueifitis

    not present, eclipse will not compile the class using the Java perspective. We can use the

    JavaEE perspective to create an HTTPServlet (it would be better) instead, but it is more

    complextounderstandhowthisperspectiveworks.IfweusetheJavaPerspectivewithclasses

    withMainprocedure,wecanworkwithEclipselikewewerewritinganormalJavaapplication

    and,later,copythefilesmanuallyintheTomcatwebserverdirectorystructure.

    3.7.4 Copy the binary files to the server

    Once,youhavecompiledthejavacode(chooseRunJavaApplication ifyouareasked,not

    Run on Server) it is need to copy the generated class file to the server (adela) inside the

    ApacheTomcat webappsdirectory. Thedestinationpathfortheclassfileshouldbe:

    $TOMCAT_DIR/webapps/{my_web_application}/WEB-INF/classes

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    40/56

    36 RealTimeSpeechTranslator

    IfyouhaveusedsomeexternalAPInot includedneither intheJavaStandardSDKnor inthe

    JavaEESDK,youshouldincludetheJARfilewiththeAPIpackagesinthefollowingpath:

    $TOMCAT_DIR/webapps/{my_web_application}/WEB-INF/lib

    You should restart the server every time you copy a new file or if an existing file has been

    modified.

    3.7.5 Tomcat application configuration

    YouneedtoconfigureyourTomcatApplicationtoallowthenewservletbereachableforthe

    users.Youneedtomodifytheweb.xmlfilewhichisat

    $TOMCAT_DIR/webapps/{my_web_application}/WEB-INF

    Addingthefollowinglines(itisusedtheTransaltorServletconfigurationasexample):

    TranslatorServlet

    TranslatorServlet

    TranslatorServlet

    /translator.vxml

    Figure314.Linestobeaddedintheapplicationsweb.xmlfile.

    Whereeachtagmeans:

    Tag Description

    In this section we define internal configuration (used inside Tomcat

    server)foreachservlet.

    In this section we define the relative URL where each servlet will be

    reachable.

    Specifies the name it is known a servlet inside the web.xml file. It is

    definedwiththissametaginsidethesection.

    Inthesectionitdefinestheclassfilewhichisassociatedwitha

    servletname.

    Definestherelativeurlforaspecifiedservletwheretheservletwillbe

    reachable.Itisusedinsidethesection. Thevalueof

    this tag will be /{servlet_relative_url}, which means that the servlet

    willbereachableat:

    http://myserver.com/{myWebApplication}/{servlet_relative_url}

    Table317.Web.xmlfiletags.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    41/56

    372BRealTimeVoiceTranslator

    IftheservlethavetoprovideaVXMLoutputtobeinterpretedbytheVoiceEnabler(VE),asis

    thecaseofallVoiceXMLapplications;thehastoendwith.vxml,becausethe

    VEonlyisabletoreadthisfileformat.Althoughtheoutputoftheservletwouldbethesame

    VXMLcode,ifthedontendwith.vxmltheVEwillnotprocessit.

    Youshouldrestarttheservereverytimetheweb.xml ismodifiedbeforethechangestake

    effect.

    Once it has been done, the new servlet will be reachable at the following URL (in this

    example):

    http://myserver.com/{myWebApplication}/translator.vxml

    TheVoiceXMLapplicationsIhavecreatedsofar,needtheTomcatserverwererestartedfrom

    the $CATALINA_BASE directory. It is because the servlets need to know the absolute path

    wheretherearelocatedtheirbinaryfiles.Maybethisissuecouldbesolvedinthefuture.

    Therefore,runtherestartexecutablefilefromthepath$CATALINA_BASE,bydoing:

    cd /opt/apache-tomcat-6.0.16

    Andonceyouareinthispathyouhavetoexecutethefollowingcommand:

    ./bin/restart.sh

    3.7.6 Configuring VE to use the VXML Application generated by theservlet

    At thispoint, itonlyremains tomodify theVEconfigurationfiles tomakeavailable thenew

    VoiceXMLapplicationinaspecifiedphonenumber.Whenitwasdoneandtheusercallstothe

    assignedphonenumber,VEwilllaunchourVoiceApplicationtoattendhim.

    To assign a phone number to the Voice Application generated by the servlet, you should

    modify theDefaultInboundCCXML.jsp VE configuration file. It could be found in the

    followingpathinadelaserver:

    /opt/WebSphere/AppServer/installedApps/adela/

    WVX5.1-adela.ear/wvxweb.war/DefaultInboundCCXML.jsp

    Insidethisfileyoushouldfindthetextblockshowedin

    Figure45.Youhavetoaddanewlineattheendofthisblock,followingtheexamplesofthe

    otherlines.InthenewlineyoushouldsetanavailablephonenumberandlinkitwiththeURL

    wheretheVXMLfilewhichcontainstheVoiceApplicationcanbefound.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    42/56

    38 RealTimeSpeechTranslator

    var number = extractNumber(evt.connection.local);

    if(!isNaN(number)) switch(number){

    case 700: vxml_app = "DefaultVXML.jsp"; break;

    case 701: vxml_app = "http://localhost/vxml/identity.vxml"; break;

    case 702: vxml_app = "http://localhost/vxml/mexicanRestaurant_v2.vxml"; break;

    case 703: vxml_app = "http://localhost/vxml/cvutCallCenter.vxml"; break;

    case 704: vxml_app = "http://localhost:8080/myservlets/dynamicVXML_hour.vxml";

    break;

    case 705: vxml_app = "http://localhost:8080/myservlets/translator.vxml"; break;

    case 706: vxml_app = "http://localhost/vxml/mexicanRestaurant_v2_EN.vxml"; break;

    case 777: vxml_app = "http://localhost/vxml/icecream_alt.vxml"; break;

    default: vxml_app = "http://localhost:8080/vxmlide/file?phone=" + number; break;

    }

    ]]>

    Figure315.TextblocktobemodifiedinDefaultInboundCCXML.jspfile.

    3.8ApplicationschemaIfyouwanttounderstandthewayIhavefollowedtoaccomplishtheaimofRealTimeSpeech

    TranslatorProject,itwouldbereallyinterestingtohaveaclearideaofthedifferentpartsthis

    big application is made up. Knowing this you will be able to understand de different step

    explainedinsection4.8Achievedgoalsbecauseeachofthemhasthepurposetoaccomplish

    one of these tasks. In Figure 46, you can see the whole schema of Real Time Speech

    Applicationanditsmainparts.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    43/56

    392BRealTimeVoiceTranslator

    Figure316.RealTimeSpeechTranslatorApplicationschema.

    3.9TextToSpeechandSpeechRecognitionThis point is where we found one of the biggest problems to cope with during the

    developmentofthisproject.Whenwebeginthisproject,wethoughttheSpeechRecognition

    Enginetransformedinternallythespeechtotextandthenitcomparestheobtainedtextwith

    thetextwrittenintheVXMLcode.Thisassumptionwaswrongandifwethoughtthis,wasdue

    our lack of knowledge with this technology. Actually, the Speech recognition is done by

    convertingthewrittentextintheVXMLfileintoaudioandthencomparingthiswiththeaudio

    comingfromthecaller.Therefore,itisneededtohaveagrammarfilewhichcontainsallthe

    wordsandsentencesyouwantyourapplicationwereabletorecognize.Thisisabigobstacle

    we have not solved yet, because for our purposes we need to have a full recognition of

    whateverthecallersaysanditisalmostimpossibletogeneratesuchabiggrammar.

    3.10AchievedgoalsIn the following subsections I am going toexplain the aims achieved in the RealTime Voice

    Applicationsofar.Asitisaverycomplexproject,Ihavedivideditinseveralsimplerparts.Each

    of these parts has the proposal of solving one of the requisites or implements one of the

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    44/56

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    45/56

    412BRealTimeVoiceTranslator

    3.10.4 Generate dynamic VXML code

    Untilnowwehavedevelopedseveraldifferentvoiceapplications,butalltheyarestaticvoice

    applications.Itmeansthatallthecallflowhavetobepredictedintheimplementationtime.It

    alsomeansthatalltheinformationtheycansupplyandtheoptionstheyoffertothecallerhas

    to be known at the development time. Once the application is running, there are not any

    chance to do anything it was not being considered when the programmer wrote the

    application. This is because VoiceXML applications are based in static files written in VXML

    language.

    ThisissueisabigobstacletocreatemanyVoiceApplicationswhichwouldbeabletoprovide

    veryinterestingservicesthatarenotaccessibleorknownatdevelopmenttime.Inthecaseof

    Real Time Voice Application, the developer will know neither which sentence will the caller

    wanttotranslatenorthetranslatedsentencetoreturn.

    Forthisreason,ournextstepwasgettingtheknowledgetocreatedynamicvoiceapplications.

    Toachieveit,Idoadeepresearchwithinmanydifferenttechnologiestofindoutwhichones

    weneededforourpurposesandwhichtoolsweneedtouse.Afterthisresearch,Icreatedthe

    DynamicTimeApplication,ourfirstDynamicVoiceApplication.Thisisasimpleapplicationthat

    saysthecurrenttimetothecaller.Fordoingit,theapplicationgetsthecurrenttimefromthe

    machineandgeneratesontheflytheVXMLfilewhichwillbesenttotheVEtobeinterpreted

    andfinallybeplayedtothecaller.

    Figure317.DynamicTimeApplicationschema.

    3.10.5 Make queries to websites and process the response

    Once we have the capability to create dynamic voice applications, we have to use this

    knowledge for going forward to our aims in the project. To do this, next step is having the

    capability to obtain data form a website. This would be very interesting for a big range of

    applications that were based on web services. Some examples would be Real Time Voice

    Translator, which needs to get the translated sentences from web translation services;

    WikipediaVoiceProject,whichenablestogetthe informationfromWikipediaandplay iton

    yourmobilephone;theWeatherForecastApplicationwhichgetstheweatherforecastforma

    websiteandplaysitbacktotheuser;

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    46/56

    42 RealTimeSpeechTranslator

    ToachievethisaimIcreatedtheGenericPostservlet.Thisisaservletthatimplementsallthe

    stepsnecessarytocarryoutaVoiceApplicationbasedonwebservices.Thisservletisableto

    makeaquerytoanywebsitewithindependenceofwhateverHTTPqueriesituses:HTTPPOST

    or HTTP GET. GenericPost servlet makes a query to the website, gets the data from it, and

    forwardsittotheuserswebbrowser.InourVoiceApplicationswewillusethissameschema

    butwewillforwardthedataobtainedfromthewebsitetotheVEinsteadtoawebbrowser.

    Figure318.SchemaforaVoiceApplicationwhichneedsgettingdatafromawebsite.

    3.10.6 Make queries to Google Translator

    At this point, we were very close to one of the main aims of our Project: given a sentence,

    translate it and plays it back to the user. To accomplish this goal, we had to modify the

    GenericPost servlet to make queries to Google Translate service and integrate it in a Voice

    Applicationwhichplaysthetranslatedsentencetotheuser.

    Itappearedtobeasimplystep,butaftermodifyingtheGenericPostservletforourpurposes

    Google Translatedenied us theaccess to itsservice. After some research, I findout that all

    GoogleServicesdeniesaccesstotheirservicestoallautomaticapplications.Theyonlyaccepthuman users to avoid DoS attacks. To avoid this handicap, I modified the behavior of

    GenericPost servlet to emulate the behavior of a human user which uses a Mozilla web

    browser. I change the HTTP header our application uses to connect with the server for

    cheatingit.Inthisway,nowwewereabletoaccesstoallGooglewebservices.

    Despite, at this point, we already had a manner to get the translated sentences using the

    GenericPostservlet,IcontinuemyresearchandIfindoutabetterwaytodothis.Ifindouta

    betaAPIforJavathatenablestouseGoogleTranslateServices,withoutdoingHTTPqueries.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    47/56

    432BRealTimeVoiceTranslator

    3.10.6.1GoogleTranslatorAPIGoogleTranslatorAPIforJavaisanAPIforJavaprogramminglanguagedevelopedforGoogle

    to enable Java applications to use their services. Although it is currently a beta version, the

    testsIdoneshowedthatitworkswellforourpurposes.UsingthisAPIisabetterwaytoaccess

    GoogleTranslatorservice,because itsaveus toprocess thereturnedHTML code toget the

    translatedsentenceanddroptherestofHTMLcodethatgeneratesthevisualinterfaceofthe

    page.Anotheradvantageisthatwedonotdependontheinterfacechangesdoneintheweb

    inthefuture,whichwouldmodifytheHTMLcodeandwouldforceustochangethealgorithms

    weusetoprocessit.

    3.10.7 Get dynamic data from a VoiceXML Application

    Althoughwehavethewaytogetdatafromanywebserviceandplayitbacktotheuser,we

    alsoneedtoknowthewayfor integrating it inawholeVoiceApplication. Itmeanstoknow

    howtolaunchourservlets,whichgetthisdataandreturnittotheuser,fromtheVXMLcode

    of a Voice Application. For this purpose we decided to use the DATA tag of VXML language

    whichenablesaVoiceApplicationtogetdatafromawebservicethroughaHTTPquery.

    Therefore,weneededtomakeourservletsbereachableforthistag.Torealizethis,Imodified

    allservletsdoneup tonow towereabletoreceive theconfigurationparameterstheyneed

    fromaHTTPquery(HTTPGETorHTTPPOST).Beforedoingthis,theygettheparametersfrom

    a configuration file. With this modification, Translator Application, for example, is able to

    receive the sentence to be translated and the original and destination languages through a

    HTTPquery.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    48/56

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    49/56

    453BConclusions

    4 ConclusionsWhenwebeganthisprojectseveralmonthsago, itwasabigchallengeforallourteam.For

    me,itwasthefirsttimeIusesomeofthesetechnologies,likeVoiceXML.Weallhavehadtodo

    a hard work to learn how to manage them and reach our aims. It has been a very nice

    experience for me, and I have had the possibility to do a true team work with other very

    qualified people. This has permitted us to benefit each other and, in my case, improve my

    personalskillsquickly.

    RegardingtheRealTimeVoiceTranslator,weachieved importantgoals.Although Ihavenot

    reach the final aim of creating an application which was able to translate in real time a

    conversationbetweentwocallers,Ihavedonemostofthenecessarystepstoachieveit.Itisa

    verycomplexProjectandrequiresalotofintermediatesteps,whichIwasnotabletoimagine

    atthebeginningduemylackofknowledgeonthetechnologiesinvolvedin.Despitethis,Ithink

    that linking theachievedmidstepsandaddingfewmore features Icouldreachthegoalwe

    proposed at the beginning. I also think that, with the knowledge acquired during this time,

    achieve the new aims would be more easy and quickly due to the most difficult parts have

    alreadybeendone.

    ForafutureevolutionofmyworkontheRealTimeVoiceTranslatorProject,Iproposetodo

    thefollowingstepstosolvetheproblemsIhavefoundortoaddnewfeatures:

    CreateaMultilanguageapplication,i.e.usedifferentlanguagesinasameVXMLfile.It

    willbenecessaryforexampletogivethecallerthechancetochoosehislanguageand,

    afterhischoice,usetheselectedlanguagetospeakwithhim.

    Findouthowto launchaVoiceApplicationfromanotherVoiceApplication.Itwillbe

    necessarytousethisinmanysituations.Forexample,wecouldcreateanapplication

    whichgivesthechancetothecallertochoosewhichapplicationhewantstouse,and

    afterit,redirectthecalltothechosenapplication.

    Findoutthewaytodoacyclicapplication,i.e.anapplicationthatneverendsuntilthe

    userwants.ItwillbenecessarytoimplementthisfeatureinmanyVoiceApplications.

    Forexample, itwouldbenecessary intheRealTimeSpeechTranslatortobeableto

    translate all the sentences the user says indefinitely until he wants to finish the

    conversation.

    TrydifferentwaystogetdataintoaVoiceXMLApplication.Nowwehaveonlytested

    theDATAtag,butitwouldbenecessarytotryotheroptionstoknowthebenefitsof

    eachoneandknowwhenitisbettertouseoneorother.

    Find out an efficient algorithm to process the HTML code and get the useful data

    withinitforourpurposes.

    Find theways to improve thewaiting timewhenweperformaquery toawebsite.

    Maybe it would be possible to use a multithreading solution to make simultaneous

    queriestoseveralwebsitesatthesametimeoruseabuffertostoretheresponses.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    50/56

    46 RealTimeSpeechTranslator

    Solve the grammar problem to achieve full conversation recognition without the

    restriction of a set of words fixed in advance as it is the case of the current

    applications.

    Duringthedevelopmentofthisprojectwehavefacedalotofobstaclesthatwerecomingup

    andweturnedoutsuccessful inmostofthem. Ithink thehardestpartof thewayhasbeen

    doneandnowwehavethechance tocontinue improvingourVoiceApplicationsandcreate

    newandevenmorecomplexonesapplyingtheknowledgewehaveobtaineduptonow.

    I think that VoiceXML technology has an amazing future and there are a wide range of

    applications where it could be used. ACC team has demonstrated its good skills and it is

    carryingoutveryambitiousprojects;henceIthinkithasagoodchancetobecomeareference

    TeaminVoiceApplicationsdevelopment.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    51/56

    474BAPPENDIXA:Definitions

    5 APPENDIXA:DefinitionsPrivateBranch eXchange (PBX)

    Private Branch eXchange (PBX) is a telephone exchange that serves a particular business or

    office, as opposed to one that a common carrier or telephone company operates for many

    businessesorforthegeneralpublic.PBXsarealsoreferredtoas:

    PABX PrivateAutomaticBrancheXchange.

    EPABX ElectronicPrivateAutomaticBrancheXchange.

    TrunkLine

    When dealing with a PBX, trunk lines are the phone lines coming into the PBX from the

    telephoneprovide

    SessionInitiationProtocol(SIP)

    The Session Initiation Protocol (SIP) is an applicationlayer control (signaling) protocol for

    creating,modifying,andterminatingsessionswithoneormoreparticipants.Itcanbeusedto

    create twoparty, multiparty, or multicast sessions that include Internet telephone calls,

    multimedia distribution, and multimedia conferences (RFC 3261). SIP is designed to be

    independentoftheunderlyingtransportlayer;itcanrunonTCP,UDP,orSCTP.

    Media Resource ControlProtocol(MRCP)

    MRCP is the Media Resource Control Protocol proposed to the IETF. It is a communication

    protocol which allows speech servers to provide various speech services (such as speech

    recognitionandspeechsynthesis)toitsclients.Typically,thismeanstheserversoftwarewill

    be running on one computer and clients can send MRCP messages to the server over a

    network,usuallyontopofanotherprotocol,suchasRTSPorTCP.Therearetwoversionsof

    theMRCPprotocol.Version2usesSIPasthecontrolprotocol,whereasversion1usesRTSP.

    Time StreamingProtocol(RTSP)

    TheRealTimeStreamingProtocol(RTSP),developedbythe IETFandcreated in1998asRFC

    2326, is a protocol for use in streaming media systems which allows a client to remotely

    controlastreamingmediaserver,issuingVCRlikecommandssuchas"play"and"pause",and

    allowingtimebasedaccesstofilesonaserver.

    VoiceXML(VXML)

    VoiceXML(VXML)istheW3C'sstandardXMLformatforspecifyinginteractivevoicedialogues

    betweenahumanandacomputer.Itallowsvoiceapplicationstobedevelopedanddeployed

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    52/56

    48 RealTimeSpeechTranslator

    inananalogouswaytoHTMLforvisualapplications.JustasHTMLdocumentsareinterpreted

    byavisualwebbrowser,VoiceXMLdocumentsareinterpretedbyavoicebrowser.Acommon

    architecture istodeploybanksofvoicebrowsersattachedto thepublicswitchedtelephone

    network(PSTN)sothatuserscanuseatelephonetointeractwithvoiceapplications.

    CallControleXtensibleMarkup Language (CCXML)Call Control eXtensibleMarkup Language (CCXML) is anXML standard designed to provide

    telephonysupporttoVoiceXML.ItscurrentstatusisaW3CWorkingDraft,adopted19January

    2007.WhereasVoiceXML isdesigned toprovideaVoiceUser Interface toavoicebrowser,

    CCXML isdesigned to inform thevoicebrowserhow tohandle the telephonycontrolof the

    voicechannel.The twoXMLapplicationsarewholly separateandarenot requiredbyeach

    othertobeimplemented.

    OnecriticalthingtounderstandisthatCCXMLisnotamedia/dialoglanguagelikeVoiceXML.It

    only provides support tomove calls around and connect them to dialog resources. CCXML

    doesnotprovideanydialog resourceson itsown. (Note:Adialog resource isanything that

    interactswith a caller via voice, such as a VoiceXML platform or even a second caller at

    anotherlocation.)

    VoicebrowserAvoicebrowserisawebbrowserthatpresentsaninteractivevoiceuserinterfacetotheuser.

    Inaddition,ittypicallyprovidesaninterfacetothePSTNoraPBX.Justasavisualwebbrowser

    workswith HTML pages, a voice browser operates on pages that specify voice dialogues.

    Typically these pages are written in VoiceXML, the W3C's standard voice dialog markup

    language,butotherproprietaryvoicedialoguelanguagesremaininuse.

    Avoicebrowserpresentsinformationaurally,usingprerecordedaudiofileplaybackorusing

    texttospeech software to render textual information as audio. A voice browser obtains

    informationusingspeechrecognitionandkeypadentry(e.g.,DTMFdetection).

    Asspeechrecognitionandwebtechnologieshavematuredoverthepastdecade,thousandsof

    voice applications have been deployed commercially and voice browsers are supplanting

    traditionalproprietary IVRsystems.Scoresofcompaniesprovidevoicebrowsers.These take

    theformofsoftware,packagedhardware/softwaresolutionsorhostedsolutions.

    VoiceEnablerHardwareandsoftwarethatenablesPCtoPCvoicecommunicationsorPCtotelephonelong

    distancecommunications.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    53/56

    494BAPPENDIXA:Definitions

    VoiceServer

    Its the combinationof hardware and software that enables voicecommunicationsbetween

    users. It can manage simultaneously many calls flows to make possible the communication

    betweenoneuserandhisspeakers.

    CallXML

    VoxeoproprietaryCallControlLanguage.ItsaimisthesameasCCXML.

    InteractiveVoiceResponse(IVR)

    Intelephony,interactivevoiceresponse,orIVR,isaphonetechnologythatallowsacomputer

    todetectvoiceandtouchtonesusinganormalphonecall.TheIVRsystemcanrespondwith

    prerecordedordynamicallygeneratedaudiotofurtherdirectcallersonhowtoproceed.IVR

    systemscanbeusedtocontrolalmostanyfunctionwheretheinterfacecanbebrokendown

    into a series of simple menu choices. Once constructed IVR systems generally scale well to

    handlelargecallvolumes.

    Dual tonemultifrequency(DTMF)

    Dualtonemultifrequency(DTMF)signalingisusedfortelephonesignalingoverthelineinthe

    voicefrequency band to the call switching center. The version of DTMF used for telephone

    tone dialing is known by the trademarked term TouchTone, and is standardised by ITUT

    RecommendationQ.23.

    Speechrecognition

    Speechrecognition(inmanycontextsalsoknownasautomaticspeechrecognition,computer

    speechrecognitionorerroneouslyasvoicerecognition)istheprocessofconvertingaspeech

    signal to a sequence of words in the form of digital data, by means of an algorithm

    implementedasacomputerprogram.

    Hidden Markovmodel(HMM) basedspeechrecognition

    ModerngeneralpurposespeechrecognitionsystemsaregenerallybasedonHMMs.Theseare

    statisticalmodelswhichoutputasequenceofsymbolsorquantities.Onepossiblereasonwhy

    HMMsareusedinspeechrecognitionisthataspeechsignalcouldbeviewedasapiecewise

    stationarysignalorashorttimestationarysignal.Thatis,onecouldassumeinashorttimein

    therangeof10milliseconds,speechcouldbeapproximatedasastationaryprocess.Speech

    couldthusbethoughtasaMarkovmodelformanystochasticprocesses(knownasstates).

    Dynamictimewarping (DTW) basedspeechrecognition

    Dynamictimewarpingisanapproachthatwashistoricallyusedforspeechrecognitionbuthas

    now largely been displaced by the more successful HMMbased approach. Dynamic time

    http://en.wikipedia.org/wiki/Hidden_Markov_Modelhttp://en.wikipedia.org/wiki/Stationary_processhttp://en.wikipedia.org/wiki/Stationary_processhttp://en.wikipedia.org/wiki/Stationary_processhttp://en.wikipedia.org/wiki/Markov_modelhttp://en.wikipedia.org/wiki/Markov_modelhttp://en.wikipedia.org/wiki/Markov_modelhttp://en.wikipedia.org/wiki/Markov_modelhttp://en.wikipedia.org/wiki/Stationary_processhttp://en.wikipedia.org/wiki/Hidden_Markov_Model
  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    54/56

    50 RealTimeSpeechTranslator

    warping is an algorithm for measuring similarity between two sequences which may vary in

    timeorspeed.Forinstance,similaritiesinwalkingpatternswouldbedetected,evenifinone

    videothepersonwaswalkingslowlyandifinanothertheywerewalkingmorequickly,oreven

    iftherewereaccelerationsanddecelerationsduringthecourseofoneobservation.DTWhas

    been applied to video, audio, and graphics indeed, any data which can be turned into a

    linearrepresentationcanbeanalyzedwithDTW.

    StateChartXML(SCXML)

    SCXMLstands forStateChartXML :State MachineNotation forControl Abstraction. It is an

    XMLlanguagewhichprovidesagenericstatemachinebasedexecutionenvironmentbasedon

    Harelstatecharts.

    SCXML is able to describe complex statemachines. For example, it is possible to describe

    notionssuchassubstates,parallelstates,synchronization,orconcurrency,inSCXML.

    ExtensibleStylesheetLanguage Transformations(XSLT)

    ExtensibleStylesheetLanguageTransformations(XSLT)isanXMLbasedlanguageusedforthe

    transformation of XML documents into other XML or "humanreadable" documents. The

    originaldocumentisnotchanged;rather,anewdocumentiscreatedbasedonthecontentof

    anexistingone.[2]Thenewdocumentmaybeserialized(output)bytheprocessorinstandard

    XMLsyntax or inanother format, such as HTML or plain text.[3] XSLT is most oftenused to

    convert data between different XML schemas or to convert XML data into HTML or XHTML

    documentsforwebpages,creatingadynamicwebpage,orintoanintermediateXMLformatthatcanbeconvertedtoPDFdocuments.

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    55/56

    515BAPPENDIXB:Acronyms

    6 APPENDIXB:Acronyms

    VXML

    VoiceXML(VXML)

    CCXML

    CallControleXtensibleMarkupLanguage

    IVR

    InteractiveVoiceResponse

    DNIS

    Dialednumberinformationservice

    DTMF

    Dualtonemultifrequency

    HMM

    HiddenMarkovmodel

    DTW

    Dynamictimewarping

    CCXML

    CallControleXtensibleMarkupLanguage

    VoiceXML

    VoiceeXtensibleMarkupLanguage

    SRGS

    SpeechRecognitionGrammarSpecification

    SISR

    SemanticInterpretationforSpeechRecognition

    SSML

    SpeechSynthesisMarkupLanguage

  • 8/6/2019 20081001 Real Time Speech Translator Report Xavier Garcia Cabrera (1)

    56/56

    52 RealTimeSpeechTranslator

    PLS

    PronunciationLexiconSpecification

    ECMAScript

    Scripting

    language

    supported

    by

    most

    v