32
June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in Nepal Price €10.00 / US$10.00 ISSN 1649-2358 Localisation Focus THE INTERNATIONAL JOURNAL OF LOCALISATION Issue sponsored by

VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

  • Upload
    others

  • View
    2

  • Download
    1

Embed Size (px)

Citation preview

Page 1: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

June 2006VOL. 5 Issue 2

Using Web Servicesfor Translation

Formatting and theTranslator: Why XLIFF Does Matter

Localisation in Nepal

Price €10.00 / US$10.00ISSN 1649-2358

Localisation FocusTHE INTERNATIONAL JOURNAL OF LOCALISATION

Issue sponsored by

Page 2: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

Editorial BoardAFRICAKim Wallmach, Lecturer in Translation and Interpreting, University of South Africa, Pretoria, South Africa; Translator and ProjectManager

ASIAPatrick Hall, Emeritus Professor of Computer Science, Open University, UK; Project Director, Bhasha Sanchar, Madan Puraskar Pustakalaya, NepalSarmad Hussain, Professor and Head of the Center for Research in Urdu Language Processing, NUCES, Lahore, PakistanOm Vikas, Director of the Indian Institute of Information Technology and Management (IIITM), Gwalior, Madhya-Pradesh, India

AUSTRALIA and NEW ZEALANDJames M. Hogan, Senior Lecturer in Software Engineering, Queensland University of Technology, Brisbane, Australia

EUROPEBert Esselink, Solutions Manager, Lionbridge Technologies, Netherlands; authorSharon O'Brien, Lecturer in Translation Studies, Dublin City University, Dublin, IrelandMaeve Olohan, Programme Director of MA in Translation Studies, University of Manchester, Manchester, UKPat O'Sullivan, Test Architect, IBM Dublin Software Laboratory, Dublin, IrelandAnthony Pym, Director of Translation- and Localisation-related Postgraduate Programmes at the Universitat Rovira I Virgili, Tarragona, SpainHarold Somers, Professor of Language Engineering, University of Manchester, Manchester, UKMarcel Thelen, Lecturer in Translation and Terminology, Zuyd University, Maastricht, NetherlandsGregor Thurmair, Head of Development, linguatec language technology GmbH, Munich, GermanyAngelika Zerfass, Freelance Consultant and Trainer for Translation Tools and Related Processes; part-time Lecturer, University ofBonn, Germany

NORTH AMERICATim Altanero, Associate Professor of Foreign Languages, Austin Community College, Texas, USADonald Barabé, Vice President, Professional Services, Canadian Government Translation Bureau, CanadaLynne Bowker, Associate Professor, School of Translation and Interpretation, University of Ottawa, CanadaCarla DiFranco, Programme Manager, Windows Division, Microsoft, USADebbie Folaron, Assistant Professor of Translation and Localisation, Concordia University, Montreal, Quebec, CanadaLisa Moore, Chair of the Unicode Technical Committee, and IM Products Globalisation Manager, IBM, California, USASue Ellen Wright, Lecturer in Translation, Kent State University, Ohio, USA

SOUTH AMERICATeddy Bengtsson, CEO of Idea Factory Languages Inc., Buenos Aires, ArgentinaJosé Eduardo De Lucca, Co-ordinator of Centro GeNESS and Lecturer at Universidade Federal de Santa Catarina, Brazil

Publisher InformationEditor: Reinhard Schäler, Director, Localisation Research Centre, University of Limerick, Limerick, IrelandProduction Editor: Thomas Keogan, Technical Writer, Localisation Research Centre, University of Limerick, Limerick, IrelandPublished by: Localisation Research Centre, CSIS Department, University of Limerick, Limerick, Ireland

Aims & ScopeLocalisation Focus – The International Journal of Localisation provides a forum for localisation professionals and researchers to discuss and present their localisation-related work,covering all aspects of this multi-disciplinary field, including software engineering, tools and technology development, cultural aspects, translation studies, project management, workflow andprocess automation, education and training, and details of new developments in the localisation industry. Proposed contributions are peer-reviewed thereby ensuring a high standard ofpublished material. Localisation Focus is distributed worldwide to libraries and localisation professionals, including engineers, managers, trainers, linguists, researchers and students.Indexed on a number of databases, this journal affords contributors increased recognition for their work. Localisation-related papers, articles, reviews, perspectives, insights andcorrespondence are all welcome.

Subscription: To subscribe to Localisation Focus – The International Journal of Localisation visit www.localisationshop.com (subscriptions tab). For more information visitwww.localisation.ie/lf

Members of The Institute of Localisation Professionals (TILP) receive Localisation Focus – The International Journal of Localisation as part of their membership benefits. Membershipapplications can be filed electronically from www.tilponline.org Change of address details should be sent to [email protected]

Sponsorship and Advertising: To advertise in or to sponsor an issue of Localisation Focus-The International Journal of Localisation contact [email protected]

Copyright: © 2006 Localisation Research CentrePermission is granted to quote from this journal with the customary acknowledgement of the source.Opinions expressed by individual authors do not necessarily reflect those of the LRC or the editor.

Localisation Focus – The International Journal of Localisation (ISSN 1649-2358) is published and distributed quarterly and has been published since 1996 by the Localisation ResearchCentre, University of Limerick, Limerick, Ireland. Articles are peer reviewed and indexed by major scientific research services.

Page 3: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

Page 3

CONTENTS / editorial

Close-upIGNITE Project 4

LocalisationLRC Awards 2006 6

WhitepaperUsing Web Services for Translation 7Kevin Bargary and Peter Reynolds

StandardsFormatting and the Translator : Why XLIFF Does Matter 14Ignacio Garcia

Country FocusLocalisation in Nepal 21Pat Hall

TILPOpportunity to Influence YourProfessional Organisation 26Angela Starkmann-Lehr

ResourcesLOTS Distribution and GILC 27Reinhard Schäler

Personal ProfileThe Best of Both Worlds — Languagesand Computers 28Angelika Zerfass

Organisation News 30

CONTENTS

Welcome to the newly designedLocalisation Focus – TheInternational Journal of

Localisation! With the new title page and theimproved layout, we wanted to add visualimpact to our efforts to provide thelocalisation community with a high-qualityinternational journal publishing peer-reviewed articles that are now indexed bymajor international scientific services –thanks to the combined efforts of ourEditorial Board, editors, designers, and peer-reviewers. The next steps in the developmentof our journal will bring you a greater coverage of activities of theGlobal Initiative for Local Computing (GILC), a dedicated reviewsection (books, events, courses), and an improved regionaldistribution network for Asia, Africa and South America.

This year marks the 10th Anniversary of Localisation Focus. Tocelebrate this occasion, Thomas Keogan of the LRC has prepared aspecial, limited edition of all issues published over that period whichcan be ordered directly from the LRC and online atwww.localisationshop.com.

The Localisation Research Centre (LRC) is continuing to work withpartner organisations globally to develop and facilitate local computing.In June 2003, the LRC established the Localisation Tools andTechnology Laboratory and Showcase (LOTS) to facilitate easy accessto state-of-the-art localisation tools and technologies for practitioners,researchers, teachers and students. This effort was funded by theEuropean Union eContent Programme and supported by the localisationtools industry. On 19 April 2006, the LRC launched the GILC-LOTSSatellite Distribution at the International Symposium on ICT for RuralDevelopment in Kuching, Malaysia, and I handed the first copy of thisdistribution to Prof. Dr. Abdul Rashid Abdullah, the vice chancellor ofthe Univerisiti Malaysia Sarawak, following the signing of aMemorandum of Understanding between the LRC and that university.This was made possible, to no small degree, thanks to the efforts of Dr.Alvin Yeo of the University of Malaysia. I also would like to thankMichael Bourke of the LRC, as well as Alchemy, CatsCradle, PASS,Project Open, Aquino and SDL/TRADOS who have contributed theirtools free-of-charge to this distribution. The GILC-LOTS SatelliteDistribution, with each installation worth tens of thousands of euros,will be rolled out by the LRC free-of-charge initially to ten partneruniversities worldwide.

This year you will be able to meet us at a number of internationalevents: following the IGNITE Project's Working Conference in May,the LRC will be represented at Localisation World Barcelona (31 May-01 June), our 6th Internationalisation and Localisation Summer Schoolin Limerick (12-15 June), the CTTT's Technology for TranslationTeachers intensive training seminar in Istanbul (3-7 July), the 2ndIATIS Conference in Cape Town (12-14 July), LRC XI in Dublin (25-26 October), and ASLIB 28 in London (16-17 November).

I hope to meet you at some of these events. In the meantime, I hope youenjoy the new look of Localisation Focus – The International Journal ofLocalisation.

Reinhard Schäler

From the Editor . . .

JUNE 2006

Prof. Dr. Abdul Rashid Abdullah, vice chancellor of UniversitiMalaysia Sarawak, and Reinhard Schäler, director of theLocalisation Research Centre at the University of Limerick, sign aMemorandum of Understanding that will establish closer linksbetween the two institutions to promote internationalisation – andlocalisation – related research and teaching.

Subscribe to Localisation Focus – The InternationalJournal of Localisation at www.localisationshop.com

To access this issue online go tohttp://www.localisation.ie/resources/locfocus/pdf.htmand click on "June 2006".User Name: locfocsub, Password: j0606

Page 4: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

Page 4

CLOSE-UP

JUNE 2006

IGNITE – The Localisation Factory

Do you need to localise more digital content within anever-decreasing time scale, without a decrease inquality and without an increase in your budget?

Have you asked yourself whether process automation couldhelp you to achieve this impossible target? Maybe you havetried to automate some of your processes already but foundout that the technology you used is not interoperable? Areyou now insisting on the implementation of the relevantstandards? Do you know whether your tools suppliers – oreven your in-house tools developers – comply with the mostcommon localisation standards?

These are hard questions for any digital publisher,localisation service provider, tools developer, standardsbody and researcher alike – and the answers are difficult tofind. However, finding the right answers to these difficultquestions is crucial and can tip the balance between successand failure for many of them.

This is why Archetypon, Pass Engineering, VeriTest,Vivendi Universal Games and the Localisation ResearchCentre have decided to collaborate and invest heavily in theIGNITE project, which is co-funded by the EuropeanUnion's eContent Programme.

Most experts agree that automation is the answer to today'slocalisation challenges. Yet, even basic pre-requisites forautomation are still missing:

l Instead of standard file formats, localisers are expectedto deal with an ever-increasing variety of source file formats; rather than concentrating on the development of new, improved localisation tools, they are spending too much time adjusting existing tools and technologiesto new and emerging file formats.

l There is little or no meta-data available in the source material sent to localisers which would enable them to streamline the localisation process, such as informationon individual strings (maximum length allowed), dialogues (space available for expansion of controls), linguistics (terminology, previous translations, controlled language), the product itself (product family,version) and the project on hand (status).

l Tools and technologies use their own internal, proprietary file format when dealing with digital contentto be localised. Such a file format is generally

incompatible with the format used by other tools; usersof these tools find it difficult to switch between them sothey are prevented from selecting the best tool for the job on hand.

The use of industry-wide standards could help. However,there are doubts about the usefulness of some of the existingstandards, there are no agreed procedures in place and thereis no independent organisation around that could validate thestandards themselves, verify their implementation, ensureinteroperability, and – most importantly – demonstrate theirusefulness in an automated localisation environment.

IGNITE will fill this gap. IGNITE collects linguisticresources, including authentic source material, tools andtechnologies, and standards, and uses these to build anautomated localisation environment, the LocalisationFactory, that will be used to both demonstrate and measurethe effectiveness of the use of standards in the localisationprocess.

The IGNITE project will prepare the ground for theestablishment of an independent operation that will validatelocalisation standards, verify their implementation andmeasure their effectiveness.

To find out more about IGNITE and to join the IGNITEContact Group visit: www.igniteweb.org and register forLRC XI – The Localisation Factory, the 11th AnnualInternationalisation and Localisation Conference organisedby the LRC on www.localisation.ie/conference.

How can you benefit?All the major stakeholders in localisation – contentdevelopers, service providers, technology developers andcertification experts – are represented in the IGNITE projectconsortium. However, to ensure the success of the projectand to fully realise the potential of IGNITE, we invite otherorganisations to actively participate in the project throughthe IGNITE Contact Group. IGNITE will also closelycooperate with the localisation industry's leadingrepresentative bodies, specifically the Globalisation andLocalisation Association (GALA), The Institute ofLocalisation Professionals (TILP), and the Organization forthe Advancement of Structured Information Standards(OASIS).

For organisations cooperating under the umbrella of

Linguistic Infrastructure for Localisation: Language Data, Tools and Standards

Page 5: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

CLOSE-UP

Page 5JUNE 2006

IGNITE, there are a number of distinct advantages, amongthem:

l Early access to localisation resources, infrastructure andtest harnesses

l Exposure through project literature and publicationsl Preferential invitation to Contact Group meetings and

workshops

If you or your organisation would like to join the ContactGroup or contribute to IGNITE in some other fashion,contact the Localisation Research Centre at [email protected].

The IGNITE ConsortiumThe LRC is the coordinating partner of the IGNITE project.The other partners, each representing one of the stakeholdersin localisation, are the following:

Archetypon S.A., founded in 1987, is an independentprivate-owned service and consulting company operating inthe Information and Communication Technologies (ICT)sector. The company is a strategic engineering serviceprovider to leading international companies.www.archetypon.com

Vivendi Universal Games is a global leader in multi-platform interactive entertainment. The company develops,publishes and distributes interactive products across allmajor platforms including PCs, video game consoles and theInternet. Vivendi Universal Games' portfolio ofdevelopment studios and publishing labels includes BlizzardEntertainment, Coktel, Fox Interactive, KnowledgeAdventure, Massive Entertainment and SierraEntertainment. www.vugames.ie

VeriTest is a division of Lionbridge Technologies, Inc., aprovider of globalisation solutions. VeriTest is the leadingprovider of comprehensive product testing, productcertification, and strategic QA consulting services to the ITdeveloper community. www.veritest.com

PASS Engineering GmbH was established in 1989 and isbased in Bonn, Germany. PASS Engineering's idea ofefficient localisation, integrated in the development process,has been realised in the creation of PASSOLO, the PASSSoftware Localizer, on the market since 1998. Today, PASSEngineering is one of the leading providers of softwarelocalisation tools. www.passolo.com

IGNITE – Background

On 28 April 2005, the European Union signed a contractcharging the Localisation Research Centre and fourindustry partners – Archetypon, Pass Engineering,VeriTest, and Vivendi Universal Games – with theestablishment of a linguistic infrastructure for localisationincorporating language data, tools and standards. TheIGNITE project will run over two years and close to 2.5million euro will be invested by the partners and theEuropean Commission to establish a central andaccessible repository of European linguistic resourcesstrongly connected to and supported by the localisationindustry.

Page 6: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

Page 6 JUNE 2006

LRC Awards 2006LRC Best Thesis

Award 2006

sponsored by Symantec Ireland Ltd.

This award aims to find the bestresearch thesis, on a localisation-related topic, delivered within the lasttwo years. The winner will receive aprize of €1000 plus one of Symantec'sprofessional retail products.

Students who have completed a thesison a relevant theme within the pasttwo years are invited to submit theirwork. Theses, which may besubmitted prior to the applicant'sdegree being awarded, will be judgedby a panel of renowned academic andindustry experts.

Visitwww.localisation.ie/resources/Awards/Thesis.htm for further details

LRC Best Scholar Award 2006

sponsored by con[text]

This award aims to find the bestresearch proposal, on a localisation-related topic, by a student currentlyundertaking (or about to undertake)postgraduate research in a third-levelinstitution. The winner will receive€500 that will help them in thecompletion of their research.

The winning entry will be chosenbased on the quality and relevance ofthe student's research proposal andwill be judged by a panel of renownedacademic and industry experts.

Visitwww.localisation.ie/resources/Awards/Scholar.htm for further details

LRC Best GlobalWebsite Award 2006

sponsored by Euro RSCG 4D

This award aims to highlight thecapabilities of a truly globalisedwebsite and show the increasedeffectiveness and profitability of thistype of site over non-globalised sites.A highly prestigious award, it aims toreward electronic publishers andmarketers that take advantage of theopportunities presented byglobalisation. Previous winnersinclude the BBC World Service andthe Swedish National Agency forSchool Improvement.

Submissions are invited from thosewho have designed and/or areoperating a multilingual website.Websites will be judged on a range ofcriteria by a panel of renownedacademic and industry experts.

Visitwww.bestglobalwebsiteaward.com for further details

Closing date for entries to all three awards is 01 August 2006

LOCALISATION.org

Page 7: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

Page 7JUNE 2006

WHITEPAPER.loc

Using Web Services for Translation

By Kevin Bargary and Peter Reynolds

AbstractWeb services use Internet technologies to allow computer-based systems to communicate and transfer data in a way that providesa more seamless automated workflow. At OASIS work is being done to create a standard way for Web services to be used withinthe translation and localisation industry. The purpose of this article is to inform you about this work, what Web services are, and tooutline a real-life case study showing how this technology is being put to use now. The article will deal in detail with thespecification proposed by the OASIS technical committee for Translation Web Services. It will also describe use cases where thisspecification can be put into practise such as the project implemented by the Localisation Research Centre as part of the IGNITEproject.

KeywordsOpen standards, Web services, translation, localisation life cycle, standards development, OASIS, localisation, localization, globalization,globalisation, Localisation Research Centre, LRC

1. Introduction

Web services is a solution to the problem ofcomputer systems not talking to each other. It usesInternet technologies to allow computer-based

systems to communicate and transfer data in a way thatprovides a more seamless automated workflow. Thelikelihood is that Web services will be adopted by companiesover the next few years to automate processes and integratesystems. Within the translation and localisation industry,however, there is work being done to create a standard wayfor Web services to be used. This work is being done atOASIS, which is the organisation for creating standards,particularly XML standards within the industry. The purposeof this article is to inform you about this work, what Webservices is, and to outline a real-life case study showing howthis technology is being put to use now.

The idea of creating a standard for the use of Web serviceswithin translation was put forward by Bill Looby of IBM atthe eLocalisation 2001 conference held in Limerick, Ireland.Mr. Looby delivered a paper which showed a vision of howthe industry could benefit from agreeing on a common wayof using this technology. The conference had also seen apractical demonstration of how this technology was alreadybeing used. Lionbridge (then Berlitz GlobalNET) gave ademonstration of work being done for the 2003 SpecialOlympics Web site, which it was sponsoring. Using Webservices, an XLIFF file was sent from the Web site toElcano, Lionbridge's online translation service, and back tothe Web site.

This conference ended with a small group of people gettingtogether to look at how they could progress with the idea ofusing Web services within the translation industry. Thesteering group that formed decided that OASIS would be thenatural home. OASIS was established in 1993 and it isfocussed on XML standards for the software industry.XLIFF (XML Localization Interchange File Format) wasalready being developed by an OASIS technical committee,and there was considerable support for this industry withinOASIS. The OASIS technical committee was formed at thebeginning of 2003 and its members include representativesfrom Oracle, Microsoft, IBM, Connect Global Solutions,thebigword, LISA, the LRC, and Lionbridge as well asindividual members.

2. Translation Web Services

2.1 What are Web Services?Before detailing the main features in the draft specificationfrom the Translation Web Services (TWS) technicalcommittee, we would like to give some background on Webservices. The World Wide Web is a collection of interlinkeddocuments that sits on the Internet, which is effectively ahuge computer network that connects individual Web sites.To access a Web site a person sits at a computer and viewspages through a Web browser – a process that can beconsidered as machine-to-person communication. With theadvent of Web services the Internet is used for machine-to-

Kevin Bargary Peter Reynolds

Page 8: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

Page 8 JUNE 2006

WHITEPAPER.loc

machine communication rather than machine-to-personcommunication. Protocols such as HTTP and standards suchas XML and SOAP (Simple Object Access Protocol) areused in Web services to enable this machine-to-machinecommunication. This allows different systems to worktogether, allowing for more powerful functionality andautomation.

A Web service might do something such as enable weatherforecasts to be queried by remote computers over theInternet. To do this the weather forecasting company wouldhave to create a Web service and allow it to be accessed.This is done using an XML document called a Web ServiceDefinition Language (WSDL) document. The WSDLdocument describes the services which the client applicationis allowed access to and describes what parameters will besent and received for each of these calls. A gardeningenthusiast who is away a lot might want his computer tocontrol his water sprinkler. By using Web services thecomputer will be able to find out when it is not raining andturn on the sprinkler. A protocol called Simple ObjectAccess Protocol (SOAP) is used for this.

2.2 What is Translation Web Services?Since January 2003 the Translation Web Services (TWS)technical committee has been working to create a standardway for Web services to be used in a multilingual context. Ithas concentrated its efforts on creating a standard relating tothe communication between publisher and vendorcompanies. At the simplest level this will allow fortranslation and other work to be sent by the publisher to thevendor and, once translated, sent back. The draftspecification covers the following areas:

l Service supportl Translation and request quotel Status, notification and deliveryl Reference filesl Security

3. Translation Web Services Specification

Each service in the TWS specification provides two formsfor interaction between client and vendor. These forms arerequest and response. For example, the client submits aretrieveServiceList request to the vendor. The vendorreceives this request, processes it, and then returns aretrieveServiceList response to the client. The request andresponse forms of a service expect different inputs andproduce different outputs but both request and response areneeded for the interaction between and use of each service.

3.1 Categories of ServicesThe TWS specification defines five categories of methods orservices, namely 'Service Support', 'Security', 'Translation &Request Quote', 'Status, Notification and Delivery' and'Reference Files'. These categories form a guideline for theservices that the TWS specification provides. Each categoryencapsulates one facet of the core work required for the

completion of a translation job, from initial quote through tofinal delivery.

3.1.1 Service SupportThe 'Service Support' category contains only one service,namely retrieveServiceList. This service allows the client toquery the vendor on the type of localisation services theyprovide. When a retrieveServiceList request is made on avendor a list of languages, service types, domain types, andMIME (Multipurpose Internet Mail Extensions) types thatare supported is returned. In the case where no relationshipexists between client and vendor, the retrieveServiceList isthe first service evoked by the client to ensure the vendormeets the requirements for the potential translation job. Theclient can then use the information returned in their futureinteractions with the vendor.

3.1.2 SecurityAs with any transaction over the web, data security is animportant consideration. OASIS defines a Web servicessecurity standard specification (WS-Security) whichprovides several methods for the securing of Web service-related transactions. The TWS specification relies on WS-Security to provide an end-to-end message level security andhence the specification recommends the use ofusername/password-based security over SSL.

3.1.3 Translation and Request QuoteThis category details the services required to instantiate ajob between a client and a vendor. Web services fortranslation uses a job ticket as a unique identifier for eachproject. This job ticket is created on the client side usuallybefore a quote request. The job ticket consists of a projectID, a user ID and a unique job ID. This job ticket can thenbe used in all future interactions with the vendor's webservice. Currently there are two methods of initiating a jobin the 'Translation and Request Quote' category.

The first method is where the client submits a requestQuoteservice. The requestQuote service details the informationpertaining to the translation job (word count, languages,etc.). The client retrieves the generated quote using theservice retrieveQuote and chooses to accept or reject thequote. If the quote is accepted an acceptQuote service isactivated; if it is not accepted the generated quote will expireafter a certain time limit, defined by the vendor.

The second method is based upon the submitJob service.The submitJob service has similar inputs to therequestQuote service but it also contains the purchase orderinformation found in the acceptQuote service used in theprevious method. In using this second method it isautomatically assumed that the job will be accepted. Thisinteraction might be between two in-house systems, e.g. onesystem has content to be translated, and it contacts a secondMT system and gets the content translated.

3.1.4 Status, Notification and DeliveryThe Translation Web Services technical committee providesseven status, notification and delivery management services

Page 9: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

in the TWS specification. This set of services allows theclient some control over the work that is being carried out bythe vendor for a particular job. Using these services a clientcan check the status of and cancel or suspend a particularjob. The success of each service request is dependant on thestatus of the job at the time of calling the service. Forexample you cannot cancel a job that has already beencompleted (for obvious reasons).

The retrieveActiveJobsList service returns to the client a listof all active jobs that they have with a particular vendor. Analternative to this is the retrieveFullJobsList service, whichreturns all jobs associated with a particular vendorirrespective of the current status.

A client can query the vendor using theretrieveJobInformation service and get a responsecontaining all current information about a job. The status ofthe job can be deduced from the information received fromthe retrieveJobInformation service and possible changes tothe project deadlines can be predicted. If a job is completedthen the status of the retrieveJobInformation serviceresponse should reflect this.

If the information received back from the vendor after aretrieveJobInformation service request indicates that the jobis completed, this job can then be downloaded using theretrieveJob service.

The client can choose to suspend a job temporarily at anytime as long as the job status is not complete. This is doneby making a suspendJob service request.

To remove this temporary suspension of a job (by submittinga suspendJob service request), the client can choose toresume the job using the resumeJob service.

A client can cancel a job using the cancelJob service if thejob is currently active and not in a completed state.

3.1.5 Reference FilesThe vast majority of localisation projects require not onlythe localisable content but also any reference files associatedwith the project. Reference files are not for translation butcontain information that may help the translation process.Translation memories, style guides, or terminologyreferences may be sent along with the translatable files. The'Reference' category defines services to allow for thisallocation of these files to a particular project.

A resource file can be assigned to any number of active jobsusing the associateResource service.

To remove an association between a job and a resource filethe disassociateResource service is used.

The client can review information about a resource file byinvoking the retrieveResourceInformation service. This willreturn information about the resource file from the vendor: alist of jobs that the resource file is assigned to, the purpose

of the resource file and whether or not the file has changed(been updated).The TWS specification allows the client the functionality ofuploading assets to the vendor using SOAP messages usingthe uploadFile service.

3.2 Services Supported in Current SpecificationAt this point there are 18 services supported by the TWSspecification. As discussed in section 3.1, services can becategorised into one of five categories depending on theirfunction in a translation process. These services canconversely be considered under the following threeheadings:

Required Services – these are services that are required fora Translation Web Services implementation and form thebasis of a minimalist approach to Translation Web Servicesuse.

Recommended Services – these services are recommendedby the Translation Web Services technical committee to beused in an implementation of Translation Web Servicestogether with the 'required' services.

Optional Services – these services are services that are onlyneeded in some specific cases (enquiring about whatservices a vendor has to offer or setting up a first contactwith a vendor by requesting a quote etc.). These services arenot essential to the Translation Web Services process but arerequired for some scenarios.

Table 1: List of services available in the TWS specification

Required Optional Recommended Services Services Services

submitJob retrieveServiceList rejectJob

retrieveJobInformation requestQuote associateResource

retrieveJob acceptQuote disassociateResource

retrieveActiveJobsList retrieveQuote retrieveResourceInformation

suspendJob retrieveFullJobsList retrieveFullResourceList

resumeJob uploadFile

cancelJob

WHITEPAPER.loc

Page 9JUNE 2006

4. Technologies in Translation Web Services

4.1 Simple Object Access Protocol (SOAP)SOAP is a W3C-developed standard described as acommunication protocol or a message passing systembetween two computers. SOAP is the specification thatdefines the XML format for these messages being passed.SOAP is one of three core XML-based standards that are thefoundation of a Web services implementation (the othersbeing WSDL and UDDI, see Sections 4.2 and 4.3).

4.2 Web Services Description Language (WSDL)A WSDL is an XML document that describes (a) a set ofSOAP messages and (b) how these messages are exchanged.The WSDL file in a practical sense contains the informationrequired by a client to access a service, i.e. what parameters

Page 10: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

Page 10 JUNE 2006

WHITEPAPER.loc5.

Loca

lisat

ion

Life

Cyc

le

Figu

re 1:

Loca

lisat

ion lif

e cy

cle in

corp

orat

ing th

e us

e of

Web

serv

ices f

or tr

ansla

tion

Page 11: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

The development model used for this referenceimplementation was loosely based on the prototyping modelof software development. Initially one of the 18 servicescurrently available in the standard was developed. From thatthe development incorporated one further service at a timeuntil all were implemented. Throughout the developmentlife cycle we reported back to the technical committee onany issues or suggestions for improvements that arose as weprogressed.

Apart from the API for the usage of SOAP functionality Axisprovides two command line utilities that are essential to theimplementation of Web services. The 'Java2WSDL' utilitytakes pre-existing Java code and creates a WSDL file for thatcode. This utility can be used if there is some code thatperforms a specific function that you would like to makeavailable as a Web service for others to use. The TranslationWeb Services standard has made a WSDL available, so thisutility was redundant for our purposes. However, the secondutility, 'WSDL2Java', creates the Java stubs (Java files thatcontain the code needed to use SOAP) required by:

(a) the server to write and deploy the service, and(b) the client to access the service through its own code.

Figures 2 and 3 show this process. Firstly the Java stubs arecreated:

Figure 2: Creating the Java Stubs from the WSDL

Then the Java stubs are used by the client application toaccess the services and by the server to deploy the services.

Figure 3: Connecting client and server (through the Java stubs) over HTTP usingSOAP

need to be passed to the service to invoke a response. TheWSDL file should also contain the actual location (HTTPaddress) of the web service.

4.3 Universal Description, Discovery and Integration (UDDI)UDDI is an OASIS-driven mechanism for clients todynamically find other Web services. The UDDI protocolgives a company the ability to register their available Webservices online, thus exposing them to potential clients. AnUDDI registry service is a Web service that managesinformation about service providers, serviceimplementations, and service metadata.

In summation, SOAP is the communication protocol forWeb services, WSDL defines how the interaction occursbetween the two computers, i.e. how to invoke the servicesand the UDDI is a mechanism for finding these services orregistering one's own services.

5. Localisation Life Cycle

See page 10, Figure 1: Localisation life cycle incorporatingthe use of Web services for translation.

6. Use Cases

6.1 TWS Reference ImplementationA reference implementation of the Translation Web Servicesspecification was undertaken by the Localisation ResearchCentre at the University of Limerick in Ireland as part of theIGNITE project. The Translation Web Services technicalcommittee decided that it was important to have a referenceimplementation to see how the standard worked from atechnical viewpoint.

The basic premise of Web services for translation is that aserver machine will contain a pre-programmed set ofmethods or functions that a client machine can access usingWeb services technology. The two components involved inthis interaction are the server machine and the clientmachine. The process of implementing Web services foreach component is similar. Nevertheless there are somesubtle but important distinctions to be made between bothimplementations.

The implementation platform of choice for this referenceimplementation is J2EE and the Java programminglanguage. The rationale behind this decision was that theopen source 'Apache Project' has a Java-basedimplementation of SOAP called Axis. Axis is defined as a"reliable and stable base on which to implement Java Webservices"; it provides an Application Programming Interface(API) into the SOAP actions that are required forimplementing the Translation Web Services standard.Apache Tomcat was chosen as the Web server on which todevelop the Web services. Tomcat and Axis were developedunder Apache and work very well together in a practicalimplementation environment.

WHITEPAPER.loc

Page 11JUNE 2006

Page 12: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

The first service that we implemented was theretrieveServiceList service. This service was chosen becausethere were no input parameters required for it. All that wasrequired to invoke the service was an instance of theretrieveServiceListRequest class. The retrieveServiceListservice returns "a complete list of services offered by aparticular vendor. This will include the languages dealt withand services offered by a particular vendor" (TranslationWeb Services Specification Draft 1.0). After writing theserver-side code to handle a retrieveServiceListRequest, i.e.return all of the appropriate values, the next stage was tocreate a simple test class that could instantiate aretrieveServiceListRequest and handle the results receivedback from the server in a retrieveServiceListResponse. Withboth classes now ready, we needed to deploy the services tothe Apache Web server. The WSDL file also contains thelocation of the service, i.e. where it can be accessed from.Deployment is necessary to ensure the service is in thelocation as defined in the WSDL.

When the utility 'WSDL2Java' creates the Java stubs neededfor the server-side machine, it also creates two other filesthat are used to deploy and 'un-deploy' the service to a Webserver (Apache Tomcat). These files are called Web ServiceDeployment Descriptors ('deploy.wsdd' and 'undeploy.wsdd').

The next stage in the development of the implementationwas to write the code for the rest of the services. Whilewriting the code we encountered some issues with thespecification (including inconsistencies between the schemaand the specification document). These issues were quicklyamended by the technical committee. During the process ofcoding we also made some suggestions to the technicalcommittee about possible improvements to the specification

and we were actively involved in applying these changes.For example, the service 'retrieveQuote' in the originalspecification did not return any information about thelocation of the actual quote. This was deemed to be animportant piece of information for this service and waspromptly included in the specification.

With the code for the implementation of the services nowwritten, the initial service deployed (retrieveServiceList)was un-deployed and the full list of services was deployed tothe Web server. A JSP (Java Server Pages) client interfacewas developed to allow for the input of the parametersrequired for each service and also to show the responsesfrom the server.

6.2 Next StepsThe next step in the reference implementation will be toattach the front-end JSP interface to a back-end databasesystem that returns some relevant information that is not pre-defined. The current implementation can be seen runninglive on www.electonline.org:8080/index.html. To view ordownload the source code used, please go towww.igniteweb.org/tws. Here you will also find instructionson how to install this implementation on your own localmachine and server.

6.3 Lionbridge's use of Web servicesAlthough the specification from the TWS technicalcommittee is still at the draft stage, there has been somesignificant work done with Web services in the translationindustry. Lionbridge has implemented a number of solutionsbased on Web services which have linked contentmanagement and other systems with Elcano, its onlinetranslation portal.

Figure 4: Lionbridge's use of Web services

WHITEPAPER.loc

Page 12 JUNE 2006

Page 13: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

Figure 4 shows a solution that was built for a large chemicalcompany. Its translation process was time-consuming anderror-prone. The process required more than 40 separatemanual steps for each file to be translated. This led to alocalisation process that was not cost-effective and deliveryto target markets was unnecessarily delayed.

Working together with the company and its SystemsIntegrator, Lionbridge was able to propose a solution tostandardise and automate parts of the process, reduce themanual effort required, and provide a more cost-effectiveand time-efficient way to translate the content stored withinthe TeamSite Content Management System (CMS).

The solution Lionbridge proposed to the company was toconnect Interwoven's TeamSite to Elcano™ (For furtherdetails see http://elcano.Lionbridge.com). Elcano offers aWeb services interface providing simple and directprogrammatic access to Elcano from any CMS or contentrepository. The Elcano Web services solution makes use oftwo language industry standards – both under the auspices ofthe OASIS standards organisation (www.oasis-open.net).XLIFF defines the structure of translation data, and TWSdefines the communication between TeamSite and Elcano.Using these data transfer standards to connect to Elcanoallows source content of any type to be posted directly intoLionbridge's translation production process, its status to betracked from the CMS during translation and, for completedtranslations to be retrieved without unnecessary manualintervention. In addition, once extracted from the CMS,Lionbridge can make use of a variety of translationproductivity tools to ensure the consistency, accuracy andtimeliness of the translation.

FreewayLionbridge recently introduced its new customer portal inApril 2006 and the Elcano Web services described abovewill be ported to Freeway. More information is available atwww.lionbridge.com.

References

Translation Web Services Technical Committeewww.oasis-open.org/committees/tc_home.php?wg_abbrev=trans-ws

XLIFF Technical Committeewww.xliff.org

OASIS www.oasis–open.org/ Translation Web Services and XLIFF are developed withinOASIS, which is a standards body for industry.

World Wide Web Consortiumwww.w3.org The World Wide Web Consortium is the standards body forthe Internet.

Localisation Research Centrewww.localisation.ieThe Localisation Research Centre (LRC) is the information,educational, and research centre for the localisationcommunity.

IGNITE projectwww.igniteweb.orgIGNITE is an LRC-coordinated project that will pooltogether linguistic infrastructure resources and provideconvenient access and a market place for them. IGNITEhosts an implementation of the Translation Web Servicesspecification.

LISA - OSCARwww.lisa.org/sigs/oscar/OSCAR is LISA's body for the development and maintenanceof open standards for the language industry.

Note: If you are interested in joining the Translation WebServices technical committee please contact Peter Reynoldsat [email protected] or Tony Jewtushenko [email protected]

This whitepaper was written by Kevin Bargary and PeterReynolds with additional contributions from MagnusMartikainen, Tony Jewtushenko, Andrzej Zydron andReinhard Schaler. Kevin Bargary may be contacted [email protected] and Peter Reynolds may be contactedat [email protected]

Page 13JUNE 2006

WHITEPAPER.loc

Page 14: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

Page 14 JUNE 2006

STANDARDS.loc

Formatting and theTranslator:

Why XLIFF Does MatterIgnacio Garcia

University of Western Sydney

AbstractGains in productivity through translation memory-based text reuse are often offset by time spent in dealing with formatting glitches.This affects all players in the localisation industry, from the end client to the language vendor to the freelance translator. However,as a non-core activity for them, translators are less well prepared to deal with these hidden formatting related costs. This articlelooks from the translator's viewpoint at the importance of formatting as part of the translator's work, and at the limitations in dealingwith formatting of the technologies now in use. It also shows how the development and implementation of standards within thelocalisation industry, XLIFF in particular, may impact on the situation, so that translators can once again deal only with text, as theydid in pre-digital times.

Keywordslanguage Industry, localisation, localization, Open Standards, text reuse, TMX, XLIFF

1. TRANSLATORS TRANSLATE FILES, NOT TEXT

What translators receive for translating is files, notjust text. Translators do not receive TXT files, butfiles with text plus formatting; with data that users

can read plus code that machines can. Since many of thefiles translators receive have been formatted in Word, whichwe are all familiar with as the de facto standard for wordprocessing, some may assume that formatting is transparentand has nothing to do with translation. However, the fact thattranslators, as computer users, do not need to 'read' the codeto understand the text does not mean they do not need to payattention to it. Translators who have been exposed to otherformats have learnt that it pays to understand the differencesbetween flat and binary files, and between structuredformatting and inline formatting. The digital world hascreated both the file, an amazing advance from the dayswhen text was composed on a typewriter, as well as specifictechnologies to deal with translation, principally translationmemory (TM). This digital world has also raised the issue offormatting. It is argued here that gains in productivitythrough TM-based text reuse are often offset by time spentin dealing with formatting glitches.

That formatting is part of the translator's job is obvious toany translator working in the localisation industry.Formatting, however, has not been given the prominence itdeserves in training and professional development. There isno mention of it in the Language Engineering for Translators

Curricula (LETRAC) Curriculum Modules (1999) in whichmany of the programmes with a focus on technology werefirst based. Even today those programmes tend to presentformatting as something that will be taken care of byspecialised computer software, TM suites or localisationtools. This is not quite the case yet. The importance offormatting, notwithstanding the technology currentlyavailable, has been repeatedly pointed out in the literatureaddressed to language vendors and end clients (Reynolds &Jewtushenko, 2005). There is a gap, however, in theliterature addressed to the freelance and the trainee translatorthat this article will attempt to fill. Austermuhl (2001) hardlyrefers to formatting; Bowker (2002:37-39,118-119) andSomers (2003:18-19) only treat it marginally. Only Zetzsche(2003b) pays thorough attention to it, its focus being to givethe freelancer the tools to deal with digital text.

To some extent, it is understandable that not much profile isgiven to formatting in academic settings. Text is the coreissue for translators, formatting is not. Dealing withformatting, like dealing with invoicing, may be a mostimportant activity, but it is non-core. Also, translating text isa complex activity that takes years to master. It involvesweighing up alternative renditions of a meaning in the targetlanguage in order to choose the most appropriate one for thesituation, in a context where there is rarely a clear-cut rightor wrong answer. Dealing with formatting, on the other

Ignacio Garcia

Page 15: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

STANDARDS.loc

Page 15JUNE 2006

hand, may be very complicated, but those who manipulatefiles will realise soon enough whether or not they have donethe right thing. It is, however, part of the translator's job, ascurrent technology is not yet good at separating text fromformatting (i.e. content from its container) within the file. Inthe age of the typewriter and before, formatting wasunimportant. In the first stages of the digital age, it hasbecome important, and it will continue to be important – atleast until we reach the 100% XML scenario outlined below.

2. TRANSLATION IS NOT A CRAFT - IT IS ANINDUSTRY

Translation is no longer a craft; it is an industry. However, itis an industry which does not pay the translator – thefreelancer at least – by the year or even by the hour likerespectable professions such as law and medicine do, but bythe word (or by the line, or by the page: by quantity).Translators work at the 'wordface' in the same way thatminers work at the coalface, as Emma Wagner put it(Chesterman & Wagner, 2002: 1), taking out 'loads oftranslated words' which is what language vendors sell, asMark Lancaster, the head of SDL, a major language vendorand the most important developer of computer-aidedtranslation tools, was reported to have said (inFenstermacher, 2006). On the one hand, there is a lowthreshold entry point to the profession: any educatedbilingual, given enough time and some mentoring, canbecome a translator; on the other, only those able to translateat great speed will be able to make it professionallyprofitable.

Most translators work within what has been loosely calledthe language industry or, more precisely, the localisationindustry, also referred to lately as the globalisation industry,or the GILT (globalisation, internationalisation, localisationand translation) industry. This is an industry that, whatevername it uses, is based on selling lots of translated words,with quality often taken for granted, time-to-market animportant constraint, and price paramount. This is anindustry that, according to the latest calculations and withconservative estimates, will be worth more than 9 billion USdollars by the end of 2006 and will grow at 7.5 percent peryear to be worth an estimated excess of 12 billion US dollarsby 2010 (DePalma & Beninatto, 2006:4-5). Languagevendors, like individual translators, are also paid by thenumber of translated words they deliver to the end client,with a benefit margin that can only be widened throughincreases in productivity. Despite efforts by the industryitself to monitor quality (the Localization Industry StandardsAssociation (LISA) being a case in point) and initiativessuch as the recent EN-15038 European Quality Standard forTranslation Services, backed by the European Committeefor Standardisation (Arevadillo Doval, 2005), the translationindustry also has a low entry threshold and does not requirea large amount of capital. So there is fierce competition,competition that shows in a consolidation process bestreflected at the top end of the market in the mergers andacquisitions of Mendez by Lernaut&Hauspie, then of

Lernaut&Hauspie and Berlitz by Bowne, then of Bowne byLionbridge, a process that does not seem to have stabilisedyet (DePalma & Beninatto, 2006: 6).This necessary increase in productivity, like that achieved inmanufacturing two centuries ago, is based on the division oflabour and on mechanisation. In the localisation industry,division of labour means virtual teams of translatorsworking on a single project, with team members workingoff-shore to take advantage of lower salaries, or all throughdifferent time-zones if what matters is to speed up time-to-market. Mechanisation is achieved through the use ofproductivity tools: occasionally machine translation (MT),most often TM suites for the translation of running text, andlocalisation tools for the translation of short stringsembedded in programme files. Productivity is achievedthrough the reuse of already translated text and of itsformatting. In fact, it is likely that the savings in formattingreuse are greater than those achieved through text reusealthough, surprisingly, no study has been done on this yet.

It is worth noting that the localisation industry does nottranslate – it localises. This involves project managers,graphic designers, software engineers and others working ontasks such as adaptation, quality assurance, desktoppublishing adjustments and testing (Esselink, 1998:258-273), with the translator's role limited to the replacing ofnatural language strings, a mere, perhaps, 30 percent of thetotal localisation load. But, yes, this does include the oftentough task of respecting the formatting of those naturallanguage strings. It is almost ironic that at the very momentwhen translation studies was ready to expand the meaning oftranslation beyond the tight equivalence model thatdominated for decades, the localisation industry, the'market-driven translation theory', moved in the oppositedirection, restricting translation to an (internationalisation-driven) institutionally controlled equivalence (as Pym,2004:62-65 explains), thereby giving the translators theadded burden of having to go to great lengths to keep theformatting intact.

There are two ways for translators to deal with thisformatting issue, and neither is (yet) completely satisfactory.One is overwriting the files, a bad idea if the translator doesnot have the application with which the original file wascreated and a working knowledge of that application. If thefile that needs translation is flat, it is not always easy toseparate translatable text from code; if it is binary, it won'teven be opened without the programme (and, often, theversion) that created it. Overwriting is also a bad ideabecause it does not allow for the semi-automatic reuse ofalready translated text and, if the translator is lucky, theformatting as well. The other way, which makes more sensefor the above reasons, is by using the afore-mentioned TMproductivity tools, which translators are forced to do in mostlocalisation projects anyway. The bad news, however, isthat, despite claims to the contrary, TM tools do not solvemany of these formatting challenges.

A brief look at the user lists for these tools (located atwww.yahoogroups.com for most of the best known

Page 16: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

STANDARDS.loc

Page 16 JUNE 2006

commercial brands) will show the breadth of the formattingproblems translators experience daily when using these tools– and what an advantage it is to be able to count on suchquality peer help. Table 1 looks at data from the three listswith the greatest numbers of members and volume ofmessages for March 2006. Many more queries will havegone to the technical support section in the softwaredevelopers' web pages or to the language vendors thatcommissioned the job so the figures are indicative only. Thenumber of messages does not reflect the seriousness of thematters dealt with in them; nor does it reflect on the qualityof the particular product. The more 'technical' the job, themore likely it is that there will be more messages dealingwith formatting issues. Wordfast, for instance, may have alower percentage of formatting queries than TRADOSbecause translators working with Wordfast are likely to doless file-challenging work, not because their software is inany way superior to that of TRADOS.

Table 1: Lists and number of formatting-related messages for March 2006

There needs to be better ways of measuring how much timeand energy the average translator may use in dealing withformatting glitches. Direct observation of a 'typical'translator's week, now more feasible via usability testingtechnology, should be attempted to give the research a morecontrolled, empirical outlook. The author's limitedexperience as a freelance translator allows him to guess thatsuch kind of research will also confirm that most savingsgained through text re-use are offset by the amount of timespent on formatting matters. This article, however, will limititself to supporting this hypothesis by just looking at thelimitations of current technology.

3. CURRENT TECHNOLOGY PROMISES MORETHAN IT DELIVERS

Translators receive a job in one of four ways:

1. As files alone.2. As files plus relevant sections of sentence and term

databases.

3. As pre-translated files, with database information inserted in the document, as in pre-translated TRADOS files.

4. As files alone with access to databases hosted in servers.

No system, at present, avoids the problems translators oftenexperience with formatting.

In theory, TM suites and localisation tools separate text fromcode before translation and then merge translated text withthe original code, thus allowing for the reuse of formatting.Then they reuse content, by leveraging data from thedatabases of translated sentences and terms during thetranslation process. However, just as these tools don't doautomatic translation of the text (TM is not MT!), but justhelp translators with the repetitive stuff so they are free toconcentrate on the more challenging aspects of the text, theydo not automatically solve all formatting problems either.

The downside here is that dealing with formatting issues andcode is a core activity of computer engineers, perhapsdesktop publishers or even project managers, not oftranslators. Therefore, translators are thus less prepared tosucceed here.

When language vendors and freelancers encounter theproblems related to the reuse of text, they have to deal withformatting too. There are two reasons why they have to dealwith formatting: Firstly, because TM databases are compiledin a proprietary format that does not allow fluid exchange ofdata with other TM databases – an exchange that is neededas soon as a translator works for a language vendor (or thelanguage vendor for a client) that does not use the sameprogramme. Secondly, because these sentence databasesalso keep inline formatting (the formatting within the flowof text, as opposed to structured formatting), and a segmentwith both the right text and format will get a better matchthan a segment with only the right text.

In fact, exchange of text alone between end clients, languagevendors, freelance translators and TM suites is not thatdifficult. It can always be exported from the database to aspreadsheet programme, then from the spreadsheet to thenew database. What is more difficult is working with text

List TW users Dejavu-l Wordfast (TRADOS) (DejaVu) (Wordfast)

Formatting-related messages 107 305 73

Total number of messages 448 936 402

Percentage of formatting-related messages 24% 32% 18%

Page 17: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

STANDARDS.loc

Page 17JUNE 2006

that contains both inline formatting and metadatainformation. When TRADOS became a de facto standard inthe industry, from the late 1990s onwards, most developerstried to solve the problem by making themselves compatiblewith TRADOS. Later, when the Translation MemoryeXchange (TMX) standard emerged, they all claimed TMXcompliance. However, the process of exchanging translationmemory data is not always perfect; it was not perfectbetween TRADOS-compatible software, and it is not evenperfect between TMX-certified products at the latest versionof the standard, now level 1.4b (Zetzsche, 2003a). In fact, itis developers themselves who simply aim for 'little or no lossof critical data' during the process of exchanging translationmemory data (LISA, 2005b).

The problems grow as we move from the reuse of text to thereuse of formatting. At the point of importing a file intowhatever translation tool is used, a filter is needed to convertthe original file into a format that will be read by thetranslation editor. Creating these filters and maintainingthem throughout the periodic upgrades of the programmes inwhich the files are composed means a waste of resources fordevelopers – resources that would be better used if devotedto the core function of TM, which is improving thereusability of text.

These problems manifest themselves even further at thepoint of exporting the file for conversion into its nativeformat, for several reasons:

l conversions are rarely 100 percent accurate l files may not be well formed due to wrong handling by

their creators (for instance, in Microsoft Word, using the enter key to change the line, or the space bar instead of the tab to indent)

l the translator may have pressed the wrong key in the translation editor

Then, we have to account for the possibility of bugs (in thefile, in the filter, in the editor), for the difficulties of specificformats (MIF files, resources files), plus possibleinterferences of hardware / software running in thebackground.

There is also the issue of text expansion in translation, whichwill often require post-translation adjustments, particularlyin presentation and design-oriented DTP files.

It is relatively easy for the translators and translation projectmanagers to know how these productivity tools shouldbehave in theory. The real test is in the ability of translatorsand managers to troubleshoot formatting problems as theyarise. Allowances for budgeting and time are needed for that,which are likely to eat into most of the savings made throughtext reuse.

I have not referred here to other issues, such as, forfreelancers the maintenance of databases and, for languagevendors, the synchronisation of server databases so that theycan be effectively used by different translators working at

the same time. While time spent in maintenance will also eatinto some of the savings from reuse, it is not directly relatedto formatting.

4. EMERGING TECHNOLOGY: OPEN STANDARDS

The problem with formatting is technical and the solutionmay be technical too. We have seen it emerging throughopen standards such as the above-mentioned TMX. It iswidely accepted that standards benefit everyone – theproduct developers and businesses that depend on them aswell as the actual users – and they have a positive impact onthe overall economic cycle. After XML technology wasdeveloped, standards were achievable in the area of textreuse, as XML was designed precisely to separate, within thefiles, content from the container. The Localization IndustryStandards Association (LISA) identified this and establishedin 1997 a specific body to develop text reuse standards. Thisbody is entitled OSCAR (Open Standards for Container/ContentAllowing Re-use).

TMX was the first such standard to emerge: version 1.0, forthe exchange of text only, was released in December 1997;the latest version, 1.4b, was released in October 2004 andincludes capabilities for exchange of formatting andmetadata. All commercial TM suites claim to be compliantwith at least version 1.1, while a few certified products, plussome non-certified products, claim to be compliant withversion 1.4b of the standard. There are still the teethingproblems mentioned above: once again, current softwareoften promises more than it delivers, but the situation isimproving.

Term Base eXchange (TBX, version 1.0 released in April2002) was then developed to cover the terminologicalexchange needs within the language industry and betweentools – not only TM-based needs, but also MT-based needs.The Segmentation Rules eXchange (SRX, version 1.0released in April 2004) followed, once it was realised that upto 30 percent of TMX-exchanged perfect matches could belost between applications due to differences insegmentation. The last OSCAR standard, still indevelopment, uses the official name of Global InformationManagement eXchange (GMX), also known as GILTMetrics eXchange. It deals with metrics rather than withtext, and consists of three components: GMX Volume forword counting (the only one defined so far), GMXComplexity for the quantification of the complexity oftranslation tasks, and GMX Quality for the specifications ofthe quality requirements of translation tasks (LISA, 2005a).

All these OSCAR standards deal with the reuse of text,although GMX only does so indirectly. However, as alreadydiscussed, it is in the area of the reuse of formatting thatmore gains are to be expected from standards. In 2000, anew standard was developed. It is known as XLIFF (XMLLocalization Interchange File Format) and comes under theumbrella of OASIS, the Organization for the Advancementof Structured Information Standards. Version 1.1 of XLIFFbecame an OASIS Committee Specification in the Spring of

Page 18: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

STANDARDS.loc

Page 18 JUNE 2006

2003 (OASIS, 2006).

XLIFF was created for the exchange (OASIS would preferto call it interchange this time) of translatable (orlocalisable) text between different file formats. With XLIFF,content can freely circulate through the localisation cyclewith independence of what its native file format was, andindependence of the TM suites or localisation tools that willbe processed. The XLIFF conversion tool works byseparating structural formatting into a skeleton file, thensegmenting content and its inline formatting into translationunits with its source and its target. These translation unitscan contain inside 'alternative translation' units, in mostcases to hold data leveraged from a TM. Once translated, theXLIFF file merges back with the skeleton to reuse theformatting.

The XLIFF format does much more than simply interfacingwith any other file format. It also allows each segment, theminimal discrete unit of translatable text that will then bekept in TM databases for recycling, to carry sophisticatedmetadata. This metadata can be used to track which versioneach segment originated from (it is as common forlocalisation projects to start translation before the finalversion of the source text has been completed, as it is toupdate a product, or to generate content from databasesinstead of static files), and to track which phase of theworkflow the segment is going through, including data ontool used, job ID, client, translator/reviser, notes, metricsinformation, etc. Being an XML standard, it is alsoextensible and can accommodate future needs (Reynolds &Jewtushenko, 2005).

The XLIFF standard is being developed in line with theOSCAR standards referred to above: segmentation as perSRX rules, TM information so that it can be downloadedfrom/uploaded to TMX, and word counts based on GMX.Although translation units in the XLIFF format are bilingualonly, multilingual projects can be dealt with by bundlingtogether several files in a single document. This is fine, astranslation is after all a bilingual activity, and a multilingualfile would need to be divided into its bilingual componentsat some stage anyway.

There is a lot the localisation industry can gain fromadopting XLIFF. Complicated projects may have to dealwith over thirty different types of files, from EXE and DLLprogramming files to HTML and XML and their derivatives,to formats generated by content-oriented and design-oriented DTP programs, to the different Microsoft Officeapplications. Once this standard is adopted, instead ofhaving to build one filter for each file format plus filters tohandle data between TM suites, software developers willneed just one filter for each file format. Indeed, the softwarethat generated the files should produce this filter, thusallowing developers to shift resources to refine thealgorithms so that translated text can be reused morethoroughly and easily.

In the current environment, the more that end clients rely on

outsourcing localisation to multilingual vendors (MLV) andthe less they spend on in-house localisation resources, themore like a 'black box' a whole project looks to them. Withcurrent practices end clients pass content and code on to thevendors, and later receive from the vendors the translatedfiles ready to be imported into their document managementsystem. In an XLIFF environment, clients will have muchmore control over the whole process, passing onlytranslatable text and keeping the code (which may besensitive in some cases) in-house. They will gain much morecontrol of their linguistic assets also, merely by updatingtheir own TM in the process of converting the XLIFF filesto their native format. Just as importantly, they will not risklocking themselves in to a particular vendor or locking intheir linguistic assets in to a particular tool.

For language vendors – particularly those at the top (MLVs)– the success of XLIFF as a standard will mean savings onmanagement, engineering and DTP costs, without having toalso lock their linguistic assets in to a particular TM tool.Their current role, which is central in the localisationprocess, involves dealing with all the formattingcomplexities the end clients do not want to spend resourceson and that the freelance translators do not have theexpertise to deal with. It is likely that this role will betransformed into a mere consulting job. SLVs will still retaintheir important role as language experts, dealing with thelinguistic quality assurance of the project.

For freelancers, the success of XLIFF will mean that theywill finally be able to concentrate again on text, which istheir core activity, rather than on formatting, which is not. Itwill mean combining the advantages of the pre-digital era,when all they had to worry about was text, with those of thedigital era: taking advantage of sophisticated software to re-use translated and repetitive text, allowing them to focus juston the new linguistic challenges arising. No more risks oflocking themselves in to a particular tool or out from anythird party information; no more need to buy several toolsfor different vendors: any single XLIFF–compliant tool willbe enough.

5. THE 100 PERCENT XML SCENARIO

XLIFF may be the next big thing for the localisationindustry, as significant as Unicode, which allowed for theeasy management of character sets in any language, in anycomputer and with any (compliant) programme. WhatUnicode did for multilingual writing, XLIFF may do fortransporting this written text across languages, localisationagents, software and hardware. The latest development ofthe past few years of moving client and localisation vendorTM databases from the desktop to the server and triggeringthe whole localisation process from the client's contentmanagement system will be greatly helped by the adoptionof XLIFF.

XLIFF, however, is not likely to succeed overnight. At themoment, rather than making all other formats and filters

Page 19: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

STANDARDS.loc

Page 19JUNE 2006

obsolete, it sits there in parallel with them as one moreformat and one more filter that has to be taken care of bysoftware developers, clients, language vendors andtranslators, somehow defying the purpose for which it wascreated. There are also teething problems in the applicationof the standard, with tools purported to conform to XLIFFproducing code that is not easily exchangeable betweenthem (Wunderlich, 2005). Indeed, it may not succeed, just asits OpenTag precedent did not succeed. Insufficent numbersof end clients and leading software developers may feel theneed to invest the resources needed to make it run. SomeMLVs (gatekeepers as they are sometimes known) mayresist it as it makes almost obsolete what is now a big chunkof their core activity. Like all standards, XLIFF has beendeveloped by big players – with Novell, Oracle, Sun, andBerlitz involved first, then joined by Lotus/IBM, MoraviaIT, RWS Group, and Lionbridge – but that does notguarantee its success.

On the other hand, XLIFF seems to be making inroads intothe industry. Leading commercial localisation tools(Catalyst, PASSOLO, WinRC) and TM suites(SDLX/TRADOS, Heartsome) have adopted it. There isinterest in the open source community in the use of thisstandard (Frimannsson & Hogan, 2005), with KBabel andLanguage Tools also offering free XLIFF-compliant tools.In some cases (SDLX for instance) the XLIFF format willinterface with the native file format via its own proprietaryITD format. Heartsome, on the contrary, works directly onXLIFF, TMX and TBX standards without using anyproprietary standards. Innovators and early adopters arealready embracing XLIFF, although we are still at the firststages of the S-curve. For clients and language vendors,there is no longer any comparative advantage in adoptingTM as most are already using it – so rather than TM beingan advantage it is a necessity. However, there may be acomparative advantage in adopting XLIFF, and server TMand document management software now, before themajority does (Project-Open, 2005).

Indeed, it is easy to imagine a 100 percent XML scenario inwhich a more developed XLIFF specification would be ableto carry out the management of information of the globalenterprise seamlessly – from the authoring of text to itslocalisation, publishing and archiving – with processestriggered and pushed through the corresponding workflow(semi) automatically by content management software, alloverseen by the project manager. Technical writers willcreate content on structured language and, with the help ofauthoring tools, through the single sourcing cycle, allowingfor text chunks to be reused in other documents and to beoutputted in different formats: HTML, PDF, Help, etc.(Rockley, 2002). Then, translators will move the contentthrough the localisation cycle while reusing previouslytranslated sentences and terms. Both technical writers andtranslators – language specialists in their own right – willdeal only with text and, when relevant, its recycling, leavingformatting to DTP and engineering specialists who willhandle it in a totally independent way.

In this scenario, successful software developers could

actually afford the resources to enhance text reusealgorithms that incorporate linguistic knowledge(inflections, synonyms…) and perhaps create a new kind oflanguage-specific TM which is more efficient for theparticular language combination. Doing this would blur thedistance between MT and software development, but stillleave translators in charge. The process could then rightly beconsidered as machine assisted human translation ratherthan human assisted machine translation to use Hutchins'(1992) parlance. For content creation, translation, andtranslation management some developers may find it usefulto pursue Zydron's (2003) 'text memory' xml:tm idea. Othersmay be interested in advancing diagnostic tools to determinewhether a document should be translated by MT, by TM orwithout them, as the TransRouter project (Cleary & Schäler,2000) was aiming at.

With this 100 percent XML scenario, just as in pre-digitaltimes, translators will need to focus only on text, which iscomplex enough, without being distracted by thecomplications of formatting. After all, there will be enoughchallenges for freelancers in coping with the demands oftranslating following the imminent introduction of Web 2.0– the Semantic Web in which machines, rather than merelydisplaying data as they do now, will be better able to'understand' it as well (Berners-Lee, Hendler, & Lassila,2001). Even if XLIFF succeeds, the digital world will stillstir the translation profession for years.

References

Arevadillo Doval, J. J. (2005).The EN-15038 European Quality Standard for TranslationServices: What's Behind It? The Globalisation Insider.Retrieved 1 April 2006, fromhttp://www.lisa.org/globalizationinsider/2005/04/the_en15038_eur.html

Austermuhl, F. (2001).Electronic Tools for Translators.Manchester: St Jerome.

Berners-Lee, T., Hendler, J., & Lassila, O. (2001).The Semantic Web. Scientific American, (5).

Bowker, L. (2002).Computer-Aided Translation Technology: A PracticalIntroduction. Ottawa: University of Ottawa Press.

Chesterman, A., & Wagner, E. (2002).Can Theory Help Translators.Manchester, UK & Northampton, MA: St Jerome Publishing.

Cleary, R., & Schäler, R. (2000).R&D Opens Doors to Translation Portals. Localisation Ireland, 4(1), 9.

Page 20: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

STANDARDS.loc

Page 20 JUNE 2006

References (continued)

DePalma, D. A., & Beninatto, R. (2006).Ranking of Top 20 Translation Companies for 2005:Common Sense Advisory.

Esselink, B. (1998).A Practical Guide to Software Localisation.Amsterdam & Philadelphia: John Benjamins.

Fenstermacher, A. (2006, January/February).Authors, localizers and language barriers. Multilingual, 77, 82.

Frimannsson, A., & Hogan, J. M. (2005).Adopting Standards-based XML File Formats in OpenSource Localisation. Localisation Focus – The International Journal forLocalisation, 4(4),9-23.

Hutchins, J., & Somers, H. (1992).An Introduction to Machine Translation. London: Academic Press.

LETRAC. (1999).LETRAC Curriculum Modules. Retrieved 1 April 2006, from http://www.iai.uni-sb.de/docs/D3.pdf

LISA. (2005a).OSCAR (Open Standards for Container / Content Allowing re-use). Retrieved 1 April, 2006, fromhttp://www.lisa.org/sigs/oscar/

LISA. (2005b).TMX – Translation Memory eXchange. Retrieved 1 April, 2006, from http://www.lisa.org/tmx/

OASIS. (2006).OASIS XML Localisation Interchange File Format (XLIFF)TC. Retrieved 1 April, 2006, from http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xliff

Project-Open.(2005).Technology Adoption and Competitiveness in theTranslation Industry. Retrieved 1 April, 2006, from http://www.project-open.com/whitepapers/index.html#Liese2005

Pym, A. (2004).The Moving Text. Localization, Translation, andDistribution.Amsterdam and Philadelphia: Benjamins.

Reynolds, P., & Jewtushenko, T. (2005).What Is XLIFF and Why Should I Use It? XML Journal, 4.

Rockley, A. (2002).Managing Enterprise Content: A Unified Content Strategy.Indianapolis: New Riders Publishing.

Somers, H. (Ed.). (2003). Computers and Translation: A translator’s guide (Vol. 35).Amsterdam/Philadelphia: John Benjamins Publishing.

Wunderlich, M. (2005).Options for Editing an XLIFF File. Multilingual Computing and Technology, 76,51-58.

Zetzsche, J. (2003a, March 2003).TMX Implementation in Major Translation Tools.Multilingual Computing and Technology, 14, 2,23-27.

Zetzsche, J. (2003b).A Translator's Tool Box for the 21st Century. A ComputerPrimer for Translators (2nd Version 2.4.1, December ed.):International Writers' Group.

Zydron, A. (2003, August 26, 2003). xml:tm Using XML technology to reduce the cost ofauthoring and translation. The LISA Newsletter, XII.

Dr Ignacio Garcia is a senior lecturer at the School ofLanguages and Linguistics at the University of WesternSydney, where he has been teaching in Spanish andTranslation since 1995. He is currently researchingtranslation and new technologies, using computer-aidedqualitative data analysis software (CAQDAS) onprofessional discussion lists to investigate how translatorsare responding to the challenges posed by translationmemory software. He may be contacted [email protected]

Page 21: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

COUNTRY.focus

Page 21JUNE 2006

Localisation in NepalPat Hall,

Open University UK, and Madan Puraskar Pustakalaya, Nepal

AbstractComputing using the local language of Nepal has just been enabled for both the Linux and Windows platforms. Nepali is the nationallanguage of Nepal, spoken by nearly everybody in Nepal. Historically software has interfaced to users in English, though ad hocsolutions for creating and printing in Nepali have always been possible. Now, at least in principle, all systems that deliver serviceswithin Nepal to Nepalis could do so completely in their own national language, Nepali, and in ways which fit with and support localways of doing things. However, this is not happening yet, and there are even indications of a reluctance to make the move.Therefore, more concerted action will be necessary to bring about this most important development for the Nepali language.

Keywordslocalisation, localization, Nepali, Devanagari writing system, languages of South Asia, Linux, Windows, Bashar Sancha project

Localisation is just beginning in Nepal. It couldbecome central to the way computing is done inNepal – or become irrelevant as English swamps

Nepali as the language of administration and business. Myjudgement is that it will be the former.

The first computer came into Nepal in 1974, to process theNepalese population census. This computer worked inEnglish, as did PCs when imported into Nepal in reasonablenumbers in the 1980s. However people soon used them forstoring and printing text in Nepali, particularly fornewspapers and magazines.

Nepali is written using the Devanagari writing system alsoused for Hindi, Marathi, and several other languages ofSouth Asia. Figure 1 shows an example – note the letters ofwords are joined by a horizontal top line, with many lettershaving special half-forms as they form 'conjuncts', clustersof consonants with no vowels in between.

Figure 1: Sample of Nepali text about Linux, written in Devanagari

During the 1980s and 1990s, working in Nepali wasachieved by producing TrueType fonts that could beinstalled on PCs – IBM compatibles and Macintoshes –following practices that were wide-spread across South Asiaand South East Asia. These fonts were full of compromisesin the way the writing was produced, and the way it wasencoded internally – the approach can only be described as'hacking' and the situation it engendered as chaotic – butthey were successful for writing Nepali and some otherlanguages of Nepal. Each 'font' embodied not just the visualappearance on the screen and paper (more or less), but alsoembodied a keyboard layout and an internal encoding. Eachfont did things differently, no standards were followed, withthe consequence that data could not be exchanged betweencomputers unless they happened to use the same font. Butfor desktop publishing, and even newspaper publishing,these fonts sufficed until the Internet became more and morepervasive. For successful use of the Internet you just mustfollow standards.

Computing in Nepal has inevitably been influenced by whatwas happening in India. In India a research programme intothe computerisation of Indic writings systems, includingDevanagari, led to the sophisticated ISCII encoding1 and thespecial GIST hardware rendering that did justice to writingin the Indian subcontinent. Some of the ideas here also foundtheir way north into Nepal, and a hardware rendering enginewas produced, though it did not prove commercially

Pat Hall

1 Bureau of Indian Standard (1991) IS 13194 Indian Script Code for InformationInterchange – ISCII. Bureau of Indian Standards, Manak Bhavan, 9 Bahadur ShahSafar Mark, New Delhi 110002, India. A version of ISCII was adopted into Unicodeand determined the current tables for Devanagari and other Indian writing systems.

Page 22: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

COUNTRY.focus

Page 22 JUNE 2006

successful. In India the GIST hardware was too costly, andonly when PCs became powerful enough for the rendering tobe done in software did ISCII have real potential.Regrettably this potential was not realised due to the Indiangovernment practice of selling such products of publiclyfunded research. ISCII and the rendering DLLs never madeit into widespread use in India, and never made it to Nepal.Local language computing in India in general followed thesame hacking approach as Nepal2, with a variety of 8-bitTrue Type fonts with keyboard layouts and internalencodings implicit in the 'font'. This chaos continued untilthe late 1990s and early 2000s, and the widespread adoptionof Unicode.

In Nepal in 1997 there were a number of interested parties:

l Nepali software technologists, keen to support the use of computers in Nepali: notable among these was AllenTuladhar, chief executive of Unlimited Software, at thattime local agents for Microsoft;

l Madan Puraskar Pustakalaya (MPP), the archive libraryof Nepali, who wanted to index their library holding using standard Nepali, led by Kanak Mani Dixit, in partrepresenting the wider publishing industry of Nepal;

l expatriate software technologists concerned that software should be used effectively not just in English but also in local languages: Pater Malling from Denmark, Jeff Rollins from the US, and myself;

All of these parties met and worked together to produce afirst national standard for Nepali encoding, exploiting thefact that software technology by then enabled the

independence of keyboard layout, internal encoding, andvisual appearance on screen and in print. Central to this wasthe encoding, a white paper3 was written and from this thegroup proposed standards both for an 8-bit representation ofNepali-Devanagari, and for Unicode tables. Subsequentlythe 8-bit encodings have become irrelevant, and an attemptto get the Unicode tables accepted by the UnicodeConsortium was rebuffed. Some of the background to thiswork was written up by me in 19984.

However the Unicode rebuff did not destroy this enterprise,and the group obtained funding from IDRC to undertakesome preliminary development of fonts and conversionroutines. This was picked up by MPP who developed aNepali Unicode CD. Further support came from UNESCO,and then again from IDRC via the PANlocalisation project,and then from the EU through the Nelralec (BhashaSanchar) project. MPP found themselves entering thesoftware industry, and now manage one of the larger teamsof software engineers in Nepal. A complete localisation ofDebian Linux and Open Office was formally launched inDecember 2005 (see Figure 2).

Meanwhile Microsoft had developed its own support forIndic languages with the advent of the Uniscribe renderer,and to help communities where Microsoft could not justifythe commercial localisation of their software, launched theLanguage Interface Packs (LIPs). With support andencouragement from Microsoft, Allen Tuladhar developed aLIP for Nepali (see Figure 3), launched in November 2005in truly spectacular Microsoft fashion.

Figure 2: The Nepalinux distribution

2 Hall, Pat (1998) Vernacular Software in South Asia: what happens now and what is needed. WG9.4 conference, Bangkok, Thailand, March 1998.3 Font Standardisation Working Committee (1997) Nepali Font Standards. 4 Hall, Pat (1998) Designing of Code Tables with application to Nepali, Journal of Global Information Management, Volume 6 No.4, Fall 1998 pp5-14. ISSN 1062-7375

Page 23: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

COUNTRY.focus

Page 23JUNE 2006

Figure 3: The Windows LIP for Nepali

Meanwhile the softwareindustry of Nepal has beensteadily growing, andacquiring governmentsupport. The High LevelInformation TechnologyCommission (HLITC)was established in 2000 'toprovide crucial oversightand policy guidance forthe development of ICT

sector in the country'. Their 2005 report on IT in Nepal5while showing the growth in computing in Nepal, alsoshows just how small the IT sector really is. In 2004 theNepal IT industry had a turnover of US$16M, of whichabout $6M was in software exports (offshoring) and $6Mwas in internal services. The exact number of computersoverall is unclear, but a survey of institutions in 2005revealed just 12,000 in use, mostly running MS Windowsand standard applications. The survey found only 2,277qualified software professionals, just over half withbachelors degrees, 17.5 percent with masters degree, only 10with PhDs, and the rest qualified at sub-degree level. Thetotal output from universities between 2000 and 2004 wasjust over 1000 graduate computing and software engineers.The application of computers relies mostly upon people whotake training in the use of computer applications, providedby one of the thousands of high street training institutions.

The HLITC in turn has established the Nepal Language inInformation Technology (NLIT) steering committee and theForum for Information Technology Nepal (FIT Nepal). Astandard terminology for computing has been agreed, andkeyboard layouts are currently being standardised.

By the end of 2005 Nepal was equipped with a choice ofLinux and MS windows platforms for developing their ownsoftware to work in Nepali, or for localising software fromoutside to work in Nepali. When the HLITC held a meetingabout what to do next, the response was lukewarm. Whilethe discussions at the meeting were all in Nepali, the overallview was ironically that since the people at the meetingworked effectively with computers using the Englishlanguage interface, a Nepali interface was not necessary. So what will happen next?

There clearly is not the internal commercial market to drivethings forward, though the government as a major user, orpotential user, of computers could and should insist oninterfaces in Nepali. The HLITC itself is collaborating withKorean agencies to develop financial software that works inNepali. On the Bhasha Sanchar project we will be localisingopen source software for schools, possibly in collaborationwith other players.

There is an active telecentre movement starting up,organised by FIT Nepal under the title of Swaabhimaan,with the ambitious target of establishing 1,500 telecentresfrom a base of around 40 today. The applications used inthese telecentres will all be enabled for Nepali, though it isnot yet clear what the cutting edge applications will be thatwill make these succeed, especially when so many othertelecentre projects have failed to survive the end of initialstart-up funding.

Meanwhile more basic provision continues to be developed.The Nepali spell checker on Linux is being improved,Nepali TTS is being developed on both the Linux andWindows platforms, mobile phones and PDAs are beinglocalised, and high quality Nepali-style Devanagari fonts arebeing developed.

Once we have established the use of Nepali computing, weneed to move towards supporting other languages of Nepal,at least the ten or so written ones. There are activemovements to develop software support and fonts forMaithili and for Newari and other languages.

Other external factors will also intervene. There are severalrival PCs for developing countries being developed as wellas initiatives such as Negroponte's questionable One Laptopper Child and, more recently, Intel's robust workstation fortelecentres. If such machines are delivered with particularsoftware installed, will this software then have to belocalised to Nepali? The WSIS in Tunis in late 2005 mayalso have repercussions that move things in Nepal.

What I believe is going to be necessary is a number of seedlocalisation projects, so that the localisation of softwaresourced from outside Nepal becomes established as oneeffective route for providing software within Nepal. Thislocalisation should include not just language, but alsocustomisation of facilities to match Nepalese businesspractices, customs and culture. Equally well, we will need tolook for software developed internally for local markets tobe produced in Nepali or other local languages as a matter ofroutine. The danger is that lucrative offshore contracts willlure the best talent away from developing software for thebenefit of Nepal, to producing software for the (financial)benefit of others.

Above all we will need to maintain publicity that workingwith computers using Nepali is possible, desirable andimportant for the future of Nepal. However, competence inEnglish and its importance in working with the internationalcommunity should not be allowed to lead to the demise ofNepali or any other of the 100 or more languages of Nepal.

Pat Hall is an Emeritus Professor of Computer Science atthe Open University in the UK and he is currently theProject Director for the Bhasha Sanchar project based inMadan Puraskar Pustakalaya in Nepal. Pat may becontacted at [email protected]

5 HLITC (2005) Towards a Digital Future. A fact book on Information andCommunication Technology sector of Nepal.

Page 24: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in
Page 25: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in
Page 26: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

TILP.org

Page 26 JUNE 2006

Opportunity to InfluenceYour Professional

OrganisationAires, New Delhi, or Guangao do almost the same tasks, andhave the same issues and possible solutions. What are wedoing with this knowledge, and can't this be useful to all ofus in our professional lives? There is hardly a genuine linkbetween the people in the 'industry' (in the context thatmedical doctors, engineers, lawyers, etc. would seethemselves amongst their peers); nor a sense of how usefulit can be to know others in our area of work, shareinformation, ideas and experiences and support one anotherprofessionally.

The Institute of Localisation Professionals (TILP) is aimingto improve this. As the TILP council, we are trying to createa framework for all localisation professionals in the industry- to facilitate awareness, professional development, and themuch – needed international exchange we so often miss justsitting behind our computers. Yet, TILP is a growingorganisation, and can only work with the involvment of itsmembers. Only if members contribute to this professionalnetwork, and share whatever they have to share, will thisreally work. Join TILP and tell us what would you like to seehappen and what you need for your professional future. Forfurther information visit TILP at www.tilponline.org oremail us at [email protected].

Best regards and do not hesitate to get back to me or otherTILP council members with any comments or questions.

Angela Starkmann-Lehr, The TILP council

Dear TILP professional

On behalf of the TILP council I would like to welcome youto this first issue of a new look and direction for LocalisationFocus – The International Journal of Localisation. This yearis also set to see great progress for the CLP (CertifiedLocalisation Professional) programme. We are expandingthe number and range of courses for all three areas ofexpertise: project management, engineering and linguistics.Check the website www.tilponline.org to keep up with thenewest developments. The CLP programme can be oftremendous professional use for everyone working in theindustry. In addition to being of great benefit for individuals,the CLP programme can be used to certify a company'straining – and thus all localisation staff – through the CLP.Talk with your manager if you think that your colleagueswould like to become certified localisation professionals,and let us know if your company would like to learn moreabout leveraging the CLP into their own training.

It is our goal to make TILP more attractive and lively thanever, for its members, and by its members. In addition to theAsk the Expert sessions, we are organising informal TILPpub meetings in many cities. We are still looking formembers willing to organise these informal and fun events.Interested? Contact me (at [email protected]) and see howan event can be planned in your city.

When my bilingual young children are being asked a wordin their other language, it often takes them a minute to comeup with it – that is translation. Still, they refer to theirteacher both in Dutch and German as to their 'juffie' evenwhen Großmutter is visiting, and would never considerdoing differently – that is localisation. My daughter insistsshe wants to do just what her mommy does, when she growsup, but has no idea what this job really is. That is localisationtoo.

Are you a 'Localisation professional'? Do you know whatthis means? And can you explain it to your mother so thatshe'll understand? And do we really feel we are recognisedas a highly skilled, qualified group of professionals, and notjust as a random assembled group of people who possiblyspeak more than one language? It is a great, fascinating,energising, modern business we're all working in, and weshould be proud of it and get more credit for the hard workthat we do.

We sometimes tend to forget that a software engineer,translator, or project manager in Ireland, Seattle, Buenos

Angela Starkmann-Lehr Has worked as a freelancetranslator, copywriter and trainerfor German and internationalcompanies and since 2000 – asin-house linguist for MedtronicTechnical Literature Group (TLi) inHeerlen (the Netherlands).

“I love the work in the fast-paced, ever-changing internationallocalisation industry and am happy about the challenge topromote cooperation of colleagues all over the world. It is myimpression that many peers are not aware of the interactive,networked character of our area of work. Therefore promotingprofessional standards, communicating individual experiencesand representing all those working in the field - in short: helpingall those working in the localisation industry to do a better job, isthe major role I see for the TILP - and for myself as a member ofit’s council.”

Page 27: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

RESOURCES.loc

Page 27JUNE 2006

A unique research, teaching and learning resource worthtens of thousands of euros is now available free-of-chargeto ten selected GILC partner organisations. It coversworld-leading localisation tools and technologies. Fifteenlicenses will be installed initially at ten partner sites.

The GILC-LOTS Satellite Distribution was launched by the LRCon 19 April 2006 in Kuching, Malaysia. It will be installed initiallyat ten GILC partner organisations and covers fifteen licenses ofworld-leading localisation tools and technologies. These include:

l Alchemy Catalyst (visual L10N solution; www.alchemysoftware.ie)

l CatsCradle (web site L10N tool; www.stormdance.net)l PASSOLO (visual L10N solution; www.passolo.com)l Project Open (project management and workflow;

www.project-open.com)l Web Budget (web site L10N tool; www.webbudget.com)l Extended edition: SDL & TRADOS tools suite (translation

memory and visual localisation solutions, www.sdl.com)

All tools and licenses, worth tens of thousands of euro, have beenmade available by their developers to the LRC free-of-charge. Thedistribution will be maintained and distributed by the LRC free-of-charge as part of its contribution to the Global Initiative for LocalComputing (GILC), launched in September 2006 (www.gilc.info).

The GILC-LOTS Satellite Distribution builds on the success of theLocalisation Tools and Technology Laboratory and Showcase(LOTS), which opened its doors at the LRC in June 2003 and justshortly afterwards also became accessible remotely over theinternet (www.electonline.org). LOTS was funded by the EuropeanUnion eContent project ELECT and supported by dozens oflocalisation tools developers. The total value of the licensescontributed to LOTS by tools developers was just under half amillion euro.

At the launch, Prof. Dr. Abdul Rashid Abdullah, Vice Chancellor ofthe University of Malaysia Sarawak (UNIMAS), said "There is anenormous interest in Southeast Asia to remove what has often beendescribed as the Digital Divide, to make Information andCommunication Technologies (ICTs) more accessible to non-English speakers here in Southeast Asia while, at the same time,allowing the Western World to access our own digital content -content that originated outside of the western cultural referencesystem. Academic research, guided by social and economic needs,has to develop methodologies and technologies which go beyondthe mainstream localisation efforts we have become so familiarwith. We are confident that UNIMAS and the LRC will play acentral role in coming up with innovative and inspiring solutions."

The LRC would like to thank all the companies who so generouslycontributed their tools and technologies to the GILC-LOTSSatellite Distribution for their visionary commitment.

Free Access toLocalisation Technology

Page 28: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

PERSONAL.profile

Page 28 JUNE 2006

The best of bothworlds – languages

and computersWhen asked how she came to choose acareer in localisation, Angelika Zerfass

finds that it was not a conscious decisionbut that many things that happened in her

life just sort of lead up to it. Thelocalisation business has such a low

visibility in society that many people justdo not know that it is there and that it has

such exciting possibilities.

Angelika Zerfass was born inGermany in 1966, but Englishwas the first language she could

speak, as her father had been transferredto the US for two years when she was justten months old. After her return toGermany Angelika 'forgot' most of herEnglish until that day when her Englishteacher in third grade wondered wherethat New York accent came from whenshe was teaching her Oxford English. Latin and French followed during highschool and also a little Russian on theside – it seemed clear that languages wasthe way to go.

Angelika always wanted to be a translatorand when she started her studies inChinese and Japanese at the University ofBonn in 1987 her greatest wish was to sitin her room with lots and lots of books(she loves books!) and create the 'perfect'translation. "Today, I am thankful that itdid not turn out that way," she says.

A few terms into her studies, Angelikahad to choose an additional topic. Most ofher classmates chose law or businessadministration – but none of the topicsthat were on offer really interestedAngelika. Again it seemed her father's'fault' that she chose something else.

Angelika's father had been a computerprogrammer, so computers had been partof her life for years already (she ownedthe first 5.5 kg laptop of the wholelanguage department at that time), socomputational linguistics seemed a goodchoice that would allow Angelika tocombine her love for languages with herinterest for computers and programming.The topic of her thesis "Machinetranslation and the Chinese language" didnot earn her much love from herlanguage professor, as it had nothing todo with his favourite topic (femaleauthors in China after 1945), but she hadfound her own favourite topic –computers and languages.

After completing her studies atuniversity, Angelika tried to get a job as atranslator, but at that time nobody wouldpay a full-time translator for Asianlanguages and working freelance was notreally possible due to a lack ofexperience. In the end, Angelika startedout her career by working for some timeat the Japanese Embassy in Bonn, whereshe was the main contact point for theJapanese diplomats with the Germanministries of Defence and the Interior. Itwas an interesting experience, but also avery predictable job, so, feeling a little

Angelika Zerfass

Page 29: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

PERSONAL.profile

Page 29JUNE 2006

restless, Angelika started looking for something morechallenging. In 1997 a newspaper add from TRADOScaught her eye, and as they were opening their Japaneseoffice at that time, she applied for a job. Three months laterAngelika was in Japan, helping to set up the new office, butmainly working as a TRADOS specialist, supportingMicrosoft's localisation projects in the Asian region.

After 13 months in Japan, Angelika moved to Microsoft'sheadquarters in Seattle, Washington, where she performedthe same role, working as a training and support specialistfor all TRADOS-related questions.

During that time, Angelika gained a good insight into largelocalisation projects from the perspectives of the client, thevendor and also the translation software developers. "I metso many interesting people during this time and learned sucha lot that I think I could never really leave this industry –there is no other industry that allows us to combine so manyexciting things on a daily basis" she says.

In 2000, after her return to Germany, Angelika's nextchallenge was already beckoning – luring her away from thereduced scope of just one company's products to anindependent existence where she could deal with the toolsand technologies in the field.

For the last six years, Angelika has been workinginternationally as an independent consultant and trainer ontranslation tools and translation tools-related processes. Sheis often involved in the development or beta-testing of newtools or software versions, teaches translation tools at theUniversity of Bonn, holds training classes for translators,project managers and authors for different translation toolsand the processes that are involved in developing and usingsuch tools or helps companies to decide on and set up theright tool for their projects.

"I am happy that I can do what I do best, that isteaching and learning, in an environment that is bothdemanding and stimulating at the same time. I havefound my niche where two things that often aremutually exclusive – the logic of computers and theillogical behaviour of language – can coexist. True,sometimes it almost gets too much and I long for thatisland where nobody can reach me. But then again, Idon't think I could leave this industry behind – ever!"

Angelika Zerfass is based in Germany and works as anindependent consultant and trainer for translation tools.She can be reached at [email protected]

Page 30: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in

ORGANISATION.news

Page 30 JUNE 2006

The Institute of LocalisationProfessionals (TILP)TILP is the industry association for individuals inthe translation, localisation, internationalisationand globalisation industry. TILP's objective is todevelop professional practices in localisation and

support and offer benefits to industry professionals.

News BriefsCertified Localisation Professional Training (CLP)TILP accredited four new course providers in March:Alchemy Software Development PASS EngineeringSTAR Barcelona Ushuaia

Welcome to Karl Kelly - TILP Project Manager Karl has already done tremendous work on TILP's CLP Programme and weare very happy to have him on board. Feel free to [email protected] with your feedback or questions about TILP or theCLP.

New Professional MembersCongratulations to François Lanctôt, Yuka Kurihara-Pelish, Aitor Medrano,Peide Zha, and Elsebeth Flarup for the approval of their applications asProfessional Members. Qualification as a Professional Member recognisestheir knowledge, commitment and experience in the industry.

Ask the ExpertsTILP organises regular Ask the Experts sessions where professionals sharetheir knowledge with TILP members. Three TILP Ask the Experts sessionstook place at Localization World in Barcelona on 30 May:Real-world Terminology Management Solutions, by Sue-Ellen Wright,Kent State University and Chair of the ATA Terminology CommitteeAdvanced L10N Automation Techniques, moderated by Tony O'Dowd,CEO and President of Alchemy Software DevelopmentHow to Master .NET Localization, by Florian Sachse, Managing Directorof PASS EngineeringVisit www.tilponline.org for further details.

The Globalization and LocalizationAssociation (GALA)GALA is an international non-profit associationthat promotes translation services, languagetechnology and language management solutions.

The member companies worldwide include translation agencies,localisation service providers, globalisation consultants and technologydevelopers.

Online Career Services CentreGALA recently launched a Career Services centre on the GALA website,designed to facilitate recruiting and job placement for emergingprofessionals and current professionals in the fields of businessdevelopment, engineering, internationalisation, marketing, sales,operations, project management, testing and translation. Industryprofessionals are invited to create job seeker profiles and search job listingsposted by GALA member companies.

GALA Articles Database - Call for ArticlesGALA invites authors to submit new and previously published articles onglobalisation, internationalisation, localisation and translation to the onlineGALA Articles Database at http://www.gala-global.org/index.php?action=view_articles&source=resources.

Past EventsGALA European Member Meeting in BarcelonaGALA held its fourth European member meeting on 30 May in Barcelona.The focus of the meeting was interaction and input from members on keyissues for member companies and the association.

Localization World BarcelonaGALA member companies Beijing E-C Translation, eLocalize, HumanScience Co. Ltd., Idea Factory Languages, LocTeam, MAGIT, ORCO, andSatellite Station exhibited in a group booth at the Localization Worldconference. GALA conference sessions included 'Vendor Management:Perspectives from the Buyer and the Supplier (GALA: Open to allattendees)' and a vendor-only session called 'Selling Your LocalisationCompany?'

LRC NewsThe LRC is a research centre of theUniversity of Limerick (Ireland) and

based at its Department of Computer Science and Information Systems(CSIS). The LRC works with worldwide digital publishers and theirpartners who are interested in future technologies and processes forGILT, provides relevant well-researched content-rich information onfuture trends and technologies, and is a unique industry and academiccollaboration which provides an unparalleled network of expertise.

Universiti Malaysia Sarawak and LRC sign Memorandum ofUnderstandingOn 19 April 2006, the two organisations agreed to establish closer links topromote internationalisation- and localisation-related research andteaching.

Karl Kelly LRC ManagerThe LRC has appointed Karl Kelly as its new manager on 5 May 2006. Karlis a graduate of the University of Limerick and its Localisation Programme.

LRC launches GILC-LOTS Satellite DistributionThe LRC launched the GILC-LOTS Satellite Distribution in Kuching,Malaysia, on 19 April 2006. The distribution will initially be installed at tenglobal sites and covers fifteen licenses for each of the leading localisationtechnologies covered by the distribution.

Page 31: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in
Page 32: VOL. 5 Issue 2 Localisation Focus · 2013-11-20 · June 2006 VOL. 5 Issue 2 Using Web Services for Translation Formatting and the Translator: Why XLIFF Does Matter Localisation in