
2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

DESCRIPTION

Presentation by CPSL and tauyou at the tekom annual conference. It presents the case of a successful implementation of machine translation at a mid-size language service provider.

Page 1: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

Implementation of a Machine Translation Engine at CPSL

Speaker: Belén García-Ochoa (CPSL)

Co-speaker: Diego Bartolomé (tauyou <language technology>)

Page 2: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

The speaker

Belén García-Ochoa

Localization Director at CPSL

CPSL has been a multilingual service provider since 1963

Headquarters in Barcelona, Spain

Other offices in Madrid (Spain), Germany, and the UK

CPSL staff includes over 50 people

Page 3: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

The co-speaker

Diego Bartolomé

CEO of tauyou <language technology>

tauyou has provided language technologies for the localization industry since 2006

Main clients: medium-sized LSPs

Headquarters in Barcelona

Page 4: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

CPSL and Machine Translation

Post-editing services provided to a software company for a huge project

A large volume of translated words in a tight timeframe

Page 5: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

Main difficulties found

Lots of clients

Different subject matters

Different language combinations

Page 6: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

Workaround

Lots of clients:

A list of the most appropriate clients for using the engine was created

Based on this list, we established the different subject matters and the different language combinations

Page 7: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

Human post-editing vs. human translation

A translator typically translates 2,500 words per day.

A reviewer of human translation typically reviews 12,000 words per day.

On average, 8,000 words can be post-edited per day.
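
The daily throughput figures above can be turned into a rough effort estimate. The following is a minimal sketch, not CPSL's actual planning method: the function names are hypothetical, and it assumes that post-editing replaces both the translation and the review step, which the slide does not state.

```python
from typing import Dict

# Daily throughput figures quoted on the slide (words per person per day).
TRANSLATION_WORDS_PER_DAY = 2_500
REVIEW_WORDS_PER_DAY = 12_000
POST_EDITING_WORDS_PER_DAY = 8_000


def person_days(word_count: int, words_per_day: int) -> float:
    """Return the effort in person-days for a given daily throughput."""
    return word_count / words_per_day


def compare_workflows(word_count: int) -> Dict[str, float]:
    """Compare human translation plus review against post-editing alone.

    Assumption (not from the slides): the post-editing pass replaces both
    the translation step and the review step.
    """
    human = (person_days(word_count, TRANSLATION_WORDS_PER_DAY)
             + person_days(word_count, REVIEW_WORDS_PER_DAY))
    post_editing = person_days(word_count, POST_EDITING_WORDS_PER_DAY)
    return {
        "human_translation_plus_review": human,
        "post_editing": post_editing,
    }


if __name__ == "__main__":
    for workflow, days in compare_workflows(120_000).items():
        print(f"{workflow}: {days:.1f} person-days")
```

Under these assumptions, a 120,000-word project needs 58 person-days with human translation and review, against 15 person-days of post-editing. The time needed to run the machine translation itself is ignored.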

Page 8: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

The tauyou solution

Dedicated hybrid machine translation engine that is continuously customized

Corpus-based with rules for pre- and post-processing

Data confidentiality is guaranteed

Translation speed
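
The slides do not describe the pre- and post-processing rules themselves, so the following is only an illustrative sketch of how rules can wrap a corpus-based engine. It is not tauyou's implementation. The placeholder pattern, the function names, and the `engine` callable are all hypothetical: one pre-processing rule masks tags and placeholders so the engine cannot alter them, and the post-processing rules restore them and remove stray spaces before punctuation.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical rule: inline tags, {0}-style and %s/%d placeholders are protected.
PLACEHOLDER_PATTERN = re.compile(r"<[^>]+>|\{\d+\}|%[sd]")


def protect(text: str) -> Tuple[str, List[str]]:
    """Pre-processing: replace protected items with numbered masks."""
    protected: List[str] = []

    def _mask(match):
        protected.append(match.group(0))
        return f"__PH{len(protected) - 1}__"

    return PLACEHOLDER_PATTERN.sub(_mask, text), protected


def restore(text: str, protected: List[str]) -> str:
    """Post-processing: put the original items back in place of the masks."""
    for index, original in enumerate(protected):
        text = text.replace(f"__PH{index}__", original)
    return text


def translate_with_rules(text: str, engine: Callable[[str], str]) -> str:
    """Run pre-processing, the engine, then post-processing."""
    masked, protected = protect(text)
    translated = restore(engine(masked), protected)
    # Post-processing rule: no whitespace before punctuation marks.
    return re.sub(r"\s+([.,;:!?])", r"\1", translated)


if __name__ == "__main__":
    # Stand-in for a real MT engine: it just upper-cases the text.
    print(translate_with_rules("Click <b>Save</b> to store {0} files .", str.upper))
```

Here `str.upper` merely stands in for a real engine call so that the example runs on its own; the tags and the `{0}` placeholder come back unchanged while the surrounding text is transformed.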

Page 9: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

Main characteristics

Any type of document

Glossary prioritization

Fast domain creation/update

Fully customizable

Quality metrics computation

Terminology extraction

Page 10: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

Optimum domain creation

gather in-domain data

train the translation solution

enrich solution with related text

terminology prioritization

update the translation solution

add rules to enhance quality

weekly updates

Page 11: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

CPSL workflow 1

Optimize translation quality for a client

gather client data

train the translation solution

add rules to enhance quality

continuous improvement

Page 12: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

CPSL workflow 2

General-purpose translator

gather clients' data

add generic texts to provide a good sample

train the translation solution

add rules to enhance quality

periodic improvement

Page 13: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

Other use cases

Data creation and enhancement

user defined

unaligned translated documents

generic translations

optimum corpus/memories creation

rule-based extension/filtering

Page 14: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

tauyou interface

Tabs can be customized

Page 15: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

Quality metrics

Detailed analysis of translated documents

Several customized parameters, including word error rate, number of word edits, tag differences, etc.

Useful for machine translation and also in the normal quality process
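
Two of the parameters named above, the number of word edits and the word error rate, can be illustrated with a word-level edit distance. This is a minimal sketch using the standard Levenshtein definition. The slides do not say how tauyou computes its metrics, so the function names and the choice of the post-edited text as the reference are assumptions.

```python
def word_edit_distance(reference: str, hypothesis: str) -> int:
    """Minimum number of word insertions, deletions and substitutions
    needed to turn the hypothesis into the reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    # previous[j] holds the distance between ref[:i-1] and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word edits divided by the number of words in the reference."""
    reference_length = len(reference.split())
    if reference_length == 0:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(reference, hypothesis) / reference_length


if __name__ == "__main__":
    post_edited = "press the green button to start"   # assumed reference
    raw_mt = "push the green button for start"        # assumed MT output
    print(word_edit_distance(post_edited, raw_mt))
    print(round(word_error_rate(post_edited, raw_mt), 3))
```

In the example, two of the six reference words were changed by the post-editor, giving two word edits and a word error rate of about 0.333. A lower value means the raw output needed less post-editing.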

Page 16: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

Terminology extraction

Unilingual and bilingual terminology lists

Customized according to position in the sentence, word type, number of words, etc.

Feeds the MT engine or a tool for human translators
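
A unilingual term list filtered by number of words and word type can be sketched with simple n-gram counting. This is an illustration only, not tauyou's extraction method: the stopword list, the function name, and the `max_words` and `min_count` parameters are hypothetical, and the word-type filter is reduced to rejecting candidates that start or end with a stopword.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "on", "for", "is", "with"}


def extract_term_candidates(
    sentences: Iterable[str], max_words: int = 3, min_count: int = 2
) -> List[Tuple[str, int]]:
    """Return (term, frequency) pairs for frequent word n-grams."""
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z0-9-]+", sentence.lower())
        for size in range(1, max_words + 1):          # number-of-words filter
            for start in range(len(tokens) - size + 1):
                ngram = tokens[start:start + size]
                # Crude word-type filter: no stopword at either edge.
                if ngram[0] in STOPWORDS or ngram[-1] in STOPWORDS:
                    continue
                counts[" ".join(ngram)] += 1
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    corpus = [
        "Open the translation memory.",
        "The translation memory is full.",
        "Export the translation memory to a file.",
    ]
    for term, frequency in extract_term_candidates(corpus):
        print(frequency, term)
```

On this small corpus, the candidates that pass the frequency threshold are "translation", "memory" and "translation memory", each seen three times. A bilingual list would additionally need aligned source and target sentences, which this sketch does not cover.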

Page 17: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

The future

Increase usage of translation memories

Automatic domain classification

Source text enhancement (spelling, grammar, structure, terminology ...)

Special words detection

New domains/language pairs creation

Page 18: 2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL

Questions?

[email protected]

www.cpsl.com

[email protected]

www.tauyou.com