

DESCRIPTION

Co-presentation by Kerstin Bier and Manuel Herranz at Localization World Barcelona 2011 on the progress achieved with a customized PangeaMT engine at Sybase. It covers the initial machine translation implementation, machine translation customization for Sybase, the use of the client's data for training, and productivity results.


MT Experiences at Sybase

Kerstin Bier, Manuel Herranz (PangeaMT)

MT at Pangeanic: From Trial to Service

2007 and before
• Rule-based (RB) tests with commercial software
• Insufficiently good output
• Only internal production
• EU Post-Editing Award

2007/08
• V1: small data sets (2-5M words), automotive & electronics (ES), then FR/IT/DE in other fields

2009/10
• Division born
• Hundreds of engine trials and language combinations
• Open source to commercial
• TMX / XLIFF workflows

2011/12
• DIY SMT
• Empower users
• Glossary
• Automated re-training
• Transfer architecture and know-how to users
• Compatibility with commercial formats (ttx, sdlxliff, itd)

MT and PE at Sybase: From Trial to Production

2009/2010 Trial project with PangeaMT (EN-DE)

Engine: 2.5 million words, narrow domain (one product)

Results: Surprisingly good (BLEU: 49, PE productivity > 70%)

2010 Project 1: MT and Post-Editing (EN-DE)

Engine: 5 million words

Major new release, lots of new content: 400,000 "new" words post-edited

2010/2011 Project 2: Retraining, MT + PE

Retraining with post-edited and cleaned-up TMs

Small product update: 80,000 "no match" words, MT + PE

Expectations

Initial system: expected output (% of MT words)
• Excellent: 10%
• Good: 30%
• OK: 30%
• OK or bad?: 20%
• Bad: 10%

Retraining 1, Retraining 2, …
• MT output gets better over time
• Continuous PE productivity increase
• Turnaround times get shorter
• Cost savings go up over time

Main Challenges

Results: From the "human" perspective

Results: The MT perspective

[Chart: MT output by METEOR score range (100-70, 50-69, 40-49, 30-39, 0-29); chart labels: Project 1, Retraining, Project 2]
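The METEOR score ranges above lend themselves to a simple per-segment bucketing. The following is a minimal sketch only: it assumes METEOR scores scaled to 0-100 and one score per segment, and the names `BANDS`, `band_for`, and `band_shares` are hypothetical, not part of PangeaMT or any Sybase tooling. The slides do not say how the original chart was computed.

```python
from collections import Counter

# Score bands as listed on the slide, highest first.
# Assumption: METEOR is scaled to 0-100; fractional scores between two
# bands (e.g. 69.5) fall into the lower band.
BANDS = [(70, "100-70"), (50, "50-69"), (40, "40-49"), (30, "30-39"), (0, "0-29")]


def band_for(score: float) -> str:
    """Return the label of the band that contains the given score."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: the lowest band starts at 0")


def band_shares(scores: list[float]) -> dict[str, float]:
    """Return the percentage of segments that fall into each band."""
    counts = Counter(band_for(s) for s in scores)
    total = len(scores)
    if total == 0:
        return {label: 0.0 for _, label in BANDS}
    return {label: 100.0 * counts[label] / total for _, label in BANDS}
```

Running `band_shares` once on the scores from Project 1 and once on those from Project 2 would give two distributions that can be compared to see whether retraining shifted segments into the higher bands.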

Examples: MT output and PE effort
• Minimal PE effort
• Small PE effort
• Medium PE effort

Lessons Learned
• Small in-domain MT engine = excellent starting point
• For future projects: faster turnaround, lower costs; other product lines; experiences help with other languages
• MT output better than expected: often better than translators said; improved after retraining
• We think that improving translator acceptance will improve productivity. Idea: filtering out poor translations (confidence scores). Retraining, retraining, retraining
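The filtering idea mentioned above could look roughly like the sketch below. It is an illustration under stated assumptions, not the method used at Sybase or in PangeaMT: it assumes each MT segment carries a confidence score between 0 and 1, and the `Segment` class, the `split_by_confidence` function, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    source: str
    mt_output: str
    confidence: float  # hypothetical engine confidence in [0, 1]


def split_by_confidence(segments, threshold=0.5):
    """Split segments into (for post-editing, for translation from scratch).

    Segments below the threshold are withheld from post-editors so that
    poor MT output does not slow them down. The threshold is an assumed
    value and would need tuning on real data.
    """
    to_post_edit, to_translate = [], []
    for seg in segments:
        if seg.confidence >= threshold:
            to_post_edit.append(seg)
        else:
            to_translate.append(seg)
    return to_post_edit, to_translate
```

Translators would then only see MT proposals that pass the threshold, which is one possible way to raise acceptance of the MT output.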

Predictions

[Chart: user empowerment over time, 2009 to 2018; year 2016: thousands of customized MT systems]

PangeaMT: Tech. not the realm of a few providers


Thank you!

Kerstin Bier, Sybase, an SAP Company

Manuel Herranz, PangeaMT
