TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE Using Moses To Win Business That Would Otherwise Be Lost. Two Practical Use Cases at AVB Translations 10:00-10:20 Wednesday, 17 October Joël Sigling AVB Translations

TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Seattle, Two Practical Use Cases at AVB Translations, Joel Sigling, AVB, 17 October 2012


This presentation is part of the MosesCore project, which encourages the development and use of open source machine translation tools, notably the Moses statistical MT toolkit. MosesCore is supported by European Commission grant number 288487 under the 7th Framework Programme. For the latest updates, follow us on Twitter: #MosesCore


Page 1

TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE

Using Moses To Win Business That Would Otherwise Be Lost. Two Practical Use Cases at AVB Translations

10:00-10:20, Wednesday, 17 October

Joël Sigling, AVB Translations

Page 2

Using Moses to win business that would otherwise be lost

Joël Sigling, Director Technology & Business Partners

TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Seattle, 17 October 2012

© AVB Translations, 2012

Page 3

AVB Translations background

• Amstelveens Vertaalburo: founded 1972 – traditional, high-quality agency

• Translation World: founded 2002, tech-savvy all-round player

• Merger in 2010 >> AVB Translations: premium brand with strong tech focus

• Top 5 player in the Netherlands, 2011 turnover € 4.6 million

• Core business: general translations – legal, financial, technical, …

• New branch: Single Language Vendor (Dutch) for global MLVs

Page 4

History of MT interest

• Member of TAUS since 2008, 1st round table Amsterdam

• Visited TAUS User Conferences in US since 2009

• Sense of urgency developed, merger distraction 2010

• Action in 2011 after merger

• 2011: Our own choice: Dutch <> English legal domain engine

• 2012: At customer request, tourism engines English > X languages

• Why SMT, why Moses? Quicker, cheaper, similar quality (research shows)

Page 5

Case 1: Legal domain engine

• Legal translations are approx. 40% of AVB business, 80% of that Dutch <> English

• Not the obvious choice: people said MT wouldn’t work for legal because sentences are too long and the material too intricate

• Statistical MT suited to non-stylistic materials, e.g. legal

• If this works, we can make MT happen for all other domains

Page 6

Legal engine: Objectives

• Increased productivity: no BLEU % target, but tangible, practical results. How much more can a translator do compared with human translation (HT)?

• Tool to offer usable quality with very quick turnarounds for high volume (typical “Friday afternoon lawyer requests”)

• Becoming an MT front runner in the non-localization sector for Dutch (5th language in Europe after FIGS)

Page 7

Legal engine: Development

• Choice between in-house and external development
  • In-house: control, developing expertise, lower long-term cost
  • External: lower initial cost, much more expertise > best for now

• Our pre-requisites for development option
  • Ownership and free access to engine
  • Assurance data will not be used or copied by builder
  • Acceptable costs for development & usage
  • Skilled partner > AsiaOnline, CrossLang, Pangeanic, LetsMT, SmartMate??

• CrossLang > all of the above, closest to our office, independent

Page 8

Legal engine: Input data

• Highest-quality AVB Dutch <> English legal translations: approx. 700k words per language. Predominantly civil law.

• Not fully reviewed AVB TM, still high-quality: approx. 10 million words per language. Predominantly civil law.

• Legal translations harvested by CrossLang, more diverse legal material: 7 million words per language

Page 9

Legal engine: Initial test results

Various productivity tests done in the CrossLang and TAUS productivity assessment tools (very similar): productivity between 5% and 20% higher for post-editing than for human-only translation. These results held for both a very experienced legal translator and a translation novice (intern).

Encouragement to continue…

Page 10

Legal engine: Results in practice

• Live rush translations done in first few weeks for new customers:

• 1,500-word trial done for a law firm needing high volume in a very short time. Post-edited in 75 minutes. Customer happy with quality/price ratio.

• 25,000 words in two days with light PE effort by two post-editors. Quality estimate 80-90% of human translation.

• 4,500 words in 3 hours with almost full PE effort by one post-editor. Quality estimate >90% of human translation

• 15,000 words in one day, done by two post-editors. Quality estimate 80-90% of human translation
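
To make the figures above easier to compare, the following is a minimal sketch that converts each job into words per post-editor per hour. Two things are assumptions and not from the slides: a working day is taken as 8 post-editing hours (`HOURS_PER_DAY`), and the 1,500-word trial is taken to have been handled by a single post-editor. The function name `words_per_editor_hour` is purely illustrative.

```python
# Rough throughput comparison for the four live jobs listed above.
# Assumption (not from the slides): one working day = 8 post-editing hours.
HOURS_PER_DAY = 8.0


def words_per_editor_hour(words, hours, post_editors=1):
    """Average number of words handled per post-editor per hour."""
    return words / (hours * post_editors)


jobs = [
    # (label, words, elapsed hours, number of post-editors)
    ("1,500-word trial", 1_500, 75 / 60, 1),  # editor count assumed to be 1
    ("25,000 words, light PE", 25_000, 2 * HOURS_PER_DAY, 2),
    ("4,500 words, almost full PE", 4_500, 3.0, 1),
    ("15,000 words", 15_000, 1 * HOURS_PER_DAY, 2),
]

rates = {label: words_per_editor_hour(w, h, n) for label, w, h, n in jobs}

for label, rate in rates.items():
    print(f"{label}: {rate:.0f} words per post-editor per hour")
```

Only the two jobs stated in minutes or hours (the 1,500-word trial and the 4,500-word job) are free of the working-day assumption; they come to 1,200 and 1,500 words per hour. The day-based jobs shift proportionally if a different day length is assumed.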

Page 11

Legal engine: more success

• Other successes:

• 7 documents translated and post-edited this way in one court case. Customer very happy, case won!

• Three new customers secured because we were able to turn around a large volume in a very short time.

• Some customers are ordering this type of translation instead of normal translation, even with normal deadlines.

• Sales department have a new USP to sell to our legal customers. Interest is growing!

Page 12

Case 2: Hotel booking engine

SITUATION

• Big (4 million words) hotel booking site, partly available in 10 languages

• Booking funnel fully and best-selling hotels largely translated

Customer ambition:

• Have all 4M words translated & do more languages

BUT

• No budget for full human translation/post-editing

• Google too expensive, and risk of being considered duplicate content at some point in the future

Page 13

Booking engine: AVB solution

• Combination of TM & MT to translate all content & differentiate in quality where necessary

• Build up new TMs for “new” languages and use TM from “old” languages to create dataset for engines with harvested material

• Customer to get any TM matches for free after paying for initial baseline (20,000 words)

• Engine development paid for, but throughput almost free

Page 14

Booking engine: Challenges

• Getting traction within client DMT (took two years)

• Finding enough relevant data to build a decent engine (classic)

• Deciding on KPIs from the customer: when would it be successful?

• Technical: getting workable data in and out of customer CMS

Page 15

Booking engine: Results

• Two languages chosen for trial: French and Polish

• Baseline TM built for these, all website TM matches replaced for French and Polish. Has already saved the customer thousands of euros.

• Two Moses engines built by TauYou as trial: French and Polish

• 1st Polish engine results were put online as a test; bookings increased in spite of poor initial language quality

• Customer has approved four more engines: Dutch, French, German, Spanish and Italian

Page 16

Business conclusions

• We are already making money on MT. All investments paid back in half a year.

• Extra turnover forecast thanks to legal engine: EUR 50,000+ in 2013 + EUR 20,000 in cost savings (bottom line).

• Turnover forecast for TM+MT solution for hotel booking site in 2013: anywhere between EUR 50,000 and 200,000.

FOR AVB, MT IS A GOOD INVESTMENT!

Page 17

Phone: +31 20 645.66.10

Mobile: +31 625.025.475

E-mail: [email protected]

Twitter: @JoelAVB

Address: Ouderkerkerlaan 50, 1185 AD Amstelveen, The Netherlands

Website: www.avb.nl