Automatic Translation Error Analysis
Project results & conclusions
MT Bondone (side project)
Outline
● Got to know the tools; cross-evaluation (all)
● Hjerson++ (Maja)
● Addicter's friendlier and richer interface (Dan)
● A non-paranoid alignment for Addicter (Martin)
● Both tools on WMT11 En-De (Sabine)
● Both tools on dataset X (Arianna, Suhel)
Addicter vs Hjerson
● Addicter (HMM)
● The Moses decoder uses beam search
● The Moses program employs suboptimal pruning
● Hjerson & Addicter (Greedy)
● The Moses decoder uses beam search
● The Moses program employs suboptimal pruning
● Failed translation attempt or missing+extra pair? Hard task
● Turns out, the second strategy is better (ranking, error prec./rec.)
Hjerson on Czech data
● WMT'09, En-Cs
● Very rich morphology including inflections and derivations
● Very free word order
● Flexible human error analysis (related to the given reference, but only loosely)
Ranking evaluation
● correlations over error categories between 0.4 and 0.7
● correlations over translation systems:
● strongest for missing words and lexical errors (0.7 - 1)
● weaker for reordering and morphological errors (0.2 - 0.8)
● reason: above-mentioned characteristics of the Czech language
● (again) weak for extra words (-0.2 - 0.4)
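A minimal sketch of how a rank correlation of this kind can be computed. It assumes Spearman's coefficient, which the slides do not name, and the error counts below are hypothetical toy values, not figures from the project.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical toy data: missing-word counts for four systems,
# once from an automatic tool and once from human annotation.
auto_missing = [120, 95, 140, 110]
human_missing = [100, 90, 150, 105]
rho = spearman(auto_missing, human_missing)
```

A coefficient near 1 means the tool orders the systems the way the human annotation does; values near 0 or below, as reported for extra words, mean the automatic counts say little about the human ranking.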
Precision, Recall, Confusions
● General problem:
● (again) extra words confused with lexical errors
● Problems related to the Czech language:
● morphological errors confused with lexical errors
● many more reordering errors
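Per-class precision, recall and confusions can be read off a table of (gold tag, predicted tag) pairs. The sketch below assumes token-level tags that are already aligned one to one; the tag names and the six-token example are hypothetical.

```python
from collections import Counter


def evaluate_tags(gold, predicted):
    """Per-class (precision, recall) plus confusion counts for token-level error tags."""
    assert len(gold) == len(predicted)
    confusion = Counter(zip(gold, predicted))  # (gold, predicted) -> count
    scores = {}
    for c in sorted(set(gold) | set(predicted)):
        tp = confusion[(c, c)]
        n_pred = sum(v for (g, p), v in confusion.items() if p == c)
        n_gold = sum(v for (g, p), v in confusion.items() if g == c)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_gold if n_gold else 0.0
        scores[c] = (precision, recall)
    return scores, confusion


# Hypothetical tags for six hypothesis tokens.
gold = ["ok", "extra", "lex", "lex", "ok", "extra"]
pred = ["ok", "lex", "lex", "lex", "ok", "extra"]
scores, confusion = evaluate_tags(gold, pred)
```

In this toy case one gold "extra" token is predicted as "lex", so confusion[("extra", "lex")] is 1: recall of extra words drops to 0.5 and precision of lexical errors to 2/3, which is the kind of extra/lexical confusion the slide describes.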
WER alignment on base forms
Improves some aspects:
● better correlations over error classes
● better recall of extra words (+ less confusion with lexical errors)
● price: deterioration of lexical recall + more lexical errors confused with extra words
● however, the gain is significantly larger
● better precision of extra words
● price: more correct words are tagged as extra
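To illustrate why aligning on base forms changes the picture, here is a sketch of a standard Levenshtein (WER) alignment run once on surface forms and once on lemmas. It is not Hjerson's code; the function name, the tiny German example and its hand-assigned lemmas are assumptions made for illustration.

```python
def wer_align(ref, hyp):
    """Levenshtein alignment of two token lists; returns (distance, operations)."""
    n, m = len(ref), len(hyp)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i
    for j in range(1, m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    # Backtrace from the bottom-right corner to recover one optimal alignment.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            kind = "match" if ref[i - 1] == hyp[j - 1] else "sub"
            ops.append((kind, i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", i - 1, None))  # reference word missing in hypothesis
            i -= 1
        else:
            ops.append(("ins", None, j - 1))  # extra word in hypothesis
            j -= 1
    return d[n][m], ops[::-1]


# Hypothetical example with hand-assigned lemmas.
ref_surface = ["die", "neuen", "Kollegen"]
hyp_surface = ["der", "neue", "Kollege", "auch"]
ref_lemmas = ["der", "neu", "Kollege"]
hyp_lemmas = ["der", "neu", "Kollege", "auch"]

surface_distance, _ = wer_align(ref_surface, hyp_surface)
lemma_distance, lemma_ops = wer_align(ref_lemmas, hyp_lemmas)
```

On surface forms nothing matches and the distance is 4, so the extra word "auch" cannot be told apart from the substitutions. On lemmas the three content tokens match and "auch" is isolated as the single insertion, consistent with the better recall of extra words noted above.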
Addicter / Visualizer
● Easy install (uses internal webserver now)
● Improved interface
● Reference-hypothesis alignment
● Multiple alignments of the same sentence
● Color highlighting of automatically found errors
● … DEMO
Addicter on English
● Automatic testing system for all tools and datasets
● Greedy alignment for Addicter
● fast (linear search)
● based on context, lemma and PoS similarity
● suffers from lexical error overkill (better than not detecting them)
● evaluated on manually annotated WMT09 De-En – similar to Hjerson
● Addicter's best built-in aligner
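The idea of a greedy alignment driven by context, lemma and PoS similarity can be sketched as follows. This is not Addicter's implementation: the scoring weights, the threshold and the token representation are all assumptions chosen for illustration, and the example tokens are hypothetical.

```python
def similarity(hi, ri, hyp, ref):
    """Hypothetical score: lemma match weighs most, then PoS, then neighbouring lemmas."""
    score = 0.0
    if hyp[hi]["lemma"] == ref[ri]["lemma"]:
        score += 2.0
    if hyp[hi]["pos"] == ref[ri]["pos"]:
        score += 1.0
    for off in (-1, 1):  # context: do the left/right neighbours share a lemma?
        a, b = hi + off, ri + off
        if 0 <= a < len(hyp) and 0 <= b < len(ref) and hyp[a]["lemma"] == ref[b]["lemma"]:
            score += 0.5
    return score


def greedy_align(hyp, ref, threshold=2.0):
    """Align each hypothesis token to its best still-free reference token, left to right."""
    used, alignment = set(), {}
    for hi in range(len(hyp)):
        best, best_score = None, -1.0
        for ri in range(len(ref)):  # linear scan over the reference
            if ri in used:
                continue
            s = similarity(hi, ri, hyp, ref)
            if s > best_score:
                best, best_score = ri, s
        if best is not None and best_score >= threshold:
            alignment[hi] = best
            used.add(best)
    return alignment


def tok(lemma, pos):
    return {"lemma": lemma, "pos": pos}


# Hypothetical lemmatized, PoS-tagged tokens.
hyp = [tok("neu", "ADJ"), tok("Ratsmitglied", "NOUN"), tok("werden", "VERB")]
ref = [tok("der", "DET"), tok("neu", "ADJ"), tok("Ratsherr", "NOUN"), tok("werden", "VERB")]
alignment = greedy_align(hyp, ref)
missing = [i for i in range(len(ref)) if i not in alignment.values()]
```

Here "Ratsmitglied" is aligned to "Ratsherr" on PoS and context alone, so the pair would surface as a lexical error, while the unaligned reference token "der" would be reported as a missing word. Because decisions are never revised, an early wrong choice can block a better later one, which is the usual trade-off of a greedy strategy.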
Test on WMT11 EN-DE Data
● 22 MT systems/outputs
● No manually annotated gold standard
● Ranking according to manual judgments
● Application of both Addicter & Hjerson to all the systems' output
Number of Errors
● Addicter tags between 81k and 90k of 150k tokens with errors, Hjerson between 84k and 95k.
● The systems with the fewest errors:
● online-B: rank #2 of 22
● illc-uva: rank #21 of 22
● RBMT systems are tagged with more errors
Fun with Correlations
                    Addicter   Hjerson
Total errors           0.003    -0.109
Inflection errors      0.113     0.432
Extra words           -0.283    -0.351
Missing words          0.268     0.427
Lexical errors         0.086    -0.275
Reordering             0.189     0.579
Infl+ext+reord                   0.654
Error Analysis of the Error Analysis
● Addicter tags very conservatively wrt reordering/inflection, Hjerson is greedy.
● The lack of alignment in Hjerson leads to many errors: the German determiner is often wrongly tagged with inflection or reordering errors.
● Addicter abuses extra/miss (can be fixed by creating a better alignment).
Example - Hjerson
● Aktuálně.cz "tested" the Social Democrat members of the new Council in terms of the well-established slang that originated in the town hall during the few last years, when Prague was ruled by the current coalition partners.
● Die Zeitung Aktuálně.cz hat Mitglieder des neuen Rates aus der ČSSD mal ein wenig "abgeklopft", wie sie den notorischen Slang beherrschen, der sich in den letzten Jahren eingebürgert hat, in denen die heutigen Koalitionspartner in Prag am Ruder waren.
● Aktuáln.cz "testete" die Sozialdemokratin-Mitglieder vom neuen Rat in Bezug auf die feste Umgangssprache von den gegenwärtigen Koalitionspartnern, die während der paar letzten Jahre im Rathaus entstand, als Prag regiert wurde.
Example - Addicter
● New Councilors of CSSD will most probably have to overcome certain language barriers to understand their old-new colleagues from ODS in Prague Council and municipal council.
● Die neuen Ratsherren der Hauptstadt aus den Reihen der ČSSD werden offensichtlich gewisse Sprachbarrieren überwinden müssen, um ihre alt-neuen Kollegen aus der ODS im Prager Rat und in der Stadtvertretung überhaupt verstehen zu können.
● Neue Ratsmitglieder von CSSD werden am wahrscheinlichsten Sprachbarrieren überwinden müssen, um ihre altneuen Kollegen von ODS in Prag-Rat und Magistrat zu verstehen.
Test on IWSLT'11 Ar-En Data
● In progress
● “The system is of good quality and far too many errors are marked”
Conclusions
● Hjerson updated; evaluates better; usable for error/system ranking and rough error-tagging
● Addicter updated; now also usable for error/system ranking and rough error-tagging
● Both tools tested on EnDe, En->Cs, Ar->En