27
Rudolf Rosa, Ondřej Dušek, David Mareček, Martin Popel {rosa,odusek,marecek,popel}@ufal.mff.cuni.cz Using Parallel Features in Parsing of Machine-Translated Sentences for Correction of Grammatical Errors Charles University in Prague Faculty of Mathematics and Physics Institute of Formal and Applied Linguistics SSST, Jeju, 12th July 2012

Using Parallel Features in Parsing of Machine-Translated

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Using Parallel Features in Parsing of Machine-Translated

Rudolf Rosa, Ondřej Dušek, David Mareček, Martin Popel{rosa,odusek,marecek,popel}@ufal.mff.cuni.cz

Using Parallel Featuresin Parsing

of Machine-Translated Sentencesfor Correction of Grammatical Errors

Charles University in PragueFaculty of Mathematics and Physics

Institute of Formal and Applied Linguistics

SSST, Jeju, 12th July 2012

Page 2: Using Parallel Features in Parsing of Machine-Translated

Parsing of SMT Outputs

can be useful in many applications automatic classification of translation errors automatic correction of translation errors (Depfix) confidence estimation, multilingual question

answering...

✔ we have the source sentence available Can we use it to help parsing?

✗ SMT outputs noisy (errors in fluency, grammar...) parsers trained on gold standard treebanks Can we adapt parser to noisy sentences?

Page 3: Using Parallel Features in Parsing of Machine-Translated

MST Parser

Maximum Spanning Tree dependency parser by Ryan McDonald

Page 4: Using Parallel Features in Parsing of Machine-Translated

(1) Words and Tags

RudolphNNP

relaxesVBZ

abroadRB

#rootwords = nodes

Page 5: Using Parallel Features in Parsing of Machine-Translated

(2) (Nearly) Complete Graph

RudolphNNP

relaxesVBZ

abroadRB

#rootall possible edges =

directed edges

Page 6: Using Parallel Features in Parsing of Machine-Translated

(3) Assign Edge Weights

RudolphNNP

relaxesVBZ

abroadRB

#root

-7 -1659325

1490 1154

-263 24

185

368

Margin InfusedRelaxed Algorithm

(MIRA)

edge weight =

sum ofedge featuresweights

Page 7: Using Parallel Features in Parsing of Machine-Translated

(4) Maximum Spanning Tree

RudolphNNP

relaxesVBZ

abroadRB

#root

-7 -1659325

1490 1154

-263 24

185

368

non-projective trees:Chu-Liu-Edmonds algorithm

(projective trees:

Eisner algorithm)

Page 8: Using Parallel Features in Parsing of Machine-Translated

(5) Unlabeled Dependency Tree

RudolphNNP

relaxesVBZ

abroadRB

#rootdependency tree =

maximumspanning tree

Page 9: Using Parallel Features in Parsing of Machine-Translated

(6) Labeled Dependency Tree

RudolphNNP

relaxesVBZ

abroadRB

#root

Predicate

Subject Adverbial

labels asignedby a second stagelabeler

Page 10: Using Parallel Features in Parsing of Machine-Translated

RUR Parser

reimplementation of MST Parser (so far only) first-order, non-projective

adapted for SMT outputs parsing parallel features ”worsening” the training treebank

Page 11: Using Parallel Features in Parsing of Machine-Translated

English-to-Czech SMT

Czech language highly flective

4 genders, 2 numbers, 7 cases, 3 persons... Czech grammar requires agreement in related words

word order relatively free: word order errors not crucial

Phrase-Based SMT often makes inflection errors:➔ Rudolph's car is black.

✗ Rudolfova/fem auto/neut je černý/masc.

✔ Rudolfovo/neut auto/neut je černé/neut.

Page 12: Using Parallel Features in Parsing of Machine-Translated

Parser Training Data

Prague Czech-English Dependency Treebank parallel treebank 50k sentences, 1.2M words morphological tags, surface syntax, deep syntax word alignment

Page 13: Using Parallel Features in Parsing of Machine-Translated

Parallel Features

word alignment (using GIZA++) additional features (if aligned node exists):

aligned tag (NNS, VBD...) aligned dependency label (Subject, Attribute...) aligned edge existence (0/1)

Page 14: Using Parallel Features in Parsing of Machine-Translated

Parallel Features Example

RudolfNN M S 1

relaxujeVB S 3

vRR 6

zahraničíNN N S 6

RudolphNNP

relaxesVBZ

abroadRB

#root

#root

Pred

AuxP

AdvSubj

Subj

Pred

Adv

Page 15: Using Parallel Features in Parsing of Machine-Translated

Worsening the Treebank

treebank used for training contains correct sentences

SMT output is noisy grammatical errors incorrect word order missing/superfluous words …

let's introduce similar errors into the treebank! so far, we have only tried inflection errors

Page 16: Using Parallel Features in Parsing of Machine-Translated

Worsen (1): Apply SMT

translate English side of PCEDT to Czech by an SMT system (we used Moses)

now we have (e.g.): Gold English

Rudolph's car is black. Gold Czech

RudolfovoNEUT

autoNEUT

je černéNEUT

.

SMT Czech Rudolfova

FEM auto

NEUT je černý

MASC.

Page 17: Using Parallel Features in Parsing of Machine-Translated

Worsen (2): Align SMT to Gold

align SMT Czech to Gold Czech Monolingual Greedy Aligner

alignment link score = linear combination of: similarity of word forms (or lemmas) similarity of morphological tags (fine-grained) similarity of positions in the sentence indication whether preceding/following words aligned

repeat: align best scoring pair until below threshold no training: weights and threshold set manually

Page 18: Using Parallel Features in Parsing of Machine-Translated

Worsen (3): Create Error Model

for each tag: estimate probabilities of SMT system using an

incorrect tag instead of the correct tag(Maximum Likelihood Estimate)

Czech tagset: fine-grained morphological tags part-of-speech, gender, number, case, person,

tense, voice... 1500 different tags in training data

Page 19: Using Parallel Features in Parsing of Machine-Translated

Worsen (3): Error Model

Adjective, Masculine, Plural, Instrumental case(AAMP7), e.g. lingvistickými (linguistic)➔ 0.2 Adjective, Masculine, Singular, Nominative case

➔ e.g. lingvistický➔ 0.1 Adjective, Masculine, Plural, Nominative case

➔ e.g. lingvističtí➔ 0.1 Adjective, Neuter, Singular, Accusative case

➔ e.g. lingvistické

… altogether 2000 such change rules

Page 20: Using Parallel Features in Parsing of Machine-Translated

Worsen (4): Apply Error Model

take Gold Czech for each word:

assign a new tag randomly sampled according to Tag Error Model

generate a new word form rule-based generator, generates even unseen forms new_form = generate_form(lemma, tag) || old_form

→ get Worsened Czech use resulting Gold English-Worsened Czech

parallel treebank to train the parser

Page 21: Using Parallel Features in Parsing of Machine-Translated

Direct Evaluation by Inspection

manual inspection of several parse trees comparing baseline and adapted parser ouputs

examples of improvements: subject identification even if not in nominative case adjective-noun dependence identification even if

agreement violated (gender, number, case)

hard to do reliably trying to find a correct parse tree for an (often)

incorrect sentence – not well defined

Page 22: Using Parallel Features in Parsing of Machine-Translated

Indirect Evaluation: in Depfix

rule-based grammar correction of SMT outputs input = aligned, tagged and parsed sentences:

target (Czech) sentence – to be corrected source (English) sentence – additional information

applies 20 correction rules: noun – adjective agreement (gender, number, case) subject – predicate agreement (gender, number) preposition – noun agreement (case) …

Page 23: Using Parallel Features in Parsing of Machine-Translated

Depfix: Rudolph's Car

RudolphNNP

carNN

RudolfovaAA fem

autoNN neutAtr Atr

'sPOS

RudolfovoAA neut

autoNN neut

Atr

Atr

Adjective – Noun Agreement

Page 24: Using Parallel Features in Parsing of Machine-Translated

Indirect Evaluation Results

differences in Depfix corrections evaluated by humans: better / worse / indefinite

three different parsers RUR + parallel features + worsened treebank – original McDonald's MST Parser RUR – our baseline setup

RUR + parallel features + worsened treebankbetter worse indefinite51% 30% 18%

RUR 54% 28% 18%

Page 25: Using Parallel Features in Parsing of Machine-Translated

Conclusion

SMT outputs often hard to parse RUR parser – adapted to parsing SMT outputs

parallel features (tag, dep. label, edge existence) worsening the training treebank (tag error model)

outputs of English-to-Czech translation evaluated in Depfix

SMT errors correction system

Page 26: Using Parallel Features in Parsing of Machine-Translated

Future Work

more sophisticated parallel features more experiments on worsening more languages

parallel tagging

Page 27: Using Parallel Features in Parsing of Machine-Translated

Thank you for your attention

For this presentation and other information, visit:

http://ufal.mff.cuni.cz/~rosa/depfix/

Rudolf Rosa, Ondřej Dušek, David Mareček, Martin Popel{rosa,odusek,marecek,popel}@ufal.mff.cuni.cz

Charles University in PragueFaculty of Mathematics and Physics

Institute of Formal and Applied Linguistics