Word Sense Disambiguation for Machine Translation Han-Bin Chen 2010.11.24


Page 1: Word Sense Disambiguation for  Machine Translation

Word Sense Disambiguation for Machine Translation

Han-Bin Chen
2010.11.24


Reference Papers

• Cabezas and Resnik. 2005. Using WSD Techniques for Lexical Selection. (Technical report)
• Carpuat and Wu. 2005. Word Sense Disambiguation vs. Statistical Machine Translation. (ACL 2005)
• Carpuat and Wu. 2007. Improving Statistical Machine Translation using Word Sense Disambiguation. (EMNLP 2007)
• Chan et al. 2007. Word Sense Disambiguation Improves Statistical Machine Translation. (ACL 2007)
• Apidianaki. 2009. Data-driven semantic analysis for multilingual WSD. (EACL 2009)


SMT Workflow

[Diagram: Input (source language) → Decoder → Output (target language). The decoder draws on a translation model and a reordering model, trained on a bilingual corpus, and a language model, trained on a monolingual corpus.]
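As a reminder of what the decoder in this workflow computes, the classic noisy-channel formulation can be written as follows. The notation ($f$ for the source sentence, $e$ for a target sentence) is ours and does not appear on the slides; phrase-based systems add the reordering model as a further factor.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f)
        \;=\; \arg\max_{e}\; \underbrace{P(f \mid e)}_{\text{translation model}} \cdot \underbrace{P(e)}_{\text{language model}}
```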


MT Research Areas

[Diagram: the same SMT workflow (input, translation model, reordering model, language model, bilingual and monolingual corpora, decoder, output), with word alignment and evaluation metric added as further research areas.]


Translation Model (TM)

• Research in TM
  – Phrase extraction
  – Phrase filtering
  – Phrase augmentation
  – Word Sense Disambiguation (WSD)


Traditional WSD

• Target word is a single content word
  – Noun, verb, adjective
• Classification task with predefined senses
  – WordNet, HowNet
• Modern WSD system
  – Not limited to local context
  – Linguistic information
  – Position-sensitive
  – Syntactic
  – Collocation
• An intuitive application of WSD is SMT
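To make "classification task with predefined senses" concrete, here is a minimal sketch of a bag-of-words Naive Bayes sense classifier. It is an illustration only: the sense labels, training sentences, and function names are hypothetical, and real WSD systems use the much richer features listed above.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (context_words, sense) pairs for one target word."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        for w in words:
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab

def classify(model, context_words):
    """Return the sense with the highest add-one-smoothed log probability."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context_words:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Toy training data for the target word "table" (hypothetical sense labels).
TRAIN = [
    (["reserve", "restaurant", "dinner"], "table/furniture"),
    (["sit", "dinner", "chairs"], "table/furniture"),
    (["rows", "columns", "data"], "table/chart"),
    (["figure", "data", "report"], "table/chart"),
]
model = train(TRAIN)
prediction = classify(model, ["reserve", "dinner", "three"])
print(prediction)
```

In this toy setup the context words "reserve" and "dinner" pull the decision toward the furniture sense.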


WSD in MT

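• Wrong translations from Google Translate
  – what is today's special ?
    什麼是今天的特色? ("special" rendered as 特色, "characteristic")
  – I would like to reserve a table for three
    我想保留一表三 ("table" rendered as 表, "chart")
  – the plane will briefly stop over in the airport
    這架飛機將簡要地停留在機場 ("briefly" rendered as 簡要地, "concisely")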


WSD in MT: Early Stage

• Whether WSD model can help SMT
  – Energetically debated question over the past years
• Implicit WSD in SMT
  – Local context: phrase table & language model
• Dedicated WSD system
  – Wider variety of context features
  – Position, sentence-level, document-level features
• WSD should play a role in MT
• Publicly available SMT system
  – Pharaoh by Philipp Koehn (2003~2004)


Small Scale Experiment (1)

• Marine Carpuat and Dekai Wu, 2005
• Chinese-to-English translation task
• Chinese lexical sample task includes 20 target words
• Trained with state-of-the-art WSD
  – 37 training instances per target word (manual annotation)


Small Scale Experiment (2)

• Hard decision
  – Force the decoder to choose translations from glosses
  – Decided by language model
• Surprising and frustrating result: WSD did not improve translation quality
  – Possible causes: small data, out-of-domain material, hard decision
  – Language model effect
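The hard-decision setup described above can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the sense inventory, the bigram scores, and all names (`GLOSSES`, `lm_score`, `hard_decision`) are hypothetical.

```python
# Hypothetical sense inventory for one Chinese word: sense id -> English glosses.
GLOSSES = {
    "hua_flower": ["flower", "blossom"],
    "hua_spend": ["spend", "cost"],
}

# Hypothetical bigram log-probabilities standing in for the language model.
BIGRAM_LOGPROB = {
    ("to", "spend"): -1.0,
    ("to", "cost"): -3.0,
    ("to", "flower"): -4.0,
    ("to", "blossom"): -6.0,
}

def lm_score(prev_word, word):
    """Log-probability of `word` after `prev_word`; unseen bigrams get a low score."""
    return BIGRAM_LOGPROB.get((prev_word, word), -10.0)

def hard_decision(predicted_sense, prev_word):
    """Only glosses of the WSD-predicted sense are allowed; the LM picks among them."""
    candidates = GLOSSES[predicted_sense]
    return max(candidates, key=lambda gloss: lm_score(prev_word, gloss))

correct = hard_decision("hua_spend", "to")   # WSD prediction is right
wrong = hard_decision("hua_flower", "to")    # WSD prediction is wrong
print(correct, wrong)
```

The second call shows the weakness of a hard decision: once the WSD model picks the wrong sense, the decoder can only choose among wrong glosses, even though the language model prefers "spend".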


Translation Disambiguation (1)

• Clara Cabezas and Philip Resnik, 2005
  – Address 3 problems of the previous work
• Use aligned target word directly as "sense"
  – 4 senses for "briefly": { 短暫地, 短時間地, 簡潔地, 簡要地 }
  – Trained with state-of-the-art WSD
  – Handle "small data" and "out-of-domain" problems
• Soft decision
  – Pharaoh XML markup: the decoder chooses among the specified translations and those of the translation model together
  – Handle "hard decision" problem
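Using the aligned target word as the "sense" means training data can be harvested from a word-aligned corpus without manual annotation. A minimal sketch, assuming a corpus of (source tokens, target tokens, alignment links) triples; the data and the function name `extract_instances` are hypothetical.

```python
from collections import Counter

def extract_instances(corpus, target_word):
    """Collect (source context, aligned target word) pairs for one source word.

    corpus: list of (src_tokens, tgt_tokens, links), where links is a list of
    (src_index, tgt_index) word-alignment pairs.
    """
    instances = []
    for src, tgt, links in corpus:
        for i, j in links:
            if src[i] == target_word:
                context = src[:i] + src[i + 1:]  # sentence without the target word
                instances.append((context, tgt[j]))
    return instances

# Toy word-aligned corpus (hypothetical sentences and alignments).
CORPUS = [
    (["the", "plane", "will", "briefly", "stop"],
     ["飛機", "將", "短暫地", "停留"],
     [(1, 0), (2, 1), (3, 2), (4, 3)]),
    (["he", "briefly", "summarized", "the", "report"],
     ["他", "簡要地", "總結", "了", "報告"],
     [(0, 0), (1, 1), (2, 2), (4, 4)]),
]

instances = extract_instances(CORPUS, "briefly")
sense_inventory = Counter(label for _, label in instances)
print(sense_inventory)
```

Each distinct aligned translation becomes one class label, so the sense inventory grows with the parallel data instead of being fixed by a dictionary.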


Translation Disambiguation (2)

• Pharaoh XML markup
• Experiment & Result
  – Spanish-to-English test set from Europarl
  – WSD: 0.2382, Baseline: 0.2356
  – Not statistically significant
  – But at least it is not a decrease


Toward Better Integration into SMT

• How to better integrate WSD into SMT?
• Phrase-based sense disambiguation (PSD)
• Key points
  – Phrase, not word
  – Integration into log-linear model: weight tuning
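Integration into the log-linear model means the WSD probability becomes one more feature whose weight is tuned alongside the others. The sketch below scores two candidate translations of "briefly" with and without a WSD feature; the probabilities, weights, and feature names are made-up illustrative values, not results from any of the cited papers.

```python
import math

def loglinear_score(features, weights):
    """Weighted sum of feature values (log-probabilities here)."""
    return sum(weights[name] * value for name, value in features.items())

def best_candidate(candidates, weights):
    """Pick the candidate translation with the highest log-linear score."""
    return max(candidates, key=lambda c: loglinear_score(candidates[c], weights))

# Hypothetical feature values for two translations of "briefly" in
# "the plane will briefly stop over in the airport".
CANDIDATES = {
    "簡要地": {"tm": math.log(0.5), "lm": math.log(0.2), "wsd": math.log(0.1)},
    "短暫地": {"tm": math.log(0.3), "lm": math.log(0.2), "wsd": math.log(0.8)},
}

baseline_weights = {"tm": 1.0, "lm": 1.0, "wsd": 0.0}  # WSD feature switched off
wsd_weights = {"tm": 1.0, "lm": 1.0, "wsd": 0.5}       # WSD weight as if tuned

baseline_choice = best_candidate(CANDIDATES, baseline_weights)
wsd_choice = best_candidate(CANDIDATES, wsd_weights)
print(baseline_choice, wsd_choice)
```

Unlike a hard decision, the WSD score here only shifts the balance: the translation model and language model can still outvote it if the tuned weight is small.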


Successful Integration (1)

• Chan et al., 2007
• Chinese-to-English translation
• Sense disambiguation on Chinese phrase
  – 1 or 2 consecutive Chinese words
  – Extract training examples from word-aligned corpus
• Add WSD features
  – Contextual probability of WSD
  – Reward probability of WSD


Successful Integration (2)

• Statistically significant improvement

• 將 無法 取得 更 多 援助 或 其他 讓步
  – Hiero: will be more aid and other concessions
  – Hiero+WSD: will be unable to obtain more aid and other concessions


PSD System (1)

• Marine Carpuat and Dekai Wu, 2007
• WSD model for every phrase
  – Extract training data from phrase extraction
  – WSD probability as new feature
• Comments
  – Not every phrase needs WSD
  – Technical problem (Pharaoh)


PSD System (2)

• Result: better translation on all test sets

[Result tables: IWSLT 2006 dataset; NIST 2004 test set]


PSD System (3)


Recent Issue

• Different translations may have the same sense
  – 2 senses for "briefly", rather than 4
  – Sense 1: { 短暫地, 短時間地 }
  – Sense 2: { 簡潔地, 簡要地 }
• Automatic sense clustering


Sense Clustering (1)

• Marianna Apidianaki, 2009
• Two translations are semantically related
  – If they occur in similar context
• Translation unit (TU) as context
  – Bilingual sentence pair
• Source word "briefly"
• Translations
  – { 短暫地, 短時間地, 簡潔地, 簡要地 }
  – {t1, t2, t3, t4}


Sense Clustering (2)

• "briefly-t1" occurs in context {TU1, TU4, TU25, TU88…}
• "briefly-t2" occurs in context {TU5, TU18, TU92, TU126…}
• Clustering based on pairwise context similarity
  – Apidianaki, 2008
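The idea of clustering translations by pairwise context similarity can be illustrated with a deliberately simplified sketch. It swaps in plain Jaccard overlap and a fixed threshold, which is not the similarity measure of the cited work; the context words (imagined as drawn from the source side of each translation's TUs), the threshold, and all names are assumptions.

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap between two sets of context words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_translations(contexts, threshold):
    """Merge translations whose context similarity reaches the threshold.

    contexts: dict mapping each translation to a set of context words.
    Returns a list of sets of translations (single-link clusters).
    """
    clusters = [{t} for t in contexts]
    for t1, t2 in combinations(contexts, 2):
        if jaccard(contexts[t1], contexts[t2]) >= threshold:
            c1 = next(c for c in clusters if t1 in c)
            c2 = next(c for c in clusters if t2 in c)
            if c1 is not c2:
                clusters.remove(c2)
                c1 |= c2  # merge c2 into c1 in place
    return clusters

# Hypothetical context words for the four translations of "briefly".
CONTEXTS = {
    "短暫地": {"stop", "stay", "visit", "airport"},
    "短時間地": {"stop", "stay", "visit", "rest"},
    "簡潔地": {"explain", "summarize", "describe", "say"},
    "簡要地": {"explain", "summarize", "report", "say"},
}

clusters = cluster_translations(CONTEXTS, threshold=0.5)
print(clusters)
```

With these toy contexts the four translations fall into the two senses shown on the "Recent Issue" slide.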


Sense Clustering (3)

• Experiment
  – English-Greek translation
  – 150 ambiguous English nouns
• Evaluation of lexical selection
  – Strict precision (exact match with answer word)
  – Enriched precision (match with the cluster of answer word)
• Result
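The two evaluation measures above differ only in what counts as a match. A minimal sketch, reusing the "briefly" example rather than the English-Greek data; the predictions, references, and function names are hypothetical.

```python
def strict_precision(predictions, references):
    """Fraction of predictions that exactly match the reference translation."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(predictions)

def enriched_precision(predictions, references, clusters):
    """Fraction of predictions that fall in the same sense cluster as the reference."""
    def cluster_of(word):
        for cluster in clusters:
            if word in cluster:
                return cluster
        return {word}  # unclustered words only match themselves

    correct = sum(p in cluster_of(r) for p, r in zip(predictions, references))
    return correct / len(predictions)

# Hypothetical sense clusters and system output for four test instances.
CLUSTERS = [{"短暫地", "短時間地"}, {"簡潔地", "簡要地"}]
predictions = ["短暫地", "簡要地", "短時間地", "簡潔地"]
references = ["短暫地", "簡潔地", "簡要地", "簡潔地"]

strict = strict_precision(predictions, references)
enriched = enriched_precision(predictions, references, CLUSTERS)
print(strict, enriched)
```

Here the second prediction is wrong under strict precision but correct under enriched precision, because 簡要地 and 簡潔地 share a cluster.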


Conclusion

• From WSD to PSD
• However, semantics is also important
• Future work
  – Semantic PSD