
Page 1: Session #2

Session #2

• Introductions
  – Syllabus, requirements, etc.

• Language in 10 minutes

• Lecture
  – MT Approaches
  – MT Evaluation

• MT Lab

Page 2: Session #2

Road Map

• Multilingual Challenges for MT
• MT Approaches
• MT Evaluation

Page 3: Session #2

MT Approaches: MT Pyramid

[MT Pyramid diagram: source word → source syntax → source meaning, and target meaning → target syntax → target word. Analysis climbs the source side, Generation descends the target side. Gisting translates directly at the word level.]

Page 4: Session #2

MT Approaches: Gisting Example

Source (Spanish): Sobre la base de dichas experiencias se estableció en 1988 una metodología.

Gisting (word-for-word) output: Envelope her basis out speak experiences them settle at 1988 one methodology.

Reference translation: On the basis of these experiences, a methodology was arrived at in 1988.

Page 5: Session #2

MT Approaches: MT Pyramid

[MT Pyramid diagram as before, now also marking Transfer: mapping between source syntax and target syntax, one level above word-level Gisting.]

Page 6: Session #2

MT Approaches: Transfer Example

• Transfer Lexicon
  – Map SL structure to TL structure

[Transfer rule diagram: the Spanish structure 'poner' (:subj X, :obj 'mantequilla', :mod 'en' with :obj Y) maps to the English structure 'butter' (:subj X, :obj Y), so that 'X puso mantequilla en Y' becomes 'X buttered Y'. A code sketch of this rule follows below.]
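A minimal code sketch of how such a transfer rule could be applied to a toy dependency structure (illustrative only; the dictionary encoding and names are assumptions, not the formalism used in the course):

    # Toy transfer rule: Spanish "X poner mantequilla en Y" -> English "X butter Y".
    # The dictionary-based tree encoding and the rule itself are illustrative assumptions.
    def transfer(sl_tree):
        """Return the TL structure if the hand-written rule matches, else None."""
        if (sl_tree.get("lemma") == "poner"
                and sl_tree.get("obj", {}).get("lemma") == "mantequilla"
                and sl_tree.get("mod", {}).get("lemma") == "en"):
            return {
                "lemma": "butter",             # TL head verb
                "subj": sl_tree["subj"],       # X stays the subject
                "obj": sl_tree["mod"]["obj"],  # Y (object of "en") becomes the object
            }
        return None                            # no match: fall back to other rules

    # Structure for "Juan puso mantequilla en el pan" -> "Juan buttered the bread"
    source = {
        "lemma": "poner",
        "subj": {"lemma": "Juan"},
        "obj": {"lemma": "mantequilla"},
        "mod": {"lemma": "en", "obj": {"lemma": "pan"}},
    }
    print(transfer(source))

A real transfer system pattern-matches over full parse trees and carries morphological features, but the rule shape (match an SL configuration, emit a TL configuration) is the same.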

Page 7: Session #2

MT Approaches: MT Pyramid

[MT Pyramid diagram, now with Interlingua at the top: source meaning and target meaning meet in a shared representation, above Transfer (syntax level) and Gisting (word level).]

Page 8: Session #2

MT Approaches: Interlingua Example (Lexical Conceptual Structure)

(Dorr, 1993)

Page 9: Session #2

MT Approaches: MT Pyramid

[MT Pyramid diagram repeated, showing all three paths: Gisting (word level), Transfer (syntax level), and Interlingua (meaning level).]

Page 10: Session #2

MT Approaches: MT Pyramid

[MT Pyramid diagram annotated with the resources needed at each level: interlingual lexicons at the meaning level, transfer lexicons at the syntax level, and dictionaries / parallel corpora at the word level.]

Page 11: Session #2

MT Approaches: Statistical vs. Rule-based

[MT Pyramid diagram, here used to frame the contrast between statistical and rule-based approaches.]

Page 12: Session #2

Statistical MT: Noisy Channel Model

Portions from http://www.clsp.jhu.edu/ws03/preworkshop/lecture_yamada.pdf
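The details are in the linked lecture notes; as a quick reminder, the standard noisy-channel decomposition behind statistical MT (e is the target/English sentence, f the foreign source) is:

    \hat{e} = \arg\max_{e} P(e \mid f)
            = \arg\max_{e} \frac{P(f \mid e)\, P(e)}{P(f)}
            = \arg\max_{e} \underbrace{P(f \mid e)}_{\text{translation model}} \; \underbrace{P(e)}_{\text{language model}}

P(f) can be dropped because it does not depend on e.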

Page 13: Session #2

Statistical MT: Automatic Word Alignment

• GIZA++
  – A statistical machine translation toolkit used to train word alignments.
  – Uses Expectation-Maximization with various constraints to bootstrap alignments (a minimal sketch of the idea follows after this slide).

Slide based on Kevin Knight’s http://www.sims.berkeley.edu/courses/is290-2/f04/lectures/mt-lecture.ppt

[Word alignment figure: English 'Mary did not slap the green witch' aligned word-by-word to Spanish 'Maria no dio una bofetada a la bruja verde'.]
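GIZA++ itself implements the full IBM and HMM alignment models; the following is only a minimal sketch of the underlying EM idea (an IBM Model 1-style lexical table), trained on the slide's single sentence pair, so the resulting numbers are purely illustrative:

    from collections import defaultdict

    # IBM Model 1-style EM sketch (not GIZA++): estimate t(f|e) from a tiny corpus.
    corpus = [
        ("Mary did not slap the green witch".lower().split(),
         "Maria no dio una bofetada a la bruja verde".lower().split()),
    ]

    f_vocab = {f for _, fs in corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))   # uniform initialization of t(f|e)

    for _ in range(10):                           # a few EM iterations
        count = defaultdict(float)                # expected counts c(f, e)
        total = defaultdict(float)                # expected counts c(e)
        for es, fs in corpus:
            for f in fs:                          # E-step: fractional alignment counts
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        for (f, e), c in count.items():           # M-step: renormalize
            t[(f, e)] = c / total[e]

    for f in corpus[0][1]:                        # most likely English word per Spanish word
        best = max(corpus[0][0], key=lambda e: t[(f, e)])
        print(f, "->", best, round(t[(f, best)], 2))

With one sentence pair the model cannot really disambiguate; in practice this kind of EM is run over millions of sentence pairs and through progressively richer alignment models.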

Page 14: Session #2

Giza Plateau and MT Pyramid

• EGYPT toolkit
• Giza, Giza++
• Cairo visualizer
• Pharaoh
• Hiero
• Moses


Page 15: Session #2

Statistical MT: IBM Model (Word-based Model)

http://www.clsp.jhu.edu/ws03/preworkshop/lecture_yamada.pdf
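The details are in the linked notes; for reference, the IBM Model 1 translation probability (the standard form, shown here as a reminder rather than the exact notation from those slides) is:

    P(f_1^J \mid e_1^I) = \frac{\epsilon}{(I+1)^J} \prod_{j=1}^{J} \sum_{i=0}^{I} t(f_j \mid e_i)

where t(f|e) is the word translation table and e_0 is a special NULL word for source words with no English counterpart.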

Page 16: Session #2

Phrase-Based Statistical MT

• Foreign input is segmented into phrases
  – A “phrase” is any sequence of words

• Each phrase is probabilistically translated into English
  – P(to the conference | zur Konferenz)
  – P(into the meeting | zur Konferenz)

• Phrase distortion

This is state-of-the-art!

Morgen fliege ich nach Kanada zur Konferenz

Tomorrow I will fly to the conference in Canada

Slide courtesy of Kevin Knight http://www.sims.berkeley.edu/courses/is290-2/f04/lectures/mt-lecture.ppt
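A minimal sketch of what such a phrase table contains, with made-up probabilities (a real table is learned from aligned data and is combined with a reordering model and a language model, both of which this toy ignores):

    # Toy phrase table: source phrase -> list of (target phrase, probability).
    # Probabilities are invented for illustration only.
    phrase_table = {
        "Morgen": [("tomorrow", 0.9)],
        "fliege ich": [("I will fly", 0.5), ("fly I", 0.05)],
        "nach Kanada": [("to Canada", 0.8), ("in Canada", 0.15)],
        "zur Konferenz": [("to the conference", 0.6), ("into the meeting", 0.1)],
    }

    def best_option(src_phrase):
        """Pick the highest-probability target phrase for one source phrase."""
        return max(phrase_table[src_phrase], key=lambda te: te[1])[0]

    segmentation = ["Morgen", "fliege ich", "nach Kanada", "zur Konferenz"]
    print(" ".join(best_option(p) for p in segmentation))
    # -> tomorrow I will fly to Canada to the conference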

Page 17: Session #2

Word Alignment Induced Phrases

[Word alignment figure: English 'Mary did not slap the green witch' aligned to Spanish 'Maria no dió una bofetada a la bruja verde'.]

(Maria, Mary) (no, did not) (slap, dió una bofetada) (la, the) (bruja, witch) (verde, green)

Slide courtesy of Kevin Knight http://www.sims.berkeley.edu/courses/is290-2/f04/lectures/mt-lecture.ppt

Page 18: Session #2

Word Alignment Induced Phrases

[Word alignment figure: English 'Mary did not slap the green witch' aligned to Spanish 'Maria no dió una bofetada a la bruja verde'.]

(Maria, Mary) (no, did not) (slap, dió una bofetada) (la, the) (bruja, witch) (verde, green)

(a la, the) (dió una bofetada a, slap the)

Slide courtesy of Kevin Knight http://www.sims.berkeley.edu/courses/is290-2/f04/lectures/mt-lecture.ppt

Page 19: Session #2

Word Alignment Induced Phrases

[Word alignment figure: English 'Mary did not slap the green witch' aligned to Spanish 'Maria no dió una bofetada a la bruja verde'.]

(Maria, Mary) (no, did not) (slap, dió una bofetada) (la, the) (bruja, witch) (verde, green)

(a la, the) (dió una bofetada a, slap the)

(Maria no, Mary did not) (no dió una bofetada, did not slap), (dió una bofetada a la, slap the)

(bruja verde, green witch)

Slide courtesy of Kevin Knight http://www.sims.berkeley.edu/courses/is290-2/f04/lectures/mt-lecture.ppt

Page 20: Session #2

Word Alignment Induced Phrases

[Word alignment figure: English 'Mary did not slap the green witch' aligned to Spanish 'Maria no dió una bofetada a la bruja verde'.]

(Maria, Mary) (no, did not) (slap, dió una bofetada) (la, the) (bruja, witch) (verde, green)

(a la, the) (dió una bofetada a, slap the)

(Maria no, Mary did not) (no dió una bofetada, did not slap), (dió una bofetada a la, slap the)

(bruja verde, green witch) (Maria no dió una bofetada, Mary did not slap)

(a la bruja verde, the green witch) …

Slide courtesy of Kevin Knight http://www.sims.berkeley.edu/courses/is290-2/f04/lectures/mt-lecture.ppt

Page 21: Session #2

Word Alignment Induced Phrases

[Word alignment figure: English 'Mary did not slap the green witch' aligned to Spanish 'Maria no dió una bofetada a la bruja verde'.]

(Maria, Mary) (no, did not) (slap, dió una bofetada) (la, the) (bruja, witch) (verde, green)

(a la, the) (dió una bofetada a, slap the)

(Maria no, Mary did not) (no dió una bofetada, did not slap), (dió una bofetada a la, slap the)

(bruja verde, green witch) (Maria no dió una bofetada, Mary did not slap)

(a la bruja verde, the green witch) …

(Maria no dió una bofetada a la bruja verde, Mary did not slap the green witch)

Slide courtesy of Kevin Knight http://www.sims.berkeley.edu/courses/is290-2/f04/lectures/mt-lecture.ppt
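A minimal sketch of the consistency criterion behind this phrase induction (every alignment point inside the source span must also fall inside the target span, and vice versa), run on the slide's example. This tight-span version produces pairs such as (bruja verde, green witch) and (no dió una bofetada, did not slap); extending spans over unaligned source words like 'a' yields the additional pairs listed on the slides, e.g. (a la, the).

    # Extract phrase pairs consistent with a word alignment (Och/Koehn-style sketch).
    es = "Mary did not slap the green witch".split()
    fs = "Maria no dió una bofetada a la bruja verde".split()
    # Alignment points (e_index, f_index) matching the slide's word-level pairs;
    # Spanish "a" (index 5) is left unaligned.
    A = {(0, 0), (1, 1), (2, 1), (3, 2), (3, 3), (3, 4), (4, 6), (5, 8), (6, 7)}

    pairs = set()
    for e1 in range(len(es)):
        for e2 in range(e1, len(es)):
            fset = {f for (e, f) in A if e1 <= e <= e2}   # f-positions linked to the e-span
            if not fset:
                continue
            f1, f2 = min(fset), max(fset)
            # Consistency check: no alignment point may cross the phrase boundary.
            if all(e1 <= e <= e2 for (e, f) in A if f1 <= f <= f2):
                pairs.add((" ".join(fs[f1:f2 + 1]), " ".join(es[e1:e2 + 1])))

    for f_phrase, e_phrase in sorted(pairs):
        print(f_phrase, "|||", e_phrase)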

Page 22: Session #2

Phrase-Based Models

• Sentence f is decomposed into J phrases: f_1^J = f_1, ..., f_j, ..., f_J
• Sentence e is decomposed into I phrases: e_1^I = e_1, ..., e_i, ..., e_I
• We choose the sentence with the highest probability:

  \hat{e}_1^I = \arg\max_{e_1^I} Pr(e_1^I \mid f_1^J)

Page 23: Session #2

Phrase-Based Models

• Model the posterior probability using a log-linear combination of feature functions.
• We have a set of M feature functions h_m(e_1^I, f_1^J), m = 1, ..., M. For each feature function, there is a model parameter \lambda_m, m = 1, ..., M.
• The decision rule is:

  \hat{e}_1^I = \arg\max_{e_1^I} \sum_{m=1}^{M} \lambda_m h_m(e_1^I, f_1^J)

• Features cover the main components:
  • Phrase-Translation Model
  • Reordering Model
  • Language Model

Page 24: Session #2

Advantages of Phrase-Based SMT

• Many-to-many mappings can handle non-compositional phrases

• Local context is very useful for disambiguating
  – “Interest rate” …
  – “Interest in” …

• The more data, the longer the learned phrases
  – Sometimes whole sentences

Slide courtesy of Kevin Knight http://www.sims.berkeley.edu/courses/is290-2/f04/lectures/mt-lecture.ppt

Page 25: Session #2

MT Approaches: Statistical vs. Rule-based vs. Hybrid

[MT Pyramid diagram repeated, here framing the comparison of statistical, rule-based, and hybrid approaches.]

Page 26: Session #2

[Chart (slide courtesy of Laurie Gerber): MT systems plotted along two axes. The Knowledge Acquisition Strategy axis runs from all manual, through hand-built by experts, hand-built by non-experts, and learning from annotated data, to fully automated learning from un-annotated data. The Knowledge Representation Strategy axis runs from shallow/simple (word-based only), through phrase tables and syntactic constituent structure, to semantic analysis and deep/complex interlingua. Plotted systems include the original direct approach, original statistical MT, example-based MT, electronic dictionaries, typical transfer systems, and classic interlingual systems; one region is marked 'New Research Goes Here!'.]

Page 27: Session #2

MT Approaches: Practical Considerations

• Resource availability
  – Parsers and generators
    • Input/output compatibility
  – Translation lexicons
    • Word-based vs. Transfer/Interlingua
  – Parallel corpora
    • Domain of interest
    • Bigger is better

• Time availability
  – Statistical training, resource building

Page 28: Session #2

Road Map

• Multilingual Challenges for MT
• MT Approaches
• MT Evaluation

Page 29: Session #2

MT Evaluation

• More art than science
• Wide range of Metrics/Techniques
  – interface, …, scalability, …, faithfulness, …, space/time complexity, … etc.
• Automatic vs. Human-based
  – Dumb Machines vs. Slow Humans

Page 30: Session #2

Human-based Evaluation Example: Accuracy Criteria

Page 31: Session #2

Human-based Evaluation Example: Fluency Criteria

Page 32: Session #2

Fluency vs. Accuracy

[Plot: MT use cases placed along Accuracy and Fluency axes, with points labeled FAHQMT, Prof. MT, Info. MT, and con. MT.]

Page 33: Session #2

Automatic Evaluation Example: BLEU Metric

(Papineni et al 2001)

• BLEU: BiLingual Evaluation Understudy
  – Modified n-gram precision with length penalty
  – Quick, inexpensive, and language independent
  – Correlates highly with human evaluation
  – Bias against synonyms and inflectional variations

Page 34: Session #2

Test Sentence

colorless green ideas sleep furiously

Gold Standard References

all dull jade ideas sleep irately
drab emerald concepts sleep furiously

colorless immature thoughts nap angrily

Automatic Evaluation Example: BLEU Metric

Page 35: Session #2

Test Sentence

colorless green ideas sleep furiously

Gold Standard References

all dull jade ideas sleep irately
drab emerald concepts sleep furiously

colorless immature thoughts nap angrily

Unigram precision = 4/5

Automatic Evaluation Example: BLEU Metric

Page 36: Session #2

Test Sentence

colorless green ideas sleep furiously

Gold Standard References

all dull jade ideas sleep irately
drab emerald concepts sleep furiously

colorless immature thoughts nap angrily

Unigram precision = 4 / 5 = 0.8
Bigram precision = 2 / 4 = 0.5

BLEU score = (a_1 × a_2 × … × a_n)^(1/n) = (0.8 × 0.5)^(1/2) = 0.6325, i.e. 63.25 on a 0–100 scale

Automatic Evaluation Example: BLEU Metric
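A minimal sketch that reproduces the numbers above (modified n-gram precision and the geometric mean only; the brevity penalty and the usual n = 4 setting of full BLEU are omitted):

    from collections import Counter

    def ngrams(tokens, n):
        return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    def modified_precision(hyp, refs, n):
        """Clipped n-gram precision as in BLEU (no brevity penalty here)."""
        hyp_counts = Counter(ngrams(hyp, n))
        max_ref = Counter()
        for ref in refs:                      # clip by the max count in any one reference
            for g, c in Counter(ngrams(ref, n)).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
        return clipped / max(1, sum(hyp_counts.values()))

    hyp = "colorless green ideas sleep furiously".split()
    refs = ["all dull jade ideas sleep irately".split(),
            "drab emerald concepts sleep furiously".split(),
            "colorless immature thoughts nap angrily".split()]

    p1 = modified_precision(hyp, refs, 1)     # 4/5 = 0.8
    p2 = modified_precision(hyp, refs, 2)     # 2/4 = 0.5
    print(p1, p2, (p1 * p2) ** 0.5)           # geometric mean ≈ 0.6325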

Page 37: Session #2

Automatic Evaluation Example: METEOR

(Lavie and Agarwal, 2007)

• Metric for Evaluation of Translation with Explicit word Ordering

• Extended Matching between translation and reference
  – Porter stems, WordNet synsets
• Unigram Precision, Recall, parameterized F-measure
• Reordering Penalty
• Parameters can be tuned to optimize correlation with human judgments
• Not biased against “non-statistical” MT systems

Page 38: Session #2

Automatic Evaluation Example: SEPIA (Habash and ElKholy 2008)

• A syntactically-aware evaluation metric
  – (Liu and Gildea, 2005; Owczarzak et al., 2007; Giménez and Màrquez, 2007)
• Uses a dependency representation
  – MICA parser (Nasr & Rambow 2006)
  – 77% of all structural bigrams are surface n-grams of size 2, 3, 4
• Includes dependency surface span as a factor in the score
  – Long-distance dependencies should receive a greater weight than short-distance dependencies
• Higher degree of grammaticality?

[Chart: distribution of dependency surface spans (1, 2, 3, 4-plus), with percentages ranging up to 50%.]

Page 39: Session #2

Metrics MATR Workshop

• Workshop at the AMTA conference 2008
  – Association for Machine Translation in the Americas
• Evaluating evaluation metrics
• Compared 39 metrics
  – 7 baselines and 32 new metrics
  – Various measures of correlation with human judgment
  – Different conditions: text genre, source language, number of references, etc.

Page 40: Session #2

Building an SMT System

• Training
  – Input
    • Parallel Training Data
    • Monolingual Data of the Target Language
  – Output
    • Phrase Table
    • Language Model
    • Initial Translation Model parameters

• Tuning
  – Input
    • Parallel Tune Data Set
    • Decoder
    • Phrase Table + Language Model + Initial Translation Model parameters
  – Output
    • Improved Translation Model parameters

• Development & Testing
  – Input
    • Parallel Test Data Set
    • Decoder
    • Phrase Table + Language Model + Final Translation Model parameters
  – Output
    • BLEU score

Page 41: Session #2

Training

[Flow diagram: the source and target sides of the parallel data are preprocessed, word-aligned, and used for phrase extraction to build the phrase table (PT); monolingual target-language data is preprocessed to create the language model (LM).]

Page 42: Session #2

Tuning

[Flow diagram: the tune set is preprocessed and decoded using the LM, the PT, and the current weights; the output is scored with the evaluation metric. If the score is optimal, tuning is done; otherwise the weights are adjusted and decoding is repeated.]
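A minimal, runnable stand-in for this loop (every component is a toy placeholder, not a real decoder, metric, or optimizer; real systems tune many weights at once with MERT-style or similar methods):

    # Toy tuning loop: decode the tune set, score it, and keep the weight that
    # maximizes the metric. The decoder and metric below are stand-ins only.
    def decode(src, lm_weight):
        # Pretend the phrase table offers two options for "zur Konferenz" and
        # that a higher LM weight prefers the more fluent one.
        option = "to the conference" if lm_weight >= 0.5 else "into the meeting"
        return "tomorrow I will fly to Canada " + option

    def metric(hyp, ref):
        # Fraction of position-matched tokens, standing in for BLEU.
        h, r = hyp.split(), ref.split()
        return sum(a == b for a, b in zip(h, r)) / max(len(r), 1)

    tune_src = "Morgen fliege ich nach Kanada zur Konferenz"
    tune_ref = "tomorrow I will fly to Canada to the conference"

    best_w, best_score = 0.0, -1.0
    for w in [i / 10 for i in range(11)]:     # "adjust weights": here a simple grid search
        score = metric(decode(tune_src, w), tune_ref)
        if score > best_score:
            best_w, best_score = w, score
    print(best_w, best_score)                 # tuned LM weight and its tune-set score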

Page 43: Session #2

Testing

[Flow diagram: the test set is preprocessed and decoded using the LM, the PT, and the tuned weights to produce the final output.]