Machine Translation, Dai Xinyu, 2006-10-27


Machine Translation

Dai Xinyu

2006-10-27


Outline
• Introduction
• Architecture of MT
• Rule-Based MT vs. Data-Driven MT
• Evaluation of MT
• Development of MT
• MT problems in general
• Some Thinking about MT from cognition


Introduction
machine translation - the use of computers to translate from one language to another

• The classic acid test for natural language processing.
• Requires capabilities in both interpretation and generation.
• About $10 billion spent annually on human translation.

http://www.google.com/language_tools?hl=en

"I have a text in front of me which is written in Russian but I am going to pretend that it is really written in English and that it has been coded in some strange symbols. All I need do is strip off the code in order to retrieve the information contained in the text"


Introduction - MT past and present
• mid-1950's - 1965: Great expectations
• The dark ages for MT: Academic research projects
• 1980's - 1990's: Successful specialized applications
• 1990's: Human-machine cooperative translation
• 1990's - now: Statistical-based MT, Hybrid-strategies MT
• Future prospects: ???


Interest in MT - Commercial interest:

U.S. has invested in MT for intelligence purposes

MT is popular on the web—it is the most used of Google’s special features

EU spends more than $1 billion on translation costs each year.

(Semi-)automated translation could lead to huge savings


Interest in MT - Academic interest:

One of the most challenging problems in NLP research

Requires knowledge from many NLP sub-areas, e.g., lexical semantics, parsing, morphological analysis, statistical modeling,…

Being able to establish links between two languages allows for transferring resources from one language to another


Related Areas to MT
• Linguistics
• Computer Science: AI, compilers, formal semantics, …
• Mathematics: probability, statistics, …
• Informatics
• Cognition


Architecture of MT -- (Levels of Transfer)


Rule-Based MT vs. Data-Driven MT

• Rule-Based MT
• Data-Driven MT
  • Example-Based MT
  • Statistics-Based MT


Rule-Based MT

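[Figure: experts in linguistics, semantics, cognitive science, and artificial intelligence write the rules; the translation system applies these rules to a natural-language input x and produces the translation result.]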


Rule-Based MT


Hmm, every time he sees “banco”, he types either “bank” or “bench” … but if he sees “banco de…”, he always types “bank”, never “bench”…

Man, this is so boring.

Translated documents


Example-Based MT
• origins: Nagao (1981)
• first motivation: collocations, bilingual differences of syntactic structures
• basic idea:
  • human translators search for analogies (similar phrases) in previous translations
  • MT should seek matching fragments in a bilingual database and extract their translations
• aim to have less complex dictionaries, grammars, and procedures
• improved generation (using actual examples of TL sentences)
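The "seek a matching fragment in a bilingual database" step can be sketched as a nearest-example lookup. This is only an illustrative sketch: the example pairs, the function names, and the use of word-level similarity from Python's `difflib` are assumptions made here, not part of the original slides. A real EBMT system would also recombine fragments from several examples rather than return one stored translation.

```python
from difflib import SequenceMatcher

# Hypothetical toy bilingual database of (source, target) example pairs.
EXAMPLES = [
    ("how much is this book", "combien coute ce livre"),
    ("where is the station", "ou est la gare"),
    ("i would like a coffee", "je voudrais un cafe"),
]

def similarity(a: str, b: str) -> float:
    """Word-level similarity ratio in [0, 1] between two sentences."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()

def best_example(source: str, examples=EXAMPLES):
    """Return (score, source_example, target_example) for the closest stored example."""
    return max((similarity(source, src), src, tgt) for src, tgt in examples)

score, matched_src, matched_tgt = best_example("where is the hotel")
print(matched_src, "->", matched_tgt, f"(similarity {score:.2f})")
```

The retrieved pair shows which stored translation is closest; the unmatched word ("hotel" versus "station") is the part a fuller system would still have to translate and substitute.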


EBMT still going
• Bilingual corpus
• Collection
• Storage
• Searching and matching
• …


Statistical MT Basics
• Based on the assumption that translations exhibit observable statistical regularities
• origins: Warren Weaver (1949), Shannon’s information theory
• core process is the probabilistic ‘translation model’, taking SL words or phrases as input and producing TL words or phrases as output
• succeeding stage involves a probabilistic ‘language model’, which synthesizes TL words into ‘meaningful’ TL sentences


Statistical MT

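[Figure: a learning system builds a probabilistic model from training samples $x_1, x_2, \ldots, x_n$ by statistical learning; a prediction system applies the model to a new natural-language input $x_{n+1}$ and outputs the prediction $\hat{p}(x_{n+1})$.]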


Statistical MT schema


Statistical MT processes
• Bilingual corpora: original and translation
• little or no linguistic ‘knowledge’; based on word co-occurrences in SL and TL texts (of a corpus), relative positions of words within sentences, length of sentences
• Alignment: sentences aligned statistically (according to sentence length and position)
• Decoding: compute probability that a TL string is the translation of a SL string (‘translation model’), based on:
  • frequency of co-occurrence in aligned texts of corpus
  • position of SL words in SL string
• Adjustment: compute probability that a TL string is a valid TL sentence (based on a ‘language model’ of allowable bigrams and trigrams)
• search for TL string that maximizes these probabilities:
  $\arg\max_e P(e \mid f) = \arg\max_e P(f \mid e)\, P(e)$
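The search criterion follows from Bayes' rule: P(e | f) = P(f | e) P(e) / P(f), and since P(f) is the same for every candidate e, it can be dropped from the maximization. A minimal sketch of that final selection step is given below. The candidate strings and all probability values are invented for illustration, and a real decoder searches a far larger candidate space instead of scoring a fixed list.

```python
import math

# Hypothetical model scores for translating one source string f.
# translation_model[e] stands in for P(f | e); language_model[e] for P(e).
translation_model = {
    "the house is small": 0.20,
    "small the house is": 0.20,
    "the home is little": 0.05,
}
language_model = {
    "the house is small": 0.010,
    "small the house is": 0.0001,
    "the home is little": 0.004,
}

def decode(candidates):
    """Return the candidate e maximizing log P(f | e) + log P(e)."""
    return max(
        candidates,
        key=lambda e: math.log(translation_model[e]) + math.log(language_model[e]),
    )

best = decode(translation_model.keys())
print(best)
```

The first two candidates tie on the translation model, so the language model decides between them; this is the division of labour the two models are meant to have.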


Language Modeling
• Determines the probability of some English sequence $e_1^l$ of length $l$
• $P(e)$ is normally approximated as:
  $P(e_1^l) \approx P(e_1)\, P(e_2 \mid e_1) \prod_{i=3}^{l} P(e_i \mid e_{i-m}^{i-1})$
  where $m$ is the size of the context, i.e. the number of previous words that are considered
• $m = 1$: bi-gram language model
• $m = 2$: tri-gram language model
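As a rough illustration of the bi-gram case (m = 1), the conditional probabilities can be estimated by relative frequency from a corpus. The toy corpus, the `<s>` and `</s>` boundary markers, and the function names below are assumptions of this sketch; with a start marker, the initial term P(e_1) becomes P(e_1 | `<s>`). No smoothing is applied, so any unseen bigram gives probability zero.

```python
from collections import Counter

# Hypothetical toy training corpus; <s> and </s> mark sentence boundaries.
corpus = [
    "<s> the bank is open </s>",
    "<s> the bench is wet </s>",
    "<s> the bank is closed </s>",
]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    words = sentence.split()
    unigrams.update(words[:-1])            # count each word as a context
    bigrams.update(zip(words, words[1:]))  # count adjacent word pairs

def bigram_prob(prev: str, word: str) -> float:
    """Unsmoothed maximum-likelihood estimate of P(word | prev)."""
    return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

def sentence_prob(sentence: str) -> float:
    """Product of bigram probabilities over the sentence (m = 1)."""
    words = sentence.split()
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        prob *= bigram_prob(prev, word)
    return prob

print(sentence_prob("<s> the bank is open </s>"))
print(sentence_prob("<s> open the bank </s>"))
```

The second sentence starts with a bigram never seen in training, which is why practical language models add smoothing.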


Translation Modeling
• Determines the probability that the foreign word f is a translation of the English word e
• How to compute P(f | e) from a parallel corpus?
• Statistical approaches rely on the co-occurrence of e and f in the parallel data: if e and f tend to co-occur in parallel sentence pairs, they are likely to be translations of one another
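One way to turn co-occurrence into estimates of P(f | e) is expectation-maximization over word links, in the spirit of IBM Model 1 from Brown et al. (1993). The sketch below is a simplified version under stated assumptions: the three-sentence corpus is invented, there is no NULL word, and the number of iterations is arbitrary.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus of (English, foreign) sentence pairs.
pairs = [
    ("the house".split(), "das haus".split()),
    ("the book".split(), "das buch".split()),
    ("a book".split(), "ein buch".split()),
]

f_vocab = {f for _, fs in pairs for f in fs}
# t[(f, e)] approximates P(f | e); start from a uniform distribution.
t = defaultdict(lambda: 1.0 / len(f_vocab))

for _ in range(20):                      # EM iterations
    count = defaultdict(float)           # expected count of (f, e) links
    total = defaultdict(float)           # expected count of e being linked
    for es, fs in pairs:
        for f in fs:
            norm = sum(t[(f, e)] for e in es)
            for e in es:
                frac = t[(f, e)] / norm  # E-step: share of f explained by e
                count[(f, e)] += frac
                total[e] += frac
    for (f, e), c in count.items():      # M-step: renormalize per English word
        t[(f, e)] = c / total[e]

print(round(t[("buch", "book")], 3), round(t[("das", "book")], 3))
```

Although "das" and "buch" both co-occur with "book", "das" is better explained by "the" in the other sentence pairs, so the estimate for "buch" given "book" grows at the expense of "das".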


SMT issues
• ignores previous MT research (new start, new ‘paradigm’)
• basically ‘direct’ approach: replaces SL word by most probable TL word, reorders TL words
• decoding is effectively a kind of ‘back translation’
• originally wholly word-based (IBM ‘Candide’ 1988); now predominantly phrase-based (i.e. alignment of word groups); some research on syntax-based
• mathematically simple, but huge amount of training (large databases)
• problems for SMT:
  • translation is not just selecting the most frequent ‘equivalent’ (wider context)
  • no quality control of corpora
  • lack of monolingual data for some languages
  • insufficient bilingual data (Internet as resource)
  • lack of structure information of language
• merit of SMT: evaluation as integral process of system development


Rule-Based MT & SMT
• SMT black box: no way of finding how it works in particular cases, why it succeeds sometimes and not others
• RBMT: rules and procedures can be examined
• RBMT and SMT are apparent polar opposites, but gradually ‘rules’ incorporated in SMT models:
  • first, morphology (even in versions of first IBM model)
  • then, ‘phrases’ (with some similarity to linguistic phrases)
  • now also, syntactic parsing


Rule-Based MT & SMT
Comparison from the following perspectives:
• Theory background
• Knowledge expression
• Knowledge discovery
• Robustness
• Extensibility
• Development cycle


Evaluation of MT

• Manual: precision / fluency / integrality (信 达 雅: faithfulness, expressiveness, elegance)
• Automatic evaluation:
  • BLEU: percentage of word sequences (n-grams) occurring in reference texts
  • NIST
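The n-gram percentage behind BLEU can be sketched as a clipped ("modified") n-gram precision: each candidate n-gram is credited at most as many times as it occurs in a reference. The sentences and function names below are made up for illustration. Full BLEU additionally combines the precisions for several n-gram orders and applies a brevity penalty, which this sketch omits.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against reference translations."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, c in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], c)
    # Credit each candidate n-gram no more often than it appears in a reference.
    clipped = sum(min(c, max_ref[gram]) for gram, c in cand_counts.items())
    return clipped / sum(cand_counts.values())

candidate = "the cat is on the mat".split()
references = ["there is a cat on the mat".split()]
p1 = modified_precision(candidate, references, 1)
p2 = modified_precision(candidate, references, 2)
print(p1, p2)
```

Clipping matters here: "the" appears twice in the candidate but only once in the reference, so only one occurrence counts toward the unigram precision.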


Development of MT - MT System


MT Development - Research

[Figure: MT approaches plotted by knowledge acquisition strategy against knowledge representation strategy.]
• Knowledge Acquisition Strategy (from "All manual" to "Fully automated"): Hand-built by experts; Hand-built by non-experts; Learn from annotated data; Learn from un-annotated data
• Knowledge Representation Strategy (from "Shallow/Simple" to "Deep/Complex"): Word-based only; Phrase tables; Syntactic Constituent Structure; Semantic analysis; Interlingua
• Entries plotted: Electronic dictionaries; Original direct approach; Typical transfer system; Classic interlingual system; Example-based MT; Original statistical MT
• "New Research Goes Here!"


MT problems in general

• Characteristics of language: ambiguous, dynamic, flexible
• Knowledge: how to express, how to discover, how to use


Some Thinking about MT from cognition
• Human cerebrum: memory, progress (learning), model, pattern
• Translation by human…
• Translation by machine…


Further Reading
• Arturo Trujillo, Translation Engines: Techniques for Machine Translation, Springer-Verlag London Limited, 1999
• P.F. Brown, et al., A Statistical Approach to Machine Translation, Computational Linguistics, 1990, 16(2)
• P.F. Brown, et al., The Mathematics of Statistical Machine Translation: Parameter Estimation, Computational Linguistics, 1993, 19(2)
• Bonnie J. Dorr, et al., Survey of Current Paradigms in Machine Translation
• Makoto Nagao, A Framework of a Mechanical Translation between Japanese and English by Analogy Principle, in A. Elithorn and R. Banerji (Eds.), Artificial and Human Intelligence, NATO Publications, 1984
• W.J. Hutchins, Machine Translation: Past, Present, Future, Chichester: Ellis Horwood, 1986
• Daniel Jurafsky & James H. Martin, Speech and Language Processing, Prentice-Hall, 2000
• Christopher D. Manning & Hinrich Schütze, Foundations of Statistical Natural Language Processing, Massachusetts Institute of Technology, 1999
• James Allen, Natural Language Understanding, The Benjamin/Cummings Publishing Company, Inc., 1987