Optimizing Statistical Machine Translation for Text Simplification · Optimizing Statistical...

Preview:

Citation preview

Optimizing Statistical Machine Translation for Text Simplification

TACL paper @ ACL Aug-9-2016

Wei XuCourtney Napoles

Ellie Pavlich Quanze Chen

Chris Callison-Burch

UPenn ➞ Ohio State University JHU UPenn UPenn ➞ University of Washington UPenn

Text Simplification (A Monolingual Machine Translation Problem)

★ for children, disabled, non-native speakers … ★ for other NLP tasks (MT, summarization …)

Our Paper Last Year (TACL ’15)

Problems in Current Text Simplification Research (TACL ’15)

Data Evaluation System

not muchparaphrasing

no reliableautomatic metric

Simple Wikipediawas not that simple

The Newsela Corpus!very high qualityfreely available

(often use Moses as a black-box)

Wei Xu, Chris Callison-Burch, Courtney Napoles. “Problems in Current Text Simplification Research: New Data Can Help” TACL (2015)

This Paper and This Talk (TACL ’16)

Data Evaluation System

not muchparaphrasing

no reliableautomatic metric

Simple Wikipediawas not that simple

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

This Paper and This Talk (TACL ’16)

Data Evaluation System

not muchparaphrasing

no reliableautomatic metric

Simple Wikipediawas not that simple

Crowdsourcing3.5k sentences

8 referencesWei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in

TACL (2016)

This Paper and This Talk (TACL ’16)

Data Evaluation System

not muchparaphrasing

no reliableautomatic metric

Simple Wikipediawas not that simple

Crowdsourcing3.5k sentences

8 references

SARI the 1st metric

that worksWei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in

TACL (2016)

This Paper and This Talk (TACL ’16)

Data Evaluation System

not muchparaphrasing

no reliableautomatic metric

Simple Wikipediawas not that simple

Crowdsourcing3.5k sentences

8 references

SARI the 1st metric

that works

state-of-the-artparaphrasingfor simplicity

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

DataCrowdsourcing3.5k sentences

8 references

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

★ paraphrasing! (little deletion; no splitting) ★ simple quality control (see our NAACL 2015 paper) ★ freely available

DataCrowdsourcing3.5k sentences

8 references

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

★ paraphrasing! (little deletion; no splitting) ★ simple quality control (see our NAACL 2015 paper) ★ freely available

DataCrowdsourcing3.5k sentences

8 references

Wiki Jeddah is the principal gateway to Mecca, Islam’s holiest city, which able-bodied Muslims are required to visit at least once in their lifetime.

Turk #1 Jeddah is the main entrance to Mecca, the holiest city in Islam, which all healthy Muslims need to visit at least once in their life.

Our System

Jeddah is the main gateway to Mecca, Islam’s holiest city, which sound Muslims have to visit at least once in their life.

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

★ paraphrasing! (little deletion; no splitting) ★ simple quality control (see our NAACL 2015 paper) ★ freely available

DataCrowdsourcing3.5k sentences

8 references

Wiki Jeddah is the principal gateway to Mecca, Islam’s holiest city, which able-bodied Muslims are required to visit at least once in their lifetime.

Turk #1 Jeddah is the main entrance to Mecca, the holiest city in Islam, which all healthy Muslims need to visit at least once in their life.

Our System

Jeddah is the main gateway to Mecca, Islam’s holiest city, which sound Muslims have to visit at least once in their life.

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

★ paraphrasing! (little deletion; no splitting) ★ simple quality control (see our NAACL 2015 paper) ★ freely available

DataCrowdsourcing3.5k sentences

8 references

Wiki Jeddah is the principal gateway to Mecca, Islam’s holiest city, which able-bodied Muslims are required to visit at least once in their lifetime.

Turk #1 Jeddah is the main entrance to Mecca, the holiest city in Islam, which all healthy Muslims need to visit at least once in their life.

Our System

Jeddah is the main gateway to Mecca, Islam’s holiest city, which sound Muslims have to visit at least once in their life.

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

EvaluationSARI

the 1st metric that works

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

EvaluationSARI

the 1st metric that works

Input

SystemoutputHuman

references

keepO ∩ R ∩ I

addO ∩ R ∩ not I

delI ∩ not O ∩ not R

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

EvaluationSARI

the 1st metric that works

Input

SystemoutputHuman

references

keepO ∩ R ∩ I

addO ∩ R ∩ not I

delI ∩ not O ∩ not R

★ light-weighted and tunable ★ SARI — “the BLEU for simplification” ★ take input into account!★ use F-measure

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

EvaluationSARI

the 1st metric that works

Input

SystemoutputHuman

references

keepO ∩ R ∩ I

addO ∩ R ∩ not I

delI ∩ not O ∩ not R

SARI = d1Fadd + d2Fkeep + d3Pdel

d1 = d2 = d3 = 1/3

★ light-weighted and tunable ★ SARI — “the BLEU for simplification” ★ take input into account!★ use F-measure

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

Systemstate-of-the-artparaphrasingfor simplicity

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

Systemstate-of-the-artparaphrasingfor simplicity

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

★ In-depth adaptation of statistical MT (Joshua) • tuning data (small parallel corpus) • tunable metric (PRO pairwise ranking optimization) • PPDB (no large parallel corpus needed!) • simplification-specific features

Systemstate-of-the-artparaphrasingfor simplicity

★ In-depth adaptation of statistical MT (Joshua) • tuning data (small parallel corpus) • tunable metric (PRO pairwise ranking optimization) • PPDB (no large parallel corpus needed!) • simplification-specific features

Lexical applauded ⟶ praised

Phrasal generally acknowledged ⟶ widely accepted

Syntactic NNP’s JJ legislation ⟶ the JJ law of NNPWei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in

TACL (2016)

Input:

Lexical applauded ⟶ praised

Phrasal generally acknowledged ⟶ widely accepted

Syntactic NNP’s JJ legislation ⟶ the JJ law of NNP

PPDB (4+ million rules)

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

Input:

Candidate Simplifications:

Lexical applauded ⟶ praised

Phrasal generally acknowledged ⟶ widely accepted

Syntactic NNP’s JJ legislation ⟶ the JJ law of NNP

SARI = 0.21

SARI = 0.33

PPDB (4+ million rules)

SARI = 0.15Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in

TACL (2016)

Human Evaluation

0

1

2

3

4

Simple Wikipedia

Turk (Wubben et al. 2012) Joshua -BLEU-

Joshua -SARI-

Grammar Meaning Simplicity+

Automatic Simplification

tuning towards BLEU leaves input unchanged(8 references)

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

Human Evaluation

0

1

2

3

4

Simple Wikipedia

Turk Moses + reranking(Wubben et al. 2012)

Joshua -BLEU-

Joshua -SARI-

Grammar Meaning Simplicity+

Automatic Simplification

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

Human Evaluation

0

1

2

3

4

Simple Wikipedia

Turk Moses + reranking(Wubben et al. 2012)

Joshua -BLEU-

Joshua -SARI-

Grammar Meaning Simplicity+

Automatic Simplification

tuning towards BLEU leaves input unchangedWhy? Why? Why?Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in

TACL (2016)

Automatic Metrics (Spearman’s rho correlation w/ human ratings)

Grammar Meaning Simplicity

BLEU 0.59 0.70 0.15

FKBLEU 0.35 0.41 0.24

SARI 0.34 0.40 0.34

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

Automatic Metrics (Spearman’s rho correlation w/ human ratings)

Grammar Meaning Simplicity

BLEU 0.59 0.70 0.15

FKBLEU 0.35 0.41 0.24

SARI 0.34 0.40 0.34

this is not necessarily a bad thingWhy? Why? Why?

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

Gra

mm

ar

BLEU

0 0.25 0.5 0.75 1

BLEU correlates w/ Grammar

If a system leaves input unchanged, it gets

perfect Grammar perfect Meaning

and very high BLEU score.because there are multiple references

(unlike bilingual translation)

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

high Grammar high BLEU

Sim

plic

ity

0 0.25 0.5 0.75 1

BLEU is not for SimplificationG

ram

mar

BLEU

0 0.25 0.5 0.75 1

If a system leaves input unchanged, it gets

perfect Grammar perfect Meaning

and very high BLEU score.But it is not good simplification!

low Simplicity high BLEU

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

high Grammar high BLEU

Gra

mm

ar

BLEU

0 0.25 0.5 0.75 1

SARI is way better than BLEU

Gra

mm

arSARI

0.12 0.24 0.36 0.48 0.6

high Grammarcould be bad simplification

low SARI high Grammar

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

low BLEU high Grammar

Sim

plic

ity

BLEU

0 0.25 0.5 0.75 1

SARI is way better than BLEU

Sim

plic

itySARI

0.12 0.24 0.36 0.48 0.6

low Simplicitymust be bad simplification

low Simplicityhigh BLEU

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

low Simplicityhigh SARI

Machine Translation

Bilingual(Paraphrase =)Monolingual

naturally availableparallel text a lot much less

sensitive to error less more

objective straightforward sophisticated

standard metric BLEU etc. SARI and ?

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

Thank you

Sponsor: National Science FoundationData and code available at https://cocoxu.github.io/

Questions?

I am looking for PhD students and post-docs. Email me! Wei Xu @ Ohio State University

Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch. “Optimizing Statistical Machine Translation for Simplification” in TACL (2016)

Newsela Dataset (TACL ’15)

Wei Xu, Chris Callison-Burch, Courtney Napoles. “Problems in Current Text Simplification Research: New Data Can Help” TACL (2015)

every article at 5 levels of simplification total 50k+ sentences, written by trained editors, comes with comprehension quizzes

Recommended