MonoTrans2: A New Human Computation System to Support
Monolingual Translation
Chang Hu, Benjamin B. Bederson, Philip Resnik and Yakov Kronrod
Translating with people who speak only one language
International Children’s Digital Library
– 4,386 books
– 54 languages
– 100K unique visitors/month
– 1,500 volunteer translators
www.childrenslibrary.org
English and Spanish?
Croatian and Japanese?
Too Much to Translate
Fanm gen tranche pou fe` yon pitit nan Delmas 31
Undergoing children delivery Delmas 31
Munro, Robert. 2010. Crowdsourced translation for emergency response and beyond. NSF Workshop on crowdsourcing and translation, University of Maryland.
Uncommon Languages
Translation with bilingual translators
Wikipedia: 900 translators vs. 1,200,000 contributors
Translate with the Monolingual Crowd
Chang Hu. Collaborative Translation by Monolingual Users, CHI '09
Chang Hu, Benjamin B. Bederson, Philip Resnik. Translation by Iterative Collaboration between Monolingual Users (MonoTrans), GI '10
Source (Spanish): Estoy bien.
Target (English) candidates: Am fine. / I am fine.
Target side: 1 Vote on candidates; 2 Target-side editing
Source side: 1 Vote on back translation
Source (Spanish): Estoy bien.
Target (English) candidate: I am been.
Target side: 1 Vote on candidates; 2 Target-side editing; 3 Identify translation errors ("been", corresponding to "bien")
Source side: 1 Vote on back translation; 2 Explain phrase ("bien"); 3 Paraphrase source sentence (Yo estoy bien.)
repeat …
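The "Vote on candidates" step can be illustrated with a minimal sketch. The slides do not describe how votes are stored or tallied, so the `Candidate` class, the +1/-1 vote encoding, and the vote values below are all assumptions made for illustration; only the two English candidates come from the "Estoy bien." example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Candidate:
    """One translation candidate plus its votes (hypothetical structure)."""
    text: str
    # Assumed encoding: +1 for an up-vote, -1 for a down-vote.
    votes: List[int] = field(default_factory=list)

    def score(self) -> int:
        return sum(self.votes)


def best_candidate(candidates: List[Candidate]) -> Candidate:
    # max() returns the first maximal item, so ties go to the earlier candidate.
    return max(candidates, key=lambda c: c.score())


candidates = [Candidate("Am fine."), Candidate("I am fine.")]
candidates[0].votes.extend([1, -1])  # made-up votes for illustration
candidates[1].votes.extend([1, 1])   # made-up votes for illustration

best = best_candidate(candidates)
print(best.text)
```

With these made-up votes, the candidate with the higher net score is selected; a real system may weight or aggregate votes differently.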
Experiment 1 – Children’s Books
• 60 Spanish / 22 German speakers
• ICDL volunteers
• Worked on
– 4 Spanish books => German
– 1 German book => Spanish
• Machine translation engine: Google Translate
Evaluation of MonoTrans2 Output
• 2 German-Spanish bilingual evaluators (not part of MonoTrans2!)
• Fluency and accuracy
• 5-point score
• How much improvement over Google Translate?
Original: Estoy muy bien.
Fluent, not accurate: The weather is good.
Accurate, not fluent: Me is very good.
Ready for ICDL?
Ready: both bilingual evaluators agree score = 5
Machine translation (Google) only: 10% of sentences ready
MonoTrans2: 68% of sentences ready
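The "ready" criterion can be written as a small check. This is a sketch that assumes each evaluator gives a (fluency, accuracy) pair on the 5-point scale and that a sentence counts as ready only when every evaluator gives 5 on both; the function name and data layout are hypothetical.

```python
from typing import List, Tuple


def is_ready(scores: List[Tuple[int, int]]) -> bool:
    """scores holds one (fluency, accuracy) pair per bilingual evaluator."""
    # A sentence with no evaluations is not considered ready.
    if not scores:
        return False
    return all(fluency == 5 and accuracy == 5 for fluency, accuracy in scores)


print(is_ready([(5, 5), (5, 5)]))  # both evaluators give top marks
print(is_ready([(5, 5), (5, 4)]))  # one evaluator marks accuracy below 5
```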
[Figure: two bar charts comparing Google and MonoTrans2 across scores 1 to 5 (vertical axis 0 to 150)]
Experiment 2 – Haitian Earthquake SMS
• 4 Haitian Creole speakers
• 5 English-speaking students
• 21 other English speakers
• Worked on 408 text messages
Machine translation (Google) only: 25% of sentences ready
MonoTrans2: 38% of sentences ready
Difficulty: text messages >> children’s books
Sample Results
Haitian Creole: Enfòmasyon sou tranblemen de tè
Ground Truth: Information on the earthquake
Google: Information tranblemen ground
MonoTrans2: Information on the earthquake
Sample Results
Haitian Creole: Bonjou. Mwen ta renmen konnen si imigrasyon ouvè SVP. Mèsi.
Ground Truth: Hello. I would like to know if immigration is open please. Thank you.
Google: Hello. I would like to know if open immigration SVP. Thank you.
MonoTrans2: Hello. I would like to know if immigration is open please. Thank you.
• MonoTrans2
– No human bilingual knowledge
– Dramatic improvement from machine translation
translatetheworld.org
Recap
Take-Away Message
• People + machine > people or machine
• Combining two crowds with different skills
• Translation speed
– Professional translators: 2000 words per day
– MonoTrans2: 800 words per day
– Translation firm on the four German/Spanish books: 4 days
– MonoTrans2: 4 days
– Haitian SMS experiment: 284.75 words per minute
Ready for ICDL?
                               Google   MonoTrans2
Sentences with fluency = 5       21        112
Sentences with adequacy = 5      17        118
Sentences where BOTH = 5         17        110
Sentences for which both bilingual evaluators agree score = 5
(N=162 sentences worked on in the experiment)
Machine translation only: 10% of sentences ready
MonoTrans2: 68% of sentences ready
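The percentages follow from the "BOTH = 5" counts and N = 162. A short sketch of the arithmetic (variable names are illustrative, and rounding to whole percent is an assumption about how the slide figures were produced):

```python
N = 162  # sentences worked on in the experiment
both_five = {"Google": 17, "MonoTrans2": 110}  # sentences where BOTH = 5

# Share of sentences rated "ready", rounded to the nearest whole percent.
percent_ready = {
    system: round(100 * count / N) for system, count in both_five.items()
}

for system, pct in percent_ready.items():
    print(f"{system}: {both_five[system]}/{N} = {pct}%")
# Google: 17/162 = 10%
# MonoTrans2: 110/162 = 68%
```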
My family in Carrefour, 24 Cote Plage, 41A needs food and water
People trapped in Sacred Heart Church, PauP
General Hospital has less than 24 hrs. supplies
Undergoing children delivery Delmas 31
Experiment 3
• An alternative use case for crowdsourced translation…
Munro, Robert. 2010. Crowdsourced translation for emergency response and beyond. NSF Workshop on crowdsourcing and translation, University of Maryland.
Punchline (provisional)
                               Google      MonoTrans2
Sentences with fluency = 5      1 (1%)      22 (30%)
Sentences with adequacy = 5    11 (14%)     29 (38%)
Sentences where BOTH = 5        0 (0%)      14 (18%)
Sentences for which three bilingual evaluators agree score = 5
(N=76 sentences completed)
Straight MT: 0% of sentences preserve all the meaning
MonoTrans2: 38% of sentences preserve all the meaning