The Human Element in Translation Technology
Dr. Dorothy Kenny
Centre for Translation and Textual Studies
Dublin City University
Acknowledgements
We are grateful to the following for funding our participation in the IATIS conference: School of Applied Language and Intercultural Studies (SALIS), DCU (de Almeida & Dombek); DCU Research Committee (de Barra-Cusack & Mitchell), Centre for Next Generation Localisation (CNGL) (Doherty & Moorkens); Faculty of Humanities and Social Sciences, DCU (Kenny)
Magda Dombek's research is funded by SALIS, DCU
Fionnuala de Barra Cusack's research is funded by Foras na Gaeilge and SALIS, DCU
Joss Moorkens' and Stephen Doherty's research is supported by the Science Foundation Ireland (Grant 07/CE/I1142) as part of the CNGL (www.cngl.ie) at DCU
Giselle de Almeida's research is funded by the Irish Research Council for Science, Engineering and Technology (IRCSET) and VistaTEC
Linda Mitchell's research is funded by Symantec
Special thanks to: Lynne Bowker (University of Ottawa), Fred Hollowood (Symantec), Sharon O'Brien (DCU), Minako O'Hagan (DCU), Phil Ritchie (VistaTEC), Johan Roturier (Symantec), Reinhard Schäler (University of Limerick), Fiontar (DCU) and Google
Translation Technologies
Collaborative platforms
Termbanks
Translation memory (TM)
Machine translation (MT)
Rule-Based MT
Statistical MT
People
Developers
Managers
Translators
Revisers
Post-Editors
Readers
Researchers…
Mixed Methods
Automatic evaluation metrics (for MT)
Corpus-based methods
Eye-tracking
Questionnaires
Focus groups
Interviews
Human user evaluations
Contextual Inquiry
Netnography…
Motivation in Translation Crowdsourcing: A Study of Volunteer Polish Facebook User-Translators
Magdalena Dombek
Centre for Translation and Textual Studies
Dublin City University
Overview
Research background
Research question
Facebook and translation crowdsourcing
Theoretical framework
Methodology
Findings so far
Conclusions
Research background
Collaborative translation tools
New forms of translation practice
Research question
Translation process on Facebook
Translation System:
Collaborative translation platform
Glossary
Voting system
'Inline' mode
Social Networking Features:
Community group page
Post a message
Ask a question
Upload video/photo/document
Theoretical Framework
Self-Determination Theory
•Intrinsic Motivation (IM):
•Autonomy
•Competence
•Relatedness
•Cognitive Evaluation Theory:
•Social and environmental factors strengthen vs. undermine IM
Activity Theory
•Information technologies in the context of human practice:
•HCI: social, organizational and cultural context of computer usage
•Mediation: tools shape human experiences
Facebook translation crowdsourcing:
Translation as an activity which correlates with the needs of autonomy,
competence and relatedness
Translation as an activity mediated by tools
Methodology
Mixed methods approach:
Netnography:
ethnographic research online
study of a community in the online environment (observation, interaction, archival data analysis)
Online survey:
source of quantitative data (demographics, use of Facebook, Translations application, collaborative translation platform and motivation)
study with 20 Polish Facebook user-translators
Further experiment:
observational study of the usage of Facebook's collaborative translation platform
Findings: Intrinsic Motivation, Competence
I participate in the initiative because:
(median rating; scale: 0 = I completely disagree, 4 = I completely agree)
It is great fun and a way to spend free time: 3
It is a good way to fight boredom: 2.5
It gives me a lot of satisfaction: 3
I can use my skills in practice: 3
It is a good deed for the benefit of others: 4
It makes a good impression on family/friends: 1
Findings: Competence
Thanks to my participation in the initiative:
(median rating; scale: 0 = I completely disagree, 4 = I completely agree)
I meet and make friends with people who have similar interests: 1
I promote the Polish language: 3
I improve my qualifications and how employers perceive me: 2.5
I improve my knowledge of English, translation skills: 3
I help others without English who want to use Facebook: 4
I improve my reputation among other Facebook user-translators: 2
I improve how Polish Facebook users perceive me: 2
Findings: Competence
The platform is easy to use (number of respondents; median = 3):
completely disagree: 1
1: 2
2: 1
3: 8
completely agree: 1
The 'inline' mode is easy to use (number of respondents; median = 3):
completely disagree: 0
1: 1
2: 1
3: 2
completely agree: 3
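The two medians reported for the ease-of-use questions can be checked from the respondent counts. The following is a minimal sketch, assuming the counts on the slide are listed from "completely disagree" (scale point 0) to "completely agree" (scale point 4); the function name `likert_median` is hypothetical and not part of the study.

```python
from statistics import median


def likert_median(counts):
    """Return the median rating, where counts[i] is the number of
    respondents who chose scale point i."""
    # Expand the frequency table into one rating per respondent.
    responses = [point for point, n in enumerate(counts) for _ in range(n)]
    return median(responses)


# Counts as read from the slide (assumed order: scale points 0 to 4).
platform_counts = [1, 2, 1, 8, 1]  # "The platform is easy to use"
inline_counts = [0, 1, 1, 2, 3]    # "The 'inline' mode is easy to use"

print(likert_median(platform_counts))
print(likert_median(inline_counts))
```

Under that assumption both questions give a median of 3, which agrees with the values shown on the slide.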
Findings: Autonomy
How often do you contribute? (number of respondents)
a few times a year: 0
a few times a month: 6
a few times a week: 13
at least once a day: 1
How long does an individual session last? (number of respondents)
over 3 hours: 0
2 to 3 hours: 1
1 to 2 hours: 8
less than 1 hour: 11
Findings: Competence and Relatedness
How important is it for you:
(median rating; scale: 0 = completely unimportant, 4 = extremely important)
to have your name on a leaderboard: 1
to work for Facebook: 1.5
to improve the quality of the Polish translation: 3.5
to receive praise from others: 2
to equal others in the volume of contribution: 1
to support the community of Polish user-translators: 3
Findings: Relatedness
To what extent do you agree:
(median rating; scale: 0 = I completely disagree, 4 = I completely agree)
By translating/voting I repay other translators for their contributions: 3.5
As a member of the translators' community I feel responsible for the work of all: 3
I know I can count on help of other Facebook user-translators in case of translation difficulty: 3
Conclusions
Intrinsic motivation to contribute further strengthened (as postulated in SDT):
Activity of Facebook translation conducive to the translators' increased perceptions of autonomy, competence and relatedness
Persisting high levels of self-confidence, belief in one's skills, task aptitude
Strong perception of community spirit
Metadata Use
www.focal.ie
Fionnuala de Barra-Cusack
Centre for Translation and Textual Studies
Dublin City University
Internal Users
www.focal.ie
Classification for subject-field headings
DANTERM classification
External Users
Previous Research:
Focus on comparative uses, i.e. monolingual/bilingual or learner/expert
Largely print-based
Mostly questionnaires, TAPs, interviews and tests
My Contribution:
Metadata use (very little research to date)
Electronic, online dictionaries
HCI
Focus groups
Contextual inquiry
Eye-tracking
Pilot Study
Focus groups – hypotheses generation
And Next…
Contextual inquiry - observer as partner
Context - workplace
Method – partnership
Goal – uncover unarticulated aspects of work
Eye-tracking – hypotheses testing
Actual use
Visual attention
A Case Study of Inconsistency in Translation Memories
Dr. Joss Moorkens
Centre for Translation and Textual Studies
Dublin City University
Translation Memory (TM)
Prior Research
Rieche 2004: Errors are introduced to TMs unless there is “systematic control of memory through procedures for regular review and maintenance.”
Bowker 2005: “Although it is frequently claimed that TMs improve consistency, this is not always the case”
Ribas López 2007: “TM systems can potentially help spread errors”
Aims of research
Develop a method for measuring consistency in translation memories
Use this method on TMs from the localisation industry to find out whether TM promotes consistency in translation
Pilot Study
Pilot study found TT inconsistencies in TM data
Identification of categories of inconsistency
More than one type of inconsistency per segment
Inconsistencies also in source text
Single case study – difficult to generalise
Moorkens, J. 2009. Total Recall? A Case Study of Consistency in Translation Memory. In: Proceedings of LRC XIV, 24th and 25th September, Limerick, Ireland, pp. 101-110.
Methodology
Sequential mixed methods study
Quantitative phase: empirical study of inconsistencies in ST and TT of 4 TMs from industry (two English-German, two English-Japanese)
Qualitative phase: series of interviews with translators and others in various roles related to TM from industry to add explanatory detail and suggest methods for minimising inconsistency
Phase One: Quantitative Data Collection -> Quantitative Data Analysis -> Quantitative Results -> Identify Results for Follow-Up
Phase Two: Qualitative Data Collection -> Qualitative Data Analysis -> Qualitative Results -> Interpretation (Quantitative->Qualitative)
Adapted from Creswell and Plano Clark (2007, p73).
Categories of Inconsistency
Find the Appropriate Ellipse for the Axis | 軸に対して適切な楕円の検索
Find the appropriate ellipse for the axis | 特定の軸に適合るす楕円
Case sensitive | 大文字と小文字の区別 [1]
Case Sensitive | 大文字と小文字の区別
[1] 'Daimoji to komoji no kubetsu' [The distinction between upper case and lower case letters]
A new layer group filter can be nested only under another group filter. | 新しいレイヤグループ フィルタは、他のグループ フィルタに対してのみネストできます。
A new layer group filter can be nested only under another group filter. | 新しい画層グループ フィルタは、他のグループ フィルタに対してのみネストできます。
Findings from first phase
4.1s All lines that have been converted using the {1}Create surface borders{2} function can be recognized easily since they are drawn with the {3}Border{4} pen. | 4.1.1t Alle Linien, die mit der Funktion {1}Flächenränder anlegen{2} konvertiert wurden, können Sie leicht erkennen, da sie mit dem Stift mit der Bezeichnung {3}Border{4} gezeichnet werden.
4.1s All lines that have been converted using the {1}Create surface borders{2} function can be recognized easily since they are drawn with the {3}Border{4} pen. | 4.1.2t Alle Linien, die mit der Funktion {1} Flächenränder anlegen{2} konvertiert wurden, können Sie leicht erkennen, da sie mit dem Stift mit der Bezeichnung {3}Rand{4} gezeichnet werden.
{1}Swap the Colors.{2} | {1}Farben austauschen{2}
{1}Swap the Colors{2}. | {1}Farben tauschen{2}
(Do Not Add to Workspaces or Add to Woprkspaces) | (Zu Arbeitsbereichen Nicht Hinzufügen oder Zu Arbeitsbereichen Hinzufügen)
(Do Not Add to Workspaces or Add to Workspaces) | (Zu Arbeitsbereichen nicht hinzufügen oder Zu Arbeitsbereichen hinzufügen)
At the Command prompt, enter subtract. | Geben Sie in der Befehlszeile differenz ein.
At the command prompt, enter subtract. | Geben Sie an der Eingabeaufforderung DIFFERENZ ein.
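Segment pairs like the ones shown on these slides can be flagged mechanically. The following is a minimal sketch and not the method used in the study: it assumes a TM is available as a list of (source, target) pairs and groups them by a normalised source (lower-cased, punctuation and tag braces removed), then reports groups with more than one source form or more than one target. The names `normalise` and `find_inconsistencies` are hypothetical.

```python
import string
from collections import defaultdict


def normalise(source):
    """Lower-case the source, drop ASCII punctuation, collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(source.lower().translate(table).split())


def find_inconsistencies(segments):
    """Group (source, target) pairs by normalised source and report every
    group that has more than one source form or more than one target."""
    groups = defaultdict(list)
    for source, target in segments:
        groups[normalise(source)].append((source, target))
    report = {}
    for key, pairs in groups.items():
        sources = {s for s, _ in pairs}
        targets = {t for _, t in pairs}
        if len(sources) > 1 or len(targets) > 1:
            report[key] = {"sources": sorted(sources), "targets": sorted(targets)}
    return report


# Segment pairs taken from the slides.
tm = [
    ("{1}Swap the Colors.{2}", "{1}Farben austauschen{2}"),
    ("{1}Swap the Colors{2}.", "{1}Farben tauschen{2}"),
    ("Case sensitive", "大文字と小文字の区別"),
    ("Case Sensitive", "大文字と小文字の区別"),
]

for key, found in find_inconsistencies(tm).items():
    print(key, found)
```

With these four pairs, the "Swap the Colors" group is reported for both source punctuation and target wording, while the "Case sensitive" group is reported for source letter case only. A real TM would need format-specific parsing (for example of TMX files), which this sketch leaves out.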
Findings from second phase
• All of the inconsistencies from the quantitative study are common in the experiences of interviewees
• Constant problems with inadequate or ambiguous source text
• Outdated glossaries, multiple concordance results and inaccurate fuzzy match measurement lead to translators introducing inconsistency
• Locked 100% matches often incorrect
• Current TM tools are inadequate: requirement for extra levels of review and QA
Conclusions
Inconsistency is a genuine problem
Inconsistent source text:
Letter case
Punctuation
Inconsistent target text:
Terminology
Formatting
Tags
Clients, translators and developers all have a role in minimising inconsistency
Inconsistency costs
Recommendations:
Standardisation
Maintenance
Thank you!
This research is supported by the Science Foundation Ireland (Grant 07/CE/I1142) as part of the Centre for Next Generation Localisation
(www.cngl.ie) at Dublin City University.
Reading and Comprehension of 'Controlled' Machine Translated Texts
Dr. Stephen Doherty
Centre for Translation and Textual Studies
Dublin City University
Overview
Motivation
Concepts
Study Design
Results
Conclusions
Motivation
Increased need for automatic translation
Quality of machine translations [MT]
Benefits of controlled language [CL] and MT
Inclusion of end-users
Mix of human and automated evaluation
Better input = better output, right?
Reader (Comprehensibility) - Interaction (Reading) - Text (Readability)
Study Design
Is 'controlled' MT output more readable and comprehensible?
Tech. support documentation
EN-FR, n = 25
Reading and evaluation task
Retrospective protocols
Previous validation of method
Controlled Input -> MT Systems -> Controlled Output
1. If you will be performing such an activity and want to avoid the warning, you can temporarily disable the program.
2. If you perform such an activity and want to avoid the warning, you can temporarily disable the program.
1. Si vous serez effectuer une telle activité et que vous voulez éviter l'avertissement, vous pouvez désactiver temporairement le programme.
2. Si vous effectuez une telle activité et que vous voulez éviter l'avertissement, vous pouvez désactiver temporairement le programme.
1. Wenn Sie die Durchführung einer solchen Tätigkeit werden und wollen, um die Warnung zu vermeiden, können Sie vorübergehend deaktivieren Sie das Programm.
2. Wenn Sie eine solche Tätigkeit und durchführen wollen, um die Warnung zu vermeiden, können Sie vorübergehend deaktivieren Sie das Programm.
1. このようなアクティビティを実行し、警告を回避するする場合は、プログラムを一時的に無効にすることができます。
2. このようなアクティビティを実行し、警告を回避したい場合は、プログラムを一時的に無効にすることができます。
Results
Readability Indices: Improved readability, significant* at batch level
Eye Tracking: Significant* improvements for fixation count and duration, and regressions [cognitive effort], but not pupil dilation
Human Evaluation: Significantly* higher ratings for both readability and comprehension
Automatic Evaluation Metrics: Slight degradation! [BLEU, GTM, TER]
Recall: Significantly* higher
[*p < 0.05]
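To give a feel for what an edit-based automatic metric measures, here is a minimal sketch of a word-level edit rate. It is related to TER but is a simplification: it counts only word insertions, deletions and substitutions and omits TER's shift operation, and it is not the implementation used in the study. The names `word_edit_distance` and `edit_rate` are hypothetical, and the second French output from the example slide is treated as a reference purely for illustration.

```python
def word_edit_distance(hypothesis, reference):
    """Minimum number of word insertions, deletions and substitutions
    needed to turn the hypothesis into the reference."""
    hyp, ref = hypothesis.split(), reference.split()
    previous = list(range(len(ref) + 1))  # distances from an empty hypothesis
    for i, h in enumerate(hyp, start=1):
        current = [i]  # distance to an empty reference: delete i words
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            current.append(min(previous[j] + 1,          # delete h
                               current[j - 1] + 1,       # insert r
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


def edit_rate(hypothesis, reference):
    """Edit distance divided by the number of reference words."""
    return word_edit_distance(hypothesis, reference) / len(reference.split())


# French MT outputs from the example slide (1: original input, 2: controlled input).
uncontrolled = ("Si vous serez effectuer une telle activité et que vous voulez éviter "
                "l'avertissement, vous pouvez désactiver temporairement le programme.")
controlled = ("Si vous effectuez une telle activité et que vous voulez éviter "
              "l'avertissement, vous pouvez désactiver temporairement le programme.")

print(word_edit_distance(uncontrolled, controlled))
print(edit_rate(uncontrolled, controlled))
```

The two outputs differ by two word-level edits (one deletion and one substitution) over an 18-word reference. Scores of this kind depend entirely on the reference chosen, which is one reason they can disagree with human ratings.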
Conclusions
Application: CL is worth it when humans are concerned;
Focus on which rules work best;
Improvements along the workflow;
Predictors of readable/comprehensible text.
Research: Humans can be unreliable, machines can be too rigid;
Mixed methods, especially for human process research;
Including many points of view, even if you don't agree with them!
Modelling post-editing behaviour to design tools and training
Giselle de Almeida
Centre for Translation and Textual Studies
Dublin City University
Overview
Definition of post-editing
Market for post-editing
Post-editing issues
Research questions
Application
Framework
Methodology
Findings and insights
Recommendations
Definition of post-editing
“Editing, modifying and/or correcting a pre-translated text processed by a MT system from a source language into one or more target languages.” (Jeffrey Allen)
Market for post-editing
Among nearly 1,000 language service providers around the world, 41.2% claimed to offer post-edited machine translation
(The Market for MT Post-editing, Donald A. DePalma and Vijayalaxmi Hegde, November 2010)
Post-editing issues
Still relatively unknown to many translators
No internationally adopted standard guidelines (company-specific instead)
Not much information or training available
New methodologies and tools to learn
Expected level of language quality determined by each project/company
Expected higher productivity
Research questions
Does the level of previous experience with translation influence the performance of translators when doing post-editing tasks?
If so, does the level of experience have a positive or a negative impact on the performance in terms of time spent and fitness for purpose?
Are the same post-editing strategies employed across languages of the same family? (Test case: French and Brazilian Portuguese.)
Application
Indication of post-editing effort
Identification of problem areas for post-editors
Selection of candidates for post-editing training
Elaboration of post-editing training and guidelines and improvement of MT engines and/or dictionaries
Framework
Post-editing (Krings 2001, Loffler-Laurian 1984, Allen 2003, O'Brien 2006)
Translation memory tools (Guerberof 2008, Rieche 2004)
Views of translators regarding machine translation (Fulford 2002, Araújo 2004)
Post-editing training (O'Brien 2002)
Post-editing guidelines (Allen 2003, Schäefer 2003)
Automated post-editing (Vasconcellos 1986, Allen 2001, Simard et al. 2007, Font Llitjós 2007, Guzmán 2008)
Profile of a good post-editor (Offersgaard et al. 2008)
Methodology
Typology to classify post-editing changes: based on LISA (Localisation Industry Standards Association) QA Model
Corpus: segments from IT texts
Recording of sessions with participants
Screen recording
Keyboard/mouse logging
Methodology (cont.)
Data analysis
Total time taken per participant
Types of changes made
Editing strategies
Correlation of translation experience and post-editing performance
Correlations between languages
Findings and insights
Correlation of translation experience vs. post-editing performance
Identification of main post-editing strategies employed and changes made
Language: category with the highest number of essential corrections
Highest number of Language changes: agreement (gender, number), phrasal ordering, determiners
Recommendations
Advisable to implement intermediate phase in the MT workflow: automatic correction of common error patterns (with constant feedback from post-editors)
Guidelines should be very specific and clear, and provide many examples of changes to implement and those to avoid
Adapt post-editing training according to previous experience
Monolingual Post-Editing in an Online Community
Linda Mitchell
Centre for Translation and Textual Studies
Dublin City University
Overview
Norton Community
Motivation
Aim
Research Questions
Previous Research
Methodology
Types of Users
Lurker
Once-off user
Regular user
Guru
Motivation
English Forum: User 1 posts Question1 (EN); User 2 posts Answer1 (EN)
MT
German Forum: Frage1, Antwort1; User 3 edits (DE); User 4 posts a question (DE)
Aim
Make machine translated post-edited content available to users of different languages
Research Questions
Is monolingual post-editing feasible in an online community environment?
Quality of output
Changes users make
What motivates users to post-edit?
Post-Editing
Bilingual: post-editor has access to source text and machine translated output
Monolingual: post-editor only has access to machine translated output
Why monolingual?
Users do not speak source language
Previous Research
Bilingual PE:
MT & bilingual PE
MT & automated bilingual PE
Monolingual PE:
MonoTrans – Hu et al. 2010/11
Language Grid – Lin et al. 2010
Translation Options – Koehn 2010
New: Online Community, Domain Experts, Motivation
Experimental Set-Up
German-speaking users will post-edit posts that have been machine translated (EN-DE)
Evaluation:
Professional evaluation:
Comprehensibility
Fidelity
User evaluation:
“Acceptability” in forum context
PE Interface
Evaluation Interface
User Motivation
Motivation
Extrinsic/Intrinsic (Deci & Ryan 1985)
Development of Skills
Work Experience
Rewards
Reputation
Reciprocity
Community
Fun
Passing of Time
Dedication
User Motivation
How to measure motivation?
QUAN (e.g. Oreg and Nov 2008)
QUAL (e.g. Brabham 2010)
Mixed methods:
Questionnaire
Interviews
Next Steps
Participants to be recruited by September
Pilot Study in October
93