The Human Element in Translation Technology

Dr. Dorothy Kenny

Centre for Translation and Textual Studies

Dublin City University

Acknowledgements

We are grateful to the following for funding our participation in the IATIS conference: School of Applied Language and Intercultural Studies (SALIS), DCU (de Almeida & Dombek); DCU Research Committee (de Barra-Cusack & Mitchell); Centre for Next Generation Localisation (CNGL) (Doherty & Moorkens); Faculty of Humanities and Social Sciences, DCU (Kenny)

Magda Dombek's research is funded by SALIS, DCU

Fionnuala de Barra-Cusack's research is funded by Foras na Gaeilge and SALIS, DCU

Joss Moorkens' and Stephen Doherty's research is supported by the Science Foundation Ireland (Grant 07/CE/I1142) as part of the CNGL (www.cngl.ie) at DCU

Giselle de Almeida's research is funded by the Irish Research Council for Science, Engineering and Technology (IRCSET) and VistaTEC

Linda Mitchell's research is funded by Symantec

Special thanks to: Lynne Bowker (University of Ottawa), Fred Hollowood (Symantec), Sharon O'Brien (DCU), Minako O'Hagan (DCU), Phil Ritchie (VistaTEC), Johan Roturier (Symantec), Reinhard Schäler (University of Limerick), Fiontar (DCU) and Google

Translation Technologies

Collaborative platforms

Termbanks

Translation memory (TM)

Machine translation (MT)

Rule-Based MT

Statistical MT

People

Developers

Managers

Translators

Revisers

Post-Editors

Readers

Researchers…

Mixed Methods

Automatic evaluation metrics (for MT)

Corpus-based methods

Eye-tracking

Questionnaires

Focus groups

Interviews

Human user evaluations

Contextual Inquiry

Netnography…


Different Stages

Email: [email protected]

Motivation in Translation Crowdsourcing: A Study of Volunteer Polish Facebook User-Translators

Magdalena Dombek

Overview

Research background

Research question

Facebook and translation crowdsourcing

Theoretical framework

Methodology

Findings so far

Conclusions

Research background

collaborative translation tools

new forms of translation practice

Research question

Translation process on Facebook

Translation System:

Collaborative translation platform

Glossary

Voting system

‘Inline’ mode

Social Networking Features:

Community group page

Post a message

Ask a question

Upload video/photo/document

Theoretical Framework

Self-Determination Theory

•Intrinsic Motivation (IM):

•Autonomy

•Competence

•Relatedness

•Cognitive Evaluation Theory:

•Social and environmental factors strengthen vs. undermine IM

Activity Theory

•Information technologies in the context of human practice:

•HCI: social, organizational and cultural context of computer usage

•Mediation: tools shape human experiences

Facebook translation crowdsourcing:

Translation as an activity which correlates with the needs for autonomy, competence and relatedness

Translation as an activity mediated by tools

Methodology

Mixed methods approach:

Netnography:

ethnographic research online

study of a community in the online environment (observation, interaction, archival data analysis)

Online survey:

source of quantitative data (demographics, use of Facebook, Translations application, collaborative translation platform and motivation)

study with 20 Polish Facebook user-translators

Further experiment:

observational study of the usage of Facebook's collaborative translation platform

Findings: Intrinsic Motivation, Competence

I participate in the initiative because (median; scale: 0 = I completely disagree, 4 = I completely agree):

It is great fun and a way to spend free time: 3

It is a good way to fight boredom: 2.5

It gives me a lot of satisfaction: 3

I can use my skills in practice: 3

It is a good deed for the benefit of others: 4

It makes a good impression on family/friends: 1

Findings: Competence

Thanks to my participation in the initiative (median):

I meet and make friends with people who have similar interests: 1

I promote Polish language: 3

I improve my qualifications and how employers perceive me: 2.5

I improve my knowledge of English, translation skills: 3

I help others without English who want to use Facebook: 4

I improve my reputation among other Facebook user-translators: 2

I improve how Polish Facebook users perceive me: 2

Findings: Competence

The platform is easy to use (# of respondents; median = 3):

completely disagree: 1

1: 2

2: 1

3: 8

completely agree: 1

The 'inline' mode is easy to use (# of respondents; median = 3):

completely disagree: 0

1: 1

2: 1

3: 2

completely agree: 3
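The medians reported for the two ease-of-use charts can be recomputed from the respondent counts. The sketch below assumes the counts 1, 2, 1, 8, 1 and 0, 1, 1, 2, 3 run in category order from "completely disagree" (0) to "completely agree" (4); the function name `median_category` is hypothetical, and for an even number of responses it simply returns the lower of the two middle categories.

```python
def median_category(counts):
    """Return the median category index for ordinal response counts.

    counts[i] is the number of respondents who chose category i
    (assumed here: 0 = completely disagree ... 4 = completely agree).
    With an even number of responses, the lower middle category is
    returned (a convention chosen for this sketch only).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no responses")
    middle = (total + 1) // 2  # 1-based position of the (lower) median response
    running = 0
    for category, n in enumerate(counts):
        running += n
        if running >= middle:
            return category


# Counts as read from the two charts (assumed category order).
platform_counts = [1, 2, 1, 8, 1]
inline_counts = [0, 1, 1, 2, 3]

print(median_category(platform_counts))  # 3
print(median_category(inline_counts))    # 3
```

Both values agree with the "[median = 3]" labels on the slide, which supports reading the counts in that category order.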

Findings: Autonomy

How often do you contribute? (# of respondents):

a few times a year: 0

a few times a month: 6

a few times a week: 13

at least once a day: 1

How long does an individual session last? (# of respondents):

over 3 hours: 0

2 to 3 hours: 1

1 to 2 hours: 8

less than 1 hour: 11

Findings: Competence and Relatedness

How important is it for you (median; scale: 0 = completely unimportant, 4 = extremely important):

to have your name on a leaderboard: 1

to work for Facebook: 1.5

to improve the quality of the Polish translation: 3.5

to receive praise from others: 2

to equal others with the volume of contribution: 1

to support the community of Polish user-translators: 3

Findings: Relatedness

To what extent do you agree (median):

By translating/voting I repay other translators for their contributions: 3.5

As a member of the translators' community I feel responsible for the work of all: 3

I know I can count on help of other Facebook user-translators in case of translation difficulty: 3

Conclusions

Intrinsic motivation to contribute further strengthened (as postulated in SDT):

Activity of Facebook translation conducive to the translators' increased perceptions of autonomy, competence and relatedness

Persisting high levels of self-confidence, belief in one's skills, task aptitude

Strong perception of community spirit


Thank You!

Metadata Use

www.focal.ie

Fionnuala de Barra-Cusack



Internal Users

www.focal.ie

Classification for subject-field headings

DANTERM classification


External Users

Previous Research:

Focus on comparative uses, i.e. monolingual/bilingual or learner/expert

Largely print-based

Mostly questionnaires, TAPs, interviews and tests

My Contribution:

Metadata use: very little research to date

Electronic, online dictionaries

HCI

Focus groups

Contextual inquiry

Eye-tracking

Pilot Study

Focus groups – hypotheses generation


And Next…

Contextual inquiry - observer as partner

Context - workplace

Method – partnership

Goal – uncover unarticulated aspects of work


And Next…

Eye-tracking – hypotheses testing

Actual use

Visual attention

And Next…

A Case Study of Inconsistency in Translation Memories

Dr. Joss Moorkens

Translation Memory (TM)

Prior Research

Rieche 2004: Errors are introduced to TMs unless there is “systematic control of memory through procedures for regular review and maintenance.”

Bowker 2005: “Although it is frequently claimed that TMs improve consistency, this is not always the case”

Ribas López 2007: “TM systems can potentially help spread errors”

Aims of research

Develop a method for measuring consistency in translation memories

Use this method on TMs from the localisation industry to find whether TM promotes consistency in translation

Pilot Study

Pilot study found TT inconsistencies in TM data

Identification of categories of inconsistency

More than one type of inconsistency per segment

Inconsistencies also in source text

Single case study – difficult to generalise

Moorkens, J. 2009. Total Recall? A Case Study of Consistency in Translation Memory. In: Proceedings of LRC XIV, 24th and 25th September, Limerick, Ireland, pp. 101-110.

Methodology

Sequential mixed methods study

Quantitative phase: empirical study of inconsistencies in ST and TT of 4 TMs from industry (two English-German, two English-Japanese)

Qualitative phase: series of interviews with translators and others in various roles related to TM from industry to add explanatory detail and suggest methods for minimising inconsistency

Phase One: Quantitative Data Collection -> Quantitative Data Analysis -> Quantitative Results -> Identify Results for Follow-Up

Phase Two: Qualitative Data Collection -> Qualitative Data Analysis -> Qualitative Results -> Interpretation (Quantitative -> Qualitative)

Adapted from Creswell and Plano Clark (2007, p. 73).

Categories of Inconsistency

Find the Appropriate Ellipse for the Axis 軸に対して適切な楕円の検索

Find the appropriate ellipse for the axis 特定の軸に適合るす楕円

Case sensitive 大文字と小文字の区別

Case Sensitive 大文字と小文字の区別

[1] 'Daimoji to komoji no kubetsu' [The distinction between upper case and lower case letters]

A new layer group filter can be nested only under another group filter.

新しいレイヤグループフィルタは、他のグループフィルタに対してのみネストできます。

A new layer group filter can be nested only under another group filter.

新しい画層グループフィルタは、他のグループフィルタに対してのみネストできます。
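One way to make target-text inconsistency concrete is to group TM segments whose source texts differ only in letter case, tags or final punctuation, and flag any group with more than one distinct target. The sketch below is illustrative only and is not the measurement method developed in the study; the names `normalise` and `find_inconsistencies`, and the normalisation rules themselves, are assumptions. The sample segments are taken from the slides.

```python
import re
from collections import defaultdict

# Assumed tag format: numbered placeholders such as {1} and {2}.
TAG = re.compile(r"\{\d+\}")


def normalise(source):
    """Reduce a source segment to a comparison key (assumed rules):
    drop placeholder tags, trailing punctuation, case and extra spaces."""
    text = TAG.sub("", source)
    text = text.strip().rstrip(".!?:;").strip()
    return " ".join(text.casefold().split())


def find_inconsistencies(segments):
    """Map each normalised source to its distinct targets,
    keeping only sources that have more than one target."""
    targets = defaultdict(set)
    for source, target in segments:
        targets[normalise(source)].add(target)
    return {src: sorted(tgts) for src, tgts in targets.items() if len(tgts) > 1}


segments = [
    ("Case sensitive", "大文字と小文字の区別"),
    ("Case Sensitive", "大文字と小文字の区別"),
    ("{1}Swap the Colors.{2}", "{1}Farben austauschen{2}"),
    ("{1}Swap the Colors{2}.", "{1}Farben tauschen{2}"),
]

for source, variants in find_inconsistencies(segments).items():
    print(source, variants)
```

With these four segments only "swap the colors" is flagged: the two "Case sensitive" sources are inconsistent in letter case but share a single target, so a source-side check would need a separate pass.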

Findings from first phase

4.1s All lines that have been converted using the {1}Create surface borders{2} function can be recognized easily since they are drawn with the {3}Border{4} pen.

4.1.1t Alle Linien, die mit der Funktion {1}Flächenränder anlegen{2} konvertiert wurden, können Sie leicht erkennen, da sie mit dem Stift mit der Bezeichnung {3}Border{4} gezeichnet werden.

4.1s All lines that have been converted using the {1}Create surface borders{2} function can be recognized easily since they are drawn with the {3}Border{4} pen.

4.1.2t Alle Linien, die mit der Funktion {1} Flächenränder anlegen{2} konvertiert wurden, können Sie leicht erkennen, da sie mit dem Stift mit der Bezeichnung {3}Rand{4} gezeichnet werden.

{1}Swap the Colors.{2} {1}Farben austauschen{2}

{1}Swap the Colors{2}. {1}Farben tauschen{2}

(Do Not Add to Workspaces or Add to Woprkspaces)

(Zu Arbeitsbereichen Nicht Hinzufügen oder Zu Arbeitsbereichen Hinzufügen)

(Do Not Add to Workspaces or Add to Workspaces)

(Zu Arbeitsbereichen nicht hinzufügen oder Zu Arbeitsbereichen hinzufügen)

At the Command prompt, enter subtract. Geben Sie in der Befehlszeile differenz ein.

At the command prompt, enter subtract. Geben Sie an der Eingabeaufforderung DIFFERENZ ein.

Findings from second phase

• All of the inconsistencies from the quantitative study are common in the experiences of interviewees

• Constant problems with inadequate or ambiguous source text

• Outdated glossaries, multiple concordance results and inaccurate fuzzy match measurement lead to translators introducing inconsistency

• Locked 100% matches often incorrect

• Current TM tools are inadequate – requirement for extra levels of review and QA

Conclusions

Inconsistency is a genuine problem

Inconsistent source text:

Letter case

Punctuation

Inconsistent target text:

Terminology

Formatting

Tags

Clients, translators, developers all have a role in minimising inconsistency

Inconsistency costs

Recommendations:

Standardisation

Maintenance

Thank you!

This research is supported by the Science Foundation Ireland (Grant 07/CE/I1142) as part of the Centre for Next Generation Localisation (www.cngl.ie) at Dublin City University.

Reading and Comprehension of 'Controlled' Machine Translated Texts

Dr. Stephen Doherty

Overview

Motivation

Concepts

Study Design

Results

Conclusions


Motivation

Increased need for automatic translation

Quality of machine translations [MT]

Benefits of controlled language [CL] and MT

Inclusion of end-users

Mix of human and automated evaluation

Better input = better output, right?

[Diagram: Reader and Text connected through Interaction (Reading); labelled Comprehensibility and Readability]

Study Design

Is 'controlled' MT output more readable and comprehensible?

Tech. support documentation

EN-FR, n = 25

Reading and evaluation task

Retrospective protocols

Previous validation of method

[Diagram labels: Input, Output, Controlled Input, MT Systems, Controlled Output]

1. If you will be performing such an activity and want to avoid the warning, you can temporarily disable the program.

2. If you perform such an activity and want to avoid the warning, you can temporarily disable the program.

1. Si vous serez effectuer une telle activité et que vous voulez éviter l'avertissement, vous pouvez désactiver temporairement le programme.

2. Si vous effectuez une telle activité et que vous voulez éviter l'avertissement, vous pouvez désactiver temporairement le programme.

1. Wenn Sie die Durchführung einer solchen Tätigkeit werden und wollen, um die Warnung zu vermeiden, können Sie vorübergehend deaktivieren Sie das Programm.

2. Wenn Sie eine solche Tätigkeit und durchführen wollen, um die Warnung zu vermeiden, können Sie vorübergehend deaktivieren Sie das Programm.

1. このようなアクティビティを実行し、警告を回避するする場合は、プログラムを一時的に無効にすることができます。

2. このようなアクティビティを実行し、警告を回避したい場合は、プログラムを一時的に無効にすることができます。

Results

Readability Indices: Improved readability, significant* at batch level

Eye Tracking: Significant* improvements for fixation count and duration, and regressions [cognitive effort], but not pupil dilation

Human Evaluation: Significantly* higher ratings for both readability and comprehension

Automatic Evaluation Metrics: Slight degradation! [BLEU, GTM, TER]

Recall: Significantly* higher

[*p < 0.05]
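To give a sense of what an edit-based automatic metric measures, the sketch below computes a word-level edit rate between an MT output and a reference translation. It is a simplification in the spirit of TER, not TER itself: real TER also allows block shifts, and the slides do not say which tools or settings were used in the study. The function name `word_edit_rate` is hypothetical.

```python
def word_edit_rate(hypothesis, reference):
    """Word-level edit distance (insertions, deletions, substitutions)
    divided by the reference length. Lower is better; 0.0 means identical.
    Simplified: block shifts, which TER counts as one edit, are ignored."""
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must not be empty")
    # previous[j] = edit distance between the hypothesis words seen so far
    # and the first j reference words.
    previous = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        current = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            current.append(min(previous[j] + 1,          # delete h
                               current[j - 1] + 1,       # insert r
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1] / len(ref)


print(word_edit_rate("the program is disabled", "the program is disabled"))  # 0.0
print(word_edit_rate("a b c", "a x c"))
```

A score of this kind rewards word-for-word closeness to one reference, which is one reason string-based metrics can disagree with human judgements of readability.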

Conclusions

Application: CL is worth it when humans are concerned;

Focus on which rules work best;

Improvements along the workflow;

Predictors of readable/comprehensible text.

Research: Humans can be unreliable, machines can be too rigid;

Mixed methods, especially for human process research;

Including many points of view, even if you don't agree with them!

Fin

Just one reference for now:

http://doras.dcu.ie/16805/

Thanks!

Modelling post-editing behaviour to design tools and training

Giselle de Almeida

Overview

Definition of post-editing

Market for post-editing

Post-editing issues

Research questions

Application

Framework

Methodology

Findings and insights

Recommendations

Definition of post-editing

“Editing, modifying and/or correcting a pre-translated text processed by a MT system from a source language into one or more target languages.” (Jeffrey Allen)

Market for post-editing

Among nearly 1,000 language service providers around the world, 41.2% claimed to offer post-edited machine translation (The Market for MT Post-editing, Donald A. DePalma and Vijayalaxmi Hegde, November 2010)

Post-editing issues

Still relatively unknown to many translators

No internationally adopted standard guidelines (company-specific instead)

Not much information or training available

New methodologies and tools to learn

Expected level of language quality determined by each project/company

Expected higher productivity

Research questions

Does the level of previous experience with translation influence the performance of translators when doing post-editing tasks?

If so, does the level of experience have a positive or a negative impact on the performance in terms of time spent and fitness for purpose?

Are the same post-editing strategies employed across languages of the same family? (Test case: French and Brazilian Portuguese.)

Application

Indication of post-editing effort

Identification of problem areas for post-editors

Selection of candidates for post-editing training

Elaboration of post-editing training and guidelines and improvement of MT engines and/or dictionaries

Framework

Post-editing (Krings 2001, Loffler-Laurian 1984, Allen 2003, O'Brien 2006)

Translation memory tools (Guerberof 2008, Rieche 2004)

Views of translators regarding machine translation (Fulford 2002, Araújo 2004)

Post-editing training (O'Brien 2002)

Post-editing guidelines (Allen 2003, Schäefer 2003)

Automated post-editing (Vasconcellos 1986, Allen 2001, Simard et al. 2007, Font Llitjós 2007, Guzmán 2008)

Profile of a good post-editor (Offersgaard et al. 2008)


Methodology

Typology to classify post-editing changes: based on LISA (Localisation Industry Standards Association) QA Model

Corpus: segments from IT texts

Recording of sessions with participants

Screen recording

Keyboard/mouse logging

Methodology (cont.)

Data analysis

Total time taken per participant

Types of changes made

Editing strategies

Correlation of translation experience and post-editing performance

Correlations between languages
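The correlation step in the data analysis could, for example, use a rank correlation. The slides do not say which statistic was used; the sketch below assumes Spearman's rank correlation, computed as the Pearson correlation of average ranks. All names and the sample figures (years of experience, minutes taken) are hypothetical and are not study data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical participants: years of translation experience and
# total post-editing time in minutes.
experience_years = [1, 3, 5, 8, 12]
minutes_taken = [95, 80, 82, 70, 60]
print(round(spearman(experience_years, minutes_taken), 2))
```

A rank-based statistic is a cautious choice for small participant groups because it does not assume a linear relationship between experience and time.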


Correlation of translation experience vs. post-editing performance

Identification of main post-editing strategies employed and changes made

Language: category with the highest number of essential corrections

Highest number of Language changes: agreement (gender, number), phrasal ordering, determiners

Correlations: translation experience vs. post-editing performance

Recommendations

Advisable to implement intermediate phase in the MT workflow: automatic correction of common error patterns (with constant feedback from post-editors)

Guidelines should be very specific and clear, and provide many examples of changes to implement and those to avoid

Adapt post-editing training according to previous experience


Monolingual Post-Editing in an Online Community

Linda Mitchell



Overview

Norton Community

Motivation

Aim

Research Questions

Previous Research

Methodology


Types of Users

Lurker

Once-off user

Regular user

Guru

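Motivation

[Diagram: in the English Forum, User 1 posts a question (EN) and User 2 an answer (EN); the thread (Question1, Answer1) is machine translated (MT) into the German Forum (Frage1, Antwort1), where User 3 edits (DE) and User 4 posts a question (DE)]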

Aim

Make machine translated post-edited content available to users of different languages

Research Questions

Is monolingual post-editing feasible in an online community environment?

Quality of output

Changes users make

What motivates users to post-edit?

[Diagram labels: Motivation, Post-editing]

Post-Editing

Bilingual:

Post-editor has access to source text and machine translated output

Monolingual:

Post-editor only has access to machine translated output

Why monolingual?

Users do not speak source language

Previous Research

Bilingual PE:

MT & bilingual PE

MT & automated bilingual PE

Monolingual PE:

MonoTrans – Hu et al. 2010/11

Language Grid – Lin et al. 2010

Translation Options – Koehn 2010

New: Online Community, Domain Experts, Motivation

Experimental Set-Up

German-speaking users will post-edit posts that have been machine translated (EN-DE)

Evaluation:

Professional evaluation:

Comprehensibility

Fidelity

User evaluation:

“Acceptability” in forum context

PE Interface

Evaluation Interface

User Motivation

Motivation

Extrinsic/ Intrinsic (Deci & Ryan 1985)

Development of Skills

Work Experience

Rewards

Reputation

Reciprocity

Community

Fun

Passing of Time

Dedication

User Motivation

How to measure motivation?

QUAN (e.g. Oreg and Nov 2008)

QUAL (e.g. Brabham 2010)

Mixed methods:

Questionnaire

Interviews

Next Steps

Participants to be recruited by September

Pilot Study in October


Thank you for listening

