The main advantage of concordancing tools is that they allow translators to see terms in a variety of contexts simultaneously to detect various kinds of linguistic and conceptual patterns. The majority of corpus analysis tools also offer a number of other features, which often combine the data produced by the concordancer and word frequency counts.

Towards a Methodology for a Corpus-Based Approach to Translation Evaluation

By Mojgan Heydarali

Professor: Dr. Behbahani

Course: Translation Assessment

Azad University of Literature & Foreign Languages, Tehran South Branch

Contents

1. Translator trainers' responsibility
2. Evaluation tools' limitations
3. Importance of the corpus-based approach
4. Characteristics of the corpus-based approach
5. Challenges facing evaluators in an academic context
6. Corpora and corpus analysis tools
7. Designing an evaluation corpus
   a. Comparable Source Corpus
   b. Quality Corpus
   c. Quantity Corpus
   d. Inappropriate Corpus

Translator trainers are responsible for:

* Grading students' work and, importantly,
* Providing useful feedback

In the past, translators and trainers worked with resources such as:

* Dictionaries
* Printed parallel texts
* Unverified intuition
* Subject field experts

But these were not always conducive to providing the conceptual and linguistic knowledge necessary for an objective translation evaluation.

What is the importance of a corpus-based approach?

1. It removes a great deal of subjectivity.
2. It provides improved access to appropriate conceptual and linguistic information about a specialized subject field, as documented by experts in that field.

In other words, a specially designed evaluation corpus can act as a benchmark for comparing students' translations on a number of different levels.

So, by having access to a wide range of authentic and suitable texts, translator trainers can:

* Verify or correct both conceptual and linguistic information, and
* Provide more constructive feedback based on evidence.

What are the characteristics of a corpus-based approach?

Firstly, it is based on the analysis of a comparatively large and carefully selected collection of naturally occurring texts that are stored in machine-readable form (i.e., a corpus).

Secondly, because it analyzes actual patterns of language use in the corpus, it is empirical and therefore objective.

Thirdly, this approach takes advantage of computational tools and methods for manipulating the corpus and arranging the data in ways that make it possible to spot items and patterns that would be difficult to identify in other types of resources.

Additionally, computers provide consistent and reliable analysis (i.e., they do not change their minds or get distracted).

Finally, the corpus-based approach combines both quantitative and qualitative techniques: a computer is capable of churning out counts of linguistic features, but the translator trainer is responsible for exploring and interpreting the data in order to learn about patterns of language use.

1. Challenges Facing Evaluators in an Academic Context

A. The main difficulty surrounding translation evaluation is its subjective nature; the notion of quality has very fuzzy and shifting boundaries.

B. Clients who commission translations are not interested in educating the translator, whereas a trainer has an obligation to help students improve their performance.

C. In order to be properly prepared for entering the translation profession, students need to be exposed to a wide range of translation material and text types, but naturally trainers are not experts in all subjects. A specially designed evaluation corpus can help to meet this need.

Corpora and corpus analysis tools

Similarity between a corpus and conventional parallel texts:

In a translation context, a suitable corpus might be one containing texts that correspond to the intended skopos of the target text. In this way a corpus is similar to the conventional parallel texts used by many translators.

However, an electronic corpus is generally much larger and can be processed with the help of computerized tools known as corpus analysis tools.

Most corpus analysis tools contain at least two main features: word frequency lists and concordancers.

A. Word frequency lists allow users to discover how many different words are in the corpus and how often each appears.

DVD   765    video  126    not   89    player  80
Is    341    we     121    said  85    all     79
Will  208    have   116    PC    82    MPEG    81

“I really like translation because I think that translation is really, really interesting.”

Tokens (total words) = 13

Types (different words) = 9

Word lists can be sorted in:

* Alphabetical order
* Ascending order of frequency
* Descending order of frequency
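The token and type counts for the example sentence can be reproduced with a few lines of Python. This is only a minimal sketch: the `tokenize` helper and its rule for what counts as a word (lowercased runs of letters or digits, optionally joined by an apostrophe or hyphen) are assumptions made for illustration, and real corpus analysis tools apply their own tokenization rules.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase word tokens (assumed tokenization rule)."""
    return re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())

sentence = ("I really like translation because I think that "
            "translation is really, really interesting.")

tokens = tokenize(sentence)   # every running word
freq = Counter(tokens)        # one entry per distinct word form (type)

print("Tokens:", len(tokens))
print("Types:", len(freq))

# The three orderings a word list is usually offered in.
alphabetical = sorted(freq.items())
ascending = sorted(freq.items(), key=lambda kv: (kv[1], kv[0]))
descending = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))

for word, count in descending:
    print(f"{word:<12} {count}")
```

Sorting by descending frequency puts "really" (three occurrences) at the top of the list, which is the view most often used to get a first impression of a corpus.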

Words belonging to the same lemma can be counted together or separately, as can words beginning with upper or lower case.

A lemma groups together words that have the same stem and belong to the same major word class, differing only in spelling or inflection.

Stop lists are lists of words to be ignored; they can be used to eliminate common function words such as prepositions or conjunctions.

Frequency information can help translators decide which term to use when faced with a number of potential synonyms or translation equivalents.
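The options described above (case folding, stop lists, and counting by lemma) could be combined roughly as follows. This is a sketch under stated assumptions: `STOP_LIST` and `LEMMAS` are tiny hand-made tables invented for the example, the sample text is made up, and `frequency_list` is a hypothetical helper rather than the interface of any particular tool.

```python
import re
from collections import Counter

# Hypothetical, hand-made resources for illustration only.
STOP_LIST = {"the", "a", "of", "and", "in", "with", "to"}
LEMMAS = {"players": "player", "played": "play", "plays": "play", "playing": "play"}

def frequency_list(text, fold_case=True, stop_list=frozenset(), lemmas=None):
    """Count words, optionally folding case, skipping stop words, and merging lemmas."""
    counts = Counter()
    for word in re.findall(r"[A-Za-z]+", text):
        if fold_case:
            word = word.lower()
        if word.lower() in stop_list:
            continue                      # ignore function words on the stop list
        if lemmas:
            # Lemma lookup uses the lowercase form; unknown words are kept as they are.
            word = lemmas.get(word.lower(), word)
        counts[word] += 1
    return counts

# Invented sample text.
sample = "The player played the DVD. Players and the DVD player play DVDs in the PC."

raw = frequency_list(sample)
filtered = frequency_list(sample, stop_list=STOP_LIST, lemmas=LEMMAS)
case_sensitive = frequency_list(sample, fold_case=False)

print(raw.most_common(3))
print(filtered.most_common(3))
```

Once the stop list is applied, "the" no longer dominates the list, and counting by lemma merges "player" and "Players" into a single entry, which makes competing terms easier to compare.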

B. Concordancer

A concordancer retrieves all the occurrences of a particular search pattern in its immediate contexts and displays these in an easy-to-read format.

The most commonly used format is KWIC (key word in context), which shows one occurrence of the search pattern per line, with the search pattern itself highlighted in the center of the screen.

  Atsushita slick, portable DVD  player  with a color LCD and
Ndows explorer, but their movie  player  software refused to pla
Ers with a “record” button. The  player  will not even have the o
  three years,” he says. Such a  player  would have a display
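A KWIC display like the one above can be sketched in a few lines. The `kwic` and `show` functions below are illustrative names, not part of any existing tool, and the sample text is invented. Context is cut at a fixed number of characters, which is why real concordance lines often begin or end in the middle of a word.

```python
import re

def kwic(text, pattern, width=30, ignore_case=True):
    """Return one (left, match, right) triple per occurrence of `pattern`."""
    flags = re.IGNORECASE if ignore_case else 0
    flat = " ".join(text.split())   # collapse line breaks and repeated spaces
    lines = []
    for m in re.finditer(pattern, flat, flags):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines

def show(lines, width=30):
    """Print the lines with the search pattern aligned in a central column."""
    for left, hit, right in lines:
        print(f"{left:>{width}} [{hit}] {right}")

# Invented sample text.
corpus = ("The portable DVD player has a colour screen. "
          "A software player can also play DVD video on a PC.")

hits = kwic(corpus, r"\bplayer\b", width=20)
show(hits, width=20)
```

The `width` parameter controls how much context is shown on either side of the search pattern.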

* The extent of the context on either side of the search pattern is variable.

* These contexts can be sorted in a variety of ways, such as:

  a. Order of appearance in the corpus
  b. Alphabetically
  c. By the words preceding or following the search pattern
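The sorting options listed above amount to choosing a different sort key for the same set of concordance lines. In the sketch below each line is a (left context, keyword, right context) triple; the lines themselves are invented, and the helper names are hypothetical.

```python
# Invented concordance lines, listed in order of appearance in the corpus.
lines = [
    ("a slick, portable DVD", "player", "with a colour screen"),
    ("but their movie", "player", "software refused to start"),
    ("button. The", "player", "will not record"),
    ("he says. Such a", "player", "would have a display"),
]

def first_word_right(line):
    """Sort key: the word immediately following the search pattern."""
    words = line[2].lower().split()
    return words[0] if words else ""

def last_word_left(line):
    """Sort key: the word immediately preceding the search pattern."""
    words = line[0].lower().split()
    return words[-1] if words else ""

by_appearance = list(lines)                    # a. order of appearance
by_right = sorted(lines, key=first_word_right) # c. word following the pattern
by_left = sorted(lines, key=last_word_left)    # c. word preceding the pattern

for left, hit, right in by_left:
    print(f"{left:>25} [{hit}] {right}")
```

Sorting on the preceding word groups modifiers such as "DVD" or "movie" together, while sorting on the following word brings out recurring verbs and phrases after the term.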

Concordancers are flexible and allow functions such as:

* Case-sensitive vs. case-insensitive searches (e.g. ‘Bill’, a former US president, vs. ‘bill’; ‘Polish’, of Poland, vs. ‘polish’)

* Wildcard searches (e.g. ‘play*’ to retrieve ‘play’, ‘player’, ‘played’, etc.)

* Context searches, in which another term must appear within a user-specified distance of the search term (e.g. contexts where ‘play’ appears within five words of ‘DVD’)
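The three search options above can be approximated with regular expressions. This is a rough sketch: the `*` wildcard syntax, the function names, and the sample sentence are assumptions for illustration, and actual concordancers each define their own query syntax.

```python
import re

def wildcard_to_regex(query):
    """Translate a query such as 'play*' into a whole-word regular expression."""
    return r"\b" + re.escape(query).replace(r"\*", r"\w*") + r"\b"

def find(text, query, case_sensitive=False):
    """Return every word form matching the query."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.findall(wildcard_to_regex(query), text, flags)

def near(text, term, other, distance=5):
    """Return token positions where `term` occurs within `distance` words of `other`."""
    tokens = re.findall(r"\w+", text.lower())
    other_positions = [i for i, t in enumerate(tokens) if t == other.lower()]
    return [i for i, t in enumerate(tokens)
            if t == term.lower()
            and any(abs(i - j) <= distance for j in other_positions)]

# Invented sample text.
text = "Bill paid the bill. The player played a DVD; we play it daily."

print(find(text, "bill"))                       # case-insensitive: both forms
print(find(text, "Bill", case_sensitive=True))  # only the capitalised name
print(find(text, "play*"))                      # wildcard search
print(near(text, "play", "DVD", distance=5))    # proximity search
```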

The majority of corpus analysis tools also offer a number of other features, which often combine the data produced by the concordancer and frequency counts.

Two points must be kept in mind:

* The value of what comes out of a corpus is largely dependent on what texts are included in it.

* Criteria for designing general-language corpora have been well documented in the literature; however, these criteria cannot be adopted wholesale for the design of a special-purpose corpus such as an Evaluation Corpus.

Designing an Evaluation Corpus

The Evaluation Corpus is the collective name for a collection of texts that is divided into four main sub-corpora:

1. The Comparable Source Corpus
2. The Quality Corpus
3. The Quantity Corpus
4. The Inappropriate Corpus

These sub-corpora differ in content and intended function.

1. Comparable Source Corpus (CSC)

It is optional and depends on factors such as time, text type, and the skopos of the target text.

The CSC contains a selection of SL texts that are similar to the source text in terms of text type, publication date, and subject matter.

The purpose of the CSC

Its purpose is to allow the evaluator to gauge the “normality” of the source text with regard to other source-language texts of that type.

Normalization is a feature of translated texts; normalized texts display exaggerated features of the target language and conform to its typical patterns (Baker, 1997).

Sanitization: the suspected adaptation of a source-text reality to make it more palatable for target audiences (Kenny).

Both normalization and sanitization result in deliberately chosen unconventional lexical or syntactic ST features being changed in translation so that the TT fits in with the conventions of the target language.

Determining inappropriate normalization or sanitization

First, evaluators can use the CSC as a reference corpus to establish the relative normality of the ST.

Second, they can use the Quantity Corpus as a reference corpus to establish the relative normality of the TT.

If the ST is deemed to be normal (in vocabulary, register, style, etc.) with reference to texts in the Comparable Source Corpus, then the target text should be normal when compared with texts in the Quantity Corpus (and vice versa).

2. Quality Corpus

The Quality Corpus is a high-quality sub-corpus consisting of texts hand-picked primarily for their conceptual content.

It is very small by corpus-linguistics standards, containing four or five texts with a total of 5,000 words.

The Quality Corpus is used primarily as a source of conceptual information rather than linguistic information, so it is not necessary for all texts to be of the same text type.

But it is important that they be complete texts (not samples or extracts).

At least some of the texts should be current.

Using the Quality Corpus will help the translator trainer become familiar with basic concepts in the field and identify some of the key terms.

If the texts are well chosen, they can serve as a benchmark for evaluating students' translations.

3. Quantity Corpus

Why is it not appropriate to rely exclusively on the Quality Corpus?

Firstly, because it is a relatively small collection, there is no real way to know whether the selected texts are truly representative of the text type at large.

Secondly, the texts contained in the Quality Corpus may be “older” texts, and a term which was appropriate in the past may no longer be so.

The Quantity Corpus is designed to provide a larger and more representative sample of the specialized language in question.

External factors such as time and availability of data influence the question of how large and how representative it can be.

In practice, Quantity Corpora of 20,000 to 200,000 words have proved useful: 20,000 words for highly specialized subject fields, and 200,000 words for subject fields that are not extremely narrow.

It is useful to divide the Quantity Corpus into further sub-corpora, one for each year; this enables translators or evaluators to track terminological changes over time.

A corpus analysis tool such as WordSmith allows users to consult multiple corpora at once.
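Tracking a term across yearly sub-corpora comes down to counting the same phrases in each year's texts. The sketch below shows the idea with invented toy data; the years, texts, and competing terms are made up for illustration, and `count_phrase` and `track` are hypothetical helpers, not functions of WordSmith or any other tool.

```python
import re

# Hypothetical yearly sub-corpora; in practice each value would hold
# the full text of the files collected for that year.
quantity_corpus = {
    1998: "The DVD player reads a digital video disc. A digital video disc holds a film.",
    1999: "The DVD player reads a digital versatile disc.",
    2000: "Each digital versatile disc holds a film. A digital versatile disc is cheap.",
}

def count_phrase(text, phrase):
    """Count case-insensitive whole-phrase occurrences."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, re.IGNORECASE))

def track(corpus_by_year, phrases):
    """Return {year: {phrase: count}} with years in chronological order."""
    return {year: {p: count_phrase(text, p) for p in phrases}
            for year, text in sorted(corpus_by_year.items())}

history = track(quantity_corpus, ["digital video disc", "digital versatile disc"])

for year, counts in history.items():
    print(year, counts)
```

In this toy data one term fades out while the other takes over, which is the kind of shift an evaluator would want to notice before marking a student's term as wrong.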

The Quantity Corpus: pros and cons

Pros:

The Quantity Corpus is compiled in a semi-automated fashion and can be used by the translator trainer to verify the appropriateness of the terminological, phraseological, and stylistic choices made by students.

Most corpus analysis software gives users the option of expanding the context to several lines or the complete text.

The volume of data makes it possible to spot patterns more easily, to make generalizations, and to provide concrete evidence to support decisions.

Cons:

Interacting solely with a large electronic corpus can cause one to lose sight of the fact that translation is a text-based activity.

In corpus analysis the focus is on micro-contexts, and the primary power of corpus analysis remains at a sub-text level.

Texts are not always readily available in electronic form.

4. Inappropriate Corpus

It is a corpus containing “inappropriate” parallel texts. Its size varies with the subject.

For well-established subjects or those of wider interest it will be larger, but it is smaller for very recent subjects.

Its purpose is to help the translator trainer uncover the source of unsuitable equivalents in students' translations.

If a student has used a term which does not appear in the Quality or Quantity Corpus, it can be checked in this corpus.
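The check described above can be pictured as a lookup of the student's term in each sub-corpus in turn. The following is a minimal sketch: the three sub-corpus texts are invented, and `attested_in` is a hypothetical helper that only reports where a term occurs, leaving the interpretation to the trainer.

```python
import re

# Hypothetical toy sub-corpora, each reduced to a single short text.
evaluation_corpus = {
    "Quality": "A DVD player decodes MPEG video.",
    "Quantity": "The DVD player ships with a remote. Every DVD player supports MPEG.",
    "Inappropriate": "This disc machine is a bargain for movie fans.",
}

def attested_in(term, corpus):
    """Return the names of the sub-corpora in which `term` occurs."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [name for name, text in corpus.items() if pattern.search(text)]

for term in ["DVD player", "disc machine", "video reader"]:
    where = attested_in(term, evaluation_corpus)
    print(f"{term!r}: {where if where else 'not attested'}")
```

A term found only in the Inappropriate Corpus suggests where the student picked up the unsuitable equivalent; a term found nowhere needs a different explanation.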

THE END
