
Con - cs.upc.eduescudero/wsd/98-lsi3t-befr.pdf · Con v erting the Enciclop edia Catalana bilingual MRD to an MTD L Ben tez G Escudero J F arreres G Rigau Con ten ts In tro duction


Converting the "Enciclopèdia Catalana" bilingual MRD to an MTD

L. Benítez, G. Escudero, J. Farreres, G. Rigau

Contents

1 Introduction

2 Preparing information

2.1 Grammatical information <CG> ... </CG>

2.2 Morphological information <CM> ... </CM>

2.3 Semantic information <CS> ... </CS>

2.4 Registers <REG> ... </REG>

3 Main decisions extracting information

4 Step by step

4.1 Preprocess

4.2 English-Catalan

4.3 Catalan-English

4.4 Final processing

A English-Catalan dictionary

A.1 Abbreviations

A.2 Source code in Perl

B Catalan-English dictionary

B.1 Abbreviations

B.2 Feminine gender inflection

B.3 Source code in Perl

References


1 Introduction

Our aim is to retrieve lexical knowledge semi-automatically from bilingual MRDs (Machine Readable Dictionaries) (Atserias et al.; Benítez et al. a,b). Specifically, we are dealing with an English-Catalan bilingual dictionary, in both directions (DEC).

Obviously, an MRD does not provide an immediate source of lexical knowledge. Complex processes are necessary to convert MRDs into MTDs (Machine Tractable Dictionaries) (Rigau). Usually, this process takes advantage of typographical information; that is, using the typographical marks we can make explicit several fields of the MRD. This process is described in this report.

The MRD is coded in SGML, which makes our task easier. As each direction uses different marks to code the same attributes, we treat each direction differently. For instance:

English-Catalan

<E>a </E><T><v>ar</v> un <v>m</v>, una <v>f</v>.

<i>A man</i>, un home. <i>A woman</i>, una dona.

(<i>rate, price, etc</i>) a, per.

<i>Three times a week</i>, tres vegades per setmana.

<i>Thirty pounds a month</i>, trenta lliures el mes</T>

Abbreviations are coded in italics, and grammatical and semantic ones are mixed. To make both directions of the bilingual dictionary equivalent in structure, grammatical and semantic abbreviations must be separated. Appendix A.1 lists the abbreviations resulting from our treatment.


Catalan-English

<E>a </E><T><i>prp</i> (<i>lloc</i>) in.

<i>Viu a Barcelona</i>, she lives in Barcelona.

(<i>direcció</i>) to.

<i>Vaig anar a Anglaterra</i>, I went to England.

(<i>temps</i>) in, at. <i>A la nit</i>, at night.

(<i>complement indirecte</i>) to.

<i>Vaig donar el diari a la mare</i>, I gave the newspaper to my mother</T>

Abbreviations are coded in italics, and those carrying grammatical information are coded differently from those carrying semantic information. Appendix B.1 lists the abbreviations resulting from our treatment.

Marks

The meaning of each mark is as follows:

<E> ... </E> Marks the entry, in both directions.

<T> ... </T> Marks the translation of the entry.

<v> ... </v> Marks italics in the English-Catalan dictionary.

<i> ... </i> Marks italics in the Catalan-English dictionary.
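The marks listed above are enough to split an entry into its parts. The following is a minimal Python sketch of that idea, not the authors' Perl code; the function name `parse_entry` and the layout of the returned tuple are assumptions made for illustration.

```python
import re

# One entry: <E>headword</E> followed by <T>translation body</T>.
ENTRY_RE = re.compile(r"<E>(.*?)</E>\s*<T>(.*?)</T>", re.DOTALL)
# Italic spans: <v>...</v> (English-Catalan) or <i>...</i> (Catalan-English).
ITALIC_RE = re.compile(r"<([iv])>(.*?)</\1>", re.DOTALL)


def parse_entry(line):
    """Return (headword, body, italic_spans), or None if no entry is found."""
    match = ENTRY_RE.search(line)
    if match is None:
        return None
    headword = match.group(1).strip()
    body = match.group(2).strip()
    # findall yields (tag, text) pairs because the pattern has two groups.
    italic_spans = [text for _tag, text in ITALIC_RE.findall(body)]
    return headword, body, italic_spans


print(parse_entry("<E>abast </E> <T><i>m</i> reach, range</T>"))
```

The italic spans are returned separately because they are the places where abbreviations appear, which is what the next section classifies.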

2 Preparing information

We decided to mark and distinguish every abbreviation that can be detected, although only the grammatical codes were strictly necessary. We have chosen the following four marks and categories:

CG grammatical category

CM morphological code

CS semantic code

REG register (usage)

We have assigned each abbreviation to one and only one category. See appendices A.1 and B.1 for the complete lists of these abbreviations.


2.1 Grammatical information <CG> ... </CG>

Catalan-English

<E>abadessa </E> <T><i><CG>f</CG></i> abbess</T>

<E>abandonar </E> <T><i><CG>vt</CG></i> to abandon, leave</T>

<E>abandonar </E> <T> <i><CG>vp</CG></i> to abandon osf</T>

<E>abans </E> <T><i><CG>av aj</CG></i> before</T>

<E>abast </E> <T><i><CG>m</CG></i> reach, range</T>

English-Catalan

<E>abdicate </E> <T><v><CG>v tr</CG></v> abdicar</T>

<E>absent </E> <T><v><CG>adj</CG></v> absent</T>

<E>back </E> <T> <v><CG>v intr</CG></v> recular</T>

<E>envelope </E> <T><v><CG>n</CG></v> sobre</T>

2.2 Morphological information <CM> ... </CM>

Catalan-English

<E>caserna </E> <T><i><CG>f</CG></i> barracks <i><CM>pl</CM></i></T>

<E>duana </E> <T><i><CG>f</CG></i> customs <i><CM>pl</CM></i></T>

<E>golfes </E> <T><i><CG>fpl</CG></i> loft <i><CM>sg</CM></i></T>

<E>noces </E> <T><i><CG>fpl</CG></i> wedding <i><CM>sg</CM></i></T>


English-Catalan

<E>abbey </E> <T><v><CG>n</CG></v> abadia <v><CM>f</CM></v></T>

<E>abdomen </E> <T><v><CG>n</CG></v> abdomen <v><CM>m</CM></v></T>

<E>hair </E> <T><v><CG>n</CG></v> cabells <v><CM>pl</CM></v></T>

<E>scarlet </E> <T><v><CG>adj</CG></v> escarlata <v><CM>inv</CM></v></T>

2.3 Semantic information <CS> ... </CS>

Catalan-English

<E>acte </E> <T><i><CG>m</CG></i> <i><CS>tea</CS></i> act</T>

<E>be </E> <T><i><CG>m</CG></i> <i><CS>zoo</CS></i> lamb</T>

<E>bus </E> <T><i><CG>m</CG></i> <i><CS>aut</CS></i> bus</T>

<E>cel </E> <T><i><CG>m</CG></i> <i><CS>rlg</CS></i> heaven</T>

<E>dau </E> <T><i><CG>m</CG></i> <i><CS>jcs</CS></i> dice</T>

<E>pal </E> <T><i><CG>m</CG></i> <i><CS>mar</CS></i> mast</T>

English-Catalan

<E>aside </E> <T> <v><CG>n</CG></v> <v><CS>teat</CS></v> apart</T>

<E>chalk </E> <T><v><CG>n</CG></v> <v><CS>min</CS></v> creta</T>

<E>cite </E> <T><v><CG>v tr</CG></v> <v><CS>dr</CS></v> citar</T>

<E>credit </E> <T><v><CG>v tr</CG></v> <v><CS>com</CS></v> abonar</T>

<E>dope </E> <T><v><CG>v tr</CG></v> <v><CS>esport</CS></v> dopar</T>

<E>draft </E> <T><v><CG>v tr</CG></v> <v><CS>mil</CS></v> quintar</T>


2.4 Registers <REG> ... </REG>

Catalan-English

<E>casta </E> <T><i><CG>f</CG></i> <i><REG>fg</REG></i> class</T>

<E>fatxa </E> <T><i><CG>f</CG></i> <i><REG>fm</REG></i> look</T>

<E>nu </E> <T><i><CG>aj</CG></i> <i><REG>fg</REG></i> bare</T>

<E>merda </E> <T><i><CG>f</CG></i> <i><REG>vlg</REG></i> shit</T>

English-Catalan

<E>alibi </E> <T><v><CG>n</CG></v> <v><REG>fam</REG></v> excusa</T>

<E>call </E> <T><v><CG>v tr</CG></v> <v><REG>fig</REG></v> evocar</T>

<E>graft </E> <T> <v><CG>v intr</CG></v> <v><REG>vulg</REG></v> pencar</T>

<E>ice </E> <T><v><CG>n</CG></v> <v><REG>US</REG></v> nevera <v><CM>f</CM></v></T>

<E>ice </E> <T><v><CG>n</CG></v> <n>icebox</n> <v><REG>UK</REG></v> congelador <v><CM>m</CM></v></T>
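The marking step of this section can be pictured as a table lookup: each italic abbreviation is wrapped in the mark of the single category it was assigned to. The Python sketch below is illustrative only and is not the report's Perl script; the `CATEGORY` table is a small subset taken from the English-Catalan examples in this section, and the function name `tag_abbreviations` is hypothetical.

```python
import re

# Illustrative subset of English-Catalan abbreviations, taken from the
# examples in this section; the full assignment is listed in appendix A.
CATEGORY = {
    "n": "CG", "adj": "CG", "v tr": "CG", "v intr": "CG",
    "f": "CM", "m": "CM", "pl": "CM", "inv": "CM",
    "teat": "CS", "min": "CS", "dr": "CS", "com": "CS",
    "esport": "CS", "mil": "CS",
    "fam": "REG", "fig": "REG", "vulg": "REG", "US": "REG", "UK": "REG",
}


def tag_abbreviations(body, italic_tag="v"):
    """Wrap every known italic abbreviation in its category mark."""
    pattern = re.compile(r"<{0}>(.*?)</{0}>".format(re.escape(italic_tag)))

    def wrap(match):
        abbreviation = match.group(1).strip()
        category = CATEGORY.get(abbreviation)
        if category is None:
            # Unknown italic text (for example a usage phrase) is left alone.
            return match.group(0)
        return "<{t}><{c}>{a}</{c}></{t}>".format(
            t=italic_tag, c=category, a=abbreviation)

    return pattern.sub(wrap, body)


print(tag_abbreviations("<v>n</v> abadia <v>f</v>"))
```

A dictionary lookup enforces the "one and only one category" rule by construction, since a key can map to a single value.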

3 Main decisions extracting information

We have two files with the information prepared as described in section 2: one file for the English-Catalan dictionary and the other for the Catalan-English one. In this section we describe the process for nouns only.

Collocations, definitions, morphological information and gender inflection each have a specific process. In this section we explain the method used to extract the pairs formed by a word and its translation, their link to WordNet, and some further treatment.

In both directions of the dictionary:

Collocations

Some entries correspond to collocations, so the definition does not correspond to a single word but to a combination of words. We have to take this into account to retrieve more precise information.


<E>agent </E> <T> <n>agent de canvi i borsa</n> stockbroker</T>

<E>cambra </E> <T> <n>cambra de bany</n> bathroom</T>

<E>dibuix </E> <T> <n>dibuixos animats</n> cartoon</T>

<E>escola </E> <T> <n>escola bressol</n> kindergarten</T>

<E>joc </E> <T> <n>fora de joc</n> offside</T>

<E>lluna </E> <T> <n>lluna de mel</n> honeymoon</T>

<E>age </E> <T> <n>old age</n> vellesa <v>f</v></T>

<E>bad </E> <T> <n>bad habit</n> <v>n</v> vici <v>m</v></T>

<E>canine </E> <T> <n>canine tooth</n> <v>n</v> ullal <v>m</v></T>

<E>day </E> <T> <n>day nursery</n> <v>n</v> guarderia <v>f</v></T>

<E>early </E> <T> <n>early morning</n> <v>n</v> matinada <v>f</v></T>

<E>family </E> <T> <n>family name</n> <v>n</v> cognom <v>m</v></T>
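The collocation rule can be sketched as follows: when the translation body opens with an `<n>` span, that span, rather than the headword, is the source side of the pair. This Python fragment is an illustration under assumptions, not the report's code: it assumes `<n>` marks the collocation as in the examples above, that cross-reference entries (treated next) have already been set aside, and the name `extract_pair` is hypothetical.

```python
import re

ENTRY_RE = re.compile(r"<E>(.*?)</E>\s*<T>(.*?)</T>")
COLLOCATION_RE = re.compile(r"<n>(.*?)</n>\s*(.*)")
ITALIC_RE = re.compile(r"<([iv])>.*?</\1>")


def extract_pair(line):
    """Return (source_form, translation) for one dictionary line."""
    match = ENTRY_RE.search(line)
    if match is None:
        return None
    headword, body = match.group(1).strip(), match.group(2).strip()
    collocation = COLLOCATION_RE.match(body)
    if collocation:
        # The collocation replaces the bare headword as the source form.
        source, rest = collocation.group(1).strip(), collocation.group(2)
    else:
        source, rest = headword, body
    # Drop italic abbreviations and normalise the remaining whitespace.
    translation = " ".join(ITALIC_RE.sub("", rest).split())
    return source, translation


print(extract_pair("<E>lluna </E> <T> <n>lluna de mel</n> honeymoon</T>"))
print(extract_pair("<E>bad </E> <T> <n>bad habit</n> <v>n</v> vici <v>m</v></T>"))
```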

Definitions given by other entries

Some entries refer to another entry in the same language. These entries are set aside to be treated afterwards, and are then given the same definition as the entries they refer to.

<E>au </E> <T><i>f</i> <n>ocell</n></T>

<E>caldo </E> <T><i>m</i> <n>brou</n></T>

<E>escarxofa </E> <T><i>f</i> <n>carxofa</n></T>

<E>larinx </E> <T><i>f</i> <n>laringe</n></T>

<E>medecina </E> <T><i>f</i> <i>fm</i> <n>medicament</n></T>

<E>naturalesa </E> <T><i>f</i> <n>natura</n></T>

<E>quadro </E> <T><i>m</i> <n>quadre</n></T>

<E>rabosa </E> <T><i>f</i> <n>guineu</n></T>

<E>antiquarian </E> <T><v>n</v> <n>antiquary</n></T>

<E>exam </E> <T><v>n</v> <n>examination</n></T>

<E>laundromat </E> <T><v>n</v> <n>launderette</n></T>

<E>liter </E> <T><v>n</v> <n>litre</n></T>

<E>maths </E> <T><v>pl</v> <n>mathematics</n></T>

<E>pretense </E> <T><v>n</v> <n>pretence</n></T>

<E>sulfur </E> <T><v>n</v> <n>sulphur</n></T>

<E>tzar </E> <T><v>n</v> <n>tsar</n></T>
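Treating these entries "afterwards" amounts to a second pass that copies the translations of the referred entry onto the referring one. A hedged Python sketch follows; the function name, the two dictionaries and the sample translation "examen" are hypothetical and only serve the illustration.

```python
def resolve_references(translations, references):
    """Copy translations from referred entries onto the entries citing them.

    translations: dict mapping a word to its list of translations.
    references:   dict mapping a word to the word it refers to.
    """
    resolved = {word: list(items) for word, items in translations.items()}
    for word, target in references.items():
        seen = {word}
        # Follow chains such as a -> b -> c, stopping on cycles.
        while (target in references and target not in resolved
               and target not in seen):
            seen.add(target)
            target = references[target]
        if target in resolved:
            own = resolved.setdefault(word, [])
            for item in resolved[target]:
                if item not in own:
                    own.append(item)
    return resolved


# "examen" is an illustrative translation, not taken from the dictionary.
result = resolve_references({"examination": ["examen"]},
                            {"exam": "examination"})
print(result["exam"])
```

Entries whose target has no translation at all are simply left out of the result, which keeps dangling references from producing empty pairs.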


Definitions of more than one entry or collocation

Some definitions also serve as the definition of another word or collocation. We detect this case by "(or ...)" or "(o ...)", depending on the source language. We save a copy of these entries so as to give them the same definition as the main word or collocation.

<E>barret </E> <T> <n>barret de copa</n> (o <n>barret de mitja copa</n>) top hat </T>

<E>bath </E> <T> (or <n>bathtub</n>) <v>n</v> banyera <v>f</v></T>

<E>diesel </E> <T>(or <n>diesel oil</n>) <v>n</v> gasoli <v>m</v></T>

<E>elm </E> <T>(or <n>elm tree</n>) <v>n</v> om <v>m</v></T>
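Detecting the alternative forms can be sketched with one regular expression covering both "(or ...)" and "(o ...)". This is an illustrative Python fragment with a hypothetical function name, assuming the alternative is always wrapped in `<n>` as in the four examples above.

```python
import re

# "(or <n>form</n>)" in English-Catalan, "(o <n>form</n>)" in Catalan-English.
ALTERNATIVE_RE = re.compile(r"\((?:or|o)\s+<n>(.*?)</n>\)")


def split_alternatives(body):
    """Return (alternative_forms, body_without_the_alternatives)."""
    alternatives = [form.strip() for form in ALTERNATIVE_RE.findall(body)]
    remainder = " ".join(ALTERNATIVE_RE.sub(" ", body).split())
    return alternatives, remainder


print(split_alternatives("(or <n>bathtub</n>) <v>n</v> banyera <v>f</v>"))
```

Each alternative form returned here would then receive the same translation as the main word or collocation.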

In the English-Catalan dictionary:

Where to place the morphological information of Catalan words

When we find morphological information about a Catalan word, we code "english word catalan word�morph info", and when we have a Catalan collocation we code "english word catalan collocation�morph info":

<E>abbess </E> <T><v>n</v> abadessa <v>f</v></T>

abbess abadessa�f

<E>abbot </E> <T><v>n</v> abat <v>m</v></T>

abbot abat�m

<E>sea </E> <T><v>n</v> mar <v>m f</v></T>

sea mar�m�f

<E>battle</E><T><n>battlefield</n><v>n</v> camp <v>m</v> de batalla</T>

battlefield camp de batalla�m

<E>cockroach </E> <T><v>n</v> escarabat <v>m</v> de cuina</T>

cockroach escarabat de cuina�m


In the Catalan-English dictionary:

What to do with gender inflection

Consider these entries:

acomodador a �m usher

acomodador a �f usherette

actor triu �m actor

actor triu �f actress

Note that, outwardly, the same entries appear twice. In fact they differ in gender (�f means feminine, �m means masculine). When an entry refers to the masculine, we remove the information about the feminine gender. So we store:

acomodador �m usher

actor �m actor

Conversely, when an entry refers to the feminine, we build the word corresponding to the feminine gender. For instance:

acomodador a �f usherette -> acomodadora �f usherette

gros ossa �f thumb -> grossa �f thumb

hereu eva �f heiress -> hereva �f heiress

nebot oda �f niece -> neboda �f niece

actor triu �f actress -> actriu �f actress

criat ada �f servant, maid -> criada �f servant, maid

More information about feminine gender inflection is shown in appendix B.2.

When we find entries like:

ambaixador a �mf ambassador

as they do not give us more precise information, we store:

ambaixador �m ambassador
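The report does not spell out how the feminine form is built, so the following Python sketch is only a guess that reproduces the six examples above: a one-letter suffix is appended, and a longer suffix replaces the end of the masculine form starting at the last occurrence of the suffix's first letter (compared without accents). The function names are hypothetical, and the overlap rule is an assumption that is not guaranteed for every entry of the dictionary.

```python
import unicodedata


def _fold(char):
    """Lower-case a character and strip its diacritics, for comparison only."""
    decomposed = unicodedata.normalize("NFD", char)
    return "".join(c for c in decomposed
                   if not unicodedata.combining(c)).lower()


def build_feminine(masculine, suffix):
    """Heuristic: splice the feminine suffix onto the masculine form."""
    if len(suffix) == 1:
        return masculine + suffix          # acomodador + a -> acomodadora
    first = _fold(suffix[0])
    for i in range(len(masculine) - 1, -1, -1):
        if _fold(masculine[i]) == first:
            return masculine[:i] + suffix  # hereu + eva -> her + eva
    return masculine + suffix              # no overlap found: plain append


def normalise_entry(masculine, suffix, gender, translation):
    """Apply the storage rules of this section to one entry."""
    if gender == "f":
        return build_feminine(masculine, suffix), "f", translation
    # "m" and "mf" entries are both stored under the masculine form.
    return masculine, "m", translation


for masc, suf in [("acomodador", "a"), ("gros", "ossa"), ("hereu", "eva"),
                  ("nebot", "oda"), ("actor", "triu"), ("criat", "ada")]:
    print(masc, suf, "->", build_feminine(masc, suf))
```

A production version would need to be checked against the full list in the appendix, since some suffixes are plain additions to the masculine form and would defeat the overlap rule.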

4 Step by step

4.1 Preprocess

1. Translate from DOS format to Unix format:

dos�unix catangl�txt� catangl�txt��

2. Transform every multiple definition into a single one:


gawk �� ����E���if�entry����print entry�entry� �next�

�entryentry � � � �

END�print entry�� catangl�txt�� � catanl�txt��

3. Identify every single definition:

gawk ��gsub��������T��T���� ��print� �� catangl�txt��

� catangl�txt��

4. Put one line per definition:

gawk ��gsub�� ������� ��print� �� catangl�txt�� �

gawk ��gsub���T���� �T���� ��print� �� �

gawk ��for�i��i�NF�i���print ����i�� �

gawk ��gsub������ ��� ��print� �� � catangl�txt��

4.2 English-Catalan

1. Building the lexical source:

procesAC��pl � anglcat�txt�� � anglcat�font�lexica

2. Keeping information to be used in the next process:

(a) Files with information about entries with "=" and "(or":

procesAC��pl � anglcat�txt�� �

procesAC��pl � egrep ���or �n�� � angcat�or

procesAC��pl � anglcat�txt�� �

procesAC��pl � egrep �� � angcat�igual

(b) We create the file "AC.or" from the file "angcat.or". The file "AC.or" contains the same links as the main entries:

procesAC��pl � angcat�or � sort u �

join word�syns � AC�or

(c) Generation of "AC.igual" with pairs "old word, new word":

procesAC��pl � angcat�igual � AC�igual

3. Linking English words and their WN synsets to Catalan words:

procesAC��pl � anglcat�txt�� � procesAC��pl � procesAC��pl �

sort u � join word�syns � egrep v �� �

cat AC�or � sort u � wn�AC�noms


join AC�igual wn�AC�noms � gawk ��print ���������� �

cat wn�AC�noms � sort u � wnAC�noms

procesAC��pl � wnAC�noms � wn�AC�noms�def
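The linking step is essentially a join between the extracted word pairs and a table of words with their synsets, followed by a sort with duplicate removal. The Python sketch below shows the idea only; the function name, the tuple layout and the synset identifier "syn-0001" are hypothetical and do not reflect the actual file formats used by the pipeline.

```python
def link_to_synsets(pairs, word_synsets):
    """Join (english, catalan) pairs with a word -> synsets table.

    Returns a sorted list of unique (english, synset, catalan) links,
    mirroring a "join" followed by a unique sort.
    """
    links = set()
    for english, catalan in pairs:
        # Words absent from the synset table produce no link at all.
        for synset in word_synsets.get(english, ()):
            links.add((english, synset, catalan))
    return sorted(links)


pairs = [("abbess", "abadessa"), ("abbess", "abadessa"), ("abbot", "abat")]
word_synsets = {"abbess": ["syn-0001"]}   # illustrative identifier
print(link_to_synsets(pairs, word_synsets))
```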

4.3 Catalan-English

1. Building the lexical source:

procesCA��pl � catangl�txt�� � catangl�font�lexica

2. Keeping information to be used in the next process:

(a) Files with information about entries with "=" and "(o":

procesCA��pl � catangl�txt�� �

procesCA��pl � egrep ���o � � catang�or

procesCA��pl � catangl�txt�� �

procesCA��pl � egrep �� � catang�igual

(b) We create the file "CA.or" from the file "catang.or". The file "CA.or" contains the same links as the main entries:

procesCA��pl � catang�or � sort u �

gawk ��print ������� � join word�syns � CA�or

(c) Generation of "CA.igual" with pairs "old word, new word":

procesCA��pl � catang�igual � CA�igual

3. Linking Catalan words to English words and their WN synsets:

procesCA��pl � catangl�txt�� � procesCA��pl � procesCA��pl �

gawk ��print ������� � egrep v �� � egrep v ���� �

egrep v �� � � sort u � join word�syns � cat CA�or �

sort u � wn�CA�noms

gawk ��gsub������� ����� ��print������������� wn�CA�noms �

sort u � join CA�igual � gawk ��print������������� �

gawk ��gsub�� ���������� ��print� �� � cat wn�CA�noms �

sort u � wnCA�noms

procesCA��pl � wnCA�noms � wn�CA�noms�def


4.4 Final processing

Generation of the homogeneous file from both sources:

cat wn�AC�noms�def wn�CA�noms�def � sort u � wn�dict�noms


A English-Catalan dictionary

A.1 Abbreviations

GRAMMATICAL CATEGORY

adj, adj adv, adj n, adj pron, adv, adv adj prep, adv conj, adv n, adv prep, ar, conj, conj adv, interj, n, n pron, prep, pron, v intr, v tr, v tr intr

SEMANTIC FILE

aeron, agr, agr med, anat, arquit, art, art fotog, art lit, astr, aut, biol, biol tecn, bot, bot elect quím, cin, cin etc, cin fotog, cin teat, com, dr, econ, elect, elect mil, esport, ferroc, fotog, fís, gastr, geog, geog elect, gram, hist, inform, lit, mar, mar ferroc, mat, med, mil, mil min, mil polít etc, min, mús, mús lit, mús teat, polít, quím, relig, relig mil, teat, teat cin, tecn, tèxt, zool, zool tecn

MORPHOLOGICAL CODE

f, f pl, f sing, inv, m, m f, m�f, pl, pp, pt, pt pp, sing, tb pl

REGISTER

UK, US, atr, desp, fam, fig, fig fam, tb fig, vulg


A.2 Source code in Perl

See (Wall et al.).

ProcesAC�pl

#!/usr/local/bin/perl

while (<STDIN>) {

# Preliminary transformations

s��v��adj�pl�n�m�pol�t�tb pl�vulg� � ��!"����v���v������v� �v������v��g�

s���v�n����v� �v��pron���v����� ���g�

s��v��v� � in!"tr�� in!"tr!"�� � ��!"����v���v��� �������v� �v������v� �g�

s���v�v tr����v� �v��intr���v����� �� �g�

s���v�adv� �esport���v��������v� �v����g�

s���v�adv� �fam���v��������v� �v����g�

s���v�adj����v� �v��n���v�� ��� ���g�

s���v�adj����v� �v��pron���v�� ��� ���g�

s���v�adj����v� �v��pron� �pl�����v�� ��� �����v� �v������v��g�

s���v�adj����v� �v��adv���v�� ��� ���g�

s�����v�� ����T������� �g�

# Grammatical categories

s��v��ad ��!"����v���v��CG������CG����v��g�

s��v��v ��!"����v���v��CG������CG����v��g�

s��v��n ��!"����v���v��CG������CG����v��g�

s��v��con ��!"����v���v��CG������CG����v��g�

s��v��int ��!"����v���v��CG������CG����v��g�

s��v��pr ��!"����v���v��CG������CG����v��g�

s��v��ar����v���v��CG������CG����v��g�

# Semantic fields

s��v��a egnsu! ��!"����v���v��CS������CS����v��g�

s��v��ar ��! ��!"����v���v��CS������CS����v��g�

s��v�� beghlqrz! ��!"����v���v��CS������CS����v��g�

s��v�� beghlqrz! ��!"����v���v��CS������CS����v��g�

s��v��m � ��! ��!"����v���v��CS������CS����v��g�

s��v��t �b! ��!"����v���v��CS������CS����v��g�


s��v��ci ��!"����v���v��CS������CS����v��g�

s��v��co �n! ��!"����v���v��CS������CS����v��g�

s��v��dr����v���v��CS������CS����v��g�

s��v��f eo! ��!"����v���v��CS������CS����v��g�

s��v��f�s����v���v��CS������CS����v��g�

s��v��info ��!"����v���v��CS������CS����v��g�

s��v��po ��!"����v���v��CS������CS����v��g�

# Morphological codes

s��v�� fm!�� ��!"����v���v��CM������CM����v��g�

s��v��sing����v���v��CM������CM����v��g�

s��v��m�f����v���v��CM������CM����v��g�

s��v��m f�f pl�f sing����v���v��CM������CM����v��g�

s��v��tb pl����v���v��CM������CM����v��g�

s��v��inv ��!"����v���v��CM������CM����v��g�

s��v��p ptl! ��!"����v���v��CM������CM����v��g�

s��v�f fam���v���v��CM�f���CM����v� �v��REG�fam���REG����v��g�

# Registers

s��v��U ��!"����v���v��REG������REG����v��g�

s��v��desp�vulg�fam����v���v��REG������REG����v��g�

s��v��tb fig����v���v��REG������REG����v��g�

s��v��fig ��!"����v���v��REG������REG����v��g�

s��v��at ��!"����v���v��REG������REG����v��g�

if ���CG�� ��!"����CG��� �

s��CG�� ��!"����CG���catgram���e�

s��v��catgram���v���v��CG��catgram���CG����v��g�

else �

s��T���T��v��CG��catgram���CG����v��g�

print�


ProcesAC�pl

#!/usr/local/bin/perl

while (<STDIN>) {

if�������� # ��CG�n����

s�� ��!!"�!��g�

if���n��� �

s���v� ���!"���v�� !"��n� ��!"���n����� �� �g�

s��E� ��!"���E� ��!"�T� !"�n�� ��!"����n���E��� ���E� �T��g�

print�

� while

ProcesAC�pl

#!/usr/local/bin/perl

while (<STDIN>) {

s���E�� ��!"� !"���E������g�

s����i� ��!"���i�����g�

s��� ���!"����g�

s���CM�f pl���CM����CM�f�pl���CM��g�

s���CM�m f���CM����CM�m�f���CM��g�

s���CM�f sing���CM����CM�f�sing���CM��g�

s���CM�pt pp���CM����CM�pt�pp���CM��g�

s���CM�tb pl���CM����CM�tb�pl���CM��g�

s�� �n� ��!"���n����g�

s���n� ��!"���n����g�

s�� �v��CM�� ��!"����CM����v�������g�

s���v��CM�� ��!"����CM����v�������g�

s��v��CG���w�� ��w�����CG����v���v��CG���������CG����v��g�

s��v��CG���w�� ��w�� ��w�����CG����v���v��CG������������CG����v��g�

s���v��CG�� ��!"����CG����v��������g�

s���v��CS� ��!"���CS����v����g�

s���v��REG� ��!"���REG����v����g�


s���i� ��!"���i��� ���!"�����g�

s���i� ��!"���i��� ��!"���T������T��g�

s�� ��!!"�!��g�

s� ������g�

s� �T� � �T��g�

s� �T� ��g�

s� �T���g�

s� �T���g�

s����T���g�

s���n�pron���n�g�

s���n��pl����n �g�

s���n��m����n �g�

if ��������� �

if ����n�� �

��a��b��c��d� split���� ������

��primer��segon� split����n ���a��

if ���segon� � ��primer� �segon� split����n���a���

�primer � tr� ����

�primer � s���"��������

�segon � s� !"� ��

�segon � tr� ����

�segon � s�����"������

�segon � s���"��������

�segon � s���"���������

�segon � s�� ���!����� ��!����� ��n!"����������g�

�segon � s�� ��!"��� ���!"���� ��n!"���������g�

�segon � s�����g�

�segon � s������g�

�aa join�� ���primer��segon��

�aa � s���n��g�

�aa � s� ��pl�� �g�

�aa � s���pl����pl�g�

�aa � s� ��pl� �g�

�aa � s��� � ��!"���� ��!"��"��� ��

�aa � s���m��f��m��

print �aa���n��


if ��b� �

�b � s� !�� ��

�b � tr� ����

�b � s�����"������

�b � s���"��������

�b � s���"���������

�b � s�� ���!����� ��!����� ��n!"����������g�

�b � s�� ��!"��� ���!"���� ��n!"���������g�

�b � s�� ��!"��� ��!"�����g�

�b � s�����g�

�b � s������g�

�bb join�� ���primer��b��

�bb � s� ��pl�� �g�

�bb � s���pl����pl�g�

�bb � s� ��pl� �g�

�bb � s��� � ��!"���� ��!"��"��� ��

�bb � s���m��f��m��

print �bb���n��

if ��c� �

�c � s� !�� ��

�c � tr� ����

�c � s�����"������

�c � s���"��������

�c � s���"���������

�c � s�� ���!����� ��!����� ��n!"����������g�

�c � s�� ��!"��� ���!"���� ��n!"���������g�

�c � s�� ��!"��� ��!"�����g�

�c � s�����g�

�c � s������g�

�cc join�� ���primer��c��

�cc � s� ��pl�� �g�

�cc � s���pl����pl�g�

�cc � s� ��pl� �g�

�cc � s��� � ��!"���� ��!"��"��� ��

�cc � s���m��f��m��

print �cc���n��


if ��d� �

�d � s� !�� ��

�d � tr� ����

�d � s�����"������

�d � s���"��������

�d � s���"���������

�d � s�� ���!����� ��!����� ��n!"����������g�

�d � s�� ��!"��� ���!"���� ��n!"���������g�

�d � s�� ��!"��� ��!"�����g�

�d � s�����g�

�d � s������g�

�dd join�� ���primer��d��

�dd � s� ��pl�� �g�

�dd � s���pl����pl�g�

�dd � s� ��pl� �g�

�dd � s��� � ��!"���� ��!"��"��� ��

�dd � s���m��f��m��

print �dd���n��

���

� if linia

��

� while

ProcesAC�pl

#!/usr/local/bin/perl

while (<STDIN>) {

s��E�� ��!"����E���novaentrada���e�

s��n�� ��!"����n���vellaentrada���e�

�novaentrada�tr� ����

�novaentrada� s���"��������

�vellaentrada�tr� ����

�vellaentrada� s���"��������

print �vellaentrada�� ���novaentrada���n��

� while


ProcesAC�pl

#!/usr/local/bin/perl

while (<STDIN>) {

s��E� ��!"���E�� ���!"���or �n�� ��!"����n���E������E� �����g�

s���E�� ��!"� !"���E������g�

s����i� ��!"���i�����g�

s��� ���!"����g�

s���CM�f pl���CM����CM�f�pl���CM��g�

s���CM�m f���CM����CM�m�f���CM��g�

s���CM�f sing���CM����CM�f�sing���CM��g�

s���CM�pt pp���CM����CM�pt�pp���CM��g�

s���CM�tb pl���CM����CM�tb�pl���CM��g�

s�� �n� ��!"���n����g�

s���n� ��!"���n����g�

s�� �v��CM�� ��!"����CM����v�������g�

s���v��CM�� ��!"����CM����v�������g�

s��v��CG���w�� ��w�����CG����v���v��CG���������CG����v��g�

s��v��CG���w�� ��w�� ��w�����CG����v���v��CG������������CG����v��g�

s���v��CG�� ��!"����CG����v��������g�

s���v��CS� ��!"���CS����v����g�

s���v��REG� ��!"���REG����v����g�

s���i� ��!"���i��� ���!"�����g�

s���i� ��!"���i��� ��!"���T������T��g�

s�� ��!!"�!��g�

s� ������g�

s� �T� � �T��g�

s� �T� ��g�

s� �T���g�

s� �T���g�

s����T���g�

s���n�pron���n�g�

s���n��pl����n �g�

s���n��m����n �g�


if ��������� �

if ����n�� �

��a��b��c��d� split���� ������

��primer��segon� split����n ���a��

if ���segon� � ��primer� �segon� split����n���a���

�primer � tr� ����

�primer � s���"�����������

�segon � s� !"� ��

�segon � tr� ����

�segon � s�����"������

�segon � s���"��������

�segon � s���"���������

�segon � s�� ���!����� ��!����� ��n!"����������g�

�segon � s�� ��!"��� ���!"���� ��n!"���������g�

�segon � s�����g�

�segon � s������g�

�aa join�� ���primer��segon��

�aa � s���n��g�

�aa � s� ��pl�� �g�

�aa � s���pl����pl�g�

�aa � s� ��pl� �g�

�aa � s��� � ��!"���� ��!"��"��� ��

�aa � s���m��f��m��

print �aa���n��

if ��b� �

�b � s� !�� ��

�b � tr� ����

�b � s�����"������

�b � s���"��������

�b � s���"���������

�b � s�� ���!����� ��!����� ��n!"����������g�

�b � s�� ��!"��� ���!"���� ��n!"���������g�

�b � s�� ��!"��� ��!"�����g�

�b � s�����g�

�b � s������g�

�bb join�� ���primer��b��

�bb � s� ��pl�� �g�

�bb � s���pl����pl�g�

�bb � s� ��pl� �g�

�bb � s��� � ��!"���� ��!"��"��� ��

�bb � s���m��f��m��


print �bb���n��

if ��c� �

�c � s� !�� ��

�c � tr� ����

�c � s�����"������

�c � s���"��������

�c � s���"���������

�c � s�� ���!����� ��!����� ��n!"����������g�

�c � s�� ��!"��� ���!"���� ��n!"���������g�

�c � s�� ��!"��� ��!"�����g�

�c � s�����g�

�c � s������g�

�cc join�� ���primer��c��

�cc � s� ��pl�� �g�

�cc � s���pl����pl�g�

�cc � s� ��pl� �g�

�cc � s��� � ��!"���� ��!"��"��� ��

�cc � s���m��f��m��

print �cc���n��

if ��d� �

�d � s� !�� ��

�d � tr� ����

�d � s�����"������

�d � s���"��������

�d � s���"���������

�d � s�� ���!����� ��!����� ��n!"����������g�

�d � s�� ��!"��� ���!"���� ��n!"���������g�

�d � s�� ��!"��� ��!"�����g�

�d � s�����g�

�d � s������g�

�dd join�� ���primer��d��

�dd � s� ��pl�� �g�

�dd � s���pl����pl�g�

�dd � s� ��pl� �g�

�dd � s��� � ��!"���� ��!"��"��� ��

�dd � s���m��f��m��

print �dd���n��

���

� if linia

��

� while


ProcesAC�pl

#!/usr/local/bin/perl

while (<STDIN>) {

s���m��iv���m��inv�g�

s���f��pl���fpl�g�

s���m��f���mf�g�

s���m��pl���mpl�g�

s���m��m���m�g�

s���m�����m�g�

s���m��sing���m�g�

s���f��sing���m�g�

s���f���g�

s���m���g�

s���f��m���mf�g�

print�

� while


B Catalan-English dictionary

B.1 Abbreviations

GRAMMATICAL CATEGORY

aj, aj av, aj f, aj iv, aj m, aj m av, aj m iv, aj mf, aj pl, aj pr, aj pr ar, aj pr pl, ar, ar f, ar fpl, ar m, ar mpl, av, av aj, av cnj, av m, av prp, av prp m, cnj, f, f�m, fpl, fsg�pl, inj, m, m iv, m�f, mf, mpl, msg�pl, pr, prp, vi, vip, vp, vt, vti, vtip, vtp

SEMANTIC FILE

aer, agr, agr med, ana, ana bot, ana tcn, arq, arq mar, art, ast, ast tcn, aut, aut frr, bio, bot, cin, cin fot, cin tea, com, dr, ecn, ele, esp, fot, fot cin, frr, fs, fs tcn, geo, grf, grm, grm mat, gst, hst, ifm, jcs, lit, mar, mar aer, mat, mat ana, mat med, med, med txt, mil, min, ms, ms tea, pol, qm, rlg, tb fs, tb geo ecn, tb mat, tb med, tb ms, tb tecn, tcn, tea, tea ms, txt, txt grf, zoo

MORPHOLOGICAL CODE

pl, sg, sg�pl

REGISTER

atr, dsp, fg, fg fm, fg vlg, fm, vlg


B�� Femenine gender inexion

acomodador a �f usherette aerodin amic a �f aerodynamicsal�lot a �f girl beat a �f lay sisterbenefactor a �f benefactress besn�et a �f great granddaughtercalb a �f bald patch cambrer a �f waitress

caracter�istic a �f characteristic caador a �f hunting jacketclos a �f fence comediant a �f comediennecript ogam a �f Cryptogamia din amic a �f dynamicsdret a �f right dret a �f to stand upelectr onic a �f electronics escultor a �f sculptressfer a �f wild animal� beast �ll a �f daughter�llol a �f goddaughter �ac a �f weaknessfoll a �f madwoman fon etic a �f phoneticsfosc a �f dark� darkness fresc a �f fresh airfruiter a �f fruit bowl� fruit dish gat a �f she catgegant a �f giantess genet a �f horsewomangermanastre a �f stepsister gimn astic a �f gymnasticsgraner a �f broom hidr aulic a �f hydraulicsimpressor a �f printer indirecte a �f insinuation� hint� allusioninfermer a �f nurse jardiner a �f gardeningjardiner a �f window box junt a �f board� council� committeejunt a �f meeting� assembly laic a �f laywomanllaurador a �f ploughwoman lleter a �f churn� milk canl ogic a �f logic matem atic a �fsg�pl mathematics� mathsmec anic a �f mechanics mec anic a �f mechanismmenjador a �f trough� manger monjo a �f nunmort a �f death mort a �f death� endmosso a �f girl m axim a �f at most

m axim a �f maxim m�imic a �f mime� mimicrym�usic a �f music nen a �f girlnoi a �f girl n autic a �f art of navigationn�et a �f granddaughter parell a �f couplepastor a �f shepherdess pescador a �f �sherwomanpistoler a �f holster pl astic a �f plastic art� modellingpoltre a �f �lly� foal pol emic a �f polemic� controversy

pol�itic a �f policy pol�itic a �f politicsporc a �f sow prostitut a �f prostitutepr actic a �f �pl training �sg pr actic a �f in practicepr actic a �f pilot pr actic a �f practice

��

Page 26: Con - cs.upc.eduescudero/wsd/98-lsi3t-befr.pdf · Con v erting the Enciclop edia Catalana bilingual MRD to an MTD L Ben tez G Escudero J F arreres G Rigau Con ten ts In tro duction

quint a /f class, call-up
quint a /f fifth
raier a /f raftwoman
recte a /f straight line
segador a /f harvester, reaper, mower
sembrador a /f sower
senyor a /f lady, wife
sogre a /f mother-in-law
travesser a /f road which crosses a town
traïdor a /f traitress, betrayer
tècnic a /f technique
vedell a /f veal
vell a /f old woman
vell a /f to grow old
viudo a /f widow
xaval a /f girl, lass
xicot a /f fiancée
xicot a /f girl, lass
xofer a /f chauffeuse, driver
ètic a /f ethics
òptic a /f optician's
òptic a /f optics
òptic a /f point of view
actor triu /f actress
americà ana /f jacket
amfitrió ona /f hostess
ancià ana /f elderly woman
anglès esa /f Englishwoman
artesà ana /f craftswoman
besavi avia /f great-grandmother
comú una /f commune
comú una /f in common
comú una /f toilet
criat ada /f servant, maid
cunyat ada /f sister-in-law
degà ana /f doyenne
desè ena /f ten
empresari aria /f businesswoman
escocès esa /f Scotswoman
espadatxí ina /f swordswoman
espòs osa /f wife
exclusiu iva /f exclusive
exclusiu iva /f sole right
fadrí ina /f young woman
francès esa /f Frenchwoman
gal·lès esa /f Welshwoman
gasós osa /f lemonade
germà ana /f sister
gros ossa /f first prize
gros ossa /f thumb
hereu eva /f heiress
heroi oïna /f heroine
holandès esa /f Dutchwoman
hoste essa /f fair hostess, stewardess
hoste essa /f hostess
incisiu iva /f incisor
instantani ania /f snap
irlandès esa /f Irishwoman
manyac aga /f caress
marquès esa /f marchioness
minyó ona /f girl
minyó ona /f maid
mitjà ana /f mean
nebot oda /f niece
nebulós osa /f nebula
negatiu iva /f denial, refusal
normatiu iva /f rules /pl, regulations /pl
noucasat da /f recently married woman
ofensiu iva /f offensive
padrí ina /f godmother
padrí ina /f grandmother
pagès esa /f countrywoman
pagès esa /f to turn a deaf ear
paisà ana /f countrywoman
paisà ana /f to be in plain clothes
primari aria /f thinness
promès esa /f fiancée
promès esa /f promise
propietari aria /f owner, proprietress, landlady
religiós osa /f nun


rodó ona /f circle
rodó ona /f semibreve
sacerdot essa /f priestess
serè ena /f in the open
serè ena /f night dew
soldà ana /f sultana
tallat ada /f cut, cutting
trencadís issa /f breakage
veterinari aria /f veterinary science
viscós osa /f viscose
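Each entry above pairs a masculine headword with a feminine ending, and the feminine form is obtained by replacing the tail of the headword with that ending. The following is a minimal Perl sketch of that rewriting, not the report's own script. It assumes a small illustrative subset of the endings, restricted to unaccented ones, and the names `feminine_form` and `@RULES` are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative subset of suffix rules (an assumption, not the full list):
# [ masculine suffix, feminine ending, part of the suffix that is kept ]
my @RULES = (
    [ 'tor', 'triu', ''   ],   # actor    + triu -> actriu
    [ 'tiu', 'iva',  't'  ],   # negatiu  + iva  -> negativa
    [ 'siu', 'iva',  's'  ],   # exclusiu + iva  -> exclusiva
    [ 'bot', 'oda',  'b'  ],   # nebot    + oda  -> neboda
    [ 'eu',  'eva',  ''   ],   # hereu    + eva  -> hereva
    [ 'ac',  'aga',  ''   ],   # manyac   + aga  -> manyaga
    [ 'at',  'ada',  ''   ],   # criat    + ada  -> criada
    [ 'ot',  'essa', 'ot' ],   # sacerdot + essa -> sacerdotessa
    [ 'te',  'essa', 't'  ],   # hoste    + essa -> hostessa
);

# Build the feminine form from a masculine headword and a feminine ending.
# Returns undef when no rule in the subset applies.
sub feminine_form {
    my ($masc, $ending) = @_;

    # Plain "-a" ending: drop a final -e/-o, then append -a.
    if ($ending eq 'a') {
        (my $stem = $masc) =~ s/[eo]$//;
        return $stem . 'a';
    }

    for my $rule (@RULES) {
        my ($suffix, $fem, $keep) = @$rule;
        next unless $ending eq $fem && $masc =~ /\Q$suffix\E$/;
        (my $stem = $masc) =~ s/\Q$suffix\E$//;
        return $stem . $keep . $fem;
    }
    return;
}

for my $entry ('gat a', 'monjo a', 'actor triu', 'negatiu iva', 'sacerdot essa') {
    my ($masc, $ending) = split ' ', $entry;
    printf "%s + %s -> %s\n", $masc, $ending, feminine_form($masc, $ending) // '?';
}
```

Keeping the rules in a table rather than as a chain of substitutions makes it easy to see which endings are covered. Entries with accented endings would need additional rules that are not sketched here.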


B.3 Source code in Perl

See [Wall et al. 96].

ProcesCA.pl

#!/usr/local/bin/perl

while (<STDIN>)

� s��i���nj����i� � print ��� ��n��ge�

# Grammatical categories

s��i�� ��!����i���i��CG������CG����i��g�

s��i�� avm! ��!����i���i��CG������CG����i��g�

s��i��a jrv! �w��a jrv! �w� �w�����i���i��CG������CG����i��g�

s��i��m iv����i���i��CG������CG����i��g�

s��i��vt�w�����i���i��CG������CG����i��g�

s��i�� �s! �g!����w�����i���i��CG������CG����i��g�

s��i�� �s!"���w�����i���i��CG������CG����i��g�

s��i���pl��nj����i���i��CG������CG����i��g�

s/<i>(pr)<\/i>/<i><CG>$1<\/CG><\/i>/g;

s/<i>(vip|prp)<\/i>/<i><CG>$1<\/CG><\/i>/g;

# Semantic fields

s��i��dr�q�����i���i��CS������CS����i��g�

s��i��a �tlv! �e!����i���i��CS������CS����i��g�

s��i�� zb! �rae!�����i���i��CS������CS����i��g�

s��i�� ghjl! �aol!�����i���i��CS������CS����i��g�

s��i�� rt! �io!�����i���i��CS������CS����i��g�

s��i��tb �w���w�����i���i��CS������CS����i��g�

s��i�� ac! �m! �b! �w�����i���i��CS������CS����i��g�

s��i��ms ��!"����i���i��CS������CS����i��g�

s��i��m aei!�����i���i��CS������CS����i��g�

s��i��m aei!� �f! �i!�w�����i���i��CS������CS����i��g�

s��i��f �e!� �w��g�� �w��t�� �������i���i��CS������CS����i��g�

s/<i>(pol|ifm)<\/i>/<i><CS>$1<\/CS><\/i>/g;

s��i��f or! �c!����i���i��CS������CS����i��g�

s��i��e slc! ��!����i���i��CS������CS����i��g�

s��i��c io! �ps!����i���i��CS������CS����i��g�


# Morphological codes

s/<i>(pl|sg)<\/i>/<i><CM>$1<\/CM><\/i>/g;

s��i��s�w����w�����i���i��CM������CM����i��g�

# Registers

s��i��fg �w�����i���i��REG������REG����i��g�

s/<i>(fg|fm|vlg|dsp|atr|US)<\/i>/<i><REG>$1<\/REG><\/i>/g;

s��i��tb fg����i���i��REG������REG����i��g�

if ���CG�� ��!"����CG��� �

s��CG�� ��!"����CG���catgram���e�

s��i��catgram���i���i��CG��catgram���CG����i��g�

else �

if ������� �

s��T���T��i��CG��catgram���CG����i��g�

print�

ProcesCA.pl

#!/usr/local/bin/perl

while (<STDIN>)

if����� ��!!"�!�� # ��CG� mf!����

if���n��� �

s���i� ���!"���i�� !"��n� ��!"���n����� �� �g�

s��E� ��!"���E� ��!"�T� !"�n�� ��!"����n���E��� ���E� �T��g�

print�

� while


ProcesCA.pl

#!/usr/local/bin/perl

while (<STDIN>)

s���E�� ��!"� ���E����entrada���e�

s�����i����i����� ���g�

s��� ���!"����g�

s����i� ��!"���i�����g�

s��i���CG� ��!"���CG�����i�����g�

s���i� ��!"���i��� ���!"�����g�

s���i� ��!"���i��� ��!"���T������T��g�

s���i��CS� ��!"���CS����i����g�

s���i��REG� ��!"���REG����i����g�

s�� �n� ��!"���n��� �g�

�s���n� ��!"���n����g�

s��CG���w�� ��w�����CG���CG���������CG��g�

s��CG���w�� ��w�� ��w�����CG���CG������������CG��g�

s���CG�� ��!"����CG��������g�

�s�� �i��CM�� ��!"����CM����i�������g�

s���i��CM�� ��!"����CM����i�������g�

s�� ��!!"�!��g�

s� !��T� !�� �T��g�

s��T���g�

s����T���g�

s� !�� �g�

if����mf�#� ��� s� � !� ��mf� ��m�g��

if����m�#� ��� s� � !� ��m� ��m�g��

if����f�#� ���

s/([^eo]) a \/f/$1a \/f/g;

s/([eo]) a \/f/a \/f/g;

s� �ena� ��f��� ��f�g�

s�s �esa� ��f��� ��f�g�

s� �ana� ��f��� ��f�g�

s� �ona� ��f��� ��f�g�

s�s �osa� ��f��� ��f�g�

s� �una� ��f��� ��f�g�

s� �ina� ��f��� ��f�g�

s/eu (eva) \/f/$1 \/f/g;

s/ac (aga) \/f/$1 \/f/g;


s/at (ada) \/f/$1 \/f/g;

s�ari �ria� ��f��� ��f�g�

s�ani �nia� ��f��� ��f�g�

s�avi �via� ��f��� ��f�g�

s�s �issa� ��f��� ��f�g�

s/os (ossa) \/f/$1 \/f/g;

s/tiu (iva) \/f/t$1 \/f/g;

s/siu (iva) \/f/s$1 \/f/g;

s�s �osa� ��f��� ��f�g�

s�oi �ona� ��f��� ��f�g�

s/te (essa) \/f/t$1 \/f/g;

s/tor (triu) \/f/$1 \/f/g;

s/bot (oda) \/f/b$1 \/f/g;

s/casat (da) \/f/casa$1 \/f/g;

s/ot (essa) \/f/ot$1 \/f/g;

s���� mf!� ���pl ���������g�

if���� fm!���

s� ������g�

s���� fm! � !"���catgram���e�

��a��b��c��d� split���� ������

if �� ��m �� � ��primer��segon� split�� ��m ���a��

if �� ��mf �� � ��primer� �segon� split�� ��mf ���a���

if �� ��mpl �� � ��primer� �segon� split�� ��mpl ���a���

if �� ��m��f �� � ��primer� �segon� split�� ��m��f ���a���

if �� ��msg��pl �� � ��primer� �segon� split�� ��msg��pl ���a���

if �� ��m��iv �� � ��primer� �segon� split�� ��m��iv ���a���

if �� ��f �� � ��primer� �segon� split�� ��f ���a���

if �� ��fpl �� � ��primer� �segon� split�� ��fpl ���a���

if �� ��fsg��pl �� � ��primer� �segon� split�� ��fsg��pl ���a���

�primer � tr� ����

�segon � s� !�� ��

�segon � tr� ����

�segon � s�����"������

�segon � s���"��������

�segon � s���"���������

�segon � s��������g�

�segon � s��to���g�


�primera join�����primer��catgram��

�aa join�� ���primera��segon��

print �aa���n��

if ��b� �

�b � s� !�� ��

�b � tr� ����

�b � s�����"������

�b � s���"��������

�b � s���"���������

�b � s��������g�

�b � s��to���g�

�bb join�� ���primera��b��

print �bb���n��

if ��c� �

�c � s� !�� ��

�c � tr� ����

�c � s�����"������

�c � s���"��������

�c � s���"���������

�c � s��������g�

�c � s��to���g�

�cc join�� ���primera��c��

print �cc���n��

if ��d� �

�d � s� !�� ��

�d � tr� ����

�d � s�����"������

�d � s���"��������

�d � s���"���������

�d � s��������g�

�d � s��to���g�

�dd join�� ���primera��d��

print �dd���n��

���

� if linia

� while


ProcesCA.pl

#!/usr/local/bin/perl

while (<STDIN>)

s��E�� ��!"����E���novaentrada���e�

s��n�� ��!"����n���vellaentrada���e�

�novaentrada�tr� ����

�novaentrada� s���"��������

�vellaentrada�tr� ����

�vellaentrada� s���"��������

print �vellaentrada�� ���novaentrada���n��

� while

ProcesCA.pl

#!/usr/local/bin/perl

while (<STDIN>)

s��E� ��!"���E�� ���!"���o �n�� ��!"����n���E������E� �����g�

s���E�� ��!"� !"���E������e�

s�����i����i����� ���g�

s��� ���!"����g�

s����i� ��!"���i�����g�

s��i���CG� ��!"���CG�����i�����g�

s���i� ��!"���i��� ���!"�����g�

s���i� ��!"���i��� ��!"���T������T��g�

s���i��CS� ��!"���CS����i����g�

s���i��REG� ��!"���REG����i����g�

s�� �n� ��!"���n��� �g�

�s���n� ��!"���n����g�

s��CG���w�� ��w�����CG���CG���������CG��g�

s��CG���w�� ��w�� ��w�����CG���CG������������CG��g�

s���CG�� ��!"����CG��������g�

�s�� �i��CM�� ��!"����CM����i�������g�

s���i��CM�� ��!"����CM����i�������g�


s�� ��!!"�!��g�

s� !��T� !�� �T��g�

s��T���g�

s����T���g�

s� !�� �g�

if����mf�#� ��� s� � !� ��mf� ��m�g��

if����m�#� ��� s� � !� ��m� ��m�g��

if����f�#� ���

s/([^eo]) a \/f/$1a \/f/g;

s/([eo]) a \/f/a \/f/g;

s� �ena� ��f��� ��f�g�

s�s �esa� ��f��� ��f�g�

s� �ana� ��f��� ��f�g�

s� �ona� ��f��� ��f�g�

s�s �osa� ��f��� ��f�g�

s� �una� ��f��� ��f�g�

s� �ina� ��f��� ��f�g�

s/eu (eva) \/f/$1 \/f/g;

s/ac (aga) \/f/$1 \/f/g;

s/at (ada) \/f/$1 \/f/g;

s�ari �ria� ��f��� ��f�g�

s�ani �nia� ��f��� ��f�g�

s�avi �via� ��f��� ��f�g�

s�s �issa� ��f��� ��f�g�

s/os (ossa) \/f/$1 \/f/g;

s/tiu (iva) \/f/t$1 \/f/g;

s/siu (iva) \/f/s$1 \/f/g;

s�s �osa� ��f��� ��f�g�

s�oi �ona� ��f��� ��f�g�

s/te (essa) \/f/t$1 \/f/g;

s/tor (triu) \/f/$1 \/f/g;

s/bot (oda) \/f/b$1 \/f/g;

s/casat (da) \/f/casa$1 \/f/g;

s/ot (essa) \/f/ot$1 \/f/g;

s���� mf!� ���pl ���������g�

if���� fm!���

s� ������g�

s���� fm! � !"���catgram���e�

��a��b��c��d� split���� ������


if �� ��m �� � ��primer��segon� split�� ��m ���a��

if �� ��mf �� � ��primer� �segon� split�� ��mf ���a���

if �� ��mpl �� � ��primer� �segon� split�� ��mpl ���a���

if �� ��m��f �� � ��primer� �segon� split�� ��m��f ���a���

if �� ��msg��pl �� � ��primer� �segon� split�� ��msg��pl ���a���

if �� ��m��iv �� � ��primer� �segon� split�� ��m��iv ���a���

if �� ��f �� � ��primer� �segon� split�� ��f ���a���

if �� ��fpl �� � ��primer� �segon� split�� ��fpl ���a���

if �� ��fsg��pl �� � ��primer� �segon� split�� ��fsg��pl ���a���

�primer � tr� ����

�segon � s� !�� ��

�segon � tr� ����

�segon � s�����"������

�segon � s���"��������

�segon � s���"���������

�segon � s��������g�

�segon � s��to���g�

�primera join�����primer��catgram��

�aa join�� ���primera��segon��

print �aa���n��

if ��b� �

�b � s� !�� ��

�b � tr� ����

�b � s�����"������

�b � s���"��������

�b � s���"���������

�b � s��������g�

�b � s��to���g�

�bb join�� ���primera��b��

print �bb���n��

if ��c� �

�c � s� !�� ��

�c � tr� ����

�c � s�����"������

�c � s���"��������

�c � s���"���������

�c � s��������g�

�c � s��to���g�

�cc join�� ���primera��c��

print �cc���n��


if ��d� �

�d � s� !�� ��

�d � tr� ����

�d � s�����"������

�d � s���"��������

�d � s���"���������

�d � s��������g�

�d � s��to���g�

�dd join�� ���primera��d��

print �dd���n��

���

� if linia

� while

ProcesCA.pl

#!/usr/local/bin/perl

while (<STDIN>)

s�������g�

s/\/m\/iv/\/m\/inv/g;

s/\/f\/pl/\/f/g;

s/\/m\/pl/\/m/g;

s/\/m\/f/\/mf/g;

s/\/f\/m/\/mf/g;

s/\/fsg\/pl/\/f/g;

s/\/msg\/pl/\/m/g;

s���mf�����mf�g�

s/ temps\/m\/inv/ temps\/m/g;

print;

� while


References

[Atserias et al. 97] J. Atserias, S. Climent, J. Farreres, G. Rigau, H. Rodríguez. Combining Multiple Methods for the Automatic Construction of Multilingual WordNets. Proceedings of Conference on Recent Advances on NLP, RANLP'97, Tzigov Chark, Bulgaria, 1997.

[Benítez et al. 98a] L. Benítez, G. Escudero, J. Farreres, G. Rigau. Applying Automatic Methods for the Construction of Multilingual WordNets. Technical Report LSI �� � T, LSI Department, Universitat Politècnica de Catalunya, 1998.

[Benítez et al. 98b] L. Benítez, S. Cervell, G. Escudero, M. López, G. Rigau, M. Taulé (Universitat Politècnica de Catalunya; Universitat de Barcelona). Methods and tools for building the Catalan WordNet. In workshop on Language Resources for European Minority Languages at Conference on Language Resources and Evaluation (LREC), Granada, Spain, 1998.

[DEC ��] Diccionari bàsic català-anglès anglès-català. Enciclopèdia Catalana, Barcelona, ����. ISBN: B������ ����.

[Rigau 98] G. Rigau. Automatic Acquisition of Lexical Knowledge from MRDs. PhD Thesis, Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Barcelona, 1998.

[Wall et al. 96] L. Wall, T. Christiansen, R. L. Schwartz. Programming Perl, Second Edition. O'Reilly & Associates, Sebastopol, 1996. ISBN: 1-56592-149-6.
