Application of the AR2NL system for reporting association rules in Finnish

21.10.2004

Emilia Ylirinne

emilia@iki.fi

Tampere University of Technology

Introduction

Based on the LISp-Miner system
Data from the medical project STULONG
AR2NL translates association rules into Czech and English
Translating into Finnish

Topics

Reporting Data Mining results in Natural Language
System AR2NL
Translating into Finnish
Concluding Remarks

(based on doctor Petr Strossa's articles)

Reporting Data Mining results in Natural Language

Association rule: φ ≈ ψ
Founded implication: φ ⇒p,n ψ
Four-fold contingency table

Example of association rule

ED(univ) ∧ RS(mng) ⇒0.95,76 AJ(sits)

Four-fold table
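The founded implication in the example can be checked numerically. The sketch below assumes the usual four-fold table notation, which the slide does not spell out: a is the number of patients satisfying both sides of the rule, and b is the number satisfying the left side only. Under that assumption the rule holds when a/(a+b) >= p and a >= n. The value b = 4 is inferred from the figures on the slide (76 patients being 95 %), and the function name is hypothetical.

```python
def founded_implication(a: int, b: int, p: float, n: int) -> bool:
    """Return True when a/(a+b) >= p and a >= n (assumed definition)."""
    if a + b == 0:
        return False  # no patient satisfies the left-hand side
    return a / (a + b) >= p and a >= n


# a = 76 comes from the slide; b = 4 is inferred from 76 being 95 %.
a, b = 76, 4
confidence = a / (a + b)
holds = founded_implication(a, b, p=0.95, n=76)
print(f"{a} patients ({confidence:.0%}) confirm the rule: {holds}")
```

The remaining two cells of the four-fold table do not enter this check, which is why they are left out of the sketch.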

Natural Language (NL) Formulations

Several ways to formulate

1. 76 (i.e. 95 %) of the observed patients confirm this dependence: if the patient has university education and responsibility of a manager, then he mainly sits in his job.

1. 76 (eli 95 %) havainnoiduista potilaista toteuttaa seuraavan riippuvuuden: jos potilaalla on korkeakoulutus ja työ johtotehtävissä, hän istuu enimmäkseen työssään.

Natural Language (NL) Formulations

2. 95 % of the observed patients that have reached university education and work in a managerial position also mainly sit in their job.

2. 95 % havainnoiduista potilaista, jotka ovat saaneet korkeakoulutuksen ja työskentelevät johtotehtävissä, myös enimmäkseen istuvat työssään.

Natural Language (NL) Formulations

3. Potilaille, jotka ovat saaneet korkeakoulutuksen ja työskentelevät johtotehtävissä, on ominaista, että heillä on myös istumatyö. Tämän toteuttaa 76 (eli 95 %) havainnoitua potilasta.

3. It is characteristic for the patients that have reached university education and work in a managerial position that they also have a sedentary job. This fact is confirmed by 76 (i.e. 95 %) observed patients.

Natural Language (NL) Formulations

X ∧ Y ⇒0.95,76 Z

1. a (i.e. 100p %) of the observed patients confirm this dependence: if the patient has NLF(X) and NLF(Y), then he NLF(Z).

2. 100p % of the observed patients that NLF(X) and NLF(Y) also NLF(Z).

3. It is characteristic for the patients that NLF(X) and NLF(Y) that they also have NLF(Z). This fact is confirmed by a (i.e. 100p %) observed patients.

Noun phrase, NP
university education -> korkeakoulutus
a managerial position -> työ johtotehtävissä
a sedentary job -> istumatyö

Verb phrase, VP
works in a managerial position -> työskentelee johtotehtävissä
has reached university education -> on saavuttanut korkeakoulutuksen

Adjectival phrase, AP
university-educated -> korkeakoulutettu

Participial phrase
working as a manager -> johtotehtävissä työskentelevä
mainly sitting in his job -> työssään istuva

Natural Language (NL) Formulations

1. a (i.e. 100p %) of the observed patients confirm this dependence: if the patient has NP(X) and NP(Y), then he VP(Z).

1. a (eli 100p %) havainnoiduista potilaista toteuttaa seuraavan riippuvuuden: jos potilaalla on NP(X) ja NP(Y), hän VP(Z).

Natural Language (NL) Formulations

2. 100p % of the observed patients that VP(X) and VP(Y) also VP(Z).

2. 100p % havainnoiduista potilaista, jotka VP(X) ja VP(Y), myös VP(Z).

Natural Language (NL) Formulations

3. It is characteristic for the patients that VP(X) and VP(Y) that they also have NP(Z). This fact is confirmed by a (i.e. 100p %) observed patients.

3. Potilaille, jotka VP(X) ja VP(Y), on ominaista, että heillä on myös NP(Z). Tämän toteuttaa a (eli 100p %) havainnoitua potilasta.
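The way a formulation pattern is filled in can be sketched as simple template substitution. The example below is only an illustration and not the AR2NL implementation: it takes pattern 1 in English and Finnish from the slides, uses the phrases of the example rule, and fills in a = 76 and p = 0.95. The dictionary layout, the placeholder names and the function name are all hypothetical.

```python
# Formulation pattern 1 from the slides; placeholder names are hypothetical.
PATTERNS = {
    "en": ("{a} (i.e. {pct} %) of the observed patients confirm this "
           "dependence: if the patient has {np_x} and {np_y}, then he {vp_z}."),
    "fi": ("{a} (eli {pct} %) havainnoiduista potilaista toteuttaa seuraavan "
           "riippuvuuden: jos potilaalla on {np_x} ja {np_y}, hän {vp_z}."),
}

# Phrases for the literals of the example rule, copied from the slides.
PHRASES = {
    "en": {
        "np_x": "university education",
        "np_y": "responsibility of a manager",
        "vp_z": "mainly sits in his job",
    },
    "fi": {
        "np_x": "korkeakoulutus",
        "np_y": "työ johtotehtävissä",
        "vp_z": "istuu enimmäkseen työssään",
    },
}


def formulate(lang: str, a: int, p: float) -> str:
    """Fill pattern 1 for the given language with a and 100p."""
    pct = round(100 * p)
    return PATTERNS[lang].format(a=a, pct=pct, **PHRASES[lang])


english = formulate("en", 76, 0.95)
finnish = formulate("fi", 76, 0.95)
print(english)
print(finnish)
```

Keeping the patterns and the phrases in separate tables mirrors the split between sentence patterns and literal data described for the system, so a new language only needs new table entries in this sketch.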

Finnish language

Belongs to the Uralic family of languages
More than a dozen cases (http://www.cs.tut.fi/~jkorpela/finnish-cases.html)
Synthetic language: uses suffixes to express grammatical relations and also to derive new words
in my house, too -> talossanikin
after you had written -> kirjoitettuasi
"Free" word order
Pete loves Anna - Anna loves Pete

Pete rakastaa Annaa. This is the normal word order, the same as in English.
Annaa Pete rakastaa. This emphasizes the word Annaa: the object of Pete's love is Anna, not someone else.
Rakastaa Pete Annaa. This emphasizes the word rakastaa, and such a sentence might be used as a response to some doubt about Pete's love; so one might say it corresponds to "Pete does love Anna".
Pete Annaa rakastaa. This word order might be used, in conjunction with special stress on Pete in pronunciation, to emphasize that it is Pete and not someone else who loves Anna.
Annaa rakastaa Pete. This might be used in a context where we mention some people and tell about each of them who loves them. So this roughly corresponds to the English sentence "Anna is loved by Pete".
Rakastaa Annaa Pete. This does not sound like a normal sentence, but it is quite understandable.

source: http://www.cs.tut.fi/~jkorpela/finnish-intro.html

Finnish language

no definite or indefinite article
no grammatical gender
negation, corresponding to English not, behaves as a verb
ownership or possession (have and be in English)
I have a dog -> Minulla on koira ("at me (there) is (a) dog")

System AR2NL

Main features
Written in XML
Files which contain the data needed in translations
Translates association rules with founded implication

FP-file

Formulation Patterns
Base of (NL) sentences
File which contains the data needed in translations
Translates association rules with founded implication

FP-file

FPA-file

Formulation Patterns - Auxiliary
Substitutions for higher-order non-terminal symbols
Variability of sentences

FPA-file

Entitynames-file

Entities, e.g. "Patient", "which"

MN-file

Morphology - Nouns
Language dependent
Singular and plural case endings
7 cases in Czech, 14 cases in Finnish

MN-file
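To make the idea of a noun-morphology table concrete, here is a minimal sketch in Python. It is not the MN-file format, which the slides do not show: the dictionary layout and the function name are hypothetical, only four of the Finnish cases are listed, and the noun talo (house) is chosen because its stem does not change. Stem alternations such as consonant gradation are deliberately ignored.

```python
# Hypothetical table of case endings for the Finnish noun stem "talo" (house).
# Only a few cases are included; stem changes are not modelled.
ENDINGS = {
    "singular": {"nominative": "", "genitive": "n",
                 "inessive": "ssa", "adessive": "lla"},
    "plural": {"nominative": "t", "genitive": "jen",
               "inessive": "issa", "adessive": "illa"},
}


def inflect(stem: str, number: str, case: str) -> str:
    """Attach the case ending for the given number to the stem."""
    return stem + ENDINGS[number][case]


in_the_house = inflect("talo", "singular", "inessive")
in_the_houses = inflect("talo", "plural", "inessive")
# Possessive suffix -ni (my) and clitic -kin (too) stack after the case ending.
in_my_house_too = in_the_house + "ni" + "kin"
print(in_the_house, in_the_houses, in_my_house_too)
```

The last form reproduces talossanikin, the "in my house, too" example given for Finnish as a synthetic language.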

MV-file

Morphology - Verbs
Singular and plural case endings
Participial form and case

Elementary-file

Important part
Contains data of the literals
Noun phrase, Adjectival phrase, Verb phrase

Elementary-file

Conversion process

Example of the process

Problems

word order in participial form
drink beer - drinking beer
juoda olutta - olutta juova
cases in participial form: many cases
ja (and) in logic and in Finnish
Patients drinking beer and smoking mainly sits in their job.
Olutta juovat ja tupakoivat potilaat istuvan enimmäkseen työssään
ownership
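The word-order problem in participial phrases can be illustrated with a small sketch: in English the participle precedes its object (drinking beer), while in Finnish the object comes first (olutta juova). The function below is a hypothetical illustration of a language-dependent ordering rule, not part of AR2NL, and it assumes the participle and object are already correctly inflected.

```python
def participial_phrase(lang: str, participle: str, obj: str) -> str:
    """Order participle and object according to the target language."""
    if lang == "fi":
        # Finnish: the object precedes the participle.
        return f"{obj} {participle}"
    # English: the participle precedes the object.
    return f"{participle} {obj}"


english_phrase = participial_phrase("en", "drinking", "beer")
finnish_phrase = participial_phrase("fi", "juova", "olutta")
print(english_phrase)
print(finnish_phrase)
```

Ordering is only one of the listed problems; choosing the right case for the participle is a separate step that this sketch leaves out.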

Concluding Remark

The AR2NL system can also translate association rules into Finnish.
