Corpus Evaluation. Adam Kilgarriff, Lexical Computing Ltd. Portsmouth, Nov 2011.


Page 1: Corpus Evaluation

Corpus Evaluation

Adam Kilgarriff

Lexical Computing Ltd

Corpus evaluation, Portsmouth, Nov 2011

Page 2: Corpus Evaluation

Then
• Very few corpora
• Use what’s there

Now
• Corpora to spec
• Choice
• Need to evaluate

Page 3: Corpus Evaluation

Intrinsic
• See what it looks like

Extrinsic
• Embed in a task
• How well do you do at the task?
• Better

It all depends what you want it for

Page 4: Corpus Evaluation

It all depends what you want it for, but

‘General English (/French/Chinese/…)’
• Many purposes
• Not specialist sublanguage

A decent construct?
• Not sure, but it has form
• General language dictionaries
• “How good is a corpus, for making them?”

Page 5: Corpus Evaluation

General truths

• Duplicates bad
• Noise bad
• Big good
• Diverse (good coverage of varieties within research scope, not dominated by any one variety) good
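
Size and duplication are the two criteria here that can be checked most directly. The following is a minimal Python sketch, assuming the corpus is available as a list of paragraph strings. The function name `corpus_profile` and the duplicate test (exact match after lower-casing and whitespace normalisation) are illustrative assumptions, not something stated in the talk.

```python
from collections import Counter


def corpus_profile(paragraphs):
    """Return rough size and duplication indicators for a list of paragraphs."""
    # Normalise whitespace and case so trivially different copies match.
    normalised = [" ".join(p.lower().split()) for p in paragraphs]
    counts = Counter(normalised)
    tokens = sum(len(p.split()) for p in normalised)
    # Every copy beyond the first occurrence counts as a duplicate.
    duplicates = sum(n - 1 for n in counts.values())
    return {
        "paragraphs": len(normalised),
        "tokens": tokens,
        "duplicate_share": duplicates / len(normalised) if normalised else 0.0,
    }


# Made-up sample text for illustration only.
sample = [
    "The cat sat on the mat.",
    "the cat  sat on the mat.",
    "Corpora should be big and diverse.",
    "Noise is bad.",
]
profile = corpus_profile(sample)
```

Noise and diversity are harder to reduce to a single number and are left out of this sketch.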

Page 6: Corpus Evaluation

Word sketch

A corpus-derived one-page summary of a word’s grammatical and collocational behaviour

Page 7: Corpus Evaluation

Macmillan English Dictionary for Advanced Learners

Ed: Rundell, 2002

Page 8: Corpus Evaluation

11 years, 1999-2010

Feedback
• Good but anecdotal

Formal evaluation

Page 9: Corpus Evaluation

Goal

Collocations dictionary
• Model: Oxford Collocations Dictionary
• Publication-quality

Ask a lexicographer
• For 42 headwords
• For the 20 best collocates per headword
• “Should we include this collocation in a published dictionary?”

Page 10: Corpus Evaluation

Sample of headwords

Nouns, verbs, adjectives; random

High (top 3000)
• N: space, solution, opinion, mass, corporation, leader
• V: serve, incorporate, mix, desire
• Adj: high, detailed, open, academic

Mid (3000-9999)
• N: cattle, repayment, fundraising, elder, biologist, sanitation
• V: grieve, classify, ascertain, implant
• Adj: adjacent, eldest, prolific, ill

Low (10,000-30,000)
• N: predicament, adulterer, bake, bombshell, candy, shellfish
• V: slap, outgrow, plow, traipse
• Adj: neoclassical, votive, adulterous, expandable
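
The sample has three frequency bands and three word classes, with 6 nouns, 4 verbs and 4 adjectives per band (42 headwords in all). One way such a stratified random sample could be drawn is sketched below. It assumes a frequency-ranked lexicon given as a list of `(lemma, pos)` pairs. The names `BANDS`, `QUOTA` and `sample_headwords` are hypothetical, and treating rank 3000 as the start of the mid band is an assumption, since the slide lists 3000 in both bands.

```python
import random

# Frequency bands by rank: (label, first rank, last rank).
# Putting rank 3000 in the mid band is an assumption.
BANDS = [("high", 1, 2999), ("mid", 3000, 9999), ("low", 10000, 30000)]

# Headwords drawn per word class in each band (6 + 4 + 4 = 14 per band).
QUOTA = {"N": 6, "V": 4, "Adj": 4}


def sample_headwords(ranked_lexicon, seed=0):
    """Draw a random sample stratified by frequency band and word class.

    ranked_lexicon: list of (lemma, pos) pairs, most frequent first.
    """
    rng = random.Random(seed)
    sample = {}
    for label, first, last in BANDS:
        # Ranks are 1-based, so rank r sits at index r - 1.
        band = ranked_lexicon[first - 1:last]
        for pos, quota in QUOTA.items():
            candidates = [lemma for lemma, p in band if p == pos]
            sample[(label, pos)] = rng.sample(candidates, quota)
    return sample


# Toy lexicon of 30,000 made-up entries cycling through the three classes.
toy = [(f"w{i}", ("N", "V", "Adj")[i % 3]) for i in range(1, 30001)]
chosen = sample_headwords(toy)
total = sum(len(v) for v in chosen.values())
```

With these quotas `total` comes to 42, matching the number of headwords given to the lexicographers.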

Page 11: Corpus Evaluation

Precision and recall

We tested precision

Recall is harder
• How do we find all the collocations that the system should have found?
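
For reference, the two measures in their standard form. The symbols are introduced here for illustration: $S$ is the set of collocations the system proposes and $G$ is the set of collocations that should be found.

```latex
\[
\text{precision} = \frac{|S \cap G|}{|S|},
\qquad
\text{recall} = \frac{|S \cap G|}{|G|}
\]
```

Precision only needs a judgement for each item in $S$, whereas recall needs the whole of $G$ to be known, which is why it is the harder of the two to measure.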

Page 12: Corpus Evaluation

Four languages, three families

• Dutch: ANW, 102m-word lexicographic corpus
• English: UKWaC, 1.5b web corpus
• Japanese: JpWaC, 400m web corpus
• Slovene: FidaPlus, 620m lexicographic corpus

Page 13: Corpus Evaluation

User evaluation

Evaluate whole system
• Will it help with my task?
• E.g. preparing a collocations dictionary

Contrast: developer evaluation
• Can I make the system better?
• Evaluate each module separately
• Current work

Page 14: Corpus Evaluation

Components

• Corpus
• NLP tools: segmenter, lemmatiser, POS-tagger
• Sketch grammar
• Statistics
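
To make the "statistics" component concrete, here is a small Python sketch of one association score, logDice, defined as 14 + log2(2·f_xy / (f_x + f_y)). The slides do not say which statistic was used in the evaluation, so logDice appears here only as a known example. The function names and the frequencies are made up for illustration.

```python
import math


def log_dice(f_xy, f_x, f_y):
    """logDice association score: 14 + log2(2 * f_xy / (f_x + f_y))."""
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


def rank_collocates(f_head, candidates, top_n=20):
    """Rank the collocates of one headword by logDice, best first.

    candidates: dict mapping collocate -> (co-occurrence freq, collocate freq).
    """
    scored = {
        c: log_dice(f_xy, f_head, f_c) for c, (f_xy, f_c) in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:top_n]


# Made-up frequencies for illustration only.
ranking = rank_collocates(
    1000, {"a": (500, 1000), "b": (10, 100000), "c": (100, 200)}
)
```

The default `top_n=20` mirrors the 20 best collocates per headword that were shown to the lexicographers.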

Page 15: Corpus Evaluation

Practicalities

Interface
• Good, Good-but: merge to good
• Maybe, Maybe-specialised, Bad: merge to bad

For each language
• Two/three linguists/lexicographers
• If they disagree: don't use for computing performance
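
The scoring procedure described above can be sketched in Python as follows. The data layout, the names `MERGE` and `precision`, and the choice to test for disagreement after merging the labels (rather than before) are assumptions made for illustration.

```python
# Five interface labels collapsed to a binary judgement.
MERGE = {
    "good": True,
    "good-but": True,
    "maybe": False,
    "maybe-specialised": False,
    "bad": False,
}


def precision(judgements):
    """Share of candidate collocations judged good, over agreed items only.

    judgements: dict mapping (headword, collocate) -> list of labels,
    one label per judge.
    """
    good = agreed = 0
    for labels in judgements.values():
        merged = {MERGE[label] for label in labels}
        if len(merged) != 1:
            # Judges disagree after merging (or gave no label): skip the item.
            continue
        agreed += 1
        good += merged.pop()  # True counts as 1, False as 0
    return good / agreed if agreed else 0.0


# Made-up judgements for illustration only.
toy = {
    ("space", "outer"): ["good", "good-but"],
    ("space", "the"): ["bad", "maybe"],
    ("mix", "match"): ["good", "bad"],
    ("cattle", "herd"): ["good", "good"],
}
score = precision(toy)
```

In the toy data the third item is dropped because the judges disagree, so the score is computed over the remaining three items.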

Page 16: Corpus Evaluation

Results

• Dutch 66%
• English 71%
• Japanese 87%
• Slovene 71%

Two thirds of a collocations dictionary can be gathered automatically

Page 17: Corpus Evaluation

<world, final> problem

Is it good?
• Superficially no
• Look at concordances: World Cup finals

Solution
• ‘Commonest string’
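
The slide names the fix only as ‘commonest string’. The Python sketch below is one guess at what that could involve: for a lemma pair, collect the surface strings that span both lemmas in the concordance lines and show the most frequent one. The function name, the `(word_form, lemma)` input layout, the left-to-right order and the `max_gap` window are all assumptions.

```python
from collections import Counter


def commonest_string(concordances, lemma_a, lemma_b, max_gap=3):
    """Return the most frequent surface string spanning the two lemmas.

    concordances: list of lines, each a list of (word_form, lemma) pairs.
    Only spans where lemma_a precedes lemma_b are counted.
    """
    counts = Counter()
    for line in concordances:
        lemmas = [lemma for _, lemma in line]
        for i, lemma in enumerate(lemmas):
            if lemma != lemma_a:
                continue
            # Look for the second lemma with at most max_gap tokens between.
            for j in range(i + 1, min(i + 2 + max_gap, len(lemmas))):
                if lemmas[j] == lemma_b:
                    span = " ".join(form.lower() for form, _ in line[i:j + 1])
                    counts[span] += 1
                    break
    return counts.most_common(1)[0][0] if counts else None


# Made-up concordance lines for illustration only.
lines = [
    [("the", "the"), ("World", "world"), ("Cup", "cup"), ("finals", "final")],
    [("World", "world"), ("Cup", "cup"), ("finals", "final"), ("in", "in")],
    [("a", "a"), ("world", "world"), ("final", "final")],
]
best = commonest_string(lines, "world", "final")
```

Presenting the judge with "world cup finals" instead of the bare pair makes the collocation recognisable without a trip to the concordance.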

Page 18: Corpus Evaluation

Next step

Recall
• 200 collocates per headword
• Selected from
  • All the corpora we have
  • Various parameter settings
• Plus just-in-time evaluation for 'new' collocates

Then
• For a sample of headwords: these are the collocations we should get
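
The plan above amounts to building a pooled gold standard and scoring recall against it. The Python sketch below makes several assumptions. The function names are hypothetical, the pool is formed by taking the top `depth` items of each run (the slide says only that 200 collocates per headword are selected from the pooled output), and the human judgement is stood in for by a callable `judge`.

```python
def pool_gold_standard(runs, judge, depth=200):
    """Build a per-headword gold set from the pooled output of several runs.

    runs: list of dicts mapping headword -> ranked list of collocates,
          one dict per corpus or parameter setting.
    judge: function (headword, collocate) -> bool, the human judgement.
    """
    gold = {}
    for run in runs:
        for headword, ranked in run.items():
            accepted = gold.setdefault(headword, set())
            for collocate in ranked[:depth]:
                if judge(headword, collocate):
                    accepted.add(collocate)
    return gold


def recall(system_output, gold):
    """Share of gold collocations that the system output contains."""
    found = total = 0
    for headword, wanted in gold.items():
        found += len(wanted & set(system_output.get(headword, [])))
        total += len(wanted)
    return found / total if total else 0.0


# Made-up runs and judgements for illustration only.
runs = [
    {"cattle": ["herd", "graze", "the"]},
    {"cattle": ["rancher", "herd", "of"]},
]
acceptable = {("cattle", "herd"), ("cattle", "graze"), ("cattle", "rancher")}
gold = pool_gold_standard(runs, lambda h, c: (h, c) in acceptable, depth=3)
score = recall({"cattle": ["herd", "the", "graze"]}, gold)
```

A collocate that a later system proposes but that never entered the pool has no judgement yet. That is the case the just-in-time evaluation is meant to cover, and it is not modelled here.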

Page 19: Corpus Evaluation

From sketches to corpora

• Hold other inputs constant
• Just one varies
• Evaluate that one

Hold tools, stats, grammar constant: evaluate corpora
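
As a rough Python sketch of this design, the fixed tools, sketch grammar and statistics can be represented by a single function, with the corpus as the only input that varies. The names `compare_corpora`, `build_sketches` and `evaluate` are hypothetical, and the toy data only stands in for real corpora and real sketch output.

```python
def compare_corpora(corpora, build_sketches, evaluate):
    """Score each corpus with every other input held fixed.

    corpora: dict mapping corpus name -> corpus object.
    build_sketches: function corpus -> system output; it stands for the
        fixed tools, sketch grammar and statistics.
    evaluate: function system output -> score (for example precision).
    """
    return {
        name: evaluate(build_sketches(corpus)) for name, corpus in corpora.items()
    }


# Made-up data for illustration only.
gold = {"herd", "graze"}
toy_corpora = {"A": ["herd", "the", "graze"], "B": ["the", "of", "herd"]}
scores = compare_corpora(
    toy_corpora,
    build_sketches=lambda corpus: set(corpus),
    evaluate=lambda output: len(output & gold) / len(output),
)
```

Because `build_sketches` and `evaluate` are the same for every corpus, any difference in score is attributable to the corpus.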

Page 20: Corpus Evaluation

Criteria

• Duplicates bad
• Noise bad
• Big good
• Diverse (good coverage of varieties within research scope, not dominated by any one variety) good

We think so

Page 21: Corpus Evaluation

Over next year

Build test sets

Textbook cases
• English: BNC vs UKWaC vs OEC vs Gigaword
• Dutch: ANW corpus vs web corpus

Web crawling, deduplication
• Which parameters give best results?

Page 22: Corpus Evaluation

Thank you

http://www.sketchengine.co.uk