Océ at CLEF 2003

Roel Brand

Marvin Brünner

Samuel Driessen

Jakob Klok

Pascha Iljin

Outline

• Océ mission

• Participation in 2001, 2002

• Participation in 2003: three models

• Results

• Conclusions

• Remark on evaluation measures

Océ-Technologies B.V.

• active in approximately 80 countries

• 23,000 people worldwide

Mission:

To enable people to share information by offering products and services for the reproduction, presentation, distribution and management of documents.

Research: >2000 employees

Participation in 2001, 2002

2001: Dutch mono-lingual task

2002: All mono-lingual tasks; several cross-lingual; multi-lingual

Participation in 2003

Mono-lingual tasks

3 ranking models:

• BM25

• probabilistic

• statistical

BM25, probabilistic: title + description → parsing → stop word removal

statistical: title + description → parsing → stop word removal, + compound splitting, morphological variations

Indexing

parsing

stop words are not removed

Ranking functions

BM25

k1 & b parameters:

the best match for 2002 Dutch

probabilistic

urn model

coordination level ranking

statistical

a set of clues

degree of significance
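The role of the k1 and b parameters in BM25 can be illustrated with a minimal sketch. The slides do not give the exact formula used, so the code below assumes the common Okapi BM25 form with a smoothed idf; the function name `bm25_scores` and the defaults k1=1.2, b=0.75 are illustrative assumptions, not the values tuned on the 2002 Dutch data.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    k1 controls term-frequency saturation; b controls how strongly
    scores are normalized by document length. Both defaults are
    common textbook values (assumption, not the authors' settings).
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed idf; the +1 inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical toy collection, already parsed into tokens.
docs = [
    ["oce", "printer", "printer"],
    ["clef", "retrieval"],
    ["printer", "retrieval", "clef", "dutch"],
]
scores = bm25_scores(["printer"], docs)
```

In this toy collection the first document scores highest for the query "printer": it contains the term twice and is no longer than average, while the third document contains it once and is longer than average.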

Results

Name of the run       Number of retrieved relevant documents   Average precision   R-precision
Swedish BM25          729 out of 889                           0.3584              0.3585
Swedish probabil.     633 out of 889                           0.2716              0.2743
Italian BM25          759 out of 809                           0.4361              0.4287
Italian probabil.     731 out of 809                           0.3805              0.3865
French BM25           894 out of 946                           0.4601              0.4273
French probabil.      865 out of 946                           0.4188              0.4044
Finnish BM25          417 out of 483                           0.3570              0.3230
Finnish probabil.     407 out of 483                           0.3031              0.2624
Spanish BM25          2109 out of 2368                         0.4156              0.4094
Spanish probabil.     2025 out of 2368                         0.3500              0.3696
German BM25           1482 out of 1825                         0.3858              0.3838
German probabil.      1337 out of 1825                         0.3017              0.3088
Dutch BM25            1438 out of 1577                         0.4561              0.4438
Dutch probabil.       1336 out of 1577                         0.4049              0.3652
Dutch statist. 2001   1375 out of 1577                         0.4253              0.3940
Dutch statist. 2002   1378 out of 1577                         0.4336              0.3983
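The two measures reported above can be computed per query as in the sketch below. It uses the standard definitions of (non-interpolated) average precision and R-precision, and the reported figures would then be means over all queries of a run. The function names and the document ids in the example are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at each rank where a relevant doc appears,
    divided by the total number of relevant docs for the query."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def r_precision(ranked_ids, relevant_ids):
    """Precision after R docs are retrieved, where R is the number of
    relevant docs for the query."""
    relevant = set(relevant_ids)
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for doc_id in ranked_ids[:r] if doc_id in relevant) / r


# Hypothetical ranking for one query with two relevant documents.
ranking = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3"}
ap = average_precision(ranking, relevant)
rp = r_precision(ranking, relevant)
```

Both measures treat every document without a positive judgement as irrelevant, which is the point taken up in the remark on evaluation measures.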

Conclusions

• the BM25 model outperforms the probabilistic one

• mathematical correctness - not the best guideline

• for a better retrieval model: ‘knowledge’ about the data collection, topics and assessments

Remark on evaluation measures

Dutch data from 2001

top T=1000 docs; top N are read; M participants

at most N*M relevance judgements

16774 relevance judgements for 50 queries => about 335 per query

1224 relevant documents for 50 queries => about 25 per query

about 60-70% of the docs in the top 1000 have no judgement: unknown?! = irrelevant?!

A proposal:

Read all T docs. (T=100? 200?)
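The arithmetic behind the remark can be checked with a short sketch. It takes the figures quoted above for the 2001 Dutch data and assumes that a run returns the full T=1000 documents per query; the function name `pool_statistics` is hypothetical.

```python
def pool_statistics(judgements, relevant, queries, top_t):
    """Return (judged docs per query, relevant docs per query,
    lower bound on the unjudged fraction of a run's top `top_t` docs)."""
    judged_per_query = judgements / queries
    relevant_per_query = relevant / queries
    # A run cannot have more judged docs in its top T than exist in the pool.
    max_judged_fraction = min(1.0, judged_per_query / top_t)
    return judged_per_query, relevant_per_query, 1.0 - max_judged_fraction


judged, rel, unjudged = pool_statistics(
    judgements=16774, relevant=1224, queries=50, top_t=1000
)
```

With these inputs, on average about 335 documents per query carry a judgement and about 25 are relevant, so roughly two thirds of a full top-1000 list is unjudged, which is in line with the 60-70% quoted above. Reading all T documents for a smaller T would close that gap.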
