Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


Page 1: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy

Welcome to CLEF 2008

Carol Peters, ISTI-CNR Pisa, Italy

Page 2: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy

CLEF 2008 Workshop, Aarhus, Denmark 17-19 September 2008

CLEF Objectives

Stimulate the development of multilingual IR systems (for European languages!)
Create a CLIR/MLIA community
Construct publicly available test suites
Conduct annual evaluation campaigns
Design tracks/tasks to meet emerging needs and to stimulate research in the "right" direction
Objective: truly multilingual/multimedia systems

Page 3: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy

Come to CLEF – and see Europe!

CLEF 2000 Lisbon
CLEF 2001 Darmstadt
CLEF 2002 Rome
CLEF 2003 Trondheim
CLEF 2004 Bath
CLEF 2005 Vienna
CLEF 2006 Alicante
CLEF 2007 Budapest
CLEF 2008 Aarhus

Page 4: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


CLEF 2008 Coordination

CLEF is coordinated by the Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Pisa. The following institutions are contributing to the organisation of the different tracks of the CLEF 2008 campaign:

Athena Research Center, Greece
Business Information Systems, U. Applied Sciences Western Switzerland, Sierre, Switzerland
Centre for Evaluation of Human Language & Multimodal Communication Technologies (CELCT), Italy
Centrum voor Wiskunde en Informatica, Amsterdam, NL
Computer Science Department, U. Basque Country, Spain
Computer Vision and Multimedia Lab, U. Geneva, CH
Data Base Research Group, U. Tehran, Iran
Dept. of Computer Science, U. Indonesia
Dept. of Computer Science & Medical Informatics, RWTH Aachen U., Germany
Dept. of Computer Science and Information Systems, U. Limerick, Ireland
Dept. of Medical Informatics and Clinical Epidemiology, Oregon Health and Science U., USA
Dept. of Information Engineering, U. Padua, Italy
Dept. of Information Science, U. Hildesheim, Germany
Dept. of Information Studies, U. Sheffield, UK
Dept. Medical Informatics, U. Hospitals and University of Geneva, Switzerland
Evaluations and Language Resources Distribution Agency, Paris, France
German Research Centre for Artificial Intelligence (DFKI)
GESIS-IZ Social Science Information Centre, Germany
Information and Language Processing Systems, U. Amsterdam, The Netherlands
Information Science, U. Groningen, The Netherlands
Institute of Computer Aided Automation, Vienna University of Technology, Austria
Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), Orsay, France
U. Nacional de Educación a Distancia, Madrid, Spain
Linguateca, Sintef, Oslo, Norway
Linguistic Modelling Lab., Bulgarian Acad. Sci.
Microsoft Research Asia
NIST, USA
Research Computing Center of Moscow State U.
Research Inst. Linguistics, Hungarian Acad. Sciences
School of Computer Science and Mathematics, Victoria U., Australia
School of Computing, DCU, Ireland
TALP, U. Politècnica de Catalunya, Barcelona, Spain
UC Data Archive and School of Information Management and Systems, UC Berkeley, USA
U. "Alexandru Ioan Cuza", IASI, Romania

Page 5: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


CLEF Steering Committee

Maristella Agosti, U. Padova, Italy
Martin Braschler, Zurich, Switzerland
Amedeo Cappelli, ISTI-CNR & CELCT, Italy
Hsin-Hsi Chen, National Taiwan U., Taipei, Taiwan
Khalid Choukri, ELRA/ELDA, Paris, France
Paul Clough, University of Sheffield, UK
Thomas Deselaers, RWTH Aachen University, Germany
Giorgio Di Nunzio, U. Padova, Italy
David A. Evans, Clairvoyance Corporation, USA
Nicola Ferro, U. Padova, Italy
Christian Fluhr, CEA-LIST, Fontenay-aux-Roses, France
Norbert Fuhr, University of Duisburg, Germany
Fredric C. Gey, U.C. Berkeley, USA
Julio Gonzalo, LSI-UNED, Madrid, Spain
Donna Harman, NIST, USA
Gareth Jones, Dublin City University, Ireland
Franciska de Jong, University of Twente, Netherlands
Noriko Kando, NII, Tokyo, Japan
Jussi Karlgren, SICS, Sweden
Michael Kluck, German Institute for International and Security Affairs, Berlin, Germany
Natalia Loukachevitch, Moscow State University, Russia
Bernardo Magnini, ITC-irst, Trento, Italy
Thomas Mandl, U. Hildesheim, Germany
Paul McNamee, Johns Hopkins University, USA
Henning Müller, University & University Hospitals of Geneva, Switzerland
Douglas W. Oard, University of Maryland, USA
Anselmo Peñas, LSI-UNED, Madrid, Spain
Maarten de Rijke, University of Amsterdam, Netherlands
Diana Santos, Linguateca, Sintef, Oslo, Norway
Jacques Savoy, University of Neuchatel, Switzerland
Peter Schäuble, Eurospider Information Technologies, Switzerland
Richard Sutcliffe, University of Limerick, Ireland
Max Stempfhuber, Informationszentrum Sozialwissenschaften Bonn, Germany
Hans Uszkoreit, German Research Center for Artificial Intelligence (DFKI), Germany
Felisa Verdejo, LSI-UNED, Madrid, Spain
José Luis Vicedo, University of Alicante, Spain
Ellen Voorhees, NIST, USA
Christa Womser-Hacker, University of Hildesheim, Germany

Page 6: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


CLEF 2008: Track Coordinators

Ad Hoc: Abolfazl AleAhmad, Hadi Amiri, Eneko Agirre, Giorgio Di Nunzio, Nicola Ferro, Thomas Mandl, Nicolas Moreau, Vivien Petras

Domain-Specific: Vivien Petras, Stefan Baerisch

iCLEF: Paul Clough, Julio Gonzalo, Jussi Karlgren

QA@CLEF: Danilo Giampiccolo, Anselmo Peñas, Pamela Forner, Iñaki Alegria, Corina Forăscu, Nicolas Moreau, Petya Osenova, Prokopis Prokopidis, Paulo Rocha, Bogdan Sacaleanu, Richard Sutcliffe, Erik Tjong Kim Sang, Alvaro Rodrigo, Jordi Turmo, Pere Comas, Sophie Rosset, Lori Lamel, Djamel Mostefa

ImageCLEF: Allan Hanbury, Paul Clough, Thomas Arni, Mark Sanderson, Henning Müller, Thomas Deselaers, Thomas Deserno, Michael Grubinger, Jayashree Kalpathy-Cramer, and William Hersh

WebCLEF: Valentin Jijkoun and Maarten de Rijke

GeoCLEF: Thomas Mandl, Fredric Gey, Giorgio Di Nunzio, Nicola Ferro, Ray Larson, Mark Sanderson, Diana Santos, Paula Carvalho

VideoCLEF: Martha Larson, Gareth Jones

INFILE: Djamel Mostefa

DIRECT: Marco Dussin, Giorgio Di Nunzio, Nicola Ferro

Page 7: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


CLEF 2008: Participating Groups

Page 8: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


CLEF: Trend in Participation

Participating groups by region, 2008 (2007 in parentheses): Europe = 69 (51); N. America = 12 (14); Asia = 15 (14); S. America = 3 (1); Africa = 1 (0)

[Figure: CLEF 2000 - 2008 Participation, per year (2000 to 2008) and by region: Oceania, Africa, South America, North America, Asia, Europe]

Page 9: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


CLEF 2008 Tracks

Multilingual textual document retrieval (Ad Hoc)
Mono- and cross-language information retrieval on structured scientific data (Domain-Specific)
Interactive cross-language retrieval (iCLEF)
Multiple language question answering (QA@CLEF)
Cross-language retrieval in image collections (ImageCLEF)
Multilingual retrieval of web documents (WebCLEF)
Cross-language geographical information retrieval (GeoCLEF)

Pilots:
Cross-language Video Retrieval (VideoCLEF)
Multilingual Information Filtering (INFILE)

Page 10: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


No. of Participants per Track

Ad Hoc: 26 (22)
Domain-Specific: 6 (5)
iCLEF: 6 (na)
QA@CLEF: 29 (28)
ImageCLEF: 42 (35)
WebCLEF: 3 (4)
GeoCLEF: 11 (13)

plus VideoCLEF: 5 and INFILE: 1

Page 11: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


CLEF 2000 – 2008 Participation per Track

[Figure: CLEF 2000 - 2008 Tracks. Participating groups (y-axis) per year, 2000 to 2008 (x-axis), for each track: AdHoc, Dom Spec, iCLEF, CL-SR, QA@CLEF, ImageCLEF, WebCLEF, GeoCLEF, VideoCLEF, InFile, MorphoChall]

Page 12: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


CLEF 2008: Test Collections

2000
News documents in 4 languages
GIRT German Social Science database

2008
CLEF multilingual comparable corpus of more than 3M news docs in 15 languages: BG, CZ, DE, EN, ES, EU, FI, FR, HU, IT, NL, RU, SV, PT and Persian
The European Library Data in DE, EN, FR (>3M docs)
GIRT-4 social science database in EN and DE, Russian ISISS collection; Cambridge Sociological Abstracts
Online Flickr database
IAPR TC-12 photo database (20,000 images, captions in EN, DE)
ARRS Goldminer database (200,000 medical images)
IRMA: 10,000 images for automatic medical image annotation
INEX Wikipedia image collection (150,000 images)
Dutch / English documentary TV videos
Agence France-Presse (AFP) newswire in Arabic, French & English

Page 13: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


CLEF 2008: Highlights

Big rise in participation: 100 groups in 2008 (81 in 2007); workshop >150 participants (115 in 2007)

Expansion of test suites
Ad Hoc: new collections (TEL & Persian), new tasks
Domain-Specific holds its own!
Enormous success of ImageCLEF
Confirmation of interest in QA@CLEF, GeoCLEF
iCLEF: lots of interest
WebCLEF & INFILE: what happened???

CLEF 2008 Proceedings Published

Page 14: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


CLEF 2008 Proceedings

Advances in Multilingual and Multimodal Information Retrieval
8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, Budapest, Hungary, September 19-21, 2007, Revised Selected Papers
Series: Lecture Notes in Computer Science, Vol. 5152
Peters, C.; Jijkoun, V.; Mandl, Th.; Müller, H.; Oard, D.W.; Peñas, A.; Petras, V.; Santos, D. (Eds.)
2008, XXI, 922 p. With online files/update. Softcover, ISBN: 978-3-540-85759-4

Page 15: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy

CLEF 2000 – 2008 Results

Creation of a strong CLIR research community (increase in participation over the years)

Promotion of research in key areas (multilingual IR; results merging; cross language access in multimedia; interactive query formulation and results presentation)

Encouraged take-up of techniques/resources between research groups

Stimulated synergy between researchers from different areas (IR, NLP, Image Processing, User Interfaces, …)

Literature: Working Notes, Proceedings and other publications report state of the art plus emerging trends

Production of language resources; test suites


Page 16: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


Points for Discussion

What new tasks/evaluation methodologies are needed to address more advanced information requirements?

How can we best reduce the gap between research and application communities?

Who are the users?

Does CLEF have a future?

Page 17: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy

CLEF & TrebleCLEF

CLEF is an activity of the TrebleCLEF Coordination Action under the Seventh Framework Programme of the European Commission.

TrebleCLEF organises a set of dissemination activities in the multilingual information access field.


Page 18: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


Treble-CLEF

The CLEF research results have led to the development of a new generation of multilingual retrieval system prototypes

BUT lack of technology transfer

Treble-CLEF extends the CLEF activity by:
continuing to promote MLIA R&D via evaluation campaigns;
providing a consistent training activity: tutorials, workshops, summer school;
producing best practice guidelines for system implementation;
providing resources to encourage multilingual system development.

www.trebleclef.eu

Page 19: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy

Approach

Evaluation
test collections and laboratory evaluation
user evaluation
log analysis

Best Practices & Guidelines
system-oriented aspects of MLIA applications
collaborative user studies
user-oriented aspects of MLIA interfaces

Dissemination and Training
tutorials
workshops
summer school

Page 20: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


Treble-CLEF Events

Workshop on Novel Methodologies for Evaluation in Information Retrieval, ECIR'08, Glasgow, Scotland

Best Practices Workshops
Workshop on Best Practices for the Development of Multilingual Information Access Systems, Segovia, Spain, June 2008
Workshop on Best Practices for System Developers: Bringing Multilingual Information Access to Operational Systems, Winterthur, Switzerland, October 2008
Workshop on Best Practices in Query Log Analysis, Spring 2009

MLIA Technology Day: dissemination of results of Best Practices Workshops, Fall 2009

Page 21: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


Workshop Memory Stick

Workshop Programme
List of Participants
Book of Abstracts
CLEF 2008 Questionnaire
Map at Workshop Venue
Social Dinner - 17 September 2008
TrebleCLEF
Other

Page 22: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


Treble-CLEF Summer School

Focus: how to build effective MLIA systems and how to evaluate them

Program will cover the following areas:
Multilingual Text Processing (language-specific tokenization, indexing, stemming);
Cross-Language Information Retrieval (approaches and technologies used for CLIR);
Multilingual Information Retrieval and MultiModality (querying, retrieving & presenting results from a multilingual/multimedia collection);
System Architectures and Multilinguality (theory & practice);
Resources for MLIA (information on language processing tools and linguistic resources);
Best Practices in User-oriented MLIA;
Evaluation for Multilingual Systems and Components.

Page 23: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy


CLEF 2008 Questionnaire

The aim of the questionnaire is to collect information on the current needs of MLIA system developers in terms of applications, resources and evaluation activities.

Complete the questionnaire online at www.trebleclef.eu/clef_2008_questionnaire.php

Page 24: Welcome to CLEF 2008 Carol Peters ISTI-CNR Pisa, Italy

CLEF 2008

Thank you for your attention and ENJOY THE Workshop!
