Welcome to CLEF 2007

Carol Peters
ISTI-CNR, Pisa, Italy

CLEF 2007 Workshop, Budapest, Hungary 19-21 September 2007

CLEF Objectives

Stimulate the development of multilingual IR systems for European languages

Create a CLIR/MLIA community

Construct publicly available test suites

Conduct annual evaluation campaigns

Design tracks and tasks to meet emerging needs and to stimulate research in the "right" direction

CLEF Coordination

CLEF is coordinated by the Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Pisa. The following institutions are contributing to the organisation of the different tracks of the CLEF 2007 campaign:

Centre for the Evaluation of Human Language and Multimodal Communication Technologies (CELCT), Trento, Italy
College of Information Studies and Institute for Advanced Computer Studies, U. Maryland, USA
Dept. of Computer Science, U. Indonesia
Depts. of Computer Science & Medical Informatics, RWTH Aachen U., Germany
Dept. of Computer Science and Information Systems, U. Limerick, Ireland
Dept. of Computer Science and Information Engineering, National U. Taiwan
Dept. of Information Engineering, U. Padua, Italy
Dept. of Information Sci, U. Hildesheim, Germany
Dept. of Information Studies, U. Sheffield, UK
Evaluations and Language Resources Distribution Agency Sarl, Paris, France
Fondazione Bruno Kessler FBK-irst, Trento, Italy
German Research Centre for Artificial Intelligence, DFKI, Saarbrücken, Germany
Information and Language Processing Systems, U. Amsterdam, Netherlands
IZ Bonn, Germany
Inst. for Information Technology, Hyderabad, India
Inst. of Formal and Applied Linguistics, Charles University, Czech Rep.
LSI-UNED, Madrid, Spain
Linguateca, Sintef, Oslo, Norway
Linguistic Modelling Lab., Bulgarian Acad. Sci.
Microsoft Research Asia
NIST, USA
Biomedical Informatics, Oregon Health and Science University, USA
Research Computing Center of Moscow State U.
Research Institute for Linguistics, Hungarian Academy of Sciences
School of Computer Science and Mathematics, Victoria U., Australia
School of Computing, DCU, Ireland
UC Data Archive and School of Information Management and Systems, UC Berkeley, USA
University "Alexandru Ioan Cuza", Iasi, Romania
U. Hospitals and U. of Geneva, Switzerland
Vienna University of Technology, Austria

CLEF Steering Committee

Maristella Agosti, U. Padova, Italy
Martin Braschler, Zurich, Switzerland
Amedeo Cappelli, ISTI-CNR & CELCT, Italy
Hsin-Hsi Chen, National Taiwan U., Taipei, Taiwan
Khalid Choukri, ELRA/ELDA, Paris, France
Paul Clough, University of Sheffield, UK
Thomas Deselaers, RWTH Aachen University, Germany
Giorgio Di Nunzio, U. Padova, Italy
David A. Evans, Clairvoyance Corporation, USA
Nicola Ferro, U. Padova, Italy
Christian Fluhr, CEA-LIST, Fontenay-aux-Roses, France
Norbert Fuhr, University of Duisburg, Germany
Fredric C. Gey, U.C. Berkeley, USA
Julio Gonzalo, LSI-UNED, Madrid, Spain
Donna Harman, NIST, USA
Gareth Jones, Dublin City University, Ireland
Franciska de Jong, University of Twente, Netherlands
Noriko Kando, NII, Tokyo, Japan
Jussi Karlgren, SICS, Sweden
Michael Kluck, German Institute for International and Security Affairs, Berlin, Germany
Natalia Loukachevitch, Moscow State University, Russia
Bernardo Magnini, ITC-irst, Trento, Italy
Thomas Mandl, U. Hildesheim, Germany
Paul McNamee, Johns Hopkins University, USA
Henning Müller, University & University Hospitals of Geneva, Switzerland
Douglas W. Oard, University of Maryland, USA
Anselmo Peñas, LSI-UNED, Madrid, Spain
Maarten de Rijke, University of Amsterdam, Netherlands
Diana Santos, Linguateca, Sintef, Oslo, Norway
Jacques Savoy, University of Neuchatel, Switzerland
Peter Schäuble, Eurospider Information Technologies, Switzerland
Richard Sutcliffe, University of Limerick, Ireland
Max Stempfhuber, Informationszentrum Sozialwissenschaften Bonn, Germany
Hans Uszkoreit, German Research Center for Artificial Intelligence (DFKI), Germany
Felisa Verdejo, LSI-UNED, Madrid, Spain
José Luis Vicedo, University of Alicante, Spain
Ellen Voorhees, NIST, USA
Christa Womser-Hacker, University of Hildesheim, Germany

CLEF 2007: Track Coordinators

Ad Hoc: Giorgio Di Nunzio, Nicola Ferro and Thomas Mandl

Domain-Specific: Vivien Petras, Stefan Baerisch and Maximilian Stempfhuber

QA@CLEF: Danilo Giampiccolo, Bernardo Magnini, Anselmo Peñas, Christelle Ayache, Petya Osenova, Maarten de Rijke, Bogdan Sacaleanu, Diana Santos and Richard Sutcliffe

ImageCLEF: Allan Hanbury, Paul Clough, Henning Müller, Thomas Deselaers, Michael Grubinger, Jayashree Kalpathy-Cramer and William Hersh

CL-SR: Douglas W. Oard, Gareth J. F. Jones and Pavel Pecina

WebCLEF: Valentin Jijkoun and Maarten de Rijke

GeoCLEF: Thomas Mandl, Fredric Gey, Giorgio Di Nunzio, Nicola Ferro, Ray Larson, Mark Sanderson, Diana Santos, Christa Womser-Hacker and Xing Xie

CLEF 2007: Participating Groups

Brown U., USA
California State U. San Marcos, USA **
Charles U., Prague, Czech Rep.
Daedalus & Madrid Univs, Spain ****
Ching Yun Univ., Taiwan
DFKI - Artificial Intelligence, DE ****
Dokuz Eylul U., Turkey *
Dublin City U. - Comp. Sci., Ireland ***
Fondazione Bruno Kessler ********
Helsinki U. of Technology
Hungarian Acad. Sci.
IDIAP Research Inst., CH
Imperial College, London, UK **
Inst. Nac. Astrofisica, Optica, Electronica, Mexico **
Indian Statistical Inst., India *
Indian Inst. Technology (IIT-Bombay)
Indian Inst. Technology (IIT-Kharagpur)
Inst. Infocomm Research, Singapore **
Inst. Superior Técnico (DEI-IST)
IPAL-CNRS (IR2), Singapore ****
IRIT / SIG - Toulouse *****
Jadavpur University, Kolkata, India
Johns Hopkins U., USA *******
Language Computer Corp., USA *
LIMSI-CNRS, France ****
Linguateca-Sintef, Norway ***
Linguit Ltd, UK
Microsoft Asia *
Microsoft India
MRIM Group - LIG, Grenoble *
Nat. Inst. Informatics, Japan ***
Nat. Taiwan U. - Comp. Sci. *****
Open Text Corp. (ex Hummingbird)
Oregon Health & Sci. U., USA **
Priberam Informatica, Portugal *
Research Inst. for AI of Romanian Academy *
RWTH Aachen - CS, Germany ***
RWTH Aachen - Med. Inf., DE ***
SUNY Buffalo - Informat, USA ****
SYNAPSE Développement, France **
Tech U. Chemnitz, Germany *
Tokyo Inst. Technology, Japan *
U. Alicante, Spain (2) ******
U. Al. I. Cuza Iasi, Romania *
U. Amsterdam - Informatics, NL ******
U. Basel, Switzerland
U. Chicago, USA **
U. Concordia - CINDI, Canada **
U. Concordia - CLAK, Canada
U. Coruna & U. Sunderland, ES/UK *
Univ. Evora, Portugal **
U. Freiburg - Pattern Recog., Germany
U. & Hospitals Geneva, CH ***
U. Groningen - Inf. Sci, The Netherlands ** (2)
U. Hagen - IICS, Germany ****
U. Hildesheim - Inf. Sci, Germany ****
U. Indonesia - Comp. Sci, Indonesia **
U. Jaen - Intell. Systems, Spain ******
U. Liege - Elect. Eng. & CS, Belgium **
U. Lisbon - Informatics, Portugal ***
Univ. Macquarie, Australia
Univ. Nacional Colombia
U. Neuchatel - Informatique, Switzerland ******
Univ. Nottingham, UK
U. Ottawa - IT & Eng, Canada *
U. Politecnica Catalunya - TALP, Spain **
U. Politecnica Valencia - Comp. Sci, Spain **
U. Porto, Portugal *
U. Salamanca - REINA, Spain *****
U. Stockholm, NLP, Sweden ***
U. Tampere, Finland ****
U. Wolverhampton, UK *
UC Berkeley - IM&S-1, USA *******
UNED-LSI, Spain ******
Univ. West Bohemia, Czech Rep. *
Vienna Univ. Technology, Austria
Xerox XRCE, France *

CLEF: Trend in Participation

Europe = 51 (59.5); N. America = 14 (4.5); Asia = 14 (10); S. America = 1 (4); Oceania = 1 (2)

[Chart: CLEF 2000-2007 Participation. Number of participating groups per year, 2000 to 2007 (scale 0-100), broken down by region: Europe, Asia, North America, South America, Oceania.]

CLEF 2007 Tracks

multilingual textual document retrieval on news collections (Ad Hoc)
mono- and cross-language information retrieval on structured scientific data (Domain-Specific)
multiple language question answering (QA@CLEF)
cross-language retrieval in image collections (ImageCLEF)
cross-language spoken document retrieval (CL-SR)
multilingual retrieval of Web documents (WebCLEF)
cross-language geographical retrieval (GeoCLEF)

Plus: CLEF@SemEval and CLEF@MorphoChallenge

No. of Participants per Track

Ad Hoc: 22 (25)
Domain-Specific: 5 (4)
iCLEF: 0 (3)
QA@CLEF: 28 (37)
ImageCLEF: 35 (25)
CL-SR: 8 (6)
WebCLEF: 4 (8)
GeoCLEF: 13 (17)

CLEF 2000-2007 Tracks

[Chart: participating groups per track by year. X-axis: Years (2000 to 2007); y-axis: Participating Groups (0-40). Series: AdHoc, DomSpec, iCLEF, CL-SR, QA@CLEF, ImageCLEF, WebCLEF, GeoCLEF.]

CLEF 2007: Test Collections

2000

News documents in 4 languages
GIRT German social science database

2007

CLEF multilingual comparable corpus of more than 3M news docs in 13 languages: CZ, DE, EN, ES, FI, FR, IT, NL, RU, SV, PT, BG and HU
GIRT-4 social science database in EN and DE; Russian ISISS collection; Cambridge Sociological Abstracts
Malach collection of conversational speech derived from the Shoah archives, EN & CZ
EuroGOV, a multilingual collection of approx. 3M webpages crawled from European governmental sites
IAPR TC-12 photo database; PASCAL VOC 2006 training data
ImageCLEFmed radiological database consisting of 6 distinct datasets
IRMA collection in EN and DE for automatic medical image annotation: 10,000 images

CLEF 2007: Highlights

Slight fall in participation: 81 groups in 2007 (90 in 2006); workshop: >115 participants (130 in 2006)

Expansion of test-suites
Ad Hoc: mixed results, but good success of the non-European topic languages task
Domain-specific holds its own!
Enormous success of ImageCLEF
Confirmation of interest in QA@CLEF, GeoCLEF and CL-SR
iCLEF: didn't happen
WebCLEF: what happened???

CLEF 2006 Proceedings ???

CLEF 2006 Proceedings – DID HAPPEN – A Miracle?

CLEF 2006 Proceedings

Evaluation of Multilingual and Multi-modal Information Retrieval
7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, Alicante, Spain, September 2006, Revised Selected Papers
Lecture Notes in Computer Science, Vol. 4730
Peters, C.; Clough, P.; Gey, F.C.; Karlgren, J.; Magnini, B.; Oard, D.W.; de Rijke, M.; Stempfhuber, M. (Eds.) 2006

2006: Points for Discussion

What new tasks/evaluation methodologies are needed to address more advanced information requirements?

How can we best reduce the gap between research and application communities?

Who are the users?

The challenge represented by i2010

Does CLEF have a future?

Treble-CLEF

The CLEF research results have led to the development of a new generation of multilingual retrieval system prototypes

BUT lack of technology transfer

Treble-CLEF will extend the CLEF activity by:

continuing to promote MLIA R&D via evaluation campaigns;
providing a consistent training activity: tutorials, workshops, summer school;
producing best practice guidelines for system implementation;
providing resources to encourage multilingual system development.

Treble-CLEF will begin activity with a brainstorming workshop in January 2008