38th European Conference on Information Retrieval
Nicola Ferro @frrncl
University of Padua, Italy
Workshop on Modeling, Learning and Mining for Cross/Multilinguality (MultiLingMine 2016) 20 March 2016, Padua, Italy
Multilingual Information Access: What and How Well?
What
38th European Conference on Information Retrieval
#ecir2016@frrncl
N. Ferro - Multilingual Information Access: What and How Well?
Some search trends: translate + …
[Line chart, 2004-2015: search interest for "translate" combined with Facebook, Twitter, Whatsapp, Instagram, Spotify, and Youtube]
Multilingual information access is a multimedia concern.
Text is the primary enabler for multilingual information access.
Top Ten Languages in the Web
Data taken from http://www.internetworldstats.com/stats7.htm
[Bar chart, 2011: Internet users (millions) and growth (%) for the top ten languages — English, Chinese, Spanish, Japanese, Portuguese, German, Arabic, French, Russian, Korean, plus all others]
[Bar chart, 2015: Internet users (millions) and growth (%) for the top ten languages — English, Chinese, Spanish, Arabic, Portuguese, Japanese, Russian, Malay, French, German, plus all others]
Multilingual Information Access
Monolingual retrieval in non-English languages
Bilingual retrieval: A → B
Multilingual retrieval: A → A, B, ...
Multilingual retrieval: AB → A, AB, AC, B, BC, ABC, ...
Given a query in any medium and any language, select relevant items from a multilingual multimedia collection which can be in any medium and any language, and present them in the style or order most likely to be useful to the querier, with identical or near identical objects in different media or languages appropriately identified.
[D. Oard & D. Hull, AAAI Symposium on Cross-Language IR, Spring 1997, Stanford]
Typical IR Flow
[Diagram of the IR "flow" and "IR-cycle": an information need ("I need a trap to get rid of some mice") is verbalized and formulated as the query "mouse trap"; documents (DOC1: Poisonless mousetraps, DOC2: Get rid of mice, DOC3: Traps for rodents, ...) are indexed into document representations and the query into a query representation; matching the two produces results ("Traps to catch mice", "The Mousetrap", a play by Agatha Christie, "a good trap against rodents"), whose processing may trigger a new cycle ("I could get me a cat!")]
[Figure taken from M. Braschler, Multilingual Information Retrieval and Cross-Language Information Retrieval, TrebleCLEF Summer School 2009, Italy]
Possible CLIR Flow: Query Translation
Building a MLIA/CLIR system involves addressing different processing steps: indexing, translation, and matching.
[Diagram: the query is indexed into a query representation, translated into the document language, and the translated query representation is matched against the document index to produce the result]
[Figure taken from M. Braschler, Multilingual Information Retrieval and Cross-Language Information Retrieval, TrebleCLEF Summer School 2009, Italy]
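The query-translation step above can be sketched in a few lines. This is a minimal illustration, not the system described in the talk: the tiny English-to-German term list and the documents are invented for the example.

```python
# A minimal sketch of dictionary-based query translation for bilingual CLIR.
# The bilingual term list and documents are illustrative assumptions.
from collections import Counter

EN_DE = {  # hypothetical English -> German term list
    "economy": ["wirtschaft"],
    "growth": ["wachstum"],
    "market": ["markt", "boerse"],
}

def translate_query(terms):
    """Bag-of-words translation: each source term is replaced by all of
    its dictionary translations (ambiguous terms contribute every sense)."""
    out = []
    for t in terms:
        out.extend(EN_DE.get(t, [t]))  # out-of-vocabulary terms pass through
    return out

def score(query_terms, doc_terms):
    """Simple term-overlap score between translated query and document."""
    q, d = Counter(query_terms), Counter(doc_terms)
    return sum(min(q[t], d[t]) for t in q)

docs = {
    "doc1": "die wirtschaft zeigt starkes wachstum".split(),
    "doc2": "der markt reagiert auf die boerse".split(),
}
translated = translate_query("economy growth".split())
ranking = sorted(docs, key=lambda d: score(translated, docs[d]), reverse=True)
print(translated, ranking)
```

Even this toy version exposes the issues listed later in the talk: ambiguity (every sense is kept) and out-of-vocabulary terms (passed through untranslated).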
Possible MLIR Flow: Query Translation
Bilingual CLIR is maybe the "simplest scenario": we add query translation to a monolingual IR system. How should the translation step be integrated into the overall system?
[Diagram: the query representation is translated separately into each document language, each translated representation is matched against the corresponding index, and the per-language result lists are merged into a single result]
[Figure taken from M. Braschler, Multilingual Information Retrieval and Cross-Language Information Retrieval, TrebleCLEF Summer School 2009, Italy]
Possible MLIR Flow: Query and Document Translation
MLIA is a more complex setup: a series of bilingual steps, with a merging step needed to produce a single, integrated result.
[Diagram: both the query representation and the document representations are translated; matching is performed per language and the per-language results are combined into a single result list]
[Figure taken from M. Braschler, Multilingual Information Retrieval and Cross-Language Information Retrieval, TrebleCLEF Summer School 2009, Italy]
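The merging step in the flows above can be sketched as follows. This is one simple strategy (min-max score normalization followed by a score-based merge), chosen here for illustration; the run scores are invented, not CLEF data.

```python
# A minimal sketch of merging per-language result lists into one ranking,
# using min-max normalization to make scores from different retrieval
# systems/languages comparable before merging. Data is illustrative.

def minmax(run):
    """Map a {doc_id: score} run onto [0, 1]."""
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        return {d: 0.0 for d in run}
    return {d: (s - lo) / (hi - lo) for d, s in run.items()}

def merge(runs):
    """Merge several monolingual runs into a single ranked list."""
    pool = {}
    for run in runs:
        for doc, s in minmax(run).items():
            pool[doc] = max(pool.get(doc, 0.0), s)
    return sorted(pool, key=pool.get, reverse=True)

german_run = {"de_1": 12.0, "de_2": 7.5, "de_3": 3.1}
french_run = {"fr_1": 0.92, "fr_2": 0.40}
print(merge([german_run, french_run]))
```

Raw scores from different indexes are not directly comparable, which is exactly why a dedicated merging step is needed in the multilingual flow.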
MLIA: Issues to Consider
Document representation
  Language identification: cross-scripting
  Segmentation: compound words, N-grams, discourse
  Stop lists: varying length
  Normalization: diacritics, spellings, ...
  Stemming: rule-based, statistical / N-grams
  Enrichment: named entity recognition, thesaurus expansion, authorship, ...
Translation
  ambiguity, out-of-vocabulary terms, bag-of-words, ...
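Two of the document-representation steps above can be illustrated concretely: diacritics normalization and character N-gram segmentation. This is a minimal sketch using only the standard library; the sample terms are illustrative.

```python
# Diacritics normalization and character N-grams, two of the
# document-representation issues listed above.
import unicodedata

def strip_diacritics(text):
    """Remove combining marks after Unicode decomposition,
    so that e.g. 'é' and 'e' index to the same form."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")

def char_ngrams(term, n=4):
    """Character N-grams: a language-independent alternative to rule-based
    stemming, useful e.g. for compound-rich languages."""
    padded = f"_{term}_"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

print(strip_diacritics("réalité"))   # -> realite
print(char_ngrams("wirtschaft"))
```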
How Well
Why Evaluation?
“To measure is to know”
“If you cannot measure it, you cannot improve it”
William Thomson, first Baron Kelvin (1824-1907)
How Does Experimental Evaluation Work?
Cranfield Paradigm
Dates back to the mid-1960s
Makes use of experimental collections
documents
topics
relevance judgments
Ensures comparability and repeatability of the experiments
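The three components of a Cranfield-style experimental collection, and how they support comparable and repeatable scoring, can be sketched in a few lines. The documents, topic, and judgments below are invented for illustration.

```python
# A minimal sketch of a Cranfield-style test collection: documents, topics,
# and relevance judgments (qrels), plus a simple effectiveness measure.
# All data is illustrative.

documents = {"d1": "traps for rodents", "d2": "agatha christie plays"}
topics = {"t1": "how to get rid of mice"}
qrels = {("t1", "d1"): 1, ("t1", "d2"): 0}  # (topic, document) -> relevance

def precision_at_k(topic_id, ranked_docs, k):
    """Fraction of the top-k retrieved documents judged relevant;
    unjudged documents are treated as non-relevant."""
    top = ranked_docs[:k]
    return sum(qrels.get((topic_id, d), 0) for d in top) / k

print(precision_at_k("t1", ["d1", "d2"], 2))
```

Because every system is scored against the same fixed documents, topics, and judgments, experiments become directly comparable and repeatable.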
Large-scale Evaluation Campaigns
Evaluation activities are conducted in large international fora
TREC (Text REtrieval Conference), USA, since 1992
NTCIR (NII Test Collection for IR Systems), Japan, since 1999
CLEF (Conference and Labs of the Evaluation Forum), Europe, since 2000
FIRE (Forum for Information Retrieval Evaluation), India, since 2008
Share a common methodology, the Cranfield Paradigm
Adhoc-ish CLEF over the years
We are interested in carrying out a longitudinal study of the Adhoc-ish CLEF tasks, in order to understand how performance has evolved over time
It is difficult (or even impossible) to conduct this kind of study because the experimental collections are not directly comparable
Standardization
Standardization directly adjusts topic scores by the observed mean score and standard deviation for that topic over a sample of the systems.
The difficulty of a query is estimated from the scores achieved by the systems, and parameters derived from these estimates are then used to normalize the system scores.
W. Webber, A. Moffat, and J. Zobel. Score Standardization for Inter-Collection Comparison of Retrieval Systems. In Proc. SIGIR 2008, ACM Press, 51-58.
How standardization works
For every run in a collection, we have a measure m for each topic t, with per-topic mean μ_t and standard deviation σ_t.
[Plot: AP values of all the runs for topic 301 of CLEF Ad-Hoc bilingual English 2006, with μ_t and σ_t marked]
How standardization works
For each topic in a collection we can calculate the z-score of a measure m as

m' = (m - μ_t) / σ_t

[Plot: z-scored AP values (AP') of the same runs — note that they are no longer in the range [0, 1]]
[Plot: MAP vs. standardized MAP (sMAP) for the runs of CLEF monolingual French Ad-Hoc 2004]
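The per-topic z-scoring above is straightforward to implement. This is a minimal sketch of the standardization step of Webber et al.; the AP values are invented, not CLEF data.

```python
# Per-topic z-score standardization: each topic's scores are centred on the
# topic mean and scaled by the topic standard deviation, so that scores
# from easy and hard topics become comparable. Data is illustrative.
import statistics

def standardize(scores_by_topic):
    """scores_by_topic: {topic: [score of run 1, score of run 2, ...]}.
    Returns the z-scores m' = (m - mu_t) / sigma_t per topic."""
    z = {}
    for topic, scores in scores_by_topic.items():
        mu = statistics.mean(scores)
        sigma = statistics.pstdev(scores)  # std dev over the sample of systems
        if sigma == 0:
            z[topic] = [0.0] * len(scores)  # all systems tied on this topic
        else:
            z[topic] = [(m - mu) / sigma for m in scores]
    return z

ap = {"topic_301": [0.10, 0.30, 0.50], "topic_302": [0.20, 0.20, 0.80]}
print(standardize(ap))
```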
How standardization works
Normalization into the [0, 1] range by using the cumulative distribution function of the standard normal:

F_X(m') = ∫_{-∞}^{m'} (1/√(2π)) e^{-x²/2} dx

[Plot: standardized AP (sAP) of the same runs, now back in the [0, 1] range]
[Plot: MAP vs. standardized MAP (sMAP) for the runs of CLEF monolingual French Ad-Hoc 2004]
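The standard normal CDF above has no closed form, but it can be computed exactly from the error function available in the standard library. A minimal sketch:

```python
# Mapping a z-score back into [0, 1] with the standard normal CDF,
# F(z) = 0.5 * (1 + erf(z / sqrt(2))).
import math

def standard_normal_cdf(z):
    """Integral of the standard normal density from -infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# A z-score of 0 (a run exactly at the topic mean) maps to 0.5;
# large positive/negative z-scores approach 1 and 0 respectively.
print(standard_normal_cdf(0.0))
print(standard_normal_cdf(2.0), standard_normal_cdf(-2.0))
```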
How do monolingual systems behave over the years?
[Box plots: sMAP distribution of the CLEF monolingual tasks (2000-2009), grouped by language — Bulgarian (bg), German (de), Spanish (es), Farsi (fa), Finnish (fi), French (fr), Hungarian (hu), Italian (it), Dutch (nl), Portuguese (pt) — across task editions (ah, rob, gc, tel)]
How do bilingual systems behave over the years?
[Box plots: sMAP distribution of the CLEF bilingual tasks (2000-2009), grouped by language — English (en), Spanish (es), German (de), French (fr), Italian (it), Russian (ru), Portuguese (pt) — across task editions (ah, rob, gc, tel)]
Bilingual Breakdown by Source Language
[Four panels of median sMAP over the years, each comparing one source language (XX2EN) against all sources to English (X2EN): French to English, CLEF 2000-2009; Hindi, Oromo and Telugu to English, CLEF 2006-2007; Indonesian to English, CLEF 2005-2007; Chinese to English, CLEF 2001-2007]
Bilingual to Monolingual Comparison
[Bar chart: CLEF bilingual bili/mono ratio, 2000-2009, computed on the best z-score MAP, broken down by year (2002-2009), task (AH, GC, ROB, TEL), and language (bg, de, es, fa, fr, it, pt, ru); the ratio axis runs from 0% to 180%]
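The bili/mono ratio plotted above is simply the best bilingual score divided by the best monolingual score on the same target collection. A minimal sketch, with invented sMAP values:

```python
# Bilingual-to-monolingual ratio: best bilingual run over best monolingual
# run on the same target collection. The sMAP values are illustrative.

def bili_mono_ratio(bilingual_smaps, monolingual_smaps):
    """A value near 100% means crossing the language barrier costs little
    compared to retrieving directly in the target language."""
    return max(bilingual_smaps) / max(monolingual_smaps)

ratio = bili_mono_ratio([0.42, 0.55, 0.48], [0.60, 0.58])
print(f"{ratio:.0%}")
```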
How do multilingual systems behave over the years?
[Box plots: sMAP distribution of the CLEF multilingual tasks (2000-2005) — Multi-4 2000, Multi-5 2001, Multi-5 2002, Multi-4 2003, Multi-8 2003, Multi-4 2004, 2-Years-On 2005, Merging 2005]
Conclusions
Multilingual information access is still an open challenge
user tasks are rapidly changing and evolving, and unprecedented new needs emerge
Multimodality and multimediality are increasingly integral parts of, and key concerns for, multilingual information access
the problem is no longer only crossing language barriers
Evaluation has shown positive trends over the years
but it is hard to keep evaluation tasks consistent and representative of the real challenges