TASK OVERVIEW
RETRIEVING DIVERSE SOCIAL IMAGES

Bogdan Ionescu (UPB, Romania)
Alexandru Lucian Gînscǎ (CEA LIST, France)
Maia Zaharieva (TUW&UW, Austria)
Mihai Lupu (TUW, Austria)
Henning Müller (HES-SO, Switzerland)

October 20-21, Hilversum, Netherlands

UNIVERSITY POLITEHNICA OF BUCHAREST

MediaEval 2016: Task Overview: Retrieving Diverse Social Images



WHY CARE ABOUT DIVERSIFYING IMAGE SEARCH RESULTS?

GOAL OF THE TASK

For each query, participants receive a list of photos retrieved from Flickr and ranked with Flickr’s default "relevance" algorithm.

Goal: refine the results by providing a ranked list of up to 50 photos that are both relevant [1] and diverse [2] representations of the query.

[1] relevant: a common representation of the query concepts
[2] diverse: depict different visual characteristics of the query topics and subtopics with a certain degree of complementarity, i.e. most of the perceived visual information is different from one photo to another.
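To make the relevance/diversity trade-off in this goal concrete, here is a minimal sketch of one generic way to re-rank a result list: a greedy selection that rewards relevance and penalizes similarity to photos already chosen (in the spirit of maximal marginal relevance). This is not a method described on the slides or used by any participant; the function name, the `trade_off` parameter, the tag-based Jaccard similarity, and the toy data are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def greedy_diversify(
    candidates: List[str],
    relevance: Dict[str, float],
    similarity: Callable[[str, str], float],
    k: int = 50,
    trade_off: float = 0.5,
) -> List[str]:
    """Greedily pick up to k photos, balancing relevance against redundancy.

    Hypothetical helper: at each step the photo with the highest
    trade_off * relevance - (1 - trade_off) * (max similarity to the
    already selected photos) is appended to the output ranking.
    """
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def score(photo: str) -> float:
            redundancy = max(
                (similarity(photo, chosen) for chosen in selected), default=0.0
            )
            return trade_off * relevance[photo] - (1.0 - trade_off) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (invented): photo ids with tags and a relevance score in [0, 1].
tags = {
    "a": {"bridge", "river"},
    "b": {"bridge", "river"},  # near-duplicate of "a"
    "c": {"bridge", "snow"},
    "d": {"cat"},
}
relevance = {"a": 0.9, "b": 0.85, "c": 0.8, "d": 0.1}


def jaccard(p: str, q: str) -> float:
    """Tag overlap between two photos, used here as a stand-in for visual similarity."""
    return len(tags[p] & tags[q]) / len(tags[p] | tags[q])


reranked = greedy_diversify(list(tags), relevance, jaccard, k=3)
print(reranked)
```

In this toy run the near-duplicate "b" is pushed out of the top 3 even though it is highly relevant, which shows why the trade-off weight matters: too much emphasis on diversity lets irrelevant photos in, and precision drops.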

CORE CHALLENGE

QUERY = general-purpose, multi-topic term
e.g.: accordion player, blanket on sofa, construction works, dancing on the street, drinking water, dog on a leash, sand castles, sailing boat, three wheeled car, …

DATASETS
Photo by Roman Kraft

THE BASICS

Photos:
Development: 70 queries; 20,757 photos in total
Test: 64 queries; 18,717 photos in total

Available metadata for each photo/query:
query formulation
initial Flickr ranking
title, tags, description
views and user information

ADDITIONAL RESOURCES

Visual-based descriptors: CNN (Caffe framework)

Text-based descriptors: TF-IDF, SOLR indexes

User annotation credibility descriptors: provide an estimation of the quality of tag-image content relationships using visual- and text-based content analysis

Wikiset: semantic vectors for general English terms

SOME STATISTICS

(per-query and per-cluster rows give min - mean (std) - max)

                              Development Dataset     Test Dataset
# queries                     70                      64
# images                      20,757                  18,717
# images / query              176 - 297 (19) - 300    141 - 292 (29) - 300
# relevant images / query     9 - 191 (76) - 300      10 - 146 (82) - 298
# clusters / query            5 - 18 (6) - 25         4 - 16 (6) - 25
# images / cluster            1 - 11 (14) - 179       1 - 9 (10) - 100

EVALUATION
Photo by John-Mark Kuznietsov

RUN SUBMISSION

Required runs:

run 1: automated using visual information only

run 2: automated using textual information only

run 3: automated using textual-visual fusion, with no resources other than those provided by the organizers

General runs:

runs 4&5: everything allowed, e.g. human-based, hybrid human-machine, using external resources, etc.

OFFICIAL METRICS

Precision @ X = R/X (P@X), where X is the cutoff point and R is the number of relevant images among the top X results

Cluster Recall @ X = Nc/N (CR@X), where N is the total number of clusters for the current query and Nc is the number of different clusters represented in the top X images

F1@X (harmonic mean of CR@X and P@X)

Metrics are reported for X = {5, 10, 20, 30, 40, 50}
Official ranking: F1@20
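The three metrics can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the official evaluation tool: the function names are hypothetical, and the ground truth is assumed to be a mapping `cluster_of` from each relevant photo id to its cluster id, so that photos missing from the mapping count as non-relevant.

```python
from typing import Dict, List, Set


def precision_at(ranking: List[str], relevant: Set[str], x: int) -> float:
    """P@X = R / X, with R the number of relevant photos among the top X."""
    return sum(1 for photo in ranking[:x] if photo in relevant) / x


def cluster_recall_at(ranking: List[str], cluster_of: Dict[str, int], x: int) -> float:
    """CR@X = Nc / N: share of the query's clusters represented in the top X."""
    total_clusters = len(set(cluster_of.values()))
    if total_clusters == 0:
        return 0.0
    covered = {cluster_of[photo] for photo in ranking[:x] if photo in cluster_of}
    return len(covered) / total_clusters


def harmonic_mean(p: float, cr: float) -> float:
    """Harmonic mean of two non-negative values; 0 when both are 0."""
    return 2 * p * cr / (p + cr) if p + cr > 0 else 0.0


def f1_at(ranking: List[str], cluster_of: Dict[str, int], x: int) -> float:
    """F1@X = harmonic mean of P@X and CR@X for a single query."""
    p = precision_at(ranking, set(cluster_of), x)
    cr = cluster_recall_at(ranking, cluster_of, x)
    return harmonic_mean(p, cr)


# Toy query (invented ids): three clusters in the ground truth, "p3" is not relevant.
cluster_of = {"p1": 0, "p2": 0, "p4": 1, "p9": 2}
ranking = ["p1", "p2", "p3", "p4"]

print(precision_at(ranking, set(cluster_of), 4))
print(cluster_recall_at(ranking, cluster_of, 4))
print(f1_at(ranking, cluster_of, 4))
```

As a sanity check, `harmonic_mean` reproduces the single-query "Hanging bridge" figures reported later in the slides: P@20 = 0.20 with CR@20 = 0.25 gives 0.22, and P@20 = 0.95 with CR@20 = 0.75 gives 0.84 after rounding.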

Flickr Baseline Results

[Figures: histograms of the number of queries (0 to 30) by CR@20 and by P@20 (0 to 1), two panels each]

Development data: P@20 = 0.6979, CR@20 = 0.3117, F1@20 = 0.4674
Test data: P@20 = 0.5531, CR@20 = 0.3609, F1@20 = 0.4122

BENCHMARK RESULTS 2016
Photo by Andrew Branch

PARTICIPANTS

Survey: 13 respondents were interested in the task, 8 very interested

Registration: 14 teams registered from 10 different countries

Runs submission: 6 teams (incl. 2 organizers-related teams) finished the task

Workshop: 5 teams participating

SUBMITTED RUNS (29)

Runs 1-3 are the required runs; runs 4 and 5 are the general runs.

Team             Country   Run 1 (visual)   Run 2 (text)   Run 3 (visual-text)   Run 4                  Run 5
IMS*             Austria   ✓                ✓              ✓                     ✓ (visual-text)        ✗
LAPI*            Romania   ✓                ✓              ✓                     ✓ (credibility)        ✓ (visual-text-credibility)
RECOD            Brazil    ✓                ✓              ✓                     ✓ (visual-text)        ✓ (visual-text)
UNED             Spain     ✓                ✓              ✓                     ✓ (text-human)         ✓ (visual-text)
UPMC             France    ✓                ✓              ✓                     ✓ (text-credibility)   ✓ (visual-text-credibility)
USS-ENIS-REGIM   Tunisia   ✓                ✓              ✓                     ✓ (visual)             ✓ (visual-text-credibility)

*organizers-related team

OFFICIAL RANKING (F1@20)

Team             Best Run                              P@20     CR@20    F1@20
UPMC             run 3 (visual-text)                   0.6961   0.4938   0.5532
LAPI*            run 4 (credibility)                   0.5484   0.4374   0.4638
UNED             run 4 (text-human)                    0.5734   0.4252   0.4597
IMS*             run 3 (visual-text)                   0.5430   0.4130   0.4471
RECOD            run 5 (visual-text)                   0.5156   0.4065   0.4379
Flickr Baseline                                        0.5531   0.3609   0.4122
USS-ENIS-REGIM   run 5 (visual-text-credibility)       0.4180   0.3538   0.3637
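In the official ranking, every reported F1@20 is lower than the harmonic mean of the same row's P@20 and CR@20; for UPMC, 2 · 0.6961 · 0.4938 / (0.6961 + 0.4938) ≈ 0.5778, against a reported 0.5532. One explanation consistent with these numbers, though not stated on the slides, is that each metric is computed per query and then averaged over queries. The sketch below illustrates the effect; the per-query values are invented and the helper name is hypothetical.

```python
from statistics import mean


def harmonic_mean(p: float, cr: float) -> float:
    """Harmonic mean of two non-negative values; 0 when both are 0."""
    return 2 * p * cr / (p + cr) if p + cr > 0 else 0.0


# Hypothetical (P@20, CR@20) pairs for three queries, not taken from the task data.
per_query = [(0.9, 0.3), (0.5, 0.5), (0.2, 0.8)]

mean_p = mean(p for p, _ in per_query)
mean_cr = mean(cr for _, cr in per_query)

# F1 computed once from the averaged P and CR.
f1_of_means = harmonic_mean(mean_p, mean_cr)

# F1 computed per query, then averaged (the assumed reporting convention).
mean_of_f1 = mean(harmonic_mean(p, cr) for p, cr in per_query)

print(f"F1 of mean P and CR: {f1_of_means:.4f}")
print(f"mean of per-query F1: {mean_of_f1:.4f}")

# The same helper applied to the reported UPMC averages.
upmc_from_means = harmonic_mean(0.6961, 0.4938)
print(f"UPMC, harmonic mean of reported averages: {upmc_from_means:.4f}")
```

Under this reading, the averaged F1@20 cannot be recomputed from the averaged P@20 and CR@20 in the table; it has to be derived from per-query results.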

[Figure: CR@20 (0.3 to 0.5) plotted against P@20 (0.4 to 0.7) for the Flickr baseline, IMS, LAPI, RECOD, UNED, UPMC, and USS-ENIS-REGIM; the points for Flickr, UPMC, LAPI, and UNED are labeled]

[Figures: CR@X and P@X (0.1 to 0.8) at cutoffs @5, @10, @20, @30, @40, @50 for the Flickr baseline, IMS, LAPI, RECOD, UNED, UPMC, and USS-ENIS-REGIM]

Example query: Hanging bridge

Top 20 Flickr results: P@20 = 0.20, CR@20 = 0.25, F1@20 = 0.22

Best achieved result: P@20 = 0.95, CR@20 = 0.75, F1@20 = 0.84

Labels shown with the best achieved result: bottom up view, mid of the nature, facing a hanging bridge, starting point, winter view, colourful bridge

LESSONS LEARNED

The dataset is getting very complex and challenging

Different queries favour different approaches

Potential subjectivity in the annotation process

Still few Creative Commons (CC) resources on Flickr

Acknowledgments
WWTF Project ICT12-010: Maia Zaharieva, Vienna University of Technology, Austria.
Task auxiliaries: Adrian Popescu, CEA LIST, France & Bogdan Boteanu, UPB, Romania.
Task supporters: Gabi Constantin, Lukas Diem, Ivan Eggel, Laura Fluerătoru, Ciprian Ionașcu, Corina Macovei, Cătălin Mitrea, Irina Emilia Nicolae, Mihai Gabriel Petrescu, Andrei Purică.

Thank You! and …

Photo by Mario Salvo

… please share media online using