16
RECOD @ Placing Task of MediaEval 2016 A Ranking Fusion Approach for Geographic-Location Prediction of Multimedia Objects Javier A. V. Muoz 1 , Lin Tzy Li 1 , Ícaro C. Dourado 1 , Keiller Nogueira 5 , Samuel G. Fadel 1 , Otávio A. B. Penatti 1,2 , Jurandy Almeida 1,3 , Luís A. M. Pereira 1 , Rodrigo T. Calumby 1,4 , Jefersson A. dos Santos 5 , and Ricardo da S. Torres 1 1 RECOD Lab, Institute of Computing, University of Campinas (UNICAMP), Campinas – SP 2 Advanced Technologies, SAMSUNG Research Institute Brazil, Campinas – SP 3 Inst. of Science and Technology, Federal University of São Paulo (UNIFESP), S. J. dos Campos – SP 4 Dept. of Exact Sciences, University of Feira de Santana (UEFS), Feira de Santana – BA 5 Dept. of Computer Science, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte - MG Brazil

MediaEval 2016 - RECOD at Placing Task

Embed Size (px)

Citation preview

Page 1: MediaEval 2016 - RECOD at Placing Task

RECOD @ Placing Task of MediaEval 2016

A Ranking Fusion Approach for Geographic-Location Prediction of

Multimedia Objects

Javier A. V. Munoz1, Lin Tzy Li1, Ícaro C. Dourado1, Keiller Nogueira5, Samuel G. Fadel1, Otávio A. B. Penatti1,2, Jurandy Almeida1,3, Luís A. M. Pereira1, Rodrigo T. Calumby1,4,

Jefersson A. dos Santos5, and Ricardo da S. Torres1

1RECOD Lab, Institute of Computing, University of Campinas (UNICAMP), Campinas – SP2Advanced Technologies, SAMSUNG Research Institute Brazil, Campinas – SP3Inst. of Science and Technology, Federal University of São Paulo (UNIFESP), S. J. dos Campos – SP4Dept. of Exact Sciences, University of Feira de Santana (UEFS), Feira de Santana – BA5Dept. of Computer Science, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte - MG

Brazil

Page 2: MediaEval 2016 - RECOD at Placing Task

GP-Agg

General Pipeline

Feature Extractor 1 Ranked List 1

Ranking

Feature Extractor 2 Ranked List 2

Feature Extractor m Ranked List m

RLDA

Rank Fusion

ID1 LAT1 LONG1ID2 LAT2 LONG2… … ...IDX LATX LONGX

Final Ranked List

Page 3: MediaEval 2016 - RECOD at Placing Task

Ranked list 1~m

ID1 LAT1 LONG1ID2 LAT2 LONG2… … ...IDX LATX LONGX

Top X items' lat/long defined as points of a OPF cluster

v: (lat1,long1)

edge (v, u): connect k-nn d(v, u)

d(v, u): geo-distance between v and u

k=3

u: (lat2,long2)

RLDA: Ranked List Density Analysis Based on OPF (Optimum-Path Forest): find the maximum point in a probability density function

Page 4: MediaEval 2016 - RECOD at Placing Task

GP-Agg Framework

Training Phase

● RLDA is included in set of rank aggregation functions● RLDA is the unique function that uses geographical location in the combination

Page 5: MediaEval 2016 - RECOD at Placing Task

GP-Agg Framework

Testing Phase

Page 6: MediaEval 2016 - RECOD at Placing Task

GP-Agg parameters

Parameter Value

Number of generations

20

Genetics operators

Reproduction, Mutation, Crossover

Fitness functions

FFP1, WAS

Rank Agg. methods

CombMAX, CombMIN, CombSUM, CombMED, CombANZ, CombMNZ, RLSim, BordaCount, RRF, MRA, RLDA

GP individual example:

Page 7: MediaEval 2016 - RECOD at Placing Task

Fitness Functions

FFP1=∑i=1

n

ak r ( l i)×k i×ln−1(i+k 2)

● is the relevance assigned to an element● An element in a ranked list is considered relevant if it is in a range of 1Km● k

1, k

2 are scaling factors, k

1 = 6 and k

2 = 1.2

r ( l i)∈{0,1}

; score (i)=1−log (1+d (i))log(1+Rmax)

WAS=∑i=1

n

score (i)

n

● Rmax

: the maximum distance between any two points on the Earth surface● d(i): geographic distance between the predicted location and the ground truth

Page 8: MediaEval 2016 - RECOD at Placing Task

Features

Descriptors

Textual metadata

BM25*, TF-IDF*, IBS*, LMD*

Photos edgehistogram (EHD), scalable-color (SCD), GIST, cedd, col, jhist, tamura, BIC, GoogleNet

Videos HMP

* All from Lucene package (http://lucene.apache.org/core/)

Page 9: MediaEval 2016 - RECOD at Placing Task

Submissions configuration

Type Photos Videos

Run 1 Textual GP-Agg(all desc.) GP-Agg(all desc.)

Run 2 Non-textual GP-Agg(all desc.) RLDA(HMP)

Run 3 All GP-Agg(all desc.) GP-Agg(all desc.)

Run 4 Free RLDA(IBS, BM25, LMD, TF-IDF) LMD

Page 10: MediaEval 2016 - RECOD at Placing Task

Submissions configuration

Type Photos Videos

Run 1 Textual GP-Agg(all desc.) GP-Agg(all desc.)

Run 2 Non-textual GP-Agg(all desc.) RLDA(HMP)

Run 3 All GP-Agg(all desc.) GP-Agg(all desc.)

Run 4 Free RLDA(IBS, BM25, LMD, TF-IDF) LMD

Page 11: MediaEval 2016 - RECOD at Placing Task

Results on Photos (%)

Run 1

Run 2

Run 3

Run 4

0 10 20 30 40 50 60 70

0.59

0.09

0.56

0.56

6.07

0.87

5.97

5.94

21.06

2.36

20.83

20.73

38

4.47

37.72

37.47

59.69

21.46

59.89

59.28

10m

100m

1km

10km

100km

1000km

Page 12: MediaEval 2016 - RECOD at Placing Task

Results on Videos (%)

Run 1

Run 2

Run 3

Run 4

0 10 20 30 40 50 60

0.45

0

0.51

0.37

5.74

0.03

5.82

4.03

18.69

0.15

18.46

13.51

41.56

2.46

41.2

33.02

54.51

13.54

54.77

47.67

10m

100m

1km

10km

100km

1000km

Page 13: MediaEval 2016 - RECOD at Placing Task

Results (error)

Photos (km) Videos (km)

Avg Median Avg Median

Run 1 2872 254.67 3204.80 566.96

Run 2 5664.14 5821.07 6085.23 6085.63

Run 3 2797.58 261.66 3123.38 571.24

Run 4 2919.30 279.37 3739.79 1236.51

Page 14: MediaEval 2016 - RECOD at Placing Task

Conclusions

● GP-Agg automatically discover semi-optimal combinations of ranked lists and aggregation functions

● Including RLDA into GP-Agg improved results over best configuration submitted in past year

● Use of CNN-based features for photos improved significantly results over past year in non-textual submission

Page 15: MediaEval 2016 - RECOD at Placing Task

Future Work

● Use more CNN-based features for visual content in photos and videos– PlaNet¹-inspired fine tuning

¹Tobias Weyand, Ilya Kostrikov, James Philbin, PlaNet - Photo Geolocation with Convolutional Neural Networks, ECCV 2016.

Page 16: MediaEval 2016 - RECOD at Placing Task

Acknowledgments

● FAPESP● CNPq ● CAPES● Samsung● MediaEval 2016