Opening presentation of the Social Event Detection (SED) task at MediaEval 2012, October 2012, Pisa, Italy
Social Event Detection (SED): Challenges, Dataset and Evaluation
Raphaël Troncy <[email protected]> Vasileios Mezaris <[email protected]> Symeon Papadopoulos <[email protected]> Emmanouil Schinas <[email protected]> Ioannis Kompatsiaris <[email protected]>
What are Events?

Events are observable occurrences grouping people, places and time: experiences documented by media.

04/10/2012 - Social Event Detection (SED) Task - MediaEval 2012, Pisa, Italy
SED: bigger, longer, harder

In 2011:
- 2 challenges, 73k photos (2.43 GB), no training dataset
- 18 teams interested, 7 teams submitted runs
- Considered easy: F-measure = 85% (challenge 1), F-measure = 69% (challenge 2)

In 2012:
- 3 challenges, 1 carried over from SED 2011
- 167k photos (5.5 GB), CC licence checked
- Training dataset = SED 2011 collection
- 21 teams interested, from 15 countries; 5 teams submitted runs
- Much harder!
Three challenges (by event type and venue)

1. Find all technical events that took place in Germany in the test collection.
2. Find all soccer events taking place in Hamburg (Germany) and Madrid (Spain) in the collection.
3. Find all demonstration and protest events of the Indignados movement occurring in public places in Madrid in the collection.

For each event, we provided relevant and non-relevant example photos.

Task = detect events and provide all illustrating photos.
Dataset Construction

- Collected 167,332 Flickr photos (Jan 2009 - Dec 2011) from 4,422 unique Flickr users, all under CC licences
- All geo-tagged, in 5 cities: Barcelona (72,255), Cologne (15,850), Hannover (2,823), Hamburg (16,958), Madrid (59,043), plus 0.22% (403 photos) from EventMedia
- Altered metadata: geo-tags removed for a random 80% of the photos; 33,466 photos remain geo-tagged
- Only metadata was distributed, but the actual media (5.5 GB) were available to participants on request
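The geo-tag removal step can be sketched as follows. This is a minimal illustration, not the actual preparation script; the dictionary field names (`latitude`, `longitude`) are assumptions, not the real metadata schema.

```python
import random

def strip_geotags(photos, keep_fraction=0.2, seed=42):
    """Return a copy of the photo metadata with geo-tags removed from a
    random ~80% of the photos, leaving the rest geo-tagged.
    Field names are illustrative, not the actual SED metadata schema."""
    rng = random.Random(seed)
    out = []
    for p in photos:
        q = dict(p)  # shallow copy so the original metadata is untouched
        if rng.random() >= keep_fraction:
            q.pop("latitude", None)
            q.pop("longitude", None)
        out.append(q)
    return out
```

With `keep_fraction=0.2`, roughly one photo in five keeps its coordinates, matching the 33,466 of 167,332 photos that remained geo-tagged in the actual collection.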
Ground Truth and Evaluation Measures

CrEve annotation tool: http://www.clusttour.gr/creve/
- For each of the 6 collections, review all photos and associate them to events (which have to be created first)
- Search by text, geo-coordinates, date and user; review annotations made by others
- Use EventMedia and machine tags (upcoming:event=xxx)

Evaluation measures:
- F-score: harmonic mean of Precision and Recall
- Normalized Mutual Information (NMI): jointly considers the goodness of the photos retrieved and their correct assignment to different events
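The two measures can be sketched in a few lines of Python. This is not the official task scorer; in particular, the normalization used for NMI here (arithmetic mean of the two entropies) is one common convention and is an assumption, since the slides do not specify it.

```python
import math
from collections import Counter

def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def nmi(labels_true, labels_pred):
    """Normalized Mutual Information between two clusterings of the same
    photos, normalized by the arithmetic mean of the two entropies
    (one of several conventions; assumed here, not taken from the task)."""
    n = len(labels_true)
    ct, cp = Counter(labels_true), Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))
    mi = 0.0
    for (t, p), c in joint.items():
        mi += (c / n) * math.log((c * n) / (ct[t] * cp[p]))
    entropy = lambda cnt: -sum((c / n) * math.log(c / n) for c in cnt.values())
    denom = (entropy(ct) + entropy(cp)) / 2
    return mi / denom if denom > 0 else 1.0
```

A perfect clustering (identical up to relabeling) gives NMI = 1, and a clustering independent of the ground truth gives NMI = 0, which is what makes it a useful complement to the per-event F-score.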
What ideally should be found

- Challenge 1: 19 events, 2,234 photos (avg = 117); baseline precision (random): 0.01%
- Challenge 2: 79 events, 1,684 photos (avg = 21); baseline precision (random): 0.01%
- Challenge 3: 52 events, 3,992 photos (avg = 77); baseline precision (random): 0.02%
Who Has Participated?

- 21 teams registered (18 in 2011)
- 5 teams crossed the finish line (7 in 2011, with 2 teams participating in both years)
- One participant missing at the workshop!
Quick Summary of Approaches

2011: all but one participant used background knowledge
- Last.fm (all), Fbleague (EURECOM), PlayerHistory (QMUL)
- DBpedia, Freebase, Geonames, WordNet

2012: all but two participants used a generic approach
- IR approach, matching the queries against clusters (metadata, temporal, spatial): MISIMS
- Classification approach: topic detection with LDA, city classification with TF-IDF, event detection using peaks in the timeline of the query topics: AUTH-ISSEL
- Learning a model from the training data with an SVM: CERTH-ITI
- Background knowledge: QMUL, DISI

2012: no approach is fully automatic; all involve manual selection of some parameters (e.g. topics)
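To make the metadata/temporal/spatial clustering step concrete, here is a toy sketch: greedily group photos into candidate events by capture time and rough geographic proximity. This is an illustration under assumed thresholds, not a reconstruction of any team's actual pipeline.

```python
from dataclasses import dataclass

@dataclass
class Photo:
    pid: int
    taken: float   # capture time in hours (toy unit)
    lat: float
    lon: float

def cluster_events(photos, max_gap_hours=12.0, max_deg=0.05):
    """Greedy single-pass clustering: sort photos by capture time and
    start a new candidate event whenever the temporal gap or the
    coordinate distance to the previous photo exceeds a threshold.
    Thresholds are illustrative assumptions."""
    events, current = [], []
    for p in sorted(photos, key=lambda p: p.taken):
        if current:
            prev = current[-1]
            far = (abs(p.lat - prev.lat) > max_deg
                   or abs(p.lon - prev.lon) > max_deg)
            late = p.taken - prev.taken > max_gap_hours
            if far or late:
                events.append(current)
                current = []
        current.append(p)
    if current:
        events.append(current)
    return events
```

An IR-style system would then match each challenge query (e.g. "soccer events in Hamburg") against these candidate clusters using the textual and spatial metadata.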
Results – Challenge 1 (Technical Events)
Run          Precision  Recall  F-score  NMI
AUTHISSEL_4  76.29      94.90   84.58    0.7238
CERTH_1      43.11      11.91   18.66    0.1877
DISI_1       86.23      59.13   70.15    0.6011
MISIMS_2      2.52       1.88    2.15    0.0236
QMUL_4        3.86      12.85    5.93    0.0475
(Precision, Recall and F-score in %)
[Bar chart: F-score (%) per run]
Results – Challenge 2 (Soccer Events)
Run          Precision  Recall  F-score  NMI
AUTHISSEL_4  88.18      93.49   90.76    0.8499
CERTH_1      85.57      66.19   74.64    0.6745
DISI_1       -          -       -        -
MISIMS_2     34.49      17.25   22.99    0.1993
QMUL_4       79.04      67.12   72.59    0.6493
(Precision, Recall and F-score in %)
[Bar chart: F-score (%) per run]
Results – Challenge 3 (Indignados Events)
Run          Precision  Recall  F-score  NMI
AUTHISSEL_4  88.91      90.78   89.83    0.7380
CERTH_1      86.24      54.61   66.87    0.4654
DISI_1       86.15      47.17   60.96    0.4465
MISIMS_2     48.30      46.87   47.58    0.3088
QMUL_4       22.88      33.48   27.19    0.1988
(Precision, Recall and F-score in %)
[Bar chart: F-score (%) per run]
Conclusion

Lessons learned:
- Clear winner across all challenges: a generic approach, but with manual selection of the topics
- Background knowledge is still useful when well used

Looking at next year's SED:
- Shlomo Geva (Queensland University of Technology) + Philipp Cimiano (University of Bielefeld)
- Dataset: bigger, more diverse
- Media: photos and videos? (at least 10% videos?)
- Metadata: include some social network relationships, participation at events
- Evaluation measures: event granularity? Time/CPU?