76
Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK http://www.dcs.gla.ac.uk/~philip/ [email protected] c.uk

Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Embed Size (px)

Citation preview

Page 1: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Collections for Automatic Image Annotation and Photo Tag

Recommendation

Philip McParlane, Yashar Moshfeghi and Joemon M. Jose

University of Glasgow, UK

http://www.dcs.gla.ac.uk/~philip/[email protected]

Page 2: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Motivation for annotating images

Problems with existing automatic image annotation collections

Problems with existing photo tag recommendation collections

Flickr-AIA

Flickr-PTR Conclusions

We introduce

We introduce

Page 3: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Motivation for annotating images

Problems with existing automatic image annotation collections

Problems with existing photo tag recommendation collections

Flickr-AIA

Flickr-PTR Conclusions

We introduce

We introduce

Page 4: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

With the amount of multimedia data rapidly increasing, it becomes important to organize this content effectively

Social image sharing websites depend on manual annotation of their images.

This has a large human cost however.Plus, humans often tag with irrelevant tags (e.g. girl) or tags which are opinionated (e.g. cool) etc.

Page 5: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

With the amount of multimedia data rapidly increasing, it becomes important to organize this content effectively

Social image sharing websites depend on manual annotation of their images.

This has a large human cost however.Plus, humans often tag with irrelevant tags (e.g. girl) or tags which are opinionated (e.g. cool) etc.

Page 6: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

With the amount of multimedia data rapidly increasing, it becomes important to organize this content effectively

Social image sharing websites depend on manual annotation of their images.

This has a large human cost however.Plus, humans often tag with irrelevant tags (e.g. girl) or tags which are opinionated (e.g. cool) etc.

Page 7: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Therefore,research has focussed

on the automatic annotation of images

Automatic image annotation (AIA)

Photo TagRecommendation (PTR)

AIA CollectionsMany public collections used

PTR CollectionsMostly non-public collections used

Evaluated on

Corel5k, Corel30k, ESP Game, IAPR, Google Images, LabelMe, Washington Collection, Caltech, TrecVid 2007, Pascal 2007, MiAlbum & 4 other small collections.

The 20 most cited AIA papers on CiteSeerX revealed that at least 15 collections had been used…

For photo tag recommendation, the most popular works use their own collections.

“considers the pixels” “considers those tags already added”

Evaluated on

Page 8: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Therefore,research has focussed

on the automatic annotation of images

Automatic image annotation (AIA)

Photo TagRecommendation (PTR)

AIA CollectionsMany public collections used

PTR CollectionsMostly non-public collections used

Evaluated on

Corel5k, Corel30k, ESP Game, IAPR, Google Images, LabelMe, Washington Collection, Caltech, TrecVid 2007, Pascal 2007, MiAlbum & 4 other small collections.

The 20 most cited AIA papers on CiteSeerX revealed that at least 15 collections had been used…

For photo tag recommendation, the most popular works use their own collections.

“considers the pixels” “considers those tags already added”

Evaluated on

Page 9: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Therefore,research has focussed

on the automatic annotation of images

Automatic image annotation (AIA)

Photo TagRecommendation (PTR)

AIA CollectionsMany public collections used

PTR CollectionsMostly non-public collections used

Evaluated on

Corel5k, Corel30k, ESP Game, IAPR, Google Images, LabelMe, Washington Collection, Caltech, TrecVid 2007, Pascal 2007, MiAlbum & 4 other small collections.

The 20 most cited AIA papers on CiteSeerX revealed that at least 15 collections had been used…

For photo tag recommendation, the most popular works use their own collections.

“considers the pixels” “considers those tags already added”

Evaluated on

Page 10: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Motivation for annotating images

Problems with existing automatic image annotation collections

Problems with existing photo tag recommendation collections

Flickr-AIA

Flickr-PTR Conclusions

We introduce

We introduce

Page 11: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

In this work we consider 3 popular AIA evaluation collections used by recent work [4]

Corel5k [1] IAPR TC-12 [2] ESP Game [3]

[1] Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary. P. Duygulu et al. ECCV 02.[2] Labeling images with a computer game. L. von Ahn and L. Dabbish. CHI '04.[3] The IAPR TC-12 Benchmark- A New Evaluation Resource. M. Grubinger, et al. Visual Information Systems, 2006.[4] Baselines for Image Annotation. A. Makadia, V. Pavlovic and S. Kumar. IJCV 2010

Automatic Image Annotation

Page 12: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

What are the problems with previous automatic image annotation collections?

Page 13: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

What are the problems with previous automatic image annotation collections?

Too many collectionsThere needs to a be a single, openly available collection to reproduce experiments.

Page 14: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

What are the problems with previous automatic image annotation collections?

Too many collectionsThere needs to a be a single, openly available collection to reproduce experiments.

Annotation AmbiguityCollections use many synonyms in the annotation of images. e.g. usa/america etc.

Page 15: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

What are the problems with previous automatic image annotation collections?

Too many collectionsThere needs to a be a single, openly available collection to reproduce experiments.

Annotation AmbiguityCollections use many synonyms in the annotation of images. e.g. usa/america etc.

UnnormalizedModels are able to “exploit” popular tags by promoting them, increasing performance measures.

Page 16: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

What are the problems with previous automatic image annotation collections?

Too many collectionsThere needs to a be a single, openly available collection to reproduce experiments.

Annotation AmbiguityCollections use many synonyms in the annotation of images. e.g. usa/america etc.

UnnormalizedModels are able to “exploit” popular tags by promoting them, increasing performance measures.

Low Image QualityModels are often tested on small, low quality image collections.

Page 17: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

What are the problems with previous automatic image annotation collections?

Too many collectionsThere needs to a be a single, openly available collection to reproduce experiments.

Annotation AmbiguityCollections use many synonyms in the annotation of images. e.g. usa/america etc.

UnnormalizedModels are able to “exploit” popular tags by promoting them, increasing performance measures.

Low Image QualityModels are often tested on small, low quality image collections.

Lack of Meta-dataDespite the increase in research considering time/location etc, these collections don’t include these.

Page 18: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

What are the problems with previous automatic image annotation collections?

Too many collectionsThere needs to a be a single, openly available collection to reproduce experiments.

Annotation AmbiguityCollections use many synonyms in the annotation of images. e.g. usa/america etc.

UnnormalizedModels are able to “exploit” popular tags by promoting them, increasing performance measures.

Low Image QualityModels are often tested on small, low quality image collections.

Lack of Meta-dataDespite the increase in research considering time/location etc, these collections don’t include these.

Lack of diversityCollections often contain images taken on the same camera, at the same place by the same user.

Page 19: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

What are the problems with previous automatic image annotation collections?

Too many collectionsThere needs to a be a single, openly available collection to reproduce experiments.

Annotation AmbiguityCollections use many synonyms in the annotation of images. e.g. usa/america etc.

UnnormalizedModels are able to “exploit” popular tags by promoting them, increasing performance measures.

Low Image QualityModels are often tested on small, low quality image collections.

Lack of Meta-dataDespite the increase in research considering time/location etc, these collections don’t include these.

Lack of diversityCollections often contain images taken on the same camera, at the same place by the same user.

Location TagsLocations, such as “usa”, are impossible to identify, however, these tags are often included in ground truths.

Page 20: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

What are the problems with previous automatic image annotation collections?

Too many collectionsThere needs to a be a single, openly available collection to reproduce experiments.

Annotation AmbiguityCollections use many synonyms in the annotation of images. e.g. usa/america etc.

UnnormalizedModels are able to “exploit” popular tags by promoting them, increasing performance measures.

Low Image QualityModels are often tested on small, low quality image collections.

Lack of Meta-dataDespite the increase in research considering time/location etc, these collections don’t include these.

Lack of diversityCollections often contain images taken on the same camera, at the same place by the same user.

Location TagsLocations, such as “usa”, are impossible to identify, however, these tags are often included in ground truths.

CopyrightCorel is bound by copyright making distribution difficult.

Page 21: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

1. ProblemsAnnotation Ambiguity

All three collection ground truths contain:• Synonyms

(e.g. america/usa)• Visually identical classes

(e.g. sea/ocean)

Corel5k IAPR TC-12 ESP Game

Page 22: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

1. ProblemsAnnotation Ambiguity

All three collection ground truths contain:• Synonyms

(e.g. america/usa)• Visually identical classes

(e.g. sea/ocean)To demonstrate this, we cluster tags which share a common WordNet “synonym set” (removing irrelevant matches manually).

Corel5k IAPR TC-12 ESP Game

Page 23: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Corelpolar/arctic

ocean/sea36 of 374 tags

ice/frost

Page 24: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Corelpolar/arctic

ocean/sea

ESPbaby/child

child/kidhome/house

37 of 291 tags

36 of 374 tags

ice/frost

Page 25: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Corelpolar/arctic

ocean/sea

ESPbaby/child

child/kidhome/house

IAPRwoman/adult

bush/shrubrock/stone

37 of 291 tags

26 of 269 tags

36 of 374 tags

ice/frost

Page 26: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

31%of photos in the Corel collection contain at least 1 ambiguous tag.

Page 27: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

31%of photos in the Corel collection contain at least 1 ambiguous tag.

25%of photos in the ESP collection contain at least 1 ambiguous tag.

Page 28: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

31%of photos in the Corel collection contain at least 1 ambiguous tag.

25%of photos in the ESP collection contain at least 1 ambiguous tag.

sixty three%of photos in the IAPR collection contain at least 1 ambiguous tag.

Page 29: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Annotations: sea, usa, sky, chair

Test image #1

Page 30: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Annotations: sea, usa, sky, chair

Annotation Model #1 sea, usa, blue, water, red

Test image #1

Precision

0.4

Suggestions

Page 31: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Annotations: sea, usa, sky, chair

Annotation Model #1

Annotation Model #2

sea, usa, blue, water, red

Test image #1

ocean, america, blue, water, red

Precision

0.4

0

Suggestions

Page 32: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Annotations: sea, usa, sky, chair

Annotation Model #1

Annotation Model #2

sea, usa, blue, water, red

Test image #1

ocean, america, blue, water, red

Precision

0.4

0

Suggestions

So why do we penalize a

system which treats these

concepts differently?

It is impossible to tell from an image’s pixels, whether it is of the sea or of the ocean.

Page 33: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

2. ProblemsUnnormalised Collections

By nature, the classes used in image collections follow a long tail distribution i.e. there exist a few popular tags and many unpopular tags

This causes problems:1. Selection Bias: Popular tags exist in more training and test images.

Therefore, annotation models are more likely to test on popular classes.2. Prediction Bias: Popular tags occur in more test images. Therefore,

annotation models can potentially “cheat” by promoting only popular tags, instead of making predictions based purely on the pixels.

Corel5k IAPR TC-12 ESP Game

Page 34: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

2. ProblemsUnnormalised Collections

By nature, the classes used in image collections follow a long tail distribution i.e. there exist a few popular tags and many unpopular tags

This causes problems:1. Selection Bias: Popular tags exist in more training and test images.

Therefore, annotation models are more likely to test on popular classes.2. Prediction Bias: Popular tags occur in more test images. Therefore,

annotation models can potentially “cheat” by promoting only popular tags, instead of making predictions based purely on the pixels.

Corel5k IAPR TC-12 ESP Game

Page 35: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

2. ProblemsUnnormalised Collections

By nature, the classes used in image collections follow a long tail distribution i.e. there exist a few popular tags and many unpopular tags

This causes problems:1. Selection Bias: Popular tags exist in more training and test images.

Therefore, annotation models are more likely to test on popular classes.2. Prediction Bias: Popular tags occur in more test images. Therefore,

annotation models can potentially “cheat” by promoting only popular tags, instead of making predictions based purely on the pixels.

Corel5k IAPR TC-12 ESP Game

Page 36: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Tags

# Im

ages

To demonstrate this “prediction bias”(i.e. where annotation models can “cheat” by promoting popular tags)

We annotate each collection using the annotation model described in [6].

We split the vocabulary into 3 subsets of popular, medium frequency and unpopular tags.

For each, we suggest only the tags in each subset.

[6] Baselines for Image Annotation. A. Makadia, V. Pavlovic and S. Kumar. IJCV 2010

Corel5k IAPR TC-12 ESP Game

Page 37: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Popular tags

Tags

# Im

ages

To demonstrate this “prediction bias”(i.e. where annotation models can “cheat” by promoting popular tags)

We annotate each collection using the annotation model described in [6].

We split the vocabulary into 3 subsets of popular, medium frequency and unpopular tags.

For each, we suggest only the tags in each subset.

[6] Baselines for Image Annotation. A. Makadia, V. Pavlovic and S. Kumar. IJCV 2010

Corel5k IAPR TC-12 ESP Game

Page 38: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Popular tagsMedium Frequency

Tags

# Im

ages

To demonstrate this “prediction bias”(i.e. where annotation models can “cheat” by promoting popular tags)

We annotate each collection using the annotation model described in [6].

We split the vocabulary into 3 subsets of popular, medium frequency and unpopular tags.

For each, we suggest only the tags in each subset.

[6] Baselines for Image Annotation. A. Makadia, V. Pavlovic and S. Kumar. IJCV 2010

Corel5k IAPR TC-12 ESP Game

Page 39: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Popular tagsMedium FrequencyUnpopular Tags

Tags

# Im

ages

To demonstrate this “prediction bias”(i.e. where annotation models can “cheat” by promoting popular tags)

We annotate each collection using the annotation model described in [6].

We split the vocabulary into 3 subsets of popular, medium frequency and unpopular tags.

For each, we suggest only the tags in each subset.

[6] Baselines for Image Annotation. A. Makadia, V. Pavlovic and S. Kumar. IJCV 2010

Corel5k IAPR TC-12 ESP Game

Page 40: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK
Page 41: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Ultimately, higher annotation accuracy can be achieved by suggesting only popular tags.

Page 42: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

3. ProblemsImage quality/size

Despite the increase in Hadoop clusters & computational power,

many works still test on small collections of low quality images.

Collection Size (avg dimension) # Images

Corel 160px 5,000

ESP 156px 22,000

IAPR 417px 20,000

Corel5k IAPR TC-12 ESP Game

Page 43: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

4. ProblemsLack of meta-data

Many recent works have focussed on the exploitation of various meta-data [7,8]

e.g. Time, Location, Camera, User

Collection Time Location

Corel X X

ESP X X

IAPR ✓ ✓

[7] On contextual photo tag recommendation. P McParlane, Y Moshfeghi, J Jose SIGIR 2013[8] Beyond co-occurrence: discovering and visualizing tag relationships from geo-spatial. H Zhang et al. WSDM 2012

Corel5k IAPR TC-12 ESP Game

Page 44: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

5. ProblemsLack of diversity

Images in each collection are often taken bythe same user, in the same place, of the same scene/object, using the same camera.

This leads to natural clustering in image collections, making annotation easier due to high inter-cluster visual similarity.

Further there are duplicate images in the test and train sets, also making annotation easier.

Corel5k IAPR TC-12 ESP Game

Page 45: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK
Page 46: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

6. ProblemsIdentifying Location

Identifying location (even high level) within an image is often difficult or sometimes impossible. Despite this, two of the three collections contain images annotated with locations (e.g. usa).

Given this image, would you know where it was taken?

Corel5k IAPR TC-12 ESP Game

Page 47: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

6. ProblemsIdentifying Location

Identifying location (even high level) within an image is often difficult or sometimes impossible. Despite this, two of the three collections contain images annotated with locations (e.g. usa).

Given this image, would you know where it was taken?

Annotations: sea, usa, sky, chair

If not, how can we expect an annotation model to predict the annotation “usa”?

Corel5k IAPR TC-12 ESP Game

Page 48: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

7. ProblemsCopyright

An evaluation collection should at least be free and distributable.

Unfortunately, the Corel collection is commercial and bound by copyright.

Corel5k IAPR TC-12 ESP Game

Page 49: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Motivation for annotating images

Problems with existing automatic image annotation collections

Problems with existing photo tag recommendation collections

Flickr-AIA

Flickr-PTR Conclusions

We introduce

We introduce

Page 50: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Flickr-AIA contains 312,000 images from Flickr built with AIA evaluation in mind.

Page 51: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Flickr-AIA contains 312,000 images from Flickr built with AIA evaluation in mind.

Openly availableUsing Flickr images under the creative commons license

Page 52: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Flickr-AIA contains 312,000 images from Flickr built with AIA evaluation in mind.

Openly availableUsing Flickr images under the creative commons license

Meta-dataIncludes extensive location, user and time meta-data.

Page 53: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Flickr-AIA contains 312,000 images from Flickr built with AIA evaluation in mind.

Openly availableUsing Flickr images under the creative commons license

Meta-dataIncludes extensive location, user and time meta-data.

Diverse image setWe search images for 2,000 WordNet categories & limit the number of images for each user.

Page 54: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Flickr-AIA contains 312,000 images from Flickr built with AIA evaluation in mind.

Openly availableUsing Flickr images under the creative commons license

Meta-dataIncludes extensive location, user and time meta-data.

Diverse image setWe search images for 2,000 WordNet categories & limit the number of images for each user.

High qualityThe dimension of each image is 719px on average.

Page 55: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Flickr-AIA contains 312,000 images from Flickr built with AIA evaluation in mind.

Openly availableUsing Flickr images under the creative commons license

Meta-dataIncludes extensive location, user and time meta-data.

Diverse image setWe search images for 2,000 WordNet categories & limit the number of images for each user.

High qualityThe dimension of each image is 719px on average.

No Location tagsWe use WordNet to remove “location” tags from image ground truths (e.g. scotland).

Page 56: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Flickr-AIA contains 312,000 images from Flickr built with AIA evaluation in mind.

Openly availableUsing Flickr images under the creative commons license

Meta-dataIncludes extensive location, user and time meta-data.

Diverse image setWe search images for 2,000 WordNet categories & limit the number of images for each user.

High qualityThe dimension of each image is 719px on average.

No Location tagsWe use WordNet to remove “location” tags from image ground truths (e.g. scotland).

Resolved AmbiguityTags which are synonyms (e.g. usa/america) are “merged” based on WordNet synonym sets.

Page 57: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Flickr-AIA contains 312,000 images from Flickr built with AIA evaluation in mind.

Openly availableUsing Flickr images under the creative commons license

Meta-dataIncludes extensive location, user and time meta-data.

Diverse image setWe search images for 2,000 WordNet categories & limit the number of images for each user.

High qualityThe dimension of each image is 719px on average.

No Location tagsWe use WordNet to remove “location” tags from image ground truths (e.g. scotland).

Resolved AmbiguityTags which are synonyms (e.g. usa/america) are “merged” based on WordNet synonym sets.

NormalizedAside from the normal ground truth, we include a “normalised” ground truth containing only medium frequency tags.

Page 58: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Motivation for annotating images

Problems with existing automatic image annotation collections

Problems with existing photo tag recommendation collections

Flickr-AIA

Flickr-PTR Conclusions

We introduce

We introduce

Page 59: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

In this work we consider collection used in 2 popular photo tag recommendation works.

Photo Tag Recommendation

[5] Flickr Tag Recommendation based on Collective Knowledge. B. Sigurbjornsson and R. van Zwol. WWW '08.[6] Personalized, Interactive Tag Recommendation for Flickr. N. Garg and I. Weber. ACM RecSys '08.

Sigurbjornsson [5] Garg [6]

Page 60: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

1. ProblemsGround Truth

Sigurbjornsson use a small collection of images which have their ground truth’s crowdsourced.

For photo tag recommendation, however, many aspects that users would tag are often not explicit (e.g. locations, dates etc). Therefore, these annotations are missed using crowdsourcing.

Sigurbjornsson Garg

Page 61: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Crowdsourced Workerfootball redteam blueenglandgrasssaturday

Comparing AnnotationsCrowdsourced vs tags added by the user.

Page 62: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Crowdsourcedfootball redteam blueenglandgrasssaturday

Comparing AnnotationsCrowdsourced vs tags added by the user.

User Tagsfootball redteam bluescotland hamilton acciesartificial grass dunfermlinesunday new douglas park

Page 63: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Crowdsourcedfootball redteam blueenglandgrasssaturday

Comparing AnnotationsCrowdsourced vs tags added by the user.

User Tagsfootball redteam bluescotland hamilton acciesartificial grass dunfermlinesunday new douglas park

Page 64: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

2. ProblemsSynonymous Tags

One of the problems with using user tags, however, is that users often use many synonyms to annotate images.

Sigurbjornsson Garg

Page 65: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Annotations: newyork, ny, nyc, newyorkcity, york, timessquare

Test image #1

Page 66: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Annotations: newyork, ny, nyc, newyorkcity, york, timessquare

Annotation Model #1 ny, nyc, newyork, york, city

Test image #1

Precision

0.8

Suggestions

Page 67: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Annotations: newyork, ny, nyc, newyorkcity, york, timessquare

Annotation Model #1

Annotation Model #2

ny, nyc, newyork, york, city

Test image #1

ny, timessquare, people, cab, empire

Precision

0.8

0.4

Suggestions

Page 68: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Annotations: newyork, ny, nyc, newyorkcity, york, timessquare

Annotation Model #1

Annotation Model #2

ny, nyc, newyork, york, city

Test image #1

ny, timessquare, people, cab, empire

Precision

0.8

0.4

Suggestions

Page 69: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

3. ProblemsFree distribution

Existing collections [6,7] for photo tag recommendation were never released making comparable experiments difficult.

[6] Personalized, Interactive Tag Recommendation for Flickr. N. Garg and I. Weber. ACM RecSys '08. [7] Flickr Tag Recommendation based on Collective Knowledge. B. Sigurbjornsson and R. van Zwol. WWW '08.

Sigurbjornsson Garg

Page 70: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Motivation for annotating images

Problems with existing automatic image annotation collections

Problems with existing photo tag recommendation collections

Flickr-AIA

Flickr-PTR Conclusions

We introduce

We introduce

Page 71: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Flickr-PTR contains details of 2,000,000 images from Flickr built with PTR evaluation in mind.

Page 72: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Flickr-PTR contains details of 2,000,000 images from Flickr built with PTR evaluation in mind.

Openly availableUsing Flickr images under the creative commons license

Page 73: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

Flickr-PTR contains details of 2,000,000 images from Flickr built with PTR evaluation in mind.

Openly availableUsing Flickr images under the creative commons license

Clustered User TagsUsing a crowdsourced experiment which asked user to group related tags to overcome the problem of synonyms.

Page 74: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

To overcome synonyms in image annotations we took out an crowdsourced experiment which “grouped” the synonyms or the tags which refer to the same aspect.

Page 75: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

ConclusionsThis work highlighted: 7 problems with existing AIA collections (Corel, ESP, IAPR) 3 problems with existing PTR collections (Sigur, Garg)

With this in mind, we introduce two new, freely available image collections:

Flickr-AIA312,000 Flickr images

Flickr-PTR2,000,000 Flickr images

Automatic image annotation evaluation Photo tag recommendation evaluation

Page 76: Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK

These collections are available at:

http://dcs.gla.ac.uk/~philip/

Thanks for listening!

[1] Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary. P. Duygulu et al. ECCV 02.[2] Labeling images with a computer game. L. von Ahn and L. Dabbish. CHI '04.[3] The IAPR TC-12 Benchmark- A New Evaluation Resource. M. Grubinger, et al. Visual Information Systems, 2006.[4] Flickr Tag Recommendation based on Collective Knowledge. B. Sigurbjornsson and R. van Zwol. WWW '08.[5] Personalized, Interactive Tag Recommendation for Flickr. N. Garg and I. Weber. ACM RecSys '08. [6] Baselines for Image Annotation. A. Makadia, V. Pavlovic and S. Kumar. IJCV 2010[7] On contextual photo tag recommendation. P McParlane, Y Moshfeghi, J Jose SIGIR 2013[8] Beyond co-occurrence: discovering and visualizing tag relationships from geo-spatial. H Zhang et al. WSDM 2012