
Diversification in an Image Retrieval System

Page 1: Diversification in an Image Retrieval System

MUCKE Project

• Iftene, A., Sirițeanu, A., Petic, M. How to Do Diversification in an Image Retrieval System

• Laic, A., Iftene, A. Automatic Image Annotation

• Gherasim, L. M., Iftene, A. Extracting Background Knowledge about World from Text.

ConsILR, September 18-19, 2014, Craiova

Page 2: Diversification in an Image Retrieval System

Content

MUCKE Team

The core

The data

Text processing

Image processing

Diversification

Problem

Demo

Automatic Image Annotation


Page 3: Diversification in an Image Retrieval System

MUCKE Team

Bilkent University, Turkey

“Al. I. Cuza” University, Iasi, Romania

Vienna University of Technology, Austria

French Alternative Energies and Atomic Energy Commission (CEA), France

IMCS-50, 2014

Page 4: Diversification in an Image Retrieval System

MUCKE Framework

Page 5: Diversification in an Image Retrieval System

The core

[Framework diagram. Labels: Raw multimedia and multilingual data; Text Processing; Image Processing; Concept similarity; User credibility; Image retrieval framework; Semantic Resources; Output]


Page 6: Diversification in an Image Retrieval System

The data

[Framework diagram, as on Page 5]

Existing collections

• A survey was carried out and published online
• ImageNet – 14 million annotated images
• MediaEval – 3.2 million images
• MIRFLICKR – 1 million annotated images
• Wikipedia (DBpedia)
• ClueWeb09/12

New data

• Aim: 100 million annotated images
• Crawling ongoing


Page 7: Diversification in an Image Retrieval System

The data

[Framework diagram, as on Page 5]

Distributed crawling and replicated storage


Page 8: Diversification in an Image Retrieval System

Text Processing

[Framework diagram, as on Page 5]

• Entity recognition
• Disambiguation
• Anaphora resolution
• Combined with IR methods: latent semantic retrieval, explicit semantic retrieval
• Components for: English, French, German, Romanian

Page 9: Diversification in an Image Retrieval System

Image Processing

[Framework diagram, as on Page 5]

• Parsimonious image description
• Large-scale concept detection: detector generalization across different datasets
• Assess the use and utility of different local image descriptors, and of their combination with other properties (e.g. color), for optimal low-level image description
• Adapted models for specialized tasks: face / landmark recognition

Page 10: Diversification in an Image Retrieval System

Diversification - Motivation


Page 11: Diversification in an Image Retrieval System

Diversification – Problem definition

Search results diversification is an optimization problem that aims to select a subset S of k items out of the n available ones such that both the diversity and the relevance of the items in S are maximized [1].
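One common way to approximate this selection is a greedy heuristic in the spirit of Maximal Marginal Relevance (MMR) [3]. The sketch below is only an illustration of the problem definition and is not the method used in MUCKE. The function name `mmr_select`, the weight `lam`, the toy images, and the topic-based similarity are all assumptions made for the example.

```python
from typing import Callable, Hashable, Sequence


def mmr_select(
    items: Sequence[Hashable],
    relevance: Callable[[Hashable], float],
    similarity: Callable[[Hashable, Hashable], float],
    k: int,
    lam: float = 0.5,
) -> list:
    """Greedily pick up to k items, trading relevance against redundancy."""
    selected = []
    candidates = list(items)
    while candidates and len(selected) < k:

        def mmr_score(item):
            # Redundancy: highest similarity to anything already chosen.
            redundancy = max(
                (similarity(item, s) for s in selected), default=0.0
            )
            return lam * relevance(item) - (1.0 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (hypothetical): (image id, topic, relevance score).
images = [
    ("a", "tower", 0.9),
    ("b", "tower", 0.8),
    ("c", "river", 0.7),
    ("d", "museum", 0.6),
]

picked = mmr_select(
    images,
    relevance=lambda img: img[2],
    # Assumed similarity: 1.0 for images on the same topic, 0.0 otherwise.
    similarity=lambda x, y: 1.0 if x[1] == y[1] else 0.0,
    k=3,
)
print([img[0] for img in picked])  # ['a', 'c', 'd']
```

With `lam = 0.5`, the second "tower" image is skipped in favour of less relevant images on other topics. Raising `lam` towards 1.0 shifts the result back to a pure relevance ranking.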


Page 12: Diversification in an Image Retrieval System

Diversification – Proposed solution

Exploitation of semantic structures in order to provide diverse and relevant results

Hierarchical structure of YAGO Concepts [6]:


Page 13: Diversification in an Image Retrieval System

Performed steps

• Deciding which terms in a query should be used to query the YAGO ontology.

• Ranking and grouping the results retrieved from the YAGO ontology.

• Choosing which YAGO entities to use when crawling the Flickr database.

• Ranking the results so that the result set achieves both relevance and diversity.
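The last step can be illustrated with a simple round-robin over groups of images, one group per YAGO entity. This is a minimal sketch under stated assumptions, not the ranking used in the demo. The function `interleave_groups`, the entity labels, the image ids, and the relevance scores are all hypothetical.

```python
from itertools import zip_longest


def interleave_groups(
    groups: dict[str, list[tuple[str, float]]], k: int
) -> list[str]:
    """Round-robin over entity groups, each sorted by descending relevance."""
    # Sort the images inside each group, most relevant first.
    ranked = [
        sorted(images, key=lambda pair: pair[1], reverse=True)
        for images in groups.values()
    ]
    # Visit groups in order of their best image so strong groups lead.
    ranked.sort(
        key=lambda g: g[0][1] if g else float("-inf"), reverse=True
    )
    result = []
    # Each tier holds the i-th best image of every group (None if exhausted).
    for tier in zip_longest(*ranked):
        for pair in tier:
            if pair is not None:
                result.append(pair[0])
    return result[:k]


# Hypothetical groups for an ambiguous query such as "jaguar".
groups = {
    "Jaguar (animal)": [("img1", 0.9), ("img2", 0.85), ("img3", 0.5)],
    "Jaguar (car)": [("img4", 0.8), ("img5", 0.4)],
    "Jaguar (aircraft)": [("img6", 0.3)],
}

top = interleave_groups(groups, k=4)
print(top)  # ['img1', 'img4', 'img6', 'img2']
```

Every entity is represented once before any entity contributes a second image, so the top of the list covers all interpretations of the query while the order within each group still follows relevance.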


Page 14: Diversification in an Image Retrieval System

Demo

https://www.youtube.com/watch?v=KrLfCNiVcZ8


Page 15: Diversification in an Image Retrieval System

Automatic Image Annotation


Page 16: Diversification in an Image Retrieval System

Reverse Image Search


Page 17: Diversification in an Image Retrieval System

Conclusions

Diversification can improve the quality of search results.

More work is needed to achieve good results in all possible scenarios.

We need a large collection of annotated images.

We need efficient algorithms that compute the distance between images.


Page 18: Diversification in an Image Retrieval System

Thank you

MUCKE

Multimedia and User Credibility Knowledge Extraction http://thor.info.uaic.ro/~mucke/


Page 19: Diversification in an Image Retrieval System

Bibliography

[1] Drosou, M., Pitoura, E., Search Result Diversification. In SIGMOD Record, pages 41-47, 2010.

[2] Gollapudi, S., Sharma, A., An Axiomatic Approach for Result Diversification. In WWW, pages 381-390, 2009.

[3] Carbonell, J. G., Goldstein, J., The use of MMR, diversity-based reranking for reordering documents and producing summaries. In SIGIR, pages 335-336, 1998.

[4] Clarke, C. L. A., Kolla, M., Cormack, G. V., Vechtomova, O., Ashkan, A., Büttcher, S., MacKinnon, I., Novelty and diversity in information retrieval evaluation. In SIGIR, pages 659-666, 2008.

[5] Zheng, W., Wang, X., Fang, H., Cheng, H., Coverage-based search result diversification. In Information Retrieval, pages 433-457, 2012.

[6] YAGO2s: A High-Quality Knowledge Base, [Online] Available at http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/ [Last Accessed 27 June 2014].

[7] Cilibrasi, R., Vitanyi, P. M. B., The Google Similarity Distance. In IEEE TKDE, Vol. 19, Issue 3, pages 370-383, 2007.

[8] Kelleher, M., [Online] Available at http://www.smartinsights.com/email-marketing/behavioural-email-marketing/which-top-5-strategies-drive-relevance-in-email-marketing/ [Last Accessed 1 July 2014]
