Content-Based Image Retrieval Rong Jin

Content-Based Image Retrieval

Rong Jin

Content-based Image Retrieval
• Retrieval by text
  • Label database images with text tags
  • Treat image retrieval as text retrieval
  • Find images for textual queries using standard text search engines

Example: Flickr.com

Con: requires manual labeling

Image Labeling by Human Computing
• ESP game: http://www.gwap.com/gwap/gamesPreview/espgame
• Collect annotations for web images via a game

Content-based Image Retrieval
• Retrieval based on visual content
  • Represent images by their visual content
  • Each query is an image
  • Search for images whose visual content is similar to that of the query image

Content-based Image Retrieval
Given a query image, try to find visually similar images in an image database.
[Figure: a query image, the image database, and the retrieved answer]

Example: www.like.com

CBIR Challenges
• How to represent the visual content of images
  • What is “visual content”? Colors, shapes, textures, objects, or metadata (e.g., tags) derived from images
  • Which type of “visual content” should be used to represent an image? It is difficult to infer a user's information need from a query image
• How to retrieve images efficiently
  • Should avoid a linear scan of the entire database

Image Representation (in increasing degree of difficulty)
• Similar color distribution: histogram matching
• Similar texture pattern: texture analysis
• Similar shape/pattern: image segmentation, pattern recognition
• Similar real content: life-time goal :-)

Vector based Image Representation
• Represent an image by a vector with a fixed number of elements
  • Color histogram: discretize the color space; count the pixels in each discretized color bin
  • Texture: Gabor filter texture features
  • …
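The color histogram idea can be sketched as follows. This is a minimal illustration, assuming pixels are given as (R, G, B) tuples with 8-bit channels and that each channel is split into equal-width bins; the function name `color_histogram` and the default bin count are hypothetical choices, not part of the lecture.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Return a normalized histogram over a discretized RGB color space.

    pixels: iterable of (r, g, b) tuples with values in 0..255 (assumed).
    The result has bins_per_channel ** 3 elements that sum to 1.
    """
    width = 256 // bins_per_channel  # size of each bin along one channel
    hist = [0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Discretize each channel, then flatten the 3-D bin index.
        ri = min(r // width, bins_per_channel - 1)
        gi = min(g // width, bins_per_channel - 1)
        bi = min(b // width, bins_per_channel - 1)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
        count += 1
    if count == 0:
        return [0.0] * len(hist)
    return [h / count for h in hist]
```

Every image, whatever its size, maps to a vector of the same length, which is what makes vector comparison between images possible.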

Vector based Image Representation

      R     G     B
Vq    0.3   0.5   0.2
V1    0.4   0.5   0.1
V2    0.5   0.1   0.4

|V1 – Vq| < |V2 – Vq|
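The comparison between the three vectors can be checked numerically. The sketch below assumes the Euclidean norm (the slide does not say which norm is meant) and reads the three numbers of each vector as R, G, B proportions; the helper name `euclidean` is illustrative.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Vectors from the slide, read as (R, G, B) proportions (an assumption).
vq = [0.3, 0.5, 0.2]
v1 = [0.4, 0.5, 0.1]
v2 = [0.5, 0.1, 0.4]

d1 = euclidean(v1, vq)  # about 0.141
d2 = euclidean(v2, vq)  # about 0.490

# Rank database images by increasing distance to the query.
ranking = sorted([("V1", d1), ("V2", d2)], key=lambda item: item[1])
```

V1 is ranked ahead of V2, so the image behind V1 is considered more similar in color to the query.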

Images with Similar Colors

Images with Similar Shapes

Images with Similar Content

Challenges in CBIR
• You get drunk, REALLY drunk
• Hit over the head
• Kidnapped to another city in a country on the other side of the world
• When you wake up, you try to figure out what city you are in, and what is going on
• That’s what it’s like to be a CBIR system!

Near Duplicate Image Retrieval
• Given a query image, identify gallery images with high visual similarity

Appearance based Image Matching
• Parts-based image representation
  • Parts (appearance) + shape (spatial relation)
  • Parts: local features found by an interesting point operator
  • Shape: graphical models or neighborhood relationships

Interesting Point Detection
• Local features have been shown to be effective for representing images
• They are image patterns that differ from their immediate neighborhood
• They can be points, edges, or small patches
• We call local features the key points or interesting points of an image

Interesting Point Detection
• An example image with key points detected by a corner detector

Interesting Point Detection
• The detection of interesting points needs to be robust to various geometric transformations
[Figure panels: Original; Scaling + Rotation + Translation; Projection]

Interesting Point Detection
• The detection of interesting points needs to be robust to imaging conditions, e.g., lighting and blurring

Descriptor
• Represent each detected key point
  • Take measurements from a region centered on an interesting point
  • E.g., texture, shape, …
• Each descriptor is a vector of fixed length
  • E.g., a SIFT descriptor is a 128-dimensional vector

Descriptor
• The descriptor should also be robust under different image transformations
• The same key point seen under different transformations should have similar descriptors

Image Representation
Bag-of-features representation: an example. Each descriptor has 5 dimensions.
[Figure: original image, detected key points, and descriptors of the key points]
Descriptors of the key points:
22 0 19 23 1
66 103 45 6 38
232 44 0 11 48
29 55 129 0 1
11 78 110 1 32
220 30 11 34 21

Retrieval

How to measure similarity?

22 0 19 23 1

66 103 45 6 38

232 44 0 11 48

29 55 129 0 1

...

Retrieval

Count the number of matches!

22 0 19 23 1

66 103 45 6 38

232 44 0 11 48

29 55 129 0 1

...

Retrieval

If the distance between two vectors is smaller than the threshold, we get one match
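The thresholded matching rule can be sketched as below. This is only an illustration, assuming Euclidean distance and assuming that each query key point is counted at most once; the function name `count_matches` and the threshold value in the usage are hypothetical.

```python
import math

def count_matches(query_desc, image_desc, threshold):
    """Count query descriptors that have at least one descriptor of the
    other image within `threshold` (Euclidean distance)."""
    matches = 0
    for q in query_desc:
        for d in image_desc:
            if math.dist(q, d) < threshold:
                matches += 1
                break  # assumption: each query key point counts at most once
    return matches

# Usage with 5-dimensional descriptors in the style of the earlier example.
query = [[22, 0, 19, 23, 1], [66, 103, 45, 6, 38]]
image = [[23, 1, 18, 22, 0], [232, 44, 0, 11, 48]]
n = count_matches(query, image, threshold=10.0)
```

The nested loop compares every pair of descriptors, which is the cost that motivates the bag-of-words model later in the lecture.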

Retrieval

Matched points: 1

Matched points: 5

Problems
• Computationally expensive
  • Requires a linear scan of the entire database
• Example: match a query image against a database of 1 million images
  • 0.1 second to compute the match between two images
  • Takes more than one day to answer a single query
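The "more than one day" figure follows directly from the two numbers in the example, as this short calculation shows (variable names are illustrative):

```python
num_images = 1_000_000   # database size from the example
seconds_per_match = 0.1  # time to match one pair of images

total_seconds = num_images * seconds_per_match  # linear scan: one match per image
total_hours = total_seconds / 3600              # roughly 27.8 hours
total_days = total_hours / 24                   # roughly 1.16 days
```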

Bag-of-words Model
• Compare to the bag-of-words representation in text retrieval
  • A document: a collection of the words in the document
  • An image: a collection of the key points of the image
• What is the difference?

Bag-of-words
• A document: a collection of the words in the document
• An image: a collection of the key points of the image
• What is the difference?
  • The same word appears in many documents
  • There is no “same key point”, but “similar key points” appear in many images that have similar “visual content”
• Group “similar key points” from different images into “visual words”

Bag-of-words Model
[Figure: key points labelled b1 to b8, and a histogram with bins b1 to b4]
• Group key points into visual words
• Represent images by histograms of visual words

Bag-of-words
• The “grouping” is usually done by clustering
• Cluster the key points of all images into a number of clusters (e.g., 100,000 clusters)
• Each cluster center is called a “visual word”
• The collection of all cluster centers is called the “visual vocabulary”
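Building a visual vocabulary by clustering can be sketched with plain k-means (Lloyd's algorithm). The lecture does not name a specific clustering algorithm at this point, so k-means is an assumption here, and the function name, iteration count, and seed are illustrative; a real vocabulary with 100,000 clusters would need a far more efficient implementation.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centers (the visual words)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: attach each key point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        for c, group in enumerate(groups):
            if group:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(group) for col in zip(*group)]
    return centers

# Usage on toy 2-D "descriptors" forming two well-separated groups.
toy_points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
vocabulary = kmeans(toy_points, k=2)
```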

Retrieval by Bag-of-words Model
• Generate the “visual vocabulary”
• Represent each key point by its nearest “visual word”
• Represent an image by “a bag of visual words”
• Text retrieval techniques can then be applied directly
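Once images are bags of visual words, a standard text retrieval structure such as an inverted index applies. The sketch below is a simplified stand-in: it scores images by histogram intersection rather than by the tf-idf model a text search engine would typically use, and the function names and toy bags are illustrative.

```python
from collections import Counter, defaultdict

def build_inverted_index(bags):
    """bags: dict mapping image name -> list of visual word IDs."""
    index = defaultdict(dict)  # word -> {image name: term frequency}
    for name, words in bags.items():
        for word, tf in Counter(words).items():
            index[word][name] = tf
    return index

def search(index, query_words, top_k=10):
    """Score images by the number of shared visual-word occurrences."""
    scores = Counter()
    for word, q_tf in Counter(query_words).items():
        # Only images that contain this word are touched: no linear scan.
        for name, tf in index.get(word, {}).items():
            scores[name] += min(q_tf, tf)  # histogram intersection
    return scores.most_common(top_k)

# Usage with three toy bags of visual words.
toy_bags = {"imgA": ["w2", "w1"], "imgB": ["w3", "w3", "w2"], "imgC": ["w3", "w2"]}
toy_index = build_inverted_index(toy_bags)
hits = search(toy_index, ["w3", "w3"])
```

Here imgB comes first with a score of 2 and imgC second with a score of 1; imgA shares no word with the query and is never visited.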

Project
• Build a system for near duplicate image retrieval
  • A database with 10,000 images
  • Construct a bag-of-words model for each image (offline)
  • Construct a bag-of-words model for a query image
  • Retrieve the 10 most visually “similar” images from the database for the given query

Step 1: Dataset
• 10,000 color images under the folder ‘./img’
• The key points of each image have already been extracted
  • Key points of all images are saved in a single file, ‘./feature/esp.feature’
  • Each line corresponds to a key point with 128 attributes
  • Attributes in each line are separated by tabs

Step 1: Dataset
• To locate the key points of individual images, two other files are needed:
  • ‘./imglist.txt’: the order of the images when their key points were saved
  • ‘./feature/esp.size’: the number of key points each image has

Step 1: Dataset
• Example: three images imgA, imgB, imgC
  • imgA: 2 key points; imgB: 3 key points; imgC: 2 key points

imglist.txt    esp.size    esp.feature
imgB.jpg       3           imgB-key point 1
                           imgB-key point 2
                           imgB-key point 3
imgC.jpg       2           imgC-key point 1
                           imgC-key point 2
imgA.jpg       2           imgA-key point 1
                           imgA-key point 2
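The way the three files fit together can be sketched as follows. This is a minimal illustration that works on lists of lines already read from the files; it assumes esp.size holds one integer per line in the same order as imglist.txt, which the example suggests but the slides do not state outright. The function name is hypothetical.

```python
def split_key_points(image_names, sizes, feature_lines):
    """Group the rows of esp.feature by image.

    image_names: lines of imglist.txt, in order.
    sizes: lines of esp.size (assumed: one key point count per line).
    feature_lines: lines of esp.feature (tab-separated attributes).
    """
    per_image = {}
    start = 0
    for name, size in zip(image_names, sizes):
        count = int(size)
        rows = feature_lines[start:start + count]
        per_image[name.strip()] = [
            [float(x) for x in row.strip().split("\t")] for row in rows
        ]
        start += count  # the next image's key points begin right after
    return per_image

# Usage mirroring the example, with 2 attributes per key point instead of 128.
names = ["imgB.jpg\n", "imgC.jpg\n", "imgA.jpg\n"]
counts = ["3\n", "2\n", "2\n"]
rows = ["1\t2\n", "3\t4\n", "5\t6\n", "7\t8\n", "9\t10\n", "11\t12\n", "13\t14\n"]
grouped = split_key_points(names, counts, rows)
```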

Step 2: Key Point Quantization
• Represent each image by a bag of visual words:
  • Construct the visual vocabulary
    • Cluster all the key points into 10,000 clusters
    • Each cluster center is a visual word
  • Map each key point to a visual word
    • Find the nearest cluster center for each key point (nearest neighbor search)

Step 2: Key Point Quantization
• Cluster the 7 key points into 3 clusters
  • The cluster centers are: cnt1, cnt2, cnt3
  • Each center is a visual word: w1, w2, w3
• Find the nearest center to each key point

imglist.txt    esp.size    esp.feature
imgB.jpg       3           imgB-key point 1
                           imgB-key point 2
                           imgB-key point 3
imgC.jpg       2           imgC-key point 1
                           imgC-key point 2
imgA.jpg       2           imgA-key point 1
                           imgA-key point 2

Step 2: Key Point Quantization
• imgA.jpg
  • 1st key point: w2
  • 2nd key point: w1
• imgB.jpg
  • 1st key point: w3
  • 2nd key point: w3
  • 3rd key point: w2
• imgC.jpg
  • 1st key point: w3
  • 2nd key point: w2

Bag-of-words Rep.
imgA.jpg: w2 w1
imgB.jpg: w3 w3 w2
imgC.jpg: w3 w2
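The mapping from key points to visual words can be sketched with a brute-force nearest neighbor search. This is a simple stand-in for the indexed search a library would provide, using Euclidean distance as an assumption; the function name and the toy centers are illustrative.

```python
import math

def quantize(key_points, centers):
    """Map each key point to the label of its nearest cluster center."""
    words = []
    for p in key_points:
        nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        words.append(f"w{nearest + 1}")  # w1, w2, ... as in the example
    return words

# Usage with three toy 2-D centers cnt1, cnt2, cnt3 (made-up values).
toy_centers = [[0, 0], [5, 5], [10, 0]]
bag = quantize([[4.5, 5.2], [0.3, -0.1]], toy_centers)
```

With these made-up values the two key points map to w2 and w1, the same bag as imgA.jpg in the example.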

Step 2: Key Point Quantization
• We provide the FLANN library for clustering and nearest neighbor search
• For clustering, use:

    flann_compute_cluster_centers(
        float* dataset,   // your key points
        int rows,         // number of key points
        int cols,         // 128, dim of a key point
        int clusters,     // number of clusters
        float* result,    // cluster centers
        struct IndexParameters* index_params,
        struct FLANN

Step 2: Key Point Quantization
• For nearest neighbor search:

1. Build an index for the cluster centers

    flann_build_index(
        float* dataset,   // your cluster centers
        int rows,
        int cols,
        float* speedup,
        struct IndexParameters* index_params,
        struct FLANNParameters* flann_params);

2. For each key point, search for the nearest cluster center

    flann_find_nearest_neighbors_index(
        FLANN_INDEX index_id,   // your index above
        float* testset,         // your key points
        int trows,
        int* result,
        int nn,
        int checks,
        struct FLANNParameters* flann_params);

Step 2: Key Point Quantization
• In this step, you need to save:
  • the cluster centers to a file. You will use this later for quantizing the key points of query images
  • the bag-of-words representation of each image in “trec” format

Bag-of-words Rep.
imgA.jpg: w2 w1
imgB.jpg: w3 w3 w2
imgC.jpg: w3 w2

<DOC>
<DOCNO>imgB</DOCNO>
<TEXT>
w3 w3 w2
</TEXT>
</DOC>
<DOC>
<DOCNO>imgA</DOCNO>
<TEXT>
w2 w1
</TEXT>
</DOC>
<DOC>
<DOCNO>imgC</DOCNO>
<TEXT>
w3 w2
</TEXT>
</DOC>
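Writing the bags of visual words in this document format can be sketched as below. The function name is hypothetical, and the layout simply follows the tags shown in the example above, one tag per line.

```python
def to_trec(bags):
    """Render {document name: list of visual words} as TREC-style documents."""
    parts = []
    for name, words in bags.items():
        parts.append(
            "<DOC>\n"
            f"<DOCNO>{name}</DOCNO>\n"
            "<TEXT>\n"
            f"{' '.join(words)}\n"
            "</TEXT>\n"
            "</DOC>\n"
        )
    return "".join(parts)

# Usage: the document for imgB from the example.
trec_text = to_trec({"imgB": ["w3", "w3", "w2"]})
```

The resulting string can be written to a single file holding all database images, which is then indexed like any text collection.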

Step 3: Build index using Lemur
• The same as what we did in the previous homework
  • Use the “KeyfileIncIndex” index
  • No stemming
  • No stop words

Step 4: Extract key points for a query
• Three sample query images under ‘./sample query/’
• The query images are in .pgm format
• The extraction tool is under ‘./sift tool/’
  • For Windows, use “siftW32.exe”
  • For Linux, use “sift”
  • Example: issue the command

    sift < input.pgm > output.keypoints

Step 5: Generate a bag-of-words model for a query
• Map each key point of a given query to a visual word
  • Use the cluster center file generated in Step 2
  • Build an index for the cluster centers using flann_build_index()
  • For each key point, search for the nearest cluster center using flann_find_nearest_neighbors_index()

Step 5: Generate a bag-of-words model for a query
• Write the bag-of-words model for a query image in the Lemur format:

<DOC 1>
The mapped cluster ID for the 1st key point
The mapped cluster ID for the 2nd key point
The mapped cluster ID for the 3rd key point
</DOC>
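Producing a query file in this layout can be sketched as follows. The function name is hypothetical; the layout (a `<DOC id>` line, one mapped cluster ID per line, then `</DOC>`) is taken from the example above. Whatever tokens are written here should be identical to the terms used in the indexed documents, otherwise no query term will match.

```python
def to_lemur_query(query_id, cluster_ids):
    """Render one query: a <DOC id> header, one cluster ID per line, </DOC>."""
    lines = [f"<DOC {query_id}>"]
    lines.extend(str(cid) for cid in cluster_ids)
    lines.append("</DOC>")
    return "\n".join(lines) + "\n"

# Usage: a query whose three key points map to the visual words w3, w2, w3.
query_text = to_lemur_query(1, ["w3", "w2", "w3"])
```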

Step 6: Image Retrieval by Lemur
• Use the Lemur command ‘RetEval’ as:

    RetEval <parameter_file>

• An example of a parameter file:

<parameters>
<index>/home/user1/myindex/myindex.key</index>
<retModel>tfidf</retModel>
<textQuery>/home/user1/query/q1.query</textQuery>
<resultFile>/home/user1/result/ret.result</resultFile>
<TRECResultFormat>1</TRECResultFormat>
<resultCount>10</resultCount>
</parameters>
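When queries are issued from a program (for example from the GUI in Step 7), the parameter file can be generated per query. The sketch below only fills in the tags that appear in the example parameter file; the function name and its arguments are hypothetical.

```python
def reteval_parameters(index_path, query_path, result_path, count=10):
    """Build parameter file text using only the tags from the example."""
    return (
        "<parameters>\n"
        f"<index>{index_path}</index>\n"
        "<retModel>tfidf</retModel>\n"
        f"<textQuery>{query_path}</textQuery>\n"
        f"<resultFile>{result_path}</resultFile>\n"
        "<TRECResultFormat>1</TRECResultFormat>\n"
        f"<resultCount>{count}</resultCount>\n"
        "</parameters>\n"
    )

# Usage with the paths from the example.
params = reteval_parameters(
    "/home/user1/myindex/myindex.key",
    "/home/user1/query/q1.query",
    "/home/user1/result/ret.result",
)
```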

Step 7: Graphical User Interface
• Build a GUI for the image retrieval system
  • Browse the image database
  • Select an image from the database to query the database and display the top 10 retrieved results
    • Extract the bag-of-words representation of the query
    • Write it into a file in the format specified in Step 5
    • Run the “RetEval” command for retrieval
  • Load an external query image, search the images in the database, and display the top 10 retrieved results
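To display the retrieved images, the GUI has to read the result file back. The sketch below assumes the common TREC result layout, `query_id Q0 doc_id rank score run_id`, one result per line; this layout is an assumption about what RetEval writes when TREC result format is enabled and should be checked against an actual result file. The function name and the sample lines are made up.

```python
def parse_trec_results(lines):
    """Parse lines assumed to look like: qid Q0 docno rank score run_id."""
    results = []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        qid, docno = fields[0], fields[2]
        results.append((qid, docno, int(fields[3]), float(fields[4])))
    return results

# Usage with two made-up result lines.
sample = ["1 Q0 imgB 1 12.5 Exp\n", "1 Q0 imgC 2 7.25 Exp\n", "\n"]
top = parse_trec_results(sample)
```

Each document name can then be mapped back to an image file under the database folder for display.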

Step 8: Evaluation
• Demo your system in the classes of the last week
  • We will provide a number of test query images
  • Run your GUI, load each test query image, and display the ten most similar images from the database