Evaluating color descriptors for object and scene recognition

Evaluation of Color Descriptors for Object and Scene Recognition

Koen E.A. van de Sande, Student Member, IEEE, Theo Gevers, Member, IEEE, and Cees G.M. Snoek, Member, IEEE

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 32, NO. 9, SEPTEMBER 2010

Introduction

To increase illumination invariance and discriminative power

Color features/descriptors on object and scene recognition

The usefulness of invariance is category-specific

Recommendations on which color descriptors to use for a given data set

Reflectance Model

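An image f can be modeled under the assumption of Lambertian reflectance as follows:

$$f(x) = \int_{\omega} e(\lambda)\, \rho_k(\lambda)\, s(x, \lambda)\, d\lambda$$

Shafer proposes adding a diffuse light term:

$$f(x) = \int_{\omega} e(\lambda)\, \rho_k(\lambda)\, s(x, \lambda)\, d\lambda + \int_{\omega} A(\lambda)\, \rho_k(\lambda)\, d\lambda$$

e(λ): light source; s(x, λ): surface reflectance; ρ_k(λ): camera sensitivity, k ∈ {R, G, B}; A(λ): diffuse light

Reflectance Model

The spatial derivative of f at location x on scale σ:

$$f_{x,\sigma}(x) = \int_{\omega} e(\lambda)\, \rho_k(\lambda)\, s_{x,\sigma}(x, \lambda)\, d\lambda$$

The diffuse term does not depend on x, so it vanishes in the derivative: invariance to diffuse light!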
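The invariance of the derivative to diffuse light can be checked numerically. The sketch below is only an illustration: it assumes the diffuse term acts as a constant added to a channel, and it uses central finite differences in place of a Gaussian derivative at scale σ. The function name and the test values are hypothetical.

```python
import numpy as np

def spatial_derivative(channel):
    """Central finite differences along x (a stand-in for a Gaussian derivative)."""
    return np.gradient(channel.astype(float), axis=1)

rng = np.random.default_rng(0)
channel = rng.uniform(0.0, 1.0, size=(8, 8))  # hypothetical single-channel image
diffuse_offset = 0.3                          # assumed constant diffuse-light contribution

d_plain = spatial_derivative(channel)
d_shifted = spatial_derivative(channel + diffuse_offset)

# The constant offset cancels in the differences, so both derivatives agree.
derivative_unchanged = np.allclose(d_plain, d_shifted)
```

The pixel values themselves change under the offset, while the derivative does not.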

Diagonal Model

Changes in the illumination can be modeled by a diagonal mapping or von Kries Model as follows:

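$$f^c = D^{u,c} f^u$$

f^u: image taken under an unknown light source

f^c: the same image transformed to the reference light source

Diagonal-offset model:

$$\begin{pmatrix} R^c \\ G^c \\ B^c \end{pmatrix} = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} \begin{pmatrix} R^u \\ G^u \\ B^u \end{pmatrix} + \begin{pmatrix} o_1 \\ o_2 \\ o_3 \end{pmatrix}$$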

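Photometric transforms

Light intensity changes

$$\begin{pmatrix} R^c \\ G^c \\ B^c \end{pmatrix} = \begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & a \end{pmatrix} \begin{pmatrix} R^u \\ G^u \\ B^u \end{pmatrix}$$

A descriptor unaffected by this transform is scale-invariant with respect to light intensity

Light intensity shifts

$$\begin{pmatrix} R^c \\ G^c \\ B^c \end{pmatrix} = \begin{pmatrix} R^u \\ G^u \\ B^u \end{pmatrix} + \begin{pmatrix} o_1 \\ o_2 \\ o_3 \end{pmatrix}, \quad o_1 = o_2 = o_3$$

A descriptor unaffected by this transform is shift-invariant with respect to light intensity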

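Photometric transforms

Light intensity changes and shifts (invariant descriptors are scale- and shift-invariant with respect to light intensity)

$$\begin{pmatrix} R^c \\ G^c \\ B^c \end{pmatrix} = \begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & a \end{pmatrix} \begin{pmatrix} R^u \\ G^u \\ B^u \end{pmatrix} + \begin{pmatrix} o_1 \\ o_2 \\ o_3 \end{pmatrix}, \quad o_1 = o_2 = o_3$$

Light color change

$$\begin{pmatrix} R^c \\ G^c \\ B^c \end{pmatrix} = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} \begin{pmatrix} R^u \\ G^u \\ B^u \end{pmatrix}$$

Light color change and shift

$$\begin{pmatrix} R^c \\ G^c \\ B^c \end{pmatrix} = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} \begin{pmatrix} R^u \\ G^u \\ B^u \end{pmatrix} + \begin{pmatrix} o_1 \\ o_2 \\ o_3 \end{pmatrix}$$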
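All of the photometric transforms above are special cases of the diagonal-offset model, which a few lines of NumPy can make concrete. This is a minimal sketch, not code from the paper: the function name `diagonal_offset`, the image layout (H, W, 3) in R, G, B order, and all numeric values are assumptions chosen for illustration.

```python
import numpy as np

def diagonal_offset(image, scale=(1.0, 1.0, 1.0), offset=(0.0, 0.0, 0.0)):
    """Apply f_c = D f_u + o to every pixel of an (H, W, 3) RGB image."""
    D = np.diag(scale)                       # diagonal matrix diag(a, b, c)
    return image @ D.T + np.asarray(offset)  # offsets (o1, o2, o3) broadcast per pixel

rng = np.random.default_rng(1)
f_u = rng.uniform(0.0, 1.0, size=(4, 5, 3))  # hypothetical image under the unknown light

# a = b = c, no offset: light intensity change
intensity_change = diagonal_offset(f_u, scale=(2.0, 2.0, 2.0))
# o1 = o2 = o3, unit scale: light intensity shift
intensity_shift = diagonal_offset(f_u, offset=(0.1, 0.1, 0.1))
# a != b != c with arbitrary offsets: light color change and shift
color_change_and_shift = diagonal_offset(
    f_u, scale=(1.2, 1.0, 0.8), offset=(0.05, 0.0, -0.02)
)
```

Feeding such transformed images to a descriptor and comparing the result with the descriptor of the original image is one way to test an invariance claim empirically.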

Color Descriptors

Histograms do not contain local spatial information: RGB, Hue, Saturation, rg-histogram, …

Color moments contain local photometric and spatial information.

SIFT contains local spatial information. Color SIFT combines color with SIFT:

HSV-SIFT, Hue-SIFT, …

Generalized color moments:

$$M_{pq}^{abc} = \iint x^p\, y^q\, [I_R(x, y)]^a\, [I_G(x, y)]^b\, [I_B(x, y)]^c\, dx\, dy$$
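A discrete version of the generalized color moment M_pq^abc (the integral over x and y of x^p y^q I_R^a I_G^b I_B^c) replaces the integral with a sum over the pixels of a patch. The sketch below assumes integer pixel coordinates starting at 0 and an (H, W, 3) patch in R, G, B order; the function name is hypothetical.

```python
import numpy as np

def generalized_color_moment(patch, p, q, a, b, c):
    """Sum of x^p * y^q * R^a * G^b * B^c over all pixels of an (H, W, 3) patch."""
    h, w, _ = patch.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)  # pixel coordinates (assumed origin at 0)
    R, G, B = patch[..., 0], patch[..., 1], patch[..., 2]
    return float(np.sum(x**p * y**q * R**a * G**b * B**c))

patch = np.full((2, 3, 3), 0.5)  # hypothetical constant gray patch

area = generalized_color_moment(patch, 0, 0, 0, 0, 0)      # number of pixels
red_mass = generalized_color_moment(patch, 0, 0, 1, 0, 0)  # sum of the R channel
x_moment = generalized_color_moment(patch, 1, 0, 0, 0, 0)  # sum of x coordinates
```

Moments with p + q > 0 weight intensities by position, which is how this descriptor picks up spatial information that a plain histogram lacks.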

Color Histograms

RGB-histogram

Hue-histogram

H and S are scale-invariant and shift-invariant w.r.t light intensity

rg-histogram

The normalized RGB color model

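r, g: scale-invariant (b is redundant, since r + g + b = 1)

Not shift-invariant

$$\begin{pmatrix} r \\ g \\ b \end{pmatrix} = \begin{pmatrix} \dfrac{R}{R+G+B} \\ \dfrac{G}{R+G+B} \\ \dfrac{B}{R+G+B} \end{pmatrix}$$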
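The normalized RGB model divides each channel by R + G + B, so a common scale factor cancels while an added offset does not. The following sketch illustrates both properties and builds a 2-D rg-histogram; the bin count, the small `eps` guard against division by zero, and the function names are assumptions for illustration.

```python
import numpy as np

def rg_chromaticity(image, eps=1e-12):
    """Return the r and g channels of normalized RGB for an (H, W, 3) image."""
    total = image.sum(axis=-1, keepdims=True)
    return (image / (total + eps))[..., :2]  # b is dropped because b = 1 - r - g

def rg_histogram(image, bins=8):
    """2-D histogram over (r, g), normalized to sum to 1."""
    rg = rg_chromaticity(image).reshape(-1, 2)
    hist, _, _ = np.histogram2d(rg[:, 0], rg[:, 1], bins=bins, range=[[0, 1], [0, 1]])
    return hist / hist.sum()

rng = np.random.default_rng(2)
image = rng.uniform(0.1, 1.0, size=(16, 16, 3))  # hypothetical RGB image

scaled_matches = np.allclose(rg_chromaticity(3.0 * image), rg_chromaticity(image))
shifted_matches = np.allclose(rg_chromaticity(image + 0.2), rg_chromaticity(image))
hist = rg_histogram(image)
```

Here `scaled_matches` is true and `shifted_matches` is false, in line with "scale-invariant, not shift-invariant".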

Color Histograms

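Transformed color: normalize the pixel value distribution of each channel

$$R' = \frac{R - \mu_R}{\sigma_R}, \quad G' = \frac{G - \mu_G}{\sigma_G}, \quad B' = \frac{B - \mu_B}{\sigma_B}$$

μ_C, σ_C: mean and standard deviation of channel C

Scale- and shift-invariant w.r.t. light intensity.

Opponent color histogram

$$O_1 = \frac{R - G}{\sqrt{2}}, \quad O_2 = \frac{R + G - 2B}{\sqrt{6}}, \quad O_3 = \frac{R + G + B}{\sqrt{3}}$$

O1, O2: shift-invariant

O3: intensity, no invariance properties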
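Both color models on this slide can be checked against the photometric transforms. The sketch below computes the opponent channels O1 = (R − G)/√2, O2 = (R + G − 2B)/√6, O3 = (R + G + B)/√3 and the transformed color (R − μ_R)/σ_R (likewise for G and B). Computing μ and σ over the whole image is an assumption made here; the function names and test values are hypothetical.

```python
import numpy as np

def opponent(image):
    """Opponent color channels (O1, O2, O3) of an (H, W, 3) RGB image."""
    R, G, B = image[..., 0], image[..., 1], image[..., 2]
    O1 = (R - G) / np.sqrt(2.0)
    O2 = (R + G - 2.0 * B) / np.sqrt(6.0)
    O3 = (R + G + B) / np.sqrt(3.0)
    return np.stack([O1, O2, O3], axis=-1)

def transformed_color(image):
    """Per-channel normalization (C - mean_C) / std_C, statistics over the image."""
    mu = image.mean(axis=(0, 1), keepdims=True)
    sigma = image.std(axis=(0, 1), keepdims=True)
    return (image - mu) / sigma

rng = np.random.default_rng(3)
image = rng.uniform(0.1, 1.0, size=(16, 16, 3))  # hypothetical RGB image

# Light intensity shift: equal offset on all channels.
shifted = image + 0.2
o12_shift_invariant = np.allclose(opponent(shifted)[..., :2], opponent(image)[..., :2])
o3_shift_invariant = np.allclose(opponent(shifted)[..., 2], opponent(image)[..., 2])

# Light color change and shift: per-channel scale and offset.
recolored = image * np.array([1.5, 0.7, 2.0]) + np.array([0.1, -0.05, 0.3])
transformed_invariant = np.allclose(transformed_color(recolored), transformed_color(image))
```

The equal offset cancels in O1 and O2 because their channel coefficients sum to zero, while O3 keeps it.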

Color SIFT Descriptors

HSV-SIFT

The H color model is scale-invariant and shift-invariant

The complete descriptor has no invariance properties because the H, S, and V channels are combined

Hue-SIFT

Concatenation of the hue histogram with SIFT

Scale-invariant & shift-invariant


OpponentSIFT: SIFT over all channels in the opponent color space. Scale- and shift-invariant to light intensity

C-SIFT: eliminates the remaining intensity information from O1 and O2 by dividing by O3. Scale-invariant to light intensity

$$\frac{O_1}{O_3}, \quad \frac{O_2}{O_3}$$

rg-SIFT: SIFT over the r and g channels. Scale- and shift-invariant to light intensity


RGB-SIFT (Transformed color SIFT)

SIFT over every RGB channel (normalized transformed channels)

Scale- and shift-invariant to light intensity, and invariant to light color changes and shift.
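The normalized opponent channels O1/O3 and O2/O3 behind C-SIFT can be checked in the same way. This sketch covers only the color channels, not the SIFT descriptor computed on top of them; the `eps` guard and the function name are assumptions.

```python
import numpy as np

def c_invariant(image, eps=1e-12):
    """Normalized opponent channels (O1/O3, O2/O3) of an (H, W, 3) RGB image."""
    R, G, B = image[..., 0], image[..., 1], image[..., 2]
    O1 = (R - G) / np.sqrt(2.0)
    O2 = (R + G - 2.0 * B) / np.sqrt(6.0)
    O3 = (R + G + B) / np.sqrt(3.0)  # intensity channel
    return np.stack([O1 / (O3 + eps), O2 / (O3 + eps)], axis=-1)

rng = np.random.default_rng(4)
image = rng.uniform(0.1, 1.0, size=(16, 16, 3))  # hypothetical RGB image

# A common scale factor appears in numerator and denominator and cancels.
c_scale_invariant = np.allclose(c_invariant(2.5 * image), c_invariant(image))
# An equal offset changes O3 but not O1 or O2, so the ratio changes.
c_shift_invariant = np.allclose(c_invariant(image + 0.2), c_invariant(image))
```

On these raw channels the division by intensity gives scale invariance only, which matches the C-SIFT entry above.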

Experiments

Scale-invariant points are detected with the Harris-Laplace point detector

Color descriptors are computed over the area around the points

A visual dictionary is constructed by applying k-means clustering to the descriptors

SVM classifier with EMD/chi-square kernel
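The dictionary and kernel steps of this pipeline can be sketched with NumPy alone. Everything specific below is an assumption for illustration: random vectors stand in for real color descriptors around Harris-Laplace points, the k-means is a plain Lloyd iteration, and the kernel is exp(−d/D) with d the chi-square distance and D the mean distance. Point detection and SVM training are left out.

```python
import numpy as np

def sq_dists(a, b):
    """Squared Euclidean distances between rows of a and rows of b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)

def kmeans(data, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns the k cluster centers (the visual words)."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        labels = sq_dists(data, centers).argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Assign each descriptor to its nearest visual word and count occurrences."""
    words = sq_dists(descriptors, centers).argmin(axis=1)
    counts = np.bincount(words, minlength=len(centers))
    return counts / counts.sum()

def chi2_kernel(H1, H2, eps=1e-12):
    """exp(-d / D): d is the chi-square distance, D the mean distance (assumed)."""
    diff = H1[:, None, :] - H2[None, :, :]
    summ = H1[:, None, :] + H2[None, :, :]
    dist = 0.5 * (diff ** 2 / (summ + eps)).sum(axis=2)
    return np.exp(-dist / max(dist.mean(), eps))

rng = np.random.default_rng(0)
# Hypothetical data: 6 images, each with 50 descriptors of dimension 16.
images = [rng.normal(size=(50, 16)) for _ in range(6)]

centers = kmeans(np.vstack(images), k=8)
H = np.array([bow_histogram(d, centers) for d in images])
K = chi2_kernel(H, H)
```

The matrix `K` could then be passed to an SVM that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.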

RESULTS: Experiment 1

The SIFT and color SIFT descriptors perform much better than histogram-based descriptors

The descriptors with the best overall performance are C-SIFT, rgSIFT, OpponentSIFT, and RGB-SIFT.

RESULTS: Experiment 2

Image: PASCAL VOC 2007, over 20 object categories



Most object categories were recognized better by descriptors that are scale- and shift-invariant to light intensity

C-SIFT and rgSIFT performed better than the other descriptors

For some object categories a reduction in performance is observed when invariance is added, which suggests that the additional invariance makes the descriptor less discriminative for those categories.


Video: Mediamill Challenge, 39 object and scene categories


SIFT and color SIFT variants perform significantly better than the other descriptors.

OpponentSIFT performs better than C-SIFT and rgSIFT for the categories that occur under a wide range of light intensities.

Conclusion

A color descriptor with an appropriate level of invariance should be selected

Without prior knowledge, OpponentSIFT is the best in general

Light intensity information is important for some categories

Usefulness of invariance is category-specific.
