Generic Fourier Descriptor for Shape-based Image Retrieval

Dengsheng Zhang, Guojun Lu

Gippsland School of Comp. & Info Tech, Faculty of Information Technology

Monash University, Churchill, VIC 3842, Australia

[email protected]

http://www.gscit.monash.edu.au/~dengs/

Outline

• Motivations

• Problems

• Generic Fourier Descriptor (GFD)

• Experimental Results

• Conclusions

Motivations

• Content-based Image Retrieval
– Image description is important for image searching
– Image description constitutes one of the key parts of MPEG-7
– Shape is an important image feature, along with color and texture

• Effective and Efficient Shape Descriptor
– Good retrieval accuracy, compact features, general application, low computational complexity, robust retrieval performance, and hierarchical coarse-to-fine representation

Fourier Descriptor

• Obtained by applying the Fourier transform to a shape signature, such as the central distance function r(t).

$$a_n = \frac{1}{N} \sum_{t=0}^{N-1} r(t) \exp(-j 2\pi n t / N), \quad n = 0, 1, \ldots, N-1$$

[Figures: shape signature; a shape with no contour; shapes with the same contour and different content]
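The contour-based descriptor on this slide can be sketched in a few lines. This is a minimal illustration, assuming the boundary is already available as an ordered list of (x, y) points; the function names `centroid_distance` and `fourier_coefficients` and the sample circle are hypothetical and not from the slides.

```python
import cmath
import math

def centroid_distance(points):
    """Central distance r(t): distance of each boundary point from the boundary centroid."""
    n = len(points)
    xc = sum(x for x, _ in points) / n
    yc = sum(y for _, y in points) / n
    return [math.hypot(x - xc, y - yc) for x, y in points]

def fourier_coefficients(signature):
    """a_n = (1/N) * sum_t r(t) * exp(-j*2*pi*n*t/N) for n = 0 .. N-1 (direct DFT)."""
    N = len(signature)
    return [
        sum(r * cmath.exp(-2j * math.pi * n * t / N) for t, r in enumerate(signature)) / N
        for n in range(N)
    ]

# Hypothetical input: 16 boundary points sampled on a circle of radius 2.
circle = [(2 * math.cos(2 * math.pi * k / 16), 2 * math.sin(2 * math.pi * k / 16))
          for k in range(16)]
coeffs = fourier_coefficients(centroid_distance(circle))
```

For a circle the signature is constant, so only the first coefficient is non-zero. The sketch also makes the slide's limitation visible: the input is a boundary, so a shape without a single contour, or two shapes that differ only in interior content, cannot be described this way.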

Zernike Moments

• Acquired by applying the Zernike moment transform on a shape region in polar space.
– Complex form
– Does not allow finer resolution in the radial direction
– Creates a number of repetitions in each order of moment
– Shape must be normalized into a unit disk

$$Z_{nm} = \frac{n+1}{\pi} \sum_x \sum_y f(x, y)\, V^*_{nm}(x, y) = \frac{n+1}{\pi} \sum_r \sum_\theta f(r\cos\theta, r\sin\theta)\, R_{nm}(r) \exp(-jm\theta), \quad r \le 1$$

$$V_{nm}(x, y) = V_{nm}(r\cos\theta, r\sin\theta) = R_{nm}(r) \exp(jm\theta)$$

$$R_{nm}(r) = \sum_{s=0}^{(n-|m|)/2} (-1)^s \frac{(n-s)!}{s!\left(\frac{n+|m|}{2}-s\right)!\left(\frac{n-|m|}{2}-s\right)!}\, r^{n-2s}$$
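To make the "complex form" remark concrete, the radial polynomial R_nm(r) can be evaluated directly from its factorial sum. This is a sketch only; the function name `radial_poly` is hypothetical, and it assumes the usual Zernike constraints that |m| ≤ n and n − |m| is even.

```python
from math import factorial

def radial_poly(n, m, r):
    """Zernike radial polynomial R_nm(r), evaluated term by term from the factorial sum."""
    m = abs(m)
    if m > n or (n - m) % 2:
        raise ValueError("need |m| <= n and n - |m| even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s)
        total += num / den * r ** (n - 2 * s)
    return total

# Example: R_20(r) = 2r^2 - 1, evaluated at r = 0.5.
value = radial_poly(2, 0, 0.5)
```

Each moment needs its own polynomial of this kind, which is part of what the slide contrasts with the plain Fourier kernel used by GFD.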

Generic Fourier Descriptor

• Polar Transform
– An input image f(x, y) is first transformed into a polar image f(r, θ):

$$r = \sqrt{(x - x_c)^2 + (y - y_c)^2}, \quad \theta = \arctan\frac{y - y_c}{x - x_c}$$

$$\text{where } x_c = \frac{1}{N} \sum_{x=0}^{N-1} x \quad \text{and} \quad y_c = \frac{1}{M} \sum_{y=0}^{M-1} y$$

– Find R = max{ r(θ) }
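The polar transform step can be sketched as follows. The code assumes the shape is given as a list of (x, y) pixel coordinates and takes the centroid as the mean of those coordinates; the name `to_polar` is hypothetical. It uses `math.atan2` rather than a plain arctangent so that the quadrant of each angle is resolved, which is a choice made for this sketch rather than something stated on the slide.

```python
import math

def to_polar(pixels):
    """Map shape pixels to (r, theta) about their centroid and return the maximum radius."""
    n = len(pixels)
    xc = sum(x for x, _ in pixels) / n
    yc = sum(y for _, y in pixels) / n
    polar = []
    for x, y in pixels:
        r = math.hypot(x - xc, y - yc)
        # atan2 resolves the quadrant; the modulo maps the angle into [0, 2*pi).
        theta = math.atan2(y - yc, x - xc) % (2 * math.pi)
        polar.append((r, theta))
    return polar, max(r for r, _ in polar)

# Hypothetical input: the four corners of a 2 x 2 square.
polar, max_radius = to_polar([(0, 0), (2, 0), (2, 2), (0, 2)])
```

Here the centroid is (1, 1), so every corner lies at the same radius, and `max_radius` plays the role of R = max{ r(θ) }.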

Generic Fourier Descriptor-II

• Polar Raster Sampling

[Figures: polar grid; polar image; polar raster sampled image in Cartesian space]

Generic Fourier Descriptor-III

• Binary polar raster sampled shape images

[Figures: shape images before and after polar raster sampling]

Generic Fourier Descriptor-IV

• 2-D Fourier transform on the polar raster sampled image f(r, θ):

$$PF(\rho, \phi) = \sum_r \sum_i f(r, \theta_i) \exp\left[ j 2\pi \left( \frac{r}{R}\rho + \frac{2\pi i}{T}\phi \right) \right]$$

where 0 ≤ r < R and θ_i = i(2π/T) (0 ≤ i < T); 0 ≤ ρ < R, 0 ≤ φ < T. R and T are the radial frequency resolution and angular frequency resolution, respectively.

• The normalized Fourier coefficients are the GFD.
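The two steps above, polar raster sampling followed by a 2-D Fourier transform, can be sketched as below. Several things here are assumptions for illustration: the binary image is a list of rows, sampling uses nearest-neighbour rounding, the names `polar_raster` and `polar_fourier` are hypothetical, and the transform is written as a standard 2-D DFT over the polar grid indices. Only magnitudes are returned, so the sign convention of the exponent does not affect the result.

```python
import cmath
import math

def polar_raster(image, n_radial, n_angular):
    """Sample a non-empty binary image on a polar grid centred on the shape centroid."""
    pts = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    xc = sum(x for x, _ in pts) / len(pts)
    yc = sum(y for _, y in pts) / len(pts)
    max_r = max(math.hypot(x - xc, y - yc) for x, y in pts)
    grid = [[0] * n_angular for _ in range(n_radial)]
    for a in range(n_radial):
        r = max_r * a / n_radial
        for i in range(n_angular):
            theta = 2 * math.pi * i / n_angular
            x = int(round(xc + r * math.cos(theta)))  # nearest-neighbour lookup
            y = int(round(yc + r * math.sin(theta)))
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                grid[a][i] = image[y][x]
    return grid

def polar_fourier(grid, m, n):
    """Magnitudes |PF(rho, phi)| for 0 <= rho < m and 0 <= phi < n (direct 2-D DFT)."""
    R, T = len(grid), len(grid[0])
    mags = []
    for rho in range(m):
        row = []
        for phi in range(n):
            acc = sum(grid[r][i] * cmath.exp(-2j * math.pi * (r * rho / R + i * phi / T))
                      for r in range(R) for i in range(T))
            row.append(abs(acc))
        mags.append(row)
    return mags

# Hypothetical input: a 5 x 5 image that is entirely shape.
grid = polar_raster([[1] * 5 for _ in range(5)], 4, 8)
mags = polar_fourier(grid, 2, 3)
```

Rotating the shape about its centroid becomes a circular shift along the angular axis of the polar grid, and a circular shift changes only the phase of the DFT, so the magnitudes returned by `polar_fourier` stay the same.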

Generic Fourier Descriptor-V

• Rotation invariant

[Figures: two polar raster sampled images and their Fourier spectra (PF)]

Generic Fourier Descriptor-VI

• Translation invariant because the shape centroid is used as the origin.

• Scale normalization:

$$GFD = \left\{ \frac{|PF(0,0)|}{area}, \frac{|PF(0,1)|}{|PF(0,0)|}, \ldots, \frac{|PF(0,n)|}{|PF(0,0)|}, \ldots, \frac{|PF(m,0)|}{|PF(0,0)|}, \ldots, \frac{|PF(m,n)|}{|PF(0,0)|} \right\}$$

• Because f(x, y) is real, only a quarter of the transformed coefficients are distinct. The first 36 coefficients are selected as the shape descriptor.

• The similarity between two shapes is measured by the city block distance between the two sets of GFDs.
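The scale normalization and the city block comparison can be sketched as follows. The names `gfd_vector` and `city_block` are hypothetical, the magnitudes are taken as a nested list indexed `[rho][phi]`, and `area` is simply passed in by the caller because the slide does not spell out how it is computed. The sample numbers are made up for illustration.

```python
def gfd_vector(mags, area):
    """Normalize |PF| magnitudes: the first term by area, all others by |PF(0,0)|."""
    dc = mags[0][0]
    flat = [v for row in mags for v in row]
    return [dc / area] + [v / dc for v in flat[1:]]

def city_block(a, b):
    """City block (L1) distance between two feature vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Made-up magnitudes for a shape and for a copy with 4x the area.
small = gfd_vector([[8.0, 4.0], [2.0, 1.0]], area=16.0)
large = gfd_vector([[32.0, 16.0], [8.0, 4.0]], area=64.0)
distance = city_block(small, large)
```

In this made-up case the magnitudes and the area grow by the same factor, so the two normalized vectors coincide and the distance is zero. That is the effect the normalization is meant to have.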

Experiment

• Datasets
– The MPEG-7 region shape database (CE-2) has been tested. MPEG-7 has organized CE-2 into six datasets to test a shape descriptor’s behavior under different distortions.
– Set A1 tests scale invariance. 100 shapes in Set A1 have been classified into 20 groups, which are designated as queries.
– Set A2 tests rotation invariance. 140 shapes in Set A2 have been classified into 20 groups, which are designated as queries.
– Set A3 tests rotation/scaling invariance.
– Set A4 tests robustness to perspective transform. 330 shapes in Set A4 have been classified into 30 groups, which are designated as queries.
– Set B consists of 2811 shapes from the whole database and is used for the subjective test. 682 shapes in Set B have been manually classified into 10 groups by MPEG-7.
– For the whole database, 651 shapes have been classified into 31 groups, which can be used as queries.

Performance Measurement

• Precision-Recall

$$P = \frac{r}{n_2} = \frac{\text{number of relevant retrieved images}}{\text{number of retrieved images}}$$

$$R = \frac{r}{n_1} = \frac{\text{number of relevant retrieved images}}{\text{total number of relevant images in DB}}$$

• For each query, the precision of the retrieval at each level of recall is obtained. The reported precision is the average precision over all query retrievals.
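The two ratios can be computed directly from a ranked retrieval list. This is a minimal sketch; the function name `precision_recall`, the `cutoff` parameter, and the sample identifiers are hypothetical.

```python
def precision_recall(ranked, relevant, cutoff):
    """Precision and recall over the top `cutoff` results of a ranked retrieval list."""
    retrieved = ranked[:cutoff]
    r = sum(1 for item in retrieved if item in relevant)  # relevant retrieved images
    precision = r / len(retrieved)   # r / number of retrieved images
    recall = r / len(relevant)       # r / total number of relevant images in DB
    return precision, recall

# Made-up ranking: 4 relevant images exist in the database, 2 appear in the top 3.
ranked = ["a", "x", "b", "y", "c"]
relevant = {"a", "b", "c", "d"}
p, r = precision_recall(ranked, relevant, cutoff=3)
```

Sweeping `cutoff` over the ranked list gives one precision value per recall level for a query, and averaging those values over all queries gives the curves reported in the results.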

Results

• Average Precision-Recall on Set A1 and A2

[Figure: two precision-recall plots comparing GFD and ZMD (x-axis: Recall, y-axis: Precision, both 0 to 100). Panels: Scale Invariance Test; Rotation Invariance Test]

Results

• Average Precision-Recall on Set A4 and CE-2

[Figure: two precision-recall plots comparing GFD and ZMD (x-axis: Recall, y-axis: Precision, both 0 to 100). Panels: Perspective Invariance Test; General Invariance Test]

• Average Precision-Recall on Set B

[Figure: precision-recall plot comparing GFD and ZMD (x-axis: Recall, 0 to 100; y-axis: Precision, 0 to 80)]

Subjective Test

Class           1     2     3     4     5     6     7     8     9     10    Average
No. of shapes   68    248   22    28    17    22    45    145   45    42
GFD (%)         47.0  66.4  55.6  50.0  50.0  24.8  30.4  50.8  55.6  29.0  46.0
ZMD (%)         37.0  58.0  55.0  41.2  42.6  22.6  33.6  52.0  41.4  34.0  41.7

Results

[Figures: panels labeled Set A1, Set A4, Set B, and CE-2]

Conclusions

• A new shape descriptor, the generic Fourier descriptor (GFD), has been proposed.

• It has been tested on the MPEG-7 region shape database.

• Comparisons have been made between GFD and the MPEG-7 shape descriptor ZMD.

• Compared with ZMD, GFD has four advantages:
– it captures spectral features in both radial and circular directions;
– it is simpler to compute;
– it is more robust and perceptually meaningful;
– the physical meaning of each feature is clearer.

• The proposed GFD satisfies all six requirements set by MPEG-7 for shape representation.
