Faculty of Information Technology
Generic Fourier Descriptor for Shape-based Image Retrieval
Dengsheng Zhang, Guojun Lu
Gippsland School of Comp. & Info Tech
Monash University
Churchill, VIC 3842
Australia
[email protected]
http://www.gscit.monash.edu.au/~dengs/
Outline
• Motivations
• Problems
• Generic Fourier Descriptor (GFD)
• Experimental Results
• Conclusions
Motivations
• Content-based Image Retrieval
– Image description is important for image searching
– Image description constitutes one of the key parts of MPEG-7
– Shape is an important image feature, along with color and texture
• Effective and Efficient Shape Descriptor
– good retrieval accuracy, compact features, general application, low computation complexity, robust retrieval performance, and hierarchical coarse-to-fine representation
Fourier Descriptor
• Obtained by applying the Fourier transform to a shape signature, such as the central distance function r(t):
$$ a_n = \frac{1}{N} \sum_{t=0}^{N-1} r(t) \exp\left(\frac{-j 2\pi n t}{N}\right), \qquad n = 0, 1, \ldots, N-1 $$
[Figure: shape signature, with example shapes labelled "No Contour" and "Same Contour and Different content"]
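The Fourier descriptor formula on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `fourier_descriptor` is hypothetical, the contour is assumed to be an ordered list of (x, y) boundary points, and the centroid is assumed to be the mean of those points.

```python
import numpy as np

def fourier_descriptor(boundary):
    """Coefficients a_n of the central distance signature r(t).

    boundary: (N, 2) array of (x, y) points ordered along the contour
    (an assumed input format, not taken from the slides).
    """
    boundary = np.asarray(boundary, dtype=float)
    # Assumption: the centroid is the mean of the boundary points.
    centroid = boundary.mean(axis=0)
    dx, dy = (boundary - centroid).T
    r = np.hypot(dx, dy)              # central distance function r(t)
    # np.fft.fft computes sum_t r(t) exp(-j 2 pi n t / N); divide by N for a_n.
    return np.fft.fft(r) / len(r)

# A circle has a constant signature, so only a_0 is non-zero.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
circle = np.column_stack([2.0 * np.cos(t), 2.0 * np.sin(t)])
coeffs = fourier_descriptor(circle)
print(abs(coeffs[0]), np.abs(coeffs[1:]).max())
```

The sketch needs a traced contour as input, which is exactly what is unavailable for the "No Contour" shapes in the figure.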
Zernike Moments
• Acquired by applying the Zernike moment transform to a shape region in polar space.
– Complex form
– Does not allow finer resolution in the radial direction
– Creates a number of repetitions in each order of moment
– Shape must be normalized into a unit disk
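$$ Z_{nm} = \frac{n+1}{\pi} \sum_{x} \sum_{y} f(x, y)\, V^{*}_{nm}(x, y) = \frac{n+1}{\pi} \sum_{r} \sum_{\theta} f(r\cos\theta, r\sin\theta)\, R_{nm}(r) \exp(-jm\theta), \qquad r \le 1 $$
$$ V_{nm}(x, y) = V_{nm}(r\cos\theta, r\sin\theta) = R_{nm}(r) \exp(jm\theta) $$
$$ R_{nm}(r) = \sum_{s=0}^{(n-|m|)/2} (-1)^{s} \frac{(n-s)!}{s!\,\left(\frac{n+|m|}{2} - s\right)!\,\left(\frac{n-|m|}{2} - s\right)!}\, r^{n-2s} $$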
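To make the "complex form" point concrete, the radial polynomial R_nm(r) alone can be evaluated as below. This is a sketch under the standard definition of the Zernike radial polynomial; the function name `zernike_radial` is hypothetical.

```python
from math import factorial

def zernike_radial(n, m, r):
    """Evaluate the Zernike radial polynomial R_nm(r) for 0 <= r <= 1."""
    m = abs(m)
    if m > n or (n - m) % 2:
        raise ValueError("need |m| <= n and n - |m| even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        numerator = (-1) ** s * factorial(n - s)
        denominator = (
            factorial(s)
            * factorial((n + m) // 2 - s)
            * factorial((n - m) // 2 - s)
        )
        total += numerator / denominator * r ** (n - 2 * s)
    return total

# R_20(r) = 2 r^2 - 1 and R_40(r) = 6 r^4 - 6 r^2 + 1
print(zernike_radial(2, 0, 0.5), zernike_radial(4, 0, 0.5))
```

Each (n, m) pair needs its own polynomial, and the highest power of r grows with the order n, which is one way to read the remark about radial resolution.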
Generic Fourier Descriptor
• Polar Transform
– For an input image f(x, y), it is first transformed into the polar image f(r, θ):
$$ r = \sqrt{(x - x_c)^2 + (y - y_c)^2}, \qquad \theta = \arctan\frac{y - y_c}{x - x_c} $$
where
$$ x_c = \frac{1}{M} \sum_{x=0}^{M-1} x \quad \text{and} \quad y_c = \frac{1}{N} \sum_{y=0}^{N-1} y $$
– Find R = max{ r(θ) }
Generic Fourier Descriptor-II
• Polar Raster Sampling
[Figure: polar grid; polar image; polar raster sampled image in Cartesian space]
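Polar raster sampling can be sketched as reading the image on a grid of (r, θ) positions around the centroid and storing the samples in a rectangular array. The code below is an illustration under several assumptions that are not stated on the slides: the function name `polar_raster_sample` is hypothetical, the centroid is taken as the mean coordinate of the non-zero pixels, sampling is nearest-neighbour, and the default bin counts are arbitrary.

```python
import numpy as np

def polar_raster_sample(img, radial_bins=32, angular_bins=64):
    """Return out[r_index, theta_index] sampled from a binary shape image."""
    ys, xs = np.nonzero(img)
    if xs.size == 0:
        raise ValueError("image contains no shape pixels")
    # Assumption: centroid = mean coordinate of the shape pixels.
    xc, yc = xs.mean(), ys.mean()
    max_r = np.hypot(xs - xc, ys - yc).max()      # R = max r(theta)
    r = np.arange(radial_bins) * (max_r / radial_bins)
    theta = np.arange(angular_bins) * (2.0 * np.pi / angular_bins)
    rr, tt = np.meshgrid(r, theta, indexing="ij")
    # Nearest-neighbour lookup of each polar grid point in the Cartesian image.
    x = np.rint(xc + rr * np.cos(tt)).astype(int)
    y = np.rint(yc + rr * np.sin(tt)).astype(int)
    inside = (x >= 0) & (x < img.shape[1]) & (y >= 0) & (y < img.shape[0])
    out = np.zeros((radial_bins, angular_bins), dtype=img.dtype)
    out[inside] = img[y[inside], x[inside]]
    return out

# A filled disk maps to rows that are constant along the angular axis.
yy, xx = np.mgrid[0:41, 0:41]
disk = ((xx - 20) ** 2 + (yy - 20) ** 2 <= 100).astype(np.uint8)
polar = polar_raster_sample(disk)
print(polar.shape)
```

The result is an ordinary 2-D array, so a standard 2-D Fourier transform can be applied to it directly.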
Generic Fourier Descriptor-III
• Binary polar raster sampled shape images
[Figure: polar raster sampling applied to binary shape images]
Generic Fourier Descriptor-IV
• 2-D Fourier transform on the polar raster sampled image f(r, θ):
$$ PF(\rho, \phi) = \sum_{r} \sum_{i} f(r, \theta_i) \exp\left[ j 2\pi \left( \frac{r}{R}\rho + \frac{2\pi i}{T}\phi \right) \right] $$
where 0 ≤ r < R and θ_i = i(2π/T) (0 ≤ i < T); 0 ≤ ρ < R, 0 ≤ φ < T. R and T are the radial frequency resolution and angular frequency resolution respectively.
• The normalized Fourier coefficients are the GFD.
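The double sum can be written as two matrix products. The sketch below is hedged in one important way: it assumes the angular term is the ordinary DFT kernel exp(j2πiφ/T), which may differ in angular scaling from the formula as it appears on the slide. The function name `polar_fourier` is hypothetical, and `polar_img[r, i]` is assumed to hold f(r, θ_i).

```python
import numpy as np

def polar_fourier(polar_img):
    """PF[rho, phi] = sum_r sum_i f(r, theta_i) exp(j 2 pi (r rho / R + i phi / T))."""
    polar_img = np.asarray(polar_img, dtype=float)
    R, T = polar_img.shape
    r = np.arange(R)
    i = np.arange(T)
    radial = np.exp(2j * np.pi * np.outer(r, r) / R)    # indexed [rho, r]
    angular = np.exp(2j * np.pi * np.outer(i, i) / T)   # indexed [i, phi]
    return radial @ polar_img @ angular

sample = np.arange(12, dtype=float).reshape(3, 4)
pf = polar_fourier(sample)
# For real input this is the complex conjugate of NumPy's forward FFT,
# so the magnitudes agree.
print(np.allclose(np.abs(pf), np.abs(np.fft.fft2(sample))))
```

Since only the magnitudes enter the descriptor, `np.fft.fft2` can stand in for the explicit sums under the same assumption.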
Generic Fourier Descriptor-V
• Rotation invariant
[Figure: polar raster sampled images and their Fourier spectra obtained by PF]
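One way to see the rotation invariance: rotating the shape about its centroid shifts the polar raster sampled image circularly along the angular axis, and a circular shift changes only the phase of the discrete Fourier transform, not its magnitude. The check below is a small sketch using `np.fft.fft2` as the transform (an assumption about the kernel); the helper names are hypothetical and the random array merely stands in for a polar image.

```python
import numpy as np

def spectrum_magnitude(polar_img):
    """Magnitude of the 2-D DFT of a polar raster sampled image."""
    return np.abs(np.fft.fft2(polar_img))

def rotate_polar(polar_img, steps):
    """Rotate by steps * (2 pi / T): a circular shift along the angular axis."""
    return np.roll(polar_img, steps, axis=1)

rng = np.random.default_rng(0)
polar_img = rng.integers(0, 2, size=(4, 12)).astype(float)
rotated = rotate_polar(polar_img, 3)      # 3 of 12 angular bins = 90 degrees
same = np.allclose(spectrum_magnitude(polar_img), spectrum_magnitude(rotated))
print(same)
```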
Generic Fourier Descriptor-VI
• Translation invariant, because the shape centroid is used as the origin.
• Scale normalization:
$$ \mathrm{GFD} = \left\{ \frac{|PF(0,0)|}{area}, \frac{|PF(0,1)|}{|PF(0,0)|}, \ldots, \frac{|PF(0,n)|}{|PF(0,0)|}, \ldots, \frac{|PF(m,0)|}{|PF(0,0)|}, \ldots, \frac{|PF(m,n)|}{|PF(0,0)|} \right\} $$
• Because f(x, y) is real, only a quarter of the transformed coefficients are distinct. The first 36 coefficients are selected as the shape descriptor.
• The similarity between two shapes is measured by the city block distance between the two sets of GFDs.
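The normalization and the distance measure can be sketched as follows. Several choices here are assumptions rather than facts from the slides: the function names `gfd` and `city_block` are hypothetical, the 36 coefficients are taken as 4 radial by 9 angular frequencies, `area` is taken to be the area of the circle of radius equal to the number of radial bins, and `np.fft.fft2` stands in for PF.

```python
import numpy as np

def gfd(polar_img, n_radial=4, n_angular=9):
    """Normalized feature vector from a polar raster sampled image [r, theta]."""
    polar_img = np.asarray(polar_img, dtype=float)
    pf = np.abs(np.fft.fft2(polar_img))[:n_radial, :n_angular]
    dc = pf[0, 0]
    if dc == 0:
        raise ValueError("empty shape: |PF(0,0)| is zero")
    # Assumption: 'area' is the area of the circle enclosing the polar image.
    area = np.pi * polar_img.shape[0] ** 2
    features = (pf / dc).ravel()      # |PF(rho, phi)| / |PF(0, 0)|
    features[0] = dc / area           # first entry is |PF(0, 0)| / area
    return features

def city_block(a, b):
    """City block (L1) distance between two feature vectors."""
    return float(np.sum(np.abs(np.asarray(a) - np.asarray(b))))

rng = np.random.default_rng(1)
shape_a = rng.integers(0, 2, size=(8, 16)).astype(float)
shape_a[0, 0] = 1.0                           # guarantee a non-empty shape
shape_b = np.roll(shape_a, 5, axis=1)         # same shape, rotated
print(len(gfd(shape_a)), city_block(gfd(shape_a), gfd(shape_b)))
```

In this sketch the rotated copy yields a distance of essentially zero, which is the behaviour a retrieval system relies on when ranking database shapes by distance to a query.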
Experiment
• Datasets
– The MPEG-7 region shape database (CE-2) has been tested. CE-2 has been organized by MPEG-7 into six datasets to test a shape descriptor's behavior under different distortions.
– Set A1 tests scale invariance. 100 shapes in Set A1 have been classified into 20 groups, which are designated as queries.
– Set A2 tests rotation invariance. 140 shapes in Set A2 have been classified into 20 groups, which are designated as queries.
– Set A3 tests rotation/scaling invariance.
– Set A4 tests robustness to perspective transform. 330 shapes in Set A4 have been classified into 30 groups, which are designated as queries.
– Set B consists of 2811 shapes from the whole database and is used for the subjective test. 682 shapes in Set B have been manually classified into 10 groups by MPEG-7.
– For the whole database, 651 shapes have been classified into 31 groups, which can be used as queries.
Performance Measurement
• Precision-Recall
$$ P = \frac{r}{n_2} = \frac{\text{number of relevant retrieved images}}{\text{number of retrieved images}} $$
$$ R = \frac{r}{n_1} = \frac{\text{number of relevant retrieved images}}{\text{total number of relevant images in DB}} $$
• For each query, the precision of the retrieval at each level of recall is obtained. The final precision is the average precision over all query retrievals.
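For a single query, the two ratios can be evaluated along the ranked result list as sketched below. The function name `precision_recall` and the choice to record one point per relevant hit are illustrative assumptions; the slides do not say how recall levels were binned.

```python
def precision_recall(ranked_ids, relevant_ids):
    """Return (recall, precision) pairs, one per relevant item retrieved."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("a query needs at least one relevant item")
    hits = 0
    points = []
    for n_retrieved, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            recall = hits / len(relevant)      # R = r / n1
            precision = hits / n_retrieved     # P = r / n2
            points.append((recall, precision))
    return points

# Toy ranking: three relevant shapes (a, b, c) among five retrieved.
points = precision_recall(["a", "x", "b", "y", "c"], ["a", "b", "c"])
print(points)
```

Averaging the precision values at matching recall levels over all queries gives one precision-recall curve per descriptor.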
Results
• Average Precision-Recall on Set A1 and A2
[Figure: precision (%) versus recall (%) for GFD and ZMD. Panels: Scale Invariance Test; Rotation Invariance Test]
Results
• Average Precision-Recall on Set A4 and CE-2
[Figure: precision (%) versus recall (%) for GFD and ZMD. Panels: Perspective Invariance Test; General Invariance Test]
• Average Precision-Recall on Set B
[Figure: precision (%) versus recall (%) for GFD and ZMD. Panel: Subjective Test]
Class            1     2     3     4     5     6     7     8     9     10    Average
No. of shapes    68    248   22    28    17    22    45    145   45    42
GFD (%)          47.0  66.4  55.6  50.0  50.0  24.8  30.4  50.8  55.6  29.0  46.0
ZMD (%)          37.0  58.0  55.0  41.2  42.6  22.6  33.6  52.0  41.4  34.0  41.7
Results
[Figure: retrieval examples, with rows labelled Set A1, Set A1, Set A4, Set A4]
Conclusions
• A new shape descriptor, the generic Fourier descriptor (GFD), has been proposed.
• It has been tested on the MPEG-7 region shape database.
• Comparisons have been made between GFD and the MPEG-7 shape descriptor ZMD.
• Compared with ZMD, GFD has four advantages:
– it captures spectral features in both radial and circular directions;
– it is simpler to compute;
– it is more robust and perceptually meaningful;
– the physical meaning of each feature is clearer.
• The proposed GFD satisfies all six requirements set by MPEG-7 for shape representation.