Chapter 7
Content Based Image Retrieval (CBIR)
In chapter 1, an introduction to “Content Based Image Retrieval (CBIR)”
was given. Some applications of CBIR and related problems and issues were also
discussed. Several tools and techniques are being used in the development of CBIR
systems to enhance the capabilities for better image retrieval at a higher level of
semantics.
Most of the image retrieval techniques presented in the literature are not
robust enough to maintain good retrieval performance when the input images
are altered.
Wavelets have found considerable application in image retrieval. In recent
years, wavelets and other members of the wavelet family, such as curvelets, have
been widely used in the development and implementation of efficient image
retrieval systems [Zho2009].
In this chapter, the role and effectiveness of wavelets in the development of
robust image retrieval schemes, which can perform better against several input
image alterations, is studied. Efforts have been made to utilize wavelets in
extracting “color and texture based features of the images” and their performance
is presented in comparison to some well-known standard color and texture
feature extraction schemes.
7.1 Wavelets in CBIR
Wavelets offer advantages of multi-resolution representation and space-
frequency localization. “They have capability to capture image details in different
directions viz., horizontal, vertical and diagonal”. Curvelets are excellent in
capturing image details along the curvatures at various resolution levels. Wavelets
have also proved to be good for texture based image retrieval.
“A wavelet-based salient point extraction algorithm for CBIR is proposed in
[Tia2001]. In this scheme, the color and texture information in the locations given
by these points is extracted which provides significant improvement in the
retrieval results”.
“A CBIR system named SIMPLIcity (Semantics sensitive Integrated
Matching for Picture LIbraries) is proposed in [Wan2001]. It uses semantics
classification methods, a wavelet-based approach for feature extraction, and
integrated region matching based upon image segmentation. The system classifies
images into semantic categories, such as textured-non textured, graph-
photograph. Potentially, the categorization enhances retrieval by permitting
semantically-adaptive searching methods and narrowing down the searching
range in a database. The proposed system also shows robustness to image
alterations”.
“A wavelet based texture feature extraction method for CBIR applications is
presented in [Vad2004]. This method uses standard deviations of high frequency
components of Daubechies wavelet transform as texture features. A total of nine
features are obtained from three levels of decomposition”.
“A simple wavelet based CBIR system is presented in [Chi2006]. In this
scheme, wavelet coefficients of each image are stored. In the image retrieval
phase, the system compares the most significant wavelet coefficients of the Y, U
and V components of the query image with those of the images in the database,
coupled with the weight factors assigned by users, and finds out the matches
based on the features of interest to users”.
“An algorithm for texture feature extraction using wavelet decomposed
coefficients of an image and its complement is presented in [Hir2006]. Four
different approaches (Multispectral approach in RGB, HSV, YCbCr space and Gray
scale texture feature) to color texture analysis are tested on the classification of
images from the VisTex database”.
“A novel approach for rotation invariant texture image retrieval is
presented in [Kok2006]. This method uses set of dual-tree rotated complex
wavelet filter (DT-RCWF) and DT complex wavelet transform (DT-CWT) jointly in
12 different directions. The DT-RCWFs are non-separable and oriented, which
improves characterization of oriented textures”.
“Representation of local properties in an image is an important issue in
CBIR. The work in [Muw2007], proposes a salient region detector based on
wavelet transform. The detector can extract visually meaningful regions of an
image and reflect local characteristics”.
“A CBIR method for a diagnosis aid in medical fields is proposed in
[Lam2007]. To extract texture features, Gaussian curves are fitted to the
distribution of wavelet coefficients at different levels of decomposition. Only a few
parameters defining the fitted curves are stored as image signatures in the feature
database. The wavelet function is adopted by the lifting scheme and retrieval
efficiency is given for different databases including a diabetic retinopathy, a
mammography and a face database”.
“A wavelet based retrieval scheme for trade mark images without
complicated image segmentation is presented in [Muw2008]. The proposed
trademark image retrieval scheme comprises two stages: first, edge detection
based on wavelet transform is performed on the trademark image, second, novel
wavelet-based shape features are introduced to reflect the edges’ characteristics”.
“A CBIR method based on an efficient combination of multi-resolution color
and texture features is proposed in [Chu2008]. In this scheme, color auto-
correlograms of hue and saturation component images are used as color features
and BDIP (block difference of inverse probabilities) and BVLC (block variation of
local correlation coefficients) moments of the value component images are used
for texture features”.
“A CBIR method which uses multi-wavelet transform is presented in
[Xiw2008]. Multi-wavelets are used to improve the shape and texture features
extraction and decrease the operands. Theoretical analysis and the numerical and
experimental results show that the multi-wavelet transform based image retrieval
can get good results for image retrieval”.
“The phase often holds crucial information about image structures and
features. However, only the real part or the magnitude of the transform
coefficients is typically used for image processing applications. A method for the
feature extraction of images called Phase-based LBP is presented in [Ngu2010].
Proposed method is based on the combination of phase of complex wavelet
coefficients and the Local Binary Pattern operator (LBP)”.
“A scheme of color image retrieval, based on wavelet transform and G-
Regions of Interest (GROI) is presented in [Xug2010]. The images are first
represented in HSV color space. Areas of interest are then extracted by using
K-means clustering in the wavelet domain. The energies of the wavelet coefficients in
these areas of interest are used as a texture feature. For color feature, mean and
variance are used. The barycentric coordinates are also utilized as position
features”.
“An image retrieval method using the Daubechies wavelet and stage
treatment is proposed in [Wan2010]. This algorithm decomposes the color
information of the image using Daubechies wavelet and constructs the eigenvector
using both the low frequency component and the high frequency component.
Three features: variance, invariant moment and angle of the eigenvector are used
to compare the similarity between the retrieval image and the images in the
database”.
“A new image indexing and retrieval algorithm by integrating color and
texture features is proposed in [Red2012]. Histograms are constructed from HSV
space and used as color feature. For texture feature, the color image is converted
into grayscale and divided into eight binary bit-planes. The Binary Wavelet
Transform (BWT) is applied on each bit-plane and the local binary pattern (LBP)
features are extracted from the resultant BWT sub-bands”.
“The use of pyramidal and tree structured wavelet features using 8-tap
Daubechies coefficients is proposed in [Kok2013] for texture analysis along with
extensive experimental evaluation. Comparison with various features indicates
that the combination of energy and standard deviation of wavelet features provide
good pattern retrieval accuracy for tree structured wavelet decomposition while
standard deviation alone gives better result in pyramidal wavelet decomposition”.
“Various other techniques have been used to improve retrieval rates along
with wavelet decomposition. These techniques mainly focus on use of relevance
feedback [Bul2011], use of energy distribution pattern, statistical properties,
various color models, texture analysis [Tam1978] etc. to extract details at various
levels of resolution. Along with simple wavelet decomposition, the use of some
advanced versions of wavelets such as contourlets [Rao2007], curvelets
[Sum2008], ridgelets [God2010] etc. is also reported in various methods. A
comparative literature survey of use of wavelet based techniques in CBIR is given
in table 7.1”.
From table 7.1 it can be observed that researchers have utilized the
strengths of wavelets in various ways. The majority of the work is focused on
texture and color based feature extraction. In almost all the schemes, energy,
standard deviation or other higher order moments of wavelet coefficients are used
for texture feature extraction. Detailed analysis of the properties of wavelets and
effective utilization of their strengths have not been addressed adequately.
The majority of the works do not report the robustness of the proposed CBIR
scheme against image alterations and noise; the few exceptions include [Wan2001,
Kok2006, Xiw2008]. The work in [Wan2001] shows the robustness of
the CBIR scheme against various image processing alterations done to the query
images, such as cropping and intensity variations. The works of [Kok2006,
Xiw2008] consider only rotation invariance. Wavelets have the capability of
capturing edge details of the images, even in the presence of noise. This can help in
providing robustness against noise. The multi-resolution feature of wavelets can
provide robustness against cropping and scaling. Therefore, there is motivation to
develop a robust CBIR system which performs well even when the input image
is altered.
In this chapter, an attempt has been made to develop a robust CBIR
scheme, “which utilizes the strength of multi resolution capability of wavelets and
strength of edge histogram descriptor”. This scheme extracts both color and
texture features with the help of wavelet coefficients.
7.2 Proposed CBIR scheme based on Multi-Resolution Wavelet Transform and Edge Histogram¹¹
Wavelet transform analysis is often used for extracting features of an image
in a few directions. The high frequency detailed wavelet coefficients are able to
represent image details in horizontal, vertical and diagonal directions while
approximation coefficients represent overall image content. The strength of
wavelet based image representation is multi-resolution representation of an
image which emulates the human visual system of observing coarser and finer
details of an image. “The orientations of various edges in an image can be
captured by finding the edge histogram with the help of the Edge Histogram
Descriptor (EHD) of the MPEG-7 standard [Won2002]. Various CBIR techniques use
EHD for image retrieval based on shape and texture features”. In the proposed CBIR
method, the multi-resolution capability of wavelets and the edge capturing
capability of EHD are combined to achieve higher retrieval rates.
Color is another very important and primary feature used by CBIR systems.
Various color features defined by the MPEG-7 standard are used in practice, such as
“Color Layout Descriptor (CLD), Scalable Color Descriptor (SCD), Color Structure
Descriptor (CSD), Dominant Color Descriptor (DCD)” etc. Other color descriptors
such as the Color Correlogram [Jin1997] are also popular. These color descriptors
extract color features in RGB, HSV, YCbCr and HMMD color spaces.
“The HSV color space is a popular choice for manipulating colors. The HSV
color space is developed to provide an intuitive representation of color and to
approximate the way in which humans perceive and manipulate color. This HSV
color space is used by Scalable Color Descriptor (SCD). The SCD is one of the
widely used MPEG-7 descriptors for color feature extraction in CBIR systems. In the
proposed CBIR scheme, a modified SCD is implemented, which is simple but
effective as it utilizes full 256 bins color histograms without any quantization”.
11. This part of the thesis is presented and published in: “C. Patvardhan, A. K. Verma and C. V. Lakshmi, ‘A Robust Content Based Image Retrieval Based On Multi-Resolution Wavelet Features and Edge Histogram’, 2nd IEEE International Conference on Image Information Processing (ICIIP), pp. 447-452, Jaypee University of Information & Technology, Shimla, Dec. 9 - 11, 2013”.
7.2.1 Proposed Methodology
In the proposed scheme, the effectiveness of wavelet transform is utilized
in both color and texture based feature extraction. Edge histogram is also utilized
along with wavelet transform for texture based feature extraction. These feature
extraction schemes are described next.
7.2.1.1 Color Feature Extraction Scheme
To extract color features of images, a modified version of the Scalable Color
Descriptor (SCD) of the MPEG-7 standard is utilized. This version is easier to
implement as the quantization steps of SCD are omitted. “The Haar wavelet
transform is used to reduce the dimension of feature vector”. The overall scheme
is shown in figure 7.1.
Like SCD, the proposed scheme also utilizes the HSV color space. In SCD, the
histograms are quantized and represented in only 16, 4 and 4 bins for H, S and V
respectively. But in the proposed scheme, the full 256 bins histograms are found
for each of the H, S and V color channels collecting more color details at this level.
At this stage, the total length of color feature vector is 3 × 256 = 768. The “Haar
transform” is used to reduce this number.
In the HSV color model, ‘H’ is Hue, which represents the distinct color, ‘S’
is Saturation, which represents the amount of color, and ‘V’ is Value, which
represents the brightness of the color. Hue is therefore more important than ‘S’
and ‘V’, and “the images can be identified on the basis of Hue values”. Thus the
histogram of Hue is decomposed only to level 2, which retains more coefficients,
while the histograms of S and V are decomposed to level 3.
In Haar wavelet decomposition, the approximation coefficients are
considered because they preserve the fundamental shape of histogram curve at
several levels of decomposition. High frequency detailed coefficients are ignored
as they only contribute fast color changes of noisy nature. The steps are depicted
in figure 7.1 and are explained as follows.
i) Convert RGB input image (𝐼) into HSV color space.
ii) Compute full 256 bins histograms of each color channel H, S and V.
iii) Perform 2-level Haar wavelet decomposition for H-histogram (𝐻ℎ)
and 3-level Haar wavelet decomposition for S-Histogram (𝐻𝑠) and V-
Histogram (𝐻𝑣).
iv) Select only approximation coefficients of these wavelet
decompositions and concatenate them to form final color feature
vector (𝑓𝑣𝑐).
The size of each histogram is 256. The 2-level wavelet decomposition of 𝐻ℎ
and 3-level decomposition of 𝐻𝑠 and 𝐻𝑣 result in approximation coefficient vectors
of lengths 64, 32 and 32 respectively. Therefore, the total length of color feature
vector becomes 128.
In the proposed method, the Haar wavelet is used because the only objective
here is to reduce the size of the feature vector. Also, the Haar filter is the shortest
wavelet filter, leading to a very fast method of color feature extraction.
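Steps ii) to iv) above can be sketched as follows. This is a minimal illustration rather than the implementation used for the reported results. It assumes the image has already been converted to HSV with each channel scaled to 0..255 (step i is left to an image library), and the function names `haar_approximation` and `color_feature_vector` are hypothetical.

```python
import numpy as np

def haar_approximation(signal, levels):
    """Return the Haar approximation coefficients after `levels` decompositions."""
    approx = np.asarray(signal, dtype=float)
    for _ in range(levels):
        # Orthonormal Haar low-pass step: scaled sum of neighbouring pairs.
        approx = (approx[0::2] + approx[1::2]) / np.sqrt(2.0)
    return approx

def color_feature_vector(hsv_image):
    """Build the 128-element color feature from an HSV image.

    Assumption: `hsv_image` is a (rows, cols, 3) uint8 array whose H, S and V
    channels are each scaled to 0..255.
    """
    levels = (2, 3, 3)  # H is decomposed to level 2, S and V to level 3
    parts = []
    for channel, level in enumerate(levels):
        # Full 256-bin histogram of the channel, without quantization.
        hist = np.bincount(hsv_image[..., channel].ravel(), minlength=256)
        # Keep only the approximation coefficients; details are discarded.
        parts.append(haar_approximation(hist, level))
    return np.concatenate(parts)  # 64 + 32 + 32 = 128 values

# Usage on a synthetic HSV image (random data, for shape checking only).
rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(48, 64, 3), dtype=np.uint8)
fv_c = color_feature_vector(demo)
print(fv_c.shape)  # (128,)
```

Because only pairwise sums are needed, the reduction from 768 to 128 values costs a few hundred additions per image, which is consistent with the speed argument made for the Haar filter.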
Fig. 7.1: Wavelet based color feature extraction scheme
7.2.1.2 Texture Feature Extraction Scheme
Wavelets are good at capturing details of an image in “horizontal, vertical
and diagonal directions” even in the presence of noise, where EHD features alone
do not perform well. Wavelets can also capture these details at various
resolution levels, similar to the human visual system, which EHD alone cannot do.
Therefore, an attempt is made to combine their capabilities to achieve better
retrieval results. In the proposed scheme, major image details are captured by
the wavelet transform of the image at three resolution levels. Dominant edges are
then captured by EHD to form the feature vector. The overall scheme is shown in
figure 7.2.
Fig. 7.2: Wavelet and EHD based texture feature extraction scheme
The edge details of the wavelet coefficients are captured by calculating the edge
histogram as shown in figure 7.2. The edge histogram is calculated by the method
already discussed in section 3.3.1.1 of chapter 3. However, 5 bins are added as
global features of the image. These 5 additional bins are obtained by averaging all
the 16 sets of 5 bins representing local image properties. Therefore, the length of
the edge histogram vector becomes 85 instead of 80. As can be seen from figure 7.2,
the edge histograms are calculated for wavelet coefficients at all three levels.
Therefore, the total length of the texture feature vector becomes 3 × 85 = 255. The
steps are as follows.
i) Decompose image (𝐼) up to level 3 and collect all the wavelet
coefficients at each resolution level.
ii) Arrange wavelet coefficients in a matrix 𝐶𝑚 for each level of
decomposition as shown in figure 7.2.
iii) Find Edge Histogram of matrix 𝐶𝑚 for each level of decomposition. The
edge histogram generates 85 values of texture features. Therefore, for
all the three levels, 255 values are generated.
iv) Concatenate all the edge histogram values to form a single texture
feature vector(𝑓𝑣𝑡) and store it in feature database.
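Steps iii) and iv) can be sketched as below. The edge histogram method of section 3.3.1.1 is not reproduced in this chapter, so the sketch assumes the 2 × 2 edge filters commonly given for the MPEG-7 EHD; the block size and threshold are hypothetical parameters that would need tuning for wavelet coefficient matrices. The 5 global bins are the average of the 16 local 5-bin histograms, as described above.

```python
import numpy as np

_R2 = np.sqrt(2.0)
# Assumed MPEG-7 style 2x2 edge filters, applied to the four sub-block means.
EDGE_FILTERS = np.array([
    [[1, -1], [1, -1]],      # vertical edge
    [[1, 1], [-1, -1]],      # horizontal edge
    [[_R2, 0], [0, -_R2]],   # 45 degree diagonal edge
    [[0, _R2], [-_R2, 0]],   # 135 degree diagonal edge
    [[2, -2], [-2, 2]],      # non-directional edge
])

def edge_histogram_85(matrix, block=8, threshold=11.0):
    """Return 80 local bins (16 sub-images x 5 edge types) plus 5 global bins."""
    m = np.asarray(matrix, dtype=float)
    sub_h, sub_w = m.shape[0] // 4, m.shape[1] // 4
    half = block // 2
    local = np.zeros((16, 5))
    for i in range(4):
        for j in range(4):
            sub = m[i * sub_h:(i + 1) * sub_h, j * sub_w:(j + 1) * sub_w]
            n_blocks = 0
            for r in range(0, sub_h - block + 1, block):
                for c in range(0, sub_w - block + 1, block):
                    b = sub[r:r + block, c:c + block]
                    means = np.array([
                        [b[:half, :half].mean(), b[:half, half:].mean()],
                        [b[half:, :half].mean(), b[half:, half:].mean()],
                    ])
                    strengths = np.abs((EDGE_FILTERS * means).sum(axis=(1, 2)))
                    n_blocks += 1
                    # Count the block only if its strongest edge is significant.
                    if strengths.max() > threshold:
                        local[4 * i + j, strengths.argmax()] += 1
            if n_blocks:
                local[4 * i + j] /= n_blocks  # normalise by blocks per sub-image
    global_bins = local.mean(axis=0)  # average of the 16 local histograms
    return np.concatenate([local.ravel(), global_bins])

def texture_feature_vector(coefficient_matrices):
    """Concatenate the 85-bin histograms of the three levels (3 x 85 = 255)."""
    return np.concatenate([edge_histogram_85(c) for c in coefficient_matrices])
```

With three coefficient matrices, one per decomposition level, `texture_feature_vector` returns the 255 values described in step iii).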
In the proposed method, the Db10 wavelet is used for texture feature extraction.
This is a higher order smooth wavelet, which is capable of capturing fast changing
(high frequency) details of images. Also, chapter 6 has already shown that Db10
performs better than other wavelets for capturing edge details of characters.
Therefore, Db10 is used in the proposed CBIR scheme, as image details are mainly
captured in the form of edges.
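The construction of the coefficient matrix 𝐶𝑚 in steps i) and ii) can be illustrated with the sketch below. Two points are assumptions. First, the Haar wavelet is used as a stand-in for Db10 so that the sketch needs only NumPy; a wavelet library would be needed to apply Db10 itself. Second, the quadrant layout of 𝐶𝑚 (approximation at top left, detail bands around it) is assumed, since figure 7.2 is not reproduced here. The function names are hypothetical.

```python
import numpy as np

def haar_step_2d(x):
    """One level of the orthonormal 2-D Haar transform (rows, then columns)."""
    s = np.sqrt(2.0)
    lo = (x[0::2, :] + x[1::2, :]) / s   # low-pass across row pairs
    hi = (x[0::2, :] - x[1::2, :]) / s   # high-pass across row pairs
    approx = (lo[:, 0::2] + lo[:, 1::2]) / s
    detail_1 = (lo[:, 0::2] - lo[:, 1::2]) / s
    detail_2 = (hi[:, 0::2] + hi[:, 1::2]) / s
    detail_3 = (hi[:, 0::2] - hi[:, 1::2]) / s
    return approx, (detail_1, detail_2, detail_3)

def coefficient_matrices(image, levels=3):
    """Return one coefficient matrix per decomposition level."""
    approx = np.asarray(image, dtype=float)
    matrices = []
    for _ in range(levels):
        # Crop to even dimensions so that row and column pairs line up.
        rows, cols = approx.shape
        approx = approx[: rows - rows % 2, : cols - cols % 2]
        approx, (d1, d2, d3) = haar_step_2d(approx)
        # Assumed layout: approximation top-left, the three detail bands around it.
        matrices.append(np.block([[approx, d1], [d2, d3]]))
    return matrices

# Usage on a synthetic grayscale image (random data, for shape checking only).
rng = np.random.default_rng(0)
demo = rng.random((288, 384))
print([c.shape for c in coefficient_matrices(demo)])
# [(288, 384), (144, 192), (72, 96)]
```

Each level halves the resolution of the approximation that is decomposed next, which is the multi-resolution property the texture scheme relies on.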
7.2.2 Experimental Results and Analysis
The proposed scheme is tried on images of various categories from two popular
databases: Wang’s image database [Wan2013] and the Microsoft Research
Cambridge Object Recognition Image Database [Mic2013]. “These are used for
training and testing the performance of the proposed scheme. The Wang’s image
database is used in implementation of SIMPLIcity software package [Wan2001]
and also used by majority of researchers to show the performance of their
proposed CBIR schemes”.
“The Wang’s image database consists of 1000 images having 10 categories
of 100 images each. These 10 categories and their sample images are shown in
table 7.2”.
Table 7.2: Wang’s Image Database
Category Sample images
Africans
Beaches
Buildings
Buses
Dinosaurs
Elephants
Roses
Horses
Mountains
Food
Table 7.3: Microsoft Research Image Database
Category Sample images
Aero planes
Cow
Sheep
Bi-Cycles
Cars
Chimneys
Clouds
Doors
Cutlery
Landscapes
Trees
Windows
Each image of Wang’s database is an RGB color image of size 288 × 384
or 384 × 288. Microsoft’s image database consists of 4129 images in a total of 18
categories. Unlike Wang’s database, the images are not divided equally among the
categories: some categories have many images while others have very few.
Therefore, 12 categories are selected for testing. These categories, along with
sample images, are shown in table 7.3. Each image in Microsoft’s database is an
RGB color image of size 640 × 480 or 480 × 640.
Before testing the proposed scheme, it is trained with images of Wang’s and
Microsoft’s image databases. The training procedure is shown in figure 7.3.
Fig.7.3: Training procedure for proposed CBIR scheme
In the training procedure, “both color and texture features are calculated
for each image in the database and these feature vectors are stored in database”.
Similarly for testing, “both the color and texture feature vectors are calculated for
the query image and compared with feature vectors already stored in database”.
Then the first ′𝑛′ nearest matches are displayed as retrieval results. The testing
procedure is shown in figure 7.4. “The similarity check finds distance between
feature vector of query image and feature vectors stored in the database”. In the
proposed scheme, nearest match is found by the sum of absolute differences of
feature vectors. This is calculated using equation 6.1 of chapter 6.
Fig. 7.4: Testing procedure for proposed CBIR scheme
To test the performance of the proposed CBIR scheme, the following 4 test
procedures are followed.
Procedure 1 (P1): In this procedure, only color features are used.
Procedure 2 (P2): In this procedure, only texture features are used.
Procedure 3 (P3): It is a two-stage classification process which utilizes both the
color and texture features. Here, the color feature of the query image is used for
primary classification; the texture feature is then used to select the final results
from the results of the primary classification.
Procedure 4 (P4): This is also a two-stage classification process where both
features are used. Here, the texture feature of the query image is first used for
primary classification. The color features are then used to select the final results
from the results of the primary classification.
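The two-stage procedures P3 and P4 can be sketched as a shortlist followed by a re-ranking. The number of images passed from the first stage to the second is not stated in the text, so `pool_size` below is a hypothetical parameter, as are the function names; the distance is the sum of absolute differences used elsewhere in the scheme.

```python
import numpy as np

def sad(query_fv, fvs):
    """Sum of absolute differences between a query and each row of fvs."""
    return np.abs(np.asarray(fvs, dtype=float) - np.asarray(query_fv, dtype=float)).sum(axis=1)

def two_stage_retrieval(q_first, db_first, q_second, db_second, n, pool_size):
    """Shortlist by the first feature, then rank the shortlist by the second."""
    shortlist = np.argsort(sad(q_first, db_first), kind="stable")[:pool_size]
    second = sad(q_second, np.asarray(db_second, dtype=float)[shortlist])
    return shortlist[np.argsort(second, kind="stable")[:n]]

# Toy one-dimensional features for four images (illustrative values only).
db_color = np.array([[0], [1], [2], [10]])
db_texture = np.array([[5], [0], [9], [0]])
q_color, q_texture = [0], [0]

# P3: color first, texture second. P4: texture first, color second.
p3 = two_stage_retrieval(q_color, db_color, q_texture, db_texture, n=2, pool_size=3)
p4 = two_stage_retrieval(q_texture, db_texture, q_color, db_color, n=2, pool_size=3)
print(p3.tolist(), p4.tolist())  # [1, 0] [0, 1]
```

The toy data shows that the two orders can return the same images in a different ranking, or different images altogether when the shortlists differ.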
“The results of relevant image retrieval are expressed in terms of Precision and
Recall, which are adopted by the majority of researchers. Precision and Recall are
defined as follows.
\[ \text{Precision} = \frac{N_o}{N_r} \tag{7.1} \]
\[ \text{Recall} = \frac{N_o}{N_d} \tag{7.2} \]
𝑁𝑜: Number of relevant images retrieved.
𝑁𝑟: Number of total images requested.
𝑁𝑑: Total number of relevant images in database.
Both Precision and Recall are expressed in percentage terms. Larger values
of both precision and recall represent good performance of the CBIR system”.
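Equations 7.1 and 7.2 translate directly into code. The sketch below uses a hypothetical helper name and a made-up retrieval list; the numbers are illustrative and are not taken from the result tables.

```python
def precision_recall(retrieved_categories, query_category, n_relevant_in_db):
    """Return (precision, recall) in percent for one query.

    retrieved_categories: category label of each retrieved image (length N_r).
    n_relevant_in_db: number of images of the query category in the database (N_d).
    """
    n_r = len(retrieved_categories)
    # N_o: retrieved images that belong to the query's category.
    n_o = sum(1 for c in retrieved_categories if c == query_category)
    precision = 100.0 * n_o / n_r
    recall = 100.0 * n_o / n_relevant_in_db
    return precision, recall

# Hypothetical query: 10 images requested, 8 of them from the query category,
# in a database holding 100 images of that category.
retrieved = ["Roses"] * 8 + ["Food"] * 2
print(precision_recall(retrieved, "Roses", 100))  # (80.0, 8.0)
```

With a fixed number of relevant images, recall can only approach 100% as the number of requested images grows, whereas precision typically falls, which is why both measures are reported.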
To test the retrieval performance, 10 test images from Wang’s database and
12 test images from Microsoft’s database are randomly selected, one from each
category of both databases. These test images are used as query images and are
shown in figure 7.5(a) and (b).
Fig. 7.5 (a): Test images from each category from Wang’s database: Africans (3.jpg), Beaches (158.jpg), Buildings (244.jpg), Buses (314.jpg), Dinosaurs (445.jpg), Elephants (565.jpg), Roses (619.jpg), Horses (744.jpg), Mountains (801.jpg), Food (977.jpg)
Fig. 7.5 (b): Test images from each category from Microsoft’s database: Airplanes (28.jpg), Cows (206.jpg), Sheep (267.jpg), Bi-Cycles (723.jpg), Cars (1000.jpg), Chimneys (1620.jpg), Clouds (1846.jpg), Doors (2265.jpg), Cutlery (2571.jpg), Landscapes (2852.jpg), Trees (3377.jpg), Windows (3635.jpg)
Testing is performed using all four procedures. For procedure 1 (P1), the results
for the test images of figure 7.5(a) are shown in table 7.4 for up to the first 50
nearest matches.
Table 7.4: Comparison of Precision (%) for Color Features only (Procedure: P1 / Wang’s Images)
Image Category  Image No.  𝑁𝑟 = 10  𝑁𝑟 = 20  𝑁𝑟 = 30  𝑁𝑟 = 40  𝑁𝑟 = 50
Africans        3.jpg       100.00   100.00   100.00    95.00    92.00
Beaches         158.jpg     100.00   100.00    93.33    95.00    88.00
Buildings       244.jpg      80.00    85.00    73.33    67.50    70.00
Buses           314.jpg     100.00    95.00    93.33    95.00    94.00
Dinosaurs       445.jpg     100.00   100.00   100.00   100.00   100.00
Elephants       565.jpg     100.00    75.00    73.33    72.50    66.00
Roses           619.jpg     100.00   100.00    93.33    90.00    86.00
Horses          744.jpg     100.00   100.00   100.00    97.50    96.00
Mountains       801.jpg      40.00    30.00    33.33    42.50    36.00
Food            977.jpg      70.00    60.00    63.33    67.50    70.00
Average                      89.00    84.50    82.33    82.25    79.80
Similarly the retrieval results for only texture features are shown in table 7.5.
Table 7.5: Comparison of Precision (%) for Texture Features only (Procedure: P2 / Wang’s Images)
Image Category  Image No.  𝑁𝑟 = 10  𝑁𝑟 = 20  𝑁𝑟 = 30  𝑁𝑟 = 40  𝑁𝑟 = 50
Africans        3.jpg        60.00    45.00    40.00    30.00    24.00
Beaches         158.jpg      80.00    65.00    56.67    55.00    54.00
Buildings       244.jpg      60.00    50.00    40.00    42.50    40.00
Buses           314.jpg     100.00   100.00    96.67    95.00    92.00
Dinosaurs       445.jpg     100.00   100.00    96.67    97.50    98.00
Elephants       565.jpg      60.00    70.00    50.00    37.50    32.00
Roses           619.jpg     100.00   100.00   100.00   100.00   100.00
Horses          744.jpg     100.00    90.00    86.67    87.50    82.00
Mountains       801.jpg      50.00    40.00    26.67    27.50    24.00
Food            977.jpg      70.00    50.00    43.33    40.00    38.00
Average                      78.00    71.00    63.67    61.25    58.40
It can be observed from the results of tables 7.4 and 7.5 that, on average, the
performance of color features alone is better than that of texture features alone.
For a few categories such as Buses, Dinosaurs and Roses, where the object is very
clear, the results of texture based features are comparable to, or better than, those
of color based features. But the problem becomes more complicated when the level
of semantics increases, as in the case of images in the Africans, Elephants,
Mountains and Food categories. Therefore, both color and texture features need to
be combined. This combination is done in testing procedure 3 (P3) and procedure 4
(P4). The results of testing the same Wang’s query test images under procedures
P3 and P4 are shown in tables 7.6 and 7.7.
Table 7.6: Comparison of Precision (%) for Color & Texture Features (Procedure: P3 / Wang’s Images)
Image Category  Image No.  𝑁𝑟 = 10  𝑁𝑟 = 20  𝑁𝑟 = 30  𝑁𝑟 = 40  𝑁𝑟 = 50
Africans        3.jpg       100.00   100.00    96.67    95.00    92.00
Beaches         158.jpg     100.00   100.00   100.00    92.50    90.00
Buildings       244.jpg     100.00    95.00    76.67    75.00    70.00
Buses           314.jpg     100.00   100.00   100.00   100.00    94.00
Dinosaurs       445.jpg     100.00   100.00   100.00   100.00   100.00
Elephants       565.jpg     100.00    80.00    73.33    70.00    66.00
Roses           619.jpg     100.00   100.00   100.00   100.00    86.00
Horses          744.jpg     100.00   100.00   100.00   100.00    96.00
Mountains       801.jpg      90.00    75.00    56.67    42.50    36.00
Food            977.jpg     100.00    95.00    83.33    77.50    70.00
Average                      99.00    94.50    88.67    85.25    80.00
Table 7.7: Comparison of Precision (%) for Texture & Color Features (Procedure: P4 / Wang’s Images)
Image Category  Image No.  𝑁𝑟 = 10  𝑁𝑟 = 20  𝑁𝑟 = 30  𝑁𝑟 = 40  𝑁𝑟 = 50
Africans        3.jpg        80.00    55.00    43.33    32.50    26.00
Beaches         158.jpg     100.00    85.00    73.33    70.00    58.00
Buildings       244.jpg      90.00    70.00    60.00    55.00    48.00
Buses           314.jpg     100.00   100.00   100.00   100.00    90.00
Dinosaurs       445.jpg     100.00   100.00   100.00   100.00   100.00
Elephants       565.jpg      90.00    55.00    40.00    37.50    34.00
Roses           619.jpg     100.00   100.00   100.00   100.00   100.00
Horses          744.jpg     100.00   100.00   100.00    97.50    80.00
Mountains       801.jpg      70.00    60.00    40.00    30.00    24.00
Food            977.jpg      90.00    75.00    50.00    37.50    30.00
Average                      92.00    80.00    70.67    66.00    59.00
From the results of tables 7.6 and 7.7, it is clear that the results of procedure 3
(P3) are better than those of procedure 4 (P4). In practice, the choice between
P3 and P4 is governed by the user’s needs: if color is the main search criterion,
P3 is selected; if texture is the main search criterion, P4 is the choice.
The total number of images (𝑁𝑑) in each category of Wang’s database is
100. Therefore, by equation 7.2, the recall rate can be found by considering the
first 100 images in the retrieval results. The recall rates for each category
are shown in table 7.8 for P3 and P4.
Table 7.8: Recall rates (%, 𝑁𝑜/𝑁𝑑) for Wang’s test images for procedures P3 and P4
Image Category  No. of Images  Image No.  Color First (P3)  Texture First (P4)
Africans        100            3.jpg      73.00             21.00
Beaches         100            158.jpg    65.00             43.00
Buildings       100            244.jpg    47.00             34.00
Buses           100            314.jpg    78.00             70.00
Dinosaurs       100            445.jpg    98.00             74.00
Elephants       100            565.jpg    42.00             26.00
Roses           100            619.jpg    60.00             80.00
Horses          100            744.jpg    80.00             60.00
Mountains       100            801.jpg    34.00             18.00
Food            100            977.jpg    58.00             30.00
Average                                   63.50             45.60
From the results of table 7.8, it can be seen that the recall rates of procedure P3
are better on average for Wang’s images.
Some visual results of image retrieval for each procedure (P1, P2, P3 and
P4) are shown in figures 7.6 to 7.9. In each case, 15 images are requested for
display, and the first image at the top left corner is the query image.
It can be easily seen from the retrieval results of figure 7.6 to 7.9 that
proposed wavelet based color and texture features are strong representations of
image details and the combination of both the features further enhances the
retrieval capabilities.
The performance of the proposed wavelet based CBIR scheme is also tested
for another popular image database which is obtained from Microsoft Research.
Again all the four test procedures are performed and the results of retrieval are
shown in tables 7.9 to 7.12 for the test images of figure 7.5 (b).
Fig. 7.6: “Retrieval result for query image (Wang’s 3.jpg) for procedure P1”
Fig. 7.7: “Retrieval result for query image (Wang’s 619.jpg) for procedure P2”
Fig. 7.8: “Retrieval result for query image (Wang’s 744.jpg) for procedure P3”
Fig. 7.9: “Retrieval result for query image (Wang’s 314.jpg) for procedure P4”
Table 7.9: Comparison of Precision (%) for only Color Feature (Procedure: P1 / MS Images)
Image Category  Category Range  Image No.  𝑁𝑟 = 10  𝑁𝑟 = 20  𝑁𝑟 = 30  𝑁𝑟 = 40  𝑁𝑟 = 50
Aero planes     1-58            28.jpg      100.00    65.00    50.00    40.00    38.00
Cow             59-240          206.jpg      60.00    65.00    66.67    67.50    60.00
Sheep           241-430         267.jpg     100.00    95.00    86.67    82.50    74.00
Bi-Cycles       499-770         723.jpg      80.00    70.00    63.33    60.00    52.00
Cars            989-1483        1000.jpg     90.00    80.00    73.33    77.50    74.00
Chimneys        1484-1749       1620.jpg    100.00   100.00    93.33    92.50    88.00
Clouds          1750-2178       1846.jpg    100.00    95.00    96.67    97.50    98.00
Doors           2179-2344       2265.jpg     60.00    40.00    46.67    47.50    40.00
Cutlery         2511-2685       2571.jpg    100.00   100.00   100.00    95.00    90.00
Landscapes      2804-3019       2852.jpg     40.00    45.00    40.00    40.00    44.00
Trees           3257-3473       3377.jpg     90.00    90.00    86.67    85.00    82.00
Windows         3474-4125       3635.jpg     70.00    65.00    53.33    50.00    48.00
Average                                      75.00    75.83    71.39    69.58    65.67
Table 7.10: Comparison of Precision (%) for only Texture Feature (Procedure: P2 / MS Images)
Image Category  Category Range  Image No.  𝑁𝑟 = 10  𝑁𝑟 = 20  𝑁𝑟 = 30  𝑁𝑟 = 40  𝑁𝑟 = 50
Aero planes     1-58            28.jpg      100.00    85.00    86.67    75.00    68.00
Cow             59-240          206.jpg      70.00    65.00    53.33    55.00    52.00
Sheep           241-430         267.jpg      80.00    75.00    76.67    75.00    66.00
Bi-Cycles       499-770         723.jpg      50.00    45.00    33.33    32.50    32.00
Cars            989-1483        1000.jpg    100.00   100.00   100.00   100.00   100.00
Chimneys        1484-1749       1620.jpg    100.00    95.00    90.00    77.50    72.00
Clouds          1750-2178       1846.jpg    100.00   100.00    96.67    97.50    98.00
Doors           2179-2344       2265.jpg     90.00    65.00    56.67    47.50    44.00
Cutlery         2511-2685       2571.jpg    100.00   100.00   100.00   100.00    92.00
Landscapes      2804-3019       2852.jpg    100.00    60.00    56.67    45.00    40.00
Trees           3257-3473       3377.jpg    100.00   100.00   100.00    97.50    96.00
Windows         3474-4125       3635.jpg    100.00    95.00    86.67    87.50    86.00
Average                                      90.83    82.08    78.06    74.17    70.50
Table 7.11: Comparison of Precision (%) for Color & Texture Features (Procedure: P3 / MS Images)
Image Category  Category Range  Image No.  𝑁𝑟 = 10  𝑁𝑟 = 20  𝑁𝑟 = 30  𝑁𝑟 = 40  𝑁𝑟 = 50
Aero planes     1-58            28.jpg      100.00    90.00    73.33    55.00    44.00
Cow             59-240          206.jpg     100.00   100.00    93.33    90.00    82.00
Sheep           241-430         267.jpg     100.00   100.00    83.33    77.50    74.00
Bi-Cycles       499-770         723.jpg     100.00   100.00    93.33    92.50    84.00
Cars            989-1483        1000.jpg    100.00   100.00   100.00   100.00   100.00
Chimneys        1484-1749       1620.jpg    100.00   100.00   100.00    87.50    86.00
Clouds          1750-2178       1846.jpg    100.00   100.00   100.00   100.00   100.00
Doors           2179-2344       2265.jpg    100.00    95.00    90.00    80.00    72.00
Cutlery         2511-2685       2571.jpg    100.00   100.00   100.00    92.50    86.00
Landscapes      2804-3019       2852.jpg    100.00    90.00    93.33    95.00    92.00
Trees           3257-3473       3377.jpg    100.00   100.00   100.00   100.00   100.00
Windows         3474-4125       3635.jpg    100.00   100.00    93.33    90.00    84.00
Average                                     100.00    97.92    93.33    88.33    83.67
Table 7.12: Comparison of Precision for Texture & Color Features (Procedure: P4/ MS Images)
Image Category | Category Range | Image No. | Precision (%) at Nr = 10 | Nr = 20 | Nr = 30 | Nr = 40 | Nr = 50
Aero planes | 1-58 | 28.jpg | 100.00 | 90.00 | 86.67 | 82.50 | 72.00
Cow | 59-240 | 206.jpg | 100.00 | 100.00 | 86.67 | 70.00 | 66.00
Sheep | 241-430 | 267.jpg | 100.00 | 95.00 | 83.33 | 80.00 | 76.00
Bi-Cycles | 499-770 | 723.jpg | 100.00 | 90.00 | 73.33 | 65.00 | 54.00
Cars | 989-1483 | 1000.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00
Chimneys | 1484-1749 | 1620.jpg | 100.00 | 100.00 | 100.00 | 90.00 | 80.00
Clouds | 1750-2178 | 1846.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00
Doors | 2179-2344 | 2265.jpg | 100.00 | 75.00 | 53.33 | 47.50 | 44.00
Cutlery | 2511-2685 | 2571.jpg | 100.00 | 100.00 | 100.00 | 95.00 | 94.00
Landscapes | 2804-3019 | 2852.jpg | 100.00 | 70.00 | 53.33 | 42.50 | 36.00
Trees | 3257-3473 | 3377.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00
Windows | 3474-4125 | 3635.jpg | 100.00 | 100.00 | 90.00 | 87.50 | 86.00
Average | | | 100.00 | 92.50 | 85.28 | 80.88 | 75.33
From the results of tables 7.9 to 7.12, it is clear that procedure P3 performs
better than the others. For images with almost fixed details, such as Cars,
Chimneys, Clouds and Cutlery, the retrieval results are excellent, while for test
images with a higher level of semantics the results are slightly lower but
satisfactory. The recall rates for the MS test images are shown in table 7.13.
Table 7.13: Recall rates for MS test images for procedures P3 and P4
Image Category | No. of Images | Image No. | % Recall (Nr/Nd), Color First (P3) | % Recall (Nr/Nd), Texture First (P4)
Aero planes | 58 | 28.jpg | 34.48 | 60.35
Cow | 182 | 206.jpg | 42.86 | 42.31
Sheep | 190 | 267.jpg | 53.16 | 31.58
Bi-Cycles | 272 | 723.jpg | 33.10 | 25.00
Cars | 495 | 1000.jpg | 43.43 | 82.02
Chimneys | 266 | 1620.jpg | 36.10 | 38.35
Clouds | 429 | 1846.jpg | 56.64 | 61.54
Doors | 166 | 2265.jpg | 19.88 | 25.30
Cutlery | 175 | 2571.jpg | 49.15 | 49.72
Landscapes | 216 | 2852.jpg | 24.54 | 26.00
Trees | 217 | 3377.jpg | 54.38 | 61.30
Windows | 652 | 3635.jpg | 23.93 | 46.78
Average | | | 39.31 | 45.85
The results of table 7.13 show that the recall rates of procedure P4 are better
than those of procedure P3, whereas the precision results of tables 7.11 and 7.12
show that P3 is better than P4. This happens because, when more than 50 images
are retrieved, the retrieval rates decrease rapidly. Some visual results of image
retrieval for each procedure are shown in figures 7.10 to 7.13. In each figure,
15 images are requested for display and the first image, at the top left corner,
is the query image.
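The two measures used in tables 7.9 to 7.13 can be sketched as follows. This is only an illustrative sketch, not the code used for the experiments: it assumes that an image is relevant when its number falls inside the category range of the query, that precision is taken over the first Nr retrieved images, and that recall divides the relevant retrieved images by the category size Nd. The function names and the example rankings are hypothetical.

```python
def precision_at(ranked_ids, category_range, n_r):
    """Percentage of the first n_r retrieved images that belong to the query's category."""
    low, high = category_range
    relevant = sum(1 for image_id in ranked_ids[:n_r] if low <= image_id <= high)
    return 100.0 * relevant / n_r


def recall(ranked_ids, category_range):
    """Percentage of the category's images that appear among the retrieved images."""
    low, high = category_range
    n_d = high - low + 1  # category size, e.g. range 1-58 gives 58 images
    relevant = sum(1 for image_id in ranked_ids if low <= image_id <= high)
    return 100.0 * relevant / n_d


# Hypothetical ranking for a query from the range 1-58:
# the first 17 results are relevant, the next 3 are not.
ranking = list(range(1, 18)) + [100, 101, 102]
print(precision_at(ranking, (1, 58), 10))  # 100.0
print(precision_at(ranking, (1, 58), 20))  # 85.0

# Hypothetical ranking of 58 images of which 20 are relevant.
long_ranking = list(range(1, 21)) + list(range(1000, 1038))
print(round(recall(long_ranking, (1, 58)), 2))  # 34.48
```

Under these assumptions precision is normalised by the number of images requested, while recall is normalised by the category size, so the two measures need not rank P3 and P4 in the same order.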
It can be observed from figures 7.10 to 7.13 that the proposed
algorithm also performs well for the Microsoft Research image database. The results
are encouraging and similar to those for Wang’s image database. This
shows the capability and strength of the proposed wavelet based CBIR scheme, as
the retrieval results are neither biased nor database dependent. Again, the choice
between procedure P3 and P4 depends on the user’s requirement, i.e. whether color
or texture is the main search criterion.
Fig. 7.10: “Retrieval result for query image (MS 2571.jpg) for procedure P1”
Fig. 7.11: “Retrieval result for query image (MS 1000.jpg) for procedure P2”
Fig. 7.12: “Retrieval result for query image (MS 267.jpg) for procedure P3”
Fig. 7.13: “Retrieval result for query image (MS 1620.jpg) for procedure P4”
The robustness of the proposed CBIR scheme is also investigated. A good
image retrieval scheme must perform well even in a noisy environment.
The input query image may exist in several versions modified by
image processing operations such as rotation, cropping, intensity
adjustment and noise addition. A good CBIR scheme should be able to retrieve the
relevant images from the database for these modified versions (within a specific
limit) of the query image. Such robustness tests are also given in [Wan2001]. A set
of test results is shown in table 7.14 for Wang’s test images and in table 7.15 for
Microsoft’s images. Due to space limitations, the first 10 retrieved images are
arranged in two rows.
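A few of the query modifications listed in tables 7.14 and 7.15 can be sketched in plain Python as below. This is a hedged illustration only: the image is assumed to be a grayscale list of rows with values 0 to 255, brightness change is assumed to be multiplicative, "scale down by 50%" is assumed to halve each dimension, and the noise variance is assumed to refer to intensities normalised to [0, 1]. All function names are hypothetical and the actual tools used for the tests are not stated in the text.

```python
import random


def clamp(value):
    """Round to the nearest integer and limit to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def adjust_brightness(img, percent):
    """Scale every pixel by (1 + percent/100); a negative percent darkens."""
    factor = 1.0 + percent / 100.0
    return [[clamp(p * factor) for p in row] for row in img]


def horizontal_flip(img):
    return [row[::-1] for row in img]


def vertical_flip(img):
    return [row[:] for row in img[::-1]]


def scale_down_by_half(img):
    """Halve both dimensions by averaging each 2x2 block."""
    h, w = len(img) // 2 * 2, len(img[0]) // 2 * 2
    return [[clamp((img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0)
             for x in range(0, w, 2)] for y in range(0, h, 2)]


def blur_3x3(img):
    """3x3 averaging; border pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(clamp(sum(window) / len(window)))
        out.append(row)
    return out


def add_gaussian_noise(img, mean=0.0, variance=0.08, seed=0):
    """Additive Gaussian noise on intensities normalised to [0, 1] (assumption)."""
    rng = random.Random(seed)
    sigma = variance ** 0.5
    return [[clamp((p / 255.0 + rng.gauss(mean, sigma)) * 255.0) for p in row]
            for row in img]


def attacked_versions(img):
    """Build a subset of the modified queries used in a robustness test."""
    return {
        "Scale Down by 50%": scale_down_by_half(img),
        "30% Bright": adjust_brightness(img, 30),
        "30% Dark": adjust_brightness(img, -30),
        "Horizontal Flip": horizontal_flip(img),
        "Vertical Flip": vertical_flip(img),
        "Blur by 3x3 Averaging": blur_3x3(img),
        "Gaussian Noise": add_gaussian_noise(img),
    }
```

Each modified image would then be submitted as a query and the first 10 retrieved images checked against the category of the original query.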
Table 7.14: Robustness test for the proposed scheme for Wang’s test images for procedure P3
Type of Attack | Query Image | Retrieval Results (first 10 matches, arranged in two rows: 1 to 5 and 6 to 10)
Scale Down by 50%
Scale Up by 130%
30% Bright
30% Dark
30% more Saturate
30% less Saturate
Horizontal Flip
Vertical Flip
Pixelize at 5 pixels
Blur by 3x3 Averaging
50% more Sharpness
50% Crop
Gaussian Noise (μ = 0, σ² = 0.08)
Random Crop
Table 7.15: Robustness test for the proposed scheme for Microsoft’s test images for procedure P3
Type of Attack | Query Image | Retrieval Results (first 10 matches, arranged in two rows: 1 to 5 and 6 to 10)
Scale Down by 50%
Scale Up by 130%
30% Bright
30% Dark
30% more Saturate
30% less Saturate
Horizontal Flip
Vertical Flip
Pixelize at 5 pixels
Blur by 3x3 Averaging
50% more Sharpness
50% Crop
Gaussian Noise (μ = 0, σ² = 0.08)
Random Crop
The results of tables 7.14 and 7.15 clearly show the robustness of the proposed
image retrieval scheme. Under the various image processing attacks, all of the
first 10 retrieved images are relevant, except in a few cases such as scale up,
sharpness and Gaussian noise for Wang’s test images. Similarly, for Microsoft’s test
images only vertical flip, Gaussian noise and random crop gave 1 or 2 mismatches.
The strength of the proposed multi resolution approach is seen most clearly under
the scaling attack, as the proposed scheme tries to match the query image at three
resolution levels. Therefore, both coarser and finer details can be captured.
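The idea of matching at three resolution levels can be illustrated with the following minimal sketch. It is not the proposed algorithm itself: the levels are assumed to be the original image plus two successive Haar approximation bands (2x2 block means, equal to the Haar LL band up to a constant factor), and `edge_feature` is a deliberately simple placeholder standing in for the edge histogram feature. All names are hypothetical.

```python
def haar_approximation(img):
    """Next coarser level: mean of each 2x2 block (Haar LL band up to scaling)."""
    h, w = len(img) // 2 * 2, len(img[0]) // 2 * 2
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)] for y in range(0, h, 2)]


def pyramid(img, levels=3):
    """Return [finest, ..., coarsest] with `levels` entries."""
    out = [img]
    for _ in range(levels - 1):
        out.append(haar_approximation(out[-1]))
    return out


def edge_feature(img):
    """Placeholder texture feature: mean absolute horizontal and vertical differences."""
    h, w = len(img), len(img[0])
    dx = sum(abs(img[y][x + 1] - img[y][x]) for y in range(h) for x in range(w - 1))
    dy = sum(abs(img[y + 1][x] - img[y][x]) for y in range(h - 1) for x in range(w))
    return (dx / max(1, h * (w - 1)), dy / max(1, (h - 1) * w))


def multires_distance(query, candidate, levels=3):
    """Sum of L1 feature distances over all resolution levels."""
    total = 0.0
    for q_level, c_level in zip(pyramid(query, levels), pyramid(candidate, levels)):
        fq, fc = edge_feature(q_level), edge_feature(c_level)
        total += sum(abs(a - b) for a, b in zip(fq, fc))
    return total


# Two 8x8 test patterns: vertical and horizontal stripes of width 2.
vertical = [[0, 0, 255, 255] * 2 for _ in range(8)]
horizontal = [[0] * 8, [0] * 8, [255] * 8, [255] * 8] * 2
print(multires_distance(vertical, vertical))        # 0.0
print(multires_distance(vertical, horizontal) > 0)  # True
```

Because every level contributes to the distance, a rescaled query still has coarse levels that resemble those of the stored images, which is one way to read the robustness to scaling observed above.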
To show the advantage of using wavelets along with the edge histogram, the
proposed multi resolution texture feature approach is compared with the
“Mpeg-7 EHD (standard algorithm with 80 bins)” alone, and the wavelet based color
feature is compared with the “Scalable Color Descriptor (SCD) and Color Layout
Descriptor (CLD)” of Mpeg-7. The recall results for Wang’s database are shown in
tables 7.16 and 7.17 respectively.
Table 7.16: Recall rates comparison for procedure P2 and only Mpeg-7 EHD
Image Category | No. of Images | Image No. | % Recall (Nr/Nd), Only EHD | % Recall (Nr/Nd), Texture Only (P2)
Africans | 100 | 3.jpg | 33.00 | 21.00
Beaches | 100 | 158.jpg | 14.00 | 43.00
Buildings | 100 | 244.jpg | 29.00 | 34.00
Buses | 100 | 314.jpg | 45.00 | 70.00
Dinosaurs | 100 | 445.jpg | 85.00 | 74.00
Elephants | 100 | 565.jpg | 24.00 | 26.00
Roses | 100 | 619.jpg | 56.00 | 80.00
Horses | 100 | 744.jpg | 52.00 | 60.00
Mountains | 100 | 801.jpg | 20.00 | 18.00
Food | 100 | 977.jpg | 9.00 | 30.00
Average | | | 36.70 | 45.60
Table 7.17: Recall rates comparison for procedure P1, only Mpeg-7 SCD and CLD
Image Category | No. of Images | Image No. | % Recall (Nr/Nd), Only SCD | % Recall (Nr/Nd), Only CLD | % Recall (Nr/Nd), Color Only (P1)
Africans | 100 | 3.jpg | 40.00 | 44.00 | 73.00
Beaches | 100 | 158.jpg | 25.00 | 36.00 | 65.00
Buildings | 100 | 244.jpg | 32.00 | 28.00 | 47.00
Buses | 100 | 314.jpg | 44.00 | 47.00 | 78.00
Dinosaurs | 100 | 445.jpg | 56.00 | 99.00 | 98.00
Elephants | 100 | 565.jpg | 13.00 | 43.00 | 42.00
Roses | 100 | 619.jpg | 45.00 | 60.00 | 60.00
Horses | 100 | 744.jpg | 13.00 | 82.00 | 80.00
Mountains | 100 | 801.jpg | 5.00 | 44.00 | 34.00
Food | 100 | 977.jpg | 17.00 | 38.00 | 58.00
Average | | | 29.00 | 52.10 | 63.50
From the results of tables 7.16 and 7.17, it is clear that, on average, the
proposed scheme performs better than the standard Mpeg-7 descriptors for texture
as well as for color individually.
The proposed P3 procedure is also compared with “Fuzzy Color and
Texture Histogram (FCTH) scheme proposed in [Cha2008] and Color and Edge
Directivity Descriptor (CEDD) proposed in [Cha2008a]”. The comparison is given
in table 7.18 in terms of recall.
Table 7.18: Recall rates comparison for proposed P3, FCTH [Cha2008] and CEDD [Cha2008a]
Image Category | No. of Images | Image No. | % Recall (Nr/Nd), FCTH [Cha2008] | % Recall (Nr/Nd), CEDD [Cha2008a] | % Recall (Nr/Nd), Proposed P3
Africans | 100 | 3.jpg | 52.00 | 48.00 | 73.00
Beaches | 100 | 158.jpg | 50.00 | 43.00 | 65.00
Buildings | 100 | 244.jpg | 50.00 | 68.00 | 47.00
Buses | 100 | 314.jpg | 71.00 | 75.00 | 78.00
Dinosaurs | 100 | 445.jpg | 89.00 | 93.00 | 98.00
Elephants | 100 | 565.jpg | 44.00 | 45.00 | 42.00
Roses | 100 | 619.jpg | 72.00 | 65.00 | 60.00
Horses | 100 | 744.jpg | 88.00 | 85.00 | 80.00
Mountains | 100 | 801.jpg | 36.00 | 40.00 | 34.00
Food | 100 | 977.jpg | 50.00 | 35.00 | 58.00
Average | | | 60.20 | 59.70 | 63.50
The results of table 7.18 show that the average recall of the proposed P3 scheme
(63.50%) is higher than that of the “Fuzzy Color and Texture Histogram based
technique (FCTH) [Cha2008] and Color and Edge Directivity Descriptor (CEDD)
[Cha2008a]” (60.20% and 59.70% respectively).
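The averages of table 7.18 can be re-derived from the per-category values with a short script such as the one below. The recall values are copied from the table; the per-category count of best scores is an additional view that the text does not report, included only as a hedged reading aid.

```python
# Recall (%) per category from table 7.18: (FCTH, CEDD, proposed P3).
recall_table = {
    "Africans": (52.0, 48.0, 73.0),
    "Beaches": (50.0, 43.0, 65.0),
    "Buildings": (50.0, 68.0, 47.0),
    "Buses": (71.0, 75.0, 78.0),
    "Dinosaurs": (89.0, 93.0, 98.0),
    "Elephants": (44.0, 45.0, 42.0),
    "Roses": (72.0, 65.0, 60.0),
    "Horses": (88.0, 85.0, 80.0),
    "Mountains": (36.0, 40.0, 34.0),
    "Food": (50.0, 35.0, 58.0),
}
methods = ("FCTH", "CEDD", "P3")

# Column averages over the ten categories.
averages = {
    name: round(sum(row[i] for row in recall_table.values()) / len(recall_table), 2)
    for i, name in enumerate(methods)
}

# Number of categories in which each method has the highest recall.
wins = {name: 0 for name in methods}
for row in recall_table.values():
    wins[methods[row.index(max(row))]] += 1

print(averages)  # {'FCTH': 60.2, 'CEDD': 59.7, 'P3': 63.5}
print(wins)      # {'FCTH': 2, 'CEDD': 3, 'P3': 5}
```

The per-category counts show that P3 leads on average and in half of the categories, while FCTH or CEDD give higher recall for Buildings, Elephants, Roses, Horses and Mountains.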
7.3 Conclusions
In this section, “an efficient image retrieval scheme based on multi
resolution wavelet transform is presented. The proposed scheme extracts both
color as well as texture features of the query image for relevant image retrieval.
The color feature extraction algorithm is wavelet based and inspired by standard
Mpeg-7 Scalable Color Descriptor (SCD). It uses HSV color space and Haar wavelet
transform for size reduction. The results of comparison show that the proposed
color feature extraction scheme performs much better than SCD. The proposed
color feature also outperforms the Color Layout Descriptor (CLD) of Mpeg-7”.
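The general idea of an SCD-inspired color feature, an HSV histogram whose size is reduced by Haar steps, can be sketched as follows. This is a simplified illustration under stated assumptions, not the proposed algorithm: the 16 x 4 x 4 quantisation (256 bins), the target of 64 bins and the choice to keep only the Haar low-pass sums are assumptions made here, and the function names are hypothetical.

```python
import colorsys


def hsv_histogram(pixels, h_bins=16, s_bins=4, v_bins=4):
    """Normalised HSV histogram of an iterable of (r, g, b) pixels in 0..255."""
    hist = [0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1
    total = max(1, len(pixels))
    return [count / total for count in hist]


def haar_step(values):
    """One unnormalised 1-D Haar step: pairwise sums (low-pass) and differences (high-pass)."""
    sums = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
    diffs = [values[i] - values[i + 1] for i in range(0, len(values), 2)]
    return sums, diffs


def reduce_histogram(hist, target_bins=64):
    """Halve the histogram repeatedly, keeping only the low-pass sums (simplification)."""
    low = list(hist)
    while len(low) > target_bins:
        low, _ = haar_step(low)
    return low


# A pure red image falls into a single bin before and after the reduction.
red_pixels = [(255, 0, 0)] * 4
feature = reduce_histogram(hsv_histogram(red_pixels))
print(len(feature), feature.index(1.0))  # 64 3
```

Keeping only the sums preserves the total mass of the histogram while shortening the feature vector, which is the size reduction role attributed to the Haar transform above.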
To extract texture features, the multi resolution advantage of wavelets is
utilized, which emulates the human way of viewing objects in coarser and
finer details. Wavelets with edge histograms perform much better than EHD
applied directly to the images. In noise and distortion cases in particular,
wavelets are capable of capturing the image details. To observe the robustness of
the proposed scheme, input query images deformed by a variety of noise and
geometrical attacks were given to it, and in almost every case it
performed well, as can be seen in the results section.
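For reference, an edge histogram with 80 bins (16 sub-images times 5 edge types) can be sketched as below. This is a simplified illustration in the spirit of the Mpeg-7 EHD, not the standard's reference implementation and not the proposed wavelet based variant: it classifies 2x2 pixel blocks directly instead of larger image-blocks, and the threshold value of 11 is an assumption. All names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

# 2x2 edge filters, applied to a block [[a, b], [c, d]] as a*f0 + b*f1 + c*f2 + d*f3.
FILTERS = {
    "vertical": (1.0, -1.0, 1.0, -1.0),
    "horizontal": (1.0, 1.0, -1.0, -1.0),
    "diagonal_45": (SQRT2, 0.0, 0.0, -SQRT2),
    "diagonal_135": (0.0, SQRT2, -SQRT2, 0.0),
    "non_directional": (2.0, -2.0, -2.0, 2.0),
}
EDGE_TYPES = list(FILTERS)


def block_edge_type(a, b, c, d, threshold=11.0):
    """Return the dominant edge type of a 2x2 block, or None if it is too weak."""
    strengths = {name: abs(a * f[0] + b * f[1] + c * f[2] + d * f[3])
                 for name, f in FILTERS.items()}
    best = max(EDGE_TYPES, key=lambda name: strengths[name])
    return best if strengths[best] >= threshold else None


def edge_histogram(img, grid=4, threshold=11.0):
    """80-bin local edge histogram of a grayscale image given as a list of rows."""
    h, w = len(img), len(img[0])
    hist = [0.0] * (grid * grid * len(EDGE_TYPES))
    blocks_per_cell = [0] * (grid * grid)
    for y in range(0, h - 1, 2):
        for x in range(0, w - 1, 2):
            cell = min(y * grid // h, grid - 1) * grid + min(x * grid // w, grid - 1)
            blocks_per_cell[cell] += 1
            edge = block_edge_type(img[y][x], img[y][x + 1],
                                   img[y + 1][x], img[y + 1][x + 1], threshold)
            if edge is not None:
                hist[cell * len(EDGE_TYPES) + EDGE_TYPES.index(edge)] += 1.0
    # Normalise each sub-image by its number of blocks.
    for cell, count in enumerate(blocks_per_cell):
        if count:
            for k in range(len(EDGE_TYPES)):
                hist[cell * len(EDGE_TYPES) + k] /= count
    return hist
```

In a multi resolution setting, a histogram of this kind would be computed on each resolution level rather than on the original image only, which is the difference from plain EHD that the comparison in table 7.16 evaluates.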
The combination of the proposed color feature extraction algorithm and
texture feature extraction algorithm performs better than the individual features
and outperforms some well-known existing CBIR algorithms.