

Imperial Journal of Interdisciplinary Research (IJIR) Vol-2, Issue-10, 2016 ISSN: 2454-1362, http://www.onlinejournal.in

Imperial Journal of Interdisciplinary Research (IJIR) Page 1996

CBIR Using Color, Texture, and Support Vector Machine

Jasjeet Singh Jora1 & Amit Kumar2

1,2Department of Electronics and Communication, BMSCE, Sri Muktsar Sahib

Abstract: This research addresses the problem of quantifying the similarity between two images, with the main focus on content-based image retrieval (CBIR). It proposes a CBIR system that combines color and texture features with Support Vector Machine (SVM) classification in order to provide better image classification and faster retrieval of relevant images. In the proposed technique, color and texture features such as the mean and standard deviation are extracted from each image and converted into a feature vector. The results show that the proposed approach retrieves images efficiently and accurately.

Keywords: Content Based Image Retrieval (CBIR), Support Vector Machine (SVM), Feature Extraction, Classification

1. INTRODUCTION

In the past few decades, the amount of multimedia data such as images, video and audio has grown enormously owing to technological improvements [1]: cheaper digital imaging and storage devices, computers with fast computational power, and broadband networking make it possible to create, transmit, manipulate and store huge amounts of digital multimedia information. As a result, digital image collections are growing exponentially in many areas such as art galleries, remote sensing, medical imaging, geographic information systems, weather forecasting, criminal investigation and communication systems. This ongoing expansion requires more efficient methods for storing, sorting, searching and retrieving images from such large databases.

CBIR systems provide a way to efficiently retrieve relevant images from a large image database based on the visual content of a query image and a similarity measure [6]. The visual content of an image is described in terms of low-level features [7] extracted from the image, i.e. colour, shape and texture [4]. Therefore, efficiently retrieving similar images from a large collection by using their low-level features [8] has been an important research issue.

Various CBIR systems have been developed and implemented in the past few decades, such as QBIC, Virage and MARS [9]. Such systems represent image data using various features like colour, texture, edges, objects and shape. The features are extracted from the images into a compact vector, which reduces the dimensionality of the image data, and are stored in feature databases [10]. Images that are perceptually similar to the query can then be found in the database. Unfortunately, several difficulties still hinder CBIR systems [8]. These are listed below:

• Users mostly interpret and describe an image with high-level textual concepts such as words [9]. However, most of the available CBIR systems use low-level features to represent images. Therefore, a semantic gap [11] always exists between the image representation and human interpretation [5].

• Different users, or the same user under different circumstances, can interpret an image differently [4], i.e. the perceived similarity between two images may vary considerably. Therefore, the perception subjectivity problem is present at each semantic level and depends on the individual user's experience [11].

The rest of the paper is organized as follows: Section 2 explains the Support Vector Machine (SVM). Section 3 gives the details of the feature extraction used in the present research. Section 4 describes the proposed CBIR system. Experimental results of the proposed method are discussed in Section 5, and Section 6 draws the conclusions.

2. SUPPORT VECTOR MACHINE (SVM)

Support Vector Machines (SVMs) are applied to the problem of making predictions based on previously seen examples, in what is called inductive inference. The SVM is a supervised learning algorithm that receives labeled examples as input and outputs a mathematical function that is used to predict the labels of new examples. In the space from which the examples are taken, there are infinitely many hyperplanes, or linear functions, that can separate two distinct classes (Fig. 1). The main idea behind the SVM is to determine which separating hyperplane is optimal.

Fig. 1: Possible separating hyperplanes for labeled examples in their space representation.

In the linearly separable case, the SVM derives from the known examples, or training set, a decision function $g(x)$ that receives an example $x$ as input and outputs a label:

$$g(x) = \operatorname{sign}(f(x)) \tag{1}$$

where $f(x) = \langle w, x \rangle + b$, with $w$ a weight vector and $b$ a scalar. The inner product $\langle w, x \rangle$ is defined as

$$\langle w, x \rangle = \sum_{i=1}^{d} w_i x_i \tag{2}$$

where $d$ is the dimensionality and $w_i$ is the i-th element of $w = (w_1, w_2, \ldots, w_d)$.

The problem addressed by the linear SVM can then be formalized as follows: given a training set of vectors $x_1, x_2, \ldots, x_n$ with corresponding class membership labels $y_1, y_2, \ldots, y_n$ that take on the values +1 or -1, choose the parameters $w$ and $b$ of a linear decision function that generalizes well to unseen examples.

The decision rule for choosing the best hyperplane is that it not only correctly separates the two classes in the training set, but also lies as far from the training examples as possible. The search for such a hyperplane is therefore an optimization problem. To solve it, we need an objective function as well as a set of constraints on the intended hyperplane (Fig. 2).

Fig. 2: Optimal hyperplane (solid line) in a linearly separable classification problem

For the hyperplane to separate the two classes correctly, we need two sets of constraints:

$$\langle w, x_i \rangle + b > 0 \quad \text{for all } y_i = +1 \tag{3}$$

$$\langle w, x_i \rangle + b < 0 \quad \text{for all } y_i = -1 \tag{4}$$

which can be combined as

$$y_i \left( \langle w, x_i \rangle + b \right) > 0, \quad i = 1, \ldots, n \tag{5}$$

The constraints given by equations (3) and (4) mean that each example must lie on the correct side of the hyperplane. However, they are not sufficient to separate the two classes optimally; the separation must also be achieved with maximum margin. The hyperplane satisfying $\langle w, x \rangle + b = 0$ in Fig. 2 is the optimal hyperplane. The dashed lines represent the points where the function $\langle w, x \rangle + b$ equals +1 (upper right) or -1 (lower left). To maximize the margin, these two dashed hyperplanes must be parallel to each other and equidistant from the optimal hyperplane. This constraint can be written as

$$y_i \left( \langle w, x_i \rangle + b \right) \geq 1, \quad i = 1, \ldots, n \tag{6}$$
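As a minimal sketch of equations (1), (2) and (6), the following Python fragment evaluates the linear decision function and checks the margin constraint on a toy training set. The function names, the toy data and the choice of returning +1 when f(x) = 0 are assumptions made for illustration and are not part of the original implementation. The margin width 2/||w|| is the standard geometric result for this formulation, which the text does not state explicitly.

```python
import math

def inner(w, x):
    """Inner product <w, x> of equation (2)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def f(w, b, x):
    """Linear function f(x) = <w, x> + b."""
    return inner(w, x) + b

def g(w, b, x):
    """Decision function of equation (1); f(x) = 0 is mapped to +1 (assumption)."""
    return 1 if f(w, b, x) >= 0 else -1

def satisfies_margin(w, b, xs, ys):
    """True if every training pair meets y_i * (<w, x_i> + b) >= 1, equation (6)."""
    return all(y * f(w, b, x) >= 1 for x, y in zip(xs, ys))

def margin_width(w):
    """Distance between the two dashed hyperplanes: 2 / ||w||."""
    return 2.0 / math.sqrt(inner(w, w))

# Hypothetical toy data: two points on either side of the line x1 + x2 = 3.
w, b = [1.0, 1.0], -3.0
xs = [[1.0, 1.0], [3.0, 3.0]]
ys = [-1, 1]
labels = [g(w, b, x) for x in xs]
margin_ok = satisfies_margin(w, b, xs, ys)
```

Scaling w and b down by a common factor leaves the predicted labels unchanged but can violate equation (6), which is why the constraint fixes the scale of the hyperplane parameters.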

3. FEATURE EXTRACTION

Feature extraction involves extracting meaningful information from the images in order to reduce the required storage memory, so that the CBIR system becomes faster and more effective. Once the features are extracted from the images, they are stored in the database for future use. The features used to represent an image in this work are color and texture.

Feature extraction efficiently represents the interesting parts of an image as a compact vector, reducing the dimensionality of the image data. Such approaches are useful when the images are large and a reduced feature representation is required, e.g. in image matching and image retrieval. Statistical texture features [5] are helpful in the classification and retrieval of similar images, since they give essential information about the intensity level distribution in an image.

Color Moments

These are measures that can be used to differentiate images based on their color features. Once calculated, these moments provide a measure of color similarity between images. The three color moments used are the skewness, the mean and the standard deviation.

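Skewness ($S$) can be understood as a measure of the degree of asymmetry in the distribution:

$$S_i = \left( \frac{1}{N} \sum_{j=1}^{N} \left( p_{ij} - E_i \right)^3 \right)^{1/3} \tag{7}$$

where $p_{ij}$ is the value of the j-th pixel in color channel $i$, $E_i$ is the mean of channel $i$, and $N$ is the number of pixels.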

Mean ($\bar{X}$) can be understood as the average color value in the image [12]; it is the average of the color level intensities. A high mean value indicates a bright image and a low mean value a dark one. The mean of an image may be defined as:

$$\bar{X} = \sum_{i=1}^{L} i \, P(i) \tag{8}$$

where $P(i)$ is the probability of intensity level $i$ in the image and $L$ is the number of intensity levels.

Standard Deviation ($\sigma$) shows the contrast of the gray level intensities. It is the square root of the variance of the distribution. A low value indicates low contrast and a high value indicates high contrast in the image. The standard deviation may be defined [12] as:

$$\sigma = \sqrt{\sum_{i=1}^{L} \left( i - \bar{X} \right)^2 P(i)} \tag{9}$$
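The three color moments of equations (7)-(9) can be sketched in Python for a single color channel as follows. This is an illustrative sketch rather than the authors' MATLAB code: the function name `color_moments` and the representation of a channel as a flat list of intensity levels are assumptions. The signed cube root is used so that a negative third central moment yields a negative skewness.

```python
import math
from collections import Counter

def color_moments(channel):
    """Return (mean, std, skewness) of one color channel.

    `channel` is a flat list of intensity levels (hypothetical input format).
    """
    n = len(channel)
    if n == 0:
        raise ValueError("channel must contain at least one pixel")
    # P(i): relative frequency of intensity level i, as used in equations (8) and (9).
    prob = {level: count / n for level, count in Counter(channel).items()}
    mean = sum(level * p for level, p in prob.items())
    variance = sum((level - mean) ** 2 * p for level, p in prob.items())
    std = math.sqrt(variance)
    # Equation (7): cube root of the third central moment, keeping its sign.
    third = sum((value - mean) ** 3 for value in channel) / n
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, std, skew

# Hypothetical 4-pixel channel: mostly dark with one brighter pixel.
mean, std, skew = color_moments([0, 0, 0, 4])
```

For a color image, the same function would be applied to each channel and the nine resulting values concatenated into the color part of the feature vector.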

Gabor Wavelet

The Gabor transform has been reported to yield the highest retrieval results for texture images. Studies have shown that humans rely on three high-level features for texture interpretation, namely repetition, directionality and complexity. The latter relates to the consistency of a texture, that is, whether or not it has well-defined patterns. The first two can be characterized by spatial frequency and orientation, respectively. Measurements therefore have to be made in both the spatial domain (using large bandwidths to localize texture borders) and the spatial-frequency domain (using small bandwidths to distinguish between different textures). The human visual system (HVS) solves this problem using cells tuned to filter and detect different spatial frequencies and orientations. Several methods inspired by the HVS have been proposed using this multichannel filtering. One of the most successful is to generate a bank of Gabor filters that filter the image at different frequencies and orientations:

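$$h(x, y) = \frac{1}{2 \pi \sigma_x \sigma_y} \exp\left[ -\frac{1}{2} \left( \frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2} \right) \right] \cos\left( 2 \pi u_0 x + \phi \right) \tag{10}$$

where $\sigma_x$ and $\sigma_y$ are the spreads of the Gaussian envelope along $x$ and $y$, $u_0$ is the frequency of the sinusoid along $x$, and $\phi$ is its phase.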
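A small Python sketch of equation (10) is given below, together with one possible way to build a filter bank over several frequencies and orientations. The function names, the kernel size and the parameter values are hypothetical. The rotation of the coordinates by an angle `theta` is a common way to obtain oriented filters and is an assumption here, since equation (10) itself corresponds to `theta = 0`.

```python
import math

def gabor(x, y, sigma_x, sigma_y, u0, phi=0.0, theta=0.0):
    """Value of the Gabor function of equation (10) at (x, y)."""
    # Rotate the coordinates by theta (assumption; theta = 0 gives equation (10)).
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-0.5 * ((xr / sigma_x) ** 2 + (yr / sigma_y) ** 2))
    carrier = math.cos(2.0 * math.pi * u0 * xr + phi)
    return envelope * carrier / (2.0 * math.pi * sigma_x * sigma_y)

def gabor_kernel(half_size, sigma_x, sigma_y, u0, phi=0.0, theta=0.0):
    """Sample the Gabor function on a (2*half_size + 1)^2 grid centred at the origin."""
    coords = range(-half_size, half_size + 1)
    return [[gabor(x, y, sigma_x, sigma_y, u0, phi, theta) for x in coords]
            for y in coords]

def gabor_bank(half_size, sigma_x, sigma_y, frequencies, orientations):
    """One kernel per (frequency, orientation) pair, keyed by that pair."""
    return {(u0, theta): gabor_kernel(half_size, sigma_x, sigma_y, u0, theta=theta)
            for u0 in frequencies for theta in orientations}

# Hypothetical bank: two frequencies and two orientations, 7x7 kernels.
bank = gabor_bank(3, 2.0, 2.0, frequencies=[0.1, 0.25],
                  orientations=[0.0, math.pi / 2])
```

Texture features would then be derived from the responses obtained by convolving the image with each kernel, for example the mean and standard deviation of each response.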

4. PROPOSED CBIR SYSTEM

In the proposed research work, a query-based CBIR system is implemented that uses combined color and texture features together with SVM classification to retrieve images from a huge database. The proposed CBIR system is shown in Fig. 3 below. Whenever a user inserts a new image via the graphical user interface, the system begins an online process that transforms the image into a set of features and stores these features in the database. In the first stage, image features are obtained from the images in the database. In the next stage, the feature vectors are stored in the feature database for further use. The CBIR system then retrieves the images in the database that are similar to the input query image using a similarity measurement technique. In the final stage, the similar images are displayed as output.

Fig. 3: Proposed Image Retrieval System

In the proposed method, color and texture features such as the mean and standard deviation are extracted from the images and stored as feature vectors in the feature database. In this work we have used color images, such as images from France, images of colored patterns, flower images and images containing certain animals like horses and elephants.

The steps used in the proposed system are as follows:

• Query Image: The query image is the image whose matches we want to retrieve from a large image collection.

• Image Feature Extraction: Color and texture features such as the mean and standard deviation are extracted from the images.

• Features Database Creation: Color and texture features are extracted from the images in the image database using Color Moments and the Gabor Wavelet, and are saved as feature vectors in the database for the subsequent image retrieval process.

• Images Collection Database: The image database is the Corel data set, comprising 1000 images in ten different categories, stored in JPG format, each of size 384×255.

• Support Vector Machine (SVM): Support Vector Machines (SVMs) are applied to the problem of making predictions based on previously seen examples, in what is called inductive inference.

• Features Similarity Comparison: The features of the query image are extracted and matched against the features in the database using a distance metric, and the images are sorted according to their distance values. The Manhattan distance, i.e. the city block distance, is used for the similarity measurement. It is the sum of the absolute differences between the corresponding components of two feature vectors. The smaller the distance between two feature vectors, the greater their similarity.

• Display Relevant Images: After the comparison, the relevant images are displayed.
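The similarity comparison step can be sketched in Python as follows. The feature database is modelled as a dictionary that maps an image name to its feature vector. This layout, the function names and the example vectors are assumptions for illustration and do not reflect the actual storage format of the implemented system.

```python
def manhattan(a, b):
    """City block distance: sum of absolute component-wise differences."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))

def rank_by_distance(query, database, top_k=None):
    """Return (distance, name) pairs sorted from most to least similar.

    Ties in distance are broken alphabetically by image name.
    """
    scored = sorted((manhattan(query, feats), name)
                    for name, feats in database.items())
    return scored[:top_k]

# Hypothetical two-component feature vectors.
database = {"a.jpg": [0, 0], "b.jpg": [1, 1], "c.jpg": [5, 5]}
ranking = rank_by_distance([1, 0], database, top_k=2)
```

Sorting in ascending order of distance places the most similar images first, so truncating the list to `top_k` entries gives the images to display.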

5. EXPERIMENTAL RESULTS

The present method has been implemented in MATLAB 2010a. The system efficiency is evaluated by testing on the Corel data set, which has 1000 images in ten different categories, stored in JPG format, with 100 images per category, each of size 384×255. Each set is grouped according to some semantic or visual criterion, such as images from France, images of colored patterns, or images containing certain animals like cheetahs, eagles and elephants. A total of 50 query images are used for the system evaluation. For each query, the top 50 retrieved images are displayed for feedback. The system performance is measured by evaluating the precision (p), recall (r) and accuracy (Acc) given by equations (11)-(13).

Fig. 4: Results of Proposed Technique: Flower as Query Image and Relevant Images

$$p = \frac{t_p}{t_p + f_p} \tag{11}$$

$$r = \frac{t_p}{t_p + f_n} \tag{12}$$

$$Acc = \frac{t_p + t_n}{t_p + t_n + f_p + f_n} \tag{13}$$

Where,

$t_p$: is the no. of relevant images retrieved.

$f_p$: is the no. of non-relevant images retrieved.

$f_n$: is the no. of relevant images that are not retrieved.

$t_n$: is the no. of non-relevant images that are not retrieved.
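Equations (11)-(13) translate directly into code. The following Python sketch uses hypothetical counts that are not taken from the reported experiments, and returning 0.0 when a denominator is zero is an assumption made to keep the function total.

```python
def retrieval_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from the four retrieval counts.

    tp: relevant images retrieved          fp: non-relevant images retrieved
    fn: relevant images not retrieved      tn: non-relevant images not retrieved
    """
    total = tp + tn + fp + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # equation (11)
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # equation (12)
    accuracy = (tp + tn) / total if total else 0.0     # equation (13)
    return precision, recall, accuracy

# Hypothetical counts for one class in a 500-image test set.
precision, recall, accuracy = retrieval_metrics(tp=40, fp=10, fn=10, tn=440)
```

Because the non-relevant images that are not retrieved usually dominate the total, accuracy can remain high even when precision or recall for a class is modest.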

Fig. 5 Results of Proposed Technique: Elephant as Query Image and Relevant Images

Table 1 summarizes the class-wise efficiency of the implemented SVM classification based system in terms of precision, recall and accuracy. The relative importance of precision and recall may differ depending on the objective of the system. The SVM classification results show that the overall accuracy of the system is 90%. The performance results with different combinations of training and test sets demonstrate the expected increase in classification performance with an increasing number of training patterns.

TABLE 1: SVM Class-Wise Precision, Recall and Accuracy of the Implemented System

Class   Precision (%)   Recall (%)   Accuracy (%)
C1      52.31           68.00        89.00
C2      53.25           82.00        89.00
C3      65.63           42.00        90.12
C4      70.77           92.00        94.00
C5      99.00           98.00        99.73
C6      70.91           78.00        93.11
C7      91.84           90.00        97.59
C8      88.24           90.00        97.00
C9      73.00           38.00        90.57
C10     83.87           52.00        92.64

System efficiency is evaluated by (a) calculating the number of comparisons with feature vectors, (b) measuring the time the processor takes to respond to a query and (c) relating these measures to the number of results returned. MATLAB provides a reasonable way to approximate the processing time in milliseconds. Unfortunately, it does not show how much processor time a query image consumes, since it reports only the elapsed time a query takes to complete. Table 2 compares the implemented system with previous systems. The results show that the implemented system performs better than the previous ones in terms of retrieval time, number of queries and number of relevant images, and that the use of SVM classification increases performance and the retrieval of relevant images.

TABLE 2: Comparison Table

Prototype                QBT       QBC       QBTC (SVM)
No. of Relevant Images   118.34    104.28    120.00
                         78.68     84.10     85.00
No. of Queries           29.05     17.43     50.00
                         24.20     29.56     25.00
Total Time Taken         25.61     29.56     15.00
                         22.11     26.78     12.00

6. CONCLUSIONS

CBIR is one of the most active research areas, and various CBIR systems have been developed and implemented in the past few decades. In the presented research work, an approach based on combined features and SVM classification is proposed for a content-based image retrieval (CBIR) system. In the proposed method, image color and texture features are used to represent an image. Relevant images are then retrieved from a vast database by comparing the feature vector of the query image with those of the database images using the city block distance. The overall accuracy evaluated for the proposed approach is 90%.

REFERENCES

[1] H.C. Akakin, M.N. Gurcan, “Content Based Microscopic Image Retrieval System for Multi-Image Queries,” IEEE Trans. Inf. Technol. Biomed., vol. 16, no. 4, pp. 758-769, Jul. 2012.

[2] R. Krishnapuram, S. Medasani, “Content-Based Image Retrieval Based on a Fuzzy Approach,” IEEE Trans. Knowled. Data Engg., vol. 16, no. 10, pp. 1185-1199, Oct. 2004.

[3] A.H. Rangkuti, N. Hakiem, R.B. Bahaweres, A. Harjoko, A.E. Putro, “Analysis of Image Similarity with CBIR Concept Using Wavelet Transform and Threshold Algorithm,” IEEE Symp. Comp. Info. 2013.

[4] K.H. Yap, K. Wu, “Fuzzy Relevance Feedback in Content-Based Image Retrieval,” IEEE ICICS-PCM 2003.

[5] S. Chaudhari, R. Chilveri, A. Nanda, “Efficient Implementation of CBIR System and Framework of Fuzzy Semantics,” IEEE Inter. Conf. Adva. Mob. Net. Comm. App. 2012.

[6] Y. Li, J.M. Liu, J.Li, W.D.C.X. Ye, Z.F. Wu, “The fuzzy similarity measures for content-based image retrieval,” Proc. ICMLC, Nov. 2003.

[7] N. Sasheendran, C. Bhuvaneswari, “An Effective CBIR (Content Based Image Retrieval) Approach Using Ripplet Transforms,” ICCPCT-2013.

[8] W. Jiang, G. Er, Q. Dai, J. Gu, “Similarity-Based Online Feature Selection in Content-Based Image Retrieval,” IEEE Trans. Image Proce., vol. 15, no. 3, March 2006.

[9] C.Y. Chiu, H.C. Lin, S.N. Yang, “A Fuzzy Logic CBIR System,” IEEE Inter. Conf. Fuzzy Sys. 2003.

[10] Y.P. Huang, T.W. Chang, C.Z. Huang, “A Fuzzy Feature Clustering with Relevance Feedback Approach to Content Based Image Retrieval,” VECIMS, Jul. 2003.

[11] N. Goel, P. Sehgal, “Weighted Semantic Fusion of Text and Content for Image Retrieval,” IEEE Inter. Conf. Advan. Comp. Comm. Infor. (ICACCI), 2013.

[12] F. e-Malik, B. Baharudin, “Effective Content-Based Image Retrieval: Combination of Quantised Histogram Texture Features in DCT Domain,” ICCIS-2012.