SVM Based CSLOS Algorithm

SVM Based Model for Content Based Image Retrieval using Colour, Straight Line and Outline Signatures of the Image

L. Jayanthi1, Dr. K. Lakshmi2

1 Assistant Professor, Periyar Maniammai University, Thanjavur, Tamil Nadu, India

2 Dean, School of Computing Science and Engineering, Periyar Maniammai University.

Abstract

In this era of internet and multimedia, Content Based Image Retrieval (CBIR) plays a vital role, especially where search speed and accuracy of the results are of utmost importance. Earlier researchers have used various classifiers, including Artificial Neural Networks (ANN) and Genetic Algorithms (GA), whereas this work employs Support Vector Machines (SVM). In this method, features are extracted using combined colour histograms for three regions (top, middle and bottom) of the image, which indirectly includes the spatial information within the image besides the colour information. In addition, straight line signatures and outline signatures of the images are extracted to improve the performance. An empirical study was carried out using the Semantics Sensitive Integrated Matching for Picture Libraries (SIMPLIcity) database. Average precision and recall are used as metrics to measure the performance of the system. The proposed SVM based model for CBIR using Colour, Straight Line and Outline Signatures (SVM CSLOS) is compared with the earlier methods of Jhanwar et al., Huang et al., Chuen Horng Lin et al. and Elalami, which used the same data set and evaluation method. According to the results obtained, the proposed SVM CSLOS method provides an improvement in performance.

Keywords: CBIR, Support Vector Machines, Colour Histogram, Straight Line Signature, Outline Signature, SIMPLIcity.

1. INTRODUCTION

An image retrieval system is used for browsing, searching and retrieving images from a large digital image database. With the development of the internet and the availability of capturing devices such as cameras, large numbers of images are created every day in areas including remote sensing, crime prevention, fashion, publishing, medicine and architecture [1].
An efficient and effective methodology is therefore needed to manage large databases for retrieval. Text Based Image Retrieval (TBIR) uses the text associated with an image to determine what the image contains. The Google and Yahoo image search engines are examples of systems using this approach. Such search engines are easy to implement and fast, but at times fail to retrieve relevant images. Manual annotation is laborious, time consuming and next to impossible for a large database, and it is not accurate: it is difficult to describe the content of different types of images in human language. The polysemy problem also occurs, and the surrounding text may not describe the image [2]. To overcome these difficulties of text based image retrieval, Content Based Image Retrieval (CBIR) was proposed in the early 1990s. CBIR involves data collection, building a feature database, searching the database, and ranking and presenting the retrieval results. Features include colour, texture, shape and spatial layout; similarity is measured by the distance between features, and the images are retrieved automatically [3].

The organization of this paper is as follows. After the introduction in section 1, existing CBIR systems are discussed in section 2. The algorithm of the refined Euclidean distance matching technique for improved CBIR using colour, straight line and outline sketch signatures of the image is explained in section 3. The design of the proposed CBIR system is outlined in section 4. Feature extraction of the combined three region colour histogram signatures [3RCS] is described in section 5. Extraction of straight line signatures is illustrated in section 6. Extraction of the outline signature is described in section 7.
The improved 3 region colour, straight line and outline sketch based CBIR (ICSLOS) system is explained in section 8. The algorithm of the proposed SVM based model for 3 region colour, straight line and outline sketch signatures of the image (SVM CSLOS) is introduced in section 9. Experimental results and the evaluation of the performance of the proposed system are reported in terms of precision and recall in section 10. Finally, the conclusion and future work are presented in section 11.

2. CBIR SYSTEMS: A COMPARISON

Jhanwar et al. proposed a technique for content based image retrieval using the motif co-occurrence matrix (MCM). The MCM is derived from the motif transformed image. The whole image is divided into 2 x 2 pixel grids, and the number of scan motifs is reduced to 6. The MCM is defined as a 3D matrix indexed by (i, j, k), and the transformed image is used to calculate the probability of finding a motif i at a distance k from a motif j. The distance between the MCMs of two different images is used as the similarity measure while retrieving images from the database. This method is efficient in computation and storage requirements but expensive. The MCM combines information related to both colour and texture [4].

Huang et al. presented an image retrieval system based on texture similarity. Two features are extracted: the composite sub band gradient vector (CSG) and the energy distribution pattern (EDP) string. Both features are generated from the sub images of a wavelet decomposition of the original image. In the first stage, a fuzzy matching process based on EDP strings serves as a filter to remove undesired images from the database. In the second stage, the images passing through the filter are compared with the query image based on their CSG vectors, which are powerful in discriminating texture features [5][6].

Chuen Horng Lin et al. integrated three image features to facilitate the image retrieval process. The first and second features, the colour co-occurrence matrix (CCM) and the difference between pixels of scan pattern (DBPSP), describe the relationship between colours and textures in an image. The third feature, the colour histogram for K-mean (CHKM), is based on colour distribution. CCM calculates the probability of the same pixel colour occurring between each pixel and its adjacent ones in each image, and this probability is considered as the attribute of the image.
DBPSP calculates the differences between pixels and converts them into probabilities of occurrence over the entire image. Each pixel colour in an image is then replaced by the most similar colour in a common colour palette, so as to classify all pixels in the image into k clusters; this is called the CHKM feature. Because images differ in properties and contents, they contain different features, and CCM, DBPSP and CHKM thus facilitate image retrieval. Optimal features are selected from the original features to enhance the detection rate [7].

Elalami introduced a CBIR system which depends on extracting the most relevant features. Colour features are extracted from the colour histogram and texture features are extracted using the Gabor filter algorithm. A genetic algorithm was used to perform feature discrimination. The most relevant features are selected from the original feature set by two successive functions, the preliminary and deeply reduction functions. This method simplifies the calculation, achieves maximum detection rate and reduces the retrieval process time [8].

Elalami also presented a CBIR system using an artificial neural network classifier. The proposed model is composed of four major phases: feature extraction, dimensionality reduction, ANN classifier and matching strategy. In the feature extraction phase, colour features are extracted from the colour co-occurrence matrix (CCM) and texture features are extracted from the difference between pixels of scan pattern (DBPSP). The dimensionality reduction technique selects the effective features that have the largest dependency on the target class and minimal redundancy among themselves. These features reduce the calculation work and the computation time in the retrieval process. The artificial neural network (ANN) classifier takes the selected features of the query image as its input, and its output is the one of the multiple classes that has the largest similarity to the query image.
In addition, Elalami's model presents an effective feature matching strategy that uses the minimum area between two vectors to compute the similarity value between a query image and the images in the determined class [9].

3. THE PROPOSED ALGORITHMS

The proposed algorithm is a refined Euclidean distance matching technique for CBIR using colour, straight line and outline sketch signatures of the image. It is described as follows. Let I be the input query image of size m x n.

Apply a median filter to the input image

Convert the high colour input image into a 256 colour image

Separate the top, middle and bottom regions R1, R2, R3 of the image

Find the colour histograms H(R1), H(R2) and H(R3) of the regions R1, R2 and R3

Use the combined 3 region colour histograms as the signature S1 of the input image

S1 = {H(R1), H(R2), H(R3)}

Find all the long straight lines in the image above the threshold and use them as the straight line signature S2 of the image

Detect the outlines of the input image

Find the FFT of the one dimensional representation of the binary outline image and use it as the outline signature S3 of the image

Form the combined signature S = {S1, S2, S3} of the input image

Let M be the image signatures of all the images in the dataset.

Find the Euclidean distance between the input image signature and the signatures of all the images of the database

The distance D1 = Euclidean Distance(M, S)

Sort the distance matrix D1 in ascending order

Find the index of the Top N ranked minimum distances of D1

Find the average image signature of the top N matching images

Use that average image signature V as a virtual query image and repeat the search.

Again, find the Euclidean distance between the virtual input image signature V and the signatures of all the images of the database

The distance D2= Euclidean Distance(M,V)

Find the index of Top N ranked minimum distances of D2

Display the Top N ranked images from the image database using the index.

4. DESIGN OF PROPOSED CBIR SYSTEM

Figure 1 represents the indexing phase and Figure 2 represents the CBIR system.

Figure 1. Indexing Phase
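As a rough illustration of one step of the indexing phase, the outline signature S3 of Section 3 (the FFT of the one dimensional representation of the binary outline image) might be computed as sketched below. The gradient threshold edge detector, the threshold value, the row-major flattening and the truncation to the first few FFT magnitudes are assumptions made for this sketch; the function name outline_signature is hypothetical.

```python
import numpy as np

def outline_signature(gray, edge_threshold=30.0, n_coeffs=64):
    """Hypothetical outline signature: FFT of the flattened binary outline."""
    gray = np.asarray(gray, dtype=np.float64)
    # Assumed edge detector: threshold the gradient magnitude.
    gy, gx = np.gradient(gray)
    outline = (np.hypot(gx, gy) > edge_threshold).astype(np.float64)
    # One dimensional representation: row-major flattening (an assumption).
    one_d = outline.ravel()
    # Keep the magnitudes of the first n_coeffs coefficients (assumed truncation).
    spectrum = np.abs(np.fft.rfft(one_d))
    return spectrum[:n_coeffs]

# Toy 32 x 32 image: a bright square on a dark background.
img = np.zeros((32, 32))
img[8:24, 8:24] = 255.0
s3 = outline_signature(img)
```

Using magnitudes only gives a fixed-length vector that can be compared with a Euclidean distance; whether the original system keeps phase information is not stated.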

Figure 2. CBIR system
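The search phase, following the refined Euclidean distance matching steps of Section 3, could be sketched as below. This is a minimal sketch that assumes every combined signature has already been flattened into a fixed-length numeric vector; the function name refined_search and the toy data are hypothetical.

```python
import numpy as np

def refined_search(M, S, top_n=10):
    """Two-pass Euclidean matching with a virtual query.

    M: (num_images, d) array of database signatures.
    S: (d,) signature of the query image.
    Returns the indices of the top_n images after the second pass.
    """
    M = np.asarray(M, dtype=np.float64)
    S = np.asarray(S, dtype=np.float64)
    # First pass: distance D1 between the query and every database signature.
    d1 = np.linalg.norm(M - S, axis=1)
    first_top = np.argsort(d1)[:top_n]
    # Virtual query V: average signature of the first-pass top N matches.
    V = M[first_top].mean(axis=0)
    # Second pass: distance D2 between V and every database signature.
    d2 = np.linalg.norm(M - V, axis=1)
    return np.argsort(d2)[:top_n]

# Toy database: two well separated clusters of 4-dimensional signatures.
rng = np.random.default_rng(0)
M = np.vstack([rng.normal(0.0, 0.1, (20, 4)), rng.normal(5.0, 0.1, (20, 4))])
query = np.full(4, 5.0)
ranked = refined_search(M, query, top_n=5)
```

Averaging the first-pass matches moves the query towards the centre of its neighbourhood, so the second pass depends less on peculiarities of the single query image.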

5. COMBINED THREE REGION COLOUR HISTOGRAM SIGNATURES [3RCS]

Experiments were carried out with the SIMPLIcity (Semantics Sensitive Integrated Matching for Picture Libraries) database of James Z. Wang's research group [10].

Figure 3 represents the colour signature from 3 regions of the image. For an m x n image I, the colours in that image are quantized to C1, C2, ..., Ck. The colour histogram is H(I) = {h1, h2, ..., hk}, where hi represents the number of pixels in colour Ci.

Ci is the ith colour index, hi is the number of pixels with that colour index. m x n is the total number of pixels in the image.

i = 0, 1, 2, ..., L-1 (where L = 256).

Let the image I1 be decomposed into three regions R1, R2, R3:

I1 = {R1, R2, R3}

Let h1 and h2 represent the combined colour histograms of the three regions R1, R2, R3 of two images I1 and I2. The combined histogram of the image I1 can then be represented as

h1 = {H(R1), H(R2), H(R3)}

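The Histogram Euclidean Distance between the colour histograms h1 and h2 can be computed as:

$$d(h_1, h_2) = \sqrt{\sum_{i} \left(h_1(i) - h_2(i)\right)^2}$$

where h1(i) and h2(i) denote the ith bins of the two histograms.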

Combining the histograms of different regions of the image indirectly includes the spatial information within it [11].

Figure 3. Creating Colour Signature from 3 Regions of the Image
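A minimal sketch of the three region colour signature and the histogram Euclidean distance is given below. It assumes the image has already been quantized to 256 colour indices and that the three regions are horizontal bands of nearly equal height; the exact region boundaries are not specified here, and the function names are hypothetical.

```python
import numpy as np

def three_region_signature(indexed, levels=256):
    """Concatenate the colour histograms of the top, middle and bottom regions.

    indexed: 2-D integer array of colour indices in [0, levels).
    """
    indexed = np.asarray(indexed)
    # Assumed split: three horizontal bands of (nearly) equal height.
    regions = np.array_split(indexed, 3, axis=0)
    hists = [np.bincount(r.ravel(), minlength=levels) for r in regions]
    return np.concatenate(hists).astype(np.float64)

def histogram_distance(h1, h2):
    """Euclidean distance between two histograms of equal length."""
    return float(np.sqrt(np.sum((h1 - h2) ** 2)))

# Two 6 x 4 toy images with the same colours in a different vertical order.
a = np.zeros((6, 4), dtype=np.int64)
a[:2] = 10   # top band uses colour index 10
b = np.zeros((6, 4), dtype=np.int64)
b[4:] = 10   # bottom band uses colour index 10
sig_a = three_region_signature(a)
sig_b = three_region_signature(b)
# Whole-image histograms, for comparison with the region-wise signatures.
global_a = np.bincount(a.ravel(), minlength=256)
global_b = np.bincount(b.ravel(), minlength=256)
```

The two toy images have identical whole-image histograms, yet their three region signatures differ, which is the sense in which the combined histogram carries spatial information.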

6. STRAIGHT LINE SIGNATURES

A line in the image space can be expressed with two variables (r, θ) in the polar coordinate system, and the line equation can be written as

$$r = x\cos\theta + y\sin\theta$$

In general, for each point (x0, y0), we can define the family of lines that goes through that point as:

$$r_\theta = x_0\cos\theta + y_0\sin\theta$$
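To make this family of lines concrete, the sketch below assumes the standard normal form r = x0 cos θ + y0 sin θ and evaluates it for three collinear points. The curves of collinear points meet at the (r, θ) pair of their common line, which is what a Hough style line detector looks for. The angular resolution and the function name point_sinusoid are assumptions for illustration.

```python
import numpy as np

def point_sinusoid(x0, y0, thetas):
    """r values of all lines through (x0, y0), one per angle theta."""
    return x0 * np.cos(thetas) + y0 * np.sin(thetas)

# Three collinear points on the line x + y = 4.
points = [(0.0, 4.0), (2.0, 2.0), (4.0, 0.0)]
thetas = np.linspace(0.0, np.pi, 181)  # 1 degree steps (assumed resolution)
curves = np.array([point_sinusoid(x, y, thetas) for x, y in points])

# The sinusoids of collinear points cross where their spread is smallest.
spread = curves.max(axis=0) - curves.min(axis=0)
best = int(np.argmin(spread))
theta_line, r_line = thetas[best], curves[:, best].mean()
```

A full detector would accumulate votes over a quantized (r, θ) grid for every edge pixel and keep the cells whose vote count exceeds a threshold; the spread test above is only a small stand-in for that accumulator.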

A sinusoid in the θ - r plane, shown in Figure 4, represents the lines passing through the point (x0, y0). Only points such that r > 0 and 0 <
