
[IEEE 2007 9th International Symposium on Signal Processing and Its Applications (ISSPA) - Sharjah, United Arab Emirates (2007.2.12-2007.2.15)] 2007 9th International Symposium on


AUTOMATIC SEEDS RECOGNITION BY SIZE, FORM AND TEXTURE FEATURES

ADJEMOUT Ouiza, HAMMOUCHE Kamal and DIAF Moussa

Mouloud Mammeri University of Tizi-ouzou, ALGERIA e-mail: [email protected]

[email protected] [email protected]

ABSTRACT

This work deals with an automatic seed analysis system based on pattern recognition methods. This paper covers only the pattern recognition aspects of the problem. For our tests, four hundred samples of each of four species of seeds, namely corn, oat, barley and lentil, are considered. Recognition is first performed on the basis of shape features and of texture features, separately. Both of these methods give good results with a small confusion rate. In order to increase the recognition rate, the shape and texture features are then used together, which considerably improves the results. After image acquisition and pre-processing, the general process includes a reduction of the feature space using Principal Component Analysis and a clustering operation based on the k-means algorithm. The decision phase is based on the nearest Euclidean distance rule between the feature vector of an unknown seed and the average feature vector of each cluster.

Key words: Pattern Recognition, Feature Extraction, Principal Component Analysis, Classification, Seeds Sorting.

1. INTRODUCTION

After years of fundamental research, the application of artificial vision in the seed industry has become very important [1]. The essential activities in this field are seed analysis and classification, which add value to vegetable production, species improvement, seed quality control, impurity identification, etc. Usually, these tasks are carried out by specialists who visually inspect each grain of a sample with reference to photographic documents. This operation is hard, tedious and time consuming, so its complete automation proves to be useful. The prototype device involves a mechanical system for handling and sorting the seeds, purpose-built hardware, digital electronic circuitry, a CCD camera and a computer. However, this paper covers only the pattern recognition aspects of the problem. Four hundred samples of each of four species of seeds are considered. In order to recognize these seeds, three procedures have been used. The first and second procedures are based on shape features and texture features respectively. Both of these methods give good results, with some confusion between certain seeds. In order to increase the recognition rate, the third procedure combines the shape and texture features. This paper is organized as follows. Section 2 deals with image acquisition and pre-processing. Section 3 is devoted to the recognition procedure based on shape features. The recognition procedure based on texture features is the subject of Section 4. The combination of both procedures is described in Section 5.

2. IMAGE ACQUISITION AND PRE-PROCESSING

Four hundred images of each of four species of seeds, namely corn, oat, barley and lentil, were taken by means of an ION CCD camera and a Video Maker image acquisition interface. These seeds were chosen on the basis of their shapes (Figure 1). The lentil is nearly circular and may be easily recognized. The barley and the oats have elongated shapes, so confusion between them may result. By its shape, the corn can be considered equidistant from the three other species.

Figure 1. Examples of the images used (panels: oats, corn, barley, lentil)

The images, taken regardless of the orientation and the face of the seeds, are pre-processed before feature extraction. This pre-processing consists of applying a median filter, binarizing the image using Otsu's algorithm [2] and tracking the edges with a connectivity method. Figure 2 shows an example of pre-processing results.

Figure 2. Example of pre-processed images

1-4244-0779-6/07/$20.00 ©2007 IEEE
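The first two pre-processing steps of Section 2 can be illustrated as follows. This is only a minimal sketch in plain Python, not the authors' implementation: the 3x3 window size, the unfiltered border, the 256 grey levels and the assumption that seeds are brighter than the background are all illustrative choices that the paper does not specify, and the function names are hypothetical. The edge tracking step is not covered.

```python
from statistics import median


def median_filter3(img):
    """Apply a 3x3 median filter; border pixels are copied unchanged (assumption)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median(window)  # 9 values, so the median is one of them
    return out


def otsu_threshold(img, levels=256):
    """Return the grey level that maximizes the between-class variance."""
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with grey level <= t
    sum0 = 0  # sum of the grey levels of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no pixel in the dark class yet
        if w1 == 0:
            break     # no pixel left in the bright class
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(img, t):
    """Label pixels brighter than t as 1 and the others as 0 (assumed polarity)."""
    return [[1 if v > t else 0 for v in row] for row in img]


# Usage: remove an isolated bright pixel, then threshold a two-level image.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median_filter3(noisy)

two_level = [[10, 10, 200, 200], [10, 10, 200, 200]]
mask = binarize(two_level, otsu_threshold(two_level))
```

The threshold is taken as the first grey level that reaches the maximum between-class variance, so on a two-level image it falls on the darker level and the brighter pixels form the mask.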

3. RECOGNITION BASED ON SHAPE FEATURES

3.1 Shape features

The choice of the features to be extracted is an important step, as in any pattern recognition process. Experience usually recommends starting from a large number of features and then reducing this number automatically, for example with Principal Component Analysis [3]. In our application, 15 shape features are calculated from the pre-processed images. They include the perimeter, the surface, the circularity, Hu's moments [4] and the second-order central moments. Hu's moments, which are invariant to translation, rotation and scaling, are given by the following equations.

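$$\Phi_1 = \eta_{20} + \eta_{02} \tag{1}$$

$$\Phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 \tag{2}$$

where the normalized central moments $\eta_{pq}$ are calculated by:

$$\eta_{pq} = \frac{\mu_{pq}}{s^{\frac{p+q}{2}+1}}, \quad p + q \geq 2 \tag{3}$$

$$\mu_{pq} = \sum_{i} \sum_{j} (i - \bar{x})^p (j - \bar{y})^q f(i,j) \tag{4}$$

with $(\bar{x}, \bar{y})$ the centre of gravity of the surface $s$ of the considered image, calculated as:

$$\bar{x} = \frac{m_{10}}{m_{00}}, \quad \bar{y} = \frac{m_{01}}{m_{00}} \tag{5}$$

In the general case, $m_{pq}$ is given by the expression:

$$m_{pq} = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} i^p j^q f(i,j), \quad p, q = 0, 1, \ldots, \infty \tag{6}$$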

where M and N are respectively the horizontal and vertical dimensions of the image and f(i,j) is the intensity of the pixel (i,j). Tables 1 and 2 show the values of these features. The differences between the values obtained for each species show the discriminating power of the shape features.

Table 1. Mean values of moments

Moment   Oats      Corn       Lentil      Barley
Φ1       0.743     0.228      0.1631      0.753
Φ2       0.064     0.013      4.2e-005    0.052
Φ3       0.019     0.005      7.3e-006    0.026
Φ4       0.304     0.005      0.0003      0.304
Φ5       -0.023    2.1e-005   -1.e-008    -0.026
Φ6       0.077     0.0003     -2.e-006    0.070
Φ7       -0.001    2.5e-005   1.5e-008    0.008

Table 2. Values of shape features

Seeds    Surface   Perimeter   Circularity   Major axis   Minor axis
Oats     7638      856         0.86          121.20       94.56
Corn     5209      571         0.79          70.56        55.00
Lentil   13291     807         0.74          67.81        66.06
Barley   7022      814         0.86          130.25       84.78
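Equations (1) to (6) can be made concrete with a small sketch in plain Python. It assumes a binary image stored as a list of lists indexed f[i][j], and it takes the surface s in equation (3) to be the zeroth-order moment m00, which holds for a binary image but is an assumption about the paper's notation. The helper `rectangle` and all function names are hypothetical and serve only the illustration.

```python
def raw_moment(f, p, q):
    """m_pq of equation (6)."""
    return sum((i ** p) * (j ** q) * f[i][j]
               for i in range(len(f)) for j in range(len(f[0])))


def central_moment(f, p, q):
    """mu_pq of equation (4), using the centre of gravity of equation (5)."""
    m00 = raw_moment(f, 0, 0)
    x_bar = raw_moment(f, 1, 0) / m00
    y_bar = raw_moment(f, 0, 1) / m00
    return sum(((i - x_bar) ** p) * ((j - y_bar) ** q) * f[i][j]
               for i in range(len(f)) for j in range(len(f[0])))


def eta(f, p, q):
    """Normalized central moment of equation (3), with s = m00 (assumption)."""
    s = raw_moment(f, 0, 0)
    return central_moment(f, p, q) / s ** ((p + q) / 2 + 1)


def phi1(f):
    """First Hu moment, equation (1)."""
    return eta(f, 2, 0) + eta(f, 0, 2)


def phi2(f):
    """Second Hu moment, equation (2)."""
    return (eta(f, 2, 0) - eta(f, 0, 2)) ** 2 + 4 * eta(f, 1, 1) ** 2


def rectangle(rows, cols, top, left, height, width):
    """Hypothetical helper: binary image containing one filled rectangle."""
    return [[1 if top <= r < top + height and left <= c < left + width else 0
             for c in range(cols)] for r in range(rows)]


# Usage: the same square at two positions, and an elongated shape in two orientations.
square = rectangle(8, 8, 0, 0, 4, 4)
shifted_square = rectangle(8, 8, 3, 2, 4, 4)
bar = rectangle(8, 8, 1, 1, 2, 6)
rotated_bar = rectangle(8, 8, 1, 1, 6, 2)
```

On these synthetic shapes, the moments do not change when the square is translated or the bar is rotated by 90 degrees, and the elongated bar gives a larger first moment than the compact square, which is the same ordering as between the elongated seeds and the lentil in Table 1.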

The application of Principal Component Analysis shows that 5 principal components explain more than 96% of the total inertia, and that the first two components alone explain more than 73%. Figure 3 shows a projection of the data (red for oats, green for corn, cyan for lentils and magenta for barley) on the first and second principal axes of inertia. The projections of the oats and barley seeds overlap, whereas those of lentil and corn are well separated.

Figure 3. Projection of seeds on the first two principal axes
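The share of inertia explained by each principal component can be computed as sketched below. This is a hedged illustration only: it uses the covariance matrix of the raw features, power iteration and deflation, all of which are choices made here for a dependency-free example. The paper does not say how its Principal Component Analysis was computed or whether the features were standardized first, and the function names are hypothetical.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of a list of feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def top_eigenpair(mat, iters=500):
    """Largest eigenvalue and its eigenvector, estimated by power iteration."""
    d = len(mat)
    v = [1.0 / (k + 1) for k in range(d)]  # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(mat[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0, v  # nothing left to explain
        v = [x / norm for x in w]
    lam = sum(v[a] * sum(mat[a][b] * v[b] for b in range(d)) for a in range(d))
    return lam, v


def explained_inertia(rows, n_components):
    """Fraction of the total inertia carried by each of the first components."""
    cov = covariance_matrix(rows)
    d = len(cov)
    total = sum(cov[k][k] for k in range(d))
    ratios = []
    for _ in range(n_components):
        lam, v = top_eigenpair(cov)
        ratios.append(lam / total)
        # Deflation: remove the component just found before looking for the next.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return ratios


# Usage: two uncorrelated features whose variances are in a 4:1 ratio.
toy = [[2, 0], [-2, 0], [0, 1], [0, -1]]
ratios = explained_inertia(toy, 2)
```

For the toy data the first component carries four fifths of the inertia and the second one fifth. With real feature vectors of very different scales, such as the surface and the moments of Tables 1 and 2, the result would depend strongly on whether the features are standardized beforehand.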

3.2 Classification

Classification techniques aim to group a set of multidimensional observations, represented as data points scattered through an N-dimensional space, into clusters according to their similarities and dissimilarities. Several classification procedures have been proposed in the literature [5]. Starting from a mixture of the four species of seeds, we have used a classification method based on the well-known k-means algorithm [6]. Recognition, which consists of assigning an unknown seed to its class, is carried out on the basis of the Euclidean distance between the feature vector of the unknown seed and the average feature vector of each cluster. The unknown seed is assigned to the cluster with the smallest distance. After testing more than four hundred seeds of each class, the average recognition rate reached 85.75% (Table 3). Because of their circular shapes, lentils are recognized at 100%. With its compact shape, corn reaches a recognition rate of 99%. For oats and barley, the recognition rates are 97% and 47% respectively. We have verified that some barley seeds were confused with oats and, conversely, some oat seeds were confused with barley. This result is due to the similarity of their sizes and elongated shapes. We have also observed that some corn seeds were confused with barley. As explained above, these results are expected.

Figure 2 panels: a. Original image; b. Median filter; c. Binarisation; d. Edge detection

Table 3. Classification rate using shape features

Seeds              Oats   Corn   Lentil   Barley
Recognition rate   97%    99%    100%     47%
Total: 85.75%
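The clustering and decision rule of Section 3.2 can be sketched as follows in plain Python. The initial centres are passed in explicitly because the paper does not say how k-means was initialized; the toy data, the stopping rule and the function names are likewise assumptions made for illustration.

```python
import math


def nearest(vec, centers):
    """Index of the centre with the smallest Euclidean distance to vec."""
    return min(range(len(centers)), key=lambda k: math.dist(vec, centers[k]))


def mean_vector(vectors):
    """Component-wise average of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def k_means(data, init_centers, max_iter=100):
    """Plain k-means: alternate assignment and averaging until the centres stop moving."""
    centers = [list(c) for c in init_centers]
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for vec in data:
            groups[nearest(vec, centers)].append(vec)
        # An empty cluster keeps its previous centre.
        new_centers = [mean_vector(g) if g else centers[k] for k, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Usage: two well-separated groups of 2-D feature vectors (toy data).
data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers = k_means(data, init_centers=[data[0], data[3]])
label_far = nearest([9, 9], centers)   # decision for an unknown sample
label_near = nearest([1, 1], centers)
```

Once the average feature vector of each cluster is known, recognizing an unknown seed reduces to the single call to `nearest`, which is the nearest Euclidean distance rule described above.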

4. RECOGNITION BASED ON TEXTURAL FEATURES

Visual inspection shows that the surface of the seeds has a different appearance for each species. We therefore propose, in this section, to discriminate the seeds by using texture information. This process uses the same data set as above, together with the k-means algorithm. A number of texture analysis methods have been proposed in the image processing literature [7]. In this paper, we use the popular spatial gray-level dependence method for its efficiency [8]. This method relies on co-occurrence matrices, from which a large variety of texture features combining spatial information with statistical properties can be derived. In our application, the features used are the second angular moment (SAM), which describes the homogeneity of the texture; the contrast (CONT), which measures the local variation of the texture and favours large transitions between grey levels; the entropy (ENT), which evaluates the degree of organization of the pixels; the variance (VAR); the inverse difference moment (IM); and the correlation (COR). With m(i,j) denoting the entry of the co-occurrence matrix for the grey levels i and j, and n_g the number of grey levels, these features are given by:

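$$\mathrm{SAM} = \sum_{i=1}^{n_g} \sum_{j=1}^{n_g} m(i,j)^2 \tag{7}$$

$$\mathrm{CONT} = \sum_{n=1}^{n_g} n^2 \left[ \sum_{i=1}^{n_g} \sum_{j=1}^{n_g} m(i,j) \right]_{|i-j|=n} \tag{8}$$

$$\mathrm{ENT} = -\sum_{i=1}^{n_g} \sum_{j=1}^{n_g} m(i,j) \ln m(i,j) \tag{9}$$

$$\mathrm{VAR} = \sum_{i=1}^{n_g} \sum_{j=1}^{n_g} (i-j)^2 \, m(i,j) \tag{10}$$

$$\mathrm{IM} = \sum_{i=1}^{n_g} \sum_{j=1}^{n_g} \frac{1}{1+(i-j)^2} \, m(i,j) \tag{11}$$

$$\mathrm{COR} = \frac{\sum_{i} \sum_{j} (i-\mu_x)(j-\mu_y) \, m(i,j)}{\sigma_x \sigma_y} \tag{12}$$

where, following [8], the bracketed sum in (8) is taken over the pairs (i,j) with |i-j| = n, and $\mu_x$, $\mu_y$ and $\sigma_x$, $\sigma_y$ are respectively the means and standard deviations of the rows and the columns, given by the equations below:

$$\mu_x = \sum_{i=0}^{n_g-1} \sum_{j=0}^{n_g-1} i \, m(i,j); \quad \sigma_x = \left( \sum_{i=0}^{n_g-1} \sum_{j=0}^{n_g-1} (i-\mu_x)^2 \, m(i,j) \right)^{1/2}$$

$$\mu_y = \sum_{i=0}^{n_g-1} \sum_{j=0}^{n_g-1} j \, m(i,j); \quad \sigma_y = \left( \sum_{i=0}^{n_g-1} \sum_{j=0}^{n_g-1} (j-\mu_y)^2 \, m(i,j) \right)^{1/2} \tag{13}$$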
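To make the co-occurrence features concrete, here is a minimal sketch in plain Python that builds a normalized co-occurrence matrix and evaluates four of the features defined above (SAM, CONT, ENT and IM). The displacement of one pixel to the right, the non-symmetric matrix, the restriction of the contrast sum to pairs with |i-j| = n as in [8], and the function names are assumptions made for illustration; the paper does not state the displacement or the number of grey levels it used.

```python
import math


def cooccurrence(img, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for the displacement (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    total = 0
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < h and 0 <= c2 < w:
                counts[img[r][c]][img[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def texture_features(m):
    """SAM, CONT, ENT and IM computed from a normalized co-occurrence matrix."""
    ng = len(m)
    cells = [(i, j) for i in range(ng) for j in range(ng)]
    sam = sum(m[i][j] ** 2 for i, j in cells)
    # Contrast: n^2 times the mass of the pairs whose grey levels differ by n.
    cont = sum(n * n * sum(m[i][j] for i, j in cells if abs(i - j) == n)
               for n in range(1, ng))
    # Empty cells are skipped so that ln(0) is never evaluated.
    ent = -sum(m[i][j] * math.log(m[i][j]) for i, j in cells if m[i][j] > 0)
    im = sum(m[i][j] / (1 + (i - j) ** 2) for i, j in cells)
    return {"SAM": sam, "CONT": cont, "ENT": ent, "IM": im}


# Usage: a perfectly uniform patch and a strictly alternating one (two grey levels).
flat = texture_features(cooccurrence([[0, 0, 0], [0, 0, 0]], levels=2))
stripes = texture_features(cooccurrence([[0, 1, 0, 1]], levels=2))
```

The uniform patch gives the largest possible SAM and IM with zero contrast and entropy, while the alternating patch has lower SAM and IM and a non-zero contrast, which is the qualitative behaviour described in the text for homogeneous and strongly varying textures.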

Table 4 shows the values of these features computed on a region of four seeds of different species. The values show that the textural features can effectively discriminate between these seeds.

Table 4. Values of texture features calculated for various seeds

Seeds    SAM      CONT     ENT     VAR      IM      COR
Oats     0.0011   93.750   7.403   978.46   0.326   0.0087
Corn     0.0086   69.848   5.999   358.73   0.498   0.0141
Lentil   0.0069   28.656   5.758   290.48   0.631   0.0126
Barley   0.0015   131.39   7.153   581.90   0.364   0.0085

Principal Component Analysis makes it possible to create new features as linear combinations of the initial parameters. Figure 4 shows the projection of the seed data on the plane defined by the first and second principal components. These two components account for 89.9% of the total information.

Figure 4. Projection of seeds on the first two principal axes

Table 5 summarizes the classification rates of the four classes of seeds obtained with the k-means algorithm. The total classification rate of 78% can be considered satisfactory. However, as with the shape features, the confusion between oats, barley and corn is not avoided. We have checked that certain seeds correctly recognized with the shape features are no longer recognized with the texture features, and vice versa.

Table 5. Classification rate using texture features

Seeds              Oats   Corn   Lentil   Barley
Recognition rate   39%    90%    100%     83%
Total: 78%

5. RECOGNITION BASED ON BOTH SHAPE AND TEXTURE FEATURES

Since the results obtained with the two preceding procedures are not sufficient, and since some seeds recognized by the first procedure are not recognized by the second and vice versa, we propose to combine all the shape and texture features in order to improve the recognition rate. This third procedure is applied to the same database. Figure 5 shows the projection of the data on the first and second axes of the Principal Component Analysis.

Figure 5. Projection of seeds on the first two principal axes
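Combining the two feature sets amounts to building one longer feature vector per seed. The sketch below, in plain Python, concatenates the shape and texture vectors and then rescales each feature to zero mean and unit standard deviation. The rescaling step is an assumption added here because the features have very different ranges; the paper does not state whether any normalization was applied, and the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def combine(shape_rows, texture_rows):
    """Concatenate the shape and texture vectors of each seed."""
    return [list(s) + list(t) for s, t in zip(shape_rows, texture_rows)]


def standardize(rows):
    """Rescale every feature to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant features are left at 0
    return [[(v - mu) / sd for v, mu, sd in zip(row, mus, sds)] for row in rows]


# Usage with made-up numbers: two seeds, two shape features and one texture feature.
shape = [[0, 10], [2, 10]]
texture = [[5], [7]]
combined = combine(shape, texture)
scaled = standardize(combined)
```

The combined vectors can then go through the same reduction, clustering and nearest-distance decision steps as the shape-only and texture-only vectors.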

Following the same steps as for the two preceding procedures, the procedure combining the shape and texture features clearly improves the global recognition rate, as shown in Table 6. Lentil is still recognized at 100%. The corn rate of 99%, rather than 100%, is probably due to a few seeds with very particular shapes, close to those of barley. Oats and barley now reach 89% and 69% respectively, which is more balanced than with either feature set alone but still needs to be improved.

Table 6. Recognition results with shape and texture features

Seeds              Oats   Corn   Lentil   Barley
Recognition rate   89%    99%    100%     69%
Total: 89.25%
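The totals reported in Tables 3, 5 and 6 are consistent with a plain average of the four per-class rates, which is what one would expect if every class contains the same number of test seeds. The short check below assumes exactly that; the dictionary layout and the function name are illustrative.

```python
# Per-class recognition rates (in %) copied from Tables 3, 5 and 6.
rates = {
    "shape": {"Oats": 97, "Corn": 99, "Lentil": 100, "Barley": 47},
    "texture": {"Oats": 39, "Corn": 90, "Lentil": 100, "Barley": 83},
    "shape + texture": {"Oats": 89, "Corn": 99, "Lentil": 100, "Barley": 69},
}


def mean_rate(per_class):
    """Unweighted mean of the class rates (assumes equally sized classes)."""
    return sum(per_class.values()) / len(per_class)


totals = {name: mean_rate(per_class) for name, per_class in rates.items()}
for name, value in totals.items():
    print(f"{name}: {value:.2f}%")
```

The per-class figures also show where the gain comes from: the combined features raise barley from 47% (shape only) and oats from 39% (texture only), at the cost of lower rates than the best single-feature result for each of these two species.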

6. CONCLUSION

While pattern recognition gives very interesting results in some industrial applications, in other fields the recognition rate has not reached the hoped-for value. In all cases, the equipment used for image acquisition influences the quality of the results. Other factors, related to the lighting and to the nature and characteristics of the objects to be recognized, also have an influence. In the field in which we are interested, the seed recognition rates based on shape and texture features are 85.75% and 78% respectively. Combining these features improves the recognition rate, which reaches 89.25% despite the limited quality of the image acquisition devices at our disposal. With professional devices, these results could be further improved. The shape and texture features used may not be sufficient. To improve the results, other types of features can be considered. In addition, other classification methods based on other concepts, such as neural networks and fuzzy logic, can be introduced.

REFERENCES

[1] P. M. Granitto, P. F. Verdes, H. A. Ceccatto, “Large scale investigation of weed seed identification by machine vision”, Instituto de Física Rosario (IFIR), Argentina, 2004.

[2] N. Otsu, “A threshold selection method from gray-level histograms”, IEEE Trans. on Systems, Man and Cybernetics, vol. SMC-9, no. 1, pp. 62-66, 1979.

[3] J. De Lagarde, “Initialisation à l’Analyse de Données”, Edition Dunod, 1983.

[4] M. K. Hu, “Visual pattern recognition by moment invariants”, IRE Trans. on Information Theory, vol. IT-8, pp. 179-187, 1962.

[5] R. Xu, D. Wunsch, “Survey of clustering algorithms”, IEEE Trans. on Neural Networks, vol. 16, pp. 645-678, 2005.

[6] J. MacQueen, “Some methods for classification and analysis of multivariate observations”, in Proc. 5th Berkeley Symp., pp. 281-297, 1967.

[7] T. R. Reed, J. M. Hans du Buf, “A review of recent texture segmentation and feature extraction techniques”, CVGIP: Image Understanding, vol. 57, pp. 359-372, 1993.

[8] R. M. Haralick, K. Shanmugam, I. Dinstein, “Textural features for image classification”, IEEE Trans. on Systems, Man and Cybernetics, vol. SMC-3, no. 6, pp. 610-621, 1973.