
Controlled Classification of Earth Remote Sensing Data

V. V. Asmus (a), A. A. Buchnev (b), and V. P. Pyatkin (b)

(a) The Planeta Research Center, Moscow, Russia
(b) Institute of Computational Mathematics and Mathematical Geophysics, Siberian Branch, Russian Academy of Sciences, Novosibirsk, Russia

E-mail: [email protected]

Received September 3, 2007

Abstract—A system for controlled (trainable) classification of Earth remote sensing data is described. Computational experiments estimating the efficiency of distributed data processing are presented.

DOI: 10.3103/S8756699008040079

INTRODUCTION

The key task of interpreting Earth remote sensing data (ERSD), that is, improving the decoding quality, is connected with the problem of choosing an adequate recognition algorithm [1]. The difficulties arising here are caused by the following [2]:

1. The real data structure does not correspond to the data model used in the proposed controlled classification algorithm; e.g., the assumption of a normal distribution of the data vectors, or the condition that the measurement field is random, is violated. Experience shows that these situations occur when the object radiation exceeds the dynamic range of the equipment. Hence, one has to abandon methods that require covariance matrix inversion, or apply approaches that increase the data variance.

2. Unrepresentative training sequences: insufficient data for reconstructing the decision rule parameters; inadequacy of the training data to the data to be recognized (sample "smearing" by mixed measurement vectors, i.e., vectors that appear when several natural objects occur in one resolution element of the system; incomplete adequacy of the training data obtained by clustering to the true mathematical classes; equipment interference; the influence of atmospheric conditions, etc.).

Thus, recent experience of computer-aided ERSD recognition shows that it is practically impossible to determine beforehand the best algorithm in terms of the trade-off between classification quality and cost. Hence, it is advisable to build several algorithms into the recognizing system and choose the optimal algorithm empirically, using the results of test data classification at the training stage. The chosen algorithm is then used for recognizing the whole set of measurement vectors.

A software system for controlled classification (with training) of ERSD was developed jointly by the Institute of Computational Mathematics and Mathematical Geophysics (Novosibirsk) and the Planeta Research Center (Moscow). The system contains seven classifiers based on the Bayes maximum likelihood strategy (one element classifier and six object classifiers) and two object minimum-distance classifiers. The system is part of the ERSD processing software system [3].

Our goal is to propose a spectrum of algorithms for controlled classification of Earth remote sensing data that are adequate to the data model.

ELEMENT CLASSIFICATION

An element is an N-dimensional vector of measurements (features) x = (x_1, …, x_N)^T, where N is the number of spectral ranges. It is assumed that in class ω_i the vectors x have the normal distribution N(m_i, B_i) with mean m_i and covariance matrix B_i. In this case, the Bayes maximum likelihood strategy for the element classifier is formulated as follows [4–6].


ISSN 8756-6990, Optoelectronics, Instrumentation and Data Processing, 2008, Vol. 44, No. 4, pp. 331–336. © Allerton Press Inc., 2008.

Original Russian Text © V.V. Asmus, A A. Buchnev, V.P. Pyatkin, 2008, published in Avtometriya, 2008, Vol. 44, No. 4, pp. 60–67.

ANALYSIS AND SYNTHESIS OF SIGNALS AND IMAGES


Let Ω = (ω_1, …, ω_m) be the finite set of classes and p(ω_i) be the a priori probability of class ω_i. Then the discriminator of ω_i has the form

g_i(x) = ln p(ω_i) − 0.5 ln|B_i| − 0.5 (x − m_i)^T B_i^{−1} (x − m_i).    (1)

The classic decision rule for the classifier is as follows: the vector x is entered into class ω_i if g_i(x) > g_j(x) for all j ≠ i.

For the class ω_i, we define the parameter

T_i = ln p(ω_i) − 0.5 A(N, Q) − 0.5 ln|B_i|,    (2)

where A(N, Q) is the critical value at level Q of the χ² distribution with N degrees of freedom. Let t_i be the variable whose value depends on the classification parameter thr:

t_i = T_i,                       if thr = 1,
      min_{1≤i≤m} T_i,           if thr = 2,
      max_{1≤i≤m} T_i,           if thr = 3,
      (1/m) Σ_{i=1}^{m} T_i,     if thr = 4,
      −∞ (no rejection),         if thr = 5.    (3)

Then the decision rule for the classifier, in view of (3), is as follows [2]: the vector x is entered into ω_i if g_i(x) > g_j(x) for all j ≠ i and g_i(x) > t_i (for t_i = T_i, this means restricting the Mahalanobis distance (x − m_i)^T B_i^{−1} (x − m_i) ≤ A(N, Q) to the center of the class). Otherwise, the vector is entered into the class of rejected vectors (class number m + 1).

The values of A(N, Q) for vector dimension N ≤ 30 are found from statistical tables. For example, for N = 4 and Q = 0.05 (i.e., 5% of vectors may be rejected), A = 9.488. For N > 30, the values of A(N, Q) are found via the approximation

A(N, Q) ≈ N (1 − 2/(9N) + u_{1−Q} √(2/(9N)))³,    (4)

where u_{1−Q} is the quantile of the standardized normal distribution for probability 1 − Q (in particular, for the level Q = 0.05, u_{0.95} = 1.645).
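Approximation (4) has the Wilson–Hilferty form and can be checked against the tabulated value quoted above; a short sketch:

```python
import math

def chi2_critical(N, u_1mQ):
    # Eq. (4): A(N, Q) ~ N * (1 - 2/(9N) + u_{1-Q} * sqrt(2/(9N)))^3
    return N * (1.0 - 2.0 / (9 * N) + u_1mQ * math.sqrt(2.0 / (9 * N))) ** 3
```

For N = 4 and u_{0.95} = 1.645 the approximation gives about 9.46, close to the tabulated A = 9.488; the agreement improves as N grows, which is why the system switches to (4) only for N > 30.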

OBJECT CLASSIFICATION

An object is a block of adjacent vectors forming a square or a cross. Since the physical dimensions of actually scanned spatial objects are usually greater than the resolution of the shooting system, the data vectors are interrelated. This information makes it possible to increase the classification accuracy by recognizing a group of adjacent vectors, that is, an object, simultaneously. Let us consider the vector (object) X = (x_1, …, x_L)^T consisting of adjacent N-dimensional vectors x_i, i = 1, …, L, e.g., in 3 × 3, 5 × 5, … neighborhoods. We use two kinds of objects: square and cross-shaped. The decision on referring the central element of an object to some class is made from the result of classifying the whole object.


This approach generates a whole family of decision rules: first, using the voting principle, i.e., classifying the object elements independently and referring the central element to the class to which most object elements were referred; second, applying texture operators (the simplest example is describing the object X via the vector of central components of its elements) and then referring the central element to the class to which this descriptor of X was referred; third, describing the object X by a causal Markovian random field

p(X | ω_i) = p(x_1 | x_2, …, x_L; ω_i) p(x_2 | x_3, …, x_L; ω_i) ⋯ p(x_L | ω_i).

In this case, the model looks as follows. Let the vector x have, in the class ω_i, the normal distribution N(m_i, B_i) with mean m_i and covariance matrix B_i. Then the vector X is also normally distributed in the class ω_i, with the mean M_i of dimension NL and the NL × NL covariance matrix K_i. Estimating this matrix for large NL (enormous training data would be required) and inverting it are hardly feasible. Hence, we introduce simplifying structural assumptions. Assuming that the correlation between object elements is the same in all shooting zones, the correlation matrix K_i may be represented as the direct (Kronecker) product of the spatial correlation matrix R_i and the covariance matrix B_i. If R_i is a unit matrix, then p(X | ω_i) = Π_{l=1}^{L} p(x_l | ω_i); hence, we obtain the known decision rule under the assumption that the object elements are independent. More adequate models appear under other assumptions on the correlation structure. For example, assuming separability of the autocorrelation function of object elements in the vertical and horizontal directions, we obtain a causal autoregressive model of first or third order (depending on the object shape).
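The separable covariance structure K_i = R_i ⊗ B_i can be illustrated with numpy's Kronecker product; R and B below are small hypothetical matrices, not parameters from the paper:

```python
import numpy as np

L, N = 3, 2                      # object elements, spectral channels
R = np.eye(L)                    # spatial correlation matrix (unit => independence)
B = np.array([[2.0, 0.5],
              [0.5, 1.0]])       # per-element covariance B_i

K = np.kron(R, B)                # NL x NL covariance K_i = R_i (x) B_i
# With R = I, K is block-diagonal with L copies of B, so the joint density
# factorizes: p(X|w_i) = prod_l p(x_l|w_i).
```

Storing R (L × L) and B (N × N) instead of K (NL × NL) is what makes estimation and inversion feasible for large blocks.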

These assumptions yield the algorithms of the object classifiers. First, we assume that the vectors inside the block are independent and consider the vectors composing the object X as one NL-dimensional vector. Then the discriminator of class ω_i is

g_i(X) = ln p(ω_i) − 0.5 L ln|B_i| − 0.5 Σ_{l=1}^{L} (x_l − m_i)^T B_i^{−1} (x_l − m_i).

The decision rule for this classifier has the form: the central element of X is entered into class ω_i if g_i(X) > g_j(X) for all j ≠ i and g_i(X) > t_i; otherwise, it is entered into the class of rejected vectors. Here, t_i is found from (2) and (3) with A = A(LN, Q).

Again, we assume that the vectors inside the block are independent. The classified vector is the average over all vectors of X:

x̄ = (1/L) Σ_{l=1}^{L} x_l.    (5)

The discriminator of ω_i is written as

g_i(X) = ln p(ω_i) − 0.5 ln|B_i| − 0.5 L (x̄ − m_i)^T B_i^{−1} (x̄ − m_i).

The decision rule for this classifier is as follows: the central element of X is entered into class ω_i if g_i(X) > g_j(X) for all j ≠ i and g_i(X) > t_i; otherwise, it is entered into the class of rejected vectors. Here, t_i is determined by (2) and (3).
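The averaged-vector discriminator can be sketched directly from (5): the block mean x̄ replaces the individual vectors, and the quadratic form is scaled by L because the mean of L independent vectors has covariance B_i/L. A minimal sketch with numpy and hypothetical parameters:

```python
import numpy as np

def g_mean_vector(block, prior, mean, cov):
    # block: L x N array of the object's measurement vectors.
    L = block.shape[0]
    x_bar = block.mean(axis=0)                       # Eq. (5)
    d = x_bar - mean
    quad = d @ np.linalg.inv(cov) @ d
    # ln p(w_i) - 0.5 ln|B_i| - 0.5 * L * (x_bar - m_i)^T B_i^{-1} (x_bar - m_i)
    return np.log(prior) - 0.5 * np.log(np.linalg.det(cov)) - 0.5 * L * quad
```

When every vector of the block equals the class mean, only the prior and determinant terms remain.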

Finally, we classify the mean vector (5) of the block assuming that the vectors inside the block are independent and the covariance matrices are unit matrices. These are, in fact, object classifiers whose decision rules are based on the minimum Euclidean distance to the center of the class. The discriminator of ω_i has the form

g_i(X) = ln p(ω_i) − 0.5 (x̄ − m_i)^T (x̄ − m_i).

The decision rule for these classifiers has the form: the central element of X is entered into ω_i if g_i(X) > g_j(X) for all j ≠ i and g_i(X) > t_i; otherwise, it is entered into the class of rejected vectors. Here, t_i = −A, where A ≥ 0 is a number given by the user.


The classification system also contains object classifiers based on the model of a causal Markovian random vector of first and third orders.

TRAINING AND WORK OF CLASSIFIERS

To construct the discriminators of the classes, the following estimates of statistical characteristics are required: vectors of means, covariance matrices, and coefficients of spatial correlation between the coordinate values of neighboring vectors in the horizontal and vertical directions; they are determined from the vectors of training samples (fields). Besides the training fields, a set of control fields may also be given for each class. A control field is a sample from the training pattern of the set of measurement vectors with known classification that is not involved in determining the classifier parameters, but is presented for recognition to a classifier trained on measurement vectors from the training fields, in order to estimate the probabilities of correct classification.

All the classifiers may be used in the test and operating modes. An error matrix and estimates of the correct classification probabilities are calculated from the results of classifier operation in the test mode on the vectors of the training and control fields. It is known (see, e.g., [4, 6]) that the estimates for vectors from the training fields are, on average, optimistic, while those for vectors from the control fields are pessimistic. By analyzing these data, we can estimate (control) the training quality.

As already mentioned, in some situations the condition of a random measurement field is violated, which results in zero variances in some channels. Formulas like (1) then become inapplicable because the covariance matrices B are singular. To correct such situations, the classification system provides a function that adds Gaussian noise with zero mean and unit variance to the spectral channels with zero variance.

To model the standard normal random variable X, we use an approximate formula based on the central limit theorem [7]:

X = Σ_{i=1}^{12} U_i − 6,    (6)

where the U_i are independent random variables uniformly distributed on the interval (0, 1). Methods for "exact" modeling of the random variable X are known (see, e.g., [8]), but they require much more computation time.
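Formula (6) is cheap because each U_i contributes variance 1/12, so twelve of them sum to unit variance; subtracting 6 centers the sum at zero. A sketch using the standard library:

```python
import random

def std_normal_clt(rng=random):
    # Eq. (6): the sum of 12 independent U(0, 1) variables, shifted by 6,
    # has mean 0 and variance 1 (each U_i contributes variance 1/12).
    return sum(rng.random() for _ in range(12)) - 6.0
```

A quick empirical check: over many samples the mean is close to 0 and the variance close to 1, which is sufficient for breaking the singularity of the covariance matrices.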

The result of using the classifiers in the operating mode is a one-channel (byte) image in which the pixel values are class numbers. The image is colored in predetermined colors that may be changed interactively by the user. Moreover, an edit function may be applied to the image; it refines the classification chart by taking the context into account, without changing the list of previously marked classes. The function works in two modes: in Vote mode, the central pixel of a 3 × 3 neighborhood is replaced by the mode of the neighborhood histogram; in Allsame mode, the central pixel changes only when all surrounding pixels have the same value.
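The two edit modes can be sketched for a single 3 × 3 window of class numbers; the function name and window representation below are hypothetical, chosen only for illustration:

```python
from collections import Counter

def edit_pixel(window, mode):
    # window: 3x3 nested list of class numbers; returns the new central value.
    center = window[1][1]
    neighbors = [window[r][c] for r in range(3) for c in range(3)
                 if (r, c) != (1, 1)]
    if mode == "vote":
        # Vote mode: replace the centre by the mode of the neighborhood histogram.
        return Counter(neighbors + [center]).most_common(1)[0][0]
    if mode == "allsame":
        # Allsame mode: change the centre only when all surrounding pixels agree.
        return neighbors[0] if len(set(neighbors)) == 1 else center
    return center
```

Applied over the whole classification chart, such a pass smooths isolated misclassified pixels while leaving the class list unchanged.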

Characteristics of the controlled classification system: up to 9 training patterns, up to 15 classes, up to 10 training and control fields per class, field size up to 50 × 50 vectors, object size from 1 × 1 to 11 × 11, and unlimited data vector size.

DISTRIBUTED COMPUTING

Analysis of the formulas for computing the decision function values of the object classifiers shows that the main contribution to the computing time comes from evaluating the quadratic form with an N × N matrix, where N is the number of spectral ranges. In particular, in the causal Markovian random field classifiers, classifying each vector with a 3 × 3 object requires computing the quadratic form value 13m times for the cross and 53m times for the square, where m is the number of classes. For a 5 × 5 object, the number of quadratic form evaluations per vector is 25m and 186m, respectively. Table 1 shows the time characteristics (in seconds) of classifying 10^7 vectors of size N = 5 on an IBM PC with an AMD Athlon XP 3200+.

The computing time can be substantially decreased by creating a classification system distributed between a PC and the MBC-1000/M multiprocessor computer. The PC is responsible for classifier training. The training results, along with the data set to be classified, are transferred via the SFTP protocol to the MBC-1000/M, which starts a parallel version of the corresponding program. Parallelization consists in distributing the data set among the given number of processors, each processor writing its results into a separate file. The files are transferred to the PC, which "glues" them together and carries out further interpretation of the classification results. Table 2 shows the dynamics of classifying 10^7 vectors on the MBC-1000/M depending on the number of processors. The vector size is N = 3, the number of classes is m = 5, the object size is 9 × 9 pixels, and the object type is a square.
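The data-parallel scheme described above amounts to splitting the vector set into near-equal index ranges, one per processor, and concatenating the per-processor results in order. A minimal sketch with hypothetical function names:

```python
def split_chunks(n_items, n_procs):
    # Distribute n_items as evenly as possible among n_procs workers,
    # returning (start, stop) index ranges, one per processor.
    base, extra = divmod(n_items, n_procs)
    ranges, start = [], 0
    for p in range(n_procs):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

def glue(parts):
    # Concatenate the per-processor result lists (files) in processor order.
    out = []
    for part in parts:
        out.extend(part)
    return out
```

Because the classifiers treat each vector (or block) independently, this partitioning requires no communication between processors during classification.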

The figure shows an ice-condition map of the eastern Arctic obtained using the controlled classification. The images to be recognized were mosaics of radar and radiometric pictures from the Okean-O1 artificial Earth satellite. A color-synthesized image was used for choosing the test areas.


Table 1

Number of classes | Crossing 3×3 | Crossing 5×5 | Square 3×3 | Square 5×5
        5         |     80.7     |    147.4     |   311.6    |   1049.1
       10         |    150.8     |    291.0     |   610.1    |   2120.4

Table 2

Number of processors |   1   |   2  |   3    |   4    |  20
Time, s              | 10200 | 5070 | 3242.5 | 2412.5 | 593
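The timings of Table 2 imply close-to-linear scaling; a small check computing speedup and parallel efficiency from the quoted values:

```python
# Timings (seconds) from Table 2: processors -> classification time.
timings = {1: 10200.0, 2: 5070.0, 3: 3242.5, 4: 2412.5, 20: 593.0}

def speedup(procs):
    # Ratio of single-processor time to time on `procs` processors.
    return timings[1] / timings[procs]

def efficiency(procs):
    # Fraction of ideal linear speedup actually achieved.
    return speedup(procs) / procs
```

On 20 processors the speedup is about 17.2 (efficiency around 86%), consistent with a data-parallel scheme whose only serial parts are the transfer and "gluing" steps.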

Figure. Ice-condition map of the eastern Arctic obtained using the controlled classification.


CONCLUSIONS

The described classification system is part of the software for processing Earth remote sensing data. This software has been introduced into the current operational and research practice of the Planeta Research Center. We note that the so-called object classifiers proposed for the controlled classification are absent from standard commercial digital image processing packages.

ACKNOWLEDGMENTS

This research was partly supported by the Russian Foundation for Basic Research, project

no. 07-07-00085.

REFERENCES

1. S. Davis, D. Landgrebe, et al., Remote Sensing: The Quantitative Approach, Ed. by P. Swain and S. Davis (McGraw-Hill, New York, 1978).

2. V. V. Asmus, Doctoral Dissertation in Mathematical Physics (Moscow, 2002).

3. V. V. Asmus, A. A. Buchnev, and V. P. Pyatkin, "Earth Remote Sensing Data Processing Software," in Proceedings of the Thirty-Second International Conference on Information Technologies in Science, Education, Telecommunication, and Business (IT SE’2005) (Yalta, 2005).

4. J. Tou and R. Gonzalez, Pattern Recognition Principles (Addison-Wesley, London, 1974).

5. R. Gonzalez and R. Woods, Digital Image Processing (Prentice-Hall, New Jersey, 2002).

6. J. P. Marques de Sa, Pattern Recognition: Concepts, Methods and Applications (Springer, Berlin, 2001).

7. N. P. Buslenko and Yu. A. Shreider, The Monte-Carlo Statistical Test Method with Implementation on Digital

Computers (Fizmatgiz, Moscow, 1961) [in Russian].

8. G. Vinkler, Image Analysis, Random Fields, and Monte-Carlo Dynamic Methods. Mathematical Foundations

(SO RAN, Novosibirsk, 2002) [in Russian].
