
Precision Tracking with Segmentation for Imaging Sensors

ELIEZER ORON, ANIL KUMAR, and YAAKOV BAR-SHALOM, Fellow, IEEE, University of Connecticut

We present a method for precision target tracking based on data obtained from imaging sensors when the target is not fully visible during tracking. The image is divided into several layers of gray level intensities and thresholded. A binary image is obtained and grouped into clusters using image segmentation techniques. The association of the various clusters to the track to be estimated relies on both the motion and pattern recognition characteristics of the target. Using the centroid measurements of the clusters, the probabilistic data association filter (PDAF) is employed for state estimation. Expressions for the single-frame-based centroid measurement noise variance of the target cluster and the optimal parameters for cluster segmentation are given. The simulation results presented validate both the expressions for the measurement noise variance as well as the performance predictions of the proposed tracking method.

The method is first illustrated on a dim synthetic target occupying about 80 pixels within a 64 x 64 frame in the presence of a noise background which can be stronger than the target. The target is modeled as having an intensity distribution within a narrow range and the background in a much wider range, both above and below the average intensity of the target. The binary image obtained after an "intensity bandpass" thresholding is reduced to clusters by the nearest neighbor criterion. The results show that it is possible to achieve subpixel accuracy in the range of 0.3 to 0.4 pixel rms error with moderate (0.7) to low (0.3) target pixel detection probability. The subpixel accuracy can be further improved for a larger target.

The usefulness of the method for practical applications is demonstrated by considering a sequence of real target images (a moving car) where the measurement noise variance was calculated as having a 0.7 pixel rms value. The achieved filter accuracy for position was 0.4 pixel rms in each coordinate and 0.09 pixel/frame for velocity after 10 frames.

Manuscript received April 1, 1992; revised June 4, 1992.

IEEE Log No. T-AES/29/3/07981.

This work was supported in part by the Office of Naval Research under Grant N00014-91-J-1950.

Authors' address: Dept. of Electrical and Systems Engineering, University of Connecticut, Rm. 312, U-157, 260 Glenbrook Rd., Storrs, CT 06269-3157. Dr. Oron is on leave from Israel Aircraft Industry.

0018-9251/93/$3.00 © 1993 IEEE

I. INTRODUCTION

Many researchers have addressed the problem of accurately tracking an extended target [3, 9-11, 13, 14] using forward-looking infrared (FLIR) imaging sensors. In [9-11] an extended Kalman filter with the observed intensities as nonlinear measurements was employed for tracking. This approach is computationally expensive as it used a 64-dimensional measurement vector. In [14], a linear filter was used with centroid and centroid offset measurements. It was assumed that the measurement noise is white and its variance known. In [3], explicit expressions for the measurement noise statistics were presented and it was also shown that the offset measurement noise is autocorrelated and is the output of a moving average system driven by white noise. In [13], a joint probabilistic data association merged-measurement coupled filter was used for tracking crossing targets using centroid and centroid offset measurements. All the above methods assume that the target has the strongest intensity and is small (about 10 pixels).

When the target image is "large" these methods are no longer valid. In [5], the utility of a nonlinear filtering approach for extracting motion (rotational and translational) parameters of a known rigid body, based on a sequence of images, was demonstrated. Here we propose a general technique which is independent of the size of the target and less sensitive to target intensity. In our approach, the target motion and pattern (object) characteristics are used in the data association for precision target tracking in a noise background which can be stronger than the target.

The association of images to the target trajectory in a tracking process is done by exploiting certain typical characteristics of the target. These characteristics can be found either by pattern (object) recognition methods or by motion recognition methods. For example, typical motion characteristics of the target are its location, velocity, and acceleration (state vector), whereas typical object characteristics are its geometric structure and energy distribution (the gray levels in the image) in one or more spectral bands. The pattern recognition is mostly done on the discrete image (intra-scan or image level) while the motion recognition is done using successive frames (inter-scan level). Motion recognition methods are the only ones that can be used for point targets, as in conventional radar tracking, where it is easy to measure position and velocity. When there are more detailed measurements of the target, as in the case of high resolution target images obtained by electro-optical sensors, synthetic aperture radar, or in group tracking, one can also use object recognition methods. Tracking by electro-optical sensors, like CCD (charge coupled device) based cameras in visible light or FPA (focal plane array) based cameras in infrared light, has the advantages of good accuracy, high resolution, and a high scanning rate, which enable object recognition techniques to be used in the data association.


The main drawback of these sensors is the large data size, which results in complicated data association algorithms and extensive computations in real time.

A practical approach to tracking the electro-optical images of targets using the advantages of both motion and object recognition techniques can include the following steps.

Intrascan (Single Image) Level: 1) Spatial noise filtering by image processing principles (like averaging kernels), 2) identifying potential targets by image segmentation, and 3) calculation of centroid of each of the identified targets.

Inter-Scan Level: 1) Tracking centroids using single or multiple target tracking techniques, and 2) separation of the true and false targets by association based on the motion and object characteristics.

The advantages of this tracking method are:

a) improvement of the signal-to-noise ratio (SNR) by filtering the spatial random noise;

b) segmentation techniques help in data association but are not as complicated as pattern recognition techniques;

c) tracking centroids instead of pixels reduces the number of "targets" from about 250,000 pixels in a typical video image to only 10-20 relevant centroids;

d) method can be used to track targets which are weaker than the background noise;

e) inaccuracies of data association from segmentation are reduced by using the conventional motion tracking.

Image segmentation is a critical step in this method. It has to be simple enough to be used in real-time applications, but it also has to include enough object recognition aspects. It must provide an output between image frames in order to be used in the inter-scan data association.

We present here an approach which combines both object and motion recognition methods for practical target images. Motivated by electro-optic image patterns, we chose the intensity and the size of the cluster of the target as its typical target recognition characteristics. It is assumed that the pixel intensity is discretized into 256 gray levels. The image is divided into several layers of gray level intensities. Each layer is characterized by upper and lower gray level limits. It is assumed that a sufficient number of target pixel intensities are within the limits of a certain layer. Using the upper and lower limits of the "target layer" as threshold limits, the image is converted to a binary image. The binary image is reduced to clusters by the nearest neighbor criterion [1, 7, 8, 15].
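As an illustration of the intra-scan processing just described (intensity-bandpass thresholding followed by nearest neighbor grouping and centroid computation), a minimal sketch in Python is given below. It assumes NumPy and SciPy; the function names and the KD-tree implementation of the single-linkage rule are ours, not part of the original implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def intensity_bandpass(image, i_lo, i_hi):
    """Hard-limit the gray-level image to a binary image: 1 where
    i_lo <= I <= i_hi (the assumed target layer), 0 otherwise."""
    return (image >= i_lo) & (image <= i_hi)

def nearest_neighbor_clusters(binary, d_p):
    """Single-linkage clustering: a detected pixel joins a cluster if it is
    within the proximity distance d_p of at least one pixel already in it."""
    coords = np.argwhere(binary).astype(float)   # (row, col) of detections
    if coords.size == 0:
        return []
    tree = cKDTree(coords)
    label = np.full(len(coords), -1, dtype=int)
    n_clusters = 0
    for seed in range(len(coords)):
        if label[seed] >= 0:
            continue
        stack = [seed]
        label[seed] = n_clusters
        while stack:                              # grow the chain of neighbors
            i = stack.pop()
            for j in tree.query_ball_point(coords[i], d_p):
                if label[j] < 0:
                    label[j] = n_clusters
                    stack.append(j)
        n_clusters += 1
    return [coords[label == k] for k in range(n_clusters)]

def centroid(cluster_pixels):
    """Centroid of a binary cluster (all retained pixels have unit weight)."""
    return cluster_pixels.mean(axis=0)
```

Clusters smaller than a chosen minimum size, or grossly different from the expected target size, can then be discarded before data association.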

In order to demonstrate the method for the case of one medium-size target in dense clutter, the probabilistic data association filter (PDAF) [2] was employed for state estimation using the centroid measurements of the clusters. The PDAF is needed to overcome the following problem. In the neighborhood of the predicted location of the target centroid during tracking one can find several centroids, due to splitting of the target cluster or noise clusters. A standard Kalman filter with a logic of measurement assignment would not be able to track the target in such a case. Further reduction in computational complexity can be achieved by ignoring the clusters with small probabilities of being the target. For example, if an a priori estimate of the target size is available, then all the clusters that differ sufficiently from the target size can be eliminated.

In Section II closed-form expressions for the single-frame-based centroid measurement noise variance, based on the target and background intensity statistical characterizations, are presented along with simple examples. The segmentation algorithm used to reduce the binary image to clusters is the topic of Section III. The measurement noise (single frame centroid error) variance obtained by analytical computations is compared with the values obtained using simulations. Section IV presents the state models for the PDAF algorithm used for precision tracking of a target based on its image. In Section V, a synthetic image model and numerical results validating the performance predictions of the proposed method are presented based on Monte Carlo runs. In Section VI, the tracking performance of the method is demonstrated considering a sequence of real target images obtained using an FPA-based camera in infrared light. Conclusions are given in Section VII.

II. CENTROID OF A RANDOM CLUSTER

A. Centroid of an Image with Random Intensities

Consider a cluster of N points (pixels) in a Cartesian coordinate system. Each point is denoted by a single index i, i = 1, ..., N. If each point has an intensity I_i, then the centroid of the cluster is defined as

$$\bar{x}_n = \frac{\sum_{i=1}^{N} x_{ni}\, I_i}{\sum_{i=1}^{N} I_i} \qquad (1)$$

where x_ni is the nth coordinate of point i. Suppose that the I_i are independent random variables (RVs) with mean μ_i and standard deviation σ_i; then x̄_n, given by (1), is also a RV whose first- and second-order moments can be calculated as follows. Equation (1) can be written as

$$\bar{x}_n = \frac{M_n}{I} \qquad (2)$$

where

$$M_n = \sum_{i=1}^{N} x_{ni}\, I_i \qquad (3)$$

and

$$I = \sum_{i=1}^{N} I_i. \qquad (4)$$

Being sums of independent RVs, M_n and I have mean and variance given by [12]

$$\mu(M_n) = \sum_{i=1}^{N} x_{ni}\,\mu(I_i) \qquad (5)$$

$$\operatorname{var}(M_n) = \sum_{i=1}^{N} x_{ni}^2\,\operatorname{var}(I_i) \qquad (6)$$

$$\mu(I) = \sum_{i=1}^{N} \mu(I_i) \qquad (7)$$

$$\operatorname{var}(I) = \sum_{i=1}^{N} \operatorname{var}(I_i). \qquad (8)$$

The mean μ(M_n) can be interpreted as the nth coordinate first moment of the intensity, var(M_n) as the nth coordinate moment of inertia of the intensity, and μ(I) as the total intensity. Now, if f = f(ξ, ψ) is a function of two RVs ξ and ψ, then under certain conditions [12] its variance var(f) can be approximated as

$$\operatorname{var}(f) = \left(\frac{\partial f}{\partial \xi}\right)^2 \sigma_\xi^2 + 2\,\frac{\partial f}{\partial \xi}\frac{\partial f}{\partial \psi}\, r\, \sigma_\xi \sigma_\psi + \left(\frac{\partial f}{\partial \psi}\right)^2 \sigma_\psi^2 \qquad (9)$$

where r is the correlation coefficient and the derivatives of f with respect to (wrt) ξ and ψ are evaluated at the mean values of ξ and ψ. Identifying f = x̄_n, ξ = M_n and ψ = I we get, at the mean,

$$\frac{\partial f}{\partial \xi} = \frac{1}{\psi} = \frac{1}{\mu(I)} = \frac{1}{\sum_{i=1}^{N}\mu(I_i)} \qquad (10)$$

$$\frac{\partial f}{\partial \psi} = -\frac{\xi}{\psi^2} = -\frac{\mu(M_n)}{\mu(I)^2} = -\frac{\sum_{i=1}^{N} x_{ni}\,\mu(I_i)}{\left(\sum_{i=1}^{N}\mu(I_i)\right)^2}. \qquad (11)$$

In order to avoid use of the unknown correlation coefficient r in (9), it is convenient to use as coordinate system origin the true centroid. In this case, at the mean,

$$\frac{\partial f}{\partial \psi} = -\frac{\sum_{i=1}^{N} x_{ni}\,\mu(I_i)}{\left(\sum_{i=1}^{N}\mu(I_i)\right)^2} = 0. \qquad (12)$$

Then the estimated variance of the centroid reduces to

$$\operatorname{var}(\bar{x}_n) = \frac{\operatorname{var}(M_n)}{\mu^2(I)} = \frac{\sum_{i=1}^{N} x_{ni}^2\,\operatorname{var}(I_i)}{\left(\sum_{i=1}^{N}\mu(I_i)\right)^2} \qquad (13)$$

where x_ni is the nth coordinate of pixel i wrt the center of mass.

In [3] the assumption was that there is a target with a certain intensity pattern observed in the presence of additive noise, while here a random target is assumed without noise; the background, which has a different distribution, will be accounted for later.
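As a quick numerical check of (13), one can simulate a cluster with independent random intensities and compare the sample variance of its centroid with the predicted value. The sketch below is ours and uses an arbitrary square cluster with i.i.d. Gaussian intensities.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 9x9 square cluster; coordinates are taken wrt the true centroid.
side = 9
xs, ys = np.meshgrid(np.arange(side) - side // 2, np.arange(side) - side // 2)
x_n = xs.ravel().astype(float)           # nth coordinate of each pixel
mu_i, sigma_i = 100.0, 10.0              # mean and std of the pixel intensities

# Prediction from (13): sum x_ni^2 var(I_i) / (sum mu(I_i))^2.
predicted = (x_n**2 * sigma_i**2).sum() / (mu_i * x_n.size)**2

# Monte Carlo estimate of the same centroid variance.
runs = 20000
intensities = rng.normal(mu_i, sigma_i, size=(runs, x_n.size))
centroids = (intensities * x_n).sum(axis=1) / intensities.sum(axis=1)
print(predicted, centroids.var())        # the two values should nearly agree
```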

B. Centroid of Binary Image

In our model, the clustering points are the image pixels. The gray level image is transformed to a binary image, with new intensity β_i, by a hard limiter according to

$$\beta_i = \begin{cases} 1, & I_L \le I_i \le I_H \\ 0, & \text{otherwise} \end{cases} \qquad (14)$$

where I_L and I_H are the threshold limits for the intensity. The probability of binary intensity p_i of pixel i, called the detection probability of this pixel by the threshold rule, is defined as

$$P\{\beta_i = 1\} = p_i \qquad (15)$$

$$P\{\beta_i = 0\} = 1 - p_i. \qquad (16)$$

In other words, three layers of intensity are considered, with the middle one used for target detection. For example, if I_i is N(μ, σ²), i.e., Gaussian with mean μ and variance σ², then

$$p_i = \frac{1}{\sigma\sqrt{2\pi}} \int_{I_L}^{I_H} e^{-(x-\mu)^2/2\sigma^2}\, dx = \Phi\!\left(\frac{I_H-\mu}{\sigma}\right) - \Phi\!\left(\frac{I_L-\mu}{\sigma}\right) \qquad (17)$$

where Φ is the standard Gaussian cumulative distribution function. The mean and the variance of the binary intensity of a single pixel are

$$\mu(\beta_i) = p_i \qquad (18)$$

$$\operatorname{var}(\beta_i) = p_i(1 - p_i). \qquad (19)$$

For the binary image, the variance of the centroid (13) becomes

$$\operatorname{var}(\bar{x}_n) = \frac{\sum_{i=1}^{N} x_{ni}^2\, p_i(1-p_i)}{\left(\sum_{i=1}^{N} p_i\right)^2}. \qquad (20)$$
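For a concrete use of (17) and (20): given Gaussian pixel intensities and an intensity band [I_L, I_H], the detection probability follows from the Gaussian distribution function and the centroid variance of the binary cluster follows from (20). A minimal sketch, assuming SciPy; the example values are the case A settings used later in the simulations.

```python
import numpy as np
from scipy.stats import norm

def detection_probability(mu, sigma, i_lo, i_hi):
    """Eq. (17): probability that a N(mu, sigma^2) pixel falls in the band."""
    return norm.cdf((i_hi - mu) / sigma) - norm.cdf((i_lo - mu) / sigma)

def binary_centroid_variance(x_n, p):
    """Eq. (20) with a common detection probability p for all pixels;
    x_n are the pixel coordinates taken wrt the true centroid."""
    x_n = np.asarray(x_n, dtype=float)
    return (x_n**2 * p * (1.0 - p)).sum() / (p * x_n.size)**2

# Case A of the later simulations: target pixels N(100, 10^2), band 90..110.
p_t = detection_probability(100.0, 10.0, 90.0, 110.0)
print(p_t)    # about 0.68, cf. Table III
```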

C. Random Intensity Target Without Noise

Consider a target of arbitrary shape having a size of N_t pixels. The intensity of the pixels is given by

$$I_i = \begin{cases} S_i, & 1 \le i \le N_t \\ 0, & \text{elsewhere.} \end{cases} \qquad (21)$$

If the S_i are i.i.d. N(s, σ²), then the pixel detection probability p_i is

$$p_i = \frac{1}{\sigma\sqrt{2\pi}} \int_{I_L}^{I_H} e^{-(\zeta-s)^2/2\sigma^2}\, d\zeta \triangleq p. \qquad (22)$$


Then

$$\operatorname{var}(M_n) = p(1-p) \sum_{i=1}^{N_t} x_{ni}^2 \qquad (23)$$

$$\mu(I) = p\, N_t \qquad (24)$$

$$\operatorname{var}(\bar{x}_n) = \frac{(1-p)\sum_{i=1}^{N_t} x_{ni}^2}{p\, N_t^2}. \qquad (25)$$

This last result is also valid for a target with any intensity distribution, provided the probability p is known.

Examples: 1) A circular target with a radius of R pixels. Using (23) through (25) yields

$$\sum_{i=1}^{N_t} x_{ni}^2 = \int_0^{2\pi}\!\!\int_0^R \rho^3 \cos^2\theta \, d\rho\, d\theta = \frac{\pi R^4}{4} \qquad (26)$$

$$N_t = \pi R^2 \qquad (27)$$

$$\operatorname{var}(\bar{x}_n) = \frac{1-p}{4\pi\, p}. \qquad (28)$$

2) A rectangular target with dimensions a in coordinate n and b in the other coordinate. Again using (23) and (24),

$$\sum_{i=1}^{N_t} x_{ni}^2 = \int_{-a/2}^{a/2}\!\!\int_{-b/2}^{b/2} \rho^2\, d\tau\, d\rho = \frac{a^3 b}{12} \qquad (29)$$

$$N_t = ab \qquad (30)$$

$$\operatorname{var}(\bar{x}_n) = \frac{(1-p)\, a}{12\, p\, b}. \qquad (31)$$

D. Random Target With Noise Background

Consider a target of size (area) N_t pixels with a "basic" ring of N_v pixels around it, with pixel detection probabilities p_t for the target and p_v for the ring area (due to noise), respectively. The intensity of the pixels is given by

$$I_i = \begin{cases} S_i, & 1 \le i \le N_t \\ V_i, & \text{elsewhere.} \end{cases} \qquad (32)$$

In this case the image is a combination of the N_t pixels of the target and the N_v pixels which might have noise detections. Then the variance of the centroid estimate is, using (20),

$$\operatorname{var}(\bar{x}_n) = \frac{p_t(1-p_t)\sum_{i\in T} x_{ni}^2 + p_v(1-p_v)\sum_{i\in N} x_{ni}^2}{(p_t N_t + p_v N_v)^2} \qquad (33)$$

where T is the set of target pixels and N is the set of noise pixels.

Consider now a circular target of radius r_t in a noise background of radius r_t + d_p, where d_p is the width of the ring containing the N_v pixels, with

$$N_v = \pi\left[(r_t + d_p)^2 - r_t^2\right]. \qquad (34)$$

This width d_p is taken as the proximity distance of the noise clusters (to be discussed in the next section in the context of the segmentation). Using (26), (27), and (33) we get

$$\operatorname{var}(\bar{x}_n) = \frac{(1-p_t)\,p_t\, r_t^4 + (1-p_v)\,p_v\left[(r_t + d_p)^4 - r_t^4\right]}{4\pi\left\{p_t r_t^2 + p_v\left[(r_t + d_p)^2 - r_t^2\right]\right\}^2}. \qquad (35)$$

The above equation is a coarse approximation for the cluster case because it does not take into account the chain effect of the clustering algorithm: if one of the noise pixels becomes part of the augmented target, it can link to another one, which eventually leads to asymmetrical additions to the target. This is taken into account next.

Assuming p_v ≪ p_t, the circular target can be considered to be in a "maximum noise ring" background of width r_s, where

$$r_s = d_p\, p_v\, N_v \qquad (36)$$

i.e., the longest radial chain consisting of the average number of noise pixels p_v N_v. The probability that a pixel from the maximum ring belongs to the target cluster is proportional to the noise pixel detection probability p_v,

$$p_s = \alpha\, p_v \qquad (37)$$

where the proportionality factor α is the ratio of the area N_v of the basic ring (with width d_p) to the area N_s of the maximum ring (with width r_s),

$$\alpha = \frac{N_v}{N_s}. \qquad (38)$$

In other words, the probability of a pixel belonging to the target cluster in the maximum ring is its detection probability reduced by the factor by which the number of pixels has been increased from the basic ring to the maximum ring. This is a consequence of the fact that the optimal d_p, discussed in the next section, keeps the target cluster size constant on the average regardless of the changing shape of the cluster due to the noise, i.e.,

$$N_v\, p_v = N_s\, p_s. \qquad (39)$$

Then, accounting for this asymmetry, the measurement noise variance is given by (40); this expression was used in evaluating the measurement noise variance of the target cluster in the next section.
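The circular-target expressions above are straightforward to evaluate numerically. The sketch below (ours) computes the basic-ring area (34), the coarse variance (35), and the maximum-ring width and reduced pixel probability of (36)-(38) that enter the refined expression (40); the refined formula itself is not reproduced here.

```python
import numpy as np

def circular_target_variance(r_t, d_p, p_t, p_v):
    """Coarse centroid variance (35) for a circular target of radius r_t
    inside a basic noise ring of width d_p, using the disk moment (26)."""
    num = (1 - p_t) * p_t * r_t**4 + (1 - p_v) * p_v * ((r_t + d_p)**4 - r_t**4)
    den = 4 * np.pi * (p_t * r_t**2 + p_v * ((r_t + d_p)**2 - r_t**2))**2
    return num / den

def maximum_ring(r_t, d_p, p_v):
    """Maximum-ring width (36) and reduced pixel probability (37)-(38)."""
    n_v = np.pi * ((r_t + d_p)**2 - r_t**2)      # basic-ring area, eq. (34)
    r_s = d_p * p_v * n_v                        # eq. (36)
    n_s = np.pi * ((r_t + r_s)**2 - r_t**2)      # maximum-ring area
    p_s = p_v * n_v / n_s                        # eqs. (37)-(38)
    return r_s, p_s

# Parameters of the last row of Table I (r_t = 5, p_t = 0.683, p_v = 0.097).
print(circular_target_variance(5.0, 2.21, 0.683, 0.097))
print(maximum_ring(5.0, 2.21, 0.097))
```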

III. SEGMENTATION AND CENTROID ESTIMATION

In the method presented, image segmentation is a two-step process.


TABLE I
Computed Value and Sample Average of Measurement Noise Variance for Circular Target of Different Sizes

Target size (actual) | Cluster mean (theory) | Cluster mean (simulations) | p_t | p_v | d_p* | Variance (theory) | Variance (simulations)
1256 | 501 | 491 | 0.383 | 0.048 | 3.08 | 0.26 | 0.40
1256 | 886 | 878 | 0.680 | 0.097 | 2.21 | 0.10 | 0.20
78 | 20 | 19 | 0.176 | 0.044 | 3.57 | 4.21 | 4.22
78 | 34 | 39 | 0.383 | 0.048 | 3.08 | 0.76 | 0.71
78 | 61 | 62 | 0.683 | 0.097 | 2.21 | 0.33 | 0.27

Note: Roman letter symbols here correspond to italic letter symbols in text.

First, the original gray scale image is transformed to a binary image by the hard limiter rule of (14). Then, using a clustering method, the binary image is grouped into clusters. Most of the standard clustering methods are difficult to implement in real time. Because of this, a clustering algorithm which is not as accurate but can be implemented in real time was chosen. The clustering method used in the implementation is the single linkage or nearest neighbor technique [1, 7, 8, 15]. In this technique, a pixel belongs to a cluster if it is linked to at least one other pixel of the cluster by a distance which is less than a certain proximity distance. By choosing an optimum proximity distance as shown below, fewer noise clusters are obtained.

The proximity distance, denoted by d_p, affects the size, shape, and number of the clusters obtained by clustering. For example, if d_p is larger than the size of the image, then the whole image shows up as a single cluster. On the other hand, if d_p is less than one pixel, then every single pixel with a detection becomes a separate cluster.

If p_t and p_v are the detection probabilities of the target and the noise pixels, respectively, then the average distances between neighboring pixels in the target and the noise are about $\sqrt{1/p_t} \triangleq d_t$ and $\sqrt{1/p_v} \triangleq d_v$. The optimal d_p must be between these two values. That is,

$$d_t \le d_p \le d_v. \qquad (41)$$

This guarantees that most of the target pixels will be in the same cluster and noise pixels are grouped into small clusters. With good segmentation most of the target pixels will be in the same cluster with a small number of noise pixel clusters around its edge. Such a noisy edge increases the centroid measurement variance. So in practice, it is better to use d_p close to d_t to minimally cover the gaps in the target image.

In order to find an acceptable d_p, the centroid measurement noise variance was obtained by analytical computation for a circular target with noise-corrupted edges.

TABLE II
Measurement Noise Variance for Different d_p for Five Cases Considered in Table I

Case | d_p = 2 | d_p = 3 | d_p = 4 | d_p = 5 | d_p = d_p*
1 | 28.3 | 0.48 | 0.75 | 1.01 | 0.40
2 | 0.57 | 0.55 | 0.91 | -- | 0.20
3 | 21.1 | 5.71 | 19.4 | -- | 4.22
4 | 2.09 | 0.85 | 2.50 | -- | 0.71
5 | 0.28 | 2.19 | 18.1 | -- | 0.27

Note: l_min = 50 for cases 1 and 2 and l_min = 5 for cases 3-5.
Note: Roman letter symbols here correspond to italic letter symbols in text.

The variance of the centroid measurement noise depends on the proximity distance d_p of the target and noise pixels, the intensity threshold levels (I_H and I_L), and hence the probabilities of target and noise pixel detection (p_t and p_v).

For a given circular target, the measurement noise variance and the best proximity distance d_p* from 300 Monte Carlo runs for different target and noise pixel detection probabilities (p_t and p_v) are shown in Table I. Also, when the noise is distributed around the circular target forming a bagel-like structure, the analytically calculated measurement noise variances using (40) are shown in the next-to-last column of Table I. From this table, it can be observed that the theoretical and computed measurement noise variances are close for most of the cases. The variance depends on the size of the target and the target pixel detection probability. For a large target or high p_t, the variance is lower. Therefore, by choosing appropriate threshold levels, p_t can be increased to assure that the variance is small. The variance also depends on the type of clustering algorithm used in the implementation. Also, the experimentally determined optimal proximity distance d_p* satisfies (41). In Table II, for the five cases discussed in Table I, the computed measurement noise variance is shown for different proximity distances. From the table, it can be observed that the smallest measurement noise variance is achieved when d_p is the average of d_v and d_t.


TABLE III
Threshold Levels and Detection Probabilities for Two Cases Considered in Simulations

Case | I_L | I_H | p_t | p_v
A | 90 | 110 | 0.682 | 0.097
B | 95 | 105 | 0.383 | 0.048

Note: Roman letter symbols here correspond to italic letter symbols in text.

Therefore, in practice one can use

$$d_p^* = \frac{d_t + d_v}{2} \qquad (42)$$

as the approximate optimal proximity distance which gives the minimum centroid measurement noise variance. An important observation is that for d_p = d_p* the target cluster size stays very close to its average.
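The approximate optimal proximity distance of (42) follows directly from the detection probabilities; the helper below (ours) gives, e.g., a d_p* of about 2.2 for p_t = 0.68 and p_v = 0.097, consistent with Table I.

```python
import math

def optimal_proximity_distance(p_t, p_v):
    """Eq. (42): average of the mean neighbor spacings d_t = 1/sqrt(p_t)
    and d_v = 1/sqrt(p_v) of the target and noise detections."""
    return 0.5 * (1.0 / math.sqrt(p_t) + 1.0 / math.sqrt(p_v))

print(optimal_proximity_distance(0.68, 0.097))   # about 2.2, cf. Table I
```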

IV. STATE ESTIMATION MODEL

The state equation for tracking a nearly constant velocity target in two dimensions, using the centroid measurements, is [2]

$$x(k+1) = \begin{bmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{bmatrix} x(k) + \begin{bmatrix} \tfrac{T^2}{2} & 0 \\ T & 0 \\ 0 & \tfrac{T^2}{2} \\ 0 & T \end{bmatrix} v(k) \qquad (43)$$

where the state consists of the current position and velocity in the two Cartesian coordinates, T is the sampling period, and v(k) is the zero-mean white process noise (acceleration) with variances q_x and q_y in the two Cartesian coordinates. The change in velocity over the sampling period T is of the order of $\sqrt{qT}$.

In the state space model, only the centroid position measurements in two-dimensional Cartesian coordinates were considered. The measurement equation at discrete time k is

$$z(k) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} x(k) + w(k) \qquad (44)$$

where w(k) is the centroid measurement noise and x(k) is the state vector. As shown in Table I, for a given d_p, the centroid measurement noise variance obtained from (40) is of the same order as the estimated variance. The covariance of the measurement noise is

$$R = \operatorname{cov}[w(k)] = \operatorname{diag}\left(\sigma_w^2,\ \sigma_w^2\right) \qquad (45)$$

where the measurement noise variances (same for each coordinate) are given by (40).

The measurement and process noise sequences are uncorrelated. The PDAF [2] was used to obtain the state estimates.
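A single measurement-update step of the PDAF with the measurement model (44)-(45) can be sketched as follows. This is a minimal parametric PDAF update written from the description above and [2]; the variable names, gate probability, and clutter-density parameter are our own illustrative choices.

```python
import numpy as np

def pdaf_update(x_pred, P_pred, z_list, H, R,
                P_D=0.97, P_G=0.99, clutter_density=1e-3):
    """One PDAF measurement update: instead of assigning a single cluster
    centroid to the track, all validated centroids in z_list are weighted
    by their association probabilities (parametric clutter model)."""
    S = H @ P_pred @ H.T + R                      # innovation covariance
    S_inv = np.linalg.inv(S)
    K = P_pred @ H.T @ S_inv                      # Kalman gain
    z_hat = H @ x_pred                            # predicted measurement

    nus = [z - z_hat for z in z_list]             # innovations
    gauss_norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(S)))
    likes = np.array([P_D * gauss_norm * np.exp(-0.5 * nu @ S_inv @ nu)
                      / clutter_density for nu in nus])

    b0 = 1.0 - P_D * P_G                          # "none is the target" weight
    denom = b0 + likes.sum()
    betas = likes / denom
    beta0 = b0 / denom

    nu_comb = sum(b * nu for b, nu in zip(betas, nus)) if nus else np.zeros(2)
    x_upd = x_pred + K @ nu_comb

    P_std = P_pred - K @ S @ K.T                  # standard Kalman update term
    spread = sum(b * np.outer(nu, nu) for b, nu in zip(betas, nus)) \
             - np.outer(nu_comb, nu_comb)         # spread-of-innovations term
    P_upd = beta0 * P_pred + (1.0 - beta0) * P_std + K @ spread @ K.T
    return x_upd, P_upd

# Measurement model of (44)-(45): position-only measurements, equal variances.
H = np.array([[1., 0., 0., 0.],
              [0., 0., 1., 0.]])
R = 0.33 * np.eye(2)      # e.g. the case A measurement noise variance of Table I
```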

V. PRECISION TRACKING OF SYNTHETIC TARGET IMAGE

A. Modeling of Image

As described in Section III, the segmentation technique divides the image into three layers of gray level intensities. The target is assumed to be in one of the layers. A two-dimensional array of 64 x 64 pixels is considered for the image. The target, a two-dimensional array of m = n_x x n_y pixels, is modeled as a white Gaussian random field with mean μ_t and variance σ_t². The background noise is also modeled as a white Gaussian random field with moments μ_v and σ_v². Using the upper and lower limits of a layer as threshold limits (I_L and I_H), the image is converted to a binary image and grouped into clusters by the nearest neighbor technique using the optimal proximity distance d_p*. All the clusters of size less than l_min are ignored. The value of l_min is chosen depending on the expected target size. The centroid of each cluster was calculated and used for processing by the PDAF. For a 1σ threshold (I_L = μ_t - σ_t, I_H = μ_t + σ_t) there are about 15 clusters in the image. But within the validation gate of the PDAF, a maximum of 3 clusters appear. The PDAF probabilistically weights the different measurements (centroids) in the validation gate [2]. The sampling (scan) period was T = 1 and target motion was modeled by the second-order kinematic model (43). The gray scale image with noise for a single scan is shown in Fig. 1 for two target sizes. The binary images are shown in Fig. 2.
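A minimal sketch of the synthetic image model described above, assuming NumPy and the intensity statistics used in the simulations (target N(100, 10²), background N(50, 50²)); the function name and placement are illustrative only.

```python
import numpy as np

def synthetic_frame(center, radius=5, size=64,
                    target_stats=(100.0, 10.0), noise_stats=(50.0, 50.0),
                    rng=None):
    """One size x size gray-level frame: a white Gaussian background with a
    circular white Gaussian target of the given radius at `center` (row, col)."""
    rng = np.random.default_rng() if rng is None else rng
    mu_v, sigma_v = noise_stats
    mu_t, sigma_t = target_stats
    frame = rng.normal(mu_v, sigma_v, size=(size, size))
    rows, cols = np.ogrid[:size, :size]
    mask = (rows - center[0])**2 + (cols - center[1])**2 <= radius**2
    frame[mask] = rng.normal(mu_t, sigma_t, size=int(mask.sum()))
    return frame

# A target of about 78 pixels (radius 5); the placement here is illustrative.
frame0 = synthetic_frame(center=(15, 11))
```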

B. Numerical Results

The results are presented for two cases of target pixel detection probabilities and the state estimation was done for a period of 20 scans. The initial state vector of the target in the image (pixel) frame was

$$x(0) = [11.0\ \ 1.5\ \ 15.0\ \ 1.0]'. \qquad (46)$$

The initial target position was near the bottom of the image and moderate constant velocities were considered (1.5 and 1.0 pixel/scan in the x and y coordinates, respectively) so that the target stays in the same frame for 20 scans. The target and noise parameters used in the simulations are given in (47) and (48) and the threshold levels and detection probabilities were as in Table III.

Target pixel intensity: N(100, 10²)  (47)

Noise pixel intensity: N(50, 50²).  (48)


Fig. 1. Gray scale image with noise for single scan. (a) Circular target of diameter 40. (b) Circular target of diameter 10.

Fig. 2. Binary image for single scan. (a) Circular target of diameter 40. (b) Circular target of diameter 10.

A circular target of 78 pixels (radius of 5 pixels, shown in Fig. 1(b)) in an image of size 64 x 64 was considered. Expected target sizes for cases A and B were 61 and 34, respectively (last two rows of Table I). The minimum cluster size l_min for both cases was chosen as 10. As given in Table I and (42), the proximity distances for case A and case B were 2 and 3, respectively.

The PDAF was used from the interactive software [4] with the following parameters. The measurement noise had a variance of 0.33 and 0.76 (cf. (40) and Table I) for cases A and B, respectively. The target detection probability was taken as 0.97. A low-level process noise q_x = q_y = 10⁻⁴ was assumed by the filter.

Table IV summarizes the results of 100 Monte Carlo runs in terms of achieved accuracies for the problem considered. The results are given only for the x coordinate but the normalized state estimation error squared (NEES) pertains to the entire four-dimensional state. From the last columns of Table IV, it is clear that the NEES is within the 95% probability region (based on the chi-square distribution [2]), indicating that the filter is consistent, i.e., the calculated variances match the actual errors.

TABLE IV
Position and Velocity Variances (in Steady State) for Each Coordinate and Filter Consistency Verification From 100 Runs

Case | Position variance | Velocity variance | Average NEES | 95% region for NEES
A | 0.0531 | 0.0010 | 4.10 | 3.5-4.5
B | 0.1357 | 0.0016 | 3.79 | 3.5-4.5
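The NEES consistency check quoted in Table IV can be reproduced in a few lines: the run-averaged NEES is compared with a central chi-square interval whose degrees of freedom are the number of runs times the state dimension, scaled by the number of runs. A minimal sketch, assuming SciPy:

```python
import numpy as np
from scipy.stats import chi2

def average_nees(errors, covariances):
    """Run-averaged NEES: errors is an (M, n_x) array of state estimation
    errors, covariances an (M, n_x, n_x) array of filter covariances."""
    eps = [e @ np.linalg.inv(P) @ e for e, P in zip(errors, covariances)]
    return float(np.mean(eps))

def nees_bounds(n_runs, n_x=4, prob=0.95):
    """Two-sided acceptance region for the run-averaged NEES (cf. [2])."""
    dof = n_runs * n_x
    lo = chi2.ppf((1.0 - prob) / 2.0, dof) / n_runs
    hi = chi2.ppf(1.0 - (1.0 - prob) / 2.0, dof) / n_runs
    return lo, hi

print(nees_bounds(100))   # roughly (3.5, 4.5) for 100 runs and a 4-state model
```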

The results show that it is possible to achieve subpixel accuracy in the range of 0.25 pixels rms error for case A and 0.37 pixels rms error for case B. The threshold settings (I_L, I_H) in case A clearly yield superior results compared to case B for the problem considered. These results show that one can obtain very good subpixel accuracy even though the target is very small (78 of 4096 pixels) and is not fully visible most of the time (Fig. 1). The subpixel accuracy would further improve for a larger target.


Fig. 3. Real images (frames 2 and 10) with clusters. (a) and (b) Gray scale images. (c) and (d) Binary images. Target (clusters) indicated by solid contour.

VI. PRECISION TRACKING OF A REAL TARGET IMAGE

To demonstrate the usefulness of the proposed method for practical applications, it was used to track a moving car in a sequence of real images. The real images were obtained by an FPA-based platinum-silicide camera (resolution 256 x 256) in infrared light (3-5 μm wavelength). The 256 x 256 images were reduced to 64 x 64. The latter were used for segmentation and centroid calculation and these centroid measurements were then passed on to the tracking filter. The images show two cars moving in opposite directions crossing each other during daytime. The car moving from left to right (with an unknown velocity) was chosen for tracking. Two of the ten

frames are shown in Fig. 3 with the target clusters indicated.

The target parameters were estimated by considering a single frame of the image as follows. The target size was found to be about 20 pixels with intensity mean and variance of about 150 and 90, respectively. The noise statistics were found from the entire image to be of mean and variance 120 and 100, respectively. Using the segmentation technique the image at each scan was divided into three layers of gray level intensities around the target intensity (150). The gray scale image was binarized using ±1σ threshold levels (140 to 160). These values lead to target and noise pixel detection probabilities of 0.7 and 0.03, respectively. With the target size of 20 pixels, the equivalent radius is 2.5 pixels. The image parameters are summarized in Table V.


TABLE V
Image Parameters Used in Real Target Tracking Example

μ_t | σ_t | μ_v | σ_v | I_L | I_H | p_t | p_v | d_p* | var(x̄_n)
150 | 9.5 | 120 | 10 | 140 | 160 | 0.70 | 0.03 | 3.53 | 0.46

Note: Roman letter symbols here correspond to italic letter symbols in text.

Fig. 4. Estimated centroid position along with ±1σ accuracy.

Fig. 5. Estimated centroid velocity along with ±1σ accuracy.

With the above parameters, using (40), we get the single-frame-based centroid measurement noise variance as 0.46 and the optimal proximity distance as (rounded to) 4. As the target size was small, the minimum cluster size l_min was chosen as 1.

On the average there are about 10 clusters in each scan and a maximum of 3 clusters in the validation gate of the PDAF. Using the PDAF, the estimated values of the centroid position and velocity of the target, along with the estimated accuracies, are given in Figs. 4 and 5. The filter-calculated rms estimation errors were 0.4 pixel (variance 0.16; this is a reduction by a factor of almost 3 from the raw measurements) for each coordinate in position and 0.09 pixel/frame in velocity. The position accuracy amounts to about one-tenth of the target size. Since the ground truth is not known, in order to check the consistency of the filter (validity of the filter-calculated estimation accuracy), the 10-sample time-averaged normalized innovation squared (NIS) [2] was calculated and was found to be 1.16, which is within the chi-square limits [0.96, 3.14] based on χ²₂₀/10. Thus the filter-calculated accuracies shown in Figs. 4 and 5 are remarkably reliable.
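The time-averaged NIS test follows the same pattern as the NEES check, with degrees of freedom equal to the number of frames times the measurement dimension, scaled by the number of frames. A minimal sketch, assuming SciPy:

```python
import numpy as np
from scipy.stats import chi2

def time_averaged_nis(innovations, S_list):
    """K-sample time-averaged NIS from the filter innovations and their
    covariances (no ground truth needed)."""
    nis = [nu @ np.linalg.inv(S) @ nu for nu, S in zip(innovations, S_list)]
    return float(np.mean(nis))

def nis_bounds(n_frames=10, n_z=2, prob=0.95):
    """Two-sided acceptance region for the time-averaged NIS (cf. [2])."""
    dof = n_frames * n_z
    lo = chi2.ppf((1.0 - prob) / 2.0, dof) / n_frames
    hi = chi2.ppf(1.0 - (1.0 - prob) / 2.0, dof) / n_frames
    return lo, hi

print(nis_bounds())   # about (0.96, 3.4) with 10 frames and 2-D measurements
```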

These results show that, in spite of the non-white and non-Gaussian noise in the image, the method was successful for this real image.

VII. CONCLUSIONS

A method for precision tracking of targets based on data obtained from imaging sensors was presented. A closed-form expression for the single-frame-based target centroid measurement noise variance was presented in terms of the target image and noise characteristics and verified through simulations. This "measurement noise variance" is an important tracking filter design parameter. Using image segmentation techniques, the gray scale image is thresholded and the resulting binary image is reduced to clusters by the nearest neighbor criterion. An expression for the optimal proximity distance needed for image segmentation was given. The association of the clusters to the track was done by exploiting both the motion and pattern (object) recognition characteristics of the target. Using the centroid measurements of the clusters, the PDAF was employed for state estimation.

The simulation results presented validate the performance predictions of the proposed method. The results for a target of about 78 pixels in a 64 x 64 frame show that it is possible to achieve subpixel tracking accuracy in the range of 0.25 pixels rms error. The usefulness of the proposed method was demonstrated by also considering a sequence of real target images, where for a 20 pixel target, the tracking accuracy obtained was 0.4 pixel rms for each position coordinate and 0.09 pixel/frame for velocity.

The theory presented can be extended to multiple target tracking. As explained in the real image example, the target parameters can be estimated by considering a single frame of the image. If the targets fall in different layers of gray level intensities, then by running separate filters for each of the layers, the images can be tracked. If the targets fall in the same intensity layer and their images are crossing, then the JPDAMCF [13] has to be employed for tracking.

ACKNOWLEDGMENTS

The authors would like to thank Magda Lachish, Dany Gros, and Marco Zalik for providing the real image data.


REFERENCES

[1] Anderberg, M. R. (1973) Cluster Analysis for Applications. New York: Academic Press, 1973.
[2] Bar-Shalom, Y., and Fortmann, T. E. (1988) Tracking and Data Association. New York: Academic Press, 1988.
[3] Bar-Shalom, Y., Shertukde, H. M., and Pattipati, K. R. (1989) Use of measurements from an imaging sensor for precision target tracking. IEEE Transactions on Aerospace and Electronic Systems, AES-25 (Nov. 1989), 863-872.
[4] Bar-Shalom, Y. (1991) MULTIDAT-Multiple Model Data Association Tracker 4.0. Interactive Software, 1991.
[5] Broida, T. J., and Chellappa, R. (1986) Estimation of object motion parameters from noisy images. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8 (Jan. 1986), 90-99.
[6] Fortmann, T. E., Bar-Shalom, Y., and Scheffe, M. (1983) Sonar tracking of multiple targets using joint probabilistic data association. IEEE Journal of Oceanic Engineering, OE-8 (July 1983), 173-184.
[7] Jain, A. K., and Dubes, R. C. (1988) Algorithms for Clustering Data. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[8] Krishnaiah, P. R., and Kanal, L. N. (1982) Handbook of Statistics 2. Amsterdam: North-Holland, 1982.
[9] Maybeck, P. S., and Mercier, D. E. (1980) A target tracker using spatially distributed infrared measurements. IEEE Transactions on Automatic Control, AC-25 (Apr. 1980), 222-225.
[10] Maybeck, P. S., Jensen, R. L., and Harnly, D. A. (1981) An adaptive extended Kalman filter for target image tracking. IEEE Transactions on Aerospace and Electronic Systems, AES-17 (Mar. 1981), 173-180.
[11] Maybeck, P. S., and Suizu, R. I. (1985) Adaptive tracker field-of-view variation via multiple model filtering. IEEE Transactions on Aerospace and Electronic Systems, AES-21 (July 1985), 529-539.
[12] Papoulis, A. (1991) Probability, Random Variables and Stochastic Processes. New York: McGraw-Hill, 1991.
[13] Shertukde, H. M., and Bar-Shalom, Y. (1991) Tracking of crossing targets with imaging sensors. IEEE Transactions on Aerospace and Electronic Systems, AES-27 (July 1991), 582-592.
[14] Tobin, D. M., and Maybeck, P. S. (1988) Enhancements to a multiple model adaptive estimator/image tracker. IEEE Transactions on Aerospace and Electronic Systems, AES-24 (July 1988), 417-426.
[15] Zupan, J. (1982) Clustering of Large Data Sets. Research Studies Press, 1982.

Eliezer Oron was born in Tel-Aviv, Israel, in 1945. He received the B.Sc. and M.Sc. degrees in physics from Tel-Aviv University, Israel, in 1972 and 1974, respectively, and in 1979 he received the Ph.D. degree in physics from Ben-Gurion University, Israel.

He has served in the Israel Defense Forces as a pilot. Since 1982 he has worked as a scientist for the Israel Aircraft Industry in the areas of aerodynamics, thermodynamics, electro-optics, and image processing. During Spring 1991 he was a Visiting Scholar at the University of Connecticut, Storrs, working on image processing and tracking.


Anil Kumar was born in Hyderabad, India, in 1964. He received the B.S. degree in electrical engineering from Osmania University, Hyderabad and the M.S. degree in electrical engineering from the Indian Institute of Science, Bangalore, in 1985 and 1988, respectively. Presently he is working toward his Ph.D. degree at the University of Connecticut, Storrs.

During 1988-1989 he worked as a scientist at the Research and Training Unit for Navigational Electronics, Hyderabad. His research interests include array processing, spectral estimation, and image processing.

Yaakov Bar-Shalom (S'63-M'66-SM'80-F'84) was born on May 11, 1941. He received the B.S. and M.S. degrees from the Technion, Israel Institute of Technology, in 1963 and 1967, and the Ph.D. degree from Princeton University, Princeton, NJ, in 1970, all in electrical engineering.

From 1970 to 1976 he was with Systems Control, Inc., Palo Alto, CA. Currently he is Professor of Electrical and Systems Engineering at the University of Connecticut, Storrs. His research interests are in estimation theory and stochastic adaptive control.

Dr. Bar-Shalom coauthored the monograph Tracking and Data Association (Academic Press, 1988) and edited the books Multitarget-Multisensor Tracking: Applications and Advances (Artech House, Vol. I 1990; Vol. II 1992). He has consulted to numerous companies, and originated the series of Multitarget-Multisensor Tracking short courses offered via UCLA Extension, at Government Laboratories, private companies, and overseas. He also developed the commercially available interactive software packages MULTIDAT™ for automatic track formation and tracking of maneuvering or splitting targets in clutter, PASSDAT™ for data association from multiple passive sensors, BEARDAT™ for target localization from bearing and frequency measurements in clutter, and IMDAT™ for image data association for tracking. During 1976 and 1977 he served as Associate Editor of the IEEE Transactions on Automatic Control and from 1978 to 1981 as Associate Editor of Automatica. He was Program Chairman of the 1982 American Control Conference, General Chairman of the 1985 ACC, and Co-Chairman of the 1989 IEEE International Conference on Control and Applications. During 1983-1987 he served as Chairman of the Conference Activities Board of the IEEE Control Systems Society and during 1987-1989 was a member of the Board of Governors of the IEEE CSS. In 1987 he received the IEEE CSS Distinguished Member Award. He has been elected Fellow of IEEE for "contributions to the theory of stochastic systems and of multitarget tracking".
