
Fused EO/IR Detection & Tracking of Surface Targets: Flight Demonstrations

Allen Waxman, David Fay, Paul Ilardi, Pablo Arambel, Xinzhuo Shen, John Krant, Timothy Moore (a) Brian Gorin, Scott Tilden, Bruce Baron, James Lobowiecki, Robert Jelavic, Cliff Verbiar (b)

John Antoniades, Mark Baumback, Daniel Hass, Jonathan Edwards, Samuel Henderson, Dave Chester (c)

BAE Systems

(a) Burlington, MA; (b) Greenlawn, NY; (c) Columbia, MD; USA [email protected] [email protected]

12th International Conference on Information Fusion, Seattle, WA, USA, July 6-9, 2009. 978-0-9824438-0-4 ©2009 ISIF.

Abstract - At Fusion 2007 we presented real-time methods for fused video imaging, detection, and tracking of multiple ground targets, and showed results of tower-based ground tests. This paper summarizes the use of these methods in proof-of-concept flight tests, undertaken as a collaborative effort among several business units of BAE Systems North America. Utilizing a Sonoma commercial RGB/MWIR 4-band turret mounted on a King Air 90, we integrated our existing image co/geo-registration methods with the image fusion, target learning & detection, multi-target tracking, and sensor/turret resource management methods reported previously, implemented them on real-time commercial processors, and conducted two test flights over water and land. We illustrate results of these flight tests that demonstrate proof-of-concept capabilities, showing examples of air-to-surface detection and tracking of multiple vessels over an extended field of regard, ground vehicles (both stationary and moving), and multiple dismounts walking on a beach. The approach does not require stabilization of the background in order to detect moving targets, because detection is entirely spectral-based. Targets can be detected as spectral anomalies with respect to an adaptive local background, and then learned on-the-fly as a target class. Alternatively, targets can be designated by an operator and learned in real-time. This approach applies to both day & night fused video imaging over land & water, and can be extended to include other imaging sensors such as SWIR & LWIR, and coupled with hyperspectral fingerprinting for feature-aided tracking.

Keywords: Video tracking, multi-target tracking, image fusion, target detection, neural networks, sensor resource management

1 Introduction

Building on extensive prior work in real-time image fusion based on opponent-color neural models of vision [1, 2, 3], target learning & detection based on Adaptive Resonance neural models of pattern learning [4, 5, 6], multi-hypothesis multi-target tracking methods [7, 8], and sensor/turret resource management (i.e., automatic control) for video tracking of surface targets [1, 9], BAE Systems undertook to develop an integrated proof-of-concept demonstration of air-to-ground fused video tracking. Unlike other approaches to ground target tracking, which require stabilization of the background in order to detect targets in motion ([9] describes one such approach), fused multisensor imaging enables spectral-based detection of both moving and stationary targets: in the open and when partially obscured (where shape is not available for matching), on the ground as well as at sea (where background texture is not available for stabilization), and in daylight as well as at night. Fused imagery with only four bands is sufficient to enable background normalcy learning, so as to detect anomalies as potential targets and then learn those targets in order to track them over an extensive field of view. In the maritime case, it is possible to exploit fused imagery to detect small boats in high-glint conditions, track and reacquire boats beneath clouds, and generally detect targets at low spatial resolution, thereby providing wider-area coverage. In the following sections we review some key prior work in fused imaging and on-the-fly target learning, and then show examples from flight tests conducted during daylight off the coast of Florida.

2 Image Fusion

We have described our opponent-color neural architecture for color image fusion in many previous publications [1-4]. This image fusion architecture is shown in Figure 1 for the 4-band RGB/MWIR sensor suite available in our turret. It combines gray-scale brightness fusion with opponent-sensor contrast mapped to human opponent-colors, and was implemented in real-time on inexpensive DSP boards [4]. Typically, one uses this approach to fuse images containing complementary information, e.g., reflected light in day or night (VNIR or SWIR) with emitted light (MWIR or LWIR) from the scene. The same architecture can be used to fuse two, three, or four imaging sensors, as shown in Figure 2. As the spectral diversity of the input imagery increases, the quality of the resulting fused image improves. This same spectral diversity aids in the learning and detection of targets.



Figure 1: The opponent-color fusion network uses center-surround shunting dynamics to support local adaptive contrast enhancement, and two stages of opponent-color contrast that are mapped to the YIQ human opponent-color space.
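For concreteness, a minimal two-band sketch of this pipeline is given below: within-band shunting contrast enhancement, a single pair of opponent-sensor contrasts, and a YIQ-to-RGB display mapping. The Gaussian scales, shunting constants, and equal-weight brightness average are illustrative assumptions, not the parameters of the fielded DSP implementation, which also includes the second opponent stage of Figure 1.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def shunt(center, surround, A=1.0, B=1.0, D=1.0):
    # Steady state of Grossberg shunting center-surround dynamics:
    #   x* = (B*center - D*surround) / (A + center + surround)
    return (B * center - D * surround) / (A + center + surround)

def enhance(band, sigma_c=1.0, sigma_s=5.0):
    # Within-band adaptive contrast: small-scale center vs. large-scale surround,
    # then rescale to [0, 1] so later stages receive non-negative input.
    e = shunt(gaussian_filter(band, sigma_c), gaussian_filter(band, sigma_s))
    return (e - e.min()) / (np.ptp(e) + 1e-6)

def fuse_vis_ir(vis, ir):
    # vis, ir: co-registered HxW float arrays in [0, 1].
    # Fused brightness drives the Y channel; two opponent-sensor contrasts drive I, Q.
    vis_e, ir_e = enhance(vis), enhance(ir)
    y = 0.5 * (vis_e + ir_e)
    i = shunt(gaussian_filter(ir_e, 1.0), gaussian_filter(vis_e, 5.0))   # IR-center vs. VIS-surround
    q = shunt(gaussian_filter(vis_e, 1.0), gaussian_filter(ir_e, 5.0))   # VIS-center vs. IR-surround
    # YIQ -> RGB (NTSC conversion matrix) for display
    r = y + 0.956 * i + 0.621 * q
    g = y - 0.272 * i - 0.647 * q
    b = y - 1.106 * i + 1.703 * q
    return np.clip(np.stack([r, g, b], axis=-1), 0.0, 1.0)
```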

[Figure 2 image panels: InGaAs SWIR (0.9-1.7 μm); Fused SWIR/LWIR; Fused VNIR/SWIR/LWIR; Fused VNIR/SWIR/MWIR/LWIR]

Figure 2: The architecture of Figure 1 is used to fuse 2, 3, or 4 sensors, including the VNIR/SWIR/MWIR/LWIR combination shown here, as collected under quarter-moon conditions (27 mLux). Image quality and interpretability improve, with increased color contrast, as more complementary spectral bands are fused.

Dismounts can also be enhanced in fused imagery, enabling improved visualization, detection and tracking. Figure 3 illustrates fused imaging of dismounts at night with two alternative pairs of complementary sensors.

Figure 3: Dual-sensor fusion of dismounts, comparing VNIR/LWIR and SWIR/LWIR under an overcast full moon.

3 Target Detection in Fused Video

In order to track multiple targets, it is first necessary to detect the targets reliably in the fused imagery. In a tracking context, the temporal stability of real targets also helps to reduce false alarms at the tracker output. To detect targets in fused imagery, we utilize the opponent-color contrasts created in the fusion process itself, together with the multisensor inputs, to form a feature vector at each pixel, as shown in Figure 4 [4].

[Figure 4 diagram: per-pixel inputs (Visible, SWIR, MWIR, LWIR), opponent-band contrasts, color-fused features, contours/textures, 3D context, ... stacked into a feature vector]

Figure 4: Multi-modality and cross-band contrast image layers contribute to a feature vector at each pixel. The same applies to use of RGB/MWIR 4-band imagery.
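A minimal sketch of assembling such a feature cube is shown below, reusing the shunt() helper from the fusion sketch above; the particular layer set (raw bands plus both signed pairwise opponent contrasts) is an illustrative subset of the layers in Figure 4.

```python
from itertools import combinations
import numpy as np

def feature_stack(bands):
    # bands: list of co-registered HxW float arrays (e.g., R, G, B, MWIR).
    # Stack the raw bands together with both cross-band opponent contrasts.
    layers = list(bands)
    for a, b in combinations(range(len(bands)), 2):
        layers.append(shunt(bands[a], bands[b]))  # band-a center vs. band-b surround
        layers.append(shunt(bands[b], bands[a]))  # and the reverse pairing
    return np.stack(layers, axis=-1)              # HxWxF: one feature vector per pixel
```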

This feature vector is then input to an adaptive classifier (shown in Figure 5) based on Adaptive Resonance Theory (ART) [5, 6], which we have modified for salient-feature subspace discovery and real-time implementation.

[Figure 5 diagram: input feature vector with complement coding feeds layers F1-F3 through bottom-up and top-down adaptive filters; unsupervised clustering forms categories; a category-to-class learned mapping outputs the recognized class (Target / Non-Target); separate vigilances ρT and ρNT drive reset and match tracking]

Figure 5: A modified Fuzzy ARTMAP neural classifier is used to learn category & salient-feature representations of targets and normalcy models of backgrounds.

In this form, the classifier learns fused representations of targets by example and counter-example, as designated by an operator. Alternatively, it can learn backgrounds through random sampling or operator designation, and then detect fused spectral anomalies as potential targets, which can themselves be learned as categories.

Typically, with four bands like RGB/MWIR, the classifier learns a generic target detector, such as a man-detector, boat-detector, or vehicle-detector, as opposed to a specific detector that fingerprints a particular target among many similar targets. Clearly, this kind of target detector is not based on target shape, resolution, or motion. It is, essentially, a spectral-based target detector that exploits cross-band spectral contrast. This is illustrated in Figures 6 & 7, taken from our prior work [1, 4].
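The sketch below follows the simplified Fuzzy ARTMAP of Kasuba [6]: complement coding, the fuzzy choice function, a vigilance test, fast learning, and match tracking. The salient-feature-subspace discovery and separate target/non-target vigilances (ρT, ρNT) of Figure 5 are omitted, and the parameter defaults are illustrative.

```python
import numpy as np

class SimplifiedFuzzyARTMAP:
    def __init__(self, rho=0.75, alpha=0.001, beta=1.0):
        self.rho, self.alpha, self.beta = rho, alpha, beta
        self.w, self.cls = [], []               # category templates and class labels

    @staticmethod
    def _code(x):
        x = np.asarray(x, float)                # features assumed normalized to [0, 1]
        return np.concatenate([x, 1.0 - x])     # complement coding

    def train(self, x, label):
        I = self._code(x)
        rho = self.rho                          # vigilance; raised by match tracking
        while True:
            best, best_T = None, -1.0
            for j, w in enumerate(self.w):
                match = np.minimum(I, w).sum() / I.sum()
                if match >= rho:                                # vigilance test
                    T = np.minimum(I, w).sum() / (self.alpha + w.sum())  # choice
                    if T > best_T:
                        best, best_T = j, T
            if best is None:                    # nothing fits: commit a new category
                self.w.append(I.copy()); self.cls.append(label)
                return
            if self.cls[best] == label:         # prediction agrees: fast learning
                self.w[best] = (self.beta * np.minimum(I, self.w[best])
                                + (1.0 - self.beta) * self.w[best])
                return
            # match tracking: raise vigilance just above the offending match and reset
            rho = np.minimum(I, self.w[best]).sum() / I.sum() + 1e-6

    def predict(self, x):
        I = self._code(x)
        T = [np.minimum(I, w).sum() / (self.alpha + w.sum()) for w in self.w]
        return self.cls[int(np.argmax(T))] if T else None
```

Trained on operator-designated target pixels and sampled background pixels, such a network plays the target/non-target role described above; trained on background alone, inputs whose best match falls below vigilance can be flagged as spectral anomalies.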

[Figure 6 image panels: VNIR, MWIR, LWIR, and Fused views of kayaks, with detections]

Figure 6: The same neural architecture shown in Figure 1 is used to fuse three sensors, VNIR/MWIR/LWIR here. The various opponent-color outputs are used as inputs to the network of Figure 5 to train a target detector for the spatially unresolved small rubber boats (kayaks).

Key advantages of fused multisensor/spectral target detection are independence from both target shape and target motion. Independence from target shape enables detection of small unresolved targets, hence supporting a wider field-of-view. When tracking targets from a moving platform, there is no need to stabilize the background in order to detect movers while coping with false alarms due to imperfect background stabilization. Stationary targets are detected just as easily. However, purely spectral-based target detection has its shortcomings when other scenic elements have similar spectral content in the limited sensing bands available. As with the human visual system, the most robust target detectors will likely utilize a combination of the color, shape, and motion pathways. Nonetheless, we have been quite successful in detecting a large variety of target types against different backgrounds. Because our target detector is based on continuous learning in an ART network (Fig. 5), it supports adaptation of the representation while tracking the target. This helps tune the detector on-the-fly as the target moves through extended operating conditions and among confusers.
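As a simplified stand-in for the learned background-normalcy model, the sketch below flags spectral outliers with a Reed-Xiaoli (RX) style Mahalanobis test over the feature cube. It uses global background statistics for brevity, whereas the adaptive local background described above would recompute them over a sliding window.

```python
import numpy as np

def rx_anomaly(cube, threshold):
    # cube: HxWxF feature cube (e.g., from feature_stack above).
    # threshold: e.g., a chi-square tail value for F degrees of freedom.
    H, W, F = cube.shape
    X = cube.reshape(-1, F).astype(float)
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(F)          # regularized covariance
    d = X - mu
    m2 = np.einsum('ij,jk,ik->i', d, np.linalg.inv(cov), d)   # squared Mahalanobis
    return (m2 > threshold).reshape(H, W)                     # anomaly mask
```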

[Figure 7 image panels: I2CCD, LWIR, and Fused views]

Figure 7: Dual-sensor video is fused in real-time. One man-target is designated by an operator as a target of interest, and the classifier rapidly learns a detector that then detects all men in the scene, running in real-time on compact commercial hardware. These detections then serve as the input to a multi-target tracker.

4 Multi-Hypothesis Tracking & Sensor Resource Management

Modern target trackers utilize a combination of Kalman filtering, interacting motion models, and multi-hypothesis tracking (MHT) association strategies [7, 8]. This enables the tracker to deal with multiple targets undergoing maneuvers and coasting through temporary occlusions. Though commonly associated with radar tracking, similar methods have been applied to air-to-ground video tracking [9]. The same association strategy can be extended to track-to-track fusion as well as detection-to-track fusion [8], providing a means to support track stitching utilizing fingerprints or other information.
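A minimal constant-velocity Kalman track with chi-square gating is sketched below as the filtering core that MHT hypothesis management sits on top of; it is a single-hypothesis stand-in for the machinery of [7, 8], and the noise levels and gate size are illustrative.

```python
import numpy as np

class CVTrack:
    def __init__(self, z0, dt=1.0, q=1.0, r=4.0):
        # Constant-velocity model: state [x, y, vx, vy], position measurements.
        self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                           [0, 0, 1, 0], [0, 0, 0, 1]], float)
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
        self.Q = q * np.eye(4)                         # process noise (assumed)
        self.R = r * np.eye(2)                         # measurement noise (assumed)
        self.x = np.array([z0[0], z0[1], 0.0, 0.0])
        self.P = 10.0 * np.eye(4)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q

    def gate(self, z, gamma=9.21):
        # Chi-square gate, 2 dof, ~99%: accept detections near the prediction.
        v = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        return float(v @ np.linalg.solve(S, v)) <= gamma

    def update(self, z):
        v = z - self.H @ self.x                        # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
        self.x = self.x + K @ v
        self.P = (np.eye(4) - K @ self.H) @ self.P
```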



In order to track targets over an extended field-of-regard that exceeds the field-of-view of the sensors, a dynamic sensor control strategy is needed that supports both wide-area search and multi-target tracking. The combined control of sensor pointing (e.g., a turret or pan-tilt unit), field-of-view or zoom, and operating mode (e.g., frame rate, resolution, or possibly integration time) is known as sensor resource management (SRM). In support of the DARPA VIVID video tracking program, we developed SRM methods based on approximate stochastic dynamic programming [9] that have been adapted to the multisensor fused imaging approach. The method utilizes a cost function that trades off the number of targets imaged in a field-of-view, the size of the field-of-view, and the time between target revisits, over a planning horizon that includes not only the next view but following views as well. We illustrated this in previous publications [1, 9].
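A greedy one-step approximation of such an SRM cost function is sketched below; the coverage, staleness, and field-of-view weights are illustrative assumptions, and the multi-view planning-horizon lookahead of [9] is omitted.

```python
import numpy as np

def score_view(center, half_width, targets, last_seen, now,
               w_cover=1.0, w_stale=0.5, w_fov=0.1):
    # center: candidate view center (lat, lon); half_width: half the FOV footprint
    # in the same units; targets: list of (lat, lon); last_seen[i]: time target i
    # was last updated.
    covered = [i for i, t in enumerate(targets)
               if np.all(np.abs(np.asarray(t) - np.asarray(center)) <= half_width)]
    staleness = sum(now - last_seen[i] for i in covered)
    return (w_cover * len(covered)      # reward targets brought into view
            + w_stale * staleness       # reward revisiting long-unseen tracks
            - w_fov * half_width)       # penalize wide FOV (lower resolution)

def next_view(candidate_views, targets, last_seen, now):
    # candidate_views: list of (center, half_width) pairs; pick the highest scorer.
    return max(candidate_views,
               key=lambda v: score_view(v[0], v[1], targets, last_seen, now))
```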

5 Flight Testing & Examples

The flight test components are illustrated in Figure 8. Ground control points (GCPs) were established at key coastline features to test geo-pointing and line-of-sight (LOS) stability, required for the flight-based implementation of MHT and SRM (which operate in lat/lon coordinates). Likewise, the real-time co- and geo-registration were tested to support the image fusion and target tracking capabilities. Details of the tested capabilities are described in reference [10].
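The geometric core of geo-pointing can be sketched as a flat-earth intersection of the camera line of sight with the ground plane, as below; a fielded system must additionally handle terrain height, attitude and boresight errors, and lens distortion, which are the errors the GCP tests are designed to expose.

```python
import numpy as np

def los_ground_intersect(alt_m, R_cam_to_ned, pixel_dir_cam):
    # alt_m: aircraft height above ground; R_cam_to_ned: 3x3 rotation from camera
    # frame to local North-East-Down; pixel_dir_cam: unit LOS vector for a pixel.
    # Returns (north_m, east_m) ground offset from the nadir point, or None if
    # the LOS does not intersect the ground plane.
    d = R_cam_to_ned @ np.asarray(pixel_dir_cam, float)
    if d[2] <= 0:                       # not pointing downward in NED
        return None
    s = alt_m / d[2]                    # scale factor to reach the ground plane
    return s * d[0], s * d[1]           # convert to lat/lon via meters-per-degree
```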

Figure 8. Flight tests conducted near coastal Florida used a Sonoma 21” RGB/MWIR turret mounted on a King Air.

Figures 9 & 10 illustrate air-to-surface detection and tracking of multiple vessels of varying types near the coast, in open waters, and through modest cloud cover. All targets are detected using the same learned target detection “agent”, trained on-the-fly with the aid of an on-board operator designating a single target. A 20-ft vessel covered with a blue tarp is detected without false alarms from a range in excess of 2 nautical miles, and tracked through various sea states with significant solar glint.

Figure 9. Air-to-surface detection and tracking of vessels from a range of 3-5 nautical miles, at an altitude of 10 kft and an aircraft speed of approximately 150 knots. Each target detection is designated by a red cross, with no false alarms on the wakes, coast, or bright clouds.

Figure 10. A vessel covered with a blue tarp is detected and tracked among glinting waves without false alarms.

The onboard (and/or ground station) operator views a situational awareness display, as shown in Figure 11. The display provides imagery with detections overlaid, a map with target detections & tracks, and a list of detected targets with their lat/lon coordinates. This data is transmitted over a narrow-bandwidth comms link.

Figure 11. Real-time operator display includes imagery with target detections overlaid, a map with target detections & tracks, and a target coordinate list.

As the aircraft turret was pointed towards land, the operator designated a red car as a target of interest, with vehicles of other colors as counter-examples. The learned detector was then able to detect moving & stationary red cars, including a red car partially occluded by a tree at a busy intersection, as shown in Figure 12 (top). It is also simple to use the MWIR sensor alone to detect and track multiple ground vehicles at night on roads (Fig. 12, bottom), though fusion of MWIR with low-light visible yields a more robust target detector at night (see Figs. 6 & 7) and is essential for detecting vessels in the presence of solar glint while suppressing false alarms.

Figure 12. (Top) Detection & tracking of multiple “red” cars in fused RGB/MWIR imagery. (Bottom) MWIR alone can support multi-target ground vehicle tracking on open roads; however, fused low-light visible/MWIR provides a more robust detector and is essential for vessels in glint.

6 Conclusions

We have demonstrated a proof-of-concept airborne capability to detect and track multiple surface targets using fused 4-band RGB/MWIR video. The system was tested over water and land on a wide range of targets, including slow & fast vessels, moving & stationary vehicles, and dismounts both standing still and walking along the shore. The system integrated many previously developed capabilities for image co-registration and geo-location, image fusion, target learning & detection, MHT multi-target tracking, and automated sensor control & resource management. All computations ran in real-time on COTS hardware. Once a prototype system is developed and its performance assessed under a wide variety of conditions, a similar capability could be integrated directly with a compact multi-sensor turret aboard a manned or unmanned fixed-wing aircraft or rotorcraft.

References

[1] A. Waxman, D. Fay, P. Ilardi, P. Arambel, and J. Silver, "Active tracking of surface targets in fused video," Proceedings of the 10th International Conference on Information Fusion (Fusion 2007), Quebec, 2007.
[2] A. Waxman, D. Fay, P. Ilardi, E. Savoye, R. Biehl, and D. Grau, "Sensor fused night vision: Assessing image quality in the lab and in the field," Proceedings of the 9th International Conference on Information Fusion (Fusion 2006), Florence, 2006.
[3] A. Waxman, A. Gove, D. Fay, J. Racamato, J. Carrick, M. Seibert, and E. Savoye, "Color night vision: Opponent processing in the fusion of visible and IR imagery," Neural Networks, 10, pp. 1-6, 1997.
[4] D. Fay, P. Ilardi, N. Sheldon, D. Grau, R. Biehl, and A. Waxman, "Real-time image fusion and target learning & detection on a laptop attached processor," Proceedings of the 7th International Conference on Information Fusion, Fusion 2005, Stockholm, 2005.
[5] G.A. Carpenter, S. Grossberg, N. Markuzon, J.H. Reynolds, and D.B. Rosen, "Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps," IEEE Transactions on Neural Networks, 3, pp. 698-713, 1992.
[6] T. Kasuba, "Simplified Fuzzy ARTMAP," AI Expert, pp. 18-25, Nov. 1993.
[7] Y. Bar-Shalom and X.-R. Li, Multitarget-Multisensor Tracking: Principles and Techniques, YBS Publishing, 1995.
[8] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion (ATIF)," MSS National Symposium on Sensor and Data Fusion, 2000.
[9] P. Arambel, M. Antone, M. Bosse, J. Silver, J. Krant, and T. Strat, "Performance assessment of a video-based air-to-ground multiple target tracker with dynamic sensor control," SPIE Defense and Security Symposium: Signal Processing, Sensor Fusion, and Target Recognition XIV, Orlando, 2005.
[10] B. Gorin and A. Waxman, "Flight test capabilities for real-time multiple target detection and tracking for airborne surveillance and maritime domain awareness," SPIE Vol. 6945, Orlando, 2008.
