
Sensor Fused Night Vision: Assessing Image Quality in the Lab and in the Field

A. Waxman, D. Fay, P. Ilardi, D. Savoye (consultant), R. Biehl, and D. Grau

Fusion Technology & Systems Division, BAE SYSTEMS Advanced Information Technologies

Burlington, MA, U.S.A. [email protected]

Abstract – Generating real-time fused combinations of VNIR, SWIR, and LWIR imagery enhances situational awareness, task performance, and overall image quality. However, no quantitative means exists to link image quality metrics with task performance for fused imaging, the way such links exist for single-modality imaging. We illustrate how fused image quality is affected by multiple dimensions, including scene illumination, reflectance contrast, thermal contrast, sensor resolution, noise-limited resolution, local adaptive gain & contrast enhancement, noise cleaning, and fusion methodology. Task performance has also been assessed in two field collection campaigns and in multiple human performance tests in the lab using field data. Though image fusion clearly improves subjective image quality, a means to predict task performance improvement is still lacking.

Keywords: Image fusion, fused night vision, image quality, image quality assessment

1 Introduction

The fusion of low-light visible and thermal infrared imagery in real time has put us on the threshold of a new era in night vision. Laboratory breadboards, integrated field cameras, and prototype digital goggles that fuse the spectral content of reflected VNIR and emitted LWIR are being developed worldwide. This complementary information will provide the user with enhanced situational awareness, but just how effectively remains to be demonstrated, assessed, and quantified where possible. Ultimately, the user will want task-specific measures of effectiveness, much like the NIIRS [1,2] scale or multispectral NIIRS [3] provides for image analysts, or the Johnson criterion [4] provides for intensified night vision in the VNIR.

Target detection and recognition in the VNIR is a function of the noise-limited resolution under low-light conditions, the target-to-background contrast, and the overall reflectivity of the scene at a given illumination [4,5]. Target detection in the LWIR is a function of the thermal contrast of target to background, the noise equivalent temperature difference (NETD), and the sampling resolution [6]. These quantifiable metrics, and others like the Generalized Image Quality Equation [7], have been formalized and implemented in computer codes. For VNIR/LWIR fused imagery, no such model of image quality or utility exists as yet, though various studies have been conducted to show the merits of fused imaging. Target detection will be a function of the sensor-specific parameters and the manner in which the imagery is fused. Color fused imagery is likely to provide stronger target-to-background contrast than gray fused imagery, as the human visual system is quite sensitive to a great many color contrasts. This can lead to faster and more reliable target pop-out, shorter reaction times (for target detection, navigation, situational awareness), and less fatigue. However, unnatural false color imagery may prove hard to interpret despite its contrast.

Resolution and contrast visibility tests can be conducted in the laboratory under controlled illumination and thermal conditions, whereas task-specific measures of performance can be assessed in the field under natural night illumination. Ultimately, one wants to relate quantifiable metrics that can be extracted directly from imagery to task-specific performance. This paper describes laboratory assessments of VNIR, SWIR, LWIR, and fused imagery from a variety of sensors, and results from two field collections that assess relative performance on a variety of tasks under differing operating conditions at night. The assessment of fused imagery has begun, but there are no established metrics as yet. The goal of predicting human performance for various tasks using fused imagery remains to be attained.

2 Image Fusion & Image Quality

In 1993, Waxman and colleagues [see 8, 9] introduced the opponent-color neural approach to image fusion, based on observations of visible/thermal fusion in pythons and rattlesnakes and its analogy to opponent-color processing in the mammalian retina and primary visual cortex. Over the years the opponent-color architecture has matured, and it has been combined with the multi-resolution gray fused brightness channel introduced by Toet [10] and extended by Burt [11]. Figure 1 illustrates the multi-resolution opponent-color architecture for fusing two sensors, VNIR/LWIR and SWIR/LWIR, with example results shown in Figures 2 & 3. The exact same architecture is used to combine three sensors, VNIR/SWIR/LWIR or VNIR/MWIR/LWIR as shown in Figures 4 & 5, or all four sensors, VNIR/SWIR/MWIR/LWIR as shown in Figure 5.
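To make the data flow of Figure 1 concrete, a minimal single-scale sketch of two-band opponent-color fusion is given below in Python. It is notional only: the fielded system is multi-resolution, and the Gaussian kernel scales, the shunting constant A, and the exact assignment of opponent contrasts to the Y, I, Q channels are illustrative assumptions, not the implementation.

```python
# Minimal single-scale sketch of two-band opponent-color fusion (VNIR + LWIR).
# Inputs are assumed registered, noise-cleaned, and scaled to [0, 1]; kernel
# scales, the shunting constant A, and the Y/I/Q assignments are notional.
import numpy as np
from scipy.ndimage import gaussian_filter

def shunt(center_img, surround_img, sigma_c=1.0, sigma_s=4.0, A=0.1):
    """Center-surround shunting operator: bounded ratio contrast."""
    C = gaussian_filter(center_img, sigma_c)
    S = gaussian_filter(surround_img, sigma_s)
    return (C - S) / (A + C + S)

def fuse_opponent(vnir, lwir):
    # Stage 1: per-band local adaptive contrast & gain (LACEG),
    # rescaled back to roughly [0, 1].
    v = 0.5 * (shunt(vnir, vnir) + 1.0)
    t = 0.5 * (shunt(lwir, lwir) + 1.0)
    # Stage 2: cross-sensor opponent contrasts (+VNIR-LWIR and +LWIR-VNIR).
    vt = shunt(v, t)
    tv = shunt(t, v)
    # Interpret fused brightness and the two opponents as Y, I, Q, then
    # apply the standard YIQ-to-RGB mapping for display.
    Y, I, Q = 0.5 * (v + t), tv, vt
    R = np.clip(Y + 0.956 * I + 0.621 * Q, 0.0, 1.0)
    G = np.clip(Y - 0.272 * I - 0.647 * Q, 0.0, 1.0)
    B = np.clip(Y - 1.106 * I + 1.703 * Q, 0.0, 1.0)
    return np.dstack([R, G, B])
```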

Page 2: Sensor Fused Night Vision : - ISIF

[Figure 1 diagram: VNIR and LWIR inputs pass through Register and Noise Clean, then LACEG (local adaptive contrast & gain), then multi-sensor/multi-scale opponent-contrast stages built from center-surround shunting operators (denoted +-), whose Y, I, Q outputs are hue-rotated and mapped to RGB to drive display brightness and color of the fused result.]

Figure 1: The multi-resolution opponent-color fusion network uses center-surround shunting dynamics to support local adaptive contrast enhancement and gain (LACEG), and two stages of opponent-color processing whose outputs are interpreted as the elements of the Y, I, Q human opponent-color space.

Figure 2: Dual-sensor fusion of VNIR/LWIR using an intensified CCD for the VNIR channel, and SWIR/LWIR using an InGaAs imager for the SWIR channel, both using an uncooled microbolometer for the LWIR channel. Taken under overcast full moon (40 mLux), the SWIR/LWIR image has better quality than the VNIR/LWIR image, though the I2CCD has higher resolution than the InGaAs.

[Figure 3 panels: Daylight Visible, Mid-Wave Infrared FLIR, Gray Fused, Color Fused]

Figure 3: Image quality is much improved by fusing VNIR, providing context, with MWIR, highlighting targets, when obscurants are present. Color fusion aids target pop-out and spontaneous segmentation, as compared to gray-level fusion.

These examples show how image fusion can impact image quality through adaptive contrast enhancement (edge sharpening), locally adaptive gain control (exploiting wide dynamic range), enhanced color contrast by fusing more complementary spectral content, target pop-out and spontaneous segmentation through color contrast, and visibility of targets and context through obscurants.

[Figure 4 panels: VNIR, MWIR, LWIR, Fused, and kayak detections]

Figure 4: The same neural architecture shown in Figure 1 is used here to fuse three sensors, VNIR/MWIR/LWIR. The various opponent-color outputs support the detection of spatially unresolved kayaks via image mining.

[Figure 5 panels: InGaAs SWIR (0.9-1.7 µm), Fused SWIR/LWIR, Fused VNIR/SWIR/LWIR, Fused VNIR/SWIR/MWIR/LWIR]

Figure 5: The architecture of Figure 1 is used to fuse 2, 3, or 4 sensors, including VNIR/SWIR/MWIR/LWIR shown here, as collected under quarter-moon (27 mLux). Image quality improves, with increased color contrast, as more complementary spectral bands are fused.

3 Local Adaptive Contrast & Gain

The first stage of the neural architecture shown in Figure 1 operates on each sensor modality separately. It provides local adaptive contrast enhancement & gain (LACEG). It enhances image quality by boosting edge sharpness and compressing wide dynamic range data locally while preserving local contrasts. Unlike histogram equalization, typically utilized in camera automatic gain control (AGC), which ignores spatial relationships among brightness features, LACEG processing preserves local contrast and renormalizes it in terms of local brightness across an arbitrarily large dynamic range. It emerges from our modified version of Grossberg's shunting dynamics [8]. Figures 6 & 7 illustrate the significance of LACEG processing in VNIR and SWIR scenes of wide dynamic range and non-uniform illumination.
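As a notional illustration of how shunting dynamics produce this behavior: at steady state a center-surround shunting network outputs a bounded ratio contrast, (C - S)/(A + C + S), so a given local contrast renders the same regardless of local mean brightness. The sketch below, with assumed kernel scales and decay constant, maps a wide-dynamic-range frame to 8 bits this way.

```python
# Notional LACEG-style sketch: center-surround shunting rendered to 8 bits.
# Kernel scales and the decay constant A are illustrative assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter

def laceg(raw, sigma_center=1.0, sigma_surround=5.0, A=1.0):
    x = raw.astype(np.float64)
    C = gaussian_filter(x, sigma_center)    # local excitatory center
    S = gaussian_filter(x, sigma_surround)  # broader inhibitory surround
    # Steady state of shunting dynamics: a bounded ratio contrast, so the
    # same local contrast renders identically in dim and bright regions,
    # unlike global histogram-based AGC.
    y = (C - S) / (A + C + S)               # roughly in [-1, 1]
    return np.clip(255.0 * (y + 1.0) / 2.0, 0.0, 255.0).astype(np.uint8)

# Usage: an 8-bit display rendering of a 12-bit frame.
# frame = np.random.randint(0, 4096, (480, 640))  # stand-in for sensor data
# display = laceg(frame)
```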

Page 3: Sensor Fused Night Vision : - ISIF

[Figure 6 panels: VNIR and LWIR shown as low 256 gray levels, high 256 gray levels, and full dynamic range (LACEG)]

Figure 6: Wide dynamic range VNIR (12-bit) imagery under starlight conditions, and LWIR (14-bit) imagery in freezing conditions, are adaptively enhanced and rendered in 8-bit dynamic range via LACEG processing.


Figure 7: SWIR imagery of a low-light scene with intense local illumination cast by a flashlight. Ordinary AGC (left) loses local contrast and often saturates in places, while LACEG (right) preserves local contrast and detail across a wide dynamic range, enhancing image quality.

4 Noise-Limited Resolution, Detection Modeling, and Color-Fused Filling-In

We are accustomed to thinking that sensor resolution determines image quality, though this is not a simple issue under low-light-level (signal-starved) conditions. In Figure 2 we observe that the SWIR/LWIR fused image on the lower right has better visual quality than the VNIR/LWIR fused image on the lower left, despite the fact that the SWIR imager has a sampling resolution of 640x480 while the I2CCD has a sampling resolution of 1280x960. However, the noise characteristics of intensifier tubes are quite different from those of focal plane arrays (FPAs) and their readout circuitry, the contrast sensitivity behavior of tubes and FPAs differs, and the spectral opponent-contrast differs here as well. Each affects the resulting fused image quality, such that the SWIR/LWIR image has the better visual appearance. (This is also true when viewing image sequences fused in real time, as we did in the field.) Sensor resolution is replaced by noise-limited resolution (NLR), which can be understood as the minimum spatial element whose contrast just equals the average signal-to-noise in that element. As the signal level drops, the many noise sources (including signal photon fluctuations) combine in quadrature, requiring a larger element to average over the noise during the human visual integration time. Hence, many factors conspire to determine the NLR of a low-light imager [4,5,6], including the composite spectrum of the illuminant (moonlight, starlight, galactic background, and airglow, the last particularly strong in the SWIR). The model of Savoye captures all these effects. Figures 8(a-d) illustrate the loss of apparent resolution of the intensified-CCD VNIR sensor in the lab under controlled (and spectrally calibrated) illumination conditions. We also show how our LACEG local adaptive processing enhances spatial contrast and exploits the wide dynamic range, and how our temporal (non-averaging) filter removes speckle noise at starlight levels and below.
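The core of the NLR idea can be illustrated with a toy calculation (this is not the Savoye model, and all sensor numbers below are hypothetical placeholders): grow the averaging element until its contrast signal exceeds the quadrature sum of the noise contributions collected over it.

```python
# Hypothetical sketch of noise-limited resolution: grow the averaging
# element until contrast * signal exceeds k times the combined noise.
# All sensor numbers are made-up placeholders, not the Savoye model.
import math

def noise_limited_element(signal_e, contrast, read_noise_e, dark_e, k=1.0):
    """Return minimum element side (in pixels) for a just-visible element.

    signal_e     -- mean photoelectrons per pixel per eye integration time
    contrast     -- target-to-background contrast (0..1)
    read_noise_e -- rms read noise per pixel (electrons)
    dark_e       -- dark/background electrons per pixel
    """
    n = 1
    while True:
        pixels = n * n
        sig = contrast * signal_e * pixels
        # Independent noise sources combine in quadrature; photon noise
        # is Poisson, so its variance equals the collected signal.
        noise = math.sqrt(pixels * (signal_e + dark_e + read_noise_e**2))
        if sig >= k * noise:
            return n
        n += 1

# As illumination drops, signal_e falls and the returned element grows,
# i.e., the noise-limited resolution coarsens:
for s in (100.0, 10.0, 1.0, 0.1):
    print(s, noise_limited_element(s, contrast=0.5, read_noise_e=3.0, dark_e=1.0))
```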

Figure 8a: I2CCD imaging of resolution charts of varying contrast under "full moon" (42 mLux) in the lab. The loss of apparent resolution with target contrast is obvious, as is the improvement using LACEG processing.

Figure 8b: Same as Figure 8a for "quarter moon" (12 mLux).

Figure 8c: Same as Figure 8a for "starlight" (3 mLux).

Figure 8d: Same as Figure 8a for "below starlight" (1 mLux).

Page 4: Sensor Fused Night Vision : - ISIF

[Figure 8-9 panels: resolution charts at 100%, 50%, and 20% contrast]

Figure 9: Low-light CCD imager without an intensifier microchannel plate shows different noise characteristics, but similar loss of apparent resolution at low light levels.

Figure 9 corresponds to a new generation of low-light-level VNIR imagers that do not amplify photoelectrons by means of a microchannel plate, as is used in intensifier tubes. This low-light CCD was developed in the mid-1990s at MIT Lincoln Laboratory [12]. It utilized a high quantum-efficiency, back-illuminated, thinned CCD with 4 low-noise readout ports. In order to achieve this kind of performance in a compact, low-power sensor, newly emerging electron-bombarded CMOS and even microchannel-plate CMOS imagers are being developed. The noise-limited-resolution (NLR) model of Savoye extends the NVESD models [4,5,6] to include more complete modeling of noise sources, photocathode spectral sensitivity, and the spectral content of the illuminant from 0.4-2.0 microns. Figures 10(a-c) compare predicted NLR curves for three low-light sensors under varying illumination for target-to-background contrasts of 100%, 50%, and 10%. Two imagers operate in the VNIR: a direct-view Omni IV intensifier tube (with Gen III GaAs photocathode) and an electron-bombarded 1280x1024 CMOS imager (with the same photocathode) still under development; the third is a 640x480 bump-bonded InGaAs imager operating in the SWIR. The readout noise is relatively low, and the sensors are only TE-cooled. The NLR is shown in terms of line pairs per mm, and the Johnson criteria [4] for 80% detection & recognition of a man target at 150 meters are indicated for each imager.
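The detect/recognize markers in Figures 10(a-c) follow from simple geometry: a task demands some number of resolvable cycles across the target's critical dimension, which the lens focal length converts to line pairs per mm at the focal plane. The sketch below uses the classic 50% Johnson cycle counts and a hypothetical 50 mm lens and 0.75 m man width for illustration; the 80% criteria used in the figures require proportionally more cycles.

```python
# Hedged sketch: required focal-plane resolution (lp/mm) for a man target
# at range, under assumed Johnson-style cycle criteria and a hypothetical
# 50 mm lens; exact cycle counts for 80% probability differ by model.
def required_lp_per_mm(n_cycles, target_m, range_m, focal_mm):
    # The target's image at the focal plane spans focal * size / range (mm);
    # we need n_cycles line pairs across that image.
    image_mm = focal_mm * (target_m * 1000.0) / (range_m * 1000.0)
    return n_cycles / image_mm

TARGET, RANGE, FOCAL = 0.75, 150.0, 50.0   # man width (m), range (m), lens (mm)
print("detect   :", required_lp_per_mm(1.0, TARGET, RANGE, FOCAL), "lp/mm")
print("recognize:", required_lp_per_mm(4.0, TARGET, RANGE, FOCAL), "lp/mm")
```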

[Figure 10a plot: Calculated Noise-Limited Resolution vs. Light Level. Limiting resolution (lp/mm, 1-100, log scale) vs. scene illuminance (1e-6 to 1 lux; composite 2856 K BB / NVESD / VATSIA spectrum to 2 mlux, plus moonlight above 2 mlux) for the Intevac ISIE10 EBAPS, SU SDV 640 InGaAs, and OMNI V image intensifier at scene contrast 1.0. Conditions: Air Force resolution test pattern, observation time 0.1 s, F/1.2 lens, T = 0.8, MTF = 1.0, scene reflectance 0.2. Illumination markers run from overcast starlight through starlight (no moon) and quarter moon to full moon; horizontal lines mark the lp/mm required to detect and recognize a man at 150 m.]

Figure 10a: Noise-limited resolution (NLR) curves of low-light-level imagers as a function of night-sky illumination for a target contrast of 100%. The direct-view intensifier outperforms the electron-bombarded CMOS and the lower-pixel-count SWIR imager.

Page 5: Sensor Fused Night Vision : - ISIF

[Figure 10b plot: same layout as Figure 10a, at scene contrast 0.5.]

Figure 10b: NLR curves as in Figure 10a for a target contrast of 50%. The direct view intensifier is still predicted to be the best, but all three sensors will have similar resolution below starlight illumination conditions.

[Figure 10c plot: same layout as Figure 10a, at scene contrast 0.1.]

Figure 10c: NLR curves as in Figure 10a for a target-to-background contrast of only 10%. The intensifier tube and electron-bombarded CMOS are predicted to have very similar resolution at all light levels, whereas the lower pixel count SWIR imager is predicted to have the best performance at quarter-moon and lower light levels.

Given the somewhat surprising predictions of SWIR imager NLR indicated in Figure 10c, it is understandable that the SWIR image in Figure 2 has better visual quality than the I2CCD image (noting that both sensors have reduced sampling resolution compared to those modeled in Figure 10).

Page 6: Sensor Fused Night Vision : - ISIF

When we fuse noise-limited low-light imagery with thermal LWIR imagery, we enhance image quality by two means. One is the enhanced contrast, and resultant target pop-out, inherent in color fusion under any illumination conditions. The other is perceptual filling-in of brightness and color within the form generated by the object's boundary and feature contours, and not by the spatio-temporal noise. This is shown in Figure 11, taken in the lab under controlled illumination levels as indicated.

It is not enough, however, to simply have image fusion or even color fusion to improve image quality. The method of color fusion can greatly impact the fused image quality perceived by the user. In Figure 12 we compare three kinds of VNIR/LWIR image fusion. Gray-level additive fusion is the basis of most non-color fusion methods, in which the two original (or enhanced) grayscale images are blended on one or more scales, either in terms of local brightness or local contrast. Another is color assignment fusion, in which one modality (usually VNIR) is assigned to the green channel and the other modality (usually LWIR) is assigned to the red channel. This is similar to the optical overlay fusion characteristic of the U.S. Army's new Enhanced Night Vision Goggles. This kind of color fusion provides high color contrast and target pop-out, but is non-literal and unstable under changing illumination conditions. The third kind of fusion shown is our opponent-color fusion, in which cross-sensor contrasts are converted to human opponent-color contrasts, essentially combining three fused images in one color presentation. It supports target pop-out, enhanced contrast, local gain, a literal coloring, and is relatively stable under changing illumination.
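In their simplest single-scale forms, the three methods of Figure 12 reduce to a few lines each. The sketch below assumes registered, enhanced inputs scaled to [0, 1] and omits the multi-resolution machinery; the opponent-color caricature uses plain differences where the full system uses center-surround opponent contrasts.

```python
# Simplest single-scale caricatures of the three methods in Figure 12;
# inputs assumed registered, enhanced, and scaled to [0, 1]. The fielded
# systems are multi-resolution and use center-surround opponents.
import numpy as np

def gray_additive(vnir, lwir, w=0.5):
    # A+B gray fusion: blend the two brightness images.
    return w * vnir + (1.0 - w) * lwir

def color_assignment(vnir, lwir):
    # VNIR to green, LWIR to red (cf. optical-overlay goggles): high color
    # contrast, but non-literal and illumination-sensitive.
    return np.dstack([lwir, vnir, np.zeros_like(vnir)])

def opponent_color(vnir, lwir):
    # Opponent-color caricature: fused brightness plus two signed
    # cross-sensor contrasts fill a Y/I/Q-like space (see Section 2 sketch).
    Y, I, Q = 0.5 * (vnir + lwir), lwir - vnir, vnir - lwir
    R = np.clip(Y + 0.956 * I + 0.621 * Q, 0.0, 1.0)
    G = np.clip(Y - 0.272 * I - 0.647 * Q, 0.0, 1.0)
    B = np.clip(Y - 1.106 * I + 1.703 * Q, 0.0, 1.0)
    return np.dstack([R, G, B])
```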

[Figure 11 panels: starlight and below-starlight rows]

Figure 11: Fusion of noise-limited low-light VNIR imagery with thermal LWIR imagery illustrates improved image quality. This results from both enhanced color contrast and perceptual filling-in of brightness and color within boundary and feature contours of the target.

[Figure 12 panels: Low-Light VNIR (Enhanced), Uncooled LWIR, A+B Gray Fusion, Color Assignment Fusion, Opponent-Color Fusion]

Figure 12: Three alternative fusion methods combine VNIR and LWIR via additive gray fusion, color assignment fusion, and opponent-color fusion. The resultant fused image qualities differ dramatically.

5 Task Specific Utility Assessments

In the previous discussion we focused on issues of color, contrast, noise, and apparent resolution, all of which affect subjective measures of image quality. For single-modality imagers, past work has correlated simple image metrics with user performance on tasks such as object detection and recognition (e.g., the GIQE [7] and its relation to the NIIRS [1,2], or the Johnson criterion [4,5]). Alternatively, we might assess fused image quality based on its ability to support specific tasks or, more generally, classes of tasks. This can be done in the lab using static or dynamic imagery, to assess accuracy, reaction time, and fatigue under controlled conditions. Or task-specific assessments can be conducted in the field under natural illumination and thermal conditions, while performing activities of interest for night operations.

A comparison of performance on low-contrast target search in static VNIR, LWIR, and fused (gray and color) imagery was conducted in [13], where it was shown that opponent-color fused imagery yields shorter reaction times and higher accuracy than single-modality imagery or grayscale fusion. These tests used false targets of controlled contrast embedded in natural scenes. Toet [14] developed an alternative false color fusion method, and collaborated with Waxman [15] to assess which of the various color or gray fusion methods would provide the greatest situational awareness in a ground surveillance task. This test utilized real static VNIR and LWIR scenes involving a man on a dirt road and near a fence around a house. The subject had to determine where the man was relative to the road or fence, among other related tasks, in order to judge both detection accuracy and perception of spatial relations in single-modality and fused scenes. Again, opponent-color fusion showed the best performance.

Page 7: Sensor Fused Night Vision : - ISIF

Helicopter pilotage tasks were assessed in [16] across a variety of grayscale and color fusion algorithms. I2CCD and FLIR imagery was collected from a helicopter conducting night flight operations under mostly quarter-moon conditions: take-off, landing, hovering, and flight over roads, past vehicles, and over tree-tops. Imagery was then fused in real time and provided to the U.S. Army's Night Vision Lab for perceptual testing on Army and Marine helicopter pilots wearing helmet-mounted displays. Quantitative measures of performance were recorded, and in most cases opponent-color fusion outperformed the single modalities, gray fused imagery, and an alternative color assignment fusion method.

In 1998 we had the opportunity to conduct field tests with U.S. Army Special Forces under starlight conditions. Operations on land and on water were organized, and a variety of tasks were assessed for performance using single-modality or real-time fused imagery [17]. Approximately 15 observers, most of them trained military, scored the ease or difficulty with which tasks could be conducted based on the visibility of key information in the fused displays. Figure 13 shows the quality of some of the imagery taken under these challenging conditions. The complementary nature of the VNIR and LWIR imagery is readily apparent, and the fused image clearly conveys more than the sum of its parts.

[Figure 13 panels: Low-Light CCD, Uncooled LWIR, Real-time Color Fused]

Figure 13: Fusion of low-light CCD and LWIR imagery under starlight (1 mLux) conditions. The men were 100 meters from the sensors, viewed with a 7° field-of-view.

Figure 14 illustrates a task performance matrix, in which observers scored a variety of tasks being conducted by Special Forces in starlight conditions while watching the single-modality or fused imagery on a monitor. Overall, it is apparent that opponent-color image fusion enhanced task performance, in that it performed as well as or better than the best of the two individual modalities.

[Figure 14 matrix: tasks (Men: track, identify, discriminate, identify activities; Vehicles: detect, identify; Uniforms: identify, discriminate; Weapons: identify, discriminate; Camouflage: detection; Obscurants: vegetation, smokescreen) scored under Low-Light Visible, Thermal Infrared, and Color Fused columns, each at wide FOV (35, 70, 100 m) and narrow FOV (100, 300, 500 m).]

Figure 14: Task performance is enhanced by using color fused imagery compared to single-modality VNIR or LWIR imagery. Tests conducted by Army Special Forces. [Green = Easy; Yellow = Difficult; Red = Impossible]

In December 2004, we conducted another field collection using our latest VNIR, SWIR, and LWIR imaging sensors and laptop-attached fusion processor [9], as was done for Figure 2. We collected data under an overcast full-moon level of illumination (40 mLux), due to reflection of distant urban lights off the clouds. Our team was in the field (actually, in the snow), conducting a variety of tasks at multiple distances from the sensors, both in the open field and against the trees. Sample imagery is shown in Figure 15. We compiled a task matrix, shown in Figure 16, which compares task visibility using single-modality VNIR, SWIR, or LWIR with fused VNIR/LWIR and SWIR/LWIR.

[Figure 15 panels: I2CCD, LWIR, Fused]

Figure 15: Fused data collection of men operating in snow, December 2004, at night under 40 mLux illumination.

Figure 16 task matrix (scores at 25 / 50 / 100 / 150 m per column; 1 = Easy, 2 = Hard, 3 = Impossible):

Open Field                                      I2CCD     SWIR      LWIR      I2CCD/LWIR  SWIR/LWIR
Men Standing   Discriminate from Background     1 1 2 3 | 1 1 1 2 | 1 1 1 1 | 1 1 1 1   | 1 1 1 1
               Discriminate from Each Other     1 2 3 3 | 1 2 3 2 | 1 2 2 3 | 1 2 2 3   | 1 2 3 3
Camouflage     Discriminate while Standing      2 3 3 3 | 2 3 3 3 | 3 3 3 3 | 2 3 3 3   | 2 3 3 3
Men Walking    Discriminate from Background     1 1 1 2 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1   | 1 1 1 1
               Track                            1 1 1 2 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1   | 1 1 1 1
               Discriminate from Each Other     1 1 2 3 | 1 1 2 3 | 1 1 3 3 | 1 1 2 3   | 1 1 2 3
Launch Tube    Discriminate while Still         1 1 3 3 | 1 1 3 3 | 1 1 3 3 | 1 1 3 3   | 1 1 3 3
               Discriminate while Turning       1 1 3 3 | 1 1 3 3 | 1 1 3 3 | 1 1 3 3   | 1 1 3 3
               Discriminate while Kneeling      1 1 3 3 | 1 1 3 3 | 1 1 2 3 | 1 1 3 3   | 1 1 3 3
               Discriminate while Walking       1 1 3 3 | 1 1 3 3 | 1 1 3 3 | 1 1 3 3   | 1 1 3 3
Launch Tube    Identify while Still             2 3 3 3 | 1 2 3 3 | 1 2 3 3 | 1 2 3 3   | 1 2 3 3
               Identify while Turning           2 3 3 3 | 1 3 3 3 | 1 2 3 3 | 1 2 3 3   | 1 2 3 3
               Identify while Kneeling          2 2 3 3 | 1 3 3 3 | 2 2 3 3 | 1 2 3 3   | 1 2 3 3
               Identify while Walking           3 3 3 3 | 3 3 3 3 | 3 3 3 3 | 3 3 3 3   | 3 3 3 3

Near Treeline                                   I2CCD     SWIR      LWIR      I2CCD/LWIR  SWIR/LWIR
Men Standing   Discriminate from Background     1 2 3 3 | 1 2 2 3 | 1 1 1 1 | 1 1 1 2   | 1 1 1 2
               Discriminate from Each Other     1 3 3 3 | 1 3 3 3 | 1 3 3 3 | 1 2 3 3   | 1 2 2 2
Camouflage     Discriminate while Standing      2 3 3 3 | 3 3 3 3 | 3 3 3 3 | 2 2 3 3   | 2 2 3 3
Men Walking    Discriminate from Background     1 1 2 2 | 1 1 2 3 | 1 1 1 1 | 1 1 2 2   | 1 1 1 2
               Track                            1 1 2 2 | 1 1 2 3 | 1 1 1 1 | 1 1 2 2   | 1 1 1 2
               Discriminate from Each Other     1 2 3 3 | 1 1 3 3 | 1 2 3 3 | 1 2 3 3   | 1 2 3 3
Launch Tube    Discriminate while Still         3 3 3 3 | 1 2 3 3 | 1 2 3 3 | 2 3 3 3   | 2 2 3 3
               Discriminate while Turning       2 2 3 3 | 1 1 3 3 | 1 2 3 3 | 1 3 3 3   | 1 2 3 3
               Discriminate while Kneeling      1 3 3 3 | 1 1 3 3 | 1 2 3 3 | 1 2 3 3   | 1 1 3 3
               Discriminate while Walking       1 2 3 3 | 1 1 3 3 | 1 1 3 3 | 1 1 3 3   | 1 1 3 3
Launch Tube    Identify while Still             3 3 3 3 | 3 3 3 3 | 1 3 3 3 | 3 3 3 3   | 3 3 3 3
               Identify while Turning           3 3 3 3 | 3 3 3 3 | 1 3 3 3 | 2 3 3 3   | 3 3 3 3
               Identify while Kneeling          3 3 3 3 | 2 3 3 3 | 2 3 3 3 | 2 3 3 3   | 2 3 3 3
               Identify while Walking           3 3 3 3 | 3 3 3 3 | 3 3 3 3 | 3 3 3 3   | 3 3 3 3

Figure 16: Task performance in VNIR (I2CCD), SWIR, LWIR, and fused I2CCD/LWIR and SWIR/LWIR, conducted in snow under overcast full-moon conditions. [Preliminary results]

6 Conclusions

It is clear from the many examples given here that fused image quality spans multiple dimensions: quantitative metrics, perceptual metrics, and task-specific metrics. Image quality is influenced by the choice of sensors (spectral bands and alternative sensors within a particular band), the prevailing illumination and thermal conditions, and the kinds of enhancement and fusion processing applied to the data. Image quality models try to capture the phenomenology, relating sensor parameters and natural conditions to various measures of image quality. Task-specific utility of image fusion has already been demonstrated by multiple investigators across several domains of operation. However, relating task performance to measures of fused image quality remains elusive. Further work is needed to bridge the gap and generate fused image metrics that are predictive of human performance in different task domains.

References

[1] J. Irvine, National imagery interpretation rating scales (NIIRS): Overview and methodology, SPIE-3128, 93-103, 1997. Also see www.fas.org/irp/imint/niirs.htm
[2] R. Driggers, P. Cox, and M. Kelley, National imagery interpretation rating system and the probabilities of detection, recognition & ID, Optical Engineering, 36, 1052-1059, 1997.
[3] IRARS, Multispectral Imagery Interpretability Rating Scale (MSIIRS), 1995, www.fas.org/irp/imint/niirs_ms/msiirs.htm
[4] RCA Electro-Optics Handbook, RCA Corporation, 1974. Available online from Burle Industries, www.burle.com/cgi-bin/byteserver.pl/pdf/Electro_Optics.pdf
[5] L. Biberman, editor, Electro-Optical Imaging: System Performance and Modeling, SPIE Press, 2000.
[6] NVESD, NVTherm (thermal imager prediction models): NVTherm Rev. 7 (see Ontar Corp. online) and NVThermIP (see the Sensiac Military Sensing Information Analysis Center, www.sensiac.gatech.edu/sensiac/external/products/models_simulations/nvtherm_overview.jsf)
[7a] J. Leachtenauer, W. Malila, J. Irvine, L. Colburn, and N. Salvaggio, Generalized image quality equation: GIQE, Applied Optics, 36 (32), 8322-8328, 1997.
[7b] J. Leachtenauer, W. Malila, J. Irvine, L. Colburn, and N. Salvaggio, Generalized image quality equation for infrared imagery, Applied Optics, 39 (26), 4826-4828, 2000.
[8] A. Waxman, A. Gove, D. Fay, J. Racamato, J. Carrick, M. Seibert, and E. Savoye, Color night vision: Opponent processing in the fusion of visible and IR imagery, Neural Networks, 10, 1-6, 1997.
[9] D. Fay, P. Ilardi, N. Sheldon, D. Grau, R. Biehl, and A. Waxman, Realtime image fusion and target learning & detection on a laptop attached processor, SPIE-5802, Enhanced and Synthetic Vision, 2005.
[10] A. Toet, L. van Ruyven, and J. Valeton, Merging thermal and visual images by a contrast pyramid, Optical Engineering, 28, 789-792, 1989.
[11] P. Burt and R. Kolczynski, Enhanced image capture through fusion, Fourth International Conference on Computer Vision, 173-182, Los Alamitos: IEEE Computer Society Press, 1993.
[12] A. Waxman, D. Fay, D. E. Savoye, et al., Solid-state color night vision: Fusion of low-light visible and thermal infrared imagery, Lincoln Laboratory Journal, 11(1), 41-60, 1998.
[13] A. Waxman, A. Gove, M. Seibert, D. Fay, J. Carrick, J. Racamato, E. Savoye, B. Burke, R. Reich, W. McGonagle, and D. Craig, Progress on color night vision: Visible/IR fusion, perception & search, and low-light CCD imaging, SPIE-2736, Enhanced and Synthetic Vision, 96-107, 1996.
[14] A. Toet and J. Walraven, New false color mapping for image fusion, Optical Engineering, 35, 650-658, 1996.
[15] A. Toet, J. IJspeert, A. Waxman, and M. Aguilar, Fusion of visible and thermal imagery improves situational awareness, SPIE-3088, Enhanced and Synthetic Vision, 177-188, 1997.
[16] P. Steele and P. Perconti, Part task investigation of multispectral image fusion using gray scale and synthetic color night vision sensor imagery for helicopter pilotage, SPIE-3062, 88-100, 1997.
[17] M. Aguilar, D. Fay, D. Ireland, J. Racamato, W. Ross, and A. Waxman, Field evaluations of dual-band fusion for color night vision, SPIE-3691, Enhanced and Synthetic Vision, 1999.