Signal Processing: Image Communication 19 (2004) 1005–1028

www.elsevier.com/locate/image

Importance prioritisation in JPEG 2000 for improved interpretability

Anthony Nguyen*, Vinod Chandran, Sridha Sridharan

School of Electrical and Electronic Systems Engineering, Image and Video Research Laboratory, Queensland University of Technology, GPO Box 2434, Brisbane 4001, Australia

Received 21 January 2004

Abstract

Importance prioritised coding is an image coding principle aimed at improving the interpretability versus bit-rate performance of image coding systems. It is important in surveillance, where image formats are large and transmission over bandlimited channels can take considerable time even for compressed bit-streams. It is also useful in content-based retrieval and browsing applications, where the number of images viewed tends to be large and faster interpretability would imply faster rejection of unwanted partially decompressed images. An importance prioritised image coder incorporated within the JPEG 2000 framework, called IMP-J2K, is presented to prioritise image contents according to a metric based on their ‘importance’. The performance of IMP-J2K is also quantitatively assessed using the objective peak signal-to-noise ratio quality measure and the subjective national imagery interpretability rating scale.

© 2004 Elsevier B.V. All rights reserved.

Keywords: Image coding; Interpretability; JPEG 2000; NIIRS; Region of interest (ROI)

*Corresponding author. Tel.: +61-7-3864-1608; fax: +61-7-3864-1516. E-mail addresses: v.chandran@qut.edu.au (V. Chandran), s.sridharan@qut.edu.au (S. Sridharan).

0923-5965/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.image.2004.08.005

1. Introduction

The interpretability of an image is important in applications such as the transmission and analysis of satellite and aerial surveillance imagery, where particular regions in an image may be of higher importance than others. Typical surveillance applications may require the use of the imagery to support the decision processes in various levels of strategic, operational and tactical tasks [14]. Often this involves well defined visual tasks requiring the detection of specific targets or simply the recognition of image contents. In addition to the criterion of functional image interpretability, surveillance operations may involve large image formats and be used in environments where the transmission over bandlimited channels can take considerable time even for compressed bit-streams. The same is true for content-based retrieval and browsing applications, where the number of images viewed tends to be large and faster interpretability would imply faster rejection of unwanted partially decompressed images. Therefore, it is important to constrain the transmission of an image to certain parts of the image to speed up this process. Importance prioritised coding, as the name suggests, aims to encode and prioritise regions of importance (more commonly referred to as regions of interest (ROI)) such that improved interpretability can be achieved for partially decoded images.

An example of importance prioritised coding applied to the ‘Lena’ image is shown in Fig. 1. The image on the left was encoded using the usual distortion (or objective quality) versus bit-rate prioritisation strategy. Typical objective quality metrics include mean square error (MSE) and peak signal-to-noise ratio (PSNR); these measures treat all impairments as equally important and do not correlate well with image quality and the human visual system (HVS) processes [3]. Importance prioritised coding, on the other hand, aims to prioritise image contents by using a metric based on the region’s importance. An example of this is shown in Fig. 1(b), where the prioritisation of the face region ahead of its surroundings considerably improves the interpretability of Lena’s face. In such a case, the ROI was defined to be the face, which was considered important for interpretability, and was used to constrain the transmission of the image with a certain level of priority to the ROI. The level of priority can be defined using an ‘importance’ metric to allow users at the encoding end to specify the degree of importance of the ROI relative to other regions in the image. The location of the ROI and its degree of importance (or importance score) define what is called an importance map [6].

Fig. 1. Reconstructed ‘Lena’ image at 0.125 bits per pixel (bpp) for (a) default and (b) importance prioritised coding strategies.

The level of interpretability is commonly assigned using a subjective rating scale called the national imagery interpretability rating scale (NIIRS) [7]. It is a task-based numerical scale used by expert image analysts to measure the potential interpretability of images. NIIRS provides a means to directly relate the quality of an image to the interpretation tasks for which it may be used. NIIRS is strongly based on spatial information such that at a higher NIIRS, more detailed information can be extracted from the image. NIIRS can also be used to describe the interpretability of regions in an image such as ROIs. The term interpretability in fact describes the level of detail, but this is in contrast to definitions provided in [16], where the level of detail (LoD) is defined as a combination of several low-level parameters such as resolution, coefficient precision (or quality) and colour, which are used to describe the transmission state of ROIs. These parameters alone are not sufficient to describe interpretability and require higher-level descriptors such as NIIRS. In this paper, NIIRS is used to describe the transmission state of ROIs and to analyse the influence of ROI prioritised coding on interpretability.

The paper describes a ROI prioritised coding technique developed within the JPEG 2000 framework, called importance prioritised JPEG 2000 (IMP-J2K). The technique makes use of a highly desirable feature of the JPEG 2000 coding scheme, which allows the flexible and customised prioritisation of ROIs by selecting and distributing the bit allocation during the JPEG 2000 code-stream construction. This provides the user at the encoder end with great flexibility in the formation of the JPEG 2000 code-stream. Moreover, IMP-J2K generates legal JPEG 2000 code-streams and does not require the encoding of the ROI or its importance score. Section 2 presents an overview of JPEG 2000 with emphasis on features that are important to the development of IMP-J2K. Section 3 presents the technical aspects of IMP-J2K, followed by the objective quality and subjective interpretability assessments, which are described in Sections 4 and 5, respectively. Based on the assessments, Section 6 presents a methodology for the determination of a region’s importance score given a priori specification of a desired region and background interpretability performance. Conclusions are drawn in Section 7.

2. JPEG 2000 overview

JPEG 2000 is an international standard for image coding [18]. The coding scheme uses a wavelet-based (discrete wavelet transform (DWT)) bit-plane coding technology as shown in Fig. 2. Only the pertinent features of JPEG 2000 that are used to develop IMP-J2K are discussed in this section.

Fig. 2. JPEG 2000 (a) encoder and (b) decoder (Adapted from Ming [8]).
[Block labels. Encoder: Input Image, DWT, Bit-plane Modeling, Entropy Coding, PCRD Optimisation, Bit-stream Formation, Embedded Bit-stream. Decoder: Embedded Bit-stream, Entropy Decoding, Inverse DWT, Reconstructed Image.]

The DWT is first performed on the original input image and generates sub-bands of wavelet coefficients at a number of resolution levels that describe the horizontal and vertical spatial frequency characteristics of the input image. Two types of wavelet filters are defined in JPEG 2000. A Daubechies (9, 7) filter provides an irreversible transform for lossy coding, while the reversible (5, 3) filter is defined for lossless coding. An example DWT applied to the ‘Lena’ image is shown in Fig. 3. The transform results in four new sub-bands at each level of decomposition, namely, an approximation sub-band at low resolution, LL, and three directionally sensitive detail sub-bands: LH (horizontal image features, vertically high pass), HL (vertical features, horizontally high pass), and HH (diagonal features, horizontally and vertically high pass).

Fig. 3. Example DWT with corresponding notation: (a) original image, (b) one level, and (c) two level DWT.
[Sub-band labels. (a) LL0. (b) LL1, HL1, LH1, HH1. (c) LL2, HL2, LH2, HH2, HL1, LH1, HH1.]

The DWT sub-band coefficients are scalar quantised and partitioned into rectangular blocks, called code-blocks. These code-blocks are independently bit-plane encoded, one bit-plane at a time, starting from the most significant bit-plane (MSB) and ending with the least significant bit-plane (LSB). Within each bit-plane, coefficient bits are processed with a maximum of three ‘sub-bit-plane’ coding passes, namely, the significance propagation, magnitude refinement, and clean-up passes, as shown in Fig. 4. The significance propagation pass is the first pass in a new bit-plane; it encodes the bit at each location that is not yet significant but has at least one significant neighbour among its eight connected neighbours. A location is significant if a ‘1’ has been encoded for that location in the current or previous bit-planes. When a ‘1’ bit is encoded, its sign bit is coded immediately afterwards. The second pass is the magnitude refinement pass, in which all bits from locations that became significant in the previous bit-planes are encoded. The third and final pass is the clean-up pass, which takes care of any bits not encoded in the first two passes. Each of these passes collects contextual information about the bit-plane data, is arithmetically encoded, and is associated with a distortion and bit-rate measure, which are used for rate control.

Fig. 4. Embedded code-block bit-stream representations associated with various sub-bit-plane coding passes and quality layers. The number of coded bytes in each sub-bit-plane coding pass is dependent on the bit-plane data being encoded.
[Diagram labels. Code-block bit-streams: block 1 to block 6. Bit-planes: BP 1 (MSB) to BP 5. Coding passes: Significance (S), Refinement (R), Clean-up (C). Quality layers: layer 1 to layer 4.]

Once all the code-blocks have been compressed, a post-compression rate-distortion (PCRD) optimisation operation passes over all the compressed code-blocks to generate the final bit-stream. Through this rate control, code-block sub-bit-plane coding passes are ordered in the final code-stream as a succession of layers. Each layer contains the additional sub-bit-plane coding passes from each code-block such that each layer will incrementally improve the overall image quality for the entire image at full resolution. The operation is based upon the rate-distortion curve and determines the extent to which each code-block’s bit-stream should be truncated in order to achieve a particular target bit-rate. The data with the highest distortion reduction per average bit of compressed representation should be encoded first. The only restrictions are that truncation points must occur at the end of a sub-bit-plane coding pass, and that these coding passes must appear in a causal order starting from the most significant bit-plane for a given code-block. It has been shown that the sequencing of the three sub-bit-plane coding passes maintains a decreasing distortion reduction per average bit of compressed representation [15,19].

The first quality layer (i.e. ‘layer 1’ in Fig. 4) is formed from the optimally truncated code-block bit-streams such that the target bit-rate achieves the highest possible quality in terms of minimising MSE. Each subsequent layer is then formed by optimally truncating the code-block bit-streams to achieve higher target bit-rates, and thus higher image quality. The number of sub-bit-plane contributions in any given layer differs from code-block to code-block, and depends on the distortion (or error) contributions from the sub-bit-plane coding passes associated with these code-blocks. These contributions to layers are flexible and arbitrary, and it is left to the implementer of the encoder to adopt different types of prioritisation strategies. A more complete description of JPEG 2000 and the layered bit-stream formation can be found in [18].
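The layer formation described above can be illustrated with a minimal Python sketch. It is not the PCRD algorithm of the standard: it assumes each code-block's passes already have non-increasing distortion-reduction-per-byte slopes, and it selects truncation points with a plain slope threshold per layer. The function names (`truncation_point`, `form_layers`) and all rate and distortion numbers are hypothetical.

```python
from typing import List, Tuple

# One coding pass is (bytes_added, distortion_reduction). Passes are listed in
# causal order for one code-block. All numbers below are invented.
Pass = Tuple[int, float]


def truncation_point(passes: List[Pass], slope_threshold: float) -> int:
    """Return how many leading passes of one code-block to keep.

    A pass is kept while its distortion reduction per byte is at least the
    threshold. Stopping at the first failing pass preserves causal order and
    assumes the slopes are already non-increasing.
    """
    kept = 0
    for rate, delta_d in passes:
        if delta_d / rate < slope_threshold:
            break
        kept += 1
    return kept


def form_layers(blocks: List[List[Pass]], thresholds: List[float]) -> List[List[int]]:
    """For each layer (thresholds taken in descending order), return the
    cumulative truncation point of every code-block."""
    ordered = sorted(thresholds, reverse=True)
    return [[truncation_point(b, t) for b in blocks] for t in ordered]


blocks = [
    [(10, 400.0), (20, 300.0), (40, 100.0)],  # slopes 40, 15, 2.5
    [(10, 100.0), (25, 50.0)],                # slopes 10, 2
]
layers = form_layers(blocks, thresholds=[20.0, 5.0, 1.0])
print(layers)
```

Each inner list gives, for one layer, the number of passes contributed so far by each code-block; lowering the threshold admits passes with smaller slopes, which is how later layers refine the image.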

2.1. Image content prioritisation in JPEG 2000

By default, JPEG 2000 places in each layer the sub-bit-planes (among all sub-bit-planes not yet included in previous layers) with the steepest rate-distortion slope. This implementation minimises the MSE at each point in the embedded bit-stream. This prioritisation strategy or ordering convention is termed layer-oriented. The standard also supports resolution-oriented and spatially oriented sequences.

It is also possible to modulate the distortion estimates of the code-block sub-bit-plane coding passes to make them consistent with the visual masking properties and/or ROI characteristics [18]. The latter is referred to as the Implicit ROI coding method and is of more interest to this paper. The prioritisation of regions is achieved by modulating the distortion cost function of code-blocks that contain the ROI by arbitrary weights. This renders ROI code-blocks more important, and their embedded bit-streams will be sequenced into the code-stream in a manner that effectively elevates the priority of their corresponding spatial region. By modulating the distortion estimates, neither the wavelet coefficients nor the quantisation step sizes need to be altered. As such, there is no impact on the number of bit-planes that must be processed for the decoding of the image. One disadvantage, however, is that ROI adjustments can only be made on a code-block by code-block basis, rather than at a coefficient level. ROI reconstructions are not as localised as those reconstructed using coefficient scaling methods such as the Max-shift ROI coding method, which is discussed below. The distortion modulation concept also allows for the coding of multiple ROIs with different weights of importance, but its description and implementation in [17,18] are constrained to the specification of ROIs using a single degree of importance for all ROIs. The method is most effective when working with large images and relatively small code-block dimensions [18].

JPEG 2000 also specifies another ROI coding mode called the Max-shift method to prioritise all ROIs ahead of the background. Max-shift is a coefficient scaling method based on the scaling (or shifting) of ROI coefficients such that all ROI bit-planes are placed higher than those belonging to the background. This provides an implicit means for the decoder to identify the location of the ROI and thus no ROI shape information is required to be encoded. The number of ROI bit-plane shifts is referred to as the scaling (or up-shift) value, s. During the bit-plane coding of coefficients, the ROI bit-planes will be encoded and placed in the code-stream before those associated with the background. The progressive decoding of the code-stream will thus improve the ROI (at a coefficient level) rapidly with little change in the background, until all the ROI information has been received. No flexibility exists in controlling the degree of importance between the ROI and background. If multiple ROIs are to be identified and each ROI is to be given a different degree of importance, then the shifting strategy would have to be such that no bit-planes associated with a given ROI intersect with those belonging to another. In such a case, the number of magnitude bit-planes may be as large as (number of ROIs) × (the original bit-depth) [2]. Whether single or multiple ROIs are used, there is an increase in the number of bit-planes to be encoded and this correspondingly results in an increased code-stream bit-rate. The decoder, in turn, must be capable of processing this increased dynamic range if the background is to be fully decoded. In practice, if multiple ROIs are defined then each ROI would have the same scaling value and hence the same degree of importance.
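The coefficient scaling idea behind Max-shift can be sketched in a few lines of Python. This is an illustration on a flat list of integer (quantised) coefficients, not the code-stream procedure of the standard. The function names and sample values are hypothetical, and the choice of s as the bit length of the largest background magnitude is an assumption made so that every non-zero ROI magnitude ends up above every background magnitude.

```python
from typing import List, Tuple


def max_shift_encode(coeffs: List[int], roi_mask: List[bool]) -> Tuple[List[int], int]:
    """Scale ROI coefficient magnitudes above every background bit-plane.

    Assumption: s is the bit length of the largest background magnitude,
    so 2**s exceeds every background magnitude.
    """
    background = [abs(c) for c, in_roi in zip(coeffs, roi_mask) if not in_roi]
    s = max(background, default=0).bit_length()
    shifted = [
        (abs(c) << s) * (1 if c >= 0 else -1) if in_roi else c
        for c, in_roi in zip(coeffs, roi_mask)
    ]
    return shifted, s


def max_shift_decode(shifted: List[int], s: int) -> List[int]:
    """Recover the coefficients from s alone: any magnitude >= 2**s is ROI."""
    restored = []
    for c in shifted:
        magnitude = abs(c)
        if magnitude >= (1 << s):
            magnitude >>= s  # undo the up-shift for ROI coefficients
        restored.append(magnitude if c >= 0 else -magnitude)
    return restored


coeffs = [5, -3, 12, 0, -7, 2]
roi_mask = [False, False, True, True, True, False]
shifted, s = max_shift_encode(coeffs, roi_mask)
restored = max_shift_decode(shifted, s)
print(s, shifted, restored)
```

The decoder never sees `roi_mask`, which mirrors the point made above that no ROI shape information has to be encoded. The sketch also makes the cost visible: the shifted magnitudes need s additional bit-planes.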

A detailed description and comparison of Max-shift, Implicit, and other ROI coding methods in JPEG 2000 are presented in [10]. Only the Max-shift and Implicit methods produce compliant JPEG 2000 code-stream syntax that can be decoded by generic JPEG 2000 decoders. These coders are said to be JPEG 2000 ‘Part 1’ compliant [5]. This property is important for interoperability, especially since the ‘baseline’ standard is expected to proliferate and be widely used in many imagery systems and processes. This paper only considers JPEG 2000 Part 1 compliant coding schemes.

Following is a description of the proposed JPEG 2000 Part 1 compliant importance prioritised JPEG 2000 (IMP-J2K) image coder. The method extends the concept and implementation in [17] to allow for the specification of an ‘importance’ map, which can define multiple ROIs and different importance scores (or weights of importance). The flexibility in defining the importance map allows for greater control over how ROIs are encoded and decoded.

3. Importance prioritised JPEG 2000 (IMP-J2K) image coder

IMP-J2K provides yet another method for prioritising image content and is based on ordering the code-block sub-bit-planes in a way that yields an ‘importance’ ordered bit-stream. The manipulation of the bit-stream ordering may be efficiently achieved by modulating the distortion estimates of sub-bit-plane coding passes such that they are consistent with a priori knowledge of region importance. An importance map models the region’s importance and provides an image map that assigns importance scores to each pixel region. This provides a useful basis for the prioritisation of image contents according to any importance map specification. IMP-J2K uses the same principle of reordering code-block contributions to prioritise ROIs as the Implicit method, and extends the framework described and implemented in [17,18] to accommodate the use of an importance map that can specify multiple ROIs and arbitrary importance scores. The IMP-J2K importance mapping and prioritisation sub-systems within the overall framework of JPEG 2000 are shown in Fig. 5. The method supports a host of useful features, including:

• Multiple and arbitrary shaped regions, which can be defined at different spatial locations, sub-band orientations and scales.

• Arbitrary importance scores may be assigned to different regions so that regions may be prioritised by different amounts.

• Progressive lossy to lossless reconstruction of both regions and background.

• No importance map (or ROI coefficient addressing) information or importance scores need to be signalled to the decoder for reconstruction.

• Final code-stream maintains the JPEG 2000 bit-stream syntax and can be decoded by generic JPEG 2000 decoders, as shown in Fig. 5(b).

The last feature is very desirable and implies that IMP-J2K does not require the encoding or decoding of the ROI or its degree of importance, and hence no extra storage overhead is incurred with the prioritisation method. Only a computational overhead is incurred with the importance mapping and prioritisation of wavelet coefficients. The features listed above provide a flexible framework for the prioritisation of image contents.

Fig. 5. Importance prioritised JPEG 2000 (IMP-J2K) (a) encoder and (b) decoder.
[Block labels. Encoder: Input Image, DWT, Importance Mapping, Bit-plane Modeling, Entropy Coding, Importance Prioritisation, PCRD Optimisation, Bit-stream Formation, Embedded Bit-stream. Decoder: Embedded Bit-stream, Entropy Decoding, Inverse DWT, Reconstructed Image.]

3.1. Importance mapping of wavelet coefficients

In order to determine the importance scores (or degree of importance measures), $I_i$, for each code-block, $B_i$, an importance mapping tool is required to determine the importance score for each and every coefficient in all sub-band orientations and scales, similar in structure to that generated by the DWT of JPEG 2000. The hierarchical structure of the wavelet sub-band importance maps (i.e. image maps of importance scores), including the importance map at the highest resolution, forms the scale-space importance map pyramid. The grouping of sub-band importance scores into code-blocks forms $I_i[k]$, which is used for the modulation of the distortion cost function (see Section 3.2).

A scale-space importance map pyramid may be systematically generated in a number of ways. One such method propagates the importance scores belonging to all pixel locations of the importance map at the highest resolution through the wavelet decomposition for all sub-band orientations and scales. This involves tracing and identifying which wavelet coefficient belongs to which image pixel location, and then assigning the corresponding importance score to the coefficient. In the event of a coefficient affecting a number of pixel locations at full resolution, the maximum of all corresponding pixel importance scores is assigned to the given coefficient.

The scale-space importance map above will generate a ‘lossless’ importance map, such that all wavelet coefficients affecting the reconstruction of ROIs in any way would be included in the map. A ‘lossy’ importance map, on the other hand, only includes the coefficients that affect the ROI the most. That is, not all coefficients affecting the ROI are emphasised and thus the lossless reconstruction of ROIs will not be achieved until a later bit-rate than with lossless importance maps. An example lossy importance map may be generated by taking the maximum of 2 × 2 block neighbourhoods to produce the map at the next lower scale. Other sampling techniques may be used to generate other lossy importance maps.

Although the scale-space importance map pyramids discussed above provide a hierarchy of importance maps derived from the importance map at the highest resolution, any importance mapping scheme may be used to assign importance scores to different spatial locations, sub-band orientations and scales. Note that the construction of the scale-space importance map is only a computational overhead, and the map does not need to be encoded or transmitted.

¹ IMP-J2K is equivalent to the Implicit ROI coding method when a single ROI is prioritised.
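The ‘lossy’ pyramid construction mentioned above, taking the maximum over 2 × 2 neighbourhoods to obtain the map at the next lower scale, can be sketched as follows. This is a minimal illustration in plain Python; the function names and the example map are hypothetical, and reusing the same map for every sub-band orientation at a given level is an assumption of the sketch rather than something stated in the text.

```python
from typing import List

Map2D = List[List[float]]


def downsample_max(imap: Map2D) -> Map2D:
    """Halve each dimension by taking the maximum over 2x2 neighbourhoods.

    Odd-sized maps are handled by letting the last block be narrower.
    """
    rows, cols = len(imap), len(imap[0])
    coarse = []
    for r0 in range(0, rows, 2):
        coarse_row = []
        for c0 in range(0, cols, 2):
            block = [
                imap[r][c]
                for r in range(r0, min(r0 + 2, rows))
                for c in range(c0, min(c0 + 2, cols))
            ]
            coarse_row.append(max(block))
        coarse.append(coarse_row)
    return coarse


def importance_pyramid(imap: Map2D, levels: int) -> List[Map2D]:
    """Return [full-resolution map, level-1 map, ..., level-`levels` map]."""
    pyramid = [imap]
    for _ in range(levels):
        pyramid.append(downsample_max(pyramid[-1]))
    return pyramid


# Background score 1, a central ROI with score 8 (invented values).
full_res = [
    [1, 1, 1, 1],
    [1, 8, 8, 1],
    [1, 8, 8, 1],
    [1, 1, 1, 1],
]
pyramid = importance_pyramid(full_res, levels=2)
for level, imap in enumerate(pyramid):
    print(level, imap)
```

Because the ROI in this example straddles the 2 × 2 block boundaries, the score of 8 spreads over the whole level-1 map, which illustrates how the maximum rule keeps every coarse coefficient that touches the ROI.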

3.2. Importance prioritisation of wavelet coefficients

The PCRD optimisation routine is used to allocate code-block contributions to quality layers by using a modulated rate-distortion, or more specifically rate-importance, optimisation criterion. The distortion cost function is modulated by the ROI characteristics (specified as importance scores) defined from the scale-space importance map pyramid. The modulated distortion estimate, $\tilde{D}_i^p$, contributed by the additional sub-bit-plane coding pass, $p$, for code-block, $B_i$, is obtained by weighting the unmodulated distortion estimate, $D_i^p$, by the square of the code-block’s importance score, $I_i$:

$$D_i^p = w_{b_i}^2 \sum_k \left(\hat{s}_i^p[k] - s_i[k]\right)^2, \tag{1}$$

$$\tilde{D}_i^p = D_i^p \times (I_i)^2, \tag{2}$$

where $s_i[k]$ denote the sub-band coefficients in code-block $B_i$, $\hat{s}_i^p[k]$ are the representations of these coefficients associated with all sub-bit-plane coding passes up to the coding of the $p$th sub-bit-plane, and $w_{b_i}$ is the L2-norm of the wavelet basis functions for sub-band, $b_i$, to which code-block $B_i$ belongs. By modulating the distortion estimates, neither the wavelet coefficients nor the quantisation step sizes need to be altered. As such, there is no impact on the sub-bit-plane coding passes and thus no impact on the compressed code-block bit-streams. The distortion measures merely provide a means to modulate the rate-distortion slope for rate control such that a different ordering of sub-bit-plane coding passes to layers can be achieved.

In this method of prioritisation, ROIs with a high importance score contribute a higher distortion estimate and thus a greater rate-distortion slope for rate control. Code-blocks containing ROIs will thus be rendered more important and their embedded bit-streams will be sequenced into the final code-stream earlier than those associated with less important regions. The background importance score is set to 1 such that no emphasis is placed on its prioritisation.

In the event that multiple ROIs of differing degrees of importance are found in a given code-block, $B_i$, the maximum importance score from all the sub-band samples in $B_i$ will be assigned as the code-block importance score, $I_i$:

$$I_i = \max_k\left(I_i[k]\right). \tag{3}$$
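Eqs. (1) to (3) can be followed on a toy code-block with the short Python sketch below. It is only an illustration: the function names, the coefficient values, the partial reconstructions and the unit basis norm are all invented for the example and are not taken from the paper's experiments.

```python
from typing import Sequence


def pass_distortion(coeffs: Sequence[float], approx: Sequence[float], basis_norm: float) -> float:
    """Eq. (1): squared error between the coefficients and their representation
    after pass p, weighted by the squared L2-norm of the sub-band basis."""
    return basis_norm ** 2 * sum((a - s) ** 2 for a, s in zip(approx, coeffs))


def block_score(sample_scores: Sequence[float]) -> float:
    """Eq. (3): the code-block takes the largest importance score of its samples."""
    return max(sample_scores)


def modulated_distortion(distortion: float, score: float) -> float:
    """Eq. (2): weight the distortion estimate by the squared importance score."""
    return distortion * score ** 2


coeffs = [10.0, -4.0, 7.0, 0.0]   # s_i[k], hypothetical values
approx = [8.0, -4.0, 6.0, 0.0]    # representation after pass p, hypothetical
sample_scores = [1, 1, 4, 16]     # I_i[k]: background plus two ROIs

d = pass_distortion(coeffs, approx, basis_norm=1.0)
score = block_score(sample_scores)
d_mod = modulated_distortion(d, score)
print(d, score, d_mod)
```

Since the weight $(I_i)^2$ is constant within a code-block, the distortion reduction of every pass in that block, and therefore its rate-distortion slope, is scaled by the same factor, while the coded bytes themselves are untouched.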

IMP-J2K is a generalisation of both the baseline JPEG 2000 (when all ROIs and the background have the same importance score) and the Implicit method (when all ROIs have the same importance score).

4. Objective quality assessments

This section shows the flexibility of IMP-J2K in providing image content prioritisation when using single¹ and multiple ROIs.

4.1. Single ROI prioritisation

Three test images, comprising ‘Baghdad’ (satellite), ‘Duntroon’ (aerial), and ‘Cafe’ (JPEG 2000 test image), as shown in Fig. 6, were used for the following experiments. A single rectangular ROI was marked on each image and was chosen to be approximately 6–8% of the image area. The ROI locations were independent from one image to another.

Fig. 6. IMP-J2K test images and hand-marked ROI. (a) Baghdad (3089 × 4193, 8 bpp), (b) Cafe (2048 × 2560, 8 bpp), and (c) Duntroon (6080 × 4756, 8 bpp).

IMP-J2K code-streams were generated for each test image for a number of ROI importance scores (2, 4, 8, 16, 32, 64, 128, and 8192). Scores that are powers of 2 were used so that they can be related to the scaling value used in coefficient scaling methods such as Max-shift, which represents the equivalent number of ROI bit-plane shifts that would be used to prioritise the ROI ahead of the background. A useful guideline to the relationship between the importance score, $I_i$, and scaling value, $s$, for all coefficients contained in ROI code-blocks is given by

$$s = \log_2(I_i). \tag{4}$$

Note that an importance score of 1 indicates that no scaling of coefficients has taken place. For ease of reading, the remainder of the paper refers to the importance score, $I_i$, as Score, to denote the importance of code-blocks that affect the reconstruction of ROIs.
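As a small worked example of Eq. (4), the Python sketch below maps the importance scores used in the experiments (plus the background score of 1) to their equivalent scaling values. The helper name `scaling_value` is hypothetical.

```python
import math


def scaling_value(score: float) -> float:
    """Eq. (4): equivalent number of bit-plane shifts for an importance score."""
    return math.log2(score)


scores = [1, 2, 4, 8, 16, 32, 64, 128, 8192]
shifts = {score: scaling_value(score) for score in scores}
for score, s in shifts.items():
    print(f"Score = {score:5d}  ->  s = {s:g}")
```

One way to read the guideline: weighting a squared-error distortion by $(I_i)^2$, as in Eq. (2), has the same effect on the distortion as multiplying the coefficient amplitudes by $I_i$, and multiplying by $I_i = 2^s$ corresponds to an up-shift of $s$ bit-planes.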

In addition to the IMP-J2K code-streams, the default JPEG 2000 and Max-shift code-streams were also generated to show the two extreme cases of ROI prioritisation in JPEG 2000. The default JPEG 2000 (equivalent to IMP-J2K with Score = 1) is the case where no ROI prioritisation is performed, while Max-shift using $s = 13$ (equivalent to Score = 8192 = $2^{13}$) refers to the case where all of the ROI is prioritised before the background.

In all cases, the default Daubechies (9/7) filter with a four-level DWT was used. Eleven quality layers were used to provide progressive encoding: ten layers at octave-spaced bit-rates between approximately $2^{-9}$ and 1 bit per pixel (bpp), and an extra layer to include all the compressed bits in the code-stream. The code-block size, however, was chosen such that each coding method was not disadvantaged in terms of its rate-distortion performance. Studies have shown that the preferred code-block size for the default JPEG 2000 and Max-shift is 64 × 64 [18], and IMP-J2K has a preferred code-block size of 32 × 32 [11]. The smaller code-block size used for IMP-J2K allows for the efficient prioritisation of ROIs at the code-block level.

Fig. 7. Performance curves for selected ROI importance scores. ROI and background curves are those above and below the default JPEG 2000 curve, respectively. [Plot of PSNR (dB) against bit-rate (bits per pixel, logarithmic axis) for default JPEG 2000, IMP-J2K (Score = 4, 16, and 64), and Max-shift.]

All the compressed code-streams were subsequently decoded at the quality-layered bit-rates, and the average PSNR, which is the PSNR computed from the average of the MSE over all test images, was computed for the ROI and background. The PSNR performance curves are plotted in Fig. 7 and show that the ROI performance is dependent on the ROI importance score. The larger the importance score, the larger the difference in PSNR between the ROI and background. These curves are bounded by the two ROI coding extremes, namely, Max-shift and default JPEG 2000. With the default JPEG 2000, the ROI and background performances are similar since no ROI emphasis has been introduced into the coding. Max-shift, on the other hand, produces the fastest ROI reconstruction since all ROI bits are encoded before those belonging to the background. The consequence is a very poor background performance. One may notice, however, that the ROI in the Max-shift method seems to be still increasing in quality while the background also starts to refine between 0.25 and 0.5 bpp. This may seem to contradict the operation of the method, but it arises because there exists a bit-rate between 0.25 and 0.5 bpp at which the ROI reaches full quality and the background begins to refine. If a greater number of layers were used, one might be able to observe the bit-rate at which the ROI reaches full quality.
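The averaging used for the PSNR curves above (PSNR of the mean MSE, rather than the mean of per-image PSNR values) can be sketched as follows. This is only an illustrative sketch: the function names, the 8-bit peak value of 255, and the MSE values are assumptions and do not come from the experiments.

```python
import math


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for a mean squared error (8-bit peak value assumed)."""
    if mse <= 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


def average_psnr(mse_values, peak=255.0):
    """PSNR computed from the average of the per-image MSE values."""
    mean_mse = sum(mse_values) / len(mse_values)
    return psnr_from_mse(mean_mse, peak)


# Hypothetical per-image ROI MSE values at one quality layer.
roi_mse = [40.0, 65.0, 90.0]
avg = average_psnr(roi_mse)
mean_of_psnrs = sum(psnr_from_mse(m) for m in roi_mse) / len(roi_mse)
```

Because PSNR is a convex function of the MSE, the PSNR of the mean MSE is never larger than the mean of the individual PSNR values, so this averaging gives more weight to poorly reconstructed images.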

It is also important to observe how the ROI performance curves are clustered with little variation, especially at low bit-rates, while the background performance curves vary more widely. This is due to IMP-J2K having a ROI with a larger spatial extent that is bounded by the code-blocks, sub-band and resolution that contain ROI coefficients (see Fig. 8). This larger ROI extent consequently requires more bits for its prioritisation and would require a higher bit-rate to encode the ROI to a given PSNR value. The clustering of the curves, however, occurs when all the bits required to encode the ROI have more or less been exhausted and little improvement in ROI performance can be gained for the given bit-rate regardless of the ROI importance score. The bulk of the quality improvement at this stage of prioritisation, however, occurs not in regions belonging to non-ROI code-blocks (i.e. background) but in regions belonging to non-ROI coefficients from ROI code-blocks. This is not a limitation if localised context surrounding the ROIs is also of importance. This effect can be minimised by choosing a ROI size and location that are aligned along the code-block boundaries [11]. IMP-J2K may also be enhanced, and the constraints on ROI size and location removed, by further weighting ROI code-block importance scores by the proportion of ROI coefficients that belong to the corresponding code-block, similar to the approach described in [20]. In such an approach, code-blocks containing both ROI and non-ROI coefficients would not be emphasised as much as code-blocks containing only ROI coefficients.

Fig. 8. Spatial extent of ROI when using IMP-J2K. ROIs effectively occupy a larger spatial extent, bounded by the code-blocks, sub-band and resolution that contain ROI coefficients. [Diagram legend: ROI coefficients in ROI code-blocks; non-ROI coefficients in ROI code-blocks; non-ROI code-blocks (i.e. background).]
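The fractional weighting idea mentioned above can be illustrated with a small sketch. The text does not specify the weighting formula, so the linear blend between the background score and the ROI score used here is purely an assumption, as are the function names and the example mask.

```python
def roi_fraction(mask_block):
    """Fraction of coefficients in a code-block that belong to the ROI.

    mask_block is a 2-D list of booleans (True = ROI coefficient).
    """
    total = sum(len(row) for row in mask_block)
    inside = sum(1 for row in mask_block for value in row if value)
    return inside / total


def weighted_block_score(roi_score, fraction, background_score=1.0):
    """Assumed linear blend: full ROI score only when the block is all ROI."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    return background_score + fraction * (roi_score - background_score)


# Hypothetical 4 x 4 code-block in which only the first row is ROI.
block = [[True] * 4] + [[False] * 4 for _ in range(3)]
fraction = roi_fraction(block)
score = weighted_block_score(16, fraction)
```

With this assumed blend, a code-block that only grazes the ROI receives a score close to the background score, whereas a block made up entirely of ROI coefficients keeps its full importance score.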

4.2. Multiple ROI prioritisation

The image used to illustrate multiple ROI prioritisation is shown in Fig. 9. The image is a 4096 × 4096 pixel partition from a much larger ‘Carpark’ aerial image (8800 × 18566, 8 bpp). Three arbitrarily shaped ROIs were marked, namely ROI 1, ROI 2, and ROI 3, which covered approximately 22% of the total image area. The ROIs do not represent any particular region of significance, but were purely chosen to illustrate IMP-J2K’s multiple ROI coding capability.

Fig. 9. Selected ROIs from a 4096 × 4096 partition of ‘Carpark’ aerial image.

Fig. 10. IMP-J2K results for ‘Carpark’ image: (a) ROI only, and (b) ROI and lowest resolution sub-band prioritisation. [Plots of PSNR (dB) against bit-rate (bits per pixel, logarithmic axis) for Region 1 (Score = 64), Region 2 (Score = 32), Region 3 (Score = 16), Background (Score = 1), and the entire image.]

The ROIs were assigned arbitrary importance scores as given by Eq. (5). The highest importance score of 64 was assigned to ROI 1, indicating that the region should be encoded with the highest priority. ROI 2 was assigned intermediate priority with a score of 32, followed by ROI 3 with a score of 16. The background’s importance was set to the default score of 1. These importance specifications generate the importance map at the highest resolution. The lossless importance mapping approach was used to generate the scale-space importance map, which was subsequently used to modulate the distortion cost function for prioritisation.

\[
I'_{\mathrm{Carpark}} =
\begin{cases}
64 & \text{Region 1 (ROI 1)},\\
32 & \text{Region 2 (ROI 2)},\\
16 & \text{Region 3 (ROI 3)},\\
1 & \text{Background (BG)}.
\end{cases}
\tag{5}
\]

The image was encoded using 30 quality-layered bit-rates ranging from 0.0005 bpp to the final bit-rate where all compressed bits have been included in the code-stream. A five-level DWT using the default Daubechies (9/7) filter was performed on the original image and code-block sizes were constrained to 16 × 16 samples, which was purposely chosen to be small enough to demonstrate the efficient spatial refinement of ROIs. Better overall compression performance, however, can be achieved by using a code-block dimension of 32 × 32 [11].

The resultant code-stream was decoded at each of the quality-layered bit-rates and PSNR values were calculated at each bit-rate for each ROI, the background, and the entire image. Fig. 10(a) shows the PSNR performance for the ‘Carpark’ image. The ROIs are emphasised efficiently and reconstructed at a higher quality than the background. Furthermore, the prioritisation of ROIs is in order of importance with ROI 1 encoded with the best quality, followed by ROI 2, ROI 3, then the background. The larger the ROI’s importance score, the faster the ROI’s rate of refinement.

Fig. 11(a) shows an example decoded ‘Carpark’ image at 0.003 bpp. Notice that the hand-marked regions shown in Fig. 9 were arbitrarily shaped, but the reconstructed regions are blocked. This is because the ROIs are reconstructed at a code-block level and not at a coefficient level. As explained earlier for the case of single ROI prioritisation, a number of methods can be adopted to minimise this, such as aligning the ROI along the code-block boundaries, or weighting code-block importance scores by the fraction of ROI coefficients contained in the code-block. Another way of improving the ROI definition while still allowing multiple ROI prioritisation is to use a hybrid of a coefficient scaling method (such as Max-shift) and IMP-J2K [9]. This better ROI localisation, however, comes at the expense of an increased final code-stream bit-rate due to the increase in dynamic range caused by the shifting of coefficient bits.

Fig. 11. Decoded IMP-J2K images at 0.003 bits per pixel for ‘Carpark’ image: (a) ROI only, and (b) ROI and lowest resolution sub-band prioritisation.

Referring to Fig. 11(a), regions belonging to non-ROI code-blocks were not decoded at the given bit-rate, and thus no background contextual information is available to aid the interpretation of regions outside the ROIs. The usual approach that can be adopted to enhance these regions is to weight the lower resolution sub-bands of the wavelet decomposition by an importance score higher than that assigned to the background but usually the same as or lower than those assigned to the ROIs. This allows a low resolution background to be encoded along with the ROIs. The problem with this approach is that the low resolution sub-band importance score will also apply to all ROI code-blocks in these sub-bands. This can adversely affect the ROI, either by de-emphasising the low-resolution code-blocks that contain ROIs when using a low importance score compared to the ROIs, or by over-emphasising the low resolution background if a higher importance score is used [11]. A more appealing approach would be to weight only non-ROI code-blocks in the lower resolution sub-bands, while leaving ROI code-blocks with their assigned importance score. With this approach, the modified importance score specifications for the ‘Carpark’ image could be given by

\[
I''_{\mathrm{Carpark}} =
\begin{cases}
64 & \text{Region 1 (ROI 1)},\\
32 & \text{Region 2 (ROI 2)},\\
16 & \text{Region 3 (ROI 3)},\\
8 & \text{Non-ROI code-blocks in lowest resolution bands},\\
1 & \text{Background (BG)}.
\end{cases}
\tag{6}
\]
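The importance specification of Eq. (6) can be read as a simple per-code-block lookup, sketched below. The region labels, the function name, and the assumption that each code-block has already been assigned a single region label are illustrative choices, not part of IMP-J2K as described here.

```python
# Importance scores from Eq. (6); the dictionary keys are hypothetical labels.
CARPARK_SCORES = {"ROI 1": 64, "ROI 2": 32, "ROI 3": 16, "BG": 1}
LOW_RES_CONTEXT_SCORE = 8  # non-ROI code-blocks in the lowest resolution bands


def code_block_score(region, in_lowest_resolution_band):
    """Importance score of a code-block under the modified specification.

    Only non-ROI code-blocks in the lowest resolution bands are raised above
    the background score; ROI code-blocks keep their assigned scores.
    """
    if region == "BG" and in_lowest_resolution_band:
        return LOW_RES_CONTEXT_SCORE
    return CARPARK_SCORES[region]
```

For example, `code_block_score("BG", True)` gives 8, while `code_block_score("ROI 1", True)` stays at 64, which is the behaviour that avoids de-emphasising ROI code-blocks in the low resolution sub-bands.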

Fig. 10(b) shows the performance results for IMP-J2K where the lowest resolution sub-bands were also prioritised along with the ROIs. The ROIs, especially those with a high importance score, are still prioritised in order of importance. Moreover, the background curve can be seen to be greatly improved at low bit-rates compared to the ROI only prioritisation case. This feature produces a ‘low resolution’ background being encoded with the corresponding ‘higher resolution’ ROIs as shown in Fig. 11(b). This is a characteristic similar to human vision, where the point or region of visual attention is in high resolution compared to the periphery, which is of lower resolution [13].

5. Interpretability assessments

The subjective assessment presented in this section provides an analysis of the interpretability performance of IMP-J2K for the case where only a single ROI is prioritised. NIIRS is used to assign an interpretability rating to the ROI and background, and this is used to investigate the influence of the coding scheme on interpretability. The assessment described in this section supplements previous interpretability assessments in [12]. Multiple ROI coding and the prioritisation of background context along with the ROIs are subjects of future work.

5.1. NIIRS rating

Interpretability ratings are commonly assigned using a subjective NIIRS [7]. NIIRS is a numerical task-based scale utilised by expert image analysts to measure the potential interpretability of images. The ratings provide a means to directly relate the quality of an image to the interpretation tasks for which it may be used. The basis for NIIRS is that image analysts should be able to perform more demanding interpretation tasks with higher quality imagery.

NIIRS is composed of 10 rating levels, from 0 to 9; the higher the NIIRS rating, the higher the image interpretability. To define the interpretability at a specific NIIRS level, textual descriptors, which are referred to as NIIRS criteria, describe several interpretation tasks that can be performed by image analysts for each level. By having several criteria, an image analyst familiar with a particular criterion has other references to help understand the intended interpretability of that NIIRS level. Such NIIRS criteria exist for Visible, Civil, Radar, Infrared and Multi-spectral imagery [4].

Table 1
Example criterion per NIIRS level. (Source: Maver et al. [7])

Rating level   NIIRS criteria
0              Interpretability of the imagery is precluded by obscuration, or very poor resolution
1              Detect a medium-sized port facility
2              Detect large buildings
3              Detect trains or strings of rolling stock on railroad tracks
4              Identify individual tracks, rail pairs, control towers, switching points in rail yards
5              Identify individual railcars by type
6              Identify automobiles as sedans or station wagons
7              Identify individual rail ties
8              Identify windshield wipers on vehicles
9              Detect individual spikes in railroad ties

To illustrate the progression in level of interpretability represented with increasing NIIRS, Table 1 lists an example of a criterion used at each NIIRS level. These criteria were extracted from the Visible (military-type targets) and Civil (civil and environmental imagery) NIIRS criteria. Within Civil NIIRS, there are three additional categories, namely, ‘Natural’, ‘Agricultural’, and ‘Cultural’. These additional criteria expand the variety of image interpretation tasks to those that closely resemble the class of images under consideration. The NIIRS ratings assigned to these criteria are fractional and were statistically derived. Small differences (e.g. ±0.1 or ±0.2 NIIRS) can result from the variability in the ratings associated with these criteria [4].

5.2. Experimental procedure

The three test images shown in Fig. 6 were used for the assessment. The ‘Baghdad’ and ‘Duntroon’ images are typical of the imagery used in NIIRS evaluations, whereas the ‘Cafe’ image was less typical and was chosen as a wildcard image. The three test images contain varying levels of detail, where NIIRS(Baghdad) < NIIRS(Duntroon) < NIIRS(Cafe).

The same single rectangular ROI marked for each test image as shown in Fig. 6 was used for the assessment. The ROI locations were, to some degree, scalable in interpretability, and contained objects with varying spatial detail ranging from those that have a possible low NIIRS to those with a NIIRS rating up to the NIIRS of the original image.

Ten ROI prioritisation coding settings with varying ROI importance scores were applied to each test image, namely:

(1) Default JPEG 2000 (which is equivalent to IMP-J2K with Score = 1).
(2) IMP-J2K with Score = {2, 4, 8, 16, 32, 64, 128, and 8192 (= $2^{13}$)} to provide 8 different ROI prioritisation settings.
(3) Max-shift with s = 13 (which is equivalent to a Score = 8192 (= $2^{13}$)).

The above importance scores relate to the ROI; the background importance score is the default, with Score = 1. Other coding parameters are the same as those specified in Section 4.1, unless otherwise stated.

A varied number of quality-layered bit-rates were used for each test image. The lowest bit-rate was chosen such that the ROI or background of the reconstructed image was just interpretable (i.e. NIIRS > 0.0). This would allow the full NIIRS range (from 0.0 through to the maximum NIIRS for the image) to be rated by image analysts. In general, the larger the image, the lower the bit-rate that one can use, because the same number of encoded bits corresponds to a smaller bit-rate for a larger image. The quality layers were spaced at octave bit-rates from the minimum bit-rate for the image through to 1 bpp. The test images were compressed as follows: ‘Baghdad’, 12 quality layers (minimum bit-rate = $2^{-11}$ bpp); ‘Duntroon’, 13 quality layers (minimum bit-rate = $2^{-12}$ bpp); and ‘Cafe’, 11 quality layers (minimum bit-rate = $2^{-10}$ bpp). The encoded bit-streams were subsequently decoded into intermediate image files at the corresponding layered bit-rates to construct all the image data required for the NIIRS evaluation.

In summary, the image data sets comprised three images, each compressed using 10 different ROI prioritisation modes, at a number of bit-rates (12 for ‘Baghdad’, 13 for ‘Duntroon’, and 11 for ‘Cafe’), giving a total of 360 images to be NIIRS rated. The images were divided into 15 data sets, each comprising an average of 24 images. To keep the NIIRS ratings unbiased, the coding methods were not revealed and the sequence of images presented to the image analyst for each data set was randomised.

Four trained and qualified image analysts at the Defence Science and Technology Organisation (DSTO) completed the evaluation using the same workstation over a two and a half month period. The image analysts were free to pan and zoom the image and then NIIRS rate the ROI and background regions. Each data set was estimated to take approximately 20 to 30 min. The cultural civil NIIRS criteria [4] were used for the ratings.
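As a quick check of the experimental design above, the octave-spaced bit-rate ladders and the total number of rated images can be reproduced with a few lines. The helper name and dictionary layout are illustrative; the minimum exponents and the count of 10 coding modes are those stated in the text.

```python
def octave_bit_rates(min_exponent):
    """Bit-rates 2^min_exponent, ..., 2^-1, 2^0 bpp, one per octave."""
    return [2.0 ** e for e in range(min_exponent, 1)]


layers = {
    "Baghdad": octave_bit_rates(-11),
    "Duntroon": octave_bit_rates(-12),
    "Cafe": octave_bit_rates(-10),
}

num_modes = 10  # default JPEG 2000, eight IMP-J2K scores, and Max-shift
total_images = num_modes * sum(len(rates) for rates in layers.values())
images_per_data_set = total_images / 15
```

The ladders contain 12, 13, and 11 bit-rates respectively, so `total_images` is 360 and each of the 15 data sets holds 24 images on average.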

5.3. Experimental results and discussion

The type of image, bit-rate, and the assigned ROI importance score all affect the interpretability performance of a ROI prioritised coding scheme. Each factor is studied and discussed independently.

Fig. 12 shows the average ROI and background NIIRS ratings provided by the four image analysts for each test image. The standard error (or deviation) bars indicate the variation in ratings between the image analysts for the given image. The NIIRS rating variations between the image analysts, specified as a standard deviation, were ±0.5 for the ‘Baghdad’ image, ±0.7 for the ‘Duntroon’ image, and ±0.9 for the ‘Cafe’ image. The overall image analyst standard deviation for all the ratings provided is ±0.7 NIIRS. The standard deviation was calculated as the square root of the average MSE.

Fig. 12. Average (a) ROI and (b) background NIIRS for each test image. Standard error bars indicate the variation between the image analysts for the given image. [Bar charts of NIIRS for the ‘Baghdad’, ‘Duntroon’, and ‘Cafe’ images, one bar per image analyst (Subjects A to D); panel (a): average ROI NIIRS ratings for each image analyst (independent of bit-rate and importance score); panel (b): average BG NIIRS for each image analyst (independent of bit-rate and importance score).]

Note the consistency in the average NIIRS ratings given by each image analyst. Subject C tended to rate the highest, followed by Subject A, then Subject D, and finally Subject B. The only inconsistencies were from Subjects A and B for the ‘Duntroon’ image because a small number of data sets were not available. These data sets, in general, contained decoded ‘Duntroon’ images that were encoded using IMP-J2K at high ROI importance scores and the Max-shift. As a result, the average NIIRS ratings for Subjects A and B would be slightly lower than expected for the ROI and significantly higher than expected for the background. Below is a list of factors that may account for the differences in ratings by image analysts:

- Variations can be caused by differences in human perception and judgement.
- The images presented to image analysts contain compression artefacts and are not the usual type of images used in NIIRS evaluations.^2 As a result, the image analyst will have to adapt to the highly compressed imagery.
- Different levels of interpretation tasks may be achieved in different parts of an image, resulting in variable NIIRS ratings. The NIIRS ratings are thus constrained to particular image regions examined by the image analyst.
- Human rating errors such as a mistake in the entering of NIIRS ratings can also be a source of rating variability.

^2 Typical imagery used in NIIRS evaluations has ratings that are a function of the optics of the image sensing camera used to capture the image and the ground resolution achieved.

In the analysis that follows, all the NIIRS ratings that were provided by image analysts were used. In general, there is a high degree of consistency among image analysts, and thus the unrated NIIRS data sets will not significantly bias the results. The ratings provide a means to observe any trends that may exist in investigating the influence of ROI prioritised coding on interpretability. Any statements derived from these data sets are merely an indication of the results obtained from these experiments and would require a greater number of image analysts if these statements are to be generalised.

The impact of ROI importance score and bit-rate on NIIRS interpretability for the three test images is shown in Figs. 13–15. The average NIIRS ratings shown in the figures were calculated as an average of all NIIRS ratings for the given parameter constraint. For example, NIIRS values for each importance score (2, 4, 8, 16, 32, 64, 128, and 8192), including the default JPEG 2000 (Score = 1) and the Max-shift, were averaged across all layered bit-rates, while NIIRS values for each layered bit-rate were averaged across all importance scores. Also shown in the figures are the standard error bars indicating the variation between image analysts.

As can be seen from Figs. 13(a), 14(a), and 15(a), the ROI NIIRS increases with importance score towards that achieved by Max-shift, while the background NIIRS decreases. This is expected because the higher priority ROI is encoded ahead of the background. The extent of ROI reconstruction, and hence interpretability level, depends upon the relative degree of importance between the ROI and background as specified by the importance score. Although there are conflicting requirements for simultaneously high ROI NIIRS and background NIIRS, the importance score can be chosen to achieve a desired ROI-background trade-off.

Fig. 13. Average ROI and background NIIRS with increasing (a) importance score, and (b) bit-rate for ‘Baghdad’ image. Standard error bars indicate the variation between the image analysts for the ‘Baghdad’ image. [Panel (a): average NIIRS against importance score (1 to 8192 and Max-shift), independent of bit-rate; panel (b): average NIIRS against bit-rate (bits per pixel), independent of importance score.]

Fig. 14. Average ROI and background NIIRS with increasing (a) importance score, and (b) bit-rate for ‘Cafe’ image. Standard error bars indicate the variation between the image analysts for the ‘Cafe’ image. [Panel (a): average NIIRS against importance score (1 to 8192 and Max-shift), independent of bit-rate; panel (b): average NIIRS against bit-rate (bits per pixel), independent of importance score.]

Note that the ROI NIIRS plateaus beyond a certain importance score. Any further increase in importance score does not significantly affect the ROI NIIRS, but adversely affects the background NIIRS. The normalised NIIRS scale, as shown in Fig. 16(a), relates the test images and also shows the same trend. The normalised scale is an average of the normalised NIIRS scores for each of the test images, where the normalised NIIRS score for an image is given by the NIIRS score divided by its maximum. The figure shows that the interpretability of the ROI begins to saturate at an importance score of 16. With this importance score, the ROI interpretability is close to that achieved by Max-shift, while maintaining a background NIIRS performance of greater than half the ROI’s NIIRS performance. Using importance scores greater than 16 will substantially degrade the interpretability of the background, but this is compensated by the better NIIRS performance in the ROIs and their close surroundings (bounded by ROI code-blocks).

Fig. 15. Average ROI and background NIIRS with increasing (a) importance score, and (b) bit-rate for ‘Duntroon’ image. Standard error bars indicate the variation between the image analysts for the ‘Duntroon’ image. [Panel (a): average NIIRS against importance score (1 to 8192 and Max-shift), independent of bit-rate; panel (b): average NIIRS against bit-rate (bits per pixel), independent of importance score.]
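The normalisation described above (divide each image's NIIRS by its maximum, then average across images) can be sketched as follows. The ratings and maxima below are made-up values for illustration, and treating the "maximum" as a single known maximum NIIRS per image is an assumption about how the normalisation was carried out.

```python
def normalise(ratings, max_niirs):
    """Divide each NIIRS rating by the maximum NIIRS of its image."""
    return [r / max_niirs for r in ratings]


def mean_normalised(per_image_ratings, per_image_max):
    """Average the normalised ratings across images, setting by setting."""
    normalised = [
        normalise(ratings, per_image_max[name])
        for name, ratings in per_image_ratings.items()
    ]
    count = len(normalised)
    return [sum(column) / count for column in zip(*normalised)]


# Hypothetical average NIIRS at three importance scores for two images.
ratings = {"image A": [2.0, 3.0, 4.0], "image B": [4.0, 6.0, 8.0]}
maxima = {"image A": 4.0, "image B": 8.0}
curve = mean_normalised(ratings, maxima)
```

Normalising first puts images with different maximum NIIRS on a common 0 to 1 scale, so no single image dominates the averaged curve.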

The second factor that was investigated is the impact of increasing bit-rate on the NIIRS achieved by the ROI prioritisation methods. From Figs. 13(b), 14(b), and 15(b), it can be seen that both the ROI and background NIIRS are monotonically increasing with bit-rate. The normalised NIIRS ratings across all three test images, as shown in Fig. 16(b), also show the same trend.

More importantly, the rate-NIIRS results allow the determination of the useful range of bit-rates that is most effective for ROI prioritisation. Examining Figs. 13(b), 14(b), and 15(b) for the size of the gap between the lower error bar of the ROI and the upper error bar of the background gives an indication of which bit-rates are most effective for ROI prioritisation. From the plots, this gap is more substantial at bit-rates less than 0.25 bpp, which indicates that ROI prioritisation is most effective at these bit-rates. This useful range of bit-rates is also in accordance with the subjective quality evaluations on the Max-shift method conducted in [1]. At higher bit-rates (say, 0.5 and 1.0 bpp), the difference in ROI and background NIIRS performance is small since the ROI and background will usually have a high spatial resolution and would be near visually lossless when compared to the original. In this case, the computational overhead associated with prioritising a ROI ahead of the background would not be warranted.

Fig. 16. Normalised ROI and background NIIRS with increasing (a) importance score, and (b) bit-rate. [Panel (a): normalised NIIRS against importance score (1 to 8192 and Max-shift), independent of image and bit-rate; panel (b): normalised NIIRS against bit-rate (bits per pixel), independent of image and importance score.]

6. NIIRS and PSNR specifications for IMP-J2K

A methodology was originally proposed in [12] to aid a user in the appropriate choice of importance score that would achieve a desired NIIRS performance. A user would specify a ROI and background NIIRS rating, map the NIIRS specifications to their corresponding range of PSNR values, and then determine from the PSNR versus bit-rate curve the appropriate importance score to use. The NIIRS specification for ROI prioritisation in this case can be quite inflexible because of the mapping of NIIRS to the range of PSNR values. The success of such an approach is dependent on the success of the mapping operation.

Another approach to the NIIRS specification for ROI prioritisation is presented in this section and eliminates the NIIRS to PSNR mapping operation. A NIIRS plane is required to simultaneously show the full relationship between the NIIRS rating, bit-rate, and importance score for the ROI and background. Fig. 17 shows such a NIIRS plane visualised as a contour plot, which is representative of the three test images. Since each test image had a different maximum NIIRS, the NIIRS scores were normalised to make them more comparable. For a given class of imagery such as aerial surveillance of urban landscapes, where the images are captured using the same imaging sensor optics and with the same ground resolution, there will exist a representative set of NIIRS curves that can be used for the NIIRS specification methodology proposed here. Generally, the ROI NIIRS contour plot is non-increasing, in that a higher NIIRS rating can be achieved at a smaller bit-rate for high importance scores. The background NIIRS contour plots, however, are non-decreasing, in that a higher bit-rate is required to achieve a given NIIRS rating for higher ROI importance scores. These are the properties expected of a ROI and background NIIRS contour plot, but due to the variation in the assigned ratings, the curves are approximately but not strictly monotonically non-increasing and non-decreasing for the ROI and background, respectively (see Fig. 17).

Fig. 17. Normalised NIIRS contour plot (representative of the three test images) for (a) ROI and (b) background (BG). Solid box in (a) and (b) represents operating points that satisfy ROI NIIRS > 0.9 and BG NIIRS > 0.4, respectively. The lower bit-rate limit is the lowest bit-rate that can achieve the NIIRS specifications. [Contour plots with ROI importance score (1 to 8192) on one axis and bit-rate (bits per pixel, 0.000976563 to 1) on the other; panel (a): normalised ROI NIIRS contour plot; panel (b): normalised background (BG) NIIRS contour plot.]

NIIRS specifications can lead to the use of one of three ROI prioritisation strategies, namely, default JPEG 2000, Max-shift, and IMP-J2K (or Implicit).

(1) If one is to specify an equal (or near equal) interpretability level for the ROI and background, then the default JPEG 2000 can be used.
(2) In the event that the background NIIRS is not important and the fastest ROI reconstruction is required to achieve a given NIIRS rating, then the Max-shift method can be used.
(3) Otherwise, IMP-J2K can be used, if a variable ROI and background NIIRS performance is desired.

The discussion that follows relates to the case where IMP-J2K is used to provide a desired ROI and background NIIRS performance.

Conceptually, one can specify an appropriate ROI and background NIIRS rating and locate on the ROI and background NIIRS contour plots the optimal operating point, from which one can estimate the ROI importance score to use and the bit-rate at which the desired performance is likely to occur. To do this, one would first need to specify a desired ROI and background NIIRS performance, such as a normalised ROI NIIRS > 0.9 and a normalised background NIIRS > 0.4. To determine the appropriate operating points, we can determine those operating points on the ROI NIIRS contour plot that satisfy the ROI NIIRS specification and then superimpose them onto the operating points that satisfy the background NIIRS specification. The intersection of the ROI and background operating points determines the operating points that satisfy both the ROI and background NIIRS specifications. From all the available operating points for the given NIIRS specification, one can then determine the optimal operating point that would achieve the smallest bit-rate. The corresponding importance score at that bit-rate is then the importance score that should be used to achieve the desired ROI and background NIIRS specifications.

For the NIIRS contour plots in Fig. 17, the selection of ROI importance scores and bit-rate values should be quantised to the parameter values that were used in the experiments. Interpolation between these values was avoided to minimise uncertainties. Fig. 17 shows an example of the above procedure for the ROI and background normalised NIIRS specifications of > 0.9 and > 0.4, respectively. We can see from Fig. 17(a) that the lowest bit-rate that satisfies the ROI NIIRS is 0.03125 bpp and that any importance score of 4 or greater would satisfy the ROI NIIRS. At this bit-rate, the highest importance score that satisfies the background NIIRS (in Fig. 17(b)) can be found to be 16. Thus, one should be able to use an importance score ranging from 4 to 16 to achieve the desired specifications at the lowest bit-rate possible (i.e. at 0.03125 bpp).
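The selection procedure above amounts to intersecting two sets of grid points and taking the smallest bit-rate. The sketch below illustrates this on small made-up grids of normalised NIIRS; the grid values, function names, and data layout (one row per bit-rate, one column per importance score) are assumptions chosen to mimic the shape of the example, not the measured ratings.

```python
def feasible_points(grid, scores, bit_rates, threshold):
    """Set of (score, bit_rate) grid points whose NIIRS exceeds threshold."""
    points = set()
    for i, rate in enumerate(bit_rates):
        for j, score in enumerate(scores):
            if grid[i][j] > threshold:
                points.add((score, rate))
    return points


def select_operating_points(roi_grid, bg_grid, scores, bit_rates, roi_min, bg_min):
    """Lowest bit-rate meeting both specifications, with the usable scores."""
    both = (feasible_points(roi_grid, scores, bit_rates, roi_min)
            & feasible_points(bg_grid, scores, bit_rates, bg_min))
    if not both:
        return None, []
    best_rate = min(rate for _, rate in both)
    return best_rate, sorted(score for score, rate in both if rate == best_rate)


# Hypothetical normalised NIIRS grids (rows: bit-rates, columns: scores).
scores = [1, 4, 16, 64]
bit_rates = [0.015625, 0.03125, 0.0625]
roi_grid = [[0.60, 0.80, 0.88, 0.89],
            [0.70, 0.92, 0.95, 0.96],
            [0.85, 0.95, 0.97, 0.97]]
bg_grid = [[0.60, 0.45, 0.35, 0.10],
           [0.70, 0.55, 0.45, 0.20],
           [0.85, 0.65, 0.55, 0.30]]

result = select_operating_points(roi_grid, bg_grid, scores, bit_rates, 0.9, 0.4)
```

On these illustrative grids the lowest feasible bit-rate is 0.03125 bpp with usable scores 4 and 16. Only grid points are considered, in keeping with the choice to avoid interpolating between the experimental parameter values.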

To validate the above methodology, the ‘Duntroon’ image was encoded using a different ROI (also approximately 6–8% of the total image area) to see if the determined importance scores and estimated bit-rate would achieve the desired NIIRS specifications. The left column of Fig. 18 shows a segment of the test ROI and background obtained from the decoded default JPEG 2000 code-stream, which were NIIRS rated by image analysts as 5.5 (normalised NIIRS rating of 0.9 at 0.125 bpp) and 2.5 (normalised NIIRS rating of 0.4 at 0.0078125 bpp), respectively. These regions were not explicitly chosen for evaluation by image analysts but were evaluated in the default JPEG 2000 case as part of the background. These images are referred to as the ‘ground truth’. Fig. 18(a)–(c) then shows the reconstructed IMP-J2K images encoded using the test ROI with an importance score of 4, 8, and 16, respectively. The images are all decoded at the predetermined minimum bit-rate of 0.03125 bpp. The decoded ROI and background segments can be seen to be approximately the same as or better than the ground truth images in terms of the level of interpretation that could be achieved in the images. This implies that the chosen importance scores allow an image to be encoded so that it satisfies the NIIRS specifications at the estimated minimum bit-rate.

Fig. 18. Validation of the NIIRS specification methodology. Far left column: test or ‘ground truth’ segments of ROI (top row, ROI NIIRS = 5.5) and background (bottom row, BG NIIRS = 2.5) at the desired NIIRS specifications. (a), (b), and (c) are reconstructed images at the target minimum bit-rate of 0.03125 bpp using a ROI importance score of 4, 8, and 16, respectively. The reconstructed image segments can be seen to be similar to or better than the ground truth images in terms of the interpretation tasks that could be achieved in the images, which satisfies the desired NIIRS specifications and validates the NIIRS specification methodology.

Fig. 19. PSNR contour plots (representative of the three test images) for (a) ROI and (b) background. Both panels plot bit-rate (bits per pixel, 0.000976563 to 1) against ROI importance score (1 to 8192); contour labels run from 15 to 50 dB in the ROI PSNR plot and from 15 to 30 dB in the background PSNR plot.

The above methodology can similarly be adopted for any PSNR specification. Fig. 19 shows the PSNR contour plots (representative of the three test images), which can be used to determine the bit-rate and importance score operating point, given a ROI and background PSNR specification. Again, a representative set of PSNR contour plots for a given class of imagery can be used for the PSNR specification for ROI coding.
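To check a decoded image against a ROI and background PSNR specification, the two regions need to be measured separately. The following is a minimal sketch under stated assumptions: it assumes that region PSNR is computed from the mean squared error over the pixels inside (or outside) a binary ROI mask, with an 8-bit peak value of 255. The function name `region_psnr` and the tiny images are hypothetical, and the exact computation used for Fig. 19 may differ.

```python
import math
from typing import Sequence

Image = Sequence[Sequence[float]]


def region_psnr(original: Image, decoded: Image, mask: Image,
                in_roi: bool = True, peak: float = 255.0) -> float:
    """PSNR in dB over the pixels whose mask membership equals `in_roi`.

    `mask` holds non-zero values for ROI pixels and zero for background.
    """
    squared_error = 0.0
    count = 0
    for o_row, d_row, m_row in zip(original, decoded, mask):
        for o, d, m in zip(o_row, d_row, m_row):
            if bool(m) == in_roi:
                squared_error += (o - d) ** 2
                count += 1
    if count == 0:
        raise ValueError("selected region contains no pixels")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical pixels: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 2x2 example: top row is the ROI, bottom row is background.
original = [[100, 100], [50, 50]]
decoded = [[100, 102], [40, 60]]
mask = [[1, 1], [0, 0]]

roi_db = region_psnr(original, decoded, mask, in_roi=True)
bg_db = region_psnr(original, decoded, mask, in_roi=False)
print(f"ROI PSNR: {roi_db:.2f} dB, background PSNR: {bg_db:.2f} dB")
```

In this toy case the ROI is reconstructed more accurately than the background, so its PSNR is higher, which is the behaviour an importance score above 1 is intended to produce.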

7. Conclusion

An importance prioritised JPEG 2000 (IMP-J2K) coder has been presented to address the image coding principle of improving the interpretability versus bit-rate performance. The method provides a framework for prioritising regions of interest (ROIs) based on modulating the distortion cost function by the ROI’s importance score as defined by an importance map. The method allows a host of useful features such as multiple ROI coding using arbitrary importance scores and the ability to be decoded by generic JPEG 2000 decoders.

The interpretability (NIIRS) assessment has shown that the interpretability of ROIs increases with the ROI’s importance score, while that of non-ROIs decreases. The range of bit-rates over which the importance score affects the ROI’s interpretability the most was determined to be less than 0.25 bits per pixel. Given these results and findings, a methodology for determining a ROI’s importance score given a desired ROI and background interpretability performance was also proposed. Importance prioritisation coding has the ability to provide increased interpretability in ROIs while providing high compression performance.

Acknowledgements

The authors would like to thank Dr. Robert Prandolini from the Defence Science and Technology Organisation (DSTO), Australia, for access to unclassified satellite and aerial surveillance data and his involvement in arranging trained and qualified defence image analysts for the NIIRS evaluations.

The IMP-J2K coding system described in the paper was implemented using the Kakadu software implementation of JPEG 2000 developed by Dr. David Taubman [17]. The software provides full compliance with the JPEG 2000 standard.

References

[1] A.P. Bradley, Can region of interest coding improve overall perceived image quality?, in: Proceedings of the Workshop on Digital Image Computing, Brisbane, Australia, 2003, pp. 41–44.

[2] C. Christopoulos, J. Askelof, L. Mathias, Efficient methods for encoding regions of interest in the upcoming JPEG2000 still image coding standard, IEEE Signal Process. Lett. 7 (9) (2000) 247–249.

[3] M.P. Eckert, A.P. Bradley, Perceptual quality metrics applied to still image compression, Signal Processing 70 (1998) 177–200.

[4] Imagery Resolution Assessments and Reporting Standards (IRARS) Committee, Civil NIIRS Reference Guide, http://www.fas.org/irp/imint/niirs_c/, March 1996.

[5] Information technology—JPEG2000 image coding system—Part 1: core coding system, ISO/IEC 15444-1, ITU-T Rec. T.800, August 2002.

[6] A.J. Maeder, Importance maps for adaptive information reduction in visual scenes, in: Proceedings of the Australian and New Zealand Conference on Intelligent Information Systems, Perth, Australia, 1995, pp. 24–29.

[7] M.A. Maver, C.D. Erdman, K. Ruiehi, Imagery interpretability rating scales, Soc. Inform. Display (1995) 117–120.

[8] T.H. Ming, JPEG2000 Image Compression: Standard, Technology and Implementation, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, 5–6 April 2000.

[9] A. Nguyen, V. Chandran, S. Sridharan, R. Prandolini, JPEG2000 region of interest coding—a hybrid coefficient scaling and code-block distortion modulation method, in: Proceedings of the Australasian Workshop on Signal Processing and Applications, Brisbane, Australia, 2002, pp. 59–62.

[10] A. Nguyen, V. Chandran, S. Sridharan, R. Prandolini, Importance prioritisation coding in JPEG2000 for interpretability with application to surveillance imagery, in: Proceedings of the Visual Communications and Image Processing, vol. 5150, Lugano, Switzerland, 2003, pp. 806–817.

[11] A. Nguyen, V. Chandran, S. Sridharan, R. Prandolini, Guidelines to using region of interest coding in JPEG 2000, in: Proceedings of the International Symposium on Digital Signal Processing and Communication Systems, Gold Coast, Australia, 2003, pp. 183–188.

[12] A. Nguyen, V. Chandran, S. Sridharan, R. Prandolini, Interpretability performance assessment of JPEG2000 and Part 1 compliant region of interest coding, IEEE Trans. Consum. Electron. 49 (4) (2003) 808–817.

[13] D. Norton, L. Stark, Eye movements and visual perception, Sci. Amer. 224 (1971) 34–43.

[14] R. Prandolini, M. Grigg, W. Fletcher, JPEG2000—implications for defence, Technical Report DSTO-TN-0408, Defence Science and Technology Organisation, January 2002.

[15] M. Rabbani, R. Joshi, An overview of the JPEG 2000 still image compression standard, Signal Processing: Image Communication 17 (2002) 3–48.

[16] U. Rauschenbach, H. Schumann, Demand-driven image transmission with levels of detail and regions of interest, Comput. Graph. 23 (1999) 857–866.

[17] D. Taubman, Kakadu software: a comprehensive framework for JPEG2000, http://www.kakadusoftware.com/, 2004.

[18] D.S. Taubman, M.W. Marcellin, JPEG2000: Image Compression Fundamentals, Standards and Practice, Kluwer Academic Publishers, Boston, 2002.

[19] D. Taubman, E. Ordentlich, M. Weinberger, G. Seroussi, Embedded block coding in JPEG 2000, Signal Processing: Image Communication 17 (2002) 49–72.

[20] D. Taubman, R. Rosenbaum, Rate-distortion optimized interactive browsing of JPEG2000 images, in: Proceedings of the International Conference on Image Processing, vol. 3, 2003, pp. 765–768.