
Efficient Bit Allocation For Object Oriented Coding Of Medical Images In The Context Of JPEG-2000

IOANNIS M. STEPHANAKIS1 and GEORGE C. ANASTASSOPOULOS2

1 Hellenic Telecommunications Organization, 151 81, Athens, GREECE

2 Medical Informatics Laboratory, Democritus University of Thrace, GR-68100, Alexandroupolis, GREECE

Abstract: - The problem of efficient bit allocation in the context of object oriented compression of medical images obtained from the medical database of the Second Department of Surgery of the University Hospital of Alexandroupolis, Greece is addressed. Medical images are usually obtained from digital capturing systems at high resolution. Nevertheless, transmission of multiple image copies through conventional videoconference and telemedicine systems may still be problematic. The segmentation of medical images into regions that may contain valuable diagnostic information is proposed as the first step of our approach. Each segment is described as a medical image object (in accordance with video objects, VO, in MPEG-4 terminology). Medical image objects are compressed using wavelet transform coding. The DC and the AC sub-band coefficients of the wavelet transforms are quantized separately. A different number of bits is allocated to each medical image object according to its significance. The total bit budget is observed via optimization on the Rate-Distortion plane. Subjective weighting of the distortions contributed by the several regions of the image (medical objects) is possible.

Key-Words: - Bit Allocation, Object Oriented Coding, JPEG-2000, Rate-Distortion Theory, Wavelets, Medical Imaging.

1 Introduction

Digital 1-D and 2-D signals are coded in order to facilitate their storage in indexed (object oriented) databases (DB) as well as their subsequent processing and efficient transmission. Both still frame images and video sequences may be encoded according to a variety of standards. Image enhancement, image restoration and image segmentation are common processing tasks that may be implemented through the use of a versatile set of algorithmic processes and various tools derived from diverse mathematical fields.

Signal and image coding may be either lossless or lossy (compression). Typical methods of image coding are the following [1]:

Scalar coding / quantization (including uniform and non-uniform quantization).

Predictive quantization (including Delta modulation and linear prediction algorithms like the Levinson-Durbin algorithm).

Transform and sub-band coding (like the Karhunen-Loeve Transform, the Discrete Cosine Transform and the Discrete Wavelet Transform).

Entropy coding (including Huffman coding, arithmetic coding and Ziv-Lempel coding).

Finite-state vector quantization.

Tree and trellis encoding.

Vector coding / quantization (including constrained and predictive vector quantization).

Parametric model aided coding.

Shape coding.

Object oriented coding.

Often a signal coding system contains several different quantizers, each of which has the task of encoding a different parameter that is needed to characterize the signal. Each such parameter may have a different physical meaning and may require a different relative accuracy of reproduction to achieve a desired overall quality of the reconstructed signal. The number of bits available to collectively describe this set of parameters is inevitably limited. Consequently, a major concern of the designer of a coding system is bit allocation, i.e. the task of distributing a given quota of bits to the various quantizers to optimize the overall coder performance.

1.1 Object Oriented Coding

In traditional waveform coding methods, each sample is quantized with the same number of bits, since each sample has the same a priori importance for the overall signal reproduction. On the other hand, in some coding systems distinct parameters or features of varying importance are individually quantized, and it becomes important to allocate bits among them intelligently. Typical examples are transform coding, sub-band coding and object oriented coding.

Proceedings of the 6th WSEAS International Conference on Applied Informatics and Communications, Elounda, Greece, August 18-20, 2006 (pp99-105)

A typical example of object oriented coding is presented in Fig.1. Still images or video frames are considered as a union of the so-called image and video objects, i.e.

$$ I(t) = \mathrm{Obj}_1(t) \cup \mathrm{Obj}_2(t) \cup \dots \cup \mathrm{Obj}_k(t). \qquad (1) $$

Object oriented standards for encoding still images and video sequences like the JPEG-2000 [2] and MPEG-4 [3],[4] allow for differentiated bit budgets per image object or differentiated rates (bps - bits per second) per video object according to their relative significance. The foreground and the background are usually considered as two different objects. The requirements for minimum distortion are applied to each object separately instead of the entire frame yielding a distributed compression ratio within a frame and allowing rate control per video object [5]. The complexity of the encoder depends heavily upon the number of different objects. This is expected to become less important over the years to come due to the development of more efficient fast algorithms and the parallel processing capabilities of the encoders.

Segmentation methods are essential in a variety of image processing tasks and may be used to determine the different objects of a still image frame. The purpose of image segmentation is to decompose an image domain into a number of disjoint regions (spatially connected groups of pixels) so that the features within each region have visual similarity, exhibit strong statistical correlation and reasonably good homogeneity. Alternatively, segmentation can be considered as a pixel labeling process in the sense that all pixels that belong to the same homogeneous region are assigned the same label. Several approaches have been proposed to tackle the problem of image segmentation [6],[7].
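The pixel labeling view of segmentation can be illustrated with a minimal sketch based on intensity thresholding. The function names, the toy image and the threshold values are assumptions made for illustration only; thresholding by itself does not guarantee spatially connected regions.

```python
from bisect import bisect_right


def label_pixels(image, thresholds):
    """Give each pixel the index of the intensity interval it falls in.

    `thresholds` must be sorted in ascending order; k thresholds yield
    k + 1 labels (0 .. k).
    """
    return [[bisect_right(thresholds, value) for value in row] for row in image]


def object_mask(labels, target):
    """Binary mask (1 inside the object, 0 outside) for one label."""
    return [[1 if lab == target else 0 for lab in row] for row in labels]


# Toy 3x4 gray-level image and two hypothetical thresholds (three objects).
image = [
    [10, 12, 200, 210],
    [11, 90, 95, 205],
    [9, 88, 92, 198],
]
labels = label_pixels(image, thresholds=[50, 150])
mask_obj1 = object_mask(labels, target=1)
```

Each label then defines one object plane that can be coded with its own bit budget.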

1.2 JPEG-2000 For Still Frame Image Compression

The Joint Photographic Experts Group has developed the JPEG International standards [8],[9]. In 1997, some of the world’s leading companies and top researchers contributed their expertise to develop a new image coding system for different types of still images (bi-level, gray-level, color, multi-component) with different characteristics (natural, scientific, medical, remote sensing images, rendered graphics, etc.) allowing different imaging models (real time transmission, image library archival, limited buffer and bandwidth resources, etc.), preferably within a unified system. The JPEG 2000 still image coding system allows for low bit rate operation with distortion and subjective image quality performance superior to existing standards, without sacrificing performance at other points in the rate distortion spectrum, incorporating at the same time many interesting features [2],[10]. Some of the features of the JPEG 2000 standard include:

Lossy and lossless compression.

40-60% smaller files, without sacrificing the image quality.

Good quality at ultra-high compression rates without any blocky artifacts.

Support for multi spectral imagery, CMYK format and ICC profiles.

Error resilience for noisy channels or transmission errors, by means of variable length coding.

A standard file format (.JP2).

Region of Interest (ROI) coding allowing the image foreground to be compressed at a higher quality than the background.

Random access code streaming for extraction and reconstruction of data by area of interest, resolution, color format and quality.

Scalable coding of still images by means of SNR scalability and spatial scalability. Spatial scalability is useful for fast database access as well as for delivering different resolutions to terminals with different display and bandwidth capabilities.

JPEG 2000 is an ideal file format for medical database (DB) applications [11]. It may be used in the archiving of medical images and their transmission. A problem common to all DB applications is the need to store multiple versions of a medical image, i.e. a thumbnail is needed in viewing the patient’s record and a high-resolution image is needed for diagnosis purposes. JPEG 2000’s multi resolution capability eliminates the need to store multiple images. Instead, a single high-resolution medical image is stored in JPEG 2000 format.

2 The Bit Allocation Problem

The general problem of allocating bits to a set of quantizers does not distinguish between quantizers of specific types. It is assumed that a set of k random variables, $X_1, X_2, \dots, X_k$, each with zero mean value and with variances $E\{X_i^2\} = \sigma_i^2$, for $i = 1, 2, \dots, k$, is to be quantized by k quantizers, which are denoted as $Q_1, Q_2, \dots, Q_k$. Assuming that the probability distribution function (pdf) of each $X_i$ is known and a particular distortion measure is selected, one seeks the optimal quantizer $Q_i$ for $X_i$ for any choice of $N_i$ quantizer levels. Usually the resolution $b_i = \log_2 N_i$ is an integral number of bits and the overall distortion D may be found as the additive distortion of the k partial distortions, denoted as $d_i(b_i)$, due to independent quantization of the variables. Then the bit allocation problem reads:

Find optimal $b_i$ for $i = 1, 2, \dots, k$ to minimize

$$ D(\mathbf{b}) = \sum_{i=1}^{k} d_i(b_i) \quad \text{subject to the constraint that} \quad \sum_{i=1}^{k} b_i \le B. \qquad (2) $$
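One common way to attack the constrained problem of Eq. 2 is greedy marginal analysis: bits are handed out one at a time to the quantizer whose distortion drops the most. This is a standard technique used here only as an illustration, not necessarily the procedure followed in this paper. The distortion model $d_i(b) = \sigma_i^2 2^{-2b}$, the variances and the budget below are assumptions.

```python
def greedy_allocation(distortions, budget):
    """Allocate `budget` bits among quantizers by marginal analysis.

    distortions[i][b] is the distortion of quantizer i when it gets b bits.
    The result is optimal when every d_i is convex and decreasing in b
    (an assumption of this sketch).
    """
    bits = [0] * len(distortions)
    for _ in range(budget):
        best, best_gain = None, 0.0
        for i, d in enumerate(distortions):
            if bits[i] + 1 < len(d):
                gain = d[bits[i]] - d[bits[i] + 1]
                if gain > best_gain:
                    best, best_gain = i, gain
        if best is None:  # no quantizer benefits from another bit
            break
        bits[best] += 1
    total = sum(d[b] for d, b in zip(distortions, bits))
    return bits, total


# Hypothetical model: d_i(b) = sigma_i^2 * 2^(-2b) for b = 0..8.
variances = [16.0, 4.0, 1.0]
tables = [[v * 2.0 ** (-2 * b) for b in range(9)] for v in variances]
bits, total = greedy_allocation(tables, budget=6)
```

With these toy numbers the variables with larger variance receive more bits, and the three partial distortions end up equal.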

The distortion related with the i-th object ($\mathrm{Obj}_i$) is denoted as $d_i$ whereas the number of bits allocated to $\mathrm{Obj}_i$ is denoted as $b_i$.

2.1 The Plane Of Allocated Number Of Bits vs. Distortion

The problem of bit allocation is usually addressed in the context of Rate-Distortion Theory [12]. Each implementation of a quantizer, denoted as Q, yields a pair of values for the required rate (or, equivalently, the number of bits) and the resulting distortion, (r(Q), d(Q)) or (b(Q), d(Q)). Generally speaking, the optimal solutions define the convex hull whereas the non-optimal solutions lie above it (see Fig. 2). The convex hull by construction is a curve that decreases monotonically in such a way that the following relationship holds true for the absolute value of its slope λ at (b(Q), d(Q)),

$$ \lambda(b'; Q) > \lambda(b''; Q) \quad \text{if} \quad b' < b''. \qquad (3) $$

The slope expresses the distortion decrement at (b(Q), d(Q)) should one extra bit be allocated during optimal quantization, i.e.

$$ \lambda(b; Q) = d(b; Q) - d(b + 1; Q). $$
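The convex hull of a set of measured operating points can be extracted with a short monotone-chain sweep. The sketch below is only an illustration; the function names and the sample (bits, distortion) pairs are hypothetical.

```python
def lower_convex_hull(points):
    """Return the operating points on the lower convex hull, sorted by bits.

    `points` is an iterable of (bits, distortion) pairs.
    """
    # Keep the smallest distortion for each bit count.
    best = {}
    for b, d in points:
        if b not in best or d < best[b]:
            best[b] = d
    hull = []
    for p in sorted(best.items()):
        # Drop the last hull point while it lies on or above the chord
        # joining hull[-2] and p (cross product <= 0).
        while len(hull) >= 2:
            (b1, d1), (b2, d2) = hull[-2], hull[-1]
            cross = (b2 - b1) * (p[1] - d1) - (d2 - d1) * (p[0] - b1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    # Keep only the monotonically decreasing part of the hull.
    decreasing = [hull[0]]
    for p in hull[1:]:
        if p[1] < decreasing[-1][1]:
            decreasing.append(p)
    return decreasing


def slopes(hull):
    """Absolute slope between consecutive hull points."""
    return [(d1 - d2) / (b2 - b1) for (b1, d1), (b2, d2) in zip(hull, hull[1:])]


points = [(0, 10), (1, 6), (2, 5.5), (3, 2), (4, 1.5), (1, 8)]
hull = lower_convex_hull(points)
lam = slopes(hull)
```

The point (2, 5.5) lies above the chord joining (1, 6) and (3, 2) and is discarded; the remaining absolute slopes decrease as the number of bits grows, in line with the behaviour required of the convex hull.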

Optimal bit allocation along the lines of Eq. 2 suggests that, for all quantizers $Q_1, Q_2, \dots, Q_k$,

$$ \lambda(b_1; Q_1) = \lambda(b_2; Q_2) = \dots = \lambda(b_k; Q_k). \qquad (4) $$

This guarantees equilibrium among all quantizers during bit allocation.

2.2 The Discrete Wavelet Transform (DWT)

The Wavelet Transform expands a discrete or continuous 1-D or 2-D signal upon a set of orthonormal basis functions, the so-called wavelet functions. They are displaced replicas of the same mother function that are produced by space or time dilation [13],[14]. The wavelet transform decomposes a signal into non-overlapping sub-bands. It is implemented using analysis and synthesis filter banks. Various wavelet families are reported in the relevant literature.

The Discrete Wavelet Transform (DWT) is particularly effective for coding still images and may be used as an alternative to the Discrete Cosine Transform (DCT) in the context of recently developed standards like JPEG-2000 and MPEG-4. A 2-D DWT produces a DC component (low-frequency sub-band) and a number of AC (high-frequency) sub-bands. The DC sub-band is quantized, predictively encoded (using a form of DPCM) and entropy encoded (usually using an arithmetic encoder). The AC sub-bands are quantized and reordered (“scanned”), zero-tree encoded and entropy encoded. The wavelet used in the context of MPEG-4 still texture coding is the Daubechies (9,3)-tap biorthogonal filter [15]. This is essentially a matched pair of filters, one low pass (with nine filter taps) and one high pass (with three filter taps).
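The split into one DC sub-band and several AC sub-bands can be sketched with a single decomposition level. The example below uses the orthonormal Haar filter pair instead of the Daubechies (9,3) filters, purely to keep the code short; the function names are hypothetical and even image dimensions are assumed.

```python
def haar_1d(signal):
    """One level of the orthonormal Haar transform: (low band, high band)."""
    s = 2 ** 0.5
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def haar_2d(image):
    """One 2-D level: returns the LL (DC) band and three AC bands.

    Naming used here: the first letter refers to the row filter, the
    second to the column filter (conventions differ between texts).
    """
    row_low, row_high = zip(*(haar_1d(row) for row in image))

    def filter_columns(block):
        low, high = zip(*(haar_1d(col) for col in transpose(block)))
        return transpose(low), transpose(high)

    ll, lh = filter_columns(row_low)
    hl, hh = filter_columns(row_high)
    return ll, lh, hl, hh


def energy(block):
    return sum(v * v for row in block for v in row)


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
ll, lh, hl, hh = haar_2d(image)
```

Applying `haar_2d` again to `ll` would give the next, coarser level of the decomposition. Because this filter pair is orthonormal, the energy of the image equals the total energy of the four sub-bands.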

The DC sub-band is quantized using a scalar quantizer whereas the AC sub-bands may be quantized in various ways:

1) Scalar quantization using a single quantizer, prior to reordering and zero-tree encoding.

2) “Bilevel” quantization after reordering. The reordered coefficients are coded one bitplane at a time using zero-tree encoding. The coded bitstream can be truncated at any point to provide highly scalable decoding.

3) “Multilevel” quantization prior to reordering and zero-tree encoding. A series of quantizers are applied, from coarse to fine, with the output of each quantizer forming a series of layers (a type of scalable coding).
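Option 1 can be illustrated with a uniform scalar quantizer whose rate is estimated from the empirical entropy of the non-zero quantizer indices, in the spirit of Section 2.3. This is a sketch under stated assumptions: the function names and sample coefficients are hypothetical, half-open quantization intervals are used, and candidate step sizes are taken as the peak magnitude divided by powers of two.

```python
from collections import Counter
from math import floor, log2


def quantize(coeffs, step):
    """Uniform quantizer: index n covers [(n - 1/2) step, (n + 1/2) step)."""
    return [floor(c / step + 0.5) for c in coeffs]


def entropy_nonzero(indices):
    """Estimate of bits per coefficient: -sum of p_n log2 p_n over n != 0.

    p_n is the relative frequency of index n among all coefficients.
    """
    total = len(indices)
    counts = Counter(n for n in indices if n != 0)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def candidate_steps(coeffs, levels):
    """Peak magnitude divided by 2, 4, ..., 2^levels (assumed step family)."""
    peak = max(abs(c) for c in coeffs)
    return [peak / 2 ** n for n in range(1, levels + 1)]


coeffs = [0.1, -0.2, 1.0, 1.1, -1.0, 2.0, 0.0, -2.1]
indices = quantize(coeffs, step=1.0)
bits_per_coeff = entropy_nonzero(indices)
steps = candidate_steps(coeffs, levels=3)
```

Evaluating the entropy estimate and the resulting distortion for every candidate step gives one (bits, distortion) point per step for the sub-band.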

2.3 Quantizing The Wavelet Coefficients For Each Medical Object

A uniform scalar quantizer featuring a varying quantization step is used in the context of this study in order to quantize the wavelet coefficients. The quantization steps obtain N discrete values, which are found as the maximum absolute value of the wavelet coefficients in all AC sub-bands at the same level (scale) divided by $2^n$, where $n = 1, 2, \dots, N$. Entropy encoding is assumed for all non-zero reproduction levels, i.e.

$$ b(\mathrm{step}) = -\sum_{\substack{n=-N \\ n \neq 0}}^{N} p_n \log_2 (p_n), \quad \text{where} \quad p_n = p\left(X : X \in \left[\left(n - \tfrac{1}{2}\right)\mathrm{step},\ \left(n + \tfrac{1}{2}\right)\mathrm{step}\right]\right). \qquad (5) $$

The differences between the actual values of the coefficients of the DC sub-band and their mean value are quantized in the same fashion as the coefficients of the AC sub-bands. The number of required bits obtained using Eq. 5 is a good estimate of actual quantization results obtained by scalar quantization according to the object oriented encoding standards in Section 1.

3 The Proposed Approach

3.1 Optimizing The Overall Distortion Of The Image Based On The Relative Diagnostic Information Of The Medical Objects

The same incremental distortion, $\left| \partial d_{i,j} / \partial b_{i,j} \right| = \lambda_i$, is assumed for the DC sub-band and all AC sub-bands (indexed by j) of medical object $\mathrm{Obj}_i$. Relevant weighting of the distortions of the different objects of the image according to their significance in the medical diagnosis is possible. Usually weighting of the different medical objects is carried out in such a way that distortion represents the Mean Square Error (MSE) corresponding to the object. Alternatively, one may choose different lambdas for the medical objects in the image.

3.2 Processing Steps

The method:

1) Segment a medical image acquired through standard digital capturing systems into homogeneous regions, the so-called medical image objects.

2) Weight the distortion contributed by each of the aforementioned regions of the image according to its significance to the medical diagnosis.

3) Set constraints regarding the distortion and the allocated number of bits per medical object.

4) Perform optimization on the plane of Allocated Number of Bits vs. Distortion. Find optimal λ.

5) Check for consistency with the constraints posed upon acceptable distortions and number of bits per medical object. Adjust the solution accordingly if necessary.

4 Experimental Results

The image depicted in Fig. 4 (a lung X-ray) is used to obtain the encoding results presented in Table 1. The original image is obtained from the medical database of the Second Department of Surgery of the University Hospital of Alexandroupolis, Greece. The following platform is used: a PC (Pentium4 – 1.80 GHz, 512 MBytes RAM, 40 GBytes / 7200 RPM Hard Disk), a Heidelberg Lynotype CPS Saphir/Opal Scanner, and a PC (Athlon XP – 2.5GHz, 512 MBytes RAM, 80GBytes / 7200 RPM Hard Disk) with a 64 MByte graphics card and a 22’’ display monitor for viewing purposes.

All pieces of the aforementioned equipment are connected via a Cisco switch to the 100 Mbps intranet of the University Hospital of Alexandroupolis. Standard thresholding (see Fig. 5) is employed in order to determine the three different objects illustrated in Fig. 6. The Discrete Wavelet Transform (Daubechies (9,3)-tap biorthogonal filters) is used to decompose the three objects into five (5) detail levels and a DC sub-band. The sub-bands and the corresponding Bit budget Vs. Distortion planes are presented in Figs. 7 to 12.

Table 1. Values of PSNR (Peak Signal to Noise Ratio) and bpp (bits per pixel) pertaining to the three objects and the entire image for different lambdas

Region          Quantity        Set 1     Set 2     Set 3
Object 1        λ(obj1)         0.0150    0.0100    0.0060
                bpp(obj1)       0.0374    0.0374    0.0374
                PSNR(obj1)      44.71     44.71     44.71
Object 2        λ(obj2)         0.0025    0.0020    0.0015
                bpp(obj2)       1.2139    1.5003    1.5776
                PSNR(obj2)      32.23     34.34     34.91
Object 3        λ(obj3)         0.0050    0.0040    0.0030
                bpp(obj3)       1.9548    2.0580    2.2980
                PSNR(obj3)      32.94     33.67     35.67
Entire image    bpp(all)        0.7575    0.8699    0.9370
                PSNR(all)       35.20     36.72     37.69

5 Discussion

Good encoding results are obtained for the three different sets of lambda values given in Table 1. Actual bit budgets obtained from commercial object oriented encoders may be slightly higher but they are still less than similar results obtained by other conventional encoding methods used today. The versatility of the proposed approach makes it a powerful tool in everyday medical practice.
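Step 4 of the method in Section 3.2 can be sketched as follows: for a given λ, every sub-band of an object selects the operating point that minimizes distortion plus λ times the number of bits, so that all sub-bands work at the same slope. The function names and the (bits, distortion) candidates are hypothetical, and the sketch does not reproduce the figures of Table 1.

```python
def pick_operating_point(points, lam):
    """Return the (bits, distortion) pair minimizing distortion + lam * bits."""
    # Ties are broken in favour of the point that uses fewer bits.
    return min(points, key=lambda p: (p[1] + lam * p[0], p[0]))


def allocate(subbands, lam):
    """Apply the same lam to every sub-band of one object."""
    chosen = [pick_operating_point(points, lam) for points in subbands]
    total_bits = sum(b for b, _ in chosen)
    total_distortion = sum(d for _, d in chosen)
    return chosen, total_bits, total_distortion


# Hypothetical candidates for a DC sub-band and two AC sub-bands.
subbands = [
    [(0, 9.0), (2, 4.0), (4, 1.5), (6, 1.0)],
    [(0, 4.0), (2, 2.5), (4, 1.8), (6, 1.6)],
    [(0, 1.0), (2, 0.8), (4, 0.7), (6, 0.65)],
]
coarse = allocate(subbands, lam=1.0)
fine = allocate(subbands, lam=0.2)
```

Lowering λ spends more bits and reduces the distortion; sweeping λ (for instance by bisection) until the constraints of steps 3 and 5 are met is one way to complete the procedure.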


Acknowledgement

This work was partially supported by the Greek Ministry of Education and the European Union under the program PYTHAGORAS – 89203.

References:

[1] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Norwell, MA: Kluwer Academic, 1992.

[2] ISO/IEC 15444, Information technology – JPEG 2000 image coding system, 2000.

[3] ISO/IEC 14496-10 and ITU-T Rec. H.264, Advanced Video Coding, 2003.

[4] Iain E. G. Richardson, H.264 and MPEG-4 Video Compression – Video Coding for Next-Generation Multimedia. England: Wiley, 2003.

[5] Ioannis M. Stephanakis, Anastasios Doulamis, Nikolaos Doulamis, and Stefanos Kollias, “Adaptive Wavelet-Packet Decomposition for Rate Control of Object Oriented Coding of Video Sequences”, Proceedings 1999 IEEE International Conference on Image Processing (ICIP ’99), Kobe, Japan, October 1999, v.2, pp. 298-302.

[6] N. Pal and S. Pal, A Review on Image Segmentation Techniques, Pattern Recognit., Vol. 26, 1993, pp. 1277-1294.

[7] R. Haralick and L. Shapiro, Image Segmentation Techniques, CVGIP, Vol. 29, 1985, pp. 100-103.

[8] ISO/IEC 10918-1 / ITU-T Rec. T.81, Digital compression and coding of continuous-tone still images, 1992 (JPEG).

[9] http://www.jpeg.org: JPEG and JPEG-2000 compression standards.

[10] A. Skodras, C. Christopoulos and T. Ebrahimi, “The JPEG-2000 still image compression standard”, IEEE Signal Processing Magazine, Vol. 18, No. 5, Sep. 2001, pp. 36-58.

[11] George K. Anastassopoulos, Aggelos D. Tsalkidis, Ioannis M. Stephanakis, George Mandellos, and Konstantinos Simopoulos, “Application of JPEG 2000 Compression in Medical Database Image Data”, Proceedings 14th International Conference on Digital Signal Processing (DSP 2002), Santorini, Greece, July 2002, vol.2, pp. 539-542.

[12] A. Ortega and K. Ramchandran, Rate-Distortion Methods for Image and Video Compression, IEEE Signal Processing Magazine, Vol. 15, No. 6, November 1998, pp. 23-50.

[13] M. Vetterli and J. Kovacevic, Wavelets and Subband Coding. NJ: Prentice Hall PTR, 1995.

[14] M. B. Ruskai, G. Beylkin, R. Coifman and I. Daubechies, Wavelets and Their Applications. Boston, MA: Jones and Bartlett, 1992.

[15] I. Daubechies, “The Wavelet Transform, Time-Frequency Localization And Signal Analysis”, IEEE Trans. Inf. Theory 36, 1990, pp. 961-1005.

Fig. 1 : Object oriented encoding – illustration of Object Planes

Fig. 2 : The convex hull on the Rate (Bit budget) Vs. Distortion plane

Fig. 3 : The Discrete Wavelet Transform (DWT) decomposes an image into Low and High Sub-bands (i.e. separate spatial frequency zones)

Fig. 4 : Original image

Fig. 5 : Histogram of the original image and thresholds

Fig. 6 : Medical object planes (obj1, obj2 and obj3)

Fig. 7 : Sub-bands of medical object obj1

Fig. 8 : Bit budget – Distortion planes for the sub-bands of medical object obj1

Fig. 9 : Sub-bands of medical object obj2

Fig. 11 : Sub-bands of medical object obj3


