
Proceedings of 3rd IEEE International Conference on Image Processing, Lausanne, Switzerland, 16-19 Sept. 1996


RECOVERY OF VIDEO IN THE PRESENCE OF PACKET LOSS USING INTERLEAVING AND SPATIAL REDUNDANCY

Gongsan Yu, Michael W. Marcellin

Dept of Electrical & Computer Engineering University of Arizona, Tucson, AZ 85721

ABSTRACT

This paper proposes a new algorithm for error concealment in the presence of packet loss for block based video CODECs. It uses a one-pixel overlap block structure. Without any smoothness assumption on block boundaries, this method provides consistently high performance in lost block recovery. The one-pixel overlap generally decreases the compression ratio for a fixed quantization step size. However, the increases in PSNR obtained by averaging overlapping pixels largely compensate for this effect. Experimental results show that there is almost no performance decrease in PSNR vs. compression ratio for the “Football” and “Miss America” videos when using 9 x 9 blocks with one-pixel overlap (in place of the usual 8 x 8 blocks with no overlap).

1. INTRODUCTION

Packet loss is a serious problem for packet video communication. Without modification for handling packet loss, most standard video CODECs, such as MPEG and H.261, cannot work over packet switching networks. Many different approaches have been developed to overcome this problem [1]-[5]. Among them, the maximally smooth method [4][5] was shown to be a powerful approach to mitigate the effects of packet loss in video CODECs. That method estimates the DCT coefficients of lost blocks by maximizing the smoothness of the block boundaries between received blocks and lost blocks. For many video sequences, this scheme provides excellent recovery from lost blocks. However, this scheme may fail when the encoder operates at a high compression ratio or when the video has some defects. In these cases, the smooth boundary criterion may not make sense and may result in poor recovery. One example of this is the “Football” video illustrated later in Section 4.

The mathematical foundation of our proposed method is similar to that used by the maximally smooth algorithm. However, instead of using the maximally smooth assumption on block boundaries, we perform the DCT on each block with a one-pixel overlap as shown in Figure 1. Then, the lost DCT coefficients can be estimated using a set of linear equations. Because there are no assumptions about block boundaries, this method works well for any video source, even at high compression ratios.

This work was supported in part by the National Science Foundation under Grant No. 9258374.

0-7803-3258-X/96/$5.00 © 1996 IEEE

Max M.-K. Liu

Quickturn Design Systems, Inc. 440 Clyde Ave., Mt. View, CA 94043

[Figure 1 image: overlapping block layout with the labels "Block B" and "Block C".]

Figure 1: Three adjacent 9 x 9 blocks with one-pixel overlap.

[Figure 2 image: layout with the labels "Group of Blocks" and "Macro Block".]

Figure 2: H.261 GOB structure.

The drawback of the one-pixel overlap transform is that there may be a small decrease in compression ratio. Theoretically, the worst case compression ratio decrease is $1 - \frac{64}{81} \approx 21\%$ when using 9 x 9 blocks instead of 8 x 8 blocks, since each 9 x 9 block codes 81 samples but covers only 8 x 8 = 64 new pixels. However, averaging overlapping pixels increases PSNR. Experimental results show that the performance decrease in terms of PSNR vs. compression ratio is negligible.
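To make the overlap bookkeeping concrete, here is a minimal one-dimensional sketch of the idea, not the authors' implementation. A signal is cut into 9-sample blocks that share one sample with each neighbor, and the decoded blocks are merged by averaging the shared samples. The function names `split_overlapped`, `merge_overlapped` and `worst_case_overhead` are hypothetical, and the signal length is assumed to be a multiple of 8 plus 1.

```python
import numpy as np

def split_overlapped(signal, block=9, overlap=1):
    """Cut a 1-D signal into blocks that share `overlap` samples with each neighbor."""
    signal = np.asarray(signal, dtype=float)
    step = block - overlap
    if (len(signal) - overlap) % step != 0:
        raise ValueError("signal length must be a multiple of step plus overlap")
    return [signal[i:i + block].copy()
            for i in range(0, len(signal) - overlap, step)]

def merge_overlapped(blocks, overlap=1):
    """Rebuild the signal, averaging samples that were coded in two blocks."""
    block = len(blocks[0])
    step = block - overlap
    total = np.zeros(step * len(blocks) + overlap)
    count = np.zeros_like(total)
    for j, b in enumerate(blocks):
        total[j * step:j * step + block] += b
        count[j * step:j * step + block] += 1
    return total / count

def worst_case_overhead(block=9, overlap=1):
    """Fraction of coded samples that are redundant in 2-D (1 - 64/81 for 9 x 9)."""
    new_pixels = (block - overlap) ** 2
    return 1.0 - new_pixels / block ** 2

if __name__ == "__main__":
    x = np.arange(17.0)                      # 2 blocks of 9 sharing one sample
    blocks = split_overlapped(x)
    print(len(blocks), np.allclose(merge_overlapped(blocks), x))
    print(worst_case_overhead())
```

When each block carries its own quantization error, the shared sample is reconstructed twice and the two values are averaged, which is the source of the PSNR gain mentioned above.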

2. MODIFIED H.261 WITH INTERLEAVING AND OVERLAPPED BLOCKS

The proposed error recovery approach can work with most block transform based coders. To demonstrate its performance, we chose H.261 as our “test bed” coder. The standard Group of Blocks (GOB) structure for H.261 is shown in Figure 2. Each GOB consists of 33 macro blocks arranged as shown. Each macro block, in turn, consists of 4 luminance blocks (Y1, Y2, Y3, Y4) and their corresponding (decimated) chrominance blocks (U, V).

H.261 was not designed for packet video communications. To use it in a packet switching environment, we split each Group of Blocks (GOB) into five parts, as shown in Table 1. Each part (except the header) is divided into two packets, one containing the run-length and entropy encoded low frequency DCT coefficients and the other containing the similarly coded high frequency coefficients. Thus, our proposed format contains nine packet types (two for each of the four data parts, plus the header). Among them, only the packets which contain the headers need to be protected.


Part #   Elements in the part
1        All Y1, Y4 blocks in one GOB.
2        All Y2, Y3 blocks in one GOB.
3        All U blocks of odd numbered macro-blocks and all V blocks of even numbered macro-blocks in one GOB.
4        All U blocks of even numbered macro-blocks and all V blocks of odd numbered macro-blocks in one GOB.
5        Header information of H.261.

Table 1: Data arrangement in proposed CODEC.

This interleaved structure of parts in Table 1 results in the data being packetized using a “checker-board” pattern of the blocks within a GOB. Thus, if only one packet is lost, all lost blocks will be surrounded by non-lost blocks. The lost blocks are then easily recovered by our algorithm.
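The checker-board property can be checked with a short sketch. It assumes the standard H.261 layout, which the text does not spell out: a GOB is 11 macro blocks wide and 3 high, and inside a macro block Y1 is top-left, Y2 top-right, Y3 bottom-left and Y4 bottom-right. The function names are hypothetical, and only neighbors inside the same GOB are examined.

```python
MB_COLS, MB_ROWS = 11, 3  # assumed standard H.261 GOB layout (33 macro blocks)

def luminance_part(block_row, block_col):
    """Table 1 part (1 or 2) of the luminance block at this position in the GOB.

    Y1 and Y4 sit on the diagonal of a macro block, so they share row/column parity.
    """
    return 1 if (block_row % 2) == (block_col % 2) else 2

def chroma_part(mb_number, component):
    """Table 1 part (3 or 4) for the U or V block of macro block 1..33."""
    odd = mb_number % 2 == 1
    if component == "U":
        return 3 if odd else 4
    if component == "V":
        return 4 if odd else 3
    raise ValueError("component must be 'U' or 'V'")

def lost_blocks_are_isolated(lost_part):
    """True if every lost luminance block has only received 4-neighbors in the GOB."""
    rows, cols = 2 * MB_ROWS, 2 * MB_COLS
    for r in range(rows):
        for c in range(cols):
            if luminance_part(r, c) != lost_part:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if inside and luminance_part(rr, cc) == lost_part:
                    return False
    return True

if __name__ == "__main__":
    print(lost_blocks_are_isolated(1), lost_blocks_are_isolated(2))
```

Because a GOB row holds an odd number of macro blocks, odd and even macro block numbers also alternate vertically, so the chrominance parts form a checker-board on the macro block grid as well.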

The size of the DCT used in the proposed CODEC is different from that in H.261. H.261 uses the DCT with 8 x 8 blocks while ours uses 9 x 9 blocks with a one-pixel overlap, as shown in Figure 1. As will be seen in Section 4, this structure provides significant improvement in lost block recovery, with almost no decrease in compression ratio.

3. LOST BLOCK RECOVERY

Let $f_m$ and $a_k$ indicate the intensities and transformed coefficients of a (9 x 9) image block, respectively, each arranged in a one-dimensional vector in any order. The purpose of this one-dimensional format is to facilitate the description of the algorithm. Given the ordering of the one-dimensional vectors, a unitary transform can be formulated by

$$f_m = \sum_{k=1}^{N} a_k\, t_{m,k}, \qquad m = 1, \ldots, N, \tag{1}$$

where $N = 81$ is the number of pixels in a block and $t_{m,k}$ is the $m$-th element of the $k$-th transform basis vector. Let $\hat{a}_k$ be the quantized value of $a_k$, and $\hat{f}_m$ be the reconstructed intensities. Suppose that some or all of the $\hat{a}_k$ are lost during data transmission. Then,

$$\hat{f}_m = \sum_{k \in R} \hat{a}_k\, t_{m,k} + \sum_{k \in L} \hat{a}_k\, t_{m,k}, \tag{2}$$

where $R$ is the set of received coefficients and $L$ the set of lost coefficients. The problem of error recovery is to estimate $\hat{a}_k$ for $k \in L$.

With the one-pixel overlapped transform, the boundary pixel intensities of each block are stored both in the block to be processed and in its neighboring blocks, so that if a lost block ($B$) is surrounded by undamaged blocks, the boundary pixels can be retrieved from the neighboring blocks. Then, the lost coefficients, $\hat{a}_k$ for $k \in L$, can be estimated by minimizing

$$\Phi_1 = \sum_{m \in B_e} \left( \tilde{f}_m - \hat{f}_m \right)^2, \tag{3}$$

where

$B_e$: the subset of $B$ containing only the pixels on the boundary of a block.
$\tilde{a}_k$: the estimated lost coefficients, $k \in L$.
$\tilde{f}_m$: the boundary pixel intensities reconstructed by the received and the estimated lost coefficients.
$\hat{f}_m$: the boundary pixel intensities stored in neighboring blocks.

Figure 3: Smoothness constraint imposed on each sample in the direction toward its nearest boundary. An arrow between two samples means that the difference between these two samples occurs in the smoothness measure.

This estimation problem is often ill-conditioned and some type of regularization is desirable. In our algorithm, two cost functions with regularization are defined as

$$\Phi_2 = \sum_{m \in B_e} \left( \tilde{f}_m - \hat{f}_m \right)^2 + w \sum_{k \in L} \tilde{a}_k^2 \tag{4}$$

and

$$\Phi_3 = \sum_{m \in B_e} \left( \tilde{f}_m - \hat{f}_m \right)^2 + w \sum_{(m,m') \in B_s} \left( \tilde{f}_m - \tilde{f}_{m'} \right)^2, \tag{5}$$

where $w$ is a constant weighting factor and $B_s$ is a set containing the ordered index pairs indicated by arrows in Figure 3 [4]. $\Phi_2$ and $\Phi_3$ can be considered as $\Phi_1$ regularized by smoothing functions in the temporal and spatial domains, respectively [5]. In both cases, these smoothness criteria are applied within blocks and not across block boundaries. The drawback of using smoothing functions for regularization is that they add an effect similar to low pass filtering. In other words, the recovered video may be blurry. To reduce the smoothing effect, the smoothing weight factor, $w$, in Equations 4 and 5 is chosen as a positive real number which is close to 0.

The estimation of the lost transform coefficients, $\tilde{a}_k$ for $k \in L$, by minimizing $\Phi_1$, $\Phi_2$, or $\Phi_3$ is a typical least squares estimation problem. It can be solved by standard mathematical approaches. The estimation results when choosing $\Phi_2$ as the cost function can be shown as

$$\tilde{\mathbf{a}}_L = \left[ V_L^T V_L + w I \right]^{-1} V_L^T \left[ \hat{\mathbf{f}} - \mathbf{p} - V_R \hat{\mathbf{a}}_R \right] \tag{6}$$

where

$\tilde{\mathbf{a}}_L$: the vector whose elements are $\tilde{a}_k$, for $k \in L$.
$\hat{\mathbf{a}}_R$: the vector whose elements are $\hat{a}_k$, for $k \in R$.
$\hat{\mathbf{f}}$: the vector whose elements are $\hat{f}_m$, for $m \in B_e$.
$\mathbf{p}$: the vector whose elements are the interframe prediction intensities of the pixels in $B_e$. For intraframe processing mode, the vector becomes a zero vector.
$\mathbf{v}_k$: the basis vector of the transform code on the boundary of a block corresponding to $\hat{a}_k$.
$V_R$: the matrix whose column vectors are $\mathbf{v}_k$, for $k \in R$.
$V_L$: the matrix whose column vectors are $\mathbf{v}_k$, for $k \in L$.

The results when choosing $\Phi_3$ as the cost function are

$$\tilde{\mathbf{a}}_L = \left[ V_L^T V_L + w\, T_L^T D^T D\, T_L \right]^{-1} \left( V_L^T \left[ \hat{\mathbf{f}} - \mathbf{p} - V_R \hat{\mathbf{a}}_R \right] - w\, T_L^T D^T D \left[ \mathbf{p}_n + T_R \hat{\mathbf{a}}_R \right] \right) \tag{7}$$

where

$\mathbf{p}_n$: the vector whose elements are the intensities of the intra-frame prediction block.
$T_L$: the transform matrix whose column vectors are the DCT bases for $\hat{a}_k$, $k \in L$.
$T_R$: the transform matrix whose column vectors are the DCT bases for $\hat{a}_k$, $k \in R$.
$D$: a square matrix whose element at the $i$-th row and the $j$-th column, $d_{i,j}$, is defined as

$$d_{i,j} = \begin{cases} 1, & i = j \\ -1, & (i,j) \in B_s \\ 0, & \text{otherwise.} \end{cases}$$

The estimation formula when choosing $\Phi_1$ can be obtained by simply setting $w$ to zero in either Equation 6 or 7.

[Figure 4 plot. Title: Compression Performance. Horizontal axis: Compression Ratio (X : 1).]

Figure 4: PSNR vs. Compression Ratio.
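The least squares step can be illustrated with a small NumPy sketch. It is a toy under stated assumptions rather than the authors' implementation: intraframe mode (prediction vector equal to zero), no quantization, row-major ordering of pixels and coefficients, and an optional ridge term `w * I` used as a simple stand-in regularizer instead of the $D$ matrix built from Figure 3. The function names are hypothetical. The lost set is deliberately small (the six coefficients with u + v <= 2) so that the boundary samples alone determine it when `w = 0`.

```python
import numpy as np

def dct_basis(n=9):
    """N x N matrix T (N = n*n) whose columns are 2-D DCT basis images, so f = T @ a."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return np.kron(c, c).T  # orthonormal; row-major pixel and coefficient order

def boundary_indices(n=9):
    """Row-major indices of the pixels on the outer ring of an n x n block."""
    return np.array([r * n + s for r in range(n) for s in range(n)
                     if r in (0, n - 1) or s in (0, n - 1)])

def recover_lost(a_hat, lost, f_boundary, T, boundary, w=0.0):
    """Estimate lost coefficients from boundary pixels (intraframe case, p = 0).

    Solves min ||V_L a_L - (f_boundary - V_R a_R)||^2 + w ||a_L||^2.
    The ridge term is an assumed stand-in, not the paper's smoothing matrix D.
    """
    lost = np.asarray(lost)
    received = np.setdiff1d(np.arange(T.shape[1]), lost)
    V_L = T[np.ix_(boundary, lost)]          # boundary rows of lost basis vectors
    V_R = T[np.ix_(boundary, received)]      # boundary rows of received ones
    rhs = V_L.T @ (f_boundary - V_R @ a_hat[received])
    return np.linalg.solve(V_L.T @ V_L + w * np.eye(len(lost)), rhs)

if __name__ == "__main__":
    n = 9
    T = dct_basis(n)
    rng = np.random.default_rng(0)
    f = rng.uniform(0, 255, n * n)           # toy block intensities
    a = T.T @ f                              # transform coefficients
    lost = [u * n + v for u in range(n) for v in range(n) if u + v <= 2]
    a_damaged = a.copy()
    a_damaged[lost] = 0.0                    # simulate the lost packet
    B_e = boundary_indices(n)
    a_est = recover_lost(a_damaged, lost, f[B_e], T, B_e)
    print(np.max(np.abs(a_est - a[lost])))   # recovery error of the lost set
```

For larger lost sets the boundary samples may not determine every coefficient (a full 4 x 4 low-frequency square, for example, leaves `V_L` rank deficient), which is where a small positive `w` becomes necessary.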

4. SIMULATION RESULTS

Figure 4 shows the PSNR vs. compression ratio for the “Miss America,” “Table Tennis,” and “Football” videos using standard H.261 and the method proposed here. From the figure, the decrease in compression ratio is approximately 6.4% for “Table Tennis” (at PSNR = 30 dB) and 0% for “Football.” For “Miss America,” the compression ratio is even raised by the overlap block coding.
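The two quantities plotted in these comparisons can be computed as in the sketch below. The paper does not state its PSNR definition, so this assumes the usual one for 8-bit video (peak value 255); the function names are hypothetical.

```python
import math
import numpy as np

def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit samples."""
    ref = np.asarray(reference, dtype=float)
    dec = np.asarray(decoded, dtype=float)
    mse = np.mean((ref - dec) ** 2)
    if mse == 0:
        return math.inf                      # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)

def compression_ratio(raw_bytes, coded_bytes):
    """Ratio X in 'X : 1', e.g. raw size of the frames over the coded file size."""
    return raw_bytes / coded_bytes

if __name__ == "__main__":
    frame = np.full((4, 4), 100.0)
    print(psnr(frame, frame + 1.0), compression_ratio(1000, 10))
```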

For testing the performance of lost block recovery, we discarded all packets (of one frame processed in interframe mode) containing the low frequency portion of Part 1 in Table 1 (i.e., the low frequency DCT coefficients for every luminance block lying in a checker-board pattern).¹ Then, we used the maximally smooth algorithm [4] and our proposed method to recover the lost blocks. Figure 5 shows the image frame of the video “Football” obtained using standard H.261 without any block loss. Figures 6 and 7 show the damaged image frame recovered by the maximally smooth algorithm and by the proposed algorithm, respectively. The same quantization step size was used for compression under the three schemes. As can be seen from these results, the proposed algorithm provides a striking improvement in visual quality compared to the maximally smooth scheme.

¹ If Part 2 were also lost, a recursive operation is required [7].

Figure 5: Decoded image frame without block loss.

Figure 6: Decoded image frame with block loss recovered by the maximally smooth algorithm.

Figure 7: Decoded image frames with block loss recovered by the proposed algorithm choosing $\Phi_3$ as the regularization.

Figures 8-10 show the objective results of the tests in terms of compression ratio vs. PSNR. The compression ratio is found by compressing 30 consecutive image frames of a given video into a single file. Each video is compressed by two different CODECs, one with block overlap and the other without block overlap. Seven different quantization step sizes, 8, 16, 24, ..., 56, are used to provide different compression ratios. Then, all DCT coefficients of Part 1 in a frame are dropped for the tests. Finally, the damaged videos are decoded and error concealed by four different algorithms: motion compensation block substitution (Method 1), the Least Squares based Maximally Smooth algorithm without block overlap (Method 2) [4][5], the proposed Least Squares algorithm using block overlap and $\Phi_2$ as the regularization function (Method 3), and the proposed Least Squares based algorithm with block overlap using $\Phi_3$ as the regularization function (Method 4). In all the tests, the proposed algorithm using either $\Phi_2$ or $\Phi_3$ as the regularization outperforms the Maximally Smooth algorithm.

[Figure 8 plot. Horizontal axis: Compression Ratio (X : 1).]

Figure 8: Objective results - “Miss America.”

[Figure 9 plot. Title: Football (Lost all low freq. parts of Y1 and Y4 blocks). Horizontal axis: Compression Ratio (X : 1).]

Figure 9: Objective results - “Football.”

[Figure 10 plot. Title: Table Tennis (Lost all low freq. parts of Y1 and Y4 blocks). Legend: No overlap, No loss; Method 1; Method 2; Method 3; Method 4. Horizontal axis: Compression Ratio (X : 1).]

Figure 10: Objective results - “Table Tennis.”

5. CONCLUSION

This paper proposes a new method for lost/damaged block recovery by estimating lost DCT coefficients. With one-pixel overlapped block coding, the estimation of lost DCT coefficients does not need any smoothness assumption across block boundaries. One-pixel overlap block coding generally decreases the compression ratio for a fixed quantization step size. However, the resulting PSNR increases compensate for this effect. Experimental results show that the performance difference in terms of PSNR vs. compression ratio is small in the case of no packet loss. However, when packet losses are present, dramatic improvements can be obtained in both objective and subjective image quality.

6. REFERENCES

[1] P. Haskell and D. Messerschmitt, “Resynchronization of Motion Compensated Video Affected by ATM Cell Loss,” Proc. 1992 Int. Conf. Acoust., Speech and Signal Processing, vol. 3, 1992, pp. 545-548.

[2] M. Ghanbari, “An Adapted H.261 Two-Layer Video Codec for ATM Networks,” IEEE Trans. on Communications, vol. 40, 1992, pp. 1483-1490.

[3] Y. Chen, K. Sayood, “A Robust Coding Scheme for Packet Video,” IEEE Trans. on Communications, vol. 40, no. 9, 1992, pp. 1491-1501.

[4] Y. Wang, Q. Zhu, and L. Shaw, “Maximally Smooth Image Recovery in Transform Coding,” IEEE Trans. on Communications, vol. 41, no. 10, Oct. 1993, pp. 1544-1551.

[5] Q. Zhu, Y. Wang and L. Shaw, “Coding and Cell-Loss Recovery in DCT-Based Packet Video,” IEEE Trans. on Circuits and Systems for Video Technology, vol. 3, no. 3, June 1993, pp. 248-258.

[6] William H. Press, Saul A. Teukolsky, William T. Vetterling, Brian P. Flannery, “Numerical Recipes in C,” Cambridge University Press, 1992, pp. 808-812.

[7] Gongsan Yu, “Error Concealment for Packet Video,” Ph.D. dissertation, University of Arizona, 1996.
