
National Library of Canada / Bibliothèque nationale du Canada

Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques

395 Wellington Street, Ottawa ON K1A 0N4, Canada / 395, rue Wellington, Ottawa ON K1A 0N4, Canada

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


Abstract

Compressed image and video bit streams are very sensitive to channel errors and may be altered or lost during transmission. Error concealment by post-processing intends to reconstruct lost visual information by exploiting the correlation between the image/video data. The applications of concealment of errors in coded visual information include visual communication over unreliable channels such as wireless networks and the Internet.

For most types of encoders and input data, coded visual information consists of a collection of coded texture (DCT coefficients), shape and motion information. In this thesis we present concealment methods for errors in texture and shape information, and address the concealment of errors in motion data in conjunction with its corresponding texture or shape information. The method developed for concealment of errors in coded texture involves compensation of the effects of the missing data on the rest of the texture information and then using a deterministic or a statistical algorithm for the restoration of missing information. The deterministic algorithm achieves a good performance level in the reconstruction of edges. The statistical algorithm, which is based on maximum a posteriori (MAP) estimation, employs an adaptive Markov random field (MRF) as the image a-priori model. The adaptation enables the estimation procedure to incorporate more information without a dramatic increase in computational complexity. MAP estimation is also employed for the reconstruction of missing shape data. Although it uses an adaptive MRF, the estimator is different in the sense that it is designed for binary shape information.

In the second part of the thesis, we evaluate the performance of the developed concealment methods for three different types of coded visual data: baseline JPEG coded still images, H.263 coded video and MPEG-4 coded video. Our experimental results demonstrate that the methods presented in this thesis achieve consistently good computation-performance tradeoffs, making them very beneficial for real time communication over error prone networks. In fact, the proposed error concealment methods can lead to acceptable visual quality at loss rates as high as 20%.

Contents

Abstract

Contents

List of Tables

List of Figures

Acknowledgements

Dedication


1 Introduction

1.1 Outline of the Thesis

2 Background

2.1 Image and Video Compression

2.2 The JPEG Compression Standard

2.3 Block-Based Video Coding

2.3.1 Motion Compensated Prediction

List of Tables

Computational complexity of various reconstruction methods.

PSNR results (in dB) for the 512 x 512 LENA and PEPPER images. The numbers are for the images after removing the stripes and reconstructing the missing data.

PSNR results (in dB) for the 512 x 512 LENA image for various reconstruction methods. The numbers are for the image after removing the stripes and reconstructing the missing data.

PSNR comparison of different methods for the video sequence FOREMAN.

PSNR comparison of different methods for the video sequence AKIYO.

List of Figures

The block diagram of a typical image encoder.

The scanning of the AC coefficients in JPEG.

The block diagram of a typical video coder.

A basic block diagram of an MPEG-4 video coder.

A VOP enclosed in a rectangular bounding box and divided into macroblocks.

Binary alpha plane.

A simplified block diagram of a video communication system.

Error propagation in the frames of a video sequence.

A site (X) and its (a) first order neighborhood system, (b) second order neighborhood system.

3.1 Error concealment of intra coded texture information.

3.2 Error concealment of inter coded texture information.

3.3 Estimation of DC value.

3.4 Post-processing for removing the stripes.

3.5 Removing the stripes for blocks with different horizontal positions.

3.6 Histogram of the difference value: (a) Fighter; (b) Mandrill.

3.7 The partitioning of the blocks adjacent to a missing block.

3.8 A pixel, its clique c and the eight directions. The complement of the clique, c', is the dark area.

3.9 A missing pixel in a vertical line.

3.10 (a) The four sub-blocks, (b) their corresponding sub-blocks in the previous frame and the blocks connected to them.

A pixel, its clique c and the eight directions. The complement of the clique, c', is the dark area.

A 3 x 3 pixel window, the border pixels in it (shaded) and the best line-fit.

Original images: (a) Lena; (b) Pepper.

Images: (a) Lena with 10% missing data; (b) Pepper with 3% missing data. The missing blocks are shown in white to show their positions clearly.

Decoded images: (a) Lena with 10% missing data; (b) Pepper with 3% missing data. The value of each pixel of the missing blocks is replaced with the estimated DC value of the block.

Pictures after removing the stripes: (a) Lena; (b) Pepper.

Reconstructed images after removing the stripes and restoration of missing blocks using the method based on Equation (3.4) (piecewise constant weights).

5.6 Reconstructed images after removing the stripes and restoration of missing blocks using the method based on Equation (3.6) (linear ramp weights).

5.7 Reconstructed image using our proposed statistical method.

5.8 Reconstructed images using the method proposed in ...

5.9 Reconstructed image using the method proposed in ...

5.10 Reconstructed image when diagonally adjacent blocks are used in addition to horizontally and vertically adjacent blocks.

5.11 Magnified picture of part of the image LENA shown in Figure 5.5.

5.12 Magnified picture of the same area shown in Figure 5.11 for the image LENA of Figure 5.6.

5.13 Magnified picture of the same area shown in Figure 5.11 for the image LENA of Figure 5.7.

5.14 Magnified picture of the same area shown in Figure 5.11 for the image LENA of Figure 5.8.

5.15 Magnified picture of the same area shown in Figure 5.11 for the image LENA of Figure 5.9.

5.16 Magnified picture of the same area shown in Figure 5.11 for the image LENA of Figure 5.10.

5.17 PSNR values for image sequence FOREMAN with 10% GOB missing for different concealment methods.

5.18 An inter coded frame of the sequence FOREMAN (frame 22) concealed by the (a) replacement, (b) median, and (c) proposed methods.

A frame from the video sequence FOREMAN: (a) original, (b) missing blocks, reconstructed using (c) a GMRF model, (d) a suboptimal HMRF model, (e) the method proposed in ...

A frame from the video sequence FOREMAN (a) with missing GOBs, reconstructed using (b) a GMRF model, (c) a suboptimal HMRF model, (d) the method proposed in ...

A frame from the video sequence AKIYO: (a) original, (b) with missing blocks, reconstructed using (c) a GMRF model, (d) a suboptimal HMRF model, (e) the method proposed in ...

A frame from the video sequence AKIYO (a) with missing GOBs, reconstructed using (b) a GMRF model, (c) a suboptimal HMRF model, (d) the method proposed in ...

Video packet structure of MPEG-4 in (a) I-VOPs and (b) P-VOPs.

The shape of the first I-VOP of the AKIYO video object.

The shape of the AKIYO video object missing 30% of the shape blocks.

The reconstructed shape of the first I-VOP of the AKIYO video object.

The shape of the first I-VOP of the BREAM video object.

The shape of the BREAM video object missing 25% of the shape blocks.

The reconstructed shape of the first I-VOP of the BREAM video object.

The shape of the first I-VOP of the WEATHER video object.

The shape of the WEATHER video object missing 13.5% of the shape blocks.

The reconstructed shape of the first I-VOP of the WEATHER video object.

A pixel, its clique (dark) and the complement of the clique (in shade).

A.2 Simplified diagram of a parallel processing machine.

A.3 The connection of adjacent PEs for error concealment based on the MRF models.

A.4 The lookup table used in the error concealment based on the GMRF model.

A.5 The connection of adjacent PEs in the error concealment based on HMRF model.

A.6 The bit pattern for a pixel value of 98 and γ = 2.

A.7 The lookup table for the Huber function with γ = 2.

Acknowledgements

There is no such thing as a 'self-made' man. We are made up of thousands of others. Everyone who has ever done a kind deed for us, or spoken one word of encouragement to us, has entered into the make-up of our character and of our thoughts, as well as our success.

George Matthew Adams

First and foremost, I would like to express my deep gratitude and appreciation to my supervisors, Professor Faouzi Kossentini and Professor Rabab Ward, for their consistent and generous support, encouragement and guidance in both technical and nontechnical matters. I would also like to thank my colleagues in the Signal Processing and Multimedia Group of the University of British Columbia (UBC) for creating a friendly and stimulating work environment. In particular, I am grateful for my friends and colleagues Jaehan In, Guy Cote, Michael Gallant and Berna Erol for their assistance with the JPEG, H.263 and MPEG-4 standards. I am thankful for my friends and officemates Adriana Dumitras and Dave Tompkins, for reading parts of my thesis and making constructive suggestions. I would like to express my appreciation to Dr. Alen Docef who was an invaluable source of support while he was at UBC. Also, I would like to thank my friend Kayvan Najarian for all the fruitful discussions that we had during my years at UBC. I am grateful to Dr. Ali Jerbi of the National Research Council of Canada, where I spent an internship in 1998, for his support and stimulating environment. My gratitude also goes to the Faculty of Graduate Studies of UBC who helped make my education possible by awarding me with a University Graduate Fellowship.

My deepest gratitude, love and affection belong of course to my parents, to whom, as I begin to realize more and more, I owe all that I am and all that I have ever accomplished. And finally, I dedicate this work to my dear wife, Mehrnoosh, whose love, care and patience have supported me.

The University of British Columbia

March 2000

To my dear Parents

and

To Mehrnoosh, my beloved wife.

Chapter 1

Introduction

An education isn't how much you have committed to memory, or even how much you know. It's being able to differentiate between what you do know and what you don't.

The fast growth of digital transmission services has created a great interest in digital transmission of image and video signals. Digital image and video signals require very high bit rates; therefore compression is used to reduce the amount of data needed for representation of such signals. Compression is achieved by exploiting the spatial and temporal redundancies of the signal and the characteristics of the user. Several compression standards have emerged to facilitate the growth of new visual communication applications, including the Joint Photographic Experts Group (JPEG) standard for still image compression, the International Telecommunication Union (ITU) recommendation H.261 for video telephony conferencing and its subsequent revisions, H.263 and H.263+, and the Moving Pictures Expert Group (MPEG) standards for full motion video compression and coding in digital storage media and digital communication applications, namely MPEG-1, MPEG-2 and MPEG-4. The common feature of these compression standards is the use of the discrete cosine transform (DCT), due to its effectiveness, efficiency and simplicity.

Communication channels are not error free, and consequently, the encoded bit streams are vulnerable to transmission errors, usually causing loss of blocks of data and/or loss of synchronization. The compressed image and video bit streams are very sensitive to channel errors due to the variable length codewords and prediction techniques, which are typically used in image/video coding algorithms.

Many techniques have been proposed to combat the transmission error problem. In one group of techniques, the source coding algorithm and/or transport control mechanisms are designed either to minimize the effects of transmission errors without requiring any error concealment at the decoder or to make the error concealment task at the decoder more effective. Examples of such techniques are forward error correction (FEC), joint source-channel coding and layered coding. These techniques usually increase the overhead in terms of the overall bitstream size; hence, part of the coding efficiency achieved by compression is lost. In another group of techniques, it is assumed that a backward channel from the decoder to the encoder is available and the encoder and decoder work cooperatively to minimize the impact of the errors. Examples of such techniques are Automatic Retransmission Request (ARQ) and selective predictive coding based on the feedback from the decoder. Retransmission has generally been considered unacceptable for real time video applications because of the associated delay. Retransmission has also been considered inappropriate for multipoint video transmission because the repeat request from a large number of decoders can overwhelm the encoder. Another disadvantage of retransmission is that it may increase the loss rate by adding more traffic on the network. In a third group of techniques, referred to as error concealment by post-processing, the decoder fulfills the task of error concealment. The objective of post-processing is to remove the visually annoying artifacts that degrade the picture quality without distorting the information content. In general, error concealment by post-processing attempts to recover, reconstruct or restore the lost information by estimation or interpolation without relying on additional information from the encoder. Since the available information cannot uniquely determine the original information, prior knowledge about the image/video signal is typically used in an effort to characterize the lost information.

In this thesis, we propose post-processing methods for concealment of errors resulting from the transmission of coded visual information in error prone environments. Based on the type of the encoder and the input data, coded visual information consists of a collection of coded texture (DCT coefficients), shape and motion information. Object-based video coding algorithms generate coded texture, shape, and motion information while bitstreams of block-based coded video do not include the shape information. In the still image coding case, the output bit stream consists of only texture information. In this work, we address concealment of errors in texture, shape and motion information. The motion data is solely used for motion compensated prediction of texture and shape. Thus, we consider the concealment of errors in motion data in conjunction with its corresponding texture or shape information. We also evaluate the performance of our proposed concealment methods for three different types of coded visual data: baseline JPEG coded still images, H.263 coded video and MPEG-4 coded video. In each case, we evaluate our concealment results subjectively and objectively. Moreover, we compare our reconstruction results with those of other methods proposed in the literature. We show that the methods proposed in this thesis achieve consistently better performance-computation tradeoffs. In particular, the proposed methods are suitable for real time video communication over packet lossy and bit error prone networks. Employing the proposed error concealment methods can lead to acceptable video quality at loss rates as high as 20%.

1.1 Outline of the Thesis

In Chapter 2, the image and video compression standards and various methods that have been proposed to combat channel errors in transmission of compressed visual information are reviewed. Moreover, various error concealment methods are discussed in detail. Finally, Maximum a Posteriori (MAP) estimation and Markov Random Fields (MRFs) are reviewed.

In Chapter 3, we propose novel error concealment methods for the texture component of coded visual data. Proper error concealment methods are suggested for intra coded and inter coded texture information. Obviously, concealment methods developed for intra coded texture are applicable to compressed still images. The method introduced for concealment of errors in intra coded texture, unlike previously published algorithms, considers the dependency between coded blocks in a frame. It uses a deterministic or a statistical method for the restoration of missing texture information. The deterministic method achieves a good performance level in the reconstruction of edges. The statistical method uses an adaptive MRF as the image a-priori model. The adaptation enables the estimation procedure to incorporate more information without increasing the order of the MRF. The method proposed for concealment of errors in inter coded texture uses the information in the adjacent frame to find an initial estimate of the missing texture. The initial estimate is then combined with the estimate of the prediction error of the missing block to find the best approximation of the missing texture information. Since MRF-based error concealment methods are usually computationally complex, we discuss fast implementations of these methods in Appendix A.

In Chapter 4, we propose efficient concealment methods for errors in the shape information in coded video. This subject has, so far, not been addressed in the literature. Similar to Chapter 3, we propose error concealment methods for intra coded and inter coded shape data. For intra coded shape, a MAP estimator, which uses an adaptive MRF as the image a-priori model, is used to estimate the missing shape information. For inter coded shape information a two stage reconstruction method, similar to the one that is proposed in Chapter 3, is devised.

Finally, in Chapter 5, we examine the performance of the error concealment methods proposed in Chapters 3 and 4. We consider the bitstreams generated by various coding standards: JPEG, H.263 and MPEG-4. First, we evaluate the performance of the proposed methods for baseline JPEG coded images. Next, we consider the performance of the proposed error concealment methods for H.263 coded video sequences. Finally, we show the results of the proposed method for reconstruction of missing shape information in MPEG-4 coded video data. In each case, we compare the performance of our proposed method with that of other methods available in the literature.

Chapter 2

Background

The farther back you can look, the farther forward you are likely to see.

Winston Churchill

In this chapter, first, the main image and video compression standards are briefly reviewed. Then, the challenging issue of transmission of compressed visual information in an error prone environment is introduced, and various methods that have been proposed to combat channel errors are reviewed. Next, error concealment is discussed and various error concealment methods are surveyed in detail. Finally, Maximum a Posteriori (MAP) estimation and Markov Random Fields (MRFs) are reviewed.

2.1 Image and Video Compression

The importance of visual communications has increased tremendously in the last few decades as a result of the progress in electronics and computer technology and the creation of networks operating with various capacities. Emerging applications such as video conferencing, mobile terminals, and Internet-based audio-visual communication have a great impact on everyday life, education and entertainment. Visual information is one of the richest but also the most bandwidth consuming components in a communication system. To meet the requirements of new applications, powerful data compression techniques are needed to reduce the bit rate dramatically, even in the presence of growing communication channels offering increased bandwidth.

A number of compression standards have been defined to facilitate the growth of new visual communication applications. The JPEG still-image compression standard was proposed by the Joint Photographic Experts Group [1, 2, 3, 4, 5, 6]. Recommendation H.261 (and its subsequent revisions, namely H.263 and H.263+) was introduced by the International Telecommunication Union (ITU) for video conferencing at bit rates down to 64 kb/s. The MPEG standards from the Moving Pictures Expert Group (MPEG) address the compression of video signals and the associated audio information [11]. The common feature of these compression standards is the use of the discrete cosine transform (DCT), mainly because of its effectiveness and simplicity. Encoding algorithms that are compliant with the above DCT-based standards have provided good image/video reproduction quality at relatively high compression ratios (more than 20:1 for JPEG, more than 100:1 for H.263, and more than 100:1 for MPEG-1/MPEG-2).

2.2 The JPEG Compression Standard

JPEG is the current DCT-based standard for compression of still, continuous-tone, monochrome and color images. JPEG has four distinct modes of operation: sequential DCT-based, progressive DCT-based, lossless, and hierarchical.

[Diagram blocks: image, 8 x 8 block, FDCT, optional quantizer, mode-specific entropy coder, headers/tables.]

Figure 2.1: The block diagram of a typical image encoder.

All DCT-based modes of operation in JPEG follow the same basic coding procedure shown in Figure 2.1. First, an input image is partitioned into 8 x 8 blocks. Each block is then transformed using the forward discrete cosine transform and, optionally, quantized. The first element of each 8 x 8 DCT block is called the DC coefficient and the rest are called the AC coefficients. The DCT coefficients are then prepared for entropy encoding. The JPEG standard allows two algorithms for entropy coding. The first algorithm is Huffman-based and the second one is based on arithmetic coding. Huffman-based coding is widely used because it is simple and publicly available, whereas arithmetic coding (although more compression efficient) is more complex and it is mostly patented. Finally, necessary information for decoding such as headers (image dimension and sample precision) and tables (quantization and Huffman tables) are added to the encoded data to form the compressed bit stream.

[Diagram axes: low frequency to high frequency, horizontally and vertically.]

Figure 2.2: The scanning of the AC coefficients in JPEG.

Likewise, at the decoder, the received bit streams are entropy decoded to provide 8 x 8 blocks of DCT coefficients, which are then dequantized and transformed using inverse DCT to yield the reconstructed image.
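The transform and quantization steps of this encoder/decoder pair can be sketched as follows. This is a minimal illustration, not the JPEG reference procedure: it assumes a single uniform quantizer step (`step`, a hypothetical parameter) in place of a quantization table, applies the usual level shift of 128 for 8-bit samples, and leaves out entropy coding entirely. The function names are illustrative only.

```python
import math

N = 8  # block size used by the DCT-based JPEG modes

def dct_matrix(n=N):
    """Rows are the orthonormal DCT-II basis vectors of length n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

C = dct_matrix()
C_T = [list(col) for col in zip(*C)]

def forward_dct(block):
    """2-D DCT of an 8 x 8 block: C * block * C^T."""
    return matmul(matmul(C, block), C_T)

def inverse_dct(coeffs):
    """2-D inverse DCT: C^T * coeffs * C."""
    return matmul(matmul(C_T, coeffs), C)

def encode_block(pixels, step=16):
    """Level shift, transform, and quantize with one uniform step (assumption)."""
    shifted = [[p - 128 for p in row] for row in pixels]
    return [[int(round(c / step)) for c in row] for row in forward_dct(shifted)]

def decode_block(levels, step=16):
    """Dequantize, inverse transform, undo the level shift, clip to 8 bits."""
    coeffs = [[q * step for q in row] for row in levels]
    return [[min(255, max(0, int(round(v + 128)))) for v in row]
            for row in inverse_dct(coeffs)]
```

For a flat block, all the energy lands in the DC coefficient (`levels[0][0]`) and every AC level is zero, which is the simplest case of the energy compaction the DCT is chosen for.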

In the sequential DCT-based mode the 8 x 8 DCT blocks in an image are entropy encoded in a raster scan order. The DC coefficient of the 8 x 8 block is always coded first. The difference between the DC value and its predicted value, which is the DC coefficient of the most recently coded block, is fed to the entropy coder. The AC coefficients are first scanned in the zigzag sequence order as shown in Figure 2.2 and then entropy encoded to form part of the compressed image data.
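A short sketch may help make the DC prediction and the zigzag scan concrete. It assumes the predictor starts at 0 for the first block (`initial_prediction` is an illustrative parameter), and the helper names are hypothetical; entropy coding of the resulting symbols is not shown.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        cells = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(cells) if s % 2 == 0 else cells)
    return order

def scan_block(block):
    """Split a quantized block into its DC value and zigzag-ordered AC values."""
    coeffs = [block[r][c] for r, c in zigzag_order(len(block))]
    return coeffs[0], coeffs[1:]

def dc_differences(dc_values, initial_prediction=0):
    """Replace each DC value by its difference from the previous block's DC."""
    diffs = []
    prediction = initial_prediction
    for dc in dc_values:
        diffs.append(dc - prediction)
        prediction = dc  # the most recently coded block predicts the next one
    return diffs

def dc_from_differences(diffs, initial_prediction=0):
    """Decoder side: accumulate the differences to recover the DC values."""
    values = []
    prediction = initial_prediction
    for d in diffs:
        prediction += d
        values.append(prediction)
    return values
```

Because each DC value is rebuilt by accumulating differences, one lost or corrupted difference offsets every DC value decoded after it until the predictor is reset, which is one reason coded blocks within a frame cannot be treated as independent when errors occur.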

There are two types of sequential DCT-based JPEG. The simpler form of the sequential DCT-based mode of operation is called the "baseline JPEG" [7]. Only an 8-bit input precision and two sets of Huffman tables (each set includes one DC table and one AC table) for entropy coding are allowed in baseline JPEG. Sequential DCT-based JPEG that has capabilities beyond the baseline JPEG is called "extended sequential JPEG," which allows 12-bit input precision and 4 sets of Huffman tables.

In the progressive JPEG (PJPEG) mode, 8 x 8 blocks are formed in the same order as in sequential JPEG, DCT transformed and (optionally) quantized to a specific number of bits. The quantized DCT coefficients are then partially encoded in multiple scans. Each scan corresponds to a few bits of one or more DCT coefficients, and represents a portion of the image being encoded/decoded. While the sequential JPEG mode yields essentially the same level of compression performance for most encoder implementations, the performance of progressive JPEG depends highly upon the structure of the designed encoder. This is due to the flexibility the JPEG standard leaves open in designing progressive JPEG encoders [8, 9].

2.3 Block-Based Video Coding

Most block-based video compression algorithms (e.g., H.263, MPEG-2) rely on two basic techniques: block-based motion compensated prediction for the reduction of temporal redundancies and transform domain (DCT) based coding for the reduction of spatial redundancies [10, 11]. A block diagram of a typical video encoder is shown in Figure 2.3 and is briefly explained in the following.

2.3.1 Motion Compensated Prediction

Motion compensated prediction techniques are used to reduce temporal redundancies between video frames. Most of these techniques assume that the pixels within the current picture can be modeled as translations of the pixels within previous or future pictures. Mainly due to their simplicity and efficiency, block-based motion compensation algorithms that are based on a translational model are popular. In such algorithms, the motion vector is obtained by finding a macroblock (a 16 x 16 area of the picture) within a pre-defined search area, such that a cost function which measures the mismatch between this macroblock and the current macroblock is minimized. The most widely used cost measure is the sum of absolute differences (SAD). To find the best matching macroblock, the SAD should be computed at several locations within the search area. The simplest, but the most computationally intensive method, known as the full or exhaustive search method, evaluates the SAD for macroblocks positioned at every possible pixel location in the search area. To reduce the computational complexity, several algorithms with fewer search points have been proposed [12, 13, 14, 15]. The Picture Memory in Figure 2.3 stores one or more previously reconstructed frames and represents the motion estimation component of the video coding algorithm.

[Diagram blocks and signals: video in, coding control, DCT, inverse quantizer, inverse DCT, picture memory; INTER/INTRA decision flag, "transmitted or not" flag, quantizer indication, quantizer index for transform coefficients, motion vectors.]

Figure 2.3: The block diagram of a typical video coder.

2.3.2 Transformation

Transforms decorrelate the image content and compact energy into as few coefficients as possible. The DCT and Inverse DCT blocks shown in Figure 2.3 perform 8 x 8 transformations of the luminance and chrominance blocks of macroblocks.

2.3.3 Quantizer

The human viewer is more sensitive to reconstruction errors in the low spatial frequency regions of an image than in the high frequency ones [16, 17]. Low frequency changes in intensity or color are easily distinguished by the eye. Quick, high frequency changes are harder to see and may therefore be discarded. In video coding, quantization consists of applying uniform mid-tread quantizers to the DCT coefficients.
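A uniform mid-tread quantizer has zero as one of its output levels, so small coefficients are discarded entirely. A minimal sketch follows, assuming a single step size per coefficient; the function names are hypothetical, and the dead zones and weighting matrices used by actual standards are omitted.

```python
import math


def quantize(coefficient, step):
    """Uniform mid-tread quantizer: index of the nearest multiple of `step`."""
    index = math.floor(abs(coefficient) / step + 0.5)
    return index if coefficient >= 0 else -index


def dequantize(index, step):
    """Reconstruction value for a quantizer index."""
    return index * step
```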

2.3.4 Entropy Coding

Prior to entropy coding, the quantized DCT coefficients are zig-zag scanned, in order of increasing frequency. The re-arranged array is coded into a sequence of run-length codes. The run is defined as the distance between two non-zero coefficients in the array. The level is the non-zero value immediately following a sequence of zeros. The run-level pairs, and other relevant information about the macroblock such as motion vectors and prediction types, are then entropy coded, mostly using Huffman-like coding.
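The zig-zag scan and the formation of run-level pairs can be sketched as below. The function names are assumptions, and the run is taken here to be the number of zeros preceding each non-zero level; the handling of trailing zeros (normally an end-of-block code) is left out.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Anti-diagonals in order; odd diagonals run downwards, even ones upwards.
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
    )


def run_level_pairs(block):
    """Zig-zag scan a square block and return (run, level) pairs."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs  # trailing zeros are not represented here
```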

2.3.5 Prediction Types

The coding mode in which temporal prediction with reference to a previous picture is applied is called inter, and it is called intra if no temporal prediction is used. The intra coding mode can be selected for a complete picture or for macroblocks in an inter coded picture. Intra coding is performed periodically to eliminate error propagation introduced by inter-frame prediction. Intra and inter coded pictures are usually called I-pictures and P-pictures, respectively. There is another type of coded picture called B-picture, in which each 16 x 16 macroblock can be forward predicted, backward predicted or both. The motion information consists of one vector for a forward or backward predicted macroblock, and two vectors for bidirectionally predicted macroblocks. The H.263 standard has an optional mode called PB-frame. A PB-frame consists of two pictures being coded as one unit: one P-picture which is predicted from the previously decoded P-picture and one B-picture which is predicted from both the previously decoded P-picture and the P-picture currently being decoded. The switches in Figure 2.3 differentiate between intra and inter prediction.

2.3.6 Coding Layers

Standard DCT-based video coding algorithms typically decompose the video sequence into six layers: Block Layer, Macroblock Layer, Slice Layer, Picture Layer, Group of Pictures Layer, and Sequence Layer. Each layer supports a definite function: either a signal processing function (e.g., DCT at the Block Layer, motion compensation at the Macroblock Layer) or a logical function (resynchronization at the Slice Layer, random access at the Group of Pictures or at the Sequence Layer). From a coding standpoint, the macroblock layer is the most important one, as most coding decisions are made at this layer. Coding decisions usually include prediction type, motion vector choice, and quantizer selection.

2.4 The MPEG-4 Compression Standard

The new generation of highly interactive multimedia applications requires that the users be able to access and manipulate multimedia data. This has fueled recent MPEG-4 international standardization activities for a content-based coded representation of multimedia data.

MPEG-4, like MPEG-1 and MPEG-2 [27, 28] and the ITU-T H.263/H.263+ [29, 30] standards, offers high compression performance levels, rendering the storage and transmission of audiovisual data more efficient. The other key objectives of MPEG-4 are to provide object-based access and to provide functionalities such as error resilience, scalability and hybrid coding of synthetic and natural data [31, 32, 33, 34, 35, 36].

2.4.1 Audiovisual Object Representation

MPEG-4 achieves content-based representation by defining audiovisual objects and coding them into separate bit stream segments [31, 37, 38]. An audiovisual object (AVO) consists of a visual object component, an audio object component, or a combination of these components. Some examples of AVOs include a sound recorded with a microphone, speech synthesized from a text, a person recorded by a video camera, and a 3D image with text overlay.

MPEG-4 supports the composition of a set of audiovisual objects into a scene, also referred to as an audiovisual scene. In order to allow interactivity with individual AVOs within a scene, it is essential to transmit the information that describes each AVO's spatial and temporal coordinates. This information, which is referred to as the scene description information, is transmitted as a separate stream and multiplexed with AVO elementary bit streams so that the scene can be composed at the user's end. This functionality makes it possible to change the composition of AVOs without having to change the content of AVOs.

2.4.2 The MPEG-4 Visual Coding Standard

The MPEG-4 visual coding standard provides standardized core processing elements that allow efficient storage, transmission and manipulation of visual data [31]. Different representations and compression algorithms may offer optimal solutions for different applications, bit rates, and formats. Therefore, MPEG-4 provides four different types of coding tools: video object coding for coding of a natural and/or synthetically originated, rectangular or arbitrarily shaped video object, mesh object coding for coding of a visual object represented with a mesh structure, model-based coding for coding of a synthetic representation and animation of the human face and body, and still texture coding for wavelet coding of still textures.

Video Object Coding

A video object (VO) is an arbitrarily shaped video segment that has a semantic meaning. A 2D snapshot of a VO at a particular time instant is called a video object plane (VOP). A VOP is defined by its texture (luminance and chrominance values) and its shape. MPEG-4 allows access to the video objects and also to temporal instances of the video objects, i.e., VOPs. To enable access to an arbitrarily shaped object, such an object needs to be separated from the background and other objects. This process is called segmentation, and it can be performed during encoding (on-line), or prior to encoding (off-line). The segmentation process is not standardized in MPEG-4. There are a number of automatic and semi-automatic tools available for segmentation [39]. Also, it is possible to generate image sequences that are segmented initially by using techniques such as chroma keying [40], where a unique color is used to separate the background from a video object.

MPEG-4 video object coding consists of shape coding (for arbitrarily shaped VOs), motion compensated prediction to reduce temporal redundancies, and DCT-based texture coding of the motion compensated prediction error data to reduce spatial redundancies. Video coding in MPEG-4 is performed at the macroblock level. VOPs are divided into macroblocks, such that they are represented with the minimum number of macroblocks within a bounding rectangle. When the VOP is a rectangularly shaped video frame, MPEG-4 video coding becomes quite similar to that specified in MPEG-1/MPEG-2 [27, 28] and H.263 [29]. Similar to MPEG-1 and MPEG-2, MPEG-4 supports intra coded (I-), temporally predicted (P-), and bi-directionally predicted (B-) VOPs.

Figure 2.4: A basic block diagram of an MPEG-4 video coder.

Figure 2.4 shows the basic VOP encoder structure. The encoder consists mainly of two parts: a hybrid of a motion compensated predictor and a DCT-based coder, and a shape coder. In the first part, motion estimation and compensation is performed (except for I-VOPs) on texture data, followed by DCT, quantization and VLC coding. Motion information is also encoded using VLCs. Then, the VOP is reconstructed as in the decoder, that is, by applying inverse VLC, inverse quantization, inverse DCT, and adding the resulting data to the motion compensated prediction data. The resulting VOP is then used for the prediction of future VOPs. The shape coder encodes the binary shape information of the object. The shape information is a two dimensional binary mask used to represent the shape of a video object such that the pixels that are opaque are part of the object whereas pixels that are transparent are not part of the object. Since the shape of a VOP may not change significantly between consecutive VOPs, predictive coding is employed to reduce temporal redundancies. Thus, motion estimation and compensation are also performed for the shape of the object. Finally, motion, texture, and shape information are multiplexed with the headers to form the coded VOP bit stream. At the decoder end, the VOP is reconstructed by combining motion, texture, and shape data decoded from the bit stream.

2.4.3 Motion Vector Coding

Motion vectors (MVs) are predicted using a spatial neighborhood of three MVs, and the prediction error is then variable length coded. Motion vectors are transmitted only for P-VOPs and B-VOPs. Since the VOPs can be arbitrarily shaped, there may not be a corresponding pixel available for prediction of the current VOP. In order to guarantee that every pixel of the current VOP can be predicted, some or all of the boundary and outside blocks of the reference VOP need to be padded by extrapolation. The boundary blocks are padded by first repeating the boundary pixels in the horizontal direction, and then repeating the boundary pixels in the vertical direction while averaging pixels whose values were already obtained by horizontal padding. When a reference pixel belongs to a block that is completely outside the VOP, then the block is filled by extended padding, where the pixels are assigned average values that are determined by the neighboring blocks.
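The motion vector prediction mentioned at the start of this section can be sketched as follows. The text above only states that three neighboring MVs are used; the component-wise median assumed here is the rule commonly used in H.263-style coders, and the function names are hypothetical.

```python
def median_of_three(a, b, c):
    """Middle value of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(neighbors):
    """Component-wise median of three neighboring motion vectors."""
    (x1, y1), (x2, y2), (x3, y3) = neighbors
    return median_of_three(x1, x2, x3), median_of_three(y1, y2, y3)


def mv_difference(mv, neighbors):
    """Prediction error that would be passed to the variable length coder."""
    px, py = predict_mv(neighbors)
    return mv[0] - px, mv[1] - py
```

Because neighboring vectors are usually similar, the difference tends to be small, which is what makes variable length coding of the prediction error efficient.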

Figure 2.5: A VOP enclosed in a rectangular bounding box and divided into macroblocks. [Diagram labels: bounding box, outside block, boundary blocks, inside block.]

2.4.4 Texture Coding

Intra coded blocks, as well as motion compensated prediction error blocks, are texture coded. As in MPEG-1, MPEG-2 and H.263, DCT based coding is employed to reduce spatial redundancies. That is, each VOP is divided into macroblocks, as illustrated in Figure 2.5, and DCT coding is applied to the four 8 x 8 luminance and two 8 x 8 chrominance blocks of the macroblocks. If a macroblock lies on the boundary of an arbitrarily shaped VOP, then the pixels that are outside the VOP are padded before DCT coding. For intra VOP boundary macroblocks, padding is performed as described in the previous section, whereas for residual blocks, the region that is outside the VOP is padded with zeros. Macroblocks that are completely inside the VOP are DCT transformed as in MPEG-1, MPEG-2 and H.263. The blocks that do not belong to the VOP are not coded. DCT transformation of the blocks is followed by quantization, zigzag scanning, and variable length coding.

Figure 2.6: Binary alpha plane. [Diagram label: boundary blocks.]

2.4.5 Shape Coding

MPEG-4 supports coding of shape information to enable content-based access to individual video objects in a scene [33, 41, 42, 40]. Because of its high compression performance and low complexity, a bitmap-based shape coder was adopted in MPEG-4. In bitmap-based shape coding, the shape and transparency of a VOP are defined by their binary and grayscale (respectively) alpha planes. A binary alpha plane indicates whether or not a pixel belongs to a VOP. A grayscale alpha plane indicates the transparency of each pixel within a VOP. MPEG-4 provides tools for both lossless and lossy coding of binary and grayscale alpha planes. Furthermore, both intra and inter shape coding are supported.

Binary alpha planes are divided into 16 x 16 blocks as illustrated in Figure 2.6. The blocks that are inside the VOP are signaled as opaque blocks and the blocks that are outside the VOP are signaled as transparent blocks. The pixels in boundary blocks (i.e., blocks that contain pixels both inside and outside the VOP) are scanned in a raster scan order and coded using arithmetic coding.
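The three block types can be told apart directly from the binary alpha plane. The sketch below is illustrative only: the function name and the string labels are assumptions, and the plane dimensions are assumed to be multiples of the block size.

```python
def classify_alpha_blocks(mask, n=16):
    """Label each n x n block of a binary alpha plane (1 = inside the VOP)."""
    labels = []
    for by in range(0, len(mask), n):
        row = []
        for bx in range(0, len(mask[0]), n):
            pixels = [
                mask[y][x]
                for y in range(by, by + n)
                for x in range(bx, bx + n)
            ]
            if all(pixels):
                row.append("opaque")        # fully inside the VOP
            elif not any(pixels):
                row.append("transparent")   # fully outside the VOP
            else:
                row.append("boundary")      # needs pixel-level shape coding
        labels.append(row)
    return labels
```

Only the boundary blocks require the pixel-level arithmetic coding described above; the other two types are conveyed by the block label alone.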

In inter shape coding, the shape of the current block is first predicted from the shape of the temporally previous VOP by performing motion estimation and compensation. The shape motion vector is then coded predictively. Next, the difference between the current and the predicted shape block is arithmetically coded.

2.5 Transmission of Compressed Image and Video in Error Prone Environments

Figure 2.7: A simplified block diagram of a video communication system. [Diagram labels: input video, source coder, transport coder, channel, transport decoder, output video.]

Figure 2.7 shows a simplified block diagram of an image/video communication system. The input image/video is compressed by the source coder to the desired bit rate. The source coder is usually composed of a waveform coder and an entropy coder. The transport coder in Figure 2.7 refers to an ensemble of devices performing channel coding, packetization and/or modulation, and transport level control using a particular transport protocol.

If all the coded information of an image or video sequence is correctly received, the receiver can perform the corresponding decoding, yielding the intended reconstructed image or video sequence. As communication channels are not error free, the encoded bit streams are vulnerable to transmission errors. In fact, compressed image and video bit streams are very sensitive to channel errors due to the variable length codewords and prediction techniques, which are usually used in image/video coding algorithms.

Transmission errors can be roughly classified into two categories: random bit errors and erasure errors. Random bit errors are caused by imperfections of the physical channel, which can result in bit inversions, bit insertions, and bit deletions. The impact of random bit errors depends mostly on the coding method. When fixed-length coding is used, a random bit error will only affect one codeword and the resulting damage is generally acceptable. When a VLC is used, random bit errors can desynchronize the coded information such that many following bits are undecodable until the next synchronization code word appears. Erasure errors, on the other hand, can be caused by packet loss (in packet networks), burst errors, or short-time system failure. Random bit errors in VLC can also cause effective erasure errors since a single bit error can lead to many following bits being undecodable and hence useless. Since almost all the state-of-the-art image/video compression techniques employ VLCs, there is no need to treat random errors and erasure errors separately [43]. Therefore, the generic term "transmission error" or "channel error" will be used throughout this thesis.
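The different impact of a single bit inversion on fixed-length and variable-length codes can be reproduced with a toy example. The two code tables below are made up for illustration and do not come from any standard; the function names are likewise assumptions.

```python
FIXED = {"a": "00", "b": "01", "c": "10", "d": "11"}   # fixed-length code
VLC = {"a": "0", "b": "10", "c": "110", "d": "111"}    # prefix (variable-length) code


def encode(symbols, table):
    return "".join(table[s] for s in symbols)


def decode(bits, table):
    """Greedy prefix decoding; leftover undecodable bits are dropped."""
    inverse = {code: sym for sym, code in table.items()}
    symbols, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:
            symbols.append(inverse[buffer])
            buffer = ""
    return symbols


def flip(bits, index):
    """Invert the bit at `index`."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]
```

For the message "cdbadcab", inverting the first bit of the fixed-length stream changes only the first symbol. Inverting the first bit of the VLC stream yields nine decoded symbols instead of eight, so every later symbol is assigned to the wrong position even though the bit-level decoding recovers after a few codewords.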

The visual effects of errors may propagate temporally and spatially. For example, in H.263 loss of information in an intra coded frame can propagate to the following frames. The resulting artifacts remain visible for a long time, which is annoying to the end user. Figure 2.8 illustrates the typical effect of loss in a frame. As indicated in the figure, the error may also propagate from its original position due to motion compensated prediction [44].

Figure 2.8: Error propagation in the frames of a video sequence.

Many techniques have been proposed to combat transmission errors. These techniques can be divided into three groups: error resilience encoding, interactive error compensation, and error concealment by post processing. We briefly review each of these techniques.

2.5.1 Error Resilience Encoding

Error resilience encoding refers to those techniques in which the source coding algorithm and/or transport coder shown in Figure 2.7 are designed either to minimize the effects of transmission errors without requiring any error concealment at the decoder, or to make the error concealment task at the decoder more effective. There are many ways to make the coded data more resilient to errors. Essentially, they all add a controlled amount of redundancy in either the source coder or the transport coder. The most important techniques in this group are:

• Layered coding: The video information is partitioned into several groups or layers. To combat channel errors, layered coding must be combined with transport prioritization so that the base layer is delivered with a higher degree of error protection. Layered coding can be implemented in different ways depending on the way the video information is partitioned, leading to temporal scalability, spatial scalability and SNR scalability [45, 46].

• Multiple description coding: In multiple description coding, several coded bit streams (descriptions) of the same source signal are generated and transmitted over separate channels. At the destination, depending on which descriptions are received correctly, different reconstruction schemes are employed [47, 48, 49, 50].

• Joint source and channel coding: These techniques invoke source-channel interaction. In general, joint source channel coding is accomplished by designing the quantizer and entropy coder for the given channel error characteristics to minimize the effect of transmission errors.

• Robust waveform coding: In traditional source coder design, the goal is to eliminate both the statistical and visual redundancy of the source signal to achieve the best compression gain. This, however, makes the error concealment task at the decoder very difficult. One approach to solve this problem is to intentionally keep some redundancy in the source coding stage such that better error concealment can be performed at the decoder when transmission errors occur. An example of this technique is intra coding to eliminate temporal error propagation in a predictive video coding system [51, 52, 53].

• Robust entropy coding: In this group of techniques, redundancy is added to the entropy-coding stage of the source coder to help detect bit errors and/or prevent error propagation. Examples of these techniques are using synchronization code words [52, 53], error-resilient entropy codes [54, 55, 56], and reversible variable-length codes.

• Forward error correction (FEC) coding: To make the video coded data more resilient to channel errors, FEC codes such as Reed-Solomon codes and BCH codes are typically employed by the encoder. At the decoder, these FEC codes are then used to correct errors in the bitstream. Although FEC techniques prove quite effective against random bit errors, their performance is usually inadequate against long-duration burst errors. The FEC techniques also increase overhead in terms of the overall bitstream size; hence, part of the coding efficiency achieved by video compression is lost. FEC is usually applied to provide a certain level of protection to the compressed bitstream, and the residual errors are handled by other tools [57].

• Transport level control: Examples of these techniques are the packetization schemes in packet video, and interleaving. In packetization, the output of the source coder is assembled into transport packets in such a way that when a packet is lost, the other packets can still be useful. Because a fixed amount of data has to be accumulated to perform interleaving, the latter introduces relatively long delays.
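The interleaving mentioned in the last item can be illustrated with a simple block interleaver, which writes data row by row and transmits it column by column. This is a sketch under the assumption of a rectangular rows x cols buffer; the function names are hypothetical.

```python
def interleave(data, rows, cols):
    """Write row by row, read column by column."""
    return [data[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(data, rows, cols):
    """Inverse of interleave()."""
    out = [None] * (rows * cols)
    position = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = data[position]
            position += 1
    return out
```

A burst of up to `rows` consecutive channel errors is spread so that each row (for instance, each FEC codeword) receives at most one error. The whole rows x cols buffer must be filled before transmission can start, which is the source of the delay noted above.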

2.5.2 Interactive Error Compensation

If a backward channel from the decoder to the encoder is available, better performance can be achieved if the encoder and decoder cooperate in the process of combating channel errors. This cooperation can be realized at either the source coding or transport level. At the source coder, coding parameters can be adapted based on the feedback information from the decoder. At the transport level, the feedback information can be employed to change the percentage of the total bandwidth used for FEC or retransmission [58]. Retransmission has been successfully used for non-real time data transmission, but it has been generally considered unacceptable (due to incurred delay) for real time video applications. Retransmission has also been considered inappropriate for multipoint video transmission because the retransmission requests from a large number of decoders can overwhelm the encoder. Another concern about using retransmission is that it may increase the loss rate because it will add more traffic to the network.

2.5.3 Error Concealment by Post-Processing

Error concealment by post processing includes techniques in which the decoder fulfills the task of error concealment. In general, these methods attempt to recover the lost information by estimating and interpolating without relying on additional information from the encoder. We will discuss the various methods of error concealment by post processing (which is henceforth called error concealment) in more detail in Section 2.7.

2.5.4 Error Detection and Localization

Before the transmission errors can be concealed, they must be detected. FEC techniques can be used to detect errors and pass the location of the errors to the video decoder so that the video decoder can conceal the errors. In addition to FEC, syntactic and semantic error detection techniques can also be applied at the decoder for detection of errors. In a typical block based video compression technique that uses motion compensated prediction and DCT, the following rules are applied to detect bitstream errors: 1) motion vectors are out of range, 2) an invalid VLC table entry is found, 3) the DCT coefficients are out of range, and 4) the number of DCT coefficients in a block exceeds 64.
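The four rules above can be collected into a single check applied to each decoded macroblock. The sketch below is illustrative: the function name, the message strings, and the default ranges (a motion vector range of -16 to 15.5 and a coefficient range of -2048 to 2047) are assumptions chosen for the example rather than values quoted from a particular standard.

```python
def detect_errors(mv, blocks, vlc_valid=True,
                  mv_range=(-16.0, 15.5), coeff_range=(-2048, 2047)):
    """Return the list of syntactic/semantic rules violated by a macroblock.

    mv        -- (x, y) decoded motion vector
    blocks    -- list of decoded coefficient lists, one per block
    vlc_valid -- False if the VLC decoder hit an invalid table entry
    """
    problems = []
    if any(not mv_range[0] <= component <= mv_range[1] for component in mv):
        problems.append("motion vector out of range")
    if not vlc_valid:
        problems.append("invalid VLC table entry")
    if any(not coeff_range[0] <= c <= coeff_range[1]
           for block in blocks for c in block):
        problems.append("DCT coefficient out of range")
    if any(len(block) > 64 for block in blocks):
        problems.append("more than 64 coefficients in a block")
    return problems
```

An empty result does not prove that the data is correct; it only means that none of these rules was triggered.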

2.6 Error Resilience Tools in Current Video Compression Standards

H.263 includes error resilience tools which are defined in the baseline syntax and in four of its normative Annexes. The visual part of the MPEG-4 standard also provides support for error resilience. We here summarize the error resilience tools of the above mentioned standards.

2.6.1 Resynchronization

Due to the application of VLCs in the video compression algorithms, the location in the bit stream where the decoder detects an error is some undetermined distance away from where the error actually occurred. Hence, when the decoder detects an error it loses synchronization with the encoder, that is, it is unable to identify the precise location in the image where the current data belongs. Resynchronization tools, as the name implies, attempt to enable resynchronization between the decoder and the bitstream after a residual error or errors have been detected. Resynchronization markers, which are distinct from all the valid code words, are inserted into the bitstream during encoding. Upon detection of an error, the decoder seeks forward in the bitstream for the next resynchronization marker. Once this marker is found, the decoder regains synchronization with the encoder. Resynchronization markers localize the errors spatially. Generally, the data between the synchronization point prior to the error and the first point where synchronization is re-established is discarded. If the resynchronization approach is effective at localizing the amount of data discarded by the decoder, then the ability of other tools that recover data and/or conceal the effects of errors is greatly enhanced.

The resynchronization approaches adopted by H.26x and MPEG-4 are similar and are briefly reviewed in what follows. In the H.26x standards, a "Group of Blocks" (GOB) is defined as one or more rows of macroblocks. At the start of a new GOB, information called a GOB header is placed within the bitstream. This header information contains a GOB start code, which allows the decoder to locate the GOB. Furthermore, the GOB header contains information which allows the decoding process to be restarted (i.e., resynchronize the decoder to the bitstream and reset all coded data that is predicted). A similar procedure can be applied to an arbitrary number of adjacent MBs called a "slice".

The GOB approach to resynchronization is based on spatial resynchronization. That is, once a particular macroblock location is reached in the encoding process, a resynchronization marker is inserted into the bitstream. A potential problem with this approach is that the resynchronization markers will most likely be unevenly spaced throughout the bitstream because the encoding process is variable rate in nature. Therefore, certain portions of the scene, such as high motion areas, will be more susceptible to errors, which will also be more difficult to conceal.

The video packet approach adopted by MPEG-4 is based on the insertion of periodic resynchronization markers throughout the bitstream. In other words, the length of the video packets is not based on the number of macroblocks, but instead on the number of bits contained in each packet. If the number of bits contained in the current video packet exceeds a predetermined threshold, then a new video packet is created at the start of the next macroblock.
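The video packet rule can be sketched as follows, given the number of bits each macroblock produced. The function name and the representation of a packet as a list of macroblock indices are assumptions made for illustration.

```python
def video_packets(mb_bits, threshold):
    """Group macroblock indices into packets of roughly `threshold` bits.

    A packet is closed as soon as its size exceeds the threshold, so the
    next macroblock starts a new packet (after a resynchronization marker).
    """
    packets, current, used = [], [], 0
    for index, bits in enumerate(mb_bits):
        current.append(index)
        used += bits
        if used > threshold:
            packets.append(current)
            current, used = [], 0
    if current:
        packets.append(current)
    return packets
```

With macroblock sizes of 50, 60, 400, 30, 20, 10, 500 and 40 bits and a threshold of 100 bits, the expensive macroblocks end up in short packets while cheap ones are grouped together, so the markers are spaced more evenly in the bitstream than with a fixed number of macroblocks per GOB.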

Synchronization markers are very helpful in localizing spatial errors; however, they increase the bit rate. In [52, 53], it is shown that the location of synchronization markers can be optimally chosen to yield the best rate-distortion (RD) tradeoffs.

2.6.2 Data Partitioning

Typical video decoders discard all the data containing errors between two resynchronization markers. One of the main reasons is that between two resynchronization markers, motion and DCT data for each of the MBs are coded together. Hence, the decoder cannot identify whether the motion or the DCT data of any of the MBs in the packet is free of errors. In MPEG-4, the data partitioning mode separates the motion and the macroblock header information from the texture (DCT) information. This approach requires that a second resynchronization marker be inserted between the motion and the texture information. Data partitioning is signaled to the decoder. If the texture information is lost, this approach utilizes the motion information to conceal the errors. That is, due to the errors the texture information is discarded, while the motion is used to motion compensate the previously decoded VOP. The forthcoming third version of H.263 will likely include data partitioning as well.

2.6.3 Reversible Variable Length Codes (RVLCs)

As mentioned earlier, one of the problems with transmitting compressed video over error prone channels is the use of variable length codes. During the decoding process, if the decoder detects an error while decoding VLC data, it loses synchronization and hence, it typically discards all the data up to the next resynchronization point. RVLCs alleviate this problem. RVLCs are special VLCs that have the prefix property both in the forward and the reverse directions. Hence, they can be uniquely decoded both in the forward and reverse directions. Therefore, when the decoder detects an error while decoding the bitstream in the forward direction, it jumps to the next resynchronization marker and decodes the bitstream in the backward direction till it encounters an error. In this way, the decoder can recover some of the data that would have otherwise been discarded [59, 60]. Both H.263 and MPEG-4 support the use of RVLCs.
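The two-directional decoding can be demonstrated with a toy reversible code. The three palindromic codewords below are an assumption made for illustration (they are not the RVLC tables of H.263 or MPEG-4); being palindromes, they are prefix-free in both directions, so the same table serves the backward pass. The sketch also assumes that the bits after the point where the forward pass fails are error free.

```python
CODEBOOK = {"a": "0", "b": "11", "c": "101"}  # toy palindromic (reversible) code
DECODE = {bits: symbol for symbol, bits in CODEBOOK.items()}


def encode(symbols):
    return "".join(CODEBOOK[s] for s in symbols)


def decode_until_error(bits):
    """Return (symbols, error_index); error_index is None if no error is seen."""
    symbols, buffer = [], ""
    for index, bit in enumerate(bits):
        buffer += bit
        if buffer in DECODE:
            symbols.append(DECODE[buffer])
            buffer = ""
        elif not any(word.startswith(buffer) for word in DECODE):
            return symbols, index  # no codeword starts with these bits
    return symbols, None


def decode_both_ways(bits):
    """Forward pass up to the error, then a backward pass from the end
    (the next resynchronization marker) over the bits after the error."""
    forward, error_index = decode_until_error(bits)
    if error_index is None:
        return forward, []
    tail = bits[error_index + 1:][::-1]
    backward, _ = decode_until_error(tail)
    return forward, backward[::-1]
```

For the message "acbbca", inverting the fourth bit makes the forward pass stop after the first symbol, while the backward pass still recovers the last four symbols; only the damaged codeword is lost.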

2.6.4 Reference Picture Selection

The H.263 standard defines a reference picture selection mode, which provides error resilience. In this mode, both the encoder and the decoder have multiple prediction frame buffers. Besides video data, the encoder and the decoder exchange messages regarding what is correctly received and what is not. Based on this information, the encoder determines which frame buffers have been damaged at the decoder. Then the encoder will use an undamaged frame buffer for prediction. The information for the selected prediction buffer is also included in the encoded bit stream so that the decoder can use the same frame buffer for prediction [43].

2.6.5 Header Duplication

Some of the header information present at the beginning of a coded video frame is essential in order to decode the frame. If some of this information is not received by the decoder, the whole frame may have to be discarded. In order to increase the probability of receiving this information, header duplication allows the introduction of duplicate copies of important header information in the video packet. This option is supported in MPEG-4 and a similar tool is included in H.263+.

2.6.6 Independent Segment Decoding

When slices are used in H.263, all predictions within a picture are restricted to the boundaries of the slices. This, however, does not prevent error propagation due to motion compensation. Error propagation can be prevented using the independent segment decoding mode supported in H.263, which enforces the treatment of segment boundaries as if they were picture boundaries. A segment is defined as a slice, a GOB, or a number of consecutive GOBs. By employing this mode, error propagation outside the segment boundaries due to motion compensation can be avoided.

2.7 Error Concealment

Error concealment intends to conceal the effects of channel errors by exploiting the redundancies in the video signal and the limitations of the human visual system, without requiring additional information [43]. Error concealment cannot provide perfect recovery; however, it can often provide a subjectively acceptable picture quality without increasing the transmission bandwidth.

Error concealment in an image or video signal, that is, reconstruction of lost pixels, is an ill-posed problem, and as such, it does not have a unique solution. Error concealment methods solve this ill-posed problem by introducing assumptions to restrict the otherwise infinite number of admissible solutions. Different concealment methods are outcomes of different assumptions about the image and video signals and/or different interpretations of these assumptions. Based on the particular assumptions made, the spatially and temporally adjacent available data are used for restoring the missing information. For compressed still images and intra-frames (I frames) of video sequences, the available data are the spatially adjacent data. For predicted frames (P frames), both the spatially and the temporally adjacent data can be used in the concealment procedure. Error concealment methods that use the spatially and temporally adjacent data are referred to as spatial and temporal error concealment methods, respectively. In what follows we review different a-priori assumptions made about image and video signals and the concealment methods that are based on these assumptions [61]. Following that, we divide the methods into spatial and temporal error concealment.

2.7.1 Consistency

A fundamental a-priori knowledge about pixel values in images and video is consistency with known values. This requires that correctly received pixel values remain unaltered by the restoration process, and that the restored values lie within an acceptable range (e.g., 0 to 255).
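The two consistency requirements can be illustrated with a minimal sketch. The function name `enforce_consistency`, the list-of-lists image layout and the boolean `missing` mask are assumptions made for illustration; they do not come from any of the cited methods.

```python
def enforce_consistency(restored, received, missing, lo=0, hi=255):
    """Keep correctly received pixels untouched and clip restored pixels to [lo, hi].

    restored, received: 2-D lists of pixel values of equal size.
    missing: 2-D list of booleans, True where the pixel was lost.
    """
    out = []
    for r_row, y_row, m_row in zip(restored, received, missing):
        out.append([
            min(hi, max(lo, r)) if m else y   # clip only where data was lost
            for r, y, m in zip(r_row, y_row, m_row)
        ])
    return out


received = [[10, 20], [30, 0]]
restored = [[11, 19], [35, 300]]          # output of some restoration step
missing = [[False, False], [False, True]]
result = enforce_consistency(restored, received, missing)
```

Only the single lost pixel takes its value from the restoration step, and it is clipped to the admissible range; the three received pixels are passed through unchanged.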

2.7.2 Smoothness

Smoothness is a basic assumption which has been used in most of the concealment techniques, and it has been interpreted in many different ways. One implication of the smoothness property of image signals used in [62] is that the variation between adjacent pixels within the damaged block and their spatially neighboring pixels in adjacent blocks should be small. Thus, to restore the missing data, a measure of spatial variations between adjacent pixels (e.g., gradient or Laplacian) is minimized [62, 63]. The main shortcomings of this method are that it uses a small fraction of the available data (a one-pixel-wide area around the missing block) in the concealment procedure and that it makes the assumption that the lost block is smooth. As the reconstruction method proposed in [62, 63] uses only a one-pixel-wide area around the missing block, it does not adequately exploit structures that exist in the image in the reconstruction process. The smoothness assumption, on the other hand, limits the capability of the reconstruction procedure in restoring image details and edges. The smoothness assumption may also lead to artifacts around reconstructed blocks.
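As a rough illustration of smoothness-based restoration, the sketch below minimizes the sum of squared differences between 4-connected neighbors inside a missing block by repeated neighbor averaging (Gauss-Seidel relaxation). This is a generic discrete formulation under our own assumptions (unit weights, a block that does not touch the image border, a fixed iteration count); it is not the specific algorithm of [62] or [63].

```python
def smooth_fill(image, top, left, size, iterations=200):
    """Fill a size x size block so that each missing pixel equals the mean
    of its four neighbors, using the surrounding pixels as boundary values."""
    img = [[float(v) for v in row] for row in image]
    # Start from zero inside the block; the result does not depend on this choice.
    for r in range(top, top + size):
        for c in range(left, left + size):
            img[r][c] = 0.0
    for _ in range(iterations):
        for r in range(top, top + size):
            for c in range(left, left + size):
                img[r][c] = (img[r - 1][c] + img[r + 1][c]
                             + img[r][c - 1] + img[r][c + 1]) / 4.0
    return img


# A horizontal ramp: the smooth solution reproduces the ramp inside the block.
ramp = [[10 * c for c in range(5)] for _ in range(5)]
filled = smooth_fill(ramp, top=1, left=1, size=3)
```

A linear ramp is recovered exactly because it is already the smoothest surface matching the boundary. A sharp edge crossing the block would be blurred instead, which is the limitation noted above.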

The smoothness assumption can be formulated for temporally adjacent pixels in a video sequence to restore missing data in P-frames. In [64], it is proposed that in calculating the smoothness measure, temporal information should be considered. Thus, in formulating the smoothness measure, in addition to the term representing the spatial pixel differences for pixels located at the boundary of the missing block, a term representing the difference between the reconstruction results and the pixel values at the same position in the previous frame is also included. This term provides temporal smoothness and enforces a smooth transition between adjacent frames. Obviously, for fast changing scenes, this technique generates unacceptable results.

In [65], the damaged blocks are recovered utilizing the smoothness property of an image at the boundaries of the blocks. In contrast to the method proposed in [@], the method proposed in [65] does not minimize the variations between the sample values in the missing block. Instead, a cost function which represents the variations between values of the pixels of the missing block at the boundary and their corresponding pixels in the adjacent blocks is defined. In [65], the cost function is written in the DCT domain and its derivatives with respect to each of the 64 missing DCT coefficients are calculated. The resulting 64 equations are not independent, as the proposed cost function imposes restrictions only on the boundary pixels. It is proved in [65] that for 8 x 8 blocks, this method is able to restore at most 28 coefficients, mainly low frequency coefficients, while assuming that the rest of the coefficients are zero. Consequently, this method is unable to reconstruct edges and details of the lost block. In [65], the method described above has been used for the restoration of lost data in both I and P frames, neglecting temporally adjacent data available for P frames.

Another implication of the smoothness property is that a pixel (or a DCT coefficient) is likely to have a value which is close to the values of pixels (DCT coefficients) in spatially adjacent blocks. Using this assumption, in [66], the erroneous block is replaced with the average of its three nearest neighboring blocks in the previous row. This method is computationally very simple but obviously limited in terms of performance. In [67], each pixel in a damaged block is interpolated from the corresponding pixels of its four neighboring blocks such that the total squared border error is minimized. By reconstructing each lost pixel in a block as a linear combination of the corresponding pixels from neighboring blocks, the inter-block correlation is exploited and the structure of the reconstructed block is determined using the structure of the neighboring blocks rather than by a global smoothness assumption which may yield unwanted artifacts. Moreover, local image characteristics are preserved and high frequency details can be reproduced. This method yields good reconstruction of horizontal, vertical, near-horizontal and near-vertical edges, lines and patterns. Strong diagonal edges, though, are not well reconstructed by this technique. The poor reproduction quality around diagonal edges is due to the fact that only one weight is assigned to each of the top, bottom, right, and left blocks in the linear interpolation of the missing block.
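The block-averaging idea attributed to [66] can be sketched as follows. We assume here that "the three nearest neighboring blocks in the previous row" means the upper-left, upper and upper-right blocks, and that the average is taken pixel by pixel; both are our reading for illustration, and the function name is hypothetical.

```python
def average_of_upper_blocks(image, top, left, size):
    """Replace a block by the pixel-wise mean of the three blocks in the row above.

    Assumes the block is not in the first block row and has a block on each side.
    """
    out = [row[:] for row in image]
    for r in range(size):
        for c in range(size):
            total = 0
            for shift in (-size, 0, size):      # upper-left, upper, upper-right
                total += image[top - size + r][left + shift + c]
            out[top + r][left + c] = total / 3
    return out


# Two block rows of 2 x 2 blocks; the lower middle block is treated as lost.
image = [
    [30, 30, 60, 60, 90, 90],
    [30, 30, 60, 60, 90, 90],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
concealed = average_of_upper_blocks(image, top=2, left=2, size=2)
```

Because only already-decoded blocks in the previous row are used, the replacement can be computed as soon as the loss is detected, at the price of ignoring all image structure.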

Based on the same assumption, it has been proposed in [68] and [69] that the pixel values in a damaged block be interpolated as a weighted combination of the pixels in the four boundaries (the one-pixel-wide neighborhood of the missing block). Obviously this method is very simple, but the performance is limited because the structure of the image is not exploited and thus edges cannot be recovered.
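A minimal sketch of boundary-based interpolation is given below. The text does not state how the weights are chosen, so the inverse-distance weights used here are an assumption, as are the function name and the requirement that the block has intact pixels on all four sides.

```python
def interpolate_block(image, top, left, size):
    """Interpolate each missing pixel from the nearest boundary pixel above,
    below, left and right of the block, weighted by inverse distance."""
    out = [[float(v) for v in row] for row in image]
    for r in range(size):
        for c in range(size):
            # (boundary pixel value, distance from the missing pixel)
            neighbors = [
                (image[top - 1][left + c], r + 1),          # above
                (image[top + size][left + c], size - r),    # below
                (image[top + r][left - 1], c + 1),          # left
                (image[top + r][left + size], size - c),    # right
            ]
            weights = [1.0 / d for _, d in neighbors]
            weighted = sum(w * v for w, (v, _) in zip(weights, neighbors))
            out[top + r][left + c] = weighted / sum(weights)
    return out


flat = [[50] * 4 for _ in range(4)]
flat_result = interpolate_block(flat, top=1, left=1, size=2)

graded = [
    [0, 0, 0, 0],
    [40, 0, 0, 40],
    [60, 0, 0, 60],
    [100, 100, 100, 100],
]
graded_result = interpolate_block(graded, top=1, left=1, size=2)
```

Each restored pixel is a convex combination of four boundary pixels, so smooth gradients are followed but no new edge can be created inside the block.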

In [70], a one-pixel overlap block structure is proposed for the recovery of lost information. Each block has a size equal to 9 x 9 pixels, and overlaps with the blocks below and to the right. Using this method, if a lost block is surrounded by undamaged blocks, the boundary pixels can be retrieved from neighboring blocks. The lost coefficients are estimated so that a cost function consisting of the sum of the squared differences between the estimated and the actual intensities of the boundary pixels is minimized. The proposed 9 x 9 overlapping blocks are not compatible with most of the current image and video compression standards, which require 8 x 8 non-overlapping blocks.

For P-frames, smoothness may be interpreted in a fashion which assumes that an area in the current frame is the same as the corresponding area in the previous frame. A simple and yet effective temporal reconstruction method using this assumption replaces the corrupted/missing macroblocks (MBs) with their corresponding parts in the previous frame. Although this method generally works well in still parts of the picture, such as the background, it cannot produce satisfactory results when the video sequence exhibits fast moving objects or sudden scene changes [71].

2.7.3 Statistical Correlation

Statistical correlation is another a-priori assumption made about images. This implies that the pixel values in an image are realizations of an underlying statistical model. In the method proposed in [68], the original image and its corrupted version are modeled as Markov Random Fields. Assuming a specific a priori distribution for the original image, a maximum a posteriori (MAP) estimation of the missing data is obtained conditioned on the values of the received data. This method is computationally expensive and it has limited performance in the restoration of edges. To reduce the computational complexity, a suboptimal solution of the MAP estimation problem has been proposed in [71], by selecting each pixel in the missing block as the median of its neighboring pixels. The median filtering method, however, does not yield a unique solution and cannot reconstruct damaged edges. The above mentioned works in spatial error concealment exploiting the MRF model use a single-pixel-wide region around the erroneous area to achieve the reconstruction. This practically restricts the amount of available information used in the concealment procedure to a small region around the missing area. Incorporating more pixels usually leads to a higher order model and this is computationally expensive, as the computational complexity grows exponentially with the order of the MRF model. Using the above mentioned approaches, the damaged area is reconstructed fairly well in very low frequency portions of the image. However, the reconstruction process yields blurry results with a significant loss of detail in high frequency, or edge, portions of the image. In the statistical approach proposed in [73], the image is modeled as an Auto Regressive (AR) process and the missing information is estimated using the model. In [74], a video sequence has been modeled as a 3-D AR process and the missing data is restored using this model. Because of the non-stationary characteristics of the image and video data, the methods proposed in [73] and [74] are applied locally.

In [75], error concealment in the spatial domain is achieved using a Generalized Maximum Likelihood Ordered Statistics (GMLOS) filter, which is a robust, nonlinear, recursive filter with a variable kernel. The reconstruction starts from the boundaries of the missing blocks and the values of the missing pixels are estimated recursively towards the center of the missing blocks. The filter kernels and the processing windows are chosen to exploit the structural correlation of the missing MBs with their surrounding neighbors. At the same time, the GMLOS filter exploits the statistical correlation of the neighboring MBs.

2.7.4 Edge Continuity

Edge continuity, or more generally continuity of linear patterns in an image, requires that edges in a scene be continuous. This assumption implies that if an edge is present in the neighboring blocks of a missing block and its direction implies that the edge passes through the missing area, then the edge should pass through the missing block. In [76] this assumption is used in a Projection Onto Convex Sets (POCS) based retrieval method. The method proposed in [76] assumes a simple edge structure (i.e., only a single dominant edge) and, therefore, does not work well for complex structures such as streaks, corners, etc. In [77], a heuristic approach, which is computationally less expensive, replaces the POCS stage in the above mentioned method. Although yielding more accurate results than the approaches based on smoothness assumptions, edge-based methods are computationally demanding. They require the estimation of edge directions and, for the POCS method, many iterations are required before convergence. The method proposed in [78] is also based on the idea of edge continuity. To detect linear patterns (edges) in blocks adjacent to a missing block, however, a tomographic projection based method has been proposed. Although the edge detection stage of this algorithm is different from other edge based error concealment methods, the reconstruction phase is almost the same. Therefore, this method has basically the same performance level as those of other edge based reconstruction methods.

In [79] and [80], the local geometrical structure of the image is extracted using the available pixels surrounding a missing area, and then the missing pixels are interpolated using that structure. A two-pixel-wide frame surrounding the missing area is used to extract the geometrical information. The frame is first thresholded to form a binary pattern. Using the location and number of white to black transitions in the frame, the directions and the number of edges passing through the missing area are estimated. Based on the estimated direction and number of edges, the missing area is partitioned into a number of regions. In each region a bilinear interpolation of the missing pixels is performed.
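The thresholding and transition-counting step can be sketched in a simplified form. For brevity the sketch walks a single one-pixel ring around the block instead of the two-pixel-wide frame described above, and uses a fixed threshold; the helper names `ring_pixels` and `count_transitions` are hypothetical.

```python
def ring_pixels(image, top, left, size):
    """Pixels of the one-pixel ring around a block, in clockwise order."""
    t, l, b, r = top - 1, left - 1, top + size, left + size
    ring = [image[t][c] for c in range(l, r + 1)]              # top side, left to right
    ring += [image[row][r] for row in range(t + 1, b + 1)]     # right side, downwards
    ring += [image[b][c] for c in range(r - 1, l - 1, -1)]     # bottom side, right to left
    ring += [image[row][l] for row in range(b - 1, t, -1)]     # left side, upwards
    return ring


def count_transitions(ring, threshold):
    """Number of white-to-black transitions met while walking once around the ring."""
    binary = [v >= threshold for v in ring]
    # Index -1 wraps around, so the last and first samples are compared as well.
    return sum(1 for i in range(len(binary)) if binary[i - 1] and not binary[i])


# A vertical edge (white on the left, black on the right) crossing a 2 x 2 block.
image = [[255, 255, 0, 0] for _ in range(4)]
ring = ring_pixels(image, top=1, left=1, size=2)
transitions = count_transitions(ring, threshold=128)
```

A single edge crossing the block produces one white-to-black and one black-to-white transition on the ring; the positions of the two crossings would then give the edge direction used to partition the missing area.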

In [81], a method which combines the spatial interpolation method using edge continuity and temporal replacement techniques has been suggested. The objective of the proposed method is to maintain the edge continuity between the missing macroblock and its surrounding neighbors, while trying to minimize the difference between the missing macroblock and its temporal replacement. A pre-processing stage is performed to determine the similarity between the missing MB and its corresponding temporal replacement. The similarity measure is based on a comparison among prominent features (e.g., the edge direction, motion vector direction and magnitude, and prediction error). If the similarity measure is high, temporal reconstruction is used. That is, first the motion vector of the missing MB is synthesized as the average of the motion vectors of the MBs above and below it. Then the missing area is replaced with the motion compensated area in the anchor frame. If the similarity measure is low, spatial restoration based on the edge continuity method is employed. For medium values of the similarity measure a combination of spatial and temporal restoration methods is proposed.

In [82], spatial and temporal estimates of missing data are combined using the following relation

where $E(X \mid x)$ is the (optimal) estimate of the missing information, $E(X \mid \omega_i, x)$ is each of the spatial or temporal estimates of the missing information and $p(\omega_i \mid x)$ is the likelihood of that estimate, which is calculated by modeling the video sequence as a mixture of Markov processes.
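The displayed relation itself did not survive in this copy. Judging only from the symbol definitions above, it presumably has the usual mixture form in which each individual estimate is weighted by its likelihood. The following is a reconstruction under that assumption, not a quotation of [82]; the summation index $i$ and the symbol $\omega_i$ are notational choices.

```latex
E(X \mid x) \;=\; \sum_{i} E(X \mid \omega_i, x)\, p(\omega_i \mid x)
```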

2.7.5 Fractal Behavior

The method proposed in [83] assumes blockwise similarity in an image (i.e., a fractal-like assumption). The whole image is searched to find the best match of an enlarged missing MB composed of a missing MB and a two-pixel-wide frame around it. The best match is the one that minimizes a cost function which is a function of the difference between the two-pixel-wide frame around the missing block and the two-pixel-wide frames corresponding to other MBs of the image.
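The search described above can be sketched as an exhaustive frame-matching loop. The cost function is not specified in the text, so the sum of squared differences used here is an assumption, as are the function names and the rule that candidates overlapping the missing block are skipped.

```python
def frame_cost(image, top_a, left_a, top_b, left_b, size, width):
    """Sum of squared differences between the frames surrounding two blocks."""
    cost = 0
    for dr in range(-width, size + width):
        for dc in range(-width, size + width):
            inside = 0 <= dr < size and 0 <= dc < size
            if not inside:                      # only frame pixels contribute
                diff = image[top_a + dr][left_a + dc] - image[top_b + dr][left_b + dc]
                cost += diff * diff
    return cost


def best_match(image, top, left, size, width=2):
    """Return (cost, top, left) of the block whose surrounding frame best
    matches the frame around the missing block at (top, left)."""
    rows, cols = len(image), len(image[0])
    best = None
    for t in range(width, rows - size - width + 1):
        for l in range(width, cols - size - width + 1):
            # Skip candidates whose enlarged block overlaps the missing block.
            if abs(t - top) < size + width and abs(l - left) < size + width:
                continue
            cost = frame_cost(image, top, left, t, l, size, width)
            if best is None or cost < best[0]:
                best = (cost, t, l)
    return best


# A pattern that repeats every 8 columns, so an exact match exists 8 columns away.
image = [[10 * (c % 8) + r for c in range(16)] for r in range(8)]
match = best_match(image, top=3, left=3, size=2, width=1)
```

The interior of the best-matching block would then be copied into the missing block. The exhaustive search over the whole image is what makes this approach costly.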

In [84], an image is modeled as the density of particles of a material whose behavior is expressed by the diffusion equation. To restore the missing pixel values, the diffusion equation is solved given the boundary conditions, which are the values of the pixels in the neighboring blocks at the border with the missing block. Below, we prove that this method is exactly the same as the one proposed in [62] and, consequently, their performance levels and computational complexities are the same. The method proposed in [62] finds the pixel values $f(x, y)$ such that the sum of the squares of the differences between adjacent pixels over the missing area is minimized. In other words, the function $f$ minimizes the following integral

where $c(x, y)$ is the weight at position $(x, y)$. According to the Euler theorem, if $L$ is a given function, then the function $w(x, y)$ that minimizes the integral

is the solution of the following differential equation

where $w_x$ and $w_y$ are the partial derivatives of $w$ with respect to $x$ and $y$, respectively. Using the Euler theorem for Equation (2.1) we have

Thus, $f$ is the solution of the following equation

which can be written as

which is the diffusion equation.
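The displayed equations of this derivation are missing from this copy. A reconstruction consistent with the surrounding text is sketched below. It assumes that the integrand of Equation (2.1) is the weighted squared gradient magnitude (the continuous counterpart of the sum of squared differences between adjacent pixels) and that the Euler theorem is used in its standard form for a functional of $w$, $w_x$ and $w_y$. The original equation numbering other than (2.1) is not known.

```latex
\begin{align}
  &\iint c(x, y)\,\bigl(f_x^2 + f_y^2\bigr)\,dx\,dy
     && \text{(integral minimized by } f\text{, Equation (2.1))} \\
  &\iint L(x, y, w, w_x, w_y)\,dx\,dy
     && \text{(general integral)} \\
  &\frac{\partial L}{\partial w}
     - \frac{\partial}{\partial x}\frac{\partial L}{\partial w_x}
     - \frac{\partial}{\partial y}\frac{\partial L}{\partial w_y} = 0
     && \text{(Euler equation)} \\
  &\frac{\partial}{\partial x}\bigl(2\,c\,f_x\bigr)
     + \frac{\partial}{\partial y}\bigl(2\,c\,f_y\bigr) = 0
     && \text{(Euler equation with } L = c\,(f_x^2 + f_y^2)\text{)} \\
  &\nabla \cdot \bigl(c\,\nabla f\bigr) = 0
     && \text{(steady-state diffusion equation)}
\end{align}
```

Since $L$ does not depend on $f$ itself, the first term of the Euler equation vanishes, and dividing by the constant factor 2 gives the last line, in which $c(x, y)$ plays the role of the diffusion coefficient.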

2.7.6 Temporal and Spatial Error Concealment

Error concealment methods use one or more of the a-priori assumptions discussed above along with the available data to reconstruct the missing information. Based on the available data which is employed in the reconstruction process, error concealment methods are divided into two groups: temporal and spatial.

In temporal error concealment methods, which are usually used for error concealment in inter coded frames, the missing blocks are restored using the information in the temporally adjacent frames. Data of a typical motion compensated inter coded block is composed of the motion vector and the DCT coefficients of the prediction error. A simple temporal concealment method involves replacing the corrupted/missing blocks with their corresponding parts in the previous frame [71]. This method, which works well in still parts of the picture (e.g., background), cannot produce satisfactory results when the video sequence exhibits fast moving objects, lighting changes, or sudden scene changes. If the motion vectors are adequately protected so that they are received without errors, the information in the previous frame is motion compensated and serves to conceal the lost information in the current frame. However, in many of the video compression algorithms, the loss of data of a block results in the loss of both the motion vectors and the DCT coefficients. This is because the coded motion vectors and DCT coefficients are usually interleaved. Therefore, most of the proposed temporal concealment methods first estimate the motion vectors associated with a missing block. The estimated motion vector is then used to find a block in the previous frame which yields the restoration of the missing block. To estimate the missing motion vectors, methods such as averaging [71], median [72] and MAP estimation [72] have been suggested. The main drawback of methods that use the motion vectors of blocks adjacent to the missing block as a basis for estimating the missing motion vector is that, if adjacent blocks are coded in non-predictive mode (intra mode), there would not be any data available for the estimation procedure. Thus, ad-hoc assumptions for estimating the missing motion vectors should be made, and the results may be unreliable. Moreover, if the missing block lies at the boundary of two objects moving in opposite directions, such methods perform quite poorly. Last but not least, motion vector estimation can generally be inaccurate, especially when large block sizes (e.g., 16 x 16 pixels) are used.
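The two steps of a typical temporal method, motion vector estimation followed by motion compensated replacement, can be sketched as follows. The component-wise median is one of the estimators mentioned above; the use of the lower median (so that the result is always an integer displacement), the zero-motion fallback and the function names are assumptions made for this illustration.

```python
from statistics import median_low


def estimate_motion_vector(neighbor_mvs):
    """Component-wise median of the motion vectors of the adjacent blocks.

    Intra coded neighbors carry no motion vector and are passed as None.
    If no neighbor has a motion vector, zero motion is assumed (ad hoc).
    """
    mvs = [mv for mv in neighbor_mvs if mv is not None]
    if not mvs:
        return (0, 0)
    return (median_low(dy for dy, _ in mvs), median_low(dx for _, dx in mvs))


def conceal_from_previous(prev_frame, cur_frame, top, left, size, mv):
    """Replace a missing block by the motion compensated block of the previous frame."""
    dy, dx = mv
    out = [row[:] for row in cur_frame]
    for r in range(size):
        for c in range(size):
            out[top + r][left + c] = prev_frame[top + r + dy][left + c + dx]
    return out


prev_frame = [[10 * r + c for c in range(6)] for r in range(4)]
cur_frame = [[0] * 6 for _ in range(4)]
mv = estimate_motion_vector([(0, 2), None, (0, 2), (1, 3)])
concealed = conceal_from_previous(prev_frame, cur_frame, top=1, left=1, size=2, mv=mv)
```

With `mv = (0, 0)` the second function reduces to the simple replacement by the co-located block. The `None` entries show the drawback discussed above: every intra coded neighbor removes one observation from the estimate.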

Spatial error concealment methods restore the missing blocks using the information in the current frame. Spatial concealment methods proposed earlier include averaging or linear interpolation [67, 68], constrained linear interpolation [68], multi-directional edge-based methods [76, 77, Y.3.I, and statistical methods [Y-. 7-11.

2.8 MAP Estimation of Missing Data

Let $\mathcal{X}$ be an $N_1 \times N_2$ frame and let $\mathcal{Y}$ be the received version of $\mathcal{X}$ at the output of the channel, where $\mathcal{Y}$ may have missing data. Each transmitted picture consists of $M$ blocks, each having $N \times N$ pixels. Let $x_i$ be the lexicographic ordering of the $i$th block of $\mathcal{X}$. The vector $X$ is then defined to be the concatenation of $x_1, x_2, \ldots, x_M$, that is, $X = [x_1, x_2, \ldots, x_M]$.

Similarly, the vector $Y$ is defined as the concatenation of the blocks of $\mathcal{Y}$. If the $i$th block is missing then $Y$ can be expressed as $Y = TX$, where $T$ is an $(MN^2 - N^2) \times MN^2$ matrix that consists of the identity matrix excluding the rows from $(i-1)N^2 + 1$ to $iN^2$. If $n$ of the $M$ blocks are missing, then $T$ will be an $(MN^2 - nN^2) \times MN^2$ matrix. Using MAP estimation, the estimate of the decompressed image is given by

$$\hat{X} = \arg\max_{X} \log \Pr(X \mid Y).$$

Using Bayes' rule we obtain

where the term $\log \Pr(Y)$ has been dropped because it is independent of $X$. Using the relation $Y = TX$, the conditional probability of the corrupted image can then be written as

Substituting Equation (2.3) in Equation (2.2), the MAP estimate of the decompressed image becomes

$$\hat{X} = \arg\min_{X \in S} \{-\log \Pr(X)\},$$

where $S = \{Z : Y = TZ\}$.
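The two intermediate displays, referred to in the text as Equations (2.2) and (2.3), are missing from this copy. The following reconstruction is the form that the surrounding sentences imply, assuming an otherwise error-free channel so that the received data are a deterministic function of the transmitted image:

```latex
\begin{align}
  \hat{X} &= \arg\max_{X} \bigl\{ \log \Pr(Y \mid X) + \log \Pr(X) \bigr\}
     && \text{(Bayes' rule, cf. Equation (2.2))} \\
  \Pr(Y \mid X) &=
     \begin{cases}
       1, & Y = TX \\
       0, & \text{otherwise}
     \end{cases}
     && \text{(cf. Equation (2.3))}
\end{align}
```

Under this assumption $\log \Pr(Y \mid X)$ is zero on the set $S$ and $-\infty$ elsewhere, so the maximization reduces to minimizing $-\log \Pr(X)$ over the images that agree with the received data.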

2.9 MRF Modeling

Image modeling refers to the development of analytical representations for the intensity distributions in an image. There are different types of image models, each being appropriate for one or more classes of applications [97, 98]. The most important reason for using image models is the abstraction they provide of the large amount of data contained in images. Using analytical representations for images, one can develop systematic algorithms for accomplishing a particular image-related task. Given that the pixels in a local neighborhood are correlated, researchers have proposed 2-D extensions of time-series models for images. The Markovian notion has also been extended to 2-D, and the MRF has been introduced and extensively used for image modeling [99]. The attractive features of an MRF model are its ability to represent different image sources and the local nature of the resulting estimates. To enable the model to accurately characterize the image data, adjustable parameters are usually considered in the model. Here, we briefly review MRF models.

The concept of an MRF is a generalization of that of Markov processes (MP), which are widely used in sequence analysis. An MP is a sequence (chain) of random variables $\ldots, X_1, \ldots, X_m, \ldots$ defined on the time indices $\{\ldots, 1, \ldots, m, \ldots\}$. An $n$-th order unilateral MP satisfies

A bilateral or non-causal MP depends not only on the past but also on the future. An $n$-th order bilateral MP satisfies

An MP is generalized into an MRF when the time indices are replaced with spatial indices. To this end, the concept of neighborhood should be introduced.
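The two defining relations are missing from this copy. In the usual textbook notation, which is assumed here to match the intended one, they read:

```latex
\begin{align}
  P(X_m \mid \ldots, X_{m-2}, X_{m-1})
    &= P(X_m \mid X_{m-1}, \ldots, X_{m-n})
    && \text{($n$-th order unilateral MP)} \\
  P(X_m \mid \ldots, X_{m-1}, X_{m+1}, \ldots)
    &= P(X_m \mid X_{m+n}, \ldots, X_{m+1}, X_{m-1}, \ldots, X_{m-n})
    && \text{($n$-th order bilateral MP)}
\end{align}
```

In the unilateral case the conditioning set is the $n$ most recent past samples; in the bilateral case it is the $n$ nearest samples on either side, which is the one-dimensional analogue of a spatial neighborhood.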

Let $S$ represent a discrete set of $N$ sites

$$S = \{1, \ldots, N\},$$

in which $1, \ldots, N$ are indices. A site often represents a point or a region in the Euclidean space such as an image pixel or an image feature such as a corner point, a line segment or a surface patch. The sites in $S$ are related to one another via a neighborhood system. A neighborhood system for $S$ is defined as

$$\mathcal{N} = \{\mathcal{N}_s \mid \forall s \in S\},$$

where $\mathcal{N}_s$ is the set of sites neighboring $s$. The neighboring relationship has the following properties:

• A site is not neighboring to itself: $s \notin \mathcal{N}_s$, and

• The neighboring relationship is mutual: $s \in \mathcal{N}_r \Leftrightarrow r \in \mathcal{N}_s$.

Figure 2.9: A site (X) and its (a) first order neighborhood system, (b) second order neighborhood system.

In the first order neighborhood system, also called the 4-neighborhood system, every site has four neighbors, as shown in Figure 2.9a. In the second order neighborhood system, also called the 8-neighborhood system, there are eight neighbors for each site, as shown in Figure 2.9b. The pair $(S, \mathcal{N})$ forms a graph. A subset $c \subseteq S$ is a clique if every pair of distinct sites in $c$ are neighbors. $C$ denotes the set of cliques.
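The neighborhood systems and the clique definition can be made concrete on a pixel lattice. In this sketch sites are (row, column) pairs and sites outside the image simply have fewer neighbors; the function names are hypothetical.

```python
def neighbors(site, rows, cols, order=1):
    """Neighbors of a site in the first order (4-) or second order (8-) system."""
    r, c = site
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if order == 2:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    return {(r + dr, c + dc) for dr, dc in offsets
            if 0 <= r + dr < rows and 0 <= c + dc < cols}


def is_clique(sites, rows, cols, order=1):
    """True if every pair of distinct sites in the subset are neighbors."""
    sites = list(sites)
    return all(b in neighbors(a, rows, cols, order)
               for i, a in enumerate(sites) for b in sites[i + 1:])


first = neighbors((1, 1), 3, 3, order=1)
second = neighbors((1, 1), 3, 3, order=2)
```

A horizontally adjacent pair such as `[(0, 0), (0, 1)]` is a clique in both systems, whereas a diagonal pair such as `[(0, 0), (1, 1)]` is a clique only in the second order system. A single site is trivially a clique.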

Let $X = \{X_{s_i}, s_i \in S\}$ be a family of random variables defined on the set $S$, in which each random variable $X_{s_i}$ takes a value $x_{s_i}$. For simplicity we can assume a common state space, say $\Lambda = \{0, 1, 2, \ldots, L - 1\}$. The family $X$ is called a random field. A joint event $(X_{s_1} = x_{s_1}, X_{s_2} = x_{s_2}, \ldots, X_{s_N} = x_{s_N})$ is abbreviated as $X = x$, where $x = (x_{s_1}, \ldots, x_{s_N})$ is a configuration of $X$, corresponding to a realization of the field. Let $\Omega$ be the set of all possible configurations:

$$\Omega = \{x = (x_{s_1}, \ldots, x_{s_N}) \mid x_{s_i} \in \Lambda\}.$$

$X$ is said to be a Markov random field on $S$ with respect to a neighborhood system $\mathcal{N}$ if and only if the following two conditions are satisfied:

$$P(X = x) > 0, \text{ and}$$

$$P(X_s = x_s \mid X_r = x_r,\ r \neq s) = P(X_s = x_s \mid X_r = x_r,\ r \in \mathcal{N}_s).$$

According to the Hammersley-Clifford theorem, every MRF has a Gibbs distribution [99]. An MRF with a Gibbs distribution is given by

$$P(X = x) = \frac{1}{Z} \exp\Bigl(-\sum_{c \in C} V_c(x)\Bigr),$$

where $Z$ is a normalizing constant, $V_c(\cdot)$ is called a potential function and is a function of a local group of pixels $c$ called a clique, and $C$ denotes the set of all cliques throughout the image. The potential function characterizes the relationship between a group of pixels by assigning larger costs to configurations of pixels which are less likely to occur. The choice of the potential function substantially impacts the performance of the image model. The function $\sum_{c \in C} V_c(x)$ must be convex in order to have an easily obtainable global minimum. If this function is not convex, local minima are present, which can be avoided via computationally expensive techniques such as simulated annealing. Commonly, the potential functions are selected to be in the form

$$V_c(x) = \rho(d_c^T x),$$

where $d_c$ is a coefficient vector, $x$ is the vector of pixels in the clique $c$ and $\rho(\cdot)$ is the cost function. The convexity property of the cost function ensures that the sum of potential functions ($\sum_{c \in C} V_c(x)$) remains convex. The selection of the coefficients $d_c$ is based on the a priori assumptions about the image. The coefficients are usually selected so that $d_c^T x$ provides an approximation of the first or second derivatives of the image at each pixel. For the special case of $\rho(x) = x^2$, the model is called a Gauss-Markov random field (GMRF). To reduce the smoothing effect of the GMRF model, other forms of cost functions have been introduced such as the Huber function

where $T$ is a threshold. The image model using this cost function is known as the Huber Markov Random Field (HMRF) model. The Huber function is an example of robust error measures. There are other CV0 (e.g., Lorentzian, Geman-McClure) and non-Co (e.g., truncated quadratic) measures, in general called redescending estimators.
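To make the role of the cost function concrete, the sketch below evaluates the clique-potential sum for pair cliques whose coefficient vector takes a first difference. The displayed Huber function is missing from this copy; the form used here, quadratic up to the threshold and linear beyond it with a continuous first derivative, is the common definition that reduces to the GMRF cost below the threshold, and should be read as an assumption. The function names are hypothetical.

```python
def huber(x, threshold):
    """Huber cost: x^2 for |x| <= T, and T^2 + 2T(|x| - T) otherwise."""
    if abs(x) <= threshold:
        return x * x
    return threshold * threshold + 2 * threshold * (abs(x) - threshold)


def gibbs_energy(image, rho):
    """Sum of rho over horizontal and vertical first differences (pair cliques)."""
    rows, cols = len(image), len(image[0])
    energy = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                energy += rho(image[r][c + 1] - image[r][c])
            if r + 1 < rows:
                energy += rho(image[r + 1][c] - image[r][c])
    return energy


# An image with one sharp vertical edge of height 10.
edge = [[0, 0, 10, 10],
        [0, 0, 10, 10]]
gmrf_energy = gibbs_energy(edge, lambda d: d * d)
hmrf_energy = gibbs_energy(edge, lambda d: huber(d, 2))
```

The quadratic cost penalizes the edge far more heavily than the Huber cost, so a MAP estimate under the GMRF prior prefers to smooth the edge away, while the HMRF prior tolerates it.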

2.10 Summary

In this chapter, we first reviewed image and video compression standards. Then, we surveyed different error concealment methods suggested in the literature for coded image and video and compared their a-priori assumptions, performance and complexity. Generally, error concealment methods that exploit the structures that exist within the image in the reconstruction process have better performance levels, although they are more computationally expensive, as extra computations are required to extract the image structure. There is an inherent trade-off between computational complexity and quality of the restoration results.

Chapter 3

Error Concealment: Texture Information

A man who has committed a mistake and doesn't correct it is committing another mistake.

Confucius

3.1 Introduction

In this chapter, we propose error concealment methods for the texture component of coded visual data. It is assumed that the missing texture information belongs to one or more blocks. We consider intra coded and inter coded texture separately and propose error concealment methods for each of them. Obviously, concealment of errors in compressed still images is identical to that developed for intra coded texture.

[Block diagram. Inputs: position of missing blocks, data of intra coded frame. Stage 1: compensate the effects of the missing block on the rest of the blocks (if applicable). Stage 2: reconstruct the missing data (deterministic or statistical methods). Output: reconstruction result.]

Figure 3.1: Error concealment of intra coded texture information.

Concealment of errors in intra coded texture information is performed by invoking spatial error concealment methods. That is, the correlation between the missing block and its adjacent blocks in the same frame is used for error concealment. The method introduced in this chapter for concealment of errors in intra coded texture consists of two stages as shown in Figure 3.1. First, if the loss of coded data of a block has any effect on the rest of the blocks of the frame, this effect is compensated. For example, in many of the image and video compression schemes (e.g., JPEG and MPEG-2) the DC coefficient of the DCT transform is differentially coded. Therefore, the decoder is unable to decode the DC coefficients of the blocks which follow the missing one. Second, the missing texture corresponding to the missing blocks is reconstructed. We present both a deterministic technique and a statistical one for the restoration of the missing texture information.

In most of the error concealment methods proposed for texture (e.g., those proposed for JPEG coded images), the dependency between the coded data of blocks has been ignored. In other words, it has been assumed that the DC coefficients of blocks are coded non-differentially [67, 62]. This constrains the impact of a block loss to only the subject block and therefore simplifies the error concealment task, but is not compliant with many image and video compression standards (e.g., JPEG coded images) [3]. Here, we propose an error concealment method for the intra coded texture that addresses the two main problems, that is, the loss of DC track and the reconstruction of data of missing blocks. These two problems are independent in the sense that any method for the recovery of the lost DC track can in principle be combined with any method for the recovery of the data of a missing block and vice-versa.

It should be noted that the dependency between the coded blocks of a frame can be removed by employing tools supported in compression standards (e.g., restart markers in JPEG). Also, some of the video coding schemes (e.g., H.263) do not use differential encoding of DC coefficients. In these cases, the error concealment algorithm skips the first stage shown in Figure 3.1 and proceeds to the next stage, which is the reconstruction of the missing information of the block.

In concealment of inter coded texture information, the post-processing technique makes use of the correlation between the missing block and its neighboring blocks in the same frame and/or the adjacent frames to accomplish error concealment. A block diagram of the method proposed for concealment of errors in inter coded texture information is depicted in Figure 3.2. First, the information in the previous frame is used to find an initial estimate of the missing texture information. If the motion vectors of the missing blocks are available, motion compensation is used to provide the estimates. Otherwise, an algorithm which preserves image continuity is employed to compute the initial estimates. Next, the spatial error concealment method is applied to the prediction error of the blocks adjacent to the missing blocks in order to find an estimate of the prediction error of the missing block. The result is then combined with the initial estimate of the missing block to find its best approximation.

[Block diagram. Inputs: position of missing blocks, data of inter coded frame, data of adjacent frames. Stages: initial estimate of the missing blocks; spatial reconstruction of the prediction error of the missing block.]

Figure 3.2: Error concealment of inter coded texture information.

In the rest of this chapter, we will describe each of the stages shown in Figures 3.1 and 3.2 in more detail. In Section 3.2, a post-processing method is explained that compensates the effects of the loss of a block on the rest of the blocks in the same frame. Sections 3.3 and 3.4 present the deterministic and statistical methods proposed for reconstruction of missing texture information, respectively. Section 3.5 presents the method used to find the best initial estimate of a missing inter coded block in an adjacent frame.

3.2 Removing the Effects of a Missing Block on

the Rest of the Blocks

In many of the compression schemes, the DC coefficients of the blocks of the image are coded differentially. That is, the input source image is partitioned into (8 x 8) blocks along the raster scan order (i.e., left to right / top to bottom) and DCT transformed. Each of the resulting DCT coefficients is quantized using a different uniform quantizer. The DC coefficient, which is the first coefficient, is differentially coded. That is, the difference coefficient DIFF given by

$$\mathrm{DIFF} = d_i - d_{i-1},$$

where $d_i$ and $d_{i-1}$ are the current block and the previous block DC coefficient values (respectively), is coded.
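The following minimal sketch illustrates why a single lost DC difference affects every block that follows it. It is only an illustration: the function names are hypothetical, and the predictor for the first block is assumed to be zero.

```python
def encode_dc(dc_values):
    """Return DIFF_i = d_i - d_{i-1}; the predictor of the first block is
    assumed to be 0 (an assumption of this sketch)."""
    diffs, prev = [], 0
    for d in dc_values:
        diffs.append(d - prev)
        prev = d
    return diffs


def decode_dc(diffs):
    """Rebuild the DC values by accumulating the differences."""
    dc_values, prev = [], 0
    for diff in diffs:
        prev += diff
        dc_values.append(prev)
    return dc_values


dc = [120, 124, 119, 130, 131]
diffs = encode_dc(dc)

# Hypothetical loss: the difference of block 1 is replaced by 0.
corrupted = list(diffs)
corrupted[1] = 0
decoded = decode_dc(corrupted)

# The same offset appears in block 1 and in every block after it.
errors = [a - b for a, b in zip(decoded, dc)]
print(errors)
```

Because the decoder accumulates differences, the error made at the missing block is carried unchanged into all later blocks, which is the level shift that the rest of this section sets out to remove.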

The recovery of the lost DC track, which is necessary for the decoder to decode the DC coefficients of the blocks which follow the missing one, has not been much addressed in the literature. In [66], differential encoding of the DC coefficient in a DCT-based coder has been considered, but it was assumed that there is a synchronization code word at the end of each line of blocks. The synchronization code word limits the propagation of errors due to loss of the DC coefficients, making the error concealment task easier.

To enable the decoder to decode the DC value of blocks following a missing block, the DC value of the missing block should be estimated.

3.2.1 Estimation of DC values

Figure 3.3: Estimation of DC value.

It is assumed that the decoder knows the location of missing blocks. This can be achieved, for example, by transmitting the blocks of the image/frame in a predetermined order and assigning sequence numbers to packets in packet based transmission [67]. The decoder starts decoding the data, and when it reaches an area representing a missing block, it estimates the DC value of the missing block using the DC values of adjacent blocks. The DC value of the missing block is estimated by a linear, causal estimator $\hat{d}$, which is given by

$$\hat{d} = w_1 d_1 + w_2 d_2 + w_3 d_3 + w_4 d_4,$$

where $d_1$, $d_2$, $d_3$ and $d_4$ are the DC values of the adjacent blocks shown in Figure 3.3. The coefficients of the estimator are obtained using well-known Linear Least Square Error (LLSE) techniques. To obtain a set of fixed values for the coefficients that can be used for natural images, four training images, MANDRILL, FIGHTER, LAKE and COUPLE, were used. The obtained values for the coefficients are $w_1 = -0.35$, $w_2 = 0.54$, $w_3 = 0.13$ and $w_4 = 0.68$. If for a specific missing block any of the neighboring blocks does not exist (e.g., for the blocks in the top-most row or in the left-most column), the corresponding DC value is set to zero. In particular, if the block at the top left (the first block) of the image is missing, the estimated DC value of the missing block is set to zero. This causes a constant level shift in all the pixels of the image which can be corrected by adjusting the brightness of the image.
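The estimator can be sketched in a few lines. The weights are those reported above; the assignment of $d_1, \dots, d_4$ to particular causal neighbours is an assumption of this sketch, since Figure 3.3 is not reproduced here, and the function name is hypothetical.

```python
# LLSE coefficients w1..w4 reported in the text.
WEIGHTS = (-0.35, 0.54, 0.13, 0.68)

# ASSUMPTION: hypothetical mapping of d1..d4 to causal neighbours, given as
# (row offset, column offset): top-left, top, top-right, left.
NEIGHBOUR_OFFSETS = ((-1, -1), (-1, 0), (-1, 1), (0, -1))


def estimate_dc(dc_grid, row, col, weights=WEIGHTS, offsets=NEIGHBOUR_OFFSETS):
    """Estimate the DC value of block (row, col) from its causal neighbours.

    A neighbour that lies outside the image contributes zero, as in the text.
    """
    rows, cols = len(dc_grid), len(dc_grid[0])
    estimate = 0.0
    for w, (drow, dcol) in zip(weights, offsets):
        r, c = row + drow, col + dcol
        if 0 <= r < rows and 0 <= c < cols:
            estimate += w * dc_grid[r][c]
    return estimate


grid = [[100, 100, 100],
        [100, None, 100],   # None marks the missing block
        [100, 100, 100]]
print(estimate_dc(grid, 1, 1))  # flat neighbourhood
print(estimate_dc(grid, 0, 0))  # first block of the image: no neighbours
```

Since the four reported weights sum to one, a flat neighbourhood is reproduced regardless of how the neighbours are ordered; the ordering only matters in textured regions.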

3.2.2 Removal of stripes

An error in estimating the DC value of a missing block causes a shift in the gray level of all the pixels between that block and the next missing block. This level shift appears as a stripe across the image. To remove the stripe, the amount of the level shift should be determined. Adjacent pixels in an image are highly correlated and their values are close to each other. Therefore, the difference between the average of pixel values in the stripe and that of their spatially neighboring pixels outside (above) the stripe seems to be an appropriate estimate of the level shift. Using the a-priori information about the position of missing blocks, we compute

$$\Delta_i = l_{i,i+1} - m_{i,i+1},$$

where $l_{i,i+1}$ is the average value of the pixels along the upper line segment connecting the missing blocks $i$ and $i+1$, and $m_{i,i+1}$ is the average value of the pixels in the corresponding line segment in the previous row (see Figure 3.4). $l_{i,i+1}$ can be written

$$l_{i,i+1} = \frac{1}{N_{i,i+1}} \sum_{j=r_i}^{l_{i+1}} x_{p,j},$$

where $x_{p,j}$ is the value of the pixel in row $p$ and column $j$, $r_i$ is the column number of the right column of block $i$, $l_{i+1}$ is the column number of the left column of block $i+1$, $p$ is the row number of the upper pixels in the blocks $i$ and $i+1$, and $N_{i,i+1}$ is the number of pixels in the row $p$ between columns $r_i$ and $l_{i+1}$. A similar definition applies to $m_{i,i+1}$. If $|\Delta_i| > T$, where $T$ is a threshold, a stripe is detected and removed by subtracting $\Delta_i$ from all the pixels between the missing blocks $i$ and $i+1$. The procedure is repeated for all the $n$ missing blocks.
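The detection and removal step can be sketched as follows. This is a simplified illustration under stated assumptions: both missing blocks lie in the same row of blocks, a previous pixel row exists, and the caller supplies the inclusive column and row limits of the stripe. The function and parameter names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def remove_stripe(image, p, first_col, last_col, bottom_row, threshold=4):
    """Detect and remove a level shift in rows p..bottom_row and columns
    first_col..last_col (all limits inclusive). Returns the estimated shift.

    ASSUMPTION: p >= 1, so that a previous pixel row exists.
    """
    upper_line = image[p][first_col:last_col + 1]         # gives l_{i,i+1}
    previous_line = image[p - 1][first_col:last_col + 1]  # gives m_{i,i+1}
    delta = mean(upper_line) - mean(previous_line)
    if abs(delta) > threshold:
        for r in range(p, bottom_row + 1):
            for c in range(first_col, last_col + 1):
                image[r][c] -= delta
    return delta


# Two correct rows followed by two rows containing a stripe of height +10.
img = [[50] * 6 for _ in range(2)] + [[50, 60, 60, 60, 60, 50] for _ in range(2)]
delta = remove_stripe(img, p=2, first_col=1, last_col=4, bottom_row=3)
print(delta, img[2])
```

The default threshold of 4 is the value selected later in this section.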

Figure 3.4: Post-processing for removing the stripes.

The missing blocks $i$ and $i+1$ may have different horizontal positions, but this does not affect the above procedure. As discussed above, the average of the pixel values along the upper line segment inside the stripe is computed, and it is then compared to the average of the corresponding pixels in the previous line segment. The line segments would, of course, be broken lines in this case. Figure 3.5 shows the various cases that can occur when the missing blocks have different horizontal positions. Obviously, if two missing blocks are immediately adjacent to each other, no stripe is formed between them, and the algorithm proceeds to the next pair of missing blocks. For the missing blocks appearing in the upper-most part of the image, there is not a previous row. Therefore, the averages of pixel values in a block before and a block after the missing block are compared to estimate the level shift.

Figure 3.5: Removing the stripes for blocks with different horizontal positions.

To determine the value of the threshold $T$, the distribution of the differences between the average pixel values of consecutive rows in an image was obtained for a set of images. Figure 3.6 shows the histogram of the difference values for the images MANDRILL and FIGHTER. As can be seen, the histograms have a peak around zero, indicating that most of the rows have almost the same average pixel values. Nevertheless, with a small threshold value (i.e., around zero), there is the possibility of exceeding the threshold level because of an edge at the line segments shown in Figure 3.4, even though there are no visible stripes. Taking this fact into account, considering the difference value at which stripes become visible to the human eye (especially in the smooth regions of the image), and to keep the probability of false alarm (i.e., there is not a stripe but the algorithm detects one) below 1%, a threshold value of 4 was selected [100].

Figure 3.6: Histogram of the difference value: (a) Fighter; (b) Mandrill.
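The false-alarm part of this threshold selection can be sketched as below. This is only an illustration: it works on whole rows rather than line segments, it ignores the visibility criterion mentioned above, and the function names and the synthetic data are hypothetical.

```python
def row_mean_differences(image):
    """Absolute differences between the average pixel values of consecutive rows."""
    means = [sum(row) / len(row) for row in image]
    return [abs(b - a) for a, b in zip(means, means[1:])]


def pick_threshold(differences, max_false_alarm=0.01):
    """Smallest integer T such that the fraction of stripe-free differences
    exceeding T does not exceed max_false_alarm."""
    t = 0
    while sum(d > t for d in differences) > max_false_alarm * len(differences):
        t += 1
    return t


# Synthetic stand-in for measured differences (not data from the thesis).
synthetic = [0] * 90 + [1] * 6 + [3] * 4
print(row_mean_differences([[1, 1], [3, 3], [3, 5]]))
print(pick_threshold(synthetic))
```

In practice the differences would be pooled over a set of stripe-free training images before the threshold is chosen.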

3.3 A Deterministic Method for Reconstruction of

Missing Texture Information

The estimation of the DC values of the missing blocks enables the decoder to decode the blocks following the missing block. After the stripes are removed, the missing data should be reconstructed.

Our deterministic method for estimating the missing data of a block is based on the one suggested in [65]. The method proposed in [67] estimates the data of a missing block as a linear combination of its top, bottom, right and left blocks, i.e., the estimated matrix $P_Z$ is given by

$$P_Z = w_T P_T + w_B P_B + w_L P_L + w_R P_R \qquad (3.1)$$

where $P_T$, $P_B$, $P_L$ and $P_R$ are 8 x 8 matrices of pixel values of the top, bottom, left and right blocks (respectively) of the missing block¹. The weights $w_T$, $w_B$, $w_L$, $w_R$ are selected so that the total squared border error

$$\varepsilon = \varepsilon_T + \varepsilon_B + \varepsilon_L + \varepsilon_R \qquad (3.2)$$

is minimized. Each border error is given by

$$\varepsilon_T = \|p_{Zt} - p_{Tb}\|^2, \quad \varepsilon_B = \|p_{Zb} - p_{Bt}\|^2, \quad \varepsilon_L = \|p_{Zl} - p_{Lr}\|^2, \quad \varepsilon_R = \|p_{Zr} - p_{Rl}\|^2 \qquad (3.3)$$

where the matrix $P_Z$ is generated using Equation (3.1). The vectors $p_{Zt}$, $p_{Zb}$, $p_{Zl}$ and $p_{Zr}$ are those elements of $P_Z$ that correspond to the top, bottom, left and right pixels in the missing block, respectively. Similarly, $p_{Tb}$ consists of the bottom line pixels of the top block, $p_{Bt}$ consists of the top line pixels of the bottom block, $p_{Lr}$ consists of the right line pixels of the left block and $p_{Rl}$ consists of the left line pixels of the right block. By reconstructing each pixel in a block as a linear combination of the corresponding pixels from neighboring blocks, the structure of the reconstructed block is determined by the structure of the neighboring blocks rather than by a global smoothness assumption which may yield unwanted artifacts. Moreover, local image characteristics are preserved and high frequency details can be reproduced. This method can reconstruct horizontal, vertical, near-horizontal and near-vertical edges very well. Strong diagonal edges, however, are not reconstructed well by this technique. What yields a poor reproduction quality around diagonal edges is the fact that only one weight for each of the top, bottom, right and left blocks is used. In our proposed method, two weights are assigned to each of these blocks. First, we consider the case that each of these blocks is partitioned into two parts, and a constant weight is assigned to each part. Thus, the estimation equation will be modified to Equation (3.4), in which each scalar weight is replaced by a diagonal weight matrix,

¹ Without loss of generality, in this section, we assume that the size of a block is 8 x 8.

where $I_{4\times4}$ is the 4 x 4 identity matrix, $O$ is the 4 x 4 zero matrix and $w_{T1}$, $w_{T2}$, $w_{B1}$, $w_{B2}$, $w_{L1}$, $w_{L2}$, $w_{R1}$ and $w_{R2}$ are weighting coefficients. Values for the latter coefficients are selected by minimizing the cost function described in (3.2) and (3.3). Equation (3.4) can also be written as follows:

$$P_Z = w_{T1} P_{T1} + w_{T2} P_{T2} + w_{B1} P_{B1} + w_{B2} P_{B2} + w_{L1} P_{L1} + w_{L2} P_{L2} + w_{R1} P_{R1} + w_{R2} P_{R2} \qquad (3.5)$$

where $P_{T1}$ is $P_T$ with one half replaced with zeros and $P_{T2}$ is the other half of $P_T$. A similar definition applies to $P_{B1}$, $P_{B2}$, $P_{L1}$, $P_{L2}$, $P_{R1}$ and $P_{R2}$. Considering Equations (3.1) and (3.5), the reconstruction procedure can be interpreted as the estimation of a 2-D function using a set of basis functions. The number of basis functions is four in Equation (3.1) and eight in Equation (3.5). Equation (3.1) is a special case of Equation (3.5), where $w_{T1} = w_{T2}$, $w_{B1} = w_{B2}$, $w_{L1} = w_{L2}$ and $w_{R1} = w_{R2}$. Therefore, all the satisfactory features of the method proposed in [67] are preserved in our method. Increasing the number of independent basis functions is expected to decrease the estimation error. The manner in which adjacent blocks are partitioned affects the basis functions and thus the estimation results. The adjacent blocks should not all be partitioned in just one direction (horizontally or vertically), as such a partitioning would make our method almost the same as the one proposed in [67] with a block size of 8 x 4 or 4 x 8. Moreover, two blocks that lie on the same side of a diagonal edge (e.g., left and bottom blocks) should not be partitioned in the same direction, because this causes four of the basis functions in Equation (3.5) to be almost the same. The choice of which of the two opposing pairs of blocks should be partitioned horizontally or vertically is arbitrary. In this work, as shown in Figure 3.7, the top and bottom blocks are partitioned in the horizontal direction, and the left and right blocks are partitioned in the vertical direction.

Figure 3.7: The partitioning of the blocks adjacent to a missing block.
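The eight-weight estimator of Equation (3.5) and the minimization of the border error can be sketched as a small least-squares problem. This is a sketch under stated assumptions, not the implementation used in this work: the way each neighbour is split into halves is one possible reading of Figure 3.7, the small ridge term is added only to keep the normal equations solvable in flat regions, and all function names are hypothetical.

```python
N = 8  # block size


def masked(block, keep):
    """Copy of block in which pixels with keep(r, c) == False are set to zero."""
    return [[block[r][c] if keep(r, c) else 0.0 for c in range(N)] for r in range(N)]


def basis_blocks(top, bottom, left, right):
    """Eight basis blocks P_T1, P_T2, P_B1, P_B2, P_L1, P_L2, P_R1, P_R2."""
    basis = []
    for block in (top, bottom):   # ASSUMPTION: split into left and right halves
        basis.append(masked(block, lambda r, c: c < N // 2))
        basis.append(masked(block, lambda r, c: c >= N // 2))
    for block in (left, right):   # ASSUMPTION: split into upper and lower halves
        basis.append(masked(block, lambda r, c: r < N // 2))
        basis.append(masked(block, lambda r, c: r >= N // 2))
    return basis


def border(block):
    """Top row, bottom row, left column and right column as one vector."""
    return (list(block[0]) + list(block[N - 1])
            + [row[0] for row in block] + [row[N - 1] for row in block])


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def reconstruct(top, bottom, left, right, ridge=1e-3):
    """Weights minimizing the total squared border error, and the estimated block.

    ASSUMPTION: `ridge` is a small regularizer added by this sketch.
    """
    basis = basis_blocks(top, bottom, left, right)
    # Neighbour pixels that touch the missing block, in the order used by border().
    target = (list(top[N - 1]) + list(bottom[0])
              + [row[N - 1] for row in left] + [row[0] for row in right])
    columns = [border(b) for b in basis]
    m = len(columns)
    gram = [[sum(x * y for x, y in zip(columns[i], columns[j]))
             + (ridge if i == j else 0.0) for j in range(m)] for i in range(m)]
    rhs = [sum(x * y for x, y in zip(columns[i], target)) for i in range(m)]
    weights = solve(gram, rhs)
    block = [[sum(w * b[r][c] for w, b in zip(weights, basis)) for c in range(N)]
             for r in range(N)]
    return weights, block


flat = [[100.0] * N for _ in range(N)]
weights, block = reconstruct(flat, flat, flat, flat)
print(round(block[3][3], 3))
```

Passing the four unsplit neighbour blocks as the basis would give the four-weight estimator of Equation (3.1) with the same solver, which is one way to see that (3.1) is a special case of (3.5).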

The weight matrices in Equation (3.4) are diagonal matrices and the diagonal elements of each matrix can be considered as an 8-point sequence. For the case discussed above, each of the 8-point sequences is a piece-wise constant function with two values (e.g., $w_{L1}$ and $w_{L2}$). As will be seen in Chapter 5, using piece-wise constant sequences improves the reconstruction of diagonal edges but may generate some blockiness in the reconstructed images. If the 8-point sequences mentioned above are chosen as linear ramps, the estimation equation can be written as Equation (3.6), in which consecutive diagonal elements of each weight matrix differ by a constant increment, where $m_T$, $m_B$, $m_L$ and $m_R$ are the increments assigned to the weights of the corresponding blocks. Values of the unknown parameters ($w$'s and $m$'s) are obtained by minimizing the total border error described in Equation (3.2). Using the above estimation, the reconstructed images will not contain much blockiness but the diagonal edges may be smoothed. Other types of 8-point sequences can also be assigned to each block. This, however, would increase the computational complexity of the method. Indeed, our results show that the improvement in the visual quality of the reconstructed images does not justify the additional complexity of using more complex weighting schemes.

Using the diagonally adjacent blocks can help improve the reconstruction of diagonal edges. This, however, would increase the computational complexity of the error concealment method by a factor of 8. Moreover, the diagonally adjacent blocks are farther from the missing block than the horizontally and vertically adjacent blocks. Thus, their correlation with the missing block is usually less than that of the horizontally or vertically adjacent blocks. Therefore, including diagonally adjacent blocks in the error concealment would likely not provide a significant improvement in reconstruction quality. In fact, our simulations show that, in many cases, the increase in quality of the reconstructed image as a result of using diagonally adjacent blocks does not justify the increase in computational complexity.

The weights $w_{T1}$, $w_{T2}$, $w_{B1}$, $w_{B2}$, $w_{L1}$, $w_{L2}$, $w_{R1}$ and $w_{R2}$, which are required for estimating the missing block using Equation (3.4), are determined by minimizing the total border error given by Equation (3.2). Since the total border error is a quadratic function of the above mentioned weights, the weights are obtained by solving a set of linear equations. Likewise, if Equation (3.6) is used as the estimator, the total border error is a quadratic function of the unknown parameters $w_{T1}$, $m_{T1}$, $w_{B1}$, $m_{B1}$, $w_{L1}$, $m_{L1}$, $w_{R1}$ and $m_{R1}$. Therefore, the values of the latter parameters can also be obtained by solving a set of linear equations. Thus, the associated computational complexity of our method is equal to that required for solving a set of 8 linear equations with 8 unknowns, besides the multiplications and additions required to compute the equation coefficients from the available data and the ones necessary in Equations (3.4) or (3.6).

Table 3.1: Computational complexity of various reconstruction methods.

Method             Number of additions   Number of multiplications
Method of [62]     8256                  4069
Method of [65]     1082                  740
Proposed method    1720                  1920

We next compare the computational complexity, i.e., the required number of additions and multiplications, of our method with other methods reviewed in Chapter 2. Generally, edge-based methods are computationally intensive since they require estimation of edge directions. Some other methods like [66] are computationally very simple but their performance levels are worse than that of our method. Therefore, we compare our method with only those proposed in [67], [65] and [62]. The numbers of additions and multiplications required for restoring 64 pixel values are 860 and 960 respectively using the method proposed in [67] and 1720 and 1920 respectively using our method. The method proposed in [62] requires 8256 additions and 4069 multiplications to restore the 64 missing pixel values in the spatial domain. The method proposed in [65] is unable to restore all of the 64 coefficients in the DCT domain. It restores up to 28 DCT coefficients. The missing pixel values are obtained using the inverse DCT. The required numbers of additions and multiplications for this method, assuming a fast inverse DCT algorithm, are 1082 and 740, respectively. Table 3.1 summarizes the comparison results. The computational demand of our method is larger than that of [67] and [65] and less than that of [62] and edge-based methods. However, our method can reconstruct image details which the method presented in [65] is unable to recover, and it can reconstruct diagonal edges which the method presented in [67] cannot restore. This, in most applications, justifies the additional computational complexity.

3.4 A Statistical Method for Reconstruction of

Missing Texture Information

The statistical method that we propose for the reconstruction of missing texture information employs a Maximum A Posteriori (MAP) estimator. We consider a Markov Random Field (MRF) with an eight-pixel clique around each pixel as the image a priori model (see the marked pixels in Figure 3.8) [101]. The reason for selecting an eight-pixel clique in the manner shown in Figure 3.8 will become clear later. The potential function of Equation (2.6) can be written as

$$\sum_{(k,l) \in c} w_{i,j-k,l}\, \rho\big(D(x_{i,j}, x_{k,l})\big) \qquad (3.7)$$

where $c$ is the clique, $D(x_{i,j}, x_{k,l})$ is the difference between the values of the pixel in position $(i,j)$ and the pixel in its clique at position $(k,l)$, and $w_{i,j-k,l}$ is the weight assigned to this difference.

Figure 3.8: A pixel, its clique $c$ and the eight directions. The complement of the clique, $c'$, is the dark area.

Combining the MAP estimator of Equation (2.4) with Equations (2.5) and (3.7), the restoration of missing data results in the following minimization problem

$$\hat{x} = \arg\min_{x_{i,j},\,(i,j) \in M}\; \sum_{i,j} \sum_{(k,l) \in c} w_{i,j-k,l}\, \rho\big(D(x_{i,j}, x_{k,l})\big) \qquad (3.8)$$

where $M$ is the set of all missing pixels in the frame. For the case $\rho(x) = x^2$ (Gauss-Markov Random Field (GMRF) model), the above minimization problem yields a unique global solution, since $\rho(x)$ is a convex function. In fact, the estimated value of a pixel is given by

$$\hat{x}_{i,j} = \frac{\sum_{(k,l) \in c} w_{i,j-k,l}\, x_{k,l} + \sum_{(k,l) \in c'} w_{k,l-i,j}\, x_{k,l}}{\sum_{(k,l) \in c} w_{i,j-k,l} + \sum_{(k,l) \in c'} w_{k,l-i,j}} \qquad (3.9)$$

where $c$ is the clique and $c'$ is its complement shown in Figure 3.8. If all the weights are selected equal to each other, the above equation is simplified to

$$\hat{x}_{i,j} = \frac{1}{n_c + n_{c'}} \left( \sum_{(k,l) \in c} x_{k,l} + \sum_{(k,l) \in c'} x_{k,l} \right)$$

where $n_c$ and $n_{c'}$ are the numbers of pixels in the clique and its complement, respectively (each 8 for the shown clique). In other words, the estimated value of a pixel is the average of the pixels in its clique and the clique's complement (see Figure 3.8). The values of missing pixels in a block can be estimated using Equation (3.9) iteratively. This is in fact the application of the iterative conditional mode (ICM) method proposed in [102]. In the ICM method, the MAP problem is not solved but only approximated. Depending on the selected neighborhood system, pixels are divided into groups such that those in one group are mutually independent and the a posteriori probability of each group is maximized. This permits fast convergence and parallel synchronous implementation. Since the function $\rho(\cdot)$ is convex, the iterative method will converge to the global minimum.
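The iterative estimation for the GMRF case can be sketched as repeated weighted averaging over the missing pixels. The clique geometry below is an assumption of this sketch, since Figure 3.8 is not reproduced here: eight hypothetical offsets covering eight directions over 180 degrees, with the complement obtained by negating each offset, and one weight per direction shared by a clique pixel and its mirror image. Pixels are updated sequentially in place.

```python
# ASSUMPTION: hypothetical (row, column) offsets for the eight-pixel clique.
CLIQUE = [(0, 1), (-1, 2), (-1, 1), (-2, 1), (-1, 0), (-2, -1), (-1, -1), (-1, -2)]


def icm_gmrf(image, missing, weights=None, max_iter=500, tol=1e-6):
    """Replace each missing pixel by the weighted average of the pixels in its
    clique and in the clique's complement, and repeat until the largest update
    falls below tol."""
    if weights is None:
        weights = [1.0] * len(CLIQUE)   # equal weights: plain average
    rows, cols = len(image), len(image[0])
    for _ in range(max_iter):
        largest_change = 0.0
        for (i, j) in missing:
            num = den = 0.0
            for w, (di, dj) in zip(weights, CLIQUE):
                # Clique pixel and its mirror image in the complement.
                for k, l in ((i + di, j + dj), (i - di, j - dj)):
                    if 0 <= k < rows and 0 <= l < cols:
                        num += w * image[k][l]
                        den += w
            new_value = num / den
            largest_change = max(largest_change, abs(new_value - image[i][j]))
            image[i][j] = new_value
        if largest_change < tol:
            break
    return image


# A smooth ramp with an 8 x 8 hole: the ramp is recovered from its surroundings.
truth = [[2.0 * r + 3.0 * c for c in range(16)] for r in range(16)]
damaged = [row[:] for row in truth]
hole = [(r, c) for r in range(4, 12) for c in range(4, 12)]
for r, c in hole:
    damaged[r][c] = 0.0
restored = icm_gmrf(damaged, hole)
print(max(abs(restored[r][c] - truth[r][c]) for r, c in hole) < 1e-3)
```

With equal weights the update reduces to the plain average described above; direction-dependent weights can be passed through the `weights` argument.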

The cost function, $\rho(x)$, may also be selected as the Huber function given by

$$\rho(x) = \begin{cases} x^2, & |x| \le \gamma \\ 2\gamma|x| - \gamma^2, & |x| > \gamma \end{cases}$$

where $\gamma$ is a threshold. The image model using this cost function is called a Huber Markov Random Field (HMRF) model. For this case it is difficult to obtain a closed form expression for the estimated value of a missing pixel [103]. The solution to Equation (3.8), however, can be obtained by means of the ICM method. The performance of the MAP estimator based on the HMRF model can be described as follows. When the values of the pixels in the clique are close to each other, the missing pixel is set to the average of those pixels. When the pixel values are not similar, a voting procedure is performed and the estimated value is selected such that it is close to the value of the majority of the neighboring pixels (a median-like performance). This behavior prevents the appearance of pixel values that are different from their neighbors, which in turn limits the performance of the estimator in the reconstruction of edges. Besides, HMRF based reconstruction algorithms are computationally complex and their suboptimal version, median filtering, does not yield unique solutions for the missing values [104].

A very simple example depicted in Figure 3.9 reveals the performance of a MAP estimator using the GMRF and HMRF models. The value of the center pixel $x_{i,j}$ is missing. Assume that the pixel values are $p$ and $q$ as shown in the figure, $p \ll q$, and there is a vertical line in the image. Using the GMRF model, the value of the missing pixel is the weighted average given by Equation (3.9). Using the HMRF model for this specific example, we will have the estimate of Equation (3.12). Obviously, both of the estimated values are close to the majority of the neighboring pixels. Thus, none of the above mentioned models (i.e., GMRF and HMRF) is able to detect the presence of the vertical edge and reconstruct the missing pixel value based on that edge. Usually, relying only on the local image characteristics in the reconstruction procedure causes some of the image attributes to be ignored or misinterpreted [105].

Figure 3.9: A missing pixel in a vertical line.

Here, we adopt a GMRF with an eight-pixel clique (shown in Figure 3.8) as the image a-priori model. However, the weight corresponding to the difference between a pixel and each of the pixels in its clique ($w_{i,j-k,l}$ in Equation (3.9)) is selected adaptively, based on the likelihood of an edge in the direction of the subject pair of pixels. The rationale behind this selection is to give more weight to the difference between the pixels in that direction, which will cause the values of the pixels in that direction to get closer to each other. The likelihood of edges in each of the eight directions is computed using blocks around the missing block. In this way, the available information in a larger area is exploited without increasing the order of the GMRF model (which increases the computational complexity dramatically). To determine the likelihood of edges in each of the eight directions, edges in the blocks surrounding the missing block whose directions imply that they pass through the missing block are determined using the gradient components $g_x$ and $g_y$, computed

for every pixel in the blocks to the left, right, top and bottom of the missing block. The magnitude and angular direction of the edge at pixel $(i,j)$ are

$$G = \sqrt{g_x^2 + g_y^2}, \qquad \theta = \arctan\left(\frac{g_y}{g_x}\right),$$

where $\theta$ determines if the edge at pixel $(i,j)$ passes through the missing block. Since there are eight pixels in the clique, the value of $\theta$ is rounded to one of the eight directions equally spaced in the range from 0° to 180°. There is a counter $c_m$ ($m = 1, 2, \ldots, 8$) for each of the eight directions. If the extension of an edge at pixel $(i,j)$ belonging to one of the neighboring blocks passes through the missing area, the counter for that particular direction is incremented by the amount of $G$. This procedure is repeated for all the pixels in the blocks to the top, bottom, left and right of the missing block (if applicable). To remove the spurious edges resulting from the type of the edge detector employed, the values of the counters are thresholded.
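The accumulation of the direction counters can be sketched as follows. Two simplifications are assumed and should be read as such: Sobel masks stand in for the gradient operator, which is not identified in the surviving text, and the geometric test of whether the extension of an edge passes through the missing block is omitted, so every pixel with a non-zero gradient contributes. The function name is hypothetical.

```python
import math

# ASSUMPTION: Sobel masks are used as the gradient operator in this sketch.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def direction_counters(block, n_directions=8, threshold=0.0):
    """Accumulate the gradient magnitude G into one counter per quantized angle."""
    counters = [0.0] * n_directions
    step = 180.0 / n_directions
    for i in range(1, len(block) - 1):
        for j in range(1, len(block[0]) - 1):
            gx = sum(SOBEL_X[a][b] * block[i - 1 + a][j - 1 + b]
                     for a in range(3) for b in range(3))
            gy = sum(SOBEL_Y[a][b] * block[i - 1 + a][j - 1 + b]
                     for a in range(3) for b in range(3))
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Fold the angle into [0, 180) and round to the nearest direction.
            theta = math.degrees(math.atan2(gy, gx)) % 180.0
            m = int(round(theta / step)) % n_directions
            counters[m] += magnitude
    # Thresholding suppresses weak, spurious responses.
    return [c if c > threshold else 0.0 for c in counters]


vertical_step = [[0] * 4 + [100] * 4 for _ in range(8)]
horizontal_step = [[0] * 8 for _ in range(4)] + [[100] * 8 for _ in range(4)]
print(direction_counters(vertical_step))
print(direction_counters(horizontal_step))
```

The counters of all available neighbouring blocks would then be combined before the weights are derived from them.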

There are eight pixels in the clique of each pixel and eight directions for the detected edges. Each pixel in the clique of a pixel corresponds to a direction. In our proposed method, the value for $w_{i,j-k,l}$ is selected based on the edge counter of the direction corresponding to $(i,j)$ and $(k,l)$, i.e., as given by Equation (3.13), where $\beta$ is a constant and $c_m$ is the counter corresponding to direction $m$, and direction $m$ corresponds to the direction formed by $(i,j)$ and $(k,l)$. Thus, our statistical error concealment method can be summarized as follows:

1. Determine the edges in the neighboring blocks and assign each to one of eight equally spaced directions. Compute the counter for each direction,

2. Use Equation (3.13) to find a set of weights for each missing block,

3. Use Equation (3.9) to obtain an estimate of each missing pixel employing the weights obtained in the previous step, and

4. Iteratively re-estimate the missing pixels using Equation (3.9) until convergence.

In the case where adjacent blocks are lost, the reconstruction algorithm is applied recursively. Blocks with the maximum number of correctly decoded neighboring blocks are reconstructed first, and the rest of the blocks are reconstructed recursively. This guarantees that the best possible estimation accuracy, when missing blocks are restored one by one, is achieved. Note that, since the cost function is convex, convergence is guaranteed.
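The ordering rule for adjacent lost blocks can be sketched as a simple greedy loop. This sketch assumes that a block counts as available once it has been either correctly decoded or already reconstructed, that only the four horizontally and vertically adjacent blocks are considered, and that ties are broken in raster order; the function name is hypothetical.

```python
def reconstruction_order(missing, rows, cols):
    """Order the missing blocks so that, at every step, the block with the
    largest number of available 4-neighbours is reconstructed next."""
    remaining = set(missing)
    order = []

    def available_neighbours(block):
        r, c = block
        count = 0
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            n = (r + dr, c + dc)
            inside = 0 <= n[0] < rows and 0 <= n[1] < cols
            if inside and n not in remaining:
                count += 1
        return count

    while remaining:
        # sorted() makes tie-breaking deterministic (raster order).
        best = max(sorted(remaining), key=available_neighbours)
        order.append(best)
        remaining.remove(best)
    return order


# Three lost blocks, given as (block row, block column), in a 3 x 3 grid.
print(reconstruction_order([(0, 0), (0, 1), (1, 1)], rows=3, cols=3))
```

In the example, the centre block has three available neighbours and goes first, after which it serves as a neighbour for the block above it.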

For intra coded blocks, our MAP estimation method is employed along with the data of neighboring blocks. For inter coded blocks, prediction error signals of the neighboring blocks are used together with the MAP estimator (or the method proposed in Section 3.3) to find an estimate of the prediction error of the missing block. The estimated value is then added to the initial estimate (as shown in Figure 3.2). Since the prediction error signal consists mostly of high frequency components, the second stage will improve the video reconstruction quality, especially around edges.

The numbers provided below for the computational complexity correspond to blocks and sub-blocks of sizes 16 x 16 and 8 x 8, respectively. The computational load of the proposed MAP estimator consists of those computations required in adapting the weights of the GMRF model, which are approximately 18400 operations for a missing block, assuming four available neighboring blocks. The estimation of missing pixels using Equation (3.9) is an iterative procedure which, for each iteration, requires approximately 130 operations. On average, 80 iterations are required for the algorithm to converge for a block.

3.5 Initial Texture Estimate of a Missing Inter Coded

Block

The purpose of this stage of our method is to obtain an initial estimate of the missing texture of an inter coded block using information from the previous frame. If the motion vector of the missing block is received correctly (e.g., because the encoded texture data has been separated from the motion data via partitioning), then the estimate is set to the motion compensation predicted block. When the motion vector is not available, we do not employ the motion vectors of adjacent blocks to estimate the missing motion vector due to the reasons mentioned in Chapter 2. Instead, we find an estimate of the missing block such that image continuity inside the block and across its boundaries is preserved [106]. To do this, sub-blocks adjacent to the missing block, i.e., $u_1$, $u_2$, $l_1$ and $l_2$ in Figure 3.10a, are considered. First, for each of these sub-blocks, a corresponding sub-block in the previous frame is determined. The corresponding sub-block is found by searching a small area around the point corresponding to the center of each of the sub-blocks $u_1$, $u_2$, $l_1$, $l_2$ in the previous frame. The sum of absolute differences is used as the measure of similarity. The four sub-blocks $u_1$, $u_2$, $l_1$, $l_2$ and their corresponding sub-blocks $u_1'$, $u_2'$, $l_1'$, $l_2'$ in the previous frame are shown in Figures 3.10a and 3.10b. For example, $u_1'$ is the sub-block corresponding to $u_1$. Then, four blocks, namely $X_1$, $X_2$, $X_3$ and $X_4$, which are connected to $u_1'$, $u_2'$, $l_1'$ and $l_2'$ (respectively), are determined. To obtain an initial estimate of the missing block that smoothly connects to the rest of the image, a block from the above four blocks that minimizes the sum of the squared border errors between the estimated block and its adjacent above and left blocks is selected. Thus

$$\hat{X} = \arg\min_{X \in \{X_1, X_2, X_3, X_4\}} \varepsilon \qquad (3.14)$$

where

$$\varepsilon = \varepsilon_T + \varepsilon_L. \qquad (3.15)$$

Each of the border errors $\varepsilon_T$ and $\varepsilon_L$ is defined in terms of pixels by

$$\varepsilon_T = \|x_t - p_{Tb}\|^2, \qquad \varepsilon_L = \|x_l - p_{Lr}\|^2,$$

where the vector $p_{Tb}$ consists of the bottom line pixels of the block to the top of the missing block and $p_{Lr}$ consists of the right column pixels of the block to the left of the missing block. The vectors $x_t$ and $x_l$ are those elements of the estimated block $X$ that correspond to the pixels in its top row and left column, respectively. If the missing block belongs to the upper-most row of the frame, only the left border error is considered in Equation (3.15). Similarly, for missing blocks lying in the left-most column of the frame, only the upper border error is considered in Equation (3.15). If the first block of a frame is missing, the initial estimate is set to the corresponding block in the previous frame.

Figure 3.10: (a) The four sub-blocks, (b) their corresponding sub-blocks in the previous frame and the blocks connected to them.
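The selection among the candidate blocks can be sketched as below. The sketch assumes square blocks stored as lists of rows and treats an absent top or left neighbour as `None`, which reproduces the special cases for the upper-most row and left-most column; the function names are hypothetical.

```python
def border_error(candidate, top_neighbour=None, left_neighbour=None):
    """Squared border error between a candidate block and the blocks above and
    to the left of the missing block; an absent neighbour is skipped."""
    n = len(candidate)
    error = 0.0
    if top_neighbour is not None:
        # Top row of the candidate against the bottom row of the block above.
        error += sum((candidate[0][c] - top_neighbour[-1][c]) ** 2 for c in range(n))
    if left_neighbour is not None:
        # Left column of the candidate against the right column of the left block.
        error += sum((candidate[r][0] - left_neighbour[r][-1]) ** 2 for r in range(n))
    return error


def initial_estimate(candidates, top_neighbour=None, left_neighbour=None):
    """Return the candidate with the smallest total border error."""
    return min(candidates,
               key=lambda x: border_error(x, top_neighbour, left_neighbour))


top = [[10, 10], [10, 10]]
left = [[20, 20], [20, 20]]
a = [[10, 10], [20, 20]]
b = [[50, 50], [50, 50]]
print(border_error(a, top, left), border_error(b, top, left))
print(initial_estimate([b, a], top, left) is a)
```

The four candidates passed in would be the blocks $X_1$ to $X_4$ connected to the matched sub-blocks in the previous frame.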

The estimation of motion of the sub-blocks has a high computational overhead which can possibly introduce unacceptable requirements on the decoder. The computational overhead can be reduced if the search for the displacement of each of the sub-blocks is restricted to a set of candidate motion vectors. This is a decoder option and can be used to trade performance against computational complexity [107]. The set consists of the motion vector of the block corresponding to the missing block in the previous frame, the motion vectors of available neighboring blocks, the median of the motion vectors of available neighboring blocks, the average of the motion vectors of available neighboring blocks, and the zero motion vector [108].
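A restricted search over such a candidate set can be sketched as follows. Several details are assumptions of this sketch rather than statements about the method: motion vectors are integer (row, column) displacements, the median is taken per component, the median and average are rounded to integers, and candidates that would read outside the previous frame are discarded. All names are hypothetical.

```python
from statistics import median


def candidate_vectors(previous_mv, neighbour_mvs):
    """Candidate set: co-located vector, neighbouring vectors, their
    component-wise median and average, and the zero vector."""
    candidates = ([previous_mv] if previous_mv is not None else []) + list(neighbour_mvs)
    if neighbour_mvs:
        ys = [mv[0] for mv in neighbour_mvs]
        xs = [mv[1] for mv in neighbour_mvs]
        candidates.append((round(median(ys)), round(median(xs))))
        candidates.append((round(sum(ys) / len(ys)), round(sum(xs) / len(xs))))
    candidates.append((0, 0))
    unique = []
    for mv in candidates:          # drop duplicates, keep order
        if mv not in unique:
            unique.append(mv)
    return unique


def sad(current, previous, top, left, size, mv):
    """Sum of absolute differences between a sub-block of the current frame
    and the sub-block displaced by mv = (dy, dx) in the previous frame."""
    dy, dx = mv
    return sum(abs(current[top + r][left + c] - previous[top + dy + r][left + dx + c])
               for r in range(size) for c in range(size))


def best_vector(current, previous, top, left, size, candidates):
    """Candidate with the smallest SAD among those that stay inside the frame."""
    rows, cols = len(previous), len(previous[0])
    valid = [mv for mv in candidates
             if 0 <= top + mv[0] and top + mv[0] + size <= rows
             and 0 <= left + mv[1] and left + mv[1] + size <= cols]
    return min(valid, key=lambda mv: sad(current, previous, top, left, size, mv))


previous = [[10 * r + c for c in range(6)] for r in range(6)]
current = [row[:] for row in previous]
for r in range(2):                 # sub-block at (2, 2) moved by (1, 2)
    for c in range(2):
        current[2 + r][2 + c] = previous[3 + r][4 + c]

cands = candidate_vectors((1, 2), [(0, 0), (1, 2), (2, 2)])
print(cands)
print(best_vector(current, previous, 2, 2, 2, cands))
```

Evaluating only a handful of candidates instead of a full search window is what produces the reduction in operations reported in Section 3.5.1.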

3.5.1 Computational Complexity

The computational load of this stage of the proposed method consists of those computations required for 1) estimating the displacement of each of the sub-blocks, and 2) computing the error of Equation (3.14). For motion estimation, we used a spiral search method using an area of 16 x 16. This requires approximately 197000 operations. If the search for the displacement of each of the sub-blocks is restricted to the set of candidate motion vectors as was explained, the required number of operations will reduce to 6100. The number of operations required to find the total error of Equation (3.14) for four blocks is approximately 380.

Clearly, the computational load of our error concealment method is quite reasonable. Our simulation experiments confirm that the run time of our method is, indeed, acceptable. In fact, real-time decoding (e.g., 10 frames/sec for QCIF video sequences) is still possible on a Pentium 300 MHz PC.

3.6 Summary

In this chapter, reconstruction of missing texture in transmission of coded visual data was investigated. A two stage method was introduced for concealment of errors in intra coded texture. In the first stage, if applicable, the effects of the loss of the coded data of a block on the rest of the blocks of the frame are removed. This is a key contribution of the proposed reconstruction algorithm since, unlike previously published algorithms, it considers the dependency between coded blocks. In the second stage, the missing textures corresponding to the missing blocks are reconstructed. We proposed a deterministic and a statistical method for the restoration of missing texture information. The deterministic method achieves a good performance level in the reconstruction of edges. The statistical method uses an adaptive MRF as the image a-priori model. The adaptation enables the estimation procedure to incorporate more information without increasing the order of the MRF. The method proposed for concealment of errors in inter coded texture information also consists of two stages. First, the information in the adjacent frame is used to find an initial estimate of the missing texture information. Next, the spatial error concealment method is applied to the prediction error of the blocks adjacent to the missing blocks in order to find an estimate of the prediction error of the missing block. The result is then combined with the initial estimate of the missing block to find its best approximation.

Chapter 4

Error Concealment: Shape Information

To raise new questions, new possibilities, to regard old problems from a new angle, requires creative imagination and marks real advances in science.

Albert Einstein

4.1 Introduction

As was mentioned in Chapter 2, the new generation of compression standards (e.g., MPEG-4) supports an object-based representation of video by allowing the coding of the shape information of arbitrarily shaped video objects along with the objects' texture and motion information. The concealment of errors in the texture information was discussed in the previous chapter. In this chapter, we present an efficient concealment method for errors in the shape information. This subject has, so far, not been addressed in the literature. As in the previous chapter, we discuss error concealment methods for intra coded and inter coded shape data. For intra coded shape, the spatially adjacent information is used to estimate the missing shape information. An adaptive Markov Random Field (MRF), which is designed for binary shape information, is proposed as the image a-priori model. The proposed image model is used along with a MAP estimator to recover the missing shape information. For inter coded shape information, a reconstruction method similar to the one that was proposed in the previous chapter is devised. It should be noted that concealment of errors in intra coded shape data is more critical than that of the inter coded one, simply because the intra coded shapes are used for prediction and thus the errors in them propagate to the following inter coded shapes. Sections 4.2 and 4.3 present the proposed error concealment methods for intra coded and inter coded shape information, respectively.

4.2 Concealment of Errors in Intra Coded Shape Information

For reconstruction of missing intra coded shape information, we employ a statistical method. As explained in Chapter 2, statistical methods for error concealment assume that the pixel values in an image or video signal are realizations of an underlying random process. The MAP estimation can then be employed to yield the most likely image given the observed image data and the image model [Z. 104]. Here, we use a MAP estimator for restoring missing shape information. Since the shape information is binary, we propose an appropriate form of Markov Random Field (MRF) as the image model.

As shown in Chapters 2 and 3, the MAP estimation of missing data in an image, assuming an MRF as the image a-priori model and the conditional probability given by Equation 2.3, can be expressed by the following minimization problem

\[ \hat{x} = \arg\min_{x} \sum_{(i,j) \in M} V_c(x_{i,j}) \]

where $M$ is the set of all missing pixels in the image and $V$ is the potential function. The potential function characterizes the relationship between a group of pixels by assigning larger costs to configurations of pixels which are less likely to occur. The choice of the potential function is crucial to the performance of the image model. Commonly, the potential functions are selected to be of the form

\[ V_c(x_{i,j}) = \sum_{(k,l) \in c} w_{i,j,k,l} \, \rho(x_{i,j} - x_{k,l}) \qquad (4.1) \]

where $c$ is a clique, $\rho$ is a function called the cost function, and $w_{i,j,k,l}$ is the weight assigned to the difference between the pixel values $x_{i,j}$ and $x_{k,l}$ [W]. The cost function $\rho(\cdot)$, in fact, encourages the pixels that are spatially close to each other to have the same value¹. The shape information is binary and $x_{i,j}$ can only assume one of two values. Therefore, we select the cost function with the following form

\[ \rho(x) = \begin{cases} 0, & x = 0 \\ \beta, & \text{otherwise} \end{cases} \]

where $\beta$ is a positive constant.

¹ Refer to Chapters 2 and 3 for a complete discussion of MRF.

Figure 4.1: A pixel, its clique $c$ and the eight directions. The complement of the clique, $c'$, is the dark area.

For the clique, we adopt an eight pixel neighborhood shown in Figure 4.1. The weight corresponding to the difference between a pixel and one of the pixels in its clique ($w_{i,j,k,l}$ in Equation (4.1)) is selected adaptively, based on the likelihood of an edge in the direction of the subject pair of pixels. The rationale behind this selection is to give more weight to the difference between the pixels in a direction which will cause the values of the pixels in that direction to be the same. Assuming the values $w_{i,j,k,l}$ to be integers, the estimated value of a pixel given the values of the pixels around it (in the clique and its complement), will be

where $c'$ is the complement of the clique shown in Figure 4.1. Each of the terms $\rho(x_{i,j} - x_{i+l,j+m})$ in the right hand side of Equation (4.2) will contribute $\beta$ to the whole summation if the estimated value of the missing pixel ($x_{i,j}$) is different from the corresponding neighboring pixel, and will contribute zero to the whole summation if the estimated value of the missing pixel is equal to the value of the corresponding neighboring pixel. Therefore, to minimize the value of the right hand side of Equation (4.2), the number of terms with an estimated value that is different from that of its neighboring pixel should be minimized. Thus, the estimated value should be equal to the median of the following vector

The likelihood of edges in each of the eight directions (shown in Figure 4.1) is computed using blocks around the missing block. In this way, the available shape information in a larger area is exploited in the concealment process. To assess the likelihood of edges in each of the eight directions, edges in the blocks surrounding the missing block, whose directions imply that they pass through the missing block, are determined.

Since the edge information of the shape data is embedded in its borders, we first separate the borders of the shape data in the adjacent blocks. To do this, we use a morphological transform called the boundary transform [16]. If all the four neighboring pixels (above, below, left and right) of a pixel are inside the shape, then the pixel is declared inside the shape. Otherwise, the pixel belongs to the border of the shape. After finding the boundary of the shape using the above described transform, a 3 x 3 window is centered at each border pixel and the angle of the best line-fit to the border pixels in the window is computed. This in fact gives the direction of the edge at the pixel in the center of the window. Figure 4.2 shows a typical window and the best line-fit and the angle of the line.

Figure 4.2: A 3 x 3 pixel window, the border pixels in it (shaded) and the best line-fit.

There are eight counters corresponding to eight directions shown in Figure 4.1. The counter corresponding to the direction of the detected edge (best line-fit) is incremented if the extension of the best line-fit passes through the missing block. The procedure is repeated for all the border pixels in the blocks to the right, left, below and above the missing block (if applicable). The weights required in Equation (4.3) are then obtained by

where $c_m$ is the counter corresponding to direction $m$, and the direction $m$ corresponds to the direction formed by $(i, j)$ and $(k, l)$. Finally, the proposed shape error concealment method can be summarized as follows:

1. Determine the edges in the neighboring blocks using the border transform and the best line-fit and assign them to eight equally spaced directions. Compute the corresponding counter for each direction.

2. Use Equation (4.4) to find a set of weights for each missing block,

3. Use Equation (4.3) to obtain an estimate of each missing pixel employing the weights obtained in the previous step, and

4. Iteratively re-estimate the missing pixels using Equation (4.3) until convergence.
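Steps 3 and 4 can be sketched as follows, assuming that Equation (4.3) amounts to a weighted median in which each neighboring pixel is repeated as many times as its integer weight. For binary data, such a median is a weighted majority vote. The names `NEIGHBORS`, `weighted_binary_median` and `conceal_block`, the ordering of the eight offsets, the tie-breaking rule and the iteration cap are all hypothetical choices made for this sketch.

```python
# Offsets of the eight neighbors of a pixel (ordering is an assumption).
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
             (0, 1), (1, -1), (1, 0), (1, 1)]


def weighted_binary_median(values, weights):
    """Median of a binary vector where values[k] is repeated weights[k] times."""
    ones = sum(w for v, w in zip(values, weights) if v == 1)
    zeros = sum(w for v, w in zip(values, weights) if v == 0)
    return 1 if ones > zeros else 0  # ties go to 0 (assumption)


def conceal_block(shape, missing, weights, max_iter=50):
    """Iteratively re-estimate the missing pixels of a binary shape.

    shape   -- list of rows of 0/1 values (not modified)
    missing -- set of (row, col) positions to be estimated
    weights -- eight non-negative integers, one per entry of NEIGHBORS
    """
    rows, cols = len(shape), len(shape[0])
    current = [row[:] for row in shape]
    for _ in range(max_iter):
        changed = False
        for i, j in sorted(missing):
            values, used_weights = [], []
            for (di, dj), w in zip(NEIGHBORS, weights):
                r, c = i + di, j + dj
                if 0 <= r < rows and 0 <= c < cols:
                    values.append(current[r][c])
                    used_weights.append(w)
            estimate = weighted_binary_median(values, used_weights)
            if estimate != current[i][j]:
                current[i][j] = estimate
                changed = True
        if not changed:  # no pixel changed: treated as convergence
            break
    return current
```

Each update only counts weighted ones and zeros, so an iteration over a block is inexpensive.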

In the case where adjacent blocks are lost, the reconstruction algorithm is applied recursively. Blocks with the maximum number of correctly decoded neighboring blocks are reconstructed first, and the rest of the blocks are reconstructed recursively. This guarantees that the best possible estimation accuracy, when missing blocks are restored one by one, is achieved.
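The ordering rule described above can be sketched as follows. This is a minimal illustration that assumes a four-block neighborhood (above, below, left and right), that a restored block counts as available for the blocks restored after it, and that ties are broken in raster order. None of these details, nor the name `reconstruction_order`, is stated in the text.

```python
def reconstruction_order(missing, n_rows, n_cols):
    """Return the missing blocks in the order in which they are restored.

    missing -- iterable of (row, col) block positions that were lost
    n_rows, n_cols -- size of the frame in blocks
    """
    missing = set(missing)
    available = {(r, c) for r in range(n_rows) for c in range(n_cols)} - missing
    order = []

    def n_available(block):
        # Number of four-connected neighbors that are currently available.
        r, c = block
        return sum((r + dr, c + dc) in available
                   for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)))

    while missing:
        # max() keeps the first maximum, so sorting gives raster-order ties.
        best = max(sorted(missing), key=n_available)
        order.append(best)
        missing.remove(best)
        available.add(best)  # restored blocks help their neighbors
    return order
```

For a lost row of blocks, this order restores the row from one end to the other, each block using its already restored neighbor.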

The computational load of the proposed method consists of those computations required in finding the boundary pixels and the slope of the best line-fit for each of the boundary pixels, and estimation of missing pixels using Equation (4.3). The first two values depend on the complexity of the shape in the blocks adjacent to a missing block. The estimation of missing pixels using Equation (4.3) is an iterative procedure which, for each iteration, requires counting the number of 1's and 0's, which can be performed very fast. On average, 6 iterations are required for the algorithm to converge for a block. Clearly, the computational load of our error concealment method is quite reasonable. Our simulation experiments confirm this fact.
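The boundary-pixel search mentioned above is a simple neighborhood test, sketched below. The sketch assumes that shape pixels have the value 1, that only shape pixels can be border pixels, and that positions outside the array count as outside the shape. The name `boundary_pixels` is hypothetical.

```python
def boundary_pixels(shape):
    """Return the set of (row, col) border pixels of a binary shape.

    A shape pixel is interior if its four neighbors (above, below, left,
    right) are all inside the shape; otherwise it is a border pixel.
    """
    rows, cols = len(shape), len(shape[0])
    border = set()
    for i in range(rows):
        for j in range(cols):
            if shape[i][j] != 1:
                continue  # background pixels are not part of the border
            neighbors = ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
            interior = all(
                0 <= r < rows and 0 <= c < cols and shape[r][c] == 1
                for r, c in neighbors
            )
            if not interior:
                border.add((i, j))
    return border
```

The cost is at most four comparisons per pixel, which is consistent with the low computational load noted above.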

4.3 Concealment of Errors in Inter Coded Shape Information

A simple concealment method for inter coded shape data is to replace the shape of the missing blocks with those of blocks corresponding to the same location in the previous frame. An alternative method is to estimate the missing motion vectors from the motion vectors of surrounding macroblocks at the decoder and use the estimated vectors to replace the missing shape from the previous frame. Yet another method is to perform a motion estimation at the decoder similar to the method that was proposed in Chapter 3 for texture information. However, our experiments showed that for shape information a motion estimation at the decoder, similar to what is done for texture, is unlikely to improve the quality of reconstruction. In fact, the motion vector of the macroblock above a lost macroblock provides a good estimate of the missing motion vector and can be employed for motion compensated concealment of missing shape information. This is because motion within an object is regular and therefore the motion vectors of available blocks belonging to an object are good estimates of the motion of a missing block. This is in agreement with what has been observed in [109].
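Motion compensated concealment with the motion vector of the macroblock above can be sketched as follows. The sign convention of the motion vector (a displacement into the previous frame), the clamping at the frame border and the name `conceal_shape_block` are assumptions of this sketch.

```python
def conceal_shape_block(prev_shape, top, left, mv_above, size=16):
    """Copy a size x size binary block from the previous frame.

    prev_shape -- previous frame's binary shape as a list of rows
    top, left  -- position of the missing block in the current frame
    mv_above   -- (dx, dy) motion vector of the macroblock above, in pixels
    """
    rows, cols = len(prev_shape), len(prev_shape[0])
    dx, dy = mv_above
    block = []
    for i in range(size):
        row = []
        for j in range(size):
            # Clamp the displaced position to the frame (assumption).
            r = min(max(top + i + dy, 0), rows - 1)
            c = min(max(left + j + dx, 0), cols - 1)
            row.append(prev_shape[r][c])
        block.append(row)
    return block
```

With a zero motion vector this reduces to the simple replacement method described at the start of the section.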

4.4 Summary

In this chapter, methods for concealing errors in intra and inter coded shape information were introduced. The proposed method for intra coded shape data employs a MAP estimator along with an MRF as the image a priori model. The MRF is designed for the binary shape information, and its parameters are adapted based on the information embedded in the neighboring blocks. Motion compensated concealment is used for inter coded shape data.

Chapter 5

Spatial and Temporal Error

Concealment: Performance

Creativity involves breaking out of established patterns in order to look at things in a different way.

Edward De Bono

In this chapter, we examine the performance of the error concealment methods developed in Chapters 3 and 4. In Section 5.1, we evaluate the performance of the proposed method for reconstruction of intra coded texture in baseline JPEG coded images. Since there is dependency between coded blocks in baseline JPEG coded images, both stages shown in Figure 3.1 should be employed to reconstruct the missing parts of the image. The stages are 1) removing the effects of loss of a block on the rest of the image and 2) the reconstruction of the missing block. For the second stage, we employ both the deterministic and the statistical methods and compare their performances. In Section 5.2, we study the performance of the proposed error concealment method on H.263 coded video sequences. For intra coded blocks, we use the statistical method presented in Section 3.4. Concealment of inter coded blocks consists of two stages. In the first stage, we use the information in the previous frame to obtain initial estimates of the missing blocks using the method discussed in Section 3.5. In the second stage, a MAP estimator, which employs an adaptive MRF as the image a-priori model, is used for refinement of the initial estimates. In Section 5.3, we first review various scenarios that can occur when decoding an error-corrupted MPEG-4 video bit stream. Obviously, the methods discussed in Chapter 3 and studied in Sections 5.1 and 5.2 can be employed to reconstruct the missing texture information in MPEG-4 coded data. Therefore, we focus on the missing shape data and show the results of our method for reconstruction of shape information on MPEG-4 video coded data. In each of the cases (JPEG, H.263 and MPEG-4), we compare the performance levels of our methods with those of methods proposed in the literature. It should be added that the methods developed in Chapters 3 and 4 are applicable to other compression standards (e.g., MPEG-2) as well. The main reason for considering H.263 and MPEG-4 is the application of these standards in communication over low bit rate unreliable channels.

5.1 Error concealment of JPEG coded images

We here evaluate, in terms of reproduction quality, the reconstruction method presented in Sections 3.2, 3.3 and 3.4. This method consists of estimation of the DC values of the missing blocks, removal of the stripes, and reconstruction of the missing pixel values (employing the deterministic and the statistical methods). It should be added that in the case where adjacent blocks are lost, our proposed reconstruction method is applied recursively, starting with blocks with the maximum number of correctly decoded neighbors.

Figure 5.1 shows the 512 x 512, 8 bits/pixel images LENA and PEPPER coded and decoded using baseline JPEG. Figure 5.2 shows the image LENA with 10% missing blocks and the image PEPPER with 3% missing blocks. The missing blocks are colored in white so that their positions are clearly shown. To decode the image data with missing blocks, the DC value of each missing block is estimated using the method developed in Section 3.2 and the missing blocks are initially reconstructed using their estimated DC values. The results are shown in Figure 5.3. The stripes appearing in the figure are due to errors in DC value estimation. Figure 5.4 shows the images of Figure 5.3 after applying the stripe removal algorithm and before the final reconstruction stage. Clearly, most of the error effects in estimating the DC values are removed.

Figure 5.1: Original images: (a) Lena; (b) Pepper.

Figure 5.2: Images: (a) Lena with 10% missing data; (b) Pepper with 3% missing data. The missing blocks are shown in white to show their positions clearly.

Figure 5.3: Decoded images: (a) Lena with 10% missing data; (b) Pepper with 3% missing data. The value of each pixel of the missing blocks is replaced with the estimated DC value of that block.

Figure 5.4: Pictures after removing the stripes: (a) Lena; (b) Pepper.

After removing the stripes, the texture of missing blocks should be reconstructed. Figure 5.5 shows the images LENA and PEPPER after reconstructing the data of missing blocks using piecewise constant weights (Equation (3.4)). Figure 5.6 shows the same images after reconstructing the data of missing blocks using linear ramp weights (Equation (3.6)). We also applied our statistical method for reconstruction of missing texture data to the image shown in Figure 5.3(a). Figure 5.7 shows the reconstruction result. Moreover, we compare the results of our method (based on Equation (3.4)) for reconstructing the data of the missing blocks with the ones proposed in [67] and [62] in terms of visual quality. Figure 5.8 shows the image LENA after reconstructing the data of missing blocks using the method proposed in [67]. Figure 5.9 shows the reconstruction result using the method proposed in [62] for the image LENA. Comparing Figures 5.5 and 5.9, it is clear that the visual quality of the results of the method proposed in [62] is very close to that of our method. However, our method performs better in the restoration of diagonal edges, and this can be seen in the restored blocks around the edge of the hat. As was discussed in Section 3.3.1, the computational complexity of our method is also less than that of [62]. Finally, Figure 5.10 shows the reconstruction result obtained using diagonally adjacent blocks (besides the horizontal and vertical adjacent blocks). The improvement in the reconstruction of diagonal edges is small, and it comes at a cost in computational complexity.

Figure 5.11 shows a magnified picture of an area of Figure 5.5(a). Figure 5.12 shows a magnified picture of the same area of Figure 5.11 for Figure 5.6. Figures 5.14, 5.15 and 5.16 show magnified pictures of the same area of Figures 5.11 and 5.12 for Figures 5.8, 5.9 and 5.10, respectively. By comparing Figures 5.14 and 5.15 with Figures 5.11 and 5.12, it is clear that our method (with both piecewise constant and linear ramp weights) improves the quality of the reconstruction of edges. Moreover, comparing Figures 5.11 and 5.12, it becomes clear that using weights that vary as linear ramps reduces the blockiness of the output image but smoothes the edges.

Figure 5.5: Reconstructed images after removing the stripes and restoration of missing blocks using the method based on Equation (3.4) (piecewise constant weights).

Figure 5.6: Reconstructed images after removing the stripes and restoration of missing blocks using the method based on Equation (3.6) (linear ramp weights).

Figure 5.7: Reconstructed image using our proposed statistical method.

Figure 5.8: Reconstructed images using the method proposed in [67].

Figure 5.9: Reconstructed image using the method proposed in [62].

Figure 5.10: Reconstructed image when diagonally adjacent blocks are used in addition to horizontally and vertically adjacent blocks.

Figure 5.11: Magnified picture of part of the image LENA shown in Figure 5.5.

Figure 5.12: Magnified picture of the same area shown in Figure 5.11 for the image LENA of Figure 5.6.

Figure 5.13: Magnified picture of the same area shown in Figure 5.11 for the image LENA of Figure 5.7.

Figure 5.14: Magnified picture of the same area shown in Figure 5.11 for the image LENA of Figure 5.8.

Figure 5.15: Magnified picture of the same area shown in Figure 5.11 for the image LENA of Figure 5.9.

Figure 5.16: Magnified picture of the same area shown in Figure 5.11 for the image LENA of Figure 5.10.

Table 5.1: PSNR results (in dB) for the 512 x 512 LENA and PEPPER images. The numbers are for the images after removing the stripes and reconstructing the missing data.

Table 5.1 provides the PSNRs for the images LENA and PEPPER obtained using our method (based on Equation (3.4)), as well as the PSNRs of the correctly JPEG decoded images, for several different block loss rates. Figures 5.5 and 5.6 and Table 5.1 demonstrate that our method performs quite satisfactorily, both subjectively and objectively.
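For reference, the objective measure used in these comparisons can be computed as sketched below. The sketch assumes the usual definition of PSNR with a peak value of 255 for 8 bits/pixel images; the function name `psnr` is chosen for illustration.

```python
import math


def psnr(original, restored, peak=255):
    """Peak signal-to-noise ratio (in dB) between two equally sized images.

    original, restored -- images given as lists of rows of pixel values
    peak               -- maximum pixel value (255 assumed for 8-bit data)
    """
    total = 0
    count = 0
    for row_o, row_r in zip(original, restored):
        for o, r in zip(row_o, row_r):
            total += (o - r) ** 2
            count += 1
    mse = total / count  # mean squared error
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

Because the error is averaged over the whole image, a small fraction of poorly restored blocks can still lower the value noticeably.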

Moreover, we compare the results of our deterministic method (based on Equations (3.4) and (3.6)) and our statistical method for reconstructing the data of the missing blocks with the ones proposed in [67], [77] and [62] in terms of PSNR. Table 5.2 compares the PSNR values for the 512 x 512 LENA image with about 10% missing blocks for various restoration methods. The numbers are for the image after removing the stripes and reconstructing the missing data. As can be seen from the table, the method proposed in [77] slightly outperforms our deterministic method in terms of PSNR. However, our deterministic method compares favorably to that of [77] in terms of computational complexity.

Loss Percentage    LENA    PEPPER
0%                 35.7    34.7
1%                 29.3    25.9
3%                 28.6    23.4
10%                37.3    22.0

Table 5.2: PSNR results (in dB) for the 512 x 512 LENA image for various reconstruction methods. The numbers are for the image after removing the stripes and reconstructing the missing data.

Method                                  PSNR (dB)
Proposed in [62]                        26.50
Proposed in [67]                        26.64
Proposed in [77]                        27.63
Using diagonally adjacent blocks        26.13
Our proposed based on Equation 3.4      27.46
Our proposed based on Equation 3.6      26.10
Our proposed statistical method         27.85

5.2 Error concealment of H.263 coded video sequences

To study the performance of the proposed methods in H.263 coded video sequences, QCIF (176 x 144 pixels/frame) video sequences at a temporal resolution of 10 frames per second are coded at 64 kbps and then decoded in the presence of slice/GOB errors. The size of the blocks is 16 x 16, and that of the sub-blocks is 8 x 8.

As explained in Chapter 2, to make the compressed data more error resilient, most of the standard-compliant video compression systems partition a video frame into Groups of Blocks (GOBs) or "slices", which are coded independently. Therefore, the output bit stream usually consists of segments separated with markers, where each segment corresponds to the coded data of the blocks in a GOB or a slice. When channel errors occur, the decoder usually discards the erroneous data between the two markers surrounding the erroneous data, effectively discarding the GOB or the slice. Then, loss of data of a slice does not affect the rest of the compressed video sequence. Here, it is assumed that the frames of the coded video sequence are partitioned into GOBs or slices, and thus, the missing data belongs to blocks of a GOB or a slice. Moreover, it is assumed that the decoder knows the locations of the missing blocks. This information (e.g., the checksum information) can be obtained from the network or it can be inferred, for example, by detecting the semantic or syntactic violations as a result of errors [59, 110].

To simulate the channel errors, the following tasks are performed. Coded video information is first grouped into packets, where each video packet consists of the coded data of a GOB or a slice. The video packets are then multiplexed with audio information according to the H.223 standard. We used an H.223 multiplexing simulator which receives video packets, simulates audio traffic, applies errors to the multiplexed bit stream according to an error pattern stored in a file, and outputs the packets [111]. The error pattern that we employed corresponds to a wireless channel [112]. Burst errors will most likely not corrupt two consecutive video packets, since audio packets are inserted between them. The erroneous bit stream is decoded such that the effect of errors will appear as missing slices/GOBs.

For inter coded blocks, our error concealment consists of 1) obtaining an initial estimate of the missing block using information from the previous frame and 2) estimating the prediction error using the adaptive MRF model. We compare the performance of the above method with two other methods: 1) a replacement method, where a missing block is replaced with the same block in the previous frame, and 2) a median method, where the motion vector of a missing block is set to the median of the motion vectors of blocks to the left, above and above-right of it, and the estimated motion vector is used to obtain a motion compensation block which serves as the concealment of the missing block. Here, we consider only the case of missing GOBs (i.e., it is assumed that each packet contains a GOB).
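The motion vector estimate of the median method can be sketched as follows. The component-wise median, the handling of unavailable neighbors (passed as `None`) and the fall-back to the zero vector are assumptions of this sketch, as is the name `median_motion_vector`.

```python
from statistics import median


def median_motion_vector(left, above, above_right):
    """Estimate a missing motion vector from three neighboring blocks.

    Each argument is a (dx, dy) tuple, or None if that neighbor is not
    available (for example, when the whole GOB is missing).
    """
    available = [mv for mv in (left, above, above_right) if mv is not None]
    if not available:
        return (0, 0)  # fall back to the zero vector (assumption)
    # Component-wise median over the available neighbors (assumption).
    return (median(mv[0] for mv in available),
            median(mv[1] for mv in available))
```

When a whole GOB is missing, the block to the left is itself lost, so in that case only the neighbors in the row above contribute.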

Shown in Figure 5.17 are the concealment PSNR values for the three methods mentioned above for the video sequence FOREMAN with a 10% packet (GOB) loss rate. As can be seen, for video sequences with a large amount of motion like FOREMAN, the performance of the proposed method and that of the median method are close to each other and both are better than the replacement method. Although the performance of the above methods is comparable in terms of PSNR, the video sequences obtained using the proposed method are more visually appealing than the ones obtained using the other two methods. Figures 5.18(a), 5.18(b), and 5.18(c) show the reconstructed video frame (inter coded) of the video sequence FOREMAN. As can be seen in the images, the replacement method does not generate good results, especially around the nose area. Moreover, the concealment result of the proposed method is clearly better than that of the other two methods.

Figure 5.17: PSNR values for image sequence FOREMAN with 10% GOB missing for different concealment methods.

Figure 5.18: An inter coded frame of the sequence FOREMAN (frame 22) concealed by the (a) replacement, (b) median, and (c) proposed methods.

For intra coded blocks, we compare the performance of our method (detailed in Section 3.4) with that of four other efficient methods: 1) a MAP estimator using GMRF as the a-priori model where each missing pixel is basically set to the average of the pixels around it [El. 2) a suboptimal version of the MAP estimator using the HMRF model where a missing pixel is set to the median of pixels around it [104], 3) a deterministic method proposed in [67] where each pixel in a damaged block is interpolated from the corresponding pixels in its four neighboring blocks such that the total squared border error is minimized, and 4) a deterministic method proposed in [62] where the missing data is restored such that the variation between adjacent pixels within the damaged block and their spatially neighboring pixels in adjacent blocks is minimized. Since, for H.263 coded video, the size of the missing blocks is comparable to the size of the image, the deterministic method proposed in Chapter 3 will not perform acceptably. This is mainly because the pixels in the area around the missing block will be weakly correlated to the pixels in the damaged region.

In this experiment, we employ different video sequences and different packetization schemes. In the first set of simulation experiments, we assign a slice, which consists of one block, to a packet. Figure 5.19(a) shows a frame of the image sequence FOREMAN encoded and decoded using an H.263 compliant coder. Figures 5.19(b), 5.19(c), 5.19(d), 5.19(e), 5.19(f), and 5.19(g) show (respectively) the same frame 1) missing approximately 20% of the packets (blocks), 2) reconstructed using the non-adaptive GMRF model, 3) reconstructed using the suboptimal HMRF model, 4) reconstructed using the method proposed in [67], 5) reconstructed using the method proposed in [62], and 6) reconstructed using our proposed adaptive MRF error concealment method. By comparing the figures, the superior performance of our method becomes obvious. This performance advantage is demonstrated in the areas of the frame that correspond to adjacent missing blocks. Moreover, blocks that contain edges are reconstructed very well by our method.

Figure 5.19: A frame from the video sequence FOREMAN, (a) original, (b) missing blocks, reconstructed using (c) a GMRF model, (d) a suboptimal HMRF model, (e) the method proposed in [67], (f) the method proposed in [62], and (g) our adaptive MRF model.

Table 5.3: PSNR comparison of different methods for the video sequence FOREMAN.

Method            GMRF    HMRF    Method proposed in [67]    Method proposed in [62]    Adaptive MRF
No loss           32.1    32.1    32.1                       32.1                       32.1
20% Block loss    26.3    25.4    23.2                       25.2                       28.6
GOB missing       25.3

In the next set of simulation experiments, we assign the coded data of a GOB, which consists of 11 blocks, to each of the video packets. Figure 5.20(a) shows a frame of the video sequence FOREMAN with two missing packets (GOBs), approximately 18% loss rate. Figures 5.20(b) and 5.20(c) show the error concealment result obtained using the GMRF model and suboptimal HMRF models, respectively. Figures 5.20(d) and 5.20(e) show the result obtained using the methods proposed in [67] and [62], respectively. Figure 5.20(f) shows the result obtained using our adaptive MRF model. Clearly, our proposed method performs best in reconstruction quality, particularly in retrieving the edges.

For a quantitative evaluation, Table 5.3 provides the PSNR values of the above concealment methods (for both packetization cases) for the video sequence FOREMAN. The table demonstrates that our method outperforms the other methods by at least 2 dB. The concealment results for another video sequence (AKIYO) with a different loss pattern are also provided in Figures 5.21 and 5.22. The PSNR values are quoted in Table 5.4. It is clear from the shown reconstructed video frames and the PSNR values that the proposed method has a clear performance advantage in concealing the effects of errors.

Figure 5.20: A frame from the video sequence FOREMAN (a) with missing GOBs, reconstructed using (b) a GMRF model, (c) a suboptimal HMRF model, (d) the method proposed in [67], (e) the method proposed in [62], and (f) our adaptive MRF model.

Figure 5.21: A frame from the video sequence AKIYO (a) original, (b) with missing blocks, reconstructed using (c) a GMRF model, (d) a suboptimal HMRF model, (e) the method proposed in [67], (f) the method proposed in [62], and (g) our adaptive MRF model.

Figure 5.22: A frame from the video sequence AKIYO (a) with missing GOBs, reconstructed using (b) a GMRF model, (c) a suboptimal HMRF model, (d) the method proposed in [67], (e) the method proposed in [62], and (f) our adaptive MRF model.

Table 5.4: PSNR comparison of different methods for the video sequence AKIYO.

Method            GMRF     HMRF     Method proposed in [67]    Method proposed in [62]    Adaptive MRF
No loss           33.52    33.52    33.52                      33.52                      33.52
20% Block loss    27.03    27.07    24.4                       16.3                       28.53
GOB missing       26.54    25.79    22.9                       24.3                       26.95

5.3 Error concealment of MPEG-4 coded video sequences

Here, we discuss the various scenarios that can occur when decoding an erroneous MPEG-4 coded video bit stream. It is assumed that the error resilience tools of MPEG-4 (mentioned in Chapter 2), more specifically video packets and data partitioning, are applied during the encoding. We consider the intra-video object planes (I-VOPs) and inter-video object planes (P-VOPs) separately.

Figure 5.23: Video packet structure of MPEG-4 in (a) I-VOPs and (b) P-VOPs.

In the P-VOPs, the shape and motion data in a packet are separated from the texture data with a motion marker, as seen in Figure 5.23. If an error occurs in the texture part of a video packet, the decoder can use the motion information to replace the missing texture with the texture in the previous VOP. If the error occurs in the motion/shape part, then the whole video packet is discarded. A simple concealment can be done by replacing the shape and texture of the missing macroblocks with those of the macroblocks at the same location in the previous VOP. An alternative method is to estimate the missing motion vectors at the decoder from the motion vectors of surrounding macroblocks and use the estimated vectors to replace the missing texture and shape from the previous VOP [109], or to use the method proposed in Section 3.5 [113, 114].
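The motion-vector-based alternative described above can be sketched as follows. This is a minimal illustration that assumes a component-wise median of the neighboring vectors as the estimator, which is a common choice but not necessarily the rule used in [109] or in Section 3.5. The function names and the nested-list frame layout are hypothetical.

```python
from statistics import median

def estimate_motion_vector(neighbor_mvs):
    """Component-wise median of the available neighboring motion vectors.

    neighbor_mvs is a list of (dx, dy) pairs; an assumed estimator.
    """
    if not neighbor_mvs:
        return (0, 0)  # no neighbors received: fall back to zero motion
    dx = median(mv[0] for mv in neighbor_mvs)
    dy = median(mv[1] for mv in neighbor_mvs)
    return (int(round(dx)), int(round(dy)))

def conceal_block(prev_frame, top, left, size, mv):
    """Copy a size x size block from the previous VOP, displaced by mv.

    Source coordinates are clamped to the frame borders.
    """
    h, w = len(prev_frame), len(prev_frame[0])
    dx, dy = mv
    block = []
    for r in range(size):
        src_r = min(max(top + r + dy, 0), h - 1)
        row = []
        for c in range(size):
            src_c = min(max(left + c + dx, 0), w - 1)
            row.append(prev_frame[src_r][src_c])
        block.append(row)
    return block
```

With `mv = (0, 0)` the second function reduces to the simple concealment mentioned first, in which the co-located macroblock of the previous VOP is copied.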

We tested our proposed shape reconstruction method on various corrupted MPEG-4 bit streams with various packet sizes, which contain video objects coded at 5000 bits/VOP. Error simulation is performed at the decoder: while decoding the bit stream, the decoder randomly ignores video packets with a given packet loss percentage. Then the proposed concealment algorithm is applied to the erroneous I-VOP binary shape data to recover the missing shape blocks.
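The decoder-side error simulation can be illustrated with a short sketch. It assumes that each video packet is dropped independently with the given loss probability, which is one plausible reading of "randomly with a given packet loss percentage"; the function names, the `seed` parameter, and the list-based representation are hypothetical.

```python
import random

def simulate_packet_loss(num_packets, loss_rate, seed=0):
    """Return one flag per packet: True if received, False if ignored.

    Packets are dropped independently with probability loss_rate (assumption).
    """
    rng = random.Random(seed)
    return [rng.random() >= loss_rate for _ in range(num_packets)]

def lost_macroblocks(packet_sizes, received):
    """Indices of macroblocks carried by ignored packets.

    packet_sizes[i] is the number of macroblocks in packet i, in coding order.
    """
    lost, first = [], 0
    for size, ok in zip(packet_sizes, received):
        if not ok:
            lost.extend(range(first, first + size))
        first += size
    return lost
```

Because packets carry different numbers of macroblocks, the fraction of lost shape blocks generally differs from the fraction of lost packets.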

The similarity of the erroneous and error-concealed shape data to the original shape data is measured using the following value:

$$\eta = \left(1 - \frac{n_d}{n_t}\right) \times 100\%$$

where $n_d$ is the number of pixels that are different between the restored shape and the original one, and $n_t$ is the total number of pixels in the bounding box. $n_d$ is in fact the Hamming distance between the restored shape and the original one.
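As a small worked illustration of this measure, the sketch below counts differing pixels (the Hamming distance) between two binary masks and converts the count to a percentage of agreeing pixels. It assumes the measure is 100 (1 - n_d / n_t), consistent with the definitions of n_d and n_t above; the function name and the nested-list mask layout are hypothetical.

```python
def shape_similarity(restored, original):
    """Percentage of bounding-box pixels on which two binary masks agree."""
    n_t = 0  # total number of pixels in the bounding box
    n_d = 0  # number of differing pixels (Hamming distance)
    for row_r, row_o in zip(restored, original):
        for a, b in zip(row_r, row_o):
            n_t += 1
            if a != b:
                n_d += 1
    return 100.0 * (1.0 - n_d / n_t)
```

For example, two 2 x 4 masks that differ in a single pixel give a similarity of 87.5%.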

Figure 5.24: The shape of the first I-VOP of the AKIYO video object.

Figure 5.25: The shape of the AKIYO video object missing 30% of the shape blocks.

Figure 5.26: The reconstructed shape of the first I-VOP of the AKIYO video object.

Figure 5.24 shows the binary shape data of the first I-VOP of the AKIYO video object. The bounding box of the VOP is of size 272 x 208. The length of each video packet is selected to be 500 bits. The number of blocks in each video packet is different and depends on the size of the coded data of each block. For example, in this case, the first ten packets have 9, 14, 3, 14, 2, 15, 2, 2, 13, and 3 macroblocks, respectively. Figure 5.25 shows the shape of the I-VOP with 30% of the shape blocks missing, corresponding to 6 missing video packets. The result of the proposed error concealment method is presented in Figure 5.26. The shape similarity measure η is 88% for the erroneous shape data shown in Figure 5.25, and 99% after using the proposed shape concealment method.
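The reason the macroblock count varies from packet to packet can be seen in a small sketch. It assumes a greedy rule in which a packet is closed as soon as its accumulated coded size reaches the target length; this is an illustrative simplification and not necessarily the exact rule of the encoder used in the experiments. The function name and the per-macroblock bit counts are hypothetical.

```python
def packetize(mb_bits, packet_bits):
    """Group consecutive macroblocks into packets of roughly packet_bits bits.

    mb_bits[i] is the coded size of macroblock i. A packet is closed once its
    accumulated size reaches packet_bits (assumed rule).
    """
    packets, current, used = [], [], 0
    for index, bits in enumerate(mb_bits):
        current.append(index)
        used += bits
        if used >= packet_bits:
            packets.append(current)
            current, used = [], 0
    if current:
        packets.append(current)  # last, possibly short, packet
    return packets
```

Macroblocks that are cheap to code (for example, fully transparent or fully opaque shape blocks) are packed many to a packet, while expensive boundary blocks fill a packet quickly.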

Figure 5.27: The shape of the first I-VOP of the BREAM video object.

Figure 5.28: The shape of the BREAM video object missing 25% of the shape blocks.

Figure 5.27 shows the binary shape data of the first I-VOP of the video object BREAM. The size of the bounding rectangle is 272 x 192. The size of the video packets is selected to be 700 bits. The coded I-VOP contains 44 video packets. Figure 5.28 shows the binary shape information of the I-VOP with 25% of the blocks missing. The shape data after concealment is given in Figure 5.29. The similarity measure η is 86% before, and 99% after, concealment.

Figure 5.29: The reconstructed shape of the first I-VOP of the BREAM video object.

Figure 5.30: The shape of the first I-VOP of the WEATHER video object.

Last, we apply our concealment technique to the shape of the first I-VOP of the WEATHER video object, as presented in Figure 5.30. The size of the VOP is 160 x 224. The size of the video packets is set to 1000 bits. Figure 5.31 shows the shape of the I-VOP when 35% of the macroblocks are missing. Figure 5.32 shows the result of the proposed error concealment method. The value of η is 87% and 99% before and after the concealment of shape information, respectively.

Figure 5.31: The shape of the WEATHER video object missing 35% of the shape blocks.

Figure 5.32: The reconstructed shape of the first I-VOP of the WEATHER video object.

5.4 Summary

In this chapter, we studied the performance of the error concealment methods proposed in Chapters 3 and 4 on baseline JPEG, H.263 and MPEG coded visual data. In each case, we also compared the performance levels of our proposed methods with those of similar methods proposed in the literature. The proposed concealment methods achieve very good computation-performance tradeoffs, as demonstrated via our experimental results.

Chapter 6

Conclusions and Future Work

6.1 Summary of Thesis Contributions

The problem of error control and concealment in visual communication is becoming increasingly important because of the growing interest in image/video delivery over unreliable channels such as wireless networks and the Internet. This thesis has addressed the problem of post-processing for concealment of errors resulting from the transmission of coded visual information in error prone environments. Although most of the proposed methods are applicable to more popular visual coding systems, we have focused on the MPEG-4 framework, where the bit stream consists of a collection of coded texture (DCT coefficients), shape and motion information¹. We have developed error concealment methods for each of these components. For the texture component of coded visual data, we have considered intra- and inter-coded texture. The method introduced for concealment of errors in intra-coded texture, unlike previously published algorithms, considers the dependency between coded blocks in a frame. It uses a deterministic or a statistical technique for the restoration of missing texture information. The deterministic technique achieves a good performance level in the reconstruction of edges. The statistical technique uses an adaptive MRF as the image a-priori model in a MAP estimator. The adaptation enables the estimation procedure to incorporate more information without a dramatic increase in computational complexity. The method proposed for concealment of errors in inter-coded texture uses the information in the adjacent frame to find an initial estimate of the missing texture. If the motion vectors associated with the missing blocks are available, motion compensation is used to provide good estimates. Otherwise, a novel algorithm which preserves image continuity is used to find the initial estimate of the missing texture. The initial estimate is then combined with the estimate of the prediction error of the missing block to find the best approximation of the missing texture information.

Error concealment for shape information has, so far, not been addressed in the literature. We proposed efficient concealment methods for shape information in this thesis. A MAP estimator, which uses an adaptive MRF as the image a-priori model, is used to estimate the missing shape information.

We examined the performance of our error concealment methods on the bit streams of baseline JPEG coded images, H.263 coded video sequences, and MPEG-4 coded video data. We evaluated the performance of our proposed methods by using objective criteria such as PSNR, and by subjective assessment of reconstructed images. Moreover, we compared the performance of our proposed methods with that of other methods available in the literature. The methods presented in this thesis achieve consistently good performance-computation tradeoffs, which makes them suitable for real-time communication over error prone networks. Employing the proposed error concealment methods can lead to acceptable visual quality at loss rates as high as 20%.

¹ The hybrid motion compensated DCT-based coder (e.g., MPEG-2) is a special case of the MPEG-4 one, where the video objects are rectangular in shape.

The main contributions of this thesis can be summarized as follows:

• An error concealment method for texture that considers the dependency between coded blocks.

• A deterministic technique for reconstruction of missing texture information that achieves a good performance level in the reconstruction of edges.

• A statistical technique for reconstruction of missing texture that uses an adaptive MRF as the image a-priori model in a MAP estimator. The adaptation enables the estimation procedure to incorporate more information without a dramatic increase in computational complexity.

• An efficient concealment method for shape information which uses an adaptive MRF, designed for binary shape information, as the image a-priori model.

6.2 Future Research Directions

The problem of error control and concealment in visual communication will continue to be of high importance because the demand for visual communication will continue to grow. Thus, research in error control and concealment algorithms and tools will continue to be an active area. Among the possible research topics to extend and improve the work of this thesis are:

• Derivation of bounds for the estimation error in reconstruction of missing data of image and video signals.

• More advanced interactive error concealment methods utilizing the cooperation between the encoder and the decoder.

• Fast implementation of error concealment methods by using task-split techniques (e.g., pipelines).

• Although more effective error concealment approaches are still needed, more emphasis should be placed on joint design of the encoding algorithm, transport protocol, and post-processing method to minimize the combined distortion due to both compression and transmission.

Bibliography

[1] K. R. Rao and J. J. Hwang, Techniques and Standards for Image, Video and Audio Coding. Englewood Cliffs, NJ: Prentice-Hall, 1996.

[2] ISO/IEC, Digital compression and coding of continuous tone still images: requirements and guidelines. International Standard Organization, 1994.

[3] W. Pennebaker and J. Mitchell, JPEG Still Image Compression Standard. New York: Van Nostrand Reinhold, 1993.

[4] O. Egger, P. Fleury, T. Ebrahimi, and M. Kunt, "High-performance compression of visual information, a tutorial review, part I: Still pictures," Proc. of the IEEE, vol. 87, pp. 976-1011, June 1999.

[5] B. G. Haskell, P. G. Howard, Y. A. Lecun, A. Puri, J. Ostermann, M. R. Civanlar, L. Rabiner, L. Bottou, and P. Haffner, "Image and video coding - emerging standards and beyond," IEEE Trans. on Circuits and Systems for Video Technology, vol. 8, pp. 814-837, Nov. 1998.

[6] T. Ebrahimi and M. Kunt, "Visual data compression for multimedia applications," Proc. of the IEEE, vol. 86, pp. 1109-1125, June 1998.

[7] G. K. Wallace, "The JPEG still picture compression standard," Communications of the ACM, vol. 34, pp. 30-44, Apr. 1991.

[8] J. In, S. Shirani, and F. Kossentini, "On RD optimized progressive image coding using JPEG," IEEE Transactions on Image Processing, vol. 8, pp. 1630-1638, Nov. 1999.

[9] J. In, S. Shirani, and F. Kossentini, "JPEG compliant efficient progressive image coding," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, (Seattle), pp. 2633-2636, 1998.

[10] D. Le Gall, "MPEG: a video compression standard for multimedia applications," Communications of the ACM, vol. 34, pp. 46-58, Apr. 1991.

[11] K. Sayood, Introduction to Data Compression. San Francisco, California: Morgan Kaufmann Publishers, 1996.

[12] V. Bhaskaran and K. Konstantinides, Image and Video Compression Standards: Algorithms and Architectures. Boston: Kluwer Academic Publishers, 1995.

[13] B. Girod, "Rate-constrained motion estimation," in SPIE Proc. Visual Communications and Image Processing, vol. 2308, pp. 1026-1034, 1994.

[14] W. Chung, F. Kossentini, and M. Smith, "An efficient motion estimation technique based on a rate-distortion criterion," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, vol. 4, (Atlanta, GA, USA), pp. 1926-1929, May 1996.

[15] F. Kossentini, Y. Lee, M. Smith, and R. Ward, "Predictive RD-constrained motion estimation for very low bit rate video coding," IEEE Transactions on Selected Areas in Communications, vol. 15, pp. 1752-1763, December 1997.

[16] A. K. Jain, Fundamentals of Digital Image Processing. Englewood Cliffs, New Jersey: Prentice-Hall, 1989.

[17] J. J. N. Jayant, "Signal compression based on models of human perception," Proc. of the IEEE, pp. 1385-1422, Oct. 1993.

[18] A. Segall, "Bit allocation and encoding for vector sources," IEEE Transactions on Information Theory, vol. IT-22, pp. 162-169, Mar. 1976.

[19] Y. Shoham and A. Gersho, "Efficient bit allocation for an arbitrary set of quantizers," IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 36, pp. 1445-1453, Sept. 1988.

[20] T. Wiegand, M. Lightstone, D. Mukherjee, T. Campbell, and S. Mitra, "Rate-Distortion Optimized Mode Selection for Very Low Bit Rate Video Coding and the Emerging H.263 Standard," IEEE Trans. on Circuits and Systems for Video Technology, pp. 182-190, Apr. 1996.

[21] N. Farvardin and J. W. Modestino, "Adaptive buffer-instrumented entropy-coded quantizer performance for memoryless sources," IEEE Transactions on Information Theory, vol. IT-32, pp. 9-22, Jan. 1986.

[22] Y. Shoham and A. Gersho, "Efficient bit allocation for an arbitrary set of quantizers," IEEE Trans. Acoust. Speech Signal Process., vol. ASSP-36, pp. 1445-1453, September 1988.

[23] H. Sun, W. Kwok, M. Chien, and J. Ju, "MPEG coding performance improvement by jointly optimizing coding mode decisions and rate control," IEEE Trans. on Circuits and Systems for Video Technology, vol. 7, pp. 449-458, June 1997.

[24] G. Schuster and A. Katsaggelos, "Fast and efficient mode and quantizer selection in the rate-distortion sense for H.263," in SPIE Proc. Visual Communications and Image Processing, vol. 2727, pp. 784-795, 1996.

[25] Y. Lee, F. Kossentini, and R. Ward, "Efficient MPEG-2 encoding of interlaced video," Canadian Journal of Electrical and Computer Engineering, vol. 23, pp. 61-67, June 1998.

[26] Y. Lee, F. Kossentini, R. Ward, and M. Smith, "Towards MPEG4: An improved H.263-based video coder," Signal Processing: Image Communication, Special Issue on MPEG-4, vol. 10, July 1997.

[27] ISO/IEC, "Information technology - coding of moving pictures and associated audio for digital storage media at up to about 1.5 mbits/s: Video," 11172-2, 1993.

[28] ISO/IEC, "Information technology - generic coding of moving pictures and associated audio information: Video," 13818-2, 1995.

[29] ITU-T, "Video coding for low bit rate communication," Recommendation H.263, 1996.

[30] ITU-T, "Video coding for low bit rate communication," Recommendation H.263 Version 2, 1998.

[31] T. Sikora, "The MPEG-4 video standard verification model," in IEEE Trans. on Circuits and Systems for Video Technology, vol. 7, pp. 19-31, February 1997.

[32] J. Ostermann and A. Puri, "Natural and synthetic video in MPEG-4," in Proc. IEEE Int. Conf. Acoust., Speech, and Signal Processing, vol. 5, pp. 3805-3808, May 1998.

[34] ISO/IEC JTC1/SC29/WG11, "MPEG-4 systems FAQ: version 7.0a," N2521, October 1998.

[36] ISO/IEC JTC1/SC29/WG11, "MPEG-4 requirements, version 9," N2456, October 1998.

[37] MPEG-4 Systems Group, "Coding of audio-visual objects: Systems," ISO/IEC JTC1/SC29/WG11 N2201, May 1998.

[38] ISO/IEC JTC1/SC29/WG11, "Description of MPEG-4," N1410, October 1996.

[39] R. Haralick and L. Shapiro, "Image segmentation techniques," in Computer Vision, Graphics and Image Processing, pp. 100-132, 1985.

[40] A. Katsaggelos, L. P. Kondi, F. Meier, J. Ostermann, and G. Schuster, "MPEG-4 and rate-distortion-based shape-coding techniques," in Proc. of the IEEE, vol. 86, pp. 1029-1051, June 1998.

[41] MPEG-4 Video Group, "Coding of audio-visual objects: Video," ISO/IEC JTC1/SC29/WG11 N2202, March 1998.

[42] MPEG-4 Video Group, "MPEG-4 video verification model version 8.0," ISO/IEC JTC1/SC29/WG11 N1796, July 1997.

[43] Y. Wang and Q. F. Zhu, "Error control and concealment for video communication: A review," Proceedings of the IEEE, pp. 974-997, May 1998.

[44] N. Farber, B. Girod, and J. Villasenor, "Extension of ITU-T recommendation H.263 for error-resilient video transmission," IEEE Communication Magazine, vol. 36, pp. 120-128, June 1998.

[45] L. H. Kieu and K. N. Ngan, "Cell-loss concealment techniques for layered video codecs in an ATM network," IEEE Trans. on Image Processing, vol. 3, pp. 666-677, Sept. 1994.

[46] R. Aravind, R. Civanlar, and A. R. Reibman, "Packet loss resilience of MPEG-2 scalable video coding algorithms," IEEE Trans. on Circuits and Systems for Video Technology, vol. 6, pp. 570-580, Oct. 1996.

[47] M. T. Orchard, Y. Wang, V. Vaishampayan, and A. R. Reibman, "Redundancy rate-distortion analysis of multiple description coding using pairwise correlating transforms," in Proceedings of International Conference on Image Processing, (Santa Barbara), pp. 608-611, Oct. 1997.

[48] Y. Wang, M. T. Orchard, and A. R. Reibman, "Multiple description image coding for noisy channels by pairing transform coefficients," Proc. IEEE 1st Workshop Multimedia Signal Processing, pp. 419-424, June 1997.

[49] Y. Wang, M. T. Orchard, and A. R. Reibman, "Optimal pairwise correlating transforms for multiple description coding," in Proceedings of International Conference on Image Processing, vol. 1, (Chicago), pp. 679-683, Oct. 1998.

[50] V. Goyal, J. Kovacevic, R. Arean, and M. Vetterli, "Multiple description transform coding of images," in Proceedings of International Conference on Image Processing, (Chicago), pp. 674-678, Oct. 1998.

[51] G. Reyes, A. R. Reibman, J. Chuang, and S. Chang, "Video transcoding for resilience in wireless channels," in Proceedings of International Conference on Image Processing, vol. 1, (Chicago), pp. 338-342, Oct. 1998.

[52] G. Cote, S. Shirani, and F. Kossentini, "Robust H.263 video communication over packet lossy networks," in Proceedings of International Conference on Image Processing, vol. 2, (Kobe, Japan), pp. 533-539, Oct. 1999.

[53] G. Cote, S. Shirani, and F. Kossentini, "Optimal mode selection and synchronization for robust video communications over error prone networks," accepted for publication in the IEEE Journal on Selected Areas in Communications.

[54] W. Lam and A. R. Reibman, "Self-synchronizing variable length codes for image transmission," in Proc. IEEE Int. Conf. Acoust., Speech, and Signal Processing, vol. 6, pp. 477-480, 1992.

[55] D. Redmill and N. G. Kingsbury, "The EREC: an error-resilient technique for coding variable-length blocks of data," IEEE Trans. on Image Processing, vol. 5, pp. 565-574, April 1996.

[56] A. K. Katsaggelos, F. Ishtiaq, L. P. Kondi, M. C. Hong, M. Banham, and J. Brailean, "Error resilience and concealment in video coding," in Proc. EUSIPCO-98, (Rhodes, Greece), pp. 211-228, Sept. 8-11, 1998.

[57] V. Parthasarathy, J. W. Modestino, and K. S. Vastola, "Design of a transport coding scheme for high-quality video over ATM networks," IEEE Trans. on Circuits and Systems for Video Technology, vol. 7, pp. 358-376, April 1997.

[58] M. Khansari, A. Jalali, E. Dubois, and P. Mermelstein, "Low bit-rate video transmission over fading channels for wireless microcellular systems," IEEE Trans. on Circuits and Systems for Video Technology, vol. 6, pp. 1-11, Feb. 1996.

[59] R. Talluri, "Error resilient video coding in the MPEG-4 standard," IEEE Comm. Magazine, vol. 36, pp. 112-119, June 1998.

[60] J. Liang and R. Talluri, "Tools for robust image and video coding in JPEG2000 and MPEG-4 standards," vol. 3653, pp. 40-51, January 1999.

[61] S. Shirani, F. Kossentini, and R. Ward, "Error concealment methods: A comparative study," in IEEE Canadian Conference on Electrical and Computer Engineering, (Edmonton, Canada), pp. 835-840, June 1999.

[62] Y. Wang, Q. F. Zhu, and L. Shaw, "Maximally smooth image recovery in transform coding," IEEE Trans. on Communications, vol. 41, pp. 1544-1551, Oct. 1993.

[63] W. Zhu and Y. Wang, "The use of second order derivatives based smoothness measure for error concealment in transform based codecs," in SPIE Proc. Visual Communications and Image Processing, vol. 2501, pp. 1205-1214, Feb. 1995.

[64] Q. F. Zhu, Y. Wang, and L. Shaw, "Image reconstruction for hybrid video coding systems," in Proc. IEEE Data Compression Conf., pp. 229-238, March 1992.

[65] J. W. Park, J. W. Kim, and S. U. Lee, "DCT coefficients recovery-based error concealment technique and its application to the MPEG-2 bit stream error," IEEE Trans. on Circuits and Systems for Video Technology, pp. 845-851, December 1997.

[66] W. M. Lam and A. Reibman, "An error concealment algorithm for images subject to channel errors," IEEE Trans. on Image Processing, pp. 533-542, May 1995.

[67] S. S. Hemami and T. H. Y. Meng, "Transform coded image reconstruction exploiting interblock correlations," IEEE Transactions on Image Processing, vol. 4, pp. 1023-1027, July 1995.

[68] P. Salama, N. B. Shroff, E. J. Coyle, and E. J. Delp, "Error concealment techniques for encoded video streams," in Proceedings of International Conference on Image Processing, pp. 9-12, 1995.

[69] V. Weerackody, C. Podilchuk, Y. Zhou, and A. Estrella, "Transmission of JPEG coded images over wireless channels," in SPIE Proc. Visual Communications and Image Processing, pp. 157-168, 1995.

[70] G. Yu, M. W. Marcellin, and M. M. K. Liu, "Recovery of video in the presence of packet loss using interleaving and spatial redundancy," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 105-108, 1996.

[71] M. Ghanbari and V. Seferidis, "Cell loss concealment in ATM video codecs," IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, pp. 238-247, June 1993.

[72] P. Salama, N. B. Shroff, and E. J. Delp, "A Bayesian approach to error concealment in encoded video streams," in Proceedings of International Conference on Image Processing, pp. 49-52, 1996.

[73] R. Veldhuis, Restoration of Lost Samples in Digital Signals. Englewood Cliffs, New Jersey: Prentice-Hall, 1990.

[74] A. C. Kokaram et al., "Interpolation of missing data in image sequences," IEEE Transactions on Image Processing, vol. 4, pp. 1509-1519, Nov. 1995.

[75] H. R. Rabiee, H. Radha, and R. L. Kashyap, "Error concealment of still image and video stream with multi-directional recursive non-linear filters," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 37-40, 1996.

[76] H. Sun and W. Kwok, "Concealment of damaged block transform coded images using projection onto convex sets," IEEE Transactions on Image Processing, vol. 4, pp. 470-477, April 1995.

[77] W. Kwok and H. Sun, "Multi-directional interpolation for spatial error concealment," IEEE Transactions on Consumer Electronics, vol. 39, pp. 455-460, August 1993.

[78] L. Capodiferro, S. Puledda, and G. Jacovitti, "Missing block recovery by linear pattern propagation," in Proceedings of Picture Coding Symposium, pp. 353-359, Sept. 1997.

[79] W. Zeng and B. Liu, "Geometrical-structure-based directional filtering for error concealment in image/video transmission," in SPIE Proc. Visual Communications and Image Processing, pp. 145-156, 1995.

[80] W. Zeng and B. Liu, "Geometric-structure-based error concealment with novel applications in block-based low-bit-rate coding," IEEE Trans. on Circuits and Systems for Video Technology, vol. 9, pp. 648-665, June 1999.

[81] M. Chien, H. Sun, and W. Kwok, "A temporal and spatial POCS based error concealment algorithm for the MPEG encoded video sequence," in SPIE Proc. Visual Communications and Image Processing, vol. 2501, pp. 168-174, Feb. 1995.

[82] W. S. Lee, M. R. Frater, J. F. Arnold, et al., "Spatial temporal concealment of lost blocks in coded video," in Proceedings of International Conference on Image Processing, vol. 3, (Chicago), pp. 477-481, Oct. 1998.

[83] Z. Wang, Y. Yu, and D. Zhang, "Best neighborhood matching: An information loss restoration technique for block-based image coding systems," IEEE Trans. on Image Processing, pp. 1056-1061, July 1998.

[84] S. Yang and Y.-H. Hu, "Coding artifacts removal using biased anisotropic diffusion," in Proceedings of International Conference on Image Processing, pp. 346-349, 1997.

[85] Q. F. Zhu, Y. Wang, and L. Shaw, "Image reconstruction for hybrid video coding systems," in Proc. IEEE Data Compression Conf., (Snowbird, UT), pp. 229-238, Mar. 1992.

[86] G. D. Sampson, D. V. Papadimitriou, and C. Chamzas, "Postprocessing of block-coded images at low bit rates," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 1-4, 1996.

[87] J. Hu, N. Sinaceur, F. Li, K. Tam, and Z. Fan, "Removal of blocking and ringing artifacts in transform coded images," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 2565-2568, 1997.

[88] H. C. Kim and H. W. Park, "Signal adaptive postprocessing for blocking effects reduction in JPEG image," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 41-44, 1996.

[89] Z. Fan and F. Li, "Reducing artifact in JPEG decompression by segmentation and smoothing," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 17-20, 1996.

[90] C. J. Kuo and R. J. Hsieh, "Adaptive postprocessor for block encoded images," IEEE Trans. on Circuits and Systems for Video Technology, pp. 298-304, Aug. 1995.

[91] R. Castagno and J. A. Villaroel, "A spline-based adaptive filter for the removal of the blocking artifacts in image sequence coded at very low bit rate," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 45-48, 1996.

[92] H. Paek and S. U. Lee, "A projection based post processing technique to reduce blocking artifact using a priori information on DCT coefficients of adjacent blocks," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 53-56, 1996.

[93] D. C. Youla and H. Webb, "Image restoration by the method of convex projections: Part 1-theory," IEEE Transactions on Medical Imaging, vol. 1, pp. 81-94, Oct. 1982.

[94] C. Derviaux, F. X. Coudoux, M. G. Gazalet, P. Corlay, and M. Gharbi, "A postprocessing technique for block effects elimination using a perceptual distortion measure," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 3001-3004, 1997.

[95] Z. Xiong, M. T. Orchard, and Y. Q. Zhang, "A deblocking algorithm for JPEG compressed images using overcomplete wavelet representation," IEEE Trans. on Circuits and Systems for Video Technology, pp. 433-446, Apr. 1997.

[96] T. C. Hsung, D. P. K. Lun, and W. C. Siu, "A deblocking technique for JPEG decoded image using wavelet transform modulus maxima representation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. 561-564, 1996.

[97] T. Chen, "The past, present, and future of image and multidimensional signal processing," IEEE Signal Processing Magazine, vol. 15, pp. 21-58, March 1998.

[98] A. Rosenfeld, Image Models. New York, NY: Academic Press, 1980.

[99] S. Geman and D. Geman, "Stochastic relaxation, Gibbs distribution, and the Bayesian restoration of images," IEEE Trans. Pattern Anal. and Mach. Int., vol. 6, pp. 721-741, November 1984.

[100] S. Çhirani. F. Iiossentini. and R. \Varcl. -Reconstriictiori of baseline JPEG

codcd images in error prone envi ronmerits." acctpf ed /or publication i n f ll e IEEE

Trnnsnct io 1 2 s o n Irnage Proce.wing.

[IOl] S. Sliirani. F. Iiossentini. and R. Uard. -.A concealrnent niethod for video com-

niunicat ions in a n error prone environment ." acceytcd /or publication in the

IEEE Jour-na1 o n Sflected .Pt /-cas in C'ornrn unictrtions.

[LO?] J . Besag. -On the statistical analysis of dirty pictures.- .Journal Royal Stntist.

Soc. B. vol. 48. pp. '2.59-302. May 1986.

[103] P. Salama. S. Shroff. E. .J. Coyle. and E. .J. Delp. Error conccalrnent in encoded

cidco streams. in Signal recovery techniques for image and vicleo compression

ancl transmission. Ed. S. P. Galatsanos and A. K. Katsaggelos. Boston: Kluwer

Acaciemic Piiblishers. 1998.

[104] P. Salama. 3. B. Shroff. and E. J . Delp. -A fast suboptimal approach to error

conceaiment in encoded video st rcams." in Proc~edings of Inter-national Con-

j ~ i v n c c on Image Proc~ssing. pp. irol. I I . 101-104. 1997.

[105] S. Shirani. F. Kossentini. and R. \Vard. --An adaptive rnarkov randorn field

hased error concealment method for video communication in an error prone

.. - erivironment, in Proceedings of the Jnternational Con f e r t n c e on . - I C O U S ~ ~ C S .

Spctch. and Siynal Proc~ssing. vol. VI. pp. 31 17-3120. lfarch 1999.

[IO61 S. Siiirani. F. Iiossentini. and R. \\:ard. -Reconstriiction of motion vector miss-

.. - ing n-racroblocks in H.26:3 encocIec1 video transmission over lossy networks. rn

Procc~dings o f fnternationnl C'on ference on Image Proce.r;sing. pp. -LSï--l9 1. vol.

III . October 199s.

[lOT] J. Zhang. .J. F. Arnold. M. R. Frater. and JI . R. Pickering. --C-ideo error conceal-

nient using decoder motion vector estimation ." in IEEE TE.\'C'O.\-- Speech and

Imngc Technology for Cornputer and Telécornnzt~nications. pp. 777-ïdO. 1997.

[ los] \4:. A I . Lam. A. R. Reibman. and B. Lin. -Recot-ery of lost or erroneously

- receivecl niotion vectors. in Procerdings of t h e International C'onjerence on

-4coil.i-tics. Speech. and Signal Processing. vol. V . p p . 4 17-120. 1993.

jlol)] 1 R. Frater. IV. S. Lee. and d. F. Arnold. -Error concealment for arbitrary

stiaped video ohjects." in P r o c d i n g s of Interna!ionaI C'on ference on Image

Proc~ssing. vol- 13, (Chicago). pp- 507-.7I 1. Oct. 199s.

[ l I O ] Ci. Côté and F. Kossentini. -Optimal intra coding of blocks for rohost video

cornn~~inication over t h e internet.- EL-ROSIP Journal for- I-kunl Cenzmunica-

lion. Sp~cial Issue on Réal-time I'idco orer- thc Intcrrr~t. vol. 15. pp. 25-34.

Scpt. 1990.

(1 111 Ci. Siillivan. -.A simple ï ideo packet rnux sirnolator program for video streams

in H.26:3/1\[ using XLB mux of H.233 annes B." [TC-T Sfudy Group 2.5. l'ideo

C'oding Erpc ;.t C;r.oup. Docum ER! Ql.5-F- 16. Yovember 1996.

[112] Ericsson, "WCDMA error patterns at 64 kb/s," ITU-T Study Group 16, Multimedia Terminals and Systems Expert Group, June 1998.

[113] S. Shirani, B. Erol, and F. Kossentini, "Error concealment for MPEG-4 video communication in an error prone environment," in Proc. IEEE Int. Conf. Acoust., Speech, and Signal Processing, (to appear), 2000.

[114] S. Shirani, B. Erol, and F. Kossentini, "A concealment method for shape information in MPEG-4 coded video sequences," accepted for publication in the IEEE Transactions on Multimedia.

[115] J. A. Webb, "Steps toward architecture-independent image processing," IEEE Computer Magazine, pp. 21-31, February 1992.

[116] R. Chellappa, Markov Random Fields: Theory and Applications. California: Academic Press, Inc., 1993.

[117] L. O. S. Herodotou and A. Venetsanopoulos, "Image interpolation using a simple Gibbs random field model," in Proceedings of International Conference on Image Processing, vol. 1, (Washington D.C.), pp. 494-498, Oct. 1995.

[118] H. J. Siegel, J. B. Armstrong, and D. W. Watson, "Mapping computer-vision-related tasks onto reconfigurable parallel processing systems," IEEE Computer Magazine, pp. 54-63, February 1992.

Appendix A

Parallel Implementation of MRF-Based Error Concealment Method

A.1 Introduction

It was shown in Chapters 2 and 3 that reconstruction of missing texture information using MRF-based MAP estimation results in a minimization problem which is usually solved using iterative methods. Although under specific conditions they converge to the global minimum, iterative methods are time consuming and prohibitive in applications where real-time implementation is required (as in most video processing applications). The high speed required in many image processing applications can be achieved by employing parallel processing techniques [115].

A parallel implementation for the simulated annealing algorithm, which has been used to obtain the MAP estimate of a noisy image, has been proposed in [116]. In [117], image interpolation is performed using a simple MRF as the image model. The iterative method required for MAP estimation is approximated by non-iterative nonlinear filtering operations, thereby reducing the computational complexity of the interpolation process.

Figure A.1: A pixel, its clique (dark) and the complement of the clique (in shade).

In this appendix, we present fast implementations for the MRF-based error concealment methods proposed in Chapter 3 using parallel processing techniques. Since the pixel values in an image are restricted to a finite set (e.g., 0 to 255), the proposed implementations replace most of the required algebraic operations with lookup tables, reducing the execution time.

For the sake of simplicity, we assume that in

$$\hat{x} = \min_{x} \sum_{(i,j) \in \mu} \; \sum_{(k,l) \in c} \rho\big( w_{i,j-k,l} \, D(x_{i,j}, x_{k,l}) \big)$$

$w_{i,j-k,l} = 1$, $\forall (i,j), (k,l)$. Moreover, let us consider a two (instead of eight) pixel neighborhood for the clique. The clique is illustrated in Figure A.1. The extension to an eight-point clique is straightforward. Finally, it is sufficient to assume that $\gamma$ in Equation (2.7) is a positive integer value since the pixel values are positive integers.

A parallel machine consists of a number of processing elements (PEs). The PEs are usually arranged in a matrix configuration with each PE connected to its neighboring PEs through a bus (see Figure A.2). Moreover, each PE usually has its own memory. Parallel machines can operate in two modes: single instruction, multiple data (SIMD) or multiple instruction, multiple data (MIMD). In SIMD, all active PEs execute the same instruction synchronously on their own data. Only a single copy of the program is stored in this mode and "send" and "receive" operations between PEs are automatically synchronized. In the MIMD mode, on the other hand, there is no constraint on operations that can be performed concurrently and PEs operate asynchronously. The SIMD mode is easier to program and requires less memory but is not as flexible as the MIMD mode. Choosing the proper parallelism mode and implementation has a significant impact on the performance of the application [118].

Assuming each missing block consists of $N \times N$ pixels, in our proposed implementation $N^2$ PEs are employed for estimating the missing pixel values. Each PE is responsible for reconstructing the value of a pixel in the missing block. If the number of adjacent missing pixels exceeds the number of PEs, a group of pixels is assigned to each PE. In that situation, each PE is responsible for restoring the values of the pixels in the group. The operations performed by each of the PEs and the way it communicates with the PEs around it are detailed below for the GMRF and HMRF models.

A.2 GMRF Model

As was discussed in Chapter 3, for the GMRF model the estimated value of a pixel is given by Equation (3.9). In other words, the estimated value of a pixel is the (weighted) average of the pixels in its clique and the clique's complement. For this model, the estimated value of a pixel has a simple closed form expression in terms of adjacent pixels. Therefore, the estimation result can be computed efficiently and there is no need for a parallel implementation. However, we will discuss the parallel implementation of the GMRF-based error concealment since it will help in developing the parallel implementation of error concealment methods using other forms of MRF models.

Figure A.2: Simplified diagram of a parallel processing machine.

For parallel implementation of error concealment using the GMRF model, each PE communicates with the PEs responsible for restoration of pixels in the clique and its complement of the pixel that the PE is restoring (see Figure A.3). Having the values of the pixels in the clique and its complement, the PE calculates their summation. Since each pixel value is restricted to an integer number between 0 and 255, the summation can take at most 256 x 4 different values. For each summation value, the estimation result, which is the summation divided by four, is stored in a lookup table. This lookup table, schematically shown in Figure A.4, requires 1K Bytes of memory and can be shared among all the PEs. The operations performed by each of the PEs are identical. Hence, a very efficient SIMD implementation is possible.

Figure A.3: The connection of adjacent PEs for error concealment based on the MRF models.
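The table lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: unit weights, four neighbours (clique plus complement), and round-to-nearest integer division, which the text does not specify. The names `GMRF_TABLE` and `gmrf_estimate` are hypothetical.

```python
# Sketch of the shared GMRF lookup table: index = sum of four neighbours,
# entry = estimated pixel value (the sum divided by four).
# Assumptions (not from the text): unit weights, round-to-nearest division.

MAX_PIXEL = 255
NUM_NEIGHBOURS = 4

# One entry per possible sum (0 .. 4 * 255); each entry fits in one byte,
# so the whole table stays within roughly 1K bytes.
GMRF_TABLE = bytes(
    (s + NUM_NEIGHBOURS // 2) // NUM_NEIGHBOURS
    for s in range(NUM_NEIGHBOURS * MAX_PIXEL + 1)
)


def gmrf_estimate(neighbours):
    """Work done by one PE: add the four received values, then look up."""
    assert len(neighbours) == NUM_NEIGHBOURS
    return GMRF_TABLE[sum(neighbours)]
```

Every PE runs the same two operations (an addition and a table read), which is why the text regards this model as a natural fit for SIMD execution.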

If the restoration of each pixel could be done independently of the values of other pixels, the proposed parallel implementation could lead to a speedup factor equal to the total number of PEs ($N^2$). However, the estimated value of a pixel depends on the values of the pixels around it and therefore some inter-PE data transfer is required. Thus, the speedup factor will be less than $N^2$: the speedup of the parallel implementation over that of the uni-processor is governed by the time to perform an inter-PE transfer relative to the time required to perform the procedure detailed above (calculating the sum and using the lookup table). The speedup calculation is based on the assumption that the uni-processor and each PE are of equal computing power. If the time to perform inter-processor transfer is ignored, the speedup factor of the proposed parallel implementation compared to a serial implementation of the same algorithm will be equal to the number of PEs, which is $N^2$.

Figure A.4: The lookup table used in the error concealment based on the GMRF model (columns: SUM, ESTIMATION).

A.3 HMRF Model

There are two sources of computational complexity in the MAP estimation employing the HMRF model. First, the estimated value of a pixel depends on the values of the pixels in its clique, which are usually unknown. Second, the function $\rho(\cdot)$ has different definitions for $|x| > \gamma$ and $|x| < \gamma$. Usually, iterative methods are used to obtain the estimated value of pixels under these conditions, hence real-time processing may not be feasible. We address the first problem by proposing a parallel implementation for HMRF-based MAP estimation. To address the second problem, we use bit-wise operations and lookup tables.

Figure A.5: The connection of adjacent PEs in the error concealment based on HMRF.

For parallel implementation of error concealment methods using the HMRF model, each PE communicates with the PEs responsible for restoration of pixels in the clique and its complement of the pixel that the PE is restoring (see Figure A.5). To determine which part of the function $\rho(\cdot)$ should be used in computing the cost, we developed the following method. Corresponding to the value of each of the pixels in the clique and its complement, there is an interval called an associate interval defined as the following set

$$\{\, x \in \{0, 1, \ldots, 255\} : |x - a_i| \le \gamma \,\}$$

where $a_i$ is the value of the $i$th pixel. If the intersection of the associate intervals of the four pixels around a specific pixel is not empty, the estimated value of the pixel is equal to one of the values in the intersection. If the intersection of the associate intervals of the four pixels is an empty set, different combinations of three associate intervals are tested for an intersection. If the intersections of all combinations of three different associate intervals are empty, combinations of two associate intervals are checked. If the intersections of all pairs of associate intervals are also empty, the estimated value of the pixel is one of the values in the associate intervals of the pixels in the clique. However, this case is very unlikely and corresponds to scattered data which obviously cannot provide a reliable estimate. In what follows, we discuss a method for finding the intersection of the associate intervals efficiently.

Figure A.6: The bit pattern for a pixel value of 98 and $\gamma = 2$.

The associate interval of each pixel value consists of at most $2\gamma + 1$ integer numbers with values between 0 and 255. The associate interval of each pixel value can be stored in a 256-bit-wide binary number as follows. The least significant bit and the most significant bit in the binary number represent the pixel values of 0 and 255, respectively. The bits corresponding to the values in the associate interval of a specific pixel value are set to 1 and all other bits are set to zero. Therefore, a specific binary number is generated for every pixel value and a particular $\gamma$. Because the pixel values can assume only 256 different integer numbers, there are only 256 different binary numbers. These numbers can be stored as a look-up table consisting of 256 elements, each of them 256 bits wide. Figure A.6 shows an element of this table for a pixel value of 98 and $\gamma = 2$. This look-up table, called the associate intervals look-up table, requires 64K bits (8K Bytes) of memory and can be shared between all the PEs.
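As a rough illustration of this table, the sketch below builds the 256 bit masks with Python integers, which can hold 256-bit values directly. The helper names `build_interval_table` and `mask_to_values` are hypothetical, and $\gamma = 2$ is taken from the Figure A.6 example.

```python
# Sketch of the associate intervals look-up table.
# Each entry is a 256-bit mask held in a Python int: bit v is 1 when
# value v lies in the associate interval of the table index.

GAMMA = 2  # example threshold, as in Figure A.6


def build_interval_table(gamma):
    """Return 256 masks, one per pixel value, clipped to the range 0..255."""
    table = []
    for a in range(256):
        lo = max(0, a - gamma)
        hi = min(255, a + gamma)
        mask = 0
        for v in range(lo, hi + 1):
            mask |= 1 << v  # bit 0 is pixel value 0, bit 255 is value 255
        table.append(mask)
    return table


def mask_to_values(mask):
    """List the pixel values whose bits are set in a mask."""
    return [v for v in range(256) if (mask >> v) & 1]


INTERVALS = build_interval_table(GAMMA)
print(mask_to_values(INTERVALS[98]))  # [96, 97, 98, 99, 100]
```

Near the ends of the range the interval is clipped, so the entry for pixel value 0 holds only three set bits; this is the "at most $2\gamma + 1$" case mentioned in the text.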

Figure A.7: The lookup table for the Huber function with $\gamma = 2$.

The intersection between the associate intervals of pixels around a specific pixel can be determined by a series of bit-wise AND operations. The set of possible candidates for the estimation result consists of the values corresponding to ones in the result of the AND operations. Among the values in the candidate set, the value that minimizes the potential function (Equation (3.7)) is the estimation result. To find the value in the candidate set that minimizes the potential function efficiently, the value of the Huber function (Equation (2.7)) is also computed for different pixel values (0 to 255) and stored in a look-up table called the cost function look-up table (see Figure A.7). Since $\rho(\cdot)$ is an even function, only the function values for $x$ between 0 and 255 need to be stored in the cost function look-up table. This table can be shared between all the PEs.

An example can clarify the above procedure. Let us assume the values of the pixels around a specific pixel to be 98, 99, 100 and 57 and $\gamma = 2$. These values are available to the PE through communication with the neighboring PEs. The binary representation of the associate interval of the pixel value of 98 will have 1 in positions corresponding to 96, 97, 98, 99 and 100. For the pixel values of 99, 100 and 57 these positions will be 97, 98, 99, 100, 101 and 98, 99, 100, 101, 102 and 55, 56, 57, 58, 59, respectively. These bit patterns are obtained using the associate intervals look-up table. An AND operation performed on all four bit patterns results in a zero value. Among the different combinations of three bit patterns, only the result of the AND operation on the first three bit patterns will have a non-zero value. The 1s in the result of the AND operation correspond to the pixel values of 98, 99 and 100, which constitute the candidate set. For each of the values in the candidate set, the value of the potential function (3.7) is computed using the cost function look-up table; the estimation result is the value that minimizes the potential function, which for this example is 98.
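The candidate costs in this example can be checked by hand. The arithmetic below is a sketch that assumes the Huber function of Equation (2.7) has the common form $\rho(x) = x^2$ for $|x| \le \gamma$ and $\rho(x) = \gamma^2 + 2\gamma(|x| - \gamma)$ otherwise, and that the potential is the unweighted sum of $\rho$ over the four neighbours; neither form is reproduced in this appendix, so both are assumptions.

```latex
% Assumed Huber form with \gamma = 2:
%   \rho(0) = 0, \rho(1) = 1, \rho(2) = 4,
%   \rho(41) = 4 + 4 \cdot 39 = 160,
%   \rho(42) = 4 + 4 \cdot 40 = 164,
%   \rho(43) = 4 + 4 \cdot 41 = 168.
\begin{align*}
x = 98:  &\quad \rho(0) + \rho(1) + \rho(2) + \rho(41) = 0 + 1 + 4 + 160 = 165 \\
x = 99:  &\quad \rho(1) + \rho(0) + \rho(1) + \rho(42) = 1 + 0 + 1 + 164 = 166 \\
x = 100: &\quad \rho(2) + \rho(1) + \rho(0) + \rho(43) = 4 + 1 + 0 + 168 = 173
\end{align*}
```

Under these assumptions the smallest cost is obtained at 98, in agreement with the result stated in the text.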

Finally, the proposed method can be summarized as follows (the operations are performed in parallel in each of the PEs):

1. Obtain the associate interval of the pixel values in the clique and its complement employing the associate intervals look-up table.

2. Find the candidate set by performing a series of AND operations on the associate intervals.

3. Compute the values of the potential function (Equation 3.7) using the cost function look-up table.

4. Select the value in the candidate set corresponding to the minimum value of the potential function computed in the previous step.
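The four steps can be sketched for a single PE as follows. This is a minimal illustration, not the thesis implementation: the Huber form in `huber` is the common textbook definition and is assumed here, all weights are taken as one, and when several equally sized combinations intersect the sketch pools their candidates, a choice the text leaves open. All function and variable names are hypothetical.

```python
from itertools import combinations

GAMMA = 2  # assumed threshold, matching the worked example


def huber(x, gamma=GAMMA):
    """Assumed form of the Huber function (Equation 2.7 is not shown here)."""
    x = abs(x)
    return x * x if x <= gamma else gamma * gamma + 2 * gamma * (x - gamma)


# Shared tables: 256-bit interval masks and Huber costs for |x| in 0..255.
INTERVALS = [
    sum(1 << v for v in range(max(0, a - GAMMA), min(255, a + GAMMA) + 1))
    for a in range(256)
]
COST = [huber(d) for d in range(256)]


def estimate_pixel(neighbours):
    """Steps 1-4 for one PE, given the values received from its neighbours."""
    masks = [INTERVALS[a] for a in neighbours]  # step 1: table look-ups
    candidates = 0
    for size in range(len(masks), 0, -1):  # step 2: all four, then three, ...
        for combo in combinations(masks, size):
            common = combo[0]
            for m in combo[1:]:
                common &= m  # bit-wise AND of the selected intervals
            candidates |= common  # assumption: pool non-empty intersections
        if candidates:
            break
    values = [v for v in range(256) if (candidates >> v) & 1]
    # steps 3 and 4: potential from the cost table, then pick the minimum
    return min(values, key=lambda v: sum(COST[abs(v - a)] for a in neighbours))


print(estimate_pixel([98, 99, 100, 57]))  # 98
```

The loop count in step 2 depends on the data: four agreeing neighbours finish after one pass, while scattered values fall through to smaller combinations.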

For the HMRF model, the number of AND operations and hence the execution time of the estimation of each pixel is data dependent. SIMD and MIMD implementations are both conceivable in this case, with different trade-offs. In SIMD, since all the PEs are synchronized, the process should wait for the PE with the highest number of AND operations to finish its operations. In the MIMD implementation, on the other hand, each processor has its own program and can perform the operations without waiting for other PEs. Clearly, the speed of the MIMD implementation is greater than or equal to that of the SIMD implementation, but it requires more memory space.
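The trade-off can be made concrete with a toy timing model. The sketch assumes that each PE restores a small group of pixels one after another (the case mentioned in Section A.1) and that the cost of each pixel is simply its number of AND operations; the `work` figures are invented for illustration and are not measurements.

```python
# Toy timing model, not a measurement.
# work[p][s] = assumed cost (e.g. number of AND operations) of the
# s-th pixel handled by PE p.


def simd_time(work):
    """Lock-step execution: every step lasts as long as its slowest PE."""
    return sum(max(step) for step in zip(*work))


def mimd_time(work):
    """Independent execution: the run ends when the busiest PE finishes."""
    return max(sum(pe) for pe in work)


work = [
    [1, 4, 1],  # PE 0
    [4, 1, 1],  # PE 1
    [1, 1, 4],  # PE 2
]
print(simd_time(work), mimd_time(work))  # 12 6
```

In this model the SIMD time is a sum of per-step maxima and the MIMD time is a maximum of per-PE sums, so the MIMD time can never exceed the SIMD time, consistent with the statement above.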

The speedup factor of the parallel implementation of the HMRF model, assuming SIMD mode, over that of a uni-processor likewise depends on the inter-PE transfer time and on the time required to perform the procedure listed above. Again, it is assumed that the uni-processor and each PE are of equal computing power. Ignoring the transfer time compared to the procedure time, the speedup factor of the proposed parallel implementation compared to a serial implementation of the same algorithm will be equal to the number of PEs, which is $N^2$.