
1496 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO. 11, NOVEMBER 1995

Detection of Missing Data in Image Sequences

Anil C. Kokaram, Member, IEEE, Robin D. Morris, William J. Fitzgerald, and Peter J. W. Rayner

Abstract: Bright and dark flashes are typical artifacts in degraded motion picture material. The distortion is referred to as dirt and sparkle in the motion picture industry. This is caused either by dirt becoming attached to the frames of the film, or by the film material being abraded. The visual result is random patches of the frames having grey level values totally unrelated to the initial information at those sites. To restore the film without causing distortion to areas of the frames that are not affected, the locations of the blotches must be identified. Heuristic and model-based methods for the detection of these missing data regions are presented in this paper, and their action on simulated and real sequences is compared.

I. INTRODUCTION

METHODS for suppressing impulsive distortion in still images and video sequences have traditionally involved median filters of some kind. Arce, Alp et al. [1]-[3] have introduced 3-D (spatiotemporal) multistage median filters (MMFs) that can be used to suppress single-pixel-wide distortion in

video signals. The MMF is a variant of standard median filtering in which the output value is the median of a set of values that are themselves the output of several other median filter masks of various shapes. In the case of degraded motion picture film, however, it is more typical to find blotches that represent multiple-pixel-sized impulsive distortion. Such regions of constant intensity disturbance are called dirt and sparkle by television engineers. Kokaram et al. [4] have introduced a 3-D MMF that can reject such distortion.
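To make the multistage idea concrete, the following sketch takes medians over several spatiotemporal sub-masks and then the median of those medians. The three sub-masks used here (horizontal, vertical, and purely temporal lines) are hypothetical; the actual mask shapes of [1]-[4] differ.

```python
from statistics import median

def mmf_3d(prev, curr, next_, x, y):
    """Multistage median filter sketch: medians over several spatiotemporal
    sub-masks, then the median of those medians."""
    def px(f, dx, dy):
        # clamped pixel access
        h, w = len(f), len(f[0])
        return f[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]

    # Hypothetical sub-masks: horizontal and vertical lines in the current
    # frame, and a purely temporal line through the same site.
    m1 = median([px(curr, -1, 0), px(curr, 0, 0), px(curr, 1, 0)])
    m2 = median([px(curr, 0, -1), px(curr, 0, 0), px(curr, 0, 1)])
    m3 = median([px(prev, 0, 0), px(curr, 0, 0), px(next_, 0, 0)])
    return median([m1, m2, m3])
```

A single-pixel impulse is rejected because no sub-mask's median is dominated by the corrupted site.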

It is important to realize that a successful treatment of the missing data problem must involve detection of the missing regions. This would enable the reconstruction algorithm to concentrate on these areas, and so the reconstruction errors at noncorrupted sites can be reduced. This philosophy has important implications for median filtering in particular, which tends to remove the fine detail in images. Such a system incorporating a detector into a median filtering system for video has been used to good effect in [4]-[6].

This paper introduces model-based approaches to the general problem of detecting missing data in image sequences. Although it is clear that, as yet, there does not exist a definitive image sequence model, both Markov random field (MRF) based techniques and the 3-D autoregressive (AR) model hold some promise. Both models can describe the smooth variation of grey scale that is found over large areas of the image and the local pixel intensities. They can also handle the fine detail that is so important for image appreciation. The following work

Manuscript received March 19, 1994; revised January 10, 1995. This work was supported in part by the British Library and Cable and Wireless PLC. The associate editor coordinating the review of this paper and approving it for publication was Prof. A. Murat Tekalp. The authors are with the Signal Processing and Communications Laboratory, Department of Engineering, Cambridge University, Cambridge, UK. IEEE Log Number 9414596.

describes both an MRF-based and a 3-D AR detector for dirt and sparkle in video signals. The performance is compared with the systems introduced in [4] and [5].

Of course, any solution to this general problem of detection and suppression of missing data in image sequences must

involve attention to the motion of objects in the scene. Without considering motion, the application of 3-D processes to typical image sequences (e.g., television) would result in little improvement over what could be achieved using just spatial information. This is because like information must be treated together in each frame, and motion in a scene implies that the information at a particular position coordinate in one frame may not be related to the information at that coordinate in other frames. In other words, moving portions of an image tend to be highly nonstationary in the temporal direction perpendicular to the frame.

Although both AR and MRF methods can be used to estimate motion in video [6]-[9], a high computational cost is incurred. It is to be noted also that motion estimation is a vibrant research area, and it would not be feasible to treat both this problem and the detection problem in this one paper. It is chosen instead to use block matching to generate motion vectors that are then used by the 3-D detection process that follows. Block matching is widely used as a robust motion estimator in many applications [10], [11]. Since it is primarily motion that gives clues to the detection of dirt and sparkle, a description of the motion estimator used is given first, followed by the description of the detectors.

II. MOTION ESTIMATION

Despite the additional computational load necessary to estimate motion in an image sequence, the rewards in terms of detection accuracy are great. Furthermore, dirt and sparkle can be easily modeled as a temporal discontinuity, facilitating its recognition. This discontinuity at a site of dirt and sparkle may be recognized in a broad sense as an area of image that

cannot be matched to a similar area in both the previous and next frames. Using three frames for detection in this manner reduces problems caused by occlusion and uncovering of objects, which would give rise to temporal discontinuities in either the forward or backward direction only.

The algorithm used for motion estimation is described fully in [6]. It is a multiresolution motion estimation technique using block matching (BM) with a full motion search (FMS). A multiresolution technique is essential if one is to deal efficiently with all the different magnitudes of motion in an interesting scene. Several representations of the original image are made on different scales by successively lowpass filtering and subsampling the original frame. Typically three or four levels are used for a 256 x 256 pixel image having resolutions

1057-7149/95$04.00 © 1995 IEEE

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on October 8, 2009 at 04:31 from IEEE Xplore. Restrictions apply.


KOKARAM et al.: DETECTION OF MISSING DATA IN IMAGE SEQUENCES 1497

128 x 128, 64 x 64, etc. In this paper, if there are N levels generated, the highest resolution image is defined as Level 0, and the lowest as Level N - 1.

Motion estimation begins at Level N - 1. Block matching involves, first of all, segmenting the current frame n into predefined rectangular blocks (of size L x L pixels in this case) and then estimating the motion of each block separately. It is necessary, first of all, to detect motion in each of these blocks before a search for the correct motion vector can begin. This is done simply by thresholding the mean absolute error (MAE) between the pixels in the current block and those in the block at the same position in the previous frame. If the MAE exceeds a threshold, then it is assumed the block is moving.

Once motion is detected, the MAE between the current block and every block in a predefined search space in the previous frame is calculated. This search space is defined by fixing the maximum expected displacement to ±w pixels. Then, the search space is the (L + 2w) x (L + 2w) block centered on the current block position, but in the previous frame. The motion estimator used here is a simple integer-accurate technique, i.e., the blocks searched in the previous frame correspond only to the pixels available on the given grid locations. Fractional displacement accuracy is possible by interpolating between grid locations or by interpolating the resulting MAE curve from an integer-accurate search. Fractional estimation will yield better results, but it is more computationally demanding and so was not used in this work.

The displacement corresponding to the minimum MAE (E_d) is then selected for consideration. In order to prevent spurious matches caused by noise (another problem encountered frequently in degraded video sequences), the method of Boyce [12] is used. This technique compares E_d with the no-motion error E_0 corresponding to the center block in the search space. If the ratio r = E_0/E_d is less than some threshold ratio r_t, the match is assumed to be a spurious one and the estimated motion vector is set to [0, 0]. If the ratio is larger than the threshold, then it is assumed that the minimum match is too small to be due to the effect of noise, and the displacement corresponding to that match is selected.

After motion estimation at Level N - 1 is complete, the vectors are propagated down to Level N - 2, where FMS BM is again used to refine those estimates. Bilinear interpolation is used to estimate initial start vectors in the level l that are not estimated at the previous level l + 1. The multiresolution scheme is the same as that used by Enkelmann et al. [13], except only top-down vector refinement is used. At the final level, 0, it is possible that blocks not containing moving areas are assigned nonzero motion vectors because of their proximity to moving regions. To identify and correct this problem of vector haloes, the solution by Bierling [10] is used. Motion is detected again at the original level before the estimate from level 1 is accepted.

The final result is a field of vectors estimated on a block basis over the entire image. To get a displacement for every pixel, one could either use the same vector for every pixel in a block or, as is done here, interpolate the vector field.¹

¹ Using, in this case, bilinear interpolation.
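A minimal sketch of the full motion search with motion detection and Boyce's ratio test, in pure Python; the threshold values t_m and r_t below are illustrative assumptions, not the settings used in the paper.

```python
def mae(a, b):
    """Mean absolute error between two equally sized blocks."""
    n = len(a) * len(a[0])
    return sum(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a[0]))) / n

def get_block(frame, top, left, L):
    """Extract an L x L block with its top-left corner at (top, left)."""
    return [row[left:left + L] for row in frame[top:top + L]]

def block_match(curr, prev, top, left, L, w, t_m=2.0, r_t=1.2):
    """Integer-accurate full motion search over a +/-w window, with
    motion detection and a Boyce-style ratio test (sketch)."""
    block = get_block(curr, top, left, L)
    e0 = mae(block, get_block(prev, top, left, L))  # no-motion error E_0
    if e0 <= t_m:                                   # no motion detected
        return (0, 0)
    best, e_d = (0, 0), e0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            ty, tx = top + dy, left + dx
            if 0 <= ty and ty + L <= len(prev) and 0 <= tx and tx + L <= len(prev[0]):
                e = mae(block, get_block(prev, ty, tx, L))
                if e < e_d:
                    e_d, best = e, (dx, dy)
    # Reject spurious matches: accept only if the best match improves
    # sufficiently on the no-motion error (r = E_0/E_d >= r_t).
    if e_d == 0 or e0 / e_d >= r_t:
        return best
    return (0, 0)
```

A block whose best match is no better than the no-motion match (noise) is assigned the vector [0, 0], as described above.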

is found in general that it is better for pixel-wise detection to interpolate the vector field than to use a block-based field. This alleviates the more serious blocking artifacts, although it is agreed that this solution is by no means a consistent one. Removing blocking artifacts should be incorporated into the motion estimator itself and not as a post-processing stage. Nevertheless, as far as detection of degradation is concerned, blocking artifacts from the motion estimator are not a problem. For alternative motion estimation schemes, the reader is referred to the extensive literature in [6] and [14]-[19].
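The multiresolution construction described at the start of this section can be sketched as follows; 2x2 block averaging stands in here for the unspecified lowpass filter, so this is an illustration rather than the paper's pyramid.

```python
def lowpass_subsample(frame):
    """One pyramid level: 2x2 block averaging as a crude lowpass filter,
    followed by subsampling by two."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2 * i][2 * j] + frame[2 * i][2 * j + 1] +
              frame[2 * i + 1][2 * j] + frame[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w)] for i in range(h)]

def build_pyramid(frame, levels):
    """Level 0 is the full-resolution frame; Level N-1 is the coarsest."""
    pyr = [frame]
    for _ in range(levels - 1):
        pyr.append(lowpass_subsample(pyr[-1]))
    return pyr
```

Vectors estimated at the coarsest level are then refined level by level down to Level 0, as described above.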

III. THE MODELS

In a sense, estimating motion in the video signal already imposes some model on the data. Using BM implies a translational model of the image sequence, such that

I_n(r) = I_{n-1}(r + d_{n,n-1}(r))   (1)

where r = [x, y] denotes spatial coordinate and d_{n,n-1}(r) is the motion vector mapping the relevant portion of frame n into the corresponding portion of frame n - 1 at position r. The motion vector is found by minimizing a functional of I_n(r) - I_{n-1}(r + d_{n,n-1}(r)). In the case of BM, this form is the absolute error operation, and the minimization is achieved via a direct search technique over all possible motion vectors within a certain range.

This basic model therefore creates each image by rearranging patches of grey scale from the previous frames. This simple structure can be used to propose several detectors for a temporal discontinuity that will be considered in the next section. However, it is possible to use alternative models, such as those discussed in the following sections, to describe the evolution of pixel intensities. These models are more capable of describing changing object brightness due to shading, for example. Of course, these models must take motion into account, and it is possible to design schemes for motion estimation, whether implicit or explicit, using these techniques [6]-[8]. In practice, however, one finds it feasible to combine a rough yet robust motion estimation algorithm (such as BM) with more complicated image models. The process is treated in two stages, the first involving motion estimation and the second using these motion estimates to construct some image sequence model. This procedure takes advantage of the relatively simpler BM motion estimation process rather than resorting to the more complicated model-based processes. Note, however, even though differing ideas underlie the motion estimation in the first stage and the models used in the second stage, the essential basis remains that an image in a sequence is mostly the same as the images before or after it.

A. Markov Random Fields

The use of Gibbs distributions, or equivalently MRF models, for images was introduced to the signal processing literature by Geman and Geman [20]. The framework is a very flexible one; here, only the basic theory needed for the development of the MRF-based detector in Section IV-D is outlined; for a complete discussion, refer to [20] and [21].




Fig. 1. 2-D lattice illustrating neighborhoods of different orders. The nth-order neighborhood of r includes all pixels labeled with numbers ≤ n. Neighborhoods on 3-D lattices are defined similarly.

Consider a finite lattice S in two or three dimensions. At each site r of the lattice, define a random variable i(r), where i(r) takes values from the discrete state-space ω. Let i denote any of the possible configurations of the complete field I. Define a neighborhood system N on S (see Fig. 1), where the nth-order neighborhood of pixel r is the set of pixels s such that |r - s|² ≤ n. These local neighborhoods are symmetric, such that s ∈ N_r ⇔ r ∈ N_s. The set I is then an MRF if

P(I = i) > 0   ∀ i ∈ ω^|S|   (2)

P(I(r) = i(r) | I(q) = i(q), q ≠ r) = P(I(r) = i(r) | I(q) = i(q), q ∈ N_r).   (3)

For example, if I represents the intensities of the pixels in an image, (3) states that the conditional probability of a pixel taking a particular value is a function only of the intensities of pixels in a finite (and usually small) neighborhood. The Hammersley-Clifford theorem [22] states this conditional probability distribution can be written as a sum over clique potentials as follows:

P(I(r) = i(r) | I(q) = i(q), q ∈ N_r) = (1/Z_r) exp( - Σ_{C: r ∈ C} V_C(i) )   (4)

that is, the conditional probability is a function only of those cliques C dependent on i(r), where a clique is defined to be a subset of S such that the clique contains either a single site, or every site in the clique is a neighbor of every other site. Some cliques for the first-order neighborhood are illustrated in Fig. 2. The function V_C(i) is known as the potential function and is a function only of those variables within the clique C. From the conditional probability definition the joint distribution may be written

P(I = i) = (1/Z) exp( - Σ_{C ∈ C} V_C(i) )   (5)

where C is the set of all cliques.

Fig. 2. Cliques associated with the first-order neighborhood system used in the MRF detector: the singleton, horizontal, and vertical pair cliques.

Because of the identity between MRF's and Gibbs distributions, many of the techniques of, and analogies with, statistical physics can be applied to problems described by this framework. In particular, the techniques of simulated annealing [20] and mean field annealing [23] have been used successfully to find solutions to the optimization problems associated with MRF's.

The Gibbs sampler is the basic technique for much work with MRF's. It provides a computationally tractable method for generating samples from the joint distribution P(I = i). The Gibbs sampler is applied by repeatedly sampling from the simple distribution of (4); that is, samples are drawn from P(I(r) = i(r) | I(q) = i(q), q ≠ r), where each time r indexes a different site in the lattice. This procedure will cause the configuration of the field I to converge to a sample from the joint distribution P(I), irrespective of the initial configuration. These samples from P(I) can be used to calculate expectations with respect to P(I) and, coupled with annealing, can be used to find the mode of the distribution in (5), the configuration with maximum probability.

Finding the maximum of this probability distribution is a massive combinatorial optimization problem, similar to the travelling salesman problem [24]. Introduce into (5) the idea of temperature by multiplying the argument of the exponential by 1/T. By varying the value of T, the characteristics of the distribution may be changed from uniform at T = ∞ to completely concentrated at the mode for T = 0. By introducing the variable T and reducing its value at each iteration of the Gibbs sampler according to some schedule, the sample drawn from P(I) will converge to the maximum probability configuration of the field. In [20] it was proved that a logarithmic schedule will cause the algorithm to converge to the maximum probability solution. It has been noted many times, however, that this schedule is too slow in practice, and commonly an exponential schedule is used [25]. Details of using the mean field approximation to solve for the minimum variance solution can be found in [26].

B. The 3-D AR Model

The structure of the AR model allows efficient, closed-form computational algorithms to be developed, and it is this, together with its spatiotemporal nature, which is of interest. The physical basis for its use as an image model is limited to its ability to describe local image smoothness both in time and space.

Simply put, the model tries to make the best prediction of a pel in the current frame based on a weighted linear combination of intensities at pels in a predefined support region. This support region may occupy pels in the current frame as well as previous and past frames. The 3-D AR model has already been described by Strobach [27], Efstratiadis et al. [7], and Kokaram [6], and the equation is repeated below using the notation of Kashyap [28]:

I(x, y, n) = Σ_{k=1..N} a_k I(x + q_xk + d^x_{n,n+q_nk}(x, y), y + q_yk + d^y_{n,n+q_nk}(x, y), n + q_nk) + ε(x, y, n)   (6)




[Fig. 3 shows two panels, one with no displacement and one with a displacement of [-1, -1], indicating the motion vector, the support pels, and the predicted pel in frame n.]

Fig. 3. Handling motion with the 3-D AR model.

In this expression, I(x, y, n) represents the pixel intensity at the location (x, y) in the nth frame. There are N model coefficients a_k. With no motion between frames, each coefficient would weight the pixel at the location offset by the vector q_k = [q_xk, q_yk, q_nk], the sum of these weighted pixels giving the predicted value Î(x, y, n) below:

Î(x, y, n) = Σ_{k=1..N} a_k I(x + q_xk, y + q_yk, n + q_nk).   (7)

Because there is motion, however, these support locations must be offset by the relative displacement between the predicted pixel location and the support location. This displacement between frame n and frame m is defined to be d_{n,m}(x, y) = [d^x_{n,m}(x, y), d^y_{n,m}(x, y)]. The arguments (x, y) illustrate that the displacement is a function of position in the image. Finally, ε(x, y, n) is the prediction error or residual at location (x, y, n). It can also be considered to be the innovations sequence driving the AR model to produce the observed image I(x, y, n). Fig. 3 shows a temporally causal 3-D AR model with five pixels of support at [0, 0, -1], [-1, 0, -1], [0, -1, -1], [1, 0, -1], [0, 1, -1]. The figure illustrates how the displacement is incorporated into the prediction.

For the purposes of parameter estimation, the model is considered in the prediction mode of (7). The task then becomes to choose the parameters in order to minimize some function of the prediction error, or residual,

ε(x, y, n) = I(x, y, n) - Î(x, y, n).   (8)

Equation (8) is just a rearrangement of the model (6) with the emphasis placed on the prediction error ε(x, y, n).

It was decided, in the interest of computational load, to use a least-squared estimate for the model coefficients in order to adapt the coefficients to the image function prior to motion estimation. Recall that the displacement estimates are derived from a separate motion estimation process, and so they do not complicate the least-squared solution further. The coefficients are chosen, therefore, to minimize the square of the error in (8) above. This leads to the normal equations. The derivation is the same as in the 1-D case, and the solution can be arrived at by invoking the principle of orthogonality. Solving the normal equations yields the model coefficients a_k.
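Eqs. (7) and (8) can be sketched for the five-pel temporally causal support, with the displacement (dx, dy) supplied by the separate motion estimator; this is a minimal illustration, not the paper's implementation, and the coefficient values in the usage below are arbitrary.

```python
def ar_predict(prev, x, y, coeffs, support, dx=0, dy=0):
    """Prediction of (7) for a temporally causal model: weighted sum of
    support pels in the previous frame, offset by the motion vector.
    `support` lists the (q_x, q_y) offsets; all q_n = -1 here."""
    return sum(a * prev[y + qy + dy][x + qx + dx]
               for a, (qx, qy) in zip(coeffs, support))

def ar_residual(curr, prev, x, y, coeffs, support, dx=0, dy=0):
    """Prediction error of (8): observed minus predicted intensity."""
    return curr[y][x] - ar_predict(prev, x, y, coeffs, support, dx, dy)
```

On a smooth region the residual is small; a blotch, which the AR model cannot predict, produces a large residual.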

The final set of equations to be solved is stated below:

Ca = -c.   (9)

Here, C and c represent terms from the correlation function of the image sequence, and a is the vector of model coefficients. (See [6], [7], [27]-[29].)
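Given the correlation terms, (9) is a small linear solve. A sketch using Gaussian elimination with partial pivoting follows; the construction of C and c from the image sequence is omitted, and the values in the test are arbitrary.

```python
def solve_normal_equations(C, c):
    """Solve C a = -c for the AR coefficients a (sketch of (9))."""
    n = len(C)
    # Augmented matrix [C | -c].
    A = [row[:] + [-c[i]] for i, row in enumerate(C)]
    for k in range(n):
        # Partial pivoting for numerical stability.
        p = max(range(k, n), key=lambda r: abs(A[r][k]))
        A[k], A[p] = A[p], A[k]
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]
            for j in range(k, n + 1):
                A[r][j] -= f * A[k][j]
    # Back substitution.
    a = [0.0] * n
    for k in range(n - 1, -1, -1):
        a[k] = (A[k][n] - sum(A[k][j] * a[j] for j in range(k + 1, n))) / A[k][k]
    return a
```

In practice a dedicated linear-algebra routine would be used; the point is only that (9) is a direct, closed-form computation.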

IV. THE DETECTORS

It is important to realize from the outset that this work characterizes missing data in an image sequence by a region of pixels that have no relation to the information in any frame but the current one. No relation is assessed in different ways depending on the model structure used. This is typically the case in all real occurrences of the problem. This simple realization gives the key to all the detectors discussed here; the idea is to look for temporal discontinuities in the sequence. Further information can be gathered from spatial discontinuities as well. This is more difficult to rely upon, principally because spatial discontinuities are a common and perhaps a necessary

occurrence in an interesting picture.

Several detectors are described here. The discussion begins with those previously introduced and then moves on to the new detectors, namely the SDIa-, MRF-, and AR-based systems.

A. Heuristics

There have been two detectors previously discussed that involve some heuristics for detection. The earliest is that discussed by Storey [30], [31]. This did not employ motion estimation and instead thresholded the forward and backward non-motion-compensated frame differences to detect a blotch.² There were a number of heuristics involved for detecting motion by using this information to vary the threshold in some way. The main thrust of the detector, however, is given by the following statements (where I_n(r) is the pixel intensity at the location r in the nth frame):

e_b = I_n(r) - I_{n-1}(r)
e_f = I_n(r) - I_{n+1}(r)

D_BBC(r) = { 1, if (|e_b| > e_t) AND (|e_f| > e_t) AND (sgn(e_b) == sgn(e_f)); 0, otherwise. }   (10)

The detector can be stated in words as follows: I_n(r) is a blotched pixel if both the absolute forward and backward errors are greater than the threshold error e_t, and I_n(r) does not lie within the range represented by the values I_{n-1}(r) and I_{n+1}(r). The latter rule is placed because of the assumption that if the pixel value is between those of the pixels in the two frames n + 1, n - 1, then it must be part of the natural evolution of grey levels in the scene. The first two rules ensure that both the forward and backward differences agree that the central pixel represents some discontinuity. This would lessen the effect of false alarms at occlusion and uncovering, since in that situation there would be a large error in one temporal direction only. The assumption of equal sign is only

² The term blotch is used here as a synonym for the terms dirt and sparkle and temporal discontinuities.




true in general if the blotches tend to be bright white or dark black. If the blotches are random in grey scale, then this detector is likely to miss those occurrences. However, this is not a common situation. Finally, in the presence of large motion, this detector cannot correctly separate moving regions from blotched areas for obvious reasons, despite the additional control measures implemented in [30]. The reader is referred to [30] and [31] for further details.

B. SDI

A very similar detector, using the spike detection index (SDI), was presented in [4]. This was motion compensated, however. It attempted to generate one number from which the presence or absence of a blotch could be inferred. The SDI is defined as follows:

SDI = { 0, if (e1 < t_l) AND (e2 < t_l); 1 - |e1 - e2| / (e1 + e2), otherwise }

with e1 = |p - f| and e2 = |p - b| the absolute motion-compensated forward and backward differences at the tested pixel,

where t_l is a low threshold that overcomes problems when e1 and e2 tend to zero. The SDI is limited to values between 0 and 1, and the decision that a spike is present is taken when the SDI at the tested pixel is greater than some predefined threshold.

To understand how this approach works, assume the motion is purely translational. Now consider the following points, where p, f, b are the present, forward, and backward pixel values, respectively, along a motion trajectory.

Occlusion: |p - f| will be large and |p - b| will be zero. Therefore, SDI = 0.

Uncovering: |p - f| will be zero and |p - b| will be large. Therefore, SDI = 0.

Normal (trackable) motion: Both |p - f| and |p - b| will be zero. As both tend to 0, the SDI is not well behaved. However, when this happens, it means the motion estimator has found a good match in both directions; hence, the current pel is not likely to be a scratch. Therefore, in this case the SDI is set to zero.

A blotch at the current pel, but in the position of an object showing normal motion: Both |p - f| and |p - b| will be large and the same, and so SDI = 1. They would be the same since f and b would both be the same pel on the object at different times, thus having the same intensity, provided the assumption of pure translation holds.

A blotch at the current pel, but in a position on an object showing occlusion or uncovering: it is difficult to say how the SDI behaves here. The SDI will take some undefined value, not necessarily zero or one. This value would depend on the actual intensities of the occluding regions.

A blotch at the current pel, but f and/or b represent pels at which blotches have also occurred: again, the SDI is not defined, but if the blotches are fairly constant valued, the index tends to 0.
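The limiting cases above can be reproduced with a small sketch; the piecewise form used here, with the low threshold t_l, is an assumption consistent with the listed cases rather than the paper's exact formula.

```python
def sdi(p, f, b, t_l):
    """Spike detection index sketch. e1, e2 are the absolute
    motion-compensated forward/backward differences; result is in [0, 1]."""
    e1, e2 = abs(p - f), abs(p - b)
    if e1 < t_l and e2 < t_l:
        # Good match in both directions: not a spike.
        return 0.0
    return 1.0 - abs(e1 - e2) / (e1 + e2)
```

A symmetric large mismatch (blotch over tracked motion) gives 1; a one-sided mismatch (occlusion or uncovering) gives 0.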

For real sequences there must be some lower threshold t_l for the forward and backward differences that will indicate that the match found is sufficiently good that the current pel is uncorrupted. This is necessary because in real sequences the motion is not translational, and due to lighting effects the intensities of corresponding areas do not necessarily match. Further, there will be errors from the motion estimator.

The general rule is that when the SDI is 0, the current pel is uncorrupted; when it is 1, the current pel is corrupted. In order to allow for the cases where occlusion and multiple corruptions along the motion trajectory are possible, there must be some threshold to make the decision. The threshold also allows some tolerance in the case of real sequences where motion is not purely translational and one has to deal with slight lighting changes not due to motion.

The SDI was found to be effective in most cases but relies on the motion estimator tracking the actual image and not being affected by blotches. This is an important issue since typical BM algorithms are not robust to artifacts of such a potentially large size. Further, the use of the lower threshold t_l automatically excludes a number of discontinuities from consideration. The SDI also has quite a high false alarm rate in occluded and uncovered regions, where large forward and backward differences are likely. Nevertheless, it is more effective than the detector of (10), primarily because of its use of explicit motion compensation.

C. SDIa

There is scope for implementing a motion-compensated version of the detector given in (10). This is the first new (perhaps simpler) formulation to be considered in this paper. It flags a pixel as being distorted using a thresholding operation as follows:

e_b = |I_n(r) - I_{n-1}(r + d_{n,n-1}(r))|
e_f = |I_n(r) - I_{n+1}(r + d_{n,n+1}(r))|

SDIa(r) = { 1, if (e_b > e_t) AND (e_f > e_t); 0, otherwise. }

Here, d_{n,n-1}(r) and d_{n,n+1}(r) are motion vectors mapping the pixel in frame n into the previous and next frames. A pixel is therefore flagged as distorted when the forward and backward motion-compensated frame differences are both larger than some threshold e_t. This is the simplest detector for temporal discontinuities [6]; it does not involve the sign operations of the detector defined by (10). This is because it is possible for blotches to occur that violate the sign portion of the rule. The SDIa also has a direct association with the AR-based system that is discussed later in this article.

D. Detection Using Markov Random Fields

The use of the theory of MRF's outlined in Section III-A enables a different definition of no relation to be used; the spatial nature of MRF models allows the information that dirt and scratches tend to occur in connected regions to be encoded into the detector. No significant attempt is made to model the image, this being too computationally intensive;




rather, the MRF model is applied to the blotch detection frame to introduce spatial continuity there. This encourages the detection of connected blotch regions.

In this section, D denotes the detection frame between the two image frames, which is to be estimated, where d(r) = 1 indicates the presence of a blotch at position r and d(r) = 0 denotes no blotch. Bayes theorem gives

P(D = d | I = i) ∝ P(I = i | D = d) P(D = d).   (11)

That is, the probability distribution of the detection frame is proportional to the product of the likelihood of observing the frame I, given the detection configuration, and the prior distribution on the detection frame, i.e., the model for the expected blotch generation process.

Thus, using φ(·) to denote the potential function for the two-element cliques used, N_r for the four in-frame neighbors (the first-order neighborhood), and i(r_mc) for the single motion-compensated neighbor used, the likelihood function is

    P ( I = i I D = d )

    + a(1-

    keeping all d(.Sq.s# r constant at their current values whencalculating the conditional distribution for d ( 3 .These conditional distributions are used in the Gibbs sam-pler with annealing to find the maximum a posteriori (MAP)configuration of the detection frame, given the data and themodel for blotches, as discussed in Section 111-A. The MAPconfiguration is found for the detection frame between thecurrent frame and the previous frame, and the current frameand the following frame. Regions detected in both temporaldirections are consistent with the heuristic for blotches andare classified as such.Parameter Estimation: The MRF detector is seen to dependon three parameters-@, P I , /&. The value of controls thestrength of the self-organization of the discontinuities, andRipley [ 3 3 ] gives arguments for a value around two for afour nearest neighbor system, based on considerations of theconditional probability of a pixel when surrounded by three orfour pixels of the same state. Arguments of a similar naturecan be used to find CY and P 2 .The last term in (14) balances the increase in conditionalprobability introducing a discontinuity that eliminates the ef-fect of the first term, the motion-compensated frame differenceterm. To balance a difference of el requires

    u e : N /j2.Also, consider a single pixel error of magnitude e2 . For thisto be detected requires

    exp(-/?2) > exp(-crei + 4/31).i.e., the probability of a pixel having a particular grey scalevalue is a function of the pixels in its spatiotemporal neigh-borhood, with the temporal neighbor being excluded if a (16)discontinuity is indicated.The prior on the detection frame is taken to be the nearest-neighbor king model 1321 (the n = 1 neigborhood in Fig.I ) , together with a term to bias the distribution toward nodetections, to avoid (1 1) being optimized by a solution withscratches detected everywhere. This prior is successful inorganizing the detection into connected regions as desired.The prior is

    Thus, by quantifying the heuristic that spatial discontinuitiesare indicators of blotches, the values of the parameters of themodel to detect the blotches may be chosen consistently. Thishas been shown to result in a detector with a soft threshold1341,whereby the temporal discontinuity required for a blotchto be detected is reduced as the spatial extent of the blotchincreases.

    P(D = d ) E. Detection Usinn the 3-0 R Model:-Assume that the image is corrupted according to the fol-

    where f ( d ( 4 ) is the number of the four neighbors of d ( 3with the same value as d ( 3 , and S() is the delta function.Combining (12) and (13) , using 4(.) = (.) as the potentialfunction and dropping the term from (12) that is not a functionof d gives the a posteriori distribution asP ( D = d I I = i )

    - [ a ( l- d(F))(i(F) - i(rzc))2TES

    1= -expz

    This is the joint distribution for the detection frame D. From(14) the local conditional distributions are easily formed by
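The local conditional updates implied by (14), together with the parameter choices (15) and (16), can be sketched as follows. This is a minimal NumPy illustration of the scheme described above, not the authors' implementation: the squared motion-compensated difference field diff_sq is assumed precomputed, the annealing schedule is collapsed to a fixed temperature, and all function and variable names are ours.

```python
import numpy as np

def mrf_blotch_detect(diff_sq, beta1=2.0, e1=24.0, e2=36.0,
                      n_iter=5, temp=1.0, seed=0):
    """Gibbs-sampling sketch of the MAP search for the detection frame.

    diff_sq holds (i(r) - i(r_MC))**2, the squared motion-compensated
    frame difference.  beta1 is set near two (Ripley's argument); taking
    (16) at equality and substituting (15) then fixes alpha and beta2
    from the chosen error magnitudes e1 < e2.
    """
    rng = np.random.default_rng(seed)
    alpha = 4.0 * beta1 / (e2 ** 2 - e1 ** 2)   # boundary of (16) with (15)
    beta2 = alpha * e1 ** 2                     # from (15)
    h, w = diff_sq.shape
    d = np.zeros((h, w), dtype=np.uint8)        # start with no detections
    for _ in range(n_iter):
        for y in range(h):
            for x in range(w):
                nbrs = [d[yy, xx] for yy, xx in
                        ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                        if 0 <= yy < h and 0 <= xx < w]
                f1 = sum(int(v == 1) for v in nbrs)  # agreements if d(r) = 1
                f0 = len(nbrs) - f1                  # agreements if d(r) = 0
                # exponents of (14) for the two states of d(r)
                u0 = -alpha * diff_sq[y, x] + beta1 * f0
                u1 = beta1 * f1 - beta2
                p1 = 1.0 / (1.0 + np.exp((u0 - u1) / temp))
                d[y, x] = 1 if rng.random() < p1 else 0
    return d
```

With these parameters, an isolated pixel must differ by more than e2 to be flagged, while a pixel whose neighbors are already flagged needs only a difference of about e1, which is the soft-threshold behavior described above.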

E. Detection Using the 3-D AR Model

Assume that the image is corrupted according to the following model:

    g(r) = I(r) + b(r)    (17)

where

    b(r) = 0 with probability (1 - P_B), and b(r) = B with probability P_B.    (18)

Here, B is a randomly distributed grey level representing a blotch or impulse, and it occurs at random intervals in the frame with probability P_B. As in the previous section, it is required to detect the likely occurrences of b(r) ≠ 0 in order to isolate the distortion for further examination. The key to the solution is to make the assumption that the undistorted image I(x, y, n) obeys the AR model whereas b(r) does not. This approach was taken by Vaseghi et al. [35], [36] in addressing a similar problem in degraded audio.

Suppose the model coefficients for a particular image sequence were known. The prediction error could then be considered to be noise with some correlation structure [29].

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on October 8, 2009 at 04:31 from IEEE Xplore. Restrictions apply.


IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO. 11, NOVEMBER 1995, p. 1502

If g(r) were filtered with the model prediction filter, the output could be written as

    ε'(r) = g(r) - Σ_{k=1}^{N} a_k g(r + q_k)
          = I(r) + b(r) - Σ_{k=1}^{N} a_k I(r + q_k) - Σ_{k=1}^{N} a_k b(r + q_k)
          = ε(r) + b(r) - Σ_{k=1}^{N} a_k b(r + q_k).    (19)

Equation (19) shows that the undistorted samples in the degraded signal are reduced to the scale of the error or residual sequence. The distorted samples can be reduced in amplitude if there are other distorted samples nearby, but this does not occur often. Therefore, thresholding the prediction error is a useful method of detecting local distortion.

Parameter Estimation: For a real sequence, the model coefficients are unknown. In this work, they are estimated from the degraded sequence using the normal equations. A motion-compensated volume of data is extracted and then "centralized" by subtracting a linear 3-D trend, following the 2-D work of Veldhuis [37]. The coefficients are then estimated using the previously calculated displacements and the normal equations. The choice of the spatial extent of the volume used is important. If the size of a block is comparable to the size of a particular blotch, then the coefficients are heavily biased by that distortion, and the resulting detection efficiency is poor.

This effect is enhanced when the model has spatial support in the current frame, since the model support is then more likely to contain corrupted sites.3 In the case of dirt and sparkle, because the distortion occupies a large spatial area, a model with spatial support in the current frame would only give large prediction errors at the edges of a blotch. Inside the blotch, the residual would be reduced in magnitude. In practice, models with no support in the current frame are more effective, since the distortion is local (impulsive) in time but not necessarily as local in space.

There is the question of how the current block being modeled is assigned motion vectors to yield the 3-D data volume required. There are two approaches. One is to use the same block size as used by the motion estimator, which would be consistent with previous assumptions, and then compensate the entire block using the one vector. The other is to compensate each pixel in that block using interpolated vectors. This work uses the former technique, primarily because of the lower computation required.

It becomes helpful to describe AR predictors by the number of pixels of support in each frame. There is no evidence for asymmetric supports, so a 9:0 model refers to a model with nine pixels in a 3 x 3 square in the previous frame acting as support. A 9:0:9 model has twice that support, nine pixels in each of the previous and next frames.
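The normal-equations fit just described can be sketched for a purely temporal 9:0 model as follows. This is an illustrative reconstruction rather than the paper's code: prev is assumed to be the motion-compensated previous-frame block, the trend-removal step is omitted, and a standard least-squares routine stands in for an explicit normal-equations solve.

```python
import numpy as np

def estimate_ar_9_0(prev, curr):
    """Least-squares estimate of a 9:0 3-D AR model over one block.

    Each pixel of curr is predicted from the 3 x 3 patch at the same
    (motion-compensated) site in prev; the nine coefficients minimize
    the squared prediction error, i.e. they solve the normal equations.
    """
    h, w = curr.shape
    rows, targets = [], []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            rows.append(prev[y - 1:y + 2, x - 1:x + 2].ravel())
            targets.append(curr[y, x])
    X = np.asarray(rows, dtype=float)
    t = np.asarray(targets, dtype=float)
    a, *_ = np.linalg.lstsq(X, t, rcond=None)   # solves (X^T X) a = X^T t
    resid = t - X @ a                           # prediction error over block
    return a, resid
```

A blotch whose area is comparable to the block contributes heavily to X and t, which is exactly the coefficient bias discussed above.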

Implementation: There are two types of model-based detection systems that can be considered. The first thresholds the prediction error given a single model. Therefore, an impulse is detected when

    [ε'(r)]² > t_e    (20)

where t_e is some threshold.

The other detection system uses two temporally different models: a forward predictor N:0 and a backward predictor 0:N. The two prediction error fields, ε₁ and ε₂, are then thresholded to yield a detected distortion when

    ([ε₁(r)]² ≥ t_e) AND ([ε₂(r)]² ≥ t_e).    (21)

Therefore, a blotch is located when both predictors agree that a match cannot be found in either of the two frames. Such a system is denoted by N:0/0:N.

In practice, the causal/anticausal detector is better than the noncausal approach. This is due to the better ability of the former technique to account for occlusion and uncovering by seeking agreement between two directed predictions. Only the N:0/0:N system is considered here.

3 In most real degraded sequences, blotches do not occur at the same spatial position in consecutive frames.
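Rule (21) can be sketched as below; with both coefficients fixed at 1.0, as the following paragraph notes for the SDIa, the prediction errors reduce to plain motion-compensated frame differences. Function and argument names are ours, and prev_mc and next_mc are assumed to be motion-compensated neighbor frames.

```python
import numpy as np

def ar_detect(eps_fwd, eps_bwd, t_e):
    """N:0/0:N rule (21): flag a pixel only when BOTH the forward and
    backward squared prediction errors reach the threshold, i.e. neither
    temporal direction can explain it."""
    return (eps_fwd ** 2 >= t_e) & (eps_bwd ** 2 >= t_e)

def sdia_detect(prev_mc, curr, next_mc, t_e):
    """Special case with unit coefficients: the prediction errors become
    motion-compensated frame differences (the SDIa heuristic)."""
    return ar_detect(curr - prev_mc, curr - next_mc, t_e)
```

The AND of the two directions is what makes the rule robust to occlusion and uncovering: a pixel explained by either neighboring frame is left alone.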

Note that the SDIa detector is the same as the 1:0/0:1 AR detector except that the two AR coefficients are set to 1.0. That detector is true to the model being used for motion estimation via BM. It follows from the idea that every image is just a rearrangement of image patches in the previous or the next frame. Hence, pixels that cannot be found (to some tolerance) in either of the two surrounding frames must not be part of the sequence.

F. Computational Load

In this work, multiresolution block matching was used to estimate motion. At each frame, motion must be estimated in both the forward and backward temporal directions. The computation this requires, in all cases, is far in excess of that required by the detectors. Also, the detectors do not involve the motion estimator explicitly; therefore, the motion estimator load is not considered here.

All arithmetic operations (e.g., +, -, ABS, <) were counted as costing one operation. The exponential function evaluation was taken as costing 20 operations, and inversion of an N x N matrix was assumed to be an N³ process. Estimates for the number of operations per pixel for the detectors are as follows: DBBC = 11, SDI = 11, SDIa = 7, 3DAR = 140 (assuming a block size of 8 x 8 pixels and a 9:0 model), and MRF = 50 per iteration. Only a small number of iterations (typically five) were needed in the following experiments, as the temporal term in the detector (14) usually dominates over the spatial terms.

V. RESULTS AND DISCUSSION

In order to objectively assess the performance of the various detectors just discussed, the sequence WESTERN (60 frames of 256 x 256) was artificially corrupted with blotches of varying size and shape and random grey level. The method of corruption is outlined in Appendix A. The exact method of corruption is not important; it is sufficient to recognize that areas of missing data were introduced into each frame in some random manner so that they represented temporal discontinuities. The corruption was quite realistic in that the size and shape



    KOKARAM et al.: DETECTION OF MISSING DATA IN IMAGE SEQUENCES 1503

of the blotches produced were not regular. Typical degraded frames (48-50) are shown as Figs. 6-8. No effort was made to ensure that blotches did not occur at the same spatial location in consecutive frames; indeed, this was the case in some frames. The experiment thus represents worst-case results in some sense, since multiple occurrences of blotches in the same position in consecutive frames are indeed a very rare event in practice. Figs. 9-13 show, respectively, detection results when the SDIa, SDI, MRF, 1:0/0:1 (known4), and 1:0/0:1 systems are applied to frame 49.

Motion estimates were made from the degraded frames. A four-level motion estimation process was used as outlined in the description previously. The search space used for the full-search block matching process was ±4 pixels at each level. The generating kernel for the image pyramid was a spatially truncated Gaussian function with variance of 1.0 and a kernel size of 9 x 9 pixels. A threshold of 10.0 on the MAE was used to detect motion, with a noise ratio [12] of 1.2 at the original resolution level and 1.0 at all other levels. A block size of 9 x 9 was used at the 256 x 256 level, and 5 x 5 otherwise. In a sense, these and other details about the BM parameters used are not important. It is only necessary to note that the results for each detector were generated using the same motion vector estimates.

Fig. 4 shows a plot of the correct detection rate versus false alarm rate for each detector. Since the original data is available, it is possible to estimate a true set of AR model coefficients for a particular AR model configuration, and this was done for both a 1:0/0:1 and a 5:0/0:5 system (using a support as in Fig. 3). The motion estimates used for this artificial situation were also gained from the degraded sequence. The curves for the SDIa and AR systems were generated by making measurements of accuracy for thresholds that vary from 0 to 2000 in steps of 100. The SDI curve was generated similarly, with thresholds varying from 0 to 1.0 in steps of 0.05 (t₁ = 10.0). The point on the lines nearest the top right-hand corner of the graphs corresponds to the smallest threshold. The MRF detector characteristic was found by using different values of e₁ and e₂ (14 ≤ e₁ ≤ 34 and 16 ≤ e₂ ≤ 56) in estimating the parameters α and β₂ via (15) and (16).

It can be seen that the SDIa detector performs very well overall, maintaining greater than 80% correct detection for less than 1% false alarm rate. Surprisingly, the AR model-based detector systems do not perform well when the coefficients are estimated from the degraded data (the real curves). The MRF approach gives slightly better results than the SDIa detector in a real situation. The SDI detector does not perform as well as the SDIa or MRF systems and is more restrictive in its useful operating range.

Considering the AR system, more spatial support seems to yield a worse performance. This may seem to be counterintuitive, but the curves obtained when the coefficients are estimated from the original, clean data (the known curves) show a much better performance, and hence provide an explanation. Since these curves were generated using the same motion estimates from the degraded images, the worse performance

4 Using parameters estimated from the clean sequence.

Fig. 4. Performance of detectors on 60 frames of the sequence: WESTERN (probability of correct detection versus probability of false alarm).

in the practical case is due to the AR coefficient estimation process being biased by the degradation. The energy of the blotches is sometimes so large in proportion to the rest of the image patch being modeled that it causes an adverse bias in the estimation process. This leads to an increase in the false alarm rate and a decrease in the correct detection rate. When the coefficient estimation process operates on clean (known) data, the performance is much better than the SDIa or MRF systems. In this case, increasing the spatiotemporal support does help the situation, and the 5:0/0:5 (known) system performs better than the 1:0/0:1 (known) system. In the real case, increasing the support for the system worsens the bias because the block sizes used for estimation (9 x 9) are small, and hence the confidence with which coefficients can be estimated is sensitive to the number of missing pixels and the number of correlation terms that must be measured. Such small block sizes are forced because of the spatial nonhomogeneity of images.

It is interesting to note that in a real situation, the AR model-based detector would miss blotches with a low intensity difference with respect to the preceding and next frames. The reason is that the coefficients can adjust to account for low levels of grey-scale temporal discontinuity, hence yielding a low residual power.

The SDI system has a limited range of activity because of the low threshold that must be used to filter pixels with a good match in both the previous and next frames. Furthermore, the SDI ratio is not well defined in regions of occlusion and uncovering, especially if a blotch is present. The false alarm rate is seen to be higher than either the SDIa or MRF systems. Note, however, that one could choose thresholds so that the SDI performance approaches that of the SDIa. This shows that it is more difficult to use the SDI effectively.

The MRF system performs slightly better than the SDIa because it is able to incorporate the idea of spatial smoothness in the form of the blotch. Therefore, it will flag a pixel as being distorted not only if it is at the site of a temporal discontinuity, but also if it is connected to a pixel of the same grey level that was at the site of a discontinuity. It will therefore be able to detect the marginally contrasted blotches primarily because of spatial connectivity, whereas



Fig. 5. Performance of detectors on frame 49 of the sequence: WESTERN (probability of correct detection versus probability of false alarm).

    Fig. 7. Degraded frame 49 of WESTERN.

    Fig. 6. Degraded frame 48 of WESTERN.

the SDIa system would be unable to detect a blotch if the temporal differences are too low (i.e., poorly contrasted). That this only produces a small improvement in performance is understandable, as the additional blotches found are those of low grey-scale difference, which will be only a small proportion of the blotches. In a frame where this effect is significant, a larger difference in operating characteristic is observed (see Fig. 5).
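The operating points discussed in this section follow from sweeping a detector threshold over a per-pixel statistic and scoring the flags against the known corruption mask. A sketch of that bookkeeping (our own helper, not from the paper; score would be, e.g., a squared prediction error field):

```python
import numpy as np

def operating_curve(score, corrupted, thresholds):
    """Return (false alarm rate, correct detection rate) per threshold.

    score     : per-pixel detector statistic (larger = more suspicious)
    corrupted : boolean ground-truth mask of artificially blotched pixels
    """
    points = []
    for t in thresholds:
        flagged = score >= t
        p_correct = flagged[corrupted].mean()   # fraction of blotch pixels hit
        p_false = flagged[~corrupted].mean()    # fraction of clean pixels hit
        points.append((p_false, p_correct))
    return points
```

Plotting these pairs with the threshold decreasing toward the top right reproduces the shape of the curves in Figs. 4 and 5.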

Figs. 9-13 show, respectively, detection results when the SDIa, SDI, MRF, 1:0/0:1 (known), and 1:0/0:1 systems are applied to frame 49. Each figure shows the result obtained when the relevant parameters or thresholds are set so that the probability of correct detection is 90%; hence, they represent a horizontal slice across Fig. 5 at P_c = 0.90. They illustrate the points made earlier. Red represents a missed blotched pixel, green represents correctly detected blotched pixels, and brown represents false alarms. Note how none of the systems (SDIa, SDI, or AR) detect the lightly contrasted

Fig. 8. Degraded frame 50 of WESTERN.

blotch on the shoulder well. Note also that the increased false alarm rate shown in Fig. 13 occurs around the blotches and is due primarily to the influence of these discontinuities on the coefficient estimation process.

The false alarm rates for each of the detectors are: SDIa, 4.4%; SDI, 4.8%; MRF, 0.28%; AR 1:0/0:1 (known), 0.23%; AR 1:0/0:1 (estimated), 1.3%. All the detectors flag the area highlighted in Fig. 7 as a false alarm region. The main area of improvement of the MRF detector and the AR 1:0/0:1 (known) detector is the reduction in the number of single pixel false alarms flagged. These are most noticeable on the actor's shoulder, hair, and arm. The false alarm rate for the AR detector with estimated coefficients is dominated by the effect of the bias in the coefficient estimation process.




Fig. 9. Detection using SDIa on frame 49.

Fig. 11. Detection using the MRF on frame 49.

Fig. 10. Detection using SDI on frame 49.

In all of Figs. 9-13, there is an undetected region (red) on the shoulder of the main figure. This region can be seen to be only slightly contrasted in the degraded frame in Fig. 7. It is notable that all the detectors miss this region at this detection/false alarm rate; because the area is of such low contrast with the rest of the image, it is, in fact, difficult to see.

Overall then, the SDIa detector is the best in terms of the compromise it strikes between computation and accuracy. The MRF approach is the most accurate, however, and performs extremely well in the real situation where the AR-based approach fails because of poor estimation of model coefficients. It is possible to use optimal weighted estimation of coefficients to alleviate the difficulties with the use of the AR approach, as in [38]. Of course, the computational complexity would then

Fig. 12. Detection using known AR parameters, 1:0/0:1.

be increased. In cases where high fidelity of the reconstruction is required, for example still-frame viewing, the MRF detector is most suitable.

A. Errors in Motion Estimation

It is clear that motion estimation errors would adversely affect the performance of all these detectors, more so the purely temporal SDI and SDIa systems. In the interest of brevity, then, we do not include results when the motion estimates come from the clean original, but choose instead to present Figs. 6-13.

Figs. 6-8 show three frames, 48-50. A red block highlights a region in frame 49 that has been uncovered from frame 48 and partially occluded in frame 50. As stated earlier, Figs.




Fig. 13. Detection using estimated AR parameters, 1:0/0:1.

Fig. 14. Frame from actual degraded sequence.

9-13 show, respectively, detection results when the SDIa, SDI, MRF, 1:0/0:1 (known), and 1:0/0:1 systems are applied to frame 49. Note how the white coat lining of the central figure is a source of false alarms in all cases. This effectively demonstrates a fundamental limit in detection capability: areas of fast motion can represent temporal discontinuities as regions are rapidly uncovered and occluded. It is in these areas that it would be advantageous to use more than three frames in the motion estimator (in the manner of [30] and [40]-[42], for instance) to allow matches to be found when the material is again uncovered.

This problem is unavoidable; therefore, in the design of an interpolator for this missing data, robust estimators must be found, i.e., interpolators that can reconstruct large, apparently missing regions without distortion by using spatial continuity when temporal smoothness is absent. This is discussed in [43].

VI. REAL DEGRADATION

Figs. 14 and 15 show results from the application of the SDIa and MRF systems to the problem of detecting the real distortion in a motion picture frame. For brevity, only the frame concerned is shown here; the motion in the scene

consists of a vertical pan of four to five pixels per frame. The background consists of out-of-focus trees that sway in and out of shadow. The motion is typical of motion pictures: the objects in the scene move with velocities varying from small (foreground) to very large (background).

The main distortion is boxed in red in Fig. 14. The results for the SDIa and MRF systems are superimposed on the image in Fig. 15. Red pixels are those flagged as distorted by both detectors. Bright white pixels are those flagged by the SDIa process but not by the MRF process; finally, green pixels are those flagged by the MRF process and not by the SDIa process. The brightness of the image in Fig. 15 has been reduced so that the color of the flagged pixels can be more easily seen.

Fig. 15. Detection using SDIa and MRF systems. Red, both; bright white, SDIa; green, MRF.

As expected, the MRF system detects more of the large blotch due to spatial connectivity. The SDIa is unable to detect all of it because parts of the blotch match well with parts of the head in the next and previous frames. The SDIa has more false alarms in the background, but performs better on the daisy (with respect to false alarms), again because of the MRF tendency to collect pixels together. Both detectors have problems along the moving arm of the figure, because the integer-accurate motion estimation cannot properly compensate for the fractional motion here, and the edge of the arm is highly contrasted with the dark suit. Nevertheless, both detection systems detect the distortions satisfactorily.

It is useful to note that by detecting the regions of suspected distortion, the computation necessary for the next stage of




reconstruction is reduced, since the number of pixels that must be considered is a small subset of the entire frame. The rate of suspicion for the systems in this case is 1.2% and 0.88% for the SDIa and MRF, respectively.

VII. CONCLUSION

The problem of detecting blotches or missing data of this kind in image sequences is well posed in a temporal sense. The success of the purely temporal SDIa detector shows that the problem can be solved just by observing temporal information. The results have indicated that incorporating more spatial information does have some benefit, but to exploit the full potential gains it is necessary to reduce the influence of the degradation on both the motion estimator and the coefficient estimator (with respect to the AR systems). It would be an advantage if it were possible to estimate the MRF hyperparameters from the image.

The paper has discussed the problem of detecting blotches in degraded motion pictures, which pertains to the more general problem of missing data detection. Identification of the missing data regions allows efficient algorithms to be developed to interpolate these missing regions. This is discussed in [43].

APPENDIX A
GENERATION OF ARTIFICIAL BLOTCHES

Figs. 6-8 show frames from the sequence degraded with artificial blotches. These artificial blotches are a good visual match with blotches observed on real degraded sequences. The method of generation was as follows.

The Ising model is an MRF model defined on two states, with a conditional probability structure defined such that the

probability of a pixel being in a given state is proportional to the number of its neighbors in that state. The joint probability of the field, where z_i ∈ {-1, +1} and the sum runs over neighboring pairs, is thus

    P(z) = (1/Z) exp( β Σ_{<i,j>} z_i z_j ).    (22)

Samples from this model have approximately equal numbers of pixels in each state. If a term of the form δ(z_i + 1), with a positive weight, is introduced into (22), this will bias the field toward the state z_i = -1. Iterating the Gibbs sampler on this biased distribution will result, from an initialization with equal numbers of each state, in a clustering of the pixels in each state, and a gradual reduction of the number of pixels in the state z_i = 1. If the evolution of the field is stopped before a uniform picture is reached, then it will consist of small connected regions of state z_i = 1, randomly distributed across the frame. As can be seen from the figures, these regions are good simulations of the kind of amorphous distortion found in practice, and they model the location of the blotches very well.

Finally, isolated blotches were colored uniformly with a value chosen randomly from [0, 255], and then the original frames were corrupted by inserting the colored areas into the frames, replacing the original information.
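The generation procedure of this appendix can be sketched as below. The coupling, bias, sweep count, and function names are our own illustrative choices (the paper does not give values), and the coloring step is simplified to a single random grey level rather than one per isolated blotch.

```python
import numpy as np

def blotch_mask(h=48, w=48, beta=0.9, bias=0.4, n_sweeps=6, seed=1):
    """Gibbs-sample a biased Ising field and stop early, leaving small
    connected regions of state +1 to act as blotch locations."""
    rng = np.random.default_rng(seed)
    z = rng.choice([-1, 1], size=(h, w))       # roughly equal initial states
    for _ in range(n_sweeps):
        for y in range(h):
            for x in range(w):
                s = sum(z[yy, xx] for yy, xx in
                        ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                        if 0 <= yy < h and 0 <= xx < w)
                # conditional exponents: neighbor agreement, plus bias to -1
                u_plus, u_minus = beta * s, -beta * s + bias
                p_plus = 1.0 / (1.0 + np.exp(u_minus - u_plus))
                z[y, x] = 1 if rng.random() < p_plus else -1
    return z == 1

def corrupt(frame, mask, seed=2):
    """Insert the colored blotch areas into the frame, replacing the
    original information (one grey level for all blotches, for brevity)."""
    rng = np.random.default_rng(seed)
    out = frame.copy()
    out[mask] = rng.integers(0, 256)
    return out
```

Stopping after a few sweeps, rather than running to equilibrium, is what leaves the amorphous connected +1 regions described above.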

REFERENCES

[1] G. R. Arce and E. Malaret, "Motion preserving ranked-order filters for image sequence processing," in Proc. IEEE Int. Conf. Circuits Syst., 1989, pp. 983-986.
[2] G. R. Arce, "Multistage order statistic filters for image sequence processing," IEEE Trans. Signal Processing, vol. 39, pp. 1146-1161, May 1991.
[3] B. Alp, P. Haavisto, T. Jarske, K. Oistamo, and Y. Neuvo, "Median-based algorithms for image sequence processing," in SPIE Visual Commun. Image Processing, 1990, pp. 122-133.
[4] A. C. Kokaram and P. J. W. Rayner, "A system for the removal of impulsive noise in image sequences," in SPIE Visual Commun. Image Processing, Nov. 1992, pp. 322-331.
[5] A. C. Kokaram and P. J. W. Rayner, "Removal of impulsive noise in image sequences," in Singapore Int. Conf. Image Processing, Sept. 1992, pp. 629-633.
[6] A. C. Kokaram, "Motion picture restoration," Ph.D. thesis, Cambridge Univ., UK, May 1993.
[7] S. Efstratiadis and A. Katsaggelos, "A model-based, pel-recursive motion estimation algorithm," in Proc. IEEE ICASSP, 1990, pp. 1973-1976.
[8] J. Konrad and E. Dubois, "Bayesian estimation of motion vector fields," IEEE Trans. Patt. Anal. Machine Intell., vol. 14, no. 9, Sept. 1992.
[9] I. M. Abdelquader, S. A. Rajala, W. E. Snyder, and G. L. Bilbro, "Energy minimization approach to motion estimation," Signal Processing, vol. 28, pp. 291-309, 1992.
[10] M. Bierling, "Displacement estimation by hierarchical block matching," in SPIE VCIP, 1988, pp. 942-951.
[11] M. Ghanbari, "The cross-search algorithm for motion estimation," IEEE Trans. Commun., vol. 38, pp. 950-953, July 1990.
[12] J. Boyce, "Noise reduction of image sequences using adaptive motion compensated frame averaging," in IEEE ICASSP, vol. 3, 1992, pp. 461-464.
[13] W. Enkelmann, "Investigations of multigrid algorithms for the estimation of optical flow fields in image sequences," Comput. Vision Graph. Image Processing, vol. 43, pp. 150-177, 1988.
[14] H. Nagel, "Recent advances in image sequence analysis," in Premier Colloque Image: Traitement, Synthèse, Technologie et Applications, May 1984, pp. 545-558.
[15] H. Nagel and W. Enkelmann, "An investigation of smoothness constraints for the estimation of displacement vector fields from image sequences," IEEE Trans. Patt. Anal. Machine Intell., vol. PAMI-8, pp. 565-592, Sept. 1986.
[16] S. Fogel, "The estimation of velocity vector fields from time-varying image sequences," Comput. Vision Graph. Image Processing: Image Understanding, vol. 53, pp. 253-287, May 1991.
[17] J. Robbins and A. Netravali, "Recursive motion compensation: A review," in Image Sequence Processing and Dynamic Scene Analysis. Berlin, Vienna, New York: Springer-Verlag, 1983, pp. 76-103.
[18] B. Schunck, "Image flow: Fundamentals and future research," in IEEE ICASSP, 1985, pp. 560-571.
[19] J. Riveros and K. Jabbour, "Review of motion analysis techniques," Proc. IEEE, vol. 136, pp. 397-404, Dec. 1989.
[20] S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," IEEE Trans. Patt. Anal. Machine Intell., vol. PAMI-6, pp. 721-741, Nov. 1984.
[21] D. Geman, "Random fields and inverse problems in imaging," in Lecture Notes in Mathematics, vol. 1427. Berlin, Vienna, New York: Springer-Verlag, 1990, pp. 113-193.
[22] J. Besag, "Spatial interaction and the statistical analysis of lattice systems," J. Royal Statist. Soc. B, vol. 36, pp. 192-326, 1974.
[23] H. P. Hiriyannaiah, G. L. Bilbro, and W. E. Snyder, "Restoration of piecewise-constant images by mean-field annealing," J. Opt. Soc. Am., pp. 1901-1912, 1989.
[24] S. Kirkpatrick, C. Gelatt, and M. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671-680, 1983.
[25] S. Geman, D. E. McClure, and D. Geman, "A nonlinear filter for film restoration and other problems in image restoration," CVGIP: Graphical Models Image Processing, vol. 54, no. 4, pp. 281-289, July 1992.
[26] G. L. Bilbro, W. E. Snyder, and R. C. Mann, "Mean-field approximation minimizes relative entropy," J. Opt. Soc. Am., vol. 8, no. 2, pp. 290-294, Feb. 1991.
[27] P. Strobach, "Quadtree-structured linear prediction models for image sequence processing," IEEE Trans. Patt. Anal. Machine Intell., vol. 11, pp. 742-747, July 1989.
[28] A. Rosenfeld, Ed., Univariate and Multivariate Random Fields for Images. New York: Academic, 1981, pp. 245-258.
[29] A. K. Jain, Fundamentals of Digital Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 1989.




[30] R. Storey, "Electronic detection and concealment of film dirt," UK Patent Specification no. 2139039, 1984.
[31] R. Storey, "Electronic detection and concealment of film dirt," SMPTE J., pp. 642-647, June 1985.
[32] D. Chandler, Introduction to Modern Statistical Mechanics. New York: Oxford University Press, 1987.
[33] B. D. Ripley, Statistical Inference for Spatial Processes. Cambridge, UK: Cambridge University Press, 1988.
[34] R. D. Morris and W. J. Fitzgerald, "Replacement noise in image sequences: detection and correction by motion field segmentation," in Proc. ICASSP, vol. 5, 1994, pp. V245-V248.
[35] S. V. Vaseghi and P. J. W. Rayner, "Detection and suppression of impulsive noise in speech communication systems," Proc. IEEE, vol. 137, pp. 38-46, 1990.
[36] S. V. Vaseghi, "Algorithms for the restoration of archived gramophone recordings," Ph.D. thesis, Cambridge Univ., UK, 1988.
[37] R. Veldhuis, Restoration of Lost Samples in Digital Signals. Englewood Cliffs, NJ: Prentice-Hall, 1990.
[38] E. DiClaudio, G. Orlandi, F. Piazza, and A. Uncini, "Optimal weighted LS AR estimation in presence of impulsive noise," in IEEE ICASSP, 1991, pp. 3149-3152.
[39] M. Sezan, M. Ozkan, and S. Fogel, "Temporally adaptive filtering of noisy image sequences using a robust motion estimation algorithm," in Proc. IEEE ICASSP, vol. 3, 1991, pp. 2429-2431.
[40] M. Ozkan, M. I. Sezan, and A. M. Tekalp, "Adaptive motion-compensated filtering of noisy image sequences," IEEE Trans. Circuits Syst. Video Technol., pp. 277-290, Aug. 1993.
[41] M. Ozkan, A. Erdem, M. Sezan, and A. Tekalp, "Efficient multiframe Wiener restoration of blurred and noisy image sequences," IEEE Trans. Image Processing, vol. 1, pp. 453-476, 1992.
[42] A. Erdem, M. Sezan, and M. Ozkan, "Motion-compensated multiframe Wiener restoration of blurred and noisy image sequences," in IEEE ICASSP, vol. 3, 1992, pp. 293-296.
[43] A. C. Kokaram, R. D. Morris, W. J. Fitzgerald, and P. J. W. Rayner, "Interpolation of missing data in image sequences," IEEE Trans. Image Processing, this issue, pp. 1509-1519.

Anil C. Kokaram (S'91-M'92) was born in Sangre Grande, Trinidad and Tobago, on June 19, 1967. He received the B.A. degree in electrical and information engineering sciences from the Cambridge University Engineering Department, UK, in 1989. He went on to receive the M.A. and Ph.D. degrees from the Signal Processing Group of the Cambridge University Engineering Department in 1993, having worked principally on motion picture restoration. Since 1993, he has been working on other problems in archived motion picture film in his capacity of Research Associate in the Signal Processing Group. His interests encompass image sequence processing in general. He is currently engaged in such areas as image sequence noise reduction, missing data reconstruction, and motion estimation. He has applied this work in many different environments, including scanning electron microscopes and particle image velocimetry.


Robin D. Morris was born in Bury, Lancashire, UK, on December 15, 1969. He received the B.A. degree in electrical and information sciences from the Cambridge University Engineering Department in June 1991. Since then, he has qualified for the M.A. degree and is working toward the Ph.D. degree with the Signal Processing Laboratory of the same department. His research has been in the area of Bayesian inference and statistical signal processing, with applications in the area of Markov random fields applied to motion picture restoration.

In October of 1994, Mr. Morris was elected to a Junior Research Fellowship at Trinity College, Cambridge.

William J. Fitzgerald received the B.Sc. degree in physics in 1971, the M.Sc. degree in solid state physics in 1972, and the Ph.D. degree in 1974 from the University of Birmingham, UK. He worked for six years at the Institut Laue-Langevin in Grenoble, France, as a Research Scientist working on the theory of neutron scattering from condensed matter. He spent a year teaching physics at Trinity College, Dublin, and then became Associate Professor of Physics at the ETH in Zurich, working on diffuse scattering of x-rays from metallic systems as well as teaching. After several years working in industrial research in Cambridge, he took up his present position as University Lecturer in the Engineering Department, where he teaches and conducts research in signal processing.

Dr. Fitzgerald is a Fellow of Christ's College, and his research interests are concerned with Bayesian inference and model-based signal processing.

Peter J. W. Rayner received the M.A. degree from Cambridge University, UK, in 1968 and the Ph.D. degree from Aston University in 1969. Since 1968, he has been with the Department of Engineering at Cambridge University and is Head of the Signal Processing and Communications Research Group. He teaches courses in random signal theory, digital signal processing, image processing, and communication systems. His current research interests include image sequence restoration, audio restoration, nonlinear estimation and detection, and time series modeling and classification. In 1990, he was appointed to an ad hominem Readership in Information Engineering.