Neurocomputing 86 (2012) 170–178

doi:10.1016/j.neucom.2012.01.017

* Corresponding author. E-mail address: [email protected] (M. Song).

Video-based non-uniform object motion blur estimation and deblurring

Xiaoyu Deng a, Yan Shen a, Mingli Song a,*, Dacheng Tao b, Jiajun Bu a, Chun Chen a

a Zhejiang Provincial Key Laboratory of Service Robot, College of Computer Science, Zhejiang University, China
b Centre for Quantum Computation and Intelligent Systems, University of Technology Sydney, Broadway, NSW 2007, Australia

Article info

Article history:

Received 28 September 2011

Received in revised form 12 December 2011

Accepted 31 January 2012

Communicated by Y. Fu

Available online 24 February 2012

Keywords:

PSF estimation

Motion deblurring

Local feature tracking

Deconvolution


Abstract

Motion deblurring is a challenging problem in computer vision. Most previous blind deblurring approaches assume that the Point Spread Function (PSF) is spatially invariant. However, non-uniform motions exist ubiquitously and cannot be handled successfully. In this paper, we present an automatic method for object motion deblurring based on non-uniform motion information from video. First, the feature points of the object are tracked throughout a video sequence. Then, the object motion between frames is estimated and the circular blurring paths (i.e. PSFs) of each point are computed along the linear moving path in polar coordinates. Finally, an alpha matte of the blurred object is extracted to separate the foreground from the background, and an iterative Richardson–Lucy algorithm is carried out on the foreground using the obtained blurring paths. Experimental results show that our proposed approach outperforms state-of-the-art motion deblurring algorithms.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

Motion blur is a common problem in image and video recording, caused by relative motion between the camera and the object during exposure. The relative motion can be divided into two kinds: camera shake and object motion. Both kinds can be modeled as an affine transformation, which includes translation, rotation and scaling. Most state-of-the-art blind deblurring approaches share a common assumption that the Point Spread Function (PSF) is spatially invariant, and model the blurred image as a convolution of the latent image and the blur kernel [1–5]. This assumption holds only for the first kind of blur and does not hold for the latter, because the motion of the object and that of the background are different.

Blur caused by object motion cannot be avoided by using hardware such as a tripod or camera stabilizer, and little research is available to deal with this problem. Most previous work solves it by assuming linear object motion [6,7], using a hybrid or coded-exposure camera [8–10], or relying on the user's manual interaction [9,11]. In Agrawal's work [6], the authors use multiple images with different exposures to make the PSF invertible, but the PSF is assumed to be linear. Bar's work [7] makes a similar assumption, but proposes a variational model that solves deblurring and motion estimation at the same time. Wang et al. [12] treat deblurring as a classification problem and solve it using the calculus of variations; however, their method needs the PSF in advance. Tai et al. [9] try to recover a rotating object recorded with the coded exposure method [8], which is unavailable on ordinary digital cameras, and the motion of the object is estimated by the motion-from-blur method [13].

Toward recovering the latent object, most state-of-the-art blind image deblurring methods estimate a spatially invariant PSF with a MAP method [3,4]. Fergus et al. [1] introduce a natural image prior to estimate the PSF, but their method fails when the saturation of the object's color is too high. Joshi [14] and Jia [2] focus on the boundaries of objects and estimate the PSF using gradient variation on the boundaries, which needs manual interaction. One step further, Shan [4] proposes a local smoothness prior to make the reconstructed image smoother, which also needs a manual selection of a boundary region.

Traditionally, rotating objects are treated as separate regions to handle spatially variant motion. Levin [15] used a box filter to detect different blur regions, depending on the derivative distribution in blurred images. Cho et al. [16] segment the image into background and foreground, compute PSFs using a regularized energy function, and then solve it using an alternating optimization technique. Toward better separation, several authors use semi-automatic methods to extract objects: Agrawal et al. [6] and Tai et al. [9] extract the blurred object with semi-automatic alpha matting [17]; the result is improved but highly dependent on the matting result.

In order to estimate the blur parameters automatically, we need to extract low-level features to feed to a blur estimation model. Feature extraction is a critical first step for many computer vision problems. Traditionally, visual features can be categorized into corner-based and region-based features [18,19]. In recent years, more complicated features have been introduced to describe texture information in images, e.g. [20–23]. However, region-based and texture-based features are unstable when blur happens. Instead, corner-based features, e.g. [24], have more repeatability than other features. Sometimes the features lie in a very high-dimensional space, and various manifold learning methods can be used to reduce the dimensionality first [25–28].

Fig. 1. Tracked features of two adjacent frames; the motion of the moving object is represented as circles.

In this paper, we attempt to deblur moving objects with non-uniform blur in video sequences automatically. We model the motion of the object in a scene as an affine transformation without scaling, i.e. translation and rotation. We propose a feature point tracking based method to estimate the motion of the moving object. Feature points can be extracted and tracked robustly by the KLT (Kanade–Lucas–Tomasi) feature tracker [24], even on blurred objects. The motion of the object can be recovered exactly by solving two minimizations over the tracked feature points in adjacent frames (Fig. 1). Then, the PSF of the non-uniform motion blur is deduced from blur estimation, which is carried out on a transformed gradient map (gradient in the motion direction). The background binary mask is obtained using Gaussian background subtraction, and the background alpha matte is obtained by blurring the mask with the obtained non-uniform PSF. The background influence on the blurred object boundary can then be subtracted using the alpha matte to get the blurred object. We recover the latent object by applying an iterative Richardson–Lucy algorithm to the blurred object with the non-uniform PSF. Our approach can be applied to visual surveillance and personal video recordings, where interesting objects undergoing non-uniform motion carry important information to be recovered.

The major contributions of our work are as follows:

• A new non-uniform motion estimation method, which is based on an affine motion model to describe object motion and uses a feature-based method to estimate the local motions of a moving object between two frames.

• A novel blur estimation method, which operates on the object gradient map in polar coordinates, to estimate the non-uniform blur and construct the spatially variant PSF based on our motion model.

• A novel automatic alpha matte refinement method based on the previously estimated spatially variant PSF.

The remainder of this paper is organized as follows: Section 2.1 gives an overview of our video-based object motion blur estimation and deblurring method. Section 2.2 describes the motion model and the feature point tracking based method to estimate the motion of a moving object between two frames. Section 2.3 describes the blur estimation method and the spatially variant PSF construction based on our motion model. Section 2.4 describes blurred object extraction. Section 2.5 gives a short description of the spatially variant Richardson–Lucy method to remove non-uniform blur. Experiments are in Section 3, and we conclude in Section 4.

2. Video-based non-uniform object motion blur estimation and deblurring

2.1. System overview

Blur caused by object motion is similar to blur caused by camera motion but differs in two ways. First, the region of the motion is limited to the object region, so only the motions in the object region are useful. Second, the motions are spatially variant, so the image formation model cannot be represented by a single convolution. Our model can handle multiple objects successfully, but we assume a single object for simplicity; for multiple objects, we refer to Section 3.2. Traditionally, the blurred image can be represented as a convolution of the latent image F and the blur kernel K in discrete form: $I(x) = \sum_{|\Delta x| < D} F(x+\Delta x) \cdot K(-\Delta x)$.

Since we are dealing with a moving object with spatially variant motion, we formulate the image formation model as follows:

$$I(x) = \alpha(x) \sum_{|\Delta x| < D} F(x+\Delta x) \cdot K_{x+\Delta x}(-\Delta x) + (1-\alpha(x)) \cdot B(x) \quad (1)$$

where I(x) is the blurred image, F is the original foreground to be estimated, and K is the blur kernel, defined in a small region around x. The blur kernel (PSF) is determined by the object motion model at each point on the object, which is to be estimated by our method. α is the alpha matte of the object, which depends on the exposure time during which x is covered by the foreground, and B is the background.
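For concreteness, the following is a minimal numpy sketch of Eq. (1) under our own illustrative conventions (not the paper's implementation): `kernels` is assumed to be an H×W grid of (2D+1)×(2D+1) per-pixel PSFs, and each source pixel scatters its intensity along its own kernel before alpha compositing over the background.

```python
import numpy as np

def form_blurred_frame(F, B, alpha, kernels, D):
    """Sketch of Eq. (1): I = alpha * (spatially variant blur of F) + (1 - alpha) * B.

    Rewriting Eq. (1) with x' = x + dx gives I(x) = sum_x' F(x') K_x'(x - x'),
    so each source pixel x' scatters F(x') along its own kernel K_x'."""
    H, W = F.shape
    acc = np.zeros((H + 2 * D, W + 2 * D))
    for y in range(H):
        for x in range(W):
            acc[y:y + 2 * D + 1, x:x + 2 * D + 1] += F[y, x] * kernels[y, x]
    blurred = acc[D:D + H, D:D + W]   # crop the padding back off
    return alpha * blurred + (1.0 - alpha) * B
```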

In order to recover the blur caused by fast motion of the object in front of a quasi-static background, we propose a framework composed of four parts: (a) motion estimation; (b) blur estimation and PSF construction; (c) blurred object extraction; (d) deblurring with the non-uniform PSF. Fig. 2 is the flowchart of our approach.

In the motion estimation part, the motion of the object between two adjacent frames is treated as a combination of rotation and translation. Let

$$RT(o, \theta, t, x) = R(\theta)(x - o) + o + t \quad (2)$$

be the transformation function of this motion model, where o is the rotation center, θ is the rotation angle, t is the translation vector, and x is a coordinate in the image. The rotation R(θ) is a geometric rotation matrix

$$R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad (3)$$

Let x_i and x_{i+1} be the coordinates of the same point on the object in frames i and i+1, which satisfy the following relationship:

$$x_{i+1} = RT(o_i, \theta_i, t_i, x_i) \quad (4)$$

where o_i, θ_i and t_i are the parameters of the motion model in frame i. θ_i and o_i can be estimated by an optimization over the feature points tracked in frames i and i+1. The KLT feature tracker is employed to locate the feature points, since KLT is rotation invariant; see Fig. 1.
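Eqs. (2)–(4) translate directly into code; the following numpy transcription (the function names are ours) is reused by the later sketches in this section.

```python
import numpy as np

def R(theta):
    """Geometric rotation matrix of Eq. (3)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def RT(o, theta, t, x):
    """Motion model of Eq. (2): rotate x by theta about center o, then translate by t.

    o, t and x are length-2 arrays; for a 2 x n batch of points, pass o and t
    with shape (2, 1) so broadcasting works."""
    return R(theta) @ (x - o) + o + t

# Eq. (4): a point x_i on the object in frame i is predicted to appear at
# x_next = RT(o_i, theta_i, t_i, x_i) in frame i+1.
```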

Blur depends heavily on the motion of the object and records that motion over a short period; the PSF reflects the motion during the exposure time. Therefore, we estimate the PSF based on the motion model above. Let

$$p_i(x, s) = RT(o_i, \theta_i^b s, t_i^b s, x), \quad s \in [0, 1] \quad (5)$$

be a function of a point and a scale factor in frame i, where θ_i^b and t_i^b denote the rotation angle and translation vector of the object during the exposure period, called the ''blur angle'' and ''blur vector'', respectively. The blur of the object in a frame is proportional, by a factor s, to the exposure time of the frame, which is estimated in the blur estimation step (see Section 2.3), so the blur angle θ_i^b and blur vector t_i^b can be estimated from the gradient information of the blurred object. Supposing the object moves smoothly during the exposure period, the PSF of x_i has uniform value along p_i(x_i, s) (Fig. 7(b)). However, because the object undergoes non-uniform motion, the deduced PSFs of the object are spatially variant.

Fig. 2. Flow chart of our approach.

Fig. 3. Motion estimation based on feature point tracking. (a) Feature points tracked by KLT in adjacent frames. (b) Motion vectors and the estimated motion shown in circles. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

In order to extract the object in a video frame, a morphological operation on the foreground binary mask is employed to approximately match the absolute foreground. After the morphological operation, the alpha matte is obtained by our alpha matte refinement method. Finally, the refined blurred object can be extracted as follows:

$$f_i^a = I_i - b \cdot (1 - \alpha_i) \quad (6)$$

where I_i denotes frame i, f_i^a is the refined blurred object in frame i, and b denotes the background.

In order to obtain the latent object, a spatially variant Richardson–Lucy algorithm, an iterative Bayesian-based method, is employed to restore the latent object from the obtained spatially variant PSFs and the extracted blurred object.

2.2. Object motion estimation based on feature point tracking

As discussed in Section 2.1, the motion of the object between two adjacent frames is composed of rotation and translation. We rewrite Eq. (4) as

$$x_{i+1} = RT(o_i, \theta_i, t_i, x_i) = RT(0, \theta_i, \bar{t}_i, x_i) = R(\theta_i) x_i + \bar{t}_i \quad (7)$$

where

$$\bar{t}_i = t_i + o_i - R(\theta_i) o_i$$

It is noticeable that the rotation angle stays constant when the rotation center changes. Since the rotation angle θ_i is independent of the rotation center o_i, θ_i can be approximated by solving the following optimization problem over the feature points tracked in the adjacent frames by the KLT algorithm [24]:

$$\min_{\theta_i, \bar{t}_i} \| X_{i+1} - (R(\theta_i) X_i + T_i) \|^2 \quad (8)$$

where $X_i = [x_i^1, x_i^2, \ldots, x_i^n]$, $X_{i+1} = [x_{i+1}^1, x_{i+1}^2, \ldots, x_{i+1}^n]$ and $T_i = [\bar{t}_i, \bar{t}_i, \ldots, \bar{t}_i]$ are 2×n matrices whose columns are 2×1 coordinate vectors containing the feature points tracked by the KLT algorithm; x_i^j and x_{i+1}^j (j ∈ [1, n]) are the coordinates of the same feature point in frames i and i+1. Feature points in the background are filtered out by thresholding the magnitude of the motion vector (x_{i+1}^j − x_i^j); see Fig. 3(a), where the remaining points are marked with green circles. As Fig. 3(a) also shows, we filter out noisy feature points by constructing a histogram of the orientations of the motion vectors (Fig. 4); the remaining points are connected with yellow lines.
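The paper does not name its solver for Eq. (8); one standard option, sketched below, is the closed-form 2-D Procrustes solution, which minimizes exactly this least-squares objective over matched point sets.

```python
import numpy as np

def estimate_rotation_translation(Xi, Xi1):
    """Minimize Eq. (8) for (theta_i, t_bar_i).

    Xi, Xi1: 2 x n arrays of matched KLT feature points in frames i and i+1
    (background and noisy points already filtered out)."""
    mu0, mu1 = Xi.mean(axis=1), Xi1.mean(axis=1)
    P, Q = Xi - mu0[:, None], Xi1 - mu1[:, None]       # centered point sets
    # Optimal 2-D rotation: atan2 of the summed cross and dot products.
    theta = np.arctan2(np.sum(P[0] * Q[1] - P[1] * Q[0]),
                       np.sum(P[0] * Q[0] + P[1] * Q[1]))
    c, s = np.cos(theta), np.sin(theta)
    t_bar = mu1 - np.array([[c, -s], [s, c]]) @ mu0    # residual translation
    return theta, t_bar
```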

The rotation center o_i can also be approximated by solving a least-squares problem with the tracked feature points. Since the rotation radius of the object is usually much larger than the translation of the rotation center (the radius is usually tens of centimeters, while the translation is usually millimeters), the triangle $\Delta X_i X_{i+1} o_i$ is considered to be isosceles, so the inner product $P(o_i) = (X_{i+1} - X_i) \cdot [o_i - \tfrac{1}{2}(X_{i+1} + X_i)]$ is expected to be zero. Minimizing P with respect to o_i, we get

$$\min_{o_i} \left\| (X_i^T - X_{i+1}^T)\, o_i - \tfrac{1}{2} \left( (X_i^T + X_{i+1}^T) \circ (X_i^T - X_{i+1}^T) \right) \mathbf{1}_2 \right\|^2 \quad (9)$$

where ∘ denotes the element-wise matrix multiplication operator and $\mathbf{1}_2$ is the column vector $[1\ 1]^T$. Once θ_i, $\bar{t}_i$ and o_i are estimated, t_i can be obtained as

$$t_i = \bar{t}_i - o_i + R(\theta_i) o_i \quad (10)$$

As shown in Fig. 3(b), the rotation center o_i is the maximum-likelihood point where the green lines cross, which lies outside the image boundary.

2.3. Blur estimation and non-uniform PSF

Object blur occurs during the exposure, and the exposure duration is shorter than the time difference between two frames. We assume the exposure time is a fraction s of the inter-frame time, so the blur parameters θ_i^b and t_i^b are proportional to the motion parameters θ_i and t_i.

In order to estimate θ_i^b and t_i^b, we extract the coarse foreground binary mask m_i based on Gaussian background subtraction [29]. However, m_i usually contains some noise or holes due to lighting changes and the similarity between the foreground and the background.

Fig. 4. Histogram of the directions of the motion vectors of features after thresholding: the x-axis denotes the angle of the motion vectors, 10° per bin; the y-axis denotes the number of motion vectors in each bin. The bins (brown) with the most vectors are selected automatically from the histogram. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 5. Refined foreground mask extraction and blur estimation. (a) Static background b. (b) Blurred frame I_i. (c) Refined foreground binary mask m_i of frame i. (d) Blurred object extracted using m_i. (e) Ordinary gradient map. (f) Gradient map in the motion direction, g. (g) g transformed to polar coordinates, g_p. (h) Longest line segment in every single column drawn in green. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Therefore, morphological operators are applied to m_i to remove the noise and holes. As shown in Fig. 5(c), a refined binary foreground mask is obtained that finely covers the blurred object, and the foreground f_i^m (Fig. 5(d)) is obtained as follows:

$$f_i^m = I_i \circ m_i \quad (11)$$

where I_i denotes the i-th frame and ∘ denotes element-wise multiplication. Then, we calculate the gradient of the foreground (Fig. 5(f)) in the motion direction as follows:

$$g(x) = f_i^m(x') - f_i^m(x) \quad (12)$$

where $x' = R(\theta_g)(x - o_i) + o_i + (\theta_g / \theta_i)\, t_i$ denotes the coordinate of point x when the object rotates by θ_g, the histogram-filtered most likely angle in polar coordinates (see Fig. 5(h)), and g is the gradient map in Cartesian coordinates. We transform g to polar coordinates as defined in Eq. (13) for better representation; specifically, $x_g = R(\theta_p)([0\ d]^T - o_i) + o_i$ here. Fig. 5(g) shows that the blur curves are pulled straight along the angle axis:

$$g_p(d, \theta_p) = g(x_g) \quad (13)$$

where g_p denotes the gradient map in polar coordinates. For each column of g_p, we find the longest line segment to represent the dominant motions, and then construct a histogram of the segment lengths, so the most likely length l can be approximated by the maximum of the histogram. Thus the ''blur angle'' $\theta_i^b = \theta_g l$ and ''blur vector'' $t_i^b = (\theta_i^b / \theta_i)\, t_i$ are obtained.
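The longest-segment vote can be sketched as follows; the layout of g_p (one radius per column, angle along the rows, one bin of width θ_g per row) and the binarization threshold are our assumptions, since the paper does not spell them out.

```python
import numpy as np

def longest_run(mask_1d):
    """Length of the longest contiguous run of True values in a 1-D array."""
    best = cur = 0
    for v in mask_1d:
        cur = cur + 1 if v else 0
        best = max(best, cur)
    return best

def estimate_blur_angle(gp, theta_g, rel_thresh=0.1):
    """Vote for the blur length l over the columns of the polar gradient map gp
    and return the blur angle theta_b = theta_g * l (Section 2.3)."""
    mask = np.abs(gp) > rel_thresh * np.abs(gp).max()    # binarize the gradient map
    lengths = [longest_run(mask[:, d]) for d in range(mask.shape[1])]
    lengths = [l for l in lengths if l > 0]
    if not lengths:
        return 0.0
    l = np.bincount(lengths).argmax()                    # mode of the length histogram
    return theta_g * l
```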

With the estimated ''blur angle'' and ''blur vector'', we construct the PSF of x_i in frame i as a 2D map:

$$PSF(y) = \begin{cases} 1/l_p, & y \in P \\ 0, & y \notin P \end{cases} \quad (14)$$

where y is the coordinate of the PSF, P represents the image domain of the function p_i(x, s) when x = x_i, s ∈ [0, 1], and l_p denotes the length of the blur line (Fig. 5(h)), which is position-independent in polar coordinates. In discrete form, the PSF is constructed as follows:

$$PSF(y) = \begin{cases} 1/N, & y \in P_d \\ 0, & y \notin P_d \end{cases} \quad (15)$$

where P_d denotes the image domain of the function p_i(x, s) when x = x_i, s = 1/N, 2/N, …, N/N, and N is the number of divisions of the ''blur angle'' or ''blur vector''.
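In practice the discrete PSF of Eq. (15) never needs to be rasterized as a dense 2D map: it suffices to sample the N path positions p_i(x, n/N), each carrying weight 1/N. A sketch, inlining the RT transcription above:

```python
import numpy as np

def psf_support(x, o, theta_b, t_b, N):
    """Sample the support P_d of Eq. (15): positions p_i(x, n/N), n = 1..N,
    each with uniform weight 1/N."""
    pts = []
    for n in range(1, N + 1):
        s = n / N
        c, sn = np.cos(theta_b * s), np.sin(theta_b * s)
        Rs = np.array([[c, -sn], [sn, c]])
        pts.append(Rs @ (x - o) + o + t_b * s)           # p_i(x, s) per Eq. (5)
    return np.array(pts)                                 # N x 2 array of positions
```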


2.4. Refined blurred object extraction with non-uniform PSF

Although the binary mask m_i and the foreground f_i^m are obtained, it is noticeable that there are still some background pixels on the boundary of the blurred object (Fig. 5(d)).

In our approach, motivated by the conventional alpha matting method [6,9], we develop a new PSF-based approach to refine the blurred object alpha matte extraction with the obtained non-uniform PSF. In conventional deblurring approaches, the foreground mask m_i is usually blurred by the obtained PSF to approximate the blurred object alpha matte. However, the extracted foreground is larger than the latent object because the foreground mask covers the blurred edge, which leads to an incorrect synthesis of the blurred object. Therefore, we develop an improved method to extract the background alpha matte. Since a single round of blurring with the PSF will not shrink the area of the foreground mask, we blur the background mask m′_i = 1 − m_i twice, using the PSF and the negative PSF, as follows:

$$\alpha'_i(x_i) = |PSF \otimes m'_i| + |PSF^- \otimes m'_i| - m'_i(x_i) = \frac{1}{N}\sum_{n=1}^{N} m'_i(p_i(x_i, n/N)) + \frac{1}{N}\sum_{n=1}^{N} m'_i(p_i(x_i, -n/N)) - m'_i(x_i) \quad (16)$$

where PSF⁻ denotes the PSF of x_i with negative ''blur angle'' −θ_i^b and negative ''blur vector'' −t_i^b. As shown in Fig. 6(c), the blurred edge of the alpha matte stays consistent with that of the blurred object. Finally, the refined blurred object (Fig. 6(d)) can be obtained as follows:

$$f_i^a = I_i - b \circ \alpha'_i \quad (17)$$
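Eqs. (16)–(17) amount to averaging the background mask along the forward and reverse blur paths and subtracting it once. Below is a minimal sketch with nearest-neighbor sampling and border clamping, both our simplifications:

```python
import numpy as np

def refine_alpha(m, o, theta_b, t_b, N):
    """Background alpha map alpha'_i of Eq. (16) from the foreground mask m."""
    H, W = m.shape
    mp = 1.0 - m                                              # background mask m'
    ys, xs = np.mgrid[0:H, 0:W]
    pts = np.stack([xs.ravel(), ys.ravel()]).astype(float)    # 2 x HW, (x, y) order
    acc = -mp.ravel().copy()                                  # the -m'(x) term
    for sign in (+1.0, -1.0):                                 # PSF and PSF^- paths
        for n in range(1, N + 1):
            s = sign * n / N
            c, sn = np.cos(theta_b * s), np.sin(theta_b * s)
            Rs = np.array([[c, -sn], [sn, c]])
            p = Rs @ (pts - o[:, None]) + o[:, None] + t_b[:, None] * s
            px = np.clip(np.rint(p[0]).astype(int), 0, W - 1)
            py = np.clip(np.rint(p[1]).astype(int), 0, H - 1)
            acc += mp[py, px] / N
    return acc.reshape(H, W)

# Eq. (17): refined blurred object f_a = I - b * refine_alpha(m, o, theta_b, t_b, N)
```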

2.5. Deblurring with the spatially variant Richardson–Lucy algorithm

Because the object in the video frames undergoes non-uniform motion during exposure, we model the blurred frame as an integral of the latent one:

$$f_i^a = \int_0^1 H(o_i, \theta_i^b s, t_i^b s, f_i)\, ds \quad (18)$$

In discrete form, Eq. (18) is rewritten as

$$f_i^a = \sum_{n=1}^{N} H\!\left(o_i, \frac{n}{N}\theta_i^b, \frac{n}{N}t_i^b, f_i\right) \quad (19)$$

where H denotes the transformation operation on the latent frame, which is a pixel-wise transform of the whole image using RT(o, θ, t, x); f_i is the latent object in frame i and f_i^a is the blurred one; θ_i^b and t_i^b are the blur angle and blur vector; and s is the scale factor of θ_i^b and t_i^b.

The conventional Richardson–Lucy algorithm [30,31] is designed to restore images degraded by a spatially invariant filter [9,32]. Given the blurred image f_i^a, the deblurred value at position x can be calculated iteratively as

$$f_i^{k+1}(x) = f_i^k(x) \cdot \sum_n \frac{K(x_n - x) \cdot f_i^a(x_n)}{\sum_j K(x_n - x_j) \cdot f_i^k(x_j)} \quad (20)$$

Fig. 6. The method to extract the refined blurred object with the α map. (a) Refined foreground binary mask m_i. (b) Blurred background mask α′_i. (c) Blurred foreground mask α_i. (d) Refined blurred object extracted with α′_i.

In our problem setting, the PSF is spatially variant, so we deblur the blurred object with the constructed PSF for every single point iteratively. Given a blurred point x,

$$f_i^{k+1}(x) = f_i^k(x) \cdot \frac{1}{N} \sum_{n=1}^{N} \frac{f_i^a(y_n)}{\frac{1}{N}\sum_{m=1}^{N} f_i^k(z_m)} \quad (21)$$

where $y_n = p_i(x, n/N)$ and $z_m = p_i(y_n, -m/N)$; $f_i^k$ is the updated version of the latent object after the k-th iteration. We assign $f_i^0 = f_i^a$ as the initial value of the iterative algorithm.

The details of the proposed video deblurring algorithm are given in Algorithm 1.

Algorithm 1. Video deblurring.

Input: adjacent frames I_i, I_{i+1}; static background b
Output: latent object f_i

1. Track feature points with the KLT tracker and filter out the unqualified ones: (X_i, X_{i+1}) = KLT(I_i, I_{i+1}).
2. Estimate the rotation center o_i, rotation angle θ_i and translation vector t_i of frame i:
   min over (θ_i, t̄_i) of ‖X_{i+1} − (R(θ_i)X_i + T_i)‖²
   min over o_i of ‖(X_i^T − X_{i+1}^T)o_i − ½((X_i^T + X_{i+1}^T) ∘ (X_i^T − X_{i+1}^T))1_2‖²
   t_i = t̄_i − o_i + R(θ_i)o_i
3. Estimate the blur angle θ_i^b and blur vector t_i^b:
   3.1 Obtain the foreground mask m_i with Gaussian background subtraction and refine it with morphological operations: m_i = Gaussian(I_i), m_i = Morphology(m_i).
   3.2 Extract the foreground with m_i: f_i^m = I_i ∘ m_i.
   3.3 Calculate the gradient map g of f_i^m and transform it to polar coordinates as g_p.
   3.4 Estimate θ_i^b and t_i^b by constructing a histogram of g_p: (θ_i^b, t_i^b) = Histogram(g_p).
4. Construct the PSF for x_i with the estimated o_i, θ_i^b and t_i^b.
5. Estimate the background alpha map α′_i and extract the refined blurred object: m′_i = 1 − m_i, α′_i(x_i) = |PSF ⊗ m′_i| + |PSF⁻ ⊗ m′_i| − m′_i(x_i), f_i^a = I_i − b ∘ α′_i.
6. Obtain the latent object with our iterative Richardson–Lucy algorithm: f_i = RL(f_i^a).

Return: f_i
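For reference, the update of Eq. (21) admits a direct (unoptimized) transcription, again with nearest-neighbor sampling; the iteration count and epsilon guard below are our choices, and each iteration costs O(N²·HW):

```python
import numpy as np

def rt_points(pts, o, theta, t):
    """Apply RT(o, theta, t, .) of Eq. (2) to a 2 x M array of points."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]]) @ (pts - o[:, None]) + o[:, None] + t[:, None]

def sample(img, pts):
    """Nearest-neighbor lookup at 2 x M (x, y) positions, clamped to the border."""
    H, W = img.shape
    px = np.clip(np.rint(pts[0]).astype(int), 0, W - 1)
    py = np.clip(np.rint(pts[1]).astype(int), 0, H - 1)
    return img[py, px]

def rl_spatially_variant(f_a, o, theta_b, t_b, N=8, iters=20, eps=1e-6):
    """Iterate Eq. (21) starting from f^0 = f_a (grayscale; run per channel)."""
    H, W = f_a.shape
    ys, xs = np.mgrid[0:H, 0:W]
    x0 = np.stack([xs.ravel(), ys.ravel()]).astype(float)     # 2 x HW pixel grid
    f = f_a.copy()
    for _ in range(iters):
        ratio_sum = np.zeros(H * W)
        for n in range(1, N + 1):
            yn = rt_points(x0, o, theta_b * n / N, t_b * n / N)       # y_n = p_i(x, n/N)
            denom = np.zeros(H * W)
            for m in range(1, N + 1):
                zm = rt_points(yn, o, -theta_b * m / N, -t_b * m / N)  # z_m = p_i(y_n, -m/N)
                denom += sample(f, zm) / N
            ratio_sum += sample(f_a, yn) / (denom + eps)
        f = (f.ravel() * ratio_sum / N).reshape(H, W)
    return f
```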

3. Experiments and results

In our experiments, all video data are recorded by a Canon 5D Mark II mounted on a tripod for stabilization, and the background is quasi-static. The whole process is carried out automatically, without user interaction.


Fig. 7. PSF construction for a single point. (a) An amplified white point on a black background. (b) PSF: the PSF of this white point moving along a circular path. (c) PSF⁻: the PSF of this white point moving along a circular path in the opposite direction. (d) $|f_i \otimes PSF|$.

Fig. 8. Deblurring results for five video samples, one per row. The original frame is in the first column; the foreground is extracted for comparison. Two reference algorithms, Fergus's [1] and Whyte's [32], are in the second and third columns. Our results are shown in the last column.


We compare our approach with two representative approaches proposed by Whyte et al. [32] and Fergus et al. [1]. Fergus's approach is widely used in image deblurring and is spatially invariant. Whyte's approach deals with translation and scaling by constructing a spatially variant motion kernel. As shown in Fig. 8, Fergus's method fails to recover the objects because the motion in the scene is non-uniform. Whyte's approach improves on Fergus's method and can handle rotation, but the resulting images are subject to noise. In contrast, the proposed approach successfully restores the latent objects from the blurred frames. We tested five objects in different scenarios; as demonstrated in Fig. 8, the results obtained by our approach are convincing.

3.1. Time scaling and translation

The camera exposure time is crucial for the size of the motion blur. When the exposure time is short, rotation can be decomposed into linear motions. For example, Levin [15] used a box filter to estimate spatially variant motion blur by decomposing the image into small regions in which pixel motions can be treated as linear.

In our model, rotation is modeled explicitly and can be used in the subsequent deconvolution process, so compared with traditional linear methods, our model is stable when the exposure time varies. Here, we ran some experiments using synthetic images with varying exposure time; accordingly, the rotation angle varies from 0.8° to 3.2°, counterclockwise. In Fig. 9, we show our deblurring results in the second row. For comparison, Fergus's [1] results are shown in the third row. Because Fergus's method is spatially invariant, the deblurring quality is clearly unstable. On the contrary, our method produces clear results whenever the blurring angle changes, with only some ringing artifacts on the hand. The original images are in the first row.

Meanwhile, we want to see whether our algorithm remains functional under linear motion, which is a special case of translation and rotation. We tested pure translation sequences, and our motion model estimation yields reasonable results: the estimated rotation radius is large and the rotation angle becomes nearly zero. We show our deblurring results for linear motion in Fig. 10; notice in the upper right figure that the estimated lines toward the rotation center are nearly parallel.

3.2. Multiple objects

We also tested our algorithm on video with multiple objects. After background subtraction, several foreground blobs are obtained. In this paper, we assume different objects do not occlude each other (techniques to split regions using edges or optical flow exist, which we will not address in this paper). When the objects are detected, since each object undergoes a different motion, we can process multiple objects separately; see Fig. 11.

Fig. 9. Time-scaling deblurring results. Deblurred results for 0.8°, 1.6°, 2.4° and 3.2° in each column. The first row is the blurred images. Our results are in the second row. Fergus's [1] are in the third row.

Fig. 10. Linear deblurring results. The upper left figure is the KLT feature match result. The upper right figure shows the estimated lines between the rotation center and the feature points. The bottom figures show the blurred image and the reconstructed result.

Fig. 11. Deblurring results of multiple objects.

4. Conclusion and discussion

In this paper, we have proposed a novel approach to restore blurred moving objects in video based on feature point tracking. Our method automatically recovers the motion of the moving object, including rotation and translation, and the blur is estimated from the gradient in the moving direction. The non-uniform PSFs are then constructed based on the estimated ''blur angle'' and ''blur vector''. Afterwards, the alpha matte of the blurred object is estimated with the obtained PSF. Finally, the blurred moving object is successfully recovered by a spatially variant Richardson–Lucy algorithm with the non-uniform PSF.

We also note that the proposed motion model does not take into account scale changes of the object; in future work, we will devote more effort to dealing with blur caused by arbitrary motion. In addition, there are still some ringing artifacts in the background of the restored frames, which is a common problem of Richardson–Lucy based algorithms. In the future, we will investigate a new deconvolution model to reduce these artifacts.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61170142) and the Program for New Century Excellent Talents in University (NCET-09-0685).

References

[1] R. Fergus, B. Singh, A. Hertzmann, S.T. Roweis, W.T. Freeman, Removing camera shake from a single photograph, in: ACM SIGGRAPH, 2006.
[2] J. Jia, Single image motion deblurring using transparency, in: IEEE Conference on Computer Vision and Pattern Recognition, 2007.
[3] A. Levin, Y. Weiss, F. Durand, W.T. Freeman, Understanding and evaluating blind deconvolution algorithms, in: IEEE Conference on Computer Vision and Pattern Recognition, 2009.
[4] Q. Shan, J. Jia, A. Agarwala, High-quality motion deblurring from a single image, in: ACM SIGGRAPH, 2008.
[5] I. Gallo, E. Binaghi, M. Raspanti, Semi-blind image restoration using a local neural approach, Neurocomputing 73 (1–3) (2009) 389–396.
[6] A. Agrawal, Y. Xu, R. Raskar, Invertible motion blur in video, in: ACM SIGGRAPH, 2009.
[7] L. Bar, B. Berkels, A variational framework for simultaneous motion estimation and restoration of motion-blurred video, in: IEEE International Conference on Computer Vision, 2007.
[8] R. Raskar, A. Agrawal, J. Tumblin, Coded exposure photography: motion deblurring using fluttered shutter, in: ACM SIGGRAPH, 2006.
[9] Y.-W. Tai, N. Kong, S. Lin, S.Y. Shin, Coded exposure imaging for projective motion deblurring, in: IEEE Conference on Computer Vision and Pattern Recognition, 2010.
[10] Y.-W. Tai, H. Du, M.S. Brown, S. Lin, Image/video deblurring using a hybrid camera, in: IEEE Conference on Computer Vision and Pattern Recognition, 2008.
[11] Q. Shan, W. Xiong, J. Jia, Rotational motion deblurring of a rigid object from a single image, in: IEEE International Conference on Computer Vision, 2007.
[12] L. Wang, Y. Huang, X. Luo, Z. Wang, S. Luo, Image deblurring with filters learned by extreme learning machine, Neurocomputing 74 (16) (2011) 2464–2474.
[13] S. Dai, Y. Wu, Motion from blur, in: IEEE Conference on Computer Vision and Pattern Recognition, 2008.
[14] N. Joshi, R. Szeliski, D.J. Kriegman, PSF estimation using sharp edge prediction, in: IEEE Conference on Computer Vision and Pattern Recognition, 2008.
[15] A. Levin, Blind motion deblurring using image statistics, in: Annual Conference on Neural Information Processing Systems, 2006.
[16] S. Cho, Y. Matsushita, S. Lee, Removing non-uniform motion blur from images, in: IEEE International Conference on Computer Vision, 2007.
[17] A. Levin, D. Lischinski, Y. Weiss, A closed form solution to natural image matting, in: IEEE Conference on Computer Vision and Pattern Recognition, 2006.
[18] D. Tao, X. Li, Neurocomputing for vision research, Neurocomputing 71 (10–12) (2008) 1769–1770.
[19] M. Song, D. Tao, Z. Sun, X. Li, Visual-context boosting for eye detection, Trans. Syst. Man Cybern. Part B 40 (6) (2010) 1460–1467.
[20] X. Gao, W. Lu, X. Li, D. Tao, Wavelet-based contourlet in quality evaluation of digital images, Neurocomputing 72 (1–3) (2008) 378–385.
[21] X. Gao, W. Lu, D. Tao, X. Li, Image quality assessment based on multiscale geometric analysis, IEEE Trans. Image Process. 18 (7) (2009) 1409–1423.
[22] W. Lu, K. Zeng, D. Tao, Y. Yuan, X. Gao, No-reference image quality assessment in contourlet domain, Neurocomputing 73 (4–6) (2010) 784–794.
[23] J. Yu, D. Tao, M. Wang, Semi-automatic cartoon generation by motion planning, Multimedia Syst. 17 (5) (2011) 409–419.
[24] J. Shi, C. Tomasi, Good features to track, in: IEEE Conference on Computer Vision and Pattern Recognition, 1994.
[25] T. Zhou, D. Tao, X. Wu, Manifold elastic net: a unified framework for sparse dimension reduction, Data Min. Knowl. Discovery 22 (3) (2011) 340–371.
[26] N. Guan, D. Tao, Z. Luo, B. Yuan, Manifold regularized discriminative nonnegative matrix factorization with fast gradient descent, IEEE Trans. Image Process. 20 (7) (2011) 2030–2048.
[27] N. Guan, D. Tao, Z. Luo, B. Yuan, Non-negative patch alignment framework, IEEE Trans. Neural Networks 22 (8) (2011) 1218–1230.
[28] D. Tao, X. Li, X. Wu, S.J. Maybank, Geometric mean for subspace selection, IEEE Trans. Pattern Anal. Mach. Intell. 31 (2) (2009) 260–274.
[29] M. Piccardi, Background subtraction techniques: a review, in: International Conference on Systems, Man and Cybernetics, 2004.
[30] L.B. Lucy, An iterative technique for the rectification of observed distributions, Astron. J. 79 (1974) 745–754.
[31] W.H. Richardson, Bayesian-based iterative method of image restoration, J. Opt. Soc. Am. 62 (1) (1972) 55–59.
[32] O. Whyte, J. Sivic, A. Zisserman, J. Ponce, Non-uniform deblurring for shaken images, in: IEEE Conference on Computer Vision and Pattern Recognition, 2010.

Xiaoyu Deng received the B.S. degree in Computer Science from Zhejiang University, China, in 2005. He is currently a Ph.D. candidate in Computer Science at Zhejiang University. His research interests include image processing, pattern recognition, and video surveillance.

Yan Shen received the B.S. degree in Computer Science from Zhejiang University, China, in 2007. He is currently a Ph.D. candidate in Computer Science at Zhejiang University. His research interests include image processing, pattern recognition, and graphics.

Mingli Song received the Ph.D. degree in Computer Science from Zhejiang University, China, in 2006. He is currently an associate professor at the College of Computer Science and Microsoft Visual Perception Laboratory, Zhejiang University. His research interests include visual perception analysis, image enhancement, and face modeling. He is a member of the IEEE and the IEEE Computer Society.

Dacheng Tao received the B.Eng. degree from the University of Science and Technology of China (USTC), the M.Phil. degree from the Chinese University of Hong Kong (CUHK), and the Ph.D. degree from the University of London. Currently, he is a Nanyang assistant professor with the School of Computer Engineering at the Nanyang Technological University, a research associate professor at the University of London, a visiting professor at Xidian University, and a guest professor at Wuhan University. He mainly applies statistics and mathematics to data analysis problems in cognitive science, computer vision, human computer interaction, multimedia, machine learning, and compressed sensing. He has authored or coauthored more than 150 scientific articles in top venues, including IEEE TPAMI, TIP, TKDE, NIPS, AISTATS, CVPR, ECCV, ICDM, ACM TKDD, KDD, and Cognitive Computation, with several best paper awards. His Erdős number is 3 and he holds the K.C. Wong Education Foundation Award. He is an associate editor of the IEEE Transactions on Knowledge and Data Engineering, as well as of several Elsevier and Springer journals. He has served on committees for more than 100 major international conferences and more than 40 prestigious international journals. He is a member of the IEEE and the IEEE Computer Society.

Jiajun Bu received the B.S. and Ph.D. degrees in Computer Science from Zhejiang University, China, in 1995 and 2000, respectively. He is a professor in the College of Computer Science and the Director of the Embedded System and Software Center at Zhejiang University. His research interests include video coding in embedded systems, data mining, and mobile databases.

Chun Chen is a professor in the College of Computer Science at Zhejiang University, China. His research interests include computer vision, computer graphics, and embedded technology.