
The Visual Computer manuscript No. (will be inserted by the editor)

Kevin Kreiser · Jingyi Yu

Real-time Projector Depixelation for Videos

Abstract The screen-door effect, or projector pixelation, is a visual artifact produced by many digital projectors. In this paper, we present a real-time projector depixelation framework for displaying high resolution videos. To attenuate pixelation, we use the common defocusing approach of setting the projection slightly out-of-focus. Using a camera-projector pair, our system efficiently measures the spatially-varying defocus kernel and stores it as a texture on the graphics hardware. We explore two novel techniques to compensate for the defocus blur in real time. First, we develop a steepest descent algorithm on the GPU for estimating the optimally deblurred video frames based on the measured defocus kernel. Second, we present a novel optical flow algorithm that uses an illumination ratio map (IRM) to model illumination transformations between consecutive frames. We store both the IRM and the optical flow as textures. To process a frame at runtime, we use the textures to warp the optimized previous frame into an initialization for the iterative GPU optimization. We show that this warping accelerates our GPU optimization by an order of magnitude. Experiments on various video clips show that our framework achieves real-time video playback with significantly reduced pixelation and defocus while preserving fine details and temporal coherence.

Keywords Pixelation · Defocus · Deblur · GPU · Projector Display

Kevin Kreiser, University of Delaware, Newark, DE 19711. E-mail: [email protected]

Jingyi Yu, University of Delaware, Newark, DE 19711. E-mail: [email protected]

1 Introduction

The screen-door effect, or projection pixelation, is a visual artifact produced by most digital projectors, in which the gaps (called dead zones) separating the projector's pixels become visible in the projected image. Although the latest generation of DLP projectors provides closer spacing of the mirror elements to reduce pixelation artifacts, some space is still required along the mirrors' edges to provide control circuit pathways. The pixelation artifact is more apparent when one projects images with a resolution higher than the projector's: the spatial digitization creates jagged boundaries due to undersampling and magnifies the pixelation.

A common approach to mitigate pixelation artifacts is projector defocus, wherein the projector is deliberately set a little out-of-focus [27]. By defocusing the projector, a small amount of light leaks into the dead zones and blurs the boundaries between pixels. However, it also introduces blurriness into the projection. Many methods have been proposed to compensate for the defocus blur, ranging from the classical Wiener filter [11] and the Richardson-Lucy algorithm [20] to the recently proposed graph cuts [18] and belief propagation [25]. Other techniques specific to projector deblurring, such as image preconditioning [5] and focal pre-correction [17], have also been proposed. Although these algorithms are effective for deblurring still images, they are computationally expensive and are therefore used to preprocess images before projection.

In this paper, we present a real-time projector depixelation framework for displaying high resolution videos. Our framework also uses the defocus approach. Using a camera-projector pair, our system efficiently measures the spatially-varying defocus kernel and stores it as a texture on the graphics hardware. We explore two novel techniques to compensate for the defocus blur in real time. First, we develop an iterative optimization algorithm on the GPU for estimating the optimally deblurred frames based on the measured defocus kernel. Second, we present a novel optical flow algorithm that uses an illumination ratio map (IRM) to model illumination transformations between consecutive frames. We precompute both the IRM and the optical flow for each frame and load them as textures at runtime to warp the optimized previous frame into an initialization for the iterative GPU optimization. We show that our GPU optimization achieves an order of magnitude acceleration using this warping. We test our framework on various video clips and show that we are able to achieve real-time video playback with significantly reduced pixelation and defocus while preserving fine details and temporal coherence. The complete system pipeline is shown in Figure 1.

The specific contributions of this work are:

– A real-time projector depixelation framework based on defocusing for displaying high resolution videos

– A GPU-based steepest descent algorithm for deblurring individual frames of video

– An optical flow algorithm that handles illumination variations across frames using an illumination ratio map (IRM)

– An optical flow/IRM based warping method to accelerate the iterative GPU optimization

2 Previous Work

The recent introduction and rapid adoption of consumer digital projectors has redefined the landscape for displaying images and videos. High resolution and high contrast projectors are increasingly used in a wide array of commercial and scientific applications, ranging from shape acquisition [28,7] to virtual environments [19] and IMAX theaters [13]. Many of these applications require that the projected images maintain high sharpness, low aliasing [27], and low brightness variation [8].

The most commonly observed visual artifact produced by digital projectors is projector pixelation, or the screen-door effect: the fine lines separating the projector's pixels become visible in the projected image. The screen-door effect can be mitigated by defocusing the projector a little in front of (or behind) the projection screen to allow a slight amount of light to leak into the pixel gaps [27,5]. However, defocusing the projector results in a blurry image, which can make viewing strenuous since one's eyes will constantly try to bring the projection into focus. Although multiple projectors could be combined to fill in the pixel gaps [3,14], multi-projector approaches require pixel-level-accurate geometric and radiometric calibration.

Image deblurring algorithms have been widely used to compensate for defocus blur. However, the problem of deblurring (deconvolution) is inherently ill-posed in that multiple distinct solutions exist for the same convolution kernel. Several methods have been proposed for approximating a solution. The classical Wiener filter [11] attempts to statistically estimate the image's blur kernel and uses regularization to compute the inverse kernel.

Computer vision methods such as graph cuts [18] and belief propagation [25] have also been used to recover nearly optimal deblurred images. Mathematically, image deblurring can be formulated as a bound-constrained quadratic programming problem [16]. Iterative methods based on steepest descent [27] and conjugate gradient [11] have been used to efficiently locate a locally optimal solution. However, most of these approaches are computationally expensive and cannot achieve real-time performance.

Our approach focuses on real-time video deblurring and uses programmable graphics hardware (the GPU). The recent addition of programmable graphics chipsets has led to a myriad of efforts to exploit them for general-purpose computation, in particular iterative linear algorithms including the sparse-matrix conjugate gradient [4], Gauss-Seidel [12], and the fast Fourier transform [15]. The stream-based architecture and large texture size of the graphics hardware have also enabled new computational video algorithms [6,2,1]. In this paper, we combine GPU-based linear algorithms and computational video methods for real-time video deblurring.

A key factor that determines the performance of iterative linear optimization is the initial condition. We present a new optical flow estimation method for warping the optimized previous frame into an initialization for the GPU-based optimization. Classical optical flow methods [9] assume that scene objects maintain consistent appearance in neighboring frames. Tremendous effort has been devoted to robustly handling textures, noise, and occlusion boundaries [21,23]. However, these algorithms are sensitive to illumination changes such as varying shadows or lighting, which commonly appear in videos. In this paper, we propose finding the illumination transformation between two frames by calculating the intensity ratio of each scene point in the frames. The resulting ratio image is called the illumination ratio map (IRM). Jacobs et al. [10] have shown that illumination ratio images tend to exhibit spatial smoothness, although abrupt changes may occur across shadow and occlusion boundaries. We develop efficient GPU algorithms based on the bilateral filter [26] to simultaneously recover the optical flow and the IRM, and we use the two maps as initialization for the GPU-based optimization.

Before proceeding, we clarify our notation. F_in represents the source video and F_opt the optimized video in our framework. F_in(n) and F_opt(n) represent the n-th frame of the input and optimized output, respectively. OF(n) and IRM(n) represent the optical flow field and illumination ratio map between frames F_in(n-1) and F_in(n). Superscripts such as F^i_opt(n) denote the i-th iteration of the estimate of F_opt(n).


Fig. 1 Our real-time projector depixelation framework. Our system first defocuses the projector and measures the blur kernel. At runtime, to process frame F_in(n), we warp the optimized previous frame F_opt(n-1) using the optical flow and then correct it using an Illumination Ratio Map (IRM). Finally, we use the warped F_opt(n-1) as the initial guess to find the optimal deblurred frame on the GPU.

3 Kernel Estimation

To attenuate pixelation, we use the common defocusing approach of setting the projection a little out-of-focus. Defocusing the projector, however, blurs the projected image, as shown in Figure 5. To measure the blur kernel, we project a binary dot pattern onto the screen, as shown in Figures 2(a) and 2(b). We then use a camera to capture the projected pattern Γ. A similar setup was proposed by Zhang and Nayar [27], where a beam splitter was used to form a coaxial camera-projector system. The coaxial camera-projector system avoids rectifying Γ but requires accurate geometric calibration.

We do not require that the camera and projector share the same optical center. Instead, we assume the projection surface is planar and use a homography to warp the captured pattern Γ. Our system automatically detects the center of each blurred dot and uses Singular Value Decomposition (SVD) to compute the homography.

Since the captured pattern Γ contains an ambient component, we capture an ambient image Γ_ambient by turning off the projector and subtract it from Γ, such that Φ = Γ − Γ_ambient.
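For concreteness, a minimal NumPy sketch of this calibration step is given below. It assumes the blurred dot centers have already been detected in the camera image and that their ideal positions in projector space are known; the homography is estimated with the standard DLT formulation solved by SVD. Function and variable names are ours, not from the authors' implementation.

```python
import numpy as np

def estimate_homography(src, dst):
    """DLT: find the 3x3 homography H mapping src -> dst from >= 4 point pairs via SVD."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=np.float64))
    H = Vt[-1].reshape(3, 3)      # null-space vector = homography entries
    return H / H[2, 2]

def remove_ambient(captured, ambient):
    """Phi = Gamma - Gamma_ambient, clipped so radiance stays non-negative."""
    return np.clip(captured.astype(np.float32) - ambient.astype(np.float32), 0, None)
```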

To measure the spatially-varying blur kernel K, we warp Φ using the estimated homography and detect the center c of each dot. We then place a window w of size t × t centered at c for each blurred dot. For every pixel [u, v] within window w, the kernel k_c[u, v] is normalized as:

$$k_c[u,v] = \frac{I[u,v]}{\sum_{u,v \in w} I[u,v]} \qquad (1)$$

where I[u, v] represents the radiance received at pixel [u, v].

Fig. 2 Measuring the defocus kernel. (a) We use a camera to capture the projected binary pattern and use a homography to warp the pattern. (b) The warped camera image with samples of the blur kernel. (c) We interpolate the spatially varying kernels and pack them into a single texture.

Since we only measure a sparse sampling of the spatially-varying kernels, we bilinearly interpolate between the neighboring kernels to estimate the deblur kernel k_{p,q} for every pixel [p, q] in the image Φ. We then pack the kernels into a large spatially-varying kernel texture K that will be used by our GPU-based deblurring algorithm, as shown in Figure 2(c).
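The normalization of Equation 1 and the bilinear interpolation between neighboring sampled kernels can be sketched as follows (a CPU reference in NumPy; the regular dot-grid indexing and the side-by-side block layout of the packed texture are our assumptions):

```python
import numpy as np

def normalize_kernel(window):
    """Eq. (1): normalize the t x t radiance window of one blurred dot."""
    return window / window.sum()

def interpolate_kernel(kernels, gx, gy):
    """Bilinearly interpolate a t x t kernel at fractional dot-grid position (gx, gy),
    where kernels[j, i] is the sampled kernel of dot (i, j)."""
    x0, y0 = int(np.floor(gx)), int(np.floor(gy))
    x1 = min(x0 + 1, kernels.shape[1] - 1)
    y1 = min(y0 + 1, kernels.shape[0] - 1)
    fx, fy = gx - x0, gy - y0
    k = ((1 - fx) * (1 - fy) * kernels[y0, x0] + fx * (1 - fy) * kernels[y0, x1]
         + (1 - fx) * fy * kernels[y1, x0] + fx * fy * kernels[y1, x1])
    return k / k.sum()                      # keep the interpolated kernel normalized

def pack_kernel_texture(kernels, width, height, grid_w, grid_h, t):
    """Pack one t x t kernel per video pixel into a (height*t) x (width*t) texture."""
    K = np.zeros((height * t, width * t), np.float32)
    for q in range(height):
        for p in range(width):
            gx = p / (width - 1) * (grid_w - 1)    # pixel -> dot-grid coordinates
            gy = q / (height - 1) * (grid_h - 1)
            K[q * t:(q + 1) * t, p * t:(p + 1) * t] = interpolate_kernel(kernels, gx, gy)
    return K
```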

We benefit from the large texture memory and memory bandwidth available on today's graphics cards for efficient storage and querying of the kernels. On an NVIDIA 8800GTS card, which supports a texture size of 8K×8K, we are able to store a blur kernel of up to 16×16 pixels for 512×512 resolution video clips. In our experiments, a defocus blur of size 5×5 pixels is usually sufficient to remove most pixelation artifacts.
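As a quick check of this limit, assuming one t × t kernel is packed per video pixel side by side (as in the sketch above), the kernel texture has a side length of (video width) × t, so 512 × 16 = 8192 = 8K exactly reaches the texture limit, while a 5 × 5 kernel needs only 512 × 5 = 2560 texels per side.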

4 GPU-Based Defocusing

We formulate the problem of image deblurring as a constrained quadratic programming problem [16,27]. Given the measured spatially-varying kernel K and frame F_in(n), our goal is to compute the optimal image F_opt(n) that, after being blurred by kernel K, is as close as possible to F_in(n), i.e.:

$$F_{opt}(n) = \arg\min_{D} \, \|K \otimes D - F_{in}(n)\|^2 \quad \text{s.t.} \;\; 0 \le D[p,q] \le 255 \;\; \forall p,q \qquad (2)$$

where ‖·‖ is an image distance metric. Notice that each pixel [p, q] in image D has an associated kernel k_{p,q} in the kernel texture K; therefore ⊗ represents a spatially-varying convolution for each pixel. In our implementation, we use the sum of squared pixel differences as the image distance metric.

4.1 Image and Kernel Representation

To solve Equation 2, linear optimization algorithms such as steepest descent and conjugate gradient can be used to approximate a nearly optimal solution. For an image of size m × m pixels, traditional CPU-based linear optimization methods pack all pixels of the image into a single column vector of size m² and represent the blur kernel as a sparse matrix of size m² × m². Using this representation, convolving an image with the kernel can be written as a matrix multiplication [27].

Our goal is to use the GPU to solve Equation 2. This requires storing both the image and the convolution kernel as textures on the graphics card. For high resolution videos, packing all the pixels of a frame into a single vector can easily result in a matrix that exceeds the largest texture size allowed on the graphics card. Therefore, we choose to represent both the image and the spatially varying blur kernel K as 2D textures. We then implement the convolution operation using a fragment shader that multiplies the two textures (see supplemental material).
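The following loop is a CPU reference of the spatially varying convolution that the fragment shader evaluates per pixel; the clamp-to-edge border handling and the block layout of the packed kernel texture are our assumptions.

```python
import numpy as np

def sv_convolve(image, K, t):
    """Spatially varying convolution: pixel (p, q) of the m x m image is blurred
    with its own t x t kernel stored at block (q, p) of the packed texture K."""
    m = image.shape[0]
    r = t // 2
    padded = np.pad(image.astype(np.float32), r, mode='edge')   # clamp-to-edge border
    out = np.empty((m, m), dtype=np.float32)
    for q in range(m):
        for p in range(m):
            k = K[q * t:(q + 1) * t, p * t:(p + 1) * t]
            out[q, p] = np.sum(k * padded[q:q + t, p:p + t])
    return out
```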

4.2 Steepest Descent Optimization

We use the steepest descent method to compute the optimal D in Equation 2. We start with some initial guess D^0. A common choice for D^0 is to directly use the input frame F_in(n). In Section 5.1, we show that a better initialization is obtained by warping the optimized previous frame F_opt(n-1).

Once we initialize D^0, we repeat the following three steps.

In step 1, we compute the residual R as:

$$R = \bar{K} \otimes (F_{in}(n) - K \otimes D^i) \qquad (3)$$

where K̄ is the spatially varying kernel obtained by flipping K, i.e. k̄[u, v] = k[t−u, t−v]. The derivation of K̄ follows from formulating convolution as matrix multiplication: if we pack the pixels of image D into a single vector V and represent K and K̄ as two large sparse matrices M and M̄ respectively, it is easy to see that K̄ is the kernel corresponding to the transpose of the sparse matrix form M of kernel K (i.e. M̄ = Mᵀ).

In step 2, we update D^i using the steepest descent step D^{i+1} = D^i + αR, where α is given by:

$$\alpha = \frac{|R|^2}{|K \otimes R|^2} \qquad (4)$$

Notice that after we update D^i, some pixel intensities may become negative or greater than 255. Therefore, we clamp D^{i+1} after step 2.

Finally, in step 3, we compute the sum of absolute differences over all pixels [p, q] between consecutive iterates of D:

$$\sum_{p,q} |D^{i+1}[p,q] - D^{i}[p,q]| \qquad (5)$$

We repeat steps 1, 2, and 3 until the difference in Equation 5 is less than a predefined threshold. We set our threshold to βm², where m² is the number of pixels in each frame F_in(n) and β is a user-definable constant strictly between 0 and 1. For scenes with higher detail, we choose a smaller β (and hence execute more iterations) to preserve high frequencies.
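A CPU sketch of this three-step loop, written against the sv_convolve reference from Section 4.1 (the flipped kernel K̄ is built by reversing each packed t × t block, which corresponds to transposing the sparse matrix form), might look like the following; the parameter defaults are illustrative only.

```python
import numpy as np

def flip_kernel_texture(K, t):
    """K_bar: reverse each packed t x t block (equivalent to M^T in matrix form)."""
    Kb = K.copy()
    h, w = K.shape
    for qy in range(0, h, t):
        for px in range(0, w, t):
            Kb[qy:qy + t, px:px + t] = K[qy:qy + t, px:px + t][::-1, ::-1]
    return Kb

def deblur_frame(F_in, K, t, D0, beta=0.01, max_iters=100):
    """Steepest descent for Eq. (2), following steps 1-3 of Section 4.2.
    Uses the sv_convolve reference from Section 4.1."""
    Kb = flip_kernel_texture(K, t)
    F = F_in.astype(np.float32)
    D = D0.astype(np.float32)
    m2 = F.size
    for _ in range(max_iters):
        R = sv_convolve(F - sv_convolve(D, K, t), Kb, t)        # Eq. (3)
        KR = sv_convolve(R, K, t)
        alpha = np.sum(R * R) / max(np.sum(KR * KR), 1e-12)     # Eq. (4)
        D_new = np.clip(D + alpha * R, 0, 255)                  # update and clamp
        if np.sum(np.abs(D_new - D)) < beta * m2:               # Eq. (5) stopping test
            return D_new
        D = D_new
    return D
```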

There are many advantages to implementing the steepest descent deblurring algorithm on the GPU. First, by using a fragment shader, we efficiently parallelize the convolution over all pixels. Second, our GPU scale-and-add shader implicitly clamps D^i to values between 0 and 255 by rendering it directly to an 8-bit texture. To efficiently compute the image difference metric, we employ the classic MIPMAP technique: we generate a MIPMAP of the image |D^{i+1}[p,q] − D^i[p,q]| and multiply its lowest-level MIPMAP value by m² to compute Equation 5.

5 Illumination Coherent Optical Flow Estimation

Note that, like many iterative linear optimization methods, the performance of our GPU-based deblurring algorithm depends heavily on the initial condition D^0. The most straightforward initialization is to use the original frame F_in(n) as D^0 [27]. A better choice is to take advantage of the temporal coherence between consecutive frames and warp the optimized previous frame F_opt(n-1) into D^0 using the optical flow.

However, two fundamental problems remain. First, classical optical flow approaches [9] assume that corresponding pixels have consistent intensities in consecutive frames. In the presence of illumination inconsistencies such as varying shadows or lighting, which frequently occur in videos, state-of-the-art optical flow methods break down. Second, directly warping F_opt(n-1) using OF(n) without compensating for the illumination inconsistencies yields only a limited reduction in the number of iterations.

5.1 Illumination Ratio Map

We present a new approach to account for illumination variations between consecutive video frames using an Illumination Ratio Map (IRM). An IRM stores the intensity ratio of corresponding points in an image pair. If two consecutive frames F_in(n) and F_in(n-1) are captured from the same viewpoint and the scene is static, then the IRM γ simply corresponds to F_in(n)/F_in(n-1) [10]. If the camera position changes or any part of the scene moves, we need to find the corresponding pixel in F_in(n), i.e.:

$$\gamma(i,j) = \frac{F_{in}(n)[\,i + OF(n).x,\; j + OF(n).y\,]}{F_{in}(n-1)[\,i, j\,]} \qquad (6)$$

where OF(n) denotes the optical flow field and .x and .y its horizontal and vertical components.


Our goal is to simultaneously recover γ and OF. Notice that the IRM is in general spatially smooth. Discontinuities in the IRM appear near scene occlusion boundaries, shadow boundaries, and moving objects. For instance, when the illumination direction changes between the two images, shadows abutting the occlusion boundaries may appear or disappear; as a result, sharp edges can appear near depth edges in the IRM. Shadow boundaries also move due to illumination variations. However, since most shadows in real scenes are soft, they often appear in the IRM as smooth transitions rather than discontinuities.

The characteristics of spatial smoothness and occlusion discontinuity in the IRM are very similar to those seen in disparity maps from stereo matching. Previous researchers [24,23] have shown that fields satisfying such properties can be modeled as probabilistic graphical models. We propose a new two-step optimization algorithm to iteratively estimate the optical flow and the IRM as follows:

In step 1, we detect and match a sparse set of feature points between frames F_in(n) and F_in(n-1) using the Shi-Tomasi method [22]. The feature points are then triangulated and interpolated to form a dense optical flow field OF(n).

In step 2, we compute γ(n) using Equation 6, apply a bilateral filter to γ(n), and de-illuminate F_in(n) using γ(n). We repeat these two steps until the result is satisfactory.

We choose the bilateral filter [26] to smooth the IRM because it preserves occlusion boundaries, reduces noise, and can be implemented on the GPU [6]. To further accelerate the optical flow estimation, our algorithm detects only a sparse set of features and uses Delaunay triangulation to create an interpolant. Each vertex is associated with x and y directional motion components. We then rasterize the triangles' x and y vector components on the GPU, as shown in Figure 3. After rasterization, all pixels have floating point x and y directional motion components; thus, when warping with our IRM-OF shader, we use bilinear interpolation to fetch floating point texture locations.
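A rough CPU sketch of this two-step estimation is shown below, using OpenCV's Shi-Tomasi detector and pyramidal Lucas-Kanade tracker for the sparse matches and SciPy's Delaunay-based linear interpolation for the dense field. The specific tracker, the parameter values, and the de-illumination of the current frame inside the loop are our assumptions about one reasonable realization, not the authors' exact implementation.

```python
import cv2
import numpy as np
from scipy.interpolate import griddata

def flow_and_irm(prev, curr, iterations=2):
    """Estimate the dense optical flow OF(n) and the IRM gamma(n) between
    8-bit grayscale frames prev = F_in(n-1) and curr = F_in(n)."""
    h, w = prev.shape
    gy, gx = np.mgrid[0:h, 0:w]
    curr_f = curr.astype(np.float32)
    gamma = np.ones((h, w), np.float32)
    flow = np.zeros((h, w, 2), np.float32)
    for _ in range(iterations):
        # De-illuminate the current frame with the current IRM estimate.
        curr_deillum = np.clip(curr_f / np.maximum(gamma, 1e-3), 0, 255).astype(np.uint8)
        # Step 1: sparse Shi-Tomasi features tracked with pyramidal Lucas-Kanade,
        # then interpolated to a dense field (griddata triangulates internally).
        pts = cv2.goodFeaturesToTrack(prev, maxCorners=500, qualityLevel=0.01, minDistance=8)
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr_deillum, pts, None)
        good = status.ravel() == 1
        p0, p1 = pts[good].reshape(-1, 2), nxt[good].reshape(-1, 2)
        flow_x = griddata(p0, (p1 - p0)[:, 0], (gx, gy), method='linear', fill_value=0)
        flow_y = griddata(p0, (p1 - p0)[:, 1], (gx, gy), method='linear', fill_value=0)
        flow = np.dstack([flow_x, flow_y]).astype(np.float32)
        # Step 2: Eq. (6) on the warped positions, then bilateral smoothing of the IRM.
        xw = np.clip(gx + flow[..., 0], 0, w - 1).astype(np.float32)
        yw = np.clip(gy + flow[..., 1], 0, h - 1).astype(np.float32)
        warped_curr = cv2.remap(curr_f, xw, yw, cv2.INTER_LINEAR)
        gamma = warped_curr / np.maximum(prev.astype(np.float32), 1.0)
        gamma = cv2.bilateralFilter(gamma, d=9, sigmaColor=0.2, sigmaSpace=5)
    return flow, gamma
```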

In our experiments, we found that after two to three iterations, the estimated optical flow and IRM were sufficiently denoised to reduce the number of iterations of the GPU optimization by an order of magnitude, as shown in the table in Figure 4. In Figure 3, we show a frame of the recovered IRM and rasterized optical flow for a video with strong illumination variations. It is also worth noting that more sophisticated algorithms such as [24] could be used to precompute more accurate optical flow fields and IRMs, although at a higher computational cost.

To process each frame F_in(n) at runtime, we use the estimated optical flow and IRM to warp the optimized previous frame F_opt(n-1) into D^0 as:

$$D^0[i,j] = F_{opt}(n-1)[\,i + OF[i,j].x,\; j + OF[i,j].y\,] \cdot \gamma[i,j] \qquad (7)$$

Fig. 3 Recovering the optical flow and the Illumination Ratio Map (IRM). We detect sparse features and find their correspondences in consecutive frames (a). Using our two-step iterative algorithm, we recover both the optical flow field (b) and the IRM (d). (c) shows the initial IRM estimate and (d) shows the final result after 2 iterations.

512x512 Video | FPS with Warping | FPS w/o Warping | Iterations per Frame with Warping | Iterations per Frame w/o Warping

7x7 Kernel
Coral Reef  | 17.2 |  2.4 | 1.0 | 10.2
Roadside    | 10.1 |  1.4 | 2.0 | 18.0
Grass Hill  | 10.0 |  1.1 | 2.1 | 21.9
Clown Fish  | 16.4 |  1.0 | 1.1 | 25.4

5x5 Kernel
Coral Reef  | 32.3 |  4.3 | 1.0 | 11.6
Roadside    | 19.0 |  2.7 | 2.1 | 18.5
Grass Hill  | 18.7 |  2.4 | 2.1 | 21.3
Clown Fish  | 31.2 |  1.5 | 1.0 | 34.1

3x3 Kernel
Coral Reef  | 66.3 | 22.0 | 1.0 |  6.3
Roadside    | 40.1 |  8.0 | 2.4 | 17.8
Grass Hill  | 33.7 |  6.4 | 3.1 | 22.3
Clown Fish  | 67.9 | 10.6 | 1.0 | 13.2

Fig. 4 Comparing the performance of our GPU framework with and without warping at different blur kernel sizes on an NVIDIA 8800GTS. All video clips are rendered at 512x512 resolution.

We implement Equation 7 using a fragment shader (see supplemental material). We then optimize D^0 for frame F_in(n) as described in Section 4.2.
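A CPU reference of this warping step (Equation 7), standing in for the fragment shader described above, might look like this for a single-channel frame; the bilinear resampling via cv2.remap and the function name are our choices.

```python
import cv2
import numpy as np

def init_from_previous(F_opt_prev, flow, gamma):
    """Eq. (7): warp the optimized previous frame with OF(n) and correct it
    with the IRM gamma to form the initial guess D0 for the GPU optimization."""
    h, w = F_opt_prev.shape[:2]
    gy, gx = np.mgrid[0:h, 0:w].astype(np.float32)
    map_x = gx + flow[..., 0]                 # i + OF[i, j].x
    map_y = gy + flow[..., 1]                 # j + OF[i, j].y
    warped = cv2.remap(F_opt_prev.astype(np.float32), map_x, map_y, cv2.INTER_LINEAR)
    return np.clip(warped * gamma, 0, 255)
```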

6 Experimental Results

We have used our framework to display various video clips on an InFocus LP850 projector. We defocus the projector a little in front of the projection screen and use a Canon SD750 camera (7.1 megapixels) to measure the defocus kernel and capture the projected video frames.

Quality. Our video depixelation framework is able to significantly reduce screen-door artifacts while preserving fine details and high temporal coherence.


Fig. 5 Results of depixelated video projection using our framework. We use a Canon SD750 camera to capture the projected video clip rendered at 24 fps. The projected video frames with defocus blur are shown in the first and third rows; notice the strong blurring artifacts in the defocused projection. The optimized results for the smaller and larger kernel sizes are shown in the second and fourth rows, respectively. Please refer to the companion video for more examples.

In our experiments, we found that using a defocus blur kernel of size 5x5 is usually sufficient to reduce most of the pixelation artifacts without introducing ringing. Our algorithm also performs well with this kernel size for most video clips. High frequency features are much better preserved using our framework, as shown on the grass, corn stalks, and coral reef in Figure 6 and in the supplemental video.

The quality of our algorithm also depends on the size of the blur kernel. We found that our algorithm usually performs better with smaller kernel sizes than with larger ones. Kernels larger than 7x7 pixels (Figure 5, row 4) generally require more optimization and sometimes result in ringing artifacts, as shown on the clownfish in the left column of Figure 5. This is an inherent problem of many existing deblurring algorithms [27].

Speed. In the table in Figure 4, we compare the processing speed of our GPU-based deblurring method with and without the OF/IRM warping. For small kernel sizes, initializing our steepest descent optimization using the IRM-OF warping renders at over 30 fps, which allows real-time playback of the video while it is deblurred. The IRM-OF warping also achieves a speedup of about 5 times over simply using the input frame as the initialization. For larger kernel sizes, this speedup is even more significant. This is intuitive: large blur kernels produce deblurred frames that are very different from the input frames, which leads to a large number of iterations before steepest descent converges. However, consecutive frames in a video are temporally coherent; the warped optimized previous frame is therefore much more similar to the deblurred target frame, which allows the optimization to converge to the optimal solution in far fewer iterations. In fact, the average number of iterations using the IRM-OF warping is an order of magnitude less than with the naive initialization, as shown in the rightmost two columns of the table in Figure 4.

The speed of our algorithm also depends on the quality of the IRM-OF approximation. For highly textured regions, such as the grass hill scene shown in the third column of Figure 6, the estimated IRM-OF is less accurate and, hence, the GPU-based optimization takes more iterations. Furthermore, if an input video has many high frequency details, those details require more optimization. It is also important to note that the optimization is computed globally. Thus, videos that are spatially homogeneous with respect to the frequency domain require a predictable amount of optimization, whereas spatially heterogeneous videos are less predictable.

Finally, both initialization schemes scale approximately linearly with the kernel size. This can be explained by the large number of convolutions performed in our GPU-based optimization, since the overhead of convolution grows linearly with the kernel size.

7 Conclusions and Future Work

We have presented a real-time projector depixelation framework for displaying high resolution videos. To attenuate pixelation, we use the common defocusing approach of setting the projection a little out-of-focus. Our system efficiently measures the spatially-varying defocus kernel and stores it as a texture on the graphics hardware. We have explored two novel techniques to compensate for the defocus blur in real time. First, we have developed an iterative optimization algorithm on the GPU for estimating the optimally deblurred frames based on the measured defocus kernel. Second, we have presented a novel optical flow algorithm that uses an illumination ratio map (IRM) to model illumination transformations between consecutive frames. We store both the IRM and the optical flow as textures and use them at runtime to warp the optimized previous frame into an initialization for the GPU optimization. We have shown that our GPU optimization with warping achieves an order of magnitude speedup. Experiments on various video clips have shown that our framework is able to realize real-time video playback with significantly reduced pixelation and defocus while preserving fine details and temporal coherence.

Limitations remain in our approach, however. Our framework can only support blur kernels of relatively small sizes, although these are usually sufficient to reduce pixelation. This is a result of packing the spatially varying kernels into a single texture on the GPU: since the current generation of graphics hardware only supports texture sizes up to 8Kx8K, we would need to store the kernels as separate textures for larger kernels. Furthermore, larger kernels lead to higher computational overhead and more texture fetches when computing the convolution on the GPU. Our warping method also relies on the accuracy of the estimated optical flow and IRM. In theory, our results could be further improved using a combined IRM-OF Markov Random Field model with a maximum a posteriori (MAP) optimization [24]. We intend to investigate how to use GPU-based MAP inference to improve the accuracy of the optical flow and the IRM.

Although initially designed for video deblurring, our framework also has the potential for real-time motion deblurring. In Figure 7, we apply the GPU optimization with known or estimated kernels and initialize it with the blurred input image. Figure 7(a) shows the resolution chart image blurred with a known 7x1 linear kernel.

Fig. 7 Motion deblurring using our framework. (a) Synthesized motion blur on a resolution chart. (b) Our deblurred result, running at 30 fps. (c) Ground truth image of (a). (d) Captured motion blur on a Mustang RC car. (e) Our deblurred result, running at 14 fps. (f) Ground truth image of (d).

Our GPU algorithm recovers the deblurred image at about 30 fps at an image resolution of 256x256. Figure 7(d) shows a captured image of a moving RC car. We recover the blur kernel using the Wiener filter and show the deblurred result produced by our GPU algorithm in Figure 7(e). Compared with the ground truth image of the stationary RC car (Figure 7(f)), our result robustly recovers the detail of the Mustang decal and is computed at 14 fps at an image resolution of 256x256. In the future, we plan to explore efficient methods for estimating blur kernels and to integrate them with our GPU deblurring framework.
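For reference, the classical frequency-domain Wiener deconvolution [11] with a known linear kernel can be sketched as below; in our framework the actual deblurring is performed by the GPU optimization, so this is only an illustrative baseline, and the variable names and noise-to-signal constant are ours.

```python
import numpy as np

def wiener_deblur(blurred, kernel, nsr=0.01):
    """Frequency-domain Wiener deconvolution with a known blur kernel
    (e.g. a 7x1 linear motion kernel); nsr is the noise-to-signal ratio.
    Depending on how the kernel is centered, the result may be shifted
    by up to half the kernel extent."""
    H = np.fft.fft2(kernel, s=blurred.shape)        # kernel spectrum, zero-padded
    G = np.fft.fft2(blurred.astype(np.float64))
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)         # Wiener filter
    return np.clip(np.real(np.fft.ifft2(W * G)), 0, 255)

# Example: a horizontal 7x1 motion blur kernel.
motion_kernel = np.ones((1, 7), np.float64) / 7.0
```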

References

1. Bennett, E.P., McMillan, L.: Proscenium: a framework for spatio-temporal video editing. In: MULTIMEDIA '03: Proceedings of the Eleventh ACM International Conference on Multimedia, pp. 177–184 (2003)

2. Bennett, E.P., McMillan, L.: Computational time-lapse video. In: SIGGRAPH '07: ACM SIGGRAPH 2007 Papers, p. 102 (2007)

3. Bimber, O., Emmerling, A.:

4. Bolz, J., Farmer, I., Grinspun, E., Schröder, P.: Sparse matrix solvers on the GPU: conjugate gradients and multigrid. In: SIGGRAPH '03: ACM SIGGRAPH 2003 Papers, pp. 917–924 (2003)

5. Brown, M.S., Song, P., Cham, T.J.: Image pre-conditioning for out-of-focus projector blur. In: CVPR '06: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 1956–1963 (2006)

6. Chen, J., Paris, S., Durand, F.: Real-time edge-aware image processing with the bilateral grid. In: SIGGRAPH '07: ACM SIGGRAPH 2007 Papers, p. 103 (2007)

7. Davis, J., Nehab, D., Ramamoorthi, R., Rusinkiewicz, S.:

8. Fujii, K., Grossberg, M.D., Nayar, S.K.: A projector-camera system with real-time photometric adaptation for dynamic environments. In: CVPR '05: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 1, pp. 814–821 (2005)

Fig. 6 Comparative results from our framework using highly textured video clips. Notice that high frequency features such as the leaves on the trees (left), the multifaceted coral reef (middle), and the blades of grass (right) retain their detail using our method. All three videos are processed at over 30 frames per second.

9. Horn, B.K., Schunck, B.G.: Determining optical flow. Tech. rep. (1980)

10. Jacobs, D.W., Belhumeur, P.N., Basri, R.: Comparing images under variable illumination. In: CVPR '98: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p. 610. IEEE Computer Society, Washington, DC, USA (1998)

11. Jain, A.K.: Fundamentals of Digital Image Processing. Prentice-Hall, Upper Saddle River, NJ, USA (1989)

12. Krüger, J., Westermann, R.: Linear algebra operators for GPU implementation of numerical algorithms. In: SIGGRAPH '05: ACM SIGGRAPH 2005 Courses, p. 234 (2005)

13. Lantz, E.: A survey of large-scale immersive displays. In: EDT '07: Proceedings of the 2007 Workshop on Emerging Displays Technologies, p. 1 (2007)

14. Majumder, A., Welch, G.: Computer graphics optique: optical superposition of projected computer graphics, pp. 209–218

15. Moreland, K., Angel, E.: The FFT on a GPU. In: HWWS '03: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, pp. 112–119. Eurographics Association, Aire-la-Ville, Switzerland (2003)

16. Nocedal, J., Wright, S.: Numerical Optimization. Springer (2000)

17. Oyamada, Y., Saito, H.: Focal pre-correction of projected image for deblurring screen image. In: CVPR. IEEE Computer Society (2007)

18. Raj, A., Zabih, R.: A graph cut algorithm for generalized image deconvolution. In: ICCV '05: Proceedings of the Tenth IEEE International Conference on Computer Vision, pp. 1048–1054 (2005)

19. Raskar, R., Welch, G., Cutts, M., Lake, A., Stesin, L., Fuchs, H.: The office of the future: a unified approach to image-based modeling and spatially immersive displays. In: SIGGRAPH, pp. 179–188 (1998)

20. Richardson, W.H.: Bayesian-based iterative method of image restoration. Journal of the Optical Society of America 62, 55–59 (1972)

21. Scharstein, D., Szeliski, R.: A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision 47(1-3), 7–42 (2002)

22. Shi, J., Tomasi, C.: Good features to track. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR '94), Seattle (1994)

23. Sun, J., Shum, H.Y., Zheng, N.N.: Stereo matching using belief propagation. In: ECCV (2), pp. 510–524 (2002)

24. Tappen, M.F., Freeman, W.T.: Comparison of graph cuts with belief propagation for stereo, using identical MRF parameters. In: ICCV '03: Proceedings of the Ninth IEEE International Conference on Computer Vision, p. 900. IEEE Computer Society, Washington, DC, USA (2003)

25. Tappen, M.F., Russell, B.C., Freeman, W.T.: Efficient graphical models for processing images. In: CVPR (2), pp. 673–680 (2004)

26. Tomasi, C., Manduchi, R.: Bilateral filtering for gray and color images. In: ICCV '98: Proceedings of the Sixth International Conference on Computer Vision, p. 839. IEEE Computer Society, Washington, DC, USA (1998)

27. Zhang, L., Nayar, S.: Projection defocus analysis for scene capture and image display. In: SIGGRAPH '06: ACM SIGGRAPH 2006 Papers, pp. 907–915 (2006)

28. Zhang, L., Snavely, N., Curless, B., Seitz, S.M.: Spacetime faces: high resolution capture for modeling and animation. In: SIGGRAPH '04: ACM SIGGRAPH 2004 Papers, pp. 548–558 (2004)