
Computer generated hologram with geometric occlusion using GPU-accelerated depth buffer rasterization for three-dimensional display

Rick H.-Y. Chen and Timothy D. Wilkinson*
Electrical Engineering Division, Engineering Department, University of Cambridge,

9 JJ Thomson Avenue, Cambridge CB3 0FA, UK

*Corresponding author: [email protected]

Received 30 March 2009; revised 16 June 2009; accepted 24 June 2009; posted 29 June 2009 (Doc. ID 109276); published 17 July 2009

We present a method of rapidly producing computer-generated holograms that exhibit geometric occlusion in the reconstructed image. Conceptually, a bundle of rays is shot from every hologram sample into the object volume. We use z buffering to find the nearest intersecting object point for every ray and add its complex field contribution to the corresponding hologram sample. Each hologram sample belongs to an independent operation, allowing us to exploit the parallel computing capability of modern programmable graphics processing units (GPUs). Unlike algorithms that use points or planar segments as the basis for constructing the hologram, our algorithm's complexity depends on fixed system parameters, such as the number of ray-casting operations, and can therefore handle complicated models more efficiently. The finite number of hologram pixels is, in effect, a windowing function, and from analyzing the Wigner distribution function of the windowed free-space transfer function we find an upper limit on the cone angle of the ray bundle. Experimentally, we found that an angular sampling distance of 0.01° for a 2.66° cone angle produces acceptable reconstruction quality. © 2009 Optical Society of America

OCIS codes: 090.1760, 090.2870, 090.5694, 100.6890.

1. Introduction

Among the various 3D display technologies, holographic display can arguably provide the most convincing effects, realism, and viewing comfort. The invention of computer generated holograms (CGH) is central to its use in interactive applications. It has allowed holograms to be created without the optical recording process, making holographic display of arbitrary 3D models possible. Much research has been carried out over the years to reduce the computational intensity and bring CGH generation to real time [1-4], but it remains a formidable challenge. In the last few years, the increasing programmability of graphics processing units (GPUs) has gained a lot of attention from outside the computer graphics community, and GPUs have been successfully used as powerful stream processors for solving computationally intensive problems [5]. Computational holography is one of them [6-8].

Because of the already intensive computation process in computational holography, visibility computation is mostly excluded owing to its additional complexity. This results in "see-through" objects upon reconstruction. For many applications, however, it is essential that CGHs provide the correct occlusion cue for depth perception. The traditional approach to hologram computation is to decompose the 3D scene into planar segments, evaluate each segment's complex amplitudes in the hologram plane individually, and sum the results to obtain the complex distribution for the entire scene [9,10].


This approach is slow due to the large number of fast Fourier transforms (FFTs) employed, and incorporating a visibility test would aggravate the problem further [11,12]. More recently, work has been done to eliminate the per-segment FFT by taking advantage of precomputed triangular meshes [13,14], but occlusion could not be handled properly. An alternative ray-tracing approach with a built-in visibility test was proposed by Janda et al. [8,15]. However, their work aimed at improving visual quality rather than performance; the problem of high computational cost remains unsolved, and user interactivity is all but impossible. Also worth noting is the work on holographic stereograms by Bove's group at MIT [16] and by Kang et al. [17]. Both exploited the computing power of modern GPUs to accelerate their stereogram algorithms, as reported in their recent papers.

In this paper we present our implementation of ray-traced hologram generation and our effort to speed up this process. Our algorithm combines rasterization, ray casting, and optical diffraction theory with geometric occlusion using a programmable GPU. Our method does not require the scene to consist solely of triangular mesh surfaces. This is, to the best of our knowledge, the first method that approaches real-time generation of CGHs exhibiting correct occlusion cues without relying on purpose-built hardware or a computer cluster. Section 2 briefly reviews diffraction theory, which is the basis of hologram calculation. Section 3 outlines the occlusion algorithm. Factors that contribute to computational complexity and techniques that we employed for a practical and efficient implementation of the algorithm are studied in Section 4. Results and discussion follow in Section 5.

2. Electromagnetic Disturbance of a Point Light Source

In this work we treat the three-dimensional scene or object as being made up of individual light scattering sources. The adoption of a point-based model simplifies the visibility test and provides the most flexibility in the geometric description of the scene. A point source can be described mathematically by

$$u(x) = A\,\delta(x)\exp(-j\beta), \qquad (1)$$

where A is the magnitude of the source, δ(x) is the Dirac delta function, and β is a random phase associated with the source. The random phase, ranging between ±π/2, is introduced to simulate the effect of a diffuser that spreads the spectrum over the entire hologram. This ensures that every portion of the hologram contains information about the entire scene [18].

The transfer function of free-space propagation, also known as the wave spread function (WSF), is given by [19]

$$T = \exp\!\left(-j\,\frac{2\pi}{\lambda}\,z_d\sqrt{1-(\lambda\nu)^2}\right), \qquad (2)$$

where λ is the wavelength, z_d is the propagation distance, and ν is the spatial frequency. Derived from scalar diffraction theory, this function describes the electromagnetic disturbance due to a point source [20]. According to the angular spectrum method [21], the complex field distribution in the hologram plane at a distance z_d from the point source is

$$h(x_0) = \mathcal{F}^{-1}\{\mathcal{F}\{u(x)\}\times T\} = \int_{-\infty}^{\infty} A\exp(-j2\pi\nu x)\exp(-j\beta)\exp\!\left(-j\,\frac{2\pi}{\lambda}\,z_d\sqrt{1-(\lambda\nu)^2}\right)\exp(j2\pi\nu x_0)\,d\nu. \qquad (3)$$

This is essentially saying that the field distribution is given by the sum of an infinite number of harmonic functions, each corresponding to a plane wave with a specific wave vector. This becomes more obvious if we substitute sin θ = λν, where θ is the angle of the wave vector with respect to the z axis, into the above equation and express the field distribution as

$$h(x_0) = \int_{-\infty}^{\infty} A\exp\bigl(-jk(x_0 - x)\sin\theta\bigr)\exp(-j\beta)\exp(-jkz_d\cos\theta)\,\frac{\cos\theta}{\lambda}\,d\theta. \qquad (4)$$

In practice, the field distribution in the hologram plane is often computed by modifying Eq. (3) to include a sampling function and a window function:

$$h[x_0] = \mathcal{F}^{-1}\!\left\{\mathcal{F}\{u(x)\}\times T\times \operatorname{comb}\!\left(\frac{\nu-n}{W}\right)\times \operatorname{rect}\!\left(\frac{\nu}{W}\right)\right\}, \qquad (5)$$

where W is the aperture size of the hologram, as discussed later in Section 4. Equation (5) redefines the problem in a finite, discrete signal domain, facilitating the use of the FFT algorithm to compute the wave field. Our hologram is similar to a kinoform, produced by Fourier transforming the sum of the complex fields of all point sources,

$$H[\nu] = \mathcal{F}\!\left\{\sum_{m}^{M} h_m\right\}, \qquad (6)$$

where M is the total number of point sources. In other words, the hologram represents the spectrum of the total complex field in the hologram plane.
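As a minimal numerical sketch of Eqs. (5) and (6), the following Python snippet propagates a sampled point source with the band-limited WSF and sums the fields of two sources before the final transform. It is a 1D illustration under assumed values (the grid size, pitch, and wavelength here are taken from the system described later; the function name and source positions are ours):

```python
import numpy as np

N = 1280                # samples across the computation window
pitch = 13.62e-6        # sampling distance in the hologram plane [m]
wavelength = 633e-9     # illumination wavelength [m]

nu = np.fft.fftfreq(N, d=pitch)           # discrete spatial frequencies

def field_of_point(x0_index, z_d, amplitude=1.0,
                   rng=np.random.default_rng(0)):
    """Band-limited field in the hologram plane from one point source, Eq. (5)."""
    u = np.zeros(N, dtype=complex)
    beta = rng.uniform(-np.pi / 2, np.pi / 2)        # diffuser phase of Eq. (1)
    u[x0_index] = amplitude * np.exp(-1j * beta)
    T = np.exp(-1j * 2 * np.pi / wavelength * z_d
               * np.sqrt(1.0 - (wavelength * nu) ** 2))   # WSF, Eq. (2)
    return np.fft.ifft(np.fft.fft(u) * T)

# Eq. (6): the hologram is the spectrum of the summed field of all sources.
h_total = field_of_point(600, 0.05) + field_of_point(700, 0.06)
H = np.fft.fft(h_total)
```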

Here we make a distinction between "hologram pixels" and "samples in the hologram plane". In our usage, the first term refers to pixels of the CGH, whereas the second refers to sampling points in the x-y plane at z = 0, with no reference to the actual hologram itself.

3. Occlusion Processing in Computational Holography

Scalar wave diffraction theory forms the basis of our treatment of light propagation in Section 2. When it comes to occlusion processing, however, light is regarded as behaving like particles, and occlusion becomes geometric shadowing. This "simplified" geometric occlusion is completely adequate in most circumstances, as Underkoffler argued in [22].

Before describing our proposed method, it is worthwhile to go over some background on rendering in computer graphics to facilitate the discussion later.

A. Rasterization Technique in Computer Graphics

Rasterization with depth buffering, or z buffering, is a standard rendering technique in computer graphics. To render a 3D scene correctly, each pixel in the image is allocated a buffer (the z buffer) that stores depth information. When an object in the scene is to be rendered, its z-axis position is compared to the depth value currently stored in the z buffer of the relevant pixels, and whichever value is smaller, i.e., closer to the viewer, is stored in the z buffer. This process is repeated for all objects in the scene. At the end of the process, the z buffer records which object is nearest for each image pixel, allowing the 3D scene to be rendered with correct depth perception.

Modern GPUs are highly optimized for rapid depth-buffering rasterization. By extending the application of depth buffering to hologram generation, we are able to take advantage of the GPU's rasterization hardware.

B. Hardware-Accelerated Rasterization for Hologram Generation

The situation is slightly more complicated in computational holography than in computer graphics. To calculate the value of one sample in the hologram plane (hereafter simply referred to as the sample), all the primitives visible from that pixel location must be taken into account. In our current implementation, a primitive is equivalent to an object point in the 3D scene. Although not strictly correct, it may help to think of each sample as containing a camera that records some information about the entire 3D scene. This is illustrated in Fig. 1.

The sampling rays are uniformly distributed over a semicircle (a hemispherical surface for full parallax), as shown in Fig. 2, and each sample is associated with one such cluster of rays. Sampling rays traveling in the same direction but originating from different samples traverse the 3D scene through different paths, and as such, a different collection of ray-object intersection points is generated for each sample. This is analogous to the case in computer graphics where the camera/viewer position changes between frames.

At this point, it should be apparent that this is effectively a ray-casting process. Ray-object intersection calculation represents a significant portion of the total computational load in ray tracing. Below, we describe a method of accelerating this computation for hologram generation. The same method was used by Janda et al. [15]; the main difference is that they explicitly restrict their 3D model to a collection of planar segments parallel to the hologram plane, whereas we impose no such restriction.

Recall that each sample in the hologram plane has a bundle of N uniformly distributed sampling rays originating from it (we call this source clustering). Now, instead of grouping the rays according to their point of origin, we group the sampling rays according to their traveling direction, which we call directional clustering. Directional clustering allows the entire scene to be traced in one rasterization pass per sampling direction. Occlusion calculation is performed automatically during rasterization by depth buffering, with minimal effort on our side. Note that in most computer graphics rendering, a perspective projection is applied to transform the 3D scene into a 2D projection.

Fig. 1. In computational holography, each sample receives contributions from all visible objects. Note that only the bounding rays of each object are shown in the diagram.

Fig. 2. Equally distributed sampling rays over a semicircle.


This, however, is not suitable for our purpose, as a perspective projection would distort the scene geometry. We apply an orthographic projection instead to preserve the scene geometry.

A problem with this method is that depth buffering only compares the z distance of primitives that have the same x-y position; it would therefore yield incorrect visibility information for all directional ray clusters except the one whose direction is parallel to the z axis. As an example, in Fig. 3 point A clearly occludes B for the sampling ray shown, but a naïve projection and depth-buffering scheme would project the two points to different sampling positions in the hologram plane and therefore incorrectly conclude that no occlusion occurs (in fact, the algorithm would discard the two without performing a z comparison at all).

This can be solved by applying an appropriate x-axis shear transformation to the 3D scene, as demonstrated in Fig. 4. The horizontal shear distorts only the x coordinate of the 3D scene; the z coordinate is left unchanged. This is important because successful occlusion calculation relies on the correct z coordinate for depth buffering. Knowing the z coordinate and the shear matrix applied, the undistorted x coordinate of a ray-object intersection point can easily be recovered.
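A sketch of this shear for one sampling direction θ, applied to point primitives in NumPy (the paper implements this as an OpenGL modelview matrix; the function names here are our own):

```python
import numpy as np

def shear_x(points, theta):
    """Shear the scene so rays traveling at angle theta to the z axis
    become parallel to z; points is an (N, 3) array of [x, y, z]."""
    S = np.array([[1.0, 0.0, np.tan(theta)],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
    return points @ S.T                  # y and z are left unchanged

def unshear_x(x_sheared, z, theta):
    """Recover the undistorted x coordinate from the stored z value."""
    return x_sheared - z * np.tan(theta)

pts = np.array([[0.0, 0.0, 2.0], [1.0, 0.0, 4.0]])
sheared = shear_x(pts, np.deg2rad(1.0))
# After depth buffering, z is known, so the original x is recovered exactly:
assert np.isclose(unshear_x(sheared[0, 0], sheared[0, 2], np.deg2rad(1.0)), 0.0)
```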

After the transformation, the entire scene is rendered into the frame buffer and rasterized. The rasterization stage includes depth buffering to resolve the visibility of each primitive. Only the primitives closest to the viewer are stored in the frame buffer after rasterization. The rendering pass thus returns a collection of visible object points, where each point is an independent point source of light. Its complex field contribution to the sample in question is computed and recorded. How the complex field contribution is computed is discussed in Section 4.

The system then proceeds to the next ray cluster, and the entire process repeats for all directional clusters. The results from all iterations are summed at the end to yield the complex distribution in the hologram plane produced by the 3D scene with geometric occlusion.

4. Implementation Considerations and Hologram Generation

The computation time grows with the number of samples in the hologram plane, the number of sampling directions, and the scene complexity. Obviously, as the number of objects in the scene increases, so does the rasterization time, as more transformation calculations and depth comparisons need to be performed. We come back to this in Section 5, but for now we limit our discussion to the other two factors. The data parallelism in the ray-casting process is apparent and allows us to exploit the parallel nature of modern programmable GPUs to boost performance. This, along with the use of precomputed lookup tables for the field distribution in the hologram plane and the optical reconstruction setup, is also discussed here.

A. Parallel Computation on GPU

The current NVIDIA GTX 200 series graphics card has 240 stream processors that can run concurrently on different threads of data, compared to four processors in a quad-core CPU. It is highly desirable to take advantage of the GPU's parallel computing capability. As it happens, the procedure described in Section 3.B is ideally suited to data-parallel computing on the GPU. Each sampling instance in the hologram plane is calculated independently of the others. Each sample performs what is called a gather operation: it finds the object points that are visible and adds their complex field contributions to the sample value.

Fig. 3. With a naïve orthographic projection, points A and B are projected onto different samples. As a result, point B will not be occluded by A under the z-buffer depth test.

Fig. 4. Before and after applying a horizontal shear transformation.


Our model is constructed in OpenGL in the usual way. The model is then shear-transformed and rendered for each ray direction, generating visibility, depth, and amplitude data that are passed to the fragment shader of the GPU's programmable rendering pipeline. Based on these data, the fragment shader computes the appropriate complex amplitude for each hologram sample. This is where data-parallel computation comes in: the same shader is executed on every sample but with different, independent input/output data streams. We use the OpenGL Shading Language (GLSL) to program the fragment shader.

By implementing the entire process from model construction to hologram generation on the GPU, not only do we gain a speedup from parallel computing, but we also save a significant amount of time by eliminating large data transfers between the graphics card and main memory.

B. Distribution of Sampling Rays

The choice of sampling rays has a significant impact on the performance of the system. The number of sampling rays translates directly into the number of rendering passes and the computation time. It is therefore important to minimize this number. The two deciding factors are the cone angle of the sampling ray bundle in source clustering and the angular sampling distance. Given an arbitrary object, there is an object point at which the intersecting ray makes the maximum angle with the surface normal n̂ of the hologram plane, as shown in Fig. 5. Assuming the object is symmetrical about the z axis, the cone angle of the ray bundle is given by

$$\text{cone angle} = 2\varphi = 2\tan^{-1}\!\left(\frac{|x_{P1}-x_S|}{z_{P1}}\right). \qquad (7)$$

Fig. 5. Maximum ray angle φ.

As for the angular separation between adjacent sampling rays, we use the angular resolution of the human eye, which is approximately 1 arc minute, or 1/60 of a degree. It is reasonable to expect a cone angle of several tens of degrees for a decent-sized object, which, for uniformly distributed sampling rays, would mean several thousand to over ten thousand sampling directions and, hence, rendering passes.

If we attempt to implement our occlusion process in accordance with the analysis above, the prospect of reaching real-time performance is rather bleak. We need to reduce the number of rendering passes by one to two orders of magnitude. Since a quantized and bandlimited version of the WSF in Eq. (2) must be used in actual computations, it is logical to ask to what extent this modification affects our calculation and what its implication for the cone angle is. The complex field distribution at the hologram plane due to a point source, in other words the point spread function of the system (free space in this case), can be found analytically by the inverse Fourier transform of Eq. (2) to give

$$\mathrm{PSF} = \frac{\exp(-jkr)}{r}\,\frac{z}{r}\left(\frac{1}{r}+jk\right), \qquad (8)$$

where k is the wavenumber, r is the radial distance of the object point from a sampling point in the hologram plane, and z is the propagation distance [20]. If we take sampling and a limited computation window into account, the modified PSF becomes

$$\mathrm{PSF}' = \frac{\exp(-jkr)}{r}\,\frac{z}{r}\left(\frac{1}{r}+jk\right) \otimes \operatorname{comb}(x-nW) \otimes \operatorname{sinc}(xW), \qquad (9)$$

where ⊗ is the convolution operator and W is the window size. This equation, however, is not particularly insightful, and it is difficult to get a sense of how the light field is distributed just by looking at Eq. (9). For this reason, we adopted a different approach using the Wigner distribution function (WDF).

The WDF is an intermediate signal description between the space domain and the spatial-frequency domain; it may be considered the local frequency spectrum of a signal, since the WDF is a function of both space x and spatial frequency ν. The WDF is a useful concept that has found many applications in optics. Here we discuss only our analysis using the WDF, borrowing results derived in [23,24], and leave the derivations, the formal definition of the WDF, and its many properties to the references.

The WDF W(x, ν) of a point source u = δ(x − x₀) is itself an impulse function:

$$W(x,\nu) = \delta(x - x_0), \qquad (10)$$

and is independent of spatial frequency ν. This implies that all frequencies are present at position x₀, whereas there is no contribution at other positions. The WDF of a signal after propagation through some distance in free space can be described by the input-output relationship [23]:


$$W_{\mathrm{out}}(x,\nu) = W_{\mathrm{in}}\!\left(x - \frac{2\pi\nu z}{\sqrt{k^2-(2\pi\nu)^2}},\ \nu\right), \qquad (11)$$

where W_out is the WDF of the input signal propagated over some distance z. Plotting the WDFs of Eqs. (10) and (11) shows that free-space propagation of the signal results in an x-axis shearing of the signal's WDF (Fig. 6).

The WDFs in Fig. 6 extend to infinity along both the space and spatial-frequency axes. If we take the WDF and confine it to within a certain range of ν, outside of which the function vanishes, the corresponding signal will also be limited to the same frequency region due to the finite support property. Another important property that we make use of is

$$|u(x)|^2 = \int W(x,\nu)\,d\nu, \qquad (12)$$

which says that the projection of the WDF onto the ν axis yields the energy density, or intensity, of the signal u(x).

Equipped with these properties, we now proceed to answer the question we put forward earlier. Remember that the CGH we produce is actually the spectrum of the complex light field [Eq. (6)], quantized and truncated to be displayed on the spatial light modulator (SLM). The finite computation window is set accordingly. More explicitly, the sizes of the WSF in Eq. (2) and of the FFTs in Eq. (5) are determined by the number of SLM pixels. Given the maximum spatial frequency ν_max of this truncated spectrum on the SLM and the input-output relationship (11), the output WDF for an impulse input is

$$W_{\mathrm{out}}(x,\nu) = \begin{cases} W_{\mathrm{in}}\!\left(x - \dfrac{2\pi\nu z}{\sqrt{k^2-(2\pi\nu)^2}},\ \nu\right), & \nu\in[-\nu_{\max},\,\nu_{\max}],\\[1ex] 0, & \nu\notin[-\nu_{\max},\,\nu_{\max}]. \end{cases} \qquad (13)$$

The frequency restriction, in turn, leads to the spatial restriction

$$x_{\max} = \pm\,\frac{2\pi\nu_{\max}\,z}{\sqrt{k^2-(2\pi\nu_{\max})^2}}, \qquad (14)$$

within which W_out is nonzero. The projection formula (12) dictates that the output intensity is nonzero only within this same space interval. This suggests that the spreading of light from a bandlimited point source propagating through free space is restricted to a finite space interval or, since the limit is governed by the spatial frequency ν_max, to within a cone angle θ. Angle θ is related to the spectrum through the diffraction formula

$$\sin\theta = \lambda\nu. \qquad (15)$$

To verify the analysis above, we simulated free-space propagation by solving Eq. (5) in MATLAB. Figure 7 shows the intensity in the hologram plane due to a point source on the optical axis at z_d = 5 cm, plotted against the horizontal distance x. For a computation window of 1280 samples with a 13.62 μm sampling distance, angle θ is

$$\theta = \sin^{-1}\!\left(\frac{640\times 633\,\mathrm{nm}}{1280\times 13.62\,\mu\mathrm{m}}\right) = 1.33°, \qquad (16)$$

and x_max = ±85.23 × 13.62 μm. From the plot, it can be seen that the intensity drops off sharply around x_max, having fallen by 6.5 dB at x_max, which supports the validity of our WDF analysis. Samples further out may be considered insignificant and discarded. A simple geometric relation reveals how angle θ relates to the sampling ray bundle (Fig. 8), and we conclude that 2θ can be taken as the limiting cone angle of the sampling ray bundle with a high degree of accuracy. Such a cone angle means a total of 160 sampling rays per bundle at the 1/60 degree angular sampling pitch of our system.

Fig. 6. Wigner distribution function of (a) a point source and (b) the same point source propagated over a distance z in free space. The dotted lines mark the spatial-frequency limit imposed by the finite computation window.

Fig. 7. Intensity profile of the light field in the hologram plane originating from a band-limited point source 5 cm away on the optical axis.


Evidently, the angle limitation places a restriction on the field of view (FOV), but as will become clear, this is in fact an inherent restriction of the SLM. What we have done is remove the processing of information that would be rendered redundant by the SLM anyway. Indeed, the same FOV restriction exists even if the CGH is generated with the conventional wave propagation method using FFTs. This limitation on the cone angle is also discussed briefly in [15]; although we took a slightly different route in our derivation, we reach the same conclusion.
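The numbers in Eqs. (14)-(16) are easy to reproduce; a small sketch using the system values quoted above (1280 samples, 13.62 μm pitch, 633 nm), with variable names of our own choosing:

```python
import numpy as np

wavelength = 633e-9                 # [m]
N, pitch = 1280, 13.62e-6           # computation window and sampling distance
z = 0.05                            # point source distance [m]

nu_max = (N / 2) / (N * pitch)      # maximum spatial frequency of the window
theta = np.degrees(np.arcsin(wavelength * nu_max))    # Eqs. (15)/(16): ~1.33 deg
k = 2 * np.pi / wavelength
x_max = 2 * np.pi * nu_max * z / np.sqrt(k**2 - (2 * np.pi * nu_max)**2)  # Eq. (14)

rays_per_bundle = round(2 * theta / (1 / 60))  # 2θ cone at 1/60 deg pitch: ~160
print(theta, x_max / pitch, rays_per_bundle)   # ~1.33, ~85 samples, 160
```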

C. Precomputed Lookup Tables

To avoid performing expensive computation on the fly, Eq. (5) could be precomputed offline and the results stored in a lookup table. Unfortunately, a lookup table that stores the complex amplitudes of all samples for every possible object point location would take up an impractically large amount of computer memory. However, a closer inspection reveals that the magnitude and random phase terms in Eq. (1) can be taken out of the forward and inverse Fourier transform operations in Eq. (5), giving

$$h[x_0] = A\,e^{-j\beta}\left\{\mathcal{F}^{-1}\!\left\{\bigl(\mathcal{F}\{\delta(x)\}\times T\bigr)\times\operatorname{comb}\!\left(\frac{\nu-n}{W}\right)\times\operatorname{rect}\!\left(\frac{\nu}{W}\right)\right\}\right\}. \qquad (17)$$

The term inside the outer curly brackets in Eq. (17) can be precomputed for a range of z_d values; we call these the base distributions. A base distribution is a vector containing the complex amplitudes of light along the x axis in the hologram plane, where the light originates from a point source of zero initial phase and unit magnitude at a distance z_d away on the optical axis. The creation of the lookup table is inherently a sampling of the 3D volume, and for convenience we choose a sampling grid identical to the SLM's pixel grid. A corresponding lookup table of random phases is also created.

The lookup tables are calculated during initialization of the program on the CPU and then loaded onto the GPU as textures. During online computation, once the set of visible primitives is found after the rendering pass, the fragment shader retrieves the appropriate entry from texture memory to evaluate the hologram sample value according to Eq. (17). In this way, we have reduced the amount of online computation to a minimum while keeping the lookup tables to a reasonable size.

D. Space Domain Versus Frequency Domain

Recall that we require one Fourier transform in the final step of the hologram generation process. Alternatively, we could work directly in the Fourier domain by omitting the inverse Fourier transform step in the computation of the base distributions [see Eq. (17)], obtaining what we call the base fringes. In the Fourier domain, a primitive corresponds to a fringe pattern with the same dimensions as the hologram, which means that an entire base fringe must be processed for every visible primitive. By contrast, in the spatial domain a primitive visible from a sampling point in the hologram plane contributes only to that particular sample, so only one sample value (one element of the base distribution vector) needs to be processed; the contribution of individual samples to the rest of the hologram is accounted for later in the final Fourier transform step. Since processing a fringe the size of the hologram for every visible primitive requires more operations than performing a single FFT, working with base distributions in the spatial domain is preferred.

The complex hologram is then quantized into a binary phase hologram according to Eq. (18) so that it can be displayed on our SLM. It may seem hard to believe that such coarse quantization can lead to a decent reconstruction of the image on the retina. Nevertheless, it is well known that the Fourier phase of a signal or image holds much of the essential information about the signal [25,26], which suggests that high-quality reconstructions from binary Fourier phase data are possible:

$$H_{\mathrm{bi}}[\nu] = \begin{cases} 1, & \arg(H[\nu]) > 0,\\ 0, & \arg(H[\nu]) \le 0. \end{cases} \qquad (18)$$

E. Hologram Resolution and Size

The discrete nature of the SLM imposes hologram sampling. The sampling process is modeled mathematically in Eq. (5) by multiplication with the comb function, the sampling distance being the SLM's pixel pitch. Furthermore, the finite width (and height) of the SLM implies a window function, represented by the rect function in Eq. (5). Ideally, the SLM would have infinite width, producing a Dirac delta in the Fourier plane, in other words, a perfect image point in the replay field. However, the Fourier transform of the finite window function is a sinc function, resulting in some blurring of the image point in the replay field.

In our reconstruction setup, the eye lens acts as a Fourier lens that projects the reconstructed image directly onto the retina.

Fig. 8. The angle made by the bounding rays of two arbitrary points P1 and P2 with spread angle 2θ at sample S must also be 2θ.


Since the size of the replay field in the far field is given by

$$w = \frac{2\lambda f}{\Delta}, \qquad (19)$$

where w is the dimension of the replay field, f is the focal length of the eye lens, and Δ is the SLM's pixel pitch, the size of the reconstructed image the observer sees is inversely proportional to the SLM's pixel pitch. If there are M horizontal pixels in the SLM, the replay field has a horizontal pixel pitch α of approximately

$$\alpha \approx \frac{2\lambda f}{M\Delta}. \qquad (20)$$
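As a quick numeric check of Eqs. (19) and (20) with the paper's SLM parameters; the 17 mm eye focal length is an assumed nominal value, not from the paper:

```python
wavelength = 633e-9   # [m]
f = 17e-3             # focal length of the eye lens [m] (assumed nominal value)
delta = 13.62e-6      # SLM pixel pitch [m]
M = 1280              # horizontal SLM pixels

w = 2 * wavelength * f / delta    # Eq. (19): replay field width, ~1.58 mm
alpha = w / M                     # Eq. (20): replay field pixel pitch, ~1.23 um
print(w, alpha)
```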

F. Summary

Given an object description u(x, y, z) that specifies the amplitude of light, the implementation of our algorithm can be summarized in the following steps (a code sketch follows the list):

1. Transform u(x, y, z) by the shear matrix

$$\begin{bmatrix} 1 & 0 & \tan\theta \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

2. The transformed object u′(x, y, z) is fed to the graphics rendering pipeline. The rendering pipeline samples the continuous u′(x, y, z) with a 2D sampling grid [m, n]. For each grid element, z buffering finds the closest object point (if any).

3. Fetch the appropriate entries from the lookup tables (LUT) for each grid element. The index into the lookup tables is found by

$$\text{index} = (\mathrm{int})\,z\tan\theta, \qquad (21)$$

which is just the amount of horizontal shear in number of grid samples. The z value is read from the z buffer of that particular grid element.

4. Multiply the fetched base distributions by the amplitude array u′[m, n] and the random phase to obtain the complex amplitude in the hologram plane:

$$h[m,n] = \text{base\_dist\_LUT}[\text{index}]\times u'[m,n]\times \exp\bigl(j\,\text{random\_phase\_LUT}[\text{index}+m,\,n]\bigr). \qquad (22)$$

5. Repeat Steps 1-4 for all θ, where θ = {−θ_max, −θ_max + Δθ, −θ_max + 2Δθ, …, θ_max}, summing the complex amplitudes h[m, n] over the iterations.

6. Finally, after all iterations, the CGH is obtained by an FFT operation and binarized according to Eqs. (6) and (18).

5. Results and Discussion

This work was performed on an NVIDIA GeForce 8800GT graphics card with 512 Mbytes of memory. The SLM used is a reflective binary phase device with 1280 × 1024 pixels and a 13.62 μm pixel pitch. Figure 9 shows the results of optical reconstruction. The slight blurriness in the images is mainly due to the apodization of the laser beam and the SLM. The laser beam has a Gaussian profile, while the finite extent of the SLM can be expressed mathematically as a rect function; the shape of a pixel after Fourier reconstruction is thus a combination of Gaussian and sinc rather than a well-defined point.

The performance of our algorithm is shown in Table 1.

Fig. 9. (Color online) Optically reconstructed images of a bunnyand teapot exhibiting geometric occlusion.

Table 1. Hologram Generation Time in Seconds

                                      CGH Computation
Number of       Initialization   Solid Cube    Solid Teapot    Bunny in a Box
Sampling Rays                    (12 Quads)    (2910 Quads)    (15306 Triangles)
266             0.5              1.37          1.90            2.47
532             0.5              2.28          3.95            4.48
1064            0.5              4.16          7.01            8.45
2128            0.5              7.79          14.37           16.53


The initialization process occurs only once at the start of the program and is not invoked again by subsequent changes to the 3D scene and the CGH. The GPU process includes essentially all of the computation except initialization; it took only about 2 s without any optimization, a very promising result.

Note that we do not use individual 3D points to build up the model. The model is constructed in OpenGL in the normal way, and point sampling only begins with the iterative computation process. Therefore, the complexity of our algorithm is not a simple function of the number of primitives. The number of depth-buffer comparisons will, of course, have an impact on speed, but since this process is heavily hardware accelerated, it is relatively less significant than the other factors discussed in Section 4.

If the sampling rays are few and sparsely distributed, the sampling process returns a disjoint set of object points to each hologram sample. Consequently, the reconstruction exhibits disturbing artefacts and, in severe cases, turns the original continuous surface model into a collection of discrete points. We have taken the angular resolution of the human eye as a reasonable angular sampling distance and derived a limit for the maximum cone angle. In our experiment, we varied the sampling cone angle and the angular distance between sampling rays and captured the optical reconstruction results; these are compiled in Fig. 10. We conclude that the improvement in image quality from increasing the cone angle and/or the sampling resolution beyond 2.66° and 0.01°, respectively, is largely indistinguishable to the human eye. The apodization of the hologram discussed earlier helps to somewhat reduce the graininess of the reconstruction through smooth blurring functions.

Fig. 10. (Color online) Optical reconstructions of CGHs generated with different sampling ray combinations.

Motion parallax, another important cue in depth perception, is demonstrated in Fig. 11. The parallax effect shown is somewhat limited owing to the limited SLM area. This problem is itself a focus of ongoing research [27-31] but is beyond the scope of this paper. We note, however, that an increased viewing angle would come at the price of longer computation time, as discussed in Section 4.

Fig. 11. (Color online) Reconstructed images of a wireframe cube exhibiting horizontal parallax. The camera was shifted horizontally from (a) left to (b) right.

6. Conclusions

We have described and demonstrated a method of rapidly generating CGHs whose reconstructed images exhibit geometric occlusion and motion parallax. Occlusion processing has been a formidable task in computational holography and has largely been ignored in the past. Our method takes advantage of hardware-accelerated depth buffering and data-parallel computing on a GPU to tackle this problem. Studying the intensity profile of a band-limited impulse signal has allowed us to remove redundant information processing. We were able to further reduce the computation time by building efficient lookup tables that avoid expensive online computation.

Compared to other CGH algorithms, our approach is able to handle large 3D scenes efficiently because its complexity scales well with the number of primitives making up the scene. We have shown that the performance is explicitly linked to system parameters such as the viewing angle, the number of sampling rays, and the angular sampling distance. This makes application-specific optimization straightforward.

With our current proof-of-concept implementation, we are able to demonstrate clear reconstructed images at a near-interactive rate, and we believe this method has much potential for real-time 3D display.

This work was supported by Alps Electric through the CAPE Research Projects funding. We thank Jon Freeman, Adrian Travis, and Christoph Bay for their help and valuable discussions, and Wesley Hsu and Rongjun Chen for their help with photography.

References

1. M. Lucente, "Diffraction-specific fringe computation for electro-holography," Ph.D. dissertation (MIT, 1994).


2. S.-C. Kim, J.-H. Yoon, and E.-S. Kim, "Fast generation of three-dimensional video holograms by combined use of data compression and lookup table techniques," Appl. Opt. 47, 5986-5995 (2008).

3. W. Plesniak, "Incremental update of computer-generated holograms," Opt. Eng. 42, 1560-1571 (2003).

4. B. Munjuluri, M. L. Huebschman, and H. R. Garner, "Rapid hologram updates for real-time volumetric information displays," Appl. Opt. 44, 5076-5085 (2005).

5. M. Pharr and R. Fernando, GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation (Addison-Wesley Professional, 2005).

6. N. Masuda, T. Ito, T. Tanaka, A. Shiraki, and T. Sugie, "Computer generated holography using a graphics processing unit," Opt. Express 14, 603-608 (2006).

7. L. Ahrenberg, P. Benzie, M. Magnor, and J. Watson, "Computer generated holography using parallel commodity graphics hardware," Opt. Express 14, 7636-7641 (2006).

8. M. Janda, I. Hanak, and V. Skala, "HPO hologram synthesis for full-parallax reconstruction setup," in Proceedings of IEEE 3DTV Conference 2007 (IEEE, 2007), pp. 1-4.

9. D. Leseberg, "Computer-generated three-dimensional image holograms," Appl. Opt. 31, 223-229 (1992).

10. K. Matsushima, "Computer-generated holograms for three-dimensional surface objects with shade and texture," Appl. Opt. 44, 4607-4614 (2005).

11. K. Matsushima, "Exact hidden-surface removal in digitally synthetic full-parallax holograms," Proc. SPIE 5742, 25-32 (2005).

12. R. Ziegler, S. Croci, and M. Gross, "Lighting and occlusion in a wave-based framework," Comput. Graph. Forum 27, 211-220 (2008).

13. L. Ahrenberg, P. Benzie, M. Magnor, and J. Watson, "Computer-generated holograms from three-dimensional meshes using an analytic light transport model," Appl. Opt. 47, 1567-1574 (2008).

14. H. Kim, J. Hahn, and B. Lee, "Mathematical modeling of triangle-mesh-modeled three-dimensional surface objects for digital holography," Appl. Opt. 47, D117-D127 (2008).

15. M. Janda, I. Hanak, and L. Onural, "Hologram synthesis for photorealistic reconstruction," J. Opt. Soc. Am. A 25, 3083-3096 (2008).

16. Q. Y. J. Smithwick, J. Barabas, D. E. Smalley, and V. M. Bove, Jr., "Real-time shader rendering of holographic stereograms," Proc. SPIE 7233, 723302 (2009).

17. H. Kang, T. Yamaguchi, H. Yoshikawa, S.-C. Kim, and E.-S. Kim, "Acceleration method of computing a compensated phase-added stereogram on a graphic processing unit," Appl. Opt. 47, 5784-5789 (2008).

18. E. N. Leith and J. Upatnieks, "Wavefront reconstruction with diffused illumination and three-dimensional objects," J. Opt. Soc. Am. 54, 1295-1301 (1964).

19. B. Saleh and M. Teich, Fundamentals of Photonics (Wiley, 1991), p. 117.

20. N. Delen and B. Hooker, "Free-space beam propagation between arbitrarily oriented planes based on full diffraction theory: a fast Fourier transform approach," J. Opt. Soc. Am. A 15, 857-867 (1998).

21. J. W. Goodman, Introduction to Fourier Optics, 2nd ed. (McGraw-Hill, 2005), pp. 55-58.

22. J. S. Underkoffler, "Occlusion processing and smooth surface shading for fully computed synthetic holography," Proc. SPIE 3011, 53-60 (1997).

23. M. J. Bastiaans, "Application of the Wigner distribution function in optics," in The Wigner Distribution: Theory and Applications in Signal Processing, W. Mecklenbräuker and F. Hlawatsch, eds. (Elsevier, 1997), pp. 375-426.

24. A. Torre, Linear Ray and Wave Optics in Phase Space (Elsevier, 2005).

25. S. Curtis, A. Oppenheim, and J. Lim, "Signal reconstruction from Fourier transform sign information," IEEE Trans. Acoust. Speech Signal Process. 33, 643-657 (1985).

26. T. T. Huang and J. C. Sanz, "Image representation by one-bit Fourier phase: theory, sampling, and coherent image model," IEEE Trans. Acoust. Speech Signal Process. 36, 1292-1304 (1988).

27. T. Mishina, M. Okui, and F. Okano, "Viewing-zone enlargement method for sampled hologram that uses high-order diffraction," Appl. Opt. 41, 1489-1499 (2002).

28. C. W. Slinger, C. D. Cameron, S. D. Coomber, R. J. Miller, D. A. Payne, A. P. Smith, M. G. Smith, M. Stanley, and P. J. Watson, "Recent developments in computer-generated holography: toward a practical electroholography system for interactive 3D visualization," Proc. SPIE 5290, 27-41 (2004).

29. A. Sugita, K. Sato, M. Morimoto, and K. Fujii, "Full-color holographic display with wide visual field and viewing zone," Proc. SPIE 6016, 60160Y (2005).

30. Y. Takaki and Y. Hayashi, "Increased horizontal viewing zone angle of a hologram by resolution redistribution of a spatial light modulator," Appl. Opt. 47, D6-D11 (2008).

31. R. H.-Y. Chen and T. D. Wilkinson, "Field of view expansion for 3-D holographic display using a single spatial light modulator with scanning reconstruction light," in Proceedings of IEEE 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video, 2009 (IEEE, 2009), pp. 1-4.
