
Detailed 3D Reconstruction of Large-Scale Heritage Sites with Integrated Techniques


Feature Article

Published by the IEEE Computer Society 0272-1716/04/$20.00 © 2004 IEEE IEEE Computer Graphics and Applications 21

Many cultural heritage applications require 3D reconstruction of real-world objects and scenes. Over the past few years, it has become increasingly common to use 3D digitization and modeling for this purpose. This is mainly due to advances in laser-scanning techniques, 3D modeling software, image-based modeling techniques, computer power, and virtual reality. Many approaches are currently available. The most common are based on surveys and CAD tools and/or traditional photogrammetry with control points and a human operator. However, this approach is time-consuming and can be costly and impractical for large-scale sites. Modeling methods based on laser-scanned data and more automated image-based techniques have recently become available.

Our approach integrates several technologies based on our experience over more than a decade of trying to accurately and completely model large-scale heritage monuments and sites. Using both interactive and automatic techniques, we can model a highly detailed structure or site at various levels of detail. We use image-based modeling for basic shape and structural elements, and laser scanning for fine details and sculpted surfaces. To present the site in its proper context, we use image-based rendering for landscapes and surroundings. Using this approach, we created hundreds of models from sites all over the world for documentation, walk-through movies, and interactive visualization. The results were compelling and encouraging.

Motivation and requirements

There are many motives for 3D reconstruction of heritage sites:

• documenting historic buildings and objects for reconstruction or restoration in case of fire, earthquake, flood, war, erosion, and so on;

• creating educational resources for history and culture students and researchers;

• reconstructing historic monuments that no longer or only partially exist;

• visualizing scenes from viewpoints impossible in the real world due to size or accessibility issues;

• interacting with objects without risk of damage; and

• providing virtual tourism and virtual museum exhibits.

In general, most applications specify eight requirements: high geometric accuracy, capture of all details, photorealism, high automation level, low cost, portability, application flexibility, and model size efficiency. The order of importance of these requirements depends on the application's objective, for example, whether it's for documentation or virtual tourism. A single system that can satisfy all eight requirements is still in the future. In particular, accurately capturing all details with a fully automated system for a wide range of objects and scenes remains elusive.

For small and medium objects, up to the size of an average adult person, range-based techniques such as laser scanners can provide accurate and complete details with a high degree of automation. However, being relatively new systems that aren't produced in large quantities, these scanners remain costly. They are also not portable enough for a single person to carry around and use like a video camera. Moreover, the resulting model can be inefficient for interactive visualization of large-scale scenes.

An integrated approach for selecting the most effective technique for modeling large-scale heritage sites combines several technologies to create accurate and complete models.

Sabry F. El-Hakim, J.-Angelo Beraldin, Michel Picard, and Guy Godin
National Research Council of Canada

Image-based approaches use widely available hardware. The same system can potentially capture a wide range of objects and scenes. Image-based approaches are also capable of producing realistic models, and those based on photogrammetry have high geometric accuracy. The issues that remain in image-based modeling are the capture of details on unmarked and sculpted surfaces and the fully automatic creation of the 3D models. Approaches such as image-based rendering1 that skip the geometric-modeling step might suffice for visualization and walk-throughs. However, the lack of geometric modeling makes them unsuitable for documentation and reconstruction applications.

Most documented projects on cultural heritage have used one method or another; only a few have used a combination of techniques. For example, a group from IBM2 combined structured-light 3D sensing and photometric stereo to model Michelangelo's Florentine Pietà. Researchers have also combined laser scanning with image-based modeling and rendering,3 image-based modeling with image-based rendering,4 and image-based rendering with laser scanning.5

Our approach integrates techniques as follows:

• We construct the basic shape and large regularly shaped details, such as columns, blocks, windows, and archways, from high-resolution digital images. This technique is based on advanced photogrammetry with several automated features that take advantage of properties found in classical architectures.

• We use laser scans to obtain fine geometric details, such as sculpted and irregularly shaped surfaces. Then we integrate this technique with the basic model created in the previous step.

• We obtain visual details in the geometric model from image textures and reflectance models.

• We use panoramas from aerial images to complete the surroundings and distant landscapes. This helps present the monument in its natural setting.

This combination of techniques satisfies most requirements, except that the cost of laser scanning is not as low as that of a fully image-based system, at least for now.

Overview of 3D construction techniques

A standard approach to creating a 3D model is to build it from scratch using tools such as CAD software, which offers building blocks in the form of primitive 3D shapes. Some survey data or measurements from drawings and maps are also necessary. However, this geometry-based modeling technique is time-consuming, impractical, and costly for large-scale projects. Although many applications apply this approach (even TV programs in Europe use it to render sites that no longer exist), the created models look computer generated rather than realistic. They also don't include fine details or irregular and sculpted surfaces. Several recent techniques aim to increase the level of automation and realism by starting with actual images of the object or directly digitizing it with a laser scanner.

Image-based modeling

This technique involves widely available hardware, so the same system can potentially handle a broad range of objects and scenes. Such systems can also produce realistic models, and those based on photogrammetry have high geometric accuracy. Deriving 3D measurements from images naturally requires that interest points be visible in the image. Often, this is not possible, either because a region is hidden or occluded behind an object or surface or because there is no mark, edge, or visual feature to extract. In objects such as monuments in their normal settings, restrictions also stem from there being limited locations from which to take the images and from the existence of other objects, shadows, and uncontrolled illumination.

The ultimate goal of all 3D reconstruction methods is to satisfy the eight requirements listed earlier. Because this is difficult, these methods typically focus on some of the tasks at the expense of others. Efforts to increase the level of automation have become essential to widen the use of the technology. However, approaches to completely automate the process from taking images to a 3D model's output, although promising, are thus far not always successful. Some of the steps, mainly the automation of camera-pose estimation and computation of 3D pixel coordinates, have worked well in many cases.

This automation, now prevalent in computer vision,6 starts with a sequence of images taken by an uncalibrated camera. The vision system extracts interest points (such as corners), sequentially matches them across views, and computes camera parameters and 3D coordinates of the matched points using robust techniques. This approach typically uses the first two images to initialize the sequence. Tracking the points over a long sequence is important for reducing error propagation. This all occurs in a projective geometry, usually followed by a bundle adjustment,7 also in the projective space. Self-calibration to compute the intrinsic camera parameters (usually only the focal length) follows to obtain metric reconstruction, up to scale, from the projective reconstruction.6 Again, the system typically applies bundle adjustment to the metric reconstruction to optimize the solution.
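Once camera poses are known, computing a matched point's 3D coordinates amounts to intersecting the two viewing rays. The following minimal sketch (not the authors' implementation; all names are illustrative) uses the midpoint of the shortest segment between the two rays, assuming known camera centers and unit ray directions:

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between two 3D viewing rays.

    c1, c2 are camera centers; d1, d2 are unit ray directions through
    the matched image points (all plain 3-element lists)."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    sub = lambda a, b: [x - y for x, y in zip(a, b)]
    add = lambda a, b: [x + y for x, y in zip(a, b)]
    mul = lambda a, s: [x * s for x in a]

    w = sub(c1, c2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b            # approaches 0 for parallel rays
    s = (b * e - c * d) / denom      # parameter along ray 1
    t = (a * e - b * d) / denom      # parameter along ray 2
    p1 = add(c1, mul(d1, s))         # closest point on ray 1
    p2 = add(c2, mul(d2, t))         # closest point on ray 2
    return mul(add(p1, p2), 0.5)
```

A full pipeline would instead triangulate linearly in homogeneous coordinates and refine by bundle adjustment; the midpoint method just makes the geometry concrete.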

The next step, the creation of the 3D model, is more difficult to automate. This step typically requires some human interaction to define the topology and edit or postprocess the output. For large structures and scenes, the technique could require many images. Therefore, creating the model requires significant human interaction even if camera-pose estimation was automatic and many 3D point coordinates were computed fully automatically.

The most impressive results remain those achieved with highly interactive approaches. Rather than use full automation, Debevec developed an easy-to-use hybrid system known as Façade.4 The method's main goal was to realistically create 3D models of architectures from a few photographs. First, the user interactively recovers the structure's basic geometric shape using models of polyhedral elements. In this step, the Façade system captures the actual size of the elements and camera pose, assuming the camera's intrinsic parameters are known. The second step is an automated matching procedure, constrained by the now-known basic model, to add geometric details. This approach proved effective for creating geometrically accurate and realistic models of architectures. The drawbacks are that it requires a high level of interaction and it's restricted to certain shapes. Also, because assumed shapes determine all 3D points and camera poses, the results are only as accurate as the underlying assumption that the structure elements match those shapes.

Our method, although conceptually similar, replaces basic shapes with a few seed points to achieve more flexibility and a higher level of detail. In addition, we determine the camera poses and 3D coordinates without any assumption about the shapes, but rather through a full photogrammetric bundle adjustment with or without self-calibration, depending on the given configuration. This achieves higher geometric accuracy independent of assumptions about the object's shape.

The Façade approach has inspired several research activities to automate it. For example, Werner and Zisserman8 proposed a fully automated Façade-like approach. Rather than using the basic shapes, their system creates the scene's principal planes automatically to assemble a coarse model. As in Façade, the coarse model guides a more refined polyhedral model of details such as windows, doors, and wedge blocks. Because this approach is fully automated, it requires feature detection, closely spaced images to ensure correct matching, and camera-pose estimation using projective geometry.

Range-based modeling

Three-dimensional measurement from images requires that interest points or edges be visible in the image, which is not always possible. Illumination or ambient light problems can also affect the extraction of such points and edges. Active sensors such as laser scanners avoid these limitations by creating features on the surface through controlled light projection. These sensors have the advantage of acquiring dense 3D points automatically.

Recent advances in lasers, charge-coupled device technology, and electronics make detailed shape measurements possible with accuracies better than 1 part per 5,000 at rates exceeding 10,000 points per second. The scanning and imaging configuration determines the point density. Many range sensors also produce organized points, in the form of an array or range image, suitable for automatic modeling. A single range image is usually insufficient to cover an object or structure. The number of necessary images depends on the object's shape, the amount of self-occlusion and obstacles, and the object's size compared to the sensor range. We must then register the 3D data in a single coordinate system. Several registration techniques are available; most are based on the iterative closest point (ICP) approach. For this approach to converge to the correct solution, it needs to start with the images approximately registered. This requires either knowledge of sensor positions or manual registration using features. Once we've registered the range images in a single coordinate system, they're ready for modeling. The modeling step reduces the large number of 3D points into a triangular mesh, which preserves the geometric details and is suitable for fast rendering.8 This process must integrate the areas where the images overlap to create a nonredundant mesh. Other requirements include filling holes and removing any outliers.
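The ICP idea described above can be illustrated in two dimensions. This is a toy sketch under strong assumptions (complete overlap, exact correspondences reachable by nearest-neighbor search), not the registration code used in practice:

```python
import math

def best_rigid_2d(src, dst):
    """Closed-form 2D rotation + translation aligning paired points src -> dst."""
    n = len(src)
    csx = sum(p[0] for p in src) / n; csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n; cdy = sum(p[1] for p in dst) / n
    sxx = sxy = 0.0
    for (x, y), (u, v) in zip(src, dst):
        ax, ay = x - csx, y - csy          # source point, centered
        bx, by = u - cdx, v - cdy          # destination point, centered
        sxx += ax * bx + ay * by           # cosine accumulator
        sxy += ax * by - ay * bx           # sine accumulator
    theta = math.atan2(sxy, sxx)
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, (tx, ty)

def icp_2d(src, dst, iters=20):
    """Minimal ICP: match each source point to its nearest destination
    point, solve the rigid transform, apply it, and repeat. As noted
    above, it needs a decent initial registration to converge."""
    cur = [list(p) for p in src]
    for _ in range(iters):
        pairs = [min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
                 for p in cur]
        theta, (tx, ty) = best_rigid_2d(cur, pairs)
        c, s = math.cos(theta), math.sin(theta)
        cur = [[c * x - s * y + tx, s * x + c * y + ty] for x, y in cur]
    return cur
```

Production systems work on dense 3D scans, reject outlier pairs, and accelerate the nearest-neighbor search with spatial data structures.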

There are two main types of range sensors: triangulation based and those based on the time-of-flight principle. Triangulation-based sensors project light in a known direction from a known position and measure the direction of the returning light through its detected position. Measurement accuracy depends on the size of the triangle's base relative to its height. Because the triangle base is rather short (for practical reasons), such systems have a limited range of less than 10 meters (in fact, most are less than 3 meters). Sensors based on the time-of-flight principle measure the delay between emission and detection of the light reflected by the surface, and thus their accuracy does not rapidly deteriorate as the range increases. Time-of-flight sensors can provide measurements in the kilometer range.
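The trade-off between the two sensor types follows directly from their measurement equations, sketched here with a first-order error model (illustrative names and numbers, not vendor specifications):

```python
C = 299_792_458.0  # speed of light, m/s

def tof_range(delay_s):
    """Round-trip time-of-flight to range: r = c * t / 2."""
    return C * delay_s / 2.0

def triangulation_depth(f_px, baseline_m, disparity_px):
    """Triangulation/stereo depth: z = f * B / d."""
    return f_px * baseline_m / disparity_px

def triangulation_depth_error(f_px, baseline_m, z_m, disp_err_px):
    """First-order depth error, dz ~ z^2 / (f * B) * dd: grows with z^2,
    which is why triangulation sensors are short range."""
    return z_m * z_m / (f_px * baseline_m) * disp_err_px
```

With, say, f = 2,000 pixels, a 0.3-m baseline, and a 0.25-pixel disparity error, the depth error is roughly 4 mm at 3 m but roughly 0.4 m at 30 m, whereas the time-of-flight error (c * dt / 2) stays constant with range.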

Notwithstanding the advantages of range sensors, they do have some drawbacks. At the moment, accurate systems are costly and bulky, and surface-reflective properties and ambient light affect those that don't use lasers. Range sensors can also be complex to operate and calibrate. In addition, they're intended for a specific range, so one designed for close range is not suitable for long range, and vice versa. For large-scale environments, using a range sensor to model the entire scene can generate a huge amount of data and require considerable effort to register the many scans.

Image-based rendering

IBR uses images directly to generate new views for rendering without explicit geometric representation.1 This has the advantage of creating realistic virtual environments at speeds independent of scene complexity. IBR relies on either accurately knowing the camera positions or using automatic stereo matching. In the absence of geometric data, success of the latter requires many closely spaced images. Object occlusions and discontinuities, particularly in large-scale and geometrically complex environments, also affect the output. The ability to move freely into the scene and view objects from any position can be limited, depending on the method used. It's therefore unlikely that IBR will be the approach of choice for purposes other than limited visualization. For tourists satisfied with general visualization, this approach might be adequate, but for historians and researchers, and of course for documentation, correct geometric details are essential.

Combining multiple techniques

From the preceding summary of current techniques, it's obvious that none can satisfy all the requirements of large-scale environments for applications such as cultural heritage. Although laser scanning provides all the details, it's usually not practical as the only technique for every object and structure. Large buildings, for example, require many scans and produce a huge number of points even for flat surfaces. Moreover, image-based modeling alone has difficulty with irregular and sculpted surfaces. It's also important to develop an approach that requires only a few widely separated views while simultaneously offering a high level of automation, and that can handle occluded and unmarked surfaces.

Therefore, the logical solution is to use image-based techniques to determine the basic shapes, and laser scanning to determine the fine details. In Figure 1, most of the structure is easy to model through images taken with a digital camera. However, parts of the surface (such as the enlarged section shown) contain fine geometric details that are difficult or impractical to model from images. It's best to acquire those parts using a laser scanner and then add them to the global model created from the images. This involves matching and integrating local 3D points obtained from the scanner with the global model. We measure several features, usually 8 to 10 points, using the images. Then we extract the 3D coordinates of the same features from the scanned data. We do this interactively using intensity images generated by the laser scanner. We then use the transformation parameters to register the two data sets in one coordinate system.

Details of our technique

Here we describe each of the approaches we developed to create models from digital images, range data, and the integration of the two.

Semiautomatic image-based modeling

We conceived this approach mainly for human-made objects such as classical architectures, whose designs are often constrained by proportion and configuration. Classical buildings are divided into architectural elements. These elements are logically organized hierarchically in space to produce the full structure. As long as we know some of a classical architecture's components, we can reconstruct it, even if images only partially reveal them. For example, a columnar element consists of the capital, a horizontal member on top; the column itself, a long, vertical tapered cylinder; and a pedestal or base on which the column rests. We can further divide each of these into smaller elements. In addition to columns, other elements include pillars, pilasters, banisters, windows, doors, arches, and niches. We can reconstruct each with a few seed points, from which we can build the rest of the element.

Our approach, which is photogrammetry based, aims neither to be fully automated nor to rely completely on a human operator. It provides enough automation to help the user without sacrificing accuracy or detail. Figure 2 summarizes the procedure and indicates which steps are interactive and which are automatic (interactive operations are orange). The figure also shows an option for taking a closely spaced sequence of images, if conditions allow, and increasing the level of automation. Here, we discuss only the option of widely separated views, which is more practical for large-scale scenes. We focus on


1 Scanned section (shown enlarged) on the facade of the Abbey of Pomposa.

2 General procedure for image-based modeling. [Flowchart: imaging, followed by an options selection between widely separated views and a closely spaced sequence; subsequent steps include initial point extraction, registration, segmentation, detailed feature extraction (fed by seed points, element properties, and constraints), adding shapes and intersections, and modeling and texturing. Interactive steps are shown in orange; the rest are automatic.]

images with known camera setups. There should be a reasonable distance, or baseline, between the images to guarantee a strong geometric configuration. We interactively extract several features appearing in multiple images, usually 12 to 15 per image. We point to a corner and label it with a unique number, and the system can accurately extract the corner point. We use a Harris operator for its simplicity and efficiency. We base image registration and 3D coordinate computation on photogrammetric bundle adjustment because of its accuracy, flexibility, and effectiveness, and because we wanted to provide the optimum solution compared to other structure-from-motion techniques.7

Advances in bundle adjustment eliminated the need for surveyed points or for manually entering initial approximate coordinates. (Photogrammetry has successfully tackled many other aspects required for high accuracy, such as camera calibration with full-distortion corrections, which we don't discuss here.) The bundle adjustment lets us register all the camera coordinates and orientations, as well as the 3D coordinates of the initial points, in one global coordinate system. The next interactive operation involves dividing the scene into connected segments to define the surface topology. This is followed by an automatic corner-extraction and matching procedure across the images to add more points to each of the segmented regions. The epipolar condition and the disparity range set up from the initial points' 3D coordinates constrain the matching within a segment. Bundle adjustment repeats with the newly added points to improve on previous results and recompute more accurate 3D coordinates of all points from all the images where they appear.
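The epipolar test that constrains the matching can be sketched as follows (a simplified illustration assuming a known fundamental matrix F; the disparity-range check and robust estimation are omitted):

```python
def epipolar_residual(F, x1, x2):
    """Algebraic epipolar residual x2^T * F * x1 for homogeneous pixels
    x = (u, v, 1). Near zero for a geometrically consistent match."""
    Fx1 = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return sum(x2[i] * Fx1[i] for i in range(3))

def filter_matches(F, candidates, tol=1e-3):
    """Keep only candidate pixel pairs consistent with the epipolar
    constraint; the disparity-range test would be layered on top."""
    return [(x1, x2) for x1, x2 in candidates
            if abs(epipolar_residual(F, x1, x2)) < tol]
```

For a purely horizontal stereo translation, F reduces to a form where the residual is just the row difference v1 - v2, which makes the constraint easy to check by hand.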

In addition to the multi-image approach, an approach to obtain 3D coordinates from a single image is essential because some parts of the scene appear only in one image. It's also necessary to cope with occlusions and a lack of features. Our approach uses several types of constraints for surface shapes such as planes and cylinders, and surface relations such as parallelism, perpendicularity, and symmetry. We determine the equations of some of the planes from seed points previously measured. We determine the equations of the remaining planes using the knowledge that they are either perpendicular or parallel to the planes already determined. With little effort, we can compute the equations of the structure's main planes and those to which other structural elements attach. From these equations and the known camera parameters for each image, we can determine 3D coordinates of any point or pixel from a single image, even without any marking on the surface. When some plane boundaries aren't visible, we can compute them using plane intersections. We can also apply this technique to surfaces such as quadrics or cylinders, whose equations we can compute from existing points. We can also use other constraints, such as symmetry and points with the same depth or height. The general rule for adding points on structural elements and for generating points in occluded or symmetrical parts is to do the work in 3D space to find new points to complete the shape, then project them onto the images using the known camera parameters. Figure 3 shows the main steps.
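The single-image computation reduces to intersecting a pixel's viewing ray with a known plane. A minimal sketch (illustrative names; the plane equation is assumed already recovered from seed points and constraints):

```python
def backproject_to_plane(cam_center, ray_dir, plane):
    """Intersect the viewing ray X = C + s * d with plane n.X + k = 0.

    Once the plane equation is known, any pixel lying on that plane
    yields a full 3D point from a single image."""
    n, k = plane[:3], plane[3]
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    denom = dot(n, ray_dir)
    if abs(denom) < 1e-12:
        raise ValueError("ray is parallel to the plane")
    s = -(dot(n, cam_center) + k) / denom
    return [c + s * d for c, d in zip(cam_center, ray_dir)]
```

The ray direction comes from the pixel coordinates and the calibrated camera parameters; here it's taken as given.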

As Figure 4a shows, the system constructs a cylinder after automatically determining its direction, radius, and position from four seed points. The user can set the ratio between the upper and the lower circle in advance. We set the default to about 0.85 to create a tapered column. From this information, the system can automatically generate in 3D the points on the column's top and bottom circles, resulting in a complete model, as Figure 4b shows.
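The column construction can be sketched as generating the two circles once the axis, radius, and height are known. This is an illustrative reimplementation, not the authors' code, and the seed-point fitting itself is omitted; taper=0.85 mirrors the default upper/lower ratio mentioned above:

```python
import math

def column_circles(base_center, axis_dir, radius, height, taper=0.85, n=24):
    """Generate bottom- and top-circle points of a tapered column."""
    # normalize the axis
    norm = math.sqrt(sum(a * a for a in axis_dir))
    ax = [a / norm for a in axis_dir]
    # build two unit vectors u, v spanning the plane normal to the axis
    helper = [1.0, 0.0, 0.0] if abs(ax[0]) < 0.9 else [0.0, 1.0, 0.0]
    u = [ax[1] * helper[2] - ax[2] * helper[1],
         ax[2] * helper[0] - ax[0] * helper[2],
         ax[0] * helper[1] - ax[1] * helper[0]]
    un = math.sqrt(sum(x * x for x in u))
    u = [x / un for x in u]
    v = [ax[1] * u[2] - ax[2] * u[1],
         ax[2] * u[0] - ax[0] * u[2],
         ax[0] * u[1] - ax[1] * u[0]]
    top_center = [c + height * a for c, a in zip(base_center, ax)]

    def circle(center, r):
        return [[center[i] + r * (math.cos(t) * u[i] + math.sin(t) * v[i])
                 for i in range(3)]
                for t in (2 * math.pi * j / n for j in range(n))]

    # top circle is shrunk by the taper ratio to give the tapered shape
    return circle(base_center, radius), circle(top_center, radius * taper)
```

The generated circle points are what get projected back onto the images for texture mapping, as in Figure 3c.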

For windows and doors, we need three (preferably four) corner points and one point on the main surface (see Figure 3). By fitting a plane to the corner points, and a plane parallel to it at the surface point, we can reconstruct the complete window or door. (An earlier article provides more details of this approach.9) Figure 5 (next
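The window/door reconstruction rests on two small pieces of plane geometry, sketched here (illustrative and noise-free; with four corners, a real implementation would fit the plane in a least-squares sense):

```python
def plane_from_points(p0, p1, p2):
    """Plane n.X + k = 0 through three corner points (normal via cross
    product of two edge vectors)."""
    sub = lambda a, b: [x - y for x, y in zip(a, b)]
    e1, e2 = sub(p1, p0), sub(p2, p0)
    n = [e1[1] * e2[2] - e1[2] * e2[1],
         e1[2] * e2[0] - e1[0] * e2[2],
         e1[0] * e2[1] - e1[1] * e2[0]]
    k = -sum(a * b for a, b in zip(n, p0))
    return n, k

def parallel_plane_through(n, point):
    """Plane with the same normal passing through `point`; here, the
    wall-surface point that fixes the window's recess depth."""
    k = -sum(a * b for a, b in zip(n, point))
    return n, k
```

The window outline lives on the first plane; the second, parallel plane anchors it to the wall surface, which is what completes the element from a single view.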


3 Main steps for constructing architectural elements semiautomatically (column and door/window examples): (a) extract in multiple images, match, and compute seed points' 3D coordinates; (b) in 3D space, reconstruct the object from the seed points; and (c) create a full-textured 3D model and project the new 3D points onto the image for texture mapping.

4 The system (a) extracts four seed points on the base and crown, then (b) automatically adds column points.

page) shows examples of models created using this approach. We measured about 20 percent of these points interactively as seed points, and the system created the remaining 80 percent automatically.

Range-based modeling and texturing

Figure 6 outlines the steps for creating a triangular-mesh model from 3D images. If the 3D data is a set of registered images, it's easy to create a triangular mesh by simply triangulating each image. However, there is often sizeable overlap between the images from different views, so a mesh created this way has many redundant faces. Because we wanted to create a nonredundant mesh with no overlapping faces, we adopted a technique developed over the years at our laboratory and at InnovMetric Software.10 InnovMetric Software implemented this technique in the Polyworks commercial software.
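Triangulating a single organized range image, before any merging of overlapping views, can be sketched as follows (a simplified illustration; the nonredundant-mesh integration implemented in Polyworks is far more involved):

```python
def range_grid_to_mesh(depth, valid=lambda z: z is not None):
    """Triangulate an organized range image (2D grid of depth samples)
    into a triangular mesh: two triangles per grid cell whose four
    corners all carry valid range data. Vertex indices are row-major."""
    rows, cols = len(depth), len(depth[0])
    idx = lambda r, c: r * cols + c
    faces = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            corners = [depth[r][c], depth[r][c + 1],
                       depth[r + 1][c], depth[r + 1][c + 1]]
            if all(valid(z) for z in corners):
                # split the quad cell into two triangles
                faces.append((idx(r, c), idx(r, c + 1), idx(r + 1, c)))
                faces.append((idx(r, c + 1), idx(r + 1, c + 1), idx(r + 1, c)))
    return faces
```

Cells touching a dropout (an invalid sample) produce no triangles, which is where the hole-filling requirement mentioned above comes from.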

Most laser scanners focus only on acquiring the geometry. They usually provide only a monochrome intensity value for each pixel as sensed by the laser. Acquiring a realistic look in the model requires texture maps obtained from a high-resolution color digital camera. Some scanners have a color camera attached to them in a known configuration so that the acquired texture is always registered with the geometry. However, this approach might not provide the best results, because the ideal conditions for taking the images might not coincide with the best conditions for scanning. Therefore, our approach allows taking images at a different time than during scanning, and at whatever locations and lighting conditions (controlled illumination) are best for texture.11

Combining image- and range-based modeling

We use the semiautomatic image-based approach to model the entire structure without the fine details and sculpted surfaces. We can use the approach just described to separately model the sections requiring scanning. We use common points between the image- and range-based models to register them in the same coordinate system. We do this interactively using our own software, which can display and interact with images from various types of sensors and cameras. The next step is to automatically sample points from the range-based model along its perimeter and insert those into the image-based model. We adjust the image-based model's triangulated mesh on the basis of those new points to create a hole into which we add the range-based model so that there are no overlapping triangles.
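Computing the transformation that registers the range-based patch to the image-based model can be illustrated with the minimal case of three matched points (a noise-free sketch with illustrative names; with the 8 to 10 common points used in practice, one would solve a least-squares problem instead):

```python
import math

def frame_from_points(p0, p1, p2):
    """Right-handed orthonormal frame (rows are axes) from 3 points."""
    sub = lambda a, b: [x - y for x, y in zip(a, b)]
    cross = lambda a, b: [a[1] * b[2] - a[2] * b[1],
                          a[2] * b[0] - a[0] * b[2],
                          a[0] * b[1] - a[1] * b[0]]
    unit = lambda v: [x / math.sqrt(sum(y * y for y in v)) for x in v]
    x = unit(sub(p1, p0))
    z = unit(cross(x, sub(p2, p0)))
    y = cross(z, x)
    return [x, y, z]

def rigid_from_3_points(src, dst):
    """Rotation R and translation t with dst_i = R * src_i + t, from
    three matched, non-collinear feature points in the two models."""
    Fs, Fd = frame_from_points(*src), frame_from_points(*dst)
    # R maps the source frame onto the destination frame: R = Fd^T * Fs
    R = [[sum(Fd[k][i] * Fs[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [dst[0][i] - sum(R[i][j] * src[0][j] for j in range(3))
         for i in range(3)]
    return R, t
```

Applying the returned (R, t) to every vertex of the scanned patch brings it into the image-based model's coordinate system, after which the perimeter sampling and mesh adjustment described above take over.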

Landscape visualization

When images of the entire scene taken at large distances, such as aerial images, are available, we can represent the landscape and integrate it with the model of the structures. This shows the structures in their natural setting and increases the level of realism. We determine the elevation of ground points between the main structures from aerial images. We then use cylindrical or spherical panoramas to represent the remainder of the landscape and far objects, such as mountains. We use


5 Models of structures from all over the world in wire frame, shaded solid, and textured solid.

6 General procedure for range-based modeling. [Flowchart: sensor placement and scanning yield multiple views of 3D points; registration, modeling, and gap filling follow, with texture mapping from digital images.]

a few joint 3D points between the structures and the ground to register the ground elevation model and landscape panorama with the structures. The procedure is similar to the approach used by Sequeira et al.3

Modeling the Abbey of Pomposa

The Abbey of Pomposa, near Ferrara, Italy, is one of the most appealing Italian churches of the Romanesque period. Founded in the 7th century, it comprises several buildings that are part of one of the most important Benedictine monasteries. The abbey's core includes the refectory, the basilica, the capitulary hall, and the cloister. The bell tower was added in the 11th century. The abbey is architecturally simple, with planar stone surfaces. Extensive artwork carved in marble decorates the facade. There are also three arches decorated with brick and stonework.

Data collection

Using the image-based technique, we completely modeled all of the abbey's structures, including the bell tower, except the carvings on the church's facade. We imaged the entire complex with an Olympus 4-megapixel digital camera. We acquired seven different sets of images, including one set from a low-flying airplane and one set from inside the church vestibule. Figure 7 shows the resulting seven models. We used the Biris 3D sensor (a short-range, submillimeter-accurate sensor developed in our laboratory) to scan details such as the left wheel rosone (rose window) and the peacock carvings on the left side, which are shown in Figure 8.

Results

We created seven individual models (see Figure 7) from the digital images, and several detailed ones from the scanned smaller regions. Figure 9 (next page) shows a close-up of the general model of the church building, with eight new points added from the wheel trim and the mesh retriangulated. The hole shown is where the model of the scanned wheel will fit; this is necessary because the wheel has many openings in its center section. The peacock model doesn't require cutting a hole in the main model, because it's completely solid and can simply attach to the plane of the wall model. Figure 10 shows a close-up of part of the middle of the wheel, displayed as a detailed wire frame. The scanned sections, acquired at 0.5-mm resolution, have a far higher level of detail than the image-based regions, which lack the small geometric details. The difference is especially convincing when we view these sections up close while navigating through the model. Figure 11 shows snapshots from the walk-through movie: one without textures and one fully textured, including all the landscapes.

IEEE Computer Graphics and Applications 27

7 Image-based models that we created from the digital images.

8 Textured model of the front building showing scanned regions.
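Attaching a solid scanned model such as the peacock directly to the wall presumes knowing the plane of that wall in the image-based model. As an illustration only (the names and interface below are ours, not the authors' software), a wall plane can be recovered from its vertices by total least squares via the SVD:

```python
import numpy as np

def fit_plane(points):
    """Total-least-squares plane through a set of 3D points.

    Returns (centroid, normal): the plane passes through the
    centroid of the points, and the unit normal is the singular
    vector associated with the smallest singular value, i.e.
    the direction of least variance.
    """
    pts = np.asarray(points, float)
    centroid = pts.mean(axis=0)
    _, _, Vt = np.linalg.svd(pts - centroid)
    normal = Vt[-1]
    return centroid, normal / np.linalg.norm(normal)

def distance_to_plane(p, centroid, normal):
    """Signed distance from point p to the fitted plane."""
    return float(np.dot(np.asarray(p, float) - centroid, normal))
```

The signed distance then tells how far to translate the scanned patch along the normal so that its base sits flush on the wall.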

Conclusion

We are now focusing on increasing the system's level of automation and ease of use. Another issue we are currently addressing is how to define the required accuracy of each component of the system to achieve acceptable geometric and visual fidelity, and how to assess the quality of the final model of large, complex sites.

Acknowledgments

We appreciate the help of Marco Gaiani (now at Politecnico di Milano), and architects Francesca Pozzi and the late Roberta Marchetti, who organized the survey campaign. We also thank architects Anna Maria Iannucci and Andrea Alberti for giving us the opportunity to demonstrate our technology in Pomposa.

9 Integrating scanned and image-based models.

10 Close-up in the wire-frame model of part of the middle of a wheel (rose window).

11 Final integrated model in (a) shaded and (b) textured rendering.

References

1. H.-Y. Shum and S.B. Kang, "A Review of Image-Based Rendering Techniques," Proc. Int'l Conf. Visual Comm. and Image Processing (VCIP 00), SPIE, vol. 4067, 2000, pp. 2-13.

2. F. Bernardini et al., "Building a Digital Model of Michelangelo's Florentine Pietà," IEEE Computer Graphics and Applications, vol. 22, no. 1, Jan.-Feb. 2002, pp. 59-67.

3. V. Sequeira et al., "Hybrid 3D Reconstruction and Image-Based Rendering Techniques for Reality Modeling," Proc. Int'l Conf. Videometrics and Optical Methods for 3D Shape Measurement, SPIE, vol. 4309, 2001, pp. 126-136.

4. P. Debevec, "Image-Based Techniques for Digitizing Environments and Artifacts," Proc. 4th Int'l Conf. 3-D Digital Imaging and Modeling (3DIM 03), IEEE Press, 2003, pp. 234-242.

5. J. Wang and M.M. Oliveira, "Improved Scene Reconstruction from Range Images," Proc. Ann. Conf. European Assoc. for Computer Graphics (Eurographics 02), Blackwell Publishers, vol. 21, no. 3, 2002, pp. 521-530.

6. M. Pollefeys, R. Koch, and L. Van Gool, "Self-Calibration and Metric Reconstruction in Spite of Varying and Unknown Intrinsic Camera Parameters," Int'l J. Computer Vision, vol. 32, no. 1, Aug. 1999, pp. 7-25.

7. W. Triggs et al., "Bundle Adjustment for Structure from Motion," Proc. Int'l Workshop Vision Algorithms: Theory and Practice, LNCS 1883, Springer-Verlag, 2000, pp. 298-372.

8. T. Werner and A. Zisserman, "New Technique for Automated Architectural Reconstruction from Photographs," Proc. 7th European Conf. Computer Vision (ECCV 02), LNCS 2351, Springer-Verlag, 2002, pp. 541-555.

9. S.F. El-Hakim, "Semi-automatic 3D Reconstruction of Occluded and Unmarked Surfaces from Widely Separated Views," Proc. ISPRS Commission V Symp., Ziti Publishing, 2002, pp. 143-148.

10. M. Soucy et al., "Sensors and Algorithms for the Construction of Digital 3-D Colour Models of Real Objects," Proc. Int'l Conf. Image Processing, IEEE Press, 1996, pp. 409-412.

11. J.-A. Beraldin et al., "Virtualizing a Byzantine Crypt by Combining High-Resolution Textures with Laser Scanner 3D Data," Proc. 8th Int'l Conf. Virtual Systems and Multimedia (VSMM 02), Int'l Soc. Virtual Systems and Multimedia, 2002, pp. 3-14.

Sabry F. El-Hakim is a senior research officer at the Visual Information Technology Group in the Institute of Information Technology at the National Research Council of Canada. His research interests include image-based modeling, multisensor data integration, camera calibration, and virtual reality. El-Hakim received an MSc and PhD in photogrammetry from the University of New Brunswick. He chairs the International Society for Photogrammetry and Remote Sensing (ISPRS) Working Group on Scene Modeling and Virtual Reality, and he chairs the SPIE series of conferences on videometrics.

J.-Angelo Beraldin is a senior research officer at the Visual Information Technology Group in the Institute of Information Technology at the National Research Council of Canada. His research interests include sensor systems engineering and signal processing (hardware and software) for 3D vision systems. Beraldin received a BEng from the University of Sherbrooke, Quebec, and an MS from the University of Ottawa, both in electrical engineering.

Michel Picard is a technical officer in the Institute for Information Technology at the National Research Council of Canada. His research interests include 3D imaging systems and 3D animations, especially virtualized environments. Picard received a diploma in architectural technology from La Cité Collégiale in Ottawa.

Guy Godin is a senior research officer in the Visual Information Technology Group in the Institute of Information Technology at the National Research Council of Canada. His research interests include 3D computer vision and image analysis, shape and appearance modeling, and interactive visualization. Godin received a BEng from École Polytechnique de Montréal and an MEng from McGill University, both in electrical engineering.

Contact Sabry F. El-Hakim at the Visual Technology Group, Inst. for Information Tech., Nat'l Research Council of Canada, 1200 Montreal Rd., M-50, Ottawa, Ontario K1A 0R6, Canada; [email protected].
