DENSE MATCHING IN HIGH RESOLUTION OBLIQUE AIRBORNE IMAGES

M. Gerke

International Institute for Geo-Information Science and Earth Observation – ITC, Department of Earth Observation Science, Hengelosestraat 99, P.O. Box 6, 7500AA Enschede, The Netherlands, [email protected]

KEY WORDS: Adjustment, Bundle, Calibration, Matching, Point Cloud, Rectification

ABSTRACT:

An increasing number of airborne image acquisition systems equipped with multiple small- or medium-sized frame cameras are operational. The cameras normally cover different viewing directions. In contrast to vertical images, oblique images have some specific properties, such as a significantly varying image scale and more occlusion caused by tall objects like buildings. However, the faces of buildings and other vertically extended objects are well visible, which is why oblique images are used, for instance, for visualization purposes.
This paper shows results from applying the sophisticated Semi-Global-Matching technique to a set of oblique airborne images. The images were acquired by two systems, namely FLI-MAP 400 (Fugro Aerial Mapping B.V.) and Pictometry (BLOM Aerofilms), over the same area. After the joint adjustment of the images, dense matching and forward ray intersection were performed for several image combinations. The disparity maps were evaluated through comparison with a reference map derived from LIDAR data acquired in parallel with the FLI-MAP system. Moreover, the 3D point clouds were analyzed visually and also compared to the reference point cloud. Around 60 to 70 percent of all matches were within a range of ±3 pixels of the reference. Since the images were acquired in different flight configurations, the impact of different intersection angles and baselines on the triangulation is quite obvious. In general, the overall structures on the building faces are well represented, but noise reduction needs further attention.

1 INTRODUCTION

An increasing number of airborne image acquisition systems are operational (Petrie and Walker, 2007). Because of the availability of low-cost digital cameras with small or medium sized sensors, some of those systems carry multiple cameras covering different viewing directions. For instance, Pictometry1 images are already available for a number of cities, and they are accessible in the category "bird's eye view" in Microsoft Bing Maps2 (formerly known as Virtual Earth).

The use of oblique images for topographic mapping purposes has been shown in several papers. In (Höhle, 2008) height determination from single oblique images is demonstrated. The verification of vector data using oblique imagery is shown in (Mishra et al., 2008). Since building façades are well visible in oblique images, some researchers concentrate on how to automatically extract façade textures (Früh et al., 2004, Wang et al., 2008). Besides, oblique images are interesting for cadastre applications, because the building outline as defined at the vertical wall is directly visible (Lemmen et al., 2007). Compared to vertical airborne images, oblique images have some specific properties. Depending on the tilt angle, the scale within the imaged scene varies considerably. Moreover, vertical structures of raised objects like buildings or trees are imaged, but the (self-)occlusion by those objects is much more significant compared to the vertical image case.

Another interesting application and research domain concerns the derivation of dense point information through image matching techniques. The benchmark results from the Middlebury3 test sets show that high-quality state-of-the-art techniques for dense matching are available. If it is possible to apply those techniques to oblique airborne images, interesting new applications would arise, or existing ones, like those listed above, would be supported. In general, point clouds derived from dense matching in oblique images can be a complementary data source to airborne laser scanning, as those devices normally do not capture dense points on vertical structures. Of course, the traditional use of this kind of data to produce digital surface or terrain models is another possible application.

1 http://www.pictometry.com
2 http://www.bing.com/maps
3 http://vision.middlebury.edu/stereo/ (accessed 15 March 2009)

In (Besnerais et al., 2008) an approach to dense matching in oblique airborne images is presented. The authors develop a pixel-wise similarity criterion which accounts for the special viewing geometry of oblique images. A dense depth map is obtained through global regularization. The approach was tested on a number of test images and showed good results. However, the ground sampling distance of the images used was not smaller than 1.4 m; mostly it was even larger, up to 20 m.

This paper evaluates the application of the Semi-Global-Matching technique (SGM, see (Hirschmüller, 2008)) to a set of high resolution FLI-MAP4 and Pictometry images. One particular façade of a building is normally only visible in images taken from one viewing direction, resulting in a relatively bad intersection angle in object space. Thus, the main objective of this paper is to evaluate the overall accuracy of the 3D point cloud as derived from a forward intersection of matched points. Although other, possibly better performing, algorithms for dense matching exist (Seitz et al., 2006), we chose SGM because it has already demonstrated its fitness for the photogrammetric production process, cf. (Hirschmüller et al., 2005).

As no sufficient calibration and orientation information was available, the whole block first needed to be adjusted. The method for bundle block adjustment, including self-calibration of multiple devices and employing scene constraints to enhance the scene geometry, was introduced and tested in (Gerke and Nyaruhuma, 2009). The dense matching algorithm was then applied to several combinations of stereo images, and the results were evaluated against LIDAR data acquired by the FLI-MAP system.

4 http://www.flimap.nl


In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4 --- Paris, France, 3-4 September, 2009


Figure 1: Geometry of across track (left) and along track (right) baseline in oblique stereo images

2 STEREO GEOMETRY AND ACCURACY IN OBLIQUE IMAGES

The dense matching results in point correspondences in a stereo pair5. The image rays resulting from those matching image pairs are forward intersected in order to compute 3D points in object space. In this section a brief theoretic approximation of the expected accuracy from the given intersection geometry in oblique airborne images is derived. Two cases can be distinguished: A) the two cameras used for the stereo intersection are oriented across track, i.e. they enclose a right angle with the base, and B) the cameras are looking in flight direction. The intersection geometry of the first case can be derived from the standard normal case, but the varying scale and the tilt of the camera coordinate system wrt the actual coordinate system need consideration. The second case can be compared with the longitudinal tilt setup (Albertz and Kreiling, 1980), i.e. the normal case with different camera heights; only here the whole base is tilted. In Fig. 1 both camera geometries are sketched.

The scale within an oblique image depends on the flying height H, the focal length c, the tilt angle t, and the angle β between the viewing ray to a target and the vertical (symbols according to (Höhle, 2008)):

m = H · cos(β − t) / (c · cos β),    (1)

where m is the scale at the target point. At the principal point β equals t, whereas at the fore- and background β is determined from t and the half field of view α: β_fore = t − α and β_back = t + α.
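As a numerical sanity check, equation 1 can be evaluated directly. The sketch below (Python) uses the FLI-MAP oblique parameters listed later in Fig. 2 (H = 275 m, c = 105 mm, t = 45°, pixel size 9 µm); taking the half field of view α across the 24 mm sensor side is an assumption made for this illustration.

```python
import math

def oblique_scale(H, c, t, beta):
    """Scale number m at a target seen under angle beta from the vertical (eq. 1)."""
    return H * math.cos(beta - t) / (c * math.cos(beta))

H, c, t = 275.0, 0.105, math.radians(45)   # metres, metres, radians
alpha = math.atan(0.012 / c)               # half field of view over the 24 mm side

# ground sampling distance in cm at fore- and background (pixel size 9 um)
gsd_fore = oblique_scale(H, c, t, t - alpha) * 9e-6 * 100
gsd_back = oblique_scale(H, c, t, t + alpha) * 9e-6 * 100
```

With these numbers both values fall inside the 2.8 – 4 cm GSD range stated for the FLI-MAP oblique camera in Fig. 2.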

A: tilt across track (side-looking) In the vertical image case, the accuracy of triangulated points in height (s′_H) and in the X-Y plane (s′_X,Y) can be estimated by:

s′_H ≈ (H′ / B) · m · s_px,    (2)

s′_X,Y ≈ s_x · m ≈ s_y · m,    (3)

where s_x ≈ s_y ≈ 0.5 · s_px are the standard deviations of the image coordinate and parallax measurements; errors in the orientation components are neglected. In the case of the tilted camera system these formulas apply in the tilted system, so the varying scale needs to be considered according to equation 1; H′ also needs to be adapted accordingly:

H′ = m · c.    (4)

5 For the combination of multiple views see the experiment section.

Finally, the respective error components need to be projected from the tilted system into the actual coordinate system:

s_H ≈ √((s′_H · cos t)² + (s′_X,Y · sin t)²),    (5)

s_X,Y ≈ √((s′_H · sin t)² + (s′_X,Y · cos t)²),    (6)

thus for a tilt angle of 45° both components are identical.
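Equations 1 to 6 combine into a small numerical sketch; the 45° statement above then follows directly. The parameter values in the example call are illustrative, not results from the paper.

```python
import math

def across_track_accuracy(H, c, B, t, beta, s_px):
    """Sketch of eqs. 1-6: accuracy of a point triangulated from an
    across-track (side-looking) pair. s_px is the parallax standard
    deviation in image space (metres); orientation errors are neglected."""
    m = H * math.cos(beta - t) / (c * math.cos(beta))   # eq. 1
    H_t = m * c                                         # eq. 4 (H')
    s_h = H_t / B * m * s_px                            # eq. 2, depth in tilted system
    s_xy = 0.5 * s_px * m                               # eq. 3, with s_x ~ 0.5 * s_px
    s_H = math.hypot(s_h * math.cos(t), s_xy * math.sin(t))   # eq. 5
    s_XY = math.hypot(s_h * math.sin(t), s_xy * math.cos(t))  # eq. 6
    return s_H, s_XY

# at t = 45 deg the two projected components coincide
sH45, sXY45 = across_track_accuracy(920, 0.085, 400, math.radians(45),
                                    math.radians(45), 9e-6)
```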

B: tilt along track (forward-looking) To derive the accuracy in the tilted system H, X′, Y′, first the necessary parameters for the case of longitudinal tilt need to be computed: the base B′ in the tilted system and the heights of the cameras I and II:

B′ = B · cos t,    (7)

ΔH′ = B · sin t,    (8)

H′_I = m · c,  and  H′_II = H′_I − ΔH′.    (9)

Applying partial derivatives wrt the image and parallax measurements to the formulas given in (Albertz and Kreiling, 1980), the accuracies of the coordinate components in the tilted system can be derived:

s′_H_I ≈ s′_H_II ≈ √((H′_II / p_x)² · s_px² + (B′ · sin t / p_x)² · s_x²),    (10)

s′_X,Y ≈ (H′_I / c) · s_x.    (11)

Note that the actual parallax p_x needs to be computed for the estimation. In the approximations for the given data (see below), a mean parallax according to a mean depth between fore- and background was assumed. For the planar accuracy the more pessimistic estimate, assuming the smaller image scale, is given here. Finally, the planar and height components in the actual coordinate system are computed according to equations 5 and 6.
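Equations 7 to 11, followed by the projection of equations 5 and 6, can likewise be sketched numerically. The parallax value passed in below is a rough assumption for illustration, not a value taken from the paper.

```python
import math

def along_track_accuracy(H, c, B, t, p_x, s_px, s_x):
    """Sketch of eqs. 7-11 plus the projection of eqs. 5/6 for an
    along-track (forward-looking) pair; p_x is the (mean) parallax."""
    m = H / (c * math.cos(t))       # scale at the principal point (eq. 1 with beta = t)
    B_t = B * math.cos(t)           # eq. 7
    dH = B * math.sin(t)            # eq. 8
    H_I = m * c                     # eq. 9
    H_II = H_I - dH
    s_h = math.sqrt((H_II / p_x) ** 2 * s_px ** 2
                    + (B_t * math.sin(t) / p_x) ** 2 * s_x ** 2)  # eq. 10
    s_xy = H_I / c * s_x                                          # eq. 11
    s_H = math.hypot(s_h * math.cos(t), s_xy * math.sin(t))       # eq. 5
    s_XY = math.hypot(s_h * math.sin(t), s_xy * math.cos(t))      # eq. 6
    return s_h, s_xy, s_H, s_XY

# FLI-MAP oblique numbers (H = 275 m, c = 105 mm, B = 50 m, t = 45 deg);
# p_x = 10.5 mm is an assumed mean parallax, s_x a half pixel of 9 um
sh, sxy, sH, sXY = along_track_accuracy(275, 0.105, 50, math.radians(45),
                                        0.0105, 9e-6, 4.5e-6)
```

As the text notes, the depth component in the tilted system dominates the planar one for this geometry.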

3 METHODS ADOPTED

3.1 Block adjustment for multiple platforms

In (Gerke and Nyaruhuma, 2009) a method to incorporate scene constraints into the bundle block adjustment is described and tested. The bundle block adjustment algorithm uses horizontal and vertical line features, as well as right angles, to support the stability of the block geometry. Those features can be identified at building façades, as visible in oblique images. In addition, the approach is able to perform self-calibration for all devices which are incorporated in the block. This is an important issue in the case of oblique images, as those are often acquired by non-metric cameras. The extension to the setup used for this paper, where images from different platforms are involved, is done without any change to the core approach.
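Such scene constraints can be pictured as extra residuals appended to the least-squares system. The sketch below shows one plausible form of a vertical-line constraint; it is an illustration only, not the actual formulation of (Gerke and Nyaruhuma, 2009).

```python
import numpy as np

def vertical_line_residual(P1, P2):
    """Illustrative residual for a 'vertical line' scene constraint:
    the unit direction between two adjusted 3D points on a facade edge
    should have vanishing X and Y components. The returned 2-vector
    would be driven towards zero during the adjustment."""
    d = np.asarray(P2, dtype=float) - np.asarray(P1, dtype=float)
    d /= np.linalg.norm(d)
    return d[:2]
```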


CMRT09: Object Extraction for 3D City Models, Road Databases and Traffic Monitoring - Concepts, Algorithms, and Evaluation


                                   FLI-MAP                 Pictometry
Parameter                          vertical    oblique     oblique
flying height [m]                  275         275         920
baseline [m]                       50          50          400
tilt angle [°]                     0           45          50
number of images, viewing dir.     7           8x SW       2x W, 2x S, 2x E, 1x N
focal length [mm]                  35          105         85
pixel size [µm]                    9           9           9
sensor size [mm x mm]              36 x 24     36 x 24     36 x 24

GSD and theoretic accuracies (for oblique: from fore- to background)
ground sampling distance [cm]      7           2.8 – 4     10 – 16
sX,Y, vertical [cm]                4           NA          NA
sZ, vertical [cm]                  40          NA          NA
sX,Y, across track base [cm]       NA          NA          22 – 44
sZ, across track base [cm]         NA          NA          18 – 37
sX,Y, along track base [cm]        NA          60 – 92     22 – 42 (*)
sZ, along track base [cm]          NA          60 – 92     19 – 35 (*)
(*): along track base images from Pictometry were not used.

Figure 2: Image parameters and layout of sample block

3.2 Rectification and dense matching

The approach to dense stereo matching as applied in the current implementation is the Semi-Global-Matching algorithm (Hirschmüller, 2008). The basic idea behind this technique is to aggregate local matching costs by a global energy function, which is approximated by an efficient path-wise 1-dimensional optimization.
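The path-wise aggregation can be sketched for a single left-to-right pass over one scanline; a full SGM implementation sums such passes over (typically) 8 or 16 path directions and takes the per-pixel minimum of the summed costs. This is a minimal illustration of the recurrence from (Hirschmüller, 2008), not production code.

```python
import numpy as np

def aggregate_path(costs, P1, P2):
    """One left-to-right 1-D pass of SGM cost aggregation.
    costs: (W, D) array of local matching costs along a scanline;
    P1 penalises disparity changes of +/-1, P2 penalises larger jumps."""
    W, D = costs.shape
    L = np.empty_like(costs, dtype=float)
    L[0] = costs[0]
    for x in range(1, W):
        prev = L[x - 1]
        best_prev = prev.min()
        same = prev                               # keep the same disparity
        up = np.r_[prev[1:], np.inf] + P1         # disparity change +1
        down = np.r_[np.inf, prev[:-1]] + P1      # disparity change -1
        jump = best_prev + P2                     # any larger jump
        trans = np.minimum(np.minimum(same, up), down)
        # subtracting best_prev keeps the values bounded, as in the paper
        L[x] = costs[x] + np.minimum(trans, jump) - best_prev
    return L
```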

To simplify the matching, the images are rectified beforehand. For this purpose the approach proposed in (Oram, 2001) is applied: a homography is estimated which is compatible with the fundamental matrix (Hartley and Zisserman, 2004, chap. 13). The aim is to minimize distortions due to perspective effects, and thus also to reduce the disparity search space. A consequence of this particular algorithm is that the epipolar lines are coincident, but not necessarily parallel. Hence, in the subsequent rectification the images need resampling to obtain parallel epipolar lines. One disadvantage of this procedure is that straight lines are not preserved; however, this does not influence the matching. To compute the optimal homography, point correspondences are required, for instance, in the case at hand, the adjusted tie points. If images are taken from approximately the same viewing direction, it is also possible to extract further matches through scale invariant point descriptors like SIFT (Lowe, 2004). Outliers in the correspondences are identified through RANSAC within the estimation of the compatible homography. The inliers are here also used to estimate the disparity search range for the dense matching.
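The last step, estimating the disparity search range from the inlier correspondences, might look as follows once the pair is rectified (row coordinates of corresponding points agree, so only the column offsets matter); the safety margin is a hypothetical parameter, not taken from the paper.

```python
import numpy as np

def disparity_search_range(pts_left, pts_right, margin=16):
    """Estimate the SGM disparity search range from sparse inlier
    correspondences (e.g. adjusted tie points or SIFT matches) given
    as (x, y) pixel coordinates in the rectified left/right images."""
    d = np.asarray(pts_left, dtype=float)[:, 0] - np.asarray(pts_right, dtype=float)[:, 0]
    return int(np.floor(d.min())) - margin, int(np.ceil(d.max())) + margin
```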

4 EXPERIMENTS

4.1 Description of used data

Part of the data used for these experiments was acquired by the Fugro Inpark FLI-MAP 400 system in March 2007 over Enschede, The Netherlands. Besides two LIDAR devices and two video cameras, the system carries two small-frame cameras: one pointing vertically, and one oblique camera looking in flight direction, tilted by approx. 45°. Additional Pictometry images were made available through BLOM Aerofilms. Those images were acquired only one month before the FLI-MAP data. A small block of 7 vertical and 8 oblique images from FLI-MAP as well as 7 images from Pictometry was chosen for the experiments. In the upper part of Fig. 2 some parameters of the images are given; the GSD and accuracy estimation was done according to equations 1 to 11, assuming a standard deviation of half a pixel for image measurements. In the bottom of that figure the layout of the block is shown, including GCPs, check points and the approximate positions of the defined scene constraints. The highly overlapping images in the center are from the FLI-MAP acquisition, while the 7 regularly aligned outer images are from the Pictometry flight. Note that no along track images were chosen from Pictometry. The airplane acquired the images in N-S direction, so the East- and West-looking images belong to one flight line (baseline approx. 400 m) and the two South-looking images are from two adjacent strips, baseline approx. 350 m. For the accuracy estimation the two South-looking images can be treated like across-track images.

4.2 Block adjustment results

Four full and one height GCP were used for the adjustment. Additionally, one right angle, 3 horizontal and 4 vertical line constraints were defined. It was ensured that in every image at least one of the features used for the scene constraints was visible. In Table 1 the adjustment results in terms of RMSE at the control and check points, or features respectively, are listed. One observation from the residuals is that the Z-component is smaller than the X/Y values for all control and check features. Also, the residuals at the vertical constraints are larger than the residuals at the horizontal constraints, and those are influenced by the Z-component only. One reason for this can be that the tilt of the Pictometry images is larger than 45° and thus the X,Y-component is less accurate than the Z-component; refer also to the listed theoretic accuracies in Fig. 2. One general drawback of this block setup is that outside the overlapping areas no GCPs or scene constraints are available and applicable, respectively, so the overall block geometry at the borders is not optimal. However, since the residuals at the façades are, at least for the Pictometry images, less than one pixel, this result can be considered satisfactory.

Assessment                    RMSE value [cm]
X-Res. at GCP                 2.1
Y-Res. at GCP                 4.8
Z-Res. at GCP                 1.3
X-Res. at Check               16.2
Y-Res. at Check               5.8
Z-Res. at Check               1.5
Res. at H-constraints         1.4
Res. at V-constraints         6.8
Res. at RA-constraints [°]    0.01

Table 1: Residuals from bundle block adjustment

4.3 Dense matching results

The dense matching was performed in several stereo image combinations. Besides the matching in images from one platform,




Figure 3: Two sample relative histograms of disparity differences wrt the reference disparity map. Note the scale difference between both diagrams.

matching between FLI-MAP and Pictometry was tested. This is interesting, because by this means the scene can be observed from approximately the same direction through multiple views. The overlap from consecutive Pictometry images is not large enough to create 3-ray points; however, incorporating FLI-MAP images as well makes this possible. Besides, this setup gives an interesting geometry for forward intersection.

Two methods were used to assess the results: one quantitative and one qualitative. For the quantitative assessment a reference disparity map was computed from the FLI-MAP LIDAR data, then the differences to the disparities from image matching were analyzed using histograms. For a more qualitative assessment 3D point clouds were computed from the matching results and then assessed visually, also in comparison to the LIDAR point cloud.

Disparity map assessment For this assessment the reference LIDAR points (density: 20 points per m²) were projected into the image plane as defined by the respective image orientation and calibration parameters, and subsequently a reference disparity map was computed. Two issues are important here: first, only first pulse LIDAR points should be considered, as in the image only the visible surface can be matched. Second, through the oblique viewing direction as realized with the cameras, one has to take into account self-occlusion through buildings; the laser scanner scans vertically and thus captures other parts of the scene, especially on the backside of buildings visible in the images. To avoid errors from that circumstance, only areas which do not show these effects were used for the evaluation.
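The projection of the LIDAR points into a reference disparity map can be sketched as follows; the projection functions and the z-buffer visibility handling below are assumptions standing in for the actual implementation, which must additionally restrict the points to first-pulse returns and occlusion-free areas as described above.

```python
import numpy as np

def reference_disparity(points, project_left, project_right, shape):
    """Build a reference disparity map from 3D LIDAR points.
    project_left/right: hypothetical functions mapping a 3D point to
    (x, y, depth) in the rectified left/right image. A z-buffer keeps
    only the point closest to the camera per pixel, mimicking visibility."""
    disp = np.full(shape, np.nan)
    zbuf = np.full(shape, np.inf)
    for P in points:
        xl, yl, z = project_left(P)
        xr, _, _ = project_right(P)
        r, c = int(round(yl)), int(round(xl))
        if 0 <= r < shape[0] and 0 <= c < shape[1] and z < zbuf[r, c]:
            zbuf[r, c] = z
            disp[r, c] = xl - xr   # disparity in the rectified pair
    return disp
```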

The disparity maps were assessed by calculating the difference disparity map and computing a histogram from it. Only pixels showing a disparity value in both maps were considered,

Figure 4: Results from dense matching in two overlapping South-looking Pictometry images. Top: left image and 3D cloud from matching; center row: zoom into the point cloud from matching at façades (left) and top view (right); bottom row: point cloud with color-coded height: reference (left), from matching (right)

thus matched points at façades which were not acquired by the LIDAR device cannot be assessed. Two such relative histograms are shown in Fig. 3. The upper histogram shows the differences from the matching within two Pictometry images (see Fig. 4). For this histogram approx. 50·10³ matches were considered (out of 2.2·10⁶ in total), and around 70% of them show a difference of ±3 pixels to the reference. The histogram at the bottom shows the analysis of the matches within two oblique images from FLI-MAP, refer to Fig. 5. For this histogram approx. 200·10³ matches were considered (out of 6.4·10⁶ in total). Because of the smaller baseline between consecutive FLI-MAP images, compared to Pictometry, the overlapping area is larger, which results in more matches. Approximately 60% are within the difference of ±3 pixels. All matches outside this tolerance can be considered blunders. A more in-depth analysis revealed that most blunders were caused in shadow areas or other areas with poor texture. When assessing those histograms it should be considered that errors from the camera calibration and pose estimation also contribute to those residuals, thus a final conclusion on the absolute matching accuracy of the SGM implementation cannot be made.
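The quantitative comparison reduces to a difference histogram over pixels that carry a disparity in both maps; a minimal sketch:

```python
import numpy as np

def disparity_agreement(disp, disp_ref, tol=3.0):
    """Return (number of comparable pixels, fraction whose disparity
    differs from the LIDAR-derived reference by at most tol pixels).
    Pixels without a value in either map (NaN) are ignored."""
    both = ~np.isnan(disp) & ~np.isnan(disp_ref)
    diff = disp[both] - disp_ref[both]
    return diff.size, float(np.mean(np.abs(diff) <= tol))
```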

Figure 5: Results from dense matching in two overlapping FLI-MAP images. Top: part of the left image and the point cloud from matching; centre: 3D point cloud from matching, color-coded height (left) and color picked from the images (right); bottom row: reference point cloud and top view of the matched point cloud. The circle indicates an elevator box which is visible in the point cloud from matching, but not in the laser scanning data.

Point clouds: Pictometry to Pictometry For the following evaluations a forward intersection of the matched points was performed. A simple blunder detection was implemented by applying a threshold to the residuals of the image observations. For two-ray intersections this method can filter some blunders, but because only one redundant observation is available, quite a lot of blunders will not be detected.
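A minimal sketch of the two-ray forward intersection and the residual-threshold blunder check described above, using a standard linear (DLT) triangulation; the threshold semantics (normalised image units) are an assumption for this illustration.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear two-ray forward intersection (DLT) from 3x4 projection
    matrices P1, P2 and normalised image points x1 = (x, y), x2 = (x, y)."""
    A = np.vstack([x1[0] * P1[2] - P1[0], x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0], x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                      # null vector of A, homogeneous 3D point
    return X[:3] / X[3]

def reprojection_residual(P, X, x):
    """Image-space residual of 3D point X reprojected by P against x."""
    p = P @ np.append(X, 1.0)
    return np.hypot(p[0] / p[2] - x[0], p[1] / p[2] - x[1])

def keep_match(P1, P2, x1, x2, thresh):
    """Simple blunder check: reject a match whose larger residual
    exceeds the threshold."""
    X = triangulate(P1, P2, x1, x2)
    return max(reprojection_residual(P1, X, x1),
               reprojection_residual(P2, X, x2)) <= thresh
```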

The point cloud resulting from the triangulation of matches in a Pictometry image pair is shown in Fig. 4. The top row in that figure shows part of the left image and an overview of the 3D scene defined by the matched point cloud. The center row shows a zoom into the colored point cloud from matching, focusing on some façades, and the vertical view of that scene. Finally, the bottom row shows the reference point cloud on the left hand side, where the color codes the height (one full color cycle equals 10 m). The corresponding part of the matched point cloud is depicted on the right hand side. From that figure some interesting observations can be made. Although most of the flat roofs show poor texture, the corresponding part in the matched point cloud is quite dense and homogeneous. However, the height of the roof in the upper part is not correct; it is approx. 50 cm lower than shown in the reference point cloud. In the vertical view of the point cloud from SGM the occluded areas are clearly visible, whereas the vertical façades are not visible in the reference point cloud. Overall, the structures are well represented, but the mismatched pixels impose a certain level of noise on the scene. Those mismatches can hardly be detected, as only stereo matches are available in this case. The detailed zoom of the vertical view shows that the accuracy in x-y direction is quite good, and apparently even better than the estimated one (s_X,Y: 22 – 44 cm).

Point clouds: FLI-MAP to FLI-MAP The triangulated point cloud from the matching in two consecutive oblique images from the FLI-MAP data is shown in Fig. 5. The zoom into the colored point cloud shows quite some details; for instance, the elevator box close to the statue is clearly visible. The statue itself is well represented, and even gaps in the matches where people were walking during image exposure are clearly identifiable. However, those views onto the point cloud were made from the viewing direction of the cameras, so the main error, which acts in viewing direction, is not visible. The estimated depth accuracy (in viewing direction, s′_H) of the along-track FLI-MAP data varies from 90 to 130 cm, while the error in X′, Y′ direction is only 2 cm. To assess the overall quality, the vertical view needs to be considered: here the uncertainty in viewing direction is quite obvious. If the result from the vertical-view zoom is compared to the one from the Pictometry data (Fig. 4, center row), it can be observed that the result from the FLI-MAP data is more inaccurate. This visual observation confirms the theoretically approximated accuracy, which is about four times worse.

Figure 6: Results from multiple view triangulation. Top: matched point cloud, color and height; center: reference point cloud; bottom: top view

Point clouds: Multiple view  For this experiment it was desired to exclude wrong matches from the triangulation. To this end, the dense matching results from several pairs were combined in the following manner: dense matching was performed in three combinations: (1) FLI-MAP1 ↔ FLI-MAP2; (2) FLI-MAP2 ↔ Pictometry1; (3) FLI-MAP1 ↔ Pictometry1. The matches from (1) and (2) are linked such that the matching points from the right image of (1) are searched in the left image of (2); by this means corresponding matches (3)' FLI-MAP1 ↔ Pictometry1 are created. In the subsequent triangulation only those matches were considered which coincide with the original matches from (3). Thus it can be expected that this double check removed some blunders. For more details on this method see (Gerke, 2008). Fig. 6 shows some details of the results. The point cloud now contains fewer points (1.6·10⁶ points in the FLI-MAP-only point cloud vs. 190·10³ matches here), but the overall accuracy appears better, see e.g. the height of the buildings. The detailed vertical view also shows less noise compared to the two-fold matches before.

In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3/W4, Paris, France, 3-4 September, 2009
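The linking and double-check scheme described above can be sketched as follows. This is a minimal illustration with hypothetical data structures: dense matches are simplified to dictionaries mapping integer pixel coordinates in one image to matched pixel coordinates in the other; the function name and tolerance are assumptions, not the paper's implementation.

```python
def link_and_verify(m12, m23, m13, tol=1.0):
    """Chain matches image1 -> image2 -> image3 and keep only those
    that agree with the direct image1 -> image3 matches within `tol`
    pixels, mirroring the double check described in the text."""
    verified = {}
    for p1, p2 in m12.items():
        p3 = m23.get(p2)            # follow the chain via image 2
        if p3 is None:
            continue
        p3_direct = m13.get(p1)     # direct match image1 -> image3
        if p3_direct is None:
            continue
        # double check: chained and direct matches must coincide
        if abs(p3[0] - p3_direct[0]) <= tol and abs(p3[1] - p3_direct[1]) <= tol:
            verified[p1] = (p2, p3)  # consistent triple -> triangulate
    return verified

# toy example: the second match chain disagrees with the direct match
m12 = {(10, 10): (12, 10), (20, 20): (22, 21)}
m23 = {(12, 10): (30, 11), (22, 21): (40, 22)}
m13 = {(10, 10): (30, 11), (20, 20): (99, 99)}  # blunder for (20, 20)
print(link_and_verify(m12, m23, m13))  # only the consistent triple survives
```

Only triples that pass both the chained and the direct match are forwarded to triangulation, which is what removes the blunders at the cost of a smaller point count.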

5 CONCLUSIONS AND OUTLOOK

This paper reports on the application of the dense image matching technique Semi-Global Matching to a set of high resolution oblique airborne images. The images were acquired from different platforms and thus in different configurations. The comparison of the depth maps from matching with a reference computed from LIDAR data showed that roughly 70% of all matches are within an error range of ±3 pixels; however, the residual errors from camera calibration and orientation also have an impact on this evaluation. The remaining matches can be considered blunders. How can those blunders be removed and the noise level reduced? If multiple overlaps are available, sophisticated error analysis prior to triangulation is feasible (Hirschmüller, 2008). The method applied here, namely eliminating wrong matches by linking matches of adjacent images and applying a double check through a direct match, also shows good results. Other authors use the much stronger trinocular stereo geometry for matching (Heinrichs et al., 2007), or apply a similarity criterion to multiple views directly (Besnerais et al., 2008). If only two-fold overlap is available – as is mostly the case for facade observations in oblique images – one method could be to incorporate reliable SIFT features directly within the SGM approach: the disparities they define can be used to reduce the respective matching cost.
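The suggested SIFT seeding could, for instance, take the form of lowering the matching cost at the disparities fixed by reliable SIFT correspondences before the SGM path aggregation. The sketch below rests on assumed conventions (a dense cost volume indexed by row, column and disparity, with lower cost meaning better match); the function name and the multiplicative bonus are hypothetical, not part of the paper or of any SGM implementation.

```python
import numpy as np

def seed_costs_with_sift(cost, seeds, bonus=0.5):
    """Reduce the matching cost at disparities fixed by reliable SIFT
    correspondences, prior to SGM path aggregation.

    cost  : (rows, cols, ndisp) matching-cost volume (lower = better)
    seeds : iterable of (row, col, disparity) tuples from SIFT matches
    bonus : multiplicative reduction applied at each seeded disparity
    """
    for r, c, d in seeds:
        if 0 <= d < cost.shape[2]:
            cost[r, c, d] *= bonus   # make the seeded disparity cheaper
    return cost

# toy example: 2x2 image, 4 disparity levels, initially uniform cost
cost = np.ones((2, 2, 4))
seeded = seed_costs_with_sift(cost, [(0, 1, 2)])
print(seeded[0, 1].argmin())  # -> 2: the seeded disparity now wins
```

Because SGM aggregates these per-pixel costs along multiple paths, a cheaper seeded disparity biases not only the seeded pixel but also its neighbourhood towards the SIFT-confirmed solution.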

The point cloud resulting from the triangulation of the respective matches revealed the sensitivity to the ray intersection angle and the base length of the cameras. For instance, in the case of consecutive FLI-MAP images the theoretical standard deviation of triangulated points in viewing (depth) direction is – due to the small effective baseline – around 1 m, but perpendicular to that – due to the low flying height – only around 2 cm. In the shown examples these theoretical measures were confirmed. In the tested images from Pictometry the intersection geometry is better because of the longer baseline. In general, the overall structures on the building faces are well represented, but noise reduction needs further attention.
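The stated sensitivity follows from the normal-case stereo precision relation σ_Z ≈ Z²/(B·c)·σ_px, where Z is the object distance, B the baseline and c the focal length in pixels. The numbers below are purely illustrative (they are not the paper's actual FLI-MAP or Pictometry parameters), but they show the proportional effect of the baseline on depth noise.

```python
def depth_precision(distance, baseline, focal_px, sigma_px=0.5):
    """Normal-case stereo depth precision:
    sigma_Z = Z**2 / (B * f) * sigma_px
    distance and baseline in metres, focal length in pixels."""
    return distance ** 2 / (baseline * focal_px) * sigma_px

# Assumed, illustrative parameters only: a short along-track baseline
# yields metre-level depth noise; a six-times longer baseline at the
# same distance reduces it by the same factor.
short = depth_precision(distance=300.0, baseline=15.0, focal_px=3000.0)
long_ = depth_precision(distance=300.0, baseline=90.0, focal_px=3000.0)
print(round(short, 2), round(long_, 2))  # -> 1.0 0.17
```

The lateral (perpendicular to viewing direction) error, by contrast, scales only linearly with distance over focal length, which is why it stays at the centimetre level for the low-flying FLI-MAP configuration.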

In the current research the focus is put on the automatic detection and extraction of buildings in oblique images. Here, the point cloud derived from the matching can give valuable cues. Another issue concerns the derivation of a more complete coverage by merging the point clouds derived from different viewing directions. At least for the roof areas this can be done in a similar manner as shown above, namely through linking matches, since the majority of roof areas are visible from multiple directions.

ACKNOWLEDGEMENTS

I want to thank Daniel Oram for providing the code for general rectification on his homepage⁶. Further, I would like to thank Matthias Heinrichs, TU Berlin, for providing me with his code for Semi-Global Matching. I also want to thank BLOM Aerofilms for providing the Pictometry dataset. Finally, I would like to thank the anonymous reviewers for their comments.

REFERENCES

Albertz, J. and Kreiling, W., 1980. Photogrammetrisches Taschenbuch. 3. edn, Herbert Wichmann, Karlsruhe.

⁶ http://www.aspa29.dsl.pipex.com (accessed 15 March 2009)

Besnerais, G. L., Sanfourche, M. and Champagnat, F., 2008. Dense height map estimation from oblique aerial image sequences. Computer Vision and Image Understanding 109(2), pp. 204–225.

Früh, C., Sammon, R. and Zakhor, A., 2004. Automated texture mapping of 3D city models with oblique aerial imagery. In: 3D Data Processing, Visualization and Transmission, International Symposium on, pp. 396–403.

Gerke, M., 2008. Dense image matching in airborne video sequences. In: ISPRS XXI Congress: Silk Road for Information from Imagery, Vol. XXXVII-B3b, International Archives of Photogrammetry and Remote Sensing, Beijing, pp. 639–644.

Gerke, M. and Nyaruhuma, A., 2009. Incorporating scene constraints into the triangulation of airborne oblique images. In: High-Resolution Earth Imaging for Geospatial Information, Vol. XXXVIII, International Archives of Photogrammetry and Remote Sensing, Hannover.

Hartley, R. I. and Zisserman, A., 2004. Multiple View Geometry in Computer Vision. Second edn, Cambridge University Press.

Heinrichs, M., Hellwich, O. and Rodehorst, V., 2007. Efficient Semi-Global Matching for trinocular stereo. In: IAPRS, Vol. 36, pp. 185–190. Part 3-W49A (PIA conference, Munich, Germany).

Hirschmüller, H., 2008. Stereo processing by Semi-Global Matching and Mutual Information. IEEE Transactions on Pattern Analysis and Machine Intelligence 30(2), pp. 328–341.

Hirschmüller, H., Scholten, F. and Hirzinger, G., 2005. Stereo vision based reconstruction of huge urban areas from an airborne pushbroom camera (HRSC). In: Lecture Notes in Computer Science: Pattern Recognition, Proceedings of the 27th DAGM Symposium, Vol. 3663, Springer, Berlin, pp. 58–66.

Höhle, J., 2008. Photogrammetric measurements in oblique aerial images. Photogrammetrie Fernerkundung Geoinformation 1, pp. 7–14.

Lemmen, M., Lemmen, C. and Wubbe, M., 2007. Pictometry: potentials for land administration. In: Proceedings of the 6th FIG Regional Conference, International Federation of Surveyors (FIG).

Lowe, D. G., 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2), pp. 91–110.

Mishra, P., Ofek, E. and Kimchi, G., 2008. Validation of vector data using oblique images. In: Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, Irvine, California.

Oram, D., 2001. Rectification for any epipolar geometry. In: Proceedings of the 12th British Machine Vision Conference (BMVC).

Petrie, G. and Walker, A. S., 2007. Airborne digital imaging technology: a new overview. The Photogrammetric Record 22(119), pp. 203–225.

Seitz, S. M., Curless, B., Diebel, J., Scharstein, D. and Szeliski, R., 2006. A comparison and evaluation of multi-view stereo reconstruction algorithms. CVPR 1, pp. 519–526.

Wang, M., Bai, H. and Hu, F., 2008. Automatic texture acquisition for 3D model using oblique aerial images. In: Intelligent Networks and Intelligent Systems, International Workshop on, pp. 495–498.


CMRT09: Object Extraction for 3D City Models, Road Databases and Traffic Monitoring - Concepts, Algorithms, and Evaluation