ASPRS 2012 Annual Conference
Sacramento, California ♦ March 19-23, 2012
CONVERGING PHOTOGRAMMETRY AND SURVEYING:
MAKING 3-D MEASUREMENTS FROM TERRESTRIAL LIDAR POINT CLOUDS
AND IMAGES
Dr. Bingcai Zhang, Engineering Fellow
BAE Systems Geospatial eXploitation Products
10920 Technology Place
San Diego, CA 92127-1874
ABSTRACT
Both surveying and photogrammetry capture 3-D measurements of the Earth’s surface. Photogrammetrists extract 3-
D measurements from photographs, digital aerial images, and digital satellite images. Surveyors collect 3-D
measurements directly from the surface of the Earth. Generating 3-D measurements in an office environment is
typically more efficient and safer than in the field. Modern terrestrial LiDAR point clouds and images make it
feasible for surveyors to obtain a larger portion of their 3-D measurements in the workplace, side-by-side with
photogrammetrists. In that respect, photogrammetry and surveying are converging.
This paper presents the challenges and opportunities of applying photogrammetric algorithms to terrestrial
LiDAR point clouds and images. The key challenge is automatically registering terrestrial LiDAR point clouds and
images. Once point clouds and images are accurately registered, “Smart Images and smart point clouds” can be
generated. Each LiDAR point has a corresponding image pixel assigned to it, and each image pixel has XYZ
coordinates. From these Smart Images and smart point clouds, surveyors can perform 3-D measurements using
existing photogrammetric software tools. Test results indicate that Smart Images can achieve accuracy comparable
to the image ground sample distance (GSD) for well-defined ground points or for 2.5-D features such as roads. For
3-D features such as buildings and houses, the accuracy is close to the LiDAR point cloud spacing.
KEYWORDS: surveying, photogrammetry, LiDAR, smart images
INTRODUCTION
In 2008, Dr. John Kemeny from the University of Arizona and Dr. Keith Turner from the Colorado School of
Mines conducted a study, “Ground-Based LiDAR Rock Slope Mapping and Assessment” (Kemeny and Turner
2008). In their study, they stated:
It is shown in this report that some of the most important types of geotechnical information for rock slope
stability that is currently being collected by hand can be acquired from LiDAR point clouds and associated digital
images. This includes detailed information about rock discontinuity orientation, roughness, length, spacing and
block size. In many cases, this information can be automatically acquired using currently available point cloud
processing software. There are advantages to using LiDAR for collecting this information, including improved
safety, accuracy, slope access, and speed analysis. It is recommended that LiDAR be utilized for future highway
slope stability projects.
Both surveying and photogrammetry capture 3-D measurements of the Earth’s surface. One of the major
differences between them is accuracy. For some surveying tasks, the accuracy requirement cannot be met by
photogrammetry because photogrammetric image resolution, or GSD, is constrained by flying height. With rapid
advancements in ground-based imaging and LiDAR technology, image and LiDAR datasets are becoming so dense
that 3-D measurements with survey grade accuracy are achievable. For example, it is feasible to achieve 1 cm XY
accuracy from 0.5 cm GSD ground-based images. Survey grade accuracy is relative accuracy in many cases. For
example, the size of a crack or the size of a pothole on a highway surface is a relative measurement; the
absolute accuracy is not relevant. For relative accuracy, the image GSD plays the dominant role. Ground-based images
can have a very small GSD because the imaging cameras are very close to the imaged objects.
Digital photogrammetry software offers efficient tools for extracting 3-D and 2.5-D features from stereo images
(Zhang 2006). Some of these tools can be used to extract 3-D and 2.5-D features from ground-based images and
ground-based LiDAR point clouds. These tools can improve the productivity for surveyors to the extent that a large
portion of their work can be done in the office rather than in the field. Field surveying is subject to harsh weather,
rough terrain, and limited safe access, while obtaining 3-D measurements in an office is convenient and productive.
Digital photogrammetry and LiDAR software has reached a high state of maturity. Therefore, it can be used more
extensively.
In the following section, we introduce the Smart Image concept and discuss how Smart Images are generated and applied.
The next section presents test results that validate the accuracy of Smart Images for 3-D measurements, followed by a
discussion of the tradeoffs between Smart Images and ortho images. The final section summarizes
our findings and recommendations.
SMART IMAGES
We define Smart Images as a dataset that includes the image pixels, the image’s sensor model and sensor model
accuracy, and an elevation or range at each pixel with its associated accuracy. The Smart Image concept has been
around for more than a decade (Spann 2011), and is increasingly relevant due to rapid advancements in LiDAR
technology. Smart Images require very dense and accurate digital surface models (DSMs), which LiDAR systems
can now provide. For a given ground point A(X, Y, Z), an image point a (xa, ya) can be uniquely determined using the
image sensor model. However, the computed image point may not actually image the ground point when the ground point lies
in an occluded area, as shown in Figure 1.
Figure 1. Ground-to-image projection yields the wrong image detail due to occlusion. S is a sensor or camera. Pixel a is
the projected image point for ground point A, but because A is occluded by the building, pixel a does not actually image A.
To encode the correct elevation or range for each pixel, we need to select the ground point that is closest to the
sensor. As shown in Figure 2, both ground point A1 and A2 yield the same image point a. Ground point A2 is closer
to the sensor S than ground point A1. Therefore, the correct elevation or range for pixel a should be selected from
ground point A2. Some researchers (Spann 2011) propose using range instead of elevation to encode each pixel. A
range is defined as the distance between a sensor and a ground point. Using range is more suitable for ground-based
LiDAR point clouds and images. For aerial LiDAR point clouds and images, range or elevation are almost
equivalent.
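The closest-point rule can be sketched as a simple z-buffer pass over the LiDAR points. This is an illustrative sketch only, not the production encoder: the `project` callback, which maps a ground point to its image pixel and range, is a hypothetical stand-in for the full sensor model.

```python
import numpy as np

def encode_ranges(points, project, width, height):
    """Encode, for each pixel, the range of the LiDAR point closest to
    the sensor, so occluded points (A1 in Figure 2) are rejected.

    points  : (N, 3) array of ground points
    project : function mapping a ground point to (col, row, range)
    """
    # Initialize every pixel to "no return" (infinite range).
    range_band = np.full((height, width), np.inf)
    for p in points:
        col, row, rng = project(p)
        if 0 <= col < width and 0 <= row < height:
            # Keep the smaller range: the point nearer the sensor wins.
            range_band[row, col] = min(range_band[row, col], rng)
    return range_band
```

The same loop encodes elevation instead of range by keeping the point with the smaller range but storing its Z value.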
To determine the corresponding unique ground point for each pixel in a Smart Image, we need to use the image
point line and sample coordinates plus its elevation or range. It is well known in photogrammetry that a given
ground point A(X, Y, Z) can yield a unique image point a (xa, ya) using the collinearity equations (1) and (2) (Wolf and
Dewitt 2000) for frame images. However, a given image point a (xa, ya) does not determine a unique ground point
A(X, Y, Z); the image coordinates must be combined with the correct elevation or range. In Equations (1) and (2), the
ground coordinates XA and YA can be uniquely determined when ZA is fixed, the image coordinates xa and ya are given,
and the remaining terms are known.
Figure 2. Encode correct elevation to each pixel. Both ground points A1 and A2 yield the same image coordinates,
but A2 yields the correct elevation for the corresponding image point a. A2 is closer to sensor S than A1. Smart
Image uses elevation or range from A2 instead of A1 to encode pixel a.
xa = -f [ m11(XA - XL) + m12(YA - YL) + m13(ZA - ZL) ] / [ m31(XA - XL) + m32(YA - YL) + m33(ZA - ZL) ]    (1)

ya = -f [ m21(XA - XL) + m22(YA - YL) + m23(ZA - ZL) ] / [ m31(XA - XL) + m32(YA - YL) + m33(ZA - ZL) ]    (2)

where f is the focal length, mij are the elements of the rotation matrix formed from the attitude angles omega, phi, and
kappa, (XL, YL, ZL) are the ground coordinates of the exposure station, and (XA, YA, ZA) are the ground coordinates of
point A.
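As a concrete illustration of this inversion, the sketch below solves Equations (1) and (2) for XA and YA with ZA fixed, by intersecting the image ray with the plane Z = ZA. The function name and argument layout are our own; the symbols follow Wolf and Dewitt (2000).

```python
import numpy as np

def collinearity_xy(xa, ya, f, R, XL, YL, ZL, ZA):
    """Invert the collinearity equations for a frame image: given image
    coordinates (xa, ya) and a fixed elevation ZA (e.g. from a Smart
    Image's elevation band), solve for the ground coordinates XA, YA.

    R is the 3x3 rotation matrix built from omega, phi, kappa;
    (XL, YL, ZL) is the exposure station; f is the focal length.
    """
    # Direction of the image ray, rotated into ground coordinates.
    d = R.T @ np.array([xa, ya, -f])
    # Scale the ray so that it intersects the plane Z = ZA.
    s = (ZA - ZL) / d[2]
    XA = XL + s * d[0]
    YA = YL + s * d[1]
    return XA, YA
```

For a vertical photo (R = identity) at height 10 with f = 1, an image point (0.1, 0.2) maps to ground (1.0, 2.0) on the plane Z = 0.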
Encoding the correct elevation or range for each pixel is a complex process when the image GSD is much smaller
than the LiDAR point cloud spacing. LiDAR points are not uniformly distributed, which makes the encoding even more
difficult because each LiDAR point covers a varying number of image pixels. We have developed an algorithm that
takes the adjacent LiDAR points into consideration to determine the appropriate number of image pixels that each
LiDAR point should cover, ensuring that there are no gaps and no overlaps.
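One simple way to realize the gap-free, overlap-free property (though not necessarily the algorithm described above) is nearest-neighbor assignment in image space: each pixel takes the elevation of the closest LiDAR point, so each point implicitly covers its Voronoi cell of pixels. A brute-force sketch, assuming the points have already been projected to image coordinates:

```python
import numpy as np

def fill_elevation_band(point_pixels, point_z, width, height):
    """Assign an elevation to every pixel by nearest-neighbor lookup in
    image space. Each LiDAR point then covers the pixels of its Voronoi
    cell, so the number of pixels per point adapts to the local point
    density, with no gaps and no overlaps.

    point_pixels : (N, 2) array of (col, row) image positions of points
    point_z      : (N,) array of elevations
    """
    cols, rows = np.meshgrid(np.arange(width), np.arange(height))
    pix = np.stack([cols.ravel(), rows.ravel()], axis=1)  # (W*H, 2)
    # Squared distance from every pixel to every LiDAR point.
    d2 = ((pix[:, None, :] - point_pixels[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)          # index of the closest point
    return point_z[nearest].reshape(height, width)
```

A production encoder would use a spatial index rather than this O(pixels x points) distance matrix, but the covering behavior is the same.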
TEST RESULTS
Test results indicate that Smart Images can achieve an accuracy that is comparable to the image GSD for well-
defined ground points or for 2.5-D features such as roads. For 3-D features such as buildings and houses, the
accuracy is close to the LiDAR point cloud spacing. We used a pair of stereo images and four LiDAR LAS files for
testing and validation. The stereo pair consisted of scanned images with a GSD of 0.07 meters. The LiDAR point cloud
has a density of 25 points per square meter, or 0.2 meter spacing. The ratio between the LiDAR point cloud spacing
and the image GSD is approximately 3. One Smart Image was generated using the left image and the four LAS files. The
Smart Image is identical to the original image except that it has an additional band which stores an elevation for each
pixel. The test dataset was provided by Ordnance Survey, Great Britain.
In Table 1, 19 well-defined ground points were measured from the Smart Image and the stereo images. Xsmart,
Ysmart, and Zsmart are the ground coordinates measured from the Smart Image; Xstereo, Ystereo, and Zstereo are the
corresponding ground coordinates measured from the high-resolution, highly accurate stereo images. ΔX, ΔY, and ΔZ
are the differences between the two sets of coordinates. All units are in meters. There is an
elevation bias of 0.124 meters between the LiDAR point cloud and the stereo images. Removing this bias reduces the
root mean square error (RMSE) in elevation from 0.129 meters to 0.035 meters. The standard deviations of the X and Y
coordinates are 0.091 meters and 0.025 meters, respectively. As shown in Figure 3, the 19 points are the end points
of road marks that are parallel to the X axis. As a result, the Y coordinate measurements are more accurate
than the X coordinate measurements. Table 1 indicates that Smart Images have an accuracy that is comparable to the
image GSD for well-defined ground points or 2.5-D features.
Table 1. Accuracy of 3-D measurements from Smart Images vs. stereo images with well-defined ground points.
id Xsmart Xstereo ΔX Ysmart Ystereo ΔY Zsmart Zstereo ΔZ ΔZ-bias
1 370.990 370.982 -0.008 104.677 104.709 0.032 7.640 7.756 0.116 -0.008
2 379.773 379.763 -0.010 104.249 104.235 -0.014 7.590 7.715 0.125 0.001
3 388.945 388.883 -0.062 103.234 103.215 -0.019 7.537 7.674 0.137 0.013
4 397.832 397.816 -0.016 102.044 102.059 0.015 7.549 7.633 0.084 -0.040
5 415.567 415.477 -0.090 099.574 099.593 0.019 7.470 7.592 0.122 -0.002
6 442.442 442.361 -0.081 095.921 095.919 -0.002 7.601 7.715 0.114 -0.010
7 467.205 467.164 -0.041 092.515 092.504 -0.011 7.652 7.797 0.145 0.021
8 494.128 494.031 -0.097 088.911 088.882 -0.029 7.410 7.592 0.182 0.058
9 529.806 529.762 -0.044 084.124 084.120 -0.004 7.422 7.511 0.089 -0.035
10 565.461 565.357 -0.104 079.352 079.341 -0.011 7.580 7.715 0.135 0.011
11 636.999 636.853 -0.146 069.674 069.647 -0.027 7.413 7.551 0.138 0.014
12 645.780 645.735 -0.045 068.362 068.303 -0.059 7.470 7.551 0.081 -0.043
13 654.662 654.581 -0.081 066.720 066.738 0.018 7.530 7.592 0.062 -0.063
14 663.587 663.513 -0.074 064.822 064.848 0.026 7.610 7.715 0.105 -0.019
15 672.259 672.172 -0.087 062.695 062.669 -0.026 7.760 7.838 0.078 -0.046
16 680.830 680.676 -0.154 060.370 060.388 0.018 7.770 7.961 0.191 0.067
17 689.451 689.368 -0.083 057.702 057.715 0.013 7.990 8.124 0.134 0.011
18 697.978 697.854 -0.124 054.362 054.394 0.032 8.100 8.247 0.147 0.023
19 706.180 706.016 -0.164 050.448 050.476 0.028 8.190 8.370 0.180 0.056
Total 2.365
Bias 0.124
STD 0.091 0.025 0.129
RMSE 0.035
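The bias-removal arithmetic behind Table 1 can be reproduced as follows; the function name is ours, and the inputs correspond to the Zsmart and Zstereo columns.

```python
import numpy as np

def bias_and_rmse(z_smart, z_stereo):
    """Compute the elevation bias between two measurement sets and the
    RMSE before and after removing that bias."""
    dz = z_stereo - z_smart          # per-point elevation differences
    bias = dz.mean()                 # systematic offset between datasets
    rmse_raw = np.sqrt((dz ** 2).mean())
    rmse_debiased = np.sqrt(((dz - bias) ** 2).mean())
    return bias, rmse_raw, rmse_debiased
```

Applied to the ΔZ column of Table 1, this should reproduce the reported 0.124 m bias and the reduction of RMSE from 0.129 m to 0.035 m.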
Figure 3. 3-D measurements of 19 well-defined ground points. The yellow line in the left image consists of 19
vertices. Each vertex is at the end of a road mark, as shown in the right image. The road marks are parallel to the X
direction. The measurement cursor can be placed more accurately in the Y direction than in the X direction.
In Table 2, 27 corner and edge points of a building were measured from the Smart Image and stereo images.
Again, there is an elevation bias of approximately 0.116 meters, similar to the elevation bias of 0.124 meters
in Table 1. The ratio of LiDAR point spacing to image GSD is about 3. Therefore, each LiDAR point is used to
compute the elevation for roughly 3x3 pixels. When there is an elevation discontinuity, this ratio plays an important
role in the accuracy of 3-D measurements. As indicated in Table 2, the errors are close to the image GSD times the
ratio of the LiDAR point spacing to the image GSD.
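This rule of thumb amounts to simple arithmetic:

```python
# Expected horizontal error at an elevation discontinuity: roughly the
# LiDAR point spacing, i.e. the image GSD times the spacing-to-GSD ratio.
gsd = 0.07    # image ground sample distance, meters
ratio = 3.0   # LiDAR point spacing / image GSD for this dataset
expected_error = gsd * ratio   # about 0.2 m, i.e. the point spacing
print(round(expected_error, 2))  # → 0.21
```

This matches the 2- to 3-GSD errors observed for the building corners and edges in Table 2.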
Table 2. Accuracy of 3-D measurements from Smart Images vs. from stereo images for building corners and edges
where there is an elevation discontinuity.
id Xsmart Xstereo ΔX Ysmart Ystereo ΔY Zsmart Zstereo ΔZ ΔZ-bias
1 493.832 493.787 -0.045 216.715 216.661 -0.054 10.870 11.071 0.201 0.085
2 505.864 505.715 -0.149 212.011 211.800 -0.211 11.544 11.889 0.345 0.229
3 513.733 513.421 -0.312 211.675 211.447 -0.228 11.580 11.889 0.309 0.193
4 514.903 514.512 -0.391 210.854 210.968 0.114 14.728 14.754 0.026 -0.090
5 514.581 514.535 -0.046 225.788 225.856 0.068 12.120 12.298 0.178 0.062
6 514.760 514.810 0.050 226.658 226.791 0.133 11.917 11.889 -0.028 -0.144
7 515.339 515.442 0.103 238.559 238.521 -0.038 11.904 11.889 -0.015 -0.131
8 521.325 521.305 -0.020 237.085 237.011 -0.074 16.550 16.800 0.250 0.134
9 521.432 521.324 -0.108 249.185 249.083 -0.102 17.180 17.209 0.029 -0.087
10 545.742 545.671 -0.071 248.468 248.500 0.032 17.276 17.209 -0.067 -0.172
11 545.035 545.022 -0.013 225.627 225.448 -0.179 17.240 17.209 -0.031 -0.147
12 545.457 545.430 -0.027 225.048 225.106 0.058 12.018 11.889 -0.129 -0.245
13 618.230 617.995 -0.235 222.493 222.469 -0.024 12.079 12.298 0.219 0.103
14 617.230 617.153 -0.077 207.315 207.329 0.014 15.610 15.572 -0.038 -0.154
15 616.958 616.787 -0.171 191.955 191.848 -0.107 12.064 12.298 0.234 0.118
16 616.128 615.945 -0.183 176.675 176.572 -0.103 15.479 15.981 0.502 0.386
17 625.753 625.698 -0.055 176.379 176.420 0.041 14.640 14.754 0.114 -0.002
18 625.627 625.557 -0.070 173.605 173.624 0.019 15.121 15.163 0.042 -0.074
19 624.945 624.787 -0.158 160.613 160.735 0.122 11.870 11.889 0.019 -0.097
20 561.050 561.020 -0.030 163.125 163.153 0.028 12.010 11.889 -0.121 -0.237
21 560.891 560.743 -0.148 160.559 160.562 0.003 11.560 11.889 0.329 0.213
22 512.442 512.390 -0.052 162.393 162.274 -0.119 11.430 11.480 0.050 -0.066
23 513.298 513.373 0.075 180.310 180.278 -0.032 14.708 14.754 0.046 -0.070
24 513.402 513.524 0.122 189.497 189.554 0.057 13.214 13.117 -0.007 -0.123
25 513.125 512.570 0.445 190.113 190.169 0.056 10.817 11.071 0.254 0.138
26 492.878 492.928 0.050 190.775 190.745 -0.030 10.750 11.071 0.321 0.205
27 493.690 493.562 -0.128 203.725 203.771 0.046 13.040 13.117 0.077 -0.039
Total 3.120
Bias 0.116
STD 0.164 0.101 0.197
RMSE 0.155
Figure 4. 3-D measurements of 27 corner and edge points of a building. There is a significant elevation
discontinuity around each point. The elevation discontinuity introduces errors 2 to 3 times the image GSD, as shown
in Table 2. These errors are close to the image GSD (0.07 meters) times the ratio (3.0) of the LiDAR point spacing to
the image GSD.
ORTHO IMAGES vs. SMART IMAGES
Ortho images are widely used for geospatial applications. They are easy to use and intuitive for consumers.
Ortho images can be mosaicked using simple image software, while Smart Images require more sophisticated logic
because each Smart Image has its own sensor model and sensor parameters, such as attitude (e.g. omega,
phi, and kappa). Every pixel in an ortho image has a pair of ground coordinates (X and Y) associated with it.
Its elevation can be obtained when a digital terrain model (DTM) is available. Many GIS customers use ortho
images and DTM to extract features such as roads. A mosaicked ortho image allows GIS software to easily generate
different resolution sets, zoom, and roam seamlessly. However, orthos are resampled images that are somewhat
degraded by the rectification process, and their error modeling is difficult or non-existent. Smart Images contain the
error modeling for XYZ extraction, and their image pixels are not degraded by rectification to a DTM.
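The "easy to use" property of ortho images comes down to a single affine mapping from pixel to ground. A sketch using a GDAL-style six-parameter geotransform (the tuple layout follows GDAL's convention; the function itself is our own illustration):

```python
def ortho_pixel_to_ground(col, row, gt):
    """Map an ortho image pixel (col, row) to ground X, Y using a
    GDAL-style geotransform gt = (x0, dx, rx, y0, ry, dy)."""
    x0, dx, rx, y0, ry, dy = gt
    X = x0 + col * dx + row * rx   # ground easting
    Y = y0 + col * ry + row * dy   # ground northing (dy is negative
                                   # for north-up images)
    return X, Y

# A north-up ortho with 0.5 m pixels and origin at (1000, 2000):
print(ortho_pixel_to_ground(10, 20, (1000.0, 0.5, 0.0, 2000.0, 0.0, -0.5)))
# → (1005.0, 1990.0)
```

A Smart Image, by contrast, needs the full sensor model of Equations (1) and (2) for the same lookup, which is the extra logic referred to here.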
GIS software must have more sophisticated logic to work with Smart Images. For example, roaming around the
image would require GIS software to load different Smart Images based on location as well as look angle. In
Figure 4, for instance, GIS software needs to load the right image to view the right side of the building. Some GIS
software may not have photogrammetric sensor modeling technology. While sensor modeling is well
understood for frame images, it is more complex for other types of imaging sensors, such as the ADS80 and satellite
sensors.
Figure 5. Artifacts and distortions in ortho images. The left image is a raw image or Smart Image; the right image is
the corresponding ortho image. A straight line at the cursor location in the raw image is distorted in the ortho image.
Smart Images have certain advantages over ortho images. Smart Images do not have image artifacts and
distortions which are typical in ortho images as shown in Figure 5. The process of ortho generation moves raw
pixels around, which inevitably distorts images. Both ortho images and true ortho images have image distortion
problems.
Figure 6. Here a Smart Image displays the latitude, longitude, elevation, and accuracy for the cursor. The cursor
changes color to indicate where discontinuities occur. The upper-right overviews show range and range error
respectively. Smart images allow the user to measure XYZ and accuracy anywhere in the image and 3-D extraction
such as on these buildings can be done monoscopically (Courtesy of Spann 2011).
Smart Images are more accurate than ortho images. This is especially true for 3-D features such as buildings
and houses as shown in Figures 6 & 7. 3-D features and 3-D GIS are gaining popularity not only in the geospatial
profession, but also in consumer markets. For example, both Google Earth and Microsoft Bing offer 3-D features
among their main products, which appeals to consumers.
Smart Images have the potential to bring photogrammetry into the non-photogrammetry community.
Consumer platforms such as Google Earth and Microsoft Bing could extract accurate, true 3-D features with Smart
Images. Smart Images could also be very useful for robotic applications, such as automatic object classification and
recognition.
By adding elevation or range information for each pixel, image classification accuracy would improve.
Figure 7. Ortho images are less accurate for 3-D feature modeling. Ortho images that are generated using a DEM
(bare-earth terrain model) are accurate for pixels that are on the ground. However, pixels on 3-D features such as
buildings, houses, and trees are shifted. Comparing this figure with Figure 4, the building roof boundaries (cyan
lines) have accurate XYZ coordinates. Ortho image corners and edges are shifted away from the accurate building
roof boundaries (cyan lines and vertices).
SUMMARY
Accurate 3-D measurements can be obtained from dense and accurate ground-based LiDAR point clouds and
images. Many photogrammetry software tools can be used for survey-grade 3-D measurements. The
convergence of surveying and photogrammetry practices can bring increased productivity and a more comfortable
working environment to surveyors. For example, there are 4 million miles of public roads in the U.S. alone. Smart
Images provide an innovative and efficient way to monitor and maintain public roads by performing 3-D
measurements of road surfaces from the office.
Test results indicate that Smart Images can achieve an accuracy that is comparable to the image GSD for well-
defined ground points or 2.5-D features such as roads. For 3-D features such as buildings and houses, the accuracy is
close to LiDAR point cloud spacing. The accuracy of 3-D measurements is constrained by LiDAR point cloud
density when there is an elevation discontinuity. For corners and edges of 3-D features, there is an elevation
discontinuity. Therefore, the accuracy of their 3-D coordinates is determined by the LiDAR point cloud density, the
image GSD, and the registration accuracy between LiDAR point clouds and images. For 2.5-D features, accuracy is
influenced to a greater degree by the image GSD. Accurate registration between images and LiDAR point clouds is
necessary for accurate 3-D measurements. Automatically registering LiDAR point clouds with images remains a hot
research topic.
In conclusion, Smart Images are a good alternative to ortho images for GIS applications. When dense and
accurate LiDAR point clouds are available, Smart Images can be more accurate than ortho images. Smart Images do
not have image artifacts and distortions that ortho images may have. Smart Images require more complex GIS
software than ortho images. Smart Images could pave the way for applying photogrammetric technology to non-
photogrammetric applications and consumers.
ACKNOWLEDGEMENTS
I am especially grateful to my colleagues Joseph Spann and Scott Miller for many in-depth discussions about
Smart Images and their invaluable review of and comments on this paper. Special thanks go to Ms. Carolyn Gordon for
editing this paper and some of my other publications. I would like to acknowledge Ordnance Survey of Great Britain
and Navteq Research for providing the test datasets. Most of this work was done at home on weekends, and I would
like to thank my wife Lucy for her support.
REFERENCES
Kemeny, J. and Turner, K. (2008). Ground-Based LiDAR Rock Slope Mapping and Assessment. Publication No.
FHWA-CFL/TD-08-006, U.S. Department of Transportation, Federal Highway Administration, September 2008.
Spann, J. (2011). Exploitation of UAV and LiDAR Data. ASPRS 2011 Annual Conference, Milwaukee, 1-5 May 2011.
Wolf, P.R. and Dewitt, B.A. (2000). Elements of Photogrammetry with Applications in GIS. 3rd edition, McGraw-Hill.
Zhang, B. (2006). Processing LiDAR Data in SOCET SET – Exploring Its Full Potential. International LiDAR
Mapping Forum, Denver, 13-14 February 2006.