
* [email protected]; phone +39 0461 314446; fax +39 0461 314340; http://3dom.fbk.eu/

Improving automated 3D reconstruction methods via vision metrology

Isabella Toschi a, Erica Nocerino a, Mona Hess b, Fabio Menna a, Ben Sargeant b, Lindsay MacDonald b, Fabio Remondino a, Stuart Robson b

a 3D Optical Metrology (3DOM) unit, Bruno Kessler Foundation (FBK), Trento, Italy
b Department of Civil, Environmental and Geomatic Engineering (CEGE), UCL, London, UK

ABSTRACT

This paper aims to provide a procedure for improving automated 3D reconstruction methods via vision metrology. The 3D reconstruction problem is generally addressed using two different approaches. On the one hand, vision metrology (VM) systems try to accurately derive the 3D coordinates of a few sparse object points for industrial measurement and inspection applications; on the other hand, recent dense image matching (DIM) algorithms are designed to produce dense point clouds for surface representation and analysis. This paper demonstrates a step towards narrowing the gap between traditional VM and DIM approaches. Efforts are therefore intended to (i) test the metric performance of the automated photogrammetric 3D reconstruction procedure, (ii) enhance the accuracy of the final results and (iii) obtain statistical indicators of the quality achieved in the orientation step. VM tools are exploited to integrate their main functionalities (centroid measurement, photogrammetric network adjustment, precision assessment, etc.) into the pipeline of dense 3D reconstruction. Finally, geometric analyses and accuracy evaluations are performed on the raw output of the matching (i.e. the point clouds) by adopting a metrological approach, based on the use of known geometric shapes and quality parameters derived from VDI/VDE guidelines. Tests are carried out by imaging the calibrated Portable Metric Test Object, designed and built at University College London (UCL), UK. It allows assessment of the performance of the image orientation and matching procedures within a typical industrial scenario, characterised by poor texture and known 3D/2D shapes.

Keywords: photogrammetry, vision metrology, dense image matching, computer vision, accuracy, precision, SIFT, circular target centroid, pattern projection

1. INTRODUCTION

1.1 Background

The 3D reconstruction problem is a fundamental issue of vision systems and refers to the process of recovering 3D information about a surveyed scene from two or more images taken from different viewpoints. Several methods have been developed to address the same basic question, i.e. how to compute the 3D position of an object point, given two (or more) corresponding image points. Literature in this field can be divided into two general approaches:
(i) vision metrology (VM) systems, which aim to accurately derive 3D coordinates for sparsely distributed object points;
(ii) dense image matching methods, which are designed to produce dense point clouds for surface representation.

The first approach is based on well-known principles1,2,3 that have gained widespread acceptance for industrial measurement, engineering, medical, navigation and inspection applications4. Its priorities have remained essentially constant over the years: measurement results shall be accurate, repeatable and traceable to national or international standards. Photogrammetric multi-view measurements feature relative accuracies in the order of 1:50,000 to 1:100,000 of the principal dimension of the object5. To achieve this performance, VM systems adopt carefully designed image networks, calibrated cameras and highly structured scenes with coded targets and scale bars. Traditionally, these systems have been employed to assign 3D coordinates to distinct object points, such as targets placed at key locations or target-less features of interest6,7.

On the other hand, several software solutions have recently been developed to automatically retrieve dense 3D point clouds from a set of un-oriented and un-calibrated images8,9,10. The automatic 3D reconstruction procedure consists of two main steps, namely (i) camera calibration and image orientation (often called Structure from Motion, SfM) and (ii) dense image matching (DIM). Fully automated methods for these steps were originally developed within the 3D computer vision community11,12 and then adopted by the photogrammetric community. They were designed to address unstructured and unknown scenes (i.e. without target points and lacking metric references) and to fully automate image analyses. Both low-cost software packages and open-source solutions are nowadays available, providing users with automated procedures for image orientation (SfM) and dense 3D reconstruction at different scales, thus covering diverse application fields (e.g. cultural heritage documentation, archaeological mapping, architectural design, etc.). The metrological consistency of the resulting 3D measurements is highly dependent on the quality of the imaged surface (structure and texture), on the image network configuration and on the matching algorithm13,14. Furthermore, the large degree of automation is normally counter-balanced by an absence of statistical and evaluation parameters, which prevents a proper quality analysis of the obtained numerical results.

Whilst a decade ago there was a sharp difference between the two aforementioned approaches, the gap is now narrowing. Owing to the exchange and sharing of the most effective algorithms, it is difficult to classify the 3D imaging approaches lately proposed by the research communities. Furthermore, recent advances in both hardware and software technologies (e.g. GPU-supported solutions) have provided effective procedures for image processing and management, which now enable real-time applications even when large datasets are involved. Finally, some industrial applications require the accurate 3D reconstruction of deformed free-form surfaces without any control points. For instance, photogrammetric car-crash test recordings adopt on-board high-speed stereo cameras to generate 3D point clouds of the deformed object area15. Such improvements and overlapping demands raise the expectations of the scientific and industrial communities for a metrologically oriented use of automated techniques for image orientation and dense 3D reconstruction.

1.2 Paper objectives

The aim of the paper is to demonstrate the possibility of bridging the gap between VM and target-less automated 3D imaging approaches, by testing the metric performance of the automated photogrammetric 3D reconstruction procedure. The latter includes many tasks and, among them, image block orientation and camera calibration are the most crucial. In order to enhance the accuracy of the final 3D results and obtain statistical indicators of the orientation step, the well-established algorithms implemented in classical photogrammetric packages for the accurate measurement of single points are adopted. The VM tools Vision Measurement System (VMS)16 and PhotoModeler® (PM)17 are thus exploited to integrate their main functionalities (centroid measurement, photogrammetric network adjustment, precision assessment, etc.) into the pipeline of dense 3D reconstruction. Finally, further geometric analyses and accuracy evaluations are performed on the raw output of the dense matching (i.e. the point clouds) by adopting a metrological approach. This is based on the use of known geometric shapes and quality parameters derived from VDI/VDE guidelines18,19. Tests are carried out by imaging a calibrated Portable Metric Test Object designed and built at UCL CEGE20,21. It supports the evaluation of several geometric parameters, such as sphere spacing error, sphere diameter error, plane spacing error, angular error and structural resolution. Moreover, it features circular targets that can be adopted to further improve the photogrammetric network adjustment and evaluate its quality. Finally, the test-object is small and poorly textured, which is often the case in industrial applications. It thus allows assessment of the performance of the automatic image orientation and matching procedures in dealing with a challenging artefact.

2. METHODOLOGY

The automatic 3D reconstruction procedure consists of two main steps, namely image triangulation (including camera calibration) and dense image matching. Given the images, the first task requires the identification of homologous points in different views of the same 3D scene. Once a set of homologous points (also called image correspondences or tie points) is identified, the exterior orientation parameters of the images, the interior parameters of the camera and the 3D object coordinates of the feature points are automatically computed through an iterative process based on a robust bundle adjustment. Finally, the dense 3D reconstruction is performed with dense matching algorithms14 able to deliver a dense point cloud with up to one 3D point for every pixel. Although this pipeline is based on well-established algorithms derived from both the photogrammetric and computer vision communities, detailed information about the internal quality achieved by the process is normally not provided. So far, this has generally precluded the application of such automated approaches in the industrial field, where rigorous statistical parameters (e.g. reliability, normalized correction, standard error of unit weight, etc.) and quality analyses of the results are mandatory.

To address this issue, this paper proposes an enhanced photogrammetric procedure that exploits the main functionalities of classical photogrammetric packages for the accurate measurement of single points. In particular, their algorithms and statistical analyses, together with an in-house filtering tool, are integrated into the classical photogrammetric 3D reconstruction pipeline (VM). The resulting methodology is summarized in Figure 1 and further described in the following subsections.
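To make this integration concrete, the following is a minimal sketch of the core computation of such a self-calibrating bundle adjustment: camera parameters, image poses and object points are refined together by minimizing robustly weighted re-projection errors. It assumes a simplified pinhole model (focal length and principal point only, no distortion terms) and uses SciPy's least-squares solver; all names are illustrative and do not reflect the actual VMS or PhotoModeler implementations.

import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def reproject(points3d, rvecs, tvecs, focal, cx, cy):
    """Project 3D points into every image with per-image rotation/translation."""
    uv = []
    for rvec, tvec in zip(rvecs, tvecs):
        R = Rotation.from_rotvec(rvec).as_matrix()
        pc = points3d @ R.T + tvec                      # points in camera frame
        uv.append(focal * pc[:, :2] / pc[:, 2:3] + [cx, cy])
    return np.stack(uv)                                 # (n_images, n_points, 2)

def residuals(params, n_imgs, n_pts, observed, visible):
    """Observed minus re-projected image coordinates (visible points only)."""
    focal, cx, cy = params[:3]
    rvecs = params[3:3 + 3 * n_imgs].reshape(n_imgs, 3)
    tvecs = params[3 + 3 * n_imgs:3 + 6 * n_imgs].reshape(n_imgs, 3)
    pts = params[3 + 6 * n_imgs:].reshape(n_pts, 3)
    return (reproject(pts, rvecs, tvecs, focal, cx, cy) - observed)[visible].ravel()

# observed: (n_imgs, n_pts, 2) tie-point measurements; visible: boolean mask;
# x0 stacks [focal, cx, cy, rvecs, tvecs, points3d]; a robust loss damps outliers:
# result = least_squares(residuals, x0, loss="huber",
#                        args=(n_imgs, n_pts, observed, visible))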


Figure 1. The procedural workflow adopted in the project.

2.1 Photogrammetric network adjustment

The photogrammetric network adjustment requires a set of image correspondences, manually or automatically extracted, as its input. VM approaches for high precision applications usually adopt artificial object features (e.g. targets) to identify the necessary image correspondences. Automatic sub-pixel measurements based on centroids, ellipse shape fitting or least-squares template matching are adopted in order to achieve the highest precision for feature detection5.
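As an illustration of the first of these measurement operators, the sketch below computes a grey-value weighted centroid of a bright circular target inside a small image window; the background threshold is an assumed example parameter, and real VM implementations add ellipse fitting and validity checks.

import numpy as np

def weighted_centroid(window, threshold=0.0):
    """Grey-value weighted centroid (row, column) of a bright target window."""
    w = np.clip(window.astype(float) - threshold, 0.0, None)   # suppress background
    total = w.sum()
    if total == 0:
        raise ValueError("no target signal in window")
    rows, cols = np.indices(window.shape)
    return (rows * w).sum() / total, (cols * w).sum() / total  # sub-pixel position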

On the other hand, a variety of algorithms have been developed by the computer vision community to automatically extract a large number of points or regions of interest from images of unstructured and unknown scenes11,23. Among these solutions, the Scale Invariant Feature Transform (SIFT)22 algorithm provides highly distinctive features by following four main steps, i.e. scale-space extrema detection, keypoint localization, orientation assignment and keypoint descriptor creation. Corresponding points are then detected by comparing and matching the descriptors among the different images.
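A minimal sketch of this detect-and-match scheme is given below, using OpenCV's SIFT implementation together with Lowe's ratio test. This is purely illustrative: the experiments described later rely on the SIFT-like operator built into the SfM software, not on OpenCV.

import cv2

def sift_correspondences(img1, img2, ratio=0.8):
    """Match SIFT descriptors between two greyscale images (Lowe's ratio test)."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = []
    for m, n in matcher.knnMatch(des1, des2, k=2):
        if m.distance < ratio * n.distance:    # best match clearly beats second best
            pairs.append((kp1[m.queryIdx].pt, kp2[m.trainIdx].pt))
    return pairs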

In this paper, both approaches for homologous point identification, i.e. target centroid computation and SIFT point detection, are tested. In the second case, a filtering and regularization procedure is additionally applied to the extracted tie points, using a tool internally developed at FBK 3DOM24,25. The method reduces the number of image observations, so that they can be efficiently handled by classical photogrammetric bundle adjustments. It regularizes the point distribution in object space, while preserving connectivity and high multiplicity between observations.
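The FBK 3DOM tool itself is described in the cited papers; the sketch below only illustrates the kind of criteria involved (a minimum track multiplicity, a re-projection error bound and a regularization of the point distribution on a coarse object-space grid). All thresholds are assumed example values.

import numpy as np

def filter_tie_points(tracks, min_multiplicity=3, max_reproj_err=1.0, cell=0.01):
    """Thin SfM tie points before a classical bundle adjustment.

    tracks: list of dicts with keys 'xyz' (3D point), 'n_images' (multiplicity)
    and 'reproj_err' (mean re-projection error in pixels).
    """
    good = [t for t in tracks
            if t["n_images"] >= min_multiplicity and t["reproj_err"] <= max_reproj_err]
    best = {}
    for t in sorted(good, key=lambda t: -t["n_images"]):
        key = tuple(np.floor(np.asarray(t["xyz"]) / cell).astype(int))
        best.setdefault(key, t)        # keep one point per cell, highest multiplicity
    return list(best.values())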

The extracted image correspondences are then included in a self-calibrating bundle adjustment performed with the VM software packages. Two different processes are carried out, using the image observations derived from (i) the circular target centroid computation and (ii) the SIFT point detection and filtering. In addition, a further test is performed by the simultaneous use of both types of observations, adequately weighted. A free-network bundle adjustment is adopted in all cases, followed by a rigid similarity transformation for the definition of the scale and of a common coordinate reference system.
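The closing similarity transformation can be estimated from a few well-distributed corresponding points (e.g. targets with known coordinates or scale-bar end points); a minimal SVD-based (Umeyama) sketch, assuming at least three non-collinear correspondences, is:

import numpy as np

def similarity_transform(src, dst):
    """Least-squares scale s, rotation R and translation t with dst ~ s*R*src + t."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    A, B = src - mu_s, dst - mu_d
    U, S, Vt = np.linalg.svd(B.T @ A / len(src))    # cross-covariance matrix
    D = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:                   # guard against reflections
        D[2, 2] = -1.0
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) * len(src) / (A ** 2).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t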

2.2 3D reconstruction

Starting from the adjustment results (i.e. the computed interior and exterior orientations), the 3D reconstruction process determines the 3D position of object points. For VM systems, this normally means computing the 3D object space coordinates of targets. Dense image matching methodologies, on the other hand, aim to determine correspondence information for every pixel, in order to derive dense point clouds describing the imaged surface or object. Algorithms are generally categorized into local and global14,26. The former compute the disparity at a given point using the intensity values within a finite region, and are thus sensitive to the choice of the window size and to locally ambiguous areas in the images. Global methods, on the contrary, seek the optimal global solution to the matching problem. Recently, semi-global methods27 have also been introduced in order to provide an efficient solution through an approximation of the global model. This third approach has gained wide acceptance in the photogrammetric community, which is increasingly involved in improving it28.

A semi-global approach is here adopted to retrieve a dense 3D reconstruction of the test-object. The raw output of the matching process (i.e. the dense point clouds) is then evaluated through the metrical characterization procedure described in Section 3.
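For illustration of the semi-global family, the sketch below runs OpenCV's StereoSGBM on a rectified image pair; this is an assumed stand-in, since the dense matcher actually used in the experiments is the one built into the SfM software.

import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)     # rectified pair (example files)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

block = 5
sgbm = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,           # search range, must be a multiple of 16
    blockSize=block,
    P1=8 * block * block,         # penalty for disparity changes of one pixel
    P2=32 * block * block,        # larger penalty for bigger disparity jumps
    uniquenessRatio=10,
)
disparity = sgbm.compute(left, right).astype(np.float32) / 16.0   # fixed-point output
# depth = focal_length_px * baseline / disparity   (valid where disparity > 0)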

3. PROJECT DESCRIPTION

3.1 Test-object

Experiments are performed using the Portable Metric Test Object designed and built at UCL CEGE20. The artefact (Figure 2) is conceived as an independent means of 3D imaging quality assessment. First results have already been presented for the quantitative assessment of several commercially available close-range optical recording technologies21. The test-object is mainly made of Alcoa aluminium alloy T6061, selected for its thermal stability. Around the base plate (25 cm x 25 cm) there is an irregular array of six individually calibrated spheres (20 mm diameter), each mounted on a conical base: this provides the reference coordinate system of the test-object. A secondary plate is rigidly wedged into the base plate and includes the following geometric features20:

- angle feature;
- step feature;
- length features;
- gap feature.

Figure 2. Side (left) and top (right) view of the Portable Metric Test-Object built by UCL CEGE.

The test-object was scanned with an Arius3D Foundation Model 150 laser scanner (mounted on a CMM), in order to produce a reference 3D dataset of the artefact for the accuracy evaluation. The sampling grid of the Arius3D scanner is 0.1 mm x 0.1 mm and it features a measurement uncertainty of ±0.035 mm in depth and ±0.1 mm in plane.

The test-object includes 87 planar targets of different dimensions, well distributed on the plates and on the sides of the features. Both circular coded and retro-reflective targets are present. The 3D coordinates of 44 targets situated around the base plate were measured by an ad-hoc photogrammetric network adjustment performed with VMS and a pre-calibrated digital camera. Final results from the bundle orientation yielded a mean standard deviation (STDV) in object space of 12 μm (1 sigma), with maximum errors of about 24 μm.

3.2 Image acquisition

A Nikon D600 digital camera (6016 x 4016 pixels, pixel size of 6 μm) equipped with a macro prime lens (Nikon AF Micro-Nikkor 60 mm f/2.8D) is used in the experiment. The test-object is imaged at a focusing distance of 0.9 m, after fixing the lens to ensure a constant focal length throughout the acquisition phase. The angle of the convergent images is set to about 10°, so that the resulting range uncertainty and lateral resolution (ground sample distance, GSD) are expected to be about 0.25 mm and 0.09 mm on average, respectively (assuming an a-priori image measurement accuracy of 1/2 pixel). F-stop and ISO sensitivity are kept fixed at f/16 and 100, and no automatic optical image stabilization is present.
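These expected values can be reproduced with standard first-order formulas; the sketch below uses the normal-case approximation for a convergent pair, under the stated assumptions (0.9 m distance, 60 mm focal length, 6 μm pixels, 10° convergence, half-pixel image precision).

import math

c = 60.0                  # focal length [mm]
Z = 900.0                 # object distance [mm]
px = 0.006                # pixel size [mm]
sigma_img = 0.5 * px      # a-priori image measurement accuracy of 1/2 pixel [mm]

gsd = px * Z / c                                   # lateral resolution on the object
B = 2 * Z * math.tan(math.radians(10.0) / 2)       # base equivalent to 10 deg convergence
sigma_Z = (Z / B) * (Z / c) * sigma_img            # depth precision, normal case

print(f"GSD     = {gsd:.2f} mm")      # 0.09 mm
print(f"sigma_Z = {sigma_Z:.2f} mm")  # ~0.26 mm, close to the quoted 0.25 mm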

Figure 3. Example of two images acquired from the same camera position: with (left) and without (right) the projected pattern.

Figure 4. The photogrammetric network geometry.

Since the test-object is poorly textured, two Optoma Pico projectors (PK301, resolution of 854 x 480 pixels) are used to project a fine texture with recognizable feature points onto the surface. A pebble-and-gravel image is projected from a distance of about 1 m, so that the object is illuminated with a sharp pattern. The stability of the projectors is carefully checked for the entire image acquisition phase, in order to avoid any apparent movement of the pattern on the test-object. Furthermore, for each camera station two images are taken, i.e. one while projecting the pattern and one without the pattern projection (Figure 3). In the latter case, a ring light around the camera lens is used to illuminate the retro-reflective targets present in the scene, in order to facilitate the automatic extraction of target centroids. All images are taken with the camera mounted on a tripod, following a quasi-circular protocol (due to the presence of the projectors), repeated at three different heights. Figure 4 shows the camera network actually realized. Rolled camera stations are additionally included in the network to ensure good conditions for the subsequent self-calibrating bundle adjustments. Two calibrated scale bars are also included in the scene as further metric references.

3.3 Image processing

Before starting the image correspondence extraction procedure, a pre-processing step is performed on the raw image files, including histogram equalization, white balance, highlight recovery, etc. The exposure of the photographs is carefully monitored to ensure that the targets are not over-exposed.

Afterwards, the images with the projected pattern are imported into a typical SfM software package, namely Agisoft Photoscan (PS)29. Its SIFT-like operator is exploited to automatically extract a large number of homologous points, which are then exported in the form of both image observations (2D points) and corresponding 3D coordinates. These data are then filtered and regularized, in order to preserve only the more reliable ones (i.e. the observations with higher multiplicity and lower re-projection error). The filtered correspondences are subsequently imported as image observations into VMS and PM, where a self-calibrating bundle adjustment is performed (hereinafter called “self-calibrating bundle adjustment with SIFT points”).

Concurrently, the images acquired without the projected pattern are processed directly with PM, where the centroids of the circular targets are automatically extracted. These image observations are included in a self-calibrating bundle adjustment performed with PM and VMS (hereinafter called “self-calibrating bundle adjustment with target centroids”). A further self-calibrating bundle adjustment is carried out in VMS and PM, using both SIFT points and circular target centroids as image observations (hereinafter called “self-calibrating bundle adjustment with SIFT and circular target centroids”).

After the precision evaluation (see Subsection 3.4), the interior and exterior orientations are imported into PS, where the dense image matching is performed. This is carried out using the second-level image pyramid, corresponding to a quarter of the original full image resolution, in order to achieve a reasonable trade-off between processing effort and resolution14. The derived dense point clouds thus feature a mean spatial resolution of less than 0.2 mm and consist of more than 4 million points.

3.4 Precision and accuracy evaluation

Precision defines the statistical noise of an adjustment, i.e. it models the inner accuracy of a system5. Typical SfM methods provide only limited information about the internal quality of the bundle adjustment process, usually restricted to the final re-projection error. On the other hand, VM systems yield a large number of statistical parameters supporting the adjusted network results. Typically, residuals of image coordinates and the corresponding statistics are used to evaluate the precision in image space, whereas STDV values of the computed 3D coordinates provide a quality assessment in object space (together with possible RMSE on check points). The statistics provided by VMS and PM are here exploited to check the precision of the different self-calibrating bundle adjustment processes. Additionally, starting from the orientation results achieved in the free-network approach, a forward intersection of the circular target centroids is performed. After a similarity transformation, these 3D coordinates are compared to the coordinates previously measured as described in Subsection 3.1. Residuals and the corresponding statistics are thus derived.
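The forward intersection itself can be written compactly as a linear (DLT-style) triangulation; a minimal sketch, assuming 3x4 projection matrices built from the adjusted interior and exterior orientations, is:

import numpy as np

def forward_intersection(proj_mats, image_pts):
    """Triangulate one object point from two or more oriented images.

    proj_mats: list of 3x4 projection matrices P = K [R | t]
    image_pts: list of (x, y) measurements of the same target centroid.
    """
    rows = []
    for P, (x, y) in zip(proj_mats, image_pts):
        rows.append(x * P[2] - P[0])          # two linear equations per image
        rows.append(y * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    X = Vt[-1]                                # null-space solution
    return X[:3] / X[3]                       # de-homogenise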

Accuracy models the deviation of a measured value from an independent, nominal measurement. This nominal measurement is defined by a measurement system of higher accuracy, a calibrated reference body or a true value5. For dense image matching algorithms, efforts are being made by the research community towards the evaluation of their metric performance13,14,30,31. Usually, the accuracy assessment is performed by comparing the achieved results to a ground truth, which should theoretically be two or three times more accurate than the expected results. Although this general approach may be seen as reasonable, better metrological traceability is given by the use of geometric artefacts with known form and size: in this case, the accuracy analysis can be carried out using evaluation parameters defined by national and international standards.

Figure 5. Geometric parameters and procedures adopted for the metrical characterization.

The test-object adopted in the experiment supports the assessment of different geometric parameters, derived from similar parameters defined in the German VDI/VDE guidelines. Geometric analyses and accuracy evaluations are performed on the raw output of the dense matching method (i.e. the dense point clouds) using GOM Inspect V832, a certified software package currently freely available. No prior filtering is carried out on the photogrammetric dense point clouds. As a first step, the procedure requires a sphere centroid extraction, in order to register the photogrammetry-derived 3D data into the reference coordinate system via a centroid-to-centroid alignment. Next, an automatic data extraction (internally developed at UCL CEGE) is applied to enable a repeatable and reproducible evaluation across systems, followed by a rigorous workflow procedure (Figure 5), to evaluate the following parameters:

- Sphere diameter error, computed by comparing the diameters of the spheres extracted from the point clouds to their reference values, i.e. the manufacturer's certified reference diameters (accuracy 0.001 mm). The best-fit spheres are calculated according to the least-squares method (Gaussian fit, 5 sigma) with unconstrained radius for a standard sphere artefact (a minimal fitting sketch is given after this list).

- Sphere spacing error, defined by comparing the 3D distances between sphere centroids to the reference values. The latter have been measured with digital calipers with an uncertainty of ±0.01 mm. Again, the best-fit spheres are calculated according to the least-squares method (Gaussian fit, 5 sigma) with unconstrained radius.

- Bi-directional plane spacing error, defined as the difference between the measured distance between two sideways-facing parallel planes with opposite surface normal directions and the reference distance between two parallel planes with the same normal. This parameter is computed using the two length gauges, which define the reference distances (uncertainty of ±0.01 mm).

- Uni-directional plane spacing error, defined as the capability of the system in measuring steps. The step feature, adopted for this assessment, is characterized by steps with nominal height differences between 0.01 mm and 20 mm. The evaluation procedure requires the definition of nominal planes with vertical direction, in order to compute the distances along the Z direction. These distances are then compared to the reference values, and a significance test (hypothesis test) is performed, taking into account the standard deviation values of both planes constituting each plane-to-plane distance.

- Structural resolution, defined as the lateral resolution of distance sensors. It characterises the smallest structure measurable with maximum permissible errors to be specified33. It is analysed through a structure normal, using the gap feature, which is constructed of eight individual blocks with the same height and seven pits with the same depth (8 mm). The slot widths are 0.1 mm, 0.2 mm, 0.3 mm, 0.5 mm, 1.0 mm, 2.0 mm and 3.0 mm. A pass/fail test is performed to determine the success of the gap recording, where a pass is defined by a measured maximum unsigned distance from the fitted plane larger than 63% of the reference pit depth of 8.00 mm (= 5.04 mm). A best-fit plane is extracted on the top of the gap feature and the colour-coded maps of deviations from this plane are finally evaluated. Other fitting variables of the best-fit plane are noted.

- Angular error, defined as the difference between the measured angle and the reference angle, both computed from the datum plane to an oriented plane, in degrees [°]. The angle feature, adopted for this evaluation, comprises a series of upward-facing planar surfaces that provide varying angles to the base, from 0° to 30°.
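As a pointer to how the sphere fits above can be computed, the sketch below uses the standard algebraic least-squares formulation with unconstrained radius; the 5 sigma Gaussian fit used in the evaluation would additionally iterate, discarding points whose residual exceeds five times the RMS and re-fitting.

import numpy as np

def fit_sphere(points):
    """Unconstrained least-squares sphere fit; points is an (n, 3) array.

    Uses the linear (algebraic) formulation: |p|^2 = 2 c.p + (r^2 - |c|^2).
    Returns (centre, radius).
    """
    A = np.hstack([2 * points, np.ones((len(points), 1))])
    b = (points ** 2).sum(axis=1)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    centre = w[:3]
    radius = float(np.sqrt(w[3] + centre @ centre))
    return centre, radius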

4. RESULTS

4.1 Precision of the adjustment process

Since VMS and PM delivered equivalent results in terms of precision in both image and object space, only one figure for each parameter is given here. Regarding the assessment of the internal precision, the statistics from the process performed with the filtered SIFT points (extracted from the images with projected pattern) are listed in the first column of Table 1. The self-calibrating bundle adjustment resulted in a mean precision vector length of 129 μm in object space, whereas the residuals of the image coordinates feature a mean re-projection error of 0.4 pixel and a maximum value of 1 pixel. The results of the processing with circular targets show better statistics, both in image and object space.

When target locations are triangulated starting from the SIFT-based orientations, their residuals with respect to the reference values average 62 μm, whereas the target-based orientations again yield better results, with a mean difference of 21 μm (as shown in Table 2, first and second columns).

Table 1. Statistical results of the internal assessment.

SELF-CALIBRATING BUNDLE ADJUSTMENT - INTERNAL ASSESSMENT

                                              Pattern            No Pattern
                                           SIFT    Target     SIFT    SIFT & Target
OBJECT SPACE
Precision vector length sXYZ [micron]
    Mean                                    129       8         96         97
    Stdv                                    102       6         80         81
    Max                                     588      36        503        516
IMAGE SPACE
Re-projection error [pixel]
    Mean                                    0.4     0.1        0.4        0.4
    Stdv                                    0.2     0.1        0.2        0.2
    Max                                     1.0     0.8        1.0        1.1

Table 2. Statistical results of the external assessment.

SELF-CALIBRATING BUNDLE ADJUSTMENT - EXTERNAL ASSESSMENT

                                              Pattern            No Pattern
                                           SIFT    Target     SIFT    SIFT & Target
RMSE length [micron]                         66      25         24         24
RMSE mean (magnitude) [micron]               62      21         21         20
Max difference [micron]                     104      52         46         48

In order to further analyse this different behaviour, SIFT points are extracted and filtered again, starting from the images without projected pattern. A new self-calibrating bundle adjustment is then carried out using the new set of image correspondences. In this case, the internal assessment provides results comparable to the ones derived from the images with projected pattern. Regarding the external evaluation, SIFT points from the images without pattern show a behaviour equivalent to the circular target-based process. No significant improvement is achieved by including both SIFT points and target centroids in the network adjustment (Tables 1 and 2, fourth column).

The tests presented here suggest the existence of some problems related to the SIFT points extracted from the images with projected pattern. The origins of this behaviour are still under study; it may be due to the type of pattern selected, or to instability effects induced by the two projectors, which may have caused small movements of the pattern during the image acquisition that are not noticeable to the naked eye.

4.2 Accuracy of the 3D reconstruction

Starting from the orientation results achieved by the self-calibrating bundle adjustment (SBA) performed with SIFT points and the ones delivered by the SBA with target centroids, the dense image matching procedure is carried out. Both image datasets (with and without the projected pattern) are separately processed, delivering four point clouds that will be referenced as follows:

- SBA-SIFT-Pattern (shown in graphs as diamond): derived from the self-calibrating bundle adjustment with SIFT points extracted from images with pattern, and dense image matching performed on images with pattern.

- SBA-SIFT-NoPattern (square): derived from the self-calibrating bundle adjustment with SIFT points extracted from images without pattern, and dense image matching performed on images without pattern.

- SBA-Target-Pattern (triangle): derived from the self-calibrating bundle adjustment with circular target centroids extracted from images without pattern, and dense image matching performed on images with pattern.

- SBA-Target-NoPattern (cross): derived from the self-calibrating bundle adjustment with circular target centroids extracted from images without pattern, and dense image matching performed on images without pattern.

The automatic procedure for accuracy assessment is finally applied, and its most notable results are summarized in Figures 6-10 and in Tables 3-5 below. Since the spheres were not reconstructed by the dense image matching performed on the images without pattern, neither the sphere diameter error nor the sphere spacing error was evaluated for these datasets (although demonstrated in a previous paper21).

To investigate the orientation recording performance, the angle error is measured and the deviation plotted (Figure 6). Among the tested methodologies, the point clouds SBA-SIFT-NoPattern and SBA-Target-NoPattern perform best, with the smallest deviations from the reference angles. All methods show a trend towards improved (i.e. lower deviation) values with increasing angles.


Figure 6. Orientation error through angle error analysis [degree].

Figure 7. Length error through bi-directional plane-spacing error analysis for two length gauges [mm].

To investigate the length measurement error, the bi-directional plane spacing error is measured (Figure 7). SBA-SIFT-Pattern performs best, with deviations of 0.04 mm and 0.06 mm for Length Bar 1 and 2, respectively. This good metric performance is also achieved by the other point clouds, with the exception of SBA-Target-Pattern, which delivers a maximum deviation of 0.73 mm.

Further length measurement tests are conducted through the uni-directional plane spacing error (Figure 8). SBA-SIFT-Pattern shows the most consistent performance, with a standard deviation of 0.015 mm and a maximum unsigned deviation of 0.036 mm. All datasets show a high deviation for the highest step of 20 mm.

(Figure 6 plots the deviation from each reference angle, from 0.5° to 30°; Figure 7 plots the deviation from the reference lengths of Length Bar 1 [74.94 mm] and Length Bar 2 [149.87 mm], for all four point clouds.)

When the significance test is conducted, Table 3 shows that the smallest significant step that can be recorded with a confidence of 95% is 0.1 mm for all systems except SBA-Target-Pattern.

Figure 8. Length error through uni-directional plane-spacing error analysis [mm]. Step 1 is the smallest and Step 17 the largest step.

Table 3. Smallest significant step evidenced through a significance test (only relevant results are shown).

Step height [mm]     SBA-SIFT-Pattern   SBA-SIFT-NoPattern   SBA-Target-Pattern   SBA-Target-NoPattern
0.05                      FAIL                FAIL                 FAIL                  FAIL
0.10                      PASS                PASS                 FAIL                  PASS
0.30                      PASS                PASS                 PASS                  PASS
0.50                      PASS                PASS                 PASS                  PASS
1.00                      PASS                PASS                 PASS                  PASS
Smallest step [mm]        0.1                 0.1                  0.3                   0.1

The structural resolution is evaluated through (i) a significance test for gap recording (Table 4) and (ii) visual inspection of a deviation map computed against a plane fitted across the top of the gap artefact (Figure 9). The results of the significance test are shown in comparison with the reference dataset (the Arius3D model), which passes the hypothesis test for all gaps. According to the numerical results, the gap recording is performed best by SBA-Target-Pattern, which records the pit of a gap down to 1.0 mm. However, visual inspection of the deviation maps shows that all point clouds feature clear gaps for Gap 1 and Gap 2. Furthermore, SBA-Target-NoPattern shows the clearest indication of gaps, with a visible indentation of ca. 0.2 mm on the deviation map.

Since the two analyses, i.e. significance test and visual inspection, do not provide consistent evidence of the system performances, their results are excluded from the final numerical summary of the accuracy evaluation.


Figure 9. Colour-coded deviation maps for gap recording [mm] with histograms of error distribution, for SBA-SIFT-Pattern, SBA-SIFT-NoPattern, SBA-Target-Pattern and SBA-Target-NoPattern. The scale ranges from -0.3 mm to +0.3 mm.

Table 4. Smallest significant gap evidenced through a significance test (only relevant results are shown).

Gap name           Gaps [mm]   Arius3D model   SBA-SIFT-Pattern   SBA-SIFT-NoPattern   SBA-Target-Pattern   SBA-Target-NoPattern
Gap 1 (largest)      3.00          PASS             PASS                PASS                 PASS                  PASS
Gap 2                2.00          PASS             FAIL                FAIL                 PASS                  PASS
Gap 3                1.00          PASS             FAIL                FAIL                 PASS                  FAIL
Gap 4                0.50          PASS             FAIL                FAIL                 FAIL                  FAIL
Gap 5                0.30          PASS             FAIL                FAIL                 FAIL                  FAIL
Gap 6                0.20          PASS             FAIL                FAIL                 FAIL                  FAIL
Gap 7 (smallest)     0.10          PASS             FAIL                FAIL                 FAIL                  FAIL
Smallest recorded
gap [mm]                           0.10             3.00                3.00                 1.00                  2.00


Figure 10. Summary evaluation for a comparison of all tested methods. The smaller the value, the lower the performance.

Table 5. Metric results of the evaluation for the geometric parameters of orientation, length error and structural resolution.

Geometric parameter / Method                           SBA-SIFT-   SBA-SIFT-   SBA-Target-   SBA-Target-
                                                       Pattern     NoPattern   Pattern       NoPattern
Length error: bi-directional plane spacing
error at 150 mm [mm]                                     0.04        0.07        0.26          0.07
Length error: uni-directional plane spacing error /
smallest measurable step height [mm]                     0.10        0.10        0.30          0.10
Orientation: angle error, maximum unsigned
angle deviation [degrees]                                0.68        0.38        0.53          0.42

Summary of results: each of the systems shows distinct strengths and weaknesses, although most differences delivered by the comparisons are not metrically significant, being less than the mean GSD. The results of the performance evaluation can be summarized either in metric values (Table 5), or through a graphic evaluation that normalizes the results and stacks the charts to reflect an overall score (Figure 10). The two point clouds extracted from the images without projected pattern, i.e. SBA-Target-NoPattern and SBA-SIFT-NoPattern, achieve the best results, whereas SBA-Target-Pattern delivers the worst overall outcome. SBA-SIFT-Pattern is characterised by an intermediate performance, although it shows the best length recording. These general outcomes agree with the results of the orientation assessment. The origins of these behaviours are still under study. It should be stressed again that the results of the structural resolution analyses have not been included in the summary, since they do not provide clear and consistent evidence of the system performances.

(In Figure 10, each metric is normalized against a tolerance before stacking: the bi-directional plane spacing error at 150 mm against a maximum tolerance of 0.3 mm, the smallest measurable step height against a minimum step height recording of 0.5 mm, and the maximum unsigned angle deviation against a maximum tolerance of 1 degree.)
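A minimal sketch of this normalization, assuming the simple linear mapping suggested by the tolerances above (zero error maps to 1, an error at the tolerance maps to 0):

def normalized_score(value, tolerance):
    """Map an error value onto [0, 1]: the smaller the error, the higher the score."""
    return max(0.0, 1.0 - value / tolerance)

# Example with the Table 5 values for SBA-SIFT-Pattern:
print(normalized_score(0.68, 1.0))    # angle error vs. 1 degree tolerance  -> 0.32
print(normalized_score(0.04, 0.3))    # plane spacing error vs. 0.3 mm      -> ~0.87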


5. CONCLUSIONS

The paper strives to provide a procedure for improving automated 3D reconstruction methods via vision metrology. A metrologically oriented use of automated techniques for dense 3D reconstruction requires a narrowing of the accuracy gap between the industrial sector and the more relaxed amateur or heritage use of 3D optical imaging. Measurement results should thus be supported by rigorous statistical parameters that provide information about the precision of the process. To address this issue, an enhanced photogrammetric procedure has been proposed and tested in this paper. It exploits the main functionalities of classical photogrammetric packages for the accurate measurement of single points, by integrating their algorithms and statistical analyses into the pipeline provided by fully automated 3D reconstruction software. The test-object adopted in the experiments represents a challenging artefact for dense matching applications. While the artefact is a highly structured 3D scene, due to the presence of a number of metric references (planar targets, scale bars and spheres), it is also small and poorly textured. This allowed assessment of the performance of the automatic image orientation and matching procedures within a typical industrial scenario, characterised by poor texture and known 3D/2D shapes. In order to provide a fine texture on the surfaces, a pattern was projected using non-metrological equipment (small projectors with low resolution, light tripods). This choice significantly affected the metric quality of the results, and its influence has been carefully analysed through the adopted augmented procedure, in terms of both orientation and matching outcomes.

The statistics delivered by the assessment of the orientation processes show that the filtered SIFT points can represent valuable image observations, yielding results comparable to the ones achieved with the circular targets. This is particularly true for SIFT points extracted without the projected pattern, whereas greater differences are highlighted for SIFT points extracted from the images with pattern. Further analyses are required to understand the origins of these results. In particular, future studies are planned to compare the triangulated target coordinates to a more accurate reference and to test different types of projected pattern.

A procedure for the metric evaluation of 3D reconstructed geometry has been demonstrated for length, orientation and structural resolution, including automated data segmentation to prepare the input. The evaluation is applicable to other 3D imaging methods and sensors. The quantitative evaluation results show that the 3D reconstructions derived from the self-calibrating bundle adjustments with circular target centroids or with SIFT points extracted from images without pattern, followed by dense image matching performed on images without pattern (i.e. SBA-Target-NoPattern and SBA-SIFT-NoPattern, respectively), yield the best results. However, most differences delivered by the accuracy comparisons are not metrically significant, as they are below the mean GSD (0.09 mm). By testing different types of pattern, future tests will analyse how the projected texture (and its resolution) affects the final dense point clouds and the corresponding outcomes of the metrological assessment.

ACKNOWLEDGEMENTS

We wish to thank the COSCH network (MPNS EU COST Action TD-1201), which enabled a Short-Term Scientific Mission of Isabella Toschi, FBK (Italy), at 3DIMPact (3D Imaging, Metrology and Photogrammetry applied coordinate technologies), a research group at UCL CEGE (UK).

REFERENCES

[1] Fraser, C.S. and Shortis, M.R., "Metric exploitation of still video imagery," The Photogrammetric Record 15(85), 107-122 (1995).
[2] Luhmann, T., Robson, S., Kyle, S. and Harley, I., [Close Range Photogrammetry: Principles, Methods and Applications], Whittles, Dunbeath, UK (2006).
[3] Atkinson, K.B., [Close Range Photogrammetry and Machine Vision], Whittles Publishing, Caithness, UK (2001).
[4] Fryer, J.G., Mitchell, H.L. and Chandler, J.H., [Applications of 3D Measurement from Images], Whittles Publishing, Caithness, UK (2007).
[5] Luhmann, T., "3D imaging: how to achieve highest accuracy," Videometrics, Range Imaging, and Applications XI, Proc. SPIE 8085 (2011).
[6] Godding, R., Luhmann, T. and Wendt, A., "4D surface matching for high-speed stereo sequences," Int. Arch. Photogramm. Remote Sens. & Spatial Inf. Sci. XXXVI(5) (2006).
[7] Xiao, Z., Liang, J., Yu, D., Tang, Z. and Asundi, A., "An accurate stereo vision system using cross-shaped target self-calibration method based on photogrammetry," Optics and Lasers in Engineering 48, 1252-1261 (2010).
[8] Fritsch, D., Khosravani, A.M., Cefalu, A. and Wenzel, K., "Multi-sensors and multiray reconstruction for digital preservation," Photogrammetric Week, 305-323 (2011).
[9] Remondino, F., Del Pizzo, S., Kersten, T.P. and Troisi, S., "Low-cost and open-source solutions for automated image orientation – A critical overview," Proc. EuroMed 2012 Conference, M. Ioannides et al. (Eds.), LNCS 7616, 40-54 (2012).
[10] Haala, N. and Rothermel, M., "Dense multi-stereo matching for high quality digital elevation models," Photogrammetrie, Fernerkundung, Geoinformation (PFG) 4, 331-343 (2012).
[11] Hartley, R. and Zisserman, A., [Multiple View Geometry], 2nd ed., Cambridge University Press, Cambridge, UK (2004).
[12] Wöhler, C., [3D Computer Vision: Efficient Methods and Applications], Springer, Berlin, Germany (2009).
[13] Ahmadabadian, A.H., Robson, S., Boehm, J., Shortis, M., Wenzel, K. and Fritsch, D., "A comparison of dense matching algorithms for scaled surface reconstruction using stereo camera rigs," ISPRS Journal of Photogrammetry and Remote Sensing 78, 157-167 (2013).
[14] Remondino, F., Spera, M.G., Nocerino, E., Menna, F. and Nex, F., "State of the art in high density image matching," The Photogrammetric Record 29(146), 144-166 (2014).
[15] Jepping, C., Bethmann, F. and Luhmann, T., "Congruence analysis of point clouds from unstable stereo image sequences," Int. Arch. Photogramm. Remote Sens. & Spatial Inf. Sci. XL-5, 301-306 (2014).
[16] Shortis, M. and Robson, S., "Vision Measurement System," <http://www.geomsoft.com/VMS/> (17 April 2015).
[17] Eos Systems Inc., "PhotoModeler," <http://www.photomodeler.com/index.html> (17 April 2015).
[18] Association of German Engineers, [VDI/VDE 2634 Part 2: Optical 3-D Measuring Systems – Optical Systems based on Area Scanning] (2002).
[19] Association of German Engineers, [VDI/VDE 2634 Part 3: Optical 3-D Measuring Systems – Multiple View Systems based on Area Scanning] (2008).
[20] Hess, M. and Robson, S., "3D imaging for museum artefacts: a portable test object for heritage and museum documentation of small objects," Int. Arch. Photogramm. Remote Sens. & Spatial Inf. Sci. 39(B5), 103-108 (2012).
[21] Hess, M., Robson, S. and Hosseininaveh Ahmadabadian, A., "A contest of sensors in close range 3D imaging: performance evaluation with a new metric test object," Int. Arch. Photogramm. Remote Sens. & Spatial Inf. Sci. XL-5, 277-284 (2014).
[22] Lowe, D.G., "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision 60(2), 91-110 (2004).
[23] Apollonio, F.I., Ballabeni, A., Gaiani, M. and Remondino, F., "Evaluation of feature-based methods for automated network orientation," Int. Arch. Photogramm. Remote Sens. & Spatial Inf. Sci. XL-5, 47-54 (2014).
[24] Nocerino, E., Menna, F., Remondino, F. and Saleri, R., "Accuracy and block deformation analysis in automatic UAV and terrestrial photogrammetry – Lesson learnt," ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences II(5/W1), 203-208 (2013).
[25] Nocerino, E., Menna, F. and Remondino, F., "Accuracy of typical photogrammetric networks in cultural heritage 3D modeling projects," Int. Arch. Photogramm. Remote Sens. & Spatial Inf. Sci. XL-5, 465-472 (2014).
[26] Brown, M.Z., Burschka, D. and Hager, G.D., "Advances in computational stereo," IEEE Transactions on Pattern Analysis and Machine Intelligence 25, 993-1008 (2003).
[27] Hirschmuller, H., "Accurate and efficient stereo processing by semi-global matching and mutual information," IEEE Computer Vision and Pattern Recognition 2, 807-814 (2005).
[28] Bethmann, F. and Luhmann, T., "Semi-global matching in object space," Int. Arch. Photogramm. Remote Sens. & Spatial Inf. Sci. XL-3/W2 (2015).
[29] "Agisoft Photoscan," <http://www.agisoft.com/> (17 April 2015).
[30] Haala, N., "The landscape of dense image matching algorithms," Proc. Photogrammetric Week 2013, D. Fritsch (Ed.), Stuttgart, 271-284 (2013).
[31] Toschi, I., Beraldin, J.-A., Cournoyer, L., De Luca, L. and Capra, A., "Evaluating dense 3D surface reconstruction techniques using a metrological approach," NCSLI Measure Journal 10(1), 52-62 (2015).
[32] GOM mbH, "GOM Inspect," <http://www.gom.com/3d-software/gom-inspect.html> (17 April 2015).
[33] Association of German Engineers, [VDI/VDE 2617-1:2007 – Accuracy of coordinate measuring machines with optical probing] (2007).