Recognising building patterns using matched filters and genetic search

ISPRS Journal of Photogrammetry & Remote Sensing 53 (1998) 95-107

Andrew D.J. Cross *, Edwin R. Hancock

Department of Computer Science, University of York, York, YO1 5DD, UK

Accepted 15 April 1997

Abstract

This paper is concerned with recognising buildings in aerial images. We abstract the images in terms of relational graphs. Specifically, we use Delaunay triangulations to represent the arrangement of located buildings. Localisation is realised using matched filters. The filters are trained by drawing upon the duality between convolution in the image domain and multiplication in the Fourier domain. The matched filters prove to be remarkably stable. We match Delaunay graphs representing image pairs using genetic search with a Bayesian relational consistency criterion as fitness function. The use of genetic search allows us to perform the optimisation without the traditional problems of sensitivity to initial conditions and convergence to local optima. © 1998 Elsevier Science B.V. All rights reserved.

Keywords: genetic search; building recognition; matched filters; Bayesian consistency; aerial stereograms

1. Introduction

Feature-matching is a process of generic importance in the interpretation of aerial and satellite images. It has found widespread application in both remote sensing and airborne vehicle guidance. The process finds application not only in the registration of image and cartographic information, but also in mapping tasks such as stereopsis and sensor fusion. In practice it is realised either by template-based methods, posed in terms of pixels (Caves et al., 1992), curves (Wu and Maitre, 1986) or line structures (Roux and McKeown, 1994; Lin et al., 1994), or by matching higher-level relational descriptions (Shapiro and Haralick, 1985; Wilson et al., 1995; Wilson and Hancock, 1995, 1997). While template-based methods aim to recover an explicit co-ordinate transformation between image pairs, relational matching locates the direct correspondence between individual image tokens. This latter technique has proved popular in the computer-vision literature (Sanfeliu and Fu, 1983; Shapiro and Haralick, 1985; Wilson and Hancock, 1995, 1997). However, it has not been widely adopted by the remote-sensing community.

* Corresponding author.

Since it is intrinsically symbolic in character, relational matching offers the advantage of being insensitive to distortions caused by complex image transformations. Unfortunately, it is its symbolic nature that has led to the perception that relational matching is too fragile to be of practical utility in all but the least cluttered imagery. The main obstacle is the stable segmentation of features from raw image data. These features are invariably either line tokens or image regions. Successful unsupervised image segmentation is critically dependent upon the availability of an adaptive parameter selection algorithm. A secondary obstacle is the need to establish stable relational descriptions of raw segmental entities. As recently demonstrated by Tuceryan and Chorzempa, neighbourhood graph structures are notoriously sensitive to the addition of noise and clutter (Tuceryan and Chorzempa, 1991). Even if these two difficulties associated with image segmentation can be overcome, there remains the requirement for a robust means of matching potentially corrupt relational descriptions. Here the main difficulties are computational. It is well known that even locating sub-graph isomorphisms is an NP-complete problem.

0924-2716/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0924-2716(97)00037-3

Recently we have focused on developing robust methods for relational graph-matching. Our methodology differs from that reported elsewhere in the literature in two important ways. Firstly, we have developed a Bayesian framework for modelling the effect of both matching errors and structural errors (Wilson and Hancock, 1996). Specifically, we have shown how graph-matching can be realised as an error-correcting edit process (Wilson and Hancock, 1995). Although this work is important in extending the methodology for comparing corrupt relational graphs, it does not address the issue of efficient search. In addressing this issue, our second contribution has been to draw on genetic search as a global optimisation process (Cross et al., 1997). A recent comparative study has provided empirical evidence that the genetic search procedure can locate matches in polynomial time (Cross et al., 1997).

This paper builds on this new methodology to develop a practical tool for aerial image-matching. The features of our registration process are as follows. We avoid the use of parameter-intensive edge- or region-segmentation methods by constructing matched filters for building localisation. Filter-response maxima are extracted using an automatic thresholding technique (Illingworth and Kittler, 1986). Difficulties in establishing stable relational descriptions are overcome by constructing Delaunay triangulations centred on the filter-response maxima. We select the Delaunay graph because of its well-documented robustness to noise (Tuceryan and Chorzempa, 1991) and deformation (Faugeras et al., 1990). Finally, we illustrate how structural matching can be realised using genetic search (Cross et al., 1997).

2. Building localisation

We are interested in matching aerial images of semi-urban areas. A typical image pair from the Graz area is shown in Fig. 1. Image registration is posed in terms of matching arrangements of buildings detected in the image pairs. The imagery studied here is quite complex. In addition to building structures, the scenes contain highways and forested areas. This section describes the image-processing operations required so that we can abstract the registration process in terms of relational graph-matching.

Buildings are localised by searching for matched filter-response maxima. The matched filters are constructed using a set of training examples by exploiting the duality of convolution in the spatial domain and multiplication in the Fourier domain. This filter-based localisation process contrasts with the structurally based analysis of buildings reported elsewhere in the literature (Roux and McKeown, 1994; Lin et al., 1994). One of the bottlenecks in the application of the structural method is the need to hand-craft edge-segment models. It is the quality of the located edge segments that limits the effectiveness of this matching strategy. By contrast, the matched-filter approach offers the advantage that it can learn large-scale image structure.

2.1. Filter design

The building-localisation process commences from a set of hand-segmented training images. Here the desired regions of high-convolution response are hand-labelled by manually assigning a binary representation to the training images. Target building structures are marked with the high bit, while the remaining background regions are marked with the low bit. These raw binary training data are unsuitable for Fourier-domain analysis. Sharp region boundaries require infinite-frequency bandwidth for accurate representation. We overcome this problem by blurring the binary training data with a Gaussian smoothing kernel. In this low-pass filtering stage the blurring scale is set to be of the same order of magnitude as the size of the buildings that we aim to localise.
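The blurring step can be sketched in a few lines. The following is a minimal illustration rather than the implementation used in the paper: the function name `gaussian_blur_fft`, the 64 x 64 mask, the 8 x 8 building and the scale `sigma=4.0` are all assumptions chosen for the example, and the blur is applied in the Fourier domain so that the image boundary is treated as periodic.

```python
import numpy as np

def gaussian_blur_fft(mask, sigma):
    """Blur a 2-D array with an isotropic Gaussian of scale sigma (in pixels).

    The blur is applied as a multiplication in the Fourier domain, so the
    image boundary is treated as periodic.
    """
    rows, cols = mask.shape
    fy = np.fft.fftfreq(rows)[:, None]  # vertical frequency, cycles per pixel
    fx = np.fft.fftfreq(cols)[None, :]  # horizontal frequency, cycles per pixel
    # Fourier transform of a unit-area Gaussian with standard deviation sigma.
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(mask) * transfer))

# Hypothetical 64 x 64 hand segmentation containing a single 8 x 8 building.
mask = np.zeros((64, 64))
mask[28:36, 28:36] = 1.0
blurred = gaussian_blur_fft(mask, sigma=4.0)
```

Because the transfer function equals one at zero frequency, the blur preserves the total mass of the mask while removing the sharp edges that would otherwise demand unbounded bandwidth.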

Fig. 1. An image pair for a semi-urban area close to Graz in Austria.

To be more formal, let $I_n$ represent a raw training image and let $O_n$ represent the blurred hand segmentation. Our goal is to locate the convolution filter $C_n$ that best matches the input and output images from the training set. In other words we seek the filter for which:

$$O_n = I_n * C_n \tag{1}$$

where $*$ denotes the convolution operation. Taking the Fourier transform of both sides of this equation and exploiting the duality theorem:

$$F(O_n) = F(I_n) \times F(C_n) \tag{2}$$

The convolution filter $C_n$ may be obtained from the inverse Fourier transform in the following way:

$$C_n = F^{-1}\left[\frac{F(O_n)}{F(I_n)}\right] \tag{3}$$

Individual images in the training set yield different convolution filters. To combine the results over the $N$ examples in the training set we average the resulting convolution filters, i.e.:

$$C = \frac{1}{N}\sum_{n=0}^{N-1} C_n \tag{4}$$

A serious potential problem with this Fourier inversion process is the existence of unsampled frequencies in the raw training data. In consequence there are frequency components for which the sampled Fourier transform of the image, $F(I_n)$, is zero. The corresponding filter coefficients are undefined. One way of circumventing this problem is to set the undefined filter coefficients to zero. This procedure can induce aliasing artifacts. Because we have access to a large training set of images, we adopt a different strategy in which it is the sample average that is assigned rather than zero.
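The filter design procedure can be sketched as follows. This is an illustrative reading of the text, not the authors' code: each per-image filter is taken to be the inverse transform of the ratio $F(O_n)/F(I_n)$, and a coefficient that is undefined for one image is filled with the average over the images in which it is defined. The names `design_filter` and `apply_filter`, the tolerance `eps` and the synthetic data are assumptions. Averaging is carried out in the Fourier domain, which by linearity is equivalent to averaging the spatial filters.

```python
import numpy as np

def design_filter(images, targets, eps=1e-8):
    """Average the per-image filters F^-1[F(O_n) / F(I_n)].

    Frequencies at which |F(I_n)| <= eps are undefined for image n and are
    filled with the average over the images in which they are defined.
    """
    total = np.zeros(images[0].shape, dtype=complex)
    count = np.zeros(images[0].shape)
    for image, target in zip(images, targets):
        f_in = np.fft.fft2(image)
        f_out = np.fft.fft2(target)
        defined = np.abs(f_in) > eps
        total[defined] += f_out[defined] / f_in[defined]
        count[defined] += 1
    # Frequencies that are undefined in every image are left at zero.
    mean_spectrum = np.where(count > 0, total / np.maximum(count, 1), 0.0)
    return np.real(np.fft.ifft2(mean_spectrum))

def apply_filter(image, kernel):
    """Circular convolution of image with kernel, computed via the FFT."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kernel)))

# Synthetic check: targets generated with a known kernel should return it.
rng = np.random.default_rng(0)
true_kernel = np.zeros((16, 16))
true_kernel[0, 0], true_kernel[0, 1], true_kernel[1, 0] = 0.5, 0.25, 0.25
images = [rng.random((16, 16)) for _ in range(3)]
targets = [apply_filter(image, true_kernel) for image in images]
recovered = design_filter(images, targets)
```

In practice the targets would be the Gaussian-blurred hand segmentations rather than exact convolutions, so the recovered filter is only a best compromise across the training set.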

We illustrate the application of this matched-filter design process in Fig. 2, which shows an example image from our training set. Here the left-hand image is the raw image data, while the right-hand image represents the Gaussian-blurred hand segmentation. It is worth noting that the training images contain buildings of a variety of sizes, shapes and orientations. Moreover, the background is quite complex, containing both highways and forested areas. Fig. 3 illustrates the effect of re-applying the matched filter to the training image of Fig. 2. In Fig. 4 we show the result of applying the matched filter to an image that did not constitute part of the training set. In both examples, the buildings are well localised.

The convolution filters used in these examples have been computed using training sets of approximately ten images.

Fig. 2. (left) Training image. (right) Convolution response for the training data.

Fig. 3. Re-applying the matched filter to a training image.

Fig. 4. Applying the matched filter to an image outside the training set. Original image (left) and convolution response (right).

2.2. Representing localised features with relational graphs

Our next task is to establish a relational structure for graph-matching. The literature offers many alternative neighbourhood graph structures. Examples include nearest-neighbour graphs, Delaunay graphs, Gabriel graphs, relative neighbourhood graphs and minimum spanning trees. In a recent comparative sensitivity study, Tuceryan and Chorzempa (1991) have demonstrated experimentally that it is the Delaunay graph that offers maximum robustness to added noise. From the standpoint of computational geometry, it is also known that the Delaunay graph is relatively insensitive to affine deformations of the image plane (Faugeras et al., 1990).

Based on these observations we opt to use the Delaunay triangulation of the detected buildings as a relational description in the matching process. The graphs are computed in a two-stage process. Firstly, we automatically threshold the convolution responses to produce a binary segmentation of the detected buildings. Here we use the adaptive algorithm of Illingworth and Kittler (1986) which uses a variance estimate to set the global threshold. Secondly, we locate the centroids of the connected binary image components. These centroid locations are used to seed a Voronoi polygonalisation of the image plane. The Delaunay triangulation is the region-adjacency graph for the Voronoi polygons.
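The two-stage graph construction can be sketched as below, under several stated assumptions. The threshold is supplied by hand and the Illingworth and Kittler selection rule is not reproduced; components are taken to be 4-connected; and the Delaunay edges are found by the brute-force empty-circumcircle test, which is the dual characterisation of Voronoi region adjacency, rather than by constructing the Voronoi polygons explicitly. All function names and the toy data are hypothetical.

```python
from itertools import combinations

def component_centroids(binary):
    """Centroids (row, col) of the 4-connected components of a binary grid."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:  # flood fill of one connected component
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            n = len(pixels)
            centroids.append((sum(y for y, _ in pixels) / n,
                              sum(x for _, x in pixels) / n))
    return centroids

def orientation(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])

def in_circumcircle(a, b, c, d):
    """True if d lies strictly inside the circumcircle of triangle abc."""
    (ax, ay), (bx, by), (cx, cy) = [(p[0] - d[0], p[1] - d[1]) for p in (a, b, c)]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det * orientation(a, b, c) > 0

def delaunay_edges(points):
    """Delaunay edges by the empty-circumcircle test; O(n^4), small inputs only."""
    edges = set()
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        if orientation(a, b, c) == 0:
            continue  # collinear triple has no circumcircle
        others = (points[m] for m in range(len(points)) if m not in (i, j, k))
        if not any(in_circumcircle(a, b, c, d) for d in others):
            edges.update({(i, j), (i, k), (j, k)})
    return edges

# Hypothetical filter response with two bright blobs and an assumed threshold.
response = [[9, 9, 0, 0, 0],
            [9, 9, 0, 0, 0],
            [0, 0, 0, 0, 0],
            [0, 0, 0, 8, 8]]
threshold = 5
binary = [[value > threshold for value in row] for row in response]
seeds = component_centroids(binary)
edges = delaunay_edges([(0, 0), (4, 0), (0, 3), (5, 4)])
```

For realistic numbers of buildings a library triangulation routine would be preferable to the quartic-time test used here, which is chosen only because it makes the defining property explicit.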

In Figs. 5 and 6 we illustrate the process of triangulating the building locations. In each case the left-hand image is the image obtained by thresholding the matched-filter convolution response. The Delaunay triangulations of the region centroids are superimposed on the right-hand images. It is interesting to note that despite their varying size and orientation, all of the buildings are detected. There are no false alarms from structures such as roads or other rectangular paved areas. Moreover, the nodes of the Delaunay graphs are well centred on the appropriate targets. In fact, the building localisation error is roughly five pixels. Since the centroid position error is much smaller than the size of the typical Voronoi zones, there is little structural corruption of the associated Delaunay graph.

The building density in the highway images is relatively sparse. Before we proceed to detail our matching process, we provide some results on more complex imagery. Specifically, the complications arise from the higher building density, the more complex rooftop structure of the buildings and the presence of pronounced shadowing effects. The sequence in Fig. 7 shows the localisation process in the proximity of a cluster of buildings. The left-most image of the sequence is the convolution response. The centre image is a relief plot of the convolution response. Notice that although the buildings almost abut one another, the response maxima are well separated and do not merge. Finally, the crosses in the right-most image are the computed building locations. To underline the stability of the building-localisation process in this denser imagery, Fig. 8 shows the buildings detected in a pair of aerial stereograms. The main point to note is the proximity of the corresponding crosses in the two images.

Fig. 5. Thresholded convolution response (left) and marked buildings (right).

Fig. 6. Thresholded convolution response (left) and marked buildings (right).

Fig. 7. An enlarged section showing the filter response, the 3D intensity profile and the detected houses.

Fig. 8. An enlarged section demonstrating the stability of the detection process.

3. Matching with genetic search

Inexact graph-matching has proved to be one of the most challenging problems in pattern recognition. Viewed from the perspective of computational complexity, the problem of searching for sub-graph isomorphisms is known to be NP-complete. However, there are many algorithms that are capable of locating useful relational matches in polynomial time (Messmer and Bunke, 1995). There are two ways in which practical relational matching algorithms have addressed the complexity problem. The classical approach is to employ a heuristic sequential search procedure to prune the search space. A concrete example is provided by forward-checking backtracking (Shapiro and Haralick, 1985). The alternative is to pose relational matching as a parallel iterative optimisation process (Wilson and Hancock, 1995, 1996, 1997; Wilson et al., 1995). Here the goal is to locate the best approximate solution achievable with the available computational resources. From the implementational standpoint, the main requirements are a means of measuring the consistency of match together with a suitable optimisation engine. In order to avoid problems associated with convergence to a local optimum, the optimisation engine should ideally be capable of delivering a solution of global relational consistency. Algorithms meeting this requirement include simulated annealing (Kirkpatrick et al., 1983; Geman and Geman, 1984; Aarts and Korst, 1989), mean-field annealing (Peterson and Soderberg, 1989), and most recently genetic search (Fogel, 1994; Qi and Palmieri, 1994). The main drawbacks of the two annealing algorithms stem from their slow convergence rates. It is for this reason that local optimisation techniques such as probabilistic relaxation (Wilson and Hancock, 1996) have been adopted due to their computational expediency.

In our quest for an effective global optimisation technique we have explored several deterministic optimisation strategies (Finch et al., 1996, 1997a,b; Wilson and Hancock, 1996). Although we were initially deterred by its stochastic nature, we have recently found genetic search to offer a surprisingly efficient means of relational matching. In fact we have shown empirically that convergence can be achieved in of order $N^{2.3}$ computations, where $N$ is the number of graph nodes (Cross et al., 1997). This observation may at first sight seem counter-intuitive since the canonical form of the algorithm involving randomised mutation, crossover and selection operations is slow to converge (Qi and Palmieri, 1994). However, the contribution in our recent paper (Cross et al., 1997) has been to show how convergence can be accelerated for the relational graph-matching problem. There are three critical ideas underpinning this new genetic search process. The first of these is to use a Bayesian fitness measure. The second idea has been to enhance the crossover process by allowing consistent sub-graphs to be exchanged across the genetic population of competing alternative solutions. According to canonical genetic search, crossover is normally effected either by choosing a crossover point at random and exchanging sub-strings, or by randomly selecting individual bits for crossover (Cedeno et al., 1995; Davis, 1991; Eshelman, 1991; Louis and Rawlins, 1993). The final idea is to use a graph-edit process to remove clutter from the graphs. The net advantage of introducing these algorithm enhancements is to overcome the slow diffusion of consistency that results if standard mutation and crossover operations are employed.

3.1. Bayesian fitness measure

The first step in the development of our genetic search procedure is to formally review the Bayesian consistency measure recently reported by Wilson et al. (1995). Our aim in this paper is to exploit this consistency measure as a fitness function for performing relational matching by genetic search. Hitherto, the optimisation of the consistency measure has been confined to the use of deterministic search methods. Genetic search not only provides a global optimisation strategy, but as we shall demonstrate in this paper, its architecture is also naturally suited to the structure of the consistency measure.

Central to this paper is the aim of matching relational graphs represented in terms of configurations of symbolic labels. We represent such a graph by $G = (V, E)$, where $V$ is the symbolic label set assigned to the set of nodes, and $E \subseteq V \times V$ is the set of edges between the nodes. In our genetic realisation of the matching process, we maintain a population of matches between a set of edited versions of a graph representing the data and a single model graph. These editing operations involve deleting nodes from the original data and recomputing the edge set. Suppose that $\alpha$ is an index which runs over the set of solutions in the population. Formally, we represent the matching of the nodes in the data graph indexed $\alpha$ in the population, i.e. $G_\alpha = (V_\alpha, E_\alpha)$, against those in the model graph $G_m = (V_m, E_m)$ by the function $f_\alpha : V_\alpha \rightarrow V_m$. In other words, the current state of match is denoted by the set of Cartesian pairs constituting the function:

$$f_\alpha = \{(i, f_\alpha(i)), \forall i \in V_\alpha\} \subseteq V_\alpha \times V_m$$

In order to describe local interactions between the nodes at a manageable level, we will represent the graphs in terms of their clique structure. The clique associated with the node indexed $j$ consists of those nodes that are connected to it by an edge of the graph, i.e.:

$$C_j = \{i \in V_\alpha \mid (i, j) \in E_\alpha\}$$

The labelling or mapping of this clique onto the nodes of the graph $G_m$ is denoted by:

$$\Gamma_j = \{f_\alpha(i) \in V_m, \forall i \in C_j\}$$
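The clique and its labelling map naturally onto dictionaries. The sketch below is illustrative only: it takes the clique of node $j$ to be the set of nodes joined to $j$ by an edge, exactly as in the definition above, and whether the centre node itself should also be included is left as an assumption. The function names, the four-node data graph and the model labels "a" to "d" are hypothetical.

```python
def cliques(nodes, edges):
    """Map each node j to C_j, the set of nodes joined to j by an edge."""
    result = {j: set() for j in nodes}
    for i, j in edges:
        result[i].add(j)
        result[j].add(i)
    return result

def clique_labelling(clique, match):
    """Gamma_j as a dict from each node i in C_j to its current match f(i)."""
    return {i: match[i] for i in clique}

# Hypothetical data graph and current state of match.
data_nodes = [0, 1, 2, 3]
data_edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
match = {0: "a", 1: "b", 2: "c", 3: "d"}  # model-node labels are assumptions

C = cliques(data_nodes, data_edges)
gamma_1 = clique_labelling(C[1], match)
```

Keeping the labelling keyed by data node, rather than as a bare set of labels, makes it straightforward to compare against dictionary items node by node.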

Suppose that we have access to a set of patterns that represent feasible relational mappings between the cliques of graph $G_\alpha$ and those of graph $G_m$. Typically, these relational mappings would be configurations of consistent clique labelling which we want to recover from an initial inconsistent state of the matched graph $G_\alpha$. Assume that there are $Z_j$ relational mappings for the clique $C_j$ which we denote by:

$$\Lambda^\mu = \{\lambda_i^\mu \in V_m, \forall i \in C_j\}$$

where $\mu \in \{1, 2, \ldots, Z_j\}$ is a pattern index. According to this notation $\lambda_i^\mu \in V_m$ is the match onto graph $G_m$ assigned to the node $i \in V_\alpha$ of graph $G_\alpha$ by the $\mu$-th relational mapping. The complete set of legal relational mappings for the clique $C_j$ is stored in a dictionary which we denote by:

$$\Theta_j = \{\Lambda^\mu \mid \mu = 1, \ldots, Z_j\}$$

The development of the consistency measure proceeds by assuming that matching errors occur with a uniform probability $p$. Under this assumption the probability of the configuration of matches $P(\Gamma_j)$ acquires an exponential character. Specifically,

$$P(\Gamma_j) = \frac{\beta_j}{Z_j}\sum_{\mu=1}^{Z_j}\exp[-kH_\mu] \tag{5}$$

Here $\beta_j = (1 - p)^{|C_j|}$ and $k = \ln[(1 - p)/p]$ are constants that depend on the label-error probability $p$. The physical quantity appearing in the probability distribution is the Hamming distance $H_\mu$ between the matched configuration $\Gamma_j$ and the dictionary item $\Lambda^\mu$. Formally, this quantity is computed in the following manner:

$$H_\mu = \sum_{i \in C_j}\left(1 - \delta_{f_\alpha(i),\lambda_i^\mu}\right) \tag{6}$$

In other words, the Kronecker delta function $\delta_{f_\alpha(i),\lambda_i^\mu}$ is used to count the label conflicts between the current configuration of matches and the matches demanded by the dictionary item indexed $\mu$.
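A small numerical sketch may help to fix ideas. It assumes that Eq. 5 has the form $P(\Gamma_j) = (\beta_j/Z_j)\sum_\mu \exp[-kH_\mu]$ with $\beta_j = (1-p)^{|C_j|}$ and $k = \ln[(1-p)/p]$, so that each term of the sum equals $(1-p)^{|C_j|-H_\mu}\,p^{H_\mu}$. The function names, the three-node clique and the two-item dictionary are hypothetical.

```python
import math

def hamming(gamma, pattern):
    """H_mu: number of clique nodes whose match differs from the dictionary item."""
    return sum(1 for i in gamma if gamma[i] != pattern[i])

def clique_probability(gamma, dictionary, p):
    """P(Gamma_j) under the assumed form of Eq. 5, for 0 < p < 0.5."""
    beta = (1.0 - p) ** len(gamma)
    k = math.log((1.0 - p) / p)
    total = sum(math.exp(-k * hamming(gamma, item)) for item in dictionary)
    return beta * total / len(dictionary)

# Hypothetical clique labelling and a dictionary of two feasible mappings.
gamma = {0: "a", 2: "c", 3: "d"}
dictionary = [{0: "a", 2: "c", 3: "d"},   # agrees everywhere, H = 0
              {0: "a", 2: "d", 3: "c"}]   # disagrees on two nodes, H = 2
prob = clique_probability(gamma, dictionary, p=0.1)
```

With $p = 0.1$ the first dictionary item contributes $0.9^3$ and the second $0.9 \times 0.1^2$, so the probability is dominated by the item at zero Hamming distance.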

The configurational probability $P(\Gamma_j)$ is the basic ingredient of our genetic search procedure. It represents the probability of a particular matching configuration evaluated over the state space of feasible possibilities (i.e. the dictionary). We use as our global measure of consistency the average of the clique configurational probabilities, i.e.:

$$P_G^{\alpha} = \frac{1}{|V_\alpha|}\sum_{j \in V_\alpha} P(\Gamma_j) \tag{7}$$

In the next sub-section of this paper we will describe how this average consistency measure can be utilised as a fitness measure in the genetic search for relational matches.

3.2. Crossover

Crossover mixes the pool of solutions to produce new ones. Standard crossover procedures involve selecting at random pairs of global matching configurations from the current population. Random matches at corresponding sites in the match are then interchanged with uniform probability $\frac{1}{2}$; we term this probabilistic crossover. However, this crossover mechanism will not necessarily facilitate the merging of locally consistent sub-graphs. Moreover, the process also ignores the underlying structure of the graphs. A better strategy is to combine the solutions by physically dividing the graphs into two disjoint sub-graphs. In this way internally consistent portions of the individual solutions may be exchanged at the structural level.

In our experimental evaluation of the genetic search procedure we will confine our attention to Delaunay graphs. Here the nodes of the graphs are points on the image plane. Each node is used to seed a Voronoi cell. Edges in the Delaunay graph indicate region adjacency of the Voronoi polygons. In this case, the sub-graph crossover process is easily implemented by dividing the original image plane with a random line. This has the effect of partitioning the set of data graph nodes into two sub-sets $V_\alpha^{(a)}$ and $V_\alpha^{(b)}$, where $V_\alpha = V_\alpha^{(a)} \cup V_\alpha^{(b)}$. New solutions may be constructed by interchanging the matches of the two partitions between pairs of matched graphs in the genetic population. Suppose that the two solutions undergoing crossover are denoted by the matches:

$$f_\alpha = \{(i, f_\alpha(i)), \forall i \in V_\alpha\}$$

and

$$f_\beta = \{(i, f_\beta(i)), \forall i \in V_\beta\}$$

The new solutions produced by the crossover process are:

$$f_\gamma = \{(i, f_\alpha(i)), \forall i \in V_\alpha^{(a)}\} \cup \{(i, f_\beta(i)), \forall i \in V_\beta^{(b)}\}$$

and

$$f_\delta = \{(i, f_\beta(i)), \forall i \in V_\beta^{(a)}\} \cup \{(i, f_\alpha(i)), \forall i \in V_\alpha^{(b)}\}$$


Fig. 9. Subgraph crossover.

If the two solutions are defined over the same sets of nodes, i.e. $V_\alpha \equiv V_\beta$, then the partitioning of the data graph by randomly dividing the image plane has no effect on the triangulation, i.e. the edge set of the graph. Later, we will describe our strategy for controlling extraneous clutter in the matching process. This involves a graph-editing process which centres around removing nodes from the point set and recomputing the Delaunay graph. This means that different solutions in the genetic population may in principle be defined over dissimilar sets of data graph nodes. When this is the case, the crossover process not only involves exchanging matches, it must also include re-triangulation. For Delaunay graphs this is a straightforward process. The Voronoi seeds falling into the two partitions of the image plane are exchanged in the usual way and two new Delaunay triangulations are computed. This process is illustrated in Fig. 9.
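The simplest case of sub-graph crossover, in which both parents are defined over the same node set so that no re-triangulation is needed, can be sketched as follows. The way the random line is drawn (through a random point of the bounding box, at a uniformly random angle) is an assumption, as are the function names and the toy matches; the paper does not specify the sampling scheme.

```python
import math
import random

def split_by_random_line(positions, rng):
    """Partition node ids according to the side of a random line they fall on."""
    xs = [x for x, _ in positions.values()]
    ys = [y for _, y in positions.values()]
    # Assumption: the line passes through a random point of the bounding box.
    x0 = rng.uniform(min(xs), max(xs))
    y0 = rng.uniform(min(ys), max(ys))
    theta = rng.uniform(0.0, math.pi)
    normal = (-math.sin(theta), math.cos(theta))  # unit normal of the line
    side_a = {i for i, (x, y) in positions.items()
              if (x - x0) * normal[0] + (y - y0) * normal[1] >= 0.0}
    side_b = set(positions) - side_a
    return side_a, side_b

def subgraph_crossover(f_alpha, f_beta, side_a, side_b):
    """Exchange the matches of the two partitions between two parent solutions."""
    f_gamma = {**{i: f_alpha[i] for i in side_a}, **{i: f_beta[i] for i in side_b}}
    f_delta = {**{i: f_beta[i] for i in side_a}, **{i: f_alpha[i] for i in side_b}}
    return f_gamma, f_delta

rng = random.Random(0)
positions = {0: (0.0, 0.0), 1: (4.0, 0.0), 2: (0.0, 3.0), 3: (5.0, 4.0)}
f_alpha = {0: "a", 1: "b", 2: "c", 3: "d"}  # hypothetical parent matches
f_beta = {0: "b", 1: "a", 2: "d", 3: "c"}
side_a, side_b = split_by_random_line(positions, rng)
f_gamma, f_delta = subgraph_crossover(f_alpha, f_beta, side_a, side_b)
```

Because the partition is spatial, each offspring inherits a contiguous region of matches from each parent, which is what allows locally consistent sub-graphs to survive recombination.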

3.3. Unmatchable nodes

One of the critical ingredients in effective relational matching is the way in which unmatchable entities or clutter are accommodated. Conventionally, there are two principal ways in which the effect of clutter can be neutralised. The first of these is to retain clutter nodes as an integral part of the graphs, but to explicitly label them as null-matched (Finch et al., 1996). The alternative is to follow a graph-edit philosophy and to remove the clutter nodes, recomputing the edge set of the graph if necessary (Sanfeliu and Fu, 1983). The main advantage of graph-edit operations is that, if effectively controlled, they can overcome relational fragmentation due to severe levels of clutter. In an extensive comparative sensitivity analysis Wilson and Hancock (1996) have demonstrated that although effective when sub-graph-matching is being attempted, the null-labelling technique has a greater susceptibility to noise.

In a recent paper, Wilson and Hancock (1995) have shown how this process of active graph reconfiguration can be realised as an optimisation process. Details of the derivation are outside the scope of this paper. The basic idea is to gauge the net effect of deleting a node by examining those contributions to the consistency measure that arise from modification of the super-cliques containing the node in question. Suppose we are considering deleting node $u \in V_\alpha$ from the data graph $G_\alpha$ to produce a new graph $G_\beta = (V_\beta, E_\beta)$. Here the two node sets are related by $V_\beta = V_\alpha - \{u\}$ and the edge set $E_\beta$ is computed by re-triangulating the modified node set $V_\beta$. At the clique level the change in the consistency measure is monitored by identifying those nodes that form a clique with node $u$ in graph $G_\alpha$, i.e. $C_u - \{u\}$, and determining their counterpart cliques in the reconfigured graph $G_\beta$. We let $\chi_u^+$ denote the clique set of object $u$ in graph $G_\alpha$ and $\chi_u^-$ denote the corresponding clique set in the reconfigured graph $G_\beta$. With this notation the change in the consistency criterion caused by the deletion of the node $u$ is proportional to:

$$\Delta_u^- = \sum_{j \in \chi_u^-} \frac{\beta_j}{Z_j}\sum_{\mu=1}^{Z_j}\exp[-kH_\mu] \tag{8}$$

By contrast, when considering the change in the MAP criterion caused by re-insertion of the node $u$ it is the super-clique set $\chi_u^+$ to which we turn our attention. The corresponding change to the consistency criterion is proportional to:

$$\Delta_u^+ = \sum_{j \in \chi_u^+} \frac{\beta_j}{Z_j}\sum_{\mu=1}^{Z_j}\exp[-kH_\mu] \tag{9}$$

The decision criteria for node deletion or re-insertion are as follows. We delete node $u$ provided $\Delta_u^+ < \Delta_u^-$ and reinstate it provided $\Delta_u^+ > \Delta_u^-$. This graph-editing procedure is applied at each of the solutions in turn as part of the hill-climbing process. The set of nodes constituting the reconfigured graph is therefore

$$V_\beta = \{u \in V_\alpha \mid \Delta_u^+ > \Delta_u^-\}$$

3.3.1. Selection

The hill-climbing and node-deletion operations are purely deterministic processes which effectively bring about local improvements in matching consistency. These operations would otherwise prove time-consuming if pursued by stochastic means. The final stochastic element of genetic search is the selection process. The aim here is to randomly admit the configurations refined by the hill-climbing process to the population on the basis of their fitness measure.

The probability distribution defined in Eq. 5 lends itself naturally to the definition of a population membership probability. By normalising the sum of clique configuration probabilities over the population of matches, we arrive at the following probability for randomly admitting the solution indexed $\alpha$ to the pool of graphs $\mathcal{P}$:

$$P_s^{\alpha} = \frac{P_G^{\alpha}}{\sum_{\alpha' \in \mathcal{P}} P_G^{\alpha'}} \tag{10}$$

where $P_G^{\alpha}$ is the global consistency measure of Eq. 7 evaluated for the solution indexed $\alpha$. With this survival probability to hand, population selection can be facilitated by a straightforward application of the roulette-wheel algorithm (Goldberg, 1989). The final optimal match is located by selecting the graph for which $P_G^{\alpha}$ is maximum.
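Roulette-wheel selection is short enough to state in full. The sketch below assumes only that each solution carries a non-negative fitness value (here the global consistency measure) and that survival probability is that value divided by the population total; the function name and the two-member population are hypothetical.

```python
import bisect
import random
from itertools import accumulate

def roulette_select(fitness, size, rng):
    """Draw `size` population indices with probability proportional to fitness."""
    total = sum(fitness)
    wheel = list(accumulate(value / total for value in fitness))
    picks = []
    for _ in range(size):
        slot = bisect.bisect_right(wheel, rng.random())
        picks.append(min(slot, len(fitness) - 1))  # guard against rounding at the top
    return picks

rng = random.Random(0)
fitness = [0.1, 0.3]  # hypothetical consistency values for two solutions
picks = roulette_select(fitness, 10000, rng)

# The final match is simply the fittest member of the population.
fittest = max(range(len(fitness)), key=fitness.__getitem__)
```

Each index is drawn independently, so a fit solution may be admitted several times while an unfit one may disappear from the pool altogether.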

3.4. Matching results

In Figs. 10 and 11 we illustrate the matching results obtained for the image pairs shown in Figs. 1 and 2. In each case the left-hand image is the fittest member of the initial population, while the right-hand image is the fittest match after three iterations. The lines between the graphs represent the correspondence matches. In other words, the method converges rapidly. In a more thorough recent investigation (Cross et al., 1997) we have shown that this is largely attributable to our use of graph-based crossover.

Fig. 10. (a) An initial guess. (b) The final solution.

Fig. 11. (a) An initial guess. (b) The final solution.

Fig. 12. (a) An initial guess. (b) The recovered solution.

The building configurations used in the matching experiments shown above are devoid of clutter. In order to illustrate the effectiveness of our matching method on noise-contaminated graphs we conclude our experimental discussion with an example on synthetic data. The synthetic graphs used in this study have been constructed by generating random dot patterns and constructing the associated Delaunay graphs. Structural corruption has been generated by adding random noise dots and re-triangulating. Fig. 12 illustrates some typical matching results on synthetic graphs. Fig. 12a shows a representative solution from the initial population. The left-hand graph is the model while the right-hand graph shows the data; lines between the two graphs indicate matches. The data graph has been obtained by adding random clutter to the model and perturbing the nodes with Gaussian position errors. The original model graph contains 20 nodes while the corrupted data graph contains 40 nodes. Fig. 12b shows the fittest match from the genetic population after three iterations. There are two features worth noting. Firstly, the overall consistency of match has improved. The lines connecting the nodes in the data and model graphs are no longer randomly distributed. Secondly, the added clutter nodes have all been correctly identified and deleted from the data graph; they appear as disjoint points on the right-hand image of Fig. 12b. The overall accuracy of match in this example is 100%.

4. Conclusions

This paper describes some practical steps taken towards the matching of aerial images containing man-made buildings. The features in the scenes under study are represented by Delaunay triangulations. In the application described here, the nodes in these graphs are the locations of buildings. We commence by demonstrating that the building structures in images can be robustly localised using matched filters. Moreover, the triangulation of the convolution response maxima provides a stable relational structure for matching. It is relatively robust under both the addition of noise and distortion of the image plane.

Correspondences between the individual nodes in pairs of image graphs are located using genetic search. This is a global optimisation strategy in which a population of alternative solutions is maintained. The main advantage of adopting this optimisation strategy resides in the crossover process. This allows partially consistent matches to be rapidly recombined at the sub-graph level. This offers accelerated convergence over alternative optimisation methods such as simulated annealing.

References

Aarts, E., Korst, J., 1989. Simulated Annealing and Boltzmann Machines. John Wiley, New York.


Caves, R.G., Harley, P.J., Quegan, S., 1992. Matching map features to Synthetic Aperture Radar (SAR) images using template matching. IEEE Geosci. Remote Sensing 30, 680-685.

Cedeno, W., Vemuri, V.R., Slezak, T., 1995. Multi-niche crowding in genetic algorithms and its application to the assembly of DNA restriction fragments. Evolut. Comput. 2, 321-345.

Cross, A.D.J., Wilson, R.C., Hancock, E.R., 1997. Inexact matching with genetic search. Pattern Recognition 30, 123-140.

Davis, L., 1991. Bit-climbing, representational bias and test suite design. Proc. 4th Int. Conf. Genet. Algorithms, pp. 18-23.

Eshelman, L., 1991. The CHC adaptive search algorithm: how to have safe search when engaging in non-traditional genetic recombination. Found. Genet. Algorithms 1, 265-283.

Faugeras, O.D., Le Bras-Mehlman, E., Boissonnat, J.-D., 1990. Representing stereo data with the Delaunay triangulation. Artif. Intell. 44, 41-87.

Finch, A.M., Wilson, R.C., Hancock, E.R., 1996. Matching Delaunay graphs. Pattern Recognition 30, 123-140.

Finch, A.M., Wilson, R.C., Hancock, E.R., 1997a. Relational matching with mean-field annealing. Proc. 13th Int. Conf. Pattern Recognition, Vol. II, pp. 359-363.

Finch, A.M., Wilson, R.C., Hancock, E.R., 1997b. Softening discrete relaxation. In: Jordan, M., Mozer, M., Petsche, M. (Eds.), Advances in Neural Information Processing Systems, 1. MIT Press, pp. 438-444.

Fogel, D.B., 1994. An introduction to simulated evolutionary optimisation. IEEE Trans. Neural Networks 5, 3-14.

Geman, S., Geman, D., 1984. Stochastic relaxation, Gibbs distributions and Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6, 721-741.

Goldberg, D., 1989. Genetic Algorithms in Search, Optimisation and Learning. Addison-Wesley, Reading, MA.

Illingworth, J., Kittler, J., 1986. Minimum error thresholding. Pattern Recognition 19, 41-47.

Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P., 1983. Optimisation by simulated annealing. Science 220, 671-680.

Lin, C., Huertas, A., Nevatia, R., 1994. Detection of buildings using perceptual grouping and shadows. IEEE Comput. Vision Pattern Recognition Conf., pp. 62-69.

Louis, S.J., Rawlins, G.J.E., 1993. Syntactic analysis of convergence in genetic algorithms. Found. Genet. Algorithms 2, 141-151.

Messmer, B.T., Bunke, H., 1995. Efficient error-tolerant subgraph isomorphism detection. In: Dori, D., Bruckstein, A. (Eds.), Shape, Structure and Pattern Recognition. World Scientific, Singapore, pp. 231-240.

Peterson, C., Soderberg, B., 1989. A new method for mapping optimisation problems. Int. J. Neural Syst. 1, 2-33.

Qi, X.F., Palmieri, F., 1994. Theoretical analysis of evolutionary algorithms with an infinite population in continuous space: basic properties of selection and mutation. IEEE Trans. Neural Networks 5, 102-119.

Roux, M., McKeown, D.M., 1994b. Feature matching for building extraction from multiple views. IEEE Comput. Vision Pattern Recognition Conf., pp. 46-53.

Sanfeliu, A., Fu, K.S., 1983. A distance measure between attributed relational graphs for pattern recognition. IEEE Trans. Syst. Man Cybern. 13, 353-362.

Shapiro, L.G., Haralick, R.M., 1985. A metric for comparing relational descriptions. IEEE Trans. Pattern Anal. Mach. Intell. 7, 90-94.

Tuceryan, M., Chorzempa, 1991. Relative sensitivity of a family of closest-point graphs in computer. Pattern Recognition 24, 341-373.

Wilson, R.C., Hancock, E.R., 1995. Relational matching with dynamic graph structures. Proc. 5th Int. Conf. Computer Vision, pp. 450-456.

Wilson, R.C., Hancock, E.R., 1996. A Bayesian compatibility model for graph matching. Pattern Recognition Lett. 17, 263-276.

Wilson, R.C., Hancock, E.R., 1997. Structural matching by discrete relaxation. IEEE Trans. Pattern Anal. Mach. Intell. 19, 634-648.

Wilson, R.C., Evans, A.N., Hancock, E.R., 1995. Relational matching by discrete relaxation. Image Vision Comput. 13, 411-421.

Wu, Y., Maitre, H., 1986. Registration of rotated pictures with hidden parts using dynamic programming. Proc. 8th Int. Conf. Pattern Recognition, pp. 792-794.
