
[ACM Press the sixth annual symposium - Berkley, California, United States (1990.06.07-1990.06.09)] Proceedings of the sixth annual symposium on Computational geometry - SCG '90 -


Computing the Minimum Hausdorff Distance for Point Sets Under Translation

Daniel P. Huttenlocher
Computer Science Department
Cornell University

Klara Kedem*
Computer Science Department
Tel Aviv University

*Supported by the Pikkowski-Valazzi Fund.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

© 1990 ACM 0-89791-362-0/90/0006/0340 $1.50

Abstract

We consider the problem of computing a translation that minimizes the Hausdorff distance between two sets of points. For points in $\mathbb{R}^1$, in the worst case there are O(mn) translations at which the Hausdorff distance is a local minimum, where m is the number of points in one set and n is the number in the other. For points in $\mathbb{R}^2$ there are $\Theta(mn(m + n))$ such local minima. We show how to compute the minimal Hausdorff distance in time O(mn log mn) for points in $\mathbb{R}^1$ and in time $O(m^2 n^2 \alpha(mn))$ for points in $\mathbb{R}^2$. The results for the one-dimensional case are applied to the problem of comparing polygons under general affine transformations, where we extend the recent results of Arkin et al. on polygon resemblance under rigid body motion. The two-dimensional case is closely related to the problem of finding an approximate congruence between two point sets under translation in the plane, as considered by Alt et al.

1. Introduction

We are interested in the problem of comparing geometric objects, in order to determine the extent to which one object resembles another. This problem is of central importance in pattern recognition, text recognition, and model-based vision. In this paper we restrict our attention to objects that are composed of a set of points in $\mathbb{R}^d$, where the set is free to translate but is of fixed orientation and scale. We develop a cost function D(A, B) that measures the difference between two such geometric objects A and B. That is, D(A, B) is a translation invariant cost function for comparing two point sets; D(A, B) = D(A, TB) where TB is a translation of B. As discussed in [ACHKM], for pattern recognition applications it is desirable that this function: (i) be a metric, (ii) be efficiently computable, and (iii) match our intuitive notion of resemblance.

The cost function that we develop is a distance metric based on finding the minimum Hausdorff distance between two point sets under all possible translations. That is, for two point sets A and B we find a translation that minimizes the Hausdorff distance between the two sets. For point sets in $\mathbb{R}^1$ we show that this minimal distance can be computed in time O(mn log mn), where sets A and B have m and n points respectively. For sets of points in $\mathbb{R}^2$ we show that the distance can be computed in time $O(m^2 n^2 \alpha(mn))$.

A problem that is closely related to computing our metric in $\mathbb{R}^2$ is that of finding the best approximate congruence under translation between two sets of n points in the plane [AMWW]. That is, finding a matching of points in A and B and a translation, t, such that the maximal difference between each translated point of B and the corresponding point of A is minimized. This matching, the translation, t, and the difference, d, can be computed in time $O(n^6 \log n)$ [AMWW].

While finding the best approximate congruence is similar to the problem that we consider here, our technique of finding the translation minimizing the Hausdorff distance is particularly well suited to pattern recognition applications according to the three criteria in the first paragraph:

• The distance function D(A, B) is a metric.

• The distance D(A, B) can be computed efficiently: $O(n^4 \alpha(n))$ versus $O(n^6 \log n)$.

• Our metric allows unmatched points but requires every point of A to be near some point of B (and vice versa), which seems to match intuition well, as illustrated in the next section.

An important application of our method is in comparing two polygons that are free to undergo an affine transformation. Affine transformations are common in computer vision, where they are used to approximate perspective viewing [Ho]. We represent a polygon by the locations of its vertices as a function of affine arc length (a natural analog to the standard notion of arc length along a path) [Sp]. This representation is invariant under affine transformations, except for the starting location of the arc length parameter along the contour. A change in the starting location corresponds to a translation of the vertices with respect to the affine arc length parameter. Thus for two polygons, we use our metric D(·, ·) to find the best translation of the two sequences of vertex locations. Our results extend the shape comparison method of [ACHKM] from rigid body transformations to general affine transformations (using a Hausdorff as opposed to an $L_2$ metric), without paying any more in terms of time complexity.

In the next section we define our distance function, show that it is a metric, and compare it with the approximate congruence method of [AMWW]. In Section 3 we discuss the application of our metric to the problem of comparing polygons that are free to undergo a general affine transformation. In Section 4 we consider the complexity of computing the metric for sets of points in $\mathbb{R}^1$. We show that the number of possible local minima of the distance function is O(mn) for sets of m and n points, and provide an O(mn log mn) algorithm for computing the distance, D(A, B). In Section 5 we consider the complexity of computing the metric for sets of points in $\mathbb{R}^2$. We show that there are O(mn(m + n)) possible local minima in this case, and present an algorithm that finds D(A, B) in time $O(m^2 n^2 \alpha(mn))$.

2. A Translation Invariant Distance Metric

In this section we define the cost function D(A, B) that measures the minimum Hausdorff distance between two point sets A and B under translation. (Without loss of generality we fix the set A and allow only B to translate.) As mentioned in the previous section, it is desirable that this function be a metric. Thus we require that D(A, A) be zero, and that D(·, ·) be non-negative, symmetric, and satisfy the triangle inequality (cf. [Ro]).


Figure 1: Two sets of points illustrating the distances H(A, B) and D(A, B).

The Hausdorff distance between two sets of points A and B is given by

$$H(A, B) = \max(h(A, B), h(B, A)) \tag{1}$$

where

$$h(A, B) = \max_{a \in A} \min_{b \in B} \rho(a, b), \tag{2}$$

and $\rho(a, b)$ is the distance between the points a and b in some chosen metric.

We define the distance between two point sets A and B as

$$D(A, B) = \min_{t} H(A, B \oplus t) \tag{3}$$

where $B \oplus t = \{b + t \mid b \in B\}$, and H is the Hausdorff distance as defined above. For example, Figure 1 shows two sets of points on a line, where the set A is illustrated by dots and the set B by crosses. H(A, B) is large because there are points of A that are not near any points of B and vice versa. D(A, B) is small, however, because there is a translation of B that makes each point of A nearly coincident with some point of B and vice versa.

Clearly the Hausdorff distance, H(A, B), can be computed in time O(mn) for two point sets of size m and n respectively. In the special case where the points are the vertices in $\mathbb{R}^2$ of a convex m-gon, X, and a convex n-gon, Y, it has been shown that H(X, Y) can be computed in time O(m + n) [At]. In Sections 4 and 5 we characterize the complexity of computing D(A, B).
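The O(mn) evaluation of equations (1) and (2) can be sketched directly. The following is a minimal illustration, not code from the paper: the function names `directed_hausdorff`, `hausdorff` and `translate` are hypothetical, points are assumed to be tuples of coordinates, and the Euclidean distance `math.dist` is an assumed choice for the metric $\rho$.

```python
import math

def directed_hausdorff(A, B, rho=math.dist):
    """h(A, B) of equation (2): the largest distance from a point of A
    to its nearest point of B. Takes O(mn) evaluations of rho."""
    return max(min(rho(a, b) for b in B) for a in A)

def hausdorff(A, B, rho=math.dist):
    """H(A, B) of equation (1): the larger of the two directed distances."""
    return max(directed_hausdorff(A, B, rho), directed_hausdorff(B, A, rho))

def translate(B, t):
    """The translated set B + t, with points and t given as coordinate tuples."""
    return [tuple(bk + tk for bk, tk in zip(b, t)) for b in B]

# Two point sets on a line, written as 1-tuples (illustrative data only).
A = [(0.0,), (1.0,), (2.0,)]
B = [(10.0,), (11.0,), (12.0,)]
print(hausdorff(A, B))                        # 10.0: the sets are far apart
print(hausdorff(A, translate(B, (-10.0,))))   # 0.0: a translation aligns them
```

In this example H(A, B) is large while some translation of B makes the distance zero, which is the situation that D(A, B) in equation (3) is designed to capture.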

Proposition 1 D(A, B) is a metric.

Proof. Clearly D(A, B) is non-negative, is symmetric, and has the identity property because H(A, B) is a metric and has these properties. We show that D(A, B) also satisfies the triangle inequality.

Denote by $t_{ac}$ the translation of A that minimizes H(A, C), and similarly by $t_{bc}$ the translation of B that minimizes H(B, C). Let $A' = A \oplus t_{ac}$ and $B' = B \oplus t_{bc}$. Because H is a metric, $H(A', B') \le H(A', C) + H(C, B')$, and by equations (1) and (3) and the choice of $t_{ac}$ and $t_{bc}$ this implies

$$H(A', B') \le D(A, C) + D(C, B),$$

but A' and B' are translations of A and B, and H is unchanged when both sets are translated together, therefore

$$D(A, B) \le H(A', B') \le D(A, C) + D(C, B).$$ ∎

Figure 2: Minimizing distances over matching pairs of points (panels a and b).

Figure 3: The minimal Hausdorff distance under translation (panels a and b).

As mentioned in the previous section, a problem closely related to computing D(A, B) is that of finding the best approximate congruence under translation for two sets of n points, A and B [AMWW]. Formally, the problem is to find the translation t of B and the bijection $l : B \to A$ that minimizes

$$d = \max_{b \in B} \rho(b + t, l(b)). \tag{4}$$

Here $\rho(\cdot, \cdot)$ is either $L_2$ or $L_\infty$.

This minimum cost matching is not very well suited to our problem, however, because in pattern matching applications the sensing device will often merge two close points into one or split one point into two. Figure 2 illustrates why this is a problem for the formulation of [AMWW]. This figure shows the best matchings computed between one point set (shown as dots) and two different point sets (shown as crosses) using (4). Not only are these "best" translations not what one would intuitively report, the cost of the match in Figure 2a is twice as large as that in 2b, whereas intuitively the opposite should be true.

Our technique of finding the translation that minimizes the Hausdorff distance using (3) does not suffer from this problem because the minimization is done over all points rather than just the pairs of points in a matching. Figure 3 shows the same point sets compared using this method. The translations are intuitively "good matches", and the cost of the pair in Figure 3b is much larger than that in 3a, which accords well with intuition.
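The sensitivity of a bijective matching to split or merged points can be seen on a small example. The sketch below is illustrative only and is not taken from the paper: it evaluates the matching cost of equation (4) and the Hausdorff distance of equation (1) at one fixed translation (t = 0) instead of minimizing over translations, it assumes the $L_\infty$ metric, it enumerates bijections by brute force, and the point sets and function names are hypothetical.

```python
from itertools import permutations

def linf(p, q):
    """L-infinity distance between two points in the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def bottleneck_matching_cost(A, B, rho=linf):
    """Cost of equation (4) at t = 0: the smallest, over all bijections
    between B and A, of the largest distance between matched points.
    Brute force over all n! bijections, so only usable for tiny sets."""
    if len(A) != len(B):
        raise ValueError("a bijection needs sets of equal size")
    return min(max(rho(a, b) for a, b in zip(A, perm))
               for perm in permutations(B))

def hausdorff(A, B, rho=linf):
    """Hausdorff distance of equation (1) at t = 0."""
    h_ab = max(min(rho(a, b) for b in B) for a in A)
    h_ba = max(min(rho(a, b) for a in A) for b in B)
    return max(h_ab, h_ba)

# A has a point near (0, 0) that was split in two; B has one near (40, 0).
A = [(0, 0), (1, 0), (40, 0)]
B = [(0, 0), (40, 0), (41, 0)]
print(bottleneck_matching_cost(A, B))  # 39: one pair must span the gap
print(hausdorff(A, B))                 # 1: every point has a close neighbor
```

Because the bijection must pair every point, one of the split points is forced to match across the gap, whereas the Hausdorff distance only asks that each point have some nearby point in the other set.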

3. An Application to Polygon Comparison

We now present an application of our metric to the problem of comparing two polygonal shapes that are free to undergo an affine transformation $T(x) = Lx + b$, where $\det(L) \ne 0$ (cf. [Sp] Chapter 2). This application is important in computer vision, because the projection of an imaging device can be modeled using an affine transformation. In computer vision we are concerned with comparing a shape A, which is stored as a model for some particular object, with a shape B, which is found to exist in an image. If A and B are close to being the same shape then the vision system should report a match and return a measure of how good that match is.

The basic outline of the method is as follows. We represent a polygon $P = (p_1, \ldots, p_n)$ in terms of the normalized affine arc length location of each successive vertex $p_i \in \mathbb{R}^2$. The resulting sequence $\Sigma_P = (\sigma^*_1, \ldots, \sigma^*_n)$, $\sigma^*_i \in [0, 1]$, is invariant under an affine transformation. The sequence $\Sigma_P$ will change by a translation (on the unit interval) depending on the starting location of the arc length parameter along the contour. Thus we compare two polygons by using D(·, ·) to find the translation that minimizes the distance between their affine arc length representations. We show that this distance is zero if and only if the two polygons are affine transformations of one another. Furthermore, slightly perturbing a polygon by adding another vertex results in a small distance.

This method generalizes the results of [ACHKM], where a metric was developed for comparing polygonal shapes that may undergo a similarity transformation. That is, they defined a shape independently of its position, orientation and scale. Our metric can be used to compare polygonal shapes that may undergo an affine transformation. That is, we define a shape independently of translation, rotation, scaling and skewing. As in [ACHKM] our method works for both convex and nonconvex polygons and runs in time O(mn log mn) where m is the number of vertices in one polygon and n is the number of vertices in the other. An additional difference between our work and that of [ACHKM] is in the definition of the distance function. While they use the $L_2$ metric to compare a turning function representation of two polygons, we use a Hausdorff metric to compare an affine arc length representation of two polygons.

The affine arc length of a smooth curve C is invariant under the class of equi-affine transformations (where det(L) = 1), and is given by

$$\sigma(t) = \int_0^t |\det(c'(s), c''(s))|^{1/3} \, ds$$

where $c'(s)$ and $c''(s)$ are the first and second derivatives of C as a function of the regular arc length, s (parameterized on the unit interval) [Sp]. We use the normalized affine arc length, which is invariant under general affine transformations (where $\det(L) \ne 0$),

$$\sigma^*(t) = \frac{\sigma(t)}{\sigma(1)}.$$

The discrete version, for a polygon P with vertices $(p_1, \ldots, p_n)$, can be defined analogously using differences between successive vertices. In this case, the affine arc length location of vertex $p_i$ is

$$\sigma_i = \sum_{k=1}^{i} |\det(p_{k+1} - p_k,\; p_{k+2} - 2p_{k+1} + p_k)|^{1/3} \tag{5}$$

where the indices are modulo n so that $p_{n+1} = p_1$.

The normalized affine arc length representation of P is the sequence $\Sigma_P = (\sigma^*_1, \ldots, \sigma^*_n)$, where $\sigma^*_i = \sigma_i / \sigma_n$. Clearly this representation is invariant under the class of affine transformations. Furthermore two sequences $\Sigma_P$ and $\Sigma_Q$ are equal, up to a translation on the unit interval, if and only if P and Q are affine transformations of one another.
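Equation (5) and its normalization can be sketched in a few lines. This is an illustrative reading of the formula, not the authors' implementation: the function name is hypothetical, vertices are assumed to be (x, y) tuples listed in order around the polygon, indices are 0-based, and the absolute value of the determinant is taken before the cube root.

```python
def normalized_affine_arc_length(poly):
    """Return the normalized affine arc length locations of the vertices
    of a polygon, following equation (5) with indices taken modulo n."""
    n = len(poly)
    sigma = []
    total = 0.0
    for k in range(n):
        (x0, y0), (x1, y1), (x2, y2) = poly[k], poly[(k + 1) % n], poly[(k + 2) % n]
        # First difference p_{k+1} - p_k and second difference
        # p_{k+2} - 2 p_{k+1} + p_k of successive vertices.
        ux, uy = x1 - x0, y1 - y0
        wx, wy = x2 - 2 * x1 + x0, y2 - 2 * y1 + y0
        det = ux * wy - uy * wx
        total += abs(det) ** (1.0 / 3.0)
        sigma.append(total)
    if total == 0.0:
        raise ValueError("all vertices are collinear; arc length is zero")
    return [s / total for s in sigma]

def affine_map(poly, L, b):
    """Apply T(x) = Lx + b, with L given as ((l11, l12), (l21, l22))."""
    return [(L[0][0] * x + L[0][1] * y + b[0],
             L[1][0] * x + L[1][1] * y + b[1]) for x, y in poly]

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
skewed = affine_map(square, ((2.0, 1.0), (0.0, 3.0)), (5.0, -1.0))
print(normalized_affine_arc_length(square))
print(normalized_affine_arc_length(skewed))  # same locations up to rounding
```

An affine map multiplies every determinant in (5) by det(L), so each term is scaled by the same factor and the normalized locations are unchanged, which is the invariance claimed in the text.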

Proposition 2 Given two polygons P and Q, their normalized affine arc length representations $\Sigma_P$ and $\Sigma_Q$ are equal (up to a translation on the unit interval) if and only if P and Q are related by an affine transformation.

Proof. First we show that if the arc length parameter starts at the same vertex for both P and Q (e.g., the locations of corresponding vertices $p_1$ and $q_1$ are both zero) then $\Sigma_P$ and $\Sigma_Q$ are equal if and only if there is an affine transformation relating P and Q.

By the definition of normalized affine arc length, $\Sigma_P$ and $\Sigma_Q$ are equal if and only if each term of the summation in (5) for $\Sigma_P$ differs by a fixed constant factor from the corresponding term of the summation for $\Sigma_Q$ (that constant being the ratio of the two sums). Expanding the determinant shows that each term of the summation is just

$$[2\,|\triangle p_k p_{k+1} p_{k+2}|]^{1/3}$$

where $|\triangle abc|$ denotes the area of the triangle with vertices a, b, c.

Thus each corresponding term for $\Sigma_P$ and $\Sigma_Q$ will differ by a constant factor if and only if the areas of the triangles formed by successive vertices of P and Q differ by a constant factor. From planar affine geometry we know that triangle area ratios are preserved if and only if there is an affine transformation relating P and Q [KI]. Changes in the starting location of the arc length parameter will result in $\Sigma_P$ and $\Sigma_Q$ only being equal up to a translation on the unit interval. ∎

We now use this affine invariant representation of a polygon to determine a distance between two polygonal shapes P and Q. We know that $\Sigma_P$ and $\Sigma_Q$ are equal up to a translation if and only if P and Q are affine transformations of one another. Thus we use our metric D(·, ·) to compute the minimal Hausdorff distance under translation between $\Sigma_P$ and $\Sigma_Q$. Clearly this distance measures differences in shape independent of affine transformations. That is, the distance is zero if and only if P is an affine transformation of Q. We also note that the distance is small for perturbations such as the addition of an extra vertex to one of the polygons. Adding a vertex that is nearly collinear with the preceding and following vertices will result in very little affine arc length (because the affine arc length along a line is zero), and thus will only perturb the locations of the other vertices by a small amount.

Lemma 1 The distance between two polygons P and Q that are free to undergo general affine transformations can be computed in time O(mn log mn) for polygons of m and n vertices respectively.

Proof. We exhibit the algorithm. ∎

Algorithm 1 Given two polygons, $P = (p_1, \ldots, p_m)$ and $Q = (q_1, \ldots, q_n)$, compute the distance between the classes of affine equivalent shapes defined by each polygon.

1. Compute $\Sigma_P$ and $\Sigma_Q$, the affine arc length locations of the vertices, using (5) and normalizing by the total affine arc length.

2. Compute the best Hausdorff translation between $\Sigma_P$ and $\Sigma_Q$ on the unit circle. That is, use Algorithm 2 (described in the following section) to compute $D(\Sigma_P, \Sigma_Q)$, where the points of $\Sigma_Q$ are translated modulo 1 rather than on the real line.

The first step requires O(m + n) time by inspection of (5), because each successive $\sigma_i$ can be computed by adding a single term to $\sigma_{i-1}$. Computing D(·, ·) in the second step requires time O(mn log mn) by Lemma 3 in the following section.

Figure 4: Some examples of affine distances.

Figure 4 shows the result of using Algorithm 1 to compare a square with four other shapes. The shapes are ordered from left to right in terms of increasing distance from the square as ranked by the method.

4. The One-Dimensional Case

In this section we show how to compute the minimum Hausdorff distance under translation, D(A, B), where $A = \{a_i\}_{i=1,\ldots,m}$ and $B = \{b_j\}_{j=1,\ldots,n}$ are sets of points on the real line. First we consider the complexity of the function that is minimized in order to determine D(A, B). Then we present an algorithm for computing D(A, B) and the corresponding translation, t, in time O(mn log mn).

In (3) we defined D(A, B) to be the minimum Hausdorff distance between sets A and B, over all possible translations t of B. The Hausdorff distance H(A, B) is defined in (1) as the maximum of two cost functions h(A, B) and h(B, A). In order to compute D(A, B) we define $f(t)$ and $f'(t)$, which specify respectively the values of h(A, B) and h(B, A) as a function of the translation, t, of the set B (the prime denotes a second function, not a derivative). From the definition of h in (2) we see that the functions $f(t)$ and $f'(t)$ are the results of taking lower and then upper envelopes of the distances between individual pairs of points $(a_i, b_j)$.

We denote the distance between a pair of points $a_i \in A$ and $b_j \in B$, as $b_j$ undergoes a translation t, by

$$\delta_{i,j}(t) = |b_j - a_i + t|.$$

In computing $f(t)$ and $f'(t)$ we first take lower envelopes of collections of these functions. We define the function $d_i(t)$ as the lower envelope of the functions $\delta_{i,j}(t)$ for a given point $a_i \in A$ and all $b_j \in B$,

$$d_i(t) = \min_{b_j \in B} \delta_{i,j}(t). \tag{6}$$

Similarly, $d'_j(t)$ is the lower envelope for a given $b_j \in B$ and all $a_i \in A$. The function $f(t)$ is now just the upper envelope of the functions $d_i(t)$ for each $a_i \in A$,

$$f(t) = \max_{a_i \in A} d_i(t). \tag{7}$$

Similarly the function $f'(t)$ is the upper envelope of the functions $d'_j(t)$ for each $b_j \in B$. Figure 5 illustrates a set of functions $d_i(t)$, with the resulting $f(t)$ shown in bold.

Figure 5: The function f(t) in one dimension.

The value of D(A, B) is the minimum over t of the maximum of $f(t)$ and $f'(t)$,

$$D(A, B) = \min_{t \in \mathbb{R}^1} \max(f(t), f'(t)).$$
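The definitions of f(t), f'(t) and D(A, B) above can be checked with a small brute-force sketch. This is not the O(mn log mn) algorithm of this section: it relies on the observation that H(A, B ⊕ t) is piecewise linear with slopes ±1, so every local minimum lies at a midpoint of two values $a_i - b_j$, and it simply tries all such midpoints. The function names are hypothetical, and the running time is far worse than that of Algorithm 2.

```python
from bisect import bisect_left
from itertools import combinations_with_replacement

def nearest_distance(sorted_pts, x):
    """Distance from x to the closest value in a sorted list (binary search)."""
    k = bisect_left(sorted_pts, x)
    best = float("inf")
    if k < len(sorted_pts):
        best = sorted_pts[k] - x
    if k > 0:
        best = min(best, x - sorted_pts[k - 1])
    return best

def hausdorff_at(A_sorted, B_sorted, t):
    """max(f(t), f'(t)), i.e. the Hausdorff distance between A and B + t."""
    # f(t): each d_i(t) = min_j |b_j - a_i + t| is the distance from a_i - t to B.
    f = max(nearest_distance(B_sorted, a - t) for a in A_sorted)
    # f'(t): each d'_j(t) = min_i |b_j - a_i + t| is the distance from b_j + t to A.
    f_prime = max(nearest_distance(A_sorted, b + t) for b in B_sorted)
    return max(f, f_prime)

def min_hausdorff_1d(A, B):
    """Return (D(A, B), t) by testing every candidate local minimum."""
    A_sorted, B_sorted = sorted(A), sorted(B)
    # Translations at which some delta_{i,j}(t) is zero.
    zeros = sorted({a - b for a in A_sorted for b in B_sorted})
    best_val, best_t = float("inf"), None
    # A rising piece t - c1 meets a falling piece c2 - t at t = (c1 + c2) / 2.
    for c1, c2 in combinations_with_replacement(zeros, 2):
        t = 0.5 * (c1 + c2)
        val = hausdorff_at(A_sorted, B_sorted, t)
        if val < best_val:
            best_val, best_t = val, t
    return best_val, best_t

print(min_hausdorff_1d([0, 1], [5]))              # (0.5, -4.5)
print(min_hausdorff_1d([0, 1, 2], [10, 11, 12]))  # (0.0, -10.0)
```

In the first example the single point of B is best placed midway between the two points of A, giving a distance of 0.5; in the second a translation by -10 aligns the sets exactly.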

Proposition 3 The function f(t) has O(mn) local maxima and O(mn) local minima.

Proof. Clearly each of the m functions $d_i(t)$ is piecewise linear with segments of slope ±1, being the lower envelope of linearly changing absolute value functions, $\delta_{i,j}(t)$. Each $d_i(t)$ has n minima (where its value is zero) at $t = a_i - b_j$ for each $b_j$. We note that two functions g(x) and h(x) with M and N segments of slope ±1 can intersect one another at most min(M, N) times, because each segment of g intersects at most one segment of h and vice versa. Thus each of the m functions $d_i(t)$ contributes at most O(n) vertices to the upper envelope, f(t). ∎

Proposition 4 The function f(t) has $\Omega(mn)$ maxima and $\Omega(mn)$ minima.

Proof. The proof is by construction. Consider a set B that is a sequence of n points on the real line each separated by a distance of d. Each $d_i(t)$ thus has n maxima of height d/2 spaced d apart. If the set A now consists of a sequence of m points each separated by d/m, then each $d_i(t)$ is shifted over by d/m relative to the previous one. Clearly the upper envelope has all n peaks of all m functions visible. ∎

Lemma 2 The number of vertices of $H(A, B \oplus t)$ is $\Theta(mn)$.

Proof. By Propositions 3 and 4 the function f(t) has O(mn) minima and maxima, and parallel arguments hold for $f'(t)$. A similar argument also establishes that the upper envelope of $f$ and $f'$ has O(mn) minima and maxima. ∎

Note that it is not necessary to compute $f(t)$ and $f'(t)$ explicitly. Rather one can compute D(A, B) as the minimum of the upper envelope of the m + n lower envelopes $d_i(t)$ for each $a_i \in A$ and $d'_j(t)$ for each $b_j \in B$.

Lemma 3 D(A, B) and the corresponding translation t can be computed in time O(mn log mn).

Proof. This is shown by exhibiting the algorithm. ∎

Algorithm 2 Given two sets A and B, compute D(A, B) and t.

1. Compute the m functions $d_i(t)$. For a given $a_i \in A$ sort the n minima locations $t = a_i - b_j$ for each $b_j \in B$ by increasing t (note $d_i(t) = 0$ at these points). Form the n segments of slope 1 connecting each minimum to the maximum halfway between it and the following minimum.

2. Do the same for the n functions $d'_j(t)$, each having m minima.

3. Compute the upper envelope of the m + n functions from the previous two steps.

(a) Rotate the positive slope segments of each function 45 degrees clockwise about the origin, thereby making them into segments of slope 0.

(b) Compute the set of subsegments visible from $y = \infty$. Sort the segments by y value in decreasing order, process the segments in this order, and maintain a segment tree to find the dominant set of segments [Me].

(c) Rotate the dominant segments 45 degrees counterclockwise about the origin. The left endpoints of each segment are the local minima of $H(A, B \oplus t)$, the candidates for D(A, B).

4. Consider each minimum of the function computed in the previous step to find the global minimum.

The first step is dominated by the O(n log n) time to sort the minima of each of the m functions $d_i(t)$ for a total of O(mn log n). Similarly the second step takes time O(mn log m). The third step is dominated by the O(mn log mn) time required to sort the O(mn) segments and compute the dominant intervals, and the fourth step is linear in the number of minima which is O(mn). Thus the overall running time is O(mn log mn). We have implemented this method and have found that it also runs efficiently in practice.
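Step 1 of Algorithm 2 can be sketched as follows. This is an illustration of that one step only, under assumptions: the function name is hypothetical, the output is a list of (t, height) vertices of a single $d_i(t)$ between its first and last minimum, and the envelope computation of steps 3 and 4 is not implemented here.

```python
def lower_envelope_vertices(a, B):
    """Vertices of d_i(t) = min_j |b_j - a + t| for one point a.

    The minima (height 0) lie at t = a - b_j; between two consecutive
    minima the function rises with slope 1 to a maximum halfway between
    them and falls back with slope -1. Sorting costs O(n log n)."""
    minima = sorted(a - b for b in B)
    vertices = []
    for k, t in enumerate(minima):
        vertices.append((t, 0.0))
        if k + 1 < len(minima):
            nxt = minima[k + 1]
            # Maximum halfway to the following minimum; its height is
            # half the gap because the slopes are +1 and -1.
            vertices.append(((t + nxt) / 2.0, (nxt - t) / 2.0))
    return vertices

print(lower_envelope_vertices(0, [1, 3]))
# [(-3, 0.0), (-2.0, 1.0), (-1, 0.0)]
```

Repeating this for all m points of A costs O(mn log n) in total, matching the bound quoted for the first step.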

5. The Two-Dimensional Case

Here we show how to compute the distance between two point sets in the plane, $A = \{a_i\}_{i=1,\ldots,m}$ and $B = \{b_j\}_{j=1,\ldots,n}$. In general, we follow a line of reasoning analogous to that for the one-dimensional case in the previous section. First we analyze the complexity of the surface corresponding to $H(A, B \oplus t)$, and then we present an algorithm for computing D(A, B), the minimum of this surface.

We want to find a translation t of B in the plane that minimizes the Hausdorff distance between A and B, together with the distance for this t:

$$D(A, B) = \min_{t \in \mathbb{R}^2} \max(f(t), f'(t))$$

where $f(t)$ and $f'(t)$ are, as in the previous section, the upper envelopes of the $d_i(t)$ functions and the $d'_j(t)$ functions, respectively. These latter functions are in turn the lower envelopes of collections of functions

$$\delta_{i,j}(t) = \rho(a_i, b_j + t),$$

with $\rho(a, b)$ being either the $L_1$ or $L_\infty$ metric (recall that for two points $(x_1, y_1)$ and $(x_2, y_2)$ the $L_1$ metric is $|x_1 - x_2| + |y_1 - y_2|$ and the $L_\infty$ metric is $\max(|x_1 - x_2|, |y_1 - y_2|)$).

We analyze the complexity of the function f(t), where $\rho(a, b)$ is the $L_\infty$ metric. We note that a similar analysis holds for the case of the $L_1$ metric, because the $L_1$ and $L_\infty$ metrics differ only by a rotation of the distance function (e.g., see [CD]). In describing the surfaces corresponding to $f(t)$, $d_i(t)$, and $\delta_{i,j}(t)$, we use a coordinate system where the values of these functions are plotted along the z-axis, and the two-dimensional translation parameter t is plotted along the x and y axes. We refer to the magnitude in z as the "height" of a surface.

Observe the function $\delta_{i,j}(t)$ for some fixed points $a_i \in A$ and $b_j \in B$. This function defines an inverted pyramid with a minimum of z = 0 at $t = a_i - b_j$. The surface is composed of four faces that rise from this minimum. These faces are of slope z = y (the north face, or N), z = −y (south or S), z = x (west or W), and z = −x (east or E).

The function $d_i(t)$ is the distance between one point $a_i$ and the shifted set B, or equivalently the lower envelope of all the functions $\{\delta_{i,j}(t) \mid j = 1, \ldots, n\}$. It is easy to see that $d_i(t)$ is a polyhedral surface, consisting of parts of inverted pyramids whose minima (z = 0) are at $t_{i,j} = a_i - b_j$, with some new intersection edges and new intersection vertices. This surface has the same four types of faces as a single inverted pyramid (N, E, S, and W facing). There are three types of edges on the surface, which we name as follows:

1. Constant ridge, a local maximum which is of constant height, and is parallel to the x or y axis. Constant ridges are formed by a N-S face intersection or an E-W face intersection of two inverted pyramids.

2. Rising ridge, an edge which changes height linearly, and runs at a 45 degree diagonal to the x and y axes. Rising ridges are formed by a N-E, N-W, S-E, or S-W face intersection of two inverted pyramids.

3. Rising edge, part of an edge of an original inverted pyramid, which also runs at a 45 degree diagonal to the x and y axes. These edges start at a minimum point where z = 0 and rise to a ridge.

Proposition 5 If we project the ridges (both constant and rising) of the lower envelope $d_i(t)$ orthogonally onto the (x, y) plane, then we get the $L_\infty$ Voronoi diagram of the set of points $\{a_i - b_j\}_{j=1,\ldots,n}$.

Proof. Evident (cf. [CD], [LS]). ∎

Figure 6: The projection of the lower envelope, $d_i(t)$.
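The inverted pyramids and their lower envelope can be evaluated pointwise with a short sketch, which also checks the relation between the two metrics numerically. The code is illustrative and not from the paper: all function names are hypothetical, points and translations are (x, y) tuples, and the map `rotate45` is a rotation by 45 degrees combined with a uniform scaling by the square root of 2, so the two metrics agree only up to the constant factor 2 shown in the comment.

```python
def l1(p, q):
    """L1 distance |x1 - x2| + |y1 - y2|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def linf(p, q):
    """L-infinity distance max(|x1 - x2|, |y1 - y2|)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def rotate45(p):
    """Rotate by 45 degrees and scale by sqrt(2): (x, y) -> (x - y, x + y)."""
    return (p[0] - p[1], p[0] + p[1])

def delta(a, b, t):
    """The inverted pyramid delta_{i,j}(t) = rho(a, b + t) under L-infinity."""
    return linf(a, (b[0] + t[0], b[1] + t[1]))

def d_i(a, B, t):
    """Lower envelope over all b in B: distance from a to the shifted set B."""
    return min(delta(a, b, t) for b in B)

# After the rotation, L1 distances are exactly twice the L-infinity
# distances, because |x - y| + |x + y| = 2 max(|x|, |y|).
p, q = (3, -1), (-2, 4)
print(l1(rotate45(p), rotate45(q)), 2 * linf(p, q))  # 10 10

a, b = (4, 7), (1, 2)
print(delta(a, b, (3, 5)))                 # 0: pyramid minimum at t = a - b
print(d_i((0, 0), [(1, 0), (0, 3)], (0, 0)))  # 1
```

The zero of each pyramid at t = a − b is the minimum point described in the text, and `d_i` takes the pointwise minimum of the n pyramids for a fixed point of A.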

C o r o l l a r y I The number of vertices, edges and faces in d,(t) is O( . ) .

This is clear, since the number of vertices, edges and faces in the original inverted pyramids is O(n), and the complexity of the new vertices, edges and faces is the complexity of the Voronoi diagram which is also O(n).

The general position that we require, to make the discussion easier, is that there are no two points tl,j = ai - bj and ta,k = bz - ak such that the line passing through them is parallel to the z- or y-axis. It is easy to see that , once we assume general position, the degree of the new vertices in d~(t) is three (three faces intersect). (See Figure 6 for the planar projection of the lower envelope all(t); light lines are the rising edges of the pyramids, and dark lines are the ridges which are new edges of intersection between the faces of the pyramids.)

We now proceed to investigate the complexity of the function f(t) = max_{a_i ∈ A} d_i(t), which measures the distance from the set A to the set B as a function of the translation, t. As in the one-dimensional case considered above, the surface of f(t) is the upper envelope of the m functions d_i(t), one for each a_i ∈ A. This surface has the same four types of faces and three types of edges as the d_i(t) surfaces, and also has two additional types of edges:

• Constant valley, a local minimum which is of constant height, and is parallel to the x or y axis. Constant valleys are formed by a N-S or an E-W face intersection of two functions d_i(t) and d_j(t).

• Rising valley, which runs at a 45 degree diagonal to the x and y axes and increases height linearly. Rising valleys are formed by the intersection of N-E, N-W, S-E and S-W faces of pairs of distinct functions d_i(t) and d_j(t).
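The upper envelope f(t) can likewise be evaluated at a single translation. This is a minimal sketch, again assuming the L_∞ metric, with illustrative function names; it computes the directed distance from A to the translated set B and does not build the envelope itself.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def linf(p: Point, q: Point) -> float:
    """L-infinity distance between two points in the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def d(a: Point, B: List[Point], t: Point) -> float:
    """Lower envelope d_i(t): distance from a to the nearest point of B + t."""
    return min(linf(a, (b[0] + t[0], b[1] + t[1])) for b in B)


def f(A: List[Point], B: List[Point], t: Point) -> float:
    """Upper envelope f(t): the largest d_i(t) over all a_i in A."""
    return max(d(a, B, t) for a in A)


A = [(0.0, 0.0), (4.0, 0.0)]
B = [(0.0, 0.0), (4.0, 0.0)]
for t in [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]:
    print(t, f(A, B, t))
```

Each evaluation costs O(mn) time, so this is only useful for checking values at chosen translations.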

Before bounding the number of vertices on this surface, we prove some properties of the edges of f(t).

Page 8: [ACM Press the sixth annual symposium - Berkley, California, United States (1990.06.07-1990.06.09)] Proceedings of the sixth annual symposium on Computational geometry - SCG '90 -

Lemma 4 Each diagonal edge (rising edge or rising ridge) in a given d_i(t) is visible in f(t) as at most one segment.

Proof. An intersection between a diagonal edge and a face always yields a vertex at some height which we call z_0. The edge is unseen on the upper envelope for z-values just below z_0, and seen for z-values just above z_0. That is, such a vertex must be a lower vertex of a visible segment. A diagonal edge is rising monotonically towards a ridge, and thus there can be only one upper vertex and one lower vertex that are visible. ∎

Lemma 5 A rising valley is visible in f(t) as at most one segment.

Proof. A similar argument holds as the one just given above, only the valley rises towards an edge of either d_i(t) or d_j(t). ∎

Lemma 6 Each constant height ridge of a given d_i(t) is visible on the upper envelope f(t) as at most O(m) segments.

Proof. First we note that for a given d_i(t) any two ridges which are parallel to one another must be separated by a distance of at least g + h, where g and h are the heights of the ridges. This is because the ridges project to an L_∞ Voronoi diagram, and the slopes of the faces of d_i(t) are linear in x and y. Consider a ridge e that is of height h. This edge, e, can only be broken into more than one segment in f(t) if it penetrates some face F of another function d_j(t). In order for e to penetrate F the ridge at the peak of F must be of height at least h (the height of e). However we know that a given d_j(t) can have at most one edge of this height at a given orientation over a distance of length h, because the edges project to an L_∞ Voronoi diagram. There are only a constant number of orientations for ridges (horizontal, vertical, and the diagonals), thus each d_j(t) function can only intersect the edge e a constant number of times. Since there are m such functions each edge e is broken into at most O(m) segments. ∎

We now proceed to bound the number of vertices in f(t). First we enumerate the various types of vertices in f(t):

(a) Original vertices of the lower envelopes d_i(t) that appear on f(t).

(b) Vertices created by the diagonal edges (rising ridges or rising edges) of d_i(t) that penetrate faces of d_j(t).


(c) Vertices created by the constant height ridges in d_i(t) that penetrate faces of d_j(t).

(d) Vertices where faces of three different lower envelopes intersect.

Lemma 7 The number of vertices of f(t) is O(m^2 n).

Proof. Clearly (see Corollary 1) the number of vertices of type (a) is O(mn).

In Lemma 4 we showed that any diagonal edge of a d_i(t) is visible in f(t) as at most one connected segment. Hence each diagonal edge contributes at most two vertices to f(t). There are O(mn) diagonal edges that form the inverted pyramids (rising from minima), and by Proposition 5 there are O(n) diagonal ridges in each of the m functions d_i(t). Thus there are a total of O(mn) vertices of type (b).

In Lemma 6 we showed that each constant height ridge of a given d_i(t) is seen as at most O(m) segments in f(t). There are m such d_i(t) functions, and by Proposition 5 we know that each has O(n) constant ridges. Thus there are O(m^2 n) segments in f(t) due to these ridges, and the number of vertices of type (c) is O(m^2 n).

We will now bound the number of vertices that arise from intersections of faces that belong to different lower envelopes (type (d)). Under our assumption of general position, this type of vertex is created by the intersection of exactly three faces. Let us denote the faces by F_1 ∈ d_i(t), F_2 ∈ d_j(t) and F_3 ∈ d_r(t) and their intersection point by P. We can see the intersection point as a point where the intersection lines between the pairs (F_i, F_j) meet. (We denote l_1 = F_1 ∩ F_2, l_2 = F_2 ∩ F_3 and l_3 = F_1 ∩ F_3.) It is easy to see that one of the intersection lines, say l_1, is a constant valley, and the other two, say l_2 and l_3, are rising valleys. (If two were constant valleys there would have to have been a four face intersection.) We also note that a three face intersection point, P, is always a local minimum of the upper envelope (it is the joining of a constant valley, which is a minimum, and two rising valleys).

By Lemma 5 every diagonal line of intersection (rising valley) is seen on f(t) as at most one connected segment. Of course the upper endpoint, Q, of a rising valley cannot be a three face intersection point, since it is not a local minimum. Therefore it is an intersection of the rising valley with some edge e' of some lower envelope d_r(t). Now, if e' is a diagonal edge, then the endpoint Q is of type (b) and there are O(mn) such vertices, therefore we charge the three face intersection point P to Q. Since the degree of Q is bounded (three), the number of such possible P vertices is also O(mn). On the other hand, if e' is a constant height ridge of some d_i(t), then Q is a vertex of type (c) to which we charge our vertex P, and again, since there are O(m^2 n) vertices of type (c), the number of vertices of type (d) is also O(m^2 n). ∎

Figure 7: The constant height ridges of a d_i(t) for which m translated copies yield an f(t) with Ω(m^2 n) vertices.

Lemma 8 The number of vertices (edges, faces) of f(t) is Θ(m^2 n).

Proof. Since f(t) is a planar graph the number of edges and faces is linear in the number of vertices which is O(m^2 n) as we have just shown.

We now construct an example in which the number of vertices is Ω(m^2 n). First we create one lower envelope, d_i(t), in the following manner. On the x-y plane we place the heads of the inverted pyramids (where z = 0) such that they form two rows of points (see Figure 7). In the first row there are O(n) points, each 2H apart in x and at some given y_0. The second row also has O(n) points in a row, and the row is located 2(H - δ) below the first row, in y. The points of the second row are 2(H - δ) apart in x. We are interested in the constant height ridges of d_i(t) which are the L_∞ Voronoi diagram lines parallel to the x axis (we call them here horizontal) or y axis (which we call here vertical). Given the pyramid heads thus described we get in the first row O(n) vertical Voronoi edges of height z = H. Dividing the two rows of points are O(n) horizontal Voronoi edges of height z = H - δ, and between the points of the second row are O(n) vertical edges, also of z = H - δ.
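The two-row arrangement of pyramid heads can be reproduced for small parameters to check the ridge heights claimed above. The sketch below assumes the L_∞ metric and picks H = 4, δ = 0.5 and y_0 = 0 purely for illustration; `build_sites` and `envelope_height` are hypothetical helper names.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def build_sites(n: int, H: float, delta: float, y0: float = 0.0) -> List[Point]:
    """Pyramid heads: a first row spaced 2H apart in x at height y0, and a
    second row spaced 2(H - delta) apart in x, placed 2(H - delta) below it."""
    first = [(2 * H * k, y0) for k in range(n)]
    second = [(2 * (H - delta) * k, y0 - 2 * (H - delta)) for k in range(n)]
    return first + second


def envelope_height(sites: List[Point], t: Point) -> float:
    """Height of the lower envelope above t: L-infinity distance to the nearest head."""
    return min(max(abs(t[0] - sx), abs(t[1] - sy)) for sx, sy in sites)


sites = build_sites(n=3, H=4.0, delta=0.5)
print(envelope_height(sites, (4.0, 0.0)))    # on a vertical ridge of the first row
print(envelope_height(sites, (0.0, -3.5)))   # on the horizontal ridge between the rows
print(envelope_height(sites, (3.5, -7.0)))   # on a vertical ridge of the second row
```

The three sample points lie midway between neighboring heads, so their heights should come out as H, H - δ and H - δ respectively for these parameter values.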

We now make m copies of this structure by translating it in the direction of positive y in increments of some small amount ε, where ε > 2δ. Each horizontal edge of height H - δ in the first structure will now intersect each of the O(m) vertical edges of height H created by the O(m) copies of the structure. (Their projections on the x-y plane intersect.) Each such horizontal edge is split into O(m) segments on f(t). This gets us to Ω(m^2 n) segments for all the O(n) horizontal edges of all the O(m) copies of the structure. ∎

A similar set of arguments to those used in the above Lemmas establishes that the complexity of f'(t) is O(mn^2). However, as in Section 4 we note that it is not necessary to compute f(t) and f'(t) explicitly. Rather one can compute H(A, B ⊕ t) as the upper envelope of the m + n lower envelopes, d_i(t) for each a_i ∈ A and d'_j(t) for each b_j ∈ B, and then find the t_0 at which the minimum of H is achieved and the value D(A, B) = H(A, B ⊕ t_0). The only change in the above analysis is in Lemma 6, where we argue, without changing the proof, that each constant height ridge is visible on the upper envelope H(A, B ⊕ t) as at most O(m + n) segments. This in turn implies that the number of vertices of types (c) and (d) is O(mn(m + n)). In the same manner we can modify the construction in the proof of Lemma 8, by drawing an O(m) size initial lower envelope d'(t), then making O(n) copies of it, which gives the following.

Theorem 1 The number of vertices (edges, faces) of H(A, B ⊕ t) is O(mn(m + n)).
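The observation that H is the upper envelope of the m + n lower envelopes can be illustrated pointwise. This is a minimal sketch assuming the L_∞ metric; the function name `hausdorff_at` is illustrative, and the code evaluates H at one translation rather than constructing the envelope.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def linf(p: Point, q: Point) -> float:
    """L-infinity distance between two points in the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def hausdorff_at(A: List[Point], B: List[Point], t: Point) -> float:
    """Hausdorff distance between A and B translated by t, computed as the
    maximum over the m functions d_i(t) and the n functions d'_j(t)."""
    Bt = [(b[0] + t[0], b[1] + t[1]) for b in B]
    d = [min(linf(a, b) for b in Bt) for a in A]        # d_i(t), one per a_i
    d_prime = [min(linf(a, b) for a in A) for b in Bt]  # d'_j(t), one per b_j
    return max(d + d_prime)


A = [(0.0, 0.0), (4.0, 0.0)]
B = [(0.0, 0.0)]
print(hausdorff_at(A, B, (0.0, 0.0)))
print(hausdorff_at(A, B, (2.0, 0.0)))
```

In this small example the single point of B coincides with a point of A at t = (0, 0), so every d'_j is zero there, yet the second point of A is still far from B; taking the maximum over both families is what makes the measure symmetric.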

In order to compute the value of t that minimizes H(A, B ⊕ t) we make use of the results of [EGS] on envelopes of piecewise linear functions to calculate the surface. Then we consider each local minimum to find the minimizing t.

Lemma 9 The value of D(A, B) can be computed in time O(m^2 n^2 α(mn)).

Proof. We exhibit the algorithm. ∎

Algorithm 3 Given two sets of points A and B in ℜ^2, compute the translation t that minimizes H(A, B ⊕ t), and the corresponding distance D(A, B).

1. For each a_i ∈ A calculate d_i(t) by forming the appropriate Voronoi diagram (either L_1 or L_∞) of the n points a_i - b_j for each b_j ∈ B.

2. For each b_j ∈ B calculate d'_j(t) in an analogous fashion.

3. Find the upper envelope of the m + n functions computed in the first two steps, using the method of [EGS].

4. Find the t that minimizes the resulting function by scanning the vertices of the upper envelope for the one with the minimal z value.
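As a sanity check on small inputs, the minimization can be approximated by brute force. The sketch below is not the envelope method of [EGS]: it simply evaluates H at every translation in a caller-supplied candidate set (here an integer grid, an assumption made for illustration) and keeps the best one, so it finds the true minimum only if the minimizing translation happens to be among the candidates. The L_∞ metric and all function names are likewise assumptions.

```python
from itertools import product
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def linf(p: Point, q: Point) -> float:
    """L-infinity distance between two points in the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def hausdorff_at(A: List[Point], B: List[Point], t: Point) -> float:
    """Hausdorff distance between A and B translated by t."""
    Bt = [(b[0] + t[0], b[1] + t[1]) for b in B]
    forward = max(min(linf(a, b) for b in Bt) for a in A)
    backward = max(min(linf(a, b) for a in A) for b in Bt)
    return max(forward, backward)


def best_candidate(A: List[Point], B: List[Point],
                   candidates: Iterable[Point]) -> Tuple[Point, float]:
    """Return the candidate translation with the smallest H, and that value."""
    best_t = min(candidates, key=lambda t: hausdorff_at(A, B, t))
    return best_t, hausdorff_at(A, B, best_t)


A = [(0, 0), (3, 1), (1, 4)]
B = [(-2, -3), (1, -2), (-1, 1)]           # A translated by (-2, -3)
grid = list(product(range(-5, 6), repeat=2))
print(best_candidate(A, B, grid))
```

Each evaluation costs O(mn), so the total cost is proportional to the number of candidates times mn; the algorithm above avoids this by examining only the vertices of the upper envelope.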

Clearly the first step requires time O(n log n) for each of the m functions that are computed, and the second step requires time O(m log m) for each of the n functions that are computed. The third step applies the method of [EGS] to the n functions each of complexity O(m) and the m functions each of complexity O(n), which requires time O(m^2 n^2 α(mn)). The final step runs in time linear in the number of vertices which is O(mn(m + n)). Thus the third step dominates the overall running time. We note that using recent results of [OS], an algorithm can be developed that runs in time O(mn(m + n) polylog mn) [HKS].
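The per-step costs stated above can be collected into a single expression. The following display is only a restatement of those bounds, using the facts that mn log n, mn log m and mn(m + n) are each at most a constant times m^2 n^2 for m, n ≥ 1.

```latex
\underbrace{m \cdot O(n \log n)}_{\text{step 1}}
\;+\; \underbrace{n \cdot O(m \log m)}_{\text{step 2}}
\;+\; \underbrace{O\!\left(m^{2} n^{2}\,\alpha(mn)\right)}_{\text{step 3}}
\;+\; \underbrace{O\!\left(mn(m+n)\right)}_{\text{step 4}}
\;=\; O\!\left(m^{2} n^{2}\,\alpha(mn)\right)
```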

6. Summary

We have developed a distance metric for sets of points in ℜ^d based on finding the minimum Hausdorff distance between the sets under all possible translations. We have shown that this metric can be computed efficiently for d = 1, 2. Specifically, for two sets of points with m and n points respectively, the distance can be computed in time O(mn log mn) for points in ℜ^1, and in time O(m^2 n^2 α(mn)) for points in ℜ^2. We have used the one-dimensional metric to compute the distance between two polygons, P and Q, that are free to undergo an affine transformation. That is, the resulting distance is zero if and only if P and Q are related by an affine transformation.

In the two dimensional case, there are a number of questions for future work that we would like to investigate. First, we use the L_1 or L_∞ metrics for ρ(a, b), and it would be interesting to consider other L_p metrics, particularly L_2. Second, we are interested in adding rotation as a minimization parameter, and in developing efficient methods for computing the minimum Hausdorff distance between point sets under rigid body motion.

Acknowledgments

We wish to thank Micha Sharir and Nir Naor from Tel-Aviv University and Paul Chew from Cornell University for fruitful discussions.

References

[ACHKM] Arkin, E., Chew, L.P., Huttenlocher, D.P., Kedem, K. and Mitchell, J.S.B. "An Efficiently Computable Metric for Comparing Polygonal Shapes", Proceedings of the First ACM-SIAM Symposium on Discrete Algorithms, 1990.

[AMWW] Alt, H., Mehlhorn, K., Wagener, H. and Welzl, E. "Congruence, Similarity, and Symmetries of Geometric Objects", Discrete and Computational Geometry, vol. 3, 237-256, 1988.

[At] Atallah, M.J. "A Linear Time Algorithm for the Hausdorff Distance Between Convex Polygons", Information Processing Letters, vol. 17, 207-209, 1983.

[CD] Chew, L.P. and Drysdale, R.L. "Voronoi diagrams based on convex distance functions", Proc. ACM Symposium on Computational Geometry, pp. 235-244, 1985.

[EGS] Edelsbrunner, H., Guibas, L.J. and Sharir, M. "The Upper Envelope of Piecewise Linear Functions: Algorithms and Applications", Tech. Report UIUCDCS-R-87-1390, Dept. of Comp. Sci., University of Illinois at Urbana-Champaign, November 1987.

[HKS] Huttenlocher, D.P., Kedem, K. and Sharir, M. "Efficiently Computing the Hausdorff Distance for Point Sets Under Translation", in preparation.

[Ho] Horn, B.K.P. Robot Vision, MIT Press, Cambridge Mass., 1986.

[Kl] Klein, F. Elementary Mathematics from an Advanced Standpoint: Geometry, MacMillan, New York, 1939.

[OS] Overmars, M.H. and Sharir, M. "Merging Visibility Graphs", to appear in these Proceedings of the ACM Symposium on Computational Geometry, 1990.

[LS] Leven, D. and Sharir, M. "Planning a purely translational motion for a convex polygonal object in two dimensional space using Generalized Voronoi diagrams", Discrete and Computational Geometry, vol. 2, pp. 9-31, 1987.

[Me] Mehlhorn, K. Data Structures and Algorithms 3: Multi-Dimensional Searching and Computational Geometry, Springer-Verlag, Berlin, 1984.

[Ro] Royden, H.L. Real Analysis, Macmillan, New York, 1968.

[Sp] Spivak, M. A Comprehensive Introduction to Differential Geometry, Volume 3, Publish or Perish Press, Berkeley Calif., 1979.
