A (1+ε)-Approximation Algorithm for 2-Line-Center


P.K. Agarwal, C.M. Procopiuc, K.R. Varadarajan. Computational Geometry, 2003

Outline: Introduction, Preliminaries, Approximation Algorithm, Conclusion

1. Introduction: Projective clustering

Given a set S of n objects in R^d and two integers k < n and q ≤ d, find k q-dimensional flats h1, ..., hk and a partition of S into k subsets S1, ..., Sk so that

max_{1≤i≤k} max_{p∈Si} d(p, hi)

is minimized.

The k-line-center problem is the projective clustering problem for d = 2 and q = 1: partition S into k clusters and project each cluster Si onto a line so that the maximum distance between a point p and its projection p* is minimized.
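As a small illustration of the objective for d = 2 and q = 1, the following Python sketch evaluates the cost of a given partition and a given choice of lines. The function names and the representation of a line by two of its points are assumptions made for this example only.

```python
import math

def dist_to_line(pt, a, b):
    """Distance from pt to the line through the distinct points a and b."""
    (x, y), (ax, ay), (bx, by) = pt, a, b
    cross = (bx - ax) * (y - ay) - (by - ay) * (x - ax)
    return abs(cross) / math.hypot(bx - ax, by - ay)

def line_center_cost(clusters, lines):
    """Largest distance from any point to the line assigned to its cluster.

    clusters: list of point lists S1, ..., Sk
    lines:    list of (a, b) point pairs, one line per cluster
    """
    return max(
        (dist_to_line(p, a, b)
         for pts, (a, b) in zip(clusters, lines)
         for p in pts),
        default=0.0,
    )
```

The k-line-center problem asks for the partition and the lines that make this value as small as possible; the sketch only scores one candidate solution.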

1. Introduction: This paper

2-line-center: given a set S of n points in R^2, cover S by two strips so that the maximum width of a strip is minimized.

Projective clustering has recently received attention as a tool for building more efficient nearest-neighbor structures, since searching amid high-dimensional point sets is becoming increasingly important.

1. Introduction: Previous work

2-line-center: algorithms for the exact version have near-quadratic running time.

1-line-center (the width problem): O(n log n) time for d = 2; (1+ε)-approximation algorithms are also known.

General case (computing k projective clusters):

Deciding whether a set of n points in the plane can be covered by k lines is NP-complete.

Projective clustering is NP-complete.

Approximating the minimum width within a constant factor is NP-complete.

1. Introduction: This result

Let w* denote the minimum value such that S can be covered by two strips of width at most w*.

This paper presents an algorithm that computes, for any ε > 0, a cover of S by two strips of width at most (1+ε)w*, in time [bound missing].

Strategy of this paper: first present a 6-approximation algorithm, then derive a (1+ε)-approximation algorithm from it.

2. Preliminaries: Notation

Strip σ: the region lying between two parallel lines l1 and l2.

Width of σ: the distance between l1 and l2.

Direction of σ: the direction of l1.

Strip cover Σ of S: two strips such that each point of S lies in at least one of them.

For any points p, q: lpq is the line passing through p and q.

σ(p,q,r): the strip with median line lpq whose boundary passes through r; if r ∈ lpq, σ(p,q,r) is the same as lpq.

σ(p,q; w): the strip of width 2w having lpq as its median line.

[Figure: the strip σ(p,q,r), with p and q on the line lpq and the point r marked.]
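The strip notation translates directly into a few helper functions. This is a minimal Python sketch; the function names are hypothetical, and it assumes that σ(p,q,r) has r on its boundary, so that its width is twice the distance from r to lpq.

```python
import math

def dist_to_line(pt, p, q):
    """Distance from pt to the line lpq through the distinct points p and q."""
    (x, y), (px, py), (qx, qy) = pt, p, q
    cross = (qx - px) * (y - py) - (qy - py) * (x - px)
    return abs(cross) / math.hypot(qx - px, qy - py)

def strip_width(p, q, r):
    """Width of sigma(p,q,r): median line lpq, r on the boundary.

    The width is 0 when r lies on lpq, matching the degenerate case.
    """
    return 2.0 * dist_to_line(r, p, q)

def in_strip(pt, p, q, w):
    """True if pt lies in sigma(p,q;w), the strip of width 2w around lpq."""
    return dist_to_line(pt, p, q) <= w
```

For example, with p = (0, 0), q = (4, 0) and r = (1, 3), the strip σ(p,q,r) consists of the points whose distance to the x-axis is at most 3.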

Optimal cover: Σ* = {σ1*, σ2*} of S, of width w*.

Si* = S ∩ σi*.

Anchor pair of σ: a pair (p,q) of points of S ∩ σ with d(p,q) ≥ diam(S ∩ σ)/2.

[Figure: points p and q in S ∩ σ, with diam(S ∩ σ) marked.]
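The anchor-pair condition can be checked by brute force. The sketch below is only an illustration: the names are hypothetical, and the quadratic-time diameter computation stands in for any faster method.

```python
import math
from itertools import combinations

def diameter(points):
    """Largest pairwise distance (0.0 for fewer than two points)."""
    return max((math.dist(a, b) for a, b in combinations(points, 2)),
               default=0.0)

def is_anchor_pair(p, q, strip_points):
    """True if d(p,q) is at least half the diameter of the points in the strip.

    strip_points plays the role of S ∩ σ; p and q are assumed to belong to it.
    """
    return math.dist(p, q) >= diameter(strip_points) / 2.0
```

Any diametral pair of S ∩ σ is trivially an anchor pair; the definition also admits shorter pairs, which is what makes a small candidate family possible later.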

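2. Preliminaries: Lemma 2.1

Lemma 2.1 (in the form used in Section 3.1): if (p,q) is an anchor pair of a strip σ* of Σ* and S* = S ∩ σ*, then there is a point r ∈ S* such that width(σ(p,q,r)) ≤ 6w* and S* ⊆ σ(p,q,r).

Proof. Let Δ* be the diameter of S*.

ρ: the smallest rectangle containing S*; its length is L and its width is w.

σ': the thinnest strip parallel to lpq that contains ρ; its width is w'.

Choose r ∈ S* to be the point farthest from lpq. Since r ∈ σ', d(r, lpq) ≤ 3w.

Moreover S* = S ∩ σ* ⊆ σ(p,q,r), and the lemma follows.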

3. Approximation Algorithm

Two phases:

Phase 1: compute a cover of S by two strips of width at most 6w*.

Phase 2: use this cover to compute a new cover by two strips of width at most (1+ε)w*.

3.1 6-approximation cover

Suppose we have an anchor pair (p,q) of a strip in Σ*. (How to obtain such a pair is described in Section 3.2.)

WLOG, let (p,q) be an anchor pair of σ1*.

By Lemma 2.1 there exists r ∈ S such that width(σ(p,q,r)) ≤ 6w* and S \ σ(p,q,r) ⊆ σ2*.

Perform a binary search to find such an r.

Then compute a strip of width at most 2w* that contains the remaining points, i.e. S \ σ(p,q,r).

[Figure: binary search over w among the candidate values ..., wi, wi+1, ..., wn; the plot shows f(w), 2w and g(w).]
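The binary search of phase 1 can be sketched as follows, under assumptions that are not taken from the slides: f(w) is taken to be the width of the points outside σ(p,q;w), the candidate values of w are the distances from the points of S to lpq, and the search minimizes max(2w, f(w)). The width is computed by brute force rather than by a fast width algorithm, and all function names are hypothetical.

```python
import math
from itertools import combinations

def dist_to_line(pt, a, b):
    (x, y), (ax, ay), (bx, by) = pt, a, b
    cross = (bx - ax) * (y - ay) - (by - ay) * (x - ax)
    return abs(cross) / math.hypot(bx - ax, by - ay)

def width(points):
    """Brute-force width: smallest extent of the set measured
    perpendicular to a line through two of its points."""
    pts = list(set(points))
    best = math.inf
    for (ax, ay), (bx, by) in combinations(pts, 2):
        norm = math.hypot(bx - ax, by - ay)
        offs = [((bx - ax) * (y - ay) - (by - ay) * (x - ax)) / norm
                for x, y in pts]
        best = min(best, max(offs) - min(offs))
    return 0.0 if math.isinf(best) else best

def cover_from_anchor(points, p, q):
    """Return (w, cost): strip sigma(p,q;w) plus a thinnest strip on the
    remaining points, where cost = max(2w, f(w))."""
    dists = sorted({dist_to_line(s, p, q) for s in points})

    def f(w):  # width of the points left uncovered by sigma(p,q;w)
        return width([s for s in points if dist_to_line(s, p, q) > w])

    # f is non-increasing and 2w is increasing, so "f(w) <= 2w" is monotone;
    # it holds at the largest candidate, where nothing is left uncovered.
    lo, hi = 0, len(dists) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if f(dists[mid]) <= 2 * dists[mid]:
            hi = mid
        else:
            lo = mid + 1
    # The minimum of max(2w, f(w)) is at the crossover or just before it.
    candidates = [dists[lo]] + ([dists[lo - 1]] if lo > 0 else [])
    return min(((w, max(2 * w, f(w))) for w in candidates),
               key=lambda t: t[1])
```

If (p,q) is an anchor pair of σ1*, the point r of Lemma 2.1 gives a candidate w = d(r, lpq) with 2w ≤ 6w* and all uncovered points inside σ2*, so the cost returned by this sketch is at most 6w*.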

3.2 Computing an anchor pair

Goal: compute a family F of at most 11 pairs of points that contains an anchor pair.

Compute the diameter Δ of S, and let (p,q) be a diametral pair in S. Let Dp and Dq be the disks of radius Δ/2 centered at p and q, respectively.

Case 1: S \ (Dp ∪ Dq) ≠ ∅. Let r ∈ S \ (Dp ∪ Dq) and return F = {(p,q), (q,r), (p,r)}.

Correctness: at least two of the points p, q and r must lie in the same strip of the optimal cover. Since d(p,q) = Δ and d(p,r), d(q,r) ≥ Δ/2, all three distances are at least diam(S)/2, and hence at least half the diameter of any subset of S. So at least one of the three pairs must be an anchor pair of an optimal strip (recall the definition of anchor pairs).


Case 2: otherwise, S \ (Dp ∪ Dq) = ∅.

Let P = S ∩ Dp and Q = S ∩ Dq. Their convex hulls conv(P) and conv(Q) do not intersect.

Compute l1 and l2, the inner common tangent lines of conv(P) and conv(Q).

Let p1 ∈ P and q1 ∈ Q be the points lying on l1, and p2 ∈ P and q2 ∈ Q the points lying on l2.

Let (p3,p4) be a diametral pair in P, and (q3,q4) a diametral pair in Q.

Return F = { (p,q), (p3,p4), (q3,q4),

(p,q1), (p,q2), (p,q3), (p,q4),

(q,p1), (q,p2), (q,p3), (q,p4) }
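Both cases can be put together in a short brute-force Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the points are assumed to be in general position (so that there are exactly two inner tangents), the inner tangents are found by testing every pair (a, b) with a ∈ P and b ∈ Q instead of using convex hulls, and all function names are hypothetical.

```python
import math
from itertools import combinations

def diametral_pair(points):
    """A pair of points at maximum distance (needs at least two points)."""
    return max(combinations(points, 2), key=lambda ab: math.dist(*ab))

def cross(a, b, c):
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def inner_tangent_pairs(P, Q, eps=1e-12):
    """Pairs (a in P, b in Q) whose line has P weakly on one side
    and Q weakly on the other side."""
    out = []
    for a in P:
        for b in Q:
            if a == b:
                continue
            sp = [cross(a, b, x) for x in P]
            sq = [cross(a, b, x) for x in Q]
            if ((max(sp) <= eps and min(sq) >= -eps) or
                    (min(sp) >= -eps and max(sq) <= eps)):
                out.append((a, b))
    return out

def anchor_candidates(S):
    """Candidate family F for a list S of at least two distinct points."""
    p, q = diametral_pair(S)
    radius = math.dist(p, q) / 2.0
    outside = [s for s in S
               if math.dist(s, p) > radius and math.dist(s, q) > radius]
    if outside:                                   # Case 1
        r = outside[0]
        return [(p, q), (q, r), (p, r)]
    P = [s for s in S if math.dist(s, p) <= radius]   # Case 2
    Q = [s for s in S if math.dist(s, q) <= radius]
    tangents = inner_tangent_pairs(P, Q)
    dP = list(diametral_pair(P)) if len(P) >= 2 else []   # p3, p4
    dQ = list(diametral_pair(Q)) if len(Q) >= 2 else []   # q3, q4
    F = [(p, q)]
    if dP:
        F.append(tuple(dP))
    if dQ:
        F.append(tuple(dQ))
    F += [(p, x) for x in [b for _, b in tangents] + dQ]   # (p, q1..q4)
    F += [(q, x) for x in [a for a, _ in tangents] + dP]   # (q, p1..p4)
    seen, out = set(), []
    for a, b in F:                      # drop degenerate and repeated pairs
        key = frozenset((a, b))
        if a != b and key not in seen:
            seen.add(key)
            out.append((a, b))
    return out
```

In general position the returned list has at most 11 pairs, and it can be shorter when some of the points pi or qi coincide. Each pair can then be tried as the anchor pair of phase 1, keeping the best cover found.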

Correctness of Case 2

Write S12* for the points of S covered by σ1* but not by σ2*, and S21* for the points covered by σ2* but not by σ1*.

Suppose, on the contrary, that no pair of F is an anchor pair. Then (p,q) is an anchor pair of neither σ1* nor σ2*, so S12* contains either p or q but not both. WLOG, let p ∈ S12* and q ∈ S21*.

Since d(p,qi), d(q,pi) ≥ Δ/2 (the points lie in different disks), pi ∈ S12* and qi ∈ S21* for i = 1, 2, 3, 4.

S12* ∩ Q ≠ ∅, because otherwise S12* ⊆ P and (p3,p4) would be an anchor pair, a contradiction. Similarly, S21* ∩ P ≠ ∅.

Therefore there exist points p' ∈ S12* ∩ Q and q' ∈ S21* ∩ P.

Correctness of Case 2 (continued)

Strip membership: σ1* contains p, p1, ..., p4 and p'; σ2* contains q, q1, ..., q4 and q'.

Let s be the intersection point of l1 and l2.

Since the strip σ2* contains q1, q2 and q', it also contains the triangle q1q2s. Hence p' does not lie in the triangle q1q2s.

But p' lies in the wedge, so the triangle p1p2p' intersects the segment q1q2 (drawn in green in the figure). Let x be a point of this intersection.

Since σ1* contains the triangle p1p2p', it also contains x. But q1 and q2 do not lie inside σ1*, so σ1* separates q1 and q2. Similarly, σ2* separates p1 and p2.

3.3 (1+ε)-Approximation

Strategy: we have a 6-approximation cover. Within this region, we try to "guess" the optimal strip σ*: its direction (by θ), its displacement (by z) and its width w* (by w).

The result of the guess is a strip σ' whose width is a (1+ε)-approximation of the width of σ* and which covers every point of S lying in σ*.

For the points not covered by σ', we run the known approximation scheme (PTAS) for width to find the second strip covering them.

[Figure: the rectangle R, with labels z, 3w' and 4d(p,q).]

Details

R: the rectangle shown in the figure. The guesses are drawn from three grids Z, Θ and W. One parameter is set to Cε, where C is a constant to be specified later.

Z: grid of "positions" along the boundary of R, with [count missing] grid points on each side of R.

Θ: grid of "directions".

W: grid of values of the "width". [Figure labels: 2w~, w~/6 and (ε/2)·(w~/6).]

Existence of z', θ', w': assuming we know z' and θ', we can perform a binary search on w' (by computing the width of the "rest"). Since we do not know z' and θ', we try all possible pairs of them.
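The "try all pairs, then search on the width" idea can be illustrated with a simplified Python sketch. Everything specific here is an assumption: `positions` and `directions` are caller-supplied stand-ins for the grids Z and Θ (their sizes are not reproduced), the candidate widths are taken from the point set rather than from the grid W, a plain scan replaces the binary search on w', and a brute-force exact width replaces the approximation scheme.

```python
import math
from itertools import combinations

def width(points):
    """Brute-force width of a planar point set."""
    pts = list(set(points))
    best = math.inf
    for (ax, ay), (bx, by) in combinations(pts, 2):
        norm = math.hypot(bx - ax, by - ay)
        offs = [((bx - ax) * (y - ay) - (by - ay) * (x - ax)) / norm
                for x, y in pts]
        best = min(best, max(offs) - min(offs))
    return 0.0 if math.isinf(best) else best

def best_guess(points, positions, directions):
    """Try every (z, theta): the first strip has one boundary line through z
    with direction theta and width w; the second strip is a thinnest strip
    on the uncovered points. Returns (cost, z, theta, w) with the smallest
    cost = max(w, width(rest)) among the guesses."""
    best = (math.inf, None, None, None)
    for z in positions:
        for theta in directions:
            nx, ny = -math.sin(theta), math.cos(theta)   # unit normal
            for sign in (1.0, -1.0):       # the strip may open to either side
                offs = [sign * (nx * (x - z[0]) + ny * (y - z[1]))
                        for x, y in points]
                for w in sorted({o for o in offs if o >= 0.0}):
                    rest = [s for s, o in zip(points, offs)
                            if not (0.0 <= o <= w)]
                    cost = max(w, width(rest))
                    if cost < best[0]:
                        best = (cost, z, theta, w)
    return best
```

The quality of the result depends entirely on how dense the supplied grids are, which is exactly the question addressed by the correctness proof.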

Proof of correctness

Can we find σ' by guessing?

By Lemma 2.1 we know [formula missing], so R is "big enough". The remaining question is whether the grids are "dense enough".

1. First we prove that there exists a "good" θ.

2. Then we prove that there exists a good z.

Proof of Lemma 3.6

1. There is a good θ such that there exists a strip σ with S* ⊆ σ and width(σ) ≤ (1+ε/4)w*.

Case ½ ≤ w~/d(p,q), assuming ε ≤ 2/3.

Case ½ > w~/d(p,q), which implies that the corresponding angle is less than 30°.

2. There is a good z (together with the previous θ) such that there exists a strip σ with S* ⊆ σ and width(σ) ≤ (1+ε/2)w*.

Conclusion

We have a simple and efficient approximation algorithm for 2-line-center.

Possible extensions: k-line-center for fixed k, and higher dimensions (hyper-strips).
