17
Randomized Greedy Matching Martin Dyer* School of Computer Studies, University of Leeds, Leeds, United Kingdom Alan Frieze? Department of Mathematics, Carnegie-Mellon University, Pittsburgh, PA 152 13-3890 ABSTRACT We consider a randomized version of the greedy algorithm for finding a large matching in a graph. We assume that the next edge is always randomly chosen from those remaining. We analyze the performance of this algorithm when the input graph is fixed. We show that there are graphs for which this Randomized Greedy Algorithm (RGA) usually only obtains a matching close in size to that guaranteed by worst-case analysis (i.e., half the size of the maximum). For some classes of sparse graphs (e.g., planar graphs and forests) we show that the RGA performs significantly better than the worst-case. Our main theorem concerns forests. We prove that the ratio to maximum here is at least 0.7690. . . , and that this bound is tight. 1. INTRODUCTION Perhaps the simplest heuristic for finding a large cardinality matching in a graph G = (V, E) is the “Greedy Heuristic.” GREEDY MATCHING begin M +0; while E( G) # 0 do * Supported by NATO Grant RG0088/89. t Supported by NSF Grant CCR-8900112 and NATO Grant RG0088/89. Random Structures and Algorithms, Vol. 2, No. 1 (1991) 0 1991 John Wiley & Sons, Inc. CCC 1042-9832/91/010029-17$04.00 29

Randomized greedy matching

Embed Size (px)

Citation preview

Page 1: Randomized greedy matching

Randomized Greedy Matching

Martin Dyer* School of Computer Studies, University of Leeds, Leeds, United Kingdom

Alan Frieze? Department of Mathematics, Carnegie-Mellon University, Pittsburgh, PA 152 13-3890

ABSTRACT

We consider a randomized version of the greedy algorithm for finding a large matching in a graph. We assume that the next edge is always randomly chosen from those remaining. We analyze the performance of this algorithm when the input graph is fixed. We show that there are graphs for which this Randomized Greedy Algorithm (RGA) usually only obtains a matching close in size to that guaranteed by worst-case analysis (i.e., half the size of the maximum). For some classes of sparse graphs (e.g., planar graphs and forests) we show that the RGA performs significantly better than the worst-case. Our main theorem concerns forests. We prove that the ratio to maximum here is at least 0.7690. . . , and that this bound is tight.

1. INTRODUCTION

Perhaps the simplest heuristic for finding a large cardinality matching in a graph G = (V, E ) is the “Greedy Heuristic.”

GREEDY MATCHING

begin M +0; while E( G ) # 0 do

* Supported by NATO Grant RG0088/89. t Supported by NSF Grant CCR-8900112 and NATO Grant RG0088/89.

Random Structures and Algorithms, Vol. 2, No. 1 (1991) 0 1991 John Wiley & Sons, Inc. CCC 1042-9832/91/010029-17$04.00

29

Page 2: Randomized greedy matching

30 DYER AND FRIEZE

begin A: Choose e = { u , u } E E

G + G\{ u, u } ; M + M U { e }

end; Output M end

The choice of e in statement A is unspecified. It is known [3] that, if the worst possible choices are made in A, the size of the matching M produced is at least one half of the size of the largest matching, and one half is attainable. (Consider choosing the middle edge of a path of length three.)

Now randomization sometimes improves the performance of algorithms (per- haps the most important example being primality testing). The question we pose here is what effect does randomizing statement A have? In particular, if e is chosen uniformly at random from the remaining edges, what is the expected ratio of the size of M to that of the maximum matching? We prove that there are graphs for which the average-case is hardly better than the worst-case, but also that there are classes of graphs (e.g., planar graphs) for which it is significantly better.

2. NOTATION

Let G = (V, E) be a (simple) graph with IVI = n. For any u E V, r(u) denotes its neighbours in G. For any S G V, G\S denotes the subgraph induced by the vertex set V\S. Let m(G) be the maximum size of a matching in G and p(G) be the expected size of the randomized greedy matching. Let

r(G) = p ( G ) / m ( G ) if m(G) > O

if m( G) = 0 = 1

If X is any class of graphs p ( X ) = infGexr(G). Unless otherwise stated, % will denote any class of graphs closed under vertex deletions and (to avoid trivialities) we suppose 1E( > 0 for some G E %.

K ( % ) = inf { ( V l / l E / : G = (V, E ) , (El > 0 ) G E 3

Note that since some G E % has an edge, and % is closed under deletions, the graph containing a single edge lies in 3. Thus 0 5 ~ ( % ) 1 2 for any %. In particular K(GRAPHS) = 0, K(PLANAR GRAPHS) = 4 , K(FORESTS) = 1. The abbrevia- tion RGA is used for “Randomized Greedy Algorithm.”

i

3. A MONOTONICITY PROPERTY

Many of our results depend on the following

Page 3: Randomized greedy matching

RAN DOMlZE D MATCH I NC 31

Proof. The statement clearly holds for G = ({ u } , 0) and we argue by induction on IVI. L e t H = G \ { u } , A = { x y E E : u ~ { x , y } } andd=Ir(u)I. Then

2 P ( H )

Conversely,

by induction.

= 1 + p ( H ) . H

Corollary 1. Let u E V be exposed in some maximum matching of G , then

r ( G \ { u } ) 5 r (G) .

Proof. Clearly, m(G\{ u } ) = m ( G ) , so the result follows from Lemma 1. H

Corollary 2. perfect matching. Then p ( '3) = p ( X ) .

Let X c '3 be the set of G E '3 which are connected and contain a

Proof. Clearly, p ( X ) p ( ' 3 ) . By Corollary 1 (applied repeatedly if necessary), any G E '3 can be reduced to a G' which contains a perfect matching and has r(G') 5 r (G) . If G' has components G : ( i = 1, . . . , c), let H = GI where r(C;) = min,,,,,r(G~). Clearly, H E X and r ( H ) I r(G') I r (G) . Thus p ( H ) I p(G) . H

Page 4: Randomized greedy matching

32 DYER AND FRIEZE

In particular we have the following, which we use below:

~(FORESTS) = TREES WITH A PERFECT MATCHING)

We note in passing that monotonicity under edge deletions does not hold. As a simple example, let G be a path of three edges. Then p = $, but, when the middle edge is deleted, p = 2 .

4. A LOWER BOUND

We give a weak, but easily proved, lower bound and examine its consequences.

Theorem 1. Let a(%)= 1 / ( 2 - f~(%)). Then p ( % ) Z a ( % ) .

Proof. IVI = 0, r(G) = 1 and hence r (G) 2 a(%).

By induction on IV(. Since 0 5 ~ ( % ) 5 2 we have f 5 a ( % ) l l . If

Since (by Corollary 2 ) we may assume G has a perfect matching we take (VI = 2m(G) > 0. Now

1 p(G) = 1 + - c p(G\{u, u } ) . ( E l u v E E

However.

m(G\{u, u } ) = m ( G ) - 1 if uu lies in some perfect matching

= m ( G ) - 2 otherwise.

Thus, where m = m ( G ) ,

Hence, using the inductive hypothesis,

1 2 1 + - a(lEl(m - 2 ) + m)

IEI

completing the induction.

Page 5: Randomized greedy matching

RANDOMIZED MATCHING 33

Corollary 3. ~ ( G R A P H S ) 2 4, PLANAR GRAPHS) 2 f t , ~(FORESTS) 2 3 , p(A- GRAPHS) 2 Al(2A - l ) , where A-GRAPHS stands for graphs with maximum degree at most A. So, in particular, CUBIC GRAPHS) 2 3 . rn

5. THE CLASS GRAPHS

Theorem 2. ~ ( G R A P H S ) = a . Proof. to each vertex of the complete graph K,.

Let G, be the graph obtained by adding a new vertex and edge adjacent

Clearly, m(G,) = m, and write p(G,) = p,. Consider the first step of the RGA on G,. There are ( 7 ) + m edges. Thus, with probability

2 - -- m ( y ) + m m + l '

we choose an added edge. Its removal leaves G,,-l . Otherwise we choose a K , edge whose removal leaves G,-2 (and two isolated vertices). Thus the final matching size will be, in expectation,

2 with probability - m + l 1 + pm-I

m - 1 and 1 + pmP2 with probability - m + l '

Thus,

with p,, = 0, p1 = 1 . Writing this as

we make the substitution u, = p,,, - pm-l and uo = po. Thus uo = 0 , u1 = 1 , and p, = Cy=o u j , and from ( l ) ,

It is easy to show inductively that (2) has solution, for m 2 1 :

um = 7 1 + & ( m odd) - t + L

2(m + 1) ( m even) -

Let L, = 1 + f + + . . . + for m odd. Thus,

Page 6: Randomized greedy matching

34 DYER AND FRIEZE

m

p, = C. ui = f m - f + L, j = O

(m odd)

= f m - 4 + Lm-l + q&j ( m even)

Asymptotically L, = f ( y + log 2m), where y is Euler’s constant. So

p, = f ( m + log2m + y - 1) + o(1).

Thus r(G,) = $ + O(1og m / m ) and r(C,)+ f as m-m.

6. CONCENTRATION NEAR THE MEAN

We now show that the value of the matching obtained by the RGA is “almost always” near its expectation.

Theorem 3. the random size of the matching obtained by the RGA in G. Then

Let G be a graph with m = m ( G ) , p = p(G) and let X = X ( G ) be

Pr(IX - p I > Em) 5 2e-2e2m

Proof. Let Yj7 ( i = 0,1, . . . , rn) be the Doob martingale induced by the first i choices of the RGA on G , i.e., Yj = E(X1first i choices). Clearly, Y j = K + p ( H ) for some integer K 5 i and subgraph H of G. In fact K = i unless H = 0. Also

Y j + l = K + 1 + E( p(m{ u, u } ) if H contains an edge,

= K otherwise,

where the expectation is over the random choices of the edge uu. Thus,

Yi+l - Y j = 1 + E( p(H\{ u, u } ) - p ( H ) ) if H contains an edge,

= O otherwise.

Thus if H contains an edge,

where all inequalities follow from Lemma 1.

Page 7: Randomized greedy matching

RANDOMIZED MATCHING 35

Thus lYi+, - Yil I 1 whether or not H has an edge. Hence { Yi } is a bounded difference martingale sequence, and it follows from the Hoeffding-Azuma inequality (see Bollobis [l] and McDiarmid [4]) that

Pr(IX - pi > Em) I 2e-2(em)2'm - - 2e-2~2m

Corollary 4. w, + m (arbitrarily slo wly) , then

If {G,} k a graph sequence such that rn(G,)(=m)+m, and

Pr( p(Gm) - w m f i I X(G, ) I p(G,) + urn-)-+ 1

Proof. Put E = urn/- in Theorem 3. m

Corollary 5. If { G,} is the graph sequence defined in the proof of Theorem 2, let X(G,) be the best solution obtained from any polynomial number p(m) of repetitions of the RGA on G,. Then

Pr(fm 5 X(G,) I f m + log m / f i ) + 1 as m + m .

Proof. &(G,) 2 f m follows from the worst-case result (Korte and Haussman [3]). Putting E = log m/- in Lemma 3, the probability of X(G,) not falling in the required interval is at most 2p(m)e-*"Og ,)* -0 as m A m .

7. A M O N O T O N E TRANSFORMATION

Deletion of exposed vertices does not increase r(G). We consider another transformation with this property. Let { u, u } be an edge in a maximum matching of G which does lie in any triangle. Let G' be the graph obtained by substituting all edges vw(w E T(u)\{ u } ) with uw.

Note the restriction that uu does not lie in a triangle ensures that G' is a simple graph.

Lemma 2. r(G') I r(G).

Proof. Clearly, m(G') = m(G) so we only need show p(G') I p(G). We pro- ceed by induction on the number of vertices of G. The base case is when G is a single edge. Then G' = G and the result is trivial. For the induction, let A = {xy: { x , y } n { u , u } = O } and for w = u, u let f ( w ) = T(w)\{u, u } . Note that f(u), f ( u ) are fixed sets here, defined in G and unaltered in G'. Moreover, they are disjoint by assumption. Thus,

Page 8: Randomized greedy matching

36 DYER AND FRIEZE

where we have used the following:

(i) (G\{x, y})’ = G’\{x, y} if xy E A and so (by induction) p(G\{x, y}) 2

(ii) G\{ u, u } = G’\{ u , u } ;

(iii) (G’\{ u, x})’ = G\{ u, u , x} for x E f ( u ) U f ( u ) (after removing the isolated vertex u in G’). So p(C’\{u, x}) I p(G\{u, x}) and p(G’\{u, x}) 5

Let us denote this transformation by u: GRAPHS- GRAPHS, i.e., G’ = u ( G ) . Let %* be any graph-family which is also closed under u. Let X * be the subfamily of %* such that any G E X * is connected, has a perfect matching, and such that every edge in any perfect matching either contains a vertex of degree 1 or lies in a triangle. Then,

P((G\{X, Y } ) ’ ) = P(G’\{X? Y } ) ;

p(G\{u, x}) follow from Lemma 1 for these values of x. w

Corollary 6 . p ( X * ) = p ( % * ) . w

This Corollary is useful for FORESTS, since it implies we may assume

(i) The graph is a tree (ii) All edges in the maximum matching are leaves.

(iii) Every internal vertex is adjacent to exactly one leaf.

To see this let T be a tree for which (ii) or (iii) does not hold. If vertex u is adjacent to leaves w l , w 2 , . . . , w d , d 2 2 then deleting w2, . . . , wd yields a tree T‘ for which m ( T ’ ) = m ( T ) and p ( T ’ ) 5 p ( T ) (Lemma 1.) If u is a vertex not adjacent to any leaf, then we can assume that it lies on an edge uu of some perfect matching. We can then apply u and if necessary reduce the number of leaf neighbours of u in c( T ) to one. After a finite number of iterations of the above procedure we satisfy (i), (ii), (iii) without increasing r.

A tree satisfying (ii) and (iii) will be called an L-tree.

For FORESTS, Corollary 3 gives p 2 f , but this is not tight. In this section we prove

Theorem 4. = 0.7690397. . . (where n ! ! =

x

~(FORESTS) = (Y = f + 2 (-2)k k = O ( 2 k + 5 ) ! !

n(n - 2)(n - 4) * * * 3 * 1 for n odd) . w

We first establish the upper bound.

Page 9: Randomized greedy matching

RANDOMIZED MATCHING 37

Lemma 3. ~(FORESTS) 5 a.

Proof. m-vertex path. Let tm = p(Tm). Thus to = 0, t , = 1. Clearly, for m 2 2,

Let Tm be the graph obtained by adding a leaf to each vertex of an

m m - l 1 t m = l + - ( c ( f ; - l + t m - i ) + c ([;-I + f m - i - 1 ) ) 2 m - 1 i = I

ti + x2 I ; ) 2 m - 1 ( F ~ i = O

2 m - l = 1 + - (3)

From (3),

m - I m - 2

for m 2 3, (2m - l)tm = (2m - 1 ) + 2( c t i + c f , ) , i = n i = 0

and also

Subtracting,

or

(2m - l ) ( t m - t m - l ) =2(1 + f m - 2 ) . (4)

In fact, (4) holds also for m = 2 since t , = $ from (3). Let

m 2 u m = t m - t m - , , u,,=tO, s o t m = C u i , andu,=O, u , = l , u 2 = 5 .

i = O

So, from (4),

m - 2

( 2 m - l ) u m = 2 ( 1 + 2 Ui) ( m 1 2 )

Thus,

Subtracting,

(2m - l )um - (2m - 3)um-, = 2 ~ , - ~ , (m 2 3)

Page 10: Randomized greedy matching

38 DYER A N D FRIEZE

(2m - l ) ( u , - u r n - l ) = -2(urn-, - urn-2) ( 5 )

Let rn

urn = urn - u r n - l , uo = uo, so urn = C ui and uo = 0, u1 = 1 , u = - 2 3 . i = O

So, from (5) -2

urn = urn-] ( m ~ 3 ) .

Thus, - (- 2)" - 2

urn = ( m r 3 ) . (2m - l ) ! !

Therefore, m

... ~

= 3 + 2 2 ( -2)k / (2k + 5 ) ! ! (m r 3 ) , k = O

with

Now

i.e.

u , = o , u , = l , u 2 = j . 2

rn

= C ( m + l - j ) u j

= ( m + 1)

j = O

rn rn

uj - C juj j = O j = O

rn

= (m + l ) u , - 4 C ((2j - 1 ) + 1)uj

= (m + $)urn - 1 C (2j - 1)uj

= (m + $)urn - 1 C (2j - 1 ) u j ,

= ( m + $)urn + uj , from (6) ,

j = O

rn

j = O

rn

j = 3

m-1

j = 2

= (m + $)urn + urn - urn - 1

trn= mu, + ($urn - u r n - I ) , ( m r 2 ) .

Page 11: Randomized greedy matching

RANDOMIZED MATCHING 39

For large m, u, is close to a and u, is very small. Therefore let us define E , by

t , = m a + p + E , ( m z l ) , (7)

where

and, for m 2 2,

Em = (m + $)(urn - a) - u, .

It follows from Lemma 4 below that E , = O( h), and thus

so, in particular,

r (T , )+a as m+m. 8

Numerically, the first two terms in (7) are an excellent approximation to t,.

Our lower bound argument requires knowledge of the behavior of the se- When m = 10, for example, the error term

quence { E , } . Its first few terms are (approximately)

is less than lo-'.

= +0.0774006, eZ = -0.0249725 , E~ = +0.0059878, (9)

E~ = -0.0011472, E~ = +0.0001834.

Lemma 4. E , = (m + $ ) ( u , + ~ - a).

Proof. For m = 1, this may be verified directly.

For m 1 2 ,

Em = (m + $)(urn - a) - u, ,

and the lemma amounts to the claim that

But, from (6), this is the obvious identity

4 l = - ( m + $ ) ( - -2 +

2 m + 1 (2m+1)(2m+3) 8

Corollary 7. (a) ~ ~ ~ - ~ > O a n d eZk<O ( k = 1 , 2 , ...) (b) 1~,+~1121~,,,,1/(2m + 3) (m = 1 , 2 , . . .), and hence {(2m - l)\E,,,l} is a

decreasing sequence.

Page 12: Randomized greedy matching

40 D Y E R AND FRIEZE

Proof.

the ui alternate in sign and decrease strictly in absolute value. (a) These inequalities follow from Lemma 4, since E , = - ( m + t ) C T = m + 3 ~ i ,

(b) The case m = 1 can be verified by direct calculation. For m 2 2,

and so, using (6)

We now prove the lower bound for FORESTS. We prove that the worst-case examples for forests are the trees T , of the previous lemma. Let 5 = { T , : m = 1 ,2 , . . .}.

Lemma 5. Let T be an L-tree with 2m vertices. Then

Proof. We establish by induction a bound

provided T # T , ( 6 : = max(0, E , } . ) This will, of course, prove the lemma. We next define

2; = ei if i = 1 or i is even

= 0 otherwise.

Thus suppose e = uu and n{ u, u } has components { Ci : 1 5 i I k,} which contain at least one edge. Now define S(C,) (i = 1 ,2 , . . . , k,) by

S(Ci) = 2,. if Ci = T,,

= O otherwise

where m' = m(C,) , and let ye = Efz l S(Ci) for e E E. (The precise form of this definition may seem curious, but will be justified later in our proof.) Now, by induction

Page 13: Randomized greedy matching

RANDOMIZED MATCHING 41

Hence

using the fact that T is an L-tree. So

Now let

A,(T) = ( p ( T ) - (ma + p + ~;))(2rn - 1)

We must show that A,(T)?O. But (10) and ( 1 1 ) imply

A , ( T ) ? 2 m - 1 + 2 ( m - 1 ) 2 ~ + P C k , + C ye e E E e E E

- (2m - l)(ma + P + E : ) (12)

= 2 a + p - 1 - ( 2 m - l ) ~ : - 4 m P + p C k , + C 7,. e E E e E E

2 2 a + / 3 - 1 - 9 ~ ~ - 4 m p + P C k e + Z y e . e E E P E E

by Corollary 7. Now let A,( T) denote the right-hand side of (13). We show that A,(T) 2 0. We prove this by induction on m. Note that, if m < 4 there is nothing to prove. Now our base cases will be, for each m ? 4, the trees S(a, b, c) where a, b, c are positive integers and a + b + c + 1 = m. The tree S(a, b, c) consists of a central vertex u , paths u , x , , x , - , , . . . , xl, u , y b , . . , y, and u , z c , z C - , , . . . , z1 plus leaves w, x i , . . . , x,, y , , . . . , y ; , z ; , . . . , z : where u is adjacent to w, x i is adjacent to xi and so on.

I f

We first evaluate C e E E k e . This is the sum of terms

CONTRIBUTION EDGES

2 ( ( a - l ) + ( b - l ) + ( c - l ) ) x ix i ( i>2) , . . . 2((a - 2) + ( b - 2) + (c - 2)) x i x i + l (i 2 2), . . .

6

3 uw

x l x ; and x ,xz , . . . 9 ux,, . . .

which sum to 4(a + b + c) = 4(m - 1). The above analysis only applies if a, b, c 2 2. On the other hand, if say a = 1,

then the path ux , contributes 1 + 2 = 2(a - 1) + 2(a - 2) + 2 + 3 and so C e E E k , = 4(m - 1) in this case also.

We must now compute C e E E ye. This comprises terms

Page 14: Randomized greedy matching

42 DYER AND FRIEZE

P

i = l

( y+. . . ) > € l + € 2 + € 4 1 + - + - ( 143 143 = el + e2 + 143e4/139

2 0.05, by direct calculation.

Thus

and hence

A , ( T ) 2 2 a + p - 1 - 9 ~ , - 4mp + 4(m - 1)p + 3(E2 + €4) + .2

> 0 , by direct calculation.

(See Theorem 4, (8) and (9) for numerical estimates of the relevant quantities.) We now have a basis for our induction. So suppose that T # T,,, and T is not of

the form S(a, b, c). Let f be obtained from T by deleting all leaves. Let x be a leaf of f. Consider the path from x in f, P = { x l ( = x ) , x 2 , . . . , x k , y}, where the degree of xi (in 5 is 2, (i = 2,3 , . . . , k) and the degree of y is d + 1 (d h 2). Now, in T, let x i , i = 1,2, . . . , k be the leaf neighbors of x i , y' be the leaf neighbor of y, zl, z2, . . . , z,, be the nonleaf neighbors of y and finally, let z : , i = 1 , 2 , . . . , d be the leaf neighbors of z ; . Let T ' = T\P (after deleting all isolated vertices). Then T' # Tm-& as T is not of the form S(a, b, c). Let now

Page 15: Randomized greedy matching

RANDOMIZED MATCHING 43

D = A,(T) - A,(T’). Then

D = (Dl - 4k)P + D, where

and

and where E’ is the edge set of T’ and k : , 7: stand for k, , ye in T’. We will now justify our definition of 7,. For each component Ci associated with

edge e in T , there is a (possibly edgeless) component Ci = Ci fl E’ in T’. Now (ye - y : ) is the sum of terms (S(Ci) - S(Ci)) for Ci f Ci. Now we certainly have (S(Ci) - S(Ci)) 2 S(Ci) unless Ci = T,. But clearly, Cj = T , implies Ci = T , so the inequality holds regardless. Therefore we may justifiably use the bound

Y,-Y:’ c S ( C ; ) . c; # ci E J

The consequence is that we do not need to consider the effect of the “destruction” of members of F in T‘, only their “creation” in T. This allows combination of what might otherwise be distinct cases in the induction. We use these ideas below without further comment.

Now, when k 2 2,

D, = 1 + 1 + 2(k - 1) + 2(k - 2) + (d + 1) + 1 + d

= 4k + 2d - 2 .

The successive terms in the expression for D, are the contributions due, respectively, to x , x ; , x ,x , , x ix i (2 I i 5 k), xixi+, (2 5 i I k - I), xky, yy’and yzi (1 5 i I d). The last two terms are increases ( k , - k:) . For all other edges e E E‘, it is clear that k, = k: . The expression 4k + 2d - 2 is also valid for k = 1, as the reader can easily check.

We now turn to D,. If y is deleted from T then T breaks up into Tk plus d subtrees H,, H, , . . . , H d , say. Let us assume, without loss, that Hi€ T for 1 5 i I s, for some 0 5 s I d. Note that, if d = 2, then s < 2 since T is not an S(a, b, c). We consider two cases.

Case 1. d r 3 or d = 2 , s = O .

For k r 2,

D, = 2;, + 22, + * - * + 22kk-2 + g k - , + ( g k k - , + g;, + * * i;) gk + d;k .

The terms here arise as follows. The terms 2 4 , (1 I i I k - 2) come from x;+,x:+,

Page 16: Randomized greedy matching

44 DYER AND FRIEZE

and comes from yy’ and, finally, dfk comes from y z i , (1 5 i 4 d ) . So

Z k - l comes from x k x ; ; ( fkk- l + Zil + * * . + f i ) comes from x k y l ; f k

k

D, = 2 C f i + (d - l)Zk + Oil + . . * + Zis . i = l

The same expression is also valid for k = 1. Thus, from (15),

k

D 2 (2d - 2)p + 2 C Zi + (d - l)Zk + ti, + . . . + Zi,? i = l

2 0.1 + (d - l)(2p + f k + d€, / (d - 1))

2 0.1 + (d - l)(2p + 3 4

> (d - 1 ) / 5

> O f

Now, by the induction hypothesis, A l ( T’) 2 0 and so A l ( T) 2 0 is immediate.

Case 2. d = 2, s = 1.

Let the degree of z2 be d’ + 2, where d’ 2 1 (since T is not of the form S ( a , b , c)). Let w I , . . . , wdz be the neighbors of z2 other than y and z ; . Assume first that d ’ r 2 . Now the only difference from Case 1 is that Y,,,~ and yzzw,, (i = 1,2, . . . , d‘) contain new contributions Z k + p + l z e4, and so

(16) D 2 0.1 + 2p + 3e2 + (d’ + .

If d ’ I 10 then D > 0, by direct calculation, and so A l ( T) 2 Al(T’) 2 0 as in the previous case. For d’ > 10 we could have D < 0 and so we let T = T’\(H, U { y}), the tree obtained by deleting vertex y and HI from T’. By the analysis of the previous case A l ( T ’ ) 2 A 1 ( T ) + (d’ - 1)/5 2 (d‘ - 1)/5, since T” is clearly not in 9 and is not an S(a, b, c). Hence, from (16),

Al(T) 2 0.1 + 2p + 3e2 + 2e4 + (d’ - 1)(0.2 + e4)

2 0.1 + 2p + 3 E 2 + 26, + lO(0.2 + E4) > 0 by direct calculation.

This completes the induction, and the proof of the lemma.

9. CONCLUDING REMARKS

We have established the worst case examples for FORESTS, but we have little idea for PLANAR GRAPHS. The lower bound $ is almost certainly not tight, but currently our best upper bound is G, from Theorem 2 which gives PLANAR GRAPHS) 5 g. This leaves a large gap.

Page 17: Randomized greedy matching

RANDOMIZED MATCHING 45

A randomized greedy algorithm could also be applied to the problem of finding a large H-matching in a graph G. We consider H to be some fixed graph like a triangle, and an H-matching of G is a collection of vertex-disjoint copies of H in G . Finding the largest H-matching is NP-hard for all graphs H with at least three vertices (Kirkpatrick and Hell [2]). One difficulty with extending our analysis to this case is that the analogue of Lemma 1 fails to hold in general. Consider, for example, H to be a path of length two, and compare the performance of the RGA on two paths of length two with its performance on a path of length six.

Another possible generalization is to weighted problems. For example, in the weighted matching problem, we might consider choosing the next edge with probability proportional to its edge weight.

REFERENCES

[ 11 B. Bollobas, Martingales, isoperimetric inequalities and random graphs, Coll. Math. SOC. J . Bolyai, 52, 113-139 (1987).

[2] D. G . Kirkpatrick and P. Hell, On the complexity of a generalised matching problem, Proceedings of the 10th Annual ACM Symposium on Theory of Computing, ACM, New York, 1978, pp. 240-245.

[3] B. Korte and D. Hausmann, An analysis of the greedy algorithm for independence systems, in Algorithmic Aspects of Cornbinatorics, B. Alspach, P. Hall, and D. J. Miller, Eds., Annals of Discrete Mathematics 2, 65-74 (1978).

[4] C. J . H . McDiarmid, On the method of bounded difference, in Surveys in Com- binatorics, Proceedings of the Twelfth British Combinntorial Conference J . Siemons, Ed., 1989, pp. 148-188.

Received February 23, 1990