Constrained Non-Monotone Submodular Maximization ...In another line of work, [Jen76, KH78, HKJ80] showed that the greedy algorithm is a p-approximation for maximizing a modular (i.e.,

arX

iv:1

003.

1517

v2 [

cs.D

S]

6 O

ct 2

010

Constrained Non-Monotone Submodular Maximization:Offline and Secretary Algorithms

Anupam Gupta∗ Aaron Roth† Grant Schoenebeck‡ Kunal Talwar§

Abstract

Constrained submodular maximization problems have long been studied, most recently in the context of auc-tions and computational advertising, with near-optimal results known under a variety of constraints when thesubmodular function ismonotone. The case of non-monotone submodular maximization is less well understood:the first approximation algorithms even for the unconstrained setting were given by Feige et al.(FOCS ’07). Morerecently, Lee et al.(STOC ’09, APPROX ’09)show how to approximately maximize non-monotone submodularfunctions when the constraints are given by the intersection of p matroid constraints; their algorithm is based onlocal-search procedures that considerp-swaps, and hence the running time may benΩ(p), implying their algorithmis polynomial-time only for constantly many matroids.

In this paper, we give algorithms that work forp-independence systems(which generalize constraints givenby the intersection ofp matroids), where the running time is poly(n, p). Both our algorithms and analyses aresimple: our algorithm essentially reduces the non-monotone maximization problem to multiple runs of the greedyalgorithm previously used in the monotone case. Our idea of using existing algorithms for monotone functionsto solve the non-monotone case also works for maximizing a submodular function with respect toa knapsackconstraint: we get a simple greedy-based constant-factor approximation for this problem.

With these simpler algorithms, we are able to adapt our approach to constrained non-monotone submodularmaximization to the(online) secretary setting, where elements arrive one at a time in random order, and thealgorithm must make irrevocable decisions about whether ornot to select each element as it arrives. We giveconstant approximations in this secretary setting when thealgorithm is constrained subject to a uniform matroidor a partition matroid, and give anO(log k) approximation when it is constrained by a general matroid ofrankk.

∗Carnegie Mellon University, Pittsburgh, PA. Email:[email protected]†Microsoft Research New England, Cambridge, MA. Email:[email protected]‡UC Berkeley, Berkeley, CA. Email:[email protected]§Microsoft Research, Mountain View, CA. Email:[email protected]

http://arxiv.org/abs/1003.1517v2

1 Introduction

We present algorithms for maximizing (not necessarily monotone) non-negative submodular functions satisfyingf(∅) = 0 under a variety of constraints considered earlier in the literature. Lee et al. [LMNS10, LSV09] gave thefirst algorithms for these problems via local-search algorithms: in this paper, we consider greedy approaches thathave been successful formonotonesubmodular maximization, and show how these algorithms canbe adapted verysimply to non-monotone maximization as well. Using this idea, we show the following results:

• We give anO(p)-approximation for maximizing submodular functions subject to ap-independence system.This extends the result of Lee et al. [LMNS10, LSV09] which applied to constraints given by the intersectionof p matroids, wherep was a constant. (Intersections ofp matroids givep-indep. systems, but the converse isnot true.) Our greedy-based algorithm has a run-time polynomial in p, and hence gives the first polynomial-time algorithms for non-constant values ofp.

• We give a constant-factor approximation for maximizing submodular functions subject to a knapsack con-straint. This greedy-based algorithm gives an alternate approach to solve this problem; Lee et al. [LMNS10]gave LP-rounding-based algorithms that achieved a(5 + ǫ)-approximation algorithm for constraints given bythe intersection ofp knapsack constraints, wherep is a constant.

Armed with simpler greedy algorithms for nonmonotone submodular maximization, we are able to perform con-strained nonmonotone submodular maximization in several special cases in the secretary setting as well: when itemsarrive online in random order, and the algorithm must make irrevocable decisions as they arrive.

• We give anO(1)-approximation for maximizing submodular functions subject to a cardinality constraint andsubject to a partition matroid. (Using a reduction of [BDG+09], the latter impliesO(1)-approximations toe.g., graphical matroids.) Our secretary algorithms are simple and efficient.

• We give anO(log k)-approximation for maximizing submodular functions subject to an arbitrary rankk ma-troid constraint. This matches the known bound for thematroid secretary problem, in which the function tobe maximized is simply linear.

No prior results were known for submodular maximization in the secretary setting, even formonotonesubmodularmaximization; there is some independent work, see§1.3.1for details.

Compared to previous offline results, we trade off small constant factors in our approximation ratios of ouralgorithms for exponential improvements in run time: maximizing nonmonotone submodular functions subject to(constant)p ≥ 2 matroid constraints currently has a( p2p−1 + ǫ) approximation due to a paper of Lee, Sviridenkoand Vondrák [LSV09], using an algorithm with run-time exponential inp. For p = 1 the best result is a3.23-approximation by Vondrák [Von09]. In contrast, our algorithms have run time only linear inp, but our approximationfactors are worse by constant factors for the small values ofp where previous results exist. We have not tried tooptimize our constants, but it seems likely that matching, or improving on the previous results for constantp willneed more than just choosing the parameters carefully. We leave such improvements as an open problem.

1.1 Submodular Maximization and Secretary Problems in an Economic Context

Submodular maximization and secretary problems have both been widely studied in their economic contexts. Theproblem of selecting a subset of people in a social network tomaximize their influence in a viral marketing campaigncan be modeled as a constrained submodular maximization problem [KKT03, MR07]. When costs are introduced,the influence minus the cost gives usnon-monotonesubmodular maximization problems; prior to this work,onlinealgorithms for non-monotone submodular maximization problems were not known. Asadpour et al. studied theproblem of adaptive stochastic (monotone) submodular maximization with applications to budgeting and sensorplacement [ANS08], and Agrawal et al. showed that thecorrelation gapof submodular functions was bounded bya constant using an elegant cost-sharing argument, and related this result to social welfare maximizing auctions[ADSY09]. Finally, secretary problems, in which elements arrivingin random order must be selected so as tomaximize some constrained objective function have well-known connections to online auctions [Kle05, BIK07,

1

BIKK07, HKP04]. Our simpleroffline algorithms allow us to generalize these results to give the first secretaryalgorithms capable of handling a non-monotone submodular objective function.

1.2 Our Main Ideas

At a high level, the simple yet crucial observation for the offline results is this: many of the previous algorithms andproofs for constrained monotone submodular maximization can be adapted to show that the setS produced by themsatisfiesf(S) ≥ βf(S ∪ C∗), for some0 < β ≤ 1, andC∗ being an optimal solution. In the monotone case, theright hand side is at leastf(C∗) = OPT and we are done. In the non-monotone case, we cannot do this. However,we observe that iff(S ∩ C∗) is a reasonable fraction ofOPT, then (approximately) finding the most valuable setwithin S would give us a large value—and since we work with constraints that are downwards closed, finding such aset is justunconstrainedmaximization onf(·) restricted toS, for which Feige et al. [FMV07] give good algorithms!On the other hand, iff(S ∩C∗) ≤ ǫOPT andf(S) is also too small, then one can show that deleting the elementsin S and running the procedure again to find another setS′ ⊆ Ω \ S with f(S′) ≥ βf(S′ ∩ (C∗ \ S)) wouldguarantee a good solution! Details for the specific problemsappear in the following sections; we first consider thesimplest cardinality constraint case inSection 2to illustrate the general idea, and then give more general results inSections 3.1and3.2.

For the secretary case where the elements arrive in random order, algorithms were not known for the monotonecase either—the main complication being that we cannot run agreedy algorithm (since the elements are arrivingrandomly), and moreover the value of an incoming element depends on the previously chosen set of elements.Furthermore, to extend the results to the non-monotone case, one needs to avoid the local-search algorithms (which,in fact, motivated the above results), since these algorithms necessarily implement multiple passes over the input,while the secretary model only allows a single pass over it. The details on all these are given inSection 4.

1.3 Related Work

Monotone Submodular Maximization.The (offline) monotone submodular optimization problem hasbeen longstudied: Fisher, Nemhauser, and Wolsey [NWF78, FNW78] showed that the greedy and local-search algorithmsgive a (e/e − 1)-approximation with cardinality constraints, and a(p + 1)-approximation underp matroid con-straints. In another line of work, [Jen76, KH78, HKJ80] showed that the greedy algorithm is ap-approximation formaximizing amodular (i.e., additive) function subject to ap-independence system. This proof extends to show a(p+1)-approximation for monotone submodular functions under the same constraints (see, e.g., [CCPV09]). A longstanding open problem was to improve on these results; nothing better than a2-approximation was known even formonotone maximization subject to a single partition matroid constraint. Calinescu et al. [CCPV07] showed how tomaximize monotone submodular functions representable as weighted matroid rank functions subject to any matroidwith an approximation ratio of(e/e − 1), and soon thereafter, Vondrák extended this result toall submodular func-tions [Von08]; these highly influential results appear jointly in [CCPV09]. Subsequently, Lee et al. [LSV09] givealgorithms that beat the(p+ 1)-bound forp matroid constraints withp ≥ 2 to get a( p2p−1 + ǫ)-approximation.

Knapsack constraints.Sviridenko [Svi04] extended results of Wolsey [Wol82] and Khuller et al. [KMN99] toshow that a greedy-like algorithm with partial enumerationgives an(e/e − 1)-approximation to monotone sub-modular maximization subject to a knapsack constraint. Kulik et al. [KST09] showed that one could get essen-tially the same approximation subject to a constant number of knapsack constraints. Lee et al. [LMNS10] give a5-approximation for the same problem in the non-monotone case.

Mixed Matroid-Knapsack Constraints.Chekuri et al. [CVZ09] give strong concentration results for dependentrandomized rounding with many applications; one of these applications is a((e/e − 1) − ǫ)-approximation formonotone maximization with respect to a matroid and any constant number of knapsack constraints. [GNR09, Sec-tion F.1] extends ideas from [CK05] to give polynomial-time algorithms with respect to non-monotone submodularmaximization with respect to ap-system andq knapsacks: these algorithms achieve anp+ q+O(1)-approximationfor constantq (since the running time isnpoly(q)), or a(p + 2)(q + 1)-approximation for arbitraryq; at a high level,their idea is to “emulate” a knapsack constraint by a polynomial number of partition matroid constraints.

2

Non-Monotone Submodular Maximization.In the non-monotone case, even the unconstrained problem isNP-hard (it captures max-cut). Feige, Mirrokni and Vondrák [FMV07] first gave constant-factor approximations for thisproblem. Lee et al. [LMNS10] gave the first approximation algorithms for constrained non-monotone maximization(subject top matroid constraints, orp knapsack constraints); the approximation factors were improved by Lee etal. [LSV09]. The algorithms in the previous two papers are based on local-search withp-swaps and would takenΘ(p) time. Recent work by Vondrák [Von09] gives much further insight into the approximability of submodularmaximization problems.

Secretary Problems.The original secretary problem seeks to maximize the probability of picking the element ina collection having the highest value, given that the elements are examined in random order [Dyn63, Fre83, Fer89].The problem was used to model item-pricing problems by Hajiaghayi et al. [HKP04]. Kleinberg [Kle05] showed thatthe problem of maximizing amodular function subject to a cardinality constraint in the secretary setting admits a(1+ Θ(1)√

k)-approximation, wherek is the cardinality. (We show that maximizing asubmodularfunction subject to a

cardinality constraint cannot be approximated to better than some universal constant, independent of the value ofk.)Babaioff et al. [BIK07] wanted to maximize modular functions subject to matroid constraints, again in a secretary-setting, and gave constant-factor approximations for somespecial matroids, and anO(log k) approximation forgeneral matroids having rankk. This line of research has seen several developments recently [BIKK07, DP08,KP09, BDG+09].

1.3.1 Independent Work on Submodular Secretaries

Concurrently and independently of our work, Bobby Kleinberg has given an algorithm similar to that in§4.1 formonotone secretary submodular maximization under a cardinality constraint [Kle09]. Again independently, Bateniet al. consider the problem of non-monotone submodular maximization in the secretary setting [BHZ10]; they give adifferentO(1)-approximation subject to a cardinality constraint, anO(L log2 k)-approximation subject toL matroidconstraints, and anO(L)-approximation subject toL knapsack constraints in the secretary setting. While we donot consider multiple constraints, it is easy to extend our results to obtainO(L log k) andO(L) respectively usingstandard techniques.

1.4 Preliminaries

Given a setS and an elemente, we useS + e to denoteS ∪ {e}. A function f : 2Ω → R+ is submodularif forall S, T ⊆ Ω, f(S) + f(T ) ≥ f(S ∪ T ) + f(S ∩ T ). Equivalently,f is submodular if it hasdecreasing marginalutility: i.e., for allS ⊆ T ⊆ Ω, and for alle ∈ Ω, f(S + e)− f(S) ≥ f(T + e)− f(T ). Also,f is calledmonotoneif f(S) ≤ f(T ) for S ⊆ T . Givenf andS ⊆ Ω, definefS : 2Ω → R asfS(A) := f(S ∪A)− f(S). The followingfacts are standard.

Proposition 1.1. If f is submodular withf(∅) = 0, then• for anyS, fS is submodular withfS(∅) = 0, and• f is alsosubadditive; i.e., for disjoint setsA,B, we havef(A) + f(B) ≥ f(A ∪B).

Matroids. A matroid is a pairM = (Ω,I ⊆ 2Ω), whereI contains∅, if A ∈ I andB ⊆ A thenB ∈ I, and foreveryA,B ∈ I with |A| < |B|, there existse ∈ B \ A such thatA + e ∈ I. The sets inI are calledindependent,and therank of a matroid is the size of any maximal independent set (base)inM. In auniformmatroid,I containsall subsets of size at mostk. A partition matroid, we have groupsg1, g2, . . . , gk ⊆ Ω with gi∩gj = ∅ and∪jgj = Ω;the independent sets areS ⊆ Ω such that|S ∩ gi| ≤ 1.Unconstrained (Non-Monotone) Submodular Maximization. We useFMVα(S) to denote an approximationalgorithm given by Feige, Mirrokni, and Vondrák [FMV07] for unconstrained submodular maximization in the non-monotone setting: it returns a setT ⊆ S such thatf(T ) ≥ 1α maxT ′⊆S f(T ′). In fact, Feige et al. present manysuch algorithms, the best approximation ratio among these isα = 2.5 via a local-search algorithm, the easiest is a4-approximation that just returns a uniformly random subsetof S.

3

2 Submodular Maximization subject to a Cardinality Constraint

We first give an offline algorithm for submodular maximization subject to a cardinality constraint: this illustratesour simple approach, upon which we build in the following sections. Formally, given a subsetX ⊆ Ω and anon-negative submodular functionf that is potentially non-monotone, but hasf(∅) = 0. We want to approximatemaxS⊆X:|S|≤k f(S). The greedy algorithm starts withS ← ∅, and repeatedly picks an elemente with maximummarginal valuefS(e) until it hask elements.

Lemma 2.1. For any set|C| ≤ k, the greedy algorithm returns a setS that satisfiesf(S) ≥ 12 f(S ∪ C).Proof. Suppose not. ThenfS(C) = f(S ∪ C)− f(S) > f(S), and hence there is at least one elemente ∈ C \ Sthat hasfS({e}) > f(S)|C\S| >

f(S)k . Since we ran the greedy algorithm, at each step this elemente would have been

a contender to be added, and by submodularity,e’s marginal value would have been only higher then. Hence theelements actually added in each of thek steps would have had marginal value more thane’s marginal value at thattime, which is more thanf(S)/k. This implies thatf(S) > k · f(S)/k, a contradiction.

This theorem is existentially tight: observe that if the function f is just the cardinality functionf(S) = |S|, andif S andC happen to be disjoint, thenf(S) = 12f(S ∪ C).Lemma 2.2(Special Case of Claim 2.7 in [LMNS10]). Given setsC,S1 ⊆ U , letC ′ = C \ S1, andS2 ⊆ U \ S1.Thenf(S1 ∪C) + f(S1 ∩ C) + f(S2 ∪ C ′) ≥ f(C).Proof. By submodularity, it follows thatf(S1 ∪ C) + f(S2 ∪ C ′) ≥ f(S1 ∪ S2 ∪ C) + f(C ′). Again usingsubmodularity, we getf(C ′) + f(S1 ∩C) ≥ f(C) + f(∅). Putting these together and using non-negativity off(·),the lemma follows.

1: let X1 ← X2: for i = 1 to 2 do3: let Si ← Greedy(Xi)4: let S′i ← FMVα(Si)5: let Xi+1 ← Xi \ Si.6: end for7: return best ofS1, S′1, S2.

Figure 1: Submod-Max-Cardinality(X, k, f)

We now give our algorithm Submod-Max-Cardinality(Figure 1) for submodular maximization: it has the samemulti-pass structure as that of Lee et al., but uses the greedyanalysis above instead of a local-search algorithm.

Theorem 2.3. The algorithm Submod-Max-Cardinality is a(4 + α)-approximation.

Proof. Let C∗ be the optimal solution withf(C∗) = OPT.We know thatf(S1) ≥ 12f(S1∪C∗). Also, if f(S1∩C∗) is atleastǫOPT, then we know that theα-approximate algorithmFMVα gives us a value of at least(ǫ/α)OPT. Else,

f(S1) ≥ 12f(S1 ∪ C∗) ≥ 12f(S1 ∪ C∗) + 12f(S1 ∩ C∗)− ǫOPT/2 (1)

Similarly, we get thatf(S2) ≥ 12f(S2 ∪ (C∗ \ S1)). Adding this to (1), we get

2max(f(S1), f(S2)) ≥ f(S1) + f(S2)≥ 12

(

f(S1 ∪ C∗) + f(S1 ∩ C∗) + f(S2 ∪ (C∗ \ S1)))

− ǫOPT/2 (2)≥ 12f(C∗)− ǫOPT/2 (3)≥ 12(1− ǫ)OPT.

where we usedLemma 2.2to get from (2) to (3). Hencemax{f(S1), f(S2)} ≥ 1−ǫ4 OPT. The approximationfactor now ismax{α/ǫ, 4/(1 − ǫ)}. Settingǫ = αα+4 , we get a(4 + α)-approximation, as claimed.

Using the known value ofα = 2.5 from Feige et al. [FMV07], we get a6.5-approximation for submodularmaximization under cardinality constraints. While this isweaker than the3.23-approximation of Vondrák [Von09],or even the4-approximation we could get from Lee et al. [LMNS10] for this special case, the algorithm is faster,and the idea behind the improvement works in several other contexts, as we show in the following sections.

4

3 Fast Algorithms for p-Systems and Knapsacks

In this section, we show our greedy-style algorithms which achieve anO(p)-approximation for submodular maxi-mization overp-systems, and a constant-factor approximation for submodular maximization over a knapsack. Dueto space constraints, many proofs are deferred to the appendices.

3.1 Submodular Maximization for Independence Systems

LetΩ be a universe of elements and consider a collectionI ⊆ 2Ω of subsets ofΩ. (Ω,I) is called anindependencesystemif (a) ∅ ∈ I, and (b) ifX ∈ I andY ⊆ X, thenY ∈ I as well. The subsets inI are calledindependent; forany setS of elements, an inclusion-wise maximal independent setT of S is called abasisof S. For brevity, we saythatT is a basis, if it is a basis ofΩ.

Definition 3.1. Given an independence system(Ω,I) and a subsetS ⊆ Ω. Therankr(S) is defined as the cardinal-ity of the largestbasis ofS, and thelower rankρ(S) is the cardinality of thesmallestbasis ofS. The independencesystem is called ap-independence system (or ap-system) ifmaxS⊆Ω

r(S)ρ(S) ≤ p.

See, e.g., [CCPV09] for a discussion of independence systems and their relationship to other families of con-straints; it is useful to recall that intersections ofp matroids form ap-independent system.

3.1.1 The Algorithm for p-Independence Systems

Suppose we are given an independence system(Ω,I), a subsetX ⊆ Ω and a non-negative submodular functionfthat is potentially non-monotone, but hasf(∅) = 0. We want to find (or at least approximate)maxS⊆X:S∈I f(S).The greedy algorithm for this problem is what you would expect: start with the setS = ∅, and at each step pickan elemente ∈ X \ S that maximizesfS(e) and ensures thatS + e is also independent. If no such element exists,the algorithm terminates, else we setS ← S + e, and repeat. (Ideally, we would also check to see iffS(e) ≤ 0,and terminate at the first time this happens; we don’t do that,and instead we add elements even when the marginalgain is negative until we cannot add any more elements without violating independence.) The proof of the followinglemma appears inSection A, and closely follows that for the monotone case from [CCPV09].

Lemma 3.2. For a p-independence system, ifS is the independent set returned by the greedy algorithm, then forany independent setC, f(S) ≥ 1p+1f(C ∪ S).

1: X1 ← X2: for i = 1 to p+ 1 do3: Si ← Greedy(Xi,I, f)4: S′i ← FMVα(Si)5: Xi+1 ← Xi \ Si6: end for7: return S ← best among{Si}p+1i=1 ∪ {S′i}

p+1i=1 .

Figure 2: Submod-Max-p-System(X,I, f)

The algorithm Submod-Max-p-Systems (Figure 2) formaximizing a non-monotone submodular functionf withf(∅) = 0 over ap-independence system now immediatelysuggests itself.

Theorem 3.3. The algorithm Submod-Max-p-System is a(1 + α)(p + 2 + 1/p)-approximation for maximizing a non-monotone submodular function over ap-independence sys-tem, whereα is the approximation guarantee for uncon-strained (non-monotone) submodular maximization.

Proof. Let C∗ be an optimal solution withOPT = f(C∗),and letCi = C∗ ∩ Xi for all i ∈ [p + 1]—henceC1 = C∗.Note thatCi is a feasible solution to the greedy optimization inStep 3. Hence, byLemma 3.2, we know thatf(Si) ≥ 1p+1f(Ci ∪ Si). Now, if for somei, it holds thatf(Si ∩ Ci) ≥ ǫOPT (for ǫ > 0 to be chosen later), thenthe guarantees ofFMVα ensure thatf(S′i) ≥ (ǫOPT)/α, and we will get aα/ǫ-approximation. Else, it holds forall i ∈ [p+ 1] that

f(Si) ≥ 1p+1f(Ci ∪ Si) + f(Ci ∩ Si)− ǫOPT (4)

5

Now we can add all these inequalities, divide byp + 1, and use the argument from [LMNS10, Claim 2.7] to inferthat

f(S) ≥ p(p+1)2

f(C∗)− ǫOPT = OPT(

p(p+1)2

− ǫ)

. (5)

(While Claim 2.7 of [LMNS10] is used in the context of a local-search algorithm, it uses just the submodularity ofthe functionf , and the facts that(∪j

We also show an information theoretic lower bound: no secretary algorithm can approximately maximize asubmodular function subject to a cardinality constraintk to a factor better than some universal constant greaterthan 1, independent ofk (This is ignoring computational constraints, and so the computational inapproximability ofoffline submodular maximization does not apply). This is in contrast to the additive secretary problem, for whichKleinberg gives a secretary algorithm achieving a1

1−5/√k-approximation [Kle05]. This lower bound is found in

Appendix D. (For a discussion about independent work on submodular secretary problems, see§1.3.1.)

4.1 Subject to a Cardinality Constraint

The offline algorithm presented inSection 2builds three potential solutions and chooses the best amongst them.We now want to build just one solution in anonline fashion, so that elements arrive in random order, and when anelement is added to the solution, it is never discarded subsequently. We first give an online algorithm that is given theoptimal valueOPT as input but where the elements can come inworst-caseorder (we call this an “online algorithmwith advice”). Using sampling ideas we can estimateOPT, and hence use this advice-taking online algorithm inthe secretary model where elements arrive in random order.

To get the advice-taking online algorithm, we make two changes. First, we do not use the greedy algorithmwhich selects elements of highest marginal utility, but instead use athreshold algorithm, which selects any elementthat has marginal utility above a certain threshold. Second, we will changeStep 4of Algorithm Submod-Max-Cardinality to use FMV4, which simply selects a random subset of the elements to get a4-approximation to theunconstrained submodular maximization problem [FMV07]. The Threshold Algorithmwith inputs (τ, k) simplyselects each element as it appears if it has marginal utilityat leastτ , up to a maximum ofk elements.

Lemma 4.1(Threshold Algorithm). LetC∗ satisfyf(C∗) = OPT. The threshold algorithm on inputs(τ, k) returnsa setS that either hask elements and hence a value of at leastτk, or a setS with valuef(S) ≥ f(S∪C∗)−|C∗|τ .Proof. The claim is immediate if the algorithm picksk elements, so suppose it does not pickk elements, and also

f(S) < f(S ∪C∗)− |C∗|τ . ThenfS(C∗) > |C∗|τ , or τ < fS(C∗)

|C∗| ≤∑

e∈C∗ fS(e)

|C∗| . By averaging, this implies thereexists an elemente ∈ C∗ such thatfS(e) > τ ; this element cannot have been chosen intoS (otherwise the marginalvalue would be0), but it would have been chosen intoS when it was considered by the algorithm (since at that timeits marginal value would only have been higher). This gives the desired contradiction.

Theorem 4.2. If we change Algorithm Submod-Max-Cardinality from§2 to use the threshold algorithm with thresh-old τ = OPT7k in Step 3, and to use the random sampling algorithmFMV4 in Step 4, and return a (uniformly) randomone ofS1, S′1, S2 in Step 7, the expected value of the returned set is at leastOPT/21.

Proof. We show thatf(S1) + f(S′1) + f(S2) ≥ τk = OPT7 , and picking a random one of these gets a third of thatin expectation. Indeed, ifS1 orS2 hask elements, thenf(S1)+f(S2) ≥ τk. Else iff(S1∩C∗) ≥ 4τk, thenFMV4guarantees thatf(S′1) ≥ τk. Elsef(S1)+f(S2) ≥ (f(S1∪C∗)− τk)+(f(S2∪C∗)− τk)+(f(S1 ∩C∗)−4τk),which byLemma 2.2is at leastOPT− 6τk = τk.

Observation 4.3. Given the value ofOPT, the algorithm ofTheorem 4.2can be implemented in an online fashionwhere we (irrevocably) pick at mostk elements.

Proof. We can randomly choose which one ofS1, S′1, S2 we want to output before observing any elements. ClearlyS1 can be determined online, as canS2 by choosing any element that has high marginal value and is not chosen inS1. Moreover,S′1 just selects elements fromS1 independently with probability1/2.

Observation 4.4. In both the algorithms ofTheorems 2.3and4.2, if we use some valueZ ≤ OPT instead ofOPT,the returned set has value at leastZ/(4 + α), and expected value at leastZ/21, respectively.

Finally, it will be convenient to recall Dynkin’s algorithm: given a stream ofn numbers randomly ordered, itsamples the first1/e fraction of the numbers and picks the next element that is larger than all elements in the sample.

4.1.1 The Secretary Algorithm for the Cardinality Case

7

Let Solution← ∅.Flip a fair coinif headsthen

Solution←most valuable item using Dynkin’s-Algoelse

Let m ∈ B(n, 1/2) be a draw from the binomial distributionA1 ← ρoff-approximate offline algorithm on the firstm elements.A2 ← ρon-approximate advice-taking online algorithm with

f(A1) as the guess forOPT.ReturnA2

end if

Figure 3:Algorithm SubmodularSecretaries

For a constrained submodular optimiza-tion, if we are given(a) a ρoff-approximateoffline algorithm, and also(b) a ρon-approximate online advice-taking algorithmthat works given an estimate ofOPT, wecan now get an algorithm in the secretarymodel thus: we use the offline algorithm toestimateOPT on the first half of the ele-ments, and then run the advice-taking on-line algorithm with that estimate. The for-mal algorithm appears inFigure 3. Be-cause of space constraints, we have deferredthe proof of the following theorem toAp-pendix C.

Theorem 4.5.The above algorithm is anO(1)-approximation algorithm for the cardinality-constrained submodularmaximization problem in the secretary setting.

4.2 Subject to a Partition Matroid Constraint

In this section, we give a constant-factor approximation for maximizing submodular functions subject to a partitionmatroid. Recall that in such a matroid, the universe is partitioned intok “groups”, and the independent sets arethose which contain at most one element from each group. To get a secretary-style algorithm formodular (additive)function maximization subject to a partition matroid, we can run Dynkin’s algorithm on each group independently.However, if we have a submodular function, the marginal value of an element depends on the elements previouslypicked—and hence the marginal value of an element as seen by the online algorithm and the adversary become verydifferent.

We first build some intuition by considering a simpler “contiguous partitions” model where all the elements ofeach group arrive together (in random order), but the groupsof the partition are presented in somearbitrary orderg1, g2, . . . , gr. We then go on to handle the case when all the elements indeed come in completely random order,using what is morally a reduction to the contiguous partitions case.

4.2.1 A Special Case: Contiguous Partitions

For the contiguous case, one can show that executing Dynkin’s algorithm with the obvious marginal valuationfunction is a good algorithm: this is not immediate, since the valuation function changes as we pick some elements—but it works out, since the groups come contiguously. Now, asin the previous section, one wants to run two parallelcopies of this algorithm (with the second one picking elements from among those not picked by the first)—but thecorrelation causes the second algorithm to not see a random permutation any more! We get around this by couplingthe two together as follows:

Initially, the algorithm determines whether it is one of 3 different modes (A, B, or C) uniformly atrandom. The algorithm maintains a set of selected elements,initially S0. When groupgi of the partitionarrives, it runs Dynkin’s secretary algorithm on the elements from this group using valuation functionfSi−1 . If Dynkin’s algorithm selects an elementx, our algorithm flips a coin. If we are in modesA orB, we letSi ← Si−1 ∪ {x} if the coin is heads, and letSi ← Si−1 otherwise. If we are in modeC,we do the reverse, and letSi ← Si−1 ∪ {x} if the coin is tails, and letSi ← Si−1 otherwise. Finally,after the algorithm has completed, if we are in modeB, we discard each element ofSr with probability1/2. (Note that we can actually implement this step online, by ’marking’ but not selecting elementswith probability1/2 when they arrive).

Lemma 4.6. The above algorithm is a(3 + 6e)-approximation for the submodular maximization problem underpartition matroids, when each group of the partition comes as a contiguous segment.

8

Proof. We first analyze the case in which the algorithm is in modeA or C. Consider a hypothetical run oftwoversions of our algorithm simultaneously, one in modeA and one in modeC which share coins and produce setsSAr andS

Cr . The two algorithms run with identical marginal distributions, but are coupled such that whenever both

algorithms attempt to select the same element (each with probability 1/2), we flip only one coin, so one succeedswhile the other fails. Note thatSCr ⊆ U \ SAr , and so we will be able to applyLemma 2.2. For a fixed permutationπ, let SAr (π) be the set chosen by the modeA algorithm for that particular permutation. As usual, we definefA(B) = f(A ∪B)− f(A). Hence,f(SAr (π)) = f(SAr (π) ∪ C∗)− fSAr (π)(C

∗), and taking expectations, we get

E[f(SAr )] = E[f(SAr ∪ C∗)]− E[fSAr (C

∗)] (6)

Now, for anye ∈ X, let j(e) be the index of the group containinge; hence we have

E[fSAr (C∗)] ≤

∑

e∈C∗E[fSAr ({e})] ≤

∑

e∈C∗E[fSA

j(e)−1({e})] ≤

∑

e∈C∗2e · E[fSA

j(e)−1({Yj(e)})]

= 2e · E[f(SAr )], (7)where the first inequality is just subadditivity, the secondsubmodularity, the third follows from the fact that Dynkin’salgorithm is ane-approximation for the secretary problem and selecting theelement that Dynkin’s selects with proba-bility 1/2 gives a2e approximation, and the resulting telescoping sum gives thefourth equality. Now substituting (7)into (6) and rearranging, we getE[f(SAr )] ≥ 11+2e f(SAr ∪ C∗). An identical analysis of the second hypotheticalalgorithm gives:E[f(SCr )] ≥ 11+2e f(SCr ∪C∗ \ SAr ).

It remains to analyze the case in which the algorithm runs in modeB. In this case, the algorithm generates a setSBr by selecting each element inS

Ar uniformly at random. By the theorem of [FMV07], uniform random sampling

achieves a4-approximation to the problem ofunconstrainedsubmodular maximization. Therefore, we have inthis case:E[f(SBr )] ≥ 14f(SAr ∩ C∗). By Lemma 2.2, we therefore have:E[f(SAr )] + E[f(SBr )] + E[f(SCr )] ≥

11+2ef(C

∗). Since our algorithm outputs one of these three sets uniformly at random, it gets a(3+6e) approximationto f(C∗).

4.2.2 General Case

We now consider the general secretary setting, in which the elements come in random order, not necessarily groupedby partition. Our previous approach will not work: we cannotsimply run Dynkin’s secretary algorithm on contiguouschunks of elements, because some elements may be blocked by our previous choices. We instead do somethingsimilar in spirit: we divide the elements up intok ‘epochs’, and attempt to select a single element from each. Wetreat every element that arrives before the current epoch aspart of a sample, and according to the current valuationfunction at the beginning of an epoch, we select the first element that we encounter that has higher value than anyelement from its own partition group in the sample, so long aswe have not already selected something from thesame partition group. Our algorithm is as follows:

Initially, the algorithm determines whether it is one of 3 different modes (A, B, or C) uniformly atrandom. The algorithm maintains a set of selected elements,initially S0, and observes the firstN0 ∼B(n, 12) of the elements without selecting anything. The algorithm then considersk epochs, where theith epoch is the set ofNi ∼ B(n, 1100k ) contiguous elements after the(i−1)th epoch. At epochi, we usevaluation functionfSi−1. If an element has higher value than any element from its own partition groupthat arrived earlier than epochi, we flip a coin. If we are in modesA or B, we letSi ← Si−1 ∪ {x}if the coin is heads, and letSi ← Si−1 otherwise. If we are in modeC, we do the reverse, and letSi ← Si−1 ∪ {x} if the coin is tails, and letSi ← Si−1 otherwise. After allk epochs have passed,we ignore the remaining elements. Finally, after the algorithm has completed, if we are in modeB, wediscard each element ofSr with probability1/2. (Note that we can actually implement this step online,by ’marking’ but not selecting elements with probability1/2 when they arrive).

If we were guaranteed to select an element in every epochi that was the highest valued element according tofSi−1, then the analysis of this algorithm would be identical to the analysis in the contiguous case. This is of coursenot the case. However, we prove a technical lemma that says that we are “close enough” to this case.

9

Lemma 4.7. For all partition groupsi and epochsj, the algorithm selects the highest element from groupi (ac-cording to the valuation functionfSj−1 used during epochj) during epochj with probability at leastΩ(

1k ).

Because of space constraints, we defer the proof of this technical lemma toAppendix C.Note an immediate consequence of the above lemma: ife is the element selected from epochj, by summing

over the elements in the optimal setC∗ (1 from each of thek partition groups), we get:

E[fSj−1(e)] ≥ Ω(1

k)∑

e′∈C∗fSj−1(e

′) ≥ Ω(fSj−1(C∗)

k)

Summing over the expected contribution toSr from each of thek epochs and applying submodularity, we getE[fSAr (C∗)] ≤ O(E[f(S

Ar )]). Using this derivation in place of inequality7 in the proof ofLemma 4.6proves that

our algorithm gives anO(1) approximation to the non-monotone submodular maximization problem subject to apartition matroid constraint.

4.3 Subject to a General Matroid Constraint

We consider matroid constraints where the matroid isM = (Ω,I) with rank k. Let w1 = maxe∈Ω f({e}) themaximum value obtained by any single element, and lete1 be the element that achieves this maximum value. (Notethat we do not know these values up-front in the secretary setting.) In this section, we first give an algorithm that getsa set of fairly high value given a thresholdτ . We then show how to choose this threshold, assuming we know thevaluew1 of the most valuable element, and why this implies an advice-taking online algorithm having a logarithmicapproximation. Finally, we show how to implement this in a secretary framework.

A Threshold Algorithm. Given a valueτ , run the following algorithm. InitializeS1, S2 ← ∅. Go over the elementsof the universeΩ in arbitrary order: when considering elemente, add it toS1 if fS1(e) ≥ ǫτ andS1 ∪ {e} isindependent, else add it toS2 if fS2(e) ≥ ǫτ andS2 ∪ {e} is independent, else discard it. (We will choose the valueof ǫ later.) Finally, output a uniformly random one ofS1 or S2.

To analyze this algorithm, letC∗ be the optimal set withf(C∗) = OPT. Order the elements ofC∗ by pickingits elements greedily based on marginal values. Givenτ > 0, letC∗τ ⊆ C∗ be the elements whose marginal benefitwas at leastτ when added in this greedy order: note thatf(C∗τ ) ≥ |C∗τ |τ .

Lemma 4.8. For ǫ = 2/5, the set produced by our algorithm has expected value is at least |C∗τ | · τ/10.

Proof. If either |S1| or |S2| is at least|C∗τ |/4, we get value at least|C∗τ |/4 · ǫτ . Else both these sets have smallcardinality. Since we are in a matroid, there must be a setA ⊆ C∗τ of cardinality|A| ≥ |C∗τ |− |S1|− |S2| ≥ |C∗τ |/2,such thatA is disjoint from bothS1 andS2, and bothS1 ∪A andS2 ∪A lie in I (i.e., they are independent).

We claim thatf(S1) ≥ f(S1 ∪ A) − |A| · ǫτ . Indeed, an element ine ∈ A was not added by the thresholdalgorithm; since it could be added while maintaining independence, it must have been discarded because the marginalvalue was less thanǫτ . HencefS1({e}) < ǫτ , and hencef(S1∪A)−f(S1) = fS1(A) ≤

∑

e∈A fS1({e}) < |A|·ǫτ .Similarly, f(S2) ≥ f(S2 ∪A)− |A| · ǫτ . And by disjointness,f(S1 ∩A) = f(∅) = 0. Hence, summing these andapplyingLemma 2.2, we get thatf(S1)+f(S2) ≥ f(S1∪A)+f(S2∪A)+f(S1∩A)−2ǫτ |A| ≥ f(A)−2ǫτ |A|.

Since the marginal values of all the elements inC∗τ were at leastτ when they were added by the greedy ordering,andA ⊆ C∗τ , submodularity implies thatf(A) ≥ |A|τ , which in turn impliesf(S1) + f(S2) ≥ (1 − 2ǫ)τ |A| ≥(1 − 2ǫ)τ |C∗τ |/2. A random one ofS1, S2 gets half of that in expectation. Taking the minimum of|C∗τ |/4 · ǫτ and(1− 2ǫ)τ |C∗τ |/2 and settingǫ = 2/5, we get the claim.

Lemma 4.9.∑log 2k

i=0 |C∗w1/2i | ·w12i≥ f(C∗)/4 = OPT/4.

Proof. Consider the greedy enumeration{e1, e2, . . . , et} of C, and letwj = f{e1,e2,...,ei−1}({ej}). First consider aninfinite summation

∑∞i=0 |C∗w1/2i | ·

w12i

—each elementej contributes at leastwj/2 to it, and hence the summation

is at least12∑

j wj . But f(C∗) =

∑tj=1wj , which says the infinite sum is at leastf(C

∗)/2 = OPT/2. But thefinite sum merely drops a contribution ofw1/4k from at most|C∗| ≤ k elements, and clearlyOPT is at leastw1,so removing this contribution means the finite sum is at leastOPT/4.

10

Hence, if we choose a valueτ uniformly fromw1, w1/2, w1/4, . . . , w1/2k and run the above threshold algorithmwith that setting ofτ , we get that the expected value of the set output by the algorithm is:

11+log 2k

∑log 2ki=0 |C∗w1/2i | ·

w110·2i ≥ 11+log 2k OPT40 . (8)

The Secretary Algorithm. The secretary algorithm for general matroids is the following:

Sample half the elements, letW be the weight of the highest weight element in the first half. Choose avaluei ∈ {0, 1, . . . , 2 + log 2k} uniformly at random. Run the threshold algorithm withW/2i as thethreshold

Lemma 4.10. The algorithm is anO(log k)-approximation in the secretary setting for rankk matroids.

Proof. With probabilityΘ(1/ log k), we choose the valuei = 0. In this case, with constant probability the elementwith second-highest value comes in the first half, and the highest-value elemente1 comes in the second half; henceour (conditional) expected value in this case is at leastw1. In case this single element accounts for more than halfof the optimal value, we getΩ(OPT/ log k). We ignore the casei = 1. If we choosei ≥ 2, now with constantprobability e1 comes in the first half, implying thatW = w1. Moreover, each element inC − e1 appears in thesecond half with probability slightly higher than1/2. Sincee1 accounts for at most half the optimal value, theexpected optimal value in the second half is at leastOPT/4. The above argument then ensures that we get valueΩ(OPT/ log k) in expectation.

Acknowledgments.We thank C. Chekuri, V. Nagarajan, M.I. Sviridenko, J. Vondrák, and especially R.D. Kleinbergfor valuable comments, suggestions, and conversations. Thanks to C. Chekuri also for pointing out an error inSectionB, and to M.T. Hajiaghayi for informing us of the results in [BHZ10].

References

[ADSY09] S. Agrawal, Y. Ding, A. Saberi, and Y. Ye. Correlation robust stochastic optimization.CoRR, abs/0902.1792,2009.

[ANS08] A. Asadpour, H. Nazerzadeh, and A. Saberi. Stochastic submodular maximization.Internet and Network Eco-nomics, pages 477–489, 2008.

[BDG+09] Moshe Babaioff, Michael Dinitz, Anupam Gupta, Nicole Immorlica, and Kunal Talwar. Secretary problems:weights and discounts. In19thProceedings, ACM-SIAM Symposium on Discrete Algorithms, pages 1245–1254,2009.

[BHZ10] MohammadHossein Bateni, MohammadTaghi Hajiaghayi, and Morteza Zadimoghaddam. Submodular secretaryproblem and extensions. manuscript http://hdl.handle.net/1721.1/51336, 2010.

[BIK07] Moshe Babaioff, Nicole Immorlica, and Robert Kleinberg. Matroids, secretary problems, and online mechanisms.In SODA ’07, pages 434–443, 2007.

[BIKK07] M. Babaioff, N. Immorlica, D. Kempe, and R. Kleinberg. A knapsack secretary problem with applications. InAPPROX ’07, 2007.

[CCPV07] Gruia Calinescu, Chandra Chekuri, Martin Pál, and Jan Vondrák. Maximizing a submodular set function sub-ject to a matroid constraint (extended abstract). InProceedings, MPS Conference on Integer Programming andCombinatorial Optimization, pages 182–196, 2007.

[CCPV09] Gruia Calinescu, Chandra Chekuri, Martin Pál, and Jan Vondrák. Maximizing a monotone submodular functionsubject to a matroid constraint.to appear in SICOMP, 2009.

[CK05] Chandra Chekuri and Sanjeev Khanna. A polynomial time approximation scheme for the multiple knapsack prob-lem. SIAM J. Comput., 35(3):713–728 (electronic), 2005.

[CVZ09] Chandra Chekuri, Jan Vondrák, and Rico Zenklusen.Randomized pipage rounding for matroid polytopes andapplications.CoRR, abs/0909.4348, 2009. To appear in FOCS 2010.

11

[DP08] Nedialko B. Dimitrov and C. Greg Plaxton. Competitive weighted matching in transversal matroids. InAutomata,languages and programming. Part I, volume 5125 ofLecture Notes in Comput. Sci., pages 397–408, Berlin, 2008.Springer.

[Dyn63] E. B. Dynkin. Optimal choice of the stopping moment of a Markov process.Dokl. Akad. Nauk SSSR, 150:238–240,1963.

[Fer89] T.S. Ferguson. Who solved the secretary problem?Statistical Science, 4:282–296, 1989.

[FMV07] U. Feige, V. Mirrokni, and J. Vondrak. Maximizing non-monotone submodular functions. InProceedings of 48thAnnual IEEE Symposium on Foundations of Computer Science (FOCS), 2007.

[FNW78] M. L. Fisher, G. L. Nemhauser, and L. A. Wolsey. An analysis of approximations for maximizing submodular setfunctions. II.Math. Programming Stud., (8):73–87, 1978. Polyhedral combinatorics.

[Fre83] P. R. Freeman. The secretary problem and its extensions: a review.Internat. Statist. Rev., 51(2):189–206, 1983.

[GNR09] Anupam Gupta, Viswanath Nagarajan, and R. Ravi. Thresholded covering algorithms for robust and max-minoptimization.CoRR, abs/0912.1045, 2009. To appear in ICALP 2010.

[HKJ80] D. Hausmann, B. Korte, and T. A. Jenkyns. Worst case analysis of greedy type algorithms for independencesystems.Math. Programming Stud., (12):120–131, 1980. Combinatorial optimization.

[HKP04] Mohammad Taghi Hajiaghayi, Robert Kleinberg, and David C. Parkes. Adaptive limited-supply online auctions.In EC ’04: Proceedings of the 5th ACM conference on Electronic commerce, pages 71–80, New York, NY, USA,2004. ACM.

[Jen76] Thomas A. Jenkyns. The efficacy of the “greedy” algorithm. In Proceedings of the Seventh Southeastern Con-ference on Combinatorics, Graph Theory, and Computing (Louisiana State Univ., Baton Rouge, La., 1976), pages341–350. Congressus Numerantium, No. XVII, Winnipeg, Man., 1976. Utilitas Math.

[KH78] Bernhard Korte and Dirk Hausmann. An analysis of the greedy heuristic for independence systems.Ann. DiscreteMath., 2:65–74, 1978. Algorithmic aspects of combinatorics (Conf., Vancouver Island, B.C., 1976).

[KKT03] D. Kempe, J. Kleinberg, and́E. Tardos. Maximizing the spread of influence through a social network. InPro-ceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages137–146. ACM New York, NY, USA, 2003.

[Kle05] Robert Kleinberg. A multiple-choice secretary algorithm with applications to online auctions. In16th SODA,pages 630–631. ACM, 2005.

[Kle09] Robert D. Kleinberg. A secretary problem with submodular payoff function. manuscript, 2009.

[KMN99] Samir Khuller, Anna Moss, and Joseph Naor. The budgeted maximum coverage problem.Inform. Process. Lett.,70(1):39–45, 1999.

[KP09] Nitish Korula and Martin Pál. Algorithms for secretary problems on graphs and hypergraphs. pages 508–520,2009.

[KST09] Ariel Kulik, Hadas Shachnai, and Tami Tamir. Maximizing submodular set functions subject to multiple linearconstraints. InSODA ’09: Proceedings of the Nineteenth Annual ACM -SIAM Symposium on Discrete Algorithms,pages 545–554, 2009.

[LMNS10] Jon Lee, Vahab S. Mirrokni, Viswanath Nagarajan, and Maxim Sviridenko. Maximizing nonmonotone submodularfunctions under matroid or knapsack constraints.SIAM J. Discrete Math., 23(4):2053–2078, 2009/10. (Preliminaryversions in STOC 2009 and arXiv:0902.0353.).

[LSV09] Jon Lee, Maxim Sviridenko, and Jan Vondrák. Submodular maximization over multiple matroids via generalizedexchange properties. InProceedings, International Workshop on Approximation Algorithms for CombinatorialOptimization Problems, pages 244–257, 2009.

[MR07] E. Mossel and S. Roch. On the submodularity of influence in social networks. InProceedings of the thirty-ninthannual ACM symposium on Theory of computing, page 134. ACM, 2007.

[NWF78] G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. An analysis of approximations for maximizing submodular setfunctions. I.Math. Programming, 14(3):265–294, 1978.

[Svi04] Maxim Sviridenko. A note on maximizing a submodularset function subject to a knapsack constraint.Oper. Res.Lett., 32(1):41–43, 2004.

12

[Von08] Jan Vondrák. Optimal approximation for the submodular welfare problem in the value oracle model. InProceed-ings, ACM Symposium on Theory of Computing, pages 67–74, 2008.

[Von09] Jan Vondrák. Symmetry and approximability of submodular maximization problems. InProceedings, IEEE Sym-posium on Foundations of Computer Science, page to appear, 2009.

[Wol82] Laurence A. Wolsey. Maximising real-valued submodular functions: primal and dual heuristics for location prob-lems.Math. Oper. Res., 7(3):410–425, 1982.

A Proof of Main Lemma for p-Systems

Let e1, e2, . . . , ek be the elements added toS by greedy, and letSi be the firsti elements in this order, withδi =fSi−1({ei}) = f(Si)− f(Si−1), which may be positive or negative. Sincef(∅) = 0, we havef(S = Sk) =

∑

i δi.And sincef is submodular,δi ≥ δi+1 for all i.

Lemma A.1 (following [CCPV09]). For any independent setC, it holds thatf(Sk) ≥ 1p+1f(C ∪ Sk).

Proof. We show the existence of a partition ofC into C1, C2, . . . , Ck with the following two properties:

• for all i ∈ [k], p1 + p2 + . . .+ pi ≤ i · p wherepi := |Ci|, and

• for all i ∈ [k], piδi ≥ fSk(Ci).

Assuming such a partition, we can complete the proof thus:

p∑

i

δi ≥∑

i

piδi ≥∑

i

fSk(Ci) ≥ fSk(C) = f(Sk ∪ C)− f(Sk), (9)

where the first inequality follows from [CCPV09, Claim A.1] (using the first property above, and that theδ’s arenon-increasing), the second from the second property of thepartition ofC, the third from subadditivity offSk(·)(which is implied by the submodularity off and applications of both facts inProposition 1.1), and the fourth fromthe definition offSk(·). Using the fact that

∑

i δi = f(Sk), and rearranging, we get the lemma.Now to prove the existence of such a partition ofC. DefineA0, A1, . . . , Ak as follows:Ai = {e ∈ C \ Si |

Si + e ∈ I}. Note that sinceC ∈ I, it follows thatA0 = C; since the independence system is closed under subsets,we haveAi ⊆ Ai−1; and since the greedy algorithm stops only when there are no more elements to add, we getAk = ∅. DefiningCi = Ai−1 \ Ai ensures we have a partitionC1, C2, . . . , Ck of C.

Fix a valuei. We claim thatSi is a basis (a maximal independent set) forSi∪(C1∪C2∪. . .∪Ci) = Si∪(C\Ai).Clearly Si ∈ I by construction; moreover, anye ∈ (C \ Ai) \ Si was considered but not added toAi becauseSi + e 6∈ I. Moreover,(C1 ∪ C2 ∪ . . . ∪ Ci) ⊆ C is clearly independent by subset-closure. SinceI is a p-independence system,|C1 ∪ C2 ∪ . . . ∪ Ci| ≤ p · |Si|, and thus

∑

i |Ci| =∑

i pi ≤ i · p, proving the first property.For the second property, note thatCi = Ai−1 \ Ai ⊆ Ai−1; hence eache ∈ Ci does not belong toSi−1

but could have been added toSi−1 whilst maintaining independence, and was considered by thegreedy algorithm.Since greedy chose theei maximizing the “gain”,δi ≥ fSi−1({e}) for eache ∈ Ci. Summing over alle ∈ Ci, weget piδi ≤

∑

e∈Ci fSi−1({e}) ≤ fSi−1(Ci), where the last inequality is by the subadditivity offSi−1. Again, bysubmodularity,fSi−1(Ci) ≤ fSk(Ci), which proves the second fact about the partition{Cj}kj=1 of C.

Clearly, the greedy algorithm works no worse if we stop it when the best “gain” is negative, but the above proofdoes not use that fact.

B Proofs for Knapsack Constraints

The proof is similar to that in [Svi04] and the proof ofLemma 2.1. We use notation similar to [Svi04] for consistency.Let f be a non-negative submodular function withf(∅) = 0. Let I = [n], and we are givenn items with weightsci ∈ Z+, andB ≥ 0; let F = {S ⊆ I | c(S) ≤ B, wherec(S) =

∑

i∈S ci. Our goal to solvemaxS⊆F f(S). Tothat end, we want to prove the following result:

13

Theorem B.1. There is a polynomial-time algorithm that outputs a collection of sets such that for anyC ∈ F , thecollection contains a setS satisfyingf(S) ≥ 12f(S ∪ C). 1

B.1 The Algorithm

The algorithm is the following: it constructs a polynomial number of solutions and chooses the best among them(and in case of ties, outputs the lexicographically smallest one of them).

• First, the family contains all solutions with cardinality1, 2, 3: clearly, if |C| ≤ 3 then we will outputC itself,which will satisfy the condition of the theorem.

• Now for each solutionU ⊆ I of cardinality3, we greedily extend it as follows: SetS0 = U , I0 = I. At stept,we have a partial solutionSt−1. Now compute

θt = maxi∈It−1\St−1

f(St−1 + i)− f(St−1)ci

. (10)

Let the maximum be achieved on indexit. If θt ≤ 0, terminate the algorithm. Else check ifc(St−1+ it) ≤ B:if so, setSt = St−1 + it andIt = It−1, else setSt = St−1 andIt = It−1 − it. Stop ifIt \ St = ∅.

The family of sets we output is all sets of cardinality at mostthree, as well as for each greedy extension of a set ofcardinality three, we output all the setsSt created during the run of the algorithm. Since each set can have at mostnelements, we getO(n4) sets output by the algorithm.

B.2 The Analysis

Let us assume that|C| = t > 3, and orderC asj1, j2, . . . , jt such that

jk = maxj∈C\{j1,...,jk−1}

f{j1,...,jk−1}({j}), (11)

i.e., index the elements in the order they would be considered by the greedy algorithm that picks items of maximummarginal value (and does not consider their weightsci). Let Y = {j1, j2, j3}. Submodularity and the ordering ofCgives us the following:

Lemma B.2. For anyjk ∈ C with k ≥ 4 and anyZ ⊆ I \ {j1, j2, j3, jk}, it holds that:

fY ∪Z({jk}) ≤ f({jk}) ≤ f({j1})fY ∪Z({jk}) ≤ f({j1, jk})− f({j1}) ≤ f({j1, j2})− f({j1})fY ∪Z({jk}) ≤ f({j1, j2, jk})− f({j1, j2}) ≤ f({j1, j2, j3})− f({j1, j2})

Summing the above three inequalities we get that forjk 6∈ Y ∪ Z,

3 fY ∪Z({jk}) ≤ f(Y ). (12)

For the rest of the discussion, consider the iteration of thealgorithm which starts withS0 = Y . ForS such thatS0 = Y ⊆ S ⊆ I, recall thatfY (S) = f(Y ∪ S) − f(Y ) = f(S) − f(Y ). Proposition 1.1shows thatfY (·) is asubmodular function withfY (∅) = 0. The following lemma is the analog of [Svi04, eq. 2]:

Lemma B.3. For any submodular functiong and allS, T ⊆ I it holds that

g(T ∪ S) ≤ g(S) +∑

i∈T\S(g(S + i)− g(S)) (13)

1A preliminary version of the paper claimed a factor of(1− 1/e) instead of1/2—we thank C. Chekuri for pointing out the error.

14

Proof. g(T ∪ S) = g(S) + (g(T ∪ S) − g(S)) = g(S) + gS(T \ S) ≤ g(S) +∑

i∈T\S gS({i}) = g(S) +∑

i∈T\S(g(S + i)− g(S)), where we used subadditivity of the submodular functiongS .

Let τ + 1 be the first step in the greedy algorithm at which either (a) the algorithm stops becauseθτ+1 ≤ 0,or (b) we consider some elementiτ+1 ∈ C and it is dropped by the greedy algorithm—i.e., we setSτ+1 = SτandIτ+1 = Iτ − iτ+1. Note that before this point either we considered elements fromC and picked them, or theelement considered was not inC. In fact, let us assume that there are no elements that are neither inC nor are pickedby our algorithm, since we can drop them and perform the same algorithm and analysis again, it will not changeanything—hence we can assume we have not dropped any elements before this, andSt = {i1, i2, . . . , it} for allt ∈ {0, 1, . . . , τ}.

Now we applyLemma B.3to the submodular functionfY (·) with setsS = St andT = C to get

fY (C ∪ St) ≤ fY (St) +∑

i∈C\StfY (St + i)− fY (St) = fY (St) +

∑

i∈C\Stf(St + i)− f(St) (14)

Suppose case (a) happened and we stopped becauseθτ+1 ≤ 0. This means that every term in the summation in (14)must be negative, and hencefY (C ∪ Sτ ) ≤ fY (Sτ ), or equivalently,f(C ∪ Sτ ) ≤ f(Sτ ). In this case, we are noteven losing the(1− 1/e) factor.

Case (b) is if the greedy algorithm drops the elementiτ+1 ∈ C. Sinceiτ+1 was dropped, it must be the case thatc(Sτ ) ≤ B but c(Sτ + iτ+1) = B′ > B. In this case the right-hand expression in (14) has some positive terms foreach of the values oft ≤ τ , and hence for eacht, we get

fY (C ∪ St) ≤ fY (St) +B · θt+1. (15)

To finish up, we prove a lemma similar toLemma 2.1.

Lemma B.4. fY (Sτ + iτ+1) ≥ 12 fY (Sτ ∪C).

Proof. If not, then we havefY (Sτ ∪C)− fY (Sτ + iτ+1) > fY (Sτ + iτ+1).

Since we are in the case thatθτ+1 > 0, we know thatfY (Sτ + iτ+1) > fY (Sτ ), and hence

fY ∪Sτ (C) = fY (Sτ ∪ C)− fY (Sτ ) > fY (Sτ + iτ+1).

Now, the subadditivity offY ∪Sτ () implies that there exists some elemente ∈ C with fY ∪Sτ (e)ce >fY (Sτ+iτ+1)

B .Submodularity now implies that at each point in timei ≤ τ + 1, the marginal increase per unit cost for elemente isfY ∪Si(e)

ce> fY (Sτ+iτ+1)B . Now since the greedy algorithm picked elements with the largest marginal increase per unit

cost, the marginal increase per unit cost at each step was strictly greater thanfY (Sτ+iτ+1)B . Hence, at the moment thetotal cost of the picked exceededB, the total value accrued would be strictly greater thanfY (Sτ + iτ+1), which is acontradiction.

Now for the final calculations:

f(Sτ ) ≥ f(Y ) + fY (Sτ )≥ f(Y ) + fY (Sτ + iτ+1)−

(

fY (Sτ + iτ+1)− fY (Sτ ))

≥ f(Y ) + fY (Sτ + iτ+1)−(

f(Sτ + iτ+1)− f(Sτ ))

≥ f(Y ) + (1/2)fY (Sτ ∪ C)− f(Y )/3 (usingLemma B.4and (12))≥ (1/2)f(Sτ ∪ C). (by the definition offY ())

Hence this setSτ will be in the family of sets output, and will satisfy the claim of the theorem.

15

C Proofs from the Submodular Secretaries Section

In this section, we give the missing proofs fromSection 4.

C.1 Proof for Cardinality Constrained Submodular Secretaries

Theorem 4.5The algorithm for the cardinality-constrained submodularmaximization problem in the secretarysetting gives anO(1) approximation toOPT.

The proof basically shows that with reasonable probability, both the first and the second half of the stream have areasonable fraction ofOPT, so when we run the offline algorithm on the first half, using its output to extract valuefrom the second half gives us a constant fraction ofOPT.

Proof. Let C∗ = {e1, . . . , ek′} denote some set withk′ ≤ k elements such thatf(C∗) = OPT. Without lossof generality, we normalize so thatOPT = 1. Suppose the elements ofC∗ have been listed in the “greedy or-der” (i.e., in order of decreasing marginal utility), and let ai denote the marginal utility ofei when it is added to{e1, e2, . . . , ei−1}. We consider two cases: in the first case,a1 ≥ 1/c, wherec ≥ 1 is some constant to be deter-mined. In this case, with probability1/2e, the algorithm runs Dynkin’s secretary algorithm and selectsa1, achievingan1/(2ce) approximation.

In the other case,ai < 1/c for all i. We imagine randomly partitioning the elements of the inputsetX into twosets,X1 andX2, with each element belonging toX1 independently with probability1/2. This corresponds to thealgorithm’s division ofσσσ into the first (random)m elementsσσσm and the remaining elementsσσσ−σσσm. LetC∗1 andC∗2denote the optimal solutions restricted to setsX1 andX2 respectively. Define the random variableA =

∑k′

i=1 Yiaiwhere eachYi ∈r {−1, 1} is selected uniformly at random. Note that by submodularity, f(C∗1 )+f(C∗2 ) ≥ f(C∗) =1. We wish to lower boundmin(f(C∗1 ), f(C

∗2 )), and to do this it is sufficient to upper bound the absolute value |A|.

To see this, suppose that, for some setting of theYi’s it holds that∑

i:Yi=1ai ≥

∑

i:Yi=−1 ai (the other case isidentical). Now if|A| = ∑i:Yi=1 ai −

∑

i:Yi=−1 ai ≤ x, we have:∑

i:Yi=−1ai ≥ (

∑

i:Yi=1

ai)− x = 1− (∑

i:Yi=−1ai)− x

and hence

min(f(C∗1), f(C∗2 )) ≥

∑

i:Yi=−1ai ≥

1− x2

.

Hence, we would like to upper bound|A| with high probability. Since eachYi is independent with expectation0, wehaveE[A] = 0 andE[A2] =

∑k′

i=1 a2i . The standard deviation ofA is:

σ =

√

√

√

√

k′∑

i=1

a2i ≤√

c · 1c

2

=1√c.

By Chebyshev’s inequality, for anyd ≥ 0, we have

Pr[|A| ≥ d√c] ≤ 1

d2.

That is, except with probability1/d2, min(f(C∗1 ), f(C∗2 )) ≥ (1 − d√c)/2. Now for some calculations. With

probability1/2, we do not run Dynkin’s algorithm. Independently of this, with probability1/2, f(C∗1 ) ≤ f(C∗2 )—i.e., the valuemin(f(C∗1 ), f(C

∗2 )) is achieved onσσσm. With probability(1− 1/d2), this value is at least(1− d√c)/2.

Now we run aρoff-approximation onσσσm, and thus with probability14 (1− 1/d2),

f(A1) ≥1

2(1− d√

c) · 1

ρoff.

16

If we use this as a lower bound forf(C∗2 ) (which is fine since we are in the case wheref(A1) ≤ f(C∗1) ≤ f(C∗2 )),the semi-online algorithm gives us a value of at leastf(A1)ρon . Hence we have

E[f(A2)] ≥1

2(1− d√

c) · 1

ρoff· 14(1− 1/d2) · 1

ρon. (16)

Combining both cases and optimizing over parametersd andc (d← 3.08, c← 260.24) we have:

E[f(A2)] ≥ min(

1

8 ρoff ρon(1− 1/d2)(1− d√

c),

1

2ce

)

·OPT ≥ OPT1417

(17)

C.2 Proof for Partition Matroid Submodular Secretaries

Let S0 be the set of firstN0 elements, and letSj denote the elements in epochj. Since the input permutationitself is random, the distribution over the setsS0, . . . , Sk is identical to one resulting from the following process:each elemente independently chooses a real numberre in (0, 1) and is placed inS0 if re ≤ 12 , and inSj ifre ∈ (12 +

j−1100k ,

12 +

j100k ]. We shall use this observation to simplify our analysis.

For the following lemma, we need to keep track of several events:

1. Hi,j: The highest element from partition groupi defined under the valuation function used during epochjfalls into epochj.

2. Fi,j : The highest element from partition groupi among those seen until the end of epochj (defined under thevaluation function used during epochj) falls into epochj.

3. Li,j: The highest element from partition groupi defined under the valuation function used during epochjdoes not fall before epochj.

4. Si,j: The second highest element (if any) from partition groupi defined under the valuation function usedduring epochj falls before epochj.

5. Pi,j : Some element from partition groupi has already been selected before epochj.

In the definitions above, we assume that a fixed tie breaking rule is used to ensure that there is a unique highest andsecond highest element.

Lemma 4.7 For all partition groupsi and epochsj, the algorithm selects the highest element from groupi(according to the valuation function during epochj) during epochj with probability at leastΩ( 1k ). Specifically:

Pr[(Hi,j ∧ Si,j ∧ ¬Pi,j) ∧ (∧

i′ 6=i¬Fi′,j)] = Ω(

1

k)

where the probability is over the random permutation of the elements.

Proof. We observe that the event(Hi,j ∧ Si,j ∧ ¬Pi,j) ∧ (∧

i′ 6=i ¬Fi′,j) implies that algorithm selects the highestelement from groupi in epochj. We will lower bound the probability of this event. We will show this by consideringthe eventsPi,j , Si,j, Li,j ,

∧

i′ 6=i ¬Fi′,j,Hi,j in this order, and lower bound the probability of each conditioning onthe previous ones.

Under any (arbitrary) valuation function, the eventsLi,j andSi,j depend on the real numbers chosen by thehighest and the second highest elements. ThusPr[Li,j ∧ Si,j] ≥ 12 (12 −

j−1100k ) ≥ 15 .

LetQi,j denote the number of elements from groupi that do not appear inS0, . . . , Sj−1, but are higher (under thevaluation function at epochj) than any groupi element inS0, . . . , Sj−1. It is easy to see that the random variableQi,jis dominated by a geometric random variable with parameter12 . Moreover, for any elemente contributing toQi,j, it

17

appears in epochj with probability at most1

100k12− j

100k

≥ 140k so thatPr[Fi,j] ≤E[Qi,j]40k ≤ 120k . SincePi,j ⊆ ∪j′

1 2

T(op)

B(ottom)

1�

2��

2�

��

��

Figure 4:Illustration ofCOVER({1, 2}, {2})

ClaimD.2. No algorithm has expected payoff greater than83 on the instanceCOVER({1, 2}, {r}) in the semi-onlinesetting whenr is drawn uniformly at random.

BecauseOPT = 3, Claim D.2 implies no algorithm can do better than89 fraction ofOPT , which gives us thefirst part of the theorem.

Proof. We proceed by case analysis. In the case where the first element that arrives isrTB , the algorithm knowsrand can obtainOPT = 3. This happens with probability13 .

In the case where the first element that arrives is1B , the algorithm can accept or reject the element. If thealgorithm rejects, then it may as well take the next two elements that arrive. Sincer = 1 with probability half, theexpected payoff is at most52 .

Now suppose the algorithm accepts1B in the the first position. The algorithm should now pickrTB (and reject2B if it comes beforerTB) because the marginal value ofrTB is at least as large as that of2B . Sincer is random,this marginal value is32 in expectation, and hence the expected payoff of the algorithm is once again

52 .

Similarly, if the first element is2B , the payoff is bounded by52 in expectation. Thus the total expected payoff ofthe algorithm is bounded by13 · 3 + 13 · 52 + 13 · 52 = 83 .

Now we would like to show that something similar is true for much largerk. The basic idea is to combine manydisjoint instances ofCOVER({1, 2}, {r}), and show that if the algorithm does well overall, it must have done wellon each instance, violating ClaimD.2.

Claim D.3. For any evenk, no algorithm has expected payoff more than1712k in the semi-online setting on instancesof COVER({1, . . . , k}, S) trying to maximizef and restricted to pickingk sets, whenS is drawn uniformly at randomamong subsets of{1, . . . , k} with k/2 elements.

BecauseOPT = 3k/2., ClaimD.2 implies no algorithm can do better than a1718 fraction ofOPT , which givesus the second part of the theorem.

Proof. For the sake of analysis, we think of the instance ofCOVER({1, . . . , k}, S) being created by first choosing amatching on the set{1, . . . , k} and then within each edgee = (i, j) of the matching choosinger ∈ {i, j} to includein S.

We can then think of the sets ofCOVER({1, . . . , k}, S) being generated by taking the sets of the instanceCOVER({i, j}, {er}) for each edgee = (i, j) in the matching. Call each of thesek/2 instances ofCOVER({i, j}, {er})apuzzle.

Fix an semi-online algorithmA. Let C be the set of elements chosen byA. For 0 ≤ i ≤ 3, let Pi be the setof puzzles such thatC contains exactlyi elements from the puzzle; letxi be the expected sizes ofPi (over therandomness of the assignments of puzzles, the ordering, andA); and letEi be the expected payoff from all thepuzzles inPi. Note thatE0 = 0 andE3 = 3x3.

19

Claim D.4. E1 + E2 ≤ 4k3 − 2x3Proof. Given an instance ofCOVER({1, 2}, {r}), construct a random instance ofCOVER({1, . . . , k}, S) by gener-ating a random matching and randomly picking a special edgee = (i, j), wherei andj are randomly ordered. Foreach edgee′ = (i′, j′) pick e′r ∈ {i′, j′} to include inS. Now runA on this instance ofCOVER({1, . . . , k}, S),except than whenever an element of the puzzle correspondingto edgee comes along, replace it with the next elementfrom the given instance ofCOVER({1, 2}, {r}); however replace1 with i, and2 with j. RunA on this instance ofCOVER({1, . . . , k}, S), and whenevenA chooses an element from the instance ofCOVER({1, 2}, {r}), choose thatelement (it may be thatA selects more than2 elements, in which case, just select the first 2).

This instance ofCOVER({1, . . . , k}, S) has the same distribution as in the claim, and the given instance ofCOVER({1, 2}, {r}) is a random puzzle in this distribution. Thus, the expected payoff of theCOVER({1, 2}, {r})instance is at least(E0 + E1 + E2 + 2x3)/(k2 ). By ClaimD.2 this is≤ 83 . RecallingE0 = 0 and rearranging givesus the claim.

We combing the above claim with the fact thatE0 = 0 andE3 = 3x3 to get that

E[f(C)] = E1 +E2 + E3 ≤4

3k + x3. (18)

Note also thatA receives payoff at most 2 from any puzzle inP1, and at most 0 from any puzzle inP0. Themaximum payoff from each puzzle is 3, which occurs in OPT. Thus

E[f(C)] ≤ 3k/2 − 3x0 − x1. (19)

Finally, there arek/2 puzzles, sox0 + x1 + x2 + x3 = k/2. Additionally, because the algorithm never picksmore thank elements, we havex1 + 2x2 + 3x3 ≤ k. Solving the first equation forx2 and substituting forx2 in thesecond we get

0 ≤ 2x0 + x1 − x3. (20)Adding Equations18, 19, and20, we see that2E[f(C)] ≤ 176 k − x0 which implies thatE[f(C)] ≤ 1712k.

20

Documents

Constrained Non-Monotone Submodular Maximization ...In another line of work, [Jen76, KH78, HKJ80] showed that the greedy algorithm is a p-approximation for maximizing a modular (i.e.,