Better Approximations for the Minimum Common Integer Partition Problem David Woodruff Approx 2006 MIT and Tsinghua University



Page 1: Better Approximations for the Minimum Common Integer Partition Problem

Better Approximations for the Minimum Common Integer Partition Problem

David Woodruff

Approx 2006

MIT and Tsinghua University

Page 2: Better Approximations for the Minimum Common Integer Partition Problem

Minimum Common Integer Partition

• $X = \{x_1, \dots, x_r\}$ and $Y = \{y_1, \dots, y_s\}$ are multisets of positive integers, with $r \ge s$

• Consider a partition of $X$ into $s$ subsets $B_1, \dots, B_s$

• If there exist $B_1, \dots, B_s$ with $\sum_{b \in B_i} b = y_i$ for all $i$, then $X$ is an integer partition of $Y$. Think of $X$ as a refinement of $Y$

• k-MCIP problem: Given $Y_1, \dots, Y_k$, find a smallest integer partition $X$ of each of $Y_1, \dots, Y_k$

• Let $m = \sum_{i=1}^{k} |Y_i|$. Efficiency in terms of $m$.

Page 3: Better Approximations for the Minimum Common Integer Partition Problem

MCIP Example

Y1 = {2, 2, 3}, Y2 = {1, 1, 5}

Claim: {1, 1, 2, 3} = k-MCIP(Y1, Y2)

Proof:

Partition 1: {1, 1}, {2}, {3}

Partition 2: {1}, {1}, {2, 3}

So {1, 1, 2, 3} is an integer partition of both Y1 and Y2

Any integer partition of both Y1 and Y2 has size at least 4
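The claim on this slide is small enough to check exhaustively. The following is a minimal brute-force sketch, not the method from the talk; the function names `is_integer_partition`, `partitions_of` and `brute_force_mcip` are hypothetical, and the code assumes all input multisets have the same sum. It is exponential and only meant for toy inputs.

```python
def is_integer_partition(X, Y):
    """True if X can be split into len(Y) groups whose sums are exactly the y_i."""
    X = sorted(X, reverse=True)
    if sum(X) != sum(Y):
        return False
    remaining = list(Y)  # remaining[j] = amount of y_j not yet covered

    def place(idx):
        if idx == len(X):
            return all(r == 0 for r in remaining)
        tried = set()  # skip groups with identical remaining capacity
        for j, r in enumerate(remaining):
            if r >= X[idx] and r not in tried:
                tried.add(r)
                remaining[j] -= X[idx]
                if place(idx + 1):
                    return True
                remaining[j] += X[idx]
        return False

    return place(0)


def partitions_of(n, size, max_part=None):
    """Yield every multiset of `size` positive integers summing to n."""
    if max_part is None:
        max_part = n
    if size == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions_of(n - first, size - 1, first):
            yield (first,) + rest


def brute_force_mcip(multisets):
    """Smallest common integer partition by exhaustive search (toy sizes only)."""
    total = sum(multisets[0])
    for size in range(1, total + 1):
        for X in partitions_of(total, size):
            if all(is_integer_partition(X, Y) for Y in multisets):
                return list(X)
    return None


best = brute_force_mcip([[2, 2, 3], [1, 1, 5]])
print(sorted(best), len(best))
```

Because candidate sizes are tried in increasing order, the first hit is a minimum common integer partition for the given inputs.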

Page 4: Better Approximations for the Minimum Common Integer Partition Problem

Applications

AAA-AAAAA-AA-A → {3, 5, 2, 1}

AA-AA-AAAA-AAA → {2, 2, 4, 3}

MCIP = {2, 3, 1, 2, 3}

Since |MCIP| is small, humans and monkeys are similar (this measure has been proposed in practice [Jiang, et al])

Page 5: Better Approximations for the Minimum Common Integer Partition Problem

Applications

A-A-A-A-AA-A-AA-A-A → {1, 1, 1, 1, 2, 1, 2, 1, 1}

AA-AA-AAAA-AAA → {2, 2, 4, 3}

MCIP = {1, 1, 1, 1, 1, 1, 1, 2, 2}

Since |MCIP| is large, humans and mice are not similar

Page 6: Better Approximations for the Minimum Common Integer Partition Problem

Applications

• DNA fingerprint assembly
  – Oligonucleotide Fingerprinting Ribosomal Genes Project [Valinsky, et al]
  – Goal is to identify microbial organisms
  – Use MCIP as a subroutine, $k \approx 28$, $m \approx 212$ [Jiang]

• Clustering? Scheduling?

Page 7: Better Approximations for the Minimum Common Integer Partition Problem

Previous Work

k-MCIP problem: Given Y1, …, Yk, find a smallest integer partition of each of Y1, …, Yk

[CLLJ] NP-hard (Maximum Set Packing)

APX-hard for every $k \ge 2$ (Maximum-3-Dimensional Matching with Bounded Degree)

Page 8: Better Approximations for the Minimum Common Integer Partition Problem

Previous Work

[CLLJ] Upper Bounds

(5/4)-approximation for $k = 2$

Problem: running time on the order of $m^9$ ($m \approx 212$ in practice)

$(k - 1/3)$-approximation in general

Problems: (1) Large ratio (2) Unknown if there is a tight instance

Page 9: Better Approximations for the Minimum Common Integer Partition Problem

Our Contributions

• $.614k + o(k)$ approximation
  – $O(m \log k)$ time
  – Extremely easy to implement
  – If $Y_1, \dots, Y_k$ are disjoint, then $(k+1)/2$ approximation

• We show that the [CLLJ] $(k - 1/3)$-approximation algorithm is actually a $(k - 1/2)$-approximation, and this is tight

Page 10: Better Approximations for the Minimum Common Integer Partition Problem

Algorithm Overview

• Let A be an algorithm for 2-MCIP. We build an algorithm B for k-MCIP

• Choose a random set partition of {1, …, k} into pairs of integers

• For each pair $(i, j)$ in the partition, let $A_{i,j} = A(Y_i, Y_j)$

• If there is only one pair $(1, 2)$ in the partition, output $A_{1,2}$; otherwise recurse on the multisets $A_{i,j}$ for the pairs $(i, j)$ in the partition

Page 11: Better Approximations for the Minimum Common Integer Partition Problem

2-MCIP Algorithm

• What is the algorithm for 2-MCIP?

• Greedy algorithm

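Y1: 3, 4, 2, 2

Y2: 1, 2, 5, 3

• Choose two integers, one from each multiset

• Take the minimum

• Subtract the minimum from both integers and append it to the output

• Remove all 0s

• Repeat

[Figure: animation of these steps on Y1 and Y2, with a running Output row]

$|\mathrm{Greedy}(Y_1, Y_2)| < |Y_1| + |Y_2|$

Generalization: $|\mathrm{Greedy}(Y_1, \dots, Y_k)| \le \sum_{i=1}^{k} |Y_i| = m$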
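The greedy steps can be sketched in a few lines of Python. This is an illustrative sketch under assumptions: the slide does not say which two integers to choose, so this version simply takes the last element of each list, and the function name `greedy` is hypothetical. It assumes positive integers and equal sums.

```python
def greedy(Y1, Y2):
    """Greedy 2-MCIP heuristic: pair one integer from each side, split off the minimum."""
    a, b = list(Y1), list(Y2)
    if sum(a) != sum(b):
        raise ValueError("multisets must have equal sums")
    out = []
    while a and b:
        # Assumption: the choice rule is arbitrary; we take the last element of each list.
        x, y = a.pop(), b.pop()
        smallest = min(x, y)
        out.append(smallest)
        # Put back any nonzero remainder; zeros are dropped ("Remove all 0s").
        if x > smallest:
            a.append(x - smallest)
        if y > smallest:
            b.append(y - smallest)
    return out


result = greedy([3, 4, 2, 2], [1, 2, 5, 3])
print(result, len(result))
```

Each iteration removes at least one integer from the two lists combined, and the final iteration removes two, which is why the output is strictly smaller than |Y1| + |Y2|.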

Page 12: Better Approximations for the Minimum Common Integer Partition Problem

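Better 2-MCIP Algorithm

• CommonElements algorithm for 2-MCIP of $Y_1, Y_2$:

• $T \leftarrow \emptyset$. While there is a common integer $x$ of $Y_1$ and $Y_2$: $T \leftarrow T \cup \{x\}$, $Y_1 \leftarrow Y_1 \setminus \{x\}$, $Y_2 \leftarrow Y_2 \setminus \{x\}$

• Output $T \cup \mathrm{Greedy}(Y_1, Y_2)$

• Let $c_{1,2}$ be the number of common integers of $Y_1$ and $Y_2$

• $|\mathrm{CommonElements}(Y_1, Y_2)| \le (|Y_1| + |Y_2| - 2c_{1,2}) + c_{1,2} = |Y_1| + |Y_2| - c_{1,2}$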
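A minimal Python sketch of CommonElements, using `collections.Counter` for the multiset intersection. The function names and the greedy choice rule (take the last element of each list) are assumptions made for illustration; the slides do not fix them.

```python
from collections import Counter


def greedy(Y1, Y2):
    """Greedy 2-MCIP heuristic (assumed choice rule: last element of each list)."""
    a, b = list(Y1), list(Y2)
    if sum(a) != sum(b):
        raise ValueError("multisets must have equal sums")
    out = []
    while a and b:
        x, y = a.pop(), b.pop()
        smallest = min(x, y)
        out.append(smallest)
        if x > smallest:
            a.append(x - smallest)
        if y > smallest:
            b.append(y - smallest)
    return out


def common_elements(Y1, Y2):
    """Move integers common to both multisets into T, then run greedy on the rest."""
    c1, c2 = Counter(Y1), Counter(Y2)
    common = c1 & c2  # multiset intersection: min of the two counts per value
    T = list(common.elements())
    rest1 = list((c1 - common).elements())
    rest2 = list((c2 - common).elements())
    return T + greedy(rest1, rest2)


result = common_elements([3, 4, 2, 2], [1, 2, 5, 3])
print(result, len(result))
```

Here the two inputs share the integers 3 and 2, so c_{1,2} = 2 and the output has at most 4 + 4 - 2 = 6 elements.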

Page 13: Better Approximations for the Minimum Common Integer Partition Problem

Algorithm Recap

• Choose a random set partition of {1, …, k} into pairs of integers

• For each pair $(i, j)$ in the partition, let $A_{i,j} = \mathrm{CommonElements}(Y_i, Y_j)$

• If there is only one pair $(1, 2)$ in the partition, output $A_{1,2}$; otherwise recurse on the multisets $A_{i,j}$ for the pairs $(i, j)$ in the partition
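The recap can be sketched as a loop over rounds of random pairing. This is a sketch under assumptions, not the talk's exact procedure: the slides do not say what happens when the number of multisets is odd, so this version carries the unpaired multiset to the next round; the function names and the greedy choice rule are also hypothetical.

```python
import random
from collections import Counter


def greedy(Y1, Y2):
    """Greedy 2-MCIP heuristic (assumed choice rule: last element of each list)."""
    a, b = list(Y1), list(Y2)
    out = []
    while a and b:
        x, y = a.pop(), b.pop()
        smallest = min(x, y)
        out.append(smallest)
        if x > smallest:
            a.append(x - smallest)
        if y > smallest:
            b.append(y - smallest)
    return out


def common_elements(Y1, Y2):
    """Remove common integers first, then run greedy on what is left."""
    c1, c2 = Counter(Y1), Counter(Y2)
    common = c1 & c2
    rest1 = list((c1 - common).elements())
    rest2 = list((c2 - common).elements())
    return list(common.elements()) + greedy(rest1, rest2)


def k_mcip(multisets, rng=random):
    """Pair the multisets at random, solve each pair, recurse on the results."""
    current = [list(Y) for Y in multisets]
    while len(current) > 1:
        rng.shuffle(current)  # a random order induces a random pairing
        nxt = [common_elements(current[i], current[i + 1])
               for i in range(0, len(current) - 1, 2)]
        if len(current) % 2 == 1:
            # Assumption: an unpaired multiset is carried over unchanged.
            nxt.append(current[-1])
        current = nxt
    return current[0]


Ys = [[1, 3, 1, 3, 2], [1, 1, 1, 2, 5], [1, 1, 3, 4, 1]]
print(k_mcip(Ys, rng=random.Random(0)))
```

Each pairwise result refines both of its inputs, and refinement is transitive, so the final multiset is a common integer partition of all inputs.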

Page 14: Better Approximations for the Minimum Common Integer Partition Problem

Analysis

• Lower-bound the optimal output size opt as a function of the frequency of different integers

• Find the expected output size as a function of the frequency of different integers

• Divide these two to get a worst-case (expected) ratio

• Derandomize using conditional expectations

Page 15: Better Approximations for the Minimum Common Integer Partition Problem

Frequency of Integers

Define the r-redundancy Red(r) to capture integer frequencies

Example input:

Y1: 1, 3, 1, 3, 2

Y2: 1, 1, 1, 2, 5

Y3: 1, 1, 3, 4, 1

Consider $r$ disjoint multisets $A_1, \dots, A_r$ of input elements such that

1. Each $A_i$ contains at most one element from each input multiset

2. Each $A_i$ contains only 1 distinct integer

$\mathrm{Red}(r) = \max_{A_1, \dots, A_r} \sum_{i=1}^{r} |A_i|$
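To make the definition concrete, here is a small sketch that computes Red(r) for the three example multisets. It rests on an assumed reading of the definition: each A_i holds copies of a single integer and takes at most one element from each input multiset. Under that reading, the j-th set built for a value v can have one element from every input that contains at least j copies of v, and Red(r) is the sum of the r largest such set sizes. The function name `redundancy` is hypothetical.

```python
from collections import Counter


def redundancy(multisets, r):
    """Red(r) under the assumed reading: sum of the r largest 'layer' sizes."""
    counters = [Counter(Y) for Y in multisets]
    values = set().union(*counters)  # every distinct integer in the input
    layers = []
    for v in values:
        depth = max(c[v] for c in counters)
        for j in range(1, depth + 1):
            # Size of the j-th set for value v: inputs with at least j copies of v.
            layers.append(sum(1 for c in counters if c[v] >= j))
    layers.sort(reverse=True)
    return sum(layers[:r])


Ys = [[1, 3, 1, 3, 2], [1, 1, 1, 2, 5], [1, 1, 3, 4, 1]]
for r in range(1, 6):
    print(r, redundancy(Ys, r))
```

For inputs with no shared integers every layer has size 1, so the function returns r (as long as r is at most m), matching the disjoint case where Red(opt) = opt.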

Page 16: Better Approximations for the Minimum Common Integer Partition Problem

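Lower Bound

Opt is the size of k-MCIP

[Figure: bipartite graph. Left vertices: elements of $Y_1, Y_2, \dots, Y_k$. Right vertices: elements of k-MCIP. A left vertex is joined to the elements partitioning it; in the figure, 5 is joined to 2 and 3.]

There are opt right vertices, each of degree $k$

The number of degree-1 vertices on the left is $\le \mathrm{Red}(\mathrm{opt})$. So the number of edges is $\ge 1 \cdot \mathrm{Red}(\mathrm{opt}) + 2 \cdot (m - \mathrm{Red}(\mathrm{opt}))$.

But the number of edges is exactly $k \cdot \mathrm{opt}$. So $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$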

Page 17: Better Approximations for the Minimum Common Integer Partition Problem

Example

• Our bound is $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$

• If input multisets are disjoint, $\mathrm{Red}(\mathrm{opt}) = \mathrm{opt}$

• Trivial greedy algorithm has output size $\le m$

• So the greedy algorithm has ratio at most $m/\mathrm{opt} \le (k+1)/2$
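Written out step by step, the disjoint case uses only the lower bound and the substitution Red(opt) = opt:

```latex
\begin{align*}
k \cdot \mathrm{opt} &\ge 2m - \mathrm{Red}(\mathrm{opt}) = 2m - \mathrm{opt} \\
(k+1) \cdot \mathrm{opt} &\ge 2m \\
\frac{|\mathrm{Greedy}(Y_1, \dots, Y_k)|}{\mathrm{opt}} &\le \frac{m}{\mathrm{opt}} \le \frac{k+1}{2}
\end{align*}
```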

Page 18: Better Approximations for the Minimum Common Integer Partition Problem

Algorithm Recap

• Choose a random set partition of {1, …, k} into pairs of integers

• For each pair $(i, j)$ in the partition, let $A_{i,j} = \mathrm{CommonElements}(Y_i, Y_j)$

• If there is only one pair $(1, 2)$ in the partition, output $A_{1,2}$; otherwise recurse on the multisets $A_{i,j}$ for the pairs $(i, j)$ in the partition

Page 19: Better Approximations for the Minimum Common Integer Partition Problem

Upper Bound

• In some recursive call on multisets Ya and Yb, we are interested in the number of common elements of Ya, Yb

• Since we choose a random partition of input multisets, we can bound the expected number of common elements as a function of Red(opt)

• Linearity of expectation and some calculus allow us to bound the expected number of common elements encountered over all recursive calls, in terms of Red(opt)

• Use lower bound in terms of Red(opt) to get overall ratio

Page 20: Better Approximations for the Minimum Common Integer Partition Problem

Upper Bound

• Each of O(log k) recursive calls can be implemented in O(m) time, so O(m log k) time

• Actually, proof shows that only 3 recursive calls are necessary to get .614k + o(k) approximation

• This allows derandomization using conditional expectations in O(m poly(k)) time

Page 21: Better Approximations for the Minimum Common Integer Partition Problem

Conclusions and Future Work

• .614k + o(k) approximation in O(m log k) time

• Improve analysis of previous best algorithm, showing it has ratio exactly $k - 1/2$.
  – Upper bound uses our notion of redundancy
  – Lower bound uses an adversarial argument

• Best known lower bound is $\Omega(1)$, so there is a huge gap.

Page 22: Better Approximations for the Minimum Common Integer Partition Problem

Another Example

• Consider the algorithm which repeatedly removes an integer common to all $k$ input multisets, and then runs a greedy algorithm on the remaining multisets [CLLJ06]

• Suppose $r$ common integers are removed. Then output size $\le (m - rk) + r$

• But $\mathrm{Red}(\mathrm{opt}) \le rk + (\mathrm{opt} - r)(k - 1)$.

• Our bound is $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$

• This implies $\mathrm{opt} \ge (2m - r)/(2k - 1)$, and $(m - rk + r)/\mathrm{opt} \le k - 1/2$.

• Using an adversarial argument, one can show this is tight
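The algebra behind the $k - 1/2$ ratio, spelled out from the bounds on this slide. The last step needs $2(m - rk + r) \le 2m - r$, which is equivalent to $r(2k - 3) \ge 0$ and so holds for every $k \ge 2$:

```latex
\begin{align*}
k \cdot \mathrm{opt} &\ge 2m - \mathrm{Red}(\mathrm{opt}) \\
 &\ge 2m - rk - (\mathrm{opt} - r)(k - 1) \\
 &= 2m - r - (k - 1) \cdot \mathrm{opt} \\
(2k - 1) \cdot \mathrm{opt} &\ge 2m - r \\
\frac{m - rk + r}{\mathrm{opt}} &\le \frac{(2k - 1)(m - rk + r)}{2m - r} \le \frac{2k - 1}{2} = k - \frac{1}{2}
\end{align*}
```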