21
On NP-Hardness of the Paired de Bruijn Sound Cycle Problem Evgeny Kapun Fedor Tsarev Genome Assembly Algorithms Laboratory University ITMO, St. Petersburg, Russia September 2, 2013

On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Embed Size (px)

DESCRIPTION

Talk at WABI-2013

Citation preview

Page 1: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

On NP-Hardness of thePaired de Bruijn Sound Cycle

Problem

Evgeny Kapun Fedor Tsarev

Genome Assembly Algorithms Laboratory

University ITMO, St. Petersburg, Russia

September 2, 2013

Page 2: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Genome assembly models

I Shortest common superstring –

NP-hard.

I Shortest common superwalk in a de

Bruijn graph – NP-hard.

I Superwalk in a de Bruijn graph with

known edge multiplicities – NP-hard.

I Path in a paired de Bruijn graph?

Page 3: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Paired de Bruijn graph

TT

CT

TA

TT

AG

TC

GC

CT

CA

TT

AT

TC

TTACTT

TAGTTC

AGCTCT

GCACTT

CATTTC

ATTTCT

TATTTC

CAGTTC

Page 4: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Sound path

Sound path: strings match with shift d = 6.

TAGCTCACCCGTTGGT

ACCCGTTGGTAATTGCSound cycle: cyclic strings match with shift

d = 6.

TGATAAGTAGGCTAAG

GTAGGCTAAGTGATAA

Page 5: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Paired de Bruijn Sound CycleProblem

Given a paired de Bruijn graph G and an

integer d (represented in unary coding), find

if G has a sound cycle with respect to shift d .

Page 6: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Paired de Bruijn Covering SoundCycle Problem

Given a paired de Bruijn graph G and an

integer d (represented in unary coding), find

if G has a covering sound cycle with respect

to shift d .

Page 7: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Parameters

I |Σ|: size of the alphabet.

I k : length of vertex labels.

I |V |, |E |: size of the graph (bounded in

terms of |Σ| and k).

I d : shift distance.

Page 8: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Simple cases

I |Σ| = 1: at most one vertex, at most

one edge.

I k = 0: at most one vertex, at most |Σ|2edges, reduces to the problem of

computing strongly connected

components.

I d is fixed: find a cycle in a graph of

|V ||Σ|d states.

Page 9: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Interesting case: k = 1

With fixed k = 1, the problem is NP-hard.

Proof outline:

1. Reduce Hamiltonian Cycle Problem,

which is NP-hard, to an intermediate

problem.

2. Reduce that problem to Paired de Bruijn

(Covering) Sound Cycle Problem.

Page 10: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

The intermediate problem

Given an undirected graph,

I if it contains a hamiltonian cycle, output

1.

I if it doesn’t contain hamiltonian paths,

output 0.

I otherwise, invoke undefined behavior.

Page 11: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Solution, step 1

G1 G2

a1

a2

a3

b1

b2

b3

Page 12: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Properties of a graph with ahamiltonian cycle

Such graph contains hamiltonian paths

I ending at any vertex.

I passing through any edge.

I for any edge {i , j} and vertex k 6= i , j ,

passing through {i , j} such that j is

between i and k on the path.

Page 13: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Solution, step 2

V2n+2s1s2 V1

s2s3 V2

. . . . . . s1. . . . . . . . . . . . s2. . . . . .

. . . . . . s2. . . . . . . . . . . . s3. . . . . .

Page 14: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Solution, step 2

In V1, V2n+2:

i

j

−→

i

j

Page 15: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Solution, step 2

In V3. . .V2n+1:

i

Page 16: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

How to make a covering cycle

Pass along

the loop

multiple

times,

covering

more and

more edges

with each

iteration.

Page 17: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Interesting case: |Σ| = 2

With fixed |Σ| = 2, the problem is NP-hard.

The proof is done by reduction from the case

k = 1. The characters are replaced with

binary sequences, and a transformation is

done to avoid undesired overlaps.

Page 18: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Fix both k and |Σ|If both k and |Σ| are fixed, the number of

different de Bruijn graphs is finite. The only

argument which can take infinitely many

values is d , and it is an integer represented in

unary coding. As a result, the number of

valid problem instances having any fixed

length is bounded. So, the language defined

by the problem is sparse. Therefore, the

problem is not NP-hard unless P=NP.

Page 19: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Results

Paired de Bruijn (Covering) Sound Cycle

Problem is

I NP-hard for any fixed k ≥ 1 (can be

reduced from k = 1).

I NP-hard for any fixed |Σ| ≥ 2 (trivially

reduced from |Σ| = 2).

I NP-hard in the general case.

Page 20: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Results

Paired de Bruijn (Covering) Sound Cycle

Problem is

I Not NP-hard if both k and |Σ| are fixed,

unless P=NP.

I Solvable in polynomial time if k = 0,

|Σ| = 1, or d is fixed.

Page 21: On NP-Hardness of the Paired de Bruijn Sound Cycle Problem

Thank you!