An $O(n^{3/2}\sqrt{\log n})$ Algorithm for Sorting by Reciprocal Translocations
Michal Ozery-Flato and Ron Shamir

Overview
• Preliminaries
• Reduction to a simpler case
• The main algorithm (reduced case)
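
Reciprocal Translocations
• Exchange non-empty ends between two chromosomes
• Two types: prefix-prefix and prefix-postfix
[Figure: chromosomes X = (X1, X2) and Y = (Y1, Y2) before and after each type of translocation; the prefix-postfix result contains the reversed segments -Y1 and -X2.]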
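
The two translocation types can be sketched in Python as follows. This is a minimal illustration, not the authors' code: chromosomes are assumed to be tuples of signed integers, cut points are given as prefix lengths, and the name `translocate` is hypothetical. The prefix-postfix convention used here, (X1, -Y1) and (-X2, Y2), is an assumption chosen to match the reversed segments -Y1 and -X2 in the figure.

```python
def reverse_segment(seg):
    """Reverse a segment and flip the sign of every gene in it."""
    return tuple(-g for g in reversed(seg))


def translocate(x, y, i, j, prefix_prefix=True):
    """Cut x after i genes and y after j genes, then exchange ends.

    Assumed conventions (not taken from the slides):
      prefix-prefix : (X1, Y2) and (Y1, X2)
      prefix-postfix: (X1, -Y1) and (-X2, Y2)
    """
    x1, x2 = x[:i], x[i:]
    y1, y2 = y[:j], y[j:]
    if not (x1 and x2 and y1 and y2):
        raise ValueError("a reciprocal translocation needs four non-empty ends")
    if prefix_prefix:
        return x1 + y2, y1 + x2
    return x1 + reverse_segment(y1), reverse_segment(x2) + y2


print(translocate((1, -5, 6), (3, -4, 2), 1, 2))
print(translocate((1, -5, 6), (3, -4, 2), 1, 2, prefix_prefix=False))
```

Only the prefix-prefix case keeps every gene's sign; the prefix-postfix case reverses the two exchanged segments.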

Sorting by Reciprocal Translocations
• Tails{(1, 2, -4), (-3, 5), (6, -8, -7, 9)} = {1, 4, -3, -5, 6, -9}
• A can be transformed into B by reciprocal translocations if and only if:
– genes(A) = genes(B)
– Tails(A) = Tails(B)
• An $O(n^3)$ algorithm (Hannenhalli 96, Bergeron et al. 06)
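
A short Python sketch of the two conditions. The definition of a tail used here (the first gene of each chromosome, and the negation of its last gene) is inferred from the Tails example on this slide, and the function names are illustrative.

```python
def tails(genome):
    """First gene and negated last gene of every chromosome."""
    result = set()
    for ch in genome:
        result.add(ch[0])
        result.add(-ch[-1])
    return result


def genes(genome):
    """Set of gene identifiers, ignoring signs."""
    return {abs(g) for ch in genome for g in ch}


def same_genes_and_tails(a, b):
    """Check genes(A) = genes(B) and Tails(A) = Tails(B)."""
    return genes(a) == genes(b) and tails(a) == tails(b)


print(tails([(1, 2, -4), (-3, 5), (6, -8, -7, 9)]))
print(same_genes_and_tails([(4, -1), (-3, -2, 5), (6, -7, 8)],
                           [(1, 2, 3), (4, 5), (6, 7, 8)]))
```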
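
The Cycle Graph
A = {(4, -1), (-3, -2, 5), (6, -7, 8)}, B = {(1, 2, 3), (4, 5), (6, 7, 8)}
[Figure: cycle graph(A,B). Vertices, one row per chromosome of A: 4^0 4^1 1^1 1^0 / 3^1 3^0 2^1 2^0 5^0 5^1 / 6^0 6^1 7^1 7^0 8^0 8^1. Edge labels: external, internal, adjacency.]
#cycles(A,B) = 3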
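
The cycle count for this example can be checked with a small Python sketch. It assumes the usual construction: every gene i has two ends i^0 and i^1, read in that order when the gene is positive; one set of edges joins ends that are neighbours in A and the other joins ends that are neighbours in B, and a cycle alternates between the two sets. The names `ends`, `neighbour_edges` and `count_cycles` are illustrative.

```python
def ends(g):
    """Ends of a gene in reading order: +i gives i^0 i^1, -i gives i^1 i^0."""
    i = abs(g)
    return [(i, 0), (i, 1)] if g > 0 else [(i, 1), (i, 0)]


def neighbour_edges(genome):
    """Map each gene end to the end next to it inside its chromosome."""
    edge = {}
    for ch in genome:
        for g, h in zip(ch, ch[1:]):
            u, v = ends(g)[1], ends(h)[0]
            edge[u], edge[v] = v, u
    return edge


def count_cycles(a, b):
    """Number of alternating cycles; assumes genes and tails of a and b agree."""
    in_a, in_b = neighbour_edges(a), neighbour_edges(b)
    seen, cycles = set(), 0
    for start in in_a:
        if start in seen:
            continue
        cycles += 1
        u = start
        while u not in seen:
            seen.add(u)
            v = in_a[u]      # follow the edge coming from A
            seen.add(v)
            u = in_b[v]      # then the edge coming from B
    return cycles


A = [(4, -1), (-3, -2, 5), (6, -7, 8)]
B = [(1, 2, 3), (4, 5), (6, 7, 8)]
print(count_cycles(A, B))  # 3, as stated on the slide
```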
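
The Overlap Graph (with Chromosomes)
Overlap graph (A, B, A), where the third argument is the concatenation of A's chromosomes: (4, -1, -3, -2, 5, 6, -7, 8)
[Figure: overlap graph with edge vertices (1,2), (4,5), (2,3), (6,7), (7,8) and chromosome vertices, drawn above the vertex row 4^0 4^1 1^1 1^0 3^1 3^0 2^1 2^0 5^0 5^1 6^0 6^1 7^1 7^0 8^0 8^1.]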

(Connected) Components
Overlap graph (A, B, A)
[Figure: the overlap graph with edge vertices (1,2), (4,5), (2,3), (6,7), (7,8), with its components labelled.]
• Trivial component = adjacency
• Bad component = non-trivial internal component
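
The components of the example can be recomputed with the Python sketch below. It rests on assumptions not spelled out in the slides: every edge (i, i+1) of B and every chromosome of A is represented by the interval it spans in the concatenation of A's chromosomes, two vertices are joined when their intervals overlap without one containing the other, and a component is reported as bad when it contains no chromosome vertex and is not a single adjacency. All names are illustrative.

```python
from itertools import combinations


def ends(g):
    """Ends of a gene in reading order: +i gives i^0 i^1, -i gives i^1 i^0."""
    i = abs(g)
    return [(i, 0), (i, 1)] if g > 0 else [(i, 1), (i, 0)]


def overlap_components(a, b):
    """Connected components of the overlap graph, plus the interval of each vertex."""
    pos, spans, p = {}, {}, 0
    for k, ch in enumerate(a):
        start = p
        for g in ch:
            for e in ends(g):
                pos[e] = p
                p += 1
        spans[("chr", k)] = (start, p - 1)           # chromosome vertex
    for ch in b:
        for g, h in zip(ch, ch[1:]):
            u, v = ends(g)[1], ends(h)[0]
            spans[(abs(g), abs(h))] = tuple(sorted((pos[u], pos[v])))
    parent = {v: v for v in spans}                   # union-find forest

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in combinations(spans, 2):
        (a1, b1), (a2, b2) = spans[u], spans[v]
        if a1 < a2 < b1 < b2 or a2 < a1 < b2 < b1:   # proper overlap
            parent[find(u)] = find(v)
    comps = {}
    for v in spans:
        comps.setdefault(find(v), []).append(v)
    return list(comps.values()), spans


def bad_components(a, b):
    """Non-trivial components made only of internal edges."""
    comps, spans = overlap_components(a, b)
    bad = []
    for comp in comps:
        has_chromosome = any(v[0] == "chr" for v in comp)
        lo, hi = spans[comp[0]]
        trivial = len(comp) == 1 and hi - lo == 1    # a single adjacency
        if not has_chromosome and not trivial:
            bad.append(sorted(comp))
    return bad


A = [(4, -1), (-3, -2, 5), (6, -7, 8)]
B = [(1, 2, 3), (4, 5), (6, 7, 8)]
print(bad_components(A, B))  # [[(6, 7), (7, 8)]]
```

On this example the edges (1,2) and (4,5) are tied to the first two chromosome vertices, (2,3) is a trivial component, and (6,7) with (7,8) form the only bad component.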

The Reciprocal Translocation Distance
• $d_{RT}(A,B)$ = reciprocal translocation distance
• Theorem [Hannenhalli 96, Bergeron et al. 06]: $d_{RT}(A,B) = \#\text{genes} - \#\text{chrs} - \#\text{cycles}(A,B) + F(A,B)$
– $F(A,B)$ depends on the topology of the bad components. If there are no bad components then $F = 0$.

Reduced Case: No Bad Components
Result 1: The problem "Sorting by Reciprocal Translocations" can be reduced to the problem "Sorting by Reciprocal Translocations, No Bad Components" in linear time.

Reduction's Main Idea
• Isolation: all bad components are found in one chromosome.
• Goal: eliminate the bad components without creating isolation
– Maintain two lists of chromosomes:
   • Chromosomes with exactly one minimal bad component
   • Chromosomes with two or more minimal bad components
– Use prefix-prefix translocations (no sign changes)
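
Translocations Defined by External Edges
• e = an external edge
• ρ(e) = the translocation that transforms e into an adjacency
– Increases #cycles(A,B)
– May create a bad component
• $d_{RT}(A,B) = \#\text{genes} - \#\text{chrs} - \#\text{cycles}(A,B) + F(A,B)$
[Figure: a graph G with an external edge e whose endpoints x and y lie on chromosomes 1 and 2, and the graph after the translocation defined by e, in which e is an adjacency.]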
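
The effect of such a translocation on the distance follows from the formula on this slide. In the sketch below, ρ(e) stands for the translocation defined by e and A·ρ(e) for the genome after applying it; both notations are assumptions made for this illustration. The translocation changes neither #genes nor #chrs, and turning e into an adjacency closes one new cycle, so:

```latex
\begin{align*}
d_{RT}(A \cdot \rho(e), B) - d_{RT}(A, B)
  &= -\bigl(\#\text{cycles}(A \cdot \rho(e), B) - \#\text{cycles}(A, B)\bigr)
     + \bigl(F(A \cdot \rho(e), B) - F(A, B)\bigr) \\
  &= -1 + \Delta F .
\end{align*}
```

The step therefore shortens the distance by exactly one whenever no bad component exists before or after it ($\Delta F = 0$). As a concrete instance, A = {(1, -5, 6), (3, -4, 2)} and B = {(1, 2), (3, 4, 5, 6)} have 6 genes, 2 chromosomes, a single cycle and no bad component, giving $d_{RT}(A,B) = 6 - 2 - 1 + 0 = 3$.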

The Main Algorithm
1. Mark all edges (except adjacencies) as "unused"; S ← ∅, L ← ∅
2. While there is an unused external edge e:
   a. Mark e as "used"
   b. If ρ(e) ≠ ρ(FIRST(L)): apply ρ(e) to A and APPEND(S, e)
3. If all the edges are used, return (S, L)
4. While all the unused edges are internal: undo the last translocation and PREPEND(L, POP(S))
5. Goto 2
Solution: (S, L), where S is the "forward part" and L is the "backward part"
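
The steps above can be sketched in Python as a naive, unoptimized illustration. Several points are assumptions rather than the authors' implementation: the translocation defined by an edge e is written `rho`; every chromosome of B is a run of consecutive positive integers, so that edge (i, i+1) is identified by i; the test in step 2b compares the genomes produced by the two translocations and is treated as "different" when L is empty or FIRST(L) is currently not external; the smallest unused external edge is picked first; and step 5 loops back to step 2.

```python
def flip(ch):
    """Read a chromosome from its other end (it is still the same chromosome)."""
    return tuple(-g for g in reversed(ch))


def canon(genome):
    """Canonical form: chromosomes are unordered and equal up to a flip."""
    return frozenset(min(ch, flip(ch)) for ch in genome)


def chrom_of(genome, gene):
    """Index of the chromosome that carries `gene` in either sign."""
    return next(k for k, ch in enumerate(genome) if gene in ch or -gene in ch)


def is_external(genome, i):
    return chrom_of(genome, i) != chrom_of(genome, i + 1)


def is_adjacency(genome, i):
    return any((i, i + 1) in zip(c, c[1:]) for ch in genome for c in (ch, flip(ch)))


def rho(genome, i):
    """Translocation that turns the external edge (i, i+1) into an adjacency."""
    xi, yi = chrom_of(genome, i), chrom_of(genome, i + 1)
    x = genome[xi] if i in genome[xi] else flip(genome[xi])
    y = genome[yi] if i + 1 in genome[yi] else flip(genome[yi])
    cx, cy = x.index(i) + 1, y.index(i + 1)
    rest = [ch for k, ch in enumerate(genome) if k not in (xi, yi)]
    return rest + [x[:cx] + y[cy:], y[:cy] + x[cx:]]   # prefix-prefix exchange


def sort_reduced(a, b):
    """Return (S, L); applying rho along S and then L should turn a into b."""
    a = [tuple(ch) for ch in a]
    unused = {g for ch in b for g in ch[:-1] if not is_adjacency(a, g)}
    s, l, undo = [], [], []
    while True:
        while True:                                    # forward part (step 2)
            external = sorted(e for e in unused if is_external(a, e))
            if not external:
                break
            e = external[0]
            unused.remove(e)
            first = l[0] if l else None
            same = (first is not None and is_external(a, first)
                    and canon(rho(a, e)) == canon(rho(a, first)))
            if not same:
                undo.append(a)
                a = rho(a, e)
                s.append(e)
        if not unused:                                 # step 3
            return s, l
        while not any(is_external(a, e) for e in unused):   # backward part (step 4)
            if not s:
                raise ValueError("stuck: the input seems to contain a bad component")
            a = undo.pop()
            l.insert(0, s.pop())


a = [(1, -5, 6), (3, -4, 2)]
b = [(1, 2), (3, 4, 5, 6)]
print(sort_reduced(a, b))  # ([4, 3], [1])
```

Each call to `rho` and each scan for an external edge costs linear time here; the faster bound discussed on the implementation slide depends on a dedicated data structure that this sketch does not attempt to reproduce.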

The Main Algorithm (example run)
B = {(1,2), (3,4,5,6)}; edge (i, i+1) is identified by i

A                        Unused edges   S      L      Step
(1,-5,6) (3,-4,2)        1,3,4,5        ∅      ∅      start
(1,2) (3,-4,-5,6)        3,4,5          1      ∅      apply ρ(1)
(1,-5,6) (3,-4,2)        3,4,5          ∅      1      all unused edges internal: undo ρ(1)
(3,6) (1,-5,-4,2)        3,5            4      1      apply ρ(4)
(-2,6) (1,-5,-4,-3)      5              4,3    1      apply ρ(3)
(-2,6) (1,-5,-4,-3)      ∅              4,3    1      edge 5 marked used, not applied since ρ(5) = ρ(1)

Implementation of the Algorithm
• Simple $O(n^2)$ time implementation
• $O(n^{3/2}\sqrt{\log n})$ time implementation using a data structure that:
– Maintains a fragmented signed permutation
– Allows one to find an external edge e and perform the translocation ρ(e) in $O(\sqrt{n\log n})$ time
– Is based on a data structure by Kaplan & Verbin 05
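
Simulating Translocations by Reversals [Hannenhalli & Pevzner]
A translocation can be simulated by:
• A reversal on the concatenation of A's chromosomes, or
• A chromosome flip in A followed by a reversal on the concatenation
[Figure: cycle graph(A,B) on the vertex row 1^0 1^1 2^0 2^1 3^0 3^1 4^0 4^1 5^0 5^1, and the row 1^0 1^1 4^1 4^0 3^1 3^0 2^1 2^0 5^0 5^1 obtained by reversing genes 2 to 4.]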
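
A minimal Python sketch of this simulation for two chromosomes, under assumptions introduced here: chromosomes are tuples of signed integers, cut points are prefix lengths, the prefix-postfix result is taken to be (X1, -Y1) and (-X2, Y2), and the function names are illustrative. A single reversal of the segment X2 Y1 in the concatenation gives the prefix-postfix translocation; flipping Y first gives the prefix-prefix one, up to a flip of the second resulting chromosome.

```python
def reverse_segment(seg):
    """Reverse a segment and flip the sign of every gene in it."""
    return tuple(-g for g in reversed(seg))


def reversal(perm, i, j):
    """Reverse the slice perm[i:j] of a signed permutation."""
    return perm[:i] + reverse_segment(perm[i:j]) + perm[j:]


def translocation_by_reversal(x, y, i, j, prefix_prefix=True):
    """Simulate cutting x after i genes and y after j genes with one reversal."""
    if prefix_prefix:
        y = reverse_segment(y)      # chromosome flip: y now reads (-Y2, -Y1)
        j = len(y) - j              # the cut now falls after -Y2
    concat = x + y                  # concatenation of the two chromosomes
    out = reversal(concat, i, len(x) + j)
    cut = i + j                     # new boundary between the two chromosomes
    return out[:cut], out[cut:]


x, y = (1, -5, 6), (3, -4, 2)
print(translocation_by_reversal(x, y, 1, 2))                       # prefix-prefix
print(translocation_by_reversal(x, y, 1, 2, prefix_prefix=False))  # prefix-postfix
```

In the prefix-prefix case the second chromosome comes out as (-X2, -Y1), which is the chromosome (Y1, X2) read from its other end.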

Working on the overlap graph
• H = overlap graph (A, B, A)
• H is sorted if every component is trivial
• Operations:
– A reversal on an oriented external vertex v (cost = 1)
– A flip on a chromosome X (cost = 0)

The effect of a reversal on v on H (two chromosomes only)
[Figure: the overlap graph H with a vertex v, and H after the reversal on v; legend: unoriented edge, oriented edge, chromosome.]